i18n for netnews
Ivan Shmakov
2012-08-01 15:13:56 UTC
[Cross-posting and setting Followup-To: news:news.misc, for
obvious reasons.]
(In particular, I'm planning to work on a "NNTP server"
implementation next year, and it's likely that it will reject
messages with non-ASCII headers outright.)
That's appropriate for now, but keep your eye on the
internationalization efforts at IETF. There is an RFC set for
internationalized e-mail, and news is the obvious next step;
As of RFC 5536 (published November 2009), it's /allowed/ to use
national characters in Netnews article headers, /provided/ that
RFC 2047 is used to encode them in 7-bit ASCII. For instance:

Subject: al =?utf-8?Q?=C4=89iu?= abonantoj

is perfectly valid, while:

Subject: al \xE6iu abonantoj

(where \xE6 is an octet to be interpreted in some particular but
unspecified encoding; the form I was complaining about) is not.
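
To make the difference concrete, here is a minimal Python sketch using only the standard library. The helper name is_pure_ascii is made up for this example, and the two charsets tried on the raw octet (ISO 8859-3 and ISO 8859-1) are only assumptions chosen to show the ambiguity; nothing in the raw header says which one was meant.

```python
from email.header import decode_header, make_header

def is_pure_ascii(raw: bytes) -> bool:
    """True if every octet of a raw header line is below 128."""
    return all(octet < 128 for octet in raw)

# RFC 2047 form: the header itself is 7-bit ASCII and names its charset.
encoded = "al =?utf-8?Q?=C4=89iu?= abonantoj"
decoded = str(make_header(decode_header(encoded)))

# Raw 8-bit form: nothing says which charset the octet 0xE6 belongs to.
raw = b"al \xe6iu abonantoj"
as_latin3 = raw.decode("iso8859_3")  # assumption: the sender meant ISO 8859-3
as_latin1 = raw.decode("iso8859_1")  # assumption: the reader guesses ISO 8859-1
```

The encoded form decodes the same way everywhere, while the raw form yields a different letter for each guessed charset.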
I would assume that such an enhancement will include a new capability
in the registry described in RFC 3977, 3.3.4. Initial IANA Register.
--
FSF associate member #7257 http://sf-day.org/
Shmuel (Seymour J.) Metz
2012-08-02 15:55:07 UTC
Post by Ivan Shmakov
As of RFC 5536 (published November 2009), it's /allowed/ to use
national characters in Netnews article headers, /provided/ that
RFC 2047 is used to encode them in 7-bit ASCII.
Yes, but compare that to what is going on with e-mail; the
experimental RFC 5335, allowing raw UTF-8 in headers, has been
replaced by a Standards Track RFC, 6532. While RFC 3977 provides an
equivalent to 8BITMIME, there is no netnews equivalent to SMTPUTF8,
and I'm predicting that there will eventually be one.

BTW, I interpreted your "non-ASCII" as referring to octets beyond 127,
rather than to ASCII encoding of non-ASCII data.
--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to ***@library.lspace.org
Ivan Shmakov
2012-08-08 10:13:09 UTC
[Cross-posting to news:comp.mail.misc, just in case; dropping
news:comp.lang.perl.misc from Followup-To:.]
Post by Shmuel (Seymour J.) Metz
Post by Ivan Shmakov
As of RFC 5536 (published November 2009), it's /allowed/ to use
national characters in Netnews article headers, /provided/ that
RFC 2047 is used to encode them in 7-bit ASCII.
Yes, but compare that to what is going on with e-mail; the
experimental RFC 5335, allowing raw UTF-8 in headers, has been
replaced by a Standards Track RFC, 6532.
ACK, thanks for the pointer!
Post by Shmuel (Seymour J.) Metz
While RFC 3977 provides an equivalent to 8BITMIME, there is no
netnews equivalent to SMTPUTF8, and I'm predicting that there will
eventually be one.
So, I should be prepared to allow both pure-ASCII /and/ UTF-8.
It's still better than allowing an arbitrary octet sequence in
an unspecified encoding in the header.
Post by Shmuel (Seymour J.) Metz
BTW, I interpreted your "non-ASCII" as referring to octets beyond
127, rather than to ASCII encoding of non-ASCII data.
Indeed, it's exactly what I've meant.

PS. Still, I believe that while the systems of days past
warranted a separation between Netnews /Transfer/ Agents (BKA
NNTP "servers", though it's a bit of a misnomer) and Netnews /User/
Agents (NNTP "clients"; and, similarly, between Mail Transfer
Agents and Mail User Agents), the performance of the modern
computers, along with the reasonable success of the contemporary
P2P systems, makes it possible to get rid of such a distinction,
and allow for a direct "user-to-user" communication, in both a
Netnews- and Mail-like fashion. Perhaps, such an approach would
be more in line with the "Gen X" habits?
--
FSF associate member #7257 http://sf-day.org/
Shmuel (Seymour J.) Metz
2012-08-08 12:27:21 UTC
Post by Ivan Shmakov
So, I should be prepared to allow both pure-ASCII /and/ UTF-8.
Prepared in the sense of having a flag with initial value false that,
if set, will permit non-ASCII.
Post by Ivan Shmakov
PS. Still, I believe that while the systems of days past
warranted a separation between Netnews /Transfer/ Agents (BKA
NNTP "servers", though it's a bit of a misnomer) and Netnews /User/
Agents (NNTP "clients"; and, similarly, between Mail Transfer
Agents and Mail User Agents), the performance of the modern
computers, along with the reasonable success of the contemporary
P2P systems, makes it possible to get rid of such a distinction,
and allow for a direct "user-to-user" communication, in both a
Netnews- and Mail-like fashion. Perhaps, such an approach would
be more in line with the "Gen X" habits?
I don't agree; I like being able to work offline and I like not having
to use bloated web interfaces. Further, there are security issues.
--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Ivan Shmakov
2012-08-08 14:13:16 UTC
[Still wondering why news:comp.lang.perl.misc wasn't dropped.]
Post by Shmuel (Seymour J.) Metz
Post by Ivan Shmakov
So, I should be prepared to allow both pure-ASCII /and/ UTF-8.
Prepared in the sense of having a flag with initial value false that,
if set, will permit non-ASCII.
... to permit UTF-8 (or anything that could be interpreted as
that) in the header, not just arbitrary binary data. (Binary
data will be allowed for the body, subject to the relevant
restrictions, and provided that Content-Transfer-Encoding: is
present and has "8bit" as its value.)
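
A rough sketch of that acceptance policy in Python, for illustration only. The function names and the allow_utf8 flag are hypothetical (the flag defaults to false, as suggested above), and a real implementation would also have to enforce the other restrictions on article bodies.

```python
from typing import Optional

def header_acceptable(raw: bytes, allow_utf8: bool = False) -> bool:
    """Accept pure ASCII always; accept valid UTF-8 only if the flag is set."""
    if raw.isascii():
        return True
    if not allow_utf8:
        return False
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False  # arbitrary octets in an unspecified encoding
    return True

def body_acceptable(body: bytes, cte: Optional[str]) -> bool:
    """Allow octets above 127 in the body only when the article declares 8bit."""
    if body.isascii():
        return True
    return cte is not None and cte.strip().lower() == "8bit"
```

With the flag left at its default, a header carrying raw UTF-8 is rejected; with the flag set, it passes, but a stray octet such as \xE6 that does not form valid UTF-8 is still refused.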
Post by Shmuel (Seymour J.) Metz
Post by Ivan Shmakov
PS. Still, I believe that while the systems of days past
warranted a separation between Netnews /Transfer/ Agents (BKA NNTP
"servers", though it's a bit of a misnomer) and Netnews /User/ Agents
(NNTP "clients"; and, similarly, between Mail Transfer Agents and
Mail User Agents), the performance of the modern computers, along
with the reasonable success of the contemporary P2P systems, makes
it possible to get rid of such a distinction, and allow for a direct
"user-to-user" communication, in both a Netnews- and Mail-like
fashion. Perhaps, such an approach would be more in line with the
"Gen X" habits?
I don't agree; I like being able to work offline
After the data is downloaded from a P2P network (it may be
BitTorrent, GNUnet, Freenet, or whatever else), it could be used
off-line just perfectly.
Post by Shmuel (Seymour J.) Metz
and I like not having to use bloated web interfaces.
There are P2P agents with almost any kind of interface: CLI,
full-screen text, graphical, Web, XML-RPC, etc.

(Not to mention that the contemporary Web is anything but a P2P
network.)
Post by Shmuel (Seymour J.) Metz
Further, there are security issues.
Namely? Freenet, GNUnet and Tor networks seem to have an
explicit focus on security, while BitTorrent's metadata (both
.torrent and Metalink) may be protected by digital signatures
(OpenPGP will work for either; Metalink should support XMLDSig,
too.)
--
FSF associate member #7257 http://sf-day.org/
Shmuel (Seymour J.) Metz
2012-08-08 17:18:25 UTC
Post by Ivan Shmakov
After the data is downloaded from a P2P network (it may be
BitTorrent, GNUnet, Freenet, or whatever else), it could be used
off-line just perfectly.
Do those have the routing capabilities that mail and news require?
Also, wouldn't you still have a separation between a user agent and a
transfer agent?
Post by Ivan Shmakov
Namely?
E.g., audit trail of the route for mail.
--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Ivan Shmakov
2012-08-09 12:47:34 UTC
[Dropping news:comp.lang.perl.misc from Followup-To: as a matter
of habit.]
Post by Shmuel (Seymour J.) Metz
Post by Ivan Shmakov
After the data is downloaded from a P2P network (it may be
BitTorrent, GNUnet, Freenet, or whatever else), it could be used
off-line just perfectly.
Do those have the routing capabilities that mail and news require?
The current Netnews architecture relies, in essence, on a
simplistic flood-fill routing. That is, for each group, we can
imagine a network of nodes, each of which, upon receipt of a new
message, relays it to all of its peers. The only two things
that complicate this scheme are Path: and IHAVE, which both
reduce the load by not sending the message the peer already has.
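
The flood-fill scheme can be sketched as a toy simulation in Python. Everything here (the Node class, wants, receive, link) is hypothetical and only models the idea: a Path:-like list of the nodes a message has passed through, and an IHAVE-like check that lets a peer decline a message it already holds. It is not how INN or any real server is implemented.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Node:
    name: str
    peers: list = field(default_factory=list)
    seen: set = field(default_factory=set)
    accepted: int = 0  # how many copies this node actually took

    def wants(self, msg_id: str) -> bool:
        """IHAVE-like check: the peer declines what it already has."""
        return msg_id not in self.seen

    def receive(self, msg_id: str, path: list) -> None:
        self.seen.add(msg_id)
        self.accepted += 1
        new_path = [self.name] + path  # prepend ourselves, as Path: does
        for peer in self.peers:
            if peer.name in new_path:  # Path: says the peer has seen it
                continue
            if peer.wants(msg_id):     # otherwise offer it first
                peer.receive(msg_id, new_path)

def link(a: Node, b: Node) -> None:
    """Make two nodes peers of each other."""
    a.peers.append(b)
    b.peers.append(a)

# Three fully connected nodes; the message is injected at node a.
a, b, c = Node("a"), Node("b"), Node("c")
link(a, b)
link(b, c)
link(a, c)
a.receive("<1@example.invalid>", [])
```

In this triangle every node ends up with the message, and each takes exactly one copy: the Path: check stops b and c from sending it back, and the IHAVE-like check stops a from sending c a second copy.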

In this scheme, and taking INN as an example, newsfeeds(5), and
its reciprocal incoming.conf(5), serve two purposes: they remedy
the fact that Netnews currently lacks autodiscovery (contrary to
the P2P networks mentioned above), and they code (in a crude,
but working, fashion) the "trust" relationship between the
peers.

However, dealing with "trust" at the link level (and not at the
user level) can by itself lead to certain security implications.

With a sound use of digital signatures (and implementing the
relevant WoT, or re-using the OpenPGP one), we can lay the
control over what's trustworthy and what's not straight to the
hands of the user.

The mail routing is a trickier one. However, considering that
virtually the only reason behind such a routing nowadays is the
belief in the security of firewalled intranets, we may simplify
the whole task of routing to a three-hop (user, relay, user)
scheme, detailed below.

Let's first assume that "autodiscovery" is in place. Now, Alice
chooses a "relay" (there may be both free of charge and paid
ones), and her agent puts (into the distributed hash table, or
DHT) a (digitally-signed) "pointer" record that all her mail
should be delivered to that relay. When Bob wants to send mail
to Alice, his agent searches the DHT, finds the "pointer"
record, and sends a copy of the letter to the relay thus found.
(Naturally, Alice's agent checks the relay for any new messages
periodically when online.)
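
As a toy model of this three-hop scheme, assuming plain Python dictionaries in place of a real DHT and real relays. All names (dht, relays, publish_pointer, send, fetch, "relay-1") are made up, and the digital signature on the pointer record is left out entirely, since the standard library offers no public-key signatures.

```python
dht = {}     # stand-in for the distributed hash table: user -> relay name
relays = {}  # relay name -> {recipient: [(sender, text), ...]}

def publish_pointer(user: str, relay: str) -> None:
    """The user's agent announces which relay should receive the mail."""
    relays.setdefault(relay, {})
    dht[user] = relay  # a real record would be digitally signed

def send(sender: str, recipient: str, text: str) -> str:
    """The sender's agent looks up the pointer and leaves a copy at the relay."""
    relay = dht[recipient]
    relays[relay].setdefault(recipient, []).append((sender, text))
    return relay

def fetch(user: str) -> list:
    """The recipient's agent polls the relay when online."""
    return relays[dht[user]].pop(user, [])

publish_pointer("alice", "relay-1")
used_relay = send("bob", "alice", "Hello, Alice!")
inbox = fetch("alice")
```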

It makes sense for Bob's letter to mention a few of his
previous messages (sent to Alice) in the header. This way, it
would be much harder for a (malicious) relay to hide the
existence of any such message. Also, it makes sense to
encrypt all the communication, so that such a relay won't be able
to intercept or tamper with the messages, either.
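
A small Python sketch of how such back-references would expose a withheld message. The message layout (a dict with "id" and "previous" keys) and the IDs are hypothetical; the point is only that Alice's agent can compare what was referenced against what was delivered.

```python
def missing_references(received: list) -> set:
    """Return IDs that delivered messages refer to but that never arrived."""
    delivered = {message["id"] for message in received}
    referenced = set()
    for message in received:
        referenced.update(message["previous"])
    return referenced - delivered

# Bob sent b1, b2 and b3; the relay silently dropped b2.
received = [
    {"id": "b1", "previous": []},
    {"id": "b3", "previous": ["b1", "b2"]},
]
gaps = missing_references(received)
```

Here gaps contains only b2, so the relay's omission shows up as soon as b3 arrives.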

This scheme could be modified slightly by requesting Bob to copy
the message he wishes to send to Alice to his own "relay" B,
while sending only a pointer to Alice's A. This way, Bob (and
not Alice) "pays" for the handling of his outgoing letters,
which may thwart certain abuse scenarios.

Note that should Mallory try to send numerous pointers to a
single message, it'd be possible to "mark" such a message as
"spam" just once per group of users of the "Netnews-like part"
of the P2P communication system being discussed.
Post by Shmuel (Seymour J.) Metz
Also, wouldn't you still have a separation between a user agent and a
transfer agent?
It /may/ be done, for various reasons, including the
compatibility with the MUA's currently in use.

Both GNUnet and Freenet (IIRC) implement an HTTP interface, so
that an ordinary Web browser can be used to connect to the
network. OTOH, BitTorrent agents (commonly called /clients/,
though it's a misnomer) are mostly self-contained.
Post by Shmuel (Seymour J.) Metz
Post by Ivan Shmakov
Post by Shmuel (Seymour J.) Metz
Further, there are security issues.
Namely?
E. g., audit trail of the route for mail.
And what exactly is it for? I don't see why one may need to care
/how/ the letter has reached its destination if it's a valid and
wanted one. And if it's not, imposing a policy on the relays is
only a partial solution.
--
FSF associate member #7257 http://sf-day.org/
Shmuel (Seymour J.) Metz
2012-08-10 13:40:45 UTC
Post by Ivan Shmakov
In this scheme, and taking INN as an example, newsfeeds(5), and
its reciprocal incoming.conf(5), serve two purposes: they remedy
the fact that Netnews currently lacks autodiscovery (contrary to
the P2P networks mentioned above), and they code (in a crude,
but working, fashion) the "trust" relationship between the
peers.
There's more than trust involved in current news routing; there's also
the ability, e.g., for specific servers to carry only specific groups,
and to announce what they carry.
Post by Ivan Shmakov
With a sound use of digital signatures (and implementing the
relevant WoT, or re-using the OpenPGP one), we can lay the
control over what's trustworthy and what's not straight to the
hands of the user.
I don't see how.
Post by Ivan Shmakov
Let's first assume that "autodiscovery" is in place. Now, Alice
chooses a "relay" (there may be both free of charge and paid
ones), and her agent puts (into the distributed hash table, or
DHT) a (digitally-signed) "pointer" record that all her mail
should be delivered to that relay.
How do you authenticate a message from a stranger? With SMTP there's an
unforgeable source IP address that I can filter on.
Post by Ivan Shmakov
It makes sense for Bob's letter to mention a few of his
previous messages (sent to Alice) in the header.
Not if there are no previous messages.
Post by Ivan Shmakov
Both GNUnet and Freenet (IIRC) implement an HTTP interface, so
that an ordinary Web browser can be used to connect to the
network.
Exactly what I want to avoid.
Post by Ivan Shmakov
And what exactly is it for?
The audit trail in e-mail is intended for diagnostic purposes, but in
practice it is used for spam filtering as well.
Post by Ivan Shmakov
I don't see why one may need to care /how/ the letter
has reached its destination if it's a valid and wanted
one.
There's no automated way to tell that a message is a validated and
wanted one, so you have to rely on heuristics. Deep filtering is one
useful heuristic.
Post by Ivan Shmakov
only a partial solution.
Partial solutions are all we have.

In addition to the difficulty of developing a new infrastructure that
retains the functionality of the existing infrastructure, there's also
the problem of transition.
--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Ivan Shmakov
2012-08-12 16:19:57 UTC
Post by Shmuel (Seymour J.) Metz
In this scheme, and taking INN as an example, newsfeeds(5), and its
reciprocal incoming.conf(5), serve two purposes: they remedy the
fact that Netnews currently lacks autodiscovery (contrary to the P2P
networks mentioned above), and they code (in a crude, but working,
fashion) the "trust" relationship between the peers.
There's more than trust involved in current news routing; there's
also the ability, e. g., for specific servers to carry only specific
groups,
First of all, newsgroups don't matter that much. They're just
tags, which are immutable once the message is posted and, more
often than not, are incomplete or misleading. (And this thread
is a good example of that.) Were I to run my own NNTP "server",
I'd make it "track" a number of selected individuals, and all
the threads they've been participating in, whatever the
newsgroups.

Unfortunately, NNTP doesn't make this task all that easy.
Post by Shmuel (Seymour J.) Metz
and to announce what they carry.
AIUI, NNTP only provides for a way to know if a newsgroup is
created on the "server", not whether it's actually carried
(i. e., subscribed to on one or more of the peers) or not.
Post by Shmuel (Seymour J.) Metz
With a sound use of digital signatures (and implementing the
relevant WoT, or re-using the OpenPGP one), we can lay the control
over what's trustworthy and what's not straight to the hands of the
user.
I don't see how.
Isn't it trivial to make a whitelist of public keys of all the
people one wants to receive mail from? It's much less an issue
than providing a way for a stranger (or one who has lost his or
her private key) to communicate, while still stopping abuse.
Post by Shmuel (Seymour J.) Metz
Let's first assume that "autodiscovery" is in place. Now, Alice
chooses a "relay" (there may be both free of charge and paid ones),
and her agent puts (into the distributed hash table, or DHT) a
(digitally-signed) "pointer" record that all her mail should be
delivered to that relay.
How do you authenticate a message from a stranger?
If the stranger's public key isn't reachable via my own WoT, and
if it wasn't used to sign any "public" messages I may find, then
I have no way to authenticate it.
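
The "reachable via my own WoT" test can be sketched as a breadth-first search in Python. This is a deliberate simplification: the signatures mapping, the key names and the max_depth limit are all hypothetical, and the real OpenPGP trust model also weighs owner trust and signature validity, which this ignores.

```python
from collections import deque

def reachable(signatures: dict, my_key: str, target: str,
              max_depth: int = 3) -> bool:
    """Breadth-first search over 'key A has signed key B' edges."""
    seen = {my_key}
    queue = deque([(my_key, 0)])
    while queue:
        key, depth = queue.popleft()
        if key == target:
            return True
        if depth == max_depth:
            continue  # do not follow chains longer than max_depth
        for signed in signatures.get(key, ()):
            if signed not in seen:
                seen.add(signed)
                queue.append((signed, depth + 1))
    return False

# I have signed Alice's key; Alice has signed Bob's key.
signatures = {"me": ["alice"], "alice": ["bob"]}
```

With this data, Bob's key is reachable in two steps, while a key nobody in the chain has signed is not; a whitelist is the special case of max_depth=1.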
Post by Shmuel (Seymour J.) Metz
With SMTP there's an unforgeable source IP address that I can filter
on.
Not quite, at least since the time various Webmails became
widespread. For privacy concerns, they're quite likely to
"hide" the IP of the HTTP client used to submit the message.

Also, the IP in question may be autoconfigured, or "leased" via
DHCPv6 or DHCP, or it may be an IPv4 address of the NAT'ting
router, just as well. Thus, only the network prefix could be
obtained (more or less) reliably from the message's headers.

Tor could be used to anonymize IP, too, but its default is to
block traffic to TCP port 25. (Which doesn't quite help if
there was a "transit Webmail" at TCP port 80 or 443, though.)
Post by Shmuel (Seymour J.) Metz
It makes sense for Bob's letter to mention a few of his previous
messages (sent to Alice) in the header.
Not if there are no previous messages.
Obviously.
Post by Shmuel (Seymour J.) Metz
Both GNUnet and Freenet (IIRC) implement an HTTP interface, so that
an ordinary Web browser can be used to connect to the network.
Exactly what I want to avoid.
HTTP is a quite decent file transfer protocol, and "speaking" it
isn't a problem. (No more so, and arguably much less, than
speaking FTP, for instance.) My guess is that one can access
these networks with, say, GNU Wget or aria2, too.
Post by Shmuel (Seymour J.) Metz
And what exactly is it for?
The audit trail in e-mail is intended for diagnostic purposes,
Yes, but as I've said before, any complex routing doesn't make
much sense for e-mail anymore, and there isn't much "diagnostic"
that could be obtained in the "two MTA's" case other than that
already recorded in the logs of those MTA's.

Don't we already have some complex routing running on the
network level? Why repeat it a few levels higher?
Post by Shmuel (Seymour J.) Metz
but in practice it is used for spam filtering as well.
Essentially, that means that instead of relying on almost
unforgeable TCP/IP "peer" (address, port) pair, one decides to
rely on the Received: headers, as added by the "third party"
(= transit MTA's.)
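
To illustrate the distinction, a short Python sketch using the standard email package. The message, host names and addresses are invented (the addresses come from the documentation ranges): only the topmost Received: header is stamped by the receiving MTA from the actual TCP peer, while anything below it is whatever the earlier hops chose to write.

```python
from email import message_from_string

# A made-up message; header order is newest first, as MTAs prepend.
raw = (
    "Received: from mx.sender.example (mx.sender.example [192.0.2.10])\n"
    "\tby mx.me.example; Mon, 13 Aug 2012 10:00:00 +0000\n"
    "Received: from webmail.sender.example (unknown [198.51.100.7])\n"
    "\tby mx.sender.example; Mon, 13 Aug 2012 09:59:58 +0000\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

msg = message_from_string(raw)
received = msg.get_all("Received")

own_stamp = received[0]  # written by my own MTA from the TCP peer address
claimed = received[1:]   # written by third parties; only as good as they are
```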
Post by Shmuel (Seymour J.) Metz
I don't see why one may need to care /how/ the letter has reached
its destination if it's a valid and wanted one.
There's no automated way to tell that a message is a validated and
wanted one, so you have to rely on heuristics. Deep filtering is one
useful heuristic.
We can design a system for /reliable/ (something that the
present e-mail doesn't quite offer) delivery between
pairwise-trusted peers. It's then up to the user to decide
whose messages he or she wants delivered.
Post by Shmuel (Seymour J.) Metz
only a partial solution.
Partial solutions are all we have.
Ultimately, yes.
Post by Shmuel (Seymour J.) Metz
In addition to the difficulty of developing a new infrastructure that
retains the functionality of the existing infrastructure, there's
also the problem of transition.
Yes, at least to some extent.

Though it was my understanding that opting for social networking
sites, GenX has pretty much forsaken e-mail. (My guess is that
they didn't notice that part of the functionality has vanished
meanwhile.)
--
FSF associate member #7257 http://sf-day.org/
Shmuel (Seymour J.) Metz
2012-08-13 01:11:52 UTC
Post by Ivan Shmakov
AIUI, NNTP only provides for a way to know if a newsgroup is
created on the "server", not whether it's actually carried
7.6.6. LIST NEWSGROUPS

This keyword MUST be supported by servers advertising the READER
capability.

The newsgroups list is maintained by NNTP servers to contain the name
of each newsgroup that is available on the server and a short
description about the purpose of the group. Each line of this list
consists of two fields separated from each other by one or more space
or TAB characters (the usual practice is a single TAB). The first
field is the name of the newsgroup, and the second is a short
description of the group.
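
For illustration, one line of such a list could be split like this in Python. The function name and the sample descriptions are made up; the separator rule (one or more space or TAB characters) is the one quoted above.

```python
import re

def parse_newsgroups_line(line: str) -> tuple:
    """Split one LIST NEWSGROUPS line into (group name, description)."""
    parts = re.split(r"[ \t]+", line.rstrip("\r\n"), maxsplit=1)
    name = parts[0]
    description = parts[1] if len(parts) > 1 else ""
    return name, description

# Sample lines with invented descriptions; the first uses the usual single TAB.
sample = parse_newsgroups_line("news.misc\tAn example description.\r\n")
spaced = parse_newsgroups_line("comp.mail.misc   Another example.")
```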
Post by Ivan Shmakov
Isn't it trivial to make a whitelist of public keys of all the
people one wants to receive mail from?
Irrelevant; the issue is e-mail from strangers.
Post by Ivan Shmakov
If the stranger's public key isn't reachable via my own WoT, and
if it wasn't used to sign any "public" messages I may find, then
I have no way to authenticate it.
Exactly.
Post by Ivan Shmakov
Not quite, at least since the time various Webmails became
widespread.
The webmail server has to inject the messages to SMTP servers, and
they have access to its IP address. I can blocklist that IP address if
appropriate.
Post by Ivan Shmakov
Also, the IP in question may be autoconfigured, or "leased" via
DHCPv6 or DHCP,
The solution is to reject all traffic from that IP block unless they
are blocking outbound port 25.
Post by Ivan Shmakov
Essentially, that means that instead of relying on almost
unforgeable TCP/IP "peer" (address, port) pair, one decides to
rely on the Received: headers, as added by the "third party"
(= transit MTA's.)
No, that means that you don't accept any relayed traffic from an MTA
that you haven't already determined logs the IP addresses correctly.
Post by Ivan Shmakov
We can design a system for /reliable/ (something that the
present e-mail doesn't quite offer) delivery between
pairwise-trusted peers.
That's not good enough.
--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Ivan Shmakov
2012-08-13 16:59:05 UTC
Post by Shmuel (Seymour J.) Metz
AIUI, NNTP only provides for a way to know if a newsgroup is created
on the "server", not whether it's actually carried
7.6.6. LIST NEWSGROUPS
This keyword MUST be supported by servers advertising the READER
capability.
The newsgroups list is maintained by NNTP servers to contain the
name of each newsgroup that is available on the server and a short
description about the purpose of the group.
... And the purpose of this command is to deliver this
description to the newsreader software, and therefore to the
user.

There's no magic way that the Netnews "server" responding to
this command may know it's no longer fed with the relevant
articles by its peers.

[...]
Post by Shmuel (Seymour J.) Metz
Not quite, at least since the time various Webmails became
widespread.
The webmail server has to inject the messages to SMTP servers, and
they have access to its IP address. I can blocklist that IP address
if appropriate.
Say, the IP address of one (or more) of Google Mail MX'es?
Post by Shmuel (Seymour J.) Metz
Also, the IP in question may be autoconfigured, or "leased" via
DHCPv6 or DHCP,
The solution is to reject all traffic from that IP block unless they
are blocking outbound port 25.
The "privacy-enabled" IPv6 autoconfiguration makes the host
choose a new IPv6 address once in, like, 15 minutes. Likewise,
"dynamic" IP's (still widely used, as it seems) are likely to
change every few hours to few days.

And did I mention botnets, BTW?
Post by Shmuel (Seymour J.) Metz
Essentially, that means that instead of relying on almost
unforgeable TCP/IP "peer" (address, port) pair, one decides to rely
on the Received: headers, as added by the "third party" (= transit
MTA's.)
No, that means that you don't accept any relayed traffic from an MTA
that you haven't already determined logs the IP addresses correctly.
And how do you determine that without accepting some traffic
first?
Post by Shmuel (Seymour J.) Metz
We can design a system for /reliable/ (something that the present
e-mail doesn't quite offer) delivery between pairwise-trusted peers.
That's not good enough.
Perhaps.

It's worth trying, anyway.
--
FSF associate member #7257 http://sf-day.org/
Shmuel (Seymour J.) Metz
2012-08-13 22:32:47 UTC
Post by Ivan Shmakov
There's no magic way that the Netnews "server" responding to
this command may know it's no longer fed with the relevant
articles by its peers.
Nor is there in any other protocol.
Post by Ivan Shmakov
Say, the IP address of one (or more) of Google Mail MX'es?
I'm tempted.
Post by Ivan Shmakov
The "privacy-enabled" IPv6 autoconfiguration makes the host
choose a new IPv6 address once in, like, 15 minutes.
I'm not running a whistleblowers hot line or the like and have no need
to receive anonymous messages.
Post by Ivan Shmakov
And how do you determine that without accepting some traffic
first?
You don't, but you can certainly apply harsher filtering criteria to
MTA's that haven't established trust.
--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>
