Ivan Shmakov
2012-01-17 08:12:13 UTC
[Cross-posting to news:news.misc, for it's no longer just about
NNTP and IMAP.]
Consider, e. g., news:sci.electronics.design, news:comp.os.vms,
news:comp.arch.embedded, news:alt.russian.z1, etc.
I'm currently working on a new NNTP server -- so it's not entirely
moribund ;-).
I'm actually contemplating developing a kind of “news caching
proxy”. However, I'm considering making somewhat radical changes
to the usual news processing procedures.
In particular, I aim for better support for presenting netnews
(apart from having it accessible via IMAP and NNTP) as both
Atom feeds (including Atom Publishing Protocol support) and
Web pages (XHTML.) To this end, I believe it's essential to
discard outright non-ASCII article headers, /and/ non-ASCII
article bodies of articles lacking proper MIME headers, as
both are ambiguous as to what character encoding is used.
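The “discard ambiguous articles” rule above can be sketched as follows (in Python for brevity, though the post considers Perl for the prototype; the article representation and function names here are hypothetical):

```python
# Sketch of the proposed rule: reject any article whose headers contain
# non-ASCII octets, and any non-ASCII body not covered by MIME headers.

def is_ascii(data: bytes) -> bool:
    """True if every octet is 7-bit ASCII."""
    return all(b < 0x80 for b in data)

def should_discard(headers: dict, body: bytes) -> bool:
    # Rule 1: any non-ASCII octet in a header makes the article ambiguous.
    if any(not is_ascii(k) or not is_ascii(v) for k, v in headers.items()):
        return True
    # Rule 2: a non-ASCII body is acceptable only when proper MIME headers
    # declare its encoding; otherwise the charset is anybody's guess.
    names = {k.lower() for k in headers}
    has_mime = b"mime-version" in names and b"content-type" in names
    return not is_ascii(body) and not has_mime
```

A body such as `"тест".encode("koi8-r")` would thus be dropped unless the article carries MIME-Version: and Content-Type: headers.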
Also, I'm considering using an RDBMS to store some
(Message-ID:, References:), if not all, of the message headers,
so as to allow for instant threading. Perhaps this storage could
also be used to implement some “extended search” facilities
within either or both of the NNTP and IMAP interfaces, but I
have no specific plans for that at the moment.
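A minimal sketch of such a header store, using SQLite (the schema, column names, and sample Message-IDs are all illustrative, not part of the original post):

```python
import sqlite3

# The "instant threading" idea: keep (Message-ID:, References:)-derived
# parent links in one indexed table, so building a thread is a lookup,
# not a scan of article files.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE article (
    msgid  TEXT PRIMARY KEY,   -- Message-ID: of this article
    parent TEXT                -- last entry of References:, NULL for roots
)""")

def store(msgid, references):
    # The immediate parent is the last Message-ID in References:.
    parent = references[-1] if references else None
    db.execute("INSERT OR IGNORE INTO article VALUES (?, ?)",
               (msgid, parent))

store("<root@example>", [])
store("<reply1@example>", ["<root@example>"])
store("<reply2@example>", ["<root@example>", "<reply1@example>"])

# Children of any article come from a single indexed query.
children = [row[0] for row in db.execute(
    "SELECT msgid FROM article WHERE parent = ?", ("<reply1@example>",))]
```

The same table is an obvious starting point for the “extended search” facilities mentioned above.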
Perhaps the RDBMS could also be used for MIME parts below a
certain size threshold (in octets; or perhaps characters, for
text/* MIME parts and for strictly ASCII non-MIME article
bodies.)
The MIME parts are to be stored separately (from each other and
from the header), as raw 8-bit data, even if originally encoded
as, e. g., Base64 or quoted-printable. Parts above the size
threshold would be stored separately on the filesystem. If
possible, a few more transformations would also be implemented.
In particular, it may be possible to transform any ASCII-armored
OpenPGP signatures into their MIME-based counterparts.
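The decode-to-8-bit step and the size-threshold decision could look like this (a sketch only; the threshold value and function names are assumptions):

```python
import base64
import quopri

def decode_part(payload: bytes, cte: str) -> bytes:
    """Normalize a MIME part to raw 8-bit octets, regardless of its
    original Content-Transfer-Encoding."""
    cte = cte.lower()
    if cte == "base64":
        return base64.b64decode(payload)
    if cte == "quoted-printable":
        return quopri.decodestring(payload)
    # "7bit", "8bit", "binary": already raw octets.
    return payload

def storage_for(part: bytes, threshold: int = 65536) -> str:
    # Small parts go into the RDBMS; larger ones onto the filesystem.
    # 64 KiB is an arbitrary placeholder for the post's "threshold".
    return "rdbms" if len(part) <= threshold else "filesystem"
```

Storing decoded octets once also means the server can re-encode on delivery as each protocol requires.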
I also intend to compute message digests (SHA-256, SHA-1) for
MIME parts over some trivial size (1024 octets or so), and
aggressively replace duplicates with links. If a message (or a
part) is digitally signed with a known key, the computed digests
would be verified against the signature, and the message
discarded on failure.
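The digest-and-deduplicate scheme amounts to a content-addressed part store; a sketch (class name and in-memory dict are stand-ins for the real RDBMS/filesystem backends):

```python
import hashlib

DEDUP_MIN = 1024  # the post's "trivial size": smaller parts stay inline

class PartStore:
    """Content-addressed store: identical parts collapse to one copy,
    and articles keep only the SHA-256 digest as a link."""
    def __init__(self):
        self.blobs = {}  # digest (hex) -> raw octets

    def put(self, part: bytes):
        if len(part) <= DEDUP_MIN:
            return part                       # too small to bother
        digest = hashlib.sha256(part).hexdigest()
        self.blobs.setdefault(digest, part)   # duplicates become one blob
        return digest                         # the "link" kept per article
```

A cross-posted binary attachment would then occupy disk space once, however many articles carry it.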
As for the “caching” part, this agent would allow both
conventional feeds (for input) and periodic or on-demand
fetching of articles (like, e. g., suck(1), but also allowing
for partial retrieval of a newsgroup's articles, based on
criteria specified over the XOVER data.) It would also maintain
a “last access time” for each article, to be taken into account
in the “expiration” process.
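Criteria-driven partial retrieval over XOVER data might be sketched like this, relying on the standard overview field order (article number, Subject, From, Date, Message-ID, References, byte count, line count); the sample lines and predicate are invented:

```python
# Select which article numbers to fetch, based on a predicate applied
# to each tab-separated XOVER overview line.
def select_articles(xover_lines, predicate):
    wanted = []
    for line in xover_lines:
        num, subject, sender, date, msgid, refs, octets, nlines = \
            line.split("\t")[:8]
        overview = {"number": int(num), "subject": subject,
                    "from": sender, "bytes": int(octets)}
        if predicate(overview):
            wanted.append(int(num))
    return wanted

sample = [
    "3001\tRe: ADC noise\tbob@example\t17 Jan 2012\t<a@x>\t<r@x>\t2048\t40",
    "3002\t[ANN] spam\teve@example\t17 Jan 2012\t<b@x>\t\t900000\t9000",
]
# E.g., "fetch only articles under 10 000 octets":
small = select_articles(sample, lambda ov: ov["bytes"] < 10000)
```

Only the selected article numbers would then be requested from the upstream server, which is what distinguishes this from suck(1)-style full-group pulls.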
The “proxy” part means that the server could be instructed to
preserve the Xref: header of its “primary” (“backing”) source
(though only for specific newsgroups or hierarchies, and still
allowing for different sources to be used to actually fetch the
articles.) Posting in proxy mode would be performed
synchronously, with no reply sent to the user agent until the
“backing” source itself replies, or a timeout occurs.
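The synchronous-posting rule can be illustrated without any real networking; `post_to_backing` below stands in for the actual NNTP POST exchange, and the 240/441 status codes are the usual NNTP replies for accepted/failed postings:

```python
# Proxy-mode posting: the client gets no answer until the backing
# server answers, or the attempt times out.
def proxy_post(article, post_to_backing, timeout=30.0):
    try:
        status = post_to_backing(article, timeout=timeout)
    except TimeoutError:
        # On timeout, report failure rather than pretend success.
        return "441 posting failed (backing server timed out)"
    return status  # relay the backing server's own verdict verbatim
```

The point of doing this synchronously is that the user agent learns of a rejected posting immediately, instead of the article silently dying in a queue.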
For the implementation of the prototype, I'm now considering the
Perl language, given the sheer number of extensions available
providing support for a variety of protocols and data
representations.
The IMAP protocol allows for both retrieval /and/ storage of
messages. In particular, certain MUAs allow for copies of the
messages sent via e-mail to be saved into a dedicated IMAP
mailbox. Although unconventional, I guess that the MUAs could
be changed to use such functionality to post new messages.
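That “post by saving into a dedicated mailbox” idea could be sketched with the standard `imaplib` module: the MUA APPENDs the outgoing article to, say, an "Outbox" folder, and the server side injects whatever lands there into netnews. The mailbox name, host, and function names below are hypothetical:

```python
import imaplib
from email.message import EmailMessage

def build_article(group, subject, body):
    """Compose a netnews article as an RFC 5322 message."""
    msg = EmailMessage()
    msg["Newsgroups"] = group
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def post_via_imap(host, user, password, msg, mailbox="Outbox"):
    """Deliver the article by APPENDing it to a dedicated IMAP folder
    (would contact a real server; shown here for shape only)."""
    imap = imaplib.IMAP4_SSL(host)
    imap.login(user, password)
    imap.append(mailbox, None, None, msg.as_bytes())
    imap.logout()
```

The server would then treat the "Outbox" mailbox as its posting queue, giving IMAP-only clients a path into Usenet.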
With netnews and NNTP being more or less moribund, this is
primarily of historical interest these days.
There seem to be a few still active newsgroups, actually.
For now it is transit only (no readers), but I would like to
eventually add reader support. I'd already intended to support both
NNTP and IMAP access once that happens.
However, given the lack of any way to post new messages via IMAP, I'm
not sure it would be very useful for Usenet.
--
FSF associate member #7257