>> I speculate that searching it indirectly via what appears on the
>> WWW is either (a) a means of re-using Google Web mechanisms for
>> Google Groups, or (b) an attempt to apply the page ranking idea to
>> Usenet. The former seems the more likely.
>
> Since the archive belongs to Google and is maintained in any manner
> Google sees fit, I'm not clear on what you're trying to say here.

That's because you're not thinking about the database. Think about the
database that Google has. It comprises a whole lot of stuff from
DejaNews, more old articles taken from various sources, all of the
Usenet traffic that Google has encountered directly, all of Google's
*own* non-Usenet discussion fora, and various third party WWW discussion
fora. But it's almost certainly *not* in the form of a news spool. As
I said, it's probably in a form that allows Google to re-use its WWW
searching and indexing mechanisms (spider+index servers+doc servers).
As I also said, it's possible that Google wanted to employ some
sort of equivalent to the page ranking mechanism that it has for the
WWW. Given that WWW discussion fora are involved, I suspect that Google
tries to treat all of these things as if they were WWW discussion fora,
and then employs its WWW mechanisms upon them.

> Most Usenet servers purge the articles after they become a predefined
> age; [...]

Note that this particular piece of folk wisdom hasn't been true for some
years now. Several of the major Usenet nodes simply don't expire
non-binaries postings at all nowadays. Their storage capacity has far
outstripped the size of the text portion of a full Usenet feed, which is
only a tiny proportion of the full 10TiB/day feed. I remember
that when I last looked at Highwinds Media it had articles in some
newsgroups going back to 2006 or so. Power Usenet is currently
advertising "3013+ days text retention". In other words, it hasn't
expired a non-binaries posting for *eight years*. I haven't expired any
non-binaries posts from my node in that time, either. We all effectively
just turned non-binaries expiry off half a decade or more ago.
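
For anyone who wants to check the "eight years" figure, here is a rough
conversion of the advertised 3013 days into years. The helper name and
the 365.25-day average year are my own assumptions for the estimate, not
anything the provider publishes.

```python
# Rough conversion of an advertised retention period (in days) to years.
# DAYS_PER_YEAR is an assumed average year length, good enough for an estimate.
DAYS_PER_YEAR = 365.25


def retention_years(days: int) -> float:
    """Return the retention period expressed in (approximate) years."""
    return days / DAYS_PER_YEAR


if __name__ == "__main__":
    # 3013 is the "3013+ days text retention" figure quoted above.
    print(f"{retention_years(3013):.2f} years")
```

That comes out at a little over eight years, which is consistent with
the claim above.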