Discussion:
Usenet Markup
Jason Evans
2021-11-12 07:30:48 UTC
Some of you may have seen the HTML spammer in news.admin.misc and a few
other places. Someone else recently asked on Reddit why HTML isn't used
more on Usenet. I think the basic answer is that so many people still
use terminal-based newsreaders like SLRN and NN, and HTML just makes
everything harder to read.

I think there could be a middle ground. Take something like gemtext which
is a very limited subset of the Markdown markup language. Gemtext has
only 5 different operations that are used for formatting:

Links:
=> https://blog.theuse.net My Blog

Headings:
# Heading
## Sub-heading
### Sub-sub-heading

Lists:
* Item 1
* Item 2
* Item 3

Quotes:
> That's what she said

Preformatted text/Source code:
```
#!/bin/bash

echo "this is a bash script"
```
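
To give an idea of how little a newsreader would have to do, here is a rough sketch in Python. It is not taken from any existing client; the function name `classify_gemtext` and the label strings are made up for illustration. It tells the gemtext line types apart by looking only at the start of each line:

```python
FENCE = "`" * 3  # the three-backtick toggle line for preformatted text


def classify_gemtext(text):
    """Label each line of a gemtext-style body with its line type."""
    labelled = []
    preformatted = False
    for line in text.splitlines():
        if line.startswith(FENCE):
            # A fence line switches preformatted mode on or off.
            preformatted = not preformatted
            labelled.append(("fence", line))
        elif preformatted:
            # Inside a fence nothing is interpreted, not even "#" or "=>".
            labelled.append(("pre", line))
        elif line.startswith("=>"):
            labelled.append(("link", line[2:].strip()))
        elif line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            labelled.append((f"heading{min(level, 3)}", line.lstrip("#").strip()))
        elif line.startswith("* "):
            labelled.append(("list", line[2:]))
        elif line.startswith(">"):
            labelled.append(("quote", line[1:].strip()))
        else:
            labelled.append(("text", line))
    return labelled
```

One thing a real implementation would have to sort out is that ">" already means "quoted from an earlier article" on Usenet, so gemtext quote lines and reply quoting would collide.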

Right now there are no newsreaders that handle this kind of markup, but
if there were, users of other newsreaders would not be distracted by it
in an article they were reading, because it doesn't interfere with
normal reading the way HTML does.

Just throwing this out there for discussion.
Grant Taylor
2021-11-12 18:40:32 UTC
Post by Jason Evans
Some of you may have seen the HTML spammer in news.admin.misc and a few
other places. Someone else recently asked on Reddit why HTML isn't used
more on Usenet.
In a word, "convention".
Post by Jason Evans
I think the basic answer is because so many people still use
terminal-based newsreaders like SLRN and NN and HTML just makes
everything harder to read.
Maybe, maybe not.
Post by Jason Evans
I think there could be a middle ground. Take something like gemtext which
is a very limited subset of the Markdown markup language.
gemtext seems like it might be fairly innocuous, much like
format=flowed. Though I suspect that gemtext's uptake would be lower
than format=flowed's uptake.
Post by Jason Evans
Right now there are no newsreaders that handle this kind of markup but if
there were, users who do not use that newsreader would not be distracted
by it being in an article that they were reading because they don't
interfere with normal reading the way HTML interferes.
Valid point.
Post by Jason Evans
Just throwing this out there for discussion.
I think introducing another form of markup seems like it's only going to
muddy the water even more.

The wonderful thing about standards is that we have so many to pick
from. À la xkcd 927 -- Standards

https://xkcd.com/927/
--
Grant. . . .
unix || die
Adam H. Kerman
2021-11-12 20:35:11 UTC
Post by Grant Taylor
Post by Jason Evans
Some of you may have seen the HTML spammer in news.admin.misc and a few
other places. Someone else recently asked on Reddit why HTML isn't used
more on Usenet.
In a word, "convention".
Post by Jason Evans
I think the basic answer is because so many people still use
terminal-based newsreaders like SLRN and NN and HTML just makes
everything harder to read.
Maybe, maybe not.
Post by Jason Evans
I think there could be a middle ground. Take something like gemtext which
is a very limited subset of the Markdown markup language.
gemtext seems like it might be fairly innocuous, much like
format=flowed. Though I suspect that gemtext's uptake would be lower
than format=flowed's uptake.
And yet format=flowed is badly implemented on any number of Mail and
News clients.

Whether gemtext is innocuous depends on successful implementation in
clients.

I like plain text ASCII. It's universally readable.
Post by Grant Taylor
Post by Jason Evans
Right now there are no newsreaders that handle this kind of markup but if
there were, users who do not use that newsreader would not be distracted
by it being in an article that they were reading because they don't
interfere with normal reading the way HTML interferes.
Valid point.
Post by Jason Evans
Just throwing this out there for discussion.
I think introducing another form of markup seems like it's only going to
muddy the water even more.
The wonderful thing about standards is that we have so many to pick
from. À la xkcd 927 -- Standards
https://xkcd.com/927/
Grant Taylor
2021-11-13 00:32:31 UTC
Post by Adam H. Kerman
And yet format=flowed is badly implemented on any number of Mail and
News clients.
I've been using format=flowed for close to 20 years without any problems.

Bugs exist in almost all computer programs. Implementations of
format=flowed, or default configurations therefor, are subject to
similar bugs.
Post by Adam H. Kerman
Whether gemtext is innocuous depends on successful implementation
in clients.
I like plain text ASCII. It's universally readable.
format=flowed *is* /plain/ /text/.
--
Grant. . . .
unix || die
Adam H. Kerman
2021-11-13 01:32:17 UTC
Post by Grant Taylor
Post by Adam H. Kerman
And yet format=flowed is badly implemented on any number of Mail and
News clients.
I've been using format=flowed for close to 20 years without any problems.
I think that's wonderful. I've attempted to use it in different clients.
Almost none can handle it in followup. The quotes ended up nonstandard.
Post by Grant Taylor
Bugs exist in almost all computer programs. Implementations of
format=flowed, or default configurations therefor, are subject to
similar bugs.
Ok. That's a tautology.
Post by Grant Taylor
Post by Adam H. Kerman
Whether gemtext is innocuous depends on successful implementation
in clients.
I like plain text ASCII. It's universally readable.
format=flowed *is* /plain/ /text/.
What does that have to do with my comment about ASCII? It's not
character set dependent.

I've seen clients produce long lines in format=flowed, which makes it
NOT universally readable if it won't output lines of 78 characters or
less. There's no reason to output text not intended to be displayed on
a screen width of 80 characters as it's designed to treat each paragraph
as one long line and reformat on the fly based on screen width.
Grant Taylor
2021-11-13 04:13:07 UTC
Post by Adam H. Kerman
I think that's wonderful. I've attempted to use it in different clients.
Almost none can handle it in followup. The quotes ended up nonstandard.
I've not noticed a problem in follow up replies vs new messages per se.
I think the problem has to do with the source material, be it newly
typed text or copy that's being replied to.

For instance, your comment above is not formatted per format=flowed
standards. I've slightly altered a copy below such that it is formatted
per format=flowed. I've also increased the quote depth.
Post by Adam H. Kerman
Post by Adam H. Kerman
I think that's wonderful. I've attempted to use it in different clients.
Almost none can handle it in followup. The quotes ended up nonstandard.
Here's another copy of it that has been completely reformatted the way
that I usually do things. Plus yet another increase in quote depth.
Post by Adam H. Kerman
Post by Adam H. Kerman
Post by Adam H. Kerman
I think that's wonderful. I've attempted to use it in different
clients. Almost none can handle it in followup. The quotes ended
up nonstandard.
:-)
Post by Adam H. Kerman
What does that have to do with my comment about ASCII? It's not
character set dependent.
Sorry, I think of plain text as being ASCII. Or more precisely plain
text is a subset of ASCII.
Post by Adam H. Kerman
I've seen clients produce long lines in format=flowed, which makes it
NOT universally readable if it won't output lines of 78 characters or
less.
That statement tells me without a doubt that those long lines (as viewed
in the message source) are NOT format=flowed.

It sounds like you are instead talking about simply really long /
unwrapped lines of text. Which is something that I consider to be an
abomination.
Post by Adam H. Kerman
There's no reason to output text not intended to be displayed on a
screen width of 80 characters as it's designed to treat each paragraph
as one long line and reformat on the fly based on screen width.
Why is there any reason to artificially limit /display/ of text to 80
-- or pick the number you prefer -- characters?

I believe that format=flowed is a wonderful happy medium. It allows
format=flowed enabled readers to re-wrap the text to the window width
for /display/ while the underlying source is fixed width. Thus
format=flowed will respect those that want the text to be re-wrapped
/and/ those that want text to be < 80 characters per line.

Format=flowed works by doing two things:

1) Adding a header indicating that format=flowed is being used.
2) Outputting lines of text up to but not exceeding a fixed width. That
fixed width is usually set to 72 through 76 characters.

The output lines that are supposed to be continued end with a space.

The existence of the format=flowed header means that any line that ends
with a space is supposed to have the next line appended to it.
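
As a rough sketch of the above in Python: the function names are made up for illustration, and quoting, space-stuffing, and the DelSp parameter from RFC 3676 are deliberately left out. The header in question is the format=flowed parameter on the text/plain Content-Type; the wrapping and re-joining could look something like this:

```python
import textwrap


def to_flowed(paragraph, width=72):
    """Wrap one paragraph; every line except the last ends with a space."""
    lines = textwrap.wrap(
        paragraph,
        width=width - 1,         # leave room for the trailing space
        break_long_words=False,  # never split a word across lines
        break_on_hyphens=False,  # breaking at "-" would add a stray space
    )
    return [line + " " for line in lines[:-1]] + lines[-1:]


def from_flowed(lines):
    """Join soft-broken lines (those ending in a space) into paragraphs."""
    paragraphs, current = [], ""
    for line in lines:
        if line == "-- ":
            # The signature separator ends in a space but is never flowed.
            if current:
                paragraphs.append(current.rstrip())
                current = ""
            paragraphs.append(line)
            continue
        current += line
        if not line.endswith(" "):
            # A line without a trailing space ends the paragraph.
            paragraphs.append(current)
            current = ""
    if current:
        paragraphs.append(current.rstrip())
    return paragraphs
```

A reader that knows nothing about format=flowed just sees ordinary short lines, each with one invisible trailing space. A reader that does know can call something like from_flowed() and re-wrap the result to whatever width the window happens to be.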
--
Grant. . . .
unix || die
Adam H. Kerman
2021-11-13 12:42:42 UTC
Post by Grant Taylor
Post by Adam H. Kerman
I think that's wonderful. I've attempted to use it in different clients.
Almost none can handle it in followup. The quotes ended up nonstandard.
I've not noticed a problem in follow up replies vs new messages per se.
I think the problem has to do with the source material, be it newly
typed text or copy that's being replied to.
The problem in my experience has not been the source material. With a
client that poorly implements it, I've observed that the quote of
material that began as flowed text is no longer flowed text.
Post by Grant Taylor
Post by Adam H. Kerman
. . .
What does that have to do with my comment about ASCII? It's not
character set dependent.
Sorry, I think of plain text as being ASCII. Or more precisely plain
text is a subset of ASCII.
I'm not going to dispute that plain text using the Latin alphabet in a
language other than English with accented characters is plain text.
However, I do not recognize as plain text substituting open and close
single and double quotes in punctuation or em and en dashes for which
there are perfectly good ASCII punctuation marks. I have yet to see one
of these "smart quote" implementations that distinguish between single
close quote, apostrophe, and acute accent. All three may be represented
by the same glyph but they have separate character codes in UTF.
Post by Grant Taylor
Post by Adam H. Kerman
I've seen clients produce long lines in format=flowed, which makes it
NOT universally readable if it won't output lines of 78 characters or
less.
That statement tells me without a doubt that those long lines (as viewed
in the message source) are NOT format=flowed.
format=flowed is not line length dependent, for the entire paragraph may
be one long line or a series of lines of widely varying length, and
still display as intended within the viewport.

I looked at RFC 3676. The recommendation not to exceed 78 characters is
a SHOULD, not a MUST.
Post by Grant Taylor
It sounds like you are instead talking about simply really long /
unwrapped lines of text. Which is something that I consider to be an
abomination.
No, I'm not. I'm talking about those who use a variable-width
character set instead of a fixed-width character set when composing News
and Mail. The line length is then set by their viewport, ignoring the
needs of those of us who continue to expect output for an 80 character
width terminal or emulation.
Post by Grant Taylor
Post by Adam H. Kerman
. . .
Grant Taylor
2021-11-13 17:10:08 UTC
Post by Adam H. Kerman
The problem in my experience has not been the source material. With
a client that poorly implements it, I've observed that the quote of
material that began as flowed text is no longer flowed text.
Ah.

I too have seen many email / news clients not properly handle quoting of
previously format=flowed text.

To me, that quote text is /new/ text that happens to resemble the
original format=flowed text. Somewhat like a photocopy of a photocopy.

You are referring to an area where there are far more bugs. An area
where I have taken to manually reformatting text that goes into articles
that I reply to. I have a script that I use to reformat quoted text.

Use of format=flowed is important enough to me that I take time to
reformat=flowed (reflow?) text in messages that I reply to.
Post by Adam H. Kerman
I'm not going to dispute that plain text using the Latin alphabet in
a language other than English with accented characters is plain text.
However, I do not recognize as plain text substituting open and close
single and double quotes in punctuation or em and en dashes for which
there are perfectly good ASCII punctuation marks. I have yet to see one
of these "smart quote" implementations that distinguish between single
close quote, apostrophe, and acute accent. All three may be represented
by the same glyph but they have separate character codes in UTF.
That statement didn't go where I thought it was going to go. I will say
that I may have simplified "plain text" to some extent. But I didn't
think this thread was going far enough into the weeds about that. I'll
mostly bow out of that discussion.

Mostly as in there is a difference in meaning of a dash (a.k.a. hyphen),
an en-dash, and an em-dash. Admittedly they are lost on many people.

I think a comparison can be made to an "o", "O", and "0". As there was
a time when people had to be taught to use the proper letter in the
proper context, particularly when interacting with computers.

1-5 (dash / hyphen) vs 1–5 (en-dash) Am I subtracting 5 from 1 or am I
saying 1 through 5? The dash vs en-dash makes a difference.

<Title> — <subtitle> An em-dash is a form of separator / pause.

There are differences between the "-" (dash / hyphen), "–" (en-dash),
and "—" (em-dash). Some may think that the differences are an
unnecessary nuance.
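
For anyone who wants to check which character they are actually looking at, a small Python sketch (the helper name `describe` is made up for illustration) prints the Unicode code point and official name of each character. It works the same way for the apostrophe, the single close quote, and the acute accent mentioned in the quoted text:

```python
import unicodedata


def describe(chars):
    """Return 'U+XXXX NAME' for every character in the given string."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in chars]


# The three dashes discussed here: hyphen, en-dash, em-dash.
for entry in describe("-\u2013\u2014"):
    print(entry)

# Apostrophe, single close quote, acute accent.
for entry in describe("'\u2019\u00b4"):
    print(entry)
```

The glyphs may look alike in a given font, but each one is a distinct code point with its own name.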
Post by Adam H. Kerman
format=flowed is not line length dependent, for the entire paragraph
may be one long line or a series of lines of widely varying length,
and still display as intended within the viewport.
Which is one of the reasons that I like format=flowed as much as I do.
Post by Adam H. Kerman
I looked at RFC 3676. The recommendation not to exceed 78 characters
is a SHOULD, not a MUST.
Yep.
Post by Adam H. Kerman
No, I'm not. I'm talking about those who use a variable-width character
set instead of a fixed-width character set when composing News and
Mail. The line length is then set by their viewport, ignoring the
needs of those of us who continue to expect output for an 80 character
width terminal or emulation.
The font face should not make a difference. Variable vs fixed width
text should support format=flowed perfectly fine.

The fact that you are running into a hard line length — other than
someone using the wrong value close to 78 — tells me that you are
dealing with something that's not implementing format=flowed properly.
--
Grant. . . .
unix || die
Adam H. Kerman
2021-11-13 18:25:32 UTC
Post by Grant Taylor
Post by Adam H. Kerman
format=flowed is not line length dependent, for the entire paragraph
may be one long line or a series of lines of widely varying length,
and still display as intended within the viewport.
Which is one of the reasons that I like format=flowed as much as I do.
Post by Adam H. Kerman
I looked at RFC 3676. The recommendation not to exceed 78 characters
is a SHOULD, not a MUST.
Yep.
Post by Adam H. Kerman
No, I'm not. I'm talking about those who use a variable-width character
set instead of a fixed-width character set when composing News and
Mail. The line length is then set by their viewport, ignoring the
needs of those of us who continue to expect output for an 80 character
width terminal or emulation.
The font face should not make a difference.
Of course not, but I didn't write the client to flout conventional line
length.
Post by Grant Taylor
Variable vs fixed width text should support format=flowed perfectly fine.
The fact that you are running into a hard line length — other than
someone using the wrong value close to 78 — tells me that you are
dealing with something that's not implementing format=flowed properly.
I'm disagreeing with you on that. Outputting conventional line length is
a separate issue from outputting standard flowed text.
Grant Taylor
2021-11-14 00:41:41 UTC
Post by Adam H. Kerman
I'm disagreeing with you on that. Outputting conventional line length
is a separate issue from outputting standard flowed text.
Please provide an example of where you are seeing a problem.

I'm still not tracking where the problem would relate to format=flowed.
--
Grant. . . .
unix || die
Adam H. Kerman
2021-11-14 01:35:31 UTC
Post by Grant Taylor
Post by Adam H. Kerman
I'm disagreeing with you on that. Outputting conventional line length
is a separate issue from outputting standard flowed text.
Please provide an example of where you are seeing a problem.
I'm still not tracking where the problem would relate to format=flowed.
That would be my point. It's an unrelated issue.
Julien ÉLIE
2021-11-13 08:47:54 UTC
Hi Adam,
Post by Adam H. Kerman
I like plain text ASCII. It's universally readable.
Just responding to say I like UTF-8 better :-)

P.-S.: I think you now decode my messages well :-)
--
Julien ÉLIE

« Dès que le silence se fait, les gens le meublent. » (Raymond Devos)
Adam H. Kerman
2021-11-13 13:46:11 UTC
Julien à LIE <***@nom-de-mon-site.com.invalid> wrote:

I am aware that you spell your last name with Latin Capital E With Acute
Accent. Your encoded word is =c3=89. I'm using the vim text editor to edit
this followup. I see Latin Capital A with Tilde.
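
This kind of display is what you would expect if UTF-8 bytes are being read as a single-byte character set somewhere along the way. A small Python sketch reproduces it; the function names are made up for illustration, and Latin-1 is only an assumption about which wrong charset is involved:

```python
def misdecode(text, wrong="latin-1"):
    """Encode as UTF-8, then decode the bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong)


def repair(garbled, wrong="latin-1"):
    """Undo misdecode(): recover the bytes and decode them as UTF-8."""
    return garbled.encode(wrong).decode("utf-8")


raw = "\u00c9".encode("utf-8")       # Latin Capital E With Acute
print(raw)                           # the two bytes 0xC3 0x89
print(ascii(misdecode("\u00c9")))    # A-tilde followed by control char 0x89
print(repair(misdecode("\u00c9LIE")))
```

Read as Latin-1, byte 0xC3 is Latin Capital A with Tilde and byte 0x89 is a non-printing control character, which a terminal would show as something like <89>.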
Post by Julien ÉLIE
Hi Adam,
Post by Adam H. Kerman
I like plain text ASCII. It's universally readable.
Just responding to say I like UTF-8 better :-)
P.-S.: I think you now decode my messages well :-)
No. It's never displayed on my UTF-8 terminal emulations as you intended.
On the Linux Mint laptop I'm using Xfce Terminal Emulator.
Post by Julien ÉLIE
--
Julien ÉLIE
I see Latin Capital A with Tilde, then <89> which is an undecoded display.
Post by Julien ÉLIE
« DÚs que le silence se fait, les gens le meublent. » (Raymond Devos)
Here I see several non-printing characters and various accented letters
not displaying as you intended.
Grant Taylor
2021-11-13 17:14:54 UTC
Post by Adam H. Kerman
No. It's never displayed on my UTF-8 terminal emulations as you
intended. On the Linux Mint laptop I'm using Xfce Terminal Emulator.
Sounds to me like you need to get a better terminal emulator.

XTerm displays it just fine and looks identical to what Thunderbird
displays.
Post by Adam H. Kerman
I see Latin Capital A with Tilde, then <89> which is an undecoded display.
Or perhaps your MUA / editor needs some tweaking.

I'm able to see Julien's text just fine inside of vim in XTerm.
Post by Adam H. Kerman
Here I see several non-printing characters and various accented
letters not displaying as you intended.
Sounds to me like you're seeing the message source, not a rendered
message. Hence the MUA comment.
--
Grant. . . .
unix || die
Grant Taylor
2021-11-13 17:17:08 UTC
Post by Grant Taylor
Sounds to me like you need to get a better terminal emulator.
...
Or perhaps your MUA / editor needs some tweaking.
...
Sounds to me like you're seeing the message source, not a rendered
message. Hence the MUA comment.
You are obviously free to use whatever software / hardware that you want
to use.

But I implore you to understand where the limitation is. Maybe it's a
configuration issue of what you're using. Maybe it's a lack of capability.

Use what you want to, but understand what and how your choice impacts
what you do.
--
Grant. . . .
unix || die
Adam H. Kerman
2021-11-13 18:32:24 UTC
Post by Grant Taylor
Post by Adam H. Kerman
No. It's never displayed on my UTF-8 terminal emulations as you
intended. On the Linux Mint laptop I'm using Xfce Terminal Emulator.
Sounds to me like you need to get a better terminal emulator.
It may not be the terminal emulator. I have a similar problem with
Julien's articles on PuTTY on a Windows 8.1 desktop.

My guess is it's something in the LOCALE setting of the remote terminal
but I have no idea what it could be since both the terminal and the two
emulations are set to UTF-8.
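
One way to at least see what the remote side believes it is using is a short Python sketch like the one below, run on the remote host inside the same session. The helper name `encoding_report` is made up for illustration, and it only reports settings; it does not prove where the decoding goes wrong:

```python
import locale
import os
import sys


def encoding_report():
    """Collect the locale-related settings visible to this process."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        # Encoding Python derives from the locale, without changing it.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used when writing to the terminal.
        "stdout": sys.stdout.encoding,
    }


for key, value in encoding_report().items():
    print(f"{key}: {value}")
```

If any of these report something other than UTF-8 while the emulator is set to UTF-8, that mismatch would be a place to start looking.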
Post by Grant Taylor
Or perhaps your MUA / editor needs some tweaking.
Then you'll have to clue me in. As far as I'm aware, vim doesn't touch
this stuff and can't override the terminal emulation.
Post by Grant Taylor
Post by Adam H. Kerman
Here I see several non-printing characters and various accented
letters not displaying as you intended.
Sounds to me like you're seeing the message source, not a rendered
message. Hence the MUA comment.
When there's an incompatibility, non-printing characters become visible.
I sometimes do that deliberately to make sure I've eliminated them when
I intend to send plain text.
Grant Taylor
2021-11-14 00:43:37 UTC
Post by Adam H. Kerman
My guess is it's something in the LOCALE setting of the remote terminal
but I have no idea what it could be since both the terminal and the
two emulations are set to UTF-8.
Maybe.
Post by Adam H. Kerman
Then you'll have to clue me in. As far as I'm aware, vim doesn't
touch this stuff and can't override the terminal emulation.
I was thinking that the MUA might not be processing / decoding things
correctly and thus displaying them incorrectly.
--
Grant. . . .
unix || die
Donkey Button
2021-11-21 05:37:51 UTC
What idiot would post a noncontroversial opinion like this through
mixmin?
*plonk*

The name-calling and [snipped] profanity is unhinged and pubescent
behavior. It seems like this person was asking for his address to
be killfiled. I have granted his subliminal request.

--
Donkey Button
Adam H. Kerman
2021-11-21 18:26:09 UTC
Post by Donkey Button
What idiot would post a noncontroversial opinion like this through
mixmin?
*plonk*
The name-calling and [snipped] profanity is unhinged and pubescent
behavior. It seems like this person was asking for his address to
be killfiled. I have granted his subliminal request.
hi seamus

fuck off seamus

bye bye seamus
Kefra Gotex
2022-11-07 11:49:09 UTC
Post by Jason Evans
Some of you may have seen the HTML spammer in news.admin.misc and a few
other places. Someone else recently asked on Reddit why HTML isn't used
more on Usenet. I think the basic answer is because so many people still
use terminal-based newsreaders like SLRN and NN and HTML just makes
everything harder to read.
I think there could be a middle ground. Take something like gemtext which
is a very limited subset of the Markdown markup language. Gemtext has
=> https://blog.theuse.net My Blog
# Heading
## Sub-heading
### Sub-sub-heading
* Item 1
* Item 2
* Item 3
That's what she said
```
#!/bin/bash
echo "this is a bash script"
```
Right now there are no newsreaders that handle this kind of markup but if
there were, users who do not use that newsreader would not be distracted
by it being in an article that they were reading because they don't
interfere with normal reading the way HTML interferes.
Just throwing this out there for discussion.
I think markdown would be slightly better, since it can syntactically be
rendered as HTML, with all styling controlled by the reader rather than
the author.

For those addicted to extreme simplicity, a half-dozen markdown rules is
all they would need to know, which would put it on par with Gemini text.

A custom NNTP header in each message could indicate the format for
readers aware of formats. It could be like a !DOCTYPE declaration. For
example:

Doctype: html5
Doctype: xml
Markup: gemini 1.0
Markup: markdown
Markup: commonmark
Markup: GFM 3.2
Markup: RST
Markup: asciidoc
Markup: bbcode
Markup: wikimedia

Readers without markup awareness could just display as is.
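
As a rough sketch of how a reader could act on such a header, here is some Python. The "Markup" header is only the proposal above, not an existing standard, and the function name `pick_renderer` and the renderer table are made up for illustration. Anything missing or unrecognized falls back to plain display:

```python
from email import message_from_string


def pick_renderer(raw_article, renderers):
    """Choose a renderer from the hypothetical Markup header."""
    msg = message_from_string(raw_article)
    # "gemini 1.0" -> ["gemini", "1.0"]; a missing header -> []
    fields = (msg.get("Markup") or "").split()
    name = fields[0].lower() if fields else ""
    # Unknown or absent markup is displayed as is.
    return renderers.get(name, renderers["plain"])


article = "From: poster@example.invalid\nMarkup: gemini 1.0\n\n# Heading\n"
renderers = {"plain": "display as is", "gemini": "render as gemtext"}
print(pick_renderer(article, renderers))
```

In a real client the table values would be functions rather than strings, but the fallback rule is the important part.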

Mime dividers could allow multiple formats to be attached. This would
allow the client software to choose which format to render. Using a
proper compression algorithm like 7z or xz would deflate multiple
markups of the same text very well since they would share most words in
common. Therefore bandwidth inflation would not really be an issue.
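
The multiple-format part can already be expressed with MIME multipart/alternative. A rough sketch using Python's standard email package follows; the function name `build_article` and the sample text are made up for illustration, and whether news servers and readers would cope with such articles is a separate question:

```python
from email.message import EmailMessage


def build_article(plain, markdown):
    """Build one article carrying the same text in two formats."""
    msg = EmailMessage()
    msg["Subject"] = "Usenet Markup"
    # First part: what every reader can display.
    msg.set_content(plain)
    # Second part: turns the message into multipart/alternative and
    # adds a text/markdown rendition of the same text.
    msg.add_alternative(markdown, subtype="markdown")
    return msg


article = build_article("Heading\n\nSome text.\n", "# Heading\n\nSome text.\n")
print(article.get_content_type())
for part in article.iter_parts():
    print(part.get_content_type())
```

A client would then pick the richest part it knows how to render and ignore the rest.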

The big tech shills would want the client to access a remote URI to get
or validate the doctype or markup declaration, so they can get IPs of
Usenetizens. The moment anyone would try to put this poison into the
protocol, it would be necessary to expose it for its true motivation.
There is no reason whatsoever for any NNTP client to access any URI to
validate any formatting declaration. An RFC would need to run ahead of
this making clear that URI access is prohibited for rendering of any
format. Look at Google API scripts and fonts for an example of how that
surveillance operation works in web pages.
