astronomy-chat.net
Promoting astronomy discussion.



Date: 12 Nov 2006 02:25:03
From: Chalky
Subject: Restricted ASCII?


ASCII is defined in wiki as an 8-bit system, developed from telegraphic
codes, to allow for 256 characters.

However, I have noticed that this set is truncated to 7 characters at
sci.astro.research to conform to its first commercial use as a
seven-bit teleprinter code (1963).

This can result in considerable garbling of 8-bit ASCII text.

Since I am pretty sure I have seen Schrodinger's equation spelled
correctly at sci.physics.research, I am curious to discover how endemic
this restriction of the ASCII set actually still is, in the Usenet
groups.

Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
-, and for copyright, below

° ± ©

Chalky





 
Date: 12 Nov 2006 15:31:45
From: David Woolley
Subject: Re: Restricted ASCII?


In article <1163343127.057255.151280@h48g2000cwc.googlegroups.com>,
chalkyspam@bleachboys.co.uk wrote:

> I think it is unfortunate that we now appear to be restricted to that
> subset of ascii which is represented by a single keypress on a standard
> British or American keyboard.

If you include shift and control modifiers, what you get on a standard
US keyboard is the whole of ASCII. With the same caveat, what you get
on a standard British keyboard is a superset of ASCII, because old keyboards
support £ as a simple shifted character. (Modern ones also have the
EURO symbol, but that requires the ISO 8859/15 character code or one
of the proprietary ones. EURO, on PC keyboards, also requires the use
of the alt-graphics modifier key.)

As to American keyboards, I would have thought some parts of America
used keyboards optimised for Spanish or Portuguese.

Incidentally, this one was identified as pure ASCII, but I've had
to convert to ISO 8859/1, in order to include £. I haven't used
ISO 8859/15, as that is relatively recent and there may be people
without support for it, either in the browser, or in fonts.
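
A minimal sketch of the character-set point above, assuming Python's standard codec names (`ascii`, `latin-1` for ISO 8859/1, `iso8859-15` for ISO 8859/15). The helper name `encodable` is made up for illustration and is not from any poster's software:

```python
def encodable(ch: str, codec: str) -> bool:
    """Return True if the character has a code in the given character set."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Pound sign and euro sign against the three character sets discussed.
for codec in ("ascii", "latin-1", "iso8859-15"):
    print(codec, "pound:", encodable("£", codec), "euro:", encodable("€", codec))

# In ISO 8859/1 the pound sign is the single byte 0xA3 (shown as =A3 in
# quoted-printable).
print("£".encode("latin-1"))
```

Under these codecs the pound sign is outside ASCII but inside both ISO sets, while the euro sign only appears in ISO 8859/15.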

> Content-Type: text/plain; charset="us-ascii"

(Note the level of quoting is getting close to the level at which my
spam filter cuts in because the article is too long.)

PS. It looks like the article was rejected by the moderator on the
research newsgroup, and that they rejected it for the obvious reason
that it was off topic.


 
Date: 12 Nov 2006 15:38:12
From: Tom Roberts
Subject: Re: Restricted ASCII?


Chalky wrote:
> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.
>
> However, I have noticed that this set is truncated to 7 characters at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963). [...]

This is not due to the newsgroup itself or to the underlying newsgroup
software. It is due to the newsgroup client used by individual people.
For instance, I use Firefox and your symbols come out fine, except in
moderated newsgroups. In moderated newsgroups it also depends on the
email system used (author->moderator) and on the client software used by
the moderator, because postings are processed via email and by the
moderator's client before being distributed.

So this is almost surely due to either the email software or the
moderator of sci.astro.research using newsgroup software that truncates
the high bit.
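
What "truncating the high bit" would do to the three test symbols can be sketched as follows. This is only an illustration, assuming the text was sent as ISO 8859-1 bytes; the function name `strip_high_bit` is hypothetical, and nothing here is known about the moderator's actual software:

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear the most significant bit of every byte (mask with 0x7F)."""
    return bytes(b & 0x7F for b in data)


original = "° ± ©".encode("latin-1")   # bytes B0 20 B1 20 A9
damaged = strip_high_bit(original)

# Each symbol collapses onto an unrelated 7-bit character.
print(damaged.decode("ascii"))          # prints: 0 1 )
```

Plain 7-bit text passes through such a filter unchanged, which is why the damage only shows up on characters above 127.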


Tom Roberts


 
Date: 12 Nov 2006 07:30:41
From: Chalky
Subject: Re: Restricted ASCII?



George Dishman wrote:

> [note: hand indented because the special characters
> included by "Chalky" forced the use of "quoted
> printable" which prevents Outlook Express handling
> the indent automatically.]

OK, but what does this mean in terms of displayed information, since I
don't really understand this comment?

> "Chalky" <chalkyspam@bleachboys.co.uk> wrote in message
> news:1163327103.618298.79130@e3g2000cwe.googlegroups.com...
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.
>
> http://en.wikipedia.org/wiki/ASCII

WOW. That text has CERTAINLY been changed since this morning. What I
quoted above was copied and pasted _directly_ from the first sentence
of that reference, under the sub heading "History", at ~10AM BST today!
>
> "ASCII was first published as a standard in 1967 and
> was last updated in 1986. It currently defines codes
> for 128 characters. 33 are non-printing, mostly
> obsolete control characters that affect how text is
> processed, and the other 95 printable characters are
> as follows (starting with the space character):"
>
> and later
>
> "ASCII is, strictly, a seven-bit code, meaning that it
> uses the bit patterns representable with seven binary
> digits (a range of 0 to 127 decimal) to represent
> character information."

Yes. That is copied from the CHANGED Wiki reference, which _postdates_
my first posting of today.

I disagree with that conveniently changed reference of this afternoon,
anyway.

The Server Side coding of http://1stlight.org/design/ascii.asp,
specifically instructs any Windows NT 4, Windows 2000, or subsequent
Microsoft server, to display the ascii symbols for all n from 1 to 255,
in sequence. Neither the server nor any known browser has any
difficulty in doing so. This works with both server side and client
side scripting, and has done so since the last century.
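
As a rough illustration of what a byte above 127 displays as, the sketch below decodes a few byte values under two 8-bit character sets. It assumes the page is served as either ISO 8859-1 or Windows-1252 (the ASP page itself is not reproduced here), and the helper name `decode_byte` is made up:

```python
def decode_byte(value: int, codec: str) -> str:
    """Interpret a single byte value under the named 8-bit character set."""
    return bytes([value]).decode(codec)


for value in (0x80, 0xA3, 0xA9, 0xB0, 0xB1):
    latin1 = decode_byte(value, "latin-1")
    cp1252 = decode_byte(value, "cp1252")
    print(f"0x{value:02X}  latin-1={latin1!r}  cp1252={cp1252!r}")
```

The two sets agree on 0xA0 to 0xFF but differ below that, so what n from 128 to 255 looks like depends on which 8-bit set the server and browser agree on.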

> History aside, use of 8-bit characters breaks one
> of the most common newsreaders.

Which one?

> It may be MS's fault

I doubt that.

> but that's where we are.

I doubt that too. As suggested by Edward Green, this seems to be a bug
introduced by the sci.astro.research moderator's software/interface. As
I have already pointed out, 8 bit info works fine at
sci.physics.research, and in every other usenet group tried.


Chalky



  
Date: 12 Nov 2006 16:25:15
From: George Dishman
Subject: Re: Restricted ASCII?



"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message
news:1163345440.980996.198470@m7g2000cwm.googlegroups.com...
>
> George Dishman wrote:
>
>> [note: hand indented because the special characters
>> included by "Chalky" forced the use of "quoted
>> printable" which prevents Outlook Express handling
>> the indent automatically.]
>
> OK, but what does this mean in terms of displayed information, since I
> don't really understand this comment?

It means the " > " which prefixes each quoted line
doesn't get put in by the newsreader, I had to edit
it into each line I quoted myself. That's why I
trimmed most of your post.
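
The quoting step a newsreader normally does automatically can be sketched like this. It is a toy version, not Outlook Express's actual behaviour, and `quote_reply` is a hypothetical name:

```python
def quote_reply(text: str, prefix: str = "> ") -> str:
    """Prefix every line of a message with the quote marker.

    Blank lines end up as a bare ">" because trailing spaces are stripped.
    """
    return "\n".join((prefix + line).rstrip() for line in text.splitlines())


message = "ASCII is defined in wiki as an 8-bit system.\n\nChalky"
print(quote_reply(message))
```

Applying it again to already quoted text gives the nested "> >" levels seen in this thread.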

>> "Chalky" <chalkyspam@bleachboys.co.uk> wrote in message
>> news:1163327103.618298.79130@e3g2000cwe.googlegroups.com...
>> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
>> > codes, to allow for 256 characters.
>>
>> http://en.wikipedia.org/wiki/ASCII
>
> WOW. That text has CERTAINLY been changed since this morning. What I
> quoted above was copied and pasted _directly_ from the first sentence
> of that reference, under the sub heading "History", at ~10AM BST today!

I suspect you can find some history of editing of
the page on Wiki that would tell you who changed
it but it was like that when I went there. BTW,
there are some duplicated topics on Wiki so it
might be worth checking your browser history to
be sure you got the same page.

>> "ASCII was first published as a standard in 1967 and
>> was last updated in 1986. It currently defines codes
>> for 128 characters. 33 are non-printing, mostly
>> obsolete control characters that affect how text is
>> processed, and the other 95 printable characters are
>> as follows (starting with the space character):"
>>
>> and later
>>
>> "ASCII is, strictly, a seven-bit code, meaning that it
>> uses the bit patterns representable with seven binary
>> digits (a range of 0 to 127 decimal) to represent
>> character information."
>
> Yes. That is copied from the CHANGED Wiki reference, which _postdates_
> my first posting of today.
>
> I disagree with that conveniently changed reference of this afternoon,
> anyway.

I know what it says now is correct, I've been
familiar with the coding for decades (one of my
early jobs required converting between 7-bit and
5-bit).

> The Server Side coding of http://1stlight.org/design/ascii.asp,
> specifically instructs any Windows NT 4, Windows 2000, or subsequent
> Microsoft server, to display the ascii symbols for all n from 1 to 255,
> in sequence. Neither the server nor any known browser has any
> difficulty in doing so. This works with both server side and client
> side scripting, and has done so since the last century.

Again, MS seldom restricts itself to standards.

>> History aside, use of 8-bit characters breaks one
>> of the most common newsreaders.
>
> Which one?

Outlook Express as I said above.

>> It may be MS's fault
>
> I doubt that.
>
>> but that's where we are.
>
> I doubt that too. As suggested by Edward Green, this seems to be a bug
> introduced by the sci.astro.research moderator's software/interface. As
> I have already pointed out, 8 bit info works fine at
> sci.physics.research, and in every other usenet group tried.

It depends on the characters used. I have had this
problem earlier this year on posts in sci.astro
which is unmoderated.

The first message you sent had this in the headers:

Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1163327108 1505 127.0.0.1 (12 Nov 2006 10:25:08
GMT)

The post I am replying to now has no problems and
these are the headers:

Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Trace: posting.google.com 1163345446 13704 127.0.0.1 (12 Nov 2006
15:30:46 GMT)

Google automatically switched to non-ASCII
because you included the special characters.
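
For anyone who sees raw sequences such as =B0 or =3D in a message, the sketch below shows how quoted-printable text with the headers above would be decoded. It uses Python's standard `quopri` module; the wrapper name `decode_qp` is made up for illustration:

```python
import quopri


def decode_qp(raw: bytes, charset: str = "iso-8859-1") -> str:
    """Undo quoted-printable encoding, then decode the bytes as text."""
    return quopri.decodestring(raw).decode(charset)


print(decode_qp(b"=B0 =B1 =A9"))        # degree, plus-minus, copyright
print(decode_qp(b"MSB=3D0"))            # =3D is an encoded "=" sign
print(decode_qp(b"telegraph=\nic"))     # "=" at end of line is a soft break
```

A reader that ignores the Content-Transfer-Encoding header shows the raw =XX codes instead, which is a display problem rather than a loss of data.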

George




 
Date: 12 Nov 2006 06:52:07
From: Chalky
Subject: Re: Restricted ASCII?



Edward Green wrote:

> Chalky wrote:
>
> > Edward Green wrote:
>
> > > I read everything at Google, the new AOL to some, and your symbols look
> > > fine in any group. I don't know what's going on behind the scenes,
> > > but I don't think a "Usenet group" is the logical entity you think it
> > > is. You have the messages, and you have the formatting, which is left
> > > up to the Newsreader.
> >
> > Not true. The top bit is erased at sci.astro.research.
>
> Can you reference a particular post? Maybe there is something funny
> going on in the moderation step.

Sure. I can do better than that. The following was the
sci.astro.research moderator's response to my own (unaccepted) posting:

I am sorry to inform you that your post has been found unsuitable for
posting to sci.astro.research, for the following reason(s):

* insufficiently relevant to research in astronomy or astrophysics.
Either your message is completely off-topic for this forum, in which
case please submit it to a more appropriate group; or it has
insufficient content related to research to allow it to be posted
under the sci.astro.research charter, in which case it may be better
to post it in sci.astro or one of the other unmoderated groups in
the sci.astro hierarchy.

Moderator, sci.astro.research

[This discussion is off-topic for s.a.r., but you might want to refer
to http://en.wikipedia.org/wiki/ASCII -- mjh]

----------------------------------------------------------------------

Text of your message:
---------------------

>From martinh@chiark.greenend.org.uk Thu Nov 09 16:40:59 2006
Return-path: <martinh@chiark.greenend.org.uk >
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on
hercules.herts.ac.uk
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00,
UNPARSEABLE_RELAY autolearn=ham version=3.1.3
Envelope-to: mjh@localhost
Delivery-date: Thu, 09 Nov 2006 16:40:59 +0000
Received: from localhost ([127.0.0.1] ident=mjh)
by hercules.herts.ac.uk with esmtp (Exim 3.36 #1 (Debian))
id 1GiCxX-00084E-00
for <mjh@localhost >; Thu, 09 Nov 2006 16:40:59 +0000
Received: from tucana.herts.ac.uk [147.197.215.113]
by localhost with IMAP (fetchmail-6.2.5)
for mjh@localhost (single-drop); Thu, 09 Nov 2006 16:40:59 +0000 (GMT)
Received: from corvus.herts.ac.uk ([147.197.215.112] helo=corvus)
by tucana.herts.ac.uk with esmtp (Exim 4.44)
id 1GiCxB-0001Lm-82
for m.j.hardcastle@herts.ac.uk; Thu, 09 Nov 2006 16:40:37 +0000
Received: from [193.201.200.170] (helo=chiark.greenend.org.uk)
by corvus with smtp (Exim 4.40)
id 1GiCx9-0006eG-3h
for mjh@star.herts.ac.uk; Thu, 09 Nov 2006 16:40:35 +0000
Received: from [193.4.58.12] (helo=horus.isnic.is ident=root)
by chiark.greenend.org.uk (Debian Exim 3.36 #1) with esmtp
(return-path news@google.com)
id 1GiCwy-0000OO-00
for sci.astro.research@slimy.greenend.org.uk; Thu, 09 Nov 2006 16:40:24
+0000
Received: from proxy.google.com (proxy.google.com [66.102.7.4])
by horus.isnic.is (8.12.9p2/8.12.9/isnic) with ESMTP id kA9GeMUx027497
for <sci-astro-research@moderators.isc.org >; Thu, 9 Nov 2006 16:40:22
GMT
(envelope-from news@google.com)
Received: from G081002
by proxy.google.com with ESMTP id kA9GeLHu015417
for <sci-astro-research@moderators.isc.org >; Thu, 9 Nov 2006 08:40:21
-0800
Received: (from news@localhost)
by Google Production with id kA9GeKTj031064
for sci-astro-research@moderators.isc.org; Thu, 9 Nov 2006 08:40:20
-0800
To: sci-astro-research@moderators.isc.org
Path: m7g2000cwm.googlegroups.com!not-for-mail
From: "Chalky" <chalkyspam@bleachboys.co.uk >
Newsgroups: sci.astro.research
Subject: Re: A Revised Planck Scale?
Date: 9 Nov 2006 08:40:17 -0800
Organization: http://groups.google.com
Lines: 39
Message-ID: <1163090417.072908.314350@m7g2000cwm.googlegroups.com >
References: <mt2.0-21097-1162632715@hercules.herts.ac.uk >
<mt2.0-21568-1162989572@hercules.herts.ac.uk >
NNTP-Posting-Host: 195.92.67.65
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Trace: posting.google.com 1163090420 31043 127.0.0.1 (9 Nov 2006
16:40:20 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Thu, 9 Nov 2006 16:40:20 +0000 (UTC)
In-Reply-To: <mt2.0-21568-1162989572@hercules.herts.ac.uk >
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT
5.0),gzip(gfe),gzip(gfe)
X-HTTP-Via: 1.1 webcacheH01 (NetCache NetApp/5.5R3D3)
Complaints-To: groups-abuse@google.com
Injection-Info: m7g2000cwm.googlegroups.com; posting-host=195.92.67.65;
posting-account=oMPGkg0AAAB-lceMS5dlyP2BwpYen6gq
X-C-UH-MailScanner: No Virus detected
X-UH-MailScanner-From: martinh@chiark.greenend.org.uk
X-UH-MailScanner-Information: UH-mail
X-UH-MailScanner: No Virus detected


Oh No wrote:

> Thus spake Oh No <NotI@charlesfrancis.wanadoo.co.uk>
> >I should just like to add that the Schwarzschild radius of the proton
> >is not something which appears in standard physical models, the reason
> >being that a classical massive point particle is not a consistent idea
> >in general relativity. In fact a proton must be treated quantum
> >mechanically, and we do not have an accepted theory on that, but if the
> >Schwarzschild radius of the proton were considered then it would have a
> >magnitude given by
> >
> > 2Gm/c^3 =3D 8.28 x 10 e^-63 m
> >
> >Planck length also has a formal definition
> >
> > l_p =3D sqrt(hbar*G/c^3) =3D 1.61605e-35 =B1 1.0e-39 m
> >
> >Neither of these figures is open to revision beyond that allowed by
> >experimental margins of error. If you are defining other quantities, you
> >should give them other names.
> >
> With apologies, I copy pasted those figures from another source. The
> equations looked all right when I posted, but obviously they did not
> contain pure ASCII

You are wrong. At least the latter was pure ascii. Many useful
characters are part of the pure ascii set that used to be accepted in at
least some of these newsgroups. The classic example is the correct o in
Schrodinger, recently featured (correctly) in a title at sci.physics
research.

I think it is unfortunate that we now appear to be restricted to that
subset of ascii which is represented by a single keypress on a standard
British or American keyboard.


Chalky

----- End forwarded message -----

> > These postings
> > were in response to a suggestion from the moderator there, that this
> > should be discussed here.
>
> But where is "here"?

sci.physics, sci.physics.relativity, sci.astro, sci.astro.amateur,
sci.astro.seti (Sci.astro.research moderator's recommendation just
being sci.astro)

Chalky



 
Date: 12 Nov 2006 06:36:55
From: Chalky
Subject: Re: Restricted ASCII?



Chalky wrote:

> Edward Green wrote:
>
> > Chalky wrote:
> >
> > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > > codes, to allow for 256 characters.
> >
> > Neat. Another discussion of computer archaics. And I was afraid
> > outlets for procrastination were closed!
> >
> > > However, I have noticed that this set is truncated to 7 characters at
> > > sci.astro.research to conform to its first commercial use as a
> > > seven-bit teleprinter code (1963).
> >
> > I always thought of them as ASCII and extended-ASCII, or possibly ANSI?
> >
> > > This can result in considerable garbling of 8-bit ASCII text.
> > >
> > > Since I am pretty sure I have seen Schrodinger's equation spelled
> > > correctly at sci.physics.research, I am curious to discover how endemic
> > > this restriction of the ASCII set actually still is, in the Usenet
> > > groups.
> > >
> > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > > -, and for copyright, below
> > >
> > > ° ± ©
> >
> > I read everything at Google, the new AOL to some, and your symbols look
> > fine in any group. I don't know what's going on behind the scenes,
> > but I don't think a "Usenet group" is the logical entity you think it
> > is. You have the messages, and you have the formatting, which is left
> > up to the Newsreader.
>
> Not true. The top bit is erased at sci.astro.research.

To clarify further, by "the top bit" I mean the _Most Significant Bit_.


In computers that are more advanced than 8 bit microprocessors (circa
early '70s), the most significant bit of the computer word is typically
employed for the most important variable. For data, this typically
means + or -. Similarly, for the instruction set (eg in the General
Instruments CP 1600 microprocessor [circa mid-late '70s]), this
typically was used to signal a switch between internal (MSB=0) and
external (MSB=1) data manipulations [albeit still, in that example,
only then employing the MSB of a 12 bit instruction word]

When we come back down to 8 bit data words, then the MSB is still used
to switch from the restricted (1963) set of American Bell teleprinter
code characters (MSB=0), and the extended set (MSB=1), as originally
intended when ascii was proposed and defined as an 8 bit code.
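
The "MSB as sign" convention mentioned above can be illustrated for an 8-bit word as follows. This is a generic two's-complement sketch, not a model of any particular processor, and the function names are made up:

```python
def msb_set(byte: int) -> bool:
    """True if bit 7, the most significant bit of an 8-bit value, is 1."""
    return bool(byte & 0x80)


def as_signed8(byte: int) -> int:
    """Read an 8-bit value as a two's-complement signed number."""
    return byte - 256 if msb_set(byte) else byte


for value in (0x30, 0xB0):
    print(f"0x{value:02X}: msb={msb_set(value)} signed={as_signed8(value)}")
```

The same bit that marks a negative number in signed data is the one that separates codes 0 to 127 from codes 128 to 255 in character data.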

Chalky



  
Date: 12 Nov 2006 16:00:37
From: Sorcerer
Subject: Re: Restricted ASCII?



"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message
news:1163342215.613268.5060@m73g2000cwd.googlegroups.com...

Chalky wrote:

> Edward Green wrote:
>
> > Chalky wrote:
> >
> > > ASCII is defined in wiki as an 8-bit system, developed from
> > > telegraphic
> > > codes, to allow for 256 characters.
> >
> > Neat. Another discussion of computer archaics. And I was afraid
> > outlets for procrastination were closed!
> >
> > > However, I have noticed that this set is truncated to 7 characters at
> > > sci.astro.research to conform to its first commercial use as a
> > > seven-bit teleprinter code (1963).
> >
> > I always thought of them as ASCII and extended-ASCII, or possibly ANSI?
> >
> > > This can result in considerable garbling of 8-bit ASCII text.
> > >
> > > Since I am pretty sure I have seen Schrodinger's equation spelled
> > > correctly at sci.physics.research, I am curious to discover how
> > > endemic
> > > this restriction of the ASCII set actually still is, in the Usenet
> > > groups.
> > >
> > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > > -, and for copyright, below
> > >
> > > ° ± ©
> >
> > I read everything at Google, the new AOL to some, and your symbols look
> > fine in any group. I don't know what's going on behind the scenes,
> > but I don't think a "Usenet group" is the logical entity you think it
> > is. You have the messages, and you have the formatting, which is left
> > up to the Newsreader.
>
> Not true. The top bit is erased at sci.astro.research.

To clarify further, by "the top bit" I mean the _Most Significant Bit_.


In computers that are more advanced than 8 bit microprocessors (circa
early '70s), the most significant bit of the computer word is typically
employed for the most important variable. For data, this typically
means + or -. Similarly, for the instruction set (eg in the General
Instruments CP 1600 microprocessor [circa mid-late '70s]), this
typically was used to signal a switch between internal (MSB=0) and
external (MSB=1) data manipulations [albeit still, in that example,
only then employing the MSB of a 12 bit instruction word]

When we come back down to 8 bit data words, then the MSB is still used
to switch from the restricted (1963) set of American Bell teleprinter
code characters (MSB=0), and the extended set (MSB=1), as originally
intended when ascii was proposed and defined as an 8 bit code.

Chalky


I used to own a 110 baud teleprinter. Being mechanical the MSB
was ignored, but not only that, lower case was ignored also. It was
essentially 6-bit. Man, that used to clatter, but it worked with a drop
of oil. Then, glory be, a TV interface. Full 8-bit prom for the character
set, 2K ram for the entire screen. You could do a lot with a 4 MHz
Zilog Z80 and a cassette recorder for mass storage.
I see Woolworth are selling Chinese B/W 5" screen TVs for £10 now.
Androcles





 
Date: 12 Nov 2006 14:51:08
From: George Dishman
Subject: Re: Restricted ASCII?


[note: hand indented because the special characters
included by "Chalky" forced the use of "quoted
printable" which prevents Outlook Express handling
the indent automatically.]

"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message
news:1163327103.618298.79130@e3g2000cwe.googlegroups.com...
> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.

http://en.wikipedia.org/wiki/ASCII

"ASCII was first published as a standard in 1967 and
was last updated in 1986. It currently defines codes
for 128 characters. 33 are non-printing, mostly
obsolete control characters that affect how text is
processed, and the other 95 printable characters are
as follows (starting with the space character):"

and later

"ASCII is, strictly, a seven-bit code, meaning that it
uses the bit patterns representable with seven binary
digits (a range of 0 to 127 decimal) to represent
character information."
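
The counts in the quoted passage can be checked with a few lines. This is a sketch that takes "non-printing" to mean codes 0 to 31 plus 127 (DEL), which is the usual reading:

```python
# Codes 0-31 and 127 are control characters; 32-126 are printable.
control = [c for c in range(128) if c < 32 or c == 127]
printable = [c for c in range(128) if 32 <= c <= 126]

print(len(control), len(printable))          # prints: 33 95
print(repr(chr(printable[0])))               # the space character comes first
print((127).bit_length())                    # highest code fits in 7 bits
```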

History aside, use of 8-bit characters breaks one
of the most common newsreaders. It may be MS's fault
but that's where we are.

Chalky




  
Date: 12 Nov 2006 16:00:37
From: Sorcerer
Subject: Re: Restricted ASCII?


Ok, thanks. Maybe it'll be fixed some day.

"George Dishman" <george@briar.demon.co.uk > wrote in message
news:ej7bfu$2td$1@news.freedom2surf.net...


 
Date: 12 Nov 2006 05:57:47
From: Edward Green
Subject: Re: Restricted ASCII?


Chalky wrote:

> Edward Green wrote:

> > I read everything at Google, the new AOL to some, and your symbols look
> > fine in any group. I don't know what's going on behind the scenes,
> > but I don't think a "Usenet group" is the logical entity you think it
> > is. You have the messages, and you have the formatting, which is left
> > up to the Newsreader.
>
> Not true. The top bit is erased at sci.astro.research.

Can you reference a particular post? Maybe there is something funny
going on in the moderation step.

> These postings
> were in response to a suggestion from the moderator there, that this
> should be discussed here.

But where is "here"?



 
Date: 12 Nov 2006 05:57:15
From: Chalky
Subject: Re: Restricted ASCII?



Sorcerer wrote:

> Returned from sci.physics.relativity, absent auto-indent.

Sorcerer wrote:

> Returned from sci.physics, also absent auto-indent.
> Androcles

Sorry, could you explain what you mean by this?
As far as I am aware, auto-indent is not an ascii code.
As far as I am aware, I did not employ an auto-indent in these
postings, anyway.

C



  
Date: 12 Nov 2006 15:21:08
From: Sorcerer
Subject: Re: Restricted ASCII?



"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message
news:1163339834.991566.50860@i42g2000cwa.googlegroups.com...


 
Date: 12 Nov 2006 05:51:18
From: Chalky
Subject: Re: Restricted ASCII?



Edward Green wrote:

> Chalky wrote:
>
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.
>
> Neat. Another discussion of computer archaics. And I was afraid
> outlets for procrastination were closed!
>
> > However, I have noticed that this set is truncated to 7 characters at
> > sci.astro.research to conform to its first commercial use as a
> > seven-bit teleprinter code (1963).
>
> I always thought of them as ASCII and extended-ASCII, or possibly ANSI?
>
> > This can result in considerable garbling of 8-bit ASCII text.
> >
> > Since I am pretty sure I have seen Schrodinger's equation spelled
> > correctly at sci.physics.research, I am curious to discover how endemic
> > this restriction of the ASCII set actually still is, in the Usenet
> > groups.
> >
> > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > -, and for copyright, below
> >
> > ° ± ©
>
> I read everything at Google, the new AOL to some, and your symbols look
> fine in any group. I don't know what's going on behind the scenes,
> but I don't think a "Usenet group" is the logical entity you think it
> is. You have the messages, and you have the formatting, which is left
> up to the Newsreader.

Not true. The top bit is erased at sci.astro.research. These postings
were in response to a suggestion from the moderator there, that this
should be discussed here.

> Let's set some rational follow-ups, shall we?

Yes please

Chalky



  
Date: 12 Nov 2006 07:56:57
From: Starlord
Subject: Re: Restricted ASCII?


We are just fine in using the plain text in S.A.A.


--
The Lone Sidewalk Astronomer of Rosamond

Telescope Buyers FAQ
http://home.inreach.com/starlord
Sidewalk Astronomy
www.sidewalkastronomy.info
The Church of Eternity
http://home.inreach.com/starlord/church/Eternity.html


"Chalky" <chalkyspam@bleachboys.co.uk > wrote garbage





  
Date: 12 Nov 2006 15:04:41
From: Sorcerer
Subject: Re: Restricted ASCII?



"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message
news:1163339478.048813.247840@b28g2000cwb.googlegroups.com...

Edward Green wrote:

> Chalky wrote:
>
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.
>
> Neat. Another discussion of computer archaics. And I was afraid
> outlets for procrastination were closed!
>
> > However, I have noticed that this set is truncated to 7 characters at
> > sci.astro.research to conform to its first commercial use as a
> > seven-bit teleprinter code (1963).
>
> I always thought of them as ASCII and extended-ASCII, or possibly ANSI?
>
> > This can result in considerable garbling of 8-bit ASCII text.
> >
> > Since I am pretty sure I have seen Schrodinger's equation spelled
> > correctly at sci.physics.research, I am curious to discover how endemic
> > this restriction of the ASCII set actually still is, in the Usenet
> > groups.
> >
> > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > -, and for copyright, below
> >
> > ° ± ©
>
> I read everything at Google, the new AOL to some, and your symbols look
> fine in any group. I don't know what's going on behind the scenes,
> but I don't think a "Usenet group" is the logical entity you think it
> is. You have the messages, and you have the formatting, which is left
> up to the Newsreader.

Not true. The top bit is erased at sci.astro.research. These postings
were in response to a suggestion from the moderator there, that this
should be discussed here.

> Let's set some rational follow-ups, shall we?

Yes please

Chalky


At least I now know why I have no indents auto-supplied by Outlook
Express when I hit "Reply". Good experiment, Chalky.




 
Date: 12 Nov 2006 05:46:50
From: Edward Green
Subject: Re: Restricted ASCII?


Chalky wrote:

> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.

Neat. Another discussion of computer archaics. And I was afraid
outlets for procrastination were closed!

> However, I have noticed that this set is truncated to 7 characters at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).

I always thought of them as ASCII and extended-ASCII, or possibly ANSI?

> This can result in considerable garbling of 8-bit ASCII text.
>
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
>
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> -, and for copyright, below
>
> ° ± ©

I read everything at Google, the new AOL to some, and your symbols look
fine in any group. I don't know what's going on behind the scenes,
but I don't think a "Usenet group" is the logical entity you think it
is. You have the messages, and you have the formatting, which is left
up to the Newsreader.

Let's set some rational follow-ups, shall we?



  
Date: 12 Nov 2006 15:01:00
From: Sorcerer
Subject: Re: Restricted ASCII?


He's testing, and you are whining about follow ups.
You are right, you don't know what's going on behind the scenes,
you clueless MORON, Green!



"Edward Green" <spamspamspam3@netzero.com > wrote in message
news:1163339210.186352.264010@h48g2000cwc.googlegroups.com...
Chalky wrote:

> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.

Neat. Another discussion of computer archaics. And I was afraid
outlets for procrastination were closed!

> However, I have noticed that this set is truncated to 7 characters at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).

I always thought of them as ASCII and extended-ASCII, or possibly ANSI?

> This can result in considerable garbling of 8-bit ASCII text.
>
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
>
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> -, and for copyright, below
>
> ° ± ©

I read everything at Google, the new AOL to some, and your symbols look
fine in any group. I don't know what's going on behind the scenes,
but I don't think a "Usenet group" is the logical entity you think it
is. You have the messages, and you have the formatting, which is left
up to the Newsreader.

Let's set some rational follow-ups, shall we?




 
Date: 12 Nov 2006 05:41:30
From: Chalky
Subject: Re: Restricted ASCII?



Chalky wrote:

> Chalky wrote:
>
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.
> >
> > However, I have noticed that this set is truncated to 7 characters at
> > sci.astro.research to conform to its first commercial use as a
> > seven-bit teleprinter code (1963).
> >
> > This can result in considerable garbling of 8-bit ASCII text.
> >
> > Since I am pretty sure I have seen Schrodinger's equation spelled
> > correctly at sci.physics.research, I am curious to discover how endemic
> > this restriction of the ASCII set actually still is, in the Usenet
> > groups.
> >
> > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > -, and for copyright, below
> >
> > =B0 =B1 =A9
> >
> > Chalky
>
> Interesting. All groups display correctly except
> sci.physics.relativity, which still displayed 8 bits, but translated
> into a more old-fashioned font.
>
> Looks like sci.astro.research is the only newsgroup actually restricted
> to 7 bits.
>
> I wonder why that is?
>
> Chalky

I have since noticed that the wiki reference and all references
therefrom are complete rubbish in all other respects, since they all
still restrict the displayed characters to the least significant 7 bits
of ASCII (i.e. restriction to Bell teleprinter code, circa 1963).

This erases all unique characteristics of Scandinavian (and Germanic)
languages, all unique characteristics of Latin languages (such as
French & Spanish), and all currencies other than the Yankee Dollar.
(Thus excluding the British Pound Sterling, the Euro, and the Japanese
Yen [to name a few important examples], as well as precluding the use
of any more advanced scientific notation.)

So, this is (probably) goodbye from me to sci.astro.research. (I can't
cope with this more-than-40-year-out-of-date ascii restriction. [Or, as
Captain Beefheart said more eloquently, "I cry, but I can't buy your
Veterans' Day Poppy."])

In view of this apparent dearth of up-to-date information on ascii on
the internet, I am now recommending (to the relevant management) that
the intRAnet version of the file http://1stlight.org/design/ascii.asp,
should now be included on the intERnet version of that site, too.

Chalky



 
Date: 12 Nov 2006 02:46:29
From: Chalky
Subject: Re: Restricted ASCII?



Chalky wrote:

> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.
>
> However, I have noticed that this set is truncated to 7 characters at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).
>
> This can result in considerable garbling of 8-bit ASCII text.
>
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
>
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> -, and for copyright, below
>
> =B0 =B1 =A9
>
> Chalky

Interesting. All groups display correctly except
sci.physics.relativity, which still displayed 8 bits, but translated
into a more old-fashioned font.

Looks like sci.astro.research is the only newsgroup actually restricted
to 7 bits.

I wonder why that is?

Chalky



  
Date: 12 Nov 2006 14:34:16
From: David Woolley
Subject: Re: Restricted ASCII?


In article <1163328389.468587.34850@h48g2000cwc.googlegroups.com >,
chalkyspam@bleachboys.co.uk wrote:
> Chalky wrote:

> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.

That wiki is wrong. It is using a common marketing/popular
computing misuse of the term. Which wiki was it and
which article? The current edit of the English version
of Wikipedia seems to correctly define it as seven bit:
<http://en.wikipedia.org/w/index.php?title=ASCII&oldid=87321585 >.

ASCII is the US variant of ISO 646 (I think it is the same as the
reference variant), which is a seven bit code.

As indicated by the Content-Type header, your article is not in ASCII:

> Content-Type: text/plain; charset="iso-8859-1"
                                     ^^^^^^^^^^
Which indicates that it is in the eight bit code ISO 8859/1, which contains
ISO 646 (reference variant) as a subset. You were not posting in ASCII.

One of your articles was, however, in "just send eight" format, which,
while it generally passes through Usenet OK, because most of Usenet is
8 bit clean, leaves the receiving newsreader to guess what character
set was actually intended, and is therefore invalid. In Chinese speaking
parts of the world, just send 8 content is likely to be in GB2312 or
Big 5, not ISO 8859/1. Even in the West, it is quite likely to be
Windows-1252, a variant of ISO 8859/1 in which 32 control characters are
replaced by extra graphics, but it could be the Mac version, instead.
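
The ambiguity is easy to reproduce. The following is a minimal Python
sketch, not anything taken from the articles in question: the byte values
are chosen for illustration, and the codec names are the ones shipped in
Python's standard library.

```python
# The same unlabelled octets, read under three different assumptions.
raw = bytes([0xB0, 0xB1, 0xA9])

as_latin1 = raw.decode("iso-8859-1")  # ISO 8859/1 reading
as_cp1252 = raw.decode("cp1252")      # Windows-1252 agrees for 0xA0-0xFF
as_mac = raw.decode("mac_roman")      # classic Mac OS reads 0xB0 differently

print(as_latin1, "|", as_cp1252, "|", as_mac)

# Windows-1252 departs from ISO 8859/1 only in 0x80-0x9F, where it places
# graphics in positions that ISO 8859/1 reserves for control characters.
euro_cp1252 = b"\x80".decode("cp1252")
ctrl_latin1 = b"\x80".decode("iso-8859-1")
print(repr(euro_cp1252), repr(ctrl_latin1))
```

Without a charset label the receiver has no way to choose between these
readings, which is the point being made about "just send eight".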

> > This can result in considerable garbling of 8-bit ASCII text.

There is no such thing as 8 bit ASCII text.

> >
> > Since I am pretty sure I have seen Schrodinger's equation spelled
> > correctly at sci.physics.research, I am curious to discover how endemic
> > this restriction of the ASCII set actually still is, in the Usenet
> > groups.

For all except moderated newsgroups, and I don't think your test post
would have been passed by a moderator, this is purely a function of the
newsreaders or other user agents used. For a cross posted article,
only one copy is ever transmitted, so any corruption would apply to all
newsgroups. (In your case, the user agent seems to be the Google
Groups nntp to HTML/HTTP gateway.)

> >
> > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > -, and for copyright, below
> >
> > =B0 =B1 =A9

Actually, as well as using ISO 8859/1, you used quoted printable, and
therefore actually only sent 7 bits in all except the just send eight
case.
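
A small Python sketch of that two-step decoding, using the standard
library's quopri module. The variable names are illustrative only; the
input is the "=B0 =B1 =A9" line quoted above.

```python
import quopri

wire = b"=B0 =B1 =A9"                 # what travelled: 7-bit characters only
seven_bit_clean = all(b < 128 for b in wire)

octets = quopri.decodestring(wire)    # undo quoted-printable -> 8-bit octets
text = octets.decode("iso-8859-1")    # interpret octets per the declared charset

print(seven_bit_clean, octets, text)
```

Both steps depend on headers: Content-Transfer-Encoding says to undo the
quoted-printable layer, and the charset parameter says how to read the
resulting octets.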

>
> Chalky

> Interesting. All groups display correctly except
> sci.physics.relativity, which still displayed 8 bits, but translated
> into a more old-fashioned font.

Fonts are purely a user agent issue. As I said, unless you post to a
moderated group, in which case it is the moderator's computer system
that determines what happens to things other than pure ASCII, this is
an issue with your newsreader, because the same copy of the article is
used for all unmoderated groups in a cross-posting.

> Looks like sci.astro.research is the only newsgroup actually restricted
> to 7 bits.

It looks like sci.astro.research *is* moderated ("m" flag):

200 news.demon.co.uk InterNetNews NNRP server INN 2.4.1 ready (posting ok).
list active sci.astro.research
215 Newsgroups in form "group high low flags".
sci.astro.research 0000004703 0000003871 m
.

Maybe it is auto-moderated which is why the test got through.

Your problem is with the email system used by the moderator. It
seems to be resolving the quoted printable coding to 8 bit, but
then losing the MIME information when re-submitting the approved
version.

It's generally best to restrict yourself to the proper definition of
ASCII unless you are in a restricted community, typically a language
community, or the material can't be satisfactorily represented in ASCII.
For maths, there is only a narrow band in which this is valid, as one
soon reaches a point where one needs to use TeX, troff's eqn, or MathML,
which would normally be treated as binaries on Usenet, so are best placed
on a web site.



  
Date: 12 Nov 2006 13:32:03
From: Sorcerer
Subject: Re: Restricted ASCII?



"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message
news:1163328389.468587.34850@h48g2000cwc.googlegroups.com...

Chalky wrote:

> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.
>
> However, I have noticed that this set is truncated to 7 characters at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).
>
> This can result in considerable garbling of 8-bit ASCII text.
>
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
>
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> -, and for copyright, below
>
> ° ± ©
>
> Chalky

Interesting. All groups display correctly except
sci.physics.relativity, which still displayed 8 bits, but translated
into a more old-fashioned font.

Looks like sci.astro.research is the only newsgroup actually restricted
to 7 bits.

I wonder why that is?

Chalky

Returned from sci.physics, also absent auto-indent.
Androcles




  
Date: 12 Nov 2006 13:26:14
From: Sorcerer
Subject: Re: Restricted ASCII?



"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message
news:1163328389.468587.34850@h48g2000cwc.googlegroups.com...

Chalky wrote:

> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.
>
> However, I have noticed that this set is truncated to 7 characters at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).
>
> This can result in considerable garbling of 8-bit ASCII text.
>
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
>
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> -, and for copyright, below
>
> ° ± ©
>
> Chalky

Interesting. All groups display correctly except
sci.physics.relativity, which still displayed 8 bits, but translated
into a more old-fashioned font.

Looks like sci.astro.research is the only newsgroup actually restricted
to 7 bits.

I wonder why that is?

Chalky


Returned from sci.physics.relativity, absent auto-indent.






   
Date: 12 Nov 2006 13:58:06
From: Sorcerer
Subject: Re: Restricted ASCII?



"Sorcerer" <Headmaster@hogwarts.physics_e > wrote in message
news:WhF5h.157081$lT5.7614@fe2.news.blueyonder.co.uk...


  
Date: 12 Nov 2006 20:03:11
From: Pat O'Connell
Subject: Re: Restricted ASCII?


Chalky wrote:
> Chalky wrote:
>
>> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
>> codes, to allow for 256 characters.
>>
>> However, I have noticed that this set is truncated to 7 characters at
>> sci.astro.research to conform to its first commercial use as a
>> seven-bit teleprinter code (1963).
>>
>> This can result in considerable garbling of 8-bit ASCII text.
>>
>> Since I am pretty sure I have seen Schrodinger's equation spelled
>> correctly at sci.physics.research, I am curious to discover how endemic
>> this restriction of the ASCII set actually still is, in the Usenet
>> groups.
>>
>> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
>> -, and for copyright, below
>>
>> ° ± ©
>>
>> Chalky
>
> Interesting. All groups display correctly except
> sci.physics.relativity, which still displayed 8 bits, but translated
> into a more old-fashioned font.
>
> Looks like sci.astro.research is the only newsgroup actually restricted
> to 7 bits.

Only characters 0 through 127 have been standardized as part of ASCII.
Each computer operating system (for instance DOS, Windows, Mac, and VMS)
displays its own symbol set for 128 through 255.
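
As a rough illustration of that 0 through 127 boundary, here is a short
Python sketch. The helper name is_ascii is made up for this example.

```python
def is_ascii(text: str) -> bool:
    """True if every character has a code point in the range 0..127."""
    return all(ord(ch) < 128 for ch in text)

print(is_ascii("Schrodinger"))   # plain letters only
print(is_ascii("Schrödinger"))   # o-umlaut is code point 246, outside ASCII

# Encoding to ASCII refuses anything above 127 rather than guessing.
try:
    "±".encode("ascii")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)
```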

--
Pat O'Connell
[note munged EMail address]
Take nothing but pictures, Leave nothing but footprints,
Kill nothing but vandals...


   
Date: 13 Nov 2006 15:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?


An interesting history of character codes, from Morse Codes through
Baudot Code to ASCII-1967 can be found here:

http://www.wps.com/projects/codes/

The author says, on that page:

# ASCII is and always was a seven bit code. I am shocked at the number of
# people and sources that claim it to be an 8-bit code. There are only 128
# character codes in ASCII.
# Many of the extentions to ASCII are 8 bits, but they are not ASCII.

--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/


   
Date: 14 Nov 2006 00:12:38
From: Ben Rudiak-Gould
Subject: Re: Restricted ASCII?


Pat O'Connell wrote:
>Each computer operating system (for instance DOS, Windows, Mac, and VMS)
>displays its own symbol set for 128 through 255.

Every *localized version* of every OS has a *default* character encoding
that many tools use in the absence of other encoding information. The local
default makes no sense (on either end) for documents that are being sent
between computers, like email and Usenet messages and HTML pages. Therefore
you should always specify the encoding of such a message explicitly within
the message itself, which makes the local default irrelevant.
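
One way to see what "specify the encoding explicitly" looks like in
practice is to build a message with Python's standard email package. This
is only a sketch; the choice of ISO 8859/1 and quoted-printable is an
assumption made to match the earlier posts in this thread.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Restricted ASCII?"
# Declare both the character set and the transfer encoding in the headers.
msg.set_content("° ± ©", charset="iso-8859-1", cte="quoted-printable")

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
print(msg.get_payload())   # the 7-bit form that travels
print(msg.get_content())   # what a MIME-aware reader reconstructs
```

A reader that honours these two headers does not need to know, or guess,
the sender's local default.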

-- Ben


 
Date: 13 Nov 2006 05:25:52
From: Matt Giwer
Subject: Re: Restricted ASCII?


Chalky wrote:
> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.
>
> However, I have noticed that this set is truncated to 7 characters at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).
>
> This can result in considerable garbling of 8-bit ASCII text.
>
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
>
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> -, and for copyright, below

It has been so long I doubt I remember enough to do this subject the injustice
it deserves. First there were others for different languages such as UKSCII
whose main difference was the pound symbol instead of the $. This # is an
octothorpe not a pound sign. Those are just two of the SCIIs that were around.

Anyone who wants to tell me I am full of it and have it wrong please feel free.
It has been over 20 years since I read it. The "it" was a Ma Bell handbook on the
oxymoronic RS-232 standard. It was mostly a recounting of the different ways the
"standard" was implemented.

It started as a standard (Ma Bell rented what was used on its lines) with 26
uppercase, 10 numbers, punctuation, printable characters like / () $ and control
characters, line feed, page feed, tab and such for teletypes. Those are those
huge chuncka-chuncka machines that some of us were lucky enough to have to learn
programming on instead of punchcards. At that time it was six bits plus one
parity bit, 64 total possibilities for seven bits total. For teletypes the
writer created a paper tape version of his article which was then transmitted
over the expensive long distance line to save costs. That is why the old movies
show them printing so fast. Also it was adapted to papertape programming to
replace punchcards which started to control looms.

When teletypes were replaced with machines that could do lower case it
extended to seven bits plus parity. Other printable characters were defined, such
as []{}, but a single standard for all the 128 possible codes never did develop.
That is likely because it was not around long enough for a single manufacturer
to dominate the market and Ma Bell was gone by then.

In the early days of PCs there was a short time with 6+1 bits, uppercase only,
followed by 7+1 bits. With the development of home modems and doing it
affordably it was not possible for many years to drop the parity bit. Also
relaying data through some machines (IBM mainframes mostly) required limiting
the data to 6+1 for the lack of standardization of the full 128-character charset and
some of their legacy mainframes were never upgraded to 128.

As the gods would have it phone lines got better and modems included error
correcting code and had the full 8+2 bits for error correction. In any event the
computer no longer needed to deal with the parity issue and the full 8 bits
could be used for display characters.
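
For anyone who has not met the "7+1" arrangement, here is a small Python
sketch of seven data bits carried with one parity bit. Even parity is
assumed purely for illustration (real links used even, odd, mark or space
parity), and both function names are made up.

```python
def add_even_parity(code: int) -> int:
    """Return an 8-bit value: 7 data bits plus an even-parity bit on top."""
    if not 0 <= code < 128:
        raise ValueError("expected a 7-bit code")
    ones = bin(code).count("1")
    parity = ones % 2              # 1 only when the count of 1 bits is odd
    return code | (parity << 7)

def strip_parity(octet: int) -> int:
    """Recover the 7 data bits from a received octet."""
    return octet & 0x7F

for ch in "AC":
    sent = add_even_parity(ord(ch))
    print(ch, format(sent, "08b"), chr(strip_parity(sent)))
```

While the top bit is spent on parity like this, it is not available for
extra characters, which is why the upper 128 values only became usable
once the link no longer needed it.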

Given the lack of standardization of all the extras in the lower 128 it is not
surprising the upper 128 were wildcards. I think it was Apple ][ upgrade chipset
or maybe the ill-conceived III which first used them for international
characters. Atari simply made them the inverse of their choices for the lower
128. This was before displays went graphic as in windows and MS was on DOS 7.

As to what all is out there today, you tell me. From what I see occasionally
UNICODE is not much of a match for the snail as far as progress goes. I can see
all kinds of reasons for that given all the "alphabets" around. It is worth
looking into if you want to see why progress is slow. For example some Arabic
alphabets have parts of letters "underlining" other letters. Another font I saw
requires the ability to embed letters within other letters like putting a
lowercase e inside an uppercase L.

--
Hodie pridie Idus Octobres MMVI est
-- The Ferric Webceasar
nizkor http://www.giwersworld.org/nizkook/nizkook.phtml
Iraqi democracy http://www.giwersworld.org/911/armless.phtml a3


  
Date: 13 Nov 2006 01:10:12
From: Paul
Subject: Re: Restricted ASCII?




Matt Giwer wrote:

> Chalky wrote:
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.
> >
> > However, I have noticed that this set is truncated to 7 characters at
> > sci.astro.research to conform to its first commercial use as a
> > seven-bit teleprinter code (1963).
> >
> > This can result in considerable garbling of 8-bit ASCII text.
> >
> > Since I am pretty sure I have seen Schrodinger's equation spelled
> > correctly at sci.physics.research, I am curious to discover how endemic
> > this restriction of the ASCII set actually still is, in the Usenet
> > groups.
> >
> > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > -, and for copyright, below
>
> It has been so long I doubt I remember enough to do this subject the injustice
> it deserves.

Very funny! Matt I love the way you write.
Yermiah





 
Date: 12 Nov 2006 21:46:06
From: Chalky
Subject: Restricted ASCII? The final test



Thanks too to Sorcerer (Androcles), and George Dishman for your
collective constructive feedback. It seems that you probably had a
secondary display problem but I didn't, because you have Usenet
postings e-mailed to you.

I just go to the website to read what I am interested in, and, when I
respond, I do so via form submission, so there is no e-mail protocol
involved, my side.

Your resultant problem might have been because I originally pasted in
the displayed characters which sprang to life after I had typed in the
decimal translation of the machine code for those characters.
Consequently, for a final test, I am simply typing in the decimal
translations of the machine codes for the Japanese Yen, the registered
trade mark, and the Euro, encased in the beautifully symmetric Spanish
version of the question mark, using the HTML identifiers &, #, and ;
below:

¿ ¥ ® € ?


Let me know what you see.
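
For reference, a sketch of how numeric references of this kind resolve,
using Python's standard html module. The specific numbers are an
assumption, since the post does not say which values were typed: 191, 165
and 174 are the ISO 8859/1 (and Unicode) code points, and 8364 is the
Unicode code point for the Euro sign.

```python
import html

# Assumed references; the values actually typed in the post are not known.
references = "&#191; &#165; &#174; &#8364; ?"
resolved = html.unescape(references)
print(resolved)

# 128 is where Windows-1252 puts the Euro sign. It is not a valid
# reference, but HTML5-style parsers remap it for compatibility.
remapped = html.unescape("&#128;")
print(remapped)
```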


Chalky



 
Date: 12 Nov 2006 20:59:20
From: Chalky
Subject: Re: Restricted ASCII?



Greg Hennessy wrote:

> The facts are you claimed ascii was 8 bit. It isn't. Wiki defines it
> as 7 bit.

And, of course, Wiki is always infallible, is it?

> There is an 8 bit code called EBCDIC, which is mostly used in IBM
> mainframes, but that has nothing to do with ASCII.

I have no argument with that. So please now explain why, less than 24
hours ago, infallible Wiki connected EBCDIC and ASCII together, via the
qualifying phrase, a 8-bit system that would allow for 256 characters,

> You were confused.

If so, it looks like Wiki was too.

See also Wiki ref to US-ASCII, and my reference to the server side
instruction Asc(), which definitely returns the results of all 8 bits
in the byte.

Cheers

It has been fun.

Chalky



  
Date: 13 Nov 2006 15:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?


In article <1163393960.821344.305840@m73g2000cwd.googlegroups.com >,
Chalky <chalkyspam@bleachboys.co.uk > wrote:
>
>Greg Hennessy wrote:
>
>> The facts are you claimed ascii was 8 bit. It isn't. Wiki defines it
>> as 7 bit.
>
>And, of course, Wiki is always infallible, is it?

Wiki is no more infallible than its authors of course. But he claimed
wiki said ASCII was 8-bit, and wiki never said that.

Btw wiki is correct in this particular case: ASCII *is* a 7-bit code.

>> There is an 8 bit code called EBCDIC, which is mostly used in IBM
>> mainframes, but that has nothing to do with ASCII.
>
>I have no argument with that. So please now explain why, less than 24
>hours ago, infallible Wiki connected EBCDIC and ASCII together, via the
>qualifying phrase, a 8-bit system that would allow for 256 characters,

EBCDIC predated ASCII, and was of course considered when the 5-bit Baudot
code evolved into ASCII. However, early versions of EBCDIC had "holes"
in its character table for each byte where both nibbles weren't in the
range 0 to 9. Which is natural, since EBCDIC was an evolution of
the earlier BCD codes:

BCD = Binary Coded Decimal
EBCDIC = Extended Binary Coded Decimal Interchange Code

ASCII used the bit space more efficiently, using all possible bit
combinations.


>> You were confused.
>
>If so, it looks like Wiki was too.

No - Wiki never claimed ASCII to be an 8-bit code. Re-read the older
version of that page, and re-read it carefully....

>See also Wiki ref to US-ASCII, and my reference to the server side
>instruction Asc(), which definitely returns the results of all 8 bits
>in the byte.
>
>Cheers
>
>It has been fun.
>
>Chalky
>


--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/


 
Date: 12 Nov 2006 20:19:34
From: Chalky
Subject: Re: Restricted ASCII?


David Woolley wrote:

> In article <1163358882.886919.240610@m7g2000cwm.googlegroups.com>,
> chalkyspam@bleachboys.co.uk wrote:
> > Greg Hennessy wrote:
>
> > > and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit
> > > system that would allow for 256 characters, the".

Via reconstruction, the exact Wiki sentence on the morning of my
original posting, would have been: "Some time after the [[EBCDIC]]
code, a 8-bit system that would allow for 256 characters, the ASCII
developed from telegraphic codes and first entered commercial use as a
seven-bit teleprinter code promoted by Bell data services in 1963."

Given that the subsequently deleted qualification ", a 8-bit system that
would allow for 256 characters, " is linguistically incorrect in its
own right, that phrase could equally have referred to ASCII, EBCDIC, or
both, within that sentence (as I had originally assumed).

Thanks for helping to clear up that point of confusion. Perhaps such
confusion could be avoided in the future by adopting the
http://en.wikipedia.org/wiki/MIME method of referring to this
explicitly restricted 7 bit instruction set as US-ASCII.

David Woolley also wrote:

> In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com>,
> chalkyspam@bleachboys.co.uk wrote:

> > The Server Side coding of http://1stlight.org/design/ascii.asp,

> This URL produces a request to "Click Here AFTER inserting security key
> into your computer", so is useless as a public reference.

Yes, I did say in an earlier posting that this link is only currently
accessible on the intRAnet version of that site. (Thanks for confirming
that the relevant site security lock still works when that software is
instead running on the intERnet copy of the server)

> > specifically instructs any Windows NT 4, Windows 2000, or subsequent
> > Microsoft server, to display the ascii symbols for all n from 1 to 255,

> They are misusing the term ASCII. That's a very common mistake.
> Probably what it actually does is to display the raw font encoding for
> the current font.

The relevant employed server side visual basic (ASP) coding is:

a=Chr(n)
Response.Write Asc(a)
(for all n from 1 to 255)

Perhaps you should now complain to Microsoft that they have not sawn
off the most significant bit of the data byte in their Asc()
instruction, to specifically restrict that server side scripting to
only handling the US-ASCII subset.

(Yes, I already know that not every possible permutation of 1s and 0s
in that byte, results in a displayable graphic. This is equally true
for the 7 bit subset, as for the 8 bit set.)
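
A rough Python analogue of that number-to-character-to-number round trip
may help separate two questions: whether all 255 values survive the trip,
and whether they are ASCII. ISO 8859/1 is used here as an assumption; the
server's actual code page is not known from the post, and the helper name
is made up.

```python
# Round trip: number -> one-byte character -> number.
round_trip_ok = all(
    bytes([n]).decode("iso-8859-1").encode("iso-8859-1")[0] == n
    for n in range(1, 256)
)
print(round_trip_ok)

def decodes_as_ascii(n: int) -> bool:
    """True if the single octet n is accepted by a strict ASCII decoder."""
    try:
        bytes([n]).decode("ascii")
        return True
    except UnicodeDecodeError:
        return False

ascii_values = [n for n in range(1, 256) if decodes_as_ascii(n)]
print(ascii_values[0], ascii_values[-1], len(ascii_values))
```

The round trip succeeds for every value under any one-byte character set,
so on its own it does not show which character set supplied the values
above 127.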

Cheers

It has been fun.

Chalky



  
Date: 13 Nov 2006 00:09:40
From: Rich Townsend
Subject: Re: Restricted ASCII?


Chalky wrote:
> David Woolley wrote:
>
>> In article <1163358882.886919.240610@m7g2000cwm.googlegroups.com>,
>> chalkyspam@bleachboys.co.uk wrote:
>>> Greg Hennessy wrote:
>>>> and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit
>>>> system that would allow for 256 characters, the".
>
> Via reconstruction, the exact Wiki sentence on the morning of my
> original posting, would have been: "Some time after the [[EBCDIC]]
> code, a 8-bit system that would allow for 256 characters, the ASCII
> developed from telegraphic codes and first entered commercial use as a
> seven-bit teleprinter code promoted by Bell data services in 1963."
>
> Given that the subsequently deleted qualification ", a 8-bit system that
> would allow for 256 characters, " is linguistically incorrect in its
> own right, that phrase could equally have referred to ASCII, EBCDIC, or
> both, within that sentence (as I had originally assumed).
>
> Thanks for helping to clear up that point of confusion. Perhaps such
> confusion could be avoided in the future by adopting the
> http://en.wikipedia.org/wiki/MIME method of referring to this
> explicitly restricted 7 bit instruction set as US-ASCII.
>
> David Woolley also wrote:
>
>> In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com>,
>> chalkyspam@bleachboys.co.uk wrote:
>
>>> The Server Side coding of http://1stlight.org/design/ascii.asp,
>
>> This URL produces a request to "Click Here AFTER inserting security key
>> into your computer", so is useless as a public reference.
>
> Yes, I did say in an earlier posting that this link is only currently
> accessible on the intRAnet version of that site. (Thanks for confirming
> that the relevant site security lock still works when that software is
> instead running on the intERnet copy of the server)
>
>>> specifically instructs any Windows NT 4, Windows 2000, or subsequent
>>> Microsoft server, to display the ascii symbols for all n from 1 to 255,
>
>> They are misusing the term ASCII. That's a very common mistake.
>> Probably what it actually does is to display the raw font encoding for
>> the current font.
>
> The relevant employed server side visual basic (ASP) coding is:
>
> a=Chr(n)
> Response.Write Asc(a)
> (for all n from 1 to 255)
>
> Perhaps you should now complain to Microsoft that they have not sawn
> off the most significant bit of the data byte in their Asc()
> instruction, to specifically restrict that server side scripting to
> only handling the US-ASCII subset.
>

No, ASCII is the proper designation for the 7-bit encoding -- the 'A' standing
for 'American'. Anything with 8 bits just isn't ASCII, it's ISO 8859/1 or somesuch.

How do I know this? I spent five months working in the standards department of a
very large news company, and it was my business to know.

cheers,

Rich


   
Date: 13 Nov 2006 10:44:36
From: Starlord
Subject: Re: Restricted ASCII?


I know my old Atari 800XL used ASCII and so does my Atari TT030.

But this has nothing to do with Astronomy or telescopes.


--
The Lone Sidewalk Astronomer of Rosamond

Telescope Buyers FAQ
http://home.inreach.com/starlord
Sidewalk Astronomy
www.sidewalkastronomy.info
The Church of Eternity
http://home.inreach.com/starlord/church/Eternity.html


"Rich Townsend" <rhdt@barVOIDtol.udel.edu > wrote in message
news:ej8umk$nbg$1@scrotar.nss.udel.edu...




 
Date: 12 Nov 2006 20:25:03
From: David Woolley
Subject: Re: Restricted ASCII?


In article <1163358882.886919.240610@m7g2000cwm.googlegroups.com >,
chalkyspam@bleachboys.co.uk wrote:
> Greg Hennessy wrote:

> > and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit
> > system that would allow for 256 characters, the".

> Precisely. This is exactly what I copied this morning, followed by

It says nothing about the nature of ASCII. EBCDIC is a proprietary,
IBM, character code, which has a certain relationship to punched card
codes. (Punched cards have 12 potential holes in each column. With
the normal codes exactly one of these is punched for each digit from
0 to 9 (somewhat desirable with early manual card punches, where you
had to push a key for each hole - in my time only used for making
corrections). Uppercase characters were coded by punching one, or both,
of the remaining rows. EBCDIC reflects this structure by coding the
0 to 9 punching into the low order four bits, with the result that
character codes were not contiguous. I suspect this was done because
it simplified the electronics used in the card readers.)

I did see this edit, but discarded it because it was clearly an
irrelevant side comment. In fact the edit history makes it clear that
the reason for this change was that it misrepresented the time relation
between the creation of the two different codes. (I actually read
the edit comments before looking at the actual edits.)

> negligible linguistic modification, which made ABSOLUTELY no difference
> to the meaning of the (then) wiki ref., under History.

Obviously it makes no significant difference because it's a comment about
EBCDIC in an article about ASCII.

> Thank you for this objective confirmation.

It confirms that there was a change today and that change did not have
any relevance, except to someone who had completely misunderstood the
original.

I would definitely have been speaking out of my mouth if I hadn't
typed this without speaking.

(EBCDIC has some relevance to USENET as newsreaders on EBCDIC based
machines cannot assume that their character code is identical to
ASCII in the first 128 codes - it is very different.)

EBCDIC = Extended Binary Coded Decimal Interchange Code
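
The non-contiguous layout can be inspected with Python, which ships a few
EBCDIC code pages. This sketch uses cp037 (US/Canada) as an assumed
representative; other EBCDIC variants differ in detail, and the helper
name to_ebcdic is made up.

```python
def to_ebcdic(ch: str) -> int:
    """Code of a single character in the cp037 EBCDIC code page."""
    return ch.encode("cp037")[0]

# Compare ASCII and EBCDIC codes for a few letters and digits.
for ch in "AIJRS09":
    print(ch, hex(ord(ch)), hex(to_ebcdic(ch)))

# In ASCII, J follows I directly; in EBCDIC there is a gap between them.
gap_after_i = to_ebcdic("J") - to_ebcdic("I")
print(gap_after_i)
```

In each group of letters the low four bits step through 1 to 9 and the
high four bits select the group, which mirrors the card punching described
above.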



 
Date: 12 Nov 2006 11:18:46
From: Chalky
Subject: Re: Restricted ASCII?



David Woolley wrote:

> In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com>,
> chalkyspam@bleachboys.co.uk wrote:
>
> > Yes. That is copied from the CHANGED Wiki reference, which _postdates_
> > my first posting of today.
>
> I've gone through the edit history and no edit since, at the latest,
> November 7th, has made such a change. Even if the page had been vandalised,
> the change would still be in the edit history. The only time an edit
> would be taken out of the history is if it would be illegal, or at
> least legally unsafe, to keep it. The sort of change we are discussing
> here, by no means, fits that category (unless the same edit introduced
> legally unsafe material, which would only really occur for vandalism
> with this article).
>
> Note that Wiki is a generic term. What we seem to be, actually, talking
> about is the English version of Wikipedia.
>
> > I disagree with that conveniently changed reference of this afternoon,
> > anyway.
>
> The current reference (spot version URL given earlier) is correct
> in the area under discussion and hasn't significantly changed in the
> last few days.
>
> > The Server Side coding of http://1stlight.org/design/ascii.asp,
>
> This URL produces a request to "Click Here AFTER inserting security key
> into your computer", so is useless as a public reference.
>
> > specifically instructs any Windows NT 4, Windows 2000, or subsequent
> > Microsoft server, to display the ascii symbols for all n from 1 to 255,
>
> They are misusing the term ASCII. That's a very common mistake.
> Probably what it actually does is to display the raw font encoding for
> the current font. Most Windows fonts in the UK are either Unicode or
> Windows-1252 coded, both of which have the ASCII and ISO 8859/1 graphics
> as a subset.
>
> In particular, ASCII code points 0 to 31 are not displayable, although
> a few of them have an effect on formatting. Nor is ASCII code point 127.
> 128 through 255, as repeatedly stated, are not in ASCII, and even with the
> other non-proprietary codes (ISO 8859/* and ISO 10646 (~Unicode)) code
> points 128 through 159 are not displayable graphics.
>
> > in sequence. Neither the server nor any known browser has any
> > difficulty in doing so. This works with both server side and client
> > side scripting, and has done so since the last century.
>
> There are two ways of specifying characters to a browser, one is by
> literally including the character code in the data stream. In that case,
> it should be interpreted in the context of the charset parameter in the
> Content-Type and whether or not a character even exists will depend on
> the character set specified. The other is to provide it using numeric
> entities (&#nnn; etc.) (or named ones referencing them) or through
> scripting, in which case the characters should be in the HTML native
> character set, which is ISO 8859/1 for versions before HTML 4.0 and
> ISO 10646 for later versions. In both these cases, codes 128 through
> 159 are absolutely forbidden, as is 127 and most of the range from 0
> to 31. If the browser displays these characters, it is doing so for
> error recovery reasons, or for compatibility with early browsers that
> were rather sloppy in their character code handling (in particular,
> they tended to display the current platform code page character, rather
> than the correct one defined by the standard - in the USA and UK, this
> tended to produce the same result, for conforming characters).
>
> > I doubt that too. As suggested by Edward Green, this seems to be a bug
> > introduced by the sci.astro.research moderator's software/interface. As
>
> That seems to be the case. Strictly speaking, the use of MIME
> has never been standardised on USENET, so anything that doesn't use
> ASCII is non-standard. However, the de facto situation is that MIME
> using quoted-printable or base64 works with modern news readers and MIME
> using 8bit works most of the time.
>
> In the *.research moderation case, it does seem that the MIME encoding
> is being undone, which suggests that the system as a whole (which
> might include Google) is broken because it is partially MIME aware.
> Things ought to work OK if the software is fully MIME aware or, like
> the USENET backbone, totally unaware of MIME.
>
> > I have already pointed out, 8 bit info works fine at
> > sci.physics.research, and in every other usenet group tried.
>
> Your examples, mostly, have not sent 8 bit characters over USENET. They
> have used quoted-printable encoding, in which bytes that cannot be
> represented directly using ASCII are sent as three ASCII characters,
> "=" and two hexadecimal digits. "=" followed by space, and "=" at the
> end of the line also have special meanings, and some ASCII characters also
> have to be coded using "=" and hex digits, including, of course, "="
> itself. (I say bytes, rather than characters, because the modern trend,
> particularly for email and web pages, is to move to the use of the
> UTF-8 encoding (or sometimes UTF-7) of ISO 10646, which is a variable
> length code. Quoted printable encodes the component bytes, not the
> whole character. UTF-7 reduces the problem, because it only uses
> bytes which are printable in ASCII, and possibly common control
> characters.)
>
> As far as the USENET backbone is concerned, the result is pure ASCII, and
> that is what it transmits. Only when the article is passed to a user
> agent (e.g. Outlook Express) or gatewayed to another protocol (e.g. email
> for the moderation process or HTTP/HTML for Google Groups) is the MIME
> encoding detected and resolved. Some articles are actually sent raw 8 bit.
> These generally also work across the USENET backbone as USENET has
> generally been carried by 8 bit clean protocols (unlike email), and
> there have been few, if any, IBM mainframe based USENET systems, using
> EBCDIC (an 8 bit code that is not based on ASCII, and the most likely
> ASCII-incompatible code to have been encountered in recent systems).
>
> Problems with quoting with MIME encoded material may be to do with the
> way that GUI mail and news user agents normally mis-use MIME to try and
> send reflowable paragraphs.

See my response to Greg Hennessy. You are talking out of your arse.

C



 
Date: 12 Nov 2006 11:14:42
From: Chalky
Subject: Re: Restricted ASCII?



Greg Hennessy wrote:

> >> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> >> > codes, to allow for 256 characters.
> >>
> >> Nope. Ascii is 7 bit.
> >> http://en.wikipedia.org/wiki/ASCII
> >
> > Check out my response to George. This definition was changed TODAY at
> > Wiki
>
> Not according to the WIKI history logs. There is one change listed for
> today, by Chris Chittleborough, who changed "similar to" into "like"
> and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit
> system that would allow for 256 characters, the".

Precisely. This is exactly what I copied this morning, followed by
negligible linguistic modification, which made ABSOLUTELY no difference
to the meaning of the (then) wiki ref., under History.

Thank you for this objective confirmation.


Chalky



  
Date: 12 Nov 2006 19:24:39
From: Greg Hennessy
Subject: Re: Restricted ASCII?


>> >> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
>> >> > codes, to allow for 256 characters.
>> >>
>> >> Nope. Ascii is 7 bit.
>> >> http://en.wikipedia.org/wiki/ASCII
>> >
>> > Check out my response to George. This definition was changed TODAY at
>> > Wiki
>>
>> Not according to the WIKI history logs. There is one change listed for
>> today, by Chris Chittleborough, who changed "similar to" into "like"
>> and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit
>> system that would allow for 256 characters, the".
>
> Precisely. This is exactly what I copied this morning, followed by
> negligible linguistic modification, which made ABSOLUTELY no difference
> to the meaning of the (then) wiki ref., under History.

The facts are you claimed ascii was 8 bit. It isn't. Wiki defines it
as 7 bit.

There is an 8 bit code called EBCDIC, which is mostly used in IBM
mainframes, but that has nothing to do with ASCII.

You were confused.

> Thank you for this objective confirmation.

I am not confirming you. I am proving you wrong.



 
Date: 12 Nov 2006 18:47:18
From: David Woolley
Subject: Re: Restricted ASCII?


In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com >,
chalkyspam@bleachboys.co.uk wrote:

> Yes. That is copied from the CHANGED Wiki reference, which _postdates_
> my first posting of today.

I've gone through the edit history and no edit since, at the latest,
November 7th, has made such a change. Even if the page had been vandalised,
the change would still be in the edit history. The only time an edit
would be taken out of the history is if it would be illegal, or at
least legally unsafe, to keep it. The sort of change we are discussing
here, by no means, fits that category (unless the same edit introduced
legally unsafe material, which would only really occur for vandalism
with this article).

Note that Wiki is a generic term. What we seem to be, actually, talking
about is the English version of Wikipedia.

> I disagree with that conveniently changed reference of this afternoon,
> anyway.

The current reference (spot version URL given earlier) is correct
in the area under discussion and hasn't significantly changed in the
last few days.

> The Server Side coding of http://1stlight.org/design/ascii.asp,

This URL produces a request to "Click Here AFTER inserting security key
into your computer", so is useless as a public reference.

> specifically instructs any Windows NT 4, Windows 2000, or subsequent
> Microsoft server, to display the ascii symbols for all n from 1 to 255,

They are misusing the term ASCII. That's a very common mistake.
Probably what it actually does is to display the raw font encoding for
the current font. Most Windows fonts in the UK are either Unicode or
Windows-1252 coded, both of which have the ASCII and ISO 8859/1 graphics
as a subset.

In particular, ASCII code points 0 to 31 are not displayable, although
a few of them have an effect on formatting. Nor is ASCII code point 127.
128 through 255, as repeatedly stated, are not in ASCII, and even with the
other non-proprietary codes (ISO 8859/* and ISO 10646 (~Unicode)) code
points 128 through 159 are not displayable graphics.
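
The ranges above can be checked mechanically. The following is a minimal Python sketch, assuming that Unicode's "Cc" (control) category is an acceptable stand-in for "not a displayable graphic"; the first 256 Unicode code points coincide with ISO 8859/1, and the function and variable names are purely illustrative.

```python
import unicodedata

def control_code_points(limit=256):
    """Return the code points below `limit` that Unicode classifies as controls (Cc)."""
    return [cp for cp in range(limit) if unicodedata.category(chr(cp)) == "Cc"]

controls = control_code_points()

# Controls inside the 7-bit ASCII range: 0 to 31, plus 127 (DEL).
ascii_controls = [cp for cp in controls if cp < 128]

# Controls in the 8-bit range: 128 to 159, the block that ISO 8859/1
# does not use for graphics.
c1_controls = [cp for cp in controls if cp >= 128]

print(len(ascii_controls), "controls below 128;", len(c1_controls), "controls from 128 up")
```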

> in sequence. Neither the server nor any known browser has any
> difficulty in doing so. This works with both server side and client
> side scripting, and has done so since the last century.

There are two ways of specifying characters to a browser, one is by
literally including the character code in the data stream. In that case,
it should be interpreted in the context of the charset parameter in the
Content-Type and whether or not a character even exists will depend on
the character set specified. The other is to provide it using numeric
entities (&#nnn; etc.) (or named ones referencing them) or through
scripting, in which case the characters should be in the HTML native
character set, which is ISO 8859/1 for versions before HTML 4.0 and
ISO 10646 for later versions. In both these cases, codes 128 through
159 are absolutely forbidden, as is 127 and most of the range from 0
to 31. If the browser displays these characters, it is doing so for
error recovery reasons, or for compatibility with early browsers that
were rather sloppy in their character code handling (in particular,
they tended to display the current platform code page character, rather
than the correct one defined by the standard - in the USA and UK, this
tended to produce the same result, for conforming characters).
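
A small Python sketch of the two routes, offered only as an illustration: a raw byte needs a charset before it means anything, while a numeric reference names a code point directly. It assumes Python's standard `html.unescape`, which follows the HTML5 recovery rules and so maps the forbidden reference 128 to a Windows-1252 character instead of rejecting it; the variable names are hypothetical.

```python
import html

# Route 1: a literal byte in the data stream. What it means depends on the
# charset named in the Content-Type header.
raw = b"\xb0"
as_latin1 = raw.decode("iso-8859-1")      # degree sign under ISO 8859/1

try:
    raw.decode("utf-8")                   # a lone 0xB0 is not valid UTF-8
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Route 2: a numeric character reference. It names a code point directly,
# whatever charset the page is transmitted in.
from_reference = html.unescape("&#176;")  # degree sign again

# A reference in the forbidden 128-159 range: the parser recovers by
# substituting the Windows-1252 character (here the euro sign).
recovered = html.unescape("&#128;")

print(as_latin1, valid_utf8, from_reference, recovered)
```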

> I doubt that too. As suggested by Edward Green, this seems to be a bug
> introduced by the sci.astro.research moderator's software/interface. As

That seems to be the case. Strictly speaking, the use of MIME
has never been standardised on USENET, so anything that doesn't use
ASCII is non-standard. However, the de facto situation is that MIME
using quoted-printable or base64 works with modern news readers and MIME
using 8bit works most of the time.
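
As a rough illustration of what a MIME-aware agent produces, here is a minimal Python sketch using the standard `email` package. It assumes the package's default behaviour of choosing quoted-printable for ISO 8859/1 bodies; the sample text is made up.

```python
from email.mime.text import MIMEText

# Build a text part containing a pound sign, declared as ISO 8859/1.
msg = MIMEText("Price: \u00a31", "plain", "iso-8859-1")

content_type = msg["Content-Type"]                    # names the charset
transfer_encoding = msg["Content-Transfer-Encoding"]  # how the bytes were made 7-bit safe
body = msg.get_payload()                              # the encoded body as sent

print(content_type)
print(transfer_encoding)
print(body)
```

The body that goes on the wire contains only ASCII characters, with the pound sign carried as "=A3".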

In the *.research moderation case, it does seem that the MIME encoding
is being undone, which suggests that the system as a whole (which
might include Google) is broken because it is partially MIME aware.
Things ought to work OK if the software is fully MIME aware or, like
the USENET backbone, totally unaware of MIME.

> I have already pointed out, 8 bit info works fine at
> sci.physics.research, and in every other usenet group tried.

Your examples, mostly, have not sent 8 bit characters over USENET. They
have used quoted-printable encoding, in which bytes that cannot be
represented directly using ASCII are sent as three ASCII characters,
"=" and two hexadecimal digits. "=" followed by space, and "=" at the
end of the line also have special meanings, and some ASCII characters also
have to be coded using "=" and hex digits, including, of course, "="
itself. (I say bytes, rather than characters, because the modern trend,
particularly for email and web pages, is to move to the use of the
UTF-8 encoding (or sometimes UTF-7) of ISO 10646, which is a variable
length code. Quoted printable encodes the component bytes, not the
whole character. UTF-7 reduces the problem, because it only uses
bytes which are printable in ASCII, and possibly common control
characters.)
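
The encoding described above can be reproduced with Python's standard `quopri` module. This is only a sketch with illustrative variable names; it also shows why text that has been encoded twice shows "=3D" in front of the hex digits, and why a UTF-8 character turns into more than one "=XX" group.

```python
import quopri

# Degree, plus-minus and copyright signs as single ISO 8859/1 bytes.
latin1_bytes = "\u00b0 \u00b1 \u00a9".encode("iso-8859-1")

# Each byte above 126 becomes "=" plus two hex digits; the result is pure ASCII.
encoded = quopri.encodestring(latin1_bytes)

# Encoding already-encoded text escapes every "=" as "=3D".
double_encoded = quopri.encodestring(encoded)

# Quoted-printable works on bytes, not characters: in UTF-8 the degree sign
# is two bytes, so it becomes two "=XX" groups.
utf8_encoded = quopri.encodestring("\u00b0".encode("utf-8"))

# Decoding restores the original bytes, which still need the right charset.
decoded = quopri.decodestring(encoded).decode("iso-8859-1")

print(encoded, double_encoded, utf8_encoded, decoded)
```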

As far as the USENET backbone is concerned, the result is pure ASCII, and
that is what it transmits. Only when the article is passed to a user
agent (e.g. Outlook Express) or gatewayed to another protocol (e.g. email
for the moderation process or HTTP/HTML for Google Groups) is the MIME
encoding detected and resolved. Some articles are actually sent raw 8 bit.
These generally also work across the USENET backbone as USENET has
generally been carried by 8 bit clean protocols (unlike email), and
there have been few, if any, IBM mainframe based USENET systems, using
EBCDIC (an 8 bit code that is not based on ASCII, and the most likely
ASCII-incompatible code to have been encountered in recent systems).

Problems with quoting with MIME encoded material may be to do with the
way that GUI mail and news user agents normally mis-use MIME to try and
send reflowable paragraphs.


  
Date: 13 Nov 2006 05:33:44
From: Jeff…Relf
Subject: " ASCII Art " ( e.g. tables ), 80 columns, monospaced, unwrapped.


Hi David_Woolley, What newsreader are you using ?

" ASCII Art " ( e.g. tables ), 80 columns, monospaced, unwrapped,
should be the standard, I maintain... even on cell phones.

As far as the proper Encoding/Charset/etc. to use ( when posting ),
I say, " If Google displays it correctly, then it's correct. "

But FireFox ( my browser ) also affects the way posts look.
For example, my userContent.CSS prevents line wrapping, to wit:

* { white-space: nowrap !important; }

Cotse.NET/users/jeffrelf/userContent.CSS
( to see the change, restart FireFox and do a Ctrl-F5 [ refresh ] )

So, to wrap WikiPedia's insanely long lines ( shame on them ! ),
I copy and paste from WikiPedia to MS_Word.

My hand-rolled newsreader ( X.EXE, X.TXT, X.CPP ) puts all posts in
a single ( complexly maintained, UTF-16 ) .TXT file;
so all text is editable ( e.g. I can add new-lines, if need be ).




 
Date: 12 Nov 2006 10:50:53
From: Chalky
Subject: Re: Restricted ASCII?



Tom Roberts wrote:

> Chalky wrote:
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.
> >
> > However, I have noticed that this set is truncated to 7 characters at
> > sci.astro.research to conform to its first commercial use as a
> > seven-bit teleprinter code (1963). [...]
>
> This is not due to the newsgroup itself or to the underlying newsgroup
> software. It is due to the newsgroup client used by individual people.
> For instance, I use Firefox and your symbols come out fine, except in
> moderated newsgroups. In moderated newsgroups it also depends on the
> email system used (author->moderator) and on the client software used by
> the moderator, because postings are processed via email and by the
> moderator's client before being distributed.
>
> So this is almost surely due to either the email software or the
> moderator of sci.astro.research using newsgroup software that truncates
> the high bit.
>
>
> Tom Roberts

_Thank you._

This is highly consistent with my own analysis of the experiment, and of
the resultant responses to my associated postings.
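
For illustration only, this Python sketch shows what the hypothesis above (software that clears the high bit of every byte) would do to the three test characters. It does not establish that this is what the moderation software actually does; the function name is made up.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as a 7-bit-only channel would."""
    return bytes(b & 0x7F for b in data)

# Degree, plus-minus and copyright signs as ISO 8859/1 bytes: B0, B1, A9.
original = "\u00b0 \u00b1 \u00a9".encode("iso-8859-1")

# 0xB0 -> 0x30, 0xB1 -> 0x31, 0xA9 -> 0x29, all of which are plain ASCII.
stripped = strip_high_bit(original).decode("ascii")

print(stripped)
```

Under this assumption the symbols would arrive as ordinary ASCII digits and punctuation instead of as "=XX" sequences.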

Cheers

Chalky



 
Date: 12 Nov 2006 10:35:48
From: Chalky
Subject: Re: Restricted ASCII?



Greg Hennessy wrote:

> On 2006-11-12, Chalky <chalkyspam@bleachboys.co.uk> wrote:
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.
>
> Nope. Ascii is 7 bit.
> http://en.wikipedia.org/wiki/ASCII

Check out my response to George. This definition was changed TODAY at
Wiki

C



  
Date: 12 Nov 2006 19:01:01
From: Greg Hennessy
Subject: Re: Restricted ASCII?


>> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
>> > codes, to allow for 256 characters.
>>
>> Nope. Ascii is 7 bit.
>> http://en.wikipedia.org/wiki/ASCII
>
> Check out my response to George. This definition was changed TODAY at
> Wiki

Not according to the WIKI history logs. There is one change listed for
today, by Chris Chittleborough, who changed "similar to" into "like"
and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit
system that would allow for 256 characters, the".



  
Date: 13 Nov 2006 07:42:43
From: Paul Schlyter
Subject: Re: Restricted ASCII?


In article <1163356548.517299.123720@m7g2000cwm.googlegroups.com >,
Chalky <chalkyspam@bleachboys.co.uk > wrote:

> Greg Hennessy wrote:
>
>> On 2006-11-12, Chalky <chalkyspam@bleachboys.co.uk> wrote:
>>> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
>>> codes, to allow for 256 characters.
>>
>> Nope. Ascii is 7 bit.
>> http://en.wikipedia.org/wiki/ASCII
>
> Check out my response to George. This definition was changed TODAY at
> Wiki

No, it wasn't! I checked a version of that wikipedia entry from a
few days ago, and that version also said ASCII is a 7-bit code. Sorry....

--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/


 
Date: 12 Nov 2006 10:32:12
From: Chalky
Subject: Re: Restricted ASCII?


I would like to say, first of all, that it is a pleasure to receive two
responses from someone with adequate technical competence to actually
teach me something (possibly?).

David Woolley wrote:

> In article <1163343127.057255.151280@h48g2000cwc.googlegroups.com>,
> chalkyspam@bleachboys.co.uk wrote:
>
> > I think it is unfortunate that we now appear to be restricted to that
> > subset of ascii which is represented by a single keypress on a standard
> > British or American keyboard.
>
> If you include shift and control modifiers, what you get on a standard
> US keyboard is the whole of ASCII.

Interesting. I will try that out soon, if I can get hold of a standard
American keyboard, from my only (drinking partner) USAF employee, in
my (parochial) village.

> With the same caveat, what you get
> on a standard British keyboard is a superset of ASCII, because old keyboards
> support =A3 as a simple shifted character.

How old are you talking about? British keyboards support the
BritishPoundSterling symbol, without shift, but fail to support the
Yankee Dollar symbol thus. Similarly, Spanish keyboards support the
inverted question mark (necessary for correct punctuation in that
language), and, by now, the Euro. (Additional examples will be apparent
to the contemporary cosmopolitan European reader.)

> As to American keyboards, I would have thought some parts of America
> used keyboards optimised for Spanish or Portuguese.

I would imagine so too. By American I meant WASP Northern America,
which is where commercial computer technology first got off the ground,
after British innovators such as Babbage and Turing? (not sure of
latter spelling. [The AI guy who was instrumental in breaking the
Enigma code in WW2]) prepared the groundwork, and then British venture
capitalism failed to adequately exploit that lead (same old story
again).

> Incidentally, this one was identified as pure ASCII, but I've had
> to convert to ISO 8859/1, in order to include =A3.

Sure. With my cynic's hat on, I would suggest this could be interpreted
as yet another example of American Imperialism (and associated
stupidity).

> I haven't used
> ISO 8859/15, as that is relatively recent and there may be people
> without support for it, either in the browser, or in fonts.
>
> > Content-Type: text/plain; charset=3D"us-ascii"
>
> (Note the level of quoting is getting close to the level at which my
> spam filter cuts in because the article is too long.)

In that case, I hope this particular response "hits the spot"

> PS. It looks like the article was rejected by the moderator on the
> research newsgroup, and that they rejected it for the obvious reason
> that it was off topic.

Yes, I totally accept that.
However, it is not off topic in the sense of ensuring the communication
channel is left wide open.

David Woolley subsequently wrote:

> In article <1163328389.468587.34850@h48g2000cwc.googlegroups.com>,
> chalkyspam@bleachboys.co.uk wrote:
> > Chalky wrote:
>
> > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > > codes, to allow for 256 characters.
>
> That wiki is wrong. It is using a common marketing/popular
> computing misuse of the term. Which wiki was it and
> which article?

The reference quoted by George Dishman. Morning as opposed to afternoon
version (GMT) of today.

> The current edit of the English version
> of Wikipedia seems to correctly define it as seven bit:
> <http://en.wikipedia.org/w/index.php?title=ASCII&oldid=87321585>.
>
> ASCII is the US variant of ISO 646 (I think it is the same as the
> reference variant), which is a seven bit code.

OK, if so, let me put my question slightly differently. Why is it that
a European moderated sci.((astro)) research group (Moderators: Jonathan
Thornberg [Germany] and Martin Hardcastle [Britain]) is enslaved to a
US variant, whereas the U.S. moderated sci.((physics)) research group
(Moderators: Igor Khavkine and Phillip Helbig), is not?

> As indicated by the Content-Type header, your article is not in ASCII:

And who wrote the Content-Type header program? A WASP American, no
doubt, who was employed by Bill Gates (the _pretender_ to the throne of
software perfection!).

> > Content-Type: text/plain; charset=3D"iso-8859-1"
> ^^^^^^^^^^
> Which indicates that it is in the eight bit code ISO 8859/1, which contains
> ISO 646 (reference variant) as a subset. You were not posting in ASCII.

OK. so you are arguing here about a mere debatable technical
definition, not about basic (and invariant) principles of information
transfer protocol, as encapsulated in HyperText Markup Language (HTML),
and its contained (and unambiguous) 8 bit interpretation of the
character set.

[snip nonsense]

> > > Since I am pretty sure I have seen Schrodinger's equation spelled
> > > correctly at sci.physics.research, I am curious to discover how endemic
> > > this restriction of the ASCII set actually still is, in the Usenet
> > > groups.
>
> For all except moderated newsgroups, and I don't think your test post
> would have been passed by a moderator, this is purely a function of the
> newsreaders or other user agents used. For a cross posted article,
> only one copy is ever transmitted, so any corruption would apply to all
> newsgroups. (In your case, the user agent seems to be the Google
> Groups nntp to HTML/HTTP gateway.)

OK That sounds like good info to me.

> > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > > -, and for copyright, below
> > >
> > > =3DB0 =3DB1 =3DA9

OK that is a hexadecimal translation of the html coding: &#n; where n
is anything from 0 to 255.

Hex has always been a crap way of displaying information, and there is
material in the prior published literature, to suggest that hex was
only introduced as an obscuration technique to disguise the underlying
simplicity of the 8 bit microprocessor instruction set. This is
particularly evident when you compare the instruction sets of early 16
bit microprocessors, expressed in octal, versus the much cruder (but
apparently more complex) instruction sets of more primitive 8 bit
microprocessors, which were instead expressed in hexadecimal, for
marketing purposes.

> Actually, as well as using ISO 8859/1, you used quoted printable, and
> therefore actually only sent 7 bits in all except the just send eight
> case.

David Woolley also wrote:

> > Interesting. All groups display correctly except
> > sci.physics.relativity, which still displayed 8 bits, but translated
> > into a more old-fashioned font.
>
> Fonts are purely a user agent issue.

Wrong. You can over-ride this, if you wish, using the <PRE ></PRE> HTML
enclosers. There are other ways too.

>As I said, unless you post to a
> moderated group, in which case it is the moderator's computer system
> that determines what happens to things other than pure ASCII, this is
> an issue with your newsreader, because the same copy of the article is
> used for all unmoderated groups in a cross-posting.
>
> > Looks like sci.astro.research is the only newsgroup actually restricted
> > to 7 bits.

[snip additional undecipherable material]

> It's generally best to restrict yourself to the proper definition of
> ASCII unless you are in a restricted community,

You mean like the World Wide Web?

> typically a language
> community,

You mean like HTML, CGI, ASP, php, and Javascript?

> or the material can't be satisfactorily represented in ASCII.
> For maths, there is only a narrow band in which this is valid, as one
> soon reaches a point where one needs to use TeX, troff's eqn, or MathML,
> which would normally be treated as binaries on Usenet, so are best placed
> on a web site.

Absolutely.

In view of such excessive restrictions, perhaps use of these newsgroups
is now best limited to the provision of relevant hyperlinks.

Chalky



 
Date: 12 Nov 2006 17:28:12
From: Greg Hennessy
Subject: Re: Restricted ASCII?


On 2006-11-12, Chalky <chalkyspam@bleachboys.co.uk > wrote:
> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.

Nope. Ascii is 7 bit.
http://en.wikipedia.org/wiki/ASCII

ASCII was first published as a standard in 1967 and was last updated
in 1986. It currently defines codes for 128 characters. 33 are
non-printing, mostly obsolete control characters that affect how text
is processed, and the other 95 printable characters are as follows
(starting with the space character):
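
The counts in the quoted passage can be reproduced with a few lines of Python. This is a sketch with illustrative names: the printable range is taken as code points 32 (space) to 126 (tilde), with 0 to 31 and 127 counted as control characters.

```python
# Control characters: code points 0-31 plus 127 (DEL).
control = [cp for cp in range(128) if cp < 32 or cp == 127]

# Printable characters: code points 32 (space) through 126 (tilde).
printable = [chr(cp) for cp in range(32, 127)]

print(len(control), "control characters")
print(len(printable), "printable characters:")
print("".join(printable))
```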



 
Date: 13 Nov 2006 02:54:10
From: Chalky
Subject: Re: Restricted ASCII?



Paul Schlyter wrote:

> Sorry, but http://en.wikipedia.org/wiki/ASCII says that ASCII is 7-bit.
> Please check the reference before pointing to it !

I did, and it was promptly changed, after my original posting on this
subject (yesterday). However, if David Woolley's faith in the accuracy
of Wiki logs of alterations is justified, we now know that the
misleading text string read as:

a 8-bit system that would allow for 256 characters, the ASCII developed
from telegraphic codes and first entered commercial use as a seven-bit
teleprinter code promoted by Bell data services in 1963.

This was still posted up at Wiki about 24 hours ago, as a subsection of
the first sentence expressed under "History".

> That's ISO-8859-1, not ASCII..... :-)
>
> Check out: http://czyborra.com/

Thanks. Interesting. This appears to confirm that the 8 bit ISO-8859-1
standard has been the de facto standard for interpreting single byte
character definitions in all HTML browsers since the last century. So
my momentary recent concerns about cross computer compatibility (as a
consequence of recent comments here), appear to be unfounded. I don't
really care whether this is called ISO-8859-1 or ASCII, provided it
works consistently across all client browser platforms, for all 8 bit
coding of text.

> The characters in the range 80-FF (hex) are often erroneously called
> "extended ASCII" or even just "ASCII".

Yes, this is what I understood by the term ASCII, prior to more informed
feedback, during the discussion. If this places me amongst the ranks of
the www proletariat, I don't really mind that either. :-)


Your input was appreciated.


Chalky



  
Date: 13 Nov 2006 15:13:17
From: Paul Schlyter
Subject: Re: Restricted ASCII?


In article <1163415250.692641.297160@i42g2000cwa.googlegroups.com >,
Chalky <chalkyspam@bleachboys.co.uk > wrote:
>
>Paul Schlyter wrote:
>
>> Sorry, but http://en.wikipedia.org/wiki/ASCII says that ASCII is 7-bit.
>> Please check the reference before pointing to it !
>
>I did, and it was promptly changed, after my original posting on this
>subject (yesterday). However, if David Woolley's faith in the accuracy
>of Wiki logs of alterations is justified, we now know that the
>misleading text string read as:
>
>a 8-bit system that would allow for 256 characters, the ASCII developed
>from telegraphic codes and first entered commercial use as a seven-bit
>teleprinter code promoted by Bell data services in 1963.

Please don't truncate this paragraph so it becomes misleading. If we
include some of the words before, it will read:

# History
#
# Some time after the EBCDIC code, a 8-bit system that would allow for
# 256 characters, the ASCII developed from telegraphic codes and first
# entered commercial use as a seven-bit teleprinter code promoted by Bell
# data services in 1963

The 8-bit system referred to is EBCDIC, not ASCII. And EBCDIC is indeed
an 8-bit character code, although it has "holes" here and there in its
allocation of characters to codes.
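
The "holes" are easy to see in Python, which ships codecs for several EBCDIC code pages. The sketch below assumes code page 037 (a common US/Canada EBCDIC variant; other variants differ in detail), and the helper name is made up.

```python
def ebcdic_code(ch: str) -> int:
    """Return the byte value of a character in EBCDIC code page 037."""
    return ch.encode("cp037")[0]

# The upper-case alphabet sits in three separate runs: A-I, J-R and S-Z.
codes = {ch: ebcdic_code(ch) for ch in "AIJRSZ"}

# In ASCII, J follows I immediately; in EBCDIC there is a gap between them.
ascii_gap = ord("J") - ord("I")
ebcdic_gap = codes["J"] - codes["I"]

for ch, code in codes.items():
    print(ch, hex(code))
print("gap I->J:", ascii_gap, "in ASCII,", ebcdic_gap, "in EBCDIC")
```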


>This was still posted up at Wiki about 24 hours ago, as a subsection of
>the first sentence expressed under "History".

True, however the older version of that section never claimed ASCII to be
an 8-bit character set...... :-)

>> That's ISO-8859-1, not ASCII..... :-)
>>
>> Check out: http://czyborra.com/
>
>Thanks. Interesting. This appears to confirm that the 8 bit ISO-8859-1
>standard has been the de facto standard for interpreting single byte
>character definitions in all HTML browsers since the last century.

That's probably region dependent. Remember that we also have ISO-Latin-2
to ISO-Latin-13, most of which are the preferred version of ISO-Latin
in various parts of the world. And China, Japan and Korea preferred other
character sets which included the Kanji and related characters.

>So my momentary recent concerns about cross computer compatibility (as a
>consequence of recent comments here), appear to be unfounded. I don't
>really care whether this is called ISO-8859-1 or ASCII, provided it
>works consistently across all client browser platforms, for all 8 bit
>coding of text.

Unfortunately, it's not quite that easy. If you write HTML, you'd better
be aware of what character encoding you're using, or else your web pages
may display incorrectly here and there. In western Europe and the US,
the predominant encoding is of course ISO-8859-1 aka ISO-Latin-1, however
UTF-8 is getting more and more common (UTF-8 is a popular 8-bit encoding
of Unicode).
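
A minimal Python sketch of what goes wrong when the declared encoding does not match the real one: text saved as UTF-8 but read as ISO-8859-1. The sample string is made up.

```python
text = "25\u00b0"                     # "25" followed by a degree sign

# Saved as UTF-8, the degree sign takes two bytes (0xC2 0xB0).
utf8_bytes = text.encode("utf-8")

# A reader that assumes ISO-8859-1 turns each byte into its own character,
# so a stray capital A with circumflex appears before the degree sign.
misread = utf8_bytes.decode("iso-8859-1")

print(len(text), "characters,", len(utf8_bytes), "bytes")
print(misread)
```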

And, as usual, Microsoft does things in its own ways. Instead of using
proper ISO-Latin-1, Windows uses Win-Latin-1 (aka Windows Code Page 1252).
Now, Win-Latin-1 and ISO-Latin-1 are very similar -- the only difference
is in the characters 0x80 to 0x9F: in ISO-Latin (all versions) 0x80 to 0x9F
are reserved for control characters rather than graphics, while Win-Latin-1
puts additional printable characters in 0x80 to 0x9F. Quite often,
people who write web pages on a Windows platform are really using
Win-Latin-1, but they believe they're using ISO-Latin-1 and say so in
the headers of their HTML files - as a result, their web pages may look
weird here and there, when viewed on a web browser on a non-Windows
computer.
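
The difference can be located byte by byte with Python's built-in codecs. This is a sketch under one stated assumption: Python's "cp1252" codec leaves a handful of bytes undefined, so `errors="replace"` is used to keep the loop from raising on them.

```python
# Compare how every byte value decodes under the two code pages.
differing = []
for value in range(256):
    raw = bytes([value])
    iso = raw.decode("iso-8859-1")
    win = raw.decode("cp1252", errors="replace")  # undefined bytes become U+FFFD
    if iso != win:
        differing.append(value)

# Typical trouble: curly quotes typed on Windows live at 0x93 and 0x94.
curly = b"\x93quoted\x94".decode("cp1252")

print("bytes that differ:", hex(differing[0]), "to", hex(differing[-1]))
print(curly)
```

A page that labels such bytes as ISO-Latin-1 is asking the browser to treat the curly quotes as control characters.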

>> The characters in the range 80-FF (hex) are often erroneously called
>> "extended ASCII" or even just "ASCII".
>
>Yes, this is what I understood by the term ASCII, prior to more informed
>feedback, during the discussion. If this places me amongst the ranks of
>the www proletariat, I don't really mind that either. :-)

It's never too late to educate yourself.... :-)

The problem with "extended ASCII" was that there were so many varieties
of it. The Commodore PET had its own version (sometimes called PETSCII).
The early IBM PC had another version, the early Mac a third version, etc.

Not until ISO-Latin was defined did a convergence towards a common
standard happen. On the PC side, the switch to ISO-Latin was made in
Windows (at least almost - Win-Latin isn't quite ISO-Latin, but nearly
so). On the Mac side I don't know when the switch happened, but I'm
positive Mac OS-X uses ISO-Latin -- some of the later pre-OS-X versions
of Mac OS may have used it too. However, the older character sets
live on as backward compatibility - in Windows each time you open
a console window (often incorrectly called a "DOS box" - it was a DOS box
on Win-95/98/ME, but on Win-NT/2k/XP it's more than that). You can
configure Windows console windows to work in ISO-Latin instead of CP-850,
but if you do, nationalized versions of the standard Windows utilities
may output text which look weird, since these texts are still in CP-850.
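
The mismatch between the console code page and ISO-Latin can be sketched in Python, which includes a "cp850" codec. The characters chosen are only examples.

```python
# The same characters occupy different byte values in CP-850.
pound_cp850 = "\u00a3".encode("cp850")    # pound sign
degree_cp850 = "\u00b0".encode("cp850")   # degree sign

# A program that writes CP-850 bytes to a console expecting ISO-Latin-1
# shows a different character for the same byte.
shown_as = degree_cp850.decode("iso-8859-1")

print(pound_cp850, degree_cp850, shown_as)
```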


>Your input was appreciated.
>
>
>Chalky


--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/


  
Date: 13 Nov 2006 12:52:39
From: Greg Hennessy
Subject: Re: Restricted ASCII?


On 2006-11-13, Chalky <chalkyspam@bleachboys.co.uk > wrote:
> I did, and it was promptly changed, after my original posting on this
> subject (yesterday). However, if David Woolley's faith in the accuracy
> of Wiki logs of alterations is justified, we now know that the
> misleading text string read as:
>
> a 8-bit system that would allow for 256 characters, the ASCII developed
> from telegraphic codes and first entered commercial use as a seven-bit
> teleprinter code promoted by Bell data services in 1963.

No, that is *NOT* what the text read as. You are now simply
lieing. The text, as shown on
http://en.wikipedia.org/w/index.php?title=ASCII&diff=87495535&oldid=87462571
was

"Some time after the EBCDIC code, a 8-bit system that would allow
for 256 characters, the ASCII developed from Telegraphy


 
Date: 13 Nov 2006 07:42:42
From: Paul Schlyter
Subject: Re: Restricted ASCII?


In article <1163327103.618298.79130@e3g2000cwe.googlegroups.com >,
Chalky <chalkyspam@bleachboys.co.uk > wrote:

> ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> codes, to allow for 256 characters.

Sorry, but http://en.wikipedia.org/wiki/ASCII says that ASCII is 7-bit.
Please check the reference before pointing to it !

> However, I have noticed that this set is truncated to 7 characters at
> sci.astro.research to conform to its first commercial use as a
> seven-bit teleprinter code (1963).
>
> This can result in considerable garbling of 8-bit ASCII text.
>
> Since I am pretty sure I have seen Schrodinger's equation spelled
> correctly at sci.physics.research, I am curious to discover how endemic
> this restriction of the ASCII set actually still is, in the Usenet
> groups.
>
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for
> +-, and for copyright, below
>
> =B0 =B1 =A9

That's ISO-8859-1, not ASCII..... :-)

Check out: http://czyborra.com/


The characters in the range 80-FF (hex) are often erroneously called
"extended ASCII" or even just "ASCII". But there were many different,
and mutually incompatible, ways to extend ASCII to an 8-bit code. So
instead of talking about "ASCII" or "Extended ASCII" as an 8-bit code,
it's better to refer to the 8-bit code with its proper name (such as
ISO-8859-1, or IBM CP-850). If you do that, others will know *which*
ASCII "extension" you're referring to.
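
As a small illustration, the "=B0 =B1 =A9" above looks like quoted-printable
escaping of three 8-bit bytes (that reading is an assumption about how the
posting travelled, not something the headers confirm). A minimal Python
sketch, using the standard quopri module, shows that the bytes only become
degrees, plus-minus and copyright once a specific 8-bit code is named:

```python
import quopri


def decode_qp(text, charset):
    """Undo quoted-printable escaping, then decode the bytes with `charset`."""
    raw = quopri.decodestring(text.encode("ascii"))
    return raw.decode(charset)


if __name__ == "__main__":
    sample = "=B0 =B1 =A9"
    print(decode_qp(sample, "iso-8859-1"))  # read as ISO-8859-1
    print(decode_qp(sample, "cp850"))       # same bytes, read as IBM CP-850
    try:
        decode_qp(sample, "ascii")
    except UnicodeDecodeError as err:
        # Bytes above 0x7F are simply not part of ASCII.
        print("not ASCII:", err.reason)
```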

> Chalky


 
Date: 12 Nov 2006 23:19:58
From: Chalky
Subject: Re: Restricted ASCII?



Pat O'Connell wrote:

> Only characters 0 through 127 have been standardized as part of ASCII.
> Each computer operating system (for instance DOS, Windows, Mac, and VMS)
> displays its own symbol set for 128 through 255.

Thanks for this info..

But, _good grief_, is this for real?

I had assumed that if I checked that a web page displayed correctly
with Netscape 4, Netscape 7, Mozilla, Firefox, Opera, and Internet
Explorer browsers, this would mean that it probably displayed
correctly, period.

Are you now telling me that I also have to run all these browsers on a
range of different computer makes, with different dates of manufacture,
before I can have any confidence in this?

If so, is there a reference link I can go to, to identify what the
problem characters are likely to be?

Or is it now necessary for each web designer to construct his own
graphics symbols for everything that is not already defined in 7 bit
US-ASCII ?


Chalky



  
Date: 13 Nov 2006 12:51:03
From: Richard Tobin
Subject: Re: Restricted ASCII?


In article <1163402398.151550.250220@m7g2000cwm.googlegroups.com >,
Chalky <chalkyspam@bleachboys.co.uk > wrote:

>> Only characters 0 through 127 have been standardized as part of ASCII.
>> Each computer operating system (for instance DOS, Windows, Mac, and VMS)
>> displays its own symbol set for 128 through 255.

This is an exaggeration. Most modern software knows about different
character encodings and can display them. It's true that if you
present some data without any information about the encoding,
different systems may display different things by default.

>Are you now telling me that I also have to run all these browsers on a
>range of different computer makes, with different dates of manufacture,
>before I can have any confidence in this?

HTML uses Unicode. Any particular page will use some character encoding
that covers all or part of Unicode, and can use character references
(such as &#163;) or entity references (such as &eacute;) for characters
that the encoding doesn't provide. Web servers are supposed to inform
browsers what the encoding is, so they can display the right characters.

Unfortunately many pages are incompetently written, and many web
servers are incorrectly configured. For example, pages written in
some Windows encoding are often incorrectly claimed by the server to
be in ISO-Latin-1. Browsers try to compensate for this, but don't
always get it right.

A simple solution for authors is to use only plain ascii characters in
your web page, and use character or entity references for all the
others.
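
One way to do that mechanically, as a rough sketch rather than a complete
tool: Python's built-in "xmlcharrefreplace" error handler turns every
non-ASCII character into a numeric character reference. The function name
to_ascii_html is made up for this example, and it does not escape markup
characters such as & or <.

```python
def to_ascii_html(text):
    """Return `text` with every non-ASCII character replaced by a numeric
    character reference such as &#163;, leaving plain ASCII untouched."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # \u00a3 is the pound sign, \u00e9 is e with acute accent.
    print(to_ascii_html("\u00a35 caf\u00e9"))
```

The output contains only 7-bit characters, so it survives whatever 8-bit
encoding the server or browser happens to assume.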

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.


   
Date: 13 Nov 2006 15:13:16
From: Paul Schlyter
Subject: Re: Restricted ASCII?


In article <ej9pnn$mdb$1@pc-news.cogsci.ed.ac.uk >,
Richard Tobin <richard@cogsci.ed.ac.uk > wrote:
>
>Unfortunately many pages are incompetently written, and many web
>servers are incorrectly configured. For example, pages written in
>some Windows encoding are often incorrectly claimed by the server to
>be in ISO-Latin-1.

That's Windows Code Page 1252, aka Win-Latin-1

The differences between ISO-Latin-1 and Win-Latin-1 are in the characters
0x80 to 0x9F: in ISO-Latin-1 this range holds a second set of control
characters (C1), in addition to those at 0x00 to 0x1F, while in Win-Latin-1
additional printable characters have been put here. If you want to stay in the ISO
domain of character sets, you must use Unicode to display the characters
which have been put in 0x80 to 0x9F in Win-Latin-1. Here you can find
a table which translates Win-Latin-1 to Unicode:
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
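
The same mapping can be inspected locally. This is a minimal Python sketch
that relies on Python's bundled cp1252 codec rather than on the table at
the URL above; the function name is invented for the example.

```python
def cp1252_extras():
    """Map each byte in 0x80-0x9F to its Win-Latin-1 (cp1252) character,
    or to None where Python's cp1252 codec leaves the byte undefined."""
    table = {}
    for byte_value in range(0x80, 0xA0):
        try:
            table[byte_value] = bytes([byte_value]).decode("cp1252")
        except UnicodeDecodeError:
            table[byte_value] = None
    return table


if __name__ == "__main__":
    for byte_value, char in cp1252_extras().items():
        shown = "undefined" if char is None else f"U+{ord(char):04X} {char}"
        print(f"0x{byte_value:02X} -> {shown}")
```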

>Browsers try to compensate for this, but don't always get it right.

Browsers which are told that a particular web page contains ISO-Latin
will probably just skip any characters in the 0x80 to 0x9F range,
treating them as control characters. So the effect of putting Win-Latin
into an HTML page and then telling the browser it's ISO-Latin is that some
characters will vanish when displayed.

Another problem is web pages which don't say anything at all about what
character encoding they use - then the browser must guess. If the web page uses
pure US-ASCII, all works well. But if the web page uses ISO-Latin, the
browser might try to display it as UTF-8, with peculiar results. Or the
page may contain UTF-8 which the browser tries to display as ISO-Latin,
also with peculiar results.
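
The "peculiar results" are easy to reproduce. A minimal Python sketch,
where misread is a made-up helper that encodes text one way and decodes it
another, with invalid bytes shown as the replacement character U+FFFD:

```python
def misread(text, actual, assumed):
    """Encode `text` as `actual`, then decode the bytes as `assumed`,
    substituting U+FFFD wherever the bytes are invalid in `assumed`."""
    return text.encode(actual).decode(assumed, errors="replace")


if __name__ == "__main__":
    name = "Schr\u00f6dinger"  # o with diaeresis
    print(misread(name, "utf-8", "iso-8859-1"))  # UTF-8 page shown as Latin-1
    print(misread(name, "iso-8859-1", "utf-8"))  # Latin-1 page shown as UTF-8
```

In the first case the single letter turns into two Latin-1 characters; in
the second the lone 0xF6 byte is not valid UTF-8 at all.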





    
Date: 13 Nov 2006 17:02:15
From: Richard Tobin
Subject: Re: Restricted ASCII?


In article <eja0ba$1vup$1@merope.saaf.se >,
Paul Schlyter <pausch@saaf.se > wrote:

>Browsers which are told that a particular web page contain ISO-Latin
>will probably just skip any characters in the 0x80 to 0x9F range,
>treating them as control characters.

Not in my experience. For example, when I try it with Firefox or
Safari on a Mac, it shows the cp-1252 characters. This is probably
because it doesn't check the range at all, and just happens to be
using a font in which those code points (wrongly) have the cp-1252
glyphs.

>Another problem are web pages which don't say anything at all what character
>encoding it uses - then the browser must guess. If the web page uses
>pure US-ASCII, all works well. But if the web page uses ISO-Latin, the
>browser might try to display it as UTF-8, with peculiar results.

In theory, HTTP specifies that the default character set for documents
with media-type text/* (including text/html) is Latin-1.
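
As a sketch of that rule (not of what any real browser does), the charset
can be pulled out of a Content-Type header value with Python's standard
email.message module, falling back to Latin-1 when none is given. The
function name charset_of is invented for this example.

```python
from email.message import Message


def charset_of(content_type, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type header value,
    or `default` when the header does not name one."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(failobj=default)


if __name__ == "__main__":
    print(charset_of("text/html; charset=UTF-8"))  # explicit charset
    print(charset_of("text/html"))                 # falls back to the default
```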

-- Richard


     
Date: 14 Nov 2006 00:33:21
From: Ben Rudiak-Gould
Subject: Re: Restricted ASCII?


Richard Tobin wrote:
>Paul Schlyter <pausch@saaf.se> wrote:
>>Browsers which are told that a particular web page contain ISO-Latin
>>will probably just skip any characters in the 0x80 to 0x9F range,
>>treating them as control characters.
>
>Not in my experience. For example, when I try it with Firefox or
>Safari on a Mac, it shows the cp-1252 characters. This is probably
>because it doesn't check the range at all, and just happens to have to
>be using a font in which those code points (wrongly) have the cp-1252
>glyphs.

My guess is that it's a deliberate workaround for a FrontPage bug. There are
lots of pages on the web generated by broken versions of FrontPage that
misidentify Windows code page 1252 as Latin-1. Aliasing Latin-1 to CP1252
makes those pages display correctly and doesn't hurt anything else (since
actual Latin-1 pages presumably won't use the C1 characters at all).
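
A minimal sketch of that kind of aliasing in Python. The alias table and
function name are hypothetical and only illustrate the idea; I am not
claiming this is how any particular browser implements it.

```python
# Hypothetical alias table: labels that claim Latin-1 are decoded as cp1252.
ALIASES = {"iso-8859-1": "cp1252", "latin-1": "cp1252", "latin1": "cp1252"}


def lenient_decode(raw, declared):
    """Decode `raw` using the declared charset, except that Latin-1 labels
    are quietly treated as Windows code page 1252."""
    codec = ALIASES.get(declared.lower(), declared)
    # The few bytes cp1252 leaves undefined become U+FFFD instead of failing.
    return raw.decode(codec, errors="replace")


if __name__ == "__main__":
    # 0x93 and 0x94 are the cp1252 curly quotes that FrontPage-style pages use.
    print(lenient_decode(b"\x93quoted\x94", "ISO-8859-1"))
    # Genuine Latin-1 text in the 0xA0-0xFF range is unaffected.
    print(lenient_decode(b"caf\xe9", "ISO-8859-1"))
```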

-- Ben


  
Date: 13 Nov 2006 07:44:19
From: Sorcerer
Subject: Re: Restricted ASCII?



"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message
news:1163402398.151550.250220@m7g2000cwm.googlegroups.com...


   
Date: 13 Nov 2006 15:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?


In article <nnV5h.124886$3x1.20740@fe1.news.blueyonder.co.uk >,
Sorcerer <Headmaster@hogwarts.physics_e > wrote:
..........
> I was accused of writing milliseconds (symbol roman m)
>when I had written microseconds (symbol greek mu) by someone
>not using Internet Explorer or Firefox. I am NOT changing it.
>The reader can change his browser as far as I'm concerned.
>See for yourself:
> http://www.androcles01.pwp.blueyonder.co.uk/Smart/Smart.htm
>
>Androcles.

You could try to write:

&mu;

instead of:

<font face="Symbol" >m</font>

though. That requires a Unicode font in the web browser to display properly,
but it worked fine on my browsers.

Or you could encode your web page as UTF-8 (and then of course also say so in
the header of the HTML) -- then you could write greek letters directly in
your HTML code. That requires a UTF-8 capable browser, but most modern
browsers have that capability.

Note: the Unicode character code for Greek mu is: 0x03BC
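
For anyone who wants to check these numbers, here is a minimal Python
sketch using only the standard library. It assumes the named entity for
this letter is &mu; and uses the decimal form of 0x03BC, which is 956.

```python
import html
import unicodedata

MU = "\u03bc"  # the code point quoted above


def mu_facts():
    """Collect a few checkable facts about Greek small letter mu."""
    return {
        "name": unicodedata.name(MU),
        "entity_matches": html.unescape("&mu;") == MU,
        "decimal_reference_matches": html.unescape("&#956;") == MU,
        "utf8_bytes": MU.encode("utf-8"),
    }


def fits_latin1(char):
    """Return True if `char` can be stored in an ISO-8859-1 encoded page."""
    try:
        char.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print(mu_facts())
    print(fits_latin1(MU))
```

Greek mu has no slot in ISO-8859-1, which is why it needs either a
reference in the HTML or a page encoding such as UTF-8.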



    
Date: 13 Nov 2006 17:32:34
From: Sorcerer
Subject: Re: Restricted ASCII?



"Paul Schlyter" <pausch@saaf.se > wrote in message
news:eja37t$2165$1@merope.saaf.se...


     
Date: 13 Nov 2006 17:47:05
From: Richard Tobin
Subject: Re: Restricted ASCII?


In article <S_16h.159294$lT5.31497@fe2.news.blueyonder.co.uk >,
Sorcerer <Headmaster@hogwarts.physics_e > wrote:

>Thank you, I could. But I'm not going to, and neither am I going to use
>Chinese, Hebrew or Cherokee characters in anticipation of someone having
>a browser that doesn't recognise Greek characters. Either he gets a browser
>that does, or he fails to communicate. That's his problem, not mine.

But you didn't use a Greek character. You used an "m", and requested
a font in which you expect it to look like a mu. Why not just use
a mu?

-- Richard


      
Date: 13 Nov 2006 17:58:04
From: Sorcerer
Subject: Re: Restricted ASCII?



"Richard Tobin" <richard@cogsci.ed.ac.uk > wrote in message
news:ejab2p$vc2$1@pc-news.cogsci.ed.ac.uk...



Did you snip something? It seems I have too and now I can't find it.
Androcles




     
Date: 13 Nov 2006 22:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?


In article <S_16h.159294$lT5.31497@fe2.news.blueyonder.co.uk >,
Sorcerer <Headmaster@hogwarts.physics_e > wrote:

> "Paul Schlyter" <pausch@saaf.se> wrote in message
> news:eja37t$2165$1@merope.saaf.se...
>> In article <nnV5h.124886$3x1.20740@fe1.news.blueyonder.co.uk>,
>> Sorcerer <Headmaster@hogwarts.physics_e> wrote:
>> ..........
>> > I was accused of writing milliseconds (symbol roman m)
>> >when I had written microseconds (symbol greek mu) by someone
>> >not using Internet Explorer or Firefox. I am NOT changing it.
>> >The reader can change his browser as far as I'm concerned.
>> >See for yourself:
>> > http://www.androcles01.pwp.blueyonder.co.uk/Smart/Smart.htm
>> >
>> >Androcles.
>>
>> You could try to write:
>>
>> &mu;
>>
>> instead of:
>>
>> <font face="Symbol">m</font>
>>
>> though. That requires a Unicode font in the web browser to display
>> properly, but it worked fine on my browsers.
>>
>> Or you could encode your web page as UTF-8 (and then of course also say so
>> in the header of the HTML) -- then you could write greek letters directly
>> in your HTML code. But that requires an UTF-8 capable browser, but most
>> modern browser have that capability.
>>
>> Note: the Unicode character code for Greek mu is: 0x03BC
>
> Thank you, I could. But I'm not going to, and neither am I going to use
> Chinese, Hebrew or Cherokee characters in anticipation of someone having
> a browser that doesn't recognise Greek characters. Either he gets a browser
> that does, or he fails to communicate. That's his problem, not mine.

....excuse me, but you really got this backwards. Using &mu; in your
HTML really requires the browser to recognize Greek characters -- it just
won't work on browsers recognizing no Greek characters....


> Many years ago I went to Italy to speak with some engineers concerning
> the electronics of Tornado, which was built jointly between Britain,
> Germany and Italy.
> http://archive.cs.uu.nl/pub/AIRCRAFT-IMAGES/Tornado.jpg
> I discovered there were no electronics data books in Italian, they
> were using American just as I was. So we both had to use a common
> subset of English which I admit was easier for me to learn than he,
> but had I been German we'd have been on equal footing.
>
> Esperanto is a failure, English is a success.

Currently it is, but in a century or two the situation may be different.
Maybe we're all speaking Chinese then..... :-)

You probably see no difference between a century and eternity though... <g >

> So... I'm not going to
> rewrite
> or change my pages to Unicode. You are free to do so if you wish.
> Androcles



      
Date: 14 Nov 2006 04:31:08
From: Sorcerer
Subject: Re: Restricted ASCII?



"Paul Schlyter" <pausch@saaf.se > wrote in message
news:ejaqps$2aq0$1@merope.saaf.se...


    
Date: 14 Nov 2006 08:19:59
From: David Woolley
Subject: Re: Restricted ASCII?


In article <eja37t$2165$1@merope.saaf.se >,
pausch@saaf.se (Paul Schlyter) wrote:

As this is a sort of new thread, I'll drop in just for a while, probably
just for one posting.

> You could try to write:

> &mu;

Correct.

> instead of:

> <font face="Symbol">m</font>

Which has only ever meant "m" since the very first version of
HTML, albeit a rather strange way of writing an "m". This particular
technique, which I call the "symbol hack", allowed one to make HTML 3.2
and earlier browsers display characters that were not in the western
European only character set (ISO 8859/1) allowed by those versions
of HTML. It always was an abuse of HTML and would not work on text
only and non-visual browsers, or when abstracting for search engines
(although some may now recognize the hack).

It hasn't been necessary for almost 7 years now and probably only works
because browsers tend to be bugwise compatible with earlier browsers rather
than properly conforming with the specification. A fully conformant
HTML 4.01 browser should display "m", using a fallback font. Androcles
is the one using the non-compliant browser. (One should never use a
popular browser to check HTML; they go out of their way to second
guess bad HTML.)

The main reason that it probably worked in early browsers is that they
didn't honour the character set rules for HTML at all and simply passed
the raw bytes to their font engines.

> though. That requires a Unicode font in the web browser to display properly,
> but it worked fine on my browsers.

It only really requires the browser to know the Unicode encoding for
the Symbol font, something that ought to be built into any browser on
a platform with a custom encoded Symbol font.



 
Date: 12 Nov 2006 22:06:31
From: Chalky
Subject: Re: Restricted ASCII? The final test


Chalky wrote:

> Thanks too to Sorcerer (Androcles), and George Dishman for your
> collective constructive feedback. It seems that you probably had a
> secondary display problem but I didn't, because you have Usenet
> postings e-mailed to you.
>
> I just go to the website to read what I am interested in, and, when I
> respond, I do so via form submission, so there is no e-mail protocol
> involved, my side.
>
> Your resultant problem might have been because I originally pasted in
> the displayed characters which sprang to life after I had typed in the
> decimal translation of the machine code for those characters.
> Consequently, for a final test, I am simply typing in the decimal
> translations of the machine codes for the Japanese Yen, the registered
> trade mark, and the Euro, encased in the beautifully symmetric Spanish
> version of the question mark, using the HTML identifiers &, #, and ;
> below:
>
> &#191; &#165; &#174; &#128; ?
>
>
> Let me know what you see.
>
>
> Chalky

OK, that corrupted here for me. Did it also for those receiving
postings by email?



  
Date: 13 Nov 2006 18:00:41
From: Pat O'Connell
Subject: Re: Restricted ASCII? The final test


Chalky wrote:
> Chalky wrote:
>
>> Thanks too to Sorcerer (Androcles), and George Dishman for your
>> collective constructive feedback. It seems that you probably had a
>> secondary display problem but I didn't, because you have Usenet
>> postings e-mailed to you.
>>
>> I just go to the website to read what I am interested in, and, when I
>> respond, I do so via form submission, so there is no e-mail protocol
>> involved, my side.
>>
>> Your resultant problem might have been because I originally pasted in
>> the displayed characters which sprang to life after I had typed in the
>> decimal translation of the machine code for those characters.
>> Consequently, for a final test, I am simply typing in the decimal
>> translations of the machine codes for the Japanese Yen, the registered
>> trade mark, and the Euro, encased in the beautifully symmetric Spanish
>> version of the question mark, using the HTML identifiers &, #, and ;
>> below:
>>
>> &#191; &#165; &#174; &#128; ?

These are escaped characters in HTML, and are written correctly above in
ASCII.

Usenet readers don't display HTML, though some newsreaders will convert
URLs to be linkable.
>>
>>
>> Let me know what you see.

What I'm supposed to see in ASCII.

--
Pat O'Connell
[note munged EMail address]
Take nothing but pictures, Leave nothing but footprints,
Kill nothing but vandals...


   
Date: 14 Nov 2006 01:21:40
From: Ben Rudiak-Gould
Subject: Re: Restricted ASCII? The final test


Pat O'Connell wrote:
>Chalky wrote:
>>> &#191; &#165; &#174; &#128; ?
>
>These are escaped characters in HTML, and are written correctly above in
>ASCII.

Except for "&#128;", which is a control character, not the Euro symbol. The
Euro symbol is "&#8364;".
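
In Unicode terms, code point 128 is U+0080 and the Euro sign is U+20AC,
which is 8364 in decimal. A minimal Python sketch to check the difference,
using the standard unicodedata module (the helper name describe is made up):

```python
import unicodedata


def describe(code_point):
    """Return (general category, name) for a Unicode code point.
    Control characters have no name, so a placeholder is returned for them."""
    char = chr(code_point)
    return unicodedata.category(char), unicodedata.name(char, "<unnamed>")


if __name__ == "__main__":
    print(describe(128))   # category Cc means a control character
    print(describe(8364))  # category Sc means a currency symbol
    # Byte 128 only becomes the Euro sign under Windows code page 1252.
    print(bytes([128]).decode("cp1252") == chr(8364))
```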

-- Ben


 
Date: 13 Nov 2006 16:19:30
From: Jeff Root
Subject: Re: Restricted ASCII?


My web pages don't specify the character set because I
didn't know which character set to specify, and I'd rather
let the user's operating system & browser go with what
they think is right rather than specifying one which
doesn't work for some people.

However, recently I wanted to use a centered dot. I found
it in Arial (the font I was suggesting for the page) using
Windows Character Map. I copied the character and pasted
it into the page in my text editor. The editor is set to
use a different font, and the character did not display
correctly. Based on past experience, I went ahead anyway.
The dot showed up fine on the web page on my local hard
drive. But when I uploaded the page to the server and
viewed it from there, the character did not display.

It was suggested to me that I specify the character set.
But I still was unsure which set to specify, and wanted to
avoid unnecessary complications. I found that replacing
the pasted character with the HTML escape sequence ·
works, so I am currently going with that.

Any comments or suggestions? I will specify a character
set if I have confidence that it won't cause more problems
than it solves. If it really isn't needed, though, I'll
continue to omit it.

-- Jeff, in Minneapolis



  
Date: 14 Nov 2006 00:49:17
From: Ben Rudiak-Gould
Subject: Re: Restricted ASCII?


Jeff Root wrote:
>My web pages don't specify the character set because I
>didn't know which character set to specify, and I'd rather
>let the user's operating system & browser go with what
>they think is right rather than specifying one which
>doesn't work for some people.

The only way the page can not work for /some/ people is if you don't specify
an encoding. If you do specify an encoding, then it will either look right
to everyone or look wrong to everyone. In particular, if it looks right to
you, it'll look right to everyone. That's much better than leaving it up to
each user's browser.
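
A rough sketch of what "specifying an encoding" amounts to, written in
Python: the declaration in the page and the bytes actually written have to
agree. The function name render_page and the bare-bones markup are made up
for the example.

```python
def render_page(body_html, charset="utf-8"):
    """Return the bytes of a small HTML page whose declared charset
    matches the encoding used to produce those bytes."""
    head = f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
    page = f"<html><head>{head}</head><body>{body_html}</body></html>"
    return page.encode(charset)


if __name__ == "__main__":
    # \u00b7 is the centered dot; its bytes differ between the two encodings,
    # but each page says which one it uses.
    print(render_page("a\u00b7b"))
    print(render_page("a\u00b7b", "iso-8859-1"))
```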

>It was suggested to me that I specify the character set.
>But I still was unsure which set to specify, and wanted to
>avoid unnecessary complications. I found that replacing
>the pasted character with the HTML escape sequence ·
>works, so I am currently going with that.

Yes, stick with ASCII and you'll be fine. "·" is a sequence of ASCII
characters for document-encoding purposes.

-- Ben


 
Date: 13 Nov 2006 21:57:59
From: David Woolley
Subject: Re: Restricted ASCII?


In article <1163391574.028033.308640@h48g2000cwc.googlegroups.com >,
chalkyspam@bleachboys.co.uk wrote:

[ all snipped ]

I think I'm spending too much time on this thread, without there being
sufficient benefit to the world. There are just too many misunderstandings
to counter and details to check. Therefore I'm going to drop out.

However, as a parting observation, I'd point out that neither Google
Groups nor Wikipedia use 8 bit character encodings. You are campaigning
for something that is already obsolescent!


  
Date: 14 Nov 2006 04:31:08
From: Sorcerer
Subject: Re: Restricted ASCII?



"David Woolley" <david@djwhome.demon.co.uk > wrote in message
news:T1163455119@djwhome.demon.co.uk...


 
Date: 14 Nov 2006 16:37:07
From: zzbunker@netscape.net
Subject: Re: Restricted ASCII?



Chalky wrote:
> Chalky wrote:
>
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.
> >
> > However, I have noticed that this set is truncated to 7 characters at
> > sci.astro.research to conform to its first commercial use as a
> > seven-bit teleprinter code (1963).
> >
> > This can result in considerable garbling of 8-bit ASCII text.
> >
> > Since I am pretty sure I have seen Schrodinger's equation spelled
> > correctly at sci.physics.research, I am curious to discover how endemic
> > this restriction of the ASCII set actually still is, in the Usenet
> > groups.
> >
> > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > -, and for copyright, below
> >
> > =B0 =B1 =A9
> >
> > Chalky
>
> Interesting. All groups display correctly except
> sci.physics.relativity, which still displayed 8 bits, but translated
> into a more old-fashioned font.
>
> Looks like sci.astro.research is the only newsgroup actually restricted
> to 7 bits.
>
> I wonder why that is?

It's probably the same reason as all
backward compatibility problems.
Whoever set the newsgroup up may
have decided to use Fortran code
from the Mercury missions as the
standard for that group's servers.


>
> Chalky



 
Date: 14 Nov 2006 10:21:47
From: Chalky
Subject: Re: Restricted ASCII?


Greg Hennessy wrote:

> On 2006-11-13, Chalky <chalkyspam@bleachboys.co.uk> wrote:
> > I did, and it was promptly changed, after my original posting on this
> > subject (yesterday). However, if David Wooley's faith in the accuracy
> > of Wiki logs of alterations is justified, we now know that the
> > misleading text string read as:
> >
> > a 8-bit system that would allow for 256 characters, the ASCII developed
> > from telegraphic codes and first entered commercial use as a seven-bit
> > teleprinter code promoted by Bell data services in 1963.
>
> No, that is *NOT* what the text read as. You are now simply
> lieing.

OK. So, obviously, you have now decided to demonstrate talking out of
your own arse in copycat mode. That is very brave of you [I don't
think]. Since I do not suffer fools gladly, welcome to the
slaughterhouse, now. (You have asked for it.)

I will start by explaining to you that, unless you are determined to
demonstrate to the world that you are semi-literate, as well as a
complete idiot, it is worth your while:

(A) investing in a dictionary and a spellchecker. That way, you will
not make a complete fool of yourself, by accusing someone of what you
are actually doing yourself, by mis-spelling such an elementary word as
lying.

(B) READING and trying to UNDERSTAND the prior postings in the thread,
and their associated responses. That way, you will not make an even
bigger fool of yourself, by raising a subject that has already been
addressed and adequately resolved, in a less confrontational and (for
you) potentially libelous manner. (If you had only had enough education
to get that elementary spelling right, I would at least have then had a
fighting chance of being "quids in" following litigation.

(C) investing in a text editor. That way, even if you don't have
adequate intelligence to work out what "misleading text string" means,
you will at least then be able to work out, via a tedious process of
trial and error, that your sentence

> "Some time after the EBCDIC code, a 8-bit system that would allow
> for 256 characters, the ASCII developed from Telegraphy


  
Date: 15 Nov 2006 03:47:52
From: Greg Hennessy
Subject: Re: Restricted ASCII?


On 2006-11-14, Chalky <chalkyspam@bleachboys.co.uk > wrote:
> (C) investing in a text editor. That way, even if you don't have
> adequate intelligence to work out what "misleading text string" means,
> you will at least then be able to work out, via a tedious process of
> trial and error, that your sentence
>
>> "Some time after the EBCDIC code, a 8-bit system that would allow
>> for 256 characters, the ASCII developed from Telegraphy


  
Date: 14 Nov 2006 21:13:29
From: Paul Schlyter
Subject: Re: Restricted ASCII?


Ever considered becoming a lawyer? You'd probably succeed, since you
seem to have a great talent at twisting and misinterpreting words......

FYI: truncating a piece from a text frequently alters the meaning
of the text. It's a common trick by those who wish to misinterpret
a text.


In article <1163528507.205692.270060@h54g2000cwb.googlegroups.com >,
Chalky <chalkyspam@bleachboys.co.uk > wrote:

> Greg Hennessy wrote:
>
>> On 2006-11-13, Chalky <chalkyspam@bleachboys.co.uk> wrote:
>>> I did, and it was promptly changed, after my original posting on this
>>> subject (yesterday). However, if David Wooley's faith in the accuracy
>>> of Wiki logs of alterations is justified, we now know that the
>>> misleading text string read as:
>>>
>>> a 8-bit system that would allow for 256 characters, the ASCII developed
>>> from telegraphic codes and first entered commercial use as a seven-bit
>>> teleprinter code promoted by Bell data services in 1963.
>>
>> No, that is *NOT* what the text read as. You are now simply
>> lieing.
>
> OK. So, obviously, you have now decided to demonstrate talking out of
> your own arse in copycat mode. That is very brave of you [I don't
> think]. Since I do not suffer fools gladly, welcome to the
> slaughterhouse, now. (You have asked for it.)
>
> I will start by explaining to you that, unless you are determined to
> demonstrate to the world that you are semi-literate, as well as a
> complete idiot, it is worth your while:
>
> (A) investing in a dictionary and a spellchecker. That way, you will
> not make a complete fool of yourself, by accusing someone of what you
> are actually doing yourself, by mis-spelling such an elementary word as
> lying.
>
> (B) READING and trying to UNDERSTAND the prior postings in the thread,
> and their associated responses. That way, you will not make an even
> bigger fool of yourself, by raising a subject that has already been
> addressed and adequately resolved, in a less confrontational and (for
> you) potentially libelous manner. (If you had only had enough education
> to get that elementary spelling right, I would at least have then had a
> fighting chance of being "quids in" following litigation.
>
> (C) investing in a text editor. That way, even if you don't have
> adequate intelligence to work out what "misleading text string" means,
> you will at least then be able to work out, via a tedious process of
> trial and error, that your sentence
>
>> "Some time after the EBCDIC code, a 8-bit system that would allow
>> for 256 characters, the ASCII developed from Telegraphy


 
Date: 15 Nov 2006 01:51:58
From: John (Liberty) Bell
Subject: Re: Restricted ASCII?



Richard Tobin wrote:

> In article <1163402398.151550.250220@m7g2000cwm.googlegroups.com>,
> Chalky <chalkyspam@bleachboys.co.uk> wrote:

> >> Only characters 0 through 127 have been standardized as part of ASCII.
> >> Each computer operating system (for instance DOS, Windows, Mac, and VMS)
> >> displays its own symbol set for 128 through 255.

You have 'credited' the wrong person for this misleading statement
here. It was Pat O'Connell who wrote that.

> This is an exaggeration. Most modern software knows about different
> character encodings and can display them. It's true that if you
> present some data without any information about the encoding,
> different systems may display different things by default.

JB



  
Date: 15 Nov 2006 15:35:02
From: Richard Tobin
Subject: Re: Restricted ASCII?


In article <1163584318.857439.244880@m73g2000cwd.googlegroups.com >,
John (Liberty) Bell <john.bell@accelerators.co.uk > wrote:

>You have 'credited' the wrong person for this misleading statement
>here.

No, I credited him with quoting it. That's why there were two ' >'
characters at the beginning of each line.

-- Richard


 
Date: 15 Nov 2006 19:45:13
From: John (Liberty) Bell
Subject: Re: Restricted ASCII? and the final test


Ben Rudiak-Gould wrote (Re: Restricted ASCII? The final test):

> Pat O'Connell wrote:
> >Chalky wrote:
> >>> &#191; &#165; &#174; &#128; ?
> >
> >These are escaped characters in HTML, and are written correctly above in
> >ASCII.

I suspect the reason most of us see (eg) &#191; and not the character
represented by &#191; in HTML, may be that the groups server has
cunningly inserted an extra invisible character within each of those
HTML instructions, to block our browsers and newsreaders from
displaying those HTML instructions, as single characters.

As Chalky discovered, if you instead simply paste in the corresponding
displayed HTML character when making a posting, many of us will then
see it. However, there are then some user dependent problems which can
arise:

1) If you are using an Outlook Express Newsreader, the indents will
foul up when you try to respond.

2) If you are using another (as yet unidentified [here]) user
interface, that character is translated instead into a string of (7
bit) ascii characters, so you don't see what was intended.

3) If, on the other hand, you are using an Internet browser interface,
there are no resultant problems for you, personally, UNLESS you post to
sci.astro.research. [This is because the moderator there falls into
category (2), and, if approved, the posting will then appear in the
altered form the moderator saw]
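
[One possible mechanism for case (2), offered as an assumption and not as
a diagnosis of any particular newsreader: 8-bit ISO 8859/1 text sent over
a 7-bit path is commonly encoded as MIME quoted-printable, where each
8-bit byte becomes "=" followed by two hex digits. The degree, plus-minus
and copyright characters then appear as =B0 =B1 =A9. A minimal Python
sketch using the standard quopri module:]

```python
import quopri

# Degree, plus-minus and copyright signs, encoded as ISO 8859/1 bytes.
original = "\u00b0 \u00b1 \u00a9"
eight_bit = original.encode("latin-1")

# Quoted-printable rewrites each byte above 127 as "=XX" (7-bit safe).
seven_bit = quopri.encodestring(eight_bit)

# A reader that understands the encoding can recover the characters.
recovered = quopri.decodestring(seven_bit).decode("latin-1")

print(seven_bit.decode("ascii"))
print(recovered)
```

[A reader that does not undo the encoding shows the "=XX" strings
instead of the intended characters.]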

> Except for "&#128;", which is a control character, not the Euro symbol.

"&#128;" is a Euro symbol for Microsoft browsers and for other
'relaxed' ISO based browsers. "&#128;" is a control character only
under 'strict' ISO interpretation.

> The Euro symbol is "&#8364;".

Yes, that is the Euro symbol in Unicode. It appears to work on all
browsers going back at least as far as Netscape 4.75. That,
incidentally, is the only browser I have tried which does not also
display the Euro symbol when fed &#128;. Instead it displays €.

So I think I would agree, on the basis of this early Netscape test,
that Unicode is probably the best way to go (at least for HTML
scripting).


Regards

John



  
Date: 16 Nov 2006 13:32:14
From: Richard Tobin
Subject: Re: Restricted ASCII? and the final test


In article <1163648713.310853.239480@b28g2000cwb.googlegroups.com >,
John (Liberty) Bell <john.bell@accelerators.co.uk > wrote:

>"&#128;" is a Euro symbol for Microsoft browsers and for other
>'relaxed' ISO based browsers. "&#128;" is a control character only
>under 'strict' ISO interpretation

This is nonsense. Character number 128 in some Microsoft character
sets is a Euro character. But the &#NNN; notation *means* the
character represented by that number in Unicode. That browsers
display a Euro symbol is at best an attempt to make broken pages
display correctly, and may well just be a consequence of the fonts
they use.
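
[The distinction can be checked with a short Python sketch. It assumes
that "some Microsoft character sets" includes Windows-1252 (Python codec
name "cp1252"); the variable names are illustrative only.]

```python
import unicodedata

# &#128; names Unicode code point 128, which is a C1 control character.
cp128 = chr(128)
category = unicodedata.category(cp128)  # "Cc" means "control"

# Byte 0x80 in Windows-1252 (a Microsoft character set) is the Euro sign.
euro_from_cp1252 = bytes([0x80]).decode("cp1252")

# The Euro sign's own Unicode code point is 8364 (hex 20AC).
euro_code_point = ord(euro_from_cp1252)

# The same byte read as ISO 8859/1 stays a control character.
latin1_char = bytes([0x80]).decode("latin-1")

print(category, euro_from_cp1252, euro_code_point, latin1_char == cp128)
```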

>> The Euro symbol is "&#8364;".
>
>Yes, that is the Euro symbol in Unicode. It appears to work on all
>browsers going back at least as far as Netscape 4.75.

Whether it works depends more on the fonts than on the browsers.
Browsers don't have code to handle each character.

>So I think I would agree, on the basis of this early Netscape test,
>that Unicode is probably the best way to go (at least for HTML
>scripting).

If it's not Unicode, it's not HTML either.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.


  
Date: 16 Nov 2006 07:42:37
From: Paul Schlyter
Subject: Re: Restricted ASCII? and the final test


In article <1163648713.310853.239480@b28g2000cwb.googlegroups.com >,
John (Liberty) Bell <john.bell@accelerators.co.uk > wrote:

> Ben Rudiak-Gould wrote (Re: Restricted ASCII? The final test):
>
>> Pat O'Connell wrote:
>>>Chalky wrote:
>>>>> &#191; &#165; &#174; &#128; ?
>>>
>>>These are escaped characters in HTML, and are written correctly above in
>>>ASCII.
>
> I suspect the reason most of us see (eg) &#191; and not the character
> represented by &#191; in HTML, may be that the groups server has
> cunningly inserted an extra invisible character within each of those
> HTML instructions, to block our browsers and newsreaders from
> displaying those HTML instructions, as single characters.

There are no such invisible characters inserted here, and I see
&#191; too....

There's an easier way to accomplish this: make sure a line like
this is present in the message header:

Content-Type: text/plain; charset="us-ascii"

A compliant news reader should then display this as pure ASCII, not as HTML
escape characters. Of course, if web based, the news reader must do
some processing, such as replacing the &#191; with e.g. &amp;#191;
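
[A minimal Python sketch of that processing, assuming a web-based reader
that parses the article with the standard email package and escapes the
body before putting it into a page. The sample article and the variable
names are made up for illustration.]

```python
from email import message_from_string
from html import escape

# A tiny article with the header line suggested above (sample text).
raw_article = (
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "Test characters: &#191; &#165;\n"
)

article = message_from_string(raw_article)
charset = article.get_content_charset()  # declared character set
body = article.get_payload()             # plain-text body, unchanged

# Escape "&" so a browser shows the literal text, not the character.
web_safe = escape(body, quote=False)

print(charset)
print(web_safe)
```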

--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/


 
Date: 16 Nov 2006 02:51:52
From: John (Liberty) Bell
Subject: Re: Restricted ASCII? and the final test



Paul Schlyter wrote:

> In article <1163648713.310853.239480@b28g2000cwb.googlegroups.com>,
> John (Liberty) Bell <john.bell@accelerators.co.uk> wrote:
> >> Pat O'Connell wrote:
> >>>Chalky wrote:
> >>>>> &#191; &#165; &#174; &#128; ?

> >>>These are escaped characters in HTML, and are written correctly above in
> >>>ASCII.

> > I suspect the reason most of us see (eg) &#191; and not the character
> > represented by &#191; in HTML, may be that the groups server has
> > cunningly inserted an extra invisible character within each of those
> > HTML instructions, to block our browsers and newsreaders from
> > displaying those HTML instructions, as single characters.

> There are no such invisible characters inserted here, and I see
> &#191; too....

OK, so my blind guess can't be right.

> There's an easier way to accomplish this: make sure a line like
> this is present in the message header:
>
> Content-Type: text/plain; charset="us-ascii"

Ah! So it IS called us-ascii

> A compliant news reader should then display this as pure ASCII, not as HTML
> escape characters. Of course, if web based, the news reader must do
> some processing, such as replacing the &#191; with e.g. &amp;#191;

Ah So! Chalky did say he saw something like &amp;#191; when he
previewed the original posting, so modified the posting by pasting in
the correspondingly displayed characters, when he employed browser and
email clients directly to read the code.

However, when I previewed my own (later) postings, no such alteration
on preview was displayed.

Amazing how much seems to have changed in just a few days, isn't it?


John



  
Date: 16 Nov 2006 23:47:28
From: George Dishman
Subject: Re: Restricted ASCII? and the final test



"John (Liberty) Bell" <john.bell@accelerators.co.uk > wrote in message
news:1163674312.513112.24390@b28g2000cwb.googlegroups.com...
>
> Ah So! Chalky did say he saw something like &amp;#191; when he
> previewed the original posting, so modified the posting by pasting in
> the correspondingly displayed characters, when he employed browser and
> email clients directly to read the code.

If you write HTML with notepad, you type "&amp;" to display
the ampersand sign. It is the browser that does the conversion
from HTML.

> However, when I previewed my own (later) postings, no such alteration
> on preview was displayed.

In a "posting" on Usenet, the "&amp;" should be passed through
unaltered and anyone using a compliant reader should see that.
If you view with a browser and it shows the ampersand then it is
broken; the interface should change "&amp;" to "&amp;amp;" so
the browser shows the characters.

> Amazing how much seems to have changed in just a few days, isn't it?

Not really, ASCII is still 7-bit, just as it always has been.

George




 
Date: 16 Nov 2006 21:02:20
From: Jeff Root
Subject: Re: Restricted ASCII? and the final test


When writing HTML, is it better to just say "M&M" or to
write out the verbose equivalent &quot;M&amp;M&quot; ?

-- Jeff, in Minneapolis



  
Date: 17 Nov 2006 12:56:09
From: Richard Tobin
Subject: Re: Restricted ASCII? and the final test


In article <1163739740.253656.102810@j44g2000cwa.googlegroups.com >,
Jeff Root <jeff5@freemars.org > wrote:

>When writing HTML, is it better to just say "M&M" or to
>write out the verbose equivalent &quot;M&amp;M&quot; ?

There are a few contexts in SGML in which & can be used literally, but
this is not one of them (it's recognised as an entity reference open
delimiter because it's followed by a name start character). And HTML
itself recommends that you should use &amp;.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.


  
Date: 17 Nov 2006 06:09:37
From: Chris L Peterson
Subject: Re: Restricted ASCII? and the final test


On 16 Nov 2006 21:02:20 -0800, "Jeff Root" <jeff5@freemars.org > wrote:

>When writing HTML, is it better to just say "M&M" or to
>write out the verbose equivalent &quot;M&amp;M&quot; ?

The latter will give the output you are looking for under a wider
variety of conditions.

_________________________________________________

Chris L Peterson
Cloudbait Observatory
http://www.cloudbait.com


 
Date: 17 Nov 2006 07:38:03
From: David Woolley
Subject: Re: Restricted ASCII? and the final test


In article <1163739740.253656.102810@j44g2000cwa.googlegroups.com >,
Jeff Root <jeff5@freemars.org > wrote:

> When writing HTML, is it better to just say "M&M" or to

M&M is not HTML (undefined general entity) and, for an XHTML document
would cause a compliant XHTML browser to abort the document (entity
reference not closed with ;, which is a well-formedness violation -
does not match the syntactical definition of a document).
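
[The well-formedness point can be tried with any XML parser. The sketch
below uses Python's standard xml.etree.ElementTree as a stand-in for an
XHTML browser's parser, which is an assumption made for illustration;
is_well_formed is a hypothetical helper name.]

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


bare = "<p>M&M</p>"         # bare ampersand: not well-formed
escaped = "<p>M&amp;M</p>"  # ampersand written as an entity reference

print(is_well_formed(bare))
print(is_well_formed(escaped))
print(ET.fromstring(escaped).text)
```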

(Note that IE (including IE7) does not support XHTML served as XHTML but
only a subset of XHTML 1.0 (defined in appendix C of its specification)
served, in compatibility mode, as HTML, and intended to be treated as
a sort of broken HTML.)

> write out the verbose equivalent &quot;M&amp;M&quot; ?

The &quot;'s are not needed, unless you use the string in an attribute
value, and even then, definitely for HTML, and I believe also for XHTML,
only if you use " rather than ' as the delimiter. (Using delimiters
is mandatory in XHTML, and is mandatory in HTML where most punctuation
characters are used.)
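
[As an illustration only: Python's html.escape function happens to expose
the same choice through its quote parameter. The sample text is the
string from the question; the variable names are made up.]

```python
from html import escape

text = '"M&M"'

# Element content: only the ampersand (and < >) needs escaping.
element_content = escape(text, quote=False)

# Value of a double-quoted attribute: the quotes are escaped as well.
attribute_value = escape(text, quote=True)

print(element_content)
print(attribute_value)
```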

The most common place where &'s are invalidly left bare is in URLs that
look like form submissions. The HTML specification itself points this
one out.
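
[A minimal Python sketch of writing such a URL into a link. The host
name and the query parameters are hypothetical; the point is only that
the "&" separating the parameters is written as an entity reference
inside the attribute.]

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical query parameters for a form-style URL.
query = urlencode({"object": "M31", "units": "deg"})
url = "http://example.org/search?" + query

# Escape the URL before placing it in a double-quoted attribute.
link = '<a href="{}">search</a>'.format(escape(url, quote=True))

print(url)
print(link)
```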

See <http://validator.w3.org/ > to check whether or not a document is HTML.

See <http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx > for Microsoft's
policy on supporting XML in IE7.