| |
Date: 12 Nov 2006 02:25:03
From: Chalky
Subject: Restricted ASCII?
|
ASCII is defined in wiki as an 8-bit system, developed from telegraphic codes, to allow for 256 characters. However, I have noticed that this set is truncated to 7 characters at sci.astro.research to conform to its first commercial use as a seven-bit teleprinter code (1963). This can result in considerable garbling of 8-bit ASCII text. Since I am pretty sure I have seen Schrodinger's equation spelled correctly at sci.physics.research, I am curious to discover how endemic this restriction of the ASCII set actually still is, in the Usenet groups. Towards this end, I have pasted in the 8 bit ASCII for degrees, for + -, and for copyright, below ° ± © Chalky
|
|
| |
Date: 12 Nov 2006 15:31:45
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <1163343127.057255.151280@h48g2000cwc.googlegroups.com>, chalkyspam@bleachboys.co.uk wrote: > I think it is unfortunate that we now appear to be restricted to that > subset of ascii which is represented by a single keypress on a standard > British or American keyboard. If you include shift and control modifiers, what you get on a standard US keyboard is the whole of ASCII. With the same caveat, what you get on a standard British keyboard is a superset of ASCII, because old keyboards support £ as a simple shifted character. (Modern ones also have the EURO symbol, but that requires the ISO 8859/15 character code or one of the proprietary ones. EURO, on PC keyboards, also requires the use of the alt-graphics modifier key.) As to American keyboards, I would have thought some parts of America used keyboards optimised for Spanish or Portuguese. Incidentally, this one was identified as pure ASCII, but I've had to convert to ISO 8859/1, in order to include £. I haven't used ISO 8859/15, as that is relatively recent and there may be people without support for it, either in the browser, or in fonts. > Content-Type: text/plain; charset="us-ascii" (Note the level of quoting is getting close to the level at which my spam filter cuts in because the article is too long.) PS. It looks like the article was rejected by the moderator on the research newsgroup, and that they rejected it for the obvious reason that it was off topic.
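The charset point in this post can be checked with a short Python sketch. It is only an illustration added for clarity, not something posted in the thread: the helper name `can_encode` is made up here, and the codec names are Python's aliases for the character sets mentioned above.

```python
# Illustrative only: which of the charsets named above can carry the
# pound sign (U+00A3) and the euro sign (U+20AC).
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


POUND = "\u00a3"
EURO = "\u20ac"

for charset in ("us-ascii", "iso-8859-1", "iso-8859-15"):
    print(charset, "pound:", can_encode(POUND, charset),
          "euro:", can_encode(EURO, charset))

# In ISO 8859-1 the pound sign is the single byte 0xA3; the euro sign only
# appears in ISO 8859-15, where it takes the byte 0xA4.
pound_byte = POUND.encode("iso-8859-1")
euro_byte = EURO.encode("iso-8859-15")
```

The pound sign fails under us-ascii and succeeds under both ISO 8859 variants, while the euro sign succeeds only under ISO 8859-15, which matches the reason given for switching the article's declared charset.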
|
| |
Date: 12 Nov 2006 15:38:12
From: Tom Roberts
Subject: Re: Restricted ASCII?
|
Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. > > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). [...] This is not due to the newsgroup itself or to the underlying newsgroup software. It is due to the newsgroup client used by individual people. For instance, I use Firefox and your symbols come out fine, except in moderated newsgroups. In moderated newsgroups it also depends on the email system used (author->moderator) and on the client software used by the moderator, because postings are processed via email and by the moderator's client before being distributed. So this is almost surely due to either the email software or the moderator of sci.astro.research using newsgroup software that truncates the high bit. Tom Roberts
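As a rough illustration of what "truncating the high bit" would do to the three test symbols, here is a minimal Python sketch. It assumes the text travels as ISO 8859-1 bytes and that some hop simply clears bit 7 of every byte; the thread does not confirm that this is what actually happens at sci.astro.research, and the function name `strip_high_bit` is hypothetical.

```python
# Illustrative only: clear the most significant bit of every byte, as a
# 7-bit-only mail or news hop might do.
def strip_high_bit(data: bytes) -> bytes:
    """Return data with bit 7 of each byte forced to zero."""
    return bytes(b & 0x7F for b in data)


# Degree, plus-minus and copyright signs as ISO 8859-1 bytes: B0, B1, A9.
original = "\u00b0 \u00b1 \u00a9".encode("iso-8859-1")
stripped = strip_high_bit(original)

print(original.hex(" "))         # b0 20 b1 20 a9
print(stripped.hex(" "))         # 30 20 31 20 29
print(stripped.decode("ascii"))  # 0 1 )
```

Under that assumption the symbols do not vanish; they turn into unrelated ASCII characters (0xB0 becomes "0", 0xB1 becomes "1", 0xA9 becomes ")"), which is one form the garbling could take.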
|
| |
Date: 12 Nov 2006 07:30:41
From: Chalky
Subject: Re: Restricted ASCII?
|
George Dishman wrote: > [note: hand indented because the special characters > included by "Chalky" forced the use of "quoted > printable" which prevents Outlook Express handling > the ident automatically.] OK, but what does this mean in terms of displayed information, since I don't really understand this comment? > "Chalky" <chalkyspam@bleachboys.co.uk> wrote in message > news:1163327103.618298.79130@e3g2000cwe.googlegroups.com... > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > http://en.wikipedia.org/wiki/ASCII WOW. That text has CERTAINLY been changed since this morning. What I quoted above was copied and pasted _directly_ from the first sentence of that reference, under the sub heading "History", at ~10AM BST today! > > "ASCII was first published as a standard in 1967 and > was last updated in 1986. It currently defines codes > for 128 characters. 33 are non-printing, mostly > obsolete control characters that affect how text is > processed, and the other 95 printable characters are > as follows (starting with the space character):" > > and later > > "ASCII is, strictly, a seven-bit code, meaning that it > uses the bit patterns representable with seven binary > digits (a range of 0 to 127 decimal) to represent > character information." Yes. That is copied from the CHANGED Wiki reference, which _postdates_ my first posting of today. I disagree with that conveniently changed reference of this afternoon, anyway. The Server Side coding of http://1stlight.org/design/ascii.asp, specifically intructs any Windows NT 4, Windows 2000, or subsequent Microsoft server, to display the ascii symbols for all n from 1 to 255, in sequence. Neither the server nor any known browser has any difficulty in doing so. This works with both server side and client side scripting, and has done so since the last century. > History aside, use of 8-bit characters breaks one > of the most common newsreaders. Which one? 
> It may be MS's fault I doubt that. > but that's where we are. I doubt that too. As suggested by Edward Green, this seems to be a bug introduced by the sci.astro.research moderator's software/interface. As I have already pointed out, 8 bit info works fine at sci.physics.research, and in every other usenet group tried. Chalky
|
| | |
Date: 12 Nov 2006 16:25:15
From: George Dishman
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163345440.980996.198470@m7g2000cwm.googlegroups.com... > > George Dishman wrote: > >> [note: hand indented because the special characters >> included by "Chalky" forced the use of "quoted >> printable" which prevents Outlook Express handling >> the ident automatically.] > > OK, but what does this mean in terms of displayed information, since I > don't really understand this comment? It means the " > " which prefixes each quoted line doesn't get put in by the newsreader, I had to edit it into each line I quoted myself. That's why I trimmed most of your post. >> "Chalky" <chalkyspam@bleachboys.co.uk> wrote in message >> news:1163327103.618298.79130@e3g2000cwe.googlegroups.com... >> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic >> > codes, to allow for 256 characters. >> >> http://en.wikipedia.org/wiki/ASCII > > WOW. That text has CERTAINLY been changed since this morning. What I > quoted above was copied and pasted _directly_ from the first sentence > of that reference, under the sub heading "History", at ~10AM BST today! I suspect you can find some history of editing of the page on Wiki that would tell you who changed it but it was like that when I went there. BTW, there are some duplicated topics on Wiki so it might be worth checking your browser history to be sure you got the same page. >> "ASCII was first published as a standard in 1967 and >> was last updated in 1986. It currently defines codes >> for 128 characters. 33 are non-printing, mostly >> obsolete control characters that affect how text is >> processed, and the other 95 printable characters are >> as follows (starting with the space character):" >> >> and later >> >> "ASCII is, strictly, a seven-bit code, meaning that it >> uses the bit patterns representable with seven binary >> digits (a range of 0 to 127 decimal) to represent >> character information." > > Yes. 
That is copied from the CHANGED Wiki reference, which _postdates_ > my first posting of today. > > I disagree with that conveniently changed reference of this afternoon, > anyway. I know what it says now is correct, I've been familiar with the coding for decades (one of my early jobs required converting between 7-bit and 5-bit). > The Server Side coding of http://1stlight.org/design/ascii.asp, > specifically intructs any Windows NT 4, Windows 2000, or subsequent > Microsoft server, to display the ascii symbols for all n from 1 to 255, > in sequence. Neither the server nor any known browser has any > difficulty in doing so. This works with both server side and client > side scripting, and has done so since the last century. Again, MS seldom restricts itself to standards. >> History aside, use of 8-bit characters breaks one >> of the most common newsreaders. > > Which one? Outlook Express as I said above. >> It may be MS's fault > > I doubt that. > >> but that's where we are. > > I doubt that too. As suggested by Edward Green, this seems to be a bug > introduced by the sci.astro.research moderator's software/interface. As > I have already pointed out, 8 bit info works fine at > sci.physics.research, and in every other usenet group tried. It depends on the characters used. I have had this problem earlier this year on posts in sci.astro which is unmoderated. The first message you sent had this in the headers: Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Trace: posting.google.com 1163327108 1505 127.0.0.1 (12 Nov 2006 10:25:08 GMT) The post I am replying to now has no problems and these are the headers: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-Trace: posting.google.com 1163345446 13704 127.0.0.1 (12 Nov 2006 15:30:46 GMT) Google automatically switched to non-ASCII because you included the special characters. George
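The effect of the "Content-Transfer-Encoding: quoted-printable" header quoted above can be sketched in Python with the standard `quopri` module. This is an illustration only: the sample bodies are taken from strings that appear in this thread, and the variable names are made up for the example.

```python
import quopri

# Body text as it looks on the wire when sent as quoted-printable with
# charset iso-8859-1: each 8-bit byte becomes "=" plus two hex digits.
raw_symbols = b"=B0 =B1 =A9"
raw_equation = b"MSB=3D0"            # a literal "=" is itself sent as "=3D"
raw_wrapped = b"telegraph=\nic"      # "=" at end of line is a soft line break

# A newsreader that honours the headers decodes the transfer encoding
# first, then interprets the bytes using the declared charset.
symbols = quopri.decodestring(raw_symbols).decode("iso-8859-1")
equation = quopri.decodestring(raw_equation).decode("iso-8859-1")
wrapped = quopri.decodestring(raw_wrapped).decode("iso-8859-1")

print(symbols)   # ° ± ©
print(equation)  # MSB=0
print(wrapped)   # telegraphic
```

A reader or gateway that skips the decoding step shows the raw "=B0", "=3D" and split words instead, which is the kind of garbling seen in some of the quoted text in this thread.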
|
| |
Date: 12 Nov 2006 06:52:07
From: Chalky
Subject: Re: Restricted ASCII?
|
Edward Green wrote: > Chalky wrote: > > > Edward Green wrote: > > > > I read everything at Google, the new AOL to some, and your symbols look > > > fine in any group. I don't know what's going on behind the scenes, > > > but I don't think a "Usenet group" is the logical entity you think it > > > is. You have the messages, and you have the formatting, which is left > > > up to the Newsreader. > > > > Not true. The top bit is erased at sci.astro.research. > > Can you reference a particular post? Maybe there is something funny > going on in the moderation step. Sure. I can do better than that. The following was the sci.astro.research moderator's response to my own (unaccepted) posting: I am sorry to inform you that your post has been found unsuitable for posting to sci.astro.research, for the following reason(s): * insufficiently relevant to research in astronomy or astrophysics. Either your message is completely off-topic for this forum, in which case please submit it to a more appropriate group; or it has insufficient content related to research to allow it to be posted under the sci.astro.research charter, in which case it may be better to post it in sci.astro or one of the other unmoderated groups in the sci.astro hierarchy. 
Moderator, sci.astro.research [This discussion is off-topic for s.a.r., but you might want to refer to http://en.wikipedia.org/wiki/ASCII -- mjh] ------------ Text of your message: --------------------- >From martinh@chiark.greenend.org.uk Thu Nov 09 16:40:59 2006 Return-path: <martinh@chiark.greenend.org.uk > X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on hercules.herts.ac.uk X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00, UNPARSEABLE_RELAY autolearn=ham version=3.1.3 Envelope-to: mjh@localhost Delivery-date: Thu, 09 Nov 2006 16:40:59 +0000 Received: from localhost ([127.0.0.1] ident=mjh) by hercules.herts.ac.uk with esmtp (Exim 3.36 #1 (Debian)) id 1GiCxX-00084E-00 for <mjh@localhost >; Thu, 09 Nov 2006 16:40:59 +0000 Received: from tucana.herts.ac.uk [147.197.215.113] by localhost with IMAP (fetchmail-6.2.5) for mjh@localhost (single-drop); Thu, 09 Nov 2006 16:40:59 +0000 (GMT) Received: from corvus.herts.ac.uk ([147.197.215.112] helo=corvus) by tucana.herts.ac.uk with esmtp (Exim 4.44) id 1GiCxB-0001Lm-82 for m.j.hardcastle@herts.ac.uk; Thu, 09 Nov 2006 16:40:37 +0000 Received: from [193.201.200.170] (helo=chiark.greenend.org.uk) by corvus with smtp (Exim 4.40) id 1GiCx9-0006eG-3h for mjh@star.herts.ac.uk; Thu, 09 Nov 2006 16:40:35 +0000 Received: from [193.4.58.12] (helo=horus.isnic.is ident=root) by chiark.greenend.org.uk (Debian Exim 3.36 #1) with esmtp (return-path news@google.com) id 1GiCwy-0000OO-00 for sci.astro.research@slimy.greenend.org.uk; Thu, 09 Nov 2006 16:40:24 +0000 Received: from proxy.google.com (proxy.google.com [66.102.7.4]) by horus.isnic.is (8.12.9p2/8.12.9/isnic) with ESMTP id kA9GeMUx027497 for <sci-astro-research@moderators.isc.org >; Thu, 9 Nov 2006 16:40:22 GMT (envelope-from news@google.com) Received: from G081002 by proxy.google.com with ESMTP id kA9GeLHu015417 for <sci-astro-research@moderators.isc.org >; Thu, 9 Nov 2006 08:40:21 -0800 Received: (from news@localhost) by Google Production 
with id kA9GeKTj031064 for sci-astro-research@moderators.isc.org; Thu, 9 Nov 2006 08:40:20 -0800 To: sci-astro-research@moderators.isc.org Path: m7g2000cwm.googlegroups.com!not-for-mail From: "Chalky" <chalkyspam@bleachboys.co.uk > Newsgroups: sci.astro.research Subject: Re: A Revised Planck Scale? Date: 9 Nov 2006 08:40:17 -0800 Organization: http://groups.google.com Lines: 39 Message-ID: <1163090417.072908.314350@m7g2000cwm.googlegroups.com > References: <mt2.0-21097-1162632715@hercules.herts.ac.uk > <mt2.0-21568-1162989572@hercules.herts.ac.uk > NNTP-Posting-Host: 195.92.67.65 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-Trace: posting.google.com 1163090420 31043 127.0.0.1 (9 Nov 2006 16:40:20 GMT) X-Complaints-To: groups-abuse@google.com NNTP-Posting-Date: Thu, 9 Nov 2006 16:40:20 +0000 (UTC) In-Reply-To: <mt2.0-21568-1162989572@hercules.herts.ac.uk > User-Agent: G2/1.0 X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0),gzip(gfe),gzip(gfe) X-HTTP-Via: 1.1 webcacheH01 (NetCache NetApp/5.5R3D3) Complaints-To: groups-abuse@google.com Injection-Info: m7g2000cwm.googlegroups.com; posting-host=195.92.67.65; posting-account=oMPGkg0AAAB-lceMS5dlyP2BwpYen6gq X-C-UH-MailScanner: No Virus detected X-UH-MailScanner-From: martinh@chiark.greenend.org.uk X-UH-MailScanner-Information: UH-mail X-UH-MailScanner: No Virus detected Oh No wrote: > Thus spake Oh No <NotI@charlesfrancis.wanadoo.co.uk> > >I should just like to add that the Schwarzschild radius of the proton > >is not something which appears in standard physical models, the reason > >being that a classical massive point particle is not a consistent idea > >in general relativity. 
In fact a proton must be treated quantum > >mechanically, and we do not have an accepted theory on that, but if the > >Schwarzschild radius of the proton were considered then it would have a > >magnitude given by > > > > 2Gm/c^3 =3D 8.28 x 10 e^-63 m > > > >Planck length also has a formal definition > > > > l_p =3D sqrt(hbar*G/c^3) =3D 1.61605e-35 =B1 1.0e-39 m > > > >Neither of these figures is open to revision beyond that allowed by > >experimental margins of error. If you are defining other quantities, you > >should give them other names. > > > With apologies, I copy pasted those figures from another source. The > equations looked all right when I posted, but obviously they did not > contain pure ASCII You are wrong. At least the latter was pure ascii. Many useful caracters are part of the pure ascii set that used to be accepted in at least some of these newsgroups. The classic example is the correct o in Schrodinger, recently featured (correctly) in a title at sci.physics research. I think it is unfortunate that we now appear to be restricted to that subset of ascii which is represented by a single keypress on a standard British or American keyboard. Chalky ----- End forwarded message ----- > > These postings > > were in response to a suggestion from the moderator there, that this > > should be discussed here. > > But where is "here"? sci.physics, sci.physics.relativity, sci.astro, sci.astro.amateur, sci.astro.seti (Sci.astro.research moderator's recommendation just being sci.astro) Chalky
|
| |
Date: 12 Nov 2006 06:36:55
From: Chalky
Subject: Re: Restricted ASCII?
|
Chalky wrote: > Edward Green wrote: > > > Chalky wrote: > > > > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > > codes, to allow for 256 characters. > > > > Neat. Another discussion of computer archaics. And I was afraid > > outlets for procrastination were closed! > > > > > However, I have noticed that this set is truncated to 7 characters at > > > sci.astro.research to conform to its first commercial use as a > > > seven-bit teleprinter code (1963). > > > > I always thought of them as ASCII and extended-ASCII, or possibly ANSI? > > > > > This can result in considerable garbling of 8-bit ASCII text. > > > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > > correctly at sci.physics.research, I am curious to discover how endemic > > > this restriction of the ASCII set actually still is, in the Usenet > > > groups. > > > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > > -, and for copyright, below > > > > > > ° ± © > > > > I read everything at Google, the new AOL to some, and your symbols look > > fine in any group. I don't know what's going on behind the scenes, > > but I don't think a "Usenet group" is the logical entity you think it > > is. You have the messages, and you have the formatting, which is left > > up to the Newsreader. > > Not true. The top bit is erased at sci.astro.research. To clarify further, by "the top bit" I mean the _Most Significant Bit_. In computers that are more advanced than 8 bit microprocessors (circa early '70s), the most significant bit of the computer word is typically employed for the most important variable. For data, this typically means + or -. 
Similarly, for the instruction set (eg in the General Instruments CP 1600 microprocessor [circa mid-late '70s]), this typically was used to signal a switch between internal (MSB=0) and external (MSB=1) data manipulations [albeit still, in that example, only then employing the MSB of a 12 bit instruction word] When we come back down to 8 bit data words, then the MSB is still used to switch from the restricted (1963) set of American Bell teleprinter code characters (MSB=0), and the extended set (MSB=1), as originally intended when ascii was proposed and defined as an 8 bit code. Chalky
|
| | |
Date: 12 Nov 2006 16:00:37
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163342215.613268.5060@m73g2000cwd.googlegroups.com... Chalky wrote: > Edward Green wrote: > > > Chalky wrote: > > > > > ASCII is defined in wiki as an 8-bit system, developed from > > > telegraphic > > > codes, to allow for 256 characters. > > > > Neat. Another discussion of computer archaics. And I was afraid > > outlets for procrastination were closed! > > > > > However, I have noticed that this set is truncated to 7 characters at > > > sci.astro.research to conform to its first commercial use as a > > > seven-bit teleprinter code (1963). > > > > I always thought of them as ASCII and extended-ASCII, or possibly ANSI? > > > > > This can result in considerable garbling of 8-bit ASCII text. > > > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > > correctly at sci.physics.research, I am curious to discover how > > > endemic > > > this restriction of the ASCII set actually still is, in the Usenet > > > groups. > > > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > > -, and for copyright, below > > > > > > ° ± © > > > > I read everything at Google, the new AOL to some, and your symbols look > > fine in any group. I don't know what's going on behind the scenes, > > but I don't think a "Usenet group" is the logical entity you think it > > is. You have the messages, and you have the formatting, which is left > > up to the Newsreader. > > Not true. The top bit is erased at sci.astro.research. To clarify further, by "the top bit" I mean the _Most Significant Bit_. In computers that are more advanced than 8 bit microprocessors (circa early '70s), the most significant bit of the computer word is typically employed for the most important variable. For data, this typically means + or -. 
Similarly, for the instruction set (eg in the General Instruments CP 1600 microprocessor [circa mid-late '70s]), this typically was used to signal a switch between internal (MSB=0) and external (MSB=1) data manipulations [albeit still, in that example, only then employing the MSB of a 12 bit instruction word] When we come back down to 8 bit data words, then the MSB is still used to switch from the restricted (1963) set of American Bell teleprinter code characters (MSB=0), and the extended set (MSB=1), as originally intended when ascii was proposed and defined as an 8 bit code. Chalky I used to own a 110 baud teleprinter. Being mechanical the MSB was ignored, but not only that, lower case was ignored also. It was essentially 6-bit. Man, that used to clatter, but it worked with a drop of oil. Then, glory be, a TV interface. Full 8-bit prom for the character set, 2K ram for the entire screen. You could do a lot with a 4 MHz Zilog Z80 and a cassette recorder for mass storage. I see Woolworth are selling Chinese B/W 5" screen TVs for £10 now. Androcles
|
| |
Date: 12 Nov 2006 14:51:08
From: George Dishman
Subject: Re: Restricted ASCII?
|
[note: hand indented because the special characters included by "Chalky" forced the use of "quoted printable" which prevents Outlook Express handling the ident automatically.] "Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163327103.618298.79130@e3g2000cwe.googlegroups.com... > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. http://en.wikipedia.org/wiki/ASCII "ASCII was first published as a standard in 1967 and was last updated in 1986. It currently defines codes for 128 characters. 33 are non-printing, mostly obsolete control characters that affect how text is processed, and the other 95 printable characters are as follows (starting with the space character):" and later "ASCII is, strictly, a seven-bit code, meaning that it uses the bit patterns representable with seven binary digits (a range of 0 to 127 decimal) to represent character information." History aside, use of 8-bit characters breaks one of the most common newsreaders. It may be MS's fault but that's where we are. Chalky
|
| | |
Date: 12 Nov 2006 16:00:37
From: Sorcerer
Subject: Re: Restricted ASCII?
|
Ok, thanks. Maybe it'll be fixed some day. "George Dishman" <george@briar.demon.co.uk > wrote in message news:ej7bfu$2td$1@news.freedom2surf.net...
|
| |
Date: 12 Nov 2006 05:57:47
From: Edward Green
Subject: Re: Restricted ASCII?
|
Chalky wrote: > Edward Green wrote: > > I read everything at Google, the new AOL to some, and your symbols look > > fine in any group. I don't know what's going on behind the scenes, > > but I don't think a "Usenet group" is the logical entity you think it > > is. You have the messages, and you have the formatting, which is left > > up to the Newsreader. > > Not true. The top bit is erased at sci.astro.research. Can you reference a particular post? Maybe there is something funny going on in the moderation step. > These postings > were in response to a suggestion from the moderator there, that this > should be discussed here. But where is "here"?
|
| |
Date: 12 Nov 2006 05:57:15
From: Chalky
Subject: Re: Restricted ASCII?
|
Sorcerer wrote: > Returned from sci.physics.relativity, absent auto-indent. Sorcerer wrote: > Returned from sci.physics, also absent auto-indent. > Androcles Sorry, could you explain what you mean by this? As far as I am aware, auto-indent is not an ascii code. As far as I am aware, I did not employ an auto-indent in these postings, anyway. C
|
| | |
Date: 12 Nov 2006 15:21:08
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163339834.991566.50860@i42g2000cwa.googlegroups.com...
|
| |
Date: 12 Nov 2006 05:51:18
From: Chalky
Subject: Re: Restricted ASCII?
|
Edward Green wrote: > Chalky wrote: > > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > Neat. Another discussion of computer archaics. And I was afraid > outlets for procrastination were closed! > > > However, I have noticed that this set is truncated to 7 characters at > > sci.astro.research to conform to its first commercial use as a > > seven-bit teleprinter code (1963). > > I always thought of them as ASCII and extended-ASCII, or possibly ANSI? > > > This can result in considerable garbling of 8-bit ASCII text. > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > correctly at sci.physics.research, I am curious to discover how endemic > > this restriction of the ASCII set actually still is, in the Usenet > > groups. > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > -, and for copyright, below > > > > ° ± © > > I read everything at Google, the new AOL to some, and your symbols look > fine in any group. I don't know what's going on behind the scenes, > but I don't think a "Usenet group" is the logical entity you think it > is. You have the messages, and you have the formatting, which is left > up to the Newsreader. Not true. The top bit is erased at sci.astro.research. These postings were in response to a suggestion from the moderator there, that this should be discussed here. > Lets set some rational follow-ups, shall we? Yes please Chalky
|
| | |
Date: 12 Nov 2006 07:56:57
From: Starlord
Subject: Re: Restricted ASCII?
|
We are just fine in using the plain text in S.A.A. -- The Lone Sidewalk Astronomer of Rosamond Telescope Buyers FAQ http://home.inreach.com/starlord Sidewalk Astronomy www.sidewalkastronomy.info The Church of Eternity http://home.inreach.com/starlord/church/Eternity.html "Chalky" <chalkyspam@bleachboys.co.uk > wrote garbage
|
| | |
Date: 12 Nov 2006 15:04:41
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163339478.048813.247840@b28g2000cwb.googlegroups.com... Edward Green wrote: > Chalky wrote: > > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > Neat. Another discussion of computer archaics. And I was afraid > outlets for procrastination were closed! > > > However, I have noticed that this set is truncated to 7 characters at > > sci.astro.research to conform to its first commercial use as a > > seven-bit teleprinter code (1963). > > I always thought of them as ASCII and extended-ASCII, or possibly ANSI? > > > This can result in considerable garbling of 8-bit ASCII text. > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > correctly at sci.physics.research, I am curious to discover how endemic > > this restriction of the ASCII set actually still is, in the Usenet > > groups. > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > -, and for copyright, below > > > > ° ± © > > I read everything at Google, the new AOL to some, and your symbols look > fine in any group. I don't know what's going on behind the scenes, > but I don't think a "Usenet group" is the logical entity you think it > is. You have the messages, and you have the formatting, which is left > up to the Newsreader. Not true. The top bit is erased at sci.astro.research. These postings were in response to a suggestion from the moderator there, that this should be discussed here. > Lets set some rational follow-ups, shall we? Yes please Chalky At least I now know why I have no indents auto-supplied by Outlook Express when I hit "Reply". Good experiment, Chalky.
|
| |
Date: 12 Nov 2006 05:46:50
From: Edward Green
Subject: Re: Restricted ASCII?
|
Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. Neat. Another discussion of computer archaics. And I was afraid outlets for procrastination were closed! > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). I always thought of them as ASCII and extended-ASCII, or possibly ANSI? > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below > > ° ± © I read everything at Google, the new AOL to some, and your symbols look fine in any group. I don't know what's going on behind the scenes, but I don't think a "Usenet group" is the logical entity you think it is. You have the messages, and you have the formatting, which is left up to the Newsreader. Lets set some rational follow-ups, shall we?
|
| | |
Date: 12 Nov 2006 15:01:00
From: Sorcerer
Subject: Re: Restricted ASCII?
|
He's testing, and you are whining about follow ups. You are right, you don't know what's going on behind the scenes, you clueless MORON, Green! "Edward Green" <spamspamspam3@netzero.com > wrote in message news:1163339210.186352.264010@h48g2000cwc.googlegroups.com... Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. Neat. Another discussion of computer archaics. And I was afraid outlets for procrastination were closed! > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). I always thought of them as ASCII and extended-ASCII, or possibly ANSI? > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below > > ° ± © I read everything at Google, the new AOL to some, and your symbols look fine in any group. I don't know what's going on behind the scenes, but I don't think a "Usenet group" is the logical entity you think it is. You have the messages, and you have the formatting, which is left up to the Newsreader. Lets set some rational follow-ups, shall we?
|
| |
Date: 12 Nov 2006 05:41:30
From: Chalky
Subject: Re: Restricted ASCII?
|
Chalky wrote: > Chalky wrote: > > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > > > However, I have noticed that this set is truncated to 7 characters at > > sci.astro.research to conform to its first commercial use as a > > seven-bit teleprinter code (1963). > > > > This can result in considerable garbling of 8-bit ASCII text. > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > correctly at sci.physics.research, I am curious to discover how endemic > > this restriction of the ASCII set actually still is, in the Usenet > > groups. > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > -, and for copyright, below > > > > ° ± © > > > > Chalky > > Interesting. All groups display correctly except > sci.physics.relativity, which still displayed 8 bits, but translated > into a more old-fashioned font. > > Looks like sci.astro.research is the only newsgroup actually restricted > to 7 bits. > > I wonder why that is? > > Chalky I have since noticed that the wiki reference and all references therefrom are complete rubbish in all other respects, since they all still restrict the displayed characters to the least significant 7 bits of ASCII (i.e. restriction to Bell teleprinter code, circa 1963). This erases all unique characteristics of Scandinavian (and Germanic) languages, all unique characteristics of Latin languages (such as French & Spanish), and all currencies other than the Yankee Dollar. (Thus excluding the British Pound Sterling, the Euro, and the Japanese Yen [to name a few important examples], as well as precluding the use of any more advanced scientific notation.) So, this is (probably) goodbye from me to sci.astro.research. (I can't cope with this more-than-40-year-out-of-date ascii restriction. 
[Or, as Captain Beefheart said more eloquently, "I cry, but I can't buy your Veterans' Day Poppy."]) In view of this apparent dearth of up-to-date information on ascii on the internet, I am now recommending (to the relevant management) that the intRAnet version of the file http://1stlight.org/design/ascii.asp, should now be included on the intERnet version of that site, too. Chalky
|
| |
Date: 12 Nov 2006 02:46:29
From: Chalky
Subject: Re: Restricted ASCII?
|
Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. > > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). > > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below > > =B0 =B1 =A9 > > Chalky Interesting. All groups display correctly except sci.physics.relativity, which still displayed 8 bits, but translated into a more old-fashioned font. Looks like sci.astro.research is the only newsgroup actually restricted to 7 bits. I wonder why that is? Chalky
|
| | |
Date: 12 Nov 2006 14:34:16
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <1163328389.468587.34850@h48g2000cwc.googlegroups.com>, chalkyspam@bleachboys.co.uk wrote:
> Chalky wrote:
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic
> > codes, to allow for 256 characters.

That wiki is wrong. It is using a common marketing/popular computing misuse of the term. Which wiki was it and which article? The current edit of the English version of Wikipedia seems to correctly define it as seven bit: <http://en.wikipedia.org/w/index.php?title=ASCII&oldid=87321585>.

ASCII is the US variant of ISO 646 (I think it is the same as the reference variant), which is a seven bit code. As indicated by the Content-Type header, your article is not in ASCII:

> Content-Type: text/plain; charset="iso-8859-1"
                                     ^^^^^^^^^^

Which indicates that it is in the eight bit code ISO 8859/1, which contains ISO 646 (reference variant) as a subset. You were not posting in ASCII.

One of your articles was, however, in "just send eight" format, which, while it generally passes through Usenet OK, because most of Usenet is 8 bit clean, leaves the receiving newsreader to guess what character set was actually intended, and is therefore invalid. In Chinese speaking parts of the world, "just send eight" content is likely to be in GB2312 or Big 5, not ISO 8859/1. Even in the West, it is quite likely to be Windows-1252, a variant of ISO 8859/1 in which 32 control characters are replaced by extra graphics, but it could be the Mac version, instead.

> > This can result in considerable garbling of 8-bit ASCII text.

There is no such thing as 8 bit ASCII text.

> > Since I am pretty sure I have seen Schrodinger's equation spelled
> > correctly at sci.physics.research, I am curious to discover how endemic
> > this restriction of the ASCII set actually still is, in the Usenet
> > groups.

For all except moderated newsgroups, and I don't think your test post would have been passed by a moderator, this is purely a function of the newsreaders or other user agents used. For a cross posted article, only one copy is ever transmitted, so any corruption would apply to all newsgroups. (In your case, the user agent seems to be the Google Groups NNTP to HTML/HTTP gateway.)

> > Towards this end, I have pasted in the 8 bit ASCII for degrees, for +
> > -, and for copyright, below
> >
> > =B0 =B1 =A9

Actually, as well as using ISO 8859/1, you used quoted printable, and therefore actually only sent 7 bits in all except the "just send eight" case.

> > Chalky
> Interesting. All groups display correctly except
> sci.physics.relativity, which still displayed 8 bits, but translated
> into a more old-fashioned font.

Fonts are purely a user agent issue. As I said, unless you post to a moderated group, in which case it is the moderator's computer system that determines what happens to things other than pure ASCII, this is an issue with your newsreader, because the same copy of the article is used for all unmoderated groups in a cross-posting.

> Looks like sci.astro.research is the only newsgroup actually restricted
> to 7 bits.

It looks like sci.astro.research *is* moderated ("m" flag):

200 news.demon.co.uk InterNetNews NNRP server INN 2.4.1 ready (posting ok).
list active sci.astro.research
215 Newsgroups in form "group high low flags".
sci.astro.research 0000004703 0000003871 m
.

Maybe it is auto-moderated, which is why the test got through. Your problem is with the email system used by the moderator. It seems to be resolving the quoted printable coding to 8 bit, but then losing the MIME information when re-submitting the approved version.

It's generally best to restrict yourself to the proper definition of ASCII unless you are in a restricted community, typically a language community, or the material can't be satisfactorily represented in ASCII.
For maths, there is only a narrow band in which this is valid, as one soon reaches a point where one needs to use TeX, troff's eqn, or MathML, which would normally be treated as binaries on Usenet, so are best placed on a web site.
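
A minimal sketch of the quoted-printable point made above, assuming Python 3 and its standard `quopri` module (neither is mentioned in the thread). It illustrates that the text `=B0 =B1 =A9` is pure 7-bit data on the wire, and that it only becomes degree, plus-minus and copyright once a reader undoes the transfer encoding and applies the charset declared in Content-Type.

```python
import quopri

# What actually travels over Usenet for the three test symbols.
wire_text = b"=B0 =B1 =A9"

# Every byte on the wire is below 128, so the backbone sees only ASCII.
all_seven_bit = all(b < 128 for b in wire_text)

# Undo the quoted-printable transfer encoding to recover the raw bytes.
raw = quopri.decodestring(wire_text)

# Interpret the raw bytes using the charset named in the Content-Type header.
text = raw.decode("iso-8859-1")

print(all_seven_bit)  # whether the wire form is 7-bit clean
print(raw)            # the three 8-bit bytes, separated by spaces
print(text)           # the symbols as the reader should display them
```

If the charset label is lost along the way, as seems to have happened in the moderation step, a reader is left with `raw` and no reliable way to choose the `decode` argument.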
|
| | |
Date: 12 Nov 2006 13:32:03
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163328389.468587.34850@h48g2000cwc.googlegroups.com... Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. > > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). > > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below > > ° ± © > > Chalky Interesting. All groups display correctly except sci.physics.relativity, which still displayed 8 bits, but translated into a more old-fashioned font. Looks like sci.astro.research is the only newsgroup actually restricted to 7 bits. I wonder why that is? Chalky Returned from sci.physics, also absent auto-indent. Androcles
|
| | |
Date: 12 Nov 2006 13:26:14
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163328389.468587.34850@h48g2000cwc.googlegroups.com... Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. > > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). > > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below > > ° ± © > > Chalky Interesting. All groups display correctly except sci.physics.relativity, which still displayed 8 bits, but translated into a more old-fashioned font. Looks like sci.astro.research is the only newsgroup actually restricted to 7 bits. I wonder why that is? Chalky Returned from sci.physics.relativity, absent auto-indent.
|
| | | |
Date: 12 Nov 2006 13:58:06
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Sorcerer" <Headmaster@hogwarts.physics_e > wrote in message news:WhF5h.157081$lT5.7614@fe2.news.blueyonder.co.uk...
|
| | |
Date: 12 Nov 2006 20:03:11
From: Pat O'Connell
Subject: Re: Restricted ASCII?
|
Chalky wrote: > Chalky wrote: > >> ASCII is defined in wiki as an 8-bit system, developed from telegraphic >> codes, to allow for 256 characters. >> >> However, I have noticed that this set is truncated to 7 characters at >> sci.astro.research to conform to its first commercial use as a >> seven-bit teleprinter code (1963). >> >> This can result in considerable garbling of 8-bit ASCII text. >> >> Since I am pretty sure I have seen Schrodinger's equation spelled >> correctly at sci.physics.research, I am curious to discover how endemic >> this restriction of the ASCII set actually still is, in the Usenet >> groups. >> >> Towards this end, I have pasted in the 8 bit ASCII for degrees, for + >> -, and for copyright, below >> >> ° ± © >> >> Chalky > > Interesting. All groups display correctly except > sci.physics.relativity, which still displayed 8 bits, but translated > into a more old-fashioned font. > > Looks like sci.astro.research is the only newsgroup actually restricted > to 7 bits. Only characters 0 through 127 have been standardized as part of ASCII. Each computer operating system (for instance DOS, Windows, Mac, and VMS) displays its own symbol set for 128 through 255. -- Pat O'Connell [note munged EMail address] Take nothing but pictures, Leave nothing but footprints, Kill nothing but vandals...
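
A short sketch of the point about codes 128 through 255, assuming Python 3 and four of its built-in codecs as stand-ins for the platforms named above (the choice of codecs is an assumption, not something stated in the thread). The same byte value, 176, is decoded under each one, while an ASCII byte gives the same letter everywhere.

```python
# Codec names are Python's; the mapping to platforms is illustrative only.
CODECS = ("latin-1", "cp1252", "cp437", "mac_roman")

high_byte = bytes([0xB0])  # decimal 176, outside the ASCII range
low_byte = bytes([0x41])   # decimal 65, inside the ASCII range

decoded_high = {codec: high_byte.decode(codec) for codec in CODECS}
decoded_low = {codec: low_byte.decode(codec) for codec in CODECS}

for codec in CODECS:
    # Show the Unicode code point so the result is readable in any terminal.
    print(codec, hex(ord(decoded_high[codec])), decoded_low[codec])
```

ISO 8859/1 and Windows-1252 agree on this byte (the degree sign), whereas the DOS code page 437 and Mac Roman tables assign it a shading block and an infinity sign respectively.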
|
| | | |
Date: 13 Nov 2006 15:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
An interesting history of character codes, from Morse Codes through Baudot Code to ASCII-1967 can be found here: http://www.wps.com/projects/codes/ The author says, on that page: # ASCII is and always was a seven bit code. I am shocked at the number of # people and sources that claim it to be an 8-bit code. There are only 128 # character codes in ASCII. # Many of the extentions to ASCII are 8 bits, but they are not ASCII. -- ------ Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
|
| | | |
Date: 14 Nov 2006 00:12:38
From: Ben Rudiak-Gould
Subject: Re: Restricted ASCII?
|
Pat O'Connell wrote: >Each computer operating system (for instance DOS, Windows, Mac, and VMS) >displays its own symbol set for 128 through 255. Every *localized version* of every OS has a *default* character encoding that many tools use in the absence of other encoding information. The local default makes no sense (on either end) for documents that are being sent between computers, like email and Usenet messages and HTML pages. Therefore you should always specify the encoding of such a message explicitly within the message itself, which makes the local default irrelevant. -- Ben
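
One way to follow the advice above, sketched with Python 3's standard `email` package (an assumption; the thread does not name any tool). The charset and a 7-bit-safe transfer encoding are stated in the message itself, so the reader's local default never comes into play.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Restricted ASCII?"

# Degree, plus-minus and copyright, written as escapes to keep this file ASCII.
body = "\u00b0 \u00b1 \u00a9"

# Declare the charset explicitly and ask for quoted-printable on the wire.
msg.set_content(body, charset="iso-8859-1", cte="quoted-printable")

print(msg["Content-Type"])               # carries the charset parameter
print(msg["Content-Transfer-Encoding"])  # names the transfer encoding
print(msg.get_payload())                 # the 7-bit form that is transmitted
```

A receiving program can then recover the text with `msg.get_content()`, which reads both headers rather than guessing.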
|
| |
Date: 13 Nov 2006 05:25:52
From: Matt Giwer
Subject: Re: Restricted ASCII?
|
Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. > > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). > > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below It has been so long I doubt I remember enough to do this subject the injustice it deserves. First there were others for different languages such as UKSCII whose main difference was the pound symbol instead of the $. This # is an octopule not a pound sign. Those are just two of the SCIIs that were around. Anyone who wants to tell me I am full of it and have it wrong please feel free. It has been over 20 years since I read it. The it was a Ma Bell handbook on the oxymoronic RS-232 standard. It was mostly a recounting of the different ways the "standard" was implemented. It started as a standard (Ma Bell rented what was used on its lines) with 26 uppercase, 10 numbers, punctuation, printable characters like / () $ and control characters, line feed, page feed, tab and such for teletypes. Those are those huge chuncka-chuncka machines that some of us were lucky enough to have to learn programming on instead of punchcards. At that time it was six bits plus one parity bit, 64 total possibilities for seven bits total. For teletypes the writer created a paper tape version of his article which was then transmitted over the expensive long distance line to save costs. That is why the old movies show them printing so fast. 

Also it was adapted to paper tape programming to replace punchcards, which started out controlling looms. When teletypes were replaced with machines that could do lower case, it was extended to seven bits plus parity, and other printable characters were defined, such as []{}, but a single standard for all the 128 possible codes never did develop. That is likely because it was not around long enough for a single manufacturer to dominate the market, and Ma Bell was gone by then. In the early days of PCs there was a short time with 6+1 bits, uppercase only, followed by 7+1 bits. With the development of home modems, and doing it affordably, it was not possible for many years to drop the parity bit. Also, relaying data through some machines (IBM mainframes mostly) required limiting the data to 6+1 for the lack of standardization of the full 128-character charset, and some of their legacy mainframes were never upgraded to 128. As the gods would have it, phone lines got better and modems included error correcting code and had the full 8+2 bits for error correction. In any event the computer no longer needed to deal with the parity issue and the full 8 bits could be used for display characters. Given the lack of standardization of all the extras in the lower 128, it is not surprising the upper 128 were wildcards. I think it was the Apple ][ upgrade chipset, or maybe the ill-conceived III, which first used them for international characters. Atari simply made them the inverse of their choices for the lower 128. This was before displays went graphic as in Windows and MS was on DOS 7. As to what all is out there today, you tell me. From what I see occasionally, UNICODE is not much of a match for the snail as far as progress goes. I can see all kinds of reasons for that given all the "alphabets" around. It is worth looking into if you want to see why progress is slow. For example, some Arabic alphabets have parts of letters "underlining" other letters.
Another font I saw requires the ability to embed letters within other letters like putting a lowercase e inside an uppercase L. -- Hodie pridie Idus Octobres MMVI est -- The Ferric Webceasar nizkor http://www.giwersworld.org/nizkook/nizkook.phtml Iraqi democracy http://www.giwersworld.org/911/armless.phtml a3
|
| | |
Date: 13 Nov 2006 01:10:12
From: Paul
Subject: Re: Restricted ASCII?
|
Matt Giwer wrote: > Chalky wrote: > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > > > However, I have noticed that this set is truncated to 7 characters at > > sci.astro.research to conform to its first commercial use as a > > seven-bit teleprinter code (1963). > > > > This can result in considerable garbling of 8-bit ASCII text. > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > correctly at sci.physics.research, I am curious to discover how endemic > > this restriction of the ASCII set actually still is, in the Usenet > > groups. > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > -, and for copyright, below > > It has been so long I doubt I remember enough to do this subject the injustice > it deserves. Very funny! Matt I love the way you write. Yermiah
|
| |
Date: 12 Nov 2006 21:46:06
From: Chalky
Subject: Restricted ASCII? The final test
|
Thanks too to Sorcerer (Androcles), and George Dishman for your collective constructive feedback. It seems that you probably had a secondary display problem but I didn't, because you have Usenet postings e-mailed to you. I just go to the website to read what I am interested in, and, when I respond, I do so via form submission, so there is no e-mail protocol involved, my side. Your resultant problem might have been because I originally pasted in the displayed characters which sprang to life after I had typed in the decimal translation of the machine code for those characters. Consequently, for a final test, I am simply typing in the decimal translations of the machine codes for the Japanese Yen, the registered trade mark, and the Euro, encased in the beautifully symmetric Spanish version of the question mark, using the HTML identifiers &, #, and ; below: ¿ ¥ ® ? Let me know what you see. Chalky
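
A hedged sketch of what numeric character references of this kind resolve to, assuming Python 3's standard `html` module. The exact numbers typed into the form are not shown in the post, so the string below is a hypothetical reconstruction, including the choice of 8364 for the Euro.

```python
import html

# Hypothetical reconstruction of the typed references: inverted question mark,
# yen, registered trade mark, Euro, then a plain question mark.
typed = "&#191; &#165; &#174; &#8364; ?"
rendered = html.unescape(typed)

# Number 128 is not a valid character reference in HTML. Python applies the
# HTML5 error-recovery table here, which maps it to the Euro sign because
# that is where Windows-1252 places it.
legacy_euro = html.unescape("&#128;")

print([hex(ord(ch)) for ch in rendered if ch != " "])
print(hex(ord(legacy_euro)))
```

Only the first three references fall inside ISO 8859/1; the Euro has no code there, which is one possible reason a reader might show a substitute character in its place.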
|
| |
Date: 12 Nov 2006 20:59:20
From: Chalky
Subject: Re: Restricted ASCII?
|
Greg Hennessy wrote: > The facts are you claimed ascii was 8 bit. It isn't. Wiki defines it > as 7 bit. And, of course, Wiki is always infallible, is it? > There is an 8 bit code called EBCDIC, which is mostly used in IBM > mainframes, but that has nothing to do with ASCII. I have no argument with that. So please now explain why, less than 24 hours ago, infallible Wiki connected EBCDIC and ASCII together, via the qualifying phrase, a 8-bit system that would allow for 256 characters, > You were confused. If so, it looks like Wiki was too. See also Wiki ref to US-ASCII, and my reference to the server side instruction Asc(), which definitely returns the results of all 8 bits in the byte. Cheers It has been fun. Chalky
|
| | |
Date: 13 Nov 2006 15:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <1163393960.821344.305840@m73g2000cwd.googlegroups.com >, Chalky <chalkyspam@bleachboys.co.uk > wrote: > >Greg Hennessy wrote: > >> The facts are you claimed ascii was 8 bit. It isn't. Wiki defines it >> as 7 bit. > >And, of course, Wiki is always infallible, is it? Wiki is no more infallible than its authors of course. But he claimed wiki said ASCII was 8-bit, and wiki never said that. Btw wiki is correct in this particular case: ASCII *is* a 7-bit code. >> There is an 8 bit code called EBCDIC, which is mostly used in IBM >> mainframes, but that has nothing to do with ASCII. > >I have no argument with that. So please now explain why, less than 24 >hours ago, infallible Wiki connected EBCDIC and ASCII together, via the >qualifying phrase, a 8-bit system that would allow for 256 characters, EBCDIC predated ASCII, and was of course considered when the 5-bit Baudot code evolved into ASCII. However, early versions of EBCDIC had "holes" in its character table for each byte where both nibbles weren't in the range 0 to 9. Which is natural, since EBCDIC was an evolution of the earlier BCD codes: BCD = Binary Coded Decimal EBCDIC = Extended Binary Coded Decimal Interchange Code ASCII used the bit space more efficiently, using all possible bit combinations. >> You were confused. > >If so, it looks like Wiki was too. No - Wiki never claimed ASCII to be an 8-bit code. Re-read the older version of that page, and re-read it carefully.... >See also Wiki ref to US-ASCII, and my reference to the server side >instruction Asc(), which definitely returns the results of all 8 bits >in the byte. > >Cheers > >It has been fun. > >Chalky > -- ------ Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
|
| |
Date: 12 Nov 2006 20:19:34
From: Chalky
Subject: Re: Restricted ASCII?
|
David Woolley wrote: > In article <1163358882.886919.240610@m7g2000cwm.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > Greg Hennessy wrote: > > > > and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit > > > system that would allow for 256 characters, the". Via reconstruction, the exact Wiki sentence on the morning of my original posting, would have been: "Some time after the [[EBCDIC]] code, a 8-bit system that would allow for 256 characters, the ASCII developed from telegraphic codes and first entered commercial use as a seven-bit teleprinter code promoted by Bell data services in 1963." Given that the subsequenty deleted qualification ", a 8-bit system that would allow for 256 characters, " is linguistically incorrect in its own right, that phrase could equally have referred to ASCII, EBCDIC, or both, within that sentence (as I had originally assumed). Thanks for helping to clear up that point of confusion. Perhaps such confusion could be avoided in the future by adopting the http://en.wikipedia.org/wiki/MIME method of referring to this explicitly restricted 7 bit instruction set as US-ASCII. David Woolley also wrote: > In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > The Server Side coding of http://1stlight.org/design/ascii.asp, > This URL produces a request to "Click Here AFTER inserting security key > into your computer", so is useless as a public reference. Yes, I did say in an earlier posting that this link is only currently accessible on the intRAnet version of that site. (Thanks for confirming that the relevant site security lock still works when that software is instead running on the intERnet copy of the server) > > specifically intructs any Windows NT 4, Windows 2000, or subsequent > > Microsoft server, to display the ascii symbols for all n from 1 to 255, > They are misusing the term ASCII. That's a very common mistake. 
> Probably what it actually does is to display the raw font encoding for > the current font. The relevant employed server side visual basic (ASP) coding is: a=Chr(n) Response.Write Asc(a) (for all n from 1 to 255) Perhaps you should now complain to Microsoft that they have not sawn off the most significant bit of the data byte in their Asc() instruction, to specifically restrict that server side scripting to only handling the US-ASCII subset. (Yes, I already know that not every possible permutation of 1s and 0s in that byte, results in a displayable graphic. This is equally true for the 7 bit subset, as for the 8 bit set.) Cheers It has been fun. Chalky
|
| | |
Date: 13 Nov 2006 00:09:40
From: Rich Townsend
Subject: Re: Restricted ASCII?
|
Chalky wrote: > David Woolley wrote: > >> In article <1163358882.886919.240610@m7g2000cwm.googlegroups.com>, >> chalkyspam@bleachboys.co.uk wrote: >>> Greg Hennessy wrote: >>>> and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit >>>> system that would allow for 256 characters, the". > > Via reconstruction, the exact Wiki sentence on the morning of my > original posting, would have been: "Some time after the [[EBCDIC]] > code, a 8-bit system that would allow for 256 characters, the ASCII > developed from telegraphic codes and first entered commercial use as a > seven-bit teleprinter code promoted by Bell data services in 1963." > > Given that the subsequenty deleted qualification ", a 8-bit system that > would allow for 256 characters, " is linguistically incorrect in its > own right, that phrase could equally have referred to ASCII, EBCDIC, or > both, within that sentence (as I had originally assumed). > > Thanks for helping to clear up that point of confusion. Perhaps such > confusion could be avoided in the future by adopting the > http://en.wikipedia.org/wiki/MIME method of referring to this > explicitly restricted 7 bit instruction set as US-ASCII. > > David Woolley also wrote: > >> In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com>, >> chalkyspam@bleachboys.co.uk wrote: > >>> The Server Side coding of http://1stlight.org/design/ascii.asp, > >> This URL produces a request to "Click Here AFTER inserting security key >> into your computer", so is useless as a public reference. > > Yes, I did say in an earlier posting that this link is only currently > accessible on the intRAnet version of that site. (Thanks for confirming > that the relevant site security lock still works when that software is > instead running on the intERnet copy of the server) > >>> specifically intructs any Windows NT 4, Windows 2000, or subsequent >>> Microsoft server, to display the ascii symbols for all n from 1 to 255, > >> They are misusing the term ASCII. 
That's a very common mistake. >> Probably what it actually does is to display the raw font encoding for >> the current font. > > The relevant employed server side visual basic (ASP) coding is: > > a=Chr(n) > Response.Write Asc(a) > (for all n from 1 to 255) > > Perhaps you should now complain to Microsoft that they have not sawn > off the most significant bit of the data byte in their Asc() > instruction, to specifically restrict that server side scripting to > only handling the US-ASCII subset. > No, ASCII is the proper designation for the 7-bit encoding -- the 'A' standing for 'American'. Anything with 8 bits just isn't ASCII, it's ISO 8859/1 or somesuch. How do I know this? I spent five months working in the standards department of a very large news company, and it was my business to know. cheers, Rich
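
The distinction drawn above can be checked mechanically. This is a minimal sketch in Python 3 (not the VBScript used on the server under discussion, and `is_ascii` is a helper name invented here): a byte string counts as ASCII only if every byte fits in seven bits.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the range 0-127."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


all_ascii_codes = bytes(range(128))  # the complete ASCII repertoire
pound_latin1 = bytes([0xA3])         # pound sign as encoded in ISO 8859/1

print(is_ascii(all_ascii_codes))
print(is_ascii(pound_latin1))
print(pound_latin1.decode("iso-8859-1"))
```

A function that happily returns values up to 255 for a character, as `Asc()` is reported to do in the thread, is therefore reporting a code in some 8-bit character set rather than an ASCII code.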
|
| | | |
Date: 13 Nov 2006 10:44:36
From: Starlord
Subject: Re: Restricted ASCII?
|
I know my old Atari 800XL used ASCII and so does my Atari TT030. But this has nothing to do with Astronomy or telescopes. -- The Lone Sidewalk Astronomer of Rosamond Telescope Buyers FAQ http://home.inreach.com/starlord Sidewalk Astronomy www.sidewalkastronomy.info The Church of Eternity http://home.inreach.com/starlord/church/Eternity.html "Rich Townsend" <rhdt@barVOIDtol.udel.edu > wrote in message news:ej8umk$nbg$1@scrotar.nss.udel.edu...
|
| |
Date: 12 Nov 2006 20:25:03
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <1163358882.886919.240610@m7g2000cwm.googlegroups.com >, chalkyspam@bleachboys.co.uk wrote: > Greg Hennessy wrote: > > and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit > > system that would allow for 256 characters, the". > Precisely. This is exactly what I copied this morning, followed by It says nothing about the nature of ASCII. EBCDIC is a proprietary, IBM, character code, which has a certain relationship to punched card codes. (Punched cards have 12 potential holes in each column. With the normal codes exactly one of these is punched for each digit from 0 to 9 (somewhat desirable with early manual card punches, where you had to push a key for each hole - in my time only used for making corrections). Uppercase characters were coded by punching one, or both, of the remaining rows. EBCDIC reflects this structure by coding the 0 to 9 punching into the low order four bits, with the result that character codes were not contiguous. I suspect this was done because it simplified the electronics used in the card readers.) I did see this edit, but discarded it because it was clearly an irrelevant side comment. In fact the edit history makes it clear that the reason for this change was that it misrepresented the time relation between the creation of the two different codes. (I actually read the edit comments before looking at the actual edits.) > negligible linguistic modification, which made ABSOLUTELY no difference > to the meaning of the (then) wiki ref., under History. Obviously it makes no significant difference because its a comment about EBCDIC in an article about ASCII. > Thank you for this objective confirmation. It confirms that there was a change today and that change did not have any relevance, except to someone who had completely misunderstood the original. I would definitely have been speaking out of my mouth if I hadn't typed this without speaking. 
(EBCDIC has some relevance to USENET, as newsreaders on EBCDIC based machines cannot assume that their character code is identical to ASCII in the first 128 codes; it is very different.) EBCDIC = Extended Binary Coded Decimal Interchange Code
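
A small sketch of the non-contiguous letter codes described above, assuming Python 3 and its `cp037` codec. Code page 037 is only one of several EBCDIC variants and is chosen here for illustration; the thread does not name a specific one.

```python
def ebcdic(ch: str) -> int:
    """Return the code page 037 byte value for a single character."""
    return ch.encode("cp037")[0]


# Digits keep 0 to 9 in the low four bits, mirroring the punched card rows.
digit_low_nibbles = [ebcdic(d) & 0x0F for d in "0123456789"]

# Letters come in three runs with gaps between them, unlike ASCII.
for ch in "AIJRSZ":
    print(ch, hex(ebcdic(ch)), hex(ord(ch)))

gap_i_to_j = ebcdic("J") - ebcdic("I")
gap_r_to_s = ebcdic("S") - ebcdic("R")
print(digit_low_nibbles, gap_i_to_j, gap_r_to_s)
```

In ASCII the step from I to J, or from R to S, is always 1, so code that assumes contiguous letters works on ASCII systems and can fail on EBCDIC ones.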
|
| |
Date: 12 Nov 2006 11:18:46
From: Chalky
Subject: Re: Restricted ASCII?
|
David Woolley wrote: > In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > > Yes. That is copied from the CHANGED Wiki reference, which _postdates_ > > my first posting of today. > > I've gone through the edit history and no edit since, at the latest, > November 7th, has made such a change. Even if the page had been vandalised, > the change would still be in the edit history. The only time an edit > would be taken out of the history is if it would be illegal, or at > least legally unsafe, to keep it. The sort of change we are discussing > here, by no means, fits that category (unless the same edit introduced > legally unsafe material, which would only really occur for vandalism > with this article). > > Note that Wiki is a generic term. What we seem to be, actually, talking > about is the English version of Wikipedia. > > > I disagree with that conveniently changed reference of this afternoon, > > anyway. > > The current reference (spot version URL given earlier) is correct > in the area under discussion and hasn't significantly changed in the > last few days. > > > The Server Side coding of http://1stlight.org/design/ascii.asp, > > This URL produces a request to "Click Here AFTER inserting security key > into your computer", so is useless as a public reference. > > > specifically intructs any Windows NT 4, Windows 2000, or subsequent > > Microsoft server, to display the ascii symbols for all n from 1 to 255, > > They are misusing the term ASCII. That's a very common mistake. > Probably what it actually does is to display the raw font encoding for > the current font. Most Windows fonts in the UK are either Unicode or > Windows-1252 coded, both of which have the ASCII and ISO 8859/1 graphics > as a subset. > > In particular, ASCII code points 0 to 31 are not displayable, although > a few of them have an effect on formatting. Nor is ASCII code point 127. 
> 128 through 255, as repeatedly stated, are not in ASCII, and even with the > other non-proprietary codes (ISO 8859/* and ISO 10646 (~Unicode)) code > points 128 through 159 are not displayable graphics. > > > in sequence. Neither the server nor any known browser has any > > difficulty in doing so. This works with both server side and client > > side scripting, and has done so since the last century. > > There are two ways of specifying characters to a browser, one is by > literally including the character code in the data stream. In that case, > it should be interpreted in the context of the charset parameter in the > Content-Type and whether or not a character even exists will depend on > the character set specified. The other is to provide it using numeric > entitities ( etc.) (or named ones referencing them) or through > scripting, in which case the characters should be in the HTML native > character set, which is ISO 8859/1 for versions before HTML 4.0 and > ISO 10646 for later versions. In both these cases, codes 128 through > 159 are absolutely forbidden, as is 127 and most of the range from 0 > to 31. If the browser displays these characters, it is doing so for > error recovery reasons, or for compatibility with early browsers that > were rather sloppy in their character code handling (in particular, > they tended to display the current platform code page character, rather > than the correct one defined by the standard - in the USA and UK, this > tended to produce the same result, for conforming characters). > > > I doubt that too. As suggested by Edward Green, this seems to be a bug > > introduced by the sci.astro.research moderator's software/interface. As > > That seems to be the case. Strictly speaking, the use of MIME > has never been standardised on USENET, so anything that doesn't use > ASCII is non-standard. 
However, the de facto situation is that MIME > using quoted-printable or base64 works with modern news readers and MIME > using 8bit works most of the time. > > In the *.research moderation case, it does seem that the MIME encoding > is being undone, which suggests that the system as a whole (which > might include Google) is broken because it is partially MIME aware. > Things ought to work OK if the software is fully MIME aware or, like > the USENET backbone, totally unaware of MIME. > > > I have already pointed out, 8 bit info works fine at > > sci.physics.research, and in every other usenet group tried. > > Your examples, mostly, have not sent 8 bit characters over USENET. They > have used quoted-printable encoding, in which bytes that cannot be > represented directly using ASCII are sent as three ASCII characters, > "=" and two hexadecimal digits. "=" followed by space, and "=" at the > end of the line also have special meanings, and some ASCII characters also > have to be coded using "=" and hex digits, including, of course, "=" > itself. (I say bytes, rather than characters, because the modern trend, > particularly for email and web pages, is to move to the use of the > UTF-8 encoding (or sometimes UTF-7) of ISO 10646, which is a variable > length code. Quoted printable encodes the component bytes, not the > whole character. UTF-7 reduces the problem, because it only uses > bytes which are printable in ASCII, and possibly common control > characters.) > > As far as the USENET backbone is concerned, the result is pure ASCII, and > that is what it transmits. Only when the article is passed to a user > agent (e.g. Outlook Express) or gatewayed to another protocol (e.g. email > for the moderation process or HTTP/HTML for Google Groups) is the MIME > encoding detected and resolved. Some articles are actually sent raw 8 bit. 
> These generally also work across the USENET backbone as USENET has > generally been carried by 8 bit clean protocols (unlike email), and > there have been few, if any, IBM mainframe based USENET systems, using > EBCDIC (an 8 bit code that is not based on ASCII, and the most likely > ASCII-incompatible code to have been encoutered in recent systems). > > Problems with quoting with MIME encoded material may be to do with the > way that GUI mail and news user agents normally mis-use MIME to try and > send reflowable paragraphs. See my response to Greg Hennessy. You are talking out of your arse. C
|
| |
Date: 12 Nov 2006 11:14:42
From: Chalky
Subject: Re: Restricted ASCII?
|
Greg Hennessy wrote: > >> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > >> > codes, to allow for 256 characters. > >> > >> Nope. Ascii is 7 bit. > >> http://en.wikipedia.org/wiki/ASCII > > > > Check out my response to George. This definition was changed TODAY at > > Wiki > > Not according to the WIKI history logs. There is one change listed for > today, by Chris Chittleborough, who changed "similar to" into "like" > and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit > system that would allow for 256 characters, the". Precisely. This is exactly what I copied this morning, followed by negligible linguistic modification, which made ABSOLUTELY no difference to the meaning of the (then) wiki ref., under History. Thank you for this objective confirmation. Chalky
|
| | |
Date: 12 Nov 2006 19:24:39
From: Greg Hennessy
Subject: Re: Restricted ASCII?
|
>> >> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic >> >> > codes, to allow for 256 characters. >> >> >> >> Nope. Ascii is 7 bit. >> >> http://en.wikipedia.org/wiki/ASCII >> > >> > Check out my response to George. This definition was changed TODAY at >> > Wiki >> >> Not according to the WIKI history logs. There is one change listed for >> today, by Chris Chittleborough, who changed "similar to" into "like" >> and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit >> system that would allow for 256 characters, the". > > Precisely. This is exactly what I copied this morning, followed by > negligible linguistic modification, which made ABSOLUTELY no difference > to the meaning of the (then) wiki ref., under History. The facts are you claimed ascii was 8 bit. It isn't. Wiki defines it as 7 bit. There is an 8 bit code called EBCDIC, which is mostly used in IBM mainframes, but that has nothing to do with ASCII. You were confused. > Thank you for this objective confirmation. I am not confirming you. I am proving you wrong.
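The 7-bit versus 8-bit distinction drawn in this post can be checked with a small Python sketch. It assumes Python's `cp037` codec (EBCDIC US/Canada) as a stand-in for "EBCDIC"; the thread does not name a particular EBCDIC code page, so that choice is purely illustrative.

```python
# Compare how ASCII and one EBCDIC variant encode the same two letters.
# cp037 is an assumed example code page, not one named in the thread.
text = "Az"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")

# Every ASCII code fits in 7 bits (values 0..127).
ascii_fits_7_bits = all(b < 128 for b in ascii_bytes)

# EBCDIC places its letters above 127, so it needs all 8 bits.
ebcdic_needs_8_bits = any(b > 127 for b in ebcdic_bytes)

print(ascii_bytes.hex(), ebcdic_bytes.hex())
print(ascii_fits_7_bits, ebcdic_needs_8_bits)
```

The two encodings share no byte values for these letters, which is the sense in which EBCDIC "has nothing to do with ASCII".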
|
| |
Date: 12 Nov 2006 18:47:18
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com >, chalkyspam@bleachboys.co.uk wrote: > Yes. That is copied from the CHANGED Wiki reference, which _postdates_ > my first posting of today. I've gone through the edit history and no edit since, at the latest, November 7th, has made such a change. Even if the page had been vandalised, the change would still be in the edit history. The only time an edit would be taken out of the history is if it would be illegal, or at least legally unsafe, to keep it. The sort of change we are discussing here, by no means, fits that category (unless the same edit introduced legally unsafe material, which would only really occur for vandalism with this article). Note that Wiki is a generic term. What we seem to be, actually, talking about is the English version of Wikipedia. > I disagree with that conveniently changed reference of this afternoon, > anyway. The current reference (spot version URL given earlier) is correct in the area under discussion and hasn't significantly changed in the last few days. > The Server Side coding of http://1stlight.org/design/ascii.asp, This URL produces a request to "Click Here AFTER inserting security key into your computer", so is useless as a public reference. > specifically instructs any Windows NT 4, Windows 2000, or subsequent > Microsoft server, to display the ascii symbols for all n from 1 to 255, They are misusing the term ASCII. That's a very common mistake. Probably what it actually does is to display the raw font encoding for the current font. Most Windows fonts in the UK are either Unicode or Windows-1252 coded, both of which have the ASCII and ISO 8859/1 graphics as a subset. In particular, ASCII code points 0 to 31 are not displayable, although a few of them have an effect on formatting. Nor is ASCII code point 127. 
128 through 255, as repeatedly stated, are not in ASCII, and even with the other non-proprietary codes (ISO 8859/* and ISO 10646 (~Unicode)) code points 128 through 159 are not displayable graphics. > in sequence. Neither the server nor any known browser has any > difficulty in doing so. This works with both server side and client > side scripting, and has done so since the last century. There are two ways of specifying characters to a browser, one is by literally including the character code in the data stream. In that case, it should be interpreted in the context of the charset parameter in the Content-Type and whether or not a character even exists will depend on the character set specified. The other is to provide it using numeric entities ( etc.) (or named ones referencing them) or through scripting, in which case the characters should be in the HTML native character set, which is ISO 8859/1 for versions before HTML 4.0 and ISO 10646 for later versions. In both these cases, codes 128 through 159 are absolutely forbidden, as is 127 and most of the range from 0 to 31. If the browser displays these characters, it is doing so for error recovery reasons, or for compatibility with early browsers that were rather sloppy in their character code handling (in particular, they tended to display the current platform code page character, rather than the correct one defined by the standard - in the USA and UK, this tended to produce the same result, for conforming characters). > I doubt that too. As suggested by Edward Green, this seems to be a bug > introduced by the sci.astro.research moderator's software/interface. As That seems to be the case. Strictly speaking, the use of MIME has never been standardised on USENET, so anything that doesn't use ASCII is non-standard. However, the de facto situation is that MIME using quoted-printable or base64 works with modern news readers and MIME using 8bit works most of the time. 
In the *.research moderation case, it does seem that the MIME encoding is being undone, which suggests that the system as a whole (which might include Google) is broken because it is partially MIME aware. Things ought to work OK if the software is fully MIME aware or, like the USENET backbone, totally unaware of MIME. > I have already pointed out, 8 bit info works fine at > sci.physics.research, and in every other usenet group tried. Your examples, mostly, have not sent 8 bit characters over USENET. They have used quoted-printable encoding, in which bytes that cannot be represented directly using ASCII are sent as three ASCII characters, "=" and two hexadecimal digits. "=" followed by space, and "=" at the end of the line also have special meanings, and some ASCII characters also have to be coded using "=" and hex digits, including, of course, "=" itself. (I say bytes, rather than characters, because the modern trend, particularly for email and web pages, is to move to the use of the UTF-8 encoding (or sometimes UTF-7) of ISO 10646, which is a variable length code. Quoted printable encodes the component bytes, not the whole character. UTF-7 reduces the problem, because it only uses bytes which are printable in ASCII, and possibly common control characters.) As far as the USENET backbone is concerned, the result is pure ASCII, and that is what it transmits. Only when the article is passed to a user agent (e.g. Outlook Express) or gatewayed to another protocol (e.g. email for the moderation process or HTTP/HTML for Google Groups) is the MIME encoding detected and resolved. Some articles are actually sent raw 8 bit. These generally also work across the USENET backbone as USENET has generally been carried by 8 bit clean protocols (unlike email), and there have been few, if any, IBM mainframe based USENET systems, using EBCDIC (an 8 bit code that is not based on ASCII, and the most likely ASCII-incompatible code to have been encountered in recent systems). 
Problems with quoting with MIME encoded material may be to do with the way that GUI mail and news user agents normally mis-use MIME to try and send reflowable paragraphs.
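The quoted-printable mechanism described in this post can be sketched with Python's standard `quopri` module. The three escapes are the ones from the original test post; treating the decoded bytes as ISO 8859-1 follows the charset header quoted elsewhere in the thread. The variable names are only illustrative.

```python
import quopri

# The three escapes from the original test post, as carried over USENET.
wire = b"=B0 =B1 =A9"

# What the backbone transmits is pure 7-bit ASCII.
wire_is_ascii = all(b < 128 for b in wire)

# Step 1: undo quoted-printable, giving the raw 8-bit bytes.
raw = quopri.decodestring(wire)

# Step 2: interpret those bytes in the declared charset (ISO 8859-1 here).
text = raw.decode("iso-8859-1")

# With UTF-8, one character can become several bytes, and quoted-printable
# escapes each byte separately rather than the whole character.
utf8_wire = quopri.encodestring("\u00b0".encode("utf-8"))

# "=" itself also has to be escaped.
equals_wire = quopri.encodestring(b"=")

print(text)
print(utf8_wire, equals_wire)
```

A reader that skips step 1 shows the literal `=B0 =B1 =A9`, which is consistent with the garbling reported at the start of the thread.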
|
| | |
Date: 13 Nov 2006 05:33:44
From: Jeff…Relf
Subject: " ASCII Art " ( e.g. tables ), 80 columns, monospaced, unwrapped.
|
Hi David_Woolley, What newsreader are you using ? " ASCII Art " ( e.g. tables ), 80 columns, monospaced, unwrapped, should be the standard, I maintain... even on cell phones. As far as the proper Encoding/Charset/etc. to use ( when posting ), I say, " If Google displays it correctly, then it's correct. " But FireFox ( my browser ) also effects the way posts look. For example, my userContent.CSS prevents line wrapping, to wit: * { white-space: nowrap !important; } Cotse.NET/users/jeffrelf/userContent.CSS ( to see the change, restart FireFox and do a Cntrl-F5 [ refresh ] ) So, to wrap WikiPedia's insanely long lines ( shame on them ! ), I copy and paste from WikiPedia to MS_Word. My hand-rolled newsreader ( X.EXE, X.TXT, X.CPP ) puts all posts in a single ( complexly maintained, UTF-16 ) .TXT file; so all text is editable ( e.g. I can add new-lines, if need be ).
|
| |
Date: 12 Nov 2006 10:50:53
From: Chalky
Subject: Re: Restricted ASCII?
|
Tom Roberts wrote: > Chalky wrote: > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > > > However, I have noticed that this set is truncated to 7 characters at > > sci.astro.research to conform to its first commercial use as a > > seven-bit teleprinter code (1963). [...] > > This is not due to the newsgroup itself or to the underlying newsgroup > software. It is due to the newsgroup client used by individual people. > For instance, I use Firefox and your symbols come out fine, except in > moderated newsgroups. In moderated newsgroups it also depends on the > email system used (author->moderator) and on the client software used by > the moderator, because postings are processed via email and by the > moderator's client before being distributed. > > So this is almost surely due to either the email software or the > moderator of sci.astro.research using newsgroup software that truncates > the high bit. > > > Tom Roberts _Thank you._ This is highly consistent with my own analysis of the experiment, and of the resultant responses to my associated postings. Cheers Chalky
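For reference, this is roughly what software that "truncates the high bit" would do to the three test characters. It is a sketch of the failure mode hypothesised above, assuming the characters were sent as raw ISO 8859-1 bytes; the thread does not establish that this is what actually happened at sci.astro.research.

```python
# Degree, plus-minus and copyright signs as raw ISO 8859-1 bytes (B0 B1 A9).
original = "\u00b0\u00b1\u00a9".encode("iso-8859-1")

# A 7-bit channel that simply drops the top bit of every byte.
stripped = bytes(b & 0x7F for b in original)

# The result is valid ASCII, but it is now three unrelated characters.
garbled = stripped.decode("ascii")

print(original.hex(), stripped.hex(), repr(garbled))
```

High-bit stripping silently turns the symbols into ordinary ASCII characters, whereas an undecoded quoted-printable escape stays visible as `=B0`, so the two faults look different to the reader.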
|
| |
Date: 12 Nov 2006 10:35:48
From: Chalky
Subject: Re: Restricted ASCII?
|
Greg Hennessy wrote: > On 2006-11-12, Chalky <chalkyspam@bleachboys.co.uk> wrote: > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > Nope. Ascii is 7 bit. > http://en.wikipedia.org/wiki/ASCII Check out my response to George. This definition was changed TODAY at Wiki C
|
| | |
Date: 12 Nov 2006 19:01:01
From: Greg Hennessy
Subject: Re: Restricted ASCII?
|
>> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic >> > codes, to allow for 256 characters. >> >> Nope. Ascii is 7 bit. >> http://en.wikipedia.org/wiki/ASCII > > Check out my response to George. This definition was changed TODAY at > Wiki Not according to the WIKI history logs. There is one change listed for today, by Chris Chittleborough, who changed "similar to" into "like" and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit system that would allow for 256 characters, the".
|
| | |
Date: 13 Nov 2006 07:42:43
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <1163356548.517299.123720@m7g2000cwm.googlegroups.com >, Chalky <chalkyspam@bleachboys.co.uk > wrote: > Greg Hennessy wrote: > >> On 2006-11-12, Chalky <chalkyspam@bleachboys.co.uk> wrote: >>> ASCII is defined in wiki as an 8-bit system, developed from telegraphic >>> codes, to allow for 256 characters. >> >> Nope. Ascii is 7 bit. >> http://en.wikipedia.org/wiki/ASCII > > Check out my response to George. This definition was changed TODAY at > Wiki No, it wasn't! I checked a version of that wikipedia entry from a few days ago, and that version also said ASCII is a 7-bit code. Sorry.... -- ------ Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
|
| |
Date: 12 Nov 2006 10:32:12
From: Chalky
Subject: Re: Restricted ASCII?
|
I would like to say, first of all, that it is a pleasure to receive two responses from someone with adequate technical competence to actually teach me something (possibly?). David Woolley wrote: > In article <1163343127.057255.151280@h48g2000cwc.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > > I think it is unfortunate that we now appear to be restricted to that > > subset of ascii which is represented by a single keypress on a standard > > British or American keyboard. > > If you include shift and control modifiers, what you get on a standard > US keyboard is the whole of ASCII. Interesting. I will try that out soon, if I can get hold of a standard American keyboard, from my only (drinking partner) USAF employee, in my (parochial) village. > With the same caveat, what you get > on a standard British keyboard is a superset of ASCII, because old keyboards > support =A3 as a simple shifted character. How old are you talking about?. British keyboards support the BritishPoundSterling symbol, without shift, but fail to support the Yankee Dollar symbol thus. Similarly, Spanish keyboards support the inverted question mark (necessary for correct punctuation in that language), and, by now, the Euro. (Additional examples will be apparent to the contemporary cosmopolitan European reader.) > As to American keyboards, I would have thought some parts of America > used keyboards optimised for Spanish or Portuguese. I would imagine so too. By American I meant WASP Northern America, which is where commercial computer technology first got off the ground, after British innovators such as Babbage and Turing? (not sure of latter spelling. [The AI guy who was instrumental in breaking the Enigma code in WW2]) prepared the groundwork, and then British venture capitalism failed to adequately exploit that lead (same old story again). > Incidentally, this one was identified as pure ASCII, but I've had > to convert to ISO 8859/1, in order to include =A3. Sure. 
With my cynic's hat on, I would suggest this could be interpreted as yet another example of American Imperialism (and associated stupidity). > I haven't used > ISO 8859/15, as that is relatively recent and there may be people > without support for it, either in the browser, or in fonts. > > > Content-Type: text/plain; charset="us-ascii" > > (Note the level of quoting is getting close to the level at which my > spam filter cuts in because the article is too long.) In that case, I hope this particular response "hits the spot" > PS. It looks like the article was rejected by the moderator on the > research newsgroup, and that they rejected it for the obvious reason > that it was off topic. Yes, I totally accept that. However, it is not off topic in the sense of ensuring the communication channel is left wide open. David Woolley subsequently wrote: > In article <1163328389.468587.34850@h48g2000cwc.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > Chalky wrote: > > > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > > codes, to allow for 256 characters. > > That wiki is wrong. It is using a common marketing/popular > computing misuse of the term. Which wiki was it and > which article? The reference quoted by George Dishman. Morning as opposed to afternoon version (GMT) of today. > The current edit of the English version > of Wikipedia seems to correctly define it as seven bit: > <http://en.wikipedia.org/w/index.php?title=ASCII&oldid=87321585>. > > ASCII is the US variant of ISO 646 (I think it is the same as the > reference variant), which is a seven bit code. OK, if so, let me put my question slightly differently. Why is it that a European moderated sci.((astro)) research group (Moderators: Jonathan Thornberg [Germany] and Martin Hardcastle [Britain]) is enslaved to a US variant, whereas the U.S. moderated sci.((physics)) research group (Moderators: Igor Khavkine and Phillip Helbig), is not? 
> As indicated by the Content-Type header, your article is not in ASCII: And who wrote the Content-Type header program? A WASP American, no doubt, who was employed by Bill Gates (the _pretender_ to the throne of software perfection!). > > Content-Type: text/plain; charset="iso-8859-1" > ^^^^^^^^^^ > Which indicates that it is in the eight bit code ISO 8859/1, which contains > ISO 646 (reference variant) as a subset. You were not posting in ASCII. OK. so you are arguing here about a mere debatable technical definition, not about basic (and invariant) principles of information transfer protocol, as encapsulated in HyperText Markup Language (HTML), and its contained (and unambiguous) 8 bit interpretation of the character set. [snip nonsense] > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > > correctly at sci.physics.research, I am curious to discover how endemic > > > this restriction of the ASCII set actually still is, in the Usenet > > > groups. > > For all except moderated newsgroups, and I don't think your test post > would have been passed by a moderator, this is purely a function of the > newsreaders or other user agents used. For a cross posted article, > only one copy is ever transmitted, so any corruption would apply to all > newsgroups. (In your case, the user agent seems to be the Google > Groups nntp to HTML/HTTP gateway.) OK That sounds like good info to me. > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > > -, and for copyright, below > > > > > > =B0 =B1 =A9 OK that is a hexadecimal translation of the html coding: &#n; where n is anything from 0 to 255. Hex has always been a crap way of displaying information, and there is material in the prior published literature, to suggest that hex was only introduced as an obscuration technique to disguise the underlying simplicity of the 8 bit microprocessor instruction set. 
This is particularly evident when you compare the instruction sets of early 16 bit microprocessors, expressed in octal, versus the much cruder (but apparently more complex) instruction sets of more primitive 8 bit microprocessors, which were instead expressed in hexadecimal, for marketing purposes. > Actually, as well as using ISO 8859/1, you used quoted printable, and > therefore actually only sent 7 bits in all except the just send eight > case. David Woolley also wrote: > > Interesting. All groups display correctly except > > sci.physics.relativity, which still displayed 8 bits, but translated > > into a more old-fashioned font. > > Fonts are purely a user agent issue. Wrong. You can over-ride this, if you wish, using the <PRE></PRE> HTML enclosers. There are other ways too. > As I said, unless you post to a > moderated group, in which case it is the moderator's computer system > that determines what happens to things other than pure ASCII, this is > an issue with your newsreader, because the same copy of the article is > used for all unmoderated groups in a cross-posting. > > > Looks like sci.astro.research is the only newsgroup actually restricted > > to 7 bits. [snip additional undecipherable material] > It's generally best to restrict yourself to the proper definition of > ASCII unless you are in a restricted community, You mean like the World Wide Web? > typically a language > community, You mean like HTML, CGI, ASP, php, and Javascript? > or the material can't be satisfactorily represented in ASCII. > For maths, there is only a narrow band in which this is valid, as one > soon reaches a point where one needs to use TeX, troff's eqn, or MathML, > which would normally be treated as binaries on Usenet, so are best placed > on a web site. Absolutely. In view of such excessive restrictions, perhaps use of these newsgroups is now best limited to the provision of relevant hyperlinks. Chalky
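The role of the Content-Type header quoted in this exchange can be illustrated with Python's standard `email` package. The message below is a hypothetical minimal article: the charset line is the one quoted in the thread, while the Content-Transfer-Encoding line is an assumption based on the remark that quoted-printable was used.

```python
from email import message_from_string

# A hypothetical minimal article; only the charset value comes from the thread.
article = (
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "=B0 =B1 =A9\n"
)

msg = message_from_string(article)

# The header, not the bytes themselves, says which character set applies.
charset = msg.get_content_charset()

# decode=True undoes the transfer encoding and returns raw bytes.
raw = msg.get_payload(decode=True)

# Only now can the bytes be turned into characters.
body = raw.decode(charset).strip()

print(charset, raw, body)
```

A user agent that honours both headers recovers the three symbols; one that ignores either header has to guess.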
|
| |
Date: 12 Nov 2006 17:28:12
From: Greg Hennessy
Subject: Re: Restricted ASCII?
|
On 2006-11-12, Chalky <chalkyspam@bleachboys.co.uk > wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. Nope. Ascii is 7 bit. http://en.wikipedia.org/wiki/ASCII ASCII was first published as a standard in 1967 and was last updated in 1986. It currently defines codes for 128 characters. 33 are non-printing, mostly obsolete control characters that affect how text is processed, and the other 95 printable characters are as follows (starting with the space character):
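The counts quoted from Wikipedia can be checked with a few lines of Python. This is only a sketch that enumerates the 7-bit code range directly; the variable names are illustrative.

```python
# All 128 code points that fit in 7 bits.
codes = range(128)

# Control characters: 0..31 plus DEL (127).
control = [c for c in codes if c < 32 or c == 127]

# Printable characters: space (32) through tilde (126).
printable = [c for c in codes if 32 <= c <= 126]

# The printable set as text, starting with the space character.
printable_text = "".join(chr(c) for c in printable)

print(len(control), len(printable))
print(printable_text)
```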
|
| |
Date: 13 Nov 2006 02:54:10
From: Chalky
Subject: Re: Restricted ASCII?
|
Paul Schlyter wrote: > Sorry, but http://en.wikipedia.org/wiki/ASCII says that ASCII is 7-bit. > Please check the reference before pointing to it ! I did, and it was promptly changed, after my original posting on this subject (yesterday). However, if David Woolley's faith in the accuracy of Wiki logs of alterations is justified, we now know that the misleading text string read as: a 8-bit system that would allow for 256 characters, the ASCII developed from telegraphic codes and first entered commercial use as a seven-bit teleprinter code promoted by Bell data services in 1963. This was still posted up at Wiki about 24 hours ago, as a subsection of the first sentence expressed under "History". > That's ISO-8859-1, not ASCII..... :-) > > Check out: http://czyborra.com/ Thanks. Interesting. This appears to confirm that the 8 bit ISO-8859-1 standard has been the de facto standard for interpreting single byte character definitions in all HTML browsers since the last century. So my momentary recent concerns about cross computer compatibility (as a consequence of recent comments here), appear to be unfounded. I don't really care whether this is called ISO-8859-1 or ASCII, provided it works consistently across all client browser platforms, for all 8 bit coding of text. > The characters in the range 80-FF (hex) are often erroneously called > "extended ASCII" or even just "ASCII". Yes, this is what I understood by the term ASCII, prior to more informed feedback, during the discussion. If this places me amongst the ranks of the www proletariat, I don't really mind that either. :-) Your input was appreciated. Chalky
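Why the name matters can be seen by decoding one byte from the 80-FF range under different character sets. The sketch below uses byte 0xA4 and the ISO 8859-15 codec purely as an example (ISO 8859/15 is mentioned earlier in the thread in connection with the EURO symbol); the choice of byte is an assumption made for illustration.

```python
# One byte from the range that is often loosely called "extended ASCII".
byte = bytes([0xA4])

# The same byte is a different character in different 8-bit character sets.
as_latin1 = byte.decode("iso-8859-1")   # generic currency sign
as_latin9 = byte.decode("iso-8859-15")  # euro sign

# In ASCII proper the byte has no meaning at all.
try:
    byte.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(as_latin1, as_latin9, ascii_ok)
```

So "8 bit coding of text" only works consistently when sender and receiver agree on which 8-bit character set is in use.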
|
| | |
Date: 13 Nov 2006 15:13:17
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <1163415250.692641.297160@i42g2000cwa.googlegroups.com >, Chalky <chalkyspam@bleachboys.co.uk > wrote: > >Paul Schlyter wrote: > >> Sorry, but http://en.wikipedia.org/wiki/ASCII says that ASCII is 7-bit. >> Please check the reference before pointing to it ! > >I did, and it was promptly changed, after my original posting on this >subject (yesterday). However, if David Wooley's faith in the accuracy >of Wiki logs of alterations is justified, we now know that the >misleading text string read as: > >a 8-bit system that would allow for 256 characters, the ASCII developed >from telegraphic codes and first entered commercial use as a seven-bit >teleprinter code promoted by Bell data services in 1963. Please don't truncate this paragraph so it becomes misleading. If we include some of the words before, it will read: # History # # Some time after the EBCDIC code, a 8-bit system that would allow for # 256 characters, the ASCII developed from telegraphic codes and first # entered commercial use as a seven-bit teleprinter code promoted by Bell # data services in 1963 The 8-bit system referred to is EBCDIC, not ASCII. And EBCDIC is indeed an 8-bit character code, although it has "holes" here and there in its allocation of characters to codes. >This was still posted up at Wiki about 24 hours ago, as a subsection of >the first sentence expressed under "History". True, however the older version of that section never claimed ASCII to be an 8-bit character set...... :-) >> That's ISO-8859-1, not ASCII..... :-) >> >> Check out: http://czyborra.com/ > >Thanks. Interesting. This appears to confirm that the 8 bit ISO-8859-1 >standard has been the de facto standard for interpreting single byte >character definitions in all HTML browsers since the last century. That's probably region dependent. Remember that we also have ISO-Latin-2 to ISO-Latin-13, most of which are the preferred version of ISO-Latin in various parts of the world. 
And China, Japan and Korea preferred other character sets which included the Kanji and related characters. >So my momentary recent concerns about cross computer compatibility (as a >consequence of recent comments here), appear to be unfounded. I don't >really care whether this is called ISO-8859-1 or ASCII, provided it >works consistently across all client browser platforms, for all 8 bit >coding of text. Unfortunately, it's not quite that easy. If you write HTML, you'd better be aware of what character encoding you're using, or else your web pages may display incorrectly here and there. In western Europe and the US, the predominant encoding is of course ISO-8859-1 aka ISO-Latin-1, however UTF-8 is getting more and more common (UTF-8 is a popular 8-bit encoding of Unicode). And, as usual, Microsoft does things in its own ways. Instead of using proper ISO-Latin-1, Windows uses Win-Latin-1 (aka Windows Code Page 1252). Now, Win-Latin-1 and ISO-Latin-1 are very similar -- the only difference are the characters 0x80 to 0x9F: in ISO-Latin (all versions) 0x80 to 0x9F just duplicate the control characters 0x00 to 0x1F, while Win-Latin-1 puts additional printable characters in 0x80 to 0x9F. Quite often, people who write web pages on a Windows platform are really using Win-Latin-1, but they believe they're using ISO-Latin-1 and say so in the headers of their HTML files - as a result, their web pages may look weird here and there, when viewed on a web browser on a non-Win |
|