| |
Date: 12 Nov 2006 02:25:03
From: Chalky
Subject: Restricted ASCII?
|
ASCII is defined in wiki as an 8-bit system, developed from telegraphic codes, to allow for 256 characters.

However, I have noticed that this set is truncated to 7 characters at sci.astro.research to conform to its first commercial use as a seven-bit teleprinter code (1963).

This can result in considerable garbling of 8-bit ASCII text.

Since I am pretty sure I have seen Schrodinger's equation spelled correctly at sci.physics.research, I am curious to discover how endemic this restriction of the ASCII set actually still is, in the Usenet groups.

Towards this end, I have pasted in the 8 bit ASCII for degrees, for + -, and for copyright, below

° ± ©

Chalky
|
|
| |
Date: 12 Nov 2006 15:31:45
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <1163343127.057255.151280@h48g2000cwc.googlegroups.com>, chalkyspam@bleachboys.co.uk wrote:

> I think it is unfortunate that we now appear to be restricted to that subset of ascii which is represented by a single keypress on a standard British or American keyboard.

If you include shift and control modifiers, what you get on a standard US keyboard is the whole of ASCII. With the same caveat, what you get on a standard British keyboard is a superset of ASCII, because old keyboards support £ as a simple shifted character. (Modern ones also have the EURO symbol, but that requires the ISO 8859/15 character code or one of the proprietary ones. EURO, on PC keyboards, also requires the use of the alt-graphics modifier key.)

As to American keyboards, I would have thought some parts of America used keyboards optimised for Spanish or Portuguese.

Incidentally, this one was identified as pure ASCII, but I've had to convert to ISO 8859/1, in order to include £. I haven't used ISO 8859/15, as that is relatively recent and there may be people without support for it, either in the browser, or in fonts.

> Content-Type: text/plain; charset="us-ascii"

(Note the level of quoting is getting close to the level at which my spam filter cuts in because the article is too long.)

PS. It looks like the article was rejected by the moderator on the research newsgroup, and that they rejected it for the obvious reason that it was off topic.
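The relationship between the three character sets mentioned above can be checked by simply trying to encode the pound and euro signs in each. This is only a sketch using Python's built-in codec names; the helper `can_encode` is made up for the illustration.

```python
def can_encode(char: str, charset: str) -> bool:
    """Return True if `char` has a code in `charset` (illustrative helper)."""
    try:
        char.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

# ASCII has neither sign, ISO 8859-1 adds the pound, ISO 8859-15 adds the euro.
for charset in ("us-ascii", "iso-8859-1", "iso-8859-15"):
    print(charset, "pound:", can_encode("£", charset), "euro:", can_encode("€", charset))
```

Under these codecs the pound sign is byte A3 in ISO 8859-1, and the euro sign takes byte A4 in ISO 8859-15.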
|
| |
Date: 12 Nov 2006 15:38:12
From: Tom Roberts
Subject: Re: Restricted ASCII?
|
Chalky wrote:

> ASCII is defined in wiki as an 8-bit system, developed from telegraphic codes, to allow for 256 characters.
>
> However, I have noticed that this set is truncated to 7 characters at sci.astro.research to conform to its first commercial use as a seven-bit teleprinter code (1963). [...]

This is not due to the newsgroup itself or to the underlying newsgroup software. It is due to the newsgroup client used by individual people. For instance, I use Firefox and your symbols come out fine, except in moderated newsgroups.

In moderated newsgroups it also depends on the email system used (author->moderator) and on the client software used by the moderator, because postings are processed via email and by the moderator's client before being distributed.

So this is almost surely due to either the email software or the moderator of sci.astro.research using newsgroup software that truncates the high bit.

Tom Roberts
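As a rough illustration of what "truncating the high bit" would do to the three test symbols, the sketch below clears bit 7 of each ISO 8859-1 byte. The helper name `strip_high_bit` is invented for this example; it is not taken from any actual news or mail software.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear the most significant bit of every byte (hypothetical helper)."""
    return bytes(b & 0x7F for b in data)

original = "° ± ©".encode("iso-8859-1")   # bytes B0 20 B1 20 A9
mangled = strip_high_bit(original)

print(original.hex(" "))          # b0 20 b1 20 a9
print(mangled.decode("ascii"))    # 0 1 )
```

A seven-bit path of this kind would turn the degree, plus-minus and copyright signs into "0", "1" and ")", which is the sort of garbling one would expect to see if the top bit really were being erased.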
|
| |
Date: 12 Nov 2006 07:30:41
From: Chalky
Subject: Re: Restricted ASCII?
|
George Dishman wrote:

> [note: hand indented because the special characters included by "Chalky" forced the use of "quoted printable" which prevents Outlook Express handling the ident automatically.]

OK, but what does this mean in terms of displayed information, since I don't really understand this comment?

> "Chalky" <chalkyspam@bleachboys.co.uk> wrote in message news:1163327103.618298.79130@e3g2000cwe.googlegroups.com...
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic codes, to allow for 256 characters.
>
> http://en.wikipedia.org/wiki/ASCII

WOW. That text has CERTAINLY been changed since this morning. What I quoted above was copied and pasted _directly_ from the first sentence of that reference, under the sub heading "History", at ~10AM BST today!

> "ASCII was first published as a standard in 1967 and was last updated in 1986. It currently defines codes for 128 characters. 33 are non-printing, mostly obsolete control characters that affect how text is processed, and the other 95 printable characters are as follows (starting with the space character):"
>
> and later
>
> "ASCII is, strictly, a seven-bit code, meaning that it uses the bit patterns representable with seven binary digits (a range of 0 to 127 decimal) to represent character information."

Yes. That is copied from the CHANGED Wiki reference, which _postdates_ my first posting of today.

I disagree with that conveniently changed reference of this afternoon, anyway. The Server Side coding of http://1stlight.org/design/ascii.asp, specifically instructs any Windows NT 4, Windows 2000, or subsequent Microsoft server, to display the ascii symbols for all n from 1 to 255, in sequence. Neither the server nor any known browser has any difficulty in doing so. This works with both server side and client side scripting, and has done so since the last century.

> History aside, use of 8-bit characters breaks one of the most common newsreaders.

Which one?

> It may be MS's fault

I doubt that.

> but that's where we are.

I doubt that too. As suggested by Edward Green, this seems to be a bug introduced by the sci.astro.research moderator's software/interface. As I have already pointed out, 8 bit info works fine at sci.physics.research, and in every other usenet group tried.

Chalky
|
| | |
Date: 12 Nov 2006 16:25:15
From: George Dishman
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163345440.980996.198470@m7g2000cwm.googlegroups.com... > > George Dishman wrote: > >> [note: hand indented because the special characters >> included by "Chalky" forced the use of "quoted >> printable" which prevents Outlook Express handling >> the ident automatically.] > > OK, but what does this mean in terms of displayed information, since I > don't really understand this comment? It means the " > " which prefixes each quoted line doesn't get put in by the newsreader, I had to edit it into each line I quoted myself. That's why I trimmed most of your post. >> "Chalky" <chalkyspam@bleachboys.co.uk> wrote in message >> news:1163327103.618298.79130@e3g2000cwe.googlegroups.com... >> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic >> > codes, to allow for 256 characters. >> >> http://en.wikipedia.org/wiki/ASCII > > WOW. That text has CERTAINLY been changed since this morning. What I > quoted above was copied and pasted _directly_ from the first sentence > of that reference, under the sub heading "History", at ~10AM BST today! I suspect you can find some history of editing of the page on Wiki that would tell you who changed it but it was like that when I went there. BTW, there are some duplicated topics on Wiki so it might be worth checking your browser history to be sure you got the same page. >> "ASCII was first published as a standard in 1967 and >> was last updated in 1986. It currently defines codes >> for 128 characters. 33 are non-printing, mostly >> obsolete control characters that affect how text is >> processed, and the other 95 printable characters are >> as follows (starting with the space character):" >> >> and later >> >> "ASCII is, strictly, a seven-bit code, meaning that it >> uses the bit patterns representable with seven binary >> digits (a range of 0 to 127 decimal) to represent >> character information." > > Yes. 
That is copied from the CHANGED Wiki reference, which _postdates_ my first posting of today.
>
> I disagree with that conveniently changed reference of this afternoon, anyway.

I know what it says now is correct, I've been familiar with the coding for decades (one of my early jobs required converting between 7-bit and 5-bit).

> The Server Side coding of http://1stlight.org/design/ascii.asp, specifically instructs any Windows NT 4, Windows 2000, or subsequent Microsoft server, to display the ascii symbols for all n from 1 to 255, in sequence. Neither the server nor any known browser has any difficulty in doing so. This works with both server side and client side scripting, and has done so since the last century.

Again, MS seldom restricts itself to standards.

>> History aside, use of 8-bit characters breaks one of the most common newsreaders.
>
> Which one?

Outlook Express as I said above.

>> It may be MS's fault
>
> I doubt that.
>
>> but that's where we are.
>
> I doubt that too. As suggested by Edward Green, this seems to be a bug introduced by the sci.astro.research moderator's software/interface. As I have already pointed out, 8 bit info works fine at sci.physics.research, and in every other usenet group tried.

It depends on the characters used. I have had this problem earlier this year on posts in sci.astro which is unmoderated. The first message you sent had this in the headers:

Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1163327108 1505 127.0.0.1 (12 Nov 2006 10:25:08 GMT)

The post I am replying to now has no problems and these are the headers:

Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Trace: posting.google.com 1163345446 13704 127.0.0.1 (12 Nov 2006 15:30:46 GMT)

Google automatically switched to non-ASCII because you included the special characters.

George
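The two header sets differ in both the declared charset and the transfer encoding. Here is a minimal sketch, using Python's standard `quopri` module, of how a quoted-printable body maps back to characters once the declared charset is applied. The sample body is an assumption modelled on the three test symbols in this thread, not the actual article bytes.

```python
import quopri

# Body as it might travel under "Content-Transfer-Encoding: quoted-printable"
# (assumed sample, not the real article).
wire_body = b"=B0 =B1 =A9"

raw = quopri.decodestring(wire_body)   # undo the transfer encoding -> 8-bit bytes
text = raw.decode("iso-8859-1")        # apply the charset named in Content-Type

print(raw.hex(" "))   # b0 20 b1 20 a9
print(text)           # ° ± ©

# A body made only of ASCII characters needs neither step.
print("plain text".isascii())   # True
```

A reader that skips the first step shows the `=XX` sequences literally, while one that skips the second has to guess which eight-bit character set was meant.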
|
| |
Date: 12 Nov 2006 06:52:07
From: Chalky
Subject: Re: Restricted ASCII?
|
Edward Green wrote: > Chalky wrote: > > > Edward Green wrote: > > > > I read everything at Google, the new AOL to some, and your symbols look > > > fine in any group. I don't know what's going on behind the scenes, > > > but I don't think a "Usenet group" is the logical entity you think it > > > is. You have the messages, and you have the formatting, which is left > > > up to the Newsreader. > > > > Not true. The top bit is erased at sci.astro.research. > > Can you reference a particular post? Maybe there is something funny > going on in the moderation step. Sure. I can do better than that. The following was the sci.astro.research moderator's response to my own (unaccepted) posting: I am sorry to inform you that your post has been found unsuitable for posting to sci.astro.research, for the following reason(s): * insufficiently relevant to research in astronomy or astrophysics. Either your message is completely off-topic for this forum, in which case please submit it to a more appropriate group; or it has insufficient content related to research to allow it to be posted under the sci.astro.research charter, in which case it may be better to post it in sci.astro or one of the other unmoderated groups in the sci.astro hierarchy. 
Moderator, sci.astro.research [This discussion is off-topic for s.a.r., but you might want to refer to http://en.wikipedia.org/wiki/ASCII -- mjh] ---------------------------------------------------------------------- Text of your message: --------------------- >From martinh@chiark.greenend.org.uk Thu Nov 09 16:40:59 2006 Return-path: <martinh@chiark.greenend.org.uk > X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on hercules.herts.ac.uk X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00, UNPARSEABLE_RELAY autolearn=ham version=3.1.3 Envelope-to: mjh@localhost Delivery-date: Thu, 09 Nov 2006 16:40:59 +0000 Received: from localhost ([127.0.0.1] ident=mjh) by hercules.herts.ac.uk with esmtp (Exim 3.36 #1 (Debian)) id 1GiCxX-00084E-00 for <mjh@localhost >; Thu, 09 Nov 2006 16:40:59 +0000 Received: from tucana.herts.ac.uk [147.197.215.113] by localhost with IMAP (fetchmail-6.2.5) for mjh@localhost (single-drop); Thu, 09 Nov 2006 16:40:59 +0000 (GMT) Received: from corvus.herts.ac.uk ([147.197.215.112] helo=corvus) by tucana.herts.ac.uk with esmtp (Exim 4.44) id 1GiCxB-0001Lm-82 for m.j.hardcastle@herts.ac.uk; Thu, 09 Nov 2006 16:40:37 +0000 Received: from [193.201.200.170] (helo=chiark.greenend.org.uk) by corvus with smtp (Exim 4.40) id 1GiCx9-0006eG-3h for mjh@star.herts.ac.uk; Thu, 09 Nov 2006 16:40:35 +0000 Received: from [193.4.58.12] (helo=horus.isnic.is ident=root) by chiark.greenend.org.uk (Debian Exim 3.36 #1) with esmtp (return-path news@google.com) id 1GiCwy-0000OO-00 for sci.astro.research@slimy.greenend.org.uk; Thu, 09 Nov 2006 16:40:24 +0000 Received: from proxy.google.com (proxy.google.com [66.102.7.4]) by horus.isnic.is (8.12.9p2/8.12.9/isnic) with ESMTP id kA9GeMUx027497 for <sci-astro-research@moderators.isc.org >; Thu, 9 Nov 2006 16:40:22 GMT (envelope-from news@google.com) Received: from G081002 by proxy.google.com with ESMTP id kA9GeLHu015417 for <sci-astro-research@moderators.isc.org >; Thu, 9 Nov 2006 08:40:21 
-0800 Received: (from news@localhost) by Google Production with id kA9GeKTj031064 for sci-astro-research@moderators.isc.org; Thu, 9 Nov 2006 08:40:20 -0800 To: sci-astro-research@moderators.isc.org Path: m7g2000cwm.googlegroups.com!not-for-mail From: "Chalky" <chalkyspam@bleachboys.co.uk > Newsgroups: sci.astro.research Subject: Re: A Revised Planck Scale? Date: 9 Nov 2006 08:40:17 -0800 Organization: http://groups.google.com Lines: 39 Message-ID: <1163090417.072908.314350@m7g2000cwm.googlegroups.com > References: <mt2.0-21097-1162632715@hercules.herts.ac.uk > <mt2.0-21568-1162989572@hercules.herts.ac.uk > NNTP-Posting-Host: 195.92.67.65 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-Trace: posting.google.com 1163090420 31043 127.0.0.1 (9 Nov 2006 16:40:20 GMT) X-Complaints-To: groups-abuse@google.com NNTP-Posting-Date: Thu, 9 Nov 2006 16:40:20 +0000 (UTC) In-Reply-To: <mt2.0-21568-1162989572@hercules.herts.ac.uk > User-Agent: G2/1.0 X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0),gzip(gfe),gzip(gfe) X-HTTP-Via: 1.1 webcacheH01 (NetCache NetApp/5.5R3D3) Complaints-To: groups-abuse@google.com Injection-Info: m7g2000cwm.googlegroups.com; posting-host=195.92.67.65; posting-account=oMPGkg0AAAB-lceMS5dlyP2BwpYen6gq X-C-UH-MailScanner: No Virus detected X-UH-MailScanner-From: martinh@chiark.greenend.org.uk X-UH-MailScanner-Information: UH-mail X-UH-MailScanner: No Virus detected Oh No wrote: > Thus spake Oh No <NotI@charlesfrancis.wanadoo.co.uk> > >I should just like to add that the Schwarzschild radius of the proton > >is not something which appears in standard physical models, the reason > >being that a classical massive point particle is not a consistent idea > >in general relativity. 
In fact a proton must be treated quantum > >mechanically, and we do not have an accepted theory on that, but if the > >Schwarzschild radius of the proton were considered then it would have a > >magnitude given by > > > > 2Gm/c^3 =3D 8.28 x 10 e^-63 m > > > >Planck length also has a formal definition > > > > l_p =3D sqrt(hbar*G/c^3) =3D 1.61605e-35 =B1 1.0e-39 m > > > >Neither of these figures is open to revision beyond that allowed by > >experimental margins of error. If you are defining other quantities, you > >should give them other names. > > > With apologies, I copy pasted those figures from another source. The > equations looked all right when I posted, but obviously they did not > contain pure ASCII You are wrong. At least the latter was pure ascii. Many useful characters are part of the pure ascii set that used to be accepted in at least some of these newsgroups. The classic example is the correct o in Schrodinger, recently featured (correctly) in a title at sci.physics.research. I think it is unfortunate that we now appear to be restricted to that subset of ascii which is represented by a single keypress on a standard British or American keyboard. Chalky ----- End forwarded message ----- > > These postings > > were in response to a suggestion from the moderator there, that this > > should be discussed here. > > But where is "here"? sci.physics, sci.physics.relativity, sci.astro, sci.astro.amateur, sci.astro.seti (Sci.astro.research moderator's recommendation just being sci.astro) Chalky
|
| |
Date: 12 Nov 2006 06:36:55
From: Chalky
Subject: Re: Restricted ASCII?
|
Chalky wrote:

> Edward Green wrote:
>
> > Chalky wrote:
> >
> > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic codes, to allow for 256 characters.
> >
> > Neat. Another discussion of computer archaics. And I was afraid outlets for procrastination were closed!
> >
> > > However, I have noticed that this set is truncated to 7 characters at sci.astro.research to conform to its first commercial use as a seven-bit teleprinter code (1963).
> >
> > I always thought of them as ASCII and extended-ASCII, or possibly ANSI?
> >
> > > This can result in considerable garbling of 8-bit ASCII text.
> > >
> > > Since I am pretty sure I have seen Schrodinger's equation spelled correctly at sci.physics.research, I am curious to discover how endemic this restriction of the ASCII set actually still is, in the Usenet groups.
> > >
> > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + -, and for copyright, below
> > >
> > > ° ± ©
> >
> > I read everything at Google, the new AOL to some, and your symbols look fine in any group. I don't know what's going on behind the scenes, but I don't think a "Usenet group" is the logical entity you think it is. You have the messages, and you have the formatting, which is left up to the Newsreader.
>
> Not true. The top bit is erased at sci.astro.research.

To clarify further, by "the top bit" I mean the _Most Significant Bit_.

In computers that are more advanced than 8 bit microprocessors (circa early '70s), the most significant bit of the computer word is typically employed for the most important variable. For data, this typically means + or -. Similarly, for the instruction set (eg in the General Instruments CP 1600 microprocessor [circa mid-late '70s]), this typically was used to signal a switch between internal (MSB=0) and external (MSB=1) data manipulations [albeit still, in that example, only then employing the MSB of a 12 bit instruction word]

When we come back down to 8 bit data words, then the MSB is still used to switch from the restricted (1963) set of American Bell teleprinter code characters (MSB=0), and the extended set (MSB=1), as originally intended when ascii was proposed and defined as an 8 bit code.

Chalky
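The remark that the most significant bit of a data word "typically means + or -" corresponds to two's-complement integers. As a small sketch, the same eight-bit pattern can be read as an unsigned value, as a signed value, and tested for its top bit. The byte chosen here (the ISO 8859-1 code for the degree sign) is just a convenient example, and the sketch concerns integer representation only.

```python
byte = b"\xb0"   # bit pattern 1011 0000

unsigned = int.from_bytes(byte, "big", signed=False)
signed = int.from_bytes(byte, "big", signed=True)   # two's-complement reading
msb_set = bool(byte[0] & 0x80)                      # test the most significant bit

print(unsigned, signed, msb_set)   # 176 -80 True
```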
|
| | |
Date: 12 Nov 2006 16:00:37
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163342215.613268.5060@m73g2000cwd.googlegroups.com... Chalky wrote: > Edward Green wrote: > > > Chalky wrote: > > > > > ASCII is defined in wiki as an 8-bit system, developed from > > > telegraphic > > > codes, to allow for 256 characters. > > > > Neat. Another discussion of computer archaics. And I was afraid > > outlets for procrastination were closed! > > > > > However, I have noticed that this set is truncated to 7 characters at > > > sci.astro.research to conform to its first commercial use as a > > > seven-bit teleprinter code (1963). > > > > I always thought of them as ASCII and extended-ASCII, or possibly ANSI? > > > > > This can result in considerable garbling of 8-bit ASCII text. > > > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > > correctly at sci.physics.research, I am curious to discover how > > > endemic > > > this restriction of the ASCII set actually still is, in the Usenet > > > groups. > > > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > > -, and for copyright, below > > > > > > ° ± © > > > > I read everything at Google, the new AOL to some, and your symbols look > > fine in any group. I don't know what's going on behind the scenes, > > but I don't think a "Usenet group" is the logical entity you think it > > is. You have the messages, and you have the formatting, which is left > > up to the Newsreader. > > Not true. The top bit is erased at sci.astro.research. To clarify further, by "the top bit" I mean the _Most Significant Bit_. In computers that are more advanced than 8 bit microprocessors (circa early '70s), the most significant bit of the computer word is typically employed for the most important variable. For data, this typically means + or -. 
Similarly, for the instruction set (eg in the General Instruments CP 1600 microprocessor [circa mid-late '70s]), this typically was used to signal a switch between internal (MSB=0) and external (MSB=1) data manipulations [albeit still, in that example, only then employing the MSB of a 12 bit instruction word] When we come back down to 8 bit data words, then the MSB is still used to switch from the restricted (1963) set of American Bell teleprinter code characters (MSB=0), and the extended set (MSB=1), as originally intended when ascii was proposed and defined as an 8 bit code. Chalky I used to own a 110 baud teleprinter. Being mechanical the MSB was ignored, but not only that, lower case was ignored also. It was essentially 6-bit. Man, that used to clatter, but it worked with a drop of oil. Then, glory be, a TV interface. Full 8-bit prom for the character set, 2K ram for the entire screen. You could do a lot with a 4 MHz Zilog Z80 and a cassette recorder for mass storage. I see Woolworth are selling Chinese B/W 5" screen TVs for £10 now. Androcles
|
| |
Date: 12 Nov 2006 14:51:08
From: George Dishman
Subject: Re: Restricted ASCII?
|
[note: hand indented because the special characters included by "Chalky" forced the use of "quoted printable" which prevents Outlook Express handling the ident automatically.] "Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163327103.618298.79130@e3g2000cwe.googlegroups.com... > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. http://en.wikipedia.org/wiki/ASCII "ASCII was first published as a standard in 1967 and was last updated in 1986. It currently defines codes for 128 characters. 33 are non-printing, mostly obsolete control characters that affect how text is processed, and the other 95 printable characters are as follows (starting with the space character):" and later "ASCII is, strictly, a seven-bit code, meaning that it uses the bit patterns representable with seven binary digits (a range of 0 to 127 decimal) to represent character information." History aside, use of 8-bit characters breaks one of the most common newsreaders. It may be MS's fault but that's where we are. Chalky
|
| | |
Date: 12 Nov 2006 16:00:37
From: Sorcerer
Subject: Re: Restricted ASCII?
|
Ok, thanks. Maybe it'll be fixed some day. "George Dishman" <george@briar.demon.co.uk > wrote in message news:ej7bfu$2td$1@news.freedom2surf.net...
|
| |
Date: 12 Nov 2006 05:57:47
From: Edward Green
Subject: Re: Restricted ASCII?
|
Chalky wrote: > Edward Green wrote: > > I read everything at Google, the new AOL to some, and your symbols look > > fine in any group. I don't know what's going on behind the scenes, > > but I don't think a "Usenet group" is the logical entity you think it > > is. You have the messages, and you have the formatting, which is left > > up to the Newsreader. > > Not true. The top bit is erased at sci.astro.research. Can you reference a particular post? Maybe there is something funny going on in the moderation step. > These postings > were in response to a suggestion from the moderator there, that this > should be discussed here. But where is "here"?
|
| |
Date: 12 Nov 2006 05:57:15
From: Chalky
Subject: Re: Restricted ASCII?
|
Sorcerer wrote:

> Returned from sci.physics.relativity, absent auto-indent.

Sorcerer wrote:

> Returned from sci.physics, also absent auto-indent.
> Androcles

Sorry, could you explain what you mean by this? As far as I am aware, auto-indent is not an ascii code. As far as I am aware, I did not employ an auto-indent in these postings, anyway.

C
|
| | |
Date: 12 Nov 2006 15:21:08
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163339834.991566.50860@i42g2000cwa.googlegroups.com...
|
| |
Date: 12 Nov 2006 05:51:18
From: Chalky
Subject: Re: Restricted ASCII?
|
Edward Green wrote:

> Chalky wrote:
>
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic codes, to allow for 256 characters.
>
> Neat. Another discussion of computer archaics. And I was afraid outlets for procrastination were closed!
>
> > However, I have noticed that this set is truncated to 7 characters at sci.astro.research to conform to its first commercial use as a seven-bit teleprinter code (1963).
>
> I always thought of them as ASCII and extended-ASCII, or possibly ANSI?
>
> > This can result in considerable garbling of 8-bit ASCII text.
> >
> > Since I am pretty sure I have seen Schrodinger's equation spelled correctly at sci.physics.research, I am curious to discover how endemic this restriction of the ASCII set actually still is, in the Usenet groups.
> >
> > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + -, and for copyright, below
> >
> > ° ± ©
>
> I read everything at Google, the new AOL to some, and your symbols look fine in any group. I don't know what's going on behind the scenes, but I don't think a "Usenet group" is the logical entity you think it is. You have the messages, and you have the formatting, which is left up to the Newsreader.

Not true. The top bit is erased at sci.astro.research. These postings were in response to a suggestion from the moderator there, that this should be discussed here.

> Lets set some rational follow-ups, shall we?

Yes please

Chalky
|
| | |
Date: 12 Nov 2006 07:56:57
From: Starlord
Subject: Re: Restricted ASCII?
|
We are just fine in using the plain text in S.A.A. -- The Lone Sidewalk Astronomer of Rosamond Telescope Buyers FAQ http://home.inreach.com/starlord Sidewalk Astronomy www.sidewalkastronomy.info The Church of Eternity http://home.inreach.com/starlord/church/Eternity.html "Chalky" <chalkyspam@bleachboys.co.uk > wrote garbage
|
| | |
Date: 12 Nov 2006 15:04:41
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163339478.048813.247840@b28g2000cwb.googlegroups.com... Edward Green wrote: > Chalky wrote: > > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > Neat. Another discussion of computer archaics. And I was afraid > outlets for procrastination were closed! > > > However, I have noticed that this set is truncated to 7 characters at > > sci.astro.research to conform to its first commercial use as a > > seven-bit teleprinter code (1963). > > I always thought of them as ASCII and extended-ASCII, or possibly ANSI? > > > This can result in considerable garbling of 8-bit ASCII text. > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > correctly at sci.physics.research, I am curious to discover how endemic > > this restriction of the ASCII set actually still is, in the Usenet > > groups. > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > -, and for copyright, below > > > > ° ± © > > I read everything at Google, the new AOL to some, and your symbols look > fine in any group. I don't know what's going on behind the scenes, > but I don't think a "Usenet group" is the logical entity you think it > is. You have the messages, and you have the formatting, which is left > up to the Newsreader. Not true. The top bit is erased at sci.astro.research. These postings were in response to a suggestion from the moderator there, that this should be discussed here. > Lets set some rational follow-ups, shall we? Yes please Chalky At least I now know why I have no indents auto-supplied by Outlook Express when I hit "Reply". Good experiment, Chalky.
|
| |
Date: 12 Nov 2006 05:46:50
From: Edward Green
Subject: Re: Restricted ASCII?
|
Chalky wrote:

> ASCII is defined in wiki as an 8-bit system, developed from telegraphic codes, to allow for 256 characters.

Neat. Another discussion of computer archaics. And I was afraid outlets for procrastination were closed!

> However, I have noticed that this set is truncated to 7 characters at sci.astro.research to conform to its first commercial use as a seven-bit teleprinter code (1963).

I always thought of them as ASCII and extended-ASCII, or possibly ANSI?

> This can result in considerable garbling of 8-bit ASCII text.
>
> Since I am pretty sure I have seen Schrodinger's equation spelled correctly at sci.physics.research, I am curious to discover how endemic this restriction of the ASCII set actually still is, in the Usenet groups.
>
> Towards this end, I have pasted in the 8 bit ASCII for degrees, for + -, and for copyright, below
>
> ° ± ©

I read everything at Google, the new AOL to some, and your symbols look fine in any group. I don't know what's going on behind the scenes, but I don't think a "Usenet group" is the logical entity you think it is. You have the messages, and you have the formatting, which is left up to the Newsreader.

Lets set some rational follow-ups, shall we?
|
| | |
Date: 12 Nov 2006 15:01:00
From: Sorcerer
Subject: Re: Restricted ASCII?
|
He's testing, and you are whining about follow ups. You are right, you don't know what's going on behind the scenes, you clueless MORON, Green! "Edward Green" <spamspamspam3@netzero.com > wrote in message news:1163339210.186352.264010@h48g2000cwc.googlegroups.com... Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. Neat. Another discussion of computer archaics. And I was afraid outlets for procrastination were closed! > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). I always thought of them as ASCII and extended-ASCII, or possibly ANSI? > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below > > ° ± © I read everything at Google, the new AOL to some, and your symbols look fine in any group. I don't know what's going on behind the scenes, but I don't think a "Usenet group" is the logical entity you think it is. You have the messages, and you have the formatting, which is left up to the Newsreader. Lets set some rational follow-ups, shall we?
|
| |
Date: 12 Nov 2006 05:41:30
From: Chalky
Subject: Re: Restricted ASCII?
|
Chalky wrote:

> Chalky wrote:
>
> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic codes, to allow for 256 characters.
> >
> > However, I have noticed that this set is truncated to 7 characters at sci.astro.research to conform to its first commercial use as a seven-bit teleprinter code (1963).
> >
> > This can result in considerable garbling of 8-bit ASCII text.
> >
> > Since I am pretty sure I have seen Schrodinger's equation spelled correctly at sci.physics.research, I am curious to discover how endemic this restriction of the ASCII set actually still is, in the Usenet groups.
> >
> > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + -, and for copyright, below
> >
> > ° ± ©
> >
> > Chalky
>
> Interesting. All groups display correctly except sci.physics.relativity, which still displayed 8 bits, but translated into a more old-fashioned font.
>
> Looks like sci.astro.research is the only newsgroup actually restricted to 7 bits.
>
> I wonder why that is?
>
> Chalky

I have since noticed that the wiki reference and all references therefrom are complete rubbish in all other respects, since they all still restrict the displayed characters to the least significant 7 bits of ASCII (i.e. restriction to Bell teleprinter code, circa 1963). This erases all unique characteristics of Scandinavian (and Germanic) languages, all unique characteristics of Latin languages (such as French & Spanish), and all currencies other than the Yankee Dollar. (Thus excluding the British Pound Sterling, the Euro, and the Japanese Yen [to name a few important examples], as well as precluding the use of any more advanced scientific notation.)

So, this is (probably) goodbye from me to sci.astro.research. (I can't cope with this more-than-40-year-out-of-date ascii restriction. [Or, as Captain Beefheart said more eloquently, "I cry, but I can't buy your Veterans' Day Poppy."])

In view of this apparent dearth of up-to-date information on ascii on the internet, I am now recommending (to the relevant management) that the intRAnet version of the file http://1stlight.org/design/ascii.asp, should now be included on the intERnet version of that site, too.

Chalky
|
| |
Date: 12 Nov 2006 02:46:29
From: Chalky
Subject: Re: Restricted ASCII?
|
Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. > > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). > > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below > > =B0 =B1 =A9 > > Chalky Interesting. All groups display correctly except sci.physics.relativity, which still displayed 8 bits, but translated into a more old-fashioned font. Looks like sci.astro.research is the only newsgroup actually restricted to 7 bits. I wonder why that is? Chalky
|
| | |
Date: 12 Nov 2006 14:34:16
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <1163328389.468587.34850@h48g2000cwc.googlegroups.com>, chalkyspam@bleachboys.co.uk wrote: > Chalky wrote: > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. That wiki is wrong. It is using a common marketing/popular computing misuse of the term. Which wiki was it and which article? The current edit of the English version of Wikipedia seems to correctly define it as seven bit: <http://en.wikipedia.org/w/index.php?title=ASCII&oldid=87321585>. ASCII is the US variant of ISO 646 (I think it is the same as the reference variant), which is a seven bit code. As indicated by the charset parameter of the Content-Type header, your article is not in ASCII: > Content-Type: text/plain; charset="iso-8859-1" Which indicates that it is in the eight bit code ISO 8859/1, which contains ISO 646 (reference variant) as a subset. You were not posting in ASCII. One of your articles was, however, in "just send eight" format, which, while it generally passes through Usenet OK, because most of Usenet is 8 bit clean, leaves the receiving newsreader to guess what character set was actually intended, and is therefore invalid. In Chinese-speaking parts of the world, just send 8 content is likely to be in GB2312 or Big 5, not ISO 8859/1. Even in the West, it is quite likely to be Windows-1252, a variant of ISO 8859/1 in which 32 control characters are replaced by extra graphics, but it could be the Mac version, instead. > > This can result in considerable garbling of 8-bit ASCII text. There is no such thing as 8 bit ASCII text. > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > correctly at sci.physics.research, I am curious to discover how endemic > > this restriction of the ASCII set actually still is, in the Usenet > > groups. 
For all except moderated newsgroups, and I don't think your test post would have been passed by a moderator, this is purely a function of the newsreaders or other user agents used. 
For a cross posted article, only one copy is ever transmitted, so any corruption would apply to all newsgroups. (In your case, the user agent seems to be the Google Groups NNTP to HTML/HTTP gateway.) > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > -, and for copyright, below > > > > =B0 =B1 =A9 Actually, as well as using ISO 8859/1, you used quoted printable, and therefore actually only sent 7 bits in all except the just send eight case. > > Chalky > Interesting. All groups display correctly except > sci.physics.relativity, which still displayed 8 bits, but translated > into a more old-fashioned font. Fonts are purely a user agent issue. As I said, unless you post to a moderated group, in which case it is the moderator's computer system that determines what happens to things other than pure ASCII, this is an issue with your newsreader, because the same copy of the article is used for all unmoderated groups in a cross-posting. > Looks like sci.astro.research is the only newsgroup actually restricted > to 7 bits. It looks like sci.astro.research *is* moderated ("m" flag): 200 news.demon.co.uk InterNetNews NNRP server INN 2.4.1 ready (posting ok). list active sci.astro.research 215 Newsgroups in form "group high low flags". sci.astro.research 0000004703 0000003871 m . Maybe it is auto-moderated, which is why the test got through. Your problem is with the email system used by the moderator. It seems to be resolving the quoted printable coding to 8 bit, but then losing the MIME information when re-submitting the approved version. It's generally best to restrict yourself to the proper definition of ASCII unless you are in a restricted community, typically a language community, or the material can't be satisfactorily represented in ASCII. 
For maths, there is only a narrow band in which this is valid, as one soon reaches a point where one needs to use TeX, troff's eqn, or MathML, which would normally be treated as binaries on Usenet, so are best placed on a web site.
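The two layers described in this post, the quoted-printable transfer encoding and the charset label, can be illustrated with a minimal Python sketch. It uses the standard `quopri` module and the `=B0 =B1 =A9` line from the test post. The list of alternative charsets is an assumption, chosen only to show what a newsreader might guess when no charset is declared.

```python
import quopri

# The line as it travelled over Usenet: nothing but 7-bit ASCII characters.
wire = b"=B0 =B1 =A9"
assert all(byte < 128 for byte in wire)

# Undoing the quoted-printable transfer encoding yields three 8-bit bytes.
raw = quopri.decodestring(wire)

# The charset label from the Content-Type header says how to read them.
declared = raw.decode("iso-8859-1")

# With no label ("just send eight"), the reader has to guess.
# These candidate charsets are illustrative assumptions only.
guesses = {
    name: raw.decode(name)
    for name in ("iso-8859-1", "cp1252", "mac_roman", "cp437")
}

if __name__ == "__main__":
    print("declared:", declared)
    for name, text in guesses.items():
        print(f"{name:>10}: {text}")
```

Running it prints the same three bytes rendered differently under some of the guessed charsets, which is the garbling a missing or lost MIME label would be expected to produce.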
|
| | |
Date: 12 Nov 2006 13:32:03
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163328389.468587.34850@h48g2000cwc.googlegroups.com... Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. > > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). > > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below > > ° ± © > > Chalky Interesting. All groups display correctly except sci.physics.relativity, which still displayed 8 bits, but translated into a more old-fashioned font. Looks like sci.astro.research is the only newsgroup actually restricted to 7 bits. I wonder why that is? Chalky Returned from sci.physics, also absent auto-indent. Androcles
|
| | |
Date: 12 Nov 2006 13:26:14
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163328389.468587.34850@h48g2000cwc.googlegroups.com... Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. > > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). > > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below > > ° ± © > > Chalky Interesting. All groups display correctly except sci.physics.relativity, which still displayed 8 bits, but translated into a more old-fashioned font. Looks like sci.astro.research is the only newsgroup actually restricted to 7 bits. I wonder why that is? Chalky Returned from sci.physics.relativity, absent auto-indent.
|
| | | |
Date: 12 Nov 2006 13:58:06
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Sorcerer" <Headmaster@hogwarts.physics_e > wrote in message news:WhF5h.157081$lT5.7614@fe2.news.blueyonder.co.uk...
|
| | |
Date: 12 Nov 2006 20:03:11
From: Pat O'Connell
Subject: Re: Restricted ASCII?
|
Chalky wrote: > Chalky wrote: > >> ASCII is defined in wiki as an 8-bit system, developed from telegraphic >> codes, to allow for 256 characters. >> >> However, I have noticed that this set is truncated to 7 characters at >> sci.astro.research to conform to its first commercial use as a >> seven-bit teleprinter code (1963). >> >> This can result in considerable garbling of 8-bit ASCII text. >> >> Since I am pretty sure I have seen Schrodinger's equation spelled >> correctly at sci.physics.research, I am curious to discover how endemic >> this restriction of the ASCII set actually still is, in the Usenet >> groups. >> >> Towards this end, I have pasted in the 8 bit ASCII for degrees, for + >> -, and for copyright, below >> >> ° ± © >> >> Chalky > > Interesting. All groups display correctly except > sci.physics.relativity, which still displayed 8 bits, but translated > into a more old-fashioned font. > > Looks like sci.astro.research is the only newsgroup actually restricted > to 7 bits. Only characters 0 through 127 have been standardized as part of ASCII. Each computer operating system (for instance DOS, Windows, Mac, and VMS) displays its own symbol set for 128 through 255. -- Pat O'Connell [note munged EMail address] Take nothing but pictures, Leave nothing but footprints, Kill nothing but vandals...
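A small Python sketch of that point: the same byte value decoded under several code pages. The codec names are Python's, picked here as stand-ins for the platforms mentioned (an assumption; VMS is left out because Python ships no codec for it).

```python
# Python codec names used as stand-ins for the platforms named in the post.
# This mapping is an illustrative assumption, not an exhaustive list.
CODE_PAGES = {
    "DOS (cp437)": "cp437",
    "Windows (cp1252)": "cp1252",
    "Mac (mac_roman)": "mac_roman",
    "ISO 8859-1": "iso-8859-1",
}


def render(byte_value):
    """Decode one byte value under each code page and return the results."""
    raw = bytes([byte_value])
    return {
        label: raw.decode(codec, errors="replace")
        for label, codec in CODE_PAGES.items()
    }


if __name__ == "__main__":
    # 0x41 is inside ASCII; 0xB0 is outside it.
    for value in (0x41, 0xB0):
        print(hex(value), render(value))
```

Values below 128 come out the same everywhere in this set, while values from 128 upward depend on which table the reader applies.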
|
| | | |
Date: 13 Nov 2006 15:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
An interesting history of character codes, from Morse Codes through Baudot Code to ASCII-1967 can be found here: http://www.wps.com/projects/codes/ The author says, on that page: # ASCII is and always was a seven bit code. I am shocked at the number of # people and sources that claim it to be an 8-bit code. There are only 128 # character codes in ASCII. # Many of the extentions to ASCII are 8 bits, but they are not ASCII. -- ---------------------------------------------------------------- Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
|
| | | |
Date: 14 Nov 2006 00:12:38
From: Ben Rudiak-Gould
Subject: Re: Restricted ASCII?
|
Pat O'Connell wrote: >Each computer operating system (for instance DOS, Windows, Mac, and VMS) >displays its own symbol set for 128 through 255. Every *localized version* of every OS has a *default* character encoding that many tools use in the absence of other encoding information. The local default makes no sense (on either end) for documents that are being sent between computers, like email and Usenet messages and HTML pages. Therefore you should always specify the encoding of such a message explicitly within the message itself, which makes the local default irrelevant. -- Ben
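As a rough sketch of what "specify the encoding explicitly" looks like in practice, the following uses Python's standard `email` package to build a message whose headers carry the charset and transfer encoding. The helper name `build_article` and the subject line are illustrative assumptions, and a real news posting would need further headers.

```python
from email.message import EmailMessage


def build_article(body):
    """Build a message that declares its own charset and transfer encoding."""
    msg = EmailMessage()
    msg["Subject"] = "Re: Restricted ASCII?"
    # The charset label travels with the message, so the reader's local
    # default encoding never comes into play.
    msg.set_content(body, charset="iso-8859-1", cte="quoted-printable")
    return msg


article = build_article("\u00b0 \u00b1 \u00a9")

if __name__ == "__main__":
    print(article.as_string())
```

The printed headers include a Content-Type with the charset and a Content-Transfer-Encoding line, and the body itself is plain 7-bit text.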
|
| |
Date: 13 Nov 2006 05:25:52
From: Matt Giwer
Subject: Re: Restricted ASCII?
|
Chalky wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. > > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). > > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > -, and for copyright, below It has been so long I doubt I remember enough to do this subject the injustice it deserves. First there were others for different languages such as UKSCII whose main difference was the pound symbol instead of the $. This # is an octothorpe, not a pound sign. Those are just two of the SCIIs that were around. Anyone who wants to tell me I am full of it and have it wrong please feel free. It has been over 20 years since I read it. The "it" was a Ma Bell handbook on the oxymoronic RS-232 standard. It was mostly a recounting of the different ways the "standard" was implemented. It started as a standard (Ma Bell rented what was used on its lines) with 26 uppercase, 10 numbers, punctuation, printable characters like / () $ and control characters, line feed, page feed, tab and such for teletypes. Those are those huge chuncka-chuncka machines that some of us were lucky enough to have to learn programming on instead of punchcards. At that time it was six bits plus one parity bit, 64 total possibilities for seven bits total. For teletypes the writer created a paper tape version of his article which was then transmitted over the expensive long distance line to save costs. That is why the old movies show them printing so fast. 
Also it was adapted to papertape programming to replace punchcards which started to control looms. When teletypes were replaced with machines that could do lower case, it extended to seven bits plus parity, and other printable characters were defined, such as []{}, but a single standard for all the 128 possible codes never did develop. That is likely because it was not around long enough for a single manufacturer to dominate the market and Ma Bell was gone by then. In the early days of PCs there was a short time with 6+1 bits, uppercase only, followed by 7+1 bits. With the development of home modems and doing it affordably it was not possible for many years to drop the parity bit. Also relaying data through some machines (IBM mainframes mostly) required limiting the data to 6+1 for the lack of standardization of the full 128-character charset and some of their legacy mainframes were never upgraded to 128. As the gods would have it phone lines got better and modems included error correcting code and had the full 8+2 bits for error correction. In any event the computer no longer needed to deal with the parity issue and the full 8 bits could be used for display characters. Given the lack of standardization of all the extras in the lower 128 it is not surprising the upper 128 were wildcards. I think it was Apple ][ upgrade chipset or maybe the ill-conceived III which first used them for international characters. Atari simply made them the inverse of their choices for the lower 128. This was before displays went graphic as in Windows and MS was on DOS 7. As to what all is out there today, you tell me. From what I see occasionally UNICODE is not much of a match for the snail as far as progress goes. I can see all kinds of reasons for that given all the "alphabets" around. It is worth looking into if you want to see why progress is slow. For example some Arabic alphabets have parts of letters "underlining" other letters. 
Another font I saw requires the ability to embed letters within other letters like putting a lowercase e inside an uppercase L. -- Hodie pridie Idus Octobres MMVI est -- The Ferric Webceasar nizkor http://www.giwersworld.org/nizkook/nizkook.phtml Iraqi democracy http://www.giwersworld.org/911/armless.phtml a3
|
| | |
Date: 13 Nov 2006 01:10:12
From: Paul
Subject: Re: Restricted ASCII?
|
Matt Giwer wrote: > Chalky wrote: > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > > > However, I have noticed that this set is truncated to 7 characters at > > sci.astro.research to conform to its first commercial use as a > > seven-bit teleprinter code (1963). > > > > This can result in considerable garbling of 8-bit ASCII text. > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > correctly at sci.physics.research, I am curious to discover how endemic > > this restriction of the ASCII set actually still is, in the Usenet > > groups. > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > -, and for copyright, below > > It has been so long I doubt I remember enough to do this subject the injustice > it deserves. Very funny! Matt I love the way you write. Yermiah
|
| |
Date: 12 Nov 2006 21:46:06
From: Chalky
Subject: Restricted ASCII? The final test
|
Thanks too to Sorcerer (Androcles), and George Dishman for your collective constructive feedback. It seems that you probably had a secondary display problem but I didn't, because you have Usenet postings e-mailed to you. I just go to the website to read what I am interested in, and, when I respond, I do so via form submission, so there is no e-mail protocol involved, my side. Your resultant problem might have been because I originally pasted in the displayed characters which sprang to life after I had typed in the decimal translation of the machine code for those characters. Consequently, for a final test, I am simply typing in the decimal translations of the machine codes for the Japanese Yen, the registered trade mark, and the Euro, encased in the beautifully symmetric Spanish version of the question mark, using the HTML identifiers &, #, and ; below: ¿ ¥ ® ? Let me know what you see. Chalky
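For readers who want to see how decimal character references of this kind resolve, here is a minimal Python sketch using the standard `html` module. The exact numbers typed in the post are not visible in the archive, so the references below are assumptions: the ISO 10646 values for the inverted question mark, Yen, registered sign, and Euro, plus `&#128;` as an example of a value in the 128 to 159 range.

```python
import html

# Assumed references: the archive does not preserve what was actually typed.
typed = "&#191; &#165; &#174; &#8364; ?"
shown = html.unescape(typed)

# 128 is not a displayable ISO 10646 character. html.unescape follows the
# HTML5 error-recovery table, which reads it as the Windows-1252 Euro sign.
legacy = html.unescape("&#128;")

if __name__ == "__main__":
    print(shown)
    print(repr(legacy))
```

The second result shows the compatibility behaviour in action: a reference in the 128 to 159 range is quietly mapped to the Windows-1252 character rather than displayed as what the number strictly denotes.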
|
| |
Date: 12 Nov 2006 20:59:20
From: Chalky
Subject: Re: Restricted ASCII?
|
Greg Hennessy wrote: > The facts are you claimed ascii was 8 bit. It isn't. Wiki defines it > as 7 bit. And, of course, Wiki is always infallible, is it? > There is an 8 bit code called EBCDIC, which is mostly used in IBM > mainframes, but that has nothing to do with ASCII. I have no argument with that. So please now explain why, less than 24 hours ago, infallible Wiki connected EBCDIC and ASCII together, via the qualifying phrase, a 8-bit system that would allow for 256 characters, > You were confused. If so, it looks like Wiki was too. See also Wiki ref to US-ASCII, and my reference to the server side instruction Asc(), which definitely returns the results of all 8 bits in the byte. Cheers It has been fun. Chalky
|
| | |
Date: 13 Nov 2006 15:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <1163393960.821344.305840@m73g2000cwd.googlegroups.com >, Chalky <chalkyspam@bleachboys.co.uk > wrote: > >Greg Hennessy wrote: > >> The facts are you claimed ascii was 8 bit. It isn't. Wiki defines it >> as 7 bit. > >And, of course, Wiki is always infallible, is it? Wiki is no more infallible than its authors of course. But he claimed wiki said ASCII was 8-bit, and wiki never said that. Btw wiki is correct in this particular case: ASCII *is* a 7-bit code. >> There is an 8 bit code called EBCDIC, which is mostly used in IBM >> mainframes, but that has nothing to do with ASCII. > >I have no argument with that. So please now explain why, less than 24 >hours ago, infallible Wiki connected EBCDIC and ASCII together, via the >qualifying phrase, a 8-bit system that would allow for 256 characters, EBCDIC predated ASCII, and was of course considered when the 5-bit Baudot code evolved into ASCII. However, early versions of EBCDIC had "holes" in its character table for each byte where both nibbles weren't in the range 0 to 9. Which is natural, since EBCDIC was an evolution of the earlier BCD codes: BCD = Binary Coded Decimal EBCDIC = Extended Binary Coded Decimal Interchange Code ASCII used the bit space more efficiently, using all possible bit combinations. >> You were confused. > >If so, it looks like Wiki was too. No - Wiki never claimed ASCII to be an 8-bit code. Re-read the older version of that page, and re-read it carefully.... >See also Wiki ref to US-ASCII, and my reference to the server side >instruction Asc(), which definitely returns the results of all 8 bits >in the byte. > >Cheers > >It has been fun. > >Chalky > -- ---------------------------------------------------------------- Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
|
| |
Date: 12 Nov 2006 20:19:34
From: Chalky
Subject: Re: Restricted ASCII?
|
David Woolley wrote: > In article <1163358882.886919.240610@m7g2000cwm.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > Greg Hennessy wrote: > > > > and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit > > > system that would allow for 256 characters, the". Via reconstruction, the exact Wiki sentence on the morning of my original posting, would have been: "Some time after the [[EBCDIC]] code, a 8-bit system that would allow for 256 characters, the ASCII developed from telegraphic codes and first entered commercial use as a seven-bit teleprinter code promoted by Bell data services in 1963." Given that the subsequently deleted qualification ", a 8-bit system that would allow for 256 characters, " is linguistically incorrect in its own right, that phrase could equally have referred to ASCII, EBCDIC, or both, within that sentence (as I had originally assumed). Thanks for helping to clear up that point of confusion. Perhaps such confusion could be avoided in the future by adopting the http://en.wikipedia.org/wiki/MIME method of referring to this explicitly restricted 7 bit instruction set as US-ASCII. David Woolley also wrote: > In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > The Server Side coding of http://1stlight.org/design/ascii.asp, > This URL produces a request to "Click Here AFTER inserting security key > into your computer", so is useless as a public reference. Yes, I did say in an earlier posting that this link is only currently accessible on the intRAnet version of that site. (Thanks for confirming that the relevant site security lock still works when that software is instead running on the intERnet copy of the server) > > specifically intructs any Windows NT 4, Windows 2000, or subsequent > > Microsoft server, to display the ascii symbols for all n from 1 to 255, > They are misusing the term ASCII. That's a very common mistake. 
> Probably what it actually does is to display the raw font encoding for > the current font. The relevant employed server side visual basic (ASP) coding is: a=Chr(n) Response.Write Asc(a) (for all n from 1 to 255) Perhaps you should now complain to Microsoft that they have not sawn off the most significant bit of the data byte in their Asc() instruction, to specifically restrict that server side scripting to only handling the US-ASCII subset. (Yes, I already know that not every possible permutation of 1s and 0s in that byte, results in a displayable graphic. This is equally true for the 7 bit subset, as for the 8 bit set.) Cheers It has been fun. Chalky
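A rough Python stand-in for the loop quoted above may help separate the two questions being argued: whether a byte value from 1 to 255 can be handled, and whether it is ASCII. This is not the ASP code itself; the function name `decodable` and the choice of codecs are assumptions made for illustration.

```python
def decodable(codec):
    """Return the byte values 1..255 that the given codec maps to a character."""
    accepted = []
    for n in range(1, 256):
        try:
            bytes([n]).decode(codec)
        except UnicodeDecodeError:
            continue  # this byte value has no meaning in the codec
        accepted.append(n)
    return accepted


ascii_values = decodable("ascii")
latin1_values = decodable("iso-8859-1")

if __name__ == "__main__":
    print("ascii accepts     ", ascii_values[0], "to", ascii_values[-1])
    print("iso-8859-1 accepts", latin1_values[0], "to", latin1_values[-1])
```

Under these assumptions, a strict ASCII decoder stops at 127, while an 8-bit charset such as ISO 8859-1 assigns something to every value. A loop that round-trips a number through a character and back demonstrates only that the platform's own 8-bit code page is in use.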
|
| | |
Date: 13 Nov 2006 00:09:40
From: Rich Townsend
Subject: Re: Restricted ASCII?
|
Chalky wrote: > David Woolley wrote: > >> In article <1163358882.886919.240610@m7g2000cwm.googlegroups.com>, >> chalkyspam@bleachboys.co.uk wrote: >>> Greg Hennessy wrote: >>>> and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit >>>> system that would allow for 256 characters, the". > > Via reconstruction, the exact Wiki sentence on the morning of my > original posting, would have been: "Some time after the [[EBCDIC]] > code, a 8-bit system that would allow for 256 characters, the ASCII > developed from telegraphic codes and first entered commercial use as a > seven-bit teleprinter code promoted by Bell data services in 1963." > > Given that the subsequenty deleted qualification ", a 8-bit system that > would allow for 256 characters, " is linguistically incorrect in its > own right, that phrase could equally have referred to ASCII, EBCDIC, or > both, within that sentence (as I had originally assumed). > > Thanks for helping to clear up that point of confusion. Perhaps such > confusion could be avoided in the future by adopting the > http://en.wikipedia.org/wiki/MIME method of referring to this > explicitly restricted 7 bit instruction set as US-ASCII. > > David Woolley also wrote: > >> In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com>, >> chalkyspam@bleachboys.co.uk wrote: > >>> The Server Side coding of http://1stlight.org/design/ascii.asp, > >> This URL produces a request to "Click Here AFTER inserting security key >> into your computer", so is useless as a public reference. > > Yes, I did say in an earlier posting that this link is only currently > accessible on the intRAnet version of that site. (Thanks for confirming > that the relevant site security lock still works when that software is > instead running on the intERnet copy of the server) > >>> specifically intructs any Windows NT 4, Windows 2000, or subsequent >>> Microsoft server, to display the ascii symbols for all n from 1 to 255, > >> They are misusing the term ASCII. 
That's a very common mistake. >> Probably what it actually does is to display the raw font encoding for >> the current font. > > The relevant employed server side visual basic (ASP) coding is: > > a=Chr(n) > Response.Write Asc(a) > (for all n from 1 to 255) > > Perhaps you should now complain to Microsoft that they have not sawn > off the most significant bit of the data byte in their Asc() > instruction, to specifically restrict that server side scripting to > only handling the US-ASCII subset. > No, ASCII is the proper designation for the 7-bit encoding -- the 'A' standing for 'American'. Anything with 8 bits just isn't ASCII, it's ISO 8859/1 or somesuch. How do I know this? I spent five months working in the standards department of a very large news company, and it was my business to know. cheers, Rich
|
| | | |
Date: 13 Nov 2006 10:44:36
From: Starlord
Subject: Re: Restricted ASCII?
|
I know my old Atari 800XL used ASCII and so does my Atari TT030. But this has nothing to do with Astronomy or telescopes. -- The Lone Sidewalk Astronomer of Rosamond Telescope Buyers FAQ http://home.inreach.com/starlord Sidewalk Astronomy www.sidewalkastronomy.info The Church of Eternity http://home.inreach.com/starlord/church/Eternity.html "Rich Townsend" <rhdt@barVOIDtol.udel.edu > wrote in message news:ej8umk$nbg$1@scrotar.nss.udel.edu...
|
| |
Date: 12 Nov 2006 20:25:03
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <1163358882.886919.240610@m7g2000cwm.googlegroups.com>, chalkyspam@bleachboys.co.uk wrote: > Greg Hennessy wrote: > > and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit > > system that would allow for 256 characters, the". > Precisely. This is exactly what I copied this morning, followed by It says nothing about the nature of ASCII. EBCDIC is a proprietary, IBM, character code, which has a certain relationship to punched card codes. (Punched cards have 12 potential holes in each column. With the normal codes exactly one of these is punched for each digit from 0 to 9 (somewhat desirable with early manual card punches, where you had to push a key for each hole - in my time only used for making corrections). Uppercase characters were coded by punching one, or both, of the remaining rows. EBCDIC reflects this structure by coding the 0 to 9 punching into the low order four bits, with the result that character codes were not contiguous. I suspect this was done because it simplified the electronics used in the card readers.) I did see this edit, but discarded it because it was clearly an irrelevant side comment. In fact the edit history makes it clear that the reason for this change was that it misrepresented the time relation between the creation of the two different codes. (I actually read the edit comments before looking at the actual edits.) > negligible linguistic modification, which made ABSOLUTELY no difference > to the meaning of the (then) wiki ref., under History. Obviously it makes no significant difference because it's a comment about EBCDIC in an article about ASCII. > Thank you for this objective confirmation. It confirms that there was a change today and that change did not have any relevance, except to someone who had completely misunderstood the original. I would definitely have been speaking out of my mouth if I hadn't typed this without speaking. 
(EBCDIC has some relevance to USENET as newsreaders on EBCDIC based machines cannot assume that their character code is identical to ASCII in the first 128 code - it is very different.) EBCDIC = Extended Binary Coded Decimal Interchange Code
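The non-contiguous letter codes can be seen directly with a short Python sketch. It uses the `cp037` codec, which is one common EBCDIC variant; taking it as representative is an assumption, and the helper names are illustrative.

```python
import string


def code_points(codec):
    """Return the single-byte code of each uppercase letter A..Z."""
    return [letter.encode(codec)[0] for letter in string.ascii_uppercase]


def gaps(values):
    """Return the letters whose code does not directly follow the previous one."""
    return [
        string.ascii_uppercase[i + 1]
        for i in range(len(values) - 1)
        if values[i + 1] != values[i] + 1
    ]


ascii_codes = code_points("ascii")
ebcdic_codes = code_points("cp037")  # assumed representative EBCDIC variant

if __name__ == "__main__":
    print("ASCII gaps: ", gaps(ascii_codes))
    print("EBCDIC gaps:", gaps(ebcdic_codes))
    print("EBCDIC low nibbles:", sorted({code & 0x0F for code in ebcdic_codes}))
```

The low four bits of every letter stay within 1 to 9, mirroring the digit rows of the punched card, and the code sequence breaks where the high bits change.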
|
| |
Date: 12 Nov 2006 11:18:46
From: Chalky
Subject: Re: Restricted ASCII?
|
David Woolley wrote: > In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > > Yes. That is copied from the CHANGED Wiki reference, which _postdates_ > > my first posting of today. > > I've gone through the edit history and no edit since, at the latest, > November 7th, has made such a change. Even if the page had been vandalised, > the change would still be in the edit history. The only time an edit > would be taken out of the history is if it would be illegal, or at > least legally unsafe, to keep it. The sort of change we are discussing > here, by no means, fits that category (unless the same edit introduced > legally unsafe material, which would only really occur for vandalism > with this article). > > Note that Wiki is a generic term. What we seem to be, actually, talking > about is the English version of Wikipedia. > > > I disagree with that conveniently changed reference of this afternoon, > > anyway. > > The current reference (spot version URL given earlier) is correct > in the area under discussion and hasn't significantly changed in the > last few days. > > > The Server Side coding of http://1stlight.org/design/ascii.asp, > > This URL produces a request to "Click Here AFTER inserting security key > into your computer", so is useless as a public reference. > > > specifically intructs any Windows NT 4, Windows 2000, or subsequent > > Microsoft server, to display the ascii symbols for all n from 1 to 255, > > They are misusing the term ASCII. That's a very common mistake. > Probably what it actually does is to display the raw font encoding for > the current font. Most Windows fonts in the UK are either Unicode or > Windows-1252 coded, both of which have the ASCII and ISO 8859/1 graphics > as a subset. > > In particular, ASCII code points 0 to 31 are not displayable, although > a few of them have an effect on formatting. Nor is ASCII code point 127. 
> 128 through 255, as repeatedly stated, are not in ASCII, and even with the > other non-proprietary codes (ISO 8859/* and ISO 10646 (~Unicode)) code > points 128 through 159 are not displayable graphics. > > > in sequence. Neither the server nor any known browser has any > > difficulty in doing so. This works with both server side and client > > side scripting, and has done so since the last century. > > There are two ways of specifying characters to a browser, one is by > literally including the character code in the data stream. In that case, > it should be interpreted in the context of the charset parameter in the > Content-Type and whether or not a character even exists will depend on > the character set specified. The other is to provide it using numeric > entitities ( etc.) (or named ones referencing them) or through > scripting, in which case the characters should be in the HTML native > character set, which is ISO 8859/1 for versions before HTML 4.0 and > ISO 10646 for later versions. In both these cases, codes 128 through > 159 are absolutely forbidden, as is 127 and most of the range from 0 > to 31. If the browser displays these characters, it is doing so for > error recovery reasons, or for compatibility with early browsers that > were rather sloppy in their character code handling (in particular, > they tended to display the current platform code page character, rather > than the correct one defined by the standard - in the USA and UK, this > tended to produce the same result, for conforming characters). > > > I doubt that too. As suggested by Edward Green, this seems to be a bug > > introduced by the sci.astro.research moderator's software/interface. As > > That seems to be the case. Strictly speaking, the use of MIME > has never been standardised on USENET, so anything that doesn't use > ASCII is non-standard. 
However, the de facto situation is that MIME > using quoted-printable or base64 works with modern news readers and MIME > using 8bit works most of the time. > > In the *.research moderation case, it does seem that the MIME encoding > is being undone, which suggests that the system as a whole (which > might include Google) is broken because it is partially MIME aware. > Things ought to work OK if the software is fully MIME aware or, like > the USENET backbone, totally unaware of MIME. > > > I have already pointed out, 8 bit info works fine at > > sci.physics.research, and in every other usenet group tried. > > Your examples, mostly, have not sent 8 bit characters over USENET. They > have used quoted-printable encoding, in which bytes that cannot be > represented directly using ASCII are sent as three ASCII characters, > "=" and two hexadecimal digits. "=" followed by space, and "=" at the > end of the line also have special meanings, and some ASCII characters also > have to be coded using "=" and hex digits, including, of course, "=" > itself. (I say bytes, rather than characters, because the modern trend, > particularly for email and web pages, is to move to the use of the > UTF-8 encoding (or sometimes UTF-7) of ISO 10646, which is a variable > length code. Quoted printable encodes the component bytes, not the > whole character. UTF-7 reduces the problem, because it only uses > bytes which are printable in ASCII, and possibly common control > characters.) > > As far as the USENET backbone is concerned, the result is pure ASCII, and > that is what it transmits. Only when the article is passed to a user > agent (e.g. Outlook Express) or gatewayed to another protocol (e.g. email > for the moderation process or HTTP/HTML for Google Groups) is the MIME > encoding detected and resolved. Some articles are actually sent raw 8 bit. 
> These generally also work across the USENET backbone as USENET has > generally been carried by 8 bit clean protocols (unlike email), and > there have been few, if any, IBM mainframe based USENET systems, using > EBCDIC (an 8 bit code that is not based on ASCII, and the most likely > ASCII-incompatible code to have been encoutered in recent systems). > > Problems with quoting with MIME encoded material may be to do with the > way that GUI mail and news user agents normally mis-use MIME to try and > send reflowable paragraphs. See my response to Greg Hennessy. You are talking out of your arse. C
|
| |
Date: 12 Nov 2006 11:14:42
From: Chalky
Subject: Re: Restricted ASCII?
|
Greg Hennessy wrote: > >> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > >> > codes, to allow for 256 characters. > >> > >> Nope. Ascii is 7 bit. > >> http://en.wikipedia.org/wiki/ASCII > > > > Check out my response to George. This definition was changed TODAY at > > Wiki > > Not according to the WIKI history logs. There is one change listed for > today, by Chris Chittleborough, who changed "similar to" into "like" > and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit > system that would allow for 256 characters, the". Precisely. This is exactly what I copied this morning, followed by negligible linguistic modification, which made ABSOLUTELY no difference to the meaning of the (then) wiki ref., under History. Thank you for this objective confirmation. Chalky
|
| | |
Date: 12 Nov 2006 19:24:39
From: Greg Hennessy
Subject: Re: Restricted ASCII?
|
>> >> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic >> >> > codes, to allow for 256 characters. >> >> >> >> Nope. Ascii is 7 bit. >> >> http://en.wikipedia.org/wiki/ASCII >> > >> > Check out my response to George. This definition was changed TODAY at >> > Wiki >> >> Not according to the WIKI history logs. There is one change listed for >> today, by Chris Chittleborough, who changed "similar to" into "like" >> and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit >> system that would allow for 256 characters, the". > > Precisely. This is exactly what I copied this morning, followed by > negligible linguistic modification, which made ABSOLUTELY no difference > to the meaning of the (then) wiki ref., under History. The facts are you claimed ascii was 8 bit. It isn't. Wiki defines it as 7 bit. There is an 8 bit code called EBCDIC, which is mostly used in IBM mainframes, but that has nothing to do with ASCII. You were confused. > Thank you for this objective confirmation. I am not confirming you. I am proving you wrong.
|
| |
Date: 12 Nov 2006 18:47:18
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <1163345440.980996.198470@m7g2000cwm.googlegroups.com >, chalkyspam@bleachboys.co.uk wrote: > Yes. That is copied from the CHANGED Wiki reference, which _postdates_ > my first posting of today. I've gone through the edit history and no edit since, at the latest, November 7th, has made such a change. Even if the page had been vandalised, the change would still be in the edit history. The only time an edit would be taken out of the history is if it would be illegal, or at least legally unsafe, to keep it. The sort of change we are discussing here, by no means, fits that category (unless the same edit introduced legally unsafe material, which would only really occur for vandalism with this article). Note that Wiki is a generic term. What we seem to be, actually, talking about is the English version of Wikipedia. > I disagree with that conveniently changed reference of this afternoon, > anyway. The current reference (spot version URL given earlier) is correct in the area under discussion and hasn't significantly changed in the last few days. > The Server Side coding of http://1stlight.org/design/ascii.asp, This URL produces a request to "Click Here AFTER inserting security key into your computer", so is useless as a public reference. > specifically intructs any Windows NT 4, Windows 2000, or subsequent > Microsoft server, to display the ascii symbols for all n from 1 to 255, They are misusing the term ASCII. That's a very common mistake. Probably what it actually does is to display the raw font encoding for the current font. Most Windows fonts in the UK are either Unicode or Windows-1252 coded, both of which have the ASCII and ISO 8859/1 graphics as a subset. In particular, ASCII code points 0 to 31 are not displayable, although a few of them have an effect on formatting. Nor is ASCII code point 127. 
128 through 255, as repeatedly stated, are not in ASCII, and even with the other non-proprietary codes (ISO 8859/* and ISO 10646 (~Unicode)) code points 128 through 159 are not displayable graphics. > in sequence. Neither the server nor any known browser has any > difficulty in doing so. This works with both server side and client > side scripting, and has done so since the last century. There are two ways of specifying characters to a browser, one is by literally including the character code in the data stream. In that case, it should be interpreted in the context of the charset parameter in the Content-Type and whether or not a character even exists will depend on the character set specified. The other is to provide it using numeric entitities ( etc.) (or named ones referencing them) or through scripting, in which case the characters should be in the HTML native character set, which is ISO 8859/1 for versions before HTML 4.0 and ISO 10646 for later versions. In both these cases, codes 128 through 159 are absolutely forbidden, as is 127 and most of the range from 0 to 31. If the browser displays these characters, it is doing so for error recovery reasons, or for compatibility with early browsers that were rather sloppy in their character code handling (in particular, they tended to display the current platform code page character, rather than the correct one defined by the standard - in the USA and UK, this tended to produce the same result, for conforming characters). > I doubt that too. As suggested by Edward Green, this seems to be a bug > introduced by the sci.astro.research moderator's software/interface. As That seems to be the case. Strictly speaking, the use of MIME has never been standardised on USENET, so anything that doesn't use ASCII is non-standard. However, the de facto situation is that MIME using quoted-printable or base64 works with modern news readers and MIME using 8bit works most of the time. 
In the *.research moderation case, it does seem that the MIME encoding is being undone, which suggests that the system as a whole (which might include Google) is broken because it is partially MIME aware. Things ought to work OK if the software is fully MIME aware or, like the USENET backbone, totally unaware of MIME. > I have already pointed out, 8 bit info works fine at > sci.physics.research, and in every other usenet group tried. Your examples, mostly, have not sent 8 bit characters over USENET. They have used quoted-printable encoding, in which bytes that cannot be represented directly using ASCII are sent as three ASCII characters, "=" and two hexadecimal digits. "=" followed by space, and "=" at the end of the line also have special meanings, and some ASCII characters also have to be coded using "=" and hex digits, including, of course, "=" itself. (I say bytes, rather than characters, because the modern trend, particularly for email and web pages, is to move to the use of the UTF-8 encoding (or sometimes UTF-7) of ISO 10646, which is a variable length code. Quoted printable encodes the component bytes, not the whole character. UTF-7 reduces the problem, because it only uses bytes which are printable in ASCII, and possibly common control characters.) As far as the USENET backbone is concerned, the result is pure ASCII, and that is what it transmits. Only when the article is passed to a user agent (e.g. Outlook Express) or gatewayed to another protocol (e.g. email for the moderation process or HTTP/HTML for Google Groups) is the MIME encoding detected and resolved. Some articles are actually sent raw 8 bit. These generally also work across the USENET backbone as USENET has generally been carried by 8 bit clean protocols (unlike email), and there have been few, if any, IBM mainframe based USENET systems, using EBCDIC (an 8 bit code that is not based on ASCII, and the most likely ASCII-incompatible code to have been encoutered in recent systems). 
Problems with quoting with MIME encoded material may be to do with the way that GUI mail and news user agents normally mis-use MIME to try and send reflowable paragraphs.
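The quoted-printable behaviour described above can be tried out with Python's standard `quopri` module. This is only a minimal sketch: the variable names are illustrative, and the sample bytes are the degree, plus-minus and copyright characters from the original test post, assumed here to be ISO 8859/1 encoded.

```python
import quopri

# The three test characters from the thread (degree, plus-minus, copyright)
# as single bytes in ISO 8859/1: 0xB0, 0xB1, 0xA9.
latin1_bytes = "\u00b0 \u00b1 \u00a9".encode("iso-8859-1")

# Quoted-printable turns each non-ASCII byte into "=" plus two hex digits.
qp_latin1 = quopri.encodestring(latin1_bytes)
print(qp_latin1)

# "=" itself has to be escaped as well, otherwise it would be ambiguous.
qp_equals = quopri.encodestring(b"a=b")
print(qp_equals)

# In UTF-8 the pound sign is two bytes, so it becomes two "=XX" groups:
# the encoding works on bytes, not on whole characters.
qp_utf8 = quopri.encodestring("\u00a3".encode("utf-8"))
print(qp_utf8)

# The encoded form contains only 7-bit bytes, and decoding restores the input.
assert all(byte < 128 for byte in qp_latin1)
assert quopri.decodestring(qp_latin1) == latin1_bytes
```

Anything that only ever sees the encoded form is handling plain ASCII, which is why a MIME-unaware transport passes it through unchanged.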
|
| | |
Date: 13 Nov 2006 05:33:44
From: Jeff…Relf
Subject: " ASCII Art " ( e.g. tables ), 80 columns, monospaced, unwrapped.
|
Hi David_Woolley, What newsreader are you using ? " ASCII Art " ( e.g. tables ), 80 columns, monospaced, unwrapped, should be the standard, I maintain... even on cell phones. As far as the proper Encoding/Charset/etc. to use ( when posting ), I say, " If Google displays it correctly, then it's correct. " But FireFox ( my browser ) also effects the way posts look. For example, my userContent.CSS prevents line wrapping, to wit: * { white-space: nowrap !important; } Cotse.NET/users/jeffrelf/userContent.CSS ( to see the change, restart FireFox and do a Cntrl-F5 [ refresh ] ) So, to wrap WikiPedia's insanely long lines ( shame on them ! ), I copy and paste from WikiPedia to MS_Word. My hand-rolled newsreader ( X.EXE, X.TXT, X.CPP ) puts all posts in a single ( complexly maintained, UTF-16 ) .TXT file; so all text is editable ( e.g. I can add new-lines, if need be ).
|
| |
Date: 12 Nov 2006 10:50:53
From: Chalky
Subject: Re: Restricted ASCII?
|
Tom Roberts wrote: > Chalky wrote: > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > > > However, I have noticed that this set is truncated to 7 characters at > > sci.astro.research to conform to its first commercial use as a > > seven-bit teleprinter code (1963). [...] > > This is not due to the newsgroup itself or to the underlying newsgroup > software. It is due to the newsgroup client used by individual people. > For instance, I use Firefox and your symbols come out fine, except in > moderated newsgroups. In moderated newsgroups it also depends on the > email system used (author->moderator) and on the client software used by > the moderator, because postings are processed via email and by the > moderator's client before being distributed. > > So this is almost surely due to either the email software or the > moderator of sci.astro.research using newsgroup software that truncates > the high bit. > > > Tom Roberts _Thank you._ This is highly consistent with my own analysis of the experiment, and of the resultant responses to my associated postings. Cheers Chalky
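For readers wondering what "truncates the high bit" would do in practice, here is a small sketch. It assumes the three test characters were sent as raw ISO 8859/1 bytes; the function name `strip_high_bit` is made up for illustration, and this is only one possible failure mode, not a diagnosis of what the moderation software actually did.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as a 7-bit-only channel would."""
    return bytes(byte & 0x7F for byte in data)


# Degree, plus-minus and copyright as ISO 8859/1 bytes: 0xB0, 0xB1, 0xA9.
original = "\u00b0 \u00b1 \u00a9".encode("iso-8859-1")
stripped = strip_high_bit(original)

# 0xB0 -> 0x30 ("0"), 0xB1 -> 0x31 ("1"), 0xA9 -> 0x29 (")")
print(stripped.decode("ascii"))  # 0 1 )

# Pure ASCII text is unaffected, which is why the damage is easy to miss.
assert strip_high_bit(b"Restricted ASCII?") == b"Restricted ASCII?"
```

The garbled result is still valid ASCII, so nothing downstream can tell that information was lost.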
|
| |
Date: 12 Nov 2006 10:35:48
From: Chalky
Subject: Re: Restricted ASCII?
|
Greg Hennessy wrote: > On 2006-11-12, Chalky <chalkyspam@bleachboys.co.uk> wrote: > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > Nope. Ascii is 7 bit. > http://en.wikipedia.org/wiki/ASCII Check out my response to George. This definition was changed TODAY at Wiki C
|
| | |
Date: 12 Nov 2006 19:01:01
From: Greg Hennessy
Subject: Re: Restricted ASCII?
|
>> > ASCII is defined in wiki as an 8-bit system, developed from telegraphic >> > codes, to allow for 256 characters. >> >> Nope. Ascii is 7 bit. >> http://en.wikipedia.org/wiki/ASCII > > Check out my response to George. This definition was changed TODAY at > Wiki Not according to the WIKI history logs. There is one change listed for today, by Chris Chittleborough, who changed "similar to" into "like" and deleted the phrase "Some time after the [[EBCDIC]] code, a 8-bit system that would allow for 256 characters, the".
|
| | |
Date: 13 Nov 2006 07:42:43
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <1163356548.517299.123720@m7g2000cwm.googlegroups.com >, Chalky <chalkyspam@bleachboys.co.uk > wrote: > Greg Hennessy wrote: > >> On 2006-11-12, Chalky <chalkyspam@bleachboys.co.uk> wrote: >>> ASCII is defined in wiki as an 8-bit system, developed from telegraphic >>> codes, to allow for 256 characters. >> >> Nope. Ascii is 7 bit. >> http://en.wikipedia.org/wiki/ASCII > > Check out my response to George. This definition was changed TODAY at > Wiki No, it wasn't! I checked a version of that wikipedia entry from a few days ago, and that version also said ASCII is a 7-bit code. Sorry.... -- ---------------------------------------------------------------- Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
|
| |
Date: 12 Nov 2006 10:32:12
From: Chalky
Subject: Re: Restricted ASCII?
|
I would like to say, first of all, that it is a pleasure to receive two responses from someone with adequate technical competence to actually teach me something (possibly?). David Woolley wrote: > In article <1163343127.057255.151280@h48g2000cwc.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > > I think it is unfortunate that we now appear to be restricted to that > > subset of ascii which is represented by a single keypress on a standard > > British or American keyboard. > > If you include shift and control modifiers, what you get on a standard > US keyboard is the whole of ASCII. Interesting. I will try that out soon, if I can get hold of a standard American keyboard, from my only (drinking partner) USAF employee, in my (parochial) village. > With the same caveat, what you get > on a standard British keyboard is a superset of ASCII, because old keyboards > support £ as a simple shifted character. How old are you talking about? British keyboards support the BritishPoundSterling symbol, without shift, but fail to support the Yankee Dollar symbol thus. Similarly, Spanish keyboards support the inverted question mark (necessary for correct punctuation in that language), and, by now, the Euro. (Additional examples will be apparent to the contemporary cosmopolitan European reader.) > As to American keyboards, I would have thought some parts of America > used keyboards optimised for Spanish or Portuguese. I would imagine so too. By American I meant WASP Northern America, which is where commercial computer technology first got off the ground, after British innovators such as Babbage and Turing? (not sure of latter spelling. [The AI guy who was instrumental in breaking the Enigma code in WW2]) prepared the groundwork, and then British venture capitalism failed to adequately exploit that lead (same old story again). > Incidentally, this one was identified as pure ASCII, but I've had > to convert to ISO 8859/1, in order to include £. Sure. 
With my cynic's hat on, I would suggest this could be interpreted as yet another example of American Imperialism (and associated stupidity). > I haven't used > ISO 8859/15, as that is relatively recent and there may be people > without support for it, either in the browser, or in fonts. > > > Content-Type: text/plain; charset="us-ascii" > > (Note the level of quoting is getting close to the level at which my > spam filter cuts in because the article is too long.) In that case, I hope this particular response "hits the spot". > PS. It looks like the article was rejected by the moderator on the > research newsgroup, and that they rejected it for the obvious reason > that it was off topic. Yes, I totally accept that. However, it is not off topic in the sense of ensuring the communication channel is left wide open. David Woolley subsequently wrote: > In article <1163328389.468587.34850@h48g2000cwc.googlegroups.com>, > chalkyspam@bleachboys.co.uk wrote: > > Chalky wrote: > > > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > > codes, to allow for 256 characters. > > That wiki is wrong. It is using a common marketing/popular > computing misuse of the term. Which wiki was it and > which article? The reference quoted by George Dishman. Morning as opposed to afternoon version (GMT) of today. > The current edit of the English version > of Wikipedia seems to correctly define it as seven bit: > <http://en.wikipedia.org/w/index.php?title=ASCII&oldid=87321585>. > > ASCII is the US variant of ISO 646 (I think it is the same as the > reference variant), which is a seven bit code. OK, if so, let me put my question slightly differently. Why is it that a European moderated sci.((astro)) research group (Moderators: Jonathan Thornberg [Germany] and Martin Hardcastle [Britain]) is enslaved to a US variant, whereas the U.S. moderated sci.((physics)) research group (Moderators: Igor Khavkine and Phillip Helbig), is not? 
> As indicated by the Content-Type header, your article is not in ASCII: And who wrote the Content-Type header program? A WASP American, no doubt, who was employed by Bill Gates (the _pretender_ to the throne of software perfection!). > > Content-Type: text/plain; charset="iso-8859-1" > ^^^^^^^^^^ > Which indicates that it is in the eight bit code ISO 8859/1, which contains > ISO 646 (reference variant) as a subset. You were not posting in ASCII. OK, so you are arguing here about a mere debatable technical definition, not about basic (and invariant) principles of information transfer protocol, as encapsulated in HyperText Markup Language (HTML), and its contained (and unambiguous) 8 bit interpretation of the character set. [snip nonsense] > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > > correctly at sci.physics.research, I am curious to discover how endemic > > > this restriction of the ASCII set actually still is, in the Usenet > > > groups. > > For all except moderated newsgroups, and I don't think your test post > would have been passed by a moderator, this is purely a function of the > newsreaders or other user agents used. For a cross posted article, > only one copy is ever transmitted, so any corruption would apply to all > newsgroups. (In your case, the user agent seems to be the Google > Groups nntp to HTML/HTTP gateway.) OK. That sounds like good info to me. > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > > -, and for copyright, below > > > > > > =B0 =B1 =A9 OK, that is a hexadecimal translation of the html coding: &#n; where n is anything from 0 to 255. Hex has always been a crap way of displaying information, and there is material in the prior published literature, to suggest that hex was only introduced as an obscuration technique to disguise the underlying simplicity of the 8 bit microprocessor instruction set. 
This is particularly evident when you compare the instruction sets of early 16 bit microprocessors, expressed in octal, versus the much cruder (but apparently more complex) instruction sets of more primitive 8 bit microprocessors, which were instead expressed in hexadecimal, for marketing purposes. > Actually, as well as using ISO 8859/1, you used quoted printable, and > therefore actually only sent 7 bits in all except the just send eight > case. David Woolley also wrote: > > Interesting. All groups display correctly except > > sci.physics.relativity, which still displayed 8 bits, but translated > > into a more old-fashioned font. > > Fonts are purely a user agent issue. Wrong. You can over-ride this, if you wish, using the <PRE ></PRE> HTML enclosers. There are other ways too. >As I said, unless you post to a > moderated group, in which case it is the moderator's computer system > that determines what happens to things other than pure ASCII, this is > an issue with your newsreader, because the same copy of the article is > used for all unmoderated groups in a cross-posting. > > > Looks like sci.astro.research is the only newsgroup actually restricted > > to 7 bits. [snip additional undecipherable material] > It's generally best to restrict yourself to the proper definition of > ASCII unless you are in a restricted community, You mean like the World Wide Web? > typically a language > community, You mean like HTML, CGI, ASP, php, and Javascript? > or the material can't be satifactorily represented in ASCII. > For maths, there is only a narrow band in which this is valid, as one > soon reaches a point where one needs to use TeX, troff's eqn, or MathML, > which would normally be treated as binaries on Usenet, so are best placed > on a web site. Absolutely. In view of such excessive restrictions, perhaps use of these newsgroups is now best limited to the provision of relevant hyperlinks. Chalky
|
| |
Date: 12 Nov 2006 17:28:12
From: Greg Hennessy
Subject: Re: Restricted ASCII?
|
On 2006-11-12, Chalky <chalkyspam@bleachboys.co.uk > wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. Nope. Ascii is 7 bit. http://en.wikipedia.org/wiki/ASCII ASCII was first published as a standard in 1967 and was last updated in 1986. It currently defines codes for 128 characters. 33 are non-printing, mostly obsolete control characters that affect how text is processed, and the other 95 printable characters are as follows (starting with the space character):
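The figures quoted above (128 codes, 33 non-printing, 95 printable starting with space) are easy to check. The sketch below assumes the usual split, with codes 0 to 31 and 127 counted as control characters; the variable names are illustrative only.

```python
# All 7-bit code points: 2**7 = 128 of them.
all_codes = range(2 ** 7)

# Control characters: 0 through 31, plus DEL (127).
control_codes = [c for c in all_codes if c < 32 or c == 127]

# Printable characters: space (32) through tilde (126).
printable_codes = [c for c in all_codes if 32 <= c <= 126]

print(len(all_codes), len(control_codes), len(printable_codes))  # 128 33 95

# The printable repertoire, starting with the space character.
printable_text = "".join(chr(c) for c in printable_codes)
print(printable_text)

# Every one of these fits in seven bits, so the eighth bit is never needed.
assert max(all_codes) == 0x7F
```

An 8-bit code, by contrast, would have 256 code points, which is the figure the disputed Wikipedia sentence attached to EBCDIC rather than to ASCII.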
|
| |
Date: 13 Nov 2006 02:54:10
From: Chalky
Subject: Re: Restricted ASCII?
|
Paul Schlyter wrote: > Sorry, but http://en.wikipedia.org/wiki/ASCII says that ASCII is 7-bit. > Please check the reference before pointing to it ! I did, and it was promptly changed, after my original posting on this subject (yesterday). However, if David Woolley's faith in the accuracy of Wiki logs of alterations is justified, we now know that the misleading text string read as: a 8-bit system that would allow for 256 characters, the ASCII developed from telegraphic codes and first entered commercial use as a seven-bit teleprinter code promoted by Bell data services in 1963. This was still posted up at Wiki about 24 hours ago, as a subsection of the first sentence expressed under "History". > That's ISO-8859-1, not ASCII..... :-) > > Check out: http://czyborra.com/ Thanks. Interesting. This appears to confirm that the 8 bit ISO-8859-1 standard has been the de facto standard for interpreting single byte character definitions in all HTML browsers since the last century. So my momentary recent concerns about cross computer compatibility (as a consequence of recent comments here), appear to be unfounded. I don't really care whether this is called ISO-8859-1 or ASCII, provided it works consistently across all client browser platforms, for all 8 bit coding of text. > The characters in the range 80-FF (hex) are often erroneously called > "extended ASCII" or even just "ASCII". Yes, this is what I understood by the term ASCII, prior to more informed feedback, during the discussion. If this places me amongst the ranks of the www proletariat, I don't really mind that either. :-) Your input was appreciated. Chalky
|
| | |
Date: 13 Nov 2006 15:13:17
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <1163415250.692641.297160@i42g2000cwa.googlegroups.com >, Chalky <chalkyspam@bleachboys.co.uk > wrote: > >Paul Schlyter wrote: > >> Sorry, but http://en.wikipedia.org/wiki/ASCII says that ASCII is 7-bit. >> Please check the reference before pointing to it ! > >I did, and it was promptly changed, after my original posting on this >subject (yesterday). However, if David Wooley's faith in the accuracy >of Wiki logs of alterations is justified, we now know that the >misleading text string read as: > >a 8-bit system that would allow for 256 characters, the ASCII developed >from telegraphic codes and first entered commercial use as a seven-bit >teleprinter code promoted by Bell data services in 1963. Please don't truncate this paragraph so it becomes misleading. If we include some of the words before, it will read: # History # # Some time after the EBCDIC code, a 8-bit system that would allow for # 256 characters, the ASCII developed from telegraphic codes and first # entered commercial use as a seven-bit teleprinter code promoted by Bell # data services in 1963 The 8-bit system referred to is EBCDIC, not ASCII. And EBCDIC is indeed an 8-bit character code, although it has "holes" here and there in its allocation of characters to codes. >This was still posted up at Wiki about 24 hours ago, as a subsection of >the first sentence expressed under "History". True, however the older version of that section never claimed ASCII to be an 8-bit character set...... :-) >> That's ISO-8859-1, not ASCII..... :-) >> >> Check out: http://czyborra.com/ > >Thanks. Interesting. This appears to confirm that the 8 bit ISO-8859-1 >standard has been the de facto standard for interpreting single byte >character definitions in all HTML browsers since the last century. That's probably region dependent. Remember that we also have ISO-Latin-2 to ISO-Latin-13, most of which are the preferred version of ISO-Latin in various parts of the world. 
And China, Japan and Korea preferred other character sets which included the Kanji and related characters. >So my momentary recent concerns about cross computer compatibility (as a >consequence of recent comments here), appear to be unfounded. I don't >really care whether this is called ISO-8859-1 or ASCII, provided it >works consistently across all client browser platforms, for all 8 bit >coding of text. Unfortunately, it's not quite that easy. If you write HTML, you'd better be aware of what character encoding you're using, or else your web pages may display incorrectly here and there. In western Europe and the US, the predominant encoding is of course ISO-8859-1 aka ISO-Latin-1, however UTF-8 is getting more and more common (UTF-8 is a popular 8-bit encoding of Unicode). And, as usual, Microsoft does things in its own ways. Instead of using proper ISO-Latin-1, Windows uses Win-Latin-1 (aka Windows Code Page 1252). Now, Win-Latin-1 and ISO-Latin-1 are very similar -- the only difference are the characters 0x80 to 0x9F: in ISO-Latin (all versions) 0x80 to 0x9F just duplicate the control characters 0x00 to 0x1F, while Win-Latin-1 puts additional printable characters in 0x80 to 0x9F. Quite often, people who write web pages on a Windows platform are really using Win-Latin-1, but they believe they're using ISO-Latin-1 and say so in the headers of their HTML files - as a result, their web pages may look weird here and there, when viewed on a web browser on a non-Windows computer. >> The characters in the range 80-FF (hex) are often erroneously called >> "extended ASCII" or even just "ASCII". > >Yes, this is what I unerstood by the term ASCII, prior to more informed >feedback, during the discussion. If this places me amongst the ranks of >the www proletariat, I don't really mind that either. :-) It's never too late to educate yourself.... :-) The problem with "extended ASCII" was that there were so many varieties of it. 
The Commodore PET had its own version (sometimes called PETSCII). The early IBM PC had another version, the early Mac a third version, etc. Not until ISO-Latin was defined did a convergence towards a common standard happen. On the PC side, the switch to ISO-Latin was made in Windows (at least almost - Win-Latin isn't quite ISO-Latin, but nearly so). On the Mac side I don't know when the switch happened, but I'm positive Mac OS-X uses ISO-Latin -- some of the later pre-OS-X versions of Mac OS may have used it too. However, the older character sets live on as backward compatibility - in Windows each time you open a console window (often incorrectly called a "DOS box" - it was a DOS box on Win-95/98/ME, but on Win-NT/2k/XP it's more than that). You can configure Windows console windows to work in ISO-Latin instead of CP-850, but if you do, nationalized versions of the standard Windows utilities may output text which looks weird, since these texts are still in CP-850. >Your input was appreciated. > > >Chalky -- ---------------------------------------------------------------- Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
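The Win-Latin-1 versus ISO-Latin-1 difference described in this post can be seen with Python's built-in codecs. This is a sketch only: it assumes Python's `cp1252` and `iso-8859-1` codecs as stand-ins for the two encodings, and the sample bytes were picked from the 0x80 to 0x9F range because Python's `cp1252` table defines them (a few bytes in that range are undefined there and would raise an error).

```python
import unicodedata

# A few bytes from the range where the two encodings disagree.
sample_bytes = (0x80, 0x85, 0x93, 0x99)

for value in sample_bytes:
    raw = bytes([value])
    as_windows = raw.decode("cp1252")      # printable character
    as_iso = raw.decode("iso-8859-1")      # C1 control character
    print(
        hex(value),
        repr(as_windows),
        unicodedata.category(as_windows),
        repr(as_iso),
        unicodedata.category(as_iso),
    )

# From 0xA0 upwards the two encodings agree byte for byte.
upper_range_matches = all(
    bytes([value]).decode("cp1252") == bytes([value]).decode("iso-8859-1")
    for value in range(0xA0, 0x100)
)
print(upper_range_matches)  # True
```

A page authored with "smart quotes" on Windows but labelled ISO-8859-1 therefore sends bytes that a strict reader treats as control characters, which matches the "weird" rendering mentioned above.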
|
| | |
Date: 13 Nov 2006 12:52:39
From: Greg Hennessy
Subject: Re: Restricted ASCII?
|
On 2006-11-13, Chalky <chalkyspam@bleachboys.co.uk > wrote: > I did, and it was promptly changed, after my original posting on this > subject (yesterday). However, if David Wooley's faith in the accuracy > of Wiki logs of alterations is justified, we now know that the > misleading text string read as: > > a 8-bit system that would allow for 256 characters, the ASCII developed > from telegraphic codes and first entered commercial use as a seven-bit > teleprinter code promoted by Bell data services in 1963. No, that is *NOT* what the text read as. You are now simply lying. The text, as shown on http://en.wikipedia.org/w/index.php?title=ASCII&diff=87495535&oldid=87462571 was "Some time after the EBCDIC code, a 8-bit system that would allow for 256 characters, the ASCII developed from Telegraphy
|
| |
Date: 13 Nov 2006 07:42:42
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <1163327103.618298.79130@e3g2000cwe.googlegroups.com >, Chalky <chalkyspam@bleachboys.co.uk > wrote: > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > codes, to allow for 256 characters. Sorry, but http://en.wikipedia.org/wiki/ASCII says that ASCII is 7-bit. Please check the reference before pointing to it ! > However, I have noticed that this set is truncated to 7 characters at > sci.astro.research to conform to its first commercial use as a > seven-bit teleprinter code (1963). > > This can result in considerable garbling of 8-bit ASCII text. > > Since I am pretty sure I have seen Schrodinger's equation spelled > correctly at sci.physics.research, I am curious to discover how endemic > this restriction of the ASCII set actually still is, in the Usenet > groups. > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for > +-, and for copyright, below > > =B0 =B1 =A9 That's ISO-8859-1, not ASCII..... :-) Check out: http://czyborra.com/ The characters in the range 80-FF (hex) are often erroneously called "extended ASCII" or even just "ASCII". But there were many different, and mutually incompatible, ways to extend ASCII to an 8-bit code. So instead of talking about "ASCII" or "Extended ASCII" as an 8-bit code, it's better to refer to the 8-bit code with its proper name (such as ISO-8859-1, or IBM CP-850). If you do that, others will know *which* ASCII "extension" you're referring to. > Chalky -- ---------------------------------------------------------------- Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
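To illustrate why it helps to name the 8-bit code explicitly, the sketch below decodes one and the same byte under several of the "extended ASCII" code pages mentioned in this thread. The codec names are Python's own (`cp437` for the early IBM PC, `cp850`, `mac_roman` for the early Mac, `iso-8859-1`); the choice of the byte 0xE9 is arbitrary.

```python
# One byte above 0x7F, interpreted under four different 8-bit encodings.
raw = bytes([0xE9])

encodings = ("iso-8859-1", "cp850", "cp437", "mac_roman")
decoded = {name: raw.decode(name) for name in encodings}

for name, character in decoded.items():
    # Show the code point so the result is readable on any terminal.
    print(f"{name:12} U+{ord(character):04X} {character}")

# Bytes 0x00 to 0x7F, by contrast, decode identically under all four.
ascii_bytes = bytes(range(0x80))
ascii_views = {ascii_bytes.decode(name) for name in encodings}
print(len(ascii_views))  # 1
```

Only the lower half is common ground; for the upper half the byte value alone says nothing until the encoding is stated.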
|
| |
Date: 12 Nov 2006 23:19:58
From: Chalky
Subject: Re: Restricted ASCII?
|
Pat O'Connell wrote: > Only characters 0 through 127 have been standardized as part of ASCII. > Each computer operating system (for instance DOS, Windows, Mac, and VMS) > displays its own symbol set for 128 through 255. Thanks for this info.. But, _good grief_, is this for real? I had assumed that if I checked that a web page displayed correctly with Netscape 4, Netscape 7, Mozilla, Firefox, Opera, and Internet Explorer browsers, this would mean that it probably displayed correctly, period. Are you now telling me that I also have to run all these browsers on a range of different computer makes, with different dates of manufacture, before I can have any confidence in this? If so, is there a reference link I can go to, to identify what the problem characters are likely to be? Or is it now necessary for each web designer to construct his own graphics symbols for everything that is not already defined in 7 bit US-ASCII ? Chalky
|
| | |
Date: 13 Nov 2006 12:51:03
From: Richard Tobin
Subject: Re: Restricted ASCII?
|
In article <1163402398.151550.250220@m7g2000cwm.googlegroups.com >, Chalky <chalkyspam@bleachboys.co.uk > wrote: >> Only characters 0 through 127 have been standardized as part of ASCII. >> Each computer operating system (for instance DOS, Windows, Mac, and VMS) >> displays its own symbol set for 128 through 255. This is an exaggeration. Most modern software knows about different character encodings and can display them. It's true that if you present some data without any information about the encoding, different systems may display different things by default. >Are you now telling me that I also have to run all these browsers on a >range of different computer makes, with different dates of manufacture, >before I can have any confidence in this? HTML uses Unicode. Any particular page will use some character encoding that covers all or part of Unicode, and can use character references (such as &#163;) or entity references (such as &eacute;) for characters that the encoding doesn't provide. Web servers are supposed to inform browsers what the encoding is, so they can display the right characters. Unfortunately many pages are incompetently written, and many web servers are incorrectly configured. For example, pages written in some Windows encoding are often incorrectly claimed by the server to be in ISO-Latin-1. Browsers try to compensate for this, but don't always get it right. A simple solution for authors is to use only plain ASCII characters in your web page, and use character or entity references for all the others. -- Richard -- "Consideration shall be given to the need for as many as 32 characters in some alphabets" - X3.4, 1963.
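[Archive note] Richard's "simple solution" can be automated. The following is a minimal Python sketch, not anything the posters used; the sample text is made up, and it assumes the text contains no markup characters (a real page would need "&" and "<" escaped first).

```python
import html

# Made-up sample text: pound sign (U+00A3) and e-acute (U+00E9).
text = "Price: \u00a35, caf\u00e9"

# Keep the page pure ASCII: each non-ASCII character becomes a numeric
# character reference such as &#163;.
ascii_html = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

# html.unescape stands in for the browser here: it resolves the references
# back to the intended characters, whatever encoding the server announces.
round_trip = html.unescape(ascii_html)

print(ascii_html)
```

Because the output contains only bytes below 0x80, it reads the same whether the server labels it US-ASCII, ISO-8859-1, Windows-1252 or UTF-8.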
|
| | | |
Date: 13 Nov 2006 15:13:16
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <ej9pnn$mdb$1@pc-news.cogsci.ed.ac.uk >, Richard Tobin <richard@cogsci.ed.ac.uk > wrote: > >Unfortunately many pages are incompetently written, and many web >servers are incorrectly configured. For example, pages written in >some Windows encoding are often incorrectly claimed by the server to >be in ISO-Latin-1. That's Windows Code Page 1252, aka Win-Latin-1. The differences between ISO-Latin-1 and Win-Latin-1 are in the characters 0x80 to 0x9F: in ISO-Latin-1 these characters are a duplication of the control characters 0x00 to 0x1F, while in Win-Latin-1 additional printable characters have been put here. If you want to stay in the ISO domain of character sets, you must use Unicode to display the characters which have been put in 0x80 to 0x9F in Win-Latin-1. Here you can find a table which translates Win-Latin-1 to Unicode: http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT >Browsers try to compensate for this, but don't always get it right. Browsers which are told that a particular web page contains ISO-Latin will probably just skip any characters in the 0x80 to 0x9F range, treating them as control characters. So the effect of putting Win-Latin into an HTML page and then telling the browser it's ISO-Latin is that some characters will vanish when displayed. Another problem is web pages which don't say anything at all about what character encoding they use - then the browser must guess. If the web page uses pure US-ASCII, all works well. But if the web page uses ISO-Latin, the browser might try to display it as UTF-8, with peculiar results. Or the page may contain UTF-8 which the browser tries to display as ISO-Latin, also with peculiar results. -- ---------------------------------------------------------------- Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
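[Archive note] Both effects Paul describes can be reproduced with Python's standard codecs. This is only an illustration with made-up variable names; it says nothing about how any particular browser behaves.

```python
# Byte 0x80 is one of the positions where the two "Latin-1" flavours disagree.
euro_byte = b"\x80"
win = euro_byte.decode("cp1252")        # printable: the Euro sign, U+20AC
iso = euro_byte.decode("iso-8859-1")    # U+0080, an invisible control character

# From 0xA0 upward the two encodings agree (0xA3 is the pound sign in both).
same = b"\xa3".decode("cp1252") == b"\xa3".decode("iso-8859-1")

# Guessing wrong between ISO-Latin-1 and UTF-8 gives the "peculiar results".
e_acute = "\u00e9"
utf8_read_as_latin1 = e_acute.encode("utf-8").decode("iso-8859-1")  # two characters

try:
    e_acute.encode("iso-8859-1").decode("utf-8")
    latin1_read_as_utf8_ok = True
except UnicodeDecodeError:
    # A lone 0xE9 byte is not a complete UTF-8 sequence.
    latin1_read_as_utf8_ok = False

print(ascii(win), ascii(iso), same, ascii(utf8_read_as_latin1), latin1_read_as_utf8_ok)
```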
|
| | | | |
Date: 13 Nov 2006 17:02:15
From: Richard Tobin
Subject: Re: Restricted ASCII?
|
In article <eja0ba$1vup$1@merope.saaf.se >, Paul Schlyter <pausch@saaf.se > wrote: >Browsers which are told that a particular web page contain ISO-Latin >will probably just skip any characters in the 0x80 to 0x9F range, >treating them as control characters. Not in my experience. For example, when I try it with Firefox or Safari on a Mac, it shows the cp-1252 characters. This is probably because it doesn't check the range at all, and just happens to be using a font in which those code points (wrongly) have the cp-1252 glyphs. >Another problem are web pages which don't say anything at all what character >encoding it uses - then the browser must guess. If the web page uses >pure US-ASCII, all works well. But if the web page uses ISO-Latin, the >browser might try to display it as UTF-8, with peculiar results. In theory, HTTP specifies that the default character set for documents with media-type text/* (including text/html) is Latin-1. -- Richard -- "Consideration shall be given to the need for as many as 32 characters in some alphabets" - X3.4, 1963.
|
| | | | | |
Date: 14 Nov 2006 00:33:21
From: Ben Rudiak-Gould
Subject: Re: Restricted ASCII?
|
Richard Tobin wrote: >Paul Schlyter <pausch@saaf.se> wrote: >>Browsers which are told that a particular web page contain ISO-Latin >>will probably just skip any characters in the 0x80 to 0x9F range, >>treating them as control characters. > >Not in my experience. For example, when I try it with Firefox or >Safari on a Mac, it shows the cp-1252 characters. This is probably >because it doesn't check the range at all, and just happens to have to >be using a font in which those code points (wrongly) have the cp-1252 >glyphs. My guess is that it's a deliberate workaround for a FrontPage bug. There are lots of pages on the web generated by broken versions of FrontPage that misidentify Windows code page 1252 as Latin-1. Aliasing Latin-1 to CP1252 makes those pages display correctly and doesn't hurt anything else (since actual Latin-1 pages presumably won't use the C1 characters at all). -- Ben
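[Archive note] The aliasing Ben guesses at might look roughly like the sketch below. The helper name and the label list are hypothetical, chosen for illustration; this is not taken from any browser's source.

```python
def decode_like_a_lenient_browser(data: bytes, declared: str) -> str:
    """Hypothetical helper: treat a declared Latin-1 label as Windows-1252."""
    label = declared.strip().lower()
    if label in ("iso-8859-1", "latin-1", "latin1"):
        label = "cp1252"
    # errors="replace" covers the few bytes cp1252 leaves undefined (e.g. 0x81).
    return data.decode(label, errors="replace")


# Curly quotes are bytes 0x93 and 0x94 in cp1252; 0xA3 is the pound sign.
page = b"\x93quoted\x94 \xa3 5"
shown = decode_like_a_lenient_browser(page, "ISO-8859-1")
print(ascii(shown))
```

A page that really is Latin-1 and avoids the 0x80 to 0x9F range decodes identically either way, which is the "doesn't hurt anything else" part of the argument.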
|
| | |
Date: 13 Nov 2006 07:44:19
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Chalky" <chalkyspam@bleachboys.co.uk > wrote in message news:1163402398.151550.250220@m7g2000cwm.googlegroups.com...
|
| | | |
Date: 13 Nov 2006 15:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <nnV5h.124886$3x1.20740@fe1.news.blueyonder.co.uk >, Sorcerer <Headmaster@hogwarts.physics_e > wrote: .......... > I was accused of writing milliseconds (symbol roman m) >when I had written microseconds (symbol greek mu) by someone >not using Internet Explorer or Firefox. I am NOT changing it. >The reader can change his browser as far as I'm concerned. >See for yourself: > http://www.androcles01.pwp.blueyonder.co.uk/Smart/Smart.htm > >Androcles. You could try to write: &mu; instead of: <font face="Symbol">m</font> though. That requires a Unicode font in the web browser to display properly, but it worked fine on my browsers. Or you could encode your web page as UTF-8 (and then of course also say so in the header of the HTML) -- then you could write Greek letters directly in your HTML code. That requires a UTF-8 capable browser, but most modern browsers have that capability. Note: the Unicode character code for Greek mu is: 0x03BC -- ---------------------------------------------------------------- Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
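[Archive note] The two routes Paul describes (an entity reference, or the letter itself in a UTF-8 page) lead to the same code point. A small Python check follows; the meta line is one common way to declare the encoding and is shown as an assumption, not as what any poster's page contained.

```python
import html

# Named entity and numeric reference both resolve to U+03BC.
mu = html.unescape("&mu;")
same_via_number = html.unescape("&#956;")   # decimal 956 == 0x03BC

# Written directly in a UTF-8 encoded page, mu occupies two bytes.
utf8_bytes = mu.encode("utf-8")

# A page saved as UTF-8 also has to say so, for example in its header.
meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'

print(hex(ord(mu)), utf8_bytes)
```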
|
| | | | |
Date: 13 Nov 2006 17:32:34
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Paul Schlyter" <pausch@saaf.se > wrote in message news:eja37t$2165$1@merope.saaf.se...
|
| | | | | |
Date: 13 Nov 2006 17:47:05
From: Richard Tobin
Subject: Re: Restricted ASCII?
|
In article <S_16h.159294$lT5.31497@fe2.news.blueyonder.co.uk >, Sorcerer <Headmaster@hogwarts.physics_e > wrote: >Thank you, I could. But I'm not going to, and neither am I going to use >Chinese, Hebrew or Cherokee characters in anticipation of someone having >a browser that doesn't recognise Greek characters. Either he gets a browser >that does, or he fails to communicate. That's his problem, not mine. But you didn't use a Greek character. You used an "m", and requested a font in which you expect it to look like a mu. Why not just use a mu? -- Richard -- "Consideration shall be given to the need for as many as 32 characters in some alphabets" - X3.4, 1963.
|
| | | | | | |
Date: 13 Nov 2006 17:58:04
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Richard Tobin" <richard@cogsci.ed.ac.uk > wrote in message news:ejab2p$vc2$1@pc-news.cogsci.ed.ac.uk... Did you snip something? It seems I have too and now I can't find it. Androcles
|
| | | | | |
Date: 13 Nov 2006 22:43:30
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
In article <S_16h.159294$lT5.31497@fe2.news.blueyonder.co.uk >, Sorcerer <Headmaster@hogwarts.physics_e > wrote: > "Paul Schlyter" <pausch@saaf.se> wrote in message > news:eja37t$2165$1@merope.saaf.se... >> In article <nnV5h.124886$3x1.20740@fe1.news.blueyonder.co.uk>, >> Sorcerer <Headmaster@hogwarts.physics_e> wrote: >> .......... >> > I was accused of writing milliseconds (symbol roman m) >> >when I had written microseconds (symbol greek mu) by someone >> >not using Internet Explorer or Firefox. I am NOT changing it. >> >The reader can change his browser as far as I'm concerned. >> >See for yourself: >> > http://www.androcles01.pwp.blueyonder.co.uk/Smart/Smart.htm >> > >> >Androcles. >> >> You could try to write: >> >> μ >> >> instead of: >> >> <font face="Symbol">m</font> >> >> though. That requires a Unicode font in the web browser to display >> properly, but it worked fine on my browsers. >> >> Or you could encode your web page as UTF-8 (and then of course also say so >> in the header of the HTML) -- then you could write greek letters directly >> in your HTML code. But that requires an UTF-8 capable browser, but most >> modern browser have that capability. >> >> Note: the Unicode character code for Greek mu is: 0x03BC > > Thank you, I could. But I'm not going to, and neither am I going to use > Chinese, Hebrew or Cherokee characters in anticipation of someone having > a browser that doesn't recognise Greek characters. Either he gets a browser > that does, or he fails to communicate. That's his problem, not mine. ....excuse me, but you really got this backwards. Using μ in your HTML really requires the browser to recognize Greek characters -- it just won't work on browsers recognizing no Greek characters.... > Many years ago I went to Italy to speak with some engineers concerning > the electronics of Tornado, which was built jointly between Britain, > Germany and Italy. 
> http://archive.cs.uu.nl/pub/AIRCRAFT-IMAGES/Tornado.jpg > I discovered there were no electronics data books in Italian, they > were using American just as I was. So we both had to use a common > subset of English which I admit was easier for me to learn than he, > but had I been German we'd have been on equal footing. > > Esperanto is a failure, English is a success. Currently it is, but in a century or two the situation may be different. Maybe we're all speaking Chinese then..... :-) You probably see no difference between a century and eternity though... <g > > So... I'm not going to > rewrite > or change my pages to Unicode. You are free to do so if you wish. > Androcles -- ---------------------------------------------------------------- Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN e-mail: pausch at stockholm dot bostream dot se WWW: http://stjarnhimlen.se/
|
| | | | | | |
Date: 14 Nov 2006 04:31:08
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"Paul Schlyter" <pausch@saaf.se > wrote in message news:ejaqps$2aq0$1@merope.saaf.se...
|
| | | | |
Date: 14 Nov 2006 08:19:59
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <eja37t$2165$1@merope.saaf.se >, pausch@saaf.se (Paul Schlyter) wrote: As this is a sort of new thread, I'll drop in just for a while, probably just for one posting. > You could try to write: > &mu; Correct. > instead of: > <font face="Symbol">m</font> Which has only ever meant "m" since the very first version of HTML, albeit a rather strange way of writing an "m". This particular technique, which I call the "symbol hack", allowed one to make HTML 3.2 and earlier browsers display characters that were not in the western European only character set (ISO 8859/1) allowed by those versions of HTML. It always was an abuse of HTML and would not work on text only and non-visual browsers, or when abstracting for search engines (although some may now recognize the hack). It hasn't been necessary for almost 7 years now and probably only works because browsers tend to be bugwise compatible with earlier browsers rather than properly conforming with the specification. A fully conformant HTML 4.01 browser should display "m", using a fallback font. Androcles is the one using the non-compliant browser. (One should never use a popular browser to check HTML; they go out of their way to second guess bad HTML.) The main reason that it probably worked in early browsers is that they didn't honour the character set rules for HTML at all and simply passed the raw bytes to their font engines. > though. That requires a Unicode font in the web browser to display properly, > but it worked fine on my browsers. It only really requires the browser to know the Unicode encoding for the Symbol font, something that ought to be built into any browser on a platform with a custom encoded Symbol font.
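[Archive note] Pages that rely on the "symbol hack" can be converted mechanically. The sketch below is a rough illustration only: the mapping table is deliberately partial, the function name is hypothetical, and the regular expression only handles the exact markup form quoted in this thread.

```python
import re

# Partial, illustrative table: Latin letter typed -> Greek letter the Symbol font draws.
SYMBOL_TO_UNICODE = {"a": "\u03b1", "b": "\u03b2", "m": "\u03bc", "p": "\u03c0"}

SYMBOL_HACK = re.compile(
    r'<font face="Symbol"\s*>(.*?)</font>', re.IGNORECASE | re.DOTALL
)


def replace_symbol_hack(markup: str) -> str:
    """Swap each Symbol-font letter for a numeric character reference."""
    def fix(match):
        return "".join(
            f"&#{ord(SYMBOL_TO_UNICODE[ch])};" if ch in SYMBOL_TO_UNICODE else ch
            for ch in match.group(1)
        )
    return SYMBOL_HACK.sub(fix, markup)


converted = replace_symbol_hack('10 <font face="Symbol">m</font>s')
print(converted)
```

The result no longer depends on a particular font being installed, so a text-only browser or a search engine sees a mu rather than an "m".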
|
| |
Date: 12 Nov 2006 22:06:31
From: Chalky
Subject: Re: Restricted ASCII? The final test
|
Chalky wrote: > Thanks too to Sorcerer (Androcles), and George Dishman for your > collective constructive feedback. It seems that you probably had a > secondary display problem but I didn't, because you have Usenet > postings e-mailed to you. > > I just go to the website to read what I am interested in, and, when I > respond, I do so via form submission, so there is no e-mail protocol > involved, my side. > > Your resultant problem might have been because I originally pasted in > the displayed characters which sprang to life after I had typed in the > decimal translation of the machine code for those characters. > Consequently, for a final test, I am simply typing in the decimal > translations of the machine codes for the Japanese Yen, the registered > trade mark, and the Euro, encased in the beautifully symmetric Spanish > version of the question mark, using the HTML identifiers &, #, and ; > below: > > ¿ ¥ ® ? > > > Let me know what you see. > > > Chalky OK, that corrupted here for me. Did it also for those receiving postings by email?
|
| | |
Date: 13 Nov 2006 18:00:41
From: Pat O'Connell
Subject: Re: Restricted ASCII? The final test
|
Chalky wrote: > Chalky wrote: > >> Thanks too to Sorcerer (Androcles), and George Dishman for your >> collective constructive feedback. It seems that you probably had a >> secondary display problem but I didn't, because you have Usenet >> postings e-mailed to you. >> >> I just go to the website to read what I am interested in, and, when I >> respond, I do so via form submission, so there is no e-mail protocol >> involved, my side. >> >> Your resultant problem might have been because I originally pasted in >> the displayed characters which sprang to life after I had typed in the >> decimal translation of the machine code for those characters. >> Consequently, for a final test, I am simply typing in the decimal >> translations of the machine codes for the Japanese Yen, the registered >> trade mark, and the Euro, encased in the beautifully symmetric Spanish >> version of the question mark, using the HTML identifiers &, #, and ; >> below: >> >> ¿ ¥ ® ? These are escaped characters in HTML, and are written correctly above in ASCII. Usenet readers don't display HTML, though some newsreaders will convert URLs to be linkable. >> >> >> Let me know what you see. What I'm supposed to see in ASCII. -- Pat O'Connell [note munged EMail address] Take nothing but pictures, Leave nothing but footprints, Kill nothing but vandals...
|
| | | |
Date: 14 Nov 2006 01:21:40
From: Ben Rudiak-Gould
Subject: Re: Restricted ASCII? The final test
|
Pat O'Connell wrote: >Chalky wrote: >>> ¿ ¥ ® ? > >These are escaped characters in HTML, and are written correctly above in >ASCII. Except for "&#128;", which is a control character, not the Euro symbol. The Euro symbol is "&#8364;". -- Ben
|
| |
Date: 13 Nov 2006 16:19:30
From: Jeff Root
Subject: Re: Restricted ASCII?
|
My web pages don't specify the character set because I didn't know which character set to specify, and I'd rather let the user's operating system & browser go with what they think is right rather than specifying one which doesn't work for some people. However, recently I wanted to use a centered dot. I found it in Arial (the font I was suggesting for the page) using Windows Character Map. I copied the character and pasted it into the page in my text editor. The editor is set to use a different font, and the character did not display correctly. Based on past experience, I went ahead anyway. The dot showed up fine on the web page on my local hard drive. But when I uploaded the page to the server and viewed it from there, the character did not display. It was suggested to me that I specify the character set. But I still was unsure which set to specify, and wanted to avoid unnecessary complications. I found that replacing the pasted character with the HTML escape sequence · works, so I am currently going with that. Any comments or suggestions? I will specify a character set if I have confidence that it won't cause more problems than it solves. If it really isn't needed, though, I'll continue to omit it. -- Jeff, in Minneapolis
|
| | |
Date: 14 Nov 2006 00:49:17
From: Ben Rudiak-Gould
Subject: Re: Restricted ASCII?
|
Jeff Root wrote: >My web pages don't specify the character set because I >didn't know which character set to specify, and I'd rather >let the user's operating system & browser go with what >they think is right rather than specifying one which >doesn't work for some people. The only way the page can not work for /some/ people is if you don't specify an encoding. If you do specify an encoding, then it will either look right to everyone or look wrong to everyone. In particular, if it looks right to you, it'll look right to everyone. That's much better than leaving it up to each user's browser. >It was suggested to me that I specify the character set. >But I still was unsure which set to specify, and wanted to >avoid unnecessary complications. I found that replacing >the pasted character with the HTML escape sequence · >works, so I am currently going with that. Yes, stick with ASCII and you'll be fine. "·" is a sequence of ASCII characters for document-encoding purposes. -- Ben
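[Archive note] Jeff's vanishing dot and Ben's advice can both be illustrated in Python. The archive does not preserve which escape Jeff actually typed, so the sketch below uses "&middot;" and "&#183;" as two plausible candidates; that choice is an assumption.

```python
import html

middle_dot = "\u00b7"

# Pasted directly, the dot's bytes depend on the encoding the editor saved in.
as_cp1252 = middle_dot.encode("cp1252")   # one byte
as_utf8 = middle_dot.encode("utf-8")      # two bytes

# Either escape is plain ASCII, so its bytes are the same in every one of
# those encodings, and both name the same character.
named = "&middot;"
numeric = "&#183;"
escapes_are_ascii = named.isascii() and numeric.isascii()
resolved = (html.unescape(named), html.unescape(numeric))

print(as_cp1252, as_utf8, escapes_are_ascii)
```

With no declared encoding, a browser has to guess which of the two byte forms it is looking at; the escape sidesteps the guess.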
|
| |
Date: 13 Nov 2006 21:57:59
From: David Woolley
Subject: Re: Restricted ASCII?
|
In article <1163391574.028033.308640@h48g2000cwc.googlegroups.com >, chalkyspam@bleachboys.co.uk wrote: [ all snipped ] I think I'm spending too much time on this thread, without there being sufficient benefit to the world. There are just too many misunderstandings to counter and details to check. Therefore I'm going to drop out. However, as a parting observation, I'd point out that neither Google Groups nor Wikipedia use 8 bit character encodings. You are campaigning for something that is already obsolescent!
|
| | |
Date: 14 Nov 2006 04:31:08
From: Sorcerer
Subject: Re: Restricted ASCII?
|
"David Woolley" <david@djwhome.demon.co.uk > wrote in message news:T1163455119@djwhome.demon.co.uk...
|
| |
Date: 14 Nov 2006 16:37:07
From: zzbunker@netscape.net
Subject: Re: Restricted ASCII?
|
Chalky wrote: > Chalky wrote: > > > ASCII is defined in wiki as an 8-bit system, developed from telegraphic > > codes, to allow for 256 characters. > > > > However, I have noticed that this set is truncated to 7 characters at > > sci.astro.research to conform to its first commercial use as a > > seven-bit teleprinter code (1963). > > > > This can result in considerable garbling of 8-bit ASCII text. > > > > Since I am pretty sure I have seen Schrodinger's equation spelled > > correctly at sci.physics.research, I am curious to discover how endemic > > this restriction of the ASCII set actually still is, in the Usenet > > groups. > > > > Towards this end, I have pasted in the 8 bit ASCII for degrees, for + > > -, and for copyright, below > > > > =B0 =B1 =A9 > > > > Chalky > > Interesting. All groups display correctly except > sci.physics.relativity, which still displayed 8 bits, but translated > into a more old-fashioned font. > > Looks like sci.astro.research is the only newsgroup actually restricted > to 7 bits. > > I wonder why that is? It's probably the same reason as all backward compatibility problems. Whoever set the newsgroup up may have decided to use Fortran code from the Mercury missions as the standard for that group's servers. > > Chalky
|
| |
Date: 14 Nov 2006 10:21:47
From: Chalky
Subject: Re: Restricted ASCII?
|
Greg Hennessy wrote: > On 2006-11-13, Chalky <chalkyspam@bleachboys.co.uk> wrote: > > I did, and it was promptly changed, after my original posting on this > > subject (yesterday). However, if David Wooley's faith in the accuracy > > of Wiki logs of alterations is justified, we now know that the > > misleading text string read as: > > > > a 8-bit system that would allow for 256 characters, the ASCII developed > > from telegraphic codes and first entered commercial use as a seven-bit > > teleprinter code promoted by Bell data services in 1963. > > No, that is *NOT* what the text read as. You are now simply > lieing. OK. So, obviously, you have now decided to demonstrate talking out of your own arse in copycat mode. That is very brave of you [I don't think]. Since I do not suffer fools gladly, welcome to the slaughterhouse, now. (You have asked for it.) I will start by explaining to you that, unless you are determined to demonstrate to the world that you are semi-literate, as well as a complete idiot, it is worth your while: (A) investing in a dictionary and a spellchecker. That way, you will not make a complete fool of yourself, by accusing someone of what you are actually doing yourself, by mis-spelling such an elementary word as lying. (B) READING and trying to UNDERSTAND the prior postings in the thread, and their associated responses. That way, you will not make an even bigger fool of yourself, by raising a subject that has already been addressed and adequately resolved, in a less confrontational and (for you) potentially libelous manner. (If you had only had enough education to get that elementary spelling right, I would at least have then had a fighting chance of being "quids in" following litigation. (C) investing in a text editor. 
That way, even if you don't have adequate intelligence to work out what "misleading text string" means, you will at least then be able to work out, via a tedious process of trial and error, that your sentence > "Some time after the EBCDIC code, a 8-bit system that would allow > for 256 characters, the ASCII developed from Telegraphy
|
| | |
Date: 15 Nov 2006 03:47:52
From: Greg Hennessy
Subject: Re: Restricted ASCII?
|
On 2006-11-14, Chalky <chalkyspam@bleachboys.co.uk > wrote: > (C) investing in a text editor. That way, even if you don't have > adequate intelligence to work out what "misleading text string" means, > you will at least then be able to work out, via a tedious process of > trial and error, that your sentence > >> "Some time after the EBCDIC code, a 8-bit system that would allow >> for 256 characters, the ASCII developed from Telegraphy
|
| | |
Date: 14 Nov 2006 21:13:29
From: Paul Schlyter
Subject: Re: Restricted ASCII?
|
Ever considered becoming a lawyer? You'd probably succeed, since you seem to have a great talent at twisting and misinterpreting words...... FYI: truncating a piece from a text frequently alters the meaning of the text. It's a common trick by those who wish to misinterpret a text. In article <1163528507.205692.270060@h54g2000cwb.googlegroups.com >, Chalky <chalkyspam@bleachboys.co.uk > wrote: > Greg Hennessy wrote: > >> On 2006-11-13, Chalky <chalkyspam@bleachboys.co.uk> wrote: >>> I did, and it was promptly changed, after my original posting on this >>> subject (yesterday). However, if David Wooley's faith in the accuracy >>> of Wiki logs of alterations is justified, we now know that the >>> misleading text string read as: >>> >>> a 8-bit system that would allow for 256 characters, the ASCII developed >>> from telegraphic codes and first entered commercial use as a seven-bit >>> teleprinter code promoted by Bell data services in 1963. >> >> No, that is *NOT* what the text read as. You are now simply >> lieing. > > OK. So, obviously, you have now decided to demonstrate talking out of > your own arse in copycat mode. That is very brave of you [I don't > think]. Since I do not suffer fools gladly, welcome to the > slaughterhouse, now. (You have asked for it.) > > I will start by explaining to you that, unless you are determined to > demonstrate to the world that you are semi-literate, as well as a > complete idiot, it is worth your while: > > (A) investing in a dictionary and a spellchecker. That way, you will > not make a complete fool of yourself, by accusing someone of what you > are actually doing yourself, by mis-spelling such an elementary word as > lying. > > (B) READING and trying to UNDERSTAND the prior postings in the thread, > and their associated responses. 
That way, you will not make an even > bigger fool of yourself, by raising a subject that has already been > addressed and adequately resolved, in a less confrontational and (for > you) potentially libelous manner. (If you had only had enough education > to get that elementary spelling right, I would at least have then had a > fighting chance of being "quids in" following litigation. > > (C) investing in a text editor. That way, even if you don't have > adequate intelligence to work out what "misleading text string" means, > you will at least then be able to work out, via a tedious process of > trial and error, that your sentence > >> "Some time after the EBCDIC code, a 8-bit system that would allow >> for 256 characters, the ASCII developed from Telegraphy
|
| |
Date: 15 Nov 2006 01:51:58
From: John (Liberty) Bell
Subject: Re: Restricted ASCII?
|
Richard Tobin wrote: > In article <1163402398.151550.250220@m7g2000cwm.googlegroups.com>, > Chalky <chalkyspam@bleachboys.co.uk> wrote: > >> Only characters 0 through 127 have been standardized as part of ASCII. > >> Each computer operating system (for instance DOS, Windows, Mac, and VMS) > >> displays its own symbol set for 128 through 255. You have 'credited' the wrong person for this misleading statement here. It was Pat O'Connell who wrote that. > This is an exaggeration. Most modern software knows about different > character encodings and can display them. It's true that if you > present some data without any information about the encoding, > different systems may display different things by default. JB
|
| | |
Date: 15 Nov 2006 15:35:02
From: Richard Tobin
Subject: Re: Restricted ASCII?
|
In article <1163584318.857439.244880@m73g2000cwd.googlegroups.com >, John (Liberty) Bell <john.bell@accelerators.co.uk > wrote: >You have 'credited' the wrong person for this misleading statement >here. No, I credited him with quoting it. That's why there were two ' >' characters at the beginning of each line. -- Richard -- "Consideration shall be given to the need for as many as 32 characters in some alphabets" - X3.4, 1963.
|
| |
Date: 15 Nov 2006 19:45:13
From: John (Liberty) Bell
Subject: Re: Restricted ASCII? and the final test
|
Ben Rudiak-Gould wrote (Re: Restricted ASCII? The final test): > Pat O'Connell wrote: > >Chalky wrote: > >>> ¿ ¥ ® ? > > > >These are escaped characters in HTML, and are written correctly above in > >ASCII. I suspect the reason most of us see (eg) &#191; and not the character represented by &#191; in HTML, may be that the groups server has cunningly inserted an extra invisible character within each of those HTML instructions, to block our browsers and newsreaders from displaying those HTML instructions, as single characters. As Chalky discovered, if you instead simply paste in the corresponding displayed HTML character when making a posting, many of us will then see it. However, there are then some user dependent problems which can arise: 1) If you are using an Outlook Express Newsreader, the indents will foul up when you try to respond. 2) If you are using another (as yet unidentified [here]) user interface, that character is translated instead into a string of (7 bit) ascii characters, so you don't see what was intended. 3) If, on the other hand, you are using an Internet browser interface, there are no resultant problems for you, personally, UNLESS you post to sci.astro.research. [This is because the moderator there falls into category (2), and, if approved, the posting will then appear in the altered form the moderator saw] > Except for "&#128;", which is a control character, not the Euro symbol. "&#128;" is a Euro symbol for Microsoft browsers and for other 'relaxed' ISO based browsers. "&#128;" is a control character only under 'strict' ISO interpretation > The Euro symbol is "&#8364;". Yes, that is the Euro symbol in Unicode. It appears to work on all browsers going back at least as far as Netscape 4.75. That, incidentally, is the only browser I have tried which does not also display the Euro symbol when fed &#128;. Instead it displays . So I think I would agree, on the basis of this early Netscape test, that Unicode is probably the best way to go (at least for HTML scripting). Regards John
|
| | |
Date: 16 Nov 2006 13:32:14
From: Richard Tobin
Subject: Re: Restricted ASCII? and the final test
|
In article <1163648713.310853.239480@b28g2000cwb.googlegroups.com >, John (Liberty) Bell <john.bell@accelerators.co.uk > wrote: >"&#128;" is a Euro symbol for Microsoft browsers and for other >'relaxed' ISO based browsers. "&#128;" is a control character only >under 'strict' ISO interpretation This is nonsense. Character number 128 in some Microsoft character sets is a Euro character. But the &#NNN; notation *means* the character represented by that number in Unicode. That browsers display a Euro symbol is at best an attempt to make broken pages display correctly, and may well just be a consequence of the fonts they use. >> The Euro symbol is "&#8364;". > >Yes, that is the Euro symbol in Unicode. It appears to work on all >browsers going back at least as far as Netscape 4.75. Whether it works depends more on the fonts than on the browsers. Browsers don't have code to handle each character. >So I think I would agree, on the basis of this early Netscape test, >that Unicode is probably the best way to go (at least for HTML >scripting). If it's not Unicode, it's not HTML either. -- Richard -- "Consideration shall be given to the need for as many as 32 characters in some alphabets" - X3.4, 1963.
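[Archive note] The disagreement over character number 128 is visible in Python's standard library, whose html module applies the lenient remapping later written into HTML5. The sketch below only shows that behaviour; it is not a statement about the browsers of 2006.

```python
import html

# Strict reading: number 128 names code point U+0080, a control character.
strict = chr(128)

# Lenient reading: html.unescape maps the reference to the Euro sign,
# matching what byte 0x80 means in Windows code page 1252.
lenient = html.unescape("&#128;")
cp1252_byte = b"\x80".decode("cp1252")

# The unambiguous spelling uses the Euro sign's own Unicode number.
proper = html.unescape("&#8364;")

print(ascii(strict), ascii(lenient), ascii(proper))
```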
|
| | |
Date: 16 Nov 2006 07:42:37
From: Paul Schlyter
Subject: Re: Restricted ASCII? and the final test
|
In article <1163648713.310853.239480@b28g2000cwb.googlegroups.com>, John (Liberty) Bell <john.bell@accelerators.co.uk> wrote:

> Ben Rudiak-Gould wrote (Re: Restricted ASCII? The final test):
>
>> Pat O'Connell wrote:
>>>Chalky wrote:
>>>>> &#191; &#165; &#174; ?
>>>
>>>These are escaped characters in HTML, and are written correctly above in
>>>ASCII.
>
> I suspect the reason most of us see (eg) &#191; and not the character
> represented by &#191; in HTML, may be that the groups server has
> cunningly inserted an extra invisible character within each of those
> HTML instructions, to block our browsers and newsreaders from
> displaying those HTML instructions, as single characters.

There are no such invisible characters inserted here, and I see &#191; too....

There's an easier way to accomplish this: make sure a line like this is present in the message header:

Content-Type: text/plain; charset="us-ascii"

A compliant news reader should then display this as pure ASCII, not as HTML escape characters. Of course, if web based, the news reader must do some processing, such as replacing the &#191; with e.g. &amp;#191;

--
----------------------------------------------------------------
Paul Schlyter, Grev Turegatan 40, SE-114 38 Stockholm, SWEDEN
e-mail: pausch at stockholm dot bostream dot se
WWW: http://stjarnhimlen.se/
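As a rough illustration of how a reader could honour that header, here is a small Python sketch using the standard `email` package. The sample article text is made up for the example, and real newsreaders will differ in detail.

```python
from email import message_from_string

# A made-up article: one header line, a blank line, then the body.
raw_article = (
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "The reference &#191; stays as six literal characters here.\n"
)

article = message_from_string(raw_article)

# What the header declares about the body.
declared_type = article.get_content_type()
declared_charset = article.get_content_charset()

# For text/plain the body is shown as typed; nothing is parsed as HTML.
body = article.get_payload()
is_pure_ascii = body.isascii()

print(declared_type, declared_charset, is_pure_ascii)
print(body, end="")
```

Because the declared type is text/plain, the ampersand sequence is ordinary text and is left alone.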
|
| |
Date: 16 Nov 2006 02:51:52
From: John (Liberty) Bell
Subject: Re: Restricted ASCII? and the final test
|
Paul Schlyter wrote:
> In article <1163648713.310853.239480@b28g2000cwb.googlegroups.com>,
> John (Liberty) Bell <john.bell@accelerators.co.uk> wrote:
> >> Pat O'Connell wrote:
> >>>Chalky wrote:
> >>>>> &#191; &#165; &#174; ?
> >>>These are escaped characters in HTML, and are written correctly above in
> >>>ASCII.
> > I suspect the reason most of us see (eg) &#191; and not the character
> > represented by &#191; in HTML, may be that the groups server has
> > cunningly inserted an extra invisible character within each of those
> > HTML instructions, to block our browsers and newsreaders from
> > displaying those HTML instructions, as single characters.
> There are no such invisible characters inserted here, and I see
> &#191; too....

OK, so my blind guess can't be right.

> There's an easier way to accomplish this: make sure a line like
> this is present in the message header:
>
> Content-Type: text/plain; charset="us-ascii"

Ah! So it IS called us-ascii

> A compliant news reader should then display this as pure ASCII, not as HTML
> escape characters. Of course, if web based, the news reader must do
> some processing, such as replacing the &#191; with e.g. &amp;#191;

Ah So! Chalky did say he saw something like &amp;#191; when he previewed the original posting, so modified the posting by pasting in the correspondingly displayed characters, when he employed browser and email clients directly to read the code.

However, when I previewed my own (later) postings, no such alteration on preview was displayed.

Amazing how much seems to have changed in just a few days, isn't it?

John
|
| | |
Date: 16 Nov 2006 23:47:28
From: George Dishman
Subject: Re: Restricted ASCII? and the final test
|
"John (Liberty) Bell" <john.bell@accelerators.co.uk> wrote in message news:1163674312.513112.24390@b28g2000cwb.googlegroups.com...

> Ah So! Chalky did say he saw something like &amp;#191; when he
> previewed the original posting, so modified the posting by pasting in
> the correspondingly displayed characters, when he employed browser and
> email clients directly to read the code.

If you write HTML with notepad, you type "&amp;" to display the ampersand sign. It is the browser that does the conversion from HTML.

> However, when I previewed my own (later) postings, no such alteration
> on preview was displayed.

In a "posting" on Usenet, the "&amp;" should be passed through unaltered and anyone using a compliant reader should see that. If you view with a browser and it shows the ampersand then it is broken; the interface should change "&" to "&amp;" so the browser shows the characters.

> Amazing how much seems to have changed in just a few days, isn't it?

Not really, ASCII is still 7-bit, just as it always has been.

George
|
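For anyone wondering what that escaping step looks like in practice, here is a small Python sketch using the standard `html` module. The function name `render_plain_text_for_web` and the sample posting are hypothetical, and this is only one way a web interface might do it.

```python
import html

def render_plain_text_for_web(article_text: str) -> str:
    """Escape &, < and > so a browser shows the posting exactly as typed."""
    return html.escape(article_text, quote=False)

# A plain-text posting that happens to contain HTML-looking sequences.
posted = "Type &amp; to show an ampersand; &#191; is a numeric reference."

# What the web interface should put into the page source.
page_source = render_plain_text_for_web(posted)

# What the browser displays after decoding the page source once.
shown_by_browser = html.unescape(page_source)

print(page_source)
print(shown_by_browser)
```

After the interface escapes and the browser decodes, the reader sees the same characters the poster typed.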
| |
Date: 16 Nov 2006 21:02:20
From: Jeff Root
Subject: Re: Restricted ASCII? and the final test
|
When writing HTML, is it better to just say "M&M" or to write out the verbose equivalent &quot;M&amp;M&quot; ?

-- Jeff, in Minneapolis
|
| | |
Date: 17 Nov 2006 12:56:09
From: Richard Tobin
Subject: Re: Restricted ASCII? and the final test
|
In article <1163739740.253656.102810@j44g2000cwa.googlegroups.com>, Jeff Root <jeff5@freemars.org> wrote:

>When writing HTML, is it better to just say "M&M" or to
>write out the verbose equivalent &quot;M&amp;M&quot; ?

There are a few contexts in SGML in which & can be used literally, but this is not one of them (it's recognised as an entity reference open delimiter because it's followed by a name start character). And HTML itself recommends that you should use &amp;.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters in some alphabets" - X3.4, 1963.
|
| | |
Date: 17 Nov 2006 06:09:37
From: Chris L Peterson
Subject: Re: Restricted ASCII? and the final test
|
On 16 Nov 2006 21:02:20 -0800, "Jeff Root" <jeff5@freemars.org> wrote:

>When writing HTML, is it better to just say "M&M" or to
>write out the verbose equivalent &quot;M&amp;M&quot; ?

The latter will give the output you are looking for under a wider variety of conditions.

_________________________________________________
Chris L Peterson
Cloudbait Observatory
http://www.cloudbait.com
|
| |
Date: 17 Nov 2006 07:38:03
From: David Woolley
Subject: Re: Restricted ASCII? and the final test
|
In article <1163739740.253656.102810@j44g2000cwa.googlegroups.com>, Jeff Root <jeff5@freemars.org> wrote:

> When writing HTML, is it better to just say "M&M" or to

M&M is not HTML (undefined general entity) and, for an XHTML document would cause a compliant XHTML browser to abort the document (entity reference not closed with ;, which is a well-formedness violation - does not match the syntactical definition of a document). (Note that IE (including IE7) does not support XHTML served as XHTML but only a subset of XHTML 1.0 (defined in appendix C of its specification) served, in compatibility mode, as HTML, and intended to be treated as a sort of broken HTML.)

> write out the verbose equivalent &quot;M&amp;M&quot; ?

The &quot;'s are not needed, unless you use the string in an attribute value, and even then, definitely for HTML, and I believe also for XHTML, only if you use " rather than ' as the delimiter. (Using delimiters is mandatory in XHTML, and is mandatory in HTML where most punctuation characters are used.)

The most common place where &'s are invalidly left bare is form submission like URLs. The HTML specification itself points this one out.

See <http://validator.w3.org/> to check whether or not a document is HTML. See <http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx> for Microsoft's policy on supporting XML in IE7.
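To make the point about bare ampersands in form-submission URLs concrete, here is a short Python sketch using only the standard library. The host `example.org` and the parameter names `name` and `copy` are invented for the illustration.

```python
import html
from urllib.parse import urlencode

# Build a form-submission style query string. urlencode percent-encodes the
# ampersand inside the value and joins the parameters with a bare "&".
query = urlencode({"name": "M&M", "copy": "1"})
url = "http://example.org/search?" + query

# Escape the URL before placing it in a double-quoted attribute value, so the
# parameter separator is written as &amp; in the HTML source.
escaped_url = html.escape(url, quote=True)
anchor = '<a href="{}">results</a>'.format(escaped_url)

print(url)
print(anchor)
```

Left bare, the `&copy` in this query string is exactly the kind of sequence a parser can take for the start of an entity reference, which is why the separator is escaped in the markup.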
|
|