#1
In searching for information on the Web about versions of ITA 2 as modified for different languages, I came across a web page which noted that, because of the stateful nature of 5-level code... because a garble can lead to a mistaken shift into FIGS case... 5-level code is quite rightly something that ought to be relegated to the past.

In the original Murray code, on which ITA 2 was based, instead of a letters shift and a figures shift, there were "letters space" and "figures space" codes, just as there were in Emile Baudot's original five-level code (on which ITA 1 is based, and which differs from what is commonly called "Baudot", ITA 2 or its slightly incompatible Western Union variant). This would, of course, be limiting. Shift codes are needed, since one doesn't want to be forced to insert a space when changing between letters and figures. But even with shift codes, many RTTY operators adapted their teletypes so that the space was a letters space by installing the "unshift on space" option.

In addition to printable characters, the figures case on a 5-level teletypewriter includes two control characters - WRU and BEL.

Here's my suggestion for bringing 5-level code into the 21st Century, in stages.

a) Make unshift on space standard; change Space to Letters Space.

b) Change WRU to ESC - not an exact equivalent of ASCII Escape, but allowing two-character control codes for things like WRU, BEL, BS (backspace) and so on. And change BEL to Figures Space, so that one isn't forced to switch to letters and then change back when introducing spaces into figures. (Note, though, as that is the figures shift of a letter, not a code on its own, unlike Letters Space, it doesn't serve as a reminder that one is in figures shift.)

c) A first expansion of the character repertoire to include lower-case and more figures characters can now be introduced.
However, what I propose will be different from, and simpler than, either ITU recommendation S.2 (where a "superfluous" LTRS issued in letters case toggles between upper and lower case) or ASCII over AMTOR (where the all zeroes character, used elsewhere as a third shift for languages like Russian or Greek, switches to the alternate characters). Since I propose not only to shift the letters case into upper and lower case, but also to shift the figures case for additional characters, now there would be more positions that could be spared for control codes. So instead of using ESC, I would make the UC and LC shifts two additional control codes within the figures case.

d) And then a second expansion of the character repertoire, to allow the equivalent of a "third shift" for supporting a non-Latin alphabet, would also be provided for right from the start in the basic design. However, the all-zeroes character would _not_ be used to shift into it. Instead, an additional two control codes would be taken from the figures case, SI and SO, similar to ASCII.

So I envisage the figures case as looking like this:

QWERTYUIOP - 1234567890 in lower case, and !@#$%&*() in upper case.
ASDFGHJKL - -' ESC n n n Fig Sp SI SO in lower case, and _" ESC n n n Fig Sp SI SO in upper case.
ZXCVBNM - ? / ; = ? , . in lower case, and ? ? : + ? in upper case.

The two unused ? positions and the three national use n positions would be somehow assigned to the four positions needed for additional ASCII characters. One possibility would be:

ASDFGHJKL - -' ESC ~[] Fig Sp SI SO in lower case, and _" ESC `{} Fig Sp SI SO in upper case.
ZXCVBNM - \ / ; = ? , . in lower case, and | ? : + ? in upper case.

taking as much inspiration as possible from ASCII over AMTOR.

e) But what about unshift on space? How can that be reconciled with having a third shift language? One extra character is available: 00000.
So while the regular space character becomes a letters space, this character could become the "third space".

However, *that* has a grievous flaw. 11111 is a _shift_ code, the letters shift, so that despite it doing something, because what it does can be fully undone by a subsequent shift code, it can still also serve the same function as DEL in ASCII - correcting errors by punching over the character involved. 00000 is the code that is present on the blank leader of tape. So it shouldn't "do something" irrevocable like advancing the carriage one space. It should be allowed to perform the function of NUL in ASCII, which it could back when it was used as the third-language shift.

But the idea of having both a "letters space" and a "third space" so that there is a constant reminder of the state is useful and important - the vulnerability of stateful five-level code is the very issue I'm trying to address.

Well, there _is_ another shift code already present. So I propose to change the assignment of the FIGS shift code from its present value to the all-zeroes code. But instead of "third space" getting the existing FIGS shift code, since the space is the most common character, it should get a code with only one bit set, like the code for the regular space, here used for "letters space". So I propose that "third space" should get the existing code for *carriage return*, with carriage return getting the existing FIGS shift code.

It's unfortunate that the existing assignments of two characters are changed in an incompatible manner, but this allows the revised code to be faithful to the original rationale behind the design of the ITA 2 code.

To be specific: Since the upper case and lower case codes are _within_ the figures shift, the letters shift, as well as SI and SO, must not affect the upper/lower case shift.
Figures shift, on the other hand, could always proceed to lower case within the figures case, so as to go directly to the digits and the most common punctuation marks. SI goes to the "third shift" language, and SO returns from it to the Latin alphabet. The characters for the figures case may also be different in the third shift language, not just the ones in the letters case.

John Savard
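The garble problem and the "unshift on space" remedy described in this post can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the code values follow the common ITA 2 listing with the first-sent bit taken as the least significant bit, only five letters are included, and the names `decode` and `unshift_on_space` are made up for this example.

```python
# Partial ITA 2 table (assumed bit order: first-sent bit = least significant).
LTRS, FIGS, SPACE = 0x1F, 0x1B, 0x04
LETTERS = {0x01: "E", 0x03: "A", 0x0C: "N", 0x10: "T", 0x18: "O"}
FIGURES = {0x01: "3", 0x03: "-", 0x0C: ",", 0x10: "5", 0x18: "9"}


def decode(codes, unshift_on_space=False):
    """Decode 5-bit codes, tracking the LTRS/FIGS shift state."""
    figures = False
    out = []
    for code in codes:
        if code == LTRS:
            figures = False
        elif code == FIGS:
            figures = True
        elif code == SPACE:
            out.append(" ")
            if unshift_on_space:
                figures = False  # the space doubles as a "letters space"
        else:
            table = FIGURES if figures else LETTERS
            out.append(table.get(code, "?"))  # "?" marks codes outside the subset
    return "".join(out)


clean = [LTRS, 0x01, 0x03, 0x10, SPACE, 0x0C, 0x18, 0x10, 0x01]
# A single flipped bit turns LTRS (11111) into FIGS (11011).
garbled = [FIGS] + clean[1:]

print(decode(clean))                           # EAT NOTE
print(decode(garbled))                         # 3-5 ,953
print(decode(garbled, unshift_on_space=True))  # 3-5 NOTE
```

Without unshift on space the single-bit error corrupts the rest of the message; with it, the damage stops at the next space.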
#2
At this point, "letters space" and "third space" both refresh the FIGS/LTRS toggle and the SI/SO toggle. But what about the UC/LC toggle? This could be regarded as less critical, of course, since if it is scrambled, messages wouldn't be turned into gibberish quite as badly. But since there is one position left in the figures shift, perhaps something could be done.

One possibility would be to change the "figures space" character to "lower figures space", and use figs-B, the unused position, as "upper figures space". However, it isn't that often that figures space would be used in normal text, since it's a space in figures shift, so it would be a space between one group of digits or punctuation marks and another.

So another possibility that seems to work well for the most common case might be this: Leave figs-J as "figures space" which prints a space without leaving figures shift, affecting no other shifts. Make figs-B "upper letters space", which prints a space, goes to UC shift, and goes to LTRS shift, without affecting the SI/SO toggle, unlike both "letters space", which sets SO, and "third space", which sets SI. And have both letters space and third space set LC shift.

In this way, after going to figures shift to print a period, one could exit figures shift with an upper letters space to start a new sentence... and after going to figures shift to print a comma, one exits figures shift with a normal letters space (or a normal third space, if in the alternate alphabet) so that the next word is in lower case.

John Savard
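The effect of the four kinds of space on the three toggles can be summarized as a small state machine. The Python below is a sketch of the second possibility as described in this post, not a definitive specification: the names `ShiftState` and `apply_space` and the string labels are hypothetical, and no 5-bit code values are assumed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ShiftState:
    figures: bool = False  # FIGS/LTRS toggle (True = figures shift)
    upper: bool = False    # UC/LC toggle (True = upper case)
    third: bool = False    # SI/SO toggle (True = alternate alphabet, SI)


def apply_space(state, kind):
    """Return the shift state after one of the four space characters.

    Every kind prints a space; only the side effects on the toggles differ.
    """
    if kind == "letters_space":        # LTRS shift, sets SO, sets LC
        return replace(state, figures=False, upper=False, third=False)
    if kind == "third_space":          # LTRS shift, sets SI, sets LC
        return replace(state, figures=False, upper=False, third=True)
    if kind == "upper_letters_space":  # LTRS shift, sets UC, SI/SO untouched
        return replace(state, figures=False, upper=True)
    if kind == "figures_space":        # changes no shift at all
        return state
    raise ValueError(f"unknown space kind: {kind}")


# After a period typed in figures shift, start a new sentence in upper case.
after_period = ShiftState(figures=True, upper=False, third=False)
print(apply_space(after_period, "upper_letters_space"))
# After a comma, continue in lower case.
print(apply_space(after_period, "letters_space"))
```

Whatever state a garble leaves behind, a letters space or third space forces all three toggles to known values, which is the "constant reminder of the state" the proposal is after.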
#3
As the value of five-level code is that it is more compact than the 7-bit ASCII code, it is important to minimize the number of times one has to use two characters to do what ASCII can do with one.

So, upon reflection, I think I have not been bold enough in one area. I now think that UC and LC should be swapped with CR and LF, thus relegating carriage return and line feed to the figures shift, but giving two full five-bit codes to UC and LC, as they are likely to be used frequently.

John Savard
#4
On Tuesday, March 14, 2017 at 10:40:43 AM UTC-6, John Savard wrote:
> I now think that UC and LC should be swapped with CR and LF, thus relegating carriage return and line feed to the figures shift, but giving two full five-bit codes to UC and LC, as they are likely to be used frequently.

But if I do that, then, while the code is now changed a great deal, since UC and LC are shift codes, one of them can be given the 00000 code, and figures shift can be returned to its original position.

John Savard
#5
Upon drawing the diagram to illustrate the bottom of this web page

http://www.quadibloc.com/crypto/mi6133.htm

which describes some further thoughts on this matter, I realized that I had miscounted the number of codes available, and had one fewer available code in figures shift than I thought.

John Savard
#6
On Tue, 14 Mar 2017, John Savard wrote:
> In searching for information on the Web about versions of ITA 2 as modified for different languages, I came across a web page which noted that, because of the stateful nature of 5-level code... because a garble can lead to a mistaken shift into FIGS case... 5-level code is quite rightly something that ought to be relegated to the past.

But why should five-bit code be saved?

It made sense when it was the only thing allowed on the ham bands, and there were those cheap Teletype machines offered by various groups. But relatively few have those old machines around, and would they be compatible with any modified system? For the rest, they are using computers, and then does it really matter which one is being used, 5-bit or ASCII?

There was a time when the Deaf used Baudot machines to communicate over the phone lines, influenced by ham RTTY, but once they went to electronic machines, the only reason to stay with 5-bit was compatibility. By now, I doubt anyone is using a mechanical teletypewriter for that, so there's no real compulsion to stay with 5 bits. Indeed, the ASCII world opens everything up, suddenly the Deaf can talk not only among themselves, but to the world in general (now that everyone has computers).

Michael
#7
On Saturday, March 25, 2017 at 11:53:32 AM UTC-6, Michael Black wrote:
> But why should five-bit code be saved?

That's true - if one is concerned with saving bandwidth, which will still be a concern in some applications, no matter how technology advances - one could always use compression algorithms like those used for ZIP files, or, if simplicity is an issue, a static Huffman code.

John Savard
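For readers unfamiliar with the term, a static Huffman code uses one fixed table, built in advance from representative text and shared by sender and receiver. The following Python sketch shows the idea; the sample sentence, the message, and the function names are assumptions chosen for illustration, and a real table would be built from a much larger corpus.

```python
import heapq
from collections import Counter


def huffman_code(freqs):
    """Build a prefix code {symbol: bitstring} from symbol weights."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    # Heap entries: (weight, tie-breaker, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing 0 and 1 respectively.
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_encode(text, code):
    return "".join(code[ch] for ch in text)


def huffman_decode(bits, code):
    inverse = {b: s for s, b in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return "".join(out)


# "Static": the table is fixed ahead of time from sample text.
SAMPLE = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"
TABLE = huffman_code(Counter(SAMPLE))

message = "THE DOG"
bits = huffman_encode(message, TABLE)
print(len(bits), "bits, versus", 5 * len(message), "bits at 5 bits per character")
print(huffman_decode(bits, TABLE))
```

Unlike a shifted 5-level code, such a table carries no shift state, although a bit error can still desynchronize the decoder until it falls back onto a code boundary.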
#8
John Savard wrote:
> On Saturday, March 25, 2017 at 11:53:32 AM UTC-6, Michael Black wrote:
>> But why should five-bit code be saved?
>
> That's true - if one is concerned with saving bandwidth, which will still be a concern in some applications, no matter how technology advances - one could always use compression algorithms like those used for ZIP files, or, if simplicity is an issue, a static Huffman code.

5-level codes are typically used with FSK or AFSK modulation so there is a lot of efficiency to be gained by implementing a more efficient modulation method. When doing that, you can do some compression as well.

Wait. This has already been done! It is called PSK31.
#9
On 30/03/2017 15:01, Rob wrote:
> 5-level codes are typically used with FSK or AFSK modulation so there is a lot of efficiency to be gained by implementing a more efficient modulation method. When doing that, you can do some compression as well.

GMSK?
#10
Gareth's Downstairs Computer wrote:
> On 30/03/2017 15:01, Rob wrote:
>> 5-level codes are typically used with FSK or AFSK modulation so there is a lot of efficiency to be gained by implementing a more efficient modulation method. When doing that, you can do some compression as well.
>
> GMSK?

No, not by far. Typical RTTY is sent with 170 Hz shift FSK at a rate of 50 baud, so well above GMSK.
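One plausible reading of "well above GMSK" is a comparison of modulation index, the ratio of frequency shift to symbol rate. GMSK is minimum-shift keying with Gaussian filtering, and its modulation index is 0.5. The short Python calculation below works through the numbers from the post under that reading; the function name is hypothetical.

```python
def modulation_index(shift_hz, baud):
    """Modulation index h = frequency shift / symbol rate."""
    return shift_hz / baud


MSK_INDEX = 0.5  # minimum-shift keying, the basis of GMSK

rtty_index = modulation_index(170.0, 50.0)
print(rtty_index)              # 3.4
print(rtty_index > MSK_INDEX)  # True
```

On this reading, RTTY with the figures quoted spaces its two tones several times wider than minimum-shift keying would for the same baud rate.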