 
For a list of BASHing data 2 blog posts see the index page.    
Beware these characters in a terminal
Back in 2018, I showed in a BASHing data post that printing a Single Character Introducer (SCI) in a terminal emulator could unexpectedly generate a mass of "c62;" strings followed by a hang (you didn't return to a prompt). This strange result appeared in gnome-terminal and xfce4-terminal, but not in xterm. An IT security professional, Timothy Bolton, wrote to me in 2018 that
printf "\x1b\x5b\x35\x63"
likewise pumped out "c62;", and although I could confirm it at the time, I can't now. Instead, in the terminal emulators I've tried in 2025, the SCI simply hides the next character, and Bolton's string does nothing. Here's how that looks in my usual emulator, sakura (3.8.6); I get the same result with xfce4-terminal (1.1.3):
 
I wrote "hides" because the next character isn't really deleted, as can be seen in a text editor:
 
I have no idea why the 2018 and 2025 results are different. In both years my machines were running Debian stable as a base with the en_AU.UTF-8 locale.
The following table shows what happens when I print each of the C0 and C1 control characters within the string AB[control character]CD\n in sakura.
| Acronym | Name | Unicode | Effect | 
| C0 | |||
| NUL | Null | U+0000 | none | 
| SOH | Start of heading | U+0001 | none | 
| STX | Start of text | U+0002 | none | 
| ETX | End of text | U+0003 | none | 
| EOT | End of transmission | U+0004 | none | 
| ENQ | Enquiry | U+0005 | none | 
| ACK | Acknowledge | U+0006 | none | 
| BEL | Bell | U+0007 | none | 
| BS | Backspace | U+0008 | hides previous character | 
| HT | Horizontal tabulation | U+0009 | adds tab character | 
| LF | Line feed | U+000A | starts new line with next character | 
| VT | Vertical tabulation | U+000B | starts new line with next character at its previous horizontal position | 
| FF | Form feed | U+000C | starts new line with next character at its previous horizontal position | 
| CR | Carriage return | U+000D | adds newline, hides preceding characters | 
| SO | Shift out | U+000E | none | 
| SI | Shift in | U+000F | none | 
| DLE | Data link escape | U+0010 | none | 
| DC1 | Device control 1 | U+0011 | none | 
| DC2 | Device control 2 | U+0012 | none | 
| DC3 | Device control 3 | U+0013 | none | 
| DC4 | Device control 4 | U+0014 | none | 
| NAK | Negative acknowledge | U+0015 | none | 
| SYN | Synchronous idle | U+0016 | none | 
| ETB | End transmission block | U+0017 | none | 
| CAN | Cancel | U+0018 | none | 
| EM | End of medium | U+0019 | none | 
| SUB | Substitute | U+001A | appears as � | 
| ESC | Escape | U+001B | hides next character | 
| FS | File separator | U+001C | none | 
| GS | Group separator | U+001D | none | 
| RS | Record separator | U+001E | none | 
| US | Unit separator | U+001F | none | 
| C1 | |||
| PAD | Padding character | U+0080 | none | 
| HOP | High octet preset | U+0081 | none | 
| BPH | Break permitted here | U+0082 | none | 
| NBH | No break here | U+0083 | none | 
| IND | Index | U+0084 | starts new line with next character at its previous horizontal position | 
| NEL | Next line | U+0085 | starts new line with next character | 
| SSA | Start of selected area | U+0086 | none | 
| ESA | End of selected area | U+0087 | none | 
| HTS | Horizontal tabulation set | U+0088 | none | 
| HTJ | Horizontal tabulation with justification | U+0089 | adds tab character | 
| VTS | Vertical tabulation set | U+008A | none | 
| PLD | Partial line down | U+008B | none | 
| PLU | Partial line up | U+008C | none | 
| RI | Reverse index | U+008D | rewrites entry (see below) | 
| SS2 | Single shift 2 | U+008E | none | 
| SS3 | Single shift 3 | U+008F | none | 
| DCS | Device control string | U+0090 | hides next characters, including newline | 
| PU1 | Private use 1 | U+0091 | none | 
| PU2 | Private use 2 | U+0092 | none | 
| STS | Set transmit state | U+0093 | none | 
| CCH | Cancel character | U+0094 | none | 
| MW | Message waiting | U+0095 | none | 
| SPA | Start protected area | U+0096 | none | 
| EPA | End protected area | U+0097 | none | 
| SOS | Start of string | U+0098 | hides next characters, including newline | 
| SGCI | Single graphic character introducer | U+0099 | none | 
| SCI | Single character introducer | U+009A | hides next character | 
| CSI | Control sequence introducer | U+009B | inserts itself invisibly, hides next character | 
| ST | String terminator | U+009C | none | 
| OSC | Operating system command | U+009D | hides next characters, including newline | 
| PM | Privacy message | U+009E | hides next characters, including newline | 
| APC | Application program command | U+009F | hides next characters, including newline | 
I worry about the characters with something other than "none" for an effect, because I do data auditing in a terminal, and I've found almost all the C0 and C1 characters lurking in text files at one time or another. Most of the time they're part of mojibake arising from a failed encoding conversion. I can't simply delete the control characters — I need to work out what the original characters were before the mojibake-ing.
Below I show the "effective" C1 characters at work, except RI:
 
This is, of course, only how the strings look in a terminal. The NEL control character, for example, doesn't really generate a newline, as shown in a text editor:
 
RI is a special and very weird case. The two screenshots below show what my terminal looks like before and after pressing Enter:
 
Once again, the output looks fine in a text editor if saved as "ri-demo":
 
Summing up. I haven't tried every modern terminal emulator, but the ones I've looked at will all interpret a C1 control character as though they were ANSI-standard teletype machines. This is annoying behaviour when working with text files in a terminal. In contrast, neither a TTY nor the venerable xterm pay attention to the meaning of the C1 control characters:
 
Next post:
2025-06-27   Five ways to pass a shell variable to AWK
Last update: 2025-06-20
The blog posts on this website are licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License