Emacs Documentation

14.19 How Text Is Displayed

Most characters are printing characters: when they appear in a buffer, they are displayed literally on the screen. Printing characters include ASCII digits, letters, and punctuation characters, as well as many non-ASCII characters.

The ASCII character set contains non-printing control characters. Two of these are displayed specially: the newline character (Unicode code point U+000A) is displayed by starting a new line, while the tab character (U+0009) is displayed as a space that extends to the next tab stop column (normally every 8 columns). The distance between tab stops is controlled by the buffer-local variable tab-width, which must have an integer value between 1 and 1000, inclusive. Note that the way the tab character in the buffer is displayed has nothing to do with the definition of TAB as a command.
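The tab stop rule can be modeled with a short Python sketch. This is only an illustration of the arithmetic, not how Emacs implements redisplay; the function name expand_tabs is hypothetical, and the sketch assumes every non-tab character occupies exactly one column.

```python
def expand_tabs(line: str, tab_width: int = 8) -> str:
    """Replace each tab with spaces that reach the next tab stop column.

    Assumption: every non-tab character is one column wide.
    """
    if not 1 <= tab_width <= 1000:
        raise ValueError("tab_width must be an integer between 1 and 1000")
    pieces = []
    column = 0
    for ch in line:
        if ch == "\t":
            # Distance to the next multiple of tab_width (always at least 1).
            pad = tab_width - (column % tab_width)
            pieces.append(" " * pad)
            column += pad
        else:
            pieces.append(ch)
            column += 1
    return "".join(pieces)
```

For example, with the default width a tab that follows two characters expands to six spaces, so the next character lands at column 8.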

Other ASCII control characters, whose codes are below U+0020 (octal 40, decimal 32), are displayed as a caret (‘^’) followed by the non-control version of the character, with the escape-glyph face. For instance, the ‘control-A’ character, U+0001, is displayed as ‘^A’.
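The mapping from a control code to its caret form can be sketched in Python as follows. The function name caret_notation is hypothetical, and the sketch models only the text of the notation, not the escape-glyph face; it also ignores the special treatment of newline and tab described earlier.

```python
def caret_notation(code: int) -> str:
    """Return the caret form of an ASCII control code below U+0020."""
    if not 0 <= code < 0x20:
        raise ValueError("expected a control code below U+0020")
    # Setting bit 0x40 yields the non-control version of the character:
    # U+0001 becomes U+0041, which is 'A'.
    return "^" + chr(code ^ 0x40)
```

Under this rule, caret_notation(0x01) gives the two-character string ^A.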

The raw bytes with codes U+0080 (octal 200) through U+009F (octal 237) are displayed as octal escape sequences, with the escape-glyph face. For instance, character code U+0098 (octal 230) is displayed as ‘\230’. If you change the buffer-local variable ctl-arrow to nil, the ASCII control characters are also displayed as octal escape sequences instead of caret escape sequences. (You can also request that raw bytes be shown in hex; see the variable display-raw-bytes-as-hex.)
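As an illustration of the octal form, the following Python sketch builds the escape text for a given code. The name octal_escape is hypothetical, and padding to three octal digits for codes below octal 100 is an assumption made here; the text above only shows a three-digit example.

```python
def octal_escape(code: int) -> str:
    """Return a backslash followed by the code written in octal.

    Assumption: the code is zero-padded to three octal digits.
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("expected a single-byte code")
    return "\\" + format(code, "03o")
```

With this sketch, octal_escape(0x98) yields a backslash followed by 230, matching the example above.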

Some non-ASCII characters have the same appearance as an ASCII space or hyphen (minus) character. Such characters can cause problems if they are entered into a buffer without your noticing, e.g., by yanking; for instance, compilers typically do not treat non-ASCII spaces as whitespace characters. To deal with this problem, Emacs displays such characters specially: it displays U+00A0 NO-BREAK SPACE and other characters from the Unicode horizontal space class with the nobreak-space face, and it displays U+00AD SOFT HYPHEN, U+2010 HYPHEN, and U+2011 NON-BREAKING HYPHEN with the nobreak-hyphen face. To disable this, change the variable nobreak-char-display to nil. If you give this variable a non-nil and non-t value, Emacs instead displays such characters as a highlighted backslash followed by a space or hyphen.
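To see why such characters are easy to miss, the following Python sketch scans a string for them outside Emacs. It is an approximation, not the test Emacs performs: the function name find_lookalikes is hypothetical, and the Unicode general category Zs is used here as a stand-in for the horizontal space class.

```python
import unicodedata

# The three hyphen lookalikes named in the text above.
HYPHEN_LOOKALIKES = {"\u00ad", "\u2010", "\u2011"}


def find_lookalikes(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each space or hyphen lookalike."""
    hits = []
    for index, ch in enumerate(text):
        # Assumption: category Zs approximates the horizontal space class.
        is_space_like = ch != " " and unicodedata.category(ch) == "Zs"
        if is_space_like or ch in HYPHEN_LOOKALIKES:
            hits.append((index, unicodedata.name(ch, "UNKNOWN")))
    return hits
```

A string containing only ASCII spaces and hyphens produces an empty list, while a pasted NO-BREAK SPACE is reported with its position and name.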

You can customize the way any particular character code is displayed by means of a display table. See Display Tables in The Emacs Lisp Reference Manual.

On graphical displays, some characters may have no glyphs in any of the fonts available to Emacs. These glyphless characters are normally displayed as boxes containing the hexadecimal character code. Similarly, on text terminals, characters that cannot be displayed using the terminal encoding (see Coding Systems for Terminal I/O) are normally displayed as question marks. You can control the display method by customizing the variable glyphless-char-display-control. You can also customize the glyphless-char face to make these characters more prominent on display. See Glyphless Character Display in The Emacs Lisp Reference Manual, for details.

Emacs tries to determine if the curved quotes ‘ and ’ can be displayed on the current display. By default, if this seems to be so, then Emacs will translate the ASCII quotes (‘`’ and ‘'’), when they appear in messages and help texts, to these curved quotes. You can influence or inhibit this translation by customizing the user option text-quoting-style (see Keys in Documentation in The Emacs Lisp Reference Manual).

If the curved quotes ‘, ’, “, and ” are known to look just like ASCII characters, they are shown with the homoglyph face. Curved quotes that are known not to be displayable are shown as their ASCII approximations ‘`’, ‘'’, and ‘"’ with the homoglyph face.