22.17 Unibyte Editing Mode
The ISO 8859 Latin-n character sets define character codes in
the range 0240 to 0377 octal (160 to 255 decimal) to handle the
accented letters and punctuation needed by various European languages
(and some non-European ones). Note that Emacs considers bytes with
codes in this range as raw bytes, not as characters, even in a unibyte
buffer, i.e., if you disable multibyte characters. However, Emacs can
still handle these character codes as if they belonged to one
of the single-byte character sets at a time. To specify which
of these codes to use, invoke M-x set-language-environment
and
specify a suitable language environment such as ‘ Latin-n
’.
See Disabling Multibyte Characters in GNU Emacs Lisp Reference Manual.
Emacs can also display bytes in the range 160 to 255 as readable
characters, provided the terminal or font in use supports them. This
works automatically. On a graphical display, Emacs can also display
single-byte characters through fontsets, in effect by displaying the
equivalent multibyte characters according to the current language
environment. To request this, set the variable
unibyte-display-via-language-environment
to a non- nil
value. Note that setting this only affects how these bytes are
displayed, but does not change the fundamental fact that Emacs treats
them as raw bytes, not as characters.
If your terminal does not support display of the Latin-1 character
set, Emacs can display these characters as ASCII sequences which at
least give you a clear idea of what the characters are. To do this,
load the library iso-ascii
. Similar libraries for other
Latin-n character sets could be implemented, but have not been
so far.
Normally non-ISO-8859 characters (decimal codes between 128 and 159
inclusive) are displayed as octal escapes. You can change this for
non-standard extended versions of ISO-8859 character sets by using the
function standard-display-8bit
in the disp-table
library.
There are two ways to input single-byte non-ASCII characters:
-
You can use an input method for the selected language environment. See Input Methods. When you use an input method in a unibyte buffer, the non-ASCII character you specify with it is converted to unibyte.
-
If your keyboard can generate character codes 128 (decimal) and up, representing non-ASCII characters, you can type those character codes directly.
On a graphical display, you should not need to do anything special to
use these keys; they should simply work. On a text terminal, you
should use the command M-x set-keyboard-coding-system
or
customize the variable keyboard-coding-system
to specify which
coding system your keyboard uses (see Coding Systems for Terminal I/O). Enabling
this feature will probably require you to use ESC
to type Meta
characters; however, on a console terminal or a terminal emulator such
as xterm
, you can arrange for Meta to be converted to ESC
and still be able to type 8-bit characters present directly on the
keyboard or using Compose
or AltGr
keys. See Kinds of User Input.
- You can use the key
C-x 8
as a compose-character prefix for entry of non-ASCII Latin-1 and a few other printing characters.C-x 8
is good for insertion (in the minibuffer as well as other buffers), for searching, and in any other context where a key sequence is allowed.
C-x 8
works by loading the iso-transl
library. Once that
library is loaded, the Alt
modifier key, if the keyboard has
one, serves the same purpose as C-x 8
: use Alt
together
with an accent character to modify the following letter. In addition,
if the keyboard has keys for the Latin-1 dead accent characters,
they too are defined to compose with the following character, once
iso-transl
is loaded.
Use C-x 8 C-h
to list all the available C-x 8
translations.