22.9 Specifying a Coding System for File Text
In cases where Emacs does not automatically choose the right coding system for a file’s contents, you can use these commands to specify one:
C-x RET f coding RET
Use coding system coding to save or revisit the file in
the current buffer ( set-buffer-file-coding-system
).
C-x RET c coding RET
Specify coding system coding for the immediately following
command ( universal-coding-system-argument
).
C-x RET r coding RET
Revisit the current file using the coding system coding
( revert-buffer-with-coding-system
).
M-x recode-region RET right RET wrong RET
Convert a region that was decoded using coding system wrong, decoding it using coding system right instead.
The command C-x RET f
( set-buffer-file-coding-system
) sets the file coding system for
the current buffer (i.e., the coding system to use when saving or
reverting the file). You specify which coding system using the
minibuffer. You can also invoke this command by clicking with
mouse-3
on the coding system indicator in the mode line
(see The Mode Line).
If you specify a coding system that cannot handle all the characters in the buffer, Emacs will warn you about the troublesome characters, and ask you to choose another coding system, when you try to save the buffer (see Choosing Coding Systems for Output).
You can also use this command to specify the end-of-line conversion
(see end-of-line conversion) for encoding the
current buffer. For example, C-x RET f dos RET
will
cause Emacs to save the current buffer’s text with DOS-style
carriage return followed by linefeed line endings.
Another way to specify the coding system for a file is when you visit
the file. First use the command C-x RET c
( universal-coding-system-argument
); this command uses the
minibuffer to read a coding system name. After you exit the minibuffer,
the specified coding system is used for the immediately following
command.
So if the immediately following command is C-x C-f
, for example,
it reads the file using that coding system (and records the coding
system for when you later save the file). Or if the immediately following
command is C-x C-w
, it writes the file using that coding system.
When you specify the coding system for saving in this way, instead
of with C-x RET f
, there is no warning if the buffer
contains characters that the coding system cannot handle.
Other file commands affected by a specified coding system include
C-x i
and C-x C-v
, as well as the other-window variants
of C-x C-f
. C-x RET c
also affects commands that
start subprocesses, including M-x shell
(see Running Shell Commands from Emacs). If the
immediately following command does not use the coding system, then
C-x RET c
ultimately has no effect.
An easy way to visit a file with no conversion is with the M-x find-file-literally
command. See Visiting Files.
The default value of the variable buffer-file-coding-system
specifies the choice of coding system to use when you create a new file.
It applies when you find a new file, and when you create a buffer and
then save it in a file. Selecting a language environment typically sets
this variable to a good choice of default coding system for that language
environment.
If you visit a file with a wrong coding system, you can correct this
with C-x RET r
( revert-buffer-with-coding-system
).
This visits the current file again, using a coding system you specify.
If a piece of text has already been inserted into a buffer using the
wrong coding system, you can redo the decoding of it using M-x recode-region
. This prompts you for the proper coding system, then
for the wrong coding system that was actually used, and does the
conversion. It first encodes the region using the wrong coding system,
then decodes it again using the proper coding system.