Emacs Documentation

28.4 Find Identifier References

An identifier is a name of a syntactical subunit of the program: a function, a subroutine, a method, a class, a data type, a macro, etc. In a programming language, each identifier is a symbol in the language’s syntax. Identifiers are also known as tags.

Program development and maintenance requires capabilities to quickly find where each identifier was defined and referenced, to rename identifiers across the entire project, etc. These capabilities are also useful for finding references in major modes other than those defined to support programming languages. For example, chapters, sections, appendices, etc. of a text or a TeX document can be treated as subunits as well, and their names can be used as identifiers. In this chapter, we use the term “identifiers” to collectively refer to the names of any kind of subunits, in program source and in other kinds of text alike.

Emacs provides a unified interface to these capabilities, called ‘ xref’.

To do its job, xref needs to make use of information and to employ methods specific to the major mode. What files to search for identifiers, how to find references to identifiers, how to complete on identifiers—all this and more is mode-specific knowledge. xref delegates the mode-specific parts of its job to a backend provided by the mode; it also includes defaults for some of its commands, for those modes that don’t provide their own.

A backend can implement its capabilities in a variety of ways. Here are a few examples:

Some major modes provide built-in means for looking up the language symbols. For example, Emacs Lisp symbols can be identified by searching the package load history, maintained by the Emacs Lisp interpreter, and by consulting the built-in documentation strings; the Emacs Lisp mode uses these facilities in its backend to allow finding definitions of symbols. (One disadvantage of this kind of backend is that it only knows about subunits that were loaded into the interpreter.)
An external program can extract references by scanning the relevant files, and build a database of these references. A backend can then access this database whenever it needs to list or look up references. The Emacs distribution includes etags, a command for tagging identifier definitions in programs, which supports many programming languages and other major modes, such as HTML, by extracting references into tags tables. See Creating Tags Tables. Major modes for languages supported by etags can use tags tables as basis for their backend. (One disadvantage of this kind of backend is that tags tables need to be kept reasonably up to date, by rebuilding them from time to time.)