Interpreter (computing)

This article deals with the general term interpreter in software engineering. For other meanings, see Interpreter (disambiguation).

This article is not sufficiently supported by evidence (e.g., anecdotal evidence). Information without sufficient evidence may be removed soon. Please help Wikipedia by researching the information and adding good evidence.

The article, especially historical information and subjective statements should be accompanied by evidence.

An interpreter is a computer program that appears to execute a sequence of instructions directly, given the format of the instructions. The interpreter does this by reading one or more source files, analyzing them, and then executing them instruction by instruction by translating them into machine code that a computer system can execute directly. Interpreters are significantly slower than compilers, but generally provide better error analysis.

Interpreters are used for programming languages as well as for computer programs.

Usage

Programming

In programming, an interpreter is almost always a part of software development.

In their pure form, compilers - in contrast to interpreters - translate the instructions from the source files in one or more passes into machine code for a previously specified target system and thus create an executable computer program. However, there is already a distinction here between compiler-compilers and interpreter-compilers, just as there are interpreter-interpreters and compiler-interpreters.

"Any good software engineer will tell you that a compiler and an interpreter are interchangeable."

"Any good software developer will tell you that compilers and interpreters are interchangeable."

- Tim Berners-Lee: Torben Ægidius Mogensen: Introduction to Compiler Design. Springer Science & Business Media, London 2011, ISBN 978-0-85729-828-7 (limited preview in Google Book Search).

If the last stage is an interpreter, the translation of the source file is done at runtime of the program.

Programming languages that do not compile source code but always interpret an input or a source file are also called "interpreter language" or scripting language. Classic interpreter languages are, for example, BASIC such as GW-BASIC, Tcl or JavaScript. In some other programming languages, a programmer can choose between interpreter and compiler. With some programming languages, a bytecode is also generated as intermediate code, which is already optimized, but again requires an interpreter on the target system for execution. In addition, there are externally developed compilers for actually pure interpreter languages.

Computer programs

Command line interpreter scripts, such as batch files or Unix shell scripts, are also executed by an interpreter. To avoid having to specify the script as a command-line parameter, Unix-like systems and shells have what is known as shebang-the script thus calls the appropriate interpreter itself, so to speak, with the help of the shell.

Computer programs are also referred to as interpreters as soon as the code cannot or should not be executed directly by the computer system. This is also the case with emulators, among others, which analyze machine code for other computer systems, rewrite it and execute it interpreted for the computer system on which they are currently running. Virtual machines, however, do not count as such, since they directly execute large portions of the guest system's machine code on the host system uninterpreted. Game engines can also be interpreters if they execute the actual game data, usually as bytecode, interpreted on the respective platform.

Properties

Interpreters are usually available in the machine language of the target processor, but can also be available in an interpreter language themselves. The biggest disadvantage is the lower execution speed compared to a compiler. This is due to the fact that the compiler can take the time during the compilation process to optimize the code, which is thus executed faster on the respective target system. However, such optimizations are time-consuming, so that an interpreter usually performs a direct conversion to machine code, which is, however, slower again in total than the optimized code by the compiler.

Interpreted code is about five to 20 times slower than compiled code.

One of the advantages of interpreted code, besides the better error analysis, is the independence from a predefined computer architecture - because interpreted code runs on any system that has an interpreter for it.

Speed increases

A compromise solution is a just-in-time compiler (JIT compiler), in which the program is not translated until runtime, but directly into machine code. Afterwards, the translated code is executed directly by the processor. By caching the machine code, program sections that have been run several times only have to be translated once. Also, the JIT compiler allows for more optimization of the binary code. However, JIT compilers can only run on a specific computer architecture because they generate machine code for that architecture, and they require far more memory than pure interpreters.

Intermediate code

Bytecode interpreters are another intermediate stage. Here, the source code is translated (in advance or at runtime) into a simple intermediate code, which is then executed by an interpreter - also often referred to as a virtual machine. In Java, for example, this is done by the Java Virtual Machine (JVM). It corresponds to the compiler-interpreter concept, since the intermediate code has already been compiled in parts in an optimized way (source code → compiler → intermediate code as bytecode → interpreter → execution on the target system).

Especially in the 1980s, people used the intermediate stage of converting commands at input time into more easily decodable tokens, which were converted back into plain text at (list) output. Besides the speed increase, the compression of the source code was a weighty argument. In principle, this also made it possible to use native-language keywords in each case if the data exchange was carried out on the basis of the tokenized source program.

Hybrid forms

Since JIT code is not automatically faster than interpreted code, some runtime environments use a hybrid form. An example of this is the JVM. Here, the JIT compiler is used in parallel with the interpreter, whereby the faster execution path "wins" in each case.