Programming and the Python interpreter

Published: 2023-08-02

Why learn a programming language?

Computers are powerful tools. Computers can perform all manner of tasks: communication, computation, managing and manipulating data, modeling natural phenomena, and creating images, videos, and music, just to name a few. However, computers don’t read minds (yet), and thus we have to provide instructions to computers so they can perform these tasks.

Computers don’t speak natural languages (yet)—they only understand binary code. Binary code is unreadable by humans.

For example, a portion of an executable program might look like this (in binary):

0110101101101011 1100000000110101 1011110100100100
1010010100100100 0010100100010011 1110100100010101 
1110100100010101 0001110110000000 1110000111100000 
0000100000000001 0100101101110100 0000001000101011
0010100101110000 0101001001001001 1010100110101000

This is unintelligible. It’s bad enough to try to read it, and it would be even worse if we had to write our computer programs in this fashion.

Computers don’t speak human language, and humans don’t speak computer language. That’s a problem. The solution is programming languages.

Programming languages allow us, as humans, to write instructions in a form we can understand and reason about, and then have these instructions converted into a form that a computer can read and execute.

There is a tremendous variety of programming languages. Some languages are low-level, like assembly language, where there’s roughly a one-to-one correspondence between machine instructions and assembly language instructions. Here’s a “Hello World!” program in assembly language (for the ARM64 architecture) [1]:

.equ STDOUT, 1
.equ SVC_WRITE, 64
.equ SVC_EXIT, 93
 
.text
.global _start
 
_start:
    stp x29, x30, [sp, -16]!
    mov x0, #STDOUT
    ldr x1, =msg
    mov x2, 13
    mov x8, #SVC_WRITE
    mov x29, sp
    svc #0 // write(stdout, msg, 13);
    ldp x29, x30, [sp], 16
    mov x0, #0
    mov x8, #SVC_EXIT
    svc #0 // exit(0);
 
msg:    .ascii "Hello World!\n"
.align 4

Now, while this is a lot better than a string of zeros and ones, it’s not so easy to read, write, and reason about code in assembly language.

Fortunately, we have high-level languages. Here’s the same program in C++:

#include <iostream>
 
int main () {
  std::cout << "Hello World!" << std::endl;
}

Much better, right?

In Python, the same program is even more succinct:

print('Hello World!')

Notice that as we progress from machine code to Python, we’re increasing abstraction. Machine code is the least abstract. These are the actual instructions executed on your computer. Assembly code uses human-readable symbols, but still retains (for the most part) a one-to-one correspondence between assembly instructions and machine instructions. In the case of C++, we’re using a library iostream to provide us with an abstraction of an output stream, std::cout, and we’re just sending strings to that stream. In the case of Python, we simply say “print this string” (more or less). This is the most abstract of these examples—we needn’t concern ourselves with low-level details.

Figure 1: Increasing abstraction
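To make these layers of abstraction a little more concrete, here is a minimal sketch that produces the same text at three levels from within Python itself. The variable names are illustrative, and the sketch assumes that file descriptor 1 refers to standard output, as it conventionally does.

```python
import os
import sys

message = 'Hello World!\n'

# Highest level: print() takes care of the newline and the output stream.
print('Hello World!')

# One level down: write the text to the standard output stream ourselves.
sys.stdout.write(message)
sys.stdout.flush()  # emit any buffered text before the raw write below

# Lower still: hand raw bytes to the operating system's write call on
# file descriptor 1 (assumed to be standard output), much like
# write(stdout, msg, 13) in the assembly listing.
written = os.write(1, message.encode('ascii'))
```

Each of the three calls emits the same line. The last one returns the number of bytes written, which is 13 here: the same length the assembly program has to supply by hand.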

Now, you may be wondering: How is it that we can write programs in such languages when computers only understand zeros and ones? There are programs that convert high-level code into machine code for execution. There are two main approaches when dealing with high-level languages: compilation and interpretation.

Compilation and interpretation

Generally speaking, compilation is a process whereby source code in some programming language is converted into binary code for execution on a particular architecture. The program which performs this conversion is called a compiler. The compiler takes source code (in some programming language) as an input, and yields binary machine code as an output.

Figure 2: Compilation (simplified)

Interpreted languages work a little differently. Python is an interpreted language. In the case of Python, intermediate code is generated, and then this intermediate code is read and executed by another program. The intermediate code is called bytecode.

While the difference between compilation and interpretation is not quite as clear-cut as suggested here, these descriptions will serve for the present purposes.
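If you are curious what bytecode looks like, Python's standard dis module can list it. The following is only a sketch for exploration: the variable names are illustrative, and the instruction names it prints differ from one Python version to another, so no fixed output is shown.

```python
import dis

source = "print('Hello World!')"

# dis.Bytecode compiles the source text and lets us walk over the
# resulting bytecode instructions one at a time.
instructions = [instruction.opname for instruction in dis.Bytecode(source)]

# Print the name of each instruction, in order.
for name in instructions:
    print(name)
```

Whatever the version, the listing is short: roughly, load the name print, load the string, call the function, and return.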

The Python interpreter

Python is an interpreted language with intermediate bytecode. While you don’t need to understand all the details of this process, it’s helpful to have a general idea of what’s going on.

Say you have written this program and saved it as hello_world.py.

print('Hello World!')

You may run this program from the terminal (command prompt), thus:

$ python hello_world.py

where $ indicates a command prompt (your prompt may vary). When this runs, the following is printed to the console:

Hello World!

When we run this program, Python first reads the source code, then produces the intermediate bytecode, then executes each instruction in the bytecode.

Figure 3: Execution of a Python program
  1. By issuing the command python hello_world.py, we invoke the Python interpreter and tell it to read and execute the program hello_world.py (.py is the file extension used for Python files).
  2. The Python interpreter reads the file hello_world.py.
  3. The Python interpreter produces an intermediate, bytecode representation of the program in hello_world.py.
  4. The bytecode is executed by the Python Virtual Machine.
  5. This results in the words “Hello World!” being printed to the console.

So you see, there’s a lot going on behind the scenes when we run a Python program [2]. However, all of this is what allows us to write programs in a high-level language that we as humans can understand.
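The numbered steps above can be imitated, roughly, from within Python using the built-in functions compile() and exec(). This is a simplified sketch rather than a description of what the interpreter does internally. The temporary folder and the captured output are assumptions made so that the example is self-contained.

```python
import contextlib
import io
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create hello_world.py in a temporary folder so the example is self-contained.
    path = pathlib.Path(folder) / 'hello_world.py'
    path.write_text("print('Hello World!')\n")

    # Step 2: read the source file.
    source = path.read_text()

    # Step 3: produce the intermediate bytecode (held in a code object).
    code = compile(source, str(path), 'exec')

    # Step 4: execute the bytecode, capturing what it prints.
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        exec(code, {'__name__': '__main__'})

# Step 5: the program's output.
output = captured.getvalue()
print(output, end='')
```

Running the sketch prints Hello World!, just as running python hello_world.py does.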

Supplemental reading

  • Whetting Your Appetite, from The (Official) Python Tutorial [3].

Original author: Clayton Cafiero < [given name] DOT [surname] AT uvm DOT edu >

No generative AI was used in producing this material. This was written the old-fashioned way.

This material is for free use under either the GNU Free Documentation License or the Creative Commons Attribution-ShareAlike 3.0 United States License (take your pick).

Footnotes

  1. Assembly language code sample from Rosetta Code: https://www.rosettacode.org/wiki/Hello_world

  2. Actually, there’s quite a bit more going on behind the scenes, but this should suffice for our purposes. If you’re curious and wish to learn more, ask!

  3. https://docs.python.org/release/3.10.4/tutorial/appetite.html