Syntax and semantics

Published

2023-07-31

In this text we’ll talk about syntax and semantics, so it’s important that we understand what these terms mean, particularly in the context of computer programming.

Syntax

In a (natural) language course—say Spanish, Chinese, or Latin—you’d learn about certain rules of syntax, that is, how we arrange words and choose the correct forms of words to produce a valid sentence or utterance. For example, in English,

My hovercraft is full of eels.

is a syntactically valid sentence.[1] While it may or may not be true, and may not even make sense, it is certainly a well-formed English sentence. By contrast, the sequence of words

Is by is is and cheese for

is not a well-formed English sentence. These are examples of valid and invalid syntax. The first is syntactically valid (well-formed); the second is not.

Every programming language has rules of syntax—rules which govern what is and is not a valid statement or expression in the language. For example, in Python

>>> 2 3

is not syntactically valid. If we were to try this using the Python shell, the Python interpreter would complain.

>>> 2 3
  File "<stdin>", line 1
    2 3
    ^^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?

That’s a clear-cut example of a syntax error in Python. Here’s another:

>>> = 5
  File "<stdin>", line 1
    = 5
    ^
SyntaxError: invalid syntax

Python makes it clear when we have syntax errors in our code. Usually it can point to the exact position within a line where such an error occurs. Sometimes, it can even provide suggestions, for example, “Perhaps you forgot a comma?”
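The same well-formedness check can be run from inside a program. What follows is only a minimal sketch: it relies on the standard library's ast.parse, which raises SyntaxError for ill-formed source, and the helper name is_valid_syntax is made up for this illustration.

```python
import ast


def is_valid_syntax(source):
    """Return True if source parses as Python, False otherwise."""
    try:
        ast.parse(source)   # parse only; nothing is executed
    except SyntaxError:
        return False
    return True


print(is_valid_syntax("2 3"))     # False
print(is_valid_syntax("= 5"))     # False
print(is_valid_syntax("2 + 3"))   # True
```

Note that this only tells us whether the text is well-formed. It says nothing about whether the code does what we want.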

Semantics

On the other hand, semantics is about meaning. In English we may say

The ball is red.

We know there’s some object being referred to—a ball—and that an assertion is being made about the color of the ball—red. This is fairly straightforward.

Of course, it’s possible to construct ambiguous sentences in English. For example (with apologies to any vegetarians who may be reading):

The turkey is ready to eat.

Does this mean that someone has cooked a turkey and that it is ready to be eaten? Or does this mean that there’s a hungry turkey who is ready to be fed? This kind of ambiguity is quite common in natural languages. Not so with programming languages. If we’ve produced a syntactically valid statement or expression, the rules of the language give it exactly one “interpretation.” There is no ambiguity in programming.
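As a small illustration, here are a few expressions that a human reader might be tempted to read in more than one way. Python's rules of precedence and associativity settle each of them in exactly one way. The variable names are chosen only for this sketch.

```python
# Multiplication binds more tightly than addition.
no_parens = 2 + 3 * 4        # read as 2 + (3 * 4)
with_parens = (2 + 3) * 4    # parentheses change the meaning

# Exponentiation binds more tightly than unary minus.
negated_power = -2 ** 2      # read as -(2 ** 2)

# Exponentiation groups from right to left.
stacked_power = 2 ** 3 ** 2  # read as 2 ** (3 ** 2)

print(no_parens)       # 14
print(with_parens)     # 20
print(negated_power)   # -4
print(stacked_power)   # 512
```

We may not always guess the interpretation correctly, but there is only ever one.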

Here’s another famous example, devised by the linguist Noam Chomsky:[2]

Colorless green ideas sleep furiously.

This is a perfectly valid English sentence with respect to syntax. However, it is meaningless, nonsensical. How can anything be colorless and green at the same time? How can something abstract like an idea have color? What does it mean to “sleep furiously”? Syntax: A-OK. Semantics: nonsense.

Again, in programming, every syntactically valid statement or expression has a meaning. It is our job as programmers to write code which is syntactically valid but also semantically correct.

What happens if we write something which is syntactically valid but semantically incorrect? It means that we’ve written code that does not do what we intend for it to do. There’s a word for that: a bug.

Here’s an example. Let’s say we know the temperature in degrees Fahrenheit, but we want to know the equivalent in degrees Celsius. You may know the formula

\begin{equation*} C = \frac{F - 32}{1.8} \end{equation*}

where F is degrees Fahrenheit and C is degrees Celsius.

Let’s say we wrote this Python code.

f = 68.0             # 68 degrees Fahrenheit
c = (f - 32) * 1.8   # attempt conversion to Celsius
print(c)             # print the result

This prints 64.8, which is incorrect! What’s wrong? We’re multiplying by 1.8 when we should be dividing by 1.8! This is a problem of semantics. Our code is syntactically valid. Python interprets it, runs it, and produces a result, but the result is wrong. Our code does not do what we intend for it to do. Call it what you will (a defect, an error, a bug), but it’s a semantic error, not a syntactic error.

To fix it, we must change the semantics—the meaning—of our code. In this case the fix is simple.

f = 68.0             # 68 degrees Fahrenheit
c = (f - 32) / 1.8   # correct conversion to Celsius
print(c)             # print the result

Now this prints 20.0, which is correct. Our program has the semantics we intend for it.
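Because Python cannot flag a semantic error for us, one common way to catch such errors is to compare results against values we already know to be right. The following is a sketch under a few assumptions: the function name fahrenheit_to_celsius and the list known_values are invented here, and the reference points are the freezing and boiling points of water plus the example above.

```python
import math


def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) / 1.8


# Known reference points: (Fahrenheit, expected Celsius)
known_values = [(32.0, 0.0), (212.0, 100.0), (68.0, 20.0)]

for f, expected in known_values:
    c = fahrenheit_to_celsius(f)
    # isclose allows for tiny floating-point rounding differences
    assert math.isclose(c, expected, abs_tol=1e-9), \
        f"{f} F gave {c} C, expected {expected} C"

print("all checks passed")
```

Had the function multiplied by 1.8 instead of dividing, the check for 32 would still pass (the result is zero either way), but the check for 212 would fail. This is one reason to test more than a single value.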

Original author: Clayton Cafiero < [given name] DOT [surname] AT uvm DOT edu >

No generative AI was used in producing this material. This was written the old-fashioned way.

This material is for free use under either the GNU Free Documentation License or the Creative Commons Attribution-ShareAlike 3.0 United States License (take your pick).

Footnotes

  1. “My hovercraft is full of eels” originates in a famous sketch by Monty Python’s Flying Circus.

  2. https://en.wikipedia.org/wiki/Noam_Chomsky