Comparison operators

Published: 2023-07-31

It is often the case that we wish to compare two objects or two values. We do this with comparison operators.

Comparison operators compare two objects (or the values of these objects) and return a Boolean True if the comparison holds, and False if it does not.

Python provides us with the following comparison operators (and more):

Operator   Example   Explanation
==         a == b    Does the value of a equal the value of b?
>          a > b     Is the value of a greater than the value of b?
<          a < b     Is the value of a less than the value of b?
>=         a >= b    Is the value of a greater than or equal to the value of b?
<=         a <= b    Is the value of a less than or equal to the value of b?
!=         a != b    Is the value of a not equal to the value of b?

It’s important to understand that these operators perform comparisons, and that expressions which use them evaluate to a Boolean value (True or False).

Let’s demonstrate in the Python shell.

>>> a = 12
>>> b = 31
>>> a == b
False
>>> a > b
False
>>> a < b
True
>>> a >= b
False
>>> a <= b
True
>>> a != b
True
>>> not (a == b)
True
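
Because a comparison is an expression that evaluates to a Boolean, its result can be stored in a variable like any other value. Here is a minimal sketch of that idea; the variable names is_equal and is_less are illustrative assumptions, not part of the session above.

```python
a = 12
b = 31

# A comparison is an expression; its value is a bool object.
is_equal = a == b
is_less = a < b

print(is_equal)       # False
print(is_less)        # True
print(type(is_less))  # <class 'bool'>

# For these values, not (a == b) and a != b give the same answer.
print((not (a == b)) == (a != b))  # True
```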

Now what happens in the case of strings? Let’s try it and find out!

>>> a = 'duck'
>>> b = 'swan'
>>> a == b
False
>>> a > b
False
>>> a < b
True
>>> a >= b
False
>>> a <= b
True
>>> a != b
True
>>> not (a == b)
True

What’s going on here? When we compare strings, we compare them lexicographically. A string is less than another string if it comes earlier in lexicographic order, and greater than another string if it comes later.

What is lexicographic order?

Lexicographic order is like alphabetic order, but is somewhat more general. Consider our example ‘duck’ and ‘swan’. This is an easy case: both are four characters long and they differ in their first letter, so alphabetizing them is straightforward.

But what about ‘a’ and ‘aa’? Which comes first? Both start with ‘a’ so their first character is the same. If you look in a dictionary you’ll find that ‘a’ appears before ‘aa’.[1] Why? Because when comparing strings of different lengths, the comparison is made as if the shorter string were padded with an invisible character which comes before all other characters in the ordering. Hence, ‘a’ comes before ‘aa’ in a lexicographic ordering.

>>> 'a' < 'aa'
True
>>> 'a' > 'aa'
False
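
One way to see this ordering at work is to sort a few short strings. The built-in sorted() orders strings using the same < comparison, so a string that is a prefix of another lands first. This is only a small sketch, and the list below is an illustrative assumption rather than an example from the text.

```python
# Illustrative list of short strings (not from the original text).
words = ['b', 'aa', 'ab', 'a']

# sorted() uses the same ordering as the < operator on strings.
ordered = sorted(words)
print(ordered)  # ['a', 'aa', 'ab', 'b']

# Each string is less than the one that follows it.
print(all(x < y for x, y in zip(ordered, ordered[1:])))  # True
```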

The situation is a little more complex than this, because strings can contain any character, not just letters, so the term “alphabetic order” loses its meaning. What Python actually compares are the code points of Unicode characters. Unicode is the standard that Python uses to represent characters, and it includes many other scripts (Arabic, Armenian, Cyrillic, Greek, Hangul, Hebrew, Devanagari (used for Hindi), Telugu, Thai, etc.), characters from non-alphabetic writing systems such as Chinese characters or Japanese kanji, and many special symbols (®, €, ±, ∞, etc.). Each character has a number associated with it called a code point (yes, this is a bit of a simplification). In comparing strings, Python compares these values.[2]

Thus, 'duck' < 'swan' evaluates to True, 'wing' < 'wings' evaluates to True, and 'bingo' < 'bin' evaluates to False.

>>> 'duck' < 'swan'
True
>>> 'wing' < 'wings'
True
>>> 'bingo' < 'bin'
False
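
The rule described above can be sketched as a short function. This is a hand-written illustration under stated assumptions, not how Python implements string comparison internally: lex_less is a hypothetical helper name, and in real code you would simply write s < t.

```python
def lex_less(s, t):
    """Return True if s comes before t in lexicographic order.

    Hypothetical helper for illustration only; in real code, write s < t.
    """
    # Compare characters pairwise, up to the length of the shorter string.
    for cs, ct in zip(s, t):
        if cs != ct:
            # The first position where the strings differ decides the result.
            return ord(cs) < ord(ct)
    # No difference found: one string is a prefix of the other (or they are
    # equal), so the shorter string comes first.
    return len(s) < len(t)


pairs = [('duck', 'swan'), ('wing', 'wings'), ('bingo', 'bin'), ('a', 'aa')]
for s, t in pairs:
    # The helper should agree with Python's built-in comparison.
    print(s, t, lex_less(s, t), s < t)
```

For each pair, the two printed Boolean values should match, since the helper follows the same first-difference-then-length rule.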

Now, you may wonder what happens in alphabetic systems, like English and modern European languages, which have majuscule (upper-case) and minuscule (lower-case) letters (not all alphabetic systems have this distinction).

>>> 'a' > 'A'
True
>>> 'a' < 'A'
False

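Upper-case letters have lower order than lower-case letters. For the basic Latin letters, every upper-case letter (A through Z) has a lower code point than every lower-case letter (a through z).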

>>> 'ALPHA' < 'aLPHA'
True

So keep this in mind when comparing strings.
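
If the case difference is unwanted, one common approach is to normalize case before comparing. The sketch below uses the built-in string method casefold(), which the text above does not mention; treat it as one possible technique, and the word list as an illustrative assumption.

```python
# Illustrative word list (not from the original text).
words = ['swan', 'Duck', 'goose', 'albatross']

# Default ordering: 'D' has a lower code point than any lower-case letter.
default_order = sorted(words)
print(default_order)  # ['Duck', 'albatross', 'goose', 'swan']

# Case-insensitive ordering: compare case-folded copies of each string.
folded_order = sorted(words, key=str.casefold)
print(folded_order)  # ['albatross', 'Duck', 'goose', 'swan']

# The same idea works for a single comparison.
print('a' < 'B')                        # False
print('a'.casefold() < 'B'.casefold())  # True
```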

Original author: Clayton Cafiero < [given name] DOT [surname] AT uvm DOT edu >

No generative AI was used in producing this material. This was written the old-fashioned way.

This material is for free use under either the GNU Free Documentation License or the Creative Commons Attribution-ShareAlike 3.0 United States License (take your pick).

Footnotes

  1. Yes, “aa” is a word, sometimes spelled “a’a”. It comes from the Hawai’ian, meaning rough and jagged cooled lava (as opposed to pahoehoe, which is very smooth).

  2. If you want to get really nosy about this, you can use the Python built-in function ord() to get the numeric value associated with each character. E.g.,

    >>> ord('A')
    65
    >>> ord('a')
    97
    >>> ord('b')
    98
    >>> ord('£')
    163

    See also: Joel Spolsky’s The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) last seen in the wild at https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/