Matters of notation

Published

2023-08-17

Notation of sets

Depending on when and with whom you took CS 1640 Discrete Structures (formerly CS 064), or MATH 2055 Fundamentals of Mathematics (formerly MATH 052), you may notice some minor differences in notation when reading Sipser.

For example, Sipser uses \mathcal{N}, \mathcal{Z}, \mathcal{Q}, and \mathcal{R} for natural numbers, integers, rationals, and reals, respectively. In other courses, you may have seen \mathbb{N}, \mathbb{Z}, \mathbb{Q}, and \mathbb{R} for the same sets. I will use the latter (these letters are easier to write legibly on a blackboard or whiteboard, which is how the symbols arose in the first place). You should understand that the two notations are interchangeable, but they should never be mixed within the same piece of work.

Inclusion

Sipser writes A \subseteq B to indicate inclusion of A in B. That is, A \subseteq B if and only if every element of A is also an element of B. This leaves open the possibility that A = B. Sipser also writes A \subsetneq B to signify that A is a strict subset of B, that is, A is a subset of B and A \neq B. Other authors use A \subset B to signify a strict subset. I will try to use Sipser’s \subsetneq but may occasionally lapse into \subset. Where this occurs, \subset should be read as \subsetneq. That is, A \subset B means that A is a strict subset of B (they are not equal).
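As a rough illustration of the difference between \subseteq and \subsetneq, here is a minimal Python sketch. The sets A and B are made-up examples, not taken from Sipser. Python's built-in set type happens to use `<=` for subset and `<` for strict subset.

```python
# Hypothetical example sets, chosen only for illustration.
A = {1, 2}
B = {1, 2, 3}

# `<=` is the subset test (A ⊆ B); it allows the two sets to be equal.
a_subset_b = A <= B          # every element of A is in B
b_subset_b = B <= B          # a set is always a subset of itself

# `<` is the strict-subset test (A ⊊ B); it also requires A != B.
a_strict_b = A < B           # A is a subset of B and A != B
b_strict_b = B < B           # a set is never a strict subset of itself

print(a_subset_b, b_subset_b, a_strict_b, b_strict_b)
```

The only case where the two tests disagree is when the sets are equal.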

Venn diagrams

Venn diagrams can be useful, but one should keep in mind a few cautionary notes. A Venn diagram is always drawn with respect to some universe of discourse. In fact, other authors would draw Sipser’s Figure 0.1 thus:

[Figure: Venn diagram showing the set \text{START-t} drawn inside a region labeled \mathcal{U}]

where \mathcal{U} indicates the universe, and the set \text{START-t} is a subset of that universe.

Understanding the universe is important when it comes to complementation.

\overline{\text{START-t}} is the complement of \text{START-t}, that is, everything that’s not in \text{START-t}. Sipser defines \text{START-t} to be the set of all English words that start with the letter “t”, so it’s implied that the universe is the set of all English words. Thus, the complement \overline{\text{START-t}} is the set of all English words which do not start with a “t”. However, in other contexts we might consider the universe of all strings of Unicode symbols of arbitrary length, or the universe of all strings of symbols in the Latin alphabet of length less than 1,000,000. In both these cases, \text{START-t} would contain exactly the same elements (all English words starting with “t”) but the complements of \text{START-t} would be very different indeed!
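To make the dependence on the universe concrete, here is a small Python sketch. The word lists are tiny, invented stand-ins for "all English words" and for a larger universe of strings, and `complement` is a hypothetical helper written for this example.

```python
# A tiny stand-in for START-t (assumption: just two words, for illustration).
START_T = {"tea", "tree"}

# Two hypothetical universes, both of which contain START_T.
universe_words = {"tea", "tree", "apple", "dog"}                 # "English words"
universe_strings = {"tea", "tree", "apple", "dog", "xq", "zzz"}  # "all strings"


def complement(s, universe):
    """Return the elements of `universe` that are not in `s`."""
    return universe - s


comp_words = complement(START_T, universe_words)
comp_strings = complement(START_T, universe_strings)

print(sorted(comp_words))
print(sorted(comp_strings))
```

The set START_T is identical in both cases, yet the two complements differ because the universes differ.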

It’s often clear from context what constitutes the universe, but this is not always so.

Another thing to watch out for is that Venn diagrams don’t necessarily make assertions about the contents of sets. For example, we might draw this:

[Figure: Venn diagram of two overlapping sets]

without making any assertions or assumptions about the intersection! We might draw this diagram and then ask “is the intersection empty?” So don’t assume that the intersection must be non-empty just because the Venn diagram shows two overlapping regions.

Of course, when we know (or assume) that two sets are disjoint, we can draw a Venn diagram which shows this.

Why Sipser includes certain notation or fundamentals

You might be wondering why we concern ourselves with certain notation or fundamentals. What follows includes some motivation (but there are many other examples).

Why do we care about sets?

In this course, we’re going to learn about languages: regular languages, context-free languages, and recursively enumerable languages. We’ll see that a language is a set of strings (often, but not always, bitstrings).

We’re going to be asking questions like “Is such-and-such string, s, in some language, \mathcal{L}?” That’s a matter of membership in a set.

We’re also going to ask if a string, s, is in two different languages, \mathcal{L}_1 and \mathcal{L}_2. That is, is the string in the language \mathcal{L}_1 and in the language \mathcal{L}_2? That’s asking if s is in the intersection \mathcal{L}_1 \cap \mathcal{L}_2.

We’re also going to ask if a string, s, is in either of two different languages. That’s asking if s is in the union \mathcal{L}_1 \cup \mathcal{L}_2.

Questions like this will be frequent.
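The three kinds of questions above can be sketched in Python. The languages `L1` and `L2` here are small, finite sets of bitstrings invented for illustration; the languages we study in the course are usually infinite, so this is only a toy model.

```python
# Hypothetical finite "languages" over the alphabet {0, 1}.
L1 = {"0", "00", "01"}
L2 = {"01", "10", "11"}

s = "01"
t = "10"

s_in_L1 = s in L1                 # membership: is s in L1?
s_in_both = s in (L1 & L2)        # intersection: is s in L1 and in L2?
s_in_either = s in (L1 | L2)      # union: is s in L1 or in L2 (or both)?

t_in_both = t in (L1 & L2)        # "10" is in L2 only
t_in_either = t in (L1 | L2)

print(s_in_L1, s_in_both, s_in_either, t_in_both, t_in_either)
```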

Why do we care about Cartesian products?

Sipser gives a notation for the Cartesian product of k copies of a set A, that is,

\overbrace{A \times A \times \cdots \times A}^{k} = A^k.

We’ll see this quite a bit when considering strings. “Strings?” you ask, “What does this have to do with strings?” Well, say we have some set of symbols called an alphabet. In this course, we’ll use \Sigma to denote an alphabet. Then \Sigma^5 would be the Cartesian product \Sigma \times \Sigma \times \Sigma \times \Sigma \times \Sigma. Each element of \Sigma^5 is a 5-tuple of symbols, which we can read as a string, so \Sigma^5 is the set of all possible strings of length five. We’ll also see the notation \Sigma^*, which denotes the set of all strings of any finite length that can be constructed from the alphabet \Sigma, including the empty string.

This is one reason why we care about notation for Cartesian products.
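As a sketch of how \Sigma^k relates to strings, the snippet below uses Python's `itertools.product` with its `repeat` argument to build the Cartesian product of an alphabet with itself. The binary alphabet and the helper names `sigma_power` and `strings_up_to` are assumptions made for this example. Since \Sigma^* is infinite, the second helper only enumerates strings up to a chosen length.

```python
from itertools import product

# Hypothetical alphabet for illustration: the binary alphabet.
SIGMA = ("0", "1")


def sigma_power(alphabet, k):
    """Return the Cartesian product alphabet^k as a list of k-tuples."""
    return list(product(alphabet, repeat=k))


def strings_up_to(alphabet, n):
    """Return all strings over `alphabet` of length 0 through n.

    This is a finite slice of alphabet^*; the full set is infinite.
    """
    return [
        "".join(symbols)
        for k in range(n + 1)
        for symbols in product(alphabet, repeat=k)
    ]


pairs = sigma_power(SIGMA, 2)
count_length_five = len(sigma_power(SIGMA, 5))
short_strings = strings_up_to(SIGMA, 2)

print(pairs)
print(count_length_five)
print(short_strings)
```

Note that `strings_up_to` includes the empty string, because the product of zero copies of the alphabet contains exactly one element, the empty tuple.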

Why do we care about tuples?

Sipser speaks of sequences, which are ordered collections of objects. A finite sequence is often called a tuple, and a sequence with k elements is a k-tuple. Several definitions of automata involve tuples. You don’t need to worry about the details yet, but a deterministic finite automaton is defined as a 5-tuple, a pushdown automaton as a 6-tuple, and a Turing machine as a 7-tuple.

It’s also the case that strings, deep down, are tuples of symbols. For example, “cat”, “dog”, and “banana” correspond to ('c', 'a', 't'), ('d', 'o', 'g'), and ('b', 'a', 'n', 'a', 'n', 'a').
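A quick Python sketch shows the correspondence between strings and tuples of symbols, and why order matters for a tuple but not for a set. The variable names are chosen only for this example.

```python
word = "cat"

# Converting a string to a tuple yields its symbols, in order.
as_tuple = tuple(word)

# Joining the symbols back together recovers the original string.
round_trip = "".join(as_tuple)

# Order matters for tuples: "cat" and "act" are different tuples ...
different_tuples = tuple("cat") != tuple("act")

# ... but they contain the same set of symbols.
same_sets = set("cat") == set("act")

print(as_tuple, round_trip, different_tuples, same_sets)
```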

We’ll also see functions which accept multiple arguments (a tuple of arguments), and which return multiple values (a tuple of values).
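As a small illustration of a function that takes a tuple of arguments and returns a tuple of values, consider the sketch below. The function `min_max` is a hypothetical example and is not drawn from Sipser.

```python
def min_max(a, b):
    """Take two arguments and return two values, as a 2-tuple."""
    return (min(a, b), max(a, b))


# The arguments can themselves be packaged as a tuple and unpacked with *.
args = (7, 3)
result = min_max(*args)

# The returned tuple can be unpacked into separate names.
smallest, largest = result

print(result, smallest, largest)
```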

Copyright © 2023 Clayton Cafiero. All rights reserved.