Groups

Suppose we have some arbitrary monoid \((M, *)\), and we are given two elements \(a, b \in M\), and we want to solve an equation of the form:

\[a * x = b\]

That is, we want to find some \(x \in M\) such that the equation is satisfied. Can we always do this?

We will start by looking at some examples. First consider \((\mathbb{Z}, +)\). In this case, one example of such an equation might be this:

\[4 + x = 7\]

You can probably see how to solve this already: simply subtract 4 from both sides, and you’re left with this:

\[x = 7 - 4 = 3\]

Easy. In fact, with this monoid, we can always solve this kind of equation, regardless of which values of \(a\) and \(b\) we are given: in general, the solution is \(x = b - a\).

Now we consider a different monoid: \((\mathbb{N}, +)\). Can we solve the following equation with this monoid?

\[4 + x = 2\]

We can’t! If we were working with a set which contains negative numbers, we would be fine: in this case, the answer would be \(-2\). But \(-2 \notin \mathbb{N}\).

Exercise 3.1. Can you think of another example of a monoid \(M\) and elements \(a, b \in M\) so that the equation \(a*x = b\) has no solutions in \(M\)? Hint: we discussed one possible monoid in the previous chapter.

So it appears that there’s some fundamental difference between the monoids \((\mathbb{Z}, +)\) and \((\mathbb{N}, +)\). This suggests that there might be a way of categorising monoids, based on whether any equation of this form can be solved. Our next task as mathematicians is to try to make this a bit more precise!

We do this by defining a new algebraic structure called a group, which is a monoid with one extra requirement. Suppose we have a monoid \((G, *)\). We say that \((G, *)\) is a group if and only if it satisfies this additional law:

  • Inverses. \(\forall g \in G.\; \exists h \in G.\; g * h = h * g = e\)

That is, every element has an inverse, and combining an element with its inverse gives you the identity.

If you’re wondering why I’m using different letters now, it’s nothing more than a convention: people generally use \(G\) to refer to some arbitrary group, and lowercase letters starting from \(g\) to refer to elements of a group.

We often omit the \(*\) symbol; you might see people expressing the above property as \(\forall g \in G.\; \exists h \in G.\; gh = hg = e\).

\((\mathbb{Z}, +)\) is the first example of a group we will consider. In this group, the inverse of \(1\) is \(-1\), the inverse of \(-5\) is \(5\), and in general the inverse of \(x\) is \(-x\).

\((\mathbb{N}, +)\) is not a group, because no positive elements have inverses.

\((\mathbb{Q}, +)\) and \((\mathbb{R}, +)\) are both groups, and these groups both have the same rule for finding inverses as we saw with \((\mathbb{Z}, +)\). That is, we find the inverse of an element by multiplying by \(-1\).

The trivial monoid is also a group, and unsurprisingly we call it the trivial group. To show that the trivial monoid is a group, we need to find an inverse for every element. Because the trivial monoid only has one element, there’s only one element which we need to find an inverse for: \(e\). Similarly there’s only one candidate for that inverse: also \(e\). We already know that \(e * e = e\) so we are good; \(e^{-1} = e\), and \(\{e\}\) is a group.

Uniqueness of inverses

It turns out that in any group, every element has exactly one inverse. We can prove this:

Let \((G, *)\) be a group, and let \(g \in G\). Suppose we have two additional elements, \(h_1, h_2 \in G\), such that \(h_1\) and \(h_2\) are both inverses of \(g\).

Then:

  1. \(h_1\) is equal to \(h_1 * e\), since \(e\) is the identity element.
  2. \(h_1 * e\) is in turn equal to \(h_1 * (g * h_2)\): since \(g\) and \(h_2\) are inverses, we can replace \(e\) with \(g * h_2\).
  3. \(h_1 * (g * h_2)\) is equal to \((h_1 * g) * h_2\) by the associativity law.
  4. \((h_1 * g) * h_2\) is equal to \(e * h_2\) since \(g\) and \(h_1\) are inverses.
  5. \(e * h_2\) is just \(h_2\).

So \(h_1 = h_2\), and therefore we have shown that any element has exactly one inverse.

Because elements of a group always have exactly one inverse, we can talk about the inverse of an element, as opposed to just an inverse of an element (just like with identity elements of monoids). Also, we can define a notation for the inverse of an element: if \(g\) is some element of a group, then we often write the inverse of \(g\) as \(g^{-1}\).

Warning

This notation can be a little treacherous: it isn’t always the same as exponentiation of numbers which you have probably seen before. It depends on the group we’re talking about. For example, we saw that in \((\mathbb{Z}, +)\), we find the inverse of an element by negating it. So in \((\mathbb{Z}, +)\), we could write that \(4^{-1} = -4\). Normally, however, we would expect that \(4^{-1}\) means the same thing as \(1/4\). This ambiguity can be a bit awkward, so it’s best to avoid this notation for inverses in cases where it can be ambiguous.

Exercise 3.2. In an arbitrary group, what is the inverse of the identity element?

Exercise 3.3. Let \(G\) be a group, and let \(g, h \in G\). Show that \(g^{-1} h^{-1} = (hg)^{-1}\).

Modular arithmetic

Finite groups — that is, groups with a finite number of elements — are often a little easier to deal with, so we will now talk about an example of a finite group. Modular arithmetic is a fairly widely known concept, but we will cover it in a slightly more rigorous way than you may have seen before; the reason I have done this is that it will help explain other concepts further along.

Let \(x, y, m \in \mathbb{Z},\) with \(m > 0\). We say that \(x\) is congruent to \(y\) modulo \(m\) if and only if \(x - y\) is a multiple of \(m\). In mathematical notation:

\[x \equiv y \; (\mathrm{mod} \; m) \; \Leftrightarrow \; \exists a \in \mathbb{Z}.\; x - y = am\]

For example, we will take \(m = 12\). Then, \(15\) is congruent to \(3\) modulo \(12\), because \(15 - 3 = 12\). Also, \(27\) is congruent to \(3\) modulo \(12\), because \(27 - 3 = 24 = 2 \times 12\).

By contrast, \(2\) is not congruent to \(1\) modulo \(12\), because \(2 - 1 = 1\) and \(1\) is not an integer multiple of \(12\).

Note that any integer is congruent to itself modulo \(12\), because we say that \(0\) counts as an integer multiple of \(12\); we can take \(a = 0\) in the definition above and we see that \(0 \times 12 = 0\).

Now, we ask: given some integer \(x \in \mathbb{Z}\), and a modulus \(m \in \mathbb{Z}, m > 0\), can we find the entire set of integers which are congruent to \(x\) modulo \(m\)? For example, can we find the entire set of integers congruent to \(0\) modulo \(12\)?

Before we continue, we will introduce a new notation to describe sets like this. It is called set-builder notation, and it looks like this:

\[\{\, y \in \mathbb{Z} \,|\, x \equiv y \; (\mathrm{mod} \; m) \,\}\]

Read: “the set of \(y\) in \(\mathbb{Z}\) such that \(x\) is congruent to \(y\) modulo \(m\)”.

We will define \(\overline{x}\) to be this set; that is:

\[\overline{x} = \{\, y \in \mathbb{Z} \,|\, x \equiv y \; (\mathrm{mod} \; m) \,\}\]

The set \(\overline{x}\) is called the congruence class of \(x\).

In particular, when \(m = 12\), we have seen that \(15 \in \overline{3}\), and \(27 \in \overline{3}\), but \(2 \notin \overline{1}\). It turns out that in this case, \(\overline{15}\) is actually the exact same set as \(\overline{3}\), and again the exact same set as \(\overline{27}\).

In fact, for any \(x \in \mathbb{Z}\), we have that \(\overline{x} = \overline{x + m}\). To prove that two sets \(U\) and \(V\) are the same, we first need to show that every element of \(U\) is an element of \(V\), and then we show that every element of \(V\) is also an element of \(U\). It’s not enough to just do one of these steps; we need to do both, because \(U\) might be a subset of \(V\) or vice versa, and both steps are required to rule this out.

Therefore, we first prove that every element of \(\overline{x}\) is also an element of \(\overline{x + m}\). Let \(x, y \in \mathbb{Z}\), with \(y \in \overline{x}\). Then, there exists an \(a \in \mathbb{Z}\) such that \(x - y = am\). Then, adding \(m\) to both sides, we have:

\[ \begin{align}\begin{aligned}x + m - y = am + m\\(x + m) - y = (a + 1)m\end{aligned}\end{align} \]

That is, \(x + m \equiv y \; (\mathrm{mod} \; m)\) and \(y \in \overline{x + m}\). So if \(y \in \overline{x}\), then we also have that \(y \in \overline{x + m}\). The second part of the proof, that is, showing that every element of \(\overline{x + m}\) is also an element of \(\overline{x}\), is very similar: the main difference is that we subtract \(m\) from both sides instead of adding.

The important thing to take from all this is that there are exactly \(m\) such congruence classes. We will define a set \(\mathbb{Z}_m\) containing all of these, which we can write as \(\overline{0}\) up to \(\overline{m-1}\):

\[\mathbb{Z}_{12} = \{ \overline{0}, \overline{1}, ... , \overline{10}, \overline{11} \}\]

Then, for each \(m \in \mathbb{Z}, m > 0\), every \(x \in \mathbb{Z}\) is contained in exactly one element of \(\mathbb{Z}_{m}\). I omit a proof of this, but it follows as a consequence of congruence modulo \(m\) being a particular kind of relation called an equivalence relation. I might expand on equivalence relations in a future version of this guide.

We can define an addition operation on this set, too:

\[\overline{x} + \overline{y} = \overline{x + y}\]

For example, in \(\mathbb{Z}_{12}\), we have that \(\overline{8} + \overline{9} = \overline{8 + 9} = \overline{17} = \overline{5}\).

It turns out that this addition operation satisfies all of the group axioms, so we have a finite group. In particular, \(\overline{0}\) is the identity element. Again, I won’t prove this right now for the sake of expediency, although I might put a proof in an appendix later.

Exercise 3.4.a. Which element of \(\mathbb{Z}_{12}\) solves the equation \(\overline{3} + \overline{x} = \overline{2}\)?

Exercise 3.4.b. What is the additive inverse of \(\overline{5}\) in \(\mathbb{Z}_{12}\)? That is, which element of \(\mathbb{Z}_{12}\) solves the equation \(\overline{5} + \overline{x} = \overline{0}\)?

Permutations

We now consider another example of a finite group which arises from the monoid \((\mathrm{Maps}(X, X), \circ)\), which we saw in the previous chapter.

Firstly, a very brief interlude on functions and terminology: a function sends elements in one set to elements of some other set. If a function \(f\) sends elements of the set \(X\) to elements of the set \(Y\), we indicate this using mathematical notation by writing \(f : X \rightarrow Y\), or equivalently, \(f \in \mathrm{Maps}(X, Y)\). We call the set \(X\), from which \(f\) takes its argument, the domain; we call the set \(Y\), to which \(f\) sends those elements, the codomain.

The first thing to notice is that not all elements of \(\mathrm{Maps}(X, X)\) are invertible; that is, given some \(f \in \mathrm{Maps}(X, X)\), we can’t always find a \(g \in \mathrm{Maps}(X, X)\) such that \(f \circ g = g \circ f = e\). For example, suppose that we take \(X = \{A, B\}\) as before. We defined a function \(f_A\) in the previous chapter which sends both \(A\) and \(B\) to \(A\). To invert \(f_A\), we need to come up with a rule, so that if we are given any element \(y \in Y\), we can find the unique element \(x \in X\) satisfying \(f_A(x) = y\). That is, given the result of applying \(f_A\) to something, we have to be able to find that thing.

But this is impossible! Suppose we are told that the result of applying \(f_A\) to something was \(A\). Well, \(f_A\) always produces \(A\), regardless of what you put in, so we can’t know what the original thing was; it could just as well have been \(A\) or \(B\) as far as we know.

Alternatively, suppose we are told that the result of applying \(f_A\) to something was \(B\). But \(f_A\) never produces \(B\) as its result, so we certainly can’t find some other element \(x\) such that \(f_A(x) = B\).

So \(f_A\) is not invertible, and similarly, neither is \(f_B\) (recall that \(f_B\) was defined similarly to \(f_A\), except that the result is always \(B\) rather than \(A\)).

However, \(f_{swap}\) is invertible, and its inverse is \(f_{swap}\) (itself).

We have a few ways of classifying functions which we need to talk about briefly before continuing. Specifically, we need to clarify what it means for a function to be invertible.

Injectivity

Firstly, as we saw with \(f_A\), we can’t invert a function if it sends two different things to the same thing. Another example: the function \(f : \mathbb{R} \rightarrow \mathbb{R}\) given by \(f (x) = x^2\) sends both of \(2\) and \(-2\) to \(4\), so it is not invertible.

Functions which don’t suffer from this problem are called injective. We say that a function \(f : X \rightarrow Y\) is injective if and only if

\[\forall x_1, x_2 \in X.\; x_1 \neq x_2 \Rightarrow f(x_1) \neq f(x_2)\]

The identity function \(f(x) = x\) is injective, as is the function \(f(x) = x^3\). For functions from \(\mathbb{R}\) to \(\mathbb{R}\), a good way of thinking about injectivity is that a function \(f\) is injective if and only if any horizontal line drawn on a graph will only intersect with the curve \(y = f(x)\) at most once — that is, either exactly once or not at all.

Surjectivity

Another problem that we saw with \(f_A\) is that we can’t invert a function if there is some element in the codomain which isn’t ‘hit’ by the function. That is, if there’s some element \(y\) in the codomain such that there’s no \(x\) in the domain for which \(f(x) = y\), we can’t invert it, because we don’t have anything to send \(y\) to. The function \(f : \mathbb{R} \rightarrow \mathbb{R}\) defined by \(f(x) = x^2\) also suffers from this problem: there’s no real number \(x\) such that \(x^2 = -1\), for example.

We call functions that don’t suffer from this problem surjective. We say that a function \(f : X \rightarrow Y\) is surjective if and only if

\[\forall y \in Y.\; \exists x \in X.\; f(x) = y\]

The functions \(f(x) = x\) and \(f(x) = x^3\) are surjective in addition to being injective. Using a similar idea to the one we had with injectivity, a function \(f : \mathbb{R} \rightarrow \mathbb{R}\) is surjective if and only if any horizontal line drawn on a graph will intersect with the curve \(y = f(x)\) at least once.

Bijectivity

We are now ready to say what an invertible function is: a function is invertible if it is both injective and surjective. Functions which are both injective and surjective are also called bijective.

Note

You might ask what the point is of having two words, bijective and invertible, which mean the same thing. It might just be a historical accident. There is a subtle difference between these words though: the word ‘invertible’ is quite general, as it can refer to many different kinds of objects; by contrast, ‘bijective’ almost always refers to functions.

If a function \(f : X \rightarrow Y\) is bijective, then it has an inverse, which we usually write as \(f^{-1} : Y \rightarrow X\). For the inverse of \(f\), we have that \(f^{-1}(f(x)) = x\) for all \(x \in X\), and additionally \(f(f^{-1}(y)) = y\) for all \(y \in Y\). In essence, \(f^{-1}\) undoes the effect of \(f\), putting us back to where we started.

Going back to the example from the last chapter, \(e\) and \(f_{swap}\) are both injective and surjective and thus bijective, while \(f_A\) and \(f_B\) fail to be either injective or surjective.

The symmetric group

If \(X\) is some finite set, and we want to make a group out of \((\mathrm{Maps}(X, X), \circ)\), all we need to do is discard the elements of \(\mathrm{Maps}(X, X)\) which fail to be bijective.

Because the actual set \(X\) we choose doesn’t really matter in the context of group theory, it is conventional to use integers from \(1\) up to \(n\); that is, we take \(X = \{ 1, 2, ... , n \}\). Clearly, then, this set has \(n\) elements.

The group of permutations on this set is very important, so it has a name: it is called the symmetric group of degree \(n\). We denote this group by \(S_n\).

Note

Be careful not to confuse the set \(\{ 1, 2, ... , n\}\) with the group of permutations on that set, \(S_n\). Remember that the elements of \(S_n\) are functions, not numbers.

Exercise 3.5. If \(n\) is a positive integer, the product of all positive integers less than or equal to \(n\) is called \(n\) factorial, written \(n!\). Show that \(S_n\) has \(n!\) elements.

As for checking the group laws for \(S_n\): we have already shown that \((\mathrm{Maps}(X, X), \circ)\) is a monoid, which means that we get associativity “for free”, since we’re using the same operation as before. The identity function is bijective, which means we don’t discard it and we can use it for the identity element in our group, and so the identity law is satisfied too. Also, we know that bijective functions have inverses, so the inverses law is satisfied. The only thing left to check is closure; that is, we need to check that the composite of two bijective functions is itself bijective. This is true, although I will not prove it here. I encourage you to look for a proof elsewhere on the web if you’re itching to see one.

Cancellation

Now that we have seen a few more examples of groups, we go back to our original problem, except this time, we assume that we have a group, not just a monoid. That is, we let \((G, *)\) be some group, and let \(a, b \in G\). We want to know if there is a solution to the equation

\[a * x = b\]

Because it’s an equation, we can do the same thing to both sides, so we will combine both sides with \(a^{-1}\) on the left, like this:

\[a^{-1} * a * x = a^{-1} * b\]

We can now cancel:

\[x = a^{-1} * b\]

And we have solved for \(x\). So, if we are dealing with a group, then an equation of the form \(a * x = b\) always has exactly one solution. Cancellation — the ability to move elements to the other side of equations like this — is arguably a defining property of groups.

Abelian groups

Before moving on we just need to talk about one more specific kind of group.

We say that a group is an Abelian group, or a commutative group, if it satisfies the following additional law:

  • Commutativity. \(\forall g, h \in G.\; g*h = h*g\).

Almost all of the groups we have seen so far have been Abelian; in particular, you were probably already aware that \(x + y = y + x\) for all \(x, y \in \mathbb{R}\).

The only non-Abelian groups we have seen so far are the symmetric groups: the symmetric group of degree \(n\) is non-Abelian whenever \(n \geq 3\).

It is possible to prove, although we will not do so here, that any non-Abelian group must have at least \(6\) elements. In fact, the symmetric group of degree \(3\), that is \(S_3\), is the smallest possible non-Abelian group, with exactly \(6\) elements.

A final note on groups

Groups might seem like a simple concept but they give rise to an astonishing amount of rather lovely mathematics. I don’t want to dwell on them too much here, because we want to get on to rings and fields and things, but I recommend studying them in more depth if you get the chance.

In my experience, it’s fairly uncommon to want a Group type class in PureScript code, but if you do ever happen to want one, it’s in the purescript-group library.