# Ph.D. Comprehensive exam practice problems, Round 1

In May, I will be taking the qualifying exams for my Ph.D. Over the next few weeks, I will be posting practice problems and my solutions to them. Until the end of February, I will be reviewing linear algebra, single variable real analysis, complex analysis and multivariable calculus. In March and April, I will be focusing on algebra, geometry and topology.

Here are three problems to start.

Problem: Suppose that ${A}$ is an ${n \times n}$ real matrix with ${n}$ distinct real eigenvalues. Show that ${A}$ can be written in the form ${\sum_{j=1}^n \lambda_j I_j}$ where each ${\lambda_j}$ is a real number and the ${I_j}$ are ${n\times n}$ real matrices with ${\sum_{j=1}^n I_j = I}$, and ${I_jI_l = 0}$ if ${j \neq l}$. Give a ${2 \times 2}$ real matrix ${A}$ for which such a decomposition is not possible and justify your answer.

Solution: for each ${j}$, let ${E_j}$ denote the matrix with a ${1}$ on the entry ${(j,j)}$ and zeroes everywhere else. Then ${\sum_j E_j = I}$ and ${E_jE_l= 0}$ when ${j\neq l}$. Since ${A}$ has ${n}$ distinct real eigenvalues ${\lambda_1, \dots, \lambda_n}$, it is diagonalizable over ${\mathbf R}$, so there is a real matrix ${P}$ such that ${P^{-1}AP = D}$, where ${D=\mathrm{diag}(\lambda_1, \dots, \lambda_n) = \sum_j \lambda_j E_j }$. Let ${I_j = PE_jP^{-1}}$. Then

$\displaystyle \sum_j \lambda_j I_j = P\left(\sum_j \lambda_j E_j\right) P^{-1} = PDP^{-1} = A.$

Moreover, ${\sum_j I_j = P\left(\sum_j E_j\right)P^{-1} = PP^{-1} = I}$, and for ${j \neq l}$ we have ${I_jI_l = PE_jE_lP^{-1} = 0}$.

For the second part, notice that if the matrix ${A}$ is decomposed in the manner described above, the numbers ${\lambda_j}$ for which ${I_j \neq 0}$ are necessarily eigenvalues of ${A}$, and there is at least one such ${j}$. Indeed, multiplying the equality ${\sum I_j = I}$ by ${I_l}$ and using that ${I_lI_j = 0}$ when ${l \neq j}$, we find that ${I_l^2=I_l}$. Now let ${v \in \mathbf R^n}$ be any nonzero vector. Since ${\sum_j I_j v = v}$, at least one of the terms in the sum is nonzero, say ${I_l v \neq 0}$. Then

$\displaystyle AI_lv = \sum_j \lambda_j I_j I_lv = \lambda_l I_l^2v = \lambda_l I_lv,$

and therefore ${I_lv}$ is an eigenvector of ${A}$ with eigenvalue ${\lambda_l}$. Thus, it is impossible for the matrix ${A}$ to have such a decomposition if, say, it has no real eigenvalues, for example

$\displaystyle A=\left(\begin{array}{ll} 0 & -1 \\ 1 & 0 \end{array}\right).$
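To make the first part concrete, here is a minimal sketch that builds the matrices ${I_j}$ for a small example and checks the three required properties in exact arithmetic. It does not compute ${P}$; instead it uses the equivalent Lagrange interpolation formula ${I_j=\prod_{k\neq j}(A-\lambda_k I)/(\lambda_j-\lambda_k)}$, which conjugates to ${E_j}$ under ${P}$. The example matrix and the helper names `matmul`, `identity` and `projectors` are my own choices for illustration.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """The n x n identity matrix with Fraction entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def projectors(A, eigenvalues):
    """Return the matrices I_j = prod_{k != j} (A - l_k I) / (l_j - l_k).

    Assumes the eigenvalues are distinct, as in the problem statement.
    """
    n = len(A)
    result = []
    for j, lam_j in enumerate(eigenvalues):
        P = identity(n)
        for k, lam_k in enumerate(eigenvalues):
            if k == j:
                continue
            factor = [[(A[r][c] - (lam_k if r == c else 0)) / Fraction(lam_j - lam_k)
                       for c in range(n)] for r in range(n)]
            P = matmul(P, factor)
        result.append(P)
    return result


# Example (an assumption for illustration): upper triangular, eigenvalues 2 and 3.
A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
I1, I2 = projectors(A, [2, 3])

print(I1)              # projector onto the eigenline for eigenvalue 2
print(I2)              # projector onto the eigenline for eigenvalue 3
print(matmul(I1, I2))  # the zero matrix
```

For the rotation matrix above there are no real eigenvalues to feed into `projectors`, which matches the argument that no real decomposition exists.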

# A divisibility identity for Euler’s totient function

In this note I will give a Galois-theoretic proof that for a prime ${p}$ and positive integer ${n}$,

$\displaystyle n \mid \frac{\varphi(p^n-1)}{\varphi(p-1)}.$

I’d love to see a more elementary proof if you can come up with one.
First we need the following:

Lemma 1 Let ${Z_n}$ be the cyclic group with ${n}$ elements. Let ${m}$ be a positive divisor of ${n}$, and consider ${Z_m}$ as a subgroup of ${Z_n}$. Then the number of automorphisms of ${Z_n}$ which fix ${Z_m}$ pointwise is equal to ${\varphi(n)/\varphi(m)}$ (which, in particular, is an integer).

Proof of the Lemma: Note that any automorphism of ${Z_n}$ fixes ${Z_m}$, though not necessarily pointwise: indeed ${Z_n}$ has a unique subgroup of order ${m}$, and thus any automorphism of ${Z_n}$ must take this subgroup to itself. Thus we have a group homomorphism ${\text{Aut}(Z_n) \rightarrow \text{Aut}(Z_m)}$ which is easily seen to be surjective; its kernel is precisely the subgroup consisting of those automorphisms of ${Z_n}$ which fix ${Z_m}$ pointwise. The statement follows by comparing orders. ${\square}$

Now to prove the initial claim, consider the field extension ${\mathbf{F}_{p^n}/\mathbf{F}_p}$. Basic Galois theory tells us that this is a Galois extension of degree ${n}$. Consider the canonical homomorphism

$\psi: \displaystyle \text{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p) \rightarrow \text{Aut}(\mathbf{F}_{p^n}^\times)$

which restricts an ${\mathbf{F}_p}$-automorphism ${\sigma}$ to the group of units of ${\mathbf{F}_{p^n}}$, a cyclic group of order ${p^n-1}$. Clearly it is an injective homomorphism since ${\sigma}$ is completely determined by where it sends the units. Moreover for any ${\sigma \in \text{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p)}$, $\psi(\sigma)$ lies in the subgroup of ${\text{Aut}(\mathbf{F}_{p^n}^\times)}$ consisting of those automorphisms fixing pointwise the cyclic subgroup ${\mathbf{F}_{p}^\times}$ of order ${p-1}$, because the Galois group consists of $\mathbf{F}_p$-homomorphisms. By the lemma this subgroup has order ${\frac{\varphi(p^n-1)}{\varphi(p-1)}}$, whereas ${\text{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p)}$ has order ${n}$. Since ${\psi}$ is injective, its image is a subgroup of order ${n}$, and Lagrange’s theorem gives the claim.
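As a sanity check, the divisibility can be tested numerically for small primes and exponents. The sketch below is only an illustration; `phi` is a plain trial-division totient and `quotient` is a helper name I made up.

```python
def phi(m):
    """Euler's totient function, computed by trial division."""
    result = m
    d = 2
    while d * d <= m:
        if m % d == 0:
            while m % d == 0:
                m //= d
            result -= result // d
        d += 1
    if m > 1:
        result -= result // m
    return result


def quotient(p, n):
    """Return phi(p^n - 1) / phi(p - 1), which Lemma 1 says is an integer."""
    q, r = divmod(phi(p**n - 1), phi(p - 1))
    assert r == 0
    return q


# Each entry records whether n divides phi(p^n - 1) / phi(p - 1).
checks = [(p, n, quotient(p, n) % n == 0)
          for p in (2, 3, 5, 7, 11) for n in range(1, 9)]
print(all(ok for _, _, ok in checks))
```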

# Burnside’s lemma and Bell numbers

In this post I am going to show that ${B_n}$, the ${n}$th Bell number, is at least ${n^n/n!}$.

Our main tool will be Burnside’s lemma, which states that if a finite group ${G}$ acts on a finite set ${S}$, the average number of fixed points of the elements of ${G}$ is equal to the number of orbits of the action of ${G}$ on ${S}$:

$\displaystyle |G\backslash S|=\frac{1}{|G|}\sum_{g \in G}|\text{Fix }g|,$

where ${\text{Fix }g}$ is the set of fixed points of ${g}$.

Write ${[n]=\{1,2,\dots,n\}}$. We let ${S_n}$, the symmetric group on ${n}$ letters, act on ${[n]^n=\{1,2,\dots,n\}^n}$ component-wise. The elements of ${[n]^n}$ are ${n}$-tuples consisting of integers between ${1}$ and ${n}$. Now you may easily convince yourself that it is possible to send a tuple ${(a_1, \dots, a_n)}$ to a tuple ${(b_1, \dots, b_n)}$ if and only if whenever ${a_i=a_j}$, we also have ${b_i=b_j}$, and vice versa. In other words we view each ${n}$-tuple ${(a_1, \dots, a_n)}$ as a function ${\sigma: [n] \rightarrow [n]}$; its fibres partition ${[n]}$, and composition with a permutation preserves the fibres of ${\sigma}$. It is immediate that ${\sigma}$ and ${\sigma'}$ are in the same orbit of ${S_n}$ if and only if they have the same collection of fibres. For instance, $(1,1,2)$ can be sent to $(3,3,1)$ by the cycle $(1\: 3\: 2)$ but there is no way to send $(1,1,2)$ to $(1,2,3)$, because any permutation will send $(1,1,2)$ to a 3-tuple of the form $(\bullet,\bullet,\circ)$.

Thus ${S_n}$ has ${B_n}$ orbits on ${\{1,2,\dots,n\}^n}$. On the other hand since our permutations act component-wise, we have

$\displaystyle \text{Fix }_{[n]^n}g \cong (\text{Fix }_{[n]}g)^n,$

i.e. the fixed points of ${g}$ acting on ${[n]^n}$ are the tuples ${(a_1, \dots, a_n)}$ consisting of fixed points of ${g}$ acting on ${[n]}$. Therefore, by Burnside’s lemma, we have

$\displaystyle \frac{1}{n!}\sum_{g\in S_n} |\text{Fix }g|^n = B_n.$

In fact the same argument shows that for any $m\geq n$, we have

$\displaystyle \frac{1}{m!}\sum_{g\in S_m} |\text{Fix }g|^n = B_n.$

In particular, the identity of ${S_n}$ fixes all of ${[n]}$, so we have

$\displaystyle \frac{n^n}{n!} \leq B_n.$

In fact, by using the fact that a permutation of ${[n]}$ consists of a subset of ${[n]}$ (the subset of fixed points), and a derangement of the remaining elements, we easily obtain the formula

$\displaystyle B_n = \frac{1}{n!}\sum_{i=0}^n {n \choose i} !(n-i) i^n$

where ${!n}$ is the number of derangements of a set of ${n}$ elements.
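The identities in this post are easy to test by brute force for small ${n}$. The following sketch averages ${|\text{Fix }g|^n}$ over all permutations and compares the result with Bell numbers computed independently via the Bell triangle (a standard recurrence that is not used in the post). All function names are my own.

```python
from itertools import permutations
from math import comb, factorial


def bell(n):
    """Bell number B_n, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        new = [row[-1]]
        for x in row:
            new.append(new[-1] + x)
        row = new
    return row[0]


def burnside_average(n, m=None):
    """Average of |Fix g|^n over g in S_m (m defaults to n)."""
    m = n if m is None else m
    total = sum(sum(1 for i, gi in enumerate(g) if i == gi) ** n
                for g in permutations(range(m)))
    q, r = divmod(total, factorial(m))
    assert r == 0  # Burnside's lemma guarantees an integer
    return q


def derangements(k):
    """Number of derangements of k elements, via !k = k * !(k-1) + (-1)^k."""
    d = 1
    for i in range(1, k + 1):
        d = i * d + (-1) ** i
    return d


def bell_from_derangements(n):
    """Right-hand side of the derangement formula for B_n."""
    total = sum(comb(n, i) * derangements(n - i) * i ** n for i in range(n + 1))
    return total // factorial(n)


for n in range(1, 7):
    print(n, bell(n), burnside_average(n), bell_from_derangements(n))
```

Enumerating ${S_n}$ costs ${n!}$ steps, so this is only practical for small ${n}$; the derangement formula needs just ${n+1}$ terms.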

# The Problem of Misaddressed Letters

I have decided to switch the focus of this blog. Instead of expository write-ups, I will be posting mostly tidbits of fun mathematics, possibly without relation to one another.

In this post, I want to talk about the problem of derangements, first considered by Nicolaus Bernoulli (1687-1759), solved by him and, later, independently by Euler. If I write letters to ${100}$ different friends, and send the letters randomly among them, what is the probability that none of my friends will receive the letter personally addressed to them? Since there are ${100!}$ different ways of sending the letters, this probability equals ${!100/100!}$, where ${!100}$ denotes the number of ways of rearranging ${100}$ objects in such a way that no object is left in the same position. Such a permutation is called a derangement.

How can we calculate ${!100}$? First, note that any permutation of ${n}$ objects fixes certain elements, and deranges the others. The number of permutations of ${n}$ fixing exactly ${k}$ elements is equal to

$\displaystyle {n \choose k} !(n-k).$

Therefore, the total number of permutations is

$\displaystyle n! = \sum_{k=0}^n{n \choose k} !(n-k).$

In the language of species, we can say that the species of permutations is the product of the species of derangements and of the species of sets (whose exponential generating function is ${e^x}$).

In the language of generating functions, this translates to

$\displaystyle \frac{1}{1-x} = e^x D(x)$

where ${D(x)= \sum_{n=0}^\infty !n \frac{x^n}{n!}}$.

Therefore,

$\displaystyle D(x)=\frac{e^{-x}}{1-x}$

and from this we read off the formula

$\displaystyle !n = \sum_{k=0}^n {n \choose k}(-1)^k (n-k)!.$

In fact, rearranging this shows that

$\displaystyle \frac{!n}{n!} = \sum_{k=0}^n \frac{(-1)^k}{k!},$

which is simply the truncated Taylor series for ${e^{-x}}$, evaluated at ${1}$. Hence we see that

$\displaystyle \frac{!n}{n!} - e^{-1} \rightarrow 0.$

But even more is true: using Taylor’s remainder formula, we see that for all ${n>0}$,

$\displaystyle \left |\frac{!n}{n!} - e^{-1} \right | < \frac{1}{(n+1)!}.$

Hence, in fact, ${|!n - n!/e| < \frac{1}{n+1} \leq \frac{1}{2}}$, so ${!n}$ is the nearest integer to ${n!/e}$, for all ${n \geq 1}$.

Furthermore, as a consequence of the identity ${\frac{1}{1-x} = 1+\frac{x}{1-x}}$, we see that ${!n}$ satisfies the recurrence relation

$\displaystyle !n = n\times !(n-1) + (-1)^n.$

Note that this is the same recurrence as satisfied by ${n!}$, with an extra ${(-1)^n}$ term.
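The formulas above can be cross-checked against a direct enumeration. This is a small sketch with function names of my own choosing; the nearest-integer check uses floating point, so it is only trustworthy for moderate ${n}$.

```python
from itertools import permutations
from math import comb, e, factorial


def derangements_bruteforce(n):
    """Count permutations of range(n) with no fixed point."""
    return sum(all(g[i] != i for i in range(n)) for g in permutations(range(n)))


def derangements_formula(n):
    """Inclusion-exclusion formula: sum of (-1)^k C(n, k) (n-k)!."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


def derangements_recurrence(n):
    """Recurrence !n = n * !(n-1) + (-1)^n, starting from !0 = 1."""
    d = 1
    for i in range(1, n + 1):
        d = i * d + (-1) ** i
    return d


for n in range(1, 9):
    print(n, derangements_formula(n), round(factorial(n) / e))
```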

The derangement numbers appear in certain integrals. For instance, for ${n\geq 0}$,

$\displaystyle \int_1^e (\log u)^n \mathrm{d}u = (-1)^n(e\cdot !n-n!).$

This gives another proof that ${!n - n!e^{-1} \rightarrow 0}$ (and that it oscillates around ${0}$), since clearly the integral is positive and converges to ${0}$. It also gives a continuation of the function ${!n}$ to complex values.
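Here is a quick numerical check of the integral identity, written as a sketch: the integral is approximated with composite Simpson's rule (my choice of quadrature, with an arbitrary step count) and compared with ${(-1)^n(e\cdot !n - n!)}$. The closed form suffers from cancellation for large ${n}$, so only small ${n}$ are tested.

```python
from math import e, factorial, log


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def derangements(n):
    """Number of derangements, via !n = n * !(n-1) + (-1)^n."""
    d = 1
    for i in range(1, n + 1):
        d = i * d + (-1) ** i
    return d


def closed_form(n):
    """Right-hand side (-1)^n (e * !n - n!)."""
    return (-1) ** n * (e * derangements(n) - factorial(n))


for n in range(7):
    numeric = simpson(lambda u: log(u) ** n, 1.0, e)
    print(n, numeric, closed_form(n))
```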

# Elliptic functions, part II

This post is a continuation of my previous post about elliptic functions.

We showed that the ${\wp}$ function which is ${\Omega}$-invariant satisfies the differential equation

$\displaystyle f'(z)^2=4f(z)^3-g_2f(z)-g_3,$

where the constants $g_n$ are given in terms of

$\displaystyle \sum_{\omega \in \Omega^*}\frac{1}{\omega^{2n}}$

We did this by neutralizing the only pole of ${\wp'(z)^2}$ on ${E=\mathbb{C}/\Omega}$, by adding to ${\wp'(z)^2}$ a suitable polynomial in ${\wp(z)}$.

Thus we can use the functions ${\wp(z), \wp'(z)}$ to parametrize the curve

$\displaystyle y^2=4x^3-g_2x-g_3$

in ${\mathbb{C}^2}$. In fact we’re really parametrizing the projective curve

$\displaystyle \tilde E = V(Y^2Z-4X^3+g_2XZ^2+g_3Z^3)$

in ${\mathbb{P}^2(\mathbb{C})}$ by using the map

$\displaystyle \Psi : E \rightarrow \tilde E$

$\displaystyle z \mod \Omega \mapsto \begin{cases} (\wp(z), \wp'(z), 1) & \text{if }(z \mod \Omega) \neq 0 \\ (0,1,0) & \text{otherwise.}\end{cases}$

What we’re doing is exactly analogous to the parametrization of a conic using trigonometric functions.

With a bit more work, we can see that the field of elliptic functions with respect to ${\Omega}$ is precisely the abstract field ${\mathbb{C}(x,y)}$, subject to the relation ${y^2=4x^3-g_2x-g_3}$ (i.e. the quotient field of ${\mathbb{C}[x,y]/(y^2-4x^3+g_2x+g_3)}$). This means that for each ${\Omega}$-elliptic function ${f}$, we can construct a rational function of ${\wp}$ and ${\wp'}$ which has the same poles and zeroes, and thus express ${f}$ as a rational function in ${\wp}$ and ${\wp'}$.

Now since ${\mathbb{C}}$ is algebraically closed, we can factor our equation as

$\displaystyle \wp'(z)^2=4\wp(z)^3-g_2\wp(z)-g_3=4(\wp(z)-e_1)(\wp(z)-e_2)(\wp(z)-e_3)$

for suitable values of ${e_i}$. It’s easy to see from this equation that ${e_i=\wp(c_i)}$, where ${c_i}$ runs over the zeroes of ${\wp'(z)}$. Counted with multiplicities, there are ${3}$ points where ${\wp'(z)}$ vanishes, since ${\wp'(z)}$ is of degree ${3}$ as a cover of ${\mathbb{P}^1(\mathbb{C})}$. Using the fact that ${\wp'(z)}$ is an odd function and periodic with respect to ${\Omega}$, we can see that ${\wp'(z)}$ vanishes at the half-periods (for instance, ${\wp'(\omega_1/2)=\wp'(-\omega_1/2)=-\wp'(\omega_1/2)}$), i.e. at the symmetry points of the fundamental parallelogram having coordinates

$\displaystyle \frac{\omega_1}{2}, \frac{\omega_2}{2}, \frac{\omega_1+\omega_2}{2}.$

Thus we have

$\displaystyle (e_1, e_2, e_3)=\left(\wp\left(\frac{\omega_1}{2}\right), \wp\left(\frac{\omega_2}{2}\right), \wp\left(\frac{\omega_1+\omega_2}{2}\right)\right).$

Moreover, these three values are distinct. Indeed, each one is taken with multiplicity two (i.e. ${\wp(z)-e_i}$ has a double zero at the corresponding half-period), and since ${\wp}$ takes each value exactly twice, no two of them can be equal. To see that these points are double points, notice that the derivative of ${\wp(z)-\wp(\omega_1/2)}$ vanishes at ${\omega_1/2}$, so the point ${\frac{\omega_1}{2}}$ is a double point. This implies that the discriminant of ${f(x)=4(x-e_1)(x-e_2)(x-e_3)}$ does not vanish, which implies after a quick check that the curve ${\tilde E}$, which is the locus of zeroes of ${Y^2Z-4X^3+g_2XZ^2+g_3Z^3}$ in the projective plane, is actually a nonsingular curve. (From now on we’ll call both ${\tilde E}$ and ${E}$ curves.)

So we have two curves: ${E}$ is defined in an analytic way, because its function field is constructed as a subfield of the field of meromorphic functions on ${\mathbb{C}}$. On the other hand, the curve ${\tilde E}$ is an algebraic curve.

In fact, the curves ${E}$ and ${\tilde E}$ are exactly the same in every respect, as it turns out. This means that the curve ${\tilde E}$ can be made into a group, since the curve ${\mathbb{C}/\Omega}$ is naturally a group (it’s just the torus group). Of course, the magical thing that happens is that the group law on ${\tilde E}$ has a beautiful geometric interpretation, and that it’s given by rational functions on ${\tilde E}$.

Let’s compare again with trigonometric functions. Consider the locus ${S}$ of ${x^2+y^2=1}$ in ${\mathbb{C}^2}$. We know how to add points on the (usual) circle by adding angles. We can prove by elementary geometry, or using the series definitions of trigonometric functions, the formula

$\displaystyle (x,y)+(x',y')=(xx'-yy', xy'+x'y),$

which shows that the group structure on ${S}$ is given by rational (polynomial!) functions. What is amazing is that the group structure is compatible with the ${(\cos t, \sin t)}$ parametrization of ${S}$; in fact, the group law becomes a pair of “addition theorems”: one for ${\cos}$ and one for ${\sin}$.

For the rest of this post, I will assume that the reader is familiar with the simple geometric interpretation of the group law on an elliptic curve. For an easy description, see the wikipedia page.

So it is easy to “discover” the addition theorem for ${\wp}$ if we take the group law on ${\tilde E}$ for granted. By a simple calculation, we obtain that, for ${z \not\equiv \pm Z \mod \Omega}$,

$\displaystyle \wp(z+Z)+\wp(z)+\wp(Z) = \frac{1}{4}\left(\frac{\wp'(z)-\wp'(Z)}{\wp(z)-\wp(Z)}\right)^2.$

For example, to get the formula for ${\wp'(z+Z)}$, we find the ${y}$-coordinate of the point

$\displaystyle (\wp(z), \wp'(z))+(\wp(Z), \wp'(Z))$

using the geometric law. To ease notation a bit, let

$\displaystyle P=(x_1, y_1)=(\wp(z), \wp'(z))$

$\displaystyle Q=(x_2, y_2) = (\wp(Z), \wp'(Z))$

$\displaystyle P*Q=(x_3, y_3)$

Now the line passing through ${P}$ and ${Q}$ (assuming they are distinct, so ${z}$ and ${Z}$ are distinct points) is ${y-y_1=\lambda(x-x_1)}$, where ${\lambda = (y_1-y_2)/(x_1-x_2)}$. We substitute ${x=\lambda^{-1}(y-y_1)+x_1}$ in the equation ${y^2=4x^3-g_2x-g_3}$, multiply through by ${\lambda^3}$, and get the cubic in ${y}$

$\displaystyle \lambda^3y^2=4(y-y_1+\lambda x_1)^3-g_2\lambda^2(y-y_1+\lambda x_1) - g_3\lambda^3,$

which is, after dividing by ${4}$ and rearranging terms,

$\displaystyle y^3-y^2\left(\frac{\lambda^3}{4}+3(y_1-\lambda x_1)\right)+\dots =0$

Now we already know two roots of this cubic; they are the ${y}$-coordinates of ${P}$ and ${Q}$, by construction. Thus, by inspecting the coefficient of ${y^2}$ in this cubic, which is ${-(y_1+y_2+y_3)}$, we see that

$\displaystyle y_1+y_2+y_3=3(y_1-\lambda x_1)+\frac{\lambda^3}{4}$

and hence, by the definition of addition on the elliptic curve ${E:\ y^2=4x^3-g_2x-g_3}$ (the sum ${P+Q}$ is the reflection of ${P*Q}$ across the ${x}$-axis, so that ${\wp'(z+Z)=-y_3}$) and by the (still unjustified) assumption that the group structure is compatible with the coordinates ${(\wp, \wp')}$, that the function ${\wp'}$ satisfies the addition theorem

$\displaystyle -\wp'(z+Z)=3\left(\wp'(z)-\wp(z)\frac{\wp'(z)-\wp'(Z)}{\wp(z)-\wp(Z)}\right)+\frac{1}{4}\left(\frac{\wp'(z)-\wp'(Z)}{\wp(z)-\wp(Z)}\right)^3-\wp'(z)-\wp'(Z)$

$\displaystyle =3\frac{\wp(z)\wp'(Z)-\wp(Z)\wp'(z)}{\wp(z)-\wp(Z)}+\frac{1}{4}\left(\frac{\wp'(z)-\wp'(Z)}{\wp(z)-\wp(Z)}\right)^3-\wp'(z)-\wp'(Z)$

As expected, this expression is symmetric in ${z}$ and ${Z}$. The doubling formula, i.e. the case ${z=Z}$, is obtained by taking the limit as ${z \rightarrow Z}$ in the addition theorem.
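The addition theorems can be tested numerically. In their standard form they read ${\wp(z+Z)=\lambda^2/4-\wp(z)-\wp(Z)}$ and ${\wp'(z+Z)=-(\lambda x_3+c)}$, where ${\lambda}$ is the slope of the chord, ${c=\wp'(z)-\lambda\wp(z)}$ is its intercept and ${x_3=\wp(z+Z)}$. The sketch below assumes the square lattice ${\Omega=\mathbb{Z}+i\mathbb{Z}}$ and approximates ${\wp}$ and ${\wp'}$ by summing over the lattice points ${m+ni}$ with ${|m|,|n|\leq 100}$; the lattice, the cutoff, the sample points and the function names are all my own choices, and the truncation limits the accuracy to a few decimal places.

```python
def lattice(N, w1=1.0, w2=1j):
    """Nonzero lattice points m*w1 + n*w2 with |m|, |n| <= N."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1) for n in range(-N, N + 1)
            if (m, n) != (0, 0)]


OMEGA_STAR = lattice(100)  # symmetric truncation of the square lattice


def wp(z):
    """Truncated series for the Weierstrass function."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in OMEGA_STAR)


def wp_prime(z):
    """Truncated series for its derivative, -2 * sum of (z - w)^(-3)."""
    return -2 / z**3 - 2 * sum(1 / (z - w)**3 for w in OMEGA_STAR)


z, Z = 0.3 + 0.1j, 0.1 + 0.25j
x1, y1 = wp(z), wp_prime(z)
x2, y2 = wp(Z), wp_prime(Z)

slope = (y1 - y2) / (x1 - x2)      # lambda, the slope of the chord through P and Q
intercept = y1 - slope * x1        # the chord is y = slope * x + intercept
x3 = slope**2 / 4 - x1 - x2        # predicted value of wp(z + Z)
y3 = slope * x3 + intercept        # y-coordinate of the third intersection point

print(abs(wp(z + Z) - x3))         # small, up to truncation error
print(abs(wp_prime(z + Z) + y3))   # small: wp'(z + Z) = -y3
```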

Of course, none of this is justified because we haven’t explained why the coordinates ${(\wp, \wp')}$ should be compatible with the group structure. In fact, it makes much more sense to think of the group structure on the elliptic curve as a consequence of the addition theorems. So, to understand why the group structure really is what it is, we have to understand where these addition theorems really come from.

Recall from complex analysis that, for a function ${f}$ meromorphic on a domain ${D}$, the integral

$\displaystyle \int_{\partial D} \frac{df}{f}$

equals ${2\pi i (Z-P)}$, where ${Z}$ and ${P}$ denote the number of zeroes and poles of ${f}$ on ${D}$, each taken with appropriate multiplicity.

By multiplying the differential ${df/f}$ by a function ${g}$ holomorphic on ${D}$, we obtain a weighted sum over the zeroes and poles of ${f}$. More precisely,

$\displaystyle \frac{1}{2\pi i}\int_{\partial D} g\frac{df}{f} = \sum_{s \in D} g(s) - \sum_{p \in D}g(p)$

where ${s}$ and ${p}$ run over the zeroes and poles of ${f}$ in ${D}$, respectively, each repeated according to its multiplicity. In particular, taking ${g(z)=z}$, we see that

$\displaystyle \frac{1}{2\pi i} \int_{\partial D} z\frac{f'(z)dz}{f(z)} = \sum_{z \in D}v_z(f) z$

where ${v_z(f)}$ denotes the order of ${f}$ at ${z}$, i.e. the greatest integer ${n}$ such that ${(u-z)^{-n}f(u)}$ is holomorphic at ${u=z}$.

Now if ${f}$ is an elliptic function, and we take for ${D}$ a fundamental parallelogram, we see immediately using the periodicity of ${f}$, that ${\frac{1}{2\pi i} \int_{\partial D}z\frac{f'(z)dz}{f(z)}}$ equals a ${\mathbb{Z}}$-linear combination of the periods, i.e. an element of ${\Omega}$. Thus we see that, for an elliptic function ${f}$,

$\displaystyle \sum_{z \in \mathbb{C}/\Omega}v_z(f) z \equiv 0 \mod \Omega.$

In particular, if ${f}$ is an elliptic function of order ${3}$ whose only pole on ${\mathbb{C}/\Omega}$ is a triple pole at ${0}$, then we can determine the position of any zero of ${f}$ from knowledge of the position of the other two; the zeroes are always ${z}$, ${Z}$, and ${-z-Z}$ (mod ${\Omega}$).

Now apply this to the elliptic function

$\displaystyle F(u)=\wp'(u)-\wp'(z)-\lambda (\wp(u)-\wp(z)), \qquad \lambda = \frac{\wp'(z)-\wp'(Z)}{\wp(z)-\wp(Z)},$

which is of order ${3}$. By construction, it has zeroes at ${u=z}$ and at ${u=Z}$; thus its third zero is at ${u=-z-Z}$. This means precisely that the line passing through ${(\wp(z), \wp'(z))}$ and ${(\wp(Z), \wp'(Z))}$ also passes through ${(\wp(-z-Z), \wp'(-z-Z)) = (\wp(z+Z), -\wp'(z+Z))}$. Of course, these three points lie on the cubic ${y^2=4x^3-g_2x-g_3}$. This explains precisely where the group law comes from. It also shows why we must reflect the third point of intersection of the line through ${P}$ and ${Q}$ across the ${x}$-axis.

All of this discussion can be carried out in a quite abstract setting using the Riemann-Roch theorem, which allows us to endow any smooth projective algebraic curve of genus ${1}$ having at least one rational point with a group structure, as above. It follows from the general construction that the group structure on an elliptic curve ${E}$ is isomorphic to ${\mbox{Pic}^0(E)}$, the degree ${0}$ Picard group of ${E}$.

In a future series of posts, I will discuss the similarities between the theory of number fields and the theory of elliptic curves, which lead to the Birch and Swinnerton-Dyer conjecture.

# Elliptic functions, Part I

Elliptic functions were discovered in the ${19^{th}}$ century, but their first appearance in hidden guise goes back to Fagnano and Euler, who proved “addition theorems” for elliptic integrals which amount to addition theorems for elliptic functions. Elliptic functions were the bread and butter of many generations of mathematicians; their study gave birth to the theory of Riemann surfaces and, eventually, to modern complex geometry.

An elliptic function ${f}$ is a meromorphic function on ${\mathbb{C}}$ which has two ${\mathbb{R}}$-linearly independent periods ${\omega_1}$ and ${\omega_2}$. Thus ${f}$ is invariant under the action of the lattice group ${\Omega = \omega_1{\mathbb Z}\oplus \omega_2{\mathbb Z}}$. (Of course, there are many possible choices of ${\omega_1}$ and ${\omega_2}$.) This means that we can factor ${f}$ through the projection ${\mathbb{C}\rightarrow \mathbb{C}/\Omega}$. With the complex structure inherited from ${\mathbb{C}}$, the topological space ${E_\Omega = \mathbb{C}/\Omega}$ is a compact Riemann surface which has the shape of a torus.

We can identify ${E_\Omega}$ with the points of a suitably chosen parallelogram. The parallelogram having ${0}$, ${\omega_1}$, ${\omega_2}$ and ${\omega_1+\omega_2}$ as vertices is called a fundamental parallelogram. We include only a specified half of its boundary (for example, only the edges ${0\omega_1}$ and ${0\omega_2}$) so as to make sure that no two points are congruent ${\mod \Omega}$. Of course, ${E_\Omega}$ is simply obtained by “pasting” opposite sides of this parallelogram together. We will call any fundamental parallelogram ${E}$. (Thus, as a topological space, ${E_\Omega}$ is a quotient of ${E}$, obtained by pasting opposite sides together.) There are many other choices for ${E}$; as many as there are ${\mathbb{Z}}$-bases for ${\Omega}$.

So, all that we have merely noted so far is the basic fact that there exists a natural identification of the family of all ${\Omega}$-invariant meromorphic functions on ${\mathbb{C}}$ with the family of all meromorphic functions on ${\mathbb{C}/\Omega}$.

So now we want to build elliptic functions. The way to build a function invariant under the action of some group is to average out this action. For example, suppose ${G}$ is a finite group acting on some set ${S}$ and that we are given a function ${f:S\rightarrow \mathbb{C}}$. We can build the function ${F:S \rightarrow \mathbb{C}}$ by ${F(s)=\sum_{g \in G}f(g \cdot s)}$, which is invariant under ${G}$. Of course, if ${G}$ is infinite, we may hope to replace the finite sum by an appropriately converging series.

So Weierstrass’s idea was just that. Let ${G=\Omega}$, and start from the function ${z \mapsto \frac{1}{z^{3}}}$. The series

$\displaystyle f(z)=\sum_{\omega \in \Omega}\frac{1}{(z-\omega)^3}$

is easily seen to converge absolutely and uniformly on every compact set not containing a point of ${\Omega}$. Since each summand is a meromorphic function of ${z}$, so is ${f(z)}$.

It is easy to see from the series expansion that the function ${f}$ has a triple pole at every lattice point with zero residue. Moreover, ${f(z)}$ is odd, since ${-\Omega = \Omega}$ and ${z^3}$ is odd. Thus we have produced a non-trivial elliptic function of order ${3}$ (the order, or the degree, of an elliptic function ${f}$ is the number of its poles inside any period parallelogram; or, if you prefer, it is the degree of ${f}$ considered as a ramified covering ${E_\Omega \rightarrow \mathbb{CP}^1}$).

Now for every ${z \notin \Omega}$, the function ${f(u)-1/u^3}$ can be integrated from ${0}$ to ${z}$ along a path not passing through any point of ${\Omega^*=\Omega - \{0\}}$. Since the residue of ${f}$ at each pole is ${0}$, the value of the integral is independent of path. By the uniform convergence of the series defining ${f}$ along the integration path, we obtain a new function of ${z}$,

$\displaystyle P(z)=\int_0^z (f(u)-1/u^3)du = \frac{-1}{2}\sum_{\omega \in \Omega^*} \left(\frac{1}{(z-\omega)^2}-\frac{1}{\omega^2}\right),$

which is meromorphic, and has a double pole with zero residue at every point of ${\Omega^*=\Omega - \{0\}}$. The function ${\wp(z)=-2P(z)+z^{-2}}$ is the Weierstrass ${\wp}$-function associated to ${\Omega}$ (we may sometimes write it ${\wp_\Omega}$ to emphasize the dependence of ${\wp}$ on ${\Omega}$). It is given by the series

$\displaystyle \wp(z)=\frac{1}{z^2}+\sum_{\omega \in \Omega^*} \left(\frac{1}{(z-\omega)^2}-\frac{1}{\omega^2}\right),$

which also converges absolutely and uniformly on every compact set disjoint from ${\Omega}$. Notice that to integrate from ${0}$, we had to remove the pole at ${0}$, integrate, and then put the pole back. It is not immediately obvious that ${\wp}$ should be elliptic. However, since ${f}$ is odd and elliptic, we have, for example, ${f(\omega_1(\frac{1}{2}+t))=f(\omega_1(\frac{-1}{2}+t))=-f(\omega_1(\frac{1}{2}-t))}$. This shows that integrating ${f}$ along the side ${\omega_1}$ of the fundamental period parallelogram gives ${0}$; by generalizing this observation, we can see that ${\wp(z)}$, which is ${-2}$ times an antiderivative of ${f}$, is elliptic.

It would be criminal to continue without mentioning that we have only been generalizing the theory of trigonometric functions (and, as we shall see, of their associated curves, the conics). Recall that Euler gave us the product formula for ${\sin z}$:

$\displaystyle \frac{\sin \pi z}{\pi z}=\prod_{n=1}^\infty \left(1-\frac{z^2}{n^2}\right).$

Taking the logarithmic derivative, we obtain the “partial fraction” expansion

$\displaystyle \pi \cot \pi z = \frac{1}{z} + \sum_{n=1}^\infty \frac{-2z}{n^2-z^2}=\frac{1}{z}+\sum_{n=1}^\infty \left(\frac{1}{z-n}+\frac{1}{z+n}\right).$

(Euler used this formula to give the value of ${\zeta(2n)}$, by expanding further each term in this formula, and comparing the resulting series with the Laurent series for ${\pi \cot \pi z}$.) But this function is not quite yet analogous to the ${\wp}$ function, because it’s an odd function. Applying ${-\frac{d}{dz}}$ yields

$\displaystyle \pi^2 \csc^2 \pi z = \frac{1}{z^2}+\sum_{n=1}^\infty \left(\frac{1}{(z+n)^2}+\frac{1}{(z-n)^2}\right)$

which really is analogous to the ${\wp}$ function. So we see that the ${\wp}$ function degenerates to ${\csc^2}$ (up to rescaling and an additive constant) as one of its periods tends to infinity.

Now let’s make some general observations about any elliptic function ${f}$. First, note that ${f}$ must have finitely many poles in any period parallelogram, since the closure of the period parallelogram is compact, and ${f}$ is meromorphic. Second, note that the sum of the residues of ${f}$ at its poles in a period parallelogram is ${0}$. Indeed, integrating around the boundary and using the periodicity of ${f}$, we see that integrals along opposite sides cancel each other. (We have to avoid poles on the boundary if there are some. By the periodicity of ${f}$, the poles on the boundary come in pairs with equal residue, and by going around them in small semi-circles in such a way that the contributions of the residues cancel each other out, we save the situation. So we’re integrating on a jigsaw puzzle piece with which we can tile the plane, basically.) This observation may remind you of the theorem which states that the sum of the residues of a meromorphic differential on ${\mathbb{CP}^1}$ is ${0}$. This fact holds on all compact Riemann surfaces.

As a consequence of the fact that the residues sum to ${0}$, we see that an elliptic function cannot have a single simple pole. On ${\mathbb{CP}^1}$ the analogous statement holds for differentials rather than for functions. For example, ${f(z)=1/z}$ has a single simple pole. The sum of its residues is not ${0}$, but the differential ${\frac{dz}{z}}$, however, has two poles with residues ${1}$ and ${-1}$; indeed, let ${w=\frac{1}{z}}$; then ${\frac{dz}{z}=-\frac{dw}{w}}$, so that ${\frac{dz}{z}}$ also has a pole at ${w=0}$, with residue ${-1}$. Residues are really a property of differentials and not of functions.

Moreover, a non-constant elliptic function ${f}$ must have at least one pole inside any fundamental parallelogram. Indeed, if ${f}$ is analytic (and hence continuous) on the closure ${\overline{E}}$ of a fundamental parallelogram ${E}$, the image ${f(\overline{E})}$ is compact, since ${\overline{E}}$ is compact; but since ${f(\overline{E})=f(\mathbb{C})}$, Liouville’s theorem implies that ${f}$ is constant.

This observation, while very simple, is the basic tool in proofs of relations among elliptic functions.

Let’s expand the series for ${\wp}$ a bit further. We have

$\displaystyle \frac{1}{(z-\omega)^2}-\frac{1}{\omega^2} = \frac{1}{\omega^2} \left(\frac{1}{(z/\omega-1)^2}-1 \right)=\sum_{n=1}^\infty (n+1)\omega^{-n-2}z^{n}$

Hence

$\displaystyle \wp(z)=\frac{1}{z^2}+\sum_{\omega \in \Omega^*}\left(\frac{1}{(z-\omega)^2}-\frac{1}{\omega^2}\right)$

$\displaystyle = \frac{1}{z^2}+\sum_{n=1}^\infty (n+1)e_{n} z^{n} = \frac{1}{z^2}+2e_1z+3e_2z^2+\dots$

where ${e_{n} = \sum_{\omega \in \Omega^*}\omega^{-n-2}}$. Notice that ${e_n=0}$ for ${n}$ odd, since ${-\Omega = \Omega}$ (or alternatively, since ${\wp}$ is an even function). Thus we have

$\displaystyle \wp(z)=\frac{1}{z^2} + 3e_2z^2 + 5e_4z^4+\dots$

As functions of ${\Omega}$, the values ${e_2, e_4, \dots }$ are very interesting in their own right (they’re the fundamental examples of modular forms); I will talk about them in a later post.

So now that we have the Laurent expansion for ${\wp (z)}$ around ${0}$, we can hope to discover a relationship between ${\wp(z)}$ and ${\wp'(z)}$. We have

$\displaystyle \wp'(z)=\frac{-2}{z^3}+6e_2z+20e_4z^3+\dots$

and hence the function ${\wp'(z)^2-4\wp(z)^3}$ has a pole of order ${< 6}$ at ${0}$, since the terms in ${z^{-6}}$ cancel out. We can write explicitly:

$\displaystyle \wp'(z)^2-4\wp(z)^3 = \left(\frac{4}{z^6}-\frac{24e_2}{z^2}-{80e_4}+\dots \right)-4\left(\frac{1}{z^6}+\frac{9e_2}{z^2}+15e_4+\dots\right)$

$\displaystyle =\frac{-60e_2}{z^2}-140e_4+\dots$

where each occurrence of “${\dots}$” represents some analytic function which vanishes at ${0}$. Hence, by adding ${60e_2\wp(z)}$ and ${140e_4}$ to this series, we cancel the ${z^{-2}}$ term and the constant term, and we see that the function ${\wp'(z)^2-4\wp(z)^3+60e_2\wp(z)+140e_4}$ is analytic at ${0}$ and vanishes there. Moreover, it is an elliptic function with respect to ${\Omega}$, since the same is true of ${\wp'}$ and ${\wp}$. Also, it can only have poles at the points of ${\Omega}$, since the same is true of ${\wp'}$ and ${\wp}$. But it has no pole at ${0}$; hence it has no pole at all, so it is constant, and since it vanishes at ${0}$ it must be constantly equal to ${0}$. So we have proved that the ${\wp}$ function satisfies the first-order non-linear differential equation

$\displaystyle \wp'(z)^2=4\wp(z)^3-g_2\wp(z)-g_3$

in terms of the constants (depending on ${\Omega}$)

$\displaystyle g_2=60\sum_{\omega \in \Omega^*}\frac{1}{\omega^4},$

$\displaystyle g_3=140\sum_{\omega \in \Omega^*}\frac{1}{\omega^6}.$

We will discuss the great significance of this differential equation in a future post.
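The differential equation ${\wp'(z)^2=4\wp(z)^3-g_2\wp(z)-g_3}$ can be checked numerically from the defining series. The sketch below assumes the square lattice ${\Omega=\mathbb{Z}+i\mathbb{Z}}$ and truncates every lattice sum to the points ${m+ni}$ with ${|m|,|n|\leq 100}$; the lattice, the cutoff, the sample point and the function names are my own choices, and the residual is only small relative to the size of the individual terms, not exactly zero.

```python
def lattice(N, w1=1.0, w2=1j):
    """Nonzero lattice points m*w1 + n*w2 with |m|, |n| <= N."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1) for n in range(-N, N + 1)
            if (m, n) != (0, 0)]


OMEGA_STAR = lattice(100)  # symmetric truncation of the square lattice

# Truncated versions of the constants g_2 and g_3.
g2 = 60 * sum(1 / w**4 for w in OMEGA_STAR)
g3 = 140 * sum(1 / w**6 for w in OMEGA_STAR)


def wp(z):
    """Truncated series for the Weierstrass function."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in OMEGA_STAR)


def wp_prime(z):
    """Truncated series for its derivative, -2 * sum of (z - w)^(-3)."""
    return -2 / z**3 - 2 * sum(1 / (z - w)**3 for w in OMEGA_STAR)


z = 0.3 + 0.2j
lhs = wp_prime(z)**2
rhs = 4 * wp(z)**3 - g2 * wp(z) - g3

print(g2, g3)                      # g3 vanishes for the square lattice
print(abs(lhs - rhs) / abs(lhs))   # relative residual, small up to truncation
```

For this lattice the symmetry ${\omega \mapsto i\omega}$ forces ${g_3=0}$, which the truncated sum reproduces up to rounding, and ${g_2}$ comes out close to ${189.07}$.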

# Universal constructions throughout mathematics

Here are the notes I wrote for the talk I am giving at the Canadian Undergraduate Mathematics Conference.

My talk is meant as a friendly introduction to category theory and to the notion of a universal arrow.

Edit: corrected a couple of mistakes in my original document. If you find any more, please let me know.