Algebraic modular forms and characteristic classes

This post is the first of a series dedicated to algebraic modular forms and {p}-adic modular forms. I will be mostly following the foundational paper of Katz in the Antwerp proceedings.

A characteristic class with values in a cohomology theory {H^\bullet} is a rule which assigns to each bundle {E/S} a cohomology class {f(E/S) \in H^\bullet(S)} such that (i) {f(E/S)} depends only on the isomorphism class of {E/S}; (ii) {f} commutes with base change: if {g:S' \rightarrow S} is any morphism, then {f(E_{S'}/S') = g^*f(E/S)}.

To illustrate, let us take the prototypical example of {G}-bundles. If {G} is a topological group, a {G}-bundle on a topological space {S} is a map of topological spaces {p: E \rightarrow S} equipped with a continuous action of {G} on {E}, such that, locally over {S}, {E} is isomorphic to a product {U \times G} with its natural action. It can be thought of as a continuously varying family of {G}-homogeneous spaces.

Notice that {G}-bundles can be base changed: if {S' \rightarrow S} is a continuous map, the fibre product {E \times_S S' \rightarrow S'} is naturally a {G}-bundle.

Let {b : \text{Top} \rightarrow \text{Set}} denote the functor which takes {S \mapsto \{\text{iso. classes of } G \text{-bundles on } S\}}. It takes a continuous map {S'\rightarrow S} to the map {b(S) \rightarrow b(S')} induced by base change; it is therefore a contravariant functor. Let {H^\bullet : \text{Top} \rightarrow \text{Set}} be a cohomology theory. For our purposes, this need only be a contravariant functor which is homotopy invariant. Then, by definition, a characteristic class with values in {H^\bullet} is simply a natural transformation {b \rightarrow H^\bullet}.

Example of a characteristic class. In this example we let {H^\bullet(S):=H^1(S, \text{Aut}(G))} denote the first Čech cohomology pointed set with coefficients in the (non-abelian) sheaf of continuous {\text{Aut}(G)}-valued functions. Then any {G}-bundle {p: E\rightarrow S} determines a class in {H^1(S, \text{Aut}(G))} in the following manner: pick a trivializing cover {\{U_i\}} of {S}, with {G}-bundle isomorphisms {\varphi_i : E|_{U_i} \rightarrow U_i \times G}. Over the intersection {U_{ij} = U_i \cap U_j}, the map {\varphi_j \varphi_i^{-1}} is an automorphism of the {G}-bundle {U_{ij} \times G}, or, what is the same, a continuous map {U_{ij} \rightarrow \text{Aut}(G)}, i.e. a section of the sheaf {\text{Aut}(G)} over {U_{ij}}. The collection {\{\varphi_j \varphi_i^{-1}\}} is a Čech {1}-cocycle, and therefore it determines a cohomology class {\sigma_{E/S} \in H^1(S, \text{Aut}(G))}. Different choices of trivializations give cocycles which differ by coboundaries, so the cohomology class {\sigma_{E/S}} is in fact well-defined.

The characteristic class {\sigma_{E/S}} constructed above actually determines the bundle {E/S} up to isomorphism, so that the pointed cohomology set {H^1(S, \text{Aut}(G))} classifies isomorphism classes of {G}-bundles on {S}. There is, however, another way to classify {G}-bundles, using classifying spaces. Given a topological group {G}, one can construct a topological space {BG} which represents the functor {b} in the homotopy category of topological spaces. This means that there is a canonical isomorphism

\displaystyle b(S) \cong [S, BG]

where {[S,BG]} denotes the set of homotopy classes of maps {S \rightarrow BG}.

Thanks to the homotopy invariance of cohomology, we have the following proposition:

Proposition 1 Given a cohomology theory {H^\bullet}, there exists a canonical bijection between the set {\{f : b \rightarrow H^\bullet\}} of characteristic classes with values in {H^\bullet}, and the cohomology set {H^\bullet(BG)} of the classifying space {BG}.

Proof: Indeed, this is nothing but the Yoneda lemma:

\displaystyle \text{Nat}(b, H^\bullet) = \text{Nat}([-, BG], H^\bullet) = H^\bullet(BG).


This allows us to think of characteristic classes in two equivalent ways: either as rules which transform bundles into cohomology classes on the base, or as cohomology classes on a classifying space. The same thing will happen with modular forms.

An elliptic curve over a scheme {S} (also called a relative elliptic curve over {S}) is defined as a smooth and proper morphism {E \xrightarrow{p} S}, equipped with a section {S \xrightarrow{e} E}, whose geometric fibres are smooth curves of genus one.

It can be shown that there is a unique structure of {S}-group scheme on {E}, for which {e} is the identity, and which induces the group structure on each fibre {E_x} (for which {e_x} is the identity).

Elliptic curves can be base changed: if {p: E \rightarrow S} is an elliptic curve and {T \rightarrow S} is a morphism, then the base change {E\times_S T = E_T \rightarrow T} is an elliptic curve over {T}. We think of {E_{/S}} as a family of elliptic curves, parametrized by the geometric points of {S}.

Here are some examples:

  1. Let {S = \text{Spec }K}, where {K} is an algebraically closed field. Then an elliptic curve over {S} is an elliptic curve over {K} in the classical sense.
  2. An elliptic curve over {\mathbf Z_p} is an elliptic curve over {\mathbf Q_p} having good reduction at {p}.

  3. There are no elliptic curves over {\mathbf Z}. Indeed, such a curve would provide an elliptic curve over {\mathbf Q} having good reduction everywhere. However, by a result of Tate, this is impossible. (In fact, there are no abelian varieties over {\mathbf Z}, of any dimension.)

    In general, if {R} is a ring which admits a morphism {R \rightarrow \mathbf Z}, then there is no elliptic curve over {R}.

  4. Let {\Delta \in \mathbf Z} be nonzero and {S=\text{Spec } \mathbf Z[1/\Delta]}. Then an elliptic curve over {S} is an elliptic curve over {\mathbf Q} with good reduction at every prime not dividing {\Delta}, i.e. one whose minimal discriminant divides a power of {\Delta}.

  5. Weierstrass’ theory of elliptic functions shows that the equation

    \displaystyle y^2+xy=x^3+B(q)x + C(q)

    defines an elliptic curve over {\mathbf Z((q))}, where {B} and {C} are the formal power series

    \displaystyle B(q) = -5 \sum_{n \geq 1} \sigma_3 (n) q^n, \qquad C(q) = \sum_{n \geq 1} \frac{-5\sigma_3(n) - 7 \sigma_5(n)}{12} q^n \qquad (\sigma_k(n) = \sum_{d \mid n}d^k).

    This is the Tate curve; it is an elliptic curve over the formal punctured disc {\text{Spec } \mathbf Z((q))}. It comes with the canonical differential {dx/(2y+x)}, which is independent of {q}. We write {(\text{Tate}(q), \omega_{can})} for the Tate curve and its canonical differential.

    The Tate curve cannot be extended to the whole disc {\mathbf Z[[q]]} because the ring {\mathbf Z[[q]]} admits a morphism to {\mathbf Z}, given by {q \mapsto 0}.
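The coefficients of {B(q)} and {C(q)} in example 5 are easy to tabulate. The following is a minimal Python sketch, not part of Katz's paper: the helper names `sigma` and `tate_coefficients` are made up for illustration. It computes the first few coefficients and checks that those of {C(q)} are integers despite the division by {12}, as they must be for the equation to be defined over {\mathbf Z((q))}.

```python
from fractions import Fraction

def sigma(k, n):
    """Divisor power sum: sum of d**k over the positive divisors d of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)

def tate_coefficients(n_terms):
    """Coefficients of q^1..q^n_terms in B(q) and C(q) (illustrative helper)."""
    b = [-5 * sigma(3, n) for n in range(1, n_terms + 1)]
    c = [Fraction(-5 * sigma(3, n) - 7 * sigma(5, n), 12)
         for n in range(1, n_terms + 1)]
    return b, c

b_coeffs, c_coeffs = tate_coefficients(10)

# The formula for C(q) divides by 12, so integrality is worth checking.
c_is_integral = all(c.denominator == 1 for c in c_coeffs)

print(b_coeffs[:3])                    # [-5, -45, -140]
print([int(c) for c in c_coeffs[:3]])  # [-1, -23, -154]
print(c_is_integral)                   # True
```

This only checks the first ten coefficients; the general statement is the congruence {5d^3 + 7d^5 \equiv 0 \pmod{12}} for every integer {d}.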

Given an elliptic curve {E/S}, the sheaf of algebraic {1}-forms {\Omega^1_{E/S}} is locally free of rank {1} on {E}. Since {p : E\rightarrow S} is smooth and proper with fibres of genus one, { {\omega}_{E/S} : = p_*\Omega^1_{E/S}} is locally free of rank {1} on {S}. For example, if {S=\text{Spec } K} then { {\omega}_{E/S}} can be identified with {H^0(E, \Omega^1_{E/K})}, the vector space of global differentials on {E} (a one-dimensional vector space over {K}).

Now we define the notion of an algebraic modular form.

Definition 2 An algebraic modular form of weight {k \in \mathbf Z} is a rule {f} which, to any elliptic curve over any scheme {S}, assigns a section {f(E/S) \in H^0(S, {\omega}_{E/S}^{\otimes k})} such that: (i) {f(E/S)} depends only on the isomorphism class of {E/S}; (ii) {f} commutes with base change: if {g:S' \rightarrow S} is any morphism, then {f(E_{S'}/S') = g^*f(E/S)}.

Next time, I’ll expand on the definition and relate it with classical modular forms.

Chevalley’s theorem

The purpose of this post is to prove Chevalley’s theorem: If {f: X \rightarrow Y} is a finite surjective morphism of noetherian separated schemes, with {X} affine, then {Y} is affine.

We will follow the outline in Hartshorne (III.3 Problems 1 & 2 and III.4 Problems 1 & 2).

Theorem 1 Let {f: X \rightarrow Y} be an affine morphism of noetherian schemes. Then for any coherent sheaf {\mathcal F} on {X}, there are natural isomorphisms for all {i \geq 0},

\displaystyle H^i(X, \mathcal F) \simeq H^i(Y, f_* \mathcal F).

Proof: According to (II, Ex. 5.17), when {f} is affine, the direct image functor {f_*} induces an equivalence from the category of coherent {\mathcal O_X}-modules to the category of coherent {f_*\mathcal O_X}-modules. Moreover, an equivalence {\tau : A \rightarrow B} of abelian categories (i.e. an additive functor which is also an equivalence) is exact. Therefore, if {F: B \rightarrow \text{Ab}} is a left exact additive functor, by the uniqueness of the universal {\delta}-functor extending a given left exact functor, it follows that there exists a natural isomorphism {R^i(F \circ \tau) \simeq R^i F \circ \tau} for each {i}. \Box

Theorem 2 Let {X} be a noetherian scheme. Then {X} is affine if and only if {X_{\text{red}}} is.

Proof: Clearly {X_\text{red}} is affine if {X} is affine.

Conversely, suppose {X_{\text{red}}} is affine. We prove that {X} has cohomological dimension {0}, hence it is affine by Serre’s theorem (III.3.7). Let {\mathcal F} be a quasi-coherent sheaf on {X}. As indicated in the hint, we let {\mathcal N} denote the sheaf of nilpotents of {X} and we consider the filtration

\displaystyle \mathcal F \supseteq \mathcal N \cdot \mathcal F \supseteq \mathcal N^2 \cdot \mathcal F \supseteq \dots

of {\mathcal F}. Since {X} is noetherian, there exists an {n>0} such that {\mathcal N^n = 0}, so the filtration is finite.

We prove by descending induction on {j} that { \mathcal N^j \cdot \mathcal F} is acyclic. For {j=n}, it is trivial. Now consider the exact sequence of quasi-coherent sheaves on {X},

\displaystyle 0 \rightarrow \mathcal N^j \cdot \mathcal F\rightarrow \mathcal N^{j-1} \cdot \mathcal F \rightarrow (\mathcal N^{j-1} \cdot \mathcal F) / (\mathcal N^j \cdot \mathcal F) \rightarrow 0.

The quasi-coherent sheaf {(\mathcal N^{j-1} \cdot \mathcal F) / (\mathcal N^j \cdot \mathcal F)} is naturally a quasi-coherent {\mathcal O_X / \mathcal N \simeq \mathcal O_{X_{\text{red}}}}-module, and its cohomology can be calculated either as an {\mathcal O_X}-module or as an {\mathcal O_{X_{\text{red}}}} module by Theorem 1 (using the fact that the reduction morphism {X_{\text{red}} \to X} is affine). Therefore, it is acyclic, since {X_{\text{red}}} is affine by assumption. The sheaf {\mathcal N^j \cdot \mathcal F} is acyclic by the inductive hypothesis. By the long exact sequence of cohomology, we see that {\mathcal N^{j-1} \cdot \mathcal F} is also acyclic. \Box

Theorem 3 Let {X} be a reduced noetherian scheme. Then {X} is affine if and only if each irreducible component of {X} is affine.

Proof: The irreducible components of {X} are closed subschemes of {X}, hence they are affine if {X} is affine. Conversely, suppose that every irreducible component of {X} is affine. We prove that {X} has cohomological dimension {0}.

We proceed by induction on the number of irreducible components of {X}. If {X} is irreducible, then the statement is trivially true. Now suppose it holds for reduced noetherian schemes with {n-1} irreducible components. Suppose that {X} has {n} irreducible components, and write it as {X=Y \cup X'} where {Y} is irreducible and {X'} is the union of the other {n-1} components. Let {\mathcal F} be a quasi-coherent sheaf on {X}. Denote by {\tau} the inclusion {Y \hookrightarrow X} and by {\iota} the inclusion {X' \hookrightarrow X}, where each closed subscheme is given the canonical reduced closed subscheme structure. Since {Y} is noetherian, { \tau_* \tau^* \mathcal F} is also a quasi-coherent sheaf on {X}, supported on {Y}. There are canonical morphisms {\mathcal F \rightarrow \tau_* \tau^* \mathcal F} and {\mathcal F \rightarrow \iota_* \iota^* \mathcal F }. (Each of these two morphisms is a unit of the “inverse image – direct image” adjunction.) Let

\displaystyle g : \mathcal F \rightarrow \tau_* \tau^* \mathcal F \oplus \iota_* \iota^* \mathcal F

be their sum. It is easy to see that this morphism is surjective, and an isomorphism away from the intersection. Let {\mathcal G= \ker g}. Then {\mathcal G} is quasi-coherent and supported in {Y \cap X'}. Therefore we have an exact sequence

\displaystyle 0 \rightarrow \mathcal G \rightarrow \mathcal F \rightarrow \tau_* \tau^* \mathcal F \oplus \iota_* \iota^* \mathcal F \rightarrow 0

Since {X'} is affine by the induction hypothesis, {Y \cap X'} is affine, being a closed subscheme of an affine scheme. Now, since {\text{Supp }\mathcal G \subseteq Y \cap X'}, the cohomology of {\mathcal G} can be calculated either as an {\mathcal O_{(Y \cap X')}}-module or as an {\mathcal O_X}-module, and therefore it vanishes. Similarly, the sheaf {\tau_* \tau^* \mathcal F \oplus \iota_* \iota^* \mathcal F} is acyclic because {Y} and {X'} are affine. Therefore, by the long exact sequence of cohomology, {\mathcal F} is also acyclic. \Box

Lemma 4 Let {f: X \rightarrow Y} be a finite surjective morphism of integral noetherian schemes. Then there is a coherent sheaf {\mathcal M} on {X}, and a morphism of sheaves {\alpha : \mathcal O_Y^r \rightarrow f_* \mathcal M} for some {r>0}, such that {\alpha} is an isomorphism at the generic point of {Y}.

Proof: Let {L} be the function field of {X} and {K} be the function field of {Y}. Then the morphism {f} gives rise to a field homomorphism {K \hookrightarrow L}. Since {f} is finite and surjective, {L} is finite over {K}, say of degree {r}. Let {\{x_1, \dots, x_r\}} be a basis for {L} over {K}. Each {x_j} can be represented as a section {s_j} of {\mathcal O_X} over an open set {U_j}. Let {\tau_j : U_j \hookrightarrow X} be the inclusion. Let {\mathcal E_j} be the sheaf {\mathcal E_j = s_j \cdot \mathcal O_{U_j}} on {U_j}. Obviously {\mathcal E_j} is coherent (in fact free of rank {1}). Let {\mathcal F_j = (\tau_j)_*(\mathcal E_j)}. Then {\mathcal F_j} is quasi-coherent on {X} since {U_j} is noetherian; since {f} is finite, {\mathcal F_j} is in fact coherent. Let {\mathcal M = \bigoplus_j \mathcal F_j}. Define the morphism {\alpha : \mathcal O^r_Y \rightarrow f_*\mathcal M} by the global sections {x_j} of {f_*\mathcal M} (using the fact that {\mathcal O_Y} represents the global sections functor {\Gamma(Y, -)}). Then, by construction, {\alpha} is an isomorphism of {K}-vector spaces {K^r \cong L} at the generic point of {Y}. \Box

Lemma 5 Let {f: X \rightarrow Y} be a finite surjective morphism of integral noetherian schemes. Then for any coherent sheaf {\mathcal F} on {Y}, there exists a coherent sheaf {\mathcal G} on {X}, and a morphism {\beta : f_* \mathcal G \rightarrow \mathcal F^r} which is an isomorphism at the generic point of {Y}.

Proof: We take {\beta = \mathcal{H}\text{om}(\alpha, \mathcal F)}, where {\mathcal{H}\text{om}} denotes the sheaf Hom and {\alpha} is the morphism of Lemma 4:

\displaystyle \beta: \mathcal{H}\text{om}(f_*\mathcal M, \mathcal F) \rightarrow \mathcal{H}\text{om}(\mathcal O_Y^r, \mathcal F).

Remark that {\mathcal{H}\text{om}(\mathcal O_Y^r, \mathcal F) \simeq \mathcal F^r}. Moreover, the sheaf {\mathcal{H}\text{om}(f_*\mathcal M, \mathcal F)} naturally has a structure of {f_*\mathcal O_X}-module. By (II, Ex. 5.17), when {f} is an affine morphism, {f_*} induces an equivalence between the category of coherent {\mathcal O_X}-modules and the category of coherent {f_*\mathcal O_X}-modules. Therefore {\mathcal{H}\text{om}(f_*\mathcal M, \mathcal F)} is isomorphic to an {\mathcal O_Y}-module of the form {f_*\mathcal G}, where {\mathcal G} is a coherent {\mathcal O_X}-module. Thus {\beta} has the form {f_* \mathcal G \rightarrow \mathcal F^r}.

Moreover, it follows from the fact that a coherent sheaf on a noetherian scheme is finitely presented that on such a scheme, taking sheaf {\mathcal{H}\text{om}} commutes with taking stalks of morphisms; therefore {\beta} is also an isomorphism at the generic point of {Y}. \Box

Now we are ready to prove Chevalley’s theorem.

Theorem 6 (Chevalley’s theorem). Let {f: X \rightarrow Y} be a finite surjective morphism of noetherian separated schemes, where {X} is affine. Then {Y} is affine.

Proof: By Theorems 2 and 3, we may suppose that {X} and {Y} are reduced and irreducible. We prove by contradiction that {Y} is affine. Let {\Sigma} be the collection of closed subschemes of {Y} which are not affine. Suppose it is not empty; then, since {Y} is noetherian, it contains a minimal element {Z \hookrightarrow Y}, which we may view as having the reduced induced subscheme structure. Since finite morphisms are stable under base change, we may in fact suppose that {Z=Y} (what this means is that we are replacing {f} by its restriction to {f^{-1}(Z)} if necessary). Therefore, we suppose that every proper closed subscheme of {Y} is affine.

Let {\mathcal F} be a coherent sheaf on {Y}. By Lemma {5}, there exists a coherent sheaf {\mathcal G} on {X} and a morphism {\beta: f_* \mathcal G \rightarrow \mathcal F^r} which is generically an isomorphism (and which is therefore surjective, since {Y} is irreducible). Thus, if {\mathcal D = \ker \beta}, we have an exact sequence of sheaves on {Y}

\displaystyle 0 \rightarrow \mathcal D \rightarrow f_* \mathcal G \rightarrow \mathcal F^r \rightarrow 0.

Now, as in the proof of Theorem 3, we view {\mathcal D} as a quasi-coherent sheaf on the proper closed subscheme {\text{Supp }\mathcal D}. By the minimality of {Y}, {\text{Supp }\mathcal D} is affine and therefore {\mathcal D} is acyclic. Moreover, since a finite morphism is affine, we can apply Theorem 1 to see that {f_* \mathcal G} is also acyclic. Therefore, by the long exact sequence of cohomology, {\mathcal F^r} is acyclic, so {\mathcal F} is acyclic.

Therefore, {Y} has cohomological dimension {0}, which contradicts the assumption that it is not affine. \Box

The Lebesgue Number Lemma and uniform continuity

In this post, I’ll prove the Lebesgue Number Lemma and use it to prove that a continuous function on a compact metric space is uniformly continuous.

Lemma 1 (Lebesgue Number Lemma). Let {(X,d)} be a compact metric space and let {\{V_i\}_{i \in I}} be an open cover of {X}. Then there exists a real number {\delta > 0} such that every open ball of radius {\delta} is contained in some {V_i}.

Proof: First, remark that if some refinement of the cover {\{V_i\}_{i \in I}} satisfies this property, then {\{V_i\}} also satisfies this property; thus, since {X} is compact, we may replace the cover {\{V_i\}} by a finite cover by open balls {\{B(x_i, r_i)\}_{i=1}^n}.

Define {f_i} on {X} by

\displaystyle f_i(x) = \max(0, r_i - d(x, x_i)).

Then {f_i} is continuous and is positive exactly on {B(x_i, r_i)}. Let {f= \max(f_1, \dots, f_n)}. Then {f} is continuous, and {f>0} because {\{B(x_i, r_i)\}} covers {X}. Since {X} is compact, {f} attains its minimum, which is {>0}; we call it {\delta}. Now, if {x \in X}, the statement {f(x)\geq \delta} means precisely that for some {i}, {d(x, x_i) \leq r_i - \delta}, so by the triangle inequality the ball of radius {\delta} around {x} is contained in {B(x_i, r_i)}. \Box
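To make the construction concrete, here is a small Python sketch under stated assumptions: {X = [0,1]} with the usual distance, covered by three balls whose centers and radii are made up for illustration, with {f_i(x) = \max(0, r_i - d(x, x_i))} and {f = \max_i f_i}. The minimum is taken over a finite grid, which in general only approximates the true minimum; in this example {f} is piecewise linear with breakpoints on the grid, so the grid minimum is the exact one.

```python
from fractions import Fraction

# Hypothetical cover of X = [0, 1] by three open balls B(x_i, r_i).
centers = [Fraction(0), Fraction(1, 2), Fraction(1)]
radii = [Fraction(2, 5), Fraction(3, 10), Fraction(2, 5)]

def f_i(x, center, radius):
    """Tent function: positive exactly on the open ball B(center, radius)."""
    return max(Fraction(0), radius - abs(x - center))

def f(x):
    """Pointwise maximum of the tent functions; positive since the balls cover X."""
    return max(f_i(x, c, r) for c, r in zip(centers, radii))

# Grid of step 1/1000; it contains every breakpoint of f in this example.
grid = [Fraction(k, 1000) for k in range(1001)]
delta = min(f(x) for x in grid)

def ball_fits(x):
    """True if d(x, x_i) <= r_i - delta for some i, so B(x, delta) lies in B(x_i, r_i)."""
    return any(abs(x - c) <= r - delta for c, r in zip(centers, radii))

all_fit = all(ball_fits(x) for x in grid)
print(delta, all_fit)  # 1/10 True
```

The minimum {1/10} is attained at {x = 3/10} and {x = 7/10}, where two neighbouring balls overlap the least.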

Theorem 2 Let {f} be a continuous real-valued function on the compact metric space {(X, d)}. Then {f} is uniformly continuous.

Proof: Let {\epsilon>0}. For each {w \in X}, let

\displaystyle V_w = \{y \in X : |f(w)-f(y)|<\epsilon/2\}.

Then {\{V_w\}_{w \in X}} is an open cover of {X}. Let {\delta} be a Lebesgue Number for the cover {\{V_w\}}. Then, if {x, y\in X} are such that {d(x, y)< \delta}, there exists a {w \in X} such that {B(x, \delta) \subseteq V_w}; therefore, since {x, y \in V_w}, we have

\displaystyle |f(x)-f(y)|\leq |f(x)-f(w)| + |f(w)-f(y)| < \epsilon.


Eisenstein series identities, directly

Let {\Omega \subseteq \mathbf C} be a lattice. The Weierstrass {\wp}-function is defined as

\displaystyle \wp(z) = \frac{1}{z^2} + \sum_{\omega \in \Omega^*} \left(\frac{1}{(z-\omega)^{2}} - \frac{1}{\omega^{2}}\right).

It is {\Omega}-invariant, meromorphic, and has a double pole at each lattice point and no other poles. Its Laurent expansion at the origin is

\displaystyle \wp(z) = \frac{1}{z^2} + c_2z^2 + c_4z^4 + c_6z^6 + \dots,

where {c_{2m} = (2m+1)\sum_{\omega \in \Omega^*} \frac{1}{\omega^{2m+2}}}. Its derivative is

\displaystyle \wp'(z) = -2\sum_{\omega \in \Omega} \frac{1}{(z-\omega)^3} = \frac{-2}{z^3} + 2c_2z + 4c_4z^3 + 6c_6z^5 +\dots.

The functions {\wp} and {\wp'} satisfy

\displaystyle \wp'(z)^2 = 4\wp(z)^3 - g_2\wp(z) - g_3

in terms of the quantities {g_2 = 20c_2} and {g_3 = 28c_4}. Now if we take {\Omega = \left<1, \tau\right>}, where {\tau} is in the upper half-plane, then {\sum_{\omega \in \Omega^*} \frac{1}{\omega^{2k}} = G_{2k}(\tau)}, where {G_{2k}} is the weight {2k} Eisenstein series. These Eisenstein series, or rather the normalized Eisenstein series {E_{2k} =\frac{G_{2k}}{G_{2k}(i\infty)}= \frac{G_{2k}}{2\zeta(2k)}}, satisfy certain relations, such as the following (see Wikipedia):

\displaystyle  \begin{array}{rcl}  E_{8} &=& E_4^2 \\ E_{10} &=& E_4\cdot E_6 \\ 691 \cdot E_{12} &=& 441\cdot E_4^3+ 250\cdot E_6^2 \\ E_{14} &=& E_4^2\cdot E_6 \\ 3617\cdot E_{16} &=& 1617\cdot E_4^4+ 2000\cdot E_4 \cdot E_6^2 \\ 43867 \cdot E_{18} &=& 38367\cdot E_4^3\cdot E_6+5500\cdot E_6^3 \\ 174611 \cdot E_{20} &=& 53361\cdot E_4^5+ 121250\cdot E_4^2\cdot E_6^2 \\ 77683 \cdot E_{22} &=& 57183\cdot E_4^4\cdot E_6+20500\cdot E_4\cdot E_6^3 \\ 236364091 \cdot E_{24} &=& 49679091\cdot E_4^6+ 176400000\cdot E_4^3\cdot E_6^2 + 10285000\cdot E_6^4. \end{array}
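The first few identities can be sanity-checked on {q}-expansions. The sketch below is not taken from the post: it assumes the standard normalized expansions {E_{2k} = 1 + a_{2k}\sum_{n \geq 1}\sigma_{2k-1}(n)q^n} with {a_4 = 240}, {a_6 = -504}, {a_8 = 480}, {a_{10} = -264}, {a_{12} = 65520/691}, and compares coefficients only up to {q^{12}}, so it is a consistency check rather than a proof.

```python
from fractions import Fraction

N = 12  # truncate every q-series after q^N

def sigma(k, n):
    """Divisor power sum: sum of d**k over the positive divisors d of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)

def eisenstein(weight, factor):
    """Coefficients of 1 + factor * sum_{n>=1} sigma_{weight-1}(n) q^n up to q^N."""
    return [Fraction(1)] + [Fraction(factor) * sigma(weight - 1, n)
                            for n in range(1, N + 1)]

def mul(a, b):
    """Product of two power series truncated after q^N."""
    out = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                out[i + j] += ai * bj
    return out

# Normalizations assumed from the standard theory, not derived here.
E4 = eisenstein(4, 240)
E6 = eisenstein(6, -504)
E8 = eisenstein(8, 480)
E10 = eisenstein(10, -264)
E12 = eisenstein(12, Fraction(65520, 691))

E4_cubed = mul(E4, mul(E4, E4))
E6_squared = mul(E6, E6)

check_8 = mul(E4, E4) == E8
check_10 = mul(E4, E6) == E10
check_12 = [691 * c for c in E12] == [441 * a + 250 * b
                                      for a, b in zip(E4_cubed, E6_squared)]
print(check_8, check_10, check_12)  # True True True
```

Exact rational arithmetic is used so that the comparison for {E_{12}}, whose coefficients have denominator {691}, is not affected by rounding.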

In most basic texts on modular forms, these identities are derived by proving and exploiting the fact that the space {M_{2k}} of modular forms of weight {2k} is finite-dimensional. For instance, the fact that {E_8} and {E_4^2} have the same value at {i\infty}, combined with {\dim M_{8} = 1}, implies {E_8=E_4^2}. However, there is another, more “hands on” way to derive these identities.

Let us prove that {E_8 = E_4^2}. Substituting Laurent series at the origin in the equation {\wp'(z)^2 - 4\wp(z)^3 + g_2\wp(z) + g_3=0}, we see after some rearranging that the function

\displaystyle (12c_2^2 - 36c_6)z^2 + (12c_2c_4 - 44c_8)z^4 + (-4c_2^3 + 4c_4^2 + 20c_2c_6 - 52c_{10})z^6 + \dots

is identically {0}, and therefore

\displaystyle  \begin{array}{rcl}  0 &= &12c_2^2 - 36c_6\\ 0 &=& 12c_2c_4 - 44c_8 \\ 0 &=& -4c_2^3 + 4c_4^2 + 20c_2c_6 - 52c_{10}\\ & \dots & \end{array}
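The three displayed coefficients can be checked mechanically. The following Python sketch uses the cubic form {\wp'(z)^2 - 4\wp(z)^3 + g_2\wp(z) + g_3} with {g_2 = 20c_2}, {g_3 = 28c_4}. It picks arbitrary sample values for {c_2} and {c_4} (an assumption made purely for illustration), solves the three relations for {c_6, c_8, c_{10}}, and verifies that every coefficient of the resulting Laurent series up to {z^6} vanishes.

```python
from fractions import Fraction

MAX_EXP = 6  # coefficients are compared up to z^6

def mul(a, b):
    """Multiply two Laurent polynomials stored as {exponent: coefficient}."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] = out.get(ea + eb, Fraction(0)) + ca * cb
    return out

def combine(*scaled_series):
    """Linear combination of (scalar, series) pairs."""
    out = {}
    for scalar, series in scaled_series:
        for e, coeff in series.items():
            out[e] = out.get(e, Fraction(0)) + scalar * coeff
    return out

# Arbitrary sample values for c_2, c_4; the rest are solved from the relations.
c = {2: Fraction(3), 4: Fraction(5)}
c[6] = c[2] ** 2 / 3
c[8] = 3 * c[2] * c[4] / 11
c[10] = (-4 * c[2] ** 3 + 4 * c[4] ** 2 + 20 * c[2] * c[6]) / 52

wp = {-2: Fraction(1), **c}  # z^-2 + c_2 z^2 + ... + c_10 z^10
wp_prime = {e - 1: e * coeff for e, coeff in wp.items()}  # termwise derivative
g2, g3 = 20 * c[2], 28 * c[4]

residual = combine(
    (1, mul(wp_prime, wp_prime)),
    (-4, mul(wp, mul(wp, wp))),
    (g2, wp),
    (1, {0: g3}),
)

# Terms beyond z^6 are affected by truncating wp after c_10, so ignore them.
vanishes = all(coeff == 0 for e, coeff in residual.items() if e <= MAX_EXP)
print(vanishes)  # True
```

Truncating {\wp} after {c_{10}z^{10}} is enough here, because a coefficient {c_k} with {k \geq 12} first contributes to the {z^8} term of the combination.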


Now recall that

\displaystyle c_{2m} = (2m+1)G_{2m+2} = (2m+1)2\zeta(2m+2)E_{2m+2} = (-1)^{m}(2m+1)\frac{(2\pi)^{2m+2}}{(2m+2)!}B_{2m+2}E_{2m+2}.

Thus, for instance, {c_6 = 14 \zeta(8)E_8 = \frac{14}{9450}\pi^8E_8}. Making these substitutions in the first equation and factoring out {\pi^8}, we see that

\displaystyle 0= \frac{12 \cdot 36}{90^2}E_4^2 - \frac{14 \cdot 36}{9450}E_8.

Since {\frac{12 \cdot 36}{90^2} = \frac{14 \cdot 36}{9450}}, we have {E_4^2 = E_8}.

In fact, we have a bijection between {\{(\wp_\Omega(z), \wp_\Omega'(z)) : \Omega \subseteq \mathbf C\}} and the set of infinite sequences {(c_2, c_4, c_6, \dots)} of complex numbers satisfying the infinite system of equations {I}:

\displaystyle  \begin{array}{rcl}  0 &= &12c_2^2 - 36c_6\\ 0 &=& 12c_2c_4 - 44c_8 \\ 0 &=& -4c_2^3 + 4c_4^2 + 20c_2c_6 - 52c_{10}\\ & \dots & \end{array}

Luckily, the values of {c_2} and {c_4} determine all of the others, and the ring {\mathbf C[c_2, c_4, c_6, \dots]/I} is generated by {c_2} and {c_4}, and is in fact isomorphic to {\mathbf C[c_2, c_4] = \mathbf C[G_4, G_6]}. This means precisely that {\{(\wp_\Omega(z), \wp_\Omega'(z)) : \Omega \subseteq \mathbf C\}} is in bijection with the closed points of {\text{Proj} (\mathbf C[G_4, G_6]) = \mathbf P^1_{\mathbf C}}.

A Noetherian and Hausdorff space is finite

In this post, I will prove that a Noetherian and Hausdorff topological space is finite (and therefore has the discrete topology, being Hausdorff). The proof is very short and pleasant.

Proof: Let {X} be such a space, and suppose that it is infinite. Let {\Sigma} be the collection of infinite closed subsets of {X}. It is nonempty since {X \in \Sigma}, and therefore has a minimal member {Z} by the Noetherian assumption. Let {p,q} be distinct points of {Z}, and {U,V} be disjoint open neighborhoods of {p} and {q} respectively (such {U} and {V} exist by the Hausdorff assumption). Then {X = (X-U) \cup (X-V)} since {U} and {V} are disjoint, so {Z = (Z \cap (X-U)) \cup (Z \cap (X-V))}. Now each of {Z \cap (X-U)} and {Z \cap (X-V)} is closed in {X}, and is properly contained in {Z} (the first one doesn’t contain {p}, and the second one doesn’t contain {q}). Therefore, by minimality of {Z}, each must be finite, and therefore {Z} is also finite, which is a contradiction. \Box

Corollary: in any infinite Hausdorff space, there exists a strictly descending infinite chain of closed subsets {Z_1 \supsetneq Z_2 \supsetneq Z_3 \supsetneq \dots}. The proof above can be easily adapted to construct such a sequence.

The Mayer-Vietoris sequence in sheaf cohomology

In this post, I will prove the Mayer-Vietoris Sequence for sheaf cohomology.

In what follows, {X} is a topological space and {\mathcal F, \mathcal G, \mathcal H} are sheaves of abelian groups on {X}. Let {Z} be a closed subset of {X}. We let {\Gamma_Z(X,\mathcal F)} denote the global sections of {\mathcal F} with support in {Z}. The functor {\Gamma_Z(X, -)} is a left-exact additive functor from sheaves on {X} to abelian groups, and its right derived functors, denoted {H^i_Z(X, -)}, are the cohomology functors of {X} with support in {Z}. If {\mathcal F} is a sheaf, the presheaf {U \mapsto \Gamma_{Z \cap U}(U, \mathcal F)} is also a sheaf on {X}, denoted {\mathcal H^0_Z(\mathcal F)} and called the “subsheaf of {\mathcal F} with support in {Z}”.

The Mayer-Vietoris sequence, for a sheaf {\mathcal F} and for a pair of closed subsets {Y,Z \subseteq X}, is the long exact sequence of cohomology with supports

\displaystyle \dots \rightarrow H^i_{Y \cap Z}(X, \mathcal F) \rightarrow H^i_Y(X, \mathcal F) \oplus H^i_Z(X, \mathcal F) \rightarrow H^i_{Y \cup Z}(X, \mathcal F) \rightarrow H^{i+1}_{Y \cap Z}(X, \mathcal F) \rightarrow \dots

We will prove the existence of this sequence in several steps.

Lemma 1 Let {\mathcal E} be a flasque sheaf, {Y} a closed subset of {X}, and {U=X-Y}. Then the sequence

\displaystyle 0 \rightarrow \Gamma_Y(X, \mathcal E) \rightarrow \Gamma(X, \mathcal E) \rightarrow \Gamma(U, \mathcal E) \rightarrow 0

is exact.

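Proof: Left exactness is immediate from the definition of {\Gamma_Y}: a section has support in {Y} exactly when its restriction to {U} vanishes. The restriction map {\Gamma(X, \mathcal E) \rightarrow \Gamma(U, \mathcal E)} is surjective because {\mathcal E} is flasque. \Box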

Lemma 2 Let {\mathcal E} be a flasque sheaf, and let {Y, Z} be closed subsets of {X}. Then the sequence

\displaystyle 0 \rightarrow \Gamma_{Y \cap Z}(X, \mathcal E) \rightarrow \Gamma_Y(X, \mathcal E) \oplus \Gamma_Z(X, \mathcal E) \rightarrow \Gamma_{Y \cup Z}(X, \mathcal E) \rightarrow 0

is exact (where the first map is the diagonal embedding, and the second map is {(s, t) \mapsto s-t}).

Proof: Exactness is clear except possibly on the right. Let {U=X-Y}, {V=X-Z}, and let {D, E} be the short exact sequences

\displaystyle 0 \rightarrow \Gamma(X, \mathcal E) \rightarrow \Gamma(X, \mathcal E) \oplus \Gamma(X, \mathcal E) \rightarrow \Gamma(X, \mathcal E) \rightarrow 0


\displaystyle 0 \rightarrow \Gamma(U \cup V, \mathcal E) \rightarrow \Gamma(U, \mathcal E) \oplus \Gamma(V, \mathcal E) \rightarrow \Gamma(U \cap V, \mathcal E) \rightarrow 0

where the maps are defined similarly to those in the statement of the Lemma. There is an obvious morphism of short exact sequences {D \rightarrow E}, given by restriction. Since {\mathcal E} is flasque, this morphism is surjective onto each term of {E}. By the snake lemma, and using Lemma 1 to identify the kernels, we get the desired short exact sequence. \Box

Now we are ready to prove the existence of the Mayer-Vietoris sequence for {\mathcal F}. Let

\displaystyle 0 \rightarrow \mathcal F \rightarrow \mathcal E^0 \rightarrow \mathcal E^1 \rightarrow \dots

be a flasque resolution of {\mathcal F}. By Lemma 2, applied to each {\mathcal E^i}, we have a short exact sequence of complexes

\displaystyle 0 \rightarrow \Gamma_{Y \cap Z}(X, \mathcal E^\bullet) \rightarrow \Gamma_Y(X, \mathcal E^\bullet) \oplus \Gamma_Z(X, \mathcal E^\bullet) \rightarrow \Gamma_{Y \cup Z}(X, \mathcal E^\bullet) \rightarrow 0.

The long exact sequence of cohomology associated to this short exact sequence of complexes is precisely the Mayer-Vietoris sequence.

Ph.D. Comprehensive exam practice problems, Round 2

Exercise 1 Let {V} be the vector space of continuous real-valued functions on the interval {[0,\pi]}. Then, for any {f \in V},

\displaystyle 2 \int_0^\pi f(x)^2 \sin x dx \geq \left(\int_0^\pi f(x) \sin x dx\right)^2.

Proof: Let {d\mu} be the measure {\frac{\sin x dx}{2}} on {[0,\pi] = X}. Then {(X, d\mu)} is a probability space, {f} is Lebesgue-integrable on {X} and {t \mapsto t^2} is a convex function {\mathbf R \rightarrow \mathbf R}. By Jensen’s inequality,

\displaystyle \int_0^\pi f(x)^2 d\mu \geq \left(\int_0^\pi f(x) d\mu\right)^2.

Multiplying throughout by {4} we get the claimed inequality.
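As a numerical sanity check of the inequality (not a substitute for the proof), the sketch below evaluates both sides for two sample functions, {f(x) = x} and {f(x) = \cos x}. The sample functions, the midpoint rule and the helper names `integrate` and `both_sides` are choices made for illustration.

```python
import math

def integrate(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def both_sides(f):
    """Return (2 * int f^2 sin, (int f sin)^2) over [0, pi]."""
    lhs = 2 * integrate(lambda x: f(x) ** 2 * math.sin(x), 0.0, math.pi)
    rhs = integrate(lambda x: f(x) * math.sin(x), 0.0, math.pi) ** 2
    return lhs, rhs

lhs_id, rhs_id = both_sides(lambda x: x)  # f(x) = x
lhs_cos, rhs_cos = both_sides(math.cos)   # f(x) = cos x

print(lhs_id, rhs_id)    # close to 2*(pi^2 - 4) and pi^2
print(lhs_cos, rhs_cos)  # close to 4/3 and 0
```

For a constant {f} the two sides are equal, so a floating-point comparison there can go either way; the samples above were chosen to have a comfortable margin.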

Exercise 2 Let {T} be a linear operator on a finite-dimensional vector space {V}. (a) Prove that if every one-dimensional subspace of {V} is {T}-invariant, then {T} is a scalar multiple of the identity operator. (b) Prove that if every codimension-one subspace of {V} is {T}-invariant, then {T} is a scalar multiple of the identity operator.

Proof: (a) The hypothesis means that every nonzero vector of {V} is an eigenvector of {T}. Suppose {v_1, v_2} are linearly independent, with eigenvalues {\lambda_1}, {\lambda_2}. By assumption {v_1+v_2} is also an eigenvector, say with eigenvalue {\lambda}, so {\lambda v_1 + \lambda v_2 = T(v_1+v_2)= \lambda_1 v_1 + \lambda_2 v_2}; comparing coefficients gives {\lambda_1 = \lambda = \lambda _2}. Two nonzero linearly dependent vectors clearly have the same eigenvalue as well, so all nonzero vectors share a single eigenvalue, and {T} is a multiple of the identity operator.

(b) Let {T^\vee} be the dual operator on {V^\vee}. We claim that {T^\vee} satisfies the condition of {(a)}. First, we have the following:

Lemma 1 Two functionals {f, g : V \rightarrow k} (where {k} is the ground field) have the same kernel if and only if they are multiples of each other.

Proof: Indeed, it is trivial if either of {f} or {g} is {0} (in which case both are zero), so suppose neither is {0}. Recall that if {W \subseteq V^\vee} and we define {\mathrm{Ann}(W) = \{v \in V : h(v) = 0\: \: \forall h \in W\}}, then we have a canonical isomorphism {\mathrm{Ann}(W) \cong (V^\vee/W)^\vee}, which in particular implies {\dim \mathrm{Ann}(W) = \mathrm{codim}(W\subseteq V^\vee)}. If we apply this to {W=\left<f,g\right>}, we have, by assumption,

\displaystyle \mathrm{Ann}(W) = \ker f \cap \ker g = \ker f = \ker g

which has codimension {1} since {f,g \neq 0}. Therefore {W} has dimension {1}, and {f} and {g} are scalar multiples of each other. \Box

Now, back to {(b)}. Suppose that {0 \neq f \in V^\vee}. Then {\ker f} has codimension {1} in {V}, and therefore, under the hypothesis of (b), {T(\ker f) \subseteq \ker f}. This implies {\ker T^\vee(f) \supseteq \ker f}; indeed, if {v \in \ker f}, then {T^\vee(f)(v) = f(Tv) = 0 } since {Tv \in \ker f}. Since {\ker f} has codimension {1}, we either have equality, or {T^\vee(f) = 0}. If there is equality, then {T^\vee(f)} and {f} have the same kernel and therefore they are proportional, i.e. {f} is an eigenvector of {T^\vee}. If {T^\vee(f)=0} then {f} is trivially an eigenvector of {T^\vee}. In every case, we see that {f} is an eigenvector of {T^\vee}. By (a), {T^\vee}, and therefore {T}, is a multiple of the identity operator. \Box

Exercise 3 Let {T} be a linear operator on a finite-dimensional inner product space {V}.

  • (a) Define what is meant by the adjoint {T^*} of {T}.
  • (b) Prove that {\ker T^* = \mathrm{im}(T)^\perp}.
  • (c) If {T} is normal, prove that {\ker T = \ker T^*}. Give an example when the equality fails (and, of course, {T} is not normal).


  • (a) It is the unique linear operator {T^*} on {V} such that {\left<Tv, w\right> = \left<v, T^*w\right>} for every {v, w \in V}.
  • (b) Indeed,

    \displaystyle  \begin{array}{rcl}  v \in \ker T^* &\Leftrightarrow& \left<w, T^*v\right> = 0 \: \forall w \in V \\ &\Leftrightarrow& \left<Tw, v\right> = 0 \: \forall w \in V \\ &\Leftrightarrow& v \perp T(w)\: \forall w \in V \\ &\Leftrightarrow& v \in \mathrm{im}(T)^\perp. \end{array}

  • (c) A normal operator is one which commutes with its adjoint, i.e. {TT^* = T^*T}. Thus,

    \displaystyle  \begin{array}{rcl}  v \in \ker T^* &\Leftrightarrow& \left<T^*v, T^*v\right> = 0\\ &\Leftrightarrow& \left<TT^*v, v\right> = 0 \\ &\Leftrightarrow& \left<T^*Tv, v\right> = 0 \\ &\Leftrightarrow& \left<Tv, Tv\right> = 0\\ &\Leftrightarrow& Tv=0. \end{array}

    An example where the equality fails is supplied by the operator {T=\left(\begin{array}{ll} 1 & 1 \\ 0 & 0 \end{array}\right)} acting on {(\mathbf R^2, \bullet)} in the standard way. Here {T^*} is the transpose of {T}, and the vector {(1,-1)} is in the kernel of {T} but not of {T^*}, since {T^*(1,-1) = (1,1)}.
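As a quick sanity check of part (c), here is a minimal Python sketch. The matrix {\left(\begin{array}{ll} 1 & 1 \\ 0 & 0 \end{array}\right)} and the helper names `apply_matrix`, `transpose` and `matmul` are chosen here for illustration only. With the standard dot product on {\mathbf R^2}, the adjoint is the transpose.

```python
def apply_matrix(M, v):
    """Apply a matrix (list of rows) to a vector (tuple)."""
    return tuple(sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M)))

def transpose(M):
    """Transpose of a matrix; for the dot product this is the adjoint."""
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Illustrative non-normal operator (an assumption made for this sketch).
T = [[1, 1], [0, 0]]
T_star = transpose(T)
v = (1, -1)

image_under_T = apply_matrix(T, v)            # v lies in ker T
image_under_T_star = apply_matrix(T_star, v)  # but not in ker T^*
is_normal = matmul(T, T_star) == matmul(T_star, T)

print(image_under_T, image_under_T_star, is_normal)
```

The point of the check is only that a single vector separates the two kernels, and that {TT^*} and {T^*T} really do differ for this {T}.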


Ph.D. Comprehensive exam practice problems, Round 1

In May, I will be taking the qualifying exams for my Ph.D. Over the next few weeks, I will be posting practice problems and my solutions to them. Until the end of February, I will be reviewing linear algebra, single variable real analysis, complex analysis and multivariable calculus. In March and April, I will be focusing on algebra, geometry and topology.

Here are three problems to start.

Problem: Suppose that {A} is an {n \times n} real matrix with {n} distinct real eigenvalues. Show that {A} can be written in the form {\sum_{j=1}^n \lambda_j I_j} where each {\lambda_j} is a real number and the {I_j} are {n\times n} real matrices with {\sum_{j=1}^n I_j = I}, and {I_jI_l = 0} if {j \neq l}. Give a {2 \times 2} real matrix {A} for which such a decomposition is not possible and justify your answer.

Solution: For each {j}, let {E_j} denote the matrix with a {1} on the entry {(j,j)} and zeroes everywhere else. Then {\sum_j E_j = I} and {E_jE_l= 0} when {j\neq l}. Since {A} has {n} distinct real eigenvalues {\lambda_1, \dots, \lambda_n}, it is diagonalizable over {\mathbf R}, so there is an invertible real matrix {P} such that {P^{-1}AP = D}, where {D=\mathrm{diag}(\lambda_1, \dots, \lambda_n) = \sum_j \lambda_j E_j }. Let {I_j = PE_jP^{-1}}. Then

\displaystyle \sum_j \lambda_j I_j = P\left(\sum_j \lambda_j E_j\right) P^{-1} = PDP^{-1} = A.

Moreover, {\sum_j I_j = P\left(\sum_j E_j\right)P^{-1} = PP^{-1} = I}, and for {j \neq l} we have {I_jI_l = PE_jE_lP^{-1} = 0}.
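To see the construction in action, here is a minimal Python sketch with exact rational arithmetic. The matrix {A=\left(\begin{array}{ll} 2 & 1 \\ 0 & 3 \end{array}\right)}, with eigenvalues {2} and {3}, is an illustrative choice not taken from the problem, and the helper names are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative matrix (an assumption for this sketch): eigenvalues 2 and 3.
A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
eigenvalues = [Fraction(2), Fraction(3)]
# Columns of P are eigenvectors: (1, 0) for 2 and (1, 1) for 3.
P = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(1)]]
P_inv = inverse_2x2(P)

# E_j has a single 1 in position (j, j); I_j = P E_j P^{-1}.
E = [[[Fraction(int(r == c == j)) for c in range(2)] for r in range(2)]
     for j in range(2)]
I_parts = [matmul(matmul(P, E_j), P_inv) for E_j in E]

# Sum of the I_j, and the weighted sum of the I_j with the eigenvalues.
total = [[sum(I_j[r][c] for I_j in I_parts) for c in range(2)] for r in range(2)]
recombined = [[sum(lam * I_j[r][c] for lam, I_j in zip(eigenvalues, I_parts))
               for c in range(2)] for r in range(2)]
zero = [[0, 0], [0, 0]]

print(I_parts, total, recombined)
```

Using `Fraction` rather than floats keeps every identity exact, so the comparisons are genuine equalities and not approximations.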

For the second part, notice that if the matrix {A} is decomposed in the manner described above, the numbers {\lambda_j} with {I_j \neq 0} are necessarily eigenvalues of {A}, and at least one such {j} exists. Indeed, multiplying the equality {\sum I_j = I} by {I_l} and using that {I_lI_j = 0} when {l \neq j}, we find that {I_l^2=I_l}. Now let {v \in \mathbf R^n} be any nonzero vector. Since {\sum_j I_j v = v}, at least one of the terms in the sum is nonzero, say {I_l v \neq 0}. Then

\displaystyle AI_lv = \sum_j \lambda_j I_j I_lv = \lambda_l I_l^2v = \lambda_l I_lv,

and therefore {I_lv} is an eigenvector of {A} with eigenvalue {\lambda_l}. Thus, it is impossible for the matrix {A} to have such a decomposition if, say, it has no real eigenvalues, for example

\displaystyle A=\left(\begin{array}{ll} 0 & -1 \\ 1 & 0 \end{array}\right).


A divisibility identity for Euler’s totient function

In this note I will give a Galois-theoretic proof that for a prime {p} and positive integer {n},

\displaystyle n \mid \frac{\varphi(p^n-1)}{\varphi(p-1)}.

I’d love to see a more elementary proof if you can come up with one.
First we need the following:

Lemma 1 Let {Z_n} be the cyclic group with {n} elements. Let {m} be a positive divisor of {n}, and consider {Z_m} as a subgroup of {Z_n}. Then the number of automorphisms of {Z_n} which fix {Z_m} pointwise is equal to {\varphi(n)/\varphi(m)} (which, in particular, is an integer).

Proof of the Lemma: Note that any automorphism of {Z_n} fixes {Z_m}, though not necessarily pointwise: indeed {Z_n} has a unique subgroup of order {m}, and thus any automorphism of {Z_n} must take this subgroup to itself. Thus we have a group homomorphism {\text{Aut}(Z_n) \rightarrow \text{Aut}(Z_m)} which is easily seen to be surjective; its kernel is precisely the subgroup consisting of those automorphisms of {Z_n} which fix {Z_m} pointwise. The statement follows by comparing orders. {\square}
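The count in Lemma 1 is easy to test by brute force. The sketch below models {Z_n} as the integers mod {n}, so that its automorphisms are the maps {x \mapsto kx} with {\gcd(k,n)=1}, and the subgroup of order {m} is generated by {n/m}. The function names are hypothetical.

```python
from math import gcd

def phi(n):
    """Euler's totient, by direct counting (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def count_fixing_automorphisms(n, m):
    """Count automorphisms x -> k*x of Z/n fixing the order-m subgroup pointwise."""
    assert n % m == 0, "m must divide n"
    step = n // m  # the unique subgroup of order m is generated by n/m
    subgroup = [(step * t) % n for t in range(m)]
    count = 0
    for k in range(1, n + 1):
        if gcd(k, n) != 1:
            continue  # x -> k*x is an automorphism only when k is a unit mod n
        if all((k * x) % n == x for x in subgroup):
            count += 1
    return count

# Compare the brute-force count with phi(n)/phi(m) for every divisor m of n.
for n in range(1, 41):
    for m in range(1, n + 1):
        if n % m == 0:
            assert count_fixing_automorphisms(n, m) * phi(m) == phi(n)

print(count_fixing_automorphisms(12, 4))
```

For {n=12} and {m=4}, the automorphisms fixing the subgroup {\{0,3,6,9\}} pointwise are multiplication by {1} and by {5}, in agreement with {\varphi(12)/\varphi(4) = 2}.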

Now to prove the initial claim, consider the field extension {\mathbf{F}_{p^n}/\mathbf{F}_p}. Basic Galois theory tells us that this is a Galois extension of degree {n}. Consider the canonical homomorphism

\displaystyle \psi: \text{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p) \rightarrow \text{Aut}(\mathbf{F}_{p^n}^\times)

which restricts an {\mathbf{F}_p}-automorphism {\sigma} to the group of units of {\mathbf{F}_{p^n}}, a cyclic group of order {p^n-1}. It is an injective homomorphism, since {\sigma} is completely determined by where it sends the units. Moreover, for any {\sigma \in \text{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p)}, {\psi(\sigma)} lies in the subgroup of {\text{Aut}(\mathbf{F}_{p^n}^\times)} consisting of those automorphisms which fix pointwise the cyclic subgroup {\mathbf{F}_{p}^\times} of order {p-1}, because the Galois group consists of {\mathbf{F}_p}-homomorphisms. By the lemma, this subgroup has order {\frac{\varphi(p^n-1)}{\varphi(p-1)}}, and it contains the image of {\psi}, which has order {n} since {\psi} is injective. By Lagrange's theorem, {n} divides {\frac{\varphi(p^n-1)}{\varphi(p-1)}}.
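The divisibility is easy to confirm numerically for small cases. Here is a minimal Python sketch; the function names `phi` and `totient_quotient` are hypothetical, and the ranges of {p} and {n} are arbitrary.

```python
def phi(n):
    """Euler's totient via trial-division factorization."""
    result = n
    m = n
    d = 2
    while d * d <= m:
        if m % d == 0:
            while m % d == 0:
                m //= d
            result -= result // d  # multiply result by (1 - 1/d)
        d += 1
    if m > 1:
        result -= result // m  # one remaining prime factor
    return result

def totient_quotient(p, n):
    """Return phi(p^n - 1) / phi(p - 1), which is an integer by Lemma 1."""
    quotient, remainder = divmod(phi(p**n - 1), phi(p - 1))
    assert remainder == 0
    return quotient

# Check that n divides the quotient for a few small primes and exponents.
for p in (2, 3, 5, 7, 11, 13):
    for n in range(1, 9):
        assert totient_quotient(p, n) % n == 0

print(totient_quotient(2, 6), totient_quotient(5, 2))
```

For example, with {p=2} and {n=6} the quotient is {\varphi(63) = 36}, which is divisible by {6}; with {p=5} and {n=2} it is {\varphi(24)/\varphi(4) = 8/2 = 4}.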

Burnside’s lemma and Bell numbers

In this post I am going to show that {B_n}, the {n}th Bell number, is at least {n^n/n!}.

Our main tool will be Burnside’s lemma, which states that if a finite group {G} acts on a finite set {S}, the average number of fixed points of the elements of {G} is equal to the number of orbits of the action of {G} on {S}:

\displaystyle |G\backslash S|=\frac{1}{|G|}\sum_{g \in G}|\text{Fix }g|,

where {\text{Fix }g} is the set of fixed points of {g}.

We let {S_n}, the symmetric group on {n} letters, act on {[n]^n} component-wise, where {[n]=\{1,2,\dots,n\}}. The elements of {[n]^n} are {n}-tuples of integers between {1} and {n}. Now you may easily convince yourself that some permutation sends a tuple {(a_1, \dots, a_n)} to a tuple {(b_1, \dots, b_n)} if and only if {a_i=a_j} exactly when {b_i=b_j}. In other words, we view each {n}-tuple {(a_1, \dots, a_n)} as a function {\sigma: [n] \rightarrow [n]}; its fibres partition {[n]}, and composition with a permutation preserves the fibres of {\sigma}. It is immediate that {\sigma} and {\sigma'} are in the same orbit of {S_n} if and only if they have the same collection of fibres. For instance, {(1,1,2)} can be sent to {(3,3,1)} by the cycle {(1\: 3\: 2)}, but there is no way to send {(1,1,2)} to {(1,2,3)}, because any permutation will send {(1,1,2)} to a {3}-tuple of the form {(\bullet,\bullet,\circ)}.

Thus the orbits of {S_n} on {[n]^n} correspond to the partitions of {[n]}, so there are {B_n} of them. On the other hand, since our permutations act component-wise, we have

\displaystyle \text{Fix }_{[n]^n}g \cong (\text{Fix }_{[n]}g)^n,

i.e. the fixed points of {g} acting on {[n]^n} are the tuples {(a_1, \dots, a_n)} consisting of fixed points of {g} acting on {[n]}. Therefore, by Burnside’s lemma, we have

\displaystyle \frac{1}{n!}\sum_{g\in S_n} |\text{Fix }g|^n = B_n.

In fact the same argument shows that for any {m\geq n}, we have

\displaystyle \frac{1}{m!}\sum_{g\in S_m} |\text{Fix }g|^n = B_n.
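Both identities can be checked by brute force for small values. The Python sketch below enumerates all of {S_m}, so it is only practical for small {m}; the function names are hypothetical, and the Bell numbers are computed independently via the Bell triangle.

```python
from itertools import permutations
from math import factorial

def bell(n):
    """n-th Bell number, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]  # each row starts with the last entry of the previous one
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
    return row[0]

def burnside_average(m, n):
    """Average of |Fix g|^n over all g in S_m, by explicit enumeration."""
    total = 0
    for g in permutations(range(m)):
        fixed = sum(1 for i, gi in enumerate(g) if i == gi)
        total += fixed ** n
    quotient, remainder = divmod(total, factorial(m))
    assert remainder == 0  # Burnside's lemma guarantees an integer
    return quotient

# The average equals B_n whenever m >= n (checked here for small cases).
for n in range(1, 6):
    for m in range(n, 7):
        assert burnside_average(m, n) == bell(n)

print([bell(n) for n in range(1, 6)])
```

For {n=2} and {m=2}, the identity contributes {2^2 = 4} and the transposition contributes {0}, giving {4/2 = 2 = B_2}.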

In particular, the identity of {S_n} fixes all of {[n]} and so contributes {n^n} to the sum, while every other term is nonnegative. Hence we have

\displaystyle \frac{n^n}{n!} \leq B_n.

In fact, by using the fact that a permutation of {[n]} consists of a subset of {[n]} (the subset of fixed points), and a derangement of the remaining elements, we easily obtain the formula

\displaystyle B_n = \frac{1}{n!}\sum_{i=0}^n {n \choose i} !(n-i) i^n

where {!n} is the number of derangements of a set of {n} elements.
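Here is a minimal Python sketch that evaluates this formula and compares it with the first few Bell numbers, together with the lower bound {n^n/n!}. The function names are hypothetical, and the reference values {1, 2, 5, 15, 52, 203, 877, 4140} for {B_1, \dots, B_8} are the standard ones.

```python
from math import comb, factorial

def derangements(k):
    """Number of derangements !k, via !k = (k-1) * (!(k-1) + !(k-2))."""
    if k == 0:
        return 1
    prev, curr = 1, 0  # !0 and !1
    for i in range(2, k + 1):
        prev, curr = curr, (i - 1) * (prev + curr)
    return curr

def bell_via_derangements(n):
    """Evaluate (1/n!) * sum_i C(n, i) * !(n-i) * i^n for n >= 1."""
    total = sum(comb(n, i) * derangements(n - i) * i**n for i in range(n + 1))
    quotient, remainder = divmod(total, factorial(n))
    assert remainder == 0
    return quotient

# Standard values of B_1, ..., B_8 for comparison.
KNOWN_BELL = [1, 2, 5, 15, 52, 203, 877, 4140]

for n, expected in enumerate(KNOWN_BELL, start=1):
    assert bell_via_derangements(n) == expected
    # Lower bound n^n / n! <= B_n, written without division.
    assert n**n <= expected * factorial(n)

print([bell_via_derangements(n) for n in range(1, 9)])
```

For {n=3}, the nonzero terms are {i=1}, giving {3 \cdot 1 \cdot 1 = 3}, and {i=3}, giving {27}, so the sum is {30/3! = 5 = B_3}.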