# Chevalley’s theorem

The purpose of this post is to prove Chevalley’s theorem: If ${f: X \rightarrow Y}$ is a finite surjective morphism of noetherian separated schemes, with ${X}$ affine, then ${Y}$ is affine.

We will follow the outline in Hartshorne (III.3 Problems 1 & 2 and III.4 Problems 1 & 2).

Theorem 1 Let ${f: X \rightarrow Y}$ be an affine morphism of noetherian schemes. Then for any coherent sheaf ${\mathcal F}$ on ${X}$, there are natural isomorphisms for all ${i \geq 0}$,

$\displaystyle H^i(X, \mathcal F) \simeq H^i(Y, f_* \mathcal F).$

Proof: According to (II, Ex. 5.17), when ${f}$ is affine, the direct image functor ${f_*}$ induces an equivalence from the category of coherent ${\mathcal O_X}$-modules to the category of coherent ${f_*\mathcal O_X}$-modules. Moreover, an equivalence ${\tau : A \rightarrow B}$ of abelian categories (i.e. an additive functor which is also an equivalence) is exact. Therefore, if ${F: B \rightarrow \text{Ab}}$ is a left additive functor, by the uniqueness of the ${\delta}$-functor extending a given left additive functor, it follows that there exists a natural isomorphism ${R^i(F \circ \tau) \simeq R^i F \circ \tau}$ for each ${i}$. $\Box$

Theorem 2 Let ${X}$ be a noetherian scheme. Then ${X}$ is affine if and only if ${X_{\text{red}}}$ is.

Proof: Clearly ${X_\text{red}}$ is affine if ${X}$ is affine.

Conversely, suppose ${X_{\text{red}}}$ is affine. We prove that ${X}$ has cohomological dimension ${0}$, hence it is affine by Serre’s theorem (III.3.7). Let ${\mathcal F}$ be a quasi-coherent sheaf on ${X}$. As indicated in the hint, we let ${\mathcal N}$ denote the sheaf of nilpotents of ${X}$ and we consider the filtration

$\displaystyle \mathcal F \supseteq \mathcal N \cdot \mathcal F \supseteq \mathcal N^2 \cdot \mathcal F \supseteq \dots$

of ${\mathcal F}$. Since ${X}$ is noetherian, there exists an ${n>0}$ such that ${\mathcal N^n = 0}$, so the filtration is finite.

We prove by descending induction on ${j}$ that ${ \mathcal N^j \cdot \mathcal F}$ is acyclic. For ${j=n}$, it is trivial. Now consider the exact sequence of quasi-coherent sheaves on ${X}$,

$\displaystyle 0 \rightarrow \mathcal N^j \cdot \mathcal F\rightarrow \mathcal N^{j-1} \cdot \mathcal F \rightarrow (\mathcal N^{j-1} \cdot \mathcal F) / (\mathcal N^j \cdot \mathcal F) \rightarrow 0.$

The quasi-coherent sheaf ${(\mathcal N^{j-1} \cdot \mathcal F) / (\mathcal N^j \cdot \mathcal F)}$ is naturally a quasi-coherent ${\mathcal O_X / \mathcal N \simeq \mathcal O_{X_{\text{red}}}}$-module, and its cohomology can be calculated either as an ${\mathcal O_X}$-module or as an ${\mathcal O_{X_{\text{red}}}}$ module by Theorem 1 (using the fact that the reduction morphism ${X_{\text{red}} \to X}$ is affine). Therefore, it is acyclic, since ${X_{\text{red}}}$ is affine by assumption. The sheaf ${\mathcal N^j \cdot \mathcal F}$ is acyclic by the inductive hypothesis. By the long exact sequence of cohomology, we see that ${\mathcal N^{j-1} \cdot \mathcal F}$ is also acyclic. $\Box$

Theorem 3 Let ${X}$ be a reduced scheme. Then ${X}$ is affine if and only if each irreducible component of ${X}$ is affine.

Proof: The irreducible components of ${X}$ are closed subschemes of ${X}$, hence they are affine if ${X}$ is affine. Conversely, suppose that every irreducible component of ${X}$ is affine. We prove that ${X}$ has cohomological dimension ${0}$.

We proceed by induction on the number of irreducible components of ${X}$. If ${X}$ is irreducible, then the statement is vacuously true. Now suppose it holds for noetherian schemes with ${n-1}$ irreducible components. Suppose that ${X}$ has ${n}$ irreducible components, and write it as ${X=Y \cup X'}$ where ${Y}$ is irreducible. Let ${\mathcal F}$ be a quasi-coherent sheaf on ${X}$. Denote ${\tau}$ the inclusion ${Y \hookrightarrow X}$ and ${\iota}$ the inclusion ${X' \hookrightarrow X}$, where each closed subscheme is given the canonical reduced closed subscheme structure. Since ${Y}$ is Noetherian, ${ \tau_* \tau^* \mathcal F}$ is also a quasi-coherent sheaf on ${X}$, supported on ${Y}$. There is a canonical morphism ${\mathcal F \rightarrow \tau_* \tau^* \mathcal F}$, and ${\mathcal F \rightarrow \iota_* \iota^* \mathcal F }$. (Each of these two morphisms is a unit of the “inverse image – direct image” adjunction). Let

$\displaystyle g : \mathcal F \rightarrow \tau_* \tau^* \mathcal F \oplus \iota_* \iota^* \mathcal F$

be their sum. It is easy to see that this morphism is surjective, and an isomorphism away from the intersection. Let ${\mathcal G= \ker g}$. Then ${\mathcal G}$ is quasi-coherent and supported in ${Y \cap X'}$. Therefore we have an exact sequence

$\displaystyle 0 \rightarrow \mathcal G \rightarrow \mathcal F \rightarrow \tau_* \tau^* \mathcal F \oplus \iota_* \iota^* \mathcal F \rightarrow 0$

Since ${X'}$ is affine by the induction hypothesis, ${Y \cap X'}$ is affine, being a closed subscheme of an affine scheme. Now, since ${\text{Supp }\mathcal G \subseteq Y \cap X'}$, the cohomology of ${\mathcal G}$ can be calculated either as an ${\mathcal O_{(Y \cap X')}}$-module or as an ${\mathcal O_X}$-module, and therefore it vanishes. Similarily the sheaf ${\tau_* \tau^* \mathcal F \oplus \iota_* \iota^* \mathcal F}$ is acyclic because ${Y}$ and ${X'}$ are affine. Therefore, by the long exact sequence of cohomology, ${\mathcal F}$ is also acyclic. $\Box$

Lemma 4 Let ${f: X \rightarrow Y}$ be a finite surjective morphism of integral noetherian schemes. Then there is a coherent sheaf ${\mathcal M}$ on ${X}$, and a morphism of sheaves ${\alpha : \mathcal O_Y^r \rightarrow f_* \mathcal M}$ for some ${r>0}$, such that ${\alpha}$ is an isomorphism at the generic point of ${Y}$.

Proof: Let ${L}$ be the function field of ${X}$ and ${K}$ be the function field of ${Y}$. Then the morphism ${f}$ gives rise to a field homomorphism ${K \hookrightarrow L}$. Since ${f}$ is finite and surjective, ${L}$ is finite over ${K}$, say of degree ${r}$. Let ${\{x_1, \dots, x_r\}}$ be a basis for ${L}$ over ${K}$. Each ${x_j}$ can be represented as a section ${s_j}$ of ${\mathcal O_X}$ over an open set ${U_j}$. Let ${\tau_j : U_j \hookrightarrow X}$ be the inclusion. Let ${\mathcal E_j}$ be the sheaf ${\mathcal E_j = s_j \cdot \mathcal O_{U_j}}$ on ${U_j}$. Obviously ${\mathcal E_j}$ is coherent (in fact free of rank ${1}$). Let ${\mathcal F_j = (\tau_j)_*(\mathcal E_j)}$. Then ${\mathcal F_j}$ is quasi-coherent on ${X}$ since ${U_j}$ is noetherian; since ${f}$ is finite, ${\mathcal F_j}$ is in fact coherent. Let ${\mathcal M = \bigoplus_j \mathcal F_j}$. Define the morphism ${\alpha : \mathcal O^r_Y \rightarrow f_*\mathcal M}$ by the global sections ${x_j}$ of ${f_*\mathcal M}$ (using the fact that ${\mathcal O_Y}$ represents the global sections functor ${\Gamma(Y, -)}$). Then, by construction, ${\alpha}$ is an isomorphism of ${K}$-vector spaces ${K^r \cong L}$ at the generic point of ${Y}$. $\Box$

Lemma 5 Let ${f: X \rightarrow Y}$ be a finite surjective morphism of integral noetherian schemes. Then for any coherent sheaf ${\mathcal F}$ on ${Y}$, there exists a coherent sheaf ${\mathcal G}$ on ${X}$, and a a morphism ${\beta : f_* \mathcal G \rightarrow \mathcal F^r}$ which is an isomorphism at the generic point of ${Y}$.

Proof: We take ${\beta = \mathcal{H}\text{om}(\alpha, \mathcal F)}$, where ${\mathcal{H}\text{om}}$ is the sheaf ${\mathcal{H}\text{om}}$ and ${\alpha}$ is the morphism of Lemma 4:

$\displaystyle \beta: \mathcal{H}\text{om}(f_*\mathcal M, \mathcal F) \rightarrow \mathcal{H}\text{om}(\mathcal O_Y^r, \mathcal F).$

Remark that ${\mathcal{H}\text{om}(\mathcal O_Y^r, \mathcal F) \simeq \mathcal F^r}$. Moreover, the sheaf ${\mathcal{H}\text{om}(f_*\mathcal M, \mathcal F)}$ naturally has a structure of ${f_*\mathcal O_X}$-module. By (II, Ex. 5.17), when ${f}$ is an affine morphism, ${f_*}$ induces an equivalence between the category of coherent ${\mathcal O_Y}$-modules and the category of coherent ${f_*\mathcal O_X}$-modules. Therefore ${\mathcal{H}\text{om}(f_*\mathcal M, \mathcal F)}$ is isomorphic to an ${\mathcal O_Y}$-module of the form ${f_*\mathcal G}$, where ${\mathcal G}$ is a coherent ${\mathcal O_X}$-module. Thus ${\beta}$ has the form ${f_* \mathcal G \rightarrow \mathcal F^r}$.

Moreover, it follows from the fact that a coherent sheaf on a noetherian scheme is finitely presented that on such a scheme, taking sheaf ${\mathcal{H}\text{om}}$ commutes with taking stalks of morphisms; therefore ${\beta}$ is also an isomorphism at the generic point of ${Y}$. $\Box$

Now we are ready to prove Chevalley’s theorem.

Theorem 6 (Chevalley’s theorem). Let ${f: X \rightarrow Y}$ be a finite surjective morphism of noetherian separated schemes, where ${X}$ is affine. Then ${Y}$ is affine.

Proof: By Theorems 2 and 3, we may suppose that ${X}$ and ${Y}$ are reduced and irreducible. We prove by contradiction that ${Y}$ is affine. Let ${\Sigma}$ be the collection of closed subschemes of ${Y}$ which are not affine. Suppose it not empty; then it contains a minimal element ${Z \hookrightarrow X}$, which we may view as having the reduced induced subscheme structure. Since finite morphisms are stable under base change, we may in fact suppose that ${Z=Y}$ (what this means is that we are replacing ${f}$ by its restriction to ${f^{-1}(Z)}$ if necessary). Therefore, we suppose that every proper closed subscheme of ${Y}$ is affine.

Let ${\mathcal F}$ be a coherent sheaf on ${X}$. By Lemma ${5}$, there exists a coherent sheaf ${\mathcal G}$ on ${X}$ and a morphism ${\beta: f_* \mathcal G \rightarrow \mathcal F^r}$ which is generically an isomorphism (and which is therefore surjective, since ${Y}$ is irreducible). Thus, if ${\mathcal D = \ker \beta}$, we have an exact sequence of sheaves on ${Y}$

$\displaystyle 0 \rightarrow \mathcal D \rightarrow f_* \mathcal G \rightarrow \mathcal F^r \rightarrow 0.$

Now, as in the proof of Theorem 3, we view ${\mathcal D}$ as a quasi-coherent sheaf on the proper closed subscheme ${\text{Supp }\mathcal D}$. By the minimality of ${Y}$, ${\text{Supp }\mathcal D}$ is affine and therefore ${\mathcal D}$ is acyclic. Moreover, since a finite morphism is affine, we can apply Theorem 1 to see that ${f_* \mathcal G}$ is also acyclic. Therefore, by the long exact sequence of cohomology, ${\mathcal F^r}$ is acyclic, so ${\mathcal F}$ is acyclic. $\Box$

Thefore, ${Y}$ has cohomological dimension ${0}$, which contradicts the assumption that it is not affine.

# A Noetherian and Hausdorff space is finite

In this post, I will prove that a Noetherian and Hausdorff topological space is finite (and therefore has the discrete topology, being Hausdorff). The proof is very short and pleasant.

Proof: Let ${X}$ be such a space, and suppose that it is infinite. Let ${\Sigma}$ be the collection of infinite closed subsets of ${X}$. It is nonempty since ${X \in \Sigma}$, and therefore has a minimal member ${Z}$ by the Noetherian assumption. Let ${p,q}$ be points of ${Z}$, and ${U,V}$ be disjoint open neighborhoods of ${p}$ and ${q}$ respectively (such ${U}$ and ${V}$ exist by the Hausdorff assumption). Then ${X = (X-U) \cup (X-V)}$ since ${U}$ and ${V}$ are disjoint, so ${Z = (Z \cap (X-U)) \cup (Z \cap (X-V))}$. Now each of ${Z \cap (X-U)}$ and ${Z \cap (X-V)}$ is closed in ${X}$, and is properly contained in ${Z}$ (the first one doesn’t contain ${p}$, and the second one doesn’t contain ${q}$). Therefore, by minimality of ${Z}$, each must be finite, and therefore ${Z}$ is also finite, which is a contradiction. $\Box$

Corollary: in any infinite Hausdorff space, there exists a strictly descending infinite chain of closed subsets $Z_1 \supset Z_2 \supset Z_3 \dots$. The proof above can be easily adapted to construct such a sequence.

# Ph.D. Comprehensive exam practice problems, Round 2

Exercise 1 Let ${V}$ be the vector space of continuous real-valued functions on the interval ${[0,\pi]}$. Then, for any ${f \in V}$,

$\displaystyle 2 \int_0^\pi f(x)^2 \sin x dx \geq \left(\int_0^\pi f(x) \sin x dx\right)^2.$

Proof: Let ${d\mu}$ be the measure ${\frac{\sin x dx}{2}}$ on ${[0,\pi] = X}$. Then ${(X, d\mu)}$ is a probability space, ${f}$ is Lebesgue-integrable on ${X}$ and ${t \mapsto t^2}$ is a convex function ${\mathbf R \rightarrow \mathbf R}$. By Jensen’s inequality,

$\displaystyle \int_0^\pi f(x)^2 d\mu \geq \left(\int_0^\pi f(x) d\mu\right)^2.$

Multiplying throughout by ${4}$ we get the claimed inequality.
$\Box$

Exercise 2 Let ${T}$ be a linear operator on a finite-dimensional vector space ${V}$. (a) Prove that if every one-dimensional subspace of ${V}$ is ${T}$-invariant, then ${T}$ is a scalar multiple of the identity operator. (b) Prove that if every codimension-one subspace of ${V}$ is ${T}$-invariant, then ${T}$ is a scalar multiple of the identity operator.

Proof: (a) The hypothesis means that every nonzero vector of ${V}$ is an eigenvector of ${T}$. Suppose ${v_1, v_2}$ are eigenvectors of ${T}$ with eigenvalues ${\lambda_1}$, ${\lambda_2}$. Since, by assumption ${v_1+v_2}$ is also an eigenvector, and ${v_1}$ and ${v_2}$ are independent, we can read off the eigenvalue of ${v_1 + v_2}$ off of either coefficient in the equation ${T(v_1+v+2)= \lambda_1 v_1 + \lambda_2 v_2}$, and therefore ${\lambda_1 = \lambda _2}$. Therefore ${T}$ is a multiple of the identity operator.

(b) Let ${T^\vee}$ be the dual operator on ${V^\vee}$. We claim that ${T^\vee}$ satisfies the condition of ${(a)}$. First, we have the following:

Lemma 1 Two functionals ${f, g : V \rightarrow k}$ (where ${k}$ is the ground field) have the same kernel if and only if they are multiples of each other.

Proof: Indeed, it is trivial if either of ${f}$ or ${g}$ is ${0}$ (in which case both are zero), so suppose neither is ${0}$. Recall that if ${W \subseteq V^\vee}$ and we define ${\mathrm{Ann}(W) = \{v \in V : f(v) = 0\: \: \forall w \in W\}}$, then we have a canonical isomorphism ${\mathrm{Ann}(W) \cong (V^\vee/W)^\vee}$, which in particular implies ${\dim \mathrm{Ann}(W) = \mathrm{codim}(W\subseteq V^\vee)}$. If we apply this to ${W=\left}$, we have, under assumption,

$\displaystyle \mathrm{Ann}(W) = \ker f \cap \ker g = \ker f = \ker g$

which has codimension ${1}$ since ${f,g \neq 0}$. Therefore ${W}$ has dimension ${1}$, and ${f}$ and ${g}$ are scalar multiples of each other. $\Box$

Now, back to ${(b)}$. S suppose that ${0 \neq f \in V^\vee}$. Then ${\ker f}$ has codimension ${1}$ in ${V}$, and therefore, under the hypothesis of (b), ${T(\ker f) \subseteq \ker f}$. This implies ${\ker T^\vee(f) \supseteq \ker f}$; indeed, if ${v \in \ker f}$, then ${T^\vee(f)(v) = f(Tv) = 0 }$ since ${Tv \in \ker f}$. Since ${\ker f}$ is codimension ${1}$, we either have equality, or ${T^\vee(f) = 0}$. If there is equality, then ${T^\vee(f)}$ and ${\ker f}$ have the same kernel and therefore they are proportional, i.e. ${f}$ is an eigenvector of ${T^\vee}$. If ${T^\vee(f)=0}$ then ${f}$ is trivially an eigenvector of ${T^\vee}$. In every case, we see that ${f}$ is an eigenvector of ${T^\vee}$. By (a), ${T^\vee}$, and therefore ${T}$, is a multiple of the identity operator. $\Box$

Exercise 3 Let ${T}$ be a linear operator on a finite-dimensional inner product space ${V}$.

• (a) Define what is meant by the adjoint ${T^*}$ of ${T}$.
• (b) Prove that ${\ker T^* = \mathrm{im}(T)^\perp}$.
• (c) If ${T}$ is normal, prove that ${\ker T = \ker T^*}$. Give an example when the equality fails (and, of course, ${T}$ is not normal).

Proof:

• (a) It is the unique linear operator ${T^*}$ on ${V}$ such that ${\left = \left}$ for every ${v, w \in V}$.
• (b) Indeed,

$\displaystyle \begin{array}{rcl} v \in \ker T^* &\Leftrightarrow& \left = 0 \: \forall w \in W \\ &\Leftrightarrow& \left = 0 \: \forall w \in W \\ &\Leftrightarrow& v \perp T(w)\: \forall w \in W. \end{array}$

• (c) A normal operator is one which commutes with its adjoint, i.e. ${TT^* = T^*T}$. Thus,

$\displaystyle \begin{array}{rcl} v \in \ker T^* &\Leftrightarrow& \left = 0\\ &\Leftrightarrow& \left = 0 \\ &\Leftrightarrow& \left = 0 \\ &\Leftrightarrow& \left = 0\\ &\Leftrightarrow& Tv=0. \end{array}$

An example where the equality fails is supplied by the operator ${T=\left(\begin{array}{ll} 1 & 1 \\ 0 & 1 \end{array}\right)}$ acting on ${(\mathbf R^2, \bullet)}$ in the standard way. The vector ${(1,-1)}$ is in the kernel of ${T}$ but not of ${T^*}$.

$\Box$

# Ph.D. Comprehensive exam practice problems, Round 1

In May, I will be taking the qualifying exams for my Ph.D.. Over the next few weeks, I will be posting practice problems and my solutions to them. Until the end of February, I will be reviewing linear algebra, single variable real analysis, complex analysis and multivariable calculus. In March and April, I will be focusing on algebra, geometry and topology.

Here are three problems to start.

Problem: Suppose that ${A}$ is an ${n \times n}$ real matrix with ${n}$ distinct real eigenvalues. Show that ${A}$ can be written in the form ${\sum_{j=1}^n \lambda_j I_j}$ where each ${\lambda_j}$ is a real number and the ${I_j}$ are ${n\times n}$ real matrices with ${\sum_{j=1}^n I_j = I}$, and ${I_jI_l = 0}$ if ${j \neq l}$. Give a ${2 \times 2}$ real matrix ${A}$ for which such a decomposition is not possible and justify your answer.

Solution: for each ${j}$, let ${E_j}$ denote the matrix with a ${1}$ on the entry ${(j,j)}$ and zeroes everywhere else. Then ${\sum_j E_j = I}$ and ${E_jE_l= 0}$ when ${j\neq l}$. Since ${A}$ has ${n}$ distinct real eigenvalues ${\lambda_1, \dots, \lambda_n}$, it is diagonalizable over ${\mathbf R}$, so there is a real matrix ${P}$ such that ${P^{-1}AP = D}$, where ${D=\mathrm{diag}(\lambda_1, \dots, \lambda_n) = \sum_j \lambda_j E_j }$. Let ${I_j = PE_jP^{-1}}$. Then

$\displaystyle \sum_j \lambda_j I_j = P\left(\sum_j \lambda_j E_j\right) P^{-1} = PDP^{-1} = A.$

Moreover, for ${j \neq l}$ we have ${I_jI_l = PE_jE_lP^{-1} = 0}$.

For the second part, notice that if the matrix ${A}$ is decomposed in the manner described above, the numbers ${\lambda_j}$ are necessarily eigenvalues of ${A}$. Indeed, multiplying the equality ${\sum I_j = I}$ by ${I_l}$ and using that ${I_lI_j = 0}$ when ${l \neq j}$, we find that ${I_l^2=I_l}$. Hence, let ${v \in \mathbf R^n}$ be any nonzero vector. Since ${\sum_j I_j v = v}$, at least one of the terms in the sum is nonzero, say ${I_l v \neq 0}$. Then

$\displaystyle AI_lv = \sum_j \lambda_j I_j I_lv = \lambda_l I_l^2v = \lambda_l I_lv,$

and therefore ${I_lv}$ is an eigenvector of ${A}$ with eigenvalue ${\lambda_l}$. Thus, it is impossible for the matrix ${A}$ to have such a decomposition if, say, it has no real eigenvalues, for example

$\displaystyle A=\left(\begin{array}{ll} 0 & -1 \\ 1 & 0 \end{array}\right).$

# A divisibility identity for Euler’s totient function

In this note I will give a Galois-theoretic proof that for a prime ${p}$ and positive integer ${n}$,

$\displaystyle n \mid \frac{\varphi(p^n-1)}{\varphi(p-1)}.$

I’d love to see a more elementary proof if you can come up with one.
First we need the following:

Lemma 1 Let ${Z_n}$ be the cyclic group with ${n}$ elements. Let ${m}$ be a positive divisor of ${n}$, and consider ${Z_m}$ as a subgroup of ${Z_n}$. Then the number of automorphisms of ${Z_n}$ which fix ${Z_m}$ pointwise is equal to ${\varphi(n)/\varphi(m)}$ (which, in particular, is an integer).

Proof of the Lemma: Note that any automorphism of ${Z_n}$ fixes ${Z_m}$, though not necessarily pointwise: indeed ${Z_n}$ has a unique subgroup of order ${m}$, and thus any automorphism of ${Z_n}$ must take this subgroup to itself. Thus we have a group homomorphism ${\text{Aut}(Z_n) \rightarrow \text{Aut}(Z_m)}$ which is easily seen to be surjective; its kernel is precisely the subgroup consisting of those automorphisms of ${Z_n}$ which fix ${Z_m}$ pointwise. The statement follows by comparing orders. ${\square}$

Now to prove the initial claim, consider the field extension ${\mathbf{F}_{p^n}/\mathbf{F}_p}$. Basic Galois theory tells that this is a Galois extension of degree ${n}$. Consider the canonical homomorphism

$\psi: \displaystyle \text{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p) \rightarrow \text{Aut}(\mathbf{F}_{p^n}^\times)$

which restricts an ${\mathbf{F}_p}$-automorphism ${\sigma}$ to the group of units of ${\mathbf{F}_{p^n}}$. Clearly it is an injective homomorphism since ${\sigma}$ is completely determined by where it sends the units. Moreover for any ${\sigma \in \text{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p)}$, $\psi(\sigma)$ lies in the subgroup of ${\mathbf{F}_{p^n}^\times}$ of those automorphisms fixing pointwise the cyclic subgroup ${\mathbf{F}_{p}^\times}$ of order ${p-1}$, because the Galois group consists of $\mathbf{F}_p$-homomorphisms. By the lemma the subgroup of these automorphisms has order ${\frac{\varphi(p^n-1)}{\varphi(p-1)}}$, whereas ${\text{Gal}(\mathbf{F}_{p^n}/\mathbf{F}_p)}$ has order ${n}$. This does it.

# The Problem of Misaddressed Letters

I have decided to switch the focus of this blog. Instead of expository write-ups, I will be posting mostly tidbits of fun mathematics, possibly without relation to one another.

In this post, I want to talk about the problem of derangements, fist considered by Niclaus Bernoulli (1687-1759), solved by him and, later, independently by Euler. If I write letters to ${100}$ different friends, and send the letters randomly among them, what is the probability that none of my friends will receive the letter personally addressed to them? Since there are ${100!}$ different ways of sending the letters, this probability equals ${!100/100!}$, where ${!100}$ denotes the number of ways of rearranging ${100}$ objects in such a way that no object is left in the same position. Such a permutation is called a derangement.

How can we calculate ${!100}$? First, note that any permutation of ${n}$ objects fixes certain elements, and deranges the others. The number of permutations of ${n}$ fixing exactly ${k}$ elements is equal to

$\displaystyle {n \choose k} !(n-k).$

Therefore, the total number of permutations is

$\displaystyle n! = \sum_{k=0}^n{n \choose k} !(n-k).$

In the language of species, we can say that the species of permutations is the product of the species of derangements and of the identity species.

In the language of generating functions, this translates to

$\displaystyle \frac{1}{1-x} = e^x D(x)$

where ${D(x)= \sum_{n=0}^\infty !n \frac{x^n}{n!}}$.

Therefore,

$\displaystyle D(x)=\frac{e^{-x}}{1-x}$

and from this we read off the formula

$\displaystyle !n = \sum_{k=0}^n {n \choose k}(-1)^k (n-k)!.$

In fact, rearranging this shows that

$\displaystyle \frac{!n}{n!} = \sum_{k=0}^n \frac{(-1)^k}{k!},$

which is simply the truncated Taylor series for ${e^{-x}}$, evaluated at ${1}$. Hence we see that

$\displaystyle \frac{!n}{n!} - e^{-1} \rightarrow 0.$

But even more is true: using Taylor’s remainder formula, we see that for all ${n>0}$,

$\displaystyle \left |\frac{!n}{n!} - e^{-1} \right | < \frac{1}{(n+1)!}.$

Hence, in fact, ${!n}$ is the nearest integer to ${n!/e}$, for all ${n}$.

Furthermore, as a consequence of the identity ${\frac{1}{1-x} = 1+\frac{x}{1-x}}$, we see that ${!n}$ satisfies the recurrence relation

$\displaystyle !n = n\times !(n-1) + (-1)^n.$

Note that this is the same recurrence as satisfied by ${n!}$, with an extra ${(-1)^n}$ term.

The derangement numbers appear in certain integrals. For instance, for ${n\geq 0}$,

$\displaystyle \int_1^e (\log u)^n \mathrm{d}u = (-1)^n(e!n-n!).$

This gives another proof that ${!n - n!e^{-1} \rightarrow 0}$ (and that it oscillates around ${0}$), since clearly the integral is positive and converges to ${0}$. It also gives a continuation of the function ${!n}$ to complex values.

# A cute problem

I came up with this little problem last night. It’s not very difficult to prove but still fun (I think). Here it is : let $A$ be a commutative ring, and let $h(u,v) \in A[u,v]$. Suppose that, for any polynomials $f(u), g(u) \in A[u]$, we have $h(f(u), g(u))=h(g(u), f(u))$. Then $h(u,v)=h(v,u)$.

I’ll post my solution in a couple of days to see if anyone can come up with an alternative solution in the meantime.