Minkowski Theory
We give a standard presentation of Minkowski theory.
Our exposition follows Neukirch’s Algebraic Number Theory. We are interested in using the archimedean places of a number field to do geometry. To give ourselves a goal to build toward, here is our target result.
Theorem 1. Let \(K\) be a number field of degree \(n\) and signature \((r_1,r_2)\). Then there is a nonzero \(\alpha\in\mathcal O_K\) such that \[\left|\operatorname N_{K/\mathbb Q}(\alpha)\right|\le\frac{n!}{n^n}\left(\frac4\pi\right)^{r_2}\sqrt{\left|\operatorname{disc}K\right|}.\]
Corollary 2. Let \(K\) be a number field of degree \(n\). Then \[\sqrt{\left|\operatorname{disc}K\right|}\ge\frac{n^n}{n!}\left(\frac\pi4\right)^{n/2}.\]
Proof. This follows from Theorem 1 by using the bounds \(\left|\operatorname N_{K/\mathbb Q}(\alpha)\right|\ge1\) and \(r_2\le n/2\). \(\blacksquare\)
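To get a feel for the sizes involved, here is a small numerical sketch of the two bounds above. The function names are made up for this illustration, and the sample field \(\mathbb Q(i)\) (degree \(2\), \(r_2=1\), discriminant \(-4\)) is just a convenient test case.

```python
import math

def minkowski_bound(n: int, r2: int, disc: int) -> float:
    """Right-hand side of Theorem 1 for degree n, r2 complex places, discriminant disc."""
    return math.factorial(n) / n**n * (4 / math.pi) ** r2 * math.sqrt(abs(disc))

def min_abs_discriminant(n: int) -> int:
    """Smallest integer value of |disc K| permitted by Corollary 2 in degree n."""
    root_bound = n**n / math.factorial(n) * (math.pi / 4) ** (n / 2)
    return math.ceil(root_bound**2)

# Q(i): the bound is 4/pi, so the element promised by Theorem 1 has norm exactly 1.
print(minkowski_bound(2, 1, -4))

# Lower bounds on |disc K| for small degrees, straight from Corollary 2.
for n in range(2, 6):
    print(n, min_abs_discriminant(n))
```

In degree \(2\), the corollary forces \(\left|\operatorname{disc}K\right|\ge3\), which is attained by \(\mathbb Q(\sqrt{-3})\).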
Remark 3. Another common application is to show that the class number is finite. For this, one needs to show that all ideal classes are represented by an ideal of bounded norm. Upon considering inverse classes, it is enough to show that any given ideal contains an element of small norm (relative to the norm of the ideal). The proof of this last assertion is parallel to Theorem 1 but does not immediately follow from it: one basically needs to replace \(\mathcal O_K\) by an integral ideal everywhere in the arguments below.
Our proof of Theorem 1 will use the following lattice point theorem.
Theorem 4 (Minkowski). Let \(V\) be a real inner product space of dimension \(n\). Let \(\Lambda\subseteq V\) be a lattice of rank \(n\), and let \(\Omega\subseteq V\) be a measurable, convex, and centrally symmetric subset. If \[\operatorname{vol}\Omega>2^n\operatorname{covol}\Lambda,\] then \(\Omega\cap\Lambda\) has a nonzero element.
Proof. The idea is to consider the image of \(\Omega\) along the projection \(\pi\colon V\to V/2\Lambda\).
- The main claim is that \(\pi\colon\Omega\to V/2\Lambda\) is not injective. Well, note that we can represent \(V/2\Lambda\) by some fundamental parallelepiped \(F\subseteq V\) of volume \(2^n\operatorname{covol}\Lambda\), and the projection map basically translates \(V\) into \(F\). Thus, injectivity of \(\pi\) would imply that \(\operatorname{vol}\Omega\le\operatorname{vol}F\), which is false by hypothesis.
- We complete the proof. By the first step, we are granted distinct \(x,y\in\Omega\) such that \(x-y\in2\Lambda\). But then \(-y\in\Omega\) because \(\Omega\) is centrally symmetric, and \(\frac{x-y}2\in\Omega\) because \(\Omega\) is convex. Thus, \(\frac{x-y}2\in\Omega\cap\Lambda\) is the desired nonzero element. \(\blacksquare\)
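The statement can be illustrated by a brute-force search in the plane. This is only a sketch under assumptions: the basis, the radius, and the helper `nonzero_lattice_point` are hypothetical choices for the illustration, and the search only looks at small integer coefficients.

```python
import itertools
import math

def nonzero_lattice_point(basis, contains, coeff_range=5):
    """Search small integer combinations of `basis` for a nonzero point of the region."""
    (a1, a2), (b1, b2) = basis
    for m, k in itertools.product(range(-coeff_range, coeff_range + 1), repeat=2):
        if (m, k) == (0, 0):
            continue  # skip the zero vector
        point = (m * a1 + k * b1, m * a2 + k * b2)
        if contains(point):
            return point
    return None

# Lattice spanned by (2, 0) and (0.5, 1); its covolume is |det| = 2.
basis = [(2.0, 0.0), (0.5, 1.0)]

# A disc of radius 1.7 has area pi * 1.7^2 > 2^2 * 2 = 8, so Minkowski applies.
radius = 1.7

def in_disc(p):
    return math.hypot(*p) <= radius

print(nonzero_lattice_point(basis, in_disc))
```

Shrinking the radius to \(0.5\) makes the search return `None`, which is consistent with the theorem because the volume hypothesis then fails.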
Remark 5. If \(\Omega\) is compact but merely has volume equal to \(2^n\operatorname{covol}\Lambda\), then the conclusion still holds by a limiting argument. Indeed, for each integer \(m>0\), Minkowski’s theorem applies to \((1+1/m)\Omega\) because its volume is \((1+1/m)^n\operatorname{vol}\Omega>2^n\operatorname{covol}\Lambda\), so we get some nonzero vector \(v_m\) in the compact set \[\Omega_m=(1+1/m)\Omega\cap(\Lambda\setminus\{0\}).\] These sets are decreasing in \(m\), so the sequence \(\{v_m\}\) lives in the compact set \(\Omega_1\) and hence has a subsequence approaching some vector \(v\). We claim that \(v\) is the required vector. Indeed, because each \(\Omega_m\) is closed and contains \(v_k\) for all \(k\ge m\), we see that \(v\in\Omega_m\) for each \(m\), but \(\bigcap_{m>0}\Omega_m=\Omega\cap(\Lambda\setminus\{0\})\), so we are done.
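As a quick illustration of why compactness matters here, take \(V=\mathbb R^n\) and \(\Lambda=\mathbb Z^n\), so that \(\operatorname{covol}\Lambda=1\). The closed cube \([-1,1]^n\) is compact, convex, and centrally symmetric with \[\operatorname{vol}[-1,1]^n=2^n=2^n\operatorname{covol}\Lambda,\] and it does contain nonzero lattice points such as \((1,0,\ldots,0)\). On the other hand, the open cube \((-1,1)^n\) has the same volume and is still convex and centrally symmetric, but its only lattice point is \(0\).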
Thus, to prove Theorem 1, we need to produce a vector space, a lattice, and a large subset of the vector space. Let’s begin with the vector space, for which we will choose \[K_{\mathbb R}=K\otimes_{\mathbb Q}\mathbb R.\] We are allowed to use any inner product to give a topology on \(K_{\mathbb R}\), but we take a moment to motivate a canonical choice.
Lemma 6. Fix a number field \(K\). Then there is an isomorphism \[K\otimes_{\mathbb Q}\mathbb C\to\prod_\tau\mathbb C,\] where the product on the right-hand side is taken over all embeddings \(K\hookrightarrow\mathbb C\).
Proof. Here, the map is given by \(\alpha\otimes z\mapsto(z\tau(\alpha))_\tau\), which we note is a morphism both of \(\mathbb C\)-vector spaces and of \(K\)-algebras. To check that this is an isomorphism, we write the map a different way: choose a primitive element \(\beta\) of \(K\) and use it to write \(K=\mathbb Q[x]/(f(x))\) for some irreducible polynomial \(f\) of degree \(n\). Then let \(\{\beta_1,\ldots,\beta_n\}\) be the roots of \(f\) in \(\mathbb C\), which are distinct because \(f\) is irreducible over \(\mathbb Q\), so the Chinese remainder theorem gives a composite \[K\otimes_{\mathbb Q}\mathbb C\cong\frac{\mathbb C[x]}{(f(x))}\cong\prod_{i=1}^n\frac{\mathbb C[x]}{(x-\beta_i)}\cong\prod_{i=1}^n\mathbb C.\] We now see that this is an isomorphism of \(\mathbb C\)-vector spaces and \(K\)-algebras, but this composite map sends \(\beta\otimes z\) to \((z\beta_i)_i\), so it agrees with the original map! \(\blacksquare\)
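For a concrete instance (a worked example that is not needed later), take \(K=\mathbb Q(i)=\mathbb Q[x]/(x^2+1)\). The roots of \(x^2+1\) in \(\mathbb C\) are \(\pm i\), so the composite reads \[K\otimes_{\mathbb Q}\mathbb C\cong\frac{\mathbb C[x]}{(x^2+1)}\cong\frac{\mathbb C[x]}{(x-i)}\times\frac{\mathbb C[x]}{(x+i)}\cong\mathbb C\times\mathbb C,\] and it sends \((a+bi)\otimes z\) to \((z(a+bi),z(a-bi))\). The two coordinates correspond to the two embeddings of \(K\) into \(\mathbb C\), namely the inclusion and complex conjugation.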
Now, \(K_{\mathbb C}=\prod_\tau\mathbb C\) admits a canonical positive-definite Hermitian inner product given by \[\langle(x_\tau),(y_\tau)\rangle=\sum_\tau x_\tau\overline{y_\tau}.\] This then restricts to a Hermitian \(\mathbb R\)-bilinear form on \(K_{\mathbb R}\).
Lemma 7. Fix a number field \(K\). The canonical positive-definite Hermitian inner product on \(K_{\mathbb C}\) restricts to a positive-definite symmetric inner product on \(K_{\mathbb R}\).
Proof. All adjectives are automatic as soon as we know that the canonical inner product is real-valued and symmetric on \(K_{\mathbb R}\). The point is that the image of \(K_{\mathbb R}\) in \(\prod_\tau\mathbb C_\tau\) is spanned by elements of the form \[\{(r\tau(\alpha))_\tau:r\in\mathbb R,\alpha\in K\}.\] Thus, if \((x_\tau)\) is in the image of \(K_{\mathbb R}\), then \(x_{\overline\tau}=\overline{x_\tau}\) for all embeddings \(\tau\). However, these \(n\) equations cut out an \(n\)-dimensional real subspace of \(K_{\mathbb C}\), so they exactly cut out \(K_{\mathbb R}\). (Note that the map \(K_{\mathbb R}\to K_{\mathbb C}\) is injective; e.g., this can be seen by writing \(K=\mathbb Q[x]/(f(x))\).) Thus, we conclude that the image of \(K_{\mathbb R}\) is \[\{(x_\tau)\in K_{\mathbb C}:x_{\overline\tau}=\overline{x_\tau}\text{ for all }\tau\}.\] The Hermitian form being real-valued and symmetric now follows because any \((x_\tau)\) and \((y_\tau)\) have \(\overline{x_\tau\overline{y_\tau}}=x_{\overline\tau}\overline{y_{\overline\tau}}\). In particular, taking the complex conjugate of the given Hermitian form merely rearranges the sum. \(\blacksquare\)
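The lemma can be checked numerically for quadratic fields. The sketch below is only an illustration: the helpers `embed` and `hermitian` are hypothetical names, and elements of \(\mathbb Q(\sqrt d)\) are represented by coordinate pairs \((a,b)\) standing for \(a+b\sqrt d\).

```python
import cmath

def embed(a, b, d):
    """Image of a + b*sqrt(d) under the two embeddings sqrt(d) -> +sqrt(d), -sqrt(d)."""
    root = cmath.sqrt(d)
    return (a + b * root, a - b * root)

def hermitian(x, y):
    """Canonical Hermitian product: sum over embeddings of x_tau * conj(y_tau)."""
    return sum(xt * yt.conjugate() for xt, yt in zip(x, y))

# Two sample elements of Q(sqrt(-5)): 1 + 2*sqrt(-5) and 3 - sqrt(-5).
x = embed(1, 2, -5)
y = embed(3, -1, -5)

value = hermitian(x, y)
print(value)            # imaginary part vanishes up to rounding
print(hermitian(y, x))  # same value, illustrating symmetry
```

Here the two complex embeddings are conjugate to each other, so the two summands are complex conjugates and their sum is real, exactly as in the proof.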
We have so far provided a vector space, and we have even described how to measure volumes on it by providing an inner product. We now turn to the lattice. Well, we are interested in finding nonzero integral elements in Theorem 1, so we will use the free abelian group \(\mathcal O_K\) as our lattice.
Proposition 8. Fix a number field \(K\) of degree \(n\). Let \(\Lambda_K\) be the image of \(\mathcal O_K\) in \(K_{\mathbb R}\). Then \(\Lambda_K\) is a lattice of rank \(n\) and covolume \(\sqrt{\left|\operatorname{disc}\mathcal O_K\right|}\).
Proof. By writing \(K=\mathbb Q[x]/(f(x))\) in the usual manner, we see that the map \(\iota\colon K\to K_{\mathbb R}\) is an injection because \(K\) is a field and \(K_{\mathbb R}\) is a product of fields. Thus, \(\mathcal O_K\) being free over \(\mathbb Z\) of rank \(n\) implies the same for \(\Lambda_K\).
It remains to compute the covolume. Fix an integral basis \(\{\alpha_1,\ldots,\alpha_n\}\) of \(\mathcal O_K\), so we are interested in calculating the volume of the fundamental parallelepiped spanned by the \(\iota(\alpha_\bullet)\)s. This volume is the (absolute) determinant of the linear operator taking an orthonormal basis to the \(\iota(\alpha_\bullet)\)s. But to compute such a determinant, we may as well work with an orthonormal basis taken from \(K_{\mathbb C}=\prod_\tau\mathbb C_\tau\). Using the obvious orthonormal basis of \(K_{\mathbb C}\) (which is not always defined over \(K_{\mathbb R}\)!), we find that the volume is \[\left|\det\left[\sigma_i\alpha_j\right]_{ij}\right|.\] We would like to show that the square of this volume is \(\left|\operatorname{disc}\mathcal O_K\right|\). Well, let the matrix in question be \(A\), and we see that \[\begin{aligned}A^\intercal A &= \begin{bmatrix}\sigma_1\alpha_1 & \cdots & \sigma_n\alpha_1 \\\vdots & \ddots & \vdots \\\sigma_1\alpha_n & \cdots & \sigma_n\alpha_n\end{bmatrix}\begin{bmatrix}\sigma_1\alpha_1 & \cdots & \sigma_1\alpha_n \\\vdots & \ddots & \vdots \\\sigma_n\alpha_1 & \cdots & \sigma_n\alpha_n\end{bmatrix} \\{} &= \begin{bmatrix}\operatorname{tr}_{K/\mathbb Q}(\alpha_1\alpha_1) & \cdots & \operatorname{tr}_{K/\mathbb Q}(\alpha_1\alpha_n) \\\vdots & \ddots & \vdots \\\operatorname{tr}_{K/\mathbb Q}(\alpha_n\alpha_1) & \cdots & \operatorname{tr}_{K/\mathbb Q}(\alpha_n\alpha_n)\end{bmatrix},\end{aligned}\] so the result follows by taking determinants everywhere. \(\blacksquare\)
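As a sanity check, one can compute \(\left|\det[\sigma_i\alpha_j]\right|\) directly for quadratic fields. This sketch assumes the standard (not proved here) description of \(\mathcal O_K\) for \(K=\mathbb Q(\sqrt d)\) with \(d\) squarefree: an integral basis is \(\{1,(1+\sqrt d)/2\}\) with discriminant \(d\) when \(d\equiv1\pmod4\), and \(\{1,\sqrt d\}\) with discriminant \(4d\) otherwise. All function names are hypothetical.

```python
import cmath

def integral_basis(d):
    """Integral basis of Q(sqrt(d)) as pairs (a, b) meaning a + b*sqrt(d)."""
    if d % 4 == 1:
        return [(1.0, 0.0), (0.5, 0.5)]
    return [(1.0, 0.0), (0.0, 1.0)]

def discriminant(d):
    """Discriminant of Q(sqrt(d)) for squarefree d (standard formula, assumed)."""
    return d if d % 4 == 1 else 4 * d

def covolume(d):
    """|det [sigma_i(alpha_j)]| for the two embeddings sqrt(d) -> +/- sqrt(d)."""
    root = cmath.sqrt(d)
    (a1, b1), (a2, b2) = integral_basis(d)
    det = (a1 + b1 * root) * (a2 - b2 * root) - (a2 + b2 * root) * (a1 - b1 * root)
    return abs(det)

for d in (-1, -3, -5, 2, 5):
    print(d, covolume(d) ** 2, abs(discriminant(d)))
```

For each sample \(d\), the squared covolume agrees with \(\left|\operatorname{disc}\mathcal O_K\right|\) up to rounding, as Proposition 8 predicts.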
It remains to construct our large subset of \(K_{\mathbb R}\). Because the norm of an element is the product of its embeddings, it would make sense to use the subset \[\left\{(x_\tau)\in K_{\mathbb R}:\left|\prod_\tau x_\tau\right|\le t\right\}\] for some \(t\), but this set is not convex. Instead, we use subsets of the form \(t\Omega\), where \[\Omega=\left\{(x_\tau)\in K_{\mathbb R}:\sum_\tau \left|x_\tau\right|\le 1\right\},\] which will work by the AM–GM inequality.
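To see the failure of convexity concretely (a small illustration for a real quadratic field, where \(K_{\mathbb R}=\mathbb R^2\)), fix \(t>0\) and choose \(N>2\sqrt t\). The points \((N,0)\) and \((0,N)\) both have product of coordinates equal to \(0\le t\), but their midpoint has \[\frac N2\cdot\frac N2=\frac{N^2}4>t.\] By contrast, any point of \(t\Omega\) satisfies \(\left|x_1\right|+\left|x_2\right|\le t\), and then the AM–GM inequality gives \(\left|x_1x_2\right|\le(t/2)^2\).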
Calculating the volume of \(t\Omega\) is a little technical, so we will save it for the end.
Lemma 9. Let \(K\) be a number field of degree \(n\). Then there is a nonzero \(\alpha\in\mathcal O_K\) such that \[\left|\operatorname N_{K/\mathbb Q}(\alpha)\right|\le\frac{2^n}{n^n\operatorname{vol}\Omega}\sqrt{\left|\operatorname{disc}K\right|}.\]
Proof. We use Minkowski’s theorem in the form of Remark 5. We use the vector space \(K_{\mathbb R}\) with the lattice \(\Lambda_K\), which is the image of \(\mathcal O_K\). Now, choose \(t>0\) with \[t^n=\frac{2^n\sqrt{\left|\operatorname{disc}\mathcal O_K\right|}}{\operatorname{vol}\Omega},\] which gives \[\operatorname{vol}(t\Omega)=t^n\operatorname{vol}\Omega=2^n\operatorname{covol}(\Lambda_K)\] by Proposition 8. Now, \(t\Omega\) is compact, convex, and centrally symmetric (it is basically a union of simplices when extended to \(K_{\mathbb C}\)), so we are granted some nonzero \(\alpha\in\mathcal O_K\) whose image lies in \(t\Omega\). It only remains to bound the norm of \(\alpha\), for which we see \[\left|\operatorname N_{K/\mathbb Q}(\alpha)\right|=\prod_\tau\left|\tau(\alpha)\right|.\] The AM–GM inequality asserts that \(\frac1n\sum_\tau\left|x_\tau\right|\ge\sqrt[n]{\prod_\tau\left|x_\tau\right|}\), and \(\sum_\tau\left|\tau(\alpha)\right|\le t\) because the image of \(\alpha\) lies in \(t\Omega\), so we see \[\left|\operatorname N_{K/\mathbb Q}(\alpha)\right|\le\left(\frac tn\right)^n.\] Plugging in for \(t^n\) completes the proof. \(\blacksquare\)
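The bookkeeping in this proof can be checked numerically. The sketch below takes the value \(\operatorname{vol}\Omega=2^{r_1}\pi^{r_2}/n!\) (established as Lemma 11) as an input, computes the scale \(t\) from the proof, and compares \((t/n)^n\) with the bound of Theorem 1. The function names and the sample signatures and discriminants are assumptions made for the illustration.

```python
import math

def omega_volume(r1, r2):
    """Volume of Omega as stated in Lemma 11."""
    n = r1 + 2 * r2
    return 2**r1 * math.pi**r2 / math.factorial(n)

def scale_t(r1, r2, disc):
    """The t from the proof: t^n = 2^n * sqrt|disc| / vol(Omega)."""
    n = r1 + 2 * r2
    return (2**n * math.sqrt(abs(disc)) / omega_volume(r1, r2)) ** (1 / n)

def lemma9_bound(r1, r2, disc):
    """The AM-GM bound (t/n)^n on the norm."""
    n = r1 + 2 * r2
    return (scale_t(r1, r2, disc) / n) ** n

def theorem1_bound(r1, r2, disc):
    """Right-hand side of Theorem 1."""
    n = r1 + 2 * r2
    return math.factorial(n) / n**n * (4 / math.pi) ** r2 * math.sqrt(abs(disc))

# Sample inputs (r1, r2, disc); the two bounds should agree up to rounding.
for r1, r2, disc in [(0, 1, -4), (2, 0, 5), (1, 1, -23)]:
    print(lemma9_bound(r1, r2, disc), theorem1_bound(r1, r2, disc))
```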
Thus, to prove Theorem 1, it is enough by Lemma 9 to show that \[\operatorname{vol}\Omega\stackrel?=\frac{2^{r_1}\pi^{r_2}}{n!}.\] As a technical step, we compute some integrals over simplices.
Lemma 10. For \(n>0\), let \(\Delta^n\subseteq\mathbb R^n\) be the subset \[\left\{(x_0,\ldots,x_{n-1}):x_i\ge0\text{ for all }i,\sum_{i=0}^{n-1}x_i\le1\right\}.\] For any sequence \(\{a_0,\ldots,a_n\}\) of positive integers, \[\int_{\Delta^n}x_0^{a_0-1}\cdots x_{n-1}^{a_{n-1}-1}(1-x_0-\cdots-x_{n-1})^{a_n-1}\,dx_0\cdots dx_{n-1}\] equals \[\frac{(a_0-1)!\cdots(a_n-1)!}{(a_0+\cdots+a_n-1)!}.\]
Proof. We induct on \(n\). If \(n=1\), we would like to check that \[\int_{0}^1x^{a-1}(1-x)^{b-1}\,dx\stackrel?=\frac{(a-1)!(b-1)!}{(a+b-1)!}.\] We induct on \(b\); the claim is immediate for \(b=1\) because both sides equal \(1/a\). Then for \(b>1\), we integrate by parts to see \[\int_{0}^1x^{a-1}(1-x)^{b-1}\,dx=\frac{b-1}a\int_{0}^1x^{a}(1-x)^{b-2}\,dx,\] so the claim follows from the inductive hypothesis applied to the pair \((a+1,b-1)\).
For the induction on \(n\), we isolate the integral over \(x_{0}\), which leaves us with \[\int_0^1I(1-x_{0})x_{0}^{a_{0}-1}\,dx_{0},\] where \(I(1-x_{0})\) is the integral \[\int_{(1-x_{0})\Delta^{n-1}}x_1^{a_1-1}\cdots x_{n-1}^{a_{n-1}-1}((1-x_{0})-x_1-\cdots-x_{n-1})^{a_n-1}\,dx_1\cdots dx_{n-1}.\] Rescaling each variable by \(1-x_0\) and evaluating the resulting integral by the induction, we see that \(I(1-x_0)\) is \((1-x_0)^{a_1+\cdots+a_n-1}\) times the claimed value for \((a_1,\ldots,a_n)\), so we are trying to compute \[\frac{(a_1-1)!\cdots(a_n-1)!}{(a_1+\cdots+a_n-1)!}\int_0^1(1-t)^{a_1+\cdots+a_{n}-1}t^{a_{0}-1}\,dt.\] The result now follows from the case of \(n=1\). \(\blacksquare\)
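The lemma can be checked in exact arithmetic. In this sketch (with hypothetical function names), the one-variable integral is computed independently by expanding \((1-x)^{b-1}\) with the binomial theorem, and the general case follows the shape of the induction: peel off \(x_0\), then multiply by the one-variable integral with parameters \(a_0\) and \(a_1+\cdots+a_n\).

```python
from fractions import Fraction
from math import comb, factorial

def beta_integral(a, b):
    """Exact integral of x^(a-1) * (1-x)^(b-1) over [0, 1], via binomial expansion."""
    return sum(Fraction((-1) ** k * comb(b - 1, k), a + k) for k in range(b))

def lemma10_value(*a):
    """Closed form from Lemma 10: product of (a_i - 1)! over (sum of a_i - 1)!."""
    numerator = 1
    for ai in a:
        numerator *= factorial(ai - 1)
    return Fraction(numerator, factorial(sum(a) - 1))

def simplex_integral(*a):
    """Evaluate the simplex integral by peeling off one variable at a time."""
    if len(a) == 2:
        return beta_integral(a[0], a[1])
    return simplex_integral(*a[1:]) * beta_integral(a[0], sum(a[1:]))

for params in [(3, 2), (2, 2, 1), (1, 1, 2, 2), (4, 1, 3)]:
    print(params, simplex_integral(*params), lemma10_value(*params))
```

Because everything is a `Fraction`, the two columns agree exactly rather than up to rounding.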
As discussed above (see Lemma 9), the following lemma completes the proof of Theorem 1.
Lemma 11. Fix a number field \(K\) of degree \(n\) and signature \((r_1,r_2)\). Define \(\Omega\) as above. Then \[\operatorname{vol}\Omega=\frac{2^{r_1}\pi^{r_2}}{n!}.\]
Proof. We reduce to Lemma 10. Enumerate our real embeddings by \(\rho_1,\ldots,\rho_{r_1}\), and enumerate our complex embeddings by \(\sigma_1,\overline\sigma_1,\ldots,\sigma_{r_2},\overline\sigma_{r_2}\).
To begin, consider the isomorphism \(K_{\mathbb R}\to\mathbb R^{r_1}\times\mathbb R^{2r_2}\) given by \[(x_\tau)\mapsto(x_{\rho_1},\ldots,x_{\rho_{r_1}},\operatorname{Re}x_{\sigma_1},\operatorname{Im}x_{\sigma_1},\ldots,\operatorname{Re}x_{\sigma_{r_2}},\operatorname{Im}x_{\sigma_{r_2}}).\] Quickly, note that this map is in fact well-defined and an isomorphism; both claims follow from the proof of Lemma 7. Indeed, well-definedness merely needs us to note that \(x_\rho\) is real for real \(\rho\). Similarly, because both sides have real dimension \(n\), being an isomorphism merely needs us to check injectivity, which follows because \(x_{\overline\sigma}=\overline{x_\sigma}\) for complex \(\sigma\).
We are going to compute the volume of the image of \(\Omega\) along this isomorphism, so we take a moment to note that the determinant of this isomorphism (when both sides have been given standard orthonormal bases) has absolute value \(2^{-r_2}\). Indeed, we may compute the determinant after extending to \(\mathbb C\) so that \(K_{\mathbb C}=\prod_\tau\mathbb C_\tau\). On one hand, the operator looks like the identity on the real embeddings; on the other hand, \(\mathbb C_\sigma\times\mathbb C_{\overline\sigma}\to\mathbb C^2\) looks like \[(x_\sigma,x_{\overline\sigma})\mapsto\left(\frac{x_\sigma+x_{\overline\sigma}}2,\frac{x_\sigma-x_{\overline\sigma}}{2i}\right),\] which has determinant of absolute value \(1/2\).
By the previous step, we would like to show that the subset \[\left\{(x_1,\ldots,x_{r_1},y_1,z_1,\ldots,y_{r_2},z_{r_2}):\sum_i\left|x_i\right|+2\sum_j\sqrt{y_j^2+z_j^2}\le1\right\}\] has volume \(2^{r_1}(\pi/2)^{r_2}/n!\). By restricting all \(x_\bullet\)s to be nonnegative (which divides the volume by \(2^{r_1}\)) and rescaling the \(y_\bullet\)s and \(z_\bullet\)s by a factor of \(2\) (which multiplies the volume by \(4^{r_2}\)), this is equivalent to showing that the volume of \[\left\{(x_1,\ldots,x_{r_1},y_1,z_1,\ldots,y_{r_2},z_{r_2}):x_i\ge0\text{ for all }i,\sum_ix_i+\sum_j\sqrt{y_j^2+z_j^2}\le1\right\}\] is \((2\pi)^{r_2}/n!\). Now, we pass to polar coordinates \((y_j,z_j)=(\ell_j\cos\theta_j,\ell_j\sin\theta_j)\) so that we are interested in showing that the integral \[(2\pi)^{r_2}\int_{\Delta^{r_1+r_2}}\ell_1\cdots\ell_{r_2}\,dx_1\cdots dx_{r_1}\,d\ell_1\cdots d\ell_{r_2}\] equals \((2\pi)^{r_2}/n!\). This follows from Lemma 10 applied to the sequence with \(r_2\) twos and \((r_1+1)\) ones: indeed, the numerator is then a product of ones, and the denominator is \((2r_2+r_1)!=n!\). \(\blacksquare\)
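As a sanity check in low degree (an illustration only, using the coordinates from the proof), consider the two kinds of quadratic fields. For \(K=\mathbb Q(i)\), we have \((r_1,r_2)=(0,1)\), and the image of \(\Omega\) in \(\mathbb R^2\) is the disc \(2\sqrt{y^2+z^2}\le1\) of radius \(1/2\), whose area is \[\pi\left(\frac12\right)^2=\frac\pi4=\frac{2^0(\pi/2)^1}{2!}.\] For a real quadratic field, we have \((r_1,r_2)=(2,0)\), and the image of \(\Omega\) is the square \(\left|x_1\right|+\left|x_2\right|\le1\) with vertices \((\pm1,0)\) and \((0,\pm1)\), whose area is \[2=\frac{2^2(\pi/2)^0}{2!}.\] Both agree with the claimed value \(2^{r_1}(\pi/2)^{r_2}/n!\) for the volume of the image.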