December 22nd
Today I learned the details of the proof of the class number formula. We begin with a definition of $n$-Lipschitz parametrizable. A set $S$ is $n$-Lipschitz parametrizable if and only if it can be covered by the images of finitely many Lipschitz functions $f_\bullet:[0,1]^n\to S.$ Lipschitz means that there exists a global constant $\lambda$ such that\[|f(x)-f(y)|\le\lambda|x-y|\]for all $x$ and $y.$ The claim we want to show is that if we have a lattice $\Lambda\subseteq\RR^n$ and a set $S\subseteq\RR^n$ with boundary $\del S$ that is $(n-1)$-Lipschitz parametrizable, then\[\#(tS\cap\Lambda)=\frac{\mu(S)}{\op{covol}(\Lambda)}t^n+O\left(t^{n-1}\right)\]as $t\to\infty$ over real values $t\in\RR.$ Read this as a generalized Gauss circle problem.
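To see the claim in its most familiar case, here is a minimal sketch for $n=2,$ $\Lambda=\ZZ^2,$ and $S$ the closed unit disk, so $\mu(S)=\pi.$ The function names are made up for illustration; the point is only that the error divided by $t^{n-1}=t$ stays bounded.

```python
import math

def lattice_points_in_disk(t: int) -> int:
    """Count (x, y) in Z^2 with x^2 + y^2 <= t^2, i.e. #(tS ∩ Z^2) for the closed unit disk S."""
    total = 0
    for x in range(-t, t + 1):
        # For this x, the admissible y satisfy |y| <= floor(sqrt(t^2 - x^2)).
        y_max = math.isqrt(t * t - x * x)
        total += 2 * y_max + 1
    return total

def scaled_error(t: int) -> float:
    """|#(tS ∩ Z^2) - mu(S) t^2| / t, which the claim says stays bounded as t grows."""
    return abs(lattice_points_in_disk(t) - math.pi * t * t) / t

if __name__ == "__main__":
    for t in (1, 10, 100, 1000):
        print(t, lattice_points_in_disk(t), round(scaled_error(t), 3))
```

Only integer $t$ is handled here, which keeps the arithmetic exact.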
To get the easy reductions out of the way, note it's sufficient to prove this for $t\in\ZZ$ because, as long as $tS$ grows with $t$ (for example, when $S$ is star-shaped about the origin),\[\#(\floor tS\cap\Lambda)\le\#(tS\cap\Lambda)\le\#(\ceil tS\cap\Lambda),\]so if the result is true for $\floor t$ and $\ceil t,$ then our error term for the middle term is upper-bounded by a constant multiple of $\ceil t^n-\floor t^n\le(t+1)^n-(t-1)^n=O\left(t^{n-1}\right).$
Additionally, we may take $\Lambda=\ZZ^n.$ Indeed, the lattice $\Lambda$ is the image of $\ZZ^n$ under an invertible linear transformation $L,$ and reversing this linear transformation will take $\Lambda\to\ZZ^n.$ Then $S$ gets sent to $L^{-1}(S),$ and the boundary of $S$ remains Lipschitz parametrizable under the transformation because linear maps are Lipschitz. Note that applying $L^{-1}$ to both does not change the number of lattice points, and $\mu(S)$ scales by the same factor $|\det(L)|$ as the covolume of the lattice, so the ratio $\mu(S)/\op{covol}(\Lambda)$ is unchanged. Technically this reduction isn't necessary, but it reduces headaches.
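As a quick numerical illustration of why the covolume shows up in the main term, here is a sketch with a hypothetical lattice $\Lambda$ spanned by $(2,0)$ and $(1,3)$ (chosen arbitrarily, covolume $6$) and $S$ the closed unit disk. Everything here is an illustrative assumption, not part of the proof.

```python
import math

# Hypothetical lattice for illustration: Lambda = L(Z^2) with basis vectors V1, V2.
V1 = (2, 0)
V2 = (1, 3)
COVOLUME = abs(V1[0] * V2[1] - V1[1] * V2[0])  # |det L| = 6

def lattice_points_in_disk(t: int) -> int:
    """Count points i*V1 + j*V2 of Lambda inside the closed disk of radius t."""
    count = 0
    # Points in the disk satisfy |3j| <= t and |2i| <= t + |j|, so |i|, |j| <= t suffices.
    for i in range(-t, t + 1):
        for j in range(-t, t + 1):
            x = i * V1[0] + j * V2[0]
            y = i * V1[1] + j * V2[1]
            if x * x + y * y <= t * t:
                count += 1
    return count

def predicted_main_term(t: int) -> float:
    """mu(S) / covol(Lambda) * t^2 with mu(S) = pi."""
    return math.pi * t * t / COVOLUME

if __name__ == "__main__":
    for t in (10, 30, 60):
        print(t, lattice_points_in_disk(t), round(predicted_main_term(t), 1))
```

Counting pairs $(i,j)$ here is exactly counting points of $\ZZ^2$ inside $L^{-1}(tS),$ which is the reduction described above.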
So we want to show that\[\#\left(tS\cap\ZZ^n\right)=\mu(S)t^n+O\left(t^{n-1}\right).\]To count the number of lattice points, we bound as in the Gauss circle problem. Partition $\RR^n$ into half-open unit cubes by\[C(a_1,\ldots,a_n)=\prod_{k=1}^n[a_k,a_k+1)\]for $(a_1,\ldots,a_n)\in\ZZ^n.$ Then we define the following two bounding functions\begin{align*} \iota^+(tS) &= \#\left\{a\in\ZZ^n:C(a)\cap tS\ne\emp\right\},\\ \iota^-(tS) &= \#\left\{a\in\ZZ^n:C(a)\subseteq tS\right\}.\end{align*}Note that both $\mu(tS)$ and $\#(tS\cap\ZZ^n)$ lie between $\iota^-(tS)$ and $\iota^+(tS).$ Indeed, to bound $\mu(tS),$ note $\iota^-(tS)\le\mu(tS)$ holds because the cubes are disjoint and each has volume $1,$ so the union of the cubes counted by $\iota^-(tS)$ has volume $\iota^-(tS)$ while sitting inside $tS$; similarly, $\mu(tS)\le\iota^+(tS)$ because every point of $tS$ is in some cube and is therefore in some cube counted by $\iota^+(tS).$
To bound $\#(tS\cap\ZZ^n),$ note $\iota^-(tS)\le\#(tS\cap\ZZ^n)$ because each cube $C(a)$ counted by $\iota^-(tS)$ contributes its own corner $a$ as a lattice point of $tS$; similarly, $\#(tS\cap\ZZ^n)\le\iota^+(tS)$ because every lattice point $a\in tS\cap\ZZ^n$ lies in its own cube $C(a),$ which is counted by $\iota^+(tS).$
Anyways, this means that\[\left|\mu(tS)-\#\left(tS\cap\ZZ^n\right)\right|\le\iota^+(tS)-\iota^-(tS),\]so it suffices to show that $\iota^+(tS)=\iota^-(tS)+O\left(t^{n-1}\right).$ Well, this quantity is\[\iota^+(tS)-\iota^-(tS)=\#\left\{a\in\ZZ^n:C(a)\cap tS\ne\emp,\,C(a)\not\subseteq tS\right\}.\]Namely, we count cubes $C(a)$ which are only partially contained in $tS.$ By connecting a point of $C(a)$ inside $tS$ to a point of $C(a)$ outside $tS,$ we see that every such cube intersects the boundary $\del(tS).$ So it suffices to show\[\#\left\{a\in\ZZ^n:C(a)\cap\del(tS)\ne\emp\right\}=O\left(t^{n-1}\right).\]So far this argument is written to mimic the Gauss circle problem. At this point, we would notice that the number of unit squares touching a circle of radius $t$ is bounded by a constant times the perimeter of a square of side length $2t$ surrounding the circle to finish. We will have to try a bit harder for the general case.
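The sandwich above can be checked directly in the Gauss circle case. The following is a sketch assuming $n=2$ and $S$ the closed unit disk, with integer $t$ so that all comparisons are exact; the helper names are invented for this illustration.

```python
import math

def _nearest_sq(a: int):
    """Infimum of x^2 over [a, a+1), and whether it is attained on the half-open interval."""
    if a >= 0:
        return a * a, True           # attained at the left endpoint x = a
    return (a + 1) ** 2, False       # approached as x -> a+1 from the left, never attained

def _farthest_sq(a: int) -> int:
    """Supremum of x^2 over [a, a+1)."""
    return max(a * a, (a + 1) ** 2)

def iota_counts(t: int):
    """Return (iota_minus, iota_plus) for tS, where S is the closed unit disk in R^2."""
    minus = plus = 0
    for a1 in range(-t - 1, t + 1):
        for a2 in range(-t - 1, t + 1):
            # C(a) lies inside the closed disk iff the supremum of |p|^2 over C(a) is <= t^2.
            if _farthest_sq(a1) + _farthest_sq(a2) <= t * t:
                minus += 1
            d1, hit1 = _nearest_sq(a1)
            d2, hit2 = _nearest_sq(a2)
            inf_sq = d1 + d2
            # C(a) meets the disk iff some point of C(a) has squared norm <= t^2.
            if inf_sq < t * t or (inf_sq == t * t and hit1 and hit2):
                plus += 1
    return minus, plus

def lattice_points_in_disk(t: int) -> int:
    """#(tS ∩ Z^2) for the closed unit disk S."""
    return sum(2 * math.isqrt(t * t - x * x) + 1 for x in range(-t, t + 1))

if __name__ == "__main__":
    for t in (1, 5, 20, 80):
        lo, hi = iota_counts(t)
        print(t, lo, lattice_points_in_disk(t), hi, (hi - lo) / t)
```

The last printed column is the number of partially covered cubes divided by $t,$ which should stay bounded.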
We see the boundary, so it's time to use the fact that $\del S$ (and therefore $\del(tS)=t\del S$) is $(n-1)$-Lipschitz parametrizable. Suppose $f_1,\ldots,f_m$ cover $\del S.$ Then we're interested in bounding\[\#\left\{a\in\ZZ^n:C(a)\cap\bigcup_{k=1}^mtf_k\left([0,1]^{n-1}\right)\ne\emp\right\}\le\sum_{k=1}^m\#\left\{a\in\ZZ^n:C(a)\cap tf_k\left([0,1]^{n-1}\right)\ne\emp\right\}.\]Therefore it suffices to show that for any Lipschitz $f:[0,1]^{n-1}\to\RR^n,$ we have that\[\#\left\{a\in\ZZ^n:C(a)\cap tf\left([0,1]^{n-1}\right)\ne\emp\right\}=O\left(t^{n-1}\right).\]Fix a Lipschitz constant $\lambda$ for $f.$
The main trick now is to divide $[0,1]^{n-1}$ into a total of $t^{n-1}$ subcubes of side length $1/t$ in the obvious way; we need the image of each subcube under $tf$ to meet only $O(1)$ of the cubes $C(a).$ Each of these subcubes has diameter $\sqrt{n-1}/t,$ so for any $x,y$ in the same subcube, we see\[|tf(x)-tf(y)|\le t\lambda|x-y|\le\lambda\sqrt{n-1}.\]Importantly, this is independent of $t.$ So now we get to be more liberal with our bounds: the above tells us that the image of one of these $t^{n-1}$ subcubes under $tf$ is contained in a ball centered at $tf(x)$ (for any $x$ in our subcube) of radius $\lambda\sqrt{n-1}.$ The number of cubes $C(a)$ meeting such a ball is certainly at most\[\left(2\lambda\sqrt{n-1}+2\right)^n\]because each coordinate $a_k$ of such a cube is confined to an interval of length $2\lambda\sqrt{n-1}+1.$ But this quantity is $O(1)$ with respect to $t,$ so summing over the $t^{n-1}$ subcubes finishes the proof.
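Here is a small sketch of the subdivision argument for $n=2,$ assuming $f$ is the standard parametrization of the unit circle (Lipschitz constant $\lambda=2\pi$). The code samples points rather than computing exact intersections, so the counts it reports are only lower bounds for the number of cubes touched; the names are hypothetical.

```python
import math

LAMBDA = 2 * math.pi  # a Lipschitz constant for f below

def f(u: float):
    """Lipschitz parametrization of the unit circle, f: [0, 1] -> R^2 (so n = 2)."""
    return math.cos(2 * math.pi * u), math.sin(2 * math.pi * u)

def cubes_per_piece(t: int, samples: int = 400):
    """Split [0, 1] into t pieces; for each piece, collect the cubes C(a) hit by sampled points of t*f."""
    pieces = []
    for j in range(t):
        hit = set()
        for i in range(samples + 1):
            u = (j + i / samples) / t          # sample point in the j-th piece [j/t, (j+1)/t]
            x, y = f(u)
            hit.add((math.floor(t * x), math.floor(t * y)))  # index a of the cube containing t*f(u)
        pieces.append(hit)
    return pieces

def per_piece_bound() -> float:
    """(2*lambda*sqrt(n-1) + 2)^n with n = 2."""
    return (2 * LAMBDA * math.sqrt(1) + 2) ** 2

if __name__ == "__main__":
    for t in (3, 10, 50):
        pieces = cubes_per_piece(t)
        total = len(set().union(*pieces))
        print(t, max(len(p) for p in pieces), total, total / t)
```

The per-piece maximum stays far below the crude bound, and the total divided by $t$ stays bounded, matching the $O\left(t^{n-1}\right)$ claim.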
It remains to apply this to the class number formula. Namely, we need to show $\del S_{\le1}$ is $(n-1)$-Lipschitz parametrizable, and then we need to compute $\mu(S_{\le1}).$ I'm only going to outline showing that $\del S_{\le1}$ is $(n-1)$-Lipschitz parametrizable because I don't think it's a very human result. Essentially, we will need to parameterize $S_{\le1}$ by logarithms somehow, so we note the following isomorphism.\begin{align*} K_\RR^\times=(\RR^\times)^r\times(\CC^\times)^s &\longrightarrow \RR^{r+s}\times\{\pm1\}^r\times[0,2\pi)^s \\ x &\longmapsto \op{Log}(x) \times (\sgn(x_1),\ldots,\sgn(x_r)) \times (\arg(z_1),\ldots,\arg(z_s)).\end{align*}Pushing $S_{\le1}$ through the isomorphism, we will have $2^r$ different components, one for each element of $\{\pm1\}^r,$ but we can parameterize these separately to cover $S_{\le1}.$ Additionally, we will require one parameter for each of the $s$ factors of $[0,2\pi),$ but this is still not a problem.
Parameterizing the $\op{Log}$ factor is a bit more obnoxious. Luckily for us, $\op{Log}(S)=F\oplus\RR(1,\ldots,1,2,\ldots,2).$ Then we can parameterize $F$ with its $r+s-1$ basis vectors (with coefficients coming from $[0,1)$), and then we just have to account for the $(1,\ldots,2)$ vector. This vector encodes the norm, which lives in $(0,1],$ so we can just use the norm to parameterize this factor. Thus, we have expressed $S_{\le1}$ as the image of some function\[f:[0,1)^{r+s-1}\times(0,1]\times\{\pm1\}^r\times[0,1)^s\to S_{\le1}.\]Note that the domain has dimension $n.$ The individual components of $f$ are expressed using linear transformations (for $[0,2\pi)$ and the norm) or exponentials (for $F$), all of which are locally Lipschitz and therefore continuous.
It follows that the interior of the half-open $n$-cube that is the domain of $f$ gets mapped to the interior of $S_{\le1},$ so the boundary $\del S_{\le1}$ must be covered by the image of the boundary of our $n$-cube. (Extend $f$ to the endpoints of each interval as necessary.) Finally, noting that the boundary of an $n$-cube is certainly $(n-1)$-Lipschitz parametrizable means that the image of the boundary under $f$ is also $(n-1)$-Lipschitz parametrizable, which is what we wanted.
This completes the first task: showing that $\del S_{\le1}$ is $(n-1)$-Lipschitz parametrizable. It remains to compute $\mu(S_{\le1}),$ which is a more interesting task. The actual reason I bothered doing as many details as I did to show $\del S_{\le1}$ is $(n-1)$-Lipschitz parametrizable is that the given mappings are how we are going to compute $\mu(S_{\le1}).$ For starters, note the Minkowski measure maps to\begin{align*} K_\RR^\times &\longrightarrow (\RR^\times)^r\times(\CC^\times)^s, \\ d\mu &\longmapsto (dx)^r(2dA)^s.\end{align*}The $2$ comes from the Minkowski measure double-counting complex embeddings. As suggested, we now transform to log space. Each $\RR^\times$ looks like\begin{align*} \RR^\times &\longrightarrow \RR \times \{\pm1\}, \\ x & \longmapsto (\log|x|,\sgn(x)), \\ dx & \longmapsto e^\ell d\ell\cdot d\mu_{\pm1}.\end{align*}The $\CC^\times$ are a bit weirder to make the $2$ behave. We write\begin{align*} \CC^\times &\longrightarrow \RR \times [0,2\pi), \\ z & \longmapsto (2\log|z|,\arg(z)), \\ 2dA & \longmapsto e^\ell d\ell\cdot d\mu_{[0,2\pi)}.\end{align*}Yes, the $2$ did magically disappear: $|z|=e^{\ell/2}$ means $2dA=2e^{\ell/2}d\left(e^{\ell/2}\right)d\theta=e^\ell d\ell\,d\theta,$ where $\theta=\arg(z).$ It follows our measure $\mu$ on $K_\RR^\times$ maps to\begin{align*} K_\RR^\times &\longrightarrow \RR^{r+s}\times\{\pm1\}^r\times[0,2\pi)^s, \\ d\mu &\longmapsto e^{\sum(\ell_\bullet)}d\mu_{\RR^{r+s}}\cdot d\mu_{\{\pm1\}}^r\cdot d\mu_{[0,2\pi)}^s.\end{align*}Here $\sum(\ell_\bullet)$ refers to the sum of the coordinates. Now, we're interested in computing $\mu(S_{\le1}),$ so it would be nice if the logarithm of the norm appeared as one of the coordinates. So we write\begin{align*} \RR^{r+s} &\longrightarrow \RR^{r+s-1}\times\RR, \\ (x_1,\ldots,x_{r+s}) &\longmapsto (x_1,\ldots,x_{r+s-1})\times(x_1+\cdots+x_{r+s}), \\ e^{\sum(\ell_\bullet)}d\mu_{\RR^{r+s}} &\longmapsto e^yd\mu_{\RR^{r+s-1}}\cdot dy.\end{align*}Our coordinate $y$ now tracks the logarithm of the norm.
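The change of variables on a single $\CC^\times$ factor can be sanity-checked numerically. This sketch compares the Minkowski measure $2\,dA$ of an annulus $a\le|z|\le b$ against the integral of $e^\ell\,d\ell\,d\theta$ over the corresponding log-space box; the function names and the midpoint rule are illustrative choices.

```python
import math

def measure_log_space(a: float, b: float, steps: int = 10_000) -> float:
    """Integrate e^l dl dtheta over l in [2 log a, 2 log b], theta in [0, 2 pi), by the midpoint rule."""
    lo, hi = 2 * math.log(a), 2 * math.log(b)
    h = (hi - lo) / steps
    radial = sum(math.exp(lo + (k + 0.5) * h) for k in range(steps)) * h
    return radial * 2 * math.pi  # the integrand does not depend on theta

def measure_direct(a: float, b: float) -> float:
    """Minkowski measure 2 dA of the annulus a <= |z| <= b, i.e. twice its area."""
    return 2 * math.pi * (b * b - a * a)

if __name__ == "__main__":
    for a, b in ((0.5, 1.0), (1.0, 3.0)):
        print(a, b, measure_log_space(a, b), measure_direct(a, b))
```

Both columns should agree up to the discretization error of the midpoint rule.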
Note that under this coordinate change, $\op{Log}(S)=F\oplus\RR(1,\ldots,2)$ has its first $r+s-1$ coordinates track a coordinate projection of $F,$ while $y$ tracks the component along $(1,\ldots,2)$ as the logarithm of the norm. So $\mu_{\RR^{r+s-1}}(\text{projected }F)=\op{Reg}_K$ by definition of the regulator, and $y$ ranges over $(-\infty,0]$ on $S_{\le1}.$
In total, we get to write, abusing notation a bit,\[\int_{S_{\le1}}d\mu=\int_{-\infty}^0e^y\,dy\cdot\int_Fd\mu_{\RR^{r+s-1}}\cdot\int d\mu_{\{\pm1\}}^r\cdot\int d\mu_{[0,2\pi)}^s=\op{Reg}_K\cdot2^r\cdot(2\pi)^s,\]which completes the proof of the class number formula. Putting it all together, we see\[\lim_{z\to1}(z-1)\zeta_K(z)=\rho_K=\frac{\mu(S_{\le1})h_K}{w_K\sqrt{|\op{disc}(\mathcal O_K)|}}=\frac{2^r(2\pi)^s\op{Reg}_Kh_K}{w_K\sqrt{|\op{disc}(\mathcal O_K)|}},\]which is what we wanted. We quickly review where each component comes from.
- The $\zeta_K$ stuff is a mask to compute the growth rate of ideals with respect to the norm, which is linear with $\rho_K$ as constant of proportionality.
- To compute $\rho_K,$ we split up by ideal class and find each ideal class has the same growth rate, which is where $h_K$ comes from. This lets us count principal ideals in a lattice instead of arbitrary ideals.
- To count principal ideals in a lattice, we remove roots of unity by dividing them out (which is where $w_K$ comes from) and then count lattice points in some set of coset representatives of $K_\RR/U.$
- The size of the lattice, scaled appropriately, cancels out to $\sqrt{|\op{disc}(\mathcal O_K)|}.$
- The regulator comes from the size of the coset representatives of $K_\RR/U$ when ported over to log space.
As I understand it, the $2^r$ and $(2\pi)^s$ are more or less details that just come out of the mathematics.
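As a sanity check on the final formula, here is a sketch for $K=\mathbb Q(i),$ using the standard values $r=0,$ $s=1,$ $\op{Reg}_K=1,$ $h_K=1,$ $w_K=4,$ and discriminant $-4,$ so the formula predicts $\rho_K=\pi/4.$ Since $\mathbb Z[i]$ is a principal ideal domain with four units, the number of nonzero ideals of norm at most $N$ is the number of nonzero Gaussian integers of norm at most $N$ divided by $4.$ The function names are made up for this illustration.

```python
import math

def rho(r: int, s: int, reg: float, h: int, w: int, disc: int) -> float:
    """Right-hand side of the class number formula: 2^r (2 pi)^s Reg h / (w sqrt|disc|)."""
    return (2 ** r) * (2 * math.pi) ** s * reg * h / (w * math.sqrt(abs(disc)))

def gaussian_ideal_count(N: int) -> int:
    """Number of nonzero ideals of Z[i] with norm <= N."""
    m = math.isqrt(N)
    points = 0
    for a in range(-m, m + 1):
        # For this a, the admissible b satisfy |b| <= floor(sqrt(N - a^2)).
        points += 2 * math.isqrt(N - a * a) + 1
    # Drop the origin, then identify generators that differ by one of the 4 units.
    return (points - 1) // 4

if __name__ == "__main__":
    predicted = rho(r=0, s=1, reg=1.0, h=1, w=4, disc=-4)
    for N in (10, 1_000, 100_000):
        print(N, gaussian_ideal_count(N) / N, predicted)
```

The ratio of ideals of norm at most $N$ to $N$ should approach the predicted $\rho_K$ as $N$ grows, which is the linear growth rate described in the first bullet.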