I’m reading ‘Information Theory, inference and learning algorithms' by David MacKay at the moment and I'm really enjoying it so far. One cool trick that he introduces early in the book is a method of deriving Stirling’s approximation through the use of the Gaussian approximation to the Poisson Distribution.
Derivation Suppose we have a Poisson Distribution with parameter $\lambda$, by definition: $$P(X=k)=e^{ \lambda} \frac{\lambda^k}{k!}$$ If we replace $\lambda$ with $r$, and consider the probability that $X=r$, we get: $$P(X=r)=e^{ r} \frac{r^r }{r!}$$ Now suppose we are restricting ourselves to large value of $r$, in which case, a Poisson distribution is well approximated by a Gaussian distribution with mean and variance both equal to $r$. Setting up this approximation. $$e^{ r} \frac{r^r }{r!} \approx \frac{1}{\sqrt{2 \pi r}} e^{ \frac{(rr)^2}{2}}$$ From which the right hand side simplifies further to give: $$ e^{ r} \frac{r^r }{r!} \approx \frac{1}{\sqrt{2 \pi r}} e^{0}$$ Giving: $$ e^{ r} \frac{r^r }{r!} \approx \frac{1}{\sqrt{2 \pi r}}$$ Which when we rearrange to obtain Stirling’s approximation: $$r! \approx e^{r} r^r \sqrt{2 \pi r}$$ Intuition We probably shouldn’t be surprised that we’ve found a link between $e^r$ and $\pi$. The fact that the two are linked can be easily drawn out using Euler’s formula: $$e^{inx}=\cos(nx)+i\sin(nx)$$ And examining the value $\pi$: $$e^{i\pi }+1=0$$ So there is a clearly a link between $e^x$, and $\pi$, but its not obvious we can draw in the factorial as well. Above we teased out a link between all of the following: $n!$, $n^n$, $e^r$, and $\pi$, which is interesting for its own sake, but moreover provides intuition as to why the Gaussian approximation to the Poisson distribution works. It should probably be noted that we’ve implicitly invoked the Central Limit Theorem to establish the approximation, and the CTL is some pretty heavy machinery! The proof from first principles of the CTL is much more involved that the proof of Stirling's approximation, so the derivation above should be thought of as strictly a process of drawing out interesting parallels, rather than a path for proving the result from first principles. Use in Actuarial Modelling I tend to use the Gaussian approximation quite a lot at work – any time I’m modelling a claim count frequency in a Spreadsheet and I’ve got a reasonable number of annual claims, I’m a proponent of just using a discretised Gaussian, with a min applied at 0, and with a variance and mean set as required. This has a couple of advantages to using either a Poisson or Negative Binomial:
Coming up with interesting problems David Mackay seemed to have an eye for interesting problems – reading up on Wikipedia about him he competed in a Maths Olympiad while a student. I do wonder if there is a correlation between wellwritten, entertaining textbooks, and authors who have a background in competitive maths problems. The link between Stirling’s approximation, Gaussian, and Poisson is just the sort of thing that could make an interesting problem in a competitive maths competition. I also realised after writoing this post that I’d already written about something pretty similar before, where we can use Stirling’s approximation to easily estimate the probability that a Poisson value is equal to it’s mean. Here’s the link: www.lewiswalsh.net/blog/poissondistributionwhatistheprobabilitythedistributionisequaltothemean
0 Comments
Leave a Reply. 
AuthorI work as a pricing actuary at a reinsurer in London. Categories
All
Archives
March 2020
