When we set up a frequency-severity model for claims from an insurance portfolio, we normally fit a range of continuous probability distributions to the empirical claim severity CDF using curve-fitting software, and then select the most appropriate curve. Some distributions are used more often than others (LogNormal, Pareto, and Weibull are all common choices), but there is no single curve that we would assume the severity distribution conforms to by default.

But when we select a distribution to model claim frequency, we usually start with a Poisson distribution and only try other distributions if the Poisson is not a good fit. So why is the Poisson distribution a natural choice for claim frequency? And why is there no 'natural' distribution for claim severity?

In this post I thought I would write up an interesting result which shows that a counting process with a small number of basic properties must follow a Poisson distribution. We will then be able to see that these properties are reasonable assumptions to make about claim frequency for an insurance portfolio.

**Poisson Distribution**

Before working through this result, here is a list of additional reasons why we might select a Poisson distribution:

- The expected number of events for a Poisson distribution with parameter $\lambda$ is $\lambda$. That is, if

$$N \sim \text{Poisson}(\lambda)$$

then

$$\mathbb{E} [N] = \lambda$$

and furthermore,

$$\text{Var} [ N ] = \lambda$$

- If we have two independent Poisson random variables, then their sum also follows a Poisson distribution. This fits our intuition of how claim frequency works in practice. If we have two independent sources of claims, one modelled with a Poisson distribution with parameter $\lambda_1$ and the other with parameter $\lambda_2$, then the total number of claims is Poisson with parameter $\lambda_1 + \lambda_2$, i.e. the expected number of claims from both sources is just the sum of the expected numbers of claims from each individual source.
- If we wish to fit a Poisson distribution to a collection of claims data, then the maximum likelihood estimate and the method of moments estimate are the same. In both cases, $\tilde{\lambda} = \frac{1}{n} \sum_{i=1}^{n} k_i$, where $k_1, \dots, k_n$ are the observed claim counts.
- The Poisson distribution only has one parameter, which reduces the complexity of the model.
- The waiting time (the time between events) for a process whose counts follow a Poisson distribution has an exponential distribution. This is another well-known distribution which we can work with easily.
- The waiting time before k events occur has a gamma distribution.
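
The properties listed above can be checked with a quick simulation. The following is a minimal sketch using NumPy; the rates `lam1` and `lam2`, the sample size and the choice of `k = 4` are illustrative assumptions, not values from a real portfolio.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed so the run is repeatable

lam1, lam2 = 3.0, 1.5   # hypothetical claim rates for two independent sources
n_periods = 200_000     # number of simulated periods (illustrative)

claims_1 = rng.poisson(lam1, size=n_periods)
claims_2 = rng.poisson(lam2, size=n_periods)
total = claims_1 + claims_2

# Mean and variance of a Poisson sample should both be close to lambda.
print("source 1 mean, variance:", claims_1.mean(), claims_1.var())

# The sum of independent Poisson counts should behave like Poisson(lam1 + lam2).
print("total mean, variance:", total.mean(), total.var())

# The maximum likelihood and method of moments estimates are both the sample mean.
lam_hat = claims_1.mean()
print("fitted lambda for source 1:", lam_hat)

# Waiting times: gaps between events are exponential with mean 1 / lambda,
# so the time until the k-th event is a sum of k such gaps (a gamma variable).
k = 4
gaps = rng.exponential(scale=1.0 / lam1, size=(n_periods, k))
wait_k = gaps.sum(axis=1)
print("mean wait for k events:", wait_k.mean(), "theory:", k / lam1)
```

With a sample this large, the printed means and variances should sit close to their theoretical values, although the exact figures depend on the seed.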

**The Result**

Suppose $A(t)$ denotes the number of claims that have occurred by time $t$, with $A(0) = 0$, and that:

- The numbers of claims in disjoint time periods are independent.
- For a small time period $\delta t$, the probability of a single claim is given by $ \mathbb{P} ( A(t + \delta t) - A(t) = 1 ) = \lambda \delta t + o( \delta t)$.
- The probability that two or more claims occur in the same small time period is negligible. That is:

$$\mathbb{P} ( A(t + \delta t) - A(t) \geq 2 ) = o( \delta t)$$

Then $A(t)$ follows a Poisson distribution with parameter $\lambda t$.
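
To get a feel for these assumptions, here is a rough simulation sketch. It chops the interval $[0, t]$ into many small steps of length $\delta t$, lets at most one claim occur in each step with probability $\lambda \delta t$ (ignoring the $o(\delta t)$ terms), and compares the resulting claim counts with the Poisson probabilities. The values of `lam`, `t_end`, `n_steps` and `n_paths` are arbitrary choices for illustration.

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

lam = 2.0           # hypothetical claim rate per unit time
t_end = 1.0         # length of the period we count claims over
n_steps = 1_000     # number of small intervals
n_paths = 100_000   # number of simulated periods
dt = t_end / n_steps

# Each small interval independently contains one claim with probability lam * dt
# and none otherwise. The total count over n_steps independent intervals is
# therefore a sum of Bernoulli trials, which we draw directly as a binomial.
counts = rng.binomial(n_steps, lam * dt, size=n_paths)

empirical = [float(np.mean(counts == n)) for n in range(6)]
poisson_pmf = [exp(-lam * t_end) * (lam * t_end) ** n / factorial(n) for n in range(6)]

for n in range(6):
    print(n, round(empirical[n], 4), round(poisson_pmf[n], 4))
```

The two columns should agree closely, and the agreement improves as `n_steps` grows, which is the limit $\delta t \to 0$ taken in the proof.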

**The Proof**

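Define $P_n (t) = \mathbb{P} ( A(t) = n )$. To have $n$ claims at time $t + \delta t$, either there were $n$ claims at time $t$ and none in the following $\delta t$, or there were $n - 1$ claims at time $t$ and exactly one in the following $\delta t$. Every other route involves two or more claims in $\delta t$ and so has probability $o(\delta t)$. Using the independence of disjoint time periods, this gives:

for $n > 0$:

$$P_n(t + \delta t) = P_n(t)(1 - \lambda \delta t) + P_{n-1} (t) \lambda \delta t + o( \delta t) $$

for $n = 0$:

$$P_0(t + \delta t) = P_0(t)(1 - \lambda \delta t) + o( \delta t)$$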

We can rewrite these equations as:

for $n > 0$:

$$\frac { P_n(t + \delta t) - P_n(t)}{\delta t} = \frac {-P_n(t) \lambda \delta t + P_{n-1}(t) \lambda \delta t + o(\delta t)} { \delta t}$$

for $n = 0$:

$$\frac {P_0 ( t + \delta t) - P_0(t)} {\delta t} = \frac {-P_0(t) \lambda \delta t + o(\delta t) } {\delta t}$$

Now take the limit as $\delta t \to 0$. Since $o(\delta t) / \delta t \to 0$, this gives:

for $n>0$:

$$\frac { d P_n (t)} {dt} = - \lambda P_n(t) + \lambda P_{n-1}(t)$$

for $n=0$:

$$\frac {d P_0 (t)} {dt} = - \lambda P_0 (t)$$

By inspection, and assuming no claims have occurred at time $0$ so that $P_0(0) = 1$, we can see that $P_0 (t) = e^{- \lambda t}$.

The general case, $P_n(t) = \frac{(\lambda t)^n e^{-\lambda t}}{n!}$, can then be shown by induction on $n$.
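
For completeness, here is a sketch of how the induction step can go, assuming the initial conditions $P_0(0) = 1$ and $P_n(0) = 0$ for $n > 0$ (i.e. no claims at time $0$), with the base case $P_0(t) = e^{-\lambda t}$. Multiplying the equation for $n > 0$ by the integrating factor $e^{\lambda t}$ gives:

$$\frac{d}{dt} \left( e^{\lambda t} P_n(t) \right) = \lambda e^{\lambda t} P_{n-1}(t)$$

If we assume $P_{n-1}(t) = \frac{(\lambda t)^{n-1} e^{-\lambda t}}{(n-1)!}$, the right-hand side becomes $\frac{\lambda^n t^{n-1}}{(n-1)!}$. Integrating from $0$ to $t$ and using $P_n(0) = 0$:

$$e^{\lambda t} P_n(t) = \frac{(\lambda t)^n}{n!} \quad \Rightarrow \quad P_n(t) = \frac{(\lambda t)^n e^{-\lambda t}}{n!}$$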

We therefore see that $A(t)$ has a Poisson distribution with parameter $\lambda t$.
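
As a sanity check on the derivation, the differential equations above can also be solved numerically and compared with the Poisson probabilities. The sketch below uses a simple forward Euler scheme, which is my own choice for illustration. The values of `lam`, `t_end`, `n_max` and `n_steps` are assumptions, as is the starting condition of no claims at time $0$.

```python
import numpy as np
from math import exp, factorial

lam = 2.0          # hypothetical claim rate per unit time
t_end = 1.0        # time at which we compare against the Poisson pmf
n_max = 10         # track P_0, ..., P_n_max (each equation only needs lower n)
n_steps = 20_000   # number of Euler steps
dt = t_end / n_steps

# Initial condition: no claims at time 0, so P_0(0) = 1 and P_n(0) = 0 for n > 0.
P = np.zeros(n_max + 1)
P[0] = 1.0

for _ in range(n_steps):
    dP = -lam * P               # the -lambda * P_n(t) term, for every n
    dP[1:] += lam * P[:-1]      # the +lambda * P_{n-1}(t) term, for n > 0
    P = P + dt * dP             # forward Euler update

exact = np.array([exp(-lam * t_end) * (lam * t_end) ** n / factorial(n)
                  for n in range(n_max + 1)])
max_err = float(np.max(np.abs(P - exact)))
print("largest absolute difference from the Poisson pmf:", max_err)
```

The largest difference should be small and should shrink further as `n_steps` is increased.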