THE REINSURANCE ACTUARY

Poisson Distribution and Claims Frequency

17/4/2016

 

Why do we use the Poisson Distribution as the default distribution for modelling Claims Frequency for an insurance portfolio?

Why do we even have a default?

When we are setting up a frequency-severity model to model claims from an insurance portfolio, we will normally approach the fitting of the frequency distribution and the fitting of the severity distribution quite differently. For the frequency distribution the standard approach is to attempt to fit a Poisson distribution, and only look at other distributions if the Poisson is not a good fit (even then we normally limit our search to Negative Binomial, and maybe Binomial at a stretch).

When we fit a severity model however, we will often fit quite a large range of different continuous probability distributions to the empirical Claim Severity CDF using some sort of curve fitting software and then select the most appropriate curve. Some distributions are used more often than others, for example, LogNormal, Pareto and Weibull are all common curves to use, but there is no single curve that we would assume the severity distribution conforms to by default.

So why is a Poisson distribution a natural distribution to use to model claims frequency? And why is there no 'natural' distribution for claim severity?

In this post I thought I would write up an interesting result that shows that a counting process which has a number of basic properties will follow a Poisson distribution. We will then be able to see that these are reasonable assumptions to make about Claims Frequency for an insurance portfolio.
Poisson Distribution

Before working through this result, here is a list of properties of the Poisson Distribution which make it easy to work with:
  • The expected number of events for a Poisson Distribution with parameter $\lambda$ is $\lambda$. That is if:
$$ N \sim Poi( \lambda )$$

Then
$$\mathbb{E} [N] = \lambda$$
and furthermore,
$$ Var [ N ] = \lambda$$

  • If we have two independent Poisson random variables, then their sum also follows a Poisson distribution. This fits our intuition of how claims frequency works in practice. If we have two independent sources of claims, one modelled with a Poisson distribution with parameter $\lambda_1$ and another with parameter $\lambda_2$, then the total number of claims will be Poisson with parameter $\lambda_1 + \lambda_2$, i.e. the expected number of claims from both sources is just the sum of the expected numbers of claims from each individual source.
  • If we wish to fit a Poisson distribution to a collection of claims data $k_1, \dots, k_n$ then the maximum likelihood estimate and the method of moments estimate are the same. In both cases, $\tilde{\lambda} = \frac{1}{n} \sum_{i=1}^{n} k_i$.
  • The Poisson distribution only has one parameter, which reduces the complexity of the model.
  • The waiting time (the time between events) for a process whose event counts follow a Poisson distribution has an exponential distribution. This is another well-known distribution which we can work with easily.
  • The waiting time before k events occur has a gamma distribution, another fairly well known distribution. 
  • The equivalent waiting time distributions for a negative binomial are much more complicated.
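Several of the properties above can be checked with a small simulation. The following is only a minimal sketch, not part of the original argument: it builds Poisson counts from exponential waiting times using Python's standard library, and the helper names (`poisson_count`, `mean_and_variance`), the seed, the sample size and the values of $\lambda_1$ and $\lambda_2$ are all illustrative assumptions.

```python
import random


def poisson_count(lam, rng):
    """Count events in [0, 1] when the gaps between events are Exp(lam)."""
    count = 0
    elapsed = rng.expovariate(lam)  # waiting time until the first event
    while elapsed <= 1.0:
        count += 1
        elapsed += rng.expovariate(lam)  # waiting time until the next event
    return count


def mean_and_variance(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


rng = random.Random(2016)  # fixed seed so the run is repeatable
n = 20_000                 # number of simulated years (illustrative)
lam1, lam2 = 3.0, 1.5      # illustrative claim rates for two sources

source1 = [poisson_count(lam1, rng) for _ in range(n)]
source2 = [poisson_count(lam2, rng) for _ in range(n)]
combined = [a + b for a, b in zip(source1, source2)]

# The sample mean is both the MLE and the method of moments estimate of lambda.
lam_hat, var_hat = mean_and_variance(source1)
combined_mean, combined_var = mean_and_variance(combined)

print(f"source 1: mean {lam_hat:.3f}, variance {var_hat:.3f} (lambda = {lam1})")
print(f"combined: mean {combined_mean:.3f}, variance {combined_var:.3f} "
      f"(lambda_1 + lambda_2 = {lam1 + lam2})")
```

If the properties hold, the mean and variance of each sample should both sit close to the relevant parameter, and the combined sample should behave like a Poisson with parameter $\lambda_1 + \lambda_2$.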

The Result
Let $A(t)$, for $t > 0$, denote the number of claims in the interval $[0,t]$, with $A(0) = 0$.

Suppose:
  1. The numbers of claims in disjoint time periods are independent
  2. For a small time period $\delta t$ the probability of a single event is given by $ P( A(t + \delta t) - A(t) = 1 ) = \lambda \delta t + o( \delta t)$
  3. The probability that two or more claims occur in the same small time period is negligible. That is:

$$P( A(t + \delta t) - A(t) \geq 2 ) = o( \delta t)$$

Then $A(t)$ follows a Poisson distribution with parameter $\lambda t$.
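To get a feel for what the three assumptions say, here is a rough simulation sketch (an illustration only, not part of the proof). It splits $[0, t]$ into many short intervals, lets each interval independently contain one claim with probability $\lambda \delta t$ and never more than one, and compares the resulting claim counts with the Poisson probabilities for parameter $\lambda t$. The function names, the seed, and the values of `lam`, `t`, `steps` and `paths` are all assumptions chosen for the example.

```python
import math
import random


def simulate_claim_count(lam, t, steps, rng):
    """Number of claims in [0, t] under a discretised version of the assumptions.

    Each of the `steps` short intervals independently holds exactly one claim
    with probability lam * dt, and never holds two or more.
    """
    dt = t / steps
    p = lam * dt
    return sum(1 for _ in range(steps) if rng.random() < p)


def poisson_pmf(n, mean):
    """P(N = n) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


rng = random.Random(42)  # fixed seed so the run is repeatable
lam, t, steps, paths = 2.0, 1.0, 500, 4000  # illustrative values

counts = [simulate_claim_count(lam, t, steps, rng) for _ in range(paths)]

max_gap = 0.0
for n in range(8):
    empirical = counts.count(n) / paths
    theoretical = poisson_pmf(n, lam * t)
    max_gap = max(max_gap, abs(empirical - theoretical))
    print(f"n = {n}: simulated {empirical:.4f}, Poisson {theoretical:.4f}")
```

The simulated frequencies should track the Poisson probabilities closely, with the remaining gap coming from sampling noise and from using a finite number of intervals.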

The Proof

Define $P_n (t) = P( A(t) = n ) $

We then examine the change in $P_n(t)$ over a time period $\delta t$ and then take the limit as $\delta t$ approaches $0$.

for $n > 0$:

$$P_n(t + \delta t) = P_n(t)(1 - \lambda \delta t) + P_{n-1} (t) \lambda \delta t + o( \delta t) $$

for $n = 0$:

$$P_0(t + \delta t) = P_0(t)(1 - \lambda \delta t) + o( \delta t)$$

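This follows from the fact that, up to terms of order $o(\delta t)$, there are two distinct ways for $n$ claims to happen by time $t + \delta t$. Either we get $n$ claims in time $t$ and no claims in $\delta t$, or $n-1$ claims in time $t$ and one claim in $\delta t$. By assumption 1 the counts in the two periods are independent, so the probabilities multiply, and by assumption 3 the cases with two or more claims in $\delta t$ are absorbed into the $o(\delta t)$ term.
We can rewrite these equations as: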

for $n > 0$:

$$\frac {P_n(t + \delta t) - P_n(t)}{\delta t} = \frac {-P_n(t) \lambda \delta t + P_{n-1}(t) \lambda \delta t + o(\delta t)} { \delta t}$$

for $n = 0$:

$$\frac {P_0 ( t + \delta t) - P_0(t)} {\delta t} = \frac {-P_0(t) \lambda \delta t + o(\delta t) } {\delta t}$$

Now take the limit as $\delta t \to 0$, using the fact that $o(\delta t) / \delta t \to 0$, which gives:

for $n>0$:

$$\frac { d P_n (t)} {dt} = - \lambda P_n(t) + \lambda P_{n-1}(t)$$

for $n=0$:

$$\frac {d P_0 (t)} {dt} = - \lambda P_0 (t)$$

Since $A(0) = 0$ we have the initial condition $P_0(0) = 1$, so from inspection we can see that $P_0 (t) = e^{- \lambda t}$

The general case follows by induction on $n$, using the initial conditions $P_n(0) = 0$ for $n > 0$.
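As a sketch of how the inductive step might go (using an integrating factor, which is one standard route and not necessarily the one originally intended), take the base case $P_0(t) = e^{-\lambda t}$, the solution of the $n = 0$ equation with $P_0(0) = 1$, and suppose that for some $n > 0$:

$$P_{n-1}(t) = e^{-\lambda t} \frac{(\lambda t)^{n-1}}{(n-1)!}$$

Multiplying the equation for $P_n$ by $e^{\lambda t}$ and rearranging gives:

$$\frac{d}{dt} \left( e^{\lambda t} P_n(t) \right) = \lambda e^{\lambda t} P_{n-1}(t) = \frac{\lambda^n t^{n-1}}{(n-1)!}$$

Integrating from $0$ to $t$ and using $P_n(0) = 0$ for $n > 0$:

$$e^{\lambda t} P_n(t) = \frac{(\lambda t)^n}{n!}$$

so that

$$P_n(t) = e^{-\lambda t} \frac{(\lambda t)^n}{n!}$$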

We therefore see that $P_n(t) = e^{-\lambda t} \frac{(\lambda t)^n}{n!}$, that is, $A(t)$ has a Poisson distribution with parameter $\lambda t$.
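As a numerical sanity check on the differential equations (a rough sketch only), we can step the system $\frac{d P_0}{dt} = -\lambda P_0$, $\frac{d P_n}{dt} = -\lambda P_n + \lambda P_{n-1}$ forward from $P_0(0) = 1$, $P_n(0) = 0$ with a simple Euler scheme and compare the result with the Poisson probabilities for parameter $\lambda t$. The function name, the choice of Euler's method, the truncation at `n_max` and the parameter values are all assumptions made for the illustration.

```python
import math


def integrate_forward_equations(lam, t_end, n_max, steps):
    """Euler-integrate dP_n/dt = -lam * P_n + lam * P_{n-1} for n = 0..n_max."""
    dt = t_end / steps
    p = [1.0] + [0.0] * n_max  # initial conditions: P_0(0) = 1, P_n(0) = 0
    for _ in range(steps):
        new_p = [p[0] - lam * p[0] * dt]  # the n = 0 equation has no inflow term
        for n in range(1, n_max + 1):
            new_p.append(p[n] + (-lam * p[n] + lam * p[n - 1]) * dt)
        p = new_p
    return p


lam, t_end, n_max = 2.0, 1.5, 10  # illustrative values
numeric = integrate_forward_equations(lam, t_end, n_max, steps=20_000)
exact = [
    math.exp(-lam * t_end) * (lam * t_end) ** n / math.factorial(n)
    for n in range(n_max + 1)
]
max_error = max(abs(a - b) for a, b in zip(numeric, exact))

for n in range(5):
    print(f"P_{n}({t_end}): numeric {numeric[n]:.5f}, Poisson {exact[n]:.5f}")
```

Because each equation only depends on $P_n$ and $P_{n-1}$, truncating the system at `n_max` does not affect the values computed for smaller $n$.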
Poisson as the natural distribution

So we see that, insofar as insurance claims occur in line with the assumptions (independently across disjoint time periods, at a constant rate, and only one at a time), we can expect the claims frequency to have a Poisson Distribution. In addition, the Poisson Distribution has a number of properties which make it easy to work with: it has a single parameter, simple formulas for the mean and variance, and so on. Therefore, whenever we are fitting a claim frequency model, we will almost always try the Poisson Distribution first.



    Author

    I work as an actuary and underwriter at a global reinsurer in London.

    I mainly write about Maths, Finance, and Technology.

    If you would like to get in touch, then feel free to send me an email at:

    LewisWalshActuary@gmail.com
