THE REINSURANCE ACTUARY

Poisson Distribution, Claims Frequency, and Independence

11/4/2021

 

I recently received an email from a reader asking the following (which, for the sake of brevity and anonymity, I’ve paraphrased quite liberally):

I’ve been reading about the Poisson Distribution recently, and I understand that it is often used to model claims frequency. I’ve also read that the Poisson Distribution assumes that events occur independently. However, isn’t this a bit of a contradiction, given that the policyholders within a given risk profile are clearly dependent on each other?

It’s a good question; our intrepid reader is definitely on to something here. Let’s talk through the issue and see if we can gain some clarity.
Let’s start with the parts that we definitely know to be true and work outwards from there. Firstly, the Poisson distribution is indeed often used to model claims frequency; it is probably the single most common distribution used for this purpose. Secondly, the distribution does assume that events occur independently (more precisely, it assumes that the probability of a given event occurring is independent of other events occurring, and that the rate at which events occur is independent of any occurrence). And thirdly, this independence property often does not bear out in reality. Many real-world situations involve series of dependent events.

Whenever this independence assumption does not hold in real life, using a Poisson distribution can introduce errors into the modelling.
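To see the kind of issue that can arise, here is a minimal simulation sketch in Python. The portfolio size, the claim probabilities, and the "bad year" mechanism are all hypothetical values chosen for illustration, not taken from any real book. It compares a portfolio where policies claim independently with one where a shared driver raises everyone's claim probability in some years. A Poisson distribution has variance equal to its mean, so a variance-to-mean ratio well above 1 suggests a Poisson would understate the volatility of annual claim counts.

```python
import random
import statistics

N_POLICIES = 500   # hypothetical portfolio size
N_YEARS = 2000     # number of simulated years


def simulate_counts(shared_driver, rng):
    """Return a list of annual claim counts for the whole portfolio."""
    counts = []
    for _ in range(N_YEARS):
        if shared_driver:
            # Hypothetical common driver: 1 year in 10 is a "bad" year in
            # which every policy is more likely to claim.
            p_claim = 0.055 if rng.random() < 0.1 else 0.005
        else:
            # Same long-run average claim probability (1%), no common driver.
            p_claim = 0.01
        # Given p_claim, each policy claims independently of the others.
        counts.append(sum(rng.random() < p_claim for _ in range(N_POLICIES)))
    return counts


def variance_to_mean(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson-like data."""
    return statistics.variance(counts) / statistics.mean(counts)


rng = random.Random(2021)  # fixed seed so the run is repeatable
independent_ratio = variance_to_mean(simulate_counts(False, rng))
clustered_ratio = variance_to_mean(simulate_counts(True, rng))

print(f"independent policies: variance/mean = {independent_ratio:.2f}")
print(f"shared driver:        variance/mean = {clustered_ratio:.2f}")
```

Both portfolios have the same expected number of claims per year, so a Poisson fitted to the mean alone would treat them identically, even though the second is far more volatile.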

So why still use it?

You might ask: why do people still use the Poisson distribution?

There are a few reasons, but I think the biggest two are:

·       It’s often simpler to use the Poisson than the alternatives. The Negative Binomial, for example, does not require this type of strict independence. However, unlike the Poisson, it has two parameters and is a little bit fiddly to use in Excel (the Poisson has only one parameter and is fairly straightforward to use in Excel).

·       In certain situations it may not affect the answer much anyway. If, for example, you are only offering a single limit of cover, one loss might completely erode your limit, so you are not concerned about the possibility of multiple linked losses occurring.
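The second point can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that any loss is a full-limit loss and that the cover has no reinstatements, so the expected number of limits paid is just the probability of at least one loss. The mean frequency of 0.05 and the Negative Binomial shape parameter `r` are hypothetical choices; the Negative Binomial is parameterised here so that its variance is `mean + mean**2 / r`.

```python
import math


def poisson_prob_zero(mean):
    """P(N = 0) for a Poisson with the given mean."""
    return math.exp(-mean)


def negbin_prob_zero(mean, r):
    """P(N = 0) for a Negative Binomial with the given mean and shape r.

    With this parameterisation the variance is mean + mean**2 / r,
    so smaller r means more clustering of losses.
    """
    return (r / (r + mean)) ** r


mean_frequency = 0.05  # hypothetical expected number of losses per year
r = 1.0                # hypothetical shape parameter (heavy clustering)

# With a single limit and full-limit losses, only the first loss matters,
# so the expected number of limits paid is P(N >= 1) = 1 - P(N = 0).
poisson_cost = 1.0 - poisson_prob_zero(mean_frequency)
negbin_cost = 1.0 - negbin_prob_zero(mean_frequency, r)
relative_gap = abs(poisson_cost - negbin_cost) / poisson_cost

print(f"Poisson:           {poisson_cost:.4f}")
print(f"Negative Binomial: {negbin_cost:.4f}")
print(f"relative gap:      {relative_gap:.1%}")
```

For a low frequency like this, the two answers are within a few percent of each other, which is the sense in which the choice of distribution may not matter much for a single limit. The gap would be expected to widen with a higher frequency, or with cover that responds to several losses.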

When should I not use a Poisson Distribution?

There are certain situations where you would want to be very careful about using a Poisson for claims frequency. One such example is modelling US windstorms. It is widely believed that these events exhibit some form of clustering within a given year. This is borne out empirically, and it also follows from some simple reasoning about the process which generates the windstorms.

Empirically, we can see the clustering by noting a few examples of years with multiple extreme windstorms: 2017 had Harvey, Irma, and Maria; 2005 had Katrina, Rita, and Wilma. These types of years should be very uncommon if a Poisson Distribution were a good fit for the underlying process.

The argument from climate modelling goes as follows: US windstorms occurring within a given season will all be generated by the same (or at least similar) climatic conditions occurring in that year (El Niño, etc.). Therefore, if conditions are conducive to extreme windstorms, you're quite likely to end up with a few of them in that year.

So for US windstorms, we should probably not use a Poisson Distribution.
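To put a rough number on how uncommon a multi-event year is under each model, the sketch below compares the probability of three or more events in a year under a Poisson and under a Negative Binomial with the same mean. The mean of 0.6 events per year and the shape parameter are hypothetical values picked for illustration; they are not fitted to any windstorm data.

```python
import math


def poisson_pmf(k, mean):
    """P(N = k) for a Poisson with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def negbin_pmf(k, mean, r):
    """P(N = k) for a Negative Binomial with the given mean and shape r.

    Variance is mean + mean**2 / r. lgamma is used so that r need not be
    an integer.
    """
    p = r / (r + mean)
    log_coeff = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coeff + r * math.log(p) + k * math.log(1.0 - p))


def prob_at_least(n, pmf):
    """P(N >= n), computed as 1 minus the probabilities of 0 .. n-1."""
    return 1.0 - sum(pmf(k) for k in range(n))


mean_events = 0.6  # hypothetical average number of extreme events per year
shape = 1.0        # hypothetical shape parameter; smaller means more clustering

poisson_tail = prob_at_least(3, lambda k: poisson_pmf(k, mean_events))
negbin_tail = prob_at_least(3, lambda k: negbin_pmf(k, mean_events, shape))

print(f"P(3+ events), Poisson:           {poisson_tail:.4f}")
print(f"P(3+ events), Negative Binomial: {negbin_tail:.4f}")
```

With these assumed parameters, the clustered model makes a three-event year more than twice as likely as the Poisson does, even though both have the same average frequency.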


Dependent estimation

We do have to be a little bit careful, when throwing around the term ‘independent’, that we are referring to the exact type of independence that the Poisson Distribution requires. Certain dependency structures exist in reality that would not disqualify the Poisson Distribution from our modelling. In particular, a conceptual confusion that can arise is to overlook that dependency in the estimation of claims frequency for a given block of business is perfectly okay, and does not contradict the assumption underlying the Poisson Distribution.

Let’s give an example to make it more concrete. Suppose I am modelling the frequency of breakdowns for a subset of a motor book (perhaps all policyholders who own a Toyota Prius in a given postcode). Our estimate of the expected number of breakdowns for a given policyholder within that group would then depend on how many breakdowns had occurred in the wider group (assuming we are basing our estimate on the wider dataset). So our estimate of the number of breakdowns for this particular policyholder is clearly not independent of the number of breakdowns of the other policyholders.

In fact, it's not just that our estimate is 'weakly correlated' with the experience of the other policyholders; it is directly derived from it. But that does not mean we can't use a Poisson Distribution! This particular dependency structure does not contradict any of the assumptions of the Poisson Distribution. So the point to bear in mind is that, in order to use the Poisson Distribution, we don’t require that every single thing be independent of every other possible thing; we require only the specific type of independence given in the opening section above.
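As a small illustration of this distinction, the sketch below estimates a single breakdown rate from pooled group data (total breakdowns divided by total exposure) and then applies it to one policyholder. The records and names are made up for the example. Changing another policyholder's history changes the estimate used for policyholder A, which is dependence in the estimation, while the Poisson model itself still treats breakdowns as occurring independently given the rate.

```python
import math

# Hypothetical history for the group: (policyholder, exposure in years, breakdowns)
history = [("A", 2.0, 1), ("B", 1.5, 0), ("C", 3.0, 2), ("D", 1.0, 0), ("E", 2.5, 1)]


def pooled_rate(records):
    """Estimated breakdowns per policy-year: total breakdowns / total exposure."""
    total_breakdowns = sum(n for _, _, n in records)
    total_exposure = sum(years for _, years, _ in records)
    return total_breakdowns / total_exposure


def poisson_pmf(k, mean):
    """P(N = k) for a Poisson with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


rate = pooled_rate(history)

# Same group, but policyholder C now has 5 breakdowns instead of 2.
altered = [(name, years, 5 if name == "C" else n) for name, years, n in history]
altered_rate = pooled_rate(altered)

# Probability that policyholder A has no breakdowns next year under each estimate.
p_none = poisson_pmf(0, rate)
p_none_altered = poisson_pmf(0, altered_rate)

print(f"pooled rate: {rate:.2f}, after altering C's history: {altered_rate:.2f}")
print(f"P(A has no breakdowns): {p_none:.3f} vs {p_none_altered:.3f}")
```

Policyholder A's own record is identical in both cases; only the data used to estimate the shared rate has changed.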

Conclusion

I hope that helps clarify a little why the Poisson Distribution is used in actuarial modelling, and some of the limitations we should be aware of. If you’ve got any further questions, then please feel free to drop me an email using the address on the right.


    Author

    ​​I work as an actuary and underwriter at a global reinsurer in London.

    I mainly write about Maths, Finance, and Technology.
    ​
    If you would like to get in touch, then feel free to send me an email at:

    ​LewisWalshActuary@gmail.com
