
Backtesting inflation modelling - median of top x losses

29/7/2022

I wrote a quick script to backtest one particular method of deriving claims inflation from loss data. I first came across the method in 'Pricing in General Insurance' by Pietro Parodi [1], but I'm not sure whether the method pre-dates the book.

In order to run the method all we require is a large loss bordereau, which is useful from a data perspective. Unlike many methods which focus on fitting a curve through attritional loss ratios, or looking at ultimate attritional losses per unit of exposure over time, this method can easily produce a *large loss* inflation pick - which is important, as large loss and attritional inflation can often be materially different.
[Image: Willis Building and Lloyd's building. Source: @Colin, https://commons.wikimedia.org/wiki/User:Colin]
The code works by simulating 10 years of individual losses from a Poisson-lognormal model and then applying 5% inflation per annum. We then throw away all losses below the large loss threshold, to put ourselves in the position of only having been supplied with a large loss claims listing. Finally we analyse the change over time of the 'median of the top 10 claims'. We select this slightly funky-looking statistic because it should increase over time in the presence of inflation, but by looking at the median rather than the mean we take out some of the spikiness. Since we hardcoded 5% inflation into the simulated data, we are looking to arrive back at this value when we apply the method to the synthetic data.
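To spell out what the regression in the code below is doing (the notation here is mine rather than the book's): if M_t is the median of the top 10 claims in year t and i is the annual inflation rate, then ignoring sampling noise M_t is roughly M_0 * (1 + i)^t, so

    ln(M_t) = ln(M_0) + t * ln(1 + i)

and a linear regression of ln(M_t) against t gives a slope b, from which the inflation estimate is exp(b) - 1. This corresponds to the 'slope' and 'Exp-1' calculations in the final code cell.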
I've pasted the code below, but jumping to the conclusions, here are a few take-aways:
  • The method does work - note the final answer is 5.04%, and with a few more sims this does appear to approach the original pick of 5%, which is good: the method provides an unbiased estimate.
  • The standard deviation is really high - it's of the order of 50% of the mean (2.6% against 5.0%). Assuming a normal distribution, we'd expect 95% of estimates to sit roughly between 0% and 10%, which is a huge range. In practice even 1% of additional inflation in our modelling can cause a big swing in loss cost, so the method as currently presented, on this particular set-up of simulated loss data, is basically useless.
  • The method is thrown off by changes in the FGU claim count - I haven't shown it below, but if you amend the 'ExposureGrowth' value in the code from 0%, the method no longer provides an unbiased estimate. If the data includes growth, it tends to over-estimate inflation, and vice-versa if the FGU claim count reduces over time. Parodi does mention this in the book and offers a work-around, which I haven't included below but will write up another time.
  • It's non-parametric - I do like the fact that it's a non-parametric method. The other large loss inflation methods I'm aware of all involve assuming some underlying probability distribution for the data (exponential, Pareto, etc.).
  • We can probably improve the method - the method effectively ignores all the data other than a single claim per year (the one the code uses as its proxy for the median of the top 10), so we reduce the entire analysis to the rate of change of just 10 numbers. One obvious extension would be to average across the change in multiple percentiles of the distribution; we could also explore other robust statistics (e.g. Parodi mentions trimmed means?). A rough sketch of the multi-percentile idea is included after the main code below, and I'll set it up properly another time to see if we get an improvement in performance.
In [1]:
# numpy / pandas for array handling; scipy.stats provides the frequency and
# severity distributions plus the regression used to extract the trend
import numpy as np
import pandas as pd
from math import exp, log, sqrt
from scipy.stats import lognorm, poisson, linregress
In [2]:
# Severity assumptions: lognormal with mean 1m and standard deviation 1.5m
Distmean = 1000000.0
DistStdDev = Distmean*1.5

# Frequency and exposure assumptions
AverageFreq = 100
years = 10
ExposureGrowth = 0.0

# Moment-match the target mean/std dev to the lognormal mu and sigma parameters
Mu = log(Distmean/(sqrt(1+DistStdDev**2/Distmean**2)))
Sigma = sqrt(log(1+DistStdDev**2/Distmean**2))

# Large loss reporting threshold and the inflation rate we are trying to recover
LLThreshold = 1e6
Inflation = 0.05

# scipy.stats.lognorm parameterisation: s = sigma, scale = exp(mu)
s = Sigma
scale = exp(Mu)
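For reference, the Mu and Sigma lines above are just the standard lognormal moment-matching formulas: if m and v are the target mean and standard deviation, then

    Sigma^2 = ln(1 + v^2/m^2)
    Mu = ln(m) - Sigma^2/2 = ln( m / sqrt(1 + v^2/m^2) )

so the simulated severities have (up to sampling error) the intended mean of 1m and standard deviation of 1.5m.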
In [3]:
MedianTop10Method = []
AllLnOutput = []

for sim in range(5000):

    SimOutputFGU = []
    SimOutputLL = []
    Frequency = []

    for year in range(years):
        # Simulate the year's FGU claim count, allowing for exposure growth
        FrequencyInc = poisson.rvs(AverageFreq*(1+ExposureGrowth)**year, size=1)
        Frequency.append(FrequencyInc)

        # Simulate ground-up severities, apply inflation and sort largest first
        r = lognorm.rvs(s, scale=scale, size=FrequencyInc[0])
        r = np.multiply(r, (1+Inflation)**year)
        r = np.sort(r)[::-1]

        # Keep only the losses above the large loss threshold
        r_LLOnly = r[(r >= LLThreshold)]
        SimOutputFGU.append(np.transpose(r))
        SimOutputLL.append(np.transpose(r_LLOnly))

    # One column per year, losses sorted descending down each column
    SimOutputFGU = pd.DataFrame(SimOutputFGU).transpose()
    SimOutputLL = pd.DataFrame(SimOutputLL).transpose()

    # Log of row 5 (the 6th largest large loss in each year, the code's proxy
    # for the median of the top 10), regressed against year
    a = np.log(SimOutputLL.iloc[5])
    AllLnOutput.append(a)
    b = linregress(a.index, a).slope
    MedianTop10Method.append(b)

AllLnOutputdf = pd.DataFrame(AllLnOutput)

# Convert each regression slope into an inflation estimate: exp(slope) - 1
dfMedianTop10Method = pd.DataFrame(MedianTop10Method)
dfMedianTop10Method['Exp-1'] = np.exp(dfMedianTop10Method[0]) - 1

# Mean and standard deviation of the estimate across the 5,000 sims
print(np.mean(dfMedianTop10Method['Exp-1']))
print(np.std(dfMedianTop10Method['Exp-1']))
0.050423461401442896
0.02631028930074786
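As a rough sketch of the multi-percentile extension mentioned in the take-aways above (this is my own illustration rather than Parodi's suggested work-around; the function name and the choice of ranks are arbitrary), we could fit the same log-regression at several ranks near the top of the distribution and average the slopes:

def multi_rank_inflation(SimOutputLL, ranks=(3, 5, 7, 10)):
    # SimOutputLL: one column per year, losses sorted descending down each
    # column, as built in the simulation loop above. Each rank must be smaller
    # than the number of large losses in every year, otherwise a NaN will
    # poison the regression.
    slopes = []
    for k in ranks:
        stat = np.log(SimOutputLL.iloc[k])               # (k+1)-th largest loss in each year
        slopes.append(linregress(stat.index, stat).slope)
    return np.exp(np.mean(slopes)) - 1                   # average slope -> inflation estimate

# Example (relies on the imports above and the last simulated SimOutputLL):
# multi_rank_inflation(SimOutputLL)

Averaging the slopes rather than the individual inflation estimates is an arbitrary choice on my part, and whether this actually reduces the standard deviation of the estimate is something to test properly another time.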
[1] Pietro Parodi, Pricing in General Insurance, Chapman and Hall/CRC, ISBN 9781466581449.

