THE REINSURANCE ACTUARY

Inflation modelling - Nth largest - part 6 - various metrics

7/9/2023

 


We previously introduced a method of deriving large loss claims inflation from a large loss claims bordereau, and we then spent some time understanding how robust the method is depending on how much data we have and how volatile that data is. In this post we're finally going to play around with making the method more accurate, rather than just poking holes in it. To do this, we are once again going to simulate data with a baked-in inflation rate (set to 5% here), and then vary the metric we use to extract an estimate of the inflation from the data. In particular, we are going to look at using the Nth largest loss by year, varying N from 1 to 20.

Photo by Julian Dik. I was recently in Lisbon, so here is a cool photo of the city. Not really related to the blog post, but to be honest it's hard thinking of photos with some link to inflation, so I'm just picking nice photos at this point!

Here is our Python code:

import numpy as np
import pandas as pd
from math import exp, log, sqrt
from scipy.stats import lognorm, poisson, linregress
import matplotlib.pyplot as plt
# Severity and frequency assumptions
Distmean = 1000000.0          # mean of the ground-up severity distribution
DistStdDev = Distmean*1.5     # standard deviation (CV of 1.5)
AverageFreq = 100             # average annual claim count
years = 20                    # number of years to simulate
ExposureGrowth = 0.0          # no exposure growth

# Method-of-moments parameters for the lognormal severity
Mu = log(Distmean/(sqrt(1+DistStdDev**2/Distmean**2)))
Sigma = sqrt(log(1+DistStdDev**2/Distmean**2))

LLThreshold = 1e6             # large loss reporting threshold
Inflation = 0.05              # baked-in inflation rate we are trying to recover

# scipy.stats.lognorm parameterisation
s = Sigma
scale = exp(Mu)
MedianTop10Method = []        # fitted slope per simulation, for each N
AllLnOutput = []              # log losses per simulation, kept for reference
for sim in range(10000):

    # Simulate one 20-year large loss bordereau
    SimOutputLL = []
    Frequency = []
    for year in range(years):
        # Annual claim count, then lognormal severities trended for inflation
        FrequencyInc = poisson.rvs(AverageFreq*(1+ExposureGrowth)**year, size=1)
        Frequency.append(FrequencyInc)
        r = lognorm.rvs(s, scale=scale, size=FrequencyInc[0])
        r = np.multiply(r, (1+Inflation)**year)
        r = np.sort(r)[::-1]
        r_LLOnly = r[(r >= LLThreshold)]
        SimOutputLL.append(np.transpose(r_LLOnly))

    # Rows = loss rank within year (largest first), columns = years
    SimOutputLL = pd.DataFrame(SimOutputLL).transpose()

    # For each N, regress the log of the Nth largest loss against year;
    # the slope is the continuously compounded inflation estimate
    AllLnOutputSim = []
    MedianTop10MethodSim = []
    for iRow in range(1, 21):
        a = np.log(SimOutputLL.iloc[iRow])
        AllLnOutputSim.append(a)
        b = linregress(a.index, a).slope
        MedianTop10MethodSim.append(b)
    AllLnOutput.append(AllLnOutputSim)
    MedianTop10Method.append(MedianTop10MethodSim)

AllLnOutputdf = pd.DataFrame(AllLnOutput)

# Convert slopes to annual inflation rates, then summarise across the 10,000 sims
OutputMean = []
OutputStdDev = []

dfMedianTop10Method = pd.DataFrame(MedianTop10Method)
for iRow in range(0, 20):
    OutputMean.append(np.mean(np.exp(dfMedianTop10Method[iRow]) - 1))
    OutputStdDev.append(np.std(np.exp(dfMedianTop10Method[iRow]) - 1))

print(OutputMean)
print(OutputStdDev)

plt.plot(OutputStdDev)
plt.xlabel('Nth value to use')
plt.ylabel('Standard Deviation')
plt.title('Standard Deviation vs Nth large loss')
[0.050145087623911413, 0.050114063936035846, 0.05008171646318982, 0.05004295735703755, 0.050032712657938, 0.050014427471083624, 0.05001184154389972, 0.05002322239170273, 0.05000059530333641, 0.04998014653293243, 0.05002513041036643, 0.049987258053581535, 0.049978122421359926, 0.04999577129950346, 0.04997788849739706, 0.04992582226827509, 0.04988217239699916, 0.049813271987513466, 0.0497039041388662, 0.04951880036014647]
[0.013745415886517658, 0.011764812681484629, 0.010643862238651042, 0.009893413918701141, 0.00929811434851868, 0.008829621751760679, 0.00852287511743378, 0.00827148148117868, 0.008059106267070865, 0.007896990729065285, 0.0077614481203254, 0.007619604123042998, 0.007507459106587108, 0.00741696242472465, 0.007320604159513552, 0.007224126566330418, 0.0071451595403478, 0.0070669619394482614, 0.007013056254295168, 0.0069269062482146026]
Plot: Standard Deviation vs Nth large loss (x-axis: Nth value to use, y-axis: standard deviation of the inflation estimate).

Let's check this graph against our previous runs. We know that when looking at the 5th largest loss, i.e. the median of the top 10, we had a standard deviation of around 0.9%. We can see in the above graph, when reading off 5 from the x-axis, that this matches. So far so good.

What is the graph telling us? It's saying that as we increase the value of N, the standard deviation of our estimate decreases. The reduction is initially quite big: going from the largest loss by year to the 5th largest loss results in roughly a one-third reduction in standard deviation, but it then levels off as we continue. Going from the 5th largest loss to the 10th largest loss only results in an approx. 15% reduction.
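As a rough sanity check on those figures, here's a short snippet that computes the reductions directly from a few of the printed OutputStdDev values above (assuming, as in the commentary, that the first entry corresponds to the largest loss by year):

# Rough check of the quoted reductions, hard-coding a few of the printed
# OutputStdDev values above (first entry taken to be the largest loss by year).
std_largest = 0.013745   # largest loss by year
std_5th = 0.009298       # 5th largest loss by year
std_10th = 0.007897      # 10th largest loss by year

print(f"Largest -> 5th largest: {1 - std_5th / std_largest:.0%} reduction")   # ~32%
print(f"5th -> 10th largest: {1 - std_10th / std_5th:.0%} reduction")         # ~15%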

This is good info to know! It means we can immediately improve our estimate of claims inflation just by using a lower attachment point. You sometimes see people comparing the 'largest inflated loss' by year as a sense check on their inflation rate, but what this chart tells us is that the largest loss is actually the most volatile metric you could pick out of this range; even the 2nd largest loss would be an improvement.

So why does using a higher value of N give a lower standard deviation in our estimate? I suspect this is driven by the lower volatility inherent in the metric itself, i.e. the 10th largest claim in a dataset should be less volatile than the 9th largest, which in turn should be less volatile than the 8th largest, and so on. As we move down the distribution, we are implicitly reducing the noise in our data, which makes our estimate more accurate. A quick way to see this directly is sketched below.
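Here's a minimal sketch of that intuition (my own illustrative check, not part of the original method), re-using the s, scale and AverageFreq assumptions above: simulate a single un-inflated year many times and look at how the spread of the log of the Nth largest loss shrinks as N increases. The variable n_sims_check is just my own choice of simulation count.

# Illustrative check: volatility of the Nth largest loss itself for a single
# un-inflated year. Re-uses s, scale and AverageFreq from the parameter cell above.
n_sims_check = 5000   # number of single-year simulations

nth_largest = {n: [] for n in range(1, 11)}   # N = 1 (largest) to 10
for _ in range(n_sims_check):
    count = poisson.rvs(AverageFreq)
    losses = np.sort(lognorm.rvs(s, scale=scale, size=count))[::-1]
    for n in range(1, 11):
        if len(losses) >= n:
            nth_largest[n].append(losses[n - 1])

for n in range(1, 11):
    print(f"N = {n:2d}: std dev of log(Nth largest) = {np.std(np.log(nth_largest[n])):.3f}")

On these assumptions, the printed standard deviations should fall steadily as N increases, which is exactly the reduction in noise that feeds through into a more stable regression slope.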

Next time I'm going to examine this phenomenon further. It would be really cool if we could somehow link this improved accuracy with the results we obtained previously on how accuracy improves as the volatility of the ground-up loss distribution reduces.

