$$f(x,c) = \frac{c}{(1+x)^{c+1}}$$

Whereas the ‘standard’ specification is [1]:

$$f(x,c,\lambda) = \frac{c \lambda^c}{(\lambda+x)^{c+1}}$$

This is also the definition given in the IFoA core reading [2].

So, how can we use the SciPy version of the Lomax to simulate the standard version, given we are missing the $ \lambda ^c$ term?

To answer this we need to go to the relevant section of the SciPy documentation [3]:

It’s not obvious from this, but we can show that the Lomax from SciPy, with loc = 0, c set to the tail parameter ($\alpha$ from Wikipedia), and scale set to the scale parameter ($\lambda$ from Wikipedia), is equal to the Lomax distribution in the standard specification.

To see this, first note that $lomax.pdf(x,c,l,s) = \frac{lomax(y,c)}{s}$, where $y = \frac{x-l}{s}$. Using the definition of $lomax(x,c)$ and setting $loc = 0$, this is equal to:

$$\frac{\frac{c}{(1+\frac{x}{s})^{c+1}}}{s} = \frac{c \, s^{-1}}{(1+\frac{x}{s})^{c+1}} = \frac{c \, s^{-1}}{(\frac{x+s}{s})^{c+1}} = \frac{c \, s^{-1} \, s^{c+1}}{(x+s)^{c+1}} = \frac{c \, s^{c}}{(x+s)^{c+1}}$$

This gets us the desired result: by using $loc = 0$, $c = \alpha$, and $scale = \lambda$, the SciPy Lomax becomes the standard Lomax.
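As a quick numerical check (a minimal sketch; the parameter values below are arbitrary), we can compare SciPy's pdf under this parameterisation against the standard formula directly:

```python
import numpy as np
from scipy.stats import lomax

# Arbitrary illustrative parameters: tail alpha and scale lambda
alpha, lam = 2.5, 3.0
x = np.linspace(0.1, 10.0, 50)

# SciPy parameterisation: loc = 0, c = alpha, scale = lambda
scipy_pdf = lomax.pdf(x, alpha, loc=0, scale=lam)

# 'Standard' (Wikipedia) parameterisation
standard_pdf = alpha * lam**alpha / (x + lam)**(alpha + 1)

print(np.allclose(scipy_pdf, standard_pdf))  # True
```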

[1] https://en.wikipedia.org/wiki/Lomax_distribution

[2] IFoA - Core Reading for the 2022 Exams - CS2 Risk Modelling and Survival Analysis

[3] https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lomax.html

Suppose we’re given a sample $X' = \{x_1, x_2, \ldots, x_n\}$ from a uniform distribution with parameters $a, b$. Then the MLE estimators are $\hat{a} = \min(X')$ and $\hat{b} = \max(X')$ [1]. All straightforward so far.

However, examining these estimators, we can also say with probability 1 that $a < \min(X')$, and similarly that $b > \max(X')$. Isn't it strange that the MLE estimators are guaranteed to overshoot the true $a$ and undershoot the true $b$?

So what can we do instead?

Source: https://commons.wikimedia.org/wiki/File:Bendixen_-_Carl_Friedrich_Gau%C3%9F,_1828.jpg

Clearly, $\min(X') \to a$ as $n \to \infty$. But it’s interesting that the MLE method is unwilling to ‘extrapolate’ beyond the sample and provide us with an estimate of $a$ below the current minimum observed in the sample.

In order to do something along these lines, instead of the MLE we can use an unbiased estimator, for instance the following for the upper endpoint $b$ of a $U(0,b)$ distribution: [2]

$$\frac{n+1}{n} \max(X')$$

Using this estimator, we are 'projecting out' beyond the sample by the scaling factor $\frac{n+1}{n}$, which feels very sensible to me. Intuitively I’m much more comfortable estimating the endpoints with this unbiased estimator than with the MLE. Yet I’d internalised the idea that the MLE is the ‘best’ estimator to use for a given problem. It turns out this isn't always the case, in particular for small sample sizes, where the scaling factor can be quite material.
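A quick simulation makes the bias visible (a sketch with arbitrary parameters, taking the lower endpoint as 0 so the unbiased estimator applies directly):

```python
import numpy as np

rng = np.random.default_rng(0)
b_true, n, sims = 10.0, 5, 20000

# Each row is one sample of size n from U(0, b_true)
samples = rng.uniform(0.0, b_true, size=(sims, n))

mle = samples.max(axis=1)       # MLE for b: the sample maximum, always below b
unbiased = (n + 1) / n * mle    # scaled-up unbiased estimator

print(mle.mean())       # ~ n/(n+1) * b_true = 8.33, biased low
print(unbiased.mean())  # ~ 10.0
```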

[1] https://www.mathworks.com/help/stats/uniform-distribution-continuous.html

[2] https://math.stackexchange.com/questions/2246222/unbiased-estimator-of-a-uniform-distribution

It's a reasonable question, and the answer is somewhat context dependent; a full explanation is given below.

Taking a step back, 'net' in the context of reinsurance refers to something being subtracted from a given value.

If someone used the term 'net quota share', barring any clear indication otherwise, my working assumption would be that they were referring to a quota share where the original brokerage is netted off before the premium is ceded, and the ceding commission is then paid as a percentage of premium net of original commissions (which was the version described by the reader).

That being said, returning to my original point above about net referring to *something* being subtracted off, a 'net quota share' can also refer to *inuring reinsurance* being subtracted off. For example, if there is an XoL layer, and the quota share operates on the premium and losses after the XoL premium has been deducted, and the XoL recoveries have been subtracted from losses, then this may also be referred to as a net quota share.

The first use (netting off original brokerage) is probably more common, but if there was ever any confusion I'd 100% ask rather than assuming which one was being referred to.
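To make the two readings concrete, here's a toy calculation; all the figures below (premium, brokerage, cession percentage, XoL cost) are made up purely for illustration:

```python
# Hypothetical figures, purely for illustration
gross_premium = 1000.0
original_brokerage = 0.15   # 15% original brokerage
qs_cession = 0.30           # 30% quota share
xol_premium = 80.0          # premium for an inuring XoL layer

# Reading 1: QS applies to premium net of original brokerage
ceded_1 = qs_cession * gross_premium * (1 - original_brokerage)

# Reading 2: QS applies to premium net of the inuring XoL premium
ceded_2 = qs_cession * (gross_premium - xol_premium)

print(ceded_1)  # 255.0
print(ceded_2)  # 276.0
```

Same label, materially different ceded premium, which is why it's worth asking which one is meant.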

In respect of the 'caps' question: it's not uncommon to have caps and other structuring features on a quota share. I've seen loss ratio caps, per-occurrence caps, and, as the question mentions, caps on specific perils such as cat.

I wrote a quick script to backtest one particular method of deriving claims inflation from loss data. I first came across the method in 'Pricing in General Insurance' by Pietro Parodi [1], but I'm not sure if the method pre-dates the book or not.

In order to run the method, all we require is a large loss bordereau, which is useful from a data perspective. Unlike many methods which focus on fitting a curve through attritional loss ratios, or on ultimate attritional losses per unit of exposure over time, this method can easily produce a *large loss* inflation pick, which is important as the two can often be materially different.

The code works by simulating 10 years of individual losses from a Poisson-Lognormal model and applying 5% inflation per annum. We then throw away all losses below the large loss threshold, to put ourselves in the situation of only having been supplied with a large loss claims listing, and analyse the change over time of the 'median of the top 10 claims'. We select this slightly funky-looking statistic because it should increase over time in the presence of inflation, but by looking at the median rather than the mean we take out some of the spikiness. Since we hardcoded 5% inflation into the simulated data, we are looking to recover this value when we apply the method to the synthetic data.

I've pasted the code below, but jumping to the conclusions, here are a few take-aways:

- **The method does work**: note the final answer is 5.04%, and with a few more sims this does appear to approach the original pick of 5%, which is good: the method provides an unbiased estimate.
- **The standard deviation is really high**: the standard deviation is of the order of 50% of the mean. Assuming a normal distribution, we'd expect 95% of values to sit between 0% and 10%, which is a huge range. In practice even an extra 1% of inflation in our modelling can often cause a big swing in loss cost, so the method as currently presented, using this particular set-up of simulated loss data, is basically useless.
- **The method is thrown off by changes in the FGU claim count**: I haven't shown it below, but if you amend the 'Exposure Growth' value below from 0%, the method no longer provides an unbiased estimate. If the data includes growth, then it tends to over-estimate inflation, and vice versa if the FGU claim count reduces over time. Parodi does mention this in the book and offers a work-around, which I haven't included below but will write up another time.
- **It's non-parametric**: I do like the fact that it's a non-parametric method. The other large loss inflation methods I'm aware of all involve assuming some underlying probability distribution for the data (exponential, Pareto, etc.).
- **We can probably improve the method**: the method effectively ignores all the data other than the 5th largest claim within a given year, so we reduce the entire analysis to the rate of change of just 10 numbers. One obvious extension would be to average across the change in multiple percentiles of the distribution; we could also explore other robust statistics (e.g. Parodi mentions trimmed means). I'll also set this up another time to see if we get an improvement in performance.
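As a sketch of that last extension (this is my own variation, not the version in Parodi; the function name and choice of ranks are assumptions), we can fit the log-linear trend at several order statistics and average the slopes, reusing a `SimOutputLL`-style DataFrame (one column per year, losses sorted descending down each column):

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress

def avg_percentile_inflation(loss_df, ranks):
    """Average the fitted log-linear trend across several order statistics.

    loss_df: DataFrame with one column per year and losses sorted descending
             down each column; assumes no missing values in the chosen rows.
    ranks:   row indices to regress over (0 = largest loss each year).
    Returns the implied annual inflation rate.
    """
    slopes = []
    for k in ranks:
        logged = np.log(loss_df.iloc[k].astype(float))
        slopes.append(linregress(logged.index, logged).slope)
    return np.exp(np.mean(slopes)) - 1.0

# Tiny deterministic check: fixed losses grown at exactly 5% pa
base = np.array([5e6, 4e6, 3e6, 2e6, 1.5e6])
demo = pd.DataFrame({y: base * 1.05**y for y in range(10)})
print(avg_percentile_inflation(demo, ranks=[0, 1, 2, 3]))  # ~0.05
```

Whether averaging ranks actually reduces the variance on noisy simulated data is exactly the empirical question to test another time.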

In [1]:

```python
import numpy as np
import pandas as pd
import scipy.stats as scipy
from math import exp
from math import log
from math import sqrt
from scipy.stats import lognorm
from scipy.stats import poisson
from scipy.stats import linregress
```

In [2]:

```python
Distmean = 1000000.0
DistStdDev = Distmean * 1.5
AverageFreq = 100
years = 10
ExposureGrowth = 0.0
Mu = log(Distmean / (sqrt(1 + DistStdDev**2 / Distmean**2)))
Sigma = sqrt(log(1 + DistStdDev**2 / Distmean**2))
LLThreshold = 1e6
Inflation = 0.05
s = Sigma
scale = exp(Mu)
```

In [3]:

```python
MedianTop10Method = []
AllLnOutput = []
for sim in range(5000):
    SimOutputFGU = []
    SimOutputLL = []
    year = 0
    Frequency = []
    for year in range(years):
        FrequencyInc = poisson.rvs(AverageFreq * (1 + ExposureGrowth)**year, size=1)
        Frequency.append(FrequencyInc)
        r = lognorm.rvs(s, scale=scale, size=FrequencyInc[0])
        r = np.multiply(r, (1 + Inflation)**year)
        r = np.sort(r)[::-1]
        r_LLOnly = r[(r >= LLThreshold)]
        SimOutputFGU.append(np.transpose(r))
        SimOutputLL.append(np.transpose(r_LLOnly))
    SimOutputFGU = pd.DataFrame(SimOutputFGU).transpose()
    SimOutputLL = pd.DataFrame(SimOutputLL).transpose()
    a = np.log(SimOutputLL.iloc[5])
    AllLnOutput.append(a)
    b = linregress(a.index, a).slope
    MedianTop10Method.append(b)
AllLnOutputdf = pd.DataFrame(AllLnOutput)
dfMedianTop10Method = pd.DataFrame(MedianTop10Method)
dfMedianTop10Method['Exp-1'] = np.exp(dfMedianTop10Method[0]) - 1
print(np.mean(dfMedianTop10Method['Exp-1']))
print(np.std(dfMedianTop10Method['Exp-1']))
```

0.050423461401442896
0.02631028930074786

[1] Pricing in General Insurance, Pietro Parodi, Chapman and Hall/CRC, ISBN 9781466581449

Source: https://somewan.design

Here's the extract from the Jupyter notebook; it's fairly self-explanatory, so I won't spend much time working through what's happening.