Excess layer pricing16/9/2020 I had to solve an interesting problem yesterday relating to pricing an excess layer which was contained in another layer which we knew the price for – I didn’t price the initial layer, and I did not have a gross loss model. All I had to go on was the overall price and a severity curve which I thought was reasonably accurate. The specific layers in this case were a 9m xs 1m, and I was interested in what we would charge for a 6m xs 4m.
Just to put some concrete numbers to this, let’s say the 9m xs 1m cost \$10m The xs 1m severity curve was as follows: The St Petersburg Paradox21/8/2020
Let me introduce a game – I keep flipping a coin and you have to guess whether it will come up heads or tails. The prize pot starts at \$2, and each time you guess correctly the prize pot doubles, we keep playing until you eventually guess incorrectly at which point you get whatever has accumulated in the prize pot.
So if you guess wrong on the first flip, you just get the \$2. If you guess wrong on the second flip you get \$4, and if you get it wrong on the 10th flip you get \$1024. Knowing this, how much would you pay to enter this game? You're guaranteed to win at least \$2, so you'd obviously pay at least $\2. There is a 50% chance you'll win \$4, a 25% chance you'll win \$8, a 12.5% chance you'll win \$16, and so on. Knowing this maybe you'd pay \$5 to play  you'll probably lose money but there's a decent chance you'll make quite a bit more than \$5. Perhaps you take a more mathematical approach than this. You might reason as follows – ‘I’m a rational person therefore as any good rational person should, I will calculate the expected value of playing the game, this is the maximum I should be willing to play the game’. This however is the crux of the problem and the source of the paradox, most people do not really value the game that highly – when asked they’d pay somewhere between \$2\$10 to play it, and yet the expected value of the game is infinite....
Source: https://unsplash.com/@pujalin
The above is a lovely photo I found of St Petersburg. The reason the paradox is named after St Petersburg actually has nothing to do with the game itself, but is due to an early article published by Daniel Bernoulli in a St Petersburg journal. As an aside, having just finished the book A Gentleman in Moscow by Amor Towles (which I loved and would thoroughly recommend) I'm curious to visit Moscow and St Petersburg one day.
If you are an actuary, you'll probably have done a fair bit of triangle analysis, and you'll know that triangle analysis tends to works pretty well if you have what I'd call 'nice smooth consistent' data, that is  data without sharp corners, no large one off events, and without substantially growth. Unfortunately, over the last few years, motor triangles have been anything but nice, smooth or consistent. These days, using them often seems to require more assumptions than there are data points in the entire triangle.
Should I inflate my loss ratios?14/12/2019 I remember being told as a relatively new actuarial analyst that you "shouldn't inflate loss ratios" when experience rating. This must have been sitting at the back of my mind ever since, because last week, when a colleague asked me basically the same question about adjusting loss ratios for claims inflation, I remembered the conversation I'd had with my old boss and it finally clicked. Let's go back a few years  it's 2016  Justin Bieber has a song out in which he keeps apologising, and to all of us in the UK, Donald Trump (if you've even heard of him) is still just the America's version of Alan Sugar. I was working on the pricing for a Quota Share, I can't remember the class of business, but I'd been given an aggregate loss triangle, ultimate premium estimates, and rate change information. I had carefully and meticulously projected my losses to ultimate, applied rate changes, and then set the trended and developed losses against ultimate premiums. I ended up with a table that looked something like this: (Note these numbers are completely made up but should give you a gist of what I'm talking about.) I then thought to myself ‘okay this is a property class, I should probably inflate losses by about $3\%$ pa’, the definition of a loss ratio is just losses divided by premium, therefore the correct way to adjust is to just inflate the ULR by $3\%$ pa. I did this, sent the analysis to my boss at the time to review, and was told ‘you shouldn’t inflate loss ratios for claims inflation, otherwise you'd need to inflate the premium as well’ – in my head I was like ‘hmmm, I don’t really get that...’ we’ve accounted for the change in premium by applying the rate change, claims certainly do increase each year, but I don't get how premiums also 'inflate' beyond rate movements?! but since he was the kind of actuary who is basically never wrong and we were short on time, I just took his word for it. I didn’t really think of it again, other than to remember that ‘you shouldn’t inflate loss ratios’, until last week one of my colleagues asked me if I knew what exactly this ‘Exposure trend’ adjustment in the experience rating modelling he’d been sent was. The actuaries who had prepared the work had taken the loss ratios, inflated them in line with claims inflation (what you're not supposed to do), but then applied an ‘exposure inflation’ to the premium. Ahha I thought to myself, this must be what my old boss meant by inflating premium. I'm not sure why it took me so long to get to the bottom of what, is when you get down to it, a fairly simple adjustment. In my defence, you really don’t see this approach in ‘London Market’ style actuarial modelling  it's not covered in the IFoA exams for example. Having investigated a little, it does seem to be an approach which is used by US actuaries more – possibly it’s in the CAS exams? When I googled the term 'Exposure Trend', not a huge amount of useful info came up – there are a few threads on Actuarial Outpost which kinda mention it, but after mulling it over for a while I think I understand what is going on. I thought I’d write up my understanding in case anyone else is curious and stumbles across this post. Proof by Example I thought it would be best to explain through an example, let’s suppose we are analysing a single risk over the course of one renewal. To keep things simple, we’ll assume it’s some form of property risk, which is covering Total Loss Only (TLO), i.e. we only pay out if the entire property is destroyed. Let’s suppose for $2018$, the TIV is $1m$ USD, we are getting a net premium rate of $1\%$ of TIV, and we think there is a $0.5\%$ chance of a total loss. For $2019$, the value of the property has increased by $5\%$, we are still getting a net rate of $1\%$, and we think the underlying probability of a total loss is the same. In this case we would say the rate change is $0\%$. That is: $$ \frac{\text{Net rate}_{19}}{\text{Net rate}_{18}} = \frac{1\%}{1\%} = 1 $$ However we would say that claim inflation is $2.5\%$, which is the increase in expected claims this follows from: $$ \text{Claim Inflation} = \frac{ \text{Expected Claims}_{19}}{ \text{Expected Claims}_{18}} = \frac{0.5\%*1.05m}{0.5\%*1m} = 1.05$$
From first principles, our expected gross gross ratio (GLR) for $2018$ is:
$$\frac{0.5 \% *(TIV_{18})}{1 \% *(TIV_{18})} = 50 \%$$ And for $2019$ is: $$\frac{0.5\%*(TIV_{19})}{1\%*(TIV_{19})} = 50\%$$ i.e. they are the same! The correct adjustment when onlevelling $2018$ to $2019$ should therefore result in a flat GLR – this follows as we’ve got the same GLR in each year when we calculated above from first principles. If we’d taken the $18$ GLR, applied the claims inflation $1.05$ and applied the rate change $1.0$, then we might erroneously think the Gross Loss Ratio would be $50\%*1.05 = 52.5\%$. This would be equivalent to what I did in the opening paragraph of this post, the issue being, that we haven’t accounted for trend in exposure and our rate change is a measure of the change in net rate. If we include this exposure trend as an additional explicit adjustment this gives $50\%*1.05*1/1.05 = 50\%$. Which is the correct answer, as we can see by comparing to our first principles calculation. So the fundamental problem, is that our measure of rate change is a measure in the movement of rate on TIV, whereas our claim inflation is a measure of the movement of aggregate claims. These two are misaligned, if our rate change was instead a measure in the movement of overall premium, then the two measures would be consistent and we would not need the additional adjustment. However it’s much more common in this type of situation to get given rate change as a measure of change in rate on TIV. An advantage of making an explicit adjustment for exposure trend and claims inflation is that it allows us to apply different rates – which is probably more accurate. There’s no apriori justification as to why the two should always be the same. Claim inflation will be affected by additional factors beyond changes in the inflation of the assets being insured, this may include changes in frequency, changes in court award inflation, etc… It’s also interesting to note that the clam inflation here is of a different nature to what we would expect to see in a standard Collective Risk Model. In that case we inflate individual losses by the average change in severity i.e. ignoring any change in frequency. When adjusting the LR above, we are adjusting for both the change in frequency and severity together, i.e. in the aggregate loss. The above discussion also shows the importance of understanding exactly what someone means by ‘rate change’. It may sound obvious but there are actually a number of subtle differences in what exactly we are attempting to measure when using this concept. Is it change in premium per unit of exposure, is it change in rate per dollar of exposure, or is it even change in rate adequacy? At various points I’ve seen all of these referred to as ‘rate change’. Beta Distribution in Actuarial Modelling3/11/2019
I saw a useful way of parameterising the Beta Distribution a few weeks ago that I thought I'd write about.
The standard way to define the Beta is using the following pdf:
$$f(x) = \frac{x^{\alpha 1} {(1x)}^{\beta 1}}{B ( \alpha, \beta )}$$
Where $ x \in [0,1]$ and $B( \alpha, \beta ) $ is the Beta Function:
$$ B( \alpha, \beta) = \frac{ \Gamma (\alpha ) \Gamma (\beta)}{\Gamma(\alpha + \beta)}$$
When we use this parameterisation, the first two moments are:
$$E [X] = \frac{ \alpha}{\alpha + \beta}$$
$$Var (X) = \frac{ \alpha \beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$$
We see that the mean and the variance of the Beta Distribution depend on both parameters  $\alpha$ and $\beta$. If we want to fit these parameters to a data set using a method of moments then we need to use the following formulas, which are quite complicated:
$$\hat{\alpha} = m \Bigg( \frac{m (1m) }{v}  1 \Bigg) $$
$$\hat{\beta} = (1 m) \Bigg( \frac{m (1m) }{v}  1 \Bigg) $$ This is not the only possible parameterisation of the Beta Distribution however. We can use an alternative definition where we define:
$$\gamma = \frac{ \alpha}{\alpha + \beta} $$, and $$\delta = \alpha + \beta$$
And then by construction, $E[X] = \gamma$, and we can calculate the new variance:
$$V = \frac{ \alpha \beta}{(\alpha + \beta)^2(\alpha + \beta + 1)} = \frac{\gamma ( 1  \gamma)}{(1\delta)}$$.
Placing these new variables back in our pdf gives the following equation:
$$f(x) = \frac{x^{\gamma \delta 1} {(1x)}^{\delta (1\gamma) 1}}{B ( \gamma \delta, \delta (1\gamma) 1 )}$$
So why would we bother to do this? Our new formula now looks more complicated to work with than the one we started with. There are however two main advantages to this new version, firstly the method of moments is much simpler to set up, our first parameter is simply the mean, and the formula for variance is easier to calculate than before. This makes using the Beta distribution much easier in a Spreadsheet. The second advantage, and in my mind the more important point, is that since we now have a strong link between the central moments and the two parameters that define the distribution we now have an easy and intuitive understand of what our parameters actually represent. As I’ve written about before, rather than just sticking with the standard statistics textbook version, I’m a big fan of pushing parameterisations that are both useful and easily interpretable, The version of the Beta Distribution presented above achieves this. Furthermore it also fits nicely with the schema I've written about before (most recently in the in the post below on negative binomial distribution), in which no matter which distribution we are talking about, the first parameter of a distribution gives you information about it's mean, the second parameter gives information about its volatility, etc. By doing this you give yourself the ability to compare distributions and sense check parameterisations at a glance. Pensions and Tax12/10/2019 Pensions and tax you think to yourself  he’s managed to find the two most boring subjects possible and now he appears to be planning to write about a combination of them! Is this some sort of attempt to win an award for the most boring and tedious blog post every? Well what if I told you this blog post, whilst being about pension and tax was also about a burning social justice, about high finance, and about death? Would you find that interesting? Okay, so I may got slightly carried away in that last paragraph and over sold slightly. This is less of a 'burning social injustice', and more an 'unfair outcome' which happens to only affect people who are already well off  but someone has to write about this stuff right? And the current status quo is genuinely unfair I promise. Lifetime Allowance Before diving into the unfairness I mentioned earlier, we need to briefly set the scene. Let’s think about all the types of tax that the government levies on us; we get taxed on our income (income tax), we get taxed on our property wealth (council tax), we get taxed on our wealth when we die (inheritance tax), and we get taxed when we buy things (VAT and other sales taxes). Since the government taxes just about every financial transaction we make, they can tweak all these different types of tax to incentivise and disincentivizes certain behaviours. One thing governments have traditionally been very keen on incentivising is getting its citizens saving enough for retirement. The way this has historically been done is to allow income tax relief on the proportion of an employee’s gross salary which is put into a pension. i.e. If you earn £30,000 and put £5,000 of that into a pension, then you pay no tax on the £5,000 and are then taxed as if you only earn £25,000. You might think to yourself great, so now people are saving more for their retirement and everyone lives happily ever after, the end. Well not quite. The problem with this arrangement is certain unscrupulous individuals started to use this mechanism to avoid paying tax. Let’s say you’re 64, quite well off, and hoping to retire next year, why not chuck most of your money into your pension? You can live off your savings for a year and then all that money you would have taken as income and therefore been taxed on can instead be deferred for a year, taken as pension when you retire, and you would pay no income tax on that year's salary. Or let’s suppose an individual come from a very wealthy family, and doesn't really need to get paid a salary to live on, then they could just put their entire salary into a pension, not take anything until they retire (at 55), and then not pay any tax at all on their earnings from their job across their entire career. Okay, fine, it sounds like we’re going to need some caps on the maximum pension tax relief to stop people doing this. Let’s say – £225k per annum is the maximum that can be put into a pension pot per year, and £1.5m is the total size of a pension pot you can build up before we start making people pay tax. That’s a large amount of money, who’s going to complain about that? These were the levels set by Tony Blair’s government between 2004 and 2006. These caps were then slowly increased over time, reaching a height of around £250k pa, and £1.8m as a lifetime allowance in 2011. These are very big figures, and it’s hard to imagine anyone objecting to a tax on an annual pension contribution above £250k. In 2011, with a rapidly expanding budget deficit to deal with and a promise to balance the books, the Conservative government instigated their infamous austerity programme. The tories had pledged to ringfence certain areas of government spending (the NHS, and Education for example), they therefore started to look around for other areas to raise additional income which had not traditionally been taped. Tax relief for high earners seemed like an obvious target, and who can really say that the rich should not have to pay additional tax given the situation to help try to protect spending on areas such as policing or social care. By 2014 the annual allowance (the amount that can be saved per annum) had been reduced to £40k per annum, and the lifetime allowance was down to £1m. These values still seem high, however with the economy in the state it was, they are perhaps not as high as they first appear. This reduction came during a period of alltime low interest rates, I'm going to demonstrate which this caused issued by examining a series of graphs. The following graph displays the Bank of England base rate for the last twenty years, notice the significant reduction in 2007 following the global financial crisis. Note we have been at this very low level ever since. As a proxy for the investment return a pension or life insurance company could receive on a long term almost riskfree investment I’m going to use Base rate + 2%, I've plotted this as well on the graph below: Okay, so why is this important? The interesting idea is that the ‘price’ of an annuity can be approximated using quite a simple formula. It is (very) roughly equal to: $ A = \frac{1}{i} $ Where i is our investment return. What does this mean in real numbers? If I want to retire and get a pension of £10,000 per annum, Then the price of that is going to be 10,000 * A = 10,000 / i. If i=10%, then I will need to have a pot of £100,000. If i = 1%, then I will need a pot of £1m. Let’s plot the implied annuity price on the same chart as a secondary axis: So we see we have this reciprocal relationship, as interest rates go down, the cost of an annuity goes up. At the moment, due to the very low interest rates, it costs something like £37 to purchase an annuity of £1 per annum for life. Interesting you might (or might not) think. But this annuity price becomes useful if we combine it with the lifetime taxfree allowance. If we assume a potential retiree is going to use their pension pot to purchase an annuity, then to calculate the level of pension they will be able to get, we need to take their pension pot and divide by the cost of the annuity. The graph below shows the approximate taxfree pension one could expect if retiring with a pension pot equal to the full Lifetime allowance, using the prevailing annuity rates which are in turn based on the prevailing interest rates. So we see that through a combination of the LTA being reduced significantly and interest rates bottoming out, we are now in the position where the maximum taxfree pension (based on approximate annuity rates) is as little as £25k. A far cry from the approximately £100k when the rule was first implemented. What about that burning social injustice you mentioned? But at least we are all treated the same right? Well not exactly… all of the above glossed over one quite important point, it assumes that the individual in question has a defined contribution pension. For individuals with a final salary pension (which includes MPs, Judges, a majority of CEOs and senior public sector employees), the LTA is calculated with reference to a fixed annuity factor of 20. Given the actual average annuity price is more like 40, this has a massive impact. Let’s plot this on a graph as well. Hmmm, so we see that in 2006 when the LTA was first set up, this factor of 20 was quite generous to Defined Contribution pension holders, they could often retire with a pension of around £120k without paying any tax, whereas a defined benefit pension holder could only retire with a maximum of approx. £80k. This has now flipped, due to changes in annuity rates, and the fact that this factor has not been adjusted for so long. Under current conditions, the maximum pension that can be taken tax free is around £25k pa for a definied contribution pension, but around £50k pa for a definied benefit pension.
So what is the point of all this? What am I advocating? To me it seems clear that we need (approximate) equality of outcome in terms of taxation for defined contribution pensions compared to defined benefit schemes. This could be achieved in one of two ways, either we apply some sort of floor to how low the tax free pension can reduced to for defined contribution pensions, or we adjust the defined benefit annuity factor periodically to keep it roughly in line with market conditions and therefore apply a lower maximum tax free rate to defined benefit pensioners. This would correct the current system by which judges and MPs (and other people with DB pensions) get significant tax breaks at retirement compared to individuals with DC pensions. Negative Binomial in VBA21/9/2019
Have you ever tried to simulate a negative binomial random variable in a Spreadsheet?
If the answer to that is ‘nope – I’d just use Igloo/Metarisk/Remetrica’ then consider yourself lucky! Unfortunately not every actuary has access to a decent software package, and for those muddling through in Excel, this is not a particularly easy task. If on the other hand your answer is ‘nope – I’d use Python/R, welcome to the 21st century’. I’d say great, I like using those programs as well, but sometimes for reasons out of your control, things just have to be done in Excel. This is the situation I found myself in recently, and here is my attempt to solve it: Attempt 0 The first step I took in attempting to solve the problem was of course to Google it, then cross my fingers and hope that someone else has already solved it and this is just going to be a simple copy and paste. Unfortunately when I did search for VBA code to generate a negative binomial random variable, nothing comes up. In fact, nothing comes up when searching for code to simulate a Poisson random variable in VBA. Hopefully if you've found your way here, looking for this exact thing, then you're in luck, just scroll to the bottom and copy and paste my code. When I Googled it, there were a few solutions that almost solved the problem; there is a really useful Excel addin called ‘Real statistics’ which I’ve used a few times: http://www.realstatistics.com/ It's a free excel addin, and it does have functionality to simulate negative bimonials. If however you need someone to be able to rerun the Spreadsheet, they also will need to have it installed. In that case, you might as well use Python, and then hard code the frequency numbers. Also, I have had issues with it slowing Excel down considerably, so I decided not to use this in this case. I realised I’d have to come up with something myself, which ideally would meet the following criteria
How hard can that be? Attempt 1 I’d seen a trick before (from Pietro Parodi’s excellent book ‘Pricing in General Insurance’) that a negative binomial can be thought of as a Poisson distribution with a Gamma distribution as the conjugate prior. See the link below for more details: https://en.wikipedia.org/wiki/Conjugate_prior#Table_of_conjugate_distributions Since Excel has a built in Gamma inverse, we have simplified the problem to needing to write our own Poisson inverse. We can then easily generate negative binomials using a two step process:
Great, so we’ve reduced our problem to just being able to simulate a Poisson in VBA. Unfortunately there’s still no built in Poisson inverse in Excel (or at least the version I have), so we now need a VBA based method to generate this. There is another trick we can use for this  which is also taken from Pietro Parodi  the waiting time for a Poisson dist is an Exponential Dist. And the CDF of an Exponential dist is simple enough that we can just invert it and come up with a formula for generating an Exponential sample. We then set up a loop and add together exponential values, to arrive at Poisson sample. The code for this is give below: Function Poisson_Inv(Lambda As Double) s = 0 N = 0 Do While s < 1 u = Rnd() s = s  (Application.WorksheetFunction.Ln(u) / Lambda) k = k + 1 Loop BH_Poisson_Inv = (k  1) End Function The VBA code for our negative binomial is therefore: Function NegBinOld2(b, r) Dim Lambda As Double Dim N As Long u = Rnd() Lambda = Application.WorksheetFunction.Gamma_Inv(u, r, b) N = Poisson_Inv(Lambda) NegBinOld2 = N End Function Does this do everything we want?
There are a couple of downside of though:
This leads us on to Attempt 2 Attempt 2 If we pass the VBA a random uniform sample, then whenever we hit refresh in the Spreadsheet the random sample will refresh, which will force the Negative Binomial to resample. Without this, sometimes the VBA will function will not reload. i.e. we can use the sample to force a refresh whenever we like. Adapting the code gives the following: Function NegBinOld(b, r, Rnd1 As Double) Dim Lambda As Double Dim N As Long u = Rnd1 Lambda = Application.WorksheetFunction.Gamma_Inv(u, r, b) N = Poisson_Inv(Lambda) NegBinOld = N End Function So this solves the refresh problem. What about the random seed problem? Even though we now always get the same lambda for a given rand – and personally I quite like to hardcode these in the Spreadsheet once I’m happy with the model, just to speed things up. We still use the VBA rand function to generate the Poisson, this means everytime we refresh, even when passing it the same rand, we will get a different answer and this answer will be nonreplicable. This suggests we should somehow use the first random uniform sample to generate all the others in a deterministic (but still pseudorandom) way. Attempt 3 The way I implemented this was to the set the seed in VBA to be equal to the uniform random we are passing the function, and then using the VBA random number generator (which works deterministically for a given seed) after that. This gives the following code: Function NegBin(b, r, Rnd1 As Double) Rnd (1) Randomize (Rnd1) Dim Lambda As Double Dim N As Long u = Rnd() Lambda = Application.WorksheetFunction.Gamma_Inv(u, r, b) N = Poisson_Inv(Lambda) NegBin = N End Function So we seem to have everything we want – a free, quick, solution that can be bundled in a Spreadsheet, which allows other people to rerun without installing any software, and we’ve also eliminated the forced refresh issue. What more could we want? The only slight issue with the last version of the negative binomial is that our parameters are still specified in terms of ‘b’ and ‘r’. Now what exactly are ‘b’ and ‘r’ and how do we relate them to our sample data? I’m not quite sure.... The next trick is shamelessly taken from a conversation I had with Guy Carp’s chief Actuary about their implementation of severity distributions in MetaRisk. Attempt 4 Why can't we reparameterise the distribution using parameters that we find useful, instead of feeling bound by using the standard statistics textbook definition (or even more specifically the list given in the appendix to ‘Loss Models – from data to decisions’, which seems to be somewhat of an industry standard), why can't we redefine all the parameters from all common actuarial distributions using a systematic approach for parameters? Let's imagine a framework where no matter which specific severity distribution you are looking at, the first parameter contains information about the mean (even better if it is literally scaled to the mean in some way), the second contains information about the shape or volatility, the third contains information about the tail weight, and so on. This makes fitting distributions easier, it makes comparing the goodness of fit of different distributions easier, and it make sense checking our fit much easier. I took this idea, and tied this in neatly to a method of moments parameterisation, whereby the first value is simply the mean of the distribution, and the second is the variance over the mean. This gives us our final version: Function NegBin(Mean, VarOMean, Rnd1 As Double) Rnd (1) Randomize (Rnd1) Dim Lambda As Double Dim N As Long b = VarOMean  1 r = Mean / b u = Rnd() Lambda = Application.WorksheetFunction.Gamma_Inv(u, r, b) N = Poisson_Inv(Lambda) NegBin = N End Function Function Poisson_Inv(Lambda As Double) s = 0 N = 0 Do While s < 1 u = Rnd() s = s  (Application.WorksheetFunction.Ln(u) / Lambda) k = k + 1 Loop BH_Poisson_Inv = (k  1) End Function Poisson Distribution for small Lambda23/4/2019
I was asked an interesting question a couple of weeks ago when talking through some modelling with a client.
We were modelling an airline account, and for various reasons we had decided to base our large loss modelling on a very basic topdown allocation method. We would take a view of the market losses at a few different return periods, and then using a scenario approach, would allocate losses to our client proportionately. Using this method, the frequency of losses is then scaled down by the % of major policies written, and the severity of losses is scaled down by the average line size. To give some concrete numbers (which I’ve made up as I probably shouldn’t go into exactly what the client’s numbers were), let's say the company was planning on taking a line on around 10% of the Major Airline Risks, and their average line was around 1%. We came up with a table of return periods for market level losses. The table looked something like following (the actual one was also different to the table below, but not miles off):
Then applying the 10% hit factor if there is a loss, and the 1% line written, we get the following table of return periods for our client:
Hopefully all quite straightforward so far. As an aside, it is quite interesting to sometimes pare back all the assumptions to come up with something transparent and simple like the above. For airline risks, the largest single policy limit is around USD 2.5bn, so we are saying our worst case scenario is a single full limit loss, and that each year this has around a 1 in 50 chance of occurring. We can then directly translate that into an expected loss, in this case it equates to 50m (i.e. 2.5bn *0.02) of pure loss cost. If we don't think the market is paying this level of premium for this type of risk, then we better have a good reason for why we are writing the policy!
So all of this is interesting (I hope), but what was the original question the client asked me? We can see from the chart that for the market level the highest return period we have listed is 1 in 50. Clearly this does translate to a much longer return period at the client level, but in the meeting where I was asked the original question, we were just talking about the market level. The client was interested in what the 1 in 200 at the market level was and what was driving this in the modelling. The way I had structured the model was to use four separate risk sources, each with a Poisson frequency (lambda set to be equal to the relevant return period), and a fixed severity. So what this question translates to is, for small Lambdas $(<<1)$, what is the probability that $n=2$, $n=3$, etc.? And at what return period is the $n=2$ driving the $1$ in $200$? Let’s start with the definition of the Poisson distribution: Let $N \sim Poi(\lambda)$, then: $$P(N=n) = e^{\lambda} \frac{ \lambda ^ n}{ n !} $$ We are interested in small $\lambda$ – note that for large $\lambda$ we can use a different approach and apply sterling’s approximation instead. Which if you are interested, I’ve written about here: www.lewiswalsh.net/blog/poissondistributionwhatistheprobabilitythedistributionisequaltothemean
For small lambda, the insight is to use a Taylor expansion of the $e^{\lambda}$ term. The Taylor expansion of $e^{\lambda}$ is:
$$ e^{\lambda} = \sum_{i=0}^{\infty} \frac{\lambda^i}{ i!} = 1  \lambda + \frac{\lambda^2}{2} + o(\lambda^2) $$
We can then examine the pdf of the Poisson distribution using this approximation: $$P(N=1) =\lambda e^{\lambda} = \lambda ( 1 – \lambda + \frac{\lambda^2}{2} + o(\lambda^2) ) = \lambda  \lambda^2 +o(\lambda^2)$$
as in our example above, we have:
$$ P(N=1) ≈ \frac{1}{50} – {\frac{1}{50}}^2$$
This means that, for small lambda, the probability that $N$ is equal to $1$ is always slightly less than lambda. Now taking the case $N=2$: $$P(N=2) = \frac{\lambda^2}{2} e^{\lambda} = \frac{\lambda^2}{2} (1 – \lambda +\frac{\lambda^2}{2} + o(\lambda^2)) = \frac{\lambda^2}{2} \frac{\lambda^3}{2} +\frac{\lambda^4}{2} + o(\lambda^2) = \frac{\lambda^2}{2} + o(\lambda^2)$$
So once again, for $\lambda =\frac{ 1}{50}$ we have:
$$P(N=2) ≈ 1/50 ^ 2 /2 = P(N=1) * \lambda / 2$$
In this case, for our ‘1 in 50’ sized loss, we would expect to have two such losses in a year once every 5000 years! So this is definitely not driving our 1 in 200 result.
We can add some extra columns to our market level return periods as follows:
So we see for the assumptions we made, around the 1 in 200 level our losses are still primarily being driven by the P(N=1) of the 2.5bn loss, but then in addition we will have some losses coming through corresponding to P(N=2) and P(N=3) of the 250m and 500m level, and also combinations of the other return periods.
So is this the answer I gave to the client in the meeting? …. Kinda, I waffled on a bit about this kind of thing, but then it was only after getting back to the office that I thought about trying to breakdown analytically which loss levels we can expect to kick in at various return periods. Of course all of the above is nice but there is an easier way to see the answer, since we’d already stochastically generated a YLT based on these assumptions, we could have just looked at our YLT, sorted by loss size and then gone to the 99.5 percentile and see what sort of losses make up that level. The above analysis would have been more complicated if we have also varied the loss size stochastically. You would normally do this for all but the most basic analysis. The reason we didn’t in this case was so as to keep the model as simple and transparent as possible. If we had varied the loss size stochastically then the 1 in 200 would have been made up of frequency picks of various return periods, combined with severity picks of various return periods. We would have had to arbitrarily fix one in order to say anything interesting about the other one, which would not have been as interesting. Converting a Return Period to a RoL15/3/2019 I came across a useful way of looking at Rate on Lines last week, I was talking to a broker about what return periods to use in a model for various levels of airline market loss (USD250m, USD500m, etc.). The model was intended to be just a very high level, transparent market level model which we could use as a framework to discuss with an underwriter. We were talking through the reasonableness of the assumptions when the broker came out with the following: 'Well, you’d pay about 12.5 on line in the retro market at that attachment level, so that’s a 1 in 7 breakeven right?' My response was: 'ummmm, come again?' His reasoning was as follows: Suppose the ILW pays $1$ @ $100$% reinstatements, and that it costs $12.5$% on line. Then if the layer suffers a loss, the insured will have a net position on the contract of $75$%. This is the $100$% limit which they receive due to the loss, minus the original $12.5$% Premium, minus an additional $12.5$% reinstatement Premium. The reinsurer will now need another $6$ years at $12.5$% RoL $(0.0125 * 6 = 0.75)$ to recover the limit and be at breakeven. Here is a breakdown of the cashflow over the seven years for a $10m$ stretch at $12.5$% RoL: So the loss year plus the six clean years, tells us that if a loss occurs once every 7 years, then the contract is at breakeven for this level of RoL.
So this is kind of cool  any time we have a RoL for a retro layer, we can immediately convert it to a Return Period for a loss which would trigger the layer. Generalisation 1 – various rates on line We can then generalise this reasoning to apply to a layer with an arbitrary RoL. Using the same reasoning as above, the breakeven return period ends up being: $RP= 1 + \frac{(12*RoL)}{RoL}$ Inverting this gives: $RoL = \frac{1}{(1 + RP)}$ So let's say we have an ILW costing $7.5$% on line, the breakeven return period is: $1 + \frac{(10.15)}{0.075} = 11.3$ Or let’s suppose we have a $1$ in $19$ return period, the RoL will be: $0.05 = \frac{1}{(1 + 19)}$ Generalisation 2 – other nonproportional layers The formula we derived above was originally intended to apply to ILWs, but it also holds any time we think the loss to the layer, if it occurs, will be a total loss. This might be the case for a cat layer, or a clash layers (layers which have an attachment above the underwriting limit for a single risk), or any layer with a relatively high attachment point compared to the underwriting limit. Adjustments to the formulas There are a few of adjustments we might need to make to these formulas before using them in practice. Firstly, the RoL above has no allowance for profit or expense loading, we can account for this by converting the market RoL to a technical RoL, this is done by simply dividing the RoL by $120130$% (or any other appropriate profit/expense loading). This has the effect of increasing the number of years before the loss is expected to occur. Alternately, if layer does not have a paid reinstatement, or has a different factor than $100$%, then we would need to amend the multiple we are multiplying the RoL by in the formula above. For example, with nil paid reinstatements, the formula would be: $RP = 1 + \frac{(1RoL)}{RoL}$ Another refinement we might wish to make would be to weaken the total loss assumption. We would then need to reduce the RoL by an appropriate amount to account for the possibility of partial losses. It’s going to be quite hard to say how much this should be adjusted for – the lower the layer the more it would need to be. Extending the Copula Method26/8/2018 If you have ever generated Random Variables stochastically using a Gaussian Copula, you may have noticed that the correlation of the generated sample ends up being lower than the value of the Covariance matrix of the underlying multivariate Gaussian Distribution. For an explanation of why this happens you can check out a previous post of mine: www.lewiswalsh.net/blog/correlationsfriedrichgaussandcopula. It would be nice if we could amend our method to compensate for this drop. As a quick fix, we can simply run the model a few times and fudge the Covariance input until we get the desired Correlation value. If the model runs quickly, this is quite easy to do, but as soon as the model starts to get bigger and slower, it quickly becomes impractical to run it three of four times just to get the output Correlation we desire. We can do better than this. The insight we rely on is that for a Gaussian Copula, the Pearson Correlation in the generated sample just depends on the Covariance Value. We can therefore create a precomputed table of Input and Output values, and use this to select the correct input value for the desired output. I wrote some R code to do just that, we compute a table of Pearson's Correlations obtained for various Input Covariance values when using the Gaussian Copula. a < library(MASS) library(psych) set.seed(100) m < 2 n < 10^6 OutputCor < 0 InputCor < 0 for (i in 1:100) { sigma < matrix(c(1, i/100, i/100, 1), nrow=2) z < mvrnorm(n,mu=rep(0, m),Sigma=sigma,empirical=T) u < pnorm(z) OutputCor[i] < cor(u,method='pearson')[1,2] InputCor[i] < i/10 } OutputCor InputCor Here is a sample from the table of results. You can see that the drop is relatively modest, but it does apply consistent across the whole table. Here is a graph showing the drop in values:
Updated Algorithm
We can then use the precomputed table, interpolating where necessary, to give us a Covariance value for our Multivariate Gaussian Distribution which will generate the desired Pearson Product Moment Correlation Value. So for example, if we would like to generate a sample with a Pearson Product Moment value of $0.5$, according to our table, we would need to use $0.517602$ as an input Covariance. We can test these values using the following code: a < library(MASS) library(psych) set.seed(100) m < 2 n < 5000000 sigma < matrix(c(1, 0.517602, 0.517602, 1), nrow=2) z < mvrnorm(n,mu=rep(0, m),Sigma=sigma,empirical=T) u < pnorm(z) cor(u,method='pearson') Analytic Formulas I tried to find an analytic formula for the Product Moment values obtained in this manner, but I couldn't find anything online, and I also wasn't able to derive one myself. If we could find one, then instead of using the precompued table, we would be able to simply calculate the correct value. While searching, I did come across a number of interesting analytic formulas linking the values of Kendall's Tau, Spearman's Rank, and the input Covariance.. All the formulas below are from Fang, Fang, Kotz (2002) Link to paper: www.sciencedirect.com/science/article/pii/S0047259X01920172 The paper gives the following two results, where $\rho$ is the Pearson's Product Moment
$$\tau = \frac{2}{\pi} arcsin ( \rho ) $$ $$ {\rho}_s = \frac{6}{\pi} arcsin ( \frac{\rho}{2} ) $$
We can then use these formulas to extend our method above further to calculate an input Covariance to give any desired Kendall Tau, or Spearman's Rank. I initially thought that they would link the Pearson Product Moment value with Kendall or Spearman's measure, in which case we would still have to use the precomputed table. After testing it I realised that it is actually linking the Covariance to Kendall and Spearman's measures. Thinking about it, Kendall's Tau, and Spearman's Rank are both invariant to the reverse Guassian transformation when moving from $z$ to $u$ in the algorithm. Therefore the problem of deriving an analytic formula for them is much simpler as one only has to link their values for a multivariate Guassian Distribution. Pearson's however does change, therefore it is a completely different problem and may not even have a closed form solution. As an example of how to use the above formula, suppose we'd like our generated data to have a Kendall's Tau of $0.4$. First we need to invert the Kendall's Tau formula: $$ \rho = sin ( \frac{ \tau \pi }{2} ) $$ We then plug in $\rho = 0.4 $ giving:
$$ \rho = sin ( \frac{ o.4 \pi }{2} ) = 0.587785 $$
Giving usan input Covariance value of $0.587785$
We can then test this value with the following R code:
a < library(MASS) library(psych) set.seed(100) m < 2 n < 50000 sigma < matrix(c(1, 0.587785, 0.587785, 1), nrow=2) z < mvrnorm(n,mu=rep(0, m),Sigma=sigma,empirical=T) u < pnorm(z) cor(z,method='kendall') Which we see gives us the value of $\tau$ we want. In this case the difference between the input Covariance $0.587785$, and the value of Kendall's Tau $0.4$ is actually quite significant. It's the second week of your new job Capital Modelling job. After days spent sorting IT issues, getting lost coming back from the toilets, and perfecting your new commute to work (probability of getting a seat + probability of delay * average journey temperature.) your boss has finally given you your first real project to work on. You've been asked to carry out an annual update of the Underwriting Risk Capital Charge for a minor part of the company's Motor book. Not the grandest of analysis you'll admit, this particular class only makes up about 0.2% of the company's Gross Written Premium, and the Actuaries who reserve the company's bigger classes would probably consider the number of decimal places used in the annual report more material than your entire analysis. But you know in your heart of hearts that this is just another stepping stone on your inevitable meteoric rise to Chief Actuary in the Merger and Acquisition department, where one day you will pass judgement on billion dollar deals inbetween expensive lunches with CFOs, and drinks with journalists on glamorous rooftop bars. The company uses inhouse reserving software, but since you're not that familiar with it, and because you want to make a good impression, you decide to carry out extensive checking of the results in Excel. You fire up the Capital Modelling Software (which may or may not have a name that means a house made out of ice), put in your headphones and grind it out. Hours later you emerge triumphant, and you've really nailed it, your choice of correlation (0.4), and correlation method (Gaussian Copula) is perfect. As planned you run extracts of all the outputs, and go about checking them in Excel. But what's this? You set the correlation to be 0.4 in the software, but when you check the correlation yourself in Excel, it's only coming out at 0.384?! What's going on? Simulating using Copulas The above is basically what happened to me (minus most of the actual details. but I did set up some modelling with correlated random variables and then checked it myself in Excel and was surprised to find that the actual correlation in the generated output was always lower than the input.) I looked online but couldn't find anything explaining this phenomenon, so I did some investigating myself. So just to restate the problem, when using Monte Carlo simulation, and generating correlated random variables using the Copula method. When we actually check the correlation of the generated sample, it always has a lower correlation than the correlation we specified when setting up the modelling. My first thought for why this was happening was that were we not running enough simulations and that the correlations would eventually converge if we just jacked up the number of simulations. This is the kind of behaviour you see when using Monte Carlo simulation and not getting the mean or standard deviation expected from the sample. If you just churn through more simulations, your output will eventually converge. When creating Copulas using the Gaussian Method, this is not the case though, and we can test this. I generated the graph below in R to show the actual correlation we get when generating correlated random variables using the Copula method for a range of different numbers of simulations. There does seem to be some sort of loose limiting behaviour, as the number of simulations increases, but the limit appears to be around 0.384 rather than 0.4. The actual explanation First, we need to briefly review the algorithm for generating random variables with a given correlation using the normal copula. Step 1  Simulate from a multivariate normal distribution with the given covariance matrix. Step 2  Apply an inverse gaussian transformation to generate random variables with marginal uniform distribution, but which still maintain a dependency structure Step 3  Apply the marginal distributions we want to the random variables generated in step 2 We can work through these three steps ourselves, and check at each step what the correlation is. The first step is to generate a sample from the multivariate normal. I'll use a correlation of 0.4 though out this example. Here is the R code to generate the sample: a < library(MASS) library(psych) set.seed(100) m < 2 n < 1000 sigma < matrix(c(1, 0.4, 0.4, 1), nrow=2) z < mvrnorm(n,mu=rep(0, m),Sigma=sigma,empirical=T) And here is a Scatterplot of the generated sample from the multivariate normal distribution: We now want to check the product moment correlation of our sample, which we can do using the following code: cor(z,method='pearson') Which gives us the following result: > cor(z,method='pearson') [,1] [,2] [1,] 1.0 0.4 [2,] 0.4 1.0 So we see that the correlation is 0.4 as expected. The Psych package has a useful function which produces a summary showing a Scatterplot, the two marginal distribution, and the correlation: Let us also check Kendall's Tau and Spearman's rank at this point. This will be instructive later on. We can do this using the following code: cor(z,method='spearman') cor(z,method='Kendall') Which gives us the following results: > cor(z,method='spearman') [,1] [,2] [1,] 1.0000000 0.3787886 [2,] 0.3787886 1.0000000 > cor(z,method='kendall') [,1] [,2] [1,] 1.0000000 0.2588952 [2,] 0.2588952 1.0000000 Note that this is less than 0.4 as well, but we will discuss this further later on.
We now need to apply step 2 of the algorithm, which is applying the inverse Gaussian transformation to our multivariate normal distribution. We can do this using the following code:
u < pnorm(z) We now want to check the correlation again, which we can do using the following code: cor(z,method='spearman') Which gives the following result: > cor(z,method='spearman') [,1] [,2] [1,] 1.0000000 0.3787886 [2,] 0.3787886 1.0000000 Here is the Psych summary again: u is now marginally uniform (hence the name). We can see this by looking at the Scatterplot and marginal pdfs above. We also see that the correlation has dropped to 0.379, down from 0.4 at step 1. The Pearson correlation measures the linear correlation between two random variables. We generated normal random variables, which had the required correlation, but then we applied a nonlinear (inverse Gaussian) transformation. This nonlinear step is the source of the dropped correlation in our algorithm. We can also retest Kendall's Tau, and Spearman's at this point using the following code: cor(z,method='spearman') cor(z,method='Kendall') This gives us the following result: > cor(u,method='spearman') [,1] [,2] [1,] 1.0000000 0.3781471 [2,] 0.3781471 1.0000000 > cor(u,method='kendall') [,1] [,2] [1,] 1.0000000 0.2587187 [2,] 0.2587187 1.0000000 Interestingly, these values have not changed from above! i.e. we have preserved these measures of correlation between step 1 and step 2. It's only the Pearson correlation measure (which is a measure of linear correlation) which has not been preserved. Let's now apply the step 3, and once again retest our three correlations. The code to carry out step 3 is below: x1 < qgamma(u[,1],shape=2,scale=1) x2 < qbeta(u[,2],2,2) df < cbind(x1,x2) pairs.panels(df) The summary for step 3 looks like the following. This is the end goal of our method. We see that our two marginal distributions have the required distribution, and we have a correlation between them of 0.37. Let's recheck our three measures of correlation. cor(df,method='pearson') cor(df,meth='spearman') cor(df,method='kendall') > cor(df,method='pearson') x1 x2 x1 1.0000000 0.3666192 x2 0.3666192 1.0000000 > cor(df,meth='spearman') x1 x2 x1 1.0000000 0.3781471 x2 0.3781471 1.0000000 > cor(df,method='kendall') x1 x2 x1 1.0000000 0.2587187 x2 0.2587187 1.0000000 So the Pearson has reduced again at this step, but the Spearman and Kendall's Tau are once again the same.
Does this matter?
This does matter, let's suppose you are carrying out capital modelling and using this method to correlate your risk sources. Then you would be underestimating the correlation between random variables, and therefore potentially underestimating the risk you are modelling. Is this just because we are using a Gaussian Copula? No, this is the case for all Copulas. Is there anything you can do about it? Yes, one solution is to just increase the input correlation by a small amount, until we get the output we want. A more elegant solution would be to build this scaling into the method. The amount of correlation lost at the second step is dependent just on the input value selected, so we could precompute a table of input and output correlations, and then based on the desired output, we would be able to look up the exact input value to use. If you have played around with Correlating Random Variables using a Correlation Matrix in [insert your favourite financial modelling software] then you may have noticed the requirement that the Correlation Matrix be positive semidefinite. But what exactly does this mean? And how would we check this ourselves in VBA or R? Mathematical Definition Let's start with the Mathematical definition. To be honest, it didn't really help me much in understanding what's going on, but it's still useful to know.
A symmetric $n$ x $n$ matrix $M$ is said to be positive semidefinite if the scalar $z^T M z $ is positive for every nonzero column vector $z$ of $n$ real numbers.
If I am remembering my first year Linear Algebra course correctly, then Matrices can be thought of as transformations on Vector Spaces. Here the Vector Space would be a collection of Random Variables. I'm sure there's some clever way in which this gives us some kind of nondegenerate behaviour. After a bit of research online I couldn't really find much. Intuitive Definition The intuitive explanation is much easier to understand. The requirement comes down to the need for internal consistency between the correlations of the Random Variables. For example, suppose we have three Random Variables, A, B, C. Let's suppose that A and B are highly correlated, that is to say, when A is a high value, B is also likely to be a high value. Let's also suppose that A and C are highly correlated, so that if A is a high value, then C is also likely to be a high value. We have now implicitly defined a constraint on the correlation between B and C. If A is high both B and C are also high, so it can't be the case that B and C are negatively correlated, i.e. that when B is high, C is low. Therefore some correlation matrices will give relations which are impossible to model. Alternative characterisations You can find a number of necessary and sufficient conditions for a matrix to be positive definite, I've included some of them below. I used number 2 in the VBA code for a real model I set up to check for positive definiteness. 1. All Eigenvalues are positive. If you have studied some Linear Algebra, then you may not be surprised to learn that there is a characterization using Eigenvalues. It seems like just about anything to do with Matrices can be restated in terms of Eigenvalues. I'm not really sure how to interpret this condition though in an intuitive way. 2. All leading principal minors are all positive This is the method I used to code the VBA algorithm below. The principal minors are just another name for the determinant of the upperleft $k$ by $k$ submatrix. Since VBA has a built in method for returning the determinant of a matrix this was quite an easy method to code. 3. It has a unique Cholesky decomposition I don't really understand this one properly, however I remember reading that Cholesky decomposition is used in the Copula Method when sampling Random Variables, therefore I suspect that this characterisation may be important! Since I couldn't really write much about Cholesky decomposition here is a picture of Cholesky instead, looking quite dapper. All 2x2 matrices are positive semidefinite Since we are dealing with Correlation Matrices, rather than arbitrary Matrices, we can actually show apriori that all 2 x 2 Matrices are positive semidefinite. Proof
Let M be a $2$ x $2$ correlation matrix.
$$M = \begin{bmatrix} 1&a\\ a&1 \end{bmatrix}$$ And let $z$ be the column vector $M = \begin{bmatrix} z_1\\ z_2 \end{bmatrix}$ Then we can calculate $z^T M z$ $$z^T M z = {\begin{bmatrix} z_1\\ z_2 \end{bmatrix}}^T \begin{bmatrix} 1&a\\ a&1 \end{bmatrix} \begin{bmatrix} z_1\\ z_2 \end{bmatrix} $$ Multiplying this out gives us: $$ = {\begin{bmatrix} z_1\\ z_2 \end{bmatrix}}^T \begin{bmatrix} z_1 & a z_2 \\ a z_1 & z_2 \end{bmatrix} = z_1 (z_1 + a z_2) + z_2 (a z_1 + z_2)$$ We can then simplify this to get: $$ = {z_1}^2 + a z_1 z_2 + a z_1 z_2 + {z_2}^2 = (z_1 + a z_2)^2 \geq 0$$ Which gives us the required result. This result is consistent with our intuitive explanation above, we need our Correlation Matrix to be positive semidefinite so that the correlations between any three random variables are internally consistent. Obviously, if we only have two random variables, then this is trivially true, so we can define any correlation between two random variables that we like. Not all 3x3 matrices are positive semidefinite The 3x3 case, is simple enough that we can derive explicit conditions. We do this using the second characterisation, that all principal minors must be greater than or equal to 0.
Demonstration
Let M be a $3$ x $3$ correlation matrix: $$M = \begin{bmatrix} 1&a&b\\ a&1&c \\ b&c&1 \end{bmatrix}$$ We first check the determinant of the $2$ x $2$ sub matrix. We need that: $ \begin{vmatrix} 1 & a \\ a & 1 \end{vmatrix} \geq 0 $ By definition: $ \begin{vmatrix} 1 & a \\ a & 1 \end{vmatrix} = 1  a^2$ We have that $  a  \leq 1 $, hence $  a^2  \leq 1 $, and therefore: $  1 a^2  \geq 0 $ Therefore the determinant of the $2$ x $2$ principal submatrix is always positive. Now to check the full $3$ x $3$. We require: $ \begin{vmatrix} 1 & a & b \\ a & 1 & c \\ b & c & 1 \end{vmatrix} \geq 0 $ By definition: $ \begin{vmatrix} 1 & a & b \\ a & 1 & c \\ b & c & 1 \end{vmatrix} = 1 ( 1  c^2)  a (a  bc) + b(ac  b) = 1 + 2abc  a^2  b^2  c^2 $ Therefore in order for a $3$ x $3$ matrix to be positive demidefinite we require: $a^2 + b^2 + c^2  2abc = 1 $ I created a 3d plot in R of this condition over the range [0,1].
It's a little hard to see, but the way to read this graph is that the YZ Correlation can take any value below the surface. So for example, when the XY Corr is 1, and the XZ Corr is 0, the YZ Corr has to be 0. When the XY Corr is 0 on the other hand, and XZ Corr is also 0, then the YZ Corr can be any value between 0 and 1. Checking that a Matrix is positive semidefinite using VBA When I needed to code a check for positivedefiniteness in VBA I couldn't find anything online, so I had to write my own code. It makes use of the excel determinant function, and the second characterization mentioned above. Note that we only need to start with the 3x3 sub matrix as we know from above that all 1x1 and all 2x2 determinants are positive. This is not a very efficient algorithm, but it works and it's quite easy to follow. Function CheckCorrMatrixPositiveDefinite() Dim vMatrixRange As Variant Dim vSubMatrix As Variant Dim iSubMatrixSize As Integer Dim iRow As Integer Dim iCol As Integer Dim bIsPositiveDefinite As Boolean bIsPositiveDefinite = True vMatrixRange = Range(Range("StartCorr"), Range("StartCorr").Offset(inumberofrisksources  1, inumberofrisksources  1)) ' Only need to check matrices greater than size 2 as determinant always greater than 0 when less than or equal to size 2' If iNumberOfRiskSources > 2 Then For iSubMatrixSize = iNumberOfRiskSources To 3 Step 1 ReDim vSubMatrix(iSubMatrixSize  1, iSubMatrixSize  1) For iRow = 1 To iSubMatrixSize For iCol = 1 To iSubMatrixSize vSubMatrix(iRow  1, iCol  1) = vMatrixRange(iRow, iCol) Next Next 'If the determinant of the matrix is 0, then the matrix is semipositive definite' If Application.WorksheetFunction.MDeterm(vSubMatrix) < 0 Then CheckCorrMatrixisPositiveDefinite = False bIsPositiveDefinite = False End If Next End If If bIsPositiveDefinite = True Then CheckCorrMatrixPositiveDefinite = True Else CheckCorrMatrixPositiveDefinite = False End If End Function Checking that a Matrix is positive semidefinite in R Let's suppose that instead of VBA you were using an actually user friendly language like R. What does the code look like then to check that a matrix is positive semidefinite? All we need to do is install a package called 'Matrixcalc', and then we can use the following code: is.positive.definite( Matrix ) That's right, we needed to code up our own algorithm in VBA, whereas with R we can do the whole thing in one line using a built in function! It goes to show that the choice of language can massively effect how easy a task is. Was the Lognormal Distribution misnamed?20/2/2018 I was thinking about this last week at work when I was coding part of a model involving the parameters of a truncated lognormal distribution. The lognormal distribution definitely feels like it was named the wrong way round. What is a Lognormal Distribution?
We say that a Random Variable $X$ has a LogNormal Distribution that is:
$$ X \sim LogN( \mu , { \sigma }^2 ) $$ if: $$ Log (X) \sim N( \mu , { \sigma }^2 ) $$In other words, a Lognormal distribution is a distribution such that the log of the distribution is a normal distribution. It is not, as you might think, a distribution which is the log of the normal distribution. So if $Y \sim N( \mu , {\sigma}^2 ) $ then $Log ( Y ) $ is not a lognormal distribution, instead $ e ^ Y $ is a lognormal distribution. So to create a lognormal distribution, we don't take the log of the normal distribution, we take the exponential! Why does this matter?
Definitions are just definitions after all, and as long as everyone knows how something is defined and there is no ambiguity one definition is usually as good as another. In this case though, defining it in this way does have some ugly and unnatural consequences. For example, if we take the result that the sum of two independent normal distributions is also a normal distribution, i.e.
If: $$ X \sim N( {\mu}_1 , {{\sigma}_1}^2 ) , Y \sim N( {\mu}_2 , {{\sigma}_2}^2 ) $$ Then: $$ X + Y \sim N( {\mu}_1 + {\mu}_2 , {{\sigma}_1}^2 + {{\sigma}_2}^2 ) $$ Then applying this result to the lognormal distribution, we get: If $ X \sim LogN( {\mu}_1 , {{\sigma}_1}^2 ) $ and $ Y \sim LogN( {\mu}_2 , {{\sigma}_2}^2 ) $ assuming independence, Then:$$ XY \sim LogN( {\mu}_1 + {\mu}_2 , {{\sigma}_1}^2 + {{\sigma}_2}^2 ) $$ Maybe this doesn't look too bad to you. But what if I replace $X$ and $Y$ with ${LogN}_1$ and ${LogN}_2$? Then we get: $$ {LogN}_1 * {LogN}_2 \sim LogN( {\mu}_1 + {\mu}_2 , {{\sigma}_1}^2 + {{\sigma}_2}^2 ) $$ This should definitely look wrong to you! Remember that for a standard logarithm: $$ Log (AB) = Log(A) + Log(B) $$ Instead we have an identity that looks much more like an exponential:$$ e^A * e^B = e^{ (A + B ) } $$ And that's precisely because we are dealing with an exponential! The lognormal distribution is simply the exponential of the normal, which is a much more natural way of phrasing it than to say that the lognormal distribution is a distribution such that the logarithm of the distribution is a normal distribution. So we have two reasons why the Lognormal Distbribution should have been called the Exponential Normal Distribution (Or possibly the XNormal Distribution for short). The identity above makes perfect sense when using exponentials, and we would have a naming convention that is much more natural. It is quite simple to calculate the Reinstatement Premium resulting from a loss to an Excess of Loss contract. Therefore, it seems reasonable that we should be able to come up with a simple formula relating the price charged for the Excess of Loss contract to the price charged for the Reinstatement Premium Protection (RPP) cover. I was in a meeting last week with two brokers who were trying to do just this. We had come up with an indicative price for an XoL layer and we were trying to use this to price the equivalent RPP cover. At the time I didn't have an easy way to do it, and when I did a quick Google search nothing came up. Upon further reflection, there are a couple of easy approximate methods we can use. Below I discuss three different methods which can be used to price an RPP cover, two of which do not require any stochastic modelling. Let's quickly review a few definitions, feel free to skip this section if you just want the formula. What is a Reinstatement Premium? A majority of Excess of Loss contracts will have some form of reinstatement premium. This is a payment from the Insurer to the Reinsurer to reinstate the protection in the event some of the limit is eroded. In the London market, most contracts will have either $1$, $2$, or $3$ reinstatements and generally these will be payable at $100 \%$. From the point of view of the insurer, this additional payment comes at the worst possible time, the Reinsured is being asked to fork over another large premium to the Reinsurer just after having suffered a loss. What is a Reinstatement Premium Protection (RPP)? Reinsurers developed a product called a Reinstatement Premium Protection cover (RPP cover). This cover pays the Reinsured's Reinstatement Premium for them, giveing the insurer further indemnification in the event of a loss. Here's an example of how it works in practice: Let's suppose we are considering a $5m$ xs $5m$ Excess of Loss contract, there is one reinstatement at $100 \%$ (written $1$ @ $100 \%$), and the Rate on Line is $25 \%$. The Rate on Line is just the Premium divided by the Limit. So the Premium can be found by multiplying the Limit and the RoL: $$5m* 25 \% = 1.25m$$ So we see that the Insurer will have to pay the Reinsurer $1.25m$ at the start of the contract. Now let's suppose there is a loss of $7m$. The Insurer will recover $2m$ from the Resinsurer, but they will also have to make a payment to cover the reinstatement premium of: $\frac {2m} {5m} * (5m * 25 \% ) = 2m * 25 \% = 0.5m$ to reinstate the cover. So the Insurer will actually have to pay out $5.5m$. The RPP cover, if purchased by the insurer, would pay the additional $0.5m$ on behalf of the insurer, in exchange for a further upfront premium. Now that we know how it works, how would we price the RPP cover? Three methods for pricing an RPP cover Method 1  Full stochastic model If we have priced the original Excess of Loss layer ourselves using a Monte Carlo model, then it should be relatively straight forward to price the RPP cover. We can just look at the expected Reinstatements, and apply a suitable loading for profit and expenses. This loading will probably be broadly in line with the loading that is applied to the expected losses to the Excess of Loss layer, but accounting for the fact that the writer of the RPP cover will not receive any form of Reinstatement for their Reinsurance. What if we do not have a stochastic model set up to price the Excess of Loss layer? What if all we know is the price being charged for the Excess of Loss layer? Method 2  Simple formula Here is a simple formula we can use which gives the price to charge for an RPP, based on just the deposit premium and the Rate on Line, full derivation below: $$RPP = DP * ROL $$ When attempting to price the RPP last week, I did not have a stochastic model set up. We had come up with the pricing just based off the burning cost and a couple of 'commercial adjustments'. The brokers wanted to use this to come up with the price for the RPP cover. The two should be related, as they pay out dependant on the same underlying losses. So what can we say? If we denote the Expected Losses to the layer by $EL$, then the Expected Reinstatement Premium should be: $$EL * ROL $$ To see this is the case, I used the following reasoning; if we had losses in one year equal to the $EL$ (I'm talking about actual losses, not expected losses here), then the Reinstatement Premium for that year would be the proportion of the layer which had been exhausted $\frac {EL} {Limit} $ multiplied by the Deposit Premium $Limit * ROL$ i.e.: $$ RPP = \frac{EL} {Limit} * Limit * ROL = EL * ROL$$ Great! So we have our formula right? The issue now is that we don't know what the $EL$ is. We do however know the $ROL$, does this help? If we let $DP$ denote the deposit premium, which is the amount we initially pay for the Excess of Loss layer and we assume that we are dealing with a working layer, then we can assume that: $$DP = EL * (1 + \text{ Profit and Expense Loading } ) $$ Plugging this into our formula above, we can then conclude that the expected Reinstatement Premiums will be: $$\frac {DP} { \text{ Profit and Expense Loading } } * ROL $$ In order to turn this into a price (which we will denote $RPP$) rather than an expected loss, we then need to load our formula for profit and expenses i.e. $$RPP = \frac {DP} {\text{ Profit and Expense Loading }} * ROL * ( \text{ Profit and Expense Loading } ) $$Which with cancellation gives us: $$RPP = DP * ROL $$ Which is our first very simple formula for the price that should be charged for an RPP. Was there anything we missed out though in our analysis? Method 3  A more complicated formula: There is one subtlety we glossed over in order to get our simple formula. The writer of the Excess of Loss layer will also receive the Reinstatement Premiums during the course of the contract. The writer of the RPP cover on the other hand, will not receive any reinstatement premiums (or anything equivalent to a reinstatement premium). Therefore, when comparing the Premium charged for an Excess of Loss layer against the Premium charged for the equivalent RPP layer, we should actually consider the total expected Premium for the Excess of Loss Layer rather than just the Deposit Premium. What will the additional premium be? We already have a formula for the expected Reinstatement premium: $$EL * ROL $$ Therefore the total expected premium for the Excess of Loss Layer is the Deposit Premium plus the additional Premium: $$ DP + EL * ROL $$ This total expected premium is charged in exchange for an expected loss of $EL$. So at this point we know the Total Expected Premium for the Excess of Loss contract, and we can relate the expected loss to the Excess of Loss layer to the Expected Loss to the RPP contract. i.e. For an expected loss to the RPP of $EL * ROL$, we would actually expect an equivalent premium for the RPP to be: $$ RPP = (DP + EL * ROL) * ROL $$ This formula is already loaded for Profit and Expenses, as it is based on the total premium charged for the Excess of Loss contract. It does however still contain the $EL$ as one of its terms which we do not know. We have two choices at this point. We can either come up with an assumption for the profit and expense loading (which in this hard market might be as little as only be $5 \%  10 \%$ ). And then replace $EL$ with a scaled down $DP$: $$RPP = \frac{DP} {1.075} * ( 1 + ROL) * ROL $$ Or we could simply replace the $EL$ with the $DP$, which is partially justified by the fact that the $EL$ is only used to multiply the $ROL$, and will therefore have a relatively small impact on the result. Giving us the following formula: $$RPP = DP ( 1 + ROL) * ROL $$ Which of the three methods is the best? The full stochastic model is always going to be the most accurate in my opinion. If we do not have access to one though, then out of the two formulas, the more complicated formula we derived should be more accurate (by which I mean more actuarially correct). If I was doing this in practice, I would probably calculate both, to generate some sort of range, but tend towards the second formula. That being said, when I compared the prices that the Brokers had come up with, which is based on what they thought they could actually place in the market, against my formulas, I found that the simple version of the formula was actually closer to the Broker's estimate of how much these contacts could be placed for in the market. Since the simple formula always comes out with a lower price than the more complicated formula, this suggests that there is a tendency for RPPs to be underpriced in the market. This systematic underpricing may be driven by commercial considerations rather than faulty reasoning on the part of market participants. According to the Broker I was discussing these contracts with, a common reason for placing an RPP is to give a Reinsurer who does not currently have a line on the underlying Excess of Loss layer, but who would like to start writing it, a chance to have an involvement in the same risk, without diminishing the signed lines for the existing markets. So let's say that Reinsurer A writes $100 \%$ of the Excess of Loss contract, and Reinsurer B would like to take a line on the contract. The only way to give them a line on the Excess of Loss contract is to reduce the line that Reinsurer A has. The insurer may not wish to do this though if Reinsurer A is keen to maintain their line. So the Insurer may allow Reinsurer B to write the RPP cover instead, and leave Reinsurer A with $100 \%$ of the Excess of Loss contract. This commercial factor may be one of the reasons that traditionally writers of an RPP would be inclined to give favourable terms relative to the Excess of Loss layer so as to encourage the insurer to allow them on to the main programme and to encourage them to allow them to wrte the RPP cover at all. Moral Hazard One point that is quite interesting to note about how these deals are structured is that RPP covers can have quite a significant moral hazard effect on the Insurer. The existence of Reinstatement Premiums is at least partially a mechanism to prevent moral hazard on the part of the Insurer. To see why this is the case, let's go back to our example of the $5m$ xs $5m$ layer. An insurer who purchases this layer is now exposed to the first $5m$ of any loss. But they are indemnified for the portion of the loss above $5m$, up to a limit of $5m$. If the insurer is presented with two risks which are seeking insurance  one with a total sum insured of $10m$, and another with a total sum insured of $6m$, the net retained exposure is the same for both risks from the point of view of the insurer. By including a reinstatement premium as part of the Excess of Loss layer, an therefore ensuring that the insurer has to make a payment any time a loss ceded to the layer, the reinsurer is ensuring that the insurer keeps their financial incentive to not have losses in this range. By purchasing an RPP cover, the insurer is removing their financial interest in losses which are ceded to the layer. There is an interesting conflict of interest in that the RPP cover will almost always be written by a different reinsurer to the Excess of Loss layer. The Reinsurer that is writing the RPP cover is therefore increasing the moral hazard risk whichever Reinsurer has written the Excess of Loss layer. Which will almost always be business written by one of the Reinsurer's competitors! Working Layers and unlimited Reinstatements Another point to note is that this pricing analysis makes a couple of implicit assumptions. The first is that there is a sensible relationship between the expected loss to the layer and the premium charged for the layer. This will normally only be the case for 'working layers'. These are layers to which a reasonable amount of loss activity is expected. If we are dealing with clash or other higher layers, then the pricing of these layers will be more heavily driven by considerations beyond the expected loss to the layer. These might be capital considerations on the part of the Reinsurer, commercial considerations such as Another implicit assumption in this analysis is that the reinstatements offered are unlimited,. If this is not the case, then the statement that the expected reinstatement is $EL * ROL$ no longer holds. If we have limited reinstatements (which is the case in practice most of the time) then we would expect the expected reinstatement to be less than or equal to this. Compound Poisson Loss Model in VBA13/12/2017 I was attempting to set up a Loss Model in VBA at work yesterday. The model was a CompoundPoisson FrequencySeverity model, where the number of events is simulated from a Poisson distribution, and the Severity of events is simulated from a Severity curve. There are a couple of issues you naturally come across when writing this kind of model in VBA. Firstly, the inbuilt array methods are pretty useless, in particular dynamically resizing an array is not easy, and therefore when initialising each array it's easier to come up with an upper bound on the size of the array at the beginning of the program and then not have to amend the array size later on. Secondly, Excel has quite a low memory limit compared to the total available memory. This is made worse by the fact that we are still using 32bit Office on most of our computers (for compatibility reasons) which has even lower limits. This memory limit is the reason we've all seen the annoying 'Out of Memory' error, forcing you to close Excel completely and reopen it in order to run a macro. The output of the VBA model was going to be a YLT (Yearly Loss Table), which could then easily be pasted into another model. Here is an example of a YLT with some made up numbers to give you an idea: It is much quicker in VBA to create the entire YLT in VBA and then paste it to Excel at the end, rather than pasting one row at a time to Excel. Especially since we would normally run between 10,000 and 50,000 simulations when carrying out a Monte Carlo Simulation. We therefore need to create and store an array when running the program with enough rows for the total number of losses across all simulations, but we won't know how many losses we will have until we actually simulate them. And this is where we come across our main problem. We need to come up with an upper bound for the size of this array due to the issues with dynamically resizing arrays, but since this is going to be a massive array, we want the upper bound to be as small as possible so as to reduce the chance of a memory overflow error. Upper Bound What we need then is an upper bound on the total number of losses across all the simulations years. Let us denote our Frequency Distribution by $N_i$, and the number of Simulations by $n$. We know that $N_i$ ~ $ Poi( \lambda ) \: \forall i$. Lets denote the total size of the YLT array by $T$. We know that $T$ is going to be: $$T = \sum_{1}^{n} N_i$$ We now use the result that the sum of two independent Poisson distributions is also a Poisson distribution with parameter equal to the sum of the two parameters. That is, if $X$ ~ $Poi( \lambda)$ , and $Y$ ~ $Poi( \mu)$, then $X + Y$ ~ $Poi( \lambda + \mu)$. By induction this result can then be extended to any finite sum of independent Poisson Distributions. Allowing us to rewrite $T$ as: $$ T \sim Poi( n \lambda ) $$ We now use another result, a Poisson Distribution approaches a Normally Distribution as $ \lambda \to \infty $. In this case, $ n \lambda $ is certainly large, as $n$ is going to be set to be at least $10,000$. We can therefore say that: $$ T \sim N ( n \lambda , n \lambda ) $$ Remember that $T$ is the distribution of the total number of losses in the YLT, and that we are interested in coming up with an upper bound for $T$. Let's say we are willing to accept a probabilistic upper bound. If our upper bound works 1 in 1,000,000 times, then we are happy to base our program on it. If this were the case, even if we had a team of 20 people, running the program 10 times a day each, the probability of the program failing even once in an entire year is only 4%. I then calculated the $Z$ values for a range of probabilities, where $Z$ is the unit Normal Distribution, in particular, I included the 1 in 1,000,000 Z value. We then need to convert our requirement on $T$ to an equivalent requirement on $Z$. $$ P ( T \leq x ) = p $$ If we now adjust $T$ so that it can be replaced with a standard Normal Distribution, we get: $$P \left( \frac {T  n \lambda} { \sqrt{ n \lambda } } \leq \frac {x  n \lambda} { \sqrt{ n \lambda } } \right) = p $$
Now replacing the left hand side with $Z$ gives:
$$P \left( Z \leq \frac {x  n \lambda} { \sqrt{ n \lambda } } \right) = p $$
Hence, our upper bound is given by:
$$T \lessapprox Z \sqrt{n \lambda} + n \lambda $$
Dividing through by $n \lambda $ converts this to an upper bound on the factor above the mean of the distribution. Giving us the following:
$$ T \lessapprox Z \frac {1} { \sqrt{n \lambda}} + 1 $$
We can see that given $n \lambda$ is expected to be very large and the $Z$ values relatively modest, this bound is actually very tight.
For example, if we assume that $n = 50,000$, and $\lambda = 3$, then we have the following bounds: So we see that even at the 1 in 1,000,000 level, we only need to set the YLT array size to be 1.2% above the mean in order to not have any overflow errors on our array. References (1) Proof that the sum of two independent Poisson Distributions is another Poisson Distribution math.stackexchange.com/questions/221078/poissondistributionofsumoftworandomindependentvariablesxy
(2) Normal Approximation to the Poisson Distribution.
stats.stackexchange.com/questions/83283/normalapproximationtothepoissondistribution 
AuthorI work as a pricing actuary at a reinsurer in London. Categories
All
Archives
September 2020
