Just to put some concrete numbers to this, let’s say the 9m xs 1m costs \$10m.

The xs 1m severity curve was as follows:

Which we can convert into a survival function for xs 4m as follows:

**Pricing the layer**

We can reason as follows: we are interested in

$$ \text{Price}_{\text{6 xs 4}} = \text{(freq xs 4)(average loss to layer s.t. loss xs 4)/(loss ratio)} $$

And we know the following:

$$ \text{Price}_{\text{9 xs 1}} = \text{(freq xs 1)(average loss to layer s.t. loss xs 1)/(loss ratio)} = 10m$$

Solving for freq xs 1:

$$ \text{(freq xs 1)} = \frac{10m}{\text{(average loss to layer s.t. loss xs 1)/(loss ratio)}} $$

And then noting that $ \text{freq xs 4} = \text{freq xs 1} * S(4) = \text{freq xs 1} * 22\% $, which when combined with the above gives us:

$$\text{Price}_{\text{6 xs 4}} = \frac{10m}{\text{(average loss to layer s.t. loss xs 1)}/\text{(loss ratio)}} * 22\% * \text{(average loss to layer s.t. loss xs 4)} / \text{(loss ratio)} $$

And then, rearranging and cancelling the loss ratios:

$$\text{Price}_{\text{6 xs 4}} = 10m * 22\% \frac{\text{(average loss to layer s.t. loss xs 4)} }{\text{(average loss to layer s.t. loss xs 1)}}$$

i.e. the price for the 6 xs 4 is the price for the 9 xs 1 multiplied by the probability of a loss exceeding 4 (given it exceeds 1), scaled by the ratio of the average loss into each layer. In this case, we have:

$$\text{Price}_{\text{6 xs 4}} = 10m* 22\% \frac{2,136,364}{1,490,000} = 3,154,362$$
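As a quick sanity check, the calculation can be reproduced in a few lines of Python, using the figures quoted above (these are the example's assumed inputs, not market data):

```python
# Figures from the worked example above (all assumed)
price_9xs1 = 10_000_000        # price of the 9 xs 1 layer
s_4 = 0.22                     # P(loss > 4m | loss > 1m) from the survival function
avg_loss_6xs4 = 2_136_364      # average loss to the 6 xs 4 layer, given a loss xs 4
avg_loss_9xs1 = 1_490_000      # average loss to the 9 xs 1 layer, given a loss xs 1

# The loss ratios cancel, so only these four inputs are needed
price_6xs4 = price_9xs1 * s_4 * avg_loss_6xs4 / avg_loss_9xs1
print(f"{price_6xs4:,.0f}")
```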

So if you guess wrong on the first flip, you just get the \$2. If you guess wrong on the second flip you get \$4, and if you get it wrong on the 10th flip you get \$1024.

Knowing this, how much would you pay to enter this game?

You're guaranteed to win at least \$2, so you'd obviously pay at least \$2. There is a 50% chance you'll win \$4, a 25% chance you'll win \$8, a 12.5% chance you'll win \$16, and so on. Knowing this, maybe you'd pay \$5 to play – you'll probably lose money, but there's a decent chance you'll make quite a bit more than \$5.

Perhaps you take a more mathematical approach than this. You might reason as follows – ‘I’m a rational person, therefore, as any good rational person should, I will calculate the expected value of playing the game; this is the maximum I should be willing to pay to play’. This, however, is the crux of the problem and the source of the paradox: most people do not really value the game that highly – when asked, they’d pay somewhere between \$2 and \$10 to play it, and yet the expected value of the game is infinite....

Source: https://unsplash.com/@pujalin

The above is a lovely photo I found of St Petersburg. The reason the paradox is named after St Petersburg actually has nothing to do with the game itself, but is due to an early article published by Daniel Bernoulli in a St Petersburg journal. As an aside, having just finished the book A Gentleman in Moscow by Amor Towles (which I loved and would thoroughly recommend) I'm curious to visit Moscow and St Petersburg one day.

How would we assess this contract from an actuarial perspective?

In order to price the contract we are going to have to introduce a couple of new ways of looking at it, as we noted above simply calculating the expected value is of limited value as this suggests we should assign an infinite value to the game.

First, let’s think in terms of credit risk: how much money do we think the counterparty is good for? Let’s say that we have no idea how much our counterparty can afford, but as a general principle we can certainly say that it will be less than the total amount of money in the world. Carrying out some ‘desk research’, i.e. googling the question, reveals that there is of the order of \$90tr of ‘broad money’ in existence globally [1]. This includes not just hard currency such as coins and notes, but also money held in savings accounts and invested in money markets.

Using this as our upper-limit payout, how many flips would be needed to reach \$90tr? Well, $\log_2(90\text{tr}) = \log_2(9 \times 10^{13}) \approx 46.4$. So any winnings which involve more than 46 correct flips involve being paid more than the total amount of money in the world. Let’s therefore use this as our maximum winnings. Using this, instead of our expected value being:

$$E[X] = \sum_{i=1}^{\infty} 2^i \left( \frac{1}{2} \right)^i = \sum_{i=1}^{\infty} 1 = \infty $$

It becomes the following:

$$E[X] = \sum_{i=1}^{46} 2^i \left( \frac{1}{2} \right)^i = \sum_{i=1}^{46} 1 = 46$$

All of a sudden our expected value is very much finite, and is not even that big! Paying more than \$46 to play the game suddenly seems foolish.

Now that we have looked at the contract and decided that in practice it does not have an infinite expected value, we can use the amended version with finite expectation to also calculate the standard deviation of our winnings. Using the usual formula for standard deviation, and the version of the game limited to 46 flips, we arrive at a value of 11,863,283, or equivalently a coefficient of variation of 257,897 – ridiculously high! With actuarial contracts, we might think that a modelled CV of about 0.3 is high, let alone multiple hundreds of thousands. This EV of \$46 is about as far from a sure thing as you can get, which suggests that we should pay a lot less than \$46 to account for the volatility.
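These figures can all be checked directly with a short Python sketch of the capped game, using the payout structure and the \$90tr cap described above:

```python
import math

# Cap the game at the number of flips needed to exhaust ~$90tr of broad money
n = math.floor(math.log2(90e12))          # = 46 flips

# Payout 2^i with probability (1/2)^i for i = 1..n
probs = [0.5 ** i for i in range(1, n + 1)]
payouts = [2.0 ** i for i in range(1, n + 1)]

ev = sum(p * x for p, x in zip(probs, payouts))    # every term is exactly 1
ev2 = sum(p * x ** 2 for p, x in zip(probs, payouts))
sd = math.sqrt(ev2 - ev ** 2)                      # standard deviation
cv = sd / ev                                       # coefficient of variation

print(round(ev), round(sd), round(cv))
```

This reproduces the EV of 46, the standard deviation of 11,863,283 and the CV of roughly 257,897 quoted above.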

If someone put a gun to my head, what value would I put on this contract? As a very quick metric, we could price it by ignoring any upside beyond the 1-in-200 level, which means all my upside comes from the first 8 flips. This suggests I might pay \$8 to play it, which feels about right to me.

Here is another variation of the game which I thought might be interesting.

Suppose instead of playing the game once, you paid an upfront amount to play the game for one hour straight, using a computer to flip a virtual coin. In order to insert some realism into this, we are going to have to define how quickly we can play the game, here is one scenario which is slightly arbitrary but provides one benchmark.

What if we can play the game roughly as many times as the biggest high-frequency trading firm trades in one hour. I thought this would be an interesting scenario, suppose the ability to play this game was auctioned and firms could play it a similar number of times as some of the bigger high frequency traders trade in a given day, what would this look like?

Based on the following WSJ article [3], the largest HFT firm is someone called Virtu Financial (never heard of them to be honest!). Exactly how many trades they make per day is not public record, but the article estimates it’s of the order of 3 million per day, or approx. 462,000 trades an hour based on the standard opening hours of the NYSE. I would have guessed it was more than this, but I guess their operating model is more about the speed of trades rather than the volume?

Next we are going to need to simulate the game, I used the following code in Python. I’m sure other people can come up with a faster implementation than me, but it seems to serve my purpose. My computer can play around 500k games per second, which means I can simulate an hour of ‘trading’ in one second which makes it quite easy to run.

```python
import numpy as np
from random import random

games = 462_000                    # roughly one hour of play at 'HFT speed'
inc = np.zeros((games, 40))        # incremental prize recorded at each flip
cml = np.zeros(games)              # cumulative winnings after each game

for play in range(games):
    flip = 1
    prize = 2
    while True:
        if random() < 0.5:         # correct guess: record the prize and double it
            if flip < 40:          # guard the (astronomically unlikely) 40+ streak
                inc[play, flip] = prize
            prize *= 2
            flip += 1
        else:                      # wrong guess: bank the prize and stop
            cml[play] = cml[play - 1] + prize
            break
```

I wanted to see what this would look like in real life, so I set up 500 simulations of one-hour slots of playing the game, this time using the code below.

```python
import numpy as np
import matplotlib.pyplot as plt
from random import random

simulations = 500
games = 462_000
cml = np.zeros((games, simulations))   # cumulative winnings, one column per simulation

for sim in range(simulations):
    for play in range(games):
        prize = 2
        while True:
            if random() < 0.5:         # correct guess: prize doubles
                prize *= 2
            else:                      # wrong guess: bank the prize and stop
                cml[play, sim] = cml[play - 1, sim] + prize
                break

plt.style.use('seaborn-whitegrid')
plt.title(f"{simulations} simulations of cumulative winnings in one hour of playing")
plt.xlabel("Game number")
plt.ylabel("Cumulative winnings")
plt.plot(cml)
plt.show()
```

I then created a chart showing the evolution of winnings over time. Each individual line represents a simulation of playing the game for an hour. As we move from left to right, we see the cumulative winnings build up over the ~462,000 games we can fit into an hour.

We can see from the chart that most simulations give a fairly modest and stable cumulative payout. There is one outlier that ended up returning a massive \$1tn+, which we'd expect given the volatility. Most of the other simulations ended up around the \$100m range.

**Another version of the game**

The version of the game presented by Daniel Bernoulli is not even the ‘worst’ version one could come up with which still has an infinite expected value. Here is a variation with an even slower rate of divergence.

Suppose we are still flipping a coin at each trial, but instead of the payout being $2^i$, it is instead, $\frac{1}{i+1} 2^i$. This value has been deliberately selected so that the expected value becomes:

$$E[X] = \sum_{i=1}^{\infty} \frac{1}{2^i} \frac{1}{(i+1)} 2^i = \sum_{i=1}^{\infty} \frac{1}{i+1}= \infty$$

This is the famous divergent series from a first-year calculus class: the harmonic series. Even though the summands tend to 0, the sum still diverges, i.e. sums to infinity.

Applying the same constraint as above, whereby the maximum payout is \$90tr, the expected value of such a game is now tiny:

$$E[X] = \sum_{i=1}^{46} \frac{1}{2^i} \frac{1}{(i+1)} 2^i = \sum_{i=1}^{46} \frac{1}{i+1} \approx 3.4 $$

So we’d pay even less to play this version than the original – somewhere under \$3.50 – yet it still has infinite expected value.
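Evaluating the truncated sum directly (same 46-flip cap as before) confirms the figure:

```python
# Truncated expected value of the slow-divergence variant: the payout
# (1/(i+1)) * 2^i and probability (1/2)^i leave 1/(i+1) per term
ev = sum(1 / (i + 1) for i in range(1, 47))
print(round(ev, 2))   # ≈ 3.44
```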

References:

[1] www.marketwatch.com/story/this-is-how-much-money-exists-in-the-entire-world-in-one-chart-2015-12-18

[2] mathworld.wolfram.com/HarmonicSeries.html

[3] https://online.wsj.com/public/resources/documents/VirtuOverview.pdf

Nassim Taleb meanwhile, the very guy who brought the term ‘black swan’ into popular consciousness, has stated that what we are dealing with at the moment isn’t even a black swan!

So what’s going on here? And who is right?

From the insurance insider:

Glaser does not elaborate on exactly what his definition of a black swan is (which of course you wouldn't really expect him to), but parsing his statement, he believes the virus itself and then the subsequent government actions each represent a separate black swan. Moreover, he makes the claim that this is the first time that two black swans have ever occurred at the same time. Ever!

Hmmm… can we think of two events which could plausibly count as black swans and which occurred at the same time? How about these little ones:

- Spanish Flu (Spring 1918-Summer 1919) and WW1 (Summer 1914 - Nov 1918)

The Spanish Flu – which was even more deadly and disruptive than Covid – occurred during WW1 – one of the most (if not the most?) significant events in modern history. Moreover, both are more obvious candidates for ‘black swan-ship’ given their almost unprecedented nature at the time. The current crisis is less significant than both of the above, and also much more predictable.

Since the two events at the moment – the coronavirus itself and the lockdown – are related, Glaser's statement doesn't even require two unrelated events to occur simultaneously. Here are a couple of other examples off the top of my head:

- The 2011 Japanese tsunami and the Fukushima nuclear disaster
- Pearl Harbour and the war in Europe (WW2)

Okay, so Mr. Glaser may have overreached with his ‘two at once as unprecedented’. Let’s return to the original question of whether this is even a black swan.

Nassim Taleb, according to the following New Yorker article

So clearly Taleb does not believe it is a black swan, let’s not consider this a slam dunk yet though – Taleb is famously provocative and contrarian (which is what makes him such an entertaining writer), so he might just be saying this to get a rise and to attempt to assert some ownership over his idea.

In order to decide who is right, we’re going to have to look at the actual definition of a black swan. It seems like everyone is throwing around the term and meaning slightly different things by it.

Let’s play a game, I’ll give you four options and you tell me which is the correct definition:

A black swan is:

- An outlier, as it lies outside the realm of regular expectations, because nothing in the past can convincingly point to its possibility.
- It carries an extreme ‘impact’
- Explanations for its occurrence are concocted after the fact, making it seem explainable and predictable

A black swan is:

- An event with probabilities that are not computable beforehand
- It plays a major role
- Arise from an incomplete assessment of tail risk

A black swan is:

- A ‘tail event’

A black swan is:

- An event which has a significant impact when it occurs.
- which is an ‘unknown, unknown’ (to use Donald Rumsfeld’s famous turn of phrase)

Okay here are the answers (which you can probably guess):

Note that even 1 and 2 are slightly different; for example, definition 1 has the strange third condition that the event be accompanied by a bias to treat it as explainable in hindsight. This tendency seems more like an empirical statement about human psychology than an inherent property of any given event. It's certainly an important feature of how black swans are mismanaged, but I don't think it should form part of the necessary and sufficient definition. In fact, Taleb drops it in the second definition, and in The Black Swan even uses the example of a turkey being killed at Christmas as a black swan event from the perspective of the turkey. I certainly don’t think a dead turkey is going to ex post facto rationalise the event!

Glaser seems to be using something closest to definition 3; Taleb of course would use some variation of 1 or 2, as these come directly from him.

An important aspect of being a black swan, as I kind of highlighted above, is the event needs to be fundamentally unpredictable:

From definition 1:

“nothing in the past can convincingly point to its possibility”

From definition 2:

“their probabilities are not computable”

The coronavirus, while a tail event, was basically modellable.

For example:

- People now point to the Spanish Flu as a precedent (Covid is not even as far into the tail as the Spanish Flu), so to predict Covid we could certainly extrapolate from past events.
- Bill Gates in 2015 listed this very scenario as one of the biggest risks to humanity
- Taleb has a section in Silent Risk on epidemics and the implied tail risk
- There have been multiple epidemics/pandemics, etc for most of human history. Most recently, Ebola, AIDS, SARS, MERS.
- You could buy insurance for this very event (communicable disease cover)
- RMS has a pandemic cat model
- The WHO purchased pandemic bonds to cover a pandemic

Obviously it was impossible to predict that the pandemic would occur precisely when it did, but that doesn’t mean it wasn’t possible to model some sort of annual probability of occurrence, or to assess the likely impact if it did occur. Windstorms and Earthquakes are similarly not predictable to a specific time, yet we can still model their frequency and severity to a certain extent.

I’m afraid I’m going to have to side with Taleb over Glaser on this point. Firstly, I don’t think we can take Glaser’s comment that two at once is unprecedented as correct – just look at the historical precedents above. Secondly, I think the condition that the event be a true ‘unknown unknown’ is a fundamental part of the definition of a black swan, and the coronavirus just doesn’t quite meet this criterion.

Before we finish, I thought I’d comment on Covid in terms of the insurance industry's total loss and the return period of such a loss. Let's put Covid at USD 150bn all-in (it’s going to take years for the final amount to be known; current estimates are more in the USD 50bn–100bn range, but I’ve applied some pretty heavy IBNER – for no specific reason other than that people do tend to underestimate these losses initially).

I have some numbers in my head for the return periods of various market losses; my method for deriving them was to bundle all known losses from all perils together (natural and man-made) and then fit a curve with a decent tail. Based on this approach, I’d put a USD 150bn market loss at about a 1-in-15-year event – so I don't think Covid is even that far into the tail! This is the kind of market loss one should expect a few times in a career. The probability of a true black swan event, by definition, can’t be computed by fitting a curve through past losses, and such an event may well be an order of magnitude bigger than anything we’ve seen before – a pretty scary thought to end on!
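I don't have the actual fitted curve to hand, but purely as an illustration, a single-parameter Pareto-type exceedance curve can reproduce that kind of return period. The threshold, base frequency and tail index below are all invented, chosen only so that a USD 150bn market loss comes out at roughly 1-in-15:

```python
# Purely illustrative - all three parameters below are made up
base_freq = 1.0      # assumed: one market loss of >= USD 20bn per year on average
threshold = 20.0     # USD bn
alpha = 1.34         # assumed Pareto tail index

def annual_frequency(x_bn):
    """Expected number of market losses of at least x_bn (USD bn) per year."""
    return base_freq * (x_bn / threshold) ** -alpha

return_period = 1 / annual_frequency(150)
print(f"~1-in-{return_period:.0f} year event")
```

The point of the structure, rather than the made-up numbers, is that the return period is just the reciprocal of the fitted annual exceedance frequency at the loss size of interest.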

[1] Insurance Insider article with Glaser's comments:

https://insuranceinsider.com/articles/133338/glaser-current-crisis-is-two-black-swans-at-once

[2] New Yorker article with quote from Taleb:

https://www.newyorker.com/news/daily-comment/the-pandemic-isnt-a-black-swan-but-a-portent-of-a-more-fragile-global-system

I saw this story today [1], and I've got to say I absolutely love it. Here is the tag line:

*“Clarence Thomas Just Asked His First Question in a Supreme Court Argument in 3 Years”*

For those like me not familiar with Justice Thomas, he is one of the nine Supreme Court Justices and he is famously terse:

*“He once went 10 years without speaking up at an argument.”*

“His last questions came on Feb. 29, 2016”

You could say I was speechless upon reading this (see what I did there?), for a judge who sits over some of the most important trials in the US to basically never speak during oral arguments seems pretty incredible.

But then on reflection, in his defence (okay I'll stop now):

- There are nine justices, so if each one interrupts every few minutes, that’s a hell of a lot of interruption.
- By the time a case makes it to the Supreme Court, the facts of the matter have usually been established, and the discussion is largely focused on the correct interpretation of the law. Even the main arguments as to the correct way to interpret the law would have probably been stated and restated endlessly as the case has made its way through the lower courts.
- Thomas has stated that he considers most questions unnecessary and the interruptions discourteous.

Which does actually sound pretty reasonable when you think about it. Still, I found it pretty amusing to read this morning – can you imagine turning up to your job and then not speaking for years at a time? Obviously your boss would have something to say about that. But then, what if you didn’t have a boss, and didn’t have any clients per se, and what if you thought you could do your job (acting as a judge in court cases) just as well if not better by not speaking? Maybe the rest of the justices should take a leaf out of Thomas’s book?

Since I’m on the topic of Supreme Court justices, one thing I find fascinating about them is that their job is essentially to act as the embodiment of a set of principles for how the constitution should be interpreted. Because the rule of law relies on consistency of decision making, justices by and large stick to their principles for their entire career (and I'm sure they each sincerely believe in what they are advocating for). And because the law touches on basically every nook and cranny of human society, these principles end up having to be so broad and fundamental that, for a justice, selecting them in the first place is tantamount to taking a position on entire schools of philosophy.

Thomas, according to the following Regent University Law Review article linked below [2], has been described as a Textualist and an advocate of natural law jurisprudence. i.e. a proponent of the philosophical tradition of natural rights, which can be traced back to the writing of Aristotle, survived through the middle ages in the works of Thomas Aquinas, and then came to prominence again post-enlightenment, with thinkers such as Hobbes and John Locke.

Is this the only position one could take? Not at all, for example, take the following extract from an introductory law book I read last year (Understanding Law by John N. Adams and Roger Brownsword):

Ultimately that is exactly the type of statement that you need to be able to make if you are a Supreme Court justice like Thomas, or a law professor like Adams and Brownsword. If we think of what a judge does as something akin to meta-law – that is to say, deciding between different legal arguments – then judges need a framework which is not itself grounded in law in order to make decisions about the correct interpretation of the law. And what is left if we can’t reference legal theories when talking about legal theories? Well, basically philosophy or religion. To put it another way, if someone were to repeatedly ask a judge ‘yes, but why do you believe that?’, the judge’s argument needs to bottom out somewhere, and it can’t simply be circular or end with the statement ‘because that’s just how it is’.

For Clarence Thomas, his position would bottom out with reference to the philosophical tradition of natural rights. The authors above present a solution derived from a rather different genealogy: if you kept pestering them, they would ultimately tell you to go and read Kant’s Critique of Pure Reason, and if you disagree with Kant’s arguments then take it up with a Kant scholar. Where it gets interesting is that these two schools of philosophy – German rationalism for the authors of the book, and some combination of Aristotelianism and British empiricism for Thomas – are fundamentally different doctrines that often take irreconcilable positions. Given that these types of questions have been debated pretty continuously for at least two thousand years of written history, I think we can safely conclude that the differences are not going to be resolved any time soon.

I’m just glad that when I log in in the morning, I don’t need to have a full philosophical framework figured out in order to do my job as an actuary. I largely just apply statistics and critical thinking: if something works I run with it, and if a method or approach doesn’t work I stop using it. I don’t have to worry about the ontology of the objects I’m using, or anything like that…. Oh wait, haven’t I just put forward a basically pragmatist and sceptical-empiricist approach? Aren’t these positions extremely difficult to ground without spiralling into relativism and a self-referential swamp? Luckily for me, I don't get asked – otherwise, unlike Justice Thomas, I'd probably eventually have to resort to 'because that's just how it is'.

[1] https://time.com/5555125/clarence-thomas-first-question-years/

[2] Regent University Law Review article on the confirmation hearing of Justice Thomas

www.regent.edu/acad/schlaw/student_life/studentorgs/lawreview/docs/issues/v12n2/12RegentULRev471.pdf

If you are an actuary, you'll probably have done a fair bit of triangle analysis, and you'll know that triangle analysis tends to work pretty well if you have what I'd call 'nice, smooth, consistent' data – that is, data without sharp corners, large one-off events, or substantial growth. Unfortunately, over the last few years, motor triangles have been anything but nice, smooth or consistent. These days, using them often seems to require more assumptions than there are data points in the entire triangle.

So how exactly have we got into this sorry state of affairs? To tell this story, we need to go back a few years.

**Ogden**

The year is 2017, the Ogden discount rate is 2.5%, and the intersection of people worrying about global pandemics and people who regularly wear tin foil hats is not negligible. Then the government comes along and shockingly sets the discount rate at the exact level implied by the clearly defined calculation methodology it had always used. This led to an Ogden rate of -0.75%.

This immediately created a step change in incurred claims triangles: lump sums due to be paid from this point should have started to be settled using a -0.75% discount rate, creating a large uplift relative to that implied by the previous 2.5% Ogden rate. Unless you believe the discount rate is going to be slashed by the same amount every year, this step change needs to be removed from your development pattern and adjusted for separately when using a triangle.
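To get a feel for the size of that step change, here is a deliberately simplified sketch: the value of a fixed annual payment stream at the old and new rates. Real Ogden multipliers allow for mortality and other adjustments, and the 40-year term is just an illustrative choice for a claimant with long-term care needs:

```python
def annuity_factor(rate, years):
    """Present value of 1 per year, paid annually in arrears, at the given rate."""
    return sum((1 + rate) ** -t for t in range(1, years + 1))

old = annuity_factor(0.025, 40)    # pre-2017 Ogden rate of +2.5%
new = annuity_factor(-0.0075, 40)  # new Ogden rate of -0.75%

print(f"multiplier: {old:.1f} -> {new:.1f}, uplift {new / old - 1:.0%}")
```

On these simplified assumptions, the multiplier nearly doubles (roughly 25 to 47, an uplift of almost 90%), which is the kind of discontinuity that lands in the incurred triangle overnight.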

Why did I say claims *should* have been settled at this rate rather than *were* settled at this rate? Well, reality is often messier than we would like. Motor insurers were on the whole unhappy with the announcement and had a number of different responses:

*‘surely the government didn’t really mean to set it at this level? Let’s meet in the middle somewhere.’,*

‘okay, Mr Claimant, I’ll use -0.75% as the discount rate, but I’m going to fight you on other costs to partially offset this increase, ultimately this is still a negotiation’

‘listen, I know the rate is -0.75% now, but we both know the government is probably going to change it to something higher soon, so you’ve got two options, meet me in the middle and accept a lower number, or I’ll drag out the case for a year or so until the rate is changed back and then settle at that lower rate.’

The end result was that lump sums were being settled at a range of effective discount rates which varied anywhere between -0.75% and 2.50%. If you have losses in your triangles settled in this six month window, who knows exactly how you should adjust them?

The next development came in September 2017: the government, seeing the issues its change had caused, released a statement announcing that it intended to review the methodology used to set the rate, and that the new rate, under the updated methodology, would most likely end up in the range 0% to 1%. (The government didn’t explain the new methodology, but you can bet we then tried to reverse-engineer various guesses as to what it might have picked to arrive at this range.)

So people immediately started pricing business at 0.5%, right? Well, not quite. If the Ogden rate changes today and your company takes a big reserve hit, your CFO is going to want payback asap; he’s not going to care that there’s a five-year tail on large bodily injury claims. The premium we are charging now should therefore be set against the expected Ogden rate at payment date, which is no longer -0.75% but 0.5%, and may even end up lower in five years’ time, etc. If the Ogden rate changes, it turns out the market reacts pretty sharpish.

Okay, so you decide to amend how much premium you are charging in order to get payback on the hit you just took. But the hit you took is going to vary substantially by insurer, and as much as you'd like payback, you also like market share. One thing that might protect you is that XoL costs (for those with XoLs, that is) will not change immediately, but rather whenever the renewal comes round. Reinsurers will therefore feel the pain immediately, but direct writers will feel the pain of increased RI spend at a staggered pace throughout the year (though many renewals will be clustered around 1/1). So different players react in the market at different times, require different levels of payback, and are able to be more or less protective of market share.

Oh, and by the way, England & Wales and Scotland have different rates now, so you need to blend your rate based on where the underlying business is based – sorry! This is a new issue for business being written, but in a few years it will be yet another adjustment to make to triangles as claims start to be settled at differential rates.

Here is a very messy table which summarises the very messy reality of how some of these factors changed over time.

The year is 2017, the Ogden discount rate is 2.5%, and the intersection of people worrying about global pandemics and people who regularly wear tin foil hats is not negligible. Then the government comes along and shockingly sets the discount rate at the exact level implied by the clearly defined calculation methodology they have always used. This lead to an Ogden level of -0.75%.

This immediately created a step change in incurred claim triangles, lump sums which were due to be paid at this point should have started to be settled at a -0.75% discount factor, creating a large uplift on that implied by the previous 2.5% Ogden rate. Unless you believe the discount rate is going to be slashed by the same amount every year, when using a triangle, this step change needs to be removed from your development pattern and adjusted for separately.

Why did I say claims

‘okay, Mr Claimant, I’ll use -0.75% as the discount rate, but I’m going to fight you on other costs to partially offset this increase, ultimately this is still a negotiation’

‘listen, I know the rate is -0.75% now, but we both know the government is probably going to change it to something higher soon, so you’ve got two options, meet me in the middle and accept a lower number, or I’ll drag out the case for a year or so until the rate is changed back and then settle at that lower rate.’

The end result was that lump sums were being settled at a range of effective discount rates which varied anywhere between -0.75% and 2.50%. If you have losses in your triangles settled in this six month window, who knows exactly how you should adjust them?

The next development was in September 2017, when the government, seeing the issues their change had caused, stepped up and released a statement announcing they intended to review the methodology used to set the rate, and that the new rate, under the updated methodology, would most likely end up in the range 0% to 1%. (The government didn't explain the new methodology, but you can bet we then tried to back into various guesses to see what they might have picked so as to get this range.)

So people immediately started pricing business at 0.5%, right? Well, not quite. If the Ogden rate changes today and your company takes a big reserve hit, your CFO is going to want payback asap; he's not going to care that there's a five year tail on large bodily injury claims. Therefore the premium we are charging now should be set against the expected Ogden rate at payment date, which is no longer -0.75% but 0.5%, and may well be something else again in five years' time, etc. If the Ogden rate changes, then it turns out the market reacts pretty sharpish.

Okay, so you decide to amend how much premium you are charging in order to get payback on the hit you just took. But the hit you took is going to vary substantially by insurer, and as much as you'd like payback, you also like market share. One thing that might protect you is that XoL costs (for those with XoLs, that is) will not change immediately, but rather whenever the renewal comes round. Therefore reinsurers will feel the pain immediately, but direct writers will feel the pain of increased RI spend at a staggered pace throughout the year (though many will be clustered around 1/1). So different people are reacting in the market at different times, and different people require different levels of payback and are able to be more protective of market share.

Oh and by the way England and Scotland have different rates now. So you need to blend your rate based on where the underlying business is based, sorry! This is a new issue for business being written, but in a few years it will be yet another adjustment to make to triangles as claims start to be settled at differential rates.

Here is a very messy table which summarises the very messy reality of how some of these factors changed over time.

So why do I bring all this up?

A few months ago, if someone sent you a motor triangle to analyse, you could probably just about manage to do something sensible about the issues highlighted above. If you were not interested in large losses, hopefully you'd have enough info to strip them out. By the end of the analysis your triangle probably ended up looking like a rather strange colour-by-numbers picture from all the manual amendments, but at least it (probably) gave sensible answers.

And then the coronavirus came along.

**Lockdown changes**

Following the Government lockdown on 23rd March, claim frequencies dropped substantially almost immediately. It's unclear exactly where the ultimates for these lockdown (and the coming semi-lockdown) weeks/months will settle; I expect in the fullness of time we will see that part of the drop-off is due to reporting patterns stretching out (customer behaviour, slowdowns at legal firms, delays at TPAs and in claims teams).

There are a few pieces of information we can look at though. Firstly, if you have access to motor triangles then you can just look at how claim frequencies have changed recently and then adjust somewhat with a few assumptions about how development patterns may have changed – which I’ve done where I can, but won’t go into specifics… proprietary info and all that.

The other piece of information is to look at the premium credits various insurers have announced, and what this must imply about their view of how their ULR has changed. Dowling and Partner’s IBNR weekly published a good analysis of this.

Take Progressive as an example: they announced a credit of 20% of gross premium for two months, which, when converted into a loss cost, is consistent with approximately a 30% reduction in claim frequency.
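As a sanity check on that conversion, here is the back-of-envelope arithmetic (my own sketch, not Dowling's workings; the 65% loss ratio is an assumed figure): if the credit is intended to hand back the expected reduction in loss cost on the affected premium, the implied frequency reduction is the credit divided by the loss ratio.

```python
# Convert a premium credit into an implied claim-frequency reduction.
# Assumes severity is unchanged, so loss cost moves one-for-one with frequency.
premium_credit = 0.20  # 20% of premium handed back for the affected months
loss_ratio = 0.65      # assumed expected loss ratio on that premium

# credit * premium = freq_reduction * loss_ratio * premium
implied_freq_reduction = premium_credit / loss_ratio
print(f"{implied_freq_reduction:.0%}")  # 31%, consistent with "approximately 30%"
```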

My personal opinion is that this figure of 30% may be on the conservative side of the actual reduction, if for no other reason than that no one knows exactly what will happen later in the year, and insurers are probably erring on the side of caution in terms of giving premiums back. For example, here are a couple of scenarios off the top of my head where insurers actually end up worse off.

- The possibility of the decrease in frequency being partially (or fully?) offset by an increase in severity due to drivers speeding more on quieter roads. Increased severity as an indicator lags new weekly claim notifications (a measure of frequency) by months, so we won't be able to fully see this for some time.
- A scenario where there is a surge in car use vs public transport when the lockdown is lifted, as public transport is increasingly shunned and runs at reduced capacity. This increase in utilisation may more than offset the reduction in the lockdown months.

So where does this leave our unfortunate motor triangles? The way this will probably play out is that on top of any Ogden related adjustments, we will need to make adjustments for:

- Decreased frequency for the accident months March 2020 onwards
- A gradual increase in frequency from May onwards
- A possible increase in severity for the same period?
- Either a permanent increase or decrease in road utilisation in the medium and long term (tbc which factor ends up dominating)
- A slowing in reporting patterns for the months March 2020 onwards
- A slowing in closing of claims during the same period
- A possible catch up in reporting and closing in the weeks following the release of the lockdown.

So what's an actuary to do? Wouldn’t it be nice if as a profession we could all agree to ignore the last three years of data and do all our projections off 2016 and prior? Sure it would be a lot less accurate, but it would definitely be a lot simpler, and as long as we’re all doing it, no one will be at a competitive disadvantage.

In case you missed it, Aon announced [1] last week that in response to the Covid-19 outbreak, and the subsequent expected loss of revenue stemming from the fallout, they would be taking a series of preemptive actions. The message was that no one would lose their job, but that a majority of staff would be asked to accept a 20% salary cut.

The cuts would be made to:

- Approx 70% of the 50,000 employees
- Cuts would be approx. 20% of relevant employees' salaries
- The cuts would be temporary
- Executives would take a 50% paycut
- Share buy-backs would be suspended
- The dividend would continue to be paid

So how significant will the cost savings be here? And is it fair that Aon is continuing with their dividend? I did a couple of back of the envelope calcs to investigate.

As a starting point we need to know how much Aon spends on salaries in the first place. Their Q4-19 Earnings Release, which was the most recent available when I started writing this, can be found here:

d18rn0p25nwr6d.cloudfront.net/CIK-0000315293/422df0de-c031-486e-bf9f-bda8b9d353a6.pdf

Based on this, we see that Aon had the following revenue and operating income for FY19:

So the total staff bill was approx. USD 6bn. The announcement stated that the 20% cut would be made to "salaries", therefore I'm assuming other benefits will not be cut, so we will need to split the USD 6bn into base salary vs other staff benefits. Here is a table (which I completely made up), but which should hopefully be of the correct order of magnitude:

So based on this, of the USD 6bn of compensation related expenses, I’d estimate something like USD 4.6bn corresponds to base salaries, and USD 1.4bn relates to the other categories listed above.

The reason Aon is cutting salaries of only 70% of employees rather than all employees is to protect the less well paid (the calculation will apparently be based on a cost-of-living level which will vary by location). The idea being, Aon doesn't want to cut the salary of someone who lives in London and earns 17k pa down to 13.6k. The effect of this is that, as a percentage of the total salary cost, the 20% cut will apply to more than 70% of the total salary bill, as the cut is skewed towards higher paid employees. I'm going to assume that rather than 70%, the cut will apply to approx. 85% of the total salary cost. Once again, completely made up, but we need to make an adjustment here, and a 15% increase seems reasonable. The actual number is going to be greater than 70% and less than 100%, so our final answer shouldn't be too sensitive to this assumption either.

The announcement stated that the reduction would be temporary; it's not clear what exactly this means in terms of time-frame, so I'm making the assumption that the cut will last for 6 months.

Combining all of the above we get the following:

So this move can be expected to save Aon something like USD 400m in salary expenses if applied for 6 months.
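Pulling together the assumptions above (the 4.6bn base-salary split, the 85% share of the salary bill, the 20% cut, and the assumed 6-month duration), the arithmetic is simply:

```python
# Back-of-envelope Aon salary saving, using the assumptions stated above.
base_salary = 4.6       # USD bn, assumed base-salary portion of the 6bn
affected_share = 0.85   # assumed share of the salary bill hit by the cut
cut = 0.20              # 20% salary reduction
years = 0.5             # assumed duration: 6 months

saving_bn = base_salary * affected_share * cut * years
print(f"USD {saving_bn * 1000:.0f}m")  # USD 391m, i.e. "something like 400m"
```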

To compare this against what Aon pays out to shareholders, I went to Yahoo Finance and looked up Aon Plc.

Aon paid dividends of USD 0.44 per share quarterly for an annualised USD 1.76 per share. They’ve increased their dividend at a steady 10% per annum. The number of shares outstanding is 231.08m, giving a total annual dividend payment of USD 407m for the latest 4 quarters.

On the face of it then, we’ve got Aon cutting staff costs by approx. 400m as per the above, but then continuing to pay a dividend of …. 400m. Hmmm….

In Aon management's defence, the other part of the announcement was the intention to pause the share buy-back scheme. This is where the numbers get interesting: Aon used this scheme to purchase over USD 2bn of stock in the last 12 months, money which has been returned to shareholders. Therefore, by halting this program, and assuming Aon would have purchased a similar amount in the next 12 months, Aon is immediately saving a further 2bn which would otherwise have gone to shareholders.

This is where I think you can argue in either direction in terms of fairness.

On the one hand, why should the middle 70% of employees have their salaries cut when Aon is intending to continue paying a dividend to shareholders, and when over 2.4bn was paid out to shareholders last year? This was money which could have been used to pay down debt (which currently stands at around 9bn), ensuring that should any adverse events occur, the company would not need to take such drastic actions as cutting salaries to protect cashflow and liquidity.

On the other hand, we could say to ourselves: shareholders are going from an annual 2.4bn payout to just 400m, an 83% reduction; the executive team is having its pay cut by 50%; and the lowest paid are not having their salary cut at all. Given shareholders are taking an 83% hit and execs a 50% hit, is it unreasonable to ask the middle 70% to take a temporary 20% cut?
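The dividend and payout figures quoted above can be checked in a few lines (the 2bn forward buy-back is an assumption, simply carrying forward the last 12 months' pace):

```python
# Annual dividend from the per-share figures quoted above.
quarterly_dividend = 0.44     # USD per share
shares_outstanding = 231.08e6
annual_dividend = quarterly_dividend * 4 * shares_outstanding
print(f"USD {annual_dividend / 1e6:.0f}m")  # USD 407m

# Total shareholder payout before vs after suspending the buy-back.
buybacks = 2.0e9              # USD, assumed to match the last 12 months
payout_before = annual_dividend + buybacks  # roughly USD 2.4bn
reduction = 1 - annual_dividend / payout_before
print(f"{reduction:.0%}")     # 83% reduction in total shareholder payout
```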

[1] Aon's SEC filing here:

d18rn0p25nwr6d.cloudfront.net/CIK-0000315293/337574c0-e686-4571-9f77-1f8af576f692.pdf

The story starts with an open letter written to the UK Government signed by 200+ scientists, condemning the government’s response to the Coronavirus epidemic; that the response was not forceful enough, and that the government was risking lives by their current course of action. The letter was widely reported and even made it to the BBC frontpage, pretty compelling stuff.

Link [1]

The issue is that as soon as you start scratching beneath the surface, all is not quite what it seems. Of the 200+ scientists, about 1/3 are PhD students, not an issue in and of itself, but picking out some of the subjects we’ve got:

- PhD Student in Complex Systems – okay, quite relevant, let’s keep going down the list
- PhD Student in Combinatorics – hmm, now we are listening to pure mathematicians? I really enjoyed studying combinatorics at uni, but I'm not sure it gives a student any particular expertise in epidemiology
- PhD Student in Theoretical Particle Physics – other than being numerically literate this is about as far from relevant as you can get!

And so on, just eyeballing the list I'd hazard a guess that a vast majority of the PhD student signatories are non-specialists in this field. Even among the lecturers we’ve got the following specialisms:

- Lecturer in Accounting
- Associate Professor of Strategy and Entrepreneurship
- Lecturer in Employment Law

I’ve got two responses to this. The first is not a criticism of the signatories – it’s with how this message was relayed in the media. Simply stating that 200+ scientists have written an open letter condemning the Government response is misleading if no information is included as to the relative expertise of the scientists. To call a lecturer in Employment Law a scientist is stretching the definition of scientist beyond any reasonable bounds! The BBC article gave no such indication that the letter had been signed by such a range of individuals.

The second issue is with the references provided within the letter to back up the points being made – these sources contain some pretty glaring errors! I'll explain below what these errors are. I think we can and should hold the scientists accountable for not properly scrutinising their sources; to my mind this is exactly the issue with highly educated people with non-relevant specialisms commenting on an unfamiliar area.

So what is my issue with their sources?

The open letter referenced the following Medium.com article written by Tomas Pueyo, which appears (ironically?) to have gone viral [2]:

Once again, this is an article written by a non-specialist, and is more evidence of how a small amount of knowledge can be a very dangerous thing! I feel like there's a theme here. Before we start digging into the article too much, what is this guy's background?

Well for one thing he has written a book on Star Wars "The Star Wars Rings: The Hidden Structure Behind the Star Wars Story - Tomas Pueyo Brochard", not a great start. He's worked at a number of tech start ups, and writes online quite a lot, his Medium articles are predominantly about public speaking, building viral apps, and effective writing. Going viral appears to be something he is very good at, so I'd be inclined to listen to his advice on this subject, the analysis of infectious diseases not so much!

The first thing that struck me as strange when reading his article was the following section (pasted below) which gives statistics on the proportion of cases requiring hospitalisation. Note the numbers are worryingly big!

Something felt off about this, and then I realised:

Tomas states that 5% of cases require ICU admission and 2.5% require very intensive treatment, i.e. mechanical ventilation or similar. I followed the link that Tomas quoted, and firstly the actual value in the study is 2.3%, not 2.5% – not sure why he has rounded up here, though that's quite minor. The more serious issue is that the study was an analysis of the outcomes of approx. 1,000 patients who were hospitalised, not of all cases:

https://www.nejm.org/doi/full/10.1056/NEJMoa2002032

In his Medium article Tomas states that 2.5% of *cases* require ventilation, when the study's figure is a percentage of hospitalised patients – a much narrower group.

Okay, so that first sentence has two mistakes already, what about that graphic which didn’t add up to 100%?

I went on the following journey to try to find the original study the data is taken from:

Medium article -> Information is Beautiful (another website) which produced the infographic -> A Google Spreadsheet provided by Information is Beautiful which contains 3 links. The links were the following:

1) A Guardian article which is seemingly unrelated to this specific point?!

https://www.theguardian.com/world/2020/mar/03/italy-elderly-population-coronavirus-risk-covid-19

2) A Statista article which repeats the same values:

https://www.statista.com/chart/20856/coronavirus-case-severity-in-china/

3) An article published by the Chinese Centre for Disease Control (CCDC):

https://github.com/cmrivers/ncov/blob/master/COVID-19.pdf

Ah ha, this article actually contains the research being quoted! It appears to be carried out by respectable scientists on real clinical data, okay, now we are getting somewhere.

The article is an analysis of the characteristics of approx. 44k **confirmed cases**, and among other things, breaks out the percentage of these confirmed cases which required hospitalisation. We find out in this paper what happened to the missing 0.6% – in the original paper, these cases were marked up as ‘missing’, i.e. not reported. A fairly innocuous explanation, but one I would have preferred to have been noted in the Medium graphic in some way – add an asterisk and explain at the bottom, or allocate it proportionately across the other categories, but do something about it!

This is where I am going to strongly disagree with Tomas’s analysis. Hospitalisation numbers are very unlikely to be underreported – the CCDC article explains that all hospitals were required by law to report any cases to the CCDC. The possibility of over-reporting was handled through the inclusion of citizen’s national identity numbers in the data gathered to prevent double counting. In other words, the absolute number of hospitalisations and deaths quoted in the study are probably pretty accurate, the number of confirmed cases are also probably pretty accurate.

We have to be careful though when applying this denominator to the population as a whole, using the number of confirmed cases as a proxy for the actual number of cases is **very** prone to underreporting, I would consider this value (approx. 20% requiring hospitalisation) as simply an upper bound on the percentage of infections which require hospitalisation. The Medium article however is not careful about this, and references the CCDC study as if it is an estimate of what percentage of the total population would require hospitalisation if infected by the virus – very different things! Without answering the question of what ratio of confirmed cases to actual infections we are dealing with in the underlying Chinese data, we are not making a valid inference about how these stats will apply to the total population.

Note that the word 'case' in the way Tomas uses it might actually be valid; here is the CDC definition:

*CASE. In epidemiology, a countable instance in the population or study group of a particular disease, health disorder, or condition under investigation. Sometimes, an individual with the particular disease.*

So in terms of the study, the hospitalisation rate to Cases (where Case has a capital C) could correctly be said to be 20%. I suspect that Tomas has not understood this subtlety, and if he has then he has presented the data in a very misleading way. Moreover when talking about the % of future cases that will lead to hospitalisation, this would require adjustment. This subtlety is never explained.

The UK Government has stated they believe the number of actual infections relative to reported cases may be out by a factor of 5-10. If we use the value of 5, to be on the more conservative side of their range, then we need to scale down all of Tomas’s numbers as follows:

- The 20% requiring hospitalisation is now 4%, still high but less severe
- The 5% requiring ICU, as we saw above, was not scaled by the % requiring hospitalisation, which would bring it to 1%; we then need to scale again by the factor of 5 to go from confirmed to actual cases – giving a value of 0.2% of all infections
- The 2.5% of cases requiring mechanical ventilation becomes 0.5% when scaled by the % of confirmed cases requiring hospitalisation, and then 0.1% when scaled to all infections. A much less scary number.

Note that the ICU and ventilation numbers are now lower than the estimated fatality rate from Coronavirus! The UK Government has stated that their estimate is approximately 1%. So even though we’ve now made these numbers internally consistent, they are now too low to be believable... Why is this?
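The rescaling just described is a couple of lines of arithmetic (taking 5, the conservative end of the government's 5-10x range, as the assumed ratio of actual infections to confirmed cases):

```python
# Rescale the headline rates from "per confirmed case" to "per infection".
hosp_rate_confirmed = 0.20  # hospitalisation rate among confirmed cases
icu_rate_hosp = 0.05        # ICU rate among hospitalised patients in the study
vent_rate_hosp = 0.025      # ventilation rate among hospitalised patients
under_reporting = 5         # assumed actual infections per confirmed case

hosp_rate_all = hosp_rate_confirmed / under_reporting                   # 4%
icu_rate_all = icu_rate_hosp * hosp_rate_confirmed / under_reporting    # 0.2%
vent_rate_all = vent_rate_hosp * hosp_rate_confirmed / under_reporting  # 0.1%
print(f"{hosp_rate_all:.1%}, {icu_rate_all:.1%}, {vent_rate_all:.1%}")
```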

The first study which Tomas linked to (the table I’ve pasted above) is actually very immature: of the 1,099 patients in the study, 94% were still in the hospital at the end of the study! i.e. their outcomes could very easily get much worse, therefore the 2.5% and 5% are actually underestimates! Once again we’ve scratched under the surface and found another glaring error. Nowhere did Tomas mention that the 2.5% requiring ventilation and the 5% requiring ICU admission were based on just 25 and 50 people respectively, and that he had **not adjusted for the bias from right censoring** (when a study ends before a final value is known):

en.wikipedia.org/wiki/Censoring_(statistics)
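To see why the censoring matters, consider the bounds it puts on the ventilation rate, a rough illustration using the study's own headline numbers:

```python
# Right censoring: most patients' final outcomes were unknown at study close,
# so the observed ventilation rate is only a lower bound.
patients = 1099
ventilated_so_far = 25
still_in_hospital = round(0.94 * patients)  # outcomes not yet resolved

lower_bound = ventilated_so_far / patients  # if everyone else recovers
upper_bound = (ventilated_so_far + still_in_hospital) / patients  # worst case
print(f"{lower_bound:.1%} to {upper_bound:.1%}")  # the truth lies in between
```

The naive 2.3% is just the lower end of a very wide interval; a proper estimate needs survival-analysis techniques that account for the unresolved patients.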

**So what's my point?**

As numerically literate people who are non-experts in infectious disease and epidemiology (and I consider myself in this category by the way, I am very ignorant of almost all epidemiology) we have to be soooo careful when producing analysis on Coronavirus which is getting disseminated to the wider public. It’s more important than ever to properly scrutinise sources, to stop and think if the person you are listening to truly knows what they are doing, and has taken the time to be careful in their analysis.

I feel like there was a breakdown in reporting at multiple stages here. Tomas misinterpreted various otherwise well written and rigorous studies, the 200+ ‘scientists’ then appear to have swallowed this uncritically and referenced it as evidence which was cited to the UK Government, and the BBC has then quoted these scientists sending up their warning cry without really scrutinising the expertise of the scientists or the sources referenced by the scientists.

[1] Open letter to the UK Government:

http://maths.qmul.ac.uk/~vnicosia/UK_scientists_statement_on_coronavirus_measures.pdf

[2] Tomas Pueyo's Medium article:

https://medium.com/@tomaspueyo/coronavirus-act-today-or-people-will-die-f4d3d9cd99ca


Specifically, the result is that under basic models of the development of the distribution of wealth in a society, it can be shown that when growth is equal to $g$ and the return on capital is equal to $r$, the distribution of wealth tends towards a Pareto distribution with parameter $r-g$. That sounds pretty interesting, right?

My notes below are largely based on the following paper by Charles I. Jones of Stanford Business School; my addition is to derive the assumption of an exponential distribution of income from more basic assumptions about labour and capital income. Link to Jones's paper: [1]

The simplest version of the problem is to investigate the inequality of income rather than capital, this way we leave aside issues of inheritance, inter-generational effects, shocks to capital, etc.

In this case, the model as described in Jones's paper only requires two assumptions:

1. Exponentially distributed longevity. That is to say, we assume the survival function for age $x$ is given by:

$$ S(x) = e^{-\delta x} $$

where $\delta$ is the death rate.

This is my kind of model! I’m pretty sure we derived this distribution in one of the actuarial exams (CT4 or CT5?), and it can be derived just from the assumption of a constant death rate. It’s not without limitations (death rates have a mini-increase in adolescence, for example), but the overall shape of the curve is pretty close to the empirical distribution.
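For completeness, the derivation from a constant death rate is only a couple of lines: a constant force of mortality $\delta$ gives a simple differential equation for survival, whose solution is the exponential survival function above.

$$ S'(x) = -\delta S(x), \quad S(0) = 1 \quad \Longrightarrow \quad S(x) = e^{-\delta x} $$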

What’s our other assumption?

2. Income $y$ increases exponentially with age $x$:

$$y = e^{\mu x}$$

I was happy following the paper until this point – but is this a reasonable assumption? Empirically income does not tend to increase exponentially, right? Salaries tend to increase up until middle age, peaking around age 50-55, before falling off slightly. After more consideration, I realised I was considering the wrong type of income; the important distinction is we are talking about

$$ \text{Income(x)} = \text{Labour income(x)} + \text{Capital income(x)}$$

Below I give a demonstration of why we might be more willing to assume total income increases exponentially with age.

Assume we have a constant savings rate $s$, capital earns interest at rate $r$, and let $C_x$ denote capital at time $x$, and $L_x$ denote labour income at time $x$.

Then we can easily link capital at time $x$ to capital at time $x-1$ using the following:

$$ C_x = C_{x-1}(1+r)+ s*L_{x-1}$$

And this formula can then be used to examine an individual’s capital over time:

$$ C_1 = 0 + s * L_0 $$

$$ C_2 = C_1 (1+r) + s * L_1 = (s*L_0)(1+r) +s L_1 $$

$$ C_3 = C_2 (1+r) + s * L_2 = (s*L_0)(1+r)^2 +s L_1(1+r)+s*L_2 $$

And generalising (which we can prove by induction, but I’m just going to take as given here):

$$C_n = s(L_0 *(1+r)^{n-1} + L_1*(1+r)^{n-2} + … + L_{n-1})$$

Or in short hand:

$$C_n = s \sum_{i=0}^{n-1} L_{i}* (1+r)^{n-1-i} $$

We now need to commit to a particular form for $L_n$ to progress any further. It turns out even with a linearly increasing function of $L$, we still end up with an exponential aggregate income over time. Using a linear function for $L_n$ is weaker than using an exponential function (as we would just be begging the question if we already assumed $L_n$ increased exponentially.)

Using $L_n = \alpha n$, gives us:

$$C_n = s \sum_{i=0}^{n-1} \alpha i * (1+r)^{n-1-i} $$

Taking $\alpha s$ and a factor of $\frac{1}{1+r}$ outside gives the following:

$$C_n = \frac{\alpha s}{1+r} \sum_{i=0}^{n-1} i * (1+r)^{n-i} $$

The trick now is to think of this sum, which I’ll refer to as $S$ (i.e. ignoring the factors outside the sum), as the following:

$$\begin{matrix}
(1+r) & + & (1+r) & + & \dots & \dots & \dots & \dots & + & (1+r)\\
(1+r)^2 & + & (1+r)^2 & + & \dots & \dots & \dots & + & (1+r)^2 & \\
(1+r)^3 & + & (1+r)^3 & + & \dots & + & (1+r)^3 & & & \\
\vdots & & & & & & & & & \\
(1+r)^{n-1} & & & & & & & & &
\end{matrix}$$

We now rewrite this as a series of sums of columns as follows:

$$S = \sum_{i=1}^{1} (1+r)^i + \sum_{i=1}^{2} (1+r)^i + \dots + \sum_{i=1}^{n-1} (1+r)^i$$

But each of these is now just a geometric series, that is to say our sum is equal to:

$$S = \frac{(1+r)^2-1}{r} + \frac{(1+r)^3-1}{r} + \dots +\frac{(1+r)^n-1}{r} - (n-1)$$

And writing this with summation notation:

$$S = \frac{\sum_{i=2}^{n} (1+r)^i - (n-1)}{r} - (n-1) $$

The trick now is to once again apply the formula for a geometric sum giving:

$$S = \frac{ (1+r)^{n+1} - 1 - 2r - r^2}{r^2} - \frac{n-1}{r} - \frac{r (n-1)}{r}$$

Which we can simplify slightly:

$$S = \frac{ (1+r)^{n+1} - (1+r)^2}{r^2} - \frac{(n-1)(1+r)}{r}$$

And then plugging this into the formula for $C_n$:

$$C_n = \frac { s \alpha }{1+r} \left( \frac{ (1+r)^{n+1} - (1+r)^2}{r^2} - \frac{(n-1)(1+r)}{r} \right)$$

And this is the result we require. We can see that capital, and hence income increases exponentially with age (though we are subtracting a term which increases linearly with age to slow it down).
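As a sanity check on the algebra, we can compare the closed form against the original recursion numerically. The parameter values below are arbitrary, chosen purely for illustration:

```python
# Check the closed form for C_n against the recursion
# C_x = C_{x-1}*(1+r) + s*L_{x-1}, with linear labour income L_i = alpha*i.
s, r, alpha = 0.12, 0.04, 1000.0  # illustrative values only

def capital_recursive(n):
    c = 0.0
    for x in range(1, n + 1):
        c = c * (1 + r) + s * alpha * (x - 1)  # L_{x-1} = alpha*(x-1)
    return c

def capital_closed_form(n):
    return (s * alpha / (1 + r)) * (
        ((1 + r) ** (n + 1) - (1 + r) ** 2) / r ** 2
        - (n - 1) * (1 + r) / r
    )

for n in [2, 10, 40]:
    assert abs(capital_recursive(n) - capital_closed_form(n)) < 1e-8 * max(1.0, capital_recursive(n))
```

The two agree to floating-point precision for every $n$ tried, which gives some comfort that no terms were dropped along the way.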

Now that we've motivated the two assumptions, all that remains is to combine them, and show that we end up with a Pareto distribution.

Inverting the assumption about income gives the age at which an individual earns a given level of income:

$$y = e^{\mu x}$$

Gives:

$$x(y) = \frac{1}{\mu} \log y $$

And then, using this to evaluate the probability of income being greater than y:

$$P(\text{Income} > y) = P(\text{Age} > x(y)) = e^{-\delta x(y)} = y^{-\frac{\delta}{\mu}}$$

As required.
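That is, a Pareto survival function with exponent $\delta / \mu$. As a sketch, we can check this by simulation: draw exponentially distributed ages, exponentiate to get incomes, and compare the empirical tail to the Pareto tail (the values of $\delta$ and $\mu$ below are illustrative only):

```python
import math
import random

# Exponential lifetimes + income growing exponentially with age
# should give a Pareto income tail: P(Income > y) = y^(-delta/mu).
random.seed(42)
delta, mu = 0.02, 0.05  # illustrative death rate and income growth rate
ages = [random.expovariate(delta) for _ in range(200_000)]
incomes = [math.exp(mu * x) for x in ages]

y = 5.0
empirical = sum(inc > y for inc in incomes) / len(incomes)
theoretical = y ** (-delta / mu)
```

With these parameters the tail exponent is $\delta/\mu = 0.4$, and the empirical and theoretical tail probabilities agree to within simulation error.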

[1] Charles I. Jones's paper:

web.stanford.edu/~chadj/SimpleParetoJEP.pdf

Suppose we have a Poisson Distribution with parameter $\lambda$, by definition:

$$P(X=k)=e^{- \lambda} \frac{\lambda^k}{k!}$$

If we replace $\lambda$ with $r$, and consider the probability that $X=r$, we get:

$$P(X=r)=e^{- r} \frac{r^r }{r!}$$

Now suppose we restrict ourselves to large values of $r$, in which case a Poisson distribution is well approximated by a Gaussian distribution with mean and variance both equal to $r$.

Setting up this approximation:

$$e^{- r} \frac{r^r }{r!} \approx \frac{1}{\sqrt{2 \pi r}} e^{- \frac{(r-r)^2}{2r}}$$

From which the right hand side simplifies further to give:

$$ e^{- r} \frac{r^r }{r!} \approx \frac{1}{\sqrt{2 \pi r}} e^{0}$$

Giving:

$$ e^{- r} \frac{r^r }{r!} \approx \frac{1}{\sqrt{2 \pi r}}$$

Which we can rearrange to obtain Stirling’s approximation:

$$r! \approx e^{-r} r^r \sqrt{2 \pi r}$$
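A quick numerical check of the approximation we've just derived (the tolerance reflects the known leading relative error of roughly $\frac{1}{12r}$):

```python
import math

# Compare r! with e^(-r) * r^r * sqrt(2*pi*r) for a few values of r.
for r in [10, 50, 100]:
    approx = math.exp(-r) * r ** r * math.sqrt(2 * math.pi * r)
    ratio = approx / math.factorial(r)
    # The relative error shrinks like 1/(12r) as r grows.
    assert abs(ratio - 1) < 1 / (10 * r)
```

Even at $r=10$ the approximation is already within about 1% of the true factorial.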

We probably shouldn’t be surprised that we’ve found a link between $e^r$ and $\pi$. The fact that the two are linked can be easily drawn out using Euler’s formula:

$$e^{inx}=\cos(nx)+i\sin(nx)$$

And evaluating at $x = \pi$ (with $n = 1$):

$$e^{i\pi }+1=0$$

So there is clearly a link between $e^x$ and $\pi$, but it’s not obvious we can draw in the factorial as well.

Above we teased out a link between all of the following: $r!$, $r^r$, $e^r$, and $\pi$, which is interesting for its own sake, but moreover provides intuition as to why the Gaussian approximation to the Poisson distribution works.

It should probably be noted that we’ve implicitly invoked the Central Limit Theorem to establish the approximation, and the CLT is some pretty heavy machinery! The proof from first principles of the CLT is much more involved than the proof of Stirling's approximation, so the derivation above should be thought of as strictly a process of drawing out interesting parallels, rather than a path for proving the result from first principles.

I tend to use the Gaussian approximation quite a lot at work – any time I’m modelling a claim count frequency in a Spreadsheet and I’ve got a reasonable number of annual claims, I’m a proponent of just using a discretised Gaussian, with a min applied at 0, and with a variance and mean set as required.

This has a couple of advantages to using either a Poisson or Negative Binomial:

- First and most importantly, it’s simpler and quicker to set up in the Spreadsheet as we can just use the built in inverse Gaussian Excel function. Negative binomials are just a bit of a pain to model in a Spreadsheet without VBA code or an add-in.
- It allows us to seamlessly vary our variance/mean between a Poisson level (equal to 1), and a Negative Binomial level (greater than 1) without amending our distribution.
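As a minimal sketch of that approach (in Python rather than Excel; the mean and variance-to-mean ratio below are purely illustrative):

```python
import random

# Discretised Gaussian claim count: round a normal sample and floor at zero.
# In Excel this is the same idea as MAX(0, ROUND(NORM.INV(RAND(), mean, sd), 0)).
random.seed(1)
mean = 50.0
variance_to_mean = 1.5  # 1.0 mimics a Poisson; >1 mimics a Negative Binomial
sd = (mean * variance_to_mean) ** 0.5

def simulate_claim_count():
    return max(0, round(random.gauss(mean, sd)))

sims = [simulate_claim_count() for _ in range(100_000)]
avg = sum(sims) / len(sims)
```

With the mean well above zero the floor at zero almost never bites, so the simulated mean and variance stay very close to their targets.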

David Mackay seemed to have an eye for interesting problems – reading up on Wikipedia about him he competed in a Maths Olympiad while a student. I do wonder if there is a correlation between well-written, entertaining textbooks, and authors who have a background in competitive maths problems. The link between Stirling’s approximation, Gaussian, and Poisson is just the sort of thing that could make an interesting problem in a competitive maths competition.

I also realised after writing this post that I’d already written about something pretty similar before, where we can use Stirling’s approximation to easily estimate the probability that a Poisson value is equal to its mean. Here’s the link:

www.lewiswalsh.net/blog/poisson-distribution-what-is-the-probability-the-distribution-is-equal-to-the-mean

I’m really enjoying working my way through Thomas Piketty’s Capital in the 21st Century, it's been sitting on my shelf unread for a few years now, and at 696 pages it looked like it was going to be a bit of a slog, but it's actually been fairly easy and entertaining reading. The overall approach is the following: Piketty collected better data on wealth and income inequalities than anyone else before (going back to around 1700, across multiple countries, and correcting as many systematic biases and data issues as possible), he then analyses said data, drawing out interesting observations whilst writing everything in a fairly non-technical and entertaining style. Piketty is able to weave a narrative that sheds light on economic history, predictions for future structural developments of the economy, the history of economic thought, and how the limited data available to past economists skewed their results and how our understanding is different now.

Piketty also adds colour by tying his observations to the literature written at the time (Austen, Dumas, Balzac), and how the assumptions made by the authors around how money, income and capital work are also reflected in the economic data that Piketty obtained.

Hopefully I've convinced you Piketty's programme is a worthwhile one, but that still leaves the fundamental question – *is his analysis correct?* That's a much harder question to answer, and to be honest I really don't feel qualified to pass judgement on the entirety of the book, other than to say it strikes me as pretty convincing from the limited amount of time I've spent on it.

In an attempt to contribute in some small way to the larger conversation around Piketty's work, I thought I'd write about one specific argument that Piketty makes that I found less convincing than other parts of the book. Around 120 pages in, Piketty introduces what he calls the ‘Second Fundamental Law of Capitalism’, and this is where I started having difficulties in following his argument.

**The Second Fundamental Law of Capitalism**

The rule is defined as follows:

$$ B = \frac{s} { g} $$

Where $B$ , as in Piketty’s first fundamental rule, is defined as the ratio of Capital (the total stock of public and private wealth in the economy) to Income (NNP):

$$B = \frac{ \text{Capital}}{\text{Income}}$$

And where $g$ is the growth rate, and $s$ is the saving rate.

Unlike the first rule which is an accounting identity, and therefore true by definition, the second rule is only true ‘in the long run’. It is an equilibrium that the market will move to over time, and the following argument is given by Piketty:

*“The argument is elementary. Let me illustrate it with an example. In concrete terms: if a country is saving 12 percent of its income every year, and if its initial capital stock is equal to six years of income, then the capital stock will grow at 2 percent a year, thus at exactly the same rate as national income, so that the capital/income ratio will remain stable.*

*By contrast, if the capital stock is less than six years of income, then a savings rate of 12 percent will cause the capital stock to grow at a rate greater than 2 percent a year and therefore faster than income, so that the capital/income ratio will increase until it attains its equilibrium level.*

*Conversely, if the capital stock is greater than six years of annual income, then a savings rate of 12 percent implies that capital is growing at less than 2 percent a year, so that the capital/income ratio cannot be maintained at that level and will therefore decrease until it reaches equilibrium.”*

I’ve got to admit that this was the first part in the book where I really struggled to follow Piketty’s reasoning – possibly this was obvious to other people, but it wasn’t to me!

**Analysis – what does he mean?**

Before we get any further, let’s unpick exactly what Piketty means by all the terms in his formulation of the law:

Income = Net national product = Gross national product * 0.9

(where the factor of 0.9 is to account for depreciation of Capital)

$g$ = growth rate, but growth of what? Here it is specifically growth in income, so while this is *not* exactly the same as GDP growth it’s pretty close. If we assume net exports do not change, and the depreciation factor (0.9) is fixed, then the two will be equal.

$s$ = saving rate – by definition this is the ratio of additional capital divided by income. Since income here is *net* of depreciation, we are already subtracting capital depreciation from income and not including this in our saving rate.

Let’s play around with a few values, splitting growth $g$, into per capita growth and demographic growth we get the following. Note that Total growth is simply the sum of demographic and per capita growth, and Beta is calculated from the other values using the law.


The argument that Piketty is intending to tease out from this equality is the following:

- Given per capita GDP growth is on average lower than many people realise (on the order of 1-2% pa in the long run)
- And given per capita growth is no longer supplemented by demographic growth in many advanced economies, i.e. the demographic growth component is now very low
- GDP growth in the future is likely to only be on the order of 1.5% pa.
- Therefore for a fixed saving rate, and relatively low growth, we should expect much higher values of Beta than we have seen in the last 50 years.

In fact using $g=1.5 \%$ as a long term average, together with $s = 12 \%$, we can expect Beta to crystallise around a value of $8$! Much higher than it has been for the past 100 years.

As Piketty is quick to point out, this is a long run equilibrium towards which an economy will move. Moreover, it should be noted that the convergence of this process is incredibly slow.

Here is a graph plotting the evolution of Beta, from a starting point of 5, under the assumption of $g=1.5 \%$, $s = 12 \%$:
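The recursion behind that graph is simple: dividing the capital accumulation identity by next year's income gives $B_{t+1} = \frac{B_t + s}{1+g}$, which has $\frac{s}{g}$ as its fixed point. A few lines of Python reproduce the path:

```python
# Evolution of Beta under the second law: Beta' = (Beta + s) / (1 + g).
# Starting point and rates as in the text: Beta = 5, s = 12%, g = 1.5%.
s, g = 0.12, 0.015
beta = 5.0
path = []
for year in range(100):
    beta = (beta + s) / (1 + g)
    path.append(beta)
```

After 30 years Beta has only reached about 6.1, and even after a century it remains below the equilibrium of $s/g = 8$, illustrating just how slow the convergence is.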

So we see that after 30 years (i.e. approx. one generation), Beta has only increased from its starting point of $5$ to around $6$; it then takes another generation and a half to get to $7$, which is still short of its long run equilibrium of $8$.

**Analysis - Is this rule true?**

Piketty is of course going to want to use his formula to say interesting things about the historic evolution of the Capital/Income ratio, and also use it to help predict future movements in Beta. I think this is where we start to push the boundaries of what we can easily reason, without first slowing down and methodically examining our implicit assumptions.

For example – is a fixed saving rate (independent of changes in both Beta, and Growth) reasonable? Remember that the saving rate here is a saving rate on *net* income. So as Beta increases, we are already having to put more money into upkeep of our current level of capital, meaning a fixed net saving rate is actually consistent with an increasing gross saving rate, not a fixed gross saving rate. An increasing gross saving rate might be a reasonable assumption or it might not – this then becomes an empirical question rather than something we can reason about a priori.

Another question is how the law performs for very low rates of $g$, which is in fact how Piketty is intending to use the equation. By inspection, we can see that:


As $g \rightarrow 0$, $B \rightarrow \infty $.

What is the mechanism by which this occurs in practice? It’s simply that if GDP does not grow from one year to the next, but the net saving rate is still positive, then the stock of capital will still increase, however income has not increased. This does however mean that an ever increasing share of the economy is going towards paying for capital depreciation.

Piketty’s law is still useful, and I do find it convincing to a first order of approximation. But I do think this section of the book could have benefited from more time spent highlighting some of the distortions potentially caused by using *net* rather than gross measures of income and saving.

I'm always begrudgingly impressed by brokers and underwriters who can do most of their job without resorting to computers or a calculator. If you give them a gross premium for a layer, they can reel off gross and net rates on line, the implied loss cost, and give you an estimate of the price for a higher layer using an ILF in their head. When I'm working, so much actuarial modelling requires a computer (sampling from probability distributions, Monte Carlo methods, etc.) that just to give any answer at all I need to fire up Excel and make a Spreadsheet. So anytime there's a chance to do some shortcuts I'm always all for it!

One mental calculation trick which is quite useful when working with compound interest is called the Rule of 72. It states that for interest rate $i$, under growth from annual compound interest, it takes approximately $\frac{72}{i} $ years for a given value to double in size.

Here is a quick derivation showing why this works, all we need is to manipulate the exact solution with logarithms and then play around with the Taylor expansion.

We are interested in the following identity, which gives the exact value of $n$ for which an investment doubles under compound interest:

$$ \left( 1 + \frac{i}{100} \right)^n = 2$$

Taking logs of both sides gives the following:

$$ ln \left( 1 + \frac{i}{100} \right)^n = ln(2)$$

And then bringing down the $n$:

$$n* ln \left( 1 + \frac{i}{100} \right) = ln(2)$$

And finally solving for $n$:

$$n = \frac {ln(2)} { ln \left( 1 + \frac{i}{100} \right) }$$

So the above gives us a formula for $n$, the number of years. We now need to come up with a simple approximation to this function, and we do so by examining the Taylor expansion of the denominator of the right hand side.

We can compute the value of $ln(2)$:

$$ln(2) \approx 69.3 \%$$

The Taylor expansion of the denominator is:

$$ln \left( 1 + \frac{r}{100} \right) = \frac{r}{100} - \frac{r^2}{20000} + \dots $$

In our case, it is more convenient to write this as:

$$ln \left( 1 + \frac{r}{100} \right) = \frac{1}{100} \left( r - \frac{r^2}{200} + \dots \right) $$

For $r<10$, the second term is less than $\frac{100}{200} = 0.5$. Given the first term is of the order $10$, this means we are only throwing out an adjustment of less than $5 \%$ to our final answer.

Taking just the first term of the Taylor expansion, we end up with:

$$n \approx \frac{69.3 \%}{\frac{1}{100} * r}$$

And rearranging gives:

$$n \approx \frac{69.3}{r}$$

So we see, we are pretty close to $ n \approx \frac{72}{r}$.

**Why 72?**

We saw above that using just the first term of the Taylor Expansion suggests we should be using the ‘rule of 69.3%' instead. Why then is this the rule of 72?

There are two main reasons, the first is that for most of the interest rates we are interested in, the Rule of 72 actually gives a better approximation to the exact solution, the following table compares the exact solution, the approximation given by the ‘Rule of 69’, and the approximation given by the Rule of 72:
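That comparison can be reproduced with a few lines of Python:

```python
import math

# Exact doubling time vs the 'Rule of 69.3' and the Rule of 72,
# for a range of interest rates.
print(f"{'r%':>4}{'exact':>8}{'69.3/r':>8}{'72/r':>8}")
for r in [1, 2, 4, 6, 8, 10, 12]:
    exact = math.log(2) / math.log(1 + r / 100)
    print(f"{r:>4}{exact:>8.2f}{69.3 / r:>8.2f}{72 / r:>8.2f}")
```

For rates around 8%, for example, the exact doubling time is just over 9 years: closer to $72/8 = 9$ than to $69.3/8 \approx 8.7$.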


The reason for this is that for interest rates in the 4%-10% range, the second term of the Taylor expansion is not completely negligible, and acts to make the denominator slightly smaller and hence the fraction slightly bigger. It turns out 72 is quite a good fudge factor to account for this.

Another reason for using 72 over other close numbers is that 72 has a lot of divisors, in particular out of all the integers within 10 of 72, 72 has the most divisors. The following table displays the divisors function d(n), for values of n between 60 and 80. 72 clearly stands out as a good candidate.


The main use I find for this trick is in mentally adjusting historic claims for claims inflation. I know that if I put in 6% claims inflation, my trended losses will double in size from their original level approximately every 12 years. Other uses include when analysing investment returns, thinking about the effects of monetary inflation, or it can even be useful when thinking about the effects of discounting.

As an aside, we should be careful when attempting to apply the Rule of 72 over too long a time period. Say we are watching a movie set in 1940, can we use the Rule of 72 to estimate what values in the movie are equivalent to now? Let's set up an example and see why it doesn't really work in practice. Suppose an item in our movie costs 10 dollars. First we need to pick an average inflation rate for the intervening period (something in the range of 3-4% is probably reasonable). We can then reason as follows: 1940 was 80 years ago, at 4% inflation prices double every $\frac{72}{4} = 18$ years, and we’ve had approx. 4 lots of 18 years in that time. Therefore the price would have doubled 4 times, i.e. increased by a factor of 16, suggesting that 10 dollars in 1940 is now worth around 160 dollars in today's terms.
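The same arithmetic in code, keeping the fractional number of doublings rather than rounding down to 4:

```python
# Rule of 72 estimate for 80 years at 4% inflation, vs exact compounding.
years, rate = 80, 4
doubling_time = 72 / rate                  # 18 years per doubling
doublings = years / doubling_time          # ~4.44 doublings
rule_factor = 2 ** doublings               # ~21.8x
exact_factor = (1 + rate / 100) ** years   # ~23.0x
```

Rounding down to 4 doublings gives the factor of 16 quoted above; keeping the fraction gives roughly 22, close to the exact 23.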

It turns out that this doesn’t really work though, so let’s check it against another calculation. The average price of a new car in 1940 was around 800 dollars and the average price now is around 35k, which is a factor of 43.75, quite a bit higher than 16. The issue with using inflation figures like these over very long time periods is that for a given year the difference in the underlying goods is fairly small, so a simple percentage change in price is an appropriate measure. When we chain together a large number of annual changes, however, after a certain number of years the underlying goods have almost completely changed from the first year to the last. For this reason, simply multiplying an inflation rate across decades completely ignores both improvements in the quality of goods over time, and changes in standards of living, so doesn't really convey the information that we are actually interested in.


Excess of Loss contracts for Aviation books, specifically those covering airline risks (planes with more than 50 seats), often use a special type of deductible called a floating deductible. Instead of applying a fixed amount to the loss in order to calculate recoveries, the deductible varies based on the size of the market loss and the line written by the insurer. These types of deductibles are reasonably common, I’d estimate something like 25% of airline accounts I’ve seen have had one.

As an aside, these policy features are almost always referred to as deductibles, but technically are not actually deductibles from a legal perspective, they should probably be referred to as floating attachment instead. The definition of a deductible requires that it be

The idea is that the floating deductible should be lower for an airline on which an insurer takes a smaller line, and should be higher for an airline for which the insurer takes a bigger line. In this sense they operate somewhat like a surplus lines contract in property reinsurance.

Before I get into my issues with them, let’s quickly review how they work in the first place.

When binding an Excess of Loss contract with a floating deductible, we need to specify the following values upfront:

- Limit = USD18.5m
- Fixed attachment = USD1.5m
- Original Market Loss = USD150m

And we need to know the following additional information about a given loss in order to calculate recoveries from said loss:

- The insurer takes a 0.75% line on the risk
- The insurer’s limit is USD 1bn
- The risk suffers a USD 200m market loss.

A standard XoL recovery calculation with the fixed attachment given above, would first calculate the UNL (200m*0.75%=1.5m), and then deduct the fixed attachment from this (1.5m-1.5m=0). Meaning in this case, for this loss and this line size, nothing would be recovered from the XoL.

To calculate the recovery from an XoL with a floating deductible, we would once again calculate the insurer’s UNL of 1.5m. However we now need to calculate the applicable deductible: this will be the lesser of 1.5m (the fixed attachment), and the insurer’s effective line (defined as their UNL divided by the market loss = 1.5m/200m = 0.75%) multiplied by the Original Market Loss as defined in the contract. Here the Original Market Loss is 150m, hence 0.75%*150m = 1.125m. Since this is less than the 1.5m fixed attachment, the attachment we should use is 1.125m (the limit is always just 18.5m, and doesn’t change if the attachment drops down). We would therefore calculate recoveries to this contract, for this loss size and risk, as if the layer was 18.5m xs 1.125m. Meaning the ceded loss would be 0.375m, and the net position would be 1.125m.

Here’s the same calculation in an easier to follow format:
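In Python, the same recovery calculation might look like the following (all figures as in the example above):

```python
# Floating-deductible recovery calculation, using the contract terms and
# loss figures from the example in the text.
limit = 18.5e6
fixed_attachment = 1.5e6
original_market_loss = 150e6

line = 0.0075        # insurer's 0.75% line
market_loss = 200e6  # 100% market loss for this event

unl = line * market_loss                           # insurer's UNL: 1.5m
effective_line = unl / market_loss                 # 0.75%
floating = effective_line * original_market_loss   # 1.125m
attachment = min(fixed_attachment, floating)       # floating drops below fixed
recovery = max(0.0, min(limit, unl - attachment))  # 0.375m ceded
net = unl - recovery                               # 1.125m retained
```

With a fixed 1.5m attachment the recovery would be nil; the floating deductible drops the attachment to 1.125m and cedes 0.375m.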

This may seem quite sensible so far, however the issue is with the wording. The following is an example of a fairly standard London Market wording, taken from an anonymised slip which I came across a few years ago.

…

Reinsurers shall only be liable if and when the ultimate net loss paid by the Reinsured in respect of the interest as defined herein exceeds USD 10,000,000 each and every loss or an amount equal to the Reinsured’s Proportion of the total Original Insured Loss sustained by the original insured(s) of USD 200,000,000 or currency equivalent, each and every loss, whichever the lesser (herein referred to as the “Priority”)

For the purpose herein, the Reinsured’s Proportion shall be deemed to be a percentage calculated as follows, irrespective of the attachment dates of the policies giving rise to the Reinsured’s ultimate net loss and the Original Insured Loss:

Reinsured Ultimate Net Loss / Original Insured Loss

…

The Original Insured Loss shall be defined as the total amount incurred by the insurance industry including any proportional co-insurance and or self-insurance of the original insured(s), net of any recovery from any other source

What’s going on here is that we’ve defined the effective line to be the Reinsured’s UNL divided by the 100% market loss.

From a legal perspective, how would an insurer (or reinsurer for that matter) prove what the 100% insured market loss is? The insurer obviously knows their share of the loss, but what if this is a split placement with 70% placed in London on the same slip, 15% placed in a local market (let’s say Indonesia), and a shortfall cover (15%) placed in Bermuda? Due to the different jurisdictions, let’s say the Bermudian cover has a number of exclusions and subjectivities, and the Indonesian cover operates under the Indonesian legal system, which does not publicly disclose private contract details.

Even if the insurer is able to find out through a friendly broker what the other markets are paying, and therefore have a good sense of what the 100% market loss is, they may not have a legal right to this information. The airline

The above issues may sound quite theoretical, and in practice there are normally no issues with collecting on these types of contracts. But to my mind, legal language should bear up to scrutiny even when stretched – that’s precisely when you are going to rely on it. My contention is that as a general rule, it is a bad idea to rely on information in a contract which you do not have an automatic legal right to obtain.

The intention with this wording, and with contracts of this form is that the effective line should basically be the same as the insured’s signed line. Assuming everything is straightforward, if the insurer takes a x% line with a limit of

My guess as to why it is worded this way rather than just taking the actual signed line is that we don’t want to open ourselves to issues around what exactly we mean by ‘the signed line’ – what if the insured has exposure through two contracts both of which have different signed lines, or what if there is an inuring Risk Excess which effectively nets down the gross signed line – should we then use the gross or net line? By couching the contract in terms of UNLs and market losses we attempt to avoid these ambiguities.

Let me give you a scenario though where this wording does fall down:

Let’s suppose there is a mid-air collision between two planes, each resulting in an insured market loss of USD 1bn, so that the Original Insured Loss is USD 2bn. If our insurer takes a 10% line on the first airline, but does not write the second airline, then their effective line is 10% * 1bn / 2bn = 5%... which is definitely not equal to their signed line of 10%.

You may think this is a pretty remote possibility, after all in the history of modern commercial aviation such an event has not occurred. What about the following scenario which does occur fairly regularly?

Suppose now there is a loss involving a single plane, and the size of the loss is once again USD 1bn, and that our insurer once again has a 10% line. In this case though, what if the manufacturer is found 50% responsible? Now the insurer only has a UNL of USD 500m, and yet once again, in the calculation of their floating deductible, we do the following: 10% * 500m/1bn = 5%.

Hmmm, once again our effective line is below our signed line, and the floating deductible will drop down even further than intended.
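To put numbers on both scenarios, here is a quick sketch of the effective line calculation implied by the wording (the function name `effective_line` is my own):

```python
def effective_line(signed_line: float, insured_loss: float,
                   original_insured_loss: float) -> float:
    # Effective line implied by the floating deductible wording:
    # the signed line scaled by the insured loss the insurer is involved in,
    # relative to the total Original Insured Loss
    return signed_line * insured_loss / original_insured_loss

# Scenario 1: mid-air collision, 10% line on only one of two USD 1bn losses
scenario_1 = effective_line(0.10, 1e9, 2e9)

# Scenario 2: USD 1bn loss with the manufacturer found 50% responsible
scenario_2 = effective_line(0.10, 0.5e9, 1e9)

print(scenario_1, scenario_2)  # both come out at 0.05, i.e. 5% vs a 10% signed line
```

In both cases the effective line halves relative to the signed line, which is exactly the unintended drop-down described above.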

My suggested wording addresses both of these problems.

Basically the intention is to restrict the market loss only to those contracts through which the insurer has an involvement. This deals with both issues – the insurer would not be able to net down their line through references to insured losses which are nothing to do with them, as in scenarios 1 and 2 above, and secondly it restricts the information requirements to contracts which the insurer has an automatic legal right to have knowledge of, since by definition they will be a party to the contract.

I did run this idea past a few reinsurance brokers a couple of years ago, and they thought it made sense. The only downside from their perspective is that it makes the client's reinsurance slightly less responsive i.e. they knew about the strange quirk whereby the floating deductible dropped in the event of a manufacturer involvement, and saw it as a bonus for their client, which was often not fully priced in by the reinsurer. They therefore had little incentive to attempt to drive through such a change. The only people who would have an incentive to push through this change would be the larger reinsurers, though I suspect they will not do so until they've already been burnt, having attempted to rely on the wording in a court case, at which point they may find it does not quite operate in the way they intended.

```vba
Sub RemovePassword()
    Dim i As Integer, j As Integer, k As Integer
    Dim l As Integer, m As Integer, n As Integer
    Dim i1 As Integer, i2 As Integer, i3 As Integer
    Dim i4 As Integer, i5 As Integer, i6 As Integer
    On Error Resume Next
    For i = 65 To 66: For j = 65 To 66: For k = 65 To 66
    For l = 65 To 66: For m = 65 To 66: For i1 = 65 To 66
    For i2 = 65 To 66: For i3 = 65 To 66: For i4 = 65 To 66
    For i5 = 65 To 66: For i6 = 65 To 66: For n = 32 To 126
        ActiveSheet.Unprotect Chr(i) & Chr(j) & Chr(k) & _
            Chr(l) & Chr(m) & Chr(i1) & Chr(i2) & Chr(i3) & _
            Chr(i4) & Chr(i5) & Chr(i6) & Chr(n)
        If ActiveSheet.ProtectContents = False Then
            MsgBox "Password is " & Chr(i) & Chr(j) & _
                Chr(k) & Chr(l) & Chr(m) & Chr(i1) & Chr(i2) & _
                Chr(i3) & Chr(i4) & Chr(i5) & Chr(i6) & Chr(n)
            Exit Sub
        End If
    Next: Next: Next: Next: Next: Next
    Next: Next: Next: Next: Next: Next
End Sub
```

Nothing too interesting so far – the code looks quite straightforward: we've got a big set of nested loops which test all possible passwords, and will eventually brute force the password – if you've ever tried it you'll know it works pretty well. The interesting part is not so much the code itself as the answer the code gives – the password which unlocks the sheet is normally something like ‘AAABAABA@1’. I’ve used this code quite a few times over the years, and always with similar results: the password always looks like some variation of this string. This got me thinking – surely it is unlikely that all the Spreadsheets I’ve been unlocking have had passwords of this form? So what’s going on?

After a bit of research, it turns out Excel doesn’t actually store the original password; instead it stores a 4-digit hexadecimal (i.e. 16-bit) hash of the password. Then to unlock the Spreadsheet, Excel hashes the password attempt and compares it to the stored hash. Since the set of all possible passwords is huge (full calculations below), and the set of all possible hashes is much smaller, we end up with a high probability of collisions between password attempts, meaning multiple passwords can open a given Spreadsheet.

I think the main reason Microsoft uses a hash function in this way, rather than just storing the unhashed password, is that the hash is stored by Excel as an unencrypted string within an xml file. In fact, an .xlsx file is basically just a zip containing a number of xml files. If Excel didn't first hash the password then you could simply unzip the Excel file, find the relevant xml file and read the password in any text editor. With the approach Excel selected, the best you can do is open the xml file and read the hash of the password, which does not help with getting back to the password due to the one-way nature of the hash function.

I couldn't find the name of the hash anywhere, but the following website has the fullest description I could find of the actual algorithm. As an aside, I miss the days when the internet was made up of websites like this – weird, individually curated, static HTML, obviously written by someone with deep expertise, no ads as well! Here’s the link:

http://chicago.sourceforge.net/devel/docs/excel/encrypt.html

And the process is as follows:

- take the ASCII values of all characters
- shift left the first character 1 bit, the second 2 bits and so on (use only the lower 15 bits and rotate all higher bits; the highest bit of the 16-bit value is always 0 [signed short])
- XOR all these values
- XOR the count of characters
- XOR the constant 0xCE4B

The worked example on that page finishes by XORing the constant 0xCE4B to give the result 0xFEF1 – this value is what occurs in the PASSWORD record.
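To make the steps above concrete, here is a minimal Python sketch of the hash (the function name is my own; the `while` loop implements the ‘rotate the higher bits’ step). For the ten-character password ‘abcdefghij’ it gives 0xFEF1, matching the result quoted above:

```python
def excel_sheet_hash(password: str) -> int:
    # 16-bit hash Excel stores for worksheet protection, per the steps above
    h = 0
    for i, ch in enumerate(password, start=1):
        v = ord(ch) << i                  # shift char 1 left 1 bit, char 2 left 2 bits, ...
        while v > 0x7FFF:                 # keep only the lower 15 bits,
            v = (v & 0x7FFF) | (v >> 15)  # rotating any higher bits back around
        h ^= v                            # XOR all these values together
    h ^= len(password)                    # XOR the count of characters
    h ^= 0xCE4B                           # XOR the constant 0xCE4B
    return h

print(hex(excel_sheet_hash("abcdefghij")))  # 0xfef1
```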

Now we know how the algorithm works, can we come up with a probabilistic bound on the number of trials we would need to check in order to be almost certain to get a collision when carrying out a brute force attack (as per the VBA code above)?

This is a fairly straightforward calculation – the probability of a random attempt matching the stored hash is $\frac{1}{65536}$. To keep the maths simple, if we assume independence of attempts, the probability of not getting a match after $n$ attempts is simply: $$ \left( 1 - \frac{1}{65536} \right)^n$$ The following table then displays these probabilities.

So we see that with 200,000 trials, there is a less than 5% chance of not having found a match.

We can also derive the answer directly. We are interested in the following probabilistic inequality: $$ \left( 1- \frac{1}{65536} \right)^k < 0.05$$ Taking logs of both sides of the corresponding equality gives us: $$\ln \left( 1- \frac{1}{65536}\right)^k = \ln( 0.05)$$ And then bringing down the $k$: $$k \ln \left( 1- \frac{1}{65536} \right) = \ln(0.05)$$ And then solving for $k$: $$k = \frac{ \ln(0.05)}{\ln \left(1- \frac{1}{65536}\right)} = 196{,}327$$
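The same numbers drop out of a couple of lines of Python (a quick check rather than anything rigorous):

```python
from math import log

p_match = 1 / 65536   # chance a random attempt matches the stored 16-bit hash

# probability of still having no match after 200,000 independent attempts
p_fail = (1 - p_match) ** 200_000
print(p_fail)         # just under 5%

# number of attempts needed for 95% confidence of a match
k = log(0.05) / log(1 - p_match)
print(round(k))       # 196,327
```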

As we explained above, in order to unlock the sheet, you don’t need to find the original password – any password which hashes to the same value will do.

I can only think of two basic approaches:

**1. Reverse the hashing algorithm.** Since this algorithm has been around for decades, is designed to be difficult to reverse, and so far has not been broken, this is a bit of a non-starter. Let me know if you manage it though!

**2. Brute force the password.** This is basically your only chance, but let’s run some maths on how difficult this problem is. There are $94$ possible characters (the printable ASCII characters), and in Excel 2010 the maximum password length is $255$, so in all there are $94^{255}$ possible passwords. Unfortunately for us, that is more than the total number of atoms in the universe $(10^{78})$. If we could check $1000$ passwords per second, it would take far longer than the current age of the universe to find the correct one.

Okay, so that’s not going to work, but can we make the process more efficient?

Let’s restrict ourselves to looking at passwords of a known length. Suppose we know the password is only a single character, in that case we simply need to check $94$ possible passwords, one of which should unlock the sheet, hence giving us our password. In order to extend this reasoning to passwords of arbitrary but known length, let’s think of the hashing algorithm as a function and consider its domain and range: Let’s call our hashing algorithm $F$, the set of all passwords of length $i$, $A_i$, and the set of all possible password hashes $B$. Then we have a function:

$$ F: A_i \to B$$

Now if we assume the hash values are approximately equally spread over all the possible values of $B$, then we can use the sizes of $A_i$ and $B$ to calculate the size of the preimage $F^{-1}(\{b\})$ of any given hash $b$ – loosely, the ‘kernel’. The size of $B$ doesn’t change: since we have a $4$-digit hexadecimal, it is of size $16^4 = 65{,}536$. And since the size of $A_i$ is $96^i$, we can then estimate the size of the kernel.

Let’s take $i=4$, and work it through:

$A_4$ is size $96^4 \approx 85m$, $|B| = 65{,}536$, hence $|F^{-1}(\{b\})| = \frac{85m}{65536} \approx 1{,}296$

Which means for every hash, there are around $1,296$ possible $4$-character passwords which produce this hash, and therefore may have been the original password.

Here is a table of the values for $i = 1$ to $6$:

In fact we can come up with a formula for the size of the kernel: $$\frac{96^i}{16^4} = 13.5 * 96^{i-3}$$
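Under the same uniform-spread assumption, the kernel sizes can be tabulated directly:

```python
# Estimated number of length-i passwords hashing to any one of the
# 16^4 = 65,536 possible hash values, assuming hashes are spread uniformly
for i in range(1, 7):
    print(i, 96**i / 16**4)   # e.g. i = 4 gives exactly 1,296
```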

Which we can see quickly approaches infinity as $i$ increases.

So for $i$ above $5$, the problem is basically intractable without further improvement. How would we progress if we had to? The only other idea I can come up with is to generate a huge array of all possible passwords (by brute forcing as above and recording all matches), and then start searching within this array for keywords – we could possibly use some sort of fuzzy lookup against a dictionary of keywords.

If the original password did not contain any words, but was instead just a fairly random collection of characters, then we really would be stumped. I imagine that this problem is basically impossible (and could probably be proved to be so using information theory and entropy).

No idea…I thought it might be fun to do a little bit of online detective work. You see this code all over the internet, but can we find the original source?

This site has quite an informative page on the algorithm:

https://www.spreadsheet1.com/sheet-protection.html

The author of the above page is good enough to credit his source, which is the following stack exchange page:

https://superuser.com/questions/45868/recover-sheet-protection-password-in-excel

Which in turn states that the code was ‘Author unknown but submitted by brettdj of www.experts-exchange.com’.

I had a quick look on Experts-Exchange, but that's as far as I could get – at least we found the guy's username.

I think the current VBA code is basically as quick as it is going to get – the hashing algorithm should work just as fast on a short input as on a 12-character input, so starting with a smaller value in the loop doesn’t really gain us anything. The only real improvement I can suggest is that if the Spreadsheet is running too slowly to test a sufficient number of hashes per second, the hashing algorithm could be implemented in python (which I started to do just out of interest, but it was a bit too fiddly to be fun). The password could then be brute forced from there in much quicker time, and once a valid password has been found, it can simply be typed into Excel.
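For what it's worth, here is a sketch of what that python version might look like (names are my own; `excel_sheet_hash` follows the algorithm described earlier, and `find_collision` mirrors the structure of the VBA loop – eleven characters from {A, B} plus one trailing printable character):

```python
from itertools import product

def excel_sheet_hash(password: str) -> int:
    # 16-bit worksheet protection hash, as described earlier in the post
    h = 0
    for i, ch in enumerate(password, start=1):
        v = ord(ch) << i
        while v > 0x7FFF:                 # rotate higher bits back into 15 bits
            v = (v & 0x7FFF) | (v >> 15)
        h ^= v
    return h ^ len(password) ^ 0xCE4B

def find_collision(target: int):
    # ~195k candidates against 65,536 possible hashes, so a match is very likely
    for head in product("AB", repeat=11):
        for tail in range(32, 127):
            pw = "".join(head) + chr(tail)
            if excel_sheet_hash(pw) == target:
                return pw
    return None
```

Once a colliding password drops out, it can simply be typed into Excel's unprotect dialog.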

I remember being told as a relatively new actuarial analyst that you "shouldn't inflate loss ratios" when experience rating. This must have been sitting at the back of my mind ever since, because last week, when a colleague asked me basically the same question about adjusting loss ratios for claims inflation, I remembered the conversation I'd had with my old boss and it finally clicked.

Let's go back a few years - it's 2016 - Justin Bieber has a song out in which he keeps apologising, and to all of us in the UK, Donald Trump (if you've even heard of him) is still just America's version of Alan Sugar. I was working on the pricing for a Quota Share - I can't remember the class of business - but I'd been given an aggregate loss triangle, ultimate premium estimates, and rate change information. I had carefully and meticulously projected my losses to ultimate, applied rate changes, and then set the trended and developed losses against ultimate premiums. I ended up with a table that looked something like this:

I then thought to myself ‘okay, this is a property class, I should probably inflate losses by about $3\%$ pa – the definition of a loss ratio is just losses divided by premium, therefore the correct way to adjust is to just inflate the ULR by $3\%$ pa’. I did this, sent the analysis to my boss at the time to review, and was told ‘you shouldn’t inflate loss ratios for claims inflation, otherwise you'd need to inflate the premium as well’. In my head I was thinking ‘hmmm, I don’t really get that... we’ve accounted for the change in premium by applying the rate change, and claims certainly do increase each year, but I don't see how premiums also inflate beyond rate movements?!’ But since he was the kind of actuary who is basically never wrong, and we were short on time, I just took his word for it.

I didn’t really think of it again, other than to remember that ‘you shouldn’t inflate loss ratios’, until last week one of my colleagues asked me if I knew what exactly this ‘Exposure trend’ adjustment in the experience rating modelling he’d been sent was. The actuaries who had prepared the work had taken the loss ratios, inflated them in line with claims inflation (what you're not supposed to do), but then applied an ‘exposure inflation’ to the premium. Ah-ha I thought to myself, this must be what my old boss meant by inflating premium.

I'm not sure why it took me so long to get to the bottom of what is, when you get down to it, a fairly simple adjustment. In my defence, you really don’t see this approach in ‘London Market’ style actuarial modelling - it's not covered in the IFoA exams, for example. Having investigated a little, it does seem to be an approach which is used more by US actuaries – possibly it’s in the CAS exams?

When I googled the term 'Exposure Trend', not a huge amount of useful info came up – there are a few threads on Actuarial Outpost which kinda mention it, but after mulling it over for a while I think I understand what is going on. I thought I’d write up my understanding in case anyone else is curious and stumbles across this post.

I thought it would be best to explain through an example, let’s suppose we are analysing a single risk over the course of one renewal. To keep things simple, we’ll assume it’s some form of property risk, which is covering Total Loss Only (TLO), i.e. we only pay out if the entire property is destroyed.

Let’s suppose for $2018$, the TIV is $1m$ USD, we are getting a net premium rate of $1\%$ of TIV, and we think there is a $0.5\%$ chance of a total loss. For $2019$, the value of the property has increased by $5\%$, we are still getting a net rate of $1\%$, and we think the underlying probability of a total loss is the same.

In this case we would say the rate change is $0\%$. That is:

$$ \frac{\text{Net rate}_{19}}{\text{Net rate}_{18}} = \frac{1\%}{1\%} = 1 $$

However we would say that claims inflation is $5\%$, which is the increase in expected claims. This follows from:

$$ \text{Claim Inflation} = \frac{ \text{Expected Claims}_{19}}{ \text{Expected Claims}_{18}} = \frac{0.5\%*1.05m}{0.5\%*1m} = 1.05$$

From first principles, our expected gross loss ratio (GLR) for $2018$ is:

$$\frac{0.5 \% *(TIV_{18})}{1 \% *(TIV_{18})} = 50 \%$$ And for $2019$ is: $$\frac{0.5\%*(TIV_{19})}{1\%*(TIV_{19})} = 50\%$$

i.e. they are the same!


The correct adjustment when on-levelling $2018$ to $2019$ should therefore result in a flat GLR – this follows as we’ve got the same GLR in each year when calculated above from first principles. If we’d taken the $18$ GLR, applied the claims inflation $1.05$ and applied the rate change $1.0$, then we might erroneously think the Gross Loss Ratio would be $50\%*1.05 = 52.5\%$. This would be equivalent to what I did in the opening paragraph of this post, the issue being that we haven’t accounted for trend in exposure, and our rate change is a measure of the change in net rate. If we include this exposure trend as an additional explicit adjustment this gives $50\%*1.05*1/1.05 = 50\%$. Which is the correct answer, as we can see by comparing to our first principles calculation.
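The whole worked example fits in a few lines of Python, which makes the wrong and right adjustments easy to compare side by side (variable names are my own):

```python
tiv_18, tiv_19 = 1_000_000, 1_050_000   # TIV, with 5% exposure trend into 2019
net_rate = 0.01                          # net premium rate of 1% of TIV in both years
p_loss = 0.005                           # 0.5% chance of a total loss in both years

glr_18 = (p_loss * tiv_18) / (net_rate * tiv_18)   # 50% from first principles
glr_19 = (p_loss * tiv_19) / (net_rate * tiv_19)   # also 50%

claim_inflation = (p_loss * tiv_19) / (p_loss * tiv_18)   # 1.05
rate_change = 1.0                                         # net rate is unchanged
exposure_trend = tiv_19 / tiv_18                          # 1.05

naive = glr_18 * claim_inflation * rate_change                      # 52.5% - wrong
adjusted = glr_18 * claim_inflation * rate_change / exposure_trend  # 50% - right

print(glr_19, naive, adjusted)
```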

So the fundamental problem is that our measure of rate change is a measure of the movement in the net rate – premium per unit of exposure – which already nets off any growth in the underlying exposure, whereas expected claims grow with the exposure as well; unless we make a corresponding explicit exposure trend adjustment, the on-levelled loss ratio will be overstated.

An advantage of making explicit, separate adjustments for exposure trend and claims inflation is that it allows us to apply different rates to each – which is probably more accurate. There’s no a priori justification as to why the two should always be the same: claims inflation will be affected by additional factors beyond changes in the value of the assets being insured, which may include changes in frequency, changes in court award inflation, etc…

It’s also interesting to note that the claims inflation here is of a different nature to what we would expect to see in a standard Collective Risk Model. In that case we inflate individual losses by the average change in claim severity from year to year.

The above discussion also shows the importance of understanding exactly what someone means by ‘rate change’. It may sound obvious but there are actually a number of subtle differences in what exactly we are attempting to measure when using this concept. Is it change in premium per unit of exposure, is it change in rate per dollar of exposure, or is it even change in rate adequacy? At various points I’ve seen all of these referred to as ‘rate change’.

There is a way of thinking about probability distributions that I’ve always found interesting, and to be honest I don’t think I’ve ever seen anyone else write about it. For each probability distribution, the CDF can be thought of as a partial infinite sum, or partial integral, identity, and the probability distribution is uniquely defined by this characterisation (with a few reasonable conditions).

I’m not sure whether most people will be completely lost (possibly because I’ve explained it badly), or whether they’ll think what I’ve just said is completely obvious. Let me give an example to help illustrate.

**Poisson Distribution as a partial infinite sum**

Start with the following identity:


$$ \sum_{i=0}^{\infty} \frac{ x^i}{i!} = e^{x}$$

And let's bring the exponential over to the other side.

$$ \sum_{i=0}^{\infty} \frac{ x^i}{i!} e^{-x} = 1$$

Let's state a few obvious facts about this equation. Firstly, this is an infinite sum (which I claimed above was related to probability distributions - so far so good). Secondly, the identity is true by the definition of $e^x$; all we need to do to prove the identity is show the convergence of the infinite sum, i.e. that $e^x$ is well defined. Finally, each individual summand is greater than or equal to 0.

With that established, if we define a function:

$$ F(x;k) = \sum_{i=0}^{k} \frac{ x^i}{i!} e^{-x}$$

That is, a function whose second parameter $k$ specifies the number of summands we should add together. We can see from the above identity that:

- The partial sum is strictly less than 1
- The sum converges to 1 as $k \rightarrow \infty$.

But wait, the formula for $F(x;k)$ above is actually just the formula for the CDF of a Poisson random variable! That’s interesting right? We started with an identity involving an infinite sum, we then normalised it so that the sum was equal to 1, then we defined a new function equal to the partial summation from this normalised series, and voila, we ended up with the CDF of a well-known probability distribution.
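This is easy to check numerically – a quick sketch (`poisson_cdf` is my own naming) showing the partial sums creeping up towards 1:

```python
from math import exp, factorial

def poisson_cdf(x: float, k: int) -> float:
    # Partial sum of the normalised series: equals P(N <= k) for N ~ Poisson(x)
    return sum(x**i / factorial(i) for i in range(k + 1)) * exp(-x)

for k in (0, 1, 5, 20):
    print(k, poisson_cdf(2.0, k))   # increases towards 1 as k grows
```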

Can we repeat this again? (I’ll give you a hint, we can)

Let’s examine an integral this time. We’ll use the following identity:

$$\int_{0}^{ \infty} e^{- \lambda x} dx = \frac{1}{\lambda}$$

An integral is basically just a type of infinite series, so let’s apply the same process, first we normalise:

$$ \lambda \int_{0}^{ \infty} e^{- \lambda x} dx = 1$$

Then define a function equal to the partial integral:

$$ F(y) = \lambda \int_{0}^{ y} e^{- \lambda x} dx $$

And we've ended up with the CDF of an Exponential distribution!
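A quick numerical sanity check that this partial integral really is the exponential CDF, $1 - e^{-\lambda y}$ (midpoint-rule integration; names are my own):

```python
from math import exp

def exp_cdf_partial(y: float, lam: float, steps: int = 100_000) -> float:
    # lam * integral from 0 to y of exp(-lam * x) dx, via the midpoint rule
    dx = y / steps
    return lam * sum(exp(-lam * (i + 0.5) * dx) * dx for i in range(steps))

print(exp_cdf_partial(2.0, 1.5))   # matches 1 - exp(-3.0)
```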

This construction even works when we use more complicated integrals. The Euler integral of the first kind is defined as:

$$B(x,y)=\int_{0}^{1}t^{{x-1}}(1-t)^{{y-1}} dt =\frac{\Gamma (x)\Gamma (y)}{\Gamma (x+y)}$$

This allows us to normalise:

$$\frac{\int_{0}^{1}t^{{x-1}}(1-t)^{{y-1}}dt}{B(x,y)} = 1$$

And once again, we can construct a probability distribution:

$$B(x;a,b) = \frac{\int_{0}^{x}t^{{a-1}}(1-t)^{{b-1}}dt}{B(a,b)}$$

Which is of course the definition of a Beta Distribution. This definition bears some similarity to that of the exponential distribution, in that the normalisation constant is defined by the very integral which we are applying it to.
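As before, the construction can be sanity checked numerically (midpoint-rule integration, with `math.gamma` supplying the normalising constant; names are my own):

```python
from math import gamma

def beta_cdf(x: float, a: float, b: float, steps: int = 100_000) -> float:
    # Normalised partial integral of t^(a-1) * (1-t)^(b-1), i.e. the Beta CDF
    norm = gamma(a) * gamma(b) / gamma(a + b)   # B(a, b)
    dt = x / steps
    total = sum(((i + 0.5) * dt) ** (a - 1) * (1 - (i + 0.5) * dt) ** (b - 1) * dt
                for i in range(steps))
    return total / norm

print(beta_cdf(1.0, 2, 3))   # the full integral normalises to 1
print(beta_cdf(0.5, 2, 2))   # 0.5, by symmetry of Beta(2, 2)
```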

**Conclusion**

So can we do anything useful with this information? Well, not particularly, but I found it quite insightful in terms of how these crazy formulas were discovered in the first place, and we could potentially use the above process to derive our own distributions – all we need is an interesting integral or infinite sum, and by normalising and taking a partial sum/integral we've defined a new way of partitioning the unit interval.

Hopefully you found that interesting, let me know if you have any thoughts by leaving a comment in the comment box below!
