I’m reading ‘Information Theory, inference and learning algorithms' by David MacKay at the moment and I'm really enjoying it so far. One cool trick that he introduces early in the book is a method of deriving Stirling’s approximation through the use of the Gaussian approximation to the Poisson Distribution.
Derivation Suppose we have a Poisson Distribution with parameter $\lambda$, by definition: $$P(X=k)=e^{ \lambda} \frac{\lambda^k}{k!}$$ If we replace $\lambda$ with $r$, and consider the probability that $X=r$, we get: $$P(X=r)=e^{ r} \frac{r^r }{r!}$$ Now suppose we are restricting ourselves to large value of $r$, in which case, a Poisson distribution is well approximated by a Gaussian distribution with mean and variance both equal to $r$. Setting up this approximation. $$e^{ r} \frac{r^r }{r!} \approx \frac{1}{\sqrt{2 \pi r}} e^{ \frac{(rr)^2}{2}}$$ From which the right hand side simplifies further to give: $$ e^{ r} \frac{r^r }{r!} \approx \frac{1}{\sqrt{2 \pi r}} e^{0}$$ Giving: $$ e^{ r} \frac{r^r }{r!} \approx \frac{1}{\sqrt{2 \pi r}}$$ Which when we rearrange to obtain Stirling’s approximation: $$r! \approx e^{r} r^r \sqrt{2 \pi r}$$ Intuition We probably shouldn’t be surprised that we’ve found a link between $e^r$ and $\pi$. The fact that the two are linked can be easily drawn out using Euler’s formula: $$e^{inx}=\cos(nx)+i\sin(nx)$$ And examining the value $\pi$: $$e^{i\pi }+1=0$$ So there is a clearly a link between $e^x$, and $\pi$, but its not obvious we can draw in the factorial as well. Above we teased out a link between all of the following: $n!$, $n^n$, $e^r$, and $\pi$, which is interesting for its own sake, but moreover provides intuition as to why the Gaussian approximation to the Poisson distribution works. It should probably be noted that we’ve implicitly invoked the Central Limit Theorem to establish the approximation, and the CTL is some pretty heavy machinery! The proof from first principles of the CTL is much more involved that the proof of Stirling's approximation, so the derivation above should be thought of as strictly a process of drawing out interesting parallels, rather than a path for proving the result from first principles. Use in Actuarial Modelling I tend to use the Gaussian approximation quite a lot at work – any time I’m modelling a claim count frequency in a Spreadsheet and I’ve got a reasonable number of annual claims, I’m a proponent of just using a discretised Gaussian, with a min applied at 0, and with a variance and mean set as required. This has a couple of advantages to using either a Poisson or Negative Binomial:
Coming up with interesting problems David Mackay seemed to have an eye for interesting problems – reading up on Wikipedia about him he competed in a Maths Olympiad while a student. I do wonder if there is a correlation between wellwritten, entertaining textbooks, and authors who have a background in competitive maths problems. The link between Stirling’s approximation, Gaussian, and Poisson is just the sort of thing that could make an interesting problem in a competitive maths competition. I also realised after writoing this post that I’d already written about something pretty similar before, where we can use Stirling’s approximation to easily estimate the probability that a Poisson value is equal to it’s mean. Here’s the link: www.lewiswalsh.net/blog/poissondistributionwhatistheprobabilitythedistributionisequaltothemean
0 Comments
Piketty also adds colour by tying his observations to the literature written at the time (Austen, Dumas, Balzac), and how the assumptions made by the authors around how money, income and capital work are also reflected in the economic data that Piketty obtained.
Hopefully I've convinced you Piketty's programme is a worthwhile one, but that still leaves the fundamental question  is his analysis correct? That's a much harder question to answer, and to be honest I really don't feel qualified to pass judgement on the entirety of the book, other than to say it strikes me as pretty convincing from the limited amount of time I've spent on it. In an attempt to contribute in some small way to the larger conversation around Piketty's work, I thought I'd write about one specific argument that Piketty makes that I found less convincing than other parts of the book. Around 120 pages in, Piketty introduces what he calls the ‘Second Fundamental Law of Capitalism’, and this is where I started having difficulties in following his argument. The Second Fundamental Law of Capitalism The rule is defined as follows: $$ B = \frac{s} { g} $$ Where $B$ , as in Piketty’s first fundamental rule, is defined as the ratio of Capital (the total stock of public and private wealth in the economy) to Income (NNP): $$B = \frac{ \text{Capital}}{\text{Income}}$$ And where $g$ is the growth rate, and $s$ is the saving rate. Unlike the first rule which is an accounting identity, and therefore true by definition, the second rule is only true ‘in the long run’. It is an equilibrium that the market will move to over time, and the following argument is given by Piketty: “The argument is elementary. Let me illustrate it with an example. In concrete terms: if a country is saving 12 percent of its income every year, and if its initial capital stock is equal to six years of income, then the capital stock will grow at 2 percent a year, thus at exactly the same rate as national income, so that the capital/income ratio will remain stable. By contrast, if the capital stock is less than six years of income, then a savings rate of 12 percent will cause the capital stock to grow at a rate greater than 2 percent a year and therefore faster than income, so that the capital/income ratio will increase until it attains its equilibrium level. Conversely, if the capital stock is greater than six years of annual income, then a savings rate of 12 percent implies that capital is growing at less than 2 percent a year, so that the capital/income ratio cannot be maintained at that level and will therefore decrease until it reaches equilibrium.” I’ve got to admit that this was the first part in the book where I really struggled to follow Piketty’s reasoning – possibly this was obvious to other people, but it wasn’t to me! Analysis – what does he mean? Before we get any further, let’s unpick exactly what Piketty means by all the terms in his formulation of the law: Income = Net national product = Gross Net product *0.9 (where the factor of 0.9 is to account for depreciation of Capital) $g$ = growth rate, but growth of what? Here it is specifically growth in income, so while this is not exactly the same as GDP growth it’s pretty close. If we assume net exports do not change, and the depreciation factor (0.9) is fixed, then the two will be equal. $s$ = saving rate – by definition this is the ratio of additional capital divided by income. Since income here is net of depreciation, we are already subtracting capital depreciation from income and not including this in our saving rate. Let’s play around with a few values, splitting growth $g$, into per capita growth and demographic growth we get the following. Note that Total growth is simply the sum of demographic and per capita growth, and Beta is calculated from the other values using the law.
So why does Piketty introduce this law?
The argument that Piketty is intending to tease out from this equality is the following:
In fact using $g=1.5 \%$ as a long term average, we can expect Beta to crystallise around a Beta of $8$! Much higher than it has been for the past 100 years. Analysis  convergence As Piketty is quick to point out, this is a long run equilibrium towards which an economy will move. Moreover, it should be noted that the convergence of this process is incredibly slow. Here is a graph plotting the evolution of Beta, from a starting point of 5, under the assumption of $g=1.5 \%$, $s = 12 \%$:
So we see that after 30 years ( i.e. approx. one generation), Beta has only increased from its starting point of $5$ to around $6$, it then takes another generation and a half to get to $7$, which is still short of its long run equilibrium of $8$.
Analysis  Is this rule true? Piketty is of course going to want to use his formula to say interesting things about the historic evolution of the Capital/Income ratio, and also use it to help predict future movements in Beta. I think this is where we start to push the boundaries of what we can easily reason, without first slowing down and methodically examining our implicit assumptions. For example – is a fixed saving rate (independent of changes in both Beta, and Growth) reasonable? Remember that the saving rate here is a saving rate on net income. So that as Beta increases, we are already having to put more money into upkeep of our current level of capital, so that a fixed net saving rate is actually consistent with an increasing gross saving rate, not a fixed gross saving rate. An increasing gross saving rate might be a reasonable assumption or it might not – this then becomes an empirical question rather than something we can reason about a priori. Another question is how the law performs for very low rates of $g$, which is in fact how Piketty is intending to use the equation. By inspection, we can see that:
As $g \rightarrow 0$, $B \rightarrow \infty $.
What is the mechanism by which this occurs in practice? It’s simply that if GDP does not grow from one year to the next, but the net saving rate is still positive, then the stock of capital will still increase, however income has not increased. This does however mean that an ever increasing share of the economy is going towards paying for capital depreciation.
Conclusion
Piketty’s law is still useful, and I do find it convincing to a first order of approximation. But I do think this section of the book could have benefited from more time spent highlighting some of the distortions potentially caused by using net income as our primary measure of income. There are multiple theoretical models used in macroeconomics, and it would have been useful for Piketty to help frame his law within the established paradigm. I'm always begrudgingly impressed by brokers and underwriters who can do most of their job without resorting to computers or a calculator. If you give them a gross premium for a layer, they can reel off gross and net rates on line, the implied loss cost, and give you an estimate of the price for a higher layer using an ILF in their head. When I'm working, so much actuarial modelling requires a computer (sampling from probability distributions, Monte Carlo methods, etc.) that just to give any answer at all I need to fire up Excel and make a Spreadsheet. So anytime there's a chance to do some shortcuts I'm always all for it! One mental calculation trick which is quite useful when working with compound interest is called the Rule of 72. It states that for interest rate $i$, under growth from annual compound interest, it takes approximately $\frac{72}{i} $ years for a given value to double in size. Why does it work? Here is a quick derivation showing why this works, all we need is to manipulate the exact solution with logarithms and then play around with the Taylor expansion. We are interested in the following identity, which gives the exact value of $n$ for which an investment doubles under compound interest: $$ \left( 1 + \frac{i}{100} \right)^n = 2$$ Taking logs of both sides gives the following: $$ ln \left( 1 + \frac{i}{100} \right)^n = ln(2)$$ And then bringing down the $n$: $$n* ln \left( 1 + \frac{i}{100} \right) = ln(2)$$ And finally solving for $n$: $$n = \frac {ln(2)} { ln \left( 1 + \frac{i}{100} \right) }$$ So the above gives us a formula for $n$, the number of years. We now need to come up with a simple approximation to this function, and we do so by examining the Taylor expansion denominator of the right have side: We can compute the value of $ln(2)$:
$$ln(2) \approx 69.3 \%$$
The Taylor expansion of the denominator is:
$$ln \left( 1 + \frac{i}{100} \right) = \frac{r}{100} – \frac{r^2}{20000} + … $$ In our case, it is more convenient to write this as: $$ln \left( 1 + \frac{i}{100} \right) = \frac{1}{100} \left( r – \frac{r^2}{200} + … \right) $$ For $r<10$, the second term is less than $\frac{100}{200} = 0.5$. Given the first term is of the order $10$, this means we are only throwing out an adjustment of less than $5 \%$ to our final answer. Taking just the first term of the Taylor expansion, we end up with: $$n \approx \frac{69.3 \%}{\frac{1}{100} * \frac{1}{r}}$$ And rearranging gives: $$n \approx \frac{69.3}{r}$$ So we see, we are pretty close to $ n \approx \frac{72}{r}$. Why 72? We saw above that using just the first term of the Taylor Expansion suggests we should be using the ‘rule of 69.3%' instead. Why then is this the rule of 72? There are two main reasons, the first is that for most of the interest rates we are interested in, the Rule of 72 actually gives a better approximation to the exact solution, the following table compares the exact solution, the approximation given by the ‘Rule of 69’, and the approximation given by the Rule of 72:
The reason for this is that for interest rates in the 4%10% range, the second term of the Taylor expansion is not completely negligible, and act to make the denominator slightly smaller and hence the fraction slightly bigger. It turns out 72 is quite a good fudge factor to account for this.
Another reason for using 72 over other close numbers is that 72 has a lot of divisors, in particular out of all the integers within 10 of 72, 72 has the most divisors. The following table displays the divisors function d(n), for values of n between 60 and 80. 72 clearly stands out as a good candidate.
The rule of 72 in Actuarial Modelling
The main use I find for this trick is in mentally adjusting historic claims for claims inflation. I know that if I put in 6% claims inflation, my trended losses will double in size from their original level approximately every 12 years. Other uses include when analysing investment returns, thinking about the effects of monetary inflation, or it can even be useful when thinking about the effects of discounting. Can we apply the Rule of 72 anywhere else? As an aside, we should be careful when attempting to apply the rule of 72 over too long a time period. Say we are watching a movie set in 1940, can we use the Rule of 72 to estimate what values in the movie are equivalent to now? Let's set up an example and see why it doesn't really work in practice. Let's suppose an item in our movie costs 10 dollars. First we need to pick an average inflation rate for the intervening period (something in the range of 34% is probably reasonable). We can then reason as follows; 1940 was 80 years ago, at 4% inflation, $\frac{72}{4} = 18$, and we’ve had approx. 4 lots of 18 years in that time. Therefore the price would have doubled 4 times, or will now be a factor of 16. Suggesting that 10 dollars in 1940 is now worth around 160 dollars in today's terms. It turns out that this doesn’t really work though, let’s check it against another calculation. The average price of a new car in 1940 was around 800 dollars and the average price now is around 35k, which is a factor of 43.75, quite a bit higher than 16. The issue with using inflation figures like these over very long time periods, is for a given year the difference in the underlying goods is fairly small, therefore a simple percentage change in price is an appropriate measure. When we chain together a large number of annual changes, after a certain number of years, the underlying goods have almost completely changed from the first year to the last. For this reason, simply multiplying an inflation rate across decades completely ignores both improvements in the quality of goods over time, and changes in standards of living, so doesn't really convey the information that we are actually interested in. Photo by David Preston What is a floating deductible? Excess of Loss contacts for Aviation books, specifically those covering airline risks (planes with more than 50 seats) often use a special type of deductible, called a floating deductible. Instead of applying a fixed amount to the loss in order to calculate recoveries, the deductible varies based on the size of the market loss and the line written by the insurer. These types of deductibles are reasonably common, I’d estimate something like 25% of airline accounts I’ve seen have had one. As an aside, these policy features are almost always referred to as deductibles, but technically are not actually deductibles from a legal perspective, they should probably be referred to as floating attachment instead. The definition of a deductible requires that it be deducted from the policy limit rather than specifying the point above which the policy limit sits. That’s a discussion for another day though! The idea is that the floating deductible should be lower for an airline on which an insurer takes a smaller line, and should be higher for an airline for which the insurer takes a bigger line. In this sense they operate somewhat like a surplus lines contract in property reinsurance. Before I get into my issues with them, let’s quickly review how they work in the first place. An example When binding an Excess of Loss contract with a floating deductible, we need to specify the following values upfront:
And we need to know the following additional information about a given loss in order to calculate recoveries from said loss:
A standard XoL recovery calculation with the fixed attachment given above, would first calculate the UNL (200m*0.75%=1.5m), and then deduct the fixed attachment from this (1.5m1.5m=0). Meaning in this case, for this loss and this line size, nothing would be recovered from the XoL. To calculate the recovery from XoL with a floating deductible, we would once again calculate the insured’s UNL 1.5m. However we now need to calculate the applicable deductible, this will be the lesser of 1.5m (the fixed attachment), and the insurer’s effective line (defined as their UNL divided by the market loss = 1.5m/200m) multiplied by the Original Market Loss as defined in the contract. In this case, the effective line would be 0.75%, and the Original Market Loss would be 150m, hence; 0.75%*150m = 1.125m. Since this is less than the 1.5m fixed attachment, the attachment we should use is 1.125m our limit is always just 18.5m, and doesn’t change if the attachment drops down. We would therefore calculate recoveries to this contract, for this loss size and risk, as if the layer was a 18.5m xs 1.125. Meaning the ceded loss would be 0.375m, and the net position would be 1.125m. Here’s the same calculation in an easier to follow format: So…. what’s the issue?
This may seem quite sensible so far, however the issue is with the wording. The following is an example of a fairly standard London Market wording, taken from an anonymised slip which I came across a few years ago. Priority: USD 10,000,000 each and every loss or an amount equal to the “Reinsured’s Portion’ of the total Original Insured Loss sustained by the original insured(s) of USD 200,000,000 each and every loss, whichever the lesser. … Reinsuring Clause Reinsurers shall only be liable if and when the ultimate net loss paid by the Reinsured in respect of the interest as defined herein exceeds USD 10,000,000 each and every loss or an amount equal to the Reinsured’s Proportion of the total Original Insured Loss sustained by the original insured(s) of USD 200,000,000 or currency equivalent, each and every loss, whichever the lesser (herein referred to as the “Priority”) For the purpose herein, the Reinsured’s Proportion shall be deemed to be a percentage calculated as follows, irrespective of the attachment dates of the policies giving rise to the Reinsured’s ultimate net loss and the Original Insured Loss: Reinsured Ultimate Net Loss / Original Insured Loss … The Original Insured Loss shall be defined as the total amount incurred by the insurance industry including any proportional coinsurance and or selfinsurance of the original insured(s), net of any recovery from any other source What’s going on here is that we’ve defined the effective line to be the Reinsured’s unl divided by the 100% market loss. First problem From a legal perspective, how would an insurer (or reinsurer for that matter), prove what the 100% insured market loss is? The insurer obviously knows their share of the loss, however what if this is a split placement with 70% placed in London on the same slip, 15% placed in a local market (let’s say Indonesia?), and a shortfall cover (15%) placed in Bermuda. Due to the different jurisdictions, let’s say the Bermudian cover has a number of exclusions and subjectivities, and the Indonesian cover operates under the Indonesian legal system which does not publically disclose private contract details. Even if the insurer is able to find out through a friendly broker what the other markets are paying, and therefore have a good sense of what the 100% market loss is, they may not have a legal right to this information. The airline does have a legal right to the information, however the reinsurance contract is a contract between the insured and reinsured, the airline is not a party to the reinsurance contract. The point is whether the insurer and reinsured have the legal right to the information. The above issues may sound quite theoretical, and in practice there are normally no issues with collecting on these types of contracts. But to my mind, legal language should bear up to scrutiny even when stretched – that’s precisely when you are going to rely on it. My contention is that as a general rule, it is a bad idea to rely on information in a contract which you do not have an automatic legal right to obtain. The second problem The intention with this wording, and with contracts of this form is that the effective line should basically be the same as the insured’s signed line. Assuming everything is straightforward, if the insurer takes a x% line with a limit of Y bn. If the loss is anything less than Y bn, then the insured’s effective line will simply be x%*Size of Loss / Size of loss. i.e. x%. My guess as to why it is worded this way rather than just taking the actual signed line is that we don’t want to open ourselves to a issues around what exactly we mean by ‘the signed line’ – what if the insured has exposure through two contracts both of which have different signed lines, what if there is an inuring Risk Excess which effectively nets down the gross signed line – should we then use the gross or net line? By couching the contract in terms of UNLs and Market losses we attempt to avoid these ambiguities Let me give you a scenario though where this wording does fall down: Scenario 1 – clash loss Let’s suppose there is a midair collision between two planes. Each results in an insured market loss of USD 1bn, then the Original Insured Loss is USD 2bn. If our insurer takes a 10% line on the first airline, but does not write the second airline, then their effective line is 10% * 1bn / 2bn = 5%... hmmm this is definitely equal to their signed line of 10%. You may think this is a pretty remote possibility, after all in the history of modern commercial aviation such an event has not occurred. What about the following scenario which does occur fairly regularly? Scenario 2 – airline/manufacturer split liability Suppose now there is a loss involving a single plane, and the size of the loss is once again USD 1bn, and that our insurer once again has a 10% line. In this case though, what if the manufacturer is found 50% responsible? Now the insurer only has a UNL of USD 500m, and yet once again, in the calculation of their floating deductible, we do the following: 10% * 500m/1bn = 5%. Hmmm, once again our effective line is below our signed line, and the floating deductible will drop down even further than intended. Suggested alternative wording My suggested wording, and I’m not a lawyer so this is categorically not legal advice, is to retain the basic definition of effective line  as UNL divided by some version of the 100% market loss  by doing so we still neatly sidestep the issues mentioned above around gross vs net lines, or exposure from multiple slips, but instead to replace the definition of Original Insured Loss with some variation of the following ‘the proportion of the Original Insured Loss, for which the insured derives a UNL through their involvement in some contract of insurance, or otherwise’. Basically the intention is to restrict the market loss, only to those contracts through which the insurer has an involvement. This deals with both issues – the insurer would not be able to net down their line further through references to insured losses which are nothing to do with them, as in the case of scenario 1 and 2 above, and secondly it restrict the information requirements to contracts which the insurer has an automatic legal right to have knowledge of since by definition they will be a party to the contract. I did run this idea past a few reinsurance brokers a couple of years ago, and they thought it made sense. The only downside from their perspective is that it makes the client's reinsurance slightly less responsive i.e. they knew about the strange quirk whereby the floating deductible dropped in the event of a manufacturer involvement, and saw it as a bonus for their client, which was often not fully priced in by the reinsurer. They therefore had little incentive to attempt to drive through such a change. The only people who would have an incentive to push through this change would be the larger reinsurers, though I suspect they will not do so until they've already been burnt and attempted to rely on the wording in a court case and, at which point they may find it does not quite operate in the way they intended.
The original idea for this post came from a slight quirk I noticed in some VBA code I was running (code pasted below) If you've ever needed to remove the protect sheet from a Spreadsheet without knowing the password, then you probably recognise it.
Sub RemovePassword() Dim i As Integer, j As Integer, k As Integer Dim l As Integer, m As Integer, n As Integer Dim i1 As Integer, i2 As Integer, i3 As Integer Dim i4 As Integer, i5 As Integer, i6 As Integer On Error Resume Next For i = 65 To 66: For j = 65 To 66: For k = 65 To 66 For l = 65 To 66: For m = 65 To 66: For i1 = 65 To 66 For i2 = 65 To 66: For i3 = 65 To 66: For i4 = 65 To 66 For i5 = 65 To 66: For i6 = 65 To 66: For n = 32 To 126 ActiveSheet.Unprotect Chr(i) & Chr(j) & Chr(k) & _ Chr(l) & Chr(m) & Chr(i1) & Chr(i2) & Chr(i3) & _ Chr(i4) & Chr(i5) & Chr(i6) & Chr(n) If ActiveSheet.ProtectContents = False Then MsgBox "Password is " & Chr(i) & Chr(j) & _ Chr(k) & Chr(l) & Chr(m) & Chr(i1) & Chr(i2) & _ Chr(i3) & Chr(i4) & Chr(i5) & Chr(i6) & Chr(n) Exit Sub End If Next: Next: Next: Next: Next: Next Next: Next: Next: Next: Next: Next End Sub Nothing too interesting so far, the code looks quite straight forward  we've got a big set of nested loops which appear to test all possible passwords, and will eventually brute force the password  if you've ever tried it you'll know it works pretty well. The interesting part is not so much the code itself, as the answer the code gives  the password which unlocks the sheet is normally something like ‘AAABAABA@1’. I’ve used this code quite a few times over the years, and always with similar results, the password always looks like some variation of this string. This got me thinking  surely it is unlikely that all the Spreadsheets I’ve been unlocking have had passwords of this form? So what’s going on? After a bit of research, it turns out Excel doesn’t actually store the original password, instead it stores a 4digit hash of the password. Then to unlock the Spreadsheet, Excel hashes the password attempt and compares it to the stored hashed password. Since the size of all possible passwords is huge (full calculations below), and the size of all possible hashes is much smaller, we end up with a high probability of collisions between password attempts, meaning multiple passwords can open a given Spreadsheet. I think the main reason Microsoft uses a hash function in this way rather than just storing the unhashed password is that the hash is stored by Excel as an unencrypted string within a xml file. In fact, an .xlsx file is basically just a zip containing a number of xml files. If Excel didn't first hash the password then you could simply unzip Excel file, find the relevant xml file and read the password from any text editor. With the encryption Excel selected, the best you can do is open the xml file and read the hash of the password, which does not help with getting back to the password due to the nature of the hash function. What hash function is used? I couldn't find the name of the hash anywhere, but the following website has the fullest description I could find of the actual algorithm. As an aside, I miss the days when the internet was made up of websites like this – weird, individually curated, static HTML, obviously written by someone with deep expertise, no ads as well! Here’s the link: http://chicago.sourceforge.net/devel/docs/excel/encrypt.html And the process is as follows:
Here is the algorithm to create the hash value:
a > 0x61 << 1 == 0x00C2 b > 0x62 << 2 == 0x0188 c > 0x63 << 3 == 0x0318 d > 0x64 << 4 == 0x0640 e > 0x65 << 5 == 0x0CA0 f > 0x66 << 6 == 0x1980 g > 0x67 << 7 == 0x3380 h > 0x68 << 8 == 0x6800 i > 0x69 << 9 == 0x5201 (unrotated: 0xD200) j > 0x6A << 10 == 0x2803 (unrotated: 0x1A800) count: 0x000A constant: 0xCE4B result: 0xFEF1 This value occurs in the PASSWORD record.
How many trials will we need to decrypt?
Now we know how the algorithm works, can we come up with a probabilistic bound on the number of trials we would need to check in order to be almost certain to get a collision when carrying out a brute force attack (as per the VBA code above)?
This is a fairly straight forward calculation – the probability of guessing incorrectly for a random attempt is $\frac{1}{65536}$. To keep the maths simple, if we assume independence of attempts, the probability of not getting the password after $n$ attempts is simply: $$ \left( \frac{1}{65536} \right)^n$$ The following table then displays these probabilities
So we see that with 200,000 trials, there is a less than 5% chance of not having found a match.
We can also derive the answer directly, we are interested in the following probabilistic inequality: $$ \left( 1 \frac{1}{65536} \right)^k < 0.05$$ Taking logs of both sides gives us: $$ln \left( 1 \frac{1}{65536}\right)^k = ln( 0.05)$$ And then bringing down the k: $$k * ln \left( 1 \frac{1}{65536} \right) = ln(0.05)$$ And then solving for $k$: $$k = \frac{ ln(0.05)}{ln \left(1 \frac{1}{65536}\right)} = 196,327$$
Can I work backwards to find the password from a given hash?
As we explained above, in order to decrypt the sheet, you don’t need to find the password, you only need to find a password. Let’s suppose however, for some reason we particularly wanted to find the password which was used, is there any other method to work backwards? I can only think of two basic approaches: Option 1 – find an inversion of the hashing algorithm. Since this algorithm has been around for decades, and is designed to be difficult to reverse, and so far has not been broken, this is a bit of a nonstarter. Let me know if you manage it though! Option 2 – Brute force it. This is basically your only chance, but let’s run some maths on how difficult this problem is. There are $94$ possible characters (AZ, az,09), and in Excel 2010, the maximum password length is $255$, so in all there are, $94^{255}$ possible passwords. Unfortunately for us, that is more than the total number of atoms in the universe $(10^{78})$. If we could check $1000$ passwords per second, then it would take far longer than the current age of the universe to find the correct one. Okay, so that’s not going to work, but can we make the process more efficient? Let’s restrict ourselves to looking at passwords of a known length. Suppose we know the password is only a single character, in that case we simply need to check $94$ possible passwords, one of which should unlock the sheet, hence giving us our password. In order to extend this reasoning to passwords of arbitrary but known length, let’s think of the hashing algorithm as a function and consider its domain and range: Let’s call our hashing algorithm $F$, the set of all passwords of length $i$, $A_i$, and the set of all possible password hashes $B$. Then we have a function: $$ F: A_i > B$$ Now if we assume the algorithm is approximately equally spread over all the possible values of $B$, then we can use the size of $B$ to calculate the size of the kernel $F^{1}[A_i]$. The size of $B$ doesn’t change. Since we have a $4$ digit hexadecimal, it is of size $16^4$, and since we know the size of $A_i$ is $96$, we can then estimate the size of the kernel. Let’s take $i=4$, and work it through:
$A_4$ is size $96^4 = 85m$, $B = 65536$, hence $F^{1}[A_4] = \frac{85m}{65536} = 124416$
Which means for every hash, there are $124,416$ possible $4$ digit passwords which can create this hash, and therefore may have been the original password. Here is a table of the values for $I = 1$ to $6$:
In fact we can come up with a formula for size of the kernel: $$\frac{96^i}{16^4} \sim 13.5 * 96^{i2}$$
Which we can see quickly approaches infinity as $i$ increases. So for $i$ above $5$, the problem is basically intractable without further improvement. How would we progress if we had to? The only other idea I can come up with is to generate a huge array of all possible passwords (based on brute forcing like above and recording all matches), and then start searching within this array for keywords. We could possibly use some sort of fuzzylookup against a dictionary of keywords. If the original password did not contain any words, but was instead just a fairly random collection of characters then we really would be stumped. I imagine that this problem is basically impossible (and could probably be proved to be so using information theory and entropy) Who invented this algorithm? No idea…I thought it might be fun to do a little bit of online detective work. You see this code all over the internet, but can we find the original source? This site has quite an informative page on the algorithm: https://www.spreadsheet1.com/sheetprotection.html The author of the above page is good enough to credit his source, which is the following stack exchange page: https://superuser.com/questions/45868/recoversheetprotectionpasswordinexcel Which in turns states that the code was ‘'Author unknown but submitted by brettdj of www.expertsexchange.com’ I had a quick look on Expertsexchange, but that's as far as I could get, at least we found the guy's username. Can we speed the algorithm up or improve it in any way? I think the current VBA code is basically as quick as it is going to get  the hashing algorithm should work just as fast with a short input as a 12 character input, so starting with a smaller value in the loop doesn’t really get us anything. The only real improvement I can suggest, is that if the Spreadsheet is running too slowly to be able to test a sufficient number of hashes per second, then the hashing algorithm could be implemented in python (which I started to do just out of interest, but it was a bit too fiddly to be fun). Once the algorithm is set up, the password could then be brute forced from there (in much quicker time), and one a valid password has been found, this can then be simply typed into Excel. I remember being told as a relatively new actuarial analyst that you "shouldn't inflate loss ratios" when experience rating. This must have been sitting at the back of my mind ever since, because last week, when a colleague asked me basically the same question about adjusting loss ratios for claims inflation, I remembered the conversation I'd had with my old boss and it finally clicked. Let's go back a few years  it's 2016  Justin Bieber has a song out in which he keeps apologising, and to all of us in the UK, Donald Trump (if you've even heard of him) is still just the America's version of Alan Sugar. I was working on the pricing for a Quota Share, I can't remember the class of business, but I'd been given an aggregate loss triangle, ultimate premium estimates, and rate change information. I had carefully and meticulously projected my losses to ultimate, applied rate changes, and then set the trended and developed losses against ultimate premiums. I ended up with a table that looked something like this: (Note these numbers are completely made up but should give you a gist of what I'm talking about.) I then thought to myself ‘okay this is a property class, I should probably inflate losses by about $3\%$ pa’, the definition of a loss ratio is just losses divided by premium, therefore the correct way to adjust is to just inflate the ULR by $3\%$ pa. I did this, sent the analysis to my boss at the time to review, and was told ‘you shouldn’t inflate loss ratios for claims inflation, otherwise you'd need to inflate the premium as well’ – in my head I was like ‘hmmm, I don’t really get that...’ we’ve accounted for the change in premium by applying the rate change, claims certainly do increase each year, but I don't get how premiums also 'inflate' beyond rate movements?! but since he was the kind of actuary who is basically never wrong and we were short on time, I just took his word for it. I didn’t really think of it again, other than to remember that ‘you shouldn’t inflate loss ratios’, until last week one of my colleagues asked me if I knew what exactly this ‘Exposure trend’ adjustment in the experience rating modelling he’d been sent was. The actuaries who had prepared the work had taken the loss ratios, inflated them in line with claims inflation (what you're not supposed to do), but then applied an ‘exposure inflation’ to the premium. Ahha I thought to myself, this must be what my old boss meant by inflating premium. I'm not sure why it took me so long to get to the bottom of what, is when you get down to it, a fairly simple adjustment. In my defence, you really don’t see this approach in ‘London Market’ style actuarial modelling  it's not covered in the IFoA exams for example. Having investigated a little, it does seem to be an approach which is used by US actuaries more – possibly it’s in the CAS exams? When I googled the term 'Exposure Trend', not a huge amount of useful info came up – there are a few threads on Actuarial Outpost which kinda mention it, but after mulling it over for a while I think I understand what is going on. I thought I’d write up my understanding in case anyone else is curious and stumbles across this post. Proof by Example I thought it would be best to explain through an example, let’s suppose we are analysing a single risk over the course of one renewal. To keep things simple, we’ll assume it’s some form of property risk, which is covering Total Loss Only (TLO), i.e. we only pay out if the entire property is destroyed. Let’s suppose for $2018$, the TIV is $1m$ USD, we are getting a net premium rate of $1\%$ of TIV, and we think there is a $0.5\%$ chance of a total loss. For $2019$, the value of the property has increased by $5\%$, we are still getting a net rate of $1\%$, and we think the underlying probability of a total loss is the same. In this case we would say the rate change is $0\%$. That is: $$ \frac{\text{Net rate}_{19}}{\text{Net rate}_{18}} = \frac{1\%}{1\%} = 1 $$ However we would say that claim inflation is $2.5\%$, which is the increase in expected claims this follows from: $$ \text{Claim Inflation} = \frac{ \text{Expected Claims}_{19}}{ \text{Expected Claims}_{18}} = \frac{0.5\%*1.05m}{0.5\%*1m} = 1.05$$
From first principles, our expected gross gross ratio (GLR) for $2018$ is:
$$\frac{0.5 \% *(TIV_{18})}{1 \% *(TIV_{18})} = 50 \%$$ And for $2019$ is: $$\frac{0.5\%*(TIV_{19})}{1\%*(TIV_{19})} = 50\%$$ i.e. they are the same! The correct adjustment when onlevelling $2018$ to $2019$ should therefore result in a flat GLR – this follows as we’ve got the same GLR in each year when we calculated above from first principles. If we’d taken the $18$ GLR, applied the claims inflation $1.05$ and applied the rate change $1.0$, then we might erroneously think the Gross Loss Ratio would be $50\%*1.05 = 52.5\%$. This would be equivalent to what I did in the opening paragraph of this post, the issue being, that we haven’t accounted for trend in exposure and our rate change is a measure of the change in net rate. If we include this exposure trend as an additional explicit adjustment this gives $50\%*1.05*1/1.05 = 50\%$. Which is the correct answer, as we can see by comparing to our first principles calculation. So the fundamental problem, is that our measure of rate change is a measure in the movement of rate on TIV, whereas our claim inflation is a measure of the movement of aggregate claims. These two are misaligned, if our rate change was instead a measure in the movement of overall premium, then the two measures would be consistent and we would not need the additional adjustment. However it’s much more common in this type of situation to get given rate change as a measure of change in rate on TIV. An advantage of making an explicit adjustment for exposure trend and claims inflation is that it allows us to apply different rates – which is probably more accurate. There’s no apriori justification as to why the two should always be the same. Claim inflation will be affected by additional factors beyond changes in the inflation of the assets being insured, this may include changes in frequency, changes in court award inflation, etc… It’s also interesting to note that the clam inflation here is of a different nature to what we would expect to see in a standard Collective Risk Model. In that case we inflate individual losses by the average change in severity i.e. ignoring any change in frequency. When adjusting the LR above, we are adjusting for both the change in frequency and severity together, i.e. in the aggregate loss. The above discussion also shows the importance of understanding exactly what someone means by ‘rate change’. It may sound obvious but there are actually a number of subtle differences in what exactly we are attempting to measure when using this concept. Is it change in premium per unit of exposure, is it change in rate per dollar of exposure, or is it even change in rate adequacy? At various points I’ve seen all of these referred to as ‘rate change’. 
AuthorI work as a pricing actuary at a reinsurer in London. Categories
All
Archives
February 2020
