In which we do more data exploration, find and then fix a mistake in our previous model, spend some time on feature engineering, and manage to set a new high-score.
An Actuary learns Machine Learning – Part 3 – Automatic testing/feature importance/K-fold cross validation
In which we don’t actually improve our model but we do improve our workflow - being able to check our test score ourselves, analysing the importance of each variable using an algorithm, and then using an algorithm to select the best hyper-parameters
In which we build our first machine learning model in Python, beat our previous Excel model on our first attempt, and then fail multiple times to improve this new model…
In which we enter a machine learning competition, predict who survived the Titanic, build an Excel model, and then realise it performs no better than Kaggle’s ‘test submission’...
I sometimes get emails from individuals who have stumbled across my website and have questions about Lloyd's of London which they can't find the answers to online. Below I've collated some of these questions and my responses, plus some extra questions chucked in which I thought might be helpful.
A brief caveat - while I've had a fair amount of interaction with Lloyd's syndicates over the years, I have never actually worked within Lloyd's for a syndicate, and these answers below just represent my understanding and my personal view, other views do exist! If you disagree with anything, or if you think anything below is incorrect please let me know!
Are Lloyd’s of London and Lloyds bank related at all?
They are not; they just happen to have a similar name. Lloyd’s of London is an insurance market, whereas Lloyds Bank is a bank. They were both set up by people with the surname Lloyd - Lloyds Bank was formed by John Taylor and Sampson Lloyd, Lloyd’s of London by Edward Lloyd. Perhaps in the mists of time those two were distantly related but that’s about it for a link.
I just finished reading ‘I am a strange loop’ by Douglas Hofstadter, and before I say anything else about the book, I’ll say that I really did want to like it.
I’m a huge fan of his better known book ‘Gödel, Escher, Bach’, for which Hofstadter won a Pulitzer Prize, and I’m also very interested in the subject area – maths, logic, self-reference, cognitive science. However, there were just too many things that rubbed me up the wrong way. In no particular order, here are all the things I didn’t like about the book:
I had to solve an interesting problem yesterday relating to pricing an excess layer which was contained in another layer which we knew the price for – I didn’t price the initial layer, and I did not have a gross loss model. All I had to go on was the overall price and a severity curve which I thought was reasonably accurate. The specific layers in this case were a 9m xs 1m, and I was interested in what we would charge for a 6m xs 4m.
Just to put some concrete numbers to this, let’s say the 9m xs 1m cost \$10m
The xs 1m severity curve was as follows:
Let me introduce a game – I keep flipping a coin and you have to guess whether it will come up heads or tails. The prize pot starts at \$2, and each time you guess correctly the prize pot doubles, we keep playing until you eventually guess incorrectly at which point you get whatever has accumulated in the prize pot.
So if you guess wrong on the first flip, you just get the \$2. If you guess wrong on the second flip you get \$4, and if you get it wrong on the 10th flip you get \$1024.
Knowing this, how much would you pay to enter this game?
You're guaranteed to win at least \$2, so you'd obviously pay at least \$2. There is a 50% chance you'll win at least \$4, a 25% chance you'll win at least \$8, a 12.5% chance you'll win at least \$16, and so on. Knowing this, maybe you'd pay \$5 to play - you'll probably lose money but there's a decent chance you'll make quite a bit more than \$5.
Perhaps you take a more mathematical approach than this. You might reason as follows – ‘I’m a rational person, therefore, as any good rational person should, I will calculate the expected value of playing the game; this is the maximum I should be willing to pay to play the game’. This however is the crux of the problem and the source of the paradox: most people do not really value the game that highly – when asked they’d pay somewhere between \$2-\$10 to play it, and yet the expected value of the game is infinite....
The above is a lovely photo I found of St Petersburg. The reason the paradox is named after St Petersburg actually has nothing to do with the game itself, but is due to an early article published by Daniel Bernoulli in a St Petersburg journal. As an aside, having just finished the book A Gentleman in Moscow by Amor Towles (which I loved and would thoroughly recommend) I'm curious to visit Moscow and St Petersburg one day.
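Returning to the game itself, here is a minimal Python sketch of where the infinite expected value comes from. Guessing wrong for the first time on flip $k$ has probability $1/2^k$ and pays $2^k$ dollars, so every possible stopping point adds exactly \$1 to the expected value and the sum never settles down. The function names, the fixed seed and the number of simulated games are my own illustrative choices, not anything from the original analysis.

```python
import random
import statistics


def partial_expected_value(max_flips):
    """Expected-value contribution from games that end on or before max_flips."""
    # First wrong guess on flip k: probability 1/2**k, payout 2**k dollars.
    # Each term is exactly 1, so the partial sum simply equals max_flips.
    return sum(0.5 ** k * 2 ** k for k in range(1, max_flips + 1))


def play_once(rng):
    """Play one game and return the payout in dollars."""
    pot = 2
    while rng.random() < 0.5:  # a correct guess doubles the pot
        pot *= 2
    return pot


rng = random.Random(0)  # arbitrary seed, only so the run is repeatable
payouts = [play_once(rng) for _ in range(100_000)]

print(partial_expected_value(10), partial_expected_value(1000))
print(statistics.median(payouts), statistics.mean(payouts))
```

The partial sums grow in step with the number of flips allowed, which is the infinite expected value in action. The simulated median, by contrast, should come out at only a few dollars, which is much closer to what people say they would actually pay.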
Dan Glaser, CEO of Guy Carp, stated last week that he believes that the current fallout from Coronavirus represents two simultaneous black swans.
Nassim Taleb meanwhile, the very guy who brought the term ‘black swan’ into popular consciousness, has stated that what we are dealing with at the moment isn’t even a black swan!
So what’s going on here? And who is right?
I saw this story today, and I've got to say I absolutely love it. Here is the tag line:
“Clarence Thomas Just Asked His First Question in a Supreme Court Argument in 3 Years”
For those like me not familiar with Justice Thomas, he is one of the nine Supreme Court Justices and he is famously terse:
“He once went 10 years without speaking up at an argument.”
“His last questions came on Feb. 29, 2016”
You could say I was speechless upon reading this (see what I did there?), for a judge who sits over some of the most important trials in the US to basically never speak during oral arguments seems pretty incredible.
If you are an actuary, you'll probably have done a fair bit of triangle analysis, and you'll know that triangle analysis tends to work pretty well if you have what I'd call 'nice smooth consistent' data, that is - data without sharp corners, without large one-off events, and without substantial growth. Unfortunately, over the last few years, motor triangles have been anything but nice, smooth or consistent. These days, using them often seems to require more assumptions than there are data points in the entire triangle.
In case you missed it, Aon announced last week that in response to the Covid19 outbreak, and the subsequent expected loss of revenue stemming from the fallout, they would be taking a series of preemptive actions. The message was that no one would lose their job, but that a majority of staff would be asked to accept a 20% salary cut.
The cuts would be made to:
So how significant will the cost savings be here? And is it fair that Aon is continuing with their dividend? I did a couple of back of the envelope calcs to investigate.
This post is about two pieces of writing released this week, and how easy it is for even smart people to be wrong.
The story starts with an open letter written to the UK Government signed by 200+ scientists, condemning the government’s response to the Coronavirus epidemic and arguing that the response was not forceful enough, and that the government was risking lives by their current course of action. The letter was widely reported and even made it to the BBC front page, pretty compelling stuff.
The issue is that as soon as you start scratching beneath the surface, all is not quite what it seems. Of the 200+ scientists, about 1/3 are PhD students, not an issue in and of itself, but picking out some of the subjects we’ve got:
So I finally finished Thomas Piketty's book - Capital in the 21st Century, and I thought I'd write up an interesting result that Piketty mentions, but does not elaborate on. Given the book is already 700 pages, it's probably for the best that he drew the line somewhere.
The result is specifically, that under basic models of the development of distribution of wealth in a society, it can be shown that when growth is equal to $g$, and return on capital is equal to $r$, then the distribution of wealth tends towards a Pareto distribution with parameter $r-g$. That sounds pretty interesting right?
My notes below are largely based on the following paper by Charles I. Jones of Stanford Business School; my addition is to derive the assumption of an exponential distribution of income from more basic assumptions about labour and capital income. Link to Jones's paper
I’m reading 'Information Theory, Inference, and Learning Algorithms' by David MacKay at the moment and I'm really enjoying it so far. One cool trick that he introduces early in the book is a method of deriving Stirling’s approximation through the use of the Gaussian approximation to the Poisson distribution, which I thought I'd write up here.
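As a rough sketch of the idea (my own summary, not MacKay's exact presentation): a Poisson distribution with mean $\lambda$ has $P(r) = e^{-\lambda}\lambda^r / r!$, and for large $\lambda$ it is approximately Gaussian with mean $\lambda$ and variance $\lambda$. Evaluating both at $r = \lambda$ gives $e^{-\lambda}\lambda^\lambda / \lambda! \approx 1/\sqrt{2\pi\lambda}$, which rearranges to $\lambda! \approx \sqrt{2\pi\lambda}\,\lambda^\lambda e^{-\lambda}$. The Python below is just a numerical sanity check of that rearrangement; the function names are mine.

```python
import math


def stirling(n):
    """Stirling's approximation: n! ~ sqrt(2*pi*n) * n**n * exp(-n)."""
    return math.sqrt(2 * math.pi * n) * n ** n * math.exp(-n)


def poisson_pmf_at_mean(lam):
    """Poisson(lam) probability at r = lam, computed in log space for stability."""
    return math.exp(-lam + lam * math.log(lam) - math.lgamma(lam + 1))


def gaussian_pdf_at_mean(lam):
    """Density of a Gaussian with mean lam and variance lam, evaluated at its mean."""
    return 1 / math.sqrt(2 * math.pi * lam)


for n in (1, 5, 10, 50):
    # Both ratios should approach 1 as n grows
    print(n,
          stirling(n) / math.factorial(n),
          poisson_pmf_at_mean(n) / gaussian_pdf_at_mean(n))
```

The two ratios printed on each row are algebraically the same quantity, which is really the whole trick: the error in Stirling's formula is exactly the error in the Gaussian approximation at the peak of the Poisson distribution.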
I work as a pricing actuary at a reinsurer in London.