The football may be over, but the fun never stops!

There is plenty of data on the recent Russia 2018 World Cup to be found on the official FIFA site.

Using their statistics, I have compared the number of fouls committed versus the number of fouls suffered and plotted the scatter graph above. Fouls committed are on the x axis, fouls suffered on the y. The line is a (computer generated) line of best fit using linear regression.

The greater the distance above the line, the more “saintly” we can say a team was – more fouled against than fouling; those below the line were the “sinners” of the tournament.
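For anyone wanting to reproduce the idea, here is a minimal sketch of a least-squares line and the "saintliness" score, taking "distance above the line" to mean vertical distance (the residual). The team names and foul counts are hypothetical placeholders, not the FIFA figures, and the function names are my own.

```python
from statistics import mean

def best_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def saintliness(teams):
    """Map each team to its residual: fouls suffered minus the line's prediction."""
    xs = [committed for committed, _ in teams.values()]
    ys = [suffered for _, suffered in teams.values()]
    slope, intercept = best_fit(xs, ys)
    return {name: suffered - (intercept + slope * committed)
            for name, (committed, suffered) in teams.items()}

# Hypothetical (fouls committed, fouls suffered) pairs, NOT the FIFA data.
example = {"Team A": (60, 75), "Team B": (80, 70),
           "Team C": (100, 95), "Team D": (70, 60)}
scores = saintliness(example)

# Most "saintly" first: positive means above the line, negative below it.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:+.1f}")
```

A positive score puts a team above the line (a "saint"), a negative score below it (a "sinner").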

Using my criteria, we can say that, despite not coming home with the trophy, England were the Saints of the World Cup!

(A note of caution, however. As ever with data, we must always consider its validity. Despite this data coming from the official FIFA website, it has a total of 1734 fouls committed, but only 1642 fouls suffered – at the time of writing I can’t reconcile the difference.)


And so the sun sets on another Wimbledon tournament, one that will be remembered in part for the two losing singles finalists.

Serena Williams was the runner-up in the ladies’ final, ten months after giving birth, and Kevin Anderson lost in the men’s final to Novak Djokovic, two days after playing the second-longest match in Wimbledon history, taking six hours, thirty-six minutes to overcome John Isner in the semi-final.

Understandably, Anderson has called for a change in how close games are decided.

To win a tennis set, a player must win 6 games, and be two clear games ahead of their opponent. If the score is 6-6, the set is decided by a tie break, where the winner is the first to 7 points (as long as they lead by two clear points).

Except for the final set of a match. If the fifth set is tied at 6-6, it continues – *with no tie break –* until one player leads by two games. And this is where the problem lies.

If both players are good and evenly matched (a reasonable expectation in the semi-final of a Grand Slam tournament) then the maths tells us there will be a significant number of games before a player loses a game on their serve, i.e. stalemate sets in, with neither player able to break the opponent’s serve and win by the required two-game margin.

The score in Friday’s final set was 26-24, i.e. it took fifty games to decide the final set.

Looking at the stats for the match, Anderson won 213 points from his 278 serves, giving him a probability of winning any point on his serve of 0.7662. Isner won 206 of his 291 serves, giving him a probability of 0.7079 of winning a point when he served.

To win a game, you need to win (at least*) 4 points on your serve, with your opponent winning no, one or two points. We can use a bit of basic probability to work out the probability of winning “to love”, “to 15” and “to 30”.

* The problem is made harder as we need to consider “deuce”, when the score reaches 40-40. Due to the need to win by two clear points, “deuce” games could go on for ever. The good news is that we can model “deuce” games as a geometric series, which we can sum to infinity, thereby coming up with a probability for winning a “deuce” game.

Using the probabilities above, I was able to calculate the probability that each player would win a game when they were serving. More importantly, this allowed me to calculate the probability they would lose a game on their serve, or have their serve broken.

The final set goes on until (at least) a player loses a game on their serve, hence we can treat service games lost as a geometric distribution, meaning we can calculate the “Expectation” for games lost.

For Anderson, the expectation is 25: you would expect him to lose one service game for every twenty-five he plays. The number is a little lower for Isner: we would expect him to lose one service game in every eleven.
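The calculation can be sketched in a few lines of Python. This is a minimal sketch, assuming every point on serve is independent and won with the same probability p; the function names are my own and the point counts are the ones quoted from the match stats.

```python
def hold_probability(p):
    """Probability the server wins a game, given probability p of winning each point.

    Assumes every point is independent with the same p (a simplification).
    """
    q = 1 - p
    # Win to love, to 15 or to 30: the server takes the last point, and the
    # receiver's 0, 1 or 2 points can fall anywhere among the earlier ones.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce (three points each, in any of 20 orders), then win two
    # points in a row before the opponent does. Summing the geometric
    # series over repeated deuces gives p^2 / (1 - 2pq).
    reach_deuce = 20 * p**3 * q**3
    win_from_deuce = p**2 / (1 - 2 * p * q)
    return before_deuce + reach_deuce * win_from_deuce

def service_games_per_break(p):
    """Mean of the geometric distribution: service games played per game lost."""
    return 1 / (1 - hold_probability(p))

for name, won, served in [("Anderson", 213, 278), ("Isner", 206, 291)]:
    p = won / served
    print(name, round(hold_probability(p), 3), round(service_games_per_break(p), 1))
```

Run with the match figures, this comes out at roughly 25 service games per break for Anderson and roughly 11 for Isner.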

What this means is that we can expect long final sets, unless the pragmatic decision is made to revert to allowing tie-breaks in fifth and final sets. As players continue to improve, and the advantage of serve continues to increase, the sport’s administrators will have to grapple with this conundrum, or look forward to future marathons as the sport descends into a war of attrition.

If you are interested in the formula I derived to calculate the expectation, you can see it below. p is the probability of a player winning a point on their own serve.
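One standard way of writing such a formula, assuming independent points and writing q = 1 - p, is sketched here; it may differ in form from the version I originally derived. The first term covers games won to love, 15 or 30, the second covers games that reach deuce (with the geometric series already summed), and the expectation is the mean of a geometric distribution.

```latex
\[
  P(\text{hold}) \;=\; p^{4}\left(1 + 4q + 10q^{2}\right)
  \;+\; 20\,p^{3}q^{3}\cdot\frac{p^{2}}{1 - 2pq},
  \qquad q = 1 - p
\]
\[
  E \;=\; \frac{1}{1 - P(\text{hold})}
\]
```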

**or why football remains the most popular and exciting sport**

Oh, what a night. It had drama, heroes and villains, and, for once, the tears shed at the end of the game were tears of joy. On a night of pure theatre, England beat Colombia in a penalty shoot-out to proceed to the quarter finals of the World Cup.

A nation rejoiced and when, perhaps still a little bleary eyed, it woke realising it wasn’t just a dream, the feel good factor across the land was palpable. Workmates chatted amiably, neighbours conversed happily, strangers smiled as they passed each other; everyone was happy.

Everyone, that is, except for one man. One grumpy old man, writing in The Guardian.

I have no objection to Simon Jenkins expressing his opinion in the paper, but unfortunately for him, maths blows away all his arguments.

He objects to games being decided by penalty shoot-outs, and in this article he calls on FIFA to make the goalposts bigger, which will mean more goals, and therefore fewer draws.

And if you want the “best” team to always win, then he has a point.

But do we always want the best team to win? No, that is the beauty of football. Football is the least predictable of sports – by that I mean that the favourites win less often than in any other sport and that is what gives the game its drama, that’s why millions watch it.

Increasing the distance between goal posts would mean more goals, and more goals (or points in other sports) means more predictability, and more predictability means less drama, less excitement. It can be shown mathematically (more goals/points = more predictability), but it can also be deduced intuitively.

Imagine that I take to the court to play Andy Murray. On any given point I may, just may, win the point, but in the long term he is going to win (many) more points than me, so the more points are scored, the more likely he is to emerge victorious.

Suppose (and we really are stretching the bounds of reality here) I win one point in every ten (meaning Andy wins nine in ten). If we play a single point match, the probability of me winning is one in ten, i.e. if we played ten matches I could expect to win one of those games, or if we played 250 one point matches I would win 25 of them.

If we now play a two point match (first to two) then the probability of me winning a match is 0.028, or I would expect to win 7 out of 250 matches. More points, more predictable. By the time we are playing a three point match my chances of success start to become vanishingly small. (If you don’t believe me, sketch out a tree diagram, plug in the numbers and do the sums.)
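If you would rather let a computer walk the tree diagram, here is a minimal sketch, assuming each point is independent and I win it with probability 0.1. The function name is my own.

```python
from math import comb

def win_first_to(n, p):
    """Probability that a player who wins each point with probability p
    is the first to reach n points (points assumed independent)."""
    q = 1 - p
    # The player wins the final point; the opponent's k points (k < n)
    # can be arranged anywhere among the first n - 1 + k points.
    return sum(comb(n - 1 + k, k) * p**n * q**k for k in range(n))

for n in (1, 2, 3):
    print(f"first to {n}: {win_first_to(n, 0.1):.5f}")
```

For a first-to-three match the probability drops below one in a hundred, and it keeps shrinking as the target grows.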

In his article, Simon Jenkins says:

At root, the trouble is soccer’s notorious inability to deliver scoring opportunities …. So far, only 16 out of the first 56 matches in the current World Cup have been decided by more than a single goal. The contrast with free-scoring rugby, cricket and tennis is stark.

As neutrals, we want games to be decided by a single goal – it keeps it exciting until the very end.

Rugby Union has become so free scoring it has a problem. International matches now see twice as many points scored as they did thirty years ago.

On the one hand this is a positive – we all want to see points (or goals) being scored, but the dominance of the favourites is detracting from the excitement of the spectacle.

The gulf in class between the best and weakest teams in the football World Cup is vast, but there is always the chance of an upset. In Rugby Union, it is a major shock when even the world’s second or third ranked nation beats the All Blacks. Great if you are a New Zealander, but not much fun for the rest of the world.

To finish, I would like to look at the article’s last line:

Or we can always watch the tennis, whose scoring system is close to perfect.

It depends what you mean by perfect. Yes, it creates great drama, plenty of decisive moments, but if you asked a statistician to devise a system to determine the best tennis player, they wouldn’t come up with the current scoring system.

But sport isn’t about the perfect ranking system, it’s about drama, despair and hope. I think football has got it pretty much right.


Russia 2018 – it’s turning out to be a vintage World Cup.

We’ve had shocks, upsets and – at the time of writing – England are still in it (with a supposedly “easy” route to the final. Half of me dares to hope, the other half fears we will be dumped out tomorrow night by the Colombians.)

It is also the tournament when stats and data have come of age, and are readily available to those of us who find the numbers (almost) as entrancing as the fancy footwork.

As I did at the end of Euro 2016, I will “Rank the rankings”: compare the finishing positions with the FIFA world rankings. There was no significant correlation in the Euros, and with only four of the top ten still in the competition, it doesn’t look too good this time round, either.

But that must wait until the conclusion of the competition. Today, I wanted to share some great work by John W. Miller who has produced the box plots and scatter graph on this page. You can read his full blog post here where he will also talk you through how to get your hands on the data he used.

Some great charts that should spark some interesting discussions in the classroom.

Large Data Sets are the current buzzword in A level maths, and it is wise to ensure that your students are familiar with the data set provided by the exam board you are following. However, here is another large data set containing (nearly) 40,000 international football results from 1872 to 2018: to my mind far more interesting than the transport arrangements of unitary authorities.

And here are the World Football Elo Ratings – I must confess, I’m not (yet) 100% sure how an Elo rating is calculated, but I shall try and find out.
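For what it is worth, the basic Elo idea, as usually described for chess, can be sketched as below. This is only the textbook form: the football ratings apply their own adjustments, which are not reproduced here, and the ratings and the step size k in the example are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Expected result for A against B (1 = win, 0.5 = draw, 0 = loss)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def updated_rating(rating, expected, actual, k=20):
    """Move the rating towards the actual result; k sets the step size."""
    return rating + k * (actual - expected)

# Hypothetical ratings, not taken from the World Football Elo table.
team_a, team_b = 1900, 1700
e_a = expected_score(team_a, team_b)

# If the favourite wins, it gains a little; if it loses, it drops a lot more.
print(round(e_a, 3))
print(round(updated_rating(team_a, e_a, 1.0), 1))
print(round(updated_rating(team_a, e_a, 0.0), 1))
```

The key design choice is that a rating only moves by the gap between what happened and what was expected, so beating a much weaker side earns very little.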

So, whether it ends in cheers or (more likely) tears, when the competition is over, we can fill the void by exploring all the wonderful stats and data the beautiful game generates.

(And thanks again to John W. Miller for allowing me to use his images. If you, or your students, have a penchant for data and computer science, his blog is well worth a visit.)

It may just be that all our work scrutiny, all that triple marking, all that group work and all those verbal feedback stamps have, at best, merely been tinkering around the edges, covering over the cracks, re-arranging deck chairs on the Titanic.

Perhaps we are ignoring the single biggest factor that could improve the learning and performance of the pupils in our care.

Sleep.

I’ve recently read “Why we sleep: The new science of sleep and dreams” by Matthew Walker and it has, forgive the pun, opened my eyes.

I like to call myself a “healthy sceptic” – I’ve seen fads come and go all too often in education, and I ain’t going to believe something just because someone tells me it is so. But the author of this book has the credibility of being an expert in his field, and the copy is littered with well referenced research and sources. He’s done enough to convince this old cynic.

Sleep has an enormous impact upon learning. Anyone who has ever been in a classroom will, I think, recognise that a tired student is less receptive to learning. What I didn’t appreciate was the importance of sleep after learning.

The author explains (and backs up with scientific evidence) far better than I could ever hope to do how we need sleep after learning to fix new ideas and knowledge in our minds. To be rested before a lesson is not enough – we must rest (sleep) afterwards as well, else the value of the lesson is lost. You could teach the best lesson ever, but its value and impact will be significantly diminished amongst those students who don’t sleep well (and for long enough) in the nights (note the plural, not the single) afterwards.

He explores the different types of sleep (nonREM and REM sleep) and their different impacts upon retention and problem solving. It is a fascinating book, I urge you to buy a copy and read it for yourself.

So what can we as teachers do?

In the book, the author conducts an (admittedly unscientific) poll of friends and colleagues across countries and continents and discovers that no one ever had any lessons on sleep, or the importance of sleep, whilst at school. In school, we regularly teach our students about the importance of a good diet, exercise, the danger of drugs and many other vices – think of your PSHE programme and all the above will feature, but “sleep hygiene” never makes an appearance. Perhaps it is time to start educating our pupils on the perils of too little sleep.

We can also help ourselves to more sleep, which will make us more effective in the work place, or in other words, better teachers (and, as a by-product, will also make us healthier, more attractive, slimmer, more creative).

Some readers may be in exalted positions and thus able to make systemic changes to the school day. In the book, Professor Walker presents clear evidence that a later start to the school day improves results.

Reading the book, I have come to realise that my schoolboy academic success may not have been down to my hard work (if I’m honest, did I really work that hard?!) or any innate natural talent, but more due to a fluke of geography that had me live a mere hundred yards from the school gate, allowing me to roll out of bed at 8 o’clock ready for a 9am start. I literally slept myself to exam success.

Sleep – has ever anything that feels so good been so good?

After yesterday’s blog post which highlighted the calm, statesmanlike brilliance of President Macron, today the pendulum swings fully the other way, allowing Piers Morgan to show himself to be a bullying buffoon, a despicable and pathetic man.

He tries to show that he is more intelligent than the guests on his show – guests invited in so that he can belittle them – by asking them about Pythagoras’ Theorem.

But he fails. He fails magnificently.

Clearly Piers himself doesn’t know what Pythagoras’ Theorem is, confusing it instead with Pi. (And, as you can just make out over the justified derision of his co-presenter, he doesn’t know Pi to five decimal places, despite claiming to. When challenged he states that Pi is 3.147, which is not to five decimal places, and is wrong anyway. Pi to five decimal places is 3.14159.)

I know Pythagoras’ Theorem, and I know Pi to five decimal places, but that doesn’t make me any cleverer or better than Piers or any of his guests – I just know something that they don’t.

Piers: please don’t bully and belittle, take a leaf out of President Macron’s book and educate and inspire, it’ll make you a bigger, and better, man.


President Macron of France took a teenager to task for failing to show him – or more correctly, the Office of his Presidency – due respect.

It is a fantastic clip, in which the French leader displays gravitas whilst taking the opportunity to educate.

The day you want to start a revolution you study first in order to obtain a degree and feed yourself, OK?

Wise words.

It would have been so easy for him to have ignored the low level “cheek” from the *garçon* – and how often have we, as teachers, let things slide, or seen colleagues do so?

As @Marcus Haddon said on Twitter:

I’ve seen some headteachers in the UK have less interaction with their students than this.

Watch the clip for a 30 second masterclass in how to be a positive role model.

Saturday saw Aston Villa face Fulham FC in what is widely regarded as the most valuable game in world football – the winners of the Championship play-off can look forward to the next season in the Premier League, a season worth circa £170 million, a sum far eclipsing the prize money of any other competition.

The above is easy to quantify, less so is the assertion that the English Championship is the hardest to gain promotion from, the most competitive league in Europe, although many will make this claim.

So which is the most competitive league in Europe?

Before we can answer that question, we have to determine how we can answer that question.

My solution (of course!) is to employ some maths – let the numbers do the talking.

Standard Deviation is a measure of spread, a measure of how close to the mean (average) the data is spread. A low standard deviation tells us that the data is closely clustered around the mean, whilst a high standard deviation tells us that the data is spread out around a wide range of values.

For fifteen different European leagues, I calculated the standard deviation of the points each team gained in the season, and compared the results. The leagues with a lower standard deviation, I concluded, were more competitive than those with a higher standard deviation. A lower standard deviation means that the points for each team were closer to the mean, suggesting that the clubs in that league were more evenly matched, and therefore the league more competitive than those leagues with a high standard deviation.
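As a quick illustration of the method, here is a minimal sketch using Python's `statistics` module. The two points tables are made up for the example, and I have assumed the population standard deviation (`pstdev`); using the sample version would shift every figure slightly without changing the ordering of leagues of the same size.

```python
from statistics import mean, pstdev

def spread(points):
    """Return (mean, population standard deviation) of a season's points totals."""
    return mean(points), pstdev(points)

# Hypothetical final points for two made-up eight-team leagues.
tight_league = [52, 50, 49, 47, 45, 44, 42, 41]
stretched_league = [80, 71, 60, 48, 40, 33, 25, 13]

for name, table in [("tight", tight_league), ("stretched", stretched_league)]:
    avg, sd = spread(table)
    print(f"{name}: mean {avg:.1f}, standard deviation {sd:.1f}")
```

The league whose clubs finish bunched together gets the lower figure, which is what I am reading as "more competitive".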

**68 95 99**

68, 95, 99 – no, not the years that Spurs won the cup, but a handy rule of thumb, sometimes known as the 68-95-99.7 rule (or three sigma rule if you want to sound clever). What it tells us is that for a normal distribution (or bell curve, and we can expect points scored in a league to be of this form) 68% of data points (points gained in our example) lie within one standard deviation of the mean (in either direction, above or below), 95% lie within two standard deviations and 99.7% (or nearly all) results lie within three standard deviations of the mean.
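The three percentages are easy to check for yourself. A minimal sketch, using the standard normal distribution from Python's `statistics` module (the function name is my own):

```python
from statistics import NormalDist

def share_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```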

**And the winner is …**

So after all this maths, which league is the most competitive? Is it the English Championship as so many pundits would have you believe?

No, the most competitive league in Europe is the Russian Premier League, with a standard deviation of 13.3, closely followed by the Bundesliga with a standard deviation of 14.0.

The English Championship is not as competitive as the two divisions below it, although it is more competitive than the English Premier League it feeds into.

And it seems that the Bundesliga is bucking the trend – the other “big” European leagues have the higher standard deviations, suggesting that they are less competitive, with Italy’s Serie A coming bottom with a standard deviation of 20.6.

**League of leagues**

So here is my league of leagues, based on standard deviation, the most competitive at the top, least at the bottom:

| League | Standard Deviation |
| --- | --- |
| Russian Premier League | 13.3 |
| Bundesliga (Germany) | 14.0 |
| League 2 (England) | 15.1 |
| League 1 (England) | 15.3 |
| Greek Super League | 15.8 |
| Scottish Division 1 | 16.8 |
| Championship (England) | 17.1 |
| Dutch Eredivisie | 17.4 |
| Ligue 1 (France) | 17.6 |
| Scottish Premier | 17.6 |
| Scottish Division 2 | 17.7 |
| La Liga (Spain) | 18.2 |
| Premier League (England) | 19.2 |
| Portuguese Liga | 19.3 |
| Serie A (Italy) | 20.6 |


The above tweet [link] popped up in my twitter feed this morning, and it got me thinking.

Not about whether or not Dominic Raab’s claims* were valid.

No, I spent quite some time trying to figure out what that “graph” (info-graphic is probably a better term) was trying to say.

I just couldn’t figure it out.

Now, I’ll be the first to admit, I’m no economist and I’ve never formally studied the subject. But I would describe myself as reasonably numerate and (as I have written before) as a mathematician I am far more interested in the applied side of the subject than the pure; I am used to taking equations, data, charts and graphs and interpreting them. But on seeing the above, I just couldn’t understand it.

First schoolboy error was no axis labels (and no numbers on the y axis at all).

The headline **in bold** mentioned house price increase from 1991 to 2016, suggesting a time series graph, where we are accustomed to seeing time flow from left to right. The title did imply that we were looking at a change over time, yet this makes no sense in the context of the graphic (I’ve given up calling it a graph because, although presented to try and look like a graph for (I presume) gravitas, it ain’t a graph).

I was now becoming increasingly confused.

Having twigged it was not a time series graph, my mind then picked up on a couple of key features of the graphic. The title said “**average house price**” and the top number on the x-axis was 275. I knew that the average UK house price is around £275K (I’ve since checked – it’s a little lower, but in that ballpark) so perhaps the graphic was meant to represent the average house price in the UK? But that made the chart even more nonsensical.

By this stage I was genuinely perplexed. I genuinely had no idea what this tweet and graphic were trying to say.

I could have (and perhaps should have) left it there and got on with my day. But I couldn’t. It was bugging me, so I did a bit of digging to see if I could fathom what the Economics Editor of the FT was trying to convey. It seems I wasn’t alone in my confusion, finding this thread on Reddit.


I stumbled across the above the other day (source) and it made me chuckle, reminding me a little of this post about cute angels and other mathematical bloopers.

The cartoon above was drawn by Dan Piraro and can be found on his Bizarro website – well worth a visit. By virtue of the fact that you are reading my blog, I’m guessing you are of a mathematical bent (but whether that bend forms an acute or obtuse angle, who knows!) and therefore may particularly enjoy this cartoon of his.
