Saints or Sinners?

The football may be over, but the fun never stops!

There is plenty of data on the recent Russia 2018 World Cup to be found on the official FIFA site.

Using their statistics, I have compared the number of fouls committed versus the number of fouls suffered and plotted the scatter graph above.  Fouls committed are on the x axis, fouls suffered on the y. The line is a (computer generated) line of best fit using linear regression.

The greater the distance above the line, the more “saintly” we can say a team was – more fouled against than fouling; those below the line were the “sinners” of the tournament.
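For readers who would like to reproduce the idea, here is a minimal sketch of the "distance above the line" calculation. It fits an ordinary least-squares line and ranks teams by their vertical residual. The team names and foul counts are made up for illustration (they are not FIFA's figures), and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rank_by_residual(fouls):
    """fouls maps team -> (fouls committed, fouls suffered).

    Returns (team, residual) pairs, most "saintly" first. A positive
    residual means the team sits above the line of best fit.
    """
    xs = [committed for committed, _ in fouls.values()]
    ys = [suffered for _, suffered in fouls.values()]
    slope, intercept = fit_line(xs, ys)
    residuals = {
        team: suffered - (slope * committed + intercept)
        for team, (committed, suffered) in fouls.items()
    }
    return sorted(residuals.items(), key=lambda item: item[1], reverse=True)


# Made-up numbers for illustration only, NOT the real tournament data.
example_fouls = {
    "Team A": (40, 55),
    "Team B": (50, 48),
    "Team C": (60, 70),
    "Team D": (75, 66),
}

for team, residual in rank_by_residual(example_fouls):
    label = "saint" if residual > 0 else "sinner"
    print(f"{team}: {residual:+.2f} ({label})")
```

Using the vertical distance (rather than the perpendicular distance) matches how the regression line itself is fitted, so the residuals always sum to zero: every saint is balanced by a sinner.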

Using my criteria, we can say that, despite not coming home with the trophy, England were the Saints of the World Cup!

(A note of caution, however. As ever with data, we must always consider its validity.  Despite this data coming from the official FIFA website, it has a total of 1734 fouls committed,  but only 1642 fouls suffered – at the time of writing I can’t reconcile the difference.)


A few readers have (correctly) pointed out that the plot is skewed as not all teams played the same number of games: one would expect France, Croatia, Belgium and England to all be towards the right of the graph as they played more games than other teams.

So I went back to the data and produced a plot for fouls committed per game v fouls suffered per game. You can see the plot below. I think we can safely say those above the line were the saints, those below the sinners.
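The per-game adjustment is just a division by the number of matches played. A small sketch, assuming the totals are held in a dictionary (the function name and the sample numbers are hypothetical):

```python
def per_game(totals, games_played):
    """Convert (committed, suffered) totals into per-game rates.

    totals maps team -> (fouls committed, fouls suffered);
    games_played maps team -> number of matches played.
    """
    return {
        team: (committed / games_played[team], suffered / games_played[team])
        for team, (committed, suffered) in totals.items()
    }


# Illustrative totals only: a team that played 7 games and one that played 3.
rates = per_game(
    {"Finalist": (70, 84), "Group-stage exit": (36, 30)},
    {"Finalist": 7, "Group-stage exit": 3},
)
print(rates)
```

In this made-up example the finalist has far more fouls in total, yet the team that went home early commits more per game, which is exactly the distortion readers pointed out.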


Posted in Handling Data

Anyone for tennis?

And so the sun sets on another Wimbledon tournament, one that will be remembered in part for the two losing singles finalists.

Serena Williams was the runner-up in the ladies’ final, ten months after giving birth, and Kevin Anderson lost in the men’s final to Novak Djokovic, two days after playing the second-longest match in Wimbledon history, taking six hours and thirty-six minutes to overcome John Isner in the semi-final.

Understandably, Anderson has called for a change in how close matches are decided.

To win a tennis set, a player must win at least 6 games and be two clear games ahead of their opponent. If the score is 6-6, the set is decided by a tie break, where the winner is the first to 7 points (as long as they lead by two clear points).

Except for the final set of a match. If the fifth set is tied at 6-6, it continues – with no tie break – until one player leads by two games. And this is where the problem lies.

If both players are good and evenly matched (a reasonable expectation in the semi-final of a Grand Slam tournament) then the maths tells us there will be a significant number of games before a player loses a game on their serve, i.e. stalemate sets in, with neither player able to break the opponent’s serve and win by the required two-game margin.

The score in Friday’s final set was 26-24, i.e. it took fifty games to decide the final set.

Looking at the stats for the match, Anderson won 213 points from his 278 serves, giving him a probability of winning any point on his serve of 0.7661. Isner won 206 of his 291 serves, giving him a probability of 0.7079 of winning a point when he served.

To win a game, you need to win (at least*) 4 points on your serve, with your opponent winning no, one or two points. We can use a bit of basic probability to work out the probability of winning “to love”, “to 15” and “to 30”.

* The problem is made harder as we need to consider “deuce”, when the score reaches 40-40. Due to the need to win by two clear points, “deuce” games could go on for ever. The good news is that we can model “deuce” games as a geometric series, which we can sum to infinity, thereby coming up with a probability for winning a “deuce” game.

Using the probabilities above, I was able to calculate the probability that each player would win a game when they were serving. More importantly, this allowed me to calculate the probability they would lose a game on their serve, or have their serve broken.

The final set would go on until (at least) one player lost a game on their serve, so we can model the number of service games up to and including a break as a geometric distribution, meaning we can calculate the “Expectation” for games lost.

For Anderson, the expectation is 25: you would expect him to lose one game on his serve for every twenty-five he plays. The number is lower for Isner: we would expect him to lose one game in every eleven.

What this means is that we can expect long final sets, unless the pragmatic decision is made to revert to allowing tie-breaks in fifth and final sets. As players continue to improve, and the advantage of serve continues to increase, the sport’s administrators will have to grapple with this conundrum, or look forward to future marathons as the sport descends into a war of attrition.


If you are interested in the formula I derived to calculate the expectation, you can see it below. p is the probability of a player winning a point on their own serve.

Expectation to lose a game on serve. p is the probability of winning a point on serve. The formula gives the average number of games in which one game would be lost (the others won). For example, Kevin Anderson has a probability of 0.7661 of winning a point on his serve. The formula gives an expectation of 25 (rounded to the nearest whole number). This means we would expect him to lose one game in every twenty-five he plays. (Note: it does not mean we would expect him to win the first twenty-four then lose the twenty-fifth, but in a series of twenty-five games, he would lose one game.)
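The formula itself was an image in the original post and is not reproduced in this text. As a rough sketch of the calculation described above, the Python below assumes every point on serve is won independently with the same probability p. The function names are illustrative, and the derivation is the standard one, so it may not match the pictured formula term for term.

```python
def prob_hold(p):
    """Probability that the server wins a game, given P(win a point) = p.

    Assumes points are independent and p is constant throughout.
    """
    q = 1 - p
    # Win "to love", "to 15" or "to 30": the server wins the last point,
    # with 0, 1 or 2 lost points arranged among the earlier ones.
    win_before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce: three points each, in any order.
    reach_deuce = 20 * p**3 * q**3
    # From deuce: win two in a row (p^2), or split the next two points (2pq)
    # and start again. Summing the geometric series gives p^2 / (1 - 2pq).
    win_from_deuce = p**2 / (1 - 2 * p * q)
    return win_before_deuce + reach_deuce * win_from_deuce


def expected_games_per_break(p):
    """Mean of the geometric distribution whose 'success' is a lost service game."""
    return 1 / (1 - prob_hold(p))


for name, won, served in [("Anderson", 213, 278), ("Isner", 206, 291)]:
    p = won / served
    games = expected_games_per_break(p)
    print(f"{name}: p = {p:.4f}, one break expected every {games:.1f} service games")
```

Under these assumptions the sketch gives roughly 25 service games per break for Anderson and roughly 11 for Isner, in line with the figures quoted above.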

Posted in Probability

Oh, what a night

A Maths Teacher Celebrates

or why football remains the most popular and exciting sport

Oh, what a night. It had drama, heroes and villains, and, for once, the tears shed at the end of the game were tears of joy. On a night of pure theatre, England beat Colombia in a penalty shoot-out to proceed to the quarter-finals of the World Cup.

A nation rejoiced and when, perhaps still a little bleary-eyed, it woke realising it wasn’t just a dream, the feel-good factor across the land was palpable. Workmates chatted amiably, neighbours conversed happily, strangers smiled as they passed each other; everyone was happy.

Everyone, that is, except for one man. One grumpy old man, writing in The Guardian.

I have no objection to Simon Jenkins expressing his opinion in the paper, but unfortunately for him, maths blows away all his arguments.

He objects to games being decided by penalty shoot-outs, and in this article he calls on FIFA to make the goalposts bigger, which will mean more goals, and therefore fewer draws.

And if you want the “best” team to always win, then he has a point.

But do we always want the best team to win? No, that is the beauty of football. Football is the least predictable of sports – by that I mean that the favourites win less often than in any other sport and that is what gives the game its drama, that’s why millions watch it.

Increasing the distance between goal posts would mean more goals, and more goals (or points in other sports) means more predictability, and more predictability means less drama, less excitement. It can be shown mathematically (more goals/points = more predictability), but it can also be deduced intuitively.

Imagine that I take to the court to play Andy Murray. On any given point I may, just may, win the point, but in the long term he is going to win (many) more points than me, so the more points are scored, the more likely he is to emerge victorious.

Suppose (and we really are stretching the bounds of reality here) I win one point in every ten (meaning Andy wins nine in ten). If we play a single-point match, the probability of me winning is one in ten, i.e. if we played ten matches I could expect to win one of them, or if we played 250 one-point matches I would expect to win 25 of them.

If we now play a two-point match (first to two) then the probability of me winning a match is 0.028, or I would expect to win 7 out of 250 matches.  More points, more predictable.  By the time we are playing a three-point match my chances of success start to become vanishingly small. (If you don’t believe me, sketch out a tree diagram, plug in the numbers and do the sums.)
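If you would rather let a computer walk the tree diagram, here is a minimal sketch. It assumes each point is independent and that I win each one with the same probability p; the function name is hypothetical.

```python
from math import comb


def first_to_n_win_probability(p, n):
    """Probability of winning a 'first to n points' match.

    The underdog wins the final point and loses k of the points before it,
    for k = 0 .. n-1. There are C(n - 1 + k, k) orderings for each k.
    """
    q = 1 - p
    return sum(comb(n - 1 + k, k) * p**n * q**k for k in range(n))


for n in (1, 2, 3, 5):
    prob = first_to_n_win_probability(0.1, n)
    print(f"first to {n}: P(win) = {prob:.5f}, about {250 * prob:.1f} wins in 250 matches")
```

With p = 0.1 this reproduces the 0.1 and 0.028 quoted above, and the first-to-three figure drops below one in a hundred.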

In his article, Simon Jenkins says:

At root, the trouble is soccer’s notorious inability to deliver scoring opportunities …. So far, only 16 out of the first 56 matches in the current World Cup have been decided by more than a single goal. The contrast with free-scoring rugby, cricket and tennis is stark.

As neutrals, we want games to be decided by a single goal – it keeps it exciting until the very end.

Rugby Union has become so free scoring it has a problem. International matches now see twice as many points scored as they did thirty years ago.

On the one hand this is a positive – we all want to see points (or goals) being scored, but the dominance of the favourites is detracting from the excitement of the spectacle.

The gulf in class between the best and weakest teams in a football World Cup is vast, but there is always the chance of an upset. In Rugby Union, it is a major shock when even the world’s second or third ranked nation beats the All Blacks. Great if you are a New Zealander, but not much fun for the rest of the world.

To finish, I would like to look at the article’s last line:

Or we can always watch the tennis, whose scoring system is close to perfect.

It depends what you mean by perfect. Yes, it creates great drama, plenty of decisive moments, but if you asked a statistician to devise a system to determine the best tennis player, they wouldn’t come up with the current scoring system.

But sport isn’t about the perfect ranking system, it’s about drama, despair and hope. I think football has got it pretty much right.


Posted in Probability

It’s all getting quite exciting

Russia 2018 – it’s turning out to be a vintage World Cup.

We’ve had shocks, upsets and – at the time of writing – England are still in it (with a supposedly “easy” route to the final. Half of me dares to hope, the other half fears we will be dumped out tomorrow night by the Colombians.)

It is also the tournament when stats and data have come of age, and are readily available to those of us who find the numbers (almost) as entrancing as the fancy footwork.

As I did at the end of Euro 2016, I will “Rank the rankings”: compare the finishing positions with the FIFA world rankings. There was no significant correlation in the Euros, and with only four of the top ten still in the competition, it doesn’t look too good this time round, either.

But that must wait until the conclusion of the competition. Today, I wanted to share some great work by John W. Miller who has produced the box plots and scatter graph on this page. You can read his full blog post here where he will also talk you through how to get your hands on the data he used.

Some great charts that should spark some interesting discussions in the classroom.

Distribution of Player Height sorted by Country

Height vs. weight by player position

Large Data Sets are the current buzzword in A-level maths, and it is wise to ensure that your students are familiar with the data set provided by the exam board you are following. However, here is another large data set containing (nearly) 40,000 international football results from 1872 to 2018: to my mind far more interesting than the transport arrangements of unitary authorities.

And here are the World Football Elo Ratings – I must confess, I’m not (yet) 100% sure how an Elo rating is calculated, but I shall try and find out.

So, whether it ends in cheers or (more likely) tears, when the competition is over, we can fill the void by exploring all the wonderful stats and data the beautiful game generates.

(And thanks again to John W. Miller for allowing me to use his images. If you, or your students, have a penchant for data and computer science, his blog is well worth a visit.)

Posted in Handling Data, Large Data Sets

The confessions of a maths teacher – how I slept my way to success


It may just be that all our work scrutiny, all that triple marking, all that group work and all those verbal feedback stamps have, at best, merely been tinkering around the edges, covering over the cracks, re-arranging deck chairs on the Titanic.

Perhaps we are ignoring the single biggest factor that could improve the learning and performance of the pupils in our care.


I’ve recently read “Why we sleep: The new science of sleep and dreams” by Matthew Walker and it has, forgive the pun, opened my eyes.

I like to call myself a “healthy sceptic” – I’ve seen fads come and go all too often in education, and I ain’t going to believe something just because someone tells me it is so. But the author of this book has the credibility of being an expert in his field, and the copy is littered with well-referenced research and sources. He’s done enough to convince this old cynic.

Sleep has an enormous impact upon learning. Anyone who has ever been in a classroom will, I think, recognise that a tired student is less receptive to learning. What I didn’t appreciate was the importance of sleep after learning.

The author explains (and backs up with scientific evidence) far better than I could ever hope to do how we need sleep after learning to fix new ideas and knowledge in our minds. To be rested before a lesson is not enough – we must rest (sleep) afterwards as well, else the value of the lesson is lost. You could teach the best lesson ever, but its value and impact will be significantly diminished amongst those students who don’t sleep well (and for long enough) in the nights (note the plural, not the single) afterwards.

He explores the different types of sleep (non-REM and REM sleep) and their different impacts upon retention and problem solving. It is a fascinating book; I urge you to buy a copy and read it for yourself.

So what can we as teachers do?

In the book, the author conducts an (admittedly unscientific) poll of friends and colleagues across countries and continents and discovers that no one ever had any lessons on sleep, or the importance of sleep, whilst at school. In school, we regularly teach our students about the importance of a good diet, exercise, the danger of drugs and many other vices – think of your PSHE programme and all the above will feature, but “sleep hygiene” never makes an appearance. Perhaps it is time to start educating our pupils on the perils of too little sleep.

We can also help ourselves to more sleep, which will make us more effective in the workplace, or in other words, better teachers (and, as a by-product, will also make us healthier, more attractive, slimmer, more creative).

Some readers may be in exalted positions and thus able to make systemic changes to the school day. In the book, Professor Walker presents clear evidence that a later start to the school day improves results.

Reading the book, I have come to realise that my schoolboy academic success may not have been down to my hard work (if I’m honest, did I really work that hard?!) or any innate natural talent, but more due to a fluke of geography that had me live a mere hundred yards from the school gate, allowing me to roll out of bed at 8 o’clock ready for a 9am start. I literally slept myself to exam success.

Sleep – has ever anything that feels so good been so good?

Posted in Uncategorized