Sports and Probability: A Bayesian Analysis

In a previous post, I explored the use of probabilistic models to predict sports results. The motivation was simple: using 20 minutes of data from a handball match, try to predict the final result with the simplest possible model. To this end, I used a binomial model. Suppose that a team has a probability p of scoring a goal in any given minute. Then, in a 60-minute match, the number of goals follows a Binomial distribution with parameters p and n = 60.

I have a particular interest in the last game played by my girlfriend, which was San Miguel (SM) vs Argentinos (AR). The score after 20 minutes was 6 to 10 and the final score was 21 to 30. With this information, the estimate of p is p = 6/20 = 0.3 for SM and p = 10/20 = 0.5 for AR. Under this model, the probability of SM winning was less than 1%.
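As a rough check of the figure above, the win probability under the binomial model can be computed exactly by summing over all pairs of scores. This is only a sketch: the function names are illustrative and do not come from the original notebook, and it assumes the two teams' scores are independent and that "winning" means scoring strictly more goals.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k goals in n minutes with per-minute probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def win_probability(p_a, p_b, n=60):
    """P(team A scores strictly more than team B), scores independent Binomial(n, p)."""
    pmf_a = [binomial_pmf(k, n, p_a) for k in range(n + 1)]
    pmf_b = [binomial_pmf(k, n, p_b) for k in range(n + 1)]
    total = 0.0
    prob_b_below = 0.0  # running value of P(B < a)
    for a in range(n + 1):
        total += pmf_a[a] * prob_b_below
        prob_b_below += pmf_b[a]
    return total

p_sm = 6 / 20   # estimate for San Miguel from the first 20 minutes
p_ar = 10 / 20  # estimate for Argentinos from the first 20 minutes
prob_sm_wins = win_probability(p_sm, p_ar)
print(f"P(SM wins) under the binomial model: {prob_sm_wins:.4f}")
```

Note that this model allows at most one goal per minute, which is the simplification the binomial assumption implies.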

Bayesian Approach

The model described above is a frequentist one: we model the number of goals given a fixed probability of scoring in a given minute. But that probability is itself unknown, and we have no model for it. The Bayesian approach enriches the model by treating the probability of scoring in a given minute, p, as a random variable. This affects the results in two ways:

1. We are going to extract more information in the inference process, i.e., from the first 20 minutes.

2. We will improve the prediction of the final result.

Inference

To begin with, all we know about p is that it is a probability, so it can be anywhere between 0 and 1. Therefore, a priori we take p to be uniformly distributed on (0, 1), which is equivalent to a Beta distribution with parameters alpha = 1 and beta = 1. We also know that, given p, the number of goals k scored in the first n = 20 minutes follows a Binomial distribution. Using these two facts, we can compute the distribution of p given the number of goals in the first twenty minutes, that is, the posterior distribution. This also happens to be a Beta distribution, now with updated parameters alpha = 1 + k and beta = n - k + 1.

In our particular case we have p ~ Beta(6+1, 14+1) for SM and p ~ Beta(10+1, 10+1) for AR. This is a very important result. Previously, we had inferred p as a single value, and now we have a full probability distribution for it.
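The update above is simple enough to sketch in a few lines. The helper names below are hypothetical, chosen for illustration; the code assumes the uniform Beta(1, 1) prior described in the text and uses the standard formulas for the mean and variance of a Beta distribution.

```python
from math import sqrt

def beta_posterior(goals, minutes, prior_alpha=1.0, prior_beta=1.0):
    """Posterior Beta parameters after observing `goals` in `minutes` minutes."""
    return prior_alpha + goals, prior_beta + (minutes - goals)

def beta_mean_sd(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total**2 * (total + 1))
    return mean, sqrt(variance)

posteriors = {
    "SM": beta_posterior(goals=6, minutes=20),
    "AR": beta_posterior(goals=10, minutes=20),
}
for team, (alpha, beta) in posteriors.items():
    mean, sd = beta_mean_sd(alpha, beta)
    print(f"{team}: Beta({alpha:g}, {beta:g}), mean = {mean:.3f}, sd = {sd:.3f}")
```

The standard deviation is what the point estimate p = 6/20 throws away: it measures how uncertain we still are about p after only 20 minutes.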

Prediction

For the prediction we are going to use the Binomial distribution for the number of goals, as in the previous model. But this time we are using a distribution, rather than a single value, for the probability of scoring a goal in a given minute. When we put the two together, the Beta distribution and the Binomial distribution, a compound distribution arises: the Beta-Binomial distribution. With this distribution we can compute the probability of every possible result analytically.
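A minimal sketch of the analytic route, using the standard Beta-Binomial probability mass function P(k) = C(n, k) B(k + alpha, n - k + beta) / B(alpha, beta), evaluated with log-gamma functions for numerical stability. The function names are illustrative, and the sketch assumes independent scores and the posterior parameters Beta(7, 15) and Beta(11, 11) from the inference step.

```python
from math import exp, lgamma

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(k, n, alpha, beta):
    """P(k goals in n minutes) when p ~ Beta(alpha, beta)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))

def win_probability(pmf_a, pmf_b):
    """P(A > B) for independent scores with the given probability tables."""
    total, prob_b_below = 0.0, 0.0
    for prob_a, prob_b in zip(pmf_a, pmf_b):
        total += prob_a * prob_b_below  # A scores this value, B scores less
        prob_b_below += prob_b
    return total

N_MINUTES = 60
pmf_sm = [beta_binomial_pmf(k, N_MINUTES, 7, 15) for k in range(N_MINUTES + 1)]
pmf_ar = [beta_binomial_pmf(k, N_MINUTES, 11, 11) for k in range(N_MINUTES + 1)]
prob_sm_wins = win_probability(pmf_sm, pmf_ar)
print(f"P(SM wins) under the Beta-Binomial model: {prob_sm_wins:.4f}")
```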

Also, we can simulate the results. Using simulation we will get a better understanding of this model:

For SM and AR:

  1. We draw a probability of making goals from the Beta distribution (p).

  2. Using p, we draw from a Binomial(p, 60) a possible result.

  3. Repeat the previous two steps N times, with N large enough.
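The three steps above can be sketched with the Python standard library alone. This is an illustration rather than the code from the notebook: the function names, the seed, and the value of N are assumptions, and the Binomial draw is done by summing 60 Bernoulli trials.

```python
import random

def simulate_goals(rng, alpha, beta, minutes=60):
    """Steps 1 and 2: draw p from the Beta posterior, then a Binomial(p, minutes) score."""
    p = rng.betavariate(alpha, beta)
    return sum(rng.random() < p for _ in range(minutes))

def simulate_win_probability(n_sims=20000, seed=0):
    """Step 3: repeat the draws n_sims times and count how often SM beats AR."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sm_wins = 0
    for _ in range(n_sims):
        goals_sm = simulate_goals(rng, 7, 15)   # SM posterior: Beta(6+1, 14+1)
        goals_ar = simulate_goals(rng, 11, 11)  # AR posterior: Beta(10+1, 10+1)
        sm_wins += goals_sm > goals_ar
    return sm_wins / n_sims

estimate = simulate_win_probability()
print(f"Simulated P(SM wins): {estimate:.3f}")
```

A new p is drawn for every simulated match, which is exactly what distinguishes this from the original model, where p was held fixed at 0.3 and 0.5.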

Results


Simulations using the original model


Simulations using the Bayesian model

In the previous article, I pointed out that the results were quite polarised. As you can see, this is not the case when we use the Beta-Binomial model. Using the Binomial model, the probability for SM to win the match was less than 1%. In contrast, with the Beta-Binomial model the probability is 12.6%. To verify this result I used bootstrapping, and 95% of the time the probability falls between 12.6% and 13.2%. That is more than 15 times larger.
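The post does not spell out how the bootstrap was run, so the following is only one plausible version, a percentile bootstrap over simulated match outcomes. The sample sizes, the seed, and the function names are assumptions made for the sketch, and the interval it prints will depend on them.

```python
import random

def simulate_outcomes(n_sims, rng):
    """Return a list of 1/0 flags: 1 if SM beats AR in a simulated 60-minute match."""
    outcomes = []
    for _ in range(n_sims):
        p_sm = rng.betavariate(7, 15)
        p_ar = rng.betavariate(11, 11)
        goals_sm = sum(rng.random() < p_sm for _ in range(60))
        goals_ar = sum(rng.random() < p_ar for _ in range(60))
        outcomes.append(1 if goals_sm > goals_ar else 0)
    return outcomes

def bootstrap_interval(outcomes, rng, n_resamples=400, level=0.95):
    """Percentile bootstrap interval for the mean of `outcomes`."""
    n = len(outcomes)
    estimates = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    low_index = int((1 - level) / 2 * n_resamples)
    high_index = int((1 + level) / 2 * n_resamples) - 1
    return estimates[low_index], estimates[high_index]

rng = random.Random(42)  # fixed seed, chosen arbitrarily
outcomes = simulate_outcomes(5000, rng)
point_estimate = sum(outcomes) / len(outcomes)
low, high = bootstrap_interval(outcomes, rng)
print(f"P(SM wins) = {point_estimate:.3f}, 95% interval = ({low:.3f}, {high:.3f})")
```

The width of the interval shrinks as the number of simulated matches grows, so a narrow interval like the one quoted above requires a large simulation.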


Probability

Now the question is: which is the correct model? To answer it, we have to notice which information was not used in the first model. The Binomial model only used the ratio of goals to minutes. For SM we have 6/20, which is the same ratio as 60/200 and 600/2000. The Bayesian model, however, says that making this prediction from 20 minutes of data is not the same as making it from 2000 minutes. If we have more information, our prediction is going to be more accurate.


Simulations when increasing the parameters in log-log scale

If we increase the sample size, from 6/20 to 60/200 and 600/2000, the results of the Beta-Binomial model become more and more similar to the results of the Binomial model, as shown in the figure above.
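This convergence can also be checked analytically. The sketch below scales the observed data by 1, 10 and 100 while keeping the goal ratios fixed, and compares the Beta-Binomial win probability for SM with the Binomial one. The helper names are illustrative, and the uniform prior and independence of the two scores are assumed as before.

```python
from math import comb, exp, lgamma

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def beta_binomial_pmf(k, n, alpha, beta):
    """Beta-Binomial pmf computed in log space to stay stable for large parameters."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    log_num = lgamma(k + alpha) + lgamma(n - k + beta) - lgamma(n + alpha + beta)
    log_den = lgamma(alpha) + lgamma(beta) - lgamma(alpha + beta)
    return exp(log_choose + log_num - log_den)

def win_probability(pmf_a, pmf_b):
    """P(A > B) for independent scores with the given probability tables."""
    total, prob_b_below = 0.0, 0.0
    for prob_a, prob_b in zip(pmf_a, pmf_b):
        total += prob_a * prob_b_below
        prob_b_below += prob_b
    return total

N = 60
scores = range(N + 1)
binomial_win = win_probability(
    [binomial_pmf(k, N, 0.3) for k in scores],
    [binomial_pmf(k, N, 0.5) for k in scores],
)

results = {}
for scale in (1, 10, 100):
    goals_sm, goals_ar, minutes = 6 * scale, 10 * scale, 20 * scale
    # Posterior parameters under a uniform prior: alpha = 1 + k, beta = n - k + 1
    pmf_sm = [beta_binomial_pmf(k, N, 1 + goals_sm, minutes - goals_sm + 1) for k in scores]
    pmf_ar = [beta_binomial_pmf(k, N, 1 + goals_ar, minutes - goals_ar + 1) for k in scores]
    results[scale] = win_probability(pmf_sm, pmf_ar)
    print(f"{goals_sm}/{minutes} vs {goals_ar}/{minutes}: P(SM wins) = {results[scale]:.4f}")

print(f"Binomial model: P(SM wins) = {binomial_win:.4f}")
```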

To conclude, the results of the two models will eventually converge when the amount of information is large enough. On the other hand, when working with a small sample it is very important to model what is unknown, in this case the probability of scoring a goal in a given minute.

Note: Here is my IPython notebook for this post.

Javier Burroni

About Javier Burroni

Data Science Consultant. Actuary, Msc Economics (candidate), future PhD student of Computer Science - Knowledge and Data Discovery "Entropy is what kills"
