In a previous post, I explored the use of probabilistic models to predict sports results. The motivation was simple: using 20 minutes of data from a handball match, try to predict the final result with the simplest possible model. To this end, I used a Binomial model. Suppose that a team has a probability p of scoring a goal in any given minute. Then, in a 60-minute match, the number of goals is distributed as a Binomial with parameters p and n = 60.
I have a particular interest in the last game played by my girlfriend, which was San Miguel (SM) vs Argentinos (AR). The score after 20 minutes was 6 to 10 and the final score was 21 to 30. With this information, the parameter p becomes p = 6/20 = 0.3 for SM and p = 10/20 = 0.5 for AR. Under this model, the probability of SM winning was less than 1%.
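As a rough check of that figure, the Binomial win probability can be computed exactly by summing over all possible scores. This is a minimal sketch, not the code from the previous post: the function names are made up for illustration, the two teams are assumed to score independently, and a "win" is taken to mean strictly more goals than the opponent.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k goals in n minutes with per-minute probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def win_probability(p_a, p_b, n=60):
    """P(team A scores strictly more goals than team B), assuming independence."""
    pmf_a = [binomial_pmf(k, n, p_a) for k in range(n + 1)]
    pmf_b = [binomial_pmf(k, n, p_b) for k in range(n + 1)]
    total = 0.0
    prob_b_below = 0.0  # running P(B < a), updated as a increases
    for a in range(n + 1):
        total += pmf_a[a] * prob_b_below
        prob_b_below += pmf_b[a]
    return total


p_sm = 6 / 20   # SM: 6 goals in the first 20 minutes
p_ar = 10 / 20  # AR: 10 goals in the first 20 minutes
print(f"P(SM wins) under the Binomial model: {win_probability(p_sm, p_ar):.4f}")
```

Counting draws as half a win, or predicting only the remaining 40 minutes, would give a somewhat different number; the sketch follows the 60-minute setup described above.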
Bayesian Approach
The model described above was a frequentist one: we modelled the number of goals given a fixed probability of scoring in a given minute. But that probability is itself uncertain, and we had no model for it. The Bayesian approach enriches the model by treating p, the probability of scoring a goal in a given minute, as a random variable. This affects the results in two ways:
1. We are going to extract more information in the inference process, that is, from the first 20 minutes.
2. We will improve the prediction of the final result.
Inference
To begin with, all we know about p is that it is a probability, so it can lie anywhere between 0 and 1. We therefore assume a priori that p is uniformly distributed on (0, 1), which is equivalent to a Beta distribution with parameters alpha = 1 and beta = 1. We also know that, given p, the number of goals k scored in the first n = 20 minutes follows a Binomial distribution. Combining these two facts, we can compute the distribution of p given the number of goals in the first twenty minutes, that is, the posterior distribution. This also happens to be a Beta distribution, now with updated parameters alpha = 1 + k and beta = 1 + n - k.
In our particular case we have p ~ Beta(6+1, 14+1) for SM and p ~ Beta(10+1, 10+1) for AR. This is a very important result. Previously, we had inferred p as a single value; now we have a full probability distribution for it.
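The update is simple enough to write down directly. The sketch below (helper names are illustrative, not taken from the notebook) computes the posterior parameters for both teams from a uniform Beta(1, 1) prior, together with the posterior mean and standard deviation using the standard Beta formulas.

```python
def beta_posterior(goals, minutes, prior_alpha=1.0, prior_beta=1.0):
    """Return (alpha, beta) of the Beta posterior after `goals` in `minutes`."""
    return prior_alpha + goals, prior_beta + minutes - goals


def beta_mean_sd(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total**2 * (total + 1))
    return mean, variance**0.5


for team, goals in [("SM", 6), ("AR", 10)]:
    a, b = beta_posterior(goals, 20)
    mean, sd = beta_mean_sd(a, b)
    print(f"{team}: Beta({a:g}, {b:g}), mean = {mean:.3f}, sd = {sd:.3f}")
```

Note that the posterior mean for SM is 7/22 rather than 6/20: the uniform prior pulls the estimate slightly towards 0.5.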
Prediction
For the prediction we again use the Binomial distribution for the number of goals, as in the previous model. This time, however, we use a distribution for the probability of scoring a goal in a given minute. Putting the two together, the Beta distribution and the Binomial distribution, a compound distribution arises: the BetaBinomial distribution. With this distribution we can compute the probability of every possible result analytically.
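As an illustration of the analytic route, the BetaBinomial probability mass function is P(K = k) = C(n, k) B(k + alpha, n - k + beta) / B(alpha, beta), where B is the Beta function. The sketch below evaluates it with log-gamma functions from the standard library and sums over all score pairs; the function names are my own, the teams are assumed independent, and a win means strictly more goals.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(K = k) when K | p ~ Binomial(n, p) and p ~ Beta(alpha, beta)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


def win_probability(pmf_a, pmf_b):
    """P(A > B) for independent scores with the given probability tables."""
    total, prob_b_below = 0.0, 0.0
    for prob_a, prob_b in zip(pmf_a, pmf_b):
        total += prob_a * prob_b_below  # A = k and B < k
        prob_b_below += prob_b
    return total


n = 60
pmf_sm = [beta_binomial_pmf(k, n, 7, 15) for k in range(n + 1)]   # SM posterior
pmf_ar = [beta_binomial_pmf(k, n, 11, 11) for k in range(n + 1)]  # AR posterior
print(f"P(SM wins) under the BetaBinomial model: {win_probability(pmf_sm, pmf_ar):.3f}")
```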
Also, we can simulate the results. Using simulation we will get a better understanding of this model:
For SM and AR:
1. We draw a probability of making goals from the Beta distribution (p).
2. Using p, we draw a possible result from a Binomial with parameters p and n = 60.
3. Repeat the previous two steps N times, with N large enough.
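The simulation steps can be sketched with the standard library alone. This is not the notebook's code: the function names, the seed and the number of simulations are arbitrary choices, and each Binomial draw is built as a sum of 60 per-minute Bernoulli trials.

```python
import random


def simulate_goals(alpha, beta, minutes, rng):
    """Draw p from Beta(alpha, beta), then a goal count from Binomial(minutes, p)."""
    p = rng.betavariate(alpha, beta)
    return sum(rng.random() < p for _ in range(minutes))


def simulate_matches(n_sims=20_000, minutes=60, seed=0):
    """Return a list of simulated (SM goals, AR goals) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sims):
        sm = simulate_goals(7, 15, minutes, rng)   # SM posterior: Beta(7, 15)
        ar = simulate_goals(11, 11, minutes, rng)  # AR posterior: Beta(11, 11)
        results.append((sm, ar))
    return results


results = simulate_matches()
sm_wins = sum(sm > ar for sm, ar in results) / len(results)
print(f"Estimated P(SM wins): {sm_wins:.3f}")
```

Drawing a fresh p for every simulated match is what separates this from the original model, where p was fixed at 0.3 and 0.5.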
Results
![Simulations using the original model](/content/images/2016/07/Simulationoriginalmodel.png)
Simulations using the original model
![Simulations using the Bayesian model](/content/images/2016/07/SimulationBayesianmodel.png)
Simulations using the Bayesian model
In the previous article, I pointed out that the results were quite polarised. As you can see, this is not the case with the BetaBinomial model. Using the Binomial model, the probability of SM winning the match was less than 1%. With the BetaBinomial model, by contrast, the probability is 12.6%. To verify this result I used bootstrapping: 95% of the time the probability falls between 12.6% and 13.2%. That is more than 15 times larger than under the Binomial model.
![Probability](/content/images/2016/07/Bootsrappingprobability.png)
Probability
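The post does not spell out how the bootstrap was run, so the following is only one plausible version: a percentile bootstrap over simulated win/loss flags. All names and sample sizes are assumptions, and the width of the interval depends on how many matches are simulated, so it will not necessarily reproduce the 12.6% to 13.2% range.

```python
import random


def simulate_win_flags(n_sims, rng, minutes=60):
    """Simulate matches and return 1 where SM scores strictly more than AR, else 0."""
    flags = []
    for _ in range(n_sims):
        p_sm = rng.betavariate(7, 15)   # SM posterior
        p_ar = rng.betavariate(11, 11)  # AR posterior
        sm = sum(rng.random() < p_sm for _ in range(minutes))
        ar = sum(rng.random() < p_ar for _ in range(minutes))
        flags.append(1 if sm > ar else 0)
    return flags


def bootstrap_interval(flags, rng, n_resamples=500, level=0.95):
    """Percentile bootstrap interval for the mean of `flags`."""
    estimates = sorted(
        sum(rng.choices(flags, k=len(flags))) / len(flags)  # resample with replacement
        for _ in range(n_resamples)
    )
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]


rng = random.Random(0)
flags = simulate_win_flags(5_000, rng)
low, high = bootstrap_interval(flags, rng)
print(f"P(SM wins): {sum(flags) / len(flags):.3f}, 95% interval: [{low:.3f}, {high:.3f}]")
```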
Now the question is: which is the correct model? To answer it, we have to notice which information was not used in the first model. The Binomial model only used the ratio of goals to minutes. For SM we have 6/20, which is the same as 60/200 and 600/2000. In the Bayesian model, however, making this prediction from 20 minutes of data is not the same as making it from 2000 minutes. The more information we have, the more accurate our prediction will be.
![Simulations when increasing the parameters in loglog scale](/content/images/2016/07/Simulationswhenincreasingtheparametersinloglogscale1.png)
Simulations when increasing the parameters in loglog scale
If we increase the sample size from 6/20 to 60/200 and 600/2000, the results of the BetaBinomial model become more and more similar to those of the Binomial model, as shown in the figure above.
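The same convergence can be checked numerically without simulation. The sketch below is an illustration under the assumptions used throughout (uniform prior, independent teams, a win means strictly more goals): it scales the observed goals and minutes by 1, 10 and 100, keeping the ratios at 6/20 and 10/20, and compares the exact BetaBinomial win probability for SM with the Binomial one. The function names are hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(K = k) when K | p ~ Binomial(n, p) and p ~ Beta(alpha, beta)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


def binomial_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def win_probability(pmf_a, pmf_b):
    """P(A > B) for independent scores with the given probability tables."""
    total, prob_b_below = 0.0, 0.0
    for prob_a, prob_b in zip(pmf_a, pmf_b):
        total += prob_a * prob_b_below
        prob_b_below += prob_b
    return total


n = 60
scores = range(n + 1)
binomial_win = win_probability(
    [binomial_pmf(k, n, 0.3) for k in scores],
    [binomial_pmf(k, n, 0.5) for k in scores],
)

win_by_scale = {}
for scale in (1, 10, 100):
    goals_sm, goals_ar, minutes = 6 * scale, 10 * scale, 20 * scale
    pmf_sm = [beta_binomial_pmf(k, n, 1 + goals_sm, 1 + minutes - goals_sm) for k in scores]
    pmf_ar = [beta_binomial_pmf(k, n, 1 + goals_ar, 1 + minutes - goals_ar) for k in scores]
    win_by_scale[scale] = win_probability(pmf_sm, pmf_ar)
    print(f"{goals_sm}/{minutes} vs {goals_ar}/{minutes}: P(SM wins) = {win_by_scale[scale]:.4f}")

print(f"Binomial model: P(SM wins) = {binomial_win:.4f}")
```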
To conclude, the results of the two models will eventually converge when the amount of information is large enough. With a small sample, on the other hand, it is very important to model what is unknown: in this case, the probability of scoring a goal in a given minute.
Note: Here is my IPython notebook for this post.