By John Haigh

Great memories. Image from DHD Photo Gallery

Data collected from professional soccer matches suggest strongly that the times when goals are scored are fairly random, with two minor modifications: more goals are scored, on average, in a given five-minute period late in the game than earlier; and "goals beget goals" in the sense that the more goals that have already been scored up to the present time, the greater the average number of
goals in the rest of the match. But these two points are second order factors: by and large, the simple model which assumes that goals come along at random at some average rate, and irrespective of the score, fits the data quite well.

The goal-scoring rate does vary according to which teams are playing, and there are clear patterns that pick out historical periods of greater or lesser rates than the present. But suppose you want to estimate the total number of goals in any of the ten or so matches to be played over a given weekend in the English Premiership. Your local bookmaker has no team loyalty to cloud his vision: he is merely interested in getting his odds correct and, whatever the game, his estimate of the average total number of goals seldom strays from the range 2.6 to 2.9. The fact that he remains in business is testimony to his being broadly right.

### Scoring first

And it's a goal!

If the two teams are evenly matched (and ignoring the well-documented phenomenon of home advantage), then any goal that is scored is equally likely to fall to either team. How likely is it, in these circumstances, that the team that scores first will win the game?

It should be intuitively obvious that the answer to that question will depend on the mean (average) number of goals we expect to be scored. For if this mean number is very small, then the first goal is likely to be the only goal, so the team who scores it will almost certainly win; while if it is large (think of basketball), scoring first is almost irrelevant. But once we have decided on a value for the mean number of goals, our model, that goals are scored at random, leads directly to the respective chances of 0, 1, 2, etc. goals in the whole match. The formula for the chance there are exactly $k$ goals, when the mean number expected is $\mu$, is $e^{-\mu}\mu^k/k!$, the so-called Poisson distribution. The table shows the respective chances of given numbers of goals for some typical values of $\mu$.

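| Number of goals | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more |
|---|---|---|---|---|---|---|---|
| $\mu = 2$ | 0.135 | 0.271 | 0.271 | 0.180 | 0.090 | 0.036 | 0.017 |
| $\mu = 2.5$ | 0.082 | 0.205 | 0.257 | 0.214 | 0.134 | 0.067 | 0.041 |
| $\mu = 3$ | 0.050 | 0.149 | 0.224 | 0.224 | 0.168 | 0.101 | 0.084 |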
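The entries in a table like this can be checked with a few lines of Python. The sketch below is only an illustration: the function names `poisson_pmf` and `goal_table_row` are made up for this example, and the means 2, 2.5 and 3 are the ones implied by the first column of probabilities, since $e^{-2} \approx 0.135$, $e^{-2.5} \approx 0.082$ and $e^{-3} \approx 0.050$.

```python
from math import exp, factorial


def poisson_pmf(k: int, mu: float) -> float:
    """Chance of exactly k goals when the mean number of goals is mu."""
    return exp(-mu) * mu**k / factorial(k)


def goal_table_row(mu: float, max_goals: int = 5) -> list[float]:
    """Chances of 0, 1, ..., max_goals goals, then 'more than max_goals'."""
    row = [poisson_pmf(k, mu) for k in range(max_goals + 1)]
    row.append(1.0 - sum(row))  # everything left over is the tail
    return row


# Illustrative means only; any positive value can be used.
for mu in (2.0, 2.5, 3.0):
    print(mu, [round(p, 3) for p in goal_table_row(mu)])
```

The printed rows should agree with the table to within a unit in the last decimal place; small differences can arise depending on whether the final column is computed before or after rounding the others.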

Since we are interested only in those matches where at least one goal is scored, we must scale up these probabilities to eliminate goal-less draws, and hence the chance such a match has $k$ goals is

$$\frac{e^{-\mu}\mu^k}{k!\,(1-e^{-\mu})}, \qquad k = 1, 2, 3, \ldots$$

To solve our problem, we look separately at matches with exactly 1, 2, 3, etc. goals. Let $w_k$ denote the chance that the team scoring first wins a game in which exactly $k$ goals are scored, and let $d_k$ be the chance such a game is drawn. Plainly, if a match has one goal only, the team that scores it must win, so $w_1 = 1$ and $d_1 = 0$.

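If a match has two goals, our model implies that the second goal is equally likely to go to either team, so $w_2 = d_2 = \tfrac{1}{2}$. We continue in this fashion. Matches with an odd number of goals are never drawn, so suppose first that $k = 2m+1$ is odd; then plainly $d_{2m+1} = 0$, and we argue that

$$w_{2m+1} = \frac{1}{2} + \binom{2m}{m}\frac{1}{2^{2m+1}}.$$

To see this, note that the team scoring the first goal will win provided it scores at least $m$ of the remaining $2m$ goals. Under the assumption that each of the remaining goals is equally likely to be scored by either team, the number scored by the first team will have the Binomial distribution $\mathrm{Bin}(2m, \tfrac{1}{2})$, and hence the chance of at least $m$ is

$$\sum_{j=m}^{2m}\binom{2m}{j}\frac{1}{2^{2m}}.$$

This simplifies to the expression shown, by the symmetry of Pascal's Triangle about its centre: the terms with $j > m$ and those with $j < m$ have equal totals, so each group accounts for half of what is left once the central term $\binom{2m}{m}/2^{2m}$ is set aside.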

When $k = 2m$ is even, a similar argument shows that $w_{2m} = \tfrac{1}{2}$, and now

$$d_{2m} = \binom{2m-1}{m}\frac{1}{2^{2m-1}},$$

since, to draw, the other team must score exactly $m$ of the remaining $2m-1$ goals.
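The argument for a fixed number of goals can be checked by brute force. The sketch below, with the made-up name `first_scorer_outcomes`, simply shares out the goals after the first one according to a binomial distribution with success chance one half, and adds up the cases in which the first scorer finishes ahead or level. It assumes evenly matched teams, as in the text.

```python
from math import comb


def first_scorer_outcomes(k: int) -> tuple[float, float]:
    """Return (chance of a win, chance of a draw) for the team that scores
    first in a match with exactly k >= 1 goals between equal teams."""
    rest = k - 1                       # goals still to be shared out
    win = draw = 0.0
    for j in range(rest + 1):          # j = further goals for the first scorer
        p = comb(rest, j) / 2**rest    # binomial chance with p = 1/2
        mine, theirs = 1 + j, rest - j
        if mine > theirs:
            win += p
        elif mine == theirs:
            draw += p
    return win, draw


for k in range(1, 7):
    print(k, first_scorer_outcomes(k))
```

For one goal the function returns a certain win, for two goals an even split between win and draw, and for larger totals the values can be compared directly with the binomial-coefficient expressions for the odd and even cases.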

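If $W$ and $D$ denote the overall chances that the team scoring first will win or draw respectively, then

$$W = \sum_{k=1}^{\infty} w_k\,\frac{e^{-\mu}\mu^k}{k!\,(1-e^{-\mu})}, \qquad D = \sum_{k=1}^{\infty} d_k\,\frac{e^{-\mu}\mu^k}{k!\,(1-e^{-\mu})}.$$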

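Using the expressions shown above, and suppressing the working, these are found to reduce to

$$W = \frac{1}{2} + \frac{1}{e^{\mu}-1}\sum_{m=0}^{\infty}\frac{(\mu/2)^{2m+1}}{(2m+1)\,(m!)^2}$$

and

$$D = \frac{1}{e^{\mu}-1}\sum_{m=1}^{\infty}\frac{(\mu/2)^{2m}}{(m!)^2}.$$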

They are valid for all values of $\mu$, but it is not easy to make sense of them without trying some numbers. The table shows how little they vary when $\mu$ falls between 2.0 and 3.4, which covers most matches of interest.

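| $\mu$ | 2.0 | 2.2 | 2.4 | 2.6 | 2.8 | 3.0 | 3.2 | 3.4 |
|---|---|---|---|---|---|---|---|---|
| $W$ | 0.717 | 0.703 | 0.691 | 0.680 | 0.670 | 0.661 | 0.653 | 0.646 |
| $D$ | 0.200 | 0.203 | 0.204 | 0.205 | 0.204 | 0.203 | 0.202 | 0.200 |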
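Readers who would rather not wrestle with infinite series can obtain these figures numerically. The following sketch weights the fixed-goal chances by the Poisson chances of each total, conditional on at least one goal. The helper names `win_draw_given_goals` and `overall_win_draw`, and the cut-off of 60 goals, are choices made for this illustration only; the per-total chances are written in the binomial form that follows from the argument above.

```python
from math import comb, exp, factorial


def win_draw_given_goals(k: int) -> tuple[float, float]:
    """Win and draw chances for the first scorer when exactly k goals fall."""
    m, odd = divmod(k, 2)
    if odd:                                   # k = 2m + 1: no draws possible
        return 0.5 + comb(2 * m, m) / 2 ** (2 * m + 1), 0.0
    return 0.5, comb(2 * m - 1, m) / 2 ** (2 * m - 1)   # k = 2m


def overall_win_draw(mu: float, max_goals: int = 60) -> tuple[float, float]:
    """Overall win and draw chances for the first scorer, mean goals mu."""
    scale = 1.0 - exp(-mu)                    # removes goal-less draws
    win = draw = 0.0
    for k in range(1, max_goals + 1):         # terms beyond this are negligible
        p_k = exp(-mu) * mu**k / factorial(k) / scale
        w_k, d_k = win_draw_given_goals(k)
        win += w_k * p_k
        draw += d_k * p_k
    return win, draw


for tenths in range(20, 36, 2):               # mu = 2.0, 2.2, ..., 3.4
    mu = tenths / 10
    w, d = overall_win_draw(mu)
    print(f"{mu:.1f}  W={w:.3f}  D={d:.3f}")
```

Truncating the sum at 60 goals is harmless for means in this range, because the Poisson chance of so many goals is vanishingly small.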

Scoring first is vital

So in the Premiership, indeed most professional soccer, we expect a team to win about 2/3 of the games in which it scores first, and draw about 1/5 of them. That offers the warm comfort that if your team scores first, it should lose only about one time in seven. You can check the match outcomes each week, and over a season, from information in the newspapers. Real data do conform well to these
proportions.
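If no newspapers are to hand, a simulation makes a reasonable substitute. The sketch below plays out imaginary matches under the simple model: goals arrive at random at a constant rate, and each goal is equally likely to go to either side. The function name `simulate_first_scorer`, the seed and the number of matches are arbitrary choices for this illustration, and the match length is scaled to 1 so that the rate equals the mean number of goals.

```python
import random


def simulate_first_scorer(mu: float, matches: int, seed: int = 2024):
    """Estimate the (win, draw, loss) proportions for the team scoring first,
    counting only matches that contain at least one goal."""
    rng = random.Random(seed)
    wins = draws = losses = 0
    for _ in range(matches):
        clock = rng.expovariate(mu)       # time of the first goal
        if clock > 1.0:
            continue                      # goal-less draw: not counted
        lead = 1                          # first scorer is one goal up
        clock += rng.expovariate(mu)      # waiting time to the next goal
        while clock <= 1.0:
            lead += 1 if rng.random() < 0.5 else -1
            clock += rng.expovariate(mu)
        if lead > 0:
            wins += 1
        elif lead == 0:
            draws += 1
        else:
            losses += 1
    counted = wins + draws + losses
    return wins / counted, draws / counted, losses / counted


print(simulate_first_scorer(2.6, 100_000))
```

With a mean of 2.6 goals and a hundred thousand matches, the three proportions should settle close to two thirds, one fifth and a little under one in seven, though any single run will wobble slightly around those values.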

### Red cards

Geert Ridder and two Dutch colleagues used this same Poisson model to decide when, in the interests of team success, a player should risk being sent off for "denying a clear goal-scoring opportunity". It was not their intention to advocate foul play, merely to make an academic study of a question that players may face during a match. In order to answer that question, they had to assess
what difference it made if one team was reduced to ten players only.

Being sent off usually has other consequences. The player may be fined and prohibited from playing in a number of future matches; his club may be fined, or even have points deducted if their disciplinary record is poor. These extra factors were ignored - it is hard enough to deal with the uncertainties in the rest of the present match.

They collected data from 340 Dutch games in which just one team had one player dismissed. The Poisson model was used to estimate how many goals each team would be expected to score in the time remaining after the dismissal, and this was compared with what actually happened. Nothing is inevitable - some teams did better with only ten men - but the overall message was clear. Having a man sent
off reduces the expected number of goals your team will score, and increases the expected number for your opponents. The data enabled Ridder and his colleagues to estimate the sizes of these changes, depending on how much time is left to play. The two models, one for the case where the player is sent off and one for the case where he is not, give estimates of the chances of each team scoring any given number of goals in the rest of the game.

Is it worth it?

Of course, if the sending-off offence would also lead to a penalty kick, then committing the offence is almost never justifiable. So concentrate on cases where the potential offence would occur outside the penalty area. What should the cynical defender do (purely to maximise his team's chances of winning or drawing the game)?

We need to know how likely it is that a goal will be scored if the offence is not committed. This will seldom be an easy judgement to make, but it is crucial to making the right decision. That decision can be neatly described by identifying a crossover time, T, corresponding to the particular chance that a goal will result. A player should risk a red card if, and only if, the crucial moment arises at time T or later. For games at a good professional level, a snapshot of the values is

| Chance of a goal | Crossover time |
|---|---|
| | 16 minutes |
| | 48 minutes |
| | 71 minutes |

Recall that this table applies when a "professional foul" would not lead to a penalty kick. Even the best players stumble, or mis-hit their shots, so the chance of a goal will only rarely be close to 100%. A player who gets sent off in these circumstances before half-time is likely to have made a miscalculation! On the other hand, very late in a tight game, the table suggests that if an attacker has a non-negligible chance of scoring, doing the nasty deed may be best for your team. As a football enthusiast, I dislike this implication of the table. But it suggests the desirability of changing the rules of soccer to encourage fair play: permit a referee to award a penalty kick (or even a goal) if a defender is sent off in these circumstances, wherever the crime is committed!
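The idea of a crossover time can be explored with a toy calculation. The sketch below is emphatically not the model fitted by Ridder and his colleagues: the scoring rate of 1.35 goals per team per match and the two multipliers applied after a sending-off (`own_factor` and `opp_factor`) are hypothetical placeholders, the scores are assumed level at the crucial moment, and the aim is taken to be maximising the chance of avoiding defeat. All function names are invented for this illustration.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """Chance of exactly k goals when the mean number of goals is 'mean'."""
    return exp(-mean) * mean**k / factorial(k)


def avoid_defeat(lead: int, mean_us: float, mean_them: float,
                 max_goals: int = 20) -> float:
    """Chance of finishing level or ahead, starting 'lead' goals up, when the
    two teams' remaining goals are independent Poisson counts."""
    ours = [poisson_pmf(a, mean_us) for a in range(max_goals + 1)]
    theirs = [poisson_pmf(b, mean_them) for b in range(max_goals + 1)]
    return sum(pa * pb
               for a, pa in enumerate(ours)
               for b, pb in enumerate(theirs)
               if lead + a >= b)


def crossover_minute(p_goal: float, rate: float = 1.35,
                     own_factor: float = 1.0, opp_factor: float = 1.9) -> int:
    """Earliest minute at which fouling is at least as good as not fouling.
    rate, own_factor and opp_factor are HYPOTHETICAL illustration values."""
    for minute in range(91):
        normal = rate * (90 - minute) / 90    # mean goals left, 11 v 11
        foul = avoid_defeat(0, own_factor * normal, opp_factor * normal)
        no_foul = (p_goal * avoid_defeat(-1, normal, normal)
                   + (1 - p_goal) * avoid_defeat(0, normal, normal))
        if foul >= no_foul:
            return minute
    return 90


for p in (1.0, 0.6, 0.3):
    print(p, crossover_minute(p))
```

Because the inputs are placeholders, the minutes printed should not be expected to match the published crossover times. What the sketch does show is the shape of the decision: the less likely the goal, the later in the game the foul starts to pay.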