With this post I take a “femquant” approach to my study of data science in football. Following Nikita Taparia’s lead, I try to articulate in more detail what it means that women are being left behind by the sports data revolution.
Is there something different in the women’s game?
Two of the last three first halves of Women’s Champions League finals ended 4-0. Is there something in this score that is specific to women’s football? Is this something bad?
In a low-scoring team sport where the most common score is 1-1, the finals of the most coveted competition are expected to be decided by marginal differences. In the men’s Champions League finals of the last 30 years, no side has scored four goals in the first half. Even when Real Madrid put four past Juventus in 2017, the first half ended 1-1. Likewise, when Real beat Atletico 4-1 in 2014, the score was 1-1 before extra time. One potential explanation of the 4-0 half-time scores links to the argument that the game dynamics of men’s and women’s football are intrinsically different [2,3]. The first half of a game is when the game plan is most effective: plans limit unpredictability, so goals are less frequent. It has been shown that most goals are scored towards the end of each half, when factors that disrupt the game plan, such as fatigue, kick in. Is there something different in the women’s game with respect to goal time analysis? Apparently yes. Research on the impact of scoring first [5,6] shows that the importance of scoring first in women’s football is in line with men’s; what differs is the importance of first goals scored during the first 15 minutes of the game. Early first goals in women’s football are almost as decisive as those scored close to the end of the match. This is different from the men’s game, where outcomes seem to conform more clearly to the common-sense view that the closer to the end of the match a first goal comes, the more important it is, because there is less time to recover. This aspect of the women’s game was perhaps known to the USA women’s team at the 2019 World Cup where, by design, the US team started each game strongly and managed to score in the first 15 minutes in every game of the tournament except the final.
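To make the goal time analysis more concrete, here is a minimal sketch of how one could bin first goals into 15-minute intervals and compute how often the team scoring first goes on to win. The match records, the function names and the choice to fold stoppage time into the last interval are all my own assumptions for illustration; this is not the method used in the cited studies, and the rows are invented placeholders rather than real results.

```python
from collections import defaultdict

# Hypothetical records: (minute of the first goal, result for the team that scored it).
# These rows are invented placeholders, not real match data.
MATCHES = [
    (7, "win"), (12, "win"), (14, "draw"),
    (33, "win"), (41, "loss"),
    (58, "win"), (67, "draw"),
    (81, "win"), (88, "win"),
]


def interval_label(minute, width=15):
    """Return a label such as '0-15' for the interval that contains `minute`."""
    capped = min(max(minute, 1), 90)  # assumption: stoppage time counts as minute 90
    start = ((capped - 1) // width) * width
    return f"{start}-{start + width}"


def win_rate_by_interval(matches, width=15):
    """Share of matches won by the first-scoring team, keyed by first-goal interval."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for minute, result in matches:
        label = interval_label(minute, width)
        totals[label] += 1
        if result == "win":
            wins[label] += 1
    return {label: wins[label] / totals[label] for label in totals}


if __name__ == "__main__":
    rates = win_rate_by_interval(MATCHES)
    # Sort intervals by their starting minute before printing.
    for label in sorted(rates, key=lambda text: int(text.split("-")[0])):
        print(f"{label} min: {rates[label]:.0%} won by the team scoring first")
```

With real data for both the men’s and the women’s game, comparing the first interval against the last one would show whether early first goals really are almost as decisive as late ones.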
Do we really need to mobilise gender distinctions?
However, there are other possible explanations of the 4-0 half-time scores that do not need to mobilise gender distinctions. Champions League finals usually involve the best and most well-known football teams, with famous players whose performances have been televised, analysed and dissected endless times. There is plenty of data for the coaching staff to prepare a game plan, and players have plenty of knowledge of their opponents. The only comparable result in a men’s Champions League final came in 1994, when AC Milan scored four in less than an hour against Barcelona. That was nearly 30 years ago, when tactical video analysis was done using VHS tapes. There was hardly any digital data, let alone ball event or positional data. Of the two major football data providers, Opta was founded two years later, in 1996, and Wyscout ten years later, in 2004. Now that all this data is available, does a 4-0 half-time score in these Women’s Champions League (WCL) finals mean that the coaching staff did not do their homework and failed to provide players with a game plan or enough information about the opposition?
Let’s take the data argument a bit further
We can take the ‘data’ argument a bit further and look at where data actually comes from. Let’s start by comparing the chances teams in the men’s game have to gather knowledge about the opposition with those in the women’s game, for example by looking at how many times teams play each other. It is arguable that the more often teams and players face each other, the more data and knowledge are generated, the easier it is to predict how the opposition plays, and the tighter the margins. Some might argue that, because the number of professional female footballers in Europe is so much smaller than the number of males (UEFA figures are 3,572 women, including semi-professionals, against 53,077 men), the women’s game should be more tight-knit than the men’s: top women players would know each other better. Looking at international men’s football for comparison, however, one quickly realises that only a few top clubs can afford the best players. As a consequence, these players move almost exclusively between these few top teams. Anecdotal evidence from this year’s four men’s CL semi-finalists (Chelsea, Man City, PSG and Real Madrid) shows that Real Madrid fielded two former Chelsea players, and both Chelsea’s coach and captain were with another semi-finalist (Paris Saint-Germain) last season. Also, in five of the last 10 men’s finals, the finalists had already played each other at least once, either in the earlier stages of the tournament or in domestic competitions, before meeting in the CL final. That again shows how many chances the small club of top European men’s teams has to gather intelligence about each other. By comparison, the women’s game has fewer international games (only 89 fixtures are played in the women’s CL as opposed to 124 in the men’s) and fewer domestic games per season (of which fewer are televised).
That in turn means that the chances to play the same team during the season, and so to generate data and, more generally, knowledge about it, are lower.
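To put the figures quoted above side by side, a short calculation helps. The numbers are the ones given in the text; the variable and function names are my own, and the sketch does nothing more than simple ratios.

```python
# Figures quoted in the text above.
WOMEN_PROFESSIONALS = 3_572   # UEFA figure, includes semi-professionals
MEN_PROFESSIONALS = 53_077    # UEFA figure
WOMEN_CL_FIXTURES = 89
MEN_CL_FIXTURES = 124


def percentage(part, whole):
    """Return `part` expressed as a percentage of `whole`."""
    return 100 * part / whole


players_pct = percentage(WOMEN_PROFESSIONALS, MEN_PROFESSIONALS)
fixtures_pct = percentage(WOMEN_CL_FIXTURES, MEN_CL_FIXTURES)
men_per_woman = MEN_PROFESSIONALS / WOMEN_PROFESSIONALS

print(f"Female professionals per 100 male professionals: {players_pct:.1f}")
print(f"Male professionals per female professional: {men_per_woman:.1f}")
print(f"Women's CL fixtures as a share of men's: {fixtures_pct:.1f}%")
```

In other words, there are roughly 15 male professionals for every female one, while the women’s CL plays about 72% as many fixtures as the men’s.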
It’s not incompetence but a consequence of how resources are distributed
From what we have said so far, it turns out that those 4-0 half-time scores could be down to inferior resources rather than negligence: fewer resources, fewer games played, less information about the opposition. Let’s now turn to the international outlook of the men’s game for comparison. In terms of players having played together in their national teams, of the 16 nationalities represented in this year’s men’s CL final, seven will be represented in both teams, and four of them in equal numbers. By comparison, in the women’s final between Chelsea and Barcelona there was no overlap of nationality between the two starting elevens. That would be an unprecedented statistic for a men’s CL final. It should also be noted that women’s football is more evenly distributed across continents, with national teams in Oceania, America and Asia as competitive as those from European countries. Until perhaps this season, few US and Australian internationals, for example, played in European leagues. That means that even if more resources were put into organising more international fixtures in the women’s game, this would not help as much as it does in men’s football to increase the chances for top players to learn about their opponents at club level. Evidence of how geographically spread out the leading forces of the women’s game are would take us towards an evolutionary argument: the CL finals were 4-0 at half-time because the women’s game is at an earlier stage of development, where globalisation has not yet brought the same degree of conformity that is seen in the men’s game.
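The nationality comparison above boils down to a set overlap between two starting elevens. Here is a minimal sketch of how one could compute it. The line-ups use abstract placeholder nationalities ("A", "B", ...) that I made up, not the actual finalists’ teams, and the function name is hypothetical.

```python
from collections import Counter


def nationality_overlap(xi_one, xi_two):
    """Return (shared nationalities, those shared in equal numbers) for two line-ups."""
    count_one, count_two = Counter(xi_one), Counter(xi_two)
    shared = set(count_one) & set(count_two)
    equal = {nation for nation in shared if count_one[nation] == count_two[nation]}
    return shared, equal


# Invented placeholder line-ups: one nationality label per starting player.
XI_ONE = ["A", "A", "A", "A", "B", "B", "C", "D", "E", "F", "G"]
XI_TWO = ["A", "A", "B", "B", "C", "H", "H", "H", "I", "J", "K"]

shared, equal = nationality_overlap(XI_ONE, XI_TWO)
print(f"Nationalities in both line-ups: {sorted(shared)}")
print(f"Of which in equal numbers: {sorted(equal)}")
```

Two line-ups with no nationality in common, as in the women’s final described above, would simply return two empty sets.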
The beauty of the numbers game
A final point on the attractiveness of the beautiful game. You may have heard that twelve of the top men’s clubs in Europe wanted to establish a Super League where they would play each other every week. Nobody knows exactly how entertaining this could have been. The closest comparison is when Real Madrid and Barcelona played each other 8 times in just over 12 months between November 2010 and December 2011. Except for three occasions, these matches resulted in three draws and two wins by a one-goal difference. Most importantly, this series of rematches is remembered for red card statistics 84% higher than the game average and close to twice as many yellow cards per game. What’s more spectacular, then: a score of 4-0 by half-time between two teams that never played each other, or a draw between two pig-headed teams that met for the 8th time in just over a year? As the argument goes, give more time and more resources to the women’s game and it will become as tightly connected as the men’s and show the same degree of conformity. Still, is that what one should wish for the women’s game? Instead of providing more of the same, what about exploring a different model of resource distribution that would retain the unique unpredictability, extended geographical outreach and diversity of the women’s game?
 Anderson, C., & Sally, D. (2013). The numbers game: Why everything you know about soccer is wrong. New York, NY: Penguin.
 Gómez, M. Á., Álvaro, J., & Barriopedro, M. (2008). Behaviour patterns of finishing plays in female and male soccer. Kronos. Rendimiento En El Deporte, 8(14), 5–14.
 Kirkendall, D. T. (2007). Issues in training the female player. British Journal of Sports Medicine, 41(Suppl 1), 64–7.
 Ibáñez, S. J., Pérez-Goye, J. A., Courel-Ibáñez, J., & García-Rubio, J. (2018). The impact of scoring first on match outcome in women’s professional football. International Journal of Performance Analysis in Sport, 18(2), 318–326.
 Lago-Peñas, C., Gómez-Ruano, M., Megías-Navarro, D., & Pollard, R. (2016). Home advantage in football: Examining the effect of scoring first on match outcome in the five major European leagues. International Journal of Performance Analysis in Sport, 16(2), 411–421.