The Next Pirlo? Visually Comparing Central Midfielders

Comparing footballers using only stats can be difficult, and different approaches have to be taken to accommodate the type of player being evaluated. Forwards have traditionally been “analyzed” (even by the math-phobic) based on goals, though this has led to some Andy Carroll-sized problems (better to look at their expected goals per 90, as 11tegen11 showed recently). Some evaluate keepers on save percentage, but that too can be quite flawed.

Even more difficult to evaluate than those who make their money in the penalty area are the men in the center of the park, whose contributions were all but unquantifiable until recently. After all, goals were really the only player data universally recorded even 10 years ago. Maybe you’d find assists, cards, or minutes, but none of those says a great deal about a central midfielder’s achievements.

At the OptaPro Analytics Forum, Marek Kwiatkowski used 2012/13 Opta passing data in an attempt to compare central midfielders in the Premier League, Bundesliga, La Liga, Serie A, and Ligue 1. This is the kind of creative analysis that needs to happen in order to set us on the road toward quantitatively identifying the next Pirlo/Xavi/Gerrard/etc.

The foundation of Kwiatkowski's measures of Mikel Arteta's passing angles and volume.
Kwiatkowski’s diagram of Mikel Arteta’s passing. The segments are foundational to his pass length and volume dissimilarity scores.

I found Kwiatkowski’s analysis interesting, and potentially important. First he segmented the 360 degrees on offer for passes into 16 sections, as seen above. From there each player’s passing was broken down in three ways.

  1. Distance. Within each angle segment, what were the player’s pass length tendencies?
  2. Volume. How often did he pass in certain directions?
  3. Position. Breaking away from the passes’ outcome, where was he passing from, on average?
Kwiatkowski’s GIF illustrating his technique for categorizing player passing location.
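
To make the three measures a little more concrete, here is a minimal Python sketch of how passes might be binned into the 16 segments. It is not Kwiatkowski's code. The function names, the (x0, y0, x1, y1) pass format, and the choice to summarize pass length with a simple mean per segment are all assumptions made for illustration; his actual study compares length tendencies in more detail than a single average.

```python
import math

N_SECTORS = 16  # the 16 angular segments described above


def sector_index(dx, dy, n_sectors=N_SECTORS):
    """Map a pass direction (dx, dy) to a segment number in [0, n_sectors)."""
    angle = math.atan2(dy, dx) % (2 * math.pi)  # direction in [0, 2*pi]
    width = 2 * math.pi / n_sectors
    # The final modulo guards against rounding that lands exactly on 2*pi.
    return int(angle / width) % n_sectors


def passing_profile(passes, n_sectors=N_SECTORS):
    """Summarize a player's passes (hypothetical input format).

    passes: list of (x0, y0, x1, y1) tuples, pass origin and destination.
    Returns (volume share per segment, mean length per segment, mean origin).
    """
    if not passes:
        raise ValueError("need at least one pass")
    counts = [0] * n_sectors
    length_sums = [0.0] * n_sectors
    sum_x = sum_y = 0.0
    for x0, y0, x1, y1 in passes:
        dx, dy = x1 - x0, y1 - y0
        s = sector_index(dx, dy, n_sectors)
        counts[s] += 1                        # volume: how often this direction
        length_sums[s] += math.hypot(dx, dy)  # distance: how long in this direction
        sum_x += x0                           # position: where the pass started
        sum_y += y0
    total = len(passes)
    volume = [c / total for c in counts]
    mean_length = [ls / c if c else 0.0 for ls, c in zip(length_sums, counts)]
    mean_origin = (sum_x / total, sum_y / total)
    return volume, mean_length, mean_origin
```

Segments with no passes get a mean length of 0.0 here, which is a convenience rather than a considered modeling choice.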

For all three categories, he calculated dissimilarity between all pairings of the 137 central midfielders in his data set, then tallied them together, with a lesser weighting for the position score, to get an overall player dissimilarity measure. The smaller the number, the more similar the players.
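
As a rough illustration of that last step, the sketch below scores every pairing and sorts them from most to least similar. The Euclidean distance and the 0.5 position weight are stand-ins chosen for simplicity, not the measures or weighting Kwiatkowski actually used, and the profile dictionary layout is hypothetical. In practice the three components sit on different scales and would need normalizing before being added together.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity(p, q, position_weight=0.5):
    """Combine three component scores; position counts for less.

    p, q: dicts with 'length' and 'volume' (one value per segment)
    and 'origin' (mean x, mean y). Layout is assumed for illustration.
    """
    return (euclidean(p["length"], q["length"])
            + euclidean(p["volume"], q["volume"])
            + position_weight * euclidean(p["origin"], q["origin"]))


def all_pairs(profiles, position_weight=0.5):
    """profiles: dict of player name -> profile.

    Returns (score, name_a, name_b) tuples, smallest score first,
    so the most similar pairings come at the top.
    """
    scores = [
        (dissimilarity(profiles[a], profiles[b], position_weight), a, b)
        for a, b in combinations(sorted(profiles), 2)
    ]
    return sorted(scores)


# Number of unordered pairings among 137 midfielders.
print(math.comb(137, 2))  # 9316
```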

I’ve grossly over-simplified the analysis; for greater detail on methodology, check out Kwiatkowski’s own article on StatsBomb about his study. I came away from the Opta Forum impressed, but felt that the visualization Marek used to display his findings could be improved greatly with some work in Tableau.

Thankfully, Marek agreed and sent me his dissimilarity scores, allowing me to build an interactive dashboard that can switch between every central midfielder he analyzed.

Please take advantage of all the interactivity at your disposal here. First, the dropdown menu allows you to choose any single player, or all of them at once. The filters are most useful when seeking signal within the noise of all 9,316 pairings. Also, if you select a pairing, or group of pairings, in either graph, it will act as a filter for the other graph.

Looking over some of the most similar pairs can be instructive, but keep in mind that this model is built only on passing. Differences in Mikel Arteta and Yaya Toure’s shooting, tackling, etc. won’t be reflected here. Same for Xavi and Thiago Alcántara or Marouane Fellaini and Isco.

There are certainly other limits to this kind of analysis, but most are similar to those that linger after traditional scouting. Few can make more than educated guesses at what is required of individual players tactically. Also, further study would be required to see if players’ passing tendencies are consistent year-to-year, with those who change clubs being of particular interest.

Despite those caveats, the questions this analysis raises are quite tantalizing. If you were to replace one of these players with his closest match, would his passing patterns be familiar to his new teammates? Theoretically, could teammates adjust to non-distribution differences more easily?

Smart analysis is seldom about giving an absolute, irrefutable answer, but instead it aims to offer knowledge that can push toward smarter decisions. Kwiatkowski’s analysis can simply help us compare central midfielders in a smarter way, and that has value.

One thought on “The Next Pirlo? Visually Comparing Central Midfielders”

  1. I think the (dis)similarity scores could, as you say, help to figure out who could replace whom most efficiently. The big question there, as you also mentioned, is how stable are these tendencies?

    Another question is how much these tendencies rely on the player’s teammates. If, for instance, you had season-by-season data, and regressed players’ 2012/13 numbers on their 2011/12 numbers, with an extra control variable for change of team (yes or no), then maybe you could get a sense for how well players are able to maintain passing tendencies, given that they either move teams or stay put. Does the system make the player, or does the player make the system?!

    Another thought is correlating a player’s tendencies (all three variables) to his team’s ability to generate quality shots. You could do that for both simultaneous data (X and Y variable occurring during the same season) and for lagged data (player’s passing tendencies in year N vs. his team’s scoring in year N+1).

    This is really a neat set of data for exploring the value of midfielders!
