Lesson from the pitch: take data viz insights one step at a time
More than half a century before modern sports metrics transformed the way teams, fans, and analysts look at performance stats, a man named Charles Reep began a groundbreaking data collection project—by hand. Painstakingly drawing play-by-play diagrams of every match his local Swindon Town Football Club played in 1950, Reep accumulated enough spatial data to start to see patterns and form strategic conclusions.
The only problem: he read the data wrong. Though Reep eventually became known as “the father of soccer analytics,” his assumption—most goals are generated after sequences of three passes or fewer—informed a faulty “long ball” approach to the game that was erroneously adopted throughout England for decades.
FiveThirtyEight’s Neil Paine explains:
“Reep’s mistake was to fixate on the percentage of goals generated by passing sequences of various lengths. Instead, he should have flipped things around, focusing on the probability that a given sequence would produce a goal.”
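Paine's distinction is easier to see with numbers. The sketch below uses hypothetical counts, invented purely for illustration and not drawn from Reep's notebooks, to compare the two ways of reading the same table: the share of all goals that came from each sequence length, and the probability that a sequence of that length ends in a goal. The bucket labels and function names are assumptions made for this example.

```python
# Hypothetical counts, invented for illustration only (not Reep's data).
# Each bucket maps to (number of passing sequences, goals scored from them).
sequences = {
    "0-3 passes": (9000, 90),
    "4+ passes": (1000, 20),
}

total_goals = sum(goals for _, goals in sequences.values())


def share_of_goals(bucket):
    """Reep's reading: of all goals scored, what fraction came from this bucket?"""
    return sequences[bucket][1] / total_goals


def goal_probability(bucket):
    """The flipped reading: how often does a sequence in this bucket end in a goal?"""
    count, goals = sequences[bucket]
    return goals / count


for bucket in sequences:
    print(
        f"{bucket}: {share_of_goals(bucket):.0%} of goals, "
        f"{goal_probability(bucket):.1%} chance of a goal per sequence"
    )
```

With these made-up numbers, short sequences account for about 82% of all goals simply because they are far more common, yet a longer sequence is twice as likely to produce a goal (2% versus 1%). The same table supports opposite tactical conclusions depending on which ratio is read from it.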
Statisticians and managers have since corrected that model, recognizing that controlling the ball is what increases a team’s probability of scoring. The short-passing tiki-taka style of play made famous by La Liga’s FC Barcelona exemplifies this strategy. And with the benefit of modern data visualization tools, passing sequences are getting a closer look once again.
Using dozens of illustrated data sets, Football Crunching analyzes passing series in the Premier League’s 2015/16 season. Editor Ricardo Tavares ranks the most common two-, three- and four-pass sequences according to the individual player combinations that produced them.
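One plausible way to produce a ranking like this is to slide a window over each unbroken passing move and tally the player combinations that appear. The sketch below is an assumption about how such a count could be done, not a description of Football Crunching's actual method. The `possessions` data, the placeholder player labels, and the `count_pass_sequences` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical passing moves: each list is the order in which players
# touched the ball during one unbroken spell of possession.
possessions = [
    ["A", "B", "C"],
    ["B", "A", "B", "C"],
    ["A", "B", "C"],
]


def count_pass_sequences(possessions, n_passes):
    """Count every run of n_passes consecutive passes, keyed by the players involved."""
    window = n_passes + 1  # n passes connect n + 1 players
    counts = Counter()
    for chain in possessions:
        for start in range(len(chain) - window + 1):
            counts[tuple(chain[start:start + window])] += 1
    return counts


two_pass = count_pass_sequences(possessions, 2)
three_pass = count_pass_sequences(possessions, 3)

# Rank the combinations from most to least frequent.
for players, n in two_pass.most_common():
    print(" -> ".join(players), n)
```

Keying the tally on the ordered tuple of players keeps A to B to C distinct from C to B to A, which matters when the direction of a move is part of what makes it effective.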
As an Arsenal fan, I was delighted to see that attacking midfielder Mesut Özil is a star in these visualizations. While many clubs’ most successful sequences are confined to relatively small portions of the pitch, Özil’s combinations occur all over the field. Hooking up frequently with fellow midfielder Aaron Ramsey and, more incisively, connecting with forward Alexis Sánchez inside the opposition’s box, Özil plays “a true free role, resulting in an incredible 19 assists during the season,” Tavares notes.
Recognizing stylistic differentiators in how each club deploys its key players is a strong initial insight. Perhaps to avoid repeating Reep’s misstep, however, Tavares isn’t rushing to any major conclusions just yet. “The next step is to classify the sequences by their nature instead of the players involved. After that, we will look at outcomes: how a team shoots and scores and how it relates to the build-up.”
Tavares’ project may seem a daunting one-man job, a far cry from the real-time visualization capabilities of, say, a global commerce platform. But he’s methodically on track to separate the signal from the noise: By identifying successful patterns, placing them in context, then exploring how they complement strengths and affect outcomes, the smart data scientist can make enough small connections to inform an overall winning strategy.