After the first leg of a home-&-home series, some supporters are always left scratching their heads as to what their club needs to accomplish in order to win the aggregate series. Since most home-&-home series use away goals as their primary tiebreaker, I decided to create a tool by which readers can enter team names, first leg scoreline, and then be shown the necessities for those sides in the second leg.
In short, you just have to replace “Team A” and “Team B” with club or national team names, abbreviations, or even nicknames (complimentary or not), then use the slider to select the first leg scoreline. You can use this for matchups in the Champions League, MLS, Liga MX, certain stages of some confederations’ national team qualifying for major tournaments, or even a competition between tiny clubs that very few care about.
I used the current state of the MLS conference finals for tabs illustrating the use of this tool. The New England Revolution obviously has a substantial advantage heading into the second leg, having struck twice in Red Bull Arena and conceding only once. Meanwhile, LA Galaxy have to be pretty happy with a 1-0 win in their home, though the Seattle Sounders seem quite capable of doing enough to take the series in the second leg.
In all of this, keep in mind that certain scorelines are more likely than others, so simply counting cells in the chart isn’t very useful. Hover over any one cell and a pop-up will note the percentage of soccer matches that generally end in that scoreline, based on this piece from the Soccer By the Numbers blog, which cataloged outcomes from the EPL, La Liga, Bundesliga, and Serie A seasons from 2005/06 through 2009/10. I simply averaged the distributions for those four leagues, so don’t consider this the be-all-end-all of scoreline distribution. The figures are also problematic predictors of particular home & home fixtures, especially since objectives are so different within a playoff structure, but they do help us set expectations on a basic level.
50% of those matches ended with the home side scoring between zero and two goals and the visitors either getting shut out or scoring once. That makes the six cells in the top left of each chart immensely more important than less-likely scorelines on the periphery. Expanding up to three home goals and two away tallies brings the total above 80%, then by five and four almost 99% of outcomes are covered, which is why those are the limits of the home-&-home away goals charts.
Speaking of those home-&-home maps, I just hope that they help some folks understand the away goals tiebreaker (regardless of my opinion that they are a poor fit for MLS playoffs). They are pretty easy to understand when you look at them from the right perspective, even if away goals are a thoroughly arbitrary way to lessen the likelihood of extra time, and cumulative home field advantage within a series.
For the first time this year, MLS is using away goals as the first tiebreaker in their playoff home and home matchups. Before we get to the pros and (mostly) cons of this rule, here’s a guide to the 2nd legs of the conference semifinals, taking away goals into account:
Cells with white bars above and below represent aggregate ties, and are the scenarios in which the rule matters. In previous MLS home-and-home matchups, the 2nd leg would have gone to extra time, but now you only get OT if the two legs’ scorelines are palindromic. It is important to note that away goals will only count in regulation, so on rare instances where a matchup reaches overtime, the home side will get a glimmer of an edge. But if they can’t take advantage in 30 minutes, the edge evaporates, as studies have shown that penalty kick shootouts are home field neutral.
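For anyone who wants to check a cell by hand, the rule described above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the comments, not the logic behind the charts themselves; the function name and the "A"/"B" labels are made up for the example.

```python
def second_leg_result(first_leg, second_leg):
    """Decide a two-legged series under the away goals tiebreaker.

    first_leg  = (Team A goals at home, Team B goals away)
    second_leg = (Team B goals at home, Team A goals away)
    Both scorelines are regulation-time scores.
    Returns "A", "B", or "extra time".
    """
    a_home, b_away = first_leg
    b_home, a_away = second_leg

    aggregate_a = a_home + a_away
    aggregate_b = b_home + b_away
    if aggregate_a != aggregate_b:
        return "A" if aggregate_a > aggregate_b else "B"

    # Aggregate tie: whoever scored more goals on the road advances.
    if a_away != b_away:
        return "A" if a_away > b_away else "B"

    # Level on aggregate and on away goals: the two legs mirror each other.
    return "extra time"


# First legs mentioned earlier: New York 1-2 New England, LA 1-0 Seattle.
print(second_leg_result((1, 2), (0, 1)))  # New England lose 0-1 at home
print(second_leg_result((1, 0), (2, 1)))  # Seattle win 2-1 at home
print(second_leg_result((1, 0), (1, 0)))  # Seattle win 1-0 at home
```

In the first call the aggregate is 2-2 but New England ("B") scored twice away, so they advance. In the second the aggregate is also 2-2 and LA ("A") go through on their lone away goal. Only the third, mirrored scoreline reaches extra time.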
In this year’s matchups, this means that Los Angeles & Seattle will be incentivized to clamp down the match and keep it low scoring, so that even an aggregate tie would land in their favor, or at least give them 30 minutes of home overtime. Same for New England, but all they have to do is hold Columbus to a couple of goals or less. The other top seed, D.C. United, has the trickiest path, as they need two goals, but if New York scores once, that number doubles.
Which brings us to the major problem with this setup. Home-and-home is designed to negate home field advantage with 90 minutes played in both settings, and the only way for there to be an advantage for the higher seed is for them to get an extra 30 minutes of home field in that second leg. An away goals tiebreaker makes that outcome even less likely than it was before. I’m not the only one who sees the issue this way, and you can read Brian Straus’ smart critique upon the rule’s introduction here.
It is shown that the observed differences in frequencies of winning between teams first playing away and those which are first playing at home can be completely explained by their performances on the group stage and – more importantly – by the teams’ general strength.
That’s what you want in a champions league competition where seeding is far from straightforward or trustworthy, but in a season-culminating playoff system? If MLS wants the MLS Cup to feel like the legitimate ultimate trophy for each season, they need a playoff system in which regular season excellence is rewarded, not neutered. They had that, though it was cumbersome, before 2003 in their best of three format, but they’ve been adrift ever since in home and home murkiness, and away goals is taking them even further from shore.
Champions of away goals in MLS, such as MLS’ Technical Director of Competition Jeff Agoos, point to various rationales like the rule promoting attacking play, drama, or being an “authentic” Europe-bred standard. Some very smart people in Europe have problems with it, though, and I have yet to see proof of any of these defenses of away goals.
Thankfully, MLS only have to peer south of the border for a clear, simple upgrade. Mexico’s Liga MX has playoffs in which the home-and-home tiebreaker is regular season record. Underdogs have to win outright, which makes far more sense. As things stand in MLS, there is effectively no difference between the 2nd and 3rd seeds in each conference, who meet each other to start their playoff runs, and the only advantage for the top seeds is the hope that their opponent is wounded from their play-in wild card match. Shouldn’t 34 matches carry much more weight than that?
Yesterday the MLS Players Union updated their MLS salary release, incorporating the wages of new signings and extensions granted recently to the likes of Kaka, Jermaine Jones, Graham Zusi, and Matt Besler. This release paints a useful picture of overall spending trends in the league right now, and we can take a couple extra steps with these figures to see what they imply for spending after the Collective Bargaining Agreement (CBA), whose negotiations loom over the coming MLS offseason.
First, let’s look at the salary data as it is. Here’s my visualization of all 568 players in the release who are assigned to one of the 19 current MLS clubs or next year’s expansion teams, New York City FC and Orlando City SC.
Yes, despite only listing five players here, Orlando City already has the fifth-highest payroll, driven almost entirely by Kaka, whose $6,660,000 base salary and $7,167,500 guaranteed compensation are the highest in MLS history. Orlando’s expansion partner, NYCFC, represents the biggest oddity in this data. One of their star signings, Frank Lampard, is not mentioned, and the other, David Villa, shows only $60,000. Considering both these players’ histories and the high-spending reputation of their new employer, Mansour bin Zayed Al Nahyan, it is likely that both will be pulling in a figure that would place them among the richest in the league.
These oddities, especially Villa’s farcical $60k, call to mind the history of managers, owners, and others within MLS downplaying the accuracy of particular players’ salaries listed in previous MLSPU releases. This is why these figures are most useful when viewed from a wide angle, and we should resist the urge to use them to label specific players “underpaid” or “overpaid.” They also don’t take into account MLS’ myriad salary cap mechanisms, like designated player designations, allocation money, retention funds, pro-rated transfer fees, homegrown status, Generation Adidas status, trades in which a player’s former club continues to pay some of his wages, the fact that only 20 of a team’s 30 players hit the cap at all, and the general accuracy of the MLSPU release. Confused? You can go read MLS Rules and Regulations if you’d like, but you should go in understanding that not all of the rules are publicly stated, and commissioner Don Garber has admitted that the league sometimes alters the rules when it is convenient to do so.
For this reason, I am not estimating clubs’ cap numbers armed with only the MLSPU release. Instead, what we have above is a simple ranking of clubs’ total wages, and a visual reminder of the disparities between the league’s stars, and its rank and file. It is alarming that Kaka will make over 135 times the veteran’s minimum, $48,500, but this kind of ratio is nearly assured by the Designated Player (DP) rule. Since only three players per club can make more than the DP level $387,500, it is 100% guaranteed that at least 90% of the players will make less than that. In actual practice this year, here is the breakdown of players within certain wage ranges:
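The arithmetic behind those two claims is easy to reproduce. The sketch below uses only figures quoted in this post (Kaka's base and guaranteed pay, the veteran's minimum, three DP slots, and the 30-man roster mentioned earlier); the variable names are just for illustration.

```python
KAKA_BASE = 6_660_000          # base salary from the MLSPU release
KAKA_GUARANTEED = 7_167_500    # guaranteed compensation
VETERAN_MINIMUM = 48_500
DP_SLOTS = 3                   # players per club allowed above the DP level
ROSTER_SIZE = 30               # assumes a full 30-player roster

base_ratio = KAKA_BASE / VETERAN_MINIMUM
guaranteed_ratio = KAKA_GUARANTEED / VETERAN_MINIMUM
non_dp_share = 1 - DP_SLOTS / ROSTER_SIZE

print(f"Base salary vs. veteran minimum: {base_ratio:.1f}x")
print(f"Guaranteed pay vs. veteran minimum: {guaranteed_ratio:.1f}x")
print(f"Minimum share of roster below DP level: {non_dp_share:.0%}")
```

On base salary alone the ratio is a bit over 137, and it approaches 148 on guaranteed compensation, so "over 135 times" holds either way. The 90% floor assumes every club fills all 30 roster spots.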
The middle class in MLS is so barren that the above histogram has to skip every salary bin containing zero players. Without this filter, the number of rows would render the chart on the left absurd, displaying 204 rows, 171 of them blank. On the right, we can see the shift in this salary distribution based on potential post-CBA salary cap levels. This figure defaults to $4,500,000, but you can use the arrows or the slider to adjust it, $100,000 at a time. There still is not much of a middle class, but at the very least the lower class will make enough to stop fretting over their monthly expenses. Feeding this view is a calculation for hypothetical wages which increases every non-DP player’s guaranteed compensation proportionally to the salary cap increase. For a player making $50,000, if the cap rose from $3.1 million to $6.2 million, their hypothetical wage would be $100,000.
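The hypothetical wage calculation described above boils down to one proportional scaling. Here is a minimal sketch of it; treating anyone above the $387,500 DP level as a designated player whose pay is left unchanged is an assumption made for the example, and the function name is hypothetical.

```python
DP_LEVEL = 387_500  # DP threshold quoted earlier in this post


def hypothetical_wage(guaranteed, old_cap, new_cap, dp_level=DP_LEVEL):
    """Scale a non-DP player's guaranteed pay in proportion to the cap.

    Assumption: a player paid above dp_level is a DP, and DP pay is
    returned unchanged.
    """
    if guaranteed > dp_level:
        return guaranteed
    return guaranteed * new_cap / old_cap


# The worked example from the text: the cap doubles from $3.1M to $6.2M.
print(hypothetical_wage(50_000, 3_100_000, 6_200_000))
# The default selector value of $4.5M, applied to the same player.
print(round(hypothetical_wage(50_000, 3_100_000, 4_500_000)))
```

The first call returns $100,000, matching the example above. At the default $4.5 million cap the same player would land at roughly $72,600.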
Hypothetical wages here are by no means meant to imply that every player will receive such a raise himself. The theory at play in the hypothetical visuals is that wage dynamics will stay roughly proportional across the board, and it is actually quite likely that some of the lowest paid (145 players, 25.5% of the player pool, currently make less than $50,000) will lose their jobs when/if the salary cap and minimum wage increase. Undoubtedly, many of these players deserve to make more money, but a fair number will get squeezed out in favor of new competitors who may have previously dodged MLS because the wages on offer were too low.
The dynamic there portends an oddity in the pending CBA negotiations, where up to a quarter of the current players may want to avoid pushing salaries very far beyond the status quo for fear of losing employment. Meanwhile, the more secure players will surely want an increased cap so they can reap the benefits not only for themselves, but for the sake of getting better teammates. Consider that MLS sponsors and broadcasters may also be calling for increased spending, and you see that CBA negotiations look to be complex even before considering topics that go beyond wages.
In any event, despite some conservative parties on both sides, it seems very likely that we will see a substantial increase in the MLS salary cap. Television ratings are increasing (though the numbers are still far below other leagues), attendance continues to rise, and broadcast rights fees and sponsorships are bringing in more money. My hope is that they set the salary cap as a percentage of league revenues, so it can self-correct over time. With that in mind, here’s a version of the first chart, but using hypothetical wages and also containing an interactive salary cap selector.
Again, the hypothetical wages here run off the assumption that the spread of sub-DP salaries on club rosters will rise roughly in proportion to the salary cap increase. This prediction won’t be 100% accurate, but given the league’s insistence that the cap and single entity aren’t going anywhere, it’s a safe bet that it will be close. The key dynamic I will look for in CBA negotiations is whether each senior designated player will still account for 12.5% of his club’s cap. If that percentage lowers, it will be an enormous victory for the richer clubs, as they would then have more freedom to spend elsewhere on the roster in support of their high-price stars. That could portend an enormous rise in the importance of spending, a factor whose relevance has been quite small thus far in MLS.
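To see why the DP percentage matters so much, it helps to run the numbers. The sketch below assumes a club carrying three senior DPs, each charged the same share of the cap, and uses the $3.1 million cap from the earlier example; the 10% alternative is purely hypothetical.

```python
def non_dp_budget(cap, dp_count=3, dp_share=0.125):
    """Cap room left for the rest of the roster after DP charges.

    Assumes every DP is charged the same fixed share of the cap.
    """
    return cap * (1 - dp_count * dp_share)


cap = 3_100_000
print(0.125 * cap)                        # cap charge per DP today
print(non_dp_budget(cap))                 # room left with three DPs at 12.5%
print(non_dp_budget(cap, dp_share=0.10))  # room left if the share fell to 10%
```

At 12.5% each, a DP costs $387,500 against the cap and three of them leave $1,937,500 for everyone else. Dropping the share to 10% would free up about $232,500 more for the supporting cast without changing the cap at all.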
The roster for Chivas USA (who will go on hiatus for two years) stacks onto the hypothetical Orlando squad as a means of seeing how the wages of a full team with Kaka would look. NYCFC’s reported squad was so thin that I didn’t alter their player list, but that does make for a notable under-reporting of potential league wages, especially with Sheikh Mansour expected to spend as aggressively as possible, as he has at Manchester City.
Commissioner Garber has stated many times that he wants MLS to become a top league in the world. If this is his true objective, the coming CBA is a prime opportunity for the league to aggressively increase wages, thus attracting and retaining better players with an aim to improve the standard of play and the marketability of the league and its clubs.
League tables lie, and MLS’ is among the least honest. With 19 clubs, one has to sit out every weekend, and even without those bye weeks the league is generally averse to scheduling the season such that clubs’ games played stay roughly even, so ranking by raw points is usually quite misleading. Instead, I’m ordering clubs by their points per game (PPG), which has the added benefit of being able to chart the PPG pace they’ll need over the rest of the season on the same axis.
Hover or click on a club’s crest or bar for a writeup of their place in each race.
The colored boxes represent the PPG that club will need over their remaining matches to compete for that particular objective. Each box spreads over 2 points in the final standings. For example, today (I aim to update this table at least on a weekly basis) the left side of the purple Supporters Shield box represents the pace needed to reach 64 points, the midpoint is 65, and the right side is 66. I’m roughly trying to place the boxes such that the left side represents a 50/50 chance for most clubs, and the right side is at least above 75%, all based on SportsClubStats’ Monte Carlo simulations. Note that the East and West have different point targets for the playoffs (blue) and top three (green), since the two conferences’ races are basically independent.
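The pace behind each box is simple division. As a rough sketch, assuming the 34-match MLS season and a hypothetical club sitting on 50 points after 26 games (neither the club nor the function names come from the table itself):

```python
SEASON_LENGTH = 34  # MLS regular season matches per club


def points_per_game(points, played):
    """Current points-per-game pace."""
    return points / played


def required_ppg(points, played, target, season_length=SEASON_LENGTH):
    """PPG needed over the remaining matches to finish on `target` points."""
    remaining = season_length - played
    if remaining <= 0:
        raise ValueError("no matches left to play")
    return max(target - points, 0) / remaining


# Hypothetical club: 50 points from 26 matches, chasing the Shield box.
print(round(points_per_game(50, 26), 2))
for target in (64, 65, 66):
    print(target, required_ppg(50, 26, target))
```

That club is currently running at about 1.92 PPG and would need 1.75, 1.875, and 2.0 PPG over its last eight matches to hit the left edge, midpoint, and right edge of the Shield box.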
The three races outlined are the most important when considering playoff implications. The Supporters Shield holder gets a CONCACAF Champions League (CCL) spot next year, the top three in each conference get byes to the conference semifinals, and fourth place hosts fifth in one wildcard match, with the winner meeting the top seed in the next round. None of the top three really get an advantage over the others, since the conference semifinals and finals are in home and home format, which neuters home field advantage. Sure, fourth is preferable to fifth because of wildcard hosting, but that’s a small factor in comparison to a bye or a CCL spot.
What we’re left with is a clear indication that the Supporters Shield race is between Seattle and LA, with DC or RSL only able to enter it with a huge rally alongside a stumble from the leaders. Essentially, the West has paired off with a Sounders/Galaxy battle for first, Salt Lake/Dallas grappling for third, and Vancouver/Colorado desperately trying to join those four in the playoffs, at least as the visitors in that wildcard match. Meanwhile, the East is one big jumble, with each club seemingly capable of falling or rising by a couple spots when all is said and done. Except Montreal, who can safely set up shop in the cellar.
The admitted blind spots in all this are schedule strength and tiebreakers. I note wins and goal differential in the writeups that pop up when you select a club in the table, but the specific point targets are conference-wide. A team with three more ties than clubs they are close to (prime example, Chicago, who may well set the MLS single season record for draws) would need to aim one point higher. Meanwhile, I’m not including fixture difficulty, but the PPG pace needed is still highly applicable whether the road to get there is rocky or smooth.
Overall, this should be markedly more helpful than the standard league table, but for other advanced views of MLS races, I highly recommend simulations on Sounder at Heart in their series, “State of the MLS Run In,” as well as on American Soccer Analysis. These sites take a more nuanced approach while projecting for each club, while I can update mine quickly and allow for assessment of all the races at a glance.
A while back¹, I looked into American Google Trends scores across 16 sports, both overall and in terms of seasonality. Inspired by FiveThirtyEight’s riffs on Trends, I found that Trends is a gauge of online interest or intrigue, not popularity per se, but the data is quite useful for comparing sports in some contexts. One of Trends’ most interesting features is the way it parses results geographically. While some sports’ fanbases seem likely to use Google to differing degrees (as illustrated by Part 1’s hypothetical baseball and soccer fans), conceptually I felt that Googling bias when comparing states should be smaller², making intra-sport regionality quite instructive. The map in the following Tableau dashboard compares each sport only against itself, state-to-state. Click the dropdown list in the top left to switch sports.
Note that a state’s Trends scores are all relative to the state in which that sport created the most interest, which always has a score of 100. For example, every other state’s score for tackle football is essentially a percentage of the Google searches made per Google use for that sport in Alabama. Soccer is a pretty steady presence from sea to shining sea, though its consistency is not as pronounced as its no-offseason nature lent it in Part 1’s seasonality study. Virginia’s top spot is interesting, as the state’s direct soccer legacy can be traced mainly to college soccer, which gets almost zero media coverage, and D.C. United, which plays its matches on the other side of the Potomac. The nation’s capital does seem to drive the results, though, as all of the most soccer-Trending Virginia cities reside relatively close to it, as illustrated in this map taken straight from Google Trends:
This map does not tell us whether it is driven by D.C. United, University of Virginia soccer, general demographics, or foreign-born ambassadors/lobbyists/etc living in the suburbs and searching for the beautiful game there, but it is notable that UVA is in Charlottesville, a couple hours away from the state’s soccer Trends epicenter.
Generally, soccer seems to be at its strongest in high-population states, as the top seven states by population all land 65 or above on Trends’ 100-point scale. The sport struggles most in sparsely-populated northern states, like North Dakota, Montana, Wyoming, and Alaska, with the 49th state setting the beautiful game’s floor (a relatively high floor at that) for American interest.
In some other sports we start to see a bias toward hotbeds of that particular sport in NCAA competitions. Alabama leads tackle football in Trends, with many of the NCAA’s most storied football programs (and very few states that house NFL teams) scattered through the top 12. Basketball’s top 10 states contain only three NBA franchises, the Indiana Pacers, Charlotte Bobcats, and the Cleveland Cavaliers, and all three of those states have hugely popular college teams as well. The most striking example is Louisiana topping baseball scores. LSU baseball is legendary in the niche world of college baseball, having been to 16 College World Series with six championships in the last 28 years. The Tigers have had the NCAA’s #1 baseball attendance for 19 straight years, averaging 10,754 per game when no other college drew more than 7,700. Similarly, other non-MLB states like Mississippi, Nebraska, South Carolina, Alabama, and Arkansas are in the baseball top 10.
Meanwhile, ice hockey unsurprisingly has a decidedly northern thrust. Only eight states had Trends scores above 40, and the furthest south of them was Massachusetts. Outside of Alaska (scoring 47), the most Western hockey-interested state is North Dakota. At the other end of the spectrum, 23 states had hockey Trends scores below 15. This means that in all of these states (mostly South and West) hockey searches occurred at most 15% as often as they did in the sport’s standard-bearing state, Minnesota. Of course, the severity of this geographic trend could very well be exaggerated by college hockey programs, which are far more regional than the NHL.
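The relative scaling behind these scores (top state pinned at 100, everyone else a percentage of it) can be sketched as follows. The raw figures here are invented for illustration, standing in for a sport's searches per 100,000 Google uses in each state; Google does not publish raw values like these, and this is not necessarily how Trends computes its scores internally.

```python
def trends_scores(searches_per_100k):
    """Rescale per-state search rates so the top state scores 100."""
    top = max(searches_per_100k.values())
    return {state: round(100 * rate / top)
            for state, rate in searches_per_100k.items()}


# Hypothetical hockey search rates, NOT real Trends data.
hockey = {"Minnesota": 400, "Massachusetts": 180, "Texas": 60}
print(trends_scores(hockey))
```

With these made-up inputs, Texas comes out at 15, meaning hockey searches there occur 15% as often (per Google use) as in Minnesota, which is the interpretation used above.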
Based on standard deviations of state Trends scores, cricket and lacrosse join hockey in drawing the most sporadic interest in this country. The cricket Trends map seems to somewhat mirror that of Indian immigration in this country, while lacrosse, for reasons unknown to me, seems to have elicited heaviest interest in Wisconsin, Maryland, Connecticut, and some other Northeastern states. If that list makes sense to you, please clue me in on lacrosse patterns in the comments. (Note: Jason Kuenle commented below that Wisconsin has a town named “La Crosse,” which unfortunately means that the lacrosse Trends scores for that state and some of its neighbors are decidedly suspect.)
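For anyone wanting to repeat the standard deviation comparison, here is a minimal sketch using Python's standard library. The two score lists are hypothetical stand-ins for a "steady" sport and a "regional" sport, not the actual Trends data behind the map.

```python
from statistics import pstdev

# Hypothetical state scores, NOT real Trends data.
state_scores = {
    "steady sport": [100, 85, 80, 75, 70, 65],
    "regional sport": [100, 47, 40, 14, 9, 5],
}

# A larger standard deviation means interest is more unevenly spread.
spread = {sport: pstdev(scores) for sport, scores in state_scores.items()}
for sport, sd in sorted(spread.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{sport}: {sd:.1f}")
```

The population standard deviation (pstdev) is used on the assumption that all states are in the data rather than a sample of them.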
Hawaii sticks out as the state with the most unique sports profile, popping up as a top three state for niche sports like rugby, ultimate fighting, volleyball, and swimming. If you are one of the few Americans with a passion for those competitions, good news; you may be best off living on a tropical island. Meanwhile, Hawaii has scores below fifty for each of the top four Trends sports: tackle football, soccer, basketball, and baseball.
If someone is deeply convinced that a certain sport will never be big in America, it could well be that they simply haven’t visited one of its domestic hotbeds³. While the apparent college sports bias does raise questions about the actual meaning of Google Trends scores, I feel that they are still a nice way to map online interest. You just have to realize that those driving online interest are likely far younger than the national average and their googling tendencies may well be skewed by which schools they attend. All of the above is a bit fragmented, but the biggest lesson here is that so are American sports preferences.
¹ OK, a lot earlier. This article got delayed because I have been quite busy personally and professionally of late. The delay is kind of a blessing in disguise, though, as it gave me time to mull the results of this study a bit more and come out the other side with what I think is an improved analysis.
² Especially given that each state’s Trends score is scaled based on overall Google searches in that state.
³ Also worth noting that there are fluctuations within states, though Trends’ publicly-available data only details the most popular cities for a particular search, meaning it would have been more difficult for me to map them below the state level.
Ever since I first charted a cosine curve on my Casio graphing calculator in high school pre-cal, I’ve been a fan of data visualization. The vast majority of the time, I employ it to learn things, but sometimes I just doodle for the aesthetic appeal. When this is the objective, I often use that trusty cosine curve as my basis.
Tableau has a contest this month for using their product artistically, and I decided to make a cosine graph, its vertical mirror image, and their trend lines with parameters that users could play with to change amplitude, wavelength, color, etc. of the waves. I’m not defining what each of the four parameter controls does at this point (if anyone cares, I’ll gladly post the formula that drives it); I just hope that someone enjoys tinkering with it like I do.
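For readers who would rather doodle in code, a generic cosine wave and its reflection can be sketched like this. To be clear, this is not the formula driving the Tableau piece: it is a textbook cosine with amplitude and wavelength parameters, and it assumes "vertical mirror image" means a reflection across the x-axis.

```python
import math


def wave_pair(x, amplitude=1.0, wavelength=2 * math.pi):
    """Return (y, mirrored y) for a cosine wave at position x.

    Assumption: the mirror image is the reflection across the x-axis.
    """
    y = amplitude * math.cos(2 * math.pi * x / wavelength)
    return y, -y


# Sample one full wavelength of a wave with amplitude 2 and wavelength 8.
for x in range(0, 9, 2):
    y, mirrored = wave_pair(x, amplitude=2.0, wavelength=8.0)
    print(f"x={x}: {y:+.2f} {mirrored:+.2f}")
```

Raising the amplitude stretches both curves away from the axis, while raising the wavelength spreads the peaks further apart.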
This graph was only supposed to be a little exercise in descriptive stats, but after I made it, I realized that I could drum up something similar in Tableau, which could easily include some predictive analytics and that would be easily repeatable for others in MLS, whose player pages all include a game log dating back to 2010. I’m just not sure if there’s a market for that, so here I’ll just post the graphic alone and ask for your feedback for now.
Landon Donovan announced his retirement yesterday, effective at the end of the current season. Currently Donovan has the most goals (138) and the 2nd most assists (124) in MLS history. Some have smartly pointed out that these figures are inflated by both his longevity (4th all-time among field players with 27,423 minutes) and his penalty kick goals (2nd most with 28). Thankfully a common fancy stat accounts for both these issues. Non-penalty goals plus assists per 90 is a pretty self explanatory term. Subtract PKs from a player’s goal total, add in assists, divide by minutes, then multiply by 90. I applied this calculation to MLS’ all-time top 25 for both goals and assists, and here’s the leaderboard, active players in green, retirees in blue:
The exclusion of penalty kick goals deserves a quick note. PKs are inherently a different skill than goals in every other situation of the game. Put simply, in the run of play, corner kicks, free kicks, etc. you will never see a single player get an unimpeded run up to a shot 12 yards away from a keeper who’s anchored to the goal line. That situation immensely favors the shooter, scoring 70-80% of the time, while the average shot in other situations goes in only 11%.
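The per-90 calculation described above is short enough to write out. This sketch plugs in Donovan's career totals as quoted in this post; the function name is just for illustration.

```python
def npg_plus_assists_per_90(goals, penalty_goals, assists, minutes):
    """Non-penalty goals plus assists per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return (goals - penalty_goals + assists) / minutes * 90


# Landon Donovan's MLS totals quoted above:
# 138 goals (28 from penalties), 124 assists, 27,423 minutes.
donovan = npg_plus_assists_per_90(138, 28, 124, 27_423)
print(round(donovan, 3))
```

That works out to roughly 0.77 non-penalty goals plus assists for every 90 minutes Donovan has played in MLS.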
Landon’s still #1, but his lead on Preki and Taylor Twellman is pretty slim. A slim lead over MLS originals is even more impressive than you might think, because the standard for assists was very lax in the olden days. You can read about it here, but the basics are that the last two players to touch the ball before their teammate scored were usually credited with assists, even if a lot of things happened between their pass and the strike. Players even used to get an assist if they took a shot and their teammate scored off a rebound.
The chart above is a bit noisy, and I hope to build a database of MLS players with each year of their production, rather than raw career totals. For now, we have even more reason to be in awe of the face of Major League Soccer who will be hanging up his cleats in a few months.
Check out The Shin Guardian today. I wrote about my vision for improving the MLS All-Star Game. Basically, I want to see a return to a US vs the World format. But with a twist, because the event’s non-FIFA status would allow the US team to include citizenship hopefuls, too.
My contest for guessing new MLS wages ended yesterday. If the MLS Players Union holds to form, they will update their salary release today, and I can determine the winner of the contest. In the meantime, here are box and whisker plots of all the guesses put forth by my readers:
The median values (the point in each box where the hue of grey changes) mostly seem reasonable, but based on Brian Straus’ report from last night, the crowd here may be profoundly overestimating the wages Sporting Kansas City will be paying Matt Besler, and especially Graham Zusi. Per Straus, both of those players will “earn a pro-rated base salary of $600,000 this year,” while my readers’ median guesses were $1.1 million for Besler, and $1.2 million for Zusi. The median for DaMarcus Beasley, $850,000, was much closer to the $780,000 guaranteed compensation reported by Straus.
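To put a number on "profoundly overestimating," the median guesses can be divided by the reported figures. The sketch below uses only the values quoted above; keep in mind that Straus' Besler and Zusi figures are pro-rated base salaries while the Beasley figure is guaranteed compensation, so the comparison is not perfectly like for like.

```python
median_guess = {"Besler": 1_100_000, "Zusi": 1_200_000, "Beasley": 850_000}
reported = {"Besler": 600_000, "Zusi": 600_000, "Beasley": 780_000}

# Ratio of the crowd's median guess to the reported figure.
overshoot = {player: median_guess[player] / reported[player]
             for player in median_guess}

for player, ratio in overshoot.items():
    print(f"{player}: median guess is {ratio:.2f}x the reported figure")
```

The crowd's median doubles the reported figure for Zusi and is about 1.8 times Besler's, while the Beasley guess misses by only around 9%.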
Notably, the MLSPU release could differ from Straus’ report, and that release will be the standard by which the contest is judged. Even though I trust Straus’ report more, MLSPU-listed guaranteed compensation was the standard by which I set up the contest, and those rules cannot change after submissions were received.