Beyond the Basics: Understanding Expected Goals Added

PHOTO BY JOHN STROHSACKER


Welcome to Beyond the Basics!

My name is Zack Capozzi, and I run LacrosseReference.com, which focuses on developing and sharing new statistics and models for the sport.

The folks at USA Lacrosse Magazine offered me a chance to share some of my observations in a weekly column, and I jumped at it. Come back every Tuesday to go beyond the box score in both men’s and women’s lacrosse.

I listen to my readers. Don’t doubt that for a second. This is not some Ivory Tower operation here. I’m a lacrosse fan first and foremost, and it makes my day when I get a reader email with feedback on this column. And it makes my week when they give me a suggestion for a future topic.

And that is just what we have here today: your first-ever reader-inspired Beyond the Basics. Per their suggestion, we are going to dive into EGA (expected goals added), why it’s important and how it’s used to build the most objective Tewaaraton Watch List you’ve ever seen.

EGA: EXPECTED GOALS ADDED

OK, EGA stands for Expected Goals Added. For those familiar with baseball stats, it’s the lacrosse version of WAR. It’s a stat that takes everything a player does (that shows up in the box score) and puts it into a single number. EGA penalizes players for negative plays (turnovers/penalties) and gives credit for positive plays. The credit given is based on how often each type of play is expected to lead to a goal (hence the expected-goals name). So, a ground ball is worth more than a missed shot because ground balls lead to goals more often than missed shots do. (If you want to dig deeper, I wrote a more detailed article on the topic.)

Before EGA, how would one compare two player stat lines? The short answer is that you really couldn’t unless they only differed on one dimension. Is three goals, two assists and five turnovers a more valuable performance than four ground balls and one goal? The bias toward points as the end-all metric would probably make you think that the five-point effort is more impressive. It may not be.

With EGA as a guide, we can account for everything a player does and get a much deeper understanding of the relative contributions. As a glaring example, let’s compare EGA with points, which has been the dominant single-number statistic in lacrosse since time immemorial. One issue with points is that it double counts assists and goals. Only one goal was scored but two points were awarded. Is an assisted goal more valuable than an unassisted goal? I would argue the opposite, and it’s certainly not worth double. Because EGA starts with play-by-play as the raw data, it distinguishes between assisted and unassisted goals, giving half credit to both the scorer and the player who recorded the assist.

The second issue with points as the dominant statistic is that there is much more that goes into winning games than scoring. A focus on points masks the contributions of all the players who worked to create those goals. What about the player who caused the turnover and won the ground ball in the first place? And how do you account for a player with a ton of points AND a ton of turnovers? All these things matter. The process is as important as the result. Points only cares about results. EGA values the entire process.

We can look at some of the highest-usage players in Division I Men’s lacrosse to see where EGA detects more value than points alone would suggest.

 

THE EGA RECIPE

EGA actually originated as an input to my Win Probability model. The calculation needed a way to determine whether a team’s chances of scoring a goal were high or low. Otherwise, the only inputs to the model would be time and score; it wouldn’t be able to update the win probability after, say, a ground ball was picked up. I needed a way to determine, over the next stretch of the game, how many goals we would expect each team to score based on the most recent plays.

As an example, let’s assume there are 100 ground ball pick-ups recorded in the play-by-play. We then look at the next 60 seconds of play after each GB and count the number of times that the GB-winning team scores compared to the number of times the opponent scores. As you can imagine, the team that won the GB is more likely to score a goal, but not always. Let’s say the GB-winning team scores 24 goals over the next 60 seconds and the opponent scores five. That would give us a net goal margin of 19. Divide that by the 100 original ground balls and we can estimate that a ground ball is worth 0.19 expected goals. Here are the numbers for each play type that a player can record:

  • Missed Shot: 0.19

  • Ground Ball: 0.19

  • Faceoff Win: 0.18

  • Pipe Shot: 0.16

  • Blocked Shot: 0.10

  • Assisted Goal: 0.04

  • Saved Shot: 0.03

  • Unassisted Goal: 0.02

  • Forced Turnover: -0.17

  • Unforced Turnover: -0.17

  • Penalty – 2 min: -0.31

  • Penalty – 30 sec: -0.33

  • Penalty – 1 min: -0.36

Note: Obviously, a player who scores or assists a goal gets the EGA credit for that play as well. The values above are tied to the chances of scoring the next goal, since they are also used to adjust the win probabilities.
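For readers who want to try the ground-ball calculation above on their own play-by-play, here is a minimal sketch in Python. The Event schema, the function names and the single-game, time-sorted input are my assumptions for illustration, not the actual LacrosseReference pipeline; only the 60-second window and the 24/5/100 example come from the text.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One play-by-play row (hypothetical schema, for illustration only)."""
    time: float  # seconds elapsed in the game
    team: str    # team credited with the play
    play: str    # play type, e.g. "Ground Ball" or "Goal"


def net_value(goals_for: int, goals_against: int, occurrences: int) -> float:
    """Net goal margin per occurrence of a play type."""
    return (goals_for - goals_against) / occurrences


def play_value(events: list[Event], play: str, window: float = 60.0) -> float:
    """Estimate a play's value from one game's events, sorted by time."""
    goals_for = goals_against = occurrences = 0
    for i, ev in enumerate(events):
        if ev.play != play:
            continue
        occurrences += 1
        for later in events[i + 1:]:
            if later.time - ev.time > window:
                break  # past the look-ahead window
            if later.play == "Goal":
                if later.team == ev.team:
                    goals_for += 1
                else:
                    goals_against += 1
    if occurrences == 0:
        return 0.0
    return net_value(goals_for, goals_against, occurrences)


# The ground-ball example from the text: 24 goals for, 5 against, 100 pickups.
print(round(net_value(24, 5, 100), 2))  # 0.19
```

In practice you would pool the counts across many games before dividing, since a single game has far too few events to give a stable estimate.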

Once we have that worked out, all we need to do is count the number of times that each player is credited with each play type, multiply by the play values and sum the total. There you go. The recipe for EGA in a nutshell. You can probably do it for yourself if you are capturing these box score values. (If you do try it for your team, I’d love to hear what you discover.)
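If you want to try the recipe yourself, the count-multiply-sum step might look something like the sketch below. The play values are copied from the list above; the dictionary keys, the function name and the sample stat line are hypothetical. Note that this covers only the next-goal play values and leaves out the separate credit a player gets for actually scoring or assisting.

```python
# Play values copied from the list above; key names are my own labels.
PLAY_VALUES = {
    "Missed Shot": 0.19,
    "Ground Ball": 0.19,
    "Faceoff Win": 0.18,
    "Pipe Shot": 0.16,
    "Blocked Shot": 0.10,
    "Assisted Goal": 0.04,
    "Saved Shot": 0.03,
    "Unassisted Goal": 0.02,
    "Forced Turnover": -0.17,
    "Unforced Turnover": -0.17,
    "Penalty 2 min": -0.31,
    "Penalty 30 sec": -0.33,
    "Penalty 1 min": -0.36,
}


def ega_from_counts(counts: dict[str, int]) -> float:
    """Multiply each play count by its value and sum the results."""
    return sum(PLAY_VALUES[play] * n for play, n in counts.items())


# Hypothetical stat line: four ground balls and two unforced turnovers.
stat_line = {"Ground Ball": 4, "Unforced Turnover": 2}
print(round(ega_from_counts(stat_line), 2))  # 0.42
```

The four ground balls add 0.76 and the two turnovers take away 0.34, so this made-up line nets out to 0.42.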

The results are often surprising. After I posted this week’s top EGA games, I had several comments asking why Jake Taylor wasn’t on the list. After all, EGA is supposed to identify the players who added the most value, and surely the eight goals he scored in the thrashing that Notre Dame gave to Syracuse were valuable. And don’t get me wrong, great game for the kid and a great story, but his EGA numbers were not as high as you might expect. The reason is that, of those eight goals, seven were assisted, so he gets half credit for each of those. (Pat Kavanagh actually had a higher EGA in that game.)

And that is sort of the point: EGA gets you a clearer understanding of the true value each player contributed.
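To make the half-credit idea concrete, here is a rough sketch comparing points with goal credit for an eight-goal game in which seven goals were assisted. It assumes an unassisted goal is worth a full 1.0 of credit and an assisted goal splits 0.5/0.5 between scorer and passer; the text only states the half-credit split, so the 1.0 figure and the function names are assumptions, and the next-goal play values are left out.

```python
def points(goals: int, assists: int) -> int:
    """Traditional points: every goal and every assist counts as one."""
    return goals + assists


def goal_credit(unassisted_goals: int, assisted_goals: int, assists: int = 0) -> float:
    """Goal credit with assisted goals split evenly (assumed 1.0 per goal)."""
    return 1.0 * unassisted_goals + 0.5 * assisted_goals + 0.5 * assists


# Eight goals, seven of them assisted, counting only the goals themselves.
print(points(8, 0))       # 8
print(goal_credit(1, 7))  # 4.5

# One assisted goal, scorer plus passer: two points, but one goal of credit.
print(points(1, 0) + points(0, 1))                 # 2
print(goal_credit(0, 1) + goal_credit(0, 0, 1))    # 1.0
```

Under those assumptions, eight points of production shrinks to 4.5 goals of credit, which is why a big scoring night built on assisted goals can rank lower than expected.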







STATISTICAL TEWAARATON

EGA puts a value on gaining possession for your team just like it values the ultimate act of putting the ball in the net. That gives us a way to evaluate players whose primary role is getting possession for their team (i.e., FOGOs) and compare them apples-to-apples against players with very different roles. And when we look at the current EGA leaders, we can see both types of contributions represented.

 

Look, I get it, there are norms involved in the Tewaaraton process. It’s not supposed to be about the most “value” created; it’s about the players that represent the ideals of what lacrosse is supposed to be about. Personally, I don’t think that the value created by FOGOs should mean more FOGOs should win the Tewaaraton. I’ll settle for a recognition that if we care about true value, we need to go beyond points.

After every game, I update my Statistical Tewaaraton tracker, which captures the players with the highest EGA ratings. In other words, the players that have created the most value for their teams. It’s a useful lens when thinking about who is “best.”

Now, I should note: one criticism of EGA is that it does not account for playing time or opponents. And that is true. But I think that’s a feature, not a bug. EGA is typically a great way to surface players who are having a huge impact on their teams but who may not be household names because they aren’t playing on marquee teams against Top-10 opponents each week. I like that. I see you, Frankie Labetti.

PRODUCTION IN CONTEXT

Another thing that is useful with EGA is that it can be calculated for the various phases of the game. That means we can give each player an EGA total for offense, defense, and faceoffs/draws. This is especially helpful in the case of faceoff specialists, some of whom create value beyond just winning the faceoff. I’m thinking of all those goals Petey LaSalla has scored over the years.

Take Luke Wierman as an example. In the faceoff-flavored Statistical Tewaaraton he’s got the 5th most FOGO-specific EGA. But he’s got the 4th most offensive-EGA, putting him third among FOGOs in total EGA. These are the top five FOGOs in terms of offensive production this year:

And we can think about this for defensive players and offensive players too. Being able to separate out the various ways that a player produces value helps build a richer story of their season.
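One way to picture the phase split is to tag each play with the phase it happened in and sum within each tag. The sketch below does that for a made-up faceoff specialist. The phase labels, the tallies and the idea of tagging plays this way are all my assumptions for illustration; only the four play values come from the list earlier in the column.

```python
from collections import defaultdict

# Subset of the play values listed earlier in the column.
PLAY_VALUES = {
    "Faceoff Win": 0.18,
    "Ground Ball": 0.19,
    "Missed Shot": 0.19,
    "Unforced Turnover": -0.17,
}


def ega_by_phase(tallies: list[tuple[str, str, int]]) -> dict[str, float]:
    """Sum play values per phase from (phase, play, count) tallies."""
    totals: defaultdict[str, float] = defaultdict(float)
    for phase, play, count in tallies:
        totals[phase] += PLAY_VALUES[play] * count
    return dict(totals)


# Hypothetical tallies for one faceoff specialist (not real data).
tallies = [
    ("faceoff", "Faceoff Win", 10),
    ("faceoff", "Ground Ball", 6),
    ("offense", "Missed Shot", 1),
    ("offense", "Unforced Turnover", 1),
]

for phase, total in ega_by_phase(tallies).items():
    print(phase, round(total, 2))
```

The same player ends up with one number per phase, which is what lets a FOGO be ranked on faceoff work and on offensive production separately.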

Personally, the Defensive Statistical Tewaaraton is my favorite. We just have such a poor statistical palette for defensive players. Most of what a defender does well doesn’t show up in the box score and we don’t have player tracking to work with yet to bridge the gap. But even just adding up the positive value from ground balls and caused turnovers while subtracting negative value from penalties gives you a pretty good look at some of the top defenders. (Although I would argue that EGA is a much better tool for offensive players since almost everything they do shows up in the box score.)

The top five defensemen in terms of defensive EGA are:

Anyway, I hope that, at minimum, I’ve given you enough reason to be skeptical when you see total points or points-per-game cited. Are goals important? Of course. Are goals the only thing that matters? Of course not. Let’s move beyond plain vanilla stats and use the deeper stats to tell richer stories.

LACROSSE STATS RESOURCES

My goal with this column is to introduce fans to a new way to enjoy lacrosse. “Expand your fandom” is the mantra. I want you to walk away thinking about the players and stories presented here in a new light. But I also understand that some of these concepts can take some time to sink in. And part of the reason for this column is, after all, to educate.

To help this process along, I have several resources that have helped hundreds of lacrosse fans and coaches to internalize these new statistical concepts. The first is a Stats Glossary that explains each of my statistical concepts in more detail than I could fit here. The second is a Stats 101 resource, which provides context for each of my statistics. What is a good number? Who’s the current leader? That’s all there.

And last, I would love to hear from you. If you have questions or comments about the stats, feel free to reach out.
