XG is a terrible stat

Jotun

First let me say, I love stats, and I love measuring as much as possible with them. However, for finishing in particular, xG is currently a terrible stat that shows very little.

Why am I saying this? Well, Antony's chance yesterday was evaluated at 0.42. It was an open goal from 4 yards.

Diallo's goals are evaluated at 0.33, 0.33 and 0.39. Actually, the first two goals are quite difficult finishes: the first from a tight angle on his weaker foot, the second a first-time finish from a lobbed ball. I mean, they are good chances, but certainly not trivial. However, the last goal is evaluated at 0.39???? Are you telling me there is a 60% chance that a professional footballer will miss an open goal under no pressure and with no goalkeeper in sight?

Are you seriously telling me that Matheus Fernandes' chance was more likely to be scored than Amad's last one or Antony's? And that chance, despite him being surrounded by two United players with a goalkeeper straight ahead of him, is apparently the best chance of the game. Really? Really?

Source of XG values:
https://understat.com/match/26811

What this shows is that the model (at least understat's) is flawed (probably too simple) and therefore cannot be relied upon to provide accurate information about the quality of chances over a match. That also implies it is unreliable over the course of a season.
 
The problem is no matter how much data you feed it, it is still subjective and based on probability. It shouldn't be used in isolated incidents but across a range of games, that's where it is most effective.
 
I'm no expert, but I think whenever people complain about a statistic it's because they are either evaluating it in isolation or expecting a stat to be 1:1 with reality (fair assumption). I think xG becomes more useful the more you zoom out from individual moments/chances/games and start looking at it over a longer period of time.

Of course you then have the reliability of the method used by those providing the statistics, which makes xG look even more confusing as it's not like shots on target which is fairly (although not always) straightforward.
 
I wonder, do clubs actually use xG to drive decisions? Does anyone have any information on this? Personally, I don't pay much attention to it. I feel like it never reflects the result of a game correctly. Missing great chances and scoring improbable ones has always been a part of the game.
 
Think you're reading too much into the xG. Afaik the positioning of opposition players is not factored into it.
 
Expected Goals models use: distance to goal, angle to goal, body part with which the shot was taken, and type of assist or previous action (through ball, cross, set piece, dribble, etc.). Based on historical information of shots with similar characteristics, the xG model then attributes a value between 0 and 1 to each shot that expresses the probability of it producing a goal.
This is how it's calculated.
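
To make that description concrete, here is a toy sketch of how distance, angle and body part could be turned into a value between 0 and 1. The function names and every coefficient are made up for illustration; this is not understat's, StatsBomb's or anyone else's actual model, which would be fitted to a large set of historical shots.

```python
import math

def shot_angle(x, y, goal_width=7.32):
    """Angle (radians) that the goal mouth subtends from the shot location.

    x: distance from the goal line in metres (must be > 0)
    y: sideways offset from the centre of the goal in metres
    """
    left = math.atan2(y + goal_width / 2, x)
    right = math.atan2(y - goal_width / 2, x)
    return left - right

def toy_xg(x, y, is_header=False):
    """Logistic combination of distance, angle and body part.

    The coefficients below are invented, not fitted to any data.
    """
    distance = math.hypot(x, y)
    angle = shot_angle(x, y)
    z = -1.0 - 0.12 * distance + 1.2 * angle - (0.8 if is_header else 0.0)
    return 1 / (1 + math.exp(-z))

if __name__ == "__main__":
    print("central shot from 4 m: ", round(toy_xg(4, 0), 2))
    print("central shot from 25 m:", round(toy_xg(25, 0), 2))
    print("header from 4 m:       ", round(toy_xg(4, 0, is_header=True), 2))
```

Note that nothing in this sketch knows where the goalkeeper or defenders are, which is the kind of gap people in this thread are pointing at.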
 
Yeah, same with the Liverpool game, where Amad’s header was a great chance but it was so far off it didn’t count as a shot. Meanwhile Liverpool apparently had more xG than in the 7-0 game, but it didn’t feel like that at all. Yes, they were clinical in that one, but we were well and truly battered, whereas in the recent game we were arguably the better team. Seems like they boosted a few chances because they were close to the goal, even though they weren’t actually easy finishes.
 
Yeah it's terrible and I automatically assume anyone that mentions it plays more Football Manager than they watch football.

I'll only accept it if a losing manager uses it after his team has had plenty of great chances and just didn't finish them. If you have a high XG and score 0 that does say something.

There being a method and a decimal point leads all the numpties to think it's proper science, which it's not.
 
It's the same as any other stat, it's meant to be used in conjunction with other stats and the eye test.
 
I trust the scientific method used to calculate the chances, but what I find it doesn't take into account is momentum. Moments change games, and that is something the xG model has no way of taking into account. The xG of the games didn't show the full picture of how bad we were.
 
xG is useful but not as useful as people wish it would be.
It's most useful if you don't have access to watching the game for some reason.
 
The problem is no matter how much data you feed it, it is still subjective and based on probability. It shouldn't be used in isolated incidents but across a range of games, that's where it is most effective.
Is it really? What basis do you have for this claim? I mean, based on yesterday's game the error in evaluating chances is up to 60%. Extrapolating that over the course of a season could end up with significant deviations. I mean, we see teams significantly underperforming or overperforming their xG almost every season. Is it because of the quality of finishing, luck, or is it because the model is flawed and is therefore not accurately evaluating the chances?

And based on yesterday's game (but not just that one), I'm inclined to conclude it is the latter. The model is simply poo poo and completely unreliable.
 
So reading that, it wouldn’t take into account that Amad’s second was a volley? That would explain why it’s such a high xG.

Seems like it also doesn’t take into account that players have a weaker foot.

Side note, I wonder how often someone has scored a hat-trick where two of three goals were with their weaker foot?
 
Disagree. It just gets overused and people overstate its meaning for single games or periods with small sample size. If the sample size is big enough, it’s a useful stat that should be contextualised.
 
The PSxG gives better context for every chance. Antony's chance is still pretty low though. Maybe it's taking into account the possibility of players missing the ball completely from this kind of delivery, which happens regularly.

[Attached image: xg-psxg.jpg]
 
So keeper position is not included?
They're the main factors according to StatsBomb. Other factors can be included, and every site has its own xG model, which may include or weight other factors.
 
As far as I understand, xG is an "objective" model. The model is created by evaluating a bunch of shots. Certain parameters are included for each shot: position of the shot, pass type, position of defenders(?), goalkeeper(?) etc. The model then learns from the training set what types of shot from which positions are most likely to go in.

So there should be no human evaluation of the chances. Of course, the model will be flawed, as it won't be able to take into account all parameters, such as the speed of the pass, whether it was bouncing, the orientation of the defender (is he charging towards you, moving away, lying on the floor or just standing)... The model will be too simple.

The biggest worth of such a model would be to help coaches understand what patterns of play are most likely to result in a goal, for example cutback vs cross. Then they can train their players accordingly. But for actually evaluating performance, with these kinds of deviations, it's completely unreliable.
 
the data is telling you that, yeah
 
The key problem with the argument that xG is a terrible stat is that it works.

By which I mean it provably has more predictive value than actual goals scored. As well as other stats like shots or points per game. That's the beginning and end of the argument, really.

Whether a specific model is good or whether people on the internet use it properly are separate and valid arguments, but the idea that xG isn't useful in certain contexts is obviously untrue.
 
It's funny how sometimes I feel like we've been hammered but the xG is like 3-1 in our favour, and in other games we play well and win but the xG will be 0.5 for us and 2.5 for the opposition or something.

It's an indicator I suppose. Over a whole season if you're constantly losing on XG then it suggests your luck may run out eventually. But for individual games it's a bit of a nonsense. That Antony chance should have been an XG of 0.99999.
 
Someone needs to post what the xg was over the entire PL season vs how many goals were scored (I cba)

If those figures match up, then it'll be fair to say that xg is a pretty reliable model. Just not in isolated incidents, it's more about averages
 
It's just a number on "we could have scored 5 or 6".

For individual games and chances it's about as useful as that argument.
 
xG works both in the sense that teams and players over time tend to score a similar amount of goals to xG created, and in the sense that xG created is the best predictor we have of how many goals players and teams will score in the future.
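
A rough way to see why totals line up over time but not in a single match is to simulate it. This sketch assumes the xG values are the true scoring probabilities and that shots are independent, both of which are simplifications. The four "match" values are the understat numbers quoted in the opening post; the 500-shot "season" is randomly generated, not real data.

```python
import random

def simulate_goals(xg_values, rng):
    """Treat each shot as an independent chance that goes in with probability xG."""
    return sum(1 for p in xg_values if rng.random() < p)

def mean_relative_gap(xg_values, trials, rng):
    """Average of |goals - total xG| / total xG over many simulated replays."""
    total_xg = sum(xg_values)
    gaps = [abs(simulate_goals(xg_values, rng) - total_xg) / total_xg
            for _ in range(trials)]
    return sum(gaps) / trials

rng = random.Random(0)
# Four chances from one match, using the understat values quoted in the opening post.
match_shots = [0.42, 0.33, 0.33, 0.39]
# A made-up "season" of 500 shots with randomly drawn xG values (not real data).
season_shots = [rng.uniform(0.02, 0.60) for _ in range(500)]

if __name__ == "__main__":
    print("single match:", round(mean_relative_gap(match_shots, 1000, rng), 2))
    print("500 shots:   ", round(mean_relative_gap(season_shots, 1000, rng), 2))
```

Even with a perfect model, goals from a handful of chances routinely land far from the xG total, while the gap shrinks as the number of shots grows. It says nothing about whether any particular provider's values are well calibrated in the first place.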
 
It’s a good stat that can be (and regularly is) used in assessing individual and collective performance. And it’s usually more telling than actual goals scored due to how low-scoring of a game football is.

Just don’t think of it as a faultless model, no statistical model (currently, at least) is complicated enough to give you an objective and full picture of what’s happening. It works pretty well though, especially over the longer periods.

In fact, the Antony chance shows that they’re perhaps more realistic in their assessment than you are… it doesn’t make his attempt any less criminal, but surely it’s counterintuitive to blame the model that actually predicted the miss that happened for being unrealistic?
 
Fbref and Fotmob have Amad's last goal at 0.7 xG. While you can still argue that's too low, it's obviously taking other things into consideration and not just that it was an open goal. They do differ for the Antony chance though: fbref has it at 0.28 while Fotmob had it at 0.7. I mentioned before in another thread that understat is not a great website for xG stats, so one should do some research on these stats before quoting them. But then again, that is the issue with mass-produced stats: people who don't fully understand them quote them without context.
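
For what it's worth, the disagreement between sites is easy to put side by side. The numbers below are only the ones quoted in this thread (understat from the opening post, fbref and Fotmob from this one); the `spread` helper is just an illustrative name.

```python
from statistics import mean

# xG values for the same two chances, as quoted in this thread.
antony = {"understat": 0.42, "fbref": 0.28, "fotmob": 0.70}
amad_last = {"understat": 0.39, "fbref": 0.70, "fotmob": 0.70}

def spread(values):
    """Return (lowest, highest, mean) across providers for one chance."""
    vals = list(values.values())
    return min(vals), max(vals), mean(vals)

if __name__ == "__main__":
    for name, chance in (("Antony", antony), ("Amad (last goal)", amad_last)):
        low, high, avg = spread(chance)
        print(f"{name}: low={low:.2f} high={high:.2f} mean={avg:.2f}")
```

A gap of 0.3 to 0.4 between providers on a single chance is a reminder that "the xG" of a shot depends heavily on whose model you are quoting.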

If people have an issue with xG then they shouldn't use simpler stats like goals or assists to assess players. Having higher assists does not mean one is a better playmaker and similarly more goals doesn't mean one is a better finisher.

The issue with PSxG is that it accounts for the quality of the shot, which is not necessarily representative of the quality of the chance itself. You could have a 25-yard shot that goes into the top corner: the xG would be low but the PSxG would be high. Similarly, if there is a simple one-on-one that the player hits over the bar, the xG would be high but the PSxG would be 0. I think when considering who had the better chance, xG is the one to look at.
 
I agree, it's pretty stupid. I have a fix, but it's far too expensive to ever be rolled out.

Scandi Red xG
- First, you need at least 3 (preferably 5) judges watching the game in real time.
- The judges will use a 5-star system to rate chances.
- The average score from the judges will be the final score of the chance.

1 star = 0.10 xG
This is a negligible chance. A hopeless shot from a long distance, for instance. If the goalkeeper concedes then it's one of the biggest howlers of the season.

2 stars = 0.25 xG
This is a small chance. A couple of these will end up in the back of the net on a weekly basis. I would also typically count them as goalkeeper mistakes (unless it was a deflection or an own goal).

3 stars = 0.5 xG
This is a regular chance. The goalkeeper will be disappointed if it goes in, but it's not really a mistake either. World class goalkeepers should not concede these on a weekly basis though.

4 stars = 0.75 xG
Big chance. The odds favor the attacker. Requires a good save or a poor attempt to be stopped.

5 stars = 0.9 xG
This is a sitter (think Antony's chance last night).

---

The xG value of each star can be debated of course, but I think this would be a good way to judge which team created the most/better chances. What I like about this is that it allows context to play a part. The system is also pretty simple so the judges would probably agree most of the time. And when they don't agree the difference in interpretation is probably just one star.
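
The scoring part of this proposal is simple enough to sketch. One assumption here: "average score from the judges" is read as averaging the xG value of each judge's star rating, rather than averaging the stars first and converting afterwards. The names `STAR_TO_XG` and `judged_xg` are made up for the example.

```python
# Star-to-xG table exactly as listed in the post above.
STAR_TO_XG = {1: 0.10, 2: 0.25, 3: 0.50, 4: 0.75, 5: 0.90}

def judged_xg(star_ratings):
    """Average the xG values of the judges' star ratings for one chance.

    Requires at least 3 judges, as the proposal asks for.
    """
    if len(star_ratings) < 3:
        raise ValueError("need at least 3 judges")
    return sum(STAR_TO_XG[stars] for stars in star_ratings) / len(star_ratings)

if __name__ == "__main__":
    # Two judges call it a sitter, one calls it a big chance.
    print(round(judged_xg([5, 5, 4]), 2))
```

With a one-star disagreement like the one in the example, the final value only moves by about 0.05, which fits the point that judges would rarely be far apart.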
 
Again, not an expert, but the human evaluation comes with the weighting and choices given to the data. E.g. Antony's chance is obviously a .999999999999 xG for everyone watching, but maybe the modellers made a choice about the angle of the ball, strong/weak foot, or whether the resulting shot was on target, and now you can see why that chance ends up being .2 or whatever it was last night. I can see why you would think it is an inaccurate statistic in that instance, but when you look at Antony's goalscoring record and his personal xG over his time at United, you get an accurate picture of his goalscoring ability, which last night's stat contributes to.