UPDATE: A coding error caused some odd results in the previous set of sims, so I had to re-code things and re-run them. The new (and fixed) run of sims is now reflected in the table, and the old ones have been tossed out. The data dump file has also been replaced with the new set of sims. Apologies for this screwup. My bad.
Okay, so we have 32 teams and they have been separated into the following pots:
Pot 1: South Africa, Brazil, Spain, Netherlands, England, Argentina, Italy, Germany
Pot 2: Mexico, USA, Honduras, Australia, Japan, South Korea, North Korea, New Zealand
Pot 3: Uruguay, Paraguay, Chile, Ivory Coast, Nigeria, Cameroon, Ghana, Algeria
Pot 4: France, Portugal, Denmark, Serbia, Switzerland, Greece, Slovakia, Slovenia
The restrictions on these are that South Africa can’t be drawn with any of the African teams in Pot 3, while Brazil and Argentina can’t be drawn with any of the South American teams in that pot.
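For anyone who wants to play along, here is a rough sketch of a draw that respects those restrictions. It is not the code I actually used, and it is not FIFA's real procedure either: it just shuffles each pot and throws the whole draw away if a restriction is broken, which is an assumption made for simplicity. The names `is_valid` and `draw_groups` are made up for this illustration.

```python
import random

POTS = {
    1: ["South Africa", "Brazil", "Spain", "Netherlands",
        "England", "Argentina", "Italy", "Germany"],
    2: ["Mexico", "USA", "Honduras", "Australia",
        "Japan", "South Korea", "North Korea", "New Zealand"],
    3: ["Uruguay", "Paraguay", "Chile", "Ivory Coast",
        "Nigeria", "Cameroon", "Ghana", "Algeria"],
    4: ["France", "Portugal", "Denmark", "Serbia",
        "Switzerland", "Greece", "Slovakia", "Slovenia"],
}
AFRICAN_POT3 = {"Ivory Coast", "Nigeria", "Cameroon", "Ghana", "Algeria"}
SOUTH_AMERICAN_POT3 = {"Uruguay", "Paraguay", "Chile"}


def is_valid(group):
    """Check the two restrictions for one group (one team from each pot)."""
    seed, _, pot3_team, _ = group
    if seed == "South Africa" and pot3_team in AFRICAN_POT3:
        return False
    if seed in ("Brazil", "Argentina") and pot3_team in SOUTH_AMERICAN_POT3:
        return False
    return True


def draw_groups(rng):
    """Shuffle every pot and redraw until no group breaks a restriction.

    Assumption: simple rejection sampling, not the official draw mechanics.
    """
    while True:
        shuffled = [rng.sample(POTS[p], len(POTS[p])) for p in (1, 2, 3, 4)]
        groups = list(zip(*shuffled))  # one team from each pot per group
        if all(is_valid(g) for g in groups):
            return groups


if __name__ == "__main__":
    for letter, group in zip("ABCDEFGH", draw_groups(random.Random())):
        print(letter, ", ".join(group))
```

Roughly one shuffle in six passes the check, so the retry loop finishes quickly.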
So I ran 10,000 simulations similar to what I did for qualifying and here are the basic results in table format, with the columns representing what stage the team went out at for each sim:
| Country | Wins | 2nd | 3rd | 4th | Quarters | Round of 16 | Group | Win% |
Not a lot of surprises. Brazil are the favorites, with Spain running a not-all-that-close 2nd. Surprisingly, France won more often than Italy despite going out at the group stage more often (due to their lack of a seed). My USA is a longshot, but I suppose that counts as non-zero. I would read all of the teams from Switzerland to Serbia as having functionally the same chances. If I re-do the sims, that order changes quite a bit.
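If you want to build the same kind of table from your own sims, the tallying is simple: count how often each team exits at each stage, then divide wins by the number of sims to get Win%. Here is a minimal sketch, assuming each sim is stored as a dict mapping team to exit stage. That layout, the `tally` function and the toy team names are all made up for illustration.

```python
from collections import Counter, defaultdict

STAGES = ["Wins", "2nd", "3rd", "4th", "Quarters", "Round of 16", "Group"]


def tally(sim_results):
    """Turn per-sim exit stages into one table row per team.

    sim_results: list of dicts, one per sim, mapping team -> exit stage
    (a hypothetical layout, not the format of the data dump file).
    """
    counts = defaultdict(Counter)
    for result in sim_results:
        for team, stage in result.items():
            counts[team][stage] += 1

    n_sims = len(sim_results)
    table = {}
    for team, stage_counts in counts.items():
        row = {stage: stage_counts[stage] for stage in STAGES}
        row["Win%"] = 100.0 * stage_counts["Wins"] / n_sims
        table[team] = row
    return table


# Toy data only: four made-up sims with two made-up teams.
toy_sims = [
    {"Team A": "Wins", "Team B": "2nd"},
    {"Team A": "Wins", "Team B": "Group"},
    {"Team A": "Quarters", "Team B": "Wins"},
    {"Team A": "Wins", "Team B": "Round of 16"},
]
toy_table = tally(toy_sims)
```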
I’m doing this now so I can contrast these results with the results after the draw tomorrow (Charlize Theron!!). Something else you can contrast is the difference between the above table and this table. That table is the sims with Africa grouped with CONCACAF instead of CONMEBOL. The team that loses the most wins because of this is Brazil, because they now are guaranteed to draw an African side instead of an Asian/OFC one, and the African sides are on average stronger.
It also should be noted that seeds and pots and the draw in general tend to have larger effects on things like getting out of the group than they do on winning the tournament. Runs like Germany’s in 2002 are rare, and at some point to win it you’re going to have to get past some very tough teams. Seeding appears to mostly just affect at what point you’re forced to face them.
All of this of course can’t really take into account hard-to-know and impossible-to-know variables. How much will the African teams other than South Africa benefit from playing on their home continent? Since a World Cup has never been played on African soil, that’s pretty hard to know. Egypt’s Confederations Cup performance provides little guidance either.
Teams whose recent results differ substantially from their results in the less recent past are also difficult to gauge. This would be teams like France, Chile, North Korea and Algeria. My system often differs on these teams from other systems, which seem to weight recent results more heavily. My conclusion on which method is right is unfortunately: “it depends on the team.” Some of those results will represent real changes in the quality of the team and some will be just temporary. It’s very difficult to separate the two by looking at results alone. The system I use is designed to give the best results when looking at all teams, but that doesn’t mean it fits as well for any one specific team.
The system does correctly sort out the post-group brackets so that finishing 1st and 2nd does affect your remaining games, but it can’t really account for the effects a relatively easy or relatively tough group stage might have on a team’s chances in the knockout rounds.
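To show why the finishing position matters, here is a small sketch of the usual World Cup cross-pairing for the round of 16, where the winner of one group plays the runner-up of the neighboring group (1A v 2B, 1B v 2A, and so on). This is not my sim code. The `round_of_16` function and the placeholder names are made up for illustration, and it doesn't model how the fixtures are arranged into the later rounds of the bracket.

```python
def round_of_16(standings):
    """Pair group winners with runners-up from the neighboring group.

    standings: dict mapping group letter -> (winner, runner_up).
    Returns a list of (group winner, opposing runner-up) fixtures.
    """
    letters = sorted(standings)
    fixtures = []
    # Groups are paired off as A/B, C/D, E/F, G/H.
    for i in range(0, len(letters), 2):
        first, second = letters[i], letters[i + 1]
        fixtures.append((standings[first][0], standings[second][1]))
        fixtures.append((standings[second][0], standings[first][1]))
    return fixtures


# Placeholder standings: "A1" is the winner of Group A, "A2" the runner-up.
example = {g: (g + "1", g + "2") for g in "ABCDEFGH"}
fixtures = round_of_16(example)
```

Swapping 1st and 2nd in a group swaps which fixture each team lands in, which is the effect on remaining games described above.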
As always, the strength of this type of analysis is as a launching point for further analysis. If you have no idea whether Paraguay has closer to a 1% chance to win or 10% chance to win, it’s very difficult to continue very productively from there. Big picture analyses like this are usually quite useful in making the smaller picture adjustments one might want to make.
Finally, for those who are truly interested, I have created a .csv file that has the results of every match played in the run of simulations. There are 64 matches played in a World Cup, so that’s 640,000 matches. The file is pretty damned huge for a simple text file (about 20 MB), so if you’re interested in this data I’d really appreciate it if you’d download the file rather than trying to view it in your browser. Here is the file: