Pokémon Team Optimization

1 month ago (nchagnet.pages.dev)

68 comments

nchagnet

reddalo 1 month ago

An interesting thing about this article is that the SVG image of the type matchup [1] has embedded automatic translation.

The type labels will be displayed in the language your browser is set to. I didn't even know this was possible.

[1] https://upload.wikimedia.org/wikipedia/commons/9/97/Pokemon_...

scrollaway 1 month ago
It's using the <switch> tag for this
https://developer.mozilla.org/en-US/docs/Web/SVG/Reference/E...
However, as with many of these obscure features, I'm not sure it works well in practice. The Windows 11 laptop I'm viewing that SVG on has English, French and Russian enabled, and among the mostly English labels I'm getting a few stray "Psychique" and "Привидение" types. I have no idea how it chooses which one to show.
- Sardtok 1 month ago
  
  Just a wild guess, but perhaps the order of the translations varies across cells. Perhaps the browser just picks the first one that matches your supported locales.
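
  The first-match guess above can be sketched in a few lines of Python. This is a simplified model of how a `<switch>` element is generally understood to pick a child (the first direct child whose `systemLanguage` matches any of the user's languages), not browser code. The child orderings below are hypothetical; the real SVG may be laid out differently.

```python
def language_matches(system_language, user_languages):
    """True if any user language equals a listed tag, or is its prefix before '-'."""
    listed = [tag.strip().lower() for tag in system_language.split(",")]
    for user in (u.lower() for u in user_languages):
        for tag in listed:
            if tag == user or tag.startswith(user + "-"):
                return True
    return False


def pick_switch_child(children, user_languages):
    """Return the label of the first child that matches.

    `children` is a list of (systemLanguage or None, label) pairs in document
    order; None stands for a fallback child with no systemLanguage attribute.
    """
    for system_language, label in children:
        if system_language is None or language_matches(system_language, user_languages):
            return label
    return None


user_languages = ["en", "fr", "ru"]  # the setup described in the comment above

# Hypothetical orderings: same translations, different document order.
cell_a = [("en", "Psychic"), ("fr", "Psychique"), (None, "Psychic")]
cell_b = [("fr", "Psychique"), ("en", "Psychic"), (None, "Psychic")]

print(pick_switch_child(cell_a, user_languages))
print(pick_switch_child(cell_b, user_languages))
```

  Under this model the user's preference order plays no role, only the order of the children does, which would be consistent with a few cells showing French or Russian labels.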
nchagnet 1 month ago

Oh that's really cool, I didn't know about this! I just linked to the wikimedia-hosted illustration, but that's a good perk too.
coolness 1 month ago

Wow, that's very cool. I was puzzled at first as to why the Pokémon types were in Finnish!

HelloUsername 1 month ago

Right on time for 30th anniversary! https://xcancel.com/Pokemon_cojp/status/2006379822012911872

Translated text:

The 30th anniversary of Pokémon begins! It's been 30 years since the release of "Pokémon Red and Green." Pokémon will celebrate its 30th anniversary on Friday, February 27, 2026. This year is going to be the best year yet! Stay tuned! #Pokémon30thAnniversary

WJW 1 month ago
Red and Blue, surely?
- Tzk 1 month ago
  
  No. Red and Green were released in Japan first. Only afterwards came Red/Blue worldwide (and IIRC in '98, not '96).
  
- calumcl 1 month ago
  
  Red and Blue are the international releases of Gen 1 that came out about 2 years after Red/Green.

huevosabio 1 month ago

Love this!

My general problem with Pokémon (at least the older versions, haven't played the latest) is that when playing against others it frequently just boils down to the same set of legendary and overpowered mons.

You sort of addressed this by running the MILP without certain mons as options, which makes sense.

But you already have the machinery for a better constraint: max total base stat. You could think of it as "weight classes" in boxing.

So, for a given weight class, your team can only add up to Y in total base stat. You can squeeze one of the OP mons, but then the rest are slackers. Or you could balance them.

It makes it a lot more interesting and invites diversity. And you could run it for many different values of Y.
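
A minimal sketch of the "weight class" idea, assuming a tiny pool and a team size of 3 so that brute force is enough. It is not the article's MILP: the objective here (highest total that still fits under the cap) is only a stand-in, and `best_team_under_cap` is a hypothetical helper name. The base stat totals are the commonly listed ones; double-check them before relying on the output.

```python
from itertools import combinations

# Base stat totals as commonly listed (treat as illustrative).
BASE_STAT_TOTALS = {
    "Mewtwo": 680, "Slaking": 670, "Dragonite": 600, "Tyranitar": 600,
    "Metagross": 600, "Snorlax": 540, "Gyarados": 540, "Crobat": 535,
    "Raticate": 413, "Pikachu": 320, "Weedle": 195,
}


def best_team_under_cap(pool, team_size, cap):
    """Brute-force the team with the highest total base stat not exceeding `cap`.

    Returns (team, total), or None if no team fits under the cap.
    """
    best = None
    for team in combinations(sorted(pool), team_size):
        total = sum(pool[name] for name in team)
        if total <= cap and (best is None or total > best[1]):
            best = (team, total)
    return best


# Run the same search for several weight classes (values of Y).
for cap in (1200, 1500, 1800):
    print(cap, best_team_under_cap(BASE_STAT_TOTALS, 3, cap))
```

In a real MILP the cap would simply be one more linear constraint (the sum of base stat totals over the selected Pokémon is at most Y) next to the type-coverage constraints.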

j-bos 1 month ago
It's a good idea but base stats aren't everything: https://youtu.be/gEkMi_y3Wzo
That's why the competitive scene maintains a listing of tiers across generations derived from analyzing actual usage across thoughtful battles. https://www.smogon.com/sm/articles/sm_tiers
- JonathonW 1 month ago
  
  There's even a clear example of this in the non-legendary teams: both of them include Slaking, presumably because it has a particularly high base stat total (the highest of any non-legendary, and matching some legendaries).
  But Slaking doesn't get that for free-- it has an ability (Truant) that means it can only use moves every other turn. That limits its usefulness outside of a couple very specific scenarios, and means that it'll usually be outperformed by significantly "weaker" Pokemon (going purely by numbers).
  And that's just one of the factors you'd need to take into account to build a team optimizer that's actually useful. Actually building a team has to take into account a massive number of factors: roles for each Pokemon (not just what types they can counter), available movesets, any advantages or disadvantages provided by abilities, your opponents' team composition, etc... it's a big problem to try to solve.
oceansky 1 month ago
In competitive Pokémon there are usually different tiers that determine which Pokémon are accepted. In most of them, legendaries are limited or fully banned.
For the mainline games it usually does not matter. You can beat it with any single Pokémon pretty much.
- 666lumberjack 1 month ago
  
  The median legendary pokemon in any given generation is typically quite mediocre in terms of viability/power, so I'm not sure it's quite correct to say that most tiers are free of legendaries. Even in ZU, the lowest possible tier, you have Pokemon like Mesprit, Regirock and Articuno kicking around in spite of their relatively high stats.
  It is completely true to say that the so-called 'box legendaries' specifically - with base stat totals in the 670-700 range - tend to be excellent Pokemon and with rare exceptions are banned from 'standard' formats for being overcentralizing.
  
- scott_w 1 month ago
  
  If anyone wants to see an example of this, JRose11 on YouTube is doing exactly this for Pokemon Red/Blue and putting a list together of completion times. It’s been a multi year project but I’d totally recommend starting at the beginning and losing yourself in the rabbit hole.
  He also does some bad Pokemon challenges to see if it’s possible to finish with, say, a Weedle (mostly no).
TZubiri 1 month ago

That property holds in most games, though, unless the game is heavily optimized for 'balance'. Consider chess, poker, or soccer: if both players play correctly, a huge range of strategies is easily exploitable and thus goes unplayed.
That said, complexity emerges and explodes in the tiny differences. Even if there are only about 8 pokemon that are in 95% of teams and 10 situational pokemon that appear 5% of the time, that's C(8,6) = 28 possible teams of Uber pokemon, plus a buttload of possible teams with situational pokemon like Choice Scarf Ditto, Eviolite Chansey, Nuzzle/U-turn/Super Fang Pachirisu, etc...
Even if the teams were the same, the possible differences in movesets alone create a lot of different sets: suppose each mon has 6 possible moves, then you have C(6,4) = 15 sets for each mon, each with its own probability weight as well.
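
As a quick check on the counting, here is the arithmetic with Python's `math.comb`. The pool sizes (8 staple Pokémon, 6 candidate moves) are the rough figures from the comment, not measured usage data.

```python
from math import comb

# Teams of 6 chosen from 8 staple Pokémon.
teams = comb(8, 6)

# Movesets of 4 chosen from 6 plausible moves, per Pokémon.
sets_per_mon = comb(6, 4)

# If each of the 6 team members picks its moveset independently,
# one fixed team already has this many moveset configurations.
configs_per_team = sets_per_mon ** 6

print(teams, sets_per_mon, configs_per_team)
```

So even with a very small pool, moveset choice multiplies the number of distinct teams by several orders of magnitude.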
mort96 1 month ago

You'd need something which better encapsulates the power level of a pokemon; move pools, abilities, typing and synergies all play a role. There are bad pokemon with a huge base stat total, and a hypothetical pokemon with a tiny base stat total of 50 but with ground/poison typing and the wonder guard ability would be an absolute nightmare.
The Smogon tiers (NU, RU, UU, UUBL, OU, Uber) are a thorough attempt at placing all pokemon in tiers based on overall power level and at making interesting, balanced matches possible at different tiers.
nchagnet 1 month ago

That's a great idea!
huevosabio 1 month ago

Follow up thought: it would be cool if you imagine a match to consist of three sets each for a different class.
So each player comes with three teams of small, mid and large base stat classes. You can't repeat a monster across teams. Whoever wins 2/3 wins the match.
...
And if this was my college house, we would have a price system for the mons so you wouldn't be able to repeat mons even between players. But that's a different thing altogether.

oceansky 1 month ago

Base stat total alone is a bad metric, because stat distribution is equally important.

If the stats are distributed heavily both on attack and special attack, it's usually bad because you generally want specialist attackers and these stats could be better somewhere else like speed.

nchagnet 1 month ago
Absolutely! In general I would expect a better model to incorporate a lot of weighted terms in the objective to choose less "extreme" solutions, but here I was mostly interested in illustrating the method.
- oceansky 1 month ago
  
  It was very impressive at that, congratulations.
TZubiri 1 month ago

You can go for smogon tiers as a proxy for pokemon strength.

snackdex 1 month ago

damn, everyone is having such well thought out conversations, and I just scrolled to see the Pokémon that were selected.

I think I'm one of those people who worries that if I understand the math of it all, it'll lose some of the magic that made me keep my Raticate for no other reason than that it was the first pokemon that accepted me without me dealing any damage to it.

xlbuttplug2 1 month ago

I have very fond memories of playing the Pokemon games, but I always saw the battling aspect as a hurdle to unlocking more of the story/world, which was the true appeal to me. I was content turning my brain off and overleveling my mons. Different strokes, I guess.

vmazi 1 month ago

In professional Pokémon competition, winning depends not only on a Pokémon's stat distribution (IVs/EVs) but also on the move pool it has access to. Strategies are built around moves that set up the Pokémon in play, or the Pokémon that will be switched in, so that stat distributions and/or move priority are modified in the user's favor. I wonder if there's a way to calculate an ideal team not just from stat distributions but from the likelihood that the available move pool can be linked to a coherent win strategy, like stall, Trick Room switch-ups, Flip Turn into Swift Swim priority, and other matchup strategies.

s_trumpet 1 month ago

Great article but I’d say you’re optimizing for the wrong metric here. For in game playthroughs, offense > defense and especially speedy offense beats anything else.

I’d state it as, Given any type, we should be able to hit it for super-effective damage with at least 1 move. And instead of taking raw BST, I’d take Max(SPD+ATK, SPD+SPA) to favour speedy offense.
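
A small sketch of that metric, reading SPD as Speed (not Special Defense). The function name `speedy_offense` is made up for illustration, and the base stats are the commonly listed values for three Pokémon mentioned in this thread; treat them as illustrative.

```python
def speedy_offense(stats):
    """Max(Speed + Attack, Speed + Special Attack), as proposed in the comment."""
    return max(stats["speed"] + stats["attack"],
               stats["speed"] + stats["sp_attack"])


# Commonly listed base stats (only the three stats the metric needs).
BASE_STATS = {
    "Crobat":    {"speed": 130, "attack": 90,  "sp_attack": 70},
    "Gyarados":  {"speed": 81,  "attack": 125, "sp_attack": 60},
    "Metagross": {"speed": 70,  "attack": 135, "sp_attack": 95},
}

ranking = sorted(BASE_STATS, key=lambda name: speedy_offense(BASE_STATS[name]),
                 reverse=True)
for name in ranking:
    print(name, speedy_offense(BASE_STATS[name]))
```

Swapping this score in for raw base stat total would only change the objective coefficients of the optimization; the coverage requirement ("hit every type super-effectively with at least one move") would still be a separate constraint.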

Of course this does not take into account the thorny question of availability. Metagross is top tier but only available post-game in its debut. On the other hand, Crobat and Gyarados are readily available early on in many of the games and evolve fairly quickly.

Please look into the competitive Nuzlocke community, there are a lot of damage calculations and viability spreadsheets all around, you’ll find it interesting.

nchagnet 1 month ago

Thank you for your suggestion. I agree with you (and another commenter) that base stat is not that useful, and availability is actually what I would prioritise in a next iteration. I tried to keep it simple here, mostly because it was interesting enough as an analysis. But if I were to redo this to get _the best_ team in a generation, I'd definitely go with what you suggested!
saghm 1 month ago

If we're really trying to optimize for everything, I'd argue two of the biggest factors are move sets and (in generations after 2) abilities. There are Pokemon with great stats but abilities that quite literally are intended to be drawbacks (e.g. Slaking and Regigigas).

TZubiri 1 month ago

For games that are as complex as pokemon, it's usually necessary to restrict analysis to some subset. In this case team typing was used.

I personally like restricting to generation 1, as it is very canonical, very static, and one of the simplest.

Furthermore, I like the 1v1 format, where instead of a team it's just one pokemon vs the other. Otherwise you have to resort to heuristics.

But even with a 1v1 and generation 1 restriction it still isn't solved!

Even for a single matchup, it's very complex to reduce it to a theoretical mathematical problem, and still quite burdensome to write a Monte Carlo simulation.

For example:

Tauros vs Gengar (not an uncommon matchup in competitive gen 1)

Hypnosis has 60% accuracy, and Tauros can sleep for 1 to 6 turns with equal probability. Tauros can 2HKO with Earthquake, but can also crit. Gengar can 4HKO, with each crit counting as a double hit (both crits having roughly a 20% chance).

The question of who has the advantage is to my knowledge unsolved (also consider that in 1v1 the answer is different, as in teams you only have 1 sleep, so Gengar wastes it). It's also different from the problem of choosing the actual correct move, not only do you need to find the best first move, but in the game decision tree, you need a decision for each node. For example, if Tauros has 60% HP and Gengar has 100%HP, is it still better to go for hypnosis, or better to go for damage and hope for 1 out of 2 crits. This is all made more complex by the fact that both mons have a speed tie, so it's yet another probability event of who will attack first.

https://www.smogon.com/forums/threads/gengar-vs-tauros-1v1-w...
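
To give a feel for what such a Monte Carlo simulation involves, here is a deliberately crude sketch that uses only the numbers quoted above. Everything else is an assumption made for illustration: both sides follow a fixed policy (Gengar uses Hypnosis whenever Tauros is awake and attacks otherwise, Tauros always attacks), attacks never miss, damage rolls are ignored, a turn spent asleep or waking up is simply lost, and the speed tie is a coin flip each turn. It does not answer who has the advantage under optimal play.

```python
import random


def simulate_once(rng, hypnosis_acc=0.6, crit=0.2,
                  tauros_hits_needed=2, gengar_hits_needed=4, max_sleep=6):
    """Play one simplified 1v1 and return the winner's name."""
    gengar_hp = tauros_hits_needed  # hits Gengar can take (Tauros 2HKOs)
    tauros_hp = gengar_hits_needed  # hits Tauros can take (Gengar 4HKOs)
    sleep_turns = 0                 # turns Tauros still loses to sleep
    while True:
        order = ["tauros", "gengar"]
        if rng.random() < 0.5:      # speed tie: coin flip for who moves first
            order.reverse()
        for actor in order:
            if actor == "tauros":
                if sleep_turns > 0:
                    sleep_turns -= 1  # turn lost to sleep (assumption)
                    continue
                gengar_hp -= 2 if rng.random() < crit else 1
                if gengar_hp <= 0:
                    return "tauros"
            elif sleep_turns == 0:
                # Assumed policy: always try Hypnosis on an awake Tauros.
                if rng.random() < hypnosis_acc:
                    sleep_turns = rng.randint(1, max_sleep)
            else:
                tauros_hp -= 2 if rng.random() < crit else 1
                if tauros_hp <= 0:
                    return "gengar"


def estimate_tauros_win_rate(trials=10_000, seed=0, **kwargs):
    """Fraction of simulated games won by Tauros under the fixed policies."""
    rng = random.Random(seed)
    wins = sum(simulate_once(rng, **kwargs) == "tauros" for _ in range(trials))
    return wins / trials


print(estimate_tauros_win_rate())
```

Even this toy version needs a policy for each side, which is exactly the hard part: a proper analysis would have to optimize the decision at every node of the game tree instead of fixing it up front.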

For a simple gen 1 with hidden teams, I think there's a bigger game tree than chess, and even poker. The fact that it's stochastic with hidden information makes it very similar to poker analysis-wise; I bet Counterfactual Regret Minimization approaches would work as well.

yunusabd 1 month ago

If you find the base game too easy, I can recommend the IronMON challenge: You can only use one mon, permadeath, stats are randomized, all trainer levels are buffed by 1.5x and you can't level up on wilds. Along with numerous other rules to make it harder. There are variants that are borderline impossible to beat, like Super Kaizo IronMON. Out of hundreds of thousands of attempts, it has only been beaten once. Would make for an interesting optimization problem.

https://github.com/PyroMikeGit/SuperKaizoIronMON

tweakimp 1 month ago

Why is y+2x optimal at (0,3) with a value of 3? Isn't it (3,0) with a value of 6?

nchagnet 1 month ago
Good catch! Especially since I ended up drawing y - x = C but didn't update the legend. I updated it!
- tomtom1337 1 month ago
  
  Haha, I started reading this, got interrupted, came back and got confused by the graph. Then came to the comments, saw your comment, reloaded the post and voila!
  Thank you for a lovely post!
abhishekbasu 1 month ago

you're right, it should be (3,0) with optimal obj value of 6.
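
A quick check of the arithmetic, assuming both points are vertices of the article's feasible region (the region itself is not reproduced in this thread, so only these two candidates are compared):

```python
def objective(x, y):
    """The objective discussed above: y + 2x."""
    return 2 * x + y


# Only the two points mentioned in the thread; other vertices are not known here.
candidates = [(0, 3), (3, 0)]
best = max(candidates, key=lambda point: objective(*point))

for point in candidates:
    print(point, objective(*point))
print("best:", best)
```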

thiago_fm 1 month ago

While I like the optimization analysis, the initial assumptions are very wrong. I know the author mentioned that they are optimizing for stats + being resistant to other Pokémon types, but that analysis will lead to very bad results.

There are Pokémon with certain abilities or tricks that make them much better than legendary ones, with certain move sequences that could wipe the entire other team.

There are also Pokémon with certain types that are actually good against what the 'data' would say otherwise.

Maybe the analysis could be better done if you instead analyzed matches data.

BTW, the way people have played Pokémon for many years is to divide the Pokémon into tiers, and in a competitive setting you are only allowed to pick Pokémon from the same tier or lower. This adds another level of complexity.

nchagnet 1 month ago
I generally agree with you on the point that a "good Pokémon team" can be better encapsulated by other attributes, including those you mentioned. I would disagree on assumptions being very wrong, because I am not assuming that the objective and constraints chosen are ideal or even good enough, I am choosing them simple for illustration purposes.
I actually found it interesting that in spite of what is a clearly overly simple model, the non-legendary non-multi-starter you eventually get is quite a good one, in my opinion better than what the naive constraints would lead me to think.
Also, keep in mind that I'm not talking about competitive matches here, just mainline gaming. To that end, types are usually all you need, and in that area the main thing I would do is generalize the type constraints so they are not just defensive but also ensure each resistant Pokémon has a good enough attack against that type.
In my opinion, abilities, natures and items are (1) too complex for such models (MIPs are still exponential-time in the worst case) and (2) overkill when all you want to do is beat the league.
But that last part is just my opinion.
- thiago_fm 1 month ago
  
  That's where I disagree as well: the non-legendary non-multi-starter team is also a crappy one, as Pokémon hasn't been a rock-paper-scissors game for a long time.
  You'd need to take innate abilities into account, and also understand the impact of HP/ATK/etc. for each Pokémon.
  Snorlax, for example, is quite good, as it's pretty tanky and has a good moveset that helps it set up.
  Dragonite has Multiscale, which doesn't show up if you just look at Pokémon types and stats.
  Again, my main point was... analysis could be much more interesting if it analyzed battle data, rather than a very shallow assumption that a combination of Pokémon types and stats can win a battle.

abhishekbasu 1 month ago

this was a great read to start the new year! having worked extensively with mixed integer programs, it is always a bit disheartening to see them not used enough for everyday decision-making. one of my goals this year is to create a layer to make it easier to formulate mips and test them, via plain text input. this would hopefully increase adoption through a lower barrier to entry.

mparnisari 1 month ago

My uni course on optimization was so much fun but I forgot all of it. This was a nice reminder that I should probably revisit the basics :)

TobiasJBeers 1 month ago

This thread is a great example of how quickly “optimize X” turns into “define X” when modeling intuition meets player intuition.

stevekemp 1 month ago

Lots of people working in IT have tattoos, I like to see what theme/image overlap they have.

Three people in my current workplace have a balloon tattoo (interestingly all of them are red balloons). Five people in my current workplace have a Pokémon tattoo that is easily visible.

Edit: Including myself, on both counts, I should have said.

u8080 1 month ago
>balloon tattoo
What does it mean?
- stevekemp 1 month ago
  
  A tattoo of a balloon! Unless you meant what the meaning of the design was, and in that case different people have different associations and meanings.
  One of my forearms is covered in things my son used to be obsessed by when he was young, which is why I have a lego figure, a pikachu, and a red balloon as depicted in the book "Goodnight Moon" which I read to him every night for 3+ years.
867-5309 1 month ago
which Pokémon? gotta name them all! (5)
- stevekemp 1 month ago
  
  I wish I could remember, but offhand all I can say is that we definitely have two pikachus and one snorlax.
  

toledocavani 1 month ago

While I didn't have the math brain, I came to a pretty similar conclusion based on days swimming in Bulbapedia (with an added constraint that my Deoxys must have a slot).

Took the team to online matches and got swept, the pro players have completely different team choices and strategies.

arealaccount 1 month ago

Slaking can only attack every other turn making it a bad choice outside of niche teams.

nchagnet 1 month ago

I do comment on that in the article, I think it's a nice example of how your model can only know what you tell it (the one I used in the article doesn't know about abilities).

saagarjha 1 month ago

I'm curious how it would rank existing teams–for example, are there trainers who pick better teams (of course, I am sure the bug catchers get soundly trounced). Surely Cynthia or Red have a strong team?

faitswulff 1 month ago

> I was always hunting for Pokémon with better abilities, better type coverage, analyzing synergy between moves… If you’ve ever played a mainline Pokémon game before, you must know how utterly unnecessary this is. Twenty years ago, I would have just powered through on Blastoise or Typhlosion alone.

I definitely beat the first Pokemon games with a level 100 Charizard. I even defeated gyms that were strong against fire types, often KO'ing Pokemon in one hit. The text would say "It's not very effective..." and then the opponent's health bar would drop to zero. So yeah, these games are easy enough that a 10yo can get by with twinking out a single pokemon. Makes the blog post even funnier

cpburns2009 1 month ago

I have fond memories of my Gyarados soloing the Elite Four in Yellow.

the-smug-one 1 month ago

The SVG chart has internationalization built-in, with multiple languages available. I thought that was cool.

HelloUsername 1 month ago

I would've liked to see in conclusion a recommended starter team per generation! Very nice article!

nchagnet 1 month ago

I was planning in a future sequel/update to do this but with "better" constraints like only including Pokémon available in a game, etc... Maybe even separate it into early/mid/late-game availability since most optimal Pokémon are late-game anyway.

mgaunard 1 month ago

Would have been nice to see with only first gen pokemon which were much better balanced IMHO.

saghm 1 month ago

I'm not sure that theory bears out. At least in terms of competitive meta, my understanding is that gen 1 is pretty much always the same Pokemon on most teams (although that's also partially from fewer choices and the lack of newer features that provide ways to build alternative strategies like held items and abilities).

erhuve 1 month ago

my goat tyranitar will always be relevant no matter the generation

yjftsjthsd-h 1 month ago

Small typo(?):

> Mewtwo (#151)

Should be 150

nchagnet 1 month ago
Thank you, you're right! For some reason I always forget mew comes after mewtwo in the pokedex...
- saghm 1 month ago
  
  I think it's a common mistake. It would be much more intuitive for the one literally labeled "two" to come after; I imagine it doesn't work like that because they wanted the mythical one last.
  

unpopularopp 1 month ago

Now all we need is a quick vibe coded web GUI front end

HelloUsername 1 month ago

Username checks out