I'm no expert on chess engine development, but it's surprising to me that both lc0 and stockfish use SPSA for "tuning" the miscellaneous magic numbers which appear in the system rather than different black box optimization algorithms like Bayesian optimization or evolutionary algorithms. As far as I am aware both of these approaches are used more often for similar tasks in non-chess applications (ex. hyperparameter optimization in ML training) and have much more active research communities compared to SPSA.
Is there something special about these chess engines that makes SPSA more desirable for these use cases specifically? My intuition is that something like Bayesian optimization could yield stronger optimization results, and that the computational overhead of doing BO would be minimal compared to the time it takes to train and evaluate the models.
One thing I wonder is why design of experiments (DOE) methodology is so seldom used for these things.
Statisticians and operations researchers have spent a hundred years working out how to run as few experiments as possible while tweaking parameters in the ways that give the highest impact, with a statistical basis for confidence that the selections are good.
In the language of information and decision trees, these experiments are trying to in some sense “branch” on the entropy minimizing variables.
SPRT is used religiously in engine development today. There is enormous incentive to test efficiently.
https://github.com/official-stockfish/fishtest/wiki/Fishtest...
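To make the stopping rule concrete, here is a rough sketch of the SPRT decision in Python (a simplified two-outcome model, not fishtest's actual pentanomial GSPRT; the function names and bounds are just illustrative):

    import math

    def expected_score(elo):
        # Logistic Elo model: probability that the patched engine scores a point.
        return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

    def sprt(results, elo0=0.0, elo1=5.0, alpha=0.05, beta=0.05):
        # Test H1 ("patch gains at least elo1") against H0 ("gains at most elo0").
        lower = math.log(beta / (1.0 - alpha))
        upper = math.log((1.0 - beta) / alpha)
        p0, p1 = expected_score(elo0), expected_score(elo1)
        llr = 0.0
        for won in results:  # True = win for the patch, False = loss (draws ignored here)
            llr += math.log(p1 / p0) if won else math.log((1.0 - p1) / (1.0 - p0))
            if llr >= upper:
                return "accept patch"
            if llr <= lower:
                return "reject patch"
        return "keep playing games"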
DOE is still very useful in many contexts, but when it's possible to use a sequential design, these adaptive techniques really start to pull away in terms of optimization quality.
There's simply a lot of sample efficiency to gain by adapting the experiment to incoming data: a regime where one can repeatedly design n candidates, observe their effects, and repeat m times beats a setting where one must design a single fixed experiment with n*m samples.
Engines like Stockfish might have over 100 "search parameters" that need to be tuned; to the best of my knowledge, SPSA is preferred because its computational cost typically does not depend on the number of parameters.
Or, if attempting to use SPSA to, say, perform a final post-training tune of the last layers of a neural network, this could be thousands of parameters or more.
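For intuition on why the cost is dimension-independent, here's a rough sketch of a single SPSA step (the objective and the gain constants are placeholders; a real tuner decays a and c over iterations and adds per-parameter scaling):

    import random

    def spsa_step(params, objective, a=0.1, c=0.05):
        # Perturb *every* parameter at once with a random +/-1 sign, so the
        # gradient estimate needs only two objective evaluations regardless
        # of whether there are 10 parameters or 10,000.
        delta = [random.choice([-1.0, 1.0]) for _ in params]
        plus  = [p + c * d for p, d in zip(params, delta)]
        minus = [p - c * d for p, d in zip(params, delta)]
        y_plus, y_minus = objective(plus), objective(minus)  # e.g. score of a short engine match
        grad = [(y_plus - y_minus) / (2.0 * c * d) for d in delta]
        # Ascent step, treating the objective as a score to maximise.
        return [p + a * g for p, g in zip(params, grad)]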
The concern about the dimensionality of the search space is real, especially once things cross over into the 100s -- BO would certainly not be useful post-training the way the blog post talks about using SPSA.
That being said, it still seems possible to me that using a different black box optimization technique for a fairly constrained set of related magic numbers (say, fewer than 50) might lead to some real performance improvements in these systems; it could be worth reaching out to the lc0 or stockfish development communities.
Please be careful when visiting the homepage
as always, genius and insanity are only 1 step apart
This really reminds me of the web as I remember it from the mid-to-late 90's; I feel like I'm just a click away from the old deoxy.org, if anyone remembers that. (Don't go there now; the domain appears to have been long-ago hijacked.)
or kittens on encyclopediabrittanica
It gave me serious vibes of the old internet homepages of highly eccentric people that became a part of the internet folklore, whether in a good way or a bad way.
The video is probably the least bizarre thing there, if that's what you are warning about.
What were you browsing where someone cutting off their own testicles is not as bizarre as other things? I didn't watch the video, but at least there was a warning.
Feds this guy right here ^^
The homepage for this site is defiantly NSFW.
You probably meant definitely, but defiantly amusingly works too
"definitely" or "defiantly"?
The idea of something being "defiantly" NSFW gave me a chuckle.
AFAIK chess has been "solved" for a few years in the sense that Stockfish running on a modern laptop with 1 minute per move is unbeatable from the starting position.
This is not true. Stockfish is not unbeatable by another engine, or another copy of Stockfish.
Chess engines have been impossible for humans to beat for well over a decade.
But a position in chess being solved is a specific thing, which is still very far from having happened for the starting position. Chess has been solved up to 7 pieces. Solving basically amounts to some absolutely massive tables that have every variation accounted for, so that you know whether a given position will end in a draw, black win or white win. (https://syzygy-tables.info)
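If you download the tablebase files, you can probe them directly with python-chess (the directory and FEN below are just examples):

    import chess
    import chess.syzygy

    # Assumes the Syzygy .rtbw/.rtbz files have been downloaded into ./syzygy;
    # any position with up to 7 pieces then has an exact verdict.
    with chess.syzygy.open_tablebase("./syzygy") as tb:
        board = chess.Board("8/8/8/8/8/4k3/4p3/4K3 b - - 0 1")  # K+P vs K example
        print(tb.probe_wdl(board))  # 2 = win, 0 = draw, -2 = loss for the side to move
        print(tb.probe_dtz(board))  # moves until the 50-move counter resets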
The parent is using a different definition, so they put "solved" in quotes. What word would you suggest to describe the situation where the starting position with 32 pieces always ends in either a draw or win for white, regardless of the compute and creativity available to black?
I haven't verified OP's claim attributed to 'someone on the Stockfish discord', but if true, that's fascinating. There would be nothing left for the engine developers to do but improve efficiency and perhaps increase the win-to-draw ratio.
Hypothetically, what reward would be worth the cost for you to attempt to beat Stockfish 18, 100 million nodes/move, from the starting position?
Do you have a source? I remember asking on the Stockfish Discord and being told that Stockfish on a modern laptop with 1 min per move will never lose against Stockfish with 1000 min per move from the starting position.
But I'm not sure whether that guy was guessing or confident about that claim.
You can run Stockfish single threaded in a deterministic manner by specifying nodes searched instead of time, so in principle it is possible to set some kind of bounty for beating Stockfish X at Y nodes per move from the start position, but I haven't seen anyone willing to actually do so.
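For example, with python-chess (assuming a local Stockfish binary on your PATH), a fixed node budget and a single thread gives a reproducible self-play game:

    import chess
    import chess.engine

    engine = chess.engine.SimpleEngine.popen_uci("stockfish")
    engine.configure({"Threads": 1, "Hash": 256})

    board = chess.Board()  # starting position
    while not board.is_game_over():
        # Node limit instead of a time limit: the search is deterministic
        # run-to-run, so the same game replays identically.
        result = engine.play(board, chess.engine.Limit(nodes=1_000_000))
        board.push(result.move)

    print(board.result())
    engine.quit()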
Even by a stockfish running on a modern laptop with 2 minutes per move (provided they are going second)?!
Yes, that's what "unbeatable from the starting position" means.
“Solved” is a term of art. Defining it in some other way is not really wrong (since it is a definition) but it seems… unnecessary.
https://cosmo.tardis.ac/files/2026-02-12-az-rl-and-spsa.html
Response from the author of Viridithas; there is a link to this engine on her webpage.
Thanks! I've put that link in the toptext as well.
Her?
The author goes by any/all pronouns; she, he, and they have all been used for the author with similar frequency.
I read "girl.surgery" and guessed.
I know a fair deal about the subject of chess AI, but when I was reading this I didn't understand it. I was torn: was I reading a mastermind who was way above my level, or someone way too confident who had learned enough buzzwords through an LLM to briefly delude someone other than themselves?
A quick visit to the homepage suggests that it's probably the latter. I don't want to be rude, and I'm not posting out of malice, but if someone else was reading this and trying to parse it, I think it might be helpful to compare notes and evaluate whether it's better to discard the article altogether.
Curious, what makes you believe that? As someone who doesn't know much about chess AI, I was mostly able to follow along, and figured there were simply some prereqs the author wasn't explaining (e.g. distillation, test-time search, RLVR). If the article is deeply confused in some way I would indeed like to calibrate my BS detectors.
This comment is another example of the "llm psychosis" that is currently occurring in common discourse.
The mass delusion of, "I don't understand what I'm reading, therefore it must be produced by an llm."
I think it's a pretty serious problem. Not that llm text exists on the internet, but that reasonable people are reflexively closed off to creativity because the mere existence of the possibility that something is created by an llm is in their minds grounds for disqualification.