Comment by BoredPositron 18 hours ago
It was mainly a jab at the protoscientific nature of it.

vntok 17 hours ago
Reproducing experimental results across models and vendors is trivial and cheap nowadays.

BoredPositron 16 hours ago
Not if Anthropic goes further in obfuscating the output of Claude Code.

vntok 15 hours ago
Why would you test implementation details? Test what's delivered, not how it's delivered. The thinking portion, synthesized or not, is merely implementation. The resulting artefact is what is worth testing.