This can never match the scale of organic training data
Or quality
Actually, synthetic training data is better; that's why the new models are all better at design.
These people won't have to be experts like the Tailwind team? Quality will be spontaneous?
They pay people to generate open source libraries? I'd love to see it
This is news to me. How does this work? Who is getting paid?
Some relevant job ads for Anthropic:
https://www.anthropic.com/careers/jobs/5025624008 - "Research Engineer – Cybersecurity RL" - "This role blends research and engineering, requiring you to both develop novel approaches and realize them in code. Your work will include designing and implementing RL environments, conducting experiments and evaluations, delivering your work into production training runs, and collaborating with other researchers, engineers, and cybersecurity specialists across and outside Anthropic."
https://www.anthropic.com/careers/jobs/4924308008 - "Research Engineer / Research Scientist, Biology & Life Sciences" - "As a founding member of our team, you'll work at the intersection of cutting-edge AI and the biological sciences, developing rigorous methods to measure and improve model performance on complex scientific tasks."
The key trend in 2025 was a new emphasis on reinforcement learning - models are no longer just trained by dumping in a ton of scraped text, there's now a TON of work involved designing reinforcement learning loops that teach them how to do specific useful things - and designing those loops requires subject-matter expertise.
That's why they got so much better at code over the past six months - code is the perfect target for RL because you can run generated code and see if it works or not.
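To make the "run it and see if it works" point concrete, here is a minimal sketch of what an execution-based reward could look like. The function name `execution_reward`, the binary 1.0/0.0 scoring, and the plain-subprocess setup are all assumptions for illustration, not anything the labs have described; a real training setup would presumably use proper sandboxing and richer scoring.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def execution_reward(candidate_code: str, test_code: str, timeout_s: float = 5.0) -> float:
    """Hypothetical reward: 1.0 if the candidate passes the tests, else 0.0.

    The candidate and its tests are written to one script and run in a
    separate Python process. This is NOT a security sandbox; it only
    illustrates the idea of a reward you can check by running the code.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(candidate_code + "\n\n" + test_code + "\n")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # Code that hangs gets no reward.
            return 0.0
    # A failed assertion or any uncaught exception gives a non-zero exit code.
    return 1.0 if result.returncode == 0 else 0.0


if __name__ == "__main__":
    tests = "assert add(2, 3) == 5"
    print(execution_reward("def add(a, b):\n    return a + b", tests))
    print(execution_reward("def add(a, b):\n    return a - b", tests))
```

The signal comes from the exit code rather than from a human judging the output, which is the property that makes code such a convenient RL target.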
Mercor, Turing, Scale, etc. facilitate the work. Labs pay them, and they pay contractors.
The funny part is how they think this will give them the power to take control of what is the de facto standard and circumvent standards.
Instead, it will make AI slop easier to spot, because it doesn't work, and it will end up siloed off with people who don't care about the code and so can't fix it.
If people want good, interoperable, production-ready code that can be deployed instantly, just works, and meets all current standards and ongoing discussions, we've had it for many decades and it's called open source.