Comment by lp251

8 hours ago

why did they move to fblearner

what is the new training platform

I must know

Meta has been itching to kill FBLearner for a while. It's basically an Airflow-style interface (much better to use as a dev, not sure about admin; I think it might even pre-date Airflow).

They have mostly moved to MAST for GPU stuff now; I don't think any GPUs are assigned to FBLearner anymore. This is a shame because MAST feels a bit less integrated into Python and a bit more like "run your exe on n machines". However, it has a more reliable mechanism for doing multi-GPU things, which is key for doing any kind of research at speed.

My old team are not in the superintelligence org, so I don't have many details on the new training system, but there was lots of noise about "just using vercel", which is great apart from all of the steps and hoops you need to go through before you can train on any kind of non-open-source data. (FAIR had/has their own cluster on AWS, but that meant they couldn't use it to train on data we collected internally for research, i.e. paid studies and data from employees who were bribed with swag.)

I've not caught up with the drama for the other choices. Either way, it's kinda funny to watch "not invented here syndrome" smashing into "also not invented here syndrome".