Comment by swyx
1 month ago
hi! author here! putting together a list like this is intimidating - for everything i pick there are a dozen other suitable candidates, so please view this as a curriculum with broadly prescriptive weightings, with the understanding that the CURRENTLY_RELEVANT_PAPER is always a moving pointer rather than a fixed reference.
if you are interested in a narrative version, we went thru this specific reading list in our paper club: https://www.youtube.com/watch?v=hnIMY9pLPdg
This is a great list - I was wondering if there's a less research-oriented, more experimental or practical reading list you're planning as well. Some things off the top of my head:
- Actual examples of fine-tuning LLMs or making merges - usually discussed in r/LocalLLaMA for specific use cases like role playing or other scenarios that instruction-tuned LLMs are not good at. A Jupyter notebook or blog post would be great here.
- Specifically around agents and code generation - Anthropic's post about SWE-bench Verified gives a very practical look at writing a coding agent, with prompts, tool schemas, and metrics: https://www.anthropic.com/research/swe-bench-sonnet
- The wide range of LoRAs and fine-tunes available on Civitai for image models - a guide on making a custom one that you can use in ComfyUI.
- State of the art in audio models in production - ElevenLabs still seems to be the best among closed platforms, but there are open-source options for voice cloning and TTS, even at very small parameter counts (Kokoro, 82M).
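On the fine-tuning and merging point in the first bullet: the core idea behind the LoRA adapters and merges being traded around is small enough to sketch in a few lines of numpy. This is an illustrative toy (the shapes, `alpha`, and variable names are my own, not any particular library's API) - a LoRA stores a low-rank update `B @ A` alongside frozen base weights, and "merging" just bakes that update into a single matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy shapes: base weight is d x k, adapter rank r << min(d, k)
d, k, r = 8, 8, 2
W = rng.standard_normal((d, k))          # frozen base weights
A = rng.standard_normal((r, k)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, init to zero

alpha = 16                # common LoRA scaling hyperparameter
scale = alpha / r

# during training, the adapted layer computes x @ (W + scale * B @ A).T;
# merging bakes the low-rank update into one matrix for cheap distribution
W_merged = W + scale * (B @ A)

# with B initialised to zero, the merged weights start identical to the base,
# so an untrained adapter is a no-op
assert np.allclose(W_merged, W)
```

This is why LoRA checkpoints are tiny (only `A` and `B` are stored) and why merged variants are so easy to produce and share - the merge is a single addition over the base weights.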
welcome to write a guest post around it!
Great work!
I am out of my depth when it comes to reading papers, but I second 'The Prompt Report' from your list.
It gives a great taxonomy that helped me understand the space of prompting techniques better.
he was great on the pod too https://www.latent.space/p/learn-prompting
Hey Swyx, this list of papers is really interesting - I like that it's an opinionated list. Why isn't there more about model implementation though? I'd have expected engineers in this field to want to know PyTorch, CUDA, JAX, maybe even XLA?
because of the growth of the field and the proliferation of closed model APIs, that is out of scope for the AI Engineer now - save that for the MLEs and RSes
I appreciate the effort put into curating and maintaining this list, good job on that!
I’m curious, is there also some specific existing “AI Researcher Reading List” you would personally recommend? Or do you plan on making and maintaining one?
im no researcher so no, but i would start with the llm courses at stanford, uc berkeley, and princeton.