Comment by amluto
1 day ago
My personal peeve about Anki: I don’t like its data model. It seems to me that there ought to be collections of notes (which might be things one would download or generate with an LLM or make yourself or share with friends or students). On top of one or more collections of notes are the sets of cards one wants to learn, and they can derive from the notes. This includes, roughly, templates plus some concept of which cards are enabled. On top of that is the spaced repetition history and model. There also ought to be a way to constrain what cards should be studied in a given session. (For example, if learning Chinese or Japanese, one might want to have a pencil and paper when practicing writing but not reading. When practicing without paper, one might want to skip the writing cards.)
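To make the layering concrete, here is a rough sketch in Python of what I mean. Every name in it (Note, Template, Card, Schedule, derive_cards, session_cards, the "needs" idea) is made up for illustration and is not how Anki stores anything.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Note:
    """Layer 1: raw content, shareable with no study state attached."""
    note_id: str
    fields: dict[str, str]


@dataclass(frozen=True)
class Template:
    """Layer 2: how a card is derived from a note."""
    name: str
    front_field: str
    back_field: str
    needs: frozenset[str] = frozenset()  # e.g. {"paper"} for writing practice


@dataclass(frozen=True)
class Card:
    note_id: str
    template: str
    front: str
    back: str
    needs: frozenset[str]


@dataclass
class Schedule:
    """Layer 3: review state, keyed by (note_id, template name)."""
    due: dict[tuple[str, str], date] = field(default_factory=dict)


def derive_cards(notes, templates, enabled):
    """Build cards from notes and templates, keeping only enabled templates."""
    return [
        Card(n.note_id, t.name, n.fields[t.front_field], n.fields[t.back_field], t.needs)
        for n in notes
        for t in templates
        if t.name in enabled
    ]


def session_cards(cards, schedule, today, available):
    """Layer 4: due cards whose requirements are met in this session."""
    return [
        c for c in cards
        if c.needs <= available
        and schedule.due.get((c.note_id, c.template), today) <= today
    ]


# Hypothetical data: one note, a reading template and a writing template.
notes = [Note("n1", {"kanji": "水", "meaning": "water"})]
templates = [
    Template("reading", "kanji", "meaning"),
    Template("writing", "meaning", "kanji", frozenset({"paper"})),
]
cards = derive_cards(notes, templates, enabled={"reading", "writing"})
no_paper = session_cards(cards, Schedule(), date.today(), available=frozenset())
```

In this sketch the notes can be shared or versioned on their own, the schedule is a separate object, and a session without paper simply filters out the writing card.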
Anki doesn’t seem to separate these layers at all. Everything is a monolithic database. Import is unpleasant. Export is unpleasant. Sharing is unpleasant. Doing anything other than practicing and editing in the UI is unpleasant. And, every time I try Anki, I get stuck when I can’t manipulate my own data outside Anki.
Is there any system out there that doesn’t have this issue?
> there ought to be collections of notes (which might be things one would download or generate with an LLM or make yourself or share with friends or students). On top of one or more collections of notes are the sets of cards one wants to learn
Is that not what anki does? You have a collection of cards, each card can be in one or more decks derived from the cards.
> There also ought to be a way to constrain what cards should be studied in a given session
That's also decks. You can have your 'Japanese' deck, and then the 'Japanese::writing' subdeck for the subset which require you to have your writing materials handy.
You can also use "Better Tags" to tag cards, and then create a sub-deck with an ad-hoc tag query to only study a subset if you want.
Does creating more decks and then studying the subset you want to in a session not work for what you want?
> Anki doesn’t seem to separate these layers at all. Everything is a monolithic database.
Decks are separate files which can be shared, edited, created, studied, and reasoned about independently.
The "spaced repetition model" in anki is obviously separate, given that there are multiple to choose from (FSRS and the old one).
> Export is unpleasant. Sharing is unpleasant
It's just files (zip files really). What's unpleasant about it?
> And, every time I try Anki, I get stuck when I can’t manipulate my own data outside Anki.
There are libraries to manipulate anki decks outside of anki for practically every programming language. There are literally dozens of tools that can generate and import anki cards, such as the large family of japanese "mining" tools which create anki cards from media, dictionary entries, etc.
It's open source, and the code has clean library abstractions you can work with, so it's trivial to nab any of the data out of it.
> Is there any system out there that doesn’t have this issue?
Every issue you described is something that I experienced in other software, but which anki solved for me, so for me "anki" is that system.
> Is that not what anki does? You have a collection of cards, each card can be in one or more decks derived from the cards.
Kind of? As far as I can tell (and I haven't spent enormous amounts of time digging in), there are decks, and a deck contains the notes, the templates (and the cards, which may or may not have any sort of independent existence outside the notes and templates that generate them?), and a deck also contains the scheduling information.
One can export the textual and markup contents of decks, but not the media, into a text file, and one can re-import it, supposedly losslessly. One can also export a deck minus scheduling information for sharing purposes. I'm not sure that one can re-import it.
Then there is a collection, which is the whole world: decks along with their scheduling info.
> That's also decks. You can have your 'Japanese' deck, and then the 'Japanese::writing' subdeck for the subset which require you to have your writing materials handy.
I'm guessing that, if I start by importing a Japanese deck from some other source (because, for example, there's a source with high-quality notes), and then I split it into a writing subdeck, and then the original source adds new notes for new words or makes changes or whatever, that merging the results is basically unsupported.
> > Anki doesn’t seem to separate these layers at all. Everything is a monolithic database.
> Decks are separate files which can be shared, edited, created, studied, and reasoned about independently.
Yes, but only as monoliths (again, as far as I can tell). I can export an "Anki Deck Package (.apkg)", but checking that into git would result in a bit of nonsense. I can't export my scheduling information and templates separately from the underlying notes (or, if I can, I failed to find this option).
>> Export is unpleasant. Sharing is unpleasant
> It's just files (zip files really). What's unpleasant about it?
Excel and OpenDocument sheet files are also zip files. But the respective tools are less limiting and don't expect the users to unzip those zip files. (And their merging and text import/export facilities are also weak, and that's unfortunate.)
I could be wrong about most of this. But Anki doesn't seem friendly to a decomposed workflow in the way that modern programming languages are.
Maybe I misunderstood something, but I can't find anything in your comments that identifies a flaw with the underlying data model of Anki. It seems your issues are mostly about Anki's data management?
So I would strongly suggest you check out anki-connect (https://git.sr.ht/~foosoft/anki-connect), which provides a REST API for CRUD operations on Anki notes, cards, decks, and media attachments.
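As a minimal sketch of what talking to it looks like, assuming the add-on is installed, Anki is running, and it listens on its default local address. The helper names build_request and invoke and the "Japanese" deck are just illustrative; check the add-on's own documentation for the actions it supports.

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"  # default address, assumed unchanged


def build_request(action, **params):
    """Return the JSON body AnkiConnect expects for one action."""
    return {"action": action, "version": 6, "params": params}


def invoke(action, **params):
    """Send one action to a running Anki with the add-on installed."""
    body = json.dumps(build_request(action, **params)).encode("utf-8")
    request = urllib.request.Request(ANKI_CONNECT_URL, data=body)
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    if reply.get("error") is not None:
        raise RuntimeError(reply["error"])
    return reply["result"]


# Example (needs Anki running): note IDs in a hypothetical "Japanese" deck.
# note_ids = invoke("findNotes", query='deck:"Japanese"')
```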
Or maybe if you can share in a little more detail on what you are studying, the format of the data, and the exact way that your attempted workflow is breaking down with Anki, I'd be surprised if no one had suggestions for making it work.
Edit: also, to answer a question in your initial post, is there a better SRS tool out there? I've never found anything. For all its warts and flaws, Anki is good where it matters, extensible enough to support pretty much all use cases, and has excellent data portability.
> I'm guessing that, if I start by importing a Japanese deck from some other source (because, for example, there's a source with high-quality notes), and then I split it into a writing subdeck, and then the original source adds new notes for new words or makes changes or whatever, that merging the results is basically unsupported.
Splitting a deck into subdecks can be done through tags, and every card has a unique ID field (usually the front, but you or the creator of a deck can define another one). Assuming tags is the only field you change, when you re-import an updated deck into your collection, Anki will match the IDs and you can define its behavior when it comes to new or already existing cards.
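The idea of that merge can be sketched roughly like this. This is only an illustration of matching on IDs while keeping your own tags, not Anki's actual import code; merge_update and the dictionary layout are made up for the example.

```python
def merge_update(local, incoming):
    """Merge an updated shared deck into local notes.

    Both arguments map a note ID to {"fields": {...}, "tags": set()}.
    Content comes from the update; tags stay local.
    """
    merged = {}
    for note_id, new in incoming.items():
        old = local.get(note_id)
        tags = set(old["tags"]) if old else set(new["tags"])
        merged[note_id] = {"fields": dict(new["fields"]), "tags": tags}
    # Notes the update no longer contains are kept rather than deleted.
    for note_id, old in local.items():
        merged.setdefault(note_id, old)
    return merged


# Hypothetical data: the local copy was tagged "writing"; upstream fixed a
# field and added a new note.
local = {"水": {"fields": {"meaning": "watr"}, "tags": {"writing"}}}
incoming = {
    "水": {"fields": {"meaning": "water"}, "tags": set()},
    "火": {"fields": {"meaning": "fire"}, "tags": set()},
}
merged = merge_update(local, incoming)
```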
It is a hard problem with too many solutions. Excel most certainly is limiting since it is a scary format for sharing data. Sharing highly structured info is a hard problem, sharing collections of the same is even harder.
Programming has a similar issue: code, docs, resources, discussions and issue tracking are not handled easily.
It’s amazing how every single point you’ve made here is wrong, as the other commenter already covered. On top of that, Anki is one of the best-documented pieces of open-source software I’ve messed with. If you’re able to program, ChatGPT can basically handle any task you want it to; I data mine the SQLite database regularly for my own insights.
My personal workflow for memorizing with Anki using LLMs is as follows: Read the textbook material first. It's important that you understand what you are learning. You cannot just break down information into atoms and expect to understand how it works together. For example, you can learn that there are monounsaturated fats, polyunsaturated fats, saturated fats and trans fats, but without first reading about them in a textbook or other source, you will not understand how they differ (in chemical structure, biological function, etc.). After I understand the material, I feed the LLM the documents (textbooks etc.) and give it the following prompt:
> I want to generate flashcards from the provided textbook document using the attached PDF. Each flashcard should contain a question and a corresponding answer formatted as pairs in plaintext code block. The structure should be: "Question","Answer"
> Extract key concepts, definitions, and explanations from the textbook. If in the text imperial unit system is used, convert it to metric. For mathematical symbols and equations, format them using inline MathJax syntax because I will be importing the copied text to Anki. Ensure questions are clear and concise while answers provide a direct yet comprehensive response.
After the text is generated, I check the accuracy (in 95% of cases the cards are accurate) and I import them into my decks. The rest is good old school Anki memorizing.
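If you want to sanity-check the generated text before importing, something along these lines could work. It is a sketch that assumes the LLM really did return one "Question","Answer" pair per line; parse_flashcards, to_tsv and the sample text are made up for the example.

```python
import csv
import io


def parse_flashcards(llm_output):
    """Parse '"Question","Answer"' lines, skipping anything malformed."""
    cards = []
    for row in csv.reader(io.StringIO(llm_output)):
        if len(row) == 2 and all(cell.strip() for cell in row):
            cards.append((row[0].strip(), row[1].strip()))
    return cards


def to_tsv(cards):
    """Render cards as tab-separated text, one note per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(cards)
    return buffer.getvalue()


# Hypothetical LLM output: one good pair and one line missing its answer.
sample = '"What is a saturated fat?","A fat with no C=C double bonds."\n"broken line"\n'
cards = parse_flashcards(sample)
```

The malformed line is dropped instead of ending up as a half-empty card, and the tab-separated text from to_tsv can be saved to a file for Anki's text import.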
I remember trying that a few months back but I was not convinced by the quality of the cards. It's also hard to convey what I think a good card is, I guess there's something about not getting lost in the details while missing the big picture.
You only commented on accuracy, but what's your experience on relevance and how useful LLM-generated flashcards are?
To be clear I've already found myself deleting some flashcards I made myself while reviewing them when I realized they were bad, so I guess one can do that for LLM-generated questions as well, as long as the irrelevance rate is somewhat similar.
I have been looking for a workflow to do exactly this!
Which LLM do you use?
Do you do anything special to structure the deck by chapter or section?
Actually, every point was right and the data model is terrible. I've been using it for years. The other commenter just mentioned a list of things that the software does do and basically said "isn't that good enough for you" a bunch of times. No, it's not. Anki's concepts of flashcards, and how it stores and manipulates them, are horrible.
It's hard to do many, many things in Anki that should be trivial, impossible to do many, many things that should be possible, and the things you can do involve the types of queries being run over your entire collection that cause the app to slow to a crawl after you add about a dozen decks. And in general: I can adjust far too many things that I don't even care to adjust and probably shouldn't be adjusting, and things that should be trivial to do are impossible.
It's bad. Ankidroid is a little better, but they're also stuck with the data model.
That’s because the default model is designed for the general user. If you sat down and really worked with the documentation, you would realize you shouldn’t be using decks or collections for management; you should be using tags. Decks and collections are a different abstraction for different purposes.
I’m in medical school, which has basically mastered Anki. The AnKing deck, used by over a million medical students, has over 35,000 cards, cross-tagged by numerous study resources, that exist in a single “deck” which receives regular updates. I regularly run basically instant queries on over 40,000 cards.
Medical school Anki has basically mastered this workflow, and the original commenter’s complaints are completely wrong or come from a misunderstanding of Anki’s data model.
To be put simply, ignoring subdecks, filtered decks, cards vs notes, etc.: cards can only belong to one deck, but can have multiple tags. What exactly do you want to see differently in the data model?
Where does your SQLite db sit? I have been thinking about creating a local database to save and feed into an LLM, but I have no experience with it.
The SQLite database file is called "collection.anki2". Its path varies depending on the operating system: https://docs.ankiweb.net/files.html?highlight=collection.ank...
If you export a deck as an .apkg file, you can unzip it to get the SQLite files:
https://www.encona.com/posts/custom-statistics-for-anki-flas...
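For a first look at the data, a minimal sketch with Python's built-in sqlite3 module might look like the following. It assumes the legacy schema as I remember it (a notes table with id, tags and flds columns, fields joined by the 0x1f character); newer exports may use a different file name or format inside the package, so treat the table and column names as assumptions and work on a copy of the file with Anki closed. read_notes is a made-up helper.

```python
import sqlite3

FIELD_SEPARATOR = "\x1f"  # assumed: notes.flds joins a note's fields with this


def read_notes(connection, limit=5):
    """Return (note_id, tags, fields) tuples from an open collection database."""
    rows = connection.execute(
        "SELECT id, tags, flds FROM notes LIMIT ?", (limit,)
    ).fetchall()
    return [
        (note_id, tags.split(), flds.split(FIELD_SEPARATOR))
        for note_id, tags, flds in rows
    ]


# Example (path is hypothetical, point it at a copy of your own file):
# connection = sqlite3.connect("collection-copy.anki2")
# for note_id, tags, fields in read_notes(connection):
#     print(note_id, tags, fields)
# connection.close()
```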
I’m not using LLMs but I scrape open dictionary data to generate (massive) cards for learning kanji. I publish these in a specialized kanji learning app (rather than anki proper). My db just lives temporarily on my local machine, before I publish. Maybe sometime I will streamline the publishing process and create a github action for it.
Btw, I didn’t use LLMs to write the app; it was still pretty straightforward.
https://github.com/runarberg/shodoku/tree/main/scripts
I think there is an argument to be made about the poor learning curve (ironic) for learning how to organize and use Anki
This is probably my biggest gripe. It’s so much information to digest at first, it’s a tough ask. And there’s so many rookie mistakes.
I made the mistake of just jumping in, and I would say I spent the first 6 months using Anki “wrong” in the sense that I would make bad cards, try to make multiple-choice questions, not enable or optimize FSRS, cap reviews, do Anki before I knew the material, etc.
I’m someone who loves to learn from scratch/figure it out myself. I would never recommend watching a YouTube tutorial or following a guide for something you can figure out yourself, but I have to make an exception for Anki. Anki is one of those rare things where it’s simply better to copy what someone else is doing and figure out adjustments for your own workflow over time.
You're right about the data model. I built my own flashcard app [1], so I've had to understand the database schema that Anki uses and a lot of it seems very inelegant, but on the other hand, I can see how it was purpose-built and probably grew over time and things were tacked on. (For instance, there's one table that has a single row, where several fields are simply JSON dictionaries.)
On the other hand, how templates work and how cloze deletions work are really nice. With the flashcard app I built, I didn't have templates, but did have a very basic cloze deletion system where you could mark text to be "hidden" on the front. It was very limited in that you'd only ever get one front/back combination. You could hide multiple bits of text, but they'd all be hidden at the same time. With Anki, you can create multiple groups of hidden text, so that you end up with multiple flashcards from the same note (i.e. you hid three separate groups of text, so you now have three different cards to test with).
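To illustrate the difference, here is a small sketch of the "one card per cloze group" idea using Anki-style {{c1::text}} markers. It is my own simplified illustration, not Anki's implementation, and cloze_cards is a made-up helper.

```python
import re

# Matches {{c1::hidden text}} and {{c1::hidden text::hint}}.
CLOZE = re.compile(r"\{\{c(\d+)::(.*?)(?:::(.*?))?\}\}")


def cloze_cards(text):
    """Return one (front, back) pair per cloze group number in the note."""
    numbers = sorted({int(m.group(1)) for m in CLOZE.finditer(text)})
    back = CLOZE.sub(lambda m: m.group(2), text)
    cards = []
    for number in numbers:
        def hide(match, number=number):
            # Only the current group is hidden; other groups show their text.
            if int(match.group(1)) != number:
                return match.group(2)
            hint = match.group(3)
            return f"[{hint}]" if hint else "[...]"
        cards.append((CLOZE.sub(hide, text), back))
    return cards


note = "{{c1::Tokyo}} is the capital of {{c2::Japan}}."
cards = cloze_cards(note)
```

Here one note with two groups yields two cards, each hiding a different part, which is exactly what my single front/back version could not do.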
I've been working on an update to my app to incorporate templates and cloze deletions like how Anki does it, so now I appreciate that aspect to it.
For my own database, at least in the new version I'm working on, I've ended up creating a schema where the individual attributes for each card are thrown into their own tables, but this is mostly because I needed to support updating individual attributes separately since I use a very simple journaling system to sync across devices. With Anki's schema, I can see why sync was complicated (at least in earlier versions) since it wasn't really built for it.
[1] https://www.freshcardsapp.com/
For language learning only, but perhaps my own https://thehardway.app model suits you better (flashcards inside markdown-ish notes)
Sorry, but your landing page is really awful. I’m not allowed to learn about your app because I’m on a tablet? And I have to give you my e-mail address, even though I know nothing about your app so far?
Suggestion: Allow me to browse your site and learn about your application, and I’ll decide if it’s interesting enough for me to open it on my desktop later.
You are right, I will get around to that. For what it's worth, I do not keep the email address.
How much will you be charging for the app? Is it using an LLM?
Yeah, if you're learning a language, have a group of restaurant words and another group about air travel, etc. Then the user will associate them.
For that I buy a vocabulary book, or a phrase book. As I read through it, and if I don’t know a word or understand a sentence, I try to learn it using various methods, and create an Anki card to keep it in memory. Anki is just to retain the word.
On the plus side, you can buy specialized vocab/phrase books; I have one just for onomatopoeia. Also, my beginner vocab books come with recordings from actual native-speaker voice actors, which I add to the deck. Much better quality than anything an LLM or speech synthesizer can give you.
What are you talking about? Anki explicitly has the concept of 'notes' from which one or more 'cards' are derived. You can absolutely easily make custom decks that only have certain card types.
Yeah I'm confused too. "I don’t like its data model. I wish the data model was instead <perfectly describes Anki's data model>".
Seems like a YC business in the making. There's demand for improvements, and you already have your pitch.
Sprinkle in AI and you'd be a shoo-in.