Comment by CountGeek

4 months ago

So could I in practice train it on all my psychology books, materials, reports, case studies and research papers and then run it on demand on a 1xH100 node - https://getdeploying.com/reference/cloud-gpu/nvidia-h100 - whenever I have a specialised question?

You could indeed, but the performance would be abysmal. For this kind of use case, it would be a LOT better to use a small pre-trained model and either fine-tune it on your materials, or use some kind of RAG workflow (possibly both).
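
To make the RAG half concrete, here's a minimal sketch. It uses plain TF-IDF retrieval just to keep the example dependency-light (a real setup would use an embedding model), and `ask_llm` is only a stand-in for whatever model or API you'd actually call:

```python
# Minimal RAG sketch: retrieve the most relevant chunks from your own
# documents and prepend them to the question. TF-IDF keeps the example
# dependency-light; a real setup would use an embedding model instead.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def build_index(chunks):
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(chunks)
    return vectorizer, matrix

def retrieve(question, chunks, vectorizer, matrix, k=3):
    q_vec = vectorizer.transform([question])
    scores = cosine_similarity(q_vec, matrix)[0]
    top = scores.argsort()[::-1][:k]
    return [chunks[i] for i in top]

def answer(question, chunks, ask_llm):
    vectorizer, matrix = build_index(chunks)
    context = "\n\n".join(retrieve(question, chunks, vectorizer, matrix))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return ask_llm(prompt)  # ask_llm is a placeholder for your model call
```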

  • > it would be a LOT better to use a small pre-trained model and either fine-tune it on your materials, or use some kind of RAG workflow (possibly both).

    I noticed NewRelic has a chat feature that does this sort of thing. It's scoped very narrowly down to their website and analytics DSL, and generates charts/data from their db. I've always wondered how they did that (specifically in terms of setting up the training/RAG + guardrails). It's super useful.

    • You might be able to figure that out just by asking it - see if you can get it to spit out a copy of the system prompt or tell you what tools it has access to.

      The most likely way of building that would be to equip it with a "search_docs" tool that lets it look up relevant information for your query. No need to train an extra model at all if you do that.
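
      Roughly, that pattern looks like the sketch below. The docs corpus and the `ask_model` call are placeholders, and the actual tool-schema plumbing depends on whichever LLM API you use:

      ```python
      # Sketch of the "give the model a search_docs tool" pattern.
      # DOCS and ask_model() are placeholders; the tool-calling plumbing
      # depends on whichever chat API you're building against.

      DOCS = {
          "charts-dsl": "How to write queries in the analytics DSL ...",
          "dashboards": "How dashboards and charts are configured ...",
      }

      def search_docs(query: str, k: int = 2) -> list[str]:
          """Naive keyword search over the docs; a real system would use
          an embedding index, but the interface is the same."""
          terms = query.lower().split()
          scored = [
              (sum(t in text.lower() for t in terms), name, text)
              for name, text in DOCS.items()
          ]
          scored.sort(reverse=True)
          return [text for score, name, text in scored[:k] if score > 0]

      def handle_question(question: str, ask_model) -> str:
          # Run the tool, then feed the results back to the model as the
          # only context it is allowed to answer from (the "guardrail").
          results = search_docs(question)
          prompt = (
              "Answer only from the documentation snippets below.\n\n"
              + "\n---\n".join(results)
              + f"\n\nUser question: {question}"
          )
          return ask_model(prompt)  # placeholder for the real chat/tool API
      ```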

You could, but it would be significantly worse than fine-tuning or RAG with a pre-trained model, or just using a smaller model, since your dataset would be so small.

Yes, though it's possible that a more general core model, further enhanced with some other way of bringing those texts of interest into the working context, might perform better.

Those other ways to integrate the texts might be some form of RAG or other ideas like Apple's recent 'hierarchical memories' (https://arxiv.org/abs/2510.02375).

You could! But as others have mentioned, the performance would be poor. If you really wanted to see more of a performance boost from pretraining, you could try to create a bigger chunk of data to train on. You'd do that either by generating synthetic data from your material, or by finding adjacent material to add to it. Here's a good paper about it: <https://arxiv.org/abs/2409.07431>
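
As a rough illustration of the synthetic-data idea (a sketch in the spirit of that paper, not its exact method): take each passage from your material, ask a model for paraphrases, Q&A pairs and explanations, and collect the results as an expanded corpus. `generate` here is a placeholder for whatever LLM client you'd actually use:

```python
# Rough sketch of expanding a small corpus with synthetic data before a
# continued-pretraining run. generate() is a placeholder for your LLM client.
import json

PROMPTS = [
    "Rewrite the following passage in your own words:\n\n{passage}",
    "Write three question-answer pairs grounded in this passage:\n\n{passage}",
    "Explain how the concepts in this passage relate to each other:\n\n{passage}",
]

def synthesize(passages, generate, out_path="synthetic_corpus.jsonl"):
    """For each source passage, produce several synthetic variants and
    dump them as JSONL, ready to feed into a pretraining pipeline."""
    with open(out_path, "w") as f:
        for passage in passages:
            for template in PROMPTS:
                text = generate(template.format(passage=passage))
                f.write(json.dumps({"text": text}) + "\n")
```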