Comment by flux1krea

6 months ago

The local stack I actually find useful: Ollama + Llama 3.1 8B for chat/RAG, VSCode + Continue.dev for coding, Qdrant for lightweight retrieval, all on a 64 GB RAM desktop. Works offline with low latency; the main gotcha is blowing past the context window, which I handle with chunking plus rough token counting before each request (sketch below). Bonus: I A/B prompts/styles in the cloud first, then reproduce them in local ComfyUI. Disclosure: I built flux1krea.app to baseline prompts/styles before moving local: https://www.flux1krea.app
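
To make the "chunking + token monitoring" part concrete, here's a minimal sketch of the kind of loop I mean: chunk documents, embed and store them in an in-memory Qdrant collection, then pack retrieved chunks only until a rough token budget is hit before calling Ollama. Model names ("llama3.1:8b", "nomic-embed-text"), the character-based token estimate, and the chunk sizes are assumptions for illustration, not my exact setup; assumes `ollama` and `qdrant-client` are installed and the Ollama daemon is running with those models pulled.

```python
# Minimal local RAG sketch: chunking + token budgeting + Qdrant retrieval + Ollama chat.
import ollama
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

EMBED_MODEL = "nomic-embed-text"   # assumption: any local embedding model works here
CHAT_MODEL = "llama3.1:8b"
CTX_BUDGET_TOKENS = 6000           # leave headroom below the 8K context window

def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Naive fixed-size character chunking with overlap."""
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

def approx_tokens(text: str) -> int:
    """Rough token estimate (~4 chars per token); enough to catch context blow-ups."""
    return len(text) // 4

# Index some documents into an in-memory Qdrant collection.
client = QdrantClient(":memory:")
docs = ["...your notes or docs here..."]
chunks = [c for d in docs for c in chunk(d)]
dim = len(ollama.embeddings(model=EMBED_MODEL, prompt=chunks[0])["embedding"])
client.create_collection("notes",
                         vectors_config=VectorParams(size=dim, distance=Distance.COSINE))
client.upsert("notes", points=[
    PointStruct(id=i,
                vector=ollama.embeddings(model=EMBED_MODEL, prompt=c)["embedding"],
                payload={"text": c})
    for i, c in enumerate(chunks)
])

# Retrieve candidates, then pack chunks until the token budget is reached.
question = "What did I write about Qdrant?"
q_vec = ollama.embeddings(model=EMBED_MODEL, prompt=question)["embedding"]
hits = client.search(collection_name="notes", query_vector=q_vec, limit=10)

context, used = [], approx_tokens(question)
for h in hits:
    text = h.payload["text"]
    if used + approx_tokens(text) > CTX_BUDGET_TOKENS:
        break  # the "token monitoring" part: stop before the window blows up
    context.append(text)
    used += approx_tokens(text)

resp = ollama.chat(model=CHAT_MODEL, messages=[
    {"role": "system", "content": "Answer using only the provided context."},
    {"role": "user", "content": "\n\n".join(context) + "\n\nQuestion: " + question},
])
print(resp["message"]["content"])
```

The character-count budget is deliberately crude; a real tokenizer would be tighter, but this is cheap and catches the blow-ups before they hit the model.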