Comment by tmzt

19 days ago

Same. I switched my efforts to a larger Gemma 4 MoE model (26B-A4B) and llama.cpp and started getting meaningful results. I also implemented subagents for querying, deciding which object/action to execute, and composing a short title. This all runs on an M4 in approximately 16 GB of RAM. I'm also using Google's native tool-calling channels.
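
For anyone wanting to try the same split, here is a rough sketch of how three subagents like that could be chained. It is not my actual code: `run_subagents`, `Result`, the prompts, and the `complete` callable are all made up for illustration, and `complete` stands in for whatever sends a prompt to your local model (a llama.cpp server, for example). Tool calling is left out.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns model text.
# How it reaches the model (e.g. a local llama.cpp server) is up to the caller.
Complete = Callable[[str], str]


@dataclass
class Result:
    query: str   # output of the querying subagent
    action: str  # object/action chosen by the second subagent
    title: str   # short title from the third subagent


def run_subagents(user_message: str, complete: Complete) -> Result:
    """Run three narrow prompts in sequence instead of one large prompt."""
    # Subagent 1: turn the raw message into a focused query.
    query = complete(f"Rewrite as a search query:\n{user_message}").strip()
    # Subagent 2: decide which object/action to execute for that query.
    action = complete(f"Pick one object/action for this query:\n{query}").strip()
    # Subagent 3: compose a short title for the conversation.
    title = complete(f"Write a title of at most six words for:\n{user_message}").strip()
    return Result(query=query, action=action, title=title)
```

Keeping each subagent to a single narrow prompt is the point of the split: each call gets a short context and one job, which tends to be easier for a small local model than one prompt that does everything.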