Comment by andai

1 day ago

Could you elaborate on the multi-level LLM workflow? Did you set up a benchmark and have an LLM mutate the prompts?


metiscus 21 hours ago

But yes, the runbook in the project gives the LLM instructions on how to use the scripts and what to modify. I let Claude Code read that and tell it to work on a province. It runs a small segment and analyzes the results, repeating until it hits 1-2% error with no systemic errors. If it can't get all the errors out, I have it switch to gemini-flash-lite-latest instead of 2.5, which costs slightly more but performs much better. Basically, Claude Code runs a self-governing loop, with my oversight, mutating the prompts and data inputs to extract all the names.

EDIT: My instructions to the supervising LLM are here: https://github.com/metiscus/roman-names/blob/feature/webapp-...
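
For anyone trying to picture the loop described in this comment, here is a rough sketch of its control flow. It is not taken from the repo: `Config`, `Report`, `refine`, `run_segment`, and `mutate_prompt` are hypothetical names, the 2% threshold is just the upper end of the "1-2%" figure, and `max_rounds` is an assumed cutoff for deciding when to switch models. In the real workflow the supervising LLM does the analysis and mutation itself by following the runbook.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Config:
    model: str   # extraction model name
    prompt: str  # prompt given to the extraction model


@dataclass(frozen=True)
class Report:
    error_rate: float           # fraction of names extracted incorrectly
    systemic_errors: List[str]  # recurring failure patterns in the segment


def refine(
    config: Config,
    run_segment: Callable[[Config], Report],           # hypothetical: runs a small segment
    mutate_prompt: Callable[[Config, Report], Config],  # hypothetical: supervisor's edit step
    fallback_model: str,
    target_error: float = 0.02,  # assumption: upper end of "1-2% error"
    max_rounds: int = 5,         # assumption: rounds to try before switching models
) -> Config:
    """Iterate on a small segment until the error target is met."""
    # Try the starting model first, then the fallback model.
    for model in (config.model, fallback_model):
        config = replace(config, model=model)
        for _ in range(max_rounds):
            report = run_segment(config)
            if report.error_rate <= target_error and not report.systemic_errors:
                return config
            # Let the supervisor adjust the prompt based on the analysis.
            config = mutate_prompt(config, report)
    # Neither model reached the target: hand back to the human overseer.
    raise RuntimeError("error target not reached; needs human review")
```

A caller would supply `run_segment` and `mutate_prompt` as whatever actually runs the scripts and edits the prompts, and pass `"gemini-flash-lite-latest"` as `fallback_model`. Raising at the end stands in for the human oversight step.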

metiscus 1 day ago

Full code and data available at https://github.com/metiscus/roman-names/

The running version on new. is the webapp branch. Eventually I will get it all fixed up.