Comment by accrual

3 days ago

Small update: thinking models also work well. I like that the UI shows the thinking stream in a fainter style while the model generates, then hides it and shows the final output when it's ready. The thinking output is still available with a click.
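
For anyone curious how a client separates those two streams: as I understand it, recent Ollama builds accept a think flag on /api/chat and then stream the reasoning and the answer as separate fields, which is presumably what lets the UI dim one and collapse it later. A rough Python sketch of that idea (the model name, the think flag, and the thinking field are assumptions on my part based on newer Ollama versions):

```python
# Minimal sketch: stream a thinking model's reasoning separately from its
# answer via Ollama's /api/chat endpoint. Assumes a recent Ollama with
# thinking support; "qwen3" is just an example thinking-capable model.
import json
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen3",  # example model, assumed to support thinking
        "messages": [{"role": "user", "content": "Why is the sky blue?"}],
        "think": True,     # assumed flag asking the model to emit reasoning
        "stream": True,
    },
    stream=True,
)

for line in resp.iter_lines():
    if not line:
        continue
    msg = json.loads(line).get("message", {})
    # Thinking tokens arrive in a separate field from the answer, so a UI
    # can render them differently (here: ANSI dim) and hide them when done.
    if msg.get("thinking"):
        print(f"\033[2m{msg['thinking']}\033[0m", end="", flush=True)
    if msg.get("content"):
        print(msg["content"], end="", flush=True)
print()
```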

Another commenter mentioned not being able to point the new UI at a remote Ollama instance. I agree, that would be super handy for running the UI on a slow machine while doing the inference on something more powerful.
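
For what it's worth, the server side already allows this: Ollama's API is plain HTTP on port 11434, and the CLI honors an OLLAMA_HOST environment variable, so it seems to be only the new UI that lacks a base-URL setting. A quick sketch of what talking to a remote instance looks like (the hostname and model below are placeholders, and the remote server would need to be started listening on 0.0.0.0 rather than just localhost):

```python
# Sketch of what "pointing at a remote Ollama" amounts to: the API is plain
# HTTP, so a client only needs a configurable base URL instead of localhost.
import requests

OLLAMA_BASE = "http://gpu-box.local:11434"  # hypothetical remote machine

resp = requests.post(
    f"{OLLAMA_BASE}/api/generate",
    json={"model": "llama3.2", "prompt": "Hello!", "stream": False},
)
print(resp.json()["response"])
```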