Comment by singpolyma3

6 days ago

Does anyone know what the quantization is with ollama models? They always just list parameter count.

I'm also a bit unsure of the trade-offs between a smaller quant and a smaller model.

run ollama show <name_of_model>:<parameters> and you'll get the info. E.g. ollama show qwen3.5:0.8b

  Model
    architecture        qwen35
    parameters          873.44M
    context length      262144
    embedding length    1024
    quantization        Q8_0
    requires            0.17.1

  Capabilities
    completion    
    vision        
    tools         
    thinking      

  Parameters
    presence_penalty    1.5     
    temperature         1       
    top_k               20      
    top_p               0.95    

  License
    Apache License               
    Version 2.0, January 2004
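
On the smaller quant vs smaller model question, a rough way to compare is to estimate the weight size from the parameter count and quantization that ollama show reports. The sketch below is only a back-of-the-envelope estimate: it assumes every tensor uses the listed format (real model files often mix formats), it ignores KV cache and runtime overhead, and it says nothing about output quality. The bits-per-weight figures come from the GGUF block layout, where Q8_0 and Q4_0 store 32 weights plus one 16-bit scale per block. The function name and the 1.7B model in the comparison are hypothetical, for illustration only.

```python
# Approximate bits per weight for a few GGUF formats.
# Q8_0 and Q4_0 pack 32 weights per block plus one 16-bit scale.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": (32 * 8 + 16) / 32,  # 8.5 bits per weight
    "Q4_0": (32 * 4 + 16) / 32,  # 4.5 bits per weight
}


def estimate_weight_bytes(n_params: float, quant: str) -> float:
    """Rough size of the weights alone, assuming one format for all tensors."""
    return n_params * BITS_PER_WEIGHT[quant] / 8


if __name__ == "__main__":
    candidates = [
        # Values reported by `ollama show qwen3.5:0.8b` above.
        ("0.8B at Q8_0", 873.44e6, "Q8_0"),
        # Hypothetical larger model at a smaller quant, for comparison.
        ("1.7B at Q4_0 (hypothetical)", 1.7e9, "Q4_0"),
    ]
    for label, n_params, quant in candidates:
        size_gb = estimate_weight_bytes(n_params, quant) / 1e9
        print(f"{label}: ~{size_gb:.2f} GB of weights")
```

Under these assumptions the two candidates land at roughly the same size, so the choice comes down to whether the extra parameters help more than the lower precision hurts, which you'd have to check on your own task.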