michelsedgh 10 days ago
I wish it was multimodal :(
leopoldj 10 days ago
google/gemma-3-4b-it is one of the smallest multimodal models I know. Works well on a 16GB GPU. Works slowly on an 8GB GPU. It can even be fine-tuned [1], which is where the real power comes from.
1. https://ai.google.dev/gemma/docs/core/huggingface_vision_fin...
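A rough back-of-the-envelope sketch of why 16GB is comfortable and 8GB is tight. It assumes about 4 billion parameters (read off the "4b" in the model name, not a measured count) and counts weights only, so activations, the KV cache, and framework overhead are ignored. The names `weight_memory_gib`, `ASSUMED_PARAMS`, and `estimate_all` are made up for illustration.

```python
# Weights-only memory estimate. All numbers here are assumptions for
# illustration, not measurements of google/gemma-3-4b-it.

ASSUMED_PARAMS = 4e9  # assumption: ~4 billion parameters, from the "4b" in the name

# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


def estimate_all(num_params: float = ASSUMED_PARAMS) -> dict[str, float]:
    """Estimate weights-only memory for every precision in BYTES_PER_PARAM."""
    return {
        dtype: weight_memory_gib(num_params, size)
        for dtype, size in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    for dtype, gib in estimate_all().items():
        print(f"{dtype:>8}: {gib:5.2f} GiB")
```

Under these assumptions, 16-bit weights alone come to roughly 7.5 GiB, which nearly fills an 8GB card before any inference state is allocated. That may be part of why it runs slowly there (offloading or spilling is a plausible cause, though that is a guess), while a 16GB card has headroom to spare.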