schopra909 (5 days ago):
Should be fixed now! Thanks again for the heads up.

streamer45 (5 days ago):
All good, cheers!

schopra909 (5 days ago):
Per the RAM comment, you may be able to get it running locally with two tweaks: https://github.com/Linum-AI/linum-v2/blob/298b1bb9186b5b9ff6...

1) Free up the T5 as soon as the text is encoded, so you reclaim GPU RAM.
2) Manual layer offloading: move layers off the GPU once they're done being used, to free up space for the remaining layers + activations.

dsrtslnd23 (5 days ago):
Any idea of the minimum VRAM footprint with those tweaks? 20GB seems high for a 2B model. I guess the T5 encoder is responsible for that.
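The two tweaks above can be sketched in PyTorch. This is a minimal illustration only: the toy `nn.Linear` modules stand in for the T5 text encoder and the diffusion transformer blocks, and none of these names come from the linked repo.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real models (the repo uses a T5 text
# encoder and a larger transformer; these toys just show the memory pattern).
text_encoder = nn.Linear(16, 32)                               # "T5 encoder"
layers = nn.ModuleList(nn.Linear(32, 32) for _ in range(4))    # "DiT blocks"

device = "cuda" if torch.cuda.is_available() else "cpu"

# Tweak 1: encode the prompt, then drop the encoder before denoising starts.
text_encoder.to(device)
with torch.no_grad():
    cond = text_encoder(torch.randn(1, 16, device=device))
del text_encoder
if device == "cuda":
    torch.cuda.empty_cache()  # reclaim the encoder's VRAM immediately

# Tweak 2: manual layer offloading -- only the active layer lives on the GPU,
# leaving room for the remaining layers' weights and the activations.
x = cond
with torch.no_grad():
    for layer in layers:
        layer.to(device)   # bring this block onto the GPU
        x = layer(x)
        layer.to("cpu")    # evict it so the next block has room
print(x.shape)
```

The trade-off is the usual one: per-layer host-to-device transfers add latency on every forward pass, in exchange for a peak-VRAM footprint of roughly one layer's weights plus activations instead of the whole model.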