Comment by alansaber
9 days ago
Depends on the model size, batch size, input sequence length, etc. With a small model like this you'll never get a 'good' output, but you can maximise its potential.