Comment by macrolime
2 years ago
Any particular reason why that shouldn't work well with fine-tuning of an LLM using reinforcement learning?