Comment by Deukhoofd
4 days ago
Using a fine-tuned GPT-3.5 model is definitely a choice. Why not pick one of the newer models? The methodology only states that they use "ChatGPT" as the LLM; it doesn't specify the model until way later, and never explains why they picked that model over other existing models.
The methodology also barely explains how they ran their baseline experiments with human DMs. Were they done face-to-face or via text? Did the 3 games they used as a baseline have different DMs, or was it the same person? As it stands, the research is barely reproducible.