Comment by nstart

20 days ago

Is this correct? My assumption is that all the data collected during usage is part of the RLHF loop of LLM providers. The assumption is based on information from books like Empire of AI, which specifically mention the intent of AI providers to further train/tune their models based on usage feedback (e.g. whenever I say the model is wrong in its response, that's human feedback which gets fed back into improving the model).

... for the next training run, sure (i.e. for a ChatGPT 5.1 -> 5.2 "upgrade"). For the current model? No.
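The distinction above can be sketched in code. This is a minimal, purely illustrative sketch (all class and function names are invented, not any provider's actual pipeline): usage feedback accumulates in a store that has no effect on the deployed model's weights, and only an offline training step consumes it to produce the next version.

```python
from dataclasses import dataclass, field

@dataclass
class FeedbackStore:
    # Hypothetical store: thumbs-up/down signals accumulate here, but
    # nothing in this store alters the deployed model at inference time.
    records: list = field(default_factory=list)

    def log(self, prompt: str, response: str, thumbs_up: bool) -> None:
        self.records.append(
            {"prompt": prompt, "response": response, "label": thumbs_up}
        )

class DeployedModel:
    # The serving model is frozen: responding never changes its "weights".
    def __init__(self, version: str):
        self.version = version

    def respond(self, prompt: str) -> str:
        return f"[{self.version}] answer to: {prompt}"

def next_training_run(old: DeployedModel, store: FeedbackStore) -> DeployedModel:
    # Only here, offline, does accumulated feedback influence anything --
    # e.g. as preference labels for an RLHF-style fine-tune producing v+1.
    _preference_data = [
        (r["prompt"], r["response"], r["label"]) for r in store.records
    ]
    return DeployedModel(version=old.version + ".next")

store = FeedbackStore()
model = DeployedModel("5.1")
reply = model.respond("Is the sky green?")
store.log("Is the sky green?", reply, thumbs_up=False)  # "model is wrong" signal
# The current model is untouched by the feedback:
assert model.version == "5.1"
# Feedback only matters when the next training run happens:
upgraded = next_training_run(model, store)
```

The point: "my feedback improves the model" is true only across training runs, not within a session with the currently deployed model.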