Comment by nstart

20 days ago

Is this correct? My assumption is that all the data collected during usage is part of the RLHF loop of LLM providers. The assumption is based on information from books like Empire of AI, which specifically mention the intent of AI providers to further train/tune their models based on usage feedback (e.g. whenever I say the model is wrong in its response, that's human feedback which gets fed back into improving the model).

... for the next training run, sure (i.e. for a ChatGPT 5.1 -> 5.2 "upgrade"). For the current model? No.
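The distinction above can be sketched in code. This is a minimal, purely illustrative sketch (all class and function names are invented, not any provider's actual pipeline): usage feedback accumulates in a store that has no effect on the deployed model's weights, and only an offline training step consumes it to produce the next version.

```python
from dataclasses import dataclass, field

@dataclass
class FeedbackStore:
    # Hypothetical store: thumbs-up/down signals accumulate here, but
    # nothing in this store alters the deployed model at inference time.
    records: list = field(default_factory=list)

    def log(self, prompt: str, response: str, thumbs_up: bool) -> None:
        self.records.append(
            {"prompt": prompt, "response": response, "label": thumbs_up}
        )

class DeployedModel:
    # The serving model is frozen: responding never changes its "weights".
    def __init__(self, version: str):
        self.version = version

    def respond(self, prompt: str) -> str:
        return f"[{self.version}] answer to: {prompt}"

def next_training_run(old: DeployedModel, store: FeedbackStore) -> DeployedModel:
    # Only here, offline, does accumulated feedback influence anything --
    # e.g. as preference labels for an RLHF-style fine-tune producing v+1.
    _preference_data = [
        (r["prompt"], r["response"], r["label"]) for r in store.records
    ]
    return DeployedModel(version=old.version + ".next")

store = FeedbackStore()
model = DeployedModel("5.1")
reply = model.respond("Is the sky green?")
store.log("Is the sky green?", reply, thumbs_up=False)  # "model is wrong" signal
# The current model is untouched by the feedback:
assert model.version == "5.1"
# Feedback only matters when the next training run happens:
upgraded = next_training_run(model, store)
```

The point: "my feedback improves the model" is true only across training runs, not within a session with the currently deployed model.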