Comment by mrtesthah
1 year ago
It begs the question of whether we can supply a function to be called (e.g., one that compiles and runs code) to evaluate intermediate CoT results
1 year ago
It begs the question of whether we can supply a function to be called (e.g., one that compiles and runs code) to evaluate intermediate CoT results
It seems OpenAI has decided to keep the CoT results a secret. If they were to allow the model to call out to tools to help fill in the CoT steps, then this might reveal what the model is thinking - something they do not want the outside world to know about.
I could imagine OpenAI might allow their own vetted tools to be used, but perhaps it will be a while (if ever) before developers are allowed to hook up their own tools. The risks here are substantial. A model fine-tuned to run chain-of-thought that can answer graduate level physics problems at an expert level can probably figure out how to scam your grandma out of her savings too.
It's only a matter of time. When some other company releases the tool, they likely will too.
I have to agree with you here. OpenAI may be playing for competitive advantage more than for the good of humanity by hiding the results.
The answer is yes if you are willing to code it. OpenAI supports tool calls. Even if it didn't you could just make multiple calls to their API and submit the result of the code execution yourself.
The intermediate CoT results aren't in the API.
I may be mistaken but I don't believe the first version of the comment I replied to mentioned intermediate CoT results.