Comment by nl

11 hours ago

Do you have any examples or data on the discriminatory power of the model for tool use?

The examples are things like "What is the weather in San Francisco", where you are only passed a tool like

  tools='[{"name":"get_weather","parameters":{"location":"string"}}]',

I had a thing[1] over 10 years ago that could handle this kind of problem using SPARQL and knowledge graphs.

My question is how effective it is at handling ambiguity.

Can I send it something like a text message "lets catch up at coffee tomorrow 10:00" and a command like "save this" and have it choose an "add appointment" action from hundreds (or even tens) of possible tools?

[1] https://github.com/nlothian/Acuitra/wiki/About

Thanks to the Hugging Face link below, I tested it and I'm not impressed. Prompt: i need to contact my boss i will be late. Result: 20mins [{"name":"set_timer","arguments":{"time_human":"20 minutes"}}]. It didn't use the email tool, and I tried 2-3 different ways of asking it.

  • Query: context: { "boss_email": "bigboss69420@corporatepersonhood.net", "upcoming_meetings": [{ with: "bigboss69420@corporatepersonhood.net", "time": "11:00" }] } user: i need to contact my boss i will be late, could you tell him I'll be 15 minutes late?

    Output: [{"name":"send_email","arguments":{"to":"bigboss69420@corporatepersonhood.net","subject":"upcoming_meetings","body":"I'll be 15 minutes late"}},{"name":"send_email","arguments":{"to":"bigboss69420@corporatepersonhood.net","subject":"time","body":"I'll be 15 minutes late"}},{"name":"send_email","arguments":{"to":"bigboss69420@corporatepersonhood.net","subject":"time","body":"I'll be 15 minutes late"}}]

    Context definitely helps. But yeah, the quality doesn't seem to be too high. To be fair, it makes you realise that not only is parameter extraction required, but also content generation (the email body). The 3 tool calls would also need debouncing.

    Maybe under very specific circumstances/very tight harness this sort of model would be useful?
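
    For the debouncing part, here is a minimal sketch of what a harness could do with the output above. It is not anything Needle ships; `debounce_tool_calls` and its `ignore_args` parameter are made-up names for illustration, and treating calls that differ only in `subject` as duplicates is an assumption about what the user wants.

```python
import json


def debounce_tool_calls(calls, ignore_args=()):
    """Drop repeated tool calls, keeping the first occurrence of each.

    Two calls count as duplicates when they share a name and every argument
    except those named in ignore_args (a hypothetical knob for this sketch).
    """
    seen = set()
    unique = []
    for call in calls:
        args = {
            key: value
            for key, value in call.get("arguments", {}).items()
            if key not in ignore_args
        }
        # Serialise with sorted keys so argument order does not matter.
        fingerprint = (call.get("name"), json.dumps(args, sort_keys=True))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(call)
    return unique


# The model output quoted in this thread, verbatim.
raw_output = '''[{"name":"send_email","arguments":{"to":"bigboss69420@corporatepersonhood.net","subject":"upcoming_meetings","body":"I'll be 15 minutes late"}},{"name":"send_email","arguments":{"to":"bigboss69420@corporatepersonhood.net","subject":"time","body":"I'll be 15 minutes late"}},{"name":"send_email","arguments":{"to":"bigboss69420@corporatepersonhood.net","subject":"time","body":"I'll be 15 minutes late"}}]'''

calls = json.loads(raw_output)
exact = debounce_tool_calls(calls)                            # exact duplicates only
loose = debounce_tool_calls(calls, ignore_args=("subject",))  # same recipient and body

print(len(calls), len(exact), len(loose))  # prints: 3 2 1
```

    Exact matching still leaves two emails, because the first call has a different subject. Only the looser key collapses them into one, and that one keeps the nonsensical "upcoming_meetings" subject, so debouncing alone doesn't fix the content generation problem.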

  • works for me:

    input: i need to contact my boss i will be late. output: [{"name":"send_email","arguments":{"to":"boss@company.com","subject":"Running late","body":"I will be late for the meeting."}}]

    it did have the send_email tool on the left hand side though

    • Boss: what meeting are you talking about..?

      In the ideal scenario, the boss also uses Needle, which checks emails and schedules a late meeting with whoever sent that email.

      Needle on the other side receives the invite for a late meeting, and notifies OP he's got a 67% chance of getting fired today.
