
[Bug]: Reopen: Missing Tool Outputs error when using Azure OpenAI Assistants, when the Assistant planner calls the same tool multiple times in a run #1705 #2154

Closed
andres-swax opened this issue Oct 30, 2024 · 7 comments
Labels: bug (Something isn't working)


andres-swax commented Oct 30, 2024

Language

Python

Version

1.4.1 and previous

Description

I originally wrote this as a comment on https://github.com/microsoft/teams-ai/issues/1705 (already closed due to inactivity). I am reproducing here my resolution, which involves making changes to the Teams AI library (TTK.AI).


I have experienced this ever since I started working with the Assistants version of the template, and hoped the issue would just go away [yeah, right]. Prompts such as Where should I go this weekend, Los Angeles or Las Vegas? I prefer warm weather., or something similar, trigger this scenario.

This is an example of the errors I get. Notice that the error is shown three times, I suppose as the exception bubbles up. None of the text below is mine or my code's:

[TURN ERROR] Unhandled error: Error code: 400 - {'error': {'message': "Expected tool outputs for call_ids ['call_5ceIbNb7pLs3R7aNMdNXDxmq', 'call_bq7VJMlniIUVR8Pil2tk9unH'], got ['call_bq7VJMlniIUVR8Pil2tk9unH']", 'type': 'invalid_request_error', 'param': None, 'code': None}}
Error parameters/arguments:
  Error code: 400 - {'error': {'message': "Expected tool outputs for call_ids ['call_5ceIbNb7pLs3R7aNMdNXDxmq', 'call_bq7VJMlniIUVR8Pil2tk9unH'], got ['call_bq7VJMlniIUVR8Pil2tk9unH']", 'type': 'invalid_request_error', 'param': None, 'code': None}}
Error code: 400 - {'error': {'message': "Expected tool outputs for call_ids ['call_5ceIbNb7pLs3R7aNMdNXDxmq', 'call_bq7VJMlniIUVR8Pil2tk9unH'], got ['call_bq7VJMlniIUVR8Pil2tk9unH']", 'type': 'invalid_request_error', 'param': None, 'code': None}}

The 400 error from the LLM states that it received the return value of one call with the correct call ID, but the LLM expects the results of BOTH tool calls in the same response, so it rejects the submission with HTTP 400.
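The Assistants API requires that outputs for all pending tool calls in a run be submitted together in a single submit_tool_outputs request. A minimal sketch of that contract (this illustrates the server-side check only; `validate_submission` is a hypothetical name, not Azure's actual implementation):

```python
def validate_submission(expected_call_ids, submitted_outputs):
    """Raise if any pending tool call is missing an output (hypothetical check)."""
    submitted_ids = {o["tool_call_id"] for o in submitted_outputs}
    missing = set(expected_call_ids) - submitted_ids
    if missing:
        raise ValueError(
            f"Expected tool outputs for call_ids {sorted(expected_call_ids)}, "
            f"got {sorted(submitted_ids)}"
        )

# Two parallel calls to getNickname, but only one output survived the overwrite:
expected = ["call_5ceI", "call_bq7V"]
outputs = [{"tool_call_id": "call_bq7V", "output": "The Golden City"}]
try:
    validate_submission(expected, outputs)
except ValueError as e:
    print(e)  # mirrors the shape of the 400 error message above
```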

Reproduction Steps

  1. Create an assistant in Azure AI Studio -> Assistants Playground, following the source code for the Assistants sample (I would need to search for how I found it; I think it is in the current, as of 2024/10/20, version of the Teams AI Assistants documentation). The complete assistant definition is pasted at the end of this post.

  2. Create a simple sample app from Teams Toolkit [TTK version 5.10, current as of 2024/10/28] as follows: Teams Toolkit -> New Project -> Custom Engine Agent -> AI Agent -> Build with Assistants API (Preview) -> Python.

  3. Once the assistant is running, test it by asking for:

3a. Something that triggers two calls to different, sequential tools. It runs them one after the other, so the output of one becomes the input of the other. It will work. Example:
USER: Give me the weather of the city nicknamed The Golden City
--- Tool call to getNickname(The Golden City) --> tool returns San Francisco to the LLM
--- Tool call to getWeather(San Francisco) --> tool returns 44 C to the LLM
LLM: (for example) San Francisco is a little warm today, it is 44 degrees Celsius

3b. Something that triggers two simultaneous calls to the same tool, where no input depends on the output of another call. The stack trace and error message make clear that the calls were made in parallel. It will fail, 100% repeatable:

USER: Give me the name of both cities nicknamed Sin City and The Golden City.

There is one POST from the LLM to the tool layer, with an array of calls, one per parameter (city in this case). I believe this means one run, two calls.

Python / tool implementation:
--- Tool call to getNickname(Sin City) --> tool returns a proper response
--- Tool call to getNickname(The Golden City) --> tool executes, but the response never goes back to the LLM

LLM: Error 400.
From this point forward, the Teams agent (or the bot?) will not reply to any requests/posts; it seems mute, stunned, or just dead until the run that triggered the 400 error expires. Then it may or may not process the other accumulated inputs, I suppose depending on whether they have expired.

Another example: How is the weather in both Vegas and San Francisco? triggers the same problem.
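The sequential case in 3a can be sketched with plain functions, where each round trip feeds the next tool call (the stub tools and their toy data below are hypothetical, not the library's machinery; note the 3a prompt uses getNickname as a reverse lookup, nickname to city):

```python
def get_city_by_nickname(nickname: str) -> str:
    # Reverse lookup used by the 3a prompt: nickname -> city (toy data).
    cities = {"The Golden City": "San Francisco"}
    return cities.get(nickname, "unknown")

def get_weather(city: str) -> str:
    # Toy weather data matching the example transcript.
    weather = {"San Francisco": "44 C"}
    return weather.get(city, "no data")

# Sequential chaining: the output of the first call is the input of the second,
# so the planner resolves one tool call per round trip and never has to
# submit two outputs at once. This is why 3a works and 3b fails.
city = get_city_by_nickname("The Golden City")
report = get_weather(city)
print(f"{city}: {report}")
```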


Code for the Assistant definition, copied from the Assistants Playground on Azure AI Studio:

{
  "tools": [
    {
      "type": "code_interpreter"
    },
    {
      "type": "function",
      "function": {
        "name": "getCurrentWeather",
        "description": "Get the weather in location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state e.g. Seattle, WA"
            },
            "unit": {
              "type": "string",
              "enum": [
                "c",
                "f"
              ]
            }
          },
          "required": [
            "location"
          ]
        },
        "strict": false
      }
    },
    {
      "type": "function",
      "function": {
        "name": "getNickname",
        "description": "Get the nickname of a city",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state e.g. Seattle, WA"
            }
          },
          "required": [
            "location"
          ]
        },
        "strict": false
      }
    }
  ],
  "name": "Cities and Weather",
  "instructions": "You are an intelligent bot that can write and run code to answer math questions use the provided functions to answer questions",
  "model": "az-oai-gpt-4o",
  "tool_resources": {
    "code_interpreter": {
      "file_ids": []
    }
  },
  "temperature": 1,
  "top_p": 1
}

Code for the tool implementations (called from the LLM)

@bot_app.ai.action("getCurrentWeather")
async def get_current_weather(context: ActionTurnContext, state: TurnState):
    weather_data = {
        'San Francisco, CA': {'f': '71.6F', 'c': '22C'},
        'Los Angeles, CA': {'f': '75.2F', 'c': '24C'},
        'New York, NY': {'f': '44.2F', 'c': '17C'},
    }
    location = context.data.get("location")
    unit = context.data.get("unit", "f")
    wdl = weather_data.get(location)
    if not wdl:
        return f"No weather data for {location} found"
    return wdl[unit]



@bot_app.ai.action("getNickname")
async def get_nickname(context: ActionTurnContext, state: TurnState):
    nicknames = {
        'San Francisco, CA': 'The Golden City',
        'Los Angeles': 'LA',
    }
    location = context.data.get("location")
    nickname = nicknames.get(location)
    return str(nickname) if nickname else f"No nickname for {location} found"

EDIT 5 mins later: Formatting and fixing the second sample prompt that triggers an error

@andres-swax andres-swax added the bug Something isn't working label Oct 30, 2024
@andres-swax
Author

This is my analysis of the problem and fix. Pardon the verbosity.

I found the problem: it is triggered when the same tool is called more than once in a run.

The Teams AI library keeps a dictionary of pending tool calls, tool_map, but it was keyed only by tool name, and the same tool can be called multiple times in the same run (each call with its own call_id). When preparing the response for delivery, iterating over the actual results would overwrite the result stored for the previous call_id, so only the last one survived. When the results were posted back to the LLM, the LLM therefore did not receive all of them.

I have fixed it, but I don't know how to submit the actual fix to the correct repository, or the protocol. Also, I fixed this scenario but did not explore the impact anywhere else, testing included. Current tests should still pass, though, because existing functionality was not changed.

The fix consists of changing tool_map from a Dict mapping each tool name to a single call_id to a dict[str, list] mapping each tool name to the list of call_ids for that tool in the run. This seems to have been the original intent of the code.

In short, runsList[callId] became runsList[toolName][callId]: outputs are now keyed first by tool name, then by call_id.
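The overwrite can be demonstrated with plain dicts (the call data below is hypothetical; tool_map mirrors the library's SUBMIT_TOOL_OUTPUTS_MAP only in shape):

```python
# Two parallel calls to the same tool in one run, each with its own call_id.
tool_calls = [
    {"name": "getNickname", "id": "call_A"},
    {"name": "getNickname", "id": "call_B"},
]

# Buggy shape: one call_id per tool name, so call_A is silently overwritten.
buggy_map = {}
for call in tool_calls:
    buggy_map[call["name"]] = call["id"]
assert buggy_map == {"getNickname": "call_B"}  # call_A is lost

# Fixed shape: a list of call_ids per tool name preserves every call.
fixed_map: dict[str, list] = {}
for call in tool_calls:
    fixed_map.setdefault(call["name"], []).append(call["id"])
assert fixed_map == {"getNickname": ["call_A", "call_B"]}
```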

Fix in two places, both under teams_ai-1.4.1.dist-info:
- teams/ai/ai.py
- teams/ai/planners/assistants_planner.py

Below is the before and after, exported as a git diff. I'm not certain this is the best way to pass this info, though.

diff --git a/src/ai_toolkit_fixes/teams.ai.ap.py b/src/ai_toolkit_fixes/teams.ai.ap.py
index 562dbfd..6bb0104 100644
--- a/src/ai_toolkit_fixes/teams.ai.ap.py
+++ b/src/ai_toolkit_fixes/teams.ai.ap.py
@@ -171,7 +171,8 @@ class AI(Generic[StateT]):
                     # Set output for action call
                     if command.action_id:
                         loop = True
-                        state.temp.action_outputs[command.action_id] = output or ""
+                        if not command.action in state.temp.action_outputs: state.temp.action_outputs[command.action] = {}       #FIXED
+                        state.temp.action_outputs[command.action][command.action_id] = output or ""                              #FIXED
                     else:
                         loop = len(output) > 0
                         state.temp.action_outputs[command.action] = output
diff --git a/src/ai_toolkit_fixes/teams.ai.planners.assistants_planner.py b/src/ai_toolkit_fixes/teams.ai.planners.assistants_planner.py
index 0df36ca..ad7c0fb 100644
--- a/src/ai_toolkit_fixes/teams.ai.planners.assistants_planner.py
+++ b/src/ai_toolkit_fixes/teams.ai.planners.assistants_planner.py
@@ -264,16 +264,17 @@ class AssistantsPlanner(Generic[StateT], _UserAgent, Planner[StateT]):
 
         # Map the action outputs to tool outputs
         action_outputs = state.temp.action_outputs
-        tool_map = state.get(SUBMIT_TOOL_OUTPUTS_MAP)
+        tool_map : dict[str,list] = state.get(SUBMIT_TOOL_OUTPUTS_MAP)                                                        #FIXED
         tool_outputs: List[ToolOutput] = []
 
-        for action in action_outputs:
-            output = action_outputs[action]
-            if tool_map:
-                tool_call_id = tool_map[action] if action in tool_map else None
-                if tool_call_id is not None:
-                    # Add required output only
-                    tool_outputs.append(ToolOutput(tool_call_id=tool_call_id, output=output))
+        if tool_map:                                                                                                          #FIXED
+            for action in action_outputs:                                                                                     #unchanged
+                if action in tool_map:                                                                                        #FIXED
+                    for tool_call_id in tool_map[action]:                                                                     #FIXED
+                        output = action_outputs[action][tool_call_id] if tool_call_id in action_outputs[action] else None     #FIXED
+                        if output is not None:                                                                                #FIXED
+                            # Add required output only
+                            tool_outputs.append(ToolOutput(tool_call_id=tool_call_id, output=output))
 
         # Submit the tool outputs
         if assistants_state.thread_id and assistants_state.run_id:
@@ -329,12 +330,14 @@ class AssistantsPlanner(Generic[StateT], _UserAgent, Planner[StateT]):
 
     def _generate_plan_from_tools(self, state: TurnState, required_action: RequiredAction) -> Plan:
         plan = Plan()
-        tool_map: Dict = {}
+        tool_map: dict[str,list] = {}                                                                     #FIXED
         for tool_call in required_action.submit_tool_outputs.tool_calls:
-            tool_map[tool_call.function.name] = tool_call.id
+            if not tool_call.function.name in tool_map : tool_map[tool_call.function.name] = []           #FIXED
+            tool_map[tool_call.function.name].append(tool_call.id)                                        #FIXED
             plan.commands.append(
                 PredictedDoCommand(
                     action=tool_call.function.name,
+                    action_id=tool_call.id,                                                               #FIXED
                     parameters=json.loads(tool_call.function.arguments),
                 )
             )
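The fixed mapping loop can be exercised in isolation. The sketch below mirrors its logic with plain dicts standing in for the library's state and ToolOutput type (all names here are illustrative, not the library's API):

```python
def map_outputs(tool_map: dict[str, list], action_outputs: dict[str, dict]) -> list[dict]:
    """Emit one output entry per call_id so no parallel call is dropped."""
    tool_outputs = []
    for action in action_outputs:
        if action in tool_map:
            for tool_call_id in tool_map[action]:
                output = action_outputs[action].get(tool_call_id)
                if output is not None:
                    # Add required output only
                    tool_outputs.append({"tool_call_id": tool_call_id, "output": output})
    return tool_outputs

# Two calls to getNickname in one run now both reach the LLM.
tool_map = {"getNickname": ["call_A", "call_B"]}
action_outputs = {"getNickname": {"call_A": "Sin City", "call_B": "The Golden City"}}
for entry in map_outputs(tool_map, action_outputs):
    print(entry["tool_call_id"], "->", entry["output"])
```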

@singhk97
Collaborator

Thanks for deep diving into the issue. We've added this to the backlog and will be tackling it soon. Looks like you have a fix for this so feel free to open a PR if you want.

@singhk97 singhk97 self-assigned this Oct 31, 2024
@lilyydu
Contributor

lilyydu commented Nov 4, 2024

closing as duplicate with #1705

@lilyydu lilyydu closed this as not planned (duplicate) on Nov 4, 2024
@andres-swax
Author

andres-swax commented Nov 4, 2024

@lilyydu, could you explain the rationale behind 'not planned', if possible? This is a bug that prevents multiple calls to the same tool by the AI, with no workaround.

For example, How is the weather in both Vegas and San Francisco? cannot be run using the Assistants API on Python. No workaround, even in the prompt. Tried.

EDIT: Added example.

@lilyydu
Contributor

lilyydu commented Nov 4, 2024

@andres-swax GH groups the tag as "not planned" but this includes "marking as duplicate ticket". We are still addressing this issue via #1705 (#2154 (comment))

@andres-swax
Author

> @andres-swax GH groups the tag as "not planned" but this includes "marking as duplicate ticket". We are still addressing this issue via #1705 (#2154 (comment))

😬😖😊 so very sorry, saw the other one as closed as well. Not familiar with GH procedures/etiquette my bad.

@lilyydu
Contributor

lilyydu commented Nov 4, 2024

> @andres-swax GH groups the tag as "not planned" but this includes "marking as duplicate ticket". We are still addressing this issue via #1705 (#2154 (comment))

> 😬😖😊 so very sorry, saw the other one as closed as well. Not familiar with GH procedures/etiquette my bad.

no worries at all, the wording is quite confusing! the other issue is open & in progress now
