
feat (ai/core): Proposal for generate code API with tools #4196

Draft · wants to merge 5 commits into main
Conversation

@rajatsandeepsen commented Dec 24, 2024

Proposal for generateCode API

Imagine the LLM can write JavaScript code with custom logic, with the help of a limited set of tools, inside a safe eval().

Usage

Let me show you an example of how to build an AI-powered banking app.

import { z } from "zod"
import { tool, experimental_generateCode as generateCode } from "ai"
import { model } from "./model"

Define a set of tools

let balance = 30 // simulation of a DB
const history = [
    { amount: 10, from: "Bob" },
    { amount: 20, to: "Alice" },
]

const tools = ({
    getBalance: tool({
        description: "get balance of the user",
        parameters: z.object({}),
        execute: async () => {
            return balance
        },
        returns: z.number()
    }),
    sentMoney: tool({
        description: "send money to the user",
        parameters: z.object({ amount: z.number(), receiver: z.string() }),
        execute: async ({ amount, receiver }) => {
            if (balance < amount) {
                throw new Error("Insufficient balance")
            }
            balance -= amount

            history.push({ amount, to: receiver })
        },
        returns: z.void()
    }),
    getHistory: tool({
        description: "get history of transactions",
        parameters: z.object({}),
        execute: async () => {
            return history
        },
        returns: z.array(
            z.object({ amount: z.number(), to: z.string().optional(), from: z.string().optional() }))
    })
})

Fun part begins here

const result = await generateCode({
    model,
    system: "You are a banking app",
    tools: tools,
    prompt: "Get history and find amount i got from Bob, then send that amount to Bob. Then again get history and balance",
})

console.log("Code:", result.code)
console.log("Schema:", result.schema)
console.log("Output:", await result.execute())

This is the output after executing await result.execute():

{
  balance: 20,
  history: [
    {
      amount: 20,
      to: "Alice"
    }, {
      amount: 10,
      from: "Bob"
    },{
      amount: 10,
      to: "Bob"
    }
  ]
}

This is the code written by the LLM (result.code):

let history = await this.getHistory({});
let amountGotFromBob = 0;
for (let transaction of history) {
    if (transaction?.from === 'Bob') {
        amountGotFromBob += transaction.amount;
    }
}
if (amountGotFromBob > 0) {
    await this.sentMoney({ amount: amountGotFromBob, receiver: 'Bob' });
}
let balance = await this.getBalance({});
return { balance, history };

This is the JSON schema written by the LLM (result.schema), useful for generative UI:

{
  $schema: "http://json-schema.org/draft-07/schema#",
  type: "object",
  properties: {
    balance: {
      type: "number",
    },
    history: {
      type: "array",
      items: {
        type: "object",
        properties: {
          amount: {
            type: "number",
          },
          to: {
            type: "string",
          },
          from: {
            type: "string",
          },
        },
        required: [ "amount" ],
      },
    },
  },
  required: [ "balance", "history" ]
}

Instead of multi-step toolResults, the LLM can now write logic using the tools provided by the developer. The LLM also can't execute arbitrary malicious code, because of a simple safety technique I implemented here.

export const createFunction = (tools: Record<string, CoreTool>, code: string) => {
  // expose only the tools' execute functions as `this` inside the generated code
  const data = Object.entries(tools).reduce((acc, [key, value]) => ({ ...acc, [key]: value.execute }), {})

  return async () => await new Function(main(code)).apply(data, [])
}

// wrap the generated code in an async main() so it can use await and return a value
const main = (code: string) => `const main = async () => {\n${code}\n}\nreturn main()`

const generateCode = ({ ... }) => {

  // some generateText logic

  return {
    code, schema,
    execute: createFunction(tools, code)
  }
}
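To make the confinement concrete, here is a self-contained sketch of the same technique (tool names and the generated code string are illustrative, not from the proposal). Note that a `new Function` body still sees globals, so this is scoping rather than a full sandbox.

```typescript
// Tool functions are bound as `this`, so the generated code can only
// call what it is handed by name. The wrapper mirrors createFunction/main
// from the proposal, with plain functions standing in for CoreTool.execute.
type ToolFns = Record<string, (args: any) => any>;

const createFunction = (tools: ToolFns, code: string) => {
  const body = `const main = async () => {\n${code}\n}\nreturn main()`;
  return async () => await new Function(body).apply(tools, []);
};

// Stand-ins for the execute functions of real tools
let balance = 30;
const toolFns: ToolFns = {
  getBalance: () => balance,
  sendMoney: ({ amount }: { amount: number }) => { balance -= amount; },
};

// Code a model might have written:
const generated = `
  this.sendMoney({ amount: 10 });
  return this.getBalance({});
`;

const result = createFunction(toolFns, generated)();
result.then((value) => console.log(value)); // 20
```

Because the generated code runs with `this` bound to the tool map and nothing else from the caller's scope, renaming or omitting a tool immediately removes it from what the model's code can reach.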

Therefore the LLM is restricted to invoking only the functions we provide.

The generateCode() API is a powerful wrapper around generateText().
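One way that wrapper could look, assuming the model returns its code inside a fenced block: extractCode below is a hypothetical helper, and the generateText call it would consume is only indicated in a comment, since running it needs a live model.

```typescript
// Hypothetical helper: pull the code body out of the model's reply.
// In generateCode, `text` would come from the underlying call, e.g.
//   const { text } = await generateText({ model, system, prompt })
const FENCE = "`".repeat(3);
const fencePattern = new RegExp(
  `${FENCE}(?:js|javascript|ts|typescript)?\\n([\\s\\S]*?)${FENCE}`
);

const extractCode = (text: string): string => {
  const match = text.match(fencePattern);
  // fall back to the raw text when the model skipped the fence
  return match ? match[1].trim() : text.trim();
};

const reply = `Here you go:\n${FENCE}js\nreturn this.getBalance({});\n${FENCE}`;
console.log(extractCode(reply)); // return this.getBalance({});
```

The extracted string is what would then be handed to createFunction alongside the tools.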


socket-security bot commented Dec 24, 2024

New dependencies detected.

Package npm/[email protected]: no new capabilities, 0 transitives, 34.6 kB, published by sachinraja

@rajatsandeepsen rajatsandeepsen changed the title feat (ai/core): experimental generate code with tools feat (ai/core): Proposal for generate code API with tools Dec 24, 2024
@lgrammel (Collaborator)

This is pretty cool. However, there is one big issue: it includes a fairly large prompt. The AI SDK tries to not include any prompts whenever possible, because they end up being provider and model dependent (there is some minimal prompting in JSON generation but that is all). Do you see ways to do this without prompting? Or could there be alternative approaches that minimize it?

@@ -56,21 +63,16 @@ If not provided, the tool will not be executed automatically.
@args is the input of the tool call.
@options.abortSignal is a signal that can be used to abort the tool call.
*/
execute?: (
execute: (
@lgrammel (Collaborator)

Removing the optionality here will break important functionality, namely tools without execute.

@rajatsandeepsen (Author) Dec 27, 2024

I have three alternative ideas for this issue:

  • Create an isolated tool under a different name (e.g. etool), specifically for generateCode

  • Add one more option to toolChoice: "none" | "auto" | "required" | "code" (the "code" option makes execute strictly required)

  • Throw an error if the execute param is undefined

Give me feedback and I'll rework this.
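The third alternative (throwing when execute is missing) could be a small runtime guard inside generateCode. A minimal sketch, assuming a guard named assertExecutable, which is hypothetical and not part of the SDK:

```typescript
// Hypothetical guard: generateCode would reject any tool whose execute
// function is missing, since generated code must be able to invoke
// every tool it is given.
type MaybeExecutableTool = {
  description?: string;
  execute?: (args: unknown) => unknown;
};

const assertExecutable = (tools: Record<string, MaybeExecutableTool>): void => {
  for (const [name, t] of Object.entries(tools)) {
    if (typeof t.execute !== "function") {
      throw new Error(`Tool "${name}" has no execute function; generateCode requires one.`);
    }
  }
};

// Usage:
assertExecutable({ getBalance: { execute: () => 30 } }); // ok
try {
  assertExecutable({ broken: { description: "no execute" } });
} catch (e) {
  console.log((e as Error).message);
}
```

This keeps execute optional in the shared tool type (so tools without execute still work elsewhere) while making it effectively required for generateCode.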

@@ -94,28 +96,28 @@ The arguments for configuring the tool. Must match the expected arguments define
Helper function for inferring the execute args of a tool.
*/
// Note: special type inference is needed for the execute function args to make sure they are inferred correctly.
export function tool<PARAMETERS extends Parameters, RESULT>(
export function tool<PARAMETERS extends Parameters, RESULT extends Parameters>(
@lgrammel (Collaborator) Dec 26, 2024

RESULT extends Parameters seems strange. Is this intentional?

@rajatsandeepsen (Author)

generateCode requires a returns zod schema so the LLM can understand the output of each tool.

It helps the LLM create variables to store a tool-call result and pass it to the next tool.

The reason I used the Parameters type is to strictly type the execute function according to its returns schema.

I could rework this as:

type Returns = Parameters // same properties

export function tool<PARAMETERS extends Parameters, RESULT, RETURNS extends Returns>(
tool: CoreTool<PARAMETERS, RETURNS> & {
    execute: (
      args: inferParameters<PARAMETERS>,
      options: ToolExecutionOptions,
    ) => PromiseLike<inferParameters<RETURNS>>;
  }
)

@rajatsandeepsen (Author)

> This is pretty cool. However, there is one big issue: it includes a fairly large prompt. The AI SDK tries to not include any prompts whenever possible, because they end up being provider and model dependent (there is some minimal prompting in JSON generation but that is all). Do you see ways to do this without prompting? Or could there be alternative approaches that minimize it?

This idea of tools inside generateCode is not widely adopted in the OpenAI API specs.

Everyone uses JSON schema for specifying tools to the LLM (as the industry standard). But I used zod-to-ts for three reasons:

  1. it converts a zodSchema to a TypeScript definition
  2. it takes fewer tokens to define the same functionality compared to JSON schema
  3. we aren't passing these functions to the tools part of the OpenAI API anyway, and we aren't returning a list of tool names in a JSON object

So complex parameters and returns types can be specified, and each param's usage can be documented, like:

const writeLetter = tool({
	description: "write letter to a person",
	parameters: z.object({
	 name: z.string().describe("Name of the person")
	}),
	returns: z.string(),
	execute: ({ name }) => `Hello ${name}`
})

After zod-to-ts

const writeLetter = (params: {
	// Name of the person
	name: string
}): string => {
	// some logic
}
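The compactness claim can be illustrated without the library at all. Below, toTsSignature is a purely illustrative helper (not part of zod-to-ts or the SDK) that renders a TS-style signature from a flat parameter spec, showing the shape a model would see in place of a JSON schema:

```typescript
// Illustrative only: render a TypeScript-style signature from a flat
// parameter spec, with descriptions surfaced as comments, to show why
// a TS definition is terser than the equivalent JSON schema.
type ParamSpec = Record<string, { type: string; description?: string }>;

const toTsSignature = (name: string, params: ParamSpec, returns: string): string => {
  const fields = Object.entries(params)
    .map(([key, v]) =>
      v.description ? `  // ${v.description}\n  ${key}: ${v.type}` : `  ${key}: ${v.type}`
    )
    .join("\n");
  return `const ${name} = (params: {\n${fields}\n}): ${returns} => { /* ... */ }`;
};

console.log(
  toTsSignature(
    "writeLetter",
    { name: { type: "string", description: "Name of the person" } },
    "string"
  )
);
```

The equivalent JSON schema would repeat "type", "properties", and "required" keys for every field, which is where the token savings come from.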

The prompt I'm using ensures the LLM writes better, more secure code. Some of these prompts are important.

[image: screenshot of the system prompt]
