Function Calling: Letting an LLM Call Your Python Functions

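Why LLMs Need Tools

A bare LLM has a clear boundary: text in, text out. It cannot go online, see real-time data, read your database, send email, or modify files. Yet the capabilities you see every day in the ChatGPT or Claude web apps, such as checking the weather, searching the web, drawing images, and running code, are all built on a mechanism called Function Calling (also known as Tool Use). Based on the conversation, the model decides which function to call and with which arguments, and returns that intent as JSON. The host program executes the function and feeds the result back to the model, which then continues its answer based on that result.

Understanding Function Calling is a prerequisite for understanding Agents. This article starts from a single tool call and works up to multi-tool coordination and looped execution.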

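The Protocol: tools and tool_choice

chat.completions.create accepts an additional tools parameter. It is a list of function definitions, each of which tells the model:

The function name
What the function does (description, the critical part)
Which parameters it takes, their types, and which are required (JSON Schema)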

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. 'Beijing' or 'Shanghai'",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit, defaults to celsius",
                    },
                },
                "required": ["city"],
            },
        },
    }
]

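Note that description is the main signal the model uses to decide when to call this function, so keep it concise and unambiguous. A vague description leads to stray or wrong calls.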
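The tools list has a companion parameter, tool_choice, which controls whether the model may, must, or must not call a tool. The sketch below only builds request keyword arguments, so it runs without any network call. The four values shown are the ones documented for the OpenAI Chat Completions API; whether a given OpenAI-compatible provider honors each of them is an assumption worth checking against its documentation. The helper name build_request is hypothetical.

```python
# Sketch: request kwargs for the different tool_choice settings.
# build_request is a hypothetical helper, not part of any SDK.

WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_request(question: str, tool_choice="auto") -> dict:
    """Return kwargs that could be passed to client.chat.completions.create."""
    return {
        "model": "deepseek-chat",
        "messages": [{"role": "user", "content": question}],
        "tools": [WEATHER_TOOL],
        "tool_choice": tool_choice,
    }


# "auto": the model decides whether to call a tool.
auto_req = build_request("What's the weather in Beijing?", "auto")

# "none": the model must answer in plain text and may not call any tool.
none_req = build_request("Hello", "none")

# "required": the model must call at least one tool.
required_req = build_request("What's the weather in Beijing?", "required")

# Forcing one specific function by name.
forced_req = build_request(
    "What's the weather in Beijing?",
    {"type": "function", "function": {"name": "get_weather"}},
)
```

Forcing a specific function is mostly useful for structured extraction, where you always want arguments back rather than free text.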

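The Full Round Trip

A complete Function Calling exchange takes two API calls. This is the most common pitfall for newcomers, who assume one call is enough.

The actual flow is:

First call: you send the user's question plus tools to the model. If the model decides it needs a tool, it does not answer directly; it returns a tool call request (the tool_calls field) containing the function name and arguments
Local execution: your code reads tool_calls, runs the corresponding function, and gets the real result
Second call: you append the tool result to messages with the tool role and call the model again. This time the model writes a natural-language answer based on the tool result

In code: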

# function_call.py
import json
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com/v1",
)


# The actual business function
def get_weather(city: str, unit: str = "celsius") -> dict:
    # A real project would call a weather API here
    fake_data = {"Beijing": 8, "Shanghai": 15, "Guangzhou": 22}
    temp = fake_data.get(city, 20)
    if unit == "fahrenheit":
        temp = temp * 9 / 5 + 32
    return {"city": city, "temperature": temp, "unit": unit, "condition": "sunny"}


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
        },
    }
]


def ask(question: str):
    messages = [{"role": "user", "content": question}]

    # First call
    resp = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages,
        tools=tools,
    )
    msg = resp.choices[0].message

    # If the model decided to call a tool
    if msg.tool_calls:
        # Add the model's tool call request to messages
        messages.append(msg)

        for tc in msg.tool_calls:
            args = json.loads(tc.function.arguments)
            if tc.function.name == "get_weather":
                result = get_weather(**args)
            else:
                result = {"error": f"unknown function {tc.function.name}"}

            # Send the tool result back with the tool role
            messages.append({
                "role": "tool",
                "tool_call_id": tc.id,
                "content": json.dumps(result, ensure_ascii=False),
            })

        # Second call: generate the final answer from the tool result
        final = client.chat.completions.create(
            model="deepseek-chat",
            messages=messages,
        )
        return final.choices[0].message.content
    else:
        # The model answered directly without calling a tool
        return msg.content


if __name__ == "__main__":
    print(ask("What's the weather like in Beijing today?"))
    print(ask("Hello there"))  # This kind of message does not trigger a tool

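Run it. For the first question, the model triggers get_weather on its own and returns something like "It's sunny in Beijing today, 8 degrees Celsius". The second question needs no tool, so the model answers directly.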

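Key Fields in Detail

A few details are easy to trip over the first time, so they are spelled out here:

tool_calls is a list: the model may return several tool calls at once. If the user asks about the weather in both Beijing and Shanghai, the model may issue two get_weather calls in the same response. Your code must iterate over all of them.

Why messages.append(msg): what gets appended is the whole assistant message object, not {"role": "assistant", "content": ...}. An assistant message carrying tool_calls has a more complex structure, and the SDK's message object can be serialized straight back into the request. Building the dict by hand also works, as long as the tool_calls field is complete.

tool_call_id must match: with multiple concurrent tool calls, each tool-role message must carry the id of exactly the tool call it answers. Otherwise the model cannot tell which call a result belongs to.

Arguments arrive as a JSON string: tc.function.arguments is a JSON string, not a dict, so call json.loads on it before use.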
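The four points above can be seen together in a small offline sketch. It uses a hand-written assistant message in dict form, shaped like an OpenAI-style response with two parallel tool calls; the ids call_a and call_b and the stand-in get_weather are made up for illustration, and no API is called.

```python
import json

# A hand-written assistant message in dict form with two parallel tool calls.
# The ids are invented; a real response would carry provider-generated ids.
assistant_msg = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {"id": "call_a", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'}},
        {"id": "call_b", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Shanghai"}'}},
    ],
}


def get_weather(city: str) -> dict:
    # Stand-in for the real function; the temperatures are fake.
    return {"city": city, "temperature": {"Beijing": 8, "Shanghai": 15}.get(city, 20)}


def answer_tool_calls(msg: dict) -> list[dict]:
    """Build one tool-role message per tool call, keyed by its id."""
    replies = []
    for tc in msg["tool_calls"]:
        args = json.loads(tc["function"]["arguments"])  # JSON string -> dict
        result = get_weather(**args)
        replies.append({
            "role": "tool",
            "tool_call_id": tc["id"],  # echo the id of the call being answered
            "content": json.dumps(result, ensure_ascii=False),
        })
    return replies


messages = [{"role": "user", "content": "Weather in Beijing and Shanghai?"}]
messages.append(assistant_msg)          # the assistant message goes in whole
messages.extend(answer_tool_calls(assistant_msg))

# Every requested id should have a matching tool message.
requested = {tc["id"] for tc in assistant_msg["tool_calls"]}
answered = {m["tool_call_id"] for m in messages if m["role"] == "tool"}
```

After this, messages holds the user turn, the assistant turn with both tool calls, and one tool message per call, which is the shape the second API call expects.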

Abstracting Tool Calls into a General Framework

The code above is fine for one tool but bloats quickly with several. Extract a registry pattern:

# tool_registry.py
import json
from typing import Callable
from pydantic import BaseModel


class Tool(BaseModel):
    name: str
    description: str
    parameters: dict  # JSON Schema for the function's arguments
    func: Callable

    # Pydantic v1-style config; on Pydantic v2 the equivalent is
    # model_config = ConfigDict(arbitrary_types_allowed=True)
    class Config:
        arbitrary_types_allowed = True


class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool):
        self._tools[tool.name] = tool

    def as_openai_tools(self) -> list[dict]:
        # Convert registered tools into the format the tools parameter expects
        return [
            {
                "type": "function",
                "function": {
                    "name": t.name,
                    "description": t.description,
                    "parameters": t.parameters,
                },
            }
            for t in self._tools.values()
        ]

    def call(self, name: str, args_json: str) -> str:
        # Always return a JSON string, including on failure, so the result
        # can be sent back to the model as the tool message content
        tool = self._tools.get(name)
        if not tool:
            return json.dumps({"error": f"tool {name} does not exist"})
        try:
            args = json.loads(args_json)
            result = tool.func(**args)
            return json.dumps(result, ensure_ascii=False, default=str)
        except Exception as e:
            return json.dumps({"error": str(e)}, ensure_ascii=False)


# Usage
registry = ToolRegistry()

# get_weather is the function defined in function_call.py above
registry.register(Tool(
    name="get_weather",
    description="Get the current weather for a given city",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    func=get_weather,
))

registry.register(Tool(
    name="get_stock_price",
    description="Get the real-time price for a given stock symbol",
    parameters={
        "type": "object",
        "properties": {"symbol": {"type": "string"}},
        "required": ["symbol"],
    },
    func=lambda symbol: {"symbol": symbol, "price": 150.23},  # fake price
))

With this, the main loop becomes very clean:

def run(question: str):
    messages = [{"role": "user", "content": question}]

    while True:
        resp = client.chat.completions.create(
            model="deepseek-chat",
            messages=messages,
            tools=registry.as_openai_tools(),
        )
        msg = resp.choices[0].message

        if not msg.tool_calls:
            return msg.content

        messages.append(msg)
        for tc in msg.tool_calls:
            result = registry.call(tc.function.name, tc.function.arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": tc.id,
                "content": result,
            })

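Note the while True here: the model may call tools over several rounds before giving its final answer (for example, check the weather first, then look up flights based on it). This is already the core structure of an Agent, which the next article covers in full.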
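An unbounded while True will keep spending tokens if the model never stops requesting tools. Below is a minimal sketch of the same loop with a round cap. To keep it runnable offline, the model and the tool executor are replaced by hypothetical stand-ins: call_model plays the role of client.chat.completions.create and call_tool the role of registry.call, and messages are plain dicts rather than SDK objects. The name run_bounded and the default of 5 rounds are illustrative choices.

```python
import json
from typing import Callable


def run_bounded(
    question: str,
    call_model: Callable[[list], dict],
    call_tool: Callable[[str, str], str],
    max_rounds: int = 5,
) -> str:
    """Tool loop that gives up after max_rounds model calls."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_rounds):
        msg = call_model(messages)
        if not msg.get("tool_calls"):
            return msg["content"]  # final answer, no more tools requested
        messages.append(msg)
        for tc in msg["tool_calls"]:
            result = call_tool(tc["function"]["name"], tc["function"]["arguments"])
            messages.append({"role": "tool", "tool_call_id": tc["id"], "content": result})
    return f"Stopped after {max_rounds} rounds without a final answer."


def fake_model(messages: list) -> dict:
    # Scripted model: requests the weather once, then answers.
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "content": None, "tool_calls": [{
            "id": "call_1", "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
        }]}
    return {"role": "assistant", "content": "It is 8 degrees in Beijing."}


def looping_model(messages: list) -> dict:
    # Misbehaving model: requests a tool on every round.
    return {"role": "assistant", "content": None, "tool_calls": [{
        "id": f"call_{len(messages)}", "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
    }]}


def fake_tool(name: str, args_json: str) -> str:
    return json.dumps({"city": json.loads(args_json)["city"], "temperature": 8})


answer = run_bounded("Weather in Beijing?", fake_model, fake_tool)
capped = run_bounded("Weather in Beijing?", looping_model, fake_tool, max_rounds=3)
```

Returning a plain message on hitting the cap is one option; raising an exception or asking the model for a best-effort answer without tools are equally reasonable designs.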

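Lessons from Practice

The function description is your most important prompt: whether the model picks the right tool and fills in the right arguments is about 90% down to how clear the description is. Put yourself in the reader's position when writing it: could a new hire who knows nothing about your business tell when to use this tool from that one sentence alone?

Don't expose too many tools: every tool definition consumes tokens, and too many candidates make it harder for the model to choose. As a rule of thumb, expose no more than 10 to 15 tools in a single conversation. If your business has many tools, use two-stage routing: first filter the relevant tools by user intent, then expose only those to the model.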
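As a rough illustration of the two-stage routing idea, the sketch below uses naive keyword matching for the first stage. This is only one possible approach and an assumption of this example; an embedding search or a cheap classification call would be a more robust first stage. The tool names, keyword lists, and the helpers select_tools and make_tool are all hypothetical.

```python
# Naive first-stage router: keep only tools whose keywords appear in the question.
# Tool names and keyword lists are hypothetical examples.
TOOL_KEYWORDS = {
    "get_weather": ["weather", "temperature", "rain"],
    "get_stock_price": ["stock", "share price", "ticker"],
    "search_flights": ["flight", "airline"],
}


def select_tools(question: str, all_tools: list[dict], max_tools: int = 10) -> list[dict]:
    """Return at most max_tools definitions that look relevant to the question."""
    q = question.lower()
    relevant = {
        name for name, words in TOOL_KEYWORDS.items()
        if any(w in q for w in words)
    }
    picked = [t for t in all_tools if t["function"]["name"] in relevant]
    return picked[:max_tools]


def make_tool(name: str) -> dict:
    # Minimal tool definition, just enough for the demo.
    return {"type": "function", "function": {
        "name": name,
        "description": name,
        "parameters": {"type": "object", "properties": {}},
    }}


all_tools = [make_tool(n) for n in TOOL_KEYWORDS]
picked = select_tools("Will it rain in Beijing, and is my flight on time?", all_tools)
names = [t["function"]["name"] for t in picked]
```

Only the filtered list would then be passed as tools in the request, so the model sees two definitions here instead of three.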

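Always validate arguments: the model will sometimes pass bad arguments (wrong type, missing field, value outside the enum). Wrap them in a Pydantic model: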

from typing import Literal

from pydantic import BaseModel, ValidationError

class WeatherArgs(BaseModel):
    city: str
    unit: Literal["celsius", "fahrenheit"] = "celsius"

def safe_get_weather(args_dict: dict) -> dict:
    # get_weather is the business function defined earlier
    try:
        args = WeatherArgs(**args_dict)
    except ValidationError as e:
        return {"error": f"invalid arguments: {e}"}
    return get_weather(args.city, args.unit)

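When validation fails, return the error to the model; it will usually correct itself and retry.

Put a timeout on tool execution: if a tool that calls an external API or database hangs, the whole Agent hangs with it. Wrap critical tools in asyncio.wait_for or signal.alarm (note that signal.alarm is Unix-only and works only in the main thread).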
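For synchronous tool functions, a third option is a worker thread with a bounded wait. The sketch below uses concurrent.futures from the standard library rather than the asyncio.wait_for or signal.alarm approaches named above. The helper call_with_timeout and the two demo tools are hypothetical. One caveat that applies to this approach: the worker thread is not killed on timeout, the caller just stops waiting for it.

```python
import concurrent.futures
import time

# One shared pool for tool execution; the size is an arbitrary choice.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_timeout(func, args: dict, timeout_s: float) -> dict:
    """Run func(**args) in a worker thread and stop waiting after timeout_s."""
    future = _pool.submit(func, **args)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The worker keeps running in the background; we only stop waiting.
        return {"error": f"tool timed out after {timeout_s}s"}


def slow_tool(city: str) -> dict:
    time.sleep(0.3)  # simulate a hanging upstream API
    return {"city": city, "temperature": 8}


def fast_tool(city: str) -> dict:
    return {"city": city, "temperature": 8}


timed_out = call_with_timeout(slow_tool, {"city": "Beijing"}, timeout_s=0.05)
ok = call_with_timeout(fast_tool, {"city": "Beijing"}, timeout_s=1.0)
```

Returning an error dict instead of raising keeps the loop alive: the error goes back to the model as the tool result, and the model can decide whether to retry or apologize.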

Logging: record the input and output of every tool call. Debugging and incident investigation both depend on it.
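One lightweight way to get this is a decorator around each tool function. The following is a sketch using only the standard logging module; the decorator name logged_tool, the logger name "tools", and the log line format are illustrative assumptions, and it expects tools to be called with keyword arguments, as get_weather(**args) does.

```python
import functools
import json
import logging
import time

logger = logging.getLogger("tools")


def logged_tool(func):
    """Log arguments, result (or exception), and duration of each tool call."""
    @functools.wraps(func)
    def wrapper(**kwargs):
        args_text = json.dumps(kwargs, ensure_ascii=False, default=str)
        start = time.perf_counter()
        try:
            result = func(**kwargs)
        except Exception:
            # logger.exception records the traceback along with the message
            logger.exception("tool=%s args=%s failed", func.__name__, args_text)
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info(
            "tool=%s args=%s result=%s elapsed_ms=%.1f",
            func.__name__,
            args_text,
            json.dumps(result, ensure_ascii=False, default=str),
            elapsed_ms,
        )
        return result
    return wrapper


@logged_tool
def get_weather(city: str, unit: str = "celsius") -> dict:
    # Stand-in implementation with a fixed fake temperature.
    return {"city": city, "temperature": 8, "unit": unit}


logging.basicConfig(level=logging.INFO)
result = get_weather(city="Beijing")
```

If tool arguments or results can contain personal data or secrets, the log line is the place to redact them.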

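Security Boundaries for Tools

Function Calling gives the LLM the ability to act, so security has to be taken seriously:

Write operations need confirmation: irreversible actions such as sending email, transferring money, or deleting files must not execute on a single tool call from the model. Insert a human confirmation step, or at least a dry run
File system tools must be confined to a root directory: don't expose open(path); expose open_file_in(relative_path, root="/safe/dir")
Code execution tools must be sandboxed: isolate them with Docker or gVisor, and never exec user input directly
SQL tools must use parameterized queries: never concatenate model output into a SQL string, because that is SQL injection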
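Two of the rules above are easy to show in a few lines. The sketch below is one possible implementation of the open_file_in idea using pathlib, plus a parameterized query with the standard sqlite3 module. The function bodies, the users table, and find_user are assumptions made for illustration, not a complete security layer.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_file_in(relative_path: str, root: str) -> str:
    """Read a text file, refusing any path that resolves outside root."""
    root_dir = Path(root).resolve()
    target = (root_dir / relative_path).resolve()
    # resolve() collapses ".." and symlinks, so the check sees the real location.
    if root_dir not in target.parents:
        raise PermissionError(f"{relative_path!r} escapes the allowed root")
    return target.read_text(encoding="utf-8")


def find_user(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # The ? placeholder keeps model-supplied text out of the SQL string itself.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


# Demo with a throwaway directory and an in-memory database.
with tempfile.TemporaryDirectory() as safe_dir:
    Path(safe_dir, "notes.txt").write_text("hello", encoding="utf-8")
    content = open_file_in("notes.txt", root=safe_dir)
    try:
        open_file_in("../outside.txt", root=safe_dir)
        escaped = True
    except PermissionError:
        escaped = False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
rows = find_user(conn, "alice")
injected = find_user(conn, "alice' OR '1'='1")
```

The injection attempt is treated as a literal name and matches nothing, which is exactly what string concatenation would fail to guarantee.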

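Key Takeaways

Function Calling lets the LLM request a call to your function; the model never executes it, the host code does
The full flow takes two API calls: the first returns tool_calls, and after execution you send the result back in the second
Tool results go back into messages with the tool role and must carry the matching tool_call_id
For multiple tools, use a registry pattern and a main while loop that runs until the model stops calling tools
The function description is what determines whether dispatch is correct, so it deserves careful polishing
Tools with external side effects need security boundaries and human confirmation

Extend the while loop above a little, by adding a limit on the number of rounds, error recovery, and context management, and you have an Agent. Part 08 explains what an Agent really is, the ReAct pattern, how to hand-build a minimal Agent, and when a framework like LangGraph is worth it versus writing your own.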


(Licensed under CC BY-NC-SA 4.0)


Article link: https://www.sshipanoo.com/blog/ai/ai-for-python/07-Function-Calling/
