Zhipu GLM API User Manual

2026-05-08 · User Manual

1. Overview

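Zhipu AI's GLM (General Language Model) series spans a full model lineup, from lightweight to flagship. The API is compatible with the OpenAI API format and supports text generation, chat, function calling, image understanding, text-to-image generation, and code execution. The GLM-4V multimodal models support image understanding at very high resolutions.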

2. Authentication

Authorization: Bearer xxxxxxxxxx.xxxxxxxx

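Create an API key on the Zhipu open platform and send it as a Bearer token in the Authorization header of every request.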
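A minimal sketch of loading the key and building the header above. The environment variable name `ZHIPU_API_KEY` and the `<id>.<secret>` shape check are assumptions for illustration; the shape is inferred only from the dotted placeholder, not from a documented platform rule.

```python
import os


def auth_headers(env_var: str = "ZHIPU_API_KEY") -> dict:
    """Build the Authorization header from a key stored in the environment."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"environment variable {env_var} is not set")
    # Assumption: keys look like "<id>.<secret>", as the placeholder suggests.
    if "." not in api_key:
        raise ValueError("API key does not look like '<id>.<secret>'")
    return {"Authorization": f"Bearer {api_key}"}
```

Keeping the key in the environment rather than in source code avoids committing it to version control by accident.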

3. Base URL

https://open.bigmodel.cn/api/paas/v4
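To show how the base URL and the Bearer header fit together without any SDK, here is a sketch that prepares a raw HTTP request using only the standard library. The `/chat/completions` path is the one OpenAI-compatible clients append to the base URL; `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "https://open.bigmodel.cn/api/paas/v4"


def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the chat completions endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("your-api-key", "glm-5", [{"role": "user", "content": "Hello"}])
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that so it runs without a valid key.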

4. Model List

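Model ID	Context	Max output	Notes
glm-5	128K	16K	Latest flagship
glm-4-plus	128K	8K	Previous-generation flagship
glm-4-flash	128K	8K	Free model
glm-4-air	128K	8K	Cost-effective
glm-4-long	1M	8K	Ultra-long context
glm-4v-plus	8K	4K	Multimodal flagship
glm-4v-flash	8K	4K	Multimodal, free
cogview-4	-	-	Text-to-image
embedding-3	-	-	Text embeddings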
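The output limits in the table can be enforced client-side before a request is sent. A small sketch, assuming "K" means 1024 tokens (the table does not say whether it means 1000 or 1024); `MODEL_LIMITS` and `clamp_max_tokens` are hypothetical names.

```python
# Limits copied from the model table; "K" is ASSUMED to mean 1024 tokens.
K = 1024
MODEL_LIMITS = {
    "glm-5":        {"context": 128 * K,  "max_output": 16 * K},
    "glm-4-plus":   {"context": 128 * K,  "max_output": 8 * K},
    "glm-4-flash":  {"context": 128 * K,  "max_output": 8 * K},
    "glm-4-air":    {"context": 128 * K,  "max_output": 8 * K},
    "glm-4-long":   {"context": 1024 * K, "max_output": 8 * K},
    "glm-4v-plus":  {"context": 8 * K,    "max_output": 4 * K},
    "glm-4v-flash": {"context": 8 * K,    "max_output": 4 * K},
}


def clamp_max_tokens(model: str, requested: int) -> int:
    """Return a max_tokens value that does not exceed the model's output limit."""
    if model not in MODEL_LIMITS:
        raise KeyError(f"no limits recorded for model {model!r}")
    return min(requested, MODEL_LIMITS[model]["max_output"])
```

For example, `clamp_max_tokens("glm-4v-plus", 8000)` caps the request at the 4K output limit listed for that model.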

5. Calling via the OpenAI SDK

from openai import OpenAI

client = OpenAI(
    api_key="your-zhipu-api-key",
    base_url="https://open.bigmodel.cn/api/paas/v4"
)

response = client.chat.completions.create(
    model="glm-5",
    messages=[
        {"role": "system", "content": "You are the Zhipu AI assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    temperature=0.7,
    top_p=0.7,
    max_tokens=2048,
    stream=False
)
print(response.choices[0].message.content)
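The call above is a single turn. With chat-completions style APIs, a conversation is continued by resending the accumulated history on each request. A minimal sketch of that bookkeeping; `Conversation` and `send` are hypothetical names, and `fake_send` stands in for the real API call so the example runs offline.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Keeps the message history and appends each user/assistant turn."""

    def __init__(self, system_prompt: str, send: Callable[[List[Message]], str]):
        self.messages: List[Message] = [{"role": "system", "content": system_prompt}]
        self._send = send

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        reply = self._send(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages: List[Message]) -> str:
    # Stand-in for the real request; echoes the last user message.
    return f"(reply to: {messages[-1]['content']})"


chat = Conversation("You are the Zhipu AI assistant.", fake_send)
chat.ask("Hello!")
chat.ask("What can you do?")
print(len(chat.messages))
```

In real use, `send` would wrap `client.chat.completions.create(...)` and return `response.choices[0].message.content`.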

6. Calling via the Native SDK

from zhipuai import ZhipuAI

client = ZhipuAI(api_key="your-api-key")

response = client.chat.completions.create(
    model="glm-5",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

7. Streaming Output

response = client.chat.completions.create(
    model="glm-5",
    messages=[{"role": "user", "content": "Write a poem about spring"}],
    stream=True
)

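# Print each text fragment as soon as it arrives
for chunk in response:
    # Skip chunks that carry no choices or no text
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)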
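When the full text is needed after streaming ends (for logging or for appending to the history), the fragments can be collected while they are printed. A sketch using stand-in chunk objects shaped like the ones in the loop above; `collect_stream` and `fake_chunk` are hypothetical helpers.

```python
from types import SimpleNamespace


def collect_stream(chunks) -> str:
    """Print streamed fragments as they arrive and return the joined text."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # nothing to read in this chunk
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
            print(piece, end="", flush=True)
    print()
    return "".join(parts)


def fake_chunk(text):
    # Mimics the chunk.choices[0].delta.content shape for offline testing.
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


full_text = collect_stream([fake_chunk("Spring "), fake_chunk(None), fake_chunk("arrives.")])
```

With a real client, pass the object returned by `create(..., stream=True)` in place of the fake list.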

8. Function Calling

import json

# Tool schema advertised to the model
tools = [{
    "type": "function",
    "function": {
        "name": "query_weather",
        "description": "Look up the weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City"}
            },
            "required": ["location"]
        }
    }
}]

messages = [{"role": "user", "content": "What's the weather in Beijing?"}]

response = client.chat.completions.create(
    model="glm-5",
    messages=messages,
    tools=tools
)

if response.choices[0].finish_reason == "tool_calls":
    tool_call = response.choices[0].message.tool_calls[0]
    # Run the requested function. query_weather is your own implementation
    # (not defined here) and must return a string.
    args = json.loads(tool_call.function.arguments)
    result = query_weather(args["location"])

    # Send the tool result back so the model can write the final answer
    follow_up = client.chat.completions.create(
        model="glm-5",
        messages=messages + [
            response.choices[0].message,
            {"role": "tool", "tool_call_id": tool_call.id, "content": result}
        ],
        tools=tools
    )
    print(follow_up.choices[0].message.content)
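Once more than one tool is registered, a small dispatch table keeps the tool-call handling in one place. The sketch below is an assumption-laden illustration: `TOOL_REGISTRY` and `run_tool_call` are hypothetical names, `query_weather` returns canned data, and the tool call is a stand-in object with the same attributes used in the example above.

```python
import json
from types import SimpleNamespace


def query_weather(location: str) -> str:
    # Placeholder; a real implementation would call a weather service.
    return json.dumps({"location": location, "forecast": "sunny"})


TOOL_REGISTRY = {"query_weather": query_weather}


def run_tool_call(tool_call) -> dict:
    """Execute one tool call and return the matching role="tool" message."""
    name = tool_call.function.name
    func = TOOL_REGISTRY.get(name)
    if func is None:
        content = json.dumps({"error": f"unknown tool {name}"})
    else:
        try:
            args = json.loads(tool_call.function.arguments)
            content = func(**args)
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON or arguments that do not match the function
            content = json.dumps({"error": str(exc)})
    return {"role": "tool", "tool_call_id": tool_call.id, "content": content}


call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="query_weather", arguments='{"location": "Beijing"}'),
)
message = run_tool_call(call)
print(message)
```

Returning errors as tool content, rather than raising, lets the model see what went wrong and respond accordingly; whether that is desirable depends on the application.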

9. Multimodal (Image Understanding)

response = client.chat.completions.create(
    model="glm-4v-plus",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            {"type": "text", "text": "Describe this image in detail"}
        ]
    }]
)
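The image message has a fixed shape, so a small builder avoids retyping the nested structure. A sketch only: `build_image_message` is a hypothetical helper, and restricting input to http(s) URLs is a conservative choice made here, not a documented limit of the API.

```python
from urllib.parse import urlparse


def build_image_message(image_url: str, prompt: str) -> dict:
    """Build a user message with one image part followed by one text part."""
    scheme = urlparse(image_url).scheme
    # Assumption of this sketch: only publicly reachable http(s) URLs are passed.
    if scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URL, got {image_url!r}")
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": prompt},
        ],
    }


msg = build_image_message("https://example.com/photo.jpg", "Describe this image in detail")
print(msg["content"][0]["type"])
```

The result can be passed directly as an element of `messages` in the call above.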

10. Text-to-Image (CogView)

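# Uses the native ZhipuAI client from section 6.
# (The OpenAI SDK names the equivalent method images.generate.)
response = client.images.generations(
    model="cogview-4",
    prompt="A cat walking through mountains, in the style of an ink-wash painting",
    size="1024x1024"
)
# URL of the generated image
print(response.data[0].url)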

11. Pricing

Model	Input price	Output price
glm-5	¥30/1M tokens	¥90/1M tokens
glm-4-plus	¥25/1M tokens	¥75/1M tokens
glm-4-flash	Free	Free
glm-4-air	¥1/1M tokens	¥1/1M tokens
glm-4-long	¥1/1M tokens	¥1/1M tokens
cogview-4	¥0.05/image	-
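Per-request cost follows directly from the table: tokens times the per-million price, summed over input and output. A sketch using the prices listed above; `PRICES_PER_MILLION` and `estimate_cost` are hypothetical names, and `Decimal` is used to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

# (input, output) price in CNY per 1M tokens, copied from the pricing table.
PRICES_PER_MILLION = {
    "glm-5":       (Decimal("30"), Decimal("90")),
    "glm-4-plus":  (Decimal("25"), Decimal("75")),
    "glm-4-flash": (Decimal("0"),  Decimal("0")),
    "glm-4-air":   (Decimal("1"),  Decimal("1")),
    "glm-4-long":  (Decimal("1"),  Decimal("1")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in CNY of one request."""
    in_price, out_price = PRICES_PER_MILLION[model]
    return (in_price * input_tokens + out_price * output_tokens) / Decimal(1_000_000)


print(estimate_cost("glm-5", 1000, 500))
```

A glm-5 request with 1,000 input tokens and 500 output tokens works out to (30 × 1000 + 90 × 500) / 1,000,000 = ¥0.075. In practice the token counts would come from the `usage` field of the response.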

12. Rate Limits

Model	RPM (free tier)	RPM (paid tier)
glm-4-flash	10	100
glm-4-air	5	60
glm-5	5	60
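One simple way to stay under an RPM limit is to space requests at least 60/RPM seconds apart. The sketch below does that with an injectable clock and sleep function so it can be tested without waiting; `Throttle` is a hypothetical helper. Even spacing is stricter than a per-minute quota strictly requires, which trades some burst capacity for simplicity.

```python
import time

# Requests per minute, copied from the rate-limit table.
RPM_LIMITS = {
    "glm-4-flash": {"free": 10, "paid": 100},
    "glm-4-air":   {"free": 5,  "paid": 60},
    "glm-5":       {"free": 5,  "paid": 60},
}


class Throttle:
    """Blocks so that consecutive calls are at least 60/RPM seconds apart."""

    def __init__(self, model: str, tier: str = "free",
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / RPM_LIMITS[model][tier]
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> None:
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
            now += delay
        self._next_allowed = now + self.interval
```

Call `throttle.wait()` immediately before each API request. For glm-5 this gives a 12-second interval on the free tier and a 1-second interval on the paid tier.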

13. Error Codes

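Status code	Meaning	Handling
400	Invalid parameters	Check the request format
401	Authentication failed	Check the API key
429	Rate limit exceeded	Back off and retry
1301	Blocked by content compliance check	Revise the input
1302	Model overloaded	Retry later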
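The table marks 429 and 1302 as retryable. A common way to implement "back off and retry" is exponential backoff with jitter, sketched below. `ApiError` is a hypothetical exception carrying the code from the table; the real SDKs raise their own exception types, so mapping those to a code is left to the caller.

```python
import random
import time

RETRYABLE_CODES = {429, 1302}  # rate limit, model overloaded


class ApiError(Exception):
    """Hypothetical error type carrying a code from the error table."""

    def __init__(self, code: int, message: str = ""):
        super().__init__(f"{code}: {message}")
        self.code = code


def call_with_retry(func, max_attempts: int = 5, base_delay: float = 1.0,
                    sleep=time.sleep):
    """Call func(), retrying retryable errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ApiError as exc:
            is_last = attempt == max_attempts - 1
            if exc.code not in RETRYABLE_CODES or is_last:
                raise
            # Delay doubles each attempt; jitter spreads out concurrent clients.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

Errors such as 400, 401, and 1301 are re-raised immediately, since retrying an unchanged request would fail the same way.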
