Google AI

Google Gemini SDK 使用指南

概述

Google Gen AI Python SDK 为开发者提供了将 Google 生成式模型集成到 Python 应用中的接口，支持 Gemini Developer API 和 Vertex AI API。

安装

pip install google-genai

快速开始

Gemini Developer API

from google import genai
from google.genai import types

client = genai.Client(api_key='GEMINI_API_KEY')

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents=types.Part.from_text(text='Why is the sky blue?'),
    config=types.GenerateContentConfig(
        temperature=0,
        top_p=0.95,
        top_k=20,
    ),
)
print(response.text)

Vertex AI

from google import genai

client = genai.Client(
    vertexai=True,
    project='your-project-id',
    location='us-central1'
)

环境变量配置

Gemini Developer API

export GEMINI_API_KEY='your-api-key'

Vertex AI

export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT='your-project-id'
export GOOGLE_CLOUD_LOCATION='us-central1'

客户端管理

# 使用上下文管理器自动关闭
with genai.Client() as client:
    response = client.models.generate_content(
        model='gemini-2.5-flash',
        contents='Hello'
    )

# 异步客户端
async with genai.Client().aio as aclient:
    response = await aclient.models.generate_content(
        model='gemini-2.5-flash',
        contents='Hello'
    )

API 版本选择

from google.genai import types

client = genai.Client(
    api_key='GEMINI_API_KEY',
    http_options=types.HttpOptions(api_version='v1')
)

可用模型

模型	说明
gemini-2.5-flash	快速响应，适合日常任务
gemini-2.5-pro	高精度推理，复杂任务
gemini-2.0-flash	上一代快速模型

Google AI

Gemini API 快速入门指南

Gemini API 概览

Google Gemini API 提供对 Gemini 系列模型的访问，支持文本生成、多模态理解、代码执行、搜索增强等能力。最新一代 Gemini 3 带来了更强的推理和生成能力。

获取 API Key

访问 Google AI Studio
点击 "Get API Key"
创建或选择 Google Cloud 项目
复制生成的 API Key

Python SDK 安装

pip install google-generativeai

基础文本生成

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# 创建模型实例
model = genai.GenerativeModel('gemini-3-pro')

# 生成文本
response = model.generate_content(
    "解释量子计算的基本原理",
    generation_config=genai.types.GenerationConfig(
        temperature=0.7,
        max_output_tokens=2048
    )
)
print(response.text)

多轮对话

chat = model.start_chat(history=[])

response1 = chat.send_message("你好，我想学习Python")
print(response1.text)

response2 = chat.send_message("推荐一些入门项目")
print(response2.text)

多模态：图像理解

import PIL.Image

img = PIL.Image.open('photo.jpg')
model = genai.GenerativeModel('gemini-3-pro')

response = model.generate_content([
    "详细描述这张图片的内容",
    img
])
print(response.text)

系统指令

model = genai.GenerativeModel(
    'gemini-3-pro',
    system_instruction=(
        "你是一位专业的技术文档撰写者。"
        "回答时使用Markdown格式，包含代码示例。"
        "语气专业但易懂。"
    )
)

模型选择

模型	上下文	特点
Gemini 3 Pro	1M	最强推理，适合复杂任务
Gemini 3 Flash	1M	快速高效，性价比最优
Gemini 2.5 Pro	1M	上一代旗舰，仍可用
Gemini 2.5 Flash	1M	上一代快速版

特色功能

搜索增强（Grounding）

利用 Google 搜索实时数据增强回答：

response = model.generate_content(
    "2026年最新的AI发展趋势是什么？",
    tools="google_search_retrieval"
)

代码执行

模型可以自动编写和执行Python代码：

response = model.generate_content(
    "计算斐波那契数列前100项的和",
    tools="code_execution"
)

文件上传

uploaded_file = genai.upload_file("document.pdf")
response = model.generate_content([
    "总结这份文档的关键要点",
    uploaded_file
])

定价

模型	输入（≤128K）	输出
Gemini 3 Pro	$1.25/1M tokens	$5.00/1M tokens
Gemini 3 Flash	$0.15/1M tokens	$0.60/1M tokens

注：免费额度慷慨，适合开发测试。

Google AI

Gemini 多模态生态与最新模型解析

Gemini 3：新一代多模态AI

Gemini 3 是 Google 最新的AI模型系列，在推理、代码、多模态理解方面实现了重大突破。配合丰富的生态工具，构建了从文本到图像、视频、音频的完整多模态能力矩阵。

模型矩阵

模型	类型	核心能力
Gemini 3 Pro	文本/多模态	最强推理，百万上下文
Gemini 3 Flash	文本/多模态	高速推理，低延迟
Nano-Banana 2	图像生成	原生图像生成，支持思考模式
Nano-Banana Pro	图像生成	4K质量图像生成
Veo 3.1	视频生成	图生视频，视频扩展
Lyria 3	音乐生成	30秒片段到完整歌曲
Gemini Robotics-ER 1.5	空间理解	机器人应用

多模态能力详解

图像生成（Nano-Banana 2）

Gemini 原生集成图像生成能力，无需额外工具：

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro",
    contents="生成一张赛博朋克风格的城市夜景图",
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"]
    )
)

for part in response.candidates[0].content.parts:
    if part.inline_data:
        with open("output.png", "wb") as f:
            f.write(part.inline_data.data)

视频生成（Veo 3.1）

Veo 支持文本生成视频和图片生成视频：

# 异步视频生成
operation = client.models.generate_videos(
    model="veo-3.1",
    prompt="一只猫在月光下跳舞，电影级画质",
    config=types.GenerateVideosConfig(
        number_of_videos=1,
        duration_seconds=8
    )
)

# 等待完成
while not operation.done:
    time.sleep(10)
    operation = client.operations.get(operation)

# 下载视频
video = operation.response.generated_videos[0]
client.files.download(file=video, filepath="output.mp4")

实时多模态（Live API）

Live API 支持实时的音视频交互：

import asyncio

async def live_session():
    config = {"response_modalities": ["TEXT"]}
    async with client.aio.live.connect(model="gemini-3-flash", config=config) as session:
        await session.send(input="描述你看到的内容", end_of_turn=True)
        async for response in session.receive():
            print(response.text)

生态工具

Gemini CLI

开源AI代理，在终端中直接使用 Gemini：

npm install -g @anthropic-ai/gemini-cli
gemini "分析这个项目的代码架构"

Google AI Studio

在线平台，支持：

交互式提示词测试
结构化提示词构建
模型对比评测
一键部署到 API

File Search（RAG）

托管的 RAG 系统，上传文件即可实现检索增强生成：

# 创建语料库
corpus = client.files.create_corpus(name="my_docs")

# 上传文件
client.files.upload(file="knowledge.pdf", corpus=corpus)

# 检索增强生成
response = model.generate_content(
    "根据文档回答：...",
    tools=[types.Tool(file_search=types.FileSearch(corpus=corpus.name))]
)

迁移到 Gemini 3

从 Gemini 2.x 迁移的主要变更：

模型名更新：gemini-2.5-pro → gemini-3-pro
新增思考模式参数
图像生成改为原生集成
API 兼容，无需大改代码

Google AI

Gemini 3.1 Pro 科学推理能力

## Gemini 3.1 Pro 科学推理能力 ### 核心特点 - 科学推理得分 94.3%，刷新人类纪录 - 1M tokens 超长上下文 - 原生多模态：文本、图像、视频、音频 - Google 生态深度整合 ### API 调用 ```python import google.generativeai as genai # 配置 API_KEY = "your-api-key" genai.configure(api_key=API_KEY) # 创建模型 model = genai.GenerativeModel("gemini-3.1-pro") # 基础对话 response = model.generate_content("解释量子纠缠") print(response.text) # 多模态输入 import urllib.request urllib.request.urlretrieve("https://example.com/chart.png", "chart.png") model = genai.GenerativeModel("gemini-3.1-pro") with open("chart.png", "rb") as f: image_data = f.read() response = model.generate_content([ "分析这张图表中的趋势", {"mime_type": "image/png", "data": image_data} ]) ``` ### Function Calling ```python model = genai.GenerativeModel( "gemini-3.1-pro", tools=[{"function_declarations": [{ "name": "search_database", "description": "搜索数据库", "parameters": { "type": "object", "properties": { "query": {"type": "string"} } } }]}] ) ``` ### 适用场景 - 科学研究与分析 - 多模态数据处理 - 超长文档理解 - Google Workspace 集成

Google AI

Gemini 端云协同部署指南

## Gemini 端云协同部署指南 ### 2026年Q2端云协同架构主流方案为"云端70B通用大模型+端侧10B-30B轻量化场景模型"： | 级别 | 模型 | 适用场景 | |------|------|----------| | 云端 | Gemini 3.1 Pro | 复杂推理、长文档 | | 云端 | Gemini 3.1 Flash | 日常对话、快速响应 | | 端侧 | Gemini Nano | 设备端离线推理 | ### Vertex AI 部署 ```python import vertexai from vertexai.preview.generative_models import GenerativeModel vertexai.init(project="your-project", location="us-central1") model = GenerativeModel("gemini-3.1-pro") response = model.generate_content("Hello") print(response.text) ``` ### 本地部署（ONNX） ```bash pip install transformers ``` ```python from transformers import AutoModelForCausalLM model = AutoModelForCausalLM.from_pretrained("google/gemini-nano-3") ``` ### 成本优化 - 使用 A100/GPU 实例按需缩放 - 开启 Prompt Caching 降低重复输入成本 - Flash 版本处理简单任务节省费用

欢迎回来

创建账号

Google Gemini SDK 使用指南

概述

安装

快速开始

Gemini Developer API

Vertex AI

环境变量配置

Gemini Developer API

Vertex AI

客户端管理

API 版本选择

可用模型

Gemini API 快速入门指南

Gemini API 概览

获取 API Key

Python SDK 安装

基础文本生成

多轮对话

多模态：图像理解

系统指令

模型选择

特色功能

搜索增强（Grounding）

代码执行

文件上传

定价

Gemini 多模态生态与最新模型解析

Gemini 3：新一代多模态AI

模型矩阵

多模态能力详解

图像生成（Nano-Banana 2）

视频生成（Veo 3.1）

实时多模态（Live API）

生态工具

Gemini CLI

Google AI Studio

File Search（RAG）

迁移到 Gemini 3

Gemini 3.1 Pro 科学推理能力

Gemini 端云协同部署指南