Ollama Web Search：让 AI 模型实时联网搜索

Ollama 推出了一项重要功能——Web Search API。这项功能为 AI 模型提供了实时从网络获取最新信息的能力，有效减少模型幻觉，提高回答的准确性。更令人兴奋的是，Ollama 为个人用户提供了慷慨的免费额度，让每个人都能轻松使用这项强大的功能。

官方博客：https://ollama.com/blog/web-search 官方文档：https://docs.ollama.com/web-search

为什么选择 Ollama Web Search？

在 AI 应用开发中，让模型具备联网搜索能力至关重要。然而，现有解决方案都有明显缺陷：

📊 主流方案对比

方案	价格	优势	劣势
微软 Bing API	¥108/千次	结果质量高	成本高昂
Google API	¥36/千次	生态成熟	价格较贵
SearXNG	免费	开源、隐私友好	部署复杂、结果不稳定、维护成本高
Ollama Web Search	免费	零成本、易用、AI 优化	有速率限制（可升级）

💰 成本优势： Ollama 完全免费提供，个人用户有慷慨的免费额度，高频场景可升级获得更高速率限制

💡 为什么不选 SearXNG？

虽然 SearXNG 开源免费，但实际使用中效果差强人意：

❌ 部署复杂：需自建服务器、配置繁琐、处理反爬虫
❌ 质量不稳定：搜索相关性差、内容提取不准确、格式不统一
❌ 性能问题：响应慢、高并发易超时、需额外优化

✨ Ollama Web Search 的优势

✅ 完全免费：个人用户免费使用，慷慨的免费额度
✅ 开箱即用：注册即用，无需部署维护
✅ 专为 AI 优化：返回格式友好，深度集成工具链
✅ 结果质量高：准确的内容提取，Markdown 格式输出
✅ 稳定可靠：无需自建服务，官方维护保障

核心特性

1. 增强模型能力

Web Search API 可以为 AI 模型补充最新的网络信息，解决以下问题：

减少幻觉：通过实时网络数据验证，降低模型编造信息的可能性
提升准确性：获取最新的事实和数据，确保回答的时效性
扩展知识范围：突破训练数据的时间限制，获取最新信息

2. 多种集成方式

Ollama 提供了多种使用方式：

REST API（支持 cURL 调用）
Python 库深度集成
JavaScript 库深度集成
MCP Server（Model Context Protocol）集成

3. 免费使用

Ollama Web Search 完全免费提供给用户使用，只需注册 Ollama 账号即可获得慷慨的免费额度。对于高频使用场景，可以升级订阅获得更高的速率限制。

快速开始

创建 API Key

首先需要在 Ollama 账号中创建 API Key：

访问 https://ollama.com 并登录
在账号设置中创建 API Key
保存你的 API Key

export OLLAMA_API_KEY="your_api_key"

使用 cURL 调用

最简单的方式是通过 cURL 直接调用：

curl https://ollama.com/api/web_search \
  --header "Authorization: Bearer $OLLAMA_API_KEY" \
  -d '{
    "query": "what is ollama?"
  }'

返回结果示例：

{
  "results": [
    {
      "title": "Ollama",
      "url": "https://ollama.com/",
      "content": "Cloud models are now available..."
    },
    {
      "title": "What is Ollama? Introduction to the AI model management tool",
      "url": "https://www.hostinger.com/tutorials/what-is-ollama",
      "content": "Ariffud M. 6min Read..."
    }
  ]
}

使用 Python 库

安装 Ollama Python 库（需要 0.6.0 及以上版本）：

pip install 'ollama>=0.6.0'

进行 Web 搜索：

import ollama

response = ollama.web_search("What is Ollama?")
print(response)

返回结果：

results = [
    {
        "title": "Ollama",
        "url": "https://ollama.com/",
        "content": "Cloud models are now available in Ollama..."
    },
    {
        "title": "What is Ollama? Features, Pricing, and Use Cases - Walturn",
        "url": "https://www.walturn.com/insights/what-is-ollama-features-pricing-and-use-cases",
        "content": "Our services..."
    }
]

使用 JavaScript 库

安装 Ollama JavaScript 库：

npm install 'ollama@>=0.6.0'

调用 Web Search：

import { Ollama } from "ollama";

const client = new Ollama();
const results = await client.webSearch({ query: "what is ollama?" });
console.log(JSON.stringify(results, null, 2));

构建搜索代理

Web Search 的真正威力在于结合工具调用（Tool Calling）构建智能搜索代理。以下是一个完整的示例：

准备工作

首先拉取一个支持工具调用的模型，这里使用阿里的 Qwen 3 模型（4B 参数）：

ollama pull qwen3:4b

实现搜索代理

from ollama import chat, web_fetch, web_search

# 定义可用工具
available_tools = {'web_search': web_search, 'web_fetch': web_fetch}

# 用户问题
messages = [{'role': 'user', 'content': "what is ollama's new engine"}]

while True:
    # 调用模型
    response = chat(
        model='qwen3:4b',
        messages=messages,
        tools=[web_search, web_fetch],
        think=True  # 启用思考过程
    )
    
    # 输出思考过程
    if response.message.thinking:
        print('Thinking: ', response.message.thinking)
    
    # 输出回答内容
    if response.message.content:
        print('Content: ', response.message.content)
    
    messages.append(response.message)
    
    # 处理工具调用
    if response.message.tool_calls:
        print('Tool calls: ', response.message.tool_calls)
        for tool_call in response.message.tool_calls:
            function_to_call = available_tools.get(tool_call.function.name)
            if function_to_call:
                args = tool_call.function.arguments
                result = function_to_call(**args)
                print('Result: ', str(result)[:200]+'...')
                # 将结果添加到消息历史（限制长度以适应上下文）
                messages.append({
                    'role': 'tool', 
                    'content': str(result)[:2000 * 4], 
                    'tool_name': tool_call.function.name
                })
            else:
                messages.append({
                    'role': 'tool', 
                    'content': f'Tool {tool_call.function.name} not found', 
                    'tool_name': tool_call.function.name
                })
    else:
        break

运行效果

Thinking: Okay, the user is asking about Ollama's new engine. I need to figure out what they're referring to...

Tool calls: [ToolCall(function=Function(name='web_search', arguments={'max_results': 3, 'query': 'Ollama new engine'}))]

Result: results=[WebSearchResult(content='# New model scheduling\n\n## September 23, 2025\n\nOllama now includes a significantly improved model scheduling system...

Content: Ollama has introduced two key updates to its engine, both released in 2025:

1. **Enhanced Model Scheduling (September 23, 2025)**
   - **Precision Memory Management**: Exact memory allocation reduces out-of-memory crashes
   - **Performance Gains**: Significant speed improvements (e.g., 85.54 tokens/s vs 52.02 tokens/s)
   - **Multi-GPU Support**: Improved efficiency across multiple GPUs
   - **Supported Models**: Includes gemma3, llama4, qwen3, mistral-small3.2, and more

2. **Multimodal Engine (May 15, 2025)**
   - **Vision Support**: First-class support for vision models
   - **Multimodal Tasks**: Image identification, video analysis, document scanning

Web Fetch API

除了搜索，Ollama 还提供了 web_fetch API 用于获取指定网页的内容。

API 请求

请求参数：

url（必填）：要获取的网页 URL

返回结果：

title：网页标题
content：网页内容（Markdown 格式）
links：网页中包含的链接列表

Python 示例

from ollama import web_fetch

result = web_fetch('https://ollama.com')
print(result)

返回结果：

WebFetchResponse(
    title='Ollama',
    content='[Cloud models](https://ollama.com/blog/cloud-models) are now available in Ollama\n\n**Chat & build with open models**...',
    links=['https://ollama.com/', 'https://ollama.com/models', 'https://github.com/ollama/ollama']
)

JavaScript 示例

import { Ollama } from "ollama";

const client = new Ollama();
const fetchResult = await client.webFetch({ url: "https://ollama.com" });
console.log(JSON.stringify(fetchResult, null, 2));

cURL 示例

curl --request POST \
  --url https://ollama.com/api/web_fetch \
  --header "Authorization: Bearer $OLLAMA_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
      "url": "ollama.com"
  }'

MCP Server 集成

Ollama Web Search 支持通过 MCP（Model Context Protocol）Server 集成到各种 AI 编程工具中。

Cline 集成

在 Cline 的 MCP 服务器设置中添加配置：

{
  "mcpServers": {
    "web_search_and_fetch": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "path/to/web-search-mcp.py"],
      "env": { "OLLAMA_API_KEY": "your_api_key_here" }
    }
  }
}

Codex 集成

在 ~/.codex/config.toml 中添加配置：

[mcp_servers.web_search]
command = "uv"
args = ["run", "path/to/web-search-mcp.py"]
env = { "OLLAMA_API_KEY" = "your_api_key_here" }

Goose 集成

Goose 也支持通过 MCP 扩展集成 Ollama Web Search。

最佳实践

1. 上下文长度设置

Web Search 和 Web Fetch 可能返回数千个 token 的内容，建议：

将模型的上下文长度设置为至少 32,000 tokens
使用完整上下文长度可以获得最佳的搜索代理效果
Ollama 的云端模型默认运行在完整上下文长度

2. 控制搜索结果数量

通过 max_results 参数控制返回的搜索结果数量：

response = ollama.web_search(
    query="what is ollama?",
    max_results=5  # 默认 5，最大 10
)

3. 结合工具调用

将 web_search 和 web_fetch 作为工具提供给模型，让模型自主决定何时需要搜索：

tools = [web_search, web_fetch]
response = chat(
    model='qwen3:4b',
    messages=messages,
    tools=tools
)

API 参考

Web Search API

端点： POST https://ollama.com/api/web_search 请求参数：

参数	类型	必填	说明
query	string	是	搜索查询字符串
max_results	integer	否	最大返回结果数（默认 5，最大 10）

响应格式：

{
  "results": [
    {
      "title": "网页标题",
      "url": "网页 URL",
      "content": "相关内容片段"
    }
  ]
}

Web Fetch API

端点： POST https://ollama.com/api/web_fetch 请求参数：

参数	类型	必填	说明
url	string	是	要获取的网页 URL

响应格式：

{
  "title": "网页标题",
  "content": "网页内容（Markdown 格式）",
  "links": ["链接1", "链接2", "链接3"]
}

应用场景

1. 实时信息查询

让 AI 模型能够回答关于最新事件、新闻、技术更新等问题。

messages = [{'role': 'user', 'content': "最新的 Spring Boot 版本有什么新特性？"}]
response = chat(model='qwen3:4b', messages=messages, tools=[web_search])

2. 研究助手

结合 OpenAI 的 gpt-oss 模型进行长时间的研究任务。

3. 文档问答

获取特定网页内容并基于此回答问题。

# 获取网页内容
doc_content = web_fetch('https://docs.spring.io/spring-boot/index.html')

# 基于文档内容回答问题
messages = [
    {'role': 'system', 'content': f'参考文档：{doc_content.content}'},
    {'role': 'user', 'content': '如何配置数据源？'}
]
response = chat(model='qwen3:4b', messages=messages)

4. 代码助手增强

在 Cline、Codex 等 AI 编程工具中集成 Web Search，让 AI 助手能够：

查找最新的库文档
搜索错误解决方案
获取示例代码
了解最佳实践

总结

Ollama Web Search API 为本地和云端 AI 模型带来了实时联网搜索能力，主要优势包括： ✅ 完全免费：个人用户免费使用，慷慨的免费额度，相比商业 API 节省数千至数万元 ✅ 易于集成：支持 REST API、Python、JavaScript 多种方式，无需复杂配置 ✅ 工具生态完善：深度集成到各种 AI 编程工具（Cline、Codex、Goose） ✅ 减少模型幻觉：通过实时网络信息验证，提高回答准确性 ✅ 强大的代理能力：结合工具调用构建智能搜索代理 ✅ 优于开源方案：相比 SearXNG，无需部署维护，结果质量更稳定，专为 AI 优化

💰 成本对比

假设每月 30,000 次搜索查询，使用商业 API 的成本：

微软 Bing API：¥3,240/月 → ¥38,880/年
Google API：¥1,080/月 → ¥12,960/年
Ollama Web Search：完全免费 🎉

个人开发者和中小团队每年可节省 ¥12,960 - ¥38,880 的成本！无论你是在构建聊天机器人、研究助手，还是在使用 AI 编程工具，Ollama Web Search 都能显著提升 AI 的能力和准确性。更重要的是，它让联网搜索能力零成本、零门槛，人人都能用得起。立即访问 https://ollama.com 注册账号，开始体验 Web Search 功能吧！

参考资源

官方博客：https://ollama.com/blog/web-search
官方文档：https://docs.ollama.com/web-search
Python 示例代码：https://github.com/ollama/ollama-python
JavaScript 示例代码：https://github.com/ollama/ollama-js

PIGX分享

2025

开源共建

​为什么选择 Ollama Web Search？

​📊 主流方案对比

​💡 为什么不选 SearXNG？

​✨ Ollama Web Search 的优势

​核心特性

​1. 增强模型能力

​2. 多种集成方式

​3. 免费使用

​快速开始

​创建 API Key

​使用 cURL 调用

​使用 Python 库

​使用 JavaScript 库

​构建搜索代理

​准备工作

​实现搜索代理

​运行效果

​推荐模型

​Web Fetch API

​API 请求

​Python 示例

​JavaScript 示例

​cURL 示例

​MCP Server 集成

​Cline 集成

​Codex 集成

​Goose 集成

​最佳实践

​1. 上下文长度设置

​2. 控制搜索结果数量

​3. 结合工具调用

​API 参考

​Web Search API

​Web Fetch API

​应用场景

​1. 实时信息查询

​2. 研究助手

​3. 文档问答

​4. 代码助手增强

​总结

​💰 成本对比

​参考资源

为什么选择 Ollama Web Search？

📊 主流方案对比

💡 为什么不选 SearXNG？

✨ Ollama Web Search 的优势

核心特性

1. 增强模型能力

2. 多种集成方式

3. 免费使用

快速开始

创建 API Key

使用 cURL 调用

使用 Python 库

使用 JavaScript 库

构建搜索代理

准备工作

实现搜索代理

运行效果

推荐模型

Web Fetch API

API 请求

Python 示例

JavaScript 示例

cURL 示例

MCP Server 集成

Cline 集成

Codex 集成

Goose 集成

最佳实践

1. 上下文长度设置

2. 控制搜索结果数量

3. 结合工具调用

API 参考

Web Search API

Web Fetch API

应用场景

1. 实时信息查询

2. 研究助手

3. 文档问答

4. 代码助手增强

总结

💰 成本对比

参考资源