
Reverse-Engineering the SambaNova AI API: Calling DeepSeek-V3 and Other Top Open-Source Models for Free
Original · October 16, 2025 · about 9 min read
Preface: SambaNova Cloud's Free Resources
SambaNova Cloud is a platform that offers free AI model inference. Key points:
- New-user bonus: $5 in free credit on sign-up
- Official API: the platform ships with a standard API endpoint
- Key finding: Playground usage is not billed; only API calls count toward usage
Supported Open-Source Models
The platform hosts several top open-source LLMs:

Model Family | Models | Context Length
---|---|---
DeepSeek | R1-0528, R1-Distill-Llama-70B, V3-0324, V3.1, V3.1-Terminus | 131K
Llama | 3.1-8B, 3.3-70B, 3.3-Swallow-70B-v0.4, 4-Maverick-17B-128E | 16K-131K
Others | Qwen3-32B, E5-Mistral-7B, Whisper-Large-v3, gpt-oss-120b | 4K-131K
💡 Core idea: since the Playground is not billed, we can reverse-engineer its endpoint and call these models for free, without limits.
Step 1: Endpoint Analysis and Traffic Capture
1.1 Open the Browser DevTools
- Visit the SambaNova Playground
- Press F12 to open the developer tools and switch to the Network tab
- Send a test message in the Playground
1.2 Locate the Key Request
In the list of network requests, find the /api/completion request:
Request URL: https://cloud.sambanova.ai/api/completion
Request method: POST
Key request headers:
Content-Type: application/json
Accept: text/event-stream (streaming response)
Cookie: access_token=<your token>
Step 2: Extract the Access Token
2.1 Read the Token from the Cookie
In DevTools, open Application → Cookies and find access_token:
# Example token format (redacted)
access_token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJ1c2VyQGV4YW1wbGUuY29tIiwiZXhwIjoxNzYxMDM2MjQ4fQ.xxxxxxxxxxxxxxxxxxxxx

Token validity notes:
- The JWT token is typically valid for about 7 days
- Once it expires, you must log in again to obtain a new one
- Refresh it periodically to keep the service stable (a quick local expiry check is sketched below)
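You can read the token's exp claim locally to estimate when it expires. A minimal sketch (decodes the JWT payload only; no signature verification):

import base64
import json
import time

def seconds_until_expiry(token: str) -> float:
    """Return seconds until the token's exp claim (payload decode only, no signature check)."""
    payload_b64 = token.split(".")[1]             # the payload is the second dot-separated segment
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"] - time.time()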
Step 3: Analyze the Request Body
3.1 Raw Request Body
Copy the request payload and format it:
{
  "body": {
    "do_sample": false,
    "max_tokens": 2048,
    "process_prompt": true,
    "messages": [
      {
        "role": "user",
        "content": "hello ?"
      }
    ],
    "stop": ["<|eot_id|>"],
    "stream": true,
    "stream_options": {
      "include_usage": true
    },
    "model": "DeepSeek-V3.1-Terminus"
  },
  "env_type": "text",
  "fingerprint": "anon_bc558b0aa92fa43ed9ac23d536480e6c"
}


Key Parameters

Parameter | Description | Suggested Value
---|---|---
do_sample | Whether sampling is enabled (true when temperature > 0) | false
max_tokens | Maximum number of tokens to generate | 2048-7168
stream | Whether to stream the output | true
stop | Stop-token list | `["<\|eot_id\|>"]`
fingerprint | Anonymous user identifier | any fixed value works
Step 4: Testing with cURL
4.1 Export the cURL Command
In DevTools, right-click the request → Copy → Copy as cURL (bash).

4.2 Simplified Test Command
curl 'https://cloud.sambanova.ai/api/completion' \
  -H 'accept: text/event-stream' \
  -H 'content-type: application/json' \
  -H 'cookie: access_token=<your token>' \
  --data-raw '{
    "body": {
      "do_sample": false,
      "max_tokens": 2048,
      "process_prompt": true,
      "messages": [{"role": "user", "content": "Hello"}],
      "stop": ["<|eot_id|>"],
      "stream": true,
      "stream_options": {"include_usage": true},
      "model": "DeepSeek-V3.1-Terminus"
    },
    "env_type": "text",
    "fingerprint": "anon_bc558b0aa92fa43ed9ac23d536480e6c"
  }'
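The response arrives as a Server-Sent Events stream. Before building the full proxy, you can replay the same request from Python; a minimal sketch (payload exactly as captured above, token placeholder to fill in):

import json
import requests

payload = {
    "body": {
        "do_sample": False,
        "max_tokens": 2048,
        "process_prompt": True,
        "messages": [{"role": "user", "content": "Hello"}],
        "stop": ["<|eot_id|>"],
        "stream": True,
        "stream_options": {"include_usage": True},
        "model": "DeepSeek-V3.1-Terminus",
    },
    "env_type": "text",
    "fingerprint": "anon_bc558b0aa92fa43ed9ac23d536480e6c",
}
headers = {
    "Content-Type": "application/json",
    "Accept": "text/event-stream",
    "Cookie": "access_token=<your token>",
}

# Stream the SSE response line by line and print each raw event
with requests.post("https://cloud.sambanova.ai/api/completion",
                   json=payload, headers=headers, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line and line.startswith(b"data: "):
            print(line[6:].decode("utf-8"))  # raw JSON chunk, or [DONE]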

Step 5: Wrapping It in FastAPI
5.1 Design Goals
Wrap the SambaNova endpoint in an OpenAI-compatible API so it drops into existing projects:
- Endpoint: /v1/chat/completions
- Authentication: Authorization: Bearer <access_token>
- Request/response format: fully OpenAI-compatible
5.2 Project Layout
sambanova-proxy/
├── main.py            # FastAPI entry point
├── requirements.txt   # dependency list
└── README.md          # usage notes
5.3 Installing Dependencies
# requirements.txt
fastapi==0.115.0
uvicorn==0.32.0
requests==2.32.3
pydantic==2.9.2
Install with:
pip install -r requirements.txt
Step 6: Full Server Code
6.1 Core Implementation (main.py)
from fastapi import FastAPI, HTTPException, Header
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from typing import List, Optional, Dict, Any, Union
import requests
import json
import uuid
import time
from datetime import datetime
import urllib3

# Suppress SSL warnings (upstream calls use verify=False)
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

app = FastAPI(
    title="SambaNova to OpenAI API Proxy",
    description="Exposes the SambaNova Cloud API in an OpenAI-compatible format",
    version="1.0.0"
)

# ==================== Data models ====================

class Message(BaseModel):
    role: str
    content: str

class ChatCompletionRequest(BaseModel):
    model: str = "DeepSeek-V3.1-Terminus"
    messages: List[Message]
    max_tokens: Optional[int] = 2048
    temperature: Optional[float] = 0.7
    top_p: Optional[float] = 1.0
    stream: Optional[bool] = False
    stop: Optional[Union[str, List[str]]] = None

class Choice(BaseModel):
    index: int
    message: Optional[Message] = None
    delta: Optional[Dict[str, Any]] = None
    finish_reason: Optional[str] = None

class Usage(BaseModel):
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

class ChatCompletionResponse(BaseModel):
    id: str
    object: str = "chat.completion"
    created: int
    model: str
    choices: List[Choice]
    usage: Optional[Usage] = None

# ==================== Helpers ====================

def create_sambanova_payload(request: ChatCompletionRequest) -> dict:
    """Build the SambaNova request body."""
    stop_tokens = ["<|eot_id|>"]
    if request.stop:
        if isinstance(request.stop, str):
            stop_tokens.append(request.stop)
        else:
            stop_tokens.extend(request.stop)
    return {
        "body": {
            "do_sample": request.temperature > 0,
            "max_tokens": request.max_tokens,
            "process_prompt": True,
            "messages": [{"role": msg.role, "content": msg.content} for msg in request.messages],
            "stop": stop_tokens,
            "stream": request.stream,
            "stream_options": {"include_usage": True} if request.stream else {},
            "model": request.model
        },
        "env_type": "text",
        "fingerprint": "anon_bc558b0aa92fa43ed9ac23d536480e6c"
    }

def get_sambanova_headers(api_key: str) -> dict:
    """Build the SambaNova request headers (mimics a real browser session)."""
    return {
        'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        'Accept': "text/event-stream",
        'Accept-Encoding': "gzip, deflate, br, zstd",
        'Content-Type': "application/json",
        'sec-ch-ua-platform': '"Windows"',
        'sec-ch-ua': '"Google Chrome";v="141", "Not?A_Brand";v="8", "Chromium";v="141"',
        'sec-ch-ua-mobile': "?0",
        'origin': "https://cloud.sambanova.ai",
        'sec-fetch-site': "same-origin",
        'sec-fetch-mode': "cors",
        'sec-fetch-dest': "empty",
        'referer': "https://cloud.sambanova.ai/playground",
        'accept-language': "zh-CN,zh;q=0.9",
        'priority': "u=1, i",
        'Cookie': f"access_token={api_key}"
    }

# ==================== Streaming responses ====================

async def stream_response(request: ChatCompletionRequest, api_key: str):
    """Handle a streaming request by forwarding the upstream SSE stream."""
    url = "https://cloud.sambanova.ai/api/completion"
    payload = create_sambanova_payload(request)
    headers = get_sambanova_headers(api_key)
    try:
        response = requests.post(
            url,
            data=json.dumps(payload),
            headers=headers,
            verify=False,
            proxies={'http': None, 'https': None},
            stream=True
        )
        response.raise_for_status()
        completion_id = f"chatcmpl-{uuid.uuid4().hex}"
        created = int(time.time())
        # Send an initial chunk carrying the assistant role
        initial_chunk = {
            "id": completion_id,
            "object": "chat.completion.chunk",
            "created": created,
            "model": request.model,
            "choices": [{
                "index": 0,
                "delta": {"role": "assistant", "content": ""},
                "finish_reason": None
            }]
        }
        yield f"data: {json.dumps(initial_chunk)}\n\n"
        # Forward the raw upstream SSE events; the trailing blank line
        # is required to terminate each event
        for line in response.iter_lines():
            if line:
                yield f"{line.decode('utf-8')}\n\n"
    except Exception as e:
        error_chunk = {
            "id": f"chatcmpl-{uuid.uuid4().hex}",
            "object": "chat.completion.chunk",
            "created": int(time.time()),
            "model": request.model,
            "choices": [{
                "index": 0,
                "delta": {},
                "finish_reason": "error"
            }],
            "error": str(e)
        }
        yield f"data: {json.dumps(error_chunk)}\n\n"
        # On success the upstream stream's own [DONE] terminator is
        # forwarded above; emit one ourselves only on error
        yield "data: [DONE]\n\n"

# ==================== Non-streaming responses ====================

async def non_stream_response(request: ChatCompletionRequest, api_key: str):
    """Handle a non-streaming request."""
    url = "https://cloud.sambanova.ai/api/completion"
    payload = create_sambanova_payload(request)
    payload["body"]["stream"] = False
    headers = get_sambanova_headers(api_key)
    try:
        response = requests.post(
            url,
            data=json.dumps(payload),
            headers=headers,
            verify=False,
            proxies={'http': None, 'https': None}
        )
        response.raise_for_status()
        sambanova_response = response.json()
        # Extract the completion text
        content = ""
        if 'completion' in sambanova_response:
            content = sambanova_response['completion']
        elif 'choices' in sambanova_response and sambanova_response['choices']:
            content = sambanova_response['choices'][0].get('message', {}).get('content', '')
        # Build an OpenAI-format response; token counts are rough
        # estimates (~4 characters per token)
        prompt_chars = len(' '.join(msg.content for msg in request.messages))
        return ChatCompletionResponse(
            id=f"chatcmpl-{uuid.uuid4().hex}",
            created=int(time.time()),
            model=request.model,
            choices=[
                Choice(
                    index=0,
                    message=Message(role="assistant", content=content),
                    finish_reason="stop"
                )
            ],
            usage=Usage(
                prompt_tokens=prompt_chars // 4,
                completion_tokens=len(content) // 4,
                total_tokens=(prompt_chars + len(content)) // 4
            )
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"SambaNova API error: {str(e)}")

# ==================== API routes ====================

@app.get("/")
async def root():
    """Root path."""
    return {
        "message": "SambaNova to OpenAI API Proxy",
        "docs": "/docs",
        "health": "/health"
    }

@app.post("/v1/chat/completions")
async def chat_completions(
    request: ChatCompletionRequest,
    authorization: str = Header(None, alias="Authorization")
):
    """
    OpenAI-compatible chat completions endpoint.

    Example request:
    ```bash
    curl http://localhost:8470/v1/chat/completions \
      -H "Authorization: Bearer <your_sambanova_access_token>" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "DeepSeek-V3.1-Terminus",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": true
      }'
    ```
    """
    if not authorization or not authorization.startswith("Bearer "):
        raise HTTPException(status_code=401, detail="Missing or invalid Authorization header")
    api_key = authorization[7:]  # strip the "Bearer " prefix
    if request.stream:
        return StreamingResponse(
            stream_response(request, api_key),
            media_type="text/event-stream",
            headers={
                "Cache-Control": "no-cache",
                "Connection": "keep-alive",
                "Access-Control-Allow-Origin": "*",
                "Access-Control-Allow-Headers": "*",
                "Access-Control-Allow-Methods": "*"
            }
        )
    return await non_stream_response(request, api_key)

# (model_id, context_length, max_completion_tokens)
AVAILABLE_MODELS = [
    ("DeepSeek-R1-0528", 131072, 7168),
    ("DeepSeek-R1-Distill-Llama-70B", 131072, 4096),
    ("DeepSeek-V3-0324", 131072, 7168),
    ("DeepSeek-V3.1", 131072, 7168),
    ("DeepSeek-V3.1-Terminus", 131072, 7168),
    ("E5-Mistral-7B-Instruct", 4096, 4096),
    ("Llama-3.3-Swallow-70B-Instruct-v0.4", 131072, 3072),
    ("Llama-4-Maverick-17B-128E-Instruct", 131072, 4096),
    ("Meta-Llama-3.1-8B-Instruct", 16384, 4096),
    ("Meta-Llama-3.3-70B-Instruct", 131072, 3072),
    ("Qwen3-32B", 32768, 4096),
    ("Whisper-Large-v3", 4096, 4096),
    ("gpt-oss-120b", 131072, 131072),
]

@app.get("/v1/models")
async def list_models():
    """List the available models."""
    return {
        "object": "list",
        "data": [
            {
                "id": model_id,
                "object": "model",
                "created": int(time.time()),
                "owned_by": "sambanova",
                "context_length": context_length,
                "max_completion_tokens": max_completion
            }
            for model_id, context_length, max_completion in AVAILABLE_MODELS
        ]
    }

@app.get("/health")
async def health_check():
    """Health check."""
    return {
        "status": "ok",
        "timestamp": datetime.now().isoformat(),
        "service": "sambanova-proxy"
    }

if __name__ == "__main__":
    # Allows `python main.py` as a launch method (see step 7.1)
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8470)
Step 7: Deployment and Testing
7.1 Start the Service
# Option 1: run directly
python main.py
# Option 2: use uvicorn (recommended for production)
uvicorn main:app --host 0.0.0.0 --port 8470 --reload
7.2 Non-Streaming Request Test
curl http://localhost:8470/v1/chat/completions \
  -H "Authorization: Bearer <your access_token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DeepSeek-V3.1-Terminus",
    "messages": [
      {"role": "user", "content": "Write a quicksort in Python"}
    ],
    "stream": false
  }'
7.3 Streaming Request Test
curl http://localhost:8470/v1/chat/completions \
  -H "Authorization: Bearer <your access_token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DeepSeek-V3.1-Terminus",
    "messages": [
      {"role": "user", "content": "Explain recursion"}
    ],
    "stream": true
  }'
7.4 Python Client Example
# Note: this example uses the legacy SDK interface (openai<1.0)
import openai

# Point the client at the local proxy
openai.api_base = "http://localhost:8470/v1"
openai.api_key = "<your access_token>"  # the SambaNova token

response = openai.ChatCompletion.create(
    model="DeepSeek-V3.1-Terminus",
    messages=[
        {"role": "user", "content": "Introduce the FastAPI framework"}
    ],
    stream=True
)

for chunk in response:
    if 'choices' in chunk and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if 'content' in delta:
            print(delta.content, end='')
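If you are on the 1.x OpenAI SDK instead, the equivalent call looks like this (a sketch against the same proxy):

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8470/v1",
    api_key="<your access_token>"  # the SambaNova token
)

stream = client.chat.completions.create(
    model="DeepSeek-V3.1-Terminus",
    messages=[{"role": "user", "content": "Introduce the FastAPI framework"}],
    stream=True
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")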
7.5 Testing from a Visual Console
(screenshots omitted)
If you see output like the screenshots, everything is working!
Step 8: Production Hardening
8.1 Security
Encrypted token storage:

import os
from cryptography.fernet import Fernet

# Keep the encryption key in an environment variable
KEY = os.getenv("ENCRYPTION_KEY")
cipher = Fernet(KEY)

def encrypt_token(token: str) -> str:
    return cipher.encrypt(token.encode()).decode()

def decrypt_token(encrypted: str) -> str:
    return cipher.decrypt(encrypted.encode()).decode()
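A suitable key can be generated once and exported as ENCRYPTION_KEY (a one-off sketch):

from cryptography.fernet import Fernet

# Run once and export the printed value as ENCRYPTION_KEY
print(Fernet.generate_key().decode())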
Request rate limiting:

from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter

# Note: slowapi also expects the route to take a `request: Request`
# parameter and an exception handler registered for RateLimitExceeded
@app.post("/v1/chat/completions")
@limiter.limit("60/minute")  # at most 60 requests per minute
async def chat_completions(...):
    pass
8.2 Logging

import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("api.log"),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)

@app.post("/v1/chat/completions")
async def chat_completions(...):
    logger.info(f"Request received - model: {request.model}, messages: {len(request.messages)}")
    # ... handler logic
8.3 Docker Deployment
Dockerfile:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY main.py .
EXPOSE 8470
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8470"]
Deployment commands:
# Build the image
docker build -t sambanova-proxy .
# Run the container
docker run -d \
  --name sambanova-api \
  -p 8470:8470 \
  --restart unless-stopped \
  sambanova-proxy
Summary and Caveats
✅ What We Built
- An OpenAI-compatible /v1/chat/completions proxy over the SambaNova Playground endpoint
- Streaming (SSE) and non-streaming responses
- A /v1/models listing and a /health check
⚠️ Usage Limits
- Token lifetime: the access_token must be refreshed periodically (roughly every 7 days)
- IP restrictions: frequent requests may trigger anti-abuse controls (add delays between calls)
- Model availability: individual models may be temporarily offline for maintenance
- For study only: this tutorial is for technical research; follow the platform's terms of service
FAQ
Q: Why does my request return a 401 error?
A: Check whether the access_token has expired or is malformed, and make sure the Cookie carries a valid token.
Q: What if the streaming response breaks off?
A: The network may be unstable or the token may have expired; add a retry mechanism and heartbeat checks, as sketched below.
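A minimal retry wrapper with exponential backoff might look like this (a sketch; tune attempts and backoff to your traffic):

import time
import requests

def post_with_retry(url: str, attempts: int = 3, backoff: float = 2.0, **kwargs):
    """POST with exponential backoff on network errors."""
    for i in range(attempts):
        try:
            resp = requests.post(url, **kwargs)
            resp.raise_for_status()
            return resp
        except requests.RequestException:
            if i == attempts - 1:
                raise
            time.sleep(backoff * (2 ** i))  # wait 2s, 4s, 8s, ...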
Q: How do I get longer context support?
A: Pick a model such as DeepSeek-V3.1, whose context window goes up to 131K tokens.
Q: Can this be used commercially?
A: You must comply with SambaNova's terms of service; it is best kept to personal study and testing.