# Qwen2-Audio Voice Model Service

An API service for voice interaction, built on Qwen2-Audio-7B-Instruct.

## Port

- **Service port**: 19018

## Features

- Speech recognition + dialogue generation
- Multi-turn conversation support
- Text chat endpoint (optional)
- Conversation history management

## Deployment

### 1. Install dependencies

```bash
pip install -r requirements.txt
```

### 2. Start the service

```bash
# Option 1: start directly
python server.py

# Option 2: use the script (the port can be specified)
PORT=19018 ./start.sh
```

### 3. Verify the service

```bash
curl http://localhost:19018/
```

Response:

```json
{"status": "ok", "model": "Qwen/Qwen2-Audio-7B-Instruct", "conversations": 0}
```

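
Because the model can take a while to load, a script may want to wait for the status endpoint before sending requests. The following is a minimal sketch using only the Python standard library; `wait_until_ready`, `is_healthy`, and the timeout and interval values are illustrative names and numbers, not part of the service.

```python
import json
import time
import urllib.error
import urllib.request

BASE_URL = "http://localhost:19018"  # port taken from this README


def is_healthy(payload) -> bool:
    """Return True if the status payload reports a ready service."""
    return isinstance(payload, dict) and payload.get("status") == "ok"


def wait_until_ready(base_url: str = BASE_URL, timeout_s: float = 600.0,
                     interval_s: float = 5.0) -> dict:
    """Poll GET / until it reports status "ok" or the timeout expires.

    timeout_s and interval_s are illustrative values, not service settings.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(base_url + "/", timeout=10) as resp:
                payload = json.load(resp)
            if is_healthy(payload):
                return payload
        except (urllib.error.URLError, OSError, ValueError):
            pass  # not listening yet, or the response was not valid JSON
        if time.monotonic() >= deadline:
            raise TimeoutError(f"service at {base_url} not ready after {timeout_s}s")
        time.sleep(interval_s)
```

Calling `wait_until_ready()` returns the parsed status payload once the service answers, or raises `TimeoutError` otherwise.
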
## API

### Voice inference

```text
POST /api/voice/inference
Content-Type: multipart/form-data

Parameters:
- audio: audio file (WAV/MP3/FLAC)
- conversation_id: conversation ID (optional; a new conversation is created if omitted)
- max_length: maximum generation length (default 256)

Response:
{
  "reply": "Hello, how can I help you?",
  "conversation_id": "xxx-xxx-xxx",
  "timestamp": "2026-04-21T18:00:00"
}
```

**Example**:

```bash
curl -X POST http://localhost:19018/api/voice/inference \
  -F "audio=@test.wav"
```

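
The same upload can be done from Python. This is a sketch under stated assumptions: it uses only the standard library and builds the `multipart/form-data` body by hand; `encode_multipart` and `voice_inference` are hypothetical helper names, and the 120-second timeout is an arbitrary choice.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

BASE_URL = "http://localhost:19018"  # port taken from this README


def encode_multipart(fields: dict, files: dict) -> tuple[bytes, str]:
    """Encode text fields and files as multipart/form-data.

    files maps a form field name to (filename, raw bytes).
    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(part.encode("utf-8"))
    for name, (filename, data) in files.items():
        ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {ctype}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def voice_inference(audio_path, conversation_id=None, max_length=256,
                    base_url=BASE_URL) -> dict:
    """POST one audio file to /api/voice/inference and return the parsed JSON."""
    fields = {"max_length": max_length}
    if conversation_id:
        fields["conversation_id"] = conversation_id
    path = Path(audio_path)
    body, content_type = encode_multipart(
        fields, {"audio": (path.name, path.read_bytes())}
    )
    req = urllib.request.Request(
        base_url + "/api/voice/inference",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```

With the service running, `voice_inference("test.wav")["reply"]` would give the model's reply.
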
### Text inference (for testing)

```text
POST /api/voice/text
Content-Type: multipart/form-data

Parameters:
- text: text message
- conversation_id: conversation ID (optional)

Response: same as the voice inference response
```

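
A text-only request needs no file part, so the form body is small enough to build inline. The sketch below assumes the endpoint accepts the two fields exactly as listed; `build_text_request` and `send_text` are hypothetical helper names.

```python
import json
import urllib.request
import uuid

BASE_URL = "http://localhost:19018"  # port taken from this README


def build_text_request(text, conversation_id=None, base_url=BASE_URL):
    """Build a multipart/form-data POST request for /api/voice/text."""
    fields = {"text": text}
    if conversation_id:
        fields["conversation_id"] = conversation_id
    boundary = uuid.uuid4().hex
    body = "".join(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{key}"\r\n\r\n{value}\r\n'
        for key, value in fields.items()
    ) + f"--{boundary}--\r\n"
    return urllib.request.Request(
        base_url + "/api/voice/text",
        data=body.encode("utf-8"),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def send_text(text, conversation_id=None, base_url=BASE_URL) -> dict:
    """Send one text message and return the parsed JSON response."""
    req = build_text_request(text, conversation_id, base_url)
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```
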
### Get conversation history

```text
GET /api/voice/conversation/{conversation_id}
```

### Delete a conversation

```text
DELETE /api/voice/conversation/{conversation_id}
```

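
The two conversation endpoints can be called with small helpers like the ones sketched here. The response body of the history endpoint is not documented in this README, so parsing it as JSON is an assumption; `conversation_url`, `get_history`, and `delete_conversation` are hypothetical names.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:19018"  # port taken from this README


def conversation_url(conversation_id: str, base_url: str = BASE_URL) -> str:
    """Build the conversation URL, percent-encoding the ID."""
    return f"{base_url}/api/voice/conversation/{quote(conversation_id, safe='')}"


def get_history(conversation_id: str, base_url: str = BASE_URL):
    """GET the stored history (assumed here to be returned as JSON)."""
    url = conversation_url(conversation_id, base_url)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def delete_conversation(conversation_id: str, base_url: str = BASE_URL) -> int:
    """DELETE a conversation and return the HTTP status code."""
    req = urllib.request.Request(
        conversation_url(conversation_id, base_url), method="DELETE"
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status
```
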
## Multi-turn conversation

First turn:

```bash
curl -X POST http://localhost:19018/api/voice/inference \
  -F "audio=@audio1.wav"

# Returns conversation_id: "abc-123"
```

Second turn:

```bash
curl -X POST http://localhost:19018/api/voice/inference \
  -F "audio=@audio2.wav" \
  -F "conversation_id=abc-123"
```

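
The pattern in the two requests above (omit `conversation_id` on the first turn, then pass back the returned value) can be sketched as a small loop. `run_conversation` is a hypothetical helper, and `send_turn` stands for any callable that posts one audio file to `/api/voice/inference` and returns the parsed JSON.

```python
from typing import Callable, Iterable, Optional


def run_conversation(audio_paths: Iterable[str],
                     send_turn: Callable[[str, Optional[str]], dict]) -> list:
    """Send audio files as consecutive turns of one conversation.

    send_turn(audio_path, conversation_id) is a hypothetical callable that
    POSTs one file to /api/voice/inference and returns the parsed JSON.
    """
    conversation_id = None  # first turn: let the service create the conversation
    replies = []
    for path in audio_paths:
        result = send_turn(path, conversation_id)
        # Reuse the ID returned by the service for the following turns.
        conversation_id = result["conversation_id"]
        replies.append(result["reply"])
    return replies
```

Keeping the HTTP call behind `send_turn` lets the loop be exercised with a stub before pointing it at a running service.
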
## Environment variables

| Variable | Description | Default |
|------|------|--------|
| MODEL_NAME | Model name | Qwen/Qwen2-Audio-7B-Instruct |
| MAX_HISTORY_TURNS | Maximum number of history turns | 10 |

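
To illustrate how these two variables might be consumed, here is a sketch of a settings loader and a history trimmer. It is not taken from `server.py`: `load_settings` and `trim_history` are hypothetical names, and treating one turn as a user message plus an assistant message is an assumption.

```python
import os


def load_settings(env=None) -> dict:
    """Read the documented variables, falling back to the README defaults."""
    env = os.environ if env is None else env
    return {
        "model_name": env.get("MODEL_NAME", "Qwen/Qwen2-Audio-7B-Instruct"),
        "max_history_turns": int(env.get("MAX_HISTORY_TURNS", "10")),
    }


def trim_history(history: list, max_turns: int) -> list:
    """Keep only the most recent turns.

    Assumes one turn is two messages (user + assistant); the real server
    may count turns differently.
    """
    if max_turns <= 0:
        return []
    return history[-2 * max_turns:]
```
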
## Hardware requirements

- **GPU**: NVIDIA GPU recommended, VRAM ≥ 16GB
- **CPU**: supported, but slow
- **RAM**: ≥ 32GB

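## Notes

1. The first model load downloads about 15GB; make sure the network connection is stable
2. Audio is automatically converted to 16kHz mono
3. Conversation history is stored in memory and is lost when the service restarts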
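
To make the 16kHz mono conversion in note 2 concrete, the sketch below shows what such a conversion does to the samples. It is an illustration only: the service's actual conversion code is not shown in this README, and this version uses naive linear interpolation with no anti-aliasing filter. `to_mono` and `resample_linear` are hypothetical names.

```python
def to_mono(frames):
    """Average the channels of each frame; frames is a list of per-channel tuples."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample a mono signal by linear interpolation (no anti-aliasing)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)     # clamp at the last sample
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

For example, `resample_linear(to_mono(stereo_frames), 44100)` would turn 44.1kHz stereo frames into a 16kHz mono sample list, where `stereo_frames` is a placeholder for decoded input.
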