# qa-system

**Repository Path**: whatitis/qa-system

## Basic Information

- **Project Name**: qa-system
- **Description**: Internal, academic use only; copying and redistribution prohibited
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-02-14
- **Last Updated**: 2026-03-08

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Local runtime environment

```shell
(qa-system) qa-system % whereis python
python: /Users/XXXX/miniconda3/envs/qa-system/bin/python  # example output; conda env: qa-system
```

# Database

The database DDL and initialization DML both live in the `migrations` directory.

```shell
PostgreSQL 17.4 on x86_64-pc-linux-gnu
```

# Models used

ASR: https://www.modelscope.cn/models/Qwen/Qwen3-ASR-1.7B/

TTS: https://github.com/ysharma3501/LuxTTS

## Database entity-relationship diagram (qa_system)

```mermaid
erDiagram
    users ||--o{ auth_tokens : "1-to-many"
    users ||--o{ conversations : "1-to-many"
    users ||--o{ messages : "1-to-many (nullable)"
    users ||--o{ personas : "updated by"
    users ||--o{ sensitive_words : "created by"
    users ||--o{ knowledge_materials : "uploaded by"
    conversations ||--o{ messages : "1-to-many"
    knowledge_materials ||--o{ knowledge_chunks : "1-to-many"
    users {
        bigint id PK
        varchar username UK
        varchar password_hash
        bigint tenant_id
        boolean is_admin
        boolean is_active
    }
    auth_tokens {
        bigint id PK
        varchar token UK
        bigint user_id FK
        timestamptz expires_at
    }
    conversations {
        bigint id PK
        bigint tenant_id
        bigint user_id FK
        varchar title
        timestamptz last_message_at
    }
    messages {
        bigint id PK
        bigint conversation_id FK
        bigint tenant_id
        bigint user_id FK
        varchar role
        varchar input_mode
        jsonb references_json
    }
    personas {
        bigint id PK
        bigint tenant_id
        text system_prompt_text_chat
        text system_prompt_voice_chat
        bigint updated_by FK
        boolean is_active
    }
    sensitive_words {
        bigint id PK
        bigint tenant_id
        varchar word
        bigint created_by FK
        boolean is_active
    }
    knowledge_materials {
        bigint id PK
        bigint tenant_id
        varchar title
        varchar file_name
        varchar status
        int chunk_count
        bigint created_by FK
    }
    knowledge_chunks {
        bigint id PK
        bigint material_id FK
        bigint tenant_id
        int chunk_index
        jsonb embedding
    }
```

## System layered architecture diagram

```mermaid
flowchart TB
    subgraph UI["UI layer (Gradio)"]
        A["/qa-ui unified workbench"]
        A1["Login"]
        A2["Chat (text/voice)"]
        A3["My account (profile/password)"]
        A4["Admin (users/conversations/personas/sensitive words/knowledge base)"]
        A --> A1
        A --> A2
        A --> A3
        A --> A4
    end
    subgraph API["API layer (FastAPI)"]
        B["cmds/main.py"]
        B1["/qa/auth/*"]
        B2["/qa/chat/*"]
        B3["/qa/users/me*"]
        B4["/qa/admin/*"]
        B --> B1
        B --> B2
        B --> B3
        B --> B4
    end
    subgraph Service["Business capability layer"]
        C1["LLM Q&A (OpenAI-compatible)"]
        C2["ASR speech recognition"]
        C3["TTS speech synthesis"]
        C4["Knowledge-base chunking and vector retrieval"]
        C5["Sensitive-word filtering"]
        C6["Conversation orchestration and persistence"]
    end
    subgraph DB["Data layer (PostgreSQL / qa_system schema)"]
        D1["users / auth_tokens"]
        D2["conversations / messages"]
        D3["personas / sensitive_words"]
        D4["knowledge_materials / knowledge_chunks"]
    end
    UI --> API
    API --> Service
    Service --> DB
```

# ASR calling code

```python
import requests

url = "http://100.87.181.117:8000/v1/chat/completions"
headers = {"Content-Type": "application/json"}
data = {
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "audio_url",
                    "audio_url": {
                        "url": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen3-ASR-Repo/asr_en.wav"
                    },
                }
            ],
        }
    ]
}
response = requests.post(url, headers=headers, json=data, timeout=300)
response.raise_for_status()
content = response.json()['choices'][0]['message']['content']
# Response content format: a "language <Language>" prefix followed by the transcript, e.g.:
# language EnglishUh huh. Oh yeah, yeah. He wasn't even that big when I started
# listening to him, but and his solo music didn't do overly well, but he did
# very well when he started writing for other people.
print(content)
```

# TTS calling code

```python
# ----- TTS
from gradio_client import Client, handle_file

client = Client("http://100.87.181.117:7860/")
# Note: the audio prompt file must be uploaded to Gradio before use.
result = client.predict(
    text="任务计划:6.功能/数据权限管理模块,未评估。7.事件记录与可视化模块,未评估。",
    audio_prompt=handle_file('http://100.87.181.117:7860/gradio_api/file=/tmp/gradio/64d2064e8f423f27f6366ba381ae2892de27fb7cb8f344b7cf92e150a4bbe71b/audio 4.wav'),
    rms=0.01,
    t_shift=0.9,
    num_steps=4,
    speed=1,
    return_smooth=False,
    api_name="/infer"
)
print(result)
print(result[0])
# Result format: a local path to the synthesized audio file, e.g.
# /private/var/folders/pg/8s6w8ysj6717_xvqbqfkfk300000gn/T/gradio/6608d3222539aa48e3d85606326ccab36600ee71b63602d4cf9edb15db1b1bf0/audio.wav
```

# LLM and memory-store integration code

```shell
cmds/maf_test.py
```

# Gradio official documentation

https://www.gradio.app/guides/quickstart

# FastAPI full-stack template

https://fastapi.tiangolo.com/zh/project-generation/

https://github.com/fastapi/full-stack-fastapi-template

# MAF official documentation

https://learn.microsoft.com/en-us/agent-framework/

# System page entry points (within this project)

- Gradio unified workbench: `/qa-ui`
  - Includes login, chat, my account, and admin (user management / conversation management / personas & prompts / sensitive words / knowledge base) modules
- FastAPI API docs: `/docs`

## AI model and voice notes (aligned with the README / maf_test.py)

- The chat LLM uses the same fixed configuration as `cmds/maf_test.py`:
  - base_url: `https://ark.cn-beijing.volces.com/api/v3/`
  - model: `doubao-seed-2-0-mini-260215`
- Speech recognition (ASR) reuses the `audio_url` call style shown in the README.
- The Gradio chat page supports a "record -> click send" flow; the recording is converted into an `audio_url` and submitted to the ASR service.
- The environment variable `QA_PUBLIC_BASE_URL` must be set so the ASR service can reach this system, for example:
  - `export QA_PUBLIC_BASE_URL=http://<your-reachable-address>:8000`
- On the chat page, the "enable TTS playback of LLM replies" toggle controls whether replies are read aloud.

Default initial admin account (created automatically on first startup):

- Username: `admin`
- Password: `Admin@123456`
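# ASR payload construction sketch (illustrative)

The notes above say recordings are converted into an `audio_url` that the ASR service fetches via `QA_PUBLIC_BASE_URL`. A minimal sketch of how such a payload might be built; the helper name `build_asr_payload` and the `static/recordings/...` serving route are assumptions for illustration, not taken from the repo:

```python
import os

def build_asr_payload(relative_audio_path: str) -> dict:
    """Build a chat/completions body in the audio_url style shown in the
    ASR calling code above. The serving route under QA_PUBLIC_BASE_URL
    is a hypothetical example, not the project's actual route."""
    base = os.environ.get("QA_PUBLIC_BASE_URL", "http://localhost:8000").rstrip("/")
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "audio_url",
                        "audio_url": {"url": f"{base}/{relative_audio_path.lstrip('/')}"},
                    }
                ],
            }
        ]
    }

payload = build_asr_payload("static/recordings/demo.wav")
print(payload["messages"][0]["content"][0]["audio_url"]["url"])
```

The resulting dict can be POSTed to the `/v1/chat/completions` endpoint exactly as in the ASR calling code section.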
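# Token validity sketch (illustrative)

The `auth_tokens` table carries a `timestamptz expires_at` column, so a token is only usable while that timestamp lies in the future. A minimal sketch of the check, assuming timezone-aware UTC timestamps; the function name `token_is_valid` is hypothetical:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def token_is_valid(expires_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True while expires_at is still in the future.
    Mirrors a check against the auth_tokens.expires_at column."""
    if now is None:
        now = datetime.now(timezone.utc)
    return expires_at > now

now = datetime.now(timezone.utc)
print(token_is_valid(now + timedelta(hours=1)))   # fresh token
print(token_is_valid(now - timedelta(seconds=1))) # expired token
```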
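# Sensitive-word filtering sketch (illustrative)

The business capability layer lists sensitive-word filtering backed by the `sensitive_words` table. One plausible minimal implementation is plain substring masking over the active word list; the function name `redact` and the masking strategy are assumptions for illustration, not the repo's actual filter:

```python
from typing import Iterable

def redact(text: str, words: Iterable[str], mask: str = "*") -> str:
    """Replace every occurrence of each active sensitive word with mask
    characters. Longer words are applied first so overlapping entries
    (e.g. "foo" and "foobar") are masked as the longest match."""
    for word in sorted(words, key=len, reverse=True):
        if word:  # skip empty entries defensively
            text = text.replace(word, mask * len(word))
    return text

# Example: mask two configured words in a user message.
print(redact("please ignore the secret token", ["secret", "token"]))
# -> please ignore the ****** *****
```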
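# Vector retrieval sketch (illustrative)

`knowledge_chunks` stores each chunk's embedding as `jsonb`, and the service layer lists chunking plus vector retrieval. A minimal sketch of ranking chunks by cosine similarity over JSON-encoded embeddings, as they might come back from a `SELECT id, embedding FROM qa_system.knowledge_chunks` query; the helpers `cosine` and `top_k` are illustrative, not the repo's retrieval code:

```python
import json
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, rows, k=3):
    """rows: (chunk_id, embedding_json) pairs; returns the k best
    (chunk_id, score) pairs sorted by descending similarity."""
    scored = [(chunk_id, cosine(query_vec, json.loads(emb))) for chunk_id, emb in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

rows = [(1, "[1.0, 0.0]"), (2, "[0.0, 1.0]"), (3, "[0.7, 0.7]")]
print(top_k([1.0, 0.0], rows, k=2))
```

In production one would normally push this ranking into the database (e.g. with a vector extension) rather than scoring in Python, but the sketch shows the data flow implied by the schema.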