Update

2026-02-03 17:42:04 +08:00 · 2026-02-03 17:15:35 +08:00 · 2026-02-03 17:12:30 +08:00 · 2026-02-03 13:46:52 +08:00 · 2026-02-02 17:34:36 +08:00 · 2026-02-02 17:16:07 +08:00
74 changed files with 11365 additions and 905 deletions
--- a/Docs/BACKEND_README.md
+++ b/Docs/BACKEND_README.md
@@ -0,0 +1,172 @@
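 # ViGent2 Backend Development Guide
 This document gives backend developers an architecture overview, the API conventions, and the development workflow.
 ---
 ## 🏗️ Architecture Overview
 The backend is built on **FastAPI** with Python 3.10+. It handles business logic, AI task scheduling, and communication with the other microservice components.
 ### Directory Structure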
 ```
 backend/
 ├── app/
 │   ├── api/              # API route definitions (endpoints)
 │   ├── core/             # Core configuration (config.py, security.py)
 │   ├── models/           # Pydantic data models (schemas)
 │   ├── services/         # Business-logic service layer
 │   │   ├── auth_service.py        # User authentication
 │   │   ├── glm_service.py         # GLM-4 LLM service
 │   │   ├── lipsync_service.py     # LatentSync lip sync
 │   │   ├── publish_service.py     # Social media publishing
 │   │   └── voice_clone_service.py # Qwen3-TTS voice cloning
 │   └── tests/            # Unit and integration tests
 ├── scripts/              # Ops scripts (watchdog.py, init_db.py)
 ├── assets/               # Asset library (fonts, bgm, styles)
 └── requirements.txt      # Dependency list
 ```
 ---
 ## 🔌 API Conventions
 The backend service listens on port `8006` by default.
 - **Docs**: `http://localhost:8006/docs` (Swagger UI)
 - **Authentication**: Bearer Token (JWT)
 ### Core Modules
 1.  **Auth**
    *   `POST /api/auth/login`: Log in (phone number)
    *   `POST /api/auth/register`: Register
    *   `GET /api/auth/me`: Get the current user
 2.  **Videos**
    *   `POST /api/videos/generate`: Submit a generation task
    *   `GET /api/videos/tasks/{task_id}`: Query task status
    *   `GET /api/videos/generated`: List previously generated videos
    *   `DELETE /api/videos/generated/{video_id}`: Delete a generated video
 > **Correction (16:20)**: the task-status and history endpoints are now `/api/videos/tasks/{task_id}` and `/api/videos/generated`.
 3.  **Materials**
    *   `POST /api/materials/upload`: Upload a material (direct upload to Supabase)
    *   `GET /api/materials`: List materials
 4.  **Publish**
    *   `POST /api/publish`: Publish a video to Bilibili/Douyin/Xiaohongshu
 5.  **Assets**
    *   `GET /api/assets/subtitle-styles`: List subtitle styles
    *   `GET /api/assets/title-styles`: List title styles
    *   `GET /api/assets/bgm`: List background music tracks
 ---
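 ## 🎛️ Extended Video Generation Parameters
 `POST /api/videos/generate` accepts the following optional fields:
 - `subtitle_style_id`: subtitle style ID
 - `title_style_id`: title style ID
 - `subtitle_font_size`: subtitle font size (overrides the style default)
 - `title_font_size`: title font size (overrides the style default)
 - `bgm_id`: background music ID
 - `bgm_volume`: background music volume (0-1, default 0.2)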
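A minimal sketch of how these optional fields could be modelled and range-checked on the server. The class name `GenerateOptions`, the ID and font-size types, and the validation step are illustrative assumptions; only the field names and the `bgm_volume` range and default come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerateOptions:
    """Hypothetical container for the optional fields of POST /api/videos/generate."""

    subtitle_style_id: Optional[str] = None   # assumed to be a string ID
    title_style_id: Optional[str] = None      # assumed to be a string ID
    subtitle_font_size: Optional[int] = None  # overrides the style default when set
    title_font_size: Optional[int] = None     # overrides the style default when set
    bgm_id: Optional[str] = None              # assumed to be a string ID
    bgm_volume: float = 0.2                   # documented range 0-1, default 0.2

    def __post_init__(self) -> None:
        # Reject volumes outside the documented 0-1 range.
        if not 0.0 <= self.bgm_volume <= 1.0:
            raise ValueError("bgm_volume must be between 0 and 1")
```

Leaving every field optional mirrors the documented behaviour: a request that omits them all falls back to the style defaults and to a BGM volume of 0.2.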
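 ## 📦 Asset Library and Static Files
 - Local asset directory: `backend/assets/{fonts,bgm,styles}`
 - Static URL path: `/assets` (used by the frontend for style previews and BGM auditioning)
 ## 🎵 Background Music Mixing Strategy
 - Mixing happens **after lip sync**, so it cannot shift the subtitle or lip-sync timeline.
 - FFmpeg's `amix` filter is used with normalization disabled, so the voice-over volume stays stable.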
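The mixing step could be expressed as the following FFmpeg invocation. This is a sketch under assumptions, not the project's actual command: the function name, the file paths, `duration=first`, and copying the video stream are all illustrative. The `normalize=0` option of `amix` requires a reasonably recent FFmpeg build.

```python
from typing import List


def build_bgm_mix_cmd(video_path: str, bgm_path: str, output_path: str,
                      bgm_volume: float = 0.2) -> List[str]:
    """Build an FFmpeg argument list that mixes BGM under the existing voice track."""
    filter_graph = (
        f"[1:a]volume={bgm_volume}[bgm];"          # scale only the BGM input
        "[0:a][bgm]amix=inputs=2:duration=first:"  # stop when the video's audio ends
        "normalize=0[aout]"                        # do not rescale the voice-over
    )
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-i", bgm_path,
        "-filter_complex", filter_graph,
        "-map", "0:v", "-map", "[aout]",
        "-c:v", "copy",  # the video stream is not re-encoded
        output_path,
    ]
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell quoting problems with file names.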
 ## 🛠️ Development Environment Setup
 ### 1. Virtual Environment
 ```bash
 cd backend
 python -m venv venv
 source venv/bin/activate  # Linux/macOS
 # .\venv\Scripts\activate # Windows
 ```
 ### 2. Install Dependencies
 ```bash
 pip install -r requirements.txt
 ```
 ### 3. Configure Environment Variables
 Copy `.env.example` to `.env` and fill in the required keys:
 ```ini
 # Supabase
 SUPABASE_URL=http://localhost:8008
 SUPABASE_KEY=your_service_role_key
 # GLM API (used for AI title generation)
 GLM_API_KEY=your_glm_api_key
 # LatentSync configuration
 LATENTSYNC_GPU_ID=1
 ```
 ### 4. Start the Service
 **Development mode (hot reload)**:
 ```bash
 uvicorn app.main:app --host 0.0.0.0 --port 8006 --reload
 ```
 ---
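 ## 🧩 Service Integration Guide
 ### Integrating a New Model
 To integrate a new AI model (for example, a new TTS engine):
 1.  Create a new service class under `app/services/` (e.g. `NewTTSService`).
 2.  Implement a `generate` method; it may invoke the model through a subprocess or over HTTP.
 3.  **Important**: if the model uses the GPU, guard it with an `asyncio.Lock` so that concurrent requests cannot cause an OOM.
 4.  Add the corresponding route in `app/api/`.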
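The concurrency rule in step 3 can be sketched as follows. The service body is a stand-in (a short `asyncio.sleep` replaces real inference), and the `active` / `max_active` counters exist only to make the serialization observable; none of this is taken from the project's code.

```python
import asyncio


class NewTTSService:
    """Hypothetical GPU-backed service; only one generate() runs at a time."""

    def __init__(self) -> None:
        self._gpu_lock = asyncio.Lock()
        self.active = 0      # calls currently inside the GPU section
        self.max_active = 0  # highest concurrency observed

    async def generate(self, text: str) -> str:
        async with self._gpu_lock:  # serialize access to the GPU
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            await asyncio.sleep(0.01)  # stand-in for real inference
            self.active -= 1
            return f"audio for {text}"


async def demo() -> int:
    service = NewTTSService()
    # Five concurrent requests still enter the GPU section one by one.
    await asyncio.gather(*(service.generate(f"line {i}") for i in range(5)))
    return service.max_active
```

An `asyncio.Lock` only protects coroutines within a single process; if the backend runs several worker processes, each would hold its own lock.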
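 ### Adding Scheduled Tasks
 **APScheduler** or **Crontab** is currently recommended for managing scheduled tasks.
 Scheduled social media publishing currently relies on delayed execution in `playwright`; a migration to a Celery queue is planned.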
 ---
 ## 🛡️ Error Handling
 The whole project uses `Loguru` for logging.
 ```python
 from fastapi import HTTPException
 from loguru import logger
 try:
    ...  # business logic
 except Exception as e:
    logger.error(f"操作失败: {e}")
    raise HTTPException(status_code=500, detail="服务器内部错误") from e
 ```
 ---
 ## 🧪 Testing
 Run the test suite:
 ```bash
 pytest
 ```
--- a/Docs/DEPLOY_MANUAL.md
+++ b/Docs/DEPLOY_MANUAL.md
@@ -98,6 +98,15 @@ playwright install chromium
 ---
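 ### Optional: AI Title/Tag Generation
 > ✅ To enable the "AI title/tag generation" feature, make sure the backend can reach the external API.
 - `https://open.bigmodel.cn` must be reachable
 - The API key is configured in `backend/app/services/glm_service.py` (replace it with your own key)
 ---
 ## Step 5: Deploy the User Authentication System (Supabase + Auth)
 > 🔐 **Includes**: login/registration, Supabase database setup, JWT authentication, admin console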
@@ -258,7 +267,42 @@ chmod +x run_latentsync.sh
 pm2 start ./run_latentsync.sh --name vigent2-latentsync
 ```
-### 4. 保存当前列表 (开机自启)
+### 4. Start the Qwen3-TTS Voice Cloning Service (Optional)
 > Start this service if you need the voice cloning feature.
 1. Install the HTTP service dependencies:
 ```bash
 conda activate qwen-tts
 pip install fastapi uvicorn python-multipart
 ```
 2. The startup script is in the project root: `run_qwen_tts.sh`
 3. Start it with pm2:
 ```bash
 cd /home/rongye/ProgramFiles/ViGent2
 pm2 start ./run_qwen_tts.sh --name vigent2-qwen-tts
 pm2 save
 ```
 4. Verify the service:
 ```bash
 # Check health status
 curl http://localhost:8009/health
 ```
 ### 5. Start the Service Watchdog
 > 🛡️ **Recommended**: monitors the health of the Qwen-TTS and LatentSync services and restarts them automatically if they hang.
 ```bash
 cd /home/rongye/ProgramFiles/ViGent2
 pm2 start ./run_watchdog.sh --name vigent2-watchdog
 pm2 save
 ```
 ### 6. Save the Process List (Start on Boot)
 ```bash
 pm2 save
@@ -271,6 +315,7 @@ pm2 startup
 pm2 status                    # Show the status of all services
 pm2 logs                      # Show all logs
 pm2 logs vigent2-backend      # Show backend logs
 pm2 logs vigent2-qwen-tts     # Show Qwen3-TTS logs
 pm2 restart all               # Restart all services
 pm2 stop vigent2-latentsync   # Stop the LatentSync service
 pm2 delete all                # Delete all services
@@ -322,7 +367,46 @@ server {
 ---
-## 步骤 12: 配置阿里云 Nginx 网关 (关键)
+---
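 ## Step 13: Deploy Optional Features (Subtitles and Copywriting Assistant)
 This section covers deploying per-character highlighted subtitles, the intro title, and the script extraction assistant.
 ### 13.1 Deploy the Subtitle System
 Consists of `faster-whisper` (subtitle generation) and `Remotion` (video rendering).
 For detailed steps, see the **[Subtitle Deployment Guide](SUBTITLE_DEPLOY.md)**.
 In brief:
 1. Install the Python dependency: `faster-whisper`
 2. Install the Node.js dependencies: `npm install` (in the `remotion/` directory)
 3. Verify: `npx remotion --version`
 ### 13.2 Deploy the Copywriting Assistant
 Extracts scripts from Bilibili/Douyin/TikTok video links and rewrites them with AI.
 1. **Install the core dependencies**:
   ```bash
   cd /home/rongye/ProgramFiles/ViGent2/backend
   source venv/bin/activate
   pip install yt-dlp zai-sdk
   ```
 2. **Configure AI rewriting (GLM)**:
   Make sure `GLM_API_KEY` is set in `.env`:
   ```ini
   GLM_API_KEY=your_zhipu_api_key
   ```
 3. **Verify**:
   Open `http://localhost:8006/docs` and test the `/api/tools/extract-script` endpoint.
 ---
 ## Step 14: Configure the Alibaba Cloud Nginx Gateway (Critical)
 > ⚠️ **CRITICAL**: if a domain such as `api.hbyrkj.top` is the entry point, the upload size limit must be lifted in the Nginx configuration on Alibaba Cloud (or whichever host is the public entry point).
 > **This is the root cause of the 500/413 errors.**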
@@ -370,6 +454,7 @@ python3 -c "import torch; print(torch.cuda.is_available())"
 sudo lsof -i :8006
 sudo lsof -i :3002
 sudo lsof -i :8007
 sudo lsof -i :8009  # Qwen3-TTS
 ```
 ### 查看日志
@@ -379,6 +464,7 @@ sudo lsof -i :8007
 pm2 logs vigent2-backend
 pm2 logs vigent2-frontend
 pm2 logs vigent2-latentsync
 pm2 logs vigent2-qwen-tts
 ```
 ### SSH 连接卡顿 / 系统响应慢
@@ -405,6 +491,7 @@ pm2 logs vigent2-latentsync
 | `fastapi` | Web API framework |
 | `uvicorn` | ASGI server |
 | `edge-tts` | Microsoft TTS voice-over |
 | `httpx` | HTTP client for the GLM API |
 | `playwright` | Automated social media publishing |
 | `biliup` | Bilibili video upload |
 | `loguru` | Logging |
--- a/Docs/DevLogs/Day13.md
+++ b/Docs/DevLogs/Day13.md
@@ -0,0 +1,431 @@
 # Day 13 - 声音克隆功能集成 + 字幕功能
 **日期**：2026-01-29
 ---
 ## 🎙️ Qwen3-TTS 服务集成
 ### 背景
 在 Day 12 完成 Qwen3-TTS 模型部署后，今日重点是将其集成到 ViGent2 系统中，提供完整的声音克隆功能。
 ### 架构设计
 ```
 ┌─────────────────────────────────────────────────────────────┐
 │                    前端 (Next.js)                             │
 │      参考音频上传 → TTS 模式选择 → 视频生成请求               │
 └─────────────────────────────────────────────────────────────┘
                            │
                            ▼
 ┌─────────────────────────────────────────────────────────────┐
 │                   后端 (FastAPI :8006)                        │
 │  ref-audios API → voice_clone_service → video_service        │
 └─────────────────────────────────────────────────────────────┘
                            │
                            ▼
 ┌─────────────────────────────────────────────────────────────┐
 │               Qwen3-TTS 服务 (FastAPI :8009)                  │
 │            HTTP /generate → 返回克隆音频                      │
 └─────────────────────────────────────────────────────────────┘
 ```
 ### Qwen3-TTS HTTP 服务 (`qwen_tts_server.py`)
 创建独立的 FastAPI 服务，运行在 8009 端口：
 ```python
 from fastapi import FastAPI, UploadFile, Form, HTTPException
 from fastapi.responses import Response
 import torch
 import soundfile as sf
 from qwen_tts import Qwen3TTSModel
 import io, os
 app = FastAPI(title="Qwen3-TTS Voice Clone Service")
 # GPU 配置
 GPU_ID = os.getenv("QWEN_TTS_GPU_ID", "0")
 model = None
@app.on_event("startup")
 async def load_model():
    global model
    model = Qwen3TTSModel.from_pretrained(
        "./checkpoints/0.6B-Base",
        device_map=f"cuda:{GPU_ID}",
        dtype=torch.bfloat16,
    )
@app.get("/health")
 async def health():
    return {"service": "Qwen3-TTS", "ready": model is not None, "gpu_id": GPU_ID}
@app.post("/generate")
 async def generate(
    ref_audio: UploadFile,
    text: str = Form(...),
    ref_text: str = Form(""),
    language: str = Form("Chinese"),
 ):
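    # Save the reference audio to a temp file; basename() strips any directory
    # components from the client-supplied file name
    ref_path = os.path.join("/tmp", f"ref_{os.path.basename(ref_audio.filename or 'ref.wav')}")
    with open(ref_path, "wb") as f:
        f.write(await ref_audio.read())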
    # 生成克隆音频
    wavs, sr = model.generate_voice_clone(
        text=text,
        language=language,
        ref_audio=ref_path,
        ref_text=ref_text or "一段参考音频。",
    )
    # 返回 WAV 音频
    buffer = io.BytesIO()
    sf.write(buffer, wavs[0], sr, format="WAV")
    buffer.seek(0)
    return Response(content=buffer.read(), media_type="audio/wav")
 ```
 ### 后端声音克隆服务 (`voice_clone_service.py`)
 通过 HTTP 调用 Qwen3-TTS 服务：
 ```python
 import aiohttp
 from loguru import logger
 QWEN_TTS_URL = "http://localhost:8009"
 async def generate_cloned_audio(
    ref_audio_path: str,
    text: str,
    output_path: str,
    ref_text: str = "",
 ) -> str:
    """调用 Qwen3-TTS 服务生成克隆音频"""
    async with aiohttp.ClientSession() as session:
        with open(ref_audio_path, "rb") as f:
            data = aiohttp.FormData()
            data.add_field("ref_audio", f, filename="ref.wav")
            data.add_field("text", text)
            data.add_field("ref_text", ref_text)
            async with session.post(f"{QWEN_TTS_URL}/generate", data=data) as resp:
                if resp.status != 200:
                    raise Exception(f"Qwen3-TTS error: {resp.status}")
                audio_data = await resp.read()
                with open(output_path, "wb") as out:
                    out.write(audio_data)
    return output_path
 ```
 ---
 ## 📂 参考音频管理 API
 ### 新增 API 端点 (`ref_audios.py`)
 | 端点 | 方法 | 功能 |
 |------|------|------|
 | `/api/ref-audios` | GET | 获取参考音频列表 |
 | `/api/ref-audios` | POST | 上传参考音频 |
 | `/api/ref-audios/{id}` | DELETE | 删除参考音频 |
 ### Supabase Bucket 配置
 为参考音频创建独立存储桶：
 ```sql
 -- 创建 ref-audios bucket
 INSERT INTO storage.buckets (id, name, public)
 VALUES ('ref-audios', 'ref-audios', true)
 ON CONFLICT (id) DO NOTHING;
 -- RLS 策略
 CREATE POLICY "Allow public uploads" ON storage.objects
 FOR INSERT TO anon WITH CHECK (bucket_id = 'ref-audios');
 CREATE POLICY "Allow public read" ON storage.objects
 FOR SELECT TO anon USING (bucket_id = 'ref-audios');
 CREATE POLICY "Allow public delete" ON storage.objects
 FOR DELETE TO anon USING (bucket_id = 'ref-audios');
 ```
 ---
 ## 🎨 前端声音克隆 UI
 ### TTS 模式选择
 在视频生成页面新增声音克隆选项：
 ```tsx
 {/* TTS 模式选择 */}
 <div className="flex gap-2 mb-4">
  <button
    onClick={() => setTtsMode("edge")}
    className={`px-4 py-2 rounded-lg ${ttsMode === "edge" ? "bg-purple-600" : "bg-white/10"}`}
  >
    🔊 EdgeTTS
  </button>
  <button
    onClick={() => setTtsMode("clone")}
    className={`px-4 py-2 rounded-lg ${ttsMode === "clone" ? "bg-purple-600" : "bg-white/10"}`}
  >
    🎙️ 声音克隆
  </button>
 </div>
 ```
 ### 参考音频管理
 新增参考音频上传和列表展示功能：
 | 功能 | 实现 |
 |------|------|
 | 音频上传 | 拖拽上传 WAV/MP3，直传 Supabase |
 | 列表展示 | 显示文件名、时长、上传时间 |
 | 快速选择 | 点击即选中作为参考音频 |
 | 删除功能 | 删除不需要的参考音频 |
 ---
 ## ✅ 端到端测试验证
 ### 测试流程
 1. **上传参考音频**: 3 秒参考音频 → Supabase ref-audios bucket
 2. **选择声音克隆模式**: TTS 模式切换为 "声音克隆"
 3. **输入文案**: 测试口播文案
 4. **生成视频**: 
   - TTS 阶段调用 Qwen3-TTS (17.7s)
   - LipSync 阶段调用 LatentSync (122.8s)
 5. **播放验证**: 视频声音与参考音色一致
 ### 测试结果
 - ✅ 参考音频上传成功
 - ✅ Qwen3-TTS 生成克隆音频 (15s 推理，4.6s 音频)
 - ✅ LatentSync 唇形同步正常
 - ✅ 总生成时间 143.1s
 - ✅ 前端视频播放正常
 ---
 ## 🔧 PM2 服务配置
 ### 新增 Qwen3-TTS 服务
 **前置依赖安装**：
 ```bash
 conda activate qwen-tts
 pip install fastapi uvicorn python-multipart
 ```
 启动脚本 `run_qwen_tts.sh` (位于项目**根目录**)：
 ```bash
 #!/bin/bash
 cd /home/rongye/ProgramFiles/ViGent2/models/Qwen3-TTS
 /home/rongye/ProgramFiles/miniconda3/envs/qwen-tts/bin/python qwen_tts_server.py
 ```
 PM2 管理命令：
 ```bash
 # 进入根目录启动
 cd /home/rongye/ProgramFiles/ViGent2
 pm2 start ./run_qwen_tts.sh --name vigent2-qwen-tts
 pm2 save
 # 查看状态
 pm2 status
 # 查看日志
 pm2 logs vigent2-qwen-tts --lines 50
 ```
 ### 完整服务列表
 | 服务名 | 端口 | 功能 |
 |--------|------|------|
 | vigent2-backend | 8006 | FastAPI 后端 |
 | vigent2-frontend | 3002 | Next.js 前端 |
 | vigent2-latentsync | 8007 | LatentSync 唇形同步 |
 | vigent2-qwen-tts | 8009 | Qwen3-TTS 声音克隆 |
 ---
 ## 📁 今日修改文件清单
 | 文件 | 变更类型 | 说明 |
 |------|----------|------|
 | `models/Qwen3-TTS/qwen_tts_server.py` | 新增 | Qwen3-TTS HTTP 推理服务 |
 | `run_qwen_tts.sh` | 新增 | PM2 启动脚本 (根目录) |
 | `backend/app/services/voice_clone_service.py` | 新增 | 声音克隆服务 (HTTP 调用) |
 | `backend/app/api/ref_audios.py` | 新增 | 参考音频管理 API |
 | `backend/app/main.py` | 修改 | 注册 ref-audios 路由 |
 | `frontend/src/app/page.tsx` | 修改 | TTS 模式选择 + 参考音频 UI |
 ---
 ## 🔗 相关文档
 - [task_complete.md](../task_complete.md) - 任务总览
 - [Day12.md](./Day12.md) - iOS 兼容与 Qwen3-TTS 部署
 - [QWEN3_TTS_DEPLOY.md](../QWEN3_TTS_DEPLOY.md) - Qwen3-TTS 部署指南
 - [SUBTITLE_DEPLOY.md](../SUBTITLE_DEPLOY.md) - 字幕功能部署指南
 - [DEPLOY_MANUAL.md](../DEPLOY_MANUAL.md) - 完整部署手册
 ---
 ## 🎬 逐字高亮字幕 + 片头标题功能
 ### 背景
 为提升视频质量，新增逐字高亮字幕（卡拉OK效果）和片头标题功能。
 ### 技术方案
 | 组件 | 技术 | 说明 |
 |------|------|------|
 | 字幕对齐 | **faster-whisper** | 生成字级别时间戳 |
 | 视频渲染 | **Remotion** | React 视频合成框架 |
 ### 架构设计
 ```
 原有流程:
  文本 → EdgeTTS → 音频 → LatentSync → FFmpeg合成 → 最终视频
 新流程:
  文本 → EdgeTTS → 音频 ─┬→ LatentSync → 唇形视频 ─┐
                        └→ faster-whisper → 字幕JSON ─┴→ Remotion合成 → 最终视频
 ```
 ### 后端新增服务
 #### 1. 字幕服务 (`whisper_service.py`)
 基于 faster-whisper 生成字级别时间戳：
 ```python
 from faster_whisper import WhisperModel
 class WhisperService:
    def __init__(self, model_size="large-v3", device="cuda"):
        self.model = WhisperModel(model_size, device=device)
    async def align(self, audio_path: str, text: str, output_path: str):
        segments, info = self.model.transcribe(audio_path, word_timestamps=True)
        # 将词拆分成单字，时间戳线性插值
        result = {"segments": [...]}
        # 保存到 JSON
 ```
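 **Character-splitting algorithm**: faster-whisper returns word-level timestamps for Chinese, so the system splits each word into single characters and linearly interpolates their timestamps: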
 ```python
 # 输入: {"word": "大家好", "start": 0.0, "end": 0.9}
 # 输出:
 [
  {"word": "大", "start": 0.0, "end": 0.3},
  {"word": "家", "start": 0.3, "end": 0.6},
  {"word": "好", "start": 0.6, "end": 0.9}
 ]
 ```
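A minimal sketch of the splitting step, assuming each character receives an equal share of the word's duration. The function name, the whitespace stripping, and the rounding to milliseconds are illustrative choices and not taken from `whisper_service.py`.

```python
from typing import Dict, List, Union

Word = Dict[str, Union[str, float]]


def split_word_to_chars(word: Word) -> List[Word]:
    """Split one word-level entry into per-character entries with evenly spaced times."""
    text = str(word["word"]).strip()
    start, end = float(word["start"]), float(word["end"])
    if not text:
        return []
    step = (end - start) / len(text)  # each character gets an equal share
    return [
        {
            "word": ch,
            "start": round(start + i * step, 3),
            "end": round(start + (i + 1) * step, 3),
        }
        for i, ch in enumerate(text)
    ]
```

Equal spacing is only an approximation of real pronunciation timing, but it keeps the highlight moving steadily within each word.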
 #### 2. Remotion 渲染服务 (`remotion_service.py`)
 调用 Remotion 渲染字幕和标题：
 ```python
 class RemotionService:
    async def render(self, video_path, output_path, captions_path, title, ...):
        cmd = f"npx ts-node render.ts --video {video_path} --output {output_path} ..."
        # 执行渲染
 ```
 ### Remotion 项目结构
 ```
 remotion/
 ├── package.json              # Node.js 依赖
 ├── render.ts                 # 服务端渲染脚本
 └── src/
    ├── Video.tsx             # 主视频组件
    ├── components/
    │   ├── Title.tsx         # 片头标题（淡入淡出）
    │   ├── Subtitles.tsx     # 逐字高亮字幕
    │   └── VideoLayer.tsx    # 视频图层
    └── utils/
        └── captions.ts       # 字幕数据类型
 ```
 ### 前端 UI
 新增标题和字幕设置区块：
 | 功能 | 说明 |
 |------|------|
 | 片头标题输入 | 可选，在视频开头显示 3 秒 |
 | 字幕开关 | 默认开启，可关闭 |
 ### 遇到的问题与修复
 #### 问题 1: `fs` 模块错误
 **现象**：Remotion 打包失败，提示 `fs.js doesn't exist`
 **原因**：`captions.ts` 中有 `loadCaptions` 函数使用了 Node.js 的 `fs` 模块
 **修复**：删除未使用的 `loadCaptions` 函数
 #### 问题 2: 视频文件读取失败
 **现象**：`file://` 协议无法读取本地视频
 **修复**：
 1. `render.ts` 使用 `publicDir` 指向视频目录
 2. `VideoLayer.tsx` 使用 `staticFile()` 加载视频
 ```typescript
 // render.ts
 const publicDir = path.dirname(path.resolve(options.videoPath));
 const bundleLocation = await bundle({
  entryPoint: path.resolve(__dirname, './src/index.ts'),
  publicDir,  // 关键配置
 });
 // VideoLayer.tsx
 const videoUrl = staticFile(videoSrc);
 ```
 ### 测试结果
 - ✅ faster-whisper 字幕对齐成功（~1秒）
 - ✅ Remotion 渲染成功（~10秒）
 - ✅ 字幕逐字高亮效果正常
 - ✅ 片头标题淡入淡出正常
 - ✅ 降级机制正常（Remotion 失败时回退到 FFmpeg）
 ---
 ## 📁 今日修改文件清单（完整）
 | 文件 | 变更类型 | 说明 |
 |------|----------|------|
 | `models/Qwen3-TTS/qwen_tts_server.py` | 新增 | Qwen3-TTS HTTP 推理服务 |
 | `run_qwen_tts.sh` | 新增 | PM2 启动脚本 (根目录) |
 | `backend/app/services/voice_clone_service.py` | 新增 | 声音克隆服务 (HTTP 调用) |
 | `backend/app/services/whisper_service.py` | 新增 | 字幕对齐服务 (faster-whisper) |
 | `backend/app/services/remotion_service.py` | 新增 | Remotion 渲染服务 |
 | `backend/app/api/ref_audios.py` | 新增 | 参考音频管理 API |
 | `backend/app/api/videos.py` | 修改 | 集成字幕和标题功能 |
 | `backend/app/main.py` | 修改 | 注册 ref-audios 路由 |
 | `backend/requirements.txt` | 修改 | 添加 faster-whisper 依赖 |
 | `remotion/` | 新增 | Remotion 视频渲染项目 |
 | `frontend/src/app/page.tsx` | 修改 | TTS 模式选择 + 标题字幕 UI |
 | `Docs/SUBTITLE_DEPLOY.md` | 新增 | 字幕功能部署文档 |
--- a/Docs/DevLogs/Day14.md
+++ b/Docs/DevLogs/Day14.md
@@ -0,0 +1,402 @@
 # Day 14 - 模型升级 + 标题标签生成 + 前端修复
 **日期**：2026-01-30
 ---
 ## 🚀 Qwen3-TTS 模型升级 (0.6B → 1.7B)
 ### 背景
 为提升声音克隆质量，将 Qwen3-TTS 模型从 0.6B-Base 升级到 1.7B-Base。
 ### 变更内容
 | 项目 | 升级前 | 升级后 |
 |------|--------|--------|
 | 模型 | 0.6B-Base | **1.7B-Base** |
 | 大小 | 2.4GB | 6.8GB |
 | 质量 | 基础 | 更高质量 |
 ### 代码修改
 **文件**: `models/Qwen3-TTS/qwen_tts_server.py`
 ```python
 # 升级前
 MODEL_PATH = Path(__file__).parent / "checkpoints" / "0.6B-Base"
 # 升级后
 MODEL_PATH = Path(__file__).parent / "checkpoints" / "1.7B-Base"
 ```
 ### 模型下载
 ```bash
 cd /home/rongye/ProgramFiles/ViGent2/models/Qwen3-TTS
 # 下载 1.7B-Base 模型 (6.8GB)
 modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-Base --local_dir ./checkpoints/1.7B-Base
 ```
 ### 结果
 - ✅ 模型加载正常 (GPU0, bfloat16)
 - ✅ 声音克隆质量提升
 - ✅ 推理速度可接受
 ---
 ## 🎨 标题和字幕显示优化
 ### 字幕组件优化 (`Subtitles.tsx`)
 **文件**: `remotion/src/components/Subtitles.tsx`
 优化内容：
 - 调整高亮颜色配置
 - 优化文字描边效果（多层阴影）
 - 调整字间距和行高
 ```typescript
 export const Subtitles: React.FC<SubtitlesProps> = ({
  captions,
  highlightColor = '#FFFF00',  // 高亮颜色
  normalColor = '#FFFFFF',      // 普通文字颜色
  fontSize = 52,
 }) => {
  // 样式优化
  const style = {
    textShadow: `
      2px 2px 4px rgba(0,0,0,0.8),
      -2px -2px 4px rgba(0,0,0,0.8),
      ...
    `,
    letterSpacing: '2px',
    lineHeight: 1.4,
    maxWidth: '90%',
  };
 };
 ```
 ### 标题组件优化 (`Title.tsx`)
 **文件**: `remotion/src/components/Title.tsx`
 优化内容：
 - 淡入淡出动画效果
 - 下滑入场动画
 - 可配置显示时长
 ```typescript
 interface TitleProps {
  title: string;
  duration?: number;        // 标题显示时长（秒，默认3秒）
  fadeOutStart?: number;    // 开始淡出的时间（秒，默认2秒）
 }
 // Animation timing
 // Fade in: 0-0.5 s
 // Fade out: 2-3 s
 // Slide down: 0-0.5 s, -20px → 0px
 ```
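The animation timings can be written as piecewise-linear functions of time. The sketch below is in Python purely for illustration (the component itself is TypeScript) and assumes linear interpolation, which the source does not state; the function names are hypothetical.

```python
def lerp_clamped(t: float, t0: float, t1: float, v0: float, v1: float) -> float:
    """Linearly map t from [t0, t1] to [v0, v1], clamping outside the range."""
    if t <= t0:
        return v0
    if t >= t1:
        return v1
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def title_opacity(t: float, fade_in: float = 0.5,
                  fade_out_start: float = 2.0, duration: float = 3.0) -> float:
    """Opacity at time t (seconds): fade in, hold, then fade out."""
    return min(lerp_clamped(t, 0.0, fade_in, 0.0, 1.0),
               lerp_clamped(t, fade_out_start, duration, 1.0, 0.0))


def title_offset_px(t: float, slide: float = 0.5) -> float:
    """Vertical offset at time t: slides from -20px to 0px."""
    return lerp_clamped(t, 0.0, slide, -20.0, 0.0)
```

Taking the minimum of the fade-in and fade-out ramps gives a single opacity curve that holds at 1 between 0.5 s and 2 s.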
 ### 结果
 - ✅ 字幕显示更清晰
 - ✅ 标题动画更流畅
 ---
 ## 🤖 标题标签自动生成功能
 ### 功能描述
 使用 AI（智谱 GLM-4-Flash）根据口播文案自动生成视频标题和标签。
 ### 后端实现
 #### 1. GLM 服务 (`glm_service.py`)
 **文件**: `backend/app/services/glm_service.py`
 ```python
 class GLMService:
    """智谱 GLM AI 服务"""
    async def generate_meta(self, text: str) -> dict:
        """根据文案生成标题和标签"""
        prompt = """根据以下口播文案，生成一个吸引人的短视频标题和3个相关标签。
 要求：
 1. 标题要简洁有力，能吸引观众点击，不超过10个字
 2. 标签要与内容相关，便于搜索和推荐，只要3个
 返回格式：{"title": "标题", "tags": ["标签1", "标签2", "标签3"]}
 """
        # 调用 GLM-4-Flash API
        response = await self._call_api(prompt + text)
        return self._parse_json(response)
 ```
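 **Fault-tolerant JSON parsing**:
 - Parses the response directly as JSON
 - Extracts an embedded JSON block from surrounding text
 - Extracts the contents of a fenced `json` code block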
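The three fallbacks could be combined as in the sketch below. It is an illustration of the idea rather than the actual `_parse_json` implementation: the function name, the order of the attempts, and the final `ValueError` are assumptions.

```python
import json
import re
from typing import Any, Dict

FENCE = "`" * 3  # three backticks, assembled here so the example stays readable


def parse_llm_json(raw: str) -> Dict[str, Any]:
    """Try strict JSON, then a fenced json block, then the outermost {...} span."""
    candidates = [raw.strip()]
    fenced = re.search(re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE),
                       raw, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1).strip())
    first, last = raw.find("{"), raw.rfind("}")
    if first != -1 and last > first:
        candidates.append(raw[first:last + 1])
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    raise ValueError("no JSON object found in model response")
```

Raising when every attempt fails lets the API layer decide whether to retry the model call or return an error to the client.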
 #### 2. API 端点 (`ai.py`)
 **文件**: `backend/app/api/ai.py`
 ```python
 from pydantic import BaseModel
 class GenerateMetaRequest(BaseModel):
    text: str  # 口播文案
 class GenerateMetaResponse(BaseModel):
    title: str        # 生成的标题
    tags: list[str]   # 生成的标签列表
@router.post("/generate-meta", response_model=GenerateMetaResponse)
 async def generate_meta(request: GenerateMetaRequest):
    """AI 生成标题和标签"""
    result = await glm_service.generate_meta(request.text)
    return result
 ```
 ### 前端实现
 **文件**: `frontend/src/app/page.tsx`
 #### UI 按钮
 ```tsx
 <button
  onClick={handleGenerateMeta}
  disabled={isGeneratingMeta || !text.trim()}
  className="px-2 py-1 text-xs rounded transition-all whitespace-nowrap"
 >
  {isGeneratingMeta ? "⏳ 生成中..." : "🤖 AI生成标题标签"}
 </button>
 ```
 #### 处理逻辑
 ```typescript
 const handleGenerateMeta = async () => {
  if (!text.trim()) {
    alert("请先输入口播文案");
    return;
  }
  setIsGeneratingMeta(true);
  try {
    const { data } = await api.post('/api/ai/generate-meta', { text: text.trim() });
    // 更新首页标题
    setVideoTitle(data.title || "");
    // 同步到发布页 localStorage
    localStorage.setItem(`vigent_${storageKey}_publish_title`, data.title || "");
    localStorage.setItem(`vigent_${storageKey}_publish_tags`, JSON.stringify(data.tags || []));
  } catch (err: any) {
    alert(`AI 生成失败: ${err.message}`);
  } finally {
    setIsGeneratingMeta(false);
  }
 };
 ```
 ### 发布页集成
 **文件**: `frontend/src/app/publish/page.tsx`
 从 localStorage 恢复 AI 生成的标题和标签：
 ```typescript
 // 恢复标题和标签
 const savedTitle = localStorage.getItem(`vigent_${storageKey}_publish_title`);
 const savedTags = localStorage.getItem(`vigent_${storageKey}_publish_tags`);
 if (savedTags) {
  try {
    const parsed = JSON.parse(savedTags);
    if (Array.isArray(parsed)) {
      setTags(parsed.join(', '));  // 数组转逗号分隔字符串
    } else {
      setTags(savedTags);
    }
  } catch {
    setTags(savedTags);
  }
 }
 ```
 ### 结果
 - ✅ AI 生成标题和标签功能正常
 - ✅ 数据自动同步到发布页
 - ✅ 支持 JSON 数组和字符串格式兼容
 ---
 ## 🐛 前端文本保存问题修复
 ### 问题描述
 **现象**：页面刷新后，用户输入的文案、标题等数据丢失
 **原因**：
 1. 认证状态恢复失败时，`userId` 为 `null`
 2. 原代码判断 `!userId` 后用默认值覆盖 localStorage 数据
 3. 导致已保存的用户数据被清空
 ### 解决方案
 **文件**: `frontend/src/app/page.tsx`
 #### 1. 添加恢复完成标志
 ```typescript
 const [isRestored, setIsRestored] = useState(false);
 ```
 #### 2. 等待认证完成后恢复数据
 ```typescript
 useEffect(() => {
  if (isAuthLoading) return;  // 等待认证完成
  // 使用 userId 或 'guest' 作为 key
  const key = userId || 'guest';
  // 从 localStorage 恢复数据
  const savedText = localStorage.getItem(`vigent_${key}_text`);
  if (savedText) setText(savedText);
  // ... 恢复其他数据
  setIsRestored(true);  // 标记恢复完成
 }, [userId, isAuthLoading]);
 ```
 #### 3. 恢复完成后才保存
 ```typescript
 useEffect(() => {
  if (isRestored) {
    localStorage.setItem(`vigent_${storageKey}_text`, text);
  }
 }, [text, storageKey, isRestored]);
 ```
 ### 用户隔离机制
 ```typescript
 const storageKey = userId || 'guest';
 ```
 | 用户状态 | storageKey | 说明 |
 |----------|------------|------|
 | 已登录 | `user_xxx` | 数据按用户隔离 |
 | 未登录/认证失败 | `guest` | 使用统一 key |
 ### 数据恢复流程
 ```
 1. 页面加载
   ↓
 2. 检查 isAuthLoading
   ├─ true: 等待认证完成
   └─ false: 继续
   ↓
 3. 确定 storageKey (userId || 'guest')
   ↓
 4. 从 localStorage 读取数据
   ├─ 有保存数据: 恢复到状态
   └─ 无保存数据: 使用默认值
   ↓
 5. 设置 isRestored = true
   ↓
 6. 后续状态变化时保存到 localStorage
 ```
 ### 保存的数据项
 | Key | 说明 |
 |-----|------|
 | `vigent_${key}_text` | 口播文案 |
 | `vigent_${key}_title` | 视频标题 |
 | `vigent_${key}_subtitles` | 字幕开关 |
 | `vigent_${key}_ttsMode` | TTS 模式 |
 | `vigent_${key}_voice` | 选择的音色 |
 | `vigent_${key}_material` | 选择的素材 |
 | `vigent_${key}_publish_title` | 发布标题 |
 | `vigent_${key}_publish_tags` | 发布标签 |
 ### 结果
 - ✅ 页面刷新后数据正常恢复
 - ✅ 认证失败时不会覆盖已保存数据
 - ✅ 多用户数据隔离正常
 ---
 ## 🐛 登录页刷新循环修复
 ### 问题描述
 **现象**：登录页未登录时不断刷新，无法停留在表单页面。
 **原因**：
 1. `AuthProvider` 初始化时调用 `/api/auth/me`
 2. 未登录返回 401
 3. `axios` 全局拦截器遇到 401/403 重定向 `/login`
 4. 登录页本身也在 Provider 中，导致循环刷新
 ### 解决方案
 **文件**: `frontend/src/lib/axios.ts`
 在拦截器中对公开路由跳过重定向，仅在受保护页面触发登录跳转：
 ```typescript
 const PUBLIC_PATHS = new Set(['/login', '/register']);
 const isPublicPath = typeof window !== 'undefined' && PUBLIC_PATHS.has(window.location.pathname);
 if ((status === 401 || status === 403) && !isRedirecting && !isPublicPath) {
  // ... 保持原有重定向逻辑
 }
 ```
 ### 结果
 - ✅ 登录页不再刷新，表单可正常输入
 - ✅ 受保护页面仍会在 401/403 时跳转登录页
 ---
 ## 📁 今日修改文件清单
 | 文件 | 变更类型 | 说明 |
 |------|----------|------|
 | `models/Qwen3-TTS/qwen_tts_server.py` | 修改 | 模型路径升级到 1.7B-Base |
 | `Docs/QWEN3_TTS_DEPLOY.md` | 修改 | 更新部署文档为 1.7B 版本 |
 | `remotion/src/components/Subtitles.tsx` | 修改 | 优化字幕显示效果 |
 | `remotion/src/components/Title.tsx` | 修改 | 优化标题动画效果 |
 | `backend/app/services/glm_service.py` | 新增 | GLM AI 服务 |
 | `backend/app/api/ai.py` | 新增 | AI 生成标题标签 API |
 | `backend/app/main.py` | 修改 | 注册 ai 路由 |
 | `frontend/src/app/page.tsx` | 修改 | AI 生成按钮 + localStorage 修复 |
 | `frontend/src/app/publish/page.tsx` | 修改 | 恢复 AI 生成的标签 |
 | `frontend/src/lib/axios.ts` | 修改 | 公开路由跳过 401/403 登录重定向 |
 ---
 ## 🔗 相关文档
 - [task_complete.md](../task_complete.md) - 任务总览
 - [Day13.md](./Day13.md) - 声音克隆功能集成 + 字幕功能
 - [QWEN3_TTS_DEPLOY.md](../QWEN3_TTS_DEPLOY.md) - Qwen3-TTS 1.7B 部署指南
--- a/Docs/DevLogs/Day15.md
+++ b/Docs/DevLogs/Day15.md
@@ -0,0 +1,410 @@
 # Day 15 - 手机号登录迁移 + 账户设置功能
 **日期**：2026-02-02
 ---
 ## 🔐 认证系统迁移：邮箱 → 手机号
 ### 背景
 根据业务需求，将用户认证从邮箱登录迁移到手机号登录（11位中国手机号）。
 ### 变更范围
 | 组件 | 变更内容 |
 |------|----------|
 | 数据库 Schema | `email` 字段替换为 `phone` |
 | 后端 API | 注册/登录/获取用户信息接口使用 `phone` |
 | 前端页面 | 登录/注册页面改为手机号输入框 |
 | 管理员配置 | `ADMIN_EMAIL` 改为 `ADMIN_PHONE` |
 ---
 ## 📦 后端修改
 ### 1. 数据库 Schema (`schema.sql`)
 **文件**: `backend/database/schema.sql`
 ```sql
 CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    phone TEXT UNIQUE NOT NULL,  -- 原 email 改为 phone
    password_hash TEXT NOT NULL,
    username TEXT,
    role TEXT DEFAULT 'pending' CHECK (role IN ('pending', 'user', 'admin')),
    is_active BOOLEAN DEFAULT FALSE,
    expires_at TIMESTAMP WITH TIME ZONE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
 );
 CREATE INDEX idx_users_phone ON users(phone);
 ```
 ### 2. 认证 API (`auth.py`)
 **文件**: `backend/app/api/auth.py`
 #### 请求模型更新
 ```python
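 import re
 from typing import Optional
 from pydantic import BaseModel, field_validator
 class RegisterRequest(BaseModel):
    phone: str
    password: str
    username: Optional[str] = None
    @field_validator('phone')
    @classmethod
    def validate_phone(cls, v):
        # fullmatch also rejects a trailing newline, which re.match with '^\d{11}$' would accept
        if not re.fullmatch(r'\d{11}', v):
            raise ValueError('手机号必须是11位数字')
        return v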
 #### 新增修改密码接口
 ```python
 class ChangePasswordRequest(BaseModel):
    old_password: str
    new_password: str
    @field_validator('new_password')
    @classmethod
    def validate_new_password(cls, v):
        if len(v) < 6:
            raise ValueError('新密码长度至少6位')
        return v
@router.post("/change-password")
 async def change_password(request: ChangePasswordRequest, req: Request, response: Response):
    """修改密码，验证当前密码后更新"""
    # 1. 验证当前密码
    # 2. 更新密码 hash
    # 3. 重新生成 session token
    # 4. 返回新的 JWT Cookie
 ```
 ### 3. 配置更新
 **文件**: `backend/app/core/config.py`
 ```python
 # 管理员配置
 ADMIN_PHONE: str = ""  # 原 ADMIN_EMAIL
 ADMIN_PASSWORD: str = ""
 ```
 **文件**: `backend/.env`
 ```bash
 ADMIN_PHONE=your_admin_phone
 ADMIN_PASSWORD=your_admin_password
 ```
 ### 4. 管理员初始化 (`main.py`)
 **文件**: `backend/app/main.py`
 ```python
@app.on_event("startup")
 async def init_admin():
    admin_phone = settings.ADMIN_PHONE  # 原 ADMIN_EMAIL
    # ... 使用 phone 字段创建管理员
 ```
 ### 5. 管理员 API (`admin.py`)
 **文件**: `backend/app/api/admin.py`
 ```python
 class UserListItem(BaseModel):
    id: str
    phone: str  # 原 email
    username: Optional[str]
    role: str
    is_active: bool
    expires_at: Optional[str]
    created_at: str
 ```
 ---
 ## 🖥️ 前端修改
 ### 1. 登录页面 (`login/page.tsx`)
 **文件**: `frontend/src/app/login/page.tsx`
 ```tsx
 const [phone, setPhone] = useState('');
 // 验证手机号格式
 if (!/^\d{11}$/.test(phone)) {
    setError('请输入正确的11位手机号');
    return;
 }
 <input
    type="tel"
    value={phone}
    onChange={(e) => setPhone(e.target.value.replace(/\D/g, '').slice(0, 11))}
    maxLength={11}
    placeholder="请输入11位手机号"
 />
 ```
 ### 2. 注册页面 (`register/page.tsx`)
 同样使用手机号输入，增加 11 位数字验证。
 ### 3. Auth 工具函数 (`auth.ts`)
 **文件**: `frontend/src/lib/auth.ts`
 ```typescript
 export interface User {
    id: string;
    phone: string;  // 原 email
    username: string | null;
    role: string;
    is_active: boolean;
 }
 export async function login(phone: string, password: string): Promise<AuthResponse> { ... }
 export async function register(phone: string, password: string, username?: string): Promise<AuthResponse> { ... }
 export async function changePassword(oldPassword: string, newPassword: string): Promise<AuthResponse> { ... }
 ```
 ### 4. 首页账户设置下拉菜单 (`page.tsx`)
 **文件**: `frontend/src/app/page.tsx`
 将原来的"退出"按钮改为账户设置下拉菜单：
 ```tsx
 function AccountSettingsDropdown() {
  const [isOpen, setIsOpen] = useState(false);
  const [showPasswordModal, setShowPasswordModal] = useState(false);
  // ...
  return (
    <div className="relative">
      <button onClick={() => setIsOpen(!isOpen)}>
        ⚙️ 账户
      </button>
      {/* 下拉菜单 */}
      {isOpen && (
        <div className="absolute right-0 mt-2 w-40 bg-gray-800 ...">
          <button onClick={() => setShowPasswordModal(true)}>
            🔐 修改密码
          </button>
          <button onClick={handleLogout} className="text-red-300">
            🚪 退出登录
          </button>
        </div>
      )}
      {/* 修改密码弹窗 */}
      {showPasswordModal && (
        <div className="fixed inset-0 z-50 ...">
          <form onSubmit={handleChangePassword}>
            <input placeholder="当前密码" />
            <input placeholder="新密码" />
            <input placeholder="确认新密码" />
          </form>
        </div>
      )}
    </div>
  );
 }
 ```
 ### 5. 管理员页面 (`admin/page.tsx`)
 **文件**: `frontend/src/app/admin/page.tsx`
 ```tsx
 interface UserListItem {
    id: string;
    phone: string;  // 原 email
    // ...
 }
 // 显示手机号而非邮箱
 <div className="text-gray-400 text-sm">{user.phone}</div>
 ```
 ---
 ## 🗄️ 数据库迁移
 ### 迁移脚本
 **文件**: `backend/database/migrate_to_phone.sql`
 ```sql
 -- 删除旧表 (CASCADE 处理外键依赖)
 DROP TABLE IF EXISTS user_sessions CASCADE;
 DROP TABLE IF EXISTS social_accounts CASCADE;
 DROP TABLE IF EXISTS users CASCADE;
 -- 重新创建使用 phone 字段的表
 CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    phone TEXT UNIQUE NOT NULL,
    -- ...
 );
 -- 重新创建依赖表和索引
 CREATE TABLE user_sessions (...);
 CREATE TABLE social_accounts (...);
 CREATE INDEX idx_users_phone ON users(phone);
 ```
 ### 执行方式
 ```bash
 # 方式一：Docker 命令
 docker exec -i supabase-db psql -U postgres < backend/database/migrate_to_phone.sql
 # 方式二：Supabase Studio SQL Editor
 # 打开 https://supabase.hbyrkj.top -> SQL Editor -> 粘贴执行
 ```
 ---
 ## ✅ 部署步骤
 ```bash
 # 1. 执行数据库迁移
 docker exec -i supabase-db psql -U postgres < backend/database/migrate_to_phone.sql
 # 2. 重新构建前端
 cd frontend && npm run build
 # 3. 重启服务
 pm2 restart vigent2-backend vigent2-frontend
 ```
 ---
 ## 📁 今日修改文件清单
 | File | Change | Description |
 |------|----------|------|
 | `backend/database/schema.sql` | Modified | email → phone |
 | `backend/database/migrate_to_phone.sql` | Added | Database migration script |
 | `backend/app/api/auth.py` | Modified | Phone validation + change-password API |
 | `backend/app/api/admin.py` | Modified | UserListItem.email → phone |
 | `backend/app/core/config.py` | Modified | ADMIN_EMAIL → ADMIN_PHONE |
 | `backend/app/main.py` | Modified | Admin initialization uses phone |
 | `backend/.env` | Modified | ADMIN_PHONE=your_admin_phone |
 | `frontend/src/app/login/page.tsx` | Modified | Phone login + 11-digit validation |
 | `frontend/src/app/register/page.tsx` | Modified | Phone registration + 11-digit validation |
 | `frontend/src/lib/auth.ts` | Modified | phone parameter + changePassword function |
 | `frontend/src/app/page.tsx` | Modified | AccountSettingsDropdown component |
 | `frontend/src/app/admin/page.tsx` | Modified | User list shows the phone number |
 | `frontend/src/contexts/AuthContext.tsx` | Modified | Stores the full user object, including expires_at |
 ---
 ## 🆕 后续完善 (Day 15 下午)
 ### 账户有效期显示
 在账户下拉菜单中显示用户的有效期：
 | 显示情况 | 格式 |
 |----------|------|
 | 有设置 expires_at | `2026-03-15` |
 | NULL | `永久有效` |
 **相关修改**：
 - `backend/app/api/auth.py`: UserResponse 新增 `expires_at` 字段
 - `frontend/src/contexts/AuthContext.tsx`: 存储完整用户对象
 - `frontend/src/app/page.tsx`: 格式化并显示有效期
 ### 点击外部关闭下拉菜单
 使用 `useRef` + `useEffect` 监听全局点击事件，点击菜单外部自动关闭。
 ### 修改密码后强制重新登录
 密码修改成功后：
 1. 显示"密码修改成功，正在跳转登录页..."
 2. 1.5秒后调用登出 API
 3. 跳转到登录页面
 ---
 ## 🔗 相关文档
 - [task_complete.md](../task_complete.md) - 任务总览
 - [Day14.md](./Day14.md) - 模型升级 + AI 标题标签
 - [AUTH_DEPLOY.md](../AUTH_DEPLOY.md) - 认证系统部署指南
 ---
 ## 🤖 模型与功能增强 (Day 15 晚)
 ### 1. GLM-4.7-Flash 升级
 **文件**: `backend/app/services/glm_service.py`
 将文案洗稿模型从 `glm-4-flash` 升级为 `glm-4.7-flash`：
 ```python
 response = client.chat.completions.create(
    model="glm-4.7-flash",  # Upgrade from glm-4-flash
    messages=[...],
    # ...
 )
 ```
 **改进**:
 - 响应速度提升
 - 洗稿文案的流畅度和逻辑性增强
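 ### 2. Standalone Script Extraction Assistant
 A standalone script-extraction tool that pulls the spoken text from a video/audio file or from a URL.
 #### Backend (`backend/app/api/tools.py`)
 - **Multiple sources**: file upload (MP4/MP3/WAV) or URL download
 - **Smart download**:
  - `yt-dlp`: general-purpose downloader (Douyin/TikTok/Bilibili)
  - `Playwright`: fallback path (Bilibili Dashboard API, Douyin Cookie Bypass)
 - **Automatic URL cleanup**: a regular expression extracts the HTTP link from pasted share text
 - **Pipeline**: download -> FFmpeg converts to WAV (16 kHz) -> Whisper transcription -> GLM-4.7 rewriting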
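The URL cleanup step can be illustrated with a small sketch. The pattern and the `extract_url` helper below are assumptions for illustration, not the code in `tools.py`; the idea is to match only ASCII URL characters so that Chinese text glued to the link is not swallowed.

```python
import re
from typing import Optional

# Only characters that may legally appear in a URL; stops at whitespace and CJK text.
URL_PATTERN = re.compile(r"https?://[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]+")


def extract_url(share_text: str) -> Optional[str]:
    """Return the first http(s) link found in pasted share text, or None."""
    match = URL_PATTERN.search(share_text)
    if match is None:
        return None
    # Drop ASCII punctuation that often trails a pasted link.
    return match.group(0).rstrip(".,;!?)'")


print(extract_url("3.05 复制打开抖音 https://v.douyin.com/AbC123/ 看看"))
```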
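 #### Frontend (`frontend/src/components/ScriptExtractionModal.tsx`)
 - **Standalone modal**: opened from the top navigation bar
 - **Features**:
  - Paste a link / drag and drop a file
  - Live progress display (download -> transcription -> rewriting)
  - **One-click fill**: inserts the extracted result into the main input box
  - **Auto-detection**: distinguishes the platform and the link automatically
 - **Interaction polish**:
  - Clicking the backdrop does not close the modal, preventing accidental dismissal
  - Copy works in plain-HTTP environments (fallback to a textarea)
 ### 3. Preview for Uploaded Videos
 Adds a preview for uploaded videos in the material list (`frontend/src/app/page.tsx`):
 - Clicking a thumbnail opens a video playback modal
 - Shortcuts for download and for jumping to publishing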
 ---
 ## 📝 任务清单更新
 - [x] 认证系统迁移 (手机号)
 - [x] 账户管理 (密码修改/有效期)
 - [x] GLM-4.7 模型升级
 - [x] 独立文案提取助手 (B站/抖音支持)
 - [x] 视频预览功能
--- a/Docs/DevLogs/Day16.md
+++ b/Docs/DevLogs/Day16.md
@@ -0,0 +1,119 @@
 ---
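 ## 🔧 Qwen3-TTS Flash Attention Optimization (10:00)
 ### Background
 By default, the Qwen3-TTS 1.7B model loads slowly and uses a lot of VRAM during inference. Enabling Flash Attention 2 noticeably shortens model loading and improves inference efficiency.
 ### Implementation
 Install `flash-attn` in the `qwen-tts` Conda environment: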
 ```bash
 conda activate qwen-tts
 pip install -U flash-attn --no-build-isolation
 ```
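 ### Verification Results
 - **Load time**: reduced from ~60s to **8.9s** ⚡
 - **VRAM usage**: noticeably lower, which reduces the risk of OOM
 - **Code changes**: none; this is an environment-only optimization (flash-attn is detected automatically)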
 ---
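 ## 🛡️ Service Watchdog (10:30)
 ### Problem
 The long-running services (`vigent2-qwen-tts` and `vigent2-latentsync`) can hang because of VRAM fragmentation or long uptime (port open but unresponsive).
 ### Solution
 A Python watchdog script polls each service's `/health` endpoint every 30 seconds and restarts the service automatically after 3 consecutive failures.
 1. **Watchdog script**: `backend/scripts/watchdog.py`
 2. **Launch script**: `run_watchdog.sh` (managed by PM2)
 ### Core logic
 ```python
 # Restart once the consecutive heartbeat failures reach the threshold (3)
 if service["failures"] >= service["threshold"]:
     subprocess.run(["pm2", "restart", service["name"]])
 ```
 ### Deployment status
 - `vigent2-watchdog` is running and registered in the PM2 process list
 - Monitored services: `vigent2-qwen-tts` (8009), `vigent2-latentsync` (8007)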
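For readers who want to see the whole loop, here is a minimal sketch of the polling logic described above. It is not the contents of `watchdog.py`: the helper names, the 5-second probe timeout, and resetting the counter after a restart are all assumptions. The ports, the `/health` path, the 30-second interval, and the threshold of 3 come from this log.

```python
import subprocess
import time
import urllib.request
from typing import Callable, Dict, List

SERVICES: List[Dict] = [
    {"name": "vigent2-qwen-tts", "url": "http://localhost:8009/health", "failures": 0, "threshold": 3},
    {"name": "vigent2-latentsync", "url": "http://localhost:8007/health", "failures": 0, "threshold": 3},
]


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """True if GET <url> answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:  # refused connection, timeout, bad response: all count as a failure
        return False


def restart_with_pm2(name: str) -> None:
    subprocess.run(["pm2", "restart", name], check=False)


def check_once(
    services: List[Dict],
    probe: Callable[[str], bool] = is_healthy,
    restart: Callable[[str], None] = restart_with_pm2,
) -> List[str]:
    """Probe every service once; restart those that reached their failure threshold."""
    restarted = []
    for service in services:
        if probe(service["url"]):
            service["failures"] = 0  # any success resets the streak
            continue
        service["failures"] += 1
        if service["failures"] >= service["threshold"]:
            restart(service["name"])
            service["failures"] = 0  # assumption: start counting afresh after a restart
            restarted.append(service["name"])
    return restarted


def run_forever(services: List[Dict], interval: float = 30.0) -> None:
    while True:
        check_once(services)
        time.sleep(interval)
```

Passing `probe` and `restart` as parameters keeps the counting logic testable without a running service or PM2.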
 ---
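 ## ⚡ LatentSync Performance Check
 A code audit confirmed that LatentSync 1.6 already ships with these optimizations:
 - ✅ **Flash Attention**: uses `torch.nn.functional.scaled_dot_product_attention` natively
 - ✅ **DeepCache**: enabled (`cache_interval=3`), giving roughly a 2.5x speedup
 - ✅ **GPU concurrency**: the dual-GPU pipeline (GPU0 TTS | GPU1 LipSync) is confirmed working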
 ---
 ## 🎨 UI 交互体验优化 (15:30)
 ### 优化内容
 - 视频生成完成后，预览优先选中最新输出
 - 选择项持久化：素材 / 背景音乐 / 历史视频
 - 列表内滚动定位选中项，避免页面跳动
 - 刷新回顶部（首页 / 发布页）
 - 背景音乐试听即选中并自动开启，音量滑块实时影响试听
 ### 涉及文件
 - `frontend/src/app/page.tsx`
 - `frontend/src/app/publish/page.tsx`
 ---
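 ## 🎵 Font and Background Music Asset Library (15:50)
 ### Asset library
 - `backend/assets/fonts/` (full import of the SuperIPAgent fonts)
 - `backend/assets/bgm/` (background music tracks)
 - `backend/assets/styles/{subtitle.json,title.json}` (style presets)
 ### Service capabilities
 - `/api/assets/subtitle-styles`, `/api/assets/title-styles`, `/api/assets/bgm`
 - `/assets` is mounted as static files so the frontend can preview and audition assets
 ### Changes to the generation pipeline
 - Finish voice, lip-sync and subtitle alignment first, then mix in the BGM
 - Fixed a mixing failure caused by shell parsing of the FFmpeg command
 - Disabled amix normalization so the voice-over volume is not lowered
 ### Key change
 `backend/app/services/video_service.py`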
 ```python
 filter_complex = (
    "[0:a]volume=1.0[a0];"
    f"[1:a]volume={volume}[a1];"
    "[a0][a1]amix=inputs=2:duration=first:dropout_transition=2:normalize=0[aout]"
 )
 ```
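One way to avoid the shell-parsing problem mentioned above is to pass FFmpeg an argument list instead of a single command string. The sketch below assumes that approach. `build_bgm_mix_command`, the stream mapping, and the codec choices are illustrative and not taken from `video_service.py`; only the filter graph comes from the snippet above.

```python
from typing import List


def build_bgm_mix_command(video_path: str, bgm_path: str, output_path: str, volume: float) -> List[str]:
    """Build an argv list that mixes BGM under the existing voice track."""
    filter_complex = (
        "[0:a]volume=1.0[a0];"
        f"[1:a]volume={volume}[a1];"
        "[a0][a1]amix=inputs=2:duration=first:dropout_transition=2:normalize=0[aout]"
    )
    return [
        "ffmpeg", "-y",
        "-i", video_path,           # input 0: lip-synced video with voice
        "-i", bgm_path,             # input 1: background music
        "-filter_complex", filter_complex,
        "-map", "0:v",              # keep the original video stream
        "-map", "[aout]",           # use the mixed audio
        "-c:v", "copy",             # assumption: no video re-encode is needed
        "-c:a", "aac",
        output_path,
    ]


cmd = build_bgm_mix_command("lipsync.mp4", "bgm.mp3", "final.mp4", 0.2)
print(cmd)
```

Because the brackets and semicolons travel as one argv element, running it with `subprocess.run(cmd, check=True)` (no `shell=True`) leaves the filter graph untouched by the shell.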
 ---
 ## 🖼️ 标题/字幕样式预览 (16:10)
 ### 前端
 - 样式选择 + 预览面板
 - 字号可调（覆盖样式默认值）
 - 字体文件动态加载
 ### Remotion
 - 样式参数透传到 `Subtitles` / `Title`
 - 渲染前临时复制字体到渲染目录
 ---
 ## 📝 文档更新
 - [x] `Docs/QWEN3_TTS_DEPLOY.md`: 添加 Flash Attention 安装指南
 - [x] `Docs/DEPLOY_MANUAL.md`: 添加 Watchdog 部署说明
 - [x] `Docs/task_complete.md`: 更新进度至 100% (Day 16)
 - [x] `README.md`: 新增样式与背景音乐能力说明
 - [x] `Docs/BACKEND_README.md`: 资产接口与混音链路说明
 - [x] `Docs/FRONTEND_README.md`: 新增样式预览与BGM试听说明
--- a/Docs/FRONTEND_DEV.md
+++ b/Docs/FRONTEND_DEV.md
@@ -180,3 +180,52 @@ const formatDate = (timestamp: number) => {
 2. [ ] 所有 API 请求使用 `api.get/post/delete()` 而非原生 `fetch`
 3. [ ] 日期格式化使用固定格式函数，不用 `toLocaleString()`
 4. [ ] 添加 `'use client'` 指令（如需客户端交互）
 ---
 ## Voice Clone
 ### API endpoints
 | Endpoint | Method | Purpose |
 |------|------|------|
 | `/api/ref-audios` | POST | Upload a reference audio clip (multipart/form-data: file + ref_text) |
 | `/api/ref-audios` | GET | List the user's reference audio clips |
 | `/api/ref-audios/{id}` | DELETE | Delete a reference audio clip (the id must be passed through encodeURIComponent) |
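The id contains a slash (for example `user_id/timestamp_name.wav`), so it has to be percent-encoded before it goes into the path. For scripts or backend tests that call the endpoint from Python, a rough equivalent of `encodeURIComponent` is sketched below; `ref_audio_url` is a hypothetical helper.

```python
from urllib.parse import quote


def ref_audio_url(ref_audio_id: str) -> str:
    """Build the DELETE path with the id encoded as a single path segment."""
    # safe="!'()*" leaves the same punctuation unescaped that encodeURIComponent does
    return "/api/ref-audios/" + quote(ref_audio_id, safe="!'()*")


print(ref_audio_url("user_id/timestamp_name.wav"))
```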
 ### 视频生成 API 扩展
 ```typescript
 // EdgeTTS 模式 (默认)
 await api.post('/api/videos/generate', {
    material_path: '...',
    text: '口播文案',
    tts_mode: 'edgetts',
    voice: 'zh-CN-YunxiNeural',
 });
 // 声音克隆模式
 await api.post('/api/videos/generate', {
    material_path: '...',
    text: '口播文案',
    tts_mode: 'voiceclone',
    ref_audio_id: 'user_id/timestamp_name.wav',
    ref_text: '参考音频对应文字',
 });
 ```
 ### In-browser recording
 Audio is recorded with the `MediaRecorder` API as `audio/webm`; after upload, the backend converts it to WAV (16 kHz mono) automatically.
 ```typescript
 // 录音需要用户授权麦克风
 const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
 const mediaRecorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
 ```
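On the backend side, the WebM to WAV (16 kHz mono) conversion mentioned above could be done with an FFmpeg call along these lines. This is a sketch under the assumption that FFmpeg is used for the step; `build_wav_convert_command` is a hypothetical name.

```python
from typing import List


def build_wav_convert_command(src_path: str, dst_path: str, sample_rate: int = 16000, channels: int = 1) -> List[str]:
    """Build an argv list that resamples any input audio to WAV at the given rate and channel count."""
    return [
        "ffmpeg", "-y",
        "-i", src_path,
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-ac", str(channels),     # 1 = mono
        dst_path,
    ]


print(build_wav_convert_command("recording.webm", "recording.wav"))
```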
 ### UI 结构
 配音方式使用 Tab 切换：
 - **EdgeTTS 音色** - 预设音色 2x3 网格
 - **声音克隆** - 参考音频列表 + 在线录音 + 参考文字输入
--- a/Docs/FRONTEND_README.md
+++ b/Docs/FRONTEND_README.md
@@ -0,0 +1,103 @@
 # ViGent2 Frontend
 ViGent2 的前端界面，采用 Next.js 14 + TailwindCSS 构建。
 ## ✨ 核心功能
 ### 1. 视频生成 (`/`)
 - **素材管理**: 拖拽上传人物视频，实时预览。
 - **文案配音**: 集成 EdgeTTS，支持多音色选择 (云溪 / 晓晓)。
 - **AI 标题/标签**: 一键生成视频标题与标签 (Day 14)。
 - **标题/字幕样式**: 样式选择 + 预览 + 字号调节 (Day 16)。
 - **背景音乐**: 试听 + 音量控制 + 选择持久化 (Day 16)。
 - **交互优化**: 选择项持久化、列表内定位、刷新回顶部 (Day 16)。
 - **进度追踪**: 实时显示视频生成进度 (10% -> 100%)。
 - **结果预览**: 生成完成后直接播放下载。
 - **本地保存**: 文案/标题自动保存，刷新后恢复 (Day 14)。
 ### 2. 全自动发布 (`/publish`) [Day 7 新增]
 - **多平台管理**: 统一管理 B站、抖音、小红书账号状态。
 - **扫码登录**: 
  - 集成后端 Playwright 生成的 QR Code。
  - 实时检测扫码状态 (Wait/Success)。
  - Cookie 自动保存与状态同步。
 - **发布配置**: 设置视频标题、标签、简介。
 - **定时任务**: 支持 "立即发布" 或 "定时发布"。
 ### 3. 声音克隆 [Day 13 新增]
 - **TTS 模式选择**: EdgeTTS (预设音色) / 声音克隆 (自定义音色) 切换。
 - **参考音频管理**: 上传/列表/删除参考音频 (3-20秒 WAV)。
 - **一键克隆**: 选择参考音频后自动调用 Qwen3-TTS 服务。
 ### 4. 字幕与标题 [Day 13 新增]
 - **片头标题**: 可选输入，视频开头显示 3 秒淡入淡出标题。
 - **逐字高亮字幕**: 卡拉OK效果，默认开启，可关闭。
 - **自动对齐**: 基于 faster-whisper 生成字级别时间戳。
 - **样式预设**: 标题/字幕样式选择 + 预览 + 字号调节 (Day 16)。
 ### 5. 背景音乐 [Day 16 新增]
 - **试听预览**: 点击试听即选中，音量滑块实时生效。
 - **混音控制**: 仅影响 BGM，配音保持原音量。
 ### 6. 账户设置 [Day 15 新增]
 - **手机号登录**: 11位中国手机号验证登录。
 - **账户下拉菜单**: 显示有效期 + 修改密码 + 安全退出。
 - **修改密码**: 弹窗输入当前密码与新密码，修改后强制重新登录。
 ### 7. 文案提取助手 (`ScriptExtractionModal`) [Day 15 新增]
 - **多源提取**: 支持文件拖拽上传与 URL 粘贴 (B站/抖音/TikTok)。
 - **AI 洗稿**: 集成 GLM-4.7-Flash，自动改写为口播文案。
 - **一键填入**: 提取结果直接填充至视频生成输入框。
 - **智能交互**: 实时进度展示，防误触设计。
 ## 🛠️ 技术栈
 - **框架**: Next.js 14 (App Router)
 - **样式**: TailwindCSS
 - **图标**: Lucide React
 - **组件**: 自定义现代化组件 (Glassmorphism 风格)
 - **API**: Axios 实例 `@/lib/axios` (对接后端 FastAPI :8006)
 ## 🚀 开发指南
 ### 安装依赖
 ```bash
 npm install
 ```
 ### 启动开发服务器
 默认运行在 **3002** 端口 (通过 `package.json` 配置):
 ```bash
 npm run dev
 # 访问: http://localhost:3002
 ```
 ### 目录结构
 ```
 src/
 ├── app/
 │   ├── page.tsx           # 视频生成主页
 │   ├── publish/           # 发布管理页
 │   │   └── page.tsx
 │   └── layout.tsx         # 全局布局 (导航栏)
 ├── components/            # UI 组件
 │   ├── VideoUploader.tsx  # 视频上传
 │   ├── StatusBadge.tsx    # 状态徽章
 │   └── ...
 └── lib/                   # 工具函数
 ```
 ## 🔌 后端对接
 - **Base URL**: `http://localhost:8006`
 - **代理配置**: Next.js Rewrites (如需) 或直接 CORS。
 ## 🎨 设计规范
 - **主色调**: 深紫/黑色系 (Dark Mode)
 - **交互**: 悬停微动画 (Hover Effects)
 - **响应式**: 适配桌面端大屏操作
--- a/models/LatentSync/DEPLOY.md
+++ b/models/LatentSync/DEPLOY.md
--- a/Docs/Logs.md
+++ b/Docs/Logs.md
@@ -1,29 +0,0 @@
 rongye@r730-ubuntu:~/ProgramFiles/Supabase$ docker compose up -d
 [+] up 136/136
 ✔ Image timberio/vector:0.28.1-alpine      Pulled                                                                63.3s
 ✔ Image supabase/storage-api:v1.33.0       Pulled                                                                78.6s
 ✔ Image darthsim/imgproxy:v3.30.1          Pulled                                                                151.9s
 ✔ Image supabase/postgres-meta:v0.95.1     Pulled                                                                87.5s
 ✔ Image supabase/logflare:1.27.0           Pulled                                                                229.2s
 ✔ Image supabase/postgres:15.8.1.085       Pulled                                                                268.3s
 ✔ Image supabase/supavisor:2.7.4           Pulled                                                                101.6s
 ✔ Image supabase/realtime:v2.68.0          Pulled                                                                56.5s
 ✔ Image postgrest/postgrest:v14.1          Pulled                                                                201.8s
 ✔ Image supabase/edge-runtime:v1.69.28     Pulled                                                                254.0s
 ✔ Network supabase_default                 Created                                                               0.1s
 ✔ Volume supabase_db-config                Created                                                               0.1s
 ✔ Container supabase-vector                Healthy                                                               16.9s
 ✔ Container supabase-imgproxy              Created                                                               7.4s
 ✔ Container supabase-db                    Healthy                                                               20.6s
 ✔ Container supabase-analytics             Created                                                               0.4s
 ✔ Container supabase-edge-functions        Created                                                               1.8s
 ✔ Container supabase-auth                  Created                                                               1.7s
 ✔ Container supabase-studio                Created                                                               2.0s
 ✔ Container realtime-dev.supabase-realtime Created                                                               1.7s
 ✔ Container supabase-pooler                Created                                                               1.8s
 ✔ Container supabase-kong                  Created                                                               1.7s
 ✔ Container supabase-meta                  Created                                                               2.0s
 ✔ Container supabase-rest                  Created                                                               0.9s
 ✔ Container supabase-storage               Created                                                               1.4s
 Error response from daemon: failed to set up container networking: driver failed programming external connectivity on endpoint supabase-analytics (2fd60a510a1f16bf29f8f5140f14ef457a284c5b65a2567b7be250a4f9708f34): failed to bind host port 0.0.0.0:4000/tcp: address already in use
 [ble: exit 1]
--- a/Docs/QWEN3_TTS_DEPLOY.md
+++ b/Docs/QWEN3_TTS_DEPLOY.md
@@ -1,13 +1,13 @@
-# Qwen3-TTS 0.6B 部署指南
+# Qwen3-TTS 1.7B 部署指南
-> 本文档描述如何在 Ubuntu 服务器上部署 Qwen3-TTS 0.6B-Base 声音克隆模型。
+> 本文档描述如何在 Ubuntu 服务器上部署 Qwen3-TTS 1.7B-Base 声音克隆模型。
 ## 系统要求
 | 要求 | 规格 |
 |------|------|
 | GPU | NVIDIA RTX 3090 24GB (或更高) |
-| VRAM | ≥ 4GB (推理), ≥ 8GB (带 flash-attn) |
+| VRAM | ≥ 8GB (推理), ≥ 12GB (带 flash-attn) |
 | CUDA | 12.1+ |
 | Python | 3.10.x |
 | 系统 | Ubuntu 20.04+ |
@@ -18,7 +18,7 @@
 | GPU | 服务 | 模型 |
 |-----|------|------|
-| GPU0 | **Qwen3-TTS** | 0.6B-Base (声音克隆) |
+| GPU0 | **Qwen3-TTS** | 1.7B-Base (声音克隆，更高质量) |
 | GPU1 | LatentSync | 1.6 (唇形同步) |
 ---
@@ -55,9 +55,9 @@ pip install -e .
 conda install -y -c conda-forge sox
 ```
-### 可选: 安装 FlashAttention (推荐)
+### 可选: 安装 FlashAttention (强烈推荐)
-FlashAttention 可以显著提升推理速度并减少显存占用：
+FlashAttention 可以显著提升推理速度 (加载时间减少 85%) 并减少显存占用：
 ```bash
 pip install -U flash-attn --no-build-isolation
@@ -81,8 +81,8 @@ pip install modelscope
 # 下载 Tokenizer (651MB)
 modelscope download --model Qwen/Qwen3-TTS-Tokenizer-12Hz --local_dir ./checkpoints/Tokenizer
-# 下载 0.6B-Base 模型 (2.4GB)
+# 下载 1.7B-Base 模型 (6.8GB)
-modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-Base --local_dir ./checkpoints/0.6B-Base
+modelscope download --model Qwen/Qwen3-TTS-12Hz-1.7B-Base --local_dir ./checkpoints/1.7B-Base
 ```
 ### 方式 B: HuggingFace
@@ -91,7 +91,7 @@ modelscope download --model Qwen/Qwen3-TTS-12Hz-0.6B-Base --local_dir ./checkpoi
 pip install -U "huggingface_hub[cli]"
 huggingface-cli download Qwen/Qwen3-TTS-Tokenizer-12Hz --local-dir ./checkpoints/Tokenizer
-huggingface-cli download Qwen/Qwen3-TTS-12Hz-0.6B-Base --local-dir ./checkpoints/0.6B-Base
+huggingface-cli download Qwen/Qwen3-TTS-12Hz-1.7B-Base --local-dir ./checkpoints/1.7B-Base
 ```
 下载完成后，目录结构应如下：
@@ -102,7 +102,7 @@ checkpoints/
 │   ├── config.json
 │   ├── model.safetensors
 │   └── ...
-└── 0.6B-Base/       # ~2.4GB
+└── 1.7B-Base/       # ~6.8GB
    ├── config.json
    ├── model.safetensors
    └── ...
@@ -136,7 +136,7 @@ from qwen_tts import Qwen3TTSModel
 print("Loading Qwen3-TTS model on GPU:0...")
 model = Qwen3TTSModel.from_pretrained(
-    "./checkpoints/0.6B-Base",
+    "./checkpoints/1.7B-Base",
    device_map="cuda:0",
    dtype=torch.bfloat16,
 )
@@ -169,24 +169,106 @@ python test_inference.py
 ---
 ## 步骤 6: 安装 HTTP 服务依赖
 ```bash
 conda activate qwen-tts
 pip install fastapi uvicorn python-multipart
 ```
 ---
 ## 步骤 7: 启动服务 (PM2 管理)
 ### 手动测试
 ```bash
 conda activate qwen-tts
 cd /home/rongye/ProgramFiles/ViGent2/models/Qwen3-TTS
 python qwen_tts_server.py
 ```
 访问 http://localhost:8009/health 验证服务状态。
 ### PM2 常驻服务
 > ⚠️ **注意**：启动脚本 `run_qwen_tts.sh` 位于项目**根目录**，而非 models/Qwen3-TTS 目录。
 1. 使用启动脚本:
 ```bash
 cd /home/rongye/ProgramFiles/ViGent2
 pm2 start ./run_qwen_tts.sh --name vigent2-qwen-tts
 pm2 save
 ```
 2. 查看日志:
 ```bash
 pm2 logs vigent2-qwen-tts
 ```
 3. 重启服务:
 ```bash
 pm2 restart vigent2-qwen-tts
 ```
 ---
 ## 目录结构
 部署完成后，目录结构应如下：
 ```
-/home/rongye/ProgramFiles/ViGent2/models/Qwen3-TTS/
+/home/rongye/ProgramFiles/ViGent2/
-├── checkpoints/
+├── run_qwen_tts.sh              # PM2 启动脚本 (根目录)
-│   ├── Tokenizer/           # 语音编解码器
+└── models/Qwen3-TTS/
-│   └── 0.6B-Base/           # 声音克隆模型
+    ├── checkpoints/
-├── qwen_tts/                # 源码
+    │   ├── Tokenizer/           # 语音编解码器
-│   ├── inference/
+    │   └── 1.7B-Base/           # 声音克隆模型 (更高质量)
-│   ├── models/
+    ├── qwen_tts/                # 源码
-│   └── ...
+    │   ├── inference/
-├── examples/
+    │   ├── models/
-│   └── myvoice.wav          # 参考音频
+    │   └── ...
-├── pyproject.toml
+    ├── examples/
-├── requirements.txt
+    │   └── myvoice.wav          # 参考音频
-└── test_inference.py        # 测试脚本
+    ├── qwen_tts_server.py       # HTTP 推理服务 (端口 8009)
    ├── pyproject.toml
    ├── requirements.txt
    └── test_inference.py        # 测试脚本
 ```
 ---
 ## API 参考
 ### 健康检查
 ```
 GET http://localhost:8009/health
 ```
 响应:
 ```json
 {
  "service": "Qwen3-TTS Voice Clone",
  "model": "1.7B-Base",
  "ready": true,
  "gpu_id": 0
 }
 ```
 ### 声音克隆生成
 ```
 POST http://localhost:8009/generate
 Content-Type: multipart/form-data
 Fields:
  - ref_audio: 参考音频文件 (WAV)
  - text: 要合成的文本
  - ref_text: 参考音频的转写文字
  - language: 语言 (默认 Chinese)
 Response: audio/wav 文件
 ```
 ---
@@ -199,7 +281,7 @@ python test_inference.py
 |------|------|------|
 | 0.6B-Base | 3秒快速声音克隆 | 2.4GB |
 | 0.6B-CustomVoice | 9种预设音色 | 2.4GB |
-| 1.7B-Base | 声音克隆 (更高质量) | 6.8GB |
+| **1.7B-Base** | **声音克隆 (更高质量)** ✅ 当前使用 | 6.8GB |
 | 1.7B-VoiceDesign | 自然语言描述生成声音 | 6.8GB |
 ### 支持语言
@@ -224,17 +306,18 @@ conda install -y -c conda-forge sox
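 ### CUDA out of memory
-Qwen3-TTS 0.6B typically needs only 4-6GB of VRAM. If you hit OOM:
+Qwen3-TTS 1.7B typically needs 8-10GB of VRAM. If you hit OOM:
 1. Make sure nothing else is running on GPU0
 2. Enable flash-attn (it reduces VRAM usage, as noted in the installation section)
 3. Use a shorter reference audio clip (3-5 seconds)
 4. If VRAM is still insufficient, fall back to the 0.6B-Base model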
 ### 模型加载失败
 确保以下文件存在：
- `checkpoints/0.6B-Base/config.json`
+- `checkpoints/1.7B-Base/config.json`
- `checkpoints/0.6B-Base/model.safetensors`
+- `checkpoints/1.7B-Base/model.safetensors`
 ### 音频输出质量问题
@@ -244,6 +327,54 @@ Qwen3-TTS 0.6B 通常只需要 4-6GB VRAM。如果遇到 OOM：
 ---
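 ## ViGent2 Backend Integration
 ### Voice clone service (`voice_clone_service.py`)
 The backend calls the Qwen3-TTS service over HTTP:
 ```python
 import aiohttp
 QWEN_TTS_URL = "http://localhost:8009"
 async def generate_cloned_audio(ref_audio_path: str, text: str, ref_text: str, output_path: str) -> str:
     """Send the reference audio and texts to /generate and save the returned WAV."""
     async with aiohttp.ClientSession() as session:
         # The file must stay open until the upload has finished
         with open(ref_audio_path, "rb") as f:
             data = aiohttp.FormData()
             data.add_field("ref_audio", f, filename="ref.wav", content_type="audio/wav")
             data.add_field("text", text)
             data.add_field("ref_text", ref_text)  # transcript of the reference audio (see the /generate fields)
             async with session.post(f"{QWEN_TTS_URL}/generate", data=data) as resp:
                 resp.raise_for_status()  # do not write an error body to disk as audio
                 audio_data = await resp.read()
         with open(output_path, "wb") as out:
             out.write(audio_data)
     return output_path
 ```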
 ### 参考音频 Supabase Bucket
 ```sql
 -- 创建 ref-audios bucket
 INSERT INTO storage.buckets (id, name, public)
 VALUES ('ref-audios', 'ref-audios', true)
 ON CONFLICT (id) DO NOTHING;
 -- RLS 策略
 CREATE POLICY "Allow public uploads" ON storage.objects
 FOR INSERT TO anon WITH CHECK (bucket_id = 'ref-audios');
 ```
 ---
 ## 更新日志
 | 日期 | 版本 | 说明 |
 |------|------|------|
 | 2026-01-30 | 1.1.0 | 明确默认模型升级为 1.7B-Base，替换旧版 0.6B 路径 |
 ---
 ## 参考链接
 - [Qwen3-TTS GitHub](https://github.com/QwenLM/Qwen3-TTS)
--- a/Docs/SUBTITLE_DEPLOY.md
+++ b/Docs/SUBTITLE_DEPLOY.md
@@ -0,0 +1,282 @@
 # ViGent2 字幕与标题功能部署指南
 本文档介绍如何部署 ViGent2 的逐字高亮字幕和片头标题功能。
 ## 功能概述
 | 功能 | 说明 |
 |------|------|
 | **逐字高亮字幕** | 使用 faster-whisper 生成字级别时间戳，Remotion 渲染卡拉OK效果 |
 | **片头标题** | 视频开头显示标题，带淡入淡出动画，几秒后消失 |
 ## 技术架构
 ```
 原有流程:
  文本 → EdgeTTS → 音频 → LatentSync → FFmpeg合成 → 最终视频
 新流程:
  文本 → EdgeTTS → 音频 ─┬→ LatentSync → 唇形视频 ─┐
                        └→ faster-whisper → 字幕JSON ─┴→ Remotion合成 → 最终视频
 ```
 ## 系统要求
 | 组件 | 要求 |
 |------|------|
 | Node.js | 18+ |
 | Python | 3.10+ |
 | GPU 显存 | faster-whisper 需要约 3-4GB VRAM |
 | FFmpeg | 已安装 |
 ---
 ## 部署步骤
 ### Step 1: Install faster-whisper (Python)
 ```bash
 cd /home/rongye/ProgramFiles/ViGent2/backend
 source venv/bin/activate
 # Install faster-whisper (quoted so the shell does not treat ">=" as a redirect)
 pip install "faster-whisper>=1.0.0" -i https://pypi.tuna.tsinghua.edu.cn/simple
 ```
 > **Note**: on first run, faster-whisper automatically downloads the `large-v3` Whisper model (~3GB)
 ### 步骤 2: 安装 Remotion (Node.js)
 ```bash
 cd /home/rongye/ProgramFiles/ViGent2/remotion
 # 安装依赖
 npm install
 ```
 ### 步骤 3: 重启后端服务
 ```bash
 pm2 restart vigent2-backend
 ```
 ### 步骤 4: 验证安装
 ```bash
 # 检查 faster-whisper 是否安装成功
 cd /home/rongye/ProgramFiles/ViGent2/backend
 source venv/bin/activate
 python -c "from faster_whisper import WhisperModel; print('faster-whisper OK')"
 # 检查 Remotion 是否安装成功
 cd /home/rongye/ProgramFiles/ViGent2/remotion
 npx remotion --version
 ```
 ---
 ## 文件结构
 ### 后端新增文件
 | 文件 | 说明 |
 |------|------|
 | `backend/app/services/whisper_service.py` | 字幕对齐服务 (基于 faster-whisper) |
 | `backend/app/services/remotion_service.py` | Remotion 渲染服务 |
 ### Remotion 项目结构
 ```
 remotion/
 ├── package.json              # Node.js 依赖配置
 ├── tsconfig.json             # TypeScript 配置
 ├── render.ts                 # 服务端渲染脚本
 └── src/
    ├── index.ts              # Remotion 入口
    ├── Root.tsx              # 根组件
    ├── Video.tsx             # 主视频组件
    ├── components/
    │   ├── Title.tsx         # 片头标题组件
    │   ├── Subtitles.tsx     # 逐字高亮字幕组件
    │   └── VideoLayer.tsx    # 视频图层组件
    ├── utils/
    │   └── captions.ts       # 字幕数据处理工具
    └── fonts/                # 字体文件目录 (可选)
 ```
 ---
 ## API 参数
 视频生成 API (`POST /api/videos/generate`) 新增以下参数：
 | 参数 | 类型 | 默认值 | 说明 |
 |------|------|--------|------|
 | `title` | string | null | 视频标题（片头显示，可选） |
 | `enable_subtitles` | boolean | true | 是否启用逐字高亮字幕 |
 ### 请求示例
 ```json
 {
  "material_path": "https://...",
  "text": "大家好，欢迎来到我的频道",
  "tts_mode": "edgetts",
  "voice": "zh-CN-YunxiNeural",
  "title": "今日分享",
  "enable_subtitles": true
 }
 ```
 ---
 ## 视频生成流程
 新的视频生成流程进度分配：
 | 阶段 | 进度 | 说明 |
 |------|------|------|
 | 下载素材 | 0% → 5% | 从 Supabase 下载输入视频 |
 | TTS 语音生成 | 5% → 25% | EdgeTTS 或 Qwen3-TTS 生成音频 |
 | 唇形同步 | 25% → 80% | LatentSync 推理 |
 | 字幕对齐 | 80% → 85% | faster-whisper 生成字级别时间戳 |
 | Remotion 渲染 | 85% → 95% | 合成字幕和标题 |
 | 上传结果 | 95% → 100% | 上传到 Supabase Storage |
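The allocation in the table can be read as a mapping from a stage's local progress to the overall percentage. A minimal sketch follows, with hypothetical stage keys and function name; the percentage ranges are the ones in the table.

```python
# (start %, end %) of the overall progress bar for each stage, taken from the table above
STAGES = {
    "download": (0, 5),
    "tts": (5, 25),
    "lipsync": (25, 80),
    "subtitles": (80, 85),
    "render": (85, 95),
    "upload": (95, 100),
}


def overall_progress(stage: str, fraction: float) -> float:
    """Map a stage-local fraction in [0, 1] to the overall percentage."""
    start, end = STAGES[stage]
    fraction = min(max(fraction, 0.0), 1.0)  # clamp out-of-range input
    return start + (end - start) * fraction


print(overall_progress("lipsync", 0.5))
```

Lip-sync owns 55 of the 100 points, which reflects that LatentSync inference dominates the run time.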
 ---
 ## 降级处理
 系统包含自动降级机制，确保基本功能不受影响：
 | 场景 | 处理方式 |
 |------|----------|
 | 字幕对齐失败 | 跳过字幕，继续生成视频 |
 | Remotion 未安装 | 使用 FFmpeg 直接合成 |
 | Remotion 渲染失败 | 回退到 FFmpeg 合成 |
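The last two rows of the table describe a simple fallback chain. The sketch below shows that control flow only: `compose_with_fallback` and the two callables are placeholders for whatever `remotion_service.py` and the FFmpeg path actually expose.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("compose")


def compose_with_fallback(
    render_remotion: Optional[Callable[[], str]],
    compose_ffmpeg: Callable[[], str],
) -> str:
    """Try Remotion first; use FFmpeg when Remotion is missing or fails."""
    if render_remotion is not None:  # None models "Remotion not installed"
        try:
            return render_remotion()
        except Exception as exc:
            logger.warning("Remotion render failed, falling back to FFmpeg: %s", exc)
    return compose_ffmpeg()
```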
 ---
 ## 配置说明
 ### 字幕服务配置
 字幕服务位于 `backend/app/services/whisper_service.py`，默认配置：
 | 参数 | 默认值 | 说明 |
 |------|--------|------|
 | `model_size` | large-v3 | Whisper 模型大小 |
 | `device` | cuda | 运行设备 |
 | `compute_type` | float16 | 计算精度 |
 如需修改，可编辑 `whisper_service.py` 中的 `WhisperService` 初始化参数。
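For orientation, this is roughly how those three parameters map onto faster-whisper itself. `WhisperModel` and `transcribe(..., word_timestamps=True)` are the library's own API; the function names and the caption dictionary shape are assumptions and need not match `whisper_service.py`.

```python
from typing import Any, Dict, Iterable, List


def words_to_captions(segments: Iterable[Any]) -> List[Dict[str, Any]]:
    """Flatten faster-whisper segments into per-word caption entries."""
    captions = []
    for segment in segments:
        for word in (segment.words or []):
            captions.append({
                "text": word.word.strip(),
                "start": round(word.start, 3),  # seconds
                "end": round(word.end, 3),
            })
    return captions


def transcribe_words(audio_path: str, model_size: str = "large-v3",
                     device: str = "cuda", compute_type: str = "float16") -> List[Dict[str, Any]]:
    # Imported lazily: needs the faster-whisper package and downloaded model weights
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device, compute_type=compute_type)
    segments, _info = model.transcribe(audio_path, word_timestamps=True)
    return words_to_captions(segments)
```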
 ### Remotion 配置
 Remotion 渲染参数在 `backend/app/services/remotion_service.py` 中配置：
 | 参数 | 默认值 | 说明 |
 |------|--------|------|
 | `fps` | 25 | 输出帧率 |
 | `title_duration` | 3.0 | 标题显示时长（秒） |
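With these defaults the title occupies 3.0 × 25 = 75 frames. The fade-in/fade-out can be pictured as a per-frame opacity curve. The sketch below assumes a linear fade of 0.5 seconds at each end, which is an illustrative value and not one documented here.

```python
FPS = 25
TITLE_DURATION = 3.0   # seconds, from the table above
FADE_SECONDS = 0.5     # assumption: length of the fade at each end


def title_opacity(frame: int, fps: int = FPS, duration: float = TITLE_DURATION,
                  fade: float = FADE_SECONDS) -> float:
    """Opacity of the title at a given frame: ramps up, holds at 1.0, ramps down."""
    t = frame / fps
    if t < 0 or t >= duration:
        return 0.0
    return min(1.0, t / fade, (duration - t) / fade)


print(int(TITLE_DURATION * FPS))
```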
 ---
 ## 故障排除
 ### faster-whisper 相关
 **Problem**: `ModuleNotFoundError: No module named 'faster_whisper'`
 ```bash
 cd /home/rongye/ProgramFiles/ViGent2/backend
 source venv/bin/activate
 pip install "faster-whisper>=1.0.0" -i https://pypi.tuna.tsinghua.edu.cn/simple
 ```
 **问题**: GPU 显存不足
 修改 `whisper_service.py`，使用较小的模型：
 ```python
 WhisperService(model_size="medium", compute_type="int8")
 ```
 ### Remotion 相关
 **问题**: `node_modules not found`
 ```bash
 cd /home/rongye/ProgramFiles/ViGent2/remotion
 npm install
 ```
 **问题**: Remotion 渲染失败 - `fs` 模块错误
 确保 `remotion/src/utils/captions.ts` 中没有使用 Node.js 的 `fs` 模块。Remotion 在浏览器环境打包，不支持 `fs`。
 **问题**: Remotion 渲染失败 - 视频文件读取错误 (`file://` 协议)
 确保 `render.ts` 使用 `publicDir` 选项指向视频所在目录，`VideoLayer.tsx` 使用 `staticFile()` 加载视频：
 ```typescript
 // render.ts
 const publicDir = path.dirname(path.resolve(options.videoPath));
 const bundleLocation = await bundle({
  entryPoint: path.resolve(__dirname, './src/index.ts'),
  publicDir,  // 关键配置
 });
 // VideoLayer.tsx
 const videoUrl = staticFile(videoSrc);  // 使用 staticFile
 ```
 **问题**: Remotion 渲染失败
 查看后端日志：
 ```bash
 pm2 logs vigent2-backend
 ```
 ### 查看服务健康状态
 ```bash
 # 字幕服务健康检查
 cd /home/rongye/ProgramFiles/ViGent2/backend
 source venv/bin/activate
 python -c "from app.services.whisper_service import whisper_service; import asyncio; print(asyncio.run(whisper_service.check_health()))"
 # Remotion 健康检查
 python -c "from app.services.remotion_service import remotion_service; import asyncio; print(asyncio.run(remotion_service.check_health()))"
 ```
 ---
 ## 可选优化
 ### 添加中文字体
 为获得更好的字幕渲染效果，可添加中文字体：
 ```bash
 # 下载 Noto Sans SC 字体
 cd /home/rongye/ProgramFiles/ViGent2/remotion/src/fonts
 wget https://github.com/googlefonts/noto-cjk/raw/main/Sans/OTF/SimplifiedChinese/NotoSansSC-Regular.otf -O NotoSansSC.otf
 ```
 ### 使用 GPU 0
 faster-whisper 默认使用 GPU 0，与 LatentSync (GPU 1) 分开，避免显存冲突。如需指定 GPU：
 ```python
 # 在 whisper_service.py 中修改
 WhisperService(device="cuda:0")  # 或 "cuda:1"
 ```
 ---
 ## 更新日志
 | 日期 | 版本 | 说明 |
 |------|------|------|
 | 2026-01-29 | 1.0.0 | 初始版本，使用 faster-whisper + Remotion 实现逐字高亮字幕和片头标题 |
 | 2026-01-30 | 1.0.1 | 字幕高亮样式与标题动画优化，视觉表现更清晰 |
--- a/Docs/implementation_plan.md
+++ b/Docs/implementation_plan.md
@@ -6,6 +6,7 @@
 - 上传静态人物视频 → 生成口播视频（唇形同步）
 - TTS 配音或声音克隆
 - 字幕自动生成与渲染
 - AI 自动生成标题与标签
 - 一键发布到多个社交平台
 ---
@@ -47,7 +48,7 @@
 | **任务队列** | Celery + Redis | RQ / Dramatiq |
 | **唇形同步** | **LatentSync 1.6** | MuseTalk / Wav2Lip |
 | **TTS 配音** | EdgeTTS | CosyVoice |
-| **声音克隆** | GPT-SoVITS (可选) | - |
+| **声音克隆** | **Qwen3-TTS 1.7B** ✅ | GPT-SoVITS |
 | **视频处理** | FFmpeg | MoviePy |
 | **自动发布** | social-auto-upload | 自行实现 |
 | **数据库** | SQLite → PostgreSQL | MySQL |
@@ -219,6 +220,7 @@ cp -r SuperIPAgent/social-auto-upload backend/social_upload
 | 功能 | 实现方式 |
 |------|----------|
 | **声音克隆** | 集成 GPT-SoVITS，用自己的声音 |
 | **AI 标题/标签生成** | 调用大模型 API 自动生成标题与标签 ✅ |
 | **批量生成** | 上传 Excel/CSV，批量生成视频 |
 | **字幕编辑器** | 可视化调整字幕样式、位置 |
 | **Docker 部署** | 一键部署到云服务器 ✅ |
@@ -323,25 +325,42 @@ cp -r SuperIPAgent/social-auto-upload backend/social_upload
 - [x] 端口冲突解决 (3003/8008/8444)
 - [x] Basic Auth 管理后台保护
 ### 阶段十七：声音克隆功能集成 (Day 13) ✅
 > **目标**：实现用户自定义声音克隆能力
 - [x] Qwen3-TTS HTTP 服务 (独立 FastAPI，端口 8009)
 - [x] 声音克隆服务封装 (voice_clone_service.py)
 - [x] 参考音频管理 API (上传/列表/删除)
 - [x] 前端 TTS 模式选择 UI
 - [x] Supabase ref-audios Bucket 配置
 - [x] 端到端测试验证
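 ### Phase 18: Phone-number login migration (Day 15) ✅
 > **Goal**: migrate the authentication system from email to phone number
 - [x] Database schema migration (email → phone)
 - [x] Backend API adaptation (auth.py/admin.py)
 - [x] 11-digit phone number validation (regex)
 - [x] Change-password feature (/api/auth/change-password)
 - [x] Account settings dropdown (change password + expiry display + logout)
 - [x] Updated frontend login/registration pages
 - [x] Database migration script (migrate_to_phone.sql)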
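The 11-digit check could look like the sketch below. The exact pattern used in `auth.py` is not shown in this plan, so the rule here (leading `1`, second digit 3 to 9, ASCII digits only) is an assumption based on common mainland China mobile formats.

```python
import re

# Assumption: 11 ASCII digits, first digit 1, second digit 3-9
PHONE_PATTERN = re.compile(r"1[3-9][0-9]{9}")


def is_valid_phone(phone: str) -> bool:
    """True if the whole string is a plausible 11-digit mobile number."""
    return PHONE_PATTERN.fullmatch(phone) is not None


print(is_valid_phone("13800138000"))
```

Using `fullmatch` avoids the classic mistake of accepting a valid number embedded in a longer string.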
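 ### Phase 19: Deep performance optimization and service supervision (Day 16) ✅
 > **Goal**: improve system responsiveness and service stability
 - [x] Flash Attention 2 integration (Qwen3-TTS model load time ~60s → 8.9s)
 - [x] LatentSync performance tuning (OMP thread limit + native Flash Attn)
 - [x] Watchdog service supervision (automatic restart of hung services)
 - [x] Documentation update (deployment manual and operations guide)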
 ---
 ## 项目目录结构 (最终)
 ```
 TalkingHeadAgent/
 ├── frontend/                # Next.js 前端
 │   ├── app/
 │   ├── components/
 │   └── package.json
 ├── backend/                 # FastAPI 后端
 │   ├── app/
 │   ├── MuseTalk/            # 唇形同步模型
 │   ├── social_upload/       # 社交发布模块
 │   └── requirements.txt
 ├── docker-compose.yml       # 一键部署
 └── README.md
 ```
 ---
 ## 开发时间估算
--- a/Docs/task_complete.md
+++ b/Docs/task_complete.md
@@ -1,360 +1,95 @@
-# ViGent 数字人口播系统 - 开发任务清单
+# ViGent2 开发任务清单 (Task Log)
-**项目**：ViGent2 数字人口播视频生成系统
+**项目**: ViGent2 数字人口播视频生成系统  
-**服务器**：Dell R730 (2× RTX 3090 24GB)
+**进度**: 100% (Day 16 - 深度优化完成)  
-**更新时间**：2026-01-28
+**更新时间**: 2026-02-03
 **整体进度**：100%（Day 12 iOS 兼容、移动端优化、Qwen3-TTS 部署）
 ## 📖 快速导航
 | 章节 | 说明 |
 |------|------|
 | [已完成任务](#-已完成任务) | Day 1-12 完成的功能 |
 | [后续规划](#️-后续规划) | 待办项目 |
 | [进度统计](#-进度统计) | 各模块完成度 |
 | [里程碑](#-里程碑) | 关键节点 |
 | [时间线](#-时间线) | 开发历程 |
 **相关文档**：
 - [Day 日志](file:///d:/CodingProjects/Antigravity/ViGent2/Docs/DevLogs/) (Day1-Day12)
 - [部署指南](file:///d:/CodingProjects/Antigravity/ViGent2/Docs/DEPLOY_MANUAL.md)
 - [Qwen3-TTS 部署](file:///d:/CodingProjects/Antigravity/ViGent2/Docs/QWEN3_TTS_DEPLOY.md)
 ---
-## ✅ 已完成任务
+## 📅 对话历史与开发日志
-### 阶段一：核心功能验证
+> 这里记录了每一天的核心开发内容与 milestone。
 - [x] EdgeTTS 配音集成
 - [x] FFmpeg 视频合成
 - [x] MuseTalk 唇形同步 (代码集成)
 - [x] 端到端流程验证
-### 阶段二：后端 API 开发
+### Day 16: 深度性能优化 (Current) 🚀
- [x] FastAPI 项目搭建
+- [x] **Qwen3-TTS acceleration**: integrated Flash Attention 2; model load time dropped to 8.9s.
- [x] 视频生成 API
+- [x] **服务守护**: 开发 `Watchdog` 看门狗机制，自动监控并重启僵死服务。
- [x] 素材管理 API
+- [x] **LatentSync 性能确认**: 验证 DeepCache + 原生 Flash Attn 生效。
- [x] 文件存储管理
+- [x] **文档重构**: 全面更新 README、部署手册及后端文档。
 - [x] **UI 交互优化**: 选择项持久化、列表内定位、刷新回顶部。
 - [x] **样式与预览**: 标题/字幕样式选择 + 预览 + 字号调节。
 - [x] **背景音乐**: 试听 + 音量控制 + 混音稳定性修复。
 - [x] **资产库接入**: 字体/BGM 资源库 + `/api/assets` 资源接口。
-### 阶段三：前端 Web UI
+### Day 15: 手机号认证迁移
- [x] Next.js 项目初始化
+- [x] **认证系统升级**: 从邮箱迁移至 11 位手机号注册/登录。
- [x] 视频生成页面
+- [x] **账户管理**: 新增修改密码、有效期显示、安全退出功能。
- [x] 发布管理页面
+- [x] **AI 文案助手**: 升级 GLM-4.7-Flash，支持 B站/抖音链接提取与洗稿。
 - [x] 任务状态展示
-### 阶段四：社交媒体发布
+### Day 14: AI 增强与体验优化
- [x] Playwright 自动化框架
+- [x] **AI titles/tags**: integrated the GLM-4 API to generate video metadata automatically.
- [x] Cookie 管理功能
+- [x] **字幕升级**: Remotion 逐字高亮字幕 (卡拉OK效果) 及动画片头。
- [x] 多平台发布 UI
+- [x] **模型升级**: Qwen3-TTS 升级至 1.7B-Base 版本。
 - [x] 定时发布功能 (Day 7)
 - [x] QR码自动登录 (Day 7)
-### 阶段五：部署与文档
+### Day 13: 声音克隆集成
- [x] 手动部署指南 (DEPLOY_MANUAL.md)
+- [x] **声音克隆微服务**: 封装 Qwen3-TTS 为独立 API (8009端口)。
- [x] 一键部署脚本 (deploy.sh)
+- [x] **参考音频管理**: Supabase 存储桶配置与管理接口。
- [x] 环境配置模板 (.env.example)
+- [x] **多模态 TTS**: 前端支持 EdgeTTS / Clone Voice 切换。
 - [x] 项目文档 (README.md)
 - [x] 端口配置 (8006/3002)
-### 阶段六：MuseTalk 服务器部署 (Day 2-3)
+### Day 12: 移动端适配
- [x] conda 环境配置 (musetalk)
+- [x] **iOS 兼容**: 修复 Safari 安全区域、状态栏颜色、Cookie 拦截问题。
- [x] 模型权重下载 (~7GB)
+- [x] **响应式 UI**: 移动端 Header 与发布页重构。
 - [x] subprocess 调用方式实现
 - [x] 健康检查功能
 - [x] 实际推理调用验证 (Day 3 修复)
-### 阶段七：MuseTalk 完整修复 (Day 4)
+### Day 11: 上传架构重构
- [x] 权重检测路径修复 (软链接)
+- [x] **直传优化**: 前端直传 Supabase Storage，解决 Nginx 30s 超时问题。
- [x] 音视频长度不匹配修复 (audio_processor.py)
+- [x] **数据隔离**: 用户素材/视频按 UserID 物理隔离。
 - [x] 推理脚本错误日志增强 (inference.py)
 - [x] 视频合成 MP4 生成验证
 - [x] 端到端流程完整测试
-### 阶段八：前端功能增强 (Day 5)
+### Day 10: HTTPS 与安全
- [x] Web 视频上传功能
+- [x] **HTTPS 部署**: 配置 SSL 证书与 Nginx 反向代理。
- [x] 上传进度显示
+- [x] **安全加固**: Supabase Studio 增加 Basic Auth 保护。
 - [x] 自动刷新素材列表
-### 阶段九：唇形同步模型升级 (Day 6)
+### Day 9: 认证系统与发布闭环
- [x] MuseTalk → LatentSync 1.6 迁移
+- [x] **用户系统**: 基于 Supabase Auth 实现 JWT 认证。
- [x] 后端代码适配 (config.py, lipsync_service.py)
+- [x] **发布闭环**: 验证 B站/抖音/小红书 自动发布流程。
- [x] Conda 环境配置 (latentsync)
+- [x] **服务自愈**: 配置 PM2 进程守护。
 - [x] 模型权重部署指南
 - [x] 服务器端到端验证
-### 阶段十：性能优化 (Day 6)
+### Day 1-8: 核心功能构建
- [x] 视频预压缩优化 (高分辨率自动压缩到720p)
+- [x] **Day 8**: 历史记录持久化与文件管理。
- [x] 进度更新细化 (5% → 10% → 25% → ... → 100%)
+- [x] **Day 7**: 社交媒体自动登录与多平台发布。
- [x] LipSync 服务单例缓存
+- [x] **Day 6**: **LatentSync 1.6** 升级与服务器部署。
- [x] 健康检查缓存 (5分钟)
+- [x] **Day 5**: 前端视频上传与进度反馈。
- [x] 异步子进程修复 (subprocess.run → asyncio)
+- [x] **Day 4**: MuseTalk (旧版) 口型同步修复。
- [x] 预加载模型服务 (常驻 Server + FastAPI)
+- [x] **Day 3**: 服务器环境配置与模型权重下载。
- [x] 批量队列处理 (GPU 并发控制)
+- [x] **Day 1-2**: 项目基础框架 (FastAPI + Next.js) 搭建。
 ### 阶段十一：社交媒体发布完善 (Day 7)
 - [x] QR码自动登录 (Playwright headless)
 - [x] 多平台上传器架构 (B站/抖音/小红书)
 - [x] B站发布 (biliup官方库)
 - [x] 抖音/小红书发布 (Playwright)
 - [x] 定时发布功能
 - [x] 前端发布UI优化
 - [x] Cookie自动管理
 - [x] UI一致性修复 (导航栏对齐、滚动条隐藏)
 - [x] QR登录超时修复 (Stealth模式、多选择器fallback)
 - [x] 文档规则优化 (智能修改标准、工具使用规范)
 ### 阶段十二：用户体验优化 (Day 8)
 - [x] 文件名保留 (时间戳前缀 + 原始名称)
 - [x] 视频持久化 (从文件系统读取历史)
 - [x] 历史视频列表组件
 - [x] 素材/视频删除功能
 - [x] 登出功能 (Logout API + 前端按钮)
 - [x] 前端 SWR 轮询优化
 - [x] QR 登录状态检测修复
 ### 阶段十三：发布模块优化 (Day 9)
 - [x] B站/抖音发布验证通过
 - [x] 资源清理保障 (try-finally)
 - [x] 超时保护 (消除无限循环)
 - [x] 小红书 headless 模式修复
 - [x] API 输入验证
 - [x] 完整类型提示
 - [x] 扫码登录等待界面 (加载动画)
 - [x] 抖音/B站登录策略优化 (Text优先)
 - [x] 发布成功审核提示
 ### 阶段十四：用户认证系统 (Day 9)
 - [x] Supabase 数据库表设计与部署
 - [x] JWT 认证 (HttpOnly Cookie)
 - [x] 用户注册/登录/登出 API
 - [x] 管理员权限控制 (is_active)
 - [x] 单设备登录限制 (Session Token)
 - [x] 防止 Supabase 暂停 (GitHub Actions/Crontab)
 - [x] 认证部署文档 (AUTH_DEPLOY.md)
 ### 阶段十五：部署稳定性优化 (Day 9)
 - [x] 后端依赖修复 (bcrypt/email-validator)
 - [x] 前端生产环境构建修复 (npm run build)
 - [x] LatentSync 性能卡顿修复 (OMP_NUM_THREADS限制)
 - [x] 部署服务自愈 (PM2 配置优化)
 - [x] 部署手册全量更新 (DEPLOY_MANUAL.md)
 ### 阶段十六：HTTPS 部署与细节完善 (Day 10)
 - [x] 隧道访问修复 (StaticFiles 挂载 + Rewrite)
 - [x] 平台账号列表 500 错误修复 (paths.py)
 - [x] Nginx HTTPS 配置 (反向代理 + SSL)
 - [x] 浏览器标题修改 (ViGent)
 - [x] 代码自适应 HTTPS 验证
 - [x] **Supabase 自托管部署** (Docker, 3003/8008端口)
 - [x] **安全加固** (Basic Auth 保护后台)
 - [x] **端口冲突解决** (迁移 Analytics/Kong)
 ### 阶段十七：上传架构重构 (Day 11)
 - [x] **直传改造** (前端直接上传 Supabase，绕过后端代理)
 - [x] **后端适配** (Signed URL 签名生成)
 - [x] **RLS 策略部署** (SQL 脚本自动化权限配置)
 - [x] **超时问题根治** (彻底解决 Nginx/FRP 30s 限制)
 - [x] **前端依赖更新** (@supabase/supabase-js 集成)
 ### 阶段十八：用户隔离与存储优化 (Day 11)
 - [x] **用户数据隔离** (素材/视频/Cookie 按用户ID目录隔离)
 - [x] **Storage URL 修复** (SUPABASE_PUBLIC_URL 配置，修复 localhost 问题)
 - [x] **发布服务优化** (直接读取本地 Supabase Storage 文件，跳过 HTTP 下载)
 - [x] **Supabase Studio 配置** (公网访问配置)
 ### 阶段十九：iOS 兼容与移动端 UI 优化 (Day 12)
 - [x] **Axios 全局拦截器** (401/403 自动跳转登录，防重复跳转)
 - [x] **iOS Safari 安全区域修复** (viewport-fit: cover, themeColor, 渐变背景统一)
 - [x] **移动端 Header 优化** (按钮紧凑布局，响应式间距)
 - [x] **发布页面 UI 重构** (立即发布/定时发布按钮分离，防误触设计)
 - [x] **Qwen3-TTS 0.6B 部署** (声音克隆模型，GPU0，3秒参考音频快速克隆)
 ---
-## 🛤️ 后续规划
+## 🛤️ 后续规划 (Roadmap)
 ### 🔴 优先待办
- [ ] **Qwen3-TTS 集成到 ViGent2** - 前端 UI + 后端服务集成
+- [ ] **批量生成架构**: 支持 Excel 导入，批量生产视频。
- [ ] 批量视频生成架构设计
+- [ ] **定时任务后台化**: 迁移前端触发的定时发布到后端 APScheduler。
 ### 🟠 功能完善
 - [x] 定时发布功能 ✅ Day 7 完成
 - [ ] **后端定时发布** - 替代平台端定时，使用 APScheduler 实现任务调度
 - [ ] 批量视频生成
 - [ ] 字幕样式编辑器
 ### 🔵 长期探索
- [ ] Docker 容器化
+- [ ] **容器化交付**: 提供完整的 Docker Compose 一键部署包。
- [ ] Celery 分布式任务队列
+- [ ] **分布式队列**: 引入 Celery + Redis 处理超高并发任务。
 ---
-## 📊 Progress Statistics
+## 📊 Module Completion
 ### Overall Progress
 ```
 ████████████████████ 100%
 ```
 ### Progress by Module
 | Module | Progress | Status |
 |------|------|------|
-| Backend API | 100% | ✅ Done |
+| **Core API** | 100% | ✅ Stable |
-| Frontend UI | 100% | ✅ Done |
+| **Web UI** | 100% | ✅ Stable (mobile-adapted) |
-| TTS voiceover | 100% | ✅ Done |
+| **Lip sync** | 100% | ✅ LatentSync 1.6 |
-| Video compositing | 100% | ✅ Done |
+| **TTS voiceover** | 100% | ✅ EdgeTTS + Qwen3 |
-| Lip sync | 100% | ✅ LatentSync 1.6 upgrade complete |
+| **Automated publishing** | 100% | ✅ Bilibili/Douyin/Xiaohongshu |
-| Social publishing | 100% | ✅ Verified on Day 9 |
+| **User authentication** | 100% | ✅ Phone number + JWT |
-| User authentication | 100% | ✅ Day 9 Supabase+JWT |
+| **Deployment and operations** | 100% | ✅ PM2 + Watchdog |
 | Server deployment | 100% | ✅ Day 9 stability optimization complete |
 ---
-## 🎯 里程碑
+## 📎 相关文档
 ### Milestone 1: 项目框架搭建 ✅
 **完成时间**: Day 1  
 **成果**: 
 - FastAPI 后端 + Next.js 前端
 - EdgeTTS + FFmpeg 集成
 - 视频生成端到端验证
 ### Milestone 2: 服务器部署 ✅
 **完成时间**: Day 3  
 **成果**: 
 - PyTorch 2.0.1 + MMLab 环境修复
 - 模型目录重组与权重补全
 - MuseTalk 推理成功运行
 ### Milestone 3: 口型同步完整修复 ✅
 **完成时间**: Day 4  
 **成果**: 
 - 权重检测路径修复 (软链接)
 - 音视频长度不匹配修复
 - 视频合成 MP4 验证通过 (28MB → 3.8MB)
 ### Milestone 4: LatentSync 1.6 升级 ✅
 **完成时间**: Day 6  
 **成果**: 
 - MuseTalk → LatentSync 1.6 迁移
 - 512×512 高分辨率唇形同步
 - Latent Diffusion 架构升级
 - 性能优化 (视频预压缩、进度更新)
 ### Milestone 5: 用户认证系统 ✅
 **完成时间**: Day 9
 **成果**:
 - Supabase 云数据库集成
 - 安全的 JWT + HttpOnly Cookie 认证
 - 管理员后台与用户隔离
 - 完善的部署与保活方案
 ### Milestone 6: 生产环境部署稳定化 ✅
 **完成时间**: Day 9
 **成果**:
 - 修复了后端 (bcrypt) 和前端 (build) 的启动崩溃问题
 - 解决了 LatentSync 占用全量 CPU 导致服务器卡顿的严重问题
 - 完善了部署手册，记录了关键的 Troubleshooting 步骤
 - 实现了服务 Long-term 稳定运行 (Reset PM2 counter)
 ---
 ## 📅 时间线
 Day 1: 项目初始化 + 核心功能   ✅ 完成
       - 后端 API 框架
       - 前端 UI
       - TTS + 视频合成
       - 社交发布框架
       - 部署文档
 Day 2: 服务器部署 + MuseTalk   ✅ 完成
       - 端口配置 (8006/3002)
       - MuseTalk conda 环境初始化
       - subprocess 调用实现
       - 健康检查验证
 Day 3: 环境修复与验证          ✅ 完成
       - PyTorch 降级 (2.5 -> 2.0.1)
       - MMLab 依赖全量安装
       - 模型权重补全 (dwpose, syncnet)
       - 目录结构修复 (symlinks)
       - 推理脚本验证 (生成593帧)
 Day 4: 口型同步完整修复        ✅ 完成
       - 权重检测路径修复 (软链接)
       - audio_processor.py 音视频长度修复
       - inference.py 错误日志增强
       - MP4 视频合成验证通过
 Day 5: 前端功能增强            ✅ 完成
       - Web 视频上传功能
       - 上传进度显示
       - 自动刷新素材列表
 Day 6: LatentSync 1.6 升级     ✅ 完成
       - MuseTalk → LatentSync 迁移
       - 后端代码适配
       - 模型部署指南
       - 服务器部署验证
       - 性能优化 (视频预压缩、进度更新)
 Day 7: 社交媒体发布完善     ✅ 完成
       - QR码自动登录 (B站/抖音验证通过)
       - 智能定位策略 (CSS/Text并行)
       - 多平台发布 (B站/抖音/小红书)
       - UI 一致性优化
       - 文档规则体系优化
 Day 8: 用户体验优化          ✅ 完成
       - 文件名保留 (时间戳前缀)
       - 视频持久化 (历史视频API)
       - 历史视频列表组件
       - 素材/视频删除功能
 Day 9: 发布模块优化          ✅ 完成
       - B站/抖音登录+发布验证通过
       - 资源清理保障 (try-finally)
       - 超时保护 (消除无限循环)
       - 小红书 headless 模式修复
       - 扫码登录等待界面 (加载动画)
       - 抖音/B站登录策略优化 (Text优先)
       - 发布成功审核提示
       - 用户认证系统规划 (FastAPI+Supabase)
       - Supabase 表结构设计 (users/sessions)
       - 后端 JWT 认证实现 (auth.py/deps.py)
       - 数据库配置与 SQL 部署
       - 独立认证部署文档 (AUTH_DEPLOY.md)
       - 自动保活机制 (Crontab/Actions)
       - 部署稳定性优化 (Backend依赖修复)
       - 前端生产构建流程修复
       - LatentSync 严重卡顿修复 (线程数限制)
       - 部署手册全量更新
 Day 10: HTTPS 部署与细节完善 ✅ 完成
       - 隧道访问视频修正 (挂载 uploads)
       - 账号列表 Bug 修复 (paths.py 白名单)
       - 阿里云 Nginx HTTPS 部署
       - UI 细节优化 (Title 更新)
 Day 11: 上传架构重构          ✅ 完成
       - **核心修复**: Aliyun Nginx `client_max_body_size 0` 配置
       - 500 错误根治 (Direct Upload + Gateway Config)
       - Supabase RLS 权限策略部署
       - 前端集成 supabase-js
       - 彻底解决大文件上传超时 (30s 限制)
       - **用户数据隔离** (素材/视频/Cookie 按用户目录存储)
       - **Storage URL 修复** (SUPABASE_PUBLIC_URL 公网地址配置)
       - **发布服务优化** (本地文件直读，跳过 HTTP 下载)
 Day 12: iOS 兼容与移动端优化   ✅ 完成
       - Axios 全局拦截器 (401/403 自动跳转登录)
       - iOS Safari 安全区域白边修复 (viewport-fit: cover)
       - themeColor 配置 (状态栏颜色适配)
       - 渐变背景统一 (body 全局渐变，消除分层)
       - 移动端 Header 响应式优化 (按钮紧凑布局)
       - 发布页面 UI 重构 (立即发布 3/4 + 定时 1/4)
       - **Qwen3-TTS 0.6B 部署** (声音克隆模型，GPU0)
       - **部署文档** (QWEN3_TTS_DEPLOY.md)
 - [详细开发日志 (DevLogs)](file:///d:/CodingProjects/Antigravity/ViGent2/Docs/DevLogs/)
 - [部署手册 (DEPLOY_MANUAL)](file:///d:/CodingProjects/Antigravity/ViGent2/Docs/DEPLOY_MANUAL.md)
--- a/README.md
+++ b/README.md
@@ -1,34 +1,64 @@
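 # ViGent2 - Digital Human Talking-Head Video Generation System
-An open-source digital human talking-head video generation system built on **LatentSync 1.6 + EdgeTTS**.
+<div align="center">
-> 📹 Upload a static video of a person → 🎙️ Enter the narration script → 🎬 Automatically generate a lip-synced video
+> 📹 **Upload a person** · 🎙️ **Enter a script** · 🎬 **One-click video**
 An open-source digital human talking-head video generation system built on **LatentSync 1.6 + EdgeTTS**.
 Integrates **Qwen3-TTS** voice cloning and automated social media publishing.
 [Features](#-features) • [Tech Stack](#-tech-stack) • [Documentation](#-documentation) • [Deployment Guide](Docs/DEPLOY_MANUAL.md)
 </div>
 ---
 ## ✨ Features
- 🎬 **Lip sync** - driven by LatentSync 1.6, a 512×512 high-resolution diffusion model
+### Core Capabilities
- 🎙️ **TTS voiceover** - EdgeTTS with multiple voices (Yunxi, Xiaoxiao, and others)
+- 🎬 **High-definition lip sync** - driven by LatentSync 1.6, a 512×512 high-resolution latent diffusion model.
- 📱 **Fully automated publishing** - QR-code login + cookie persistence; scheduled publishing to multiple platforms (Bilibili/Douyin/Xiaohongshu)
+- 🎙️ **Multi-engine voiceover** - supports **EdgeTTS** (Microsoft neural voices) and **Qwen3-TTS** (voice cloning from a 3-second reference clip).
- 🖥️ **Web UI** - modern Next.js interface, adapted for iOS/Android mobile browsers
+- 📝 **Smart subtitles** - integrates faster-whisper + Remotion to generate word-by-word highlighted (karaoke-style) subtitles automatically.
- 🔐 **User system** - Supabase + JWT authentication, with an admin console and registration/login
+- 🎨 **Style presets** - title/subtitle style selection + preview + font-size adjustment, with support for a custom font library.
- 👥 **Multi-user isolation** - materials/videos/cookies are stored separately per user, with full data isolation
+- 🎵 **Background music** - preview + volume control + mixing, keeping the voiceover volume stable.
- 🚀 **Performance optimization** - video pre-compression, resident model server (0 s load time), direct reads of local files
+- 🤖 **AI-assisted authoring** - built-in GLM-4.7-Flash; extracts scripts from Bilibili/Douyin links, rewrites copy with AI, and generates titles/tags automatically.
 ### Platform Features
 - 📱 **Fully automated publishing** - scheduled publishing to Bilibili, Douyin, and Xiaohongshu, with QR-code login + cookie persistence.
 - 🔐 **Enterprise-grade authentication** - per-user isolation (Supabase), with phone-number registration/login and password management.
 - 🛡️ **Service guarding** - a built-in Watchdog monitors services and restarts hung ones automatically to support stable 24/7 operation.
 - 🚀 **High performance** - video pre-compression, resident model server (0 s load time), dual-GPU pipelined concurrency.
 ---
 ## 🛠️ Tech Stack
-| Module | Technology |
+| Area | Core Technology | Notes |
-|------|------|
+|------|----------|------|
-| Frontend | Next.js 14 + TypeScript + TailwindCSS |
+| **Frontend** | Next.js 14 | TypeScript, TailwindCSS, SWR |
-| Backend | FastAPI + Python 3.10 |
+| **Backend** | FastAPI | Python 3.10, AsyncIO, PM2 |
-| Database | **Supabase** (PostgreSQL), self-hosted with Docker |
+| **Database** | Supabase | PostgreSQL, Storage (local/S3), Auth |
-| Storage | **Supabase Storage** (local file system) |
+| **Lip sync** | LatentSync 1.6 | PyTorch 2.5, Diffusers, DeepCache |
-| Authentication | **JWT** + HttpOnly Cookie |
+| **Voice cloning** | Qwen3-TTS | 1.7B parameters, accelerated with Flash Attention 2 |
-| Lip sync | **LatentSync 1.6** (Latent Diffusion, 512×512) |
+| **Automation** | Playwright | Headless-browser automation for social media |
-| TTS | EdgeTTS |
+| **Deployment** | Docker & PM2 | Hybrid deployment architecture |
-| Video processing | FFmpeg |
+
-| Automated publishing | Playwright |
+---
 ## 📖 Documentation
 Development and deployment documentation:
 ### Deployment and Operations
 - **[Deployment Manual (DEPLOY_MANUAL.md)](Docs/DEPLOY_MANUAL.md)** - 👈 **Start here for deployment.** Covers the complete environment setup.
 - [Reference Audio Service Deployment (QWEN3_TTS_DEPLOY.md)](Docs/QWEN3_TTS_DEPLOY.md) - deployment guide for the voice cloning model.
 - [LatentSync Deployment Guide](models/LatentSync/DEPLOY.md) - standalone deployment of the lip-sync model.
 - [Authentication Deployment (AUTH_DEPLOY.md)](Docs/AUTH_DEPLOY.md) - Supabase and Auth system configuration.
 ### Development Docs
 - [Backend Development Guide](Docs/BACKEND_README.md) - API conventions and development workflow.
 - [Frontend Development Guide](Docs/FRONTEND_DEV.md) - UI component and page conventions.
 - [Development Logs (DevLogs)](Docs/DevLogs/) - daily progress and technical decision records.
 ---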
@@ -36,136 +66,33 @@
 ```
 ViGent2/
-├── backend/              # FastAPI 后端
+├── backend/              # FastAPI 后端服务
-│   ├── app/
+│   ├── app/              # 核心业务逻辑
-│   │   ├── api/          # API 路由
+│   ├── scripts/          # 运维脚本 (Watchdog 等)
-│   │   ├── services/     # 核心服务 (TTS, LipSync, Video)
+│   └── tests/            # 测试用例
-│   │   └── core/         # 配置
+├── frontend/             # Next.js 前端应用
-│   ├── requirements.txt
+├── models/               # AI 模型仓库
-│   └── .env.example
+│   ├── LatentSync/       # 唇形同步服务
-├── frontend/             # Next.js 前端
+│   └── Qwen3-TTS/        # 声音克隆服务
-│   └── src/app/
+└── Docs/                 # 项目文档
 ├── models/               # AI 模型
 │   └── LatentSync/       # 唇形同步模型
 │       └── DEPLOY.md     # LatentSync 部署指南
 └── Docs/                 # 文档
    ├── DEPLOY_MANUAL.md  # 部署手册
    ├── AUTH_DEPLOY.md    # 认证部署指南
    ├── task_complete.md
    └── DevLogs/
 ```
 ---
-## 🚀 Quick Start
+## 🌐 Service Architecture
-### 1. Clone the Project
+The system follows a microservice design; each component runs independently:
-```bash
+| Service | Port | Purpose |
-git clone <repository URL> /home/rongye/ProgramFiles/ViGent2
+|----------|------|------|
-cd /home/rongye/ProgramFiles/ViGent2
+| **Web UI** | 3002 | User entry point (Next.js) |
-```
+| **Backend API** | 8006 | Core business API (FastAPI) |
-
+| **LatentSync** | 8007 | Lip-sync inference service |
-### 2. Install the Backend
+| **Qwen3-TTS** | 8009 | Voice-cloning inference service |
-
+| **Supabase** | 8008 | Database and auth gateway |
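With five services on fixed ports, a quick reachability probe is a handy first diagnostic. The sketch below is only an illustration: it uses the port numbers from the service table, while the host `127.0.0.1` and the helper names `is_port_open` and `check_services` are assumptions and not part of the repository. It checks only that a TCP connection is accepted, not that the service behind the port is healthy.

```python
import socket

# Ports copied from the service table; the host used below is an assumption.
SERVICES = {
    "Web UI": 3002,
    "Backend API": 8006,
    "LatentSync": 8007,
    "Qwen3-TTS": 8009,
    "Supabase": 8008,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "127.0.0.1") -> dict[str, bool]:
    """Probe every service port and report which ones accept connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name:12s} {'up' if ok else 'down'}")
```

A probe like this could run before the Watchdog or PM2 logs are consulted, to narrow down which component is not listening.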
 ```bash
 cd backend
 python -m venv venv
 source venv/bin/activate  # Windows: venv\Scripts\activate
 pip install -r requirements.txt
 cp .env.example .env
 ```
 ### 3. 安装前端
 ```bash
 cd frontend
 npm install
 ```
 ### 4. 安装 LatentSync (服务器)
 详见 [models/LatentSync/DEPLOY.md](models/LatentSync/DEPLOY.md)
 ```bash
 # 创建独立 Conda 环境
 conda create -n latentsync python=3.10.13
 conda activate latentsync
 # 安装依赖并下载权重
 cd models/LatentSync
 pip install -r requirements.txt
 huggingface-cli download ByteDance/LatentSync-1.6 --local-dir checkpoints
 ```
 ### 5. 启动服务
 ```bash
 # 终端 1: 后端 (端口 8006)
 cd backend && source venv/bin/activate
 uvicorn app.main:app --host 0.0.0.0 --port 8006
 # 终端 2: 前端 (端口 3002)
 cd frontend
 npm run dev -- -p 3002
 # 终端 3: LatentSync 服务 (端口 8007, 推荐启动)
 cd models/LatentSync
 nohup python -m scripts.server > server.log 2>&1 &
 ```
 ---
-## 🖥️ 服务器配置
+## ⚖️ License
-**目标服务器**: Dell PowerEdge R730
+[MIT License](LICENSE) © 2026 ViGent Team
 | 配置 | 规格 |
 |------|------|
 | CPU | 2× Intel Xeon E5-2680 v4 (56 线程) |
 | 内存 | 192GB DDR4 |
 | GPU | 2× NVIDIA RTX 3090 24GB |
 | 存储 | 4.47TB |
 **GPU 分配**:
 - GPU 0: 其他服务
 - GPU 1: **LatentSync** 唇形同步 (~18GB VRAM)
 ---
 ## 🌐 访问地址
 | 服务 | 地址 | 说明 |
 |------|------|------|
 | **视频生成 (UI)** | `https://vigent.hbyrkj.top` | 用户访问入口 |
 | **API 服务** | `http://<服务器IP>:8006` | 后端 Swagger |
 | **认证管理 (Studio)** | `https://supabase.hbyrkj.top` | 需要 Basic Auth |
 | **认证 API (Kong)** | `https://api.hbyrkj.top` | Supabase 接口 |
 | **模型服务** | `http://<服务器IP>:8007` | LatentSync |
 ---
 ## 📖 文档
 - [手动部署指南](Docs/DEPLOY_MANUAL.md)
 - [Supabase 部署指南](Docs/SUPABASE_DEPLOY.md)
 - [LatentSync 部署指南](models/LatentSync/DEPLOY.md)
 - [开发日志](Docs/DevLogs/)
 - [任务进度](Docs/task_complete.md)
 ---
 ## 🆚 与 ViGent 的区别
 | 特性 | ViGent (v1) | ViGent2 |
 |------|-------------|---------|
 | 唇形同步模型 | MuseTalk v1.5 | **LatentSync 1.6** |
 | 分辨率 | 256×256 | **512×512** |
 | 架构 | GAN | **Latent Diffusion** |
 | 视频预处理 | 无 | **自动压缩优化** |
 ---
 ## 📄 License
 MIT
--- a/backend/.env.example
+++ b/backend/.env.example
@@ -20,16 +20,16 @@ LATENTSYNC_GPU_ID=1
 LATENTSYNC_LOCAL=true
 # Use the persistent server to speed up inference
-LATENTSYNC_USE_SERVER=false
+LATENTSYNC_USE_SERVER=true
 # Remote API URL (the persistent server defaults to port 8007)
 # LATENTSYNC_API_URL=http://localhost:8007
 # Inference steps (20-50; higher gives better quality but runs slower)
-LATENTSYNC_INFERENCE_STEPS=20
+LATENTSYNC_INFERENCE_STEPS=40
 # Guidance scale (1.0-3.0; higher gives tighter lip sync but may introduce jitter)
-LATENTSYNC_GUIDANCE_SCALE=1.5
+LATENTSYNC_GUIDANCE_SCALE=2.0
 # Enable DeepCache acceleration (recommended)
 LATENTSYNC_ENABLE_DEEPCACHE=true
@@ -59,5 +59,10 @@ JWT_EXPIRE_HOURS=168
 # =============== Admin configuration ===============
 # Admin account created automatically when the service starts
-ADMIN_EMAIL=admin@example.com
+ADMIN_PHONE=your_11_digit_phone
 ADMIN_PASSWORD=your_admin_password
 # =============== GLM AI configuration ===============
 # Zhipu GLM API settings (used to generate titles and tags)
 GLM_API_KEY=your_glm_api_key
 GLM_MODEL=glm-4.7-flash
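The comments in `.env.example` document ranges for the LatentSync parameters (20-50 steps, guidance scale 1.0-3.0). One way to enforce those ranges at startup is sketched below. This is a standard-library stand-in, not the project's actual settings loader (which is not shown in this diff); `LatentSyncSettings`, `load_latentsync_settings`, and the default values are assumptions chosen to mirror the new values in the example file.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LatentSyncSettings:
    use_server: bool
    inference_steps: int
    guidance_scale: float
    enable_deepcache: bool


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_latentsync_settings(env: Mapping[str, str] = os.environ) -> LatentSyncSettings:
    """Read the LATENTSYNC_* variables and reject values outside the documented ranges."""
    steps = int(env.get("LATENTSYNC_INFERENCE_STEPS", "40"))
    if not 20 <= steps <= 50:
        raise ValueError(f"LATENTSYNC_INFERENCE_STEPS must be 20-50, got {steps}")

    scale = float(env.get("LATENTSYNC_GUIDANCE_SCALE", "2.0"))
    if not 1.0 <= scale <= 3.0:
        raise ValueError(f"LATENTSYNC_GUIDANCE_SCALE must be 1.0-3.0, got {scale}")

    return LatentSyncSettings(
        use_server=_as_bool(env.get("LATENTSYNC_USE_SERVER", "true")),
        inference_steps=steps,
        guidance_scale=scale,
        enable_deepcache=_as_bool(env.get("LATENTSYNC_ENABLE_DEEPCACHE", "true")),
    )
```

Passing the environment as a mapping keeps the function easy to exercise in tests without touching the real process environment.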
--- a/backend/app/api/admin.py
+++ b/backend/app/api/admin.py
@@ -14,7 +14,7 @@ router = APIRouter(prefix="/api/admin", tags=["管理"])
 class UserListItem(BaseModel):
    id: str
-    email: str
+    phone: str
    username: Optional[str]
    role: str
    is_active: bool
@@ -36,7 +36,7 @@ async def list_users(admin: dict = Depends(get_current_admin)):
        return [
            UserListItem(
                id=u["id"],
-                email=u["email"],
+                phone=u["phone"],
                username=u.get("username"),
                role=u["role"],
                is_active=u["is_active"],
@@ -87,7 +87,7 @@ async def activate_user(
                detail="用户不存在"
            )
-        logger.info(f"管理员 {admin['email']} 激活用户 {user_id}, 有效期: {request.expires_days or '永久'} 天")
+        logger.info(f"管理员 {admin['phone']} 激活用户 {user_id}, 有效期: {request.expires_days or '永久'} 天")
        return {
            "success": True,
@@ -128,7 +128,7 @@ async def deactivate_user(
        # 清除用户 session
        supabase.table("user_sessions").delete().eq("user_id", user_id).execute()
-        logger.info(f"管理员 {admin['email']} 停用用户 {user_id}")
+        logger.info(f"管理员 {admin['phone']} 停用用户 {user_id}")
        return {"success": True, "message": "用户已停用"}
    except HTTPException:
@@ -171,7 +171,7 @@ async def extend_user(
            "expires_at": expires_at
        }).eq("id", user_id).execute()
-        logger.info(f"管理员 {admin['email']} 延长用户 {user_id} 授权 {request.expires_days or '永久'} 天")
+        logger.info(f"管理员 {admin['phone']} 延长用户 {user_id} 授权 {request.expires_days or '永久'} 天")
        return {
            "success": True,
--- a/backend/app/api/ai.py
+++ b/backend/app/api/ai.py
@@ -0,0 +1,45 @@
 """
 AI 相关 API 路由
 """
 from fastapi import APIRouter, HTTPException
 from pydantic import BaseModel
 from loguru import logger
 from app.services.glm_service import glm_service
 router = APIRouter(prefix="/api/ai", tags=["AI"])
 class GenerateMetaRequest(BaseModel):
    """生成标题标签请求"""
    text: str
 class GenerateMetaResponse(BaseModel):
    """生成标题标签响应"""
    title: str
    tags: list[str]
@router.post("/generate-meta", response_model=GenerateMetaResponse)
 async def generate_meta(req: GenerateMetaRequest):
    """
    AI 生成视频标题和标签
    根据口播文案自动生成吸引人的标题和相关标签
    """
    if not req.text or not req.text.strip():
        raise HTTPException(status_code=400, detail="口播文案不能为空")
    try:
        logger.info(f"Generating meta for text: {req.text[:50]}...")
        result = await glm_service.generate_title_tags(req.text)
        return GenerateMetaResponse(
            title=result.get("title", ""),
            tags=result.get("tags", [])
        )
    except Exception as e:
        logger.error(f"Generate meta failed: {e}")
        raise HTTPException(status_code=500, detail=str(e))
--- a/backend/app/api/assets.py
+++ b/backend/app/api/assets.py
@@ -0,0 +1,22 @@
 from fastapi import APIRouter, Depends
 from app.core.deps import get_current_user
 from app.services.assets_service import list_styles, list_bgm
 router = APIRouter()
@router.get("/subtitle-styles")
 async def list_subtitle_styles(current_user: dict = Depends(get_current_user)):
    return {"styles": list_styles("subtitle")}
@router.get("/title-styles")
 async def list_title_styles(current_user: dict = Depends(get_current_user)):
    return {"styles": list_styles("title")}
@router.get("/bgm")
 async def list_bgm_items(current_user: dict = Depends(get_current_user)):
    return {"bgm": list_bgm()}
--- a/backend/app/api/auth.py
+++ b/backend/app/api/auth.py
@@ -1,8 +1,8 @@
 """
-认证 API：注册、登录、登出
+认证 API：注册、登录、登出、修改密码
 """
 from fastapi import APIRouter, HTTPException, Response, status, Request
-from pydantic import BaseModel, EmailStr
+from pydantic import BaseModel, field_validator
 from app.core.supabase import get_supabase
 from app.core.security import (
    get_password_hash,
@@ -15,27 +15,55 @@ from app.core.security import (
 )
 from loguru import logger
 from typing import Optional
 import re
 router = APIRouter(prefix="/api/auth", tags=["认证"])
 class RegisterRequest(BaseModel):
-    email: EmailStr
+    phone: str
    password: str
    username: Optional[str] = None
    @field_validator('phone')
    @classmethod
    def validate_phone(cls, v):
         if not re.fullmatch(r'\d{11}', v):
            raise ValueError('手机号必须是11位数字')
        return v
 class LoginRequest(BaseModel):
-    email: EmailStr
+    phone: str
    password: str
    @field_validator('phone')
    @classmethod
    def validate_phone(cls, v):
         if not re.fullmatch(r'\d{11}', v):
            raise ValueError('手机号必须是11位数字')
        return v
 class ChangePasswordRequest(BaseModel):
    old_password: str
    new_password: str
    @field_validator('new_password')
    @classmethod
    def validate_new_password(cls, v):
        if len(v) < 6:
            raise ValueError('新密码长度至少6位')
        return v
 class UserResponse(BaseModel):
    id: str
-    email: str
+    phone: str
    username: Optional[str]
    role: str
    is_active: bool
    expires_at: Optional[str] = None
@router.post("/register")
@@ -48,29 +76,29 @@ async def register(request: RegisterRequest):
    try:
        supabase = get_supabase()
-        # 检查邮箱是否已存在
+        # 检查手机号是否已存在
        existing = supabase.table("users").select("id").eq(
-            "email", request.email
+            "phone", request.phone
        ).execute()
        if existing.data:
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
-                detail="该邮箱已注册"
+                detail="该手机号已注册"
            )
        # 创建用户
        password_hash = get_password_hash(request.password)
        result = supabase.table("users").insert({
-            "email": request.email,
+            "phone": request.phone,
            "password_hash": password_hash,
-            "username": request.username or request.email.split("@")[0],
+            "username": request.username or f"用户{request.phone[-4:]}",
            "role": "pending",
            "is_active": False
        }).execute()
-        logger.info(f"新用户注册: {request.email}")
+        logger.info(f"新用户注册: {request.phone}")
        return {
            "success": True,
@@ -100,21 +128,21 @@ async def login(request: LoginRequest, response: Response):
        # 查找用户
        user_result = supabase.table("users").select("*").eq(
-            "email", request.email
+            "phone", request.phone
        ).single().execute()
        user = user_result.data
        if not user:
            raise HTTPException(
                status_code=status.HTTP_401_UNAUTHORIZED,
-                detail="邮箱或密码错误"
+                detail="手机号或密码错误"
            )
        # 验证密码
        if not verify_password(request.password, user["password_hash"]):
            raise HTTPException(
                status_code=status.HTTP_401_UNAUTHORIZED,
-                detail="邮箱或密码错误"
+                detail="手机号或密码错误"
            )
        # 检查是否激活
@@ -154,17 +182,18 @@ async def login(request: LoginRequest, response: Response):
        # 设置 HttpOnly Cookie
        set_auth_cookie(response, token)
-        logger.info(f"用户登录: {request.email}")
+        logger.info(f"用户登录: {request.phone}")
        return {
            "success": True,
            "message": "登录成功",
            "user": UserResponse(
                id=user["id"],
-                email=user["email"],
+                phone=user["phone"],
                username=user.get("username"),
                role=user["role"],
-                is_active=user["is_active"]
+                is_active=user["is_active"],
                expires_at=user.get("expires_at")
            )
        }
    except HTTPException:
@@ -184,6 +213,91 @@ async def logout(response: Response):
    return {"success": True, "message": "已登出"}
@router.post("/change-password")
 async def change_password(request: ChangePasswordRequest, req: Request, response: Response):
    """
    修改密码
    - 验证当前密码
    - 设置新密码
    - 重新生成 session token
    """
    # 从 Cookie 获取用户
    token = req.cookies.get("access_token")
    if not token:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="未登录"
        )
    token_data = decode_access_token(token)
    if not token_data:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Token 无效"
        )
    try:
        supabase = get_supabase()
        # 获取用户信息
        user_result = supabase.table("users").select("*").eq(
            "id", token_data.user_id
        ).single().execute()
        user = user_result.data
        if not user:
            raise HTTPException(
                status_code=status.HTTP_401_UNAUTHORIZED,
                detail="用户不存在"
            )
        # 验证当前密码
        if not verify_password(request.old_password, user["password_hash"]):
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
                detail="当前密码错误"
            )
        # 更新密码
        new_password_hash = get_password_hash(request.new_password)
        supabase.table("users").update({
            "password_hash": new_password_hash
        }).eq("id", user["id"]).execute()
        # 生成新的 session token，使旧 token 失效
        new_session_token = generate_session_token()
        supabase.table("user_sessions").delete().eq(
            "user_id", user["id"]
        ).execute()
        supabase.table("user_sessions").insert({
            "user_id": user["id"],
            "session_token": new_session_token,
            "device_info": None
        }).execute()
        # 生成新的 JWT Token
        new_token = create_access_token(user["id"], new_session_token)
        set_auth_cookie(response, new_token)
        logger.info(f"用户修改密码: {user['phone']}")
        return {
            "success": True,
            "message": "密码修改成功"
        }
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"修改密码失败: {e}")
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail="修改密码失败，请稍后重试"
        )
@router.get("/me")
 async def get_me(request: Request):
    """获取当前用户信息"""
@@ -216,8 +330,9 @@ async def get_me(request: Request):
    return UserResponse(
        id=user["id"],
-        email=user["email"],
+        phone=user["phone"],
        username=user.get("username"),
        role=user["role"],
-        is_active=user["is_active"]
+        is_active=user["is_active"],
        expires_at=user.get("expires_at")
    )
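Reviewer note on phone validation: two regex edge cases are easy to miss with an 11-digit check. In Python, `$` also matches just before a trailing newline, and `\d` in a `str` pattern matches non-ASCII digits such as fullwidth ones. The sketch below shows a stricter variant using `re.fullmatch` with `[0-9]`; the helper names `validate_phone` and `default_username` are illustrative assumptions (the username fallback mirrors the `用户` + last-four-digits rule used at registration).

```python
import re

# Exactly 11 ASCII digits; fullmatch leaves no room for a trailing newline.
_PHONE_RE = re.compile(r"[0-9]{11}")


def validate_phone(value: str) -> str:
    """Return the phone number unchanged, or raise ValueError if it is malformed."""
    if not _PHONE_RE.fullmatch(value):
        raise ValueError("phone must be exactly 11 digits")
    return value


def default_username(phone: str) -> str:
    """Fallback display name built from the last four digits of the phone number."""
    return f"用户{phone[-4:]}"
```

Whether fullwidth digits should be rejected or normalized is a product decision; this sketch simply rejects them.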
--- a/backend/app/api/materials.py
+++ b/backend/app/api/materials.py
@@ -9,6 +9,10 @@ import os
 import aiofiles
 from pathlib import Path
 from loguru import logger
 from pydantic import BaseModel
 from typing import Optional
 import httpx
 router = APIRouter()
@@ -329,3 +333,6 @@ async def delete_material(material_id: str, current_user: dict = Depends(get_cur
        return {"success": True, "message": "素材已删除"}
    except Exception as e:
        raise HTTPException(500, f"删除失败: {str(e)}")
--- a/backend/app/api/ref_audios.py
+++ b/backend/app/api/ref_audios.py
@@ -0,0 +1,411 @@
 """
 参考音频管理 API
 支持上传/列表/删除参考音频，用于 Qwen3-TTS 声音克隆
 """
 from fastapi import APIRouter, UploadFile, File, Form, HTTPException, Depends
 from pydantic import BaseModel
 from typing import List, Optional
 from pathlib import Path
 from loguru import logger
 import time
 import json
 import subprocess
 import tempfile
 import os
 import re
 from app.core.deps import get_current_user
 from app.services.storage import storage_service
 router = APIRouter()
 # 支持的音频格式
 ALLOWED_AUDIO_EXTENSIONS = {'.wav', '.mp3', '.m4a', '.webm', '.ogg', '.flac', '.aac'}
 # 参考音频 bucket
 BUCKET_REF_AUDIOS = "ref-audios"
 class RefAudioResponse(BaseModel):
    id: str
    name: str
    path: str  # signed URL for playback
    ref_text: str
    duration_sec: float
    created_at: int
 class RefAudioListResponse(BaseModel):
    items: List[RefAudioResponse]
 def sanitize_filename(filename: str) -> str:
    """清理文件名，移除特殊字符"""
    safe_name = re.sub(r'[<>:"/\\|?*\s]', '_', filename)
    if len(safe_name) > 50:
        ext = Path(safe_name).suffix
        safe_name = safe_name[:50 - len(ext)] + ext
    return safe_name
 def get_audio_duration(file_path: str) -> float:
    """获取音频时长 (秒)"""
    try:
        result = subprocess.run(
            ['ffprobe', '-v', 'quiet', '-show_entries', 'format=duration',
             '-of', 'csv=p=0', file_path],
            capture_output=True, text=True, timeout=10
        )
        return float(result.stdout.strip())
    except Exception as e:
        logger.warning(f"获取音频时长失败: {e}")
        return 0.0
 def convert_to_wav(input_path: str, output_path: str) -> bool:
    """将音频转换为 WAV 格式 (16kHz, mono)"""
    try:
        subprocess.run([
            'ffmpeg', '-y', '-i', input_path,
            '-ar', '16000',  # 16kHz 采样率
            '-ac', '1',      # 单声道
            '-acodec', 'pcm_s16le',  # 16-bit PCM
            output_path
        ], capture_output=True, timeout=60, check=True)
        return True
    except Exception as e:
        logger.error(f"音频转换失败: {e}")
        return False
@router.post("", response_model=RefAudioResponse)
 async def upload_ref_audio(
    file: UploadFile = File(...),
    ref_text: str = Form(...),
    user: dict = Depends(get_current_user)
 ):
    """
    上传参考音频
    - file: 音频文件 (支持 wav, mp3, m4a, webm 等)
    - ref_text: 参考音频的转写文字 (必填)
    """
    user_id = user["id"]
    # 验证文件扩展名
    ext = Path(file.filename).suffix.lower()
    if ext not in ALLOWED_AUDIO_EXTENSIONS:
        raise HTTPException(
            status_code=400,
            detail=f"不支持的音频格式: {ext}。支持的格式: {', '.join(ALLOWED_AUDIO_EXTENSIONS)}"
        )
    # 验证 ref_text
    if not ref_text or len(ref_text.strip()) < 2:
        raise HTTPException(status_code=400, detail="参考文字不能为空")
    try:
        # 创建临时文件
        with tempfile.NamedTemporaryFile(delete=False, suffix=ext) as tmp_input:
            content = await file.read()
            tmp_input.write(content)
            tmp_input_path = tmp_input.name
         # Always normalize to 16 kHz mono WAV, even when the upload is already a .wav,
         # and fail the request if ffmpeg cannot convert it.
         tmp_wav_path = tmp_input_path + ".wav"
         if not convert_to_wav(tmp_input_path, tmp_wav_path):
             raise HTTPException(status_code=500, detail="音频格式转换失败")
        # 获取音频时长
        duration = get_audio_duration(tmp_wav_path)
        if duration < 1.0:
            raise HTTPException(status_code=400, detail="音频时长过短，至少需要 1 秒")
        if duration > 60.0:
            raise HTTPException(status_code=400, detail="音频时长过长，最多 60 秒")
         # 3. Resolve display-name collisions
         original_name = file.filename
         # There is no metadata table, so learning every existing display name would mean
         # downloading each metadata JSON. As a lightweight approximation, list the user's
         # stored objects and count those whose name ends with the sanitized stem of this
         # upload, e.g. 123_abc.wav and 456_abc.wav both count as earlier copies of abc.
         # Storage paths stay unique regardless, thanks to the timestamp prefix.
         existing_files = await storage_service.list_files(BUCKET_REF_AUDIOS, user_id)
         dup_count = 0
         # Stored objects are named "{timestamp}_{sanitized stem}.wav" (see storage_path below)
         search_suffix = f"_{sanitize_filename(Path(original_name).stem)}.wav"
        for f in existing_files:
            fname = f.get('name', '')
            if fname.endswith(search_suffix):
                dup_count += 1
        final_display_name = original_name
        if dup_count > 0:
            name_stem = Path(original_name).stem
            name_ext = Path(original_name).suffix
            final_display_name = f"{name_stem}({dup_count}){name_ext}"
        # 生成存储路径 (唯一ID)
        timestamp = int(time.time())
        safe_name = sanitize_filename(Path(file.filename).stem)
        storage_path = f"{user_id}/{timestamp}_{safe_name}.wav"
        # 上传 WAV 文件到 Supabase
        with open(tmp_wav_path, 'rb') as f:
            wav_data = f.read()
        await storage_service.upload_file(
            bucket=BUCKET_REF_AUDIOS,
            path=storage_path,
            file_data=wav_data,
            content_type="audio/wav"
        )
        # 上传元数据 JSON
        metadata = {
            "ref_text": ref_text.strip(),
            "original_filename": final_display_name, # 这里的名字如果有重复会自动加(1)
            "duration_sec": duration,
            "created_at": timestamp
        }
        metadata_path = f"{user_id}/{timestamp}_{safe_name}.json"
        await storage_service.upload_file(
            bucket=BUCKET_REF_AUDIOS,
            path=metadata_path,
            file_data=json.dumps(metadata, ensure_ascii=False).encode('utf-8'),
            content_type="application/json"
        )
        # 获取签名 URL
        signed_url = await storage_service.get_signed_url(BUCKET_REF_AUDIOS, storage_path)
        # 清理临时文件
        os.unlink(tmp_input_path)
        if os.path.exists(tmp_wav_path):
            os.unlink(tmp_wav_path)
        return RefAudioResponse(
            id=storage_path,
             name=final_display_name,
            path=signed_url,
            ref_text=ref_text.strip(),
            duration_sec=duration,
            created_at=timestamp
        )
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"上传参考音频失败: {e}")
        raise HTTPException(status_code=500, detail=f"上传失败: {str(e)}")
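Reviewer note on temp files: as written in this hunk, the two temporary files are unlinked only on the success path, so any early `HTTPException` (conversion failure, duration out of range) or upload error leaves them on disk. A context manager with `try`/`finally` is one way to guarantee cleanup. The sketch below uses only the standard library; `temp_audio_paths` is a hypothetical helper, not something that exists in the repository.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_audio_paths(suffix: str) -> Iterator[tuple[str, str]]:
    """Yield (input_path, wav_path) and delete both on exit, even when an error is raised."""
    fd, input_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # the caller reopens the path to write the uploaded bytes
    wav_path = input_path + ".wav"
    try:
        yield input_path, wav_path
    finally:
        for path in (input_path, wav_path):
            try:
                os.unlink(path)
            except FileNotFoundError:
                # The WAV file is absent if conversion never ran or failed early.
                pass
```

In the handler, the conversion, duration checks, and storage uploads would all sit inside `with temp_audio_paths(ext) as (tmp_input_path, tmp_wav_path):`, so every exit path cleans up.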
@router.get("", response_model=RefAudioListResponse)
 async def list_ref_audios(user: dict = Depends(get_current_user)):
    """列出当前用户的所有参考音频"""
    user_id = user["id"]
    try:
        # 列出用户目录下的文件
        files = await storage_service.list_files(BUCKET_REF_AUDIOS, user_id)
        # 过滤出 .wav 文件并获取对应的 metadata
        items = []
        for f in files:
            name = f.get("name", "")
            if not name.endswith(".wav"):
                continue
            storage_path = f"{user_id}/{name}"
            # 尝试读取 metadata
            metadata_name = name.replace(".wav", ".json")
            metadata_path = f"{user_id}/{metadata_name}"
            ref_text = ""
            duration_sec = 0.0
            created_at = 0
            original_filename = ""
            try:
                # 获取 metadata 内容
                metadata_url = await storage_service.get_signed_url(BUCKET_REF_AUDIOS, metadata_path)
                import httpx
                async with httpx.AsyncClient() as client:
                    resp = await client.get(metadata_url)
                    if resp.status_code == 200:
                        metadata = resp.json()
                        ref_text = metadata.get("ref_text", "")
                        duration_sec = metadata.get("duration_sec", 0.0)
                        created_at = metadata.get("created_at", 0)
                        original_filename = metadata.get("original_filename", "")
            except Exception as e:
                logger.warning(f"读取 metadata 失败: {e}")
                 # Fall back to the timestamp prefix in the file name
                 try:
                     created_at = int(name.split("_")[0])
                 except ValueError:
                     pass
            # 获取音频签名 URL
            signed_url = await storage_service.get_signed_url(BUCKET_REF_AUDIOS, storage_path)
            # 优先显示原始文件名 (去掉时间戳前缀)
            display_name = original_filename if original_filename else name
            # 如果原始文件名丢失，尝试从现有文件名中通过正则去掉时间戳
            if not display_name or display_name == name:
                 # 匹配 "1234567890_filename.wav"
                 match = re.match(r'^\d+_(.+)$', name)
                 if match:
                     display_name = match.group(1)
            items.append(RefAudioResponse(
                id=storage_path,
                name=display_name,
                path=signed_url,
                ref_text=ref_text,
                duration_sec=duration_sec,
                created_at=created_at
            ))
        # 按创建时间倒序排列
        items.sort(key=lambda x: x.created_at, reverse=True)
        return RefAudioListResponse(items=items)
    except Exception as e:
        logger.error(f"列出参考音频失败: {e}")
        raise HTTPException(status_code=500, detail=f"获取列表失败: {str(e)}")
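Reviewer note on listing latency: the loop above fetches one metadata JSON per audio file, one after another, so response time grows linearly with the number of files. The sketch below shows the general pattern for fetching them concurrently with a bounded `asyncio.gather`. It is deliberately independent of `storage_service` and `httpx`: `fetch` stands for any coroutine that returns the parsed metadata for a path, and both `fetch_all_metadata` and the `limit` default are assumptions.

```python
import asyncio
from typing import Awaitable, Callable, Optional

MetadataFetcher = Callable[[str], Awaitable[dict]]


async def fetch_all_metadata(
    paths: list[str],
    fetch: MetadataFetcher,
    limit: int = 8,
) -> list[Optional[dict]]:
    """Fetch metadata for every path concurrently; failed lookups become None.

    Results are returned in the same order as `paths`.
    """
    semaphore = asyncio.Semaphore(limit)  # cap the number of in-flight requests

    async def fetch_one(path: str) -> Optional[dict]:
        async with semaphore:
            try:
                return await fetch(path)
            except Exception:
                # A missing or unreadable metadata file should not fail the whole listing.
                return None

    return await asyncio.gather(*(fetch_one(p) for p in paths))
```

Entries that come back as `None` can then take the same fallback path the handler already has for missing metadata.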
@router.delete("/{audio_id:path}")
 async def delete_ref_audio(audio_id: str, user: dict = Depends(get_current_user)):
    """删除参考音频"""
    user_id = user["id"]
    # 安全检查：确保只能删除自己的文件
    if not audio_id.startswith(f"{user_id}/"):
        raise HTTPException(status_code=403, detail="无权删除此文件")
    try:
        # 删除 WAV 文件
        await storage_service.delete_file(BUCKET_REF_AUDIOS, audio_id)
        # 删除 metadata JSON
        metadata_path = audio_id.replace(".wav", ".json")
         try:
             await storage_service.delete_file(BUCKET_REF_AUDIOS, metadata_path)
         except Exception:
             pass  # the metadata file may not exist
        return {"success": True, "message": "删除成功"}
    except Exception as e:
        logger.error(f"删除参考音频失败: {e}")
        raise HTTPException(status_code=500, detail=f"删除失败: {str(e)}")
 class RenameRequest(BaseModel):
    new_name: str
@router.put("/{audio_id:path}")
 async def rename_ref_audio(
    audio_id: str,
    request: RenameRequest,
    user: dict = Depends(get_current_user)
 ):
    """重命名参考音频 (修改 metadata 中的 display name)"""
    user_id = user["id"]
    # 安全检查
    if not audio_id.startswith(f"{user_id}/"):
        raise HTTPException(status_code=403, detail="无权修改此文件")
    new_name = request.new_name.strip()
    if not new_name:
         raise HTTPException(status_code=400, detail="新名称不能为空")
    # 确保新名称有后缀 (保留原后缀或添加 .wav)
    if not Path(new_name).suffix:
        new_name += ".wav"
    try:
        # 1. Download the existing metadata (swap only the trailing .wav suffix)
        metadata_path = re.sub(r"\.wav$", ".json", audio_id)
        try:
            # Fetch the existing JSON
            import httpx
            metadata_url = await storage_service.get_signed_url(BUCKET_REF_AUDIOS, metadata_path)
            if not metadata_url:
                # No JSON yet: fall through to building a minimal one
                raise Exception("Metadata not found")
            async with httpx.AsyncClient() as client:
                resp = await client.get(metadata_url)
                if resp.status_code == 200:
                    metadata = resp.json()
                else:
                    raise Exception(f"Failed to fetch metadata: {resp.status_code}")
        except Exception as e:
            logger.warning(f"无法读取元数据: {e}, 将创建新的元数据")
            # 兜底：如果读取失败，构建最小元数据
            metadata = {
                "ref_text": "", # 可能丢失
                "duration_sec": 0.0,
                "created_at": int(time.time()),
                "original_filename": new_name
            }
        # 2. 更新 original_filename
        metadata["original_filename"] = new_name
        # 3. 覆盖上传 metadata
        await storage_service.upload_file(
            bucket=BUCKET_REF_AUDIOS,
            path=metadata_path,
            file_data=json.dumps(metadata, ensure_ascii=False).encode('utf-8'),
            content_type="application/json"
        )
        return {"success": True, "name": new_name}
    except Exception as e:
        logger.error(f"重命名失败: {e}")
        raise HTTPException(status_code=500, detail=f"重命名失败: {str(e)}")
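Review note: the display-name fallback and the metadata-path derivation in the handlers above can be expressed as two small pure functions. This is a minimal sketch, not the project's code; `display_name_for` and `metadata_path_for` are hypothetical names, and it assumes storage paths always use forward slashes.

```python
import re
from pathlib import PurePosixPath

# Stored files are named "<unix timestamp>_<original name>".
_TS_PREFIX = re.compile(r"^\d+_(.+)$")


def display_name_for(stored_name: str, original_filename: str = "") -> str:
    """Return the name to show in the UI (hypothetical helper)."""
    if original_filename:
        return original_filename
    match = _TS_PREFIX.match(stored_name)
    # Without a timestamp prefix, the stored name is shown unchanged.
    return match.group(1) if match else stored_name


def metadata_path_for(audio_id: str) -> str:
    """Swap only the final suffix, so ".wav" inside the name is left alone."""
    return str(PurePosixPath(audio_id).with_suffix(".json"))
```

Replacing only the final suffix matters for names such as `my.wav.take.wav`, where a plain `str.replace(".wav", ".json")` would rewrite both occurrences.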
--- a/backend/app/api/tools.py
+++ b/backend/app/api/tools.py
@@ -0,0 +1,398 @@
 from fastapi import APIRouter, UploadFile, File, Form, HTTPException
 from typing import Optional
 import asyncio
 import shutil
 import os
 import time
 from pathlib import Path
 from loguru import logger
 import traceback
 import re
 import json
 import requests
 from urllib.parse import unquote
 from app.services.whisper_service import whisper_service
 from app.services.glm_service import glm_service
 router = APIRouter()
@router.post("/extract-script")
 async def extract_script_tool(
    file: Optional[UploadFile] = File(None),
    url: Optional[str] = Form(None),
    rewrite: bool = Form(True)
 ):
    """
    Standalone script-extraction tool.
    Accepts an uploaded video/audio file OR a video link, transcribes it, and optionally rewrites the text with AI.
    """
    if not file and not url:
        raise HTTPException(400, "必须提供文件或视频链接")
    temp_path = None
    try:
        timestamp = int(time.time())
        temp_dir = Path("/tmp")
        if os.name == 'nt':
            temp_dir = Path("d:/tmp")
        temp_dir.mkdir(parents=True, exist_ok=True)
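        # 1. Fetch or save the source file
        loop = asyncio.get_running_loop()
        if file:
            safe_filename = Path(file.filename or "upload").name.replace(" ", "_")
            temp_path = temp_dir / f"tool_extract_{timestamp}_{safe_filename}"
            # Run the blocking copy in the thread pool; the with-block closes the output file
            def _save_upload():
                with open(temp_path, "wb") as out:
                    shutil.copyfileobj(file.file, out)
            await loop.run_in_executor(None, _save_upload)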
            logger.info(f"Tool processing upload file: {temp_path}")
        else:
            # URL 下载逻辑
            # 自动提取文案中的链接 (支持 Douyin/Bilibili 等分享文案)
            url_match = re.search(r'https?://[^\s]+', url)
            if url_match:
                extracted_url = url_match.group(0)
                logger.info(f"Extracted URL from text: {extracted_url}")
                url = extracted_url
            logger.info(f"Tool downloading URL: {url}")
            # 封装 yt-dlp 下载函数 (Blocking)
            def _download_yt_dlp():
                import yt_dlp
                logger.info("Attempting download with yt-dlp...")
                ydl_opts = {
                    'format': 'bestaudio/best',
                    'outtmpl': str(temp_dir / f"tool_download_{timestamp}_%(id)s.%(ext)s"),
                    'quiet': True,
                    'no_warnings': True,
                    'http_headers': {
                         'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
                         'Referer': 'https://www.douyin.com/',
                    }
                }
                with yt_dlp.YoutubeDL(ydl_opts) as ydl:
                    info = ydl.extract_info(url, download=True)
                    if 'requested_downloads' in info:
                        downloaded_file = info['requested_downloads'][0]['filepath']
                    else:
                        ext = info.get('ext', 'mp4')
                        video_id = info.get('id')  # avoid shadowing the built-in id()
                        downloaded_file = str(temp_dir / f"tool_download_{timestamp}_{video_id}.{ext}")
                    return Path(downloaded_file)
            # 先尝试 yt-dlp (Run in Executor)
            try:
                temp_path = await loop.run_in_executor(None, _download_yt_dlp)
                logger.info(f"yt-dlp downloaded to: {temp_path}")
            except Exception as e:
                logger.warning(f"yt-dlp download failed: {e}. Trying manual Douyin fallback...")
                # 失败则尝试手动解析 (Douyin Fallback)
                if "douyin" in url:
                    manual_path = await download_douyin_manual(url, temp_dir, timestamp)
                    if manual_path:
                        temp_path = manual_path
                        logger.info(f"Manual Douyin fallback successful: {temp_path}")
                    else:
                         raise HTTPException(400, f"视频下载失败。yt-dlp 报错: {str(e)}")
                elif "bilibili" in url:
                    manual_path = await download_bilibili_manual(url, temp_dir, timestamp)
                    if manual_path:
                        temp_path = manual_path
                        logger.info(f"Manual Bilibili fallback successful: {temp_path}")
                    else:
                         raise HTTPException(400, f"视频下载失败。yt-dlp 报错: {str(e)}")
                else:
                    raise HTTPException(400, f"视频下载失败: {str(e)}")
        if not temp_path or not temp_path.exists():
             raise HTTPException(400, "文件获取失败")
        # 1.5 安全转换: 强制转为 WAV (16k)
        import subprocess
        audio_path = temp_dir / f"extract_audio_{timestamp}.wav"
        def _convert_audio():
            try:
                convert_cmd = [
                    'ffmpeg',
                    '-i', str(temp_path),
                    '-vn', # 忽略视频
                    '-acodec', 'pcm_s16le',
                    '-ar', '16000', # Whisper 推荐采样率
                    '-ac', '1',    # 单声道
                    '-y',          # 覆盖
                    str(audio_path)
                ]
                # 捕获 stderr
                subprocess.run(convert_cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
                return True
            except subprocess.CalledProcessError as e:
                error_log = e.stderr.decode('utf-8', errors='ignore') if e.stderr else str(e)
                logger.error(f"FFmpeg check/convert failed: {error_log}")
                # Check whether the download is actually an HTML page
                head = b""
                try:
                    with open(temp_path, 'rb') as f:
                        head = f.read(100)
                except OSError:
                    pass
                if b'<!DOCTYPE html' in head or b'<html' in head:
                    raise ValueError("HTML_DETECTED")
                raise ValueError("CONVERT_FAILED")
        # 执行转换 (Run in Executor)
        try:
            await loop.run_in_executor(None, _convert_audio)
            logger.info(f"Converted to WAV: {audio_path}")
            target_path = audio_path
        except ValueError as ve:
            if str(ve) == "HTML_DETECTED":
                 raise HTTPException(400, "下载的文件是网页而非视频，请重试或手动上传。")
            else:
                 raise HTTPException(400, "下载的文件已损坏或格式无法识别。")
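        # 2. Extract the script (Whisper); always remove the intermediate WAV afterwards
        try:
            script = await whisper_service.transcribe(str(target_path))
        finally:
            audio_path.unlink(missing_ok=True)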
        # 3. AI 洗稿 (GLM)
        rewritten = None
        if rewrite:
            if script and len(script.strip()) > 0:
                logger.info("Rewriting script...")
                rewritten = await glm_service.rewrite_script(script)
            else:
                logger.warning("No script extracted, skipping rewrite")
        return {
            "success": True,
            "original_script": script,
            "rewritten_script": rewritten
        }
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Tool extract failed: {e}")
        logger.error(traceback.format_exc())
        # Friendly error message
        msg = str(e)
        if "Fresh cookies" in msg:
            msg = "下载失败：目标平台开启了反爬验证，请过段时间重试或直接上传视频文件。"
        raise HTTPException(500, f"提取失败: {msg}")
    finally:
        # 清理临时文件
        if temp_path and temp_path.exists():
            try:
                os.remove(temp_path)
                logger.info(f"Cleaned up temp file: {temp_path}")
            except Exception as e:
                logger.warning(f"Failed to cleanup temp file {temp_path}: {e}")
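Review note: two pieces of `extract_script_tool` are easy to check in isolation: pulling the first link out of pasted share text, and detecting that a "video" download is really an HTML page. The sketch below mirrors the regex and byte markers used above; the helper names `extract_first_url` and `looks_like_html` are hypothetical and are not part of the diff.

```python
import re
from typing import Optional

_URL_PATTERN = re.compile(r"https?://[^\s]+")


def extract_first_url(share_text: str) -> Optional[str]:
    """Return the first http(s) link in pasted share text, or None."""
    match = _URL_PATTERN.search(share_text)
    return match.group(0) if match else None


def looks_like_html(head: bytes) -> bool:
    """Detect an HTML page from the first bytes of a downloaded file."""
    return b"<!DOCTYPE html" in head or b"<html" in head
```

Because the pattern stops at the first whitespace character, any punctuation glued directly onto the link would be captured with it; share text from the apps normally puts a space after the link.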
 async def download_douyin_manual(url: str, temp_dir: Path, timestamp: int) -> Optional[Path]:
    """
    Manually download a Douyin video (fallback logic, ported from SuperIPAgent/douyinDownloader).
    Uses a specific user-profile URL and a hard-coded cookie to get past anti-scraping checks.
    """
    logger.info(f"[SuperIPAgent] Starting download for: {url}")
    try:
        # 1. 提取 Modal ID (支持短链跳转)
        headers = {
            "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
        }
        # 如果是短链或重定向
        resp = requests.get(url, headers=headers, allow_redirects=True, timeout=10)
        final_url = resp.url
        logger.info(f"[SuperIPAgent] Final URL: {final_url}")
        modal_id = None
        match = re.search(r'/video/(\d+)', final_url)
        if match:
            modal_id = match.group(1)
        if not modal_id:
            logger.error("[SuperIPAgent] Could not extract modal_id")
            return None
        logger.info(f"[SuperIPAgent] Extracted modal_id: {modal_id}")
        # 2. 构造特定请求 URL (Copy from SuperIPAgent)
        # 使用特定用户的 Profile 页 + modal_id 参数，配合特定 Cookie
        target_url = f"https://www.douyin.com/user/MS4wLjABAAAAN_s_hups7LD0N4qnrM3o2gI0vuG3pozNaEolz2_py3cHTTrpVr1Z4dukFD9SOlwY?from_tab_name=main&modal_id={modal_id}"
        # 3. 使用硬编码 Cookie (Copy from SuperIPAgent)
        headers_with_cookie = {
            "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
            "cookie": "douyin.com; device_web_cpu_core=10; device_web_memory_size=8; __ac_nonce=06760391f00b9b51264ae; __ac_signature=_02B4Z6wo00f019a5ceAAAIDAhEZR-X3jjWfWmXVAAJLXd4; ttwid=1%7C7MTKBSMsP4eOv9h5NAh8p0E-NYIud09ftNmB0mjLpWc%7C1734359327%7C8794abeabbd47447e1f56e5abc726be089f2a0344d6343b5f75f23e7b0f0028f; UIFID_TEMP=0de8750d2b188f4235dbfd208e44abbb976428f0720eb983255afefa45d39c0c6532e1d4768dd8587bf919f866ff1396912bcb2af71efee56a14a2a9f37b74010d0a0413795262f6d4afe02a032ac7ab; s_v_web_id=verify_m4r4ribr_c7krmY1z_WoeI_43po_ATpO_I4o8U1bex2D7; hevc_supported=true; home_can_add_dy_2_desktop=%220%22; dy_swidth=2560; dy_sheight=1440; stream_recommend_feed_params=%22%7B%5C%22cookie_enabled%5C%22%3Atrue%2C%5C%22screen_width%5C%22%3A2560%2C%5C%22screen_height%5C%22%3A1440%2C%5C%22browser_online%5C%22%3Atrue%2C%5C%22cpu_core_num%5C%22%3A10%2C%5C%22device_memory%5C%22%3A8%2C%5C%22downlink%5C%22%3A10%2C%5C%22effective_type%5C%22%3A%5C%224g%5C%22%2C%5C%22round_trip_time%5C%22%3A50%7D%22; strategyABtestKey=%221734359328.577%22; csrf_session_id=2f53aed9aa6974e83aa9a1014180c3a4; fpk1=U2FsdGVkX1/IpBh0qdmlKAVhGyYHgur4/VtL9AReZoeSxadXn4juKvsakahRGqjxOPytHWspYoBogyhS/V6QSw==; fpk2=0845b309c7b9b957afd9ecf775a4c21f; passport_csrf_token=d80e0c5b2fa2328219856be5ba7e671e; passport_csrf_token_default=d80e0c5b2fa2328219856be5ba7e671e; odin_tt=3c891091d2eb0f4718c1d5645bc4a0017032d4d5aa989decb729e9da2ad570918cbe5e9133dc6b145fa8c758de98efe32ff1f81aa0d611e838cc73ab08ef7d3f6adf66ab4d10e8372ddd628f94f16b8e; volume_info=%7B%22isUserMute%22%3Afalse%2C%22isMute%22%3Afalse%2C%22volume%22%3A0.5%7D; bd_ticket_guard_client_web_domain=2; FORCE_LOGIN=%7B%22videoConsumedRemainSeconds%22%3A180%7D; 
UIFID=0de8750d2b188f4235dbfd208e44abbb976428f0720eb983255afefa45d39c0c6532e1d4768dd8587bf919f866ff139655a3c2b735923234f371c699560c657923fd3d6c5b63ab7bb9b83423b6cb4787e2ce66a7fbc4ecb24c8570f520fe6de068bbb95115023c0c6c1b6ee31b49fb7e3996fb8349f43a3fd8b7a61cd9e18e8fe65eb6a7c13de4c0960d84e344b644725db3eb2fa6b7caf821de1b50527979f2; is_dash_user=1; biz_trace_id=b57a241f; bd_ticket_guard_client_data=eyJiZC10aWNrZXQtZ3VhcmQtdmVyc2lvbiI6MiwiYmQtdGlja2V0LWd1YXJkLWl0ZXJhdGlvbi12ZXJzaW9uIjoxLCJiZC10aWNrZXQtZ3VhcmQtcmVlLXB1YmxpYy1rZXkiOiJCTEo2R0lDalVoWW1XcHpGOFdrN0Vrc0dXcCtaUzNKY1g4NGNGY2k0TTl1TEowNjdUb21mbFU5aDdvWVBGamhNRWNRQWtKdnN1MnM3RmpTWnlJQXpHMjA9IiwiYmQtdGlja2V0LWd1YXJkLXdlYi12ZXJzaW9uIjoyfQ%3D%3D; download_guide=%221%2F20241216%2F0%22; sdk_source_info=7e276470716a68645a606960273f276364697660272927676c715a6d6069756077273f276364697660272927666d776a68605a607d71606b766c6a6b5a7666776c7571273f275e58272927666a6b766a69605a696c6061273f27636469766027292762696a6764695a7364776c6467696076273f275e5827292771273f273d33323131333c3036313632342778; bit_env=RiOY4jzzpxZoVCl6zdVSVhVRjdwHRTxqcqWdqMBZLPGjMdB4Tax1kAELHNTVAAh72KuhumewE4Lq6f0-VJ2UpJrkrhSxoPw9LUb3zQrq1OSwbeSPHkRlRgRQvO89sItdGUyq1oFr0XyRCnMYG87KSeWyc4x0czGR0o50hTDoDLG5rJVoRcdQOLvjiAegsqyytKF59sPX_QM9qffK2SqYsg0hCggURc_AI6kguDDE5DvG0bnyz1utw4z1eEnIoLrkGDqzqBZj4dOAr0BVU6ofbsS-pOQ2u2PM1dLP9FlBVBlVaqYVgHJeSLsR5k76BRTddUjTb4zEilVIEwAMJWGN4I1BxVt6fC9B5tBQpuT0lj3n3eKXCKXZsd8FrEs5_pbfDsxV-e_WMiXI2ff4qxiTC0U73sfo9OpicKICtZjdq8qsHxJuu6wVR36zvXeL2Wch5C6MzprNvkivv0l8nbh2mSgy1nabZr3dmU6NcR-Bg3Q3xTWUlR9aAUmpopC-cNuXjgLpT-Lw1AYGilSUnCvosth1Gfypq-b0MpgmdSDgTrQ%3D; gulu_source_res=eyJwX2luIjoiMDhjOGQ3ZTJiODQyNjZkZWI5Y2VkMGJiODNlNmY1ZWY0ZjMyNTE2ZmYyZjAzNDMzZjI0OWU1Y2Q1NTczNTk5NyJ9; passport_auth_mix_state=hp9bc3dgb1tm5wd8p82zawus27g0e3ue; IsDouyinActive=false",
            "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
        }
        logger.info("[SuperIPAgent] Requesting page with Cookie...")
        # TLS certificate verification stays at the requests default (enabled)
        response = requests.get(target_url, headers=headers_with_cookie, timeout=10)
        # 4. 解析 RENDER_DATA
        content_match = re.findall(r'<script id="RENDER_DATA" type="application/json">(.*?)</script>', response.text)
        if not content_match:
            # The page structure may have changed; fall back to SSR_HYDRATED_DATA
            if "SSR_HYDRATED_DATA" in response.text:
                content_match = re.findall(r'<script id="SSR_HYDRATED_DATA" type="application/json">(.*?)</script>', response.text)
        if not content_match:
            logger.error(f"[SuperIPAgent] Could not find RENDER_DATA in page (len={len(response.text)})")
            return None
        content = unquote(content_match[0])
        try:
            data = json.loads(content)
        except json.JSONDecodeError:
            logger.error("[SuperIPAgent] JSON decode failed")
            return None
        # 5. 提取视频流
        video_url = None
        try:
            # 路径通常是: app -> videoDetail -> video -> bitRateList -> playAddr -> src
            if "app" in data and "videoDetail" in data["app"]:
                 info = data["app"]["videoDetail"]["video"]
                 if "bitRateList" in info and info["bitRateList"]:
                     video_url = info["bitRateList"][0]["playAddr"][0]["src"]
                 elif "playAddr" in info and info["playAddr"]:
                      video_url = info["playAddr"][0]["src"]
        except Exception as e:
            logger.error(f"[SuperIPAgent] Path extraction failed: {e}")
        if not video_url:
            logger.error("[SuperIPAgent] No video_url found")
            return None
        if video_url.startswith("//"):
            video_url = "https:" + video_url
        logger.info(f"[SuperIPAgent] Found video URL: {video_url[:50]}...")
        # 6. 下载 (带 Header)
        temp_path = temp_dir / f"douyin_manual_{timestamp}.mp4"
        download_headers = {
            'Referer': 'https://www.douyin.com/',
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
        }
        dl_resp = requests.get(video_url, headers=download_headers, stream=True, timeout=60)
        if dl_resp.status_code == 200:
             with open(temp_path, 'wb') as f:
                 for chunk in dl_resp.iter_content(chunk_size=1024):
                     f.write(chunk)
             logger.info(f"[SuperIPAgent] Downloaded successfully: {temp_path}")
             return temp_path
        else:
             logger.error(f"[SuperIPAgent] Download failed: {dl_resp.status_code}")
             return None
    except Exception as e:
        logger.error(f"[SuperIPAgent] Logic failed: {e}")
        return None
 async def download_bilibili_manual(url: str, temp_dir: Path, timestamp: int) -> Optional[Path]:
    """
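    Manually download a Bilibili video (fallback logic, Playwright version).
    Bilibili usually serves audio and video as separate streams; only the audio is fetched here because only the script is needed.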
    """
    from playwright.async_api import async_playwright
    logger.info(f"[Playwright] Starting Bilibili download for: {url}")
    playwright = None
    browser = None
    try:
        playwright = await async_playwright().start()
        # Launch browser (ensure chromium is installed: playwright install chromium)
        browser = await playwright.chromium.launch(headless=True, args=['--no-sandbox', '--disable-setuid-sandbox'])
        # Mobile User Agent often gives single stream?
        # But Bilibili mobile web is tricky. Desktop is fine.
        context = await browser.new_context(
            user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
        )
        page = await context.new_page()
        # Intercept audio responses?
        # Bilibili streams are usually .m4s 
        # But finding the initial state is easier.
        logger.info("[Playwright] Navigating to Bilibili...")
        await page.goto(url, timeout=45000)
        # Wait for video element (triggers loading)
        try:
            await page.wait_for_selector('video', timeout=15000)
        except Exception:
            logger.warning("[Playwright] Video selector timeout")
        # 1. Try extracting from __playinfo__
        # window.__playinfo__ contains dash streams
        playinfo = await page.evaluate("window.__playinfo__")
        audio_url = None
        if playinfo and "data" in playinfo and "dash" in playinfo["data"]:
            dash = playinfo["data"]["dash"]
            if "audio" in dash and dash["audio"]:
                audio_url = dash["audio"][0]["baseUrl"]
                logger.info(f"[Playwright] Found audio stream in __playinfo__: {audio_url[:50]}...")
        # 2. If playinfo fails, try extracting video src (sometimes it's a blob, which we can't fetch easily without interception)
        # But interception is complex. Let's try requests with Referer if we have URL.
        if not audio_url:
            logger.warning("[Playwright] Could not find audio in __playinfo__")
            return None
        # Download the audio stream
        temp_path = temp_dir / f"bilibili_audio_{timestamp}.m4s" # usually m4s
        try:
            api_request = context.request
            headers = {
                "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
                "Referer": "https://www.bilibili.com/"
            }
            logger.info(f"[Playwright] Downloading audio stream...")
            response = await api_request.get(audio_url, headers=headers)
            if response.status == 200:
                body = await response.body()
                with open(temp_path, 'wb') as f:
                    f.write(body)
                logger.info(f"[Playwright] Downloaded successfully: {temp_path}")
                return temp_path
            else:
                logger.error(f"[Playwright] API Request failed: {response.status}")
                return None
        except Exception as e:
             logger.error(f"[Playwright] Download logic error: {e}")
             return None
    except Exception as e:
        logger.error(f"[Playwright] Bilibili download failed: {e}")
        return None
    finally:
        if browser:
            await browser.close()
        if playwright:
            await playwright.stop()
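Review note: the `__playinfo__` lookup in `download_bilibili_manual` walks `data -> dash -> audio -> [0] -> baseUrl`. A defensive version of that walk could look like the sketch below. `pick_audio_url` is a hypothetical helper, and the only assumption about the payload is the key path already used in the diff.

```python
from typing import Any, Optional


def pick_audio_url(playinfo: Any) -> Optional[str]:
    """Return the first DASH audio URL from a __playinfo__ payload, or None."""
    if not isinstance(playinfo, dict):
        return None
    # Each level may be missing or null, so fall back to an empty container.
    data = playinfo.get("data") or {}
    dash = data.get("dash") or {}
    audio_streams = dash.get("audio") or []
    if not audio_streams:
        return None
    return audio_streams[0].get("baseUrl")
```

Returning `None` for every malformed shape keeps the caller's existing "could not find audio" branch as the single failure path.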
--- a/backend/app/api/videos.py
+++ b/backend/app/api/videos.py
@@ -8,10 +8,19 @@ import traceback
 import time
 import httpx
 import os
-from app.services.tts_service import TTSService
+from app.services.tts_service import TTSService
-from app.services.video_service import VideoService
+from app.services.video_service import VideoService
-from app.services.lipsync_service import LipSyncService
+from app.services.lipsync_service import LipSyncService
-from app.services.storage import storage_service
+from app.services.voice_clone_service import voice_clone_service
 from app.services.assets_service import (
    get_style,
    get_default_style,
    resolve_bgm_path,
    prepare_style_for_remotion,
 )
 from app.services.storage import storage_service
 from app.services.whisper_service import whisper_service
 from app.services.remotion_service import remotion_service
 from app.core.config import settings
 from app.core.deps import get_current_user
@@ -21,6 +30,19 @@ class GenerateRequest(BaseModel):
    text: str
    voice: str = "zh-CN-YunxiNeural"
    material_path: str
    # Fields added for voice-clone mode
    tts_mode: str = "edgetts"  # "edgetts" | "voiceclone"
    ref_audio_id: Optional[str] = None  # storage path of the reference audio
    ref_text: Optional[str] = None  # transcript of the reference audio
    # Subtitle and title options
    title: Optional[str] = None  # video title (shown in the intro)
    enable_subtitles: bool = True  # enable word-by-word highlighted subtitles
    subtitle_style_id: Optional[str] = None  # subtitle style ID
    title_style_id: Optional[str] = None  # title style ID
    subtitle_font_size: Optional[int] = None  # subtitle font size (overrides the style)
    title_font_size: Optional[int] = None  # title font size (overrides the style)
    bgm_id: Optional[str] = None  # background music ID
    bgm_volume: Optional[float] = 0.2  # background music volume (0-1)
 tasks = {} # In-memory task store
@@ -42,15 +64,15 @@ async def _check_lipsync_ready(force: bool = False) -> bool:
    now = time.time()
    # 5分钟缓存
-    if not force and _lipsync_ready is not None and (now - _lipsync_last_check) < 300:
+    if not force and _lipsync_ready is not None and (now - _lipsync_last_check) < 300:
-        return _lipsync_ready
+        return bool(_lipsync_ready)
    lipsync = _get_lipsync_service()
    health = await lipsync.check_health()
    _lipsync_ready = health.get("ready", False)
    _lipsync_last_check = now
    print(f"[LipSync] Health check: ready={_lipsync_ready}")
-    return _lipsync_ready
+    return bool(_lipsync_ready)
 async def _download_material(path_or_url: str, temp_path: Path):
    """下载素材到临时文件 (流式下载，节省内存)"""
@@ -95,13 +117,42 @@ async def _process_video_generation(task_id: str, req: GenerateRequest, user_id:
        await _download_material(req.material_path, input_material_path)
        # 1. TTS - 进度 5% -> 25%
-        tasks[task_id]["message"] = "正在生成语音 (TTS)..."
+        tasks[task_id]["message"] = "正在生成语音..."
        tasks[task_id]["progress"] = 10
-        tts = TTSService()
+        audio_path = temp_dir / f"{task_id}_audio.wav"
        audio_path = temp_dir / f"{task_id}_audio.mp3"
        temp_files.append(audio_path)
-        await tts.generate_audio(req.text, req.voice, str(audio_path))
+
        if req.tts_mode == "voiceclone":
            # 声音克隆模式
            if not req.ref_audio_id or not req.ref_text:
                raise ValueError("声音克隆模式需要提供参考音频和参考文字")
            tasks[task_id]["message"] = "正在下载参考音频..."
            # 从 Supabase 下载参考音频
            ref_audio_local = temp_dir / f"{task_id}_ref.wav"
            temp_files.append(ref_audio_local)
            ref_audio_url = await storage_service.get_signed_url(
                bucket="ref-audios",
                path=req.ref_audio_id
            )
            await _download_material(ref_audio_url, ref_audio_local)
            tasks[task_id]["message"] = "正在克隆声音 (Qwen3-TTS)..."
            await voice_clone_service.generate_audio(
                text=req.text,
                ref_audio_path=str(ref_audio_local),
                ref_text=req.ref_text,
                output_path=str(audio_path),
                language="Chinese"
            )
        else:
            # EdgeTTS 模式 (默认)
            tasks[task_id]["message"] = "正在生成语音 (EdgeTTS)..."
            tts = TTSService()
            await tts.generate_audio(req.text, req.voice, str(audio_path))
        tts_time = time.time() - start_time
        print(f"[Pipeline] TTS completed in {tts_time:.1f}s")
@@ -133,17 +184,139 @@ async def _process_video_generation(task_id: str, req: GenerateRequest, user_id:
        lipsync_time = time.time() - lipsync_start
        print(f"[Pipeline] LipSync completed in {lipsync_time:.1f}s")
-        tasks[task_id]["progress"] = 85
+        tasks[task_id]["progress"] = 80
-        # 3. Composition - 进度 85% -> 100%
+        # 3. WhisperX 字幕对齐 - 进度 80% -> 85%
-        tasks[task_id]["message"] = "正在合成最终视频..."
+        captions_path = None
-        tasks[task_id]["progress"] = 90
+        if req.enable_subtitles:
            tasks[task_id]["message"] = "正在生成字幕 (Whisper)..."
            tasks[task_id]["progress"] = 82
            captions_path = temp_dir / f"{task_id}_captions.json"
            temp_files.append(captions_path)
            try:
                await whisper_service.align(
                    audio_path=str(audio_path),
                    text=req.text,
                    output_path=str(captions_path)
                )
                print(f"[Pipeline] Whisper alignment completed")
            except Exception as e:
                logger.warning(f"Whisper alignment failed, skipping subtitles: {e}")
                captions_path = None
        tasks[task_id]["progress"] = 85
        # 3.5 背景音乐混音（不影响唇形与字幕对齐）
        video = VideoService()
        final_audio_path = audio_path
        if req.bgm_id:
            tasks[task_id]["message"] = "正在合成背景音乐..."
            tasks[task_id]["progress"] = 86
            bgm_path = resolve_bgm_path(req.bgm_id)
            if bgm_path:
                mix_output_path = temp_dir / f"{task_id}_audio_mix.wav"
                temp_files.append(mix_output_path)
                volume = req.bgm_volume if req.bgm_volume is not None else 0.2
                volume = max(0.0, min(float(volume), 1.0))
                try:
                    video.mix_audio(
                        voice_path=str(audio_path),
                        bgm_path=str(bgm_path),
                        output_path=str(mix_output_path),
                        bgm_volume=volume
                    )
                    final_audio_path = mix_output_path
                except Exception as e:
                    logger.warning(f"BGM mix failed, fallback to voice only: {e}")
            else:
                logger.warning(f"BGM not found: {req.bgm_id}")
        # 4. Remotion composition (subtitles + title) - progress 85% -> 95%
        # Remotion is used only when there are captions or a title
        use_remotion = bool((captions_path and captions_path.exists()) or req.title)
        subtitle_style = None
        title_style = None
        if req.enable_subtitles:
            subtitle_style = get_style("subtitle", req.subtitle_style_id) or get_default_style("subtitle")
        if req.title:
            title_style = get_style("title", req.title_style_id) or get_default_style("title")
        if req.subtitle_font_size and req.enable_subtitles:
            if subtitle_style is None:
                subtitle_style = {}
            subtitle_style["font_size"] = int(req.subtitle_font_size)
        if req.title_font_size and req.title:
            if title_style is None:
                title_style = {}
            title_style["font_size"] = int(req.title_font_size)
        if use_remotion:
            subtitle_style = prepare_style_for_remotion(
                subtitle_style,
                temp_dir,
                f"{task_id}_subtitle_font"
            )
            title_style = prepare_style_for_remotion(
                title_style,
                temp_dir,
                f"{task_id}_title_font"
            )
        video = VideoService()
        final_output_local_path = temp_dir / f"{task_id}_output.mp4"
        temp_files.append(final_output_local_path)
-        await video.compose(str(lipsync_video_path), str(audio_path), str(final_output_local_path))
+        if use_remotion:
            tasks[task_id]["message"] = "正在合成视频 (Remotion)..."
            tasks[task_id]["progress"] = 87
            # 先用 FFmpeg 合成音视频（Remotion 需要带音频的视频）
            composed_video_path = temp_dir / f"{task_id}_composed.mp4"
            temp_files.append(composed_video_path)
            await video.compose(str(lipsync_video_path), str(final_audio_path), str(composed_video_path))
            # 检查 Remotion 是否可用
            remotion_health = await remotion_service.check_health()
            if remotion_health.get("ready"):
                try:
                    def on_remotion_progress(percent):
                        # 映射 Remotion 进度到 87-95%
                        mapped = 87 + int(percent * 0.08)
                        tasks[task_id]["progress"] = mapped
                    await remotion_service.render(
                        video_path=str(composed_video_path),
                        output_path=str(final_output_local_path),
                        captions_path=str(captions_path) if captions_path else None,
                        title=req.title,
                        title_duration=3.0,
                        fps=25,
                        enable_subtitles=req.enable_subtitles,
                        subtitle_style=subtitle_style,
                        title_style=title_style,
                        on_progress=on_remotion_progress
                    )
                    print(f"[Pipeline] Remotion render completed")
                except Exception as e:
                    logger.warning(f"Remotion render failed, using FFmpeg fallback: {e}")
                    # 回退到 FFmpeg 合成
                    import shutil
                    shutil.copy(str(composed_video_path), final_output_local_path)
            else:
                logger.warning(f"Remotion not ready: {remotion_health.get('error')}, using FFmpeg")
                import shutil
                shutil.copy(str(composed_video_path), final_output_local_path)
        else:
            # 不需要字幕和标题，直接用 FFmpeg 合成
            tasks[task_id]["message"] = "正在合成最终视频..."
            tasks[task_id]["progress"] = 90
            await video.compose(str(lipsync_video_path), str(final_audio_path), str(final_output_local_path))
        total_time = time.time() - start_time
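Review note: the pipeline above maps Remotion's 0-100 progress onto the 87-95% band and clamps the BGM volume to [0, 1]. The sketch below restates both rules as pure functions so the boundaries are easy to verify. `map_progress` and `clamp_volume` are hypothetical names, and clamping the incoming percent is an added assumption that the diff does not make.

```python
from typing import Optional


def map_progress(percent: float, start: int = 87, end: int = 95) -> int:
    """Map a 0-100 render percentage onto the [start, end] task-progress band."""
    percent = max(0.0, min(float(percent), 100.0))  # assumption: clamp out-of-range input
    return start + int(percent * (end - start) / 100)


def clamp_volume(volume: Optional[float], default: float = 0.2) -> float:
    """Use the default when no volume is given, then clamp to [0.0, 1.0]."""
    if volume is None:
        return default
    return max(0.0, min(float(volume), 1.0))
```

With the default band, a render that reports 50% shows up as task progress 91.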
@@ -217,6 +390,12 @@ async def lipsync_health():
    return await lipsync.check_health()
@router.get("/voiceclone/health")
 async def voiceclone_health():
    """获取声音克隆服务健康状态"""
    return await voice_clone_service.check_health()
@router.get("/generated")
 async def list_generated_videos(current_user: dict = Depends(get_current_user)):
    """从 Storage 读取当前用户生成的视频列表"""
--- a/backend/app/core/config.py
+++ b/backend/app/core/config.py
@@ -3,9 +3,10 @@ from pathlib import Path
 class Settings(BaseSettings):
    # 基础路径配置
-    BASE_DIR: Path = Path(__file__).resolve().parent.parent
+    BASE_DIR: Path = Path(__file__).resolve().parent.parent
-    UPLOAD_DIR: Path = BASE_DIR.parent / "uploads"
+    UPLOAD_DIR: Path = BASE_DIR.parent / "uploads"
-    OUTPUT_DIR: Path = BASE_DIR.parent / "outputs"
+    OUTPUT_DIR: Path = BASE_DIR.parent / "outputs"
    ASSETS_DIR: Path = BASE_DIR.parent / "assets"
    # 数据库/缓存
    REDIS_URL: str = "redis://localhost:6379/0"
@@ -22,9 +23,8 @@ class Settings(BaseSettings):
    LATENTSYNC_INFERENCE_STEPS: int = 20            # 推理步数 [20-50]
    LATENTSYNC_GUIDANCE_SCALE: float = 1.5          # 引导系数 [1.0-3.0]
    LATENTSYNC_ENABLE_DEEPCACHE: bool = True        # 启用 DeepCache 加速
    LATENTSYNC_SEED: int = 1247                     # 随机种子 (-1 则随机)
-    LATENTSYNC_USE_SERVER: bool = False             # 使用常驻服务 (Persistent Server) 加速
+    LATENTSYNC_USE_SERVER: bool = True              # 使用常驻服务 (Persistent Server) 加速
    # Supabase 配置
    SUPABASE_URL: str = ""
@@ -37,9 +37,13 @@ class Settings(BaseSettings):
    JWT_EXPIRE_HOURS: int = 24
    # 管理员配置
-    ADMIN_EMAIL: str = ""
+    ADMIN_PHONE: str = ""
    ADMIN_PASSWORD: str = ""
    # GLM AI 配置
    GLM_API_KEY: str = ""
    GLM_MODEL: str = "glm-4.7-flash"
    @property
    def LATENTSYNC_DIR(self) -> Path:
        """LatentSync 目录路径 (动态计算)"""
--- a/backend/app/main.py
+++ b/backend/app/main.py
@@ -2,7 +2,7 @@ from fastapi import FastAPI
 from fastapi.staticfiles import StaticFiles
 from fastapi.middleware.cors import CORSMiddleware
 from app.core import config
-from app.api import materials, videos, publish, login_helper, auth, admin
+from app.api import materials, videos, publish, login_helper, auth, admin, ref_audios, ai, tools, assets
 from loguru import logger
 import os
@@ -41,12 +41,14 @@ app.add_middleware(
 )
 # Create dirs
-settings.UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
+settings.UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
-settings.OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
+settings.OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
-(settings.UPLOAD_DIR / "materials").mkdir(exist_ok=True)
+(settings.UPLOAD_DIR / "materials").mkdir(exist_ok=True)
 settings.ASSETS_DIR.mkdir(parents=True, exist_ok=True)
-app.mount("/outputs", StaticFiles(directory=str(settings.OUTPUT_DIR)), name="outputs")
+app.mount("/outputs", StaticFiles(directory=str(settings.OUTPUT_DIR)), name="outputs")
-app.mount("/uploads", StaticFiles(directory=str(settings.UPLOAD_DIR)), name="uploads")
+app.mount("/uploads", StaticFiles(directory=str(settings.UPLOAD_DIR)), name="uploads")
 app.mount("/assets", StaticFiles(directory=str(settings.ASSETS_DIR)), name="assets")
 # 注册路由
 app.include_router(materials.router, prefix="/api/materials", tags=["Materials"])
@@ -55,6 +57,10 @@ app.include_router(publish.router, prefix="/api/publish", tags=["Publish"])
 app.include_router(login_helper.router, prefix="/api", tags=["LoginHelper"])
 app.include_router(auth.router)  # /api/auth
 app.include_router(admin.router)  # /api/admin
 app.include_router(ref_audios.router, prefix="/api/ref-audios", tags=["RefAudios"])
 app.include_router(ai.router)  # /api/ai
 app.include_router(tools.router, prefix="/api/tools", tags=["Tools"])
 app.include_router(assets.router, prefix="/api/assets", tags=["Assets"])
@app.on_event("startup")
@@ -62,11 +68,11 @@ async def init_admin():
    """
    服务启动时初始化管理员账号
    """
-    admin_email = settings.ADMIN_EMAIL
+    admin_phone = settings.ADMIN_PHONE
    admin_password = settings.ADMIN_PASSWORD
-    if not admin_email or not admin_password:
+    if not admin_phone or not admin_password:
-        logger.warning("未配置 ADMIN_EMAIL 和 ADMIN_PASSWORD，跳过管理员初始化")
+        logger.warning("未配置 ADMIN_PHONE 和 ADMIN_PASSWORD，跳过管理员初始化")
        return
    try:
@@ -76,15 +82,15 @@ async def init_admin():
        supabase = get_supabase()
        # 检查是否已存在
-        existing = supabase.table("users").select("id").eq("email", admin_email).execute()
+        existing = supabase.table("users").select("id").eq("phone", admin_phone).execute()
        if existing.data:
-            logger.info(f"管理员账号已存在: {admin_email}")
+            logger.info(f"管理员账号已存在: {admin_phone}")
            return
        # 创建管理员
        supabase.table("users").insert({
-            "email": admin_email,
+            "phone": admin_phone,
            "password_hash": get_password_hash(admin_password),
            "username": "Admin",
            "role": "admin",
@@ -92,7 +98,7 @@ async def init_admin():
            "expires_at": None  # 永不过期
        }).execute()
-        logger.success(f"管理员账号已创建: {admin_email}")
+        logger.success(f"管理员账号已创建: {admin_phone}")
    except Exception as e:
        logger.error(f"初始化管理员失败: {e}")
--- a/backend/app/services/assets_service.py
+++ b/backend/app/services/assets_service.py
@@ -0,0 +1,128 @@
 import json
 import shutil
 from pathlib import Path
 from typing import Optional, List, Dict, Any
 from loguru import logger
 from app.core.config import settings
 BGM_EXTENSIONS = {".wav", ".mp3", ".m4a", ".aac", ".flac", ".ogg", ".webm"}
 def _style_file_path(style_type: str) -> Path:
    return settings.ASSETS_DIR / "styles" / f"{style_type}.json"
 def _load_style_file(style_type: str) -> List[Dict[str, Any]]:
    style_path = _style_file_path(style_type)
    if not style_path.exists():
        return []
    try:
        with open(style_path, "r", encoding="utf-8") as f:
            data = json.load(f)
            if isinstance(data, list):
                return data
    except Exception as e:
        logger.error(f"Failed to load style file {style_path}: {e}")
    return []
 def list_styles(style_type: str) -> List[Dict[str, Any]]:
    return _load_style_file(style_type)
 def get_style(style_type: str, style_id: Optional[str]) -> Optional[Dict[str, Any]]:
    if not style_id:
        return None
    for item in _load_style_file(style_type):
        if item.get("id") == style_id:
            return item
    return None
 def get_default_style(style_type: str) -> Optional[Dict[str, Any]]:
    styles = _load_style_file(style_type)
    if not styles:
        return None
    for item in styles:
        if item.get("is_default"):
            return item
    return styles[0]
 def list_bgm() -> List[Dict[str, Any]]:
    bgm_root = settings.ASSETS_DIR / "bgm"
    if not bgm_root.exists():
        return []
    items: List[Dict[str, Any]] = []
    for path in bgm_root.rglob("*"):
        if not path.is_file():
            continue
        if path.suffix.lower() not in BGM_EXTENSIONS:
            continue
        rel = path.relative_to(bgm_root).as_posix()
        items.append({
            "id": rel,
            "name": path.stem,
            "ext": path.suffix.lower().lstrip(".")
        })
    items.sort(key=lambda x: x.get("name", ""))
    return items
 def resolve_bgm_path(bgm_id: str) -> Optional[Path]:
    if not bgm_id:
        return None
    bgm_root = settings.ASSETS_DIR / "bgm"
    candidate = (bgm_root / bgm_id).resolve()
    try:
        candidate.relative_to(bgm_root.resolve())
    except ValueError:
        return None
    if candidate.exists() and candidate.is_file():
        return candidate
    return None
 def prepare_style_for_remotion(
    style: Optional[Dict[str, Any]],
    temp_dir: Path,
    prefix: str
 ) -> Optional[Dict[str, Any]]:
    if not style:
        return None
    prepared = dict(style)
    font_file = prepared.get("font_file")
    if not font_file:
        return prepared
    source_font = (settings.ASSETS_DIR / "fonts" / font_file).resolve()
    try:
        source_font.relative_to((settings.ASSETS_DIR / "fonts").resolve())
    except ValueError:
        logger.warning(f"Font path outside assets: {font_file}")
        return prepared
    if not source_font.exists():
        logger.warning(f"Font file missing: {source_font}")
        return prepared
    temp_dir.mkdir(parents=True, exist_ok=True)
    ext = source_font.suffix.lower()
    target_name = f"{prefix}{ext}"
    target_path = temp_dir / target_name
    try:
        shutil.copy(source_font, target_path)
        prepared["font_file"] = target_name
        if not prepared.get("font_family"):
            prepared["font_family"] = prefix
    except Exception as e:
        logger.warning(f"Failed to copy font {source_font} -> {target_path}: {e}")
    return prepared
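Both `resolve_bgm_path` and `prepare_style_for_remotion` in the file above rely on the same containment check: resolve the candidate path, then confirm with `relative_to` that it still lies under the expected root. The following is a minimal standalone sketch of that check, not code from the repository; the helper name `resolve_under_root` is hypothetical and introduced only for illustration.

```python
from pathlib import Path
from typing import Optional


def resolve_under_root(root: Path, relative_id: str) -> Optional[Path]:
    """Return the resolved file path only if it stays inside `root`.

    Hypothetical helper, assumed for illustration; it mirrors the
    resolve() + relative_to() pattern used by resolve_bgm_path.
    """
    if not relative_id:
        return None
    root_resolved = root.resolve()
    # resolve() collapses ".." segments and follows symlinks
    candidate = (root_resolved / relative_id).resolve()
    try:
        # relative_to raises ValueError when candidate is outside root_resolved
        candidate.relative_to(root_resolved)
    except ValueError:
        return None
    return candidate if candidate.is_file() else None
```

An identifier such as `../secret.txt` resolves to a path outside the root, so `relative_to` raises and the helper returns `None` instead of a file handle to arbitrary disk locations.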
--- a/backend/app/services/glm_service.py
+++ b/backend/app/services/glm_service.py
@@ -0,0 +1,146 @@
 """
 GLM AI service
 Uses Zhipu GLM to generate titles and tags and to rewrite scripts
 """
 import json
 import re
 from loguru import logger
 from zai import ZhipuAiClient
 from app.core.config import settings
 class GLMService:
    """GLM AI 服务"""
    def __init__(self):
        self.client = None
    def _get_client(self):
        """获取或创建 ZhipuAI 客户端"""
        if self.client is None:
            if not settings.GLM_API_KEY:
                raise Exception("GLM_API_KEY 未配置")
            self.client = ZhipuAiClient(api_key=settings.GLM_API_KEY)
        return self.client
    async def generate_title_tags(self, text: str) -> dict:
        """
        Generate a title and tags from a voice-over script
        Args:
            text: Voice-over script
        Returns:
            {"title": "title", "tags": ["tag1", "tag2", ...]}
        """
        prompt = f"""根据以下口播文案，生成一个吸引人的短视频标题和3个相关标签。
 口播文案：
 {text}
 要求：
 1. 标题要简洁有力，能吸引观众点击，不超过10个字
 2. 标签要与内容相关，便于搜索和推荐，只要3个
 请严格按以下JSON格式返回（不要包含其他内容）：
 {{"title": "标题", "tags": ["标签1", "标签2", "标签3"]}}"""
        try:
            client = self._get_client()
            logger.info(f"Calling GLM API with model: {settings.GLM_MODEL}")
            response = client.chat.completions.create(
                model=settings.GLM_MODEL,
                messages=[{"role": "user", "content": prompt}],
                thinking={"type": "disabled"},  # 禁用思考模式，加快响应
                max_tokens=500,
                temperature=0.7
            )
            # 提取生成的内容
            content = response.choices[0].message.content
            logger.info(f"GLM response (model: {settings.GLM_MODEL}): {content}")
            # 解析 JSON
            result = self._parse_json_response(content)
            return result
        except Exception as e:
            logger.error(f"GLM service error: {e}")
            raise Exception(f"AI 生成失败: {e}") from e
    async def rewrite_script(self, text: str) -> str:
        """
        AI script rewriting
        Args:
            text: Original script
        Returns:
            The rewritten script
        """
        prompt = f"""请将以下视频文案进行改写。
 原始文案：
 {text}
 要求：
 1. 保持原意，但语气更加自然流畅
 2. 适合口播，读起来朗朗上口
 3. 字数与原文相当或略微精简
 4. 不要返回多余的解释，只返回改写后的正文"""
        try:
            client = self._get_client()
            logger.info("Using GLM to rewrite script")
            response = client.chat.completions.create(
                model=settings.GLM_MODEL,
                messages=[{"role": "user", "content": prompt}],
                thinking={"type": "disabled"},
                max_tokens=2000,
                temperature=0.8
            )
            content = response.choices[0].message.content
            logger.info("GLM rewrite completed")
            return content.strip()
        except Exception as e:
            logger.error(f"GLM rewrite error: {e}")
            raise Exception(f"AI 改写失败: {e}") from e
    def _parse_json_response(self, content: str) -> dict:
        """解析 GLM 返回的 JSON 内容"""
        # 尝试直接解析
        try:
            return json.loads(content)
        except json.JSONDecodeError:
            pass
        # 尝试提取 JSON 块
        json_match = re.search(r'\{[^{}]*"title"[^{}]*"tags"[^{}]*\}', content, re.DOTALL)
        if json_match:
            try:
                return json.loads(json_match.group())
            except json.JSONDecodeError:
                pass
        # 尝试提取 ```json 代码块
        code_match = re.search(r'```(?:json)?\s*(\{.*?\})\s*```', content, re.DOTALL)
        if code_match:
            try:
                return json.loads(code_match.group(1))
            except json.JSONDecodeError:
                pass
        logger.error(f"Failed to parse GLM response: {content}")
        raise Exception("AI 返回格式解析失败")
 # 全局服务实例
 glm_service = GLMService()
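`_parse_json_response` above tries three strategies in turn: parse the whole reply, extract a brace-delimited object containing `title` and `tags`, then extract a fenced code block. A compact sketch of the same layered fallback follows. It is an illustration under assumptions, not the repository's code: the function name `parse_title_tags` is hypothetical, it returns `None` instead of raising, and it tries the fenced block before the inline object.

```python
import json
import re
from typing import Optional

# Built at runtime so this example does not contain a literal code fence.
FENCE = "`" * 3
_FENCED_RE = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)
_INLINE_RE = re.compile(r'\{[^{}]*"title"[^{}]*"tags"[^{}]*\}', re.DOTALL)


def parse_title_tags(content: str) -> Optional[dict]:
    """Try progressively looser strategies; return None if all of them fail."""
    candidates = [content]  # 1. the reply is already pure JSON
    fenced = _FENCED_RE.search(content)
    if fenced:
        candidates.append(fenced.group(1))  # 2. JSON inside a fenced block
    inline = _INLINE_RE.search(content)
    if inline:
        candidates.append(inline.group())  # 3. flat JSON object inside prose
    for text in candidates:
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None
```

Collecting candidates first keeps the `json.loads` error handling in one place, which is easier to extend than three nested `try` blocks.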
--- a/backend/app/services/lipsync_service.py
+++ b/backend/app/services/lipsync_service.py
@@ -73,7 +73,51 @@ class LipSyncService:
            logger.warning(f"⚠️ Conda Python 不存在: {self.conda_python}")
            return False
        return True
-    
+
    def _get_media_duration(self, media_path: str) -> Optional[float]:
        """获取音频或视频的时长（秒）"""
        try:
            cmd = [
                "ffprobe", "-v", "error",
                "-show_entries", "format=duration",
                "-of", "default=noprint_wrappers=1:nokey=1",
                media_path
            ]
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
            if result.returncode == 0:
                return float(result.stdout.strip())
        except Exception as e:
            logger.warning(f"⚠️ 获取媒体时长失败: {e}")
        return None
    def _loop_video_to_duration(self, video_path: str, output_path: str, target_duration: float) -> str:
        """
        循环视频以匹配目标时长
        使用 FFmpeg stream_loop 实现无缝循环
        """
        try:
            cmd = [
                "ffmpeg", "-y",
                "-stream_loop", "-1",  # 无限循环
                "-i", video_path,
                "-t", str(target_duration),  # 截取到目标时长
                "-c:v", "libx264",
                "-preset", "fast",
                "-crf", "18",
                "-an",  # 去掉原音频
                output_path
            ]
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
            if result.returncode == 0 and Path(output_path).exists():
                logger.info(f"✅ 视频循环完成: {target_duration:.1f}s")
                return output_path
            else:
                logger.warning(f"⚠️ 视频循环失败: {result.stderr[:200]}")
                return video_path
        except Exception as e:
            logger.warning(f"⚠️ 视频循环异常: {e}")
            return video_path
    def _preprocess_video(self, video_path: str, output_path: str, target_height: int = 720) -> str:
        """
        视频预处理：压缩视频以加速后续处理
@@ -204,27 +248,34 @@ class LipSyncService:
        logger.info("⏳ 等待 GPU 资源 (排队中)...")
        async with self._lock:
-            if self.use_server:
+            # 使用临时目录存放中间文件
-                # 模式 A: 调用常驻服务 (加速模式)
-                return await self._call_persistent_server(video_path, audio_path, output_path)
-            logger.info("🔄 调用 LatentSync 推理 (subprocess)...")
-            # 使用临时目录存放输出
            with tempfile.TemporaryDirectory() as tmpdir:
                tmpdir = Path(tmpdir)
                # 获取音频和视频时长
                audio_duration = self._get_media_duration(audio_path)
                video_duration = self._get_media_duration(video_path)
                # 如果音频比视频长，循环视频以匹配音频长度
                if audio_duration and video_duration and audio_duration > video_duration + 0.5:
                    logger.info(f"🔄 音频({audio_duration:.1f}s) > 视频({video_duration:.1f}s)，循环视频...")
                    looped_video = tmpdir / "looped_input.mp4"
                    actual_video_path = self._loop_video_to_duration(
                        video_path,
                        str(looped_video),
                        audio_duration
                    )
                else:
                    actual_video_path = video_path
                if self.use_server:
                    # 模式 A: 调用常驻服务 (加速模式)
                    return await self._call_persistent_server(actual_video_path, audio_path, output_path)
                logger.info("🔄 调用 LatentSync 推理 (subprocess)...")
                temp_output = tmpdir / "output.mp4"
-                # 视频预处理：压缩高分辨率视频以加速处理
-                # preprocessed_video = tmpdir / "preprocessed_input.mp4"
-                # actual_video_path = self._preprocess_video(
-                #     video_path,
-                #     str(preprocessed_video),
-                #     target_height=720
-                # )
-                # 暂时禁用预处理以保持原始分辨率
-                actual_video_path = video_path
                # 构建命令
                cmd = [
                    str(self.conda_python),
@@ -285,7 +336,7 @@ class LipSyncService:
                        return output_path
                    logger.info(f"LatentSync 输出:\n{stdout_text[-500:] if stdout_text else 'N/A'}")
-                    
+
                    # 检查输出文件
                    if temp_output.exists():
                        shutil.copy(temp_output, output_path)
--- a/backend/app/services/remotion_service.py
+++ b/backend/app/services/remotion_service.py
@@ -0,0 +1,159 @@
 """
 Remotion video rendering service
 Invokes Node.js Remotion to composite the video (subtitles + title)
 """
 import asyncio
 import json
 import subprocess
 from pathlib import Path
 from typing import Optional
 from loguru import logger
 class RemotionService:
    """Remotion 视频渲染服务"""
    def __init__(self, remotion_dir: Optional[str] = None):
        # Remotion 项目目录
        if remotion_dir:
            self.remotion_dir = Path(remotion_dir)
        else:
            # 默认在 ViGent2/remotion 目录
            self.remotion_dir = Path(__file__).parent.parent.parent.parent / "remotion"
    async def render(
        self,
        video_path: str,
        output_path: str,
        captions_path: Optional[str] = None,
        title: Optional[str] = None,
        title_duration: float = 3.0,
        fps: int = 25,
        enable_subtitles: bool = True,
        subtitle_style: Optional[dict] = None,
        title_style: Optional[dict] = None,
        on_progress: Optional[callable] = None
    ) -> str:
        """
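        Render the video with Remotion (adds subtitles and a title)
        Args:
            video_path: Input video path (the lip-synced video)
            output_path: Output video path
            captions_path: Path to the captions JSON file (generated by Whisper)
            title: Video title (optional)
            title_duration: How long the title is shown, in seconds
            fps: Frame rate
            enable_subtitles: Whether to render subtitles
            subtitle_style: Subtitle style dict, passed to render.ts as JSON (optional)
            title_style: Title style dict, passed to render.ts as JSON (optional)
            on_progress: Progress callback, called with an integer percentage
        Returns:
            Output video path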
        """
        # 构建命令参数
        cmd = [
            "npx", "ts-node", "render.ts",
            "--video", str(video_path),
            "--output", str(output_path),
            "--fps", str(fps),
            "--enableSubtitles", str(enable_subtitles).lower()
        ]
        if captions_path:
            cmd.extend(["--captions", str(captions_path)])
        if title:
            cmd.extend(["--title", title])
            cmd.extend(["--titleDuration", str(title_duration)])
        if subtitle_style:
            cmd.extend(["--subtitleStyle", json.dumps(subtitle_style, ensure_ascii=False)])
        if title_style:
            cmd.extend(["--titleStyle", json.dumps(title_style, ensure_ascii=False)])
        logger.info(f"Running Remotion render: {' '.join(cmd)}")
        # 在线程池中运行子进程
        def _run_render():
            process = subprocess.Popen(
                cmd,
                cwd=str(self.remotion_dir),
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                text=True,
                bufsize=1
            )
            output_lines = []
            for line in iter(process.stdout.readline, ''):
                line = line.strip()
                if line:
                    output_lines.append(line)
                    logger.debug(f"[Remotion] {line}")
                    # 解析进度
                    if "Rendering:" in line and "%" in line:
                        try:
                            percent_str = line.split("Rendering:")[1].strip().replace("%", "")
                            percent = int(percent_str)
                            if on_progress:
                                on_progress(percent)
                        except (ValueError, IndexError):
                            pass
            process.wait()
            if process.returncode != 0:
                error_msg = "\n".join(output_lines[-20:])  # 最后 20 行
                raise RuntimeError(f"Remotion render failed (code {process.returncode}):\n{error_msg}")
            return output_path
        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(None, _run_render)
        logger.info(f"Remotion render complete: {result}")
        return result
    async def check_health(self) -> dict:
        """检查 Remotion 服务健康状态"""
        try:
            # 检查 remotion 目录是否存在
            if not self.remotion_dir.exists():
                return {
                    "ready": False,
                    "error": f"Remotion directory not found: {self.remotion_dir}"
                }
            # 检查 package.json 是否存在
            package_json = self.remotion_dir / "package.json"
            if not package_json.exists():
                return {
                    "ready": False,
                    "error": "package.json not found"
                }
            # 检查 node_modules 是否存在
            node_modules = self.remotion_dir / "node_modules"
            if not node_modules.exists():
                return {
                    "ready": False,
                    "error": "node_modules not found, run 'npm install' first"
                }
            return {
                "ready": True,
                "remotion_dir": str(self.remotion_dir)
            }
        except Exception as e:
            return {
                "ready": False,
                "error": str(e)
            }
 # 全局服务实例
 remotion_service = RemotionService()
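`RemotionService.render` above scans each output line for `Rendering: N%`, and the pipeline callback maps that percentage into the 87-95% band of overall task progress. The sketch below isolates those two steps so they can be checked on their own. The function names are hypothetical, and the regular expression is a swapped-in alternative to the `split`/`replace` parsing in the source; the line format itself is taken from that parser, not from Remotion documentation.

```python
import re
from typing import Optional

# Matches lines such as "Rendering: 42%" (format assumed from the parser above).
_PROGRESS_RE = re.compile(r"Rendering:\s*(\d{1,3})\s*%")


def parse_render_percent(line: str) -> Optional[int]:
    """Extract the render percentage from one output line, clamped to 0..100."""
    match = _PROGRESS_RE.search(line)
    if match is None:
        return None
    return max(0, min(int(match.group(1)), 100))


def map_progress(percent: int, start: int = 87, end: int = 95) -> int:
    """Map a 0..100 render percentage into the [start, end] overall band."""
    percent = max(0, min(percent, 100))
    # Integer arithmetic avoids float rounding at the band edges
    return start + (end - start) * percent // 100
```

Clamping before mapping keeps an unexpected value from pushing the task's overall progress outside the band reserved for the render stage.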
--- a/backend/app/services/storage.py
+++ b/backend/app/services/storage.py
@@ -16,6 +16,26 @@ class StorageService:
        self.supabase: Client = get_supabase()
        self.BUCKET_MATERIALS = "materials"
        self.BUCKET_OUTPUTS = "outputs"
        self.BUCKET_REF_AUDIOS = "ref-audios"
        # 确保所有 bucket 存在
        self._ensure_buckets()
    def _ensure_buckets(self):
        """确保所有必需的 bucket 存在"""
        buckets = [self.BUCKET_MATERIALS, self.BUCKET_OUTPUTS, self.BUCKET_REF_AUDIOS]
        try:
            existing = self.supabase.storage.list_buckets()
            existing_names = {b.name for b in existing} if existing else set()
            for bucket_name in buckets:
                if bucket_name not in existing_names:
                    try:
                        self.supabase.storage.create_bucket(bucket_name, options={"public": True})
                        logger.info(f"Created bucket: {bucket_name}")
                    except Exception as e:
                        # 可能已存在，忽略错误
                        logger.debug(f"Bucket {bucket_name} creation skipped: {e}")
        except Exception as e:
            logger.warning(f"Failed to ensure buckets: {e}")
    def _convert_to_public_url(self, url: str) -> str:
        """将内部 URL 转换为公网可访问的 URL"""
--- a/backend/app/services/video_service.py
+++ b/backend/app/services/video_service.py
@@ -1,9 +1,10 @@
 """
 视频合成服务
 """
-import os
+import os
-import subprocess
+import subprocess
-import json
+import json
 import shlex
 from pathlib import Path
 from loguru import logger
 from typing import Optional
@@ -12,18 +13,18 @@ class VideoService:
    def __init__(self):
        pass
-    def _run_ffmpeg(self, cmd: list) -> bool:
+    def _run_ffmpeg(self, cmd: list) -> bool:
-        cmd_str = ' '.join(f'"{c}"' if ' ' in c or '\\' in c else c for c in cmd)
+        cmd_str = ' '.join(shlex.quote(str(c)) for c in cmd)
-        logger.debug(f"FFmpeg CMD: {cmd_str}")
+        logger.debug(f"FFmpeg CMD: {cmd_str}")
-        try:
+        try:
-            # Synchronous call for BackgroundTasks compatibility
+            # Synchronous call for BackgroundTasks compatibility
-            result = subprocess.run(
+            result = subprocess.run(
-                cmd_str,
+                cmd,
-                shell=True,
+                shell=False,
-                capture_output=True,
+                capture_output=True,
-                text=True,
+                text=True,
-                encoding='utf-8',
+                encoding='utf-8',
-            )
+            )
            if result.returncode != 0:
                logger.error(f"FFmpeg Error: {result.stderr}")
                return False
@@ -32,9 +33,9 @@ class VideoService:
            logger.error(f"FFmpeg Exception: {e}")
            return False
-    def _get_duration(self, file_path: str) -> float:
+    def _get_duration(self, file_path: str) -> float:
-        # Synchronous call for BackgroundTasks compatibility
+        # Synchronous call for BackgroundTasks compatibility
-        cmd = f'ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "{file_path}"'
+        cmd = f'ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 "{file_path}"'
        try:
            result = subprocess.run(
                cmd,
@@ -44,7 +45,39 @@ class VideoService:
            )
            return float(result.stdout.strip())
        except Exception:
-            return 0.0
+            return 0.0
    def mix_audio(
        self,
        voice_path: str,
        bgm_path: str,
        output_path: str,
        bgm_volume: float = 0.2
    ) -> str:
        """混合人声与背景音乐"""
        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
        volume = max(0.0, min(float(bgm_volume), 1.0))
        filter_complex = (
            f"[0:a]volume=1.0[a0];"
            f"[1:a]volume={volume}[a1];"
            f"[a0][a1]amix=inputs=2:duration=first:dropout_transition=2:normalize=0[aout]"
        )
        cmd = [
            "ffmpeg", "-y",
            "-i", voice_path,
            "-stream_loop", "-1", "-i", bgm_path,
            "-filter_complex", filter_complex,
            "-map", "[aout]",
            "-c:a", "pcm_s16le",
            "-shortest",
            output_path,
        ]
        if self._run_ffmpeg(cmd):
            return output_path
        raise RuntimeError("FFmpeg audio mix failed")
    async def compose(
        self,
@@ -82,8 +115,15 @@ class VideoService:
        # Previous state: subtitles disabled due to font issues
        # if subtitle_path: ...
-        # Audio map
+        # Audio map with high quality encoding
-        cmd.extend(["-c:v", "libx264", "-c:a", "aac", "-shortest"])
+        cmd.extend([
            "-c:v", "libx264",
            "-preset", "slow",      # 慢速预设，更好的压缩效率
            "-crf", "18",           # 高质量（与 LatentSync 一致）
            "-c:a", "aac",
            "-b:a", "192k",         # 音频比特率
            "-shortest"
        ])
        # Use audio from input 1
        cmd.extend(["-map", "0:v", "-map", "1:a"])
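The `_run_ffmpeg` change in this file replaces a hand-quoted command string run with `shell=True` by an argument list run with `shell=False`, and keeps quoting only for the log line. A small sketch of why that matters follows. The helper name `run_command` is hypothetical, and `shlex.join` (Python 3.8+) stands in for the per-argument `shlex.quote` join used in the diff.

```python
import shlex
import subprocess
from typing import List, Tuple


def run_command(cmd: List[str]) -> Tuple[int, str, str]:
    """Run `cmd` without a shell; return (returncode, stdout, loggable string)."""
    # Quoting is applied only to the human-readable log form
    printable = shlex.join(cmd)
    # With shell=False every list element reaches the program as one argument,
    # so spaces, quotes, or ';' in a file name are never interpreted by a shell
    result = subprocess.run(cmd, shell=False, capture_output=True, text=True)
    return result.returncode, result.stdout, printable
```

For example, passing an argument such as `a b; echo hacked` delivers that exact string to the child process, whereas the earlier string-based call would have depended on the manual quoting being correct.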
--- a/backend/app/services/voice_clone_service.py
+++ b/backend/app/services/voice_clone_service.py
@@ -0,0 +1,115 @@
 """
 Voice cloning service
 Calls the standalone Qwen3-TTS service over HTTP (port 8009)
 """
 import httpx
 import asyncio
 from pathlib import Path
 from typing import Optional
 from loguru import logger
 from app.core.config import settings
 # Qwen3-TTS 服务地址
 QWEN_TTS_URL = "http://localhost:8009"
 class VoiceCloneService:
    """声音克隆服务 - 调用 Qwen3-TTS HTTP API"""
    def __init__(self):
        self.base_url = QWEN_TTS_URL
        # 健康状态缓存
        self._health_cache: Optional[dict] = None
        self._health_cache_time: float = 0
        # GPU 并发锁 (Serial Queue)
        self._lock = asyncio.Lock()
    async def generate_audio(
        self,
        text: str,
        ref_audio_path: str,
        ref_text: str,
        output_path: str,
        language: str = "Chinese"
    ) -> str:
        """
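        Generate speech with voice cloning
        Args:
            text: Text to synthesize
            ref_audio_path: Local path of the reference audio
            ref_text: Transcript of the reference audio
            output_path: Output WAV path
            language: Language (Chinese/English/Auto)
        Returns:
            Output file path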
        """
        # 使用锁确保串行执行，避免 GPU 显存溢出
        async with self._lock:
            logger.info(f"🎤 Voice Clone: {text[:30]}...")
            Path(output_path).parent.mkdir(parents=True, exist_ok=True)
            # 读取参考音频
            with open(ref_audio_path, "rb") as f:
                ref_audio_data = f.read()
            # 调用 Qwen3-TTS 服务
            timeout = httpx.Timeout(300.0)  # 5分钟超时
            async with httpx.AsyncClient(timeout=timeout) as client:
                try:
                    response = await client.post(
                        f"{self.base_url}/generate",
                        files={"ref_audio": ("ref.wav", ref_audio_data, "audio/wav")},
                        data={
                            "text": text,
                            "ref_text": ref_text,
                            "language": language
                        }
                    )
                    response.raise_for_status()
                    # 保存返回的音频
                    with open(output_path, "wb") as f:
                        f.write(response.content)
                    logger.info(f"✅ Voice clone saved: {output_path}")
                    return output_path
                except httpx.HTTPStatusError as e:
                    logger.error(f"Qwen3-TTS API error: {e.response.status_code} - {e.response.text}")
                    raise RuntimeError(f"声音克隆服务错误: {e.response.text}")
                except httpx.RequestError as e:
                    logger.error(f"Qwen3-TTS connection error: {e}")
                    raise RuntimeError("无法连接声音克隆服务，请检查服务是否启动")
    async def check_health(self) -> dict:
        """健康检查"""
        import time
        # 5分钟缓存
        now = time.time()
        if self._health_cache and (now - self._health_cache_time) < 300:
            return self._health_cache
        try:
            async with httpx.AsyncClient(timeout=5.0) as client:
                response = await client.get(f"{self.base_url}/health")
                response.raise_for_status()
                self._health_cache = response.json()
                self._health_cache_time = now
                return self._health_cache
        except Exception as e:
            logger.warning(f"Qwen3-TTS health check failed: {e}")
            return {
                "service": "Qwen3-TTS Voice Clone",
                "model": "0.6B-Base",
                "ready": False,
                "gpu_id": 0,
                "error": str(e)
            }
 # 单例
 voice_clone_service = VoiceCloneService()
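`VoiceCloneService.check_health` above caches the last successful health response for five minutes and only queries the Qwen3-TTS service again once that window has passed. A minimal sketch of such a time-to-live cache follows. The class name `TTLCache` is hypothetical, and it uses `time.monotonic` with an injectable clock rather than the `time.time` call in the source, purely to make the expiry logic easy to test.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Hold one value and treat it as missing once it is older than ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._value: Optional[dict] = None
        self._stored_at = 0.0

    def get(self) -> Optional[dict]:
        """Return the cached value, or None if nothing is stored or it expired."""
        if self._value is None:
            return None
        if self._clock() - self._stored_at >= self._ttl:
            self._value = None  # expired: drop it so the caller refreshes
            return None
        return self._value

    def set(self, value: dict) -> None:
        """Store a value and record when it was stored."""
        self._value = value
        self._stored_at = self._clock()
```

One consequence of this design, shared with the source, is that a service that goes down shortly after a successful check can still be reported as healthy until the cached entry expires.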
--- a/backend/app/services/whisper_service.py
+++ b/backend/app/services/whisper_service.py
@@ -0,0 +1,288 @@
 """
 Subtitle alignment service
 Uses faster-whisper to generate character-level timestamps
 """
 import json
 import re
 from pathlib import Path
 from typing import Optional, List
 from loguru import logger
 # 模型缓存
 _whisper_model = None
 # 断句标点
 SENTENCE_PUNCTUATION = set('。！？，、；：,.!?;:')
 # 每行最大字数
 MAX_CHARS_PER_LINE = 12
 def split_word_to_chars(word: str, start: float, end: float) -> list:
    """
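    Split a word into tokens with linearly interpolated timestamps.
    Runs of ASCII letters and digits stay together as one token; every other non-whitespace character becomes its own token.
    Args:
        word: Word text
        start: Word start time
        end: Word end time
    Returns:
        List of tokens, each a dict with word/start/end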
    """
    tokens = []
    ascii_buffer = ""
    for char in word:
        if not char.strip():
            continue
        if char.isascii() and char.isalnum():
            ascii_buffer += char
            continue
        if ascii_buffer:
            tokens.append(ascii_buffer)
            ascii_buffer = ""
        tokens.append(char)
    if ascii_buffer:
        tokens.append(ascii_buffer)
    if not tokens:
        return []
    if len(tokens) == 1:
        return [{"word": tokens[0], "start": start, "end": end}]
    # 线性插值时间戳
    duration = end - start
    token_duration = duration / len(tokens)
    result = []
    for i, token in enumerate(tokens):
        token_start = start + i * token_duration
        token_end = start + (i + 1) * token_duration
        result.append({
            "word": token,
            "start": round(token_start, 3),
            "end": round(token_end, 3)
        })
    return result
 def split_segment_to_lines(words: List[dict], max_chars: int = MAX_CHARS_PER_LINE) -> List[dict]:
    """
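    Split a long segment into multiple lines by punctuation and character count.
    Args:
        words: list of characters, each a dict with word/start/end
        max_chars: maximum number of characters per line
    Returns:
        List of segments after splitting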
    """
    if not words:
        return []
    segments = []
    current_words = []
    current_text = ""
    for word_info in words:
        char = word_info["word"]
        current_words.append(word_info)
        current_text += char
        # Decide whether to break the line here
        should_break = False
        # 1. Hit a line-breaking punctuation mark
        if char in SENTENCE_PUNCTUATION:
            should_break = True
        # 2. Reached the maximum line length
        elif len(current_text) >= max_chars:
            should_break = True
        if should_break and current_words:
            segments.append({
                "text": current_text,
                "start": current_words[0]["start"],
                "end": current_words[-1]["end"],
                "words": current_words.copy()
            })
            current_words = []
            current_text = ""
    # Flush any remaining characters
    if current_words:
        segments.append({
            "text": current_text,
            "start": current_words[0]["start"],
            "end": current_words[-1]["end"],
            "words": current_words.copy()
        })
    return segments
 class WhisperService:
    """Subtitle alignment service (based on faster-whisper)."""
    def __init__(
        self,
        model_size: str = "large-v3",
        device: str = "cuda",
        compute_type: str = "float16",
    ):
        self.model_size = model_size
        self.device = device
        self.compute_type = compute_type
    def _load_model(self):
        """Lazily load the faster-whisper model."""
        global _whisper_model
        if _whisper_model is None:
            from faster_whisper import WhisperModel
            logger.info(f"Loading faster-whisper model: {self.model_size} on {self.device}")
            _whisper_model = WhisperModel(
                self.model_size,
                device=self.device,
                compute_type=self.compute_type
            )
            logger.info("faster-whisper model loaded")
        return _whisper_model
    async def align(
        self,
        audio_path: str,
        text: str,
        output_path: Optional[str] = None
    ) -> dict:
        """
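        Transcribe the audio and generate character-level timestamps.
        Args:
            audio_path: path to the audio file
            text: original script (reference only; the whisper transcription is what is actually used)
            output_path: optional path for the output JSON file
        Returns:
            Dict containing character-level timestamps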
        """
        import asyncio
        def _do_transcribe():
            model = self._load_model()
            logger.info(f"Transcribing audio: {audio_path}")
            # Transcribe with word-level timestamps
            segments_iter, info = model.transcribe(
                audio_path,
                language="zh",
                word_timestamps=True,  # enable word-level timestamps
                vad_filter=True,  # use VAD to filter out silence
            )
            logger.info(f"Detected language: {info.language} (prob: {info.language_probability:.2f})")
            all_segments = []
            for segment in segments_iter:
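                # Collect each word's timestamps and split the words into characters
                all_words = []
                if segment.words:
                    for word_info in segment.words:
                        word_text = word_info.word.strip()
                        if word_text:
                            # Split the word into characters; timestamps are linearly interpolated
                            chars = split_word_to_chars(
                                word_text,
                                word_info.start,
                                word_info.end
                            )
                            all_words.extend(chars)
                # Split long segments into lines by punctuation and character count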
                if all_words:
                    line_segments = split_segment_to_lines(all_words, MAX_CHARS_PER_LINE)
                    all_segments.extend(line_segments)
            logger.info(f"Generated {len(all_segments)} subtitle segments")
            return {"segments": all_segments}
        # Run the blocking transcription in the default thread pool
        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(None, _do_transcribe)
        # Save to file
        if output_path:
            output_file = Path(output_path)
            output_file.parent.mkdir(parents=True, exist_ok=True)
            with open(output_file, "w", encoding="utf-8") as f:
                json.dump(result, f, ensure_ascii=False, indent=2)
            logger.info(f"Captions saved to: {output_path}")
        return result
    async def transcribe(self, audio_path: str) -> str:
        """
        Transcribe text only (used for script extraction).
        Args:
            audio_path: path to the audio/video file
        Returns:
            Plain text content
        """
        import asyncio
        def _do_transcribe_text():
            model = self._load_model()
            logger.info(f"Extracting script from: {audio_path}")
            # Transcribe (character-level timestamps not needed)
            segments_iter, _ = model.transcribe(
                audio_path,
                language="zh",
                word_timestamps=False,
                vad_filter=True,
            )
            text_parts = []
            for segment in segments_iter:
                text_parts.append(segment.text.strip())
            full_text = " ".join(text_parts)
            logger.info(f"Extracted text length: {len(full_text)}")
            return full_text
        # Run the blocking transcription in the default thread pool
        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(None, _do_transcribe_text)
        return result
    async def check_health(self) -> dict:
        """Check service health (only verifies that faster-whisper can be imported)."""
        try:
            from faster_whisper import WhisperModel  # noqa: F401  (import check only)
            return {
                "ready": True,
                "model_size": self.model_size,
                "device": self.device,
                "backend": "faster-whisper"
            }
        except ImportError:
            return {
                "ready": False,
                "error": "faster-whisper not installed"
            }
 # Global service instance
 whisper_service = WhisperService()
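The timestamp interpolation used by `split_word_to_chars` can be illustrated in isolation. The sketch below assumes the tokens are already split and gives each one an equal share of the word's duration. `interpolate_tokens` and the sample values are hypothetical and are not part of the service.

```python
from typing import Dict, List


def interpolate_tokens(tokens: List[str], start: float, end: float) -> List[Dict]:
    """Give each token an equal share of the [start, end] interval."""
    if not tokens:
        return []
    step = (end - start) / len(tokens)
    return [
        {
            "word": token,
            "start": round(start + i * step, 3),
            "end": round(start + (i + 1) * step, 3),
        }
        for i, token in enumerate(tokens)
    ]


# Hypothetical input: a three-character word spoken from 1.0s to 1.6s.
timed = interpolate_tokens(["你", "好", "吗"], 1.0, 1.6)
for item in timed:
    print(item)
```

Equal shares are only an approximation: characters are rarely spoken at a uniform rate, so highlighting inside a long word may drift slightly from the audio.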
--- a/backend/assets/styles/subtitle.json
+++ b/backend/assets/styles/subtitle.json
@@ -0,0 +1,58 @@
 [
  {
    "id": "subtitle_classic_yellow",
    "label": "经典黄字",
    "font_file": "title/思源黑体/SourceHanSansCN-Bold思源黑体免费.otf",
    "font_family": "SourceHanSansCN-Bold",
    "font_size": 52,
    "highlight_color": "#FFE600",
    "normal_color": "#FFFFFF",
    "stroke_color": "#000000",
    "stroke_size": 3,
    "letter_spacing": 2,
    "bottom_margin": 80,
    "is_default": true
  },
  {
    "id": "subtitle_cyan",
    "label": "清爽青蓝",
    "font_file": "DingTalk Sans.ttf",
    "font_family": "DingTalkSans",
    "font_size": 48,
    "highlight_color": "#00E5FF",
    "normal_color": "#FFFFFF",
    "stroke_color": "#000000",
    "stroke_size": 3,
    "letter_spacing": 1,
    "bottom_margin": 76,
    "is_default": false
  },
  {
    "id": "subtitle_orange",
    "label": "活力橙",
    "font_file": "simhei.ttf",
    "font_family": "SimHei",
    "font_size": 50,
    "highlight_color": "#FF8A00",
    "normal_color": "#FFFFFF",
    "stroke_color": "#000000",
    "stroke_size": 3,
    "letter_spacing": 2,
    "bottom_margin": 80,
    "is_default": false
  },
  {
    "id": "subtitle_clean_white",
    "label": "纯白轻描",
    "font_file": "DingTalk JinBuTi.ttf",
    "font_family": "DingTalkJinBuTi",
    "font_size": 46,
    "highlight_color": "#FFFFFF",
    "normal_color": "#FFFFFF",
    "stroke_color": "#111111",
    "stroke_size": 2,
    "letter_spacing": 1,
    "bottom_margin": 72,
    "is_default": false
  }
 ]
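Each style entry carries an `is_default` flag. One way a consumer might pick the default entry is sketched below. `pick_default_style`, the fallback to the first entry, and the trimmed sample data are assumptions for illustration; the backend's actual loader is not shown in this diff.

```python
import json
from typing import Dict, List, Optional


def pick_default_style(styles: List[Dict]) -> Optional[Dict]:
    """Return the first style flagged is_default, else the first style, else None."""
    for style in styles:
        if style.get("is_default"):
            return style
    return styles[0] if styles else None


# Trimmed sample in the same shape as subtitle.json (only a few fields kept).
sample = json.loads(
    """[
      {"id": "subtitle_classic_yellow", "font_size": 52, "is_default": true},
      {"id": "subtitle_cyan", "font_size": 48, "is_default": false}
    ]"""
)
chosen = pick_default_style(sample)
print(chosen["id"])
```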
--- a/backend/assets/styles/title.json
+++ b/backend/assets/styles/title.json
@@ -0,0 +1,58 @@
 [
  {
    "id": "title_bold_white",
    "label": "黑体大标题",
    "font_file": "title/思源黑体/SourceHanSansCN-Heavy思源黑体免费.otf",
    "font_family": "SourceHanSansCN-Heavy",
    "font_size": 72,
    "color": "#FFFFFF",
    "stroke_color": "#000000",
    "stroke_size": 8,
    "letter_spacing": 4,
    "top_margin": 60,
    "font_weight": 900,
    "is_default": true
  },
  {
    "id": "title_serif_gold",
    "label": "宋体金色",
    "font_file": "title/思源宋体/SourceHanSerifCN-SemiBold思源宋体免费.otf",
    "font_family": "SourceHanSerifCN-SemiBold",
    "font_size": 70,
    "color": "#FDE68A",
    "stroke_color": "#2B1B00",
    "stroke_size": 8,
    "letter_spacing": 3,
    "top_margin": 58,
    "font_weight": 800,
    "is_default": false
  },
  {
    "id": "title_douyin",
    "label": "抖音活力",
    "font_file": "title/抖音美好体开源.otf",
    "font_family": "DouyinMeiHao",
    "font_size": 72,
    "color": "#FFFFFF",
    "stroke_color": "#1F0A00",
    "stroke_size": 8,
    "letter_spacing": 4,
    "top_margin": 60,
    "font_weight": 900,
    "is_default": false
  },
  {
    "id": "title_pop",
    "label": "站酷快乐体",
    "font_file": "title/站酷快乐体.ttf",
    "font_family": "ZCoolHappy",
    "font_size": 74,
    "color": "#FFFFFF",
    "stroke_color": "#000000",
    "stroke_size": 8,
    "letter_spacing": 5,
    "top_margin": 62,
    "font_weight": 900,
    "is_default": false
  }
 ]
--- a/backend/database/migrate_to_phone.sql
+++ b/backend/database/migrate_to_phone.sql
@@ -0,0 +1,88 @@
 -- ============================================================
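 -- ViGent phone-number login migration script
 -- Replaces the email column with a phone column
 -- 
 -- How to run (choose one):
 -- 1. Supabase Studio: open https://supabase.hbyrkj.top -> SQL Editor -> paste and run
 -- 2. Docker command: docker exec -i supabase-db psql -U postgres < migrate_to_phone.sql
 -- ============================================================
 -- WARNING: this script deletes all existing user data!
 -- Back up first if you need to keep the data
 -- 1. Drop dependent tables (they have foreign key constraints)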
 DROP TABLE IF EXISTS user_sessions CASCADE;
 DROP TABLE IF EXISTS social_accounts CASCADE;
 -- 2. Drop the users table
 DROP TABLE IF EXISTS users CASCADE;
 -- 3. Recreate the users table (with a phone column)
 CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    phone TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    username TEXT,
    role TEXT DEFAULT 'pending' CHECK (role IN ('pending', 'user', 'admin')),
    is_active BOOLEAN DEFAULT FALSE,
    expires_at TIMESTAMP WITH TIME ZONE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
 );
 -- 4. Recreate the user_sessions table
 CREATE TABLE user_sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE UNIQUE,
    session_token TEXT UNIQUE NOT NULL,
    device_info TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
 );
 -- 5. Recreate the social_accounts table
 CREATE TABLE social_accounts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    platform TEXT NOT NULL CHECK (platform IN ('bilibili', 'douyin', 'xiaohongshu')),
    logged_in BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(user_id, platform)
 );
 -- 6. Create indexes
 CREATE INDEX idx_users_phone ON users(phone);
 CREATE INDEX idx_sessions_user_id ON user_sessions(user_id);
 CREATE INDEX idx_social_user_platform ON social_accounts(user_id, platform);
 -- 7. Enable RLS
 ALTER TABLE users ENABLE ROW LEVEL SECURITY;
 ALTER TABLE user_sessions ENABLE ROW LEVEL SECURITY;
 ALTER TABLE social_accounts ENABLE ROW LEVEL SECURITY;
 -- 8. Create RLS policies
 CREATE POLICY "Users can view own profile" ON users
    FOR SELECT USING (auth.uid()::text = id::text);
 CREATE POLICY "Users can access own sessions" ON user_sessions
    FOR ALL USING (user_id::text = auth.uid()::text);
 CREATE POLICY "Users can access own social accounts" ON social_accounts
    FOR ALL USING (user_id::text = auth.uid()::text);
 -- 9. Trigger that maintains updated_at
 CREATE OR REPLACE FUNCTION update_updated_at()
 RETURNS TRIGGER AS $$
 BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
 END;
 $$ LANGUAGE plpgsql;
 DROP TRIGGER IF EXISTS users_updated_at ON users;
 CREATE TRIGGER users_updated_at
    BEFORE UPDATE ON users
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at();
 -- Done!
 -- The admin account is created automatically when the backend service restarts (15549380526)
--- a/backend/database/schema.sql
+++ b/backend/database/schema.sql
@@ -4,7 +4,7 @@
 -- 1. Create the users table
 CREATE TABLE IF NOT EXISTS users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-    email TEXT UNIQUE NOT NULL,
+    phone TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    username TEXT,
    role TEXT DEFAULT 'pending' CHECK (role IN ('pending', 'user', 'admin')),
@@ -34,7 +34,7 @@ CREATE TABLE IF NOT EXISTS social_accounts (
 );
 -- 4. Create indexes
-CREATE INDEX IF NOT EXISTS idx_users_email ON users(email);
+CREATE INDEX IF NOT EXISTS idx_users_phone ON users(phone);
 CREATE INDEX IF NOT EXISTS idx_sessions_user_id ON user_sessions(user_id);
 CREATE INDEX IF NOT EXISTS idx_social_user_platform ON social_accounts(user_id, platform);
--- a/backend/requirements.txt
+++ b/backend/requirements.txt
@@ -28,3 +28,10 @@ supabase>=2.0.0
 python-jose[cryptography]>=3.3.0
 passlib[bcrypt]>=1.7.4
 bcrypt==4.0.1
 # Subtitle alignment
 faster-whisper>=1.0.0
 # Script extraction and AI generation
 yt-dlp>=2023.0.0
 zai-sdk>=0.2.0
--- a/backend/scripts/watchdog.py
+++ b/backend/scripts/watchdog.py
@@ -0,0 +1,84 @@
 import asyncio
 import httpx
 import logging
 import subprocess
 # Configure logging
 logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler("watchdog.log"),
        logging.StreamHandler()
    ]
 )
 logger = logging.getLogger("Watchdog")
 # Service configuration
 SERVICES = [
    {
        "name": "vigent2-qwen-tts",
        "url": "http://localhost:8009/health",
        "failures": 0,
        "threshold": 3,
        "timeout": 10.0,
        "restart_cmd": ["pm2", "restart", "vigent2-qwen-tts"]
    }
 ]
 async def check_service(service):
    """Check the health of a single service."""
    try:
        timeout = service.get("timeout", 10.0)
        async with httpx.AsyncClient(timeout=timeout) as client:
            response = await client.get(service["url"])
            if response.status_code == 200:
                # Success
                if service["failures"] > 0:
                    logger.info(f"✅ 服务 {service['name']} 已恢复正常")
                service["failures"] = 0
                return True
            else:
                logger.warning(f"⚠️ 服务 {service['name']} 返回状态码 {response.status_code}")
    except Exception as e:
        logger.warning(f"⚠️ 无法连接服务 {service['name']}: {str(e)}")
    # Handle failure
    service["failures"] += 1
    logger.warning(f"❌ 服务 {service['name']} 连续失败 {service['failures']}/{service['threshold']} 次")
    if service["failures"] >= service['threshold']:
        logger.error(f"🚨 服务 {service['name']} 已达到失败阈值，正在重启...")
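        try:
            # Run the blocking restart command in a worker thread so the event loop stays responsive
            await asyncio.to_thread(subprocess.run, service["restart_cmd"], check=True)
            logger.info(f"♻️ 服务 {service['name']} 重启命令已发送")
            # Reset the counter: the service must fail `threshold` more checks before the next
            # restart, which acts as the startup grace period (no separate wait is implemented)
            service["failures"] = 0
            return "restarting"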
        except Exception as restart_error:
            logger.error(f"💥 重启服务 {service['name']} 失败: {restart_error}")
    return False
 async def main():
    logger.info("🛡️ ViGent2 服务看门狗 (Watchdog) 已启动")
    while True:
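        # Check each service in turn (checks run sequentially, not concurrently)
        for service in SERVICES:
            result = await check_service(service)
            if result == "restarting":
                # Placeholder: no extra wait is applied after a restart yet
                pass
        # Check every 30 seconds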
        await asyncio.sleep(30)
 if __name__ == "__main__":
    try:
        asyncio.run(main())
    except KeyboardInterrupt:
        logger.info("🛑 看门狗已停止")
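The main loop above awaits each service check one after another. If more services are added, the checks could be run concurrently with `asyncio.gather`. The following is a minimal sketch under that assumption. `check_all`, `fake_check`, and the demo service dicts are hypothetical stand-ins that avoid any network access.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def check_all(
    services: List[Dict],
    check: Callable[[Dict], Awaitable[object]],
) -> Dict[str, object]:
    """Run one health check per service concurrently and map name -> result."""
    # gather() returns results in the same order as the awaitables passed in.
    results = await asyncio.gather(*(check(s) for s in services))
    return {s["name"]: r for s, r in zip(services, results)}


async def fake_check(service: Dict) -> bool:
    """Stand-in for check_service(): pretends the probe takes 0.05s."""
    await asyncio.sleep(0.05)
    return service["healthy"]


demo_services = [
    {"name": "svc-a", "healthy": True},
    {"name": "svc-b", "healthy": False},
]
demo_results = asyncio.run(check_all(demo_services, fake_check))
print(demo_results)
```

With concurrent checks, one slow or unreachable service no longer delays the probes of the others, so the total time per round is bounded by the slowest single check.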
--- a/frontend/README.md
+++ b/frontend/README.md
@@ -7,8 +7,10 @@ ViGent2 的前端界面，采用 Next.js 14 + TailwindCSS 构建。
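 ### 1. Video Generation (`/`)
 - **Material management**: Drag and drop to upload presenter videos, with live preview.
 - **Script voice-over**: Integrates EdgeTTS with multiple voices (Yunxi / Xiaoxiao).
 - **AI titles/tags**: Generate video titles and tags in one click (Day 14).
 - **Progress tracking**: Shows video generation progress in real time (10% -> 100%).
 - **Result preview**: Play or download the video as soon as generation finishes.
 - **Local persistence**: Script and title are saved automatically and restored after a refresh (Day 14).
 ### 2. Fully Automated Publishing (`/publish`) [Added Day 7]
 - **Multi-platform management**: Manage Bilibili, Douyin, and Xiaohongshu account status in one place.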
@@ -19,13 +21,34 @@ ViGent2 的前端界面，采用 Next.js 14 + TailwindCSS 构建。
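 - **Publish settings**: Set the video title, tags, and description.
 - **Scheduling**: Supports "publish now" or "scheduled publish".
 ### 3. Voice Cloning [Added Day 13]
 - **TTS mode selection**: Switch between EdgeTTS (preset voices) and voice cloning (custom voices).
 - **Reference audio management**: Upload, list, and delete reference audio (3-20 second WAV).
 - **One-click cloning**: After a reference audio is selected, the Qwen3-TTS service is called automatically.
 ### 4. Subtitles and Titles [Added Day 13]
 - **Opening title**: Optional input; the title is shown for 3 seconds at the start of the video with a fade in and out.
 - **Per-character highlighted subtitles**: Karaoke-style effect, on by default, can be turned off.
 - **Automatic alignment**: Character-level timestamps generated with faster-whisper.
 ### 5. Account Settings [Added Day 15]
 - **Phone-number login**: Login validated against an 11-digit mainland China phone number.
 - **Account dropdown menu**: Shows the expiry date, plus change password and secure sign-out.
 - **Change password**: A dialog takes the current and new passwords; a fresh login is required afterwards.
 ### 6. Script Extraction Assistant (`ScriptExtractionModal`) [Added Day 15]
 - **Multiple sources**: Supports drag-and-drop file upload and pasted URLs (Bilibili/Douyin/TikTok).
 - **AI rewriting**: Integrates GLM-4.7-Flash to rewrite the text into a spoken script automatically.
 - **One-click fill**: The extracted result is filled straight into the video generation input box.
 - **Smart interaction**: Real-time progress display and accidental-click protection.
 ## 🛠️ Tech Stack
 - **Framework**: Next.js 14 (App Router)
 - **Styling**: TailwindCSS
 - **Icons**: Lucide React
 - **Components**: Custom modern components (Glassmorphism style)
- **API**: Fetch API (talks to the FastAPI backend on :8006)
+- **API**: Axios instance `@/lib/axios` (talks to the FastAPI backend on :8006)
 ## 🚀 Development Guide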
--- a/frontend/next.config.ts
+++ b/frontend/next.config.ts
@@ -16,6 +16,10 @@ const nextConfig: NextConfig = {
        source: '/outputs/:path*',
        destination: 'http://localhost:8006/outputs/:path*',  // proxy generated videos
      },
      {
        source: '/assets/:path*',
        destination: 'http://localhost:8006/assets/:path*',  // proxy static assets (fonts/music)
      },
    ];
  },
 };
--- a/frontend/package-lock.json
+++ b/frontend/package-lock.json
@@ -10,6 +10,7 @@
      "dependencies": {
        "@supabase/supabase-js": "^2.93.1",
        "axios": "^1.13.4",
        "lucide-react": "^0.563.0",
        "next": "16.1.1",
        "react": "19.2.3",
        "react-dom": "19.2.3",
@@ -5000,6 +5001,15 @@
        "yallist": "^3.0.2"
      }
    },
    "node_modules/lucide-react": {
      "version": "0.563.0",
      "resolved": "https://registry.npmjs.org/lucide-react/-/lucide-react-0.563.0.tgz",
      "integrity": "sha512-8dXPB2GI4dI8jV4MgUDGBeLdGk8ekfqVZ0BdLcrRzocGgG75ltNEmWS+gE7uokKF/0oSUuczNDT+g9hFJ23FkA==",
      "license": "ISC",
      "peerDependencies": {
        "react": "^16.5.1 || ^17.0.0 || ^18.0.0 || ^19.0.0"
      }
    },
    "node_modules/magic-string": {
      "version": "0.30.21",
      "resolved": "https://registry.npmjs.org/magic-string/-/magic-string-0.30.21.tgz",
--- a/frontend/package.json
+++ b/frontend/package.json
@@ -11,6 +11,7 @@
  "dependencies": {
    "@supabase/supabase-js": "^2.93.1",
    "axios": "^1.13.4",
    "lucide-react": "^0.563.0",
    "next": "16.1.1",
    "react": "19.2.3",
    "react-dom": "19.2.3",
--- a/frontend/src/app/admin/page.tsx
+++ b/frontend/src/app/admin/page.tsx
@@ -7,7 +7,7 @@ import api from '@/lib/axios';
 interface UserListItem {
    id: string;
-    email: string;
+    phone: string;
    username: string | null;
    role: string;
    is_active: boolean;
@@ -144,8 +144,8 @@ export default function AdminPage() {
                                <tr key={user.id} className="hover:bg-white/5">
                                    <td className="px-6 py-4">
                                        <div>
-                                            <div className="text-white font-medium">{user.username || user.email.split('@')[0]}</div>
+                                            <div className="text-white font-medium">{user.username || `用户${user.phone.slice(-4)}`}</div>
-                                            <div className="text-gray-400 text-sm">{user.email}</div>
+                                            <div className="text-gray-400 text-sm">{user.phone}</div>
                                        </div>
                                    </td>
                                    <td className="px-6 py-4">
--- a/frontend/src/app/globals.css
+++ b/frontend/src/app/globals.css
@@ -38,6 +38,7 @@ body {
  font-family: Arial, Helvetica, sans-serif;
  padding-top: env(safe-area-inset-top);
  padding-bottom: env(safe-area-inset-bottom);
  background: linear-gradient(to bottom, #0f172a 0%, #0f172a 5%, #581c87 50%, #0f172a 95%, #0f172a 100%);
 }
 /* Custom scrollbar styles - dark theme */
--- a/frontend/src/app/layout.tsx
+++ b/frontend/src/app/layout.tsx
@@ -1,6 +1,9 @@
 import type { Metadata, Viewport } from "next";
 import { Geist, Geist_Mono } from "next/font/google";
 import "./globals.css";
 import { AuthProvider } from "@/contexts/AuthContext";
 import { TaskProvider } from "@/contexts/TaskContext";
 import GlobalTaskIndicator from "@/components/GlobalTaskIndicator";
 const geistSans = Geist({
  variable: "--font-geist-sans",
@@ -13,8 +16,8 @@ const geistMono = Geist_Mono({
 });
 export const metadata: Metadata = {
-  title: "ViGent",
+  title: "IPAgent",
-  description: "ViGent Talking Head Agent",
+  description: "IPAgent Talking Head Agent",
 };
 export const viewport: Viewport = {
@@ -30,16 +33,15 @@ export default function RootLayout({
  children: React.ReactNode;
 }>) {
  return (
-    <html lang="en" style={{ backgroundColor: '#0f172a' }}>
+    <html lang="en">
      <body
        className={`${geistSans.variable} ${geistMono.variable} antialiased`}
        style={{
          margin: 0,
          minHeight: '100dvh',
          background: 'linear-gradient(to bottom, #0f172a 0%, #0f172a 5%, #581c87 50%, #0f172a 95%, #0f172a 100%)',
        }}
      >
-        {children}
+        <AuthProvider>
          <TaskProvider>
            {children}
          </TaskProvider>
        </AuthProvider>
      </body>
    </html>
  );
--- a/frontend/src/app/login/page.tsx
+++ b/frontend/src/app/login/page.tsx
@@ -6,7 +6,7 @@ import { login } from '@/lib/auth';
 export default function LoginPage() {
    const router = useRouter();
-    const [email, setEmail] = useState('');
+    const [phone, setPhone] = useState('');
    const [password, setPassword] = useState('');
    const [error, setError] = useState('');
    const [loading, setLoading] = useState(false);
@@ -14,10 +14,17 @@ export default function LoginPage() {
    const handleSubmit = async (e: React.FormEvent) => {
        e.preventDefault();
        setError('');
        // Validate the phone number format
        if (!/^\d{11}$/.test(phone)) {
            setError('请输入正确的11位手机号');
            return;
        }
        setLoading(true);
        try {
-            const result = await login(email, password);
+            const result = await login(phone, password);
            if (result.success) {
                router.push('/');
            } else {
@@ -34,22 +41,23 @@ export default function LoginPage() {
        <div className="min-h-dvh flex items-center justify-center">
            <div className="w-full max-w-md p-8 bg-white/10 backdrop-blur-lg rounded-2xl shadow-2xl border border-white/20">
                <div className="text-center mb-8">
-                    <h1 className="text-3xl font-bold text-white mb-2">ViGent</h1>
+                    <h1 className="text-3xl font-bold text-white mb-2">IPAgent</h1>
                    <p className="text-gray-300">AI 视频生成平台</p>
                </div>
                <form onSubmit={handleSubmit} className="space-y-6">
                    <div>
                        <label className="block text-sm font-medium text-gray-200 mb-2">
-                            邮箱
+                            手机号
                        </label>
                        <input
-                            type="email"
+                            type="tel"
-                            value={email}
+                            value={phone}
-                            onChange={(e) => setEmail(e.target.value)}
+                            onChange={(e) => setPhone(e.target.value.replace(/\D/g, '').slice(0, 11))}
                            required
                            maxLength={11}
                            className="w-full px-4 py-3 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-400 focus:outline-none focus:ring-2 focus:ring-purple-500 focus:border-transparent"
-                            placeholder="your@email.com"
+                            placeholder="请输入11位手机号"
                        />
                    </div>
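The login form strips non-digits, truncates to 11 characters, and then checks the value against `^\d{11}$`. A backend would normally repeat that check rather than trust the client. The sketch below mirrors the same rule in Python. `sanitize_phone` and `is_valid_phone` are hypothetical helpers, not functions from this commit, and the sample number is a placeholder.

```python
import re

# [0-9] is used instead of \d because Python's \d also matches non-ASCII digits on str.
_PHONE_RE = re.compile(r"[0-9]{11}")


def sanitize_phone(raw: str) -> str:
    """Drop every non-digit character and keep at most 11 digits."""
    return re.sub(r"[^0-9]", "", raw)[:11]


def is_valid_phone(phone: str) -> bool:
    """True only if the whole string is exactly 11 ASCII digits."""
    # fullmatch() avoids the trailing-newline leniency of '$' in Python regexes.
    return _PHONE_RE.fullmatch(phone) is not None


cleaned = sanitize_phone("138-0013-8000")
print(cleaned, is_valid_phone(cleaned))
```

This only checks the length and character set, exactly as the form does. It does not verify that the number belongs to a real carrier prefix.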
--- a/frontend/src/app/page.tsx
+++ b/frontend/src/app/page.tsx
--- a/frontend/src/app/publish/page.tsx
+++ b/frontend/src/app/publish/page.tsx
@@ -1,9 +1,19 @@
 "use client";
-import { useState, useEffect } from "react";
+import { useState, useEffect } from "react";
-import useSWR from 'swr';
+import useSWR from 'swr';
-import Link from "next/link";
+import Link from "next/link";
-import api from "@/lib/axios";
+import api from "@/lib/axios";
 import { useAuth } from "@/contexts/AuthContext";
 import AccountSettingsDropdown from "@/components/AccountSettingsDropdown";
 import {
    ArrowLeft,
    RotateCcw,
    LogOut,
    QrCode,
    Rocket,
    Clock,
 } from "lucide-react";
 // SWR fetcher uses axios (handles 401/403 automatically)
 const fetcher = (url: string) => api.get(url).then((res) => res.data);
@@ -51,11 +61,68 @@ export default function PublishPage() {
    const [qrPlatform, setQrPlatform] = useState<string | null>(null);
    const [isLoadingQR, setIsLoadingQR] = useState(false);
-    // Load accounts and the video list
+    // Use the global auth state
    const { userId, isLoading: isAuthLoading } = useAuth();
    // Whether restoring from localStorage has finished
    const [isRestored, setIsRestored] = useState(false);
    // Load accounts and the video list
    useEffect(() => {
        fetchAccounts();
        fetchVideos();
    }, []);
    useEffect(() => {
        if (typeof window === 'undefined') return;
        if ('scrollRestoration' in window.history) {
            window.history.scrollRestoration = 'manual';
        }
        window.scrollTo({ top: 0, left: 0, behavior: 'auto' });
    }, []);
    // Storage key prefix (userId for signed-in users, guest otherwise)
    const storageKey = userId || 'guest';
    // Restore user input from localStorage (after auth has finished loading)
    useEffect(() => {
-        fetchAccounts();
+        console.log("[Publish] 恢复检查 - isAuthLoading:", isAuthLoading, "userId:", userId);
-        fetchVideos();
+        if (isAuthLoading) return;
-    }, []);
+
        console.log("[Publish] 开始从 localStorage 恢复数据，storageKey:", storageKey);
        // Restore user input from localStorage (isolated per user; signed-out users use guest)
        const savedTitle = localStorage.getItem(`vigent_${storageKey}_publish_title`);
        const savedTags = localStorage.getItem(`vigent_${storageKey}_publish_tags`);
        console.log("[Publish] localStorage 数据:", { savedTitle, savedTags });
        if (savedTitle) setTitle(savedTitle);
        if (savedTags) {
            // Accept both the JSON array format (AI-generated) and the plain string format (manual input)
            try {
                const parsed = JSON.parse(savedTags);
                if (Array.isArray(parsed)) {
                    setTags(parsed.join(', '));
                } else {
                    setTags(savedTags);
                }
            } catch {
                setTags(savedTags);
            }
        }
        // Saving is allowed only after the restore has finished
        setIsRestored(true);
        console.log("[Publish] 恢复完成，isRestored = true");
    }, [storageKey, isAuthLoading]);
    // Save user input to localStorage (only after the restore; signed-out users can save too)
    useEffect(() => {
        if (isRestored) localStorage.setItem(`vigent_${storageKey}_publish_title`, title);
    }, [title, storageKey, isRestored]);
    useEffect(() => {
        if (isRestored) localStorage.setItem(`vigent_${storageKey}_publish_tags`, tags);
    }, [tags, storageKey, isRestored]);
    const fetchAccounts = async () => {
        try {
@@ -246,38 +313,27 @@ export default function PublishPage() {
            )}
            {/* Header - unified style */}
-            <header className="border-b border-white/10 bg-black/20 backdrop-blur-sm">
+            <header className="border-b border-white/10 bg-black/20 backdrop-blur-sm relative z-[100]">
                <div className="max-w-6xl mx-auto px-4 sm:px-6 py-3 sm:py-4 flex items-center justify-between">
                    <Link href="/" className="text-xl sm:text-2xl font-bold text-white flex items-center gap-2 sm:gap-3 hover:opacity-80 transition-opacity">
                        <span className="text-3xl sm:text-4xl">🎬</span>
-                        ViGent
+                        IPAgent
                    </Link>
                    <div className="flex items-center gap-1 sm:gap-4">
-                        <Link
+                        <Link
-                            href="/"
+                            href="/"
-                            className="px-2 sm:px-4 py-1 sm:py-2 text-sm sm:text-base bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors"
+                            className="px-2 sm:px-4 py-1 sm:py-2 text-sm sm:text-base bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors flex items-center gap-1"
-                        >
+                        >
-                            返回创作
+                            <ArrowLeft className="h-4 w-4" />
-                        </Link>
+                            返回创作
-                        <span className="px-2 sm:px-4 py-1 sm:py-2 text-sm sm:text-base bg-gradient-to-r from-purple-600 to-pink-600 text-white rounded-lg font-semibold">
+                        </Link>
-                            发布管理
+                        <span className="px-2 sm:px-4 py-1 sm:py-2 text-sm sm:text-base bg-gradient-to-r from-purple-600 to-pink-600 text-white rounded-lg font-semibold">
-                        </span>
+                            发布管理
-                        <button
+                        </span>
-                            onClick={async () => {
+                        <AccountSettingsDropdown />
-                                if (confirm('确定要退出登录吗？')) {
+                    </div>
-                                    try {
+                </div>
-                                        await api.post('/api/auth/logout');
+            </header>
                                    } catch (e) { }
                                    window.location.href = '/login';
                                }
                            }}
                            className="px-2 sm:px-4 py-1 sm:py-2 text-sm sm:text-base bg-red-500/10 hover:bg-red-500/20 text-red-200 rounded-lg transition-colors"
                        >
                            退出
                        </button>
                    </div>
                </div>
            </header>
            <main className="max-w-6xl mx-auto px-6 py-8">
                <div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
@@ -315,26 +371,29 @@ export default function PublishPage() {
                                        <div className="flex gap-2">
                                            {account.logged_in ? (
                                                <>
                                                    <button
                                                        onClick={() => handleLogin(account.platform)}
-                                                        className="px-3 py-1 bg-white/10 hover:bg-white/20 text-white text-sm rounded-lg transition-colors"
+                                                        className="px-3 py-1 bg-white/10 hover:bg-white/20 text-white text-sm rounded-lg transition-colors flex items-center gap-1"
                                                    >
-                                                        ↻ 重新登录
+                                                        <RotateCcw className="h-3.5 w-3.5" />
+                                                        重新登录
                                                    </button>
                                                    <button
                                                        onClick={() => handleLogout(account.platform)}
-                                                        className="px-3 py-1 bg-red-500/80 hover:bg-red-600 text-white text-sm rounded-lg transition-colors"
+                                                        className="px-3 py-1 bg-red-500/80 hover:bg-red-600 text-white text-sm rounded-lg transition-colors flex items-center gap-1"
                                                    >
+                                                        <LogOut className="h-3.5 w-3.5" />
                                                        注销
                                                    </button>
                                                </>
                                            ) : (
                                                <button
                                                    onClick={() => handleLogin(account.platform)}
-                                                    className="px-3 py-1 bg-purple-600 hover:bg-purple-700 text-white text-sm rounded-lg transition-colors"
+                                                    className="px-3 py-1 bg-purple-600 hover:bg-purple-700 text-white text-sm rounded-lg transition-colors flex items-center gap-1"
                                                >
-                                                    🔐 扫码登录
+                                                    <QrCode className="h-3.5 w-3.5" />
+                                                    扫码登录
                                                </button>
                                            )}
                                        </div>
                                    </div>
@@ -440,32 +499,40 @@ export default function PublishPage() {
                        <div className="space-y-3">
                            <div className="flex gap-3">
                                {/* 立即发布 - 占 3/4 */}
                                <button
                                    onClick={() => {
                                        setScheduleMode("now");
                                        handlePublish();
                                    }}
                                    disabled={isPublishing || selectedPlatforms.length === 0}
-                                    className={`flex-[3] py-4 rounded-xl font-bold text-lg transition-all ${isPublishing || selectedPlatforms.length === 0
+                                    className={`flex-[3] py-4 rounded-xl font-bold text-lg transition-all flex items-center justify-center gap-2 ${isPublishing || selectedPlatforms.length === 0
                                        ? "bg-gray-600 cursor-not-allowed text-gray-400"
                                        : "bg-gradient-to-r from-green-600 to-teal-600 hover:from-green-700 hover:to-teal-700 text-white"
                                        }`}
                                >
-                                    {isPublishing && scheduleMode === "now" ? "发布中..." : "🚀 立即发布"}
+                                    {isPublishing && scheduleMode === "now" ? (
+                                        "发布中..."
+                                    ) : (
+                                        <>
+                                            <Rocket className="h-5 w-5" />
+                                            立即发布
+                                        </>
+                                    )}
                                </button>
                                {/* 定时发布 - 占 1/4 */}
                                <button
                                    onClick={() => setScheduleMode(scheduleMode === "scheduled" ? "now" : "scheduled")}
                                    disabled={isPublishing || selectedPlatforms.length === 0}
-                                    className={`flex-1 py-4 rounded-xl font-bold text-base transition-all ${isPublishing || selectedPlatforms.length === 0
+                                    className={`flex-1 py-4 rounded-xl font-bold text-base transition-all flex items-center justify-center gap-2 ${isPublishing || selectedPlatforms.length === 0
                                        ? "bg-gray-600 cursor-not-allowed text-gray-400"
                                        : scheduleMode === "scheduled"
                                            ? "bg-purple-600 text-white"
                                            : "bg-white/10 hover:bg-white/20 text-white"
                                        }`}
                                >
-                                    ⏰ 定时
+                                    <Clock className="h-5 w-5" />
+                                    定时
                                </button>
                            </div>
                            {/* 定时发布时间选择器 */}
--- a/frontend/src/app/register/page.tsx
+++ b/frontend/src/app/register/page.tsx
@@ -6,7 +6,7 @@ import { register } from '@/lib/auth';
 export default function RegisterPage() {
    const router = useRouter();
-    const [email, setEmail] = useState('');
+    const [phone, setPhone] = useState('');
    const [password, setPassword] = useState('');
    const [confirmPassword, setConfirmPassword] = useState('');
    const [username, setUsername] = useState('');
@@ -18,6 +18,12 @@ export default function RegisterPage() {
        e.preventDefault();
        setError('');
        // Validate the phone number format
        if (!/^\d{11}$/.test(phone)) {
            setError('请输入正确的11位手机号');
            return;
        }
        if (password !== confirmPassword) {
            setError('两次输入的密码不一致');
            return;
@@ -31,7 +37,7 @@ export default function RegisterPage() {
        setLoading(true);
        try {
-            const result = await register(email, password, username || undefined);
+            const result = await register(phone, password, username || undefined);
            if (result.success) {
                setSuccess(true);
            } else {
@@ -73,22 +79,24 @@ export default function RegisterPage() {
            <div className="w-full max-w-md p-8 bg-white/10 backdrop-blur-lg rounded-2xl shadow-2xl border border-white/20">
                <div className="text-center mb-8">
                    <h1 className="text-3xl font-bold text-white mb-2">注册账号</h1>
-                    <p className="text-gray-300">创建您的 ViGent 账号</p>
+                    <p className="text-gray-300">创建您的 IPAgent 账号</p>
                </div>
                <form onSubmit={handleSubmit} className="space-y-5">
                    <div>
                        <label className="block text-sm font-medium text-gray-200 mb-2">
-                            邮箱 <span className="text-red-400">*</span>
+                            手机号 <span className="text-red-400">*</span>
                        </label>
                        <input
-                            type="email"
+                            type="tel"
-                            value={email}
+                            value={phone}
-                            onChange={(e) => setEmail(e.target.value)}
+                            onChange={(e) => setPhone(e.target.value.replace(/\D/g, '').slice(0, 11))}
                            required
                            maxLength={11}
                            className="w-full px-4 py-3 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-400 focus:outline-none focus:ring-2 focus:ring-purple-500"
-                            placeholder="your@email.com"
+                            placeholder="请输入11位手机号"
                        />
                        <p className="mt-1 text-xs text-gray-500">必须是11位数字</p>
                    </div>
                    <div>
--- a/frontend/src/components/AccountSettingsDropdown.tsx
+++ b/frontend/src/components/AccountSettingsDropdown.tsx
@@ -0,0 +1,211 @@
 "use client";
 import { useState, useEffect, useRef } from "react";
 import { useAuth } from "@/contexts/AuthContext";
 import api from "@/lib/axios";
 // Account settings dropdown component
 export default function AccountSettingsDropdown() {
    const { user } = useAuth();
    const [isOpen, setIsOpen] = useState(false);
    const [showPasswordModal, setShowPasswordModal] = useState(false);
    const [oldPassword, setOldPassword] = useState('');
    const [newPassword, setNewPassword] = useState('');
    const [confirmPassword, setConfirmPassword] = useState('');
    const [error, setError] = useState('');
    const [success, setSuccess] = useState('');
    const [loading, setLoading] = useState(false);
    const dropdownRef = useRef<HTMLDivElement>(null);
    // Close the menu when the user clicks outside of it
    useEffect(() => {
        const handleClickOutside = (event: MouseEvent) => {
            if (dropdownRef.current && !dropdownRef.current.contains(event.target as Node)) {
                setIsOpen(false);
            }
        };
        if (isOpen) {
            document.addEventListener('mousedown', handleClickOutside);
        }
        return () => {
            document.removeEventListener('mousedown', handleClickOutside);
        };
    }, [isOpen]);
    // Format the account expiry date as YYYY-MM-DD
    const formatExpiry = (expiresAt: string | null) => {
        if (!expiresAt) return '永久有效';
        const date = new Date(expiresAt);
        return `${date.getFullYear()}-${String(date.getMonth() + 1).padStart(2, '0')}-${String(date.getDate()).padStart(2, '0')}`;
    };
    const handleLogout = async () => {
        if (confirm('确定要退出登录吗？')) {
            try {
                await api.post('/api/auth/logout');
            } catch (e) {
                // Ignore logout errors; redirect to the login page regardless
            }
            window.location.href = '/login';
        }
    };
    const handleChangePassword = async (e: React.FormEvent) => {
        e.preventDefault();
        setError('');
        setSuccess('');
        if (newPassword !== confirmPassword) {
            setError('两次输入的新密码不一致');
            return;
        }
        if (newPassword.length < 6) {
            setError('新密码长度至少6位');
            return;
        }
        setLoading(true);
        try {
            const res = await api.post('/api/auth/change-password', {
                old_password: oldPassword,
                new_password: newPassword
            });
            if (res.data.success) {
                setSuccess('密码修改成功，正在跳转登录页...');
                // Clear the login state, then redirect
                setTimeout(async () => {
                    try {
                        await api.post('/api/auth/logout');
                    } catch (e) {
                        // Ignore logout errors; redirect to the login page regardless
                    }
                    window.location.href = '/login';
                }, 1500);
            } else {
                setError(res.data.message || '修改失败');
            }
        } catch (err: any) {
            setError(err.response?.data?.detail || '修改失败，请重试');
        } finally {
            setLoading(false);
        }
    };
    return (
        <div className="relative" ref={dropdownRef}>
            <button
                onClick={() => setIsOpen(!isOpen)}
                className="px-2 sm:px-4 py-1 sm:py-2 text-sm sm:text-base bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors flex items-center gap-1"
            >
                <span>⚙️</span>
                <span className="hidden sm:inline">账户</span>
                <svg className={`w-4 h-4 transition-transform ${isOpen ? 'rotate-180' : ''}`} fill="none" stroke="currentColor" viewBox="0 0 24 24">
                    <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M19 9l-7 7-7-7" />
                </svg>
            </button>
            {/* Dropdown menu */}
            {isOpen && (
                <div className="absolute right-0 mt-2 bg-gray-800 border border-white/10 rounded-lg shadow-xl z-[160] overflow-hidden whitespace-nowrap">
                    {/* Account expiry display */}
                    <div className="px-3 py-2 border-b border-white/10 text-center">
                        <div className="text-xs text-gray-400">账户有效期</div>
                        <div className="text-sm text-white font-medium">
                            {user?.expires_at ? formatExpiry(user.expires_at) : '永久有效'}
                        </div>
                    </div>
                    <button
                        onClick={() => {
                            setIsOpen(false);
                            setShowPasswordModal(true);
                        }}
                        className="w-full px-3 py-2 text-left text-sm text-white hover:bg-white/10 flex items-center gap-2"
                    >
                        🔐 修改密码
                    </button>
                    <button
                        onClick={handleLogout}
                        className="w-full px-3 py-2 text-left text-sm text-red-300 hover:bg-red-500/20 flex items-center gap-2"
                    >
                        🚪 退出登录
                    </button>
                </div>
            )}
            {/* Change-password modal */}
            {showPasswordModal && (
                <div className="fixed inset-0 z-[200] flex items-start justify-center pt-20 bg-black/60 backdrop-blur-sm p-4">
                    <div className="w-full max-w-md p-6 bg-gray-900 border border-white/10 rounded-2xl shadow-2xl mx-4">
                        <h3 className="text-xl font-bold text-white mb-4">修改密码</h3>
                        <form onSubmit={handleChangePassword} className="space-y-4">
                            <div>
                                <label className="block text-sm text-gray-300 mb-1">当前密码</label>
                                <input
                                    type="password"
                                    value={oldPassword}
                                    onChange={(e) => setOldPassword(e.target.value)}
                                    required
                                    className="w-full px-3 py-2 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-500 focus:outline-none focus:ring-2 focus:ring-purple-500"
                                    placeholder="输入当前密码"
                                />
                            </div>
                            <div>
                                <label className="block text-sm text-gray-300 mb-1">新密码</label>
                                <input
                                    type="password"
                                    value={newPassword}
                                    onChange={(e) => setNewPassword(e.target.value)}
                                    required
                                    className="w-full px-3 py-2 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-500 focus:outline-none focus:ring-2 focus:ring-purple-500"
                                    placeholder="至少6位"
                                />
                            </div>
                            <div>
                                <label className="block text-sm text-gray-300 mb-1">确认新密码</label>
                                <input
                                    type="password"
                                    value={confirmPassword}
                                    onChange={(e) => setConfirmPassword(e.target.value)}
                                    required
                                    className="w-full px-3 py-2 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-500 focus:outline-none focus:ring-2 focus:ring-purple-500"
                                    placeholder="再次输入新密码"
                                />
                            </div>
                            {error && (
                                <div className="p-2 bg-red-500/20 border border-red-500/50 rounded text-red-200 text-sm">
                                    {error}
                                </div>
                            )}
                            {success && (
                                <div className="p-2 bg-green-500/20 border border-green-500/50 rounded text-green-200 text-sm">
                                    {success}
                                </div>
                            )}
                            <div className="flex gap-3 pt-2">
                                <button
                                    type="button"
                                    onClick={() => {
                                        setShowPasswordModal(false);
                                        setError('');
                                        setOldPassword('');
                                        setNewPassword('');
                                        setConfirmPassword('');
                                    }}
                                    className="flex-1 py-2 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors"
                                >
                                    取消
                                </button>
                                <button
                                    type="submit"
                                    disabled={loading}
                                    className="flex-1 py-2 bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-700 hover:to-pink-700 text-white rounded-lg transition-colors disabled:opacity-50"
                                >
                                    {loading ? '修改中...' : '确认修改'}
                                </button>
                            </div>
                        </form>
                    </div>
                </div>
            )}
        </div>
    );
 }
--- a/frontend/src/components/GlobalTaskIndicator.tsx
+++ b/frontend/src/components/GlobalTaskIndicator.tsx
@@ -0,0 +1,42 @@
 "use client";
 import { useTask } from "@/contexts/TaskContext";
 import Link from "next/link";
 export default function GlobalTaskIndicator() {
  const { currentTask, isGenerating } = useTask();
  if (!isGenerating) return null;
  return (
    <div className="fixed top-0 left-0 right-0 z-50 bg-gradient-to-r from-purple-600 to-pink-600 text-white shadow-lg">
      <div className="max-w-6xl mx-auto px-6 py-3">
        <div className="flex items-center justify-between">
          <div className="flex items-center gap-3">
            <div className="animate-spin rounded-full h-5 w-5 border-2 border-white border-t-transparent"></div>
            <span className="font-medium">
              视频生成中... {currentTask?.progress || 0}%
            </span>
            {currentTask?.message && (
              <span className="text-white/80 text-sm">
                {currentTask.message}
              </span>
            )}
          </div>
          <Link
            href="/"
            className="px-3 py-1 bg-white/20 hover:bg-white/30 rounded transition-colors text-sm"
          >
            查看详情
          </Link>
        </div>
        <div className="mt-2 w-full bg-white/20 rounded-full h-1.5 overflow-hidden">
          <div
            className="bg-white h-full transition-all duration-300 ease-out"
            style={{ width: `${currentTask?.progress || 0}%` }}
          ></div>
        </div>
      </div>
    </div>
  );
 }
--- a/frontend/src/components/ScriptExtractionModal.tsx
+++ b/frontend/src/components/ScriptExtractionModal.tsx
@@ -0,0 +1,424 @@
 "use client";
 import { useState, useEffect } from "react";
 import api from "@/lib/axios";
 interface ScriptExtractionModalProps {
    isOpen: boolean;
    onClose: () => void;
    onApply?: (text: string) => void;
 }
 export default function ScriptExtractionModal({
    isOpen,
    onClose,
    onApply
 }: ScriptExtractionModalProps) {
    const [isLoading, setIsLoading] = useState(false);
    const [script, setScript] = useState("");
    const [rewrittenScript, setRewrittenScript] = useState("");
    const [error, setError] = useState<string | null>(null);
    const [doRewrite, setDoRewrite] = useState(true);
    const [step, setStep] = useState<'config' | 'processing' | 'result'>('config');
    const [dragActive, setDragActive] = useState(false);
    const [selectedFile, setSelectedFile] = useState<File | null>(null);
    // New state for URL mode
    const [activeTab, setActiveTab] = useState<'file' | 'url'>('url');
    const [inputUrl, setInputUrl] = useState("");
    // Reset state when modal opens
    useEffect(() => {
        if (isOpen) {
            setStep('config');
            setScript("");
            setRewrittenScript("");
            setError(null);
            setIsLoading(false);
            setSelectedFile(null);
            setInputUrl("");
            setActiveTab('url');
        }
    }, [isOpen]);
    const handleDrag = (e: React.DragEvent) => {
        e.preventDefault();
        e.stopPropagation();
        if (e.type === "dragenter" || e.type === "dragover") {
            setDragActive(true);
        } else if (e.type === "dragleave") {
            setDragActive(false);
        }
    };
    const handleDrop = (e: React.DragEvent) => {
        e.preventDefault();
        e.stopPropagation();
        setDragActive(false);
        if (e.dataTransfer.files && e.dataTransfer.files[0]) {
            handleFile(e.dataTransfer.files[0]);
        }
    };
    const handleFileChange = (e: React.ChangeEvent<HTMLInputElement>) => {
        if (e.target.files && e.target.files[0]) {
            handleFile(e.target.files[0]);
        }
    };
    const handleFile = (file: File) => {
        const validTypes = ['.mp4', '.mov', '.avi', '.mp3', '.wav', '.m4a'];
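        // A name without a dot has no extension; avoid slice(-1) returning the last character
        const dotIndex = file.name.lastIndexOf('.');
        const ext = dotIndex >= 0 ? file.name.slice(dotIndex).toLowerCase() : '';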
        if (!validTypes.includes(ext)) {
            setError(`不支持的文件格式 ${ext}，请上传视频或音频文件`);
            return;
        }
        setSelectedFile(file);
        setError(null);
    };
    const handleExtract = async () => {
        if (activeTab === 'file' && !selectedFile) {
            setError("请先上传文件");
            return;
        }
        if (activeTab === 'url' && !inputUrl.trim()) {
            setError("请先输入视频链接");
            return;
        }
        setIsLoading(true);
        setStep('processing');
        setError(null);
        try {
            const formData = new FormData();
            if (activeTab === 'file' && selectedFile) {
                formData.append('file', selectedFile);
            } else if (activeTab === 'url') {
                formData.append('url', inputUrl.trim());
            }
            formData.append('rewrite', doRewrite ? 'true' : 'false');
            const { data } = await api.post('/api/tools/extract-script', formData, {
                headers: { 'Content-Type': 'multipart/form-data' },
                timeout: 180000 // 3 minutes timeout
            });
            if (data.success) {
                setScript(data.original_script);
                setRewrittenScript(data.rewritten_script || "");
                setStep('result');
            } else {
                setError("提取失败：未知错误");
                setStep('config');
            }
        } catch (err: any) {
            console.error(err);
            const msg = err.response?.data?.detail || err.message || "请求失败";
            setError(msg);
            setStep('config');
        } finally {
            setIsLoading(false);
        }
    };
    const copyToClipboard = (text: string) => {
        if (navigator.clipboard && window.isSecureContext) {
            navigator.clipboard.writeText(text).then(() => {
                alert("已复制到剪贴板");
            }).catch(err => {
                console.error('Async: Could not copy text: ', err);
                fallbackCopyTextToClipboard(text);
            });
        } else {
            fallbackCopyTextToClipboard(text);
        }
    };
    const fallbackCopyTextToClipboard = (text: string) => {
        const textArea = document.createElement("textarea");
        textArea.value = text;
        // Keep the textarea fixed and invisible so the page does not scroll
        textArea.style.top = "0";
        textArea.style.left = "0";
        textArea.style.position = "fixed";
        textArea.style.opacity = "0";
        document.body.appendChild(textArea);
        textArea.focus();
        textArea.select();
        try {
            // document.execCommand is deprecated; it is used here only as a fallback
            const successful = document.execCommand('copy');
            alert(successful ? "已复制到剪贴板" : "复制失败，请手动复制");
        } catch (err) {
            console.error('Fallback: unable to copy', err);
            alert("复制失败，请手动复制");
        } finally {
            // Always remove the temporary textarea, even if copying throws
            document.body.removeChild(textArea);
        }
    };
    // Close when clicking outside - DISABLED as per user request
    // const modalRef = useRef<HTMLDivElement>(null);
    // const handleBackdropClick = (e: React.MouseEvent) => {
    //     if (modalRef.current && !modalRef.current.contains(e.target as Node)) {
    //         onClose();
    //     }
    // };
    if (!isOpen) return null;
    return (
        <div
            className="fixed inset-0 z-50 flex items-center justify-center bg-black/80 backdrop-blur-sm p-4 animate-in fade-in duration-200"
        >
            <div
                // ref={modalRef}
                className="bg-[#1a1a1a] border border-white/10 rounded-2xl w-full max-w-2xl max-h-[90vh] overflow-hidden flex flex-col shadow-2xl"
            >
                {/* Header */}
                <div className="flex items-center justify-between p-4 border-b border-white/10 bg-white/5">
                    <h3 className="text-lg font-semibold text-white flex items-center gap-2">
                        📜 文案提取助手
                    </h3>
                    <button
                        onClick={onClose}
                        className="text-gray-400 hover:text-white transition-colors text-2xl leading-none"
                    >
                        &times;
                    </button>
                </div>
                {/* Content */}
                <div className="flex-1 overflow-y-auto p-6">
                    {step === 'config' && (
                        <div className="space-y-6">
                            {/* Tabs */}
                            <div className="flex p-1 bg-white/5 rounded-xl border border-white/10">
                                <button
                                    onClick={() => setActiveTab('url')}
                                    className={`flex-1 py-2 rounded-lg text-sm font-medium transition-all ${activeTab === 'url'
                                        ? 'bg-purple-600 text-white shadow-lg'
                                        : 'text-gray-400 hover:text-white hover:bg-white/5'
                                        }`}
                                >
                                    🔗 粘贴链接
                                </button>
                                <button
                                    onClick={() => setActiveTab('file')}
                                    className={`flex-1 py-2 rounded-lg text-sm font-medium transition-all ${activeTab === 'file'
                                        ? 'bg-purple-600 text-white shadow-lg'
                                        : 'text-gray-400 hover:text-white hover:bg-white/5'
                                        }`}
                                >
                                    📂 上传文件
                                </button>
                            </div>
                            {/* URL Input Area */}
                            {activeTab === 'url' && (
                                <div className="space-y-2 py-4">
                                    <div className="relative">
                                        <input
                                            type="text"
                                            value={inputUrl}
                                            onChange={(e) => setInputUrl(e.target.value)}
                                            placeholder="请粘贴抖音、B站等主流平台视频链接..."
                                            className="w-full bg-black/20 border border-white/10 rounded-xl px-4 py-4 text-white placeholder-gray-500 focus:outline-none focus:border-purple-500 transition-colors"
                                        />
                                        {inputUrl && (
                                            <button
                                                onClick={() => setInputUrl("")}
                                                className="absolute right-3 top-1/2 -translate-y-1/2 text-gray-500 hover:text-white p-1"
                                            >
                                                ✕
                                            </button>
                                        )}
                                    </div>
                                    <p className="text-xs text-gray-400 px-1">
                                        支持抖音、B站等主流平台分享链接，自动解析下载并提取文案。
                                    </p>
                                </div>
                            )}
                            {/* File Upload Area */}
                            {activeTab === 'file' && (
                                <div
                                    className={`
                                        relative border-2 border-dashed rounded-xl p-8 text-center transition-all cursor-pointer
                                        ${dragActive ? 'border-purple-500 bg-purple-500/10' : 'border-white/20 hover:border-white/40 hover:bg-white/5'}
                                        ${selectedFile ? 'bg-purple-900/10 border-purple-500/50' : ''}
                                    `}
                                    onDragEnter={handleDrag}
                                    onDragLeave={handleDrag}
                                    onDragOver={handleDrag}
                                    onDrop={handleDrop}
                                >
                                    <input
                                        type="file"
                                        className="absolute inset-0 w-full h-full opacity-0 cursor-pointer"
                                        onChange={handleFileChange}
                                        accept=".mp4,.mov,.avi,.mp3,.wav,.m4a"
                                    />
                                    {selectedFile ? (
                                        <div className="flex flex-col items-center">
                                            <div className="text-4xl mb-2">📄</div>
                                            <div className="font-medium text-white break-all max-w-xs">{selectedFile.name}</div>
                                            <div className="text-sm text-gray-400 mt-1">{(selectedFile.size / (1024 * 1024)).toFixed(1)} MB</div>
                                            <div className="mt-4 text-xs text-purple-400">点击更换文件</div>
                                        </div>
                                    ) : (
                                        <div className="flex flex-col items-center">
                                            <div className="text-4xl mb-2">📤</div>
                                            <div className="font-medium text-white">点击上传或拖拽文件到此处</div>
                                            <div className="text-sm text-gray-400 mt-2">支持 MP4, MOV, MP3, WAV 等音视频格式</div>
                                        </div>
                                    )}
                                </div>
                            )}
                            {/* Options */}
                            <div className="bg-white/5 rounded-xl p-4 border border-white/10">
                                <label className="flex items-center gap-3 cursor-pointer">
                                    <input
                                        type="checkbox"
                                        checked={doRewrite}
                                        onChange={e => setDoRewrite(e.target.checked)}
                                        className="w-5 h-5 accent-purple-600 rounded"
                                    />
                                    <div>
                                        <div className="text-white font-medium">启用 AI 洗稿</div>
                                        <div className="text-xs text-gray-400">自动将提取的文案重写为更自然流畅的口播稿</div>
                                    </div>
                                </label>
                            </div>
                            {error && (
                                <div className="p-3 bg-red-500/20 text-red-200 rounded-lg text-sm text-center">
                                    ❌ {error}
                                </div>
                            )}
                            <div className="flex justify-center pt-2">
                                <button
                                    onClick={handleExtract}
                                    className="w-full sm:w-auto px-10 py-3 bg-gradient-to-r from-purple-600 to-pink-600 text-white rounded-xl font-bold hover:shadow-lg hover:from-purple-500 hover:to-pink-500 transition-all transform hover:-translate-y-0.5 disabled:opacity-50 disabled:cursor-not-allowed"
                                    disabled={activeTab === 'file' ? !selectedFile : !inputUrl.trim()}
                                >
                                    {activeTab === 'url' ? '🔗 解析并提取' : '🚀 开始提取'}
                                </button>
                            </div>
                        </div>
                    )}
                    {step === 'processing' && (
                        <div className="flex flex-col items-center justify-center py-20">
                            <div className="relative w-20 h-20 mb-6">
                                <div className="absolute inset-0 border-4 border-purple-500/30 rounded-full"></div>
                                <div className="absolute inset-0 border-4 border-t-purple-500 rounded-full animate-spin"></div>
                            </div>
                            <h4 className="text-xl font-medium text-white mb-2">正在处理中...</h4>
                            <p className="text-sm text-gray-400 text-center max-w-sm px-4">
                                {activeTab === 'url' && "正在下载视频..."}<br />
                                {doRewrite ? "正在进行语音识别和 AI 智能改写..." : "正在进行语音识别..."}<br />
                                <span className="opacity-75">大文件可能需要几分钟，请不要关闭窗口</span>
                            </p>
                        </div>
                    )}
                    {step === 'result' && (
                        <div className="space-y-6">
                            {rewrittenScript && (
                                <div className="space-y-2">
                                    <div className="flex justify-between items-center">
                                        <h4 className="font-semibold text-purple-300 flex items-center gap-2">
                                            ✨ AI 洗稿结果 <span className="text-xs font-normal text-purple-400/70">(推荐)</span>
                                        </h4>
                                        {onApply && (
                                            <button
                                                onClick={() => {
                                                    onApply(rewrittenScript);
                                                    onClose();
                                                }}
                                                className="text-xs bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-500 hover:to-pink-500 text-white px-3 py-1.5 rounded-lg transition-colors flex items-center gap-1 shadow-sm"
                                            >
                                                📥 填入
                                            </button>
                                        )}
                                        <button
                                            onClick={() => copyToClipboard(rewrittenScript)}
                                            className="text-xs bg-purple-600 hover:bg-purple-500 text-white px-3 py-1.5 rounded-lg transition-colors flex items-center gap-1"
                                        >
                                            📋 复制内容
                                        </button>
                                    </div>
                                    <div className="bg-purple-900/10 border border-purple-500/20 rounded-xl p-4 max-h-60 overflow-y-auto custom-scrollbar">
                                        <p className="text-gray-200 text-sm leading-relaxed whitespace-pre-wrap">
                                            {rewrittenScript}
                                        </p>
                                    </div>
                                </div>
                            )}
                            <div className="space-y-2">
                                <div className="flex justify-between items-center">
                                    <h4 className="font-semibold text-gray-400 flex items-center gap-2">
                                        🎙️ 原始识别结果
                                    </h4>
                                    {onApply && (
                                        <button
                                            onClick={() => {
                                                onApply(script);
                                                onClose();
                                            }}
                                            className="text-xs bg-white/10 hover:bg-white/20 text-white px-3 py-1.5 rounded-lg transition-colors flex items-center gap-1"
                                        >
                                            📥 填入
                                        </button>
                                    )}
                                    <button
                                        onClick={() => copyToClipboard(script)}
                                        className="text-xs bg-white/10 hover:bg-white/20 text-white px-3 py-1.5 rounded-lg transition-colors"
                                    >
                                        复制
                                    </button>
                                </div>
                                <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-40 overflow-y-auto custom-scrollbar">
                                    <p className="text-gray-400 text-sm leading-relaxed whitespace-pre-wrap">
                                        {script}
                                    </p>
                                </div>
                            </div>
                            <div className="flex justify-center pt-4">
                                <button
                                    onClick={() => {
                                        setStep('config');
                                        setScript("");
                                        setRewrittenScript("");
                                        setSelectedFile(null);
                                        setInputUrl("");
                                        // Keep current tab active
                                    }}
                                    className="px-6 py-2 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors"
                                >
                                    提取下一个
                                </button>
                            </div>
                        </div>
                    )}
                </div>
            </div>
        </div>
    );
 }
--- a/frontend/src/components/VideoPreviewModal.tsx
+++ b/frontend/src/components/VideoPreviewModal.tsx
@@ -0,0 +1,64 @@
 "use client";
 import { useEffect } from "react";
 interface VideoPreviewModalProps {
    videoUrl: string | null;
    onClose: () => void;
 }
 export default function VideoPreviewModal({ videoUrl, onClose }: VideoPreviewModalProps) {
    useEffect(() => {
        // Close on ESC
        const handleEsc = (e: KeyboardEvent) => {
            if (e.key === 'Escape') onClose();
        };
        if (videoUrl) {
            document.addEventListener('keydown', handleEsc);
            // Prevent background scrolling
            document.body.style.overflow = 'hidden';
        }
        return () => {
            document.removeEventListener('keydown', handleEsc);
            document.body.style.overflow = 'unset';
        };
    }, [videoUrl, onClose]);
    if (!videoUrl) return null;
    return (
        <div className="fixed inset-0 z-[200] flex items-center justify-center bg-black/80 backdrop-blur-sm p-4 animate-in fade-in duration-200">
            <div className="relative w-full max-w-4xl bg-gray-900 border border-white/10 rounded-2xl shadow-2xl overflow-hidden flex flex-col">
                {/* Header */}
                <div className="flex items-center justify-between px-6 py-2 border-b border-white/10 bg-white/5">
                    <h3 className="text-lg font-semibold text-white flex items-center gap-2">
                        🎥 视频预览
                    </h3>
                    <button
                        onClick={onClose}
                        className="p-2 text-gray-400 hover:text-white hover:bg-white/10 rounded-lg transition-colors"
                    >
                        <svg className="w-6 h-6" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                            <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M6 18L18 6M6 6l12 12" />
                        </svg>
                    </button>
                </div>
                {/* Video Player */}
                <div className="bg-black flex items-center justify-center min-h-[50vh] max-h-[80vh]">
                    <video
                        src={videoUrl}
                        controls
                        autoPlay
                        className="w-full h-full max-h-[80vh] object-contain"
                    />
                </div>
            </div>
            {/* Click outside to close */}
            <div className="absolute inset-0 -z-10" onClick={onClose}></div>
        </div>
    );
 }
--- a/frontend/src/contexts/AuthContext.tsx
+++ b/frontend/src/contexts/AuthContext.tsx
@@ -0,0 +1,80 @@
 "use client";
 import { createContext, useContext, useState, useEffect, ReactNode } from "react";
 import api from "@/lib/axios";
 interface User {
  id: string;
  phone: string;
  username: string | null;
  role: string;
  is_active: boolean;
  expires_at: string | null;
 }
 interface AuthContextType {
  userId: string | null;
  user: User | null;
  isLoading: boolean;
  isAuthenticated: boolean;
 }
 const AuthContext = createContext<AuthContextType>({
  userId: null,
  user: null,
  isLoading: true,
  isAuthenticated: false,
 });
 export function AuthProvider({ children }: { children: ReactNode }) {
  const [user, setUser] = useState<User | null>(null);
  const [isLoading, setIsLoading] = useState(true);
  useEffect(() => {
    let retryCount = 0;
    const maxRetries = 2;
    const fetchUser = async () => {
      console.log("[AuthContext] 开始获取用户信息...");
      try {
        const { data } = await api.get('/api/auth/me');
        console.log("[AuthContext] 获取用户信息成功:", data);
        if (data && data.id) {
          setUser(data);
          console.log("[AuthContext] 设置 user:", data);
        } else {
          console.warn("[AuthContext] 响应中没有用户数据");
        }
        setIsLoading(false);
      } catch (error) {
        console.error("[AuthContext] 获取用户信息失败:", error);
        // Retry logic
        if (retryCount < maxRetries) {
          retryCount++;
          console.log(`[AuthContext] 重试 ${retryCount}/${maxRetries}...`);
          setTimeout(fetchUser, 1000);
        } else {
          console.error("[AuthContext] 重试次数用尽，放弃获取用户信息");
          setIsLoading(false);
        }
      }
    };
    fetchUser();
  }, []);
  return (
    <AuthContext.Provider value={{
      userId: user?.id || null,
      user,
      isLoading,
      isAuthenticated: !!user
    }}>
      {children}
    </AuthContext.Provider>
  );
 }
 export function useAuth() {
  return useContext(AuthContext);
 }
--- a/frontend/src/contexts/TaskContext.tsx
+++ b/frontend/src/contexts/TaskContext.tsx
@@ -0,0 +1,119 @@
 "use client";
 import { createContext, useContext, useState, useEffect, ReactNode } from "react";
 import api from "@/lib/axios";
 interface Task {
  task_id: string;
  status: string;
  progress: number;
  message: string;
  download_url?: string;
 }
 interface TaskContextType {
  currentTask: Task | null;
  isGenerating: boolean;
  startTask: (taskId: string) => void;
  clearTask: () => void;
 }
 const TaskContext = createContext<TaskContextType | undefined>(undefined);
 export function TaskProvider({ children }: { children: ReactNode }) {
  const [currentTask, setCurrentTask] = useState<Task | null>(null);
  const [isGenerating, setIsGenerating] = useState(false);
  const [taskId, setTaskId] = useState<string | null>(null);
  // Poll task status
  useEffect(() => {
    if (!taskId) return;
    const pollTask = async () => {
      try {
        const { data } = await api.get(`/api/videos/tasks/${taskId}`);
        setCurrentTask(data);
        // Handle completed, failed, or missing tasks
        if (data.status === "completed" || data.status === "failed" || data.status === "not_found") {
          setIsGenerating(false);
          setTaskId(null);
          // Clear saved task IDs from localStorage
          if (typeof window !== 'undefined') {
            const keys = Object.keys(localStorage);
            keys.forEach(key => {
              if (key.includes('_current_task')) {
                localStorage.removeItem(key);
              }
            });
          }
        }
      } catch (error) {
        console.error("轮询任务失败:", error);
        setIsGenerating(false);
        setTaskId(null);
        // Clear saved task IDs from localStorage
        if (typeof window !== 'undefined') {
          const keys = Object.keys(localStorage);
          keys.forEach(key => {
            if (key.includes('_current_task')) {
              localStorage.removeItem(key);
            }
          });
        }
      }
    };
    // Run once immediately
    pollTask();
    // Then poll every second
    const interval = setInterval(pollTask, 1000);
    return () => clearInterval(interval);
  }, [taskId]);
  // Restore an in-progress task on page load
  useEffect(() => {
    if (typeof window === 'undefined') return;
    // Look for any saved task ID
    const keys = Object.keys(localStorage);
    const taskKey = keys.find(key => key.includes('_current_task'));
    if (taskKey) {
      const savedTaskId = localStorage.getItem(taskKey);
      if (savedTaskId) {
        console.log("[TaskContext] 恢复任务:", savedTaskId);
        setTaskId(savedTaskId);
        setIsGenerating(true);
      }
    }
  }, []);
  const startTask = (newTaskId: string) => {
    setTaskId(newTaskId);
    setIsGenerating(true);
    setCurrentTask(null);
  };
  const clearTask = () => {
    setTaskId(null);
    setIsGenerating(false);
    setCurrentTask(null);
  };
  return (
    <TaskContext.Provider value={{ currentTask, isGenerating, startTask, clearTask }}>
      {children}
    </TaskContext.Provider>
  );
 }
 export function useTask() {
  const context = useContext(TaskContext);
  if (context === undefined) {
    throw new Error("useTask must be used within a TaskProvider");
  }
  return context;
 }
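Review note: the `_current_task` cleanup appears twice in `pollTask`, once on the terminal-status path and once in the `catch` block. A minimal sketch of a shared helper follows. The names `clearSavedTaskKeys`, `KeyValueStore`, and `MemoryStore` are hypothetical and not part of this PR; `KeyValueStore` is a small subset of the Web Storage API, so `window.localStorage` should satisfy it structurally.

```typescript
// Hypothetical subset of the Web Storage API, so the helper can be
// exercised without a browser.
interface KeyValueStore {
  readonly length: number;
  key(index: number): string | null;
  removeItem(key: string): void;
}

// Remove every entry whose key contains the marker; returns the removed keys.
function clearSavedTaskKeys(store: KeyValueStore, marker = "_current_task"): string[] {
  // Collect first: removing while indexing would shift the remaining keys.
  const matches: string[] = [];
  for (let i = 0; i < store.length; i++) {
    const k = store.key(i);
    if (k !== null && k.indexOf(marker) !== -1) {
      matches.push(k);
    }
  }
  matches.forEach((k) => store.removeItem(k));
  return matches;
}

// In-memory stand-in, for illustration only.
class MemoryStore implements KeyValueStore {
  private entries: Array<[string, string]> = [];

  get length(): number {
    return this.entries.length;
  }

  key(index: number): string | null {
    return index >= 0 && index < this.entries.length ? this.entries[index][0] : null;
  }

  setItem(key: string, value: string): void {
    this.removeItem(key);
    this.entries.push([key, value]);
  }

  removeItem(key: string): void {
    this.entries = this.entries.filter(([k]) => k !== key);
  }
}
```

With such a helper, both branches of `pollTask` could call `clearSavedTaskKeys(localStorage)` after the existing `typeof window !== 'undefined'` guard.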
--- a/frontend/src/lib/auth.ts
+++ b/frontend/src/lib/auth.ts
@@ -8,10 +8,11 @@ const API_BASE = typeof window === 'undefined'
 export interface User {
    id: string;
-    email: string;
+    phone: string;
    username: string | null;
    role: string;
    is_active: boolean;
    expires_at: string | null;
 }
 export interface AuthResponse {
@@ -23,12 +24,12 @@ export interface AuthResponse {
 /**
 * User registration
 */
-export async function register(email: string, password: string, username?: string): Promise<AuthResponse> {
+export async function register(phone: string, password: string, username?: string): Promise<AuthResponse> {
    const res = await fetch(`${API_BASE}/api/auth/register`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        credentials: 'include',
-        body: JSON.stringify({ email, password, username })
+        body: JSON.stringify({ phone, password, username })
    });
    return res.json();
 }
@@ -36,12 +37,12 @@ export async function register(email: string, password: string, username?: strin
 /**
 * User login
 */
-export async function login(email: string, password: string): Promise<AuthResponse> {
+export async function login(phone: string, password: string): Promise<AuthResponse> {
    const res = await fetch(`${API_BASE}/api/auth/login`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        credentials: 'include',
-        body: JSON.stringify({ email, password })
+        body: JSON.stringify({ phone, password })
    });
    return res.json();
 }
@@ -57,6 +58,19 @@ export async function logout(): Promise<AuthResponse> {
    return res.json();
 }
 /**
 * Change password
 */
 export async function changePassword(oldPassword: string, newPassword: string): Promise<AuthResponse> {
    const res = await fetch(`${API_BASE}/api/auth/change-password`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        credentials: 'include',
        body: JSON.stringify({ old_password: oldPassword, new_password: newPassword })
    });
    return res.json();
 }
 /**
 * Get the current user
 */
--- a/frontend/src/lib/axios.ts
+++ b/frontend/src/lib/axios.ts
@@ -12,6 +12,8 @@ const API_BASE = typeof window === 'undefined'
 // Prevent duplicate redirects
 let isRedirecting = false;
 const PUBLIC_PATHS = new Set(['/login', '/register']);
 // Create the axios instance
 const api = axios.create({
    baseURL: API_BASE,
@@ -27,7 +29,9 @@ api.interceptors.response.use(
    async (error) => {
        const status = error.response?.status;
-        if ((status === 401 || status === 403) && !isRedirecting) {
+        const isPublicPath = typeof window !== 'undefined' && PUBLIC_PATHS.has(window.location.pathname);
+        if ((status === 401 || status === 403) && !isRedirecting && !isPublicPath) {
            isRedirecting = true;
            // Call the logout API to clear the HttpOnly cookie
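Review note: the interceptor change above skips the redirect when the user is already on a public page. A minimal sketch of that decision as a pure function is below. The name `shouldRedirectToLogin` is hypothetical and not part of this PR, and an array stands in for the `Set` used in `axios.ts` so the sketch does not depend on a particular compile target.

```typescript
// Hypothetical pure version of the interceptor's redirect check.
const PUBLIC_PATHS: string[] = ["/login", "/register"];

function shouldRedirectToLogin(
  status: number | undefined,
  pathname: string,
  isRedirecting: boolean
): boolean {
  // Only authentication/authorization failures trigger a redirect.
  const isAuthError = status === 401 || status === 403;
  // Pages that are reachable without a session must not bounce to /login.
  const isPublicPath = PUBLIC_PATHS.indexOf(pathname) !== -1;
  return isAuthError && !isRedirecting && !isPublicPath;
}
```

Keeping the check pure makes it easy to unit-test without mocking `window.location` or the axios instance.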
--- a/frontend/src/middleware.ts
+++ b/frontend/src/middleware.ts
@@ -1,33 +0,0 @@
 import { NextResponse } from 'next/server';
 import type { NextRequest } from 'next/server';
 // Paths that require login
 const protectedPaths = ['/', '/publish', '/admin'];
 // Public paths (no login required)
 const publicPaths = ['/login', '/register'];
 export function middleware(request: NextRequest) {
    const { pathname } = request.nextUrl;
    // Check for the access_token cookie
    const token = request.cookies.get('access_token');
    // Unauthenticated access to a protected page → redirect to the login page
    if (protectedPaths.some(path => pathname === path || pathname.startsWith(path + '/')) && !token) {
        const loginUrl = new URL('/login', request.url);
        loginUrl.searchParams.set('from', pathname);
        return NextResponse.redirect(loginUrl);
    }
    // Logged-in user visiting login/register → redirect to the home page
    if (publicPaths.includes(pathname) && token) {
        return NextResponse.redirect(new URL('/', request.url));
    }
    return NextResponse.next();
 }
 export const config = {
    matcher: ['/', '/publish/:path*', '/admin/:path*', '/login', '/register']
 };
--- a/models/Qwen3-TTS/qwen_tts_server.py
+++ b/models/Qwen3-TTS/qwen_tts_server.py
@@ -0,0 +1,189 @@
 """
 Qwen3-TTS standalone inference service
 Port: 8009
 GPU: 0
 Start:
    conda activate qwen-tts
    python qwen_tts_server.py
 Start with PM2:
    pm2 start qwen_tts_server.py --name qwen-tts --interpreter /home/rongye/ProgramFiles/miniconda3/envs/qwen-tts/bin/python
 """
 import os
 import sys
 import tempfile
 import time
 from pathlib import Path
 from typing import Optional
 # Select the GPU (set before CUDA is initialized)
 os.environ["CUDA_VISIBLE_DEVICES"] = "0"
 from fastapi import FastAPI, HTTPException, UploadFile, File, Form
 from fastapi.responses import FileResponse
 from pydantic import BaseModel
 import uvicorn
 app = FastAPI(title="Qwen3-TTS Voice Clone Service", version="1.0")
 # Model path (1.7B-Base gives higher-quality voice cloning)
 MODEL_PATH = Path(__file__).parent / "checkpoints" / "1.7B-Base"
 # Global model instance
 _model = None
 _model_loaded = False
 def load_model():
    """Load the model (called at startup)."""
    global _model, _model_loaded
    if _model_loaded:
        return
    print("🔄 Loading Qwen3-TTS model...")
    start = time.time()
    import torch
    from qwen_tts import Qwen3TTSModel
    _model = Qwen3TTSModel.from_pretrained(
        str(MODEL_PATH),
        device_map="cuda:0",
        dtype=torch.bfloat16,
    )
    _model_loaded = True
    print(f"✅ Qwen3-TTS model loaded in {time.time() - start:.1f}s")
 class GenerateRequest(BaseModel):
    text: str
    ref_text: str
    language: str = "Chinese"
 class HealthResponse(BaseModel):
    service: str
    model: str
    ready: bool
    gpu_id: int
@app.on_event("startup")
 async def startup():
    """Preload the model when the service starts."""
    try:
        load_model()
    except Exception as e:
        print(f"❌ Model loading failed: {e}")
@app.get("/health", response_model=HealthResponse)
 async def health():
    """Health check."""
    gpu_ok = False
    try:
        import torch
        gpu_ok = torch.cuda.is_available()
    except Exception:
        pass
    return HealthResponse(
        service="Qwen3-TTS Voice Clone",
        model="1.7B-Base",
        ready=_model_loaded and gpu_ok,
        gpu_id=0
    )
@app.post("/generate")
 async def generate(
    ref_audio: UploadFile = File(...),
    text: str = Form(...),
    ref_text: str = Form(...),
    language: str = Form("Chinese")
 ):
    """
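    Generate speech in the cloned voice.
    Args:
        ref_audio: reference audio file (WAV)
        text: text to synthesize
        ref_text: transcript of the reference audio
        language: language (Chinese/English/Auto)
    Returns:
        The generated audio file (WAV)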
    """
    if not _model_loaded:
        raise HTTPException(status_code=503, detail="Model not loaded")
    import soundfile as sf
    # Save the uploaded reference audio to a temporary file
    with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp_ref:
        content = await ref_audio.read()
        tmp_ref.write(content)
        ref_audio_path = tmp_ref.name
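    # Reserve the output path; mkstemp creates the file atomically, unlike the deprecated tempfile.mktemp
    fd, output_path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)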
    try:
        print(f"🎤 Generating: {text[:30]}...")
        print(f"📝 Ref text: {ref_text[:50]}...")
        start = time.time()
        wavs, sr = _model.generate_voice_clone(
            text=text,
            language=language,
            ref_audio=ref_audio_path,
            ref_text=ref_text,
        )
        sf.write(output_path, wavs[0], sr)
        duration = len(wavs[0]) / sr
        print(f"✅ Generated in {time.time() - start:.1f}s, duration: {duration:.1f}s")
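        # Return the audio file and delete it once the response has been sent
        from starlette.background import BackgroundTask
        return FileResponse(
            output_path,
            media_type="audio/wav",
            filename="output.wav",
            background=BackgroundTask(os.unlink, output_path),
        )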
    except Exception as e:
        print(f"❌ Generation failed: {e}")
        raise HTTPException(status_code=500, detail=str(e))
    finally:
        # Clean up the temporary reference-audio file
        try:
            os.unlink(ref_audio_path)
        except OSError:
            pass
@app.on_event("shutdown")
 async def shutdown():
    """Clean up temporary files."""
    # Remove leftover WAV files from the temp directory
    import glob
    for f in glob.glob(os.path.join(tempfile.gettempdir(), "tmp*.wav")):
        try:
            os.unlink(f)
        except OSError:
            pass
 if __name__ == "__main__":
    uvicorn.run(
        app,
        host="0.0.0.0",
        port=8009,
        log_level="info"
    )
--- a/remotion/package-lock.json
+++ b/remotion/package-lock.json
--- a/remotion/package.json
+++ b/remotion/package.json
@@ -0,0 +1,24 @@
 {
  "name": "vigent-remotion",
  "version": "1.0.0",
  "description": "Remotion video composition for ViGent2 subtitles and titles",
  "scripts": {
    "start": "remotion studio",
    "build": "remotion bundle",
    "render": "npx ts-node render.ts"
  },
  "dependencies": {
    "remotion": "^4.0.0",
    "@remotion/bundler": "^4.0.0",
    "@remotion/renderer": "^4.0.0",
    "@remotion/cli": "^4.0.0",
    "@remotion/media-utils": "^4.0.0",
    "react": "^18.2.0",
    "react-dom": "^18.2.0"
  },
  "devDependencies": {
    "@types/node": "^20.0.0",
    "@types/react": "^18.2.0",
    "typescript": "^5.0.0",
    "ts-node": "^10.9.0"
  }
 }
--- a/remotion/render.ts
+++ b/remotion/render.ts
@@ -0,0 +1,171 @@
 /**
 * Remotion server-side rendering script
 * Renders a video from the command line
 *
 * Usage:
 * npx ts-node render.ts --video /path/to/video.mp4 --captions /path/to/captions.json --title "Video title" --output /path/to/output.mp4
 */
 import { bundle } from '@remotion/bundler';
 import { renderMedia, selectComposition } from '@remotion/renderer';
 import path from 'path';
 import fs from 'fs';
 interface RenderOptions {
  videoPath: string;
  captionsPath?: string;
  title?: string;
  titleDuration?: number;
  outputPath: string;
  fps?: number;
  enableSubtitles?: boolean;
  width?: number;
  height?: number;
 }
 async function parseArgs(): Promise<RenderOptions> {
  const args = process.argv.slice(2);
  const options: Partial<RenderOptions> = {};
  for (let i = 0; i < args.length; i += 2) {
    const key = args[i].replace('--', '');
    const value = args[i + 1];
    switch (key) {
      case 'video':
        options.videoPath = value;
        break;
      case 'captions':
        options.captionsPath = value;
        break;
      case 'title':
        options.title = value;
        break;
      case 'titleDuration':
        options.titleDuration = parseFloat(value);
        break;
      case 'output':
        options.outputPath = value;
        break;
      case 'fps':
        options.fps = parseInt(value, 10);
        break;
      case 'enableSubtitles':
        options.enableSubtitles = value === 'true';
        break;
    }
  }
  if (!options.videoPath || !options.outputPath) {
    console.error('Usage: npx ts-node render.ts --video <path> --output <path> [--captions <path>] [--title <text>] [--titleDuration <seconds>] [--fps <number>] [--enableSubtitles <true|false>]');
    process.exit(1);
  }
  return options as RenderOptions;
 }
 async function main() {
  const options = await parseArgs();
  const fps = options.fps || 25;
  console.log('Starting Remotion render...');
  console.log('Options:', JSON.stringify(options, null, 2));
  // Load caption data
  let captions = undefined;
  if (options.captionsPath && fs.existsSync(options.captionsPath)) {
    const captionsContent = fs.readFileSync(options.captionsPath, 'utf-8');
    captions = JSON.parse(captionsContent);
    console.log(`Loaded captions with ${captions.segments?.length || 0} segments`);
  }
  // Get the video duration and dimensions
  let durationInFrames = 300; // default: 12 seconds at 25 fps
  let videoWidth = 1280;
  let videoHeight = 720;
  try {
    // Use ffprobe to read the video duration (execFileSync passes the path as an argument, so no shell quoting is needed)
    const { execFileSync } = require('child_process');
    const ffprobeOutput = execFileSync(
      'ffprobe',
      ['-v', 'error', '-show_entries', 'format=duration', '-of', 'default=noprint_wrappers=1:nokey=1', options.videoPath],
      { encoding: 'utf-8' }
    );
    const durationInSeconds = parseFloat(ffprobeOutput.trim());
    durationInFrames = Math.ceil(durationInSeconds * fps);
    console.log(`Video duration: ${durationInSeconds}s (${durationInFrames} frames at ${fps}fps)`);
    // Use ffprobe to read the video dimensions
    const dimensionsOutput = execFileSync(
      'ffprobe',
      ['-v', 'error', '-select_streams', 'v:0', '-show_entries', 'stream=width,height', '-of', 'csv=s=x:p=0', options.videoPath],
      { encoding: 'utf-8' }
    );
    const [width, height] = dimensionsOutput.trim().split('x').map(Number);
    if (width && height) {
      videoWidth = width;
      videoHeight = height;
      console.log(`Video dimensions: ${videoWidth}x${videoHeight}`);
    }
  } catch (e) {
    console.warn('ffprobe failed; keeping defaults for any values not read:', e);
  }
  // Use the video's directory as publicDir and its file name as videoSrc
  const publicDir = path.dirname(path.resolve(options.videoPath));
  const videoFileName = path.basename(options.videoPath);
  console.log(`Public dir: ${publicDir}, Video file: ${videoFileName}`);
  // Bundle the Remotion project
  console.log('Bundling Remotion project...');
  const bundleLocation = await bundle({
    entryPoint: path.resolve(__dirname, './src/index.ts'),
    webpackOverride: (config) => config,
    publicDir,
  });
  // Select the composition
  const composition = await selectComposition({
    serveUrl: bundleLocation,
    id: 'ViGentVideo',
    inputProps: {
      videoSrc: videoFileName,
      captions,
      title: options.title,
      titleDuration: options.titleDuration || 3,
      enableSubtitles: options.enableSubtitles !== false,
    },
  });
  // Override duration and dimensions
  composition.durationInFrames = durationInFrames;
  composition.fps = fps;
  composition.width = videoWidth;
  composition.height = videoHeight;
  // Render the video
  console.log('Rendering video...');
  await renderMedia({
    composition,
    serveUrl: bundleLocation,
    codec: 'h264',
    outputLocation: options.outputPath,
    inputProps: {
      videoSrc: videoFileName,
      captions,
      title: options.title,
      titleDuration: options.titleDuration || 3,
      enableSubtitles: options.enableSubtitles !== false,
    },
    onProgress: ({ progress }) => {
      const percent = Math.round(progress * 100);
      process.stdout.write(`\rRendering: ${percent}%`);
    },
  });
  console.log('\nRender complete!');
  console.log(`Output: ${options.outputPath}`);
 }
 main().catch((err) => {
  console.error('Render failed:', err);
  process.exit(1);
 });
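Review note: in `main()`, `parseFloat` returns `NaN` rather than throwing when ffprobe prints something non-numeric (for example `N/A`), so the `catch` fallback would not apply and `durationInFrames` could become `NaN`. A minimal sketch of a guarded conversion is below. The helper name `durationToFrames` and its `fallbackFrames` parameter are hypothetical and not part of this PR.

```typescript
// Hypothetical helper: convert ffprobe's duration output to a frame count,
// falling back to a default when the output is not a positive number.
function durationToFrames(
  ffprobeOutput: string,
  fps: number,
  fallbackFrames = 300
): number {
  const seconds = parseFloat(ffprobeOutput.trim());
  // parseFloat yields NaN for empty or non-numeric output instead of throwing.
  if (!isFinite(seconds) || seconds <= 0) {
    return fallbackFrames;
  }
  // Round up so the final partial frame is not cut off.
  return Math.ceil(seconds * fps);
}
```

In `render.ts` this could replace the direct `Math.ceil(durationInSeconds * fps)` assignment, under the assumption that a 300-frame default is still the desired fallback.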
--- a/remotion/src/Root.tsx
+++ b/remotion/src/Root.tsx
@@ -0,0 +1,30 @@
 import React from 'react';
 import { Composition } from 'remotion';
 import { Video } from './Video';
 /**
 * Remotion root component
 * Defines the video composition configuration
 */
 export const RemotionRoot: React.FC = () => {
  return (
    <>
      <Composition
        id="ViGentVideo"
        component={Video}
        durationInFrames={300} // default; overridden by render.ts
        fps={25}
        width={1280}
        height={720}
        defaultProps={{
          videoSrc: '',
          audioSrc: undefined,
          captions: undefined,
          title: undefined,
          titleDuration: 3,
          enableSubtitles: true,
        }}
      />
    </>
  );
 };
--- a/remotion/src/Video.tsx
+++ b/remotion/src/Video.tsx
@@ -0,0 +1,45 @@
 import React from 'react';
 import { AbsoluteFill } from 'remotion';
 import { VideoLayer } from './components/VideoLayer';
 import { Title } from './components/Title';
 import { Subtitles } from './components/Subtitles';
 import { CaptionsData } from './utils/captions';
 export interface VideoProps {
  videoSrc: string;
  audioSrc?: string;
  captions?: CaptionsData;
  title?: string;
  titleDuration?: number;
  enableSubtitles?: boolean;
 }
 /**
 * Main video component
 * Composes the video layer, title layer, and subtitle layer
 */
 export const Video: React.FC<VideoProps> = ({
  videoSrc,
  audioSrc,
  captions,
  title,
  titleDuration = 3,
  enableSubtitles = true,
 }) => {
  return (
    <AbsoluteFill style={{ backgroundColor: 'black' }}>
      {/* Bottom layer: video */}
      <VideoLayer videoSrc={videoSrc} audioSrc={audioSrc} />
      {/* Middle layer: subtitles */}
      {enableSubtitles && captions && (
        <Subtitles captions={captions} />
      )}
      {/* Top layer: title */}
      {title && (
        <Title title={title} duration={titleDuration} />
      )}
    </AbsoluteFill>
  );
 };
--- a/remotion/src/components/Subtitles.tsx
+++ b/remotion/src/components/Subtitles.tsx
@@ -0,0 +1,87 @@
 import React from 'react';
 import { AbsoluteFill, useCurrentFrame, useVideoConfig } from 'remotion';
 import {
  CaptionsData,
  getCurrentSegment,
  getCurrentWordIndex,
 } from '../utils/captions';
 interface SubtitlesProps {
  captions: CaptionsData;
  highlightColor?: string;
  normalColor?: string;
  fontSize?: number;
 }
 /**
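 * Word-by-word highlighted subtitle component
 * Highlights subtitle words one at a time according to their timestamps (no background, outlined text only)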
 */
 export const Subtitles: React.FC<SubtitlesProps> = ({
  captions,
  highlightColor = '#FFFF00',
  normalColor = '#FFFFFF',
  fontSize = 52,
 }) => {
  const frame = useCurrentFrame();
  const { fps } = useVideoConfig();
  const currentTimeInSeconds = frame / fps;
  // Get the current segment
  const currentSegment = getCurrentSegment(captions, currentTimeInSeconds);
  if (!currentSegment || currentSegment.words.length === 0) {
    return null;
  }
  // Get the index of the currently highlighted word
  const currentWordIndex = getCurrentWordIndex(currentSegment, currentTimeInSeconds);
  return (
    <AbsoluteFill
      style={{
        justifyContent: 'flex-end',
        alignItems: 'center',
        paddingBottom: '6%',
      }}
    >
      <p
        style={{
          margin: 0,
          fontSize: `${fontSize}px`,
          fontFamily: '"PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Noto Sans SC", sans-serif',
          fontWeight: 800,
          lineHeight: 1.4,
          textAlign: 'center',
          maxWidth: '90%',
          wordBreak: 'keep-all',
          letterSpacing: '2px',
        }}
      >
        {currentSegment.words.map((word, index) => {
          const isHighlighted = index <= currentWordIndex;
          return (
            <span
              key={`${word.word}-${index}`}
              style={{
                color: isHighlighted ? highlightColor : normalColor,
                textShadow: `
                  -3px -3px 0 #000,
                  3px -3px 0 #000,
                  -3px 3px 0 #000,
                  3px 3px 0 #000,
                  0 0 12px rgba(0,0,0,0.9),
                  0 4px 8px rgba(0,0,0,0.6)
                `,
                transition: 'color 0.05s ease',
              }}
            >
              {word.word}
            </span>
          );
        })}
      </p>
    </AbsoluteFill>
  );
 };
--- a/remotion/src/components/Title.tsx
+++ b/remotion/src/components/Title.tsx
@@ -0,0 +1,93 @@
 import React from 'react';
 import {
  AbsoluteFill,
  interpolate,
  useCurrentFrame,
  useVideoConfig,
 } from 'remotion';
 interface TitleProps {
  title: string;
  duration?: number; // how long the title is shown (seconds)
  fadeOutStart?: number; // time at which the fade-out starts (seconds)
 }
 /**
 * Opening title component
 * Shows the title at the top of the video with fade-in and fade-out effects
 */
 export const Title: React.FC<TitleProps> = ({
  title,
  duration = 3,
  fadeOutStart = 2,
 }) => {
  const frame = useCurrentFrame();
  const { fps } = useVideoConfig();
  const currentTimeInSeconds = frame / fps;
  // Render nothing once the display duration has passed
  if (currentTimeInSeconds > duration) {
    return null;
  }
  // Fade-in (0 to 0.5 seconds)
  const fadeInOpacity = interpolate(
    currentTimeInSeconds,
    [0, 0.5],
    [0, 1],
    { extrapolateRight: 'clamp' }
  );
  // Fade-out
  const fadeOutOpacity = interpolate(
    currentTimeInSeconds,
    [fadeOutStart, duration],
    [1, 0],
    { extrapolateLeft: 'clamp', extrapolateRight: 'clamp' }
  );
  const opacity = Math.min(fadeInOpacity, fadeOutOpacity);
  // Slight slide-down animation
  const translateY = interpolate(
    currentTimeInSeconds,
    [0, 0.5],
    [-20, 0],
    { extrapolateRight: 'clamp' }
  );
  return (
    <AbsoluteFill
      style={{
        justifyContent: 'flex-start',
        alignItems: 'center',
        paddingTop: '6%',
        opacity,
      }}
    >
      <h1
        style={{
          transform: `translateY(${translateY}px)`,
          textAlign: 'center',
          color: '#FFFFFF',
          fontSize: '72px',
          fontWeight: 900,
          fontFamily: '"PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Noto Sans SC", sans-serif',
          textShadow: `
            0 0 10px rgba(0,0,0,0.9),
            0 0 20px rgba(0,0,0,0.7),
            0 4px 8px rgba(0,0,0,0.8),
            0 8px 16px rgba(0,0,0,0.5)
          `,
          margin: 0,
          padding: '0 5%',
          lineHeight: 1.3,
          letterSpacing: '4px',
        }}
      >
        {title}
      </h1>
    </AbsoluteFill>
  );
 };
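
The opacity logic in `Title.tsx` can be checked without Remotion. The sketch below replaces `interpolate` with a small clamped linear interpolation and evaluates the same fade-in and fade-out rules. `clampedLerp` and `titleOpacity` are hypothetical helpers for illustration, not part of this change. The stand-in clamps on both sides, which matches the component for all non-negative times.

```typescript
// Linear interpolation clamped on both sides; a simple stand-in for Remotion's interpolate().
function clampedLerp(
  t: number,
  inputRange: [number, number],
  outputRange: [number, number]
): number {
  const raw = (t - inputRange[0]) / (inputRange[1] - inputRange[0]);
  const ratio = Math.min(1, Math.max(0, raw));
  return outputRange[0] + ratio * (outputRange[1] - outputRange[0]);
}

// Hypothetical helper: title opacity at time t (seconds), using the defaults from Title.tsx.
function titleOpacity(t: number, duration: number = 3, fadeOutStart: number = 2): number {
  // The component returns null after `duration`, which is treated here as opacity 0.
  if (t > duration) {
    return 0;
  }
  const fadeIn = clampedLerp(t, [0, 0.5], [0, 1]);
  const fadeOut = clampedLerp(t, [fadeOutStart, duration], [1, 0]);
  // The smaller of the two values wins, so the fades never fight each other.
  return Math.min(fadeIn, fadeOut);
}

console.log(titleOpacity(0.25)); // halfway through the fade-in
console.log(titleOpacity(2.5)); // halfway through the fade-out
```

One consequence worth noting: the input range `[fadeOutStart, duration]` assumes `fadeOutStart < duration`. With the default `fadeOutStart` of 2, a `titleDuration` of 2 or less makes the range degenerate, so callers may want to guard against that.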
--- a/remotion/src/components/VideoLayer.tsx
+++ b/remotion/src/components/VideoLayer.tsx
@@ -0,0 +1,34 @@
 import React from 'react';
 import { AbsoluteFill, OffthreadVideo, Audio, staticFile } from 'remotion';
 interface VideoLayerProps {
  videoSrc: string;
  audioSrc?: string;
 }
 /**
 * Video layer component
 * Renders the base video and audio; the video loops automatically to match the audio length
 */
 export const VideoLayer: React.FC<VideoLayerProps> = ({
  videoSrc,
  audioSrc,
 }) => {
  // Use staticFile to load the video from publicDir
  const videoUrl = staticFile(videoSrc);
  return (
    <AbsoluteFill>
      <OffthreadVideo
        src={videoUrl}
        loop
        style={{
          width: '100%',
          height: '100%',
          objectFit: 'cover',
        }}
      />
      {audioSrc && <Audio src={staticFile(audioSrc)} />}
    </AbsoluteFill>
  );
 };
--- a/remotion/src/index.ts
+++ b/remotion/src/index.ts
@@ -0,0 +1,4 @@
 import { registerRoot } from 'remotion';
 import { RemotionRoot } from './Root';
 registerRoot(RemotionRoot);
--- a/remotion/src/utils/captions.ts
+++ b/remotion/src/utils/captions.ts
@@ -0,0 +1,66 @@
 /**
 * Caption data type definitions and helper utilities
 */
 export interface WordTimestamp {
  word: string;
  start: number;
  end: number;
 }
 export interface Segment {
  text: string;
  start: number;
  end: number;
  words: WordTimestamp[];
 }
 export interface CaptionsData {
  segments: Segment[];
 }
 /**
 * Returns the caption segment that should be shown at the current time
 */
 export function getCurrentSegment(
  captions: CaptionsData,
  currentTimeInSeconds: number
 ): Segment | null {
  for (const segment of captions.segments) {
    if (currentTimeInSeconds >= segment.start && currentTimeInSeconds <= segment.end) {
      return segment;
    }
  }
  return null;
 }
 /**
 * Returns the index of the word highlighted at the current time
 */
 export function getCurrentWordIndex(
  segment: Segment,
  currentTimeInSeconds: number
 ): number {
  for (let i = 0; i < segment.words.length; i++) {
    const word = segment.words[i];
    if (currentTimeInSeconds >= word.start && currentTimeInSeconds <= word.end) {
      return i;
    }
    // If the current time falls between two words, return the earlier word
    if (i < segment.words.length - 1) {
      const nextWord = segment.words[i + 1];
      if (currentTimeInSeconds > word.end && currentTimeInSeconds < nextWord.start) {
        return i;
      }
    }
  }
  // If the current time is past the end of the last word, return the last word
  if (segment.words.length > 0) {
    const lastWord = segment.words[segment.words.length - 1];
    if (currentTimeInSeconds >= lastWord.end) {
      return segment.words.length - 1;
    }
  }
  return -1;
 }
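
To illustrate how the caption helpers drive the highlight in `Subtitles.tsx`, here is a small self-contained walk-through with made-up sample data. `findSegment` and `findWordIndex` are simplified, hypothetical stand-ins for `getCurrentSegment` and `getCurrentWordIndex`. They assume words are sorted and do not overlap, and may resolve exact boundary ties differently from the functions in the diff.

```typescript
interface WordTimestamp { word: string; start: number; end: number; }
interface Segment { text: string; start: number; end: number; words: WordTimestamp[]; }
interface CaptionsData { segments: Segment[]; }

// Stand-in for getCurrentSegment: first segment whose [start, end] contains t.
function findSegment(captions: CaptionsData, t: number): Segment | null {
  for (let i = 0; i < captions.segments.length; i++) {
    const s = captions.segments[i];
    if (t >= s.start && t <= s.end) {
      return s;
    }
  }
  return null;
}

// Stand-in for getCurrentWordIndex: index of the last word that has started by t, or -1.
function findWordIndex(segment: Segment, t: number): number {
  let index = -1;
  for (let i = 0; i < segment.words.length; i++) {
    if (t >= segment.words[i].start) {
      index = i;
    }
  }
  return index;
}

// Made-up sample data: one segment with four timed words.
const sample: CaptionsData = {
  segments: [{
    text: '你好世界', start: 0, end: 2,
    words: [
      { word: '你', start: 0.0, end: 0.4 },
      { word: '好', start: 0.5, end: 0.9 },
      { word: '世', start: 1.0, end: 1.4 },
      { word: '界', start: 1.5, end: 2.0 },
    ],
  }],
};

// Frame 30 at 25 fps is 1.2 s, the same frame / fps conversion Subtitles.tsx uses.
const t = 30 / 25;
const seg = findSegment(sample, t);
const idx = seg ? findWordIndex(seg, t) : -1;
// Subtitles.tsx highlights every word with index <= the current word index.
let highlighted = '';
if (seg) {
  for (let i = 0; i <= idx; i++) {
    highlighted += seg.words[i].word;
  }
}
console.log(idx, highlighted);
```

Because every word up to and including the current index is highlighted, the text fills in progressively rather than lighting up one word at a time.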
--- a/remotion/tsconfig.json
+++ b/remotion/tsconfig.json
@@ -0,0 +1,19 @@
 {
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "lib": ["ES2020", "DOM"],
    "jsx": "react-jsx",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "resolveJsonModule": true,
    "declaration": true,
    "declarationMap": true,
    "outDir": "./dist",
    "rootDir": "."
  },
  "include": ["src/**/*", "render.ts"],
  "exclude": ["node_modules", "dist"]
 }
--- a/run_qwen_tts.sh
+++ b/run_qwen_tts.sh
@@ -0,0 +1,9 @@
 #!/bin/bash
 # Startup script for the Qwen3-TTS voice cloning service
 # Port: 8009
 # GPU: 0
 cd /home/rongye/ProgramFiles/ViGent2/models/Qwen3-TTS
 # Use the Python interpreter from the qwen-tts conda environment
 /home/rongye/ProgramFiles/miniconda3/envs/qwen-tts/bin/python qwen_tts_server.py
--- a/run_watchdog.sh
+++ b/run_watchdog.sh
@@ -0,0 +1,15 @@
 #!/bin/bash
 # Start the ViGent2 service watchdog
 # Monitors the health of the Qwen-TTS and LatentSync services
 cd "$(dirname "$0")"
 # Use the Python interpreter from the backend virtual environment (includes httpx and other dependencies)
 PYTHON_PATH="./backend/venv/bin/python"
 if [ -f "$PYTHON_PATH" ]; then
    "$PYTHON_PATH" backend/scripts/watchdog.py
 else
    echo "❌ Error: Python interpreter not found: $PYTHON_PATH"
    exit 1
 fi
Author	SHA1	Message	Date
Kevin Wong	33d8e52802	Update	2026-02-03 17:42:04 +08:00
Kevin Wong	9af50a9066	Update	2026-02-03 17:15:35 +08:00
Kevin Wong	6c6fbae13a	Update	2026-02-03 17:12:30 +08:00
Kevin Wong	cb10da52fc	Update	2026-02-03 13:46:52 +08:00
Kevin Wong	eb3ed23326	Update	2026-02-02 17:34:36 +08:00
Kevin Wong	6e58f4bbe7	Update	2026-02-02 17:16:07 +08:00
Kevin Wong	7bfd6bf862	Update	2026-02-02 14:28:48 +08:00
Kevin Wong	569736d05b	Update code	2026-02-02 11:49:22 +08:00
Kevin Wong	ec16e08bdb	Update code	2026-02-02 10:58:21 +08:00
Kevin Wong	6801d3e8aa	Update code	2026-02-02 10:51:27 +08:00
Kevin Wong	cf679b34bf	Update	2026-01-29 17:58:07 +08:00
Kevin Wong	b74bacb0b5	Update	2026-01-29 17:54:43 +08:00
Kevin Wong	661a8f357c	Update	2026-01-29 12:16:41 +08:00