Update

2026-02-27 16:11:34 +08:00 · 2026-02-26 11:13:03 +08:00 · 2026-02-26 10:49:22 +08:00 · 2026-02-26 10:14:41 +08:00 · 2026-02-25 17:51:58 +08:00 · 2026-02-24 16:55:29 +08:00
443 changed files with 216914 additions and 1463 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -40,6 +40,7 @@ backend/uploads/
 backend/cookies/
 backend/user_data/
 backend/debug_screenshots/
+backend/keys/
 *_cookies.json

 # ============ Model weights ============
--- a/Docs/ALIPAY_DEPLOY.md
+++ b/Docs/ALIPAY_DEPLOY.md
@@ -0,0 +1,278 @@
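+# Alipay Paid Membership Activation: Deployment Guide
+
+This document covers the full deployment process for Alipay PC Website Payment (电脑网站支付). After registering, a user pays through Alipay and the membership is activated automatically, valid for 1 year.
+
+---
+
+## Prerequisites
+
+- An Alipay enterprise or sole-proprietor merchant account
+- An application created on the [Alipay Open Platform](https://open.alipay.com), with its APPID
+- The application has the **PC Website Payment (电脑网站支付)** product enabled (the `alipay.trade.page.pay` API)
+- The server domain is configured for HTTPS (Alipay callbacks require a publicly reachable address)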
+
+---
+
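+## Part 1: Alipay Open Platform Configuration
+
+### 1. Create an application
+
+Log in to https://open.alipay.com → Console → Create application (or use an existing application).
+
+### 2. Enable the PC Website Payment (电脑网站支付) product
+
+Open the application details → Product binding / Product management → add **PC Website Payment (电脑网站支付)** → submit for review.
+
+> **Note**: If this product is not enabled, requests fail with `ACQ.ACCESS_FORBIDDEN`.
+
+### 3. Generate a key pair
+
+Open the application details → Development settings → API signing method → select **RSA2(SHA256)**:
+
+1. Generate an RSA2048 key pair with Alipay's official key tool
+2. Upload the **application public key** to the Open Platform
+3. After the upload, the platform displays the **Alipay public key** (`alipayPublicKey_RSA2`)
+
+You end up with two things:
+- **Application private key**: kept locally; the code uses it to sign requests
+- **Alipay public key**: returned by the platform; the code uses it to verify callback signatures
+
+> The application public key is only an intermediate artifact for the upload; the code does not need it.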
+
+---
+
+## Part 2: Server Configuration
+
+### 1. Place the key files
+
+Save the keys in standard PEM format under the `backend/keys/` directory:
+
+```bash
+mkdir -p /home/rongye/ProgramFiles/ViGent2/backend/keys
+```
+
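+**`backend/keys/app_private_key.pem`** (application private key):
+
+```
+-----BEGIN PRIVATE KEY-----
+MIIEvQIBADANBgkqhkiG9w0BAQEFAASC... (your private key content)
+...
+-----END PRIVATE KEY-----
+```
+
+**`backend/keys/alipay_public_key.pem`** (Alipay public key):
+
+```
+-----BEGIN PUBLIC KEY-----
+MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8A... (Alipay public key content)
+...
+-----END PUBLIC KEY-----
+```
+
+#### PEM format requirements
+
+The Alipay key tool exports a single line of plain text, which must be converted to standard PEM format:
+
+- Header and footer markers are required (`-----BEGIN/END ...-----`)
+- The key body must wrap every 64 characters
+- The private key header is `-----BEGIN PRIVATE KEY-----` (PKCS#8 format)
+- The public key header is `-----BEGIN PUBLIC KEY-----`
+
+If you have a bare single-line key, convert it with the following commands: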
+
+```bash
+# Format the private key (assuming the bare key is in raw_private.txt)
+echo "-----BEGIN PRIVATE KEY-----" > app_private_key.pem
+fold -w 64 raw_private.txt >> app_private_key.pem
+echo "-----END PRIVATE KEY-----" >> app_private_key.pem
+
+# Format the public key (assuming the bare key is in raw_public.txt)
+echo "-----BEGIN PUBLIC KEY-----" > alipay_public_key.pem
+fold -w 64 raw_public.txt >> alipay_public_key.pem
+echo "-----END PUBLIC KEY-----" >> alipay_public_key.pem
+```
+
+> The `backend/keys/` directory is listed in `.gitignore` and will not be committed to the repository.
+
+### 2. Configure environment variables
+
+Add the following to `backend/.env`:
+
+```ini
+# =============== Alipay configuration ===============
+ALIPAY_APP_ID=your_application_APPID
+ALIPAY_PRIVATE_KEY_PATH=/home/rongye/ProgramFiles/ViGent2/backend/keys/app_private_key.pem
+ALIPAY_PUBLIC_KEY_PATH=/home/rongye/ProgramFiles/ViGent2/backend/keys/alipay_public_key.pem
+ALIPAY_NOTIFY_URL=https://vigent.hbyrkj.top/api/payment/notify
+ALIPAY_RETURN_URL=https://vigent.hbyrkj.top/pay
+```
+
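+| Variable | Description |
+|------|------|
+| `ALIPAY_APP_ID` | Alipay Open Platform application APPID |
+| `ALIPAY_PRIVATE_KEY_PATH` | Absolute path to the application private key PEM file |
+| `ALIPAY_PUBLIC_KEY_PATH` | Absolute path to the Alipay public key PEM file |
+| `ALIPAY_NOTIFY_URL` | Async callback URL (server-to-server); must be reachable over public HTTPS |
+| `ALIPAY_RETURN_URL` | Synchronous return URL (the page the browser is redirected to after the user pays) |
+
+`config.py` also has a few tunable parameters (they have defaults and usually do not need to be added to `.env`):
+
+| Variable | Default | Description |
+|------|--------|------|
+| `ALIPAY_SANDBOX` | `false` | Whether to use the sandbox environment |
+| `PAYMENT_AMOUNT` | `999.00` | Membership price (CNY) |
+| `PAYMENT_EXPIRE_DAYS` | `365` | Membership validity in days |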
+
+### 3. Create the database table
+
+Run the following against the local Supabase instance via Docker:
+
+```bash
+docker exec -i supabase-db psql -U postgres -c "
+CREATE TABLE IF NOT EXISTS orders (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
+    out_trade_no TEXT UNIQUE NOT NULL,
+    amount DECIMAL(10, 2) NOT NULL DEFAULT 999.00,
+    status TEXT DEFAULT 'pending' CHECK (status IN ('pending', 'paid', 'failed')),
+    trade_no TEXT,
+    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
+    paid_at TIMESTAMP WITH TIME ZONE
+);
+
+CREATE INDEX IF NOT EXISTS idx_orders_user_id ON orders(user_id);
+CREATE INDEX IF NOT EXISTS idx_orders_out_trade_no ON orders(out_trade_no);
+"
+```
+
+### 4. Install dependencies
+
+```bash
+# Backend (inside the venv)
+cd /home/rongye/ProgramFiles/ViGent2/backend
+venv/bin/pip install python-alipay-sdk
+```
+
+> The frontend needs no additional dependencies.
+
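+### 5. Nginx configuration
+
+Make sure Nginx proxies `/api/payment/notify` to the backend. If the existing configuration already covers the `/api/` prefix, no further change is needed:
+
+```nginx
+location /api/ {
+    proxy_pass http://localhost:8006;
+    # ... existing configuration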
+}
+```
+
+### 6. Restart the services
+
+```bash
+# Build the frontend
+cd /home/rongye/ProgramFiles/ViGent2/frontend
+npx next build
+
+# Restart
+pm2 restart vigent2-backend
+pm2 restart vigent2-frontend
+```
+
+---
+
+## Part 3: Going Live
+
+Once testing passes, change the test amount in `backend/app/core/config.py` to the production price:
+
+```python
+PAYMENT_AMOUNT: float = 999.00      # production price
+```
+
+Or override it in `backend/.env`:
+
+```ini
+PAYMENT_AMOUNT=999.00
+```
+
+Then restart the backend:
+
+```bash
+pm2 restart vigent2-backend
+```
+
+---
+
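+## Payment Flow
+
+```
+User registers → logs in (password correct but is_active=false)
+         → backend returns 403 + payment_token
+         → frontend redirects to the /pay page
+         → POST /api/payment/create-order → returns the Alipay cashier URL
+         → frontend redirects to the Alipay cashier page (supports QR code, account login, balance, and other payment methods)
+         → user completes the payment
+         → Alipay sends the async callback POST /api/payment/notify
+         → backend verifies the signature → updates the order → activates the user (is_active=true, expires_at=+365 days)
+         → Alipay redirects the browser back to /pay?out_trade_no=xxx
+         → frontend polls GET /api/payment/status/{out_trade_no}
+         → poll returns paid → show success → redirect to the login page
+         → user logs in again → enters the system
+```
+
+**PC Website Payment vs. Face-to-Face Payment**: PC Website Payment (`alipay.trade.page.pay`) redirects to Alipay's official cashier page, where the user can pay by QR code, Alipay account login, balance, and other methods, which gives a better experience. Face-to-Face Payment (`alipay.trade.precreate`) only generates a QR code and supports scan-to-pay only.
+
+Renewal after expiry follows the same flow: login detects the expiry → returns PAYMENT_REQUIRED → redirects to /pay.
+
+Manual activation by an administrator is unaffected; both methods coexist.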
+
+---
+
+## Files Involved
+
+| File | Change type | Description |
+|------|---------|------|
+| `backend/requirements.txt` | Modified | Added `python-alipay-sdk` |
+| `backend/database/schema.sql` | Modified | Added the `orders` table |
+| `backend/app/core/config.py` | Modified | Alipay configuration options |
+| `backend/app/core/security.py` | Modified | payment_token functions |
+| `backend/app/core/deps.py` | Modified | is_active safety fallback |
+| `backend/app/repositories/orders.py` | New | orders data layer |
+| `backend/app/modules/payment/__init__.py` | New | Module initialization |
+| `backend/app/modules/payment/schemas.py` | New | Request/response models |
+| `backend/app/modules/payment/service.py` | New | Payment business logic (PC Website Payment) |
+| `backend/app/modules/payment/router.py` | New | 3 API endpoints |
+| `backend/app/modules/auth/router.py` | Modified | Login returns PAYMENT_REQUIRED |
+| `backend/app/main.py` | Modified | Registers payment_router |
+| `backend/.env` | Modified | Alipay environment variables |
+| `backend/keys/` | New | PEM key files |
+| `frontend/src/shared/lib/auth.ts` | Modified | login() handles paymentToken |
+| `frontend/src/shared/api/axios.ts` | Modified | Added /pay to PUBLIC_PATHS |
+| `frontend/src/app/login/page.tsx` | Modified | paymentToken redirect |
+| `frontend/src/app/register/page.tsx` | Modified | Registration success message text |
+| `frontend/src/app/pay/page.tsx` | New | Payment page (redirects to the Alipay cashier) |
+
+---
+
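+## FAQ
+
+### RSA key format is not supported
+
+The key file is missing the PEM header/footer markers or is not wrapped at 64 characters. Reformat it as described in "PEM format requirements".
+
+### ACQ.ACCESS_FORBIDDEN
+
+The application has not enabled the PC Website Payment (电脑网站支付) product. Add and enable it under Alipay Open Platform → application details → Product management.
+
+### Alipay callback never arrives
+
+1. Check that `ALIPAY_NOTIFY_URL` is reachable over public HTTPS
+2. Check that Nginx proxies `/api/payment/notify` to the backend
+3. When a callback times out (no response within 15 s), Alipay retries it, 8 times in total over 24 hours
+
+### Page does not redirect back after payment
+
+Check that `ALIPAY_RETURN_URL` is correct. It must be the full URL of the frontend `/pay` page (for example `https://vigent.hbyrkj.top/pay`). After the user pays, Alipay redirects the browser to this address and appends parameters such as `out_trade_no`.
+
+### Frontend shows "Network error" instead of the specific error
+
+The API functions lacked a try/catch around axios exceptions. This is fixed in `register()` and `login()` in `auth.ts`.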
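Review note on `Docs/ALIPAY_DEPLOY.md`: the PEM format requirements in this guide (header and footer markers, body wrapped at 64 characters) can also be applied from Python instead of `fold`. The following is a minimal sketch under stated assumptions, not code from this commit: `wrap_pem` is a hypothetical helper name, and the input is assumed to be the single-line base64 text exported by the Alipay key tool.

```python
def wrap_pem(raw_key: str, label: str) -> str:
    """Wrap a bare base64 key body into PEM text (hypothetical helper)."""
    # Remove any whitespace or newlines that came along with the pasted key.
    body = "".join(raw_key.split())
    # Split the body into 64-character lines, as the guide requires.
    lines = [body[i:i + 64] for i in range(0, len(body), 64)]
    header = f"-----BEGIN {label}-----"
    footer = f"-----END {label}-----"
    return "\n".join([header, *lines, footer]) + "\n"


if __name__ == "__main__":
    # Placeholder body only; a real key comes from the Alipay key tool.
    print(wrap_pem("A" * 130, "PRIVATE KEY"), end="")
```

Unlike the shell version, this strips stray whitespace first, so a missing trailing newline in the raw file cannot glue the footer onto the last key line. Use the label `PRIVATE KEY` for the application private key and `PUBLIC KEY` for the Alipay public key.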
--- a/Docs/BACKEND_DEV.md
+++ b/Docs/BACKEND_DEV.md
@@ -33,11 +33,13 @@ backend/
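 │   │   ├── materials/       # material management (router/schemas/service)
 │   │   ├── publish/         # multi-platform publishing
 │   │   ├── auth/            # authentication and sessions
-│   │   ├── ai/              # AI features (title/tag generation, etc.)
+│   │   ├── ai/              # AI features (title/tag generation, multilingual translation)
 │   │   ├── assets/          # static assets (fonts/styles/BGM)
 │   │   ├── ref_audios/      # voice-clone reference audio (router/schemas/service)
+│   │   ├── generated_audios/ # pre-generated voiceover management (router/schemas/service)
 │   │   ├── login_helper/    # QR-code login helper
 │   │   ├── tools/           # utility endpoints (router/schemas/service)
+│   │   ├── payment/         # paid activation via Alipay (router/schemas/service)
 │   │   └── admin/           # admin features
 │   ├── repositories/        # Supabase data access
 │   ├── services/            # external service integrations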
@@ -73,6 +75,18 @@ backend/
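 - Errors are raised via `HTTPException` and returned uniformly by the global exception handler as `{success:false, message, code}`.
 - `detail` is no longer used as the frontend error text (the frontend now reads `message`).

+### `/api/videos/generate` parameter contract (key conventions)
+
+- Each `custom_assignments` item uses `material_path/start/end/source_start/source_end?`, and the visible timeline segment is authoritative.
+- `output_aspect_ratio` accepts only `9:16` / `16:9` and defaults to `9:16`.
+- Title display mode parameters:
+  - `title_display_mode`: `short` / `persistent` (default `short`)
+  - `title_duration`: default `4.0` (seconds); takes effect only in `short` mode
+- Intro secondary-title parameters:
+  - `secondary_title`: secondary title text (optional, max 20 characters); shown only in the video frame and not part of the published title
+  - `secondary_title_style_id` / `secondary_title_font_size` / `secondary_title_top_margin`: secondary title style settings
+- The workflow/remotion side must pass these fields through consistently to avoid semantic drift between frontend and backend.
+
 ---

 ## 4. Authentication and Permissions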
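Review note on the `/api/videos/generate` parameter contract added in this hunk: the allowed values and defaults could be enforced in one place along the following lines. This is only a sketch under stated assumptions. The project's real schema lives in `schemas.py`, whose contents are not shown in this diff, so `GenerateOptions` and its constants are hypothetical names, and a plain dataclass stands in for whatever model class the backend actually uses.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_ASPECT_RATIOS = ("9:16", "16:9")
ALLOWED_TITLE_MODES = ("short", "persistent")
MAX_SECONDARY_TITLE_LEN = 20  # "max 20 characters" from the contract


@dataclass
class GenerateOptions:
    """Hypothetical stand-in for a subset of the generate request fields."""

    output_aspect_ratio: str = "9:16"
    title_display_mode: str = "short"
    title_duration: float = 4.0  # seconds; only meaningful in "short" mode
    secondary_title: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject values outside the documented contract early.
        if self.output_aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(f"output_aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}")
        if self.title_display_mode not in ALLOWED_TITLE_MODES:
            raise ValueError(f"title_display_mode must be one of {ALLOWED_TITLE_MODES}")
        if self.secondary_title is not None and len(self.secondary_title) > MAX_SECONDARY_TITLE_LEN:
            raise ValueError("secondary_title is limited to 20 characters")
```

Constructing `GenerateOptions()` with no arguments yields the documented defaults, and an unsupported ratio such as `"4:3"` raises `ValueError` before any generation work starts.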
@@ -142,6 +156,14 @@ backend/user_data/{user_uuid}/cookies/
 - `LATENTSYNC_*`
 - `CORS_ORIGINS` (CORS allowlist, default *)

+### MuseTalk / hybrid lip sync
+- `MUSETALK_GPU_ID` (GPU index, default 0)
+- `MUSETALK_API_URL` (resident service URL, default http://localhost:8011)
+- `MUSETALK_BATCH_SIZE` (inference batch size, default 32)
+- `MUSETALK_VERSION` (v15)
+- `MUSETALK_USE_FLOAT16` (half precision, default true)
+- `LIPSYNC_DURATION_THRESHOLD` (seconds; MuseTalk is used at or above this value, default 120)
+
 ### WeChat Channels (微信视频号)
 - `WEIXIN_HEADLESS_MODE` (headful/headless-new)
 - `WEIXIN_CHROME_PATH` / `WEIXIN_BROWSER_CHANNEL`
@@ -156,7 +178,13 @@ backend/user_data/{user_uuid}/cookies/
 - `DOUYIN_LOCALE` / `DOUYIN_TIMEZONE_ID`
 - `DOUYIN_FORCE_SWIFTSHADER`
 - `DOUYIN_DEBUG_ARTIFACTS` / `DOUYIN_RECORD_VIDEO` / `DOUYIN_KEEP_SUCCESS_VIDEO`
- `DOUYIN_COOKIE` (抖音视频下载 Cookie)
+
+### Alipay
+- `ALIPAY_APP_ID` / `ALIPAY_PRIVATE_KEY_PATH` / `ALIPAY_PUBLIC_KEY_PATH`
+- `ALIPAY_NOTIFY_URL` / `ALIPAY_RETURN_URL`
+- `ALIPAY_SANDBOX` (sandbox mode, default false)
+- `PAYMENT_AMOUNT` (membership price, default 999.00)
+- `PAYMENT_EXPIRE_DAYS` (membership validity in days, default 365)

 ---

--- a/Docs/BACKEND_README.md
+++ b/Docs/BACKEND_README.md
@@ -19,11 +19,13 @@ backend/
 │   │   ├── materials/    # material management (router/schemas/service)
 │   │   ├── publish/      # multi-platform publishing
 │   │   ├── auth/         # authentication and sessions
-│   │   ├── ai/           # AI features (title/tag generation)
+│   │   ├── ai/              # AI features (title/tag generation, multilingual translation)
 │   │   ├── assets/          # static assets (fonts/styles/BGM)
 │   │   ├── ref_audios/      # voice-clone reference audio (router/schemas/service)
+│   │   ├── generated_audios/ # pre-generated voiceover management (router/schemas/service)
 │   │   ├── login_helper/    # QR-code login helper
 │   │   ├── tools/           # utility endpoints (router/schemas/service)
+│   │   ├── payment/         # paid activation via Alipay (router/schemas/service)
 │   │   └── admin/           # admin features
 │   ├── repositories/     # Supabase data access
 │   ├── services/         # external service integrations (TTS/Remotion/Storage/Uploader, etc.)
@@ -50,6 +52,8 @@ backend/
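    *   `POST /api/auth/register`: register a user
    *   `GET /api/auth/me`: get the current user's information

+> License expiry policy: at login and when authenticating protected endpoints, the backend checks `users.expires_at`. An expired account is automatically deactivated (`is_active=false`), its sessions are cleared, and the API returns `403: 会员已到期，请续费` (membership expired, please renew).
+
 2.  **Video generation (Videos)**
    *   `POST /api/videos/generate`: submit a generation task
    *   `GET /api/videos/tasks/{task_id}`: query the status of a single task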
@@ -76,20 +80,36 @@ backend/
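    *   `GET /api/assets/bgm`: background music list

 6.  **Voice cloning (Ref Audios)**
-    *   `POST /api/ref-audios`: upload reference audio (multipart/form-data)
+    *   `POST /api/ref-audios`: upload reference audio (multipart/form-data; `ref_text` is transcribed automatically by Whisper)
    *   `GET /api/ref-audios`: list reference audio
    *   `PUT /api/ref-audios/{id}`: rename reference audio
    *   `DELETE /api/ref-audios/{id}`: delete reference audio
+    *   `POST /api/ref-audios/{id}/retranscribe`: re-transcribe the reference audio text (Whisper transcription; audio over 10 s is trimmed automatically)

 7.  **AI features (AI)**
    *   `POST /api/ai/generate-meta`: generate a title and tags with AI
+    *   `POST /api/ai/translate`: AI multilingual translation (9 target languages supported)

-8.  **Tools**
+8.  **Pre-generated voiceovers (Generated Audios)**
+    *   `POST /api/generated-audios/generate`: generate a voiceover asynchronously (returns task_id)
+    *   `GET /api/generated-audios/tasks/{task_id}`: poll generation progress
+    *   `GET /api/generated-audios`: list all of the user's voiceovers
+    *   `DELETE /api/generated-audios/{audio_id}`: delete a voiceover
+    *   `PUT /api/generated-audios/{audio_id}`: rename a voiceover
+
+9.  **Tools**
    *   `POST /api/tools/extract-script`: extract the script from a video link

-9.  **Health checks**
-    *   `GET /api/lipsync/health`: LatentSync service health
-    *   `GET /api/voiceclone/health`: Qwen3-TTS service health
+10. **Health checks**
+    *   `GET /api/lipsync/health`: lip-sync service health (covers LatentSync + MuseTalk + the hybrid routing threshold)
+    *   `GET /api/voiceclone/health`: CosyVoice 3.0 service health
+
+11. **Payment**
+    *   `POST /api/payment/create-order`: create an Alipay PC Website Payment order (requires payment_token)
+    *   `POST /api/payment/notify`: Alipay async notification callback (returns plain-text success/fail)
+    *   `GET /api/payment/status/{out_trade_no}`: query the order payment status (polled by the frontend)
+
+> If the account is inactive or expired at login, the backend returns 403 + `payment_token` and the frontend redirects to the `/pay` page to complete payment. See the [Alipay Deployment Guide](ALIPAY_DEPLOY.md).

 ### Unified response structure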

@@ -108,20 +128,39 @@ backend/

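 `POST /api/videos/generate` supports the following optional fields:

+- `material_path`: video material path (single-material mode)
+- `material_paths`: array of material paths (multi-camera mode; with ≥2 materials the material switches automatically per sentence)
 - `tts_mode`: TTS mode (`edgetts` / `voiceclone`)
 - `voice`: EdgeTTS voice ID (edgetts mode)
 - `ref_audio_id` / `ref_text`: reference audio ID and text (voiceclone mode)
+- `generated_audio_id`: pre-generated voiceover ID (when present, inline TTS is skipped and the already-generated audio file is used)
+- `speed`: speech rate (voice-clone mode, default 1.0, range 0.8-1.2)
+- `custom_assignments`: array of custom material assignments (each item has `material_path` / `start` / `end` / `source_start` / `source_end?`); when present, generation follows the visible timeline segments first
+- `output_aspect_ratio`: output aspect ratio (`9:16` or `16:9`, default `9:16`)
+- `language`: TTS language (auto-detected by default; passed through to CosyVoice 3.0 in voice-clone mode)
 - `title`: intro title text
+- `title_display_mode`: title display mode (`short` / `persistent`, default `short`)
+- `title_duration`: title display duration (seconds, default `4.0`; applies in `short` mode)
 - `subtitle_style_id`: subtitle style ID
 - `title_style_id`: title style ID
 - `subtitle_font_size`: subtitle font size (overrides the style default)
 - `title_font_size`: title font size (overrides the style default)
 - `title_top_margin`: title distance from the top, in pixels
+- `secondary_title`: intro secondary title text (optional, max 20 characters, shown only in the video frame)
+- `secondary_title_style_id`: secondary title style ID
+- `secondary_title_font_size`: secondary title font size
+- `secondary_title_top_margin`: spacing between the secondary title and the main title
 - `subtitle_bottom_margin`: subtitle distance from the bottom, in pixels
 - `enable_subtitles`: whether subtitles are enabled
 - `bgm_id`: background music ID
 - `bgm_volume`: background music volume (0-1, default 0.2)

+### Multi-material stability notes
+
+- Multi-material clips are uniformly re-encoded before concatenation and forced to `25fps + CFR`, which reduces stutter caused by mismatched time bases at segment boundaries.
+- The concat step enables `+genpts` to rebuild timestamps, improving timeline continuity after concatenation.
+- MOV material with rotation metadata is first normalized for orientation before the resolution check and later stages.
+
 ## 📦 Asset library and static assets

 - Local asset directory: `backend/assets/{fonts,bgm,styles}`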
@@ -163,6 +202,12 @@ GLM_API_KEY=your_glm_api_key

 # LatentSync configuration
 LATENTSYNC_GPU_ID=1
+
+# MuseTalk configuration (long-video lip sync)
+MUSETALK_GPU_ID=0
+MUSETALK_API_URL=http://localhost:8011
+MUSETALK_BATCH_SIZE=32
+LIPSYNC_DURATION_THRESHOLD=120
 ```

 ### 4. Start the service
@@ -185,6 +230,14 @@ uvicorn app.main:app --host 0.0.0.0 --port 8006 --reload
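 3.  **Important**: if the model occupies the GPU, be sure to use `asyncio.Lock` for concurrency control to prevent OOM.
 4.  Create the corresponding module under `app/modules/`, add router/service/schemas, and register the router in `main.py`.

+### Hybrid lip-sync routing
+
+`lipsync_service.py` implements hybrid LatentSync + MuseTalk routing:
+- Short videos (<`LIPSYNC_DURATION_THRESHOLD`s) → LatentSync 1.6 (GPU1, port 8007)
+- Long videos (>= threshold) → MuseTalk 1.5 (GPU0, port 8011)
+- Falls back to LatentSync automatically when MuseTalk is unavailable
+- The routing logic is fully transparent to the workflow
+
 ### Adding scheduled tasks

 **APScheduler** or **Crontab** is currently the recommended way to manage scheduled tasks.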
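Review note on the hybrid lip-sync routing described in `Docs/BACKEND_README.md`: the decision rule (threshold comparison plus fallback) is small enough to state as a pure function. The sketch below is an illustration under stated assumptions and is not the code in `lipsync_service.py`, which this diff does not show; `choose_lipsync_backend` and its parameters are hypothetical names, and availability is passed in as a flag rather than probed over HTTP.

```python
def choose_lipsync_backend(
    duration_sec: float,
    threshold_sec: float = 120.0,  # mirrors the LIPSYNC_DURATION_THRESHOLD default
    musetalk_available: bool = True,
) -> str:
    """Return "musetalk" or "latentsync" for a video of the given length."""
    # Long videos (>= threshold) go to MuseTalk, but only if it is reachable.
    if duration_sec >= threshold_sec and musetalk_available:
        return "musetalk"
    # Short videos, and long videos when MuseTalk is down, use LatentSync.
    return "latentsync"


if __name__ == "__main__":
    for seconds in (30.0, 120.0, 600.0):
        print(seconds, choose_lipsync_backend(seconds))
```

Keeping the rule free of I/O makes the `>=` boundary at the threshold easy to unit-test independently of either GPU service.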
--- a/Docs/COSYVOICE3_DEPLOY.md
+++ b/Docs/COSYVOICE3_DEPLOY.md
@@ -0,0 +1,212 @@
+# CosyVoice 3.0 Deployment Guide
+
+## Overview
+
+| Item | Value |
+|------|------|
+| Model | Fun-CosyVoice3-0.5B-2512 (0.5B parameters) |
+| Port | 8010 |
+| GPU | 0 (CUDA_VISIBLE_DEVICES=0) |
+| Inference precision | FP16 (automatic mixed precision) |
+| PM2 name | vigent2-cosyvoice (id=15) |
+| Conda environment | cosyvoice (Python 3.10) |
+| Launch script | `run_cosyvoice.sh` |
+| Server script | `models/CosyVoice/cosyvoice_server.py` |
+| Model load time | ~22-34 s |
+| VRAM usage | ~3-5 GB |
+
+## Supported Languages
+
+Chinese, English, Japanese, Korean, German, Spanish, French, Italian, and Russian, plus 18+ Chinese dialects
+
+## Directory Structure
+
+```
+models/CosyVoice/
+├── cosyvoice_server.py              # FastAPI service (port 8010)
+├── cosyvoice/                        # CosyVoice source code
+│   └── cli/cosyvoice.py             # AutoModel entry point
+├── third_party/Matcha-TTS/          # submodule dependency
+├── pretrained_models/
+│   ├── Fun-CosyVoice3-0.5B/        # model files (~8.2GB)
+│   │   ├── llm.pt                   # LLM model (1.9GB)
+│   │   ├── llm.rl.pt               # RL model (1.9GB, spare)
+│   │   ├── flow.pt                  # Flow model (1.3GB)
+│   │   ├── hift.pt                  # HiFT vocoder (80MB)
+│   │   ├── campplus.onnx            # speaker embedding (27MB)
+│   │   ├── speech_tokenizer_v3.onnx # speech tokenizer (925MB)
+│   │   ├── cosyvoice3.yaml          # model config
+│   │   └── CosyVoice-BlankEN/      # Qwen tokenizer
+│   └── CosyVoice-ttsfrd/           # text normalization resources
+│       ├── resource/                # unpacked ttsfrd resources
+│       └── resource.zip
+run_cosyvoice.sh                      # PM2 launch script
+```
+
+## API
+
+### GET /health
+
+Health check. Returns:
+```json
+{
+  "service": "CosyVoice 3.0 Voice Clone",
+  "model": "Fun-CosyVoice3-0.5B",
+  "ready": true,
+  "gpu_id": 0
+}
+```
+
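+### POST /generate
+
+Generates speech with a cloned voice.
+
+**Parameters (multipart/form-data):**
+
+| Parameter | Type | Required | Description |
+|------|------|------|------|
+| ref_audio | File | Yes | Reference audio (WAV) |
+| text | string | Yes | Text to synthesize |
+| ref_text | string | Yes | Transcript of the reference audio |
+| language | string | No | Language (default "Chinese"; CosyVoice detects it automatically) |
+| speed | float | No | Speech rate (default 1.0, range 0.5-2.0, 0.8-1.2 recommended) |
+
+**Returns:** a WAV audio file
+
+**Status codes:**
+- 200: success
+- 429: GPU busy, retry
+- 500: generation failed or timed out
+- 503: model not loaded or service poisoned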
+
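+## Safety Mechanisms
+
+1. **GPU inference lock** (`asyncio.Lock`): prevents concurrent inference from corrupting GPU state
+2. **429 rejection**: returns 429 immediately while the lock is held; the client retries
+3. **Timeout protection**: `60 + len(text) * 2` seconds, capped at 300 seconds
+4. **Poisoned flag**: after a timeout the service is marked poisoned and the health check returns `ready: false`
+5. **Forced exit**: 1.5 seconds after a timeout the process calls `os._exit(1)` and PM2 restarts it automatically
+6. **Startup self-test**: on startup, one real inference is run on a short text to verify that the GPU inference path works; on failure `_model_loaded = False` and the health check returns `ready: false`, which avoids false positives
+7. **Automatic reference-audio trimming**: reference audio longer than 10 seconds is trimmed to its first 10 seconds (CosyVoice recommends 3-10 seconds) to avoid sampling anomalies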
+
+## Operations Commands
+
+```bash
+# Start
+pm2 start run_cosyvoice.sh --name vigent2-cosyvoice
+
+# Restart
+pm2 restart vigent2-cosyvoice
+
+# View logs
+pm2 logs vigent2-cosyvoice --lines 50
+
+# Health check
+curl http://localhost:8010/health
+
+# Stop
+pm2 stop vigent2-cosyvoice
+```
+
+## Deploying from Scratch
+
+### 1. Clone the repository
+
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/models
+git clone --recursive https://github.com/FunAudioLLM/CosyVoice.git
+cd CosyVoice
+git submodule update --init --recursive
+```
+
+### 2. Create the Conda environment
+
+```bash
+conda create -n cosyvoice -y python=3.10
+conda activate cosyvoice
+```
+
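+### 3. Install dependencies
+
+Note: do not run `pip install -r requirements.txt` directly; there are version conflicts that have to be handled.
+
+```bash
+# Install PyTorch 2.3.1 (CUDA 12.1) first; the version requirement is strict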
+pip install torch==2.3.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
+
+# Core inference dependencies
+pip install conformer==0.3.2 HyperPyYAML==1.2.2 inflect==7.3.1 \
+  librosa==0.10.2 lightning==2.2.4 modelscope==1.20.0 omegaconf==2.3.0 \
+  pydantic==2.7.0 soundfile==0.12.1 fastapi==0.115.6 uvicorn==0.30.0 \
+  transformers==4.51.3 protobuf==4.25 hydra-core==1.3.2 \
+  rich==13.7.1 diffusers==0.29.0 x-transformers==2.11.24 wetext==0.0.4
+
+# onnxruntime-gpu
+pip install onnxruntime-gpu==1.18.0 \
+  --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
+
+# Other required dependencies
+pip install gdown matplotlib pyarrow wget onnx python-multipart httpx
+
+# openai-whisper needs setuptools < 71 (which provides pkg_resources)
+pip install "setuptools<71"
+pip install --no-build-isolation openai-whisper==20231117
+
+# pyworld needs g++ and Cython
+pip install Cython
+PATH="/usr/bin:$PATH" pip install pyworld==0.3.4
+
+# Critical version pins
+pip install "numpy<2"               # onnxruntime-gpu is incompatible with numpy 2.x
+pip install "ruamel.yaml<0.18"      # hyperpyyaml is incompatible with ruamel.yaml 0.19+
+```
+
+> **Important**: CosyVoice requires torch==2.3.1. torch 2.10+ causes a CUBLAS_STATUS_INVALID_VALUE error.
+> torch 2.3.1+cu121 bundles nvidia-cudnn-cu12, so the onnxruntime CUDAExecutionProvider works normally.
+
+### 4. Download the models
+
+```bash
+# Use huggingface_hub (use hf-mirror.com from mainland China)
+HF_ENDPOINT=https://hf-mirror.com python -c "
+from huggingface_hub import snapshot_download
+snapshot_download('FunAudioLLM/Fun-CosyVoice3-0.5B-2512', local_dir='pretrained_models/Fun-CosyVoice3-0.5B')
+snapshot_download('FunAudioLLM/CosyVoice-ttsfrd', local_dir='pretrained_models/CosyVoice-ttsfrd')
+"
+```
+
+### 5. Install ttsfrd (optional; improves text normalization quality)
+
+```bash
+cd pretrained_models/CosyVoice-ttsfrd/
+unzip resource.zip -d .
+pip install ttsfrd_dependency-0.1-py3-none-any.whl
+pip install ttsfrd-0.4.2-cp310-cp310-linux_x86_64.whl
+```
+
+### 6. Register with PM2
+
+```bash
+pm2 start run_cosyvoice.sh --name vigent2-cosyvoice
+pm2 save
+```
+
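+## Known Issues
+
+1. **ttsfrd "prepare tts engine failed"**: an internal log line from the ttsfrd C library; the Python layer initializes successfully and usage is unaffected
+2. **Sliding Window Attention warning**: a notice from the transformers library; inference results are unaffected
+3. **onnxruntime Memcpy performance notice**: `Memcpy nodes are not supported by the CUDA EP`; this is a performance-advice log line only and has no functional impact
+
+> Note: the libcudnn.so.8 issue is resolved in the torch 2.3.1+cu121 environment (which bundles nvidia-cudnn-cu12); the onnxruntime CUDAExecutionProvider loads normally.
+
+## Comparison with Qwen3-TTS
+
+| Feature | Qwen3-TTS (retired) | CosyVoice 3.0 (current) |
+|------|-----------|----------------|
+| Port | 8009 | 8010 |
+| Model size | 0.6B | 0.5B |
+| Languages | Chinese/English/Japanese/Korean | 9 languages + 18 dialects |
+| Cloning method | ref_audio + ref_text | ref_audio + ref_text |
+| Prompt format | ref_text passed directly | `You are a helpful assistant.<\|endofprompt\|>` + ref_text |
+| Built-in segmentation | None; the client must segment | Built-in text_normalize segments automatically |
+| Status | Retired (PM2 stopped) | In production use |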
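Review note on the timeout rule in `Docs/COSYVOICE3_DEPLOY.md` ("`60 + len(text) * 2` seconds, capped at 300 seconds"): written out as a function, the rule looks like the sketch below. This is an illustration of the documented formula only; `inference_timeout` is a hypothetical name, and how `cosyvoice_server.py` actually applies the timeout is not shown in this diff.

```python
BASE_TIMEOUT_SEC = 60      # fixed allowance per request
PER_CHAR_TIMEOUT_SEC = 2   # extra seconds per character of input text
MAX_TIMEOUT_SEC = 300      # hard cap from the deployment guide


def inference_timeout(text: str) -> int:
    """Timeout in seconds for synthesizing `text`, per the documented rule."""
    return min(BASE_TIMEOUT_SEC + len(text) * PER_CHAR_TIMEOUT_SEC, MAX_TIMEOUT_SEC)


if __name__ == "__main__":
    for sample in ("", "a" * 10, "a" * 500):
        print(len(sample), inference_timeout(sample))
```

With these constants the cap is reached at 120 characters, so every longer input shares the same 300-second limit.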
--- a/Docs/DEPLOY_MANUAL.md
+++ b/Docs/DEPLOY_MANUAL.md
@@ -7,8 +7,8 @@
 | Server | Dell PowerEdge R730 |
 | CPU | 2× Intel Xeon E5-2680 v4 (56 threads) |
 | Memory | 192GB DDR4 |
-| GPU 0 | NVIDIA RTX 3090 24GB |
-| GPU 1 | NVIDIA RTX 3090 24GB (used for LatentSync) |
+| GPU 0 | NVIDIA RTX 3090 24GB (MuseTalk + CosyVoice) |
+| GPU 1 | NVIDIA RTX 3090 24GB (LatentSync) |
 | Deployment path | `/home/rongye/ProgramFiles/ViGent2` |

 ---
@@ -72,7 +72,9 @@ cd /home/rongye/ProgramFiles/ViGent2

 ---

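-## Step 3: Deploy the AI Model (LatentSync 1.6)
+## Step 3: Deploy the AI Models
+
+### 3a. LatentSync 1.6 (short-video lip sync, GPU1)

 > ⚠️ **Important**: LatentSync requires its own Conda environment and **~18GB VRAM**. Do **not** install it directly into the backend environment.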

@@ -93,6 +95,26 @@ conda activate latentsync
 python -m scripts.server  # check that it starts; press Ctrl+C to exit
 ```

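+### 3b. MuseTalk 1.5 (long-video lip sync, GPU0)
+
+> MuseTalk is a single-step latent-space inpainting model (not a diffusion model) with near-real-time inference, suited to long videos of >=120s. It shares GPU0 with CosyVoice; fp16 inference needs roughly 4-8GB of VRAM.
+
+See the detailed standalone deployment guide:
+**[MuseTalk Deployment Guide](MUSETALK_DEPLOY.md)**
+
+Steps in brief:
+1. Create a dedicated `musetalk` Conda environment (Python 3.10 + PyTorch 2.0.1 + CUDA 11.8)
+2. Install dependencies such as mmcv/mmdet/mmpose
+3. Download the model weights (`download_weights.sh`)
+4. Create the required symlinks (`musetalk/config.json`, `musetalk/musetalkV15`)
+
+**Verify the MuseTalk deployment**:
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/models/MuseTalk
+/home/rongye/ProgramFiles/miniconda3/envs/musetalk/bin/python scripts/server.py
+# In another terminal: curl http://localhost:8011/health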
+```
+
 ---

 ## Step 4: Install Backend Dependencies
@@ -166,6 +188,8 @@ playwright install chromium
    EOF
    ```

+> **Note**: on startup the backend automatically creates the additional storage buckets (`ref-audios`, `generated-audios`); they do not need to be created manually.
+ 
 ---
 
 ## Step 7: Configure Environment Variables
@@ -187,7 +211,7 @@ cp .env.example .env
 | `SUPABASE_PUBLIC_URL` | `https://api.hbyrkj.top` | Supabase public API URL (accessed by the frontend) |
 | `LATENTSYNC_GPU_ID` | 1 | GPU selection (0 or 1) |
 | `LATENTSYNC_USE_SERVER` | false | Set to true to enable the resident server for faster inference |
-| `LATENTSYNC_INFERENCE_STEPS` | 20 | Inference steps (20-50) |
+| `LATENTSYNC_INFERENCE_STEPS` | 16 | Inference steps (16-50) |
 | `LATENTSYNC_GUIDANCE_SCALE` | 1.5 | Guidance scale (1.0-3.0) |
 | `DEBUG` | true | Set to false in production |
 | `REDIS_URL` | `redis://localhost:6379/0` | Task state storage (falls back to in-memory when unavailable) |
@@ -210,7 +234,21 @@ cp .env.example .env
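 | `DOUYIN_RECORD_VIDEO` | false | Record a video of the browser actions |
 | `DOUYIN_KEEP_SUCCESS_VIDEO` | false | Keep the recording after success |
 | `CORS_ORIGINS` | `*` | Allowed CORS origins (an allowlist is recommended in production) |
-| `DOUYIN_COOKIE` | empty | Douyin video download cookie (script extraction feature) |
+| `MUSETALK_GPU_ID` | 0 | MuseTalk GPU index |
+| `MUSETALK_API_URL` | `http://localhost:8011` | MuseTalk resident service URL |
+| `MUSETALK_BATCH_SIZE` | 32 | MuseTalk inference batch size |
+| `MUSETALK_VERSION` | v15 | MuseTalk model version |
+| `MUSETALK_USE_FLOAT16` | true | MuseTalk half-precision acceleration |
+| `LIPSYNC_DURATION_THRESHOLD` | 120 | Seconds; MuseTalk is used at or above this value, LatentSync below it |
+| `ALIPAY_APP_ID` | empty | Alipay application APPID |
+| `ALIPAY_PRIVATE_KEY_PATH` | empty | Path to the application private key PEM file |
+| `ALIPAY_PUBLIC_KEY_PATH` | empty | Path to the Alipay public key PEM file |
+| `ALIPAY_NOTIFY_URL` | empty | Alipay async callback URL (public HTTPS) |
+| `ALIPAY_RETURN_URL` | empty | Browser redirect URL after payment completes |
+| `PAYMENT_AMOUNT` | `999.00` | Membership price (CNY) |
+| `PAYMENT_EXPIRE_DAYS` | `365` | Membership validity in days |
+
+> For the complete Alipay setup (key generation, PEM format, product activation, and so on) see the **[Alipay Deployment Guide](ALIPAY_DEPLOY.md)**.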

 ---

@@ -261,6 +299,13 @@ conda activate latentsync
 python -m scripts.server
 ```

+### Start MuseTalk (terminal 4, long-video lip sync)
+
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/models/MuseTalk
+/home/rongye/ProgramFiles/miniconda3/envs/musetalk/bin/python scripts/server.py
+```
+ 
 ### Verification

 1. Open http://SERVER_IP:3002 to view the frontend
@@ -334,34 +379,48 @@ chmod +x run_latentsync.sh
 pm2 start ./run_latentsync.sh --name vigent2-latentsync
 ```

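-### 4. Start the Qwen3-TTS voice-cloning service (optional)
+### 4. Start the CosyVoice 3.0 voice-cloning service (optional)

-> This service must be running to use voice cloning.
+> This service must be running to use voice cloning. For detailed deployment steps see the [CosyVoice 3.0 Deployment Guide](COSYVOICE3_DEPLOY.md).

-1. Install the HTTP service dependencies: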
-```bash
-conda activate qwen-tts
-pip install fastapi uvicorn python-multipart
-```
+1. The launch script is in the project root: `run_cosyvoice.sh`

-2. The launch script is in the project root: `run_qwen_tts.sh`
-
-3. Start it with pm2:
+2. Start it with pm2:
 ```bash
 cd /home/rongye/ProgramFiles/ViGent2
-pm2 start ./run_qwen_tts.sh --name vigent2-qwen-tts
+pm2 start ./run_cosyvoice.sh --name vigent2-cosyvoice
 pm2 save
 ```

-4. Verify the service:
+3. Verify the service:
 ```bash
 # Check the health status
-curl http://localhost:8009/health
+curl http://localhost:8010/health
 ```

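-### 5. Start the service watchdog (Watchdog)
+### 5. Start the MuseTalk long-video lip-sync service

-> 🛡️ **Recommended**: monitors the health of the Qwen-TTS and LatentSync services and restarts them automatically if they hang.
+> Long videos (>=120s) are routed to MuseTalk automatically. When MuseTalk is unavailable, requests fall back to LatentSync.
+> For detailed deployment steps see the [MuseTalk Deployment Guide](MUSETALK_DEPLOY.md).
+
+1. The launch script is in the project root: `run_musetalk.sh`
+
+2. Start it with pm2: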
+```bash
+cd /home/rongye/ProgramFiles/ViGent2
+pm2 start ./run_musetalk.sh --name vigent2-musetalk
+pm2 save
+```
+
+3. Verify the service:
+```bash
+curl http://localhost:8011/health
+# {"status":"ok","model_loaded":true}
+```
+
+### 6. Start the service watchdog (Watchdog)
+
+> 🛡️ **Recommended**: monitors the health of the CosyVoice and LatentSync services and restarts them automatically if they hang.

 ```bash
 cd /home/rongye/ProgramFiles/ViGent2
@@ -376,13 +435,16 @@ pm2 save
 pm2 startup
 ```

+> **Tip**: the full PM2 process list should contain 5-6 services: vigent2-backend, vigent2-frontend, vigent2-latentsync, vigent2-cosyvoice, vigent2-musetalk, vigent2-watchdog.
+
 ### Common pm2 commands

 ```bash
 pm2 status                    # show the status of all services
 pm2 logs                      # show all logs
 pm2 logs vigent2-backend      # show backend logs
-pm2 logs vigent2-qwen-tts     # show Qwen3-TTS logs
+pm2 logs vigent2-cosyvoice    # show CosyVoice logs
+pm2 logs vigent2-musetalk     # show MuseTalk logs
 pm2 restart all               # restart all services
 pm2 stop vigent2-latentsync   # stop the LatentSync service
 pm2 delete all                # delete all services
@@ -521,7 +583,8 @@ python3 -c "import torch; print(torch.cuda.is_available())"
 sudo lsof -i :8006
 sudo lsof -i :3002
 sudo lsof -i :8007
-sudo lsof -i :8009  # Qwen3-TTS
+sudo lsof -i :8010  # CosyVoice
+sudo lsof -i :8011  # MuseTalk
 ```

 ### Viewing logs
@@ -531,7 +594,8 @@ sudo lsof -i :8009  # Qwen3-TTS
 pm2 logs vigent2-backend
 pm2 logs vigent2-frontend
 pm2 logs vigent2-latentsync
-pm2 logs vigent2-qwen-tts
+pm2 logs vigent2-cosyvoice
+pm2 logs vigent2-musetalk
 ```

 ### SSH connection lag / slow system response
@@ -562,6 +626,7 @@ pm2 logs vigent2-qwen-tts
 | `playwright` | Automated social media publishing |
 | `biliup` | Bilibili video upload |
 | `loguru` | Logging |
+| `python-alipay-sdk` | Alipay payment integration |

 ### Key frontend dependencies

@@ -570,6 +635,7 @@ pm2 logs vigent2-qwen-tts
 | `next` | React framework |
 | `swr` | Data fetching and caching |
 | `tailwindcss` | CSS styling |
+| `wavesurfer.js` | Audio waveform (timeline editor) |

 ### Key LatentSync dependencies

--- a/Docs/DevLogs/Day21.md
+++ b/Docs/DevLogs/Day21.md
@@ -315,3 +315,135 @@ npm run build && pm2 restart vigent2-frontend  # 刷脸验证UI
 pm2 restart vigent2-backend
 npm run build && pm2 restart vigent2-frontend
 ```
+
+---
+
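+## 🎬 Multi-Material Video Generation (Multi-Camera Effect)
+
+### Overview
+Users can upload several selfie videos shot from different angles. During generation the material switches automatically per sentence, so the result resembles a multi-camera shoot. With a single material the original pipeline runs with no extra overhead.
+
+### Core Architecture
+
+#### Pipeline changes
+```
+[Single material (unchanged)]
+text → TTS → audio → LatentSync (1 material + full audio) → Whisper subtitles → Remotion → final video
+
+[Multi-material (new)]
+text → TTS → audio → Whisper subtitles (moved earlier) → split the duration equally by material count (aligned to character boundaries)
+  → for each segment: split audio + LatentSync (material[i] + audio segment[i])
+  → FFmpeg concatenates all segments → Remotion (full subtitle timestamps) → final video
+```
+
+#### Material switching logic (equal-split approach)
+1. Whisper transcribes the full audio, yielding character-level timestamps
+2. **Split the total audio duration equally** by the number of materials (`total_duration / N`)
+3. Snap each split point to the nearest Whisper character boundary to avoid cutting in the middle of a character
+4. Extend the first segment's start to 0.0 and the last segment's end to the end of the audio to guarantee full coverage
+
+> **Design decision**: the initial approach split sentences on punctuation in the original script, but user scripts often contain no full stops (only commas), which produced a single segment. The equal-split approach does not depend on script punctuation and splits any input correctly.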
+
+---
+
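+### Part 1: Backend changes
+
+#### 1. `backend/app/modules/videos/schemas.py`
+- Added the `material_paths: Optional[List[str]]` field
+- Kept `material_path: str` for backward compatibility
+
+#### 2. `backend/app/modules/videos/workflow.py` (core change)
+
+**New function**:
+- `_split_equal(segments, material_paths)`: splits the audio duration equally by material count, aligned to the nearest Whisper character boundary
+
+**Changes to `process_video_generation()`**:
+- `is_multi = len(material_paths) > 1` selects the multi-material or single-material branch
+- Multi-material branch: Whisper first → equal split → audio split → per-segment LatentSync → FFmpeg concatenation
+
+#### 3. `backend/app/services/video_service.py`
+- Added `concat_videos()`: concatenates video segments with the FFmpeg concat demuxer (`-c copy`)
+- Added `split_audio()`: splits audio by time range with FFmpeg (`-ss` + `-t` + `-c copy`)
+
+#### 4. `backend/scripts/watchdog.py`
+- Raised the health-check failure threshold from 3 to 5 (a 2.5-minute tolerance window)
+- Added a 120-second cooldown after a restart so a service is not misjudged as failed while its model loads
+- All services get a 60-second initial cooldown at startup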
+
+---
+
+### Part 2: Frontend changes
+
+#### 1. New dependencies
+```bash
+npm install @dnd-kit/core @dnd-kit/sortable @dnd-kit/utilities
+```
+
+#### 2. `frontend/src/features/home/model/useMaterials.ts`
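+- `selectedMaterial: string` → `selectedMaterials: string[]` (multi-select)
+- Added `toggleMaterial(id)`: toggles selection (at least 1 must remain selected)
+- Added `reorderMaterials(activeId, overId)`: drag-and-drop reordering
+- Extended upload formats: added `.mkv/.webm/.flv/.wmv/.m4v/.ts/.mts`
+
+#### 3. `frontend/src/features/home/ui/MaterialSelector.tsx` (rewritten)
+- Each row of the material list gains a checkbox + an order badge (①②③)
+- When ≥2 are selected, a drag-to-reorder area is shown (@dnd-kit `SortableContext`)
+- Each sortable item: drag handle + order number + material name + remove button
+- HTML input `accept` changed to `video/*`
+
+#### 4. `frontend/src/features/home/model/useHomeController.ts`
+- Multi-material payload: `material_paths` array + `material_path` for backward compatibility
+- `enable_subtitles` hard-coded to `true` (toggle removed)
+- Validation: at least 1 material must be selected
+
+#### 5. `frontend/src/features/home/model/useHomePersistence.ts`
+- Material persistence changed to a JSON array, backward compatible with the old format (single string)
+- Removed persistence of `enableSubtitles`
+
+#### 6. `frontend/src/features/home/ui/TitleSubtitlePanel.tsx`
+- Removed the "per-character highlighted subtitles" toggle; the subtitle style area is always shown
+
+#### 7. `frontend/src/features/home/ui/HomePage.tsx`
+- Updated prop passing (`selectedMaterials`, `toggleMaterial`, `reorderMaterials`)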
+
+---
+
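+### Part 3: Bug fix log
+
+#### BUG-1: Multi-material used only the first video (punctuation-based sentence splitting failed)
+- **Symptom**: 2 materials were selected but the generated video used only the first; the log showed `Multi-material: 1 segments, 2 materials`.
+- **Root cause v1**: sentences were initially split in the Whisper output with the regex `[。！？!?]`, but Whisper does not output punctuation.
+- **Fix v1**: switched to splitting on punctuation in the original script, but user scripts often contain only commas (，) and no sentence-ending punctuation (。！？), so it still degraded to 1 segment.
+- **Final fix**: abandoned punctuation-based splitting entirely in favor of `_split_equal()`, which **splits the audio duration equally by material count** and snaps to the nearest Whisper character boundary. It depends on no punctuation and works for every script.
+
+#### BUG-2: Lip sync mismatch (audio time offset)
+- **Root cause**: `split_audio` cut the audio using Whisper's start/end times (e.g. 0.11~7.21), but `compose()` used the full original audio (0.0~end), causing a time offset.
+- **Fix**: force the first segment's start to 0.0 and the last segment's end to the actual audio duration so the split audio covers everything.
+
+#### BUG-3: min_segment_sec over-merging caused degradation (removed with the approach switch)
+- **Root cause**: in the old approach, when the second of 2 sentences was shorter than 3 seconds, the minimum-duration check merged them into 1 segment, degrading multi-material to single-material.
+- **Status**: the equal-split approach does not have this problem; the related code has been removed.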
+
+---
+
+### Summary of files involved
+
+| File | Change type | Description |
+|------|----------|------|
+| `backend/app/modules/videos/schemas.py` | Modified | Added the material_paths field |
+| `backend/app/modules/videos/workflow.py` | Modified | Core multi-material pipeline logic + 3 bug fixes |
+| `backend/app/services/video_service.py` | Modified | Added concat_videos / split_audio |
+| `backend/scripts/watchdog.py` | Modified | Threshold tuning + cooldown mechanism |
+| `frontend/package.json` | Modified | Added @dnd-kit dependencies |
+| `frontend/src/features/home/model/useMaterials.ts` | Modified | Multi-select + ordering state management |
+| `frontend/src/features/home/ui/MaterialSelector.tsx` | Rewritten | Multi-select checkboxes + drag-to-reorder UI |
+| `frontend/src/features/home/model/useHomeController.ts` | Modified | Multi-material payload + subtitle toggle removed |
+| `frontend/src/features/home/model/useHomePersistence.ts` | Modified | JSON array persistence |
+| `frontend/src/features/home/ui/TitleSubtitlePanel.tsx` | Modified | Subtitle toggle removed |
+| `frontend/src/features/home/ui/HomePage.tsx` | Modified | Updated prop passing |
+
+### Restart requirements
+```bash
+pm2 restart vigent2-backend
+npm run build && pm2 restart vigent2-frontend
+```
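Review note on the equal-split logic described in `Docs/DevLogs/Day21.md`: the four steps (equal division, snapping to the nearest Whisper character boundary, first start forced to 0.0, last end forced to the audio duration) can be sketched as below. This is an illustration under stated assumptions, not the `_split_equal()` from `workflow.py`, whose body this diff does not show: `split_equal`, `char_ends`, and the tuple return shape are hypothetical, and edge cases such as more materials than characters are deliberately not handled.

```python
from typing import List, Tuple


def split_equal(
    char_ends: List[float],   # end time of each Whisper character, ascending
    n_materials: int,
    audio_duration: float,
) -> List[Tuple[float, float]]:
    """Return one (start, end) time range per material."""
    cuts = []
    for k in range(1, n_materials):
        # Ideal cut point if the audio were divided into equal parts.
        ideal = audio_duration * k / n_materials
        # Snap to the nearest character boundary so no character is cut in half.
        cuts.append(min(char_ends, key=lambda t: abs(t - ideal)))
    # First segment starts at 0.0 and the last ends at the audio's real end.
    bounds = [0.0, *cuts, audio_duration]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    ends = [0.5, 1.2, 2.9, 4.1, 5.0, 5.8]
    print(split_equal(ends, 2, 6.0))
```

With the sample boundaries above and two materials, the ideal cut at 3.0 s snaps to the character ending at 2.9 s, so the segments cover the whole audio without splitting a character.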
--- a/Docs/DevLogs/Day22.md
+++ b/Docs/DevLogs/Day22.md
@@ -0,0 +1,221 @@
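+## 🔧 Multi-Material Generation: Optimization and Hardening (Day 22)
+
+### Overview
+A full review of the multi-material (multi-camera) video generation implemented on Day 21: fixed 6 high-priority bugs, completed 8 UX improvements, and refactored the multi-material pipeline from "per-segment LatentSync" to a "concatenate first, then infer" architecture, cutting the number of inference runs from N to 1.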
+
+---
+
+### 一、后端高优 Bug 修复
+
+#### 1. `_split_equal()` 素材数 > 字符数边界溢出
+- **问题**: 5 个素材但只有 2 个 Whisper 字符时，边界索引重复，部分素材被跳过
+- **修复**: 加入 `n = min(n, len(all_chars))` 上限保护
+- **文件**: `backend/app/modules/videos/workflow.py`
+
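+#### 2. No fallback when a single LatentSync segment fails in multi-material mode
+- **Problem**: In single-material mode a LatentSync failure falls back to the original material, but multi-material mode raised the exception directly and the whole task failed
+- **Fix**: Wrapped the multi-material loop body in try-except; on failure it falls back to the original material segment
+- **File**: `backend/app/modules/videos/workflow.py`
+
+#### 3. ZeroDivisionError when `num_segments == 0`
+- **Problem**: After every assignment was skipped, `i / num_segments` divided by zero
+- **Fix**: Added an `if num_segments == 0` check before the loop that raises an explicit error
+- **File**: `backend/app/modules/videos/workflow.py`
+
+#### 4. `split_audio` did not validate duration > 0
+- **Problem**: FFmpeg misbehaves when `end <= start`
+- **Fix**: Added `if duration <= 0: raise ValueError(...)`
+- **File**: `backend/app/services/video_service.py`
+
+#### 5. Equal split by duration as a fallback when Whisper fails
+- **Problem**: After a Whisper failure the job degraded straight to single-material and the other materials were wasted
+- **Fix**: Split evenly by `audio_duration / len(material_paths)`, without relying on character alignment
+- **File**: `backend/app/modules/videos/workflow.py`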
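
A minimal sketch of the duration-based fallback from fix 5, assuming assignments are plain dictionaries with `material_path`, `start` and `end` keys. The function name `split_by_duration` is hypothetical and not taken from the project.

```python
from typing import Dict, List, Union

Assignment = Dict[str, Union[str, float]]


def split_by_duration(material_paths: List[str], audio_duration: float) -> List[Assignment]:
    """Give every material an equal share of the audio, ignoring character timing."""
    if not material_paths:
        raise ValueError("material_paths must not be empty")
    if audio_duration <= 0:
        raise ValueError("audio_duration must be positive")
    share = audio_duration / len(material_paths)
    assignments: List[Assignment] = []
    for i, path in enumerate(material_paths):
        is_last = i == len(material_paths) - 1
        assignments.append({
            "material_path": path,
            "start": i * share,
            # Pin the last segment to the real duration to avoid float drift.
            "end": audio_duration if is_last else (i + 1) * share,
        })
    return assignments
```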
+
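+#### 6. `concat_videos` did not check for an empty list
+- **Problem**: FFmpeg errored out when an empty `video_paths` was passed in
+- **Fix**: Added `if not video_paths: raise ValueError(...)`
+- **File**: `backend/app/services/video_service.py`
+
+---
+
+### II. Frontend Improvements
+
+#### 1. Non-null assertion fix in payload construction
+- `m!.path` → `m?.path` + `.filter(Boolean)`, preventing a crash after a material is deleted
+- **File**: `frontend/src/features/home/model/useHomeController.ts`
+
+#### 2. Generate button shows backend progress messages
+- Added a `message` prop; while generating it shows text such as "(processing segment 2/3...)"
+- **Files**: `frontend/src/features/home/ui/GenerateActionBar.tsx`, `HomePage.tsx`
+
+#### 3. Newly uploaded materials are selected automatically
+- After a successful upload the material lists before and after are compared, and the new IDs are appended to `selectedMaterials`
+- **File**: `frontend/src/features/home/model/useMaterials.ts`
+
+#### 4. Unified Material interface
+- Three duplicate `interface Material` definitions were extracted into `shared/types/material.ts`
+- **Files**: `frontend/src/shared/types/material.ts` (new), `useMaterials.ts`, `useHomeController.ts`, `MaterialSelector.tsx`
+
+#### 5. Drag-and-drop ordering fix
+- Removed `DragOverlay` (`backdrop-blur` creates a new containing block, which broke positioning)
+- Switched to native `useSortable` dragging + `CSS.Translate`; the dragged element is highlighted with a shadow
+- **File**: `frontend/src/features/home/ui/MaterialSelector.tsx`
+
+#### 6. Material selection capped at 4
+- `toggleMaterial` now enforces `MAX_MATERIALS = 4`
+- Once the cap is reached, unselected items become semi-transparent and disabled, and the hint reads "multi-select, up to 4"
+- **Files**: `useMaterials.ts`, `MaterialSelector.tsx`
+
+#### 7. Responsive ordering area on mobile
+- Material list `max-h-64` → `max-h-48 sm:max-h-64`
+- **File**: `MaterialSelector.tsx`
+
+#### 8. Multi-material duration notice
+- With ≥2 materials selected, the text "multi-material mode (N cameras), generation takes longer" appears below the generate button
+- **Files**: `GenerateActionBar.tsx`, `HomePage.tsx`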
+
+---
+
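+### III. Core Architecture Refactor: Concatenate First, Then Infer
+
+#### V1 (Day 21): LatentSync per segment
+```
+Material A → LatentSync(Material A, audio segment 1) → lipsync_A
+Material B → LatentSync(Material B, audio segment 2) → lipsync_B
+FFmpeg concat(lipsync_A, lipsync_B) → final video
+```
+- Drawback: N materials = N LatentSync inference runs (~30s each)
+
+#### V2 (Day 22): Concatenate first, then infer
+```
+Material A → prepare_segment(trim to 3.67s) → prepared_A
+Material B → prepare_segment(trim to 4.00s) → prepared_B
+FFmpeg concat(prepared_A, prepared_B) → concat_video (7.67s)
+LatentSync(concat_video, full audio) → final video
+```
+- Benefit: only **1** LatentSync inference run, so the time drops from N×30s to 1×30s
+
+#### New `prepare_segment()` method
+```python
+def prepare_segment(self, video_path, target_duration, output_path, target_resolution=None):
+    # Material longer than target: trim (-t)
+    # Material shorter than target: loop (-stream_loop) + trim
+    # Same resolution: -c copy, lossless (no re-encode)
+    # Different resolution: scale + pad to the first material's resolution
+    ...  # body omitted in this log
+```
+
+#### Resolution handling strategy
+- Added a `get_resolution()` method that detects each material's resolution
+- When all materials share one resolution: lossless trim with `-c copy` (original quality preserved)
+- When resolutions differ: normalize to the first material's resolution using `force_original_aspect_ratio=decrease` + centered `pad`
+- LatentSync only processes the 512×512 mouth region; the output keeps the original resolution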
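
The resolution strategy above can be sketched in Python as follows. The helper names `pick_target_resolution`, `needs_reencode` and `build_scale_pad_filter` are hypothetical; only the FFmpeg `scale` and `pad` filter options are standard.

```python
from typing import List, Tuple

Resolution = Tuple[int, int]


def pick_target_resolution(resolutions: List[Resolution]) -> Resolution:
    """The first material's resolution is the target for every segment."""
    if not resolutions:
        raise ValueError("at least one resolution is required")
    return resolutions[0]


def needs_reencode(resolutions: List[Resolution]) -> bool:
    """Stream copy is only safe when every material already matches."""
    return len(set(resolutions)) > 1


def build_scale_pad_filter(width: int, height: int) -> str:
    """Fit the frame inside width x height, then pad it to the center."""
    return (
        f"scale={width}:{height}:force_original_aspect_ratio=decrease,"
        f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2"
    )
```

For example, mixing a 1080×1920 clip with a 1920×1080 clip yields the filter string `scale=1080:1920:force_original_aspect_ratio=decrease,pad=1080:1920:(ow-iw)/2:(oh-ih)/2`, which would be passed to FFmpeg via `-vf`.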
+
+#### Time alignment verification
+
+| Stage | Time base | Alignment |
+|------|---------|---------|
+| TTS audio | Original duration (7.67s) | Reference |
+| Whisper subtitles | Based on the TTS audio | Timestamps aligned to the audio |
+| Equal split | Total assignment duration = audio duration | First segment start=0, last segment end=audio_duration |
+| prepare per segment | Exact cut with `-t seg_dur` | Sum ≈ audio duration |
+| LatentSync | concat_video + full audio | Internal 0.5s tolerance |
+| compose | lipsync_video + audio/BGM | `-shortest` keeps them in sync |
+| Remotion | Renders subtitles from captions_path | Timestamps aligned to the audio |
+
+---
+
+### Files Touched
+
+| File | Change | Notes |
+|------|----------|------|
+| `backend/app/modules/videos/workflow.py` | Modified | 6 bug fixes + pipeline refactor (concatenate first, then infer) |
+| `backend/app/services/video_service.py` | Modified | Added `prepare_segment()` and `get_resolution()`, `split_audio` validation, `concat_videos` empty-list check |
+| `frontend/src/shared/types/material.ts` | New | Unified Material interface |
+| `frontend/src/features/home/model/useMaterials.ts` | Modified | Auto-select on upload, cap of 4 materials |
+| `frontend/src/features/home/model/useHomeController.ts` | Modified | Payload non-null assertion fix, Material interface import |
+| `frontend/src/features/home/ui/MaterialSelector.tsx` | Modified | Drag fix, cap-of-4 UI, mobile responsiveness |
+| `frontend/src/features/home/ui/GenerateActionBar.tsx` | Modified | Progress message display, multi-material duration notice |
+| `frontend/src/features/home/ui/HomePage.tsx` | Modified | Passes the message and materialCount props |
+
+---
+
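+### IV. AI Multilingual Translation
+
+#### Feature
+Added an "AI Multilingual" button to the script editor. It translates a Chinese voice-over script into any of 9 languages with one click, and the original text can be restored at any time.
+
+#### Supported languages
+English, Japanese (日本語), Korean (한국어), French (Français), German (Deutsch), Spanish (Español), Russian (Русский), Italian (Italiano), Portuguese (Português)
+
+#### Implementation
+
+##### Backend
+- **`backend/app/services/glm_service.py`**: added a `translate_text()` method that calls the Zhipu GLM API (temperature=0.3); the prompt asks for the translation only and for the tone and style to be preserved
+- **`backend/app/modules/ai/router.py`**: added the `POST /api/ai/translate` endpoint, which accepts `{text, target_lang}` and returns `{translated_text}`
+
+##### Frontend
+- **`frontend/src/features/home/ui/ScriptEditor.tsx`**: added the `LANGUAGES` list (9 languages), a language dropdown (closes on outside click), a loading state while translating, and a "Restore original" button (shown at the top of the menu after a translation)
+- **`frontend/src/features/home/model/useHomeController.ts`**: added `handleTranslate` (calls the translation API and saves the original text on the first translation), the `originalText` state, and `handleRestoreOriginal` (restores the original text)
+
+#### Files touched
+
+| File | Change | Notes |
+|------|------|------|
+| `backend/app/services/glm_service.py` | Modified | Added the `translate_text()` method |
+| `backend/app/modules/ai/router.py` | Modified | Added the `/api/ai/translate` endpoint |
+| `frontend/src/features/home/ui/ScriptEditor.tsx` | Modified | Language menu UI, translation loading state, restore-original button |
+| `frontend/src/features/home/model/useHomeController.ts` | Modified | `handleTranslate`, `originalText`, `handleRestoreOriginal` |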
+
+---
+
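+### V. Multilingual TTS Support
+
+#### Background
+With translation in place, users can translate a Chinese script into other languages. However, TTS still supported Chinese only when generating the video afterwards:
+- **EdgeTTS**: the voice list contained only 5 `zh-CN-*` Chinese voices
+- **Voice cloning (Qwen3-TTS)**: the `language` parameter was hard-coded to `"Chinese"`
+
+#### Implementation
+
+##### 1. Frontend: language-aware voice list
+- `VOICES` grew from a flat array into `Record<string, VoiceOption[]>` covering 10 languages (zh-CN / en-US / ja-JP / ko-KR / fr-FR / de-DE / es-ES / ru-RU / it-IT / pt-BR), with 2 voices (male/female) per language
+- Added the `LANG_TO_LOCALE` mapping: translation target language name → EdgeTTS locale (e.g. `"English" → "en-US"`)
+- Added the `textLang` state to track the current script language; it defaults to `"zh-CN"`
+
+##### 2. Automatic voice switch on translation
+- After `handleTranslate` succeeds: `textLang` is set from the target language and, in EdgeTTS mode, `voice` switches to the default voice of that language
+- When `handleRestoreOriginal` restores the text: `textLang` resets to `"zh-CN"` and the default Chinese voice is restored
+- `VoiceSelector` shows the voice list for the current `textLang`
+
+##### 3. Passing the language through to voice cloning
+- Frontend: added the `LOCALE_TO_QWEN_LANG` mapping (`zh-CN→"Chinese"`, `en-US→"English"`, everything else→`"Auto"`)
+- The generation request payload now includes a `language` field (voice cloning mode only)
+- The backend `GenerateRequest` schema gained a `language: str = "Chinese"` field
+- `workflow.py`: the hard-coded `language="Chinese"` became `language=req.language`
+
+##### 4. Bug fix: textLang persistence
+- **Problem**: `voice` was persisted but `textLang` was not. After a page refresh `voice` came back as an English voice while `textLang` fell back to Chinese, so VoiceSelector showed the Chinese voice list with an English voice selected and no button highlighted
+- **Fix**: Added localStorage read/write for `textLang` in `useHomePersistence`
+
+#### Data flow
+
+```
+User translates to "English"
+  → ScriptEditor.onTranslate("English")
+  → LANG_TO_LOCALE["English"] = "en-US"
+  → setTextLang("en-US"), setVoice("en-US-GuyNeural")
+  → VoiceSelector shows VOICES["en-US"] = [Guy, Jenny]
+  → On generate:
+      EdgeTTS: payload.voice = "en-US-GuyNeural"
+      Voice cloning: payload.language = "English" (via getQwenLanguage)
+```
+
+#### Files touched
+
+| File | Change | Notes |
+|------|------|------|
+| `frontend/src/features/home/model/useHomeController.ts` | Modified | Multilingual VOICES Record, textLang state, LANG_TO_LOCALE / LOCALE_TO_QWEN_LANG mappings, automatic voice switch on translation |
+| `frontend/src/features/home/model/useHomePersistence.ts` | Modified | textLang persistence read/write |
+| `backend/app/modules/videos/schemas.py` | Modified | GenerateRequest gains the `language` field |
+| `backend/app/modules/videos/workflow.py` | Modified | Voice cloning call uses `req.language` instead of the hard-coded value |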
--- a/Docs/DevLogs/Day23.md
+++ b/Docs/DevLogs/Day23.md
@@ -0,0 +1,856 @@
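+## 🎙️ Voice-Over First Refactor: Phase 1 (Day 23)
+
+### Overview
+
+Voice-over generation is split out of the video generation flow, giving a new workflow of "generate voice-over → select voice-over → select materials → generate video". Users can manage voice-overs independently (generate / preview / rename / delete / select) and see the duration once one is selected, which provides the data foundation for the material timeline editing in Phase 2.
+
+**Old flow**: script + select materials → one-click generation (inline TTS → Whisper → equal split → LipSync → compose)
+**New flow**: script → voice-over mode → **generate voice-over** → select voice-over → select materials → background music → generate video
+
+---
+
+### I. Backend: new `generated_audios` module
+
+#### Module layout
+
+```
+backend/app/modules/generated_audios/
+├── __init__.py
+├── router.py      # 5 API endpoints
+├── schemas.py     # request/response models
+└── service.py     # generate / list / delete / rename
+```
+
+#### API endpoints
+
+| Method | Path | Description |
+|------|------|------|
+| POST | `/api/generated-audios/generate` | Generate a voice-over asynchronously (returns task_id) |
+| GET | `/api/generated-audios/tasks/{task_id}` | Poll generation progress |
+| GET | `/api/generated-audios` | List all of the user's voice-overs |
+| DELETE | `/api/generated-audios/{audio_id}` | Delete a voice-over |
+| PUT | `/api/generated-audios/{audio_id}` | Rename |
+
+#### Storage
+
+- Supabase storage bucket: `generated-audios` (created automatically at startup)
+- Audio file: `{user_id}/{timestamp}_audio.wav`
+- Metadata file: `{user_id}/{timestamp}_audio.json` (contains display_name, text, tts_mode, duration_sec, etc.)
+
+#### Generation flow
+
+Reuses the existing `TTSService` / `voice_clone_service` / `task_store`:
+
+```
+POST /generate → create task → BackgroundTask:
+  1. edgetts → TTSService.generate_audio()
+     voiceclone → download ref_audio → voice_clone_service.generate_audio()
+  2. ffprobe reads the duration
+  3. upload .wav + .json to the generated-audios bucket
+  4. update task(status=completed, output={audio_id, duration_sec, ...})
+```
+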
+---
+
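+### II. Backend: changes to the video generation workflow
+
+#### New `GenerateRequest` field
+
+```python
+generated_audio_id: Optional[str] = None  # ID of a pre-generated voice-over (inline TTS is skipped when set)
+```
+
+#### New branch in the TTS stage of `workflow.py`
+
+```python
+if req.generated_audio_id:
+    ...  # download the pre-generated voice-over + read language from its metadata
+elif req.tts_mode == "voiceclone":
+    ...  # existing voice cloning logic
+else:
+    ...  # existing EdgeTTS logic
+```
+
+Backward compatible: when `generated_audio_id` is not sent, the existing inline TTS flow is unaffected.
+
+---
+
+### III. Frontend: new voice-over list hook + panel
+
+#### `useGeneratedAudios.ts`
+
+- State: `generatedAudios[]`, `selectedAudio`, `isGeneratingAudio`, `audioTask`
+- Methods: `fetchGeneratedAudios()`, `generateAudio()`, `deleteAudio()`, `renameAudio()`, `selectAudio()`
+- Polling: after a generation request the task status is polled every 1s; on completion the list refreshes and the newest voice-over is selected
+- Independent of the video generation TaskContext (the two do not interfere)
+
+#### `GeneratedAudiosPanel.tsx`
+
+- Per voice-over: play/pause, name, duration, rename, delete
+- Selected state: `border-purple-500 bg-purple-500/20`
+- Inline progress bar (shown while generating)
+- The original script of the selected voice-over is shown at the bottom (truncated)
+- Playback logic is self-contained in the panel (`new Audio()` + play/pause toggle)
+
+---
+
+### IV. Frontend: UI panel reordering
+
+**Old order**: MaterialSelector → ScriptEditor → TitleSubtitle → VoiceSelector → BgmPanel → GenerateActionBar
+
+**New order**:
+1. ScriptEditor (script editing)
+2. TitleSubtitlePanel (title and subtitle styles)
+3. VoiceSelector (voice-over mode)
+4. **GeneratedAudiosPanel** (voice-over list) ← new
+5. MaterialSelector (video materials) ← moved down, unlocked only after a voice-over is selected
+6. BgmPanel (background music)
+7. GenerateActionBar (generate video)
+
+#### Material area gating
+
+While no voice-over is selected, the material area shows a semi-transparent overlay with the hint "Generate and select a voice-over first". Uploading, previewing, renaming and deleting materials stay available; only the selection checkboxes are covered.
+
+#### Duration info
+
+Once a voice-over is selected, the top of MaterialSelector shows:
+```
+Current voice-over: 45.2 s | 3 materials selected (auto equal split, ~15.1 s each)
+```
+
+#### Updated generate button condition
+
+```typescript
+// Old condition
+disabled={isGenerating || selectedMaterials.length === 0 || (ttsMode === "voiceclone" && !selectedRefAudio)}
+// New condition
+disabled={isGenerating || selectedMaterials.length === 0 || !selectedAudio}
+```
+
+---
+
+### V. Persistence
+
+`useHomePersistence` now reads and writes `selectedAudioId` in localStorage, so the selected voice-over is restored after a page refresh.
+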
+---
+
+### Files Touched
+
+#### Backend: new
+
+| File | Notes |
+|------|------|
+| `backend/app/modules/generated_audios/__init__.py` | Module marker |
+| `backend/app/modules/generated_audios/router.py` | 5 API endpoints |
+| `backend/app/modules/generated_audios/service.py` | Generate / list / delete / rename |
+| `backend/app/modules/generated_audios/schemas.py` | Request/response models |
+
+#### Backend: modified
+
+| File | Change |
+|------|------|
+| `backend/app/main.py` | Registers the generated_audios router |
+| `backend/app/services/storage.py` | Added `BUCKET_GENERATED_AUDIOS`; the bucket is created automatically at startup |
+| `backend/app/modules/videos/schemas.py` | `GenerateRequest` gains the `generated_audio_id` field |
+| `backend/app/modules/videos/workflow.py` | TTS stage gains a pre-generated audio branch |
+
+#### Frontend: new
+
+| File | Notes |
+|------|------|
+| `frontend/src/features/home/model/useGeneratedAudios.ts` | Voice-over list hook |
+| `frontend/src/features/home/ui/GeneratedAudiosPanel.tsx` | Voice-over list panel |
+
+#### Frontend: modified
+
+| File | Change |
+|------|------|
+| `frontend/src/features/home/ui/HomePage.tsx` | Panel reordering + material area gating + GeneratedAudiosPanel inserted |
+| `frontend/src/features/home/ui/MaterialSelector.tsx` | New `selectedAudioDuration` prop + duration info display |
+| `frontend/src/features/home/ui/GenerateActionBar.tsx` | Disabled condition changed to `!selectedAudio` |
+| `frontend/src/features/home/model/useHomeController.ts` | Integrates useGeneratedAudios, adds handleGenerateAudio, handleGenerate now sends generated_audio_id |
+| `frontend/src/features/home/model/useHomePersistence.ts` | Added selectedAudioId persistence |
+
+---
+
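+## 🎞️ Material Timeline Editing: Phase 2 (Day 23)
+
+### Overview
+
+Building on the Phase 1 "voice-over first" work, this phase adds a **timeline editor** that lets users:
+1. See how much time each material block gets, laid over the audio waveform
+2. Drag the dividers to adjust each material's duration (segments tile the timeline without gaps; resizing one automatically shrinks or grows its neighbor)
+3. Set a **source clip start point** for each material (start anywhere in the video instead of always at the beginning)
+
+**Old behavior**: multiple materials were split equally (`_split_equal`), with no control over each segment's duration or source start point
+**New behavior**: visual allocation in the timeline editor + drag to adjust + clip settings via ClipTrimmer
+
+---
+
+### I. Backend Changes
+
+#### 1.1 New `CustomAssignment` model
+
+```python
+# backend/app/modules/videos/schemas.py
+from pydantic import BaseModel
+
+class CustomAssignment(BaseModel):
+    material_path: str
+    start: float           # start on the audio timeline
+    end: float             # end on the audio timeline
+    source_start: float = 0.0  # clip start within the source video
+```
+
+`GenerateRequest` gains `custom_assignments: Optional[List[CustomAssignment]] = None`. When it is present, the Whisper-based equal split is skipped and the user-defined allocation is used directly.
+
+#### 1.2 `prepare_segment` supports `source_start`
+
+```python
+def prepare_segment(self, video_path, target_duration, output_path,
+                    target_resolution=None, source_start: float = 0.0):
+    ...  # body omitted in this log
+```
+
+Key logic:
+- When `source_start > 0`, use `-ss` for a fast seek and force a re-encode (stream copy cuts on keyframes and is not precise)
+- When looping is needed and `source_start` is set, first trim the range from `source_start` to the end of the video, then loop the trimmed file (otherwise `stream_loop` restarts from 0s of the video)
+- The temporary trimmed file is cleaned up in a `finally` block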
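
The two-step trim-then-loop logic above can be sketched as a command planner. This is a simplified illustration under stated assumptions: `plan_prepare_segment`, the `tmp_path` default and the always-`libx264` encode are hypothetical choices, and the real method also handles resolution and stream copy. Only the FFmpeg flags (`-ss`, `-t`, `-stream_loop`, `-i`, `-c:v`) are standard.

```python
from typing import List


def plan_prepare_segment(video_path: str, output_path: str, target_duration: float,
                         source_duration: float, source_start: float = 0.0,
                         tmp_path: str = "trimmed_tmp.mp4") -> List[List[str]]:
    """Return the FFmpeg command lines needed to produce one segment."""
    available = source_duration - source_start
    if target_duration <= 0 or available <= 0:
        raise ValueError("target_duration and the usable source range must be positive")
    encode = ["-c:v", "libx264"]  # re-encode so cuts are frame-accurate
    seek = ["-ss", f"{source_start:.3f}"] if source_start > 0 else []
    cut = ["-t", f"{target_duration:.3f}"]

    if available >= target_duration:
        # Enough footage after the start point: seek, then cut in one pass.
        return [["ffmpeg", "-y", *seek, "-i", video_path, *cut, *encode, output_path]]

    commands: List[List[str]] = []
    loop_input = video_path
    if source_start > 0:
        # Step 1: trim [source_start, end) so the loop does not restart at 0s.
        commands.append(["ffmpeg", "-y", *seek, "-i", video_path, *encode, tmp_path])
        loop_input = tmp_path
    # Step 2: loop the (trimmed) input indefinitely and cut at the target.
    commands.append(["ffmpeg", "-y", "-stream_loop", "-1", "-i", loop_input,
                     *cut, *encode, output_path])
    return commands
```

A caller would run the returned commands in order and delete `tmp_path` afterwards, mirroring the `finally` cleanup described above.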
+
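+#### 1.3 `workflow.py` supports `custom_assignments`
+
+- **Multi-material mode**: when `custom_assignments` is present, the user allocation is used directly (Whisper still runs to produce subtitles) and every `prepare_segment` call receives `source_start`
+- **Single-material mode**: when `custom_assignments` has 1 entry with `source_start > 0`, the clip is trimmed first and then passed to LatentSync
+- **Backward compatible**: when `custom_assignments` is `None`, the old path is used unchanged
+
+---
+
+### II. New Frontend Components
+
+#### 2.1 `useTimelineEditor.ts`: timeline segment management hook
+
+```typescript
+interface TimelineSegment {
+  id: string;              // React key
+  materialId: string;      // material ID
+  materialName: string;    // display name
+  start: number;           // start on the audio timeline, in seconds
+  end: number;             // end on the audio timeline, in seconds
+  sourceStart: number;     // clip start within the source video (default 0)
+  sourceEnd: number;       // clip end within the source video (0 = to the end)
+  color: string;           // block color
+}
+```
+
+Core methods:
+- `initSegments()`: splits audioDuration equally by count whenever selectedMaterials changes
+- `resizeSegment(id, newEnd)`: drags the right edge, keeping every segment at least 1s long
+- `setSourceRange(id, sourceStart, sourceEnd)`: sets the clip range
+- `toCustomAssignments()`: converts to the backend `CustomAssignment[]` format
+
+#### 2.2 `TimelineEditor.tsx`: waveform + color-block timeline
+
+- **wavesurfer.js** renders the audio waveform (display only, no playback)
+- The color-block layer is laid out proportionally and shows material name + duration + clip marker
+- The dividers between blocks are draggable (`onPointerDown/Move/Up` gives continuous per-pixel dragging)
+- Clicking a block opens ClipTrimmer
+
+#### 2.3 `ClipTrimmer.tsx`: material clipping modal
+
+- HTML5 `<video>` live preview; `video.currentTime` follows the slider while dragging
+- Dual-ended range slider (start/end) with an interlock that keeps the two at least 0.5s apart
+- Shows the clip duration against the allocated duration (with a loop-fill or truncation notice)
+- The source video duration is read on `loadedmetadata`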
+
+### III. Frontend Integration Changes
+
+#### 3.1 `useHomeController.ts`
+
+- Integrates the `useTimelineEditor` hook
+- Adds the `clipTrimmerOpen` / `clipTrimmerSegmentId` state
+- `handleGenerate` always sends `custom_assignments` for multiple materials; for a single material it also sends them when `sourceStart > 0`
+- Removes the unused `reorderMaterials` export
+
+#### 3.2 `HomePage.tsx`
+
+- Inserts TimelineEditor between MaterialSelector and BgmPanel (shown only when a voice-over exists and materials are selected)
+- Adds the ClipTrimmer modal at the bottom
+- Stops passing the `reorderMaterials` and `selectedAudioDuration` props
+
+#### 3.3 `MaterialSelector.tsx`
+
+- Removes the voice-over duration info bar (moved to TimelineEditor)
+- Removes the drag-and-drop ordering area (SortableChip + @dnd-kit code)
+- Removes the `onReorderMaterials` / `selectedAudioDuration` props
+
+---
+
+### IV. Bugs Fixed During Review
+
+| # | Severity | Problem | Fix |
+|---|---------|------|------|
+| 1 | **Medium** | `prepare_segment` seeks imprecisely when `source_start > 0` is combined with stream copy | Added `source_start > 0` to the re-encode condition |
+| 2 | **High** | With `stream_loop + source_start`, the loop restarted from 0s of the video instead of from source_start | Split into two steps: trim the clip first, then loop the trimmed file |
+| 3 | **Low** | `useHomeController` exported the deprecated `reorderMaterials` | Removed |
+
+---
+
+### Files Touched
+
+#### Backend: modified
+
+| File | Change |
+|------|------|
+| `backend/app/modules/videos/schemas.py` | New `CustomAssignment` model; `GenerateRequest` gains the `custom_assignments` field |
+| `backend/app/services/video_service.py` | `prepare_segment` gains the `source_start` parameter and the two-step loop + trim handling |
+| `backend/app/modules/videos/workflow.py` | Multi-material and single-material pipelines support `custom_assignments` and pass `source_start` through |
+
+#### Frontend: new
+
+| File | Notes |
+|------|------|
+| `frontend/src/features/home/model/useTimelineEditor.ts` | Timeline segment management hook |
+| `frontend/src/features/home/ui/TimelineEditor.tsx` | Waveform + color-block timeline component |
+| `frontend/src/features/home/ui/ClipTrimmer.tsx` | Material clipping modal |
+
+#### Frontend: modified
+
+| File | Change |
+|------|------|
+| `frontend/src/features/home/ui/HomePage.tsx` | Inserts TimelineEditor + ClipTrimmer |
+| `frontend/src/features/home/ui/MaterialSelector.tsx` | Removes the duration info, the drag-and-drop ordering area and the related props |
+| `frontend/src/features/home/model/useHomeController.ts` | Integrates useTimelineEditor; handleGenerate sends custom_assignments |
+| `frontend/package.json` | Added the `wavesurfer.js` dependency |
+
+---
+
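+## 🎨 UI Polish + TTS Stability Fixes: Phase 3 (Day 23)
+
+### Overview
+
+Based on user feedback, this phase fixes 6 UI experience issues, and also fixes the voice cloning service's SoX path problem and its GPU memory cache management.
+
+> **Note**: Qwen3-TTS was later replaced by CosyVoice 3.0 (port 8010). The notes below record the fixes as they were made at the time.
+
+---
+
+### I. Qwen3-TTS Stability Fixes (since replaced by CosyVoice 3.0)
+
+#### 1.1 SoX PATH fix
+
+**Problem**: When PM2 started qwen-tts, the `sox` tool lived in the conda env's bin directory and was not on the system PATH. Audio encoding and decoding therefore took the fallback path (CPU-intensive), and the log showed the warning `SoX could not be found!`.
+
+**Fix**: `run_qwen_tts.sh` exports the conda env bin directory onto PATH:
+
+```bash
+export PATH="/home/rongye/ProgramFiles/miniconda3/envs/qwen-tts/bin:$PATH"
+```
+
+#### 1.2 CUDA cache cleanup
+
+**Fix**: `qwen_tts_server.py` calls `torch.cuda.empty_cache()` after every generation, whether it succeeded or failed, to keep GPU memory fragmentation from building up. Inference runs in a thread pool via `asyncio.to_thread()` so that it does not block the event loop and cause health-check timeouts.
+
+> **Follow-up**: Qwen3-TTS has been retired. CosyVoice 3.0 carries over the same safeguards (GPU inference lock, timeout protection, GPU memory cleanup, startup self-check).
+
+---
+
+### II. Unified Button Layout in the Voice-Over List (feedback #1 + #6)
+
+**Problem**: The preview button in `GeneratedAudiosPanel` sat on the left (separate from Edit/Delete), inconsistent with the `RefAudioPanel` layout. The script summary area at the bottom was not needed.
+
+**Fix**:
+- Play/Edit/Delete buttons are grouped on the right, shown on hover, in the order preview → rename → delete
+- Removed the script summary area for the selected voice-over
+- Layout matches RefAudioPanel: name + duration on the left, action button group on the right
+
+---
+
+### III. Removed the Voice-Over Dependency Overlay from the Material Area (feedback #2)
+
+**Problem**: MaterialSelector was covered by the `!selectedAudio` overlay, so materials could not be used until a voice-over was selected.
+
+**Fix**: Removed the disabled overlay `<div>` around MaterialSelector in `HomePage.tsx`. Materials can be uploaded, previewed and managed at any time; only TimelineEditor requires a selected voice-over to appear (it already has its own condition, `selectedAudio && selectedMaterials.length > 0`).
+
+---
+
+### IV. Drag-and-Drop Ordering on the Timeline (feedback #3)
+
+**Problem**: TimelineEditor did not support reordering materials.
+
+**Fix**:
+- `useTimelineEditor` already had a `reorderSegments()` method (it swaps the material info of two segments but keeps their time ranges)
+- `reorderSegments` is exposed through `useHomeController` and passed into `TimelineEditor`
+- Color blocks support HTML5 Drag & Drop: `draggable` + `onDragStart/Over/Drop/End`
+- While dragging: the source block is semi-transparent (`opacity-50`) and the target block gets a highlight ring (`ring-2 ring-purple-400 scale-[1.02]`)
+- Cursor styles: `cursor-grab` / `active:cursor-grabbing`
+
+---
+
+### V. Dual-Handle Range Slider for Clip Settings (feedback #4)
+
+**Problem**: ClipTrimmer used two separate `<input type="range">` sliders, so the start and end points were adjusted separately, which was not intuitive.
+
+**Fix**: Replaced with a custom dual-handle range slider:
+- One track, with a purple round handle (start) + a pink round handle (end)
+- Track background `bg-white/10`; the selected range is highlighted in the material's color
+- Dragging uses Pointer Events: `onPointerDown` captures the handle → `onPointerMove` updates the position → `onPointerUp` releases it
+- Handle interlock: the start cannot exceed end - 0.5s, and the end cannot go below start + 0.5s
+- Time labels for the start (purple) and end (pink) are shown at the bottom
+
+---
+
+### VI. Video Preview in Clip Settings (feedback #5)
+
+**Problem**: The video in ClipTrimmer could only be viewed as a still frame; the clip range could not be played back.
+
+**Fix**:
+- Clicking the video area plays/pauses (Play/Pause icon overlay)
+- Playback range: plays from sourceStart and stops automatically at sourceEnd
+- Returns to the start point when playback ends
+- `video.currentTime` follows the handle in real time while dragging (seeks to the current position to show the frame)
+- A playback progress indicator (white vertical line) is overlaid on the range slider track
+- `preload="auto"` preloads the video so that seeking is fast while dragging
+
+---
+
+### Files Touched
+
+#### Backend: modified
+
+| File | Change |
+|------|------|
+| `run_qwen_tts.sh` | Exports the conda env bin directory onto PATH, fixing the missing SoX (retired) |
+| `models/Qwen3-TTS/qwen_tts_server.py` | `torch.cuda.empty_cache()` after every generation, asyncio.to_thread to avoid blocking (retired) |
+
+#### Frontend: modified
+
+| File | Change |
+|------|------|
+| `frontend/src/features/home/ui/GeneratedAudiosPanel.tsx` | Unified button layout (Play/Edit/Delete grouped on the right), script summary removed |
+| `frontend/src/features/home/ui/HomePage.tsx` | Removed the voice-over overlay on MaterialSelector, passes onReorderSegment |
+| `frontend/src/features/home/ui/TimelineEditor.tsx` | Added HTML5 Drag & Drop ordering and the onReorderSegment prop |
+| `frontend/src/features/home/ui/ClipTrimmer.tsx` | Dual-handle range slider + video playback preview + playback progress indicator |
+| `frontend/src/features/home/model/useHomeController.ts` | Exposes the reorderSegments method |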
+
+---
+
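+## 📝 Saved Script History + Timeline Drag Fix: Phase 4 (Day 23)
+
+### Overview
+
+Adds manual saving and loading of scripts, fixes the bug where material durations did not follow the materials after a timeline drag reorder, and unifies the button styling.
+
+---
+
+### I. Saving and Loading Script History
+
+#### Feature
+
+Users can manually save the current script to a history list and load it back at any time. Only manually saved scripts appear in the history list; it is completely independent of auto-save (`useHomePersistence`).
+
+#### UI layout
+
+```
+Button bar: [Script History ▼] [Script Extraction Assistant] [AI Multilingual ▼] [AI Title & Tag Generator]
+Bottom bar: 128 characters                                    [Save Script]
+```
+
+- **Script history dropdown**: lists saved entries (name + date + delete button); clicking an entry loads the script; an empty list shows "No saved scripts yet"
+- **Save script button**: disabled when the script is empty; on click it calls `toast.success("文案已保存")`
+- **Estimated duration removed**: the bottom bar keeps only the character count + the save button
+
+#### Implementation
+
+##### `useSavedScripts.ts` (new)
+
+```typescript
+interface SavedScript { id: string; name: string; content: string; savedAt: number }
+```
+
+- localStorage key: `vigent_{storageKey}_savedScripts`
+- `saveScript(content)`: names the entry from the first 15 characters, inserts it at the head of the list, and **writes to localStorage directly**
+- `deleteScript(id)`: deletes the given entry and writes to localStorage directly
+- `useEffect([lsKey])`: re-reads from localStorage whenever lsKey changes (guest → userId)
+- **No auto-persistence effect is used**, so an empty array cannot overwrite existing data when storageKey switches
+
+##### Data flow
+
+```
+ScriptEditor (UI)
+  ↑ savedScripts / onSaveScript / onLoadScript / onDeleteScript (pure props + callbacks)
+  │
+useHomeController
+  ├── useSavedScripts(storageKey) → { savedScripts, saveScript, deleteScript }
+  └── handleSaveScript() → saveScript(text) + toast
+  │
+HomePage
+  └── passes props to ScriptEditor
+```
+
+---
+
+### II. Timeline Drag Reorder Bug Fix
+
+#### Problem
+
+After dragging to reorder materials, each material's duration did not move with it and stayed in the original slot. For example, Material 1 (3s) + Material 2 (8s + 4s looped) became Material 2 (3s) + Material 1 (8s + 4s looped) after the drag; the duration allocation did not change.
+
+#### Root cause
+
+`reorderSegments` used a **property swap**: it copied `materialId`, `sourceStart`, `sourceEnd` and similar properties one by one between the two slots, then called `recalcPositions` to recompute positions.
+
+#### Fix
+
+Switched to an **array move** (splice): the whole segment object is taken out of its old position and inserted at the new one. The segment object carries all of its properties (materialId, sourceStart, sourceEnd, color, etc.) as one unit, and `recalcPositions` then recomputes the positions.
+
+```typescript
+// Before: property swap
+const fromMat = { materialId: next[fromIdx].materialId, ... };
+const toMat = { materialId: next[toIdx].materialId, ... };
+next[fromIdx] = { ...next[fromIdx], ...toMat };
+next[toIdx] = { ...next[toIdx], ...fromMat };
+
+// After: array move
+const [moved] = next.splice(fromIdx, 1);
+next.splice(toIdx, 0, moved);
+```
+
+Side benefit: with 3+ materials, dragging now behaves as an "insert" rather than a "swap", which is closer to what users expect.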
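
The same move-then-recompute idea, sketched in Python. It assumes each segment's duration is simply `end - start` and travels with the segment; the real `recalcPositions` may derive durations differently, and both function names here are Python stand-ins for the frontend code.

```python
from typing import Dict, List

Segment = Dict[str, object]


def recalc_positions(segments: List[Segment]) -> List[Segment]:
    """Lay segments end to end from 0, keeping each segment's own duration."""
    cursor = 0.0
    laid_out: List[Segment] = []
    for seg in segments:
        duration = float(seg["end"]) - float(seg["start"])
        laid_out.append({**seg, "start": cursor, "end": cursor + duration})
        cursor += duration
    return laid_out


def reorder_segments(segments: List[Segment], from_idx: int, to_idx: int) -> List[Segment]:
    """Move one whole segment (all properties included) to a new index."""
    nxt = list(segments)
    moved = nxt.pop(from_idx)
    nxt.insert(to_idx, moved)
    return recalc_positions(nxt)
```

Because the segment moves as a unit, a 3s block stays 3s long wherever it lands, which is the behavior the property swap failed to provide.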
+
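+---
+
+### III. Unified Button Styling
+
+#### Problem
+
+The four buttons (Script History, Script Extraction Assistant, AI Multilingual, AI Title & Tag Generator) had inconsistent heights, and the AI buttons wrapped their text in nested `<span>` elements, which caused differences in internal layout.
+
+#### Fix
+
+- All 4 buttons use `h-7 px-2.5 text-xs rounded inline-flex items-center gap-1` (fixed height of 28px)
+- Removed the redundant nested `<span>` inside the AI Multilingual / AI Title & Tag Generator buttons in favor of a `<>...</>` fragment
+
+---
+
+### Files Touched
+
+#### Frontend: new
+
+| File | Notes |
+|------|------|
+| `frontend/src/features/home/model/useSavedScripts.ts` | Script history hook (localStorage persistence) |
+
+#### Frontend: modified
+
+| File | Change |
+|------|------|
+| `frontend/src/features/home/ui/ScriptEditor.tsx` | Script history dropdown + save button + estimated duration removed + unified button height |
+| `frontend/src/features/home/model/useHomeController.ts` | Integrates useSavedScripts, adds handleSaveScript |
+| `frontend/src/features/home/ui/HomePage.tsx` | Passes savedScripts / handleSaveScript / deleteSavedScript to ScriptEditor |
+| `frontend/src/features/home/model/useTimelineEditor.ts` | reorderSegments changed from property swap to array move (splice) |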
+
+---
+
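+## 🔤 Subtitle Language Mismatch + Video Aspect Ratio Misplacement Fixes: Phase 5 (Day 23)
+
+### Overview
+
+Fixes two video generation bugs:
+1. **Subtitle language mismatch**: when the voice-over language and the editor script language differed, subtitles followed the spoken audio rather than the script (Whisper transcribes independently and ignores the original text)
+2. **Title/subtitle layout misplacement**: after generating a video from 9:16 portrait material, titles and subtitles were rendered with a 16:9 landscape layout
+
+Also fixes the English space loss in `split_word_to_chars` that was found during code review.
+
+---
+
+### I. Subtitles Use the Original Text Instead of the Whisper Transcript
+
+#### Root cause
+
+Whisper transcribes the audio independently and completely ignores the `text` argument passed in. When the voice-over language differs from the editor script language (for example: the user writes a Chinese script → translates it to English → generates an English voice-over → changes the script back to Chinese), Whisper "hears" English speech and outputs English subtitles.
+
+#### Approach
+
+Whisper is only responsible for detecting the **overall speech time range** (`first_start` → `last_end`); the subtitle text always comes from the original script saved with the voice-over.
+
+#### `whisper_service.py`: `align()` gains the `original_text` parameter
+
+```python
+async def align(self, audio_path, text, output_path=None,
+                language="zh", original_text=None):
+    ...  # body omitted in this log
+```
+
+When `original_text` is non-empty:
+1. Run the Whisper transcription as usual and record `whisper_first_start` and `whisper_last_end`
+2. Pass `original_text` to `split_word_to_chars()` and distribute it linearly over the overall time range
+3. Break lines by punctuation and character count with `split_segment_to_lines()`
+4. Replace Whisper's transcription result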
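
Step 2 above, distributing tokens linearly over the detected speech range, might look like the following Python sketch. The function name `distribute_tokens` and the dictionary keys are assumptions for illustration and are not the project's actual data shape.

```python
from typing import Dict, List, Union

TimedToken = Dict[str, Union[str, float]]


def distribute_tokens(tokens: List[str], first_start: float, last_end: float) -> List[TimedToken]:
    """Give every token an equal slice of [first_start, last_end]."""
    if not tokens:
        return []
    if last_end <= first_start:
        raise ValueError("last_end must be greater than first_start")
    step = (last_end - first_start) / len(tokens)
    return [
        {
            "word": token,
            "start": first_start + i * step,
            "end": first_start + (i + 1) * step,
        }
        for i, token in enumerate(tokens)
    ]
```

Equal slices ignore real per-character pacing, so this trades timing precision for text that is guaranteed to match the saved script.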
+
+#### `workflow.py`: voice-over metadata always overrides + original text passed in
+
+```python
+# Before (override only when the script is empty)
+if not req.text.strip():
+    req.text = meta.get("text", req.text)
+
+# After (always override with the voice-over metadata)
+meta_text = meta.get("text", "")
+if meta_text:
+    req.text = meta_text
+```
+
+All 4 `whisper_service.align()` calls now pass `original_text=req.text`.
+
+---
+
+### II. Remotion Receives the Video Size Dynamically
+
+#### Root cause
+
+`remotion/src/Root.tsx` hard-coded `width={1280} height={720}`. Although `render.ts` detected the real size with ffprobe and then overrode `composition.width/height`, the component had already been initialized at 1280×720 during the `selectComposition` stage, so title and subtitle positioning was based on the wrong canvas size.
+
+#### Fix
+
+##### `Root.tsx`: `calculateMetadata` reads the size from props
+
+```tsx
+<Composition
+  id="ViGentVideo"
+  component={Video}
+  durationInFrames={300}
+  fps={25}
+  width={1080}
+  height={1920}
+  calculateMetadata={async ({ props }) => ({
+    width: props.width || 1080,
+    height: props.height || 1920,
+  })}
+  defaultProps={{
+    videoSrc: '',
+    width: 1080,
+    height: 1920,
+    // ...
+  }}
+/>
+```
+
+The default changes from 1280×720 to 1080×1920 (portrait first), and `calculateMetadata` makes sure the `selectComposition` stage uses the real size detected by ffprobe.
+
+##### `Video.tsx`: VideoProps gains optional `width/height`
+
+These exist only for `calculateMetadata` to read; component rendering does not reference them.
+
+##### `render.ts`: inputProps always carries the video size
+
+```typescript
+const inputProps = {
+  videoSrc: videoFileName,
+  captions,
+  title: options.title,
+  // ...
+  width: videoWidth,     // value detected by ffprobe
+  height: videoHeight,   // value detected by ffprobe
+};
+```
+
+`selectComposition` and `renderMedia` use the same `inputProps`. The explicit `composition.width/height` override is kept as a safety net.
+
+---
+
+### III. Code Review Fix: English Spaces Lost
+
+#### Problem
+
+`split_word_to_chars` was designed to handle a single Whisper word (such as `" Hello"`). When `original_text` passes in a whole passage, the spaces in the middle hit `continue` without flushing `ascii_buffer`, so `"Hello World"` became `"HelloWorld"`.
+
+#### Execution trace
+
+```
+Input: "Hello World"
+  H,e,l,l,o → ascii_buffer = "Hello"
+  ' '       → continue (skipped, no flush!)
+  W,o,r,l,d → ascii_buffer = "HelloWorld"
+Result: tokens = ["HelloWorld"]  ← space lost
+```
+
+#### Fix
+
+On a space, flush `ascii_buffer` and set a `pending_space` flag so that the next token is prefixed with a space:
+
+```python
+if not char.strip():
+    if ascii_buffer:
+        tokens.append(ascii_buffer)
+        ascii_buffer = ""
+    if tokens:
+        pending_space = True
+    continue
+```
+
+After the fix: `"Hello World"` → tokens = `["Hello", " World"]` → subtitles display correctly. Chinese text is unaffected.
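
For context, here is a self-contained Python sketch of a tokenizer with the `pending_space` behavior described above. It is a reconstruction under assumptions, not the project's `split_word_to_chars`: the name `split_text_to_tokens` is hypothetical, ASCII letters and digits are buffered into words, and every other non-space character (CJK characters, punctuation) becomes its own token.

```python
from typing import List


def split_text_to_tokens(text: str) -> List[str]:
    """Split text into display tokens while preserving spaces between words."""
    tokens: List[str] = []
    ascii_buffer = ""
    pending_space = False

    def emit(token: str) -> None:
        nonlocal pending_space
        # A pending space is attached to the front of the next token.
        tokens.append(" " + token if pending_space else token)
        pending_space = False

    for char in text:
        if not char.strip():  # whitespace
            if ascii_buffer:
                emit(ascii_buffer)  # flush the word before the space
                ascii_buffer = ""
            if tokens:
                pending_space = True
            continue
        if char.isascii() and char.isalnum():
            ascii_buffer += char
        else:
            if ascii_buffer:
                emit(ascii_buffer)
                ascii_buffer = ""
            emit(char)
    if ascii_buffer:
        emit(ascii_buffer)
    return tokens
```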
+
+---
+
+### Files Touched
+
+#### Backend: modified
+
+| File | Change |
+|------|------|
+| `backend/app/services/whisper_service.py` | `align()` gains the `original_text` parameter; `split_word_to_chars` fixes the lost English spaces |
+| `backend/app/modules/videos/workflow.py` | Voice-over metadata always overrides text/language; the 4 `align()` calls pass `original_text` |
+
+#### Frontend: modified (Remotion)
+
+| File | Change |
+|------|------|
+| `remotion/src/Root.tsx` | Default size changed to 1080×1920; added `calculateMetadata` + width/height defaultProps |
+| `remotion/src/Video.tsx` | VideoProps gains optional `width`/`height` |
+| `remotion/render.ts` | inputProps always carries `videoWidth`/`videoHeight`, shared by selectComposition and renderMedia |
+
+---
+
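+## 🎤 Reference Audio Auto-Transcription + Speech Rate Control: Phase 6 (Day 23)
+
+### Overview
+
+This phase resolves the voice cloning ref_text mismatch. The old approach used fixed frontend text as ref_text, but CosyVoice zero-shot cloning requires ref_text to match what is actually spoken in the reference audio; when it does not, the model "hallucinates" an extra fragment at the start of the generated audio.
+
+**Improvement**: on upload, the reference audio is transcribed automatically with Whisper and the result is used as ref_text. Speech rate control is added at the same time.
+
+---
+
+### I. Whisper Auto-Transcription of Reference Audio
+
+#### 1.1 `whisper_service.py`: automatic language detection
+
+The `transcribe()` method used to hard-code `language="zh"`. It now accepts an optional `language` parameter (default `None` = auto-detect), which supports reference audio in multiple languages.
+
+#### 1.2 `ref_audios/service.py`: auto-transcription on upload
+
+The upload flow becomes: transcode to WAV → check duration (≥1s) → if longer than 10s, cut at a silence point → **Whisper auto-transcription** → verify non-empty → upload.
+
+```python
+try:
+    transcribed = await whisper_service.transcribe(tmp_wav_path)
+    if transcribed.strip():
+        ref_text = transcribed.strip()
+except Exception as e:
+    logger.warning(f"Auto-transcribe failed: {e}")
+
+if not ref_text or not ref_text.strip():
+    raise ValueError("无法识别音频内容，请确保音频包含清晰的语音")
+```
+
+#### 1.3 `ref_audios/router.py`: ref_text becomes optional
+
+`ref_text: str = Form("")` (no longer required); the frontend no longer sends fixed text.
+
+---
+
+### II. Smart Trimming of Reference Audio (10-Second Cap)
+
+CosyVoice works best with reference audio of 3-10 seconds.
+
+#### 2.1 Silence point detection
+
+ffmpeg `silencedetect` finds the last silence end point within the first 10 seconds (threshold -30dB, minimum 0.3s), which avoids a hard cut in the middle of a word:
+
+```python
+def _find_silence_cut_point(file_path, max_duration):
+    # silencedetect → parse silence_end → last silence point within 3s~max_duration
+    # fall back to max_duration when none is found
+    ...  # body omitted in this log
+```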
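
The parsing half of that function could be sketched as below. It assumes the caller has already captured FFmpeg's stderr from a run with a filter such as `silencedetect=noise=-30dB:d=0.3`, which logs lines containing `silence_end: <seconds>`. The name `pick_cut_point` and the 3-second lower bound parameter are illustrative.

```python
import re

SILENCE_END_RE = re.compile(r"silence_end:\s*([0-9]+(?:\.[0-9]+)?)")


def pick_cut_point(stderr_text: str, max_duration: float, min_duration: float = 3.0) -> float:
    """Return the last silence end within [min_duration, max_duration].

    Falls back to max_duration when no suitable silence was logged.
    """
    ends = [float(m.group(1)) for m in SILENCE_END_RE.finditer(stderr_text)]
    valid = [t for t in ends if min_duration <= t <= max_duration]
    return max(valid) if valid else max_duration
```

Cutting at the end of a silence keeps the preceding word intact, and the fallback preserves the hard 10-second cap when the speaker never pauses.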
+
+#### 2.2 Fade-out
+
+The last 0.1 seconds of the trimmed audio fade out (`afade=t=out`) to avoid a click at the cut.
+
+---
+
+### III. Re-Transcribe Feature (Migrating Old Data)
+
+#### 3.1 New API
+
+`POST /api/ref-audios/{audio_id}/retranscribe`: download the audio → trim if longer than 10s → transcribe with Whisper → re-upload the audio and metadata.
+
+#### 3.2 Frontend UI
+
+- RefAudioPanel gains a RotateCw button ("Re-transcribe text"), which shows `animate-spin` while transcribing
+- Old audio whose ref_text starts with the fixed text shows a ⚠ yellow warning
+
+---
+
+### IV. Speech Rate Control (CosyVoice speed Parameter)
+
+#### 4.1 End-to-end propagation
+
+```
+Frontend GeneratedAudiosPanel (speed selector)
+  → useHomeController (speed state + persistence)
+  → useGeneratedAudios.generateAudio(params)
+  → POST /api/generated-audios/generate { speed: 1.0 }
+  → GenerateAudioRequest.speed (Pydantic)
+  → generate_audio_task → voice_clone_service.generate_audio(speed=)
+  → _generate_once → POST /generate { speed: "1.0" }
+  → cosyvoice_server → _model.inference_zero_shot(speed=speed)
+```
+
+#### 4.2 Frontend UI
+
+In voice cloning mode, a speech rate dropdown (`Speed: Normal ▼`) appears to the left of the "Generate voice-over" button in the voice-over list panel's title bar:
+
+| Label | speed value |
+|------|----------|
+| Slower | 0.8 |
+| Slightly slower | 0.9 |
+| Normal | 1.0 (default) |
+| Slightly faster | 1.1 |
+| Faster | 1.2 |
+
+The speech rate choice is persisted to localStorage (`vigent_{storageKey}_speed`).
+
+---
+
+### V. Gating on a Missing Reference Audio
+
+In voice cloning mode, when no reference audio is selected:
+- The "Generate voice-over" button is disabled, with the title hint "Select a reference audio first"
+- The panel shows a yellow warning bar: "Voice cloning mode requires a reference audio to be selected first"
+
+---
+
+### VI. Frontend Cleanup
+
+- Removed the `FIXED_REF_TEXT` constant and the `fixedRefText` prop
+- Removed the "Please read the following text aloud" guidance block
+- The upload hint is simplified to "Upload any voice sample (3-10 seconds); the system will recognize the content and clone the voice automatically"
+- The recording area notes "3-10 seconds recommended; anything longer is trimmed automatically"
+
+---
+
+### Files Touched
+
+#### Backend: modified
+
+| File | Change |
+|------|------|
+| `backend/app/services/whisper_service.py` | `transcribe()` gains an optional `language` parameter, default None (auto-detect) |
+| `backend/app/modules/ref_audios/service.py` | Auto-transcription on upload + silence point trimming + fade-out + retranscribe function |
+| `backend/app/modules/ref_audios/router.py` | `ref_text` changed to Form(""), new retranscribe endpoint |
+| `backend/app/modules/generated_audios/schemas.py` | `GenerateAudioRequest` gains `speed: float = 1.0` |
+| `backend/app/modules/generated_audios/service.py` | Passes `req.speed` to voice_clone_service |
+| `backend/app/services/voice_clone_service.py` | `generate_audio()` / `_generate_once()` accept and forward speed |
+| `models/CosyVoice/cosyvoice_server.py` | The `/generate` endpoint accepts a `speed` parameter and passes it to `inference_zero_shot(speed=)` |
+
+#### Frontend: modified
+
+| File | Change |
+|------|------|
+| `frontend/src/features/home/model/useHomeController.ts` | Added the speed state, removed FIXED_REF_TEXT, handleGenerateAudio passes speed |
+| `frontend/src/features/home/model/useHomePersistence.ts` | Added speed persistence |
+| `frontend/src/features/home/model/useRefAudios.ts` | Removed fixedRefText, added retranscribe |
+| `frontend/src/features/home/model/useGeneratedAudios.ts` | generateAudio params gain speed |
+| `frontend/src/features/home/ui/GeneratedAudiosPanel.tsx` | Added the speed selector + missing reference audio gating |
+| `frontend/src/features/home/ui/RefAudioPanel.tsx` | Removed the read-aloud guidance, added the re-transcribe button + ⚠ warning |
+| `frontend/src/features/home/ui/HomePage.tsx` | Passes speed/setSpeed/ttsMode to GeneratedAudiosPanel |
--- a/Docs/DevLogs/Day24.md
+++ b/Docs/DevLogs/Day24.md
@@ -0,0 +1,185 @@
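+## 🔧 Auth Expiry Handling + Multi-Material Timeline Stability Fixes (Day 24)
+
+### Overview
+
+Today's work followed two main threads:
+
+1. **Account and auth handling**: membership expiry is now enforced at request time (triggered by the login and auth endpoints), with a uniform renewal prompt.
+2. **Video generation stability**: an end-to-end round of fixes covering the multi-material timeline, clip semantics, frozen frames at concat boundaries, aspect ratio, and title/subtitle fitting.
+
+---
+
+## 🔐 Membership Expiry Enforced at Request Time: Phase 1 (Day 24)
+
+### Goal
+
+Avoid depending on a scheduled job: expiry is checked and the account is deactivated as soon as the user logs in or calls a protected endpoint.
+
+### Behavior changes
+
+- The expiry check is based on `users.expires_at`.
+- Once the account is found to be expired:
+  - `is_active` is automatically set to `false`
+  - All of the user's sessions are deleted
+  - The response is `403` with the message `会员已到期，请续费` ("Membership expired, please renew")
+
+### Implementation
+
+- `users.py` adds `deactivate_user_if_expired()`, plus `_parse_expires_at()` for consistent timezone parsing.
+- `deps.py` runs the expiry check in both `get_current_user` and `get_current_user_optional`.
+- `auth/router.py` adds expiry deactivation to the login path; `/api/auth/me` now goes through `Depends(get_current_user)`.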
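
A minimal sketch of the request-time expiry test, assuming `expires_at` arrives as an ISO 8601 string and that timestamps without an offset are meant as UTC. Both assumptions and the names `parse_expires_at` and `is_expired` are illustrative; the project's `_parse_expires_at()` may behave differently.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_expires_at(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp into a timezone-aware datetime."""
    if not value:
        return None
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are stored in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_expired(expires_at: Optional[str], now: Optional[datetime] = None) -> bool:
    """True when the membership end time has been reached; no value means no expiry."""
    deadline = parse_expires_at(expires_at)
    if deadline is None:
        return False
    current = now or datetime.now(timezone.utc)
    return deadline <= current
```

Normalizing to aware datetimes before comparing avoids the `TypeError` Python raises when a naive and an aware datetime are compared.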
+
+---
+
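+## 🖼️ Aspect Ratio Control + Title/Subtitle Fitting: Phase 2 (Day 24)
+
+### 2.1 Configurable output aspect ratio
+
+- A new "Aspect ratio" dropdown at the top of the timeline offers `9:16` / `16:9`.
+- The default is `9:16`, and the choice is persisted to localStorage.
+- The generation request carries `output_aspect_ratio`; the backend applies the target resolution consistently in both the single-material and multi-material flows.
+
+### 2.2 Preventing title/subtitle overflow on narrow canvases
+
+To reduce cases where the preview looks fine but the final video overflows, the preview and render strategies were unified:
+
+- Responsive scaling based on the composition width.
+- Wrapping enabled: `white-space: normal` + `word-break` + `overflow-wrap`.
+- Stroke, letter spacing and top/bottom margins scale proportionally as well.
+
+### 2.3 Opening title display mode (brief/persistent)
+
+- A new dropdown at the end of the "Opening title" row in the "Title and subtitles" panel offers `Brief` / `Persistent`.
+- The default mode is `Brief`, with a default duration of 4 seconds.
+- The choice is persisted to localStorage and kept after a refresh.
+- The generation request gains `title_display_mode`; brief mode passes `title_duration=4.0`.
+- Remotion supports the parameter end to end:
+  - `short`: the title fades out after the configured duration and stops rendering;
+  - `persistent`: the title stays for the whole video (the fade-in animation is kept, no fade-out).
+
+---
+
+## 🎥 Orientation Normalization + Multi-Material Concat Stability: Phase 3 (Day 24)
+
+### 3.1 MOV rotation metadata caused wrong portrait/landscape detection
+
+Scenario: the encoded resolution is landscape, but the video only displays correctly as portrait through rotation side-data (common with phone MOV files).
+
+Fix:
+
+- `get_video_metadata()` now also returns `rotation/effective_width/effective_height`.
+- Added `normalize_orientation()`, which physically normalizes the orientation of materials that carry rotation metadata before the pipeline runs.
+- Both single-material and multi-material flows normalize orientation right after download, and only then decide on the resolution.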
+
+### 3.2 多素材“只看到第一段”与边界冻结
+
+针对拼接可靠性补了两类保护：
+
+- **分配保护**：`custom_assignments` 与素材数量不一致时，后端回退自动分配，避免异常输入导致仅首段生效。
+- **编码一致性**：
+  - 片段准备阶段统一重编码；
+  - concat 阶段不再走拷贝；
+  - 进一步统一为 `25fps + CFR`，并在 concat 增加 `+genpts`，降低段边界时间基不连续导致的“画面冻结口型还动”风险。
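As a rough illustration of the concat settings above, the command could be assembled like this. The function name, file names, and codec choices are assumptions for the sketch; the project's real arguments live in `video_service.py` and are not shown in this log.

```python
from typing import List


def build_concat_command(list_file: str, output: str, fps: int = 25) -> List[str]:
    """Assemble an FFmpeg concat command that re-encodes to a constant frame rate."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-fflags", "+genpts",   # input option: regenerate presentation timestamps
        "-i", list_file,
        "-r", str(fps),         # output frame rate
        "-vsync", "cfr",        # constant frame rate (newer FFmpeg also offers -fps_mode cfr)
        "-c:v", "libx264",      # re-encode video instead of stream copy (assumed codec)
        "-c:a", "aac",          # assumed audio codec
        output,
    ]
```

The list form can be passed directly to `subprocess.run(...)`. Note that `-fflags +genpts` has to appear before `-i` so that it applies to the concat input rather than to the output.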
+
+---
+
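+## ⏱️ Timeline Trim Semantics Alignment, Stage 4 (Day 24)
+
+### Background
+
+The timeline is designed with these semantics:
+
+- each segment can set `sourceStart/sourceEnd`;
+- when the total duration exceeds the audio, only the visible segments are kept and the last one is trimmed to end with the audio;
+- when the total duration falls short, the last visible segment loops to fill the gap.
+
+Today's changes align the frontend and backend with these semantics.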
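The three rules above can be written as a small planning function. This is only a sketch of the stated semantics, with a hypothetical name (`plan_segments`) and a simplified input (a list of per-segment durations in seconds); the real timeline code tracks more state per segment.

```python
from typing import List, Tuple


def plan_segments(durations: List[float], audio_duration: float) -> List[Tuple[int, float]]:
    """Return (segment_index, playback_seconds) pairs that exactly cover the audio."""
    plan: List[Tuple[int, float]] = []
    remaining = audio_duration
    for index, duration in enumerate(durations):
        if remaining <= 0:
            break  # segments past the end of the audio are not visible
        used = min(duration, remaining)  # the last visible segment is trimmed
        plan.append((index, used))
        remaining -= used
    if remaining > 0 and plan:
        # Total footage is shorter than the audio: the last visible
        # segment is looped to cover the shortfall.
        last_index, used = plan[-1]
        plan[-1] = (last_index, used + remaining)
    return plan
```

For example, three 5-second segments against 8 seconds of audio keep the first segment whole, trim the second to 3 seconds, and drop the third.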
+
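+### 4.1 `source_end` wired through the whole chain
+
+Previously only `source_start` was sent, so the backend could not know exactly where a trim should end.
+
+Changes:
+
+- Frontend `toCustomAssignments()` adds an optional `source_end`.
+- Backend `CustomAssignment` schema adds `source_end`.
+- The workflow passes `source_end` through to `prepare_segment()` (single- and multi-material).
+- `prepare_segment()` gains a `source_end` parameter, computes the usable clip as `[source_start, source_end)`, and when looping is needed trims first and loops second, so the loop range cannot drift.
+
+### 4.2 Timeline effective-duration fix
+
+Fixes the wrong effective duration when `sourceStart > 0` and `sourceEnd = 0`:
+
+- the old logic used the full material duration;
+- the new logic uses `materialDuration - sourceStart`.
+
+The fix applies to both:
+
+- the segment-duration calculation in `recalcPositions()`;
+- the "loop fill" visual ratio calculation in TimelineEditor.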
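The effective-duration rule from 4.2 fits in a few lines. The sketch below uses a hypothetical helper name and assumes, as the log implies, that `source_end == 0` means "open end, play to the end of the material".

```python
def effective_duration(material_duration: float,
                       source_start: float = 0.0,
                       source_end: float = 0.0) -> float:
    """Usable clip length in seconds for a [source_start, source_end) trim."""
    # source_end == 0 is treated as an open end (assumption from the log).
    end = source_end if source_end > 0 else material_duration
    end = min(end, material_duration)
    return max(end - source_start, 0.0)
```

With a 10-second material and `source_start = 3`, the open-ended case yields 7 seconds rather than the full 10 that the old logic would have reported.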
+
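+### 4.3 Visible-segment assignment priority fix
+
+Fixes the case where, with fewer visible segments than selected materials, `custom_assignments` was dropped in favor of automatic assignment:
+
+- generation requests now treat the `assignments` of the visible timeline segments as authoritative;
+- materials that fall outside the timeline do not take part in that generation.
+
+### 4.4 Single-material trim trigger completed
+
+In single-material mode, changing only the end point (`sourceEnd > 0`) now also sends `custom_assignments`, so the trim takes effect.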
+
+---
+
+## 🧭 页面交互与体验细节 — 第五阶段 (Day 24)
+
+- 页面刷新后自动回到顶部，避免从历史滚动位置进入页面。
+- 素材列表与历史视频列表滚动增加“跳过首次自动滚动”保护，减少恢复状态时页面跳动。
+- 时间轴比例区移除多余文案，保持信息简洁。
+
+---
+
+## 涉及文件汇总
+
+### 后端修改
+
+| 文件 | 变更 |
+|------|------|
+| `backend/app/repositories/users.py` | 新增 `deactivate_user_if_expired()` 与 `_parse_expires_at()` |
+| `backend/app/core/deps.py` | `get_current_user` / `get_current_user_optional` 接入到期失效检查 |
+| `backend/app/modules/auth/router.py` | 登录时到期停用 + `/api/auth/me` 统一鉴权依赖 |
+| `backend/app/modules/videos/schemas.py` | `CustomAssignment` 新增 `source_end`；保留 `output_aspect_ratio` |
+| `backend/app/modules/videos/workflow.py` | 多素材/单素材透传 `source_end`；多素材 prepare/concat 统一 25fps；标题显示模式参数透传 Remotion |
+| `backend/app/services/video_service.py` | 旋转元数据解析与方向归一化；`prepare_segment` 支持 `source_end/target_fps`；concat 强制 CFR + `+genpts` |
+| `backend/app/services/remotion_service.py` | render 支持 `title_display_mode/title_duration` 并传递到 render.ts |
+
+### 前端修改
+
+| 文件 | 变更 |
+|------|------|
+| `frontend/src/features/home/model/useTimelineEditor.ts` | `CustomAssignment` 新增 `source_end`；修复 sourceStart 开放终点时长计算 |
+| `frontend/src/features/home/model/useHomeController.ts` | 多素材以可见 assignments 为准发送；单素材截取触发条件补齐 |
+| `frontend/src/features/home/ui/TimelineEditor.tsx` | 画面比例下拉；循环比例按截取后有效时长计算 |
+| `frontend/src/features/home/model/useHomePersistence.ts` | `outputAspectRatio` 与 `titleDisplayMode` 持久化 |
+| `frontend/src/features/home/ui/HomePage.tsx` | 页面进入滚动到顶部；ClipTrimmer/Timeline 交互保持一致 |
+| `frontend/src/features/home/ui/FloatingStylePreview.tsx` | 标题/字幕样式预览与成片渲染策略对齐 |
+| `frontend/src/features/home/ui/TitleSubtitlePanel.tsx` | 标题行新增“短暂显示/常驻显示”下拉 |
+
+### Remotion 修改
+
+| 文件 | 变更 |
+|------|------|
+| `remotion/src/components/Title.tsx` | 标题响应式缩放与自动换行；新增短暂/常驻显示模式控制 |
+| `remotion/src/components/Subtitles.tsx` | 字幕响应式缩放与自动换行，减少预览/成片差异 |
+| `remotion/src/Video.tsx` | 新增 `titleDisplayMode` 透传到标题组件 |
+| `remotion/src/Root.tsx` | 默认 props 增加 `titleDisplayMode='short'` 与 `titleDuration=4` |
+| `remotion/render.ts` | CLI 参数新增 `--titleDisplayMode`，inputProps 增加 `titleDisplayMode` |
+
+---
+
+## 验证记录
+
+- 后端语法检查：`python -m py_compile backend/app/modules/videos/schemas.py backend/app/modules/videos/workflow.py backend/app/services/video_service.py backend/app/services/remotion_service.py`
+- 前端类型检查：`npx tsc --noEmit`
+- 前端 ESLint：`npx eslint src/features/home/model/useHomeController.ts src/features/home/model/useHomePersistence.ts src/features/home/ui/HomePage.tsx src/features/home/ui/TitleSubtitlePanel.tsx`
+- Remotion 渲染脚本构建：`npm run build:render`
--- a/Docs/DevLogs/Day25.md
+++ b/Docs/DevLogs/Day25.md
@@ -0,0 +1,254 @@
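+## 🔧 Script Extraction Assistant Fix: Douyin Links Fail to Extract (Day 25)
+
+### Overview
+
+After pasting a Douyin link, the script extraction assistant could not extract the script: yt-dlp failed with `Fresh cookies are needed`, and the manual fallback also broke because Douyin's page structure had changed. Today's work is a complete fix, plus removal of the no-longer-needed `DOUYIN_COOKIE` setting.
+
+---
+
+## 🐛 Diagnosis
+
+### Failure chain
+
+1. **yt-dlp fails**: `ERROR: [Douyin] Fresh cookies (not necessarily logged in) are needed`
+   - yt-dlp `2025.12.08` is outdated
+   - the Douyin API `aweme/v1/web/aweme/detail/` requires signed cookies (`s_v_web_id` and others); upgrading yt-dlp to the latest release and passing cookies still does not help, as this is a known yt-dlp issue
+2. **Manual fallback fails**: `Could not find RENDER_DATA in page`
+   - the old approach went through the desktop user profile page with `modal_id`, but Douyin's SSR no longer returns `videoDetail` data
+3. **`DOUYIN_COOKIE` in `.env`**: timestamped December 2024, long expired
+
+---
+
+## ✅ Fix: mobile share page + automatically fetched ttwid
+
+### Core idea
+
+Stop relying on yt-dlp for Douyin downloads and on hand-maintained cookies. Instead:
+
+1. Fetch a fresh `ttwid` from the public ByteDance API (an anonymous token, not tied to any account)
+2. Use that `ttwid` to load the mobile share page `m.douyin.com/share/video/{id}`
+3. Extract the `play_addr` playback URL from the JSON embedded in the page and download it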
+
+### Key code (rewrite of `_download_douyin_manual`)
+
+```python
+# 1. Fetch a fresh ttwid
+ttwid_resp = await client.post(
+    "https://ttwid.bytedance.com/ttwid/union/register/",
+    json={"region": "cn", "aid": 6383, "service": "www.douyin.com", ...}
+)
+ttwid = ttwid_resp.cookies.get("ttwid", "")
+
+# 2. Load the mobile share page
+page_resp = await client.get(
+    f"https://m.douyin.com/share/video/{video_id}",
+    headers={"cookie": f"ttwid={ttwid}", ...}
+)
+
+# 3. Extract play_addr
+addr_match = re.search(r'"play_addr":\{"uri":"([^"]+)","url_list":\["([^"]+)"', page_text)
+video_url = addr_match.group(2).replace(r"\u002F", "/")
+```
+
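+### Advantages
+
+- No more hand-maintained `DOUYIN_COOKIE`; the ttwid is fetched automatically on every request
+- Independent of how well yt-dlp supports Douyin at any given time
+- Works for all users and is not tied to a particular account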
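Step 3 of the approach (pulling `play_addr` out of the page text) can be tried in isolation. The sketch below reuses the regular expression from the excerpt and adds a guard for the no-match case; the function name and the sample page text are made up for illustration and do not reflect a real Douyin response.

```python
import re
from typing import Optional

PLAY_ADDR_PATTERN = re.compile(
    r'"play_addr":\{"uri":"([^"]+)","url_list":\["([^"]+)"'
)


def extract_play_url(page_text: str) -> Optional[str]:
    """Return the first play_addr URL found in the page, or None if absent."""
    match = PLAY_ADDR_PATTERN.search(page_text)
    if match is None:
        return None  # page layout changed or the video is unavailable
    # The embedded JSON escapes "/" as the literal text \u002F.
    return match.group(2).replace(r"\u002F", "/")


# Hypothetical sample input, shaped like the embedded JSON the regex expects.
sample = r'{"play_addr":{"uri":"v0200abc","url_list":["https:\u002F\u002Fexample.com\u002Fplay\u002F?video_id=v0200abc"]}}'
print(extract_play_url(sample))
```

Returning `None` instead of calling `.group()` on a failed match lets the caller raise a descriptive error when the page format changes again.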
+
+---
+
+## 🧹 清理 DOUYIN_COOKIE 配置
+
+`DOUYIN_COOKIE` 仅用于文案提取，新方案不再需要，已从以下位置删除：
+
+| 文件 | 变更 |
+|------|------|
+| `backend/.env` | 删除 `DOUYIN_COOKIE` 配置项及注释 |
+| `backend/app/core/config.py` | 删除 `DOUYIN_COOKIE: str = ""` 字段定义 |
+| `backend/app/modules/tools/service.py` | 删除 yt-dlp 传 cookie 逻辑和 `_write_netscape_cookies` 辅助函数 |
+
+---
+
+## 🔤 前端文案修正
+
+将文案提取界面中的"AI 洗稿结果"改为"AI 改写结果"。
+
+| 文件 | 变更 |
+|------|------|
+| `frontend/src/features/home/ui/ScriptExtractionModal.tsx` | `AI 洗稿结果` → `AI 改写结果` |
+| `backend/app/modules/tools/service.py` | 注释中"洗稿"→"改写" |
+| `backend/app/services/glm_service.py` | docstring 中"洗稿"→"改写文案" |
+
+---
+
+## 📦 其他变更
+
+- **yt-dlp 升级**：`2025.12.08` → `2026.2.21`
+- **yt-dlp 初始化修正**：改为 `YoutubeDL(ydl_opts)` 直接传参初始化（原先空初始化后 update params 不生效）
+- **User-Agent 更新**：yt-dlp 中 `Chrome/91` → `Chrome/131`
+
+---
+
+## 涉及文件汇总
+
+### 后端修改
+
+| 文件 | 变更 |
+|------|------|
+| `backend/app/modules/tools/service.py` | 重写 `_download_douyin_manual`（移动端分享页方案）；修正 yt-dlp 初始化；清理 cookie 相关代码；注释改写 |
+| `backend/app/services/glm_service.py` | docstring "洗稿" → "改写文案" |
+| `backend/app/core/config.py` | 删除 `DOUYIN_COOKIE` 字段 |
+| `backend/.env` | 删除 `DOUYIN_COOKIE` 配置 |
+| `backend/requirements.txt` | yt-dlp 版本升级 |
+
+### 前端修改
+
+| 文件 | 变更 |
+|------|------|
+| `frontend/src/features/home/ui/ScriptExtractionModal.tsx` | "AI 洗稿结果" → "AI 改写结果" |
+
+---
+
+## ✏️ AI 智能改写 — 自定义提示词功能
+
+### 概述
+
+文案提取助手的"AI 智能改写"原先使用硬编码 prompt，用户无法定制改写风格。本次在 checkbox 右侧新增"自定义提示词"折叠区域，用户可编辑自定义 prompt，持久化到 localStorage，后端按需替换默认 prompt。
+
+### 后端修改
+
+**路由层** (`router.py`)：`extract_script_tool` 新增可选 Form 参数 `custom_prompt: Optional[str] = Form(None)`，透传给 service。
+
+**服务层** (`service.py`)：`extract_script()` 签名新增 `custom_prompt`，透传给 `glm_service.rewrite_script(script, custom_prompt)`。
+
+**AI 层** (`glm_service.py`)：`rewrite_script(self, text, custom_prompt=None)`，若 `custom_prompt` 有值则用自定义 prompt + 原文拼接，否则保持原有默认 prompt。
+
+```python
+if custom_prompt and custom_prompt.strip():
+    prompt = f"""{custom_prompt.strip()}
+
+原始文案：
+{text}"""
+else:
+    prompt = f"""请将以下视频文案进行改写。...（原有默认）"""
+```
+
+### 前端修改
+
+**Hook** (`useScriptExtraction.ts`)：
+- 新增 `customPrompt` / `showCustomPrompt` 状态
+- 初始值从 `localStorage.getItem("vigent_rewriteCustomPrompt")` 恢复
+- `customPrompt` 变化时防抖 300ms 保存到 localStorage
+- `handleExtract()` 中若 `doRewrite && customPrompt.trim()` 有值，追加 `formData.append("custom_prompt", ...)`
+- modal 重置时不清空 customPrompt（持久化偏好）
+
+**UI** (`ScriptExtractionModal.tsx`)：
+- checkbox 同行右侧新增"自定义提示词 ▼"按钮（仅 `doRewrite` 时显示）
+- 点击展开 textarea 编辑区域，底部提示"留空则使用默认提示词"
+- 取消勾选 AI 智能改写时，自定义提示词区域自动隐藏
+
+### 涉及文件
+
+| 文件 | 变更 |
+|------|------|
+| `backend/app/modules/tools/router.py` | 新增 `custom_prompt` Form 参数 |
+| `backend/app/modules/tools/service.py` | `extract_script()` 透传 `custom_prompt` |
+| `backend/app/services/glm_service.py` | `rewrite_script()` 支持自定义 prompt |
+| `frontend/.../useScriptExtraction.ts` | 新增状态、localStorage 持久化、FormData 传参 |
+| `frontend/.../ScriptExtractionModal.tsx` | UI 按钮 + 展开 textarea |
+
+### 验证
+
+- 后端 `python -m py_compile` 三个文件通过
+- 前端 `npx tsc --noEmit` 通过
+
+---
+
+## 🐛 SSR 构建修复 — localStorage is not defined
+
+### 问题
+
+`npm run build` 报错 `ReferenceError: localStorage is not defined`，因为 `useScriptExtraction.ts` 中 `useState` 的初始化函数在 SSR（Node.js）环境下也会执行，而服务端没有 `localStorage`。
+
+### 修复
+
+`useState` 初始化加 `typeof window !== "undefined"` 守卫：
+
+```typescript
+const [customPrompt, setCustomPrompt] = useState(
+    () => typeof window !== "undefined" ? localStorage.getItem(CUSTOM_PROMPT_KEY) || "" : ""
+);
+```
+
+| 文件 | 变更 |
+|------|------|
+| `frontend/.../useScriptExtraction.ts` | `useState` 初始化增加 SSR 安全守卫 |
+
+---
+
+## 🎬 片头副标题功能
+
+### 概述
+
+新增片头副标题（secondary_title），显示在主标题下方，用于补充说明或悬念引导。副标题有独立的样式配置（字体、字号、颜色等），可由 AI 同时生成，20 字限制，仅在视频画面中显示，不参与发布标题。
+
+命名约定：后端 `secondary_title`（snake_case），前端 `videoSecondaryTitle`（camelCase），用户界面"片头副标题"。
+
+---
+
+### 后端修改
+
+| 文件 | 变更 |
+|------|------|
+| `backend/app/modules/videos/schemas.py` | `GenerateRequest` 新增 4 个可选字段：`secondary_title`、`secondary_title_style_id`、`secondary_title_font_size`、`secondary_title_top_margin` |
+| `backend/app/services/glm_service.py` | AI prompt 增加副标题生成要求（不超过20字），JSON 格式新增 `secondary_title` 字段 |
+| `backend/app/modules/ai/router.py` | `GenerateMetaResponse` 增加 `secondary_title: str = ""`，endpoint 返回时取 `result.get("secondary_title", "")` |
+| `backend/app/modules/videos/workflow.py` | `use_remotion` 条件增加 `or req.secondary_title`；副标题样式解析复用 `get_style("title", ...)`；字号/间距覆盖；`prepare_style_for_remotion` 处理副标题字体；`remotion_service.render()` 传入 `secondary_title` + `secondary_title_style` |
+| `backend/app/services/remotion_service.py` | `render()` 新增 `secondary_title` 和 `secondary_title_style` 参数，构建 CLI 参数 `--secondaryTitle` 和 `--secondaryTitleStyle` |
+
+### Remotion 修改
+
+| 文件 | 变更 |
+|------|------|
+| `remotion/render.ts` | `RenderOptions` 新增 `secondaryTitle?` + `secondaryTitleStyle?`；`parseArgs()` 新增 switch case；`inputProps` 新增两个字段 |
+| `remotion/src/components/Title.tsx` | `TitleProps` 新增 `secondaryTitle?` 和 `secondaryTitleStyle?`；`AbsoluteFill` 改为 `flexDirection: 'column'` 垂直堆叠；主标题 `<h1>` 后增加副标题 `<h2>`，独立样式（默认字号 48px、字重 700），共享淡入淡出动画；副标题字体使用独立 `@font-face`（`SecondaryTitleFont`）避免与主标题冲突 |
+| `remotion/src/Video.tsx` | `VideoProps` 新增 `secondaryTitle?` + `secondaryTitleStyle?`；传递给 `<Title>` 组件；渲染条件改为 `{(title \|\| secondaryTitle) && ...}` |
+| `remotion/src/Root.tsx` | `defaultProps` 新增 `secondaryTitle: undefined` + `secondaryTitleStyle: undefined` |
+
+### 前端修改
+
+| 文件 | 变更 |
+|------|------|
+| `frontend/src/shared/lib/title.ts` | 新增 `SECONDARY_TITLE_MAX_LENGTH = 20` 和 `clampSecondaryTitle()` |
+| `frontend/src/features/home/model/useHomeController.ts` | 新增状态 `videoSecondaryTitle`、`selectedSecondaryTitleStyleId`、`secondaryTitleFontSize`、`secondaryTitleTopMargin`、`secondaryTitleSizeLocked`；新建 `secondaryTitleInput = useTitleInput({ maxLength: 20 })`（不 sync 到发布页）；`handleGenerateMeta()` 接收并填充 `secondary_title`；`handleGenerate()` 构建 payload 增加副标题字段；return 暴露所有新状态 |
+| `frontend/src/features/home/model/useHomePersistence.ts` | 新增 localStorage key：`secondaryTitle`、`secondaryTitleStyle`、`secondaryTitleFontSize`、`secondaryTitleTopMargin`；对应的恢复和保存 effect |
+| `frontend/src/features/home/ui/TitleSubtitlePanel.tsx` | Props 新增副标题相关；主标题输入框下方添加"片头副标题（限制20个字）"输入框；副标题样式选择器（复用 titleStyles 预设）、字号滑块（30-100px）、间距滑块（0-100px） |
+| `frontend/src/features/home/ui/FloatingStylePreview.tsx` | 标题预览改为 flex column 布局；主标题下方增加副标题预览行，独立样式渲染 |
+| `frontend/src/features/home/ui/HomePage.tsx` | 从 `useHomeController` 解构新状态，传给 `TitleSubtitlePanel` |
+
+---
+
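+## 🐛 Reference Audio Upload: InvalidKey on Chinese Filenames
+
+### Problem
+
+Uploading reference audio with a Chinese filename (such as "我的声音.wav") made Supabase Storage return `InvalidKey`, because the storage path used the original Chinese filename directly.
+
+### Fix
+
+`ref_audios/service.py` gains a `sanitize_filename()` function that reduces the filename in the storage path to ASCII-safe characters (only `A-Za-z0-9._-`):
+
+- NFKD normalization → drop non-ASCII → replace illegal characters with `_`
+- when a purely Chinese or emoji name is empty after cleaning, fall back to an MD5 hash (for example `audio_e924b1193007`)
+- filenames are capped at 50 characters
+- the original Chinese filename is kept in metadata as the display name, so what the frontend shows is unchanged
+
+```
+Before: cbbe.../1771915755_我的声音.wav  → InvalidKey
+After:  cbbe.../1771915755_audio_xxxxxxxx.wav → upload succeeds
+```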
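A minimal sketch of the cleaning steps listed above, under stated assumptions: the log names `sanitize_filename()` but does not show its body, so the handling of the extension, the 12-character digest length (taken from the `audio_e924b1193007` example), and hashing the full original name are guesses.

```python
import hashlib
import re
import unicodedata
from pathlib import PurePosixPath

MAX_NAME_LENGTH = 50


def _to_ascii(text: str) -> str:
    """NFKD-normalize, drop non-ASCII, replace anything outside A-Za-z0-9._- with '_'."""
    normalized = unicodedata.normalize("NFKD", text)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Za-z0-9._-]", "_", ascii_only)


def sanitize_filename(original: str) -> str:
    """Return an ASCII-safe storage name of at most MAX_NAME_LENGTH characters."""
    path = PurePosixPath(original)
    stem = _to_ascii(path.stem).strip("._-")
    suffix = _to_ascii(path.suffix)
    if not stem:
        # Nothing survived cleaning (e.g. a purely Chinese name): use a hash.
        digest = hashlib.md5(original.encode("utf-8")).hexdigest()[:12]
        stem = f"audio_{digest}"
    keep = max(1, MAX_NAME_LENGTH - len(suffix))
    return stem[:keep] + suffix
```

Because the hash is derived from the original name, the same upload name always maps to the same storage name, while the display name can still be kept separately in metadata.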
+
+| 文件 | 变更 |
+|------|------|
+| `backend/app/modules/ref_audios/service.py` | 新增 `sanitize_filename()` 函数，上传路径使用清洗后文件名 |
--- a/Docs/DevLogs/Day26.md
+++ b/Docs/DevLogs/Day26.md
@@ -0,0 +1,239 @@
+## 🎨 前端优化：板块合并 + 序号标题 + UI 精细化 (Day 26)
+
+### 概述
+
+首页原有 9 个独立板块（左栏 7 个 + 右栏 2 个），每个都有自己的卡片容器和标题，视觉碎片化严重。本次将相关板块合并为 5 个主板块，添加中文序号（一~十），移除 emoji 图标，并对多个子组件的布局和交互细节进行优化。
+
+---
+
+## ✅ 改动内容
+
+### 1. 板块合并方案
+
+**左栏（4 个主板块 + 2 个独立区域）：**
+
+| 序号 | 板块名 | 子板块 | 原组件 |
+|------|--------|--------|--------|
+| 一 | 文案提取与编辑 | — | ScriptEditor |
+| 二 | 标题与字幕 | — | TitleSubtitlePanel |
+| 三 | 配音 | 配音方式 / 配音列表 | VoiceSelector + GeneratedAudiosPanel |
+| 四 | 素材编辑 | 视频素材 / 时间轴编辑 | MaterialSelector + TimelineEditor |
+| 五 | 背景音乐 | — | BgmPanel |
+| — | 生成按钮 | — | GenerateActionBar（不编号） |
+
+**右栏（1 个主板块）：**
+
+| 序号 | 板块名 | 子板块 | 原组件 |
+|------|--------|--------|--------|
+| 六 | 作品 | 作品列表 / 作品预览 | HistoryList + PreviewPanel |
+
+**发布页（/publish）：**
+
+| 序号 | 板块名 |
+|------|--------|
+| 七 | 平台账号 |
+| 八 | 选择发布作品 |
+| 九 | 发布信息 |
+| 十 | 选择发布平台 |
+
+### 2. embedded 模式
+
+6 个组件新增 `embedded?: boolean` prop（默认 `false`）：
+
+- `VoiceSelector` — embedded 时不渲染外层卡片和主标题
+- `GeneratedAudiosPanel` — embedded 时两行布局：第 1 行（语速+生成配音右对齐）、第 2 行（配音列表+刷新）
+- `MaterialSelector` — embedded 时自渲染 h3 子标题"视频素材"+ 上传/刷新按钮同行
+- `TimelineEditor` — embedded 时自渲染 h3 子标题"时间轴编辑"+ 画面比例/播放控件同行
+- `PreviewPanel` — embedded 时不渲染外层卡片和标题
+- `HistoryList` — embedded 时不渲染外层卡片和标题（刷新按钮由 HomePage 提供）
+
+### 3. 序号标题 + emoji 移除
+
+所有编号板块移除 emoji 图标，使用纯中文序号：
+
+- ScriptEditor: `✍️ 文案提取与编辑` → `一、文案提取与编辑`
+- TitleSubtitlePanel: `🎬 标题与字幕` → `二、标题与字幕`
+- BgmPanel: `🎵 背景音乐` → `五、背景音乐`
+- HomePage 右栏: `五、作品` → `六、作品`
+- PublishPage: `👤 平台账号` → `七、平台账号`、`📹 选择发布作品` → `八、选择发布作品`、`✍️ 发布信息` → `九、发布信息`、`📱 选择发布平台` → `十、选择发布平台`
+
+### 4. 子标题与分隔样式
+
+- **主标题**: `text-base sm:text-lg font-semibold text-white`
+- **子标题**: `text-sm font-medium text-gray-400`
+- **分隔线**: `<div className="border-t border-white/10 my-4" />`
+
+### 5. 配音列表布局优化
+
+GeneratedAudiosPanel embedded 模式下采用两行布局：
+- **第 1 行**：语速下拉 + 生成配音按钮（右对齐，`flex justify-end`）
+- **第 2 行**：`<h3>配音列表</h3>` + 刷新按钮（两端对齐）
+- 非 embedded 模式保持原单行布局
+
+### 6. TitleSubtitlePanel 下拉对齐
+
+- 标题样式/副标题样式/字幕样式三行标签统一 `w-20`（固定 80px），确保下拉菜单垂直对齐
+- 下拉菜单宽度 `w-1/3 min-w-[100px]`，避免过宽
+
+### 7. RefAudioPanel 文案简化
+
+- 原底部段落"上传任意语音样本（3-10秒）…" 移至 "我的参考音频" 标题旁，简化为 `(上传3-10秒语音样本)`
+
+### 8. 账户下拉菜单添加手机号
+
+- AccountSettingsDropdown 在账户有效期上方新增手机号显示区域
+- 显示 `user?.phone || '未知账户'`
+
+### 9. 标题显示模式对副标题生效
+
+- **payload 修复**: `useHomeController.ts` 中 `title_display_mode` 的发送条件从 `videoTitle.trim()` 改为 `videoTitle.trim() || videoSecondaryTitle.trim()`，确保仅有副标题时也能发送显示模式
+- **UI 调整**: 短暂显示/常驻显示下拉从片头标题输入行移至"二、标题与字幕"板块标题行（与预览样式按钮同行），明确表示该设置对标题和副标题同时生效
+- Remotion 端 `Title.tsx` 已支持（标题和副标题作为整体组件渲染，`displayMode` 统一控制）
+
+### 10. 时间轴模糊遮罩
+
+遮罩从外层 wrapper 移入"四、素材编辑"卡片内，仅覆盖时间轴子区域（`rounded-xl`）。
+
+### 11. 登录后用户信息立即可用
+
+- AuthContext 新增 `setUser` 方法暴露给消费者
+- 登录页成功后调用 `setUser(result.user)` 立即写入 Context，无需等页面刷新
+- 修复登录后账户下拉显示"未知账户"、刷新后才显示手机号的问题
+
+### 12. 文案与选项微调
+
+- MaterialSelector 描述 `(可多选，最多4个)` → `(上传自拍视频，最多可选4个)`
+- TitleSubtitlePanel 显示模式选项 `短暂显示/常驻显示` → `标题短暂显示/标题常驻显示`
+
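+### 13. UI/UX improvements (6 items)
+
+- **Action buttons visible on mobile**: action buttons in the voice-over list, works list, material list, reference audio, and script history change from `opacity-0` (shown only on hover) to `opacity-40` (semi-transparent at rest, full opacity on hover), so touch devices can actually discover them
+- **Phone number masking**: AccountSettingsDropdown masks the middle four digits of the phone number, e.g. `138****5678`
+- **Title character counter**: the title and secondary-title inputs in TitleSubtitlePanel show a live count such as `3/15` on the right, turning red past the limit
+- **List scrollbar hint**: ~~voice-over, works, material, and BGM lists changed from `hide-scrollbar` to `custom-scrollbar`~~ → all reverted to `hide-scrollbar` (scrolling itself is unchanged)
+- **Timeline drag hint**: TimelineEditor blocks get a `GripVertical` grip icon in the top-left corner to suggest drag-to-reorder
+- **Larger trim slider**: ClipTrimmer handles grow from 16px to 20px and the touch target from 32px to 40px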
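The phone masking rule is simple enough to show directly. The actual change lives in the frontend component; the Python below is only an illustration of the rule, with a hypothetical function name and the assumption that non-11-digit values are shown unchanged.

```python
def mask_phone(phone: str) -> str:
    """Mask the middle four digits of an 11-digit mobile number."""
    if len(phone) != 11 or not phone.isdigit():
        return phone  # assumption: anything else is displayed as-is
    return f"{phone[:3]}****{phone[7:]}"
```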
+
+### 14. 代码质量修复（4 项）
+
+- **AccountSettingsDropdown**: 关闭密码弹窗补齐 `setSuccess('')` 清空
+- **MaterialSelector**: `selectedSet` 加 `useMemo` 避免每次渲染重建
+- **TimelineEditor**: `visibleSegments`/`overflowSegments` 加 `useMemo`
+- **MaterialSelector**: 素材满 4 个时非选中项按钮加 `disabled`
+
+### 15. 发布页平台账号响应式布局
+
+- **单行布局**：图标+名称+状态在左，按钮在右（`flex items-center`）
+- **移动端紧凑**：图标 `h-6 w-6`、按钮 `text-xs px-2 py-1 rounded-md`、间距 `space-y-2 px-3 py-2.5`
+- **桌面端宽松**：`sm:h-7 sm:w-7`、`sm:text-sm sm:px-3 sm:py-1.5 sm:rounded-lg`、`sm:space-y-3 sm:px-4 sm:py-3.5`
+- 两端各自美观，风格与其他板块一致
+
+### 16. 移动端刷新回顶部修复
+
+- **问题**: 移动端刷新页面后不回到顶部，而是滚动到背景音乐板块
+- **根因**: 1) 浏览器原生滚动恢复覆盖 `scrollTo(0,0)`；2) 列表 scroll effect 有双依赖（`selectedId` + `list`），数据异步加载时第二次触发跳过了 ref 守卫，执行了 `scrollIntoView` 导致页面跳动
+- **修复**: 三管齐下 — ① `history.scrollRestoration = "manual"` 禁用浏览器原生恢复；② 时间门控 `scrollEffectsEnabled` ref（1 秒内禁止所有列表自动滚动）替代单次 ref 守卫；③ 200ms 延迟兜底 `scrollTo(0,0)`
+
+### 17. 移动端样式预览窗口缩小
+
+- **问题**: 移动端点击"预览样式"后窗口占满整屏（宽 358px，高约 636px），遮挡样式调节控件
+- **修复**: 移动端宽度从 `window.innerWidth - 32` 缩小到 **160px**；位置从左上角改为**右下角**（`right:12, bottom:12`），不遮挡上方控件；最大高度限制 `50dvh`
+- 桌面端保持不变（280px，左上角）
+
+### 18. 列表滚动条统一隐藏
+
+- 将 Day 26 早期改为 `custom-scrollbar`（细紫色滚动条）的 7 处全部改回 `hide-scrollbar`
+- 涉及：BgmPanel、GeneratedAudiosPanel、HistoryList、MaterialSelector（2处）、ScriptExtractionModal（2处）
+- 滚动功能不受影响，仅视觉上不显示滚动条
+
+### 19. 配音按钮移动端适配
+
+- VoiceSelector "选择声音/克隆声音" 按钮：内边距 `px-4` → `px-2 sm:px-4`，字号加 `text-sm sm:text-base`，图标加 `shrink-0`
+- 修复移动端窄屏下按钮被挤压导致"克隆声音"不可见的问题
+
+### 20. 素材标题溢出修复
+
+- MaterialSelector embedded 标题行移除 `whitespace-nowrap`
+- 描述文字 `(上传自拍视频，最多可选4个)` 在移动端隐藏（`hidden sm:inline`），桌面端正常显示
+- 修复移动端刷新按钮被推出容器外的问题
+
+### 21. 生成配音按钮放大
+
+- "生成配音" 作为核心操作按钮，从辅助尺寸升级为主操作尺寸
+- 内边距 `px-2/px-3 py-1/py-1.5` → `px-4 py-2`，字号 `text-xs` → `text-sm font-medium`
+- 图标 `h-3.5 w-3.5` → `h-4 w-4`，新增 `shadow-sm` + hover `shadow-md`
+- embedded 与非 embedded 模式统一放大
+
+### 22. 生成进度条位置调整
+
+- **问题**: 生成进度条在"六、作品"卡片内部（作品预览下方），不够醒目
+- **修复**: 进度条从 PreviewPanel 内部提取到 HomePage 右栏，作为独立卡片渲染在"六、作品"卡片**上方**
+- 使用紫色边框（`border-purple-500/30`）区分，显示任务消息和百分比
+- PreviewPanel embedded 模式下不再渲染进度条（传入 `currentTask={null}`）
+- 生成完成后进度卡片自动消失
+
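+### 23. LatentSync timeout fix
+
+- **Problem**: a roughly 2-minute video (3023 frames, 190 inference segments) was estimated at 54 minutes of inference, but the httpx timeout was only 20 minutes, so the LatentSync call failed and the pipeline fell back to output without lip sync
+- **Root cause**: `httpx.AsyncClient(timeout=1200.0)` in `lipsync_service.py` is too short for long-video inference
+- **Fix**: timeout raised from `1200s` (20 minutes) to `3600s` (1 hour), enough to cover inference for 2-3 minute videos
+
+### 24. Subtitle timestamp rhythm mapping (fixes subtitle drift on long videos)
+
+- **Problem**: on a 2-minute video the subtitles were clearly out of sync with the speech, and the offset grew toward the end
+- **Root cause**: the `original_text` path in `whisper_service.py` discarded Whisper's per-word timestamps, kept only the overall time range, and interpolated linearly across it, giving every character the same duration and ignoring changes in speaking rate and pauses
+- **Fix**: keep Whisper's per-character timestamps as a speech-rhythm template and map the original characters onto it proportionally (rhythm mapping) instead of dividing time evenly. The subtitle text is unchanged; only the timestamps follow the real speaking pace
+- **Algorithm**: the i-th original character maps to position `(i/N)*M` on the Whisper timeline (N = number of original characters, M = number of Whisper characters), with linear interpolation between adjacent Whisper time points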
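The mapping in item 24 can be sketched as below. It assumes Whisper's output has already been flattened into one `(start, end)` pair per recognized character; `map_rhythm` is a hypothetical name, and folding any pause after a character into that character's span is a simplification of this sketch, not something the log specifies.

```python
from typing import List, Tuple


def map_rhythm(n_chars: int,
               whisper_chars: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Assign (start, end) times to n_chars original characters, following Whisper's pacing."""
    m = len(whisper_chars)
    if n_chars <= 0 or m == 0:
        return []
    # Boundary k is the start of Whisper character k; the final boundary
    # is the end of the last Whisper character.
    boundaries = [start for start, _ in whisper_chars] + [whisper_chars[-1][1]]

    def time_at(position: float) -> float:
        """Linearly interpolate a time for a fractional position in [0, m]."""
        lower = min(int(position), m - 1)
        fraction = position - lower
        return boundaries[lower] + fraction * (boundaries[lower + 1] - boundaries[lower])

    return [
        (time_at(i * m / n_chars), time_at((i + 1) * m / n_chars))
        for i in range(n_chars)
    ]
```

With two Whisper characters spanning 0-1s and 1-3s and four original characters, the first two characters share the fast first second and the last two share the slower two seconds, whereas even division would give each character 0.75s.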
+
+---
+
+## 📁 修改文件清单
+
+| 文件 | 改动 |
+|------|------|
+| `VoiceSelector.tsx` | 新增 embedded prop，移动端按钮适配（`px-2 sm:px-4`） |
+| `GeneratedAudiosPanel.tsx` | 新增 embedded prop，两行布局，操作按钮可见度，"生成配音"按钮放大 |
+| `MaterialSelector.tsx` | 新增 embedded prop，自渲染子标题+操作按钮，useMemo，disabled 守卫，操作按钮可见度，标题溢出修复 |
+| `TimelineEditor.tsx` | 新增 embedded prop，自渲染子标题+控件，useMemo，拖拽抓手图标 |
+| `PreviewPanel.tsx` | 新增 embedded prop |
+| `HistoryList.tsx` | 新增 embedded prop，操作按钮可见度 |
+| `ScriptEditor.tsx` | 标题加序号，移除 emoji，操作按钮可见度 |
+| `TitleSubtitlePanel.tsx` | 标题加序号，移除 emoji，下拉对齐，显示模式下拉上移，字数计数器 |
+| `BgmPanel.tsx` | 标题加序号 |
+| `HomePage.tsx` | 核心重构：合并板块、序号标题、生成配音按钮迁入、`scrollRestoration` + 延迟兜底修复刷新回顶部、生成进度条提取到作品卡片上方 |
+| `PublishPage.tsx` | 四个板块加序号（七~十），移除 emoji，平台卡片响应式单行布局 |
+| `RefAudioPanel.tsx` | 简化提示文案，操作按钮可见度 |
+| `AccountSettingsDropdown.tsx` | 新增手机号显示（脱敏），补齐 success 清空 |
+| `AuthContext.tsx` | 新增 `setUser` 方法，登录后立即更新用户状态 |
+| `login/page.tsx` | 登录成功后调用 `setUser` 写入用户数据 |
+| `useHomeController.ts` | titleDisplayMode 条件修复，列表 scroll 时间门控 `scrollEffectsEnabled` |
+| `FloatingStylePreview.tsx` | 移动端预览窗口缩小（160px）并移至右下角 |
+| `ScriptExtractionModal.tsx` | 滚动条改回隐藏 |
+| `ClipTrimmer.tsx` | 滑块手柄放大、触控区增高 |
+| `lipsync_service.py` | httpx 超时从 1200s 改为 3600s |
+| `whisper_service.py` | 字幕时间戳从线性插值改为 Whisper 节奏映射 |
+
+---
+
+## 🔍 验证
+
+- `npm run build` — 零报错零警告
+- 合并后布局：各子板块分隔清晰、主标题有序号
+- 向后兼容：`embedded` 默认 `false`，组件独立使用不受影响
+- 配音列表两行布局：语速+生成配音在上，配音列表+刷新在下
+- 下拉菜单垂直对齐正确
+- 短暂显示/常驻显示对标题和副标题同时生效
+- 操作按钮在移动端（触屏）可见
+- 手机号脱敏显示
+- 标题字数计数器正常
+- 列表滚动条全部隐藏
+- 时间轴拖拽抓手图标显示
+- 发布页平台卡片：移动端紧凑、桌面端宽松，风格一致
+- 移动端刷新后回到顶部，不再滚动到背景音乐位置
+- 移动端样式预览窗口不遮挡控件
+- 移动端配音按钮（选择声音/克隆声音）均可见
+- 移动端素材标题行按钮不溢出
+- 生成配音按钮视觉层级高于辅助按钮
+- 生成进度条在作品卡片上方独立显示
+- LatentSync 长视频推理不再超时回退
+- 字幕时间戳与语音节奏同步，长视频不漂移
--- a/Docs/DevLogs/Day27.md
+++ b/Docs/DevLogs/Day27.md
@@ -0,0 +1,231 @@
+## Remotion 描边修复 + 字体样式扩展 + TypeScript 修复 (Day 27)
+
+### 概述
+
+修复标题/字幕描边渲染问题（描边过粗 + 副标题重影），扩展字体样式选项（标题 4→12、字幕 4→8），修复 Remotion 项目 TypeScript 类型错误。
+
+---
+
+## ✅ 改动内容
+
+### 1. 描边渲染修复（标题 + 字幕）
+
+- **问题**: 标题黑色描边过粗，副标题出现重影/鬼影
+- **根因**: `buildTextShadow` 用 4 方向 `textShadow` 模拟描边 — 对角线叠加导致描边视觉上比实际 `stroke_size` 更粗；4 角方向在中间有间隙和叠加，造成重影
+- **修复**: 改用 CSS 原生描边 `-webkit-text-stroke` + `paint-order: stroke fill`（Remotion 用 Chromium 渲染，完美支持）
+- **旧方案**:
+  ```javascript
+  textShadow: `-8px -8px 0 #000, 8px -8px 0 #000, -8px 8px 0 #000, 8px 8px 0 #000, 0 0 16px rgba(0,0,0,0.5), 0 2px 4px rgba(0,0,0,0.3)`
+  ```
+- **新方案**:
+  ```javascript
+  WebkitTextStroke: `5px #000000`,
+  paintOrder: 'stroke fill',
+  textShadow: `0 2px 4px rgba(0,0,0,0.3)`,
+  ```
+- 同时将所有预设样式的 `stroke_size` 从 8 降到 5，配合原生描边视觉更干净
+
+### 2. 字体样式扩展
+
+**标题样式**: 4 个 → 12 个（+8）
+
+| ID | 样式名 | 字体 | 配色 |
+|----|--------|------|------|
+| title_pangmen | 庞门正道 | 庞门正道标题体3.0 | 白字黑描 |
+| title_round | 优设标题圆 | 优设标题圆 | 白字紫描 |
+| title_alibaba | 阿里数黑体 | 阿里巴巴数黑体 | 白字黑描 |
+| title_chaohei | 文道潮黑 | 文道潮黑 | 青蓝字深蓝描 |
+| title_wujie | 无界黑 | 标小智无界黑 | 白字深灰描 |
+| title_houdi | 厚底黑 | Aa厚底黑 | 红字深黑描 |
+| title_banyuan | 寒蝉半圆体 | 寒蝉半圆体 | 白字黑描 |
+| title_jixiang | 欣意吉祥宋 | 字体圈欣意吉祥宋 | 金字棕描 |
+
+**字幕样式**: 4 个 → 8 个（+4）
+
+| ID | 样式名 | 字体 | 高亮色 |
+|----|--------|------|--------|
+| subtitle_pink | 少女粉 | DingTalk JinBuTi | 粉色 #FF69B4 |
+| subtitle_lime | 清新绿 | DingTalk Sans | 荧光绿 #76FF03 |
+| subtitle_gold | 金色隶书 | 阿里妈妈刀隶体 | 金色 #FDE68A |
+| subtitle_kai | 楷体红字 | SimKai | 红色 #FF4444 |
+
+### 3. TypeScript 类型错误修复
+
+- **Root.tsx**: `Composition` 泛型类型与 `calculateMetadata` 参数类型不匹配 — 内联 `calculateMetadata` 并显式标注参数类型，`defaultProps` 使用 `satisfies VideoProps` 约束
+- **Video.tsx**: `VideoProps` 接口添加 `[key: string]: unknown` 索引签名，兼容 Remotion 要求的 `Record<string, unknown>` 约束
+- **VideoLayer.tsx**: `OffthreadVideo` 组件不支持 `loop` prop — 移除（该 prop 原本就被忽略）
+
+### 4. 进度条文案还原
+
+- **问题**: 进度条显示后端推送的详细阶段消息（如"正在合成唇型"），用户希望只显示"正在AI生成中..."
+- **修复**: `HomePage.tsx` 进度条文案从 `{currentTask.message || "正在AI生成中..."}` 改为固定 `正在AI生成中...`
+
+---
+
+## 📁 修改文件清单
+
+| 文件 | 改动 |
+|------|------|
+| `remotion/src/components/Title.tsx` | `buildTextShadow` → `buildStrokeStyle`（CSS 原生描边），标题+副标题同时生效 |
+| `remotion/src/components/Subtitles.tsx` | `buildTextShadow` → `buildStrokeStyle`（CSS 原生描边） |
+| `remotion/src/Root.tsx` | 修复 `Composition` 泛型类型、`calculateMetadata` 参数类型 |
+| `remotion/src/Video.tsx` | `VideoProps` 添加索引签名 |
+| `remotion/src/components/VideoLayer.tsx` | 移除 `OffthreadVideo` 不支持的 `loop` prop |
+| `backend/assets/styles/title.json` | 标题样式从 4 个扩展到 12 个，`stroke_size` 8→5 |
+| `backend/assets/styles/subtitle.json` | 字幕样式从 4 个扩展到 8 个 |
+| `frontend/.../HomePage.tsx` | 进度条文案还原为固定"正在AI生成中..." |
+
+---
+
+## 🔍 验证
+
+- `npx tsc --noEmit` — 零错误
+- `npm run build:render` — 渲染脚本编译成功
+- `npm run build`（前端）— 零报错
+- 描边：标题/副标题/字幕使用 CSS 原生描边，无重影、无虚胖
+- 样式选择：前端下拉可加载全部 12 个标题 + 8 个字幕样式
+
+---
+
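+## Video Generation Pipeline Performance Optimization
+
+### Overview
+
+A broad performance pass over the video generation pipeline, covering FFmpeg encoding parameters, LatentSync inference parameters, multi-material parallelization, and parallelized post-processing. Estimated results: a 15s single-material video drops from ~280s to ~190s (32%), and a 30s two-material video from ~400s to ~240s (40%).
+
+**Server**: 2x RTX 3090 (24GB), 2x Xeon E5-2680 v4 (56 logical cores), 192GB RAM
+
+### Stage 1: FFmpeg encoding
+
+**Final compose preset `slow` → `medium`**
+- compose time drops from ~50s to ~25s with almost no change in quality
+
+**Intermediate files CRF 18 → 23**
+- intermediates (trim, prepare_segment, concat, loop, normalize_orientation) are not the final output and do not need high-quality encoding
+- each intermediate step is 3-8 seconds faster
+
+**Final compose CRF 18 → 20**
+- for a 15-second talking-head video, CRF 18 and 20 are visually indistinguishable
+
+### Stage 2: LatentSync inference tuning
+
+**inference_steps 20 → 16**
+- inference time falls linearly by 20% (~180s → ~144s)
+
+**guidance_scale 2.0 → 1.5**
+- lower classifier-free guidance weight, with a small estimated reduction in per-step compute (5-10%)
+
+> ⚠️ Both changes require restarting the LatentSync service and checking lip-sync quality before being kept. If quality is not acceptable, revert the .env parameters.
+
+### Stage 3: multi-material pipeline parallelization
+
+**Parallel material download + normalization**
+- the serial `for` loop becomes `asyncio.gather()`, with `normalize_orientation` run in a thread pool via `run_in_executor`
+- N materials go from a serial N×5s to ~5s
+
+**Parallel segment preparation**
+- per-segment `prepare_segment` calls become `asyncio.gather()` + `run_in_executor`
+- 2 materials: ~90s → ~50s; 4 materials: ~180s → ~60s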
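The `asyncio.gather()` + `run_in_executor` pattern used in Stage 3 looks roughly like this. The `prepare_segment` below is a stand-in that just sleeps; the real function runs FFmpeg and takes more parameters, so treat the names and signatures as placeholders.

```python
import asyncio
import time
from typing import List


def prepare_segment(path: str) -> str:
    """Stand-in for a blocking FFmpeg call (hypothetical body)."""
    time.sleep(0.1)
    return path.replace(".mp4", "_prepared.mp4")


async def prepare_all(paths: List[str]) -> List[str]:
    """Run the blocking prepare step for every path concurrently in the default thread pool."""
    loop = asyncio.get_running_loop()
    tasks = [loop.run_in_executor(None, prepare_segment, path) for path in paths]
    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(prepare_all(["a.mp4", "b.mp4"])))
```

Threads are sufficient here because the heavy work happens in an external FFmpeg process, so the Python threads spend their time waiting rather than competing for the interpreter.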
+
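+### Stage 4: pipeline overlap
+
+**Whisper subtitle alignment in parallel with BGM mixing**
+- the two do not depend on each other (both depend only on audio_path), so they run together under `asyncio.gather()`
+- in single-material mode, Whisper moves from a serial step after LatentSync to running alongside BGM
+- behavior is unchanged when BGM or subtitles are off; the parallel path applies only when both are enabled
+
+### Modified files
+
+| File | Change |
+|------|------|
+| `backend/app/services/video_service.py` | compose: preset slow→medium, CRF 18→20; normalize_orientation/prepare_segment/concat: CRF 18→23 |
+| `backend/app/services/lipsync_service.py` | _loop_video_to_duration: CRF 18→23 |
+| `backend/.env` | LATENTSYNC_INFERENCE_STEPS=16, LATENTSYNC_GUIDANCE_SCALE=1.5 |
+| `backend/app/modules/videos/workflow.py` | import asyncio; parallel material download/normalization; parallel segment preparation; Whisper + BGM in parallel |
+
+### Rollback
+
+- FFmpeg parameters: if picture quality is unsatisfactory, set the final CRF back to 18 and the preset back to slow
+- LatentSync: if lip-sync quality drops, set `LATENTSYNC_INFERENCE_STEPS` back to 20 and `LATENTSYNC_GUIDANCE_SCALE` back to 2.0 in .env
+- Parallelization: a purely structural optimization with no quality impact; no rollback needed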
+
+---
+
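+## MuseTalk + LatentSync Hybrid Lip Sync
+
+### Overview
+
+LatentSync 1.6 produces high quality but is extremely slow at inference (~78% of total time); for long videos (>=2min) a run of 20-60 minutes is unacceptable. MuseTalk 1.5 uses single-step latent-space inpainting (not a diffusion model), with per-frame inference close to real time (30fps+ on V100), which suits long videos. The hybrid approach routes automatically by audio duration: LatentSync for short videos to preserve quality, MuseTalk for long ones to preserve speed.
+
+### Architecture
+
+- **Routing threshold**: `LIPSYNC_DURATION_THRESHOLD` (default 120s)
+- **Short videos (<120s)**: LatentSync 1.6 (GPU1, port 8007)
+- **Long videos (>=120s)**: MuseTalk 1.5 (GPU0, port 8011)
+- **Fallback**: when MuseTalk is unavailable, automatically fall back to LatentSync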
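The routing rule above reduces to one comparison plus the fallback. A minimal sketch, with a hypothetical function name and the availability check passed in as a flag (the real service performs a health check against the MuseTalk server):

```python
LIPSYNC_DURATION_THRESHOLD = 120.0  # seconds, matching the documented default


def choose_lipsync_engine(audio_seconds: float,
                          musetalk_available: bool,
                          threshold: float = LIPSYNC_DURATION_THRESHOLD) -> str:
    """Pick the lip-sync backend for a given audio duration."""
    if audio_seconds >= threshold and musetalk_available:
        return "musetalk"
    # Short audio, or MuseTalk is down: use LatentSync.
    return "latentsync"
```

Audio of exactly 120 seconds goes to MuseTalk, consistent with the `>=120s` boundary in the list.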
+
+### 改动文件
+
+| 文件 | 改动 |
+|------|------|
+| `models/MuseTalk/` | 从 Temp/MuseTalk 复制代码 + 下载权重 |
+| `models/MuseTalk/scripts/server.py` | 新建 FastAPI 常驻服务 (端口 8011, GPU0) |
+| `backend/app/core/config.py` | 新增 MUSETALK_* 和 LIPSYNC_DURATION_THRESHOLD |
+| `backend/.env` | 新增对应环境变量 |
+| `backend/app/services/lipsync_service.py` | 新增 `_call_musetalk_server()` + 混合路由逻辑 + 扩展 `check_health()` |
+
+---
+
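+## MuseTalk Inference Performance Optimization (server.py v2)
+
+### Overview
+
+The first long-video MuseTalk test (136s, 3404 frames) took 1799s (~30 minutes). Profiling showed the bottlenecks were face detection (28%), BiSeNet compositing (22%), and I/O (17%), rather than UNet inference itself (17%). Six optimizations are estimated to bring this down to 8-10 minutes (~3x faster).
+
+### Bottleneck breakdown (before optimization, 1799s)
+
+| Stage | Time | Share | Cause |
+|------|------|------|---------|
+| DWPose + face detection | ~510s | 28% | `batch_size_fa=1`, two neural networks per frame, fully serial |
+| Compositing + BiSeNet face parsing | ~400s | 22% | BiSeNet plus a PNG write on every frame |
+| UNet inference | ~300s | 17% | batch_size=8 is too small |
+| I/O (PNG read/write + FFmpeg) | ~300s | 17% | slow PNG compression, ffmpeg→PNG→imread chain |
+| VAE encoding | ~100s | 6% | per-frame encoding, not batched |
+
+### Six optimizations
+
+| # | Optimization | Details |
+|---|--------|------|
+| 1 | **batch_size 8→32** | changed in `.env`; the RTX 3090 has ample VRAM |
+| 2 | **Read frames directly with cv2.VideoCapture** | skips the ffmpeg→PNG→imread chain, saving 3404 PNG encode/decode round trips |
+| 3 | **Subsampled face detection (every 5 frames)** | run DWPose + FaceAlignment every 5 frames and linearly interpolate the bbox for frames in between |
+| 4 | **BiSeNet mask cache (every 5 frames)** | run `get_image_prepare_material` every 5 frames; frames in between reuse the cached mask via `get_image_blending` |
+| 5 | **Write directly with cv2.VideoWriter** | skips per-frame PNG writes and the ffmpeg re-encode by writing mp4 through VideoWriter |
+| 6 | **Per-stage timing** | precise timing for 7 stages, to guide further tuning |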
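Optimization 3 (detect every 5 frames, interpolate in between) can be sketched independently of the models. In the sketch below, `detect` is a placeholder callable standing in for the DWPose + FaceAlignment step, and always detecting on the final frame so every gap has two endpoints is an assumption of this illustration rather than something the log states.

```python
from typing import Callable, List, Tuple

BBox = Tuple[float, float, float, float]


def detect_subsampled(num_frames: int,
                      detect: Callable[[int], BBox],
                      stride: int = 5) -> List[BBox]:
    """Run `detect` on every `stride`-th frame and linearly interpolate the rest."""
    if num_frames <= 0:
        return []
    # Key frames: 0, stride, 2*stride, ... plus the final frame.
    keyframes = sorted(set(range(0, num_frames, stride)) | {num_frames - 1})
    detected = {k: detect(k) for k in keyframes}
    boxes: List[BBox] = []
    for left, right in zip(keyframes, keyframes[1:]):
        for frame in range(left, right):
            t = (frame - left) / (right - left)
            boxes.append(tuple(
                a + t * (b - a) for a, b in zip(detected[left], detected[right])
            ))
    boxes.append(detected[keyframes[-1]])
    return boxes
```

For the 3404-frame test video this would cut detector calls from 3404 to roughly 680, at the cost of assuming the face moves approximately linearly within each 5-frame window.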
+
+### 修改文件
+
+| 文件 | 改动 |
+|------|------|
+| `models/MuseTalk/scripts/server.py` | 完全重写 `_run_inference()`, 新增 `_detect_faces_subsampled()` |
+| `backend/.env` | `MUSETALK_BATCH_SIZE` 8→32 |
+
+---
+
+## Remotion 并发渲染优化
+
+### 概述
+
+Remotion 渲染在 56 核服务器上默认只用 8 并发 (`min(8, cores/2)`)，改为 16 并发，预估从 ~5 分钟降到 ~2-3 分钟。
+
+### 改动
+
+- `remotion/render.ts`: `renderMedia()` 新增 `concurrency` 参数 (默认 16), 支持 `--concurrency` CLI 参数覆盖
+- `remotion/dist/render.js`: 重新编译
+
+### 修改文件
+
+| 文件 | 改动 |
+|------|------|
+| `remotion/render.ts` | `RenderOptions` 新增 `concurrency` 字段, `renderMedia()` 传入 `concurrency` |
+| `remotion/dist/render.js` | TypeScript 重新编译 |
--- a/Docs/DevLogs/Day28.md
+++ b/Docs/DevLogs/Day28.md
@@ -0,0 +1,203 @@
+## CosyVoice FP16 加速 + 文档更新 + AI改写界面重构 + 标题字幕面板重排与视频帧预览 (Day 28)
+
+### 概述
+
+CosyVoice 3.0 声音克隆服务开启 FP16 半精度推理，预估提速 30-40%。同步更新 4 个项目文档。重构 AI 改写文案界面（RewriteModal 两步流程 + ScriptExtractionModal 逻辑抽取）。前端将"标题与字幕"面板从第二步移至第四步（素材编辑之后），样式预览窗口背景从紫粉渐变改为视频片头帧截图，实现所见即所得。
+
+---
+
+## ✅ 改动内容
+
+### 1. CosyVoice FP16 半精度加速
+
+- **问题**: CosyVoice 3.0 以 FP32 全精度运行，RTF (Real-Time Factor) 约 0.9-1.35x，生成 2 分钟音频需要约 2 分钟
+- **根因**: `AutoModel()` 初始化时未传入 `fp16=True`，LLM 推理和 Flow Matching (DiT) 均在 FP32 下运行
+- **修复**: 一行改动开启 FP16 自动混合精度
+
+```python
+# 旧: _model = AutoModel(model_dir=str(MODEL_DIR))
+# 新:
+_model = AutoModel(model_dir=str(MODEL_DIR), fp16=True)
+```
+
+- **生效机制**: `CosyVoice3Model` 在 `llm_job()` 和 `token2wav()` 中通过 `torch.cuda.amp.autocast(self.fp16)` 自动将计算转为 FP16
+- **预期效果**:
+  - 推理速度提升 30-40%
+  - 显存占用降低 ~30%
+  - 语音质量基本无损（0.5B 模型 FP16 精度充足）
+- **验证**: 服务重启后自检通过，健康检查 `ready: true`
+
+### 2. 文档全面更新 (4 个文件)
+
+补充 Day 27 新增的 MuseTalk 混合唇形同步方案、性能优化、Remotion 并发渲染等内容到所有相关文档。
+
+#### README.md
+- 项目描述更新为 "LatentSync 1.6 + MuseTalk 1.5 混合唇形同步"
+- 唇形同步功能描述改为混合方案（短视频 LatentSync，长视频 MuseTalk）
+- 技术栈表新增 MuseTalk 1.5
+- 项目结构新增 `models/MuseTalk/`
+- 服务架构表新增 MuseTalk (端口 8011)
+- 文档中心新增 MuseTalk 部署指南链接
+- 性能优化描述新增降频检测 + Remotion 16 并发
+
+#### DEPLOY_MANUAL.md
+- GPU 分配说明更新 (GPU0=MuseTalk+CosyVoice, GPU1=LatentSync)
+- 步骤 3 拆分为 3a (LatentSync) + 3b (MuseTalk)
+- 环境变量表新增 7 个 MuseTalk 变量，移除过时的 `DOUYIN_COOKIE`
+- LatentSync 推理步数默认值 20→16
+- 测试运行新增 MuseTalk 启动终端
+- PM2 管理新增 MuseTalk 服务（第 5 项）
+- 端口检查、日志查看命令新增 8011/vigent2-musetalk
+
+#### SUBTITLE_DEPLOY.md
+- 技术架构图更新为 LatentSync/MuseTalk 混合路由
+- 新增唇形同步路由说明
+- Remotion 配置表新增 `concurrency` 参数 (默认 16)
+- GPU 分配说明更新
+- 更新日志新增 v1.3.0 条目
+
+#### BACKEND_README.md
+- 健康检查接口描述更新为含 LatentSync + MuseTalk + 混合路由阈值
+- 环境变量配置新增 MuseTalk 相关变量
+- 服务集成指南新增"唇形同步混合路由"章节
+
+---
+
+### 3. AI 改写文案界面重构
+
+#### RewriteModal 重构
+
+将 AI 改写弹窗改为两步式流程，提升交互体验：
+
+**第一步 — 配置与触发**：
+- 自定义提示词输入（可选），自动持久化到 localStorage
+- "开始改写"按钮触发 `/api/ai/rewrite` 请求
+
+**第二步 — 结果对比与选择**：
+- 上方：AI 改写结果 + "使用此结果"按钮（紫粉渐变色，醒目）
+- 下方：原文对比 + "保留原文"按钮（灰色低调）
+- 底部：可"重新改写"（重回第一步，保留自定义提示词）
+- ESC 快捷键关闭
+
+#### ScriptExtractionModal 逻辑抽取
+
+将文案提取模态框的全部业务逻辑抽取到独立 hook `useScriptExtraction`：
+
+- **useScriptExtraction.ts** (新建): 管理 URL/文件双模式输入、拖拽上传、提取请求、步骤状态机 (config → processing → result)、剪贴板复制
+- **ScriptExtractionModal.tsx**: 纯展示组件，消费 hook 返回值，新增 ESC/Enter 快捷键
+
+#### ScriptEditor 工具栏调整
+
+- 按钮组右对齐 (`justify-end`)，统一高度 `h-7` 和圆角
+- "历史文案"按钮用灰色 (bg-gray-600) 区分辅助功能
+- "文案提取助手"用紫色 (bg-purple-600) 表示主功能
+- "AI多语言"用绿渐变 (emerald-teal)，"AI生成标题标签"用蓝渐变 (blue-cyan)
+- "AI智能改写"和"保存文案"移至文本框下方状态栏
+
+---
+
+### 4. 标题字幕面板重排 + 视频帧背景预览
+
+#### 面板顺序重排
+
+将 `<TitleSubtitlePanel>` 从第二步移至第四步（素材编辑之后），使用户在设置标题字幕样式时已经完成了素材选择和时间轴编排。
+
+新顺序：
+```
+一、文案提取与编辑（不变）
+二、配音（原三）
+三、素材编辑（原四）
+四、标题与字幕（原二）→ 移到素材编辑之后
+```
+
+#### 新建 useVideoFrameCapture hook
+
+从视频 URL 截取 0.1s 处帧画面，返回 JPEG data URL：
+
+- 创建 `<video>` 元素，设置 `crossOrigin="anonymous"`（素材存储在 Supabase Storage 跨域地址）
+- 先绑定 `loadedmetadata` / `canplay` / `seeked` / `error` 事件监听，再设 src（避免事件丢失）
+- `loadedmetadata` 或 `canplay` 触发后 seek 到 0.1s，`seeked` 回调中用 canvas `drawImage` 截帧
+- canvas 缩放到 480px 宽再编码（预览窗口最大 280px，节省内存）
+- `canvas.toDataURL("image/jpeg", 0.7)` 导出
+- 防御 `videoWidth/videoHeight` 为 0 的边界情况
+- try-catch 防 canvas taint，失败返回 null（降级渐变）
+- `isActive` 标志 + `seeked` 去重标志防止 stale 和重复更新
+- 截图完成后清理 video 元素释放内存
+
+#### 按需截取（性能优化）
+
+只在样式预览窗口打开时才触发截取：
+
+```typescript
+const materialPosterUrl = useVideoFrameCapture(
+  showStylePreview ? firstTimelineMaterialUrl : null
+);
+```
+
+截取源优先使用**时间轴第一段素材**（用户拖拽排序后的真实片头），回退到 `selectedMaterials[0]`（未生成配音、时间轴为空时）。
+
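+#### Preview background replacement
+
+When a video frame is available, `FloatingStylePreview` shows the original frame directly (no translucent overlay, so colors stay true), and the text relies on its stroke for readability; without a frame it falls back to the original purple-pink gradient background.
+
+#### Pitfalls
+
+1. **CORS tainted canvas**: material files live in Supabase Storage (`api.hbyrkj.top`) behind cross-origin signed URLs. `video.crossOrigin = "anonymous"` must be set, otherwise canvas `toDataURL` is blocked with a SecurityError
+2. **Empty timeline**: `useTimelineEditor` returns an empty array when `audioDuration <= 0` (no voiceover selected), so the source must fall back to `selectedMaterials[0]`
+3. **Listener order**: listeners must be bound before `video.src` is set, otherwise events can be missed when the video loads quickly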
+
+---
+
+## 📁 修改文件清单
+
+| 文件 | 改动 |
+|------|------|
+| `models/CosyVoice/cosyvoice_server.py` | `AutoModel()` 新增 `fp16=True` 参数 |
+| `README.md` | 混合唇形同步描述、技术栈、服务架构、项目结构更新 |
+| `Docs/DEPLOY_MANUAL.md` | MuseTalk 部署步骤、环境变量、PM2 管理、端口检查 |
+| `Docs/SUBTITLE_DEPLOY.md` | 架构图、Remotion concurrency、GPU 分配、更新日志 |
+| `Docs/BACKEND_README.md` | 健康检查、环境变量、混合路由章节 |
+| `frontend/.../RewriteModal.tsx` | 两步式改写流程（自定义提示词 → 结果对比） |
+| `frontend/.../script-extraction/useScriptExtraction.ts` | **新建** — 文案提取逻辑 hook |
+| `frontend/.../ScriptExtractionModal.tsx` | 纯展示组件，消费 hook，新增快捷键 |
+| `frontend/.../ScriptEditor.tsx` | 工具栏右对齐 + 按钮分色 + 改写/保存移至底部 |
+| `frontend/.../useVideoFrameCapture.ts` | **新建** — 视频帧截取 hook，crossOrigin + canvas 缩放 |
+| `frontend/.../useHomeController.ts` | 新增 useMemo 计算素材 URL，调用帧截取 hook，showStylePreview 门控 |
+| `frontend/.../HomePage.tsx` | 面板重排（二↔四互换），编号更新，透传 materialPosterUrl |
+| `frontend/.../TitleSubtitlePanel.tsx` | 编号"二"→"四"，新增 previewBackgroundUrl prop |
+| `frontend/.../FloatingStylePreview.tsx` | 新增 previewBackgroundUrl prop，条件渲染视频帧/渐变背景 |
+
+---
+
+## 🔍 验证
+
+- CosyVoice 重启成功，健康检查 `{"ready": true}`
+- 自检推理通过（7.2s for "你好"）
+- FP16 通过 `torch.cuda.amp.autocast(self.fp16)` 在 LLM 和 Flow Matching 阶段生效
+- `npx tsc --noEmit` — 零错误
+- AI 改写：自定义提示词持久化 → 改写结果 + 原文对比 → "使用此结果"/"保留原文"
+- 文案提取：URL / 文件双模式 → 处理中动画 → 结果填入
+- 面板顺序：一→文案、二→配音、三→素材编辑、四→标题与字幕
+- 样式预览背景：有素材时显示真实视频片头帧，无素材降级紫粉渐变
+- 预览关闭时不触发截取，不浪费资源
+
+---
+
+## 💡 CosyVoice performance notes
+
+### Current baseline (FP32, before optimization)
+
+| Text length | Audio duration | Inference time | RTF |
+|-------------|----------------|----------------|-----|
+| 42 chars | 9.8s | 13.2s | 1.35x |
+| 89 chars | 18.2s | 20.3s | 1.12x |
+| ~530 chars | 115.8s | 107.7s | 0.93x |
+| ~670 chars | 143.5s | 131.6s | 0.92x |
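
The RTF column is consistent with inference time divided by audio duration, so values below 1 mean faster than real time. A minimal sketch that recomputes it from the rows above (the function name is hypothetical):

```typescript
// RTF = inference time / audio duration; below 1.0 is faster than real time.
function realTimeFactor(inferenceSeconds: number, audioSeconds: number): number {
  if (!(audioSeconds > 0)) {
    throw new RangeError("audioSeconds must be positive");
  }
  return inferenceSeconds / audioSeconds;
}

// Rows copied from the baseline table: [audio duration, inference time].
const baselineRows: Array<[number, number]> = [
  [9.8, 13.2],
  [18.2, 20.3],
  [115.8, 107.7],
  [143.5, 131.6],
];

const rtfColumn = baselineRows.map(
  ([audio, inference]) => `${realTimeFactor(inference, audio).toFixed(2)}x`,
);
```

The short inputs land above 1.0 and the long ones below it, which suggests a fixed per-request overhead that is amortized over longer texts.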
+
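+### Possible future optimizations (diminishing returns, not planned for now)
+
+| Optimization | Expected gain | Complexity |
+|--------------|---------------|------------|
+| TensorRT (DiT module) | +20-30% | Requires compiling a .plan engine |
+| torch.compile() | +10-20% | One line of code, but the first compilation is slow |
+| vLLM (LLM module) | +10-15% | Extra dependency |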
--- a/Docs/Doc_Rules.md
+++ b/Docs/Doc_Rules.md
@@ -30,7 +30,7 @@
 | ⚡ **Med** | `Docs/BACKEND_README.md` | **(后端文档)** 接口说明、架构设计 |
 | ⚡ **Med** | `Docs/FRONTEND_DEV.md` | **(前端规范)** API封装、日期格式化、新页面规范 |
 | ⚡ **Med** | `Docs/FRONTEND_README.md` | **(前端文档)** 功能说明、页面变更 |
-| 🧊 **Low** | `Docs/*_DEPLOY.md` | **(子系统部署)** LatentSync/Qwen3/字幕等独立部署文档 |
+| 🧊 **Low** | `Docs/*_DEPLOY.md` | **(子系统部署)** LatentSync/CosyVoice/字幕等独立部署文档 |

 ---

@@ -195,7 +195,8 @@ ViGent2/Docs/
 ├── DEPLOY_MANUAL.md              # 部署手册
 ├── SUPABASE_DEPLOY.md            # Supabase 部署文档
 ├── LATENTSYNC_DEPLOY.md          # LatentSync 部署文档
-├── QWEN3_TTS_DEPLOY.md           # 声音克隆部署文档
+├── COSYVOICE3_DEPLOY.md          # 声音克隆部署文档
+├── ALIPAY_DEPLOY.md              # 支付宝付费部署文档
 ├── SUBTITLE_DEPLOY.md            # 字幕系统部署文档
 └── DevLogs/
    ├── Day1.md                   # 开发日志
@@ -304,4 +305,4 @@ ViGent2/Docs/

 ---

-**最后更新**：2026-02-08
+**最后更新**：2026-02-11
--- a/Docs/FRONTEND_DEV.md
+++ b/Docs/FRONTEND_DEV.md
@@ -11,7 +11,8 @@ frontend/src/
 │   ├── publish/               # 发布管理页
 │   ├── admin/                 # 管理员页面
 │   ├── login/                # 登录
-│   └── register/              # 注册
+│   ├── register/              # 注册
+│   └── pay/                   # 付费开通会员
 ├── features/                  # 功能模块（按业务拆分）
 │   ├── home/
 │   │   ├── model/             # 业务逻辑 hooks
@@ -19,9 +20,12 @@ frontend/src/
 │   │   │   ├── useHomePersistence.ts   # 持久化管理
 │   │   │   ├── useBgm.ts
 │   │   │   ├── useGeneratedVideos.ts
+│   │   │   ├── useGeneratedAudios.ts
 │   │   │   ├── useMaterials.ts
 │   │   │   ├── useMediaPlayers.ts
 │   │   │   ├── useRefAudios.ts
+│   │   │   ├── useSavedScripts.ts
+│   │   │   ├── useTimelineEditor.ts
 │   │   │   └── useTitleSubtitleStyles.ts
 │   │   └── ui/                # UI 组件（纯 props + 回调）
 │   │       ├── HomePage.tsx
@@ -35,6 +39,9 @@ frontend/src/
 │   │       ├── FloatingStylePreview.tsx
 │   │       ├── VoiceSelector.tsx
 │   │       ├── RefAudioPanel.tsx
+│   │       ├── GeneratedAudiosPanel.tsx
+│   │       ├── TimelineEditor.tsx
+│   │       ├── ClipTrimmer.tsx
 │   │       ├── BgmPanel.tsx
 │   │       ├── GenerateActionBar.tsx
 │   │       ├── PreviewPanel.tsx
@@ -144,6 +151,33 @@ body {
 | `sm:` | ≥ 640px | 平板/桌面 |
 | `lg:` | ≥ 1024px | 大屏桌面 |

+### embedded 组件模式
+
+合并板块时，子组件通过 `embedded?: boolean` prop 控制是否渲染外层卡片容器和主标题。
+
+```tsx
+// embedded=false（独立使用）：渲染完整卡片
+<div className="bg-white/5 rounded-2xl p-6 border border-white/10">
+  <h2>标题</h2>
+  {content}
+</div>
+
+// embedded=true（嵌入父卡片）：只渲染内容
+{content}
+```
+
+- 子标题使用 `<h3 className="text-sm font-medium text-gray-400">`
+- 分隔线使用 `<div className="border-t border-white/10 my-4" />`
+- 移动端标题行避免 `whitespace-nowrap`，长描述文字可用 `hidden sm:inline` 在移动端隐藏
+
+### 按钮视觉层级
+
+| 层级 | 样式 | 用途 |
+|------|------|------|
+| 主操作 | `px-4 py-2 text-sm font-medium bg-gradient-to-r from-purple-600 to-pink-600 shadow-sm` | 生成配音、立即发布 |
+| 辅助操作 | `px-2 py-1 text-xs bg-white/10 rounded` | 刷新、上传、语速 |
+| 触屏可见 | `opacity-40 group-hover:opacity-100` | 列表行内操作（编辑/删除） |
+
 ---

 ## API 请求规范
@@ -250,6 +284,38 @@ import { formatDate } from '@/shared/lib/media';

 ## ⚡️ 体验优化规范

+### 刷新回顶部（统一体验）
+
+- 长页面（如首页/发布页）在首次挂载时统一回到顶部。
+- **必须**在页面级 `useEffect` 中设置 `history.scrollRestoration = "manual"` 禁用浏览器原生滚动恢复。
+- 调用 `window.scrollTo({ top: 0, left: 0, behavior: "auto" })` 并追加 200ms 延迟兜底（防止异步 effect 覆盖）。
+- **列表自动滚动必须使用时间门控**：页面加载后 1 秒内禁止所有列表自动滚动效果（`scrollEffectsEnabled` ref），防止持久化恢复 + 异步数据加载触发 `scrollIntoView` 导致页面跳动。
+- 推荐模式：
+
+```typescript
+// 页面级（HomePage / PublishPage）
+useEffect(() => {
+  if (typeof window === "undefined") return;
+  if ("scrollRestoration" in history) history.scrollRestoration = "manual";
+  window.scrollTo({ top: 0, left: 0, behavior: "auto" });
+  const timer = setTimeout(() => window.scrollTo({ top: 0, left: 0, behavior: "auto" }), 200);
+  return () => clearTimeout(timer);
+}, []);
+
+// Controller 级（列表滚动时间门控）
+const scrollEffectsEnabled = useRef(false);
+useEffect(() => {
+  const timer = setTimeout(() => { scrollEffectsEnabled.current = true; }, 1000);
+  return () => clearTimeout(timer);
+}, []);
+
+// 列表滚动 effect（BGM/素材/视频等）
+useEffect(() => {
+  if (!selectedId || !scrollEffectsEnabled.current) return;
+  target?.scrollIntoView({ block: "nearest", behavior: "smooth" });
+}, [selectedId, list]);
+```
+
 ### 路由预取

 - 首页进入发布管理时使用 `router.prefetch("/publish")`
@@ -299,8 +365,20 @@ import { formatDate } from '@/shared/lib/media';
 - **必须持久化**：
  - 标题样式 ID / 字幕样式 ID
  - 标题字号 / 字幕字号
+  - 标题显示模式（`short` / `persistent`）
  - 背景音乐选择 / 音量 / 开关状态
+  - 输出画面比例（`9:16` / `16:9`）
  - 素材选择 / 历史作品选择
+  - 选中配音 ID (`selectedAudioId`)
+  - 语速 (`speed`，声音克隆模式)
+  - 时间轴段信息 (`useTimelineEditor` 的 localStorage)
+
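+### Saved scripts (independent persistence)
+
+The `useSavedScripts` hook manages localStorage persistence of saved scripts on its own:
+- key: `vigent_{storageKey}_savedScripts`
+- Writes to localStorage only when the user manually saves or deletes; no auto-persistence effect is used
+- Fully independent of `useHomePersistence`; the two do not affect each other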
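
A minimal sketch of the key scheme and the write-on-save behaviour follows. The key format and the `userId || 'guest'` rule come from this document; the `StringStore` and `SavedScript` shapes and the function names are hypothetical.

```typescript
// Hypothetical minimal storage interface; window.localStorage satisfies it.
interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical record shape for one saved script.
interface SavedScript {
  id: string;
  text: string;
}

function savedScriptsKey(userId?: string | null): string {
  const storageKey = userId || "guest"; // per-user isolation
  return `vigent_${storageKey}_savedScripts`;
}

function loadSavedScripts(store: StringStore, key: string): SavedScript[] {
  try {
    const raw = store.getItem(key);
    const parsed: unknown = raw ? JSON.parse(raw) : [];
    return Array.isArray(parsed) ? (parsed as SavedScript[]) : [];
  } catch {
    return []; // corrupted JSON: start from an empty list
  }
}

// Called only from an explicit user action, never from an effect.
function saveScript(store: StringStore, key: string, script: SavedScript): SavedScript[] {
  const next = [script, ...loadSavedScripts(store, key)];
  store.setItem(key, JSON.stringify(next));
  return next;
}
```

Taking the store as a parameter keeps the functions testable outside the browser and avoids touching `localStorage` during SSR.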

 ### 实施规范
 - 使用 `storageKey = userId || 'guest'`，按用户隔离。
@@ -317,6 +395,7 @@ import { formatDate } from '@/shared/lib/media';
 - 片头标题与发布信息标题统一限制 15 字。
 - 中文输入法合成阶段不截断，合成结束后才校验长度。
 - 首页片头标题修改会同步写入 `vigent_${storageKey}_publish_title`。
+- 标题显示模式使用 `short` / `persistent` 两个固定值；默认 `short`（短暂显示 4 秒）。
 - 避免使用 `maxLength` 强制截断输入法合成态。
 - 推荐使用 `@/shared/hooks/useTitleInput` 统一处理输入逻辑。

@@ -346,9 +425,11 @@ import { formatDate } from '@/shared/lib/media';

 | 接口 | 方法 | 功能 |
 |------|------|------|
-| `/api/ref-audios` | POST | 上传参考音频 (multipart/form-data: file + ref_text) |
+| `/api/ref-audios` | POST | 上传参考音频 (multipart/form-data: file，ref_text 可选，后端自动 Whisper 转写) |
 | `/api/ref-audios` | GET | 列出用户的参考音频 |
+| `/api/ref-audios/{id}` | PUT | 重命名参考音频 |
 | `/api/ref-audios/{id}` | DELETE | 删除参考音频 (id 需 encodeURIComponent) |
+| `/api/ref-audios/{id}/retranscribe` | POST | 重新识别参考音频文字（Whisper 转写 + 超 10s 自动截取） |

 ### 视频生成 API 扩展

@@ -367,7 +448,8 @@ await api.post('/api/videos/generate', {
    text: '口播文案',
    tts_mode: 'voiceclone',
    ref_audio_id: 'user_id/timestamp_name.wav',
-    ref_text: '参考音频对应文字',
+    ref_text: '参考音频对应文字',  // 从参考音频 metadata 自动获取
+    speed: 1.0,  // 语速 (0.8-1.2)
 });
 ```

@@ -381,8 +463,14 @@ const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
 const mediaRecorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
 ```

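+### Automatic reference-audio processing
+
+- **Auto transcription**: on upload, the backend runs Whisper on the reference audio and stores the result as `ref_text`, so the user does not need to type it
+- **Auto trimming**: reference audio longer than 10 seconds is cut at a silence point to keep the first 10 seconds (CosyVoice recommends 3-10 seconds)
+- **Re-transcription**: older reference audio can be re-transcribed and trimmed through the retranscribe endpoint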
 ### UI 结构

 配音方式使用 Tab 切换：
 - **EdgeTTS 音色** - 预设音色 2x3 网格
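-- **Voice cloning** - reference audio list + in-browser recording + reference text input
+- **Voice cloning** - reference audio list + in-browser recording + speed dropdown (5 levels: slower / slightly slower / normal / slightly faster / faster)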
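
The five speed levels map to the values 0.8 through 1.2 listed in the Day 23 task log. A minimal sketch of the option table, plus a guard for values restored from persistence (the names `SPEED_LEVELS` and `normalizeSpeed` are hypothetical):

```typescript
// Values follow the five levels documented in this changeset; names are hypothetical.
const SPEED_LEVELS = [
  { label: "slower", value: 0.8 },
  { label: "slightly slower", value: 0.9 },
  { label: "normal", value: 1.0 },
  { label: "slightly faster", value: 1.1 },
  { label: "faster", value: 1.2 },
] as const;

const DEFAULT_SPEED = 1.0;

// Accepts whatever came back from storage and returns a supported speed.
function normalizeSpeed(input: unknown): number {
  const n = typeof input === "number" ? input : Number(input);
  return SPEED_LEVELS.some((level) => level.value === n) ? n : DEFAULT_SPEED;
}
```

Falling back to 1.0 for anything outside the table is an assumption; it keeps a stale or hand-edited stored value from reaching the TTS request.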
--- a/Docs/FRONTEND_README.md
+++ b/Docs/FRONTEND_README.md
@@ -5,19 +5,19 @@ ViGent2 的前端界面，采用 Next.js 16 + TailwindCSS 构建。
 ## ✨ 核心功能

 ### 1. 视频生成 (`/`)
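-- **Material management**: drag-and-drop upload of presenter videos with live preview.
-- **Material renaming**: rename materials directly in the list.
-- **Script voiceover**: EdgeTTS integration with multiple voices (Yunxi / Xiaoxiao).
-- **AI titles/tags**: one-click generation of video titles and tags (Day 14).
-- **Title/subtitle styles**: style selection + preview + font-size adjustment (Day 16).
-- **Background music**: audition + volume control + persisted selection (Day 16).
-- **Interaction polish**: persisted selections, in-list positioning, scroll to top on refresh (Day 16).
-- **Preview consistency**: title/subtitle previews scale with the material resolution, so they look closer to the final video (Day 17).
+- **1. Script extraction and editing**: script input / extraction / translation / saving.
+- **2. Voiceover**: voiceover mode (EdgeTTS / voice cloning) and the voiceover list (generate / audition / manage) merged into one panel.
+- **3. Material editing**: video materials (upload / select / manage) and timeline editing (waveform / color blocks / drag-and-drop reordering) merged into one panel.
+- **4. Titles and subtitles**: opening title / secondary title / subtitle style configuration; brief or persistent display; the style preview uses the video's opening frame as a real background (Day 28).
+- **5. Background music**: audition + volume control + persisted selection.
+- **6. Works** (right column): works list and works preview merged into one panel.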
 - **进度追踪**: 实时显示视频生成进度 (10% -> 100%)。
 - **作品预览**: 生成完成后直接播放下载（作品预览 + 历史作品）。
 - **预览优化**: 预览视频 `metadata` 预取，首帧加载更快。
 - **本地保存**: 文案/标题/偏好由 `useHomePersistence` 统一持久化，刷新后恢复 (Day 14/17)。
+- **历史文案**: 手动保存/加载/删除历史文案，独立 localStorage 持久化 (Day 23)。
 - **选择持久化**: 首页/发布页作品选择均使用稳定 `id` 持久化，刷新保持用户选择；新视频生成后自动选中最新 (Day 21)。
+- **AI 多语言翻译**: 支持 9 种目标语言翻译文案 + 还原原文 (Day 22)。

 ### 2. 全自动发布 (`/publish`) [Day 7 新增]
 - **多平台管理**: 统一管理抖音、微信视频号、B站、小红书账号状态。
@@ -33,30 +33,52 @@ ViGent2 的前端界面，采用 Next.js 16 + TailwindCSS 构建。

 ### 3. 声音克隆 [Day 13 新增]
 - **TTS 模式选择**: EdgeTTS (预设音色) / 声音克隆 (自定义音色) 切换。
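-- **Reference audio management**: upload / list / delete reference audio (3-20 second WAV).
-- **One-click cloning**: selecting a reference audio automatically calls the Qwen3-TTS service.
+- **Reference audio management**: upload / list / rename / delete reference audio; after upload, Whisper transcribes `ref_text` automatically and audio longer than 10s is trimmed.
+- **Re-transcription**: older reference audio can be re-transcribed and trimmed (RotateCw button).
+- **One-click cloning**: selecting a reference audio automatically calls the CosyVoice 3.0 service.
+- **Speed control**: voice cloning mode supports 5 speed levels (0.8-1.2), with the selection persisted (Day 23).
+- **Multilingual support**: EdgeTTS voice list for 10 languages; voice cloning passes `language` through (Day 22).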

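-### 4. Subtitles and titles [new in Day 13]
-- **Opening title**: optional input, limited to 15 characters; shown for 3 seconds at the start of the video with a fade in/out.
+### 4. Voiceover first + timeline arrangement [new in Day 23]
+- **Standalone voiceover generation**: generate a voiceover → select it → pick materials → generate the video.
+- **Voiceover management panel**: generate / audition / rename / delete / select, with asynchronous generation and progress polling.
+- **Timeline editor**: wavesurfer.js audio waveform plus color blocks that visualize material assignment; drag the dividers to adjust each segment's duration.
+- **Clip trimming**: ClipTrimmer dual-handle range slider + HTML5 video preview playback.
+- **Drag-and-drop reordering**: timeline color blocks support HTML5 Drag & Drop to swap material order.
+- **Custom assignment**: the backend `custom_assignments` supports user-defined material assignments (including the `source_start/source_end` trim range).
+- **Timeline semantic alignment**: when the timeline runs past the audio, only the visible segments are kept and the last one is trimmed to the audio end, so segments beyond it take no part in generation; when the timeline is shorter than the audio, the last visible segment loops automatically to fill the gap.
+- **Aspect ratio control**: the top of the timeline offers a `9:16 / 16:9` output ratio selector; the setting is persisted and passed through to the backend.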
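
The timeline alignment rule in the list above can be sketched as follows. This is one reading of the documented behaviour, not the project's implementation: the `Segment` shape, the `loop` flag and the name `alignToAudio` are hypothetical.

```typescript
// Hypothetical shapes for illustration.
interface Segment {
  materialId: string;
  duration: number; // seconds
}
interface AlignedSegment extends Segment {
  start: number; // seconds from the start of the audio
  loop: boolean; // true when the clip must loop to fill its slot
}

function alignToAudio(segments: Segment[], audioDuration: number): AlignedSegment[] {
  const aligned: AlignedSegment[] = [];
  let cursor = 0;
  for (const seg of segments) {
    if (cursor >= audioDuration) break; // starts past the audio: excluded
    // Trim the last visible segment so it ends with the audio.
    const duration = Math.min(seg.duration, audioDuration - cursor);
    aligned.push({ materialId: seg.materialId, start: cursor, duration, loop: false });
    cursor += duration;
  }
  // Timeline shorter than the audio: stretch the last segment and mark it looping.
  const last = aligned[aligned.length - 1];
  if (last && cursor < audioDuration) {
    last.duration += audioDuration - cursor;
    last.loop = true;
  }
  return aligned;
}
```

With three 5-second segments, an 8-second audio keeps two segments (5s and 3s), while a 20-second audio keeps all three and extends the last one to 10s with `loop` set.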
+
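+### 5. Subtitles and titles [new in Day 13]
+- **Opening title**: optional input, limited to 15 characters; supports "brief display / persistent display", defaulting to brief display (4 seconds), and the mode applies to both the title and the secondary title.
+- **Opening secondary title**: optional input, limited to 20 characters; shown below the main title for supplementary explanation or a teaser; independent style configuration (font / size / color / spacing); can be generated by AI together with the title; shares the title's display mode; shown only in the video frame and not used in the publish title (Day 25).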
 - **标题同步**: 首页片头标题修改会同步到发布信息标题。
 - **逐字高亮字幕**: 卡拉OK效果，默认开启，可关闭。
 - **自动对齐**: 基于 faster-whisper 生成字级别时间戳。
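-- **Style presets**: title/subtitle style selection + preview + font-size adjustment (Day 16).
+- **Style presets**: title / subtitle / secondary-title style selection + preview + font-size adjustment (Day 16/25).
 - **Default styles**: title 90px 站酷快乐体; subtitle 60px classic yellow text + DingTalkJinBuTi (Day 17).
-- **Style persistence**: title/subtitle styles and font sizes survive a refresh (Day 17).
+- **Style persistence**: title / subtitle / secondary-title styles and font sizes survive a refresh (Day 17/25).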

-### 5. 背景音乐 [Day 16 新增]
+### 6. 背景音乐 [Day 16 新增]
 - **试听预览**: 点击试听即选中，音量滑块实时生效。
 - **混音控制**: 仅影响 BGM，配音保持原音量。

-### 6. 账户设置 [Day 15 新增]
+### 7. 账户设置 [Day 15 新增]
 - **手机号登录**: 11位中国手机号验证登录。
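-- **Account dropdown**: shows expiry date + change password + secure sign-out.
+- **Account dropdown**: shows the phone number (middle four digits masked) + expiry date + change password + secure sign-out.
 - **Change password**: a dialog takes the current and new passwords; a change forces re-login.
+- **Login takes effect immediately**: after a successful login, AuthContext writes the user data right away, so the phone number shows without a refresh.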
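
Masking the middle four digits of an 11-digit mainland phone number can be sketched as below. The function name is hypothetical, and returning any other input unchanged is an assumption.

```typescript
// Hypothetical helper: masks digits 4-7 of an 11-digit phone number.
function maskPhone(phone: string): string {
  if (!/^\d{11}$/.test(phone)) return phone; // assumption: leave other input as-is
  return `${phone.slice(0, 3)}****${phone.slice(7)}`;
}
```

For example, `maskPhone("13812345678")` keeps the first three and last four digits and replaces the rest with asterisks.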

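-### 7. Script extraction assistant (`ScriptExtractionModal`) [new in Day 15]
+### 8. Paid membership (`/pay`)
+- **Alipay desktop website payment**: redirects to Alipay's official checkout, which supports QR code, account login, balance and other payment methods.
+- **Automatic activation**: after a successful payment, the asynchronous callback activates the membership (valid for 1 year), and the frontend polls for the payment result.
+- **Renewal on expiry**: once a membership expires, login redirects to the payment page for renewal; the flow is the same as the first purchase.
+- **Admin activation**: manual activation by an administrator remains available; the two paths do not affect each other.
+
+### 9. Script extraction assistant (`ScriptExtractionModal`) [new in Day 15]
 - **Multi-source extraction**: supports drag-and-drop file upload and pasted URLs (Bilibili / Douyin / TikTok).
-- **AI content spinning**: integrates GLM-4.7-Flash to rewrite text into a voiceover script automatically.
+- **AI smart rewrite**: integrates GLM-4.7-Flash to rewrite text into a voiceover script automatically.
+- **Custom prompt**: the rewrite prompt can be customized, and leaving it empty uses the default; the setting is persisted to localStorage (Day 25).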
 - **一键填入**: 提取结果直接填充至视频生成输入框。
 - **智能交互**: 实时进度展示，防误触设计。

@@ -66,6 +88,7 @@ ViGent2 的前端界面，采用 Next.js 16 + TailwindCSS 构建。
 - **样式**: TailwindCSS
 - **图标**: Lucide React
 - **组件**: 自定义现代化组件 (Glassmorphism 风格)
+- **音频波形**: wavesurfer.js (时间轴编辑器)
 - **API**: Axios 实例 `@/shared/api/axios` (对接后端 FastAPI :8006)

 ## 🚀 开发指南
@@ -93,6 +116,8 @@ src/
 │   ├── page.tsx           # 视频生成主页
 │   ├── publish/           # 发布管理页
 │   │   └── page.tsx
+│   ├── pay/               # 付费开通会员页
+│   │   └── page.tsx
 │   └── layout.tsx         # 全局布局 (导航栏)
 ├── features/
 │   ├── home/
@@ -117,5 +142,8 @@ src/
 ## 🎨 设计规范

 - **主色调**: 深紫/黑色系 (Dark Mode)
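-- **Interaction**: hover micro-animations (Hover Effects)
-- **Responsive**: optimized for large desktop screens
+- **Interaction**: hover micro-animations (Hover Effects); action buttons are visible at reduced opacity by default (opacity-40) and fully opaque on hover, which also works on touch devices
+- **Responsive**: works on desktop and mobile; platform cards on the publish page use a responsive layout (compact on mobile, roomy on desktop)
+- **Scrolling**: list scrollbars are hidden everywhere (hide-scrollbar); the page returns to the top after a refresh (browser scroll restoration disabled + time-gated list scrolling)
+- **Style preview**: floating preview window, 280px at the top-left on desktop and 160px at the bottom-right on mobile (so it does not cover controls)
+- **Input aids**: live character counters on the title / secondary-title inputs that turn red past the limit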
--- a/Docs/MUSETALK_DEPLOY.md
+++ b/Docs/MUSETALK_DEPLOY.md
@@ -0,0 +1,252 @@
+# MuseTalk 部署指南
+
+> **更新时间**：2026-02-27
+> **适用版本**：MuseTalk v1.5 (常驻服务模式)
+> **架构**：FastAPI 常驻服务 + PM2 进程管理
+
+---
+
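+## Architecture overview
+
+MuseTalk is the long-video engine of the **hybrid lip-sync setup**:
+
+- **Short videos (<120s)** → LatentSync 1.6 (GPU1, port 8007)
+- **Long videos (>=120s)** → MuseTalk 1.5 (GPU0, port 8011)
+- The routing threshold is controlled by `LIPSYNC_DURATION_THRESHOLD`
+- If MuseTalk is unavailable, requests fall back to LatentSync automatically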
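
The routing rule above reduces to a small decision function. The real routing lives in the Python backend (`lipsync_service.py`); the TypeScript below is only an illustration, and the names `chooseLipsyncEngine` and `museTalkAvailable` are hypothetical.

```typescript
type LipsyncEngine = "latentsync" | "musetalk";

// Illustration of the documented rule; not the backend's actual code.
function chooseLipsyncEngine(
  audioSeconds: number,
  thresholdSeconds = 120, // mirrors LIPSYNC_DURATION_THRESHOLD
  museTalkAvailable = true, // e.g. the result of a health check
): LipsyncEngine {
  if (audioSeconds >= thresholdSeconds && museTalkAvailable) {
    return "musetalk";
  }
  // Short audio, or MuseTalk is down: use LatentSync.
  return "latentsync";
}
```

The comparison is `>=`, so audio of exactly 120 seconds is routed to MuseTalk, matching the "Long videos (>=120s)" line.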
+
+---
+
+## 硬件要求
+
+| 配置 | 最低要求 | 推荐配置 |
+|------|----------|----------|
+| GPU | 8GB VRAM (RTX 3060) | 24GB VRAM (RTX 3090) |
+| 内存 | 32GB | 64GB |
+| CUDA | 11.7+ | 11.8 |
+
+> MuseTalk fp16 推理约需 4-8GB 显存，可与 CosyVoice 共享 GPU0。
+
+---
+
+## 安装步骤
+
+### 1. Conda 环境
+
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/models/MuseTalk
+conda create -n musetalk python=3.10 -y
+conda activate musetalk
+```
+
+### 2. PyTorch 2.0.1 + CUDA 11.8
+
+> 必须使用此版本，mmcv 预编译包依赖。
+
+```bash
+pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
+```
+
+### 3. 依赖安装
+
+```bash
+pip install -r requirements.txt
+
+# MMLab 系列
+pip install --no-cache-dir -U openmim
+mim install mmengine
+mim install "mmcv==2.0.1"
+mim install "mmdet==3.1.0"
+pip install chumpy --no-build-isolation
+pip install "mmpose==1.1.0" --no-deps
+
+# FastAPI 服务依赖
+pip install fastapi uvicorn httpx
+```
+
+---
+
+## 模型权重
+
+### 目录结构
+
+```
+models/MuseTalk/models/
+├── musetalk/                   ← v1 基础模型
+│   ├── config.json -> musetalk.json    (软链接)
+│   ├── musetalk.json
+│   ├── musetalkV15 -> ../musetalkV15   (软链接, 关键!)
+│   └── pytorch_model.bin       (~3.2GB)
+├── musetalkV15/                ← v1.5 UNet 模型
+│   ├── musetalk.json
+│   └── unet.pth                (~3.2GB)
+├── sd-vae/                     ← Stable Diffusion VAE
+│   ├── config.json
+│   └── diffusion_pytorch_model.bin
+├── whisper/                    ← OpenAI Whisper Tiny
+│   ├── config.json
+│   ├── pytorch_model.bin       (~151MB)
+│   └── preprocessor_config.json
+├── dwpose/                     ← DWPose 人体姿态检测
+│   └── dw-ll_ucoco_384.pth     (~387MB)
+├── syncnet/                    ← SyncNet 唇形同步评估
+│   └── latentsync_syncnet.pt
+└── face-parse-bisent/          ← 人脸解析模型
+    ├── 79999_iter.pth          (~53MB)
+    └── resnet18-5c106cde.pth   (~45MB)
+```
+
+### 下载方式
+
+使用项目自带脚本：
+
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/models/MuseTalk
+conda activate musetalk
+bash download_weights.sh
+```
+
+或手动 Python API 下载：
+
+```bash
+conda activate musetalk
+export HF_ENDPOINT=https://hf-mirror.com
+python -c "
+from huggingface_hub import snapshot_download
+snapshot_download('TMElyralab/MuseTalk', local_dir='models',
+    allow_patterns=['musetalk/*', 'musetalkV15/*'])
+snapshot_download('stabilityai/sd-vae-ft-mse', local_dir='models/sd-vae',
+    allow_patterns=['config.json', 'diffusion_pytorch_model.bin'])
+snapshot_download('openai/whisper-tiny', local_dir='models/whisper',
+    allow_patterns=['config.json', 'pytorch_model.bin', 'preprocessor_config.json'])
+snapshot_download('yzd-v/DWPose', local_dir='models/dwpose',
+    allow_patterns=['dw-ll_ucoco_384.pth'])
+"
+```
+
+### 创建必要的软链接
+
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/models/MuseTalk/models/musetalk
+ln -sf musetalk.json config.json
+ln -sf ../musetalkV15 musetalkV15
+```
+
+> **关键**：`musetalk/musetalkV15` 软链接缺失会导致权重检测失败 (`weights: False`)。
+
+---
+
+## 服务启动
+
+### PM2 进程管理（推荐）
+
+```bash
+# 首次注册
+cd /home/rongye/ProgramFiles/ViGent2
+pm2 start run_musetalk.sh --name vigent2-musetalk
+pm2 save
+
+# 日常管理
+pm2 restart vigent2-musetalk
+pm2 logs vigent2-musetalk
+pm2 stop vigent2-musetalk
+```
+
+### 手动启动
+
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/models/MuseTalk
+/home/rongye/ProgramFiles/miniconda3/envs/musetalk/bin/python scripts/server.py
+```
+
+### 健康检查
+
+```bash
+curl http://localhost:8011/health
+# {"status":"ok","model_loaded":true}
+```
+
+---
+
+## 后端配置
+
+`backend/.env` 中的相关变量：
+
+```ini
+# MuseTalk 配置
+MUSETALK_GPU_ID=0                        # GPU 编号 (与 CosyVoice 共存)
+MUSETALK_API_URL=http://localhost:8011    # 常驻服务地址
+MUSETALK_BATCH_SIZE=32                   # 推理批大小
+MUSETALK_VERSION=v15                     # 模型版本
+MUSETALK_USE_FLOAT16=true                # 半精度加速
+
+# 混合唇形同步路由
+LIPSYNC_DURATION_THRESHOLD=120           # 秒, >=此值用 MuseTalk
+```
+
+---
+
+## 相关文件
+
+| 文件 | 说明 |
+|------|------|
+| `models/MuseTalk/scripts/server.py` | FastAPI 常驻服务 (端口 8011) |
+| `run_musetalk.sh` | PM2 启动脚本 |
+| `backend/app/services/lipsync_service.py` | 混合路由 + `_call_musetalk_server()` |
+| `backend/app/core/config.py` | `MUSETALK_*` 配置项 |
+
+---
+
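+## Performance optimization (server.py v2)
+
+The first long-video test (136s, 3404 frames) took 30 minutes. Profiling showed the bottlenecks were face detection (28%), BiSeNet compositing (22%) and I/O (17%), rather than UNet inference (17%).
+
+### Optimizations applied
+
+| Optimization | Notes |
+|--------------|-------|
+| `MUSETALK_BATCH_SIZE` 8→32 | The RTX 3090 has VRAM to spare; UNet inference speeds up ~3x |
+| Read frames directly with cv2.VideoCapture | Skips the ffmpeg→PNG→imread chain |
+| Face detection every 5 frames | DWPose + FaceAlignment run only on sampled frames; bboxes for the frames in between are linearly interpolated |
+| BiSeNet mask cache (every 5 frames) | `get_image_prepare_material` runs every 5 frames; the frames in between reuse it through `get_image_blending` |
+| Write directly with cv2.VideoWriter | Skips per-frame PNG writes + ffmpeg re-encoding |
+| Per-stage timing | Precise timing for 7 stages to make later tuning easier |
+
+### Tuning parameters
+
+Adjustable at the top of `models/MuseTalk/scripts/server.py`: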
+
+```python
+DETECT_EVERY = 5        # face detection interval (frames)
+BLEND_CACHE_EVERY = 5   # BiSeNet mask cache interval (frames)
+```
+
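+> For talking-head videos (the face barely moves), the interpolation error at a 5-frame interval is negligible.
+> For scenes with strong face movement, lower it to 2-3.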
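
The linear bbox interpolation behind `DETECT_EVERY` can be illustrated as follows. The real code is Python in `server.py`; this TypeScript sketch assumes detection runs on frames 0, `detectEvery`, 2×`detectEvery`, and so on, and that frames after the last detection hold the last box. All names are hypothetical.

```typescript
// Hypothetical illustration; not the server.py implementation.
type BBox = [number, number, number, number]; // x1, y1, x2, y2

function interpolateBBoxes(
  keyBoxes: BBox[], // keyBoxes[k] was detected on frame k * detectEvery
  detectEvery: number,
  frameCount: number,
): BBox[] {
  if (keyBoxes.length === 0 || detectEvery < 1) {
    throw new RangeError("need at least one key box and detectEvery >= 1");
  }
  const last = keyBoxes.length - 1;
  const boxes: BBox[] = [];
  for (let frame = 0; frame < frameCount; frame++) {
    const k = Math.min(Math.floor(frame / detectEvery), last);
    const a = keyBoxes[k];
    const b = keyBoxes[Math.min(k + 1, last)]; // equals a past the last key box
    const t = (frame - k * detectEvery) / detectEvery;
    boxes.push(a.map((v, i) => v + (b[i] - v) * t) as BBox);
  }
  return boxes;
}
```

A smaller `detectEvery` shortens the span each interpolation covers, which is why lowering it helps when the face moves quickly.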
+
+---
+
+## 常见问题
+
+### huggingface-hub 版本冲突
+
+```
+ImportError: huggingface-hub>=0.19.3,<1.0 is required
+```
+
+**解决**：降级 huggingface-hub
+
+```bash
+pip install "huggingface-hub>=0.19.3,<1.0"
+```
+
+### mmcv 导入失败
+
+```bash
+pip uninstall mmcv mmcv-full -y
+mim install "mmcv==2.0.1"
+```
+
+### 音视频长度不匹配
+
+已在 `musetalk/utils/audio_processor.py` 中修复（零填充逻辑），无需额外处理。
--- a/Docs/QWEN3_TTS_DEPLOY.md
+++ b/Docs/QWEN3_TTS_DEPLOY.md
@@ -298,12 +298,20 @@ Response: audio/wav 文件
 SoX could not be found!
 ```

-**解决**: 通过 conda 安装 sox：
+**解决**:
+
+1. 通过 conda 安装 sox：

 ```bash
 conda install -y -c conda-forge sox
 ```

+2. 确保启动脚本 `run_qwen_tts.sh` 中已 export conda env bin 到 PATH（PM2 启动时系统 PATH 不含 conda 环境目录）：
+
+```bash
+export PATH="/home/rongye/ProgramFiles/miniconda3/envs/qwen-tts/bin:$PATH"
+```
+
 ### CUDA 内存不足

 Qwen3-TTS 1.7B 通常需要 8-10GB VRAM。如果遇到 OOM：
@@ -371,6 +379,7 @@ FOR INSERT TO anon WITH CHECK (bucket_id = 'ref-audios');

 | 日期 | 版本 | 说明 |
 |------|------|------|
+| 2026-02-09 | 1.2.0 | 修复 SoX PATH 问题（run_qwen_tts.sh export conda bin），每次生成后 empty_cache() |
 | 2026-01-30 | 1.1.0 | 明确默认模型升级为 1.7B-Base，替换旧版 0.6B 路径 |

 ---
--- a/Docs/SUBTITLE_DEPLOY.md
+++ b/Docs/SUBTITLE_DEPLOY.md
@@ -15,11 +15,17 @@
 原有流程:
  文本 → EdgeTTS → 音频 → LatentSync → FFmpeg合成 → 最终视频

-新流程:
-  文本 → EdgeTTS → 音频 ─┬→ LatentSync → 唇形视频 ─┐
+新流程 (单素材):
+  文本 → EdgeTTS/CosyVoice/预生成配音 → 音频 ─┬→ LatentSync/MuseTalk → 唇形视频 ─┐
                                              └→ faster-whisper → 字幕JSON ─┴→ Remotion合成 → 最终视频
+
+新流程 (多素材):
+  音频 → 多素材按 custom_assignments 拼接 → LatentSync/MuseTalk (单次推理) → 唇形视频 ─┐
+  音频 → faster-whisper → 字幕JSON ─────────────────────────────────────────────┴→ Remotion合成 → 最终视频
 ```

+> **唇形同步路由**: 短视频 (<120s) 用 LatentSync 1.6 (GPU1)，长视频 (>=120s) 用 MuseTalk 1.5 (GPU0)，由 `LIPSYNC_DURATION_THRESHOLD` 控制。
+
 ## 系统要求

 | 组件 | 要求 |
@@ -140,7 +146,7 @@ remotion/
 | 阶段 | 进度 | 说明 |
 |------|------|------|
 | 下载素材 | 0% → 5% | 从 Supabase 下载输入视频 |
-| TTS 语音生成 | 5% → 25% | EdgeTTS 或 Qwen3-TTS 生成音频 |
+| TTS 语音生成 | 5% → 25% | EdgeTTS / Qwen3-TTS / 预生成配音下载 |
 | 唇形同步 | 25% → 80% | LatentSync 推理 |
 | 字幕对齐 | 80% → 85% | faster-whisper 生成字级别时间戳 |
 | Remotion 渲染 | 85% → 95% | 合成字幕和标题 |
@@ -181,7 +187,9 @@ Remotion 渲染参数在 `backend/app/services/remotion_service.py` 中配置：
 | 参数 | 默认值 | 说明 |
 |------|--------|------|
 | `fps` | 25 | 输出帧率 |
-| `title_duration` | 3.0 | 标题显示时长（秒） |
+| `concurrency` | 16 | Remotion 并发渲染进程数（默认 16，可通过 `--concurrency` CLI 参数覆盖） |
+| `title_display_mode` | `short` | 标题显示模式（`short`=短暂显示；`persistent`=常驻显示） |
+| `title_duration` | 4.0 | 标题显示时长（秒，仅 `short` 模式生效） |

 ---

@@ -268,7 +276,7 @@ wget https://github.com/googlefonts/noto-cjk/raw/main/Sans/OTF/SimplifiedChinese

 ### 使用 GPU 0

-faster-whisper 默认使用 GPU 0，与 LatentSync (GPU 1) 分开，避免显存冲突。如需指定 GPU：
+faster-whisper 默认使用 GPU 0，与 MuseTalk 共享 GPU 0；LatentSync 使用 GPU 1，互不冲突。如需指定 GPU：

 ```python
 # 在 whisper_service.py 中修改
@@ -282,4 +290,7 @@ WhisperService(device="cuda:0")  # 或 "cuda:1"
 | 日期 | 版本 | 说明 |
 |------|------|------|
 | 2026-01-29 | 1.0.0 | 初始版本，使用 faster-whisper + Remotion 实现逐字高亮字幕和片头标题 |
+| 2026-02-10 | 1.1.0 | 更新架构图：多素材 concat-then-infer、预生成配音选项 |
 | 2026-01-30 | 1.0.1 | 字幕高亮样式与标题动画优化，视觉表现更清晰 |
+| 2026-02-25 | 1.2.0 | 字幕时间戳从线性插值改为 Whisper 节奏映射，修复长视频字幕漂移 |
+| 2026-02-27 | 1.3.0 | 架构图更新 MuseTalk 混合路由；Remotion 并发渲染从 8 提升到 16；GPU 分配说明更新 |
--- a/Docs/task_complete.md
+++ b/Docs/task_complete.md
@@ -1,8 +1,8 @@
 # ViGent2 开发任务清单 (Task Log)

 **项目**: ViGent2 数字人口播视频生成系统
-**进度**: 100% (Day 21 - 缺陷修复与持久化回归治理)
-**更新时间**: 2026-02-08
+**进度**: 100% (Day 28 - CosyVoice FP16 加速 + 文档全面更新)
+**更新时间**: 2026-02-27

 ---

@@ -10,7 +10,118 @@

 > 这里记录了每一天的核心开发内容与 milestone。

-### Day 21: 缺陷修复 + 浮动预览 + 发布重构 + 架构优化 (Current)
+### Day 28: CosyVoice FP16 加速 + 文档全面更新 (Current)
+- [x] **CosyVoice FP16 半精度加速**: `AutoModel()` 开启 `fp16=True`，LLM 推理和 Flow Matching 自动混合精度运行，预估提速 30-40%、显存降低 ~30%。
+- [x] **文档全面更新**: README.md / DEPLOY_MANUAL.md / SUBTITLE_DEPLOY.md / BACKEND_README.md 补充 MuseTalk 混合唇形同步方案、性能优化、Remotion 并发渲染等内容。
+
+### Day 27: Remotion 描边修复 + 字体样式扩展 + 混合唇形同步 + 性能优化
+- [x] **描边渲染修复**: 标题/副标题/字幕从 `textShadow` 4 方向模拟改为 CSS 原生 `-webkit-text-stroke` + `paint-order: stroke fill`，修复描边过粗和副标题重影问题。
+- [x] **字体样式扩展**: 标题样式 4→12 个（+庞门正道/优设标题圆/阿里数黑体/文道潮黑/无界黑/厚底黑/寒蝉半圆体/欣意吉祥宋），字幕样式 4→8 个（+少女粉/清新绿/金色隶书/楷体红字）。
+- [x] **描边参数优化**: 所有预设 `stroke_size` 从 8 降至 4~5，配合原生描边视觉更干净。
+- [x] **TypeScript 类型修复**: Root.tsx `Composition` 泛型与 `calculateMetadata` 参数类型对齐；Video.tsx `VideoProps` 添加索引签名兼容 `Record<string, unknown>`；VideoLayer.tsx 移除 `OffthreadVideo` 不支持的 `loop` prop。
+- [x] **进度条文案还原**: 进度条从显示后端推送消息改回固定 `正在AI生成中...`。
+- [x] **MuseTalk 混合唇形同步**: 部署 MuseTalk 1.5 常驻服务 (GPU0, 端口 8011)，按音频时长自动路由 — 短视频 (<120s) 走 LatentSync，长视频 (>=120s) 走 MuseTalk，MuseTalk 不可用时自动回退。
+- [x] **MuseTalk 推理性能优化**: server.py v2 重写 — cv2 直读帧(跳过 ffmpeg→PNG)、人脸检测降频(每5帧)、BiSeNet mask 缓存(每5帧)、cv2.VideoWriter 直写(跳过 PNG 写盘)、batch_size 8→32，预估 30min→8-10min (~3x)。
+- [x] **Remotion 并发渲染优化**: render.ts 新增 concurrency 参数，从默认 8 提升到 16 (56核 CPU)，预估 5min→2-3min。
+
+### Day 26: 前端优化：板块合并 + 序号标题 + UI 精细化
+- [x] **板块合并**: 首页 9 个独立板块合并为 5 个主板块（配音方式+配音列表→三、配音；视频素材+时间轴→四、素材编辑；历史作品+作品预览→六、作品）。
+- [x] **中文序号标题**: 一~十编号（首页一~六，发布页七~十），移除所有 emoji 图标。
+- [x] **embedded 模式**: 6 个组件支持 `embedded` prop，嵌入时不渲染外层卡片/标题。
+- [x] **配音列表两行布局**: embedded 模式第 1 行语速+生成配音（右对齐），第 2 行配音列表+刷新。
+- [x] **子组件自渲染子标题**: MaterialSelector/TimelineEditor embedded 时自渲染 h3 子标题+操作按钮同行。
+- [x] **下拉对齐**: TitleSubtitlePanel 标签统一 `w-20`，下拉 `w-1/3 min-w-[100px]`，垂直对齐。
+- [x] **参考音频文案简化**: 底部段落移至标题旁，简化为 `(上传3-10秒语音样本)`。
+- [x] **账户手机号显示**: AccountSettingsDropdown 新增手机号显示。
+- [x] **标题显示模式对副标题生效**: payload 条件修复 + UI 下拉上移至板块标题行。
+- [x] **登录后用户信息立即可用**: AuthContext 暴露 `setUser`，登录成功后立即写入用户数据，修复登录后显示"未知账户"的问题。
+- [x] **文案微调**: 素材描述改为"上传自拍视频，最多可选4个"；显示模式选项加"标题"前缀。
+- [x] **UI/UX 体验优化**: 操作按钮移动端可见（opacity-40）、手机号脱敏、标题字数计数器、时间轴拖拽抓手图标、截取滑块放大。
+- [x] **代码质量修复**: 密码弹窗 success 清空、MaterialSelector useMemo + disabled 守卫、TimelineEditor useMemo。
+- [x] **发布页响应式布局**: 平台账号卡片单行布局，移动端紧凑（小图标/小按钮），桌面端宽松（与其他板块风格一致）。
+- [x] **移动端刷新回顶部**: `scrollRestoration = "manual"` + 列表 scroll 时间门控（`scrollEffectsEnabled` ref，1 秒内禁止自动滚动）+ 延迟兜底 `scrollTo(0,0)`。
+- [x] **移动端样式预览缩小**: FloatingStylePreview 移动端宽度缩至 160px，位置改为右下角，不遮挡样式调节控件。
+- [x] **列表滚动条统一隐藏**: 所有列表（BGM/配音/作品/素材/文案提取）滚动条改回 `hide-scrollbar`。
+- [x] **移动端配音/素材适配**: VoiceSelector 按钮移动端缩小（`px-2 sm:px-4`）修复克隆声音不可见；MaterialSelector 标题行移除 `whitespace-nowrap`，描述移动端隐藏，修复刷新按钮溢出。
+- [x] **生成配音按钮放大**: 从辅助尺寸（`text-xs px-2 py-1`）升级为主操作尺寸（`text-sm font-medium px-4 py-2`），新增阴影。
+- [x] **生成进度条位置调整**: 从"六、作品"卡片内部提取到右栏独立卡片，显示在作品卡片上方，更醒目。
+- [x] **LatentSync 超时修复**: httpx 超时从 1200s（20 分钟）改为 3600s（1 小时），修复 2 分钟以上视频口型推理超时回退问题。
+- [x] **字幕时间戳节奏映射**: `whisper_service.py` 从全程线性插值改为 Whisper 逐词节奏映射，修复长视频字幕漂移。
+
+### Day 25: 文案提取修复 + 自定义提示词 + 片头副标题
+- [x] **抖音文案提取修复**: yt-dlp Fresh cookies 报错，重写 `_download_douyin_manual` 为移动端分享页 + 自动获取 ttwid 方案。
+- [x] **清理 DOUYIN_COOKIE**: 新方案不再需要手动维护 Cookie，从 `.env`/`config.py`/`service.py` 全面删除。
+- [x] **AI 智能改写自定义提示词**: 后端 `rewrite_script()` 支持 `custom_prompt` 参数；前端 checkbox 旁新增折叠式提示词编辑区，localStorage 持久化。
+- [x] **SSR 构建修复**: `useState` 初始化 `localStorage` 访问加 `typeof window` 守卫，修复 `npm run build` 报错。
+- [x] **片头副标题**: 新增 secondary_title（后端/Remotion/前端全链路），AI 同时生成，独立样式配置，20 字限制。
+- [x] **前端文案修正**: "AI 洗稿结果"→"AI 改写结果"。
+- [x] **yt-dlp 升级**: `2025.12.08` → `2026.2.21`。
+- [x] **参考音频中文文件名修复**: `sanitize_filename()` 将存储路径清洗为 ASCII 安全字符，纯中文名哈希兜底，原始名保留为展示名。
+
+### Day 24: 鉴权到期治理 + 多素材时间轴稳定性修复
+- [x] **会员到期请求时失效**: 登录与鉴权接口统一执行 `expires_at` 检查；到期后自动停用账号、清理 session，并返回“会员已到期，请续费”。
+- [x] **画面比例控制**: 时间轴新增 `9:16 / 16:9` 输出比例选择，前端持久化并透传后端，单素材/多素材统一按目标分辨率处理。
+- [x] **标题/字幕防溢出**: Remotion 与前端预览统一响应式缩放、自动换行、描边/字距/边距比例缩放，降低预览与成片差异。
+- [x] **标题显示模式**: 标题行新增“短暂显示/常驻显示”下拉；默认短暂显示（4 秒），用户选择持久化并透传至 Remotion 渲染链路。
+- [x] **MOV 方向归一化**: 新增旋转元数据解析与 orientation normalize，修复“编码横屏+旋转元数据”导致的竖屏判断偏差。
+- [x] **多素材拼接稳定性**: 片段 prepare 与 concat 统一 25fps/CFR，concat 增加 `+genpts`，缓解段切换处“画面冻结口型还动”。
+- [x] **时间轴语义对齐**: 打通 `source_end` 全链路；修复 `sourceStart>0 且 sourceEnd=0` 时长计算；生成时以时间轴可见段 assignments 为准，超出段不参与。
+- [x] **交互细节优化**: 页面刷新回顶部；素材/历史列表首轮自动滚动抑制，减少恢复状态时页面跳动。
+
+### Day 23: 配音前置重构 + 素材时间轴编排 + UI 体验优化 + 声音克隆增强
+
+#### 第一阶段：配音前置
+- [x] **配音生成独立化**: 新增 `generated_audios` 后端模块（router/schemas/service），5 个 API 端点，复用现有 TTSService / voice_clone_service / task_store。
+- [x] **配音管理面板**: 前端新增 `useGeneratedAudios` hook + `GeneratedAudiosPanel` 组件，支持生成/试听/改名/删除/选中。
+- [x] **UI 面板重排序**: 文案 → 标题字幕 → 配音方式 → 配音列表 → 素材选择 → BGM → 生成视频。
+- [x] **素材区门控**: 未选中配音时素材区显示遮罩，选中后显示配音时长 + 素材均分信息。
+- [x] **视频生成对接**: workflow.py 新增预生成音频分支（`generated_audio_id`），跳过内联 TTS，向后兼容。
+- [x] **持久化**: selectedAudioId 加入 useHomePersistence，刷新页面恢复选中配音。
+
+#### 第二阶段：素材时间轴编排
+- [x] **时间轴编辑器**: 新增 `TimelineEditor` 组件，wavesurfer.js 音频波形 + 色块可视化素材分配，拖拽分割线调整各段时长。
+- [x] **素材截取设置**: 新增 `ClipTrimmer` 模态框，HTML5 视频预览 + 双端滑块设置源视频截取起点/终点。
+- [x] **后端自定义分配**: 新增 `CustomAssignment` 模型，`prepare_segment` 支持 `source_start`，workflow 多素材/单素材流水线支持 `custom_assignments`。
+- [x] **循环截取修复**: `stream_loop + source_start` 改为两步处理（先裁剪再循环），确保从截取起点循环而非从视频 0s 开始。
+- [x] **MaterialSelector 精简**: 移除旧的时长信息栏和拖拽排序区（功能迁移到 TimelineEditor）。
+
+#### 第三阶段：UI 体验优化 + TTS 稳定性
+- [x] **TTS SoX PATH 修复**: `run_qwen_tts.sh` export conda env bin 到 PATH (Qwen3-TTS 已停用，已被 CosyVoice 3.0 替换)。
+- [x] **TTS 显存管理**: 每次生成后 `torch.cuda.empty_cache()`，asyncio.to_thread 避免阻塞事件循环 (CosyVoice 沿用相同机制)。
+- [x] **配音列表按钮统一**: Play/Edit/Delete 按钮右侧同组 hover 显示，与 RefAudioPanel 一致，移除文案摘要。
+- [x] **素材区解除配音门控**: 移除 MaterialSelector 的 selectedAudio 遮罩，素材随时可上传管理。
+- [x] **时间轴拖拽排序**: TimelineEditor 色块支持 HTML5 Drag & Drop 调换素材顺序。
+- [x] **截取设置 Range Slider**: ClipTrimmer 改为单轨道双手柄（紫色起点+粉色终点），替换两个独立滑块。
+- [x] **截取设置视频预览**: 视频区域可播放/暂停，从 sourceStart 到 sourceEnd 自动停止，拖拽手柄时实时 seek。
+
+#### 第四阶段：历史文案 + Bug 修复
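+- [x] **Saved scripts save and load**: new `useSavedScripts` hook for manually saving / loading / deleting saved scripts, with independent localStorage persistence.
+- [x] **Timeline drag fix**: `reorderSegments` changed from swapping properties to moving array elements (splice), fixing the bug where durations did not follow their material after a drag.
+- [x] **Consistent button look**: the 4 buttons in the script editing area share a fixed height `h-7`, and redundant `<span>` nesting was removed.
+- [x] **Bottom bar adjustment**: the "Save script" button moved to the bottom right, and the estimated-duration display was removed.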
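
The array-move approach from the timeline drag fix can be sketched like this. The helper name `moveItem` is hypothetical and this is not the project's `reorderSegments`; it only shows why moving whole elements keeps each segment's duration attached to its material.

```typescript
// Hypothetical helper: returns a new array with one element moved.
function moveItem<T>(items: readonly T[], from: number, to: number): T[] {
  const next = [...items];
  const inRange = (i: number) => Number.isInteger(i) && i >= 0 && i < next.length;
  if (!inRange(from) || !inRange(to)) return next; // ignore invalid drops
  const [moved] = next.splice(from, 1); // remove the dragged element
  next.splice(to, 0, moved); // reinsert it at the drop position
  return next;
}
```

Because each element is moved as a unit, a segment object carrying both its material and its duration stays intact, whereas swapping individual properties between two slots can leave the duration behind.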
+
+#### 第五阶段：字幕语言不匹配 + 视频比例错位修复
+- [x] **字幕用原文替换 Whisper 转录**: `align()` 新增 `original_text` 参数，字幕文字永远用配音保存的原始文案。
+- [x] **Remotion 动态视频尺寸**: `calculateMetadata` 从 props 读取真实尺寸，修复标题/字幕比例错位。
+- [x] **英文空格丢失修复**: `split_word_to_chars` 遇到空格时 flush buffer + pending_space 标记。
+
+#### 第六阶段：参考音频自动转写 + 语速控制
+- [x] **Whisper 自动转写 ref_text**: 上传参考音频时自动调用 Whisper 转写内容作为 ref_text，不再使用前端固定文字。
+- [x] **参考音频自动截取**: 超过 10 秒自动在静音点截取（ffmpeg silencedetect），末尾 0.1 秒淡出避免截断爆音。
+- [x] **重新识别功能**: 新增 `POST /ref-audios/{id}/retranscribe` 端点 + 前端 RotateCw 按钮，旧音频可重新转写并截取。
+- [x] **语速控制**: 全链路 speed 参数（前端选择器 → 持久化 → 后端 → CosyVoice `inference_zero_shot(speed=)`），5 档：较慢(0.8)/稍慢(0.9)/正常(1.0)/稍快(1.1)/较快(1.2)。
+- [x] **缺少参考音频门控**: 声音克隆模式下未选参考音频时，生成配音按钮禁用 + 黄色警告提示。
+- [x] **Whisper 语言自动检测**: `transcribe()` language 参数改为可选（默认 None = 自动检测），支持多语言参考音频。
+- [x] **前端清理**: 移除固定 ref_text 常量、朗读引导文字，简化为"上传任意语音样本，系统将自动识别内容并克隆声音"。
+
+### Day 22: 多素材优化 + AI 翻译 + TTS 多语言
+- [x] **多素材 Bug 修复**: 6 个高优 Bug（边界溢出、单段 fallback、除零、duration 校验、Whisper 兜底、空列表检查）。
+- [x] **架构重构**: 多素材从"逐段 LatentSync"重构为"先拼接再推理"，推理次数 N→1。
+- [x] **前端优化**: payload 安全、进度消息、上传自动选中、Material 接口统一、拖拽修复、素材上限 4 个。
+- [x] **AI 多语言翻译**: 新增 `/api/ai/translate` 接口，前端 9 种语言翻译 + 还原原文。
+- [x] **TTS 多语言**: EdgeTTS 10 语言声音列表、翻译自动切换声音、声音克隆 language 透传、textLang 持久化。
+
+### Day 21: 缺陷修复 + 浮动预览 + 发布重构 + 架构优化 + 多素材生成
 - [x] **Remotion 崩溃容错**: 渲染进程 SIGABRT 退出时检查输出文件，避免误判失败导致标题/字幕丢失。
 - [x] **首页作品选择持久化**: 修复 `fetchGeneratedVideos` 无条件覆盖恢复值的问题，新增 `preferVideoId` 参数控制选中逻辑。
 - [x] **发布页作品选择持久化**: 根因为签名 URL 不稳定，全面改用 `video.id` 替代 `path` 进行选择/持久化/比较。
@@ -22,6 +133,11 @@
 - [x] **后端模块分层**: materials/tools/ref_audios 三个模块补全 router+schemas+service 分层。
 - [x] **开发规范更新**: BACKEND_DEV.md 新增渐进原则、DOC_RULES.md 取消 TASK_COMPLETE.md 手动触发约束。
 - [x] **文档全面更新**: BACKEND_DEV/README、FRONTEND_DEV、DEPLOY_MANUAL、README.md 同步更新。
+- [x] **多素材视频生成（多机位效果）**: 支持多选素材 + 拖拽排序，按素材数量均分音频时长（对齐 Whisper 字边界）自动切换机位。逐段 LatentSync + FFmpeg 拼接。前端 @dnd-kit 拖拽排序 UI。
+- [x] **字幕开关移除**: 默认启用逐字高亮字幕，移除开关及相关死代码。
+- [x] **视频格式扩展**: 上传支持 mkv/webm/flv/wmv/m4v/ts/mts 等常见格式。
+- [x] **Watchdog 优化**: 健康检查阈值提高到 5 次，新增重启冷却期 120 秒，避免误重启。
+- [x] **多素材 Bug 修复**: 修复标点分句方案对无句末标点文案无效（改为均分方案）、音频时间偏移导致口型不对齐等缺陷。

 ### Day 20: 代码质量与安全优化
 - [x] **功能性修复**: LatentSync 回退逻辑、任务状态接口认证、User 类型统一。
@@ -73,7 +189,7 @@
 - [x] **体验细节优化**: 录音预览 URL 回收，预览弹窗滚动恢复，全局任务提示挂载。

 ### Day 16: 深度性能优化
-- [x] **Qwen-TTS acceleration**: integrated Flash Attention 2; model load time improved to 8.9s.
+- [x] **Qwen-TTS acceleration**: integrated Flash Attention 2 (retired, replaced by CosyVoice 3.0).
 - [x] **服务守护**: 开发 `Watchdog` 看门狗机制，自动监控并重启僵死服务。
 - [x] **LatentSync 性能确认**: 验证 DeepCache + 原生 Flash Attn 生效。
 - [x] **文档重构**: 全面更新 README、部署手册及后端文档。
@@ -86,10 +202,10 @@
 ### Day 14: AI 增强与体验优化
 - [x] **AI 标题/标签**: 集成 GLM-4API 自动生成视频元数据。
 - [x] **字幕升级**: Remotion 逐字高亮字幕 (卡拉OK效果) 及动画片头。
-- [x] **Model upgrade**: Qwen3-TTS upgraded to the 1.7B-Base version.
+- [x] **Model upgrade**: voice cloning migrated to CosyVoice 3.0 (0.5B).

 ### Day 13: 声音克隆集成
-- [x] **Voice cloning microservice**: wrapped Qwen3-TTS as a standalone API (port 8009).
+- [x] **Voice cloning microservice**: wrapped CosyVoice 3.0 as a standalone API (port 8010, replacing Qwen3-TTS).
 - [x] **参考音频管理**: Supabase 存储桶配置与管理接口。
 - [x] **多模态 TTS**: 前端支持 EdgeTTS / Clone Voice 切换。

@@ -124,6 +240,7 @@
 ## 🛤️ Roadmap

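 ### 🔴 Priority to-do
+- [x] ~~**Voiceover-first refactor, phase 2**: material clip trimming + voice timeline arrangement~~ ✅ Completed on Day 23
 - [ ] **Batch generation architecture**: support Excel import for batch video production.
 - [ ] **Server-side scheduled tasks**: move frontend-triggered scheduled publishing to the backend APScheduler.
 - [ ] **Publish task recovery**: task-based publishing + persisted state + frontend resume, fixing state loss after a refresh.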
@@ -141,9 +258,10 @@
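 | **Core API** | 100% | ✅ Stable |
 | **Web UI** | 100% | ✅ Stable (mobile-adapted) |
 | **Lip sync** | 100% | ✅ LatentSync 1.6 |
-| **TTS voiceover** | 100% | ✅ EdgeTTS + Qwen3 |
+| **TTS voiceover** | 100% | ✅ EdgeTTS + CosyVoice 3.0 + voiceover-first workflow + timeline arrangement + automatic transcription + speed control |
 | **Auto publishing** | 100% | ✅ Douyin/WeChat Channels/Bilibili/Xiaohongshu |
 | **User authentication** | 100% | ✅ Phone number + JWT |
+| **Paid membership** | 100% | ✅ Alipay PC website payment + automatic activation |
 | **Deployment and operations** | 100% | ✅ PM2 + Watchdog |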

 ---
--- a/README.md
+++ b/README.md
@@ -4,8 +4,8 @@

 > 📹 **Upload a presenter** · 🎙️ **Enter a script** · 🎬 **One-click video**

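-An open-source digital-human talking-head video generation system built on **LatentSync 1.6 + EdgeTTS**.
-Integrates **Qwen3-TTS** voice cloning and automated social media publishing.
+An open-source digital-human talking-head video generation system built on **hybrid LatentSync 1.6 + MuseTalk 1.5 lip sync**.
+Integrates **CosyVoice 3.0** voice cloning and automated social media publishing.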

 [功能特性](#-功能特性) • [技术栈](#-技术栈) • [文档中心](#-文档中心) • [部署指南](Docs/DEPLOY_MANUAL.md)

@@ -16,23 +16,28 @@
 ## ✨ 功能特性

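 ### Core capabilities
-- 🎬 **HD lip sync** - Powered by LatentSync 1.6, a 512×512 high-resolution Latent Diffusion model.
-- 🎙️ **Multimodal voiceover** - Supports **EdgeTTS** (Microsoft's highly natural voices) and **Qwen3-TTS** (fast voice cloning from 3 seconds of audio).
+- 🎬 **HD lip sync** - Hybrid approach: short videos (<120s) use LatentSync 1.6 (high-quality Latent Diffusion), long videos (>=120s) use MuseTalk 1.5 (real-time-class single-step inference), with automatic routing + fallback.
+- 🎙️ **Multimodal voiceover** - Supports **EdgeTTS** (Microsoft's highly natural voices, 10 languages) and **CosyVoice 3.0** (fast voice cloning from 3 seconds of audio, 9 languages + 18 dialects, adjustable speed). Uploaded reference audio is transcribed automatically with Whisper and trimmed intelligently. Voiceover-first workflow: generate the voiceover → pick materials → generate the video.
 - 📝 **Smart subtitles** - Integrates faster-whisper + Remotion to generate word-by-word highlighted (karaoke-style) subtitles automatically.
-- 🎨 **Style presets** - Title/subtitle style selection + preview + font size adjustment, with custom font library support.
-- 🖼️ **Consistent preview** - Title/subtitle previews scale with the material resolution, so they look closer to the final video.
-- 💾 **Persistent user preferences** - Home page state is restored/saved in one place; the last configuration carries over after a refresh.
+- 🎨 **Style presets** - 12 title + 8 subtitle style presets, with preview + font size adjustment + custom font library. Native CSS stroke rendering, crisp with no ghosting.
+- 🏷️ **Title display mode** - The intro title supports `短暂显示` (brief) / `常驻显示` (persistent); the default is brief (4 seconds), and the user's preference is persisted automatically.
+- 📌 **Secondary intro title** - An optional secondary title shown below the main title, with its own style settings; AI can generate it at the same time; 20-character limit.
+- 🖼️ **Consistent preview** - Title/subtitle previews share responsive scaling and automatic line wrapping with the Remotion output, so they display reliably even on narrow canvases.
+- 🎞️ **Multi-material, multi-camera** - Multi-select materials + timeline editor (wavesurfer.js waveform visualization): drag dividers to adjust durations, drag to reorder camera angles, and trim clips by `source_start/source_end`.
+- 📐 **Aspect ratio control** - One-click switch between `9:16 / 16:9` output on the timeline; the whole generation pipeline processes at the target ratio.
+- 💾 **Persistent user preferences** - Home page state is restored/saved in one place; the last configuration carries over after a refresh. Script history can be saved and loaded manually.
 - 🎵 **Background music** - Preview + volume control + mixing, keeping the voiceover volume stable.
-- 🤖 **AI-assisted creation** - Built-in GLM-4.7-Flash: script extraction from Bilibili/Douyin links, AI rewriting, automatic title/tag generation.
+- 🤖 **AI-assisted creation** - Built-in GLM-4.7-Flash: script extraction from Bilibili/Douyin links, AI rewriting (with custom prompts), automatic title/tag generation, translation into 9 languages.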

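 ### Platform features
 - 📱 **Fully automated publishing** - Immediate publishing to Douyin/WeChat Channels/Bilibili/Xiaohongshu; QR-code login + persisted cookies.
 - 🖥️ **Publish management preview** - Previews works via signed URLs / relative paths so they play directly.
 - 📸 **Visible publish results** - After a successful Douyin/WeChat Channels publish, a screenshot is returned and can be viewed on the result card of the publish page.
 - 🛡️ **Publish safeguards** - While publishing is in progress, shows a "Do not refresh or close the page" notice and intercepts refresh/close with a second confirmation.
+- 💳 **Paid membership** - Alipay PC website payment activates membership automatically; accounts are deactivated on expiry with a renewal prompt; manual activation by an administrator remains available.
 - 🔐 **Authentication and isolation** - Supabase-based user isolation, with phone-number registration/login and password management.
 - 🛡️ **Service guard** - Built-in Watchdog monitors and restarts hung services automatically for stable 24/7 operation.
-- 🚀 **Performance** - Video pre-compression, resident model services (near-instant loading), dual-GPU pipeline concurrency.
+- 🚀 **Performance** - Video pre-compression, resident model services (near-instant loading), dual-GPU pipeline concurrency, reduced-frequency MuseTalk face detection + BiSeNet caching, Remotion rendering with 16-way concurrency.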

 ---

@@ -40,11 +45,11 @@

 | Area | Core technology | Notes |
 |------|----------|------|
-| **Frontend** | Next.js 16 | TypeScript, TailwindCSS, SWR |
-| **Backend** | FastAPI | Python 3.10, AsyncIO, PM2 |
+| **Frontend** | Next.js 16 | TypeScript, TailwindCSS, SWR, wavesurfer.js |
+| **Backend** | FastAPI | Python 3.12, AsyncIO, PM2 |
 | **Database** | Supabase | PostgreSQL, Storage (local/S3), Auth |
-| **Lip sync** | LatentSync 1.6 | PyTorch 2.5, Diffusers, DeepCache |
-| **Voice cloning** | Qwen3-TTS | 1.7B parameters, accelerated with Flash Attention 2 |
+| **Lip sync** | LatentSync 1.6 + MuseTalk 1.5 | Hybrid routing: high-quality diffusion for short videos, single-step real-time inference for long videos |
+| **Voice cloning** | CosyVoice 3.0 | 0.5B parameters, 9 languages + 18 dialects |
 | **Automation** | Playwright | Headless-browser automation for social media |
 | **Deployment** | Docker & PM2 | Hybrid deployment architecture |

@@ -56,14 +61,18 @@

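 ### Deployment and operations
 - **[Deployment manual (DEPLOY_MANUAL.md)](Docs/DEPLOY_MANUAL.md)** - 👈 **Start here for deployment**! Contains the complete environment setup steps.
-- [Reference audio service deployment (QWEN3_TTS_DEPLOY.md)](Docs/QWEN3_TTS_DEPLOY.md) - Voice cloning model deployment guide.
-- [LatentSync deployment guide](models/LatentSync/DEPLOY.md) - Standalone deployment of the lip-sync model.
+- [Reference audio service deployment (COSYVOICE3_DEPLOY.md)](Docs/COSYVOICE3_DEPLOY.md) - Voice cloning model deployment guide.
+- [LatentSync deployment guide (LATENTSYNC_DEPLOY.md)](Docs/LATENTSYNC_DEPLOY.md) - Standalone deployment of the lip-sync model.
+- [MuseTalk deployment guide (MUSETALK_DEPLOY.md)](Docs/MUSETALK_DEPLOY.md) - Deployment of the long-video lip-sync model.
 - [Supabase deployment guide (SUPABASE_DEPLOY.md)](Docs/SUPABASE_DEPLOY.md) - Supabase and authentication system configuration.
+- [Alipay deployment guide (ALIPAY_DEPLOY.md)](Docs/ALIPAY_DEPLOY.md) - Configuration for paid membership via Alipay.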

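 ### Development docs
-- [Backend development guide](Docs/BACKEND_README.md) - API conventions and development workflow.
-- [Backend development standards](Docs/BACKEND_DEV.md) - Layering conventions and development practices.
-- [Frontend development guide](Docs/FRONTEND_DEV.md) - UI component and page conventions.
+- [Backend development guide (BACKEND_README.md)](Docs/BACKEND_README.md) - API conventions and development workflow.
+- [Backend development standards (BACKEND_DEV.md)](Docs/BACKEND_DEV.md) - Layering conventions and development practices.
+- [Frontend development guide (FRONTEND_DEV.md)](Docs/FRONTEND_DEV.md) - UI component and page conventions.
+- [Frontend component docs (FRONTEND_README.md)](Docs/FRONTEND_README.md) - Component structure and section overview.
+- [Remotion subtitle deployment (SUBTITLE_DEPLOY.md)](Docs/SUBTITLE_DEPLOY.md) - Subtitle rendering service deployment.
 - [Development logs (DevLogs)](Docs/DevLogs/) - Daily development progress and technical decisions.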

 ---
@@ -80,8 +89,9 @@ ViGent2/
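 ├── frontend/             # Next.js frontend application
 ├── remotion/             # Remotion video rendering (title/subtitle compositing)
 ├── models/               # AI model repository
-│   ├── LatentSync/       # Lip-sync service
-│   └── Qwen3-TTS/        # Voice cloning service
+│   ├── LatentSync/       # Lip-sync service (GPU1, short videos)
+│   ├── MuseTalk/         # Lip-sync service (GPU0, long videos)
+│   └── CosyVoice/        # Voice cloning service
 └── Docs/                 # Project documentation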
 ```

@@ -95,8 +105,9 @@ ViGent2/
 |----------|------|------|
 | **Web UI** | 3002 | User entry point (Next.js) |
 | **Backend API** | 8006 | Core business API (FastAPI) |
-| **LatentSync** | 8007 | Lip-sync inference service |
-| **Qwen3-TTS** | 8009 | Voice cloning inference service |
+| **LatentSync** | 8007 | Lip-sync inference service (GPU1, short videos) |
+| **MuseTalk** | 8011 | Lip-sync inference service (GPU0, long videos) |
+| **CosyVoice 3.0** | 8010 | Voice cloning inference service |
 | **Supabase** | 8008 | Database and authentication gateway |

 ---
--- a/backend/.env.example
+++ b/backend/.env.example
@@ -25,10 +25,10 @@ LATENTSYNC_USE_SERVER=true
 # LATENTSYNC_API_URL=http://localhost:8007

 # Inference steps (20-50; higher means better quality but slower)
-LATENTSYNC_INFERENCE_STEPS=40
+LATENTSYNC_INFERENCE_STEPS=16

 # Guidance scale (1.0-3.0; higher gives more accurate lip sync but may cause jitter)
-LATENTSYNC_GUIDANCE_SCALE=2.0
+LATENTSYNC_GUIDANCE_SCALE=1.5

 # Enable DeepCache acceleration (recommended)
 LATENTSYNC_ENABLE_DEEPCACHE=true
@@ -36,6 +36,26 @@ LATENTSYNC_ENABLE_DEEPCACHE=true
 # Random seed (set to -1 for random)
 LATENTSYNC_SEED=1247

+# =============== MuseTalk configuration ===============
+# GPU selection (defaults to GPU0, shared with CosyVoice)
+MUSETALK_GPU_ID=0
+
+# Resident service address (port 8011)
+MUSETALK_API_URL=http://localhost:8011
+
+# Inference batch size
+MUSETALK_BATCH_SIZE=32
+
+# Model version
+MUSETALK_VERSION=v15
+
+# Half-precision acceleration
+MUSETALK_USE_FLOAT16=true
+
+# =============== Hybrid lip-sync routing ===============
+# Audio duration >= this threshold (seconds) uses MuseTalk; below it uses LatentSync
+LIPSYNC_DURATION_THRESHOLD=120
+
 # =============== Upload configuration ===============
 # Maximum upload file size (MB)
 MAX_UPLOAD_SIZE_MB=500
@@ -70,6 +90,9 @@ GLM_MODEL=glm-4.7-flash
 # Make sure the storage volume mapping is correct; avoid hard-coded paths
 SUPABASE_STORAGE_LOCAL_PATH=/home/rongye/ProgramFiles/Supabase/volumes/storage/stub/stub

-# =============== Douyin video download cookie ===============
-# Used to extract video scripts from Douyin URLs; it expires and needs periodic updating
-DOUYIN_COOKIE=douyin.com; device_web_cpu_core=10; device_web_memory_size=8; __ac_nonce=06760391f00b9b51264ae; __ac_signature=_02B4Z6wo00f019a5ceAAAIDAhEZR-X3jjWfWmXVAAJLXd4; ttwid=1%7C7MTKBSMsP4eOv9h5NAh8p0E-NYIud09ftNmB0mjLpWc%7C1734359327%7C8794abeabbd47447e1f56e5abc726be089f2a0344d6343b5f75f23e7b0f0028f; UIFID_TEMP=0de8750d2b188f4235dbfd208e44abbb976428f0720eb983255afefa45d39c0c6532e1d4768dd8587bf919f866ff1396912bcb2af71efee56a14a2a9f37b74010d0a0413795262f6d4afe02a032ac7ab; s_v_web_id=verify_m4r4ribr_c7krmY1z_WoeI_43po_ATpO_I4o8U1bex2D7; hevc_supported=true; home_can_add_dy_2_desktop=%220%22; dy_swidth=2560; dy_sheight=1440; stream_recommend_feed_params=%22%7B%5C%22cookie_enabled%5C%22%3Atrue%2C%5C%22screen_width%5C%22%3A2560%2C%5C%22screen_height%5C%22%3A1440%2C%5C%22browser_online%5C%22%3Atrue%2C%5C%22cpu_core_num%5C%22%3A10%2C%5C%22device_memory%5C%22%3A8%2C%5C%22downlink%5C%22%3A10%2C%5C%22effective_type%5C%22%3A%5C%224g%5C%22%2C%5C%22round_trip_time%5C%22%3A50%7D%22; strategyABtestKey=%221734359328.577%22; csrf_session_id=2f53aed9aa6974e83aa9a1014180c3a4; fpk1=U2FsdGVkX1/IpBh0qdmlKAVhGyYHgur4/VtL9AReZoeSxadXn4juKvsakahRGqjxOPytHWspYoBogyhS/V6QSw==; fpk2=0845b309c7b9b957afd9ecf775a4c21f; passport_csrf_token=d80e0c5b2fa2328219856be5ba7e671e; passport_csrf_token_default=d80e0c5b2fa2328219856be5ba7e671e; odin_tt=3c891091d2eb0f4718c1d5645bc4a0017032d4d5aa989decb729e9da2ad570918cbe5e9133dc6b145fa8c758de98efe32ff1f81aa0d611e838cc73ab08ef7d3f6adf66ab4d10e8372ddd628f94f16b8e; volume_info=%7B%22isUserMute%22%3Afalse%2C%22isMute%22%3Afalse%2C%22volume%22%3A0.5%7D; bd_ticket_guard_client_web_domain=2; FORCE_LOGIN=%7B%22videoConsumedRemainSeconds%22%3A180%7D; 
UIFID=0de8750d2b188f4235dbfd208e44abbb976428f0720eb983255afefa45d39c0c6532e1d4768dd8587bf919f866ff139655a3c2b735923234f371c699560c657923fd3d6c5b63ab7bb9b83423b6cb4787e2ce66a7fbc4ecb24c8570f520fe6de068bbb95115023c0c6c1b6ee31b49fb7e3996fb8349f43a3fd8b7a61cd9e18e8fe65eb6a7c13de4c0960d84e344b644725db3eb2fa6b7caf821de1b50527979f2; is_dash_user=1; biz_trace_id=b57a241f; bd_ticket_guard_client_data=eyJiZC10aWNrZXQtZ3VhcmQtdmVyc2lvbiI6MiwiYmQtdGlja2V0LWd1YXJkLWl0ZXJhdGlvbi12ZXJzaW9uIjoxLCJiZC10aWNrZXQtZ3VhcmQtcmVlLXB1YmxpYy1rZXkiOiJCTEo2R0lDalVoWW1XcHpGOFdrN0Vrc0dXcCtaUzNKY1g4NGNGY2k0TTl1TEowNjdUb21mbFU5aDdvWVBGamhNRWNRQWtKdnN1MnM3RmpTWnlJQXpHMjA9IiwiYmQtdGlja2V0LWd1YXJkLXdlYi12ZXJzaW9uIjoyfQ%3D%3D; download_guide=%221%2F20241216%2F0%22; sdk_source_info=7e276470716a68645a606960273f276364697660272927676c715a6d6069756077273f276364697660272927666d776a68605a607d71606b766c6a6b5a7666776c7571273f275e58272927666a6b766a69605a696c6061273f27636469766027292762696a6764695a7364776c6467696076273f275e5827292771273f273d33323131333c3036313632342778; bit_env=RiOY4jzzpxZoVCl6zdVSVhVRjdwHRTxqcqWdqMBZLPGjMdB4Tax1kAELHNTVAAh72KuhumewE4Lq6f0-VJ2UpJrkrhSxoPw9LUb3zQrq1OSwbeSPHkRlRgRQvO89sItdGUyq1oFr0XyRCnMYG87KSeWyc4x0czGR0o50hTDoDLG5rJVoRcdQOLvjiAegsqyytKF59sPX_QM9qffK2SqYsg0hCggURc_AI6kguDDE5DvG0bnyz1utw4z1eEnIoLrkGDqzqBZj4dOAr0BVU6ofbsS-pOQ2u2PM1dLP9FlBVBlVaqYVgHJeSLsR5k76BRTddUjTb4zEilVIEwAMJWGN4I1BxVt6fC9B5tBQpuT0lj3n3eKXCKXZsd8FrEs5_pbfDsxV-e_WMiXI2ff4qxiTC0U73sfo9OpicKICtZjdq8qsHxJuu6wVR36zvXeL2Wch5C6MzprNvkivv0l8nbh2mSgy1nabZr3dmU6NcR-Bg3Q3xTWUlR9aAUmpopC-cNuXjgLpT-Lw1AYGilSUnCvosth1Gfypq-b0MpgmdSDgTrQ%3D; gulu_source_res=eyJwX2luIjoiMDhjOGQ3ZTJiODQyNjZkZWI5Y2VkMGJiODNlNmY1ZWY0ZjMyNTE2ZmYyZjAzNDMzZjI0OWU1Y2Q1NTczNTk5NyJ9; passport_auth_mix_state=hp9bc3dgb1tm5wd8p82zawus27g0e3ue; IsDouyinActive=false
+# =============== Alipay configuration ===============
+ALIPAY_APP_ID=2021006132600283
+ALIPAY_PRIVATE_KEY_PATH=/home/rongye/ProgramFiles/ViGent2/backend/keys/app_private_key.pem
+ALIPAY_PUBLIC_KEY_PATH=/home/rongye/ProgramFiles/ViGent2/backend/keys/alipay_public_key.pem
+ALIPAY_NOTIFY_URL=https://vigent.hbyrkj.top/api/payment/notify
+ALIPAY_RETURN_URL=https://vigent.hbyrkj.top/pay
--- a/backend/app/core/config.py
+++ b/backend/app/core/config.py
@@ -58,6 +58,16 @@ class Settings(BaseSettings):
     LATENTSYNC_SEED: int = 1247                     # Random seed (-1 for random)
     LATENTSYNC_USE_SERVER: bool = True              # Use the resident service (Persistent Server) for speed

+    # MuseTalk configuration
+    MUSETALK_GPU_ID: int = 0                        # GPU ID (GPU0 by default)
+    MUSETALK_API_URL: str = "http://localhost:8011"  # Resident service address
+    MUSETALK_BATCH_SIZE: int = 8                    # Inference batch size
+    MUSETALK_VERSION: str = "v15"                   # Model version
+    MUSETALK_USE_FLOAT16: bool = True               # Half-precision acceleration
+
+    # Hybrid lip-sync routing
+    LIPSYNC_DURATION_THRESHOLD: float = 120.0       # Seconds; durations >= this use MuseTalk
+
    # Supabase 配置
    SUPABASE_URL: str = ""
    SUPABASE_PUBLIC_URL: str = ""  # 公网访问地址，用于生成前端可访问的 URL
@@ -76,17 +86,28 @@ class Settings(BaseSettings):
    GLM_API_KEY: str = ""
    GLM_MODEL: str = "glm-4.7-flash"
    
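+    # Alipay configuration
+    ALIPAY_APP_ID: str = ""
+    ALIPAY_PRIVATE_KEY_PATH: str = ""   # Path to the application private key PEM file
+    ALIPAY_PUBLIC_KEY_PATH: str = ""    # Path to the Alipay public key PEM file
+    ALIPAY_NOTIFY_URL: str = ""         # Asynchronous notification callback URL (publicly reachable)
+    ALIPAY_RETURN_URL: str = ""         # Synchronous redirect URL after successful payment
+    ALIPAY_SANDBOX: bool = False        # Whether to use the sandbox environment
+    PAYMENT_AMOUNT: float = 999.00      # Membership price (CNY)
+    PAYMENT_EXPIRE_DAYS: int = 365      # Membership validity in days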
+
     # CORS configuration (comma-separated list of origins; * allows all)
    CORS_ORIGINS: str = "*"
-    
-    # 抖音 Cookie (用于视频下载功能，会过期需要定期更新)
-    DOUYIN_COOKIE: str = ""
-    
    @property
    def LATENTSYNC_DIR(self) -> Path:
        """LatentSync 目录路径 (动态计算)"""
        return self.BASE_DIR.parent.parent / "models" / "LatentSync"

+    @property
+    def MUSETALK_DIR(self) -> Path:
+        """MuseTalk directory path (computed dynamically)"""
+        return self.BASE_DIR.parent.parent / "models" / "MuseTalk"
+
    class Config:
        env_file = ".env"
        extra = "ignore"  # 忽略未知的环境变量
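
The routing rule behind `LIPSYNC_DURATION_THRESHOLD` can be sketched as follows. This is a minimal illustration rather than the project's actual router: `LipsyncRoute` and `choose_lipsync_route` are hypothetical names, and the default URLs simply mirror the ports listed in this change (8007 for LatentSync, 8011 for MuseTalk).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LipsyncRoute:
    """Hypothetical result type: which engine to call and where."""
    engine: str   # "latentsync" or "musetalk"
    api_url: str  # resident service address


def choose_lipsync_route(
    audio_duration_sec: float,
    threshold_sec: float = 120.0,                   # mirrors LIPSYNC_DURATION_THRESHOLD
    latentsync_url: str = "http://localhost:8007",  # assumed LatentSync address
    musetalk_url: str = "http://localhost:8011",    # mirrors MUSETALK_API_URL
) -> LipsyncRoute:
    """Durations >= threshold go to MuseTalk; shorter audio goes to LatentSync."""
    if audio_duration_sec < 0:
        raise ValueError("audio duration must be non-negative")
    if audio_duration_sec >= threshold_sec:
        return LipsyncRoute("musetalk", musetalk_url)
    return LipsyncRoute("latentsync", latentsync_url)
```

Keeping the comparison in one pure function makes the boundary (`>=` at exactly 120 seconds) easy to test; the fallback path mentioned in the README is not modelled here.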
--- a/backend/app/core/deps.py
+++ b/backend/app/core/deps.py
@@ -3,9 +3,9 @@
 """
 from typing import Optional, Any, Dict, cast
 from fastapi import Request, HTTPException, Depends, status
-from app.core.security import decode_access_token, TokenData
-from app.repositories.sessions import get_session
-from app.repositories.users import get_user_by_id
+from app.core.security import decode_access_token
+from app.repositories.sessions import get_session, delete_sessions
+from app.repositories.users import get_user_by_id, deactivate_user_if_expired
 from loguru import logger


@@ -35,8 +35,16 @@ async def get_current_user_optional(
            logger.warning(f"Session token 无效: user_id={token_data.user_id}")
            return None

-        user = get_user_by_id(token_data.user_id)
-        return cast(Optional[Dict[str, Any]], user)
+        user = cast(Optional[Dict[str, Any]], get_user_by_id(token_data.user_id))
+        if user and deactivate_user_if_expired(user):
+            delete_sessions(token_data.user_id)
+            return None
+
+        if user and not user.get("is_active"):
+            delete_sessions(token_data.user_id)
+            return None
+
+        return user
    except Exception as e:
        logger.error(f"获取用户信息失败: {e}")
        return None
@@ -82,13 +90,18 @@ async def get_current_user(
            )
        user = cast(Dict[str, Any], user)

-        if user.get("expires_at"):
-            from datetime import datetime, timezone
-            expires_at = datetime.fromisoformat(user["expires_at"].replace("Z", "+00:00"))
-            if datetime.now(timezone.utc) > expires_at:
+        if deactivate_user_if_expired(user):
+            delete_sessions(token_data.user_id)
            raise HTTPException(
                status_code=status.HTTP_403_FORBIDDEN,
-                    detail="授权已过期，请联系管理员续期"
+                detail="会员已到期，请续费"
+            )
+
+        if not user.get("is_active"):
+            delete_sessions(token_data.user_id)
+            raise HTTPException(
+                status_code=status.HTTP_403_FORBIDDEN,
+                detail="账号已停用"
            )

        return user
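
`deactivate_user_if_expired` is imported from `app.repositories.users`, and its body is not shown in this part of the diff. The expiry test it takes over from the removed `expires_at` block can be sketched as a pure function. `is_membership_expired` is a hypothetical name; treating a missing `expires_at` as "never expires" follows the removed code, while treating a timestamp without an offset as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def is_membership_expired(expires_at: Optional[str], now: Optional[datetime] = None) -> bool:
    """Return True when the ISO-8601 expiry timestamp lies in the past."""
    if not expires_at:
        return False  # no expiry recorded: nothing to enforce
    # Accept a trailing "Z" the same way the removed code did.
    expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    current = now if now is not None else datetime.now(timezone.utc)
    return current > expiry
```

Passing `now` explicitly keeps the check deterministic in tests; the database update and session deletion stay in the caller.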
--- a/backend/app/core/security.py
+++ b/backend/app/core/security.py
@@ -110,3 +110,28 @@ def set_auth_cookie(response: Response, token: str) -> None:
 def clear_auth_cookie(response: Response) -> None:
     """Clear the authentication cookie"""
    response.delete_cookie(key="access_token")
+
+
+def create_payment_token(user_id: str) -> str:
+    """Create a short-lived, payment-only JWT (valid for 30 minutes)"""
+    payload = {
+        "sub": user_id,
+        "purpose": "payment",
+        "exp": datetime.now(timezone.utc) + timedelta(minutes=30),
+    }
+    return jwt.encode(payload, settings.JWT_SECRET_KEY, algorithm=settings.JWT_ALGORITHM)
+
+
+def decode_payment_token(token: str) -> str | None:
+    """Decode a payment_token and return the user_id (valid only when purpose=payment)"""
+    try:
+        data = jwt.decode(
+            token,
+            settings.JWT_SECRET_KEY,
+            algorithms=[settings.JWT_ALGORITHM],
+        )
+        if data.get("purpose") != "payment":
+            return None
+        return data.get("sub")
+    except JWTError:
+        return None
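
The point of the `purpose` claim is that a token minted for the payment flow cannot be replayed as a login token. The sketch below shows the same idea with the standard library only; it is not the JWT implementation used above. It swaps in a plain HMAC-SHA256 signed payload so that it runs without third-party packages, and `make_token` / `read_token` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def make_token(user_id: str, purpose: str, secret: bytes,
               ttl_sec: int = 1800, now: Optional[float] = None) -> str:
    """Sign {"sub", "purpose", "exp"} with HMAC-SHA256 (30-minute default lifetime)."""
    issued = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "purpose": purpose, "exp": issued + ttl_sec}).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(signature)


def read_token(token: str, expected_purpose: str, secret: bytes,
               now: Optional[float] = None) -> Optional[str]:
    """Return the user id, or None if the token is malformed, forged, expired, or for another purpose."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if claims.get("exp", 0) < current or claims.get("purpose") != expected_purpose:
        return None
    return claims.get("sub")
```

For example, `read_token(make_token("u1", "payment", b"k"), "payment", b"k")` yields the user id, while asking for any other purpose yields `None`.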
--- a/backend/app/main.py
+++ b/backend/app/main.py
@@ -15,6 +15,8 @@ from app.modules.ref_audios.router import router as ref_audios_router
 from app.modules.ai.router import router as ai_router
 from app.modules.tools.router import router as tools_router
 from app.modules.assets.router import router as assets_router
+from app.modules.generated_audios.router import router as generated_audios_router
+from app.modules.payment.router import router as payment_router
 from loguru import logger
 import os

@@ -124,6 +126,8 @@ app.include_router(ref_audios_router, prefix="/api/ref-audios", tags=["RefAudios
 app.include_router(ai_router)  # /api/ai
 app.include_router(tools_router, prefix="/api/tools", tags=["Tools"])
 app.include_router(assets_router, prefix="/api/assets", tags=["Assets"])
+app.include_router(generated_audios_router, prefix="/api/generated-audios", tags=["GeneratedAudios"])
+app.include_router(payment_router)  # /api/payment


@app.on_event("startup")
--- a/backend/app/modules/ai/router.py
+++ b/backend/app/modules/ai/router.py
@@ -2,6 +2,8 @@
 AI 相关 API 路由
 """

+from typing import Optional
+
 from fastapi import APIRouter, HTTPException
 from pydantic import BaseModel
 from loguru import logger
@@ -21,9 +23,43 @@ class GenerateMetaRequest(BaseModel):
 class GenerateMetaResponse(BaseModel):
    """生成标题标签响应"""
    title: str
+    secondary_title: str = ""
    tags: list[str]


+class RewriteRequest(BaseModel):
+    """改写请求"""
+    text: str
+    custom_prompt: Optional[str] = None
+
+
+class TranslateRequest(BaseModel):
+    """翻译请求"""
+    text: str
+    target_lang: str
+
+
+@router.post("/translate")
+async def translate_text(req: TranslateRequest):
+    """
+    AI 翻译文案
+
+    将文案翻译为指定目标语言
+    """
+    if not req.text or not req.text.strip():
+        raise HTTPException(status_code=400, detail="文案不能为空")
+    if not req.target_lang or not req.target_lang.strip():
+        raise HTTPException(status_code=400, detail="目标语言不能为空")
+
+    try:
+        logger.info(f"Translating text to {req.target_lang}: {req.text[:50]}...")
+        translated = await glm_service.translate_text(req.text.strip(), req.target_lang.strip())
+        return success_response({"translated_text": translated})
+    except Exception as e:
+        logger.error(f"Translate failed: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+
@router.post("/generate-meta")
 async def generate_meta(req: GenerateMetaRequest):
    """
@@ -39,8 +75,24 @@ async def generate_meta(req: GenerateMetaRequest):
        result = await glm_service.generate_title_tags(req.text)
        return success_response(GenerateMetaResponse(
            title=result.get("title", ""),
+            secondary_title=result.get("secondary_title", ""),
            tags=result.get("tags", [])
        ).model_dump())
    except Exception as e:
        logger.error(f"Generate meta failed: {e}")
        raise HTTPException(status_code=500, detail=str(e))
+
+
+@router.post("/rewrite")
+async def rewrite_script(req: RewriteRequest):
+    """AI 改写文案"""
+    if not req.text or not req.text.strip():
+        raise HTTPException(status_code=400, detail="文案不能为空")
+
+    try:
+        logger.info(f"Rewriting text: {req.text[:50]}...")
+        rewritten = await glm_service.rewrite_script(req.text.strip(), req.custom_prompt)
+        return success_response({"rewritten_text": rewritten})
+    except Exception as e:
+        logger.error(f"Rewrite failed: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
--- a/backend/app/modules/auth/router.py
+++ b/backend/app/modules/auth/router.py
@@ -1,7 +1,8 @@
 """
 认证 API：注册、登录、登出、修改密码
 """
-from fastapi import APIRouter, HTTPException, Response, status, Request
+from fastapi import APIRouter, HTTPException, Response, status, Request, Depends
+from fastapi.responses import JSONResponse
 from pydantic import BaseModel, field_validator
 from app.core.security import (
    get_password_hash,
@@ -10,10 +11,19 @@ from app.core.security import (
    generate_session_token,
    set_auth_cookie,
    clear_auth_cookie,
-    decode_access_token
+    decode_access_token,
+    create_payment_token,
 )
 from app.repositories.sessions import create_session, delete_sessions
-from app.repositories.users import create_user, get_user_by_id, get_user_by_phone, user_exists_by_phone, update_user
+from app.repositories.users import (
+    create_user,
+    get_user_by_id,
+    get_user_by_phone,
+    user_exists_by_phone,
+    update_user,
+    deactivate_user_if_expired,
+)
+from app.core.deps import get_current_user
 from app.core.response import success_response
 from loguru import logger
 from typing import Optional, Any, cast
@@ -130,21 +140,25 @@ async def login(request: LoginRequest, response: Response):
                detail="手机号或密码错误"
            )
        
-        # 检查是否激活
-        if not user["is_active"]:
-            raise HTTPException(
-                status_code=status.HTTP_403_FORBIDDEN,
-                detail="账号未激活，请等待管理员审核"
-            )
+        # 过期自动停用（注意：只更新 DB，不修改内存中的 user 字典）
+        expired = deactivate_user_if_expired(user)
+        if expired:
+            delete_sessions(user["id"])

-        # 检查授权是否过期
-        if user.get("expires_at"):
-            from datetime import datetime, timezone
-            expires_at = datetime.fromisoformat(user["expires_at"].replace("Z", "+00:00"))
-            if datetime.now(timezone.utc) > expires_at:
-                raise HTTPException(
-                    status_code=status.HTTP_403_FORBIDDEN,
-                    detail="授权已过期，请联系管理员续期"
+        # 过期 或 未激活（新注册）→ 返回付费指引
+        if expired or not user["is_active"]:
+            payment_token = create_payment_token(user["id"])
+            return JSONResponse(
+                status_code=403,
+                content={
+                    "success": False,
+                    "message": "请付费开通会员",
+                    "code": 403,
+                    "data": {
+                        "reason": "PAYMENT_REQUIRED",
+                        "payment_token": payment_token,
+                    }
+                }
            )
        
        # 生成新的 session_token (后踢前)
@@ -259,30 +273,8 @@ async def change_password(request: ChangePasswordRequest, req: Request, response


@router.get("/me")
-async def get_me(request: Request):
+async def get_me(user: dict = Depends(get_current_user)):
    """获取当前用户信息"""
-    # 从 Cookie 获取用户
-    token = request.cookies.get("access_token")
-    if not token:
-        raise HTTPException(
-            status_code=status.HTTP_401_UNAUTHORIZED,
-            detail="未登录"
-        )
-    
-    token_data = decode_access_token(token)
-    if not token_data:
-        raise HTTPException(
-            status_code=status.HTTP_401_UNAUTHORIZED,
-            detail="Token 无效"
-        )
-    
-    user = cast(dict[str, Any], get_user_by_id(token_data.user_id) or {})
-    if not user:
-        raise HTTPException(
-            status_code=status.HTTP_401_UNAUTHORIZED,
-            detail="用户不存在"
-        )
-    
    return success_response(UserResponse(
        id=user["id"],
        phone=user["phone"],
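
With this change, `login` answers inactive or expired accounts with HTTP 403 and a body whose `data.reason` is `PAYMENT_REQUIRED`. A caller can tell that case apart from other 403 responses roughly as below. This is a sketch under the assumption that the body has already been parsed into a dict; `payment_token_from_login` is a hypothetical helper, not part of the codebase.

```python
from typing import Any, Optional


def payment_token_from_login(status_code: int, body: dict[str, Any]) -> Optional[str]:
    """Return the payment_token when login asks for payment, otherwise None."""
    if status_code != 403:
        return None
    data = body.get("data") or {}
    if data.get("reason") != "PAYMENT_REQUIRED":
        return None  # some other 403, e.g. a deactivated account
    return data.get("payment_token")
```

A client would then carry the returned token to the payment page instead of showing a generic "forbidden" error.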
--- a/backend/app/modules/generated_audios/__init__.py
+++ b/backend/app/modules/generated_audios/__init__.py
--- a/backend/app/modules/generated_audios/router.py
+++ b/backend/app/modules/generated_audios/router.py
@@ -0,0 +1,77 @@
+"""生成配音 API"""
+from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException
+import uuid
+from loguru import logger
+
+from app.core.deps import get_current_user
+from app.core.response import success_response
+from app.modules.videos.task_store import create_task, get_task
+from app.modules.generated_audios.schemas import GenerateAudioRequest, RenameAudioRequest
+from app.modules.generated_audios import service
+
+router = APIRouter()
+
+
+@router.post("/generate")
+async def generate_audio(
+    req: GenerateAudioRequest,
+    background_tasks: BackgroundTasks,
+    user: dict = Depends(get_current_user),
+):
+    """异步生成配音（返回 task_id）"""
+    task_id = str(uuid.uuid4())
+    create_task(task_id, user["id"])
+    background_tasks.add_task(service.generate_audio_task, task_id, req, user["id"])
+    return success_response({"task_id": task_id})
+
+
+@router.get("/tasks/{task_id}")
+async def get_audio_task(task_id: str, user: dict = Depends(get_current_user)):
+    """轮询配音生成进度"""
+    task = get_task(task_id)
+    if task.get("status") != "not_found" and task.get("user_id") != user["id"]:
+        return success_response({"status": "not_found"})
+    return success_response(task)
+
+
+@router.get("")
+async def list_audios(user: dict = Depends(get_current_user)):
+    """列出当前用户所有已生成配音"""
+    try:
+        result = await service.list_generated_audios(user["id"])
+        return success_response(result)
+    except Exception as e:
+        logger.error(f"列出配音失败: {e}")
+        raise HTTPException(status_code=500, detail=f"获取列表失败: {str(e)}")
+
+
+@router.delete("/{audio_id:path}")
+async def delete_audio(audio_id: str, user: dict = Depends(get_current_user)):
+    """删除配音"""
+    try:
+        await service.delete_generated_audio(audio_id, user["id"])
+        return success_response(message="删除成功")
+    except PermissionError as e:
+        raise HTTPException(status_code=403, detail=str(e))
+    except Exception as e:
+        logger.error(f"删除配音失败: {e}")
+        raise HTTPException(status_code=500, detail=f"删除失败: {str(e)}")
+
+
+@router.put("/{audio_id:path}")
+async def rename_audio(
+    audio_id: str,
+    request: RenameAudioRequest,
+    user: dict = Depends(get_current_user),
+):
+    """重命名配音"""
+    try:
+        result = await service.rename_generated_audio(audio_id, request.new_name, user["id"])
+        return success_response(result, message="重命名成功")
+    except PermissionError as e:
+        raise HTTPException(status_code=403, detail=str(e))
+    except ValueError as e:
+        raise HTTPException(status_code=400, detail=str(e))
+    except Exception as e:
+        logger.error(f"重命名配音失败: {e}")
+        raise HTTPException(status_code=500, detail=f"重命名失败: {str(e)}")
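
`POST /generate` returns a `task_id` immediately, and `GET /tasks/{task_id}` is meant to be polled until the task finishes. The loop a client might run can be sketched as follows. `poll_task` and its parameters are hypothetical; `fetch_task` stands in for whatever HTTP call retrieves the task dict, and the terminal statuses are the ones visible in this diff (`completed`, `failed`, `not_found`).

```python
import time
from typing import Any, Callable

TERMINAL_STATUSES = {"completed", "failed", "not_found"}


def poll_task(
    fetch_task: Callable[[], dict[str, Any]],
    interval_sec: float = 1.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch_task until it reports a terminal status or attempts run out."""
    for attempt in range(max_attempts):
        task = fetch_task()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if attempt < max_attempts - 1:
            sleep(interval_sec)  # wait before the next poll
    raise TimeoutError(f"task still running after {max_attempts} polls")
```

Injecting `sleep` keeps the loop testable without real delays. Note that the router masks tasks owned by another user as `not_found`, so a client cannot distinguish "unknown" from "not yours".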
--- a/backend/app/modules/generated_audios/schemas.py
+++ b/backend/app/modules/generated_audios/schemas.py
@@ -0,0 +1,31 @@
+from pydantic import BaseModel
+from typing import Optional, List
+
+
+class GenerateAudioRequest(BaseModel):
+    text: str
+    tts_mode: str = "edgetts"
+    voice: str = "zh-CN-YunxiNeural"
+    ref_audio_id: Optional[str] = None
+    ref_text: Optional[str] = None
+    language: str = "zh-CN"
+    speed: float = 1.0
+
+
+class RenameAudioRequest(BaseModel):
+    new_name: str
+
+
+class GeneratedAudioItem(BaseModel):
+    id: str
+    name: str
+    path: str
+    duration_sec: float
+    text: str
+    tts_mode: str
+    language: str
+    created_at: int
+
+
+class GeneratedAudioListResponse(BaseModel):
+    items: List[GeneratedAudioItem]
--- a/backend/app/modules/generated_audios/service.py
+++ b/backend/app/modules/generated_audios/service.py
@@ -0,0 +1,264 @@
+"""生成配音 - 业务逻辑"""
+import re
+import json
+import time
+import asyncio
+import subprocess
+import tempfile
+import os
+from pathlib import Path
+from typing import Optional
+
+import httpx
+from loguru import logger
+
+from app.services.storage import storage_service
+from app.services.tts_service import TTSService
+from app.services.voice_clone_service import voice_clone_service
+from app.modules.videos.task_store import task_store
+from app.modules.generated_audios.schemas import (
+    GenerateAudioRequest,
+    GeneratedAudioItem,
+    GeneratedAudioListResponse,
+)
+
+BUCKET = "generated-audios"
+
+
+def _locale_to_tts_lang(locale: str) -> str:
+    mapping = {"zh": "Chinese", "en": "English"}
+    return mapping.get(locale.split("-")[0], "Auto")
+
+
+def _get_audio_duration(file_path: str) -> float:
+    try:
+        result = subprocess.run(
+            ['ffprobe', '-v', 'quiet', '-show_entries', 'format=duration',
+             '-of', 'csv=p=0', file_path],
+            capture_output=True, text=True, timeout=10
+        )
+        return float(result.stdout.strip())
+    except Exception as e:
+        logger.warning(f"获取音频时长失败: {e}")
+        return 0.0
+
+
+async def generate_audio_task(task_id: str, req: GenerateAudioRequest, user_id: str):
+    """后台任务：生成配音"""
+    try:
+        task_store.update(task_id, {"status": "processing", "progress": 10, "message": "正在生成配音..."})
+
+        with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp:
+            audio_path = tmp.name
+
+        try:
+            if req.tts_mode == "voiceclone":
+                if not req.ref_audio_id or not req.ref_text:
+                    raise ValueError("声音克隆模式需要提供参考音频和参考文字")
+
+                task_store.update(task_id, {"progress": 20, "message": "正在下载参考音频..."})
+
+                with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp_ref:
+                    ref_local = tmp_ref.name
+
+                try:
+                    ref_url = await storage_service.get_signed_url(
+                        bucket="ref-audios", path=req.ref_audio_id
+                    )
+                    timeout = httpx.Timeout(None)
+                    async with httpx.AsyncClient(timeout=timeout) as client:
+                        async with client.stream("GET", ref_url) as resp:
+                            resp.raise_for_status()
+                            with open(ref_local, "wb") as f:
+                                async for chunk in resp.aiter_bytes():
+                                    f.write(chunk)
+
+                    task_store.update(task_id, {"progress": 40, "message": "正在克隆声音..."})
+                    await voice_clone_service.generate_audio(
+                        text=req.text,
+                        ref_audio_path=ref_local,
+                        ref_text=req.ref_text,
+                        output_path=audio_path,
+                        language=_locale_to_tts_lang(req.language),
+                        speed=req.speed,
+                    )
+                finally:
+                    if os.path.exists(ref_local):
+                        os.unlink(ref_local)
+            else:
+                task_store.update(task_id, {"progress": 30, "message": "正在生成语音..."})
+                tts = TTSService()
+                await tts.generate_audio(req.text, req.voice, audio_path)
+
+            task_store.update(task_id, {"progress": 70, "message": "正在上传配音..."})
+
+            duration = _get_audio_duration(audio_path)
+            timestamp = int(time.time())
+            audio_id = f"{user_id}/{timestamp}_audio.wav"
+            meta_id = f"{user_id}/{timestamp}_audio.json"
+
+            # 生成 display_name
+            now = time.strftime("%Y%m%d_%H%M", time.localtime(timestamp))
+            display_name = f"配音_{now}"
+
+            with open(audio_path, "rb") as f:
+                wav_data = f.read()
+
+            await storage_service.upload_file(
+                bucket=BUCKET, path=audio_id,
+                file_data=wav_data, content_type="audio/wav",
+            )
+
+            metadata = {
+                "display_name": display_name,
+                "text": req.text,
+                "tts_mode": req.tts_mode,
+                "voice": req.voice if req.tts_mode == "edgetts" else None,
+                "ref_audio_id": req.ref_audio_id,
+                "language": req.language,
+                "duration_sec": duration,
+                "created_at": timestamp,
+            }
+            await storage_service.upload_file(
+                bucket=BUCKET, path=meta_id,
+                file_data=json.dumps(metadata, ensure_ascii=False).encode("utf-8"),
+                content_type="application/json",
+            )
+
+            signed_url = await storage_service.get_signed_url(BUCKET, audio_id)
+
+            task_store.update(task_id, {
+                "status": "completed",
+                "progress": 100,
+                "message": f"配音生成完成 ({duration:.1f}s)",
+                "output": {
+                    "audio_id": audio_id,
+                    "name": display_name,
+                    "path": signed_url,
+                    "duration_sec": duration,
+                    "text": req.text,
+                    "tts_mode": req.tts_mode,
+                    "language": req.language,
+                    "created_at": timestamp,
+                },
+            })
+        finally:
+            if os.path.exists(audio_path):
+                os.unlink(audio_path)
+
+    except Exception as e:
+        import traceback
+        task_store.update(task_id, {
+            "status": "failed",
+            "message": f"配音生成失败: {str(e)}",
+            "error": traceback.format_exc(),
+        })
+        logger.error(f"Generate audio failed: {e}")
+
+
+async def list_generated_audios(user_id: str) -> dict:
+    """List all generated voiceovers that belong to the user."""
+    files = await storage_service.list_files(BUCKET, user_id)
+    wav_files = [f for f in files if f.get("name", "").endswith("_audio.wav")]
+
+    if not wav_files:
+        return GeneratedAudioListResponse(items=[]).model_dump()
+
+    async def fetch_info(f):
+        name = f.get("name", "")
+        storage_path = f"{user_id}/{name}"
+        meta_name = name.replace("_audio.wav", "_audio.json")
+        meta_path = f"{user_id}/{meta_name}"
+
+        display_name = name
+        text = ""
+        tts_mode = "edgetts"
+        language = "zh-CN"
+        duration_sec = 0.0
+        created_at = 0
+
+        try:
+            meta_url = await storage_service.get_signed_url(BUCKET, meta_path)
+            async with httpx.AsyncClient(timeout=5.0) as client:
+                resp = await client.get(meta_url)
+                if resp.status_code == 200:
+                    meta = resp.json()
+                    display_name = meta.get("display_name", name)
+                    text = meta.get("text", "")
+                    tts_mode = meta.get("tts_mode", "edgetts")
+                    language = meta.get("language", "zh-CN")
+                    duration_sec = meta.get("duration_sec", 0.0)
+                    created_at = meta.get("created_at", 0)
+        except Exception as e:
+            logger.debug(f"读取配音 metadata 失败: {e}")
+            try:
+                created_at = int(name.split("_")[0])
+            except ValueError:
+                pass
+
+        signed_url = await storage_service.get_signed_url(BUCKET, storage_path)
+
+        return GeneratedAudioItem(
+            id=storage_path,
+            name=display_name,
+            path=signed_url,
+            duration_sec=duration_sec,
+            text=text,
+            tts_mode=tts_mode,
+            language=language,
+            created_at=created_at,
+        )
+
+    items = await asyncio.gather(*[fetch_info(f) for f in wav_files])
+    items = sorted(items, key=lambda x: x.created_at, reverse=True)
+    return GeneratedAudioListResponse(items=items).model_dump()
+
+
+async def delete_generated_audio(audio_id: str, user_id: str) -> None:
+    if not audio_id.startswith(f"{user_id}/"):
+        raise PermissionError("无权删除此文件")
+
+    await storage_service.delete_file(BUCKET, audio_id)
+    meta_path = audio_id.replace("_audio.wav", "_audio.json")
+    try:
+        await storage_service.delete_file(BUCKET, meta_path)
+    except Exception:
+        pass
+
+
+async def rename_generated_audio(audio_id: str, new_name: str, user_id: str) -> dict:
+    if not audio_id.startswith(f"{user_id}/"):
+        raise PermissionError("无权修改此文件")
+
+    new_name = new_name.strip()
+    if not new_name:
+        raise ValueError("新名称不能为空")
+
+    meta_path = audio_id.replace("_audio.wav", "_audio.json")
+    try:
+        meta_url = await storage_service.get_signed_url(BUCKET, meta_path)
+        async with httpx.AsyncClient() as client:
+            resp = await client.get(meta_url)
+            if resp.status_code == 200:
+                metadata = resp.json()
+            else:
+                raise Exception(f"Failed to fetch metadata: {resp.status_code}")
+    except Exception as e:
+        logger.warning(f"无法读取配音元数据: {e}, 将创建新的")
+        metadata = {
+            "display_name": new_name,
+            "text": "",
+            "tts_mode": "edgetts",
+            "language": "zh-CN",
+            "duration_sec": 0.0,
+            "created_at": int(time.time()),
+        }
+
+    metadata["display_name"] = new_name
+    await storage_service.upload_file(
+        bucket=BUCKET,
+        path=meta_path,
+        file_data=json.dumps(metadata, ensure_ascii=False).encode("utf-8"),
+        content_type="application/json",
+    )
+    return {"name": new_name}
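Review note on the ownership check and metadata path used by `delete_generated_audio` and `rename_generated_audio`. The following is a minimal sketch, assuming keys always have the shape `{user_id}/{file}` as built in `generate_audio_task`. The helper names `is_owned_by` and `meta_path_for` are hypothetical and do not appear in the diff. Whether the storage backend resolves `..` segments is not shown here, so the stricter check is a precaution rather than a confirmed fix.

```python
AUDIO_SUFFIX = "_audio.wav"
META_SUFFIX = "_audio.json"


def is_owned_by(audio_id: str, user_id: str) -> bool:
    """Return True only for keys of the exact form '<user_id>/<file>'."""
    parts = audio_id.split("/")
    # Reject extra segments, empty file names, and "." / ".." components.
    return (
        len(parts) == 2
        and parts[0] == user_id
        and parts[1] not in ("", ".", "..")
    )


def meta_path_for(audio_id: str) -> str:
    """Swap only the trailing suffix; str.replace would touch every occurrence."""
    if not audio_id.endswith(AUDIO_SUFFIX):
        raise ValueError(f"not a generated-audio key: {audio_id}")
    return audio_id.removesuffix(AUDIO_SUFFIX) + META_SUFFIX
```

Compared with `startswith(f"{user_id}/")`, this version also rejects keys such as `u1/../u2/123_audio.wav`, and the metadata path stays correct even if the suffix text appears earlier in the key.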
--- a/backend/app/modules/payment/__init__.py
+++ b/backend/app/modules/payment/__init__.py
--- a/backend/app/modules/payment/router.py
+++ b/backend/app/modules/payment/router.py
@@ -0,0 +1,52 @@
+"""
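+Payment API: create orders, receive asynchronous notifications, query order status.
+
+Follows BACKEND_DEV.md: the router only validates parameters, calls the service, and returns the unified response.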
+"""
+from fastapi import APIRouter, HTTPException, Request, status
+from fastapi.responses import PlainTextResponse
+
+from app.core.response import success_response
+from .schemas import CreateOrderRequest, CreateOrderResponse, OrderStatusResponse
+from . import service
+
+router = APIRouter(prefix="/api/payment", tags=["支付"])
+
+
+@router.post("/create-order")
+async def create_payment_order(request: CreateOrderRequest):
+    """Create an Alipay PC website payment order and return the cashier URL."""
+    try:
+        result = service.create_payment_order(request.payment_token)
+    except ValueError as e:
+        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail=str(e))
+    except RuntimeError as e:
+        raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e))
+
+    return success_response(
+        CreateOrderResponse(**result).model_dump()
+    )
+
+
+@router.post("/notify")
+async def payment_notify(request: Request):
+    """
+    Alipay asynchronous notification callback.
+
+    Must return the plain text "success" (not JSON); otherwise Alipay keeps re-sending the notification.
+    """
+    form_data = await request.form()
+    verified = service.handle_payment_notify(dict(form_data))
+    return PlainTextResponse("success" if verified else "fail")
+
+
+@router.get("/status/{out_trade_no}")
+async def check_payment_status(out_trade_no: str):
+    """Query the payment status of an order (polled by the frontend)."""
+    order_status = service.get_order_status(out_trade_no)
+    if order_status is None:
+        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="订单不存在")
+
+    return success_response(
+        OrderStatusResponse(status=order_status).model_dump()
+    )
--- a/backend/app/modules/payment/schemas.py
+++ b/backend/app/modules/payment/schemas.py
@@ -0,0 +1,15 @@
+from pydantic import BaseModel
+
+
+class CreateOrderRequest(BaseModel):
+    payment_token: str
+
+
+class CreateOrderResponse(BaseModel):
+    pay_url: str
+    out_trade_no: str
+    amount: float
+
+
+class OrderStatusResponse(BaseModel):
+    status: str
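Review note on `amount: float`. Alipay's `total_amount` is documented as a yuan value with two decimal places, and binary floats can carry rounding noise. The following is a minimal sketch, assuming the amount is normalized before it reaches the SDK call. `format_total_amount` is a hypothetical helper, not part of the diff, and whether the SDK wrapper used here prefers a string or a number is not confirmed by this section.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_total_amount(value) -> str:
    """Normalize a price to a two-decimal string, rejecting non-positive amounts."""
    # Going through str() avoids inheriting the float's binary representation.
    amount = Decimal(str(value)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    if amount <= 0:
        raise ValueError("amount must be positive")
    return str(amount)
```

For example, `format_total_amount(0.1 + 0.2)` yields `"0.30"` rather than a long float tail.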
--- a/backend/app/modules/payment/service.py
+++ b/backend/app/modules/payment/service.py
@@ -0,0 +1,137 @@
+"""
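+Payment business service.
+
+Responsibilities: wrap the Alipay SDK, create orders, handle payment notifications, query order status.
+Follows the "thin router + thick service" principle from BACKEND_DEV.md.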
+"""
+from datetime import datetime, timezone, timedelta
+import uuid
+
+from alipay import AliPay
+from loguru import logger
+
+from app.core.config import settings
+from app.core.security import decode_payment_token
+from app.repositories.orders import create_order, get_order_by_trade_no, update_order_status
+from app.repositories.users import update_user
+
+# Alipay gateway URLs
+ALIPAY_GATEWAY = "https://openapi.alipay.com/gateway.do"
+ALIPAY_GATEWAY_SANDBOX = "https://openapi-sandbox.dl.alipaydev.com/gateway.do"
+
+
+def _get_alipay_client() -> AliPay:
+    """Build an Alipay client on demand (a new client is created, and both key files are re-read, on every call)."""
+    return AliPay(
+        appid=settings.ALIPAY_APP_ID,
+        app_notify_url=settings.ALIPAY_NOTIFY_URL,
+        app_private_key_string=open(settings.ALIPAY_PRIVATE_KEY_PATH).read(),
+        alipay_public_key_string=open(settings.ALIPAY_PUBLIC_KEY_PATH).read(),
+        sign_type="RSA2",
+        debug=settings.ALIPAY_SANDBOX,
+    )
+
+
+def _create_page_pay_url(out_trade_no: str, amount: float, subject: str) -> str | None:
+    """Call alipay.trade.page.pay and return the Alipay cashier URL."""
+    client = _get_alipay_client()
+    order_string = client.api_alipay_trade_page_pay(
+        subject=subject,
+        out_trade_no=out_trade_no,
+        total_amount=amount,
+        return_url=settings.ALIPAY_RETURN_URL,
+    )
+    if not order_string:
+        logger.error(f"电脑网站支付下单失败: {out_trade_no}")
+        return None
+
+    gateway = ALIPAY_GATEWAY_SANDBOX if settings.ALIPAY_SANDBOX else ALIPAY_GATEWAY
+    pay_url = f"{gateway}?{order_string}"
+    logger.info(f"电脑网站支付下单成功: {out_trade_no}")
+    return pay_url
+
+
+def _verify_signature(data: dict, signature: str) -> bool:
+    """Verify the signature of an Alipay asynchronous notification."""
+    client = _get_alipay_client()
+    return client.verify(data, signature)
+
+
+def create_payment_order(payment_token: str) -> dict:
+    """
+    Full flow for creating a payment order.
+
+    Returns: {"pay_url": str, "out_trade_no": str, "amount": float}
+    Raises: ValueError (invalid token), RuntimeError (API failure)
+    """
+    user_id = decode_payment_token(payment_token)
+    if not user_id:
+        raise ValueError("付费凭证无效或已过期，请重新登录")
+
+    out_trade_no = f"VG_{int(datetime.now().timestamp())}_{uuid.uuid4().hex[:8]}"
+    amount = settings.PAYMENT_AMOUNT
+
+    create_order(user_id, out_trade_no, amount)
+
+    pay_url = _create_page_pay_url(out_trade_no, amount, "IPAgent 会员开通")
+    if not pay_url:
+        raise RuntimeError("创建支付订单失败，请稍后重试")
+
+    logger.info(f"用户 {user_id} 创建支付订单: {out_trade_no}")
+
+    return {"pay_url": pay_url, "out_trade_no": out_trade_no, "amount": amount}
+
+
+def handle_payment_notify(form_data: dict) -> bool:
+    """
+    Full flow for handling an Alipay asynchronous notification.
+
+    Returns: True = signature verified, False = signature verification failed
+    """
+    data = dict(form_data)
+
+    signature = data.pop("sign", "")
+    data.pop("sign_type", None)
+
+    if not _verify_signature(data, signature):
+        logger.warning(f"支付宝通知验签失败: {data.get('out_trade_no')}")
+        return False
+
+    out_trade_no = data.get("out_trade_no", "")
+    trade_status = data.get("trade_status", "")
+    trade_no = data.get("trade_no", "")
+
+    logger.info(f"收到支付宝通知: {out_trade_no}, status={trade_status}, trade_no={trade_no}")
+
+    if trade_status not in ("TRADE_SUCCESS", "TRADE_FINISHED"):
+        return True
+
+    order = get_order_by_trade_no(out_trade_no)
+    if not order:
+        logger.warning(f"订单不存在: {out_trade_no}")
+        return True
+
+    if order["status"] == "paid":
+        logger.info(f"订单已处理过: {out_trade_no}")
+        return True
+
+    update_order_status(out_trade_no, "paid", trade_no)
+
+    user_id = order["user_id"]
+    expires_at = (datetime.now(timezone.utc) + timedelta(days=settings.PAYMENT_EXPIRE_DAYS)).isoformat()
+    update_user(user_id, {
+        "is_active": True,
+        "role": "user",
+        "expires_at": expires_at,
+    })
+
+    logger.success(f"用户 {user_id} 支付成功，已激活，有效期至 {expires_at}")
+    return True
+
+
+def get_order_status(out_trade_no: str) -> str | None:
+    """Query the payment status of an order."""
+    order = get_order_by_trade_no(out_trade_no)
+    if not order:
+        return None
+    return order["status"]
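Review note on `handle_payment_notify`. After signature verification, the flow branches on `trade_status`, order existence, and whether the order is already paid. The following is a sketch of that decision as a pure function, which may be easier to unit-test. The `app_id` and `total_amount` comparisons are additions that the diff does not perform; Alipay's notification guidance recommends checks of this kind. The `order["amount"]` field and the function name are assumptions.

```python
from decimal import Decimal
from typing import Optional

PAID_STATUSES = ("TRADE_SUCCESS", "TRADE_FINISHED")


def decide_notify_action(notify: dict, order: Optional[dict], expected_app_id: str) -> str:
    """Return 'ignore', 'already_paid', 'mismatch', or 'activate' for a verified notification."""
    if notify.get("trade_status") not in PAID_STATUSES:
        return "ignore"
    if order is None:
        return "ignore"
    if order["status"] == "paid":
        # Repeated notification for the same order: acknowledge, do nothing.
        return "already_paid"
    # Hypothetical extra checks, not present in the diff.
    if notify.get("app_id") != expected_app_id:
        return "mismatch"
    if Decimal(str(notify.get("total_amount", "0"))) != Decimal(str(order["amount"])):
        return "mismatch"
    return "activate"
```

Only the `activate` result would lead to `update_order_status` and `update_user`; every other result can still answer `success` so that Alipay stops retrying.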
--- a/backend/app/modules/ref_audios/router.py
+++ b/backend/app/modules/ref_audios/router.py
@@ -13,7 +13,7 @@ router = APIRouter()
 @router.post("")
 async def upload_ref_audio(
    file: UploadFile = File(...),
-    ref_text: str = Form(...),
+    ref_text: str = Form(""),
    user: dict = Depends(get_current_user)
 ):
    """上传参考音频"""
@@ -68,3 +68,21 @@ async def rename_ref_audio(
    except Exception as e:
        logger.error(f"重命名失败: {e}")
        raise HTTPException(status_code=500, detail=f"重命名失败: {str(e)}")
+
+
+@router.post("/{audio_id:path}/retranscribe")
+async def retranscribe_ref_audio(
+    audio_id: str,
+    user: dict = Depends(get_current_user)
+):
+    """Re-transcribe the text content of a reference audio file."""
+    try:
+        result = await service.retranscribe_ref_audio(audio_id, user["id"])
+        return success_response(result, message="识别完成")
+    except PermissionError as e:
+        raise HTTPException(status_code=403, detail=str(e))
+    except ValueError as e:
+        raise HTTPException(status_code=400, detail=str(e))
+    except Exception as e:
+        logger.error(f"重新识别失败: {e}")
+        raise HTTPException(status_code=500, detail=f"识别失败: {str(e)}")
--- a/backend/app/modules/ref_audios/service.py
+++ b/backend/app/modules/ref_audios/service.py
@@ -2,9 +2,11 @@ import re
 import os
 import time
 import json
+import hashlib
 import asyncio
 import subprocess
 import tempfile
+import unicodedata
 from pathlib import Path
 from typing import Optional

@@ -19,8 +21,16 @@ BUCKET_REF_AUDIOS = "ref-audios"


 def sanitize_filename(filename: str) -> str:
-    """清理文件名，移除特殊字符"""
-    safe_name = re.sub(r'[<>:"/\\|?*\s]', '_', filename)
+    """Sanitize a filename for use as a Storage key (keeps only ASCII-safe characters)."""
+    normalized = unicodedata.normalize("NFKD", filename)
+    ascii_name = normalized.encode("ascii", "ignore").decode("ascii")
+    safe_name = re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_name).strip("._-")
+
+    # Names made up only of Chinese characters, emoji, etc. end up empty; fall back to a stable hash to avoid InvalidKey
+    if not safe_name:
+        digest = hashlib.md5(filename.encode("utf-8")).hexdigest()[:12]
+        safe_name = f"audio_{digest}"
+
    if len(safe_name) > 50:
        ext = Path(safe_name).suffix
        safe_name = safe_name[:50 - len(ext)] + ext
@@ -41,16 +51,40 @@ def _get_audio_duration(file_path: str) -> float:
        return 0.0


-def _convert_to_wav(input_path: str, output_path: str) -> bool:
-    """将音频转换为 WAV 格式 (16kHz, mono)"""
+def _find_silence_cut_point(file_path: str, max_duration: float) -> float:
+    """Find a silence point at or before max_duration to cut at; fall back to max_duration if none is found."""
    try:
-        subprocess.run([
-            'ffmpeg', '-y', '-i', input_path,
-            '-ar', '16000',
-            '-ac', '1',
-            '-acodec', 'pcm_s16le',
-            output_path
-        ], capture_output=True, timeout=60, check=True)
+        # Use silencedetect to find all silent segments (threshold -30 dB, minimum 0.3 s)
+        result = subprocess.run(
+            ['ffmpeg', '-i', file_path, '-af',
+             'silencedetect=noise=-30dB:d=0.3', '-f', 'null', '-'],
+            capture_output=True, text=True, timeout=30
+        )
+        # Parse the silence_end timestamps
+        import re as _re
+        ends = [float(m) for m in _re.findall(r'silence_end:\s*([\d.]+)', result.stderr)]
+        # Pick the last silence end at or before max_duration (at least 3 s in)
+        candidates = [t for t in ends if 3.0 <= t <= max_duration]
+        if candidates:
+            cut = candidates[-1]
+            logger.info(f"Found silence cut point at {cut:.1f}s (max={max_duration}s)")
+            return cut
+    except Exception as e:
+        logger.warning(f"Silence detection failed: {e}")
+    return max_duration
+
+
+def _convert_to_wav(input_path: str, output_path: str, max_duration: float = 0) -> bool:
+    """Convert audio to WAV (16 kHz, mono); optionally keep only the first max_duration seconds and fade out."""
+    try:
+        cmd = ['ffmpeg', '-y', '-i', input_path]
+        if max_duration > 0:
+            cmd += ['-t', str(max_duration)]
+            # Fade out over the last 0.1 s to avoid a click at the cut
+            fade_start = max(0, max_duration - 0.1)
+            cmd += ['-af', f'afade=t=out:st={fade_start}:d=0.1']
+        cmd += ['-ar', '16000', '-ac', '1', '-acodec', 'pcm_s16le', output_path]
+        subprocess.run(cmd, capture_output=True, timeout=60, check=True)
        return True
    except Exception as e:
        logger.error(f"音频转换失败: {e}")
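Review note on `_find_silence_cut_point`. The selection logic can be separated from the `ffmpeg` call so it is testable without audio files. The following is a minimal sketch under that assumption. `pick_cut_point` is a hypothetical helper name, and the sample text imitates the usual `silencedetect` log lines rather than being captured from a real run.

```python
import re

SILENCE_END_RE = re.compile(r"silence_end:\s*([\d.]+)")


def pick_cut_point(stderr: str, max_duration: float, min_duration: float = 3.0) -> float:
    """Return the latest silence end within [min_duration, max_duration], else max_duration."""
    ends = [float(m) for m in SILENCE_END_RE.findall(stderr)]
    candidates = [t for t in ends if min_duration <= t <= max_duration]
    # max() does not rely on the log lines being in chronological order.
    return max(candidates) if candidates else max_duration


# Illustrative sample only.
SAMPLE = """\
[silencedetect @ 0x55d0] silence_start: 1.2
[silencedetect @ 0x55d0] silence_end: 1.9 | silence_duration: 0.7
[silencedetect @ 0x55d0] silence_start: 7.85
[silencedetect @ 0x55d0] silence_end: 8.4 | silence_duration: 0.55
[silencedetect @ 0x55d0] silence_end: 12.1 | silence_duration: 0.4
"""
```

With the sample above and `max_duration=10.0`, the function selects 8.4; with `max_duration=5.0` no candidate qualifies and it falls back to 5.0.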
@@ -67,9 +101,6 @@ async def upload_ref_audio(file, ref_text: str, user_id: str) -> dict:
    if ext not in ALLOWED_AUDIO_EXTENSIONS:
        raise ValueError(f"不支持的音频格式: {ext}。支持的格式: {', '.join(ALLOWED_AUDIO_EXTENSIONS)}")

-    if not ref_text or len(ref_text.strip()) < 2:
-        raise ValueError("参考文字不能为空")
-
    # 创建临时文件
    with tempfile.NamedTemporaryFile(delete=False, suffix=ext) as tmp_input:
        content = await file.read()
@@ -86,8 +117,31 @@ async def upload_ref_audio(file, ref_text: str, user_id: str) -> dict:
        duration = _get_audio_duration(tmp_wav_path)
        if duration < 1.0:
            raise ValueError("音频时长过短，至少需要 1 秒")
-        if duration > 60.0:
-            raise ValueError("音频时长过长，最多 60 秒")
+
+        # Audio longer than 10 s is trimmed automatically at a silence point (CosyVoice works best with 3-10 s)
+        MAX_REF_DURATION = 10.0
+        if duration > MAX_REF_DURATION:
+            cut_point = _find_silence_cut_point(tmp_wav_path, MAX_REF_DURATION)
+            logger.info(f"Ref audio {duration:.1f}s > {MAX_REF_DURATION}s, trimming at {cut_point:.1f}s")
+            trimmed_path = tmp_input_path + "_trimmed.wav"
+            if not _convert_to_wav(tmp_wav_path, trimmed_path, max_duration=cut_point):
+                raise RuntimeError("音频截取失败")
+            os.unlink(tmp_wav_path)
+            tmp_wav_path = trimmed_path
+            duration = _get_audio_duration(tmp_wav_path)
+
+        # Transcribe the reference audio automatically
+        try:
+            from app.services.whisper_service import whisper_service
+            transcribed = await whisper_service.transcribe(tmp_wav_path)
+            if transcribed.strip():
+                ref_text = transcribed.strip()
+                logger.info(f"Auto-transcribed ref audio: {ref_text[:50]}...")
+        except Exception as e:
+            logger.warning(f"Auto-transcribe failed: {e}")
+
+        if not ref_text or not ref_text.strip():
+            raise ValueError("无法识别音频内容，请确保音频包含清晰的语音")

        # 检查重名
        existing_files = await storage_service.list_files(BUCKET_REF_AUDIOS, user_id)
@@ -267,3 +321,85 @@ async def rename_ref_audio(audio_id: str, new_name: str, user_id: str) -> dict:
    )

    return {"name": new_name}
+
+
+async def retranscribe_ref_audio(audio_id: str, user_id: str) -> dict:
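+    """Re-transcribe the ref_text of a reference audio file; if it is longer than 10 s, trim it at a silence point and re-upload it (used to migrate old data)."""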
+    if not audio_id.startswith(f"{user_id}/"):
+        raise PermissionError("无权修改此文件")
+
+    # Download the audio to a temporary file
+    audio_url = await storage_service.get_signed_url(BUCKET_REF_AUDIOS, audio_id)
+    tmp_wav_path = None
+    trimmed_path = None
+    try:
+        with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp:
+            tmp_wav_path = tmp.name
+            timeout = httpx.Timeout(60.0)
+            async with httpx.AsyncClient(timeout=timeout) as client:
+                async with client.stream("GET", audio_url) as resp:
+                    resp.raise_for_status()
+                    async for chunk in resp.aiter_bytes():
+                        tmp.write(chunk)
+
+        # If longer than 10 s, trim at a silence point (at most 10 s) and re-upload the audio
+        MAX_REF_DURATION = 10.0
+        duration = _get_audio_duration(tmp_wav_path)
+        transcribe_path = tmp_wav_path
+        need_reupload = False
+
+        if duration > MAX_REF_DURATION:
+            cut_point = _find_silence_cut_point(tmp_wav_path, MAX_REF_DURATION)
+            logger.info(f"Retranscribe: trimming {audio_id} from {duration:.1f}s at {cut_point:.1f}s")
+            trimmed_path = tmp_wav_path + "_trimmed.wav"
+            if _convert_to_wav(tmp_wav_path, trimmed_path, max_duration=cut_point):
+                transcribe_path = trimmed_path
+                duration = _get_audio_duration(trimmed_path)
+                need_reupload = True
+
+        # Whisper transcription
+        from app.services.whisper_service import whisper_service
+        transcribed = await whisper_service.transcribe(transcribe_path)
+        if not transcribed or not transcribed.strip():
+            raise ValueError("无法识别音频内容")
+
+        ref_text = transcribed.strip()
+        logger.info(f"Re-transcribed ref audio {audio_id}: {ref_text[:50]}...")
+
+        # Re-upload the trimmed audio, overwriting the original file
+        if need_reupload and trimmed_path:
+            with open(trimmed_path, "rb") as f:
+                await storage_service.upload_file(
+                    bucket=BUCKET_REF_AUDIOS, path=audio_id,
+                    file_data=f.read(), content_type="audio/wav",
+                )
+            logger.info(f"Re-uploaded trimmed audio: {audio_id} ({duration:.1f}s)")
+
+        # Update the metadata
+        metadata_path = audio_id.replace(".wav", ".json")
+        try:
+            meta_url = await storage_service.get_signed_url(BUCKET_REF_AUDIOS, metadata_path)
+            async with httpx.AsyncClient(timeout=5.0) as client:
+                resp = await client.get(meta_url)
+                if resp.status_code == 200:
+                    metadata = resp.json()
+                else:
+                    raise Exception(f"status {resp.status_code}")
+        except Exception:
+            metadata = {}
+
+        metadata["ref_text"] = ref_text
+        metadata["duration_sec"] = duration
+        await storage_service.upload_file(
+            bucket=BUCKET_REF_AUDIOS,
+            path=metadata_path,
+            file_data=json.dumps(metadata, ensure_ascii=False).encode('utf-8'),
+            content_type="application/json"
+        )
+
+        return {"ref_text": ref_text, "duration_sec": duration}
+    finally:
+        if tmp_wav_path and os.path.exists(tmp_wav_path):
+            os.unlink(tmp_wav_path)
+        if trimmed_path and os.path.exists(trimmed_path):
+            os.unlink(trimmed_path)
--- a/backend/app/modules/tools/router.py
+++ b/backend/app/modules/tools/router.py
@@ -13,11 +13,12 @@ router = APIRouter()
 async def extract_script_tool(
    file: Optional[UploadFile] = File(None),
    url: Optional[str] = Form(None),
-    rewrite: bool = Form(True)
+    rewrite: bool = Form(True),
+    custom_prompt: Optional[str] = Form(None)
 ):
    """独立文案提取工具"""
    try:
-        result = await service.extract_script(file=file, url=url, rewrite=rewrite)
+        result = await service.extract_script(file=file, url=url, rewrite=rewrite, custom_prompt=custom_prompt)
        return success_response(result)
    except ValueError as e:
        raise HTTPException(400, str(e))
--- a/backend/app/modules/tools/service.py
+++ b/backend/app/modules/tools/service.py
@@ -17,9 +17,9 @@ from app.services.whisper_service import whisper_service
 from app.services.glm_service import glm_service


-async def extract_script(file=None, url: Optional[str] = None, rewrite: bool = True) -> dict:
+async def extract_script(file=None, url: Optional[str] = None, rewrite: bool = True, custom_prompt: Optional[str] = None) -> dict:
    """
-    文案提取：上传文件或视频链接 -> Whisper 转写 -> (可选) GLM 洗稿
+    Script extraction: uploaded file or video link -> Whisper transcription -> (optional) GLM rewrite
    """
    if not file and not url:
        raise ValueError("必须提供文件或视频链接")
@@ -63,11 +63,15 @@ async def extract_script(file=None, url: Optional[str] = None, rewrite: bool = T
        # 2. 提取文案 (Whisper)
        script = await whisper_service.transcribe(str(audio_path))

-        # 3. AI 洗稿 (GLM)
+        # 3. AI rewrite (GLM): on failure, fall back to returning the original script
        rewritten = None
        if rewrite and script and len(script.strip()) > 0:
            logger.info("Rewriting script...")
-            rewritten = await glm_service.rewrite_script(script)
+            try:
+                rewritten = await glm_service.rewrite_script(script, custom_prompt)
+            except Exception as e:
+                logger.warning(f"GLM rewrite failed, returning original script: {e}")
+                rewritten = None

        return {
            "original_script": script,
@@ -156,125 +160,120 @@ def _download_yt_dlp(url_value: str, temp_dir: Path, timestamp: int) -> Path:
        'quiet': True,
        'no_warnings': True,
        'http_headers': {
-            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
+            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
            'Referer': 'https://www.douyin.com/',
        }
    }

-    with yt_dlp.YoutubeDL() as ydl_raw:
-        ydl: Any = ydl_raw
-        ydl.params.update(ydl_opts)
+    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        info = ydl.extract_info(url_value, download=True)
        if 'requested_downloads' in info:
            downloaded_file = info['requested_downloads'][0]['filepath']
        else:
            ext = info.get('ext', 'mp4')
-            id = info.get('id')
-            downloaded_file = str(temp_dir / f"tool_download_{timestamp}_{id}.{ext}")
+            vid_id = info.get('id')
+            downloaded_file = str(temp_dir / f"tool_download_{timestamp}_{vid_id}.{ext}")

        return Path(downloaded_file)


 async def _download_douyin_manual(url: str, temp_dir: Path, timestamp: int) -> Optional[Path]:
-    """手动下载抖音视频 (Fallback)"""
-    logger.info(f"[SuperIPAgent] Starting download for: {url}")
+    """Manually download a Douyin video (fallback): resolve the playback URL via the mobile share page."""
+    logger.info(f"[douyin-fallback] Starting download for: {url}")

    try:
+        # 1. Resolve the short link and extract the video ID
        headers = {
-            "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
+            "user-agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15"
        }

        async with httpx.AsyncClient(follow_redirects=True, timeout=10.0) as client:
            resp = await client.get(url, headers=headers)
            final_url = str(resp.url)

-        logger.info(f"[SuperIPAgent] Final URL: {final_url}")
+        logger.info(f"[douyin-fallback] Final URL: {final_url}")

-        modal_id = None
+        video_id = None
        match = re.search(r'/video/(\d+)', final_url)
        if match:
-            modal_id = match.group(1)
+            video_id = match.group(1)

-        if not modal_id:
-            logger.error("[SuperIPAgent] Could not extract modal_id")
+        if not video_id:
+            logger.error("[douyin-fallback] Could not extract video_id")
            return None

-        logger.info(f"[SuperIPAgent] Extracted modal_id: {modal_id}")
+        logger.info(f"[douyin-fallback] Extracted video_id: {video_id}")

-        target_url = f"https://www.douyin.com/user/MS4wLjABAAAAN_s_hups7LD0N4qnrM3o2gI0vuG3pozNaEolz2_py3cHTTrpVr1Z4dukFD9SOlwY?from_tab_name=main&modal_id={modal_id}"
+        # 2. Obtain a fresh ttwid
+        ttwid = ""
+        try:
+            async with httpx.AsyncClient(timeout=10.0) as client:
+                ttwid_resp = await client.post(
+                    "https://ttwid.bytedance.com/ttwid/union/register/",
+                    json={
+                        "region": "cn", "aid": 6383, "needFid": False,
+                        "service": "www.douyin.com",
+                        "migrate_info": {"ticket": "", "source": "node"},
+                        "cbUrlProtocol": "https", "union": True,
+                    }
+                )
+                ttwid = ttwid_resp.cookies.get("ttwid", "")
+                logger.info(f"[douyin-fallback] Got fresh ttwid (len={len(ttwid)})")
+        except Exception as e:
+            logger.warning(f"[douyin-fallback] Failed to get ttwid: {e}")

-        from app.core.config import settings
-        if not settings.DOUYIN_COOKIE:
-            logger.warning("[SuperIPAgent] DOUYIN_COOKIE 未配置，视频下载可能失败")
-
-        headers_with_cookie = {
-            "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
-            "cookie": settings.DOUYIN_COOKIE,
-            "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
+        # 3. Fetch the mobile share page to extract the playback URL
+        page_headers = {
+            "user-agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15",
+            "cookie": f"ttwid={ttwid}" if ttwid else "",
        }

-        logger.info(f"[SuperIPAgent] Requesting page with Cookie...")
+        async with httpx.AsyncClient(follow_redirects=True, timeout=15.0) as client:
+            page_resp = await client.get(
+                f"https://m.douyin.com/share/video/{video_id}",
+                headers=page_headers,
+            )

-        async with httpx.AsyncClient(timeout=10.0) as client:
-            response = await client.get(target_url, headers=headers_with_cookie)
+        page_text = page_resp.text
+        logger.info(f"[douyin-fallback] Mobile page length: {len(page_text)}")

-        content_match = re.findall(r'<script id="RENDER_DATA" type="application/json">(.*?)</script>', response.text)
-        if not content_match:
-            if "SSR_HYDRATED_DATA" in response.text:
-                content_match = re.findall(r'<script id="SSR_HYDRATED_DATA" type="application/json">(.*?)</script>', response.text)
-
-        if not content_match:
-            logger.error(f"[SuperIPAgent] Could not find RENDER_DATA in page (len={len(response.text)})")
-            return None
-
-        content = unquote(content_match[0])
-        try:
-            data = json.loads(content)
-        except:
-            logger.error("[SuperIPAgent] JSON decode failed")
-            return None
-
-        video_url = None
-        try:
-            if "app" in data and "videoDetail" in data["app"]:
-                info = data["app"]["videoDetail"]["video"]
-                if "bitRateList" in info and info["bitRateList"]:
-                    video_url = info["bitRateList"][0]["playAddr"][0]["src"]
-                elif "playAddr" in info and info["playAddr"]:
-                    video_url = info["playAddr"][0]["src"]
-        except Exception as e:
-            logger.error(f"[SuperIPAgent] Path extraction failed: {e}")
-
-        if not video_url:
-            logger.error("[SuperIPAgent] No video_url found")
+        # 4. Extract play_addr
+        addr_match = re.search(
+            r'"play_addr":\{"uri":"([^"]+)","url_list":\["([^"]+)"',
+            page_text,
+        )
+        if not addr_match:
+            logger.error("[douyin-fallback] Could not find play_addr in mobile page")
            return None

+        video_url = addr_match.group(2).replace(r"\u002F", "/")
        if video_url.startswith("//"):
            video_url = "https:" + video_url

-        logger.info(f"[SuperIPAgent] Found video URL: {video_url[:50]}...")
+        logger.info(f"[douyin-fallback] Found video URL: {video_url[:80]}...")

+        # 5. Download the video
        temp_path = temp_dir / f"douyin_manual_{timestamp}.mp4"
        download_headers = {
-            'Referer': 'https://www.douyin.com/',
-            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
+            "Referer": "https://www.douyin.com/",
+            "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15",
        }

-        async with httpx.AsyncClient(timeout=60.0) as client:
+        async with httpx.AsyncClient(timeout=120.0, follow_redirects=True) as client:
            async with client.stream("GET", video_url, headers=download_headers) as dl_resp:
                if dl_resp.status_code == 200:
-                    with open(temp_path, 'wb') as f:
+                    with open(temp_path, "wb") as f:
                        async for chunk in dl_resp.aiter_bytes(chunk_size=8192):
                            f.write(chunk)

-                    logger.info(f"[SuperIPAgent] Downloaded successfully: {temp_path}")
+                    logger.info(f"[douyin-fallback] Downloaded successfully: {temp_path}")
                    return temp_path
                else:
-                    logger.error(f"[SuperIPAgent] Download failed: {dl_resp.status_code}")
+                    logger.error(f"[douyin-fallback] Download failed: {dl_resp.status_code}")
                    return None

    except Exception as e:
-        logger.error(f"[SuperIPAgent] Logic failed: {e}")
+        logger.error(f"[douyin-fallback] Logic failed: {e}")
        return None
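Review note on step 4 of `_download_douyin_manual`. The `play_addr` extraction and URL normalization can be isolated into a pure function, which makes the regex easier to test when the page markup changes. The following is a sketch using the same pattern as the diff. `extract_play_url` is a hypothetical helper name, and the sample page text is fabricated for illustration, not captured from a real response.

```python
import re
from typing import Optional

PLAY_ADDR_RE = re.compile(r'"play_addr":\{"uri":"([^"]+)","url_list":\["([^"]+)"')


def extract_play_url(page_text: str) -> Optional[str]:
    """Return the first play_addr URL with escaped slashes decoded, or None."""
    match = PLAY_ADDR_RE.search(page_text)
    if not match:
        return None
    # The page embeds "/" as the literal six characters \u002F.
    url = match.group(2).replace(r"\u002F", "/")
    if url.startswith("//"):
        url = "https:" + url
    return url


# Fabricated sample for illustration only.
SAMPLE_PAGE = (
    r'{"play_addr":{"uri":"v0example","url_list":'
    r'["\u002F\u002Fexample.invalid\u002Fplay\u002F?video_id=v0example"]}}'
)
```

A `None` result maps to the "Could not find play_addr in mobile page" branch in the diff.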


--- a/backend/app/modules/videos/schemas.py
+++ b/backend/app/modules/videos/schemas.py
@@ -1,21 +1,40 @@
 from pydantic import BaseModel
-from typing import Optional
+from typing import Optional, List, Literal
+
+
+class CustomAssignment(BaseModel):
+    material_path: str
+    start: float           # start on the audio timeline
+    end: float             # end on the audio timeline
+    source_start: float = 0.0  # trim start within the source video
+    source_end: Optional[float] = None  # trim end within the source video (optional)


 class GenerateRequest(BaseModel):
    text: str
    voice: str = "zh-CN-YunxiNeural"
    material_path: str
+    material_paths: Optional[List[str]] = None
    tts_mode: str = "edgetts"
    ref_audio_id: Optional[str] = None
    ref_text: Optional[str] = None
+    language: str = "zh-CN"
+    generated_audio_id: Optional[str] = None  # ID of a pre-generated voice-over (inline TTS is skipped when set)
    title: Optional[str] = None
+    title_display_mode: Literal["short", "persistent"] = "short"
+    title_duration: float = 4.0
    enable_subtitles: bool = True
    subtitle_style_id: Optional[str] = None
    title_style_id: Optional[str] = None
+    secondary_title: Optional[str] = None
+    secondary_title_style_id: Optional[str] = None
+    secondary_title_font_size: Optional[int] = None
+    secondary_title_top_margin: Optional[int] = None
    subtitle_font_size: Optional[int] = None
    title_font_size: Optional[int] = None
    title_top_margin: Optional[int] = None
    subtitle_bottom_margin: Optional[int] = None
    bgm_id: Optional[str] = None
    bgm_volume: Optional[float] = 0.2
+    custom_assignments: Optional[List[CustomAssignment]] = None
+    output_aspect_ratio: Literal["9:16", "16:9"] = "9:16"
--- a/backend/app/modules/videos/workflow.py
+++ b/backend/app/modules/videos/workflow.py
@@ -1,5 +1,6 @@
-from typing import Optional, Any
+from typing import Optional, Any, List
 from pathlib import Path
+import asyncio
 import time
 import traceback
 import httpx
@@ -24,6 +25,17 @@ from .schemas import GenerateRequest
 from .task_store import task_store


+def _locale_to_whisper_lang(locale: str) -> str:
+    """'en-US' → 'en', 'zh-CN' → 'zh'"""
+    return locale.split("-")[0]
+
+
+def _locale_to_tts_lang(locale: str) -> str:
+    """'zh-CN' → 'Chinese', 'en-US' → 'English', anything else → 'Auto'"""
+    mapping = {"zh": "Chinese", "en": "English"}
+    return mapping.get(locale.split("-")[0], "Auto")
+
+
 _lipsync_service: Optional[LipSyncService] = None
 _lipsync_ready: Optional[bool] = None
 _lipsync_last_check: float = 0
@@ -79,26 +91,162 @@ def _update_task(task_id: str, **updates: Any) -> None:
    task_store.update(task_id, updates)


+# ── Multi-material helpers ──
+
+
+def _split_equal(segments: List[dict], material_paths: List[str]) -> List[dict]:
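+    """Split the audio duration evenly across the materials, snapping each cut to the nearest Whisper word boundary.
+
+    Args:
+        segments: segment list produced by Whisper; each segment carries words (word-level timestamps)
+        material_paths: list of material paths
+
+    Returns:
+        [{"material_path": "...", "start": 0.0, "end": 5.2, "index": 0}, ...]
+    """
+    # Flatten every Whisper word entry into one list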
+    all_chars: List[dict] = []
+    for seg in segments:
+        for w in seg.get("words", []):
+            all_chars.append(w)
+
+    n = len(material_paths)
+
+    if not all_chars or n == 0:
+        return [{"material_path": material_paths[0] if material_paths else "",
+                 "start": 0.0, "end": 99999.0, "index": 0}]
+
+    # The material count must not exceed the word count, otherwise boundaries would repeat
+    if n > len(all_chars):
+        logger.warning(f"[MultiMat] 素材数({n}) > 字符数({len(all_chars)})，裁剪为 {len(all_chars)}")
+        n = len(all_chars)
+
+    total_start = all_chars[0]["start"]
+    total_end = all_chars[-1]["end"]
+    seg_dur = (total_end - total_start) / n
+
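+    # Compute the N-1 split points, snapped to the nearest word boundary
+    boundaries = [0]  # the first segment starts at word 0
+    for i in range(1, n):
+        target_time = total_start + i * seg_dur
+        # Find the word whose start time is closest to target_time
+        best_idx = boundaries[-1] + 1  # always at least one past the previous boundary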
+        best_diff = float("inf")
+        for j in range(boundaries[-1] + 1, len(all_chars)):
+            diff = abs(all_chars[j]["start"] - target_time)
+            if diff < best_diff:
+                best_diff = diff
+                best_idx = j
+            elif diff > best_diff:
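+                break  # start times increase, so stop once the difference starts growing
+        boundaries.append(min(best_idx, len(all_chars) - 1))
+    boundaries.append(len(all_chars))  # the last segment runs to the end
+
+    # Build the assignments from the boundaries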
+    assignments: List[dict] = []
+    for i in range(n):
+        s_idx = boundaries[i]
+        e_idx = boundaries[i + 1]
+        if s_idx >= len(all_chars) or s_idx >= e_idx:
+            continue
+        assignments.append({
+            "material_path": material_paths[i],
+            "start": all_chars[s_idx]["start"],
+            "end": all_chars[e_idx - 1]["end"],
+            "text": "".join(c["word"] for c in all_chars[s_idx:e_idx]),
+            "index": len(assignments),
+        })
+
+    if not assignments:
+        return [{"material_path": material_paths[0], "start": 0.0, "end": 99999.0, "index": 0}]
+
+    logger.info(f"[MultiMat] 均分 {len(all_chars)} 字为 {len(assignments)} 段")
+    for a in assignments:
+        dur = a["end"] - a["start"]
+        logger.info(f"  段{a['index']}: [{a['start']:.2f}-{a['end']:.2f}s] ({dur:.1f}s) {a['text'][:20]}")
+
+    return assignments
+
+
 async def process_video_generation(task_id: str, req: GenerateRequest, user_id: str):
    temp_files = []
    try:
        start_time = time.time()
+
+        # ── Determine the material list ──
+        material_paths: List[str] = []
+        if req.custom_assignments and len(req.custom_assignments) > 1:
+            material_paths = [a.material_path for a in req.custom_assignments if a.material_path]
+        elif req.material_paths and len(req.material_paths) > 1:
+            material_paths = req.material_paths
+        else:
+            material_paths = [req.material_path]
+
+        is_multi = len(material_paths) > 1
+        target_resolution = (1080, 1920) if req.output_aspect_ratio == "9:16" else (1920, 1080)
+
+        logger.info(
+            f"[Render] 输出画面比例: {req.output_aspect_ratio}, "
+            f"目标分辨率: {target_resolution[0]}x{target_resolution[1]}"
+        )
+
        _update_task(task_id, status="processing", progress=5, message="正在下载素材...")

        temp_dir = settings.UPLOAD_DIR / "temp"
        temp_dir.mkdir(parents=True, exist_ok=True)
+        video = VideoService()
+        input_material_path: Optional[Path] = None

+        # Single-material mode: download the main material
+        if not is_multi:
            input_material_path = temp_dir / f"{task_id}_input.mp4"
            temp_files.append(input_material_path)
+            await _download_material(material_paths[0], input_material_path)

-        await _download_material(req.material_path, input_material_path)
+            # Normalize rotation metadata (e.g. iPhone MOV 1920x1080 + rotation=-90)
+            normalized_input_path = temp_dir / f"{task_id}_input_norm.mp4"
+            normalized_result = video.normalize_orientation(
+                str(input_material_path),
+                str(normalized_input_path),
+            )
+            if normalized_result != str(input_material_path):
+                temp_files.append(normalized_input_path)
+                input_material_path = normalized_input_path

        _update_task(task_id, message="正在生成语音...", progress=10)

        audio_path = temp_dir / f"{task_id}_audio.wav"
        temp_files.append(audio_path)

-        if req.tts_mode == "voiceclone":
+        if req.generated_audio_id:
+            # New flow: use the pre-generated voice-over
+            _update_task(task_id, message="正在下载配音...", progress=12)
+            audio_url = await storage_service.get_signed_url(
+                bucket="generated-audios",
+                path=req.generated_audio_id,
+            )
+            await _download_material(audio_url, audio_path)
+
+            # Read the language from the audio metadata
+            meta_path = req.generated_audio_id.replace("_audio.wav", "_audio.json")
+            try:
+                meta_url = await storage_service.get_signed_url(
+                    bucket="generated-audios", path=meta_path,
+                )
+                # httpx is already imported at module level, so no local alias is needed
+                async with httpx.AsyncClient(timeout=5.0) as client:
+                    resp = await client.get(meta_url)
+                    if resp.status_code == 200:
+                        meta = resp.json()
+                        req.language = meta.get("language", req.language)
+                        # Always override the script with the metadata text so subtitles match the voice-over language
+                        meta_text = meta.get("text", "")
+                        if meta_text:
+                            req.text = meta_text
+            except Exception as e:
+                logger.warning(f"读取配音元数据失败: {e}")
+
+        elif req.tts_mode == "voiceclone":
            if not req.ref_audio_id or not req.ref_text:
                raise ValueError("声音克隆模式需要提供参考音频和参考文字")

@@ -113,13 +261,13 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:
            )
            await _download_material(ref_audio_url, ref_audio_local)

-            _update_task(task_id, message="正在克隆声音 (Qwen3-TTS)...")
+            _update_task(task_id, message="正在克隆声音...")
            await voice_clone_service.generate_audio(
                text=req.text,
                ref_audio_path=str(ref_audio_local),
                ref_text=req.ref_text,
                output_path=str(audio_path),
-                language="Chinese"
+                language=_locale_to_tts_lang(req.language)
            )
        else:
            _update_task(task_id, message="正在生成语音 (EdgeTTS)...")
@@ -128,14 +276,292 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:

        tts_time = time.time() - start_time
        print(f"[Pipeline] TTS completed in {tts_time:.1f}s")
-        _update_task(task_id, progress=25)
-
-        _update_task(task_id, message="正在合成唇形 (LatentSync)...", progress=30)

        lipsync = _get_lipsync_service()
        lipsync_video_path = temp_dir / f"{task_id}_lipsync.mp4"
        temp_files.append(lipsync_video_path)

+        captions_path = None
+
+        if is_multi:
+            # ══════════════════════════════════════
+            # Multi-material pipeline
+            # ══════════════════════════════════════
+            _update_task(task_id, progress=12, message="正在分配素材...")
+
+            if req.custom_assignments and len(req.custom_assignments) == len(material_paths):
+                # User-defined assignments: skip the Whisper-based equal split
+                assignments = [
+                    {
+                        "material_path": a.material_path,
+                        "start": a.start,
+                        "end": a.end,
+                        "source_start": a.source_start,
+                        "source_end": a.source_end,
+                        "index": i,
+                    }
+                    for i, a in enumerate(req.custom_assignments)
+                ]
+                # Whisper is still needed to generate subtitles (when enabled)
+                captions_path = temp_dir / f"{task_id}_captions.json"
+                temp_files.append(captions_path)
+                if req.enable_subtitles:
+                    _update_task(task_id, message="正在生成字幕 (Whisper)...")
+                    try:
+                        await whisper_service.align(
+                            audio_path=str(audio_path),
+                            text=req.text,
+                            output_path=str(captions_path),
+                            language=_locale_to_whisper_lang(req.language),
+                            original_text=req.text,
+                        )
+                        print(f"[Pipeline] Whisper alignment completed (custom assignments)")
+                    except Exception as e:
+                        logger.warning(f"Whisper alignment failed: {e}")
+                        captions_path = None
+                else:
+                    captions_path = None
+            elif req.custom_assignments:
+                logger.warning(
+                    f"[MultiMat] custom_assignments 数量({len(req.custom_assignments)})"
+                    f" 与素材数量({len(material_paths)})不一致，回退自动分配"
+                )
+
+                # Original logic: Whisper → _split_equal
+                _update_task(task_id, message="正在生成字幕 (Whisper)...")
+
+                captions_path = temp_dir / f"{task_id}_captions.json"
+                temp_files.append(captions_path)
+
+                try:
+                    captions_data = await whisper_service.align(
+                        audio_path=str(audio_path),
+                        text=req.text,
+                        output_path=str(captions_path),
+                        language=_locale_to_whisper_lang(req.language),
+                        original_text=req.text,
+                    )
+                    print(f"[Pipeline] Whisper alignment completed (multi-material)")
+                except Exception as e:
+                    logger.warning(f"Whisper alignment failed: {e}")
+                    captions_data = None
+                    captions_path = None
+
+                _update_task(task_id, progress=15, message="正在分配素材...")
+
+                if captions_data and captions_data.get("segments"):
+                    assignments = _split_equal(captions_data["segments"], material_paths)
+                else:
+                    # Whisper failed → split evenly by duration (no word alignment needed)
+                    logger.warning("[MultiMat] Whisper 无数据，按时长均分")
+                    audio_dur = video._get_duration(str(audio_path))
+                    if audio_dur <= 0:
+                        audio_dur = 30.0  # safe fallback
+                    seg_dur = audio_dur / len(material_paths)
+                    assignments = [
+                        {"material_path": material_paths[i], "start": i * seg_dur,
+                         "end": (i + 1) * seg_dur, "index": i}
+                        for i in range(len(material_paths))
+                    ]
+
+            else:
+                # Original logic: Whisper → _split_equal
+                _update_task(task_id, message="正在生成字幕 (Whisper)...")
+
+                captions_path = temp_dir / f"{task_id}_captions.json"
+                temp_files.append(captions_path)
+
+                try:
+                    captions_data = await whisper_service.align(
+                        audio_path=str(audio_path),
+                        text=req.text,
+                        output_path=str(captions_path),
+                        language=_locale_to_whisper_lang(req.language),
+                        original_text=req.text,
+                    )
+                    print(f"[Pipeline] Whisper alignment completed (multi-material)")
+                except Exception as e:
+                    logger.warning(f"Whisper alignment failed: {e}")
+                    captions_data = None
+                    captions_path = None
+
+                _update_task(task_id, progress=15, message="正在分配素材...")
+
+                if captions_data and captions_data.get("segments"):
+                    assignments = _split_equal(captions_data["segments"], material_paths)
+                else:
+                    # Whisper failed → split evenly by duration (no word alignment needed)
+                    logger.warning("[MultiMat] Whisper 无数据，按时长均分")
+                    audio_dur = video._get_duration(str(audio_path))
+                    if audio_dur <= 0:
+                        audio_dur = 30.0  # safe fallback
+                    seg_dur = audio_dur / len(material_paths)
+                    assignments = [
+                        {"material_path": material_paths[i], "start": i * seg_dur,
+                         "end": (i + 1) * seg_dur, "index": i}
+                        for i in range(len(material_paths))
+                    ]
+
+            # Stretch the segments to cover the whole audio: the first starts at 0, the last ends at the audio end
+            audio_duration = video._get_duration(str(audio_path))
+            if assignments and audio_duration > 0:
+                assignments[0]["start"] = 0.0
+                assignments[-1]["end"] = audio_duration
+
+            num_segments = len(assignments)
+            print(f"[Pipeline] Multi-material: {num_segments} segments, {len(material_paths)} materials")
+
+            if num_segments == 0:
+                raise RuntimeError("Multi-material: no valid segments after splitting")
+
+            lipsync_start = time.time()
+
+            # ── Step 1: download all materials in parallel and detect their resolutions ──
+            material_locals: List[Path] = []
+            resolutions = []
+
+            async def _download_and_normalize(i: int, assignment: dict):
+                """Download one material and normalize its orientation."""
+                material_local = temp_dir / f"{task_id}_material_{i}.mp4"
+                temp_files.append(material_local)
+                await _download_material(assignment["material_path"], material_local)
+
+                normalized_material = temp_dir / f"{task_id}_material_{i}_norm.mp4"
+                loop = asyncio.get_running_loop()
+                normalized_result = await loop.run_in_executor(
+                    None,
+                    video.normalize_orientation,
+                    str(material_local),
+                    str(normalized_material),
+                )
+                if normalized_result != str(material_local):
+                    temp_files.append(normalized_material)
+                    material_local = normalized_material
+
+                res = video.get_resolution(str(material_local))
+                return material_local, res
+
+            download_tasks = [
+                _download_and_normalize(i, assignment)
+                for i, assignment in enumerate(assignments)
+            ]
+            download_results = await asyncio.gather(*download_tasks)
+            for local, res in download_results:
+                material_locals.append(local)
+                resolutions.append(res)
+
+            # Unify the resolution to the aspect ratio chosen by the user
+            base_res = target_resolution
+            need_scale = any(r != base_res for r in resolutions)
+            if need_scale:
+                logger.info(f"[MultiMat] 素材分辨率不一致，统一到 {base_res[0]}x{base_res[1]}")
+
+            # ── Step 2: trim each material to its segment duration, in parallel ──
+            prepared_segments: List[Path] = [None] * num_segments
+
+            async def _prepare_one_segment(i: int, assignment: dict):
+                """Trim/loop one material to its segment duration."""
+                seg_dur = assignment["end"] - assignment["start"]
+                prepared_path = temp_dir / f"{task_id}_prepared_{i}.mp4"
+                temp_files.append(prepared_path)
+
+                loop = asyncio.get_running_loop()
+                await loop.run_in_executor(
+                    None,
+                    video.prepare_segment,
+                    str(material_locals[i]),
+                    seg_dur,
+                    str(prepared_path),
+                    base_res,
+                    assignment.get("source_start", 0.0),
+                    assignment.get("source_end"),
+                    25,
+                )
+                return i, prepared_path
+
+            _update_task(
+                task_id,
+                progress=15,
+                message=f"正在并行准备 {num_segments} 个素材片段..."
+            )
+
+            prepare_tasks = [
+                _prepare_one_segment(i, assignment)
+                for i, assignment in enumerate(assignments)
+            ]
+            prepare_results = await asyncio.gather(*prepare_tasks)
+            for i, path in prepare_results:
+                prepared_segments[i] = path
+
+            # ── Step 3: concatenate all material segments ──
+            _update_task(task_id, progress=50, message="正在拼接素材片段...")
+            concat_path = temp_dir / f"{task_id}_concat.mp4"
+            temp_files.append(concat_path)
+            video.concat_videos(
+                [str(p) for p in prepared_segments],
+                str(concat_path),
+                target_fps=25,
+            )
+
+            # ── Step 4: a single LatentSync inference pass ──
+            is_ready = await _check_lipsync_ready()
+
+            if is_ready:
+                _update_task(task_id, progress=55, message="正在合成唇形 (LatentSync)...")
+                print(f"[LipSync] Multi-material: single LatentSync on concatenated video")
+                try:
+                    await lipsync.generate(str(concat_path), str(audio_path), str(lipsync_video_path))
+                except Exception as e:
+                    logger.warning(f"[LipSync] Failed, fallback to concat without lipsync: {e}")
+                    import shutil
+                    shutil.copy(str(concat_path), str(lipsync_video_path))
+            else:
+                print(f"[LipSync] Not ready, using concatenated video without lipsync")
+                import shutil
+                shutil.copy(str(concat_path), str(lipsync_video_path))
+
+            lipsync_time = time.time() - lipsync_start
+            print(f"[Pipeline] Multi-material prepare + concat + LipSync completed in {lipsync_time:.1f}s")
+            _update_task(task_id, progress=80)
+
+            # If the user disabled subtitles, clear captions_path (Whisper was only used for sentence splitting)
+            if not req.enable_subtitles:
+                captions_path = None
+
+        else:
+            # ══════════════════════════════════════
+            # Single-material pipeline (original logic)
+            # ══════════════════════════════════════
+
+            if input_material_path is None:
+                raise RuntimeError("单素材流程缺少输入素材")
+
+            # Single material: scale to the target resolution for the chosen aspect ratio and apply source_start
+            single_source_start = 0.0
+            single_source_end = None
+            if req.custom_assignments and len(req.custom_assignments) == 1:
+                single_source_start = req.custom_assignments[0].source_start
+                single_source_end = req.custom_assignments[0].source_end
+
+            _update_task(task_id, progress=20, message="正在准备素材片段...")
+            audio_dur = video._get_duration(str(audio_path))
+            if audio_dur <= 0:
+                audio_dur = 30.0
+            prepared_single_path = temp_dir / f"{task_id}_prepared_single.mp4"
+            temp_files.append(prepared_single_path)
+            video.prepare_segment(
+                str(input_material_path),
+                audio_dur,
+                str(prepared_single_path),
+                target_resolution=target_resolution,
+                source_start=single_source_start,
+                source_end=single_source_end,
+            )
+            input_material_path = prepared_single_path
+
+            _update_task(task_id, progress=25)
+            _update_task(task_id, message="正在合成唇形 (LatentSync)...", progress=30)
+
            lipsync_start = time.time()
            is_ready = await _check_lipsync_ready()

@@ -153,58 +579,100 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:
            print(f"[Pipeline] LipSync completed in {lipsync_time:.1f}s")
            _update_task(task_id, progress=80)

-        captions_path = None
-        if req.enable_subtitles:
-            _update_task(task_id, message="正在生成字幕 (Whisper)...", progress=82)
-
-            captions_path = temp_dir / f"{task_id}_captions.json"
-            temp_files.append(captions_path)
-
-            try:
-                await whisper_service.align(
-                    audio_path=str(audio_path),
-                    text=req.text,
-                    output_path=str(captions_path)
-                )
-                print(f"[Pipeline] Whisper alignment completed")
-            except Exception as e:
-                logger.warning(f"Whisper alignment failed, skipping subtitles: {e}")
+            # Single-material mode: Whisper is deferred so it can run in parallel with BGM below
+            if not req.enable_subtitles:
                captions_path = None

        _update_task(task_id, progress=85)

-        video = VideoService()
+        # ── Whisper subtitles + BGM mixing in parallel (both depend only on audio_path) ──
        final_audio_path = audio_path
-        if req.bgm_id:
-            _update_task(task_id, message="正在合成背景音乐...", progress=86)
+        _whisper_task = None
+        _bgm_task = None

+        # In single-material mode Whisper has not run yet, so start it here alongside BGM
+        need_whisper = not is_multi and req.enable_subtitles and captions_path is None
+        if need_whisper:
+            captions_path = temp_dir / f"{task_id}_captions.json"
+            temp_files.append(captions_path)
+            _captions_path_str = str(captions_path)
+
+            async def _run_whisper():
+                _update_task(task_id, message="正在生成字幕 (Whisper)...", progress=82)
+                try:
+                    await whisper_service.align(
+                        audio_path=str(audio_path),
+                        text=req.text,
+                        output_path=_captions_path_str,
+                        language=_locale_to_whisper_lang(req.language),
+                        original_text=req.text,
+                    )
+                    print(f"[Pipeline] Whisper alignment completed")
+                    return True
+                except Exception as e:
+                    logger.warning(f"Whisper alignment failed, skipping subtitles: {e}")
+                    return False
+
+            _whisper_task = _run_whisper()
+
+        if req.bgm_id:
            bgm_path = resolve_bgm_path(req.bgm_id)
            if bgm_path:
                mix_output_path = temp_dir / f"{task_id}_audio_mix.wav"
                temp_files.append(mix_output_path)
                volume = req.bgm_volume if req.bgm_volume is not None else 0.2
                volume = max(0.0, min(float(volume), 1.0))
+                _mix_output = str(mix_output_path)
+                _bgm_path = str(bgm_path)
+                _voice_path = str(audio_path)
+                _volume = volume
+
+                async def _run_bgm():
+                    _update_task(task_id, message="正在合成背景音乐...", progress=86)
+                    loop = asyncio.get_running_loop()
                    try:
-                    video.mix_audio(
-                        voice_path=str(audio_path),
-                        bgm_path=str(bgm_path),
-                        output_path=str(mix_output_path),
-                        bgm_volume=volume
+                        await loop.run_in_executor(
+                            None,
+                            video.mix_audio,
+                            _voice_path,
+                            _bgm_path,
+                            _mix_output,
+                            _volume,
                        )
-                    final_audio_path = mix_output_path
+                        return True
                    except Exception as e:
                        logger.warning(f"BGM mix failed, fallback to voice only: {e}")
+                        return False
+
+                _bgm_task = _run_bgm()
            else:
                logger.warning(f"BGM not found: {req.bgm_id}")

-        use_remotion = (captions_path and captions_path.exists()) or req.title
+        # Wait for Whisper + BGM in parallel
+        parallel_tasks = [t for t in (_whisper_task, _bgm_task) if t is not None]
+        if parallel_tasks:
+            results = await asyncio.gather(*parallel_tasks)
+            result_idx = 0
+            if _whisper_task is not None:
+                if not results[result_idx]:
+                    captions_path = None
+                result_idx += 1
+            if _bgm_task is not None:
+                if results[result_idx]:
+                    final_audio_path = mix_output_path
+
+
+        use_remotion = (captions_path and captions_path.exists()) or req.title or req.secondary_title

        subtitle_style = None
        title_style = None
+        secondary_title_style = None
        if req.enable_subtitles:
            subtitle_style = get_style("subtitle", req.subtitle_style_id) or get_default_style("subtitle")
        if req.title:
            title_style = get_style("title", req.title_style_id) or get_default_style("title")
+        if req.secondary_title:
+            secondary_title_style = get_style("title", req.secondary_title_style_id) or get_default_style("title")

        if req.subtitle_font_size and req.enable_subtitles:
            if subtitle_style is None:
@@ -226,6 +694,16 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:
                subtitle_style = {}
            subtitle_style["bottom_margin"] = int(req.subtitle_bottom_margin)

+        if req.secondary_title_font_size and req.secondary_title:
+            if secondary_title_style is None:
+                secondary_title_style = {}
+            secondary_title_style["font_size"] = int(req.secondary_title_font_size)
+
+        if req.secondary_title_top_margin is not None and req.secondary_title:
+            if secondary_title_style is None:
+                secondary_title_style = {}
+            secondary_title_style["top_margin"] = int(req.secondary_title_top_margin)
+
        if use_remotion:
            subtitle_style = prepare_style_for_remotion(
                subtitle_style,
@@ -237,6 +715,11 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:
                temp_dir,
                f"{task_id}_title_font"
            )
+            secondary_title_style = prepare_style_for_remotion(
+                secondary_title_style,
+                temp_dir,
+                f"{task_id}_secondary_title_font"
+            )

        final_output_local_path = temp_dir / f"{task_id}_output.mp4"
        temp_files.append(final_output_local_path)
@@ -256,16 +739,26 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:
                        mapped = 87 + int(percent * 0.08)
                        _update_task(task_id, progress=mapped)

+                    title_display_mode = (
+                        req.title_display_mode
+                        if req.title_display_mode in ("short", "persistent")
+                        else "short"
+                    )
+                    title_duration = max(0.5, min(float(req.title_duration or 4.0), 30.0))
+
                    await remotion_service.render(
                        video_path=str(composed_video_path),
                        output_path=str(final_output_local_path),
                        captions_path=str(captions_path) if captions_path else None,
                        title=req.title,
-                        title_duration=3.0,
+                        title_duration=title_duration,
+                        title_display_mode=title_display_mode,
                        fps=25,
                        enable_subtitles=req.enable_subtitles,
                        subtitle_style=subtitle_style,
                        title_style=title_style,
+                        secondary_title=req.secondary_title,
+                        secondary_title_style=secondary_title_style,
                        on_progress=on_remotion_progress
                    )
                    print(f"[Pipeline] Remotion render completed")
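Note on `_split_equal` above: the boundary snapping can be tried in isolation. The sketch below is not the code from this diff. `snap_boundaries` is a hypothetical name, and it assumes each word is a dict with `start` and `end` timestamps. It also differs in one respect: it caps the search range so every later boundary still has a word left, whereas the diff clamps the index and skips empty segments afterwards.

```python
from typing import Dict, List


def snap_boundaries(words: List[Dict[str, float]], n: int) -> List[int]:
    """Return n+1 word indices that cut `words` into n time-balanced runs.

    Hypothetical helper for illustration only, not part of workflow.py.
    """
    n = min(n, len(words))  # never more segments than words
    if n == 0:
        return []

    total_start = words[0]["start"]
    seg_dur = (words[-1]["end"] - total_start) / n

    boundaries = [0]  # the first segment starts at word 0
    for i in range(1, n):
        target = total_start + i * seg_dur
        lo = boundaries[-1] + 1          # keep boundaries strictly increasing
        hi = len(words) - n + i + 1      # leave one word for each later boundary
        best = min(range(lo, hi), key=lambda j: abs(words[j]["start"] - target))
        boundaries.append(best)
    boundaries.append(len(words))  # the last segment runs to the end
    return boundaries


if __name__ == "__main__":
    # Six one-second words split across three materials.
    demo = [{"start": float(i), "end": float(i + 1)} for i in range(6)]
    print(snap_boundaries(demo, 3))
```

Segment `k` then covers `words[boundaries[k]:boundaries[k + 1]]`, which is the slicing the diff uses to build each assignment.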
--- a/backend/app/repositories/orders.py
+++ b/backend/app/repositories/orders.py
@@ -0,0 +1,34 @@
+"""
+Order data access layer
+"""
+from datetime import datetime, timezone
+from typing import Any, Dict, Optional, cast
+
+from app.core.supabase import get_supabase
+
+
+def create_order(user_id: str, out_trade_no: str, amount: float) -> Dict[str, Any]:
+    supabase = get_supabase()
+    result = supabase.table("orders").insert({
+        "user_id": user_id,
+        "out_trade_no": out_trade_no,
+        "amount": amount,
+        "status": "pending",
+    }).execute()
+    return cast(Dict[str, Any], (result.data or [{}])[0])
+
+
+def get_order_by_trade_no(out_trade_no: str) -> Optional[Dict[str, Any]]:
+    supabase = get_supabase()
+    result = supabase.table("orders").select("*").eq("out_trade_no", out_trade_no).single().execute()
+    return cast(Optional[Dict[str, Any]], result.data or None)
+
+
+def update_order_status(out_trade_no: str, status: str, trade_no: Optional[str] = None) -> None:
+    supabase = get_supabase()
+    payload: Dict[str, Any] = {"status": status}
+    if trade_no:
+        payload["trade_no"] = trade_no
+    if status == "paid":
+        payload["paid_at"] = datetime.now(timezone.utc).isoformat()
+    supabase.table("orders").update(payload).eq("out_trade_no", out_trade_no).execute()
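Note on `update_order_status` above: which columns get written depends on the arguments. The sketch below pulls the payload construction into a pure function so it can be checked without Supabase. `build_status_payload` and its `now` parameter are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_status_payload(
    status: str,
    trade_no: Optional[str] = None,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Mirror the payload rules from the diff (hypothetical helper)."""
    payload: Dict[str, Any] = {"status": status}
    if trade_no:
        # The Alipay trade number is only written when one is supplied.
        payload["trade_no"] = trade_no
    if status == "paid":
        # paid_at is stamped in UTC, as in the diff.
        stamp = now or datetime.now(timezone.utc)
        payload["paid_at"] = stamp.isoformat()
    return payload


if __name__ == "__main__":
    print(build_status_payload("paid", trade_no="2026022722001400001"))
    print(build_status_payload("closed"))
```

Under these rules a status other than `"paid"` never touches `paid_at`, so repeated non-payment updates leave an existing timestamp alone.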
--- a/backend/app/repositories/users.py
+++ b/backend/app/repositories/users.py
@@ -1,3 +1,4 @@
+from datetime import datetime, timezone
 from typing import Any, Dict, List, Optional, cast

 from app.core.supabase import get_supabase
@@ -37,3 +38,33 @@ def update_user(user_id: str, payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    supabase = get_supabase()
    result = supabase.table("users").update(payload).eq("id", user_id).execute()
    return cast(List[Dict[str, Any]], result.data or [])
+
+
+def _parse_expires_at(expires_at: Any) -> Optional[datetime]:
+    try:
+        expires_at_dt = datetime.fromisoformat(str(expires_at).replace("Z", "+00:00"))
+    except Exception:
+        return None
+
+    if expires_at_dt.tzinfo is None:
+        expires_at_dt = expires_at_dt.replace(tzinfo=timezone.utc)
+    return expires_at_dt.astimezone(timezone.utc)
+
+
+def deactivate_user_if_expired(user: Dict[str, Any]) -> bool:
+    expires_at = user.get("expires_at")
+    if not expires_at:
+        return False
+
+    expires_at_dt = _parse_expires_at(expires_at)
+    if not expires_at_dt:
+        return False
+
+    if datetime.now(timezone.utc) <= expires_at_dt:
+        return False
+
+    user_id = user.get("id")
+    if user.get("is_active") and user_id:
+        update_user(cast(str, user_id), {"is_active": False})
+
+    return True
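Review note: the expiry check depends on how `_parse_expires_at` normalizes timestamps. The sketch below is a standalone illustration of that normalization, assuming ISO 8601 input. `parse_expires_at` and `is_expired` are illustrative names, not functions from the repository, and the database update is left out.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def parse_expires_at(expires_at: Any) -> Optional[datetime]:
    """Illustrative copy of the normalization: return an aware UTC datetime or None."""
    try:
        # Older fromisoformat() versions reject a trailing "Z", so map it to an offset.
        dt = datetime.fromisoformat(str(expires_at).replace("Z", "+00:00"))
    except ValueError:
        return None
    if dt.tzinfo is None:
        # Assumption carried over from the diff: naive timestamps are treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def is_expired(expires_at: Any, now: datetime) -> bool:
    """Missing or unparseable values count as not expired, matching the diff."""
    dt = parse_expires_at(expires_at)
    return dt is not None and now > dt
```

Passing `now` explicitly keeps the comparison deterministic in tests. The strict `>` mirrors the diff, where a user whose `expires_at` equals the current instant is still active.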
--- a/backend/app/services/glm_service.py
+++ b/backend/app/services/glm_service.py
@@ -35,17 +35,19 @@ class GLMService:
        Returns:
            {"title": "标题", "tags": ["标签1", "标签2", ...]}
        """
-        prompt = f"""根据以下口播文案，生成一个吸引人的短视频标题和3个相关标签。
+        prompt = f"""根据以下口播文案，生成一个吸引人的短视频标题、副标题和3个相关标签。

 口播文案：
 {text}

 要求：
 1. 标题要简洁有力，能吸引观众点击，不超过10个字
-2. 标签要与内容相关，便于搜索和推荐，只要3个
+2. 副标题是对标题的补充说明或描述性文字，不超过20个字
+3. 标签要与内容相关，便于搜索和推荐，只要3个
+4. 标题、副标题和标签必须使用与口播文案相同的语言（如文案是英文就用英文，日文就用日文）

 请严格按以下JSON格式返回（不要包含其他内容）：
-{{"title": "标题", "tags": ["标签1", "标签2", "标签3"]}}"""
+{{"title": "标题", "secondary_title": "副标题", "tags": ["标签1", "标签2", "标签3"]}}"""

        try:
            client = self._get_client()
@@ -74,16 +76,23 @@ class GLMService:
            logger.error(f"GLM service error: {e}")
            raise Exception(f"AI 生成失败: {str(e)}")

-    async def rewrite_script(self, text: str) -> str:
+    async def rewrite_script(self, text: str, custom_prompt: str | None = None) -> str:
        """
-        AI 洗稿（文案改写）
+        AI 改写文案

        Args:
            text: 原始文案
+            custom_prompt: 自定义提示词，为空则使用默认提示词

        Returns:
            改写后的文案
        """
+        if custom_prompt and custom_prompt.strip():
+            prompt = f"""{custom_prompt.strip()}
+
+原始文案：
+{text}"""
+        else:
            prompt = f"""请将以下视频文案进行改写。

 原始文案：
@@ -120,6 +129,49 @@ class GLMService:



+    async def translate_text(self, text: str, target_lang: str) -> str:
+        """
+        Translate the script into the target language.
+
+        Args:
+            text: Original script
+            target_lang: Target language (e.g. English, 日本語)
+
+        Returns:
+            The translated script
+        """
+        prompt = f"""请将以下文案翻译为{target_lang}。
+
+原文：
+{text}
+
+要求：
+1. 只返回翻译后的文案，不要添加任何解释或说明
+2. 保持原文的语气和风格
+3. 翻译要自然流畅，符合目标语言的表达习惯"""
+
+        try:
+            client = self._get_client()
+            logger.info(f"Using GLM to translate text to {target_lang}")
+
+            import asyncio
+            response = await asyncio.to_thread(
+                client.chat.completions.create,
+                model=settings.GLM_MODEL,
+                messages=[{"role": "user", "content": prompt}],
+                thinking={"type": "disabled"},
+                max_tokens=2000,
+                temperature=0.3
+            )
+
+            content = response.choices[0].message.content
+            logger.info("GLM translation completed")
+            return content.strip()
+
+        except Exception as e:
+            logger.error(f"GLM translate error: {e}")
+            raise Exception(f"AI 翻译失败: {str(e)}")
+
    def _parse_json_response(self, content: str) -> dict:
        """解析 GLM 返回的 JSON 内容"""
        # 尝试直接解析
@@ -130,6 +182,8 @@ class GLMService:

        # 尝试提取 JSON 块
        json_match = re.search(r'\{[^{}]*"title"[^{}]*"tags"[^{}]*\}', content, re.DOTALL)
+        if not json_match:  # fallback for reversed key order; the pattern above already covers "secondary_title"
+            json_match = re.search(r'\{[^{}]*"tags"[^{}]*"title"[^{}]*\}', content, re.DOTALL)
        if json_match:
            try:
                return json.loads(json_match.group())
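Review note: `[^{}]*` matches any run of non-brace characters, so the original `"title" ... "tags"` pattern also accepts a payload with `secondary_title` between the two keys. The sketch below checks that on a sample reply. `extract_metadata` and the sample text are hypothetical, written only for this illustration.

```python
import json
import re

# Same primary pattern as in _parse_json_response.
PRIMARY = re.compile(r'\{[^{}]*"title"[^{}]*"tags"[^{}]*\}', re.DOTALL)


def extract_metadata(content: str) -> dict:
    """Hypothetical helper: parse directly, then fall back to the first matching JSON block."""
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        pass
    match = PRIMARY.search(content)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group())


# A chatty reply that wraps the JSON object in extra text.
reply = 'Sure:\n{"title": "A", "secondary_title": "B", "tags": ["x", "y", "z"]}\nDone.'
parsed = extract_metadata(reply)
```

Because the character class excludes braces, the pattern cannot match a payload containing nested objects; that limitation applies to the code in the diff as well.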
--- a/backend/app/services/lipsync_service.py
+++ b/backend/app/services/lipsync_service.py
@@ -1,7 +1,7 @@
 """
 唇形同步服务
-通过 subprocess 调用 LatentSync conda 环境进行推理
-配置为使用 GPU1 (CUDA:1)
+混合方案: 短视频用 LatentSync (高质量), 长视频用 MuseTalk (高速度)
+路由阈值: LIPSYNC_DURATION_THRESHOLD (默认 120s)
 """
 import os
 import shutil
@@ -17,7 +17,7 @@ from app.core.config import settings


 class LipSyncService:
-    """唇形同步服务 - LatentSync 1.6 集成 (Subprocess 方式)"""
+    """Lip-sync service - LatentSync 1.6 + MuseTalk 1.5 hybrid"""

    def __init__(self):
        self.use_local = settings.LATENTSYNC_LOCAL
@@ -26,6 +26,9 @@ class LipSyncService:
        self.gpu_id = settings.LATENTSYNC_GPU_ID
        self.use_server = settings.LATENTSYNC_USE_SERVER

+        # MuseTalk configuration
+        self.musetalk_api_url = settings.MUSETALK_API_URL
+
        # GPU 并发锁 (Serial Queue)
        self._lock = asyncio.Lock()
        
@@ -103,7 +106,7 @@ class LipSyncService:
                "-t", str(target_duration),  # 截取到目标时长
                "-c:v", "libx264",
                "-preset", "fast",
-                "-crf", "18",
+                "-crf", "23",
                "-an",  # 去掉原音频
                output_path
            ]
@@ -268,6 +271,18 @@ class LipSyncService:
                else:
                    actual_video_path = video_path

+                # Hybrid routing: long videos go to MuseTalk, short ones to LatentSync
+                if audio_duration and audio_duration >= settings.LIPSYNC_DURATION_THRESHOLD:
+                    logger.info(
+                        f"🔄 音频 {audio_duration:.1f}s >= {settings.LIPSYNC_DURATION_THRESHOLD}s，路由到 MuseTalk"
+                    )
+                    musetalk_result = await self._call_musetalk_server(
+                        actual_video_path, audio_path, output_path
+                    )
+                    if musetalk_result:
+                        return musetalk_result
+                    logger.warning("⚠️ MuseTalk 不可用，回退到 LatentSync（长视频，会较慢）")
+
                if self.use_server:
                    # 模式 A: 调用常驻服务 (加速模式)
                    return await self._call_persistent_server(actual_video_path, audio_path, output_path)
@@ -352,6 +367,55 @@ class LipSyncService:
                    shutil.copy(video_path, output_path)
                    return output_path
    
+    async def _call_musetalk_server(
+        self, video_path: str, audio_path: str, output_path: str
+    ) -> Optional[str]:
+        """
+        Call the persistent MuseTalk service.
+        Returns output_path on success, or None if unavailable (the caller then falls back to LatentSync).
+        """
+        server_url = self.musetalk_api_url
+        logger.info(f"⚡ 调用 MuseTalk 服务: {server_url}")
+
+        try:
+            async with httpx.AsyncClient(timeout=3600.0) as client:
+                # Health check
+                try:
+                    resp = await client.get(f"{server_url}/health", timeout=5.0)
+                    if resp.status_code != 200:
+                        logger.warning("⚠️ MuseTalk 健康检查失败")
+                        return None
+                    health = resp.json()
+                    if not health.get("model_loaded"):
+                        logger.warning("⚠️ MuseTalk 模型未加载")
+                        return None
+                except Exception:
+                    logger.warning("⚠️ 无法连接 MuseTalk 服务")
+                    return None
+
+                # Send the inference request
+                payload = {
+                    "video_path": str(Path(video_path).resolve()),
+                    "audio_path": str(Path(audio_path).resolve()),
+                    "video_out_path": str(Path(output_path).resolve()),
+                    "batch_size": settings.MUSETALK_BATCH_SIZE,
+                }
+
+                response = await client.post(f"{server_url}/lipsync", json=payload)
+
+                if response.status_code == 200:
+                    result = response.json()
+                    if Path(result["output_path"]).exists():
+                        logger.info(f"✅ MuseTalk 推理完成: {output_path}")
+                        return output_path
+
+                logger.error(f"❌ MuseTalk 服务报错: {response.text}")
+                return None
+
+        except Exception as e:
+            logger.error(f"❌ MuseTalk 调用失败: {e}")
+            return None
+
    async def _call_persistent_server(self, video_path: str, audio_path: str, output_path: str) -> str:
        """调用本地常驻服务 (server.py)"""
        server_url = "http://localhost:8007"
@@ -369,7 +433,7 @@ class LipSyncService:
        }
        
        try:
-            async with httpx.AsyncClient(timeout=1200.0) as client:
+            async with httpx.AsyncClient(timeout=3600.0) as client:
                # 先检查健康状态
                try:
                    resp = await client.get(f"{server_url}/health", timeout=5.0)
@@ -477,8 +541,18 @@ class LipSyncService:
            except:
                pass
        
+        # Check the MuseTalk service
+        musetalk_ready = False
+        try:
+            async with httpx.AsyncClient(timeout=5.0) as client:
+                resp = await client.get(f"{self.musetalk_api_url}/health")
+                if resp.status_code == 200:
+                    musetalk_ready = resp.json().get("model_loaded", False)
+        except Exception:
+            pass
+
        return {
-            "model": "LatentSync 1.6",
+            "model": "LatentSync 1.6 + MuseTalk 1.5",
            "conda_env": conda_ok,
            "weights": weights_ok,
            "gpu": gpu_ok,
@@ -486,5 +560,7 @@ class LipSyncService:
            "gpu_id": self.gpu_id,
            "inference_steps": settings.LATENTSYNC_INFERENCE_STEPS,
            "guidance_scale": settings.LATENTSYNC_GUIDANCE_SCALE,
-            "ready": conda_ok and weights_ok and gpu_ok
+            "ready": conda_ok and weights_ok and gpu_ok,
+            "musetalk_ready": musetalk_ready,
+            "lipsync_threshold": settings.LIPSYNC_DURATION_THRESHOLD,
        }
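Review note: the routing rule added to `LipSyncService` can be read as a small pure function. The sketch below restates it under that assumption. `choose_engine` is a hypothetical name, and `musetalk_ready` stands in for the health check and inference call succeeding.

```python
from typing import Optional


def choose_engine(audio_duration: Optional[float], threshold: float, musetalk_ready: bool) -> str:
    """Hypothetical pure-function view of the routing rule in the diff."""
    # A missing or zero duration is treated as "short", as in the diff's truthiness check.
    is_long = bool(audio_duration) and audio_duration >= threshold
    if is_long and musetalk_ready:
        return "musetalk"
    # Short clips, unknown durations, and MuseTalk outages all use LatentSync.
    return "latentsync"
```

With the default threshold of 120 s, a clip of exactly 120 s is routed to MuseTalk because the comparison is `>=`.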
--- a/backend/app/services/remotion_service.py
+++ b/backend/app/services/remotion_service.py
@@ -7,6 +7,7 @@ import asyncio
 import json
 import os
 import subprocess
+from collections.abc import Callable
 from pathlib import Path
 from typing import Optional
 from loguru import logger
@@ -29,12 +30,15 @@ class RemotionService:
        output_path: str,
        captions_path: Optional[str] = None,
        title: Optional[str] = None,
-        title_duration: float = 3.0,
+        title_duration: float = 4.0,
+        title_display_mode: str = "short",
        fps: int = 25,
        enable_subtitles: bool = True,
        subtitle_style: Optional[dict] = None,
        title_style: Optional[dict] = None,
-        on_progress: Optional[callable] = None
+        secondary_title: Optional[str] = None,
+        secondary_title_style: Optional[dict] = None,
+        on_progress: Optional[Callable[[int], None]] = None
    ) -> str:
        """
        使用 Remotion 渲染视频（添加字幕和标题）
@@ -45,6 +49,7 @@ class RemotionService:
            captions_path: 字幕 JSON 文件路径（Whisper 生成）
            title: 视频标题（可选）
            title_duration: 标题显示时长（秒）
+            title_display_mode: 标题显示模式（short/persistent）
            fps: 帧率
            enable_subtitles: 是否启用字幕
            on_progress: 进度回调函数
@@ -75,6 +80,7 @@ class RemotionService:
        if title:
            cmd.extend(["--title", title])
            cmd.extend(["--titleDuration", str(title_duration)])
+            cmd.extend(["--titleDisplayMode", title_display_mode])

        if subtitle_style:
            cmd.extend(["--subtitleStyle", json.dumps(subtitle_style, ensure_ascii=False)])
@@ -82,6 +88,12 @@ class RemotionService:
        if title_style:
            cmd.extend(["--titleStyle", json.dumps(title_style, ensure_ascii=False)])

+        if secondary_title:
+            cmd.extend(["--secondaryTitle", secondary_title])
+
+        if secondary_title_style:
+            cmd.extend(["--secondaryTitleStyle", json.dumps(secondary_title_style, ensure_ascii=False)])
+
        logger.info(f"Running Remotion render: {' '.join(cmd)}")

        # 在线程池中运行子进程
@@ -95,8 +107,12 @@ class RemotionService:
                bufsize=1
            )

+            if process.stdout is None:
+                raise RuntimeError("Remotion process stdout is unavailable")
+            stdout = process.stdout
+
            output_lines = []
-            for line in iter(process.stdout.readline, ''):
+            for line in iter(stdout.readline, ''):
                line = line.strip()
                if line:
                    output_lines.append(line)
--- a/backend/app/services/storage.py
+++ b/backend/app/services/storage.py
@@ -20,12 +20,13 @@ class StorageService:
        self.BUCKET_MATERIALS = "materials"
        self.BUCKET_OUTPUTS = "outputs"
        self.BUCKET_REF_AUDIOS = "ref-audios"
+        self.BUCKET_GENERATED_AUDIOS = "generated-audios"
        # 确保所有 bucket 存在
        self._ensure_buckets()

    def _ensure_buckets(self):
        """确保所有必需的 bucket 存在"""
-        buckets = [self.BUCKET_MATERIALS, self.BUCKET_OUTPUTS, self.BUCKET_REF_AUDIOS]
+        buckets = [self.BUCKET_MATERIALS, self.BUCKET_OUTPUTS, self.BUCKET_REF_AUDIOS, self.BUCKET_GENERATED_AUDIOS]
        try:
            existing = self.supabase.storage.list_buckets()
            existing_names = {b.name for b in existing} if existing else set()
--- a/backend/app/services/video_service.py
+++ b/backend/app/services/video_service.py
@@ -13,6 +13,107 @@ class VideoService:
    def __init__(self):
        pass

+    def get_video_metadata(self, file_path: str) -> dict:
+        """Return video metadata, including the rotation angle and the effective display resolution."""
+        cmd = [
+            "ffprobe", "-v", "error",
+            "-select_streams", "v:0",
+            "-show_entries", "stream=width,height:stream_side_data=rotation",
+            "-of", "json",
+            file_path,
+        ]
+        default_info = {
+            "width": 0,
+            "height": 0,
+            "rotation": 0,
+            "effective_width": 0,
+            "effective_height": 0,
+        }
+
+        try:
+            result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
+            if result.returncode != 0:
+                return default_info
+
+            payload = json.loads(result.stdout or "{}")
+            streams = payload.get("streams") or []
+            if not streams:
+                return default_info
+
+            stream = streams[0]
+            width = int(stream.get("width") or 0)
+            height = int(stream.get("height") or 0)
+
+            rotation = 0
+            for side_data in stream.get("side_data_list") or []:
+                if not isinstance(side_data, dict):
+                    continue
+                raw_rotation = side_data.get("rotation")
+                if raw_rotation is None:
+                    continue
+                try:
+                    rotation = int(round(float(str(raw_rotation))))
+                except Exception:
+                    rotation = 0
+                break
+
+            norm_rotation = rotation % 360
+            if norm_rotation > 180:
+                norm_rotation -= 360
+            swap_wh = abs(norm_rotation) == 90
+
+            effective_width = height if swap_wh else width
+            effective_height = width if swap_wh else height
+
+            return {
+                "width": width,
+                "height": height,
+                "rotation": norm_rotation,
+                "effective_width": effective_width,
+                "effective_height": effective_height,
+            }
+        except Exception as e:
+            logger.warning(f"获取视频元信息失败: {e}")
+            return default_info
+
+    def normalize_orientation(self, video_path: str, output_path: str) -> str:
+        """Bake rotation metadata into the pixel data so later stages cannot ignore it."""
+        info = self.get_video_metadata(video_path)
+        rotation = int(info.get("rotation") or 0)
+        if rotation == 0:
+            return video_path
+
+        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
+        logger.info(
+            f"检测到旋转元数据 rotation={rotation}，归一化方向: "
+            f"{info.get('effective_width', 0)}x{info.get('effective_height', 0)}"
+        )
+
+        cmd = [
+            "ffmpeg", "-y",
+            "-i", video_path,
+            "-map", "0:v:0",
+            "-map", "0:a?",
+            "-c:v", "libx264",
+            "-preset", "fast",
+            "-crf", "23",
+            "-c:a", "copy",
+            "-movflags", "+faststart",
+            output_path,
+        ]
+
+        if self._run_ffmpeg(cmd):
+            normalized = self.get_video_metadata(output_path)
+            logger.info(
+                "视频方向归一化完成: "
+                f"coded={normalized.get('width', 0)}x{normalized.get('height', 0)}, "
+                f"rotation={normalized.get('rotation', 0)}"
+            )
+            return output_path
+
+        logger.warning("视频方向归一化失败，回退使用原视频")
+        return video_path
+
    def _run_ffmpeg(self, cmd: list) -> bool:
        cmd_str = ' '.join(shlex.quote(str(c)) for c in cmd)
        logger.debug(f"FFmpeg CMD: {cmd_str}")
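Review note: the rotation handling in `get_video_metadata` folds any angle into the range (-180, 180] and swaps width and height for quarter turns. The sketch below isolates that arithmetic. `effective_size` is an illustrative name, not part of the diff.

```python
def effective_size(width: int, height: int, rotation: int) -> tuple[int, int, int]:
    """Return (normalized_rotation, display_width, display_height)."""
    norm = rotation % 360  # Python's % yields 0..359 even for negative angles
    if norm > 180:
        norm -= 360  # fold into (-180, 180]
    if abs(norm) == 90:
        return norm, height, width  # quarter turns swap the axes
    return norm, width, height
```

A phone clip stored as 1920x1080 with rotation -90 therefore reports a display size of 1080x1920, which is the value `get_resolution` returns.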
@@ -123,8 +224,8 @@ class VideoService:
        # Audio map with high quality encoding
        cmd.extend([
            "-c:v", "libx264",
-            "-preset", "slow",      # 慢速预设，更好的压缩效率
-            "-crf", "18",           # 高质量（与 LatentSync 一致）
+            "-preset", "medium",    # balance speed and compression efficiency
+            "-crf", "20",           # final output: high quality (visually lossless)
            "-c:a", "aac",
            "-b:a", "192k",         # 音频比特率
            "-shortest"
@@ -138,3 +239,167 @@ class VideoService:
            return output_path
        else:
            raise RuntimeError("FFmpeg composition failed")
+
+    def concat_videos(self, video_paths: list, output_path: str, target_fps: int = 25) -> str:
+        """Concatenate video segments with the FFmpeg concat demuxer."""
+        if not video_paths:
+            raise ValueError("No video segments to concat")
+
+        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
+
+        # Write the concat list file; a single quote inside a path is escaped as '\''
+        list_path = Path(output_path).parent / f"{Path(output_path).stem}_concat.txt"
+        with open(list_path, "w", encoding="utf-8") as f:
+            for vp in video_paths:
+                f.write("file '" + str(vp).replace("'", "'\\''") + "'\n")
+
+        cmd = [
+            "ffmpeg", "-y",
+            "-f", "concat",
+            "-safe", "0",
+            "-fflags", "+genpts",
+            "-i", str(list_path),
+            "-an",
+            "-vsync", "cfr",
+            "-r", str(target_fps),
+            "-c:v", "libx264",
+            "-preset", "fast",
+            "-crf", "23",
+            "-pix_fmt", "yuv420p",
+            "-movflags", "+faststart",
+            output_path,
+        ]
+
+        try:
+            if self._run_ffmpeg(cmd):
+                return output_path
+            else:
+                raise RuntimeError("FFmpeg concat failed")
+        finally:
+            try:
+                list_path.unlink(missing_ok=True)
+            except Exception:
+                pass
+
+    def split_audio(self, audio_path: str, start: float, end: float, output_path: str) -> str:
+        """Cut an audio clip by time range with FFmpeg."""
+        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
+
+        duration = end - start
+        if duration <= 0:
+            raise ValueError(f"Invalid audio split range: start={start}, end={end}, duration={duration}")
+
+        cmd = [
+            "ffmpeg", "-y",
+            "-ss", str(start),
+            "-t", str(duration),
+            "-i", audio_path,
+            "-c", "copy",
+            output_path,
+        ]
+
+        if self._run_ffmpeg(cmd):
+            return output_path
+        raise RuntimeError(f"FFmpeg audio split failed: {start}-{end}")
+
+    def get_resolution(self, file_path: str) -> tuple[int, int]:
+        """Return the effective display resolution (rotation metadata applied)."""
+        info = self.get_video_metadata(file_path)
+        return (
+            int(info.get("effective_width") or 0),
+            int(info.get("effective_height") or 0),
+        )
+
+    def prepare_segment(self, video_path: str, target_duration: float, output_path: str,
+                        target_resolution: Optional[tuple] = None, source_start: float = 0.0,
+                        source_end: Optional[float] = None, target_fps: Optional[int] = None) -> str:
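+        """Trim or loop a source video to the target duration (no audio).
+        target_resolution: (width, height) to unify the resolution; omit to keep the source resolution.
+        source_start: Start of the source range in seconds; defaults to 0.
+        source_end: End of the source range in seconds; defaults to the end of the source.
+        target_fps: Optional output frame rate, used to unify the time base before concatenating clips.
+        """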
+        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
+
+        video_dur = self._get_duration(video_path)
+        if video_dur <= 0:
+            video_dur = target_duration
+
+        clip_end = video_dur
+        if source_end is not None:
+            try:
+                source_end_value = float(source_end)
+                if source_end_value > source_start:
+                    clip_end = min(source_end_value, video_dur)
+            except Exception:
+                pass
+
+        # Available duration: from source_start to clip_end
+        available = max(clip_end - source_start, 0.1)
+        needs_loop = target_duration > available
+        needs_scale = target_resolution is not None
+        needs_fps = bool(target_fps and target_fps > 0)
+        has_source_end = clip_end < video_dur
+
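+        # When looping over a sub-range, trim the range into a temp file first and loop that file;
+        # otherwise -stream_loop would loop the whole source instead of the trimmed range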
+        actual_input = video_path
+        trim_temp = None
+        if needs_loop and (source_start > 0 or has_source_end):
+            trim_temp = str(Path(output_path).parent / (Path(output_path).stem + "_trim_tmp.mp4"))
+            trim_cmd = [
+                "ffmpeg", "-y",
+                "-ss", str(source_start),
+                "-i", video_path,
+                "-t", str(available),
+                "-an",
+                "-c:v", "libx264", "-preset", "fast", "-crf", "23",
+                trim_temp,
+            ]
+            if not self._run_ffmpeg(trim_cmd):
+                raise RuntimeError(f"FFmpeg trim for loop failed: {video_path}")
+            actual_input = trim_temp
+            source_start = 0.0  # already trimmed, no further seek needed
+            # Re-measure the trimmed file; the loop count below is derived from it
+            available = self._get_duration(trim_temp) or available
+
+        loop_count = int(target_duration / available) + 1 if needs_loop else 0
+
+        cmd = ["ffmpeg", "-y"]
+        if needs_loop:
+            cmd.extend(["-stream_loop", str(loop_count)])
+        if source_start > 0:
+            cmd.extend(["-ss", str(source_start)])
+        cmd.extend(["-i", actual_input, "-t", str(target_duration), "-an"])
+
+        filters = []
+        if needs_fps:
+            filters.append(f"fps={int(target_fps)}")
+        if needs_scale:
+            w, h = target_resolution
+            filters.append(f"scale={w}:{h}:force_original_aspect_ratio=decrease,pad={w}:{h}:(ow-iw)/2:(oh-ih)/2")
+
+        if filters:
+            cmd.extend(["-vf", ",".join(filters)])
+        if needs_fps:
+            cmd.extend(["-vsync", "cfr", "-r", str(int(target_fps))])
+
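+        # Looping, scaling, trimming (source_start/source_end), or an fps change requires re-encoding; otherwise stream-copy to keep the original quality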
+        if needs_loop or needs_scale or source_start > 0 or has_source_end or needs_fps:
+            cmd.extend(["-c:v", "libx264", "-preset", "fast", "-crf", "23"])
+        else:
+            cmd.extend(["-c:v", "copy"])
+
+        cmd.append(output_path)
+
+        try:
+            if self._run_ffmpeg(cmd):
+                return output_path
+            raise RuntimeError(f"FFmpeg prepare_segment failed: {video_path}")
+        finally:
+            # Remove the temporary trimmed file
+            if trim_temp:
+                try:
+                    Path(trim_temp).unlink(missing_ok=True)
+                except Exception:
+                    pass
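Review note: `prepare_segment` derives the `-stream_loop` count from the usable source range. The sketch below reproduces that arithmetic on its own, assuming the clip range is already clamped to the source duration. `loop_plan` is a hypothetical name. FFmpeg's `-stream_loop N` replays the input N additional times, so the count deliberately over-provisions and `-t` trims the excess.

```python
def loop_plan(target_duration: float, clip_start: float, clip_end: float) -> tuple[float, int]:
    """Return (available_seconds, stream_loop_count) for a source range."""
    # Same floor as the diff: never divide by a zero-length range.
    available = max(clip_end - clip_start, 0.1)
    if target_duration <= available:
        return available, 0  # the range is long enough, no looping
    # One extra loop guards against rounding; -t cuts the output to length.
    return available, int(target_duration / available) + 1
```

For a 4 s clip and a 10 s target this yields a loop count of 3, so the looped input covers 16 s and FFmpeg stops at 10 s.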
--- a/backend/app/services/voice_clone_service.py
+++ b/backend/app/services/voice_clone_service.py
@@ -1,37 +1,104 @@
 """
 声音克隆服务
-通过 HTTP 调用 Qwen3-TTS 独立服务 (端口 8009)
+通过 HTTP 调用 CosyVoice 3.0 独立服务 (端口 8010)
 """
-import httpx
 import asyncio
 from pathlib import Path
 from typing import Optional
+
+import httpx
 from loguru import logger

-from app.core.config import settings
-
-# Qwen3-TTS 服务地址
-QWEN_TTS_URL = "http://localhost:8009"
+# CosyVoice 3.0 service URL
+VOICE_CLONE_URL = "http://localhost:8010"


 class VoiceCloneService:
-    """声音克隆服务 - 调用 Qwen3-TTS HTTP API"""
+    """Voice clone service - calls the CosyVoice 3.0 HTTP API"""

    def __init__(self):
-        self.base_url = QWEN_TTS_URL
+        self.base_url = VOICE_CLONE_URL
        # 健康状态缓存
        self._health_cache: Optional[dict] = None
        self._health_cache_time: float = 0
        # GPU 并发锁 (Serial Queue)
        self._lock = asyncio.Lock()

+    async def _generate_once(
+        self,
+        *,
+        text: str,
+        ref_audio_data: bytes,
+        ref_text: str,
+        language: str,
+        speed: float = 1.0,
+        max_retries: int = 4,
+    ) -> bytes:
+        timeout = httpx.Timeout(240.0)
+
+        for attempt in range(max_retries):
+            try:
+                async with httpx.AsyncClient(timeout=timeout) as client:
+                    response = await client.post(
+                        f"{self.base_url}/generate",
+                        files={"ref_audio": ("ref.wav", ref_audio_data, "audio/wav")},
+                        data={
+                            "text": text,
+                            "ref_text": ref_text,
+                            "language": language,
+                            "speed": str(speed),
+                        },
+                    )
+
+                retryable = False
+                reason = ""
+
+                if response.status_code in (429, 502, 503, 504):
+                    retryable = True
+                    reason = f"HTTP {response.status_code}"
+                elif response.status_code == 500 and (
+                    "生成超时" in response.text or "timeout" in response.text.lower()
+                ):
+                    retryable = True
+                    reason = "upstream timeout"
+
+                if retryable and attempt < max_retries - 1:
+                    wait = 8 * (attempt + 1)
+                    logger.warning(
+                        f"Voice clone retryable error ({reason}), retrying in {wait}s "
+                        f"(attempt {attempt + 1}/{max_retries})"
+                    )
+                    await asyncio.sleep(wait)
+                    continue
+
+                response.raise_for_status()
+                return response.content
+
+            except httpx.HTTPStatusError as e:
+                logger.error(f"Voice clone API error: {e.response.status_code} - {e.response.text}")
+                raise RuntimeError(f"声音克隆服务错误: {e.response.text}")
+            except httpx.RequestError as e:
+                if attempt < max_retries - 1:
+                    wait = 6 * (attempt + 1)
+                    logger.warning(
+                        f"Voice clone connection error: {e}; retrying in {wait}s "
+                        f"(attempt {attempt + 1}/{max_retries})"
+                    )
+                    await asyncio.sleep(wait)
+                    continue
+                logger.error(f"Voice clone connection error: {e}")
+                raise RuntimeError("无法连接声音克隆服务，请检查服务是否启动")
+
+        raise RuntimeError("声音克隆服务繁忙，请稍后重试")
+
    async def generate_audio(
        self,
        text: str,
        ref_audio_path: str,
        ref_text: str,
        output_path: str,
-        language: str = "Chinese"
+        language: str = "Chinese",
+        speed: float = 1.0,
    ) -> str:
        """
        使用声音克隆生成语音
@@ -48,63 +115,52 @@ class VoiceCloneService:
        """
        # 使用锁确保串行执行，避免 GPU 显存溢出
        async with self._lock:
-            logger.info(f"🎤 Voice Clone: {text[:30]}...")
+            logger.info(f"🎤 Voice Clone: {text[:30]}... (language={language})")
            Path(output_path).parent.mkdir(parents=True, exist_ok=True)

-            # 读取参考音频
+            text = text.strip()
+            if not text:
+                raise RuntimeError("文本为空，无法生成语音")
+
            with open(ref_audio_path, "rb") as f:
                ref_audio_data = f.read()

-            # 调用 Qwen3-TTS 服务
-            timeout = httpx.Timeout(300.0)  # 5分钟超时
-            async with httpx.AsyncClient(timeout=timeout) as client:
-                try:
-                    response = await client.post(
-                        f"{self.base_url}/generate",
-                        files={"ref_audio": ("ref.wav", ref_audio_data, "audio/wav")},
-                        data={
-                            "text": text,
-                            "ref_text": ref_text,
-                            "language": language
-                        }
+            # CosyVoice segments text internally via text_normalize, so no client-side splitting is needed
+            audio_bytes = await self._generate_once(
+                text=text,
+                ref_audio_data=ref_audio_data,
+                ref_text=ref_text,
+                language=language,
+                speed=speed,
            )
-                    response.raise_for_status()
-
-                    # 保存返回的音频
            with open(output_path, "wb") as f:
-                        f.write(response.content)
-
+                f.write(audio_bytes)
            logger.info(f"✅ Voice clone saved: {output_path}")
            return output_path

-                except httpx.HTTPStatusError as e:
-                    logger.error(f"Qwen3-TTS API error: {e.response.status_code} - {e.response.text}")
-                    raise RuntimeError(f"声音克隆服务错误: {e.response.text}")
-                except httpx.RequestError as e:
-                    logger.error(f"Qwen3-TTS connection error: {e}")
-                    raise RuntimeError("无法连接声音克隆服务，请检查服务是否启动")
-
    async def check_health(self) -> dict:
        """健康检查"""
        import time

-        # 5分钟缓存
+        # 30-second cache
        now = time.time()
-        if self._health_cache and (now - self._health_cache_time) < 300:
-            return self._health_cache
+        cached = self._health_cache
+        if cached is not None and (now - self._health_cache_time) < 30:
+            return cached

        try:
            async with httpx.AsyncClient(timeout=5.0) as client:
                response = await client.get(f"{self.base_url}/health")
                response.raise_for_status()
-                self._health_cache = response.json()
+                payload = response.json()
+                self._health_cache = payload
                self._health_cache_time = now
-                return self._health_cache
+                return payload
        except Exception as e:
-            logger.warning(f"Qwen3-TTS health check failed: {e}")
+            logger.warning(f"Voice clone health check failed: {e}")
            return {
-                "service": "Qwen3-TTS Voice Clone",
-                "model": "0.6B-Base",
+                "service": "CosyVoice 3.0 Voice Clone",
+                "model": "unknown",
                "ready": False,
                "gpu_id": 0,
                "error": str(e)
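Review note: `_generate_once` waits `8 * (attempt + 1)` seconds after a retryable HTTP status and `6 * (attempt + 1)` seconds after a connection error, with no wait after the final attempt. The sketch below lists the resulting schedule. `backoff_schedule` is an illustrative helper, not part of the diff.

```python
def backoff_schedule(max_attempts: int = 4, step: float = 8.0) -> list[float]:
    """Waits in seconds between attempts; no wait follows the final attempt."""
    return [step * (attempt + 1) for attempt in range(max_attempts - 1)]


http_waits = backoff_schedule()             # retryable HTTP statuses
conn_waits = backoff_schedule(step=6.0)     # connection errors
```

With the defaults, the waits alone add up to 48 s for HTTP errors and 36 s for connection errors. Since `generate_audio` holds the GPU lock across all attempts, other requests queue behind the full retry sequence.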
--- a/backend/app/services/whisper_service.py
+++ b/backend/app/services/whisper_service.py
@@ -20,24 +20,41 @@ MAX_CHARS_PER_LINE = 12

 def split_word_to_chars(word: str, start: float, end: float) -> list:
    """
-    将词拆分成单个字符，时间戳线性插值
+    将词拆分成单个字符，时间戳线性插值。
+    保留英文词前的空格（Whisper 输出如 " Hello"），用于正确重建英文字幕。

    Args:
-        word: 词文本
+        word: 词文本（可能含前导空格）
        start: 词开始时间
        end: 词结束时间

    Returns:
        单字符列表，每个包含 word/start/end
    """
+    # Preserve the leading space (English Whisper output commonly looks like " Hello")
+    leading_space = ""
+    if word and not word[0].strip():
+        leading_space = " "
+        word = word.lstrip()
+
    tokens = []
    ascii_buffer = ""
+    pending_space = False  # True while a space is waiting to be attached to the next token (English word spacing)

    for char in word:
        if not char.strip():
+            # Whitespace: flush ascii_buffer and mark that the next token needs a leading space
+            if ascii_buffer:
+                tokens.append(ascii_buffer)
+                ascii_buffer = ""
+            if tokens:  # mark only once a token exists (avoids a duplicate space at the start)
+                pending_space = True
            continue

        if char.isascii() and char.isalnum():
+            if pending_space and not ascii_buffer:
+                ascii_buffer = " "  # prepend the space to the new English word
+                pending_space = False
            ascii_buffer += char
            continue

@@ -45,7 +62,9 @@ def split_word_to_chars(word: str, start: float, end: float) -> list:
            tokens.append(ascii_buffer)
            ascii_buffer = ""

-        tokens.append(char)
+        prefix = " " if pending_space else ""
+        pending_space = False
+        tokens.append(prefix + char)

    if ascii_buffer:
        tokens.append(ascii_buffer)
@@ -54,7 +73,8 @@ def split_word_to_chars(word: str, start: float, end: float) -> list:
        return []

    if len(tokens) == 1:
-        return [{"word": tokens[0], "start": start, "end": end}]
+        w = leading_space + tokens[0] if leading_space else tokens[0]
+        return [{"word": w, "start": start, "end": end}]

    # 线性插值时间戳
    duration = end - start
@@ -64,8 +84,11 @@ def split_word_to_chars(word: str, start: float, end: float) -> list:
    for i, token in enumerate(tokens):
        token_start = start + i * token_duration
        token_end = start + (i + 1) * token_duration
+        w = token
+        if i == 0 and leading_space:
+            w = leading_space + w
        result.append({
-            "word": token,
+            "word": w,
            "start": round(token_start, 3),
            "end": round(token_end, 3)
        })
@@ -108,7 +131,7 @@ def split_segment_to_lines(words: List[dict], max_chars: int = MAX_CHARS_PER_LIN

        if should_break and current_words:
            segments.append({
-                "text": current_text,
+                "text": current_text.strip(),
                "start": current_words[0]["start"],
                "end": current_words[-1]["end"],
                "words": current_words.copy()
@@ -119,7 +142,7 @@ def split_segment_to_lines(words: List[dict], max_chars: int = MAX_CHARS_PER_LIN
    # 处理剩余的字
    if current_words:
        segments.append({
-            "text": current_text,
+            "text": current_text.strip(),
            "start": current_words[0]["start"],
            "end": current_words[-1]["end"],
            "words": current_words.copy()
@@ -162,7 +185,9 @@ class WhisperService:
        self,
        audio_path: str,
        text: str,
-        output_path: Optional[str] = None
+        output_path: Optional[str] = None,
+        language: str = "zh",
+        original_text: Optional[str] = None,
    ) -> dict:
        """
        对音频进行转录，生成字级别时间戳
@@ -171,12 +196,18 @@ class WhisperService:
            audio_path: 音频文件路径
            text: 原始文本（用于参考，但实际使用 whisper 转录结果）
            output_path: 可选，输出 JSON 文件路径
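+            language: language code (zh, en, etc.)
+            original_text: original script. When non-empty, Whisper supplies only the timing and the
+                           subtitle text comes from this script (works around language mismatches)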

        Returns:
            包含字级别时间戳的字典
        """
        import asyncio

+        # Non-Chinese text (English and other Western scripts) needs a larger per-line character limit
+        max_chars = 40 if language != "zh" else MAX_CHARS_PER_LINE
+
        def _do_transcribe():
            model = self._load_model()

@@ -185,22 +216,26 @@ class WhisperService:
            # 转录并获取字级别时间戳
            segments_iter, info = model.transcribe(
                audio_path,
-                language="zh",
+                language=language,
                word_timestamps=True,  # 启用字级别时间戳
                vad_filter=True,  # 启用 VAD 过滤静音
            )

            logger.info(f"Detected language: {info.language} (prob: {info.language_probability:.2f})")

+            # Collect the Whisper transcription (always needed, to obtain the time range)
            all_segments = []
+            whisper_first_start = None
+            whisper_last_end = None
            for segment in segments_iter:
-                # 提取每个字的时间戳，并拆分成单字
                all_words = []
                if segment.words:
                    for word_info in segment.words:
-                        word_text = word_info.word.strip()
-                        if word_text:
-                            # 将词拆分成单字，时间戳线性插值
+                        word_text = word_info.word
+                        if word_text.strip():
+                            if whisper_first_start is None:
+                                whisper_first_start = word_info.start
+                            whisper_last_end = word_info.end
                            chars = split_word_to_chars(
                                word_text,
                                word_info.start,
@@ -208,11 +243,72 @@ class WhisperService:
                            )
                            all_words.extend(chars)

-                # 将长段落按标点和字数拆分成多行
                if all_words:
-                    line_segments = split_segment_to_lines(all_words, MAX_CHARS_PER_LINE)
+                    line_segments = split_segment_to_lines(all_words, max_chars)
                    all_segments.extend(line_segments)

+            # If original_text is provided, replace the Whisper text with it while keeping the speech rhythm
+            if original_text and original_text.strip() and whisper_first_start is not None:
+                # Collect Whisper's per-token timestamps (they carry the real speech rhythm)
+                whisper_chars = []
+                for seg in all_segments:
+                    whisper_chars.extend(seg.get("words", []))
+
+                # Build new timestamps from the original text's tokens and Whisper's rhythm
+                orig_chars = split_word_to_chars(
+                    original_text.strip(),
+                    whisper_first_start,
+                    whisper_last_end
+                )
+
+                if orig_chars and len(whisper_chars) >= 2:
+                    # Map the original tokens proportionally onto Whisper's timing
+                    n_w = len(whisper_chars)
+                    n_o = len(orig_chars)
+                    w_starts = [c["start"] for c in whisper_chars]
+                    w_final_end = whisper_chars[-1]["end"]
+
+                    logger.info(
+                        f"Using original_text for subtitles (len={len(original_text)}), "
+                        f"rhythm-mapping {n_o} orig chars onto {n_w} Whisper chars, "
+                        f"time range: {whisper_first_start:.2f}-{whisper_last_end:.2f}s"
+                    )
+
+                    remapped = []
+                    for i, oc in enumerate(orig_chars):
+                        # Position of the i-th original token on the Whisper timeline
+                        pos = (i / n_o) * n_w
+                        idx = min(int(pos), n_w - 1)
+                        frac = pos - idx
+                        t_start = (
+                            w_starts[idx] + frac * (w_starts[idx + 1] - w_starts[idx])
+                            if idx < n_w - 1
+                            else w_starts[idx] + frac * (w_final_end - w_starts[idx])
+                        )
+
+                        # End time = start time of the next token
+                        pos_next = ((i + 1) / n_o) * n_w
+                        idx_n = min(int(pos_next), n_w - 1)
+                        frac_n = pos_next - idx_n
+                        t_end = (
+                            w_starts[idx_n] + frac_n * (w_starts[idx_n + 1] - w_starts[idx_n])
+                            if idx_n < n_w - 1
+                            else w_starts[idx_n] + frac_n * (w_final_end - w_starts[idx_n])
+                        )
+
+                        remapped.append({
+                            "word": oc["word"],
+                            "start": round(t_start, 3),
+                            "end": round(t_end, 3),
+                        })
+
+                    all_segments = split_segment_to_lines(remapped, max_chars)
+                    logger.info(f"Rebuilt {len(all_segments)} subtitle segments (rhythm-mapped)")
+                elif orig_chars:
+                    # Too few Whisper tokens: fall back to linear interpolation
+                    all_segments = split_segment_to_lines(orig_chars, max_chars)
+                    logger.info(f"Rebuilt {len(all_segments)} subtitle segments (linear fallback)")
+
            logger.info(f"Generated {len(all_segments)} subtitle segments")
            return {"segments": all_segments}

@@ -230,12 +326,13 @@ class WhisperService:

        return result

-    async def transcribe(self, audio_path: str) -> str:
+    async def transcribe(self, audio_path: str, language: Optional[str] = None) -> str:
        """
        仅转录文本（用于提取文案）

        Args:
            audio_path: 音频/视频文件路径
+            language: language code; None means auto-detect

        Returns:
            纯文本内容
@@ -249,7 +346,7 @@ class WhisperService:
            # 转录 (无需字级时间戳)
            segments_iter, _ = model.transcribe(
                audio_path,
-                language="zh",
+                language=language,
                word_timestamps=False,
                vad_filter=True,
            )
--- a/backend/assets/styles/subtitle.json
+++ b/backend/assets/styles/subtitle.json
@@ -54,5 +54,61 @@
    "letter_spacing": 1,
    "bottom_margin": 72,
    "is_default": false
+  },
+  {
+    "id": "subtitle_pink",
+    "label": "少女粉",
+    "font_file": "DingTalk JinBuTi.ttf",
+    "font_family": "DingTalkJinBuTi",
+    "font_size": 56,
+    "highlight_color": "#FF69B4",
+    "normal_color": "#FFFFFF",
+    "stroke_color": "#1A0010",
+    "stroke_size": 3,
+    "letter_spacing": 2,
+    "bottom_margin": 80,
+    "is_default": false
+  },
+  {
+    "id": "subtitle_lime",
+    "label": "清新绿",
+    "font_file": "DingTalk Sans.ttf",
+    "font_family": "DingTalkSans",
+    "font_size": 50,
+    "highlight_color": "#76FF03",
+    "normal_color": "#FFFFFF",
+    "stroke_color": "#001A00",
+    "stroke_size": 3,
+    "letter_spacing": 1,
+    "bottom_margin": 78,
+    "is_default": false
+  },
+  {
+    "id": "subtitle_gold",
+    "label": "金色隶书",
+    "font_file": "阿里妈妈刀隶体.ttf",
+    "font_family": "AliMamaDaoLiTi",
+    "font_size": 56,
+    "highlight_color": "#FDE68A",
+    "normal_color": "#E8D5B0",
+    "stroke_color": "#2B1B00",
+    "stroke_size": 3,
+    "letter_spacing": 3,
+    "bottom_margin": 80,
+    "is_default": false
+  },
+  {
+    "id": "subtitle_kai",
+    "label": "楷体红字",
+    "font_file": "simkai.ttf",
+    "font_family": "SimKai",
+    "font_size": 54,
+    "highlight_color": "#FF4444",
+    "normal_color": "#FFFFFF",
+    "stroke_color": "#000000",
+    "stroke_size": 3,
+    "letter_spacing": 2,
+    "bottom_margin": 80,
+    "is_default": false
  }
 ]
--- a/backend/assets/styles/title.json
+++ b/backend/assets/styles/title.json
@@ -7,7 +7,7 @@
    "font_size": 90,
    "color": "#FFFFFF",
    "stroke_color": "#000000",
-    "stroke_size": 8,
+    "stroke_size": 5,
    "letter_spacing": 5,
    "top_margin": 62,
    "font_weight": 900,
@@ -21,7 +21,7 @@
    "font_size": 72,
    "color": "#FFFFFF",
    "stroke_color": "#000000",
-    "stroke_size": 8,
+    "stroke_size": 5,
    "letter_spacing": 4,
    "top_margin": 60,
    "font_weight": 900,
@@ -35,7 +35,7 @@
    "font_size": 70,
    "color": "#FDE68A",
    "stroke_color": "#2B1B00",
-    "stroke_size": 8,
+    "stroke_size": 5,
    "letter_spacing": 3,
    "top_margin": 58,
    "font_weight": 800,
@@ -49,10 +49,122 @@
    "font_size": 72,
    "color": "#FFFFFF",
    "stroke_color": "#1F0A00",
-    "stroke_size": 8,
+    "stroke_size": 5,
    "letter_spacing": 4,
    "top_margin": 60,
    "font_weight": 900,
    "is_default": false
+  },
+  {
+    "id": "title_pangmen",
+    "label": "庞门正道",
+    "font_file": "title/庞门正道标题体3.0.ttf",
+    "font_family": "PangMenZhengDao",
+    "font_size": 80,
+    "color": "#FFFFFF",
+    "stroke_color": "#000000",
+    "stroke_size": 5,
+    "letter_spacing": 5,
+    "top_margin": 60,
+    "font_weight": 900,
+    "is_default": false
+  },
+  {
+    "id": "title_round",
+    "label": "优设标题圆",
+    "font_file": "title/优设标题圆.otf",
+    "font_family": "YouSheBiaoTiYuan",
+    "font_size": 78,
+    "color": "#FFFFFF",
+    "stroke_color": "#4A1A6B",
+    "stroke_size": 5,
+    "letter_spacing": 4,
+    "top_margin": 60,
+    "font_weight": 900,
+    "is_default": false
+  },
+  {
+    "id": "title_alibaba",
+    "label": "阿里数黑体",
+    "font_file": "title/阿里巴巴数黑体.ttf",
+    "font_family": "AlibabaShuHeiTi",
+    "font_size": 72,
+    "color": "#FFFFFF",
+    "stroke_color": "#000000",
+    "stroke_size": 4,
+    "letter_spacing": 3,
+    "top_margin": 60,
+    "font_weight": 900,
+    "is_default": false
+  },
+  {
+    "id": "title_chaohei",
+    "label": "文道潮黑",
+    "font_file": "title/文道潮黑.ttf",
+    "font_family": "WenDaoChaoHei",
+    "font_size": 76,
+    "color": "#00E5FF",
+    "stroke_color": "#001A33",
+    "stroke_size": 5,
+    "letter_spacing": 4,
+    "top_margin": 60,
+    "font_weight": 900,
+    "is_default": false
+  },
+  {
+    "id": "title_wujie",
+    "label": "无界黑",
+    "font_file": "title/标小智无界黑.otf",
+    "font_family": "BiaoXiaoZhiWuJieHei",
+    "font_size": 74,
+    "color": "#FFFFFF",
+    "stroke_color": "#1A1A1A",
+    "stroke_size": 4,
+    "letter_spacing": 3,
+    "top_margin": 60,
+    "font_weight": 900,
+    "is_default": false
+  },
+  {
+    "id": "title_houdi",
+    "label": "厚底黑",
+    "font_file": "title/Aa厚底黑.ttf",
+    "font_family": "AaHouDiHei",
+    "font_size": 76,
+    "color": "#FF6B6B",
+    "stroke_color": "#1A0000",
+    "stroke_size": 5,
+    "letter_spacing": 4,
+    "top_margin": 60,
+    "font_weight": 900,
+    "is_default": false
+  },
+  {
+    "id": "title_banyuan",
+    "label": "寒蝉半圆体",
+    "font_file": "title/寒蝉半圆体.otf",
+    "font_family": "HanChanBanYuan",
+    "font_size": 78,
+    "color": "#FFFFFF",
+    "stroke_color": "#000000",
+    "stroke_size": 5,
+    "letter_spacing": 4,
+    "top_margin": 60,
+    "font_weight": 900,
+    "is_default": false
+  },
+  {
+    "id": "title_jixiang",
+    "label": "欣意吉祥宋",
+    "font_file": "title/字体圈欣意吉祥宋.ttf",
+    "font_family": "XinYiJiXiangSong",
+    "font_size": 70,
+    "color": "#FDE68A",
+    "stroke_color": "#2B1B00",
+    "stroke_size": 5,
+    "letter_spacing": 3,
+    "top_margin": 58,
+    "font_weight": 800,
+    "is_default": false
  }
 ]
--- a/backend/database/schema.sql
+++ b/backend/database/schema.sql
@@ -71,3 +71,18 @@ CREATE TRIGGER users_updated_at
    BEFORE UPDATE ON users
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at();
+
+-- 8. Orders table (Alipay payments)
+CREATE TABLE IF NOT EXISTS orders (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
+    out_trade_no TEXT UNIQUE NOT NULL,
+    amount DECIMAL(10, 2) NOT NULL DEFAULT 999.00,
+    status TEXT DEFAULT 'pending' CHECK (status IN ('pending', 'paid', 'failed')),
+    trade_no TEXT,
+    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
+    paid_at TIMESTAMP WITH TIME ZONE
+);
+
+CREATE INDEX IF NOT EXISTS idx_orders_user_id ON orders(user_id);
+CREATE INDEX IF NOT EXISTS idx_orders_out_trade_no ON orders(out_trade_no);
--- a/backend/package-lock.json
+++ b/backend/package-lock.json
@@ -0,0 +1,31 @@
+{
+  "name": "backend",
+  "lockfileVersion": 3,
+  "requires": true,
+  "packages": {
+    "": {
+      "dependencies": {
+        "qrcode.react": "^4.2.0"
+      }
+    },
+    "node_modules/qrcode.react": {
+      "version": "4.2.0",
+      "resolved": "https://registry.npmjs.org/qrcode.react/-/qrcode.react-4.2.0.tgz",
+      "integrity": "sha512-QpgqWi8rD9DsS9EP3z7BT+5lY5SFhsqGjpgW5DY/i3mK4M9DTBNz3ErMi8BWYEfI3L0d8GIbGmcdFAS1uIRGjA==",
+      "license": "ISC",
+      "peerDependencies": {
+        "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
+      }
+    },
+    "node_modules/react": {
+      "version": "19.2.4",
+      "resolved": "https://registry.npmjs.org/react/-/react-19.2.4.tgz",
+      "integrity": "sha512-9nfp2hYpCwOjAN+8TZFGhtWEwgvWHXqESH8qT89AT/lWklpLON22Lc8pEtnpsZz7VmawabSU0gCjnj8aC0euHQ==",
+      "license": "MIT",
+      "peer": true,
+      "engines": {
+        "node": ">=0.10.0"
+      }
+    }
+  }
+}
--- a/backend/package.json
+++ b/backend/package.json
@@ -0,0 +1,5 @@
+{
+  "dependencies": {
+    "qrcode.react": "^4.2.0"
+  }
+}
--- a/backend/requirements.txt
+++ b/backend/requirements.txt
@@ -29,6 +29,9 @@ python-jose[cryptography]>=3.3.0
 passlib[bcrypt]>=1.7.4
 bcrypt==4.0.1

+# Alipay payments
+python-alipay-sdk>=3.6.0
+
 # 字幕对齐
 faster-whisper>=1.0.0

--- a/backend/scripts/watchdog.py
+++ b/backend/scripts/watchdog.py
@@ -20,27 +20,45 @@ logger = logging.getLogger("Watchdog")
 # 服务配置
 SERVICES = [
    {
-        "name": "vigent2-qwen-tts",
-        "url": "http://localhost:8009/health",
+        "name": "vigent2-cosyvoice",
+        "url": "http://localhost:8010/health",
        "failures": 0,
-        "threshold": 3,
+        "threshold": 3,          # 连续3次失败才重启（3×15s ≈ 45秒容忍期）
        "timeout": 10.0,
-        "restart_cmd": ["pm2", "restart", "vigent2-qwen-tts"]
+        "restart_cmd": ["pm2", "restart", "vigent2-cosyvoice"],
+        "cooldown_until": 0,     # 重启后的冷却截止时间戳
+        "cooldown_sec": 45,      # 重启后等待45秒再开始检查
    }
 ]

 async def check_service(service):
    """检查单个服务健康状态"""
+    # Skip the check while the service is inside its cooldown window
+    now = time.time()
+    if now < service.get("cooldown_until", 0):
+        remaining = int(service["cooldown_until"] - now)
+        logger.debug(f"⏳ 服务 {service['name']} 冷却中，剩余 {remaining}s")
+        return True
+
    try:
        timeout = service.get("timeout", 10.0)
        async with httpx.AsyncClient(timeout=timeout) as client:
            response = await client.get(service["url"])
            if response.status_code == 200:
-                # 成功
+                ready = True
+                try:
+                    payload = response.json()
+                    ready = bool(payload.get("ready", True))
+                except Exception:
+                    payload = {}
+
+                if ready:
                    if service["failures"] > 0:
                        logger.info(f"✅ 服务 {service['name']} 已恢复正常")
                    service["failures"] = 0
                    return True
+
+                logger.warning(f"⚠️ 服务 {service['name']} ready=false，健康检查未通过: {payload}")
            else:
                logger.warning(f"⚠️ 服务 {service['name']} 返回状态码 {response.status_code}")
    except Exception as e:
@@ -55,8 +73,9 @@ async def check_service(service):
        try:
            subprocess.run(service["restart_cmd"], check=True)
            logger.info(f"♻️ 服务 {service['name']} 重启命令已发送")
-            # 重启后给予一段宽限期 (例如 60秒) 不检查，等待服务启动
-            service["failures"] = 0 # 重置计数
+            service["failures"] = 0
+            # Start a cooldown so the service can finish starting up and loading its model
+            service["cooldown_until"] = time.time() + service.get("cooldown_sec", 120)
            return "restarting"
        except Exception as restart_error:
            logger.error(f"💥 重启服务 {service['name']} 失败: {restart_error}")
@@ -66,16 +85,16 @@ async def check_service(service):
 async def main():
    logger.info("🛡️ ViGent2 服务看门狗 (Watchdog) 已启动")

-    while True:
-        # 并发检查所有服务
+    # Give every service an initial cooldown at startup so it is not marked failed before it is up
    for service in SERVICES:
-            result = await check_service(service)
-            if result == "restarting":
-                # 如果有服务重启，额外等待包含启动时间
-                pass
+        service["cooldown_until"] = time.time() + 60

-        # 每 30 秒检查一次
-        await asyncio.sleep(30)
+    while True:
+        for service in SERVICES:
+            await check_service(service)
+
+        # Check every 15 seconds
+        await asyncio.sleep(15)

 if __name__ == "__main__":
    try:
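Review note on the watchdog hunks: a service is now restarted only after `threshold` consecutive failed checks, and checks are skipped until `cooldown_until` has passed. The sketch below models that bookkeeping as a pure function with the clock passed in. The increment-and-compare step is not visible in the hunks above, so its exact form here is an assumption, and `record_check` is a hypothetical name.

```python
from typing import Dict


def record_check(service: Dict, healthy: bool, now: float) -> str:
    """Update one service's counters and return 'ok', 'cooldown', 'failing' or 'restart'."""
    if now < service.get("cooldown_until", 0):
        return "cooldown"  # checks are skipped while the service is starting up
    if healthy:
        service["failures"] = 0
        return "ok"
    # Assumed step: count the failure and compare against the threshold.
    service["failures"] += 1
    if service["failures"] < service["threshold"]:
        return "failing"
    # Threshold reached: reset the counter and open a new cooldown window.
    service["failures"] = 0
    service["cooldown_until"] = now + service["cooldown_sec"]
    return "restart"
```

With `threshold=3`, `cooldown_sec=45` and one check every 15 seconds, the third consecutive failure triggers the restart and checks stay suppressed for the following 45 seconds.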
--- a/frontend/package-lock.json
+++ b/frontend/package-lock.json
@@ -8,14 +8,19 @@
      "name": "frontend",
      "version": "0.1.0",
      "dependencies": {
+        "@dnd-kit/core": "^6.3.1",
+        "@dnd-kit/sortable": "^10.0.0",
+        "@dnd-kit/utilities": "^3.2.2",
        "@supabase/supabase-js": "^2.93.1",
        "axios": "^1.13.4",
        "lucide-react": "^0.563.0",
        "next": "16.1.1",
+        "qrcode.react": "^4.2.0",
        "react": "19.2.3",
        "react-dom": "19.2.3",
        "sonner": "^2.0.7",
-        "swr": "^2.3.8"
+        "swr": "^2.3.8",
+        "wavesurfer.js": "^7.12.1"
      },
      "devDependencies": {
        "@tailwindcss/postcss": "^4",
@@ -281,6 +286,59 @@
        "node": ">=6.9.0"
      }
    },
+    "node_modules/@dnd-kit/accessibility": {
+      "version": "3.1.1",
+      "resolved": "https://registry.npmjs.org/@dnd-kit/accessibility/-/accessibility-3.1.1.tgz",
+      "integrity": "sha512-2P+YgaXF+gRsIihwwY1gCsQSYnu9Zyj2py8kY5fFvUM1qm2WA2u639R6YNVfU4GWr+ZM5mqEsfHZZLoRONbemw==",
+      "license": "MIT",
+      "dependencies": {
+        "tslib": "^2.0.0"
+      },
+      "peerDependencies": {
+        "react": ">=16.8.0"
+      }
+    },
+    "node_modules/@dnd-kit/core": {
+      "version": "6.3.1",
+      "resolved": "https://registry.npmjs.org/@dnd-kit/core/-/core-6.3.1.tgz",
+      "integrity": "sha512-xkGBRQQab4RLwgXxoqETICr6S5JlogafbhNsidmrkVv2YRs5MLwpjoF2qpiGjQt8S9AoxtIV603s0GIUpY5eYQ==",
+      "license": "MIT",
+      "dependencies": {
+        "@dnd-kit/accessibility": "^3.1.1",
+        "@dnd-kit/utilities": "^3.2.2",
+        "tslib": "^2.0.0"
+      },
+      "peerDependencies": {
+        "react": ">=16.8.0",
+        "react-dom": ">=16.8.0"
+      }
+    },
+    "node_modules/@dnd-kit/sortable": {
+      "version": "10.0.0",
+      "resolved": "https://registry.npmjs.org/@dnd-kit/sortable/-/sortable-10.0.0.tgz",
+      "integrity": "sha512-+xqhmIIzvAYMGfBYYnbKuNicfSsk4RksY2XdmJhT+HAC01nix6fHCztU68jooFiMUB01Ky3F0FyOvhG/BZrWkg==",
+      "license": "MIT",
+      "dependencies": {
+        "@dnd-kit/utilities": "^3.2.2",
+        "tslib": "^2.0.0"
+      },
+      "peerDependencies": {
+        "@dnd-kit/core": "^6.3.0",
+        "react": ">=16.8.0"
+      }
+    },
+    "node_modules/@dnd-kit/utilities": {
+      "version": "3.2.2",
+      "resolved": "https://registry.npmjs.org/@dnd-kit/utilities/-/utilities-3.2.2.tgz",
+      "integrity": "sha512-+MKAJEOfaBe5SmV6t34p80MMKhjvUz0vRrvVJbPT0WElzaOJ/1xs+D+KDv+tD/NE5ujfrChEcshd4fLn0wpiqg==",
+      "license": "MIT",
+      "dependencies": {
+        "tslib": "^2.0.0"
+      },
+      "peerDependencies": {
+        "react": ">=16.8.0"
+      }
+    },
    "node_modules/@emnapi/core": {
      "version": "1.8.1",
      "resolved": "https://registry.npmjs.org/@emnapi/core/-/core-1.8.1.tgz",
@@ -5561,6 +5619,15 @@
        "node": ">=6"
      }
    },
+    "node_modules/qrcode.react": {
+      "version": "4.2.0",
+      "resolved": "https://registry.npmjs.org/qrcode.react/-/qrcode.react-4.2.0.tgz",
+      "integrity": "sha512-QpgqWi8rD9DsS9EP3z7BT+5lY5SFhsqGjpgW5DY/i3mK4M9DTBNz3ErMi8BWYEfI3L0d8GIbGmcdFAS1uIRGjA==",
+      "license": "ISC",
+      "peerDependencies": {
+        "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
+      }
+    },
    "node_modules/queue-microtask": {
      "version": "1.2.3",
      "resolved": "https://registry.npmjs.org/queue-microtask/-/queue-microtask-1.2.3.tgz",
@@ -6611,6 +6678,12 @@
        "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
      }
    },
+    "node_modules/wavesurfer.js": {
+      "version": "7.12.1",
+      "resolved": "https://registry.npmjs.org/wavesurfer.js/-/wavesurfer.js-7.12.1.tgz",
+      "integrity": "sha512-NswPjVHxk0Q1F/VMRemCPUzSojjuHHisQrBqQiRXg7MVbe3f5vQ6r0rTTXA/a/neC/4hnOEC4YpXca4LpH0SUg==",
+      "license": "BSD-3-Clause"
+    },
    "node_modules/which": {
      "version": "2.0.2",
      "resolved": "https://registry.npmjs.org/which/-/which-2.0.2.tgz",
--- a/frontend/package.json
+++ b/frontend/package.json
@@ -9,14 +9,19 @@
    "lint": "eslint"
  },
  "dependencies": {
+    "@dnd-kit/core": "^6.3.1",
+    "@dnd-kit/sortable": "^10.0.0",
+    "@dnd-kit/utilities": "^3.2.2",
    "@supabase/supabase-js": "^2.93.1",
    "axios": "^1.13.4",
    "lucide-react": "^0.563.0",
    "next": "16.1.1",
+    "qrcode.react": "^4.2.0",
    "react": "19.2.3",
    "react-dom": "19.2.3",
    "sonner": "^2.0.7",
-    "swr": "^2.3.8"
+    "swr": "^2.3.8",
+    "wavesurfer.js": "^7.12.1"
  },
  "devDependencies": {
    "@tailwindcss/postcss": "^4",
--- a/frontend/src/app/layout.tsx
+++ b/frontend/src/app/layout.tsx
@@ -46,7 +46,6 @@ export default function RootLayout({
        <Toaster
          position="top-center"
          richColors
-          closeButton
          toastOptions={{
            duration: 3000,
            className: "text-sm",
--- a/frontend/src/app/login/page.tsx
+++ b/frontend/src/app/login/page.tsx
@@ -3,9 +3,11 @@
 import { useState } from 'react';
 import { useRouter } from 'next/navigation';
 import { login } from "@/shared/lib/auth";
+import { useAuth } from "@/shared/contexts/AuthContext";

 export default function LoginPage() {
    const router = useRouter();
+    const { setUser } = useAuth();
    const [phone, setPhone] = useState('');
    const [password, setPassword] = useState('');
    const [error, setError] = useState('');
@@ -25,7 +27,11 @@ export default function LoginPage() {

        try {
            const result = await login(phone, password);
-            if (result.success) {
+            if (result.paymentToken) {
+                sessionStorage.setItem('payment_token', result.paymentToken);
+                router.push('/pay');
+            } else if (result.success) {
+                if (result.user) setUser(result.user);
                router.push('/');
            } else {
                setError(result.message || '登录失败');
--- a/frontend/src/app/pay/page.tsx
+++ b/frontend/src/app/pay/page.tsx
@@ -0,0 +1,160 @@
+'use client';
+
+import { Suspense, useState, useEffect, useRef } from 'react';
+import { useRouter, useSearchParams } from 'next/navigation';
+import api from '@/shared/api/axios';
+
+type PageStatus = 'loading' | 'redirecting' | 'checking' | 'success' | 'error';
+
+function PayContent() {
+    const router = useRouter();
+    const searchParams = useSearchParams();
+    const [status, setStatus] = useState<PageStatus>('loading');
+    const [errorMsg, setErrorMsg] = useState('');
+    const pollRef = useRef<ReturnType<typeof setInterval> | null>(null);
+
+    useEffect(() => {
+        const outTradeNo = searchParams.get('out_trade_no');
+        if (outTradeNo) {
+            setStatus('checking');
+            startPolling(outTradeNo);
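+            // Polling runs only on this branch, so the interval cleanup is returned here.
+            return () => {
+                if (pollRef.current) clearInterval(pollRef.current);
+            };
+        }
+
+        const token = sessionStorage.getItem('payment_token');
+        if (!token) {
+            router.replace('/login');
+            return;
+        }
+        // The order-creation path starts no interval, so it needs no cleanup.
+        createOrder(token);
+    }, []);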
+
+    const createOrder = async (token: string) => {
+        try {
+            const { data } = await api.post('/api/payment/create-order', { payment_token: token });
+            const { pay_url } = data.data;
+            setStatus('redirecting');
+            window.location.href = pay_url;
+        } catch (err: any) {
+            setStatus('error');
+            setErrorMsg(err.response?.data?.message || '创建订单失败，请重新登录');
+        }
+    };
+
+    const startPolling = (tradeNo: string) => {
+        checkStatus(tradeNo);
+        pollRef.current = setInterval(() => checkStatus(tradeNo), 3000);
+    };
+
+    const checkStatus = async (tradeNo: string) => {
+        try {
+            const { data } = await api.get(`/api/payment/status/${tradeNo}`);
+            if (data.data.status === 'paid') {
+                if (pollRef.current) clearInterval(pollRef.current);
+                setStatus('success');
+                sessionStorage.removeItem('payment_token');
+                setTimeout(() => router.replace('/login'), 3000);
+            }
+        } catch {
+            // ignore polling errors
+        }
+    };
+
+    return (
+        <div className="w-full max-w-md p-8 bg-white/10 backdrop-blur-lg rounded-2xl shadow-2xl border border-white/20">
+            {(status === 'loading' || status === 'redirecting') && (
+                <div className="text-center">
+                    <div className="mb-6">
+                        <svg className="animate-spin h-12 w-12 mx-auto text-purple-400" xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24">
+                            <circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4"></circle>
+                            <path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"></path>
+                        </svg>
+                    </div>
+                    <p className="text-gray-300">
+                        {status === 'loading' ? '正在创建订单...' : '正在跳转到支付宝...'}
+                    </p>
+                </div>
+            )}
+
+            {status === 'checking' && (
+                <div className="text-center">
+                    <h1 className="text-2xl font-bold text-white mb-6">支付确认中</h1>
+                    <div className="flex items-center justify-center gap-2 text-purple-300 mb-4">
+                        <svg className="animate-spin h-5 w-5" xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24">
+                            <circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4"></circle>
+                            <path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"></path>
+                        </svg>
+                        正在确认支付结果...
+                    </div>
+                    <p className="text-gray-400 text-sm">如果您已完成支付，请稍候</p>
+                </div>
+            )}
+
+            {status === 'success' && (
+                <div className="text-center">
+                    <div className="mb-6">
+                        <svg className="w-16 h-16 mx-auto text-green-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+                            <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z" />
+                        </svg>
+                    </div>
+                    <h2 className="text-2xl font-bold text-white mb-4">支付成功！</h2>
+                    <p className="text-gray-300 mb-2">会员已开通，即将跳转到登录页...</p>
+                    <p className="text-gray-500 text-sm">请重新登录即可使用</p>
+                </div>
+            )}
+
+            {status === 'error' && (
+                <div className="text-center">
+                    <div className="mb-6">
+                        <svg className="w-16 h-16 mx-auto text-red-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+                            <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M12 8v4m0 4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
+                        </svg>
+                    </div>
+                    <h2 className="text-2xl font-bold text-white mb-4">创建订单失败</h2>
+                    <p className="text-red-300 mb-6">{errorMsg}</p>
+                    <button
+                        onClick={() => router.replace('/login')}
+                        className="py-3 px-6 bg-gradient-to-r from-purple-600 to-pink-600 text-white font-semibold rounded-lg"
+                    >
+                        返回登录
+                    </button>
+                </div>
+            )}
+
+            {status === 'checking' && (
+                <div className="mt-6 text-center">
+                    <button
+                        onClick={() => {
+                            if (pollRef.current) clearInterval(pollRef.current);
+                            router.replace('/login');
+                        }}
+                        className="text-purple-300 hover:text-purple-200 text-sm"
+                    >
+                        返回登录
+                    </button>
+                </div>
+            )}
+        </div>
+    );
+}
+
+export default function PayPage() {
+    return (
+        <div className="min-h-dvh flex items-center justify-center">
+            <Suspense fallback={
+                <div className="w-full max-w-md p-8 bg-white/10 backdrop-blur-lg rounded-2xl shadow-2xl border border-white/20 text-center">
+                    <svg className="animate-spin h-12 w-12 mx-auto text-purple-400" xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24">
+                        <circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="4"></circle>
+                        <path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4zm2 5.291A7.962 7.962 0 014 12H0c0 3.042 1.135 5.824 3 7.938l3-2.647z"></path>
+                    </svg>
+                </div>
+            }>
+                <PayContent />
+            </Suspense>
+        </div>
+    );
+}
--- a/frontend/src/app/register/page.tsx
+++ b/frontend/src/app/register/page.tsx
@@ -61,7 +61,7 @@ export default function RegisterPage() {
                    </div>
                    <h2 className="text-2xl font-bold text-white mb-4">注册成功！</h2>
                    <p className="text-gray-300 mb-6">
-                        您的账号已创建，请等待管理员审核激活后即可登录。
+                        您的账号已创建，请返回登录页，登录后完成付费即可开通会员。
                    </p>
                    <a
                        href="/login"
--- a/frontend/src/components/AccountSettingsDropdown.tsx
+++ b/frontend/src/components/AccountSettingsDropdown.tsx
@@ -106,6 +106,10 @@ export default function AccountSettingsDropdown() {
            {/* 下拉菜单 */}
            {isOpen && (
                <div className="absolute right-0 mt-2 bg-gray-800 border border-white/10 rounded-lg shadow-xl z-[160] overflow-hidden whitespace-nowrap">
+                    {/* 账户名称 */}
+                    <div className="px-3 py-2 border-b border-white/10 text-center">
+                        <div className="text-sm text-white font-medium">{user?.phone ? `${user.phone.slice(0, 3)}****${user.phone.slice(-4)}` : '未知账户'}</div>
+                    </div>
                    {/* 有效期显示 */}
                    <div className="px-3 py-2 border-b border-white/10 text-center">
                        <div className="text-xs text-gray-400">账户有效期</div>
@@ -188,6 +192,7 @@ export default function AccountSettingsDropdown() {
                                    onClick={() => {
                                        setShowPasswordModal(false);
                                        setError('');
+                                        setSuccess('');
                                        setOldPassword('');
                                        setNewPassword('');
                                        setConfirmPassword('');
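
The dropdown change above masks the phone number inline with two `slice` calls. A minimal sketch of the same masking as a standalone helper follows. The `maskPhone` name, the `fallback` parameter, and the rule for short strings are assumptions for illustration and are not part of this diff.

```typescript
// Hypothetical helper, not part of this diff.
// Keeps the first 3 and last 4 characters, as the inline JSX expression does.
function maskPhone(
  phone: string | null | undefined,
  fallback = "未知账户",
): string {
  if (!phone) return fallback;
  // Assumption: with 7 or fewer characters the two slices would overlap and
  // reveal the whole value, so every character is starred instead.
  if (phone.length <= 7) return "*".repeat(phone.length);
  return `${phone.slice(0, 3)}****${phone.slice(-4)}`;
}

console.log(maskPhone("13812345678")); // 138****5678
```

Pulling the expression into a helper would mainly make the short-input case explicit; the inline version in the diff behaves identically for ordinary 11-digit numbers.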
--- a/frontend/src/features/home/model/useGeneratedAudios.ts
+++ b/frontend/src/features/home/model/useGeneratedAudios.ts
@@ -0,0 +1,193 @@
+import { useCallback, useEffect, useRef, useState } from "react";
+import api from "@/shared/api/axios";
+import { ApiResponse, unwrap } from "@/shared/api/types";
+import { toast } from "sonner";
+
+export interface GeneratedAudio {
+  id: string;
+  name: string;
+  path: string;
+  duration_sec: number;
+  text: string;
+  tts_mode: string;
+  language: string;
+  created_at: number;
+}
+
+interface AudioTask {
+  status: string;
+  progress?: number;
+  message?: string;
+  output?: GeneratedAudio & { audio_id: string };
+}
+
+interface UseGeneratedAudiosOptions {
+  selectedAudioId: string | null;
+  setSelectedAudioId: React.Dispatch<React.SetStateAction<string | null>>;
+}
+
+export const useGeneratedAudios = ({
+  selectedAudioId,
+  setSelectedAudioId,
+}: UseGeneratedAudiosOptions) => {
+  const [generatedAudios, setGeneratedAudios] = useState<GeneratedAudio[]>([]);
+  const [selectedAudio, setSelectedAudio] = useState<GeneratedAudio | null>(null);
+  const [isGeneratingAudio, setIsGeneratingAudio] = useState(false);
+  const [audioTaskId, setAudioTaskId] = useState<string | null>(null);
+  const [audioTask, setAudioTask] = useState<AudioTask | null>(null);
+  const pollRef = useRef<NodeJS.Timeout | null>(null);
+
+  const fetchGeneratedAudios = useCallback(async (selectId?: string) => {
+    try {
+      const { data: res } = await api.get<ApiResponse<{ items: GeneratedAudio[] }>>(
+        "/api/generated-audios"
+      );
+      const payload = unwrap(res);
+      const items: GeneratedAudio[] = payload.items || [];
+      setGeneratedAudios(items);
+
+      if (selectId && items.length > 0) {
+        if (selectId === "__latest__") {
+          setSelectedAudioId(items[0].id);
+          setSelectedAudio(items[0]);
+        } else {
+          const found = items.find((a) => a.id === selectId);
+          if (found) {
+            setSelectedAudioId(found.id);
+            setSelectedAudio(found);
+          }
+        }
+      }
+    } catch (error) {
+      console.error("获取配音列表失败:", error);
+    }
+  }, [setSelectedAudioId]);
+
+  // Sync selectedAudio when selectedAudioId changes externally (e.g. from persistence)
+  useEffect(() => {
+    if (!selectedAudioId || generatedAudios.length === 0) return;
+    const found = generatedAudios.find((a) => a.id === selectedAudioId);
+    if (found) {
+      setSelectedAudio(found);
+    }
+  }, [selectedAudioId, generatedAudios]);
+
+  const stopPolling = useCallback(() => {
+    if (pollRef.current) {
+      clearInterval(pollRef.current);
+      pollRef.current = null;
+    }
+  }, []);
+
+  const startPolling = useCallback((taskId: string) => {
+    stopPolling();
+    pollRef.current = setInterval(async () => {
+      try {
+        const { data: res } = await api.get<ApiResponse<AudioTask>>(
+          `/api/generated-audios/tasks/${taskId}`
+        );
+        const task = unwrap(res);
+        setAudioTask(task);
+
+        if (task.status === "completed") {
+          stopPolling();
+          setIsGeneratingAudio(false);
+          setAudioTaskId(null);
+          // Refresh list and select the new audio
+          await fetchGeneratedAudios("__latest__");
+          toast.success(task.message || "配音生成完成");
+        } else if (task.status === "failed") {
+          stopPolling();
+          setIsGeneratingAudio(false);
+          setAudioTaskId(null);
+          toast.error(task.message || "配音生成失败");
+        } else if (task.status === "not_found") {
+          stopPolling();
+          setIsGeneratingAudio(false);
+          setAudioTaskId(null);
+          setAudioTask(null);
+          toast.error("任务已丢失（服务可能已重启），请重新生成");
+        }
+      } catch {
+        // Network error, keep polling
+      }
+    }, 1000);
+  }, [stopPolling, fetchGeneratedAudios]);
+
+  // Cleanup on unmount
+  useEffect(() => {
+    return () => stopPolling();
+  }, [stopPolling]);
+
+  const generateAudio = useCallback(async (params: {
+    text: string;
+    tts_mode: string;
+    voice?: string;
+    ref_audio_id?: string;
+    ref_text?: string;
+    language: string;
+    speed?: number;
+  }) => {
+    setIsGeneratingAudio(true);
+    setAudioTask({ status: "pending", progress: 0, message: "正在提交..." });
+
+    try {
+      const { data: res } = await api.post<ApiResponse<{ task_id: string }>>(
+        "/api/generated-audios/generate",
+        params
+      );
+      const { task_id } = unwrap(res);
+      setAudioTaskId(task_id);
+      startPolling(task_id);
+    } catch (err: unknown) {
+      setIsGeneratingAudio(false);
+      setAudioTask(null);
+      const axiosErr = err as { response?: { data?: { message?: string } }; message?: string };
+      const errorMsg = axiosErr.response?.data?.message || axiosErr.message || String(err);
+      toast.error(`配音生成失败: ${errorMsg}`);
+    }
+  }, [startPolling]);
+
+  const deleteAudio = useCallback(async (audioId: string) => {
+    if (!confirm("确定要删除这个配音吗？")) return;
+    try {
+      await api.delete(`/api/generated-audios/${encodeURIComponent(audioId)}`);
+      if (selectedAudioId === audioId) {
+        setSelectedAudioId(null);
+        setSelectedAudio(null);
+      }
+      fetchGeneratedAudios();
+    } catch (error) {
+      toast.error("删除失败: " + String(error));
+    }
+  }, [fetchGeneratedAudios, selectedAudioId, setSelectedAudioId]);
+
+  const renameAudio = useCallback(async (audioId: string, newName: string) => {
+    try {
+      await api.put(`/api/generated-audios/${encodeURIComponent(audioId)}`, {
+        new_name: newName,
+      });
+      fetchGeneratedAudios();
+    } catch (err: unknown) {
+      toast.error("重命名失败: " + String(err));
+    }
+  }, [fetchGeneratedAudios]);
+
+  const selectAudio = useCallback((audio: GeneratedAudio) => {
+    setSelectedAudioId(audio.id);
+    setSelectedAudio(audio);
+  }, [setSelectedAudioId]);
+
+  return {
+    generatedAudios,
+    selectedAudio,
+    selectedAudioId,
+    isGeneratingAudio,
+    audioTask,
+    fetchGeneratedAudios,
+    generateAudio,
+    deleteAudio,
+    renameAudio,
+    selectAudio,
+  };
+};
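
`startPolling` above drives the status check from `setInterval` with an async callback, so a slow response may still be in flight when the next tick fires. A minimal sketch of a sequential alternative follows, where each request is awaited before the next delay starts. `pollTask`, `PollOptions`, `sleep`, and `isCancelled` are hypothetical names introduced here; only the terminal status strings are taken from the hook.

```typescript
// Hypothetical helper, not part of this diff.
interface PolledTask {
  status: string;
  message?: string;
}

interface PollOptions {
  intervalMs: number;
  maxAttempts: number;
  // Injectable delay so the loop can be exercised without real timers.
  sleep?: (ms: number) => Promise<void>;
  // Lets the caller (e.g. an unmount cleanup) stop the loop.
  isCancelled?: () => boolean;
}

// Status values the hook above treats as final.
const TERMINAL_STATUSES = new Set(["completed", "failed", "not_found"]);

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function pollTask<T extends PolledTask>(
  fetchTask: () => Promise<T>,
  { intervalMs, maxAttempts, sleep = defaultSleep, isCancelled = () => false }: PollOptions,
): Promise<T | null> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (isCancelled()) return null;
    try {
      // Only one request is ever in flight, unlike an async setInterval callback.
      const task = await fetchTask();
      if (TERMINAL_STATUSES.has(task.status)) return task;
    } catch {
      // Transient error: keep polling, as the hook does.
    }
    await sleep(intervalMs);
  }
  // Gave up after maxAttempts without reaching a terminal status.
  return null;
}
```

In the hook, the returned task could feed the same `completed` / `failed` / `not_found` branches, and `isCancelled` could read a ref that the unmount cleanup sets. The attempt cap is an added assumption; the original polls without a limit.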
--- a/frontend/src/features/home/model/useGeneratedVideos.ts
+++ b/frontend/src/features/home/model/useGeneratedVideos.ts
@@ -12,7 +12,7 @@ interface GeneratedVideo {
 }

 interface UseGeneratedVideosOptions {
-
+  storageKey: string;
  selectedVideoId: string | null;
  setSelectedVideoId: React.Dispatch<React.SetStateAction<string | null>>;
  setGeneratedVideo: React.Dispatch<React.SetStateAction<string | null>>;
@@ -20,7 +20,7 @@ interface UseGeneratedVideosOptions {
 }

 export const useGeneratedVideos = ({
-
+  storageKey,
  selectedVideoId,
  setSelectedVideoId,
  setGeneratedVideo,
@@ -45,6 +45,8 @@ export const useGeneratedVideos = ({
        if (preferVideoId === "__latest__") {
          setSelectedVideoId(videos[0].id);
          setGeneratedVideo(resolveMediaUrl(videos[0].path));
+          // Write a cross-page marker so the other page can also detect the newest generated video
+          localStorage.setItem(`vigent_${storageKey}_latestGeneratedVideoId`, videos[0].id);
        } else {
          const found = videos.find(v => v.id === preferVideoId);
          if (found) {
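
The hunk above writes the newest video id under a per-user `localStorage` key so that another page can pick it up. A minimal sketch of the reading side follows. The `consumeLatestVideoId` helper, the read-then-clear policy, and the `KeyValueStore` interface are assumptions for illustration; the diff only shows the write. The store is injected so the sketch also runs outside a browser.

```typescript
// Hypothetical reading side, not part of this diff.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Same key format as the write in useGeneratedVideos.
const latestVideoKey = (storageKey: string): string =>
  `vigent_${storageKey}_latestGeneratedVideoId`;

// Assumption: the marker is one-shot, so it is cleared once read.
function consumeLatestVideoId(
  store: KeyValueStore,
  storageKey: string,
): string | null {
  const key = latestVideoKey(storageKey);
  const id = store.getItem(key);
  if (id !== null) store.removeItem(key);
  return id;
}

// In-memory stand-in for localStorage, used only to exercise the sketch.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => { data.set(key, value); },
    removeItem: (key) => { data.delete(key); },
  };
}
```

In a browser, `window.localStorage` satisfies `KeyValueStore` structurally, so it could be passed in directly.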
--- a/frontend/src/features/home/model/useHomeController.ts
+++ b/frontend/src/features/home/model/useHomeController.ts
@@ -1,4 +1,4 @@
-import { useEffect, useRef, useState } from "react";
+import { useEffect, useMemo, useRef, useState } from "react";
 import api from "@/shared/api/axios";
 import {
  buildTextShadow,
@@ -9,7 +9,7 @@ import {
  resolveBgmUrl,
  resolveMediaUrl,
 } from "@/shared/lib/media";
-import { clampTitle } from "@/shared/lib/title";
+import { clampTitle, clampSecondaryTitle, SECONDARY_TITLE_MAX_LENGTH } from "@/shared/lib/title";
 import { useTitleInput } from "@/shared/hooks/useTitleInput";
 import { useAuth } from "@/shared/contexts/AuthContext";
 import { useTask } from "@/shared/contexts/TaskContext";
@@ -18,26 +18,80 @@ import { usePublishPrefetch } from "@/shared/hooks/usePublishPrefetch";
 import { PublishAccount } from "@/shared/types/publish";
 import { useBgm } from "@/features/home/model/useBgm";
 import { useGeneratedVideos } from "@/features/home/model/useGeneratedVideos";
+import { useGeneratedAudios } from "@/features/home/model/useGeneratedAudios";
 import { useHomePersistence } from "@/features/home/model/useHomePersistence";
 import { useMaterials } from "@/features/home/model/useMaterials";
 import { useMediaPlayers } from "@/features/home/model/useMediaPlayers";
 import { useRefAudios } from "@/features/home/model/useRefAudios";
 import { useTitleSubtitleStyles } from "@/features/home/model/useTitleSubtitleStyles";
+import { useTimelineEditor } from "@/features/home/model/useTimelineEditor";
+import { useSavedScripts } from "@/features/home/model/useSavedScripts";
+import { useVideoFrameCapture } from "@/features/home/model/useVideoFrameCapture";
 import { ApiResponse, unwrap } from "@/shared/api/types";

-const VOICES = [
+const VOICES: Record<string, { id: string; name: string }[]> = {
+  "zh-CN": [
    { id: "zh-CN-YunxiNeural", name: "云溪 (男声-年轻)" },
    { id: "zh-CN-YunjianNeural", name: "云健 (男声-新闻)" },
    { id: "zh-CN-YunyangNeural", name: "云扬 (男声-专业)" },
    { id: "zh-CN-XiaoxiaoNeural", name: "晓晓 (女声-活泼)" },
    { id: "zh-CN-XiaoyiNeural", name: "晓伊 (女声-温柔)" },
-];
+  ],
+  "en-US": [
+    { id: "en-US-GuyNeural", name: "Guy (Male)" },
+    { id: "en-US-JennyNeural", name: "Jenny (Female)" },
+  ],
+  "ja-JP": [
+    { id: "ja-JP-KeitaNeural", name: "圭太 (男声)" },
+    { id: "ja-JP-NanamiNeural", name: "七海 (女声)" },
+  ],
+  "ko-KR": [
+    { id: "ko-KR-InJoonNeural", name: "인준 (男声)" },
+    { id: "ko-KR-SunHiNeural", name: "선히 (女声)" },
+  ],
+  "fr-FR": [
+    { id: "fr-FR-HenriNeural", name: "Henri (Male)" },
+    { id: "fr-FR-DeniseNeural", name: "Denise (Female)" },
+  ],
+  "de-DE": [
+    { id: "de-DE-ConradNeural", name: "Conrad (Male)" },
+    { id: "de-DE-KatjaNeural", name: "Katja (Female)" },
+  ],
+  "es-ES": [
+    { id: "es-ES-AlvaroNeural", name: "Álvaro (Male)" },
+    { id: "es-ES-ElviraNeural", name: "Elvira (Female)" },
+  ],
+  "ru-RU": [
+    { id: "ru-RU-DmitryNeural", name: "Дмитрий (Male)" },
+    { id: "ru-RU-SvetlanaNeural", name: "Светлана (Female)" },
+  ],
+  "it-IT": [
+    { id: "it-IT-DiegoNeural", name: "Diego (Male)" },
+    { id: "it-IT-ElsaNeural", name: "Elsa (Female)" },
+  ],
+  "pt-BR": [
+    { id: "pt-BR-AntonioNeural", name: "Antonio (Male)" },
+    { id: "pt-BR-FranciscaNeural", name: "Francisca (Female)" },
+  ],
+};
+
+const LANG_TO_LOCALE: Record<string, string> = {
+  "中文": "zh-CN",
+  "English": "en-US",
+  "日本語": "ja-JP",
+  "한국어": "ko-KR",
+  "Français": "fr-FR",
+  "Deutsch": "de-DE",
+  "Español": "es-ES",
+  "Русский": "ru-RU",
+  "Italiano": "it-IT",
+  "Português": "pt-BR",
+};
+
+const DEFAULT_SHORT_TITLE_DURATION = 4;



-const FIXED_REF_TEXT =
-  "其实生活中有许多美好的瞬间，比如清晨的阳光，或者一杯温热的清茶。希望这次生成的音色能够自然、流畅，完美还原出我最真实的声音状态。";
-
 const scrollContainerToItem = (container: HTMLDivElement, item: HTMLDivElement) => {
  const containerRect = container.getBoundingClientRect();
  const itemRect = item.getBoundingClientRect();
@@ -70,22 +124,17 @@ interface RefAudio {
  created_at: number;
 }

-interface Material {
-  id: string;
-  name: string;
-  path: string;
-  size_mb: number;
-  scene?: string;
-}
+import type { Material } from "@/shared/types/material";

 export const useHomeController = () => {
  const apiBase = getApiBaseUrl();

-  const [selectedMaterial, setSelectedMaterial] = useState<string>("");
+  const [selectedMaterials, setSelectedMaterials] = useState<string[]>([]);
  const [previewMaterial, setPreviewMaterial] = useState<string | null>(null);

  const [text, setText] = useState<string>("");
  const [voice, setVoice] = useState<string>("zh-CN-YunxiNeural");
+  const [textLang, setTextLang] = useState<string>("zh-CN");

  // 使用全局任务状态
  const { currentTask, isGenerating, startTask } = useTask();
@@ -96,7 +145,6 @@ export const useHomeController = () => {

  // 字幕和标题相关状态
  const [videoTitle, setVideoTitle] = useState<string>("");
-  const [enableSubtitles, setEnableSubtitles] = useState<boolean>(true);
  const [selectedSubtitleStyleId, setSelectedSubtitleStyleId] = useState<string>("");
  const [selectedTitleStyleId, setSelectedTitleStyleId] = useState<string>("");
  const [subtitleFontSize, setSubtitleFontSize] = useState<number>(80);
@@ -104,10 +152,19 @@ export const useHomeController = () => {
  const [subtitleSizeLocked, setSubtitleSizeLocked] = useState<boolean>(false);
  const [titleSizeLocked, setTitleSizeLocked] = useState<boolean>(false);
  const [titleTopMargin, setTitleTopMargin] = useState<number>(62);
+  const [titleDisplayMode, setTitleDisplayMode] = useState<"short" | "persistent">("short");
  const [subtitleBottomMargin, setSubtitleBottomMargin] = useState<number>(80);
+  const [outputAspectRatio, setOutputAspectRatio] = useState<"9:16" | "16:9">("9:16");
  const [showStylePreview, setShowStylePreview] = useState<boolean>(false);
  const [materialDimensions, setMaterialDimensions] = useState<{ width: number; height: number } | null>(null);

+  // 副标题相关状态
+  const [videoSecondaryTitle, setVideoSecondaryTitle] = useState<string>("");
+  const [selectedSecondaryTitleStyleId, setSelectedSecondaryTitleStyleId] = useState<string>("");
+  const [secondaryTitleFontSize, setSecondaryTitleFontSize] = useState<number>(48);
+  const [secondaryTitleTopMargin, setSecondaryTitleTopMargin] = useState<number>(12);
+  const [secondaryTitleSizeLocked, setSecondaryTitleSizeLocked] = useState<boolean>(false);
+

  // 背景音乐相关状态
  const [selectedBgmId, setSelectedBgmId] = useState<string>("");
@@ -117,7 +174,17 @@ export const useHomeController = () => {
  // 声音克隆相关状态
  const [ttsMode, setTtsMode] = useState<"edgetts" | "voiceclone">("edgetts");
  const [selectedRefAudio, setSelectedRefAudio] = useState<RefAudio | null>(null);
-  const [refText, setRefText] = useState(FIXED_REF_TEXT);
+  const [refText, setRefText] = useState("");
+
+  // 预生成配音选中 ID
+  const [selectedAudioId, setSelectedAudioId] = useState<string | null>(null);
+
+  // 语速控制
+  const [speed, setSpeed] = useState<number>(1.0);
+
+  // ClipTrimmer 模态框状态
+  const [clipTrimmerOpen, setClipTrimmerOpen] = useState(false);
+  const [clipTrimmerSegmentId, setClipTrimmerSegmentId] = useState<string | null>(null);

  // 音频预览与重命名状态
  const [editingAudioId, setEditingAudioId] = useState<string | null>(null);
@@ -181,8 +248,8 @@ export const useHomeController = () => {
        { new_name: editMaterialName.trim() }
      );
      const payload = unwrap(res);
-      if (selectedMaterial === materialId && payload?.id) {
-        setSelectedMaterial(payload.id);
+      if (selectedMaterials.includes(materialId) && payload?.id) {
+        setSelectedMaterials((prev) => prev.map((x) => (x === materialId ? payload.id : x)));
      }
      setEditingMaterialId(null);
      setEditMaterialName("");
@@ -197,6 +264,10 @@ export const useHomeController = () => {
  // AI 生成标题标签
  const [isGeneratingMeta, setIsGeneratingMeta] = useState(false);

+  // AI 多语言翻译
+  const [isTranslating, setIsTranslating] = useState(false);
+  const [originalText, setOriginalText] = useState<string | null>(null);
+
  // 在线录音相关
  const [isRecording, setIsRecording] = useState(false);
  const [recordedBlob, setRecordedBlob] = useState<Blob | null>(null);
@@ -210,6 +281,9 @@ export const useHomeController = () => {
  // 文案提取模态框
  const [extractModalOpen, setExtractModalOpen] = useState(false);

+  // AI 改写模态框
+  const [rewriteModalOpen, setRewriteModalOpen] = useState(false);
+
  // 获取存储 key 的前缀（登录用户使用 userId，未登录使用 guest）
  const storageKey = userId || "guest";

@@ -226,11 +300,12 @@ export const useHomeController = () => {
    uploadError,
    setUploadError,
    fetchMaterials,
+    toggleMaterial,
    deleteMaterial,
    handleUpload,
  } = useMaterials({
-    selectedMaterial,
-    setSelectedMaterial,
+    selectedMaterials,
+    setSelectedMaterials,
  });

  const {
@@ -253,8 +328,9 @@ export const useHomeController = () => {
    fetchRefAudios,
    uploadRefAudio,
    deleteRefAudio,
+    retranscribeRefAudio,
+    retranscribingId,
  } = useRefAudios({
-    fixedRefText: FIXED_REF_TEXT,
    selectedRefAudio,
    setSelectedRefAudio,
    setRefText,
@@ -289,13 +365,52 @@ export const useHomeController = () => {
    fetchGeneratedVideos,
    deleteVideo,
  } = useGeneratedVideos({
-
+    storageKey,
    selectedVideoId,
    setSelectedVideoId,
    setGeneratedVideo,
    resolveMediaUrl,
  });

+  const {
+    generatedAudios,
+    selectedAudio,
+    isGeneratingAudio,
+    audioTask,
+    fetchGeneratedAudios,
+    generateAudio,
+    deleteAudio,
+    renameAudio,
+    selectAudio,
+  } = useGeneratedAudios({
+    selectedAudioId,
+    setSelectedAudioId,
+  });
+
+  const {
+    segments: timelineSegments,
+    reorderSegments,
+    setSourceRange,
+    toCustomAssignments,
+  } = useTimelineEditor({
+    audioDuration: selectedAudio?.duration_sec ?? 0,
+    materials,
+    selectedMaterials,
+    storageKey,
+  });
+
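+  // Video URL of the first timeline segment's material (used for the frame-capture preview)
+  // Uses the first segment when the timeline has any; otherwise (e.g. no audio selected) falls back to selectedMaterials[0]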
+  const firstTimelineMaterialUrl = useMemo(() => {
+    const firstSeg = timelineSegments[0];
+    const matId = firstSeg?.materialId ?? selectedMaterials[0];
+    if (!matId) return null;
+    const mat = materials.find((m) => m.id === matId);
+    return mat?.path ? resolveMediaUrl(mat.path) : null;
+  }, [materials, timelineSegments, selectedMaterials]);
+
+  const materialPosterUrl = useVideoFrameCapture(showStylePreview ? firstTimelineMaterialUrl : null);
+
  useEffect(() => {
    if (isAuthLoading || !userId) return;
    let active = true;
@@ -338,28 +453,41 @@ export const useHomeController = () => {
    setText,
    videoTitle,
    setVideoTitle,
-    enableSubtitles,
-    setEnableSubtitles,
+    videoSecondaryTitle,
+    setVideoSecondaryTitle,
    ttsMode,
    setTtsMode,
    voice,
    setVoice,
-    selectedMaterial,
-    setSelectedMaterial,
+    textLang,
+    setTextLang,
+    selectedMaterials,
+    setSelectedMaterials,
    selectedSubtitleStyleId,
    setSelectedSubtitleStyleId,
    selectedTitleStyleId,
    setSelectedTitleStyleId,
+    selectedSecondaryTitleStyleId,
+    setSelectedSecondaryTitleStyleId,
    subtitleFontSize,
    setSubtitleFontSize,
    titleFontSize,
    setTitleFontSize,
+    secondaryTitleFontSize,
+    setSecondaryTitleFontSize,
    setSubtitleSizeLocked,
    setTitleSizeLocked,
+    setSecondaryTitleSizeLocked,
    titleTopMargin,
    setTitleTopMargin,
+    secondaryTitleTopMargin,
+    setSecondaryTitleTopMargin,
+    titleDisplayMode,
+    setTitleDisplayMode,
    subtitleBottomMargin,
    setSubtitleBottomMargin,
+    outputAspectRatio,
+    setOutputAspectRatio,
    selectedBgmId,
    setSelectedBgmId,
    bgmVolume,
@@ -369,8 +497,20 @@ export const useHomeController = () => {
    selectedVideoId,
    setSelectedVideoId,
    selectedRefAudio,
+    selectedAudioId,
+    setSelectedAudioId,
+    speed,
+    setSpeed,
  });

+  const { savedScripts, saveScript, deleteScript: deleteSavedScript } = useSavedScripts(storageKey);
+
+  const handleSaveScript = () => {
+    if (!text.trim()) return;
+    saveScript(text);
+    toast.success("文案已保存");
+  };
+
  const syncTitleToPublish = (value: string) => {
    if (typeof window !== "undefined") {
      localStorage.setItem(`vigent_${storageKey}_publish_title`, value);
@@ -383,6 +523,12 @@ export const useHomeController = () => {
    onCommit: syncTitleToPublish,
  });

+  const secondaryTitleInput = useTitleInput({
+    value: videoSecondaryTitle,
+    onChange: setVideoSecondaryTitle,
+    maxLength: SECONDARY_TITLE_MAX_LENGTH,
+  });
+
  // 加载素材列表和历史视频
  useEffect(() => {
    if (isAuthLoading) return;
@@ -390,6 +536,7 @@ export const useHomeController = () => {
      fetchMaterials(),
      fetchGeneratedVideos(),
      fetchRefAudios(),
+      fetchGeneratedAudios(),
      refreshSubtitleStyles(),
      refreshTitleStyles(),
      fetchBgmList(),
@@ -410,7 +557,8 @@ export const useHomeController = () => {
  }, [isGenerating, currentTask, fetchGeneratedVideos]);

  useEffect(() => {
-    const material = materials.find((item) => item.id === selectedMaterial);
+    const firstSelected = selectedMaterials[0];
+    const material = materials.find((item) => item.id === firstSelected);
    if (!material?.path) {
      setMaterialDimensions(null);
      return;
@@ -423,7 +571,6 @@ export const useHomeController = () => {

    let isActive = true;
    const video = document.createElement("video");
-    video.crossOrigin = "anonymous";
    video.preload = "metadata";
    video.src = url;
    video.load();
@@ -450,7 +597,7 @@ export const useHomeController = () => {
      video.removeEventListener("loadedmetadata", handleLoaded);
      video.removeEventListener("error", handleError);
    };
-  }, [materials, selectedMaterial]);
+  }, [materials, selectedMaterials]);


  useEffect(() => {
@@ -473,11 +620,32 @@ export const useHomeController = () => {
    }
  }, [titleStyles, selectedTitleStyleId, titleSizeLocked]);

+  useEffect(() => {
+    if (secondaryTitleSizeLocked || titleStyles.length === 0) return;
+    const active = titleStyles.find((s) => s.id === selectedSecondaryTitleStyleId)
+      || titleStyles.find((s) => s.is_default)
+      || titleStyles[0];
+    if (active?.font_size) {
+      setSecondaryTitleFontSize(active.font_size);
+    }
+  }, [titleStyles, selectedSecondaryTitleStyleId, secondaryTitleSizeLocked]);
+
  // 移除重复的 BGM 持久化恢复逻辑 (已统一移动到 useHomePersistence 中)
  // useEffect(() => { ... })

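+  // Time gate: disable all automatic list scrolling during the first second after page load
+  // Keeps persistence restore plus async data loading from triggering scrollIntoView and making the page jump on mobile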
+  const scrollEffectsEnabled = useRef(false);
  useEffect(() => {
-    if (!selectedBgmId) return;
+    const timer = setTimeout(() => {
+      scrollEffectsEnabled.current = true;
+    }, 1000);
+    return () => clearTimeout(timer);
+  }, []);
+
+  // BGM 列表滚动
+  useEffect(() => {
+    if (!selectedBgmId || !scrollEffectsEnabled.current) return;
    const container = bgmListContainerRef.current;
    const target = bgmItemRefs.current[selectedBgmId];
    if (container && target) {
@@ -485,13 +653,16 @@ export const useHomeController = () => {
    }
  }, [selectedBgmId, bgmList]);

+  // 素材列表滚动
  useEffect(() => {
-    if (!selectedMaterial) return;
-    const target = materialItemRefs.current[selectedMaterial];
+    const firstSelected = selectedMaterials[0];
+    if (!firstSelected || !scrollEffectsEnabled.current) return;
+    const target = materialItemRefs.current[firstSelected];
    if (target) {
      target.scrollIntoView({ block: "nearest", behavior: "smooth" });
    }
-  }, [selectedMaterial, materials]);
+    // eslint-disable-next-line react-hooks/exhaustive-deps
+  }, [selectedMaterials.length]);

  // 【修复】历史视频默认选中逻辑
  // 当持久化恢复完成，且列表加载完毕，如果没选中任何视频，默认选中第一个
@@ -501,7 +672,7 @@ export const useHomeController = () => {
      setSelectedVideoId(firstId);
      setGeneratedVideo(resolveMediaUrl(generatedVideos[0].path));
    }
-  }, [isRestored, generatedVideos, selectedVideoId, setSelectedVideoId, setGeneratedVideo, resolveMediaUrl]);
+  }, [isRestored, generatedVideos, selectedVideoId, setSelectedVideoId, setGeneratedVideo]);

  // 【修复】BGM 默认选中逻辑
  useEffect(() => {
@@ -510,8 +681,9 @@ export const useHomeController = () => {
    }
  }, [isRestored, bgmList, selectedBgmId, enableBgm, setSelectedBgmId]);

+  // 视频列表滚动
  useEffect(() => {
-    if (!selectedVideoId) return;
+    if (!selectedVideoId || !scrollEffectsEnabled.current) return;
    const target = videoItemRefs.current[selectedVideoId];
    if (target) {
      target.scrollIntoView({ block: "nearest", behavior: "smooth" });
@@ -617,7 +789,7 @@ export const useHomeController = () => {

    setIsGeneratingMeta(true);
    try {
-      const { data: res } = await api.post<ApiResponse<{ title?: string; tags?: string[] }>>(
+      const { data: res } = await api.post<ApiResponse<{ title?: string; secondary_title?: string; tags?: string[] }>>(
        "/api/ai/generate-meta",
        { text: text.trim() }
      );
@@ -627,6 +799,10 @@ export const useHomeController = () => {
      const nextTitle = clampTitle(payload.title || "");
      titleInput.commitValue(nextTitle);

+      // 更新副标题
+      const nextSecondaryTitle = clampSecondaryTitle(payload.secondary_title || "");
+      secondaryTitleInput.commitValue(nextSecondaryTitle);
+
      // 同步到发布页 localStorage
      localStorage.setItem(`vigent_${storageKey}_publish_tags`, JSON.stringify(payload.tags || []));
    } catch (err: unknown) {
@@ -639,20 +815,89 @@ export const useHomeController = () => {
    }
  };

+  // AI 多语言翻译
+  const handleTranslate = async (targetLang: string) => {
+    if (!text.trim()) {
+      toast.error("请先输入口播文案");
+      return;
+    }
+
+    // 首次翻译时保存原文
+    if (originalText === null) {
+      setOriginalText(text);
+    }
+
+    setIsTranslating(true);
+    try {
+      const { data: res } = await api.post<ApiResponse<{ translated_text: string }>>(
+        "/api/ai/translate",
+        { text: text.trim(), target_lang: targetLang }
+      );
+      const payload = unwrap(res);
+      setText(payload.translated_text || "");
+
+      // 根据翻译目标语言更新 textLang 并自动切换声音
+      const locale = LANG_TO_LOCALE[targetLang] || "zh-CN";
+      setTextLang(locale);
+      if (ttsMode === "edgetts") {
+        const langVoices = VOICES[locale] || VOICES["zh-CN"];
+        setVoice(langVoices[0].id);
+      }
+    } catch (err: unknown) {
+      console.error("AI translate failed:", err);
+      const axiosErr = err as { response?: { data?: { message?: string } }; message?: string };
+      const errorMsg = axiosErr.response?.data?.message || axiosErr.message || String(err);
+      toast.error(`AI 翻译失败: ${errorMsg}`);
+    } finally {
+      setIsTranslating(false);
+    }
+  };
+
+  const handleRestoreOriginal = () => {
+    if (originalText !== null) {
+      setText(originalText);
+      setOriginalText(null);
+      setTextLang("zh-CN");
+      if (ttsMode === "edgetts") {
+        setVoice(VOICES["zh-CN"][0].id);
+      }
+    }
+  };
+
+  // 生成配音
+  const handleGenerateAudio = async () => {
+    if (!text.trim()) {
+      toast.error("请先输入文案");
+      return;
+    }
+    if (ttsMode === "voiceclone" && !selectedRefAudio) {
+      toast.error("请选择参考音频");
+      return;
+    }
+
+    const params = {
+      text: text.trim(),
+      tts_mode: ttsMode,
+      voice: ttsMode === "edgetts" ? voice : undefined,
+      ref_audio_id: ttsMode === "voiceclone" ? selectedRefAudio!.id : undefined,
+      ref_text: ttsMode === "voiceclone" ? refText : undefined,
+      language: textLang,
+      speed: ttsMode === "voiceclone" ? speed : undefined,
+    };
+    await generateAudio(params);
+  };
+
  // 生成视频
  const handleGenerate = async () => {
-    if (!selectedMaterial || !text.trim()) {
+    if (selectedMaterials.length === 0 || !text.trim()) {
      toast.error("请先选择素材并填写文案");
      return;
    }

-    // 声音克隆模式校验
-    if (ttsMode === "voiceclone") {
-      if (!selectedRefAudio) {
-        toast.error("请选择或上传参考音频");
+    if (!selectedAudio) {
+      toast.error("请先生成并选中配音");
      return;
    }
-    }

    if (enableBgm && !selectedBgmId) {
      toast.error("请选择背景音乐");
@@ -663,26 +908,81 @@ export const useHomeController = () => {

    try {
      // 查找选中的素材对象以获取路径
-      const materialObj = materials.find((m) => m.id === selectedMaterial);
-      if (!materialObj) {
+      const firstMaterialObj = materials.find((m) => m.id === selectedMaterials[0]);
+      if (!firstMaterialObj) {
        toast.error("素材数据异常");
        return;
      }

-      // 构建请求参数
+      // 构建请求参数 - 使用预生成配音
      const payload: Record<string, unknown> = {
-        material_path: materialObj.path,
-        text: text,
-        tts_mode: ttsMode,
+        material_path: firstMaterialObj.path,
+        text: selectedAudio.text || text,
+        generated_audio_id: selectedAudio.id,
+        language: selectedAudio.language || textLang,
        title: videoTitle.trim() || undefined,
-        enable_subtitles: enableSubtitles,
+        enable_subtitles: true,
+        output_aspect_ratio: outputAspectRatio,
      };

-      if (enableSubtitles && selectedSubtitleStyleId) {
+      // 多素材
+      if (selectedMaterials.length > 1) {
+        const timelineOrderedIds = timelineSegments
+          .map((seg) => seg.materialId)
+          .filter((id, index, arr) => arr.indexOf(id) === index);
+        const orderedMaterialIds = [
+          ...timelineOrderedIds.filter((id) => selectedMaterials.includes(id)),
+          ...selectedMaterials.filter((id) => !timelineOrderedIds.includes(id)),
+        ];
+
+        const materialPaths = orderedMaterialIds
+          .map((id) => materials.find((x) => x.id === id)?.path)
+          .filter((path): path is string => !!path);
+
+        if (materialPaths.length === 0) {
+          toast.error("多素材解析失败，请刷新素材后重试");
+          return;
+        }
+
+        payload.material_paths = materialPaths;
+        payload.material_path = materialPaths[0];
+
+        // Send the custom timeline assignments
+        const assignments = toCustomAssignments();
+        if (assignments.length > 0) {
+          const assignmentPaths = assignments
+            .map((a) => a.material_path)
+            .filter((path): path is string => !!path);
+
+          if (assignmentPaths.length === assignments.length) {
+            // The visible timeline segments take precedence: materials beyond the timeline are excluded from this generation
+            payload.material_paths = assignmentPaths;
+            payload.material_path = assignmentPaths[0];
+          }
+          payload.custom_assignments = assignments;
+        } else {
+          console.warn(
+            "[Timeline] custom_assignments 为空，回退后端自动分配",
+            { materials: materialPaths.length }
+          );
+        }
+      }
+
+      // Single material with a trimmed source range
+      const singleSeg = timelineSegments[0];
+      if (
+        selectedMaterials.length === 1
+        && singleSeg
+        && (singleSeg.sourceStart > 0 || singleSeg.sourceEnd > 0)
+      ) {
+        payload.custom_assignments = toCustomAssignments();
+      }
+
+      if (selectedSubtitleStyleId) {
        payload.subtitle_style_id = selectedSubtitleStyleId;
      }

-      if (enableSubtitles && subtitleFontSize) {
+      if (subtitleFontSize) {
        payload.subtitle_font_size = Math.round(subtitleFontSize);
      }

@@ -694,26 +994,35 @@ export const useHomeController = () => {
        payload.title_font_size = Math.round(titleFontSize);
      }

+      if (videoTitle.trim() || videoSecondaryTitle.trim()) {
+        payload.title_display_mode = titleDisplayMode;
+        if (titleDisplayMode === "short") {
+          payload.title_duration = DEFAULT_SHORT_TITLE_DURATION;
+        }
+      }
+
      if (videoTitle.trim()) {
        payload.title_top_margin = Math.round(titleTopMargin);
      }

-      if (enableSubtitles) {
-        payload.subtitle_bottom_margin = Math.round(subtitleBottomMargin);
+      if (videoSecondaryTitle.trim()) {
+        payload.secondary_title = videoSecondaryTitle.trim();
+        if (selectedSecondaryTitleStyleId) {
+          payload.secondary_title_style_id = selectedSecondaryTitleStyleId;
        }
+        if (secondaryTitleFontSize) {
+          payload.secondary_title_font_size = Math.round(secondaryTitleFontSize);
+        }
+        payload.secondary_title_top_margin = Math.round(secondaryTitleTopMargin);
+      }
+
+      payload.subtitle_bottom_margin = Math.round(subtitleBottomMargin);

      if (enableBgm && selectedBgmId) {
        payload.bgm_id = selectedBgmId;
        payload.bgm_volume = bgmVolume;
      }

-      if (ttsMode === "edgetts") {
-        payload.voice = voice;
-      } else {
-        payload.ref_audio_id = selectedRefAudio!.id;
-        payload.ref_text = refText;
-      }
-
      // 创建生成任务
      const { data: res } = await api.post<ApiResponse<{ task_id: string }>>(
        "/api/videos/generate",
@@ -774,8 +1083,8 @@ export const useHomeController = () => {
    fetchMaterials,
    deleteMaterial,
    handleUpload,
-    selectedMaterial,
-    setSelectedMaterial,
+    selectedMaterials,
+    toggleMaterial,
    handlePreviewMaterial,
    editingMaterialId,
    editMaterialName,
@@ -787,8 +1096,17 @@ export const useHomeController = () => {
    setText,
    extractModalOpen,
    setExtractModalOpen,
+    rewriteModalOpen,
+    setRewriteModalOpen,
    handleGenerateMeta,
    isGeneratingMeta,
+    handleTranslate,
+    isTranslating,
+    originalText,
+    handleRestoreOriginal,
+    savedScripts,
+    handleSaveScript,
+    deleteSavedScript,
    showStylePreview,
    setShowStylePreview,
    videoTitle,
@@ -799,6 +1117,15 @@ export const useHomeController = () => {
    titleFontSize,
    setTitleFontSize,
    setTitleSizeLocked,
+    videoSecondaryTitle,
+    secondaryTitleInput,
+    selectedSecondaryTitleStyleId,
+    setSelectedSecondaryTitleStyleId,
+    secondaryTitleFontSize,
+    setSecondaryTitleFontSize,
+    setSecondaryTitleSizeLocked,
+    secondaryTitleTopMargin,
+    setSecondaryTitleTopMargin,
    subtitleStyles,
    selectedSubtitleStyleId,
    setSelectedSubtitleStyleId,
@@ -807,19 +1134,23 @@ export const useHomeController = () => {
    setSubtitleSizeLocked,
    titleTopMargin,
    setTitleTopMargin,
+    titleDisplayMode,
+    setTitleDisplayMode,
    subtitleBottomMargin,
    setSubtitleBottomMargin,
-    enableSubtitles,
-    setEnableSubtitles,
+    outputAspectRatio,
+    setOutputAspectRatio,
    resolveAssetUrl,
    getFontFormat,
    buildTextShadow,
    materialDimensions,
+    materialPosterUrl,
    ttsMode,
    setTtsMode,
-    voices: VOICES,
+    voices: VOICES[textLang] || VOICES["zh-CN"],
    voice,
    setVoice,
+    textLang,
    refAudios,
    selectedRefAudio,
    handleSelectRefAudio,
@@ -837,6 +1168,8 @@ export const useHomeController = () => {
    saveEditing,
    cancelEditing,
    deleteRefAudio,
+    retranscribeRefAudio,
+    retranscribingId,
    recordedBlob,
    isRecording,
    recordingTime,
@@ -844,7 +1177,6 @@ export const useHomeController = () => {
    stopRecording,
    useRecording,
    formatRecordingTime,
-    fixedRefText: FIXED_REF_TEXT,
    bgmList,
    bgmLoading,
    bgmError,
@@ -870,5 +1202,24 @@ export const useHomeController = () => {
    fetchGeneratedVideos,
    registerVideoRef,
    formatDate,
+    generatedAudios,
+    selectedAudio,
+    selectedAudioId,
+    isGeneratingAudio,
+    audioTask,
+    fetchGeneratedAudios,
+    handleGenerateAudio,
+    deleteAudio,
+    renameAudio,
+    selectAudio,
+    speed,
+    setSpeed,
+    timelineSegments,
+    reorderSegments,
+    setSourceRange,
+    clipTrimmerOpen,
+    setClipTrimmerOpen,
+    clipTrimmerSegmentId,
+    setClipTrimmerSegmentId,
  };
 };
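Review note on the multi-material branch of handleGenerate: the ordering rule can be read as a small pure function. The sketch below is an illustration only, assuming plain string IDs; `orderMaterialIds` is a hypothetical name that does not exist in this diff.

```typescript
// Hypothetical helper, not part of the diff: mirrors the ordering logic in handleGenerate.
function orderMaterialIds(timelineIds: string[], selectedIds: string[]): string[] {
  // Deduplicate the timeline IDs, keeping the first occurrence of each.
  const uniqueTimelineIds = timelineIds.filter((id, i, arr) => arr.indexOf(id) === i);
  return [
    // Timeline order comes first, restricted to materials that are still selected.
    ...uniqueTimelineIds.filter((id) => selectedIds.includes(id)),
    // Selected materials that never appear on the timeline are appended afterwards.
    ...selectedIds.filter((id) => !uniqueTimelineIds.includes(id)),
  ];
}

// "b" appears twice on the timeline and "x" is no longer selected.
console.log(orderMaterialIds(["b", "a", "b", "x"], ["a", "b", "c"])); // [ 'b', 'a', 'c' ]
```

Keeping this rule in one function would make it testable without mounting the hook; whether that refactor is worthwhile is left to the author.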
--- a/frontend/src/features/home/model/useHomePersistence.ts
+++ b/frontend/src/features/home/model/useHomePersistence.ts
@@ -1,5 +1,5 @@
 import { useEffect, useState } from "react";
-import { clampTitle } from "@/shared/lib/title";
+import { clampTitle, clampSecondaryTitle } from "@/shared/lib/title";

 interface RefAudio {
  id: string;
@@ -17,28 +17,41 @@ interface UseHomePersistenceOptions {
  setText: React.Dispatch<React.SetStateAction<string>>;
  videoTitle: string;
  setVideoTitle: React.Dispatch<React.SetStateAction<string>>;
-  enableSubtitles: boolean;
-  setEnableSubtitles: React.Dispatch<React.SetStateAction<boolean>>;
+  videoSecondaryTitle: string;
+  setVideoSecondaryTitle: React.Dispatch<React.SetStateAction<string>>;
  ttsMode: 'edgetts' | 'voiceclone';
  setTtsMode: React.Dispatch<React.SetStateAction<'edgetts' | 'voiceclone'>>;
  voice: string;
  setVoice: React.Dispatch<React.SetStateAction<string>>;
-  selectedMaterial: string;
-  setSelectedMaterial: React.Dispatch<React.SetStateAction<string>>;
+  textLang: string;
+  setTextLang: React.Dispatch<React.SetStateAction<string>>;
+  selectedMaterials: string[];
+  setSelectedMaterials: React.Dispatch<React.SetStateAction<string[]>>;
  selectedSubtitleStyleId: string;
  setSelectedSubtitleStyleId: React.Dispatch<React.SetStateAction<string>>;
  selectedTitleStyleId: string;
  setSelectedTitleStyleId: React.Dispatch<React.SetStateAction<string>>;
+  selectedSecondaryTitleStyleId: string;
+  setSelectedSecondaryTitleStyleId: React.Dispatch<React.SetStateAction<string>>;
  subtitleFontSize: number;
  setSubtitleFontSize: React.Dispatch<React.SetStateAction<number>>;
  titleFontSize: number;
  setTitleFontSize: React.Dispatch<React.SetStateAction<number>>;
+  secondaryTitleFontSize: number;
+  setSecondaryTitleFontSize: React.Dispatch<React.SetStateAction<number>>;
  setSubtitleSizeLocked: React.Dispatch<React.SetStateAction<boolean>>;
  setTitleSizeLocked: React.Dispatch<React.SetStateAction<boolean>>;
+  setSecondaryTitleSizeLocked: React.Dispatch<React.SetStateAction<boolean>>;
  titleTopMargin: number;
  setTitleTopMargin: React.Dispatch<React.SetStateAction<number>>;
+  secondaryTitleTopMargin: number;
+  setSecondaryTitleTopMargin: React.Dispatch<React.SetStateAction<number>>;
+  titleDisplayMode: 'short' | 'persistent';
+  setTitleDisplayMode: React.Dispatch<React.SetStateAction<'short' | 'persistent'>>;
  subtitleBottomMargin: number;
  setSubtitleBottomMargin: React.Dispatch<React.SetStateAction<number>>;
+  outputAspectRatio: '9:16' | '16:9';
+  setOutputAspectRatio: React.Dispatch<React.SetStateAction<'9:16' | '16:9'>>;
  selectedBgmId: string;
  setSelectedBgmId: React.Dispatch<React.SetStateAction<string>>;
  bgmVolume: number;
@@ -48,6 +61,10 @@ interface UseHomePersistenceOptions {
  selectedVideoId: string | null;
  setSelectedVideoId: React.Dispatch<React.SetStateAction<string | null>>;
  selectedRefAudio: RefAudio | null;
+  selectedAudioId: string | null;
+  setSelectedAudioId: React.Dispatch<React.SetStateAction<string | null>>;
+  speed: number;
+  setSpeed: React.Dispatch<React.SetStateAction<number>>;
 }

 export const useHomePersistence = ({
@@ -57,28 +74,41 @@ export const useHomePersistence = ({
  setText,
  videoTitle,
  setVideoTitle,
-  enableSubtitles,
-  setEnableSubtitles,
+  videoSecondaryTitle,
+  setVideoSecondaryTitle,
  ttsMode,
  setTtsMode,
  voice,
  setVoice,
-  selectedMaterial,
-  setSelectedMaterial,
+  textLang,
+  setTextLang,
+  selectedMaterials,
+  setSelectedMaterials,
  selectedSubtitleStyleId,
  setSelectedSubtitleStyleId,
  selectedTitleStyleId,
  setSelectedTitleStyleId,
+  selectedSecondaryTitleStyleId,
+  setSelectedSecondaryTitleStyleId,
  subtitleFontSize,
  setSubtitleFontSize,
  titleFontSize,
  setTitleFontSize,
+  secondaryTitleFontSize,
+  setSecondaryTitleFontSize,
  setSubtitleSizeLocked,
  setTitleSizeLocked,
+  setSecondaryTitleSizeLocked,
  titleTopMargin,
  setTitleTopMargin,
+  secondaryTitleTopMargin,
+  setSecondaryTitleTopMargin,
+  titleDisplayMode,
+  setTitleDisplayMode,
  subtitleBottomMargin,
  setSubtitleBottomMargin,
+  outputAspectRatio,
+  setOutputAspectRatio,
  selectedBgmId,
  setSelectedBgmId,
  bgmVolume,
@@ -88,6 +118,10 @@ export const useHomePersistence = ({
  selectedVideoId,
  setSelectedVideoId,
  selectedRefAudio,
+  selectedAudioId,
+  setSelectedAudioId,
+  speed,
+  setSpeed,
 }: UseHomePersistenceOptions) => {
  const [isRestored, setIsRestored] = useState(false);

@@ -96,30 +130,53 @@ export const useHomePersistence = ({

    const savedText = localStorage.getItem(`vigent_${storageKey}_text`);
    const savedTitle = localStorage.getItem(`vigent_${storageKey}_title`);
-    const savedSubtitles = localStorage.getItem(`vigent_${storageKey}_subtitles`);
+    const savedSecondaryTitle = localStorage.getItem(`vigent_${storageKey}_secondaryTitle`);
    const savedTtsMode = localStorage.getItem(`vigent_${storageKey}_ttsMode`);
    const savedVoice = localStorage.getItem(`vigent_${storageKey}_voice`);
+    const savedTextLang = localStorage.getItem(`vigent_${storageKey}_textLang`);
    const savedMaterial = localStorage.getItem(`vigent_${storageKey}_material`);
    const savedSubtitleStyle = localStorage.getItem(`vigent_${storageKey}_subtitleStyle`);
    const savedTitleStyle = localStorage.getItem(`vigent_${storageKey}_titleStyle`);
+    const savedSecondaryTitleStyle = localStorage.getItem(`vigent_${storageKey}_secondaryTitleStyle`);
    const savedSubtitleFontSize = localStorage.getItem(`vigent_${storageKey}_subtitleFontSize`);
    const savedTitleFontSize = localStorage.getItem(`vigent_${storageKey}_titleFontSize`);
+    const savedSecondaryTitleFontSize = localStorage.getItem(`vigent_${storageKey}_secondaryTitleFontSize`);
    const savedBgmId = localStorage.getItem(`vigent_${storageKey}_bgmId`);
-    const savedSelectedVideoId = localStorage.getItem(`vigent_${storageKey}_selectedVideoId`);
+    const savedSelectedVideoId = localStorage.getItem(`vigent_${storageKey}_latestGeneratedVideoId`)
+      || localStorage.getItem(`vigent_${storageKey}_selectedVideoId`);
+    const savedSelectedAudioId = localStorage.getItem(`vigent_${storageKey}_selectedAudioId`);
    const savedBgmVolume = localStorage.getItem(`vigent_${storageKey}_bgmVolume`);
    const savedEnableBgm = localStorage.getItem(`vigent_${storageKey}_enableBgm`);
    const savedTitleTopMargin = localStorage.getItem(`vigent_${storageKey}_titleTopMargin`);
+    const savedSecondaryTitleTopMargin = localStorage.getItem(`vigent_${storageKey}_secondaryTitleTopMargin`);
+    const savedTitleDisplayMode = localStorage.getItem(`vigent_${storageKey}_titleDisplayMode`);
    const savedSubtitleBottomMargin = localStorage.getItem(`vigent_${storageKey}_subtitleBottomMargin`);
+    const savedOutputAspectRatio = localStorage.getItem(`vigent_${storageKey}_outputAspectRatio`);
+    const savedSpeed = localStorage.getItem(`vigent_${storageKey}_speed`);

    setText(savedText || "大家好，欢迎来到我的频道，今天给大家分享一些有趣的内容。");
    setVideoTitle(savedTitle ? clampTitle(savedTitle) : "");
-    setEnableSubtitles(savedSubtitles !== null ? savedSubtitles === 'true' : true);
+    setVideoSecondaryTitle(savedSecondaryTitle ? clampSecondaryTitle(savedSecondaryTitle) : "");
    setTtsMode((savedTtsMode as 'edgetts' | 'voiceclone') || 'edgetts');
    setVoice(savedVoice || "zh-CN-YunxiNeural");
+    if (savedTextLang) setTextLang(savedTextLang);

-    if (savedMaterial) setSelectedMaterial(savedMaterial);
+    if (savedMaterial) {
+      try {
+        const parsed = JSON.parse(savedMaterial);
+        if (Array.isArray(parsed)) {
+          setSelectedMaterials(parsed.filter((id): id is string => typeof id === "string"));
+        } else {
+          setSelectedMaterials([savedMaterial]);
+        }
+      } catch {
+        // Legacy format: a single plain string
+        setSelectedMaterials([savedMaterial]);
+      }
+    }
    if (savedSubtitleStyle) setSelectedSubtitleStyleId(savedSubtitleStyle);
    if (savedTitleStyle) setSelectedTitleStyleId(savedTitleStyle);
+    if (savedSecondaryTitleStyle) setSelectedSecondaryTitleStyleId(savedSecondaryTitleStyle);

    if (savedSubtitleFontSize) {
      const parsed = parseInt(savedSubtitleFontSize, 10);
@@ -137,41 +194,77 @@ export const useHomePersistence = ({
      }
    }

+    if (savedSecondaryTitleFontSize) {
+      const parsed = parseInt(savedSecondaryTitleFontSize, 10);
+      if (!Number.isNaN(parsed)) {
+        setSecondaryTitleFontSize(parsed);
+        setSecondaryTitleSizeLocked(true);
+      }
+    }
+
    if (savedBgmId) setSelectedBgmId(savedBgmId);
    if (savedBgmVolume) setBgmVolume(parseFloat(savedBgmVolume));
    if (savedEnableBgm !== null) setEnableBgm(savedEnableBgm === 'true');
    if (savedSelectedVideoId) setSelectedVideoId(savedSelectedVideoId);
+    // Clear the cross-page handoff marker once consumed so it does not keep overriding the selection
+    localStorage.removeItem(`vigent_${storageKey}_latestGeneratedVideoId`);
+    if (savedSelectedAudioId) setSelectedAudioId(savedSelectedAudioId);

    if (savedTitleTopMargin) {
      const parsed = parseInt(savedTitleTopMargin, 10);
      if (!Number.isNaN(parsed)) setTitleTopMargin(parsed);
    }
+    if (savedSecondaryTitleTopMargin) {
+      const parsed = parseInt(savedSecondaryTitleTopMargin, 10);
+      if (!Number.isNaN(parsed)) setSecondaryTitleTopMargin(parsed);
+    }
+    if (savedTitleDisplayMode === 'short' || savedTitleDisplayMode === 'persistent') {
+      setTitleDisplayMode(savedTitleDisplayMode);
+    }
    if (savedSubtitleBottomMargin) {
      const parsed = parseInt(savedSubtitleBottomMargin, 10);
      if (!Number.isNaN(parsed)) setSubtitleBottomMargin(parsed);
    }

+    if (savedOutputAspectRatio === '9:16' || savedOutputAspectRatio === '16:9') {
+      setOutputAspectRatio(savedOutputAspectRatio);
+    }
+
+    if (savedSpeed) {
+      const parsed = parseFloat(savedSpeed);
+      if (!Number.isNaN(parsed)) setSpeed(parsed);
+    }
+
    // eslint-disable-next-line react-hooks/set-state-in-effect
    setIsRestored(true);
  }, [
    isAuthLoading,
    setBgmVolume,
    setEnableBgm,
-    setEnableSubtitles,
    setSelectedBgmId,
-    setSelectedMaterial,
+    setSelectedMaterials,
    setSelectedSubtitleStyleId,
    setSelectedTitleStyleId,
+    setSelectedSecondaryTitleStyleId,
    setSelectedVideoId,
+    setSelectedAudioId,
+    setSpeed,
    setSubtitleFontSize,
    setSubtitleSizeLocked,
    setText,
+    setTextLang,
    setTitleFontSize,
    setTitleSizeLocked,
+    setSecondaryTitleFontSize,
+    setSecondaryTitleSizeLocked,
    setTitleTopMargin,
+    setSecondaryTitleTopMargin,
+    setTitleDisplayMode,
    setSubtitleBottomMargin,
+    setOutputAspectRatio,
    setTtsMode,
    setVideoTitle,
+    setVideoSecondaryTitle,
    setVoice,
    storageKey,
  ]);
@@ -193,8 +286,12 @@ export const useHomePersistence = ({
  }, [videoTitle, storageKey, isRestored]);

  useEffect(() => {
-    if (isRestored) localStorage.setItem(`vigent_${storageKey}_subtitles`, String(enableSubtitles));
-  }, [enableSubtitles, storageKey, isRestored]);
+    if (!isRestored) return;
+    const timeout = setTimeout(() => {
+      localStorage.setItem(`vigent_${storageKey}_secondaryTitle`, videoSecondaryTitle);
+    }, 300);
+    return () => clearTimeout(timeout);
+  }, [videoSecondaryTitle, storageKey, isRestored]);

  useEffect(() => {
    if (isRestored) localStorage.setItem(`vigent_${storageKey}_ttsMode`, ttsMode);
@@ -205,10 +302,14 @@ export const useHomePersistence = ({
  }, [voice, storageKey, isRestored]);

  useEffect(() => {
-    if (isRestored && selectedMaterial) {
-      localStorage.setItem(`vigent_${storageKey}_material`, selectedMaterial);
+    if (isRestored) localStorage.setItem(`vigent_${storageKey}_textLang`, textLang);
+  }, [textLang, storageKey, isRestored]);
+
+  useEffect(() => {
+    if (isRestored && selectedMaterials.length > 0) {
+      localStorage.setItem(`vigent_${storageKey}_material`, JSON.stringify(selectedMaterials));
    }
-  }, [selectedMaterial, storageKey, isRestored]);
+  }, [selectedMaterials, storageKey, isRestored]);

  useEffect(() => {
    if (isRestored && selectedSubtitleStyleId) {
@@ -222,6 +323,12 @@ export const useHomePersistence = ({
    }
  }, [selectedTitleStyleId, storageKey, isRestored]);

+  useEffect(() => {
+    if (isRestored && selectedSecondaryTitleStyleId) {
+      localStorage.setItem(`vigent_${storageKey}_secondaryTitleStyle`, selectedSecondaryTitleStyleId);
+    }
+  }, [selectedSecondaryTitleStyleId, storageKey, isRestored]);
+
  useEffect(() => {
    if (isRestored) {
      localStorage.setItem(`vigent_${storageKey}_subtitleFontSize`, String(subtitleFontSize));
@@ -234,18 +341,42 @@ export const useHomePersistence = ({
    }
  }, [titleFontSize, storageKey, isRestored]);

+  useEffect(() => {
+    if (isRestored) {
+      localStorage.setItem(`vigent_${storageKey}_secondaryTitleFontSize`, String(secondaryTitleFontSize));
+    }
+  }, [secondaryTitleFontSize, storageKey, isRestored]);
+
  useEffect(() => {
    if (isRestored) {
      localStorage.setItem(`vigent_${storageKey}_titleTopMargin`, String(titleTopMargin));
    }
  }, [titleTopMargin, storageKey, isRestored]);

+  useEffect(() => {
+    if (isRestored) {
+      localStorage.setItem(`vigent_${storageKey}_secondaryTitleTopMargin`, String(secondaryTitleTopMargin));
+    }
+  }, [secondaryTitleTopMargin, storageKey, isRestored]);
+
+  useEffect(() => {
+    if (isRestored) {
+      localStorage.setItem(`vigent_${storageKey}_titleDisplayMode`, titleDisplayMode);
+    }
+  }, [titleDisplayMode, storageKey, isRestored]);
+
  useEffect(() => {
    if (isRestored) {
      localStorage.setItem(`vigent_${storageKey}_subtitleBottomMargin`, String(subtitleBottomMargin));
    }
  }, [subtitleBottomMargin, storageKey, isRestored]);

+  useEffect(() => {
+    if (isRestored) {
+      localStorage.setItem(`vigent_${storageKey}_outputAspectRatio`, outputAspectRatio);
+    }
+  }, [outputAspectRatio, storageKey, isRestored]);
+
  useEffect(() => {
    if (isRestored) {
      localStorage.setItem(`vigent_${storageKey}_bgmId`, selectedBgmId);
@@ -275,11 +406,26 @@ export const useHomePersistence = ({
    }
  }, [selectedVideoId, storageKey, isRestored]);

+  useEffect(() => {
+    if (!isRestored) return;
+    if (selectedAudioId) {
+      localStorage.setItem(`vigent_${storageKey}_selectedAudioId`, selectedAudioId);
+    } else {
+      localStorage.removeItem(`vigent_${storageKey}_selectedAudioId`);
+    }
+  }, [selectedAudioId, storageKey, isRestored]);
+
  useEffect(() => {
    if (isRestored && selectedRefAudio) {
      localStorage.setItem(`vigent_${storageKey}_refAudioId`, selectedRefAudio.id);
    }
  }, [selectedRefAudio, storageKey, isRestored]);

+  useEffect(() => {
+    if (isRestored) {
+      localStorage.setItem(`vigent_${storageKey}_speed`, String(speed));
+    }
+  }, [speed, storageKey, isRestored]);
+
  return { isRestored };
 };
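Review note on the `_material` key migration: the restore path accepts both the new JSON-array value and the legacy single-string value. A minimal sketch of that rule as a pure function follows. `parseSavedMaterials` is a hypothetical name, and filtering out non-string entries is an extra safeguard assumed here, not something the original restore code is known to do.

```typescript
// Hypothetical helper, not part of the diff: normalizes the persisted selection to string[].
function parseSavedMaterials(saved: string | null): string[] {
  if (!saved) return [];
  try {
    const parsed: unknown = JSON.parse(saved);
    if (Array.isArray(parsed)) {
      // New format: a JSON array of IDs. Drop anything that is not a string (assumed safeguard).
      return parsed.filter((id): id is string => typeof id === "string");
    }
  } catch {
    // Not valid JSON: fall through and treat the value as a legacy single ID.
  }
  // Legacy format, or JSON that is not an array: keep the raw string as the only ID.
  return [saved];
}

console.log(parseSavedMaterials('["m1","m2"]')); // [ 'm1', 'm2' ]
console.log(parseSavedMaterials("m1"));          // [ 'm1' ]
```

A value such as `"123"` parses as a JSON number rather than an array, so it is returned unchanged as a single legacy ID, which matches the `else` branch in the restore effect.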
--- a/frontend/src/features/home/model/useMaterials.ts
+++ b/frontend/src/features/home/model/useMaterials.ts
@@ -2,23 +2,44 @@ import { useCallback, useState } from "react";
 import api from "@/shared/api/axios";
 import { ApiResponse, unwrap } from "@/shared/api/types";
 import { toast } from "sonner";
+import { resolveMediaUrl } from "@/shared/lib/media";
+import type { Material } from "@/shared/types/material";

-interface Material {
-  id: string;
-  name: string;
-  scene: string;
-  size_mb: number;
-  path: string;
+/** Probe video duration from a URL using <video> element */
+function probeVideoDuration(url: string): Promise<number> {
+  return new Promise((resolve) => {
+    const video = document.createElement("video");
+    video.preload = "metadata";
+    video.crossOrigin = "anonymous";
+    const cleanup = () => {
+      video.removeEventListener("loadedmetadata", onMeta);
+      video.removeEventListener("error", onError);
+      video.src = "";
+    };
+    const onMeta = () => {
+      const dur = video.duration;
+      cleanup();
+      resolve(Number.isFinite(dur) ? dur : 0);
+    };
+    const onError = () => {
+      cleanup();
+      resolve(0);
+    };
+    video.addEventListener("loadedmetadata", onMeta);
+    video.addEventListener("error", onError);
+    video.src = url;
+    video.load();
+  });
 }

 interface UseMaterialsOptions {
-  selectedMaterial: string;
-  setSelectedMaterial: React.Dispatch<React.SetStateAction<string>>;
+  selectedMaterials: string[];
+  setSelectedMaterials: React.Dispatch<React.SetStateAction<string[]>>;
 }

 export const useMaterials = ({
-  selectedMaterial,
-  setSelectedMaterial,
+  selectedMaterials,
+  setSelectedMaterials,
 }: UseMaterialsOptions) => {
  const [materials, setMaterials] = useState<Material[]>([]);
  const [fetchError, setFetchError] = useState<string | null>(null);
@@ -41,12 +62,25 @@ export const useMaterials = ({
      setMaterials(nextMaterials);
      setLastMaterialCount(nextMaterials.length);

-      setSelectedMaterial((prev) => {
-        // 如果当前选中的素材在列表中依然存在，保持选中
-        const exists = nextMaterials.some((item) => item.id === prev);
-        if (exists) return prev;
+      // Probe video durations in background
+      if (nextMaterials.length > 0) {
+        Promise.all(
+          nextMaterials.map(async (m) => {
+            const url = resolveMediaUrl(m.path);
+            if (!url) return m;
+            const dur = await probeVideoDuration(url);
+            return { ...m, duration_sec: dur };
+          })
+        ).then((enriched) => setMaterials(enriched));
+      }
+
+      setSelectedMaterials((prev) => {
+        // Keep the selected IDs that still exist
+        const existingIds = new Set(nextMaterials.map((m) => m.id));
+        const kept = prev.filter((id) => existingIds.has(id));
+        if (kept.length > 0) return kept;
        // 否则默认选中第一个
-        return nextMaterials[0]?.id || "";
+        return nextMaterials[0]?.id ? [nextMaterials[0].id] : [];
      });
    } catch (error) {
      console.error("获取素材失败:", error);
@@ -54,29 +88,58 @@ export const useMaterials = ({
    } finally {
      setIsFetching(false);
    }
-  }, [setSelectedMaterial]);
+  }, [setSelectedMaterials]);
+
+  const MAX_MATERIALS = 4;
+
+  const toggleMaterial = useCallback((id: string) => {
+    setSelectedMaterials((prev) => {
+      if (prev.includes(id)) {
+        // The last remaining selection cannot be deselected
+        if (prev.length <= 1) return prev;
+        return prev.filter((x) => x !== id);
+      }
+      if (prev.length >= MAX_MATERIALS) return prev;
+      return [...prev, id];
+    });
+  }, [setSelectedMaterials]);
+
+  const reorderMaterials = useCallback((activeId: string, overId: string) => {
+    setSelectedMaterials((prev) => {
+      const oldIndex = prev.indexOf(activeId);
+      const newIndex = prev.indexOf(overId);
+      if (oldIndex === -1 || newIndex === -1) return prev;
+      const next = [...prev];
+      next.splice(oldIndex, 1);
+      next.splice(newIndex, 0, activeId);
+      return next;
+    });
+  }, [setSelectedMaterials]);

  const deleteMaterial = useCallback(async (materialId: string) => {
    if (!confirm("确定要删除这个素材吗？")) return;
    try {
      await api.delete(`/api/materials/${materialId}`);
      fetchMaterials();
-      if (selectedMaterial === materialId) {
-        setSelectedMaterial("");
+      if (selectedMaterials.includes(materialId)) {
+        setSelectedMaterials((prev) => {
+          const next = prev.filter((id) => id !== materialId);
+          return next;
+        });
      }
    } catch (error) {
      toast.error("删除失败: " + error);
    }
-  }, [fetchMaterials, selectedMaterial, setSelectedMaterial]);
+  }, [fetchMaterials, selectedMaterials, setSelectedMaterials]);

  const handleUpload = useCallback(async (e: React.ChangeEvent<HTMLInputElement>) => {
    const file = e.target.files?.[0];
    if (!file) return;

-    const validTypes = ['.mp4', '.mov', '.avi'];
+    const validTypes = ['.mp4', '.mov', '.avi', '.mkv', '.webm', '.flv', '.wmv', '.m4v', '.ts', '.mts'];
    const ext = file.name.toLowerCase().slice(file.name.lastIndexOf('.'));
    if (!validTypes.includes(ext)) {
-      setUploadError('仅支持 MP4、MOV、AVI 格式');
+      setUploadError('不支持的视频格式');
      return;
    }

@@ -100,7 +163,37 @@ export const useMaterials = ({

      setUploadProgress(100);
      setIsUploading(false);
-      fetchMaterials();
+
+      // Re-fetch the list after upload and auto-select the new material
+      const { data: res } = await api.get<ApiResponse<{ materials: Material[] }>>(
+        `/api/materials?t=${new Date().getTime()}`
+      );
+      const payload = unwrap(res);
+      const nextMaterials = payload.materials || [];
+      setMaterials(nextMaterials);
+      setLastMaterialCount(nextMaterials.length);
+
+      // Probe video durations in background
+      if (nextMaterials.length > 0) {
+        Promise.all(
+          nextMaterials.map(async (m) => {
+            const url = resolveMediaUrl(m.path);
+            if (!url) return m;
+            const dur = await probeVideoDuration(url);
+            return { ...m, duration_sec: dur };
+          })
+        ).then((enriched) => setMaterials(enriched));
+      }
+
+      // Find the newly added material and select only it by default, to avoid accidentally entering multi-material mode
+      const oldIds = new Set(materials.map((m) => m.id));
+      const newIds = nextMaterials.filter((m) => !oldIds.has(m.id)).map((m) => m.id);
+      if (newIds.length > 0) {
+        setSelectedMaterials([newIds[0]]);
+      } else if (nextMaterials[0]?.id) {
+        // Fallback: even if no new item is detected, keep a single-material selection defaulting to the latest one
+        setSelectedMaterials([nextMaterials[0].id]);
+      }
    } catch (err: unknown) {
      console.error("Upload failed:", err);
      setIsUploading(false);
@@ -110,7 +203,7 @@ export const useMaterials = ({
    }

    e.target.value = '';
-  }, [fetchMaterials]);
+  }, [materials, setSelectedMaterials]);

  return {
    materials,
@@ -122,6 +215,8 @@ export const useMaterials = ({
    uploadError,
    setUploadError,
    fetchMaterials,
+    toggleMaterial,
+    reorderMaterials,
    deleteMaterial,
    handleUpload,
  };
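Review note on toggleMaterial and reorderMaterials: both updaters are pure transformations of the previous selection, so their edge cases can be checked outside React. The sketch below restates them as standalone functions. The names `toggle` and `reorder` are hypothetical, and the cap of 4 is copied from MAX_MATERIALS in the diff.

```typescript
const MAX_MATERIALS = 4; // same cap as in the diff

// Hypothetical standalone version of the toggleMaterial updater.
function toggle(prev: string[], id: string): string[] {
  if (prev.includes(id)) {
    // The last remaining selection cannot be removed.
    return prev.length <= 1 ? prev : prev.filter((x) => x !== id);
  }
  // Adding beyond the cap is silently ignored.
  return prev.length >= MAX_MATERIALS ? prev : [...prev, id];
}

// Hypothetical standalone version of the reorderMaterials updater.
function reorder(prev: string[], activeId: string, overId: string): string[] {
  const oldIndex = prev.indexOf(activeId);
  const newIndex = prev.indexOf(overId);
  if (oldIndex === -1 || newIndex === -1) return prev;
  const next = [...prev];
  next.splice(oldIndex, 1);           // remove the dragged item
  next.splice(newIndex, 0, activeId); // re-insert it at the target's original index
  return next;
}

console.log(toggle(["a"], "a"));                 // [ 'a' ]
console.log(reorder(["a", "b", "c"], "a", "c")); // [ 'b', 'c', 'a' ]
```

Because `newIndex` is computed before the removal, dragging an item forward places it after the target, and dragging it backward places it before the target.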
--- a/frontend/src/features/home/model/useRefAudios.ts
+++ b/frontend/src/features/home/model/useRefAudios.ts
@@ -13,14 +13,12 @@ interface RefAudio {
 }

 interface UseRefAudiosOptions {
-  fixedRefText: string;
  selectedRefAudio: RefAudio | null;
  setSelectedRefAudio: React.Dispatch<React.SetStateAction<RefAudio | null>>;
  setRefText: React.Dispatch<React.SetStateAction<string>>;
 }

 export const useRefAudios = ({
-  fixedRefText,
  selectedRefAudio,
  setSelectedRefAudio,
  setRefText,
@@ -28,6 +26,7 @@ export const useRefAudios = ({
  const [refAudios, setRefAudios] = useState<RefAudio[]>([]);
  const [isUploadingRef, setIsUploadingRef] = useState(false);
  const [uploadRefError, setUploadRefError] = useState<string | null>(null);
+  const [retranscribingId, setRetranscribingId] = useState<string | null>(null);

  const fetchRefAudios = useCallback(async () => {
    try {
@@ -42,15 +41,12 @@ export const useRefAudios = ({
  }, []);

  const uploadRefAudio = useCallback(async (file: File) => {
-    const refTextInput = fixedRefText;
-
    setIsUploadingRef(true);
    setUploadRefError(null);

    try {
      const formData = new FormData();
      formData.append('file', file);
-      formData.append('ref_text', refTextInput);

      const { data: res } = await api.post<ApiResponse<RefAudio>>('/api/ref-audios', formData, {
        headers: { 'Content-Type': 'multipart/form-data' },
@@ -68,7 +64,7 @@ export const useRefAudios = ({
      const errorMsg = axiosErr.response?.data?.message || axiosErr.message || String(err);
      setUploadRefError(`上传失败: ${errorMsg}`);
    }
-  }, [fetchRefAudios, fixedRefText, setRefText, setSelectedRefAudio]);
+  }, [fetchRefAudios, setRefText, setSelectedRefAudio]);

  const deleteRefAudio = useCallback(async (audioId: string) => {
    if (!confirm("确定要删除这个参考音频吗？")) return;
@@ -84,6 +80,28 @@ export const useRefAudios = ({
    }
  }, [fetchRefAudios, selectedRefAudio, setRefText, setSelectedRefAudio]);

+  const retranscribeRefAudio = useCallback(async (audioId: string) => {
+    setRetranscribingId(audioId);
+    try {
+      const { data: res } = await api.post<ApiResponse<{ ref_text: string }>>(
+        `/api/ref-audios/${encodeURIComponent(audioId)}/retranscribe`
+      );
+      const payload = unwrap(res);
+      toast.success("识别完成");
+      // Refresh the list and the current selection
+      await fetchRefAudios();
+      if (selectedRefAudio?.id === audioId) {
+        setRefText(payload.ref_text);
+      }
+    } catch (err: unknown) {
+      const axiosErr = err as { response?: { data?: { message?: string } }; message?: string };
+      const errorMsg = axiosErr.response?.data?.message || axiosErr.message || String(err);
+      toast.error(`识别失败: ${errorMsg}`);
+    } finally {
+      setRetranscribingId(null);
+    }
+  }, [fetchRefAudios, selectedRefAudio, setRefText]);
+
  return {
    refAudios,
    isUploadingRef,
@@ -92,5 +110,7 @@ export const useRefAudios = ({
    fetchRefAudios,
    uploadRefAudio,
    deleteRefAudio,
+    retranscribeRefAudio,
+    retranscribingId,
  };
 };
--- a/frontend/src/features/home/model/useSavedScripts.ts
+++ b/frontend/src/features/home/model/useSavedScripts.ts
@@ -0,0 +1,51 @@
+import { useState, useEffect, useRef } from "react";
+
+export interface SavedScript {
+  id: string;
+  name: string;
+  content: string;
+  savedAt: number;
+}
+
+export function useSavedScripts(storageKey: string) {
+  const lsKey = `vigent_${storageKey}_savedScripts`;
+  const lsKeyRef = useRef(lsKey);
+  lsKeyRef.current = lsKey;
+
+  const [savedScripts, setSavedScripts] = useState<SavedScript[]>([]);
+
+  // Re-read from localStorage whenever lsKey changes (e.g. guest → userId)
+  useEffect(() => {
+    try {
+      const raw = localStorage.getItem(lsKey);
+      setSavedScripts(raw ? JSON.parse(raw) : []);
+    } catch {
+      setSavedScripts([]);
+    }
+  }, [lsKey]);
+
+  const saveScript = (content: string) => {
+    const name = content.slice(0, 15).replace(/\n/g, " ") || "未命名";
+    const entry: SavedScript = {
+      id: Date.now().toString(36) + Math.random().toString(36).slice(2, 6),
+      name,
+      content,
+      savedAt: Date.now(),
+    };
+    setSavedScripts((prev) => {
+      const next = [entry, ...prev];
+      localStorage.setItem(lsKeyRef.current, JSON.stringify(next));
+      return next;
+    });
+  };
+
+  const deleteScript = (id: string) => {
+    setSavedScripts((prev) => {
+      const next = prev.filter((s) => s.id !== id);
+      localStorage.setItem(lsKeyRef.current, JSON.stringify(next));
+      return next;
+    });
+  };
+
+  return { savedScripts, saveScript, deleteScript };
+}
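Reviewer note: `useSavedScripts` trusts that whatever `JSON.parse` returns is a `SavedScript[]`. If the stored value is valid JSON but not an array (for example an object written by an older version), later calls such as `prev.filter` would throw. A minimal sketch of a defensive parser follows, assuming a hypothetical helper name `parseSavedScripts` that is not part of this change.

```typescript
interface SavedScript {
  id: string;
  name: string;
  content: string;
  savedAt: number;
}

// Hypothetical helper (not in this diff): returns only well-formed entries,
// or an empty array when the stored value is missing, malformed, or not an array.
function parseSavedScripts(raw: string | null): SavedScript[] {
  if (!raw) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) return [];
    return parsed.filter(
      (item): item is SavedScript =>
        typeof item === "object" &&
        item !== null &&
        typeof item.id === "string" &&
        typeof item.name === "string" &&
        typeof item.content === "string" &&
        typeof item.savedAt === "number"
    );
  } catch {
    // JSON.parse failed: treat as no saved scripts
    return [];
  }
}
```

Inside the hook, the effect body could then become `setSavedScripts(parseSavedScripts(localStorage.getItem(lsKey)))`, keeping the existing `try`/`catch` only for storage access errors.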
--- a/frontend/src/features/home/model/useTimelineEditor.ts
+++ b/frontend/src/features/home/model/useTimelineEditor.ts
@@ -0,0 +1,256 @@
+import { useCallback, useEffect, useRef, useState } from "react";
+import type { Material } from "@/shared/types/material";
+
+export interface TimelineSegment {
+  id: string;
+  materialId: string;
+  materialName: string;
+  start: number;
+  end: number;
+  sourceStart: number;
+  sourceEnd: number;
+  color: string;
+}
+
+export interface CustomAssignment {
+  material_path: string;
+  start: number;
+  end: number;
+  source_start: number;
+  source_end?: number;
+}
+
+const COLORS = ["#8b5cf6", "#ec4899", "#06b6d4", "#f59e0b", "#10b981", "#f97316"];
+
+/** Serializable subset for localStorage */
+interface SegmentSnapshot {
+  materialId: string;
+  start: number;
+  end: number;
+  sourceStart: number;
+  sourceEnd: number;
+}
+
+/** Get effective duration of a segment (clipped range or full material duration) */
+function getEffectiveDuration(
+  seg: { sourceStart: number; sourceEnd: number; materialId: string },
+  mats: Material[]
+): number {
+  const mat = mats.find((m) => m.id === seg.materialId);
+  const matDur = mat?.duration_sec ?? 0;
+  if (seg.sourceEnd > seg.sourceStart) return seg.sourceEnd - seg.sourceStart;
+  if (seg.sourceStart > 0) return Math.max(matDur - seg.sourceStart, 0);
+  return matDur;
+}
+
+/**
+ * Recalculate segment start/end positions based on effective durations.
+ * - Segments placed sequentially by effective duration
+ * - Segments exceeding audioDuration keep their positions (overflow, start >= duration)
+ * - Last visible segment is capped/extended to exactly audioDuration (loop fill)
+ */
+function recalcPositions(
+  segs: TimelineSegment[],
+  mats: Material[],
+  duration: number
+): TimelineSegment[] {
+  if (segs.length === 0 || duration <= 0) return segs;
+
+  const fallbackDur = duration / segs.length;
+  let cursor = 0;
+  const result = segs.map((seg) => {
+    const effDur = getEffectiveDuration(seg, mats);
+    const dur = effDur > 0 ? effDur : fallbackDur;
+    const newSeg = { ...seg, start: cursor, end: cursor + dur };
+    cursor += dur;
+    return newSeg;
+  });
+
+  // Find last segment that starts before audioDuration
+  let lastVisibleIdx = -1;
+  for (let i = result.length - 1; i >= 0; i--) {
+    if (result[i].start < duration) {
+      lastVisibleIdx = i;
+      break;
+    }
+  }
+
+  // Cap/extend last visible segment to exactly audioDuration
+  if (lastVisibleIdx >= 0) {
+    result[lastVisibleIdx] = { ...result[lastVisibleIdx], end: duration };
+  }
+
+  return result;
+}
+
+interface UseTimelineEditorOptions {
+  audioDuration: number;
+  materials: Material[];
+  selectedMaterials: string[];
+  storageKey?: string;
+}
+
+export const useTimelineEditor = ({
+  audioDuration,
+  materials,
+  selectedMaterials,
+  storageKey,
+}: UseTimelineEditorOptions) => {
+  const [segments, setSegments] = useState<TimelineSegment[]>([]);
+  const prevKey = useRef("");
+  const restoredRef = useRef(false);
+
+  // Refs for stable callbacks (avoid recreating on every materials/duration change)
+  const materialsRef = useRef(materials);
+  const audioDurationRef = useRef(audioDuration);
+
+  useEffect(() => {
+    materialsRef.current = materials;
+  }, [materials]);
+
+  useEffect(() => {
+    audioDurationRef.current = audioDuration;
+  }, [audioDuration]);
+
+  // Build a durationsKey so segments re-init when material durations become available
+  const durationsKey = selectedMaterials
+    .map((id) => materials.find((m) => m.id === id)?.duration_sec ?? 0)
+    .join(",");
+
+  // Build a cache key from materials + duration
+  const cacheKey = `${selectedMaterials.join(",")}_${audioDuration.toFixed(1)}`;
+  const lsKey = storageKey ? `vigent_${storageKey}_timeline` : null;
+
+  const initSegments = useCallback(() => {
+    if (selectedMaterials.length === 0 || audioDuration <= 0) {
+      setSegments([]);
+      return;
+    }
+
+    // Try restore from localStorage
+    if (lsKey) {
+      try {
+        const raw = localStorage.getItem(lsKey);
+        if (raw) {
+          const saved = JSON.parse(raw) as { key: string; segments: SegmentSnapshot[] };
+          if (saved.key === cacheKey && saved.segments.length === selectedMaterials.length) {
+            const allMatch = saved.segments.every(
+              (_, i) => saved.segments.some((ss) => ss.materialId === selectedMaterials[i])
+            );
+            if (allMatch) {
+              const restored: TimelineSegment[] = saved.segments.map((s, i) => {
+                const mat = materials.find((m) => m.id === s.materialId);
+                return {
+                  id: `seg-${i}-${Date.now()}`,
+                  materialId: s.materialId,
+                  materialName: mat?.scene || mat?.name || s.materialId,
+                  start: 0,
+                  end: 0,
+                  sourceStart: s.sourceStart,
+                  sourceEnd: s.sourceEnd,
+                  color: COLORS[i % COLORS.length],
+                };
+              });
+              setSegments(recalcPositions(restored, materials, audioDuration));
+              restoredRef.current = true;
+              return;
+            }
+          }
+        }
+      } catch {
+        // ignore parse errors
+      }
+    }
+
+    // Create fresh segments — positions derived by recalcPositions
+    const newSegments: TimelineSegment[] = selectedMaterials.map((matId, i) => {
+      const mat = materials.find((m) => m.id === matId);
+      return {
+        id: `seg-${i}-${Date.now()}`,
+        materialId: matId,
+        materialName: mat?.scene || mat?.name || matId,
+        start: 0,
+        end: 0,
+        sourceStart: 0,
+        sourceEnd: 0,
+        color: COLORS[i % COLORS.length],
+      };
+    });
+
+    setSegments(recalcPositions(newSegments, materials, audioDuration));
+  }, [audioDuration, materials, selectedMaterials, lsKey, cacheKey]);
+
+  // Auto-init when selectedMaterials, audioDuration, or material durations change
+  useEffect(() => {
+    const key = `${selectedMaterials.join(",")}_${audioDuration}_${durationsKey}`;
+    if (key !== prevKey.current) {
+      prevKey.current = key;
+      initSegments();
+    }
+  }, [selectedMaterials, audioDuration, durationsKey, initSegments]);
+
+  // Persist segments to localStorage on change (debounced)
+  useEffect(() => {
+    if (!lsKey || segments.length === 0) return;
+    const timeout = setTimeout(() => {
+      const snapshots: SegmentSnapshot[] = segments.map((s) => ({
+        materialId: s.materialId,
+        start: s.start,
+        end: s.end,
+        sourceStart: s.sourceStart,
+        sourceEnd: s.sourceEnd,
+      }));
+      localStorage.setItem(lsKey, JSON.stringify({ key: cacheKey, segments: snapshots }));
+    }, 300);
+    return () => clearTimeout(timeout);
+  }, [segments, lsKey, cacheKey]);
+
+  const reorderSegments = useCallback(
+    (fromIdx: number, toIdx: number) => {
+      setSegments((prev) => {
+        if (fromIdx < 0 || toIdx < 0 || fromIdx >= prev.length || toIdx >= prev.length) return prev;
+        if (fromIdx === toIdx) return prev;
+        const next = [...prev];
+        // Move the segment: remove from old position, insert at new position
+        const [moved] = next.splice(fromIdx, 1);
+        next.splice(toIdx, 0, moved);
+        return recalcPositions(next, materialsRef.current, audioDurationRef.current);
+      });
+    },
+    []
+  );
+
+  const setSourceRange = useCallback(
+    (id: string, sourceStart: number, sourceEnd: number) => {
+      setSegments((prev) => {
+        const updated = prev.map((s) => (s.id === id ? { ...s, sourceStart, sourceEnd } : s));
+        return recalcPositions(updated, materialsRef.current, audioDurationRef.current);
+      });
+    },
+    []
+  );
+
+  const toCustomAssignments = useCallback((): CustomAssignment[] => {
+    const duration = audioDurationRef.current;
+    return segments
+      .filter((seg) => seg.start < duration)
+      .map((seg) => {
+        const mat = materialsRef.current.find((m) => m.id === seg.materialId);
+        return {
+          material_path: mat?.path || seg.materialId,
+          start: seg.start,
+          end: seg.end,
+          source_start: seg.sourceStart,
+          source_end: seg.sourceEnd > seg.sourceStart ? seg.sourceEnd : undefined,
+        };
+      });
+  }, [segments]);
+
+  return {
+    segments,
+    initSegments,
+    reorderSegments,
+    setSourceRange,
+    toCustomAssignments,
+  };
+};
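Reviewer note: the placement rules in `recalcPositions` (sequential layout, overflow segments left untouched, last visible segment capped or extended to the audio duration) are easier to check on plain numbers. The sketch below is a simplified stand-in that takes effective durations directly instead of segments and materials. The name `layoutSpans` is hypothetical, and unlike the original it returns an empty array for empty input.

```typescript
interface Span {
  start: number;
  end: number;
}

// Simplified stand-in for recalcPositions (hypothetical, not in this diff):
// works on effective durations instead of TimelineSegment/Material objects.
function layoutSpans(durations: number[], audioDuration: number): Span[] {
  if (durations.length === 0 || audioDuration <= 0) return [];

  // Segments with unknown duration (<= 0) get an equal share of the audio
  const fallback = audioDuration / durations.length;
  let cursor = 0;
  const spans: Span[] = durations.map((d) => {
    const dur = d > 0 ? d : fallback;
    const span = { start: cursor, end: cursor + dur };
    cursor += dur;
    return span;
  });

  // Cap or extend the last span that starts before the audio ends;
  // spans starting at or after audioDuration keep their overflow positions.
  for (let i = spans.length - 1; i >= 0; i--) {
    if (spans[i].start < audioDuration) {
      spans[i] = { ...spans[i], end: audioDuration };
      break;
    }
  }
  return spans;
}

// Capped: the third clip would run to 12s but is cut at 10s
console.log(layoutSpans([4, 3, 5], 10));
// Overflow: the third clip starts at 12s and is left as-is
console.log(layoutSpans([6, 6, 6], 10));
// Extended: the second clip is stretched from 4s to 10s (loop fill)
console.log(layoutSpans([2, 2], 10));
```

This matches the filter in `toCustomAssignments`, which drops any segment whose `start` is at or beyond the audio duration before sending assignments to the backend.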
--- a/frontend/src/features/home/model/useVideoFrameCapture.ts
+++ b/frontend/src/features/home/model/useVideoFrameCapture.ts
@@ -0,0 +1,94 @@
+import { useEffect, useState } from "react";
+
+/** The preview window is at most 280px wide, so captured frames never need to exceed this width cap */
+const MAX_CAPTURE_WIDTH = 480;
+
+/**
+ * Captures the frame at 0.1s from the video URL and returns it as a JPEG data URL.
+ * Returns null on failure (the caller falls back to the gradient background).
+ */
+export function useVideoFrameCapture(videoUrl: string | null): string | null {
+  const [frameUrl, setFrameUrl] = useState<string | null>(null);
+
+  useEffect(() => {
+    if (!videoUrl) {
+      setFrameUrl(null);
+      return;
+    }
+
+    let isActive = true;
+    const video = document.createElement("video");
+    video.crossOrigin = "anonymous";
+    video.muted = true;
+    video.preload = "auto";
+    video.playsInline = true;
+
+    const cleanup = () => {
+      video.removeEventListener("loadedmetadata", onLoaded);
+      video.removeEventListener("canplay", onLoaded);
+      video.removeEventListener("seeked", onSeeked);
+      video.removeEventListener("error", onError);
+      video.src = "";
+      video.load();
+    };
+
+    const onSeeked = () => {
+      if (!isActive) return;
+      try {
+        const vw = video.videoWidth;
+        const vh = video.videoHeight;
+        if (!vw || !vh) {
+          if (isActive) setFrameUrl(null);
+          cleanup();
+          return;
+        }
+
+        const scale = Math.min(1, MAX_CAPTURE_WIDTH / vw);
+        const cw = Math.round(vw * scale);
+        const ch = Math.round(vh * scale);
+
+        const canvas = document.createElement("canvas");
+        canvas.width = cw;
+        canvas.height = ch;
+        const ctx = canvas.getContext("2d");
+        if (!ctx) {
+          if (isActive) setFrameUrl(null);
+          cleanup();
+          return;
+        }
+        ctx.drawImage(video, 0, 0, cw, ch);
+        const dataUrl = canvas.toDataURL("image/jpeg", 0.7);
+        if (isActive) setFrameUrl(dataUrl);
+      } catch {
+        if (isActive) setFrameUrl(null);
+      }
+      cleanup();
+    };
+
+    let seeked = false;
+    const onLoaded = () => {
+      if (!isActive || seeked) return;
+      seeked = true;
+      video.currentTime = 0.1;
+    };
+
+    const onError = () => {
+      if (isActive) setFrameUrl(null);
+      cleanup();
+    };
+
+    // Attach listeners first, then set src
+    video.addEventListener("loadedmetadata", onLoaded);
+    video.addEventListener("canplay", onLoaded);
+    video.addEventListener("seeked", onSeeked);
+    video.addEventListener("error", onError);
+    video.src = videoUrl;
+
+    return () => {
+      isActive = false;
+      cleanup();
+    };
+  }, [videoUrl]);
+
+  return frameUrl;
+}
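Reviewer note: the canvas size in `onSeeked` is derived by scaling the video down to at most `MAX_CAPTURE_WIDTH` while never upscaling. A minimal sketch of that arithmetic as a pure function, assuming a hypothetical name `captureSize` that is not part of this change:

```typescript
// Hypothetical helper (not in this diff): same scale computation as onSeeked.
function captureSize(
  videoWidth: number,
  videoHeight: number,
  maxWidth = 480
): { width: number; height: number } {
  // Never upscale: scale is 1 when the video is already narrower than maxWidth
  const scale = Math.min(1, maxWidth / videoWidth);
  return {
    width: Math.round(videoWidth * scale),
    height: Math.round(videoHeight * scale),
  };
}

console.log(captureSize(1920, 1080)); // landscape 1080p is scaled down to 480 wide
console.log(captureSize(320, 240)); // small sources keep their original size
```

Because only the width is capped, a portrait source such as 1080x1920 yields a canvas that is 480 wide and proportionally taller, which preserves the aspect ratio for `object-cover` in the preview.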
--- a/frontend/src/features/home/ui/BgmPanel.tsx
+++ b/frontend/src/features/home/ui/BgmPanel.tsx
@@ -43,7 +43,7 @@ export function BgmPanel({
  return (
    <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
      <div className="flex items-center justify-between mb-4">
-        <h2 className="text-lg font-semibold text-white flex items-center gap-2">🎵 背景音乐</h2>
+        <h2 className="text-lg font-semibold text-white flex items-center gap-2">五、背景音乐</h2>
        <div className="flex items-center gap-2">
          <button
            onClick={onRefresh}
--- a/frontend/src/features/home/ui/ClipTrimmer.tsx
+++ b/frontend/src/features/home/ui/ClipTrimmer.tsx
@@ -0,0 +1,293 @@
+import { useCallback, useEffect, useRef, useState } from "react";
+import { X, Play, Pause } from "lucide-react";
+import type { TimelineSegment } from "@/features/home/model/useTimelineEditor";
+
+interface ClipTrimmerProps {
+  isOpen: boolean;
+  segment: TimelineSegment | null;
+  materialUrl: string | null;
+  onConfirm: (sourceStart: number, sourceEnd: number) => void;
+  onClose: () => void;
+}
+
+function formatSec(sec: number): string {
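+  const m = Math.floor(Math.round(sec * 10) / 600); // round to tenths first so 59.96 shows as 01:00.0, not 00:60.0
+  const s = (Math.round(sec * 10) % 600) / 10;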
+  return `${String(m).padStart(2, "0")}:${s.toFixed(1).padStart(4, "0")}`;
+}
+
+export function ClipTrimmer({
+  isOpen,
+  segment,
+  materialUrl,
+  onConfirm,
+  onClose,
+}: ClipTrimmerProps) {
+  const videoRef = useRef<HTMLVideoElement>(null);
+  const trackRef = useRef<HTMLDivElement>(null);
+  const [duration, setDuration] = useState(0);
+  const [sourceStart, setSourceStart] = useState(0);
+  const [sourceEnd, setSourceEnd] = useState(0);
+  const [currentTime, setCurrentTime] = useState(0);
+  const [isPlaying, setIsPlaying] = useState(false);
+  const [dragging, setDragging] = useState<"start" | "end" | null>(null);
+  const animRef = useRef<number>(0);
+
+  // Reset state when segment changes
+  useEffect(() => {
+    if (segment && isOpen) {
+      setSourceStart(segment.sourceStart);
+      setSourceEnd(segment.sourceEnd);
+      setCurrentTime(segment.sourceStart);
+      setIsPlaying(false);
+    }
+  }, [segment, isOpen]);
+
+  // Track currentTime during playback
+  useEffect(() => {
+    if (!isPlaying || !videoRef.current) return;
+
+    const tick = () => {
+      if (!videoRef.current) return;
+      const t = videoRef.current.currentTime;
+      const end = sourceEnd || duration;
+      if (t >= end) {
+        videoRef.current.pause();
+        videoRef.current.currentTime = sourceStart;
+        setCurrentTime(sourceStart);
+        setIsPlaying(false);
+        return;
+      }
+      setCurrentTime(t);
+      animRef.current = requestAnimationFrame(tick);
+    };
+    animRef.current = requestAnimationFrame(tick);
+    return () => cancelAnimationFrame(animRef.current);
+  }, [isPlaying, sourceStart, sourceEnd, duration]);
+
+  // Seek video when not playing and currentTime changes
+  useEffect(() => {
+    if (videoRef.current && !isPlaying) {
+      videoRef.current.currentTime = currentTime;
+    }
+  }, [currentTime, isPlaying]);
+
+  const handleLoadedMetadata = useCallback(() => {
+    if (videoRef.current) {
+      const dur = videoRef.current.duration;
+      setDuration(dur);
+      if (sourceEnd === 0) {
+        setSourceEnd(dur);
+      }
+    }
+  }, [sourceEnd]);
+
+  const togglePlay = useCallback(() => {
+    if (!videoRef.current || duration === 0) return;
+    if (isPlaying) {
+      videoRef.current.pause();
+      setIsPlaying(false);
+    } else {
+      const end = sourceEnd || duration;
+      if (videoRef.current.currentTime >= end || videoRef.current.currentTime < sourceStart) {
+        videoRef.current.currentTime = sourceStart;
+        setCurrentTime(sourceStart);
+      }
+      videoRef.current.play().catch(() => {});
+      setIsPlaying(true);
+    }
+  }, [isPlaying, sourceStart, sourceEnd, duration]);
+
+  // --- Dual-handle slider logic ---
+  const getPositionFromEvent = useCallback(
+    (clientX: number) => {
+      if (!trackRef.current || duration === 0) return 0;
+      const rect = trackRef.current.getBoundingClientRect();
+      const ratio = Math.max(0, Math.min(1, (clientX - rect.left) / rect.width));
+      return ratio * duration;
+    },
+    [duration]
+  );
+
+  const handleThumbPointerDown = useCallback(
+    (which: "start" | "end", e: React.PointerEvent) => {
+      e.preventDefault();
+      e.stopPropagation();
+      setDragging(which);
+      (e.target as HTMLElement).setPointerCapture(e.pointerId);
+    },
+    []
+  );
+
+  const handleTrackPointerMove = useCallback(
+    (e: React.PointerEvent) => {
+      if (!dragging) return;
+      const pos = getPositionFromEvent(e.clientX);
+      const minGap = 0.5;
+      if (dragging === "start") {
+        const clamped = Math.max(0, Math.min(pos, (sourceEnd || duration) - minGap));
+        setSourceStart(clamped);
+        setCurrentTime(clamped);
+      } else {
+        const clamped = Math.min(duration, Math.max(pos, sourceStart + minGap));
+        setSourceEnd(clamped);
+      }
+    },
+    [dragging, getPositionFromEvent, sourceStart, sourceEnd, duration]
+  );
+
+  const handleTrackPointerUp = useCallback(() => {
+    setDragging(null);
+  }, []);
+
+  const handleConfirm = () => {
+    onConfirm(sourceStart, sourceEnd >= duration ? 0 : sourceEnd);
+  };
+
+  if (!isOpen || !segment) return null;
+
+  const assignedDur = segment.end - segment.start;
+  const effectiveEnd = sourceEnd || duration;
+  const clipDur = effectiveEnd - sourceStart;
+  const startPct = duration > 0 ? (sourceStart / duration) * 100 : 0;
+  const endPct = duration > 0 ? (effectiveEnd / duration) * 100 : 100;
+  const playheadPct = duration > 0 ? (currentTime / duration) * 100 : 0;
+
+  return (
+    <div className="fixed inset-0 z-50 flex items-center justify-center bg-black/60 backdrop-blur-sm" onClick={onClose}>
+      <div
+        className="bg-gray-900 border border-white/10 rounded-2xl w-full max-w-lg mx-4 overflow-hidden"
+        onClick={(e) => e.stopPropagation()}
+      >
+        {/* Header */}
+        <div className="flex items-center justify-between px-5 py-3 border-b border-white/10">
+          <h3 className="text-white font-semibold text-sm">
+            截取设置 - {segment.materialName}
+          </h3>
+          <button onClick={onClose} className="text-gray-400 hover:text-white">
+            <X className="h-4 w-4" />
+          </button>
+        </div>
+
+        {/* Video preview */}
+        <div className="px-5 pt-4">
+          <div className="relative bg-black rounded-lg overflow-hidden aspect-video group">
+            {materialUrl ? (
+              <video
+                ref={videoRef}
+                src={materialUrl}
+                className="w-full h-full object-contain"
+                onLoadedMetadata={handleLoadedMetadata}
+                onEnded={() => setIsPlaying(false)}
+                preload="auto"
+                muted
+              />
+            ) : (
+              <div className="flex items-center justify-center h-full text-gray-500 text-sm">
+                无法加载视频
+              </div>
+            )}
+            {/* Play/Pause overlay */}
+            {materialUrl && (
+              <button
+                onClick={togglePlay}
+                className="absolute inset-0 flex items-center justify-center bg-black/0 hover:bg-black/30 transition-colors"
+              >
+                <div className={`p-3 rounded-full bg-black/60 text-white transition-opacity ${isPlaying ? "opacity-0 group-hover:opacity-100" : "opacity-100"}`}>
+                  {isPlaying ? <Pause className="h-6 w-6" /> : <Play className="h-6 w-6" />}
+                </div>
+              </button>
+            )}
+            <div className="absolute bottom-2 right-2 bg-black/70 text-white text-[10px] px-2 py-0.5 rounded pointer-events-none">
+              {formatSec(currentTime)}
+            </div>
+          </div>
+        </div>
+
+        {/* Dual-handle range slider */}
+        <div className="px-5 py-4 space-y-3">
+          <div className="text-xs text-gray-400 flex justify-between">
+            <span>源视频时长: {duration > 0 ? formatSec(duration) : "加载中..."}</span>
+          </div>
+
+          {/* Custom range track */}
+          <div
+            ref={trackRef}
+            className="relative h-10 cursor-pointer select-none touch-none"
+            onPointerMove={handleTrackPointerMove}
+            onPointerUp={handleTrackPointerUp}
+            onPointerLeave={handleTrackPointerUp}
+          >
+            {/* Background track */}
+            <div className="absolute top-1/2 -translate-y-1/2 left-0 right-0 h-2 bg-white/10 rounded-full" />
+
+            {/* Selected range */}
+            <div
+              className="absolute top-1/2 -translate-y-1/2 h-2 rounded-full"
+              style={{
+                left: `${startPct}%`,
+                width: `${endPct - startPct}%`,
+                backgroundColor: segment.color + "88",
+              }}
+            />
+
+            {/* Playhead indicator */}
+            {duration > 0 && (
+              <div
+                className="absolute top-1/2 -translate-y-1/2 w-0.5 h-4 bg-white/60 rounded-full pointer-events-none"
+                style={{ left: `${playheadPct}%` }}
+              />
+            )}
+
+            {/* Start thumb */}
+            <div
+              onPointerDown={(e) => handleThumbPointerDown("start", e)}
+              className="absolute top-1/2 -translate-y-1/2 -translate-x-1/2 w-5 h-5 rounded-full bg-purple-500 border-2 border-white shadow-lg cursor-grab active:cursor-grabbing hover:scale-110 transition-transform z-10"
+              style={{ left: `${startPct}%` }}
+              title={`起点: ${formatSec(sourceStart)}`}
+            />
+
+            {/* End thumb */}
+            <div
+              onPointerDown={(e) => handleThumbPointerDown("end", e)}
+              className="absolute top-1/2 -translate-y-1/2 -translate-x-1/2 w-5 h-5 rounded-full bg-pink-500 border-2 border-white shadow-lg cursor-grab active:cursor-grabbing hover:scale-110 transition-transform z-10"
+              style={{ left: `${endPct}%` }}
+              title={`终点: ${formatSec(effectiveEnd)}`}
+            />
+          </div>
+
+          {/* Time labels */}
+          <div className="flex justify-between text-xs text-gray-400">
+            <span className="text-purple-400">{formatSec(sourceStart)}</span>
+            <span className="text-pink-400">{formatSec(effectiveEnd)}</span>
+          </div>
+
+          {/* Info */}
+          <div className="text-[11px] text-gray-500 flex items-center gap-2 flex-wrap">
+            <span>截取: {clipDur.toFixed(1)}s</span>
+            <span className="text-gray-600">|</span>
+            <span>分配: {assignedDur.toFixed(1)}s</span>
+            {clipDur < assignedDur && <span className="text-amber-500">(将循环补足)</span>}
+            {clipDur > assignedDur && <span className="text-cyan-500">(将截断)</span>}
+          </div>
+        </div>
+
+        {/* Actions */}
+        <div className="flex justify-end gap-2 px-5 pb-4">
+          <button
+            onClick={onClose}
+            className="px-4 py-1.5 text-xs bg-white/10 hover:bg-white/20 rounded-lg text-gray-300 transition-colors"
+          >
+            取消
+          </button>
+          <button
+            onClick={handleConfirm}
+            className="px-4 py-1.5 text-xs bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-700 hover:to-pink-700 text-white rounded-lg transition-colors"
+          >
+            确定
+          </button>
+        </div>
+      </div>
+    </div>
+  );
+}
--- a/frontend/src/features/home/ui/FloatingStylePreview.tsx
+++ b/frontend/src/features/home/ui/FloatingStylePreview.tsx
@@ -35,9 +35,13 @@ interface TitleStyleOption {
 interface FloatingStylePreviewProps {
  onClose: () => void;
  videoTitle: string;
+  videoSecondaryTitle: string;
  titleStyles: TitleStyleOption[];
  selectedTitleStyleId: string;
  titleFontSize: number;
+  selectedSecondaryTitleStyleId: string;
+  secondaryTitleFontSize: number;
+  secondaryTitleTopMargin: number;
  subtitleStyles: SubtitleStyleOption[];
  selectedSubtitleStyleId: string;
  subtitleFontSize: number;
@@ -49,16 +53,22 @@ interface FloatingStylePreviewProps {
  buildTextShadow: (color: string, size: number) => string;
  previewBaseWidth: number;
  previewBaseHeight: number;
+  previewBackgroundUrl?: string | null;
 }

 const DESKTOP_WIDTH = 280;
+const MOBILE_WIDTH = 160;

 export function FloatingStylePreview({
  onClose,
  videoTitle,
+  videoSecondaryTitle,
  titleStyles,
  selectedTitleStyleId,
  titleFontSize,
+  selectedSecondaryTitleStyleId,
+  secondaryTitleFontSize,
+  secondaryTitleTopMargin,
  subtitleStyles,
  selectedSubtitleStyleId,
  subtitleFontSize,
@@ -70,11 +80,10 @@ export function FloatingStylePreview({
  buildTextShadow,
  previewBaseWidth,
  previewBaseHeight,
+  previewBackgroundUrl,
 }: FloatingStylePreviewProps) {
  const isMobile = typeof window !== "undefined" && window.innerWidth < 640;
-  const windowWidth = isMobile
-    ? Math.min(window.innerWidth - 32, 360)
-    : DESKTOP_WIDTH;
+  const windowWidth = isMobile ? MOBILE_WIDTH : DESKTOP_WIDTH;

  useEffect(() => {
    const handleKeyDown = (e: KeyboardEvent) => {
@@ -86,6 +95,8 @@ export function FloatingStylePreview({

  const previewScale = windowWidth / previewBaseWidth;
  const previewHeight = previewBaseHeight * previewScale;
+  const widthScale = Math.min(1, previewBaseWidth / 1080);
+  const responsiveScale = Math.max(0.55, widthScale);

  const activeSubtitleStyle = subtitleStyles.find((s) => s.id === selectedSubtitleStyleId)
    || subtitleStyles.find((s) => s.is_default)
@@ -102,8 +113,8 @@ export function FloatingStylePreview({
  const subtitleHighlightColor = activeSubtitleStyle?.highlight_color || "#FFE600";
  const subtitleNormalColor = activeSubtitleStyle?.normal_color || "#FFFFFF";
  const subtitleStrokeColor = activeSubtitleStyle?.stroke_color || "#000000";
-  const subtitleStrokeSize = activeSubtitleStyle?.stroke_size ?? 3;
-  const subtitleLetterSpacing = activeSubtitleStyle?.letter_spacing ?? 2;
+  const subtitleStrokeSize = Math.max(1, Math.round((activeSubtitleStyle?.stroke_size ?? 3) * responsiveScale));
+  const subtitleLetterSpacing = Math.max(0, (activeSubtitleStyle?.letter_spacing ?? 2) * responsiveScale);
  const subtitleFontFamilyName = `SubtitlePreview-${activeSubtitleStyle?.id || "default"}`;
  const subtitleFontUrl = activeSubtitleStyle?.font_file
    ? resolveAssetUrl(`fonts/${activeSubtitleStyle.font_file}`)
@@ -111,23 +122,45 @@ export function FloatingStylePreview({

  const titleColor = activeTitleStyle?.color || "#FFFFFF";
  const titleStrokeColor = activeTitleStyle?.stroke_color || "#000000";
-  const titleStrokeSize = activeTitleStyle?.stroke_size ?? 8;
-  const titleLetterSpacing = activeTitleStyle?.letter_spacing ?? 4;
+  const titleStrokeSize = Math.max(1, Math.round((activeTitleStyle?.stroke_size ?? 8) * responsiveScale));
+  const titleLetterSpacing = Math.max(0, (activeTitleStyle?.letter_spacing ?? 4) * responsiveScale);
  const titleFontWeight = activeTitleStyle?.font_weight ?? 900;
  const titleFontFamilyName = `TitlePreview-${activeTitleStyle?.id || "default"}`;
  const titleFontUrl = activeTitleStyle?.font_file
    ? resolveAssetUrl(`fonts/${activeTitleStyle.font_file}`)
    : null;

+  const scaledTitleFontSize = Math.max(36, Math.round(titleFontSize * responsiveScale));
+  const scaledSubtitleFontSize = Math.max(28, Math.round(subtitleFontSize * responsiveScale));
+  const scaledTitleTopMargin = Math.max(0, Math.round(titleTopMargin * responsiveScale));
+  const scaledSubtitleBottomMargin = Math.max(0, Math.round(subtitleBottomMargin * responsiveScale));
+
+  // Secondary title style
+  const activeSecondaryTitleStyle = titleStyles.find((s) => s.id === selectedSecondaryTitleStyleId)
+    || activeTitleStyle;
+  const stColor = activeSecondaryTitleStyle?.color || "#FFFFFF";
+  const stStrokeColor = activeSecondaryTitleStyle?.stroke_color || "#000000";
+  const stStrokeSize = Math.max(1, Math.round((activeSecondaryTitleStyle?.stroke_size ?? 6) * responsiveScale));
+  const stLetterSpacing = Math.max(0, (activeSecondaryTitleStyle?.letter_spacing ?? 2) * responsiveScale);
+  const stFontWeight = activeSecondaryTitleStyle?.font_weight ?? 700;
+  const stFontFamilyName = `SecondaryTitlePreview-${activeSecondaryTitleStyle?.id || "default"}`;
+  const stFontUrl = activeSecondaryTitleStyle?.font_file
+    ? resolveAssetUrl(`fonts/${activeSecondaryTitleStyle.font_file}`)
+    : null;
+  const scaledSecondaryTitleFontSize = Math.max(24, Math.round(secondaryTitleFontSize * responsiveScale));
+  const scaledSecondaryTitleTopMargin = Math.max(0, Math.round(secondaryTitleTopMargin * responsiveScale));
+  const previewSecondaryTitleText = videoSecondaryTitle.trim();
+
  const content = (
    <div
      style={{
        position: "fixed",
-        left: "16px",
-        top: "16px",
+        ...(isMobile
+          ? { right: "12px", bottom: "12px" }
+          : { left: "16px", top: "16px" }),
        width: `${windowWidth}px`,
        zIndex: 150,
-        maxHeight: "calc(100dvh - 32px)",
+        maxHeight: isMobile ? "calc(50dvh)" : "calc(100dvh - 32px)",
        overflow: "hidden",
      }}
      className="rounded-xl border border-white/20 bg-gray-900/95 backdrop-blur-md shadow-2xl"
@@ -152,13 +185,18 @@ export function FloatingStylePreview({
        className="relative overflow-hidden rounded-b-xl"
        style={{ height: `${previewHeight}px` }}
      >
-        {(titleFontUrl || subtitleFontUrl) && (
+        {(titleFontUrl || subtitleFontUrl || stFontUrl) && (
          <style>{`
            ${titleFontUrl ? `@font-face { font-family: '${titleFontFamilyName}'; src: url('${titleFontUrl}') format('${getFontFormat(activeTitleStyle?.font_file)}'); font-weight: 400; font-style: normal; }` : ''}
+            ${stFontUrl && stFontUrl !== titleFontUrl ? `@font-face { font-family: '${stFontFamilyName}'; src: url('${stFontUrl}') format('${getFontFormat(activeSecondaryTitleStyle?.font_file)}'); font-weight: 400; font-style: normal; }` : ''}
            ${subtitleFontUrl ? `@font-face { font-family: '${subtitleFontFamilyName}'; src: url('${subtitleFontUrl}') format('${getFontFormat(activeSubtitleStyle?.font_file)}'); font-weight: 400; font-style: normal; }` : ''}
          `}</style>
        )}
+        {previewBackgroundUrl ? (
+          <img src={previewBackgroundUrl} alt="" className="absolute inset-0 w-full h-full object-cover" />
+        ) : (
          <div className="absolute inset-0 opacity-20 bg-gradient-to-br from-purple-500/40 via-transparent to-pink-500/30" />
+        )}
        <div
          className="absolute top-0 left-0"
          style={{
@@ -172,11 +210,20 @@ export function FloatingStylePreview({
            className="w-full text-center"
            style={{
              position: 'absolute',
-              top: `${titleTopMargin}px`,
+              top: `${scaledTitleTopMargin}px`,
              left: 0,
              right: 0,
+              display: 'flex',
+              flexDirection: 'column',
+              alignItems: 'center',
+              padding: '0 5%',
+              boxSizing: 'border-box',
+            }}
+          >
+            <div
+              style={{
                color: titleColor,
-              fontSize: `${titleFontSize}px`,
+                fontSize: `${scaledTitleFontSize}px`,
                fontWeight: titleFontWeight,
                fontFamily: titleFontUrl
                  ? `'${titleFontFamilyName}', "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Noto Sans SC", sans-serif`
@@ -184,27 +231,57 @@ export function FloatingStylePreview({
                textShadow: buildTextShadow(titleStrokeColor, titleStrokeSize),
                letterSpacing: `${titleLetterSpacing}px`,
                lineHeight: 1.2,
+                whiteSpace: 'normal',
+                wordBreak: 'break-word',
+                overflowWrap: 'anywhere',
                opacity: videoTitle.trim() ? 1 : 0.7,
-              padding: '0 5%',
              }}
            >
              {previewTitleText}
            </div>
+            {previewSecondaryTitleText && (
+              <div
+                style={{
+                  marginTop: `${scaledSecondaryTitleTopMargin}px`,
+                  color: stColor,
+                  fontSize: `${scaledSecondaryTitleFontSize}px`,
+                  fontWeight: stFontWeight,
+                  fontFamily: stFontUrl && stFontUrl !== titleFontUrl
+                    ? `'${stFontFamilyName}', "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Noto Sans SC", sans-serif`
+                    : titleFontUrl
+                      ? `'${titleFontFamilyName}', "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Noto Sans SC", sans-serif`
+                      : '"PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Noto Sans SC", sans-serif',
+                  textShadow: buildTextShadow(stStrokeColor, stStrokeSize),
+                  letterSpacing: `${stLetterSpacing}px`,
+                  lineHeight: 1.2,
+                  whiteSpace: 'normal',
+                  wordBreak: 'break-word',
+                  overflowWrap: 'anywhere',
+                }}
+              >
+                {previewSecondaryTitleText}
+              </div>
+            )}
+          </div>

          <div
            className="w-full text-center"
            style={{
              position: 'absolute',
-              bottom: `${subtitleBottomMargin}px`,
+              bottom: `${scaledSubtitleBottomMargin}px`,
              left: 0,
              right: 0,
-              fontSize: `${subtitleFontSize}px`,
+              fontSize: `${scaledSubtitleFontSize}px`,
              fontFamily: subtitleFontUrl
                ? `'${subtitleFontFamilyName}', "PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Noto Sans SC", sans-serif`
                : '"PingFang SC", "Hiragino Sans GB", "Microsoft YaHei", "Noto Sans SC", sans-serif',
              textShadow: buildTextShadow(subtitleStrokeColor, subtitleStrokeSize),
              letterSpacing: `${subtitleLetterSpacing}px`,
              lineHeight: 1.35,
+              whiteSpace: 'normal',
+              wordBreak: 'break-word',
+              overflowWrap: 'anywhere',
+              boxSizing: 'border-box',
              padding: '0 6%',
            }}
          >
--- a/frontend/src/features/home/ui/GenerateActionBar.tsx
+++ b/frontend/src/features/home/ui/GenerateActionBar.tsx
@@ -4,6 +4,7 @@ interface GenerateActionBarProps {
  isGenerating: boolean;
  progress: number;
  disabled: boolean;
+  materialCount?: number;
  onGenerate: () => void;
 }

@@ -11,9 +12,11 @@ export function GenerateActionBar({
  isGenerating,
  progress,
  disabled,
+  materialCount = 1,
  onGenerate,
 }: GenerateActionBarProps) {
  return (
+    <div>
      <button
        onClick={onGenerate}
        disabled={disabled}
@@ -49,5 +52,11 @@ export function GenerateActionBar({
          </span>
        )}
      </button>
+      {!isGenerating && materialCount >= 2 && (
+        <p className="text-xs text-gray-400 text-center mt-1.5">
+          多素材模式 ({materialCount} 个机位)，生成耗时较长
+        </p>
+      )}
+    </div>
  );
 }
--- a/frontend/src/features/home/ui/GeneratedAudiosPanel.tsx
+++ b/frontend/src/features/home/ui/GeneratedAudiosPanel.tsx
@@ -0,0 +1,362 @@
+import { useState, useRef, useCallback, useEffect } from "react";
+import { Play, Pause, Pencil, Trash2, Check, X, RefreshCw, Mic, ChevronDown } from "lucide-react";
+import type { GeneratedAudio } from "@/features/home/model/useGeneratedAudios";
+
+interface AudioTask {
+  status: string;
+  progress?: number;
+  message?: string;
+}
+
+interface GeneratedAudiosPanelProps {
+  generatedAudios: GeneratedAudio[];
+  selectedAudioId: string | null;
+  isGeneratingAudio: boolean;
+  audioTask: AudioTask | null;
+  onGenerateAudio: () => void;
+  onRefresh: () => void;
+  onSelectAudio: (audio: GeneratedAudio) => void;
+  onDeleteAudio: (id: string) => void;
+  onRenameAudio: (id: string, newName: string) => void;
+  hasText: boolean;
+  missingRefAudio?: boolean;
+  speed: number;
+  onSpeedChange: (speed: number) => void;
+  ttsMode: string;
+  embedded?: boolean;
+}
+
+export function GeneratedAudiosPanel({
+  generatedAudios,
+  selectedAudioId,
+  isGeneratingAudio,
+  audioTask,
+  onGenerateAudio,
+  onRefresh,
+  onSelectAudio,
+  onDeleteAudio,
+  onRenameAudio,
+  hasText,
+  missingRefAudio = false,
+  speed,
+  onSpeedChange,
+  ttsMode,
+  embedded = false,
+}: GeneratedAudiosPanelProps) {
+  const [editingId, setEditingId] = useState<string | null>(null);
+  const [editName, setEditName] = useState("");
+  const [playingId, setPlayingId] = useState<string | null>(null);
+  const [speedOpen, setSpeedOpen] = useState(false);
+  const audioRef = useRef<HTMLAudioElement | null>(null);
+  const speedRef = useRef<HTMLDivElement>(null);
+
+  const stopPlaying = useCallback(() => {
+    if (audioRef.current) {
+      audioRef.current.pause();
+      audioRef.current.currentTime = 0;
+      audioRef.current = null;
+    }
+    setPlayingId(null);
+  }, []);
+
+  // Cleanup on unmount
+  useEffect(() => {
+    return () => {
+      if (audioRef.current) {
+        audioRef.current.pause();
+        audioRef.current = null;
+      }
+    };
+  }, []);
+
+  // Close speed dropdown on click outside
+  useEffect(() => {
+    const handler = (e: MouseEvent) => {
+      if (speedRef.current && !speedRef.current.contains(e.target as Node)) {
+        setSpeedOpen(false);
+      }
+    };
+    if (speedOpen) document.addEventListener("mousedown", handler);
+    return () => document.removeEventListener("mousedown", handler);
+  }, [speedOpen]);
+
+  const togglePlay = (audio: GeneratedAudio, e: React.MouseEvent) => {
+    e.stopPropagation();
+    if (playingId === audio.id) {
+      stopPlaying();
+      return;
+    }
+    stopPlaying();
+    const player = new Audio(audio.path);
+    player.onended = () => setPlayingId(null);
+    player.play().catch(() => setPlayingId((id) => (id === audio.id ? null : id)));
+    audioRef.current = player;
+    setPlayingId(audio.id);
+  };
+
+  const startEditing = (audio: GeneratedAudio, e: React.MouseEvent) => {
+    e.stopPropagation();
+    setEditingId(audio.id);
+    setEditName(audio.name);
+  };
+
+  const saveEditing = (audioId: string, e: React.MouseEvent) => {
+    e.stopPropagation();
+    if (!editName.trim()) return;
+    onRenameAudio(audioId, editName.trim());
+    setEditingId(null);
+    setEditName("");
+  };
+
+  const cancelEditing = (e: React.MouseEvent) => {
+    e.stopPropagation();
+    setEditingId(null);
+    setEditName("");
+  };
+
+  const canGenerate = hasText && !missingRefAudio;
+
+  const speedOptions = [
+    { value: 0.8, label: "较慢" },
+    { value: 0.9, label: "稍慢" },
+    { value: 1.0, label: "正常" },
+    { value: 1.1, label: "稍快" },
+    { value: 1.2, label: "较快" },
+  ] as const;
+  const currentSpeedLabel = speedOptions.find((o) => o.value === speed)?.label ?? "正常";
+
+  const content = (
+    <>
+      {embedded ? (
+        <>
+          {/* Row 1: speed selector + generate-voiceover button (right-aligned) */}
+          <div className="flex justify-end items-center gap-1.5 mb-3">
+            {ttsMode === "voiceclone" && (
+              <div ref={speedRef} className="relative">
+                <button
+                  onClick={() => setSpeedOpen((v) => !v)}
+                  className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1 transition-all"
+                >
+                  语速: {currentSpeedLabel}
+                  <ChevronDown className={`h-3 w-3 transition-transform ${speedOpen ? "rotate-180" : ""}`} />
+                </button>
+                {speedOpen && (
+                  <div className="absolute right-0 top-full mt-1 bg-gray-800 border border-white/20 rounded-lg shadow-xl py-1 z-50 min-w-[80px]">
+                    {speedOptions.map((opt) => (
+                      <button
+                        key={opt.value}
+                        onClick={() => { onSpeedChange(opt.value); setSpeedOpen(false); }}
+                        className={`w-full text-left px-3 py-1.5 text-xs transition-colors ${
+                          speed === opt.value
+                            ? "bg-purple-600/40 text-purple-200"
+                            : "text-gray-300 hover:bg-white/10"
+                        }`}
+                      >
+                        {opt.label}
+                      </button>
+                    ))}
+                  </div>
+                )}
+              </div>
+            )}
+            <button
+              onClick={onGenerateAudio}
+              disabled={isGeneratingAudio || !canGenerate}
+              title={missingRefAudio ? "请先选择参考音频" : !hasText ? "请先输入文案" : ""}
+              className={`px-4 py-2 text-sm font-medium rounded-lg transition-all whitespace-nowrap flex items-center gap-1.5 shadow-sm ${
+                isGeneratingAudio || !canGenerate
+                  ? "bg-gray-600 cursor-not-allowed text-gray-400"
+                  : "bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-700 hover:to-pink-700 text-white hover:shadow-md"
+              }`}
+            >
+              <Mic className="h-4 w-4" />
+              生成配音
+            </button>
+          </div>
+          {/* Row 2: voiceover list heading + refresh button */}
+          <div className="flex justify-between items-center mb-3">
+            <h3 className="text-sm font-medium text-gray-400">配音列表</h3>
+            <button
+              onClick={onRefresh}
+              className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1"
+            >
+              <RefreshCw className="h-3.5 w-3.5" />
+              刷新
+            </button>
+          </div>
+        </>
+      ) : (
+        <div className="flex justify-between items-center gap-2 mb-4">
+          <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2 whitespace-nowrap">
+            <Mic className="h-4 w-4 text-purple-400" />
+            配音列表
+          </h2>
+          <div className="flex gap-1.5">
+            {ttsMode === "voiceclone" && (
+              <div ref={speedRef} className="relative">
+                <button
+                  onClick={() => setSpeedOpen((v) => !v)}
+                  className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1 transition-all"
+                >
+                  语速: {currentSpeedLabel}
+                  <ChevronDown className={`h-3 w-3 transition-transform ${speedOpen ? "rotate-180" : ""}`} />
+                </button>
+                {speedOpen && (
+                  <div className="absolute right-0 top-full mt-1 bg-gray-800 border border-white/20 rounded-lg shadow-xl py-1 z-50 min-w-[80px]">
+                    {speedOptions.map((opt) => (
+                      <button
+                        key={opt.value}
+                        onClick={() => { onSpeedChange(opt.value); setSpeedOpen(false); }}
+                        className={`w-full text-left px-3 py-1.5 text-xs transition-colors ${
+                          speed === opt.value
+                            ? "bg-purple-600/40 text-purple-200"
+                            : "text-gray-300 hover:bg-white/10"
+                        }`}
+                      >
+                        {opt.label}
+                      </button>
+                    ))}
+                  </div>
+                )}
+              </div>
+            )}
+            <button
+              onClick={onGenerateAudio}
+              disabled={isGeneratingAudio || !canGenerate}
+              title={missingRefAudio ? "请先选择参考音频" : !hasText ? "请先输入文案" : ""}
+              className={`px-4 py-2 text-sm font-medium rounded-lg transition-all whitespace-nowrap flex items-center gap-1.5 shadow-sm ${
+                isGeneratingAudio || !canGenerate
+                  ? "bg-gray-600 cursor-not-allowed text-gray-400"
+                  : "bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-700 hover:to-pink-700 text-white hover:shadow-md"
+              }`}
+            >
+              <Mic className="h-4 w-4" />
+              生成配音
+            </button>
+            <button
+              onClick={onRefresh}
+              className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1"
+            >
+              <RefreshCw className="h-3.5 w-3.5" />
+              刷新
+            </button>
+          </div>
+        </div>
+      )}
+
+      {/* Warning shown when no reference audio is selected */}
+      {missingRefAudio && (
+        <div className="mb-3 px-3 py-2 bg-yellow-500/10 border border-yellow-500/30 rounded-lg text-yellow-300 text-xs">
+          声音克隆模式需要先选择参考音频
+        </div>
+      )}
+
+      {/* Generation progress */}
+      {isGeneratingAudio && audioTask && (
+        <div className="mb-4 p-3 bg-purple-500/10 rounded-xl border border-purple-500/30">
+          <div className="flex justify-between text-sm text-purple-300 mb-2">
+            <span>{audioTask.message || "生成中..."}</span>
+            <span>{audioTask.progress || 0}%</span>
+          </div>
+          <div className="h-2 bg-black/30 rounded-full overflow-hidden">
+            <div
+              className="h-full bg-gradient-to-r from-purple-500 to-pink-500 transition-all duration-300"
+              style={{ width: `${audioTask.progress || 0}%` }}
+            />
+          </div>
+        </div>
+      )}
+
+      {/* Voiceover list */}
+      {generatedAudios.length === 0 ? (
+        <div className="text-center py-6 text-gray-400">
+          <p className="text-sm">暂无配音</p>
+          <p className="text-xs mt-1 text-gray-500">点击「生成配音」创建</p>
+        </div>
+      ) : (
+        <div className="space-y-2 max-h-48 sm:max-h-56 overflow-y-auto hide-scrollbar">
+          {generatedAudios.map((audio) => {
+            const isSelected = selectedAudioId === audio.id;
+            return (
+              <div
+                key={audio.id}
+                onClick={() => onSelectAudio(audio)}
+                className={`p-3 rounded-lg border transition-all cursor-pointer flex items-center justify-between group ${
+                  isSelected
+                    ? "border-purple-500 bg-purple-500/20"
+                    : "border-white/10 bg-white/5 hover:border-white/30"
+                }`}
+              >
+                {editingId === audio.id ? (
+                  <div className="flex-1 flex items-center gap-2" onClick={(e) => e.stopPropagation()}>
+                    <input
+                      value={editName}
+                      onChange={(e) => setEditName(e.target.value)}
+                      className="flex-1 bg-black/40 border border-white/20 rounded-md px-2 py-1 text-xs text-white"
+                      autoFocus
+                      onKeyDown={(e) => {
+                        if (e.key === "Enter") saveEditing(audio.id, e as unknown as React.MouseEvent);
+                        if (e.key === "Escape") cancelEditing(e as unknown as React.MouseEvent);
+                      }}
+                    />
+                    <button onClick={(e) => saveEditing(audio.id, e)} className="p-1 text-green-400 hover:text-green-300" title="保存">
+                      <Check className="h-4 w-4" />
+                    </button>
+                    <button onClick={cancelEditing} className="p-1 text-gray-400 hover:text-white" title="取消">
+                      <X className="h-4 w-4" />
+                    </button>
+                  </div>
+                ) : (
+                  <>
+                    <div className="min-w-0 flex-1">
+                      <div className="text-white text-sm truncate">{audio.name}</div>
+                      <div className="text-gray-400 text-xs">{audio.duration_sec.toFixed(1)}s</div>
+                    </div>
+                    <div className="flex items-center gap-1 pl-2 opacity-40 group-hover:opacity-100 transition-opacity">
+                      <button
+                        onClick={(e) => togglePlay(audio, e)}
+                        className="p-1 text-gray-500 hover:text-purple-400 transition-colors"
+                        title={playingId === audio.id ? "暂停" : "播放"}
+                      >
+                        {playingId === audio.id ? (
+                          <Pause className="h-3.5 w-3.5" />
+                        ) : (
+                          <Play className="h-3.5 w-3.5" />
+                        )}
+                      </button>
+                      <button
+                        onClick={(e) => startEditing(audio, e)}
+                        className="p-1 text-gray-500 hover:text-white transition-colors"
+                        title="重命名"
+                      >
+                        <Pencil className="h-3.5 w-3.5" />
+                      </button>
+                      <button
+                        onClick={(e) => {
+                          e.stopPropagation();
+                          onDeleteAudio(audio.id);
+                        }}
+                        className="p-1 text-gray-500 hover:text-red-400 transition-colors"
+                        title="删除"
+                      >
+                        <Trash2 className="h-3.5 w-3.5" />
+                      </button>
+                    </div>
+                  </>
+                )}
+              </div>
+            );
+          })}
+        </div>
+      )}
+    </>
+  );
+
+  if (embedded) return content;
+
+  return (
+    <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm relative z-10">
+      {content}
+    </div>
+  );
+}
--- a/frontend/src/features/home/ui/HistoryList.tsx
+++ b/frontend/src/features/home/ui/HistoryList.tsx
@@ -16,6 +16,7 @@ interface HistoryListProps {
  onRefresh: () => void;
  registerVideoRef: (id: string, element: HTMLDivElement | null) => void;
  formatDate: (timestamp: number) => string;
+  embedded?: boolean;
 }

 export function HistoryList({
@@ -26,11 +27,13 @@ export function HistoryList({
  onRefresh,
  registerVideoRef,
  formatDate,
+  embedded = false,
 }: HistoryListProps) {
-  return (
-    <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
+  const content = (
+    <>
+      {!embedded && (
        <div className="flex justify-between items-center mb-4">
-        <h2 className="text-lg font-semibold text-white flex items-center gap-2">📂 历史作品</h2>
+          <h2 className="text-lg font-semibold text-white flex items-center gap-2">历史作品</h2>
          <button
            onClick={onRefresh}
            className="px-3 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 flex items-center gap-1"
@@ -39,6 +42,7 @@ export function HistoryList({
            刷新
          </button>
        </div>
+      )}
      {generatedVideos.length === 0 ? (
        <div className="text-center py-4 text-gray-500">
          <p>暂无生成的作品</p>
@@ -66,7 +70,7 @@ export function HistoryList({
                  e.stopPropagation();
                  onDeleteVideo(v.id);
                }}
-                className="p-1 text-gray-500 hover:text-red-400 opacity-0 group-hover:opacity-100 transition-opacity"
+                className="p-1 text-gray-500 hover:text-red-400 opacity-40 group-hover:opacity-100 transition-opacity"
                title="删除视频"
              >
                <Trash2 className="h-4 w-4" />
@@ -75,6 +79,14 @@ export function HistoryList({
          ))}
        </div>
      )}
+    </>
+  );
+
+  if (embedded) return content;
+
+  return (
+    <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
+      {content}
    </div>
  );
 }
--- a/frontend/src/features/home/ui/HomePage.tsx
+++ b/frontend/src/features/home/ui/HomePage.tsx
@@ -1,20 +1,26 @@
 "use client";

-import { useEffect } from "react";
+import { useEffect, useMemo } from "react";
 import { useRouter } from "next/navigation";
+import { RefreshCw } from "lucide-react";
 import VideoPreviewModal from "@/components/VideoPreviewModal";
 import ScriptExtractionModal from "./ScriptExtractionModal";
+import RewriteModal from "./RewriteModal";
 import { useHomeController } from "@/features/home/model/useHomeController";
+import { resolveMediaUrl } from "@/shared/lib/media";
 import { BgmPanel } from "@/features/home/ui/BgmPanel";
 import { GenerateActionBar } from "@/features/home/ui/GenerateActionBar";
 import { HistoryList } from "@/features/home/ui/HistoryList";
 import { HomeHeader } from "@/features/home/ui/HomeHeader";
 import { MaterialSelector } from "@/features/home/ui/MaterialSelector";
+import { TimelineEditor } from "@/features/home/ui/TimelineEditor";
+import { ClipTrimmer } from "@/features/home/ui/ClipTrimmer";
 import { PreviewPanel } from "@/features/home/ui/PreviewPanel";
 import { RefAudioPanel } from "@/features/home/ui/RefAudioPanel";
 import { ScriptEditor } from "@/features/home/ui/ScriptEditor";
 import { TitleSubtitlePanel } from "@/features/home/ui/TitleSubtitlePanel";
 import { VoiceSelector } from "@/features/home/ui/VoiceSelector";
+import { GeneratedAudiosPanel } from "@/features/home/ui/GeneratedAudiosPanel";

 export function HomePage() {
  const router = useRouter();
@@ -34,8 +40,8 @@ export function HomePage() {
    fetchMaterials,
    deleteMaterial,
    handleUpload,
-    selectedMaterial,
-    setSelectedMaterial,
+    selectedMaterials,
+    toggleMaterial,
    handlePreviewMaterial,
    editingMaterialId,
    editMaterialName,
@@ -47,8 +53,17 @@ export function HomePage() {
    setText,
    extractModalOpen,
    setExtractModalOpen,
+    rewriteModalOpen,
+    setRewriteModalOpen,
    handleGenerateMeta,
    isGeneratingMeta,
+    handleTranslate,
+    isTranslating,
+    originalText,
+    handleRestoreOriginal,
+    savedScripts,
+    handleSaveScript,
+    deleteSavedScript,
    showStylePreview,
    setShowStylePreview,
    videoTitle,
@@ -59,6 +74,15 @@ export function HomePage() {
    titleFontSize,
    setTitleFontSize,
    setTitleSizeLocked,
+    videoSecondaryTitle,
+    secondaryTitleInput,
+    selectedSecondaryTitleStyleId,
+    setSelectedSecondaryTitleStyleId,
+    secondaryTitleFontSize,
+    setSecondaryTitleFontSize,
+    setSecondaryTitleSizeLocked,
+    secondaryTitleTopMargin,
+    setSecondaryTitleTopMargin,
    subtitleStyles,
    selectedSubtitleStyleId,
    setSelectedSubtitleStyleId,
@@ -69,12 +93,13 @@ export function HomePage() {
    setTitleTopMargin,
    subtitleBottomMargin,
    setSubtitleBottomMargin,
-    enableSubtitles,
-    setEnableSubtitles,
+    titleDisplayMode,
+    setTitleDisplayMode,
+    outputAspectRatio,
+    setOutputAspectRatio,
    resolveAssetUrl,
    getFontFormat,
    buildTextShadow,
-    materialDimensions,
    ttsMode,
    setTtsMode,
    voices,
@@ -97,6 +122,8 @@ export function HomePage() {
    saveEditing,
    cancelEditing,
    deleteRefAudio,
+    retranscribeRefAudio,
+    retranscribingId,
    recordedBlob,
    isRecording,
    recordingTime,
@@ -104,7 +131,6 @@ export function HomePage() {
    stopRecording,
    useRecording,
    formatRecordingTime,
-    fixedRefText,
    bgmList,
    bgmLoading,
    bgmError,
@@ -130,12 +156,56 @@ export function HomePage() {
    fetchGeneratedVideos,
    registerVideoRef,
    formatDate,
+    generatedAudios,
+    selectedAudio,
+    selectedAudioId,
+    isGeneratingAudio,
+    audioTask,
+    fetchGeneratedAudios,
+    handleGenerateAudio,
+    deleteAudio,
+    renameAudio,
+    selectAudio,
+    speed,
+    setSpeed,
+    timelineSegments,
+    reorderSegments,
+    setSourceRange,
+    clipTrimmerOpen,
+    setClipTrimmerOpen,
+    clipTrimmerSegmentId,
+    setClipTrimmerSegmentId,
+    materialPosterUrl,
  } = useHomeController();

  useEffect(() => {
    router.prefetch("/publish");
  }, [router]);

+  useEffect(() => {
+    if (typeof window === "undefined") return;
+    if ("scrollRestoration" in history) {
+      history.scrollRestoration = "manual";
+    }
+    window.scrollTo({ top: 0, left: 0, behavior: "auto" });
+    // Fallback: once the restore effects and async data loads have settled, force the scroll back to the top again
+    const timer = setTimeout(() => {
+      window.scrollTo({ top: 0, left: 0, behavior: "auto" });
+    }, 200);
+    return () => clearTimeout(timer);
+  }, []);
+
+  const clipTrimmerSegment = useMemo(
+    () => timelineSegments.find((s) => s.id === clipTrimmerSegmentId) ?? null,
+    [timelineSegments, clipTrimmerSegmentId]
+  );
+
+  const clipTrimmerMaterialUrl = useMemo(() => {
+    if (!clipTrimmerSegment) return null;
+    const mat = materials.find((m) => m.id === clipTrimmerSegment.materialId);
+    return mat?.path ? resolveMediaUrl(mat.path) : null;
+  }, [clipTrimmerSegment, materials]);
+
  return (
    <div className="min-h-dvh">
      <HomeHeader />
@@ -144,80 +214,32 @@ export function HomePage() {
        <div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
          {/* 左侧: 输入区域 */}
          <div className="space-y-6">
-            {/* 素材选择 */}
-            <MaterialSelector
-              materials={materials}
-              selectedMaterial={selectedMaterial}
-              isFetching={isFetching}
-              lastMaterialCount={lastMaterialCount}
-              editingMaterialId={editingMaterialId}
-              editMaterialName={editMaterialName}
-              isUploading={isUploading}
-              uploadProgress={uploadProgress}
-              uploadError={uploadError}
-              fetchError={fetchError}
-              apiBase={apiBase}
-              onUploadChange={handleUpload}
-              onRefresh={fetchMaterials}
-              onSelectMaterial={setSelectedMaterial}
-              onPreviewMaterial={handlePreviewMaterial}
-              onStartEditing={startMaterialEditing}
-              onEditNameChange={setEditMaterialName}
-              onSaveEditing={saveMaterialEditing}
-              onCancelEditing={cancelMaterialEditing}
-              onDeleteMaterial={deleteMaterial}
-              onClearUploadError={() => setUploadError(null)}
-              registerMaterialRef={registerMaterialRef}
-            />
-
-            {/* 文案输入 */}
+            {/* 1. Script extraction and editing */}
            <ScriptEditor
              text={text}
              onChangeText={setText}
              onOpenExtractModal={() => setExtractModalOpen(true)}
+              onOpenRewriteModal={() => setRewriteModalOpen(true)}
              onGenerateMeta={handleGenerateMeta}
              isGeneratingMeta={isGeneratingMeta}
+              onTranslate={handleTranslate}
+              isTranslating={isTranslating}
+              hasOriginalText={originalText !== null}
+              onRestoreOriginal={handleRestoreOriginal}
+              savedScripts={savedScripts}
+              onSaveScript={handleSaveScript}
+              onLoadScript={setText}
+              onDeleteScript={deleteSavedScript}
            />

-            {/* 标题和字幕设置 */}
-            <TitleSubtitlePanel
-              showStylePreview={showStylePreview}
-              onTogglePreview={() => setShowStylePreview((prev) => !prev)}
-              videoTitle={videoTitle}
-              onTitleChange={titleInput.handleChange}
-              onTitleCompositionStart={titleInput.handleCompositionStart}
-              onTitleCompositionEnd={titleInput.handleCompositionEnd}
-              titleStyles={titleStyles}
-              selectedTitleStyleId={selectedTitleStyleId}
-              onSelectTitleStyle={setSelectedTitleStyleId}
-              titleFontSize={titleFontSize}
-              onTitleFontSizeChange={(value) => {
-                setTitleFontSize(value);
-                setTitleSizeLocked(true);
-              }}
-              subtitleStyles={subtitleStyles}
-              selectedSubtitleStyleId={selectedSubtitleStyleId}
-              onSelectSubtitleStyle={setSelectedSubtitleStyleId}
-              subtitleFontSize={subtitleFontSize}
-              onSubtitleFontSizeChange={(value) => {
-                setSubtitleFontSize(value);
-                setSubtitleSizeLocked(true);
-              }}
-              titleTopMargin={titleTopMargin}
-              onTitleTopMarginChange={setTitleTopMargin}
-              subtitleBottomMargin={subtitleBottomMargin}
-              onSubtitleBottomMarginChange={setSubtitleBottomMargin}
-              enableSubtitles={enableSubtitles}
-              onToggleSubtitles={setEnableSubtitles}
-              resolveAssetUrl={resolveAssetUrl}
-              getFontFormat={getFontFormat}
-              buildTextShadow={buildTextShadow}
-              previewBaseWidth={materialDimensions?.width || 1080}
-              previewBaseHeight={materialDimensions?.height || 1920}
-            />
-
-            {/* 配音方式选择 */}
+            {/* 2. Voiceover */}
+            <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
+              <h2 className="text-base sm:text-lg font-semibold text-white mb-4">
+                二、配音
+              </h2>
+              <h3 className="text-sm font-medium text-gray-400 mb-3">配音方式</h3>
              <VoiceSelector
+                embedded
                ttsMode={ttsMode}
                onSelectTtsMode={setTtsMode}
                voices={voices}
@@ -242,6 +264,8 @@ export function HomePage() {
                    onSaveEditing={saveEditing}
                    onCancelEditing={cancelEditing}
                    onDeleteRefAudio={deleteRefAudio}
+                    onRetranscribe={retranscribeRefAudio}
+                    retranscribingId={retranscribingId}
                    recordedBlob={recordedBlob}
                    isRecording={isRecording}
                    recordingTime={recordingTime}
@@ -249,12 +273,137 @@ export function HomePage() {
                    onStopRecording={stopRecording}
                    onUseRecording={useRecording}
                    formatRecordingTime={formatRecordingTime}
-                  fixedRefText={fixedRefText}
                  />
                )}
              />
+              <div className="border-t border-white/10 my-4" />
+              <GeneratedAudiosPanel
+                embedded
+                generatedAudios={generatedAudios}
+                selectedAudioId={selectedAudioId}
+                isGeneratingAudio={isGeneratingAudio}
+                audioTask={audioTask}
+                onGenerateAudio={handleGenerateAudio}
+                onRefresh={() => fetchGeneratedAudios()}
+                onSelectAudio={selectAudio}
+                onDeleteAudio={deleteAudio}
+                onRenameAudio={renameAudio}
+                hasText={!!text.trim()}
+                missingRefAudio={ttsMode === "voiceclone" && !selectedRefAudio}
+                speed={speed}
+                onSpeedChange={setSpeed}
+                ttsMode={ttsMode}
+              />
+            </div>

-            {/* 背景音乐 */}
+            {/* 3. Material editing */}
+            <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
+              <h2 className="text-base sm:text-lg font-semibold text-white mb-4">
+                三、素材编辑
+              </h2>
+              <MaterialSelector
+                embedded
+                materials={materials}
+                selectedMaterials={selectedMaterials}
+                isFetching={isFetching}
+                lastMaterialCount={lastMaterialCount}
+                editingMaterialId={editingMaterialId}
+                editMaterialName={editMaterialName}
+                isUploading={isUploading}
+                uploadProgress={uploadProgress}
+                uploadError={uploadError}
+                fetchError={fetchError}
+                apiBase={apiBase}
+                onUploadChange={handleUpload}
+                onRefresh={fetchMaterials}
+                onToggleMaterial={toggleMaterial}
+                onPreviewMaterial={handlePreviewMaterial}
+                onStartEditing={startMaterialEditing}
+                onEditNameChange={setEditMaterialName}
+                onSaveEditing={saveMaterialEditing}
+                onCancelEditing={cancelMaterialEditing}
+                onDeleteMaterial={deleteMaterial}
+                onClearUploadError={() => setUploadError(null)}
+                registerMaterialRef={registerMaterialRef}
+              />
+              <div className="border-t border-white/10 my-4" />
+              <div className="relative">
+                {(!selectedAudio || selectedMaterials.length === 0) && (
+                  <div className="absolute inset-0 bg-black/50 backdrop-blur-sm rounded-xl flex items-center justify-center z-10">
+                    <p className="text-gray-400">
+                      {!selectedAudio ? "请先生成并选中配音" : "请先选择素材"}
+                    </p>
+                  </div>
+                )}
+                <TimelineEditor
+                  embedded
+                  audioDuration={selectedAudio?.duration_sec ?? 0}
+                  audioUrl={selectedAudio ? (resolveMediaUrl(selectedAudio.path) || "") : ""}
+                  segments={timelineSegments}
+                  materials={materials}
+                  outputAspectRatio={outputAspectRatio}
+                  onOutputAspectRatioChange={setOutputAspectRatio}
+                  onReorderSegment={reorderSegments}
+                  onClickSegment={(seg) => {
+                    setClipTrimmerSegmentId(seg.id);
+                    setClipTrimmerOpen(true);
+                  }}
+                />
+              </div>
+            </div>
+
+            {/* Section 4: title and subtitles */}
+            <TitleSubtitlePanel
+              showStylePreview={showStylePreview}
+              onTogglePreview={() => setShowStylePreview((prev) => !prev)}
+              videoTitle={videoTitle}
+              onTitleChange={titleInput.handleChange}
+              onTitleCompositionStart={titleInput.handleCompositionStart}
+              onTitleCompositionEnd={titleInput.handleCompositionEnd}
+              videoSecondaryTitle={videoSecondaryTitle}
+              onSecondaryTitleChange={secondaryTitleInput.handleChange}
+              onSecondaryTitleCompositionStart={secondaryTitleInput.handleCompositionStart}
+              onSecondaryTitleCompositionEnd={secondaryTitleInput.handleCompositionEnd}
+              titleStyles={titleStyles}
+              selectedTitleStyleId={selectedTitleStyleId}
+              onSelectTitleStyle={setSelectedTitleStyleId}
+              titleFontSize={titleFontSize}
+              onTitleFontSizeChange={(value) => {
+                setTitleFontSize(value);
+                setTitleSizeLocked(true);
+              }}
+              selectedSecondaryTitleStyleId={selectedSecondaryTitleStyleId}
+              onSelectSecondaryTitleStyle={setSelectedSecondaryTitleStyleId}
+              secondaryTitleFontSize={secondaryTitleFontSize}
+              onSecondaryTitleFontSizeChange={(value) => {
+                setSecondaryTitleFontSize(value);
+                setSecondaryTitleSizeLocked(true);
+              }}
+              secondaryTitleTopMargin={secondaryTitleTopMargin}
+              onSecondaryTitleTopMarginChange={setSecondaryTitleTopMargin}
+              subtitleStyles={subtitleStyles}
+              selectedSubtitleStyleId={selectedSubtitleStyleId}
+              onSelectSubtitleStyle={setSelectedSubtitleStyleId}
+              subtitleFontSize={subtitleFontSize}
+              onSubtitleFontSizeChange={(value) => {
+                setSubtitleFontSize(value);
+                setSubtitleSizeLocked(true);
+              }}
+              titleTopMargin={titleTopMargin}
+              onTitleTopMarginChange={setTitleTopMargin}
+              subtitleBottomMargin={subtitleBottomMargin}
+              onSubtitleBottomMarginChange={setSubtitleBottomMargin}
+              titleDisplayMode={titleDisplayMode}
+              onTitleDisplayModeChange={setTitleDisplayMode}
+              resolveAssetUrl={resolveAssetUrl}
+              getFontFormat={getFontFormat}
+              buildTextShadow={buildTextShadow}
+              previewBaseWidth={outputAspectRatio === "16:9" ? 1920 : 1080}
+              previewBaseHeight={outputAspectRatio === "16:9" ? 1080 : 1920}
+              previewBackgroundUrl={materialPosterUrl}
+            />
+
+            {/* Background music (unnumbered) */}
            <BgmPanel
              bgmList={bgmList}
              bgmLoading={bgmLoading}
@@ -272,24 +421,52 @@ export function HomePage() {
              registerBgmItemRef={registerBgmItemRef}
            />

-            {/* 生成按钮 */}
+            {/* Generate button (unnumbered) */}
            <GenerateActionBar
              isGenerating={isGenerating}
              progress={currentTask?.progress || 0}
-              disabled={isGenerating || !selectedMaterial || (ttsMode === "voiceclone" && !selectedRefAudio)}
+              materialCount={selectedMaterials.length}
+              disabled={isGenerating || selectedMaterials.length === 0 || !selectedAudio}
              onGenerate={handleGenerate}
            />
          </div>

-          {/* 右侧: 预览区域 */}
+          {/* Right column: works area */}
          <div className="space-y-6">
-            <PreviewPanel
-              currentTask={currentTask}
-              isGenerating={isGenerating}
-              generatedVideo={generatedVideo}
+            {/* Generation progress (shown above the works card) */}
+            {currentTask && isGenerating && (
+              <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-purple-500/30 backdrop-blur-sm">
+                <div className="space-y-3">
+                  <div className="flex justify-between text-sm text-purple-300 mb-1">
+                    <span>正在AI生成中...</span>
+                    <span>{currentTask.progress || 0}%</span>
+                  </div>
+                  <div className="h-3 bg-black/30 rounded-full overflow-hidden">
+                    <div
+                      className="h-full bg-gradient-to-r from-purple-500 to-pink-500 transition-all duration-300"
+                      style={{ width: `${currentTask.progress || 0}%` }}
                    />
-
+                  </div>
+                </div>
+              </div>
+            )}
+            {/* Section 6: works */}
+            <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
+              <h2 className="text-base sm:text-lg font-semibold text-white mb-4">
+                六、作品
+              </h2>
+              <div className="flex justify-between items-center mb-3">
+                <h3 className="text-sm font-medium text-gray-400">作品列表</h3>
+                <button
+                  onClick={() => fetchGeneratedVideos()}
+                  className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 flex items-center gap-1"
+                >
+                  <RefreshCw className="h-3.5 w-3.5" />
+                  刷新
+                </button>
+              </div>
              <HistoryList
+                embedded
                generatedVideos={generatedVideos}
                selectedVideoId={selectedVideoId}
                onSelectVideo={handleSelectVideo}
@@ -298,6 +475,15 @@ export function HomePage() {
                registerVideoRef={registerVideoRef}
                formatDate={formatDate}
              />
+              <div className="border-t border-white/10 my-4" />
+              <h3 className="text-sm font-medium text-gray-400 mb-3">作品预览</h3>
+              <PreviewPanel
+                embedded
+                currentTask={null}
+                isGenerating={false}
+                generatedVideo={generatedVideo}
+              />
+            </div>
          </div>
        </div>
      </main>
@@ -312,6 +498,26 @@ export function HomePage() {
        onClose={() => setExtractModalOpen(false)}
        onApply={(nextText) => setText(nextText)}
      />
+
+      <RewriteModal
+        isOpen={rewriteModalOpen}
+        onClose={() => setRewriteModalOpen(false)}
+        originalText={text}
+        onApply={(newText) => setText(newText)}
+      />
+
+      <ClipTrimmer
+        isOpen={clipTrimmerOpen}
+        segment={clipTrimmerSegment}
+        materialUrl={clipTrimmerMaterialUrl}
+        onConfirm={(sourceStart, sourceEnd) => {
+          if (clipTrimmerSegmentId) {
+            setSourceRange(clipTrimmerSegmentId, sourceStart, sourceEnd);
+          }
+          setClipTrimmerOpen(false);
+        }}
+        onClose={() => setClipTrimmerOpen(false)}
+      />
    </div>
  );
 }
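
Review note on the `previewBaseWidth` / `previewBaseHeight` props above: both repeat the same `outputAspectRatio === "16:9"` ternary. A minimal sketch of one way to keep the two values in sync, assuming a hypothetical helper `previewBaseSize` and assuming `"9:16"` is the only other ratio (this hunk only shows the `"16:9"` check):

```typescript
// Hypothetical helper, not part of the diff. The AspectRatio union is an
// assumption: only the "16:9" comparison is visible in the hunk.
type AspectRatio = "9:16" | "16:9";

interface PreviewSize {
  width: number;
  height: number;
}

// Landscape output previews at 1920x1080; anything else falls back to the
// portrait 1080x1920 canvas, matching the two ternaries in the hunk.
function previewBaseSize(ratio: AspectRatio): PreviewSize {
  return ratio === "16:9"
    ? { width: 1920, height: 1080 }
    : { width: 1080, height: 1920 };
}
```

With such a helper the call site could compute the size once and pass `size.width` and `size.height`, so the two props cannot drift apart.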
--- a/frontend/src/features/home/ui/MaterialSelector.tsx
+++ b/frontend/src/features/home/ui/MaterialSelector.tsx
@@ -1,17 +1,10 @@
-import type { ChangeEvent, MouseEvent } from "react";
+import { type ChangeEvent, type MouseEvent, useMemo } from "react";
 import { Upload, RefreshCw, Eye, Trash2, X, Pencil, Check } from "lucide-react";
-
-interface Material {
-  id: string;
-  name: string;
-  scene: string;
-  size_mb: number;
-  path: string;
-}
+import type { Material } from "@/shared/types/material";

 interface MaterialSelectorProps {
  materials: Material[];
-  selectedMaterial: string;
+  selectedMaterials: string[];
  isFetching: boolean;
  lastMaterialCount: number;
  editingMaterialId: string | null;
@@ -23,7 +16,7 @@ interface MaterialSelectorProps {
  apiBase: string;
  onUploadChange: (event: ChangeEvent<HTMLInputElement>) => void;
  onRefresh: () => void;
-  onSelectMaterial: (id: string) => void;
+  onToggleMaterial: (id: string) => void;
  onPreviewMaterial: (path: string) => void;
  onStartEditing: (material: Material, event: MouseEvent) => void;
  onEditNameChange: (value: string) => void;
@@ -32,11 +25,12 @@ interface MaterialSelectorProps {
  onDeleteMaterial: (id: string) => void;
  onClearUploadError: () => void;
  registerMaterialRef: (id: string, element: HTMLDivElement | null) => void;
+  embedded?: boolean;
 }

 export function MaterialSelector({
  materials,
-  selectedMaterial,
+  selectedMaterials,
  isFetching,
  lastMaterialCount,
  editingMaterialId,
@@ -48,7 +42,7 @@ export function MaterialSelector({
  apiBase,
  onUploadChange,
  onRefresh,
-  onSelectMaterial,
+  onToggleMaterial,
  onPreviewMaterial,
  onStartEditing,
  onEditNameChange,
@@ -57,21 +51,32 @@ export function MaterialSelector({
  onDeleteMaterial,
  onClearUploadError,
  registerMaterialRef,
+  embedded = false,
 }: MaterialSelectorProps) {
-  return (
-    <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
+  const selectedSet = useMemo(() => new Set(selectedMaterials), [selectedMaterials]);
+  const isFull = selectedMaterials.length >= 4;
+
+  const content = (
+    <>
      <div className="flex justify-between items-center gap-2 mb-4">
-        <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2 whitespace-nowrap">
-          📹 视频素材
-          <span className="ml-1 text-[11px] sm:text-xs text-gray-400/90 font-normal">
-            (上传自拍视频)
+        {!embedded ? (
+          <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2 min-w-0">
+            <span className="shrink-0">视频素材</span>
+            <span className="text-[11px] sm:text-xs text-gray-400/90 font-normal truncate">
+              (上传自拍视频，最多可选4个)
            </span>
          </h2>
+        ) : (
+          <h3 className="text-sm font-medium text-gray-400 min-w-0">
+            <span className="shrink-0">视频素材</span>
+            <span className="ml-1 text-[11px] text-gray-400/90 font-normal hidden sm:inline">(上传自拍视频，最多可选4个)</span>
+          </h3>
+        )}
        <div className="flex gap-1.5">
          <input
            type="file"
            id="video-upload"
-            accept=".mp4,.mov,.avi"
+            accept="video/*"
            onChange={onUploadChange}
            className="hidden"
          />
@@ -98,7 +103,7 @@ export function MaterialSelector({
      {isUploading && (
        <div className="mb-4 p-4 bg-purple-500/10 rounded-xl border border-purple-500/30">
          <div className="flex justify-between text-sm text-purple-300 mb-2">
-            <span>📤 上传中...</span>
+            <span>上传中...</span>
            <span>{uploadProgress}%</span>
          </div>
          <div className="h-2 bg-black/30 rounded-full overflow-hidden">
@@ -112,7 +117,7 @@ export function MaterialSelector({

      {uploadError && (
        <div className="mb-4 p-4 bg-red-500/20 text-red-200 rounded-xl text-sm flex justify-between items-center">
-          <span>❌ {uploadError}</span>
+          <span>{uploadError}</span>
          <button onClick={onClearUploadError} className="text-red-300 hover:text-white">
            <X className="h-3.5 w-3.5" />
          </button>
@@ -126,7 +131,7 @@ export function MaterialSelector({
          API: {apiBase}/api/materials/
        </div>
      ) : isFetching && materials.length === 0 ? (
-        <div className="space-y-2 max-h-64 overflow-y-auto hide-scrollbar" style={{ contentVisibility: 'auto' }}>
+        <div className="space-y-2 max-h-48 sm:max-h-64 overflow-y-auto hide-scrollbar" style={{ contentVisibility: 'auto' }}>
          {Array.from({ length: Math.min(4, Math.max(1, lastMaterialCount || 1)) }).map((_, index) => (
            <div
              key={`material-skeleton-${index}`}
@@ -142,20 +147,24 @@ export function MaterialSelector({
          <div className="text-5xl mb-4">📁</div>
          <p>暂无视频素材</p>
          <p className="text-sm mt-2">
-            点击上方「📤 上传视频」按钮添加视频素材
+            点击上方「上传」按钮添加视频素材
          </p>
        </div>
      ) : (
        <div
-          className="space-y-2 max-h-64 overflow-y-auto hide-scrollbar"
+          className="space-y-2 max-h-48 sm:max-h-64 overflow-y-auto hide-scrollbar"
          style={{ contentVisibility: 'auto' }}
        >
-          {materials.map((m) => (
+          {materials.map((m) => {
+            const isSelected = selectedSet.has(m.id);
+            return (
              <div
                key={m.id}
                ref={(el) => registerMaterialRef(m.id, el)}
-              className={`p-3 rounded-lg border transition-all flex items-center justify-between group ${selectedMaterial === m.id
+                className={`p-3 rounded-lg border transition-all flex items-center justify-between group ${isSelected
                  ? "border-purple-500 bg-purple-500/20"
+                  : isFull
+                    ? "border-white/5 bg-white/[0.02] opacity-50 cursor-not-allowed"
                    : "border-white/10 bg-white/5 hover:border-white/30"
                  }`}
              >
@@ -183,9 +192,20 @@ export function MaterialSelector({
                    </button>
                  </div>
                ) : (
-                <button onClick={() => onSelectMaterial(m.id)} className="flex-1 text-left">
+                  <button onClick={() => onToggleMaterial(m.id)} disabled={isFull && !isSelected} className="flex-1 text-left flex items-center gap-2">
+                    {/* Checkbox */}
+                    <span
+                      className={`flex-shrink-0 w-4 h-4 rounded border flex items-center justify-center text-[10px] ${isSelected
+                        ? "border-purple-500 bg-purple-500 text-white"
+                        : "border-white/30 text-transparent"
+                        }`}
+                    >
+                      {isSelected ? "✓" : ""}
+                    </span>
+                    <div className="min-w-0">
                      <div className="text-white text-sm truncate">{m.scene || m.name}</div>
                      <div className="text-gray-400 text-xs">{m.size_mb.toFixed(1)} MB</div>
+                    </div>
                  </button>
                )}
                <div className="flex items-center gap-2 pl-2">
@@ -196,7 +216,7 @@ export function MaterialSelector({
                        onPreviewMaterial(m.path);
                      }
                    }}
-                  className="p-1 text-gray-500 hover:text-white opacity-0 group-hover:opacity-100 transition-opacity"
+                    className="p-1 text-gray-500 hover:text-white opacity-40 group-hover:opacity-100 transition-opacity"
                    title="预览视频"
                  >
                    <Eye className="h-4 w-4" />
@@ -204,7 +224,7 @@ export function MaterialSelector({
                  {editingMaterialId !== m.id && (
                    <button
                      onClick={(e) => onStartEditing(m, e)}
-                    className="p-1 text-gray-500 hover:text-white opacity-0 group-hover:opacity-100 transition-opacity"
+                      className="p-1 text-gray-500 hover:text-white opacity-40 group-hover:opacity-100 transition-opacity"
                      title="重命名"
                    >
                      <Pencil className="h-4 w-4" />
@@ -215,16 +235,25 @@ export function MaterialSelector({
                      e.stopPropagation();
                      onDeleteMaterial(m.id);
                    }}
-                  className="p-1 text-gray-500 hover:text-red-400 opacity-0 group-hover:opacity-100 transition-opacity"
+                    className="p-1 text-gray-500 hover:text-red-400 opacity-40 group-hover:opacity-100 transition-opacity"
                    title="删除素材"
                  >
                    <Trash2 className="h-4 w-4" />
                  </button>
                </div>
              </div>
-          ))}
+            );
+          })}
        </div>
      )}
+    </>
+  );
+
+  if (embedded) return content;
+
+  return (
+    <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
+      {content}
    </div>
  );
 }
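
Review note on the selection cap: this component only renders the limit (`isFull` dims and disables unselected rows once four materials are chosen); the `onToggleMaterial` handler itself is not part of this file's diff. A minimal sketch of how the matching state update might look, assuming a hypothetical pure helper `toggleSelection` and a hypothetical `MAX_SELECTED` constant that mirrors the literal `4` above:

```typescript
// Hypothetical helper, not taken from the diff: toggles an id in a selection
// list while enforcing an upper bound on how many ids may be selected.
const MAX_SELECTED = 4; // assumption: mirrors the literal 4 used by isFull

function toggleSelection(
  selected: readonly string[],
  id: string,
  max: number = MAX_SELECTED,
): string[] {
  // Already selected: remove it (deselecting is always allowed).
  if (selected.includes(id)) {
    return selected.filter((item) => item !== id);
  }
  // Not selected and the list is full: leave the selection unchanged.
  if (selected.length >= max) {
    return [...selected];
  }
  // Otherwise append, preserving the order in which items were picked.
  return [...selected, id];
}
```

Enforcing the cap in the state update as well as in the UI would keep the limit intact even if the handler is invoked from somewhere other than the disabled-aware button.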
--- a/frontend/src/features/home/ui/PreviewPanel.tsx
+++ b/frontend/src/features/home/ui/PreviewPanel.tsx
@@ -12,18 +12,20 @@ interface PreviewPanelProps {
  currentTask: Task | null;
  isGenerating: boolean;
  generatedVideo: string | null;
+  embedded?: boolean;
 }

 export function PreviewPanel({
  currentTask,
  isGenerating,
  generatedVideo,
+  embedded = false,
 }: PreviewPanelProps) {
-  return (
+  const content = (
    <>
      {currentTask && isGenerating && (
-        <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
-          <h2 className="text-lg font-semibold text-white mb-4">⏳ 生成进度</h2>
+        <div className={embedded ? "mb-4" : "bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm"}>
+          {!embedded && <h2 className="text-lg font-semibold text-white mb-4">生成进度</h2>}
          <div className="space-y-3">
            <div className="h-3 bg-black/30 rounded-full overflow-hidden">
              <div
@@ -36,8 +38,8 @@ export function PreviewPanel({
        </div>
      )}

-      <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
-        <h2 className="text-lg font-semibold text-white mb-4 flex items-center gap-2">🎥 作品预览</h2>
+      <div className={embedded ? "" : "bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm"}>
+        {!embedded && <h2 className="text-lg font-semibold text-white mb-4 flex items-center gap-2">作品预览</h2>}
        <div className="aspect-video bg-black/50 rounded-xl overflow-hidden flex items-center justify-center">
          {generatedVideo ? (
            <video src={generatedVideo} controls preload="metadata" className="w-full h-full object-contain" />
@@ -71,4 +73,6 @@ export function PreviewPanel({
      </div>
    </>
  );
+
+  return content;
 }
--- a/frontend/src/features/home/ui/RefAudioPanel.tsx
+++ b/frontend/src/features/home/ui/RefAudioPanel.tsx
@@ -1,6 +1,6 @@
 import { useEffect, useState } from "react";
 import type { MouseEvent } from "react";
-import { Upload, RefreshCw, Play, Pause, Pencil, Trash2, Check, X, Mic, Square } from "lucide-react";
+import { Upload, RefreshCw, Play, Pause, Pencil, Trash2, Check, X, Mic, Square, RotateCw } from "lucide-react";

 interface RefAudio {
  id: string;
@@ -29,6 +29,8 @@ interface RefAudioPanelProps {
  onSaveEditing: (id: string, event: MouseEvent) => void;
  onCancelEditing: (event: MouseEvent) => void;
  onDeleteRefAudio: (id: string) => void;
+  onRetranscribe: (id: string) => void;
+  retranscribingId: string | null;
  recordedBlob: Blob | null;
  isRecording: boolean;
  recordingTime: number;
@@ -36,9 +38,10 @@ interface RefAudioPanelProps {
  onStopRecording: () => void;
  onUseRecording: () => void;
  formatRecordingTime: (seconds: number) => string;
-  fixedRefText: string;
 }

+const OLD_FIXED_REF_TEXT = "其实生活中有许多美好的瞬间";
+
 export function RefAudioPanel({
  refAudios,
  selectedRefAudio,
@@ -57,6 +60,8 @@ export function RefAudioPanel({
  onSaveEditing,
  onCancelEditing,
  onDeleteRefAudio,
+  onRetranscribe,
+  retranscribingId,
  recordedBlob,
  isRecording,
  recordingTime,
@@ -64,7 +69,6 @@ export function RefAudioPanel({
  onStopRecording,
  onUseRecording,
  formatRecordingTime,
-  fixedRefText,
 }: RefAudioPanelProps) {
  const [recordedUrl, setRecordedUrl] = useState<string | null>(null);

@@ -81,11 +85,14 @@ export function RefAudioPanel({
    };
  }, [recordedBlob]);

+  const needsRetranscribe = (audio: RefAudio) =>
+    audio.ref_text.startsWith(OLD_FIXED_REF_TEXT);
+
  return (
    <div className="space-y-4">
      <div>
        <div className="flex justify-between items-center mb-2">
-          <span className="text-sm text-gray-300">📁 我的参考音频</span>
+          <span className="text-sm text-gray-300">📁 我的参考音频 <span className="text-xs text-gray-500 font-normal">(上传3-10秒语音样本)</span></span>
          <div className="flex gap-2">
            <input
              type="file"
@@ -122,7 +129,7 @@ export function RefAudioPanel({

        {isUploadingRef && (
          <div className="mb-2 p-2 bg-purple-500/10 rounded text-sm text-purple-300">
-            ⏳ 上传中...
+            ⏳ 上传并识别中...
          </div>
        )}

@@ -180,7 +187,7 @@ export function RefAudioPanel({
                      <div className="text-white text-xs truncate pr-1 flex-1" title={audio.name}>
                        {audio.name}
                      </div>
-                      <div className="flex gap-1 opacity-0 group-hover:opacity-100 transition-opacity">
+                      <div className="flex gap-1 opacity-40 group-hover:opacity-100 transition-opacity">
                        <button
                          onClick={(e) => onTogglePlayPreview(audio, e)}
                          className="text-gray-400 hover:text-purple-400 text-xs"
@@ -192,6 +199,17 @@ export function RefAudioPanel({
                            <Play className="h-3.5 w-3.5" />
                          )}
                        </button>
+                        <button
+                          onClick={(e) => {
+                            e.stopPropagation();
+                            onRetranscribe(audio.id);
+                          }}
+                          disabled={retranscribingId === audio.id}
+                          className="text-gray-400 hover:text-cyan-400 text-xs disabled:opacity-50"
+                          title="重新识别文字"
+                        >
+                          <RotateCw className={`h-3.5 w-3.5 ${retranscribingId === audio.id ? 'animate-spin' : ''}`} />
+                        </button>
                        <button
                          onClick={(e) => onStartEditing(audio, e)}
                          className="text-gray-400 hover:text-blue-400 text-xs"
@@ -211,7 +229,12 @@ export function RefAudioPanel({
                        </button>
                      </div>
                    </div>
-                    <div className="text-gray-400 text-xs">{audio.duration_sec.toFixed(1)}s</div>
+                    <div className="text-gray-400 text-xs">
+                      {audio.duration_sec.toFixed(1)}s
+                      {needsRetranscribe(audio) && (
+                        <span className="text-yellow-500 ml-1" title="需要重新识别文字">⚠</span>
+                      )}
+                    </div>
                  </>
                )}
              </div>
@@ -221,7 +244,7 @@ export function RefAudioPanel({
      </div>

      <div className="border-t border-white/10 pt-4">
-        <span className="text-sm text-gray-300 mb-2 block">🎤 或在线录音</span>
+        <span className="text-sm text-gray-300 mb-2 block">🎤 或在线录音 <span className="text-xs text-gray-500">（建议 3-10 秒，超出将自动截取）</span></span>
        <div className="flex gap-2 items-center">
          {!isRecording ? (
            <button
@@ -264,15 +287,6 @@ export function RefAudioPanel({
        )}
      </div>

-      <div className="border-t border-white/10 pt-4">
-        <label className="text-sm text-gray-300 mb-2 block">📝 录音/上传时请朗读以下内容：</label>
-        <div className="w-full bg-black/30 border border-white/10 rounded-lg p-3 text-white text-sm">
-          {fixedRefText}
-        </div>
-        <p className="text-xs text-gray-500 mt-1">
-          请清晰朗读上述内容完成录音，系统将以此为参考克隆您的声音
-        </p>
-      </div>
    </div>
  );
 }
--- a/frontend/src/features/home/ui/RewriteModal.tsx
+++ b/frontend/src/features/home/ui/RewriteModal.tsx
@@ -0,0 +1,213 @@
+import { useState, useEffect, useRef, useCallback } from "react";
+import { Loader2, Sparkles } from "lucide-react";
+import api from "@/shared/api/axios";
+import { type ApiResponse, unwrap } from "@/shared/api/types";
+
+const CUSTOM_PROMPT_KEY = "vigent_rewriteCustomPrompt";
+
+interface RewriteModalProps {
+  isOpen: boolean;
+  onClose: () => void;
+  originalText: string;
+  onApply: (text: string) => void;
+}
+
+export default function RewriteModal({
+  isOpen,
+  onClose,
+  originalText,
+  onApply,
+}: RewriteModalProps) {
+  const [customPrompt, setCustomPrompt] = useState(
+    () => (typeof window !== "undefined" ? localStorage.getItem(CUSTOM_PROMPT_KEY) || "" : "")
+  );
+  const [rewrittenText, setRewrittenText] = useState("");
+  const [isLoading, setIsLoading] = useState(false);
+  const [error, setError] = useState<string | null>(null);
+
+  // Debounce saving customPrompt to localStorage
+  const debounceRef = useRef<ReturnType<typeof setTimeout>>(undefined);
+  useEffect(() => {
+    debounceRef.current = setTimeout(() => {
+      localStorage.setItem(CUSTOM_PROMPT_KEY, customPrompt);
+    }, 300);
+    return () => clearTimeout(debounceRef.current);
+  }, [customPrompt]);
+
+  // Reset state when modal opens
+  useEffect(() => {
+    if (isOpen) {
+      setRewrittenText("");
+      setError(null);
+      setIsLoading(false);
+    }
+  }, [isOpen]);
+
+  const handleRewrite = useCallback(async () => {
+    if (!originalText.trim()) return;
+
+    setIsLoading(true);
+    setError(null);
+
+    try {
+      const { data: res } = await api.post<
+        ApiResponse<{ rewritten_text: string }>
+      >("/api/ai/rewrite", {
+        text: originalText,
+        custom_prompt: customPrompt.trim() || null,
+      });
+      const payload = unwrap(res);
+      setRewrittenText(payload.rewritten_text || "");
+    } catch (err: unknown) {
+      console.error("AI rewrite failed:", err);
+      const axiosErr = err as {
+        response?: { data?: { message?: string } };
+        message?: string;
+      };
+      const msg =
+        axiosErr.response?.data?.message || axiosErr.message || "改写失败，请重试";
+      setError(msg);
+    } finally {
+      setIsLoading(false);
+    }
+  }, [originalText, customPrompt]);
+
+  const handleApply = () => {
+    onApply(rewrittenText);
+    onClose();
+  };
+
+  const handleRetry = () => {
+    setRewrittenText("");
+    setError(null);
+  };
+
+  // ESC to close
+  useEffect(() => {
+    if (!isOpen) return;
+    const handleKeyDown = (e: KeyboardEvent) => {
+      if (e.key === "Escape") onClose();
+    };
+    document.addEventListener("keydown", handleKeyDown);
+    return () => document.removeEventListener("keydown", handleKeyDown);
+  }, [isOpen, onClose]);
+
+  if (!isOpen) return null;
+
+  return (
+    <div className="fixed inset-0 z-50 flex items-center justify-center bg-black/80 backdrop-blur-sm p-4 animate-in fade-in duration-200">
+      <div className="bg-[#1a1a1a] border border-white/10 rounded-2xl w-full max-w-2xl max-h-[90vh] overflow-hidden flex flex-col shadow-2xl">
+        {/* Header */}
+        <div className="flex items-center justify-between p-4 border-b border-white/10 bg-white/5">
+          <h3 className="text-lg font-semibold text-white flex items-center gap-2">
+            <Sparkles className="h-5 w-5 text-purple-400" />
+            AI 智能改写
+          </h3>
+          <button
+            onClick={onClose}
+            className="text-gray-400 hover:text-white transition-colors text-2xl leading-none"
+          >
+            &times;
+          </button>
+        </div>
+
+        {/* Content */}
+        <div className="flex-1 overflow-y-auto p-6 space-y-5">
+          {/* Custom Prompt */}
+          <div className="space-y-2">
+            <label className="text-sm text-gray-300">
+              自定义提示词 (可选)
+            </label>
+            <textarea
+              value={customPrompt}
+              onChange={(e) => setCustomPrompt(e.target.value)}
+              placeholder="输入改写要求..."
+              rows={3}
+              className="w-full bg-black/20 border border-white/10 rounded-xl px-3 py-2 text-sm text-white placeholder-gray-500 focus:outline-none focus:border-purple-500 transition-colors resize-none"
+            />
+            <p className="text-xs text-gray-500">留空则使用默认提示词</p>
+          </div>
+
+          {/* Action button (before result) */}
+          {!rewrittenText && (
+            <button
+              onClick={handleRewrite}
+              disabled={isLoading || !originalText.trim()}
+              className="w-full py-3 px-4 bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-500 hover:to-pink-500 disabled:opacity-50 disabled:cursor-not-allowed text-white rounded-xl transition-all font-medium shadow-lg flex items-center justify-center gap-2"
+            >
+              {isLoading ? (
+                <>
+                  <Loader2 className="w-5 h-5 animate-spin" />
+                  改写中...
+                </>
+              ) : (
+                <>
+                  <Sparkles className="w-5 h-5" />
+                  开始改写
+                </>
+              )}
+            </button>
+          )}
+
+          {/* Error */}
+          {error && (
+            <div className="bg-red-500/10 border border-red-500/30 rounded-xl p-4">
+              <p className="text-red-400 text-sm">{error}</p>
+            </div>
+          )}
+
+          {/* Rewritten result */}
+          {rewrittenText && (
+            <>
+              <div className="space-y-2">
+                <div className="flex justify-between items-center">
+                  <h4 className="font-semibold text-purple-300 flex items-center gap-2">
+                    <Sparkles className="h-4 w-4" />
+                    AI 改写结果
+                  </h4>
+                  <button
+                    onClick={handleApply}
+                    className="text-xs bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-500 hover:to-pink-500 text-white px-3 py-1.5 rounded-lg transition-colors shadow-sm"
+                  >
+                    使用此结果
+                  </button>
+                </div>
+                <div className="bg-purple-900/10 border border-purple-500/20 rounded-xl p-4 max-h-60 overflow-y-auto hide-scrollbar">
+                  <p className="text-gray-200 text-sm leading-relaxed whitespace-pre-wrap">
+                    {rewrittenText}
+                  </p>
+                </div>
+              </div>
+
+              <div className="space-y-2">
+                <div className="flex justify-between items-center">
+                  <h4 className="font-semibold text-gray-400 flex items-center gap-2">
+                    📝 原文对比
+                  </h4>
+                  <button
+                    onClick={onClose}
+                    className="text-xs bg-white/10 hover:bg-white/20 text-white px-3 py-1.5 rounded-lg transition-colors"
+                  >
+                    保留原文
+                  </button>
+                </div>
+                <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-40 overflow-y-auto hide-scrollbar">
+                  <p className="text-gray-400 text-sm leading-relaxed whitespace-pre-wrap">
+                    {originalText}
+                  </p>
+                </div>
+              </div>
+
+              <button
+                onClick={handleRetry}
+                className="w-full py-2.5 px-4 bg-white/10 hover:bg-white/20 text-white rounded-xl transition-colors"
+              >
+                重新改写
+              </button>
+            </>
+          )}
+        </div>
+      </div>
+    </div>
+  );
+}
--- a/frontend/src/features/home/ui/ScriptEditor.tsx
+++ b/frontend/src/features/home/ui/ScriptEditor.tsx
@@ -1,52 +1,213 @@
-import { FileText, Loader2, Sparkles } from "lucide-react";
+import { useEffect, useRef, useState } from "react";
+import { FileText, History, Languages, Loader2, RotateCcw, Save, Sparkles, Trash2 } from "lucide-react";
+import type { SavedScript } from "@/features/home/model/useSavedScripts";
+
+const LANGUAGES = [
+  { code: "English", label: "英语 English" },
+  { code: "日本語", label: "日语 日本語" },
+  { code: "한국어", label: "韩语 한국어" },
+  { code: "Français", label: "法语 Français" },
+  { code: "Deutsch", label: "德语 Deutsch" },
+  { code: "Español", label: "西班牙语 Español" },
+  { code: "Русский", label: "俄语 Русский" },
+  { code: "Italiano", label: "意大利语 Italiano" },
+  { code: "Português", label: "葡萄牙语 Português" },
+];

 interface ScriptEditorProps {
  text: string;
  onChangeText: (value: string) => void;
  onOpenExtractModal: () => void;
+  onOpenRewriteModal: () => void;
  onGenerateMeta: () => void;
  isGeneratingMeta: boolean;
+  onTranslate: (targetLang: string) => void;
+  isTranslating: boolean;
+  hasOriginalText: boolean;
+  onRestoreOriginal: () => void;
+  savedScripts: SavedScript[];
+  onSaveScript: () => void;
+  onLoadScript: (content: string) => void;
+  onDeleteScript: (id: string) => void;
 }

 export function ScriptEditor({
  text,
  onChangeText,
  onOpenExtractModal,
+  onOpenRewriteModal,
  onGenerateMeta,
  isGeneratingMeta,
+  onTranslate,
+  isTranslating,
+  hasOriginalText,
+  onRestoreOriginal,
+  savedScripts,
+  onSaveScript,
+  onLoadScript,
+  onDeleteScript,
 }: ScriptEditorProps) {
+  const [showLangMenu, setShowLangMenu] = useState(false);
+  const langMenuRef = useRef<HTMLDivElement>(null);
+  const [showHistoryMenu, setShowHistoryMenu] = useState(false);
+  const historyMenuRef = useRef<HTMLDivElement>(null);
+
+  useEffect(() => {
+    if (!showLangMenu) return;
+    const handleClickOutside = (e: MouseEvent) => {
+      if (langMenuRef.current && !langMenuRef.current.contains(e.target as Node)) {
+        setShowLangMenu(false);
+      }
+    };
+    document.addEventListener("mousedown", handleClickOutside);
+    return () => document.removeEventListener("mousedown", handleClickOutside);
+  }, [showLangMenu]);
+
+  useEffect(() => {
+    if (!showHistoryMenu) return;
+    const handleClickOutside = (e: MouseEvent) => {
+      if (historyMenuRef.current && !historyMenuRef.current.contains(e.target as Node)) {
+        setShowHistoryMenu(false);
+      }
+    };
+    document.addEventListener("mousedown", handleClickOutside);
+    return () => document.removeEventListener("mousedown", handleClickOutside);
+  }, [showHistoryMenu]);
+
+  const handleSelectLang = (langCode: string) => {
+    setShowLangMenu(false);
+    onTranslate(langCode);
+  };
+
+  const formatDate = (ts: number) => {
+    const d = new Date(ts);
+    return `${(d.getMonth() + 1).toString().padStart(2, "0")}-${d.getDate().toString().padStart(2, "0")} ${d.getHours().toString().padStart(2, "0")}:${d.getMinutes().toString().padStart(2, "0")}`;
+  };
+
  return (
-    <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
-      <div className="flex flex-wrap justify-between items-center gap-2 mb-4">
-        <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2 whitespace-nowrap">
-          ✍️ 文案提取与编辑
+    <div className="relative z-10 bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
+      <div className="mb-4 space-y-3">
+        <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2">
+          一、文案提取与编辑
        </h2>
-        <div className="flex gap-2 flex-shrink-0">
+        <div className="flex gap-2 flex-wrap justify-end items-center">
+          {/* Saved scripts (history) */}
+          <div className="relative" ref={historyMenuRef}>
+            <button
+              onClick={() => setShowHistoryMenu((prev) => !prev)}
+              className="h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap bg-gray-600 hover:bg-gray-500 text-white inline-flex items-center gap-1"
+            >
+              <History className="h-3.5 w-3.5" />
+              历史文案
+            </button>
+            {showHistoryMenu && (
+              <div className="absolute left-0 top-full mt-1 z-50 bg-gray-800 border border-white/10 rounded-lg shadow-xl py-1 min-w-[220px] max-h-[280px] overflow-y-auto">
+                {savedScripts.length === 0 ? (
+                  <div className="px-3 py-3 text-xs text-gray-500 text-center">暂无保存的文案</div>
+                ) : (
+                  savedScripts.map((script) => (
+                    <div
+                      key={script.id}
+                      className="flex items-center gap-1 px-3 py-1.5 hover:bg-white/10 transition-colors group"
+                    >
+                      <button
+                        onClick={() => {
+                          onLoadScript(script.content);
+                          setShowHistoryMenu(false);
+                        }}
+                        className="flex-1 text-left min-w-0"
+                      >
+                        <div className="text-xs text-gray-200 truncate">{script.name}</div>
+                        <div className="text-[10px] text-gray-500">{formatDate(script.savedAt)}</div>
+                      </button>
+                      <button
+                        onClick={(e) => {
+                          e.stopPropagation();
+                          onDeleteScript(script.id);
+                        }}
+                        className="opacity-40 group-hover:opacity-100 p-1 text-gray-500 hover:text-red-400 transition-all shrink-0"
+                      >
+                        <Trash2 className="h-3 w-3" />
+                      </button>
+                    </div>
+                  ))
+                )}
+              </div>
+            )}
+          </div>
          <button
            onClick={onOpenExtractModal}
-            className="px-2 py-1 text-xs rounded transition-all whitespace-nowrap bg-purple-600 hover:bg-purple-700 text-white flex items-center gap-1"
+            className="h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap bg-purple-600 hover:bg-purple-700 text-white inline-flex items-center gap-1"
          >
            <FileText className="h-3.5 w-3.5" />
            文案提取助手
          </button>
+          <div className="relative" ref={langMenuRef}>
+            <button
+              onClick={() => setShowLangMenu((prev) => !prev)}
+              disabled={isTranslating || !text.trim()}
+              className={`h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap inline-flex items-center gap-1 ${
+                isTranslating || !text.trim()
+                  ? "bg-gray-600 cursor-not-allowed text-gray-400"
+                  : "bg-gradient-to-r from-emerald-600 to-teal-600 hover:from-emerald-700 hover:to-teal-700 text-white"
+              }`}
+            >
+              {isTranslating ? (
+                <>
+                  <Loader2 className="h-3.5 w-3.5 animate-spin" />
+                  翻译中...
+                </>
+              ) : (
+                <>
+                  <Languages className="h-3.5 w-3.5" />
+                  AI多语言
+                </>
+              )}
+            </button>
+            {showLangMenu && (
+              <div className="absolute right-0 top-full mt-1 z-50 bg-gray-800 border border-white/10 rounded-lg shadow-xl py-1 min-w-[160px]">
+                {hasOriginalText && (
+                  <>
+                    <button
+                      onClick={() => { setShowLangMenu(false); onRestoreOriginal(); }}
+                      className="w-full text-left px-3 py-1.5 text-xs text-amber-400 hover:bg-white/10 transition-colors flex items-center gap-1"
+                    >
+                      <RotateCcw className="h-3 w-3" />
+                      还原原文
+                    </button>
+                    <div className="border-t border-white/10 my-1" />
+                  </>
+                )}
+                {LANGUAGES.map((lang) => (
+                  <button
+                    key={lang.code}
+                    onClick={() => handleSelectLang(lang.code)}
+                    className="w-full text-left px-3 py-1.5 text-xs text-gray-200 hover:bg-white/10 transition-colors"
+                  >
+                    {lang.label}
+                  </button>
+                ))}
+              </div>
+            )}
+          </div>
          <button
            onClick={onGenerateMeta}
            disabled={isGeneratingMeta || !text.trim()}
-            className={`px-2 py-1 text-xs rounded transition-all whitespace-nowrap ${isGeneratingMeta || !text.trim()
+            className={`h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap inline-flex items-center gap-1 ${isGeneratingMeta || !text.trim()
              ? "bg-gray-600 cursor-not-allowed text-gray-400"
              : "bg-gradient-to-r from-blue-600 to-cyan-600 hover:from-blue-700 hover:to-cyan-700 text-white"
              }`}
          >
            {isGeneratingMeta ? (
-              <span className="flex items-center gap-1">
+              <>
                <Loader2 className="h-3.5 w-3.5 animate-spin" />
                生成中...
-              </span>
+              </>
            ) : (
-              <span className="flex items-center gap-1">
+              <>
                <Sparkles className="h-3.5 w-3.5" />
                AI生成标题标签
-              </span>
+              </>
            )}
          </button>
        </div>
@@ -57,9 +218,34 @@ export function ScriptEditor({
        placeholder="请输入你想说的话..."
        className="w-full h-40 bg-black/30 border border-white/10 rounded-xl p-4 text-white placeholder-gray-500 resize-none focus:outline-none focus:border-purple-500 transition-colors hide-scrollbar"
      />
-      <div className="flex justify-between mt-2 text-sm text-gray-400">
+      <div className="flex items-center justify-between mt-2 text-sm text-gray-400">
        <span>{text.length} 字</span>
-        <span>预计时长: ~{Math.ceil(text.length / 4)} 秒</span>
+        <div className="flex items-center gap-2">
+          <button
+            onClick={onOpenRewriteModal}
+            disabled={!text.trim()}
+            className={`px-2.5 py-1 text-xs rounded transition-all flex items-center gap-1 ${
+              !text.trim()
+                ? "bg-gray-700 cursor-not-allowed text-gray-500"
+                : "bg-purple-600/80 hover:bg-purple-600 text-white"
+            }`}
+          >
+            <Sparkles className="h-3 w-3" />
+            AI智能改写
+          </button>
+          <button
+            onClick={onSaveScript}
+            disabled={!text.trim()}
+            className={`px-2.5 py-1 text-xs rounded transition-all flex items-center gap-1 ${
+              !text.trim()
+                ? "bg-gray-700 cursor-not-allowed text-gray-500"
+                : "bg-amber-600/80 hover:bg-amber-600 text-white"
+            }`}
+          >
+            <Save className="h-3 w-3" />
+            保存文案
+          </button>
+        </div>
      </div>
    </div>
  );
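Note on the ScriptEditor.tsx change above: the two `useEffect` blocks for `langMenuRef` and `historyMenuRef` repeat the same outside-click logic. A minimal sketch of how that logic could be shared follows. `listenOutsideMouseDown`, `ContainerLike`, and `MouseTargetLike` are hypothetical names that do not exist in this commit, and the structural interfaces are an assumption made so the helper can be exercised without a DOM.

```typescript
// Hypothetical helper, not part of this commit. All names here are assumptions.

// Anything that can answer "is this node inside me?" (an HTMLElement qualifies).
interface ContainerLike {
  contains(node: unknown): boolean;
}

// The small slice of the event-target API the helper needs.
interface MouseTargetLike {
  addEventListener(type: "mousedown", listener: (e: { target: unknown }) => void): void;
  removeEventListener(type: "mousedown", listener: (e: { target: unknown }) => void): void;
}

// Subscribes to "mousedown" on `root` and calls `onOutside` whenever the event
// target is not inside the current container. Returns a disposer that removes
// the listener, so it can be returned directly from a useEffect callback.
function listenOutsideMouseDown(
  root: MouseTargetLike,
  getContainer: () => ContainerLike | null,
  onOutside: () => void,
): () => void {
  const handler = (e: { target: unknown }) => {
    const container = getContainer();
    // Same guard as the original effects: do nothing while the ref is unset.
    if (container && !container.contains(e.target)) onOutside();
  };
  root.addEventListener("mousedown", handler);
  return () => root.removeEventListener("mousedown", handler);
}
```

Under these assumptions, each effect would reduce to an early return while its menu is closed, followed by returning the disposer, with `document` as the root and `() => langMenuRef.current` as the container getter. Whether `document` satisfies the narrowed interface without a cast depends on the project's TypeScript settings.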
--- a/frontend/src/features/home/ui/ScriptExtractionModal.tsx
+++ b/frontend/src/features/home/ui/ScriptExtractionModal.tsx
@@ -18,15 +18,12 @@ export default function ScriptExtractionModal({
    const {
        isLoading,
        script,
-        rewrittenScript,
        error,
-        doRewrite,
        step,
        dragActive,
        selectedFile,
        activeTab,
        inputUrl,
-        setDoRewrite,
        setActiveTab,
        setInputUrl,
        handleDrag,
@@ -186,21 +183,6 @@ export default function ScriptExtractionModal({
                                </div>
                            )}

-                            {/* Options */}
-                            <div className="flex items-center gap-3 bg-white/5 rounded-xl p-4 border border-white/10">
-                                <label className="flex items-center gap-2 cursor-pointer">
-                                    <input
-                                        type="checkbox"
-                                        checked={doRewrite}
-                                        onChange={(e) => setDoRewrite(e.target.checked)}
-                                        className="w-4 h-4 rounded bg-white/10 border-white/20 text-purple-500 focus:ring-purple-500"
-                                    />
-                                    <span className="text-sm text-gray-300">
-                                        AI 智能改写（去口语化）
-                                    </span>
-                                </label>
-                            </div>
-
                            {/* Error */}
                            {error && (
                                <div className="bg-red-500/10 border border-red-500/30 rounded-xl p-4">
@@ -244,9 +226,7 @@ export default function ScriptExtractionModal({
                            <p className="text-sm text-gray-400 text-center max-w-sm px-4">
                                {activeTab === "url" && "正在下载视频..."}
                                <br />
-                                {doRewrite
-                                    ? "正在进行语音识别和 AI 智能改写..."
-                                    : "正在进行语音识别..."}
+                                正在进行语音识别...
                                <br />
                                <span className="opacity-75">
                                    大文件可能需要几分钟，请不要关闭窗口
@@ -257,47 +237,16 @@ export default function ScriptExtractionModal({

                    {step === "result" && (
                        <div className="space-y-6">
-                            {rewrittenScript && (
                            <div className="space-y-2">
                                <div className="flex justify-between items-center">
-                                        <h4 className="font-semibold text-purple-300 flex items-center gap-2">
-                                            ✨ AI 洗稿结果{" "}
-                                            <span className="text-xs font-normal text-purple-400/70">
-                                                (推荐)
-                                            </span>
-                                        </h4>
-                                        {onApply && (
-                                            <button
-                                                onClick={() => handleApplyAndClose(rewrittenScript)}
-                                                className="text-xs bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-500 hover:to-pink-500 text-white px-3 py-1.5 rounded-lg transition-colors flex items-center gap-1 shadow-sm"
-                                            >
-                                                📥 填入
-                                            </button>
-                                        )}
-                                        <button
-                                            onClick={() => copyToClipboard(rewrittenScript)}
-                                            className="text-xs bg-purple-600 hover:bg-purple-500 text-white px-3 py-1.5 rounded-lg transition-colors flex items-center gap-1"
-                                        >
-                                            📋 复制内容
-                                        </button>
-                                    </div>
-                                    <div className="bg-purple-900/10 border border-purple-500/20 rounded-xl p-4 max-h-60 overflow-y-auto custom-scrollbar">
-                                        <p className="text-gray-200 text-sm leading-relaxed whitespace-pre-wrap">
-                                            {rewrittenScript}
-                                        </p>
-                                    </div>
-                                </div>
-                            )}
-
-                            <div className="space-y-2">
-                                <div className="flex justify-between items-center">
-                                    <h4 className="font-semibold text-gray-400 flex items-center gap-2">
-                                        🎙️ 原始识别结果
+                                    <h4 className="font-semibold text-gray-300 flex items-center gap-2">
+                                        🎙️ 识别结果
                                    </h4>
+                                    <div className="flex items-center gap-2">
                                        {onApply && (
                                            <button
                                                onClick={() => handleApplyAndClose(script)}
-                                            className="text-xs bg-white/10 hover:bg-white/20 text-white px-3 py-1.5 rounded-lg transition-colors flex items-center gap-1"
+                                                className="text-xs bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-500 hover:to-pink-500 text-white px-3 py-1.5 rounded-lg transition-colors flex items-center gap-1 shadow-sm"
                                            >
                                                📥 填入
                                            </button>
@@ -309,8 +258,9 @@ export default function ScriptExtractionModal({
                                            复制
                                        </button>
                                    </div>
-                                <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-40 overflow-y-auto custom-scrollbar">
-                                    <p className="text-gray-400 text-sm leading-relaxed whitespace-pre-wrap">
+                                </div>
+                                <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-60 overflow-y-auto hide-scrollbar">
+                                    <p className="text-gray-200 text-sm leading-relaxed whitespace-pre-wrap">
                                        {script}
                                    </p>
                                </div>
--- a/frontend/src/features/home/ui/TimelineEditor.tsx
+++ b/frontend/src/features/home/ui/TimelineEditor.tsx
@@ -0,0 +1,364 @@
+import { useEffect, useRef, useCallback, useState, useMemo } from "react";
+import WaveSurfer from "wavesurfer.js";
+import { ChevronDown, GripVertical } from "lucide-react";
+import type { TimelineSegment } from "@/features/home/model/useTimelineEditor";
+import type { Material } from "@/shared/types/material";
+
+interface TimelineEditorProps {
+  audioDuration: number;
+  audioUrl: string;
+  segments: TimelineSegment[];
+  materials: Material[];
+  outputAspectRatio: "9:16" | "16:9";
+  onOutputAspectRatioChange: (ratio: "9:16" | "16:9") => void;
+  onReorderSegment: (fromIdx: number, toIdx: number) => void;
+  onClickSegment: (segment: TimelineSegment) => void;
+  embedded?: boolean;
+}
+
+function formatTime(sec: number): string {
+  const t = Math.round(sec * 10) / 10; // round to 0.1 s first so 59.96 renders as 01:00.0, not 00:60.0
+  const [m, s] = [Math.floor(t / 60), t % 60];
+  return `${String(m).padStart(2, "0")}:${s.toFixed(1).padStart(4, "0")}`;
+}
+
+export function TimelineEditor({
+  audioDuration,
+  audioUrl,
+  segments,
+  materials,
+  outputAspectRatio,
+  onOutputAspectRatioChange,
+  onReorderSegment,
+  onClickSegment,
+  embedded = false,
+}: TimelineEditorProps) {
+  const waveRef = useRef<HTMLDivElement>(null);
+  const wsRef = useRef<WaveSurfer | null>(null);
+  const [waveReady, setWaveReady] = useState(false);
+  const [isPlaying, setIsPlaying] = useState(false);
+
+  // Refs for high-frequency DOM updates (avoid 60fps re-renders)
+  const playheadRef = useRef<HTMLDivElement>(null);
+  const timeRef = useRef<HTMLSpanElement>(null);
+  const audioDurationRef = useRef(audioDuration);
+
+  useEffect(() => {
+    audioDurationRef.current = audioDuration;
+  }, [audioDuration]);
+
+  // Drag-to-reorder state
+  const [dragFromIdx, setDragFromIdx] = useState<number | null>(null);
+  const [dragOverIdx, setDragOverIdx] = useState<number | null>(null);
+
+  // Aspect ratio dropdown
+  const [ratioOpen, setRatioOpen] = useState(false);
+  const ratioRef = useRef<HTMLDivElement>(null);
+  const ratioOptions = [
+    { value: "9:16" as const, label: "竖屏 9:16" },
+    { value: "16:9" as const, label: "横屏 16:9" },
+  ];
+  const currentRatioLabel =
+    ratioOptions.find((opt) => opt.value === outputAspectRatio)?.label ?? "竖屏 9:16";
+
+  useEffect(() => {
+    const handler = (e: MouseEvent) => {
+      if (ratioRef.current && !ratioRef.current.contains(e.target as Node)) {
+        setRatioOpen(false);
+      }
+    };
+    if (ratioOpen) document.addEventListener("mousedown", handler);
+    return () => document.removeEventListener("mousedown", handler);
+  }, [ratioOpen]);
+
+  // Create / recreate wavesurfer when audioUrl changes
+  useEffect(() => {
+    if (!waveRef.current || !audioUrl) return;
+
+    const playheadEl = playheadRef.current;
+    const timeEl = timeRef.current;
+
+    // Destroy previous instance
+    if (wsRef.current) {
+      wsRef.current.destroy();
+      wsRef.current = null;
+    }
+
+    const ws = WaveSurfer.create({
+      container: waveRef.current,
+      height: 56,
+      waveColor: "#6d28d9",
+      progressColor: "#a855f7",
+      barWidth: 2,
+      barGap: 1,
+      barRadius: 2,
+      cursorWidth: 1,
+      cursorColor: "#e879f9",
+      interact: true,
+      normalize: true,
+    });
+
+    // Click waveform → seek + auto-play
+    ws.on("interaction", () => ws.play());
+    ws.on("play", () => setIsPlaying(true));
+    ws.on("pause", () => setIsPlaying(false));
+    ws.on("finish", () => {
+      setIsPlaying(false);
+      if (playheadRef.current) playheadRef.current.style.display = "none";
+    });
+    // High-frequency: update playhead + time via refs (no React re-render)
+    ws.on("timeupdate", (time: number) => {
+      const dur = audioDurationRef.current;
+      if (playheadRef.current && dur > 0) {
+        playheadRef.current.style.left = `${(time / dur) * 100}%`;
+        playheadRef.current.style.display = "block";
+      }
+      if (timeRef.current) {
+        timeRef.current.textContent = formatTime(time);
+      }
+    });
+
+    ws.load(audioUrl);
+    wsRef.current = ws;
+
+    return () => {
+      ws.destroy();
+      wsRef.current = null;
+      setIsPlaying(false);
+      if (playheadEl) playheadEl.style.display = "none";
+      if (timeEl) timeEl.textContent = formatTime(0);
+    };
+  }, [audioUrl, waveReady]);
+
+  // Callback ref to detect when waveRef div mounts
+  const waveCallbackRef = useCallback((node: HTMLDivElement | null) => {
+    (waveRef as React.MutableRefObject<HTMLDivElement | null>).current = node;
+    setWaveReady(!!node);
+  }, []);
+
+  const handlePlayPause = useCallback(() => {
+    wsRef.current?.playPause();
+  }, []);
+
+  // Drag-to-reorder handlers
+  const handleDragStart = useCallback((idx: number, e: React.DragEvent) => {
+    setDragFromIdx(idx);
+    e.dataTransfer.effectAllowed = "move";
+    e.dataTransfer.setData("text/plain", String(idx));
+  }, []);
+
+  const handleDragOver = useCallback((idx: number, e: React.DragEvent) => {
+    e.preventDefault();
+    e.dataTransfer.dropEffect = "move";
+    setDragOverIdx(idx);
+  }, []);
+
+  const handleDragLeave = useCallback(() => {
+    setDragOverIdx(null);
+  }, []);
+
+  const handleDrop = useCallback((toIdx: number, e: React.DragEvent) => {
+    e.preventDefault();
+    const fromIdx = parseInt(e.dataTransfer.getData("text/plain"), 10);
+    if (!Number.isNaN(fromIdx) && fromIdx !== toIdx) {
+      onReorderSegment(fromIdx, toIdx);
+    }
+    setDragFromIdx(null);
+    setDragOverIdx(null);
+  }, [onReorderSegment]);
+
+  const handleDragEnd = useCallback(() => {
+    setDragFromIdx(null);
+    setDragOverIdx(null);
+  }, []);
+
+  // Filter visible vs overflow segments
+  const visibleSegments = useMemo(() => segments.filter((s) => s.start < audioDuration), [segments, audioDuration]);
+  const overflowSegments = useMemo(() => segments.filter((s) => s.start >= audioDuration), [segments, audioDuration]);
+  const hasSegments = visibleSegments.length > 0;
+
+  const content = (
+    <>
+      <div className="flex items-center justify-between mb-3">
+        {!embedded ? (
+          <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2">
+            时间轴编辑
+          </h2>
+        ) : (
+          <h3 className="text-sm font-medium text-gray-400">时间轴编辑</h3>
+        )}
+        <div className="flex items-center gap-2 text-xs text-gray-400">
+          <div ref={ratioRef} className="relative">
+            <button
+              type="button"
+              onClick={() => setRatioOpen((v) => !v)}
+              className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1 transition-all"
+              title="设置输出画面比例"
+            >
+              画面: {currentRatioLabel}
+              <ChevronDown className={`h-3 w-3 transition-transform ${ratioOpen ? "rotate-180" : ""}`} />
+            </button>
+            {ratioOpen && (
+              <div className="absolute right-0 top-full mt-1 bg-gray-800 border border-white/20 rounded-lg shadow-xl py-1 z-50 min-w-[106px]">
+                {ratioOptions.map((opt) => (
+                  <button
+                    key={opt.value}
+                    type="button"
+                    onClick={() => {
+                      onOutputAspectRatioChange(opt.value);
+                      setRatioOpen(false);
+                    }}
+                    className={`w-full text-left px-3 py-1.5 text-xs transition-colors ${
+                      outputAspectRatio === opt.value
+                        ? "bg-purple-600/40 text-purple-200"
+                        : "text-gray-300 hover:bg-white/10"
+                    }`}
+                  >
+                    {opt.label}
+                  </button>
+                ))}
+              </div>
+            )}
+          </div>
+
+          {audioUrl && (
+            <>
+              <button
+                onClick={handlePlayPause}
+                className="w-7 h-7 flex items-center justify-center rounded-full bg-white/10 hover:bg-white/20 text-white transition-colors"
+                title={isPlaying ? "暂停" : "播放"}
+              >
+                {isPlaying ? "⏸" : "▶"}
+              </button>
+              <span ref={timeRef} className="tabular-nums">00:00.0</span>
+              <span className="text-gray-600">/</span>
+              <span className="tabular-nums">{formatTime(audioDuration)}</span>
+            </>
+          )}
+        </div>
+      </div>
+
+      {/* Waveform — always rendered so ref stays mounted */}
+      <div className="relative mb-1">
+        <div ref={waveCallbackRef} className="rounded-lg overflow-hidden bg-black/20 cursor-pointer" style={{ minHeight: 56 }} />
+      </div>
+
+      {/* Segment blocks or empty placeholder */}
+      {hasSegments ? (
+        <>
+          <div className="relative h-14 flex select-none">
+            {/* Playhead — syncs with audio playback */}
+            <div
+              ref={playheadRef}
+              className="absolute top-0 h-full w-0.5 bg-fuchsia-400 z-10 pointer-events-none"
+              style={{ display: "none", left: "0%" }}
+            />
+            {visibleSegments.map((seg, i) => {
+              const left = (seg.start / audioDuration) * 100;
+              const width = ((seg.end - seg.start) / audioDuration) * 100;
+              const segDur = seg.end - seg.start;
+              const isDragTarget = dragOverIdx === i && dragFromIdx !== i;
+
+              // Compute loop portion for the last visible segment
+              const isLastVisible = i === visibleSegments.length - 1;
+              let loopPercent = 0;
+              if (isLastVisible && audioDuration > 0) {
+                const mat = materials.find((m) => m.id === seg.materialId);
+                const matDur = mat?.duration_sec ?? 0;
+                const effDur = (seg.sourceEnd > seg.sourceStart)
+                  ? (seg.sourceEnd - seg.sourceStart)
+                  : Math.max(matDur - seg.sourceStart, 0);
+                if (effDur > 0 && segDur > effDur + 0.1) {
+                  loopPercent = ((segDur - effDur) / segDur) * 100;
+                }
+              }
+
+              return (
+                <div key={seg.id} className="absolute top-0 h-full" style={{ left: `${left}%`, width: `${width}%` }}>
+                  <button
+                    draggable
+                    onDragStart={(e) => handleDragStart(i, e)}
+                    onDragOver={(e) => handleDragOver(i, e)}
+                    onDragLeave={handleDragLeave}
+                    onDrop={(e) => handleDrop(i, e)}
+                    onDragEnd={handleDragEnd}
+                    onClick={() => onClickSegment(seg)}
+                    className={`relative w-full h-full rounded-lg flex flex-col items-center justify-center overflow-hidden cursor-grab active:cursor-grabbing transition-all border ${
+                      isDragTarget
+                        ? "ring-2 ring-purple-400 border-purple-400 scale-[1.02]"
+                        : dragFromIdx === i
+                        ? "opacity-50 border-white/10"
+                        : "hover:opacity-90 border-white/10"
+                    }`}
+                    style={{ backgroundColor: seg.color + "33", borderColor: isDragTarget ? undefined : seg.color + "66" }}
+                    title={`拖拽可调换顺序 · 点击设置截取范围\n${seg.materialName}\n${segDur.toFixed(1)}s${loopPercent > 0 ? ` (含循环 ${(segDur * loopPercent / 100).toFixed(1)}s)` : ""}`}
+                  >
+                    <GripVertical className="absolute top-0.5 left-0.5 h-3 w-3 text-white/30 z-[1]" />
+                    <span className="text-[11px] text-white/90 truncate max-w-full px-1 leading-tight z-[1]">
+                      {seg.materialName}
+                    </span>
+                    <span className="text-[10px] text-white/60 leading-tight z-[1]">
+                      {segDur.toFixed(1)}s
+                    </span>
+                    {seg.sourceStart > 0 && (
+                      <span className="text-[9px] text-amber-400/80 leading-tight z-[1]">
+                        ✂ {seg.sourceStart.toFixed(1)}s
+                      </span>
+                    )}
+                    {/* Loop fill stripe overlay */}
+                    {loopPercent > 0 && (
+                      <div
+                        className="absolute top-0 right-0 h-full pointer-events-none flex items-center justify-center"
+                        style={{
+                          width: `${loopPercent}%`,
+                          background: `repeating-linear-gradient(-45deg, transparent, transparent 3px, rgba(255,255,255,0.07) 3px, rgba(255,255,255,0.07) 6px)`,
+                          borderLeft: "1px dashed rgba(255,255,255,0.25)",
+                        }}
+                      >
+                        <span className="text-[9px] text-white/30">循环</span>
+                      </div>
+                    )}
+                  </button>
+                </div>
+              );
+            })}
+          </div>
+
+          {/* Overflow segments — shown as gray chips */}
+          {overflowSegments.length > 0 && (
+            <div className="flex flex-wrap items-center gap-1.5 mt-1.5">
+              <span className="text-[10px] text-gray-500">未使用:</span>
+              {overflowSegments.map((seg) => (
+                <span
+                  key={seg.id}
+                  className="text-[10px] text-gray-500 bg-white/5 border border-white/10 rounded px-1.5 py-0.5"
+                >
+                  {seg.materialName}
+                </span>
+              ))}
+            </div>
+          )}
+
+          <p className="text-[10px] text-gray-500 mt-1.5">
+            点击波形定位播放 · 拖拽色块调换顺序 · 点击色块设置截取范围
+          </p>
+        </>
+      ) : (
+        <>
+          <div className="h-14 bg-white/5 rounded-lg" />
+          <p className="text-[10px] text-gray-500 mt-1.5">
+            选中配音和素材后可编辑时间轴
+          </p>
+        </>
+      )}
+    </>
+  );
+
+  if (embedded) return content;
+
+  return (
+    <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
+      {content}
+    </div>
+  );
+}
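
Reviewer note: the `loopPercent` expression in the hunk above is the share of a segment's timeline slot that is filled by looping the source clip. The sketch below isolates that arithmetic as a pure function. It assumes `segDur` is the slot length and `effDur` the usable source length, both in seconds; `computeLoopPercent` and its guards are hypothetical and not part of this change.

```typescript
// Hypothetical helper: share (0..100) of a segment that is filled by looping.
// Assumes segDur = slot length on the timeline, effDur = usable source length.
function computeLoopPercent(segDur: number, effDur: number): number {
  // No loop region when the slot is empty or the source already covers it.
  if (segDur <= 0 || effDur >= segDur) return 0;
  // Treat a negative source length as zero so the result stays within 0..100.
  const covered = Math.max(effDur, 0);
  return ((segDur - covered) / segDur) * 100;
}
```

Keeping the calculation outside the render callback would make the striped overlay width and the tooltip text easy to check in a unit test.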
--- a/frontend/src/features/home/ui/TitleSubtitlePanel.tsx
+++ b/frontend/src/features/home/ui/TitleSubtitlePanel.tsx
@@ -1,4 +1,4 @@
-import { Eye } from "lucide-react";
+import { ChevronDown, Eye } from "lucide-react";
 import { FloatingStylePreview } from "@/features/home/ui/FloatingStylePreview";

 interface SubtitleStyleOption {
@@ -38,11 +38,21 @@ interface TitleSubtitlePanelProps {
  onTitleChange: (value: string) => void;
  onTitleCompositionStart?: () => void;
  onTitleCompositionEnd?: (value: string) => void;
+  videoSecondaryTitle: string;
+  onSecondaryTitleChange: (value: string) => void;
+  onSecondaryTitleCompositionStart?: () => void;
+  onSecondaryTitleCompositionEnd?: (value: string) => void;
  titleStyles: TitleStyleOption[];
  selectedTitleStyleId: string;
  onSelectTitleStyle: (id: string) => void;
  titleFontSize: number;
  onTitleFontSizeChange: (value: number) => void;
+  selectedSecondaryTitleStyleId: string;
+  onSelectSecondaryTitleStyle: (id: string) => void;
+  secondaryTitleFontSize: number;
+  onSecondaryTitleFontSizeChange: (value: number) => void;
+  secondaryTitleTopMargin: number;
+  onSecondaryTitleTopMarginChange: (value: number) => void;
  subtitleStyles: SubtitleStyleOption[];
  selectedSubtitleStyleId: string;
  onSelectSubtitleStyle: (id: string) => void;
@@ -52,13 +62,14 @@ interface TitleSubtitlePanelProps {
  onTitleTopMarginChange: (value: number) => void;
  subtitleBottomMargin: number;
  onSubtitleBottomMarginChange: (value: number) => void;
-  enableSubtitles: boolean;
-  onToggleSubtitles: (value: boolean) => void;
+  titleDisplayMode: "short" | "persistent";
+  onTitleDisplayModeChange: (mode: "short" | "persistent") => void;
  resolveAssetUrl: (path?: string | null) => string | null;
  getFontFormat: (fontFile?: string) => string;
  buildTextShadow: (color: string, size: number) => string;
  previewBaseWidth?: number;
  previewBaseHeight?: number;
+  previewBackgroundUrl?: string | null;
 }

 export function TitleSubtitlePanel({
@@ -68,11 +79,21 @@ export function TitleSubtitlePanel({
  onTitleChange,
  onTitleCompositionStart,
  onTitleCompositionEnd,
+  videoSecondaryTitle,
+  onSecondaryTitleChange,
+  onSecondaryTitleCompositionStart,
+  onSecondaryTitleCompositionEnd,
  titleStyles,
  selectedTitleStyleId,
  onSelectTitleStyle,
  titleFontSize,
  onTitleFontSizeChange,
+  selectedSecondaryTitleStyleId,
+  onSelectSecondaryTitleStyle,
+  secondaryTitleFontSize,
+  onSecondaryTitleFontSizeChange,
+  secondaryTitleTopMargin,
+  onSecondaryTitleTopMarginChange,
  subtitleStyles,
  selectedSubtitleStyleId,
  onSelectSubtitleStyle,
@@ -82,20 +103,34 @@ export function TitleSubtitlePanel({
  onTitleTopMarginChange,
  subtitleBottomMargin,
  onSubtitleBottomMarginChange,
-  enableSubtitles,
-  onToggleSubtitles,
+  titleDisplayMode,
+  onTitleDisplayModeChange,
  resolveAssetUrl,
  getFontFormat,
  buildTextShadow,
  previewBaseWidth = 1080,
  previewBaseHeight = 1920,
+  previewBackgroundUrl,
 }: TitleSubtitlePanelProps) {
  return (
    <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
      <div className="flex items-center justify-between mb-4 gap-2">
        <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2">
-          🎬 标题与字幕
+          四、标题与字幕
        </h2>
+        <div className="flex items-center gap-1.5">
+          <div className="relative shrink-0">
+            <select
+              value={titleDisplayMode}
+              onChange={(e) => onTitleDisplayModeChange(e.target.value as "short" | "persistent")}
+              className="appearance-none rounded-lg border border-white/15 bg-black/35 px-2.5 py-1.5 pr-7 text-xs text-gray-200 outline-none transition-colors hover:border-white/25 focus:border-purple-500"
+              aria-label="标题显示方式"
+            >
+              <option value="short">标题短暂显示</option>
+              <option value="persistent">标题常驻显示</option>
+            </select>
+            <ChevronDown className="pointer-events-none absolute right-2 top-1/2 h-3.5 w-3.5 -translate-y-1/2 text-gray-400" />
+          </div>
          <button
            onClick={onTogglePreview}
            className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 flex items-center gap-1"
@@ -104,30 +139,39 @@ export function TitleSubtitlePanel({
            {showStylePreview ? "收起预览" : "预览样式"}
          </button>
        </div>
+      </div>

      {showStylePreview && (
        <FloatingStylePreview
          onClose={onTogglePreview}
          videoTitle={videoTitle}
+          videoSecondaryTitle={videoSecondaryTitle}
          titleStyles={titleStyles}
          selectedTitleStyleId={selectedTitleStyleId}
          titleFontSize={titleFontSize}
+          selectedSecondaryTitleStyleId={selectedSecondaryTitleStyleId}
+          secondaryTitleFontSize={secondaryTitleFontSize}
+          secondaryTitleTopMargin={secondaryTitleTopMargin}
          subtitleStyles={subtitleStyles}
          selectedSubtitleStyleId={selectedSubtitleStyleId}
          subtitleFontSize={subtitleFontSize}
          titleTopMargin={titleTopMargin}
          subtitleBottomMargin={subtitleBottomMargin}
-          enableSubtitles={enableSubtitles}
+          enableSubtitles={true}
          resolveAssetUrl={resolveAssetUrl}
          getFontFormat={getFontFormat}
          buildTextShadow={buildTextShadow}
          previewBaseWidth={previewBaseWidth}
          previewBaseHeight={previewBaseHeight}
+          previewBackgroundUrl={previewBackgroundUrl}
        />
      )}

      <div className="mb-4">
-        <label className="text-sm text-gray-300 mb-2 block">片头标题（限制15个字）</label>
+        <div className="flex items-center justify-between mb-2">
+          <label className="text-sm text-gray-300">片头标题</label>
+          <span className={`text-xs ${videoTitle.length > 15 ? "text-red-400" : "text-gray-500"}`}>{videoTitle.length}/15</span>
+        </div>
        <input
          type="text"
          value={videoTitle}
@@ -139,115 +183,105 @@ export function TitleSubtitlePanel({
        />
      </div>

-      {titleStyles.length > 0 && (
      <div className="mb-4">
-          <label className="text-sm text-gray-300 mb-2 block">标题样式</label>
-          <div className="grid grid-cols-2 gap-2">
+        <div className="flex items-center justify-between mb-2">
+          <label className="text-sm text-gray-300">片头副标题</label>
+          <span className={`text-xs ${videoSecondaryTitle.length > 20 ? "text-red-400" : "text-gray-500"}`}>{videoSecondaryTitle.length}/20</span>
+        </div>
+        <input
+          type="text"
+          value={videoSecondaryTitle}
+          onChange={(e) => onSecondaryTitleChange(e.target.value)}
+          onCompositionStart={onSecondaryTitleCompositionStart}
+          onCompositionEnd={(e) => onSecondaryTitleCompositionEnd?.(e.currentTarget.value)}
+          placeholder="输入副标题，显示在主标题下方"
+          className="w-full px-3 sm:px-4 py-2 text-sm sm:text-base bg-black/30 border border-white/10 rounded-xl text-white placeholder-gray-500 focus:outline-none focus:border-purple-500 transition-colors"
+        />
+      </div>
+
+      {titleStyles.length > 0 && (
+        <div className="mb-4 space-y-3">
+          <div className="flex items-center gap-3">
+            <label className="text-sm text-gray-300 shrink-0 w-20">标题样式</label>
+            <div className="relative w-1/3 min-w-[100px]">
+              <select
+                value={selectedTitleStyleId}
+                onChange={(e) => onSelectTitleStyle(e.target.value)}
+                className="w-full appearance-none rounded-lg border border-white/15 bg-black/35 px-3 py-2 pr-8 text-sm text-gray-200 outline-none transition-colors hover:border-white/25 focus:border-purple-500"
+              >
                {titleStyles.map((style) => (
-              <button
-                key={style.id}
-                onClick={() => onSelectTitleStyle(style.id)}
-                className={`p-2 rounded-lg border transition-all text-left ${selectedTitleStyleId === style.id
-                  ? "border-purple-500 bg-purple-500/20"
-                  : "border-white/10 bg-white/5 hover:border-white/30"
-                  }`}
-              >
-                <div className="text-white text-sm truncate">{style.label}</div>
-                <div className="text-xs text-gray-400 truncate">
-                  {style.font_family || style.font_file || ""}
-                </div>
-              </button>
+                  <option key={style.id} value={style.id}>{style.label}</option>
                ))}
+              </select>
+              <ChevronDown className="pointer-events-none absolute right-2.5 top-1/2 h-3.5 w-3.5 -translate-y-1/2 text-gray-400" />
            </div>
-          <div className="mt-3">
-            <label className="text-xs text-gray-400 mb-2 block">标题字号: {titleFontSize}px</label>
-            <input
-              type="range"
-              min="60"
-              max="150"
-              step="1"
-              value={titleFontSize}
-              onChange={(e) => onTitleFontSizeChange(parseInt(e.target.value, 10))}
-              className="w-full accent-purple-500"
-            />
          </div>
-          <div className="mt-3">
-            <label className="text-xs text-gray-400 mb-2 block">标题位置: {titleTopMargin}px</label>
-            <input
-              type="range"
-              min="0"
-              max="300"
-              step="1"
-              value={titleTopMargin}
-              onChange={(e) => onTitleTopMarginChange(parseInt(e.target.value, 10))}
-              className="w-full accent-purple-500"
-            />
+          <div className="flex items-center gap-3">
+            <label className="text-xs text-gray-400 shrink-0 w-20">字号 {titleFontSize}</label>
+            <input type="range" min="60" max="150" step="1" value={titleFontSize} onChange={(e) => onTitleFontSizeChange(parseInt(e.target.value, 10))} className="flex-1 accent-purple-500" />
+          </div>
+          <div className="flex items-center gap-3">
+            <label className="text-xs text-gray-400 shrink-0 w-20">位置 {titleTopMargin}</label>
+            <input type="range" min="0" max="300" step="1" value={titleTopMargin} onChange={(e) => onTitleTopMarginChange(parseInt(e.target.value, 10))} className="flex-1 accent-purple-500" />
          </div>
        </div>
      )}

-      {enableSubtitles && subtitleStyles.length > 0 && (
-        <div className="mt-4">
-          <label className="text-sm text-gray-300 mb-2 block">字幕样式</label>
-          <div className="grid grid-cols-2 gap-2">
+      {titleStyles.length > 0 && (
+        <div className="mb-4 space-y-3">
+          <div className="flex items-center gap-3">
+            <label className="text-sm text-gray-300 shrink-0 w-20">副标题样式</label>
+            <div className="relative w-1/3 min-w-[100px]">
+              <select
+                value={selectedSecondaryTitleStyleId}
+                onChange={(e) => onSelectSecondaryTitleStyle(e.target.value)}
+                className="w-full appearance-none rounded-lg border border-white/15 bg-black/35 px-3 py-2 pr-8 text-sm text-gray-200 outline-none transition-colors hover:border-white/25 focus:border-purple-500"
+              >
+                {titleStyles.map((style) => (
+                  <option key={style.id} value={style.id}>{style.label}</option>
+                ))}
+              </select>
+              <ChevronDown className="pointer-events-none absolute right-2.5 top-1/2 h-3.5 w-3.5 -translate-y-1/2 text-gray-400" />
+            </div>
+          </div>
+          <div className="flex items-center gap-3">
+            <label className="text-xs text-gray-400 shrink-0 w-20">字号 {secondaryTitleFontSize}</label>
+            <input type="range" min="30" max="100" step="1" value={secondaryTitleFontSize} onChange={(e) => onSecondaryTitleFontSizeChange(parseInt(e.target.value, 10))} className="flex-1 accent-purple-500" />
+          </div>
+          <div className="flex items-center gap-3">
+            <label className="text-xs text-gray-400 shrink-0 w-20">间距 {secondaryTitleTopMargin}</label>
+            <input type="range" min="0" max="100" step="1" value={secondaryTitleTopMargin} onChange={(e) => onSecondaryTitleTopMarginChange(parseInt(e.target.value, 10))} className="flex-1 accent-purple-500" />
+          </div>
+        </div>
+      )}
+
+      {subtitleStyles.length > 0 && (
+        <div className="mt-4 space-y-3">
+          <div className="flex items-center gap-3">
+            <label className="text-sm text-gray-300 shrink-0 w-20">字幕样式</label>
+            <div className="relative w-1/3 min-w-[100px]">
+              <select
+                value={selectedSubtitleStyleId}
+                onChange={(e) => onSelectSubtitleStyle(e.target.value)}
+                className="w-full appearance-none rounded-lg border border-white/15 bg-black/35 px-3 py-2 pr-8 text-sm text-gray-200 outline-none transition-colors hover:border-white/25 focus:border-purple-500"
+              >
                {subtitleStyles.map((style) => (
-              <button
-                key={style.id}
-                onClick={() => onSelectSubtitleStyle(style.id)}
-                className={`p-2 rounded-lg border transition-all text-left ${selectedSubtitleStyleId === style.id
-                  ? "border-purple-500 bg-purple-500/20"
-                  : "border-white/10 bg-white/5 hover:border-white/30"
-                  }`}
-              >
-                <div className="text-white text-sm truncate">{style.label}</div>
-                <div className="text-xs text-gray-400 truncate">
-                  {style.font_family || style.font_file || ""}
-                </div>
-              </button>
+                  <option key={style.id} value={style.id}>{style.label}</option>
                ))}
+              </select>
+              <ChevronDown className="pointer-events-none absolute right-2.5 top-1/2 h-3.5 w-3.5 -translate-y-1/2 text-gray-400" />
            </div>
-          <div className="mt-3">
-            <label className="text-xs text-gray-400 mb-2 block">字幕字号: {subtitleFontSize}px</label>
-            <input
-              type="range"
-              min="40"
-              max="90"
-              step="1"
-              value={subtitleFontSize}
-              onChange={(e) => onSubtitleFontSizeChange(parseInt(e.target.value, 10))}
-              className="w-full accent-purple-500"
-            />
          </div>
-          <div className="mt-3">
-            <label className="text-xs text-gray-400 mb-2 block">字幕位置: {subtitleBottomMargin}px</label>
-            <input
-              type="range"
-              min="0"
-              max="300"
-              step="1"
-              value={subtitleBottomMargin}
-              onChange={(e) => onSubtitleBottomMarginChange(parseInt(e.target.value, 10))}
-              className="w-full accent-purple-500"
-            />
+          <div className="flex items-center gap-3">
+            <label className="text-xs text-gray-400 shrink-0 w-20">字号 {subtitleFontSize}</label>
+            <input type="range" min="40" max="90" step="1" value={subtitleFontSize} onChange={(e) => onSubtitleFontSizeChange(parseInt(e.target.value, 10))} className="flex-1 accent-purple-500" />
+          </div>
+          <div className="flex items-center gap-3">
+            <label className="text-xs text-gray-400 shrink-0 w-20">位置 {subtitleBottomMargin}</label>
+            <input type="range" min="0" max="300" step="1" value={subtitleBottomMargin} onChange={(e) => onSubtitleBottomMarginChange(parseInt(e.target.value, 10))} className="flex-1 accent-purple-500" />
          </div>
        </div>
      )}
-
-      <div className="mt-4 pt-4 border-t border-white/10 flex items-center justify-between">
-        <div>
-          <span className="text-sm text-gray-300">逐字高亮字幕</span>
-          <p className="text-xs text-gray-500 mt-1">自动生成卡拉OK效果字幕</p>
-        </div>
-        <label className="relative inline-flex items-center cursor-pointer">
-          <input
-            type="checkbox"
-            checked={enableSubtitles}
-            onChange={(e) => onToggleSubtitles(e.target.checked)}
-            className="sr-only peer"
-          />
-          <div className="w-11 h-6 bg-gray-600 peer-focus:outline-none rounded-full peer peer-checked:after:translate-x-full peer-checked:after:border-white after:content-[''] after:absolute after:top-[2px] after:left-[2px] after:bg-white after:border-gray-300 after:border after:rounded-full after:h-5 after:w-5 after:transition-all peer-checked:bg-purple-600"></div>
-        </label>
-      </div>
    </div>
  );
 }
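
Reviewer note: the two counters added in this file compare `String.prototype.length` against 15 and 20. `length` counts UTF-16 code units, so a character outside the Basic Multilingual Plane (most emoji) counts as two. If the limit is meant per visible character, a code-point count may be closer to the intent. A minimal sketch, where `counterState` is a hypothetical helper and not part of this change:

```typescript
// Hypothetical helper mirroring the inline counter logic in the panel.
function counterState(text: string, limit: number): { label: string; over: boolean } {
  // Array.from iterates by Unicode code point, so an emoji counts once,
  // whereas String.prototype.length would count its two UTF-16 code units.
  const count = Array.from(text).length;
  return { label: `${count}/${limit}`, over: count > limit };
}
```

The `over` flag would drive the `text-red-400` / `text-gray-500` class switch that the diff currently computes inline.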
--- a/frontend/src/features/home/ui/VoiceSelector.tsx
+++ b/frontend/src/features/home/ui/VoiceSelector.tsx
@@ -13,6 +13,7 @@ interface VoiceSelectorProps {
  voice: string;
  onSelectVoice: (id: string) => void;
  voiceCloneSlot: ReactNode;
+  embedded?: boolean;
 }

 export function VoiceSelector({
@@ -22,32 +23,29 @@ export function VoiceSelector({
  voice,
  onSelectVoice,
  voiceCloneSlot,
+  embedded = false,
 }: VoiceSelectorProps) {
-  return (
-    <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
-      <h2 className="text-lg font-semibold text-white mb-4 flex items-center gap-2">
-        🎙️ 配音方式
-      </h2>
-
+  const content = (
+    <>
      <div className="flex gap-2 mb-4">
        <button
          onClick={() => onSelectTtsMode("edgetts")}
-          className={`flex-1 py-2 px-4 rounded-lg font-medium transition-all flex items-center justify-center gap-2 ${ttsMode === "edgetts"
+          className={`flex-1 py-2 px-2 sm:px-4 rounded-lg text-sm sm:text-base font-medium transition-all flex items-center justify-center gap-1.5 sm:gap-2 ${ttsMode === "edgetts"
            ? "bg-purple-600 text-white"
            : "bg-white/10 text-gray-300 hover:bg-white/20"
            }`}
        >
-          <Volume2 className="h-4 w-4" />
+          <Volume2 className="h-4 w-4 shrink-0" />
          选择声音
        </button>
        <button
          onClick={() => onSelectTtsMode("voiceclone")}
-          className={`flex-1 py-2 px-4 rounded-lg font-medium transition-all flex items-center justify-center gap-2 ${ttsMode === "voiceclone"
+          className={`flex-1 py-2 px-2 sm:px-4 rounded-lg text-sm sm:text-base font-medium transition-all flex items-center justify-center gap-1.5 sm:gap-2 ${ttsMode === "voiceclone"
            ? "bg-purple-600 text-white"
            : "bg-white/10 text-gray-300 hover:bg-white/20"
            }`}
        >
-          <Mic className="h-4 w-4" />
+          <Mic className="h-4 w-4 shrink-0" />
          克隆声音
        </button>
      </div>
@@ -70,6 +68,17 @@ export function VoiceSelector({
      )}

      {ttsMode === "voiceclone" && voiceCloneSlot}
+    </>
+  );
+
+  if (embedded) return content;
+
+  return (
+    <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
+      <h2 className="text-lg font-semibold text-white mb-4 flex items-center gap-2">
+        🎙️ 配音方式
+      </h2>
+      {content}
    </div>
  );
 }
--- a/frontend/src/features/home/ui/script-extraction/useScriptExtraction.ts
+++ b/frontend/src/features/home/ui/script-extraction/useScriptExtraction.ts
@@ -15,9 +15,7 @@ interface UseScriptExtractionOptions {
 export const useScriptExtraction = ({ isOpen }: UseScriptExtractionOptions) => {
    const [isLoading, setIsLoading] = useState(false);
    const [script, setScript] = useState("");
-    const [rewrittenScript, setRewrittenScript] = useState("");
    const [error, setError] = useState<string | null>(null);
-    const [doRewrite, setDoRewrite] = useState(true);
    const [step, setStep] = useState<ExtractionStep>("config");
    const [dragActive, setDragActive] = useState(false);
    const [selectedFile, setSelectedFile] = useState<File | null>(null);
@@ -29,7 +27,6 @@ export const useScriptExtraction = ({ isOpen }: UseScriptExtractionOptions) => {
        if (isOpen) {
            setStep("config");
            setScript("");
-            setRewrittenScript("");
            setError(null);
            setIsLoading(false);
            setSelectedFile(null);
@@ -100,10 +97,10 @@ export const useScriptExtraction = ({ isOpen }: UseScriptExtractionOptions) => {
            } else if (activeTab === "url") {
                formData.append("url", inputUrl.trim());
            }
-            formData.append("rewrite", doRewrite ? "true" : "false");
+            formData.append("rewrite", "false");

            const { data: res } = await api.post<
-                ApiResponse<{ original_script: string; rewritten_script?: string }>
+                ApiResponse<{ original_script: string }>
            >("/api/tools/extract-script", formData, {
                headers: { "Content-Type": "multipart/form-data" },
                timeout: 180000, // 3 minutes timeout
@@ -111,7 +108,6 @@ export const useScriptExtraction = ({ isOpen }: UseScriptExtractionOptions) => {

            const payload = unwrap(res);
            setScript(payload.original_script);
-            setRewrittenScript(payload.rewritten_script || "");
            setStep("result");
        } catch (err: unknown) {
            console.error(err);
@@ -126,7 +122,7 @@ export const useScriptExtraction = ({ isOpen }: UseScriptExtractionOptions) => {
        } finally {
            setIsLoading(false);
        }
-    }, [activeTab, selectedFile, inputUrl, doRewrite]);
+    }, [activeTab, selectedFile, inputUrl]);

    const copyToClipboard = useCallback((text: string) => {
        if (navigator.clipboard && window.isSecureContext) {
@@ -185,16 +181,13 @@ export const useScriptExtraction = ({ isOpen }: UseScriptExtractionOptions) => {
        // State
        isLoading,
        script,
-        rewrittenScript,
        error,
-        doRewrite,
        step,
        dragActive,
        selectedFile,
        activeTab,
        inputUrl,
        // Setters
-        setDoRewrite,
        setActiveTab,
        setInputUrl,
        // Handlers
--- a/frontend/src/features/publish/model/usePublishController.ts
+++ b/frontend/src/features/publish/model/usePublishController.ts
@@ -83,6 +83,8 @@ export const usePublishController = () => {
      setVideos(nextVideos);
      if (nextVideos.length > 0 && autoSelectLatest) {
        setSelectedVideo(nextVideos[0].id);
+        // Write a cross-page marker so the home page can also detect the latest generated video
+        localStorage.setItem(`vigent_${getStorageKey()}_latestGeneratedVideoId`, nextVideos[0].id);
      }
      updatePrefetch({ videos: nextVideos });
    } catch (error) {
@@ -109,17 +111,24 @@ export const usePublishController = () => {

  // ---- 视频选择恢复（唯一一个 effect，条件极简） ----
  // 等 auth 完成 + videos 有数据 → 恢复一次，之后再也不跑
+  // Check the cross-page marker (latest generated video) first, then fall back to the last selection
  useEffect(() => {
    if (isAuthLoading || videos.length === 0 || videoRestoredRef.current) return;
    videoRestoredRef.current = true;

    const key = getStorageKey();
+    const latestId = localStorage.getItem(`vigent_${key}_latestGeneratedVideoId`);
+    if (latestId && videos.some(v => v.id === latestId)) {
+      setSelectedVideo(latestId);
+      localStorage.removeItem(`vigent_${key}_latestGeneratedVideoId`);
+    } else {
      const saved = localStorage.getItem(`vigent_${key}_publish_selected_video`);
      if (saved && videos.some(v => v.id === saved)) {
        setSelectedVideo(saved);
      } else {
        setSelectedVideo(videos[0].id);
      }
+    }
  }, [isAuthLoading, videos, getStorageKey]);

  // ---- 视频选择保存 ----
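
Reviewer note: the restore effect above now has three fallbacks: the one-shot `latestGeneratedVideoId` marker, then the saved selection, then the first video. The sketch below expresses that priority as a pure function so it can be tested without React or a browser. `KeyValueStore` and `resolveInitialVideoId` are hypothetical names; the storage keys are copied from the diff.

```typescript
// Minimal storage shape so the sketch runs outside the browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  removeItem(key: string): void;
}

// Hypothetical pure version of the restore effect's decision logic.
function resolveInitialVideoId(
  store: KeyValueStore,
  storageKey: string,
  videoIds: string[],
): string | null {
  if (videoIds.length === 0) return null;

  const latestKey = `vigent_${storageKey}_latestGeneratedVideoId`;
  const latestId = store.getItem(latestKey);
  if (latestId && videoIds.some((id) => id === latestId)) {
    // One-shot marker: consume it so later visits fall back to the saved choice.
    store.removeItem(latestKey);
    return latestId;
  }

  const saved = store.getItem(`vigent_${storageKey}_publish_selected_video`);
  if (saved && videoIds.some((id) => id === saved)) return saved;

  // Nothing usable in storage: default to the first video in the list.
  return videoIds[0];
}
```

As in the diff, a marker that points at a video missing from the list is left in storage rather than removed; whether stale markers should be cleared is a design choice this sketch does not make.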
--- a/frontend/src/features/publish/ui/PublishPage.tsx
+++ b/frontend/src/features/publish/ui/PublishPage.tsx
@@ -135,7 +135,7 @@ export function PublishPage() {
          <div className="space-y-6">
            <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
              <h2 className="text-lg font-semibold text-white mb-4 flex items-center gap-2">
-                👤 平台账号
+                七、平台账号
              </h2>

              {isAccountsLoading ? (
@@ -157,30 +157,29 @@ export function PublishPage() {
                  ))}
                </div>
              ) : (
-                <div className="space-y-3">
+                <div className="space-y-2 sm:space-y-3">
                  {accounts.map((account) => (
                    <div
                      key={account.platform}
-                      className="flex items-center justify-between p-4 bg-black/30 rounded-xl"
+                      className="flex items-center gap-3 px-3 py-2.5 sm:px-4 sm:py-3.5 bg-black/30 rounded-xl"
                    >
-                      <div className="flex items-center gap-3">
                      {platformIcons[account.platform] ? (
                        <Image
                          src={platformIcons[account.platform].src}
                          alt={platformIcons[account.platform].alt}
                          width={28}
                          height={28}
-                            className="h-7 w-7"
+                          className="h-6 w-6 sm:h-7 sm:w-7 shrink-0"
                        />
                      ) : (
-                          <span className="text-2xl">🌐</span>
+                        <span className="text-xl sm:text-2xl">🌐</span>
                      )}
-                        <div>
-                          <div className="text-white font-medium">
+                      <div className="min-w-0 flex-1">
+                        <div className="text-sm sm:text-base text-white font-medium leading-tight">
                          {account.name}
                        </div>
                        <div
-                            className={`text-sm ${account.logged_in
+                          className={`text-xs sm:text-sm leading-tight ${account.logged_in
                            ? "text-green-400"
                            : "text-gray-500"
                            }`}
@@ -188,31 +187,30 @@ export function PublishPage() {
                          {account.logged_in ? "✓ 已登录" : "未登录"}
                        </div>
                      </div>
-                      </div>
-                      <div className="flex gap-2">
+                      <div className="flex items-center gap-1.5 sm:gap-2 shrink-0">
                        {account.logged_in ? (
                          <>
                            <button
                              onClick={() => handleLogin(account.platform)}
-                              className="px-3 py-1 bg-white/10 hover:bg-white/20 text-white text-sm rounded-lg transition-colors flex items-center gap-1"
+                              className="px-2 py-1 sm:px-3 sm:py-1.5 bg-white/10 hover:bg-white/20 text-white text-xs sm:text-sm rounded-md sm:rounded-lg transition-colors flex items-center gap-1"
                            >
-                              <RotateCcw className="h-3.5 w-3.5" />
+                              <RotateCcw className="h-3 w-3 sm:h-3.5 sm:w-3.5" />
                              重新登录
                            </button>
                            <button
                              onClick={() => handleLogout(account.platform)}
-                              className="px-3 py-1 bg-red-500/80 hover:bg-red-600 text-white text-sm rounded-lg transition-colors flex items-center gap-1"
+                              className="px-2 py-1 sm:px-3 sm:py-1.5 bg-red-500/80 hover:bg-red-600 text-white text-xs sm:text-sm rounded-md sm:rounded-lg transition-colors flex items-center gap-1"
                            >
-                              <LogOut className="h-3.5 w-3.5" />
+                              <LogOut className="h-3 w-3 sm:h-3.5 sm:w-3.5" />
                              注销
                            </button>
                          </>
                        ) : (
                          <button
                            onClick={() => handleLogin(account.platform)}
-                            className="px-3 py-1 bg-purple-500/80 hover:bg-purple-600 text-white text-sm rounded-lg transition-colors flex items-center gap-1"
+                            className="px-2 py-1 sm:px-3 sm:py-1.5 bg-purple-500/80 hover:bg-purple-600 text-white text-xs sm:text-sm rounded-md sm:rounded-lg transition-colors flex items-center gap-1"
                          >
-                            <QrCode className="h-3.5 w-3.5" />
+                            <QrCode className="h-3 w-3 sm:h-3.5 sm:w-3.5" />
                            登录
                          </button>
                        )}
@@ -228,7 +226,7 @@ export function PublishPage() {
          <div className="space-y-6">
            {/* 选择视频 */}
            <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
-              <h2 className="text-lg font-semibold text-white mb-4">📹 选择发布作品</h2>
+              <h2 className="text-lg font-semibold text-white mb-4">八、选择发布作品</h2>

              <div className="flex items-center gap-3 mb-4">
                <Search className="text-gray-400 w-4 h-4" />
@@ -303,7 +301,7 @@ export function PublishPage() {

            {/* 填写信息 */}
            <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
-              <h2 className="text-lg font-semibold text-white mb-4">✍️ 发布信息</h2>
+              <h2 className="text-lg font-semibold text-white mb-4">九、发布信息</h2>

              <div className="space-y-4">
                <div>
@@ -337,7 +335,7 @@ export function PublishPage() {

            {/* 选择平台 */}
            <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
-              <h2 className="text-lg font-semibold text-white mb-4">📱 选择发布平台</h2>
+              <h2 className="text-lg font-semibold text-white mb-4">十、选择发布平台</h2>

              <div className="grid grid-cols-3 gap-3">
                {accounts
--- a/frontend/src/shared/api/axios.ts
+++ b/frontend/src/shared/api/axios.ts
@@ -12,7 +12,7 @@ const API_BASE = typeof window === 'undefined'
 // Prevent duplicate redirects
 let isRedirecting = false;

-const PUBLIC_PATHS = new Set(['/login', '/register']);
+const PUBLIC_PATHS = new Set(['/login', '/register', '/pay']);

 // Create the axios instance
 const api = axios.create({
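
The hunk above only shows `/pay` joining `PUBLIC_PATHS`; how the set is consumed is not visible in this diff. A minimal sketch, assuming the set is used to skip the redirect to the login page when a request fails authentication (the helper name `shouldRedirectToLogin` is hypothetical):

```typescript
// Paths that must stay reachable without a session (values taken from the diff).
const PUBLIC_PATHS = new Set(['/login', '/register', '/pay']);

// Hypothetical helper: decide whether an unauthenticated response should
// send the browser to the login page. Assumption: public pages never redirect.
function shouldRedirectToLogin(pathname: string): boolean {
  return !PUBLIC_PATHS.has(pathname);
}

console.log(shouldRedirectToLogin('/pay'));     // false
console.log(shouldRedirectToLogin('/publish')); // true
```

`Set.has` is an exact match, so under this assumption a path such as `/pay/` (trailing slash) or `/pay/result` would not count as public unless it is normalized first.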
--- a/frontend/src/shared/contexts/AuthContext.tsx
+++ b/frontend/src/shared/contexts/AuthContext.tsx
@@ -11,6 +11,7 @@ interface AuthContextType {
  user: User | null;
  isLoading: boolean;
  isAuthenticated: boolean;
+  setUser: (user: User | null) => void;
 }

 const AuthContext = createContext<AuthContextType>({
@@ -18,6 +19,7 @@ const AuthContext = createContext<AuthContextType>({
  user: null,
  isLoading: true,
  isAuthenticated: false,
+  setUser: () => {},
 });

 export function AuthProvider({ children }: { children: ReactNode }) {
@@ -63,7 +65,8 @@ export function AuthProvider({ children }: { children: ReactNode }) {
      userId: user?.id || null,
      user,
      isLoading,
-      isAuthenticated: !!user
+      isAuthenticated: !!user,
+      setUser,
    }}>
      {children}
    </AuthContext.Provider>
--- a/frontend/src/shared/lib/auth.ts
+++ b/frontend/src/shared/lib/auth.ts
@@ -12,6 +12,7 @@ export interface AuthResponse {
    success: boolean;
    message: string;
    user?: User;
+    paymentToken?: string;
 }

 interface ApiResponse<T> {
@@ -25,20 +26,41 @@ interface ApiResponse<T> {
 * User registration
 */
 export async function register(phone: string, password: string, username?: string): Promise<AuthResponse> {
+    try {
        const { data: payload } = await api.post<ApiResponse<null>>('/api/auth/register', {
            phone, password, username
        });
        return { success: payload.success, message: payload.message };
+    } catch (err: any) {
+        return {
+            success: false,
+            message: err.response?.data?.message || '注册失败',
+        };
+    }
 }

 /**
 * User login
 */
 export async function login(phone: string, password: string): Promise<AuthResponse> {
+    try {
        const { data: payload } = await api.post<ApiResponse<{ user?: User }>>('/api/auth/login', {
            phone, password
        });
        return { success: payload.success, message: payload.message, user: payload.data?.user };
+    } catch (err: any) {
+        if (err.response?.status === 403 && err.response?.data?.data?.reason === 'PAYMENT_REQUIRED') {
+            return {
+                success: false,
+                message: err.response.data.message,
+                paymentToken: err.response.data.data.payment_token,
+            };
+        }
+        return {
+            success: false,
+            message: err.response?.data?.message || '登录失败',
+        };
+    }
 }

 /**
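
The new `catch` block in `login()` maps a `403` response whose body carries `reason === 'PAYMENT_REQUIRED'` to a failure result that also exposes `paymentToken`. The sketch below isolates that mapping as a pure function so it can be exercised without a network call. The names `toLoginFailure`, `AuthFailure`, and `HttpErrorLike` are hypothetical, and the error shape is an assumption based only on the fields the diff reads.

```typescript
// Failure half of AuthResponse as used by the catch block (hypothetical name).
interface AuthFailure {
  success: false;
  message: string;
  paymentToken?: string;
}

// Assumed minimal shape of the thrown error: only the fields the diff reads.
interface HttpErrorLike {
  response?: {
    status?: number;
    data?: {
      message?: string;
      data?: { reason?: string; payment_token?: string };
    };
  };
}

// Mirrors the branching in the diff: payment-required first, generic fallback second.
function toLoginFailure(err: HttpErrorLike): AuthFailure {
  const body = err.response?.data;
  if (err.response?.status === 403 && body?.data?.reason === 'PAYMENT_REQUIRED') {
    return {
      success: false,
      message: body?.message ?? '登录失败',
      paymentToken: body?.data?.payment_token,
    };
  }
  return { success: false, message: body?.message || '登录失败' };
}

// Usage: a caller can branch on paymentToken to route the user to the payment page.
const result = toLoginFailure({
  response: {
    status: 403,
    data: { message: 'payment required', data: { reason: 'PAYMENT_REQUIRED', payment_token: 'tok_123' } },
  },
});
console.log(result.paymentToken); // "tok_123"
```

One difference from the diff: the sketch falls back to the generic message if the `403` body has no `message`, whereas the original reads `err.response.data.message` directly.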
--- a/frontend/src/shared/lib/title.ts
+++ b/frontend/src/shared/lib/title.ts
@@ -1,8 +1,12 @@
 export const TITLE_MAX_LENGTH = 15;
+export const SECONDARY_TITLE_MAX_LENGTH = 20;

 export const clampTitle = (value: string, maxLength: number = TITLE_MAX_LENGTH) =>
  value.slice(0, maxLength);

+export const clampSecondaryTitle = (value: string, maxLength: number = SECONDARY_TITLE_MAX_LENGTH) =>
+  value.slice(0, maxLength);
+
 export const applyTitleLimit = (
  prev: string,
  next: string,
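
Both clamp helpers rely on `String.prototype.slice`, which counts UTF-16 code units. A short sketch of the resulting behavior follows; the constants and the two helpers are copied from the diff, while `clampByCodePoint` is a hypothetical alternative and not part of the change.

```typescript
const TITLE_MAX_LENGTH = 15;
const SECONDARY_TITLE_MAX_LENGTH = 20;

// Copied from the diff.
const clampTitle = (value: string, maxLength: number = TITLE_MAX_LENGTH) =>
  value.slice(0, maxLength);
const clampSecondaryTitle = (value: string, maxLength: number = SECONDARY_TITLE_MAX_LENGTH) =>
  value.slice(0, maxLength);

// Hypothetical variant: counts code points instead of UTF-16 code units,
// so a single emoji is never cut in half.
const clampByCodePoint = (value: string, maxLength: number) =>
  Array.from(value).slice(0, maxLength).join('');

const long = 'a'.repeat(30);
console.log(clampTitle(long).length);          // 15
console.log(clampSecondaryTitle(long).length); // 20
console.log(clampTitle('😀', 1).length);       // 1 (a lone surrogate half)
console.log(clampByCodePoint('😀😀', 1));      // "😀"
```

Whether the code-point variant is worth adopting depends on whether titles are expected to contain characters outside the Basic Multilingual Plane; Chinese text in the common range is unaffected.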
--- a/frontend/src/shared/types/material.ts
+++ b/frontend/src/shared/types/material.ts
@@ -0,0 +1,8 @@
+export interface Material {
+  id: string;
+  name: string;
+  path: string;
+  size_mb: number;
+  scene?: string;
+  duration_sec?: number;
+}
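
The `Material` interface is compile-time only, so a response from the backend is not checked against it at runtime. A minimal sketch of a runtime guard, assuming materials arrive as untyped JSON (the function name `isMaterial` is hypothetical and does not appear in the diff):

```typescript
// Copied from the diff.
interface Material {
  id: string;
  name: string;
  path: string;
  size_mb: number;
  scene?: string;
  duration_sec?: number;
}

// Hypothetical guard: checks required fields and, when present, optional ones.
function isMaterial(value: unknown): value is Material {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === 'string' &&
    typeof v.name === 'string' &&
    typeof v.path === 'string' &&
    typeof v.size_mb === 'number' &&
    (v.scene === undefined || typeof v.scene === 'string') &&
    (v.duration_sec === undefined || typeof v.duration_sec === 'number')
  );
}

console.log(isMaterial({ id: '1', name: 'clip', path: '/m/clip.mp4', size_mb: 12.5 })); // true
console.log(isMaterial({ id: '1', name: 'clip' }));                                     // false
```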
--- a/models/CosyVoice/CODE_OF_CONDUCT.md
+++ b/models/CosyVoice/CODE_OF_CONDUCT.md
@@ -0,0 +1,76 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+In the interest of fostering an open and welcoming environment, we as
+contributors and maintainers pledge to making participation in our project and
+our community a harassment-free experience for everyone, regardless of age, body
+size, disability, ethnicity, sex characteristics, gender identity and expression,
+level of experience, education, socio-economic status, nationality, personal
+appearance, race, religion, or sexual identity and orientation.
+
+## Our Standards
+
+Examples of behavior that contributes to creating a positive environment
+include:
+
+* Using welcoming and inclusive language
+* Being respectful of differing viewpoints and experiences
+* Gracefully accepting constructive criticism
+* Focusing on what is best for the community
+* Showing empathy towards other community members
+
+Examples of unacceptable behavior by participants include:
+
+* The use of sexualized language or imagery and unwelcome sexual attention or
+ advances
+* Trolling, insulting/derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or electronic
+ address, without explicit permission
+* Other conduct which could reasonably be considered inappropriate in a
+ professional setting
+
+## Our Responsibilities
+
+Project maintainers are responsible for clarifying the standards of acceptable
+behavior and are expected to take appropriate and fair corrective action in
+response to any instances of unacceptable behavior.
+
+Project maintainers have the right and responsibility to remove, edit, or
+reject comments, commits, code, wiki edits, issues, and other contributions
+that are not aligned to this Code of Conduct, or to ban temporarily or
+permanently any contributor for other behaviors that they deem inappropriate,
+threatening, offensive, or harmful.
+
+## Scope
+
+This Code of Conduct applies both within project spaces and in public spaces
+when an individual is representing the project or its community. Examples of
+representing a project or community include using an official project e-mail
+address, posting via an official social media account, or acting as an appointed
+representative at an online or offline event. Representation of a project may be
+further defined and clarified by project maintainers.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported by contacting the project team at mikelei@mobvoi.com. All
+complaints will be reviewed and investigated and will result in a response that
+is deemed necessary and appropriate to the circumstances. The project team is
+obligated to maintain confidentiality with regard to the reporter of an incident.
+Further details of specific enforcement policies may be posted separately.
+
+Project maintainers who do not follow or enforce the Code of Conduct in good
+faith may face temporary or permanent repercussions as determined by other
+members of the project's leadership.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
+available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
+
+[homepage]: https://www.contributor-covenant.org
+
+For answers to common questions about this code of conduct, see
+https://www.contributor-covenant.org/faq
Author	SHA1	Message	Date
Kevin Wong	0e3502c6f0	更新	2026-02-27 16:11:34 +08:00
Kevin Wong	a1604979f0	更新	2026-02-26 11:13:03 +08:00
Kevin Wong	08221e48de	更新	2026-02-26 10:49:22 +08:00
Kevin Wong	42b5cc0c02	更新	2026-02-26 10:14:41 +08:00
Kevin Wong	1717635bfd	更新	2026-02-25 17:51:58 +08:00
Kevin Wong	0a5a17402c	更新	2026-02-24 16:55:29 +08:00
Kevin Wong	bc0fe9326a	更新	2026-02-11 17:48:38 +08:00
Kevin Wong	035ee29d72	更新	2026-02-11 14:33:05 +08:00
Kevin Wong	a6cc919e5c	更新	2026-02-11 13:57:41 +08:00
Kevin Wong	96a298e51c	更新	2026-02-11 13:48:45 +08:00
Kevin Wong	e33dfc3031	更新	2026-02-10 13:31:29 +08:00
Kevin Wong	3129d45b25	更新	2026-02-09 14:47:19 +08:00