Update

2026-03-10 10:59:38 +08:00 · 2026-03-09 10:18:14 +08:00 · 2026-03-05 17:23:22 +08:00 · 2026-03-04 17:35:59 +08:00 · 2026-03-04 14:07:54 +08:00 · 2026-03-03 15:16:38 +08:00
75 changed files with 11545 additions and 2711 deletions
--- a/Docs/BACKEND_DEV.md
+++ b/Docs/BACKEND_DEV.md
@@ -2,6 +2,12 @@

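 This document defines the structural conventions, interface contracts, and implementation habits for backend development. The goal is for new features to follow one consistent pattern and for legacy logic to be extracted gradually as it is fixed.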

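+## Document Scope
+
+- This document defines only backend development conventions and engineering constraints (layer responsibilities, contracts, workflows, coding habits).
+- For interface descriptions, deployment, runtime, and environment configuration examples, see `Docs/BACKEND_README.md`.
+- Record historical changes in `Docs/DevLogs/` and `Docs/TASK_COMPLETE.md`; do not add them to this conventions document.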
+
 ---

 ## 1. 模块化与分层原则
@@ -43,7 +49,7 @@ backend/
 │   │   └── admin/           # 管理员功能
 │   ├── repositories/        # Supabase 数据访问
 │   ├── services/            # 外部服务集成
-│   │   ├── uploader/        # platform publishers (douyin/weixin)
+│   │   ├── uploader/        # platform publishers (douyin/weixin/xiaohongshu/bilibili)
 │   │   ├── qr_login_service.py
 │   │   ├── publish_service.py
 │   │   ├── remotion_service.py
@@ -80,13 +86,23 @@ backend/
 - `custom_assignments` 每项使用 `material_path/start/end/source_start/source_end?`，并以时间轴可见段为准。
 - `output_aspect_ratio` 仅允许 `9:16` / `16:9`，默认 `9:16`。
 - 标题显示模式参数：
-  - `title_display_mode`: `short` / `persistent` (default `short`)
+  - `title_display_mode`: `short` / `persistent` (default `short`; applies to both the main title and the secondary title)
  - `title_duration`: 默认 `4.0`（秒），仅 `short` 模式生效
 - 片头副标题参数：
  - `secondary_title`: 副标题文字（可选，限 20 字），仅在视频画面中显示，不参与发布标题
  - `secondary_title_style_id` / `secondary_title_font_size` / `secondary_title_top_margin`: 副标题样式配置
 - workflow/remotion 侧需保持字段透传一致，避免前后端语义漂移。

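+### `/api/videos/cleanup` behavior contract
+
+- Cleans up only the current user's generated artifacts in Storage:
+  - `outputs` bucket (generated videos)
+  - `generated-audios` bucket (pre-generated voiceovers, `.wav/.json`)
+- The cleanup endpoint uses strict success semantics:
+  - It returns success only when every deletion succeeds
+  - If any deletion fails, it returns an error; the frontend should keep the cleanup dialog open and allow a retry
+- Download endpoint contract: `GET /api/videos/generated/{video_id}/download` must return `Content-Disposition: attachment` so the frontend can offer one-click download and the browser does not switch to inline playback.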
+
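The strict success semantics above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `strict_cleanup`, `CleanupError`, and the injected `delete_file` callable are hypothetical names, and the only behavior assumed from the text is that a deletion failure raises instead of being swallowed.

```python
from typing import Callable, Iterable


class CleanupError(Exception):
    """Raised when at least one deletion failed (hypothetical name)."""

    def __init__(self, failed: list[str]):
        super().__init__(f"{len(failed)} file(s) could not be deleted")
        self.failed = failed


def strict_cleanup(paths: Iterable[str], delete_file: Callable[[str], None]) -> int:
    """Delete every path; report success only if all deletions succeed."""
    deleted = 0
    failed: list[str] = []
    for path in paths:
        try:
            delete_file(path)  # assumed to raise on failure, never swallow errors
            deleted += 1
        except Exception:
            failed.append(path)  # keep going so a single retry can cover the rest
    if failed:
        raise CleanupError(failed)
    return deleted
```

A route handler could catch `CleanupError` and map it to an error response, which is what lets the frontend keep the dialog open and retry.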
 ---

 ## 4. 认证与权限
@@ -94,6 +110,8 @@ backend/
 - 认证方式：**HttpOnly Cookie** (`access_token`)。
 - `get_current_user` / `get_current_user_optional` 位于 `core/deps.py`。
 - Session 单设备校验使用 `repositories/sessions.py`。
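+- High-cost endpoints such as AI/Tools must enforce authentication (`Depends(get_current_user)`); anonymous calls that consume external API quota are forbidden.
+- Production requires `DEBUG=false` plus a non-default `JWT_SECRET_KEY`; in production mode a default key must block service startup.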
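
The startup rule for `JWT_SECRET_KEY` can be expressed as a small guard that runs before the app starts serving. This is a sketch under stated assumptions: `DEFAULT_JWT_SECRET` is a placeholder for whatever default the real config ships, the function name is hypothetical, and rejecting an empty key is an extra assumption not stated in the text.

```python
# Placeholder for the default value shipped in the config (hypothetical).
DEFAULT_JWT_SECRET = "change-me"


def assert_secure_settings(debug: bool, jwt_secret_key: str) -> None:
    """Refuse to start in production mode with a default or empty JWT secret."""
    if debug:
        return  # development mode: the default key is tolerated
    if not jwt_secret_key or jwt_secret_key == DEFAULT_JWT_SECRET:
        raise RuntimeError(
            "JWT_SECRET_KEY must be set to a non-default value when DEBUG=false"
        )
```

Calling a check like this at import or startup time makes a misconfigured deployment fail loudly instead of running with a guessable signing key.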

 ---

@@ -109,6 +127,16 @@ backend/

 - 所有文件上传/下载/删除/移动通过 `services/storage.py`。
 - 需要重命名时使用 `move_file`，避免直接读写 Storage。
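+- `delete_file` must propagate exceptions upward; silently swallowing errors is not allowed (this prevents "false success" in the cleanup endpoint).
+- `list_files` is fault-tolerant by default and returns an empty list; strongly consistent scenarios such as cleanup should use `strict=True`.
+- All user-supplied file paths and IDs must be defensively validated:
+  - `material_id` rejects `..` sequences to prevent path traversal
+  - Resource IDs such as `video_id` use an allowlist (e.g. `^[A-Za-z0-9_-]+$`)
+- Upload and download paths must enforce size limits:
+  - Material uploads follow `MAX_UPLOAD_SIZE_MB`
+  - Reference audio is capped at 5MB
+  - For the script extraction tool, both file uploads and URL download results are capped at 500MB
+- Errors returned to the frontend use generic messages by default; internal stack traces go only to server logs, to avoid leaking paths or implementation details.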
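
The ID rules above translate directly into two small validators. This is a minimal sketch: the function names are hypothetical, and raising `ValueError` is an assumption (a router would presumably map it to a 400 response).

```python
import re

# Allowlist pattern taken from the rule above.
_RESOURCE_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")


def validate_video_id(video_id: str) -> str:
    """Accept only IDs made of letters, digits, underscore, and hyphen."""
    if not _RESOURCE_ID_RE.fullmatch(video_id):
        raise ValueError("invalid video_id")
    return video_id


def validate_material_id(material_id: str) -> str:
    """Reject any ID containing a '..' sequence to block path traversal."""
    if ".." in material_id:
        raise ValueError("invalid material_id")
    return material_id
```

`fullmatch` is used instead of `match` so that a trailing newline cannot slip past the `$` anchor.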

 ### Cookie 存储（用户隔离）

@@ -133,6 +161,8 @@ backend/user_data/{user_uuid}/cookies/
 - 业务逻辑写在 service/workflow。
 - 数据库访问写在 repositories。
 - 统一使用 `loguru` 打日志。
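+- Funnel all GLM SDK calls through `services/glm_service.py` (via its unified entry methods) to avoid re-assembling `chat.completions.create` calls inside individual modules.
+- For scraping calls in the script deep-learning feature, the router should pass `current_user.id` through to `creator_scraper` so the user's Cookie context is reused and `analysis_id` stays isolated per user.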

 ---

@@ -162,7 +192,16 @@ backend/user_data/{user_uuid}/cookies/
 - `MUSETALK_BATCH_SIZE` (推理批大小，默认 32)
 - `MUSETALK_VERSION` (v15)
 - `MUSETALK_USE_FLOAT16` (半精度，默认 true)
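-- `LIPSYNC_DURATION_THRESHOLD` (seconds; MuseTalk is used at or above this value; default 120)
+- `LIPSYNC_DURATION_THRESHOLD` (seconds; MuseTalk is used at or above this value; the code default is 120, this repository's current `.env` sets 100)
+
+### Small-face lip-sync quality compensation (local lip-sync path)
+- `LIPSYNC_SMALL_FACE_ENHANCE` (master switch, default false)
+- `LIPSYNC_SMALL_FACE_THRESHOLD` (trigger threshold, default 256)
+- `LIPSYNC_SMALL_FACE_UPSCALER` (`gfpgan` / `codeformer`)
+- `LIPSYNC_SMALL_FACE_GPU_ID` (super-resolution GPU, default 0)
+- `LIPSYNC_SMALL_FACE_FAIL_OPEN` (fall back on failure, default true)
+
+> For deployment and verification details, see `Docs/FACEENHANCE_DEPLOY.md`.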

 ### 微信视频号
 - `WEIXIN_HEADLESS_MODE` (headful/headless-new)
@@ -179,6 +218,14 @@ backend/user_data/{user_uuid}/cookies/
 - `DOUYIN_FORCE_SWIFTSHADER`
 - `DOUYIN_DEBUG_ARTIFACTS` / `DOUYIN_RECORD_VIDEO` / `DOUYIN_KEEP_SUCCESS_VIDEO`

+### Xiaohongshu
+- `XIAOHONGSHU_HEADLESS_MODE` (headful/headless-new, default headless-new)
+- `XIAOHONGSHU_CHROME_PATH` / `XIAOHONGSHU_BROWSER_CHANNEL`
+- `XIAOHONGSHU_USER_AGENT`
+- `XIAOHONGSHU_LOCALE` / `XIAOHONGSHU_TIMEZONE_ID`
+- `XIAOHONGSHU_FORCE_SWIFTSHADER`
+- `XIAOHONGSHU_DEBUG_ARTIFACTS`
+
 ### 支付宝
 - `ALIPAY_APP_ID` / `ALIPAY_PRIVATE_KEY_PATH` / `ALIPAY_PUBLIC_KEY_PATH`
 - `ALIPAY_NOTIFY_URL` / `ALIPAY_RETURN_URL`
@@ -191,8 +238,9 @@ backend/user_data/{user_uuid}/cookies/
 ## 10. Playwright 发布调试

 - 诊断日志落盘：`backend/app/debug_screenshots/weixin_network.log` / `douyin_network.log`
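-- Key failure screenshots: `backend/app/debug_screenshots/weixin_*.png` / `douyin_*.png`
+- Key failure screenshots: `backend/app/debug_screenshots/weixin_*.png` / `douyin_*.png` / `xiaohongshu_*.png`
 - For WeChat Channels, headful + xvfb-run is recommended (avoids headless decoding and fingerprinting issues)
+- Publishing-specific implementation details (login flow, success detection, troubleshooting) are maintained in one place, `Docs/PUBLISH_DEPLOY.md`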

 ---

--- a/Docs/BACKEND_README.md
+++ b/Docs/BACKEND_README.md
@@ -1,6 +1,12 @@
 # ViGent2 后端开发指南

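-This document provides a backend architecture overview and interface specifications. For development conventions and layering rules, see `Docs/BACKEND_DEV.md`.
+This document provides a backend architecture overview, interface descriptions, and runtime configuration.
+
+## 📌 Document Scope
+
+- This document describes backend service capabilities, interfaces, and how to deploy and run the service (for users and integration work).
+- For development conventions, layering constraints, and implementation habits, see `Docs/BACKEND_DEV.md`.
+- For historical changes and milestones, see `Docs/DevLogs/` and `Docs/TASK_COMPLETE.md`.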

 ---

@@ -8,7 +14,7 @@

 后端采用 **FastAPI** 框架，基于 Python 3.10+ 构建，主要负责业务逻辑处理、AI 任务调度以及与各微服务组件的交互。

-### Directory structure
+### Directory structure (overview)

 ```
 backend/
@@ -36,6 +42,8 @@ backend/
 └── requirements.txt      # 依赖清单
 ```

+> For detailed layer responsibilities (router/service/workflow/repositories) and development constraints, see `Docs/BACKEND_DEV.md`.
+
 ---

 ## 🔌 API 接口规范
@@ -56,24 +64,32 @@ backend/

 2.  **视频生成 (Videos)**
    *   `POST /api/videos/generate`: 提交生成任务
+    *   `GET/POST /api/videos/voice-preview`: Generate a short voice preview clip (returns a binary audio stream)
+    *   `POST /api/videos/cleanup`: Clean up the current user's generated workspace artifacts (outputs + generated-audios)
    *   `GET /api/videos/tasks/{task_id}`: 查询单个任务状态
    *   `GET /api/videos/tasks`: 获取用户所有任务列表
    *   `GET /api/videos/generated`: 获取历史视频列表
+    *   `GET /api/videos/generated/{video_id}/download`: Download a previously generated video (`Content-Disposition: attachment`)
    *   `DELETE /api/videos/generated/{video_id}`: 删除历史视频

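+> `POST /api/videos/cleanup` uses strict success semantics: it returns success only when every target file is deleted; if any deletion fails, it returns an error and prompts a retry.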
+
 3.  **素材管理 (Materials)**
    *   `POST /api/materials`: 上传素材
    *   `GET /api/materials`: 获取素材列表
    *   `PUT /api/materials/{material_id}`: 重命名素材
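-    *   `GET /api/materials/stream/{material_id}`: Streams the material file from the same origin (used for frontend canvas frame capture; avoids cross-origin CORS taint)
+    *   `GET /api/materials/stream/{material_id}`: Streams the material file from the same origin (used for frontend canvas frame capture; avoids cross-origin CORS taint; the server rejects `..` paths)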

 4.  **社交发布 (Publish)**
    *   `POST /api/publish`: 发布视频到 抖音/微信视频号/B站/小红书
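-    *   `POST /api/publish/login`: Log in to a platform by QR code
-    *   `GET /api/publish/login/status`: Query login status (including the face-verification QR code)
+    *   `POST /api/publish/login/{platform}`: Fetch the platform QR code and start QR login
+    *   `GET /api/publish/login/status/{platform}`: Poll login status (including the Douyin face-verification QR code)
+    *   `POST /api/publish/logout/{platform}`: Log out of the platform (deletes Cookies)
+    *   `POST /api/publish/cookies/save/{platform}`: Save Cookies extracted by the client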
    *   `GET /api/publish/accounts`: 获取已登录账号列表
+    *   `GET /api/publish/screenshot/{filename}`: Fetch the publish-success screenshot (login required)

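-> Tip: for WeChat Channels/Douyin publishing, run the backend with headful + xvfb-run.
+> Tip: for WeChat Channels/Douyin publishing, run the backend with headful + xvfb-run. For publishing-specific implementation and deployment notes, see `Docs/PUBLISH_DEPLOY.md`.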

 5.  **资源库 (Assets)**
    *   `GET /api/assets/subtitle-styles`: 字幕样式列表
@@ -88,8 +104,9 @@ backend/
    *   `POST /api/ref-audios/{id}/retranscribe`: 重新识别参考音频文字（Whisper 转写 + 超 10s 自动截取）

 7.  **AI 功能 (AI)**
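-    *   `POST /api/ai/generate-meta`: AI-generated titles and tags
-    *   `POST /api/ai/translate`: AI multilingual translation (9 target languages supported)
+    *   `POST /api/ai/generate-meta`: AI-generated titles and tags (login required)
+    *   `POST /api/ai/translate`: AI multilingual translation (9 target languages supported; login required)
+    *   `POST /api/ai/rewrite`: AI script rewriting (login required)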

 8.  **预生成配音 (Generated Audios)**
    *   `POST /api/generated-audios/generate`: 异步生成配音（返回 task_id）
@@ -99,11 +116,20 @@ backend/
    *   `PUT /api/generated-audios/{audio_id}`: 重命名配音

 9.  **工具 (Tools)**
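-    *   `POST /api/tools/extract-script`: Extract the script from a video link
+    *   `POST /api/tools/extract-script`: Extract the script from a video link (login required)
+    *   `POST /api/tools/analyze-creator`: Analyze a creator's titles and return trending topics (login required)
+    *   `POST /api/tools/generate-topic-script`: Generate a script from the selected topic (login required)
+
+> Notes on script deep learning:
+> - Supported platforms: Douyin / Bilibili creator profile links.
+> - Scraping strategy: titles are currently scraped through a single Playwright main path (Douyin/Bilibili), using the user's logged-in Cookie context to improve the success rate.
+> - `analysis_id` is bound to `user_id` and has a TTL (default 20 minutes); it lets the later "generate script" stage read the title context safely.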
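
The `analysis_id` binding described above (owner check plus TTL) can be sketched with an in-memory store. Everything here is illustrative: `AnalysisStore` and its methods are hypothetical names, the storage backend of the real service is not stated in the source, and the injectable clock exists only to make expiry easy to exercise.

```python
import time
import uuid
from typing import Callable, Optional

ANALYSIS_TTL_SECONDS = 20 * 60  # default TTL stated above


class AnalysisStore:
    """In-memory store binding each analysis_id to one user_id (illustrative only)."""

    def __init__(self, ttl: float = ANALYSIS_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._items: dict[str, tuple[str, float, list[str]]] = {}

    def put(self, user_id: str, titles: list[str]) -> str:
        analysis_id = uuid.uuid4().hex
        self._items[analysis_id] = (user_id, self._clock() + self._ttl, titles)
        return analysis_id

    def get(self, analysis_id: str, user_id: str) -> Optional[list[str]]:
        entry = self._items.get(analysis_id)
        if entry is None:
            return None
        owner, expires_at, titles = entry
        if self._clock() >= expires_at:
            del self._items[analysis_id]  # expired: drop it and report a miss
            return None
        if owner != user_id:
            return None  # never reveal another user's analysis
        return titles
```

Returning the same "miss" for expired, unknown, and foreign IDs avoids telling a caller whether someone else's `analysis_id` exists.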

 10. **健康检查**
-    *   `GET /api/lipsync/health`: Lip-sync service health (includes LatentSync + MuseTalk + hybrid routing threshold)
-    *   `GET /api/voiceclone/health`: CosyVoice 3.0 service health
+    *   `GET /api/videos/lipsync/health`: Lip-sync service health (includes LatentSync + MuseTalk + hybrid routing threshold + `data.small_face_enhance`)
+    *   `GET /api/videos/voiceclone/health`: CosyVoice 3.0 service health
+
+> Health fields for the small-face lip-sync quality compensation path: `data.small_face_enhance.enabled` (master switch), `threshold` (trigger threshold), `detector_loaded` (whether SCRFD has been lazily loaded).

 11. **支付 (Payment)**
    *   `POST /api/payment/create-order`: 创建支付宝电脑网站支付订单（需 payment_token）
@@ -112,6 +138,16 @@ backend/

 > 登录时若账号未激活或已过期，返回 403 + `payment_token`，前端跳转 `/pay` 页面完成付费。详见 [支付宝部署指南](ALIPAY_DEPLOY.md)。

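+### Security baseline (production)
+
+- `DEBUG` must be `false`: the auth Cookie then carries `Secure` and is sent only over HTTPS.
+- `JWT_SECRET_KEY` must be a strong random value and must not be the default; when `DEBUG=false` and the key is still the default, the backend refuses to start.
+- Upload size limits:
+  - `POST /api/materials`: limited by `MAX_UPLOAD_SIZE_MB` (default 500MB)
+  - `POST /api/ref-audios`: 5MB
+  - `POST /api/tools/extract-script`: both file uploads and URL download results are limited to 500MB
+- `video_id` is validated against an allowlist (`^[A-Za-z0-9_-]+$`) in the download and delete endpoints; invalid values return 400 immediately.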
+
 ### 统一响应结构

 ```json
@@ -138,9 +174,13 @@ backend/
 - `speed`: 语速（声音克隆模式，默认 1.0，范围 0.8-1.2）
 - `custom_assignments`: 自定义素材分配数组（每项含 `material_path` / `start` / `end` / `source_start` / `source_end?`），存在时优先按时间轴可见段生成
 - `output_aspect_ratio`: 输出画面比例（`9:16` 或 `16:9`，默认 `9:16`）
-- `language`: TTS language (auto-detected by default; passed through to CosyVoice 3.0 for voice cloning)
+- `lipsync_model`: lip-sync model routing mode (`default` / `fast` / `advanced`)
+  - `default`: threshold routing (`LIPSYNC_DURATION_THRESHOLD`)
+  - `fast`: force MuseTalk; fall back to LatentSync if it is unavailable
+  - `advanced`: force LatentSync
+- `language`: TTS locale (default `zh-CN`; mapped to Whisper's `zh/en/...` and CosyVoice's `Chinese/English/Auto`)
 - `title`: opening title text
-- `title_display_mode`: title display mode (`short` / `persistent`, default `short`)
+- `title_display_mode`: title display mode (`short` / `persistent`, default `short`; the mode applies to both the main title and the secondary title)
 - `title_duration`: 标题显示时长（秒，默认 `4.0`；`short` 模式生效）
 - `subtitle_style_id`: 字幕样式 ID
 - `title_style_id`: 标题样式 ID
@@ -161,7 +201,7 @@ backend/
 - 多素材片段在拼接前统一重编码，并强制 `25fps + CFR`，减少段边界时间基不一致导致的画面卡顿。
 - concat 流程启用 `+genpts` 重建时间戳，提升拼接后时间轴连续性。
 - 对带旋转元数据的 MOV 素材会先做方向归一化，再进入分辨率判断和后续流程。
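-- The compose stage (merging the video and audio tracks) uses `-c:v copy` stream copy instead of re-encoding and finishes almost instantly.
+- The compose stage (merging the video and audio tracks) uses `-c:v copy` stream copy **when the video does not need looping**; it re-encodes only when looping is required.
 - FFmpeg subprocesses are protected by timeouts: `_run_ffmpeg()` 600 seconds, `_get_duration()` 30 seconds, so a malformed file cannot hang the process forever.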
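
The timeout protection can be illustrated with the standard library alone. This is a generic sketch, not the body of `_run_ffmpeg()`: `run_with_timeout` is a hypothetical helper, and wrapping the timeout in `RuntimeError` is an assumption about how a caller might want to surface it.

```python
import subprocess
from typing import Sequence


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> subprocess.CompletedProcess:
    """Run a child process and fail if it exceeds timeout_s seconds."""
    try:
        return subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s, check=True
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills and reaps the child before re-raising TimeoutExpired
        raise RuntimeError(f"command timed out after {timeout_s}s: {cmd[0]}") from exc
```

With the values quoted above, an FFmpeg invocation would pass `timeout_s=600` and a duration probe `timeout_s=30`.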

 ### 全局并发控制
@@ -203,14 +243,14 @@ pip install -r requirements.txt

 ### 3. 环境变量配置

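-Copy `.env.example` to `.env` and configure the required keys:
+This repository currently uses `backend/.env` as the baseline runtime configuration; replace sensitive values for your environment and check the key items below (do not commit real secrets in production):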

 ```ini
 # Supabase
 SUPABASE_URL=http://localhost:8008
 SUPABASE_KEY=your_service_role_key

-# GLM API (used for AI title generation)
+# GLM API (used for AI titles/rewriting/translation/script deep learning)
 GLM_API_KEY=your_glm_api_key

 # LatentSync 配置
@@ -220,9 +260,24 @@ LATENTSYNC_GPU_ID=1
 MUSETALK_GPU_ID=0
 MUSETALK_API_URL=http://localhost:8011
 MUSETALK_BATCH_SIZE=32
-LIPSYNC_DURATION_THRESHOLD=120
+LIPSYNC_DURATION_THRESHOLD=100
+
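+# Small-face lip-sync quality compensation (off by default; a gradual rollout is recommended)
+LIPSYNC_SMALL_FACE_ENHANCE=false
+LIPSYNC_SMALL_FACE_THRESHOLD=256
+LIPSYNC_SMALL_FACE_UPSCALER=gfpgan
+LIPSYNC_SMALL_FACE_GPU_ID=0
+LIPSYNC_SMALL_FACE_FAIL_OPEN=true
+
+# MuseTalk tunable parameters (example)
+MUSETALK_DETECT_EVERY=2
+MUSETALK_BLEND_CACHE_EVERY=2
+MUSETALK_ENCODE_CRF=14
+MUSETALK_ENCODE_PRESET=slow
 ```

+> For deployment, weights, and rollback of the small-face lip-sync quality compensation path, see `Docs/FACEENHANCE_DEPLOY.md` (wired into the local `_local_generate()` path only; remote mode is not yet supported).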
+
 ### 4. 启动服务

 **开发模式 (热重载)**:
@@ -232,51 +287,11 @@ uvicorn app.main:app --host 0.0.0.0 --port 8006 --reload

 ---

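-## 🧩 Service Integration Guide
+## 🧩 Development Conventions and Testing

-### Integrating a new model
-
-To integrate a new AI model (for example, a new TTS engine):
-
-1.  Create a new Service class under `app/services/` (e.g. `NewTTSService`).
-2.  Implement a `generate` method; it may invoke a subprocess or make an HTTP request.
-3.  **Important**: if the model uses the GPU, guard it with `asyncio.Lock` for concurrency control to prevent OOM.
-4.  Create the corresponding module under `app/modules/`, add router/service/schemas, and register the route in `main.py`.
-
-### Hybrid lip-sync routing
-
-`lipsync_service.py` implements hybrid LatentSync + MuseTalk routing:
--- Short videos (<`LIPSYNC_DURATION_THRESHOLD`s) → LatentSync 1.6 (GPU1, port 8007)
--- Long videos (>= threshold) → MuseTalk 1.5 (GPU0, port 8011)
--- Automatically falls back to LatentSync when MuseTalk is unavailable
--- The routing logic is fully transparent to the workflow
-
-### Adding scheduled tasks
-
-**APScheduler** or **Crontab** is currently recommended for managing scheduled tasks.
-Scheduled publishing to social media currently relies on delayed execution in `playwright`; a migration to a Celery queue is planned.
-
-----
-
-## 🛡️ Error handling
-
-`Loguru` is used for logging across the whole project.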
-
-```python
-from loguru import logger
-
-try:
-    # 业务逻辑
-except Exception as e:
-    logger.error(f"操作失败: {str(e)}")
-    raise HTTPException(status_code=500, detail="服务器内部错误")
-```
-
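-----
-
-## 🧪 Testing
-
-Run the test suite:
+- For new modules, layer responsibilities, the unified response format, error handling, and debugging conventions, see `Docs/BACKEND_DEV.md`.
+- After changing a core flow, run a basic smoke test: login, video generation, publishing.
+- Test command: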
 ```bash
 pytest
--- a/Docs/COSYVOICE3_DEPLOY.md
+++ b/Docs/COSYVOICE3_DEPLOY.md
@@ -8,7 +8,7 @@
 | 端口 | 8010 |
 | GPU | 0 (CUDA_VISIBLE_DEVICES=0) |
 | 推理精度 | FP16 (自动混合精度) |
-| PM2 名称 | vigent2-cosyvoice (id=15) |
+| PM2 名称 | vigent2-cosyvoice |
 | Conda 环境 | cosyvoice (Python 3.10) |
 | 启动脚本 | `run_cosyvoice.sh` |
 | 服务脚本 | `models/CosyVoice/cosyvoice_server.py` |
--- a/Docs/DEPLOY_MANUAL.md
+++ b/Docs/DEPLOY_MANUAL.md
@@ -97,10 +97,13 @@ python -m scripts.server  # 测试能否启动，Ctrl+C 退出

 ### 3b. MuseTalk 1.5 (长视频唇形同步, GPU0)

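-> MuseTalk is a single-step latent-space inpainting model (not a diffusion model); inference runs close to real time and suits long videos of >=120s. It shares GPU0 with CosyVoice, and fp16 inference needs roughly 4-8GB of VRAM. The compositing stage uses NVENC GPU hardware encoding (h264_nvenc) plus pure numpy blending, avoiding double encoding and PIL conversion overhead.
+> MuseTalk is a single-step latent-space inpainting model (not a diffusion model); inference runs close to real time and suits long videos that reach the routing threshold (>=100s in this repository's current `.env` example). It shares GPU0 with CosyVoice, and fp16 inference needs roughly 4-8GB of VRAM. The compositing stage now encodes directly through an FFmpeg rawvideo pipe (`libx264` with configurable CRF/preset) and keeps numpy blending, reducing lossy intermediate files.

-See the detailed standalone deployment guide:
-**[MuseTalk Deployment Guide](MUSETALK_DEPLOY.md)**
+See the detailed standalone deployment guide:
+**[MuseTalk Deployment Guide](MUSETALK_DEPLOY.md)**
+
+Small-face lip-sync quality compensation (optional), deployment and verification:
+**[Small-Face Lip-Sync Quality Compensation Deployment Guide](FACEENHANCE_DEPLOY.md)**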

 简要步骤：
 1. 创建独立的 `musetalk` Conda 环境 (Python 3.10 + PyTorch 2.0.1 + CUDA 11.8)
@@ -136,26 +139,30 @@ pip install -r requirements.txt
 playwright install chromium
 ```

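-> Tip: for WeChat Channels publishing, system Chrome + xvfb-run is recommended (avoids headless decoding failures).
-> Headful mode is likewise recommended for Douyin publishing (`DOUYIN_HEADLESS_MODE=headful`).
+> Tip: for WeChat Channels publishing, system Chrome + xvfb-run is recommended (avoids headless decoding failures).
+> Headful mode is likewise recommended for Douyin publishing (`DOUYIN_HEADLESS_MODE=headful`).
+> For publishing-specific implementation notes covering all four platforms, see `Docs/PUBLISH_DEPLOY.md`.

 ### Notes on QR code login

 - **Cookies are isolated per user**: each user's Cookies are stored under `backend/user_data/{uuid}/cookies/`, so concurrent logins by multiple users do not interfere with each other.
-- **Key lessons from Douyin QR login**:
-  - After scanning, you must **never reload the QR page**, or the session token is destroyed
-  - Use a **new tab** to detect login completion (check that the URL contains `creator-micro` and that session cookies exist)
-  - Douyin may pop up **face verification**; the backend extracts the verification QR code automatically and returns it to the frontend for display
-- **WeChat Channels publishing**: the title, description, and tags are all written into the "video description" field
+- **Key lessons from Douyin QR login**:
+  - After scanning, you must **never reload the QR page**, or the session token is destroyed
+  - Use a **new tab** to detect login completion (check that the URL contains `creator-micro` and that session cookies exist)
+  - Douyin may pop up **face verification**; the backend extracts the verification QR code automatically and returns it to the frontend for display
+- **Key points for Xiaohongshu QR login**:
+  - The creator platform may default to the SMS login view; switch to QR login before capturing the QR code
+  - After scanning, the page may redirect to `creator.xiaohongshu.com/new/home`, which does not necessarily match the old `publish` success-indicator URL
+- **WeChat Channels publishing**: the title, description, and tags are all written into the "video description" field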

 ---

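-### Optional: AI title/tag generation
+### Optional: AI title/tag generation

 > ✅ To enable "AI title/tag generation", make sure the backend can reach the external API.

-- Requires access to `https://open.bigmodel.cn`
-- The API key is configured in `backend/app/services/glm_service.py` (replacing it with your own key is recommended)
+- Requires access to `https://open.bigmodel.cn`
+- The API key is configured via `GLM_API_KEY` in `backend/.env`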

 ---

@@ -195,28 +202,26 @@ playwright install chromium
 ## 步骤 7: 配置环境变量


-```bash
-cd /home/rongye/ProgramFiles/ViGent2/backend
-
-# 复制配置模板
-cp .env.example .env
-```
-
-> 💡 **说明**：`.env.example` 已包含正确的默认配置，直接复制即可使用。  
-> 如需自定义，可编辑 `.env` 修改以下参数：
-
-| 配置项 | 默认值 | 说明 |
-|--------|--------|------|
-| `SUPABASE_URL` | `http://localhost:8008` | Supabase API 内部地址 |
-| `SUPABASE_PUBLIC_URL` | `https://api.hbyrkj.top` | Supabase API 公网地址 (前端访问) |
-| `LATENTSYNC_GPU_ID` | 1 | GPU 选择 (0 或 1) |
-| `LATENTSYNC_USE_SERVER` | false | 设为 true 以启用常驻服务加速 |
-| `LATENTSYNC_INFERENCE_STEPS` | 20 | 推理步数 (16-50) |
-| `LATENTSYNC_GUIDANCE_SCALE` | 2.0 | 引导系数 (1.0-3.0) |
-| `LATENTSYNC_ENABLE_DEEPCACHE` | true | DeepCache 推理加速 |
-| `LATENTSYNC_SEED` | 1247 | 固定随机种子（可复现） |
-| `DEBUG` | true | 生产环境改为 false |
-| `REDIS_URL` | `redis://localhost:6379/0` | 任务状态存储（不可用时回退内存） |
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/backend
+```
+
+> 💡 **说明**：当前仓库直接使用 `backend/.env`。请按你的环境替换敏感值并确认以下参数。  
+> 如需自定义，可编辑 `.env` 修改以下参数：
+
+| 配置项 | 当前示例值 | 说明 |
+|--------|------------|------|
+| `SUPABASE_URL` | `http://localhost:8008` | Supabase API 内部地址 |
+| `SUPABASE_PUBLIC_URL` | `https://api.hbyrkj.top` | Supabase API 公网地址 (前端访问) |
+| `LATENTSYNC_GPU_ID` | 1 | GPU 选择 (0 或 1) |
+| `LATENTSYNC_USE_SERVER` | true | 设为 true 以启用常驻服务加速 |
+| `LATENTSYNC_INFERENCE_STEPS` | 30 | 推理步数 (16-50) |
+| `LATENTSYNC_GUIDANCE_SCALE` | 1.9 | 引导系数 (1.0-3.0) |
+| `LATENTSYNC_ENABLE_DEEPCACHE` | true | DeepCache 推理加速 |
+| `LATENTSYNC_SEED` | 1247 | 固定随机种子（可复现） |
+| `DEBUG` | false | 生产环境必须为 false（仅开发环境可设 true） |
+| `JWT_SECRET_KEY` | 强随机值 | 生产环境禁止默认值；默认值在 `DEBUG=false` 下会阻止后端启动 |
+| `REDIS_URL` | `redis://localhost:6379/0` | 任务状态存储（不可用时回退内存） |
 | `WEIXIN_HEADLESS_MODE` | headless-new | 视频号 Playwright 模式 (headful/headless-new) |
 | `WEIXIN_CHROME_PATH` | `/usr/bin/google-chrome` | 系统 Chrome 路径 |
 | `WEIXIN_BROWSER_CHANNEL` |  | Chromium 通道 (可选) |
@@ -229,19 +234,31 @@ cp .env.example .env
 | `DOUYIN_CHROME_PATH` | `/usr/bin/google-chrome` | 抖音 Chrome 路径 |
 | `DOUYIN_BROWSER_CHANNEL` |  | 抖音 Chromium 通道 (可选) |
 | `DOUYIN_USER_AGENT` | Chrome/144 UA | 抖音浏览器指纹 UA |
-| `DOUYIN_LOCALE` | zh-CN | 抖音语言环境 |
-| `DOUYIN_TIMEZONE_ID` | Asia/Shanghai | 抖音时区 |
-| `DOUYIN_FORCE_SWIFTSHADER` | true | 强制软件 WebGL |
-| `DOUYIN_DEBUG_ARTIFACTS` | false | 保留调试截图 |
-| `DOUYIN_RECORD_VIDEO` | false | 录制浏览器操作视频 |
-| `DOUYIN_KEEP_SUCCESS_VIDEO` | false | 成功后保留录屏 |
+| `DOUYIN_LOCALE` | zh-CN | 抖音语言环境 |
+| `DOUYIN_TIMEZONE_ID` | Asia/Shanghai | 抖音时区 |
+| `DOUYIN_FORCE_SWIFTSHADER` | true | 强制软件 WebGL |
+| `XIAOHONGSHU_HEADLESS_MODE` | headless-new | 小红书 Playwright 模式 (headful/headless-new) |
+| `XIAOHONGSHU_CHROME_PATH` | `/usr/bin/google-chrome` | 小红书 Chrome 路径 |
+| `XIAOHONGSHU_BROWSER_CHANNEL` |  | 小红书 Chromium 通道 (可选) |
+| `XIAOHONGSHU_USER_AGENT` | Chrome/144 UA | 小红书浏览器指纹 UA |
+| `XIAOHONGSHU_LOCALE` | zh-CN | 小红书语言环境 |
+| `XIAOHONGSHU_TIMEZONE_ID` | Asia/Shanghai | 小红书时区 |
+| `XIAOHONGSHU_FORCE_SWIFTSHADER` | true | 强制软件 WebGL |
+| `DOUYIN_DEBUG_ARTIFACTS` | false | 保留调试截图 |
+| `DOUYIN_RECORD_VIDEO` | false | 录制浏览器操作视频 |
+| `DOUYIN_KEEP_SUCCESS_VIDEO` | false | 成功后保留录屏 |
 | `CORS_ORIGINS` | `*` | CORS 允许源 (生产环境建议白名单) |
 | `MUSETALK_GPU_ID` | 0 | MuseTalk GPU 编号 |
 | `MUSETALK_API_URL` | `http://localhost:8011` | MuseTalk 常驻服务地址 |
 | `MUSETALK_BATCH_SIZE` | 32 | MuseTalk 推理批大小 |
-| `MUSETALK_VERSION` | v15 | MuseTalk 模型版本 |
-| `MUSETALK_USE_FLOAT16` | true | MuseTalk 半精度加速 |
-| `LIPSYNC_DURATION_THRESHOLD` | 120 | 秒，>=此值用 MuseTalk，<此值用 LatentSync |
+| `MUSETALK_VERSION` | v15 | MuseTalk 模型版本 |
+| `MUSETALK_USE_FLOAT16` | true | MuseTalk 半精度加速 |
+| `LIPSYNC_DURATION_THRESHOLD` | 100 | 秒，>=此值用 MuseTalk，<此值用 LatentSync（代码默认 120，建议在 `.env` 显式配置） |
+| `LIPSYNC_SMALL_FACE_ENHANCE` | false | 小脸口型质量补偿总开关（建议先关闭，灰度验证后开启） |
+| `LIPSYNC_SMALL_FACE_THRESHOLD` | 256 | 小脸触发阈值（像素） |
+| `LIPSYNC_SMALL_FACE_UPSCALER` | gfpgan | 超分模型（`gfpgan` / `codeformer`） |
+| `LIPSYNC_SMALL_FACE_GPU_ID` | 0 | 小脸补偿超分 GPU（建议与 MuseTalk 同卡） |
+| `LIPSYNC_SMALL_FACE_FAIL_OPEN` | true | 补偿链路失败时是否自动回退原流程 |
 | `ALIPAY_APP_ID` | 空 | 支付宝应用 APPID |
 | `ALIPAY_PRIVATE_KEY_PATH` | 空 | 应用私钥 PEM 文件路径 |
 | `ALIPAY_PUBLIC_KEY_PATH` | 空 | 支付宝公钥 PEM 文件路径 |
@@ -250,7 +267,9 @@ cp .env.example .env
 | `PAYMENT_AMOUNT` | `999.00` | 会员价格 (元) |
 | `PAYMENT_EXPIRE_DAYS` | `365` | 会员有效天数 |

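-> For the full Alipay setup (key generation, PEM format, product activation, etc.), see the **[Alipay Deployment Guide](ALIPAY_DEPLOY.md)**.
+> For the full Alipay setup (key generation, PEM format, product activation, etc.), see the **[Alipay Deployment Guide](ALIPAY_DEPLOY.md)**.
+
+> Hard authentication constraint: when `DEBUG=false`, the backend login Cookie carries `Secure`, so the frontend must be accessed through an HTTPS domain; a direct connection to an HTTP port cannot keep the login session.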

 ---

@@ -308,11 +327,11 @@ cd /home/rongye/ProgramFiles/ViGent2/models/MuseTalk
 /home/rongye/ProgramFiles/miniconda3/envs/musetalk/bin/python scripts/server.py
 ```
 
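-### Verification
-
-1. Visit http://SERVER_IP:3002 to view the frontend
-2. Visit http://SERVER_IP:8006/docs to view the API docs
-3. Upload a test video and generate a talking-head video
+### Verification
+
+1. Visit `https://YOUR_FRONTEND_DOMAIN` to view the frontend (in production, do not connect directly to an HTTP port)
+2. Visit `http://SERVER_IP:8006/docs` to view the API docs (intranet/ops debugging only)
+3. Upload a test video and generate a talking-head video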

 ---

@@ -402,7 +421,7 @@ curl http://localhost:8010/health

 ### 5. 启动 MuseTalk 长视频唇形同步服务

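-> Long videos (>=120s) are routed to MuseTalk automatically. If MuseTalk is unavailable, the service falls back to LatentSync.
+> Videos that reach the threshold (>=100s in the current `.env` example) are routed to MuseTalk automatically. If MuseTalk is unavailable, the service falls back to LatentSync.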
 > 详细部署步骤见 [MuseTalk 部署指南](MUSETALK_DEPLOY.md)。

 1. 启动脚本位于项目根目录: `run_musetalk.sh`
@@ -532,8 +551,8 @@ server {
   GLM_API_KEY=your_zhipu_api_key
   ```

-3. **Verification**:
-   Visit `http://localhost:8006/docs` and test the `/api/tools/extract-script` endpoint.
+3. **Verification**:
+   Visit `http://localhost:8006/docs` and test `/api/tools/extract-script` in a logged-in session (the endpoint requires authentication).

 ---

--- a/Docs/DevLogs/Day30.md
+++ b/Docs/DevLogs/Day30.md
@@ -1,8 +1,8 @@
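-## Remotion Cache Fix + Encoding Pipeline Quality Optimization + Lip-Sync Fault Tolerance + Model Selection (Day 30)
+## Remotion Cache Fix + Encoding Pipeline Quality Optimization + Lip-Sync Fault Tolerance + Unified Dropdown Interaction (Day 30)

 ### Overview

-This round addresses four areas: (1) a serious bug where the Remotion bundle cache caused titles/subtitles to go missing; (2) a full optimization of the LatentSync + MuseTalk dual-engine encoding pipeline, removing redundant lossy encodes; (3) improved LatentSync robustness, so inference continues instead of aborting the task when some frames contain no detectable face; (4) frontend lip-sync model selection, letting users switch between the default/fast/advanced models as needed.
+This round was finally consolidated into five areas: (1) a serious bug where the Remotion bundle cache caused titles/subtitles to go missing; (2) a full optimization of the LatentSync + MuseTalk dual-engine encoding pipeline, removing redundant lossy encodes; (3) improved LatentSync robustness, so inference continues instead of aborting the task when some frames contain no detectable face; (4) end-to-end passthrough of the lip-sync model selection (default/fast/advanced); (5) the selectors on the home and publish pages were unified on the SelectPopover interaction, with fixes for occlusion, positioning, and preview layering.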

 ---

@@ -278,66 +278,102 @@ needs_audio_compose = str(final_audio_path) != str(audio_path)

 ---

-### 6. 唇形模型前端选择
-
-前端生成按钮右侧新增模型下拉，用户可按需选择唇形同步引擎，全链路透传到后端路由。
-
-#### 模型选项
-
-| 选项 | 值 | 路由逻辑 |
-|------|------|------|
-| 默认模型 | `default` | 保持现有阈值策略（`LIPSYNC_DURATION_THRESHOLD` 分水岭，短视频 LatentSync，长视频 MuseTalk） |
-| 快速模型 | `fast` | 强制 MuseTalk，不可用时回退 LatentSync |
-| 高级模型 | `advanced` | 强制 LatentSync，跳过 MuseTalk |
-
-三种模式最终都有 LatentSync 兜底，不会出现无模型可用的情况。
-
-#### 数据流
-
-```
-前端 select → setLipsyncModelMode("fast") → localStorage 持久化
-                                           ↓
-用户点击"生成视频" → handleGenerate()
-  → payload.lipsync_model = lipsyncModelMode
-  → POST /api/videos/generate { ..., lipsync_model: "fast" }
-    → workflow: req.lipsync_model 透传给 lipsync.generate(model_mode=...)
-      → lipsync_service.generate(): 按 model_mode 路由
-        → fast: 强制 MuseTalk → 回退 LatentSync
-        → advanced: 强制 LatentSync
-        → default: 阈值策略
-```
-
-#### 改动文件
-
-| 文件 | 改动 |
-|------|------|
-| `frontend/src/features/home/ui/GenerateActionBar.tsx` | 生成按钮右侧新增模型 `<select>` 下拉 |
-| `frontend/src/features/home/ui/HomePage.tsx` | 透传 `modelMode` / `onModelModeChange` |
-| `frontend/src/features/home/model/useHomeController.ts` | `lipsyncModelMode` state + payload 透传 |
-| `frontend/src/features/home/model/useHomePersistence.ts` | 读/校验/写三步持久化 |
-| `backend/app/modules/videos/schemas.py` | `lipsync_model: Literal["default", "fast", "advanced"]` |
-| `backend/app/modules/videos/workflow.py` | 多素材/单素材两处 `model_mode=req.lipsync_model` 透传 |
-| `backend/app/services/lipsync_service.py` | `generate()` 新增 `model_mode` 参数，三路分支路由 |
-
---
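+### 6. End-to-end lip-sync model selection
+
+A model selector was added to the right of the frontend "Generate Video" button; the selected value is passed end to end through the backend route to the inference service.
+
+#### Model options
+
+| Option | Value | Routing logic |
+|------|------|------|
+| Default model | `default` | Keeps threshold routing (`LIPSYNC_DURATION_THRESHOLD`, currently recommended 100s) |
+| Fast model | `fast` | Forces MuseTalk; falls back to LatentSync if it is unavailable |
+| Advanced model | `advanced` | Forces LatentSync |
+
+#### Final UI form
+
+- The model button was upgraded from a native `<select>` to the unified `SelectPopover`
+- Trigger labels now use business wording (`默认模型 / 快速模型 / 高级模型`, i.e. default / fast / advanced model, plus `按时长智能路由 / 速度优先 / 质量优先`, i.e. smart routing by duration / speed first / quality first)
+- The selection is persisted through `useHomePersistence` (`lipsyncModelMode`)
+
+#### Data flow
+
+```
+Frontend SelectPopover → setLipsyncModelMode("fast") → persisted to localStorage
+                                                      ↓
+User clicks "Generate Video" → handleGenerate()
+  → payload.lipsync_model = lipsyncModelMode
+  → POST /api/videos/generate { ..., lipsync_model: "fast" }
+    → workflow: passes req.lipsync_model to lipsync.generate(model_mode=...)
+      → lipsync_service.generate(): routes by model_mode
+        → fast: force MuseTalk → fall back to LatentSync
+        → advanced: force LatentSync
+        → default: threshold strategy
+```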
+
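The three routing modes in the data flow above reduce to a small decision function. This is a sketch of the decision only, under assumptions: `choose_engine` and its parameters are hypothetical names rather than the signature of `lipsync_service.generate()`, and the default threshold of 120 mirrors the code default quoted in the docs.

```python
from typing import Literal

ModelMode = Literal["default", "fast", "advanced"]


def choose_engine(model_mode: ModelMode, duration_s: float,
                  threshold_s: float = 120.0,
                  musetalk_available: bool = True) -> str:
    """Return "musetalk" or "latentsync" following the three routing modes."""
    if model_mode == "advanced":
        return "latentsync"  # forced LatentSync
    if model_mode == "fast":
        wants_musetalk = True  # forced MuseTalk
    else:
        wants_musetalk = duration_s >= threshold_s  # threshold routing
    if wants_musetalk and musetalk_available:
        return "musetalk"
    return "latentsync"  # fallback when MuseTalk is unavailable or not selected
```

Every branch ends in LatentSync when MuseTalk cannot be used, so no mode is left without an engine.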
+---
+
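+### 7. Unified dropdown interaction on the home/publish pages (SelectPopover)
+
+#### 7a. Scope of the unification
+
+Business selection controls on the home and publish pages were all migrated to `SelectPopover`:
+
+- Home page: voice, reference audio, voiceover list, material selection, BGM selection, work selection, title display mode, title/secondary title/subtitle styles, timeline aspect ratio, lip-sync model
+- Publish page: select the work to publish (search + preview)
+
+Exception: at the product team's request, "Script history / AI multilingual" in `ScriptEditor` went back to the original lightweight menu and is not forced into the unified control.
+
+#### 7b. Key interaction fixes
+
+- **Occlusion fix**: on desktop the panel now uses `Portal + fixed`, escaping the local stacking context so cards no longer cover it
+- **Adaptive open direction**: when there is not enough space below, the panel opens upward so the menu is not cut off
+- **Matching width**: the panel width matches the trigger
+- **Consistent style**: the panel background is more solid (high opacity); the scrollbar is hidden but scrolling still works
+- **Scroll to selection**: reopening a dropdown scrolls to the selected item automatically (`data-popover-selected="true"`)
+- **Preview coordination**:
+  - Clicking "Preview" inside the dropdown does not force it closed, so items can be previewed back to back
+  - The video preview modal is layered above the dropdown so it is not covered
+  - While the preview modal is open, outside clicks or Esc do not close the dropdown by mistake; after the preview closes, the dropdown remains usable
+
+#### 7c. BGM panel simplification
+
+- BGM now uses the same selector as "work to publish" (search + list + audition + selected state)
+- The home page BGM volume slider was removed at the product team's request
+- Generation requests always use a fixed `bgm_volume=0.2`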
+
+---

 ## 📁 总修改文件清单

-| 文件 | 改动 |
-|------|------|
-| `remotion/render.ts` | bundle 缓存使用时硬链接视频+字体到 public 目录 |
-| `models/LatentSync/latentsync/utils/util.py` | `read_video` 检测 FPS，25fps 时跳过重编码 |
-| `models/LatentSync/latentsync/pipelines/lipsync_pipeline.py` | final mux `-c:v copy`；无脸帧容错 |
-| `backend/app/services/video_service.py` | CRF 23→18；`concat_videos` copy；`compose()` 异步化 + 循环 CRF 18 |
-| `backend/app/modules/videos/workflow.py` | 线程池化；同分辨率跳过 scale；compose 跳过；片段校验；模型选择透传 |
-| `backend/app/modules/videos/schemas.py` | 新增 `lipsync_model` 字段 |
-| `backend/app/services/lipsync_service.py` | `generate()` 新增 `model_mode` 三路分支路由 |
-| `models/MuseTalk/scripts/server.py` | FFmpeg rawvideo 管道；参数环境变量化 |
-| `backend/.env` | 新增 MuseTalk 质量优先参数 |
-| `frontend/src/features/home/ui/GenerateActionBar.tsx` | 模型下拉 UI |
-| `frontend/src/features/home/ui/HomePage.tsx` | 模型状态透传 |
-| `frontend/src/features/home/model/useHomeController.ts` | `lipsyncModelMode` state + payload |
-| `frontend/src/features/home/model/useHomePersistence.ts` | 模型选择持久化 |
+| 文件 | 改动 |
+|------|------|
+| `remotion/render.ts` | bundle 缓存使用时硬链接视频+字体到 public 目录 |
+| `models/LatentSync/latentsync/utils/util.py` | `read_video` 检测 FPS，25fps 时跳过重编码 |
+| `models/LatentSync/latentsync/pipelines/lipsync_pipeline.py` | final mux `-c:v copy`；无脸帧容错 |
+| `backend/app/services/video_service.py` | CRF 23→18；`concat_videos` copy；`compose()` 异步化 + 循环 CRF 18 |
+| `backend/app/modules/videos/workflow.py` | 线程池化；同分辨率跳过 scale；compose 跳过；片段校验；模型选择透传 |
+| `backend/app/modules/videos/schemas.py` | 新增 `lipsync_model` 字段 |
+| `backend/app/services/lipsync_service.py` | `generate()` 新增 `model_mode` 三路分支路由 |
+| `models/MuseTalk/scripts/server.py` | FFmpeg rawvideo 管道；参数环境变量化 |
+| `backend/.env` | MuseTalk 推理/融合/编码参数可配；路由阈值与质量档调优 |
+| `frontend/src/shared/ui/SelectPopover.tsx` | 新增统一选择器：Portal+fixed、防遮挡、上拉/下拉自适应、同宽、隐藏滚动条、已选定位、预览协同 |
+| `frontend/src/features/home/ui/HomePage.tsx` | 配音卡层级修复；传递统一下拉状态 |
+| `frontend/src/features/home/model/useHomeController.ts` | `lipsyncModelMode` 透传；BGM 固定 `bgm_volume=0.2` |
+| `frontend/src/features/home/model/useHomePersistence.ts` | 模型模式等新增字段持久化 |
+| `frontend/src/features/home/ui/GenerateActionBar.tsx` | 模型选择改为 SelectPopover（速度/质量语义文案） |
+| `frontend/src/features/home/ui/VoiceSelector.tsx` | 音色选择统一为 SelectPopover（音色名+语言） |
+| `frontend/src/features/home/ui/RefAudioPanel.tsx` | 参考音频选择统一为 SelectPopover（含试听/重命名/删除/重识别） |
+| `frontend/src/features/home/ui/GeneratedAudiosPanel.tsx` | 配音列表、语速、语气统一为 SelectPopover |
+| `frontend/src/features/home/ui/MaterialSelector.tsx` | 素材选择改为发布页同款下拉（搜索/多选/预览/重命名/删除） |
+| `frontend/src/features/home/ui/BgmPanel.tsx` | BGM 选择改为发布页同款下拉（搜索+试听），移除音量滑杆 |
+| `frontend/src/features/home/ui/HistoryList.tsx` | 首页作品选择改为下拉（搜索+删除+选中态） |
+| `frontend/src/features/home/ui/TitleSubtitlePanel.tsx` | 标题显示模式与样式选择统一为 SelectPopover |
+| `frontend/src/features/home/ui/TimelineEditor.tsx` | 画面比例选择统一为 SelectPopover（单行按钮） |
+| `frontend/src/features/publish/ui/PublishPage.tsx` | 发布作品选择改为 SelectPopover；预览时下拉保持打开 |
+| `frontend/src/components/VideoPreviewModal.tsx` | 提升层级并添加预览标记，与下拉联动 |
+| `frontend/src/features/home/ui/ScriptEditor.tsx` | 历史文案/AI多语言恢复原轻量菜单（产品例外） |
+| `Docs/FRONTEND_DEV.md` | 新增 SelectPopover 规范、预览层级规范、持久化字段修订 |

 ---

@@ -358,6 +394,12 @@ needs_audio_compose = str(final_audio_path) != str(audio_path)
 13. **compose 循环 CRF**: 循环场景编码应为 CRF 18（非 23）
 14. **模型选择 UI**: 生成按钮右侧应出现默认模型/快速模型/高级模型下拉
 15. **模型选择持久化**: 切换模型后刷新页面，下拉应恢复上次选择
-16. **快速模型路由**: 选择"快速模型"时，后端日志应出现 `强制快速模型：MuseTalk`
-17. **高级模型路由**: 选择"高级模型"时，后端日志应出现 `强制高级模型：LatentSync`
-18. **默认模型不变**: 选择"默认模型"时行为与改动前完全一致（阈值路由）
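+16. **Fast model routing**: with "Fast model" selected, the backend log should show `强制快速模型：MuseTalk`
+17. **Advanced model routing**: with "Advanced model" selected, the backend log should show `强制高级模型：LatentSync`
+18. **Default model unchanged**: with "Default model" selected, behavior is identical to before the change (threshold routing)
+19. **Unified dropdown style**: all business selection controls on the home/publish pages use the same SelectPopover (trigger + panel + selected state)
+20. **Adaptive open direction**: a dropdown opened near the bottom of the page should open upward and not be cut off
+21. **Scroll to selection**: any dropdown, when reopened, should scroll to the selected item rather than the top of the list
+22. **Preview layering**: the video preview modal should always sit above the dropdown and never be covered by the menu
+23. **Back-to-back preview**: after clicking preview inside a dropdown, the menu stays open; once the preview is closed, other preview items can still be clicked
+24. **BGM behavior**: the home page no longer shows a BGM volume slider, and generation requests use a fixed `bgm_volume=0.2`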
--- a/Docs/DevLogs/Day31.md
+++ b/Docs/DevLogs/Day31.md
@@ -0,0 +1,526 @@
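+## Documentation Layering Cleanup + Voice Preview Fix + Recording Dialog Refactor + Unified Modal System (Day 31)
+
+### Overview
+
+Today's work focused on four things:
+
+1. Cleaning up and consolidating the root-level docs (README/DEV responsibility boundaries, archiving historical content, aligning parameter descriptions with the code)
+2. Delivering "one-click preview" for the EdgeTTS voice list and fixing preview failures in the browser
+3. Refactoring the voice-cloning recording interaction: the recording entry moved to the bottom right of the reference audio area, and the flow became a modal
+4. Extracting a unified modal base, `AppModal`, and migrating the main dialogs to one visual and interaction standard
+
+---
+
+## ✅ 1) Documentation system and content consistency
+
+### 1.1 Clear README / DEV boundary
+
+- Added a "Document Scope" section to `FRONTEND_README.md`, `BACKEND_README.md`, `FRONTEND_DEV.md`, and `BACKEND_DEV.md`
+- README keeps only stable descriptions (features, interfaces, running); DEV keeps conventions (constraints, layering, checklists)
+- Log-like content in the READMEs (such as Day annotations) was rewritten as stable statements
+
+### 1.2 Deployment and parameter docs aligned with the current code
+
+- Unified the lip-sync routing wording as threshold-driven, using the current `.env` example value `100` as the reference
+- Corrected the outdated encoding description (the MuseTalk compositing description now reads rawvideo pipe + `libx264`)
+- Fixed guidance that pointed to a nonexistent `.env.example`; it now describes `backend/.env`
+- Marked the Qwen3-TTS doc as "historical archive (discontinued)" and pointed it to CosyVoice 3.0
+
+---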
+
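+## ✅ 2) Voice preview: implementation and bug fix
+
+### 2.1 Implementation
+
+- Each voice dropdown item gained a preview button (play/pause/loading states)
+- Added a backend preview endpoint: `/api/videos/voice-preview`
+- The preview text is a fixed sample sentence chosen automatically from the voice's locale (9 languages, with Chinese as the fallback)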
+### 2.2 兼容与稳定性调整
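The locale-based choice of preview text might look like the sketch below. It is illustrative only: the sample sentences and the three-entry table are made up (the real service is said to cover nine languages), `preview_text_for_voice` is a hypothetical name rather than the project's `_get_preview_text_for_voice`, and voice names are assumed to start with a language prefix such as `en-US-...`.

```python
# Hypothetical sample texts; the real table is said to cover nine languages.
PREVIEW_TEXTS = {
    "zh": "你好，这是音色试听。",
    "en": "Hello, this is a voice preview.",
    "ja": "こんにちは、これは音声のプレビューです。",
}
FALLBACK_LANG = "zh"  # Chinese is the stated fallback


def preview_text_for_voice(voice: str) -> str:
    """Pick a fixed sample sentence from the language prefix of a voice name."""
    lang = voice.strip().split("-", 1)[0].lower()
    return PREVIEW_TEXTS.get(lang, PREVIEW_TEXTS[FALLBACK_LANG])
```

Keeping the sentences fixed per language means repeated previews of the same voice produce the same audio, which also makes the result cacheable.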
+
+- 保留 `POST /api/videos/voice-preview`（兼容）
+- 新增 `GET /api/videos/voice-preview?voice=...`，前端改为直接播放 GET 音频流，减少浏览器自动播放策略干扰
+
+```python
+@router.get("/voice-preview")
+async def preview_voice_get(voice: str, current_user: dict = Depends(get_current_user)):
+    voice_value = voice.strip()
+    if not voice_value:
+        raise HTTPException(status_code=400, detail="voice 不能为空")
+    text = _get_preview_text_for_voice(voice_value)
+    return await _render_voice_preview(voice=voice_value, text=text)
+```
+
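+### 2.3 Conclusion on this production issue (fixed)
+
+- Symptom: preview requests from the browser returned 404
+- Root cause: after the GET route was added, the backend process was not restarted, so the running code was still the old version
+- Fix: the route took effect after `pm2 restart vigent2-backend`
+- Note: `curl` returning 401 (no auth cookie) is expected; same-origin browser requests send the cookie automatically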
+
+---
+
+## ✅ 3) 录音交互重构（声音克隆）
+
+### 3.1 入口重排
+
+- 去掉参考音频面板内的独立录音大块区域
+- 将「上传音频 / 录音」入口放到「我的参考音频」区域底部右侧
+
+### 3.2 录音流程改为弹窗
+
+- 录音弹窗支持：开始录音 / 停止录音 / 状态计时 / 试听
+- 保留并强化「使用此录音」和「弃用本次录音」
+- 关闭弹窗时若仍在录音，会先停止录音再关闭
+- 修正弹窗挂载位置：从局部组件渲染改为 `AppModal` Portal 到 `document.body`，确保是全页面弹窗体验
+- 参考音频区按钮文案更新：`录音` -> `在线录音`
+
+### 3.4 文案区按钮视觉统一
+
+- 统一「文案提取与编辑」区按钮尺寸与圆角（`px-3 py-1.5 text-xs rounded-lg`）
+- 将 `AI智能改写`、`保存文案` 按钮改为与上传/在线录音同等级的视觉规格
+- 同步统一图标尺寸与禁用态样式，消除“底部按钮偏小”问题
+
+### 3.5 录音试听条 UI 美化
+
+- 将录音完成后的原生白色 `<audio controls>` 替换为项目深色风格的自定义试听条
+- 新试听条包含：播放/暂停按钮、进度拖拽、当前时长/总时长显示
+- 统一配色到当前页面（深色底 + 绿色强调），避免与整体 UI 风格割裂
+
+### 3.6 录音上传关闭时机优化
+
+- 原逻辑：点击「使用此录音」后，需等待上传+识别完成才关闭弹窗（体感卡顿）
+- 新逻辑：点击后立即关闭弹窗，上传/识别在后台继续进行
+- 状态反馈仍在参考音频区域显示（上传识别中的提示 + 失败错误提示）
+
+---
+
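+## ✅ 5) Fix for "cannot get QR code" on Douyin login in publish management
+
+### Diagnosis
+
+- Symptom: clicking Douyin login in publish management made the frontend report that the QR code could not be fetched
+- Root cause shown in the backend logs:
+  - `Page.goto: Timeout 30000ms exceeded`
+  - Navigation target: `https://creator.douyin.com/`
+  - Wait condition: `wait_until="networkidle"`
+
+### Fix
+
+- The Douyin login page now uses the same, more robust strategy as WeChat: `wait_until="domcontentloaded"`
+- Douyin navigation timeouts are tolerated: even if `goto` times out, the QR code extraction flow continues (avoiding false failures caused by long-lived connections)
+
+### Verification
+
+- Local endpoint smoke test: `POST /api/publish/login/douyin` returns `success=true` and includes `qr_code`
+- The backend process was restarted so the fix takes effect: `pm2 restart vigent2-backend`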
+
+### 3.3 状态逻辑补齐
+
+- 新增 `discardRecording()`：清空本次录音与计时
+- 开始新录音前先清空旧录音，避免旧状态残留
+
+---
+
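+## ✅ 4) Unified modal UI/UX (AppModal)
+
+Added a shared modal base: `frontend/src/shared/ui/AppModal.tsx`
+
+- Unified overlay: `bg-black/80 + backdrop-blur-sm`
+- Unified container: dark semi-transparent background, `border-white/10`, `rounded-2xl`, heavy shadow
+- Unified header: title / subtitle / close button
+- Unified behavior: ESC to close, background scroll lock, overlay-click close configurable per modal
+- Unified mounting: rendered into `document.body` through a portal, avoiding the stacking issue where a modal "appears to pop up only inside the voice-over area"
+- Unified accessibility: added `role="dialog"` + `aria-modal="true"`
+- Unified focus management: the modal takes focus when opened, and focus returns to the previously focused element when it closes
+- Unified scroll-lock counting: supports multiple modals open at once, so closing one modal does not restore page scrolling early
+
+Migrated modals:
+
+- Video preview (`VideoPreviewModal`)
+- Script extraction (`ScriptExtractionModal`)
+- AI rewrite (`RewriteModal`)
+- Clip trimming settings (`ClipTrimmer`)
+- Recording modal (inside `RefAudioPanel`)
+- Change-password modal (`AccountSettingsDropdown`)
+- QR login modal in publish management (the QR login modal inside `PublishPage`)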
+
+---
+
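+## ✅ 6) WeChat Channels login QR code appearance fix ("scannable but looks cropped")
+
+### Symptom
+
+- The WeChat Channels login QR code scanned successfully, but visually its edges looked incomplete or cut off
+
+### Fix
+
+- Strengthened the backend QR code extraction strategy (`qr_login_service.py`):
+  - Prefer exporting the QR code's original PNG data (`canvas.toDataURL('image/png')` / `img[data:image/png]`) to reduce edge loss from a second screenshot
+  - The WeChat screenshot fallback now crops around the QR code bbox expanded with padding, avoiding the incomplete look of an edge-tight crop
+  - Only PNG data URLs are accepted, so non-PNG content (such as an SVG fragment) is not returned as the QR code with broken corners
+- Improved the frontend QR login modal (`PublishPage.tsx`):
+  - Removed the rounded-corner clipping on the QR image itself in favor of an outer white container with padding (simulating the quiet zone)
+  - Adjusted the QR code display width and border accordingly for a more complete and consistent look
+
+### Verification
+
+- Local API smoke test: `POST /api/publish/login/weixin` returns `success=true` and includes `qr_code`
+- The decoded image is `1000x1000` and still scans correctly
+- Frontend and backend processes restarted so the fix takes effect: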
+  - `pm2 restart vigent2-frontend`
+  - `pm2 restart vigent2-backend`
+
+---
+
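+## ✅ 7) Publish flow performance and log readability (two-platform publishing)
+
+### 7.1 Concurrent publish requests (frontend)
+
+- Old behavior: the publish page ran platforms serially with `for...of await`, so total time was the sum of all platforms
+- New behavior: bounded concurrency (limit = 2) lets two platforms publish in parallel, so total wait drops from the sum of both durations to roughly that of the slower platform
+- The result list is still filled in the order the user selected platforms, so out-of-order completion does not shuffle it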
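+
+The idea in 7.1 can be sketched as follows. This is a Python `asyncio` analogue for illustration only: the real change lives in the TypeScript frontend, and `run_limited` and `fake_publish` are hypothetical names, not project code.
+
+```python
+import asyncio
+
+
+async def run_limited(jobs, limit=2):
+    """Run coroutine factories with at most `limit` in flight; results keep input order."""
+    sem = asyncio.Semaphore(limit)
+
+    async def guarded(job):
+        async with sem:
+            return await job()
+
+    # gather() returns results in the order the awaitables were passed in,
+    # regardless of which one finishes first.
+    return await asyncio.gather(*(guarded(job) for job in jobs))
+
+
+async def fake_publish(platform, delay):
+    # Stand-in for a real publish request (assumption: no network involved).
+    await asyncio.sleep(delay)
+    return f"{platform}:ok"
+
+
+def demo():
+    jobs = [
+        lambda: fake_publish("douyin", 0.05),  # slower, but selected first
+        lambda: fake_publish("weixin", 0.01),
+    ]
+    return asyncio.run(run_limited(jobs))
+```
+
+`demo()` returns the results in selection order even though the second job finishes first, which mirrors the "fill results in the user's platform order" rule.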
+
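+### 7.2 WeChat upload log noise reduction (backend)
+
+- Old behavior: if `input.files[0]` could not be read right after `set_input_files`, a warning was logged immediately: `[weixin][file_input] empty`
+- New behavior: first poll to check whether the page has entered the uploading state, then decide whether to warn; non-final retries log at info, and only the final retry logs a warning
+- Effect: fewer false alarms (no more warning spam when the upload has actually started) and cleaner troubleshooting logs
+
+### Verification
+
+- `python -m py_compile backend/app/services/uploader/weixin_uploader.py` ✅
+- `npm run build` (frontend) ✅
+- Service restart: `pm2 restart vigent2-frontend && pm2 restart vigent2-backend` ✅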
+
+---
+
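+## ✅ 8) Xiaohongshu publish pipeline alignment (launch mode / cookie format / success screenshot)
+
+### 8.1 Launch mode and anti-detection flags aligned
+
+- Added Xiaohongshu Playwright settings in `config.py`:
+  - `XIAOHONGSHU_HEADLESS_MODE` (default `headless-new`)
+  - `XIAOHONGSHU_USER_AGENT / LOCALE / TIMEZONE_ID`
+  - `XIAOHONGSHU_CHROME_PATH / BROWSER_CHANNEL`
+  - `XIAOHONGSHU_FORCE_SWIFTSHADER / DEBUG_ARTIFACTS`
+- `xiaohongshu_uploader.py` now uses the same configurable launch strategy as Douyin/WeChat and keeps the basic anti-detection flag (`--disable-blink-features=AutomationControlled`)
+
+### 8.2 Xiaohongshu uploader refactor
+
+- Rewrote the main Xiaohongshu uploader flow (following the Douyin/WeChat pattern):
+  - Multi-selector fallback for the upload entry / file input
+  - Polling-based detection of uploading / success / failure states
+  - Fault-tolerant filling of title, body, and topics
+  - Multi-selector lookup and clickability check for the publish button
+- Publish-success detection strengthened from "URL only" to a combination of signals:
+  - URL redirect check
+  - Success/failure text on the page
+  - Listening for publish API responses (`publish` / `note create` style endpoints)
+- Added a screenshot after successful publishing, returned as `screenshot_url` (same path format as Douyin/WeChat):
+  - `/api/publish/screenshot/{filename}`
+
+### 8.3 Unified cookie storage format
+
+- Changes to `publish_service.save_cookie_string()`:
+  - `bilibili` keeps the existing simplified cookie dict (compatible with the existing upload library)
+  - All non-`bilibili` platforms are saved as a Playwright `storage_state`:
+    - `{"cookies": [...], "origins": []}`
+  - Added per-platform default domains (Douyin/WeChat/Xiaohongshu) so the cookie file can be passed directly to `browser.new_context(storage_state=...)`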
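+
+A minimal sketch of the conversion described in 8.3, assuming the input is a raw `name=value; name2=value2` cookie header string. The function name and the default field values are illustrative assumptions; the project's actual `save_cookie_string()` is not shown in this log.
+
+```python
+def cookie_string_to_storage_state(cookie_string, domain):
+    """Convert a raw cookie header string into a Playwright-style storage_state dict."""
+    cookies = []
+    for part in cookie_string.split(";"):
+        name, sep, value = part.strip().partition("=")
+        if not sep or not name:
+            continue  # skip fragments that are not name=value pairs
+        cookies.append({
+            "name": name,
+            "value": value,        # everything after the first "=" is kept
+            "domain": domain,      # the per-platform default domain
+            "path": "/",
+            "expires": -1,         # -1 marks a session cookie
+            "httpOnly": False,
+            "secure": False,
+            "sameSite": "Lax",
+        })
+    return {"cookies": cookies, "origins": []}
+```
+
+The returned dict has the `{"cookies": [...], "origins": []}` shape named above; the `domain` argument is where a per-platform default would be supplied.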
+
+### 8.4 Verification and rollout
+
+- `python -m py_compile backend/app/core/config.py backend/app/services/publish_service.py backend/app/services/uploader/xiaohongshu_uploader.py` ✅
+- `pm2 restart vigent2-backend` ✅
+
+---
+
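+## ✅ 9) Xiaohongshu login QR code fix (page defaults to SMS login and must be switched first)
+
+### Symptom
+
+- The Xiaohongshu creator platform `https://creator.xiaohongshu.com/` lands on the "SMS login" view by default
+- The QR code only appears after clicking the switch icon in the top-right corner, so the backend failed when it queried the QR code selectors directly
+
+### Fix (`qr_login_service.py`)
+
+- Added `_ensure_xiaohongshu_qr_mode()`:
+  - First detects whether the page is in SMS login mode (`input[placeholder*='手机号']`)
+  - Automatically clicks the switch icon in the top-right corner of the login card (stable selectors first, geometric position as a fallback)
+  - Waits for the QR code to render after switching, then enters the extraction flow
+- Extended the set of Xiaohongshu QR code selectors:
+  - Added selectors for the QR image inside the login card (covering the current page structure)
+  - Kept the generic `img[src*='qr'/'qrcode']` fallback
+- Raised the candidate filter threshold for Xiaohongshu (`min_side=120`) to avoid picking the small switch icon in the top-right corner
+- Added Xiaohongshu keywords to the text strategy (for example `APP扫一扫登录`)
+
+### Verification
+
+- Local API smoke test: `POST /api/publish/login/xiaohongshu` returns `success=true` with a non-empty `qr_code`
+- Backend logs confirm the fix path is active:
+  - `已点击登录方式切换，等待二维码渲染`
+  - `策略1(CSS): 匹配成功`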
+
+---
+
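+## ✅ 10) Xiaohongshu upload-stage fix ("Publish note - upload video" scenario)
+
+### Symptom
+
+- Xiaohongshu publishing failed at the "upload video" stage: the page stayed on the publish page and the frontend reported a publish failure
+- Backend logs showed `set_input_files` fired successfully, but no upload state was detected shortly afterwards, so the upload was triggered repeatedly and misjudged as failed
+- Further investigation found the uploaded file was a local Supabase object file (no extension), and `file_input type=` was empty in the log, so the platform may have been unable to identify the video MIME type
+
+### Fix (`xiaohongshu_uploader.py`)
+
+- Added an upload start detection window `UPLOAD_SIGNAL_TIMEOUT=12s`:
+  - Gives the upload state time to appear after `set_input_files` succeeds
+  - Once a signal such as "uploading / processing / transcoding" is detected, the flow moves on to upload polling
+  - If no clear signal appears within the window, it no longer fails immediately and instead continues waiting in the main upload monitoring stage
+- Corrected the failure keywords:
+  - Removed `重新上传` (re-upload) from the failure keywords (on Xiaohongshu pages this text often appears as a normal state or action entry, so it cannot be treated as failure on its own)
+- Added upload file diagnostic logs:
+  - Logs the name / size / type of the file selected in `file_input` to confirm the file was actually injected into the input
+  - Logs an explicit warning when an upload failure is matched, for faster diagnosis in production
+- Added a fallback for video files without an extension:
+  - If the original file has no extension and its parent directory name has one (for example `xxx.mp4/<uuid>`), a temporary file with the same name is created under `/tmp/vigent_uploads` (hard link / symlink / copy fallback)
+  - The upload uses the temporary file with the extension, making the site's MIME detection more reliable
+  - Temporary upload files are cleaned up automatically when the task ends
+
+### 10.1 Second investigation and hardening (after the hang was reproduced)
+
+- Reproduction logs showed that even when a temporary path with an extension was passed in, `file_input` still held a file name without an extension and the flow stayed at `等待上传状态...` for a long time
+- Root cause confirmed: in cross-device cases the code fell back to `symlink`, and the browser picked up the original target file name (no extension), so the site failed to recognize the file
+- Hardening:
+  - Removed the `symlink` fallback, keeping only `hardlink -> copy`, so the uploaded file name always ends in `.mp4`
+  - Added a suffix consistency check on the `file_input` file name: on a mismatch with the expected suffix it retries immediately and, on final failure, returns early (no more pointless long waits)
+  - Added an upload idle timeout (`UPLOAD_IDLE_TIMEOUT=90s`): when there is no valid upload signal for a long time it fails early and keeps a debug screenshot, so the frontend does not "appear frozen"
+  - Changed the failure message to "未能触发有效视频上传，请确认发布页状态及视频文件格式" (failed to trigger a valid video upload; check the publish page state and the video file format)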
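+
+The `hardlink -> copy` fallback can be sketched like this. It is an illustration under assumptions, not the project's `_prepare_upload_file()`: the function name, the temp file naming scheme, and the return shape are all hypothetical.
+
+```python
+import os
+import shutil
+from pathlib import Path
+
+
+def prepare_upload_file(src, tmp_dir):
+    """Return (path, is_temp). Gives an extension-less file a name that ends in
+    the parent directory's suffix, using a hard link or, failing that, a copy."""
+    src = Path(src)
+    if src.suffix or not src.parent.suffix:
+        return src, False  # already has a suffix, or there is none to borrow
+
+    tmp_dir = Path(tmp_dir)
+    tmp_dir.mkdir(parents=True, exist_ok=True)
+    # Hypothetical naming: "<object name>_<parent dir name>", e.g. "abc123_clip.mp4".
+    target = tmp_dir / f"{src.name}_{src.parent.name}"
+    if target.exists():
+        target.unlink()
+    try:
+        os.link(src, target)       # hard link: a second real name for the same data
+    except OSError:
+        shutil.copy2(src, target)  # e.g. cross-device link error: fall back to a copy
+    return target, True
+```
+
+A symlink is deliberately not attempted here, matching the hardening above: a browser may resolve a symlink to its extension-less target name, while a hard link or a copy is a regular file under the new name.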
+
+### 10.2 Live publish verification (after the fix)
+
+- Re-ran `POST /api/publish` (Xiaohongshu): the backend completed upload + publish end to end and the endpoint returned `200`
+- The measured duration was about `45.77s`, within the normal range for upload and publish waiting
+- The success screenshot is accessible: `GET /api/publish/screenshot/xiaohongshu_success_20260303_115944_633.png` returns `200`
+- Key log chain: `正在上传` -> `已设置上传文件` -> `等待发布结果` -> `Cookie 更新完毕`
+
+### Verification
+
+- `python -m py_compile backend/app/services/uploader/xiaohongshu_uploader.py` ✅
+- `pm2 restart vigent2-backend` ✅
+- `curl http://127.0.0.1:8006/health` returns `{"status":"ok"}` ✅
+
+---
+
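+## ✅ 11) Home page "AI generate title and tags" button moved (into "4. Title and subtitles")
+
+### Design decision
+
+- Moved `AI生成标题标签` (AI generate title and tags) from "1. Script extraction and editing" to "4. Title and subtitles"
+- The title area now has two rows:
+  - Row 1: the `四、标题与字幕` heading + `AI生成标题标签` on the right
+  - Row 2: `标题短暂显示/常驻显示` (title shown briefly / persistently) + `预览样式` (preview style), right-aligned
+- Display semantics: `标题短暂显示/常驻显示` applies to both the main title and the secondary title (persistent = both stay on screen)
+- No extra hint text is added, keeping the interface clean
+- `AI生成标题标签` matches the corner radius and size of the `在线录音` button (`rounded-lg` + same button size), keeping its original blue gradient
+
+### Result
+
+- Title-related actions are grouped in one panel, so users no longer jump between sections 1 and 4
+- Clearer in-row hierarchy: the AI action sits on the heading row, with settings and preview on the next row
+- The AI button has softer corners and sizing while keeping the original blue gradient, for a more consistent look
+
+### Verification
+
+- `npm run build` (frontend) ✅
+- `pm2 restart vigent2-frontend` ✅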
+
+---
+
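+## ✅ 12) Expand handle in the bottom-right corner of the script editor (opens a large editor)
+
+### Design and implementation
+
+- Added a corner button at the bottom right of the main input in "1. Script extraction and editing" (clicking it opens the expanded editor)
+- The expanded editor uses `AppModal` and provides more editing space (about `66vh` tall)
+- The main input and the modal input share the same `text` state and stay in sync in both directions in real time
+- Added bottom-right padding to the main input (`pr-6 pb-6`) so the handle does not cover the text
+- Simplified the handle further: only a double-arrow icon remains, with the outer frame removed and the icon placed close to the input edge
+- Nudged the handle up and to the right for better balance: `right-0.5 bottom-2`, with a fixed click area of `h-5 w-5`
+- Fixed focus loss in the expanded editor: `AppModal` now handles ESC through `onCloseRef`, so a parent re-render no longer tears down the effect and blurs the textarea
+- Removed the purple focus border on the expanded editor input in favor of a neutral highlight (`focus:border-white/25`)
+
+### Verification
+
+- `npm run build` (frontend) ✅
+- `pm2 restart vigent2-frontend` ✅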
+
+---
+
+## ✅ 13) Site icon replaced (using `Temp/video.png`)
+
+### Changes
+
+- Converted the provided `Temp/video.png` and used it to replace the site icon assets
+- Added `frontend/src/app/icon.png` (Next App Router icon asset)
+- Updated `frontend/src/app/favicon.ico` (multi-size: 16/32/48/64)
+
+### Verification
+
+- `npm run build` (frontend) ✅
+- Build output includes the `/icon.png` route ✅
+- `pm2 restart vigent2-frontend` ✅
+
+---
+
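+## ✅ 14) Post-publish workspace cleanup hardening (CleanupContext + `/api/videos/cleanup`)
+
+### 14.1 Feature rollout
+
+- The publish page gains a "cleanup prompt after all platforms publish successfully" flow:
+  - All platforms succeed: `CleanupModal` is triggered
+  - Any platform fails: the existing inline result display is kept
+- `CleanupModal` shows: the list of successful platforms, success screenshots, a video backup download, and one-click cleanup
+- The cleanup state `cleanup_pending` is persisted to localStorage and can be restored after a refresh or navigation
+
+### 14.2 Stability and anti-lockout improvements
+
+- Backend delete helpers now propagate exceptions, so swallowed errors no longer make the frontend think cleanup succeeded
+- The cleanup endpoint uses strict success semantics:
+  - It returns success only when both video and voice-over deletions succeed
+  - Any failed deletion returns an error; the frontend keeps the modal open and allows a retry
+- Frontend cleanup runs "backend first, local second":
+  - Backend fails: local data is not cleared and the modal stays open
+  - Backend succeeds: local input fields are then cleared and the modal closes
+- After a successful backend cleanup the frontend dispatches a `vigent:workspace-cleared` event, and the publish page resets its title/tag inputs in place (no manual refresh needed)
+- After consecutive failures reach the threshold (3), a "暂不清理，继续使用" (skip cleanup for now and keep working) option appears, so an abnormal environment cannot block the user forever
+- The cleanup modal expires after 24h to avoid stale state carrying over across days
+- Cleanup state is reset on user switch or logout so state from a previous account does not leak over
+
+### 14.3 Cleanup scope
+
+- Only input content fields are cleared:
+  - Home page: script / title / secondary title
+  - Publish page: title / tags
+- User preference fields are kept (style, font size, margins, model, BGM, and so on)
+
+### Verification
+
+- `python -m py_compile backend/app/services/storage.py backend/app/modules/videos/service.py backend/app/modules/generated_audios/service.py backend/app/modules/videos/router.py` ✅
+- `npm run build` (frontend) ✅
+- `pm2 restart vigent2-backend && pm2 restart vigent2-frontend` ✅
+- `curl http://127.0.0.1:8006/health` returns `{"status":"ok"}` ✅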
+
+---
+
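+## 📁 Main files changed today
+
+| File | Change |
+|------|------|
+| `backend/app/modules/videos/router.py` | Added/extended `voice-preview` GET+POST, locale-based routing of preview text, temp file cleanup; added `POST /api/videos/cleanup` with strict success semantics |
+| `backend/app/modules/videos/service.py` | Added batch deletion of generated videos; returns `(deleted, failed)` for the cleanup route to evaluate |
+| `backend/app/modules/generated_audios/service.py` | Added batch deletion of pre-generated voice-overs; returns `(deleted, failed)` for the cleanup route to evaluate |
+| `backend/app/services/storage.py` | `delete_file()` now propagates exceptions, so a failed deletion is no longer swallowed and reported as a "false success" |
+| `backend/app/modules/videos/schemas.py` | Added `VoicePreviewRequest` |
+| `frontend/src/features/home/ui/VoiceSelector.tsx` | Added a preview button to the voice dropdown; playback now uses a GET audio stream |
+| `frontend/src/features/home/model/useHomeController.ts` | Recording state reset, `discardRecording` |
+| `frontend/src/features/home/ui/HomePage.tsx` | Passes through the discard-recording action; the `AI生成标题标签` handler is now passed to `TitleSubtitlePanel` |
+| `frontend/src/features/home/ui/RefAudioPanel.tsx` | Rearranged upload/record entry points; recording moved into a modal; use/discard flow |
+| `frontend/src/features/home/ui/ScriptEditor.tsx` | Unified button styling in the script editing area; removed `AI生成标题标签` (responsibility returned to the title panel); added the bottom-right expand handle and large editor modal; handle changed to a minimal edge-hugging double-arrow style and nudged to `right-0.5 bottom-2`; removed the purple focus border on the input |
+| `frontend/src/features/home/ui/TitleSubtitlePanel.tsx` | Title area changed to "row 1: heading + AI, row 2: right-aligned settings + preview"; AI button styled to match the Record online button (soft corners) |
+| `frontend/src/features/home/ui/RefAudioPanel.tsx` | Post-recording preview bar replaced with a custom dark player (instead of the native white controls) |
+| `frontend/src/features/home/ui/RefAudioPanel.tsx` | Modal closes immediately after "Use this recording"; upload and recognition run in the background (smoother interaction) |
+| `frontend/src/features/publish/model/usePublishController.ts` | Publishing switched to bounded concurrency (limit = 2); `triggerCleanup()` fires when all platforms publish successfully, failures keep the inline result; listens for the `workspace-cleared` event to clear publish inputs in place |
+| `frontend/src/shared/contexts/CleanupContext.tsx` | Added the post-publish cleanup modal and persisted state; on failure the modal stays open and local data is kept, skippable after 3 failures, 24h expiry, reset on user switch; cleanup scope narrowed to input content fields; dispatches `workspace-cleared` after a successful cleanup |
+| `frontend/src/app/layout.tsx` | Mounted `CleanupProvider` inside `TaskProvider` so the post-publish cleanup modal can be triggered globally |
+| `backend/app/core/config.py` | Added Xiaohongshu Playwright settings (headless/UA/locale/timezone/chrome/debug) |
+| `backend/app/services/uploader/xiaohongshu_uploader.py` | Refactored along the Douyin/WeChat pattern; added the upload start tolerance window, the no-extension file fallback (hardlink/copy), the suffix consistency check, idle timeout protection, and upload diagnostic logs |
+| `backend/app/services/publish_service.py` | `save_cookie_string` stores non-bilibili platforms as Playwright `storage_state`; `user_id` is passed through to the Xiaohongshu uploader |
+| `backend/app/services/qr_login_service.py` | Douyin navigation timeout tolerance + stronger WeChat QR extraction + Xiaohongshu login auto-switches to QR mode and extracts the QR code |
+| `backend/app/services/uploader/weixin_uploader.py` | `file_input empty` warning policy refined: check for upload signals first, non-final retries downgraded to info |
+| `frontend/src/shared/ui/AppModal.tsx` | Shared modal component + accessibility semantics + focus management + multi-modal scroll-lock counting; added `onCloseRef` to avoid unexpected blur caused by changing callback references |
+| `frontend/src/components/VideoPreviewModal.tsx` | Migrated to `AppModal` |
+| `frontend/src/features/home/ui/ScriptExtractionModal.tsx` | Migrated to `AppModal` |
+| `frontend/src/features/home/ui/RewriteModal.tsx` | Migrated to `AppModal` |
+| `frontend/src/features/home/ui/ClipTrimmer.tsx` | Migrated to `AppModal` |
+| `frontend/src/components/AccountSettingsDropdown.tsx` | Change-password modal migrated to `AppModal` |
+| `frontend/src/app/icon.png` | Added the site icon asset (from `Temp/video.png`) |
+| `frontend/src/app/favicon.ico` | Replaced the site favicon (converted from `video.png` into a multi-size ico) |
+| `frontend/src/features/publish/ui/PublishPage.tsx` | QR login modal migrated to `AppModal` + white padded container for the QR code (avoids the cropped-edge look) |
+| `Docs/FRONTEND_DEV.md` | Added the unified modal spec (AppModal) and the recording interaction spec; noted that the expanded script editor also goes through AppModal; added the CleanupContext cleanup policy spec |
+| `Docs/FRONTEND_README.md` | Added notes on recording entry points and modal interaction; clarified that "persistent title display" applies to both main and secondary titles; added notes on the expanded script editor; added notes on the cleanup modal's failure fallback |
+| `Docs/BACKEND_README.md` | Added `voice-preview` endpoint docs; updated publish API paths (`/login/{platform}` etc.) and linked the publish guide; noted that `title_display_mode` applies to both main and secondary titles; added `/api/videos/cleanup` endpoint docs |
+| `Docs/BACKEND_DEV.md` | Updated uploader coverage and Xiaohongshu settings in the backend spec; added a pointer to the publish guide; added the `title_display_mode` convention for main/secondary titles; added the strict-success convention for cleanup |
+| `Docs/PUBLISH_DEPLOY.md` | Added the multi-platform publish guide (login implementation, automated publish flow, deployment notes, and troubleshooting); added the "cleanup after successful publish" notes |
+| `Docs/DEPLOY_MANUAL.md` | Added Xiaohongshu notes to the deployment parameters and QR login docs; added an entry to the publish guide |
+| `README.md` | Added a `PUBLISH_DEPLOY.md` entry to the docs index; publish result visualization now covers Xiaohongshu; added the post-publish workspace cleanup prompt description |
+| `Docs/TASK_COMPLETE.md` | Added the Day31 task summary, updated the Current tag and update time; added the cleanup hardening entry |
+| `Docs/DOC_RULES.md` | Added the "three publish checks" (route source of truth / dedicated guide / entry write-back) and sensitive-information handling rules, updated the tool rules to `Read/Grep/apply_patch`, and aligned the TASK_COMPLETE checklist |
+| `Docs/SUBTITLE_DEPLOY.md` | Aligned with current thresholds/parameters |
+| `Docs/LATENTSYNC_DEPLOY.md` | Aligned with current thresholds/parameters |
+| `Docs/COSYVOICE3_DEPLOY.md` | TTS deployment notes aligned with the current runtime path |
+| `Docs/QWEN3_TTS_DEPLOY.md` | Marked as a historical archive, pointing to CosyVoice 3.0 |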
+
+---
+
+## 🔍 Verification log
+
+- `python -m py_compile backend/app/modules/videos/router.py backend/app/modules/videos/schemas.py` ✅
+- `python -m py_compile backend/app/services/qr_login_service.py` ✅
+- `python -m py_compile backend/app/services/uploader/weixin_uploader.py` ✅
+- `python -m py_compile backend/app/core/config.py backend/app/services/publish_service.py backend/app/services/uploader/xiaohongshu_uploader.py` ✅
+- `POST /api/publish/login/xiaohongshu` smoke test returns `success=true` + `qr_code` ✅
+- `python -m py_compile backend/app/services/uploader/xiaohongshu_uploader.py` (after the upload-stage fix) ✅
+- `pm2 restart vigent2-backend` (after the upload-stage fix) ✅
+- `curl http://127.0.0.1:8006/health` returns `{"status":"ok"}` ✅
+- Local probe with `backend/venv/bin/python` verified `_prepare_upload_file()`: the temp file is not a symlink, its suffix is `.mp4`, and cleanup succeeds ✅
+- Xiaohongshu live publish test: `POST /api/publish` returns `200` (`Duration: 45.77s`) and the success screenshot endpoint returns `200` ✅
+- Added `Docs/PUBLISH_DEPLOY.md` (login and publish implementation notes for Douyin/WeChat/Bilibili/Xiaohongshu) ✅
+- `npm run build` (frontend) ✅
+- Build passes after the site icon replacement, and the output includes the `/icon.png` route ✅
+- `pm2 restart vigent2-frontend` (after the icon replacement) ✅
+- `python -m py_compile backend/app/services/storage.py backend/app/modules/videos/service.py backend/app/modules/generated_audios/service.py backend/app/modules/videos/router.py` (after the cleanup hardening) ✅
+- `npm run build` (after the CleanupContext changes) ✅
+- `pm2 restart vigent2-backend && pm2 restart vigent2-frontend` (after the cleanup hardening) ✅
+- `curl http://127.0.0.1:8006/health` returns `{"status":"ok"}` (after the cleanup hardening) ✅
+- `POST /api/publish/login/weixin` smoke test returns `success=true` + `qr_code` ✅
+- Targeted `npx eslint` checks pass for the following files:
+  - `VoiceSelector.tsx`
+  - `RefAudioPanel.tsx`
+  - `HomePage.tsx`
+  - `useHomeController.ts`
+  - `AppModal.tsx`
+  - `VideoPreviewModal.tsx`
+  - `ScriptExtractionModal.tsx`
+  - `RewriteModal.tsx`
+  - `AccountSettingsDropdown.tsx`
+- `ClipTrimmer.tsx` still has a pre-existing repo lint finding (`react-hooks/set-state-in-effect`), unrelated to this modal style migration
+- The production voice-preview issue was resolved after the backend restart (same-origin browser requests carry the cookie)
+
+---
+
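+## ☑️ Day31 coverage check (additions made today)
+
+Today's changes were cross-checked a second time; the following items are recorded in this log:
+
+- Accessibility and focus / scroll-lock robustness improvements in `AppModal`
+- Backend extraction fix for the WeChat Channels QR code "looks incomplete" issue
+- Publish page QR code display styling (white padded container, no rounded clipping on the image itself)
+- Xiaohongshu uploader alignment refactor (launch parameters, publish detection, success screenshot)
+- Second investigation and hardening of the Xiaohongshu "upload stage hangs" issue (file name suffix consistency + idle timeout), with a successful live publish test
+- Created the publish guide `Docs/PUBLISH_DEPLOY.md`, documenting login and automated publishing for the four platforms
+- Updated `Docs/BACKEND_README.md` / `Docs/BACKEND_DEV.md` / `Docs/DEPLOY_MANUAL.md` so the publish API and deployment notes are consistent
+- Updated `Docs/FRONTEND_README.md` / `Docs/FRONTEND_DEV.md` / `Docs/PUBLISH_DEPLOY.md` with the post-publish cleanup modal and its link to the cleanup endpoint
+- Updated `README.md` with the publish guide entry and the Xiaohongshu success screenshot capability
+- Updated `Docs/TASK_COMPLETE.md` with the Day31 completion record
+- Updated `Docs/DOC_RULES.md` to sync the documentation rules with the current doc structure and toolchain
+- Home page "AI generate title and tags" button moved to "4. Title and subtitles" and pinned to the far right of the heading row; display mode and preview moved to the right side of the next row
+- Added the expand handle at the bottom right of the script input, opening a large editor for long scripts
+- Site icon replaced with assets derived from `Temp/video.png` (`app/icon.png` + `app/favicon.ico`)
+- Post-publish workspace cleanup flow shipped (CleanupModal + `/api/videos/cleanup`) with failure fallback (on failure the modal stays open and local data is kept)
+- Anti-lockout improvements to the cleanup flow: skippable after 3 failures, 24h expiry, reset on user switch
+- Docs addition: `标题短暂显示/常驻显示` applies to both main and secondary titles (persistent = both shown for the whole video)
+- Cookies for non-bilibili platforms saved in `storage_state` format
+- Xiaohongshu login QR auto-switch (SMS login -> QR login) and extraction fix
+- Corresponding build / restart / smoke test records
+- Today's runtime artifacts (`backend/user_data/**/cookies/*.json`, `watchdog.log`) are session by-products and not code or doc changes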
--- a/Docs/DevLogs/Day32.md
+++ b/Docs/DevLogs/Day32.md
@@ -0,0 +1,159 @@
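+## Same-origin video download fix + first batch of security fixes (Day 32)
+
+### Overview
+
+Today's work focused on four things:
+
+1. Fixed downloads from the home page and the publish-success modal being treated by the browser as online playback (opened in a new tab).
+2. Split the development work done from the download fix onward out of `Day31` into `Day32`, keeping the logs cleanly archived by day.
+3. Implemented the first batch of 6 security fixes with no functional risk, based on the security audit report (`Temp/安全审计报告.md`).
+4. Unified modal close interaction (close policy only): clicking outside closes by default, while the publish-success cleanup modal stays forced open.
+
+---
+
+## ✅ 1) Video download path fix (avoid playback in a new tab)
+
+### Symptom
+
+- In some browsers, "Download video" on the home page and "Download video backup" in the publish-success modal opened the video in a new tab instead of triggering a download.
+- The root cause is that with cross-origin signed URLs the browser may ignore `<a download>`.
+
+### Fix
+
+- Added a same-origin download endpoint on the backend: `GET /api/videos/generated/{video_id}/download`
+  - Uses `FileResponse` to return the local video file
+  - Explicitly returns `Content-Disposition: attachment`
+  - The browser goes straight to the save-file flow
+- The publish-success modal download now passes `videoId` and no longer depends on a signed URL.
+- The home page preview download also uses the same-origin endpoint, so download behavior matches the publish modal.
+- Backward compatibility with old cleanup state: `CleanupContext` parses the old persisted `videoDownloadUrl` field to backfill `videoId`.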
+
+---
+
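+## ✅ 2) Supporting changes and log split
+
+### Frontend changes
+
+- `CleanupContext` keeps the "on cleanup failure, keep the modal open and keep local data" logic; only the download path is switched to the same-origin endpoint.
+- The home page `PreviewPanel` accepts `generatedVideoId`, and the download button prefers `/api/videos/generated/{id}/download`.
+
+### Log archiving
+
+- Moved the content from the download fix onward out of `Day31` and archived it in `Day32`.
+- `Day31` keeps that day's core content (up to the cleanup hardening).
+
+---
+
+## ✅ 3) First batch of security fixes (6 items, no functional risk)
+
+Based on the security audit report, the first batch of 6 directly fixable hardening items was implemented.
+
+### 3.1 Startup guard against the default JWT secret
+
+- **File**: `backend/app/main.py`
+- Added a `check_jwt_secret` startup event (before `init_admin`)
+- When `JWT_SECRET_KEY` is still the default value `"your-secret-key-change-in-production"`:
+  - **Production** (`DEBUG=False`): `raise RuntimeError` blocks the service from starting
+  - **Development** (`DEBUG=True`): logs a `CRITICAL` warning and does not block startup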
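+
+A minimal sketch of the guard described in 3.1, written as a plain function so it can be exercised without a web framework. The signature and logger name are assumptions; the log only states that the real check runs as a startup event in `main.py`.
+
+```python
+import logging
+
+DEFAULT_JWT_SECRET = "your-secret-key-change-in-production"
+logger = logging.getLogger("startup")  # hypothetical logger name
+
+
+def check_jwt_secret(secret, debug):
+    """Refuse to start in production while the JWT secret is still the default."""
+    if secret != DEFAULT_JWT_SECRET:
+        return
+    if not debug:
+        # Production: fail fast so the service never runs with a known secret.
+        raise RuntimeError("JWT_SECRET_KEY is still the default value; refusing to start")
+    # Development: warn loudly but allow startup.
+    logger.critical("JWT_SECRET_KEY is still the default value; change it before deploying")
+```
+
+Raising at startup rather than at first login means a misconfigured deployment fails immediately instead of silently issuing tokens signed with a public default.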
+
+### 3.2 Authentication added to AI / Tools endpoints
+
+- **Files**: `backend/app/modules/ai/router.py`, `backend/app/modules/tools/router.py`
+- All 3 AI route endpoints (`/translate`, `/generate-meta`, `/rewrite`) now use `Depends(get_current_user)`
+- The 1 Tools route endpoint (`/extract-script`) now uses `Depends(get_current_user)`
+- The frontend axios client already sets `withCredentials: true` and redirects to the login page on 401, so no frontend change is needed
+
+### 3.3 Material path traversal fix
+
+- **Files**: `backend/app/modules/materials/router.py`, `backend/app/modules/materials/service.py`
+- `stream`, `delete_material`, and `rename_material` now reject `..` before the `startswith(user_id)` check
+- A `material_id` containing `..` returns 400 directly
+- The `delete_material` route adds `except ValueError` → 400 (previously only `PermissionError` was caught, so `ValueError` fell through to the generic `Exception` handler and returned 500)
+
+### 3.4 video_id whitelist validation
+
+- **File**: `backend/app/modules/videos/router.py`
+- The `download_generated` and `delete_generated` endpoints add a regex check at the top of the function
+- Only `^[A-Za-z0-9_-]+$` is allowed; anything else returns 400 directly
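+
+The two input checks in 3.3 and 3.4 can be sketched as below. The helper names are hypothetical, and the use of `fullmatch` is a choice made for this sketch: the log does not say which `re` call the project uses.
+
+```python
+import re
+
+VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]+")
+
+
+def is_safe_video_id(video_id):
+    # fullmatch() requires the whole string to match. With re.match() and a
+    # trailing "$", Python would also accept an ID followed by one newline.
+    return VIDEO_ID_RE.fullmatch(video_id) is not None
+
+
+def is_safe_material_id(material_id, user_id):
+    # Reject ".." first, then apply the ownership prefix check.
+    if ".." in material_id:
+        return False
+    return material_id.startswith(user_id)
+```
+
+A route would map a `False` result to a 400 response, as described above.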
+
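+### 3.5 Upload/download size limits
+
+- **materials/service.py** (streaming upload): checks `MAX_UPLOAD_SIZE_MB` (default 500MB) after accumulating each chunk and raises `ValueError` when exceeded
+- **ref_audios/service.py** (reference audio): checks a 5MB limit after `await file.read()`
+- **tools/service.py** (file upload for script extraction): replaced `shutil.copyfileobj` with chunked copying + a 500MB limit
+- **tools/service.py** (URL download branch): checks the file size after `_download_video` returns; files over 500MB are deleted and rejected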
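+
+A sketch of the chunked copy with a size cap that replaces `shutil.copyfileobj`. The function name and parameters are assumptions for illustration; it works on any pair of binary file-like objects.
+
+```python
+def copy_with_limit(src, dst, max_bytes, chunk_size=1024 * 1024):
+    """Copy src to dst in chunks; raise ValueError once the total exceeds max_bytes."""
+    total = 0
+    while True:
+        chunk = src.read(chunk_size)
+        if not chunk:
+            return total  # number of bytes copied
+        total += len(chunk)
+        if total > max_bytes:
+            # Stop before writing the chunk that crosses the limit.
+            raise ValueError(f"upload exceeds limit of {max_bytes} bytes")
+        dst.write(chunk)
+```
+
+Checking after each chunk bounds the work done on an oversized upload to roughly `max_bytes + chunk_size`, instead of reading the whole body first. The caller is still responsible for deleting a partially written destination file.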
+
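+### 3.6 Generic error messages
+
+- **ai/router.py**: 3 occurrences of `detail=str(e)` changed to "翻译服务暂时不可用", "生成标题标签失败", and "改写服务暂时不可用" respectively (translation service temporarily unavailable / failed to generate title and tags / rewrite service temporarily unavailable)
+- **tools/router.py**: keeps the specific hint for the "Fresh cookies" branch; the fallback becomes "文案提取失败，请稍后重试" (script extraction failed, please try again later)
+- **generated_audios/service.py**: the task failure `error` field changes from `traceback.format_exc()` to `str(e)`; the traceback is written only to the server log
+
+---
+
+## ✅ 4) Unified modal close policy (UX)
+
+### Goals
+
+- Keep interaction expectations consistent: business modals can be closed with `X` or by clicking the overlay by default.
+- Keep protection for critical flows: the publish-success cleanup modal still disallows overlay close, so an accidental click cannot interrupt the flow.
+- Note: unifying button position and visual style belongs to Day33; this log records only the close policy.
+
+### Changes
+
+- The script extraction modal (`ScriptExtractionModal`) supports closing by overlay click.
+- The AI rewrite modal (`RewriteModal`) supports closing by overlay click.
+- The QR login modal on the publish page supports closing by overlay click.
+- The change-password modal supports closing by overlay click.
+- The recording modal uses a dynamic policy: `closeOnOverlay={!isRecording}`
+  - Not recording: overlay close allowed
+  - Recording: overlay close disabled (prevents accidental clicks); closing with `X` still works and stops the recording first
+- The publish-success cleanup modal keeps `closeOnOverlay=false` and provides no `onClose` (no close button in the top-right corner).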
+
+---
+
+## 📁 Main files changed today
+
+| File | Change |
+|------|------|
+| `backend/app/modules/videos/router.py` | Added `GET /api/videos/generated/{video_id}/download`, returning an `attachment` download response; added `video_id` whitelist regex validation (`^[A-Za-z0-9_-]+$`) |
+| `frontend/src/features/publish/model/usePublishController.ts` | After a successful publish, `triggerCleanup()` passes `video.id` (replacing the signed URL) |
+| `frontend/src/shared/contexts/CleanupContext.tsx` | Download field changed to `videoId`; backfill from the old `videoDownloadUrl`; download button uses the same-origin path |
+| `frontend/src/features/home/ui/PreviewPanel.tsx` | Home page download switched to the same-origin download endpoint |
+| `frontend/src/features/home/ui/HomePage.tsx` | Passes `generatedVideoId` through to `PreviewPanel` |
+| `frontend/src/features/home/ui/ScriptExtractionModal.tsx` | Modal supports closing by overlay click (`closeOnOverlay`) |
+| `frontend/src/features/home/ui/RewriteModal.tsx` | Modal supports closing by overlay click (`closeOnOverlay`) |
+| `frontend/src/features/publish/ui/PublishPage.tsx` | QR login modal supports closing by overlay click |
+| `frontend/src/components/AccountSettingsDropdown.tsx` | Change-password modal supports closing by overlay click |
+| `frontend/src/features/home/ui/RefAudioPanel.tsx` | Recording modal uses `closeOnOverlay={!isRecording}` (overlay close disabled while recording) |
+| `Docs/DevLogs/Day31.md` | Removed the download fix section and its verification/coverage items (moved to Day32) |
+| `Docs/TASK_COMPLETE.md` | Added the Day32 block for the day and made it Current (Day33 later takes over as Current) |
+| `Docs/BACKEND_README.md` | Added docs for the `/api/videos/generated/{video_id}/download` endpoint |
+| `Docs/BACKEND_DEV.md` | Added the `attachment` convention for the download endpoint |
+| `Docs/FRONTEND_README.md` | Added notes on the unified same-origin download for the home page and publish modal |
+| `Docs/FRONTEND_DEV.md` | Added the CleanupContext download policy spec |
+| `Docs/PUBLISH_DEPLOY.md` | Added notes on same-origin download after a successful publish |
+| `README.md` | Added the "one-click direct download (same-origin attachment)" capability description |
+| `backend/app/main.py` | `check_jwt_secret` startup event: blocks startup in production (`DEBUG=False`), logs a `CRITICAL` warning in development |
+| `backend/app/modules/ai/router.py` | 3 endpoints add `Depends(get_current_user)` authentication; error responses use generic messages |
+| `backend/app/modules/tools/router.py` | `extract-script` endpoint adds `Depends(get_current_user)` authentication; error responses use generic messages |
+| `backend/app/modules/materials/router.py` | `stream` endpoint rejects `..` path traversal; `delete` endpoint adds `except ValueError` → 400 |
+| `backend/app/modules/materials/service.py` | `delete_material` / `rename_material` reject `..` path traversal; streaming upload adds the `MAX_UPLOAD_SIZE_MB` size limit |
+| `backend/app/modules/ref_audios/service.py` | Reference audio upload adds a 5MB size limit |
+| `backend/app/modules/tools/service.py` | Script extraction file upload replaced with size-limited chunked copy (500MB); URL download branch adds a post-download size check (500MB) |
+| `backend/app/modules/generated_audios/service.py` | Task failure error field changes from `traceback.format_exc()` to `str(e)` to avoid leaking internal paths |
+
+---
+
+## 🔍 Verification log
+
+- `python -m py_compile backend/app/modules/videos/router.py` ✅
+- `npm run build` (frontend) ✅
+- `npm run build` (frontend, re-verified after the modal close policy change) ✅
+- `pm2 restart vigent2-frontend` ✅
+- `pm2 restart vigent2-backend` ✅
+- `curl http://127.0.0.1:8006/health` returns `{"status":"ok"}` ✅
+- Syntax check for the first batch of security fixes: `python -m py_compile backend/app/main.py backend/app/modules/materials/router.py backend/app/modules/tools/service.py backend/app/modules/ai/router.py backend/app/modules/tools/router.py backend/app/modules/materials/service.py backend/app/modules/ref_audios/service.py backend/app/modules/videos/router.py backend/app/modules/generated_audios/service.py` ✅
+- Calling `/api/ai/translate` without logging in → returns 401 ✅
+- Calling `/api/tools/extract-script` without logging in → returns 401 ✅
+- Syntax check for the three final follow-up fixes: `python -m py_compile backend/app/main.py backend/app/modules/materials/router.py backend/app/modules/tools/service.py` ✅
--- a/Docs/DevLogs/Day33.md
+++ b/Docs/DevLogs/Day33.md
@@ -0,0 +1,290 @@
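+## Robustness fix for script extraction from Douyin short links (Day 33)
+
+### Overview
+
+Today focused on fixing intermittent failures in the "Script extraction assistant" when given Douyin share short links or share-text snippets, and on supporting the various URL shapes that Douyin short links resolve to.
+
+---
+
+## ✅ 1) Problem review
+
+### Symptom
+
+- Pasting Douyin share text (containing a `v.douyin.com` short link) intermittently failed script extraction.
+- Pasting the address-bar link directly (for example `jingxuan?modal_id=...`) succeeded.
+
+### Root cause
+
+- `_download_douyin_manual` in `backend/app/modules/tools/service.py` previously extracted the video ID only from `/video/{id}`.
+- A short link does not always redirect to `/video/{id}`; other common shapes include:
+  - `/share/video/{id}`
+  - `/user/...?...&vid={id}`
+  - `/follow/search?...&modal_id={id}`
+- Landing on one of these shapes produced `Could not extract video_id`, and the fallback failed.
+
+---
+
+## ✅ 2) Fix
+
+### 2.1 Unified parsing function
+
+- Added `_extract_douyin_video_id(candidate_url)`, which parses the following ID shapes in one place:
+  - Path: `/video/{id}`, `/share/video/{id}`
+  - Query parameters: `modal_id`, `vid`, `video_id`, `aweme_id`, `item_id`
+  - Fallback regex match against the fully decoded URL string
+
+### 2.2 Stronger fallback extraction path
+
+- `_download_douyin_manual` now:
+  1. Extracts `video_id` from the post-redirect `final_url` first
+  2. If that fails, extracts `video_id` from the original input `url`
+- The rest of the download path is unchanged: it visits `m.douyin.com/share/video/{video_id}`, extracts `play_addr`, and downloads.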
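+
+The parsing order in 2.1 and 2.2 can be sketched as follows. This is not the project's `_extract_douyin_video_id`; the regexes, the helper names, and the assumption that IDs are purely numeric are all illustrative.
+
+```python
+import re
+from urllib.parse import parse_qs, unquote, urlparse
+
+_PATH_RE = re.compile(r"/(?:share/)?video/(\d+)")
+_QUERY_KEYS = ("modal_id", "vid", "video_id", "aweme_id", "item_id")
+_FALLBACK_RE = re.compile(r"\b(?:modal_id|vid|video_id|aweme_id|item_id)=(\d+)")
+
+
+def extract_douyin_video_id(candidate_url):
+    """Try path, then query parameters, then a regex over the decoded URL."""
+    parsed = urlparse(candidate_url)
+    match = _PATH_RE.search(parsed.path)
+    if match:
+        return match.group(1)
+    query = parse_qs(parsed.query)
+    for key in _QUERY_KEYS:
+        for value in query.get(key, []):
+            if value.isdigit():  # assumption: video IDs are numeric
+                return value
+    # Last resort: the ID may sit inside a percent-encoded nested URL.
+    match = _FALLBACK_RE.search(unquote(candidate_url))
+    return match.group(1) if match else None
+
+
+def resolve_video_id(final_url, original_url):
+    # Prefer the post-redirect URL, then fall back to what the user pasted.
+    return extract_douyin_video_id(final_url) or extract_douyin_video_id(original_url)
+```
+
+Keeping all URL shapes in one function means a newly observed landing page only needs one more path pattern or query key, rather than a change in every caller.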
+
+---
+
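+## 📁 Files changed today
+
+| File | Change |
+|------|------|
+| `backend/app/modules/tools/service.py` | Added `_extract_douyin_video_id`; strengthened `video_id` extraction in the Douyin fallback (supports `share/video`, `modal_id`, `vid`, and so on) |
+| `Docs/DevLogs/Day33.md` | Added the Day33 dev log recording the problem, root cause, fix, and verification |
+
+---
+
+## 🔍 Verification log
+
+- `python -m py_compile backend/app/modules/tools/service.py` ✅
+- URL parsing smoke test (function level):
+  - `jingxuan?modal_id=...` extracts ✅
+  - `user?...&vid=...` extracts ✅
+  - `follow/search?...&modal_id=...` extracts ✅
+- Download path smoke test (service level):
+  - The user-provided short-link share text downloads a temporary video successfully ✅
+  - The previously failing sample `user?...&vid=...` completes the fallback download ✅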
+
+---
+
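+## ✅ 3) Script deep learning: Playwright fallback for Douyin scraping
+
+### 3.1 Problem review
+
+- In the creator analysis path of "Script deep learning", the Douyin user page sometimes returns a JS shell page (containing `byted_acrawler`), and static HTML extraction cannot find `desc`.
+- Observed behavior: the short link resolves `sec_uid`, but title scraping fails with "页面结构可能已变更" (the page structure may have changed).
+
+### 3.2 Fix
+
+- Added a Playwright fallback scraper in `backend/app/services/creator_scraper.py`:
+  1. Keep the original HTTP + `ttwid` scraping as the first choice (lightweight and fast).
+  2. When HTTP extraction finds no titles, switch to Playwright automatically.
+  3. Listen to the page's network responses and capture specifically:
+     - `/aweme/v1/web/aweme/post/`
+     - `/aweme/v1/web/user/profile/other/`
+  4. Parse `desc` in the response JSON as the source of video titles, and extract the creator's nickname.
+- A more accurate message is returned only on real failure:
+  - `抖音触发风控验证，暂时无法抓取标题，请稍后重试`
+
+### 3.3 Result
+
+- The given short link `https://v.douyin.com/hmFXdx5PvzQ/` is recognized reliably and title scraping completes.
+- The scrape yields a valid creator nickname and about 50 titles (depending on the data the platform returns).
+
+### 3.4 Files added/updated
+
+| File | Change |
+|------|------|
+| `backend/app/services/creator_scraper.py` | Added the Douyin Playwright fallback scraper, network response capture, improved title/nickname parsing, and better error messages |
+| `Docs/DevLogs/Day33.md` | Added the record of the Douyin scraping improvements for script deep learning |
+
+### 3.5 Verification log
+
+- `python -m py_compile backend/app/services/creator_scraper.py` ✅
+- Smoke tests:
+  - Short link redirect + `sec_uid` extraction ✅
+  - Automatic switch to Playwright when the HTTP path fails ✅
+  - `aweme/post` data captured from Playwright network responses and titles extracted ✅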
+
+---
+
+## ✅ 4) First version of the script deep learning feature
+
+### 4.1 Backend
+
+- Added the creator scraping service: `backend/app/services/creator_scraper.py`
+  - `scrape_creator_titles(url)`: single entry point for platform detection + title scraping
+  - `validate_url(url)`: enforces `https`, a domain whitelist, public-address checks on all DNS records, and per-hop redirect validation
+  - `cache_titles(titles, user_id)` / `get_cached_titles(analysis_id, user_id)`: 20-minute TTL + bound to the user
+- Extended the GLM service: `backend/app/services/glm_service.py`
+  - `analyze_topics(titles)`: derives hot topics from the titles (≤10)
+  - `generate_script_from_topic(topic, word_count, titles)`: generates a script for the topic and style
+- New tool route endpoints: `backend/app/modules/tools/router.py`
+  - `POST /api/tools/analyze-creator`
+  - `POST /api/tools/generate-topic-script`
+  - Uses Pydantic JSON request models + login check + the shared `success_response`
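+
+The "20-minute TTL + bound to the user" cache could look roughly like this. It is an in-memory sketch under assumptions: the storage backend, the ID format, and the injectable `now` clock (added here only to make expiry testable) are not taken from the project.
+
+```python
+import time
+import uuid
+
+TTL_SECONDS = 20 * 60
+_cache = {}  # analysis_id -> (owner user_id, expiry time, titles)
+
+
+def cache_titles(titles, user_id, now=time.monotonic):
+    analysis_id = uuid.uuid4().hex
+    _cache[analysis_id] = (user_id, now() + TTL_SECONDS, list(titles))
+    return analysis_id
+
+
+def get_cached_titles(analysis_id, user_id, now=time.monotonic):
+    entry = _cache.get(analysis_id)
+    if entry is None:
+        return None
+    owner, expires_at, titles = entry
+    if now() >= expires_at:
+        _cache.pop(analysis_id, None)  # drop expired entries lazily
+        return None
+    if owner != user_id:
+        return None  # another user's analysis is treated as a miss
+    return titles
+```
+
+Binding the entry to `user_id` means a leaked or guessed `analysis_id` cannot be used by a different account to read someone else's scraped titles.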
+
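+### 4.2 Frontend
+
+- Added the state logic hook: `frontend/src/features/home/ui/script-learning/useScriptLearning.ts`
+  - Flow states: `input -> analyzing -> topics -> generating -> result`
+  - Manages the analysis request, generation request, error state, copy, and regenerate
+- Added the modal component: `frontend/src/features/home/ui/ScriptLearningModal.tsx`
+  - Step-based UI: link input, single topic selection, word count input, result display, fill into script / copy
+- Wired into the home page:
+  - `frontend/src/features/home/ui/ScriptEditor.tsx`: added the "Script deep learning" button
+  - `frontend/src/features/home/model/useHomeController.ts`: added the `learningModalOpen` state
+  - `frontend/src/features/home/ui/HomePage.tsx`: mounts the modal and supports filling the result back into the main editor
+
+### 4.3 Placement and rules
+
+- The button is placed as agreed:
+  - `历史文案` → `文案提取助手` → `文案深度学习` → `AI多语言` (Script history → Script extraction assistant → Script deep learning → AI multilingual)
+- The modal follows the current unified policy: overlay click closes it (non-critical flow modal).
+
+### 4.4 Verification log
+
+- Backend syntax check:
+  - `python -m py_compile backend/app/services/creator_scraper.py backend/app/services/glm_service.py backend/app/modules/tools/router.py` ✅
+- Frontend build:
+  - `cd frontend && npm run build` ✅
+- Douyin short link sample integration test:
+  - `https://v.douyin.com/hmFXdx5PvzQ/` resolves and titles are scraped (Playwright takes over automatically when the fallback is triggered) ✅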
+
+---
+
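+## ✅ 5) Douyin cookie dependency clarified and Bilibili rate-limit handling improved
+
+### 5.1 Douyin cookie dependency
+
+- Douyin scraping for script deep learning **does not depend on the login cookie from the publish management page**.
+- The current path uses:
+  - Short link resolution + `sec_uid` extraction
+  - The public access path (`ttwid` + page/API scraping)
+  - Playwright fallback when needed
+- So users can use the feature without logging in to Douyin (though platform risk control may still apply).
+
+### 5.2 Bilibili "请求过于频繁" (too many requests) handling
+
+- Made Bilibili scraping in `backend/app/services/creator_scraper.py` more robust:
+  - Automatic retry for rate-limit cases (exponential backoff + random jitter)
+  - Rate-limit detection (HTTP 412/429, error codes / error text)
+  - Automatic switch to the Playwright fallback after the HTTP path fails
+  - Final error messages unified into a more understandable hint
+  - `mid` extraction handles both the root path and sub-paths (such as `/upload/video`)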
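+
+"Exponential backoff + random jitter" can be sketched as below. The exception type, the attempt count, the delays, and the 50% jitter factor are assumptions for illustration; the log does not state the project's actual values.
+
+```python
+import random
+import time
+
+
+class RateLimited(Exception):
+    """Raised by the fetch callable when the platform signals rate limiting."""
+
+
+def retry_with_backoff(fetch, max_attempts=4, base_delay=1.0, max_delay=30.0,
+                       sleep=time.sleep, rng=random.random):
+    for attempt in range(max_attempts):
+        try:
+            return fetch()
+        except RateLimited:
+            if attempt == max_attempts - 1:
+                raise  # out of attempts: let the caller fall back or report
+            delay = min(max_delay, base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
+            # Jitter spreads retries out so parallel clients do not retry in lockstep.
+            sleep(delay + rng() * delay * 0.5)
+```
+
+`sleep` and `rng` are injectable only so the schedule can be checked without waiting; a caller would normally leave the defaults.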
+
+### 5.3 Verification log
+
+- Bilibili sample integration test: `https://space.bilibili.com/8047632` yields 50 titles ✅
+- Douyin short link retest: `https://v.douyin.com/hmFXdx5PvzQ/` still yields 50 titles ✅
+
+---
+
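+## ✅ 6) Second round of reliability improvements for Douyin + Bilibili scraping
+
+### 6.1 Douyin
+
+- `backend/app/services/creator_scraper.py`
+  - `scrape_creator_titles(..., user_id)` passes the user ID through, so the user's logged-in platform cookies can be read as extra context.
+  - Douyin scraping adds optional user cookie injection (HTTP requests + Playwright context).
+  - Playwright fallback rounds raised from 4 to 8, aiming to fill up to `MAX_TITLES=50` where possible.
+  - The network-response path (`aweme/post` + `profile/other`) remains the main path, preferring `desc` for titles.
+
+### 6.2 Bilibili
+
+- Added the WBI signing path (main path):
+  - Fetches the `wbi_img` key (handles the case where `nav` returns `-101` but still carries `wbi_img`)
+  - Computes `w_rid/wts` and then calls `x/space/wbi/arc/search`
+  - Multi-page fetch (accumulating pages) + title deduplication, aiming to fill up to 50 titles
+- Added Bilibili session warm-up:
+  - `x/frontend/finger/spi` fetches `buvid3/buvid4`, which are then injected
+  - Reads the user's logged-in Bilibili cookie (if present) to improve the hit rate
+- Playwright fallback improvements:
+  - Listens for `x/space/*/arc/search` responses and parses valid payloads
+  - Replays captured arc URLs a second time through `context.request`
+
+### 6.3 Route wiring
+
+- `backend/app/modules/tools/router.py`
+  - `/api/tools/analyze-creator` passes `current_user.id` when calling the scraper, for platform cookie enhancement.
+
+### 6.4 Results
+
+- Douyin: short link stability improved further; on risk-control pages the Playwright fallback is preferred.
+- Bilibili: signing and fallback paths are both in place, but during strict platform risk-control windows it may still return "请求过于频繁/风控校验失败" (too many requests / risk-control check failed), which is a platform-side limit.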
+
+---
+
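+## ✅ 7) Final scraping strategy: Douyin/Bilibili switched to direct Playwright
+
+Following a product decision, creator title scraping for script deep learning now uses **Playwright directly as the main path**, replacing "HTTP main path + Playwright fallback".
+
+### 7.1 Changes
+
+- `backend/app/services/creator_scraper.py`
+  - `_scrape_douyin()` now calls `_scrape_douyin_with_playwright()` directly.
+  - `_scrape_bilibili()` now calls `_scrape_bilibili_with_playwright()` directly.
+  - Both platforms keep 2 Playwright scrape retries.
+  - Per-user isolated cookies are read first; if missing, the legacy global cookie is tried.
+- `backend/app/modules/tools/router.py`
+  - `analyze-creator` still passes `current_user.id` to match the user's cookie context.
+
+### 7.2 Impact
+
+- The impact is limited to the "Script deep learning" scraping path.
+- **Not affected**: automated video publishing and the existing script extraction assistant (extract-script) flow.
+
+### 7.3 Verification
+
+- Douyin short link sample: `https://v.douyin.com/hmFXdx5PvzQ/` scraped successfully, 50 titles.
+- Bilibili samples:
+  - `https://space.bilibili.com/256237759?spm_id_from=...` scraped successfully, 40 titles.
+  - `https://space.bilibili.com/1140672573` scraped successfully, 40 titles.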
+
+---
+
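+## ✅ 8) Unified GLM call path and better timeout experience
+
+### 8.1 Symptom
+
+- "Generate script" in script deep learning intermittently failed on the frontend with `timeout of 30000ms exceeded`.
+
+### 8.2 Cause
+
+- Mainly, the frontend request timeout (30s) was too short and was easily exceeded when the model was queued or generating long text.
+- Although the backend already went through `glm_service`, each method still repeated its own SDK call code, which was costly to maintain.
+
+### 8.3 Changes
+
+- Frontend: `generate-topic-script` timeout raised from 30s to 90s, with a clearer timeout message.
+- Backend: `backend/app/services/glm_service.py`
+  - Added `_call_glm(...)` as the single call entry point (unified model / thinking / to_thread / timeout)
+  - `generate_title_tags / rewrite_script / analyze_topics / generate_script_from_topic / translate_text`
+    all reuse this entry point
+  - `settings.GLM_MODEL` remains the single point of configuration, avoiding scattered call sites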
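+
+A sketch of what a single call entry point with `to_thread` and a timeout might look like. `FakeClient`, `GLM_MODEL`, and the `chat` method are placeholders invented for this example; they do not describe the real GLM SDK or the project's `_call_glm` signature. `asyncio.to_thread` needs Python 3.9 or later.
+
+```python
+import asyncio
+
+GLM_MODEL = "glm-placeholder"  # stands in for settings.GLM_MODEL
+
+
+class FakeClient:
+    """Placeholder for a blocking SDK client (not the real GLM SDK)."""
+
+    def chat(self, model, prompt):
+        return f"[{model}] {prompt}"
+
+
+async def call_glm(client, prompt, timeout=60.0):
+    # The SDK call is blocking, so run it in a worker thread and bound the wait.
+    return await asyncio.wait_for(
+        asyncio.to_thread(client.chat, GLM_MODEL, prompt), timeout=timeout
+    )
+
+
+async def rewrite_script(client, text):
+    # Feature methods only build the prompt and delegate the call itself.
+    return await call_glm(client, f"rewrite: {text}")
+```
+
+Note that `wait_for` only stops waiting on timeout; the worker thread keeps running until the blocking call returns, so the timeout bounds the request latency rather than the SDK call.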
+
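+### 8.4 Results
+
+- GLM calls follow one standard, so future parameter changes happen in one place.
+- Frontend timeout errors are noticeably less frequent; when a timeout does occur, an understandable message is shown.
+
+---
+
+## ✅ 9) Unified action buttons across the three script modals
+
+### 9.1 Goal
+
+- Unify the position, style, and primary/secondary hierarchy of the result-page action buttons in "Script extraction assistant", "AI rewrite", and "Script deep learning".
+
+### 9.2 Changes
+
+- `frontend/src/features/home/ui/ScriptExtractionModal.tsx`
+  - Result-page buttons changed from "scattered to the right of the title + a separate bottom button" to a unified bottom Action Grid.
+  - Buttons are now: `填入文案` (Fill into script), `复制` (Copy), `提取下一个` (Extract next), `关闭` (Close).
+- `frontend/src/features/home/ui/RewriteModal.tsx`
+  - Result-page buttons changed to the unified bottom Action Grid.
+  - Added a copy button (with clipboard fallback).
+  - Buttons are now: `填入文案`, `复制`, `重新生成` (Regenerate), `保留原文` (Keep original).
+- `frontend/src/features/home/ui/ScriptLearningModal.tsx`
+  - Keeps the same Action Grid style: `填入文案`, `复制`, `重新生成`, `换个话题` (Change topic).
+
+### 9.3 Verification
+
+- `cd frontend && npm run build` ✅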
--- a/Docs/DevLogs/Day34.md
+++ b/Docs/DevLogs/Day34.md
@@ -0,0 +1,244 @@
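+## Multi-camera timeline system refactor (Day 34)
+
+### Overview
+
+Refactored the timeline system from an "equal sequential segments" model to a "primary material + inserted shots" multi-camera model. The primary material loops continuously to fill the whole timeline, and users can overlay inserted shots at any position to switch between camera angles. Single-material mode behaves exactly as before. Also fixed accidental closing of the "Script deep learning" modal.
+
+---
+
+## ✅ 1) Core architecture changes
+
+### 1.1 Old model vs new model
+
+| | Old model | New model |
+|---|---|---|
+| Timeline structure | Split into N equal segments, one material per segment | Primary material plays continuously + floating insert blocks |
+| Primary material | No such concept | `selectedMaterials[0]`, looped to fill the entire audio duration |
+| Other materials | Duration split evenly | Insert candidates that can be added anywhere on the timeline |
+| Segment boundaries | Fixed equal split | User drags to reposition, clicks to open a modal and edit duration |
+| Max materials | 4 (equal split) | 4 (1 primary + up to 3 insert candidates); each candidate can be inserted multiple times |
+
+### 1.2 Core algorithm of `buildAssignments()`
+
+In multi-material mode it calls `toCustomAssignments()` to produce the `custom_assignments` array:
+
+1. Sort insert blocks by `start`
+2. Gaps between insert blocks are filled by the primary material
+3. The primary material tracks its cumulative playback position with `primaryAccum`, giving a seamless loop
+4. Each gap is **split at boundaries** of the primary material's effective clip length, so no sub-segment crosses a loop boundary
+5. The backend `prepare_segment` then only needs a simple trim, avoiding the "trim then loop" path that repeats frames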
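+
+The five steps above can be sketched in Python as follows. This is a re-expression for illustration, not the frontend's TypeScript `toCustomAssignments()`. It assumes that inserts do not overlap and that the primary position only advances while the primary material is on screen, neither of which the log states explicitly.
+
+```python
+def build_assignments(duration, inserts, primary_path, src_start, src_end):
+    """Fill the gaps around insert blocks with the looping primary clip,
+    splitting every gap so that no piece crosses a loop boundary."""
+    clip_len = src_end - src_start
+    if clip_len <= 0:
+        raise ValueError("primary source range must be non-empty")
+    out, cursor, accum = [], 0.0, 0.0  # accum: primary time played so far
+
+    def fill_gap(start, end):
+        nonlocal accum
+        t = start
+        while end - t > 1e-9:
+            offset = accum % clip_len
+            if clip_len - offset < 1e-9:
+                offset = 0.0  # guard against float drift at a loop boundary
+            step = min(end - t, clip_len - offset)
+            out.append({
+                "material_path": primary_path,
+                "start": t, "end": t + step,
+                "source_start": src_start + offset,
+                "source_end": src_start + offset + step,
+            })
+            t += step
+            accum += step
+
+    for ins in sorted(inserts, key=lambda seg: seg["start"]):
+        fill_gap(cursor, ins["start"])
+        out.append(dict(ins))
+        cursor = ins["end"]
+    fill_gap(cursor, duration)  # tail after the last insert
+    return out
+```
+
+With a 4-second primary clip, a 10-second timeline, and one insert at 3 to 5 seconds, the second gap is emitted as two pieces (source 3 to 4, then 0 to 4), so each piece maps to one contiguous source range that the backend can simply trim.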
+
+---
+
+## ✅ 2) Frontend changes
+
+### 2.1 New files
+
+**`frontend/src/shared/types/timeline.ts`**
+
+```typescript
+export interface InsertSegment {
+  id: string;
+  materialId: string;
+  materialName: string;
+  start: number;
+  end: number;
+  sourceStart: number;
+  sourceEnd: number;
+  color: string;
+}
+```
+
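+Shared cross-module type used by `useTimelineEditor`, `TimelineEditor`, and `useHomeController`.
+
+### 2.2 `useTimelineEditor.ts`: full rewrite
+
+The core hook was rewritten from the equal-split model to the primary + insert model:
+
+- **New API**: `addInsert` (returns `AddInsertResult: "ok" | "limit" | "no_space"`), `removeInsert`, `moveInsert`, `resizeInsert`, `setInsertSourceRange`, `setPrimarySourceRange`, `toCustomAssignments`
+- **`MultiCamCache`** interface: separate localStorage persistence (`vigent_${storageKey}_multicam`), storing inserts + primarySourceStart/End
+- Automatic cleanup: when the selected material list changes, insert blocks referencing removed materials are dropped
+- The primary source range resets automatically when switching between single and multi mode
+
+### 2.3 `TimelineEditor.tsx`: full rewrite
+
+The visual component matches the new model:
+
+- Primary background bar: purple base + loop stripe pattern (shown when loopCount > 1)
+- Floating insert blocks: colored semi-transparent rectangles that can be dragged to move (from the center); clicking opens ClipTrimmer to edit the trim range and duration
+- Insert candidate bar: `selectedMaterials[1:]` shown as `+` buttons; clicking adds one to the timeline
+- Mobile adaptation: 40px minimum height, 12px drag edges, always-visible delete button
+- Removed the unused `TimelineSegment` import
+
+### 2.4 `useHomeController.ts`: adapted to the new API
+
+- Replaced the old timeline destructuring with the new API (`inserts`, `addInsert`, `removeInsert`, and so on)
+- Rewrote the multi-material branch of `handleGenerate()`: it calls `toCustomAssignments()` to build assignments, and the payload separates `material_path` (primary) from `material_paths` (all deduplicated paths)
+- The single-material branch also calls `toCustomAssignments()` to handle the trim range
+- Renaming a material also updates `materialName` in inserts
+- Added the `handleSetPrimary` callback: promotes the given material to `selectedMaterials[0]`
+- Added the computed `insertCandidates`: the Material objects corresponding to `selectedMaterials[1:]`
+
+### 2.5 `MaterialSelector.tsx`: enhancements
+
+- Added the `Crown` icon and the `onSetPrimary` callback prop
+- Role labels in multi-material mode: `selectedMaterials[0]` shows a purple "主素材" (primary) badge, and the others show a grey "可插入" (insertable) badge
+- Non-primary rows show a Crown button that sets the material as primary
+
+### 2.6 `HomePage.tsx`: adaptation
+
+- Rewrote `clipTrimmerSegment`: routes both the `"primary"` ID (primary trim) and insert block IDs
+- `TimelineEditor` receives all the new props
+- `ClipTrimmer`'s `onConfirm` routes to `setPrimarySourceRange` or `setInsertSourceRange` based on the segment ID
+- `MaterialSelector` receives `onSetPrimary`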
+
+---
+
+## ✅ 3) 后端改动
+
+### 3.1 `workflow.py` — 多镜头支持修复
+
+四项关键修复：
+
+**(a) material_paths 来源**
+
+```python
+# 旧：从 custom_assignments 推断（不适用于多镜头）
+# 新：优先信任前端传入的 req.material_paths
+if req.material_paths and len(req.material_paths) >= 1:
+    material_paths = req.material_paths
+else:
+    material_paths = [req.material_path]
+```
+
+**(b) custom_assignments 校验**
+
+```python
+# 旧：len(custom_assignments) == len(material_paths)
+# 新：>= 1 + 硬上限 50 + 路径子集校验
+if len(req.custom_assignments) > 50:
+    raise ValueError(...)
+unknown = [a.material_path for a in req.custom_assignments if a.material_path not in known_paths]
+if unknown:
+    raise ValueError(...)
+```
+
+**(c) 下载去重 + 并发控制**
+
+```python
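+# Old: each assignment downloaded independently (the same material was downloaded repeatedly)
+# New: download once per unique path and keep a path_to_local mapping
+_segment_sem = asyncio.Semaphore(4)  # created inside each call, not at module level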
+unique_paths = list(dict.fromkeys(a["material_path"] for a in assignments))
+path_to_local: dict = {}
+```
+
+The semaphore is created inside each `generate_video()` call, so 2 concurrent tasks × 4 permits give a peak of 8 ffmpeg processes.
+
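A minimal sketch of the per-call semaphore pattern described above. `process_segments` and the `asyncio.sleep` stand-in for an ffmpeg subprocess are hypothetical; only the limit of 4 and the create-per-call placement come from the log.

```python
import asyncio


async def process_segments(jobs, limit=4):
    """Run all jobs with at most `limit` in flight; return (results, peak concurrency)."""
    sem = asyncio.Semaphore(limit)  # created per call, so each call gets its own budget
    running = 0
    peak = 0

    async def run_one(job):
        nonlocal running, peak
        async with sem:
            running += 1
            peak = max(peak, running)
            try:
                await asyncio.sleep(0.01)  # stand-in for the real ffmpeg work
                return job * 2
            finally:
                running -= 1

    # gather() preserves input order in its result list.
    results = await asyncio.gather(*(run_one(j) for j in jobs))
    return results, peak
```

Because the semaphore is local to the call, two overlapping calls each get their own four permits, which is where the 2 × 4 = 8 peak comes from.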
+**(d) 首尾段 capping 保护**
+
+```python
+# 仅在非 custom_assignments 模式下执行首尾对齐
+if not req.custom_assignments and assignments and audio_duration > 0:
+    assignments[0]["start"] = 0.0
+    assignments[-1]["end"] = audio_duration
+```
+
+---
+
+## ✅ 4) 文案深度学习弹窗防误触关闭
+
+### 4.1 问题
+
+- 「文案深度学习」弹窗默认支持遮罩与 `ESC` 关闭，用户在查看生成结果时容易误触关闭，重新打开后已生成内容丢失。
+
+### 4.2 修复
+
+- `frontend/src/shared/ui/AppModal.tsx`
+  - 新增 `closeOnEsc?: boolean` 配置，默认值 `true`，保持旧弹窗行为不变。
+- `frontend/src/features/home/ui/ScriptLearningModal.tsx`
+  - 设置 `closeOnOverlay={false}` 与 `closeOnEsc={false}`，禁止遮罩/ESC 关闭。
+  - 输入页底部按钮由“取消”改为“清空”，仅清理链接输入，不关闭弹窗。
+  - 关闭路径收敛为：右上角 `X` 或结果页“填入文案”。
+
+---
+
+## ✅ 5) Code Review 修复
+
+### 5.1 UX：统一时长编辑入口
+
+- **问题**：时间轴插入块同时支持右边缘拖拽调时长和点击弹窗编辑，拖拽操作每次都误触弹窗
+- **修复**：
+  - 移除 `TimelineEditor` 右侧 resize handle
+  - 引入 `dragMovedRef` + 5px 像素阈值区分拖拽与点击
+  - `ClipTrimmer` onConfirm 新增 `resizeInsert()` 同步，确认截取后自动更新时间轴块时长
+  - 帮助文字更新："点击插入块设置截取/时长"
+
+### 5.2 Lint 修复
+
+- `useTimelineEditor.ts`：3 处 `react-hooks/set-state-in-effect`，用 `eslint-disable-next-line` 标注（初始化和清理场景）
+- `useTimelineEditor.ts`：render-time ref 访问改为 `useState` 模式（`prevPrimaryId`）
+- `HomePage.tsx`：移除未使用的 `reorderMaterials` 解构
+- `TimelineEditor.tsx`：移除未使用的 `useMemo` import 和 `materials`/`onResizeInsert` props
+
+### 5.3 P1：多片段 assignment 退化
+
+- **问题**：`selectedMaterials.length > 1` 但时间轴无插入块时，`is_multi=False`，后端走单素材路径丢弃非主素材
+- **修复**（`workflow.py`）：
+
+```python
+is_multi = len(material_paths) > 1 or (
+    req.custom_assignments is not None and len(req.custom_assignments) > 1
+)
+```
+
+### 5.4 P1：主素材 trim range 泄漏
+
+- **问题**：切换主素材（"设为主素材"）时，旧主素材的 `primarySourceStart/End` 保留给新主素材，导致截取范围错误
+- **原因**：仅按 `selectedMaterials.length` 变化重置，切换主素材时长度不变
+- **修复**（`useTimelineEditor.ts`）：改用 identity 追踪
+
+```typescript
+const [prevPrimaryId, setPrevPrimaryId] = useState(selectedMaterials[0]);
+if (selectedMaterials[0] !== prevPrimaryId) {
+  setPrevPrimaryId(selectedMaterials[0]);
+  setPrimarySourceStart(0);
+  setPrimarySourceEnd(0);
+}
+```
+
+---
+
+## 📁 今日修改文件
+
+| 文件 | 改动 |
+|------|------|
+| `frontend/src/shared/types/timeline.ts` | **新增**：`InsertSegment` 接口定义 |
+| `frontend/src/features/home/model/useTimelineEditor.ts` | **重写**：等分模型 → 主素材+插入模型 |
+| `frontend/src/features/home/ui/TimelineEditor.tsx` | **重写**：可视化组件适配新模型 |
+| `frontend/src/features/home/model/useHomeController.ts` | 适配新 timeline API、生成 payload 重写 |
+| `frontend/src/features/home/ui/MaterialSelector.tsx` | 主素材/可插入标签、设为主素材按钮 |
+| `frontend/src/features/home/ui/HomePage.tsx` | ClipTrimmer 路由、TimelineEditor 新 props |
+| `backend/app/modules/videos/workflow.py` | material_paths 来源、校验、下载去重、capping 保护 |
+| `frontend/src/shared/ui/AppModal.tsx` | 新增 `closeOnEsc` 配置，支持按弹窗粒度控制 ESC 关闭行为 |
+| `frontend/src/features/home/ui/ScriptLearningModal.tsx` | 禁用遮罩/ESC 关闭；输入页“取消”改为“清空” |
+
+---
+
+## 🔍 验证记录
+
+- TypeScript 编译检查：`npx tsc --noEmit` ✅ 无错误
+- Python 语法检查：`python -c "import ast; ast.parse(open(...).read())"` ✅
+- 前端 lint（本次补充修复）：`npm run lint -- src/shared/ui/AppModal.tsx src/features/home/ui/ScriptLearningModal.tsx` ✅
+- 代码审查（前端 + 后端各一轮 subagent review）：
+  - 前端：逻辑正确，无 bug，仅 1 处未使用 import（已清理）
+  - 后端：校验逻辑、下载去重、并发控制均正确
+- 单素材模式向后兼容：`toCustomAssignments()` 在单素材时正确生成带裁剪范围的单段 assignment
+
+---
+
+## ⚠️ 已知限制
+
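+- The "trim first, then loop" path in `prepare_segment` (`needs_loop && source_start > 0`) still exists; the frontend's boundary-splitting algorithm is designed so that its output never triggers it
+- At most 10 insert blocks (`MAX_INSERTS=10` in `useTimelineEditor`); beyond that, `addInsert` returns `"limit"`
+- Insert blocks have a minimum duration of 0.5s; operations that would go below it are ignored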
--- a/Docs/DevLogs/Day35.md
+++ b/Docs/DevLogs/Day35.md
@@ -0,0 +1,165 @@
+## 小脸口型质量补偿落地 + 部署验证 (Day 35)
+
+### 概述
+
+完成「小脸口型质量补偿（Small-Face LipSync Compensation）」后端落地与部署收口。核心目标是在不改变用户模型选择语义（`default/fast/advanced`）的前提下，对远景小脸素材增加质量补偿链路（检测 -> 裁切 -> 稀疏超分 -> 模型推理 -> 贴回），并保持默认关闭、失败回退（fail-open）、线上可快速回滚。
+
+---
+
+## ✅ 1) 后端能力落地
+
+### 1.1 配置与开关
+
+新增 5 个配置项（默认保守）：
+
+- `LIPSYNC_SMALL_FACE_ENHANCE`（默认 `false`）
+- `LIPSYNC_SMALL_FACE_THRESHOLD`（默认 `256`）
+- `LIPSYNC_SMALL_FACE_UPSCALER`（`gfpgan | codeformer`）
+- `LIPSYNC_SMALL_FACE_GPU_ID`（默认 `0`）
+- `LIPSYNC_SMALL_FACE_FAIL_OPEN`（默认 `true`）
+
+对应代码入口：`backend/app/core/config.py`、`backend/.env`。
+
+### 1.2 新增小脸增强服务
+
+新增 `backend/app/services/small_face_enhance_service.py`，实现完整补偿链路：
+
+1. **小脸判定**（CPU）
+   - SCRFD（`det_10g.onnx`，复用 LatentSync 权重）
+   - 从视频 10%-30% 区间均匀采样 24 帧
+   - 用最大脸宽中位数与阈值比较触发
+
+2. **裁切与轨迹**（CPU）
+   - 每 8 帧检测一次，其余帧前向填充 + EMA 平滑
+   - bbox 外扩 `padding=0.28`
+
+3. **稀疏超分**（GPU0）
+   - 检测帧走 GFPGAN/CodeFormer
+   - 非检测帧走 bicubic resize
+   - 目标尺寸 `512x512`
+
+4. **贴回融合**（CPU）
+   - 口型局部 mask（起点 68% + 侧边留白 16%）+ 高斯羽化（15px）
+   - `cv2.seamlessClone`，失败回退 alpha blend
+
+5. **帧数保护**
+   - 贴回前校验 `lipsync_frames <= original_frames`
+   - 仅当 `lipsync_frames > original_frames` 时报错（异常），其余按 lipsync 帧数正常贴回
+
+---
+
+## ✅ 2) LipSyncService 集成
+
+`backend/app/services/lipsync_service.py` 关键改造：
+
+- 在 `_local_generate()` 内按顺序执行：
+  - `video looping` -> `small face enhance` -> `model infer` -> `blend back`
+- 抽取 `_run_selected_model()` 统一模型路由（MuseTalk / LatentSync server / LatentSync subprocess）
+- 小脸增强分支全链路 `try/except`，受 `LIPSYNC_SMALL_FACE_FAIL_OPEN` 控制
+- `check_health()` 新增 `small_face_enhance` 状态字段
+
+语义保持：
+
+- 前端与 API 协议不变
+- 用户选择模型优先，不因小脸强制换模型
+- 仅本地路径（`_local_generate`）接入；远程路径暂不接入
+
+---
+
+## ✅ 3) 依赖与权重
+
+### 3.1 依赖
+
+`backend/requirements.txt` 新增：
+
+- `opencv-python-headless>=4.8.0`
+- `gfpgan>=1.3.8`
+
+### 3.2 权重
+
+- `models/FaceEnhance/GFPGANv1.4.pth`（新增目录与权重）
+- `models/LatentSync/checkpoints/auxiliary/models/buffalo_l/det_10g.onnx`（复用）
+
+---
+
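+## ✅ 4) Stability fixes (post-deployment patches)
+
+To address dependency compatibility, frame-count estimation errors, false blend-back failures, and output-quality problems found in the actual deployment, nine follow-up fixes were added:
+
+1. **Lazy import + guard**
+   - `cv2`/`numpy` are now imported inside `try/except`
+   - The enhancement entry point is guarded by `_CV2_AVAILABLE`
+   - When the dependencies are missing, enhancement is skipped and the main flow is unaffected
+
+2. **Type annotations and torchvision compatibility patch**
+   - Added `from __future__ import annotations` so that annotations such as `np.ndarray` do not fail at import time when the dependency is missing
+   - `_ensure_upscaler()` injects `sys.modules['torchvision.transforms.functional_tensor']` so that `torchvision>=0.20` works with the legacy import in `gfpgan/basicsr`
+
+3. **ffprobe frame rate and frame-count estimation fix**
+   - `_get_video_info()` switched from `csv` output to field access on `json` output, so a missing `nb_frames` no longer shifts the fields
+   - fps now prefers `avg_frame_rate`; `r_frame_rate` is only a fallback
+
+4. **Track frame count and blend-back check fix**
+   - `_build_face_track()` records the number of frames ffmpeg actually read and overrides the estimated `nb_frames`
+   - `blend_back()` relaxes its check: `lipsync <= original` blends back normally, only `>` raises
+
+5. **Empty-output guard**
+   - `blend_back()` adds an error branch for `ls_frames <= 0`
+   - The outer `FAIL_OPEN` handler catches it and falls back to the regular path, so no empty video is written
+
+6. **Time-base alignment fix (slow motion/ghosting)**
+   - `_crop_and_upscale_video()` output fps now follows the source video fps, so the enhanced video's timeline is not stretched
+   - `blend_back()` maps original frame indices by `orig_fps/ls_fps`, avoiding the slow motion/ghosting caused by pasting back only the leading frames
+
+7. **Silent video fix**
+   - Added an audio mux step after a successful small-face blend-back
+   - The current task's `audio_path` is always muxed into the blended video, so the enhanced path cannot lose its audio
+
+8. **Eye ghosting fix**
+   - The mouth mask start was moved further down to 68% and 16% left/right margins were added, so the eye area/nose wings take less part in the blend
+   - After `seamlessClone`, the result is blended a second time limited to the mask, suppressing Poisson diffusion above the eyes
+
+9. **Distortion avoidance (operations side)**
+   - `LIPSYNC_SMALL_FACE_THRESHOLD=9999` is for pipeline smoke tests only, not for quality evaluation
+   - Always restore `LIPSYNC_SMALL_FACE_THRESHOLD=256` before quality validation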
+
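The fps selection rule from the ffprobe fix (prefer `avg_frame_rate`, fall back to `r_frame_rate`, then to 25) can be sketched as follows. It assumes the JSON shape printed by `ffprobe -of json -show_streams`; the helper names `parse_rate` and `video_info` are illustrative and are not the service's real `_get_video_info()`.

```python
import json
from fractions import Fraction


def parse_rate(text):
    """Parse an ffprobe rate such as '30000/1001'; return None for '0/0' or junk."""
    try:
        num, _, den = (text or "").partition("/")
        value = Fraction(int(num), int(den or "1"))
    except (ValueError, ZeroDivisionError):
        return None
    return float(value) if value > 0 else None


def video_info(ffprobe_json, fallback_fps=25.0):
    """Return {'fps', 'nb_frames'} for the first stream, or None when there is no stream."""
    streams = json.loads(ffprobe_json).get("streams") or []
    if not streams:
        return None
    stream = streams[0]
    # Access by field name: a missing nb_frames cannot shift the other fields.
    fps = (parse_rate(stream.get("avg_frame_rate"))
           or parse_rate(stream.get("r_frame_rate"))
           or fallback_fps)
    nb = stream.get("nb_frames")
    frames = int(nb) if nb is not None and str(nb).isdigit() else None
    return {"fps": fps, "nb_frames": frames}
```

A stream reporting `avg_frame_rate` of `30/1` and `r_frame_rate` of `60/1` resolves to 30 fps here, which is the case the fix targets.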
+---
+
+## ✅ 5) 部署文档与验证
+
+新增并回写部署文档：`Docs/FACEENHANCE_DEPLOY.md`。
+
+文档修正点：
+
+- 健康检查地址修正为：`/api/videos/lipsync/health`
+- 响应示例补齐 `success/data` 外层包装
+
+实际验证要点：
+
+- `GET /api/videos/lipsync/health` 返回 `data.small_face_enhance`
+- 默认 `enabled=false`，开关关闭时行为与旧版一致
+- `detector_loaded=false`（懒加载）符合预期
+
+---
+
+## 📁 今日修改文件
+
+| 文件 | 改动 |
+|------|------|
+| `backend/app/core/config.py` | 新增 `LIPSYNC_SMALL_FACE_*` 配置项（5 个） |
+| `backend/.env` | 增加小脸增强开关与参数 |
+| `backend/app/services/small_face_enhance_service.py` | 新增：检测/裁切/超分/贴回主服务；后续补丁含懒加载与兼容修复 |
+| `backend/app/services/lipsync_service.py` | 集成增强链路、抽取 `_run_selected_model`、health 增强状态 |
+| `backend/requirements.txt` | 新增 `opencv-python-headless`、`gfpgan` |
+| `models/FaceEnhance/GFPGANv1.4.pth` | 新增超分权重 |
+| `Docs/FACEENHANCE_DEPLOY.md` | 新增部署文档并修正健康检查路径/返回示例 |
+
+---
+
+## ⚠️ 已知限制
+
+- 仅本地唇形路径接入（`_local_generate()`）；远程模式未接入小脸补偿
+- 多镜头场景当前仍为全局判定，暂不做逐段小脸判定
+- v1 优先单人自拍稳定性，多人脸切换策略后续再补
--- a/Docs/Doc_Rules.md
+++ b/Docs/Doc_Rules.md
@@ -6,13 +6,14 @@

 ## ⚡ 核心原则

-| 规则 | 说明 |
-|------|------|
-| **默认更新** | 更新 `DayN.md` 和 `TASK_COMPLETE.md` |
-| **按需更新** | 其他文档仅在内容变化涉及时更新 |
-| **智能修改** | 错误→替换，改进→追加（见下方详细规则） |
-| **先读后写** | 更新前先查看文件当前内容 |
-| **日内合并** | 同一天的多次小修改合并为最终版本 |
+| 规则 | 说明 |
+|------|------|
+| **默认更新** | 更新 `DayN.md` 和 `TASK_COMPLETE.md` |
+| **按需更新** | 其他文档仅在内容变化涉及时更新 |
+| **链路对齐** | 新增/重构文档后，回写入口文档（`README.md` 或对应 `*_README.md`） |
+| **智能修改** | 错误→替换，改进→追加（见下方详细规则） |
+| **先读后写** | 更新前先查看文件当前内容 |
+| **日内合并** | 同一天的多次小修改合并为最终版本 |

 ---

@@ -20,17 +21,19 @@

 > **每次提交重要变更时，请核对以下文件是否需要同步：**

-| 优先级 | 文件路径 | 检查重点 |
-| :---: | :--- | :--- |
-| 🔥 **High** | `Docs/DevLogs/DayN.md` | **(最新日志)** 详细记录变更、修复、代码片段 |
-| 🔥 **High** | `Docs/TASK_COMPLETE.md` | **(任务总览)** 更新 `[x]`、进度条、时间线 |
-| ⚡ **Med** | `README.md` | **(项目主页)** 功能特性、技术栈、最新截图 |
-| ⚡ **Med** | `Docs/DEPLOY_MANUAL.md` | **(部署手册)** 环境变量、依赖包、启动命令变更 |
-| ⚡ **Med** | `Docs/BACKEND_DEV.md` | **(后端规范)** 接口契约、模块划分、环境变量 |
-| ⚡ **Med** | `Docs/BACKEND_README.md` | **(后端文档)** 接口说明、架构设计 |
-| ⚡ **Med** | `Docs/FRONTEND_DEV.md` | **(前端规范)** API封装、日期格式化、新页面规范 |
-| ⚡ **Med** | `Docs/FRONTEND_README.md` | **(前端文档)** 功能说明、页面变更 |
-| 🧊 **Low** | `Docs/*_DEPLOY.md` | **(子系统部署)** LatentSync/CosyVoice/字幕等独立部署文档 |
+| 优先级 | 文件路径 | 检查重点 |
+| :---: | :--- | :--- |
+| 🔥 **High** | `Docs/DevLogs/DayN.md` | **(最新日志)** 详细记录变更、修复、代码片段 |
+| 🔥 **High** | `Docs/TASK_COMPLETE.md` | **(任务总览)** 更新 Day Current、`[x]` 与更新时间 |
+| ⚡ **Med** | `README.md` | **(项目主页)** 功能特性、技术栈、最新截图 |
+| ⚡ **Med** | `Docs/DEPLOY_MANUAL.md` | **(部署手册)** 环境变量、依赖包、启动命令变更 |
+| ⚡ **Med** | `Docs/PUBLISH_DEPLOY.md` | **(发布专项)** 四平台登录/发布实现、排障、验收流程 |
+| ⚡ **Med** | `Docs/BACKEND_DEV.md` | **(后端规范)** 接口契约、模块划分、环境变量 |
+| ⚡ **Med** | `Docs/BACKEND_README.md` | **(后端文档)** 接口说明、架构设计 |
+| ⚡ **Med** | `Docs/FRONTEND_DEV.md` | **(前端规范)** API封装、日期格式化、新页面规范 |
+| ⚡ **Med** | `Docs/FRONTEND_README.md` | **(前端文档)** 功能说明、页面变更 |
+| 🧊 **Low** | `Docs/Doc_Rules.md` | **(规则文档)** 文档结构变化或流程变化时同步更新 |
+| 🧊 **Low** | `Docs/*_DEPLOY.md` | **(子系统部署)** LatentSync/CosyVoice/字幕等独立部署文档 |

 ---

@@ -89,7 +92,7 @@

 ---

-## 🔍 更新前检查清单
+## 🔍 更新前检查清单

 > **核心原则**：追加前先查找，避免重复和遗漏

@@ -112,12 +115,20 @@
 | **有待验证状态** | 更新状态标记 |
 | **全新独立内容** | 追加到末尾 |

-**3. 必须更新的内容**
+**3. 必须更新的内容**

 - ✅ **状态标记**：`🔄 待验证` → `✅ 已修复` / `❌ 失败`
 - ✅ **进度百分比**：更新为最新值
-- ✅ **文件修改列表**：补充新修改的文件
-- ❌ **禁止**：创建重复的章节标题
+- ✅ **文件修改列表**：补充新修改的文件
+- ❌ **禁止**：创建重复的章节标题
+
+### 发布相关变更的三检（新增）
+
+若涉及抖音/微信/B站/小红书发布或扫码登录，额外执行：
+
+1. **路由真值检查**：以 `backend/app/modules/publish/router.py` 为准校验 API 路径，避免文档写成旧路径（例如 `/screenshots/`）。
+2. **专项文档对齐**：更新 `Docs/PUBLISH_DEPLOY.md` 中对应平台章节（登录、发布判定、排障）。
+3. **入口文档回写**：至少回写一处入口文档（`README.md` 或 `Docs/BACKEND_README.md` / `Docs/DEPLOY_MANUAL.md`）。

 ### 示例场景

@@ -138,23 +149,23 @@

 ---

-## ️ 工具使用规范
+## ️ 工具使用规范

 > **核心原则**：使用正确的工具，避免字符编码问题

-### ✅ 推荐工具：Edit / Read / Grep
+### ✅ 推荐工具：Read / Grep / apply_patch

-**使用场景**：
-- `Read`：更新前先查看文件当前内容
-- `Edit`：精确替换现有内容、追加新章节
-- `Grep`：搜索文件中是否已有相关章节
-- `Write`：创建新文件（如 Day{N+1}.md）
+**使用场景**：
+- `Read`：更新前先查看文件当前内容
+- `apply_patch`：精确替换现有内容、追加新章节
+- `Grep`：搜索文件中是否已有相关章节
+- `Write`：创建新文件（如 Day{N+1}.md）

 **注意事项**：
 ```markdown
-1. **先读后写**：编辑前先用 Read 确认内容
-2. **精确匹配**：Edit 的 old_string 必须与文件内容完全一致
-3. **避免重复**：编辑前用 Grep 检查是否已存在同主题章节
+1. **先读后写**：编辑前先用 Read 确认内容
+2. **精确匹配**：`apply_patch` 的上下文必须与文件内容一致
+3. **避免重复**：编辑前用 Grep 检查是否已存在同主题章节
 ```

 ### ❌ 禁止使用：命令行工具修改文档
@@ -171,13 +182,14 @@

 ### 📝 最佳实践示例

-**追加新章节**：使用 `Edit` 工具，`old_string` 匹配文件末尾内容，`new_string` 包含原内容 + 新章节。
-
-**修改现有内容**：使用 `Edit` 工具精确替换。
-```markdown
-old_string: "**状态**：🔄 待修复"
-new_string: "**状态**：✅ 已修复"
-```
+**追加新章节**：使用 `apply_patch`，以文件末尾稳定上下文为锚点追加。
+
+**修改现有内容**：使用 `apply_patch` 精确替换。
+```markdown
+@@
+-**状态**：🔄 待修复
++**状态**：✅ 已修复
+```


 ---
@@ -191,11 +203,12 @@ ViGent2/Docs/
 ├── BACKEND_DEV.md                # 后端开发规范
 ├── BACKEND_README.md             # 后端功能文档
 ├── FRONTEND_DEV.md               # 前端开发规范
-├── FRONTEND_README.md            # 前端功能文档
-├── DEPLOY_MANUAL.md              # 部署手册
-├── SUPABASE_DEPLOY.md            # Supabase 部署文档
-├── LATENTSYNC_DEPLOY.md          # LatentSync 部署文档
-├── COSYVOICE3_DEPLOY.md           # 声音克隆部署文档
+├── FRONTEND_README.md            # 前端功能文档
+├── DEPLOY_MANUAL.md              # 部署手册
+├── PUBLISH_DEPLOY.md             # 多平台发布专项文档
+├── SUPABASE_DEPLOY.md            # Supabase 部署文档
+├── LATENTSYNC_DEPLOY.md          # LatentSync 部署文档
+├── COSYVOICE3_DEPLOY.md           # 声音克隆部署文档
 ├── ALIPAY_DEPLOY.md              # 支付宝付费部署文档
 ├── SUBTITLE_DEPLOY.md            # 字幕系统部署文档
 └── DevLogs/
@@ -254,16 +267,21 @@ ViGent2/Docs/

 ---

-## 📏 内容简洁性规则
+## 📏 内容简洁性规则

 ### 代码示例长度控制
 - **原则**：只展示关键代码片段（10-20行以内）
 - **超长代码**：使用 `// ... 省略 ...` 或仅列出文件名+行号
 - **完整代码**：引用文件链接，而非粘贴全文

-### 调试信息处理
-- **临时调试**：验证后删除（如调试日志、测试截图）
-- **有价值信息**：保留（如错误日志、性能数据）
+### 调试信息处理
+- **临时调试**：验证后删除（如调试日志、测试截图）
+- **有价值信息**：保留（如错误日志、性能数据）
+
+### 敏感信息处理
+- **禁止落盘**：Cookie 值、Token、密钥、完整手机号、支付凭证。
+- **日志引用**：仅记录必要关键词与结论，避免粘贴大段原始日志。
+- **路径引用**：优先给相对路径与文件名，不记录无关个人目录信息。

 ### 状态标记更新
 - **🔄 待验证** → 验证后更新为 **✅ 已修复** 或 **❌ 失败**
@@ -280,29 +298,29 @@ ViGent2/Docs/
 - **格式一致性**：直接参考 `TASK_COMPLETE.md` 现有格式追加内容。
 - **进度更新**：仅在阶段性里程碑时更新进度百分比。

-### 🔍 完整性检查清单 (必做)
-
-每次更新 `TASK_COMPLETE.md` 时，必须**逐一检查**以下所有板块：
-
-1. **文件头部 & 导航**
-   - [ ] `更新时间`：必须是当天日期
-   - [ ] `整体进度`：简述当前状态
-   - [ ] `快速导航`：Day 范围与文档一致
-
-2. **核心任务区**
-   - [ ] `已完成任务`：添加新的 [x] 项目
-   - [ ] `后续规划`：管理三色板块 (优先/债务/未来)
-
-3. **统计与回顾**
-   - [ ] `进度统计`：更新对应模块状态和百分比
-   - [ ] `里程碑`：若有重大进展，追加 `## Milestone N`
-
-4. **底部链接**
-   - [ ] `时间线`：追加今日概括
-   - [ ] `相关文档`：更新 DayLog 链接范围
-
-> **口诀**：头尾时间要对齐，任务规划两手抓，里程碑上别落下。
+### 🔍 完整性检查清单 (必做)
+
+每次更新 `TASK_COMPLETE.md` 时，必须**逐一检查**以下板块：
+
+1. **文件头部**
+   - [ ] `更新时间`：必须是当天日期
+   - [ ] `整体进度`：与当前 Day 状态一致（例如 Day31）
+
+2. **当日 Current 区块**
+   - [ ] 新增/更新 `Day N (Current)` 标题
+   - [ ] 关键任务以 `[x]` 列出（避免仅写结论）
+   - [ ] 前一天 Day 标题取消 `(Current)` 标记
+
+3. **Roadmap 与模块状态**
+   - [ ] 如有已完成长期事项，及时从待办迁移到已完成
+   - [ ] 模块完成度有变化时同步更新
+
+4. **相关文档链接**
+   - [ ] 新增的核心文档（如 `PUBLISH_DEPLOY.md`）要在相关位置可追溯
+   - [ ] 若 DayN 记录了“文档回写”，`TASK_COMPLETE.md` 的当日条目也要体现
+
+> **口诀**：头部日期、当日 Current、模块状态、链接可追溯。

 ---

-**最后更新**：2026-02-11
+**最后更新**：2026-03-03
--- a/Docs/FACEENHANCE_DEPLOY.md
+++ b/Docs/FACEENHANCE_DEPLOY.md
@@ -0,0 +1,428 @@
+# 小脸口型质量补偿链路部署指南
+
+> **更新时间**：2026-03-10 v1.4
+> **适用版本**：SmallFaceEnhance v1.4 (内嵌于 Backend 进程)
+> **架构**：LipSyncService 内部模块，无独立进程
+
+---
+
+## 架构概览
+
+小脸口型质量补偿链路（简称"小脸增强"）作为 `LipSyncService._local_generate()` 的**前处理分支**，在 lipsync 推理前自动检测小脸并增强输入质量：
+
+```
+原视频 + 音频
+  → video looping (已有逻辑)
+  → 小脸检测 (SCRFD, CPU)
+  → [非小脸] 直接用用户所选模型推理 (现有路径)
+  → [小脸]
+       A. 裁切主脸区域 (带 padding)
+       B. 稀疏关键帧超分到 512px (GFPGAN, GPU0)
+       C. 用用户所选模型推理 (MuseTalk 或 LatentSync)
+       D. 下半脸 mask 羽化 + seamlessClone 贴回原帧
+  → 进入现有后续流程 (字幕/BGM/上传)
+```
+
+**关键约束**：
+- 不改前端、不改 API 协议
+- 模型选择权归用户，不因小脸自动换模型
+- 默认 fail-open：增强链任何一步失败，自动回退原流程
+- 无独立进程/PM2，跟随 `vigent2-backend` 运行
+
+---
+
+## 硬件要求
+
+| 配置 | 说明 |
+|------|------|
+| 检测器 | SCRFD (det_10g.onnx)，CPU 推理，无额外 GPU 开销 |
+| 超分 | GFPGAN，GPU0 (与 MuseTalk 同卡，顺序执行)，约 2-3GB 显存 |
+| 内存 | 流式 ffmpeg pipe 逐帧处理，不额外占用大量内存 |
+
+> 超分与 MuseTalk 共享 GPU0，顺序执行不会同时占用显存。
+
+---
+
+## 依赖安装
+
+### 1. pip 依赖
+
+已在 `backend/requirements.txt` 中添加：
+
+```
+opencv-python-headless>=4.8.0
+gfpgan>=1.3.8
+```
+
+安装：
+
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/backend
+pip install opencv-python-headless gfpgan
+```
+
+> `gfpgan` 会自动拉取 `basicsr`、`facexlib` 等依赖。
+> `onnxruntime` 需单独确认已安装（LatentSync 环境中已有 1.23.2）。
+> 如果 backend 虚拟环境中缺少 onnxruntime，需额外安装：`pip install onnxruntime`
+
+### 2. 系统依赖
+
+- `ffmpeg` / `ffprobe`：已有（视频处理必需）
+
+---
+
+## 模型权重
+
+### 目录结构
+
+```
+models/
+├── FaceEnhance/
+│   └── GFPGANv1.4.pth              ← 超分权重 (~333MB)
+└── LatentSync/checkpoints/auxiliary/
+    └── models/buffalo_l/
+        └── det_10g.onnx             ← 人脸检测权重 (~16MB, 复用已有)
+```
+
+### 下载方式
+
+**GFPGAN 权重**（已下载）：
+
+```bash
+cd /home/rongye/ProgramFiles/ViGent2/models/FaceEnhance
+wget -O GFPGANv1.4.pth "https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.4.pth"
+```
+
+**SCRFD 检测器权重**：
+
+复用 LatentSync 已有的 `det_10g.onnx`，无需额外下载。代码自动引用路径：
+`models/LatentSync/checkpoints/auxiliary/models/buffalo_l/det_10g.onnx`
+
+> 权重缺失时自动 fail-open 跳过增强，不会导致任务失败。
+
+---
+
+## 后端配置
+
+`backend/.env` 中的相关变量：
+
+```ini
+# =============== 小脸口型质量补偿链路 ===============
+LIPSYNC_SMALL_FACE_ENHANCE=false          # 总开关 (true/false)
+LIPSYNC_SMALL_FACE_THRESHOLD=256          # 触发阈值 (像素，脸宽 < 此值触发)
+LIPSYNC_SMALL_FACE_UPSCALER=gfpgan        # 超分模型: gfpgan | codeformer
+LIPSYNC_SMALL_FACE_GPU_ID=0               # 超分 GPU (与 MuseTalk 同卡)
+LIPSYNC_SMALL_FACE_FAIL_OPEN=true         # 失败回退 (true=回退原流程, false=报错)
+```
+
+`backend/app/core/config.py` 中的默认值：
+
+```python
+LIPSYNC_SMALL_FACE_ENHANCE: bool = False
+LIPSYNC_SMALL_FACE_THRESHOLD: int = 256
+LIPSYNC_SMALL_FACE_UPSCALER: str = "codeformer"
+LIPSYNC_SMALL_FACE_GPU_ID: int = 0
+LIPSYNC_SMALL_FACE_FAIL_OPEN: bool = True
+```
+
+> `.env` 优先于 `config.py` 默认值。`config.py` 仅在 `.env` 未设置时生效。
+
+### 模块内部常量
+
+以下参数固定为代码常量（`small_face_enhance_service.py`），暂不走 env：
+
+| 常量 | 值 | 说明 |
+|------|-----|------|
+| `PADDING` | 0.28 | bbox 外扩比例 |
+| `DETECT_EVERY` | 8 | 每 N 帧检测，中间帧 EMA 插值 |
+| `TARGET_SIZE` | 512 | 超分目标尺寸 |
+| `MASK_FEATHER` | 15 | 下半脸 mask 羽化像素 |
+| `MASK_UPPER_RATIO` | 0.68 | 口型 mask 起始位置 (crop 高度的 68%，仅覆盖嘴部/下巴) |
+| `MASK_SIDE_MARGIN` | 0.16 | 左右留白比例，避免改动面颊/鼻翼 |
+| `SAMPLE_FRAMES` | 24 | 小脸判定采样帧数 |
+| `SAMPLE_WINDOW` | (0.10, 0.30) | 采样窗口 (视频 10%~30%) |
+| `ENCODE_FPS` | 25 | 中间视频编码帧率 fallback（优先跟随源视频 fps，源 fps 不可用时回退 25） |
+| `ENCODE_CRF` | 18 | 中间视频编码质量 |
+| `EMA_ALPHA` | 0.3 | bbox EMA 平滑系数 |
+
+---
+
+## 启用与验证
+
+### 1. 开启小脸口型质量补偿链路
+
+```bash
+# 编辑 backend/.env
+LIPSYNC_SMALL_FACE_ENHANCE=true
+```
+
+重启后端：
+
+```bash
+pm2 restart vigent2-backend
+```
+
+### 2. 强制触发测试
+
+设置极大阈值，使任何视频都触发增强：
+
+```ini
+LIPSYNC_SMALL_FACE_THRESHOLD=9999
+```
+
+> 仅用于链路冒烟测试，不用于质量评估。`9999` 会强制大脸素材进入增强分支，可能出现中脸变形/鼻翼细节异常。
+
+提交一个视频任务，检查日志：
+
+```bash
+pm2 logs vigent2-backend --lines 50
+```
+
+应看到类似输出：
+
+```
+小脸增强: face_w=320px < threshold=9999px, 触发增强
+✅ SCRFD 检测器已加载
+✅ 超分器已加载: gfpgan
+小脸增强: face_w=320px threshold=9999px enhanced=True upscaler=gfpgan time=12.3s
+✅ 小脸增强 + 唇形同步完成: /path/to/output.mp4
+```
+
+### 3. 调回正常阈值
+
+验证通过后，改回合理阈值：
+
+```ini
+LIPSYNC_SMALL_FACE_THRESHOLD=256
+```
+
+并重启 backend：`pm2 restart vigent2-backend`。
+
+### 4. 健康检查
+
+```bash
+curl http://localhost:8006/api/videos/lipsync/health | python3 -m json.tool
+```
+
+应包含 `data.small_face_enhance`：
+
+```json
+{
+  "success": true,
+  "data": {
+    "small_face_enhance": {
+      "enabled": true,
+      "threshold": 256,
+      "detector_loaded": true
+    }
+  }
+}
+```
+
+---
+
+## 相关文件
+
+| 文件 | 说明 |
+|------|------|
+| `backend/app/services/small_face_enhance_service.py` | 小脸增强主服务 (检测 + 裁切 + 超分 + 贴回) |
+| `backend/app/services/lipsync_service.py` | 混合路由 + 小脸增强集成 + `_run_selected_model()` |
+| `backend/app/core/config.py` | `LIPSYNC_SMALL_FACE_*` 配置项 |
+| `models/FaceEnhance/GFPGANv1.4.pth` | GFPGAN 超分权重 |
+| `models/LatentSync/checkpoints/auxiliary/models/buffalo_l/det_10g.onnx` | SCRFD 检测器权重 (复用) |
+| `Temp/小脸增强分支-实施计划.md` | 详细方案文档 |
+
+---
+
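+## Processing flow in detail
+
+### 1. Detection stage (CPU)
+
+- Sample 24 frames evenly from the 10%~30% span of the video
+- Detect the largest face in each sampled frame with SCRFD (det_10g.onnx) and take the median face width
+- Enhancement is triggered when `face width < THRESHOLD`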
+
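The detection-stage decision can be sketched as below. The function names are hypothetical, the detector itself is left out (the per-frame widths are passed in), and treating "no face found" as "do not enhance" is an assumption rather than documented behaviour.

```python
from statistics import median


def sample_indices(total_frames, count=24, window=(0.10, 0.30)):
    """Evenly spaced frame indices inside the sampling window."""
    lo = int(total_frames * window[0])
    hi = max(lo, int(total_frames * window[1]) - 1)
    if count <= 1 or hi == lo:
        return [lo]
    step = (hi - lo) / (count - 1)
    # A set drops duplicates that appear when the window is shorter than `count` frames.
    return sorted({lo + round(i * step) for i in range(count)})


def is_small_face(max_face_widths, threshold=256):
    """max_face_widths: width in px of the largest face per sampled frame (None = no face)."""
    widths = [w for w in max_face_widths if w is not None and w > 0]
    if not widths:
        return False  # assumption: nothing detected, so skip enhancement
    # The median keeps a few outlier frames from flipping the decision.
    return median(widths) < threshold
```

Using the median rather than the mean means a handful of close-up or missed frames in the sample does not change the outcome.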
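+### 2. Cropping + tracking (CPU)
+
+- Detect the face bbox every 8 frames; smooth the frames in between with EMA interpolation
+- Expand the bbox by 0.28 padding and clamp it to the frame bounds
+- Write the number of frames actually read back to `track.frame_count` to correct the ffprobe estimate
+- Crop as a stream through an ffmpeg pipe and output 512x512 video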
+
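A rough sketch of the tracking step, under two assumptions that the document does not spell out: the EMA is updated on detection frames and forward-filled in between, and the 0.28 padding is applied on each side relative to the box size. `smooth_track` and `pad_and_clamp` are illustrative names.

```python
def smooth_track(detections, alpha=0.3):
    """detections: one (x1, y1, x2, y2) tuple per frame, or None where no detection ran."""
    track = []
    prev = None
    for det in detections:
        if det is not None:
            if prev is None:
                prev = tuple(float(v) for v in det)
            else:
                # EMA: new = alpha * detection + (1 - alpha) * previous
                prev = tuple(alpha * d + (1 - alpha) * p for d, p in zip(det, prev))
        # Frames without a detection reuse the last smoothed box (forward fill).
        track.append(prev)
    return track


def pad_and_clamp(box, frame_w, frame_h, padding=0.28):
    """Expand the box by `padding` of its size on each side, clamped to the frame."""
    x1, y1, x2, y2 = box
    dx = (x2 - x1) * padding
    dy = (y2 - y1) * padding
    return (
        max(0, round(x1 - dx)),
        max(0, round(y1 - dy)),
        min(frame_w, round(x2 + dx)),
        min(frame_h, round(y2 + dy)),
    )
```

Frames before the first detection come back as `None` in this sketch; a real implementation would need to decide how to handle them.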
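+### 3. Super-resolution (GPU0)
+
+- Detection frames (every 8th frame): full GFPGAN super-resolution
+- Non-detection frames: bicubic resize to 512x512
+- The enhanced video's output fps follows the source fps (no longer hard-coded to 25fps), which avoids stretching the time base
+- `torch.cuda.empty_cache()` is called automatically after inference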
+
+### 4. Lipsync inference
+
+- The user-selected model (fast/default/advanced) runs inference on the enhanced face video
+- Model-selection semantics are unchanged
+
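+### 5. Blend-back (CPU)
+
+- Local mouth mask (starts at 68% of the height, with 16% side margins) + 15px Gaussian feathering (covers only the mouth/chin)
+- `cv2.seamlessClone(NORMAL_CLONE)` pastes the result back onto the original frame
+- The seamlessClone result is then alpha-limited to the mask region a second time, so the blend cannot spread above the eyes
+- Falls back to alpha blending when seamlessClone fails
+- Blend-back maps original frame indices along the timeline (`orig_fps/ls_fps`), which avoids the slow motion/ghosting caused by using only the leading frames
+- Frame-count guard: lipsync output follows the audio duration, so its frame count is normally <= that of the original looped video; an error is raised only when `lipsync frames > original frames`, and `<=` blends back normally
+- Empty-output guard: `lipsync frames <= 0` raises immediately, and the outer `FAIL_OPEN` handler falls back to the original flow so no empty video is written
+- Audio mux: after blend-back the audio track is always re-muxed from `audio_path`, so the enhanced path cannot produce a silent video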
+
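The timeline mapping and the two frame-count guards can be sketched like this. The function names are hypothetical, and flooring to the frame at or before the matching timestamp is an assumption; the real `blend_back()` may round differently.

```python
def map_frame_index(ls_index, ls_fps, orig_fps, orig_frames):
    """Index of the original frame shown at the same timestamp as lipsync frame `ls_index`."""
    # timestamp = ls_index / ls_fps; original index = timestamp * orig_fps (floored)
    idx = int(ls_index * orig_fps / ls_fps)
    return min(idx, orig_frames - 1)  # never read past the last original frame


def check_frame_counts(ls_frames, orig_frames):
    """Raise ValueError for the two abnormal cases; return None when blend-back may proceed."""
    if ls_frames <= 0:
        raise ValueError("lipsync output has no frames")
    if ls_frames > orig_frames:
        raise ValueError(
            f"frame count mismatch: lipsync={ls_frames} > original={orig_frames}"
        )
```

With a 25 fps lipsync output and a 30 fps source, lipsync frame 25 (t = 1s) maps to original frame 30 rather than frame 25, which is the misalignment the timeline mapping removes.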
+---
+
+## 回滚方案
+
+**一级回滚 (秒级)**：
+
+```ini
+LIPSYNC_SMALL_FACE_ENHANCE=false
+```
+
+重启 backend 即可，所有任务走原流程。
+
+**二级回滚 (版本级)**：
+
+回退 `lipsync_service.py` 增强接入提交，配置项保留但不生效。
+
+---
+
+## 常见问题
+
+### onnxruntime 未安装
+
+```
+⚠️ SCRFD 初始化失败: No module named 'onnxruntime'
+```
+
+**解决**：
+
+```bash
+pip install onnxruntime
+```
+
+### GFPGAN 权重缺失
+
+```
+⚠️ GFPGAN 权重不存在: .../models/FaceEnhance/GFPGANv1.4.pth
+```
+
+**解决**：参考上方"模型权重"章节下载。权重缺失时超分自动降级为 bicubic resize。
+
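+### Frame-count error triggers fail-open
+
+```
+⚠️ 小脸贴回失败，回退原流程: 帧数异常: lipsync=300 > original=250
+```
+
+**Explanation**: v1.1 relaxed the frame-count check. The lipsync model emits frames according to the audio duration, which is normally <= the frame count of the looped video, and in that case blend-back proceeds normally. An error is raised only when the lipsync output has **more** frames than the original (an abnormal case).
+
+> v1.0 used a strict equality check (`lipsync != original` meant failure), which misfired when the looped video had far more frames than the audio required. Fixed in v1.1.
+
+### Empty lipsync output triggers fallback
+
+```
+⚠️ 小脸贴回失败，回退原流程: lipsync 输出帧数为 0，跳过贴回
+```
+
+**Explanation**: v1.2 added an empty-output guard. When `ls_frames <= 0` an error is raised immediately and the outer fail-open handler falls back to the regular lipsync path, so no empty video file is produced.
+
+### Slow motion / eye ghosting after enhancement
+
+**Cause**: when the original video and the lipsync output have different fps, blending back by identical frame number misaligns the timeline (only the leading frames get pasted back).
+
+**Fix**: v1.3 maps the timeline by `orig_fps/ls_fps`, so blend-back uses the frame at the matching timestamp rather than the same index; the enhanced video's output fps also follows the source fps.
+
+**Further fix (v1.4)**:
+- The mask start was moved further down to 68% and 16% side margins were added, so the eye area/nose wings take less part in the blend
+- The seamlessClone output is limited to the mask region, preventing Poisson diffusion from causing ghosting above the eyes
+
+### Facial distortion after enhancement (nose wings/mid-face artifacts)
+
+**Most likely cause**: the test threshold `LIPSYNC_SMALL_FACE_THRESHOLD=9999` is in use, which forces large-face material that needs no enhancement into the compensation pipeline.
+
+**Suggested handling**:
+- First set `LIPSYNC_SMALL_FACE_THRESHOLD=256` again and restart the backend.
+- If the artifacts persist, temporarily set `LIPSYNC_SMALL_FACE_ENHANCE=false` for an A/B comparison before tuning further.
+
+### No audio after enhancement
+
+**Cause**: the rawvideo written during blend-back carries no audio track by default.
+
+**Fix**: v1.3 always muxes the audio track after blend-back, writing the audio back from the current task's `audio_path`.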
+
+### 增强后口型有偏移
+
+检查 `PADDING` 常量是否合理。过小的 padding 可能导致裁切区域不够，过大会引入太多背景。当前默认 0.28 (28%) 适用于大多数单人自拍场景。
+
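+### torchvision compatibility (functional_tensor)
+
+```
+No module named 'torchvision.transforms.functional_tensor'
+```
+
+**Cause**: recent torchvision releases no longer ship the `functional_tensor` module (it was removed in 0.17, so it is also absent from the >= 0.20 builds referenced here), but `basicsr` (a gfpgan dependency) still imports it.
+
+**Resolution**: the code ships a built-in compatibility shim (`_ensure_upscaler()` injects the module into `sys.modules` automatically), so no manual action is needed. If the error still appears, check whether `_ensure_upscaler` actually runs.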
+
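+### cv2/numpy not installed
+
+```
+⚠️ cv2 未安装，小脸增强不可用
+```
+
+**Note**: `cv2` and `numpy` are lazy imports (`try/except`). When they are missing, small-face enhancement is disabled automatically without affecting backend startup or other features. Installing `opencv-python-headless` restores it.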
+
+---
+
+## 已知限制 (v1.4)
+
+- 仅覆盖本地 lipsync 路径 (`_local_generate()`)，远程模式 (`_remote_generate()`) 暂不接入
+- 多镜头仅全局判定，不做逐段小脸检测
+- 仅保证单人 (主脸) 场景稳定，不做多人脸切换
+- CodeFormer 超分需额外安装 `basicsr`，当前推荐使用 GFPGAN
+
+---
+
+## v1.3 → v1.4 变更记录
+
+| 修复项 | 说明 |
+|--------|------|
+| 眼部重影修复 | mask 起点下移到 68% + 左右 16% 留白，减少上半脸与鼻翼参与融合 |
+| Poisson 扩散抑制 | seamlessClone 后按 mask 二次限域，避免眼部上方 ghosting |
+
+---
+
+## v1.2 → v1.3 变更记录
+
+| 修复项 | 说明 |
+|--------|------|
+| 时基修复 | `_crop_and_upscale_video()` 输出 fps 跟随源视频 fps，避免增强视频时间轴被拉伸 |
+| 贴回对齐修复 | `blend_back()` 改为按 `orig_fps/ls_fps` 映射原始帧索引，减少动作变慢与重影 |
+| 音轨修复 | 贴回成功后新增音轨封装（mux），避免增强路径无声音 |
+
+---
+
+## v1.1 → v1.2 变更记录
+
+| 修复项 | 说明 |
+|--------|------|
+| 空输出保护 | `blend_back()` 新增 `ls_frames <= 0` 判断，直接抛错并由外层 fail-open 回退，避免写出空视频 |
+
+---
+
+## v1.0 → v1.1 变更记录
+
+| 修复项 | 说明 |
+|--------|------|
+| ffprobe 解析 | CSV → JSON 格式，字段名访问，不再受 `nb_frames` 缺失导致的字段错位影响 |
+| fps 选取 | 优先 `avg_frame_rate`（真实平均帧率），`r_frame_rate` 作为 fallback；避免 `60/1` 等 timebase 倍数导致帧数估算偏大 |
+| 实际帧数回写 | `_build_face_track()` 用 ffmpeg 实际读到的帧数覆盖估算值，`track.frame_count` 更准确 |
+| 贴回帧数检查 | 放宽为 `lipsync <= original` 时正常贴回，仅 `>` 时报错；适配 MuseTalk/LatentSync 按音频时长输出的行为 |
+| 边界防护 | `streams` 为空时 return None；`r_frame_rate` 分母为 0 时 fallback 25fps |
+| torchvision 兼容 | `_ensure_upscaler()` 中注入 `functional_tensor` shim，兼容 torchvision >= 0.20 |
+| lazy import | `cv2`/`numpy` 包装在 `try/except`，缺失时增强自动禁用不影响后端启动 |
+| 类型注解 | `from __future__ import annotations` 避免依赖缺失时 `np.ndarray` 等注解触发 NameError |
--- a/Docs/FRONTEND_DEV.md
+++ b/Docs/FRONTEND_DEV.md
@@ -1,5 +1,11 @@
 # 前端开发规范

+## 文档定位
+
+- 本文档只定义前端开发规范与约束（结构、交互、持久化、接口调用、Checklist）。
+- 功能说明与启动方式请查看 `Docs/FRONTEND_README.md`。
+- 历史变更请记录在 `Docs/DevLogs/` 与 `Docs/TASK_COMPLETE.md`，不要写入本规范文档。
+
 ## 目录结构

 采用轻量 FSD（Feature-Sliced Design）结构：
@@ -33,8 +39,12 @@ frontend/src/
 │   │       ├── MaterialSelector.tsx
 │   │       ├── ScriptEditor.tsx
 │   │       ├── ScriptExtractionModal.tsx
+│   │       ├── RewriteModal.tsx
+│   │       ├── ScriptLearningModal.tsx
 │   │       ├── script-extraction/
 │   │       │   └── useScriptExtraction.ts
+│   │       ├── script-learning/
+│   │       │   └── useScriptLearning.ts
 │   │       ├── TitleSubtitlePanel.tsx
 │   │       ├── FloatingStylePreview.tsx
 │   │       ├── VoiceSelector.tsx
@@ -62,12 +72,16 @@ frontend/src/
 │   ├── hooks/
 │   │   ├── useTitleInput.ts
 │   │   └── usePublishPrefetch.ts
+│   ├── ui/
+│   │   ├── SelectPopover.tsx   # 统一下拉/BottomSheet 选择器
+│   │   └── AppModal.tsx        # 统一弹窗基座
 │   ├── types/
 │   │   ├── user.ts            # User 类型定义
 │   │   └── publish.ts         # 发布相关类型
-│   └── contexts/              # 全局 Context（Auth、Task）
+│   └── contexts/              # 全局 Context（Auth、Task、Cleanup）
 │       ├── AuthContext.tsx
-│       └── TaskContext.tsx
+│       ├── TaskContext.tsx
+│       └── CleanupContext.tsx
 ├── components/                # 遗留通用组件
 │   └── VideoPreviewModal.tsx
 └── proxy.ts                   # Next.js middleware（路由保护）
@@ -180,6 +194,71 @@ body {

 ---

+## 统一下拉选择器规范 (SelectPopover)
+
+首页/发布页的业务选择项（音色、参考音频、配音、素材、BGM、作品、样式、模型、画面比例）统一使用 `@/shared/ui/SelectPopover`：
+
+- 桌面端使用 Popover，移动端自动切换 BottomSheet
+- 触发器与面板风格统一：`border-white/10 + bg-black/25`（或同级变体）
+- 下拉项选中态统一：`border-purple-500 bg-purple-500/20`
+- 选中项需添加 `data-popover-selected="true"`，确保再次打开时自动滚动定位到已选项
+- 底部空间不足时自动上拉；滚动条隐藏但保留滚动能力
+
+### 视频预览与下拉层级
+
+- 下拉菜单层级应低于视频预览弹窗，避免遮挡预览内容
+- 在下拉内点击“预览”时，不强制关闭下拉（便于连续预览）
+- 关闭预览后，用户可继续在下拉内操作；点击外部时下拉正常收起
+
+### 例外说明
+
+- `ScriptEditor` 的“历史文案 / AI多语言”保持原有轻量菜单样式，不强制迁移到 `SelectPopover`
+
+---
+
+## 统一弹窗规范 (AppModal)
+
+所有居中弹窗（如视频预览、文案提取、AI 改写、文案深度学习、文案扩展编辑、录音、密码修改）统一使用 `@/shared/ui/AppModal` + `AppModalHeader`：
+
+- 统一遮罩与层级：`fixed inset-0` + `bg-black/80` + `backdrop-blur-sm` + 明确 `z-index`
+- 统一挂载位置：通过 Portal 挂载到 `document.body`，避免局部容器/层叠上下文影响，确保是全页面弹窗
+- 统一容器风格：`border-white/10`、深色半透明背景、圆角 `rounded-2xl`、重阴影
+- 统一关闭行为：支持 `ESC`；是否允许点击遮罩关闭通过 `closeOnOverlay` 显式配置
+- 默认策略：除关键流程外，`closeOnOverlay` 默认应为 `true`，并通过 `AppModalHeader onClose` 提供右上角 `X` 关闭入口
+- 关键流程例外：发布成功清理弹窗（`CleanupContext`）必须保持 `closeOnOverlay=false`，且不提供右上角关闭按钮
+- 录音弹窗例外：使用 `closeOnOverlay={!isRecording}`，录音中禁止遮罩关闭，避免误触中断
+- 统一滚动策略：弹窗打开时锁定背景滚动（`lockBodyScroll`），内容区自行滚动
+- 特殊层级场景（例如视频预览压过下拉）使用更高 `z-index`（如 `z-[320]`）
+
+### 文案类弹窗结果操作栏规范
+
+适用组件：
+- `ScriptExtractionModal`
+- `RewriteModal`
+- `ScriptLearningModal`
+
+统一要求：
+- 结果页操作按钮统一放在内容底部（Action Grid），避免“标题右上角按钮 + 底部按钮”混排。
+- 主按钮统一为高亮渐变（如「填入文案」），其余按钮统一次级样式（`bg-white/10`）。
+- 动作文案尽量统一：`填入文案` / `复制` / `重新生成`（或与当前流程等价的返回动作）。
+- 按钮尺寸、圆角、间距保持一致（推荐 `py-2.5 px-3 rounded-lg text-sm`）。
+
+---
+
+## 发布后清理弹窗规范 (CleanupContext)
+
+发布页由 `CleanupContext` 统一承接“全部平台发布成功后的清理引导”，规则如下：
+
+- 触发条件：仅当本次发布结果 **全部成功** 才触发弹窗；有任一失败则走原内联结果展示。
+- 持久化恢复：`cleanup_pending` 写入 localStorage，支持刷新/跳转后恢复；带 `createdAt`，24 小时自动过期。
+- 清理顺序：必须先调用 `POST /api/videos/cleanup`；仅在接口成功后才清本地输入字段并关闭弹窗。
+- 状态同步：清理成功后派发 `vigent:workspace-cleared` 事件，当前发布页输入态需就地重置（避免“localStorage 已清空但页面仍显示旧值”）。
+- 失败处理：接口失败时保留弹窗和输入数据，允许重试；连续失败达到阈值后显示“暂不清理，继续使用”。
+- 本地清理范围：仅输入内容（文案/标题/副标题/发布标题/标签），不清用户偏好（样式、字号、边距、模型、BGM 等）。
+- 下载策略：弹窗“下载视频备份”必须使用同源下载接口（`/api/videos/generated/{id}/download`），不要直接使用签名 URL 作为 `href`。
+
+---
+
 ## API 请求规范

 ### 必须使用 `api` (axios 实例)
@@ -187,6 +266,7 @@ body {
 所有需要认证的 API 请求**必须**使用 `@/shared/api/axios` 导出的 axios 实例。该实例已配置：
 - 自动携带 `credentials: include`
 - 遇到 401/403 时自动清除 cookie 并跳转登录页
+- AI/Tools 接口（如 `/api/ai/*`、`/api/tools/extract-script`、`/api/tools/analyze-creator`、`/api/tools/generate-topic-script`）现为强制鉴权，禁止匿名 `fetch` 直调

 **使用方式：**

@@ -346,8 +426,9 @@ useEffect(() => {
 - `shared/api`：Axios 实例与统一响应类型
 - `shared/lib`：通用工具函数（media.ts / auth.ts / title.ts）
 - `shared/hooks`：跨功能通用 hooks
+- `shared/ui`：跨功能通用 UI（如 SelectPopover）
 - `shared/types`：跨功能实体类型（User / PublishVideo 等）
-- `shared/contexts`：全局 Context（AuthContext / TaskContext）
+- `shared/contexts`：全局 Context（AuthContext / TaskContext / CleanupContext）
 - `components/`：遗留通用组件（VideoPreviewModal）

 ## 类型定义规范
@@ -366,11 +447,14 @@ useEffect(() => {
  - 标题样式 ID / 字幕样式 ID
  - 标题字号 / 字幕字号
  - 标题显示模式（`short` / `persistent`）
-  - 背景音乐选择 / 音量 / 开关状态
+  - 唇形模型模式（`default` / `fast` / `advanced`）
+  - 背景音乐选择 / 开关状态（当前前端不提供音量滑杆，生成时使用固定音量）
  - 输出画面比例（`9:16` / `16:9`）
  - 素材选择 / 历史作品选择
  - 选中配音 ID (`selectedAudioId`)
+  - 选中参考音频 ID (`selectedRefAudio` 对应 id)
  - 语速 (`speed`，声音克隆模式)
+  - 语气 (`emotion`，声音克隆模式)
  - 时间轴段信息 (`useTimelineEditor` 的 localStorage)

 ### 历史文案（独立持久化）
@@ -406,6 +490,7 @@ useEffect(() => {
 - 发布按钮在未选择任何平台时禁用
 - 仅保留"立即发布"，不再提供定时发布 UI/参数
 - **作品选择持久化**：使用 `video.id`（稳定标识）而非 `video.path`（签名 URL）进行选择、比较和 localStorage 存储。发布时根据 `id` 查找对应 `path` 发送请求。
+- **新作品优先级**：检测到“刚生成的新视频”时，页面首次恢复优先选中最新视频；之后用户手动改选会继续按持久化值恢复。

 ---

@@ -457,6 +542,11 @@ await api.post('/api/videos/generate', {

 使用 `MediaRecorder` API 录制音频，格式为 `audio/webm`，上传后后端自动转换为 WAV (16kHz mono)。

+- 录音入口放在“我的参考音频”区域底部右侧（与“上传音频”并排）。
+- 录音交互使用弹窗：开始/停止 -> 试听 -> 使用此录音 / 弃用本次录音。
+- 关闭录音弹窗时如仍在录制，会先停止录音再关闭。
+- 录音中禁止点击遮罩关闭（`closeOnOverlay={!isRecording}`）；未录音时允许遮罩关闭。
+
 ```typescript
 // 录音需要用户授权麦克风
 const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
@@ -472,5 +562,5 @@ const mediaRecorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
 ### UI 结构

 配音方式使用 Tab 切换：
-- **EdgeTTS 音色** - 预设音色 2x3 网格
-- **声音克隆** - 参考音频列表 + 在线录音 + 语速下拉菜单 (5 档: 较慢/稍慢/正常/稍快/较快)
+- **EdgeTTS 音色** - 统一下拉选择（显示“音色名 + 语言”）
+- **声音克隆** - 参考音频选择器（含试听/重命名/删除/重识别）+ 底部右侧上传/录音入口（录音弹窗）+ 语速/语气下拉
--- a/Docs/FRONTEND_README.md
+++ b/Docs/FRONTEND_README.md
@@ -2,69 +2,84 @@

 ViGent2 的前端界面，采用 Next.js 16 + TailwindCSS 构建。

+## 📌 文档定位
+
+- 本文档用于说明前端功能、运行方式与目录概览（面向使用与协作）。
+- 开发规范与实现约束请查看 `Docs/FRONTEND_DEV.md`。
+- 历史变更与里程碑请查看 `Docs/DevLogs/` 与 `Docs/TASK_COMPLETE.md`。
+
 ## ✨ 核心功能

 ### 1. 视频生成 (`/`)
-- **一、文案提取与编辑**: 文案输入/提取/翻译/保存。
+- **一、文案提取与编辑**: 文案输入/提取/翻译/保存；输入框右下角支持一键扩展到大编辑器。
 - **二、配音**: 配音方式（EdgeTTS/声音克隆）+ 配音列表（生成/试听/管理）合并为一个板块。
 - **三、素材编辑**: 视频素材（上传/选择/管理）+ 时间轴编辑（波形/色块/拖拽排序）合并为一个板块。
-- **四、标题与字幕**: 片头标题/副标题/字幕样式配置；短暂显示/常驻显示；样式预览使用视频片头帧作为真实背景 (Day 28)。
-- **五、背景音乐**: 试听 + 音量控制 + 选择持久化。
+- **四、标题与字幕**: 片头标题/副标题/字幕样式配置；短暂显示/常驻显示；样式预览使用视频片头帧作为真实背景。
+- **五、背景音乐**: 试听 + 搜索选择 + 选择持久化（无音量滑杆，生成时固定混音系数）。
 - **六、作品**（右栏）: 作品列表 + 作品预览合并为一个板块。
 - **进度追踪**: 实时显示视频生成进度 (10% -> 100%)。
 - **作品预览**: 生成完成后直接播放下载（作品预览 + 历史作品）。
+- **下载直达**: 首页作品下载与发布成功弹窗下载统一走同源下载接口（`/api/videos/generated/{id}/download`），避免新标签页在线播放。
 - **预览优化**: 预览视频 `metadata` 预取，首帧加载更快。
-- **本地保存**: 文案/标题/偏好由 `useHomePersistence` 统一持久化，刷新后恢复 (Day 14/17)。
-- **历史文案**: 手动保存/加载/删除历史文案，独立 localStorage 持久化 (Day 23)。
-- **选择持久化**: 首页/发布页作品选择均使用稳定 `id` 持久化，刷新保持用户选择；新视频生成后自动选中最新 (Day 21)。
-- **AI 多语言翻译**: 支持 9 种目标语言翻译文案 + 还原原文 (Day 22)。
+- **本地保存**: 文案/标题/偏好由 `useHomePersistence` 统一持久化，刷新后恢复。
+- **历史文案**: 手动保存/加载/删除历史文案，独立 localStorage 持久化。
+- **选择持久化**: 首页/发布页作品选择均使用稳定 `id` 持久化；新视频生成后优先选中最新，后续用户手动选择持续持久化恢复。
+- **统一下拉交互**: 首页/发布页业务选择器统一为 SelectPopover（支持自动上拉、已选定位、移动端 BottomSheet）；`ScriptEditor` 的“历史文案 / AI多语言”为产品例外，保留原轻量菜单。
+- **AI 多语言翻译**: 支持 9 种目标语言翻译文案 + 还原原文。

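-### 2. Fully automated publishing (`/publish`) [added Day 7]
+### 2. Fully automated publishing (`/publish`)
 - **Multi-platform management**: manage the account status of Douyin, WeChat Channels, Bilibili, and Xiaohongshu in one place.
 - **QR-code login**: 
  - Integrates the QR code generated by Playwright on the backend.
  - Detects scan status in real time (Wait/Success).
  - Cookies are saved automatically and status is synchronized.
 - **Publish settings**: set the video title, tags, and description.
- **Work selection**: card list + search + preview modal.
- **Persisted selection**: persisted by stable `video.id` and kept across refreshes; the latest video is auto-selected after generation (Day 21).
+- **Work selection**: SelectPopover dropdown + search + preview modal (items can be previewed one after another inside the dropdown without forcing it to collapse).
+- **Persisted selection**: persisted by stable `video.id` and kept across refreshes; the latest video is auto-selected after generation.
 - **Preview compatibility**: signed URLs and relative paths can both be previewed directly.
 - **Publish mode**: only "publish now" is supported.
+- **Post-publish cleanup modal**: once publishing succeeds on all platforms, `CleanupModal` is shown (successful platforms, screenshots, backup download, cleanup button); it can be restored after a refresh or navigation.
+- **Cleanup failure fallback**: if the cleanup endpoint fails, the modal stays open and local input is not cleared; once consecutive failures reach a threshold, the user may choose "skip cleanup for now and keep working".
+- **Cleanup scope**: only input content fields are cleared (script/title/secondary title/publish title/tags); user preferences such as styles, font sizes, margins, and model are kept.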

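-### 3. Voice cloning [added Day 13]
- **TTS mode selection**: switch between EdgeTTS (preset voices) and voice cloning (custom voices).
+### 3. Voice cloning
+- **TTS mode selection**: switch between EdgeTTS and voice cloning; voices are chosen from a unified dropdown (shows voice name + language).
+- **Voice preview**: the EdgeTTS voice list supports one-click preview, automatically picking a fixed sample text in the language that matches the voice's locale.
 - **Reference audio management**: upload/list/rename/delete reference audio; after upload, Whisper automatically transcribes the ref_text and audio longer than 10s is trimmed automatically.
+- **Recording entry**: the bottom right of the reference audio area offers "Upload audio / Record"; recording uses a modal flow (record -> preview -> use/discard).
+- **Accidental-close protection**: closing via the overlay is disabled while recording (to avoid interrupting by accident); when not recording, clicking the blank area closes the modal.
 - **Re-transcribe**: older reference audio can be re-transcribed and trimmed (RotateCw button).
 - **One-click cloning**: after a reference audio is selected, the CosyVoice 3.0 service is called automatically.
- **Speed control**: 5 speed levels (0.8-1.2) in voice cloning mode; selection persisted (Day 23).
- **Tone control**: 4 tones (normal/cheerful/deep/serious) in voice cloning mode, based on CosyVoice3 `inference_instruct2`; selection persisted (Day 29).
- **Multilingual support**: EdgeTTS voice list for 10 languages; `language` passed through for voice cloning (Day 22).
+- **Speed control**: 5 speed levels (0.8-1.2) in voice cloning mode; unified dropdown, selection persisted.
+- **Tone control**: 4 tones (normal/cheerful/deep/serious) in voice cloning mode; unified dropdown, selection persisted.
+- **Multilingual support**: EdgeTTS voice list for 10 languages; `language` passed through for voice cloning.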

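-### 4. Voice-over first + timeline arrangement [added Day 23]
+### 4. Voice-over first + timeline arrangement
 - **Standalone voice-over generation**: generate the voice-over first → select it → then choose materials → generate the video.
 - **Voice-over management panel**: generate/preview/rename/delete/select, with asynchronous generation + progress polling.
- **Timeline editor**: wavesurfer.js audio waveform + color blocks visualizing material assignment; drag the dividers to adjust each segment's duration.
- **Material trim settings**: ClipTrimmer dual-handle range slider + HTML5 video preview playback.
- **Drag-to-reorder**: timeline color blocks support HTML5 Drag & Drop to swap material order.
+- **Timeline editor**: wavesurfer.js audio waveform + main material playing continuously as the background + floating insert-shot blocks; drag to move a block, click to open a modal for editing its trim range and duration.
+- **Material trim settings**: ClipTrimmer dual-handle range slider + HTML5 video preview playback (one shared entry for the main material and insert blocks).
+- **Multi-shot model**: the main material loops to fill the audio duration; the remaining materials are insert candidates that can be added multiple times anywhere on the timeline; "set as main material" switching is supported.
 - **Custom assignment**: the backend `custom_assignments` supports user-defined material assignment plans (including the `source_start/source_end` trim range).
 - **Timeline semantics alignment**: when materials exceed the audio, only the visible segments are kept and the last one is cut flush, so the overflow does not take part in generation; when they fall short of the audio, the last visible segment loops automatically to fill the gap.
 - **Aspect ratio control**: the top of the timeline offers a `9:16 / 16:9` output ratio choice; the setting is persisted and passed through to the backend.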

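-### 5. Subtitles and titles [added Day 13]
- **Intro title**: optional, limited to 15 characters; supports "short display / persistent display", defaulting to short display (4 seconds), applied to both the title and the secondary title.
- **Intro secondary title**: optional, limited to 20 characters; shown below the main title for extra explanation or a teaser; independent style settings (font/size/color/spacing); can be generated by AI together with the title; shares the display mode with the title; shown only in the video frame and not used in the publish title (Day 25).
+### 5. Subtitles and titles
+- **Intro title**: optional, limited to 15 characters; supports "short display / persistent display", defaulting to short display (4 seconds); in `persistent display` mode both the main title and the secondary title stay on screen for the whole video.
+- **Intro secondary title**: optional, limited to 20 characters; shown below the main title for extra explanation or a teaser; independent style settings (font/size/color/spacing); can be generated by AI together with the title; shares the display mode with the title; shown only in the video frame and not used in the publish title.
 - **Title sync**: edits to the intro title on the home page are synced to the publish title.
- **Word-by-word highlighted subtitles**: karaoke effect, on by default, can be turned off.
+- **Word-by-word highlighted subtitles**: karaoke effect, on by default.
 - **Automatic alignment**: character-level timestamps generated with faster-whisper.
- **Style presets**: title/subtitle/secondary title style selection + preview + font size adjustment (Day 16/25).
- **Default styles**: title 90px ZCOOL KuaiLe; subtitles 60px classic yellow text + DingTalkJinBuTi (Day 17).
- **Style persistence**: title/subtitle/secondary title styles and font sizes survive a refresh (Day 17/25).
+- **Style presets**: title/subtitle/secondary title style selection + preview + font size adjustment.
+- **Default styles**: title 90px ZCOOL KuaiLe; subtitles 60px classic yellow text + DingTalkJinBuTi.
+- **Style persistence**: title/subtitle/secondary title styles and font sizes survive a refresh.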

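-### 6. Background music [added Day 16]
- **Preview**: clicking preview also selects the track; the volume slider takes effect in real time.
- **Mix control**: affects only the BGM; the voice-over keeps its original volume.
+### 6. Background music
+- **Preview**: tracks can be previewed directly inside the dropdown list.
+- **Selection experience**: the same searchable selector as on the publish page; it scrolls to the current selection when opened.
+- **Mix control**: the frontend currently provides no volume slider; generation uses a fixed `bgm_volume=0.2` to keep the voice-over volume stable.

-### 7. Account settings [added Day 15]
+### 7. Account settings
 - **Phone number login**: login verified with an 11-digit mainland China phone number.
 - **Account dropdown menu**: shows the phone number (middle four digits masked) + expiry date + change password + secure sign-out.
 - **Change password**: a modal asks for the current and new passwords; a new login is forced after the change.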
@@ -76,12 +91,12 @@ ViGent2 的前端界面，采用 Next.js 16 + TailwindCSS 构建。
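 - **Renewal on expiry**: after a membership expires, login redirects automatically to the payment page for renewal; the flow is the same as the first purchase.
 - **Admin activation**: manual activation by an administrator remains available; the two methods do not affect each other.

-### 8. Script extraction assistant (`ScriptExtractionModal`) [added Day 15]
- **Multi-source extraction**: supports drag-and-drop file upload and pasted URLs (Bilibili/Douyin/TikTok).
- **AI rewrite**: integrates GLM-4.7-Flash to rewrite content into a spoken-word script automatically.
- **Custom prompt**: the rewrite prompt can be customized, with the default used when left empty; the setting is persisted to localStorage (Day 25).
- **One-click fill**: the extraction result is filled directly into the video generation input box.
- **Smart interaction**: real-time progress display and accidental-click protection.
+### 9. Script writing assistants (3 modals)
+- **Script extraction assistant** (`ScriptExtractionModal`): supports file upload and URL extraction (login required); the result can be filled into the main editor in one click.
+- **AI rewrite** (`RewriteModal`): rewrites scripts with GLM-4.7-Flash and supports a persisted custom prompt.
+- **Script deep learning** (`ScriptLearningModal`): takes a Douyin/Bilibili creator's profile page, analyzes trending topics, and generates a spoken-word script (login required).
+- **Unified result action bar**: the result pages of all three modals share the bottom Action Grid style; the primary button is "Fill in script" and the secondary buttons are "Copy / Regenerate (or an equivalent back action)".
+- **Login authentication**: these rely on protected endpoints (`/api/tools/*`, `/api/ai/*`); when not logged in, the global 401 handler redirects to login.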

 ## 🛠️ Tech stack

@@ -92,7 +107,7 @@ ViGent2 的前端界面，采用 Next.js 16 + TailwindCSS 构建。
 - **Audio waveform**: wavesurfer.js (timeline editor)
 - **API**: Axios instance `@/shared/api/axios` (talks to the FastAPI backend on :8006)

-## 🚀 Development guide
+## 🚀 Quick start

 ### Install dependencies

@@ -140,11 +155,12 @@ src/
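 - **Shared URL helpers**: `@/shared/lib/media` provides `resolveMediaUrl` / `resolveAssetUrl`
 - **Proxy configuration**: Next.js Rewrites (if needed) or direct CORS.

-## 🎨 Design guidelines
+## 🎨 UI notes (overview)

- **Primary colors**: deep purple/black palette (Dark Mode)
- **Interaction**: hover micro-animations (Hover Effects); action buttons are semi-transparent but visible by default (opacity-40) and fully lit on hover, which also works for touch devices
- **Responsive**: adapted to desktop and mobile; publish-page platform cards use a responsive layout (compact on mobile, spacious on desktop)
- **Scrolling**: list scrollbars are hidden everywhere (hide-scrollbar); the page returns to the top after a refresh (browser scroll restoration disabled + time gate on list scrolling)
- **Style preview**: floating preview window, 280px at the top left on desktop and 160px at the bottom right on mobile (does not cover controls)
- **Input assistance**: real-time character counters on the title/secondary title inputs, turning red when over the limit
+- Business selectors all use `SelectPopover` (Popover on desktop / BottomSheet on mobile); the "Script history / AI multilingual" menus in `ScriptEditor` keep the original lightweight menu.
+- Business modals all use `AppModal` (shared overlay, header, close behavior, and scrolling strategy).
+- Modal close policy: `ESC` / `X` / clicking the blank area close a modal by default; only the post-publish cleanup modal is a forced flow (no close on blank-area click, and no `X`).
+- Result-page buttons in script modals are unified: bottom Action Grid, consistent primary/secondary hierarchy, and consistent action names (fill in/copy/regenerate).
+- The video preview modal is layered above dropdown menus; items can be previewed one after another inside a dropdown.
+- Pages are adapted to both desktop and mobile; long lists hide their scrollbars.
+- For detailed UI, persistence, and interaction rules, see `Docs/FRONTEND_DEV.md`.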
--- a/Docs/LatentSync_DEPLOY.md
+++ b/Docs/LatentSync_DEPLOY.md
@@ -137,11 +137,9 @@ CUDA_VISIBLE_DEVICES=1 python -m scripts.inference \
 └── DEPLOY.md
 ```

---
-
---
-
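-## Step 7: Performance optimization (preloaded model service)
+---
+
+## Step 6: Performance optimization (preloaded model service)

 To eliminate the 30-40 seconds of model loading on every video generation, running a resident service is recommended.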

@@ -201,7 +199,7 @@ LatentSync 1.6 需要 ~18GB VRAM。如果遇到 OOM 错误：
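 - `inference_steps`: raising it to 30-50 can improve quality
 - `guidance_scale`: raising it can improve lip sync, but too high a value may cause jitter

-### Encoding pipeline optimization (Day 30)
+### Encoding pipeline optimization (current implementation)

 Two redundant encodes in LatentSync's default internal flow have been optimized away: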

@@ -214,7 +212,7 @@ LatentSync 内部默认流程有两处冗余编码已优化：

 ---

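-### Faceless-frame tolerance (Day 30)
+### Faceless-frame tolerance (current implementation)

 When no face is detected in some frames of the material (head turned, occlusion, empty shot), the whole inference run is no longer aborted: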

--- a/Docs/MUSETALK_DEPLOY.md
+++ b/Docs/MUSETALK_DEPLOY.md
@@ -10,8 +10,8 @@

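 MuseTalk serves as the long-video engine of the **hybrid lip-sync scheme**:

- **Short videos (<120s)** → LatentSync 1.6 (GPU1, port 8007)
- **Long videos (>=120s)** → MuseTalk 1.5 (GPU0, port 8011)
+- **Short videos (<100s, per the current `.env` example)** → LatentSync 1.6 (GPU1, port 8007)
+- **Long videos (>=100s, per the current `.env` example)** → MuseTalk 1.5 (GPU0, port 8011)
 - The routing threshold is controlled by `LIPSYNC_DURATION_THRESHOLD`
 - When MuseTalk is unavailable, routing falls back to LatentSync automatically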
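
The routing rule above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: `choose_lipsync_engine`, `LipsyncRoute`, and the `musetalk_available` flag are hypothetical names, and the default of 100 seconds simply mirrors the `LIPSYNC_DURATION_THRESHOLD` value in the `.env` example.

```python
from dataclasses import dataclass


@dataclass
class LipsyncRoute:
    engine: str  # "latentsync" or "musetalk"
    reason: str


def choose_lipsync_engine(
    audio_duration_s: float,
    threshold_s: float = 100.0,  # assumed to come from LIPSYNC_DURATION_THRESHOLD
    musetalk_available: bool = True,  # hypothetical health-check result
) -> LipsyncRoute:
    """Route by audio duration; fall back to LatentSync if MuseTalk is down."""
    if audio_duration_s < threshold_s:
        return LipsyncRoute("latentsync", "below threshold")
    if not musetalk_available:
        return LipsyncRoute("latentsync", "musetalk unavailable, fallback")
    return LipsyncRoute("musetalk", "at or above threshold")
```

Note that the boundary is inclusive on the MuseTalk side: a clip of exactly the threshold length is routed to MuseTalk, matching the `>=` in the list above.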

@@ -196,7 +196,7 @@ MUSETALK_ENCODE_CRF=14                   # CRF 越小越清晰 (14≈接近视
 MUSETALK_ENCODE_PRESET=slow              # x264 preset (slow = high compression efficiency)

 # Hybrid lip-sync routing
-LIPSYNC_DURATION_THRESHOLD=120           # seconds; MuseTalk is used at >= this value
+LIPSYNC_DURATION_THRESHOLD=100           # seconds; MuseTalk is used at >= this value
 ```

 > **参数档位参考**：
--- a/Docs/PUBLISH_DEPLOY.md
+++ b/Docs/PUBLISH_DEPLOY.md
@@ -0,0 +1,215 @@
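+# Multi-platform publishing: deployment and implementation notes (Douyin / WeChat Channels / Bilibili / Xiaohongshu)
+
+## 1. Goals
+
+This file gathers the following in one place:
+
+- How platform login (QR-code scan) is implemented
+- How the automated publishing chain is implemented
+- The runtime environment and configuration required for deployment
+- How to quickly locate common failures
+
+Code in scope: `backend/app/modules/publish`, `backend/app/services/publish_service.py`, `backend/app/services/qr_login_service.py`, `backend/app/services/uploader/*`.
+
+---
+
+## 2. Overall architecture
+
+### 2.1 API entry points
+
+- `POST /api/publish`: run a publish
+- `POST /api/publish/login/{platform}`: get the QR code and start a login session
+- `GET /api/publish/login/status/{platform}`: poll the scan status
+- `POST /api/publish/logout/{platform}`: log out and delete the corresponding cookies
+- `POST /api/publish/cookies/save/{platform}`: manually save the browser's `document.cookie`
+- `GET /api/publish/accounts`: query whether each platform is logged in
+- `GET /api/publish/screenshot/{filename}`: read a publish-success screenshot (login required)
+- `POST /api/videos/cleanup`: clean up generated artifacts in the current user's workspace (triggered by the frontend after a successful publish)
+
+Core router file: `backend/app/modules/publish/router.py`.
+
+### 2.2 Service layers
+
+- `PublishService`: platform routing, account isolation, video path handling, and calling the concrete uploader
+- `QRLoginService`: uses Playwright to fetch the QR code, monitor the scan result, and save cookies
+- `*Uploader`: per-platform publishing automation (Douyin/WeChat/Xiaohongshu are based on Playwright, Bilibili on biliup)
+
+### 2.3 Cleanup after a successful publish
+
+- The frontend `CleanupContext` triggers the cleanup modal when all platforms selected for this run have published successfully.
+- When the user clicks cleanup, `POST /api/videos/cleanup` is called first; local input is cleared and the modal closed only after the endpoint succeeds.
+- After a successful cleanup the frontend dispatches the `vigent:workspace-cleared` event, and the current publish page resets its title/tag input state in place.
+- If the endpoint fails, the modal stays open and allows a retry; once consecutive failures reach a threshold, the user may choose "skip cleanup for now and keep working".
+- The modal's "Download video backup" goes through the same-origin download endpoint `GET /api/videos/generated/{video_id}/download`, so the browser saves the file directly instead of playing it in a new tab.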
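
The strict-success contract that the cleanup flow relies on (report success only when every deletion succeeded) can be sketched as below. This is an illustration under stated assumptions, not the backend's real implementation: `cleanup_strict` and the `delete_one` callback are hypothetical, and continuing past a failed deletion to attempt the remaining ones is an assumption the source does not confirm.

```python
from typing import Callable, Iterable


def cleanup_strict(paths: Iterable[str], delete_one: Callable[[str], None]) -> dict:
    """Attempt every deletion; report success only if none of them failed."""
    failed = []
    for path in paths:
        try:
            delete_one(path)
        except Exception as exc:
            # Assumption: record the failure and keep going, so a retry
            # only has the leftover objects to deal with.
            failed.append({"path": path, "error": str(exc)})
    return {"success": not failed, "failed": failed}
```

With a result shaped like this, a caller can keep the modal open whenever `success` is false and offer a retry, which is the behavior described in the list above.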
+
+---
+
+## 3. Cookies and account isolation
+
+### 3.1 Storage paths
+
+- Per-user isolated path: `backend/user_data/{user_uuid}/cookies/{platform}_cookies.json`
+- Legacy-compatible path: `backend/app/cookies/{platform}_cookies.json`
+
+Path management file: `backend/app/core/paths.py`.
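
The two path layouts above can be expressed with `pathlib`. The helper below is a hypothetical sketch (`resolve_cookie_file` is not taken from `paths.py`); in particular, choosing the legacy path only when no `user_uuid` is given is an assumption about how the two layouts coexist.

```python
from pathlib import Path
from typing import Optional


def resolve_cookie_file(backend_root: Path, platform: str, user_uuid: Optional[str]) -> Path:
    """Build the cookie file path for a platform, per-user when possible."""
    filename = f"{platform}_cookies.json"
    if user_uuid:
        # Per-user isolated layout
        return backend_root / "user_data" / user_uuid / "cookies" / filename
    # Legacy shared layout (assumed to apply only when there is no user context)
    return backend_root / "app" / "cookies" / filename
```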
+
+### 3.2 Cookie formats
+
+- `bilibili`: simplified dictionary format (`SESSDATA` / `bili_jct` / `DedeUserID` / `DedeUserID__ckMd5`)
+- `douyin` / `weixin` / `xiaohongshu`: Playwright `storage_state` format (`cookies + origins`)
+
+Related logic: `backend/app/services/publish_service.py` and `backend/app/services/qr_login_service.py`.
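
To make the difference between the two formats concrete, the sketch below reduces a Playwright `storage_state` dictionary (a `cookies` list of objects with `name`/`value` fields, plus `origins`) to the simplified Bilibili-style dictionary. The function names are hypothetical and this is not the project's conversion code; it only illustrates the shapes involved.

```python
BILIBILI_KEYS = ("SESSDATA", "bili_jct", "DedeUserID", "DedeUserID__ckMd5")


def detect_cookie_format(data: dict) -> str:
    """Classify a loaded cookie file as 'storage_state', 'simple_dict', or 'unknown'."""
    if isinstance(data.get("cookies"), list):
        return "storage_state"
    if any(key in data for key in BILIBILI_KEYS):
        return "simple_dict"
    return "unknown"


def storage_state_to_simple(state: dict, keys=BILIBILI_KEYS) -> dict:
    """Keep only the named cookies from a storage_state, as a flat name -> value dict."""
    by_name = {c["name"]: c["value"] for c in state.get("cookies", [])}
    return {key: by_name[key] for key in keys if key in by_name}
```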
+
+---
+
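+## 4. Runtime and deployment requirements
+
+### 4.1 System dependencies
+
+- Python 3.10+
+- Node.js 18+
+- Playwright Chromium (`playwright install chromium`)
+- System Chrome (recommended)
+- Xvfb (recommended, especially for headful Douyin/WeChat)
+
+### 4.2 Startup recommendations
+
+- Start the backend with the script in the repository root: `./run_backend.sh`
+- The script has `xvfb-run` built in, which suits servers without a physical desktop
+
+Script: `run_backend.sh`.
+
+### 4.3 Environment variables (core)
+
+All of these are configured in `backend/.env`; the definitions live in `backend/app/core/config.py`.
+
+- Douyin: `DOUYIN_HEADLESS_MODE`, `DOUYIN_CHROME_PATH`, `DOUYIN_USER_AGENT`, `DOUYIN_LOCALE`, `DOUYIN_TIMEZONE_ID`
+- WeChat: `WEIXIN_HEADLESS_MODE`, `WEIXIN_CHROME_PATH`, `WEIXIN_USER_AGENT`, `WEIXIN_LOCALE`, `WEIXIN_TIMEZONE_ID`, `WEIXIN_TRANSCODE_MODE`
+- Xiaohongshu: `XIAOHONGSHU_HEADLESS_MODE`, `XIAOHONGSHU_CHROME_PATH`, `XIAOHONGSHU_USER_AGENT`, `XIAOHONGSHU_LOCALE`, `XIAOHONGSHU_TIMEZONE_ID`
+- Publish screenshot directory: `PUBLISH_SCREENSHOT_DIR`
+
+Note: the Xiaohongshu settings above are currently used by the publishing uploader; in the QR-code login service, Douyin and WeChat use their own settings, while Bilibili and Xiaohongshu login use generic default browser parameters.
+
+---
+
+## 5. Login implementation (QR-code scan)
+
+Handled uniformly by `QRLoginService`:
+
+1. Open the platform login page and extract the QR code (multiple strategies: CSS/text)
+2. The frontend shows the QR code for the user to scan
+3. The backend monitors changes in the URL and session cookies
+4. After a successful login, the cookie file is saved
+
+Key file: `backend/app/services/qr_login_service.py`.
+
+### 5.1 Douyin
+
+- Login page: `https://creator.douyin.com/`
+- Extra capability: listens to the `check_qrconnect` endpoint and can recognize `redirect_url`
+- Special case: if face verification is triggered, the verification QR code `face_verify_qr` is extracted and returned to the frontend
+
+### 5.2 WeChat Channels
+
+- Login page: `https://channels.weixin.qq.com/platform/`
+- QR code extraction supports fallback selectors such as `img/canvas/svg`
+
+### 5.3 Xiaohongshu
+
+- Login page: `https://creator.xiaohongshu.com/`
+- Key fix: the page may land on SMS login by default, so it is switched to QR-code mode automatically before the QR code is extracted
+- Success detection also accepts `/new/home`, so it no longer depends only on the old `success_indicator`
+
+### 5.4 Bilibili
+
+- Login page: `https://passport.bilibili.com/login`
+- After a successful scan, the core cookie fields Bilibili needs are saved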
+
+---
+
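+## 6. Automated publishing implementation
+
+### 6.1 Douyin (Playwright)
+
+File: `backend/app/services/uploader/douyin_uploader.py`
+
+- Opens the browser context with `storage_state`
+- Navigates to the upload page automatically and triggers the file chooser to upload
+- After the upload completes, fills in the title/description/topics and handles the cover when needed
+- Publish success detection: page navigation, API signals, verification on the management page
+- On success, writes the cookies back and saves a publish-success screenshot
+
+### 6.2 WeChat Channels (Playwright)
+
+File: `backend/app/services/uploader/weixin_uploader.py`
+
+- Enters the Channels creator platform and locates the upload entry automatically
+- Per the current product rules, the title/description/tags are all written into the "video description" field
+- Publish success detection: the `post_create` API or the page leaving the creation page
+- On success, writes the cookies back and saves a publish-success screenshot
+
+### 6.3 Xiaohongshu (Playwright)
+
+File: `backend/app/services/uploader/xiaohongshu_uploader.py`
+
+- Enters the publish page automatically and triggers the upload
+- Upload-stage hardening:
+  - `UPLOAD_SIGNAL_TIMEOUT` window for detecting that the upload has started
+  - For video files without an extension, a temporary file with an extension is prepared automatically (`hardlink/copy`)
+  - File name extension consistency check
+  - `UPLOAD_IDLE_TIMEOUT` idle-timeout protection, to avoid long apparent hangs
+- Publish success detection: URL navigation + success text + publish API signal
+- On success, writes the cookies back and returns the success screenshot URL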
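
The "temporary file with an extension (`hardlink/copy`)" step for Xiaohongshu uploads can be sketched as follows. This is a hedged illustration rather than the uploader's code: `ensure_suffixed_copy` is a hypothetical helper, and the `.mp4` default and temp-directory naming are assumptions. It tries a hard link first (no extra disk usage) and falls back to a copy when linking is not possible, for example across filesystems.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def ensure_suffixed_copy(video: Path, suffix: str = ".mp4", tmp_dir: Optional[Path] = None) -> Path:
    """Return a path with a file extension, creating a temp hardlink/copy if needed."""
    if video.suffix:
        return video  # already has an extension; upload it as-is
    tmp_dir = tmp_dir or Path(tempfile.mkdtemp(prefix="upload_"))
    target = tmp_dir / (video.name + suffix)
    try:
        os.link(video, target)  # cheap: same data, new name
    except OSError:
        shutil.copy2(video, target)  # fallback, e.g. different filesystem
    return target
```

The caller remains responsible for removing the temporary directory once the upload has finished.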
+
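+### 6.4 Bilibili (biliup)
+
+File: `backend/app/services/uploader/bilibili_uploader.py`
+
+- Uses the biliup SDK and does not depend on a Playwright publishing flow
+- Reads the Bilibili cookies, then calls biliup to upload and submit
+- Returns the link for the `bvid/aid` (if the API returns one)
+
+---
+
+## 7. Debugging and troubleshooting
+
+### 7.1 Backend logs
+
+- PM2 output log: `~/.pm2/logs/vigent2-backend-out.log`
+- PM2 error log: `~/.pm2/logs/vigent2-backend-error.log`
+
+### 7.2 Common problems
+
+- Symptom: the login QR code cannot be obtained
+  - First check whether the platform's login page has been redesigned (selectors no longer match)
+  - For Xiaohongshu, confirm whether the page is still on the SMS login view
+
+- Symptom: publishing appears to hang
+  - Check whether it stays at "waiting for upload status / waiting for publish result" for a long time
+  - For Xiaohongshu, first check the uploaded file name's extension and MIME detection
+
+- Symptom: a new login is suddenly required
+  - Usually the cookies have expired or the platform's risk control has kicked in; scan the QR code again
+
+### 7.3 Debug artifacts
+
+- Enabling the corresponding `*_DEBUG_ARTIFACTS` setting outputs debug screenshots/network logs
+- Success screenshots are returned to the frontend via `/api/publish/screenshot/{filename}`
+
+---
+
+## 8. Recommended acceptance flow (after every deployment)
+
+1. Health check: `curl http://127.0.0.1:8006/health`
+2. Login check: trigger QR-code login on each of the 4 platforms and confirm that status polling can reach success
+3. Publish check: publish 1 test video on each of the four platforms (or at least cover the platforms changed that day)
+4. Screenshot check: confirm the success screenshot can be fetched via `/api/publish/screenshot/{filename}`
+5. Log check: confirm there are no endless retries, no long idle spins, and no obvious storm of selector failures
+
+---
+
+## 9. Related documents
+
+- Main deployment guide: `Docs/DEPLOY_MANUAL.md`
+- Backend overview: `Docs/BACKEND_README.md`
+- Change log for the day: `Docs/DevLogs/Day31.md`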
--- a/Docs/QWEN3_TTS_DEPLOY.md
+++ b/Docs/QWEN3_TTS_DEPLOY.md
@@ -1,6 +1,10 @@
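 # Qwen3-TTS 1.7B deployment guide

 > This document describes how to deploy the Qwen3-TTS 1.7B-Base voice cloning model on an Ubuntu server.
+>
+> ⚠️ **Status: archived (no longer in use)**
+> The project's production environment has switched to CosyVoice 3.0; please refer to `Docs/COSYVOICE3_DEPLOY.md` first.
+> This document is kept only for tracing the old approach and is not recommended for new deployments.

 ## System requirements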

--- a/Docs/SUBTITLE_DEPLOY.md
+++ b/Docs/SUBTITLE_DEPLOY.md
@@ -24,7 +24,7 @@
  音频 → faster-whisper → 字幕JSON ─────────────────────────────────────────────┴→ Remotion合成 → 最终视频
 ```

-> **Lip-sync routing**: short videos (<120s) use LatentSync 1.6 (GPU1), long videos (>=120s) use MuseTalk 1.5 (GPU0), controlled by `LIPSYNC_DURATION_THRESHOLD`.
+> **Lip-sync routing**: short videos (<100s, per the current `.env` example) use LatentSync 1.6 (GPU1), long videos (>=100s, per the current `.env` example) use MuseTalk 1.5 (GPU0), controlled by `LIPSYNC_DURATION_THRESHOLD`.

 ## System requirements

@@ -146,8 +146,8 @@ remotion/
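 | Stage | Progress | Description |
 |------|------|------|
 | Download materials | 0% → 5% | Download the input video from Supabase |
-| TTS voice generation | 5% → 25% | EdgeTTS / Qwen3-TTS / download of pre-generated voice-over |
-| Lip sync | 25% → 80% | LatentSync inference |
+| TTS voice generation | 5% → 25% | EdgeTTS / CosyVoice / download of pre-generated voice-over |
+| Lip sync | 25% → 80% | LatentSync / MuseTalk (routed by threshold) |
 | Subtitle alignment | 80% → 85% | faster-whisper generates character-level timestamps |
 | Remotion rendering | 85% → 95% | Composite subtitles and titles |
 | Upload result | 95% → 100% | Upload to Supabase Storage |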
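
One way to read the table above: each stage owns a slice of the overall 0-100% bar, and a stage's local progress is mapped linearly into its slice. The sketch below illustrates that arithmetic only; the stage keys and the `overall_progress` helper are hypothetical, and the real workflow may report progress differently.

```python
# Stage -> (start %, end %) taken from the table; the key names are illustrative.
STAGES = {
    "download": (0, 5),
    "tts": (5, 25),
    "lipsync": (25, 80),
    "align": (80, 85),
    "render": (85, 95),
    "upload": (95, 100),
}


def overall_progress(stage: str, fraction: float) -> float:
    """Map a stage-local fraction in [0, 1] to the overall percentage."""
    start, end = STAGES[stage]
    fraction = max(0.0, min(1.0, fraction))  # clamp out-of-range inputs
    return start + (end - start) * fraction
```

For example, lip sync at half completion sits at 25 + 55 × 0.5 = 52.5% overall.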
@@ -305,4 +305,4 @@ WhisperService(device="cuda:0")  # 或 "cuda:1"
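 | 2026-02-27 | 1.3.0 | Architecture diagram updated for MuseTalk hybrid routing; Remotion concurrent rendering raised from 8 to 16; GPU allocation notes updated |
 | 2026-02-28 | 1.3.1 | MuseTalk compositing stage optimized: pure numpy blending + FFmpeg pipe with NVENC GPU hardware encoding replaces double encoding |
 | 2026-02-28 | 1.4.0 | compose uses stream copy instead of re-encoding; FFmpeg timeout protection (600s/30s); Remotion concurrency 16→4; Whisper timestamp smoothing + original-text rhythm mapping; global video generation Semaphore(2); Redis task TTL |
-| 2026-03-02 | 1.5.0 | Remotion bundle cache fix (hard-link videos/fonts into the cached public directory); encoding pipeline optimization prepare_segment/normalize CRF 23→18; multi-material concat switched to stream copy |
+| 2026-03-02 | 1.5.0 | Remotion bundle cache fix (hard-link videos/fonts into the cached public directory); encoding pipeline optimization prepare_segment/normalize CRF 23→18; multi-material concat switched to stream copy; MuseTalk compositing switched to a rawvideo pipe + `libx264` (configurable CRF/preset) |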
--- a/Docs/task_complete.md
+++ b/Docs/task_complete.md
@@ -1,302 +1,380 @@
-# ViGent2 开发任务清单 (Task Log)
-
-**项目**: ViGent2 数字人口播视频生成系统
-**进度**: 100% (Day 30 - Remotion 缓存修复 + 编码流水线质量优化)
-**更新时间**: 2026-03-02
-
---
-
-## 📅 对话历史与开发日志
-
-> 这里记录了每一天的核心开发内容与 milestone。
-
-### Day 30: Remotion 缓存修复 + 编码流水线质量优化 + 唇形同步容错 (Current)
- [x] **Remotion 缓存 404 修复**: bundle 缓存命中时，新生成的视频/字体文件不在旧缓存 `public/` 目录 → 404 → 回退 FFmpeg（无标题字幕）。改为硬链接（`fs.linkSync`）当前渲染所需文件到缓存目录。
- [x] **LatentSync `read_video` 跳过冗余 FPS 重编码**: 检测输入 FPS，已是 25fps 时跳过 `ffmpeg -r 25 -crf 18` 重编码。
- [x] **LatentSync final mux 流复制**: `imageio` CRF 13 写帧后的 mux 步骤从 `libx264 -crf 18` 改为 `-c:v copy`，消除冗余双重编码。
- [x] **`prepare_segment` + `normalize_orientation` CRF 提质**: CRF 23 → 18，与 LatentSync 内部质量标准统一。
- [x] **多素材 concat 流复制**: 各段参数已统一，`concat_videos` 从 `libx264 -crf 23` 改为 `-c:v copy`。
- [x] **编码次数总计**: 从 5-6 次有损编码降至 3 次（prepare_segment → LatentSync/MuseTalk 模型输出 → Remotion）。
- [x] **LatentSync 无脸帧容错**: 素材部分帧检测不到人脸时不再中断推理，无脸帧保留原画面，单素材异常时回退原视频。
- [x] **MuseTalk 管道直编码**: `cv2.VideoWriter(mp4v)` 中间有损文件改为 FFmpeg rawvideo stdin 管道，消除一次冗余有损编码。
- [x] **MuseTalk 参数环境变量化**: 推理与编码参数（detect_every/blend_cache/CRF/preset 等）从硬编码迁移到 `backend/.env`，当前使用质量优先档（CRF 14, preset slow, detect_every 2, blend_cache_every 2）。
- [x] **Workflow 异步防阻塞**: 新增 `_run_blocking()` 线程池辅助，5 处同步 FFmpeg 调用（旋转归一化/prepare_segment/concat/BGM 混音）改为 `await _run_blocking()`，事件循环不再被阻塞。
- [x] **compose 跳过优化**: 无 BGM 时 `final_audio_path == audio_path`，跳过多余的 compose 步骤，Remotion 路径直接用 lipsync 输出，非 Remotion 路径 `shutil.copy` 透传。
- [x] **compose() 异步化**: `compose()` 改为 `async def`，内部 `_get_duration` 和 `_run_ffmpeg` 走 `run_in_executor`。
- [x] **同分辨率跳过 scale**: 多素材逐段比对分辨率，匹配的传 `None` 走 copy 分支；单素材同理。避免已是目标分辨率时的无效重编码。
- [x] **`_get_duration()` 线程池化**: workflow 中 3 处同步 ffprobe 探测改为 `await _run_blocking()`。
- [x] **compose 循环 CRF 统一**: 循环场景 CRF 23 → 18，与全流水线质量标准一致。
- [x] **多素材片段校验**: prepare 完成后校验片段数量一致，防止空片段进入 concat。
- [x] **唇形模型前端选择**: 生成按钮右侧新增模型下拉（默认模型/快速模型/高级模型），全链路透传 `lipsync_model` 到后端路由。默认保持阈值策略，快速强制 MuseTalk，高级强制 LatentSync，三种模式均有 LatentSync 兜底。选择 localStorage 持久化。
-
-### Day 29: 视频流水线优化 + CosyVoice 语气控制
- [x] **字幕同步修复**: Whisper 时间戳三步平滑（单调递增+重叠消除+间隙填补）+ 原文节奏映射（线性插值 + 单字时长钳位）。
- [x] **LatentSync 嘴型参数调优**: inference_steps 16→20, guidance_scale 2.0, DeepCache 启用, Remotion concurrency 16→4。
- [x] **compose 流复制**: 不循环时 `-c:v copy` 替代 libx264 重编码，compose 耗时从分钟级降到秒级。
- [x] **FFmpeg 超时保护**: `_run_ffmpeg()` timeout=600, `_get_duration()` timeout=30。
- [x] **全局并发限制**: `asyncio.Semaphore(2)` 控制同时运行的生成任务数。
- [x] **Redis 任务 TTL**: create 24h, completed/failed 2h, list 自动清理过期索引。
- [x] **临时字体清理**: 字体文件加入 temp_files 清理列表。
- [x] **预览背景 CORS 修复**: 素材同源代理 `/api/materials/stream/{id}` 彻底绕开跨域。
- [x] **CosyVoice 语气控制**: 声音克隆模式新增语气下拉（正常/欢快/低沉/严肃），基于 `inference_instruct2()` 自然语言指令控制情绪，全链路透传 instruct_text，默认"正常"行为不变。
-
-### Day 28: CosyVoice FP16 加速 + 文档全面更新
- [x] **CosyVoice FP16 半精度加速**: `AutoModel()` 开启 `fp16=True`，LLM 推理和 Flow Matching 自动混合精度运行，预估提速 30-40%、显存降低 ~30%。
- [x] **文档全面更新**: README.md / DEPLOY_MANUAL.md / SUBTITLE_DEPLOY.md / BACKEND_README.md 补充 MuseTalk 混合唇形同步方案、性能优化、Remotion 并发渲染等内容。
-
-### Day 27: Remotion 描边修复 + 字体样式扩展 + 混合唇形同步 + 性能优化
- [x] **描边渲染修复**: 标题/副标题/字幕从 `textShadow` 4 方向模拟改为 CSS 原生 `-webkit-text-stroke` + `paint-order: stroke fill`，修复描边过粗和副标题重影问题。
- [x] **字体样式扩展**: 标题样式 4→12 个（+庞门正道/优设标题圆/阿里数黑体/文道潮黑/无界黑/厚底黑/寒蝉半圆体/欣意吉祥宋），字幕样式 4→8 个（+少女粉/清新绿/金色隶书/楷体红字）。
- [x] **描边参数优化**: 所有预设 `stroke_size` 从 8 降至 4~5，配合原生描边视觉更干净。
- [x] **TypeScript 类型修复**: Root.tsx `Composition` 泛型与 `calculateMetadata` 参数类型对齐；Video.tsx `VideoProps` 添加索引签名兼容 `Record<string, unknown>`；VideoLayer.tsx 移除 `OffthreadVideo` 不支持的 `loop` prop。
- [x] **进度条文案还原**: 进度条从显示后端推送消息改回固定 `正在AI生成中...`。
- [x] **MuseTalk 混合唇形同步**: 部署 MuseTalk 1.5 常驻服务 (GPU0, 端口 8011)，按音频时长自动路由 — 短视频 (<120s) 走 LatentSync，长视频 (>=120s) 走 MuseTalk，MuseTalk 不可用时自动回退。
- [x] **MuseTalk 推理性能优化**: server.py v2 重写 — cv2 直读帧(跳过 ffmpeg→PNG)、人脸检测降频(每5帧)、BiSeNet mask 缓存(每5帧)、cv2.VideoWriter 直写(跳过 PNG 写盘)、batch_size 8→32，预估 30min→8-10min (~3x)。
- [x] **Remotion 并发渲染优化**: render.ts 新增 concurrency 参数，从默认 8 提升到 16 (56核 CPU)，预估 5min→2-3min。
-
-### Day 26: 前端优化：板块合并 + 序号标题 + UI 精细化
- [x] **板块合并**: 首页 9 个独立板块合并为 5 个主板块（配音方式+配音列表→三、配音；视频素材+时间轴→四、素材编辑；历史作品+作品预览→六、作品）。
- [x] **中文序号标题**: 一~十编号（首页一~六，发布页七~十），移除所有 emoji 图标。
- [x] **embedded 模式**: 6 个组件支持 `embedded` prop，嵌入时不渲染外层卡片/标题。
- [x] **配音列表两行布局**: embedded 模式第 1 行语速+生成配音（右对齐），第 2 行配音列表+刷新。
- [x] **子组件自渲染子标题**: MaterialSelector/TimelineEditor embedded 时自渲染 h3 子标题+操作按钮同行。
- [x] **下拉对齐**: TitleSubtitlePanel 标签统一 `w-20`，下拉 `w-1/3 min-w-[100px]`，垂直对齐。
- [x] **参考音频文案简化**: 底部段落移至标题旁，简化为 `(上传3-10秒语音样本)`。
- [x] **账户手机号显示**: AccountSettingsDropdown 新增手机号显示。
- [x] **标题显示模式对副标题生效**: payload 条件修复 + UI 下拉上移至板块标题行。
- [x] **登录后用户信息立即可用**: AuthContext 暴露 `setUser`，登录成功后立即写入用户数据，修复登录后显示"未知账户"的问题。
- [x] **文案微调**: 素材描述改为"上传自拍视频，最多可选4个"；显示模式选项加"标题"前缀。
- [x] **UI/UX 体验优化**: 操作按钮移动端可见（opacity-40）、手机号脱敏、标题字数计数器、时间轴拖拽抓手图标、截取滑块放大。
- [x] **代码质量修复**: 密码弹窗 success 清空、MaterialSelector useMemo + disabled 守卫、TimelineEditor useMemo。
- [x] **发布页响应式布局**: 平台账号卡片单行布局，移动端紧凑（小图标/小按钮），桌面端宽松（与其他板块风格一致）。
- [x] **移动端刷新回顶部**: `scrollRestoration = "manual"` + 列表 scroll 时间门控（`scrollEffectsEnabled` ref，1 秒内禁止自动滚动）+ 延迟兜底 `scrollTo(0,0)`。
- [x] **移动端样式预览缩小**: FloatingStylePreview 移动端宽度缩至 160px，位置改为右下角，不遮挡样式调节控件。
- [x] **列表滚动条统一隐藏**: 所有列表（BGM/配音/作品/素材/文案提取）滚动条改回 `hide-scrollbar`。
- [x] **移动端配音/素材适配**: VoiceSelector 按钮移动端缩小（`px-2 sm:px-4`）修复克隆声音不可见；MaterialSelector 标题行移除 `whitespace-nowrap`，描述移动端隐藏，修复刷新按钮溢出。
- [x] **生成配音按钮放大**: 从辅助尺寸（`text-xs px-2 py-1`）升级为主操作尺寸（`text-sm font-medium px-4 py-2`），新增阴影。
- [x] **生成进度条位置调整**: 从"六、作品"卡片内部提取到右栏独立卡片，显示在作品卡片上方，更醒目。
- [x] **LatentSync 超时修复**: httpx 超时从 1200s（20 分钟）改为 3600s（1 小时），修复 2 分钟以上视频口型推理超时回退问题。
- [x] **字幕时间戳节奏映射**: `whisper_service.py` 从全程线性插值改为 Whisper 逐词节奏映射，修复长视频字幕漂移。
-
-### Day 25: 文案提取修复 + 自定义提示词 + 片头副标题
- [x] **抖音文案提取修复**: yt-dlp Fresh cookies 报错，重写 `_download_douyin_manual` 为移动端分享页 + 自动获取 ttwid 方案。
- [x] **清理 DOUYIN_COOKIE**: 新方案不再需要手动维护 Cookie，从 `.env`/`config.py`/`service.py` 全面删除。
- [x] **AI 智能改写自定义提示词**: 后端 `rewrite_script()` 支持 `custom_prompt` 参数；前端 checkbox 旁新增折叠式提示词编辑区，localStorage 持久化。
- [x] **SSR 构建修复**: `useState` 初始化 `localStorage` 访问加 `typeof window` 守卫，修复 `npm run build` 报错。
- [x] **片头副标题**: 新增 secondary_title（后端/Remotion/前端全链路），AI 同时生成，独立样式配置，20 字限制。
- [x] **前端文案修正**: "AI 洗稿结果"→"AI 改写结果"。
- [x] **yt-dlp 升级**: `2025.12.08` → `2026.2.21`。
- [x] **参考音频中文文件名修复**: `sanitize_filename()` 将存储路径清洗为 ASCII 安全字符，纯中文名哈希兜底，原始名保留为展示名。
-
-### Day 24: 鉴权到期治理 + 多素材时间轴稳定性修复
- [x] **会员到期请求时失效**: 登录与鉴权接口统一执行 `expires_at` 检查；到期后自动停用账号、清理 session，并返回“会员已到期，请续费”。
- [x] **画面比例控制**: 时间轴新增 `9:16 / 16:9` 输出比例选择，前端持久化并透传后端，单素材/多素材统一按目标分辨率处理。
- [x] **标题/字幕防溢出**: Remotion 与前端预览统一响应式缩放、自动换行、描边/字距/边距比例缩放，降低预览与成片差异。
- [x] **标题显示模式**: 标题行新增“短暂显示/常驻显示”下拉；默认短暂显示（4 秒），用户选择持久化并透传至 Remotion 渲染链路。
- [x] **MOV 方向归一化**: 新增旋转元数据解析与 orientation normalize，修复“编码横屏+旋转元数据”导致的竖屏判断偏差。
- [x] **多素材拼接稳定性**: 片段 prepare 与 concat 统一 25fps/CFR，concat 增加 `+genpts`，缓解段切换处“画面冻结口型还动”。
- [x] **时间轴语义对齐**: 打通 `source_end` 全链路；修复 `sourceStart>0 且 sourceEnd=0` 时长计算；生成时以时间轴可见段 assignments 为准，超出段不参与。
- [x] **交互细节优化**: 页面刷新回顶部；素材/历史列表首轮自动滚动抑制，减少恢复状态时页面跳动。
-
-### Day 23: 配音前置重构 + 素材时间轴编排 + UI 体验优化 + 声音克隆增强
-
-#### 第一阶段：配音前置
- [x] **配音生成独立化**: 新增 `generated_audios` 后端模块（router/schemas/service），5 个 API 端点，复用现有 TTSService / voice_clone_service / task_store。
- [x] **配音管理面板**: 前端新增 `useGeneratedAudios` hook + `GeneratedAudiosPanel` 组件，支持生成/试听/改名/删除/选中。
- [x] **UI 面板重排序**: 文案 → 标题字幕 → 配音方式 → 配音列表 → 素材选择 → BGM → 生成视频。
- [x] **素材区门控**: 未选中配音时素材区显示遮罩，选中后显示配音时长 + 素材均分信息。
- [x] **视频生成对接**: workflow.py 新增预生成音频分支（`generated_audio_id`），跳过内联 TTS，向后兼容。
- [x] **持久化**: selectedAudioId 加入 useHomePersistence，刷新页面恢复选中配音。
-
-#### 第二阶段：素材时间轴编排
- [x] **时间轴编辑器**: 新增 `TimelineEditor` 组件，wavesurfer.js 音频波形 + 色块可视化素材分配，拖拽分割线调整各段时长。
- [x] **素材截取设置**: 新增 `ClipTrimmer` 模态框，HTML5 视频预览 + 双端滑块设置源视频截取起点/终点。
- [x] **后端自定义分配**: 新增 `CustomAssignment` 模型，`prepare_segment` 支持 `source_start`，workflow 多素材/单素材流水线支持 `custom_assignments`。
- [x] **循环截取修复**: `stream_loop + source_start` 改为两步处理（先裁剪再循环），确保从截取起点循环而非从视频 0s 开始。
- [x] **MaterialSelector 精简**: 移除旧的时长信息栏和拖拽排序区（功能迁移到 TimelineEditor）。
-
-#### 第三阶段：UI 体验优化 + TTS 稳定性
- [x] **TTS SoX PATH 修复**: `run_qwen_tts.sh` export conda env bin 到 PATH (Qwen3-TTS 已停用，已被 CosyVoice 3.0 替换)。
- [x] **TTS 显存管理**: 每次生成后 `torch.cuda.empty_cache()`，asyncio.to_thread 避免阻塞事件循环 (CosyVoice 沿用相同机制)。
- [x] **配音列表按钮统一**: Play/Edit/Delete 按钮右侧同组 hover 显示，与 RefAudioPanel 一致，移除文案摘要。
- [x] **素材区解除配音门控**: 移除 MaterialSelector 的 selectedAudio 遮罩，素材随时可上传管理。
- [x] **时间轴拖拽排序**: TimelineEditor 色块支持 HTML5 Drag & Drop 调换素材顺序。
- [x] **截取设置 Range Slider**: ClipTrimmer 改为单轨道双手柄（紫色起点+粉色终点），替换两个独立滑块。
- [x] **截取设置视频预览**: 视频区域可播放/暂停，从 sourceStart 到 sourceEnd 自动停止，拖拽手柄时实时 seek。
-
-#### 第四阶段：历史文案 + Bug 修复
- [x] **历史文案保存与加载**: 新增 `useSavedScripts` hook，手动保存/加载/删除历史文案，独立 localStorage 持久化。
- [x] **时间轴拖拽修复**: `reorderSegments` 从属性交换改为数组移动（splice），修复拖拽后时长不跟随素材的 Bug。
- [x] **按钮视觉统一**: 文案编辑区 4 个按钮统一为固定高度 `h-7`，移除多余 `<span>` 嵌套。
- [x] **底部栏调整**: "保存文案"按钮移至底部右侧，移除预计时长显示。
-
-#### 第五阶段：字幕语言不匹配 + 视频比例错位修复
- [x] **字幕用原文替换 Whisper 转录**: `align()` 新增 `original_text` 参数，字幕文字永远用配音保存的原始文案。
- [x] **Remotion 动态视频尺寸**: `calculateMetadata` 从 props 读取真实尺寸，修复标题/字幕比例错位。
- [x] **英文空格丢失修复**: `split_word_to_chars` 遇到空格时 flush buffer + pending_space 标记。
-
-#### 第六阶段：参考音频自动转写 + 语速控制
- [x] **Whisper 自动转写 ref_text**: 上传参考音频时自动调用 Whisper 转写内容作为 ref_text，不再使用前端固定文字。
- [x] **参考音频自动截取**: 超过 10 秒自动在静音点截取（ffmpeg silencedetect），末尾 0.1 秒淡出避免截断爆音。
- [x] **重新识别功能**: 新增 `POST /ref-audios/{id}/retranscribe` 端点 + 前端 RotateCw 按钮，旧音频可重新转写并截取。
- [x] **语速控制**: 全链路 speed 参数（前端选择器 → 持久化 → 后端 → CosyVoice `inference_zero_shot(speed=)`），5 档：较慢(0.8)/稍慢(0.9)/正常(1.0)/稍快(1.1)/较快(1.2)。
- [x] **缺少参考音频门控**: 声音克隆模式下未选参考音频时，生成配音按钮禁用 + 黄色警告提示。
- [x] **Whisper 语言自动检测**: `transcribe()` language 参数改为可选（默认 None = 自动检测），支持多语言参考音频。
- [x] **前端清理**: 移除固定 ref_text 常量、朗读引导文字，简化为"上传任意语音样本，系统将自动识别内容并克隆声音"。
-
-### Day 22: 多素材优化 + AI 翻译 + TTS 多语言
- [x] **多素材 Bug 修复**: 6 个高优 Bug（边界溢出、单段 fallback、除零、duration 校验、Whisper 兜底、空列表检查）。
- [x] **架构重构**: 多素材从"逐段 LatentSync"重构为"先拼接再推理"，推理次数 N→1。
- [x] **前端优化**: payload 安全、进度消息、上传自动选中、Material 接口统一、拖拽修复、素材上限 4 个。
- [x] **AI 多语言翻译**: 新增 `/api/ai/translate` 接口，前端 9 种语言翻译 + 还原原文。
- [x] **TTS 多语言**: EdgeTTS 10 语言声音列表、翻译自动切换声音、声音克隆 language 透传、textLang 持久化。
-
-### Day 21: 缺陷修复 + 浮动预览 + 发布重构 + 架构优化 + 多素材生成
- [x] **Remotion 崩溃容错**: 渲染进程 SIGABRT 退出时检查输出文件，避免误判失败导致标题/字幕丢失。
- [x] **首页作品选择持久化**: 修复 `fetchGeneratedVideos` 无条件覆盖恢复值的问题，新增 `preferVideoId` 参数控制选中逻辑。
- [x] **发布页作品选择持久化**: 根因为签名 URL 不稳定，全面改用 `video.id` 替代 `path` 进行选择/持久化/比较。
- [x] **预取缓存补全**: 首页预取发布页数据时加入 `id` 字段，确保缓存数据可用于持久化匹配。
- [x] **浮动样式预览窗口**: 标题字幕预览改为 `position: fixed` 浮动窗口，固定左上角，滚动时始终可见。
- [x] **移动端适配**: ScriptEditor 按钮换行、预览默认比例改为 9:16 竖屏。
- [x] **多平台发布重构**: 平台配置独立化（DOUYIN_*/WEIXIN_*）、用户隔离 Cookie 管理、抖音刷脸验证二维码、微信发布流程优化。
- [x] **前端结构微调**: ScriptExtractionModal 迁移到 features/、contexts 迁移到 shared/contexts/、清理空目录。
- [x] **后端模块分层**: materials/tools/ref_audios 三个模块补全 router+schemas+service 分层。
- [x] **开发规范更新**: BACKEND_DEV.md 新增渐进原则、DOC_RULES.md 取消 TASK_COMPLETE.md 手动触发约束。
- [x] **文档全面更新**: BACKEND_DEV/README、FRONTEND_DEV、DEPLOY_MANUAL、README.md 同步更新。
- [x] **多素材视频生成（多机位效果）**: 支持多选素材 + 拖拽排序，按素材数量均分音频时长（对齐 Whisper 字边界）自动切换机位。逐段 LatentSync + FFmpeg 拼接。前端 @dnd-kit 拖拽排序 UI。
- [x] **字幕开关移除**: 默认启用逐字高亮字幕，移除开关及相关死代码。
- [x] **视频格式扩展**: 上传支持 mkv/webm/flv/wmv/m4v/ts/mts 等常见格式。
- [x] **Watchdog 优化**: 健康检查阈值提高到 5 次，新增重启冷却期 120 秒，避免误重启。
- [x] **多素材 Bug 修复**: 修复标点分句方案对无句末标点文案无效（改为均分方案）、音频时间偏移导致口型不对齐等缺陷。
-
-### Day 20: 代码质量与安全优化
- [x] **功能性修复**: LatentSync 回退逻辑、任务状态接口认证、User 类型统一。
- [x] **性能优化**: N+1 查询修复、视频上传流式处理、httpx 异步替换、GLM 异步包装。
- [x] **安全修复**: 硬编码 Cookie 配置化、日志敏感信息脱敏、ffprobe 安全调用、CORS 配置化。
- [x] **配置优化**: 存储路径环境变量化、Remotion 预编译加速、LatentSync 绝对路径。
- [x] **文档更新**: 更新 DOC_RULES.md 清单，补齐后端与部署文档；更新 SUBTITLE_DEPLOY.md, FRONTEND_DEV.md, implementation_plan.md。
- [x] **缺陷修复**: 修复 Remotion 路径解析、发布页持久化竞态、首页选中回归、素材闭包陷阱。
-
-### Day 19: 自动发布稳定性与发布体验优化 🚀
- [x] **抖音发布稳定性**: 上传入口、封面流程、发布重试、登录失效识别与网络失败快速返回全面增强。
- [x] **视频号发布修复**: 标题+标签统一写入“视频描述”，`post_create` 成功信号快速判定，超时改为失败返回。
- [x] **成功截图闭环**: 抖音/视频号发布成功截图接入前端，支持用户隔离存储与鉴权访问。
- [x] **截图观感优化**: 成功截图延后 3 秒并改为视口截图，修复“截图内容仅占 1/3”问题。
- [x] **调试能力开关化**: 新增视频号录屏配置，默认可按环境变量开关，失败排障更直观。
- [x] **启动链路统一**: 合并为 `run_backend.sh`（xvfb + headful），统一端口 `8006`，减少多进程混淆。
- [x] **发布页防误操作**: 发布中按钮提示“请勿刷新或关闭网页”，并启用刷新/关页二次确认拦截。
- [ ] **后续优化**: 发布任务状态恢复机制（任务化 + 状态持久化 + 前端轮询恢复）。
-
-### Day 18: 后端模块化与规范完善
- [x] **模块化迁移**: 路由透传 `modules/*`，业务逻辑集中到 service/workflow。
- [x] **视频生成拆分**: 生成流程下沉 workflow，任务状态统一 TaskStore。
- [x] **Redis 任务存储**: Redis 优先，不可用自动回退内存。
- [x] **仓储层抽离**: Supabase 访问统一 `repositories/*`，deps/auth/admin 全面替换。
- [x] **响应规范**: 统一 `success/message/data/code` + 全局异常处理。
- [x] **素材重命名**: 新增重命名接口与 Storage `move_file`。
- [x] **平台顺序调整**: 抖音/微信视频号/B站/小红书，移除快手。
- [x] **后端开发规范**: 新增 `BACKEND_DEV.md`，README 同步模块化结构。
- [x] **发布管理体验**: 首页预取路由 + 发布页骨架与缓存，进入更快。
- [x] **素材加载优化**: 素材列表并发签名 URL，骨架数量动态。
- [x] **预览加载优化**: `preload="metadata"` + hover 预取。
-
-### Day 17: 前端重构与体验优化
- [x] **UI 组件拆分**: 首页拆分为独立组件，降低 `page.tsx` 复杂度。
- [x] **轻量 FSD 迁移**: `app` 页面轻量化，逻辑集中到 `features/*/model`，通用能力下沉 `shared/*`。
- [x] **Controller Hooks**: Home/Publish 页面逻辑集中到 Controller Hook，Page 仅组合渲染。
- [x] **通用工具抽取**: `media.ts` 统一 API Base / URL / 日期格式化。
- [x] **交互优化**: 选择项持久化、列表内定位、刷新回顶部、最新作品优先预览。
- [x] **发布页改造**: 作品列表卡片化 + 搜索 + 预览弹窗。
- [x] **预览体验**: 预览弹窗统一头部样式与提示文案。
- [x] **预览一致性**: 标题/字幕预览按素材分辨率缩放。
- [x] **标题同步与限制**: 片头标题同步发布标题，输入法合成态兼容，限制 15 字。
- [x] **样式默认与持久化**: 默认样式与字号调整，刷新保留用户选择。
- [x] **性能微优化**: 列表渲染优化 + 并行请求 + localStorage 防抖。
- [x] **资源能力**: 字体/BGM 资源库 + `/api/assets` 接入。
- [x] **音频与字幕修复**: BGM 混音稳定性与字幕断句优化。
- [x] **持久化修复**: 接入 `useHomePersistence`，恢复 `isRestored` 逻辑并通过构建。
- [x] **预览与选择修复**: 发布预览兼容签名 URL，音频试听路径解析，素材/BGM 回退有效项。
- [x] **体验细节优化**: 录音预览 URL 回收，预览弹窗滚动恢复，全局任务提示挂载。
-
-### Day 16: 深度性能优化
- [x] **Qwen-TTS 加速**: 集成 Flash Attention 2 (已停用，被 CosyVoice 3.0 替换)。
- [x] **服务守护**: 开发 `Watchdog` 看门狗机制，自动监控并重启僵死服务。
- [x] **LatentSync 性能确认**: 验证 DeepCache + 原生 Flash Attn 生效。
- [x] **文档重构**: 全面更新 README、部署手册及后端文档。
-
-### Day 15: 手机号认证迁移
- [x] **认证系统升级**: 从邮箱迁移至 11 位手机号注册/登录。
- [x] **账户管理**: 新增修改密码、有效期显示、安全退出功能。
- [x] **AI 文案助手**: 升级 GLM-4.7-Flash，支持 B站/抖音链接提取与洗稿。
-
-### Day 14: AI 增强与体验优化
- [x] **AI 标题/标签**: 集成 GLM-4API 自动生成视频元数据。
- [x] **字幕升级**: Remotion 逐字高亮字幕 (卡拉OK效果) 及动画片头。
- [x] **模型升级**: 声音克隆已迁移至 CosyVoice 3.0 (0.5B)。
-
-### Day 13: 声音克隆集成
- [x] **声音克隆微服务**: 封装 CosyVoice 3.0 为独立 API (8010端口，替换 Qwen3-TTS)。
- [x] **参考音频管理**: Supabase 存储桶配置与管理接口。
- [x] **多模态 TTS**: 前端支持 EdgeTTS / Clone Voice 切换。
-
-### Day 12: 移动端适配
- [x] **iOS 兼容**: 修复 Safari 安全区域、状态栏颜色、Cookie 拦截问题。
- [x] **响应式 UI**: 移动端 Header 与发布页重构。
-
-### Day 11: 上传架构重构
- [x] **直传优化**: 前端直传 Supabase Storage，解决 Nginx 30s 超时问题。
- [x] **数据隔离**: 用户素材/视频按 UserID 物理隔离。
-
-### Day 10: HTTPS 与安全
- [x] **HTTPS 部署**: 配置 SSL 证书与 Nginx 反向代理。
- [x] **安全加固**: Supabase Studio 增加 Basic Auth 保护。
-
-### Day 9: 认证系统与发布闭环
- [x] **用户系统**: 基于 Supabase Auth 实现 JWT 认证。
- [x] **发布闭环**: 验证 B站/抖音/小红书 自动发布流程。
- [x] **服务自愈**: 配置 PM2 进程守护。
-
-### Day 1-8: 核心功能构建
- [x] **Day 8**: 历史记录持久化与文件管理。
- [x] **Day 7**: 社交媒体自动登录与多平台发布。
- [x] **Day 6**: **LatentSync 1.6** 升级与服务器部署。
- [x] **Day 5**: 前端视频上传与进度反馈。
- [x] **Day 4**: MuseTalk (旧版) 口型同步修复。
- [x] **Day 3**: 服务器环境配置与模型权重下载。
- [x] **Day 1-2**: 项目基础框架 (FastAPI + Next.js) 搭建。
-
---
-
-## 🛤️ 后续规划 (Roadmap)
-
-### 🔴 优先待办
- [x] ~~**配音前置重构 — 第二阶段**: 素材片段截取 + 语音时间轴编排~~ ✅ Day 23 已完成
- [ ] **批量生成架构**: 支持 Excel 导入，批量生产视频。
- [ ] **定时任务后台化**: 迁移前端触发的定时发布到后端 APScheduler。
- [ ] **发布任务恢复机制**: 发布任务化 + 状态持久化 + 前端断点恢复，解决刷新后状态丢失。
-
-### 🔵 长期探索
- [ ] **容器化交付**: 提供完整的 Docker Compose 一键部署包。
- [ ] **分布式队列**: 引入 Celery + Redis 处理超高并发任务。
-
---
-
-## 📊 模块完成度
-
-| 模块 | 进度 | 状态 |
-|------|------|------|
-| **核心 API** | 100% | ✅ 稳定 |
-| **Web UI** | 100% | ✅ 稳定 (移动端适配) |
-| **唇形同步** | 100% | ✅ LatentSync 1.6 |
-| **TTS 配音** | 100% | ✅ EdgeTTS + CosyVoice 3.0 + 配音前置 + 时间轴编排 + 自动转写 + 语速控制 + 语气控制 |
-| **自动发布** | 100% | ✅ 抖音/微信视频号/B站/小红书 |
-| **用户认证** | 100% | ✅ 手机号 + JWT |
-| **付费会员** | 100% | ✅ 支付宝电脑网站支付 + 自动激活 |
-| **部署运维** | 100% | ✅ PM2 + Watchdog |
-
---
-
-## 📎 相关文档
-
- [详细开发日志 (DevLogs)](Docs/DevLogs/)
- [部署手册 (DEPLOY_MANUAL)](Docs/DEPLOY_MANUAL.md)
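+# ViGent2 development task list (Task Log)
+
+**Project**: ViGent2 digital-human talking-head video generation system
+**Progress**: 100% (Day 35 - small-face lip-sync quality compensation shipped + deployment verification)
+**Last updated**: 2026-03-10
+
+---
+
+## 📅 Conversation history and development log
+
+> This section records each day's core development work and milestones.
+
+### Day 35: Small-face lip-sync quality compensation shipped + deployment verification + stability patches (Current)
+- [x] **Small-face lip-sync quality compensation shipped**: added `small_face_enhance_service.py`, implementing the full chain of SCRFD small-face detection (10%-30% sampling) -> crop track (detection every 8 frames + EMA) -> sparse keyframe super-resolution (GFPGAN/CodeFormer) -> lower-face blend-back (seamlessClone/alpha fallback).
+- [x] **Backend integration done**: `lipsync_service.py` inserts the enhancement inside `_local_generate()` after looping completes, extracts `_run_selected_model()` to unify model routing, and falls back to the original flow per `FAIL_OPEN` when enhancement fails.
+- [x] **Configuration and dependencies**: added 5 `LIPSYNC_SMALL_FACE_*` settings; `requirements.txt` gains `opencv-python-headless` and `gfpgan`; added the `models/FaceEnhance/GFPGANv1.4.pth` weights directory.
+- [x] **New deployment document**: added and updated `Docs/FACEENHANCE_DEPLOY.md`, covering deployment, weights, switches, verification, and rollback.
+- [x] **Production stability fixes**:
+  - `small_face_enhance_service.py` gains a lazy-import guard for `cv2/numpy`; when the dependencies are missing, enhancement is skipped without affecting the main flow.
+  - Added `from __future__ import annotations` so that `np.ndarray` annotations do not raise at import time when dependencies are missing.
+  - Added a `torchvision.transforms.functional_tensor` shim to fix GFPGAN initialization failing under `torchvision>=0.20`.
+  - `_get_video_info()` now parses JSON fields and prefers `avg_frame_rate`, fixing the frame-count estimation error caused by a missing `nb_frames`.
+  - `_build_face_track()` writes back the actual number of frames read; the frame-count check in `blend_back()` is relaxed so that `lipsync <= original` blends back normally and only `>` raises an error.
+  - `blend_back()` gains an empty-output guard for `ls_frames <= 0`; on this error `FAIL_OPEN` falls back to the regular path, so no empty video is written.
+  - Time-base fix: the enhanced video's output fps follows the source video's fps; blend-back maps to original frame indices by `orig_fps/ls_fps`, fixing slowed motion and ghosting.
+  - Audio track fix: after a successful blend-back, an audio mux step is added so that videos from the small-face enhancement path include sound.
+  - Eye ghosting fix: the mask start moves down to 68% with 16% margins added on the left and right, and the seamlessClone result gets a second, mask-bounded blend to reduce ghosting above the eyes.
+  - Run policy: `LIPSYNC_SMALL_FACE_THRESHOLD=9999` is for pipeline smoke tests only; quality validation and day-to-day runs go back to `256`.
+- [x] **Deployment check passed**: `GET /api/videos/lipsync/health` now returns `data.small_face_enhance`; the default is `enabled=false`, and with the switch off the behavior matches the original flow.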
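
The time-base fix in the Day 35 list (mapping lip-sync frames to original frame indices by `orig_fps/ls_fps`) comes down to a small piece of arithmetic: lip-sync frame `i` is shown at time `i / ls_fps`, which corresponds to original frame `i * orig_fps / ls_fps`. The sketch below is an illustration only; `map_to_original_index` is a hypothetical name, and flooring plus clamping to the last frame are assumptions about details the log does not state.

```python
def map_to_original_index(ls_index: int, ls_fps: float, orig_fps: float, orig_frames: int) -> int:
    """Map a lip-sync frame index to the matching original frame index."""
    if ls_fps <= 0 or orig_fps <= 0 or orig_frames <= 0:
        raise ValueError("fps values and frame count must be positive")
    # Same timestamp in both videos: i / ls_fps == j / orig_fps
    idx = int(ls_index * orig_fps / ls_fps)
    return min(idx, orig_frames - 1)  # assumed: clamp instead of running past the end
```

When both videos share the same fps the mapping is the identity, which is why making the enhanced video's fps follow the source removes the drift in the common case.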
+
+### Day 34: 多镜头时间轴重构 + 文案深度学习弹窗防误触关闭 + Code Review 修复
+- [x] **时间轴模型重构**: 多素材从“等分顺序片段”升级为“主素材连续播放 + 插入镜头块”，支持自由插入、拖拽移动。
+- [x] **前端链路落地**: 重写 `useTimelineEditor` 与 `TimelineEditor`，新增主素材/插入候选语义，`useHomeController` / `HomePage` / `MaterialSelector` 全链路适配。
+- [x] **后端生成链路适配**: `workflow.py` 完成 `material_paths` 来源修正、`custom_assignments` 新校验、素材下载去重与段处理并发限制，保持单素材兼容。
+- [x] **文案深度学习防误触关闭**: `ScriptLearningModal` 禁用遮罩和 `ESC` 关闭，仅允许右上角 `X` 或“填入文案”关闭；输入页“取消”改为“清空”。
+- [x] **Code Review 修复**:
+  - UX: 移除时间轴 resize handle，统一用 ClipTrimmer 弹窗编辑时长；引入拖拽/点击像素阈值区分。
+  - Lint: 修复 `useTimelineEditor` 3 处 set-state-in-effect、`HomePage` 未使用解构、`TimelineEditor` 未使用 import/props。
+  - P1: `workflow.py` `is_multi` 补充 `custom_assignments` 条件，防止多片段 assignment 退化为单素材路径。
+  - P1: 主素材 trim range 改为按 identity（非 count）重置，修复切换主素材时截取范围泄漏。
+  - ClipTrimmer onConfirm 同步调用 `resizeInsert()` 更新时间轴块时长。
+- [x] **文档同步**: 回写 `Day34` 与 `TASK_COMPLETE`，并更新 Current 指向。
+
+### Day 33: 文案深度学习落地 + 抓取稳定性增强 + 交互统一
+- [x] **文案深度学习功能上线**: 新增 `ScriptLearningModal`（输入主页链接 -> 话题分析 -> 生成文案 -> 填入编辑器）与首页入口接入。
+- [x] **Tools 新接口**: 新增 `POST /api/tools/analyze-creator` 与 `POST /api/tools/generate-topic-script`，并接入登录鉴权。
+- [x] **抖音/B站抓取增强**: 博主标题抓取统一升级为 Playwright 直连主链路，支持用户 Cookie 上下文增强与失败重试。
+- [x] **GLM 调用统一收口**: `glm_service` 新增统一调用入口，标题生成/改写/翻译/话题分析/话题文案生成全部复用，减少重复代码。
+- [x] **超时体验优化**: 文案深度学习“生成文案”前端超时从 30s 提升到 90s，并补充超时提示文案。
+- [x] **文案弹窗交互统一**: 文案提取/AI 改写/文案深度学习结果页按钮统一为底部 Action Grid，主次按钮层级与文案动作统一。
+- [x] **依赖升级**: 后端 venv 升级 `yt-dlp`、`playwright`、`biliup` 并完成兼容性冒烟验证。
+- [x] **文档同步**: 回写 `Day33`、`FRONTEND_README`、`FRONTEND_DEV`、`BACKEND_README`、`BACKEND_DEV`、`TASK_COMPLETE`。
+
+### Day 32: 视频下载同源修复 + 安全整改第一批 + Day 日志拆分归档
+- [x] **下载链路修复**: 新增 `GET /api/videos/generated/{video_id}/download`，统一返回 `Content-Disposition: attachment`，修复“点击下载却新开标签页播放”问题。
+- [x] **发布成功弹窗下载改造**: `CleanupContext` 从传 URL 改为传 `videoId`，下载按钮改走同源接口，去掉 `target="_blank"`。
+- [x] **首页下载改造**: `PreviewPanel` 同步切换到同源下载接口，首页与发布页下载行为一致。
+- [x] **兼容旧持久化状态**: `CleanupContext` 对旧 `videoDownloadUrl` 做 `videoId` 解析回填，避免旧 pending 状态失效。
+- [x] **文档拆分归档**: 将“下载修复开始后的今日内容”归档到 `Docs/DevLogs/Day32.md`，并从 `Day31.md` 移除对应章节与验证记录。
+- [x] **安全第一批修复**: JWT 默认密钥生产拦截、AI/Tools 接口强制鉴权、materials 路径穿越拦截、video_id 白名单、上传体积限制、错误信息通用化。
+- [x] **安全收尾三刀**: `delete_material` 的 `ValueError -> 400`、`tools` URL 下载分支 500MB 限制、`DEBUG=false` 下默认 JWT 密钥阻断启动。
+- [x] **弹窗关闭策略收敛**: 默认支持 `ESC/X/遮罩` 关闭；发布成功清理弹窗保持强制流程不允许遮罩关闭；录音弹窗录音中禁遮罩关闭（防误触）。
+
+### Day 31: 文档体系收敛 + 音色试听 + 录音弹窗重构 + 发布登录稳定性修复
+- [x] **文档体系收敛**: README/DEV 职责边界明确，部署参数与代码对齐，Qwen3-TTS 文档归档至历史状态。
+- [x] **音色试听能力**: 新增并启用 `GET/POST /api/videos/voice-preview`，前端改为直接播放 GET 音频流，修复线上 404（重启后端生效）。
+- [x] **录音交互重构**: 录音入口迁移到参考音频区底部，流程改为弹窗；支持录音后即时关闭弹窗、后台上传识别。
+- [x] **弹窗系统统一**: 抽离 `AppModal`，统一遮罩/焦点/滚动锁/Portal，可访问性补齐；主要弹窗完成迁移（预览、提取、改写、截取、录音、改密、发布登录）。
+- [x] **抖音扫码修复**: 登录页等待策略改为 `domcontentloaded`，并对导航超时容错，避免“无法获取二维码”。
+- [x] **微信二维码优化**: 后端优先导出原始 PNG，前端展示加入白底留白容器，修复“二维码边缘像被截断”的观感问题。
+- [x] **发布性能优化**: 发布页改为受限并发（并发度 2），多平台发布总等待时长明显下降。
+- [x] **微信上传日志降噪**: `file_input empty` 告警改为信号驱动，非最终重试降级为 info，减少误报警。
+- [x] **小红书发布重构**: 对齐抖音/微信上传架构，补齐启动配置、上传/发布多信号判定、成功截图与 `screenshot_url` 回传。
+- [x] **Cookie 格式统一**: 非 B 站平台统一保存为 Playwright `storage_state`，支持 uploader 直接加载上下文。
+- [x] **小红书扫码修复**: 自动从短信登录切换到扫码页并提取二维码，登录成功判定补齐 `/new/home` 路径。
+- [x] **小红书“上传卡住”修复**: 新增无后缀视频临时文件兜底（hardlink/copy）、文件名后缀一致性校验、上传空转超时保护（90s）。
+- [x] **实测闭环**: 小红书 `POST /api/publish` 实测成功（45.77s）并可访问成功截图接口。
+- [x] **文档补齐**: 新增 `Docs/PUBLISH_DEPLOY.md`，并回写 `README.md`、`BACKEND_README.md`、`BACKEND_DEV.md`、`DEPLOY_MANUAL.md`。
+- [x] **文档规则对齐**: 更新 `Docs/DOC_RULES.md`，补充发布相关“三检”与敏感信息处理规范，加入 `PUBLISH_DEPLOY.md` 检查项，工具规范改为 `Read/Grep/apply_patch`，并对齐 TASK_COMPLETE 检查清单。
+- [x] **首页交互微调**: `AI生成标题标签` 按钮迁移到“四、标题与字幕”标题同层最右；`标题显示方式 + 预览样式` 下沉到下一行右侧；AI按钮圆角/尺寸对齐“在线录音”，配色保留原蓝色渐变；文档明确 `title_display_mode` 对主/副标题统一生效。
+- [x] **文案编辑扩展**: 在文案输入框右下角新增扩展角标，点击后弹出大编辑器，主框与弹窗内文案实时同步；角标样式改为双箭头极简贴边并微调到 `right-0.5 bottom-2`；修复扩展输入框打字后失焦问题，移除紫色聚焦边框。
+- [x] **站点图标更新**: 使用 `Temp/video.png` 替换网站 icon，生成并更新 `frontend/src/app/icon.png` 与多尺寸 `frontend/src/app/favicon.ico`。
+- [x] **发布后清理链路加固**: 新增/优化 `CleanupContext` + `/api/videos/cleanup` 全链路；后端删除异常不再吞错、清理接口严格成功语义；前端失败不清本地/不关弹窗，3 次失败可暂不清理，清理状态 24h 过期并支持用户切换复位；清理范围收敛为输入内容字段并保留用户偏好。
+
+### Day 30: Remotion 缓存修复 + 编码流水线质量优化 + 唇形同步容错 + 统一下拉交互
+- [x] **Remotion 缓存 404 修复**: bundle 缓存命中时，新生成的视频/字体文件不在旧缓存 `public/` 目录 → 404 → 回退 FFmpeg（无标题字幕）。改为硬链接（`fs.linkSync`）当前渲染所需文件到缓存目录。
+- [x] **LatentSync `read_video` 跳过冗余 FPS 重编码**: 检测输入 FPS，已是 25fps 时跳过 `ffmpeg -r 25 -crf 18` 重编码。
+- [x] **LatentSync final mux 流复制**: `imageio` CRF 13 写帧后的 mux 步骤从 `libx264 -crf 18` 改为 `-c:v copy`，消除冗余双重编码。
+- [x] **`prepare_segment` + `normalize_orientation` CRF 提质**: CRF 23 → 18，与 LatentSync 内部质量标准统一。
+- [x] **多素材 concat 流复制**: 各段参数已统一，`concat_videos` 从 `libx264 -crf 23` 改为 `-c:v copy`。
+- [x] **编码次数总计**: 从 5-6 次有损编码降至 3 次（prepare_segment → LatentSync/MuseTalk 模型输出 → Remotion）。
+- [x] **LatentSync 无脸帧容错**: 素材部分帧检测不到人脸时不再中断推理，无脸帧保留原画面，单素材异常时回退原视频。
+- [x] **MuseTalk 管道直编码**: `cv2.VideoWriter(mp4v)` 中间有损文件改为 FFmpeg rawvideo stdin 管道，消除一次冗余有损编码。
+- [x] **MuseTalk 参数环境变量化**: 推理与编码参数（detect_every/blend_cache/CRF/preset 等）从硬编码迁移到 `backend/.env`，当前使用质量优先档（CRF 14, preset slow, detect_every 2, blend_cache_every 2）。
+- [x] **Workflow 异步防阻塞**: 新增 `_run_blocking()` 线程池辅助，5 处同步 FFmpeg 调用（旋转归一化/prepare_segment/concat/BGM 混音）改为 `await _run_blocking()`，事件循环不再被阻塞。
+- [x] **compose 跳过优化**: 无 BGM 时 `final_audio_path == audio_path`，跳过多余的 compose 步骤，Remotion 路径直接用 lipsync 输出，非 Remotion 路径 `shutil.copy` 透传。
+- [x] **compose() 异步化**: `compose()` 改为 `async def`，内部 `_get_duration` 和 `_run_ffmpeg` 走 `run_in_executor`。
+- [x] **同分辨率跳过 scale**: 多素材逐段比对分辨率，匹配的传 `None` 走 copy 分支；单素材同理。避免已是目标分辨率时的无效重编码。
+- [x] **`_get_duration()` 线程池化**: workflow 中 3 处同步 ffprobe 探测改为 `await _run_blocking()`。
+- [x] **compose 循环 CRF 统一**: 循环场景 CRF 23 → 18，与全流水线质量标准一致。
+- [x] **多素材片段校验**: prepare 完成后校验片段数量一致，防止空片段进入 concat。
+- [x] **唇形模型前端选择**: 生成按钮右侧新增模型下拉（默认模型/快速模型/高级模型），全链路透传 `lipsync_model` 到后端路由。默认保持阈值策略，快速强制 MuseTalk，高级强制 LatentSync，三种模式均有 LatentSync 兜底。选择 localStorage 持久化。
+- [x] **业务下拉统一组件化**: 新增 `SelectPopover`（桌面 Popover + 移动端 BottomSheet），覆盖首页/发布页主要业务选择器（音色、参考音频、配音、素材、BGM、作品、样式、模型、画面比例）。
+- [x] **下拉体验修复**: 统一处理遮挡（Portal + fixed）、自动上拉、触发器同宽、背景不透明、滚动条隐藏、再次打开定位到已选项。
+- [x] **预览联动修复**: 下拉内点击视频预览不强制收起菜单；预览弹窗层级高于下拉；关闭预览后可继续在菜单内连续预览。
+- [x] **BGM 交互收敛**: BGM 选择改为发布页同款（搜索 + 列表 + 试听）；按产品要求移除首页音量滑杆，生成请求固定 `bgm_volume=0.2`。
+- [x] **例外回退**: `ScriptEditor` 的“历史文案 / AI多语言”恢复原轻量菜单样式（不强制统一 SelectPopover）。
+- [x] **文档同步**: Day30 / TASK_COMPLETE / FRONTEND_DEV / FRONTEND_README / README / BACKEND_README 同步更新到最终实现。
+
+### Day 29: 视频流水线优化 + CosyVoice 语气控制
+- [x] **字幕同步修复**: Whisper 时间戳三步平滑（单调递增+重叠消除+间隙填补）+ 原文节奏映射（线性插值 + 单字时长钳位）。
+- [x] **LatentSync 嘴型参数调优**: inference_steps 16→20, guidance_scale 2.0, DeepCache 启用, Remotion concurrency 16→4。
+- [x] **compose 流复制**: 不循环时 `-c:v copy` 替代 libx264 重编码，compose 耗时从分钟级降到秒级。
+- [x] **FFmpeg 超时保护**: `_run_ffmpeg()` timeout=600, `_get_duration()` timeout=30。
+- [x] **全局并发限制**: `asyncio.Semaphore(2)` 控制同时运行的生成任务数。
+- [x] **Redis 任务 TTL**: create 24h, completed/failed 2h, list 自动清理过期索引。
+- [x] **临时字体清理**: 字体文件加入 temp_files 清理列表。
+- [x] **预览背景 CORS 修复**: 素材同源代理 `/api/materials/stream/{id}` 彻底绕开跨域。
+- [x] **CosyVoice 语气控制**: 声音克隆模式新增语气下拉（正常/欢快/低沉/严肃），基于 `inference_instruct2()` 自然语言指令控制情绪，全链路透传 instruct_text，默认"正常"行为不变。
+
+### Day 28: CosyVoice FP16 加速 + 文档全面更新
+- [x] **CosyVoice FP16 半精度加速**: `AutoModel()` 开启 `fp16=True`，LLM 推理和 Flow Matching 自动混合精度运行，预估提速 30-40%、显存降低 ~30%。
+- [x] **文档全面更新**: README.md / DEPLOY_MANUAL.md / SUBTITLE_DEPLOY.md / BACKEND_README.md 补充 MuseTalk 混合唇形同步方案、性能优化、Remotion 并发渲染等内容。
+
+### Day 27: Remotion 描边修复 + 字体样式扩展 + 混合唇形同步 + 性能优化
+- [x] **描边渲染修复**: 标题/副标题/字幕从 `textShadow` 4 方向模拟改为 CSS 原生 `-webkit-text-stroke` + `paint-order: stroke fill`，修复描边过粗和副标题重影问题。
+- [x] **字体样式扩展**: 标题样式 4→12 个（+庞门正道/优设标题圆/阿里数黑体/文道潮黑/无界黑/厚底黑/寒蝉半圆体/欣意吉祥宋），字幕样式 4→8 个（+少女粉/清新绿/金色隶书/楷体红字）。
+- [x] **描边参数优化**: 所有预设 `stroke_size` 从 8 降至 4~5，配合原生描边视觉更干净。
+- [x] **TypeScript 类型修复**: Root.tsx `Composition` 泛型与 `calculateMetadata` 参数类型对齐；Video.tsx `VideoProps` 添加索引签名兼容 `Record<string, unknown>`；VideoLayer.tsx 移除 `OffthreadVideo` 不支持的 `loop` prop。
+- [x] **进度条文案还原**: 进度条从显示后端推送消息改回固定 `正在AI生成中...`。
+- [x] **MuseTalk 混合唇形同步**: 部署 MuseTalk 1.5 常驻服务 (GPU0, 端口 8011)，按音频时长自动路由（由 `LIPSYNC_DURATION_THRESHOLD` 控制；本仓库当前 `.env` 为 100）— 短视频走 LatentSync，长视频走 MuseTalk，MuseTalk 不可用时自动回退。
+- [x] **MuseTalk 推理性能优化**: server.py v2 重写 — cv2 直读帧(跳过 ffmpeg→PNG)、人脸检测降频(每5帧)、BiSeNet mask 缓存(每5帧)、cv2.VideoWriter 直写(跳过 PNG 写盘)、batch_size 8→32，预估 30min→8-10min (~3x)。
+- [x] **Remotion 并发渲染优化**: render.ts 新增 concurrency 参数，从默认 8 提升到 16 (56核 CPU)，预估 5min→2-3min。
+
+### Day 26: 前端优化：板块合并 + 序号标题 + UI 精细化
+- [x] **板块合并**: 首页 9 个独立板块合并为 5 个主板块（配音方式+配音列表→三、配音；视频素材+时间轴→四、素材编辑；历史作品+作品预览→六、作品）。
+- [x] **中文序号标题**: 一~十编号（首页一~六，发布页七~十），移除所有 emoji 图标。
+- [x] **embedded 模式**: 6 个组件支持 `embedded` prop，嵌入时不渲染外层卡片/标题。
+- [x] **配音列表两行布局**: embedded 模式第 1 行语速+生成配音（右对齐），第 2 行配音列表+刷新。
+- [x] **子组件自渲染子标题**: MaterialSelector/TimelineEditor embedded 时自渲染 h3 子标题+操作按钮同行。
+- [x] **下拉对齐**: TitleSubtitlePanel 标签统一 `w-20`，下拉 `w-1/3 min-w-[100px]`，垂直对齐。
+- [x] **参考音频文案简化**: 底部段落移至标题旁，简化为 `(上传3-10秒语音样本)`。
+- [x] **账户手机号显示**: AccountSettingsDropdown 新增手机号显示。
+- [x] **标题显示模式对副标题生效**: payload 条件修复 + UI 下拉上移至板块标题行。
+- [x] **登录后用户信息立即可用**: AuthContext 暴露 `setUser`，登录成功后立即写入用户数据，修复登录后显示"未知账户"的问题。
+- [x] **文案微调**: 素材描述改为"上传自拍视频，最多可选4个"；显示模式选项加"标题"前缀。
+- [x] **UI/UX 体验优化**: 操作按钮移动端可见（opacity-40）、手机号脱敏、标题字数计数器、时间轴拖拽抓手图标、截取滑块放大。
+- [x] **代码质量修复**: 密码弹窗 success 清空、MaterialSelector useMemo + disabled 守卫、TimelineEditor useMemo。
+- [x] **发布页响应式布局**: 平台账号卡片单行布局，移动端紧凑（小图标/小按钮），桌面端宽松（与其他板块风格一致）。
+- [x] **移动端刷新回顶部**: `scrollRestoration = "manual"` + 列表 scroll 时间门控（`scrollEffectsEnabled` ref，1 秒内禁止自动滚动）+ 延迟兜底 `scrollTo(0,0)`。
+- [x] **移动端样式预览缩小**: FloatingStylePreview 移动端宽度缩至 160px，位置改为右下角，不遮挡样式调节控件。
+- [x] **列表滚动条统一隐藏**: 所有列表（BGM/配音/作品/素材/文案提取）滚动条改回 `hide-scrollbar`。
+- [x] **移动端配音/素材适配**: VoiceSelector 按钮移动端缩小（`px-2 sm:px-4`）修复克隆声音不可见；MaterialSelector 标题行移除 `whitespace-nowrap`，描述移动端隐藏，修复刷新按钮溢出。
+- [x] **生成配音按钮放大**: 从辅助尺寸（`text-xs px-2 py-1`）升级为主操作尺寸（`text-sm font-medium px-4 py-2`），新增阴影。
+- [x] **生成进度条位置调整**: 从"六、作品"卡片内部提取到右栏独立卡片，显示在作品卡片上方，更醒目。
+- [x] **LatentSync 超时修复**: httpx 超时从 1200s（20 分钟）改为 3600s（1 小时），修复 2 分钟以上视频口型推理超时回退问题。
+- [x] **字幕时间戳节奏映射**: `whisper_service.py` 从全程线性插值改为 Whisper 逐词节奏映射，修复长视频字幕漂移。
+
+### Day 25: 文案提取修复 + 自定义提示词 + 片头副标题
+- [x] **抖音文案提取修复**: yt-dlp Fresh cookies 报错，重写 `_download_douyin_manual` 为移动端分享页 + 自动获取 ttwid 方案。
+- [x] **清理 DOUYIN_COOKIE**: 新方案不再需要手动维护 Cookie，从 `.env`/`config.py`/`service.py` 全面删除。
+- [x] **AI 智能改写自定义提示词**: 后端 `rewrite_script()` 支持 `custom_prompt` 参数；前端 checkbox 旁新增折叠式提示词编辑区，localStorage 持久化。
+- [x] **SSR 构建修复**: `useState` 初始化 `localStorage` 访问加 `typeof window` 守卫，修复 `npm run build` 报错。
+- [x] **片头副标题**: 新增 secondary_title（后端/Remotion/前端全链路），AI 同时生成，独立样式配置，20 字限制。
+- [x] **前端文案修正**: "AI 洗稿结果"→"AI 改写结果"。
+- [x] **yt-dlp 升级**: `2025.12.08` → `2026.2.21`。
+- [x] **参考音频中文文件名修复**: `sanitize_filename()` 将存储路径清洗为 ASCII 安全字符，纯中文名哈希兜底，原始名保留为展示名。
+
+### Day 24: 鉴权到期治理 + 多素材时间轴稳定性修复
+- [x] **会员到期请求时失效**: 登录与鉴权接口统一执行 `expires_at` 检查；到期后自动停用账号、清理 session，并返回“会员已到期，请续费”。
+- [x] **画面比例控制**: 时间轴新增 `9:16 / 16:9` 输出比例选择，前端持久化并透传后端，单素材/多素材统一按目标分辨率处理。
+- [x] **标题/字幕防溢出**: Remotion 与前端预览统一响应式缩放、自动换行、描边/字距/边距比例缩放，降低预览与成片差异。
+- [x] **标题显示模式**: 标题行新增“短暂显示/常驻显示”下拉；默认短暂显示（4 秒），用户选择持久化并透传至 Remotion 渲染链路。
+- [x] **MOV 方向归一化**: 新增旋转元数据解析与 orientation normalize，修复“编码横屏+旋转元数据”导致的竖屏判断偏差。
+- [x] **多素材拼接稳定性**: 片段 prepare 与 concat 统一 25fps/CFR，concat 增加 `+genpts`，缓解段切换处“画面冻结口型还动”。
+- [x] **时间轴语义对齐**: 打通 `source_end` 全链路；修复 `sourceStart>0 且 sourceEnd=0` 时长计算；生成时以时间轴可见段 assignments 为准，超出段不参与。
+- [x] **交互细节优化**: 页面刷新回顶部；素材/历史列表首轮自动滚动抑制，减少恢复状态时页面跳动。
+
+### Day 23: 配音前置重构 + 素材时间轴编排 + UI 体验优化 + 声音克隆增强
+
+#### 第一阶段：配音前置
+- [x] **配音生成独立化**: 新增 `generated_audios` 后端模块（router/schemas/service），5 个 API 端点，复用现有 TTSService / voice_clone_service / task_store。
+- [x] **配音管理面板**: 前端新增 `useGeneratedAudios` hook + `GeneratedAudiosPanel` 组件，支持生成/试听/改名/删除/选中。
+- [x] **UI 面板重排序**: 文案 → 标题字幕 → 配音方式 → 配音列表 → 素材选择 → BGM → 生成视频。
+- [x] **素材区门控**: 未选中配音时素材区显示遮罩，选中后显示配音时长 + 素材均分信息。
+- [x] **视频生成对接**: workflow.py 新增预生成音频分支（`generated_audio_id`），跳过内联 TTS，向后兼容。
+- [x] **持久化**: selectedAudioId 加入 useHomePersistence，刷新页面恢复选中配音。
+
+#### 第二阶段：素材时间轴编排
+- [x] **时间轴编辑器**: 新增 `TimelineEditor` 组件，wavesurfer.js 音频波形 + 色块可视化素材分配，拖拽分割线调整各段时长。
+- [x] **素材截取设置**: 新增 `ClipTrimmer` 模态框，HTML5 视频预览 + 双端滑块设置源视频截取起点/终点。
+- [x] **后端自定义分配**: 新增 `CustomAssignment` 模型，`prepare_segment` 支持 `source_start`，workflow 多素材/单素材流水线支持 `custom_assignments`。
+- [x] **循环截取修复**: `stream_loop + source_start` 改为两步处理（先裁剪再循环），确保从截取起点循环而非从视频 0s 开始。
+- [x] **MaterialSelector 精简**: 移除旧的时长信息栏和拖拽排序区（功能迁移到 TimelineEditor）。
+
+#### 第三阶段：UI 体验优化 + TTS 稳定性
+- [x] **TTS SoX PATH 修复**: `run_qwen_tts.sh` export conda env bin 到 PATH (Qwen3-TTS 已停用，已被 CosyVoice 3.0 替换)。
+- [x] **TTS 显存管理**: 每次生成后 `torch.cuda.empty_cache()`，asyncio.to_thread 避免阻塞事件循环 (CosyVoice 沿用相同机制)。
+- [x] **配音列表按钮统一**: Play/Edit/Delete 按钮右侧同组 hover 显示，与 RefAudioPanel 一致，移除文案摘要。
+- [x] **素材区解除配音门控**: 移除 MaterialSelector 的 selectedAudio 遮罩，素材随时可上传管理。
+- [x] **时间轴拖拽排序**: TimelineEditor 色块支持 HTML5 Drag & Drop 调换素材顺序。
+- [x] **截取设置 Range Slider**: ClipTrimmer 改为单轨道双手柄（紫色起点+粉色终点），替换两个独立滑块。
+- [x] **截取设置视频预览**: 视频区域可播放/暂停，从 sourceStart 到 sourceEnd 自动停止，拖拽手柄时实时 seek。
+
+#### 第四阶段：历史文案 + Bug 修复
+- [x] **历史文案保存与加载**: 新增 `useSavedScripts` hook，手动保存/加载/删除历史文案，独立 localStorage 持久化。
+- [x] **时间轴拖拽修复**: `reorderSegments` 从属性交换改为数组移动（splice），修复拖拽后时长不跟随素材的 Bug。
+- [x] **按钮视觉统一**: 文案编辑区 4 个按钮统一为固定高度 `h-7`，移除多余 `<span>` 嵌套。
+- [x] **底部栏调整**: "保存文案"按钮移至底部右侧，移除预计时长显示。
+
+#### 第五阶段：字幕语言不匹配 + 视频比例错位修复
+- [x] **字幕用原文替换 Whisper 转录**: `align()` 新增 `original_text` 参数，字幕文字永远用配音保存的原始文案。
+- [x] **Remotion 动态视频尺寸**: `calculateMetadata` 从 props 读取真实尺寸，修复标题/字幕比例错位。
+- [x] **英文空格丢失修复**: `split_word_to_chars` 遇到空格时 flush buffer + pending_space 标记。
+
+#### 第六阶段：参考音频自动转写 + 语速控制
+- [x] **Whisper 自动转写 ref_text**: 上传参考音频时自动调用 Whisper 转写内容作为 ref_text，不再使用前端固定文字。
+- [x] **参考音频自动截取**: 超过 10 秒自动在静音点截取（ffmpeg silencedetect），末尾 0.1 秒淡出避免截断爆音。
+- [x] **重新识别功能**: 新增 `POST /ref-audios/{id}/retranscribe` 端点 + 前端 RotateCw 按钮，旧音频可重新转写并截取。
+- [x] **语速控制**: 全链路 speed 参数（前端选择器 → 持久化 → 后端 → CosyVoice `inference_zero_shot(speed=)`），5 档：较慢(0.8)/稍慢(0.9)/正常(1.0)/稍快(1.1)/较快(1.2)。
+- [x] **缺少参考音频门控**: 声音克隆模式下未选参考音频时，生成配音按钮禁用 + 黄色警告提示。
+- [x] **Whisper 语言自动检测**: `transcribe()` language 参数改为可选（默认 None = 自动检测），支持多语言参考音频。
+- [x] **前端清理**: 移除固定 ref_text 常量、朗读引导文字，简化为"上传任意语音样本，系统将自动识别内容并克隆声音"。
+
+### Day 22: 多素材优化 + AI 翻译 + TTS 多语言
+- [x] **多素材 Bug 修复**: 6 个高优 Bug（边界溢出、单段 fallback、除零、duration 校验、Whisper 兜底、空列表检查）。
+- [x] **架构重构**: 多素材从"逐段 LatentSync"重构为"先拼接再推理"，推理次数 N→1。
+- [x] **前端优化**: payload 安全、进度消息、上传自动选中、Material 接口统一、拖拽修复、素材上限 4 个。
+- [x] **AI 多语言翻译**: 新增 `/api/ai/translate` 接口，前端 9 种语言翻译 + 还原原文。
+- [x] **TTS 多语言**: EdgeTTS 10 语言声音列表、翻译自动切换声音、声音克隆 language 透传、textLang 持久化。
+
+### Day 21: 缺陷修复 + 浮动预览 + 发布重构 + 架构优化 + 多素材生成
+- [x] **Remotion 崩溃容错**: 渲染进程 SIGABRT 退出时检查输出文件，避免误判失败导致标题/字幕丢失。
+- [x] **首页作品选择持久化**: 修复 `fetchGeneratedVideos` 无条件覆盖恢复值的问题，新增 `preferVideoId` 参数控制选中逻辑。
+- [x] **发布页作品选择持久化**: 根因为签名 URL 不稳定，全面改用 `video.id` 替代 `path` 进行选择/持久化/比较。
+- [x] **预取缓存补全**: 首页预取发布页数据时加入 `id` 字段，确保缓存数据可用于持久化匹配。
+- [x] **浮动样式预览窗口**: 标题字幕预览改为 `position: fixed` 浮动窗口，固定左上角，滚动时始终可见。
+- [x] **移动端适配**: ScriptEditor 按钮换行、预览默认比例改为 9:16 竖屏。
+- [x] **多平台发布重构**: 平台配置独立化（DOUYIN_*/WEIXIN_*）、用户隔离 Cookie 管理、抖音刷脸验证二维码、微信发布流程优化。
+- [x] **前端结构微调**: ScriptExtractionModal 迁移到 features/、contexts 迁移到 shared/contexts/、清理空目录。
+- [x] **后端模块分层**: materials/tools/ref_audios 三个模块补全 router+schemas+service 分层。
+- [x] **开发规范更新**: BACKEND_DEV.md 新增渐进原则、DOC_RULES.md 取消 TASK_COMPLETE.md 手动触发约束。
+- [x] **文档全面更新**: BACKEND_DEV/README、FRONTEND_DEV、DEPLOY_MANUAL、README.md 同步更新。
+- [x] **多素材视频生成（多机位效果）**: 支持多选素材 + 拖拽排序，按素材数量均分音频时长（对齐 Whisper 字边界）自动切换机位。逐段 LatentSync + FFmpeg 拼接。前端 @dnd-kit 拖拽排序 UI。
+- [x] **字幕开关移除**: 默认启用逐字高亮字幕，移除开关及相关死代码。
+- [x] **视频格式扩展**: 上传支持 mkv/webm/flv/wmv/m4v/ts/mts 等常见格式。
+- [x] **Watchdog 优化**: 健康检查阈值提高到 5 次，新增重启冷却期 120 秒，避免误重启。
+- [x] **多素材 Bug 修复**: 修复标点分句方案对无句末标点文案无效（改为均分方案）、音频时间偏移导致口型不对齐等缺陷。
+
+### Day 20: 代码质量与安全优化
+- [x] **功能性修复**: LatentSync 回退逻辑、任务状态接口认证、User 类型统一。
+- [x] **性能优化**: N+1 查询修复、视频上传流式处理、httpx 异步替换、GLM 异步包装。
+- [x] **安全修复**: 硬编码 Cookie 配置化、日志敏感信息脱敏、ffprobe 安全调用、CORS 配置化。
+- [x] **配置优化**: 存储路径环境变量化、Remotion 预编译加速、LatentSync 绝对路径。
+- [x] **文档更新**: 更新 DOC_RULES.md 清单，补齐后端与部署文档；更新 SUBTITLE_DEPLOY.md, FRONTEND_DEV.md, implementation_plan.md。
+- [x] **缺陷修复**: 修复 Remotion 路径解析、发布页持久化竞态、首页选中回归、素材闭包陷阱。
+
+### Day 19: 自动发布稳定性与发布体验优化 🚀
+- [x] **抖音发布稳定性**: 上传入口、封面流程、发布重试、登录失效识别与网络失败快速返回全面增强。
+- [x] **视频号发布修复**: 标题+标签统一写入“视频描述”，`post_create` 成功信号快速判定，超时改为失败返回。
+- [x] **成功截图闭环**: 抖音/视频号发布成功截图接入前端，支持用户隔离存储与鉴权访问。
+- [x] **截图观感优化**: 成功截图延后 3 秒并改为视口截图，修复“截图内容仅占 1/3”问题。
+- [x] **调试能力开关化**: 新增视频号录屏配置，默认可按环境变量开关，失败排障更直观。
+- [x] **启动链路统一**: 合并为 `run_backend.sh`（xvfb + headful），统一端口 `8006`，减少多进程混淆。
+- [x] **发布页防误操作**: 发布中按钮提示“请勿刷新或关闭网页”，并启用刷新/关页二次确认拦截。
+- [ ] **后续优化**: 发布任务状态恢复机制（任务化 + 状态持久化 + 前端轮询恢复）。
+
+### Day 18: 后端模块化与规范完善
+- [x] **模块化迁移**: 路由透传 `modules/*`，业务逻辑集中到 service/workflow。
+- [x] **视频生成拆分**: 生成流程下沉 workflow，任务状态统一 TaskStore。
+- [x] **Redis 任务存储**: Redis 优先，不可用自动回退内存。
+- [x] **仓储层抽离**: Supabase 访问统一 `repositories/*`，deps/auth/admin 全面替换。
+- [x] **响应规范**: 统一 `success/message/data/code` + 全局异常处理。
+- [x] **素材重命名**: 新增重命名接口与 Storage `move_file`。
+- [x] **平台顺序调整**: 抖音/微信视频号/B站/小红书，移除快手。
+- [x] **后端开发规范**: 新增 `BACKEND_DEV.md`，README 同步模块化结构。
+- [x] **发布管理体验**: 首页预取路由 + 发布页骨架与缓存，进入更快。
+- [x] **素材加载优化**: 素材列表并发签名 URL，骨架数量动态。
+- [x] **预览加载优化**: `preload="metadata"` + hover 预取。
+
+### Day 17: 前端重构与体验优化
+- [x] **UI 组件拆分**: 首页拆分为独立组件，降低 `page.tsx` 复杂度。
+- [x] **轻量 FSD 迁移**: `app` 页面轻量化，逻辑集中到 `features/*/model`，通用能力下沉 `shared/*`。
+- [x] **Controller Hooks**: Home/Publish 页面逻辑集中到 Controller Hook，Page 仅组合渲染。
+- [x] **通用工具抽取**: `media.ts` 统一 API Base / URL / 日期格式化。
+- [x] **交互优化**: 选择项持久化、列表内定位、刷新回顶部、最新作品优先预览。
+- [x] **发布页改造**: 作品列表卡片化 + 搜索 + 预览弹窗。
+- [x] **预览体验**: 预览弹窗统一头部样式与提示文案。
+- [x] **预览一致性**: 标题/字幕预览按素材分辨率缩放。
+- [x] **标题同步与限制**: 片头标题同步发布标题，输入法合成态兼容，限制 15 字。
+- [x] **样式默认与持久化**: 默认样式与字号调整，刷新保留用户选择。
+- [x] **性能微优化**: 列表渲染优化 + 并行请求 + localStorage 防抖。
+- [x] **资源能力**: 字体/BGM 资源库 + `/api/assets` 接入。
+- [x] **音频与字幕修复**: BGM 混音稳定性与字幕断句优化。
+- [x] **持久化修复**: 接入 `useHomePersistence`，恢复 `isRestored` 逻辑并通过构建。
+- [x] **预览与选择修复**: 发布预览兼容签名 URL，音频试听路径解析，素材/BGM 回退有效项。
+- [x] **体验细节优化**: 录音预览 URL 回收，预览弹窗滚动恢复，全局任务提示挂载。
+
+### Day 16: 深度性能优化
+- [x] **Qwen-TTS 加速**: 集成 Flash Attention 2 (已停用，被 CosyVoice 3.0 替换)。
+- [x] **服务守护**: 开发 `Watchdog` 看门狗机制，自动监控并重启僵死服务。
+- [x] **LatentSync 性能确认**: 验证 DeepCache + 原生 Flash Attn 生效。
+- [x] **文档重构**: 全面更新 README、部署手册及后端文档。
+
+### Day 15: 手机号认证迁移
+- [x] **认证系统升级**: 从邮箱迁移至 11 位手机号注册/登录。
+- [x] **账户管理**: 新增修改密码、有效期显示、安全退出功能。
+- [x] **AI 文案助手**: 升级 GLM-4.7-Flash，支持 B站/抖音链接提取与洗稿。
+
+### Day 14: AI 增强与体验优化
+- [x] **AI 标题/标签**: 集成 GLM-4 API 自动生成视频元数据。
+- [x] **字幕升级**: Remotion 逐字高亮字幕 (卡拉OK效果) 及动画片头。
+- [x] **模型升级**: 声音克隆已迁移至 CosyVoice 3.0 (0.5B)。
+
+### Day 13: 声音克隆集成
+- [x] **声音克隆微服务**: 封装 CosyVoice 3.0 为独立 API (8010端口，替换 Qwen3-TTS)。
+- [x] **参考音频管理**: Supabase 存储桶配置与管理接口。
+- [x] **多模态 TTS**: 前端支持 EdgeTTS / Clone Voice 切换。
+
+### Day 12: 移动端适配
+- [x] **iOS 兼容**: 修复 Safari 安全区域、状态栏颜色、Cookie 拦截问题。
+- [x] **响应式 UI**: 移动端 Header 与发布页重构。
+
+### Day 11: 上传架构重构
+- [x] **直传优化**: 前端直传 Supabase Storage，解决 Nginx 30s 超时问题。
+- [x] **数据隔离**: 用户素材/视频按 UserID 物理隔离。
+
+### Day 10: HTTPS 与安全
+- [x] **HTTPS 部署**: 配置 SSL 证书与 Nginx 反向代理。
+- [x] **安全加固**: Supabase Studio 增加 Basic Auth 保护。
+
+### Day 9: 认证系统与发布闭环
+- [x] **用户系统**: 基于 Supabase Auth 实现 JWT 认证。
+- [x] **发布闭环**: 验证 B站/抖音/小红书 自动发布流程。
+- [x] **服务自愈**: 配置 PM2 进程守护。
+
+### Day 1-8: 核心功能构建
+- [x] **Day 8**: 历史记录持久化与文件管理。
+- [x] **Day 7**: 社交媒体自动登录与多平台发布。
+- [x] **Day 6**: **LatentSync 1.6** 升级与服务器部署。
+- [x] **Day 5**: 前端视频上传与进度反馈。
+- [x] **Day 4**: MuseTalk (旧版) 口型同步修复。
+- [x] **Day 3**: 服务器环境配置与模型权重下载。
+- [x] **Day 1-2**: 项目基础框架 (FastAPI + Next.js) 搭建。
+
+---
+
+## 🛤️ 后续规划 (Roadmap)
+
+### 🔴 优先待办
+- [x] ~~**配音前置重构 — 第二阶段**: 素材片段截取 + 语音时间轴编排~~ ✅ Day 23 已完成
+- [ ] **批量生成架构**: 支持 Excel 导入，批量生产视频。
+- [ ] **定时任务后台化**: 迁移前端触发的定时发布到后端 APScheduler。
+- [ ] **发布任务恢复机制**: 发布任务化 + 状态持久化 + 前端断点恢复，解决刷新后状态丢失。
+
+### 🔵 长期探索
+- [ ] **容器化交付**: 提供完整的 Docker Compose 一键部署包。
+- [ ] **分布式队列**: 引入 Celery + Redis 处理超高并发任务。
+
+---
+
+## 📊 模块完成度
+
+| 模块 | 进度 | 状态 |
+|------|------|------|
+| **核心 API** | 100% | ✅ 稳定 |
+| **Web UI** | 100% | ✅ 稳定 (移动端适配) |
+| **唇形同步** | 100% | ✅ LatentSync 1.6 |
+| **TTS 配音** | 100% | ✅ EdgeTTS + CosyVoice 3.0 + 配音前置 + 时间轴编排 + 自动转写 + 语速控制 + 语气控制 |
+| **自动发布** | 100% | ✅ 抖音/微信视频号/B站/小红书 |
+| **用户认证** | 100% | ✅ 手机号 + JWT |
+| **付费会员** | 100% | ✅ 支付宝电脑网站支付 + 自动激活 |
+| **部署运维** | 100% | ✅ PM2 + Watchdog |
+
+---
+
+## 📎 相关文档
+
+- [详细开发日志 (DevLogs)](Docs/DevLogs/)
+- [部署手册 (DEPLOY_MANUAL)](Docs/DEPLOY_MANUAL.md)
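
Reviewer note: the Day 29 and Day 30 entries above describe a global `asyncio.Semaphore(2)` for generation jobs and a `_run_blocking()` thread-pool helper for synchronous FFmpeg calls. The sketch below shows how the two could fit together. It is a minimal illustration, not the code in `workflow.py`: the names `run_blocking`, `fake_ffmpeg_step`, and `generate_video`, and the pool size of 4, are assumptions.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool; the changelog does not state how many workers are used.
_executor = ThreadPoolExecutor(max_workers=4)


async def run_blocking(func, *args):
    """Run a synchronous callable in a worker thread so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, func, *args)


def fake_ffmpeg_step(seconds: float) -> float:
    """Stand-in for a blocking FFmpeg or ffprobe call."""
    time.sleep(seconds)
    return seconds


async def generate_video(job_id: int, limiter: asyncio.Semaphore, stats: dict) -> int:
    """One generation job; the semaphore caps how many run at the same time."""
    async with limiter:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        try:
            await run_blocking(fake_ffmpeg_step, 0.05)
        finally:
            stats["running"] -= 1
    return job_id


async def main() -> dict:
    limiter = asyncio.Semaphore(2)  # at most two jobs in flight
    stats = {"running": 0, "peak": 0, "done": []}
    jobs = [generate_video(i, limiter, stats) for i in range(5)]
    stats["done"] = await asyncio.gather(*jobs)
    return stats


result = asyncio.run(main())
```

The semaphore bounds whole jobs, while the executor only moves individual blocking calls off the event loop, so the two limits can be tuned independently.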
--- a/README.md
+++ b/README.md
@@ -16,26 +16,31 @@
 ## ✨ 功能特性

 ### 核心能力
-- 🎬 **高清唇形同步** - 混合方案：短视频 (<120s) 用 LatentSync 1.6 (高质量 Latent Diffusion)，长视频 (>=120s) 用 MuseTalk 1.5 (实时级单步推理)，自动路由 + 回退。前端可选模型：默认模型（阈值自动路由）/ 快速模型（强制 MuseTalk）/ 高级模型（强制 LatentSync）。
+- 🎬 **高清唇形同步** - 混合方案：短视频（本仓库当前 `.env` 阈值 100s，可配）用 LatentSync 1.6（高质量 Latent Diffusion），长视频用 MuseTalk 1.5（实时级单步推理），自动路由 + 回退。前端可选模型：默认模型（阈值自动路由）/ 快速模型（速度优先）/ 高级模型（质量优先）。
+- 🧠 **小脸口型质量补偿（可选）** - 本地唇形路径支持小脸检测 + 裁切 + 稀疏关键帧超分 + 下半脸贴回补偿链路；默认关闭（`LIPSYNC_SMALL_FACE_ENHANCE=false`），失败自动回退原流程（fail-open）。
 - 🎙️ **多模态配音** - 支持 **EdgeTTS** (微软超自然语音, 10 语言) 和 **CosyVoice 3.0** (3秒极速声音克隆, 9语言+18方言, 语速/语气可调)。上传参考音频自动 Whisper 转写 + 智能截取。配音前置工作流：先生成配音 → 选素材 → 生成视频。
 - 📝 **智能字幕** - 集成 faster-whisper + Remotion，自动生成逐字高亮 (卡拉OK效果) 字幕。
 - 🎨 **样式预设** - 12 种标题 + 8 种字幕样式预设，支持预览 + 字号调节 + 自定义字体库。CSS 原生描边渲染，清晰无重影。
 - 🏷️ **标题显示模式** - 片头标题支持 `短暂显示` / `常驻显示`，默认短暂显示（4秒），用户偏好自动持久化。
 - 📌 **片头副标题** - 可选副标题显示在主标题下方，独立样式配置，AI 可同时生成，20 字限制。
 - 🖼️ **作品预览一致性** - 标题/字幕预览与 Remotion 成片统一响应式缩放和自动换行，窄屏画布也稳定显示。
-- 🎞️ **多素材多机位** - 支持多选素材 + 时间轴编辑器 (wavesurfer.js 波形可视化)，拖拽分割线调整时长、拖拽排序切换机位、按 `source_start/source_end` 截取片段。
+- 🎞️ **多素材多机位** - 支持多选素材 + 时间轴编辑器 (wavesurfer.js 波形可视化)，主素材连续循环播放 + 浮动插入镜头块自由叠加，拖拽移动位置、ClipTrimmer 统一编辑截取范围与时长，支持"设为主素材"切换。
 - 📐 **画面比例控制** - 时间轴一键切换 `9:16 / 16:9` 输出比例，生成链路全程按目标比例处理。
-- 💾 **用户偏好持久化** - 首页状态统一恢复/保存，刷新后延续上次配置。历史文案手动保存与加载。
-- 🎵 **背景音乐** - 试听 + 音量控制 + 混音，保持配音音量稳定。
-- 🤖 **AI 辅助创作** - 内置 GLM-4.7-Flash，支持 B站/抖音链接文案提取、AI 智能改写（支持自定义提示词）、标题/标签自动生成、9 语言翻译。
+- 💾 **用户偏好持久化** - 首页状态统一恢复/保存，刷新后延续上次配置；新作品生成后优先选中最新，后续用户手动选择持续持久化。
+- 🎵 **背景音乐** - 试听 + 搜索选择 + 混音（当前前端固定混音系数，保持配音音量稳定）。
+- 🧩 **统一选择器交互** - 首页/发布页业务选择项统一 SelectPopover（桌面 Popover / 移动端 BottomSheet），支持自动上拉、已选定位与连续预览。
+- 🤖 **AI 辅助创作** - 内置 GLM-4.7-Flash，支持 B站/抖音链接文案提取、AI 智能改写（支持自定义提示词）、文案深度学习（博主话题分析+文案生成）、标题/标签自动生成、9 语言翻译。

 ### 平台化功能
 - 📱 **全自动发布** - 支持抖音/微信视频号/B站/小红书立即发布；扫码登录 + Cookie 持久化。
 - 🖥️ **发布管理预览** - 支持签名 URL / 相对路径作品预览，确保可直接播放。
-- 📸 **发布结果可视化** - 抖音/微信视频号发布成功后返回截图，发布页结果卡片可直接查看。
+- 📸 **发布结果可视化** - 抖音/微信视频号/小红书发布成功后返回截图，发布页结果卡片可直接查看。
+- 🧹 **发布后工作区清理引导** - 全平台发布成功后弹出不可误关清理弹窗（失败可重试，达到阈值可暂不清理），仅清输入内容并保留用户偏好。
+- ⬇️ **一键下载直达** - 首页与发布成功弹窗下载统一走同源 `attachment` 接口，不再新开标签页播放视频。
 - 🛡️ **发布防误操作** - 发布进行中自动提示“请勿刷新或关闭网页”，并拦截刷新/关页二次确认。
 - 💳 **付费会员** - 支付宝电脑网站支付自动开通会员，到期自动停用并引导续费，管理员手动激活并存。
 - 🔐 **认证与隔离** - 基于 Supabase 的用户隔离，支持手机号注册/登录、密码管理。
+- 🛡️ **安全基线** - AI/Tools 接口强制登录鉴权、关键上传链路体积限制、生产环境默认密钥启动拦截。
 - 🛡️ **服务守护** - 内置 Watchdog 看门狗机制，自动监控并重启僵死服务，确保 7x24h 稳定运行。
 - 🚀 **性能优化** - 编码流水线从 5-6 次有损编码精简至 3 次（prepare_segment → 模型输出 → Remotion）、compose 流复制免重编码、同分辨率跳过 scale、FFmpeg 超时保护、全局视频生成并发限制 (Semaphore(2))、Remotion 4 并发渲染、MuseTalk rawvideo 管道直编码（消除中间有损文件）、模型常驻服务、双 GPU 流水线并发、Redis 任务 TTL 自动清理、workflow 阻塞调用线程池化。

@@ -61,10 +66,12 @@

 ### 部署运维
 - **[部署手册 (DEPLOY_MANUAL.md)](Docs/DEPLOY_MANUAL.md)** - 👈 **部署请看这里**！包含完整的环境搭建步骤。
+- [多平台发布部署说明 (PUBLISH_DEPLOY.md)](Docs/PUBLISH_DEPLOY.md) - 抖音/微信视频号/B站/小红书登录与自动化发布专项文档。
 - [参考音频服务部署 (COSYVOICE3_DEPLOY.md)](Docs/COSYVOICE3_DEPLOY.md) - 声音克隆模型部署指南。
-- [LatentSync 部署指南 (LATENTSYNC_DEPLOY.md)](Docs/LATENTSYNC_DEPLOY.md) - 唇形同步模型独立部署。
-- [MuseTalk 部署指南 (MUSETALK_DEPLOY.md)](Docs/MUSETALK_DEPLOY.md) - 长视频唇形同步模型部署。
-- [Supabase 部署指南 (SUPABASE_DEPLOY.md)](Docs/SUPABASE_DEPLOY.md) - Supabase 与认证系统配置。
+- [LatentSync 部署指南 (LATENTSYNC_DEPLOY.md)](Docs/LATENTSYNC_DEPLOY.md) - 唇形同步模型独立部署。
+- [MuseTalk 部署指南 (MUSETALK_DEPLOY.md)](Docs/MUSETALK_DEPLOY.md) - 长视频唇形同步模型部署。
+- [小脸口型质量补偿链路部署指南 (FACEENHANCE_DEPLOY.md)](Docs/FACEENHANCE_DEPLOY.md) - 小脸口型质量补偿链路部署与验证。
+- [Supabase 部署指南 (SUPABASE_DEPLOY.md)](Docs/SUPABASE_DEPLOY.md) - Supabase 与认证系统配置。
 - [支付宝部署指南 (ALIPAY_DEPLOY.md)](Docs/ALIPAY_DEPLOY.md) - 支付宝付费开通会员配置。

 ### 开发文档
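
Reviewer note: the README changes above describe hybrid lip-sync routing: audio shorter than the threshold goes to LatentSync, longer audio goes to MuseTalk, and the frontend can override this with a fast or advanced model choice. Below is a minimal sketch of that decision. The mode labels `default` / `fast` / `advanced` and the function name are assumptions; the actual `lipsync_model` values are not shown in this diff.

```python
from typing import Literal

# Hypothetical labels for the three frontend options.
LipsyncMode = Literal["default", "fast", "advanced"]


def select_lipsync_model(
    audio_seconds: float,
    mode: LipsyncMode = "default",
    threshold_seconds: float = 100.0,  # LIPSYNC_DURATION_THRESHOLD in .env.example
    musetalk_available: bool = True,
) -> str:
    """Pick a lip-sync backend; LatentSync is the fallback in every mode."""
    if mode == "advanced":
        return "latentsync"  # quality first
    # "fast" forces MuseTalk; "default" routes by audio duration.
    wants_musetalk = mode == "fast" or audio_seconds >= threshold_seconds
    if wants_musetalk and musetalk_available:
        return "musetalk"
    return "latentsync"


short_clip = select_lipsync_model(45.0)
long_clip = select_lipsync_model(180.0)
long_clip_no_musetalk = select_lipsync_model(180.0, musetalk_available=False)
forced_fast = select_lipsync_model(45.0, mode="fast")
forced_advanced = select_lipsync_model(180.0, mode="advanced")
```

Note that `config.py` defaults the threshold to 120 seconds while `.env.example` sets 100, so the effective value depends on the deployed `.env`.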
--- a/backend/.env.example
+++ b/backend/.env.example
@@ -2,7 +2,7 @@
 # 复制此文件为 .env 并填入实际值

 # 调试模式
-DEBUG=true
+DEBUG=false

 # Redis 配置 (Celery 任务队列)
 REDIS_URL=redis://localhost:6379/0
@@ -83,6 +83,13 @@ MUSETALK_ENCODE_PRESET=slow
 # 音频时长 >= 此阈值（秒）用 MuseTalk，< 此阈值用 LatentSync
 LIPSYNC_DURATION_THRESHOLD=100

+# =============== 小脸口型质量补偿链路 ===============
+LIPSYNC_SMALL_FACE_ENHANCE=true
+LIPSYNC_SMALL_FACE_THRESHOLD=256
+LIPSYNC_SMALL_FACE_UPSCALER=gfpgan
+LIPSYNC_SMALL_FACE_GPU_ID=0
+LIPSYNC_SMALL_FACE_FAIL_OPEN=true
+
 # =============== 上传配置 ===============
 # 最大上传文件大小 (MB)
 MAX_UPLOAD_SIZE_MB=500
--- a/backend/app/core/config.py
+++ b/backend/app/core/config.py
@@ -43,6 +43,16 @@ class Settings(BaseSettings):
    DOUYIN_KEEP_SUCCESS_VIDEO: bool = False
    DOUYIN_RECORD_VIDEO_WIDTH: int = 1280
    DOUYIN_RECORD_VIDEO_HEIGHT: int = 720
+
+    # Xiaohongshu Playwright 配置
+    XIAOHONGSHU_HEADLESS_MODE: str = "headless-new"
+    XIAOHONGSHU_USER_AGENT: str = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/144.0.0.0 Safari/537.36"
+    XIAOHONGSHU_LOCALE: str = "zh-CN"
+    XIAOHONGSHU_TIMEZONE_ID: str = "Asia/Shanghai"
+    XIAOHONGSHU_CHROME_PATH: str = "/usr/bin/google-chrome"
+    XIAOHONGSHU_BROWSER_CHANNEL: str = ""
+    XIAOHONGSHU_FORCE_SWIFTSHADER: bool = True
+    XIAOHONGSHU_DEBUG_ARTIFACTS: bool = False
    
    # TTS 配置
    DEFAULT_TTS_VOICE: str = "zh-CN-YunxiNeural"
@@ -68,6 +78,13 @@ class Settings(BaseSettings):
    # 混合唇形同步路由
    LIPSYNC_DURATION_THRESHOLD: float = 120.0       # 秒，>=此值用 MuseTalk

+    # 小脸口型质量补偿链路
+    LIPSYNC_SMALL_FACE_ENHANCE: bool = False
+    LIPSYNC_SMALL_FACE_THRESHOLD: int = 256
+    LIPSYNC_SMALL_FACE_UPSCALER: str = "codeformer"
+    LIPSYNC_SMALL_FACE_GPU_ID: int = 0
+    LIPSYNC_SMALL_FACE_FAIL_OPEN: bool = True
+
    # Supabase 配置
    SUPABASE_URL: str = ""
    SUPABASE_PUBLIC_URL: str = ""  # 公网访问地址，用于生成前端可访问的 URL
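
Reviewer note: the `LIPSYNC_SMALL_FACE_ENHANCE` and `LIPSYNC_SMALL_FACE_FAIL_OPEN` settings above imply a fail-open contract: if enhancement is disabled or fails, generation continues on the regular path. A minimal sketch of that contract follows. `run_with_fail_open` and the `enhance` callable are hypothetical names, and the real integration in `lipsync_service.py` is more involved than this wrapper.

```python
from typing import Callable


def run_with_fail_open(
    enhance: Callable[[str], str],
    video_path: str,
    enabled: bool,
    fail_open: bool,
) -> str:
    """Return the enhanced video path, or the original path when enhancement is off or fails."""
    if not enabled:
        return video_path
    try:
        return enhance(video_path)
    except Exception:
        if fail_open:
            # Fall back to the regular lip-sync path instead of failing the job.
            return video_path
        raise


def broken_enhancer(path: str) -> str:
    """Stand-in for an enhancer whose dependencies are missing."""
    raise RuntimeError("gfpgan weights not found")


def working_enhancer(path: str) -> str:
    """Stand-in for a successful enhancement run."""
    return path.replace(".mp4", "_enhanced.mp4")


disabled = run_with_fail_open(working_enhancer, "in.mp4", enabled=False, fail_open=True)
enhanced = run_with_fail_open(working_enhancer, "in.mp4", enabled=True, fail_open=True)
fallback = run_with_fail_open(broken_enhancer, "in.mp4", enabled=True, fail_open=True)
```

With `fail_open=False` the same failure propagates to the caller, which is useful when validating the enhancement chain itself.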
--- a/backend/app/main.py
+++ b/backend/app/main.py
@@ -130,6 +130,20 @@ app.include_router(generated_audios_router, prefix="/api/generated-audios", tags
 app.include_router(payment_router)  # /api/payment


+@app.on_event("startup")
+async def check_jwt_secret():
+    if settings.JWT_SECRET_KEY == "your-secret-key-change-in-production":
+        if not settings.DEBUG:
+            raise RuntimeError(
+                "JWT_SECRET_KEY is still the default value! "
+                "Set a strong random secret in .env before running in production (DEBUG=False)."
+            )
+        logger.critical(
+            "JWT_SECRET_KEY is still the default value! "
+            "Set a strong random secret in .env for production."
+        )
+
+
 @app.on_event("startup")
 async def init_admin():
    """
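
Reviewer note: with `DEBUG=false` the new startup check refuses to boot while `JWT_SECRET_KEY` still holds the placeholder, so deployments need a real secret in `.env`. One way to produce one with the standard library is sketched below. The helper names and the 32-character minimum are assumptions, not project rules.

```python
import secrets

# Placeholder value that the startup check in main.py rejects.
DEFAULT_JWT_SECRET = "your-secret-key-change-in-production"


def generate_jwt_secret(num_bytes: int = 48) -> str:
    """Return a URL-safe random string suitable for JWT_SECRET_KEY."""
    return secrets.token_urlsafe(num_bytes)


def is_safe_secret(value: str, min_length: int = 32) -> bool:
    """Reject the placeholder and anything shorter than the assumed minimum."""
    return value != DEFAULT_JWT_SECRET and len(value) >= min_length


new_secret = generate_jwt_secret()
```

The generated value can be pasted into `backend/.env` as `JWT_SECRET_KEY=...`; rotating it invalidates all tokens issued under the previous key.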
--- a/backend/app/modules/ai/router.py
+++ b/backend/app/modules/ai/router.py
@@ -4,11 +4,12 @@ AI 相关 API 路由

 from typing import Optional

-from fastapi import APIRouter, HTTPException
+from fastapi import APIRouter, Depends, HTTPException
 from pydantic import BaseModel
 from loguru import logger

 from app.services.glm_service import glm_service
+from app.core.deps import get_current_user
 from app.core.response import success_response


@@ -40,7 +41,7 @@ class TranslateRequest(BaseModel):


 @router.post("/translate")
-async def translate_text(req: TranslateRequest):
+async def translate_text(req: TranslateRequest, current_user: dict = Depends(get_current_user)):
    """
    AI 翻译文案

@@ -57,11 +58,11 @@ async def translate_text(req: TranslateRequest):
        return success_response({"translated_text": translated})
    except Exception as e:
        logger.error(f"Translate failed: {e}")
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail="翻译服务暂时不可用，请稍后重试")


 @router.post("/generate-meta")
-async def generate_meta(req: GenerateMetaRequest):
+async def generate_meta(req: GenerateMetaRequest, current_user: dict = Depends(get_current_user)):
    """
    AI 生成视频标题和标签

@@ -80,11 +81,11 @@ async def generate_meta(req: GenerateMetaRequest):
        ).model_dump())
    except Exception as e:
        logger.error(f"Generate meta failed: {e}")
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail="生成标题标签失败，请稍后重试")


 @router.post("/rewrite")
-async def rewrite_script(req: RewriteRequest):
+async def rewrite_script(req: RewriteRequest, current_user: dict = Depends(get_current_user)):
    """AI 改写文案"""
    if not req.text or not req.text.strip():
        raise HTTPException(status_code=400, detail="文案不能为空")
@@ -95,4 +96,4 @@ async def rewrite_script(req: RewriteRequest):
        return success_response({"rewritten_text": rewritten})
    except Exception as e:
        logger.error(f"Rewrite failed: {e}")
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail="改写服务暂时不可用，请稍后重试")
--- a/backend/app/modules/generated_audios/service.py
+++ b/backend/app/modules/generated_audios/service.py
@@ -152,9 +152,9 @@ async def generate_audio_task(task_id: str, req: GenerateAudioRequest, user_id:
        task_store.update(task_id, {
            "status": "failed",
            "message": f"配音生成失败: {str(e)}",
-            "error": traceback.format_exc(),
+            "error": str(e),
        })
-        logger.error(f"Generate audio failed: {e}")
+        logger.error(f"Generate audio failed: {e}\n{traceback.format_exc()}")


 async def list_generated_audios(user_id: str) -> dict:
@@ -215,6 +215,30 @@ async def list_generated_audios(user_id: str) -> dict:
    return GeneratedAudioListResponse(items=items).model_dump()


+async def delete_all_generated_audios(user_id: str) -> tuple[int, int]:
+    """删除用户所有生成的配音（.wav + .json），返回 (删除数量, 失败数量)"""
+    try:
+        files = await storage_service.list_files(BUCKET, user_id, strict=True)
+        deleted_count = 0
+        failed_count = 0
+        for f in files:
+            name = f.get("name", "")
+            if not name or name == ".emptyFolderPlaceholder":
+                continue
+            if name.endswith("_audio.wav") or name.endswith("_audio.json"):
+                full_path = f"{user_id}/{name}"
+                try:
+                    await storage_service.delete_file(BUCKET, full_path)
+                    deleted_count += 1
+                except Exception as e:
+                    failed_count += 1
+                    logger.warning(f"Delete audio file failed: {full_path}, {e}")
+        return deleted_count, failed_count
+    except Exception as e:
+        logger.error(f"Delete all generated audios failed: {e}")
+        return 0, 1
+
+
 async def delete_generated_audio(audio_id: str, user_id: str) -> None:
    if not audio_id.startswith(f"{user_id}/"):
        raise PermissionError("无权删除此文件")
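
Reviewer note: `delete_all_generated_audios` returns a `(deleted, failed)` pair, and a listing failure is reported as `(0, 1)`. The task log describes the cleanup endpoint as having strict success semantics, meaning any failure makes the whole call fail. The sketch below shows how a caller might fold per-bucket results into the `success/message/data` envelope. `summarize_cleanup`, the message strings, and the bucket keys are illustrative assumptions.

```python
def summarize_cleanup(results: dict[str, tuple[int, int]]) -> dict:
    """Combine per-bucket (deleted, failed) counts under strict success semantics."""
    total_deleted = sum(deleted for deleted, _ in results.values())
    total_failed = sum(failed for _, failed in results.values())
    data = {"deleted": total_deleted, "failed": total_failed}
    if total_failed > 0:
        # Any failure fails the whole request so the client keeps its retry dialog open.
        return {"success": False, "message": "cleanup incomplete", "data": data}
    return {"success": True, "message": "cleanup finished", "data": data}


all_ok = summarize_cleanup({"outputs": (3, 0), "generated-audios": (4, 0)})
partial = summarize_cleanup({"outputs": (3, 0), "generated-audios": (2, 1)})
listing_failed = summarize_cleanup({"outputs": (0, 0), "generated-audios": (0, 1)})
```

Keeping the counts in `data` lets the frontend show how much was removed even when the call is reported as failed.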
--- a/backend/app/modules/materials/router.py
+++ b/backend/app/modules/materials/router.py
@@ -14,6 +14,8 @@ router = APIRouter()
 @router.get("/stream/{material_id:path}")
 async def stream_material(material_id: str, current_user: dict = Depends(get_current_user)):
    """直接流式返回素材文件（同源，避免 CORS canvas taint）"""
+    if ".." in material_id:
+        raise HTTPException(400, "非法素材ID")
    user_id = current_user["id"]
    if not material_id.startswith(f"{user_id}/"):
        raise HTTPException(403, "无权访问此素材")
@@ -52,6 +54,8 @@ async def delete_material(material_id: str, current_user: dict = Depends(get_cur
    try:
        await service.delete_material(material_id, user_id)
        return success_response(message="素材已删除")
+    except ValueError as e:
+        raise HTTPException(400, str(e))
    except PermissionError as e:
        raise HTTPException(403, str(e))
    except Exception as e:
--- a/backend/app/modules/materials/service.py
+++ b/backend/app/modules/materials/service.py
@@ -7,6 +7,7 @@ import aiofiles
 from pathlib import Path
 from loguru import logger

+from app.core.config import settings as app_settings
 from app.services.storage import storage_service


@@ -123,6 +124,9 @@ async def upload_material(request, user_id: str) -> dict:
            async for chunk in request.stream():
                await f.write(chunk)
                total_size += len(chunk)
+                max_bytes = app_settings.MAX_UPLOAD_SIZE_MB * 1024 * 1024
+                if total_size > max_bytes:
+                    raise ValueError(f"文件大小超过限制 ({app_settings.MAX_UPLOAD_SIZE_MB}MB)")

                if total_size - last_log > 20 * 1024 * 1024:
                    logger.info(f"Receiving stream... Processed {total_size / (1024*1024):.2f} MB")
@@ -239,6 +243,8 @@ async def list_materials(user_id: str) -> list[dict]:

 async def delete_material(material_id: str, user_id: str) -> None:
    """删除素材"""
+    if ".." in material_id:
+        raise ValueError("非法素材ID")
    if not material_id.startswith(f"{user_id}/"):
        raise PermissionError("无权删除此素材")
    await storage_service.delete_file(
@@ -249,6 +255,8 @@ async def delete_material(material_id: str, user_id: str) -> None:

 async def rename_material(material_id: str, new_name_raw: str, user_id: str) -> dict:
    """重命名素材，返回更新后的素材信息"""
+    if ".." in material_id:
+        raise ValueError("非法素材ID")
    if not material_id.startswith(f"{user_id}/"):
        raise PermissionError("无权重命名此素材")
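
The `..` check added in these hunks matters because a prefix check alone can be bypassed: `u1/../u2/secret.mp4` starts with `u1/` yet resolves outside the user's folder. A minimal sketch of the two-step guard, assuming standalone helpers (`check_material_id` and `classify` are hypothetical names, not functions in this module):

```python
def check_material_id(material_id: str, user_id: str) -> None:
    """Raise ValueError for traversal attempts, PermissionError for foreign paths."""
    # Reject traversal first: a path can start with "<user_id>/" and still
    # climb out of that folder through a ".." segment.
    if ".." in material_id:
        raise ValueError("invalid material id")
    if not material_id.startswith(f"{user_id}/"):
        raise PermissionError("material belongs to another user")


def classify(material_id: str, user_id: str) -> str:
    """Map the guard's outcome to a label, mirroring the 400 / 403 split."""
    try:
        check_material_id(material_id, user_id)
    except ValueError:
        return "invalid"
    except PermissionError:
        return "forbidden"
    return "ok"
```

A plain substring test is deliberately coarse: it also rejects harmless names such as `my..clip.mp4`. That trade-off favors safety over permissiveness.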

--- a/backend/app/modules/ref_audios/service.py
+++ b/backend/app/modules/ref_audios/service.py
@@ -104,6 +104,8 @@ async def upload_ref_audio(file, ref_text: str, user_id: str) -> dict:
    # 创建临时文件
    with tempfile.NamedTemporaryFile(delete=False, suffix=ext) as tmp_input:
        content = await file.read()
+        if len(content) > 5 * 1024 * 1024:
+            raise ValueError("参考音频文件大小不能超过 5MB")
        tmp_input.write(content)
        tmp_input_path = tmp_input.name
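
The reference-audio check above reads the whole upload into memory before comparing its size. Where the payload can be large, a chunked copy can reject it as soon as the cap is crossed. The sketch below assumes plain file-like objects; `copy_with_limit` is a hypothetical helper, not part of this codebase.

```python
import io


def copy_with_limit(src, dst, max_bytes: int, chunk_size: int = 1024 * 1024) -> int:
    """Copy src to dst in chunks; raise ValueError once more than max_bytes arrive."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        total += len(chunk)
        if total > max_bytes:
            # Stop before writing the chunk that crosses the cap.
            raise ValueError(f"payload exceeds {max_bytes} bytes")
        dst.write(chunk)


if __name__ == "__main__":
    out = io.BytesIO()
    print(copy_with_limit(io.BytesIO(b"x" * 10), out, max_bytes=10, chunk_size=4))
```

The caller remains responsible for removing the partially written destination after a `ValueError`, as the upload path in `tools/service.py` does with `os.remove`.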

--- a/backend/app/modules/tools/router.py
+++ b/backend/app/modules/tools/router.py
@@ -1,20 +1,54 @@
-from fastapi import APIRouter, UploadFile, File, Form, HTTPException
+from fastapi import APIRouter, Depends, UploadFile, File, Form, HTTPException
 from typing import Optional
+from urllib.parse import urlparse
 import traceback
 from loguru import logger
+from pydantic import BaseModel, Field, field_validator

+from app.core.deps import get_current_user
 from app.core.response import success_response
 from app.modules.tools import service
+from app.services import creator_scraper
+from app.services.creator_scraper import ALLOWED_INPUT_DOMAINS
+from app.services.glm_service import glm_service

 router = APIRouter()


+class AnalyzeCreatorRequest(BaseModel):
+    url: str = Field(..., description="博主主页链接（仅支持抖音/B站 https 链接）")
+
+    @field_validator("url")
+    @classmethod
+    def validate_url_format(cls, value: str) -> str:
+        candidate = value.strip()
+        if len(candidate) > 500:
+            raise ValueError("链接过长")
+
+        parsed = urlparse(candidate)
+        if parsed.scheme != "https":
+            raise ValueError("仅支持 https 链接")
+
+        hostname = (parsed.hostname or "").lower()
+        if hostname not in ALLOWED_INPUT_DOMAINS:
+            raise ValueError(f"不支持的域名: {hostname}，仅支持抖音和B站")
+
+        return candidate
+
+
+class GenerateTopicScriptRequest(BaseModel):
+    analysis_id: str = Field(..., min_length=8, max_length=80, description="分析结果ID")
+    topic: str = Field(..., min_length=2, max_length=30, description="选中的话题（2-30字）")
+    word_count: int = Field(..., ge=80, le=1000, description="目标字数（80-1000）")
+
+
@router.post("/extract-script")
 async def extract_script_tool(
    file: Optional[UploadFile] = File(None),
    url: Optional[str] = Form(None),
    rewrite: bool = Form(True),
-    custom_prompt: Optional[str] = Form(None)
+    custom_prompt: Optional[str] = Form(None),
+    current_user: dict = Depends(get_current_user),
 ):
    """独立文案提取工具"""
    try:
@@ -29,5 +63,64 @@ async def extract_script_tool(
        logger.error(traceback.format_exc())
        msg = str(e)
        if "Fresh cookies" in msg:
-            msg = "下载失败：目标平台开启了反爬验证，请过段时间重试或直接上传视频文件。"
-        raise HTTPException(500, f"提取失败: {msg}")
+            raise HTTPException(500, "下载失败：目标平台开启了反爬验证，请过段时间重试或直接上传视频文件。")
+        raise HTTPException(500, "文案提取失败，请稍后重试")
+
+
+@router.post("/analyze-creator")
+async def analyze_creator(
+    req: AnalyzeCreatorRequest,
+    current_user: dict = Depends(get_current_user),
+):
+    """Analyze a creator's content and return trending topics."""
+    try:
+        user_id = str(current_user.get("id") or "").strip()
+        if not user_id:
+            raise HTTPException(401, "登录状态无效，请重新登录")
+
+        creator_result = await creator_scraper.scrape_creator_titles(req.url, user_id=user_id)
+        titles = creator_result.get("titles") or []
+        topics = await glm_service.analyze_topics(titles)
+
+        analysis_id = creator_scraper.cache_titles(titles, user_id)
+
+        return success_response({
+            "platform": creator_result.get("platform", ""),
+            "creator_name": creator_result.get("creator_name", ""),
+            "topics": topics,
+            "analysis_id": analysis_id,
+            "fetched_count": creator_result.get("fetched_count", len(titles)),
+        })
+    except ValueError as e:
+        raise HTTPException(400, str(e))
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"Analyze creator failed: {e}")
+        logger.error(traceback.format_exc())
+        raise HTTPException(500, "博主内容分析失败，请稍后重试")
+
+
+@router.post("/generate-topic-script")
+async def generate_topic_script(
+    req: GenerateTopicScriptRequest,
+    current_user: dict = Depends(get_current_user),
+):
+    """Generate a script from the selected topic."""
+    try:
+        user_id = str(current_user.get("id") or "").strip()
+        if not user_id:
+            raise HTTPException(401, "登录状态无效，请重新登录")
+
+        titles = creator_scraper.get_cached_titles(req.analysis_id, user_id)
+        script = await glm_service.generate_script_from_topic(req.topic, req.word_count, titles)
+
+        return success_response({"script": script})
+    except ValueError as e:
+        raise HTTPException(400, str(e))
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"Generate topic script failed: {e}")
+        logger.error(traceback.format_exc())
+        raise HTTPException(500, "文案生成失败，请稍后重试")
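
The `AnalyzeCreatorRequest` validator compares the parsed hostname against an allow-list instead of searching the raw string, which is what defeats look-alike URLs. A standalone sketch of the same checks, using only the standard library: the `ALLOWED_HOSTS` values are placeholders chosen for illustration (the real `ALLOWED_INPUT_DOMAINS` lives in `creator_scraper` and is not shown in this diff), and `validate_creator_url` / `accepts` are hypothetical names.

```python
from urllib.parse import urlparse

# Placeholder allow-list for illustration only; the project's actual
# ALLOWED_INPUT_DOMAINS is defined elsewhere and may differ.
ALLOWED_HOSTS = {"www.douyin.com", "space.bilibili.com"}


def validate_creator_url(value: str, allowed=ALLOWED_HOSTS, max_len: int = 500) -> str:
    """Return the trimmed URL, or raise ValueError if it fails any check."""
    candidate = value.strip()
    if len(candidate) > max_len:
        raise ValueError("url too long")
    parsed = urlparse(candidate)
    if parsed.scheme != "https":
        raise ValueError("only https links are supported")
    # Exact match on the parsed host, not a substring search on the URL.
    hostname = (parsed.hostname or "").lower()
    if hostname not in allowed:
        raise ValueError(f"unsupported host: {hostname}")
    return candidate


def accepts(value: str) -> bool:
    """Convenience wrapper that reports the outcome as a boolean."""
    try:
        validate_creator_url(value)
    except ValueError:
        return False
    return True
```

Exact host matching rejects both `https://www.douyin.com.evil.example/` and `https://evil.example/?next=www.douyin.com`, which a naive `"douyin.com" in url` test would let through.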
--- a/backend/app/modules/tools/service.py
+++ b/backend/app/modules/tools/service.py
@@ -8,7 +8,7 @@ import subprocess
 import traceback
 from pathlib import Path
 from typing import Optional, Any
-from urllib.parse import unquote
+from urllib.parse import unquote, parse_qs, urlparse

 import httpx
 from loguru import logger
@@ -41,7 +41,19 @@ async def extract_script(file=None, url: Optional[str] = None, rewrite: bool = T
                raise ValueError("文件名无效")
            safe_filename = Path(filename).name.replace(" ", "_")
            temp_path = temp_dir / f"tool_extract_{timestamp}_{safe_filename}"
-            await loop.run_in_executor(None, lambda: shutil.copyfileobj(file.file, open(temp_path, "wb")))
+            max_bytes = 500 * 1024 * 1024  # 500MB
+            total_written = 0
+            with open(temp_path, "wb") as dst:
+                while True:
+                    chunk = await loop.run_in_executor(None, file.file.read, 1024 * 1024)
+                    if not chunk:
+                        break
+                    total_written += len(chunk)
+                    if total_written > max_bytes:
+                        dst.close()
+                        os.remove(temp_path)
+                        raise ValueError("上传文件大小不能超过 500MB")
+                    dst.write(chunk)
            logger.info(f"Tool processing upload file: {temp_path}")
        else:
            temp_path = await _download_video(url, temp_dir, timestamp)
@@ -49,6 +61,13 @@ async def extract_script(file=None, url: Optional[str] = None, rewrite: bool = T
        if not temp_path or not temp_path.exists():
            raise ValueError("文件获取失败")

+        # Check the size of the fetched file (500MB cap)
+        max_download_bytes = 500 * 1024 * 1024
+        file_size = temp_path.stat().st_size
+        if file_size > max_download_bytes:
+            os.remove(temp_path)
+            raise ValueError(f"下载的文件过大（{file_size / (1024*1024):.0f}MB），上限 500MB")
+
        # 1.5 安全转换: 强制转为 WAV (16k)
        audio_path = temp_dir / f"extract_audio_{timestamp}.wav"
        try:
@@ -193,10 +212,9 @@ async def _download_douyin_manual(url: str, temp_dir: Path, timestamp: int) -> O

        logger.info(f"[douyin-fallback] Final URL: {final_url}")

-        video_id = None
-        match = re.search(r'/video/(\d+)', final_url)
-        if match:
-            video_id = match.group(1)
+        video_id = _extract_douyin_video_id(final_url)
+        if not video_id:
+            video_id = _extract_douyin_video_id(url)

        if not video_id:
            logger.error("[douyin-fallback] Could not extract video_id")
@@ -217,7 +235,8 @@ async def _download_douyin_manual(url: str, temp_dir: Path, timestamp: int) -> O
                        "cbUrlProtocol": "https", "union": True,
                    }
                )
-                ttwid = ttwid_resp.cookies.get("ttwid", "")
+                fresh_ttwid = ttwid_resp.cookies.get("ttwid")
+                ttwid = str(fresh_ttwid) if fresh_ttwid else ""
                logger.info(f"[douyin-fallback] Got fresh ttwid (len={len(ttwid)})")
        except Exception as e:
            logger.warning(f"[douyin-fallback] Failed to get ttwid: {e}")
@@ -277,6 +296,39 @@ async def _download_douyin_manual(url: str, temp_dir: Path, timestamp: int) -> O
        return None


+def _extract_douyin_video_id(candidate_url: str) -> Optional[str]:
+    """Extract the video ID from a Douyin URL; handles /video/, /share/video/, modal_id, vid and similar forms."""
+    if not candidate_url:
+        return None
+
+    decoded_url = unquote(candidate_url)
+    parsed = urlparse(decoded_url)
+
+    for source in (decoded_url, parsed.path):
+        for pattern in (r"/video/(\d+)", r"/share/video/(\d+)"):
+            match = re.search(pattern, source)
+            if match:
+                return match.group(1)
+
+    id_keys = ("modal_id", "vid", "video_id", "aweme_id", "item_id")
+    for pairs in (parse_qs(parsed.query), parse_qs(parsed.fragment)):
+        for key in id_keys:
+            values = pairs.get(key, [])
+            for value in values:
+                match = re.search(r"(\d+)", value)
+                if match:
+                    return match.group(1)
+
+    inline_match = re.search(
+        r"(?:[?&#](?:modal_id|vid|video_id|aweme_id|item_id)=)(\d+)",
+        decoded_url,
+    )
+    if inline_match:
+        return inline_match.group(1)
+
+    return None
+
+
 async def _download_bilibili_manual(url: str, temp_dir: Path, timestamp: int) -> Optional[Path]:
    """手动下载 Bilibili 视频 (Playwright Fallback)"""
    from playwright.async_api import async_playwright
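
The lookup order in `_extract_douyin_video_id` (path first, then query string, then fragment) can be illustrated with a condensed variant. This is a simplified sketch and not the function from the diff: `extract_video_id` is a hypothetical name, and it drops the final inline-regex fallback.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, unquote, urlparse

# Query/fragment keys that may carry the numeric ID, in lookup order.
ID_KEYS = ("modal_id", "vid", "video_id", "aweme_id", "item_id")


def extract_video_id(url: str) -> Optional[str]:
    """Return the numeric video ID found in the URL, or None."""
    if not url:
        return None
    # Decode first so percent-encoded separators ("%3D") are parsed normally.
    parsed = urlparse(unquote(url))

    # "/video/<id>" also covers "/share/video/<id>", since re.search scans
    # the whole path.
    match = re.search(r"/video/(\d+)", parsed.path)
    if match:
        return match.group(1)

    for pairs in (parse_qs(parsed.query), parse_qs(parsed.fragment)):
        for key in ID_KEYS:
            for value in pairs.get(key, []):
                digits = re.search(r"\d+", value)
                if digits:
                    return digits.group(0)
    return None
```

Trying the path before the query string means a canonical `/video/<id>` link wins even when tracking parameters happen to contain other digits.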
--- a/backend/app/modules/videos/router.py
+++ b/backend/app/modules/videos/router.py
@@ -1,17 +1,80 @@
-from fastapi import APIRouter, BackgroundTasks, Depends
+import os
+import re
+import tempfile
 import uuid

+from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException
+from fastapi.responses import FileResponse
+from loguru import logger
+from starlette.background import BackgroundTask
+
 from app.core.deps import get_current_user
 from app.core.response import success_response
+from app.services.tts_service import TTSService

-from .schemas import GenerateRequest
+from .schemas import GenerateRequest, VoicePreviewRequest
 from .task_store import create_task, get_task, list_tasks
 from .workflow import process_video_generation, get_lipsync_health, get_voiceclone_health
-from .service import list_generated_videos, delete_generated_video
+from .service import list_generated_videos, delete_generated_video, delete_all_generated_videos
+from app.modules.generated_audios.service import delete_all_generated_audios
+from app.services.storage import storage_service


 router = APIRouter()

+PREVIEW_TEXTS = {
+    "zh-CN": "你好，请选择你喜欢的音色吧。",
+    "en-US": "Hello, please choose the voice you like.",
+    "ja-JP": "こんにちは。お好きな音声を選んでください。",
+    "ko-KR": "안녕하세요, 마음에 드는 음성을 선택해 주세요.",
+    "fr-FR": "Bonjour, veuillez choisir la voix que vous préférez.",
+    "de-DE": "Hallo, bitte wählen Sie die Stimme, die Ihnen gefällt.",
+    "es-ES": "Hola, por favor elige la voz que más te guste.",
+    "ru-RU": "Здравствуйте, пожалуйста, выберите голос, который вам нравится.",
+    "it-IT": "Ciao, scegli la voce che preferisci.",
+    "pt-BR": "Olá, escolha a voz de que você mais gosta.",
+}
+
+
+def _cleanup_temp_file(path: str) -> None:
+    try:
+        os.unlink(path)
+    except Exception:
+        pass
+
+
+def _get_voice_locale(voice: str) -> str:
+    parts = voice.split("-")
+    if len(parts) >= 2:
+        return f"{parts[0]}-{parts[1]}"
+    return "zh-CN"
+
+
+def _get_preview_text_for_voice(voice: str) -> str:
+    locale = _get_voice_locale(voice)
+    return PREVIEW_TEXTS.get(locale, PREVIEW_TEXTS["zh-CN"])
+
+
+async def _render_voice_preview(voice: str, text: str) -> FileResponse:
+    tmp_file = tempfile.NamedTemporaryFile(prefix="voice_preview_", suffix=".mp3", delete=False)
+    output_path = tmp_file.name
+    tmp_file.close()
+
+    tts = TTSService()
+    try:
+        await tts.generate_audio(text=text, voice=voice, output_path=output_path)
+    except Exception as e:
+        _cleanup_temp_file(output_path)
+        logger.error(f"音色试听生成失败: voice={voice}, error={e}")
+        raise HTTPException(status_code=500, detail="音色试听生成失败，请稍后重试")
+
+    return FileResponse(
+        path=output_path,
+        media_type="audio/mpeg",
+        filename="voice_preview.mp3",
+        background=BackgroundTask(_cleanup_temp_file, output_path),
+    )
+

@router.post("/generate")
 async def generate_video(
@@ -53,12 +116,91 @@ async def voiceclone_health():
    return success_response(await get_voiceclone_health())


+@router.post("/cleanup")
+async def cleanup_workspace(current_user: dict = Depends(get_current_user)):
+    user_id = current_user["id"]
+
+    videos_deleted, videos_failed = await delete_all_generated_videos(user_id)
+    audios_deleted, audios_failed = await delete_all_generated_audios(user_id)
+
+    if videos_failed > 0 or audios_failed > 0:
+        raise HTTPException(
+            status_code=500,
+            detail=(
+                f"工作区清理不完整：视频删除失败 {videos_failed} 个，"
+                f"配音删除失败 {audios_failed} 个，请重试"
+            ),
+        )
+
+    return success_response({
+        "videos_deleted": videos_deleted,
+        "audios_deleted": audios_deleted,
+    }, message="工作区已清理")
+
+
@router.get("/generated")
 async def list_generated(current_user: dict = Depends(get_current_user)):
    return success_response(await list_generated_videos(current_user["id"]))


+@router.get("/generated/{video_id}/download")
+async def download_generated(video_id: str, current_user: dict = Depends(get_current_user)):
+    if not re.fullmatch(r'[A-Za-z0-9_-]+', video_id):
+        raise HTTPException(status_code=400, detail="非法 video_id")
+    user_id = current_user["id"]
+    storage_path = f"{user_id}/{video_id}.mp4"
+    local_path = storage_service.get_local_file_path(
+        bucket=storage_service.BUCKET_OUTPUTS,
+        path=storage_path,
+    )
+    if not local_path or not os.path.exists(local_path):
+        raise HTTPException(status_code=404, detail="视频文件不存在")
+    return FileResponse(
+        path=local_path,
+        media_type="video/mp4",
+        filename=f"{video_id}.mp4",
+        headers={"Content-Disposition": f'attachment; filename="{video_id}.mp4"'},
+    )
+
+
@router.delete("/generated/{video_id}")
 async def delete_generated(video_id: str, current_user: dict = Depends(get_current_user)):
+    if not re.fullmatch(r'[A-Za-z0-9_-]+', video_id):
+        raise HTTPException(status_code=400, detail="非法 video_id")
    result = await delete_generated_video(current_user["id"], video_id)
    return success_response(result, message="视频已删除")
+
+
+@router.post("/voice-preview")
+async def preview_voice_post(
+    req: VoicePreviewRequest,
+    current_user: dict = Depends(get_current_user),
+):
+    # Reuse the shared auth dependency; the endpoint itself does not need user_id
+    _ = current_user
+
+    voice = req.voice.strip()
+    text = req.text.strip()
+
+    if not voice:
+        raise HTTPException(status_code=400, detail="voice 不能为空")
+    if not text:
+        raise HTTPException(status_code=400, detail="text 不能为空")
+
+    return await _render_voice_preview(voice=voice, text=text)
+
+
+@router.get("/voice-preview")
+async def preview_voice_get(
+    voice: str,
+    current_user: dict = Depends(get_current_user),
+):
+    # Reuse the shared auth dependency; the endpoint itself does not need user_id
+    _ = current_user
+
+    voice_value = voice.strip()
+    if not voice_value:
+        raise HTTPException(status_code=400, detail="voice 不能为空")
+
+    text = _get_preview_text_for_voice(voice_value)
+    return await _render_voice_preview(voice=voice_value, text=text)
--- a/backend/app/modules/videos/schemas.py
+++ b/backend/app/modules/videos/schemas.py
@@ -1,4 +1,4 @@
-from pydantic import BaseModel
+from pydantic import BaseModel, Field
 from typing import Optional, List, Literal


@@ -39,3 +39,8 @@ class GenerateRequest(BaseModel):
    custom_assignments: Optional[List[CustomAssignment]] = None
    output_aspect_ratio: Literal["9:16", "16:9"] = "9:16"
    lipsync_model: Literal["default", "fast", "advanced"] = "default"
+
+
+class VoicePreviewRequest(BaseModel):
+    voice: str
+    text: str = Field(..., min_length=1, max_length=120)
--- a/backend/app/modules/videos/service.py
+++ b/backend/app/modules/videos/service.py
@@ -73,6 +73,36 @@ async def list_generated_videos(user_id: str) -> dict:
        return {"videos": []}


+async def delete_all_generated_videos(user_id: str) -> tuple[int, int]:
+    """Delete all of the user's generated videos; return (deleted_count, failed_count)."""
+    try:
+        files = await storage_service.list_files(
+            bucket=storage_service.BUCKET_OUTPUTS,
+            path=user_id,
+            strict=True,
+        )
+        deleted_count = 0
+        failed_count = 0
+        for f in files:
+            name = f.get("name")
+            if not name or name == ".emptyFolderPlaceholder":
+                continue
+            full_path = f"{user_id}/{name}"
+            try:
+                await storage_service.delete_file(
+                    bucket=storage_service.BUCKET_OUTPUTS,
+                    path=full_path
+                )
+                deleted_count += 1
+            except Exception as e:
+                failed_count += 1
+                logger.warning(f"Delete file failed: {full_path}, {e}")
+        return deleted_count, failed_count
+    except Exception as e:
+        logger.error(f"Delete all generated videos failed: {e}")
+        return 0, 1
+
+
 async def delete_generated_video(user_id: str, video_id: str) -> dict:
    """删除生成的视频"""
    try:
--- a/backend/app/modules/videos/workflow.py
+++ b/backend/app/modules/videos/workflow.py
@@ -188,16 +188,16 @@ async def _process_video_generation_inner(task_id: str, req: GenerateRequest, us
    try:
        start_time = time.time()

-        # ── 确定素材列表 ──
+        # ── Determine the material list (prefer the deduplicated req.material_paths) ──
        material_paths: List[str] = []
-        if req.custom_assignments and len(req.custom_assignments) > 1:
-            material_paths = [a.material_path for a in req.custom_assignments if a.material_path]
-        elif req.material_paths and len(req.material_paths) > 1:
+        if req.material_paths and len(req.material_paths) >= 1:
            material_paths = req.material_paths
        else:
            material_paths = [req.material_path]

-        is_multi = len(material_paths) > 1
+        is_multi = len(material_paths) > 1 or (
+            req.custom_assignments is not None and len(req.custom_assignments) > 1
+        )
        target_resolution = (1080, 1920) if req.output_aspect_ratio == "9:16" else (1920, 1080)

        logger.info(
@@ -341,8 +341,18 @@ async def _process_video_generation_inner(task_id: str, req: GenerateRequest, us
            # ══════════════════════════════════════
            _update_task(task_id, progress=12, message="正在分配素材...")

-            if req.custom_assignments and len(req.custom_assignments) == len(material_paths):
-                # 用户自定义分配，跳过 Whisper 均分
+            if req.custom_assignments and len(req.custom_assignments) >= 1:
+                # Enforce a hard upper bound
+                if len(req.custom_assignments) > 50:
+                    raise ValueError(f"custom_assignments 数量超限: {len(req.custom_assignments)}")
+                # Verify that every assignment's material_path is in the material_paths declared by the frontend
+                known_paths = set(material_paths)
+                unknown = [a.material_path for a in req.custom_assignments if a.material_path not in known_paths]
+                if unknown:
+                    logger.warning(f"[MultiMat] custom_assignments 包含未知素材路径: {unknown[:3]}，终止生成")
+                    raise ValueError(f"素材路径校验失败: 包含 {len(unknown)} 个未知路径")
+
+                # User-defined assignments (multi-shot mode: the main material may appear more than once)
                assignments = [
                    {
                        "material_path": a.material_path,
@@ -373,20 +383,13 @@ async def _process_video_generation_inner(task_id: str, req: GenerateRequest, us
                        captions_path = None
                else:
                    captions_path = None
-            elif req.custom_assignments:
-                logger.warning(
-                    f"[MultiMat] custom_assignments 数量({len(req.custom_assignments)})"
-                    f" 与素材数量({len(material_paths)})不一致，回退自动分配"
-                )
-
-                assignments, captions_path = await _whisper_and_split()

            else:
                assignments, captions_path = await _whisper_and_split()

-            # 扩展段覆盖完整音频范围：首段从0开始，末段到音频结尾
+            # Stretch segments to cover the full audio (auto-split only; custom assignments are already exact)
            audio_duration = await _run_blocking(video._get_duration, str(audio_path))
-            if assignments and audio_duration > 0:
+            if not req.custom_assignments and assignments and audio_duration > 0:
                assignments[0]["start"] = 0.0
                assignments[-1]["end"] = audio_duration

@@ -398,65 +401,73 @@ async def _process_video_generation_inner(task_id: str, req: GenerateRequest, us

            lipsync_start = time.time()

-            # ── 第一步：并行下载所有素材并检测分辨率 ──
-            material_locals: List[Path] = []
-            resolutions = []
+            # Concurrency limit (each task gets its own Semaphore; peak is 2×4=8 ffmpeg processes)
+            _segment_sem = asyncio.Semaphore(4)

-            async def _download_and_normalize(i: int, assignment: dict):
-                """下载单个素材并归一化方向"""
-                material_local = temp_dir / f"{task_id}_material_{i}.mp4"
-                temp_files.append(material_local)
-                await _download_material(assignment["material_path"], material_local)
+            # ── Step 1: download each unique material once and detect its resolution ──
+            unique_paths = list(dict.fromkeys(a["material_path"] for a in assignments))
+            path_to_local: dict = {}      # material_path → local file
+            path_to_res: dict = {}        # material_path → resolution

-                normalized_material = temp_dir / f"{task_id}_material_{i}_norm.mp4"
-                normalized_result = await _run_blocking(
-                    video.normalize_orientation,
-                    str(material_local),
-                    str(normalized_material),
-                )
-                if normalized_result != str(material_local):
-                    temp_files.append(normalized_material)
-                    material_local = normalized_material
+            async def _download_unique(mat_path: str, idx: int):
+                """Download one unique material and normalize its orientation."""
+                async with _segment_sem:
+                    material_local = temp_dir / f"{task_id}_material_{idx}.mp4"
+                    temp_files.append(material_local)
+                    await _download_material(mat_path, material_local)

-                res = video.get_resolution(str(material_local))
-                return material_local, res
+                    normalized_material = temp_dir / f"{task_id}_material_{idx}_norm.mp4"
+                    normalized_result = await _run_blocking(
+                        video.normalize_orientation,
+                        str(material_local),
+                        str(normalized_material),
+                    )
+                    if normalized_result != str(material_local):
+                        temp_files.append(normalized_material)
+                        material_local = normalized_material

-            download_tasks = [
-                _download_and_normalize(i, assignment)
-                for i, assignment in enumerate(assignments)
-            ]
-            download_results = await asyncio.gather(*download_tasks)
-            for local, res in download_results:
-                material_locals.append(local)
-                resolutions.append(res)
+                    res = video.get_resolution(str(material_local))
+                    return mat_path, material_local, res
+
+            download_results = await asyncio.gather(*[
+                _download_unique(p, i) for i, p in enumerate(unique_paths)
+            ])
+            for mat_path, local, res in download_results:
+                path_to_local[mat_path] = local
+                path_to_res[mat_path] = res
+
+            logger.info(f"[MultiMat] 去重下载 {len(unique_paths)} 个素材（共 {num_segments} 个段）")

            # 按用户选择的画面比例统一分辨率
            base_res = target_resolution
-            need_scale = any(r != base_res for r in resolutions)
+            need_scale = any(r != base_res for r in path_to_res.values())
            if need_scale:
                logger.info(f"[MultiMat] 素材分辨率不一致，统一到 {base_res[0]}x{base_res[1]}")

-            # ── 第二步：并行裁剪每段素材到对应时长 ──
+            # ── Step 2: trim each segment to its duration in parallel (look up the downloaded file via the mapping) ──
            prepared_segments: List[Optional[Path]] = [None] * num_segments

            async def _prepare_one_segment(i: int, assignment: dict):
                """将单个素材裁剪/循环到对应时长"""
-                seg_dur = assignment["end"] - assignment["start"]
-                prepared_path = temp_dir / f"{task_id}_prepared_{i}.mp4"
-                temp_files.append(prepared_path)
-                prepare_target_res = None if resolutions[i] == base_res else base_res
+                async with _segment_sem:
+                    seg_dur = assignment["end"] - assignment["start"]
+                    prepared_path = temp_dir / f"{task_id}_prepared_{i}.mp4"
+                    temp_files.append(prepared_path)
+                    mat_local = path_to_local[assignment["material_path"]]
+                    mat_res = path_to_res[assignment["material_path"]]
+                    prepare_target_res = None if mat_res == base_res else base_res

-                await _run_blocking(
-                    video.prepare_segment,
-                    str(material_locals[i]),
-                    seg_dur,
-                    str(prepared_path),
-                    prepare_target_res,
-                    assignment.get("source_start", 0.0),
-                    assignment.get("source_end"),
-                    25,
-                )
-                return i, prepared_path
+                    await _run_blocking(
+                        video.prepare_segment,
+                        str(mat_local),
+                        seg_dur,
+                        str(prepared_path),
+                        prepare_target_res,
+                        assignment.get("source_start", 0.0),
+                        assignment.get("source_end"),
+                        25,
+                    )
+                    return i, prepared_path

            _update_task(
                task_id,
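
The workflow change combines two ideas: download each distinct material only once, and cap how many heavy jobs run at the same time. A minimal sketch of that pattern follows, with `asyncio.sleep` standing in for the real download and normalize step. `fetch_unique` and the `local_<n>` names are hypothetical, and the peak counter exists only to make the limit observable.

```python
import asyncio


async def fetch_unique(paths, limit: int = 4):
    """Fetch each distinct path once, with at most `limit` fetches in flight."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(paths))
    sem = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def fetch(idx: int, path: str):
        nonlocal active, peak
        async with sem:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for download + normalization
            active -= 1
            return path, f"local_{idx}"

    results = await asyncio.gather(*(fetch(i, p) for i, p in enumerate(unique)))
    # Segments later look up their source file through this mapping.
    return dict(results), peak


if __name__ == "__main__":
    mapping, peak = asyncio.run(fetch_unique(["a", "b", "a", "c", "b"]))
    print(mapping, peak)
```

Keying the results by path rather than by segment index is what lets several segments share one downloaded file when the same material appears in more than one assignment.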
--- a/backend/app/services/creator_scraper.py
+++ b/backend/app/services/creator_scraper.py
--- a/backend/app/services/glm_service.py
+++ b/backend/app/services/glm_service.py
@@ -3,8 +3,10 @@ GLM AI 服务
 使用智谱 GLM 生成标题和标签
 """

+import asyncio
 import json
 import re
+from typing import Any, Optional, cast
 from loguru import logger
 from zai import ZhipuAiClient

@@ -25,6 +27,48 @@ class GLMService:
            self.client = ZhipuAiClient(api_key=settings.GLM_API_KEY)
        return self.client

+    async def _call_glm(
+        self,
+        *,
+        prompt: str,
+        max_tokens: int,
+        temperature: float,
+        action: str,
+        timeout_seconds: float = 85.0,
+    ) -> str:
+        """Single entry point for GLM calls, so call sites do not duplicate the request code."""
+        client = self._get_client()
+        logger.info(
+            f"{action} | model={settings.GLM_MODEL} | max_tokens={max_tokens} | temperature={temperature}"
+        )
+
+        try:
+            response = await asyncio.wait_for(
+                asyncio.to_thread(
+                    client.chat.completions.create,
+                    model=settings.GLM_MODEL,
+                    messages=[{"role": "user", "content": prompt}],
+                    thinking={"type": "disabled"},
+                    max_tokens=max_tokens,
+                    temperature=temperature,
+                ),
+                timeout=timeout_seconds,
+            )
+        except asyncio.TimeoutError as exc:
+            raise Exception("GLM 请求超时，请稍后重试") from exc
+
+        completion = cast(Any, response)
+        choices = getattr(completion, "choices", None)
+        if not choices:
+            raise Exception("AI 返回内容为空")
+
+        message = getattr(choices[0], "message", None)
+        content = getattr(message, "content", "")
+        text = content.strip() if isinstance(content, str) else str(content or "").strip()
+        if not text:
+            raise Exception("AI 返回内容为空")
+        return text
+
    async def generate_title_tags(self, text: str) -> dict:
        """
        根据口播文案生成标题和标签
@@ -50,22 +94,13 @@ class GLMService:
 {{"title": "标题", "secondary_title": "副标题", "tags": ["标签1", "标签2", "标签3"]}}"""

        try:
-            client = self._get_client()
-            logger.info(f"Calling GLM API with model: {settings.GLM_MODEL}")
-            
-            # 使用 asyncio.to_thread 包装同步 SDK 调用，避免阻塞事件循环
-            import asyncio
-            response = await asyncio.to_thread(
-                client.chat.completions.create,
-                model=settings.GLM_MODEL,
-                messages=[{"role": "user", "content": prompt}],
-                thinking={"type": "disabled"},  # 禁用思考模式，加快响应
+            content = await self._call_glm(
+                prompt=prompt,
                max_tokens=500,
-                temperature=0.7
+                temperature=0.7,
+                action="生成标题与标签",
+                timeout_seconds=75.0,
            )
-
-            # 提取生成的内容
-            content = response.choices[0].message.content
            logger.info(f"GLM response (model: {settings.GLM_MODEL}): {content}")

            # 解析 JSON
@@ -76,7 +111,7 @@ class GLMService:
            logger.error(f"GLM service error: {e}")
            raise Exception(f"AI 生成失败: {str(e)}")

-    async def rewrite_script(self, text: str, custom_prompt: str = None) -> str:
+    async def rewrite_script(self, text: str, custom_prompt: Optional[str] = None) -> str:
        """
        AI 改写文案

@@ -105,28 +140,126 @@ class GLMService:
 4. 不要返回多余的解释，只返回改写后的正文"""

        try:
-            client = self._get_client()
-            logger.info(f"Using GLM to rewrite script")
-
-            # 使用 asyncio.to_thread 包装同步 SDK 调用，避免阻塞事件循环
-            import asyncio
-            response = await asyncio.to_thread(
-                client.chat.completions.create,
-                model=settings.GLM_MODEL,
-                messages=[{"role": "user", "content": prompt}],
-                thinking={"type": "disabled"},
+            content = await self._call_glm(
+                prompt=prompt,
                max_tokens=2000,
-                temperature=0.8
+                temperature=0.8,
+                action="改写文案",
+                timeout_seconds=85.0,
            )
-
-            content = response.choices[0].message.content
            logger.info("GLM rewrite completed")
-            return content.strip()
+            return content

        except Exception as e:
            logger.error(f"GLM rewrite error: {e}")
            raise Exception(f"AI 改写失败: {str(e)}")

+    async def analyze_topics(self, titles: list[str]) -> list[str]:
+        """
+        Analyze a list of video titles and summarize the trending topics (at most 10).
+        """
+        cleaned_titles = [str(title).strip() for title in titles if str(title).strip()]
+        if not cleaned_titles:
+            raise Exception("标题列表为空")
+
+        limited_titles = cleaned_titles[:50]
+        titles_text = "\n".join(f"{idx + 1}. {title}" for idx, title in enumerate(limited_titles))
+
+        prompt = f"""以下是某短视频博主最近发布的视频标题列表：
+
+{titles_text}
+
+请分析这些标题，归纳总结出该博主内容中最热门的话题方向。
+
+要求：
+1. 提取不超过10个话题方向
+2. 每个话题用简短短语描述（建议 5-15 字）
+3. 按热门程度排序（出现频率高的在前）
+4. 只返回话题列表，每行一个，不要编号、解释或多余内容"""
+
+        try:
+            content = await self._call_glm(
+                prompt=prompt,
+                max_tokens=500,
+                temperature=0.5,
+                action="分析博主话题",
+                timeout_seconds=85.0,
+            )
+            topics = self._parse_topic_lines(content)
+            if not topics:
+                raise Exception("未识别到有效话题")
+
+            logger.info(f"GLM topic analysis completed: {len(topics)} topics")
+            return topics[:10]
+        except Exception as e:
+            logger.error(f"GLM topic analysis error: {e}")
+            raise Exception(f"话题分析失败: {str(e)}")
+
+    async def generate_script_from_topic(self, topic: str, word_count: int, titles: list[str]) -> str:
+        """
+        Generate a script from the selected topic, following the creator's title style.
+        """
+        topic_value = str(topic or "").strip()
+        if not topic_value:
+            raise Exception("话题不能为空")
+
+        cleaned_titles = [str(title).strip() for title in titles if str(title).strip()]
+        if not cleaned_titles:
+            raise Exception("参考标题为空")
+
+        word_count_value = max(80, min(int(word_count), 1000))
+        sample_titles = "\n".join(f"{idx + 1}. {title}" for idx, title in enumerate(cleaned_titles[:10]))
+
+        prompt = f"""请围绕「{topic_value}」这个话题，生成一段短视频口播文案。
+
+参考该博主的标题风格：
+{sample_titles}
+
+要求：
+1. 文案字数约 {word_count_value} 字
+2. 适合短视频口播，语气自然、有吸引力
+3. 开头要有钩子吸引观众
+4. 只返回文案正文，不要标题和其他说明"""
+
+        try:
+            content = await self._call_glm(
+                prompt=prompt,
+                max_tokens=min(word_count_value * 3, 4000),
+                temperature=0.8,
+                action=f"按话题生成文案(topic={topic_value})",
+                timeout_seconds=88.0,
+            )
+
+            logger.info("GLM topic script generation completed")
+            return content
+        except Exception as e:
+            logger.error(f"GLM topic script generation error: {e}")
+            raise Exception(f"文案生成失败: {str(e)}")
+
+    def _parse_topic_lines(self, content: str) -> list[str]:
+        lines = [line.strip() for line in str(content or "").splitlines()]
+        topics: list[str] = []
+        seen: set[str] = set()
+
+        for line in lines:
+            if not line:
+                continue
+
+            cleaned = re.sub(r"^\s*(?:[-*•]+|\d+[.)、\s]+)", "", line).strip()
+            cleaned = cleaned.strip('"“”')
+            if not cleaned:
+                continue
+
+            if cleaned in seen:
+                continue
+            seen.add(cleaned)
+            topics.append(cleaned)
+
+            if len(topics) >= 10:
+                break
+
+        return topics
+


    async def translate_text(self, text: str, target_lang: str) -> str:
@@ -151,22 +284,15 @@ class GLMService:
 3. 翻译要自然流畅，符合目标语言的表达习惯"""

        try:
-            client = self._get_client()
-            logger.info(f"Using GLM to translate text to {target_lang}")
-
-            import asyncio
-            response = await asyncio.to_thread(
-                client.chat.completions.create,
-                model=settings.GLM_MODEL,
-                messages=[{"role": "user", "content": prompt}],
-                thinking={"type": "disabled"},
+            content = await self._call_glm(
+                prompt=prompt,
                max_tokens=2000,
-                temperature=0.3
+                temperature=0.3,
+                action=f"翻译文案(target={target_lang})",
+                timeout_seconds=75.0,
            )
-
-            content = response.choices[0].message.content
            logger.info("GLM translation completed")
-            return content.strip()
+            return content

        except Exception as e:
            logger.error(f"GLM translate error: {e}")
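The prefix stripping and de-duplication in `_parse_topic_lines` can be exercised on its own. The sketch below is a standalone copy of that logic, assuming the same regular expression as in the diff above; the free function `parse_topic_lines`, its `limit` parameter, and the sample text are illustrative and not part of the change.

```python
import re

# Same pattern as in `_parse_topic_lines`: strips a leading run of bullets,
# or digits followed by at least one separator (".", ")", "、" or whitespace).
_PREFIX = re.compile(r"^\s*(?:[-*•]+|\d+[.)、\s]+)")


def parse_topic_lines(content: str, limit: int = 10) -> list[str]:
    """Return unique, cleaned topic lines in their original order."""
    topics: list[str] = []
    seen: set[str] = set()
    for raw in str(content or "").splitlines():
        line = raw.strip()
        if not line:
            continue
        # Drop the list prefix, then any surrounding straight or curly quotes.
        cleaned = _PREFIX.sub("", line).strip().strip('"“”')
        if not cleaned or cleaned in seen:
            continue
        seen.add(cleaned)
        topics.append(cleaned)
        if len(topics) >= limit:
            break
    return topics


# Hypothetical model output mixing numbering, bullets, quotes and a duplicate.
sample = '1. 职场沟通技巧\n- 副业赚钱\n2、职场沟通技巧\n\n• "时间管理"\n'
print(parse_topic_lines(sample))
```

Because whitespace counts as a separator after digits, a topic such as `3 个技巧` would lose its leading number; requiring a punctuation mark after the digits would avoid that if it turns out to matter in practice.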
--- a/backend/app/services/lipsync_service.py
+++ b/backend/app/services/lipsync_service.py
@@ -11,12 +11,13 @@ import asyncio
 import httpx
 from pathlib import Path
 from loguru import logger
-from typing import Optional, Literal
+from typing import Optional, Literal

 from app.core.config import settings
+from app.services.small_face_enhance_service import SmallFaceEnhanceService


-class LipSyncService:
+class LipSyncService:
    """唇形同步服务 - LatentSync 1.6 + MuseTalk 1.5 混合方案"""

    def __init__(self):
@@ -38,6 +39,9 @@ class LipSyncService:
        
        # 运行时检测
        self._weights_available: Optional[bool] = None
+
+        # Small-face enhancement
+        self._face_enhance = SmallFaceEnhanceService()
    
    def _check_weights(self) -> bool:
        """检查模型权重是否存在"""
@@ -93,7 +97,7 @@ class LipSyncService:
            logger.warning(f"⚠️ 获取媒体时长失败: {e}")
        return None

-    def _loop_video_to_duration(self, video_path: str, output_path: str, target_duration: float) -> str:
+    def _loop_video_to_duration(self, video_path: str, output_path: str, target_duration: float) -> str:
        """
        循环视频以匹配目标时长
        使用 FFmpeg stream_loop 实现无缝循环
@@ -117,47 +121,70 @@ class LipSyncService:
            else:
                logger.warning(f"⚠️ 视频循环失败: {result.stderr[:200]}")
                return video_path
-        except Exception as e:
-            logger.warning(f"⚠️ 视频循环异常: {e}")
-            return video_path
+        except Exception as e:
+            logger.warning(f"⚠️ 视频循环异常: {e}")
+            return video_path
+
+    def _mux_audio_to_video(self, video_path: str, audio_path: str, output_path: str) -> bool:
+        """Mux the audio track into the video so the enhancement path does not produce silent output."""
+        try:
+            cmd = [
+                "ffmpeg", "-y",
+                "-i", video_path,
+                "-i", audio_path,
+                "-map", "0:v:0",
+                "-map", "1:a:0",
+                "-c:v", "copy",
+                "-c:a", "aac",
+                "-shortest",
+                output_path,
+            ]
+            result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
+            if result.returncode == 0 and Path(output_path).exists():
+                return True
+            logger.warning(f"⚠️ 音轨封装失败: {result.stderr[:200]}")
+            return False
+        except Exception as e:
+            logger.warning(f"⚠️ 音轨封装异常: {e}")
+            return False

-    async def generate(
-        self, 
-        video_path: str, 
-        audio_path: str, 
-        output_path: str, 
-        fps: int = 25,
-        model_mode: Literal["default", "fast", "advanced"] = "default",
-    ) -> str:
-        """生成唇形同步视频"""
-        logger.info(f"🎬 唇形同步任务: {Path(video_path).name} + {Path(audio_path).name}")
-        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
-
-        normalized_mode: Literal["default", "fast", "advanced"] = model_mode
-        if normalized_mode not in ("default", "fast", "advanced"):
-            normalized_mode = "default"
-        logger.info(f"🧠 Lipsync 模式: {normalized_mode}")
-        
-        if self.use_local:
-            return await self._local_generate(video_path, audio_path, output_path, fps, normalized_mode)
-        else:
-            return await self._remote_generate(video_path, audio_path, output_path, fps, normalized_mode)
+    async def generate(
+        self, 
+        video_path: str, 
+        audio_path: str, 
+        output_path: str, 
+        fps: int = 25,
+        model_mode: Literal["default", "fast", "advanced"] = "default",
+    ) -> str:
+        """生成唇形同步视频"""
+        logger.info(f"🎬 唇形同步任务: {Path(video_path).name} + {Path(audio_path).name}")
+        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
+
+        normalized_mode: Literal["default", "fast", "advanced"] = model_mode
+        if normalized_mode not in ("default", "fast", "advanced"):
+            normalized_mode = "default"
+        logger.info(f"🧠 Lipsync 模式: {normalized_mode}")
+        
+        if self.use_local:
+            return await self._local_generate(video_path, audio_path, output_path, fps, normalized_mode)
+        else:
+            return await self._remote_generate(video_path, audio_path, output_path, fps, normalized_mode)
    
-    async def _local_generate(
-        self, 
-        video_path: str, 
-        audio_path: str, 
-        output_path: str, 
-        fps: int,
-        model_mode: Literal["default", "fast", "advanced"],
-    ) -> str:
-        """使用 subprocess 调用 LatentSync conda 环境"""
-
-        logger.info("⏳ 等待 GPU 资源 (排队中)...")
-        async with self._lock:
-            # 使用临时目录存放中间文件
-            with tempfile.TemporaryDirectory() as tmpdir:
-                tmpdir = Path(tmpdir)
+    async def _local_generate(
+        self,
+        video_path: str,
+        audio_path: str,
+        output_path: str,
+        fps: int,
+        model_mode: Literal["default", "fast", "advanced"],
+    ) -> str:
+        """使用 subprocess 调用 LatentSync conda 环境"""
+
+        logger.info("⏳ 等待 GPU 资源 (排队中)...")
+        async with self._lock:
+            # 使用临时目录存放中间文件
+            with tempfile.TemporaryDirectory() as tmpdir:
+                tmpdir = Path(tmpdir)

                # 获取音频和视频时长
                audio_duration = self._get_media_duration(audio_path)
@@ -172,133 +199,206 @@ class LipSyncService:
                        str(looped_video),
                        audio_duration
                    )
-                else:
-                    actual_video_path = video_path
-
-                # 模型路由
-                force_musetalk = model_mode == "fast"
-                force_latentsync = model_mode == "advanced"
-                auto_to_musetalk = (
-                    model_mode == "default"
-                    and audio_duration is not None
-                    and audio_duration >= settings.LIPSYNC_DURATION_THRESHOLD
-                )
-
-                if force_musetalk:
-                    logger.info("⚡ 强制快速模型：MuseTalk")
-                    musetalk_result = await self._call_musetalk_server(
-                        actual_video_path, audio_path, output_path
-                    )
-                    if musetalk_result:
-                        return musetalk_result
-                    logger.warning("⚠️ MuseTalk 不可用，快速模型回退到 LatentSync")
-                elif auto_to_musetalk:
-                    logger.info(
-                        f"🔄 音频 {audio_duration:.1f}s >= {settings.LIPSYNC_DURATION_THRESHOLD}s，路由到 MuseTalk"
-                    )
-                    musetalk_result = await self._call_musetalk_server(
-                        actual_video_path, audio_path, output_path
-                    )
-                    if musetalk_result:
-                        return musetalk_result
-                    logger.warning("⚠️ MuseTalk 不可用，回退到 LatentSync（长视频，会较慢）")
-                elif force_latentsync:
-                    logger.info("🎯 强制高级模型：LatentSync")
-
-                # 检查 LatentSync 前置条件（仅在需要回退或使用 LatentSync 时）
-                if not self._check_conda_env():
-                    logger.warning("⚠️ Conda 环境不可用，使用 Fallback")
-                    shutil.copy(video_path, output_path)
-                    return output_path
-
-                if not self._check_weights():
-                    logger.warning("⚠️ 模型权重不存在，使用 Fallback")
-                    shutil.copy(video_path, output_path)
-                    return output_path
-
-                if self.use_server:
-                    # 模式 A: 调用常驻服务 (加速模式)
-                    return await self._call_persistent_server(actual_video_path, audio_path, output_path)
+                else:
+                    actual_video_path = video_path

-                logger.info("🔄 调用 LatentSync 推理 (subprocess)...")
-
-                temp_output = tmpdir / "output.mp4"
-                
-                # 构建命令
-                cmd = [
-                    str(self.conda_python),
-                    "-m", "scripts.inference",
-                    "--unet_config_path", "configs/unet/stage2_512.yaml",
-                    "--inference_ckpt_path", "checkpoints/latentsync_unet.pt",
-                    "--inference_steps", str(settings.LATENTSYNC_INFERENCE_STEPS),
-                    "--guidance_scale", str(settings.LATENTSYNC_GUIDANCE_SCALE),
-                    "--video_path", str(actual_video_path),  # 使用预处理后的视频
-                    "--audio_path", str(audio_path),
-                    "--video_out_path", str(temp_output),
-                    "--seed", str(settings.LATENTSYNC_SEED),
-                    "--temp_dir", str(tmpdir / "cache"),
-                ]
-                
-                if settings.LATENTSYNC_ENABLE_DEEPCACHE:
-                    cmd.append("--enable_deepcache")
-                
-                # 设置环境变量
-                env = os.environ.copy()
-                env["CUDA_VISIBLE_DEVICES"] = str(self.gpu_id)
-                
-                logger.info(f"🖥️ 执行命令: {' '.join(cmd[:8])}...")
-                logger.info(f"🖥️ GPU: CUDA_VISIBLE_DEVICES={self.gpu_id}")
-                
+                # ── Small-face enhancement ──
+                enhance_result = None
                try:
-                    # 使用 asyncio subprocess 实现真正的异步执行
-                    # 这样事件循环可以继续处理其他请求（如进度查询）
-                    process = await asyncio.create_subprocess_exec(
-                        *cmd,
-                        cwd=str(self.latentsync_dir),
-                        env=env,
-                        stdout=asyncio.subprocess.PIPE,
-                        stderr=asyncio.subprocess.PIPE,
+                    enhance_result = self._face_enhance.enhance_if_needed(
+                        video_path=str(actual_video_path),
+                        tmpdir=tmpdir,
+                        gpu_id=settings.LIPSYNC_SMALL_FACE_GPU_ID,
                    )
-                    
-                    # 等待进程完成，带超时
-                    try:
-                        stdout, stderr = await asyncio.wait_for(
-                            process.communicate(),
-                            timeout=900  # 15分钟超时
-                        )
-                    except asyncio.TimeoutError:
-                        process.kill()
-                        await process.wait()
-                        logger.error("⏰ LatentSync 推理超时 (15分钟)")
-                        shutil.copy(video_path, output_path)
-                        return output_path
-                    
-                    stdout_text = stdout.decode() if stdout else ""
-                    stderr_text = stderr.decode() if stderr else ""
-                    
-                    if process.returncode != 0:
-                        logger.error(f"LatentSync 推理失败:\n{stderr_text}")
-                        logger.error(f"stdout:\n{stdout_text[-1000:] if stdout_text else 'N/A'}")
-                        # Fallback
-                        shutil.copy(video_path, output_path)
-                        return output_path
-                    
-                    logger.info(f"LatentSync 输出:\n{stdout_text[-500:] if stdout_text else 'N/A'}")
-
-                    # 检查输出文件
-                    if temp_output.exists():
-                        shutil.copy(temp_output, output_path)
-                        logger.info(f"✅ 唇形同步完成: {output_path}")
-                        return output_path
-                    else:
-                        logger.warning("⚠️ 未找到输出文件，使用 Fallback")
-                        shutil.copy(video_path, output_path)
-                        return output_path
-                        
                except Exception as e:
-                    logger.error(f"❌ 推理异常: {e}")
-                    shutil.copy(video_path, output_path)
-                    return output_path
+                    if settings.LIPSYNC_SMALL_FACE_FAIL_OPEN:
+                        logger.warning(f"⚠️ 小脸增强失败，跳过: {e}")
+                    else:
+                        raise
+
+                if enhance_result and enhance_result.was_enhanced:
+                    track = enhance_result.track
+                    if track is None:
+                        raise RuntimeError("小脸增强轨迹缺失")
+
+                    # Enhancement path: run the model on the enhanced face video, then blend the result back into the original video
+                    temp_sync = tmpdir / "face_sync.mp4"
+                    await self._run_selected_model(
+                        video_path=enhance_result.video_path,
+                        audio_path=audio_path,
+                        output_path=str(temp_sync),
+                        tmpdir=tmpdir,
+                        model_mode=model_mode,
+                        audio_duration=audio_duration,
+                        original_video_path=video_path,
+                    )
+
+                    try:
+                        blended = self._face_enhance.blend_back(
+                            original_video=str(actual_video_path),
+                            lipsync_video=str(temp_sync),
+                            track=track,
+                            tmpdir=tmpdir,
+                        )
+                        blended_with_audio = tmpdir / "blended_with_audio.mp4"
+                        if not self._mux_audio_to_video(
+                            video_path=str(blended),
+                            audio_path=audio_path,
+                            output_path=str(blended_with_audio),
+                        ):
+                            raise RuntimeError("贴回视频音轨封装失败")
+
+                        shutil.copy(str(blended_with_audio), output_path)
+                        logger.info(f"✅ 小脸增强 + 唇形同步完成: {output_path}")
+                        return output_path
+                    except Exception as e:
+                        if settings.LIPSYNC_SMALL_FACE_FAIL_OPEN:
+                            logger.warning(f"⚠️ 小脸贴回失败，回退原流程: {e}")
+                        else:
+                            raise
+
+                # Regular path (no enhancement, or enhancement failed)
+                return await self._run_selected_model(
+                    video_path=str(actual_video_path),
+                    audio_path=audio_path,
+                    output_path=output_path,
+                    tmpdir=tmpdir,
+                    model_mode=model_mode,
+                    audio_duration=audio_duration,
+                    original_video_path=video_path,
+                )
+
+    async def _run_selected_model(
+        self,
+        video_path: str,
+        audio_path: str,
+        output_path: str,
+        tmpdir: Path,
+        model_mode: Literal["default", "fast", "advanced"],
+        audio_duration: Optional[float],
+        original_video_path: str,
+    ) -> str:
+        """Model routing and execution (MuseTalk / LatentSync persistent server / LatentSync subprocess)."""
+
+        # Model routing
+        force_musetalk = model_mode == "fast"
+        force_latentsync = model_mode == "advanced"
+        auto_to_musetalk = (
+            model_mode == "default"
+            and audio_duration is not None
+            and audio_duration >= settings.LIPSYNC_DURATION_THRESHOLD
+        )
+
+        if force_musetalk:
+            logger.info("⚡ 强制快速模型：MuseTalk")
+            musetalk_result = await self._call_musetalk_server(
+                video_path, audio_path, output_path
+            )
+            if musetalk_result:
+                return musetalk_result
+            logger.warning("⚠️ MuseTalk 不可用，快速模型回退到 LatentSync")
+        elif auto_to_musetalk:
+            logger.info(
+                f"🔄 音频 {audio_duration:.1f}s >= {settings.LIPSYNC_DURATION_THRESHOLD}s，路由到 MuseTalk"
+            )
+            musetalk_result = await self._call_musetalk_server(
+                video_path, audio_path, output_path
+            )
+            if musetalk_result:
+                return musetalk_result
+            logger.warning("⚠️ MuseTalk 不可用，回退到 LatentSync（长视频，会较慢）")
+        elif force_latentsync:
+            logger.info("🎯 强制高级模型：LatentSync")
+
+        # Check LatentSync prerequisites
+        if not self._check_conda_env():
+            logger.warning("⚠️ Conda 环境不可用，使用 Fallback")
+            shutil.copy(original_video_path, output_path)
+            return output_path
+
+        if not self._check_weights():
+            logger.warning("⚠️ 模型权重不存在，使用 Fallback")
+            shutil.copy(original_video_path, output_path)
+            return output_path
+
+        if self.use_server:
+            # Mode A: call the persistent server (accelerated mode)
+            return await self._call_persistent_server(video_path, audio_path, output_path)
+
+        logger.info("🔄 调用 LatentSync 推理 (subprocess)...")
+
+        temp_output = tmpdir / "output.mp4"
+
+        # Build the command
+        cmd = [
+            str(self.conda_python),
+            "-m", "scripts.inference",
+            "--unet_config_path", "configs/unet/stage2_512.yaml",
+            "--inference_ckpt_path", "checkpoints/latentsync_unet.pt",
+            "--inference_steps", str(settings.LATENTSYNC_INFERENCE_STEPS),
+            "--guidance_scale", str(settings.LATENTSYNC_GUIDANCE_SCALE),
+            "--video_path", str(video_path),
+            "--audio_path", str(audio_path),
+            "--video_out_path", str(temp_output),
+            "--seed", str(settings.LATENTSYNC_SEED),
+            "--temp_dir", str(tmpdir / "cache"),
+        ]
+
+        if settings.LATENTSYNC_ENABLE_DEEPCACHE:
+            cmd.append("--enable_deepcache")
+
+        # Set environment variables
+        env = os.environ.copy()
+        env["CUDA_VISIBLE_DEVICES"] = str(self.gpu_id)
+
+        logger.info(f"🖥️ 执行命令: {' '.join(cmd[:8])}...")
+        logger.info(f"🖥️ GPU: CUDA_VISIBLE_DEVICES={self.gpu_id}")
+
+        try:
+            process = await asyncio.create_subprocess_exec(
+                *cmd,
+                cwd=str(self.latentsync_dir),
+                env=env,
+                stdout=asyncio.subprocess.PIPE,
+                stderr=asyncio.subprocess.PIPE,
+            )
+
+            try:
+                stdout, stderr = await asyncio.wait_for(
+                    process.communicate(),
+                    timeout=900  # 15-minute timeout
+                )
+            except asyncio.TimeoutError:
+                process.kill()
+                await process.wait()
+                logger.error("⏰ LatentSync 推理超时 (15分钟)")
+                shutil.copy(original_video_path, output_path)
+                return output_path
+
+            stdout_text = stdout.decode() if stdout else ""
+            stderr_text = stderr.decode() if stderr else ""
+
+            if process.returncode != 0:
+                logger.error(f"LatentSync 推理失败:\n{stderr_text}")
+                logger.error(f"stdout:\n{stdout_text[-1000:] if stdout_text else 'N/A'}")
+                shutil.copy(original_video_path, output_path)
+                return output_path
+
+            logger.info(f"LatentSync 输出:\n{stdout_text[-500:] if stdout_text else 'N/A'}")
+
+            if temp_output.exists():
+                shutil.copy(temp_output, output_path)
+                logger.info(f"✅ 唇形同步完成: {output_path}")
+                return output_path
+            else:
+                logger.warning("⚠️ 未找到输出文件，使用 Fallback")
+                shutil.copy(original_video_path, output_path)
+                return output_path
+
+        except Exception as e:
+            logger.error(f"❌ 推理异常: {e}")
+            shutil.copy(original_video_path, output_path)
+            return output_path
    
    async def _call_musetalk_server(
        self, video_path: str, audio_path: str, output_path: str
@@ -413,18 +513,18 @@ class LipSyncService:
            "请确保 LatentSync 服务已启动 (cd models/LatentSync && python scripts/server.py)"
        )
    
-    async def _remote_generate(
-        self, 
-        video_path: str, 
-        audio_path: str, 
-        output_path: str, 
-        fps: int,
-        model_mode: Literal["default", "fast", "advanced"],
-    ) -> str:
-        """调用远程 LatentSync API 服务"""
-        if model_mode == "fast":
-            logger.warning("⚠️ 远程模式未接入 MuseTalk，快速模型将使用远程 LatentSync")
-        logger.info(f"📡 调用远程 API: {self.api_url}")
+    async def _remote_generate(
+        self, 
+        video_path: str, 
+        audio_path: str, 
+        output_path: str, 
+        fps: int,
+        model_mode: Literal["default", "fast", "advanced"],
+    ) -> str:
+        """调用远程 LatentSync API 服务"""
+        if model_mode == "fast":
+            logger.warning("⚠️ 远程模式未接入 MuseTalk，快速模型将使用远程 LatentSync")
+        logger.info(f"📡 调用远程 API: {self.api_url}")
        
        try:
            async with httpx.AsyncClient(timeout=600.0) as client:
@@ -499,4 +599,9 @@ class LipSyncService:
            "ready": conda_ok and weights_ok and gpu_ok,
            "musetalk_ready": musetalk_ready,
            "lipsync_threshold": settings.LIPSYNC_DURATION_THRESHOLD,
+            "small_face_enhance": {
+                "enabled": settings.LIPSYNC_SMALL_FACE_ENHANCE,
+                "threshold": settings.LIPSYNC_SMALL_FACE_THRESHOLD,
+                "detector_loaded": getattr(self._face_enhance, "_detector_session", None) is not None,
+            },
        }
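The routing rules in `_run_selected_model` reduce to a small decision on the mode and the audio duration. The sketch below restates them as a pure function, assuming the same three modes as the diff; the name `pick_first_backend` and the threshold value `120.0` are hypothetical (the real threshold comes from `settings.LIPSYNC_DURATION_THRESHOLD`, whose value is not shown here).

```python
from typing import Literal, Optional

Mode = Literal["default", "fast", "advanced"]


def pick_first_backend(mode: Mode, audio_duration: Optional[float], threshold: float) -> str:
    """Return the backend that is tried first.

    When MuseTalk is chosen but unavailable, the service falls back to
    LatentSync, so "musetalk" here means "try MuseTalk, then LatentSync".
    """
    if mode == "fast":
        return "musetalk"
    # In default mode, only long audio with a known duration goes to MuseTalk.
    if mode == "default" and audio_duration is not None and audio_duration >= threshold:
        return "musetalk"
    # "advanced", short audio, or unknown duration all go to LatentSync.
    return "latentsync"


THRESHOLD = 120.0  # hypothetical value, for illustration only
for mode, duration in [("fast", 10.0), ("default", 300.0), ("default", None), ("advanced", 300.0)]:
    print(mode, duration, "->", pick_first_backend(mode, duration, THRESHOLD))
```

Keeping the decision in a pure function like this would make the routing testable without a GPU or a running MuseTalk server.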
--- a/backend/app/services/publish_service.py
+++ b/backend/app/services/publish_service.py
@@ -21,16 +21,22 @@ from .uploader.xiaohongshu_uploader import XiaohongshuUploader
 from .uploader.weixin_uploader import WeixinUploader


-class PublishService:
-    """Social media publishing service (with user isolation)"""
+class PublishService:
+    """Social media publishing service (with user isolation)"""

    # 支持的平台配置
-    PLATFORMS: Dict[str, Dict[str, Any]] = {
-        "douyin": {"name": "抖音", "url": "https://creator.douyin.com/", "enabled": True},
-        "weixin": {"name": "微信视频号", "url": "https://channels.weixin.qq.com/", "enabled": True},
-        "bilibili": {"name": "B站", "url": "https://member.bilibili.com/platform/upload/video/frame", "enabled": True},
-        "xiaohongshu": {"name": "小红书", "url": "https://creator.xiaohongshu.com/", "enabled": True},
-    }
+    PLATFORMS: Dict[str, Dict[str, Any]] = {
+        "douyin": {"name": "抖音", "url": "https://creator.douyin.com/", "enabled": True},
+        "weixin": {"name": "微信视频号", "url": "https://channels.weixin.qq.com/", "enabled": True},
+        "bilibili": {"name": "B站", "url": "https://member.bilibili.com/platform/upload/video/frame", "enabled": True},
+        "xiaohongshu": {"name": "小红书", "url": "https://creator.xiaohongshu.com/", "enabled": True},
+    }
+
+    COOKIE_DOMAINS: Dict[str, str] = {
+        "douyin": ".douyin.com",
+        "weixin": ".weixin.qq.com",
+        "xiaohongshu": ".xiaohongshu.com",
+    }
    
    def __init__(self) -> None:
        # 存储活跃的登录会话，用于跟踪登录状态
@@ -185,15 +191,16 @@ class PublishService:
                    description=description,
                    user_id=user_id,
                )
-            elif platform == "xiaohongshu":
-                uploader = XiaohongshuUploader(
-                    title=title,
-                    file_path=local_video_path,
-                    tags=tags,
-                    publish_date=publish_time,
-                    account_file=str(account_file),
-                    description=description
-                )
+            elif platform == "xiaohongshu":
+                uploader = XiaohongshuUploader(
+                    title=title,
+                    file_path=local_video_path,
+                    tags=tags,
+                    publish_date=publish_time,
+                    account_file=str(account_file),
+                    description=description,
+                    user_id=user_id,
+                )
            elif platform == "weixin":
                uploader = WeixinUploader(
                    title=title,
@@ -330,48 +337,88 @@ class PublishService:
            logger.exception(f"[登出] 失败: {e}")
            return {"success": False, "message": f"注销失败: {str(e)}"}

-    async def save_cookie_string(self, platform: str, cookie_string: str, user_id: Optional[str] = None) -> Dict[str, Any]:
-        """
-        保存从客户端浏览器提取的Cookie字符串
+    async def save_cookie_string(self, platform: str, cookie_string: str, user_id: Optional[str] = None) -> Dict[str, Any]:
+        """
+        保存从客户端浏览器提取的Cookie字符串
        
        Args:
            platform: 平台ID
            cookie_string: document.cookie 格式的Cookie字符串
            user_id: 用户 ID (用于 Cookie 隔离)
-        """
-        try:
-            account_file = self._get_cookie_path(platform, user_id)
-            
-            # 解析Cookie字符串
-            cookie_dict = {}
-            for item in cookie_string.split('; '):
-                if '=' in item:
-                    name, value = item.split('=', 1)
-                    cookie_dict[name] = value
-            
-            # 对B站进行特殊处理
-            if platform == "bilibili":
-                bilibili_cookies = {}
-                required_fields = ['SESSDATA', 'bili_jct', 'DedeUserID', 'DedeUserID__ckMd5']
+        """
+        try:
+            if platform not in self.PLATFORMS:
+                return {
+                    "success": False,
+                    "message": f"不支持的平台: {platform}",
+                }
+
+            account_file = self._get_cookie_path(platform, user_id)
+            
+            # 解析Cookie字符串
+            cookie_dict: Dict[str, str] = {}
+            for item in cookie_string.split(';'):
+                item = item.strip()
+                if not item:
+                    continue
+                if '=' in item:
+                    name, value = item.split('=', 1)
+                    cookie_dict[name.strip()] = value.strip()
+
+            if not cookie_dict:
+                return {
+                    "success": False,
+                    "message": "Cookie 为空，请确认已完成登录",
+                }
+            
+            # 对B站进行特殊处理
+            if platform == "bilibili":
+                bilibili_cookies = {}
+                required_fields = ['SESSDATA', 'bili_jct', 'DedeUserID', 'DedeUserID__ckMd5']
                
                for field in required_fields:
                    if field in cookie_dict:
                        bilibili_cookies[field] = cookie_dict[field]
                
-                if len(bilibili_cookies) < 3:
-                    return {
-                        "success": False,
-                        "message": "Cookie不完整，请确保已登录"
-                    }
-                
-                cookie_dict = bilibili_cookies
-            
-            # 确保目录存在
-            account_file.parent.mkdir(parents=True, exist_ok=True)
-            
-            # 保存Cookie
-            with open(account_file, 'w', encoding='utf-8') as f:
-                json.dump(cookie_dict, f, indent=2)
+                if len(bilibili_cookies) < 3:
+                    return {
+                        "success": False,
+                        "message": "Cookie不完整，请确保已登录"
+                    }
+                payload: Any = bilibili_cookies
+            else:
+                cookie_domain = self.COOKIE_DOMAINS.get(platform, "")
+                if not cookie_domain:
+                    platform_url = self.PLATFORMS.get(platform, {}).get("url", "")
+                    host = re.sub(r"^https?://", "", platform_url).strip("/")
+                    cookie_domain = f".{host}" if host else ""
+
+                storage_cookies = []
+                for name, value in cookie_dict.items():
+                    if not name:
+                        continue
+                    storage_cookies.append({
+                        "name": name,
+                        "value": value,
+                        "domain": cookie_domain,
+                        "path": "/",
+                        "httpOnly": False,
+                        "secure": True,
+                        "sameSite": "Lax",
+                        "expires": -1,
+                    })
+
+                payload = {
+                    "cookies": storage_cookies,
+                    "origins": [],
+                }
+            
+            # Make sure the directory exists
+            account_file.parent.mkdir(parents=True, exist_ok=True)
+            
+            # Save the cookies
+            with open(account_file, 'w', encoding='utf-8') as f:
+                json.dump(payload, f, indent=2)
            
            logger.success(f"[登录] {platform} Cookie已保存 (user: {user_id or 'legacy'})")
            
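The hunk above parses a raw `Cookie`-style string and, for non-Bilibili platforms, wraps the result in a `{"cookies": [...], "origins": []}` payload. A minimal standalone sketch of those two steps follows. The function names `parse_cookie_string` and `to_storage_state` and the `.example.com` domain are hypothetical and exist only for illustration; the per-cookie fields are copied from the diff, not from any library reference.

```python
import json
from typing import Any, Dict, List


def parse_cookie_string(cookie_string: str) -> Dict[str, str]:
    """Split a 'k1=v1; k2=v2' string into a name -> value dict.

    Splitting on ';' and stripping each item tolerates both '; ' and ';'
    separators as well as stray empty segments.
    """
    cookies: Dict[str, str] = {}
    for item in cookie_string.split(";"):
        item = item.strip()
        if not item or "=" not in item:
            continue
        # Split only on the first '=' so values containing '=' survive.
        name, value = item.split("=", 1)
        name = name.strip()
        if name:
            cookies[name] = value.strip()
    return cookies


def to_storage_state(cookies: Dict[str, str], domain: str) -> Dict[str, Any]:
    """Wrap cookies in the payload shape used by the diff (hypothetical helper)."""
    entries: List[Dict[str, Any]] = [
        {
            "name": name,
            "value": value,
            "domain": domain,
            "path": "/",
            "httpOnly": False,
            "secure": True,
            "sameSite": "Lax",
            "expires": -1,  # -1 as in the diff: no explicit expiry is recorded
        }
        for name, value in cookies.items()
    ]
    return {"cookies": entries, "origins": []}


if __name__ == "__main__":
    parsed = parse_cookie_string("a=1;  b=x=y ;; ")
    print(json.dumps(to_storage_state(parsed, ".example.com"), indent=2))
```

Splitting on `';'` rather than `'; '` is what lets the new code accept cookie strings copied without the space after each separator, which the removed lines would have parsed as a single item.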
--- a/backend/app/services/qr_login_service.py
+++ b/backend/app/services/qr_login_service.py
@@ -8,7 +8,8 @@ import base64
 import json
 from pathlib import Path
 from typing import Optional, Dict, Any, List, Sequence, Mapping, Union
-from playwright.async_api import async_playwright, Page, Frame, BrowserContext, Browser, Playwright as PW
+from urllib.parse import unquote_to_bytes
+from playwright.async_api import async_playwright, Page, Frame, BrowserContext, Browser, Playwright as PW, TimeoutError as PlaywrightTimeoutError
 from loguru import logger
 from app.core.config import settings

@@ -65,10 +66,16 @@ class QRLoginService:
            "xiaohongshu": {
                "url": "https://creator.xiaohongshu.com/",
                "qr_selectors": [
+                    ".login-box-container img.css-1lhmg90",
+                    ".login-box-container .css-dvxtzn img",
+                    ".login-box-container img",
+                    "div[class*='login-box'] img",
                    ".qrcode img",
                    "img[alt*='二维码']",
                    "canvas.qr-code",
-                    "img[class*='qr']"
+                    "img[class*='qr']",
+                    "img[src*='qrcode']",
+                    "img[src*='qr']"
                ],
                "success_indicator": "https://creator.xiaohongshu.com/publish"
            },
@@ -109,6 +116,103 @@ class QRLoginService:
        ratio = width / height
        return 0.75 <= ratio <= 1.33

+    def _data_url_to_base64(self, data_url: str) -> Optional[str]:
+        if not data_url or "," not in data_url:
+            return None
+        header, payload = data_url.split(",", 1)
+        header_lower = header.lower()
+        if not header_lower.startswith("data:image/png"):
+            return None
+        if ";base64" in header_lower:
+            return payload
+        try:
+            raw = unquote_to_bytes(payload)
+            return base64.b64encode(raw).decode()
+        except Exception:
+            return None
+
+    async def _try_export_qr_data_url(self, qr_element) -> Optional[str]:
+        """Prefer exporting the element's original image, to avoid the scaling/cropping loss of a screenshot."""
+        try:
+            data_url = await qr_element.evaluate("""async (el) => {
+                const tag = (el.tagName || '').toLowerCase();
+
+                if (tag === 'canvas') {
+                    try {
+                        return el.toDataURL('image/png');
+                    } catch {
+                        return null;
+                    }
+                }
+
+                if (tag === 'img') {
+                    const src = el.currentSrc || el.src || '';
+                    if (!src) return null;
+
+                    if (src.startsWith('data:image/png')) {
+                        return src;
+                    }
+
+                    if (src.startsWith('blob:')) {
+                        try {
+                            const resp = await fetch(src);
+                            const blob = await resp.blob();
+                            return await new Promise((resolve) => {
+                                const reader = new FileReader();
+                                reader.onloadend = () => resolve(typeof reader.result === 'string' ? reader.result : null);
+                                reader.onerror = () => resolve(null);
+                                reader.readAsDataURL(blob);
+                            });
+                        } catch {
+                            return null;
+                        }
+                    }
+
+                    return null;
+                }
+
+                return null;
+            }""")
+
+            if not data_url:
+                return None
+
+            return self._data_url_to_base64(data_url)
+        except Exception:
+            return None
+
+    async def _screenshot_qr_base64(self, page: Page, qr_element) -> Optional[str]:
+        try:
+            if self.platform == "weixin":
+                bbox = await qr_element.bounding_box()
+                viewport = page.viewport_size or {"width": 1920, "height": 1080}
+                if bbox:
+                    pad = max(16, int(min(bbox.get("width", 0), bbox.get("height", 0)) * 0.08))
+                    x = max(0.0, bbox.get("x", 0.0) - pad)
+                    y = max(0.0, bbox.get("y", 0.0) - pad)
+                    max_width = float(viewport.get("width", 1920))
+                    max_height = float(viewport.get("height", 1080))
+                    width = min(max_width - x, bbox.get("width", 0.0) + pad * 2)
+                    height = min(max_height - y, bbox.get("height", 0.0) + pad * 2)
+                    if width > 8 and height > 8:
+                        clipped = await page.screenshot(
+                            clip={"x": x, "y": y, "width": width, "height": height},
+                            type="png",
+                        )
+                        return base64.b64encode(clipped).decode()
+
+            screenshot = await qr_element.screenshot(type="png")
+            return base64.b64encode(screenshot).decode()
+        except Exception as e:
+            logger.warning(f"[{self.platform}] QR截图失败: {e}")
+            return None
+
+    async def _capture_qr_base64(self, page: Page, qr_element) -> Optional[str]:
+        data_url_base64 = await self._try_export_qr_data_url(qr_element)
+        if data_url_base64:
+            return data_url_base64
+        return await self._screenshot_qr_base64(page, qr_element)
+
    async def _pick_best_candidate(self, locator, min_side: int = 100):
        best = None
        best_area = 0
@@ -160,6 +264,88 @@ class QRLoginService:

        return await self._find_qr_in_frames(page, selectors, min_side=min_side)

+    async def _ensure_xiaohongshu_qr_mode(self, page: Page) -> None:
+        """The Xiaohongshu login page defaults to SMS login, so switch to QR-code login first."""
+        if self.platform != "xiaohongshu":
+            return
+
+        try:
+            for _ in range(3):
+                sms_mode = False
+                try:
+                    sms_mode = await page.locator("input[placeholder*='手机号']").first.is_visible(timeout=800)
+                except Exception:
+                    sms_mode = False
+
+                if not sms_mode:
+                    return
+
+                clicked = False
+
+                # Try the stable selectors first
+                switch_selectors = [
+                    "img.css-wemwzq",
+                    ".login-box-container img[style*='cursor: pointer']",
+                ]
+
+                for selector in switch_selectors:
+                    try:
+                        locator = page.locator(selector)
+                        count = await locator.count()
+                        for i in range(count):
+                            candidate = locator.nth(i)
+                            if not await candidate.is_visible():
+                                continue
+                            bbox = await candidate.bounding_box()
+                            if not bbox:
+                                continue
+                            if bbox.get("width", 0) < 24 or bbox.get("width", 0) > 96:
+                                continue
+                            if bbox.get("height", 0) < 24 or bbox.get("height", 0) > 96:
+                                continue
+                            try:
+                                await candidate.click(timeout=1200)
+                            except Exception:
+                                await candidate.evaluate("el => el.click()")
+                            clicked = True
+                            break
+                        if clicked:
+                            break
+                    except Exception:
+                        continue
+
+                if not clicked:
+                    # Fallback: look for a small clickable icon in the top-right corner of the login card
+                    clicked = bool(await page.evaluate("""() => {
+                        const phoneInput = Array.from(document.querySelectorAll('input'))
+                          .find((el) => (el.placeholder || '').includes('手机号'));
+                        const card = document.querySelector('.login-box-container') || phoneInput?.closest('div');
+                        if (!card) return false;
+
+                        const cardRect = card.getBoundingClientRect();
+                        const imgs = Array.from(card.querySelectorAll('img'));
+                        for (const img of imgs) {
+                            const r = img.getBoundingClientRect();
+                            if (r.width < 24 || r.width > 96 || r.height < 24 || r.height > 96) continue;
+                            if (r.right < cardRect.right - 90) continue;
+                            if (r.top > cardRect.top + 90) continue;
+                            const style = getComputedStyle(img);
+                            if (style.cursor !== 'pointer') continue;
+                            img.click();
+                            return true;
+                        }
+                        return false;
+                    }"""))
+
+                if not clicked:
+                    logger.warning("[xiaohongshu] 未找到登录方式切换按钮，继续尝试二维码提取")
+                    return
+
+                logger.info("[xiaohongshu] 已点击登录方式切换，等待二维码渲染")
+                await asyncio.sleep(1.5)
+        except Exception as e:
+            logger.warning(f"[xiaohongshu] 切换扫码登录模式失败: {e}")
+
    async def _try_text_strategy_in_frames(self, page: Page):
        for frame in page.frames:
            if frame == page.main_frame:
@@ -317,12 +503,22 @@ class QRLoginService:

            for url in urls_to_try:
                logger.info(f"[{self.platform}] 打开登录页: {url}")
-                wait_until = "domcontentloaded" if self.platform == "weixin" else "networkidle"
-                await page.goto(url, wait_until=wait_until)
+                wait_until = "domcontentloaded" if self.platform in ("weixin", "douyin") else "networkidle"
+                try:
+                    await page.goto(url, wait_until=wait_until, timeout=30000)
+                except PlaywrightTimeoutError as nav_err:
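+                    # The Douyin page keeps long-lived connections open, so the wait condition is occasionally never met; after a timeout, still try to extract the QR code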
+                    if self.platform == "douyin":
+                        logger.warning(f"[douyin] 页面加载超时，继续尝试提取二维码: {nav_err}")
+                    else:
+                        raise

                # 等待页面加载
                await asyncio.sleep(1 if self.platform == "weixin" else 2)

+                if self.platform == "xiaohongshu":
+                    await self._ensure_xiaohongshu_qr_mode(page)
+
                # 提取二维码 (并行策略)
                qr_image = await self._extract_qr_code(page, config["qr_selectors"])
                if qr_image:
@@ -373,8 +569,9 @@ class QRLoginService:
                    el = await page.wait_for_selector(combined_selector, state="visible", timeout=5000)
                    if el:
                        logger.info(f"[{self.platform}] 策略CSS: 匹配成功")
-                        screenshot = await el.screenshot()
-                        return base64.b64encode(screenshot).decode()
+                        qr_base64 = await self._capture_qr_base64(page, el)
+                        if qr_base64:
+                            return qr_base64
                except Exception as e:
                    logger.warning(f"[{self.platform}] 策略CSS 失败: {e}")

@@ -382,8 +579,9 @@ class QRLoginService:
                qr_element = await self._try_text_strategy(page)
                if qr_element:
                    try:
-                        screenshot = await qr_element.screenshot()
-                        return base64.b64encode(screenshot).decode()
+                        qr_base64 = await self._capture_qr_base64(page, qr_element)
+                        if qr_base64:
+                            return qr_base64
                    except Exception as e:
                        logger.warning(f"[{self.platform}] Text策略截图失败: {e}")

@@ -397,8 +595,9 @@ class QRLoginService:
                qr_element = await self._try_text_strategy(page)
                if qr_element:
                    try:
-                        screenshot = await qr_element.screenshot()
-                        return base64.b64encode(screenshot).decode()
+                        qr_base64 = await self._capture_qr_base64(page, qr_element)
+                        if qr_base64:
+                            return qr_base64
                    except Exception as e:
                        logger.warning(f"[{self.platform}] Text策略截图失败: {e}")
                        qr_element = None
@@ -410,12 +609,16 @@ class QRLoginService:
                        el = await page.wait_for_selector(combined_selector, state="visible", timeout=5000)
                        if el:
                            logger.info(f"[{self.platform}] 策略CSS: 匹配成功")
-                            screenshot = await el.screenshot()
-                            return base64.b64encode(screenshot).decode()
+                            qr_base64 = await self._capture_qr_base64(page, el)
+                            if qr_base64:
+                                return qr_base64
                    except Exception as e:
                        logger.warning(f"[{self.platform}] 策略CSS 失败: {e}")
        else:
            # 其他平台 (小红书/微信等)：保持原顺序 CSS -> Text
+            if self.platform == "xiaohongshu":
+                await self._ensure_xiaohongshu_qr_mode(page)
+
            # 策略1: CSS 选择器
            try:
                combined_selector = ", ".join(selectors)
@@ -432,7 +635,8 @@ class QRLoginService:
                else:
                    await page.wait_for_selector(combined_selector, state="visible", timeout=5000)
                    locator = page.locator(combined_selector)
-                    qr_element = await self._pick_best_candidate(locator, min_side=100)
+                    min_side = 120 if self.platform == "xiaohongshu" else 100
+                    qr_element = await self._pick_best_candidate(locator, min_side=min_side)
                    if qr_element:
                        logger.info(f"[{self.platform}] 策略1(CSS): 匹配成功")
            except Exception as e:
@@ -448,8 +652,9 @@ class QRLoginService:
            # 如果找到元素，截图返回
            if qr_element:
                try:
-                    screenshot = await qr_element.screenshot()
-                    return base64.b64encode(screenshot).decode()
+                    qr_base64 = await self._capture_qr_base64(page, qr_element)
+                    if qr_base64:
+                        return qr_base64
                except Exception as e:
                    logger.error(f"[{self.platform}] 截图失败: {e}")
        
@@ -465,6 +670,8 @@ class QRLoginService:
            keywords = [
                "扫码登录",
                "二维码",
+                "APP扫一扫登录",
+                "可用小红书扫码",
                "打开抖音",
                "抖音APP",
                "使用APP扫码",
@@ -483,7 +690,7 @@ class QRLoginService:
                    for _ in range(5):
                        parent = parent.locator("..")
                        candidates = parent.locator("img, canvas")
-                        min_side = 120 if self.platform == "weixin" else 100
+                        min_side = 120 if self.platform in ("weixin", "xiaohongshu") else 100
                        best = await self._pick_best_candidate(candidates, min_side=min_side)
                        if best:
                            logger.info(f"[{self.platform}] 策略Text: 成功")
@@ -554,6 +761,22 @@ class QRLoginService:
                        await self._save_cookies(final)
                        break

+                    # ── Xiaohongshu special case: after scanning, the page often redirects to /new/home and may not hit success_indicator ──
+                    if self.platform == "xiaohongshu":
+                        lowered_url = current_url.lower()
+                        xhs_logged_in = (
+                            lowered_url.startswith("https://creator.xiaohongshu.com/new/")
+                            or "/publish/publish" in lowered_url
+                            or "/publish/success" in lowered_url
+                        ) and "/login" not in lowered_url
+                        if xhs_logged_in:
+                            logger.success(f"[xiaohongshu] 登录成功！URL={current_url[:120]}")
+                            self.login_success = True
+                            await asyncio.sleep(2)
+                            final = [dict(c) for c in await self.context.cookies()]
+                            await self._save_cookies(final)
+                            break
+
                    # ── 抖音：API 拦截到 redirect_url → 直接导航 ──
                    if self.platform == "douyin" and self._qr_api_confirmed and self._qr_redirect_url:
                        logger.info(f"[douyin] 导航到 redirect_url...")
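The `_data_url_to_base64` helper added earlier in this file accepts a PNG data URL in either of its two encodings: a `;base64` payload that can be returned as-is, or a percent-encoded payload that has to be decoded to bytes and then base64-encoded. A minimal standalone sketch of that logic is below. The function name `png_data_url_to_base64` is hypothetical, and the case-insensitive `;base64` check is an assumption about how lenient the parser should be, not something this diff states.

```python
import base64
from typing import Optional
from urllib.parse import unquote_to_bytes


def png_data_url_to_base64(data_url: str) -> Optional[str]:
    """Return the base64 payload of a PNG data URL, or None if it is not one."""
    if not data_url or "," not in data_url:
        return None
    # Everything before the first comma is the header; the rest is the payload.
    header, payload = data_url.split(",", 1)
    header = header.lower()
    if not header.startswith("data:image/png"):
        return None
    if header.endswith(";base64"):
        # Already base64: hand the payload back untouched.
        return payload
    # Otherwise the payload is percent-encoded bytes.
    return base64.b64encode(unquote_to_bytes(payload)).decode()


if __name__ == "__main__":
    print(png_data_url_to_base64("data:image/png;base64,iVBORw0KGgo="))
    print(png_data_url_to_base64("data:image/png,%89PNG"))
```

Returning `None` for anything that is not a PNG data URL lets the caller fall back to a screenshot, which is the role `_capture_qr_base64` plays in the diff.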
--- a/backend/app/services/small_face_enhance_service.py
+++ b/backend/app/services/small_face_enhance_service.py
@@ -0,0 +1,872 @@
+"""
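+Small-face enhancement service.
+For long shots with small faces: crop + super-resolve -> lipsync inference -> paste back, improving input quality.
+
+Single file, single class; called by LipSyncService.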
+"""
+from __future__ import annotations
+
+import subprocess
+import time
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Optional, Tuple, List
+
+from loguru import logger
+
+from app.core.config import settings
+
+try:
+    import cv2
+    import numpy as np
+    _CV2_AVAILABLE = True
+except ImportError:
+    _CV2_AVAILABLE = False
+
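+# ── Module constants ──
+PADDING = 0.28                  # bbox expansion ratio
+DETECT_EVERY = 8                # run detection every N frames
+TARGET_SIZE = 512               # super-resolution target size
+MASK_FEATHER = 15               # feather width in pixels
+MASK_UPPER_RATIO = 0.68         # where the mouth region starts (covers only the mouth/chin)
+MASK_SIDE_MARGIN = 0.16         # left/right margin ratio, to avoid altering the cheeks/nose wings
+SAMPLE_FRAMES = 24              # number of sampled frames
+SAMPLE_WINDOW = (0.10, 0.30)    # sampling window (10%~30%)
+ENCODE_FPS = 25                 # encoding frame rate
+ENCODE_CRF = 18                 # encoding quality (passed to -crf)
+EMA_ALPHA = 0.3                 # EMA smoothing coefficient
+
+# Detection filters
+MIN_FACE_WIDTH = 50
+FACE_ASPECT_MIN = 0.2
+FACE_ASPECT_MAX = 1.5
+DET_SCORE_THRESH = 0.5
+NMS_IOU_THRESH = 0.4
+
+# Weights path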
+_PROJECT_ROOT = Path(__file__).resolve().parent.parent.parent.parent
+DET_MODEL_PATH = (
+    _PROJECT_ROOT
+    / "models" / "LatentSync" / "checkpoints"
+    / "auxiliary" / "models" / "buffalo_l" / "det_10g.onnx"
+)
+
+
+# ── Data structures ──
+
+@dataclass
+class FaceTrack:
+    """Per-frame face tracking data (used for cropping and pasting back)"""
+    crop_boxes: List[Tuple[int, int, int, int]]   # per-frame (x1,y1,x2,y2)
+    face_width_median: float
+    frame_count: int
+    frame_w: int
+    frame_h: int
+
+
+@dataclass
+class EnhanceResult:
+    """Return value of enhance_if_needed"""
+    video_path: str
+    was_enhanced: bool
+    track: Optional[FaceTrack] = None
+    face_width: float = 0.0
+
+
+class SmallFaceEnhanceService:
+    """Small-face enhancement service: detect → crop → super-resolve → (lipsync) → paste back"""
+
+    def __init__(self):
+        self._detector_session = None
+        self._sr_model = None
+        self._sr_type: Optional[str] = None
+
+    # ================================================================
+    #  SCRFD face detection (det_10g.onnx, CPU inference)
+    # ================================================================
+
+    def _ensure_detector(self) -> bool:
+        if self._detector_session is not None:
+            return True
+        if not DET_MODEL_PATH.exists():
+            logger.warning(f"⚠️ SCRFD 权重不存在: {DET_MODEL_PATH}")
+            return False
+        try:
+            import onnxruntime as ort
+            self._detector_session = ort.InferenceSession(
+                str(DET_MODEL_PATH),
+                providers=["CPUExecutionProvider"],
+            )
+            logger.info("✅ SCRFD 检测器已加载")
+            return True
+        except Exception as e:
+            logger.warning(f"⚠️ SCRFD 初始化失败: {e}")
+            return False
+
+    def _detect_faces(self, img_bgr: np.ndarray) -> List[Tuple[np.ndarray, float]]:
+        """
+        Detect faces with SCRFD.
+        Returns: [(bbox_xyxy, score), ...] sorted by area, descending.
+        """
+        if self._detector_session is None:
+            return []
+
+        h, w = img_bgr.shape[:2]
+        input_h, input_w = 640, 640
+
+        # ── Preprocess ──
+        ratio = min(input_h / h, input_w / w)
+        new_h, new_w = int(h * ratio), int(w * ratio)
+        resized = cv2.resize(img_bgr, (new_w, new_h))
+
+        padded = np.full((input_h, input_w, 3), 127.5, dtype=np.float32)
+        padded[:new_h, :new_w] = resized.astype(np.float32)
+
+        # BGR → RGB → normalize
+        blob = padded[:, :, ::-1].copy()
+        blob = (blob - 127.5) / 128.0
+        blob = blob.transpose(2, 0, 1)[np.newaxis].astype(np.float32)
+
+        # ── Inference ──
+        input_name = self._detector_session.get_inputs()[0].name
+        outputs = self._detector_session.run(None, {input_name: blob})
+
+        # det_10g outputs: [scores_s8, scores_s16, scores_s32,
+        #                    bbox_s8, bbox_s16, bbox_s32,
+        #                    kps_s8, kps_s16, kps_s32]
+        strides = [8, 16, 32]
+        all_bboxes = []
+        all_scores = []
+
+        for i, stride in enumerate(strides):
+            scores = outputs[i].flatten()
+            bboxes = outputs[i + 3].reshape(-1, 4)
+
+            # Generate anchor centers
+            feat_h = input_h // stride
+            feat_w = input_w // stride
+            anchors = []
+            for y in range(feat_h):
+                for x in range(feat_w):
+                    cx, cy = x * stride, y * stride
+                    anchors.append([cx, cy])
+                    anchors.append([cx, cy])   # 2 anchors per cell
+            anchors = np.array(anchors, dtype=np.float32)
+
+            # Filter by confidence
+            mask = scores > DET_SCORE_THRESH
+            if not mask.any():
+                continue
+
+            f_scores = scores[mask]
+            f_bboxes = bboxes[mask]
+            f_anchors = anchors[mask]
+
+            # Decode: distance * stride → xyxy
+            decoded = np.empty_like(f_bboxes)
+            decoded[:, 0] = f_anchors[:, 0] - f_bboxes[:, 0] * stride
+            decoded[:, 1] = f_anchors[:, 1] - f_bboxes[:, 1] * stride
+            decoded[:, 2] = f_anchors[:, 0] + f_bboxes[:, 2] * stride
+            decoded[:, 3] = f_anchors[:, 1] + f_bboxes[:, 3] * stride
+
+            # Scale back to original image coordinates
+            decoded /= ratio
+
+            all_bboxes.append(decoded)
+            all_scores.append(f_scores)
+
+        if not all_bboxes:
+            return []
+
+        bboxes_cat = np.concatenate(all_bboxes)
+        scores_cat = np.concatenate(all_scores)
+
+        # NMS
+        keep = self._nms(bboxes_cat, scores_cat, NMS_IOU_THRESH)
+
+        # Filter by size and aspect ratio
+        results = []
+        for idx in keep:
+            bbox = bboxes_cat[idx]
+            score = float(scores_cat[idx])
+            bw = bbox[2] - bbox[0]
+            bh = bbox[3] - bbox[1]
+            if bw < MIN_FACE_WIDTH or bh < MIN_FACE_WIDTH:
+                continue
+            aspect = bw / max(bh, 1)
+            if aspect < FACE_ASPECT_MIN or aspect > FACE_ASPECT_MAX:
+                continue
+            results.append((bbox.copy(), score))
+
+        results.sort(key=lambda x: (x[0][2] - x[0][0]) * (x[0][3] - x[0][1]), reverse=True)
+        return results
+
+    @staticmethod
+    def _nms(bboxes: np.ndarray, scores: np.ndarray, threshold: float) -> List[int]:
+        x1 = bboxes[:, 0]
+        y1 = bboxes[:, 1]
+        x2 = bboxes[:, 2]
+        y2 = bboxes[:, 3]
+        areas = (x2 - x1) * (y2 - y1)
+        order = scores.argsort()[::-1]
+        keep = []
+        while order.size > 0:
+            i = order[0]
+            keep.append(int(i))
+            if order.size == 1:
+                break
+            xx1 = np.maximum(x1[i], x1[order[1:]])
+            yy1 = np.maximum(y1[i], y1[order[1:]])
+            xx2 = np.minimum(x2[i], x2[order[1:]])
+            yy2 = np.minimum(y2[i], y2[order[1:]])
+            inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
+            iou = inter / (areas[i] + areas[order[1:]] - inter + 1e-6)
+            inds = np.where(iou <= threshold)[0]
+            order = order[inds + 1]
+        return keep
+
+    # ================================================================
+    #  Video utilities
+    # ================================================================
+
+    @staticmethod
+    def _get_video_info(video_path: str) -> Optional[Tuple[int, int, int, float]]:
+        """Return (width, height, frame_count, fps)"""
+        try:
+            import json as _json
+            cmd = [
+                "ffprobe", "-v", "error",
+                "-select_streams", "v:0",
+                "-show_entries", "stream=width,height,nb_frames,r_frame_rate,avg_frame_rate",
+                "-of", "json",
+                video_path,
+            ]
+            r = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
+            if r.returncode != 0:
+                return None
+            info = _json.loads(r.stdout)
+            streams = info.get("streams")
+            if not streams:
+                return None
+            stream = streams[0]
+            w, h = int(stream["width"]), int(stream["height"])
+            # nb_frames may be "N/A" or missing
+            nb_raw = stream.get("nb_frames", "N/A")
+            nb = int(nb_raw) if nb_raw not in ("N/A", "") else 0
+
+            def _parse_fps(s: str) -> float:
+                if "/" in s:
+                    num, den = s.split("/")
+                    return float(num) / float(den) if float(den) != 0 else 0.0
+                return float(s) if s else 0.0
+
+            # Prefer avg_frame_rate (the true average frame rate); r_frame_rate may be a multiple of the timebase
+            avg_fps = _parse_fps(stream.get("avg_frame_rate", "0/0"))
+            r_fps = _parse_fps(stream.get("r_frame_rate", "25/1"))
+            fps = avg_fps if avg_fps > 0 else (r_fps if r_fps > 0 else 25.0)
+
+            if nb == 0:
+                cmd2 = [
+                    "ffprobe", "-v", "error",
+                    "-show_entries", "format=duration",
+                    "-of", "default=noprint_wrappers=1:nokey=1",
+                    video_path,
+                ]
+                r2 = subprocess.run(cmd2, capture_output=True, text=True, timeout=10)
+                if r2.returncode == 0 and r2.stdout.strip():
+                    nb = int(float(r2.stdout.strip()) * fps)
+            return w, h, nb, fps
+        except Exception as e:
+            logger.warning(f"⚠️ 获取视频信息失败: {e}")
+            return None
+
+    @staticmethod
+    def _open_video_reader(video_path: str, w: int, h: int,
+                           seek_sec: float = 0, duration_sec: float = 0):
+        """Open an ffmpeg rawvideo read pipe"""
+        cmd = ["ffmpeg"]
+        if seek_sec > 0:
+            cmd += ["-ss", f"{seek_sec:.3f}"]
+        cmd += ["-i", video_path]
+        if duration_sec > 0:
+            cmd += ["-t", f"{duration_sec:.3f}"]
+        cmd += ["-f", "rawvideo", "-pix_fmt", "bgr24", "-v", "quiet", "-"]
+        return subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
+
+    @staticmethod
+    def _read_one_frame(proc, w: int, h: int) -> Optional[np.ndarray]:
+        raw = proc.stdout.read(w * h * 3)
+        if len(raw) < w * h * 3:
+            return None
+        return np.frombuffer(raw, dtype=np.uint8).reshape(h, w, 3).copy()
+
+    @staticmethod
+    def _open_video_writer(output_path: str, w: int, h: int,
+                           fps: int = ENCODE_FPS, crf: int = ENCODE_CRF):
+        """Open an ffmpeg rawvideo write pipe"""
+        cmd = [
+            "ffmpeg", "-y",
+            "-f", "rawvideo", "-pix_fmt", "bgr24",
+            "-s", f"{w}x{h}", "-r", str(fps), "-i", "-",
+            "-c:v", "libx264", "-crf", str(crf),
+            "-preset", "fast", "-pix_fmt", "yuv420p",
+            output_path,
+        ]
+        return subprocess.Popen(cmd, stdin=subprocess.PIPE, stderr=subprocess.DEVNULL)
+
+    # ================================================================
+    #  Phase 2: face size detection
+    # ================================================================
+
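+    def _detect_face_size(self, video_path: str) -> Optional[float]:
+        """
+        Sample evenly from the 10%~30% window of the video and return the median width of the largest face.
+        Returns None if no face is detected or the detector is unavailable.
+        """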
+        if not self._ensure_detector():
+            return None
+
+        info = self._get_video_info(video_path)
+        if info is None:
+            return None
+        w, h, nb_frames, fps = info
+        if nb_frames < 1 or fps <= 0:
+            return None
+
+        # Compute the sampling window
+        start_frame = int(nb_frames * SAMPLE_WINDOW[0])
+        end_frame = int(nb_frames * SAMPLE_WINDOW[1])
+        end_frame = max(end_frame, start_frame + 1)
+        n_sample = min(SAMPLE_FRAMES, end_frame - start_frame)
+        if n_sample <= 0:
+            return None
+
+        step = max(1, (end_frame - start_frame) // n_sample)
+        sample_indices = set(range(start_frame, end_frame, step))
+
+        # Use ffmpeg seek to jump to the start of the sampling window
+        seek_sec = start_frame / fps
+        duration_sec = (end_frame - start_frame) / fps + 0.5  # safety margin
+
+        proc = self._open_video_reader(video_path, w, h, seek_sec, duration_sec)
+        face_widths = []
+        try:
+            for local_idx in range(end_frame - start_frame + 1):
+                frame = self._read_one_frame(proc, w, h)
+                if frame is None:
+                    break
+                global_idx = start_frame + local_idx
+                if global_idx not in sample_indices:
+                    continue
+                faces = self._detect_faces(frame)
+                if faces:
+                    bbox = faces[0][0]  # largest face
+                    face_widths.append(float(bbox[2] - bbox[0]))
+        finally:
+            proc.stdout.close()
+            proc.terminate()
+            proc.wait()
+
+        if not face_widths:
+            return None
+
+        face_widths.sort()
+        mid = len(face_widths) // 2
+        if len(face_widths) % 2 == 0:
+            return (face_widths[mid - 1] + face_widths[mid]) / 2
+        return face_widths[mid]
+
+    # ================================================================
+    #  Phase 3: crop + track
+    # ================================================================
+
+    def _build_face_track(self, video_path: str,
+                          w: int, h: int, nb_frames: int) -> Optional[FaceTrack]:
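+        """
+        Per-frame face tracking: detect every DETECT_EVERY frames, forward-fill the frames in between, then smooth with an EMA.
+        Returns a FaceTrack, or None if detection fails.
+        """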
+        if not self._ensure_detector():
+            return None
+
+        detect_set = set(range(0, nb_frames, DETECT_EVERY))
+
+        # First pass: run detection on the keyframes
+        proc = self._open_video_reader(video_path, w, h)
+        keyframe_bboxes = {}
+        actual_frames = 0
+        try:
+            for idx in range(nb_frames):
+                frame = self._read_one_frame(proc, w, h)
+                if frame is None:
+                    break
+                actual_frames = idx + 1
+                if idx not in detect_set:
+                    continue
+                faces = self._detect_faces(frame)
+                if faces:
+                    keyframe_bboxes[idx] = faces[0][0].copy()
+        finally:
+            proc.stdout.close()
+            proc.terminate()
+            proc.wait()
+
+        if not keyframe_bboxes:
+            return None
+
+        # Use the number of frames actually read, since _get_video_info may only estimate it
+        nb_frames = actual_frames
+
+        # Forward fill + EMA smoothing (frames before the first keyframe reuse its bbox)
+        sorted_keys = sorted(keyframe_bboxes.keys())
+        raw_bboxes: List[np.ndarray] = [None] * nb_frames  # type: ignore
+
+        for k in sorted_keys:
+            raw_bboxes[k] = keyframe_bboxes[k]
+
+        prev = keyframe_bboxes[sorted_keys[0]]
+        for i in range(nb_frames):
+            if raw_bboxes[i] is not None:
+                prev = raw_bboxes[i]
+            else:
+                raw_bboxes[i] = prev.copy()
+
+        # EMA smoothing
+        smoothed = [raw_bboxes[0].copy()]
+        for i in range(1, nb_frames):
+            s = EMA_ALPHA * raw_bboxes[i] + (1 - EMA_ALPHA) * smoothed[-1]
+            smoothed.append(s)
+
+        # Padded crop box (clamped to the frame bounds)
+        crop_boxes = []
+        for bbox in smoothed:
+            x1, y1, x2, y2 = bbox
+            bw, bh = x2 - x1, y2 - y1
+            pad_w, pad_h = bw * PADDING, bh * PADDING
+            cx1 = max(0, int(x1 - pad_w))
+            cy1 = max(0, int(y1 - pad_h))
+            cx2 = min(w, int(x2 + pad_w))
+            cy2 = min(h, int(y2 + pad_h))
+            crop_boxes.append((cx1, cy1, cx2, cy2))
+
+        # Median face width (takes the upper middle value when the count is even)
+        widths = sorted(float(b[2] - b[0]) for b in smoothed)
+        median_w = widths[len(widths) // 2]
+
+        return FaceTrack(
+            crop_boxes=crop_boxes,
+            face_width_median=median_w,
+            frame_count=nb_frames,
+            frame_w=w,
+            frame_h=h,
+        )
+
+    # ================================================================
+    #  Phase 3: super-resolution
+    # ================================================================
+
+    def _ensure_upscaler(self, upscaler: str, gpu_id: int) -> bool:
+        """Lazily load the super-resolution model."""
+        if self._sr_model is not None and self._sr_type == upscaler:
+            return True
+        try:
+            import sys
+            import torch
+
+            # torchvision >= 0.20 no longer ships functional_tensor, but basicsr still imports it
+            if "torchvision.transforms.functional_tensor" not in sys.modules:
+                try:
+                    import torchvision.transforms.functional as _F
+                    sys.modules["torchvision.transforms.functional_tensor"] = _F
+                except ImportError:
+                    pass
+
+            device = torch.device(f"cuda:{gpu_id}" if torch.cuda.is_available() else "cpu")
+
+            if upscaler == "gfpgan":
+                from gfpgan import GFPGANer
+                model_path = _PROJECT_ROOT / "models" / "FaceEnhance" / "GFPGANv1.4.pth"
+                if not model_path.exists():
+                    logger.warning(f"⚠️ GFPGAN 权重不存在: {model_path}")
+                    return False
+                self._sr_model = GFPGANer(
+                    model_path=str(model_path),
+                    upscale=2,
+                    arch="clean",
+                    channel_multiplier=2,
+                    bg_upsampler=None,
+                    device=device,
+                )
+            elif upscaler == "codeformer":
+                from basicsr.archs.codeformer_arch import CodeFormer as CodeFormerArch
+                model_path = _PROJECT_ROOT / "models" / "FaceEnhance" / "codeformer.pth"
+                if not model_path.exists():
+                    logger.warning(f"⚠️ CodeFormer 权重不存在: {model_path}")
+                    # Fall back to gfpgan
+                    return self._ensure_upscaler("gfpgan", gpu_id)
+                net = CodeFormerArch(
+                    dim_embd=512, codebook_size=1024, n_head=8, n_layers=9,
+                    connect_list=["32", "64", "128", "256"],
+                ).to(device)
+                ckpt = torch.load(str(model_path), map_location=device, weights_only=False)
+                net.load_state_dict(ckpt.get("params_ema", ckpt.get("params", ckpt)))
+                net.eval()
+                self._sr_model = net
+                self._sr_device = device
+            else:
+                logger.warning(f"⚠️ 未知超分器: {upscaler}")
+                return False
+
+            self._sr_type = upscaler
+            logger.info(f"✅ 超分器已加载: {upscaler}")
+            return True
+        except Exception as e:
+            logger.warning(f"⚠️ 超分器初始化失败 ({upscaler}): {e}")
+            return False
+
+    def _upscale_face(self, face_img: np.ndarray, target_size: int) -> np.ndarray:
+        """Enhance one frame with the loaded super-resolution model; fall back to bicubic on failure."""
+        try:
+            if self._sr_type == "gfpgan":
+                _, _, output = self._sr_model.enhance(
+                    face_img, paste_back=False, has_aligned=False,
+                )
+                if output is not None:
+                    return cv2.resize(
+                        output, (target_size, target_size),
+                        interpolation=cv2.INTER_LANCZOS4,
+                    )
+            elif self._sr_type == "codeformer":
+                import torch
+                img = cv2.resize(face_img, (512, 512))
+                img_t = (
+                    torch.from_numpy(img.astype(np.float32) / 255.0)
+                    .permute(2, 0, 1)
+                    .unsqueeze(0)
+                    .to(self._sr_device)
+                )
+                with torch.no_grad():
+                    out = self._sr_model(img_t, w=0.7)[0]
+                out_np = (
+                    out.squeeze().permute(1, 2, 0).cpu().numpy() * 255
+                ).clip(0, 255).astype(np.uint8)
+                return cv2.resize(
+                    out_np, (target_size, target_size),
+                    interpolation=cv2.INTER_LANCZOS4,
+                )
+        except Exception as e:
+            logger.debug(f"超分失败，回退 bicubic: {e}")
+
+        return cv2.resize(
+            face_img, (target_size, target_size),
+            interpolation=cv2.INTER_CUBIC,
+        )
+
+    # ================================================================
+    #  Phase 3: crop + super-resolution → enhanced video
+    # ================================================================
+
+    def _crop_and_upscale_video(
+        self,
+        video_path: str,
+        track: FaceTrack,
+        tmpdir: Path,
+        gpu_id: int,
+        source_fps: float,
+    ) -> str:
+        """
+        Crop the face region → super-resolve sparse keyframes → write a TARGET_SIZE video.
+        Frames are streamed, so the whole video is never held in memory.
+        """
+        output_path = str(tmpdir / "enhanced_face.mp4")
+        w, h = track.frame_w, track.frame_h
+
+        upscaler = settings.LIPSYNC_SMALL_FACE_UPSCALER
+        sr_available = self._ensure_upscaler(upscaler, gpu_id)
+        detect_set = set(range(0, track.frame_count, DETECT_EVERY))
+
+        reader = self._open_video_reader(video_path, w, h)
+        out_fps = max(1, int(round(source_fps))) if source_fps > 0 else ENCODE_FPS
+        writer = self._open_video_writer(output_path, TARGET_SIZE, TARGET_SIZE, fps=out_fps)
+
+        try:
+            for idx in range(track.frame_count):
+                frame = self._read_one_frame(reader, w, h)
+                if frame is None:
+                    break
+
+                cx1, cy1, cx2, cy2 = track.crop_boxes[idx]
+                cropped = frame[cy1:cy2, cx1:cx2]
+
+                if sr_available and idx in detect_set:
+                    enhanced = self._upscale_face(cropped, TARGET_SIZE)
+                else:
+                    enhanced = cv2.resize(
+                        cropped, (TARGET_SIZE, TARGET_SIZE),
+                        interpolation=cv2.INTER_CUBIC,
+                    )
+
+                writer.stdin.write(enhanced.tobytes())
+        finally:
+            reader.stdout.close()
+            reader.terminate()
+            reader.wait()
+            writer.stdin.close()
+            writer.wait()
+
+        if not Path(output_path).exists():
+            raise RuntimeError("增强视频写入失败")
+
+        return output_path
+
+    # ================================================================
+    #  Phase 3: blend back
+    # ================================================================
+
+    def blend_back(
+        self,
+        original_video: str,
+        lipsync_video: str,
+        track: FaceTrack,
+        tmpdir,
+    ) -> str:
+        """
+        Blend the lipsync inference result back into the original video.
+        Lower-face mask + Gaussian feathering + seamlessClone.
+        """
+        tmpdir = Path(tmpdir)
+        output_path = str(tmpdir / "blended_output.mp4")
+        w, h = track.frame_w, track.frame_h
+
+        # Read the lipsync video dimensions
+        ls_info = self._get_video_info(lipsync_video)
+        if ls_info is None:
+            raise RuntimeError("无法读取 lipsync 视频信息")
+        ls_w, ls_h, ls_frames, ls_fps = ls_info
+
+        if ls_fps <= 0:
+            ls_fps = ENCODE_FPS
+
+        # Frame-count guard: the lipsync model emits frames for the audio duration, usually <= the original (looped) video
+        if ls_frames <= 0:
+            raise RuntimeError(f"lipsync 输出帧数为 {ls_frames}，跳过贴回")
+        if ls_frames > track.frame_count:
+            raise RuntimeError(
+                f"帧数异常: lipsync={ls_frames} > original={track.frame_count}"
+            )
+        blend_count = ls_frames
+
+        orig_info = self._get_video_info(original_video)
+        orig_fps = orig_info[3] if orig_info is not None else 0.0
+        if orig_fps <= 0:
+            orig_fps = ls_fps
+
+        orig_reader = self._open_video_reader(original_video, w, h)
+        ls_reader = self._open_video_reader(lipsync_video, ls_w, ls_h)
+        writer = self._open_video_writer(
+            output_path,
+            w,
+            h,
+            fps=max(1, int(round(ls_fps))),
+        )
+
+        current_orig_idx = -1
+        current_orig_frame = None
+
+        try:
+            for idx in range(blend_count):
+                target_orig_idx = min(
+                    track.frame_count - 1,
+                    int(round((idx / ls_fps) * orig_fps)),
+                )
+
+                while current_orig_idx < target_orig_idx:
+                    frame = self._read_one_frame(orig_reader, w, h)
+                    if frame is None:
+                        current_orig_frame = None
+                        break
+                    current_orig_idx += 1
+                    current_orig_frame = frame
+
+                orig_frame = current_orig_frame
+                ls_frame = self._read_one_frame(ls_reader, ls_w, ls_h)
+                if orig_frame is None or ls_frame is None:
+                    break
+
+                cx1, cy1, cx2, cy2 = track.crop_boxes[target_orig_idx]
+                crop_w, crop_h = cx2 - cx1, cy2 - cy1
+
+                # Resize the lipsync output to the crop-region size
+                ls_resized = cv2.resize(
+                    ls_frame, (crop_w, crop_h),
+                    interpolation=cv2.INTER_LANCZOS4,
+                )
+
+                # Local mouth mask (cover only the lips and chin where possible, leaving the nose and eye area untouched)
+                mask = np.zeros((crop_h, crop_w), dtype=np.uint8)
+                upper = int(crop_h * MASK_UPPER_RATIO)
+                left = int(crop_w * MASK_SIDE_MARGIN)
+                right = int(crop_w * (1.0 - MASK_SIDE_MARGIN))
+                if right - left < 8:
+                    left, right = 0, crop_w
+
+                mask[upper:, left:right] = 255
+
+                # Central ellipse to strengthen the weight of the mouth region
+                ellipse_center = (crop_w // 2, int(crop_h * 0.82))
+                ellipse_axes = (max(8, int(crop_w * 0.22)), max(8, int(crop_h * 0.13)))
+                cv2.ellipse(mask, ellipse_center, ellipse_axes, 0, 0, 360, 255, -1)
+                mask = cv2.GaussianBlur(mask, (0, 0), MASK_FEATHER)
+
+                # Blend
+                blended = self._blend_face_region(
+                    orig_frame, ls_resized, mask, cx1, cy1, cx2, cy2,
+                )
+                writer.stdin.write(blended.tobytes())
+        finally:
+            for p in (orig_reader, ls_reader):
+                p.stdout.close()
+                p.terminate()
+                p.wait()
+            writer.stdin.close()
+            writer.wait()
+
+        if not Path(output_path).exists():
+            raise RuntimeError("融合视频写入失败")
+        return output_path
+
+    @staticmethod
+    def _blend_face_region(
+        orig: np.ndarray,
+        face: np.ndarray,
+        mask: np.ndarray,
+        x1: int, y1: int, x2: int, y2: int,
+    ) -> np.ndarray:
+        """Paste back with seamlessClone; fall back to alpha blending on failure."""
+        result = orig.copy()
+        crop_h, crop_w = face.shape[:2]
+
+        # Try seamlessClone first
+        try:
+            center_x = (x1 + x2) // 2
+            center_y = int(y1 + (y2 - y1) * 0.7)
+            center_x = max(1, min(center_x, orig.shape[1] - 2))
+            center_y = max(1, min(center_y, orig.shape[0] - 2))
+
+            src = np.zeros_like(orig)
+            src[y1:y2, x1:x2] = face
+
+            full_mask = np.zeros(orig.shape[:2], dtype=np.uint8)
+            full_mask[y1:y2, x1:x2] = mask
+
+            if full_mask.max() > 0:
+                cloned = cv2.seamlessClone(
+                    src, orig, full_mask, (center_x, center_y), cv2.NORMAL_CLONE,
+                )
+
+                # Restrict the blend to the mask region so Poisson diffusion does not cause ghosting above the eyes
+                alpha = mask.astype(np.float32) / 255.0
+                alpha_3ch = np.stack([alpha] * 3, axis=-1)
+                roi_orig = orig[y1:y2, x1:x2].astype(np.float32)
+                roi_clone = cloned[y1:y2, x1:x2].astype(np.float32)
+                blended_roi = roi_orig * (1 - alpha_3ch) + roi_clone * alpha_3ch
+
+                result = orig.copy()
+                result[y1:y2, x1:x2] = blended_roi.astype(np.uint8)
+                return result
+        except Exception:
+            pass
+
+        # Fallback: alpha blending
+        alpha = mask.astype(np.float32) / 255.0
+        alpha_3ch = np.stack([alpha] * 3, axis=-1)
+        crop_region = result[y1:y2, x1:x2].astype(np.float32)
+        blended = crop_region * (1 - alpha_3ch) + face.astype(np.float32) * alpha_3ch
+        result[y1:y2, x1:x2] = blended.astype(np.uint8)
+        return result
+
+    # ================================================================
+    #  Main entry point
+    # ================================================================
+
+    def enhance_if_needed(
+        self,
+        video_path: str,
+        tmpdir,
+        gpu_id: int,
+    ) -> EnhanceResult:
+        """
+        Main entry point: detect a small face → crop + super-resolve → return the enhancement result.
+        If no enhancement is needed, returns was_enhanced=False.
+        """
+        if not settings.LIPSYNC_SMALL_FACE_ENHANCE:
+            return EnhanceResult(video_path=video_path, was_enhanced=False)
+
+        if not _CV2_AVAILABLE:
+            logger.warning("⚠️ opencv/numpy 未安装，小脸增强不可用")
+            return EnhanceResult(video_path=video_path, was_enhanced=False)
+
+        start = time.time()
+        tmpdir = Path(tmpdir)
+        face_dir = tmpdir / "face_enhance"
+        face_dir.mkdir(parents=True, exist_ok=True)
+
+        # ── Detection ──
+        face_width = self._detect_face_size(video_path)
+        if face_width is None:
+            logger.info("小脸增强: 未检测到人脸，跳过")
+            return EnhanceResult(video_path=video_path, was_enhanced=False)
+
+        threshold = settings.LIPSYNC_SMALL_FACE_THRESHOLD
+        if face_width >= threshold:
+            logger.info(
+                f"小脸增强: face_w={face_width:.0f}px >= threshold={threshold}px, 跳过"
+            )
+            return EnhanceResult(
+                video_path=video_path, was_enhanced=False, face_width=face_width,
+            )
+
+        logger.info(
+            f"小脸增强: face_w={face_width:.0f}px < threshold={threshold}px, 触发增强"
+        )
+
+        # ── Build the face track ──
+        info = self._get_video_info(video_path)
+        if info is None:
+            raise RuntimeError("无法读取视频信息")
+        w, h, nb_frames, fps = info
+
+        track = self._build_face_track(video_path, w, h, nb_frames)
+        if track is None:
+            raise RuntimeError("人脸追踪失败")
+
+        # ── Crop + super-resolution ──
+        enhanced_path = self._crop_and_upscale_video(
+            video_path,
+            track,
+            face_dir,
+            gpu_id,
+            source_fps=fps,
+        )
+
+        # Release the cached GPU memory
+        try:
+            import torch
+            if torch.cuda.is_available():
+                torch.cuda.empty_cache()
+        except ImportError:
+            pass
+
+        elapsed = time.time() - start
+        logger.info(
+            f"小脸增强: face_w={face_width:.0f}px threshold={threshold}px "
+            f"enhanced=True upscaler={settings.LIPSYNC_SMALL_FACE_UPSCALER} "
+            f"time={elapsed:.1f}s"
+        )
+
+        return EnhanceResult(
+            video_path=enhanced_path,
+            was_enhanced=True,
+            track=track,
+            face_width=face_width,
+        )
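
The forward-fill and EMA smoothing step in `_build_face_track` can be illustrated in isolation. The following is a minimal sketch in plain Python rather than NumPy, assuming boxes are `[x1, y1, x2, y2]` lists; the function name `fill_and_smooth` and its parameters are hypothetical and do not appear in the patch.

```python
from typing import Dict, List, Sequence

Box = List[float]  # [x1, y1, x2, y2]


def fill_and_smooth(
    keyframes: Dict[int, Sequence[float]],  # frame index -> detected box (hypothetical input shape)
    n_frames: int,
    alpha: float,
) -> List[Box]:
    """Forward-fill sparse keyframe boxes, then apply an exponential moving average."""
    if not keyframes or n_frames <= 0:
        return []

    # Frames before the first keyframe reuse the first keyframe's box.
    prev = list(keyframes[min(keyframes)])
    raw: List[Box] = []
    for i in range(n_frames):
        if i in keyframes:
            prev = list(keyframes[i])
        raw.append(list(prev))

    # EMA: each smoothed box mixes the current raw box with the previous smoothed box.
    smoothed: List[Box] = [raw[0]]
    for i in range(1, n_frames):
        last = smoothed[-1]
        smoothed.append([alpha * r + (1 - alpha) * s for r, s in zip(raw[i], last)])
    return smoothed
```

A smaller `alpha` gives a steadier crop window at the cost of lagging behind fast head movement, which is presumably the trade-off that `EMA_ALPHA` tunes in the patch.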
--- a/backend/app/services/storage.py
+++ b/backend/app/services/storage.py
@@ -182,18 +182,18 @@ class StorageService:
            logger.error(f"Get public URL failed: {e}")
            return ""

-    async def delete_file(self, bucket: str, path: str):
-        """异步删除文件"""
-        try:
-            loop = asyncio.get_running_loop()
-            await loop.run_in_executor(
-                None,
-                lambda: self.supabase.storage.from_(bucket).remove([path])
-            )
-            logger.info(f"Deleted file: {bucket}/{path}")
-        except Exception as e:
-            logger.error(f"Delete file failed: {e}")
-            pass
+    async def delete_file(self, bucket: str, path: str):
+        """Delete a file asynchronously; re-raises on failure."""
+        try:
+            loop = asyncio.get_running_loop()
+            await loop.run_in_executor(
+                None,
+                lambda: self.supabase.storage.from_(bucket).remove([path])
+            )
+            logger.info(f"Deleted file: {bucket}/{path}")
+        except Exception as e:
+            logger.error(f"Delete file failed: {e}")
+            raise

    async def move_file(self, bucket: str, from_path: str, to_path: str):
        """异步移动/重命名文件"""
@@ -208,17 +208,19 @@ class StorageService:
            logger.error(f"Move file failed: {e}")
            raise e

-    async def list_files(self, bucket: str, path: str) -> List[Any]:
-        """异步列出文件"""
-        try:
-            loop = asyncio.get_running_loop()
-            res = await loop.run_in_executor(
-                None,
-                lambda: self.supabase.storage.from_(bucket).list(path)
-            )
-            return res or []
-        except Exception as e:
-            logger.error(f"List files failed: {e}")
-            return []
+    async def list_files(self, bucket: str, path: str, strict: bool = False) -> List[Any]:
+        """List files asynchronously; with strict=True, errors are re-raised instead of returning []."""
+        try:
+            loop = asyncio.get_running_loop()
+            res = await loop.run_in_executor(
+                None,
+                lambda: self.supabase.storage.from_(bucket).list(path)
+            )
+            return res or []
+        except Exception as e:
+            logger.error(f"List files failed: {e}")
+            if strict:
+                raise
+            return []

 storage_service = StorageService()
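
Because `delete_file` now propagates errors, a caller can implement "succeed only if every delete succeeded". The sketch below shows one way to do that, assuming a hypothetical helper `delete_all_strict` that takes the delete coroutine as a parameter; the actual cleanup route is not shown in this part of the diff.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


async def delete_all_strict(
    delete_file: Callable[[str, str], Awaitable[None]],  # e.g. storage_service.delete_file
    bucket: str,
    paths: List[str],
) -> None:
    """Attempt every delete, then raise if any of them failed."""
    # return_exceptions=True lets the remaining deletes finish even if one fails.
    results = await asyncio.gather(
        *(delete_file(bucket, p) for p in paths), return_exceptions=True
    )
    failed: List[Tuple[str, BaseException]] = [
        (p, r) for p, r in zip(paths, results) if isinstance(r, BaseException)
    ]
    if failed:
        names = ", ".join(p for p, _ in failed)
        raise RuntimeError(f"failed to delete {len(failed)} file(s): {names}")
```

Collecting failures instead of stopping at the first one means a retry only has to deal with the files that are still present.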
--- a/backend/app/services/uploader/weixin_uploader.py
+++ b/backend/app/services/uploader/weixin_uploader.py
@@ -847,13 +847,22 @@ class WeixinUploader(BaseUploader):
                            logger.info(text)
                            self._append_debug_log(text)
                            return True
-                        text = "[weixin][file_input] empty"
-                        logger.warning(text)
-                        self._append_debug_log(text)
-                        await asyncio.sleep(0.5)
-                        if await self._is_upload_in_progress(page):
+                        upload_started = False
+                        for _ in range(3):
+                            await asyncio.sleep(0.4)
+                            if await self._is_upload_in_progress(page):
+                                upload_started = True
+                                break
+                        if upload_started:
                            logger.info("[weixin] upload started after file input set")
                            return True
+
+                        text = "[weixin][file_input] empty after set_input_files and no upload signal"
+                        if attempt + 1 >= self.MAX_CLICK_RETRIES:
+                            logger.warning(text)
+                        else:
+                            logger.info(text)
+                        self._append_debug_log(text)
                    except Exception as e:
                        logger.warning(f"[weixin] failed to read file input info: {e}")
            except Exception as e:
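
The retry loop added to `WeixinUploader` above is a bounded poll: sleep, check, and give up after a fixed number of attempts. A generic version might look like the sketch below; `wait_for_signal` and its parameters are hypothetical names, not part of the uploader.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_for_signal(
    check: Callable[[], Awaitable[bool]],  # e.g. lambda: self._is_upload_in_progress(page)
    attempts: int = 3,
    delay: float = 0.4,
) -> bool:
    """Poll `check` up to `attempts` times, sleeping `delay` seconds before each try."""
    for _ in range(attempts):
        await asyncio.sleep(delay)
        if await check():
            return True
    return False
```

With the defaults this waits at most about 1.2 seconds in total, matching the three 0.4-second sleeps in the diff.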
--- a/backend/app/services/uploader/xiaohongshu_uploader.py
+++ b/backend/app/services/uploader/xiaohongshu_uploader.py
@@ -1,201 +1,775 @@
-"""
-Xiaohongshu (小红书) uploader using Playwright
-Based on social-auto-upload implementation
-"""
-from datetime import datetime
-from pathlib import Path
-from typing import Optional, List, Dict, Any
-import asyncio
-
-from playwright.async_api import Playwright, async_playwright
-from loguru import logger
-
-from .base_uploader import BaseUploader
-from .cookie_utils import set_init_script
-
-
-class XiaohongshuUploader(BaseUploader):
-    """Xiaohongshu video uploader using Playwright"""
-    
-    # 超时配置 (秒)
-    UPLOAD_TIMEOUT = 300  # 视频上传超时
-    PUBLISH_TIMEOUT = 120  # 发布检测超时
-    POLL_INTERVAL = 1  # 轮询间隔
-    
-    def __init__(
-        self,
-        title: str,
-        file_path: str,
-        tags: List[str],
-        publish_date: Optional[datetime] = None,
-        account_file: Optional[str] = None,
-        description: str = ""
-    ):
-        super().__init__(title, file_path, tags, publish_date, account_file, description)
-        self.upload_url = "https://creator.xiaohongshu.com/publish/publish?from=homepage&target=video"
-    
-    async def set_schedule_time(self, page, publish_date):
-        """Set scheduled publish time"""
-        try:
-            logger.info("[小红书] 正在设置定时发布时间...")
-            
-            # Click "定时发布" label
-            label_element = page.locator("label:has-text('定时发布')")
-            await label_element.click()
-            await asyncio.sleep(1)
-            
-            # Format time
-            publish_date_hour = publish_date.strftime("%Y-%m-%d %H:%M")
-            
-            # Fill datetime input
-            await page.locator('.el-input__inner[placeholder="选择日期和时间"]').click()
-            await page.keyboard.press("Control+KeyA")
-            await page.keyboard.type(str(publish_date_hour))
-            await page.keyboard.press("Enter")
-            
-            await asyncio.sleep(1)
-            logger.info(f"[小红书] 已设置定时发布: {publish_date_hour}")
-            
-        except Exception as e:
-            logger.error(f"[小红书] 设置定时发布失败: {e}")
-    
-    async def upload(self, playwright: Playwright) -> dict:
-        """Main upload logic with guaranteed resource cleanup"""
-        browser = None
-        context = None
-        try:
-            # Launch browser (headless for server deployment)
-            browser = await playwright.chromium.launch(headless=True)
-            context = await browser.new_context(
-                viewport={"width": 1600, "height": 900},
-                storage_state=self.account_file
-            )
-            context = await set_init_script(context)
-            
-            page = await context.new_page()
-            
-            # Go to upload page
-            await page.goto(self.upload_url)
-            logger.info(f"[小红书] 正在上传: {self.file_path.name}")
-            
-            # Upload video file
-            await page.locator("div[class^='upload-content'] input[class='upload-input']").set_input_files(str(self.file_path))
-            
-            # Wait for upload to complete (with timeout)
-            import time
-            upload_start = time.time()
-            while time.time() - upload_start < self.UPLOAD_TIMEOUT:
-                try:
-                    upload_input = await page.wait_for_selector('input.upload-input', timeout=3000)
-                    preview_new = await upload_input.query_selector(
-                        'xpath=following-sibling::div[contains(@class, "preview-new")]'
-                    )
-                    
-                    if preview_new:
-                        stage_elements = await preview_new.query_selector_all('div.stage')
-                        upload_success = False
-                        
-                        for stage in stage_elements:
-                            text_content = await page.evaluate('(element) => element.textContent', stage)
-                            if '上传成功' in text_content:
-                                upload_success = True
-                                break
-                        
-                        if upload_success:
-                            logger.info("[小红书] 检测到上传成功标识")
-                            break
-                        else:
-                            logger.info("[小红书] 未找到上传成功标识，继续等待...")
-                    else:
-                        logger.info("[小红书] 未找到预览元素，继续等待...")
-                    
-                    await asyncio.sleep(self.POLL_INTERVAL)
-                    
-                except Exception as e:
-                    logger.info(f"[小红书] 检测过程: {str(e)}，重新尝试...")
-                    await asyncio.sleep(0.5)
-            else:
-                logger.error("[小红书] 视频上传超时")
-                return {
-                    "success": False,
-                    "message": "视频上传超时",
-                    "url": None
-                }
-            
-            # Fill title and tags
-            await asyncio.sleep(1)
-            logger.info("[小红书] 正在填充标题和话题...")
-            
-            title_container = page.locator('div.plugin.title-container').locator('input.d-text')
-            if await title_container.count():
-                await title_container.fill(self.title[:30])
-            
-            # Add tags
-            css_selector = ".tiptap"
-            for tag in self.tags:
-                await page.type(css_selector, "#" + tag)
-                await page.press(css_selector, "Space")
-            
-            logger.info(f"[小红书] 总共添加 {len(self.tags)} 个话题")
-            
-            # Set scheduled publish time if needed
-            if self.publish_date != 0:
-                await self.set_schedule_time(page, self.publish_date)
-            
-            # Click publish button (with timeout)
-            publish_start = time.time()
-            while time.time() - publish_start < self.PUBLISH_TIMEOUT:
-                try:
-                    if self.publish_date != 0:
-                        await page.locator('button:has-text("定时发布")').click()
-                    else:
-                        await page.locator('button:has-text("发布")').click()
-                    
-                    await page.wait_for_url(
-                        "https://creator.xiaohongshu.com/publish/success?**",
-                        timeout=3000
-                    )
-                    logger.success("[小红书] 视频发布成功")
-                    break
-                except Exception:
-                    logger.info("[小红书] 视频正在发布中...")
-                    await asyncio.sleep(0.5)
-            else:
-                logger.warning("[小红书] 发布检测超时，请手动确认")
-            
-            # Save updated cookies
-            await context.storage_state(path=self.account_file)
-            logger.success("[小红书] Cookie 更新完毕")
-            
-            await asyncio.sleep(2)
-            
-            return {
-                "success": True,
-                "message": "发布成功，待审核" if self.publish_date == 0 else "已设置定时发布",
-                "url": None
-            }
-            
-        except Exception as e:
-            logger.exception(f"[小红书] 上传失败: {e}")
-            return {
-                "success": False,
-                "message": f"上传失败: {str(e)}",
-                "url": None
-            }
-        finally:
-            # 确保资源释放
-            if context:
-                try:
-                    await context.close()
-                except Exception:
-                    pass
-            if browser:
-                try:
-                    await browser.close()
-                except Exception:
-                    pass
-    
-    async def main(self) -> Dict[str, Any]:
-        """Execute upload"""
-        async with async_playwright() as playwright:
-            return await self.upload(playwright)
+"""
+Xiaohongshu (小红书) uploader using Playwright.
+"""
+from datetime import datetime
+from pathlib import Path
+from typing import Optional, List, Dict, Any
+import asyncio
+import os
+import re
+import shutil
+import time
+
+from playwright.async_api import Playwright, async_playwright
+from loguru import logger
+
+from .base_uploader import BaseUploader
+from .cookie_utils import set_init_script
+from app.core.config import settings
+
+
+class XiaohongshuUploader(BaseUploader):
+    """Xiaohongshu video uploader using Playwright"""
+
+    UPLOAD_TIMEOUT = 420
+    UPLOAD_IDLE_TIMEOUT = 90
+    UPLOAD_SIGNAL_TIMEOUT = 12
+    PUBLISH_TIMEOUT = 120
+    PAGE_READY_TIMEOUT = 60
+    POLL_INTERVAL = 2
+    MAX_CLICK_RETRIES = 3
+
+    def __init__(
+        self,
+        title: str,
+        file_path: str,
+        tags: List[str],
+        publish_date: Optional[datetime] = None,
+        account_file: Optional[str] = None,
+        description: str = "",
+        user_id: Optional[str] = None,
+    ):
+        super().__init__(title, file_path, tags, publish_date, account_file, description)
+        self.user_id = user_id
+        self.upload_url = "https://creator.xiaohongshu.com/publish/publish?from=homepage&target=video"
+        self._publish_api_submitted = False
+        self._publish_api_error: Optional[str] = None
+        self._temp_upload_paths: List[Path] = []
+
+    def _track_temp_upload_path(self, path: Path) -> None:
+        self._temp_upload_paths.append(path)
+
+    def _prepare_upload_file(self) -> Path:
+        src = self.file_path
+        if src.suffix:
+            return src
+
+        parent_suffix = Path(src.parent.name).suffix
+        if not parent_suffix:
+            return src
+
+        temp_dir = Path("/tmp/vigent_uploads")
+        temp_dir.mkdir(parents=True, exist_ok=True)
+        target = temp_dir / src.parent.name
+
+        try:
+            if target.exists():
+                target.unlink()
+        except Exception:
+            pass
+
+        try:
+            os.link(src, target)
+            logger.info(f"[小红书] using hardlink upload file: {target}")
+        except Exception:
+            try:
+                shutil.copy2(src, target)
+                logger.info(f"[小红书] using copied upload file: {target}")
+            except Exception as e:
+                logger.warning(f"[小红书] 构建带后缀上传文件失败，回退原文件: {e}")
+                return src
+
+        self._track_temp_upload_path(target)
+        return target
+
+    def _cleanup_upload_file(self) -> None:
+        if not self._temp_upload_paths:
+            return
+
+        paths = list(self._temp_upload_paths)
+        self._temp_upload_paths = []
+        for path in paths:
+            try:
+                if path.exists():
+                    path.unlink()
+            except Exception as e:
+                logger.warning(f"[小红书] 清理临时上传文件失败: {e}")
+
+    def _resolve_headless_mode(self) -> str:
+        mode = (settings.XIAOHONGSHU_HEADLESS_MODE or "").strip().lower()
+        return mode or "headless-new"
+
+    def _build_launch_options(self) -> Dict[str, Any]:
+        mode = self._resolve_headless_mode()
+        args = [
+            "--no-sandbox",
+            "--disable-dev-shm-usage",
+            "--disable-blink-features=AutomationControlled",
+        ]
+
+        headless = mode not in ("headful", "false", "0", "no")
+        if headless and mode in ("new", "headless-new", "headless_new"):
+            args.append("--headless=new")
+
+        if settings.XIAOHONGSHU_FORCE_SWIFTSHADER or headless:
+            args.extend([
+                "--enable-unsafe-swiftshader",
+                "--use-gl=swiftshader",
+            ])
+
+        options: Dict[str, Any] = {"headless": headless, "args": args}
+        chrome_path = (settings.XIAOHONGSHU_CHROME_PATH or "").strip()
+        if chrome_path:
+            if Path(chrome_path).exists():
+                options["executable_path"] = chrome_path
+            else:
+                logger.warning(f"[小红书] XIAOHONGSHU_CHROME_PATH 不存在: {chrome_path}")
+        else:
+            channel = (settings.XIAOHONGSHU_BROWSER_CHANNEL or "").strip()
+            if channel:
+                options["channel"] = channel
+
+        return options
+
+    def _debug_artifacts_enabled(self) -> bool:
+        return bool(settings.DEBUG and settings.XIAOHONGSHU_DEBUG_ARTIFACTS)
+
+    async def _save_debug_screenshot(self, page, name: str) -> None:
+        if not self._debug_artifacts_enabled():
+            return
+        try:
+            debug_dir = Path(__file__).parent.parent.parent / "debug_screenshots"
+            debug_dir.mkdir(exist_ok=True)
+            safe_name = name.replace("/", "_").replace(" ", "_")
+            file_path = debug_dir / f"xiaohongshu_{safe_name}.png"
+            await page.screenshot(path=str(file_path), full_page=True)
+            logger.info(f"[小红书] saved debug screenshot: {file_path}")
+        except Exception as e:
+            logger.warning(f"[小红书] 保存调试截图失败: {e}")
+
+    def _publish_screenshot_dir(self) -> Path:
+        user_key = re.sub(r"[^A-Za-z0-9_-]", "_", self.user_id or "legacy")[:64] or "legacy"
+        target = settings.PUBLISH_SCREENSHOT_DIR / user_key
+        target.mkdir(parents=True, exist_ok=True)
+        return target
+
+    async def _save_publish_success_screenshot(self, page) -> Optional[str]:
+        try:
+            timestamp = time.strftime("%Y%m%d_%H%M%S", time.localtime())
+            filename = f"xiaohongshu_success_{timestamp}_{int(time.time() * 1000) % 1000:03d}.png"
+            file_path = self._publish_screenshot_dir() / filename
+            await page.screenshot(path=str(file_path), full_page=False)
+            return f"/api/publish/screenshot/{filename}"
+        except Exception as e:
+            logger.warning(f"[小红书] 保存发布成功截图失败: {e}")
+            return None
+
+    def _attach_publish_listener(self, page) -> None:
+        """Record the outcome of POST/PUT responses from publish-related API URLs."""
+        ignore_tokens = ("report", "collect", "analytics", "monitor", "perf")
+
+        def on_response(response):
+            try:
+                request = response.request
+                if request.method not in ("POST", "PUT"):
+                    return
+
+                url = (response.url or "").lower()
+                if "xiaohongshu.com" not in url or "api" not in url:
+                    return
+                if not any(token in url for token in ("publish", "note/create", "note/publish", "note/save")):
+                    return
+                if any(token in url for token in ignore_tokens):
+                    return
+
+                if response.status < 400:
+                    self._publish_api_submitted = True
+                    logger.info("[小红书][publish] publish API ok")
+                else:
+                    self._publish_api_error = f"发布请求失败（HTTP {response.status}）"
+                    logger.warning(f"[小红书][publish] publish API failed status={response.status}")
+            except Exception:
+                pass
+
+        page.on("response", on_response)
+
+    async def _is_text_visible(self, page, text: str, exact: bool = False) -> bool:
+        try:
+            return await page.get_by_text(text, exact=exact).first.is_visible()
+        except Exception:
+            return False
+
+    async def _first_existing_locator(self, page, selectors: List[str], require_visible: bool = True):
+        for selector in selectors:
+            locator = page.locator(selector)
+            try:
+                if await locator.count() == 0:
+                    continue
+                candidate = locator.first
+                if require_visible and not await candidate.is_visible():
+                    continue
+                return candidate
+            except Exception:
+                continue
+        return None
+
+    async def _is_login_page(self, page) -> bool:
+        url = page.url.lower()
+        if "login" in url or "signin" in url:
+            return True
+        if await self._is_text_visible(page, "扫码登录", exact=False):
+            return True
+        if await self._is_text_visible(page, "立即登录", exact=False):
+            return True
+        return False
+
+    async def _go_to_publish_page(self, page):
+        await page.goto(self.upload_url, wait_until="domcontentloaded", timeout=self.PAGE_READY_TIMEOUT * 1000)
+        await asyncio.sleep(2)
+        return page
+
+    async def _find_file_input(self, page):
+        selectors = [
+            "input[type='file'][accept*='video']",
+            "div[class*='upload'] input[type='file']",
+            "input.upload-input",
+            "input[type='file']",
+        ]
+        return await self._first_existing_locator(page, selectors, require_visible=False)
+
+    async def _open_upload_entry(self, page) -> None:
+        selectors = [
+            "button:has-text('上传视频')",
+            "button:has-text('上传')",
+            "div[role='button']:has-text('上传视频')",
+            "div[role='button']:has-text('上传')",
+            "span:has-text('上传视频')",
+        ]
+        target = await self._first_existing_locator(page, selectors)
+        if not target:
+            return
+        try:
+            await target.scroll_into_view_if_needed()
+        except Exception:
+            pass
+        try:
+            await target.click(timeout=2000)
+        except Exception:
+            try:
+                await target.evaluate("el => el.click()")
+            except Exception:
+                pass
+
+    async def _is_upload_in_progress(self, page) -> bool:
+        in_progress_texts = [
+            "上传中",
+            "正在上传",
+            "处理中",
+            "视频处理中",
+            "转码中",
+            "请稍候",
+            "上传进度",
+            "校验中",
+            "准备中",
+        ]
+        for text in in_progress_texts:
+            if await self._is_text_visible(page, text, exact=False):
+                return True
+        return False
+
+    async def _is_upload_success(self, page) -> bool:
+        success_texts = [
+            "上传成功",
+            "上传完成",
+            "处理完成",
+            "转码完成",
+            "可发布",
+        ]
+        for text in success_texts:
+            if await self._is_text_visible(page, text, exact=False):
+                return True
+        return await self._is_publish_button_enabled(page)
+
+    async def _upload_failed_reason(self, page) -> Optional[str]:
+        failure_texts = [
+            "上传失败",
+            "上传异常",
+            "上传出错",
+            "上传超时",
+            "网络异常",
+        ]
+        for text in failure_texts:
+            if await self._is_text_visible(page, text, exact=False):
+                return f"上传失败：{text}"
+        return None
+
+    async def _upload_video(self, page) -> bool:
+        page = await self._go_to_publish_page(page)
+        await self._save_debug_screenshot(page, "publish_page")
+
+        upload_path = self._prepare_upload_file()
+        try:
+            upload_size = upload_path.stat().st_size
+            logger.info(
+                f"[小红书][upload_file] path={upload_path} "
+                f"size={upload_size} suffix={upload_path.suffix}"
+            )
+        except Exception as e:
+            logger.warning(f"[小红书] 读取上传文件信息失败: {e}")
+
+        for attempt in range(self.MAX_CLICK_RETRIES):
+            file_input = await self._find_file_input(page)
+            if not file_input:
+                await self._open_upload_entry(page)
+                await asyncio.sleep(1)
+                file_input = await self._find_file_input(page)
+
+            if not file_input:
+                logger.info(f"[小红书] 未找到上传文件 input，准备重试 ({attempt + 1}/{self.MAX_CLICK_RETRIES})")
+                await asyncio.sleep(1)
+                continue
+
+            try:
+                await file_input.set_input_files(str(upload_path))
+                logger.info(f"[小红书] 已设置上传文件: {upload_path.name}")
+
+                try:
+                    file_info = await file_input.evaluate(
+                        """
+                        (input) => {
+                          const file = input && input.files ? input.files[0] : null;
+                          if (!file) return null;
+                          return { name: file.name, size: file.size, type: file.type };
+                        }
+                        """
+                    )
+                    if file_info:
+                        selected_name = str(file_info.get("name") or "")
+                        logger.info(
+                            "[小红书][file_input] "
+                            f"name={selected_name} "
+                            f"size={file_info.get('size')} "
+                            f"type={file_info.get('type')}"
+                        )
+                        if upload_path.suffix and selected_name and not selected_name.lower().endswith(upload_path.suffix.lower()):
+                            logger.warning(
+                                "[小红书] file input 文件名后缀与上传文件不一致，"
+                                f"expect=*{upload_path.suffix} actual={selected_name}"
+                            )
+                            if attempt + 1 < self.MAX_CLICK_RETRIES:
+                                await asyncio.sleep(1)
+                                continue
+                            await self._save_debug_screenshot(page, "upload_input_name_mismatch")
+                            return False
+
+                        if not str(file_info.get("type") or "").strip():
+                            logger.warning("[小红书] file input MIME 为空，可能影响站点识别")
+                except Exception:
+                    pass
+
+                signal_detected = False
+                bootstrap_error: Optional[str] = None
+                deadline = time.time() + self.UPLOAD_SIGNAL_TIMEOUT
+                while time.time() < deadline:
+                    bootstrap_error = await self._upload_failed_reason(page)
+                    if bootstrap_error:
+                        break
+                    if await self._is_upload_in_progress(page) or await self._is_upload_success(page):
+                        signal_detected = True
+                        break
+                    await asyncio.sleep(0.6)
+
+                if bootstrap_error:
+                    logger.warning(f"[小红书] 上传启动阶段失败: {bootstrap_error}")
+                    if attempt + 1 < self.MAX_CLICK_RETRIES:
+                        await asyncio.sleep(1)
+                        continue
+                    return False
+
+                if signal_detected:
+                    return True
+
+                logger.info("[小红书] 未立即检测到上传状态，进入后续上传监控")
+                return True
+            except Exception as e:
+                logger.warning(f"[小红书] set_input_files 失败: {e}")
+
+            await asyncio.sleep(1)
+
+        await self._save_debug_screenshot(page, "upload_input_missing")
+        return False
+
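+    async def _wait_for_upload_complete(self, page) -> tuple[bool, str]:
+        """Poll until the upload succeeds, fails, stalls, or times out.
+
+        Returns (ok, message). A stall means no in-progress indicator was seen
+        for longer than UPLOAD_IDLE_TIMEOUT.
+        """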
+        start = time.time()
+        idle_start = start
+        while time.time() - start < self.UPLOAD_TIMEOUT:
+            reason = await self._upload_failed_reason(page)
+            if reason:
+                logger.warning(f"[小红书] 上传失败检测: {reason}")
+                return False, reason
+
+            if await self._is_upload_success(page):
+                return True, "上传完成"
+
+            if await self._is_upload_in_progress(page):
+                idle_start = time.time()
+                logger.info("[小红书] 视频上传进行中...")
+            else:
+                if time.time() - idle_start > self.UPLOAD_IDLE_TIMEOUT:
+                    await self._save_debug_screenshot(page, "upload_idle_timeout")
+                    return False, "未检测到有效上传进度（疑似上传控件未生效）"
+                logger.info("[小红书] 等待上传状态...")
+
+            await asyncio.sleep(self.POLL_INTERVAL)
+
+        return False, "视频上传超时"
+
+    def _normalize_tags(self, tags: List[str]) -> List[str]:
+        normalized: List[str] = []
+        seen = set()
+        for raw in tags:
+            item = (raw or "").strip().lstrip("#")
+            if not item:
+                continue
+            lowered = item.lower()
+            if lowered in seen:
+                continue
+            seen.add(lowered)
+            normalized.append(item)
+        return normalized
+
+    async def _fill_title(self, page) -> bool:
+        selectors = [
+            "input[placeholder*='标题']",
+            "div.plugin.title-container input",
+            "input.d-text",
+        ]
+        target = await self._first_existing_locator(page, selectors)
+        if not target:
+            return False
+
+        try:
+            await target.click(timeout=1500)
+            await target.fill((self.title or "")[:30])
+            return True
+        except Exception:
+            return False
+
+    async def _fill_description(self, page, text: str) -> bool:
+        selectors = [
+            ".tiptap[contenteditable='true']",
+            "[contenteditable='true'][data-placeholder*='描述']",
+            "[contenteditable='true'][role='textbox']",
+            "textarea[placeholder*='描述']",
+            "textarea[placeholder*='正文']",
+        ]
+        target = await self._first_existing_locator(page, selectors)
+        if not target:
+            return False
+
+        try:
+            await target.click(timeout=1500)
+            await page.keyboard.press("Control+KeyA")
+            await page.keyboard.type(text)
+            return True
+        except Exception:
+            return False
+
+    async def set_schedule_time(self, page, publish_date: datetime) -> bool:
+        try:
+            toggle = await self._first_existing_locator(
+                page,
+                [
+                    "label:has-text('定时发布')",
+                    "span:has-text('定时发布')",
+                    "div:has-text('定时发布')",
+                ],
+            )
+            if not toggle:
+                return False
+
+            try:
+                await toggle.click(timeout=2000)
+            except Exception:
+                await toggle.evaluate("el => el.click()")
+
+            await asyncio.sleep(0.5)
+            date_input = await self._first_existing_locator(
+                page,
+                [
+                    "input[placeholder*='日期和时间']",
+                    "input[placeholder*='发布时间']",
+                    "input[placeholder*='选择日期']",
+                ],
+            )
+            if not date_input:
+                return False
+
+            value = publish_date.strftime("%Y-%m-%d %H:%M")
+            await date_input.click(timeout=2000)
+            await page.keyboard.press("Control+KeyA")
+            await page.keyboard.type(value)
+            await page.keyboard.press("Enter")
+            logger.info(f"[小红书] 已设置定时发布: {value}")
+            return True
+        except Exception as e:
+            logger.warning(f"[小红书] 设置定时发布时间失败: {e}")
+            return False
+
+    async def _find_publish_button(self, page, scheduled: bool):
+        selectors = [
+            "button:has-text('定时发布')",
+            "div[role='button']:has-text('定时发布')",
+        ] if scheduled else [
+            "button:has-text('发布')",
+            "button:has-text('立即发布')",
+            "div[role='button']:has-text('发布')",
+        ]
+
+        for selector in selectors:
+            locator = page.locator(selector)
+            try:
+                if await locator.count() == 0:
+                    continue
+                candidate = locator.first
+                if not await candidate.is_visible():
+                    continue
+                return candidate
+            except Exception:
+                continue
+        return None
+
+    async def _is_publish_button_enabled(self, page) -> bool:
+        buttons = [
+            await self._find_publish_button(page, scheduled=False),
+            await self._find_publish_button(page, scheduled=True),
+        ]
+        for button in buttons:
+            if not button:
+                continue
+            try:
+                if await button.is_enabled():
+                    return True
+            except Exception:
+                continue
+        return False
+
+    async def _click_publish(self, page, scheduled: bool) -> tuple[bool, str]:
+        for _ in range(self.MAX_CLICK_RETRIES):
+            button = await self._find_publish_button(page, scheduled)
+            if not button:
+                await asyncio.sleep(0.8)
+                continue
+
+            try:
+                if not await button.is_enabled():
+                    await asyncio.sleep(0.8)
+                    continue
+            except Exception:
+                pass
+
+            try:
+                await button.click(timeout=2000)
+                return True, "发布按钮点击成功"
+            except Exception:
+                try:
+                    await button.evaluate("el => el.click()")
+                    return True, "发布按钮 JS 点击成功"
+                except Exception:
+                    await asyncio.sleep(0.8)
+
+        return False, "未找到可点击的发布按钮"
+
+    async def _wait_for_publish_result(self, page) -> tuple[bool, str, bool]:
+        """Poll for the publish outcome; returns (success, message, is_timeout)."""
+        create_url = page.url
+        success_url_tokens = [
+            "/publish/success",
+            "/publish/result",
+            "/publish/published",
+        ]
+        success_texts = [
+            "发布成功",
+            "发布完成",
+            "审核中",
+            "查看笔记",
+            "去查看",
+        ]
+        failure_texts = [
+            "发布失败",
+            "发布异常",
+            "发布出错",
+            "网络异常",
+            "请完善",
+            "请补充",
+        ]
+
+        start_time = time.time()
+        while time.time() - start_time < self.PUBLISH_TIMEOUT:
+            if self._publish_api_error:
+                return False, self._publish_api_error, False
+
+            current_url = page.url
+            lowered_url = current_url.lower()
+            if any(token in lowered_url for token in success_url_tokens):
+                return True, f"发布成功：跳转到 {current_url}", False
+
+            if current_url != create_url and "/publish/publish" not in lowered_url:
+                return True, f"发布成功：页面已跳转 {current_url}", False
+
+            if self._publish_api_submitted:
+                return True, "发布成功：API 已确认", False
+
+            for text in failure_texts:
+                if await self._is_text_visible(page, text, exact=False):
+                    return False, f"发布失败：{text}", False
+
+            for text in success_texts:
+                if await self._is_text_visible(page, text, exact=False):
+                    return True, f"发布成功：检测到文案 {text}", False
+
+            logger.info("[小红书] 等待发布结果...")
+            await asyncio.sleep(self.POLL_INTERVAL)
+
+        return False, "发布超时", True
+
+    async def upload(self, playwright: Playwright) -> Dict[str, Any]:
+        browser = None
+        context = None
+        page = None
+        try:
+            launch_options = self._build_launch_options()
+            browser = await playwright.chromium.launch(**launch_options)
+            context = await browser.new_context(
+                storage_state=self.account_file,
+                viewport={"width": 1600, "height": 900},
+                device_scale_factor=1,
+                user_agent=settings.XIAOHONGSHU_USER_AGENT,
+                locale=settings.XIAOHONGSHU_LOCALE,
+                timezone_id=settings.XIAOHONGSHU_TIMEZONE_ID,
+            )
+            context = await set_init_script(context)
+
+            page = await context.new_page()
+            self._attach_publish_listener(page)
+
+            await self._go_to_publish_page(page)
+            if await self._is_login_page(page):
+                return {
+                    "success": False,
+                    "message": "登录失效，请重新扫码登录小红书",
+                    "url": None,
+                }
+
+            logger.info(f"[小红书] 正在上传: {self.file_path.name}")
+            if not await self._upload_video(page):
+                return {
+                    "success": False,
+                    "message": "未能触发有效视频上传，请确认发布页状态及视频文件格式",
+                    "url": None,
+                }
+
+            upload_success, upload_reason = await self._wait_for_upload_complete(page)
+            if not upload_success:
+                await self._save_debug_screenshot(page, "upload_failed")
+                return {
+                    "success": False,
+                    "message": upload_reason,
+                    "url": None,
+                }
+
+            await asyncio.sleep(1)
+            title_filled = await self._fill_title(page)
+            if not title_filled:
+                logger.warning("[小红书] 未找到标题输入框，尝试在正文中补充标题")
+
+            normalized_tags = self._normalize_tags(self.tags)
+            body_parts: List[str] = []
+            if self.description:
+                body_parts.append(self.description.strip())
+            if not title_filled and self.title:
+                body_parts.insert(0, self.title.strip())
+            if normalized_tags:
+                body_parts.append(" ".join([f"#{tag}" for tag in normalized_tags]))
+            body_text = "\n".join([part for part in body_parts if part]).strip()
+
+            if body_text:
+                body_ok = await self._fill_description(page, body_text)
+                if not body_ok:
+                    logger.warning("[小红书] 未找到正文输入框，跳过正文/话题填充")
+
+            if self.publish_date != 0 and isinstance(self.publish_date, datetime):
+                if not await self.set_schedule_time(page, self.publish_date):
+                    return {
+                        "success": False,
+                        "message": "未找到定时发布控件，请检查小红书发布页结构",
+                        "url": None,
+                    }
+
+            clicked, click_reason = await self._click_publish(page, scheduled=self.publish_date != 0)
+            if not clicked:
+                await self._save_debug_screenshot(page, "publish_button_not_clickable")
+                return {
+                    "success": False,
+                    "message": click_reason,
+                    "url": None,
+                }
+
+            publish_success, publish_reason, is_timeout = await self._wait_for_publish_result(page)
+
+            await context.storage_state(path=self.account_file)
+            logger.success("[小红书] Cookie 更新完毕")
+
+            if publish_success:
+                await asyncio.sleep(2)
+                screenshot_url = await self._save_publish_success_screenshot(page)
+                return {
+                    "success": True,
+                    "message": "发布成功，待审核" if self.publish_date == 0 else "已设置定时发布",
+                    "url": None,
+                    "screenshot_url": screenshot_url,
+                }
+
+            if is_timeout:
+                return {
+                    "success": False,
+                    "message": f"发布状态未知（检测超时），请到小红书创作中心确认: {publish_reason}",
+                    "url": None,
+                }
+
+            return {
+                "success": False,
+                "message": publish_reason,
+                "url": None,
+            }
+
+        except Exception as e:
+            logger.exception(f"[小红书] 上传失败: {e}")
+            return {
+                "success": False,
+                "message": f"上传失败: {str(e)}",
+                "url": None,
+            }
+        finally:
+            self._cleanup_upload_file()
+
+            if page:
+                try:
+                    if not page.is_closed():
+                        await page.close()
+                except Exception:
+                    pass
+
+            if context:
+                try:
+                    await context.close()
+                except Exception:
+                    pass
+
+            if browser:
+                try:
+                    await browser.close()
+                except Exception:
+                    pass
+
+    async def main(self) -> Dict[str, Any]:
+        async with async_playwright() as playwright:
+            return await self.upload(playwright)
--- a/backend/requirements.txt
+++ b/backend/requirements.txt
@@ -38,3 +38,7 @@ faster-whisper>=1.0.0
 # 文案提取与AI生成
 yt-dlp>=2023.0.0
 zai-sdk>=0.2.0
+
+# Small-face enhancement
+opencv-python-headless>=4.8.0
+gfpgan>=1.3.8
--- a/frontend/src/app/favicon.ico
+++ b/frontend/src/app/favicon.ico
--- a/frontend/src/app/icon.png
+++ b/frontend/src/app/icon.png
--- a/frontend/src/app/layout.tsx
+++ b/frontend/src/app/layout.tsx
@@ -3,6 +3,7 @@ import { Geist, Geist_Mono } from "next/font/google";
 import "./globals.css";
 import { AuthProvider } from "@/shared/contexts/AuthContext";
 import { TaskProvider } from "@/shared/contexts/TaskContext";
+import { CleanupProvider } from "@/shared/contexts/CleanupContext";

 import { Toaster } from "sonner";

@@ -40,7 +41,9 @@ export default function RootLayout({
      >
        <AuthProvider>
          <TaskProvider>
-            {children}
+            <CleanupProvider>
+              {children}
+            </CleanupProvider>
          </TaskProvider>
        </AuthProvider>
        <Toaster
--- a/frontend/src/components/AccountSettingsDropdown.tsx
+++ b/frontend/src/components/AccountSettingsDropdown.tsx
@@ -4,6 +4,7 @@ import { useState, useEffect, useRef } from "react";
 import { useAuth } from "@/shared/contexts/AuthContext";
 import api from "@/shared/api/axios";
 import { ApiResponse } from "@/shared/api/types";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";

 // 账户设置下拉菜单组件
 export default function AccountSettingsDropdown() {
@@ -90,6 +91,15 @@ export default function AccountSettingsDropdown() {
        }
    };

+    const closePasswordModal = () => {
+        setShowPasswordModal(false);
+        setError('');
+        setSuccess('');
+        setOldPassword('');
+        setNewPassword('');
+        setConfirmPassword('');
+    };
+
    return (
        <div className="relative" ref={dropdownRef}>
            <button
@@ -137,81 +147,83 @@ export default function AccountSettingsDropdown() {

            {/* 修改密码弹窗 */}
            {showPasswordModal && (
-                <div className="fixed inset-0 z-[200] flex items-start justify-center pt-20 bg-black/60 backdrop-blur-sm p-4">
-                    <div className="w-full max-w-md p-6 bg-gray-900 border border-white/10 rounded-2xl shadow-2xl mx-4">
-                        <h3 className="text-xl font-bold text-white mb-4">修改密码</h3>
-                        <form onSubmit={handleChangePassword} className="space-y-4">
-                            <div>
-                                <label className="block text-sm text-gray-300 mb-1">当前密码</label>
-                                <input
-                                    type="password"
-                                    value={oldPassword}
-                                    onChange={(e) => setOldPassword(e.target.value)}
-                                    required
-                                    className="w-full px-3 py-2 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-500 focus:outline-none focus:ring-2 focus:ring-purple-500"
-                                    placeholder="输入当前密码"
-                                />
-                            </div>
-                            <div>
-                                <label className="block text-sm text-gray-300 mb-1">新密码</label>
-                                <input
-                                    type="password"
-                                    value={newPassword}
-                                    onChange={(e) => setNewPassword(e.target.value)}
-                                    required
-                                    className="w-full px-3 py-2 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-500 focus:outline-none focus:ring-2 focus:ring-purple-500"
-                                    placeholder="至少6位"
-                                />
-                            </div>
-                            <div>
-                                <label className="block text-sm text-gray-300 mb-1">确认新密码</label>
-                                <input
-                                    type="password"
-                                    value={confirmPassword}
-                                    onChange={(e) => setConfirmPassword(e.target.value)}
-                                    required
-                                    className="w-full px-3 py-2 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-500 focus:outline-none focus:ring-2 focus:ring-purple-500"
-                                    placeholder="再次输入新密码"
-                                />
-                            </div>
+                <AppModal
+                    isOpen={showPasswordModal}
+                    onClose={closePasswordModal}
+                    zIndexClassName="z-[200]"
+                    panelClassName="w-full max-w-md rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden"
+                    closeOnOverlay
+                >
+                    <AppModalHeader
+                        title="修改密码"
+                        subtitle="修改后将自动退出并重新登录"
+                        onClose={closePasswordModal}
+                    />

-                            {error && (
-                                <div className="p-2 bg-red-500/20 border border-red-500/50 rounded text-red-200 text-sm">
-                                    {error}
-                                </div>
-                            )}
-                            {success && (
-                                <div className="p-2 bg-green-500/20 border border-green-500/50 rounded text-green-200 text-sm">
-                                    {success}
-                                </div>
-                            )}
+                    <form onSubmit={handleChangePassword} className="space-y-4 p-5">
+                        <div>
+                            <label className="block text-sm text-gray-300 mb-1">当前密码</label>
+                            <input
+                                type="password"
+                                value={oldPassword}
+                                onChange={(e) => setOldPassword(e.target.value)}
+                                required
+                                className="w-full px-3 py-2 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-500 focus:outline-none focus:ring-2 focus:ring-purple-500"
+                                placeholder="输入当前密码"
+                            />
+                        </div>
+                        <div>
+                            <label className="block text-sm text-gray-300 mb-1">新密码</label>
+                            <input
+                                type="password"
+                                value={newPassword}
+                                onChange={(e) => setNewPassword(e.target.value)}
+                                required
+                                className="w-full px-3 py-2 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-500 focus:outline-none focus:ring-2 focus:ring-purple-500"
+                                placeholder="至少6位"
+                            />
+                        </div>
+                        <div>
+                            <label className="block text-sm text-gray-300 mb-1">确认新密码</label>
+                            <input
+                                type="password"
+                                value={confirmPassword}
+                                onChange={(e) => setConfirmPassword(e.target.value)}
+                                required
+                                className="w-full px-3 py-2 bg-white/5 border border-white/10 rounded-lg text-white placeholder-gray-500 focus:outline-none focus:ring-2 focus:ring-purple-500"
+                                placeholder="再次输入新密码"
+                            />
+                        </div>

-                            <div className="flex gap-3 pt-2">
-                                <button
-                                    type="button"
-                                    onClick={() => {
-                                        setShowPasswordModal(false);
-                                        setError('');
-                                        setSuccess('');
-                                        setOldPassword('');
-                                        setNewPassword('');
-                                        setConfirmPassword('');
-                                    }}
-                                    className="flex-1 py-2 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors"
-                                >
-                                    取消
-                                </button>
-                                <button
-                                    type="submit"
-                                    disabled={loading}
-                                    className="flex-1 py-2 bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-700 hover:to-pink-700 text-white rounded-lg transition-colors disabled:opacity-50"
-                                >
-                                    {loading ? '修改中...' : '确认修改'}
-                                </button>
+                        {error && (
+                            <div className="p-2 bg-red-500/20 border border-red-500/50 rounded text-red-200 text-sm">
+                                {error}
                            </div>
-                        </form>
-                    </div>
-                </div>
+                        )}
+                        {success && (
+                            <div className="p-2 bg-green-500/20 border border-green-500/50 rounded text-green-200 text-sm">
+                                {success}
+                            </div>
+                        )}
+
+                        <div className="flex gap-3 pt-2">
+                            <button
+                                type="button"
+                                onClick={closePasswordModal}
+                                className="flex-1 py-2 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors"
+                            >
+                                取消
+                            </button>
+                            <button
+                                type="submit"
+                                disabled={loading}
+                                className="flex-1 py-2 bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-700 hover:to-pink-700 text-white rounded-lg transition-colors disabled:opacity-50"
+                            >
+                                {loading ? '修改中...' : '确认修改'}
+                            </button>
+                        </div>
+                    </form>
+                </AppModal>
            )}
        </div>
    );
--- a/frontend/src/components/VideoPreviewModal.tsx
+++ b/frontend/src/components/VideoPreviewModal.tsx
@@ -1,7 +1,7 @@
 "use client";

-import { useEffect } from "react";
-import { X, Video } from "lucide-react";
+import { Video } from "lucide-react";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";

 interface VideoPreviewModalProps {
    videoUrl: string | null;
@@ -16,66 +16,34 @@ export default function VideoPreviewModal({
    title = "视频预览",
    subtitle = "ESC 关闭 · 点击空白关闭",
 }: VideoPreviewModalProps) {
-  useEffect(() => {
-    if (!videoUrl) return;
-    // 按 ESC 关闭
-    const handleEsc = (e: KeyboardEvent) => {
-      if (e.key === 'Escape') onClose();
-    };
-    const prevOverflow = document.body.style.overflow;
-    document.addEventListener('keydown', handleEsc);
-    // 禁止背景滚动
-    document.body.style.overflow = 'hidden';
+  if (!videoUrl) return null;

-    return () => {
-      document.removeEventListener('keydown', handleEsc);
-      document.body.style.overflow = prevOverflow;
-    };
-  }, [videoUrl, onClose]);
+  return (
+    <AppModal
+      isOpen={Boolean(videoUrl)}
+      onClose={onClose}
+      zIndexClassName="z-[320]"
+      panelClassName="relative w-full max-w-4xl rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden flex flex-col"
+      closeOnOverlay
+    >
+      <div data-video-preview-open="true" className="flex flex-col">
+        <AppModalHeader
+          title={title}
+          subtitle={subtitle}
+          icon={<Video className="h-5 w-5" />}
+          onClose={onClose}
+        />

-    if (!videoUrl) return null;
-
-    return (
-        <div
-            className="fixed inset-0 z-[200] flex items-center justify-center bg-black/80 backdrop-blur-sm p-4 animate-in fade-in duration-200"
-            onClick={onClose}
-        >
-            <div
-                className="relative w-full max-w-4xl bg-gray-900 border border-white/10 rounded-2xl shadow-2xl overflow-hidden flex flex-col"
-                onClick={(e) => e.stopPropagation()}
-            >
-                <div className="flex items-center justify-between px-6 py-3 border-b border-white/10 bg-gradient-to-r from-white/5 via-white/0 to-white/5">
-                    <div className="flex items-center gap-3">
-                        <div className="h-9 w-9 rounded-lg bg-white/10 flex items-center justify-center text-white">
-                            <Video className="h-5 w-5" />
-                        </div>
-                        <div>
-                            <h3 className="text-lg font-semibold text-white">
-                                {title}
-                            </h3>
-                            <p className="text-xs text-gray-400">
-                                {subtitle}
-                            </p>
-                        </div>
-                    </div>
-                    <button
-                        onClick={onClose}
-                        className="p-2 text-gray-400 hover:text-white hover:bg-white/10 rounded-lg transition-colors"
-                    >
-                        <X className="h-5 w-5" />
-                    </button>
-                </div>
-
-                <div className="bg-black flex items-center justify-center min-h-[50vh] max-h-[80vh]">
-                    <video
-                        src={videoUrl}
-                        controls
-                        autoPlay
-                        preload="metadata"
-                        className="w-full h-full max-h-[80vh] object-contain"
-                    />
-                </div>
-            </div>
+        <div className="bg-black flex items-center justify-center min-h-[50vh] max-h-[80vh]">
+          <video
+            src={videoUrl}
+            controls
+            autoPlay
+            preload="metadata"
+            className="w-full h-full max-h-[80vh] object-contain"
+          />
        </div>
-    );
+      </div>
+    </AppModal>
+  );
 }
--- a/frontend/src/features/home/model/useHomeController.ts
+++ b/frontend/src/features/home/model/useHomeController.ts
@@ -1,4 +1,4 @@
-import { useEffect, useMemo, useRef, useState } from "react";
+import { useCallback, useEffect, useMemo, useRef, useState } from "react";
 import api from "@/shared/api/axios";
 import {
  buildTextShadow,
@@ -256,6 +256,14 @@ export const useHomeController = () => {
      const payload = unwrap(res);
      if (selectedMaterials.includes(materialId) && payload?.id) {
        setSelectedMaterials((prev) => prev.map((x) => (x === materialId ? payload.id : x)));
+        // Sync inserts: update materialId and name when rename changes the ID
+        if (payload.id !== materialId) {
+          setInserts((prev) => prev.map((ins) =>
+            ins.materialId === materialId
+              ? { ...ins, materialId: payload.id, materialName: editMaterialName.trim() }
+              : ins
+          ));
+        }
      }
      setEditingMaterialId(null);
      setEditMaterialName("");
@@ -287,6 +295,9 @@ export const useHomeController = () => {
  // 文案提取模态框
  const [extractModalOpen, setExtractModalOpen] = useState(false);

+  // Deep-learning modal for copywriting
+  const [learningModalOpen, setLearningModalOpen] = useState(false);
+
  // AI 改写模态框
  const [rewriteModalOpen, setRewriteModalOpen] = useState(false);

@@ -307,6 +318,7 @@ export const useHomeController = () => {
    setUploadError,
    fetchMaterials,
    toggleMaterial,
+    reorderMaterials,
    deleteMaterial,
    handleUpload,
  } = useMaterials({
@@ -394,9 +406,17 @@ export const useHomeController = () => {
  });

  const {
-    segments: timelineSegments,
-    reorderSegments,
-    setSourceRange,
+    inserts,
+    setInserts,
+    primaryMaterial: timelinePrimaryMaterial,
+    primarySourceStart,
+    primarySourceEnd,
+    addInsert,
+    removeInsert,
+    moveInsert,
+    resizeInsert,
+    setInsertSourceRange,
+    setPrimarySourceRange,
    toCustomAssignments,
  } = useTimelineEditor({
    audioDuration: selectedAudio?.duration_sec ?? 0,
@@ -405,16 +425,15 @@ export const useHomeController = () => {
    storageKey,
  });

-  // 时间轴第一段素材的视频 URL（用于帧截取预览）
+  // Video URL of the primary material (used for frame-capture preview)
  // 使用后端代理 URL（同源）避免 CORS canvas taint
  const firstTimelineMaterialUrl = useMemo(() => {
-    const firstSeg = timelineSegments[0];
-    const matId = firstSeg?.materialId ?? selectedMaterials[0];
+    const matId = selectedMaterials[0];
    if (!matId) return null;
    const mat = materials.find((m) => m.id === matId);
    if (!mat) return null;
    return `/api/materials/stream/${mat.id}`;
-  }, [materials, timelineSegments, selectedMaterials]);
+  }, [materials, selectedMaterials]);

  const materialPosterUrl = useVideoFrameCapture(showStylePreview ? firstTimelineMaterialUrl : null);

@@ -735,6 +754,9 @@ export const useHomeController = () => {
  // 开始录音
  const startRecording = async () => {
    try {
+      setRecordedBlob(null);
+      setRecordingTime(0);
+
      const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
      const mediaRecorder = new MediaRecorder(stream, { mimeType: "audio/webm" });
      const chunks: BlobPart[] = [];
@@ -748,7 +770,6 @@ export const useHomeController = () => {

      mediaRecorder.start();
      setIsRecording(true);
-      setRecordingTime(0);
      mediaRecorderRef.current = mediaRecorder;

      // 计时器
@@ -784,6 +805,11 @@ export const useHomeController = () => {
    setRecordingTime(0);
  };

+  const discardRecording = () => {
+    setRecordedBlob(null);
+    setRecordingTime(0);
+  };
+
  // 格式化录音时长
  const formatRecordingTime = (seconds: number) => {
    const mins = Math.floor(seconds / 60);
@@ -945,57 +971,36 @@ export const useHomeController = () => {
        output_aspect_ratio: outputAspectRatio,
      };

-      // 多素材
+      // Multiple materials (multi-camera mode)
      if (selectedMaterials.length > 1) {
-        const timelineOrderedIds = timelineSegments
-          .map((seg) => seg.materialId)
-          .filter((id, index, arr) => arr.indexOf(id) === index);
-        const orderedMaterialIds = [
-          ...timelineOrderedIds.filter((id) => selectedMaterials.includes(id)),
-          ...selectedMaterials.filter((id) => !timelineOrderedIds.includes(id)),
-        ];
-
-        const materialPaths = orderedMaterialIds
-          .map((id) => materials.find((x) => x.id === id)?.path)
-          .filter((path): path is string => !!path);
-
-        if (materialPaths.length === 0) {
-          toast.error("多素材解析失败，请刷新素材后重试");
-          return;
-        }
-
-        payload.material_paths = materialPaths;
-        payload.material_path = materialPaths[0];
-
-        // 发送自定义时间轴分配
        const assignments = toCustomAssignments();
        if (assignments.length > 0) {
-          const assignmentPaths = assignments
-            .map((a) => a.material_path)
-            .filter((path): path is string => !!path);
-
-          if (assignmentPaths.length === assignments.length) {
-            // 以时间轴可见段为准：超出时间轴的素材不会参与本次生成
-            payload.material_paths = assignmentPaths;
-            payload.material_path = assignmentPaths[0];
+          // Frontend check on the estimated segment count (aligned with the backend hard limit of 50)
+          if (assignments.length > 50) {
+            toast.error(`时间轴段数过多（${assignments.length}），请减少插入或使用更长的主素材`);
+            return;
          }
+          // Primary material path (always taken from selectedMaterials[0])
+          const primaryPath = firstMaterialObj.path;
+          // Deduplicated material path list; the primary material is always first
+          const otherPaths = [...new Set(
+            assignments.map((a) => a.material_path).filter((p) => p !== primaryPath)
+          )];
+          payload.material_path = primaryPath;
+          payload.material_paths = [primaryPath, ...otherPaths];
          payload.custom_assignments = assignments;
        } else {
-          console.warn(
-            "[Timeline] custom_assignments 为空，回退后端自动分配",
-            { materials: materialPaths.length }
-          );
+          // No inserts and no trim on the primary material: fall back to single-material mode
+          payload.material_path = firstMaterialObj.path;
        }
      }

      // 单素材 + 截取范围
-      const singleSeg = timelineSegments[0];
-      if (
-        selectedMaterials.length === 1
-        && singleSeg
-        && (singleSeg.sourceStart > 0 || singleSeg.sourceEnd > 0)
-      ) {
-        payload.custom_assignments = toCustomAssignments();
+      if (selectedMaterials.length === 1) {
+        const assignments = toCustomAssignments();
+        if (assignments.length > 0) {
+          payload.custom_assignments = assignments;
+        }
      }

      if (selectedSubtitleStyleId) {
@@ -1040,7 +1045,7 @@ export const useHomeController = () => {

      if (enableBgm && selectedBgmId) {
        payload.bgm_id = selectedBgmId;
-        payload.bgm_volume = bgmVolume;
+        payload.bgm_volume = 0.2; // fixed value; the bgmVolume state is no longer sent
      }

      // 创建生成任务
@@ -1087,6 +1092,21 @@ export const useHomeController = () => {
    videoItemRefs.current[id] = el;
  };

+  // Set as primary: move the target material to selectedMaterials[0]
+  const handleSetPrimary = useCallback((materialId: string) => {
+    setSelectedMaterials((prev) => {
+      const filtered = prev.filter((id) => id !== materialId);
+      return [materialId, ...filtered];
+    });
+  }, [setSelectedMaterials]);
+
+  // Multi-camera: insert candidates (selectedMaterials[1:])
+  const insertCandidates = useMemo(() => {
+    return selectedMaterials.slice(1)
+      .map((id) => materials.find((m) => m.id === id))
+      .filter((m): m is Material => !!m);
+  }, [selectedMaterials, materials]);
+
  return {
    apiBase,
    registerMaterialRef,
@@ -1116,6 +1136,8 @@ export const useHomeController = () => {
    setText,
    extractModalOpen,
    setExtractModalOpen,
+    learningModalOpen,
+    setLearningModalOpen,
    rewriteModalOpen,
    setRewriteModalOpen,
    handleGenerateMeta,
@@ -1198,6 +1220,7 @@ export const useHomeController = () => {
    startRecording,
    stopRecording,
    useRecording,
+    discardRecording,
    formatRecordingTime,
    bgmList,
    bgmLoading,
@@ -1238,9 +1261,20 @@ export const useHomeController = () => {
    setSpeed,
    emotion,
    setEmotion,
-    timelineSegments,
-    reorderSegments,
-    setSourceRange,
+    // Multi-camera timeline
+    inserts,
+    timelinePrimaryMaterial,
+    primarySourceStart,
+    primarySourceEnd,
+    insertCandidates,
+    addInsert,
+    removeInsert,
+    moveInsert,
+    resizeInsert,
+    setInsertSourceRange,
+    setPrimarySourceRange,
+    handleSetPrimary,
+    reorderMaterials,
    clipTrimmerOpen,
    setClipTrimmerOpen,
    clipTrimmerSegmentId,
--- a/frontend/src/features/home/model/useTimelineEditor.ts
+++ b/frontend/src/features/home/model/useTimelineEditor.ts
@@ -1,5 +1,9 @@
 import { useCallback, useEffect, useRef, useState } from "react";
 import type { Material } from "@/shared/types/material";
+import type { InsertSegment } from "@/shared/types/timeline";
+
+// Re-export for downstream consumers (ClipTrimmer, etc.)
+export type { InsertSegment };

 export interface TimelineSegment {
  id: string;
@@ -12,18 +16,23 @@ export interface TimelineSegment {
  color: string;
 }

-export interface CustomAssignment {
-  material_path: string;
-  start: number;
-  end: number;
-  source_start: number;
-  source_end?: number;
-}
+export interface CustomAssignment {
+  material_path: string;
+  start: number;
+  end: number;
+  source_start: number;
+  source_end?: number;
+}

 const COLORS = ["#8b5cf6", "#ec4899", "#06b6d4", "#f59e0b", "#10b981", "#f97316"];
+const MAX_INSERTS = 10;
+const DEFAULT_INSERT_DURATION = 3;
+const MIN_GAP = 0.5;
+
+export type AddInsertResult = "ok" | "limit" | "no_space";

 /** Serializable subset for localStorage */
-interface SegmentSnapshot {
+interface InsertSnapshot {
  materialId: string;
  start: number;
  end: number;
@@ -31,56 +40,11 @@ interface SegmentSnapshot {
  sourceEnd: number;
 }

-/** Get effective duration of a segment (clipped range or full material duration) */
-function getEffectiveDuration(
-  seg: { sourceStart: number; sourceEnd: number; materialId: string },
-  mats: Material[]
-): number {
-  const mat = mats.find((m) => m.id === seg.materialId);
-  const matDur = mat?.duration_sec ?? 0;
-  if (seg.sourceEnd > seg.sourceStart) return seg.sourceEnd - seg.sourceStart;
-  if (seg.sourceStart > 0) return Math.max(matDur - seg.sourceStart, 0);
-  return matDur;
-}
-
-/**
- * Recalculate segment start/end positions based on effective durations.
- * - Segments placed sequentially by effective duration
- * - Segments exceeding audioDuration keep their positions (overflow, start >= duration)
- * - Last visible segment is capped/extended to exactly audioDuration (loop fill)
- */
-function recalcPositions(
-  segs: TimelineSegment[],
-  mats: Material[],
-  duration: number
-): TimelineSegment[] {
-  if (segs.length === 0 || duration <= 0) return segs;
-
-  const fallbackDur = duration / segs.length;
-  let cursor = 0;
-  const result = segs.map((seg) => {
-    const effDur = getEffectiveDuration(seg, mats);
-    const dur = effDur > 0 ? effDur : fallbackDur;
-    const newSeg = { ...seg, start: cursor, end: cursor + dur };
-    cursor += dur;
-    return newSeg;
-  });
-
-  // Find last segment that starts before audioDuration
-  let lastVisibleIdx = -1;
-  for (let i = result.length - 1; i >= 0; i--) {
-    if (result[i].start < duration) {
-      lastVisibleIdx = i;
-      break;
-    }
-  }
-
-  // Cap/extend last visible segment to exactly audioDuration
-  if (lastVisibleIdx >= 0) {
-    result[lastVisibleIdx] = { ...result[lastVisibleIdx], end: duration };
-  }
-
-  return result;
+interface MultiCamCache {
+  key: string;
+  inserts: InsertSnapshot[];
+  primarySourceStart: number;
+  primarySourceEnd: number;
 }

 interface UseTimelineEditorOptions {
@@ -96,34 +60,40 @@ export const useTimelineEditor = ({
  selectedMaterials,
  storageKey,
 }: UseTimelineEditorOptions) => {
-  const [segments, setSegments] = useState<TimelineSegment[]>([]);
+  const [inserts, setInserts] = useState<InsertSegment[]>([]);
+  const [primarySourceStart, setPrimarySourceStart] = useState(0);
+  const [primarySourceEnd, setPrimarySourceEnd] = useState(0);
  const prevKey = useRef("");
-  const restoredRef = useRef(false);
+  const [prevPrimaryId, setPrevPrimaryId] = useState(selectedMaterials[0]);

-  // Refs for stable callbacks (avoid recreating on every materials/duration change)
-  const materialsRef = useRef(materials);
-  const audioDurationRef = useRef(audioDuration);
-
-  useEffect(() => {
-    materialsRef.current = materials;
-  }, [materials]);
-
-  useEffect(() => {
-    audioDurationRef.current = audioDuration;
-  }, [audioDuration]);
+  // Refs for stable callbacks
+  const materialsRef = useRef(materials);
+  const audioDurationRef = useRef(audioDuration);
+  const selectedMaterialsRef = useRef(selectedMaterials);

-  // Build a durationsKey so segments re-init when material durations become available
-  const durationsKey = selectedMaterials
-    .map((id) => materials.find((m) => m.id === id)?.duration_sec ?? 0)
-    .join(",");
+  useEffect(() => { materialsRef.current = materials; }, [materials]);
+  useEffect(() => { audioDurationRef.current = audioDuration; }, [audioDuration]);
+  useEffect(() => { selectedMaterialsRef.current = selectedMaterials; }, [selectedMaterials]);

-  // Build a cache key from materials + duration
+  // Computed: primary material
+  const primaryMaterial = materials.find((m) => m.id === selectedMaterials[0]);
+
+  // Cache key
  const cacheKey = `${selectedMaterials.join(",")}_${audioDuration.toFixed(1)}`;
-  const lsKey = storageKey ? `vigent_${storageKey}_timeline` : null;
+  const lsKey = storageKey ? `vigent_${storageKey}_multicam` : null;

-  const initSegments = useCallback(() => {
-    if (selectedMaterials.length === 0 || audioDuration <= 0) {
-      setSegments([]);
+  // Reset primary source range when primary material identity changes
+  // (React render-time state adjustment pattern for derived state)
+  if (selectedMaterials[0] !== prevPrimaryId) {
+    setPrevPrimaryId(selectedMaterials[0]);
+    setPrimarySourceStart(0);
+    setPrimarySourceEnd(0);
+  }
+
+  // Initialize / restore from localStorage
+  const initInserts = useCallback(() => {
+    if (selectedMaterials.length <= 1 || audioDuration <= 0) {
+      setInserts([]);
      return;
    }

@@ -132,27 +102,28 @@ export const useTimelineEditor = ({
      try {
        const raw = localStorage.getItem(lsKey);
        if (raw) {
-          const saved = JSON.parse(raw) as { key: string; segments: SegmentSnapshot[] };
-          if (saved.key === cacheKey && saved.segments.length === selectedMaterials.length) {
-            const allMatch = saved.segments.every(
-              (s, i) => s.materialId === selectedMaterials[i] || saved.segments.some((ss) => ss.materialId === selectedMaterials[i])
-            );
-            if (allMatch) {
-              const restored: TimelineSegment[] = saved.segments.map((s, i) => {
+          const saved: MultiCamCache = JSON.parse(raw);
+          if (saved.key === cacheKey) {
+            // Validate all insert materialIds still exist
+            const existingIds = new Set(materials.map((m) => m.id));
+            const validInserts = saved.inserts.filter((s) => existingIds.has(s.materialId));
+            if (validInserts.length === saved.inserts.length) {
+              const restored: InsertSegment[] = validInserts.map((s, i) => {
                const mat = materials.find((m) => m.id === s.materialId);
                return {
-                  id: `seg-${i}-${Date.now()}`,
+                  id: `ins-${i}-${Date.now()}`,
                  materialId: s.materialId,
                  materialName: mat?.scene || mat?.name || s.materialId,
-                  start: 0,
-                  end: 0,
+                  start: s.start,
+                  end: s.end,
                  sourceStart: s.sourceStart,
                  sourceEnd: s.sourceEnd,
                  color: COLORS[i % COLORS.length],
                };
              });
-              setSegments(recalcPositions(restored, materials, audioDuration));
-              restoredRef.current = true;
+              setInserts(restored);
+              setPrimarySourceStart(saved.primarySourceStart || 0);
+              setPrimarySourceEnd(saved.primarySourceEnd || 0);
              return;
            }
          }
@@ -162,95 +133,315 @@ export const useTimelineEditor = ({
      }
    }

-    // Create fresh segments — positions derived by recalcPositions
-    const newSegments: TimelineSegment[] = selectedMaterials.map((matId, i) => {
-      const mat = materials.find((m) => m.id === matId);
-      return {
-        id: `seg-${i}-${Date.now()}`,
-        materialId: matId,
-        materialName: mat?.scene || mat?.name || matId,
-        start: 0,
-        end: 0,
-        sourceStart: 0,
-        sourceEnd: 0,
-        color: COLORS[i % COLORS.length],
-      };
-    });
-
-    setSegments(recalcPositions(newSegments, materials, audioDuration));
+    // Start fresh
+    setInserts([]);
+    setPrimarySourceStart(0);
+    setPrimarySourceEnd(0);
  }, [audioDuration, materials, selectedMaterials, lsKey, cacheKey]);

-  // Auto-init when selectedMaterials, audioDuration, or material durations change
+  // Auto-init when inputs change
  useEffect(() => {
+    const durationsKey = selectedMaterials
+      .map((id) => materials.find((m) => m.id === id)?.duration_sec ?? 0)
+      .join(",");
    const key = `${selectedMaterials.join(",")}_${audioDuration}_${durationsKey}`;
    if (key !== prevKey.current) {
      prevKey.current = key;
-      initSegments();
+      // eslint-disable-next-line react-hooks/set-state-in-effect -- initialization on input change
+      initInserts();
    }
-  }, [selectedMaterials, audioDuration, durationsKey, initSegments]);
+  }, [selectedMaterials, audioDuration, materials, initInserts]);

-  // Persist segments to localStorage on change (debounced)
+  // Persist to localStorage (debounced)
  useEffect(() => {
-    if (!lsKey || segments.length === 0) return;
+    if (!lsKey || selectedMaterials.length <= 1) return;
    const timeout = setTimeout(() => {
-      const snapshots: SegmentSnapshot[] = segments.map((s) => ({
+      const snapshots: InsertSnapshot[] = inserts.map((s) => ({
        materialId: s.materialId,
        start: s.start,
        end: s.end,
        sourceStart: s.sourceStart,
        sourceEnd: s.sourceEnd,
      }));
-      localStorage.setItem(lsKey, JSON.stringify({ key: cacheKey, segments: snapshots }));
+      const cache: MultiCamCache = {
+        key: cacheKey,
+        inserts: snapshots,
+        primarySourceStart,
+        primarySourceEnd,
+      };
+      localStorage.setItem(lsKey, JSON.stringify(cache));
    }, 300);
    return () => clearTimeout(timeout);
-  }, [segments, lsKey, cacheKey]);
+  }, [inserts, primarySourceStart, primarySourceEnd, lsKey, cacheKey, selectedMaterials.length]);

-  const reorderSegments = useCallback(
-    (fromIdx: number, toIdx: number) => {
-      setSegments((prev) => {
-        if (fromIdx < 0 || toIdx < 0 || fromIdx >= prev.length || toIdx >= prev.length) return prev;
-        if (fromIdx === toIdx) return prev;
-        const next = [...prev];
-        // Move the segment: remove from old position, insert at new position
-        const [moved] = next.splice(fromIdx, 1);
-        next.splice(toIdx, 0, moved);
-        return recalcPositions(next, materialsRef.current, audioDurationRef.current);
-      });
-    },
-    []
-  );
+  // Clean up inserts referencing removed materials
+  useEffect(() => {
+    const existingIds = new Set(selectedMaterials.slice(1));
+    // eslint-disable-next-line react-hooks/set-state-in-effect -- cleanup stale references
+    setInserts((prev) => {
+      const filtered = prev.filter((ins) => existingIds.has(ins.materialId));
+      return filtered.length !== prev.length ? filtered : prev;
+    });
+  }, [selectedMaterials]);

-  const setSourceRange = useCallback(
-    (id: string, sourceStart: number, sourceEnd: number) => {
-      setSegments((prev) => {
-        const updated = prev.map((s) => (s.id === id ? { ...s, sourceStart, sourceEnd } : s));
-        return recalcPositions(updated, materialsRef.current, audioDurationRef.current);
-      });
-    },
-    []
-  );
+  // ── Operations ──
+
+  const addInsert = useCallback((materialId: string): AddInsertResult => {
+    const currentInserts = inserts;
+    const duration = audioDurationRef.current;
+
+    if (currentInserts.length >= MAX_INSERTS) return "limit";
+    if (duration <= 0) return "no_space";
+
+    const mat = materialsRef.current.find((m) => m.id === materialId);
+    const sorted = [...currentInserts].sort((a, b) => a.start - b.start);
+
+    // Find first gap that can fit DEFAULT_INSERT_DURATION
+    let bestStart = -1;
+    let prevEnd = 0;
+    for (const ins of sorted) {
+      if (ins.start - prevEnd >= DEFAULT_INSERT_DURATION + MIN_GAP) {
+        bestStart = prevEnd + MIN_GAP;
+        break;
+      }
+      prevEnd = ins.end;
+    }
+    // Check trailing gap
+    if (bestStart < 0 && duration - prevEnd >= DEFAULT_INSERT_DURATION + MIN_GAP) {
+      bestStart = prevEnd + MIN_GAP;
+    }
+
+    if (bestStart < 0) return "no_space";
+
+    const newInsert: InsertSegment = {
+      id: `ins-${Date.now()}-${Math.random().toString(36).slice(2, 6)}`,
+      materialId,
+      materialName: mat?.scene || mat?.name || materialId,
+      start: bestStart,
+      end: Math.min(bestStart + DEFAULT_INSERT_DURATION, duration),
+      sourceStart: 0,
+      sourceEnd: 0,
+      color: COLORS[currentInserts.length % COLORS.length],
+    };
+
+    setInserts((prev) => [...prev, newInsert]);
+    return "ok";
+  }, [inserts]);
+
+  const removeInsert = useCallback((id: string) => {
+    setInserts((prev) => prev.filter((ins) => ins.id !== id));
+  }, []);
+
+  const moveInsert = useCallback((id: string, newStart: number) => {
+    setInserts((prev) => {
+      const duration = audioDurationRef.current;
+      const target = prev.find((ins) => ins.id === id);
+      if (!target) return prev;
+
+      const len = target.end - target.start;
+      let clampedStart = Math.max(0, Math.min(newStart, duration - len));
+      let clampedEnd = clampedStart + len;
+
+      // Prevent overlap with other inserts
+      const others = prev.filter((ins) => ins.id !== id).sort((a, b) => a.start - b.start);
+      for (const other of others) {
+        if (clampedEnd > other.start && clampedStart < other.end) {
+          // Try pushing to right of blocker
+          const rightStart = other.end + 0.1;
+          if (rightStart + len <= duration) {
+            clampedStart = rightStart;
+            clampedEnd = clampedStart + len;
+          } else {
+            // Try pushing to left of blocker
+            const leftEnd = other.start - 0.1;
+            if (leftEnd - len >= 0) {
+              clampedEnd = leftEnd;
+              clampedStart = clampedEnd - len;
+            }
+          }
+        }
+      }
+
+      return prev.map((ins) =>
+        ins.id === id ? { ...ins, start: clampedStart, end: clampedEnd } : ins
+      );
+    });
+  }, []);
+
+  const resizeInsert = useCallback((id: string, newEnd: number) => {
+    setInserts((prev) => {
+      const duration = audioDurationRef.current;
+      const target = prev.find((ins) => ins.id === id);
+      if (!target) return prev;
+
+      const minLen = 0.5;
+      let clamped = Math.max(target.start + minLen, Math.min(newEnd, duration));
+
+      // Prevent overlap with the next insert, keeping a 0.1s gap before it
+      const others = prev.filter((ins) => ins.id !== id).sort((a, b) => a.start - b.start);
+      for (const other of others) {
+        if (other.start > target.start && clamped > other.start - 0.1) {
+          clamped = other.start - 0.1;
+        }
+      }
+
+      return prev.map((ins) =>
+        ins.id === id ? { ...ins, end: Math.max(clamped, target.start + minLen) } : ins
+      );
+    });
+  }, []);
+
+  const setInsertSourceRange = useCallback((id: string, sourceStart: number, sourceEnd: number) => {
+    setInserts((prev) =>
+      prev.map((ins) => (ins.id === id ? { ...ins, sourceStart, sourceEnd } : ins))
+    );
+  }, []);
+
+  const setPrimarySourceRange = useCallback((sourceStart: number, sourceEnd: number) => {
+    setPrimarySourceStart(sourceStart);
+    setPrimarySourceEnd(sourceEnd);
+  }, []);
+
+  // ── Serialization ──

  const toCustomAssignments = useCallback((): CustomAssignment[] => {
+    const mats = materialsRef.current;
+    const selMats = selectedMaterialsRef.current;
    const duration = audioDurationRef.current;
-    return segments
-      .filter((seg) => seg.start < duration)
-      .map((seg) => {
-        const mat = materialsRef.current.find((m) => m.id === seg.materialId);
-        return {
-          material_path: mat?.path || seg.materialId,
-          start: seg.start,
-          end: seg.end,
-          source_start: seg.sourceStart,
-          source_end: seg.sourceEnd > seg.sourceStart ? seg.sourceEnd : undefined,
-        };
-      });
-  }, [segments]);
+
+    if (duration <= 0 || selMats.length === 0) return [];
+
+    const primaryMat = mats.find((m) => m.id === selMats[0]);
+    if (!primaryMat) return [];
+    const primaryPath = primaryMat.path;
+    const primaryDuration = primaryMat.duration_sec ?? 0;
+
+    // Single material mode: only emit assignment if user has set a source range
+    if (selMats.length === 1) {
+      if (primarySourceStart > 0 || primarySourceEnd > 0) {
+        return [{
+          material_path: primaryPath,
+          start: 0,
+          end: duration,
+          source_start: primarySourceStart,
+          source_end: primarySourceEnd > primarySourceStart ? primarySourceEnd : undefined,
+        }];
+      }
+      return [];
+    }
+
+    // Multi-camera mode: build assignments with gap splitting
+    return buildAssignments(
+      primaryPath,
+      primaryDuration,
+      primarySourceStart,
+      primarySourceEnd,
+      inserts,
+      duration,
+      mats,
+    );
+  }, [inserts, primarySourceStart, primarySourceEnd]);

  return {
-    segments,
-    initSegments,
-    reorderSegments,
-    setSourceRange,
+    // State
+    inserts,
+    setInserts,
+    primaryMaterial,
+    primarySourceStart,
+    primarySourceEnd,
+    // Operations
+    addInsert,
+    removeInsert,
+    moveInsert,
+    resizeInsert,
+    setInsertSourceRange,
+    setPrimarySourceRange,
+    // Serialization
    toCustomAssignments,
  };
 };
+
+// ── buildAssignments: gap-filling + boundary-splitting ──
+
+function buildAssignments(
+  primaryPath: string,
+  primaryDuration: number,
+  pSourceStart: number,
+  pSourceEnd: number,
+  inserts: InsertSegment[],
+  audioDuration: number,
+  materials: Material[],
+): CustomAssignment[] {
+  const assignments: CustomAssignment[] = [];
+  const sorted = [...inserts].sort((a, b) => a.start - b.start);
+
+  // Primary material effective play range
+  const clipStart = pSourceStart;
+  const clipEnd = pSourceEnd > pSourceStart ? pSourceEnd : primaryDuration;
+  const effective = clipEnd - clipStart;
+
+  let cursor = 0;
+  let primaryAccum = 0;
+
+  function addPrimaryGap(gapStart: number, gapEnd: number) {
+    if (gapEnd - gapStart < 0.05) return;
+
+    // No valid effective range: single segment from 0 (graceful degradation)
+    if (effective <= 0) {
+      assignments.push({
+        material_path: primaryPath,
+        start: gapStart,
+        end: gapEnd,
+        source_start: 0,
+      });
+      return;
+    }
+
+    let remaining = gapEnd - gapStart;
+    let segStart = gapStart;
+    const EPSILON = 0.01;
+
+    while (remaining > 0.05) {
+      const posInClip = primaryAccum % effective;
+      const sourceStart = clipStart + posInClip;
+      const availableInClip = effective - posInClip;
+      const segDuration = Math.min(remaining, availableInClip);
+
+      if (segDuration < EPSILON) break;
+
+      assignments.push({
+        material_path: primaryPath,
+        start: segStart,
+        end: segStart + segDuration,
+        source_start: sourceStart,
+        source_end: pSourceEnd > pSourceStart ? pSourceEnd : undefined,
+      });
+
+      primaryAccum += segDuration;
+      segStart += segDuration;
+      remaining -= segDuration;
+    }
+  }
+
+  for (const insert of sorted) {
+    // Primary gap before this insert
+    addPrimaryGap(cursor, insert.start);
+
+    // Insert segment
+    const mat = materials.find((m) => m.id === insert.materialId);
+    assignments.push({
+      material_path: mat?.path || insert.materialId,
+      start: insert.start,
+      end: insert.end,
+      source_start: insert.sourceStart,
+      source_end: insert.sourceEnd > insert.sourceStart ? insert.sourceEnd : undefined,
+    });
+
+    cursor = insert.end;
+  }
+
+  // Trailing primary gap
+  addPrimaryGap(cursor, audioDuration);
+
+  return assignments;
+}
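Note on the gap-filling loop in `buildAssignments` above: the primary clip is treated as a loop, and each primary gap continues from wherever the previous gap stopped, splitting a piece whenever playback reaches the end of the clip. The following is a minimal standalone sketch of that idea only. `Piece` and `fillGap` are hypothetical names introduced for illustration and are not part of the commit; the real code also emits `material_path` and `source_end`, which are omitted here.

```typescript
// Hypothetical, simplified shape for illustration (not the project's CustomAssignment).
interface Piece {
  start: number;       // timeline start, in seconds
  end: number;         // timeline end, in seconds
  sourceStart: number; // offset into the primary clip, in seconds
}

// Fill the timeline range [gapStart, gapEnd) with the primary clip, wrapping
// back to clipStart whenever playback reaches clipEnd.
// `played` is how much primary footage earlier gaps have already consumed.
function fillGap(
  clipStart: number,
  clipEnd: number,
  gapStart: number,
  gapEnd: number,
  played: number,
): { pieces: Piece[]; played: number } {
  const pieces: Piece[] = [];
  const effective = clipEnd - clipStart;
  // Assumption: an empty clip range yields no pieces in this sketch
  // (the commit instead falls back to a single segment from source 0).
  if (effective <= 0) return { pieces, played };

  let cursor = gapStart;
  let remaining = gapEnd - gapStart;
  while (remaining > 0.05) {
    const posInClip = played % effective;                    // where the loop currently is
    const take = Math.min(remaining, effective - posInClip); // stop at the clip boundary
    if (take < 0.01) break;                                  // same guard as EPSILON in the diff
    pieces.push({ start: cursor, end: cursor + take, sourceStart: clipStart + posInClip });
    played += take;
    cursor += take;
    remaining -= take;
  }
  return { pieces, played };
}

// A 4 s clip (source 0..4) fills a 0..3 gap, then a 5..10 gap after an insert at 3..5.
const first = fillGap(0, 4, 0, 3, 0);
const second = fillGap(0, 4, 5, 10, first.played);
console.log(first.pieces, second.pieces);
```

In this example the second gap is split at the clip boundary: one piece resumes at source offset 3 for one second, and the next piece restarts at source offset 0 for the remaining four seconds.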
--- a/frontend/src/features/home/ui/BgmPanel.tsx
+++ b/frontend/src/features/home/ui/BgmPanel.tsx
@@ -1,5 +1,6 @@
-import type { RefObject, MouseEvent } from "react";
-import { RefreshCw, Play, Pause } from "lucide-react";
+import { type RefObject, type MouseEvent, useCallback, useMemo, useState } from "react";
+import { RefreshCw, Play, Pause, ChevronDown, Check, Search } from "lucide-react";
+import { SelectPopover } from "@/shared/ui/SelectPopover";

 interface BgmItem {
  id: string;
@@ -18,8 +19,6 @@ interface BgmPanelProps {
  onSelectBgm: (id: string) => void;
  playingBgmId: string | null;
  onTogglePreview: (bgm: BgmItem, event: MouseEvent) => void;
-  bgmVolume: number;
-  onVolumeChange: (value: number) => void;
  bgmListContainerRef: RefObject<HTMLDivElement | null>;
  registerBgmItemRef: (id: string, element: HTMLDivElement | null) => void;
 }
@@ -35,11 +34,31 @@ export function BgmPanel({
  onSelectBgm,
  playingBgmId,
  onTogglePreview,
-  bgmVolume,
-  onVolumeChange,
  bgmListContainerRef,
  registerBgmItemRef,
 }: BgmPanelProps) {
+  const [bgmFilter, setBgmFilter] = useState("");
+  const selectedBgm = bgmList.find((item) => item.id === selectedBgmId) || null;
+  const canSelectBgm = enableBgm && !bgmLoading && !bgmError && bgmList.length > 0;
+  const filteredBgmList = useMemo(() => {
+    const query = bgmFilter.trim().toLowerCase();
+    if (!query) return bgmList;
+    return bgmList.filter((bgm) => bgm.name.toLowerCase().includes(query));
+  }, [bgmFilter, bgmList]);
+
+  const handleOpenBgmPopover = useCallback(() => {
+    setBgmFilter("");
+
+    requestAnimationFrame(() => {
+      requestAnimationFrame(() => {
+        const container = bgmListContainerRef.current;
+        if (!container) return;
+        const selectedRow = container.querySelector<HTMLElement>("[data-bgm-selected='true']");
+        selectedRow?.scrollIntoView({ block: "nearest", behavior: "auto" });
+      });
+    });
+  }, [bgmListContainerRef]);
+
  return (
    <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
      <div className="flex items-center justify-between mb-4">
@@ -79,57 +98,108 @@ export function BgmPanel({
      ) : bgmList.length === 0 ? (
        <div className="text-center py-4 text-gray-500 text-sm">暂无背景音乐，请先导入素材</div>
      ) : (
-        <div
-          ref={bgmListContainerRef}
-          className={`space-y-2 max-h-64 overflow-y-auto hide-scrollbar ${enableBgm ? '' : 'opacity-70'}`}
-        >
-          {bgmList.map((bgm) => (
-            <div
-              key={bgm.id}
-              ref={(el) => registerBgmItemRef(bgm.id, el)}
-              className={`p-3 rounded-lg border transition-all flex items-center justify-between group ${selectedBgmId === bgm.id
-                ? "border-purple-500 bg-purple-500/20"
-                : "border-white/10 bg-white/5 hover:border-white/30"
-                }`}
-            >
-              <button onClick={() => onSelectBgm(bgm.id)} className="flex-1 text-left">
-                <div className="text-white text-sm truncate">{bgm.name}</div>
-                <div className="text-xs text-gray-400">.{bgm.ext || 'audio'}</div>
+        <div className={!enableBgm ? "opacity-70" : ""}>
+          <p className="mb-2 text-xs text-gray-400">曲目选择</p>
+          <SelectPopover
+            sheetTitle="选择背景音乐"
+            disabled={!canSelectBgm}
+            onOpen={handleOpenBgmPopover}
+            trigger={({ open, toggle }) => (
+              <button
+                type="button"
+                onClick={toggle}
+                disabled={!canSelectBgm}
+                className={`w-full rounded-xl border px-3 py-2.5 text-left transition-colors ${canSelectBgm
+                  ? "border-white/10 bg-black/25 hover:border-white/30"
+                  : "border-white/10 bg-black/20 text-gray-500 cursor-not-allowed"
+                  }`}
+              >
+                <span className="flex items-center justify-between gap-3">
+                  <span className="min-w-0">
+                    <span className="block truncate text-sm text-white">
+                      {selectedBgm?.name || "请选择背景音乐"}
+                    </span>
+                    <span className="mt-0.5 block text-xs text-gray-400">
+                      {selectedBgm ? `.${selectedBgm.ext || "audio"}` : "未选择"}
+                    </span>
+                  </span>
+                  <ChevronDown className={`h-4 w-4 text-gray-300 transition-transform ${open ? "rotate-180" : ""}`} />
+                </span>
              </button>
-              <div className="flex items-center gap-2 pl-2">
-                <button
-                  onClick={(e) => onTogglePreview(bgm, e)}
-                  className="p-1 text-gray-500 hover:text-purple-400 transition-colors"
-                  title="试听"
-                >
-                  {playingBgmId === bgm.id ? (
-                    <Pause className="h-4 w-4" />
-                  ) : (
-                    <Play className="h-4 w-4" />
-                  )}
-                </button>
-                {selectedBgmId === bgm.id && (
-                  <span className="text-xs text-purple-300">已选</span>
+            )}
+          >
+            {({ close }) => (
+              <div className="space-y-2">
+                <div className="rounded-lg border border-white/10 bg-black/30 px-3 py-2">
+                  <div className="flex items-center gap-2">
+                    <Search className="h-4 w-4 text-gray-400" />
+                    <input
+                      type="text"
+                      value={bgmFilter}
+                      onChange={(e) => setBgmFilter(e.target.value)}
+                      placeholder="搜索背景音乐..."
+                      className="w-full bg-transparent text-sm text-white placeholder-gray-500 outline-none"
+                    />
+                  </div>
+                </div>
+
+                {filteredBgmList.length === 0 ? (
+                  <div className="py-6 text-center text-sm text-gray-400">没有匹配的背景音乐</div>
+                ) : (
+                  <div
+                    ref={bgmListContainerRef}
+                    className="space-y-1"
+                    style={{ contentVisibility: "auto" }}
+                  >
+                    {filteredBgmList.map((bgm) => {
+                      const isSelected = selectedBgmId === bgm.id;
+
+                      return (
+                        <div
+                          key={bgm.id}
+                          ref={(el) => registerBgmItemRef(bgm.id, el)}
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          data-bgm-selected={isSelected ? "true" : "false"}
+                          className={`flex items-center justify-between gap-2 rounded-lg border px-3 py-2 transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20"
+                            : "border-white/10 bg-white/5 hover:border-white/30"
+                            }`}
+                        >
+                          <button
+                            type="button"
+                            onClick={() => {
+                              onSelectBgm(bgm.id);
+                              close();
+                            }}
+                            className="min-w-0 flex-1 text-left"
+                          >
+                            <span className="block truncate text-sm text-white">{bgm.name}</span>
+                            <span className="mt-0.5 block text-xs text-gray-400">.{bgm.ext || "audio"}</span>
+                          </button>
+
+                          <div className="flex items-center gap-2 pl-2">
+                            <button
+                              type="button"
+                              onClick={(e) => onTogglePreview(bgm, e)}
+                              className="p-1 text-gray-400 hover:text-purple-300 transition-colors"
+                              title="试听"
+                            >
+                              {playingBgmId === bgm.id ? (
+                                <Pause className="h-4 w-4" />
+                              ) : (
+                                <Play className="h-4 w-4" />
+                              )}
+                            </button>
+                            {isSelected && <Check className="h-4 w-4 text-purple-300" />}
+                          </div>
+                        </div>
+                      );
+                    })}
+                  </div>
                )}
              </div>
-            </div>
-          ))}
-        </div>
-      )}
-
-      {enableBgm && (
-        <div className="mt-4">
-          <label className="text-sm text-gray-300 mb-2 block">音量</label>
-          <input
-            type="range"
-            min="0"
-            max="1"
-            step="0.05"
-            value={bgmVolume}
-            onChange={(e) => onVolumeChange(parseFloat(e.target.value))}
-            className="w-full accent-purple-500"
-          />
-          <div className="text-xs text-gray-400 mt-1">当前: {Math.round(bgmVolume * 100)}%</div>
+            )}
+          </SelectPopover>
        </div>
      )}
    </div>
--- a/frontend/src/features/home/ui/ClipTrimmer.tsx
+++ b/frontend/src/features/home/ui/ClipTrimmer.tsx
@@ -1,6 +1,7 @@
-import { useCallback, useEffect, useRef, useState } from "react";
-import { X, Play, Pause } from "lucide-react";
-import type { TimelineSegment } from "@/features/home/model/useTimelineEditor";
+import { useCallback, useEffect, useRef, useState } from "react";
+import { Play, Pause } from "lucide-react";
+import type { TimelineSegment } from "@/features/home/model/useTimelineEditor";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";

 interface ClipTrimmerProps {
  isOpen: boolean;
@@ -153,21 +154,18 @@ export function ClipTrimmer({
  const endPct = duration > 0 ? (effectiveEnd / duration) * 100 : 100;
  const playheadPct = duration > 0 ? (currentTime / duration) * 100 : 0;

-  return (
-    <div className="fixed inset-0 z-50 flex items-center justify-center bg-black/60 backdrop-blur-sm" onClick={onClose}>
-      <div
-        className="bg-gray-900 border border-white/10 rounded-2xl w-full max-w-lg mx-4 overflow-hidden"
-        onClick={(e) => e.stopPropagation()}
-      >
-        {/* Header */}
-        <div className="flex items-center justify-between px-5 py-3 border-b border-white/10">
-          <h3 className="text-white font-semibold text-sm">
-            截取设置 - {segment.materialName}
-          </h3>
-          <button onClick={onClose} className="text-gray-400 hover:text-white">
-            <X className="h-4 w-4" />
-          </button>
-        </div>
+  return (
+    <AppModal
+      isOpen={isOpen}
+      onClose={onClose}
+      panelClassName="w-full max-w-lg mx-4 rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden"
+      closeOnOverlay
+    >
+      <AppModalHeader
+        title={`截取设置 - ${segment.materialName}`}
+        subtitle="拖拽起止点，精确控制素材片段"
+        onClose={onClose}
+      />

        {/* Video preview */}
        <div className="px-5 pt-4">
@@ -287,7 +285,6 @@ export function ClipTrimmer({
            确定
          </button>
        </div>
-      </div>
-    </div>
-  );
-}
+    </AppModal>
+  );
+}
--- a/frontend/src/features/home/ui/GenerateActionBar.tsx
+++ b/frontend/src/features/home/ui/GenerateActionBar.tsx
@@ -1,7 +1,14 @@
-import { Rocket } from "lucide-react";
+import { Rocket, ChevronDown, Check } from "lucide-react";
+import { SelectPopover } from "@/shared/ui/SelectPopover";

 type LipsyncModelMode = "default" | "fast" | "advanced";

+const MODEL_OPTIONS: Array<{ value: LipsyncModelMode; label: string; desc: string }> = [
+  { value: "default", label: "默认模型", desc: "按时长智能路由" },
+  { value: "fast", label: "快速模型", desc: "速度优先" },
+  { value: "advanced", label: "高级模型", desc: "质量优先" },
+];
+
 interface GenerateActionBarProps {
  isGenerating: boolean;
  progress: number;
@@ -21,6 +28,8 @@ export function GenerateActionBar({
  onModelModeChange,
  onGenerate,
 }: GenerateActionBarProps) {
+  const currentModel = MODEL_OPTIONS.find((opt) => opt.value === modelMode) || MODEL_OPTIONS[0];
+
  return (
    <div>
      <div className="flex items-center gap-2">
@@ -60,17 +69,56 @@ export function GenerateActionBar({
          )}
        </button>

-        <select
-          value={modelMode}
-          onChange={(e) => onModelModeChange(e.target.value as LipsyncModelMode)}
+        <SelectPopover
+          sheetTitle="选择唇形模型"
          disabled={isGenerating}
-          className="h-[58px] rounded-xl border border-white/15 bg-black/30 px-3 text-sm text-gray-200 outline-none focus:border-purple-400"
-          title="选择唇形模型"
+          trigger={({ open, toggle }) => (
+            <button
+              type="button"
+              onClick={toggle}
+              disabled={isGenerating}
+              className="h-[58px] min-w-[152px] rounded-xl border border-white/15 bg-black/30 px-3 text-left text-sm text-gray-200 transition-colors hover:border-white/30 disabled:cursor-not-allowed disabled:opacity-50"
+              title="选择唇形模型"
+            >
+              <span className="flex items-center justify-between gap-2">
+                <span className="min-w-0">
+                  <span className="block truncate text-sm text-white">{currentModel.label}</span>
+                  <span className="mt-0.5 block text-xs text-gray-400">{currentModel.desc}</span>
+                </span>
+                <ChevronDown className={`h-4 w-4 text-gray-300 transition-transform ${open ? "rotate-180" : ""}`} />
+              </span>
+            </button>
+          )}
        >
-          <option value="default">默认模型</option>
-          <option value="fast">快速模型</option>
-          <option value="advanced">高级模型</option>
-        </select>
+          {({ close }) => (
+            <div className="space-y-1">
+              {MODEL_OPTIONS.map((opt) => {
+                const isSelected = opt.value === modelMode;
+                return (
+                  <button
+                    key={opt.value}
+                    type="button"
+                    data-popover-selected={isSelected ? "true" : undefined}
+                    onClick={() => {
+                      onModelModeChange(opt.value);
+                      close();
+                    }}
+                    className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left transition-colors ${isSelected
+                      ? "border-purple-500 bg-purple-500/20"
+                      : "border-white/10 bg-white/5 hover:border-white/30"
+                      }`}
+                  >
+                    <span>
+                      <span className="block text-sm text-white">{opt.label}</span>
+                      <span className="mt-0.5 block text-xs text-gray-400">{opt.desc}</span>
+                    </span>
+                    {isSelected && <Check className="h-4 w-4 text-purple-300" />}
+                  </button>
+                );
+              })}
+            </div>
+          )}
+        </SelectPopover>
      </div>
      {!isGenerating && materialCount >= 2 && (
        <p className="text-xs text-gray-400 text-center mt-1.5">
--- a/frontend/src/features/home/ui/GeneratedAudiosPanel.tsx
+++ b/frontend/src/features/home/ui/GeneratedAudiosPanel.tsx
@@ -1,6 +1,7 @@
-import { useState, useRef, useCallback, useEffect } from "react";
-import { Play, Pause, Pencil, Trash2, Check, X, RefreshCw, Mic, ChevronDown } from "lucide-react";
-import type { GeneratedAudio } from "@/features/home/model/useGeneratedAudios";
+import { useState, useRef, useCallback, useEffect, useMemo } from "react";
+import { Play, Pause, Pencil, Trash2, Check, X, RefreshCw, Mic, ChevronDown, Search } from "lucide-react";
+import type { GeneratedAudio } from "@/features/home/model/useGeneratedAudios";
+import { SelectPopover } from "@/shared/ui/SelectPopover";

 interface AudioTask {
  status: string;
@@ -47,14 +48,12 @@ export function GeneratedAudiosPanel({
  onEmotionChange,
  embedded = false,
 }: GeneratedAudiosPanelProps) {
-  const [editingId, setEditingId] = useState<string | null>(null);
-  const [editName, setEditName] = useState("");
-  const [playingId, setPlayingId] = useState<string | null>(null);
-  const [speedOpen, setSpeedOpen] = useState(false);
-  const [emotionOpen, setEmotionOpen] = useState(false);
-  const audioRef = useRef<HTMLAudioElement | null>(null);
-  const speedRef = useRef<HTMLDivElement>(null);
-  const emotionRef = useRef<HTMLDivElement>(null);
+  const [editingId, setEditingId] = useState<string | null>(null);
+  const [editName, setEditName] = useState("");
+  const [playingId, setPlayingId] = useState<string | null>(null);
+  const [audioFilter, setAudioFilter] = useState("");
+  const audioRef = useRef<HTMLAudioElement | null>(null);
+  const audioListContainerRef = useRef<HTMLDivElement | null>(null);

  const stopPlaying = useCallback(() => {
    if (audioRef.current) {
@@ -75,28 +74,6 @@ export function GeneratedAudiosPanel({
    };
  }, []);

-  // Close speed dropdown on click outside
-  useEffect(() => {
-    const handler = (e: MouseEvent) => {
-      if (speedRef.current && !speedRef.current.contains(e.target as Node)) {
-        setSpeedOpen(false);
-      }
-    };
-    if (speedOpen) document.addEventListener("mousedown", handler);
-    return () => document.removeEventListener("mousedown", handler);
-  }, [speedOpen]);
-
-  // Close emotion dropdown on click outside
-  useEffect(() => {
-    const handler = (e: MouseEvent) => {
-      if (emotionRef.current && !emotionRef.current.contains(e.target as Node)) {
-        setEmotionOpen(false);
-      }
-    };
-    if (emotionOpen) document.addEventListener("mousedown", handler);
-    return () => document.removeEventListener("mousedown", handler);
-  }, [emotionOpen]);
-
  const togglePlay = (audio: GeneratedAudio, e: React.MouseEvent) => {
    e.stopPropagation();
    if (playingId === audio.id) {
@@ -148,7 +125,26 @@ export function GeneratedAudiosPanel({
    { value: "sad", label: "低沉" },
    { value: "angry", label: "严肃" },
  ] as const;
-  const currentEmotionLabel = emotionOptions.find((o) => o.value === emotion)?.label ?? "正常";
+  const currentEmotionLabel = emotionOptions.find((o) => o.value === emotion)?.label ?? "正常";
+  const selectedAudio = generatedAudios.find((audio) => audio.id === selectedAudioId) || null;
+  const filteredAudios = useMemo(() => {
+    const query = audioFilter.trim().toLowerCase();
+    if (!query) return generatedAudios;
+    return generatedAudios.filter((audio) => audio.name.toLowerCase().includes(query));
+  }, [audioFilter, generatedAudios]);
+
+  const handleOpenAudioPopover = useCallback(() => {
+    setAudioFilter("");
+
+    requestAnimationFrame(() => {
+      requestAnimationFrame(() => {
+        const container = audioListContainerRef.current;
+        if (!container) return;
+        const selectedRow = container.querySelector<HTMLElement>("[data-audio-selected='true']");
+        selectedRow?.scrollIntoView({ block: "nearest", behavior: "auto" });
+      });
+    });
+  }, []);

  const content = (
    <>
@@ -156,62 +152,88 @@ export function GeneratedAudiosPanel({
        <>
          {/* Row 1: 语气 + 语速 + 生成配音 (right-aligned) */}
          <div className="flex justify-end items-center gap-1.5 mb-3">
-            {ttsMode === "voiceclone" && (
-              <div ref={emotionRef} className="relative">
-                <button
-                  onClick={() => setEmotionOpen((v) => !v)}
-                  className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1 transition-all"
-                >
-                  语气: {currentEmotionLabel}
-                  <ChevronDown className={`h-3 w-3 transition-transform ${emotionOpen ? "rotate-180" : ""}`} />
-                </button>
-                {emotionOpen && (
-                  <div className="absolute right-0 top-full mt-1 bg-gray-800 border border-white/20 rounded-lg shadow-xl py-1 z-50 min-w-[80px]">
-                    {emotionOptions.map((opt) => (
-                      <button
-                        key={opt.value}
-                        onClick={() => { onEmotionChange(opt.value); setEmotionOpen(false); }}
-                        className={`w-full text-left px-3 py-1.5 text-xs transition-colors ${
-                          emotion === opt.value
-                            ? "bg-purple-600/40 text-purple-200"
-                            : "text-gray-300 hover:bg-white/10"
-                        }`}
-                      >
-                        {opt.label}
-                      </button>
-                    ))}
-                  </div>
-                )}
-              </div>
-            )}
-            {ttsMode === "voiceclone" && (
-              <div ref={speedRef} className="relative">
-                <button
-                  onClick={() => setSpeedOpen((v) => !v)}
-                  className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1 transition-all"
-                >
-                  语速: {currentSpeedLabel}
-                  <ChevronDown className={`h-3 w-3 transition-transform ${speedOpen ? "rotate-180" : ""}`} />
-                </button>
-                {speedOpen && (
-                  <div className="absolute right-0 top-full mt-1 bg-gray-800 border border-white/20 rounded-lg shadow-xl py-1 z-50 min-w-[80px]">
-                    {speedOptions.map((opt) => (
-                      <button
-                        key={opt.value}
-                        onClick={() => { onSpeedChange(opt.value); setSpeedOpen(false); }}
-                        className={`w-full text-left px-3 py-1.5 text-xs transition-colors ${
-                          speed === opt.value
-                            ? "bg-purple-600/40 text-purple-200"
-                            : "text-gray-300 hover:bg-white/10"
-                        }`}
-                      >
-                        {opt.label}
-                      </button>
-                    ))}
-                  </div>
-                )}
-              </div>
-            )}
+            {ttsMode === "voiceclone" && (
+              <SelectPopover
+                sheetTitle="选择语气"
+                trigger={({ open, toggle }) => (
+                  <button
+                    type="button"
+                    onClick={toggle}
+                    className="rounded-lg border border-white/10 bg-black/25 px-2.5 py-1.5 text-xs text-gray-200 whitespace-nowrap flex items-center gap-1 transition-colors hover:border-white/30"
+                  >
+                    语气: {currentEmotionLabel}
+                    <ChevronDown className={`h-3 w-3 transition-transform ${open ? "rotate-180" : ""}`} />
+                  </button>
+                )}
+              >
+                {({ close }) => (
+                  <div className="space-y-1">
+                    {emotionOptions.map((opt) => {
+                      const isSelected = emotion === opt.value;
+                      return (
+                        <button
+                          key={opt.value}
+                          type="button"
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          onClick={() => {
+                            onEmotionChange(opt.value);
+                            close();
+                          }}
+                          className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left text-xs transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20 text-purple-200"
+                            : "border-white/10 bg-white/5 text-gray-300 hover:border-white/30"
+                            }`}
+                        >
+                          {opt.label}
+                          {isSelected && <Check className="h-3.5 w-3.5 text-purple-300" />}
+                        </button>
+                      );
+                    })}
+                  </div>
+                )}
+              </SelectPopover>
+            )}
+            {ttsMode === "voiceclone" && (
+              <SelectPopover
+                sheetTitle="选择语速"
+                trigger={({ open, toggle }) => (
+                  <button
+                    type="button"
+                    onClick={toggle}
+                    className="rounded-lg border border-white/10 bg-black/25 px-2.5 py-1.5 text-xs text-gray-200 whitespace-nowrap flex items-center gap-1 transition-colors hover:border-white/30"
+                  >
+                    语速: {currentSpeedLabel}
+                    <ChevronDown className={`h-3 w-3 transition-transform ${open ? "rotate-180" : ""}`} />
+                  </button>
+                )}
+              >
+                {({ close }) => (
+                  <div className="space-y-1">
+                    {speedOptions.map((opt) => {
+                      const isSelected = speed === opt.value;
+                      return (
+                        <button
+                          key={opt.value}
+                          type="button"
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          onClick={() => {
+                            onSpeedChange(opt.value);
+                            close();
+                          }}
+                          className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left text-xs transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20 text-purple-200"
+                            : "border-white/10 bg-white/5 text-gray-300 hover:border-white/30"
+                            }`}
+                        >
+                          {opt.label}
+                          {isSelected && <Check className="h-3.5 w-3.5 text-purple-300" />}
+                        </button>
+                      );
+                    })}
+                  </div>
+                )}
+              </SelectPopover>
+            )}
            <button
              onClick={onGenerateAudio}
              disabled={isGeneratingAudio || !canGenerate}
@@ -245,62 +267,88 @@ export function GeneratedAudiosPanel({
            配音列表
          </h2>
          <div className="flex gap-1.5">
-            {ttsMode === "voiceclone" && (
-              <div ref={emotionRef} className="relative">
-                <button
-                  onClick={() => setEmotionOpen((v) => !v)}
-                  className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1 transition-all"
-                >
-                  语气: {currentEmotionLabel}
-                  <ChevronDown className={`h-3 w-3 transition-transform ${emotionOpen ? "rotate-180" : ""}`} />
-                </button>
-                {emotionOpen && (
-                  <div className="absolute right-0 top-full mt-1 bg-gray-800 border border-white/20 rounded-lg shadow-xl py-1 z-50 min-w-[80px]">
-                    {emotionOptions.map((opt) => (
-                      <button
-                        key={opt.value}
-                        onClick={() => { onEmotionChange(opt.value); setEmotionOpen(false); }}
-                        className={`w-full text-left px-3 py-1.5 text-xs transition-colors ${
-                          emotion === opt.value
-                            ? "bg-purple-600/40 text-purple-200"
-                            : "text-gray-300 hover:bg-white/10"
-                        }`}
-                      >
-                        {opt.label}
-                      </button>
-                    ))}
-                  </div>
-                )}
-              </div>
-            )}
-            {ttsMode === "voiceclone" && (
-              <div ref={speedRef} className="relative">
-                <button
-                  onClick={() => setSpeedOpen((v) => !v)}
-                  className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1 transition-all"
-                >
-                  语速: {currentSpeedLabel}
-                  <ChevronDown className={`h-3 w-3 transition-transform ${speedOpen ? "rotate-180" : ""}`} />
-                </button>
-                {speedOpen && (
-                  <div className="absolute right-0 top-full mt-1 bg-gray-800 border border-white/20 rounded-lg shadow-xl py-1 z-50 min-w-[80px]">
-                    {speedOptions.map((opt) => (
-                      <button
-                        key={opt.value}
-                        onClick={() => { onSpeedChange(opt.value); setSpeedOpen(false); }}
-                        className={`w-full text-left px-3 py-1.5 text-xs transition-colors ${
-                          speed === opt.value
-                            ? "bg-purple-600/40 text-purple-200"
-                            : "text-gray-300 hover:bg-white/10"
-                        }`}
-                      >
-                        {opt.label}
-                      </button>
-                    ))}
-                  </div>
-                )}
-              </div>
-            )}
+            {ttsMode === "voiceclone" && (
+              <SelectPopover
+                sheetTitle="选择语气"
+                trigger={({ open, toggle }) => (
+                  <button
+                    type="button"
+                    onClick={toggle}
+                    className="rounded-lg border border-white/10 bg-black/25 px-2.5 py-1.5 text-xs text-gray-200 whitespace-nowrap flex items-center gap-1 transition-colors hover:border-white/30"
+                  >
+                    语气: {currentEmotionLabel}
+                    <ChevronDown className={`h-3 w-3 transition-transform ${open ? "rotate-180" : ""}`} />
+                  </button>
+                )}
+              >
+                {({ close }) => (
+                  <div className="space-y-1">
+                    {emotionOptions.map((opt) => {
+                      const isSelected = emotion === opt.value;
+                      return (
+                        <button
+                          key={opt.value}
+                          type="button"
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          onClick={() => {
+                            onEmotionChange(opt.value);
+                            close();
+                          }}
+                          className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left text-xs transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20 text-purple-200"
+                            : "border-white/10 bg-white/5 text-gray-300 hover:border-white/30"
+                            }`}
+                        >
+                          {opt.label}
+                          {isSelected && <Check className="h-3.5 w-3.5 text-purple-300" />}
+                        </button>
+                      );
+                    })}
+                  </div>
+                )}
+              </SelectPopover>
+            )}
+            {ttsMode === "voiceclone" && (
+              <SelectPopover
+                sheetTitle="选择语速"
+                trigger={({ open, toggle }) => (
+                  <button
+                    type="button"
+                    onClick={toggle}
+                    className="rounded-lg border border-white/10 bg-black/25 px-2.5 py-1.5 text-xs text-gray-200 whitespace-nowrap flex items-center gap-1 transition-colors hover:border-white/30"
+                  >
+                    语速: {currentSpeedLabel}
+                    <ChevronDown className={`h-3 w-3 transition-transform ${open ? "rotate-180" : ""}`} />
+                  </button>
+                )}
+              >
+                {({ close }) => (
+                  <div className="space-y-1">
+                    {speedOptions.map((opt) => {
+                      const isSelected = speed === opt.value;
+                      return (
+                        <button
+                          key={opt.value}
+                          type="button"
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          onClick={() => {
+                            onSpeedChange(opt.value);
+                            close();
+                          }}
+                          className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left text-xs transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20 text-purple-200"
+                            : "border-white/10 bg-white/5 text-gray-300 hover:border-white/30"
+                            }`}
+                        >
+                          {opt.label}
+                          {isSelected && <Check className="h-3.5 w-3.5 text-purple-300" />}
+                        </button>
+                      );
+                    })}
+                  </div>
+                )}
+              </SelectPopover>
+            )}
            <button
              onClick={onGenerateAudio}
              disabled={isGeneratingAudio || !canGenerate}
@@ -349,87 +397,142 @@ export function GeneratedAudiosPanel({
      )}

      {/* 配音列表 */}
-      {generatedAudios.length === 0 ? (
-        <div className="text-center py-6 text-gray-400">
-          <p className="text-sm">暂无配音</p>
-          <p className="text-xs mt-1 text-gray-500">点击「生成配音」创建</p>
-        </div>
-      ) : (
-        <div className="space-y-2 max-h-48 sm:max-h-56 overflow-y-auto hide-scrollbar">
-          {generatedAudios.map((audio) => {
-            const isSelected = selectedAudioId === audio.id;
-            return (
-              <div
-                key={audio.id}
-                onClick={() => onSelectAudio(audio)}
-                className={`p-3 rounded-lg border transition-all cursor-pointer flex items-center justify-between group ${
-                  isSelected
-                    ? "border-purple-500 bg-purple-500/20"
-                    : "border-white/10 bg-white/5 hover:border-white/30"
-                }`}
-              >
-                {editingId === audio.id ? (
-                  <div className="flex-1 flex items-center gap-2" onClick={(e) => e.stopPropagation()}>
-                    <input
-                      value={editName}
-                      onChange={(e) => setEditName(e.target.value)}
-                      className="flex-1 bg-black/40 border border-white/20 rounded-md px-2 py-1 text-xs text-white"
-                      autoFocus
-                      onKeyDown={(e) => {
-                        if (e.key === "Enter") saveEditing(audio.id, e as unknown as React.MouseEvent);
-                        if (e.key === "Escape") cancelEditing(e as unknown as React.MouseEvent);
-                      }}
-                    />
-                    <button onClick={(e) => saveEditing(audio.id, e)} className="p-1 text-green-400 hover:text-green-300" title="保存">
-                      <Check className="h-4 w-4" />
-                    </button>
-                    <button onClick={cancelEditing} className="p-1 text-gray-400 hover:text-white" title="取消">
-                      <X className="h-4 w-4" />
-                    </button>
-                  </div>
-                ) : (
-                  <>
-                    <div className="min-w-0 flex-1">
-                      <div className="text-white text-sm truncate">{audio.name}</div>
-                      <div className="text-gray-400 text-xs">{audio.duration_sec.toFixed(1)}s</div>
-                    </div>
-                    <div className="flex items-center gap-1 pl-2 opacity-40 group-hover:opacity-100 transition-opacity">
-                      <button
-                        onClick={(e) => togglePlay(audio, e)}
-                        className="p-1 text-gray-500 hover:text-purple-400 transition-colors"
-                        title={playingId === audio.id ? "暂停" : "播放"}
-                      >
-                        {playingId === audio.id ? (
-                          <Pause className="h-3.5 w-3.5" />
-                        ) : (
-                          <Play className="h-3.5 w-3.5" />
-                        )}
-                      </button>
-                      <button
-                        onClick={(e) => startEditing(audio, e)}
-                        className="p-1 text-gray-500 hover:text-white transition-colors"
-                        title="重命名"
-                      >
-                        <Pencil className="h-3.5 w-3.5" />
-                      </button>
-                      <button
-                        onClick={(e) => {
-                          e.stopPropagation();
-                          onDeleteAudio(audio.id);
-                        }}
-                        className="p-1 text-gray-500 hover:text-red-400 transition-colors"
-                        title="删除"
-                      >
-                        <Trash2 className="h-3.5 w-3.5" />
-                      </button>
-                    </div>
-                  </>
-                )}
-              </div>
-            );
-          })}
-        </div>
-      )}
+      {generatedAudios.length === 0 ? (
+        <div className="text-center py-6 text-gray-400">
+          <p className="text-sm">暂无配音</p>
+          <p className="text-xs mt-1 text-gray-500">点击「生成配音」创建</p>
+        </div>
+      ) : (
+        <SelectPopover
+          sheetTitle="选择配音"
+          onOpen={handleOpenAudioPopover}
+          trigger={({ open, toggle }) => (
+            <button
+              type="button"
+              onClick={toggle}
+              className="w-full rounded-xl border border-white/10 bg-black/25 px-3 py-2.5 text-left transition-colors hover:border-white/30"
+            >
+              <span className="flex items-center justify-between gap-3">
+                <span className="min-w-0">
+                  <span className="block text-xs text-gray-400">当前配音</span>
+                  <span className="mt-0.5 block truncate text-sm text-white">
+                    {selectedAudio ? selectedAudio.name : "请选择配音"}
+                  </span>
+                </span>
+                <ChevronDown className={`h-4 w-4 text-gray-300 transition-transform ${open ? "rotate-180" : ""}`} />
+              </span>
+            </button>
+          )}
+        >
+          {({ close }) => (
+            <div className="space-y-2">
+              <div className="rounded-lg border border-white/10 bg-black/30 px-3 py-2">
+                <div className="flex items-center gap-2">
+                  <Search className="h-4 w-4 text-gray-400" />
+                  <input
+                    type="text"
+                    value={audioFilter}
+                    onChange={(e) => setAudioFilter(e.target.value)}
+                    placeholder="搜索配音..."
+                    className="w-full bg-transparent text-sm text-white placeholder-gray-500 outline-none"
+                  />
+                </div>
+              </div>
+
+              {filteredAudios.length === 0 ? (
+                <div className="py-6 text-center text-sm text-gray-400">没有匹配的配音</div>
+              ) : (
+                <div ref={audioListContainerRef} className="space-y-1" style={{ contentVisibility: "auto" }}>
+                  {filteredAudios.map((audio) => {
+                    const isSelected = selectedAudioId === audio.id;
+                    return (
+                      <div
+                        key={audio.id}
+                        data-popover-selected={isSelected ? "true" : undefined}
+                        data-audio-selected={isSelected ? "true" : "false"}
+                        className={`flex items-center justify-between gap-2 rounded-lg border px-3 py-2 transition-colors ${isSelected
+                          ? "border-purple-500 bg-purple-500/20"
+                          : "border-white/10 bg-white/5 hover:border-white/30"
+                          }`}
+                      >
+                        {editingId === audio.id ? (
+                          <div className="flex-1 flex items-center gap-2" onClick={(e) => e.stopPropagation()}>
+                            <input
+                              value={editName}
+                              onChange={(e) => setEditName(e.target.value)}
+                              className="flex-1 bg-black/40 border border-white/20 rounded-md px-2 py-1 text-xs text-white"
+                              autoFocus
+                              onKeyDown={(e) => {
+                                if (e.key === "Enter") saveEditing(audio.id, e as unknown as React.MouseEvent);
+                                if (e.key === "Escape") cancelEditing(e as unknown as React.MouseEvent);
+                              }}
+                            />
+                            <button type="button" onClick={(e) => saveEditing(audio.id, e)} className="p-1 text-green-400 hover:text-green-300" title="保存">
+                              <Check className="h-4 w-4" />
+                            </button>
+                            <button type="button" onClick={cancelEditing} className="p-1 text-gray-400 hover:text-white" title="取消">
+                              <X className="h-4 w-4" />
+                            </button>
+                          </div>
+                        ) : (
+                          <button
+                            type="button"
+                            onClick={() => {
+                              onSelectAudio(audio);
+                              close();
+                            }}
+                            className="min-w-0 flex-1 text-left"
+                          >
+                            <span className="block truncate text-sm text-white">{audio.name}</span>
+                            <span className="mt-0.5 block text-xs text-gray-400">{audio.duration_sec.toFixed(1)}s</span>
+                          </button>
+                        )}
+
+                        {editingId !== audio.id && (
+                          <div className="flex items-center gap-1 pl-2">
+                            <button
+                              type="button"
+                              onClick={(e) => togglePlay(audio, e)}
+                              className="p-1 text-gray-400 hover:text-purple-300 transition-colors"
+                              title={playingId === audio.id ? "暂停" : "播放"}
+                            >
+                              {playingId === audio.id ? (
+                                <Pause className="h-3.5 w-3.5" />
+                              ) : (
+                                <Play className="h-3.5 w-3.5" />
+                              )}
+                            </button>
+                            <button
+                              type="button"
+                              onClick={(e) => startEditing(audio, e)}
+                              className="p-1 text-gray-400 hover:text-white transition-colors"
+                              title="重命名"
+                            >
+                              <Pencil className="h-3.5 w-3.5" />
+                            </button>
+                            <button
+                              type="button"
+                              onClick={(e) => {
+                                e.stopPropagation();
+                                onDeleteAudio(audio.id);
+                              }}
+                              className="p-1 text-gray-400 hover:text-red-400 transition-colors"
+                              title="删除"
+                            >
+                              <Trash2 className="h-3.5 w-3.5" />
+                            </button>
+                            {isSelected && <Check className="h-3.5 w-3.5 text-purple-300" />}
+                          </div>
+                        )}
+                      </div>
+                    );
+                  })}
+                </div>
+              )}
+            </div>
+          )}
+        </SelectPopover>
+      )}
    </>
  );

--- a/frontend/src/features/home/ui/HistoryList.tsx
+++ b/frontend/src/features/home/ui/HistoryList.tsx
@@ -1,4 +1,6 @@
-import { RefreshCw, Trash2 } from "lucide-react";
+import { useCallback, useMemo, useRef, useState } from "react";
+import { RefreshCw, Trash2, Search, ChevronDown, Check } from "lucide-react";
+import { SelectPopover } from "@/shared/ui/SelectPopover";

 interface GeneratedVideo {
  id: string;
@@ -29,6 +31,29 @@ export function HistoryList({
  formatDate,
  embedded = false,
 }: HistoryListProps) {
+  const [videoFilter, setVideoFilter] = useState("");
+  const videoListContainerRef = useRef<HTMLDivElement | null>(null);
+
+  const selectedVideo = generatedVideos.find((v) => v.id === selectedVideoId) || null;
+  const filteredVideos = useMemo(() => {
+    const query = videoFilter.trim().toLowerCase();
+    if (!query) return generatedVideos;
+    return generatedVideos.filter((v) => formatDate(v.created_at).toLowerCase().includes(query));
+  }, [generatedVideos, videoFilter, formatDate]);
+
+  const handleOpenVideoPopover = useCallback(() => {
+    setVideoFilter("");
+
+    requestAnimationFrame(() => {
+      requestAnimationFrame(() => {
+        const container = videoListContainerRef.current;
+        if (!container) return;
+        const selectedRow = container.querySelector<HTMLElement>("[data-video-selected='true']");
+        selectedRow?.scrollIntoView({ block: "nearest", behavior: "auto" });
+      });
+    });
+  }, []);
+
  const content = (
    <>
      {!embedded && (
@@ -48,36 +73,98 @@ export function HistoryList({
          <p>暂无生成的作品</p>
        </div>
      ) : (
-        <div
-          className="space-y-2 max-h-64 overflow-y-auto hide-scrollbar"
-          style={{ contentVisibility: 'auto' }}
-        >
-          {generatedVideos.map((v) => (
-            <div
-              key={v.id}
-              ref={(el) => registerVideoRef(v.id, el)}
-              className={`p-3 rounded-lg border transition-all flex items-center justify-between group ${selectedVideoId === v.id
-                ? "border-purple-500 bg-purple-500/20"
-                : "border-white/10 bg-white/5 hover:border-white/30"
-                }`}
+        <SelectPopover
+          sheetTitle="选择作品"
+          onOpen={handleOpenVideoPopover}
+          trigger={({ open, toggle }) => (
+            <button
+              type="button"
+              onClick={toggle}
+              className="w-full rounded-xl border border-white/10 bg-black/25 px-3 py-2.5 text-left transition-colors hover:border-white/30"
            >
-              <button onClick={() => onSelectVideo(v)} className="flex-1 text-left">
-                <div className="text-white text-sm truncate">{formatDate(v.created_at)}</div>
-                <div className="text-gray-400 text-xs">{v.size_mb.toFixed(1)} MB</div>
-              </button>
-              <button
-                onClick={(e) => {
-                  e.stopPropagation();
-                  onDeleteVideo(v.id);
-                }}
-                className="p-1 text-gray-500 hover:text-red-400 opacity-40 group-hover:opacity-100 transition-opacity"
-                title="删除视频"
-              >
-                <Trash2 className="h-4 w-4" />
-              </button>
+              <span className="flex items-center justify-between gap-3">
+                <span className="min-w-0">
+                  <span className="block text-xs text-gray-400">当前作品</span>
+                  <span className="mt-0.5 block truncate text-sm text-white">
+                    {selectedVideo ? formatDate(selectedVideo.created_at) : "请选择作品"}
+                  </span>
+                </span>
+                <ChevronDown className={`h-4 w-4 text-gray-300 transition-transform ${open ? "rotate-180" : ""}`} />
+              </span>
+            </button>
+          )}
+        >
+          {({ close }) => (
+            <div className="space-y-2">
+              <div className="rounded-lg border border-white/10 bg-black/30 px-3 py-2">
+                <div className="flex items-center gap-2">
+                  <Search className="h-4 w-4 text-gray-400" />
+                  <input
+                    type="text"
+                    value={videoFilter}
+                    onChange={(e) => setVideoFilter(e.target.value)}
+                    placeholder="搜索作品..."
+                    className="w-full bg-transparent text-sm text-white placeholder-gray-500 outline-none"
+                  />
+                </div>
+              </div>
+
+              {filteredVideos.length === 0 ? (
+                <div className="py-6 text-center text-sm text-gray-400">没有匹配的作品</div>
+              ) : (
+                <div
+                  ref={videoListContainerRef}
+                  className="space-y-1"
+                  style={{ contentVisibility: "auto" }}
+                >
+                  {filteredVideos.map((v) => {
+                    const isSelected = selectedVideoId === v.id;
+
+                    return (
+                      <div
+                        key={v.id}
+                        ref={(el) => registerVideoRef(v.id, el)}
+                        data-popover-selected={isSelected ? "true" : undefined}
+                        data-video-selected={isSelected ? "true" : "false"}
+                        className={`flex items-center justify-between gap-2 rounded-lg border px-3 py-2 transition-colors ${isSelected
+                          ? "border-purple-500 bg-purple-500/20"
+                          : "border-white/10 bg-white/5 hover:border-white/30"
+                          }`}
+                      >
+                        <button
+                          type="button"
+                          onClick={() => {
+                            onSelectVideo(v);
+                            close();
+                          }}
+                          className="min-w-0 flex-1 text-left"
+                        >
+                          <span className="block truncate text-sm text-white">{formatDate(v.created_at)}</span>
+                          <span className="mt-0.5 block text-xs text-gray-400">{v.size_mb.toFixed(1)} MB</span>
+                        </button>
+
+                        <div className="flex items-center gap-2 pl-2">
+                          <button
+                            type="button"
+                            onClick={(e) => {
+                              e.stopPropagation();
+                              onDeleteVideo(v.id);
+                            }}
+                            className="p-1 text-gray-400 hover:text-red-400"
+                            title="删除视频"
+                          >
+                            <Trash2 className="h-4 w-4" />
+                          </button>
+                          {isSelected && <Check className="h-4 w-4 text-purple-300" />}
+                        </div>
+                      </div>
+                    );
+                  })}
+                </div>
+              )}
            </div>
-          ))}
-        </div>
+          )}
+        </SelectPopover>
      )}
    </>
  );
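
Review note on the `HistoryList.tsx` hunks above: the new search box matches against the formatted creation date only, since the list rows show no other text besides the file size. The predicate inside the `useMemo` can be sketched as a standalone function. This is a minimal sketch: `VideoLike`, `filterVideosByDate`, and `demoFormat` are hypothetical names used for illustration, and the real `formatDate` is supplied through the component props.

```typescript
// Hypothetical, reduced shape of GeneratedVideo: only the fields used here.
interface VideoLike {
  id: string;
  created_at: string;
}

// Mirrors the useMemo body in HistoryList: a blank query returns the input
// list unchanged; otherwise rows are matched case-insensitively against the
// formatted creation date.
function filterVideosByDate<T extends VideoLike>(
  videos: T[],
  rawQuery: string,
  formatDate: (value: string) => string,
): T[] {
  const query = rawQuery.trim().toLowerCase();
  if (!query) return videos;
  return videos.filter((v) => formatDate(v.created_at).toLowerCase().includes(query));
}

// Stand-in formatter for illustration only (assumption, not the app's formatter).
const demoFormat = (iso: string): string => iso.slice(0, 10);

const demoVideos: VideoLike[] = [
  { id: "a", created_at: "2026-03-04T14:07:54+08:00" },
  { id: "b", created_at: "2026-03-10T10:59:38+08:00" },
];

const matched = filterVideosByDate(demoVideos, " 03-10 ", demoFormat);
console.log(matched.map((v) => v.id));
```

Because the match runs on the output of `formatDate`, what a user can search for depends entirely on that formatter's output format.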
--- a/frontend/src/features/home/ui/HomePage.tsx
+++ b/frontend/src/features/home/ui/HomePage.tsx
@@ -5,9 +5,11 @@ import { useRouter } from "next/navigation";
 import { RefreshCw } from "lucide-react";
 import VideoPreviewModal from "@/components/VideoPreviewModal";
 import ScriptExtractionModal from "./ScriptExtractionModal";
+import ScriptLearningModal from "./ScriptLearningModal";
 import RewriteModal from "./RewriteModal";
 import { useHomeController } from "@/features/home/model/useHomeController";
 import { resolveMediaUrl } from "@/shared/lib/media";
+import { toast } from "sonner";
 import { BgmPanel } from "@/features/home/ui/BgmPanel";
 import { GenerateActionBar } from "@/features/home/ui/GenerateActionBar";
 import { HistoryList } from "@/features/home/ui/HistoryList";
@@ -53,6 +55,8 @@ export function HomePage() {
    setText,
    extractModalOpen,
    setExtractModalOpen,
+    learningModalOpen,
+    setLearningModalOpen,
    rewriteModalOpen,
    setRewriteModalOpen,
    handleGenerateMeta,
@@ -132,6 +136,7 @@ export function HomePage() {
    startRecording,
    stopRecording,
    useRecording,
+    discardRecording,
    formatRecordingTime,
    bgmList,
    bgmLoading,
@@ -143,8 +148,6 @@ export function HomePage() {
    setSelectedBgmId,
    playingBgmId,
    toggleBgmPreview,
-    bgmVolume,
-    setBgmVolume,
    bgmListContainerRef,
    registerBgmItemRef,
    currentTask,
@@ -172,9 +175,19 @@ export function HomePage() {
    setSpeed,
    emotion,
    setEmotion,
-    timelineSegments,
-    reorderSegments,
-    setSourceRange,
+    // Multi-camera timeline
+    inserts,
+    timelinePrimaryMaterial,
+    primarySourceStart,
+    primarySourceEnd,
+    insertCandidates,
+    addInsert,
+    removeInsert,
+    moveInsert,
+    resizeInsert,
+    setInsertSourceRange,
+    setPrimarySourceRange,
+    handleSetPrimary,
    clipTrimmerOpen,
    setClipTrimmerOpen,
    clipTrimmerSegmentId,
@@ -199,10 +212,27 @@ export function HomePage() {
    return () => clearTimeout(timer);
  }, []);

-  const clipTrimmerSegment = useMemo(
-    () => timelineSegments.find((s) => s.id === clipTrimmerSegmentId) ?? null,
-    [timelineSegments, clipTrimmerSegmentId]
-  );
+  // ClipTrimmer: construct segment from either primary or an insert
+  const clipTrimmerSegment = useMemo(() => {
+    if (!clipTrimmerSegmentId) return null;
+    // Check if it's the primary material
+    if (clipTrimmerSegmentId === "primary" && timelinePrimaryMaterial) {
+      return {
+        id: "primary",
+        materialId: timelinePrimaryMaterial.id,
+        materialName: timelinePrimaryMaterial.scene || timelinePrimaryMaterial.name || "",
+        start: 0,
+        end: selectedAudio?.duration_sec ?? 0,
+        sourceStart: primarySourceStart,
+        sourceEnd: primarySourceEnd,
+        color: "#8b5cf6",
+      };
+    }
+    // Check inserts
+    const insert = inserts.find((i) => i.id === clipTrimmerSegmentId);
+    if (insert) return insert;
+    return null;
+  }, [clipTrimmerSegmentId, timelinePrimaryMaterial, inserts, selectedAudio, primarySourceStart, primarySourceEnd]);

  const clipTrimmerMaterialUrl = useMemo(() => {
    if (!clipTrimmerSegment) return null;
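
Review note on the `clipTrimmerSegment` hunk above: the memo now resolves a segment from two sources, the reserved id `"primary"` or an entry in `inserts`. The same lookup can be sketched as a pure function, which makes the fallthrough cases easier to check. This is a sketch under assumptions: the `Segment` and `PrimaryMaterial` shapes are reduced, hypothetical versions of the real types, and `resolveTrimmerSegment` is an illustrative name that does not appear in the diff.

```typescript
// Hypothetical, reduced shapes; the real types live in the home feature model.
interface Segment {
  id: string;
  materialId: string;
  materialName: string;
  start: number;
  end: number;
  sourceStart: number;
  sourceEnd: number;
  color: string;
}

interface PrimaryMaterial {
  id: string;
  name?: string;
  scene?: string;
}

// Same branching as the useMemo in HomePage: the primary segment is synthesized
// on the fly and spans the whole audio; anything else is looked up in inserts.
function resolveTrimmerSegment(
  segmentId: string | null,
  primary: PrimaryMaterial | null,
  inserts: Segment[],
  audioDurationSec: number,
  primarySourceStart: number,
  primarySourceEnd: number,
): Segment | null {
  if (!segmentId) return null;
  if (segmentId === "primary" && primary) {
    return {
      id: "primary",
      materialId: primary.id,
      materialName: primary.scene || primary.name || "",
      start: 0,
      end: audioDurationSec,
      sourceStart: primarySourceStart,
      sourceEnd: primarySourceEnd,
      color: "#8b5cf6",
    };
  }
  return inserts.find((i) => i.id === segmentId) ?? null;
}

const demoInserts: Segment[] = [
  {
    id: "ins-1",
    materialId: "m2",
    materialName: "B-roll",
    start: 2,
    end: 5,
    sourceStart: 0,
    sourceEnd: 3,
    color: "#22c55e",
  },
];

const primarySeg = resolveTrimmerSegment("primary", { id: "m1", name: "Main" }, demoInserts, 12.5, 1, 9);
console.log(primarySeg?.end, primarySeg?.materialName);
```

One consequence of this shape is that `"primary"` acts as a reserved id: if no primary material is set, the lookup falls through to `inserts`, so insert ids should never equal that literal.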
@@ -223,9 +253,8 @@ export function HomePage() {
              text={text}
              onChangeText={setText}
              onOpenExtractModal={() => setExtractModalOpen(true)}
+              onOpenLearningModal={() => setLearningModalOpen(true)}
              onOpenRewriteModal={() => setRewriteModalOpen(true)}
-              onGenerateMeta={handleGenerateMeta}
-              isGeneratingMeta={isGeneratingMeta}
              onTranslate={handleTranslate}
              isTranslating={isTranslating}
              hasOriginalText={originalText !== null}
@@ -237,7 +266,7 @@ export function HomePage() {
            />

            {/* 二、配音 */}
-            <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
+            <div className="relative z-20 bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
              <h2 className="text-base sm:text-lg font-semibold text-white mb-4">
                二、配音
              </h2>
@@ -276,6 +305,7 @@ export function HomePage() {
                    onStartRecording={startRecording}
                    onStopRecording={stopRecording}
                    onUseRecording={useRecording}
+                    onDiscardRecording={discardRecording}
                    formatRecordingTime={formatRecordingTime}
                  />
                )}
@@ -331,6 +361,7 @@ export function HomePage() {
                onDeleteMaterial={deleteMaterial}
                onClearUploadError={() => setUploadError(null)}
                registerMaterialRef={registerMaterialRef}
+                onSetPrimary={handleSetPrimary}
              />
              <div className="border-t border-white/10 my-4" />
              <div className="relative">
@@ -345,15 +376,28 @@ export function HomePage() {
                  embedded
                  audioDuration={selectedAudio?.duration_sec ?? 0}
                  audioUrl={selectedAudio ? (resolveMediaUrl(selectedAudio.path) || "") : ""}
-                  segments={timelineSegments}
-                  materials={materials}
-                  outputAspectRatio={outputAspectRatio}
-                  onOutputAspectRatioChange={setOutputAspectRatio}
-                  onReorderSegment={reorderSegments}
-                  onClickSegment={(seg) => {
-                    setClipTrimmerSegmentId(seg.id);
+                  primaryMaterial={timelinePrimaryMaterial}
+                  inserts={inserts}
+                  insertCandidates={insertCandidates}
+                  onAddInsert={(materialId) => {
+                    const result = addInsert(materialId);
+                    if (result === "limit") toast.error("最多添加 10 个插入片段");
+                    else if (result === "no_space") toast.error("时间轴空间不足，无法再添加插入");
+                  }}
+                  onRemoveInsert={removeInsert}
+                  onMoveInsert={moveInsert}
+                  onClickInsert={(insert) => {
+                    setClipTrimmerSegmentId(insert.id);
                    setClipTrimmerOpen(true);
                  }}
+                  onClickPrimary={() => {
+                    setClipTrimmerSegmentId("primary");
+                    setClipTrimmerOpen(true);
+                  }}
+                  primarySourceStart={primarySourceStart}
+                  primarySourceEnd={primarySourceEnd}
+                  outputAspectRatio={outputAspectRatio}
+                  onOutputAspectRatioChange={setOutputAspectRatio}
                />
              </div>
            </div>
@@ -362,6 +406,9 @@ export function HomePage() {
            <TitleSubtitlePanel
              showStylePreview={showStylePreview}
              onTogglePreview={() => setShowStylePreview((prev) => !prev)}
+              onGenerateMeta={handleGenerateMeta}
+              isGeneratingMeta={isGeneratingMeta}
+              canGenerateMeta={!!text.trim()}
              videoTitle={videoTitle}
              onTitleChange={titleInput.handleChange}
              onTitleCompositionStart={titleInput.handleCompositionStart}
@@ -421,8 +468,6 @@ export function HomePage() {
              onSelectBgm={setSelectedBgmId}
              playingBgmId={playingBgmId}
              onTogglePreview={toggleBgmPreview}
-              bgmVolume={bgmVolume}
-              onVolumeChange={setBgmVolume}
              bgmListContainerRef={bgmListContainerRef}
              registerBgmItemRef={registerBgmItemRef}
            />
@@ -490,6 +535,7 @@ export function HomePage() {
                currentTask={null}
                isGenerating={false}
                generatedVideo={generatedVideo}
+                generatedVideoId={selectedVideoId}
              />
            </div>
          </div>
@@ -514,13 +560,28 @@ export function HomePage() {
        onApply={(newText) => setText(newText)}
      />

+      <ScriptLearningModal
+        isOpen={learningModalOpen}
+        onClose={() => setLearningModalOpen(false)}
+        onApply={(nextText) => setText(nextText)}
+      />
+
      <ClipTrimmer
        isOpen={clipTrimmerOpen}
        segment={clipTrimmerSegment}
        materialUrl={clipTrimmerMaterialUrl}
        onConfirm={(sourceStart, sourceEnd) => {
-          if (clipTrimmerSegmentId) {
-            setSourceRange(clipTrimmerSegmentId, sourceStart, sourceEnd);
+          if (clipTrimmerSegmentId === "primary") {
+            setPrimarySourceRange(sourceStart, sourceEnd);
+          } else if (clipTrimmerSegmentId) {
+            setInsertSourceRange(clipTrimmerSegmentId, sourceStart, sourceEnd);
+            // Sync timeline duration to match trimmed source duration
+            if (sourceEnd > sourceStart) {
+              const ins = inserts.find((i) => i.id === clipTrimmerSegmentId);
+              if (ins) {
+                resizeInsert(clipTrimmerSegmentId, ins.start + (sourceEnd - sourceStart));
+              }
+            }
          }
          setClipTrimmerOpen(false);
        }}
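
Review note: the `onConfirm` change above keeps an insert's timeline length equal to its trimmed source length. The rule can be sketched as a pure function, which makes it easy to unit-test outside the component. `InsertLike` and `syncedInsertEnd` are hypothetical names, not part of this PR. The sketch also assumes, from the call site only, that the second argument of `resizeInsert` is the new timeline end.

```typescript
// Hypothetical shape; the real insert type in the PR may carry more fields.
interface InsertLike {
  id: string;
  start: number; // timeline start, in seconds
  end: number; // timeline end, in seconds
}

// Returns the timeline end an insert should have after trimming,
// or null when no resize applies (empty range or unknown id).
function syncedInsertEnd(
  inserts: InsertLike[],
  insertId: string,
  sourceStart: number,
  sourceEnd: number,
): number | null {
  // Mirrors the `sourceEnd > sourceStart` guard in the handler.
  if (sourceEnd <= sourceStart) return null;
  const ins = inserts.find((i) => i.id === insertId);
  // New end = unchanged timeline start + trimmed source duration.
  return ins ? ins.start + (sourceEnd - sourceStart) : null;
}
```

Under these assumptions the handler body would reduce to calling `resizeInsert(id, end)` whenever the function returns a non-null `end`.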
--- a/frontend/src/features/home/ui/MaterialSelector.tsx
+++ b/frontend/src/features/home/ui/MaterialSelector.tsx
@@ -1,6 +1,7 @@
-import { type ChangeEvent, type MouseEvent, useMemo } from "react";
-import { Upload, RefreshCw, Eye, Trash2, X, Pencil, Check } from "lucide-react";
+import { type ChangeEvent, type MouseEvent, useCallback, useMemo, useRef, useState } from "react";
+import { Upload, RefreshCw, Eye, Trash2, X, Pencil, Check, Search, ChevronDown, Crown } from "lucide-react";
 import type { Material } from "@/shared/types/material";
+import { SelectPopover } from "@/shared/ui/SelectPopover";

 interface MaterialSelectorProps {
  materials: Material[];
@@ -25,6 +26,7 @@ interface MaterialSelectorProps {
  onDeleteMaterial: (id: string) => void;
  onClearUploadError: () => void;
  registerMaterialRef: (id: string, element: HTMLDivElement | null) => void;
+  onSetPrimary?: (materialId: string) => void;
  embedded?: boolean;
 }

@@ -51,10 +53,49 @@ export function MaterialSelector({
  onDeleteMaterial,
  onClearUploadError,
  registerMaterialRef,
+  onSetPrimary,
  embedded = false,
 }: MaterialSelectorProps) {
+  const [materialFilter, setMaterialFilter] = useState("");
+  const materialListContainerRef = useRef<HTMLDivElement | null>(null);
  const selectedSet = useMemo(() => new Set(selectedMaterials), [selectedMaterials]);
  const isFull = selectedMaterials.length >= 4;
+  const selectedMaterialItems = useMemo(
+    () => selectedMaterials.map((id) => materials.find((m) => m.id === id)).filter((m): m is Material => Boolean(m)),
+    [materials, selectedMaterials],
+  );
+  const filteredMaterials = useMemo(() => {
+    const query = materialFilter.trim().toLowerCase();
+    if (!query) return materials;
+    return materials.filter((m) => (m.scene || m.name).toLowerCase().includes(query));
+  }, [materialFilter, materials]);
+
+  const selectedSummary = useMemo(() => {
+    if (selectedMaterialItems.length === 0) {
+      return "请选择素材（最多4个）";
+    }
+    const names = selectedMaterialItems
+      .slice(0, 2)
+      .map((m) => m.scene || m.name)
+      .join("、");
+    if (selectedMaterialItems.length > 2) {
+      return `${names} +${selectedMaterialItems.length - 2}`;
+    }
+    return names;
+  }, [selectedMaterialItems]);
+
+  const handleOpenMaterialPopover = useCallback(() => {
+    setMaterialFilter("");
+
+    requestAnimationFrame(() => {
+      requestAnimationFrame(() => {
+        const container = materialListContainerRef.current;
+        if (!container) return;
+        const selectedRow = container.querySelector<HTMLElement>("[data-material-selected='true']");
+        selectedRow?.scrollIntoView({ block: "nearest", behavior: "auto" });
+      });
+    });
+  }, []);

  const content = (
    <>
@@ -151,100 +192,167 @@ export function MaterialSelector({
          </p>
        </div>
      ) : (
-        <div
-          className="space-y-2 max-h-48 sm:max-h-64 overflow-y-auto hide-scrollbar"
-          style={{ contentVisibility: 'auto' }}
+        <SelectPopover
+          sheetTitle="选择视频素材"
+          onOpen={handleOpenMaterialPopover}
+          trigger={({ open, toggle }) => (
+            <button
+              type="button"
+              onClick={toggle}
+              className="w-full rounded-xl border border-white/10 bg-black/25 px-3 py-2.5 text-left transition-colors hover:border-white/30"
+            >
+              <span className="flex items-center justify-between gap-3">
+                <span className="min-w-0">
+                  <span className="block text-xs text-gray-400">已选 {selectedMaterials.length}/4 个素材</span>
+                  <span className="mt-0.5 block truncate text-sm text-white">{selectedSummary}</span>
+                </span>
+                <ChevronDown className={`h-4 w-4 text-gray-300 transition-transform ${open ? "rotate-180" : ""}`} />
+              </span>
+            </button>
+          )}
        >
-          {materials.map((m) => {
-            const isSelected = selectedSet.has(m.id);
-            return (
-              <div
-                key={m.id}
-                ref={(el) => registerMaterialRef(m.id, el)}
-                className={`p-3 rounded-lg border transition-all flex items-center justify-between group ${isSelected
-                  ? "border-purple-500 bg-purple-500/20"
-                  : isFull
-                    ? "border-white/5 bg-white/[0.02] opacity-50 cursor-not-allowed"
-                    : "border-white/10 bg-white/5 hover:border-white/30"
-                  }`}
-              >
-                {editingMaterialId === m.id ? (
-                  <div className="flex-1 flex items-center gap-2" onClick={(e) => e.stopPropagation()}>
-                    <input
-                      value={editMaterialName}
-                      onChange={(e) => onEditNameChange(e.target.value)}
-                      className="flex-1 bg-black/40 border border-white/20 rounded-md px-2 py-1 text-xs text-white"
-                      autoFocus
-                    />
-                    <button
-                      onClick={(e) => onSaveEditing(m.id, e)}
-                      className="p-1 text-green-400 hover:text-green-300"
-                      title="保存"
-                    >
-                      <Check className="h-4 w-4" />
-                    </button>
-                    <button
-                      onClick={onCancelEditing}
-                      className="p-1 text-gray-400 hover:text-white"
-                      title="取消"
-                    >
-                      <X className="h-4 w-4" />
-                    </button>
-                  </div>
-                ) : (
-                  <button onClick={() => onToggleMaterial(m.id)} disabled={isFull && !isSelected} className="flex-1 text-left flex items-center gap-2">
-                    {/* 复选框 */}
-                    <span
-                      className={`flex-shrink-0 w-4 h-4 rounded border flex items-center justify-center text-[10px] ${isSelected
-                        ? "border-purple-500 bg-purple-500 text-white"
-                        : "border-white/30 text-transparent"
-                        }`}
-                    >
-                      {isSelected ? "✓" : ""}
-                    </span>
-                    <div className="min-w-0">
-                      <div className="text-white text-sm truncate">{m.scene || m.name}</div>
-                      <div className="text-gray-400 text-xs">{m.size_mb.toFixed(1)} MB</div>
-                    </div>
-                  </button>
-                )}
-                <div className="flex items-center gap-2 pl-2">
-                  <button
-                    onClick={(e) => {
-                      e.stopPropagation();
-                      if (m.path) {
-                        onPreviewMaterial(m.path);
-                      }
-                    }}
-                    className="p-1 text-gray-500 hover:text-white opacity-40 group-hover:opacity-100 transition-opacity"
-                    title="预览视频"
-                  >
-                    <Eye className="h-4 w-4" />
-                  </button>
-                  {editingMaterialId !== m.id && (
-                    <button
-                      onClick={(e) => onStartEditing(m, e)}
-                      className="p-1 text-gray-500 hover:text-white opacity-40 group-hover:opacity-100 transition-opacity"
-                      title="重命名"
-                    >
-                      <Pencil className="h-4 w-4" />
-                    </button>
-                  )}
-                  <button
-                    onClick={(e) => {
-                      e.stopPropagation();
-                      onDeleteMaterial(m.id);
-                    }}
-                    className="p-1 text-gray-500 hover:text-red-400 opacity-40 group-hover:opacity-100 transition-opacity"
-                    title="删除素材"
-                  >
-                    <Trash2 className="h-4 w-4" />
-                  </button>
+          {() => (
+            <div className="space-y-2">
+              <div className="rounded-lg border border-white/10 bg-black/30 px-3 py-2">
+                <div className="flex items-center gap-2">
+                  <Search className="h-4 w-4 text-gray-400" />
+                  <input
+                    type="text"
+                    value={materialFilter}
+                    onChange={(e) => setMaterialFilter(e.target.value)}
+                    placeholder="搜索素材名称..."
+                    className="w-full bg-transparent text-sm text-white placeholder-gray-500 outline-none"
+                  />
                </div>
              </div>
-            );
-          })}
-        </div>
+
+              {filteredMaterials.length === 0 ? (
+                <div className="py-6 text-center text-sm text-gray-400">没有匹配的素材</div>
+              ) : (
+                <div
+                  ref={materialListContainerRef}
+                  className="space-y-1"
+                  style={{ contentVisibility: "auto" }}
+                >
+                  {filteredMaterials.map((m) => {
+                    const isSelected = selectedSet.has(m.id);
+
+                    return (
+                      <div
+                        key={m.id}
+                        ref={(el) => registerMaterialRef(m.id, el)}
+                        data-popover-selected={isSelected ? "true" : undefined}
+                        data-material-selected={isSelected ? "true" : "false"}
+                        className={`flex items-center justify-between gap-2 rounded-lg border px-3 py-2 transition-colors ${isSelected
+                          ? "border-purple-500 bg-purple-500/20"
+                          : isFull
+                            ? "border-white/5 bg-white/[0.02] opacity-50"
+                            : "border-white/10 bg-white/5 hover:border-white/30"
+                          }`}
+                      >
+                        {editingMaterialId === m.id ? (
+                          <div className="flex-1 flex items-center gap-2" onClick={(e) => e.stopPropagation()}>
+                            <input
+                              value={editMaterialName}
+                              onChange={(e) => onEditNameChange(e.target.value)}
+                              className="flex-1 rounded-md border border-white/20 bg-black/40 px-2 py-1 text-xs text-white"
+                              autoFocus
+                            />
+                            <button
+                              type="button"
+                              onClick={(e) => onSaveEditing(m.id, e)}
+                              className="p-1 text-green-400 hover:text-green-300"
+                              title="保存"
+                            >
+                              <Check className="h-4 w-4" />
+                            </button>
+                            <button
+                              type="button"
+                              onClick={onCancelEditing}
+                              className="p-1 text-gray-400 hover:text-white"
+                              title="取消"
+                            >
+                              <X className="h-4 w-4" />
+                            </button>
+                          </div>
+                        ) : (
+                          <button
+                            type="button"
+                            onClick={() => onToggleMaterial(m.id)}
+                            disabled={isFull && !isSelected}
+                            className="min-w-0 flex-1 text-left"
+                          >
+                            <span className="flex items-center gap-1.5">
+                              <span className="block truncate text-sm text-white">{m.scene || m.name}</span>
+                              {isSelected && selectedMaterials[0] === m.id && selectedMaterials.length > 1 && (
+                                <span className="shrink-0 text-[9px] px-1 py-0.5 rounded bg-purple-500/30 text-purple-300 border border-purple-500/40">主素材</span>
+                              )}
+                              {isSelected && selectedMaterials[0] !== m.id && (
+                                <span className="shrink-0 text-[9px] px-1 py-0.5 rounded bg-white/10 text-gray-400 border border-white/10">可插入</span>
+                              )}
+                            </span>
+                            <span className="mt-0.5 block text-xs text-gray-400">{m.size_mb.toFixed(1)} MB</span>
+                          </button>
+                        )}
+
+                        <div className="flex items-center gap-2 pl-2">
+                          {isSelected && selectedMaterials[0] !== m.id && onSetPrimary && (
+                            <button
+                              type="button"
+                              onClick={(e) => {
+                                e.stopPropagation();
+                                onSetPrimary(m.id);
+                              }}
+                              className="p-1 text-gray-400 hover:text-amber-300"
+                              title="设为主素材"
+                            >
+                              <Crown className="h-4 w-4" />
+                            </button>
+                          )}
+                          <button
+                            type="button"
+                            onClick={(e) => {
+                              e.stopPropagation();
+                              if (m.path) {
+                                onPreviewMaterial(m.path);
+                              }
+                            }}
+                            className="p-1 text-gray-400 hover:text-purple-300"
+                            title="预览视频"
+                          >
+                            <Eye className="h-4 w-4" />
+                          </button>
+                          {editingMaterialId !== m.id && (
+                            <button
+                              type="button"
+                              onClick={(e) => onStartEditing(m, e)}
+                              className="p-1 text-gray-400 hover:text-white"
+                              title="重命名"
+                            >
+                              <Pencil className="h-4 w-4" />
+                            </button>
+                          )}
+                          <button
+                            type="button"
+                            onClick={(e) => {
+                              e.stopPropagation();
+                              onDeleteMaterial(m.id);
+                            }}
+                            className="p-1 text-gray-400 hover:text-red-400"
+                            title="删除素材"
+                          >
+                            <Trash2 className="h-4 w-4" />
+                          </button>
+                          {isSelected && <Check className="h-4 w-4 text-purple-300" />}
+                        </div>
+                      </div>
+                    );
+                  })}
+                </div>
+              )}
+            </div>
+          )}
+        </SelectPopover>
      )}
    </>
  );
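
Review note: the two `useMemo` blocks added to `MaterialSelector` (`filteredMaterials` and `selectedSummary`) are pure derivations, so they can be sketched as standalone helpers for testing. `MaterialLike`, `filterMaterials`, and `summarizeSelection` are hypothetical names introduced here for illustration. The real `Material` type lives in `@/shared/types/material` and may differ.

```typescript
// Hypothetical minimal shape; only the fields the two derivations read.
interface MaterialLike {
  id: string;
  name: string;
  scene?: string;
}

// Case-insensitive substring match on the display label (scene, else name).
function filterMaterials(materials: MaterialLike[], rawQuery: string): MaterialLike[] {
  const query = rawQuery.trim().toLowerCase();
  if (!query) return materials;
  return materials.filter((m) => (m.scene || m.name).toLowerCase().includes(query));
}

// Shows the first two labels, then "+N" for the remaining selections.
function summarizeSelection(selected: MaterialLike[], emptyLabel: string): string {
  if (selected.length === 0) return emptyLabel;
  const names = selected
    .slice(0, 2)
    .map((m) => m.scene || m.name)
    .join("、");
  return selected.length > 2 ? `${names} +${selected.length - 2}` : names;
}
```

Note that a material with a non-empty `scene` is matched on `scene` only, so searching by its file name returns nothing. This mirrors the diff as written.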
--- a/frontend/src/features/home/ui/PreviewPanel.tsx
+++ b/frontend/src/features/home/ui/PreviewPanel.tsx
@@ -12,6 +12,7 @@ interface PreviewPanelProps {
  currentTask: Task | null;
  isGenerating: boolean;
  generatedVideo: string | null;
+  generatedVideoId?: string | null;
  embedded?: boolean;
 }

@@ -19,8 +20,13 @@ export function PreviewPanel({
  currentTask,
  isGenerating,
  generatedVideo,
+  generatedVideoId = null,
  embedded = false,
 }: PreviewPanelProps) {
+  const downloadHref = generatedVideoId
+    ? `/api/videos/generated/${encodeURIComponent(generatedVideoId)}/download`
+    : generatedVideo;
+
  const content = (
    <>
      {currentTask && isGenerating && (
@@ -51,10 +57,10 @@ export function PreviewPanel({
          )}
        </div>

-        {generatedVideo && (
+        {generatedVideo && downloadHref && (
          <>
            <a
-              href={generatedVideo}
+              href={downloadHref}
              download
              className="mt-4 w-full py-3 rounded-xl bg-green-600 hover:bg-green-700 text-white font-medium flex items-center justify-center gap-2 transition-colors"
            >
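
Review note: the `downloadHref` fallback added to `PreviewPanel` can be sketched as a small helper. When a video id is known, it points at the backend download endpoint, which `Docs/BACKEND_DEV.md` says must return `Content-Disposition: attachment`. Otherwise it falls back to the raw video URL. `buildDownloadHref` is a hypothetical name used only for this sketch.

```typescript
// Prefer the attachment-download endpoint when an id is available;
// otherwise fall back to the direct video URL (which may be null).
function buildDownloadHref(
  generatedVideoId: string | null | undefined,
  generatedVideo: string | null,
): string | null {
  return generatedVideoId
    ? `/api/videos/generated/${encodeURIComponent(generatedVideoId)}/download`
    : generatedVideo;
}
```

Because the helper can return `null`, the render guard `generatedVideo && downloadHref` in the diff stays necessary. Encoding the id keeps characters such as `/` from changing the route.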
--- a/frontend/src/features/home/ui/RefAudioPanel.tsx
+++ b/frontend/src/features/home/ui/RefAudioPanel.tsx
@@ -1,6 +1,8 @@
-import { useEffect, useState } from "react";
-import type { MouseEvent } from "react";
-import { Upload, RefreshCw, Play, Pause, Pencil, Trash2, Check, X, Mic, Square, RotateCw } from "lucide-react";
+import { useCallback, useEffect, useMemo, useRef, useState } from "react";
+import type { ChangeEvent, MouseEvent } from "react";
+import { Upload, RefreshCw, Play, Pause, Pencil, Trash2, Check, X, Mic, Square, RotateCw, Search, ChevronDown } from "lucide-react";
+import { SelectPopover } from "@/shared/ui/SelectPopover";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";

 interface RefAudio {
  id: string;
@@ -36,7 +38,8 @@ interface RefAudioPanelProps {
  recordingTime: number;
  onStartRecording: () => void;
  onStopRecording: () => void;
-  onUseRecording: () => void;
+  onUseRecording: () => void | Promise<void>;
+  onDiscardRecording: () => void;
  formatRecordingTime: (seconds: number) => string;
 }

@@ -68,9 +71,26 @@ export function RefAudioPanel({
  onStartRecording,
  onStopRecording,
  onUseRecording,
+  onDiscardRecording,
  formatRecordingTime,
 }: RefAudioPanelProps) {
  const [recordedUrl, setRecordedUrl] = useState<string | null>(null);
+  const [refAudioFilter, setRefAudioFilter] = useState("");
+  const [recordingModalOpen, setRecordingModalOpen] = useState(false);
+  const [recordedPreviewPlaying, setRecordedPreviewPlaying] = useState(false);
+  const [recordedPreviewCurrentTime, setRecordedPreviewCurrentTime] = useState(0);
+  const [recordedPreviewDuration, setRecordedPreviewDuration] = useState(0);
+  const refAudioListContainerRef = useRef<HTMLDivElement | null>(null);
+  const recordedAudioRef = useRef<HTMLAudioElement | null>(null);
+
+  const stopRecordedPreview = useCallback(() => {
+    const player = recordedAudioRef.current;
+    if (!player) return;
+    player.pause();
+    player.currentTime = 0;
+    setRecordedPreviewPlaying(false);
+    setRecordedPreviewCurrentTime(0);
+  }, []);

  useEffect(() => {
    if (!recordedBlob) {
@@ -88,45 +108,95 @@ export function RefAudioPanel({
  const needsRetranscribe = (audio: RefAudio) =>
    audio.ref_text.startsWith(OLD_FIXED_REF_TEXT);

+  const selectedRefAudioLabel = selectedRefAudio?.name || "请选择参考音频";
+  const filteredRefAudios = useMemo(() => {
+    const query = refAudioFilter.trim().toLowerCase();
+    if (!query) return refAudios;
+    return refAudios.filter((audio) => audio.name.toLowerCase().includes(query));
+  }, [refAudioFilter, refAudios]);
+
+  const handleOpenRefAudioPopover = useCallback(() => {
+    setRefAudioFilter("");
+
+    requestAnimationFrame(() => {
+      requestAnimationFrame(() => {
+        const container = refAudioListContainerRef.current;
+        if (!container) return;
+        const selectedRow = container.querySelector<HTMLElement>("[data-ref-selected='true']");
+        selectedRow?.scrollIntoView({ block: "nearest", behavior: "auto" });
+      });
+    });
+  }, []);
+
+  const closeRecordingModal = () => {
+    stopRecordedPreview();
+    if (isRecording) {
+      onStopRecording();
+    }
+    setRecordingModalOpen(false);
+  };
+
+  const handleUseRecordingAndClose = () => {
+    stopRecordedPreview();
+    setRecordingModalOpen(false);
+    void onUseRecording();
+  };
+
+  const handleToggleRecordedPreview = () => {
+    const player = recordedAudioRef.current;
+    if (!player) return;
+
+    if (player.paused) {
+      player.play().catch(() => {
+        setRecordedPreviewPlaying(false);
+      });
+      return;
+    }
+
+    player.pause();
+  };
+
+  const handleRecordedSeek = (event: ChangeEvent<HTMLInputElement>) => {
+    const player = recordedAudioRef.current;
+    if (!player) return;
+    const nextTime = Number(event.target.value);
+    player.currentTime = Number.isFinite(nextTime) ? nextTime : 0;
+    setRecordedPreviewCurrentTime(Number.isFinite(nextTime) ? nextTime : 0);
+  };
+
+  const totalRecordedPreviewTime =
+    Number.isFinite(recordedPreviewDuration) && recordedPreviewDuration > 0
+      ? recordedPreviewDuration
+      : recordingTime;
+
  return (
    <div className="space-y-4">
      <div>
        <div className="flex justify-between items-center mb-2">
          <span className="text-sm text-gray-300">📁 我的参考音频 <span className="text-xs text-gray-500 font-normal">(上传3-10秒语音样本)</span></span>
-          <div className="flex gap-2">
-            <input
-              type="file"
-              id="ref-audio-upload"
-              accept=".wav,.mp3,.m4a,.webm,.ogg,.flac,.aac"
-              onChange={(e) => {
-                const file = e.target.files?.[0];
-                if (file) {
-                  onUploadRefAudio(file);
-                }
-                e.target.value = '';
-              }}
-              className="hidden"
-            />
-            <label
-              htmlFor="ref-audio-upload"
-              className={`px-2 py-1 text-xs rounded cursor-pointer transition-all flex items-center gap-1 ${isUploadingRef
-                ? "bg-gray-600 cursor-not-allowed text-gray-400"
-                : "bg-purple-600 hover:bg-purple-700 text-white"
-                }`}
-            >
-              <Upload className="h-3.5 w-3.5" />
-              上传
-            </label>
-            <button
-              onClick={onFetchRefAudios}
-              className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 flex items-center gap-1"
-            >
-              <RefreshCw className="h-3.5 w-3.5" />
-              刷新
-            </button>
-          </div>
+          <button
+            onClick={onFetchRefAudios}
+            className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 flex items-center gap-1"
+          >
+            <RefreshCw className="h-3.5 w-3.5" />
+            刷新
+          </button>
        </div>

+        <input
+          type="file"
+          id="ref-audio-upload"
+          accept=".wav,.mp3,.m4a,.webm,.ogg,.flac,.aac"
+          onChange={(e) => {
+            const file = e.target.files?.[0];
+            if (file) {
+              onUploadRefAudio(file);
+            }
+            e.target.value = "";
+          }}
+          className="hidden"
+        />
+
        {isUploadingRef && (
          <div className="mb-2 p-2 bg-purple-500/10 rounded text-sm text-purple-300">
            ⏳ 上传并识别中...
@@ -147,146 +217,316 @@ export function RefAudioPanel({
            暂无参考音频，请上传或录制
          </div>
        ) : (
-          <div className="grid grid-cols-2 gap-2" style={{ contentVisibility: 'auto' }}>
-            {refAudios.map((audio) => (
-              <div
-                key={audio.id}
-                className={`p-2 rounded-lg border transition-all relative group cursor-pointer ${selectedRefAudio?.id === audio.id
-                  ? "border-purple-500 bg-purple-500/20"
-                  : "border-white/10 bg-white/5 hover:border-white/30"
-                  }`}
-                onClick={() => {
-                  if (editingAudioId !== audio.id) {
-                    onSelectRefAudio(audio);
-                  }
-                }}
+          <SelectPopover
+            sheetTitle="选择参考音频"
+            onOpen={handleOpenRefAudioPopover}
+            trigger={({ open, toggle }) => (
+              <button
+                type="button"
+                onClick={toggle}
+                className="w-full rounded-xl border border-white/10 bg-black/25 px-3 py-2.5 text-left transition-colors hover:border-white/30"
              >
-                {editingAudioId === audio.id ? (
-                  <div className="flex items-center gap-1" onClick={(e) => e.stopPropagation()}>
+                <span className="flex items-center justify-between gap-3">
+                  <span className="min-w-0">
+                    <span className="block text-xs text-gray-400">当前参考音频</span>
+                    <span className="mt-0.5 block truncate text-sm text-white">{selectedRefAudioLabel}</span>
+                  </span>
+                  <ChevronDown className={`h-4 w-4 text-gray-300 transition-transform ${open ? "rotate-180" : ""}`} />
+                </span>
+              </button>
+            )}
+          >
+            {({ close }) => (
+              <div className="space-y-2">
+                <div className="rounded-lg border border-white/10 bg-black/30 px-3 py-2">
+                  <div className="flex items-center gap-2">
+                    <Search className="h-4 w-4 text-gray-400" />
                    <input
                      type="text"
-                      value={editName}
-                      onChange={(e) => onEditNameChange(e.target.value)}
-                      className="w-full bg-black/50 text-white text-xs px-1 py-0.5 rounded border border-purple-500 focus:outline-none"
-                      autoFocus
-                      onKeyDown={(e) => {
-                        if (e.key === 'Enter') onSaveEditing(audio.id, e as unknown as MouseEvent);
-                        if (e.key === 'Escape') onCancelEditing(e as unknown as MouseEvent);
-                      }}
+                      value={refAudioFilter}
+                      onChange={(e) => setRefAudioFilter(e.target.value)}
+                      placeholder="搜索参考音频..."
+                      className="w-full bg-transparent text-sm text-white placeholder-gray-500 outline-none"
                    />
-                    <button onClick={(e) => onSaveEditing(audio.id, e)} className="text-green-400 hover:text-green-300 text-xs">
-                      <Check className="h-3 w-3" />
-                    </button>
-                    <button onClick={(e) => onCancelEditing(e)} className="text-gray-400 hover:text-gray-300 text-xs">
-                      <X className="h-3 w-3" />
-                    </button>
                  </div>
+                </div>
+
+                {filteredRefAudios.length === 0 ? (
+                  <div className="py-6 text-center text-sm text-gray-400">没有匹配的参考音频</div>
                ) : (
-                  <>
-                    <div className="flex justify-between items-start mb-1">
-                      <div className="text-white text-xs truncate pr-1 flex-1" title={audio.name}>
-                        {audio.name}
-                      </div>
-                      <div className="flex gap-1 opacity-40 group-hover:opacity-100 transition-opacity">
-                        <button
-                          onClick={(e) => onTogglePlayPreview(audio, e)}
-                          className="text-gray-400 hover:text-purple-400 text-xs"
-                          title="试听"
+                  <div ref={refAudioListContainerRef} className="space-y-1" style={{ contentVisibility: "auto" }}>
+                    {filteredRefAudios.map((audio) => {
+                      const isSelected = selectedRefAudio?.id === audio.id;
+
+                      return (
+                        <div
+                          key={audio.id}
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          data-ref-selected={isSelected ? "true" : "false"}
+                          className={`flex items-center justify-between gap-2 rounded-lg border px-3 py-2 transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20"
+                            : "border-white/10 bg-white/5 hover:border-white/30"
+                            }`}
                        >
-                          {playingAudioId === audio.id ? (
-                            <Pause className="h-3.5 w-3.5" />
+                          {editingAudioId === audio.id ? (
+                            <div className="flex-1 flex items-center gap-2" onClick={(e) => e.stopPropagation()}>
+                              <input
+                                type="text"
+                                value={editName}
+                                onChange={(e) => onEditNameChange(e.target.value)}
+                                className="w-full rounded border border-purple-500 bg-black/50 px-2 py-1 text-xs text-white focus:outline-none"
+                                autoFocus
+                                onKeyDown={(e) => {
+                                  if (e.key === "Enter") onSaveEditing(audio.id, e as unknown as MouseEvent);
+                                  if (e.key === "Escape") onCancelEditing(e as unknown as MouseEvent);
+                                }}
+                              />
+                              <button type="button" onClick={(e) => onSaveEditing(audio.id, e)} className="text-green-400 hover:text-green-300">
+                                <Check className="h-3.5 w-3.5" />
+                              </button>
+                              <button type="button" onClick={(e) => onCancelEditing(e)} className="text-gray-400 hover:text-gray-300">
+                                <X className="h-3.5 w-3.5" />
+                              </button>
+                            </div>
                          ) : (
-                            <Play className="h-3.5 w-3.5" />
+                            <button
+                              type="button"
+                              onClick={() => {
+                                onSelectRefAudio(audio);
+                                close();
+                              }}
+                              className="min-w-0 flex-1 text-left"
+                            >
+                              <span className="block truncate text-sm text-white" title={audio.name}>{audio.name}</span>
+                              <span className="mt-0.5 block text-xs text-gray-400">
+                                {audio.duration_sec.toFixed(1)}s
+                                {needsRetranscribe(audio) && (
+                                  <span className="ml-1 text-yellow-500" title="需要重新识别文字">⚠</span>
+                                )}
+                              </span>
+                            </button>
                          )}
-                        </button>
-                        <button
-                          onClick={(e) => {
-                            e.stopPropagation();
-                            onRetranscribe(audio.id);
-                          }}
-                          disabled={retranscribingId === audio.id}
-                          className="text-gray-400 hover:text-cyan-400 text-xs disabled:opacity-50"
-                          title="重新识别文字"
-                        >
-                          <RotateCw className={`h-3.5 w-3.5 ${retranscribingId === audio.id ? 'animate-spin' : ''}`} />
-                        </button>
-                        <button
-                          onClick={(e) => onStartEditing(audio, e)}
-                          className="text-gray-400 hover:text-blue-400 text-xs"
-                          title="重命名"
-                        >
-                          <Pencil className="h-3.5 w-3.5" />
-                        </button>
-                        <button
-                          onClick={(e) => {
-                            e.stopPropagation();
-                            onDeleteRefAudio(audio.id);
-                          }}
-                          className="text-gray-400 hover:text-red-400 text-xs"
-                          title="删除"
-                        >
-                          <Trash2 className="h-3.5 w-3.5" />
-                        </button>
-                      </div>
-                    </div>
-                    <div className="text-gray-400 text-xs">
-                      {audio.duration_sec.toFixed(1)}s
-                      {needsRetranscribe(audio) && (
-                        <span className="text-yellow-500 ml-1" title="需要重新识别文字">⚠</span>
-                      )}
-                    </div>
-                  </>
+
+                          {editingAudioId !== audio.id && (
+                            <div className="flex items-center gap-1 pl-2">
+                              <button
+                                type="button"
+                                onClick={(e) => onTogglePlayPreview(audio, e)}
+                                className="text-gray-400 hover:text-purple-300"
+                                title="试听"
+                              >
+                                {playingAudioId === audio.id ? (
+                                  <Pause className="h-3.5 w-3.5" />
+                                ) : (
+                                  <Play className="h-3.5 w-3.5" />
+                                )}
+                              </button>
+                              <button
+                                type="button"
+                                onClick={(e) => {
+                                  e.stopPropagation();
+                                  onRetranscribe(audio.id);
+                                }}
+                                disabled={retranscribingId === audio.id}
+                                className="text-gray-400 hover:text-cyan-400 disabled:opacity-50"
+                                title="重新识别文字"
+                              >
+                                <RotateCw className={`h-3.5 w-3.5 ${retranscribingId === audio.id ? "animate-spin" : ""}`} />
+                              </button>
+                              <button
+                                type="button"
+                                onClick={(e) => onStartEditing(audio, e)}
+                                className="text-gray-400 hover:text-blue-400"
+                                title="重命名"
+                              >
+                                <Pencil className="h-3.5 w-3.5" />
+                              </button>
+                              <button
+                                type="button"
+                                onClick={(e) => {
+                                  e.stopPropagation();
+                                  onDeleteRefAudio(audio.id);
+                                }}
+                                className="text-gray-400 hover:text-red-400"
+                                title="删除"
+                              >
+                                <Trash2 className="h-3.5 w-3.5" />
+                              </button>
+                              {isSelected && <Check className="h-3.5 w-3.5 text-purple-300" />}
+                            </div>
+                          )}
+                        </div>
+                      );
+                    })}
+                  </div>
                )}
              </div>
-            ))}
-          </div>
+            )}
+          </SelectPopover>
        )}
-      </div>

-      <div className="border-t border-white/10 pt-4">
-        <span className="text-sm text-gray-300 mb-2 block">🎤 或在线录音 <span className="text-xs text-gray-500">（建议 3-10 秒，超出将自动截取）</span></span>
-        <div className="flex gap-2 items-center">
-          {!isRecording ? (
-            <button
-              onClick={onStartRecording}
-              className="px-4 py-2 bg-red-600 hover:bg-red-700 text-white rounded-lg text-sm font-medium transition-colors flex items-center gap-2"
-            >
-              <Mic className="h-4 w-4" />
-              开始录音
-            </button>
-          ) : (
-            <button
-              onClick={onStopRecording}
-              className="px-4 py-2 bg-gray-600 hover:bg-gray-700 text-white rounded-lg text-sm font-medium transition-colors flex items-center gap-2"
-            >
-              <Square className="h-4 w-4" />
-              停止
-            </button>
-          )}
-          {isRecording && (
-            <span className="text-red-400 text-sm animate-pulse">
-              🔴 录音中 {formatRecordingTime(recordingTime)}
+        <div className="mt-3 flex flex-wrap items-center justify-end gap-2">
+          {recordedBlob && !isRecording && (
+            <span className="mr-auto text-xs text-emerald-300/90">
+              已录制 {formatRecordingTime(recordingTime)}，可点击“在线录音”处理
            </span>
          )}
+          <label
+            htmlFor="ref-audio-upload"
+            className={`px-3 py-1.5 text-xs rounded-lg cursor-pointer transition-all inline-flex items-center gap-1.5 ${isUploadingRef
+              ? "bg-gray-600 cursor-not-allowed text-gray-400 pointer-events-none"
+              : "bg-purple-600 hover:bg-purple-700 text-white"
+              }`}
+          >
+            <Upload className="h-3.5 w-3.5" />
+            上传音频
+          </label>
+          <button
+            type="button"
+            onClick={() => setRecordingModalOpen(true)}
+            disabled={isUploadingRef}
+            className="px-3 py-1.5 text-xs rounded-lg transition-colors bg-red-600 hover:bg-red-700 text-white disabled:bg-gray-600 disabled:text-gray-400 inline-flex items-center gap-1.5"
+          >
+            <Mic className="h-3.5 w-3.5" />
+            在线录音
+          </button>
        </div>
-
-        {recordedBlob && !isRecording && (
-          <div className="mt-3 p-3 bg-green-500/10 border border-green-500/30 rounded-lg">
-            <div className="flex items-center gap-2 mb-2">
-              <span className="text-green-300 text-sm">✅ 录音完成 ({formatRecordingTime(recordingTime)})</span>
-              <audio src={recordedUrl || ''} controls className="h-8" />
-            </div>
-            <button
-              onClick={onUseRecording}
-              disabled={isUploadingRef}
-              className="px-3 py-1 bg-green-600 hover:bg-green-700 text-white rounded text-sm disabled:bg-gray-600"
-            >
-              使用此录音
-            </button>
-          </div>
-        )}
      </div>

+      {recordingModalOpen && (
+        <AppModal
+          isOpen={recordingModalOpen}
+          onClose={closeRecordingModal}
+          panelClassName="w-full max-w-lg rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden"
+          closeOnOverlay={!isRecording}
+        >
+          <AppModalHeader
+            title="🎤 在线录音"
+            subtitle="建议录制 3-10 秒，超出会自动截取到可用长度"
+            onClose={closeRecordingModal}
+          />
+
+          <div className="space-y-4 p-4 sm:p-5">
+            <div className="rounded-xl border border-white/10 bg-black/25 p-3 sm:p-4">
+              <div className="flex flex-wrap items-center gap-2">
+                {!isRecording ? (
+                  <button
+                    type="button"
+                    onClick={onStartRecording}
+                    disabled={isUploadingRef}
+                    className="px-4 py-2 rounded-lg text-sm font-medium bg-red-600 hover:bg-red-700 text-white transition-colors disabled:bg-gray-600 disabled:text-gray-400 inline-flex items-center gap-2"
+                  >
+                    <Mic className="h-4 w-4" />
+                    {recordedBlob ? "重新录音" : "开始录音"}
+                  </button>
+                ) : (
+                  <button
+                    type="button"
+                    onClick={onStopRecording}
+                    className="px-4 py-2 rounded-lg text-sm font-medium bg-gray-600 hover:bg-gray-700 text-white transition-colors inline-flex items-center gap-2"
+                  >
+                    <Square className="h-4 w-4" />
+                    停止录音
+                  </button>
+                )}
+
+                {isRecording ? (
+                  <span className="inline-flex items-center gap-1 rounded-full border border-red-400/40 bg-red-500/10 px-3 py-1 text-xs text-red-300 animate-pulse">
+                    <span className="h-1.5 w-1.5 rounded-full bg-red-400" />
+                    录音中 {formatRecordingTime(recordingTime)}
+                  </span>
+                ) : recordedBlob ? (
+                  <span className="inline-flex items-center gap-1 rounded-full border border-emerald-400/30 bg-emerald-500/10 px-3 py-1 text-xs text-emerald-300">
+                    已录制 {formatRecordingTime(recordingTime)}
+                  </span>
+                ) : null}
+              </div>
+
+              {!recordedBlob && !isRecording && (
+                <p className="mt-3 text-xs text-gray-500">点击“开始录音”后允许麦克风权限，结束后可试听并确认上传</p>
+              )}
+            </div>
+
+            {recordedBlob && !isRecording && (
+              <div className="space-y-3 rounded-xl border border-emerald-500/30 bg-emerald-500/10 p-3">
+                <div className="flex items-center justify-between gap-2">
+                  <span className="text-sm text-emerald-200">✅ 录音完成，可先试听再使用</span>
+                  <span className="text-xs text-emerald-300/80">{formatRecordingTime(recordingTime)}</span>
+                </div>
+
+                <div className="rounded-lg border border-white/10 bg-black/35 px-3 py-2.5">
+                  <audio
+                    key={recordedUrl || "recorded-preview"}
+                    ref={recordedAudioRef}
+                    src={recordedUrl || undefined}
+                    className="hidden"
+                    onPlay={() => setRecordedPreviewPlaying(true)}
+                    onPause={() => setRecordedPreviewPlaying(false)}
+                    onEnded={() => {
+                      setRecordedPreviewPlaying(false);
+                      setRecordedPreviewCurrentTime(0);
+                    }}
+                    onTimeUpdate={(event) => setRecordedPreviewCurrentTime(event.currentTarget.currentTime || 0)}
+                    onLoadedMetadata={(event) => setRecordedPreviewDuration(event.currentTarget.duration || 0)}
+                  />
+
+                  <div className="flex items-center gap-3">
+                    <button
+                      type="button"
+                      onClick={handleToggleRecordedPreview}
+                      disabled={!recordedUrl}
+                      className="h-8 w-8 shrink-0 rounded-full bg-white/10 hover:bg-white/20 text-emerald-200 disabled:text-gray-500 disabled:bg-white/5 inline-flex items-center justify-center transition-colors"
+                      title={recordedPreviewPlaying ? "暂停试听" : "播放试听"}
+                    >
+                      {recordedPreviewPlaying ? (
+                        <Pause className="h-4 w-4" />
+                      ) : (
+                        <Play className="h-4 w-4 translate-x-[1px]" />
+                      )}
+                    </button>
+
+                    <div className="min-w-0 flex-1">
+                      <input
+                        type="range"
+                        min={0}
+                        max={Math.max(totalRecordedPreviewTime, 0.1)}
+                        step={0.01}
+                        value={Math.min(recordedPreviewCurrentTime, totalRecordedPreviewTime || 0)}
+                        onChange={handleRecordedSeek}
+                        className="w-full h-1.5 cursor-pointer appearance-none rounded-full bg-white/15 accent-emerald-400"
+                      />
+                      <div className="mt-1 flex items-center justify-between text-[11px] text-emerald-200/80">
+                        <span>{formatRecordingTime(Math.floor(recordedPreviewCurrentTime))}</span>
+                        <span>{formatRecordingTime(Math.floor(totalRecordedPreviewTime))}</span>
+                      </div>
+                    </div>
+                  </div>
+                </div>
+
+                <div className="flex flex-wrap items-center justify-end gap-2">
+                  <button
+                    type="button"
+                    onClick={onDiscardRecording}
+                    disabled={isUploadingRef}
+                    className="px-3 py-1.5 rounded-lg text-sm bg-white/10 hover:bg-white/20 text-gray-200 transition-colors disabled:bg-white/5 disabled:text-gray-500"
+                  >
+                    弃用本次录音
+                  </button>
+                  <button
+                    type="button"
+                    onClick={handleUseRecordingAndClose}
+                    disabled={isUploadingRef}
+                    className="px-3 py-1.5 rounded-lg text-sm bg-green-600 hover:bg-green-700 text-white transition-colors disabled:bg-gray-600 disabled:text-gray-400"
+                  >
+                    使用此录音
+                  </button>
+                </div>
+              </div>
+            )}
+          </div>
+        </AppModal>
+      )}
+
    </div>
  );
 }
--- a/frontend/src/features/home/ui/RewriteModal.tsx
+++ b/frontend/src/features/home/ui/RewriteModal.tsx
@@ -1,7 +1,9 @@
-import { useState, useEffect, useRef, useCallback } from "react";
-import { Loader2, Sparkles } from "lucide-react";
-import api from "@/shared/api/axios";
-import { ApiResponse, unwrap } from "@/shared/api/types";
+import { useState, useEffect, useRef, useCallback } from "react";
+import { Loader2, Sparkles } from "lucide-react";
+import api from "@/shared/api/axios";
+import { ApiResponse, unwrap } from "@/shared/api/types";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";
+import { toast } from "sonner";

 const CUSTOM_PROMPT_KEY = "vigent_rewriteCustomPrompt";

@@ -77,42 +79,70 @@ export default function RewriteModal({
    onClose();
  };

-  const handleRetry = () => {
-    setRewrittenText("");
-    setError(null);
-  };
+  const handleRetry = () => {
+    setRewrittenText("");
+    setError(null);
+  };
+
+  const fallbackCopyTextToClipboard = useCallback((text: string) => {
+    const textArea = document.createElement("textarea");
+    textArea.value = text;
+    textArea.style.top = "0";
+    textArea.style.left = "0";
+    textArea.style.position = "fixed";
+    textArea.style.opacity = "0";
+
+    document.body.appendChild(textArea);
+    textArea.focus();
+    textArea.select();
+
+    try {
+      const successful = document.execCommand("copy");
+      if (successful) {
+        toast.success("已复制到剪贴板");
+      } else {
+        toast.error("复制失败，请手动复制");
+      }
+    } catch {
+      toast.error("复制失败，请手动复制");
+    } finally {
+      document.body.removeChild(textArea);
+    }
+  }, []);
+
+  const handleCopy = useCallback((text: string) => {
+    if (!text.trim()) return;
+    if (navigator.clipboard && window.isSecureContext) {
+      navigator.clipboard
+        .writeText(text)
+        .then(() => {
+          toast.success("已复制到剪贴板");
+        })
+        .catch(() => {
+          fallbackCopyTextToClipboard(text);
+        });
+    } else {
+      fallbackCopyTextToClipboard(text);
+    }
+  }, [fallbackCopyTextToClipboard]);

-  // ESC to close
-  useEffect(() => {
-    if (!isOpen) return;
-    const handleKeyDown = (e: KeyboardEvent) => {
-      if (e.key === "Escape") onClose();
-    };
-    document.addEventListener("keydown", handleKeyDown);
-    return () => document.removeEventListener("keydown", handleKeyDown);
-  }, [isOpen, onClose]);
-
-  if (!isOpen) return null;
-
-  return (
-    <div className="fixed inset-0 z-50 flex items-center justify-center bg-black/80 backdrop-blur-sm p-4 animate-in fade-in duration-200">
-      <div className="bg-[#1a1a1a] border border-white/10 rounded-2xl w-full max-w-2xl max-h-[90vh] overflow-hidden flex flex-col shadow-2xl">
-        {/* Header */}
-        <div className="flex items-center justify-between p-4 border-b border-white/10 bg-white/5">
-          <h3 className="text-lg font-semibold text-white flex items-center gap-2">
-            <Sparkles className="h-5 w-5 text-purple-400" />
-            AI 智能改写
-          </h3>
-          <button
-            onClick={onClose}
-            className="text-gray-400 hover:text-white transition-colors text-2xl leading-none"
-          >
-            &times;
-          </button>
-        </div>
-
-        {/* Content */}
-        <div className="flex-1 overflow-y-auto p-6 space-y-5">
+  if (!isOpen) return null;
+
+  return (
+    <AppModal
+      isOpen={isOpen}
+      onClose={onClose}
+      panelClassName="w-full max-w-2xl max-h-[90vh] rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden flex flex-col"
+      closeOnOverlay
+    >
+      <AppModalHeader
+        title="AI 智能改写"
+        icon={<Sparkles className="h-5 w-5 text-purple-300" />}
+        onClose={onClose}
+      />
+
+        {/* Content */}
+        <div className="flex-1 overflow-y-auto p-6 space-y-5">
          {/* Custom Prompt */}
          <div className="space-y-2">
            <label className="text-sm text-gray-300">
@@ -156,58 +186,64 @@ export default function RewriteModal({
            </div>
          )}

-          {/* Rewritten result */}
-          {rewrittenText && (
-            <>
-              <div className="space-y-2">
-                <div className="flex justify-between items-center">
-                  <h4 className="font-semibold text-purple-300 flex items-center gap-2">
-                    <Sparkles className="h-4 w-4" />
-                    AI 改写结果
-                  </h4>
-                  <button
-                    onClick={handleApply}
-                    className="text-xs bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-500 hover:to-pink-500 text-white px-3 py-1.5 rounded-lg transition-colors shadow-sm"
-                  >
-                    使用此结果
-                  </button>
-                </div>
-                <div className="bg-purple-900/10 border border-purple-500/20 rounded-xl p-4 max-h-60 overflow-y-auto hide-scrollbar">
-                  <p className="text-gray-200 text-sm leading-relaxed whitespace-pre-wrap">
-                    {rewrittenText}
-                  </p>
-                </div>
-              </div>
-
-              <div className="space-y-2">
-                <div className="flex justify-between items-center">
-                  <h4 className="font-semibold text-gray-400 flex items-center gap-2">
-                    📝 原文对比
-                  </h4>
-                  <button
-                    onClick={onClose}
-                    className="text-xs bg-white/10 hover:bg-white/20 text-white px-3 py-1.5 rounded-lg transition-colors"
-                  >
-                    保留原文
-                  </button>
-                </div>
-                <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-40 overflow-y-auto hide-scrollbar">
-                  <p className="text-gray-400 text-sm leading-relaxed whitespace-pre-wrap">
-                    {originalText}
-                  </p>
-                </div>
-              </div>
-
-              <button
-                onClick={handleRetry}
-                className="w-full py-2.5 px-4 bg-white/10 hover:bg-white/20 text-white rounded-xl transition-colors"
-              >
-                重新改写
-              </button>
-            </>
-          )}
-        </div>
-      </div>
-    </div>
-  );
-}
+          {/* Rewritten result */}
+          {rewrittenText && (
+            <>
+              <div className="space-y-2">
+                <div className="flex justify-between items-center">
+                  <h4 className="font-semibold text-purple-300 flex items-center gap-2">
+                    <Sparkles className="h-4 w-4" />
+                    AI 改写结果
+                  </h4>
+                  <span className="text-xs text-gray-400">{rewrittenText.length} 字</span>
+                </div>
+                <div className="bg-purple-900/10 border border-purple-500/20 rounded-xl p-4 max-h-60 overflow-y-auto hide-scrollbar">
+                  <p className="text-gray-200 text-sm leading-relaxed whitespace-pre-wrap">
+                    {rewrittenText}
+                  </p>
+                </div>
+              </div>
+
+              <div className="space-y-2">
+                <h4 className="font-semibold text-gray-400 flex items-center gap-2">
+                  📝 原文对比
+                </h4>
+                <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-40 overflow-y-auto hide-scrollbar">
+                  <p className="text-gray-400 text-sm leading-relaxed whitespace-pre-wrap">
+                    {originalText}
+                  </p>
+                </div>
+              </div>
+
+              <div className="grid grid-cols-2 sm:grid-cols-4 gap-2">
+                <button
+                  onClick={handleApply}
+                  className="py-2.5 px-3 bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-500 hover:to-pink-500 text-white rounded-lg transition-colors text-sm"
+                >
+                  填入文案
+                </button>
+                <button
+                  onClick={() => handleCopy(rewrittenText)}
+                  className="py-2.5 px-3 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors text-sm"
+                >
+                  复制
+                </button>
+                <button
+                  onClick={handleRetry}
+                  className="py-2.5 px-3 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors text-sm"
+                >
+                  重新生成
+                </button>
+                <button
+                  onClick={onClose}
+                  className="py-2.5 px-3 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors text-sm"
+                >
+                  保留原文
+                </button>
+              </div>
+            </>
+          )}
+        </div>
+    </AppModal>
+  );
+}
--- a/frontend/src/features/home/ui/ScriptEditor.tsx
+++ b/frontend/src/features/home/ui/ScriptEditor.tsx
@@ -1,6 +1,7 @@
-import { useEffect, useRef, useState } from "react";
-import { FileText, History, Languages, Loader2, RotateCcw, Save, Sparkles, Trash2 } from "lucide-react";
+import { useCallback, useEffect, useRef, useState } from "react";
+import { FileText, GraduationCap, History, Languages, Loader2, Maximize2, RotateCcw, Save, Sparkles, Trash2 } from "lucide-react";
 import type { SavedScript } from "@/features/home/model/useSavedScripts";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";

 const LANGUAGES = [
  { code: "English", label: "英语 English" },
@@ -18,9 +19,8 @@ interface ScriptEditorProps {
  text: string;
  onChangeText: (value: string) => void;
  onOpenExtractModal: () => void;
+  onOpenLearningModal: () => void;
  onOpenRewriteModal: () => void;
-  onGenerateMeta: () => void;
-  isGeneratingMeta: boolean;
  onTranslate: (targetLang: string) => void;
  isTranslating: boolean;
  hasOriginalText: boolean;
@@ -35,9 +35,8 @@ export function ScriptEditor({
  text,
  onChangeText,
  onOpenExtractModal,
+  onOpenLearningModal,
  onOpenRewriteModal,
-  onGenerateMeta,
-  isGeneratingMeta,
  onTranslate,
  isTranslating,
  hasOriginalText,
@@ -47,10 +46,17 @@ export function ScriptEditor({
  onLoadScript,
  onDeleteScript,
 }: ScriptEditorProps) {
+  const actionBtnBase = "px-3 py-1.5 text-xs rounded-lg transition-colors whitespace-nowrap inline-flex items-center gap-1.5";
+  const actionBtnDisabled = "bg-gray-600 cursor-not-allowed text-gray-400";
+
  const [showLangMenu, setShowLangMenu] = useState(false);
  const langMenuRef = useRef<HTMLDivElement>(null);
  const [showHistoryMenu, setShowHistoryMenu] = useState(false);
  const historyMenuRef = useRef<HTMLDivElement>(null);
+  const [isExpandedEditorOpen, setIsExpandedEditorOpen] = useState(false);
+  const handleCloseExpandedEditor = useCallback(() => {
+    setIsExpandedEditorOpen(false);
+  }, []);

  useEffect(() => {
    if (!showLangMenu) return;
@@ -95,7 +101,7 @@ export function ScriptEditor({
          <div className="relative" ref={historyMenuRef}>
            <button
              onClick={() => setShowHistoryMenu((prev) => !prev)}
-              className="h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap bg-gray-600 hover:bg-gray-500 text-white inline-flex items-center gap-1"
+              className={`${actionBtnBase} bg-gray-600 hover:bg-gray-500 text-white`}
            >
              <History className="h-3.5 w-3.5" />
              历史文案
@@ -137,18 +143,25 @@ export function ScriptEditor({
          </div>
          <button
            onClick={onOpenExtractModal}
-            className="h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap bg-purple-600 hover:bg-purple-700 text-white inline-flex items-center gap-1"
+            className={`${actionBtnBase} bg-purple-600 hover:bg-purple-700 text-white`}
          >
            <FileText className="h-3.5 w-3.5" />
            文案提取助手
          </button>
+          <button
+            onClick={onOpenLearningModal}
+            className={`${actionBtnBase} bg-gradient-to-r from-blue-600 to-cyan-600 hover:from-blue-500 hover:to-cyan-500 text-white`}
+          >
+            <GraduationCap className="h-3.5 w-3.5" />
+            文案深度学习
+          </button>
          <div className="relative" ref={langMenuRef}>
            <button
              onClick={() => setShowLangMenu((prev) => !prev)}
              disabled={isTranslating || !text.trim()}
-              className={`h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap inline-flex items-center gap-1 ${
+              className={`${actionBtnBase} ${
                isTranslating || !text.trim()
-                  ? "bg-gray-600 cursor-not-allowed text-gray-400"
+                  ? actionBtnDisabled
                  : "bg-gradient-to-r from-emerald-600 to-teal-600 hover:from-emerald-700 hover:to-teal-700 text-white"
              }`}
            >
@@ -190,63 +203,75 @@ export function ScriptEditor({
              </div>
            )}
          </div>
-          <button
-            onClick={onGenerateMeta}
-            disabled={isGeneratingMeta || !text.trim()}
-            className={`h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap inline-flex items-center gap-1 ${isGeneratingMeta || !text.trim()
-              ? "bg-gray-600 cursor-not-allowed text-gray-400"
-              : "bg-gradient-to-r from-blue-600 to-cyan-600 hover:from-blue-700 hover:to-cyan-700 text-white"
-              }`}
-          >
-            {isGeneratingMeta ? (
-              <>
-                <Loader2 className="h-3.5 w-3.5 animate-spin" />
-                生成中...
-              </>
-            ) : (
-              <>
-                <Sparkles className="h-3.5 w-3.5" />
-                AI生成标题标签
-              </>
-            )}
-          </button>
        </div>
      </div>
-      <textarea
-        value={text}
-        onChange={(e) => onChangeText(e.target.value)}
-        placeholder="请输入你想说的话..."
-        className="w-full h-40 bg-black/30 border border-white/10 rounded-xl p-4 text-white placeholder-gray-500 resize-none focus:outline-none focus:border-purple-500 transition-colors hide-scrollbar"
-      />
+      <div className="relative">
+        <textarea
+          value={text}
+          onChange={(e) => onChangeText(e.target.value)}
+          placeholder="请输入你想说的话..."
+          className="w-full h-40 bg-black/30 border border-white/10 rounded-xl p-4 pr-6 pb-6 text-white placeholder-gray-500 resize-none focus:outline-none focus:border-white/25 transition-colors hide-scrollbar"
+        />
+        <button
+          type="button"
+          onClick={() => setIsExpandedEditorOpen(true)}
+          className="absolute right-0.5 bottom-2 h-5 w-5 text-gray-400/85 hover:text-white focus:outline-none transition-colors inline-flex items-center justify-center"
+          aria-label="扩展文案编辑器"
+          title="扩展编辑"
+        >
+          <Maximize2 className="h-4 w-4" />
+        </button>
+      </div>
      <div className="flex items-center justify-between mt-2 text-sm text-gray-400">
        <span>{text.length} 字</span>
        <div className="flex items-center gap-2">
          <button
            onClick={onOpenRewriteModal}
            disabled={!text.trim()}
-            className={`px-2.5 py-1 text-xs rounded transition-all flex items-center gap-1 ${
+            className={`${actionBtnBase} ${
              !text.trim()
-                ? "bg-gray-700 cursor-not-allowed text-gray-500"
-                : "bg-purple-600/80 hover:bg-purple-600 text-white"
+                ? "bg-gray-600 cursor-not-allowed text-gray-400"
+                : "bg-purple-600 hover:bg-purple-700 text-white"
            }`}
          >
-            <Sparkles className="h-3 w-3" />
+            <Sparkles className="h-3.5 w-3.5" />
            AI智能改写
          </button>
          <button
            onClick={onSaveScript}
            disabled={!text.trim()}
-            className={`px-2.5 py-1 text-xs rounded transition-all flex items-center gap-1 ${
+            className={`${actionBtnBase} ${
              !text.trim()
-                ? "bg-gray-700 cursor-not-allowed text-gray-500"
-                : "bg-amber-600/80 hover:bg-amber-600 text-white"
+                ? "bg-gray-600 cursor-not-allowed text-gray-400"
+                : "bg-amber-600 hover:bg-amber-700 text-white"
            }`}
          >
-            <Save className="h-3 w-3" />
+            <Save className="h-3.5 w-3.5" />
            保存文案
          </button>
        </div>
      </div>
+
+      <AppModal
+        isOpen={isExpandedEditorOpen}
+        onClose={handleCloseExpandedEditor}
+        panelClassName="w-full max-w-5xl max-h-[92vh] rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden flex flex-col"
+      >
+        <AppModalHeader
+          title="扩展文案编辑"
+          subtitle="在更大空间里编写与调整文案"
+          onClose={handleCloseExpandedEditor}
+          actions={<span className="text-xs text-gray-400 tabular-nums">{text.length} 字</span>}
+        />
+        <div className="flex-1 p-4 sm:p-5">
+          <textarea
+            value={text}
+            onChange={(e) => onChangeText(e.target.value)}
+            placeholder="请输入你想说的话..."
+            className="w-full h-[66vh] min-h-[320px] bg-black/30 border border-white/10 rounded-xl p-4 text-white placeholder-gray-500 resize-none focus:outline-none focus:border-white/25 transition-colors hide-scrollbar"
+          />
+        </div>
+      </AppModal>
    </div>
  );
 }
--- a/frontend/src/features/home/ui/ScriptExtractionModal.tsx
+++ b/frontend/src/features/home/ui/ScriptExtractionModal.tsx
@@ -3,6 +3,7 @@
 import { useEffect, useCallback } from "react";
 import { Loader2 } from "lucide-react";
 import { useScriptExtraction } from "./script-extraction/useScriptExtraction";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";

 interface ScriptExtractionModalProps {
    isOpen: boolean;
@@ -36,17 +37,15 @@ export default function ScriptExtractionModal({
        clearInputUrl,
    } = useScriptExtraction({ isOpen });

-    // 快捷键：ESC 关闭，Enter 提交（仅在 config 步骤）
+    // Keyboard shortcut: Enter submits (config step only)
    const canExtract = (activeTab === "file" && selectedFile) || (activeTab === "url" && inputUrl.trim());

    const handleKeyDown = useCallback((e: KeyboardEvent) => {
-        if (e.key === "Escape") {
-            onClose();
-        } else if (e.key === "Enter" && !e.shiftKey && step === "config" && canExtract && !isLoading) {
+        if (e.key === "Enter" && !e.shiftKey && step === "config" && canExtract && !isLoading) {
            e.preventDefault();
            handleExtract();
        }
-    }, [onClose, step, canExtract, isLoading, handleExtract]);
+    }, [step, canExtract, isLoading, handleExtract]);

    useEffect(() => {
        if (!isOpen) return;
@@ -68,20 +67,13 @@ export default function ScriptExtractionModal({
    };

    return (
-        <div className="fixed inset-0 z-50 flex items-center justify-center bg-black/80 backdrop-blur-sm p-4 animate-in fade-in duration-200">
-            <div className="bg-[#1a1a1a] border border-white/10 rounded-2xl w-full max-w-2xl max-h-[90vh] overflow-hidden flex flex-col shadow-2xl">
-                {/* Header */}
-                <div className="flex items-center justify-between p-4 border-b border-white/10 bg-white/5">
-                    <h3 className="text-lg font-semibold text-white flex items-center gap-2">
-                        📜 文案提取助手
-                    </h3>
-                    <button
-                        onClick={onClose}
-                        className="text-gray-400 hover:text-white transition-colors text-2xl leading-none"
-                    >
-                        &times;
-                    </button>
-                </div>
+        <AppModal
+            isOpen={isOpen}
+            onClose={onClose}
+            panelClassName="w-full max-w-2xl max-h-[90vh] rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden flex flex-col"
+            closeOnOverlay
+        >
+            <AppModalHeader title="📜 文案提取助手" onClose={onClose} />

                {/* Content */}
                <div className="flex-1 overflow-y-auto p-6">
@@ -236,48 +228,51 @@ export default function ScriptExtractionModal({
                    )}

                    {step === "result" && (
-                        <div className="space-y-6">
-                            <div className="space-y-2">
-                                <div className="flex justify-between items-center">
-                                    <h4 className="font-semibold text-gray-300 flex items-center gap-2">
-                                        🎙️ 识别结果
-                                    </h4>
-                                    <div className="flex items-center gap-2">
-                                        {onApply && (
-                                            <button
-                                                onClick={() => handleApplyAndClose(script)}
-                                                className="text-xs bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-500 hover:to-pink-500 text-white px-3 py-1.5 rounded-lg transition-colors flex items-center gap-1 shadow-sm"
-                                            >
-                                                📥 填入
-                                            </button>
-                                        )}
-                                        <button
-                                            onClick={() => copyToClipboard(script)}
-                                            className="text-xs bg-white/10 hover:bg-white/20 text-white px-3 py-1.5 rounded-lg transition-colors"
-                                        >
-                                            复制
-                                        </button>
-                                    </div>
-                                </div>
-                                <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-60 overflow-y-auto hide-scrollbar">
-                                    <p className="text-gray-200 text-sm leading-relaxed whitespace-pre-wrap">
-                                        {script}
-                                    </p>
-                                </div>
+                        <div className="space-y-5">
+                            <div className="flex justify-between items-center">
+                                <h4 className="font-semibold text-gray-300 flex items-center gap-2">
+                                    🎙️ 识别结果
+                                </h4>
+                                <span className="text-xs text-gray-400">{script.length} 字</span>
                            </div>

-                            <div className="flex justify-center pt-4">
+                            <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-72 overflow-y-auto hide-scrollbar">
+                                <p className="text-gray-200 text-sm leading-relaxed whitespace-pre-wrap">
+                                    {script}
+                                </p>
+                            </div>
+
+                            <div className={`grid ${onApply ? "grid-cols-2 sm:grid-cols-4" : "grid-cols-2 sm:grid-cols-3"} gap-2`}>
+                                {onApply && (
+                                    <button
+                                        onClick={() => handleApplyAndClose(script)}
+                                        className="py-2.5 px-3 bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-500 hover:to-pink-500 text-white rounded-lg transition-colors text-sm"
+                                    >
+                                        填入文案
+                                    </button>
+                                )}
+                                <button
+                                    onClick={() => copyToClipboard(script)}
+                                    className="py-2.5 px-3 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors text-sm"
+                                >
+                                    复制
+                                </button>
                                <button
                                    onClick={handleExtractNext}
-                                    className="px-6 py-2 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors"
+                                    className="py-2.5 px-3 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors text-sm"
                                >
                                    提取下一个
                                </button>
+                                <button
+                                    onClick={onClose}
+                                    className="py-2.5 px-3 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors text-sm"
+                                >
+                                    关闭
+                                </button>
                            </div>
                        </div>
                    )}
                </div>
-            </div>
-        </div>
+        </AppModal>
    );
 }
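The revised `handleKeyDown` in this file keeps only the Enter branch; the Escape branch is removed from the handler, while `onClose` is now passed to `AppModal`. As a rough illustration, the remaining condition can be expressed as a pure predicate that is easy to unit-test. This is a sketch only and not part of the change: `KeyLike` and `shouldSubmitOnEnter` are hypothetical names, and `canExtract` is narrowed to a boolean here for simplicity.

```typescript
// Hypothetical shape: only the two KeyboardEvent fields the handler reads.
interface KeyLike {
  key: string;
  shiftKey: boolean;
}

// Hypothetical helper mirroring the condition inside handleKeyDown:
// plain Enter (no Shift), on the "config" step, with input ready and no request in flight.
function shouldSubmitOnEnter(
  e: KeyLike,
  step: string,
  canExtract: boolean,
  isLoading: boolean,
): boolean {
  return (
    e.key === "Enter" &&
    !e.shiftKey &&
    step === "config" &&
    canExtract &&
    !isLoading
  );
}
```

With a helper like this, the component's handler would only need to call `e.preventDefault()` and `handleExtract()` when the predicate returns true.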
--- a/frontend/src/features/home/ui/ScriptLearningModal.tsx
+++ b/frontend/src/features/home/ui/ScriptLearningModal.tsx
@@ -0,0 +1,242 @@
+"use client";
+
+import { BookOpen, Sparkles } from "lucide-react";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";
+import { useScriptLearning } from "./script-learning/useScriptLearning";
+
+interface ScriptLearningModalProps {
+  isOpen: boolean;
+  onClose: () => void;
+  onApply?: (text: string) => void;
+}
+
+const WORD_COUNT_MIN = 80;
+const WORD_COUNT_MAX = 1000;
+
+export default function ScriptLearningModal({ isOpen, onClose, onApply }: ScriptLearningModalProps) {
+  const {
+    step,
+    inputUrl,
+    setInputUrl,
+    topics,
+    selectedTopic,
+    setSelectedTopic,
+    wordCount,
+    setWordCount,
+    generatedScript,
+    error,
+    analysisId,
+    handleAnalyze,
+    handleGenerate,
+    handleRegenerate,
+    backToInput,
+    backToTopics,
+    copyToClipboard,
+  } = useScriptLearning({ isOpen });
+
+  if (!isOpen) return null;
+
+  const wordCountNum = Number(wordCount);
+  const wordCountValid = Number.isInteger(wordCountNum)
+    && wordCountNum >= WORD_COUNT_MIN
+    && wordCountNum <= WORD_COUNT_MAX;
+  const canGenerate = !!analysisId && !!selectedTopic && wordCountValid;
+
+  const handleApplyAndClose = () => {
+    if (!generatedScript.trim()) return;
+    onApply?.(generatedScript);
+    onClose();
+  };
+
+  return (
+    <AppModal
+      isOpen={isOpen}
+      onClose={onClose}
+      panelClassName="w-full max-w-2xl max-h-[90vh] rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden flex flex-col"
+      closeOnOverlay={false}
+      closeOnEsc={false}
+    >
+      <AppModalHeader
+        title="文案深度学习"
+        icon={<BookOpen className="h-5 w-5 text-cyan-300" />}
+        subtitle="分析博主近期选题风格并快速生成文案"
+        onClose={onClose}
+      />
+
+      <div className="flex-1 overflow-y-auto p-6">
+        {step === "input" && (
+          <div className="space-y-5">
+            <div className="space-y-2">
+              <label className="text-sm text-gray-300">博主主页链接</label>
+              <input
+                type="text"
+                value={inputUrl}
+                onChange={(event) => setInputUrl(event.target.value)}
+                placeholder="请粘贴抖音或B站博主主页链接..."
+                className="w-full bg-black/20 border border-white/10 rounded-xl px-4 py-3 text-white placeholder-gray-500 focus:outline-none focus:border-cyan-500 transition-colors"
+              />
+              <p className="text-xs text-gray-500">仅支持 https 链接，建议使用主页地址（非单条视频链接）</p>
+            </div>
+
+            {error && (
+              <div className="bg-red-500/10 border border-red-500/30 rounded-xl p-4">
+                <p className="text-red-400 text-sm">{error}</p>
+              </div>
+            )}
+
+            <div className="flex gap-3 pt-1">
+              <button
+                type="button"
+                onClick={() => setInputUrl("")}
+                className="flex-1 py-3 px-4 bg-white/10 hover:bg-white/20 text-white rounded-xl transition-colors"
+              >
+                清空
+              </button>
+              <button
+                type="button"
+                onClick={() => void handleAnalyze()}
+                disabled={!inputUrl.trim()}
+                className="flex-1 py-3 px-4 bg-gradient-to-r from-blue-600 to-cyan-600 hover:from-blue-500 hover:to-cyan-500 disabled:opacity-50 disabled:cursor-not-allowed text-white rounded-xl transition-all font-medium shadow-lg"
+              >
+                开始分析
+              </button>
+            </div>
+          </div>
+        )}
+
+        {(step === "analyzing" || step === "generating") && (
+          <div className="flex flex-col items-center justify-center py-20">
+            <div className="relative w-20 h-20 mb-6">
+              <div className="absolute inset-0 border-4 border-cyan-500/30 rounded-full" />
+              <div className="absolute inset-0 border-4 border-t-cyan-500 rounded-full animate-spin" />
+            </div>
+            <h4 className="text-xl font-medium text-white mb-2">
+              {step === "analyzing" ? "正在分析中..." : "正在生成中..."}
+            </h4>
+          </div>
+        )}
+
+        {step === "topics" && (
+          <div className="space-y-5">
+            <div className="bg-cyan-500/10 border border-cyan-500/30 rounded-xl p-3">
+              <p className="text-cyan-200 text-sm">已完成深度学习，请选择热门话题。</p>
+            </div>
+
+            <div className="space-y-2">
+              <p className="text-sm text-gray-300">请选择一个话题</p>
+              <div className="grid grid-cols-1 sm:grid-cols-2 gap-2">
+                {topics.map((topic) => {
+                  const active = selectedTopic === topic;
+                  return (
+                    <button
+                      key={topic}
+                      type="button"
+                      onClick={() => setSelectedTopic(topic)}
+                      className={`text-left rounded-lg border px-3 py-2.5 text-sm transition-colors ${
+                        active
+                          ? "border-cyan-400 bg-cyan-500/20 text-cyan-100"
+                          : "border-white/10 bg-white/5 text-gray-200 hover:border-white/20 hover:bg-white/10"
+                      }`}
+                    >
+                      {topic}
+                    </button>
+                  );
+                })}
+              </div>
+            </div>
+
+            <div className="space-y-2">
+              <label className="text-sm text-gray-300">目标字数</label>
+              <input
+                type="number"
+                min={WORD_COUNT_MIN}
+                max={WORD_COUNT_MAX}
+                value={wordCount}
+                onChange={(event) => setWordCount(event.target.value)}
+                placeholder="请输入目标字数（80-1000），如 300"
+                className="w-full bg-black/20 border border-white/10 rounded-xl px-4 py-3 text-white placeholder-gray-500 focus:outline-none focus:border-cyan-500 transition-colors"
+              />
+            </div>
+
+            {error && (
+              <div className="bg-red-500/10 border border-red-500/30 rounded-xl p-4">
+                <p className="text-red-400 text-sm">{error}</p>
+              </div>
+            )}
+
+            <div className="flex gap-3 pt-1">
+              <button
+                type="button"
+                onClick={backToInput}
+                className="flex-1 py-3 px-4 bg-white/10 hover:bg-white/20 text-white rounded-xl transition-colors"
+              >
+                返回
+              </button>
+              <button
+                type="button"
+                onClick={() => void handleGenerate()}
+                disabled={!canGenerate}
+                className="flex-1 py-3 px-4 bg-gradient-to-r from-blue-600 to-cyan-600 hover:from-blue-500 hover:to-cyan-500 disabled:opacity-50 disabled:cursor-not-allowed text-white rounded-xl transition-all font-medium shadow-lg"
+              >
+                生成文案
+              </button>
+            </div>
+          </div>
+        )}
+
+        {step === "result" && (
+          <div className="space-y-5">
+            <div className="flex justify-between items-center">
+              <h4 className="font-semibold text-cyan-200 flex items-center gap-2">
+                <Sparkles className="h-4 w-4" />
+                生成结果
+              </h4>
+              <span className="text-xs text-gray-400">{generatedScript.length} 字</span>
+            </div>
+
+            <div className="bg-white/5 border border-white/10 rounded-xl p-4 max-h-72 overflow-y-auto hide-scrollbar">
+              <p className="text-gray-200 text-sm leading-relaxed whitespace-pre-wrap">{generatedScript}</p>
+            </div>
+
+            <div className="grid grid-cols-2 sm:grid-cols-4 gap-2">
+              <button
+                type="button"
+                onClick={handleApplyAndClose}
+                className="py-2.5 px-3 bg-gradient-to-r from-blue-600 to-cyan-600 hover:from-blue-500 hover:to-cyan-500 text-white rounded-lg transition-colors text-sm"
+              >
+                填入文案
+              </button>
+              <button
+                type="button"
+                onClick={() => copyToClipboard(generatedScript)}
+                className="py-2.5 px-3 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors text-sm"
+              >
+                复制
+              </button>
+              <button
+                type="button"
+                onClick={() => void handleRegenerate()}
+                className="py-2.5 px-3 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors text-sm"
+              >
+                重新生成
+              </button>
+              <button
+                type="button"
+                onClick={backToTopics}
+                className="py-2.5 px-3 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors text-sm"
+              >
+                换个话题
+              </button>
+            </div>
+
+            {error && (
+              <div className="bg-red-500/10 border border-red-500/30 rounded-xl p-4">
+                <p className="text-red-400 text-sm">{error}</p>
+              </div>
+            )}
+          </div>
+        )}
+      </div>
+    </AppModal>
+  );
+}
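The `wordCountValid` check in `ScriptLearningModal` above converts the raw input string with `Number(...)` and accepts only integers between `WORD_COUNT_MIN` and `WORD_COUNT_MAX`. The sketch below isolates that rule so its edge cases are easier to see. It is illustrative only: `isWordCountValid` is a hypothetical helper name that does not appear in the change, and the constants are copied from the new file.

```typescript
const WORD_COUNT_MIN = 80;
const WORD_COUNT_MAX = 1000;

// Hypothetical helper reproducing the inline validation from the modal.
// Number("") is 0 and Number("abc") is NaN, so both are rejected:
// 0 falls below the minimum and NaN is not an integer.
function isWordCountValid(raw: string): boolean {
  const n = Number(raw);
  return Number.isInteger(n) && n >= WORD_COUNT_MIN && n <= WORD_COUNT_MAX;
}

// Example inputs covering the boundaries and common invalid values.
const samples = ["80", "1000", "79", "1001", "300.5", "", "abc"];
for (const s of samples) {
  console.log(JSON.stringify(s), isWordCountValid(s));
}
```

One consequence of relying on `Number(...)` is that inputs such as `"1e2"` or `" 300 "` also pass, since they convert to the integers 100 and 300. Whether that matters depends on what the `type="number"` input lets through in practice.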
--- a/frontend/src/features/home/ui/TimelineEditor.tsx
+++ b/frontend/src/features/home/ui/TimelineEditor.tsx
@@ -1,18 +1,28 @@
-import { useEffect, useRef, useCallback, useState, useMemo } from "react";
+import { useEffect, useRef, useCallback, useState } from "react";
 import WaveSurfer from "wavesurfer.js";
-import { ChevronDown, GripVertical } from "lucide-react";
-import type { TimelineSegment } from "@/features/home/model/useTimelineEditor";
+import { ChevronDown, Check, X, Plus } from "lucide-react";
+import type { InsertSegment } from "@/shared/types/timeline";
 import type { Material } from "@/shared/types/material";
+import { SelectPopover } from "@/shared/ui/SelectPopover";

 interface TimelineEditorProps {
  audioDuration: number;
  audioUrl: string;
-  segments: TimelineSegment[];
-  materials: Material[];
+  // Multi-camera props
+  primaryMaterial: Material | undefined;
+  inserts: InsertSegment[];
+  insertCandidates: Material[];
+  onAddInsert: (materialId: string) => void;
+  onRemoveInsert: (id: string) => void;
+  onMoveInsert: (id: string, newStart: number) => void;
+  onClickInsert: (insert: InsertSegment) => void;
+  onClickPrimary: () => void;
+  // Single material: for ClipTrimmer compat, pass a synthetic TimelineSegment
+  primarySourceStart: number;
+  primarySourceEnd: number;
+  // Shared
  outputAspectRatio: "9:16" | "16:9";
  onOutputAspectRatioChange: (ratio: "9:16" | "16:9") => void;
-  onReorderSegment: (fromIdx: number, toIdx: number) => void;
-  onClickSegment: (segment: TimelineSegment) => void;
  embedded?: boolean;
 }

@@ -25,12 +35,18 @@ function formatTime(sec: number): string {
 export function TimelineEditor({
  audioDuration,
  audioUrl,
-  segments,
-  materials,
+  primaryMaterial,
+  inserts,
+  insertCandidates,
+  onAddInsert,
+  onRemoveInsert,
+  onMoveInsert,
+  onClickInsert,
+  onClickPrimary,
+  primarySourceStart,
+  primarySourceEnd,
  outputAspectRatio,
  onOutputAspectRatioChange,
-  onReorderSegment,
-  onClickSegment,
  embedded = false,
 }: TimelineEditorProps) {
  const waveRef = useRef<HTMLDivElement>(null);
@@ -38,22 +54,27 @@ export function TimelineEditor({
  const [waveReady, setWaveReady] = useState(false);
  const [isPlaying, setIsPlaying] = useState(false);

-  // Refs for high-frequency DOM updates (avoid 60fps re-renders)
+  // Refs for high-frequency DOM updates
  const playheadRef = useRef<HTMLDivElement>(null);
  const timeRef = useRef<HTMLSpanElement>(null);
  const audioDurationRef = useRef(audioDuration);
+  const timelineBarRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    audioDurationRef.current = audioDuration;
  }, [audioDuration]);

-  // Drag-to-reorder state
-  const [dragFromIdx, setDragFromIdx] = useState<number | null>(null);
-  const [dragOverIdx, setDragOverIdx] = useState<number | null>(null);
+  // Drag state for insert blocks (move only; duration editing unified to ClipTrimmer)
+  const [dragId, setDragId] = useState<string | null>(null);
+  const dragStartXRef = useRef(0);
+  const dragStartValRef = useRef(0);
+  const dragMovedRef = useRef(false);
+  const DRAG_THRESHOLD = 5;

-  // Aspect ratio dropdown
-  const [ratioOpen, setRatioOpen] = useState(false);
-  const ratioRef = useRef<HTMLDivElement>(null);
+  const isMultiCam = insertCandidates.length > 0 || inserts.length > 0;
+  const hasPrimary = !!primaryMaterial;
+
+  // Aspect ratio options
  const ratioOptions = [
    { value: "9:16" as const, label: "竖屏 9:16" },
    { value: "16:9" as const, label: "横屏 16:9" },
@@ -61,24 +82,21 @@ export function TimelineEditor({
  const currentRatioLabel =
    ratioOptions.find((opt) => opt.value === outputAspectRatio)?.label ?? "竖屏 9:16";

-  useEffect(() => {
-    const handler = (e: MouseEvent) => {
-      if (ratioRef.current && !ratioRef.current.contains(e.target as Node)) {
-        setRatioOpen(false);
-      }
-    };
-    if (ratioOpen) document.addEventListener("mousedown", handler);
-    return () => document.removeEventListener("mousedown", handler);
-  }, [ratioOpen]);
+  // Primary material loop info
+  const primaryDuration = primaryMaterial?.duration_sec ?? 0;
+  const primaryEffective = primarySourceEnd > primarySourceStart
+    ? primarySourceEnd - primarySourceStart
+    : primaryDuration;
+  const loopCount = primaryEffective > 0 && audioDuration > 0
+    ? (audioDuration / primaryEffective)
+    : 0;

  // Create / recreate wavesurfer when audioUrl changes
  useEffect(() => {
    if (!waveRef.current || !audioUrl) return;
-
    const playheadEl = playheadRef.current;
    const timeEl = timeRef.current;

-    // Destroy previous instance
    if (wsRef.current) {
      wsRef.current.destroy();
      wsRef.current = null;
@@ -98,7 +116,6 @@ export function TimelineEditor({
      normalize: true,
    });

-    // Click waveform → seek + auto-play
    ws.on("interaction", () => ws.play());
    ws.on("play", () => setIsPlaying(true));
    ws.on("pause", () => setIsPlaying(false));
@@ -106,7 +123,6 @@ export function TimelineEditor({
      setIsPlaying(false);
      if (playheadRef.current) playheadRef.current.style.display = "none";
    });
-    // High-frequency: update playhead + time via refs (no React re-render)
    ws.on("timeupdate", (time: number) => {
      const dur = audioDurationRef.current;
      if (playheadRef.current && dur > 0) {
@@ -130,7 +146,6 @@ export function TimelineEditor({
    };
  }, [audioUrl, waveReady]);

-  // Callback ref to detect when waveRef div mounts
  const waveCallbackRef = useCallback((node: HTMLDivElement | null) => {
    (waveRef as React.MutableRefObject<HTMLDivElement | null>).current = node;
    setWaveReady(!!node);
@@ -140,43 +155,45 @@ export function TimelineEditor({
    wsRef.current?.playPause();
  }, []);

-  // Drag-to-reorder handlers
-  const handleDragStart = useCallback((idx: number, e: React.DragEvent) => {
-    setDragFromIdx(idx);
-    e.dataTransfer.effectAllowed = "move";
-    e.dataTransfer.setData("text/plain", String(idx));
-  }, []);
+  // ── Insert block pointer handlers (move only) ──

-  const handleDragOver = useCallback((idx: number, e: React.DragEvent) => {
+  const getTimeFromClientX = useCallback((clientX: number): number => {
+    if (!timelineBarRef.current || audioDuration <= 0) return 0;
+    const rect = timelineBarRef.current.getBoundingClientRect();
+    const ratio = Math.max(0, Math.min(1, (clientX - rect.left) / rect.width));
+    return ratio * audioDuration;
+  }, [audioDuration]);
+
+  const handleInsertPointerDown = useCallback((
+    id: string,
+    e: React.PointerEvent
+  ) => {
    e.preventDefault();
-    e.dataTransfer.dropEffect = "move";
-    setDragOverIdx(idx);
-  }, []);
+    e.stopPropagation();
+    setDragId(id);
+    dragStartXRef.current = e.clientX;
+    dragMovedRef.current = false;
+    const ins = inserts.find((i) => i.id === id);
+    dragStartValRef.current = ins?.start ?? 0;
+    (e.target as HTMLElement).setPointerCapture(e.pointerId);
+  }, [inserts]);

-  const handleDragLeave = useCallback(() => {
-    setDragOverIdx(null);
-  }, []);
-
-  const handleDrop = useCallback((toIdx: number, e: React.DragEvent) => {
-    e.preventDefault();
-    const fromIdx = parseInt(e.dataTransfer.getData("text/plain"), 10);
-    if (!isNaN(fromIdx) && fromIdx !== toIdx) {
-      onReorderSegment(fromIdx, toIdx);
+  const handlePointerMove = useCallback((e: React.PointerEvent) => {
+    if (!dragId) return;
+    if (!dragMovedRef.current) {
+      const dx = Math.abs(e.clientX - dragStartXRef.current);
+      if (dx < DRAG_THRESHOLD) return;
+      dragMovedRef.current = true;
    }
-    setDragFromIdx(null);
-    setDragOverIdx(null);
-  }, [onReorderSegment]);
+    const currentTime = getTimeFromClientX(e.clientX);
+    const startTime = getTimeFromClientX(dragStartXRef.current);
+    onMoveInsert(dragId, dragStartValRef.current + (currentTime - startTime));
+  }, [dragId, getTimeFromClientX, onMoveInsert]);

-  const handleDragEnd = useCallback(() => {
-    setDragFromIdx(null);
-    setDragOverIdx(null);
+  const handlePointerUp = useCallback(() => {
+    setDragId(null);
  }, []);

-  // Filter visible vs overflow segments
-  const visibleSegments = useMemo(() => segments.filter((s) => s.start < audioDuration), [segments, audioDuration]);
-  const overflowSegments = useMemo(() => segments.filter((s) => s.start >= audioDuration), [segments, audioDuration]);
-  const hasSegments = visibleSegments.length > 0;
-
  const content = (
    <>
      <div className="flex items-center justify-between mb-3">
@@ -188,37 +205,49 @@ export function TimelineEditor({
          <h3 className="text-sm font-medium text-gray-400">时间轴编辑</h3>
        )}
        <div className="flex items-center gap-2 text-xs text-gray-400">
-          <div ref={ratioRef} className="relative">
-            <button
-              type="button"
-              onClick={() => setRatioOpen((v) => !v)}
-              className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1 transition-all"
-              title="设置输出画面比例"
+          <div className="shrink-0">
+            <SelectPopover
+              sheetTitle="设置输出画面比例"
+              trigger={({ open, toggle }) => (
+                <button
+                  type="button"
+                  onClick={toggle}
+                  className="rounded-lg border border-white/10 bg-black/25 px-2.5 py-1.5 text-left transition-colors hover:border-white/30"
+                  title="设置输出画面比例"
+                >
+                  <span className="flex items-center justify-between gap-2">
+                    <span className="truncate text-xs text-white">画面: {currentRatioLabel}</span>
+                    <ChevronDown className={`h-3.5 w-3.5 text-gray-300 transition-transform ${open ? "rotate-180" : ""}`} />
+                  </span>
+                </button>
+              )}
            >
-              画面: {currentRatioLabel}
-              <ChevronDown className={`h-3 w-3 transition-transform ${ratioOpen ? "rotate-180" : ""}`} />
-            </button>
-            {ratioOpen && (
-              <div className="absolute right-0 top-full mt-1 bg-gray-800 border border-white/20 rounded-lg shadow-xl py-1 z-50 min-w-[106px]">
-                {ratioOptions.map((opt) => (
-                  <button
-                    key={opt.value}
-                    type="button"
-                    onClick={() => {
-                      onOutputAspectRatioChange(opt.value);
-                      setRatioOpen(false);
-                    }}
-                    className={`w-full text-left px-3 py-1.5 text-xs transition-colors ${
-                      outputAspectRatio === opt.value
-                        ? "bg-purple-600/40 text-purple-200"
-                        : "text-gray-300 hover:bg-white/10"
-                    }`}
-                  >
-                    {opt.label}
-                  </button>
-                ))}
-              </div>
-            )}
+              {({ close }) => (
+                <div className="space-y-1">
+                  {ratioOptions.map((opt) => {
+                    const isSelected = outputAspectRatio === opt.value;
+                    return (
+                      <button
+                        key={opt.value}
+                        type="button"
+                        data-popover-selected={isSelected ? "true" : undefined}
+                        onClick={() => {
+                          onOutputAspectRatioChange(opt.value);
+                          close();
+                        }}
+                        className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left transition-colors ${isSelected
+                          ? "border-purple-500 bg-purple-500/20"
+                          : "border-white/10 bg-white/5 hover:border-white/30"
+                          }`}
+                      >
+                        <span className="text-xs text-white">{opt.label}</span>
+                        {isSelected && <Check className="h-3.5 w-3.5 text-purple-300" />}
+                      </button>
+                    );
+                  })}
+                </div>
+              )}
+            </SelectPopover>
          </div>

          {audioUrl && (
@@ -238,109 +267,149 @@ export function TimelineEditor({
        </div>
      </div>

-      {/* Waveform — always rendered so ref stays mounted */}
+      {/* Waveform */}
      <div className="relative mb-1">
        <div ref={waveCallbackRef} className="rounded-lg overflow-hidden bg-black/20 cursor-pointer" style={{ minHeight: 56 }} />
      </div>

-      {/* Segment blocks or empty placeholder */}
-      {hasSegments ? (
+      {/* Timeline visualization */}
+      {hasPrimary && audioDuration > 0 ? (
        <>
-          <div className="relative h-14 flex select-none">
-            {/* Playhead — syncs with audio playback */}
+          <div
+            ref={timelineBarRef}
+            className="relative select-none touch-none"
+            style={{ minHeight: isMultiCam ? 80 : 56 }}
+            onPointerMove={handlePointerMove}
+            onPointerUp={handlePointerUp}
+            onPointerLeave={handlePointerUp}
+          >
+            {/* Playhead */}
            <div
              ref={playheadRef}
-              className="absolute top-0 h-full w-0.5 bg-fuchsia-400 z-10 pointer-events-none"
+              className="absolute top-0 h-full w-0.5 bg-fuchsia-400 z-20 pointer-events-none"
              style={{ display: "none", left: "0%" }}
            />
-            {visibleSegments.map((seg, i) => {
-              const left = (seg.start / audioDuration) * 100;
-              const width = ((seg.end - seg.start) / audioDuration) * 100;
-              const segDur = seg.end - seg.start;
-              const isDragTarget = dragOverIdx === i && dragFromIdx !== i;

-              // Compute loop portion for the last visible segment
-              const isLastVisible = i === visibleSegments.length - 1;
-              let loopPercent = 0;
-              if (isLastVisible && audioDuration > 0) {
-                const mat = materials.find((m) => m.id === seg.materialId);
-                const matDur = mat?.duration_sec ?? 0;
-                const effDur = (seg.sourceEnd > seg.sourceStart)
-                  ? (seg.sourceEnd - seg.sourceStart)
-                  : Math.max(matDur - seg.sourceStart, 0);
-                if (effDur > 0 && segDur > effDur + 0.1) {
-                  loopPercent = ((segDur - effDur) / segDur) * 100;
-                }
-              }
+            {/* Primary material background bar */}
+            <button
+              onClick={onClickPrimary}
+              className="absolute inset-0 rounded-lg overflow-hidden border border-purple-500/30 hover:border-purple-500/50 transition-colors cursor-pointer"
+              style={{ backgroundColor: "#8b5cf620" }}
+              title={`主素材: ${primaryMaterial?.scene || primaryMaterial?.name || ""}${
+                loopCount > 1 ? ` (${primaryEffective.toFixed(1)}s ×${loopCount.toFixed(1)} 循环)` : ""
+              }\n点击设置截取范围`}
+            >
+              {/* Loop stripe pattern */}
+              {loopCount > 1 && (
+                <div
+                  className="absolute inset-0 pointer-events-none"
+                  style={{
+                    background: `repeating-linear-gradient(-45deg, transparent, transparent 6px, rgba(139,92,246,0.06) 6px, rgba(139,92,246,0.06) 12px)`,
+                  }}
+                />
+              )}
+              <div className="absolute inset-0 flex items-center px-3">
+                <span className="text-[11px] text-purple-300/80 truncate">
+                  主素材: {primaryMaterial?.scene || primaryMaterial?.name || ""}
+                  {loopCount > 1 && (
+                    <span className="text-purple-400/60 ml-1">
+                      ({primaryEffective.toFixed(1)}s ×{loopCount.toFixed(1)} 循环)
+                    </span>
+                  )}
+                  {primarySourceStart > 0 && (
+                    <span className="text-amber-400/80 ml-1">✂ {primarySourceStart.toFixed(1)}s</span>
+                  )}
+                </span>
+              </div>
+            </button>
+
+            {/* Insert blocks floating above primary */}
+            {inserts.map((ins) => {
+              const left = (ins.start / audioDuration) * 100;
+              const width = ((ins.end - ins.start) / audioDuration) * 100;
+              const insDur = ins.end - ins.start;
+              const isDragging = dragId === ins.id;

              return (
-                <div key={seg.id} className="absolute top-0 h-full" style={{ left: `${left}%`, width: `${width}%` }}>
+                <div
+                  key={ins.id}
+                  className={`absolute group min-h-[40px] ${isDragging ? "z-30" : "z-10"}`}
+                  style={{
+                    left: `${left}%`,
+                    width: `${width}%`,
+                    top: isMultiCam ? 12 : 4,
+                    bottom: isMultiCam ? 12 : 4,
+                  }}
+                >
+                  {/* Main block body — move on drag, click opens ClipTrimmer */}
                  <button
-                    draggable
-                    onDragStart={(e) => handleDragStart(i, e)}
-                    onDragOver={(e) => handleDragOver(i, e)}
-                    onDragLeave={handleDragLeave}
-                    onDrop={(e) => handleDrop(i, e)}
-                    onDragEnd={handleDragEnd}
-                    onClick={() => onClickSegment(seg)}
-                    className={`relative w-full h-full rounded-lg flex flex-col items-center justify-center overflow-hidden cursor-grab active:cursor-grabbing transition-all border ${
-                      isDragTarget
-                        ? "ring-2 ring-purple-400 border-purple-400 scale-[1.02]"
-                        : dragFromIdx === i
-                        ? "opacity-50 border-white/10"
-                        : "hover:opacity-90 border-white/10"
+                    className={`w-full h-full rounded-lg flex flex-col items-center justify-center overflow-hidden cursor-grab active:cursor-grabbing transition-all border ${
+                      isDragging
+                        ? "ring-2 ring-white/40 scale-[1.02]"
+                        : "hover:brightness-110"
                    }`}
-                    style={{ backgroundColor: seg.color + "33", borderColor: isDragTarget ? undefined : seg.color + "66" }}
-                    title={`拖拽可调换顺序 · 点击设置截取范围\n${seg.materialName}\n${segDur.toFixed(1)}s${loopPercent > 0 ? ` (含循环 ${(segDur * loopPercent / 100).toFixed(1)}s)` : ""}`}
+                    style={{
+                      backgroundColor: ins.color + "55",
+                      borderColor: ins.color + "88",
+                    }}
+                    onPointerDown={(e) => handleInsertPointerDown(ins.id, e)}
+                    onClick={() => {
+                      if (!dragMovedRef.current) onClickInsert(ins);
+                    }}
+                    title={`${ins.materialName} ${insDur.toFixed(1)}s\n点击设置截取范围`}
                  >
-                    <GripVertical className="absolute top-0.5 left-0.5 h-3 w-3 text-white/30 z-[1]" />
                    <span className="text-[11px] text-white/90 truncate max-w-full px-1 leading-tight z-[1]">
-                      {seg.materialName}
+                      {ins.materialName}
                    </span>
                    <span className="text-[10px] text-white/60 leading-tight z-[1]">
-                      {segDur.toFixed(1)}s
+                      {insDur.toFixed(1)}s
                    </span>
-                    {seg.sourceStart > 0 && (
+                    {ins.sourceStart > 0 && (
                      <span className="text-[9px] text-amber-400/80 leading-tight z-[1]">
-                        ✂ {seg.sourceStart.toFixed(1)}s
+                        ✂ {ins.sourceStart.toFixed(1)}s
                      </span>
                    )}
-                    {/* Loop fill stripe overlay */}
-                    {loopPercent > 0 && (
-                      <div
-                        className="absolute top-0 right-0 h-full pointer-events-none flex items-center justify-center"
-                        style={{
-                          width: `${loopPercent}%`,
-                          background: `repeating-linear-gradient(-45deg, transparent, transparent 3px, rgba(255,255,255,0.07) 3px, rgba(255,255,255,0.07) 6px)`,
-                          borderLeft: "1px dashed rgba(255,255,255,0.25)",
-                        }}
-                      >
-                        <span className="text-[9px] text-white/30">循环</span>
-                      </div>
-                    )}
+                  </button>
+
+                  {/* Delete button */}
+                  <button
+                    className="absolute -top-1.5 -right-1.5 w-5 h-5 rounded-full bg-red-500/80 hover:bg-red-500 flex items-center justify-center opacity-40 sm:opacity-0 sm:group-hover:opacity-100 transition-opacity z-20"
+                    onClick={(e) => {
+                      e.stopPropagation();
+                      onRemoveInsert(ins.id);
+                    }}
+                    title="删除此插入"
+                  >
+                    <X className="w-3 h-3 text-white" />
                  </button>
                </div>
              );
            })}
          </div>

-          {/* Overflow segments — shown as gray chips */}
-          {overflowSegments.length > 0 && (
-            <div className="flex flex-wrap items-center gap-1.5 mt-1.5">
-              <span className="text-[10px] text-gray-500">未使用:</span>
-              {overflowSegments.map((seg) => (
-                <span
-                  key={seg.id}
-                  className="text-[10px] text-gray-500 bg-white/5 border border-white/10 rounded px-1.5 py-0.5"
+          {/* Insert candidates bar (multi-cam only) */}
+          {isMultiCam && insertCandidates.length > 0 && (
+            <div className="flex flex-wrap items-center gap-1.5 mt-2">
+              <span className="text-[10px] text-gray-500">可插入:</span>
+              {insertCandidates.map((mat) => (
+                <button
+                  key={mat.id}
+                  className="flex items-center gap-0.5 text-[10px] text-gray-300 bg-white/5 border border-white/10 hover:border-white/30 rounded px-1.5 py-0.5 transition-colors"
+                  onClick={() => onAddInsert(mat.id)}
+                  title={`添加 "${mat.scene || mat.name}" 到时间轴`}
                >
-                  {seg.materialName}
-                </span>
+                  <Plus className="w-2.5 h-2.5" />
+                  {mat.scene || mat.name}
+                </button>
              ))}
            </div>
          )}

          <p className="text-[10px] text-gray-500 mt-1.5">
-            点击波形定位播放 · 拖拽色块调换顺序 · 点击色块设置截取范围
+            {isMultiCam
+              ? "点击主素材设置截取范围 · 拖拽插入块调整位置 · 点击插入块设置截取/时长"
+              : "点击波形定位播放 · 点击素材条设置截取范围"
+            }
          </p>
        </>
      ) : (
--- a/frontend/src/features/home/ui/TitleSubtitlePanel.tsx
+++ b/frontend/src/features/home/ui/TitleSubtitlePanel.tsx
@@ -1,5 +1,6 @@
-import { ChevronDown, Eye } from "lucide-react";
+import { ChevronDown, Eye, Check, Loader2, Sparkles } from "lucide-react";
 import { FloatingStylePreview } from "@/features/home/ui/FloatingStylePreview";
+import { SelectPopover } from "@/shared/ui/SelectPopover";

 interface SubtitleStyleOption {
  id: string;
@@ -34,6 +35,9 @@ interface TitleStyleOption {
 interface TitleSubtitlePanelProps {
  showStylePreview: boolean;
  onTogglePreview: () => void;
+  onGenerateMeta: () => void;
+  isGeneratingMeta: boolean;
+  canGenerateMeta: boolean;
  videoTitle: string;
  onTitleChange: (value: string) => void;
  onTitleCompositionStart?: () => void;
@@ -75,6 +79,9 @@ interface TitleSubtitlePanelProps {
 export function TitleSubtitlePanel({
  showStylePreview,
  onTogglePreview,
+  onGenerateMeta,
+  isGeneratingMeta,
+  canGenerateMeta,
  videoTitle,
  onTitleChange,
  onTitleCompositionStart,
@@ -112,33 +119,100 @@ export function TitleSubtitlePanel({
  previewBaseHeight = 1920,
  previewBackgroundUrl,
 }: TitleSubtitlePanelProps) {
+  const titleDisplayOptions: Array<{ value: "short" | "persistent"; label: string }> = [
+    { value: "short", label: "标题短暂显示" },
+    { value: "persistent", label: "标题常驻显示" },
+  ];
+  const currentTitleDisplay = titleDisplayOptions.find((opt) => opt.value === titleDisplayMode) || titleDisplayOptions[0];
+
+  const currentTitleStyle = titleStyles.find((style) => style.id === selectedTitleStyleId) || titleStyles[0] || null;
+  const currentSecondaryTitleStyle = titleStyles.find((style) => style.id === selectedSecondaryTitleStyleId) || titleStyles[0] || null;
+  const currentSubtitleStyle = subtitleStyles.find((style) => style.id === selectedSubtitleStyleId) || subtitleStyles[0] || null;
+
  return (
    <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
-      <div className="flex items-center justify-between mb-4 gap-2">
-        <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2">
-          四、标题与字幕
-        </h2>
-        <div className="flex items-center gap-1.5">
-          <div className="relative shrink-0">
-            <select
-              value={titleDisplayMode}
-              onChange={(e) => onTitleDisplayModeChange(e.target.value as "short" | "persistent")}
-              className="appearance-none rounded-lg border border-white/15 bg-black/35 px-2.5 py-1.5 pr-7 text-xs text-gray-200 outline-none transition-colors hover:border-white/25 focus:border-purple-500"
-              aria-label="标题显示方式"
-            >
-              <option value="short">标题短暂显示</option>
-              <option value="persistent">标题常驻显示</option>
-            </select>
-            <ChevronDown className="pointer-events-none absolute right-2 top-1/2 h-3.5 w-3.5 -translate-y-1/2 text-gray-400" />
-          </div>
+      <div className="mb-4 space-y-2">
+        <div className="flex flex-wrap items-center justify-between gap-2">
+          <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2">
+            四、标题与字幕
+          </h2>
          <button
-            onClick={onTogglePreview}
-            className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 flex items-center gap-1"
+            onClick={onGenerateMeta}
+            disabled={isGeneratingMeta || !canGenerateMeta}
+            className={`px-3 py-1.5 text-xs rounded-lg transition-colors inline-flex items-center gap-1.5 ${
+              isGeneratingMeta || !canGenerateMeta
+                ? "bg-gray-600 cursor-not-allowed text-gray-400"
+                : "bg-gradient-to-r from-blue-600 to-cyan-600 hover:from-blue-700 hover:to-cyan-700 text-white"
+            }`}
          >
-            <Eye className="h-3.5 w-3.5" />
-            {showStylePreview ? "收起预览" : "预览样式"}
+            {isGeneratingMeta ? (
+              <>
+                <Loader2 className="h-3.5 w-3.5 animate-spin" />
+                生成中...
+              </>
+            ) : (
+              <>
+                <Sparkles className="h-3.5 w-3.5" />
+                AI生成标题标签
+              </>
+            )}
          </button>
        </div>
+        <div className="flex justify-end">
+          <div className="flex flex-wrap items-center justify-end gap-1.5">
+            <div className="shrink-0">
+              <SelectPopover
+                sheetTitle="标题显示方式"
+                trigger={({ open, toggle }) => (
+                  <button
+                    type="button"
+                    onClick={toggle}
+                    className="min-w-[146px] rounded-lg border border-white/10 bg-black/25 px-2.5 py-1.5 text-left text-xs text-gray-200 transition-colors hover:border-white/30"
+                    aria-label="标题显示方式"
+                  >
+                    <span className="flex items-center justify-between gap-2">
+                      <span className="whitespace-nowrap">{currentTitleDisplay.label}</span>
+                      <ChevronDown className={`h-3.5 w-3.5 text-gray-400 transition-transform ${open ? "rotate-180" : ""}`} />
+                    </span>
+                  </button>
+                )}
+              >
+                {({ close }) => (
+                  <div className="space-y-1">
+                    {titleDisplayOptions.map((opt) => {
+                      const isSelected = opt.value === titleDisplayMode;
+                      return (
+                        <button
+                          key={opt.value}
+                          type="button"
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          onClick={() => {
+                            onTitleDisplayModeChange(opt.value);
+                            close();
+                          }}
+                          className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20"
+                            : "border-white/10 bg-white/5 hover:border-white/30"
+                            }`}
+                        >
+                          <span className="text-xs text-white whitespace-nowrap">{opt.label}</span>
+                          {isSelected && <Check className="h-3.5 w-3.5 text-purple-300" />}
+                        </button>
+                      );
+                    })}
+                  </div>
+                )}
+              </SelectPopover>
+            </div>
+            <button
+              onClick={onTogglePreview}
+              className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 flex items-center gap-1"
+            >
+              <Eye className="h-3.5 w-3.5" />
+              {showStylePreview ? "收起预览" : "预览样式"}
+            </button>
+          </div>
+        </div>
      </div>

      {showStylePreview && (
@@ -203,17 +277,48 @@ export function TitleSubtitlePanel({
        <div className="mb-4 space-y-3">
          <div className="flex items-center gap-3">
            <label className="text-sm text-gray-300 shrink-0 w-20">标题样式</label>
-            <div className="relative w-1/3 min-w-[100px]">
-              <select
-                value={selectedTitleStyleId}
-                onChange={(e) => onSelectTitleStyle(e.target.value)}
-                className="w-full appearance-none rounded-lg border border-white/15 bg-black/35 px-3 py-2 pr-8 text-sm text-gray-200 outline-none transition-colors hover:border-white/25 focus:border-purple-500"
+            <div className="w-1/3 min-w-[130px]">
+              <SelectPopover
+                sheetTitle="标题样式"
+                trigger={({ open, toggle }) => (
+                  <button
+                    type="button"
+                    onClick={toggle}
+                    className="w-full rounded-lg border border-white/15 bg-black/35 px-3 py-2 text-left text-sm text-gray-200 transition-colors hover:border-white/25"
+                  >
+                    <span className="flex items-center justify-between gap-2">
+                      <span className="truncate">{currentTitleStyle?.label || "请选择"}</span>
+                      <ChevronDown className={`h-3.5 w-3.5 text-gray-400 transition-transform ${open ? "rotate-180" : ""}`} />
+                    </span>
+                  </button>
+                )}
              >
-                {titleStyles.map((style) => (
-                  <option key={style.id} value={style.id}>{style.label}</option>
-                ))}
-              </select>
-              <ChevronDown className="pointer-events-none absolute right-2.5 top-1/2 h-3.5 w-3.5 -translate-y-1/2 text-gray-400" />
+                {({ close }) => (
+                  <div className="space-y-1">
+                    {titleStyles.map((style) => {
+                      const isSelected = selectedTitleStyleId === style.id;
+                      return (
+                        <button
+                          key={style.id}
+                          type="button"
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          onClick={() => {
+                            onSelectTitleStyle(style.id);
+                            close();
+                          }}
+                          className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20"
+                            : "border-white/10 bg-white/5 hover:border-white/30"
+                            }`}
+                        >
+                          <span className="text-sm text-white">{style.label}</span>
+                          {isSelected && <Check className="h-4 w-4 text-purple-300" />}
+                        </button>
+                      );
+                    })}
+                  </div>
+                )}
+              </SelectPopover>
            </div>
          </div>
          <div className="flex items-center gap-3">
@@ -231,17 +336,48 @@ export function TitleSubtitlePanel({
        <div className="mb-4 space-y-3">
          <div className="flex items-center gap-3">
            <label className="text-sm text-gray-300 shrink-0 w-20">副标题样式</label>
-            <div className="relative w-1/3 min-w-[100px]">
-              <select
-                value={selectedSecondaryTitleStyleId}
-                onChange={(e) => onSelectSecondaryTitleStyle(e.target.value)}
-                className="w-full appearance-none rounded-lg border border-white/15 bg-black/35 px-3 py-2 pr-8 text-sm text-gray-200 outline-none transition-colors hover:border-white/25 focus:border-purple-500"
+            <div className="w-1/3 min-w-[130px]">
+              <SelectPopover
+                sheetTitle="副标题样式"
+                trigger={({ open, toggle }) => (
+                  <button
+                    type="button"
+                    onClick={toggle}
+                    className="w-full rounded-lg border border-white/15 bg-black/35 px-3 py-2 text-left text-sm text-gray-200 transition-colors hover:border-white/25"
+                  >
+                    <span className="flex items-center justify-between gap-2">
+                      <span className="truncate">{currentSecondaryTitleStyle?.label || "请选择"}</span>
+                      <ChevronDown className={`h-3.5 w-3.5 text-gray-400 transition-transform ${open ? "rotate-180" : ""}`} />
+                    </span>
+                  </button>
+                )}
              >
-                {titleStyles.map((style) => (
-                  <option key={style.id} value={style.id}>{style.label}</option>
-                ))}
-              </select>
-              <ChevronDown className="pointer-events-none absolute right-2.5 top-1/2 h-3.5 w-3.5 -translate-y-1/2 text-gray-400" />
+                {({ close }) => (
+                  <div className="space-y-1">
+                    {titleStyles.map((style) => {
+                      const isSelected = selectedSecondaryTitleStyleId === style.id;
+                      return (
+                        <button
+                          key={style.id}
+                          type="button"
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          onClick={() => {
+                            onSelectSecondaryTitleStyle(style.id);
+                            close();
+                          }}
+                          className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20"
+                            : "border-white/10 bg-white/5 hover:border-white/30"
+                            }`}
+                        >
+                          <span className="text-sm text-white">{style.label}</span>
+                          {isSelected && <Check className="h-4 w-4 text-purple-300" />}
+                        </button>
+                      );
+                    })}
+                  </div>
+                )}
+              </SelectPopover>
            </div>
          </div>
          <div className="flex items-center gap-3">
@@ -259,17 +395,48 @@ export function TitleSubtitlePanel({
        <div className="mt-4 space-y-3">
          <div className="flex items-center gap-3">
            <label className="text-sm text-gray-300 shrink-0 w-20">字幕样式</label>
-            <div className="relative w-1/3 min-w-[100px]">
-              <select
-                value={selectedSubtitleStyleId}
-                onChange={(e) => onSelectSubtitleStyle(e.target.value)}
-                className="w-full appearance-none rounded-lg border border-white/15 bg-black/35 px-3 py-2 pr-8 text-sm text-gray-200 outline-none transition-colors hover:border-white/25 focus:border-purple-500"
+            <div className="w-1/3 min-w-[130px]">
+              <SelectPopover
+                sheetTitle="字幕样式"
+                trigger={({ open, toggle }) => (
+                  <button
+                    type="button"
+                    onClick={toggle}
+                    className="w-full rounded-lg border border-white/15 bg-black/35 px-3 py-2 text-left text-sm text-gray-200 transition-colors hover:border-white/25"
+                  >
+                    <span className="flex items-center justify-between gap-2">
+                      <span className="truncate">{currentSubtitleStyle?.label || "请选择"}</span>
+                      <ChevronDown className={`h-3.5 w-3.5 text-gray-400 transition-transform ${open ? "rotate-180" : ""}`} />
+                    </span>
+                  </button>
+                )}
              >
-                {subtitleStyles.map((style) => (
-                  <option key={style.id} value={style.id}>{style.label}</option>
-                ))}
-              </select>
-              <ChevronDown className="pointer-events-none absolute right-2.5 top-1/2 h-3.5 w-3.5 -translate-y-1/2 text-gray-400" />
+                {({ close }) => (
+                  <div className="space-y-1">
+                    {subtitleStyles.map((style) => {
+                      const isSelected = selectedSubtitleStyleId === style.id;
+                      return (
+                        <button
+                          key={style.id}
+                          type="button"
+                          data-popover-selected={isSelected ? "true" : undefined}
+                          onClick={() => {
+                            onSelectSubtitleStyle(style.id);
+                            close();
+                          }}
+                          className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left transition-colors ${isSelected
+                            ? "border-purple-500 bg-purple-500/20"
+                            : "border-white/10 bg-white/5 hover:border-white/30"
+                            }`}
+                        >
+                          <span className="text-sm text-white">{style.label}</span>
+                          {isSelected && <Check className="h-4 w-4 text-purple-300" />}
+                        </button>
+                      );
+                    })}
+                  </div>
+                )}
+              </SelectPopover>
            </div>
          </div>
          <div className="flex items-center gap-3">
--- a/frontend/src/features/home/ui/VoiceSelector.tsx
+++ b/frontend/src/features/home/ui/VoiceSelector.tsx
@@ -1,11 +1,34 @@
-import type { ReactNode } from "react";
-import { Mic, Volume2 } from "lucide-react";
+import { useCallback, useEffect, useRef, useState, type MouseEvent, type ReactNode } from "react";
+import { Check, ChevronDown, Loader2, Mic, Pause, Play, Volume2 } from "lucide-react";
+import { toast } from "sonner";
+import { SelectPopover } from "@/shared/ui/SelectPopover";

 interface VoiceOption {
  id: string;
  name: string;
 }

+const LOCALE_LABELS: Record<string, string> = {
+  "zh-CN": "中文",
+  "en-US": "English",
+  "ja-JP": "日本語",
+  "ko-KR": "한국어",
+  "fr-FR": "Français",
+  "de-DE": "Deutsch",
+  "es-ES": "Español",
+  "ru-RU": "Русский",
+  "it-IT": "Italiano",
+  "pt-BR": "Português",
+};
+
+const getLocaleFromVoiceId = (voiceId: string) => {
+  const parts = voiceId.split("-");
+  if (parts.length >= 2) {
+    return `${parts[0]}-${parts[1]}`;
+  }
+  return voiceId;
+};
+
 interface VoiceSelectorProps {
  ttsMode: "edgetts" | "voiceclone";
  onSelectTtsMode: (mode: "edgetts" | "voiceclone") => void;
@@ -25,6 +48,102 @@ export function VoiceSelector({
  voiceCloneSlot,
  embedded = false,
 }: VoiceSelectorProps) {
+  const selectedVoice = voices.find((v) => v.id === voice) ?? voices[0];
+  const selectedLocale = selectedVoice ? getLocaleFromVoiceId(selectedVoice.id) : "";
+  const selectedLangLabel = LOCALE_LABELS[selectedLocale] ?? selectedLocale;
+
+  const [previewingVoiceId, setPreviewingVoiceId] = useState<string | null>(null);
+  const [previewLoadingVoiceId, setPreviewLoadingVoiceId] = useState<string | null>(null);
+  const previewPlayerRef = useRef<HTMLAudioElement | null>(null);
+  const previewRequestIdRef = useRef(0);
+
+  const stopVoicePreview = useCallback(() => {
+    previewRequestIdRef.current += 1;
+
+    if (previewPlayerRef.current) {
+      previewPlayerRef.current.pause();
+      previewPlayerRef.current.src = "";
+      previewPlayerRef.current.currentTime = 0;
+      previewPlayerRef.current = null;
+    }
+    setPreviewingVoiceId(null);
+    setPreviewLoadingVoiceId(null);
+  }, []);
+
+  useEffect(() => () => {
+    stopVoicePreview();
+  }, [stopVoicePreview]);
+
+  useEffect(() => {
+    if (ttsMode !== "edgetts") {
+      stopVoicePreview();
+    }
+  }, [ttsMode, stopVoicePreview]);
+
+  const handleVoicePreview = useCallback(async (voiceId: string, e: MouseEvent<HTMLButtonElement>) => {
+    e.stopPropagation();
+
+    if (previewingVoiceId === voiceId) {
+      stopVoicePreview();
+      return;
+    }
+
+    stopVoicePreview();
+    setPreviewLoadingVoiceId(voiceId);
+    const requestId = ++previewRequestIdRef.current;
+
+    try {
+      const audioUrl = `/api/videos/voice-preview?voice=${encodeURIComponent(voiceId)}`;
+      const player = new Audio(audioUrl);
+      previewPlayerRef.current = player;
+      let errorNotified = false;
+
+      const notifyPreviewError = () => {
+        if (errorNotified) return;
+        errorNotified = true;
+        toast.error("音色试听失败，请稍后重试");
+      };
+
+      player.onplaying = () => {
+        if (requestId === previewRequestIdRef.current) {
+          setPreviewLoadingVoiceId(null);
+          setPreviewingVoiceId(voiceId);
+        }
+      };
+
+      player.onended = () => {
+        if (previewPlayerRef.current === player) {
+          previewPlayerRef.current = null;
+          setPreviewingVoiceId(null);
+          setPreviewLoadingVoiceId(null);
+        }
+      };
+
+      player.onerror = () => {
+        if (previewPlayerRef.current === player) {
+          previewPlayerRef.current = null;
+          setPreviewingVoiceId(null);
+          setPreviewLoadingVoiceId(null);
+          notifyPreviewError();
+        }
+      };
+
+      await player.play();
+
+      if (requestId !== previewRequestIdRef.current) {
+        player.pause();
+        player.src = "";
+        player.currentTime = 0;
+      }
+    } catch {
+      if (requestId === previewRequestIdRef.current) toast.error("音色试听失败，请稍后重试");
+    } finally {
+      if (requestId === previewRequestIdRef.current) {
+        setPreviewLoadingVoiceId(null);
+      }
+    }
+  }, [previewingVoiceId, stopVoicePreview]);
+
  const content = (
    <>
      <div className="flex gap-2 mb-4">
@@ -51,19 +170,86 @@ export function VoiceSelector({
      </div>

      {ttsMode === "edgetts" && (
-        <div className="grid grid-cols-2 gap-3">
-          {voices.map((v) => (
-            <button
-              key={v.id}
-              onClick={() => onSelectVoice(v.id)}
-              className={`p-3 rounded-xl border-2 transition-all text-left ${voice === v.id
-                ? "border-purple-500 bg-purple-500/20"
-                : "border-white/10 bg-white/5 hover:border-white/30"
-                }`}
-            >
-              <span className="text-white text-sm">{v.name}</span>
-            </button>
-          ))}
+        <div className="space-y-2">
+          <p className="text-xs text-gray-400">音色选择</p>
+          <SelectPopover
+            sheetTitle="选择声音"
+            trigger={({ open, toggle }) => (
+              <button
+                type="button"
+                onClick={toggle}
+                className="w-full rounded-xl border border-white/10 bg-black/25 px-3 py-2.5 text-left hover:border-white/30 transition-colors"
+              >
+                <span className="flex items-center justify-between gap-3">
+                  <span className="min-w-0">
+                    <span className="block truncate text-sm text-white">
+                      {selectedVoice?.name || "请选择声音"}
+                    </span>
+                    <span className="block text-xs text-gray-400">
+                      {selectedLangLabel || "未识别语言"}
+                    </span>
+                  </span>
+                  <ChevronDown className={`h-4 w-4 text-gray-300 transition-transform ${open ? "rotate-180" : ""}`} />
+                </span>
+              </button>
+            )}
+          >
+            {({ close }) => (
+              <div className="space-y-1">
+                {voices.map((v) => {
+                  const isSelected = voice === v.id;
+                  const isPreviewing = previewingVoiceId === v.id;
+                  const isPreviewLoading = previewLoadingVoiceId === v.id;
+                  const locale = getLocaleFromVoiceId(v.id);
+                  const langLabel = LOCALE_LABELS[locale] ?? locale;
+
+                  return (
+                    <div
+                      key={v.id}
+                      data-popover-selected={isSelected ? "true" : undefined}
+                      className={`flex w-full items-center justify-between rounded-lg border px-3 py-2 text-left transition-colors ${isSelected
+                        ? "border-purple-500 bg-purple-500/20"
+                        : "border-white/10 bg-white/5 hover:border-white/30"
+                        }`}
+                    >
+                      <button
+                        type="button"
+                        onClick={() => {
+                          stopVoicePreview();
+                          onSelectVoice(v.id);
+                          close();
+                        }}
+                        className="min-w-0 flex-1 text-left"
+                      >
+                        <span className="block truncate text-sm text-white">{v.name}</span>
+                        <span className="mt-0.5 block text-xs text-gray-400">{langLabel}</span>
+                      </button>
+
+                      <div className="flex items-center gap-2 pl-2">
+                        <button
+                          type="button"
+                          onClick={(e) => {
+                            void handleVoicePreview(v.id, e);
+                          }}
+                          className="p-1 text-gray-400 hover:text-purple-300 transition-colors"
+                          title={isPreviewing ? "停止试听" : "试听"}
+                        >
+                          {isPreviewLoading ? (
+                            <Loader2 className="h-4 w-4 animate-spin" />
+                          ) : isPreviewing ? (
+                            <Pause className="h-4 w-4" />
+                          ) : (
+                            <Play className="h-4 w-4" />
+                          )}
+                        </button>
+                        {isSelected && <Check className="h-4 w-4 text-purple-300" />}
+                      </div>
+                    </div>
+                  );
+                })}
+              </div>
+            )}
+          </SelectPopover>
        </div>
      )}

--- a/frontend/src/features/home/ui/script-learning/useScriptLearning.ts
+++ b/frontend/src/features/home/ui/script-learning/useScriptLearning.ts
@@ -0,0 +1,239 @@
+import { useCallback, useEffect, useState } from "react";
+import api from "@/shared/api/axios";
+import { ApiResponse, unwrap } from "@/shared/api/types";
+import { toast } from "sonner";
+
+export type ScriptLearningStep = "input" | "analyzing" | "topics" | "generating" | "result";
+
+const WORD_COUNT_MIN = 80;
+const WORD_COUNT_MAX = 1000;
+const DEFAULT_WORD_COUNT = "300";
+
+interface UseScriptLearningOptions {
+  isOpen: boolean;
+}
+
+interface AnalyzeCreatorPayload {
+  topics: string[];
+  analysis_id: string;
+  fetched_count: number;
+}
+
+interface GenerateTopicScriptPayload {
+  script: string;
+}
+
+export const useScriptLearning = ({ isOpen }: UseScriptLearningOptions) => {
+  const [step, setStep] = useState<ScriptLearningStep>("input");
+  const [inputUrl, setInputUrl] = useState("");
+  const [topics, setTopics] = useState<string[]>([]);
+  const [selectedTopic, setSelectedTopic] = useState<string | null>(null);
+  const [wordCount, setWordCount] = useState(DEFAULT_WORD_COUNT);
+  const [generatedScript, setGeneratedScript] = useState("");
+  const [error, setError] = useState<string | null>(null);
+  const [analysisId, setAnalysisId] = useState<string | null>(null);
+  const [fetchedCount, setFetchedCount] = useState(0);
+
+  const resetAll = useCallback(() => {
+    setStep("input");
+    setInputUrl("");
+    setTopics([]);
+    setSelectedTopic(null);
+    setWordCount(DEFAULT_WORD_COUNT);
+    setGeneratedScript("");
+    setError(null);
+    setAnalysisId(null);
+    setFetchedCount(0);
+  }, []);
+
+  useEffect(() => {
+    if (isOpen) {
+      resetAll();
+    }
+  }, [isOpen, resetAll]);
+
+  const parseWordCount = useCallback((value: string): number | null => {
+    const num = Number(value); // digits-only check below: Number() alone also accepts "1e2" or "0x64"
+    if (!/^\d+$/.test(value) || !Number.isInteger(num)) {
+      return null;
+    }
+    if (num < WORD_COUNT_MIN || num > WORD_COUNT_MAX) {
+      return null;
+    }
+    return num;
+  }, []);
+
+  const handleAnalyze = useCallback(async () => {
+    const urlValue = inputUrl.trim();
+    if (!urlValue) {
+      setError("请先输入博主主页链接");
+      return;
+    }
+
+    setError(null);
+    setStep("analyzing");
+
+    try {
+      const { data: res } = await api.post<ApiResponse<AnalyzeCreatorPayload>>(
+        "/api/tools/analyze-creator",
+        { url: urlValue },
+        { timeout: 60000 }
+      );
+      const payload = unwrap(res);
+      const topicList = payload.topics || [];
+
+      if (topicList.length === 0) {
+        throw new Error("未识别到可用话题，请更换链接重试");
+      }
+
+      setTopics(topicList);
+      setSelectedTopic(topicList[0]);
+      setAnalysisId(payload.analysis_id || null);
+      setFetchedCount(payload.fetched_count || 0);
+      setGeneratedScript("");
+      setStep("topics");
+    } catch (err: unknown) {
+      const axiosErr = err as {
+        response?: { data?: { message?: string } };
+        message?: string;
+      };
+      const msg = axiosErr.response?.data?.message || axiosErr.message || "分析失败，请稍后重试";
+      setError(msg);
+      setStep("input");
+    }
+  }, [inputUrl]);
+
+  const handleGenerate = useCallback(async () => {
+    if (!analysisId) {
+      setError("分析结果已失效，请重新分析");
+      setStep("input");
+      return;
+    }
+    if (!selectedTopic) {
+      setError("请先选择一个话题");
+      return;
+    }
+
+    const count = parseWordCount(wordCount.trim());
+    if (count === null) {
+      setError(`目标字数需在 ${WORD_COUNT_MIN}-${WORD_COUNT_MAX} 之间`);
+      return;
+    }
+
+    setError(null);
+    setStep("generating");
+
+    try {
+      const { data: res } = await api.post<ApiResponse<GenerateTopicScriptPayload>>(
+        "/api/tools/generate-topic-script",
+        {
+          analysis_id: analysisId,
+          topic: selectedTopic,
+          word_count: count,
+        },
+        { timeout: 90000 }
+      );
+      const payload = unwrap(res);
+
+      const script = (payload.script || "").trim();
+      if (!script) {
+        throw new Error("生成内容为空，请重试");
+      }
+
+      setGeneratedScript(script);
+      setStep("result");
+    } catch (err: unknown) {
+      const axiosErr = err as {
+        response?: { data?: { message?: string } };
+        message?: string;
+        code?: string;
+      };
+      let msg = axiosErr.response?.data?.message || axiosErr.message || "生成失败，请稍后重试";
+      if (axiosErr.code === "ECONNABORTED" || /timeout/i.test(axiosErr.message || "")) {
+        msg = "生成超时，请稍后重试（可适当减少目标字数）";
+      }
+      setError(msg);
+      setStep("topics");
+    }
+  }, [analysisId, parseWordCount, selectedTopic, wordCount]);
+
+  const handleRegenerate = useCallback(async () => {
+    await handleGenerate();
+  }, [handleGenerate]);
+
+  const backToInput = useCallback(() => {
+    setError(null);
+    setStep("input");
+  }, []);
+
+  const backToTopics = useCallback(() => {
+    setError(null);
+    setStep("topics");
+  }, []);
+
+  const fallbackCopyTextToClipboard = useCallback((text: string) => {
+    const textArea = document.createElement("textarea");
+    textArea.value = text;
+    textArea.style.top = "0";
+    textArea.style.left = "0";
+    textArea.style.position = "fixed";
+    textArea.style.opacity = "0";
+
+    document.body.appendChild(textArea);
+    textArea.focus();
+    textArea.select();
+
+    try {
+      const successful = document.execCommand("copy");
+      if (successful) {
+        toast.success("已复制到剪贴板");
+      } else {
+        toast.error("复制失败，请手动复制");
+      }
+    } catch {
+      toast.error("复制失败，请手动复制");
+    } finally {
+      document.body.removeChild(textArea);
+    }
+  }, []);
+
+  const copyToClipboard = useCallback(
+    (text: string) => {
+      if (navigator.clipboard && window.isSecureContext) {
+        navigator.clipboard
+          .writeText(text)
+          .then(() => {
+            toast.success("已复制到剪贴板");
+          })
+          .catch(() => {
+            fallbackCopyTextToClipboard(text);
+          });
+      } else {
+        fallbackCopyTextToClipboard(text);
+      }
+    },
+    [fallbackCopyTextToClipboard]
+  );
+
+  return {
+    step,
+    inputUrl,
+    setInputUrl,
+    topics,
+    selectedTopic,
+    setSelectedTopic,
+    wordCount,
+    setWordCount,
+    generatedScript,
+    error,
+    analysisId,
+    fetchedCount,
+    handleAnalyze,
+    handleGenerate,
+    handleRegenerate,
+    backToInput,
+    backToTopics,
+    resetAll,
+    copyToClipboard,
+  };
+};
--- a/frontend/src/features/publish/model/usePublishController.ts
+++ b/frontend/src/features/publish/model/usePublishController.ts
@@ -7,6 +7,7 @@ import { clampTitle } from "@/shared/lib/title";
 import { useTitleInput } from "@/shared/hooks/useTitleInput";
 import { useAuth } from "@/shared/contexts/AuthContext";
 import { useTask } from "@/shared/contexts/TaskContext";
+import { useCleanup } from "@/shared/contexts/CleanupContext";
 import { toast } from "sonner";
 import { usePublishPrefetch } from "@/shared/hooks/usePublishPrefetch";
 import {
@@ -40,6 +41,7 @@ export const usePublishController = () => {

  const { userId, isLoading: isAuthLoading } = useAuth();
  const { isGenerating } = useTask();
+  const { triggerCleanup } = useCleanup();
  const prevIsGenerating = useRef(isGenerating);
  const { readPrefetch, updatePrefetch } = usePublishPrefetch();

@@ -183,6 +185,23 @@ export const usePublishController = () => {
    window.scrollTo({ top: 0, left: 0, behavior: "auto" });
  }, []);

+  // ---- 工作区清理事件（清理后同步重置当前页输入态） ----
+  useEffect(() => {
+    if (typeof window === "undefined") return;
+
+    const handleWorkspaceCleared = (event: Event) => {
+      const detail = (event as CustomEvent<{ userId?: string }>).detail;
+      if (!detail?.userId || detail.userId !== userId) return;
+
+      setTitle("");
+      setTags("");
+      setPublishResults([]);
+    };
+
+    window.addEventListener("vigent:workspace-cleared", handleWorkspaceCleared);
+    return () => window.removeEventListener("vigent:workspace-cleared", handleWorkspaceCleared);
+  }, [userId]);
+
  // ---- 发布防误操作 ----
  useEffect(() => {
    if (!isPublishing) return;
@@ -231,6 +250,29 @@ export const usePublishController = () => {

  // ---- 操作函数 ----

+  const runWithConcurrency = async <T,>(
+    taskFactories: Array<() => Promise<T>>,
+    concurrency: number
+  ): Promise<T[]> => {
+    if (taskFactories.length === 0) return [];
+
+    const results: T[] = new Array(taskFactories.length);
+    let nextIndex = 0;
+
+    const worker = async () => {
+      while (true) {
+        const currentIndex = nextIndex;
+        nextIndex += 1;
+        if (currentIndex >= taskFactories.length) return;
+        results[currentIndex] = await taskFactories[currentIndex]();
+      }
+    };
+
+    const workerCount = Math.min(Math.max(concurrency, 1), taskFactories.length);
+    await Promise.all(Array.from({ length: workerCount }, () => worker()));
+    return results;
+  };
+
  const togglePlatform = (platform: string) => {
    if (selectedPlatforms.includes(platform)) {
      setSelectedPlatforms(selectedPlatforms.filter((p) => p !== platform));
@@ -252,7 +294,8 @@ export const usePublishController = () => {
    setIsPublishing(true);
    setPublishResults([]);
    const tagList = tags.split(/[,，\s]+/).filter((t) => t.trim());
-    for (const platform of selectedPlatforms) {
+
+    const publishOnePlatform = async (platform: string): Promise<PublishResult> => {
      try {
        const { data: res } = await api.post<ApiResponse<any>>("/api/publish", {
          video_path: video.path, platform, title, tags: tagList, description: "",
@@ -260,19 +303,31 @@ export const usePublishController = () => {
        const result = unwrap(res);
        const screenshotUrl = typeof result.screenshot_url === "string"
          ? resolveMediaUrl(result.screenshot_url) || result.screenshot_url : undefined;
-        setPublishResults((prev) => [...prev, {
+        return {
          platform: result.platform || platform,
          success: Boolean(result.success),
          message: result.message || "",
          url: result.url,
          screenshot_url: screenshotUrl,
-        }]);
+        };
      } catch (error: any) {
        const message = error.response?.data?.message || String(error);
-        setPublishResults((prev) => [...prev, { platform, success: false, message }]);
+        return { platform, success: false, message };
      }
+    };
+
+    try {
+      const taskFactories = selectedPlatforms.map((platform) => () => publishOnePlatform(platform));
+      const results = await runWithConcurrency(taskFactories, 2);
+      const allSuccess = results.length > 0 && results.every((r) => r.success);
+      if (allSuccess) {
+        triggerCleanup(results, video.id);
+      } else {
+        setPublishResults(results);
+      }
+    } finally {
+      setIsPublishing(false);
    }
-    setIsPublishing(false);
  };

  const handleLogin = async (platform: string) => {
--- a/frontend/src/features/publish/ui/PublishPage.tsx
+++ b/frontend/src/features/publish/ui/PublishPage.tsx
@@ -4,9 +4,13 @@ import Link from "next/link";
 import Image from "next/image";
 import VideoPreviewModal from "@/components/VideoPreviewModal";
 import AccountSettingsDropdown from "@/components/AccountSettingsDropdown";
+import { SelectPopover } from "@/shared/ui/SelectPopover";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";
 import { usePublishController } from "@/features/publish/model/usePublishController";
 import {
  ArrowLeft,
+  Check,
+  ChevronDown,
  RotateCcw,
  LogOut,
  QrCode,
@@ -18,6 +22,7 @@ import {
 export function PublishPage() {
  const {
    accounts,
+    videos,
    isAccountsLoading,
    isVideosLoading,
    selectedVideo,
@@ -47,6 +52,8 @@ export function PublishPage() {
    closeQrModal,
  } = usePublishController();

+  const selectedVideoItem = videos.find((v) => v.id === selectedVideo) || null;
+
  return (
    <div className="min-h-dvh">
      <VideoPreviewModal
@@ -56,51 +63,69 @@ export function PublishPage() {
      />
      {/* QR码弹窗 */}
      {qrPlatform && (
-        <div className="fixed inset-0 bg-black/80 flex items-center justify-center z-50">
-          <div className="bg-white rounded-2xl p-8 max-w-md min-w-[320px]">
-            <h2 className="text-2xl font-bold mb-4 text-center">🔐 扫码登录 {qrPlatform}</h2>
+        <AppModal
+          isOpen={Boolean(qrPlatform)}
+          onClose={closeQrModal}
+          panelClassName="w-full max-w-md rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden"
+          closeOnOverlay
+        >
+          <AppModalHeader
+            title={`🔐 扫码登录 ${qrPlatform}`}
+            subtitle="请使用手机扫码完成登录验证"
+            icon={<QrCode className="h-5 w-5 text-purple-300" />}
+            onClose={closeQrModal}
+          />
+
+          <div className="p-5 space-y-4">
            {isLoadingQR ? (
              <div className="flex flex-col items-center py-8">
-                <div className="animate-spin w-16 h-16 border-4 border-purple-500 border-t-transparent rounded-full" />
-                <p className="text-gray-600 mt-4">正在获取二维码...</p>
+                <Loader2 className="h-14 w-14 animate-spin text-purple-400" />
+                <p className="text-gray-300 mt-4">正在获取二维码...</p>
              </div>
            ) : faceVerifyQr ? (
-              <>
-                <Image
-                  src={`data:image/png;base64,${faceVerifyQr}`}
-                  alt="Face Verify QR"
-                  width={400}
-                  height={300}
-                  className="w-full h-auto rounded-lg"
-                  unoptimized
-                />
-                <p className="text-center text-orange-600 font-medium mt-4">
-                  需要身份验证，请用抖音APP扫描上方二维码完成刷脸验证
+              <div className="space-y-3">
+                <div className="mx-auto w-fit rounded-xl border border-white/10 bg-white p-2 shadow-[0_10px_30px_rgba(0,0,0,0.35)]">
+                  <Image
+                    src={`data:image/png;base64,${faceVerifyQr}`}
+                    alt="Face Verify QR"
+                    width={400}
+                    height={300}
+                    className="h-auto w-[min(82vw,400px)] border border-black/5"
+                    unoptimized
+                  />
+                </div>
+                <p className="text-center text-amber-300 text-sm font-medium">
+                  需要身份验证，请用抖音 APP 扫描上方二维码完成刷脸验证
                </p>
-              </>
+              </div>
            ) : qrCodeImage ? (
-              <>
-                <Image
-                  src={`data:image/png;base64,${qrCodeImage}`}
-                  alt="QR Code"
-                  width={280}
-                  height={280}
-                  className="w-full h-auto"
-                  unoptimized
-                />
-                <p className="text-center text-gray-600 mt-4">
-                  请使用手机扫码登录
-                </p>
-              </>
-            ) : null}
+              <div className="space-y-3">
+                <div className="mx-auto w-fit rounded-xl border border-white/10 bg-white p-3 shadow-[0_10px_30px_rgba(0,0,0,0.35)]">
+                  <Image
+                    src={`data:image/png;base64,${qrCodeImage}`}
+                    alt="QR Code"
+                    width={300}
+                    height={300}
+                    className="h-auto w-[min(74vw,300px)] border border-black/5"
+                    unoptimized
+                  />
+                </div>
+                <p className="text-center text-gray-300 text-sm">请使用手机扫码登录</p>
+              </div>
+            ) : (
+              <div className="rounded-xl border border-red-500/30 bg-red-500/10 px-4 py-3 text-sm text-red-200">
+                二维码获取失败，请重试
+              </div>
+            )}
+
            <button
              onClick={closeQrModal}
-              className="w-full mt-4 px-4 py-2 bg-gray-200 rounded-lg hover:bg-gray-300"
+              className="w-full px-4 py-2.5 bg-white/10 hover:bg-white/20 text-white rounded-lg transition-colors"
            >
              取消
            </button>
          </div>
-        </div>
+        </AppModal>
      )}

      {/* Header - 统一样式 */}
@@ -227,76 +252,112 @@ export function PublishPage() {
            {/* 选择视频 */}
            <div className="bg-white/5 rounded-2xl p-6 border border-white/10 backdrop-blur-sm">
              <h2 className="text-lg font-semibold text-white mb-4">八、选择发布作品</h2>
-
-              <div className="flex items-center gap-3 mb-4">
-                <Search className="text-gray-400 w-4 h-4" />
-                <input
-                  type="text"
-                  value={videoFilter}
-                  onChange={(e) => setVideoFilter(e.target.value)}
-                  placeholder="搜索视频名称..."
-                  className="flex-1 bg-black/30 border border-white/10 rounded-lg px-3 py-2 text-sm text-white placeholder-gray-500 focus:outline-none focus:border-purple-500"
-                />
-              </div>
-
-              {isVideosLoading ? (
-                <div className="space-y-2">
-                  {Array.from({ length: 2 }).map((_, index) => (
-                    <div
-                      key={`video-skeleton-${index}`}
-                      className="p-3 rounded-lg border border-white/10 bg-white/5 animate-pulse"
-                    >
-                      <div className="h-4 w-40 bg-white/10 rounded" />
-                      <div className="h-3 w-24 bg-white/5 rounded mt-2" />
-                    </div>
-                  ))}
-                </div>
-              ) : filteredVideos.length === 0 ? (
-                <div className="text-center py-8 text-gray-400">
-                  暂无可发布的视频
-                </div>
-              ) : (
-                <div className="space-y-2 max-h-64 overflow-y-auto hide-scrollbar" style={{ contentVisibility: "auto" }}>
-                  {filteredVideos.map((v) => (
-                    <div
-                      key={v.id}
-                      onClick={() => setSelectedVideo(v.id)}
-                      className={`p-3 rounded-lg border transition-all flex items-center justify-between group cursor-pointer ${selectedVideo === v.id
-                        ? "border-purple-500 bg-purple-500/20"
-                        : "border-white/10 bg-white/5 hover:border-white/30"
-                        }`}
-                    >
-                      <div className="flex flex-col">
-                        <span className="text-sm text-white">{v.name}</span>
-                      </div>
-                      <div className="flex items-center gap-2 pl-2">
-                        <button
-                          onClick={(e) => {
-                            e.stopPropagation();
-                            handlePreviewVideo(v.id);
-                          }}
-                          onMouseEnter={() => {
-                            const src = v.path.startsWith("/") ? v.path : `/${v.path}`;
-                            const prefetch = document.createElement("link");
-                            prefetch.rel = "preload";
-                            prefetch.as = "video";
-                            prefetch.href = src;
-                            document.head.appendChild(prefetch);
-                            setTimeout(() => prefetch.remove(), 2000);
-                          }}
-                          className="p-1 text-gray-500 hover:text-purple-400 transition-colors"
-                          title="预览"
-                        >
-                          <Eye className="h-4 w-4" />
-                        </button>
-                        {selectedVideo === v.id && (
-                          <span className="text-xs text-purple-300">已选</span>
-                        )}
+              <SelectPopover
+                sheetTitle="选择发布作品"
+                onOpen={() => setVideoFilter("")}
+                trigger={({ open, toggle }) => (
+                  <button
+                    type="button"
+                    onClick={toggle}
+                    className="w-full rounded-xl border border-white/10 bg-black/25 px-3 py-2.5 text-left transition-colors hover:border-white/30"
+                  >
+                    <span className="flex items-center justify-between gap-3">
+                      <span className="min-w-0">
+                        <span className="block text-xs text-gray-400">当前作品</span>
+                        <span className="mt-0.5 block truncate text-sm text-white">
+                          {selectedVideoItem?.name || (isVideosLoading ? "正在加载作品..." : "请选择发布作品")}
+                        </span>
+                      </span>
+                      <ChevronDown className={`h-4 w-4 text-gray-300 transition-transform ${open ? "rotate-180" : ""}`} />
+                    </span>
+                  </button>
+                )}
+              >
+                {({ close }) => (
+                  <div className="space-y-2">
+                    <div className="rounded-lg border border-white/10 bg-black/30 px-3 py-2">
+                      <div className="flex items-center gap-2">
+                        <Search className="h-4 w-4 text-gray-400" />
+                        <input
+                          type="text"
+                          value={videoFilter}
+                          onChange={(e) => setVideoFilter(e.target.value)}
+                          placeholder="搜索视频名称..."
+                          className="w-full bg-transparent text-sm text-white placeholder-gray-500 outline-none"
+                        />
                      </div>
                    </div>
-                  ))}
-                </div>
-              )}
+
+                    {isVideosLoading ? (
+                      <div className="space-y-2 p-1">
+                        {Array.from({ length: 2 }).map((_, index) => (
+                          <div
+                            key={`video-skeleton-${index}`}
+                            className="p-3 rounded-lg border border-white/10 bg-white/5 animate-pulse"
+                          >
+                            <div className="h-4 w-40 bg-white/10 rounded" />
+                          </div>
+                        ))}
+                      </div>
+                    ) : filteredVideos.length === 0 ? (
+                      <div className="py-8 text-center text-sm text-gray-400">
+                        暂无可发布的视频
+                      </div>
+                    ) : (
+                      <div className="space-y-1 pb-1" style={{ contentVisibility: "auto" }}>
+                        {filteredVideos.map((v) => {
+                          const isSelected = selectedVideo === v.id;
+
+                          return (
+                            <div
+                              key={v.id}
+                              data-popover-selected={isSelected ? "true" : undefined}
+                              className={`flex items-center gap-2 rounded-lg border px-3 py-2 transition-colors ${isSelected
+                                ? "border-purple-500 bg-purple-500/20"
+                                : "border-white/10 bg-white/5 hover:border-white/30"
+                                }`}
+                            >
+                              <button
+                                type="button"
+                                onClick={() => {
+                                  setSelectedVideo(v.id);
+                                  close();
+                                }}
+                                className="min-w-0 flex-1 text-left"
+                              >
+                                <span className="block truncate text-sm text-white">{v.name}</span>
+                              </button>
+
+                              <button
+                                type="button"
+                                onClick={(e) => {
+                                  e.stopPropagation();
+                                  handlePreviewVideo(v.id);
+                                }}
+                                onMouseEnter={() => {
+                                  const src = v.path.startsWith("/") ? v.path : `/${v.path}`;
+                                  const prefetch = document.createElement("link");
+                                  prefetch.rel = "preload";
+                                  prefetch.as = "video";
+                                  prefetch.href = src;
+                                  document.head.appendChild(prefetch);
+                                  setTimeout(() => prefetch.remove(), 2000);
+                                }}
+                                className="p-1 text-gray-400 hover:text-purple-300"
+                                title="预览"
+                              >
+                                <Eye className="h-4 w-4" />
+                              </button>
+
+                              {isSelected && <Check className="h-4 w-4 text-purple-300" />}
+                            </div>
+                          );
+                        })}
+                      </div>
+                    )}
+                  </div>
+                )}
+              </SelectPopover>
            </div>

            {/* 填写信息 */}
--- a/frontend/src/shared/contexts/CleanupContext.tsx
+++ b/frontend/src/shared/contexts/CleanupContext.tsx
@@ -0,0 +1,414 @@
+"use client";
+
+import {
+  createContext,
+  useContext,
+  useState,
+  useEffect,
+  useCallback,
+  type ReactNode,
+} from "react";
+import Image from "next/image";
+import { AppModal, AppModalHeader } from "@/shared/ui/AppModal";
+import { useAuth } from "@/shared/contexts/AuthContext";
+import api from "@/shared/api/axios";
+import type { ApiResponse } from "@/shared/api/types";
+import { Download, Trash2, Loader2, CheckCircle2 } from "lucide-react";
+import type { PublishResult } from "@/shared/types/publish";
+
+/* ────────── constants & types ────────── */
+
+const CLEANUP_EXPIRE_MS = 24 * 60 * 60 * 1000; // 24h
+const MAX_FAIL_BEFORE_SKIP = 3;
+
+interface CleanupState {
+  required: boolean;
+  publishResults: PublishResult[];
+  videoId?: string;
+  createdAt?: number; // timestamp for expiry check
+  failCount?: number;
+}
+
+interface CleanupContextType {
+  triggerCleanup: (results: PublishResult[], videoId?: string) => void;
+}
+
+const EMPTY_STATE: CleanupState = { required: false, publishResults: [] };
+
+const CleanupContext = createContext<CleanupContextType>({
+  triggerCleanup: () => {},
+});
+
+/* ────────── helpers ────────── */
+
+function storageKey(userId: string) {
+  return `vigent_${userId}_cleanup_pending`;
+}
+
+function normalizeVideoId(value: unknown): string | undefined {
+  if (typeof value !== "string") return undefined;
+  const raw = value.trim();
+  if (!raw) return undefined;
+
+  const decoded = (() => {
+    try {
+      return decodeURIComponent(raw);
+    } catch {
+      return raw;
+    }
+  })();
+
+  const routeMatch = decoded.match(/\/generated\/([^/?#]+)\/download/i);
+  if (routeMatch?.[1]) return routeMatch[1];
+
+  const outputMatch = decoded.match(/\/([^/?#]+_output)\.mp4(?:[?#]|$)/i);
+  if (outputMatch?.[1]) return outputMatch[1];
+
+  if (!decoded.includes("/") && !decoded.includes(".") && !decoded.includes("?")) {
+    return decoded;
+  }
+
+  return undefined;
+}
+
+function readPersistedState(userId: string): CleanupState {
+  try {
+    const raw = localStorage.getItem(storageKey(userId));
+    if (!raw) return EMPTY_STATE;
+    const parsed = JSON.parse(raw) as CleanupState;
+    const normalized: CleanupState = {
+      required: Boolean(parsed.required),
+      publishResults: Array.isArray(parsed.publishResults) ? parsed.publishResults : [],
+      videoId: normalizeVideoId(parsed.videoId)
+        || normalizeVideoId((parsed as unknown as Record<string, unknown>).videoDownloadUrl),
+      createdAt: typeof parsed.createdAt === "number" ? parsed.createdAt : Date.now(),
+      failCount: typeof parsed.failCount === "number" && parsed.failCount > 0 ? parsed.failCount : 0,
+    };
+
+    if (!normalized.required) return EMPTY_STATE;
+
+    // 24h expiry check
+    if (normalized.createdAt && Date.now() - normalized.createdAt > CLEANUP_EXPIRE_MS) {
+      localStorage.removeItem(storageKey(userId));
+      return EMPTY_STATE;
+    }
+    return normalized;
+  } catch {
+    return EMPTY_STATE;
+  }
+}
+
+function persistState(userId: string, state: CleanupState) {
+  localStorage.setItem(storageKey(userId), JSON.stringify(state));
+}
+
+function clearPersistedState(userId: string) {
+  localStorage.removeItem(storageKey(userId));
+}
+
+/* ────────── localStorage keys to clear ────────── */
+
+function clearWorkspaceLocalStorage(userId: string) {
+  const key = userId;
+  const keysToRemove = [
+    // home page content
+    `vigent_${key}_text`,
+    `vigent_${key}_title`,
+    `vigent_${key}_secondaryTitle`,
+    // publish page
+    `vigent_${key}_publish_title`,
+    `vigent_${key}_publish_tags`,
+  ];
+  keysToRemove.forEach((k) => localStorage.removeItem(k));
+}
+
+/* ────────── platform icons ────────── */
+
+const platformIcons: Record<string, { src: string; alt: string }> = {
+  douyin: { src: "/platforms/douyin.svg", alt: "抖音" },
+  weixin: { src: "/platforms/wechat.svg", alt: "微信视频号" },
+  bilibili: { src: "/platforms/bilibili.svg", alt: "B站" },
+  xiaohongshu: { src: "/platforms/xiaohongshu.svg", alt: "小红书" },
+};
+
+/* ────────── CleanupModal ────────── */
+
+function CleanupModal({
+  isOpen,
+  publishResults,
+  videoId,
+  cleanupError,
+  failCount,
+  onCleanup,
+  onSkip,
+}: {
+  isOpen: boolean;
+  publishResults: PublishResult[];
+  videoId?: string;
+  cleanupError?: string | null;
+  failCount: number;
+  onCleanup: () => Promise<void>;
+  onSkip: () => void;
+}) {
+  const [isCleaning, setIsCleaning] = useState(false);
+
+  const handleCleanup = async () => {
+    setIsCleaning(true);
+    try {
+      await onCleanup();
+    } catch {
+      // keep modal open for retry
+    } finally {
+      setIsCleaning(false);
+    }
+  };
+
+  const canSkip = failCount >= MAX_FAIL_BEFORE_SKIP;
+
+  return (
+    <AppModal
+      isOpen={isOpen}
+      onClose={() => {}}
+      closeOnOverlay={false}
+      zIndexClassName="z-[300]"
+      panelClassName="w-full max-w-lg rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden max-h-[90vh] flex flex-col"
+    >
+      <AppModalHeader
+        title="发布完成"
+        subtitle="所有平台发布成功"
+        icon={<CheckCircle2 className="h-5 w-5 text-green-400" />}
+      />
+
+      <div className="p-5 space-y-4 overflow-y-auto flex-1">
+        {/* Success results */}
+        <div className="space-y-2">
+          {publishResults.map((r, i) => (
+            <div
+              key={i}
+              className="flex items-center gap-2 p-3 rounded-xl border border-green-500/30 bg-green-500/10"
+            >
+              {platformIcons[r.platform] ? (
+                <Image
+                  src={platformIcons[r.platform].src}
+                  alt={platformIcons[r.platform].alt}
+                  width={20}
+                  height={20}
+                  className="h-5 w-5"
+                />
+              ) : (
+                <span className="text-lg">🌐</span>
+              )}
+              <span className="text-green-400 font-medium text-sm">
+                {platformIcons[r.platform]?.alt || r.platform} - 发布成功
+              </span>
+            </div>
+          ))}
+        </div>
+
+        {/* Download button */}
+        {videoId && (
+          <a
+            href={`/api/videos/generated/${encodeURIComponent(videoId)}/download`}
+            download
+            className="flex items-center justify-center gap-2 w-full py-3 rounded-xl border border-blue-500/30 bg-blue-500/10 text-blue-300 hover:bg-blue-500/20 transition-colors text-sm font-medium"
+          >
+            <Download className="h-4 w-4" />
+            下载视频备份（可选）
+          </a>
+        )}
+
+        {cleanupError && (
+          <div className="rounded-xl border border-red-500/30 bg-red-500/10 px-3 py-2 text-xs text-red-300">
+            清理失败：{cleanupError}
+          </div>
+        )}
+
+        {/* Cleanup button */}
+        <button
+          onClick={handleCleanup}
+          disabled={isCleaning}
+          className="flex items-center justify-center gap-2 w-full py-3 rounded-xl bg-gradient-to-r from-purple-600 to-pink-600 text-white font-semibold hover:from-purple-500 hover:to-pink-500 transition-all disabled:opacity-60"
+        >
+          {isCleaning ? (
+            <>
+              <Loader2 className="h-4 w-4 animate-spin" />
+              正在清理...
+            </>
+          ) : (
+            <>
+              <Trash2 className="h-4 w-4" />
+              清理工作区 &amp; 开始新作品
+            </>
+          )}
+        </button>
+
+        {canSkip && (
+          <button
+            onClick={onSkip}
+            disabled={isCleaning}
+            className="flex items-center justify-center w-full py-2.5 rounded-xl border border-white/10 bg-white/5 text-gray-400 hover:bg-white/10 hover:text-gray-300 transition-colors text-sm disabled:opacity-50 disabled:cursor-not-allowed"
+          >
+            暂不清理，继续使用
+          </button>
+        )}
+
+        <p className="text-xs text-gray-400 text-center leading-relaxed">
+          清理成功后弹窗关闭；若失败将保留弹窗以便重试。清理将删除所有生成的视频和配音，
+          <br />
+          并清空标题、文案和标签（已保存的历史文案不受影响）。
+        </p>
+
+        {/* Screenshots */}
+        {publishResults.some((r) => r.screenshot_url) && (
+          <div className="pt-2 border-t border-white/10">
+            <p className="text-xs text-gray-400 mb-3">发布成功截图</p>
+            <div className="grid grid-cols-1 sm:grid-cols-2 gap-3">
+              {publishResults
+                .filter((r) => r.screenshot_url)
+                .map((r, i) => (
+                  <div key={i} className="space-y-1">
+                    <p className="text-xs text-gray-500">
+                      {platformIcons[r.platform]?.alt || r.platform}
+                    </p>
+                    <a
+                      href={r.screenshot_url}
+                      target="_blank"
+                      rel="noreferrer"
+                      className="block rounded-lg border border-white/10 bg-black/20 overflow-hidden"
+                    >
+                      <Image
+                        src={r.screenshot_url!}
+                        alt={`${r.platform} 截图`}
+                        width={400}
+                        height={300}
+                        className="w-full"
+                        unoptimized
+                      />
+                    </a>
+                  </div>
+                ))}
+            </div>
+          </div>
+        )}
+      </div>
+    </AppModal>
+  );
+}
+
+/* ────────── Provider ────────── */
+
+export function CleanupProvider({ children }: { children: ReactNode }) {
+  const { userId, isLoading: isAuthLoading } = useAuth();
+  const [cleanupState, setCleanupState] = useState<CleanupState>(EMPTY_STATE);
+  const [cleanupError, setCleanupError] = useState<string | null>(null);
+
+  // Restore from localStorage on mount / reset on user switch
+  useEffect(() => {
+    if (isAuthLoading) return;
+
+    if (!userId) {
+      setCleanupState(EMPTY_STATE);
+      setCleanupError(null);
+      return;
+    }
+
+    const persisted = readPersistedState(userId);
+    if (persisted.required) {
+      persistState(userId, persisted);
+      setCleanupState(persisted);
+    } else {
+      setCleanupState(EMPTY_STATE);
+    }
+    setCleanupError(null);
+  }, [isAuthLoading, userId]);
+
+  const triggerCleanup = useCallback(
+    (results: PublishResult[], videoId?: string) => {
+      if (!userId) return;
+      setCleanupError(null);
+      const state: CleanupState = {
+        required: true,
+        publishResults: results,
+        videoId: normalizeVideoId(videoId), // accept a bare id or a download URL, same as the restore path
+        createdAt: Date.now(),
+        failCount: 0,
+      };
+      persistState(userId, state);
+      setCleanupState(state);
+    },
+    [userId]
+  );
+
+  const executeCleanup = useCallback(async () => {
+    if (!userId) return;
+    setCleanupError(null);
+
+    // 1. Call backend to delete files
+    try {
+      const { data: res } = await api.post<ApiResponse<{ videos_deleted: number; audios_deleted: number }>>(
+        "/api/videos/cleanup"
+      );
+
+      if (!res.success) {
+        throw new Error(res.message || "服务端清理失败");
+      }
+    } catch (e) {
+      console.error("Cleanup API failed:", e);
+      const err = e as { response?: { data?: { message?: string; detail?: string } }; message?: string };
+      const message = err.response?.data?.message || err.response?.data?.detail || err.message || "请稍后重试";
+      setCleanupError(message);
+      setCleanupState((prev) => {
+        if (!prev.required) return prev;
+        const next: CleanupState = {
+          ...prev,
+          failCount: (prev.failCount || 0) + 1,
+          createdAt: prev.createdAt || Date.now(),
+        };
+        persistState(userId, next);
+        return next;
+      });
+      throw e;
+    }
+
+    // 2. Clear workspace localStorage keys
+    clearWorkspaceLocalStorage(userId);
+
+    if (typeof window !== "undefined") {
+      window.dispatchEvent(
+        new CustomEvent("vigent:workspace-cleared", { detail: { userId } })
+      );
+    }
+
+    // 3. Clear cleanup pending state
+    clearPersistedState(userId);
+    setCleanupState(EMPTY_STATE);
+    setCleanupError(null);
+  }, [userId]);
+
+  // Skip: close modal and clear cleanup_pending immediately (user chose to skip)
+  const handleSkip = useCallback(() => {
+    if (!userId) return;
+    clearPersistedState(userId);
+    setCleanupState(EMPTY_STATE);
+    setCleanupError(null);
+  }, [userId]);
+
+  return (
+    <CleanupContext.Provider value={{ triggerCleanup }}>
+      {children}
+      <CleanupModal
+        isOpen={cleanupState.required}
+        publishResults={cleanupState.publishResults}
+        videoId={cleanupState.videoId}
+        cleanupError={cleanupError}
+        failCount={cleanupState.failCount || 0}
+        onCleanup={executeCleanup}
+        onSkip={handleSkip}
+      />
+    </CleanupContext.Provider>
+  );
+}
+
+export function useCleanup() {
+  return useContext(CleanupContext);
+}
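
The restore path in `readPersistedState` above can be exercised outside the browser. The following is a minimal sketch, not the project's code: `KeyValueStore`, `MemoryStore`, `PendingCleanup`, and `readPending` are hypothetical names introduced here, and the in-memory store stands in for `localStorage` so that the 24h expiry rule and the corrupt-JSON fallback can be checked in isolation.

```typescript
// Hypothetical stand-in for the browser's localStorage, so the sketch runs anywhere.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
  removeItem(key: string): void {
    this.data.delete(key);
  }
}

const EXPIRE_MS = 24 * 60 * 60 * 1000; // same 24h window as CLEANUP_EXPIRE_MS

// Reduced, hypothetical version of CleanupState: only the fields the expiry rule needs.
interface PendingCleanup {
  required: boolean;
  createdAt: number;
  failCount: number;
}

// Returns the pending record, or null when it is missing, malformed,
// not required, or older than the expiry window.
function readPending(store: KeyValueStore, key: string, now: number): PendingCleanup | null {
  const raw = store.getItem(key);
  if (!raw) return null;
  try {
    const parsed = JSON.parse(raw) as Partial<PendingCleanup>;
    if (!parsed.required) return null;
    const createdAt = typeof parsed.createdAt === "number" ? parsed.createdAt : now;
    if (now - createdAt > EXPIRE_MS) {
      store.removeItem(key); // expired: drop the record so the modal is not restored
      return null;
    }
    const failCount =
      typeof parsed.failCount === "number" && parsed.failCount > 0 ? parsed.failCount : 0;
    return { required: true, createdAt, failCount };
  } catch {
    return null; // corrupt JSON is treated as "nothing pending"
  }
}
```

Passing `now` as a parameter instead of calling `Date.now()` inside the function is a choice made for this sketch only; it keeps the expiry check deterministic under test.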
--- a/frontend/src/shared/types/timeline.ts
+++ b/frontend/src/shared/types/timeline.ts
@@ -0,0 +1,10 @@
+export interface InsertSegment {
+  id: string;
+  materialId: string;
+  materialName: string;
+  start: number;
+  end: number;
+  sourceStart: number;
+  sourceEnd: number;
+  color: string;
+}
--- a/frontend/src/shared/ui/AppModal.tsx
+++ b/frontend/src/shared/ui/AppModal.tsx
@@ -0,0 +1,139 @@
+"use client";
+
+import { useEffect, useRef, type ReactNode } from "react";
+import { createPortal } from "react-dom";
+import { X } from "lucide-react";
+
+interface AppModalProps {
+  isOpen: boolean;
+  onClose: () => void;
+  children: ReactNode;
+  zIndexClassName?: string;
+  panelClassName?: string;
+  closeOnOverlay?: boolean;
+  closeOnEsc?: boolean;
+  lockBodyScroll?: boolean;
+}
+
+export function AppModal({
+  isOpen,
+  onClose,
+  children,
+  zIndexClassName = "z-[220]",
+  panelClassName = "w-full max-w-2xl rounded-2xl border border-white/10 bg-[#171821]/95 shadow-[0_24px_80px_rgba(0,0,0,0.55)] overflow-hidden",
+  closeOnOverlay = true,
+  closeOnEsc = true,
+  lockBodyScroll = true,
+}: AppModalProps) {
+  const containerRef = useRef<HTMLDivElement | null>(null);
+  const onCloseRef = useRef(onClose);
+
+  useEffect(() => {
+    onCloseRef.current = onClose;
+  }, [onClose]);
+
+  useEffect(() => {
+    if (!isOpen) return;
+
+    const handleEsc = (event: KeyboardEvent) => {
+      if (closeOnEsc && event.key === "Escape") onCloseRef.current();
+    };
+
+    const previousActiveElement = document.activeElement as HTMLElement | null;
+
+    if (lockBodyScroll) {
+      const openCount = Number(document.body.dataset.appModalOpenCount ?? "0");
+      if (openCount === 0) {
+        document.body.dataset.appModalPrevOverflow = document.body.style.overflow;
+        document.body.style.overflow = "hidden";
+      }
+      document.body.dataset.appModalOpenCount = String(openCount + 1);
+    }
+
+    document.addEventListener("keydown", handleEsc);
+    requestAnimationFrame(() => containerRef.current?.focus());
+
+    return () => {
+      document.removeEventListener("keydown", handleEsc);
+
+      if (lockBodyScroll) {
+        const openCount = Number(document.body.dataset.appModalOpenCount ?? "0");
+        const nextCount = Math.max(0, openCount - 1);
+
+        if (nextCount === 0) {
+          document.body.style.overflow = document.body.dataset.appModalPrevOverflow ?? "";
+          delete document.body.dataset.appModalPrevOverflow;
+          delete document.body.dataset.appModalOpenCount;
+        } else {
+          document.body.dataset.appModalOpenCount = String(nextCount);
+        }
+      }
+
+      previousActiveElement?.focus?.();
+    };
+  }, [closeOnEsc, isOpen, lockBodyScroll]);
+
+  if (!isOpen || typeof document === "undefined") return null;
+
+  return createPortal(
+    <div
+      ref={containerRef}
+      role="dialog"
+      aria-modal="true"
+      tabIndex={-1}
+      className={`fixed inset-0 ${zIndexClassName} flex items-center justify-center bg-black/80 backdrop-blur-sm p-4 animate-in fade-in duration-200`}
+      onClick={closeOnOverlay ? onClose : undefined}
+    >
+      <div className={panelClassName} onClick={(event) => event.stopPropagation()}>
+        {children}
+      </div>
+    </div>,
+    document.body
+  );
+}
+
+interface AppModalHeaderProps {
+  title: ReactNode;
+  subtitle?: ReactNode;
+  icon?: ReactNode;
+  onClose?: () => void;
+  actions?: ReactNode;
+}
+
+export function AppModalHeader({
+  title,
+  subtitle,
+  icon,
+  onClose,
+  actions,
+}: AppModalHeaderProps) {
+  return (
+    <div className="flex items-center justify-between gap-3 border-b border-white/10 bg-gradient-to-r from-white/[0.08] via-white/[0.03] to-white/[0.08] px-4 py-3">
+      <div className="min-w-0 flex items-center gap-3">
+        {icon ? (
+          <div className="h-9 w-9 rounded-lg bg-white/10 text-white flex items-center justify-center">
+            {icon}
+          </div>
+        ) : null}
+        <div className="min-w-0">
+          <h3 className="truncate text-base font-semibold text-white">{title}</h3>
+          {subtitle ? <p className="mt-0.5 text-xs text-gray-400">{subtitle}</p> : null}
+        </div>
+      </div>
+
+      <div className="flex items-center gap-2">
+        {actions}
+        {onClose ? (
+          <button
+            type="button"
+            onClick={onClose}
+            aria-label="关闭弹窗"
+            className="p-2 text-gray-400 hover:text-white hover:bg-white/10 rounded-lg transition-colors"
+          >
+            <X className="h-5 w-5" />
+          </button>
+        ) : null}
+      </div>
+    </div>
+  );
+}
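
The body scroll lock in `AppModal` above is reference-counted through `document.body.dataset`, so stacked modals restore the original `overflow` only when the last one closes. As a rough illustration, the same bookkeeping can be written as two pure helpers over a minimal body-like object. `BodyLike`, `acquireScrollLock`, and `releaseScrollLock` are hypothetical names used only in this sketch; the component itself keeps this logic inline in its effect.

```typescript
// Hypothetical minimal shape of the parts of document.body that the lock touches.
interface BodyLike {
  dataset: Record<string, string | undefined>;
  style: { overflow: string };
}

// Called when a modal opens: the first opener saves the previous overflow and hides scrolling.
function acquireScrollLock(body: BodyLike): void {
  const openCount = Number(body.dataset.appModalOpenCount ?? "0");
  if (openCount === 0) {
    body.dataset.appModalPrevOverflow = body.style.overflow;
    body.style.overflow = "hidden";
  }
  body.dataset.appModalOpenCount = String(openCount + 1);
}

// Called when a modal closes: only the last closer restores the saved overflow.
function releaseScrollLock(body: BodyLike): void {
  const openCount = Number(body.dataset.appModalOpenCount ?? "0");
  const nextCount = Math.max(0, openCount - 1);
  if (nextCount === 0) {
    body.style.overflow = body.dataset.appModalPrevOverflow ?? "";
    delete body.dataset.appModalPrevOverflow;
    delete body.dataset.appModalOpenCount;
  } else {
    body.dataset.appModalOpenCount = String(nextCount);
  }
}
```

With two modals open, the first `releaseScrollLock` call only decrements the counter; the page stays locked until the second call.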
--- a/frontend/src/shared/ui/SelectPopover.tsx
+++ b/frontend/src/shared/ui/SelectPopover.tsx
@@ -0,0 +1,233 @@
+"use client";
+
+import { type ReactNode, useEffect, useRef, useState } from "react";
+import { createPortal } from "react-dom";
+
+interface SelectPopoverTriggerContext {
+  open: boolean;
+  isMobile: boolean;
+  toggle: () => void;
+  close: () => void;
+}
+
+interface SelectPopoverPanelContext {
+  isMobile: boolean;
+  close: () => void;
+}
+
+interface SelectPopoverProps {
+  trigger: (ctx: SelectPopoverTriggerContext) => ReactNode;
+  children: (ctx: SelectPopoverPanelContext) => ReactNode;
+  sheetTitle?: string;
+  disabled?: boolean;
+  panelClassName?: string;
+  onOpen?: () => void;
+}
+
+const MOBILE_QUERY = "(max-width: 639px)";
+
+export function SelectPopover({
+  trigger,
+  children,
+  sheetTitle,
+  disabled = false,
+  panelClassName = "",
+  onOpen,
+}: SelectPopoverProps) {
+  type DesktopRect = {
+    left: number;
+    top: number;
+    width: number;
+    maxHeight: number;
+    direction: "up" | "down";
+  };
+
+  const containerRef = useRef<HTMLDivElement | null>(null);
+  const panelRef = useRef<HTMLDivElement | null>(null);
+  const desktopScrollRef = useRef<HTMLDivElement | null>(null);
+  const mobileScrollRef = useRef<HTMLDivElement | null>(null);
+  const [open, setOpen] = useState(false);
+  const [isMobile, setIsMobile] = useState(false);
+  const [desktopRect, setDesktopRect] = useState<DesktopRect | null>(null);
+  const isOpen = open && !disabled;
+
+  const canUseDOM = typeof window !== "undefined" && typeof document !== "undefined";
+
+  useEffect(() => {
+    if (typeof window === "undefined") return;
+
+    const mq = window.matchMedia(MOBILE_QUERY);
+    const handleChange = () => setIsMobile(mq.matches);
+    handleChange();
+
+    if (mq.addEventListener) {
+      mq.addEventListener("change", handleChange);
+      return () => mq.removeEventListener("change", handleChange);
+    }
+
+    mq.addListener(handleChange);
+    return () => mq.removeListener(handleChange);
+  }, []);
+
+  useEffect(() => {
+    if (!isOpen || isMobile) return;
+
+    const handlePointerDown = (event: MouseEvent) => {
+      if (canUseDOM && document.querySelector("[data-video-preview-open='true']")) {
+        return;
+      }
+
+      const target = event.target as Node;
+      const clickedTrigger = containerRef.current?.contains(target) ?? false;
+      const clickedPanel = panelRef.current?.contains(target) ?? false;
+      if (!clickedTrigger && !clickedPanel) {
+        setOpen(false);
+      }
+    };
+
+    document.addEventListener("mousedown", handlePointerDown);
+    return () => document.removeEventListener("mousedown", handlePointerDown);
+  }, [isOpen, isMobile, canUseDOM]);
+
+  useEffect(() => {
+    if (!isOpen) return;
+
+    const handleKeyDown = (event: KeyboardEvent) => {
+      const previewOpen = canUseDOM && Boolean(document.querySelector("[data-video-preview-open='true']"));
+      if (event.key === "Escape" && !previewOpen) {
+        setOpen(false);
+      }
+    };
+
+    document.addEventListener("keydown", handleKeyDown);
+    return () => document.removeEventListener("keydown", handleKeyDown);
+  }, [isOpen, canUseDOM]);
+
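+  const onOpenRef = useRef(onOpen);
+  useEffect(() => { onOpenRef.current = onOpen; }, [onOpen]);
+  useEffect(() => {
+    if (isOpen) onOpenRef.current?.(); // fire once per open, even if onOpen is an unstable inline callback
+  }, [isOpen]);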
+
+  useEffect(() => {
+    if (!isOpen || !canUseDOM) return;
+
+    let raf1 = 0;
+    let raf2 = 0;
+    const scrollSelectedIntoView = () => {
+      const container = isMobile ? mobileScrollRef.current : desktopScrollRef.current;
+      if (!container) return;
+
+      const selectedEl = container.querySelector<HTMLElement>(
+        "[data-popover-selected='true'], [aria-selected='true']",
+      );
+      selectedEl?.scrollIntoView({ block: "nearest", behavior: "auto" });
+    };
+
+    raf1 = window.requestAnimationFrame(() => {
+      raf2 = window.requestAnimationFrame(scrollSelectedIntoView);
+    });
+
+    return () => {
+      if (raf1) window.cancelAnimationFrame(raf1);
+      if (raf2) window.cancelAnimationFrame(raf2);
+    };
+  }, [isOpen, isMobile, canUseDOM]);
+
+  useEffect(() => {
+    if (!isOpen || isMobile || !canUseDOM) return;
+
+    const updateDesktopRect = () => {
+      const triggerEl = containerRef.current;
+      if (!triggerEl) return;
+
+      const viewportPadding = 8;
+      const gap = 8;
+      const preferredMaxHeight = 352;
+      const rect = triggerEl.getBoundingClientRect();
+      const width = rect.width;
+      const maxLeft = Math.max(viewportPadding, window.innerWidth - width - viewportPadding);
+      const left = Math.min(Math.max(viewportPadding, rect.left), maxLeft);
+
+      const spaceBelow = window.innerHeight - rect.bottom - gap - viewportPadding;
+      const spaceAbove = rect.top - gap - viewportPadding;
+      const openUp = spaceBelow < 220 && spaceAbove > spaceBelow;
+      const direction: "up" | "down" = openUp ? "up" : "down";
+      const chosenSpace = openUp ? spaceAbove : spaceBelow;
+      const maxHeight = Math.max(120, Math.min(preferredMaxHeight, Math.floor(chosenSpace)));
+      const top = openUp
+        ? Math.max(viewportPadding, rect.top - gap)
+        : Math.min(rect.bottom + gap, window.innerHeight - viewportPadding);
+
+      setDesktopRect({ left, top, width, maxHeight, direction });
+    };
+
+    updateDesktopRect();
+    window.addEventListener("resize", updateDesktopRect);
+    window.addEventListener("scroll", updateDesktopRect, true);
+
+    return () => {
+      window.removeEventListener("resize", updateDesktopRect);
+      window.removeEventListener("scroll", updateDesktopRect, true);
+    };
+  }, [isOpen, isMobile, canUseDOM]);
+
+  const close = () => setOpen(false);
+  const toggle = () => {
+    if (disabled) return;
+    setOpen((prev) => !prev);
+  };
+
+  const desktopPanel = canUseDOM && isOpen && !isMobile && desktopRect
+    ? createPortal(
+      <div
+        ref={panelRef}
+        className={`fixed z-[260] overflow-hidden rounded-2xl border border-white/20 bg-[#130f20]/95 backdrop-blur-md shadow-[0_20px_48px_rgba(8,10,20,0.5)] ${panelClassName}`}
+        style={{
+          left: desktopRect.left,
+          top: desktopRect.top,
+          width: desktopRect.width,
+          transform: desktopRect.direction === "up" ? "translateY(-100%)" : undefined,
+        }}
+        role="dialog"
+        aria-modal="false"
+      >
+        <div ref={desktopScrollRef} className="hide-scrollbar overflow-y-auto p-2" style={{ maxHeight: desktopRect.maxHeight }}>
+          {children({ isMobile: false, close })}
+        </div>
+      </div>,
+      document.body,
+    )
+    : null;
+
+  const mobileSheet = canUseDOM && isOpen && isMobile
+    ? createPortal(
+      <div
+        className="fixed inset-0 z-[220] bg-black/60"
+        onMouseDown={close}
+        role="dialog"
+        aria-modal="true"
+      >
+        <div
+          className="fixed inset-x-0 bottom-0 max-h-[78dvh] overflow-hidden rounded-t-3xl border-t border-white/20 bg-[#130f20]/95"
+          onMouseDown={(e) => e.stopPropagation()}
+        >
+          <div className="mx-auto mt-2 h-1.5 w-12 rounded-full bg-white/20" />
+          {sheetTitle && (
+            <div className="px-5 pt-3 pb-2 text-sm font-medium text-gray-300">{sheetTitle}</div>
+          )}
+          <div ref={mobileScrollRef} className="hide-scrollbar max-h-[calc(78dvh-56px)] overflow-y-auto p-3">{children({ isMobile: true, close })}</div>
+        </div>
+      </div>,
+      document.body,
+    )
+    : null;
+
+  return (
+    <div className="relative" ref={containerRef}>
+      {trigger({ open: isOpen, isMobile, toggle, close })}
+      {desktopPanel}
+      {mobileSheet}
+    </div>
+  );
+}
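
The desktop placement logic in `updateDesktopRect` above is plain arithmetic on the trigger rectangle and the viewport size, so it can be pulled out and checked without a DOM. The sketch below mirrors the constants from the component (8px padding, 8px gap, 352px preferred height, 220px flip threshold). `TriggerRect`, `Viewport`, `Placement`, and `computePlacement` are hypothetical names introduced for illustration.

```typescript
// Hypothetical input/output shapes; the component reads these from
// getBoundingClientRect() and window.innerWidth / window.innerHeight.
interface TriggerRect {
  left: number;
  top: number;
  bottom: number;
  width: number;
}

interface Viewport {
  width: number;
  height: number;
}

interface Placement {
  left: number;
  top: number;
  width: number;
  maxHeight: number;
  direction: "up" | "down";
}

function computePlacement(rect: TriggerRect, viewport: Viewport): Placement {
  const viewportPadding = 8;
  const gap = 8;
  const preferredMaxHeight = 352;

  // Clamp horizontally so the panel never leaves the viewport.
  const maxLeft = Math.max(viewportPadding, viewport.width - rect.width - viewportPadding);
  const left = Math.min(Math.max(viewportPadding, rect.left), maxLeft);

  // Open upward only when the space below is tight and the space above is larger.
  const spaceBelow = viewport.height - rect.bottom - gap - viewportPadding;
  const spaceAbove = rect.top - gap - viewportPadding;
  const openUp = spaceBelow < 220 && spaceAbove > spaceBelow;
  const chosenSpace = openUp ? spaceAbove : spaceBelow;

  // Never shrink below 120px, never grow past the preferred height.
  const maxHeight = Math.max(120, Math.min(preferredMaxHeight, Math.floor(chosenSpace)));

  // When opening upward, `top` is the panel's bottom edge; the component
  // shifts the panel with translateY(-100%).
  const top = openUp
    ? Math.max(viewportPadding, rect.top - gap)
    : Math.min(rect.bottom + gap, viewport.height - viewportPadding);

  return { left, top, width: rect.width, maxHeight, direction: openUp ? "up" : "down" };
}
```

For a 1200x800 viewport, a trigger at `top: 100, bottom: 140` opens downward at `top: 148`, while a trigger at `top: 700, bottom: 740` has only 44px below it and flips upward with `top: 692`.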
Author	SHA1	Message	Date
Kevin Wong	0939d81e9f	更新	2026-03-10 10:59:38 +08:00
Kevin Wong	f879fb0001	更新	2026-03-09 10:18:14 +08:00
Kevin Wong	b289006844	更新	2026-03-05 17:23:22 +08:00
Kevin Wong	71b45852bf	更新	2026-03-04 17:35:59 +08:00
Kevin Wong	23ff4ff86e	更新	2026-03-04 14:07:54 +08:00
Kevin Wong	091f78174e	更新	2026-03-03 15:16:38 +08:00
Kevin Wong	190fc2e590	更新	2026-03-03 12:23:49 +08:00