This commit is contained in:
Kevin Wong
2026-02-10 13:31:29 +08:00
parent 3129d45b25
commit e33dfc3031
38 changed files with 2956 additions and 282 deletions

View File

@@ -33,9 +33,10 @@ backend/
│ │ ├── materials/        # Material management (router/schemas/service)
│ │ ├── publish/          # Multi-platform publishing
│ │ ├── auth/             # Authentication & sessions
│ │ ├── ai/               # AI features (title/tag generation, multilingual translation)
│ │ ├── assets/           # Static assets (fonts/styles/BGM)
│ │ ├── ref_audios/       # Voice-clone reference audio (router/schemas/service)
│ │ ├── generated_audios/ # Pre-generated voiceover management (router/schemas/service)
│ │ ├── login_helper/     # QR-code login helper
│ │ ├── tools/            # Tool endpoints (router/schemas/service)
│ │ └── admin/            # Admin features

View File

@@ -19,12 +19,13 @@ backend/
│ │ ├── materials/        # Material management (router/schemas/service)
│ │ ├── publish/          # Multi-platform publishing
│ │ ├── auth/             # Authentication & sessions
│ │ ├── ai/               # AI features (title/tag generation, multilingual translation)
│ │ ├── assets/           # Static assets (fonts/styles/BGM)
│ │ ├── ref_audios/       # Voice-clone reference audio (router/schemas/service)
│ │ ├── generated_audios/ # Pre-generated voiceover management (router/schemas/service)
│ │ ├── login_helper/     # QR-code login helper
│ │ ├── tools/            # Tool endpoints (router/schemas/service)
│ │ └── admin/            # Admin features
│ ├── repositories/       # Supabase data access
│ ├── services/           # External service integrations (TTS/Remotion/Storage/Uploader, etc.)
│ └── tests/              # Unit and integration tests
@@ -83,11 +84,19 @@ backend/
7. **AI features (AI)**
   * `POST /api/ai/generate-meta`: AI-generated titles and tags
   * `POST /api/ai/translate`: AI multilingual translation (9 target languages)
8. **Pre-generated voiceovers (Generated Audios)**
   * `POST /api/generated-audios/generate`: generate a voiceover asynchronously (returns a task_id)
   * `GET /api/generated-audios/tasks/{task_id}`: poll generation progress
   * `GET /api/generated-audios`: list all of the user's voiceovers
   * `DELETE /api/generated-audios/{audio_id}`: delete a voiceover
   * `PUT /api/generated-audios/{audio_id}`: rename a voiceover
9. **Tools**
   * `POST /api/tools/extract-script`: extract a script from a video link
10. **Health checks**
    * `GET /api/lipsync/health`: LatentSync service health
    * `GET /api/voiceclone/health`: Qwen3-TTS service health
@@ -113,6 +122,9 @@ backend/
- `tts_mode`: TTS mode (`edgetts` / `voiceclone`)
- `voice`: EdgeTTS voice ID (edgetts mode)
- `ref_audio_id` / `ref_text`: reference audio ID and transcript (voiceclone mode)
- `generated_audio_id`: pre-generated voiceover ID (when present, inline TTS is skipped and the stored audio file is used)
- `custom_assignments`: array of custom material assignments (each item has `material_path` / `start` / `end` / `source_start`); when present, the Whisper-based equal split is skipped
- `language`: TTS language (auto-detected by default; passed through to Qwen3-TTS for voice cloning)
- `title`: opening title text
- `subtitle_style_id`: subtitle style ID
- `title_style_id`: title style ID

View File

@@ -165,6 +165,8 @@ playwright install chromium
CREATE POLICY "Allow public read" ON storage.objects FOR SELECT TO anon USING (bucket_id = 'materials' OR bucket_id = 'outputs');
EOF
```
> **Note**: the backend automatically creates the additional storage buckets (`ref-audios`, `generated-audios`) at startup; no manual creation is needed.
---
@@ -570,6 +572,7 @@ pm2 logs vigent2-qwen-tts
| `next` | React framework |
| `swr` | Data fetching & caching |
| `tailwindcss` | CSS styling |
| `wavesurfer.js` | Audio waveform (timeline editor) |
### LatentSync key dependencies

Docs/DevLogs/Day23.md (new file, 546 lines)
View File

@@ -0,0 +1,546 @@
## 🎙️ Voiceover-First Refactor — Phase 1 (Day 23)
### Overview
Decouples voiceover generation from the video-generation pipeline, enabling a new workflow: generate a voiceover first → select it → then pick materials → generate the video. Users can manage voiceovers independently (generate/preview/rename/delete/select) and see duration information once a voiceover is selected, laying the data foundation for Phase 2's material timeline editing.
**Old flow**: script + select materials → one-click generation (inline TTS → Whisper → equal split → LipSync → compositing)
**New flow**: script → voiceover mode → **generate voiceover** → select voiceover → select materials → background music → generate video
---
### 1. Backend: new `generated_audios` module
#### Module structure
```
backend/app/modules/generated_audios/
├── __init__.py
├── router.py  # 5 API endpoints
├── schemas.py # request/response models
└── service.py # generate/list/delete/rename
```
#### API endpoints
| Method | Path | Description |
|------|------|------|
| POST | `/api/generated-audios/generate` | Generate a voiceover asynchronously (returns a task_id) |
| GET | `/api/generated-audios/tasks/{task_id}` | Poll generation progress |
| GET | `/api/generated-audios` | List all of the user's voiceovers |
| DELETE | `/api/generated-audios/{audio_id}` | Delete a voiceover |
| PUT | `/api/generated-audios/{audio_id}` | Rename a voiceover |
#### Storage scheme
- Supabase bucket: `generated-audios` (auto-created at startup)
- Audio file: `{user_id}/{timestamp}_audio.wav`
- Metadata file: `{user_id}/{timestamp}_audio.json` (contains display_name, text, tts_mode, duration_sec, etc.)
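The key layout and sidecar metadata above can be sketched as two small helpers. This is an illustration of the scheme described in the notes, not the actual `service.py` code; the function names are made up:

```python
import time
from typing import Optional


def audio_object_keys(user_id: str, ts: Optional[int] = None) -> tuple:
    # Bucket layout from the notes: {user_id}/{timestamp}_audio.wav plus a
    # sidecar .json next to it holding the metadata.
    ts = int(time.time()) if ts is None else ts
    base = f"{user_id}/{ts}_audio"
    return f"{base}.wav", f"{base}.json"


def build_metadata(display_name: str, text: str, tts_mode: str,
                   duration_sec: float) -> dict:
    # Fields listed in the notes; the real service may store more.
    return {
        "display_name": display_name,
        "text": text,
        "tts_mode": tts_mode,
        "duration_sec": duration_sec,
    }
```

Keeping the `.wav` and `.json` under the same `{user_id}/{timestamp}` prefix lets a single list call recover both the audio and its metadata per entry.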
#### Generation flow
Reuses the existing `TTSService` / `voice_clone_service` / `task_store`:
```
POST /generate → create task → BackgroundTask:
  1. edgetts    → TTSService.generate_audio()
     voiceclone → download ref_audio → voice_clone_service.generate_audio()
  2. ffprobe reads the duration
  3. upload .wav + .json to the generated-audios bucket
  4. update task (status=completed, output={audio_id, duration_sec, ...})
```
---
### 2. Backend: changes to the video-generation workflow
#### New field on `GenerateRequest`
```python
generated_audio_id: Optional[str] = None  # pre-generated voiceover ID; when present, inline TTS is skipped
```
#### New branch in the TTS stage of `workflow.py`
```python
if req.generated_audio_id:
    # download the pre-generated voiceover + read language from its metadata
    ...
elif req.tts_mode == "voiceclone":
    # existing voice-clone logic
    ...
else:
    # existing EdgeTTS logic
    ...
```
Backwards compatible: when `generated_audio_id` is not sent, the existing inline TTS flow is unaffected.
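The three-way dispatch boils down to a pure precedence rule, sketched here as a standalone function. This is an illustration, not the actual `workflow.py` code; `pick_tts_branch` and its return labels are made up:

```python
from typing import Optional


def pick_tts_branch(generated_audio_id: Optional[str], tts_mode: str) -> str:
    """Mirror the precedence of the TTS stage: a pre-generated voiceover
    always wins; otherwise fall back to the requested TTS mode."""
    if generated_audio_id:
        return "pregenerated"  # download stored wav, read language from metadata
    if tts_mode == "voiceclone":
        return "voiceclone"    # existing Qwen3-TTS clone path
    return "edgetts"           # default inline EdgeTTS path
```

Because the pre-generated branch is checked first, old clients that never send `generated_audio_id` fall through to the unchanged inline paths.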
---
### 3. Frontend: new voiceover-list hook + panel
#### `useGeneratedAudios.ts`
- State: `generatedAudios[]`, `selectedAudio`, `isGeneratingAudio`, `audioTask`
- Methods: `fetchGeneratedAudios()`, `generateAudio()`, `deleteAudio()`, `renameAudio()`, `selectAudio()`
- Polling: after triggering generation, polls the task status every 1 s; on completion, refreshes the list and auto-selects the newest voiceover
- Independent of the video-generation TaskContext; the two never interfere
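The polling loop against `GET /api/generated-audios/tasks/{task_id}` can be sketched with an injected fetch callable. `poll_task` is a hypothetical helper written in Python for illustration; the real hook is TypeScript inside a `useEffect`:

```python
import time
from typing import Callable


def poll_task(fetch: Callable[[str], dict], task_id: str,
              interval: float = 1.0, max_tries: int = 600,
              sleep: Callable[[float], None] = time.sleep) -> dict:
    # fetch() stands in for GET /api/generated-audios/tasks/{task_id};
    # injecting it (and sleep) keeps the loop testable offline.
    for _ in range(max_tries):
        task = fetch(task_id)
        if task.get("status") in ("completed", "failed", "not_found"):
            return task
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not reach a terminal state")
```

On a `completed` result, the hook then refreshes the list and selects the newest entry from the task's `output`.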
#### `GeneratedAudiosPanel.tsx`
- Each entry: play/pause, name, duration, rename, delete
- Selected state: `border-purple-500 bg-purple-500/20`
- Inline progress bar (shown while generating)
- Shows the selected voiceover's original script at the bottom (truncated)
- Playback logic is self-contained in the panel (`new Audio()` + play/pause toggle)
---
### 4. Frontend: UI panel reordering
**Old order**: MaterialSelector → ScriptEditor → TitleSubtitle → VoiceSelector → BgmPanel → GenerateActionBar
**New order**:
1. ScriptEditor (script editing)
2. TitleSubtitlePanel (title & subtitle styles)
3. VoiceSelector (voiceover mode)
4. **GeneratedAudiosPanel** (voiceover list) ← new
5. MaterialSelector (video materials) ← moved down; unlocked only after a voiceover is selected
6. BgmPanel (background music)
7. GenerateActionBar (generate video)
#### Material-section gating
When no voiceover is selected, the material section shows a semi-transparent overlay with a "please generate and select a voiceover first" hint. Upload/preview/rename/delete of materials stay available; only the selection checkboxes are masked.
#### Duration info
Once a voiceover is selected, the top of MaterialSelector shows:
```
Current voiceover: 45.2 s | 3 materials selected (equal split, ~15.1 s each)
```
#### Updated generate-button condition
```typescript
// old condition
disabled={isGenerating || selectedMaterials.length === 0 || (ttsMode === "voiceclone" && !selectedRefAudio)}
// new condition
disabled={isGenerating || selectedMaterials.length === 0 || !selectedAudio}
```
---
### 5. Persistence
`useHomePersistence` now reads/writes `selectedAudioId` in localStorage, so the selected voiceover is restored after a page refresh.
---
### Files touched
#### Backend (new)
| File | Description |
|------|------|
| `backend/app/modules/generated_audios/__init__.py` | Module marker |
| `backend/app/modules/generated_audios/router.py` | 5 API endpoints |
| `backend/app/modules/generated_audios/service.py` | Generate/list/delete/rename |
| `backend/app/modules/generated_audios/schemas.py` | Request/response models |
#### Backend (modified)
| File | Change |
|------|------|
| `backend/app/main.py` | Register the generated_audios router |
| `backend/app/services/storage.py` | New `BUCKET_GENERATED_AUDIOS`; bucket auto-created at startup |
| `backend/app/modules/videos/schemas.py` | `GenerateRequest` gains a `generated_audio_id` field |
| `backend/app/modules/videos/workflow.py` | New pre-generated-audio branch in the TTS stage |
#### Frontend (new)
| File | Description |
|------|------|
| `frontend/src/features/home/model/useGeneratedAudios.ts` | Voiceover-list hook |
| `frontend/src/features/home/ui/GeneratedAudiosPanel.tsx` | Voiceover-list panel |
#### Frontend (modified)
| File | Change |
|------|------|
| `frontend/src/features/home/ui/HomePage.tsx` | Panel reordering + material-section gating + insert GeneratedAudiosPanel |
| `frontend/src/features/home/ui/MaterialSelector.tsx` | New `selectedAudioDuration` prop + duration info display |
| `frontend/src/features/home/ui/GenerateActionBar.tsx` | Disabled condition changed to `!selectedAudio` |
| `frontend/src/features/home/model/useHomeController.ts` | Integrate useGeneratedAudios; new handleGenerateAudio; handleGenerate now uses generated_audio_id |
| `frontend/src/features/home/model/useHomePersistence.ts` | New selectedAudioId persistence |
---
## 🎞️ Material Timeline Editing — Phase 2 (Day 23)
### Overview
Building on Phase 1's voiceover-first workflow, this adds a **timeline editor**. Users can:
1. View each material block's duration allocation over the audio waveform
2. Drag the dividers to adjust each segment's duration (segments tile the timeline seamlessly; adjusting one automatically compresses/expands its neighbour)
3. Set a **source-video trim start** per segment (start from any position in the video rather than always from the beginning)
**Old behaviour**: multiple materials were split equally and automatically (`_split_equal`), with no control over per-segment duration or source start
**New behaviour**: the timeline editor visualises the allocation + drag-to-adjust + ClipTrimmer trim settings
---
### 1. Backend changes
#### 1.1 New `CustomAssignment` model
```python
# backend/app/modules/videos/schemas.py
class CustomAssignment(BaseModel):
    material_path: str
    start: float               # start on the audio timeline
    end: float                 # end on the audio timeline
    source_start: float = 0.0  # trim start in the source video
```
`GenerateRequest` gains `custom_assignments: Optional[List[CustomAssignment]] = None`. When present, the Whisper-based equal split is skipped and the user-defined allocation is used directly.
#### 1.2 `prepare_segment` supports `source_start`
```python
def prepare_segment(self, video_path, target_duration, output_path,
                    target_resolution=None, source_start: float = 0.0):
```
Key logic:
- When `source_start > 0`, use `-ss` for a fast seek and force a re-encode (stream copy is keyframe-aligned and therefore imprecise)
- When looping is needed and `source_start` is set, first trim out the clip from `source_start` to the end of the video, then loop the trimmed file (otherwise `stream_loop` would loop from 0 s of the original video)
- The temporary trimmed file is cleaned up in a `finally` block
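The two-step trim-then-loop can be sketched as ffmpeg command builders. The flags shown are illustrative of the approach described above, not the actual `video_service.py` code (which also handles resolution, audio, and temp-file cleanup):

```python
from typing import List


def trim_cmd(src: str, out: str, source_start: float) -> List[str]:
    # Step 1: cut from source_start to EOF. Re-encode (libx264) rather than
    # stream-copy so the cut is frame-accurate, not keyframe-aligned.
    return ["ffmpeg", "-y", "-ss", f"{source_start:.3f}", "-i", src,
            "-c:v", "libx264", "-an", out]


def loop_cmd(src: str, out: str, target_duration: float) -> List[str]:
    # Step 2: loop the already-trimmed clip and stop at the target duration,
    # so looping restarts at source_start instead of 0 s of the original.
    return ["ffmpeg", "-y", "-stream_loop", "-1", "-i", src,
            "-t", f"{target_duration:.3f}", "-c:v", "libx264", "-an", out]
```

Running the trim first is what makes the loop correct: `-stream_loop` can only repeat a whole input file, so the file it repeats must already begin at the trim point.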
#### 1.3 `workflow.py` supports `custom_assignments`
- **Multi-material mode**: when `custom_assignments` is present, the user allocation is used directly (Whisper still runs to generate subtitles), and each `prepare_segment` call receives its `source_start`
- **Single-material mode**: when `custom_assignments` has exactly 1 entry with `source_start > 0`, the clip is trimmed first and then passed to LatentSync
- **Backwards compatible**: when `custom_assignments` is `None`, the old path runs unchanged
---
### 2. New frontend components
#### 2.1 `useTimelineEditor.ts` — timeline-segment hook
```typescript
interface TimelineSegment {
  id: string;           // React key
  materialId: string;   // material ID
  materialName: string; // display name
  start: number;        // start (seconds) on the audio timeline
  end: number;          // end (seconds) on the audio timeline
  sourceStart: number;  // source-video trim start (default 0)
  sourceEnd: number;    // source-video trim end (0 = to the end)
  color: string;        // block colour
}
```
Core methods:
- `initSegments()`: when selectedMaterials changes, splits audioDuration equally by count
- `resizeSegment(id, newEnd)`: drag the right border; each segment is constrained to a minimum of 1 s
- `setSourceRange(id, sourceStart, sourceEnd)`: set the trim range
- `toCustomAssignments()`: convert to the backend `CustomAssignment[]` format
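The equal split and the seamless-resize constraint are plain arithmetic. Sketched here in Python for illustration (the actual hook is TypeScript, and these helpers simplify its state shape):

```python
from typing import List


def init_segments(material_ids: List[str], audio_duration: float) -> List[dict]:
    # Equal split of the audio timeline, one slot per material.
    step = audio_duration / len(material_ids)
    return [{"materialId": m, "start": i * step, "end": (i + 1) * step,
             "sourceStart": 0.0}
            for i, m in enumerate(material_ids)]


def resize_segment(segments: List[dict], idx: int, new_end: float,
                   min_len: float = 1.0) -> List[dict]:
    # Drag the right border of segment idx: clamp so both it and its right
    # neighbour keep at least min_len, then move the shared boundary so the
    # segments stay seamless (no gap, no overlap).
    segs = [dict(s) for s in segments]
    new_end = max(segs[idx]["start"] + min_len,
                  min(segs[idx + 1]["end"] - min_len, new_end))
    segs[idx]["end"] = new_end
    segs[idx + 1]["start"] = new_end
    return segs
```

Moving a single shared boundary is what guarantees the "adjust one segment, auto-compress/expand the neighbour" behaviour.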
#### 2.2 `TimelineEditor.tsx` — waveform + block timeline
- **wavesurfer.js** renders the audio waveform (display only, no playback)
- Colour blocks are laid out proportionally, showing material name + duration + trim marker
- Dividers between blocks are draggable (continuous pixel dragging via `onPointerDown/Move/Up`)
- Clicking a block opens the ClipTrimmer
#### 2.3 `ClipTrimmer.tsx` — material trim modal
- HTML5 `<video>` live preview; `video.currentTime` follows while dragging the sliders
- Dual-handle range slider (start/end) with an interlock constraint of ≥ 0.5 s
- Shows trim duration vs allocated duration (with a loop-padding/truncation hint)
- Source-video duration read from `loadedmetadata`
---
### 3. Frontend integration changes
#### 3.1 `useHomeController.ts`
- Integrates the `useTimelineEditor` hook
- New `clipTrimmerOpen` / `clipTrimmerSegmentId` state
- `handleGenerate` always sends `custom_assignments` for multiple materials; also sends it for a single material when `sourceStart > 0`
- Removes the no-longer-used `reorderMaterials` export
#### 3.2 `HomePage.tsx`
- Inserts TimelineEditor between MaterialSelector and BgmPanel (shown only when a voiceover is selected and materials are chosen)
- Adds the ClipTrimmer modal at the bottom
- Removes the `reorderMaterials` and `selectedAudioDuration` prop passing
#### 3.3 `MaterialSelector.tsx`
- Removes the voiceover-duration info bar (moved to TimelineEditor)
- Removes the drag-sort area (SortableChip + @dnd-kit code)
- Removes the `onReorderMaterials` / `selectedAudioDuration` props
---
### 4. Bugs fixed during review
| # | Severity | Problem | Fix |
|---|---------|------|------|
| 1 | **Medium** | `prepare_segment` with `source_start > 0` + stream copy seeks imprecisely | Added `source_start > 0` to the re-encode condition |
| 2 | **High** | `stream_loop + source_start` looped from 0 s of the video instead of from source_start | Two-step: trim the clip first, then loop the trimmed file |
| 3 | **Low** | `useHomeController` exported the deprecated `reorderMaterials` | Removed |
---
### Files touched
#### Backend (modified)
| File | Change |
|------|------|
| `backend/app/modules/videos/schemas.py` | New `CustomAssignment` model; `GenerateRequest` gains `custom_assignments` |
| `backend/app/services/video_service.py` | `prepare_segment` gains `source_start`; two-step trim + loop handling |
| `backend/app/modules/videos/workflow.py` | Multi/single-material pipelines support `custom_assignments` and pass `source_start` |
#### Frontend (new)
| File | Description |
|------|------|
| `frontend/src/features/home/model/useTimelineEditor.ts` | Timeline-segment hook |
| `frontend/src/features/home/ui/TimelineEditor.tsx` | Waveform + block timeline component |
| `frontend/src/features/home/ui/ClipTrimmer.tsx` | Material trim modal |
#### Frontend (modified)
| File | Change |
|------|------|
| `frontend/src/features/home/ui/HomePage.tsx` | Insert TimelineEditor + ClipTrimmer |
| `frontend/src/features/home/ui/MaterialSelector.tsx` | Remove duration info + drag-sort area + related props |
| `frontend/src/features/home/model/useHomeController.ts` | Integrate useTimelineEditor; handleGenerate sends custom_assignments |
| `frontend/package.json` | New `wavesurfer.js` dependency |
---
## 🎨 UI Polish + TTS Stability Fixes — Phase 3 (Day 23)
### Overview
Based on user feedback, fixes six UI issues, plus the SoX path problem and VRAM cache management in the Qwen3-TTS voice-clone service.
---
### 1. Qwen3-TTS stability fixes
#### 1.1 SoX PATH fix
**Problem**: when PM2 starts qwen-tts, the `sox` binary lives in the conda env's bin directory, which is not on the system PATH. Audio encode/decode fell back to a CPU-heavy path, and the logs showed a `SoX could not be found!` warning.
**Fix**: export the conda env bin onto PATH in `run_qwen_tts.sh`:
```bash
export PATH="/home/rongye/ProgramFiles/miniconda3/envs/qwen-tts/bin:$PATH"
```
#### 1.2 CUDA cache cleanup
**Fix**: `qwen_tts_server.py` now calls `torch.cuda.empty_cache()` after every generation (whether it succeeds or fails) to prevent VRAM fragmentation from accumulating, and runs inference in a thread pool via `asyncio.to_thread()` so the event loop is not blocked and health checks do not time out.
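The pattern above — blocking inference moved off the event loop, cache release in a `finally` — can be sketched with injected callables. This is an illustration of the pattern, not the actual `qwen_tts_server.py` code; `run_inference` and `release_cache` stand in for the model call and `torch.cuda.empty_cache()`:

```python
import asyncio
from typing import Callable


async def synthesize(run_inference: Callable[[], bytes],
                     release_cache: Callable[[], None]) -> bytes:
    # asyncio.to_thread keeps the event loop (and the /health endpoint)
    # responsive while the GPU works; the finally releases cached VRAM
    # blocks whether the generation succeeded or raised.
    try:
        return await asyncio.to_thread(run_inference)
    finally:
        release_cache()
```

Putting the cache release in `finally` is the important part: a failed generation is exactly the case most likely to leave fragmented allocations behind.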
---
### 2. Unified voiceover-list button layout (feedback #1 + #6)
**Problem**: `GeneratedAudiosPanel` put the preview button on the left, separate from Edit/Delete, inconsistent with `RefAudioPanel`'s layout. The script summary at the bottom was unnecessary.
**Fix**:
- Play/Edit/Delete buttons grouped on the right, shown on hover, ordered preview → rename → delete
- Removed the selected voiceover's script summary
- Layout now matches RefAudioPanel: name + duration on the left, action-button group on the right
---
### 3. Removed the voiceover gate on the material section (feedback #2)
**Problem**: MaterialSelector was covered by a `!selectedAudio` overlay, forcing users to select a voiceover before touching materials.
**Fix**: removed the disabled-overlay `<div>` around MaterialSelector in `HomePage.tsx`. Materials can be uploaded/previewed/managed at any time; only the TimelineEditor still requires a selected voiceover (via its existing condition `selectedAudio && selectedMaterials.length > 0`).
---
### 4. Timeline drag-to-reorder (feedback #3)
**Problem**: TimelineEditor had no way to reorder materials.
**Fix**:
- `useTimelineEditor` already had a `reorderSegments()` method (swaps two segments' material info while keeping the time ranges)
- Exposed `reorderSegments` via `useHomeController` and passed it to `TimelineEditor`
- Blocks support HTML5 Drag & Drop (`draggable` + `onDragStart/Over/Drop/End`)
- While dragging: the source block turns semi-transparent (`opacity-50`); the target block gets a highlight ring (`ring-2 ring-purple-400 scale-[1.02]`)
- Cursor styling: `cursor-grab` / `active:cursor-grabbing`
---
### 5. Dual-handle range slider for trim settings (feedback #4)
**Problem**: ClipTrimmer used two separate `<input type="range">` sliders; operating the start and end independently was unintuitive.
**Fix**: replaced with a custom dual-handle range slider:
- A single track with a purple round handle (start) + pink round handle (end)
- Track base colour `bg-white/10`; the selected range highlighted with the material's colour
- Dragging via Pointer Events: `onPointerDown` captures a handle → `onPointerMove` updates its position → `onPointerUp` releases
- Handle interlock: the start cannot exceed end - 0.5 s; the end cannot drop below start + 0.5 s
- Start (purple) and end (pink) time labels below the track
---
### 6. Video preview in trim settings (feedback #5)
**Problem**: ClipTrimmer's video was view-only; the trim range could not be played back.
**Fix**:
- Clicking the video toggles play/pause (Play/Pause icon overlay)
- Playback range: plays from sourceStart and stops automatically at sourceEnd
- After playback ends, returns to the start point
- While dragging a handle, `video.currentTime` follows in real time (seeks to the current position to show the frame)
- A playback progress indicator (white vertical line) overlaid on the range-slider track
- `preload="auto"` preloads the video so seeking stays fast while dragging
---
### Files touched
#### Backend (modified)
| File | Change |
|------|------|
| `run_qwen_tts.sh` | Export the conda env bin onto PATH (fixes the missing SoX) |
| `models/Qwen3-TTS/qwen_tts_server.py` | `torch.cuda.empty_cache()` after each generation; asyncio.to_thread to avoid blocking |
#### Frontend (modified)
| File | Change |
|------|------|
| `frontend/src/features/home/ui/GeneratedAudiosPanel.tsx` | Unified button layout (Play/Edit/Delete grouped on the right); removed the script summary |
| `frontend/src/features/home/ui/HomePage.tsx` | Removed the material-section voiceover overlay; pass onReorderSegment |
| `frontend/src/features/home/ui/TimelineEditor.tsx` | New HTML5 Drag & Drop sorting; new onReorderSegment prop |
| `frontend/src/features/home/ui/ClipTrimmer.tsx` | Dual-handle range slider + video playback preview + progress indicator |
| `frontend/src/features/home/model/useHomeController.ts` | Expose the reorderSegments method |
---
## 📝 Script History + Timeline Drag Fix — Phase 4 (Day 23)
### Overview
Adds manual save/load of scripts, fixes a bug where segment durations did not follow their materials after drag-reordering the timeline, and unifies button visuals.
---
### 1. Saving and loading script history
#### Feature
Users can manually save the current script to a history list and reload it at any time. Only manually saved scripts appear in the history; it is fully independent of auto-save (`useHomePersistence`).
#### UI layout
```
Button row:  [Script history ▼] [Script extraction helper] [AI multilingual ▼] [AI title/tag generation]
Bottom row:  128 chars                                     [Save script]
```
- **Script-history dropdown**: lists saved entries (name + date + delete button); clicking an entry loads the script; an empty list shows "no saved scripts yet"
- **Save-script button**: disabled when the script is empty; on click, `toast.success("文案已保存")`
- **Estimated duration removed**: the bottom row keeps only the character count + save button
#### Implementation
##### `useSavedScripts.ts` (new)
```typescript
interface SavedScript { id: string; name: string; content: string; savedAt: number }
```
- localStorage key: `vigent_{storageKey}_savedScripts`
- `saveScript(content)`: auto-names from the first 15 characters, inserts the new entry at the head of the list, **writes straight to localStorage**
- `deleteScript(id)`: removes the entry and writes straight to localStorage
- `useEffect([lsKey])`: re-reads from localStorage when lsKey changes (guest → userId)
- **No auto-persist effect**, so an empty array cannot overwrite existing data when the storageKey switches
##### Data flow
```
ScriptEditor (UI)
  ↑ savedScripts / onSaveScript / onLoadScript / onDeleteScript (pure props + callbacks)
useHomeController
  ├── useSavedScripts(storageKey) → { savedScripts, saveScript, deleteScript }
  └── handleSaveScript() → saveScript(text) + toast
HomePage
  └── passes props down to ScriptEditor
```
---
### 2. Timeline drag-reorder bug fix
#### Problem
After dragging to swap material order, segment durations did not follow their materials; they stayed in the original slots. For example, material 1 (3 s) + material 2 (8 s + 4 s looped) became material 2 (3 s) + material 1 (8 s + 4 s looped) after the drag — the duration allocation never moved.
#### Root cause
`reorderSegments` used **property swapping**: it copied `materialId`, `sourceStart`, `sourceEnd`, etc. between the two slots one by one, then called `recalcPositions` to recompute positions.
#### Fix
Switched to an **array move** (splice): the whole segment object is removed from its old position and inserted at the new one. The segment object carries all of its properties (materialId, sourceStart, sourceEnd, color, etc.) as a unit; `recalcPositions` then recomputes positions.
```typescript
// before the fix: property swap
const fromMat = { materialId: next[fromIdx].materialId, ... };
const toMat = { materialId: next[toIdx].materialId, ... };
next[fromIdx] = { ...next[fromIdx], ...toMat };
next[toIdx] = { ...next[toIdx], ...fromMat };
// after the fix: array move
const [moved] = next.splice(fromIdx, 1);
next.splice(toIdx, 0, moved);
```
A side benefit: with 3+ materials, dragging now behaves as "insert" rather than "swap", which better matches user intuition.
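The move semantics are easy to see on a plain list; a Python illustration of the splice-style move (not the hook's actual TypeScript):

```python
from typing import List


def move_segment(segments: List, from_idx: int, to_idx: int) -> List:
    # Array move: the whole segment object travels, carrying materialId,
    # sourceStart, etc. with it; positions are recalculated afterwards.
    next_segs = list(segments)
    moved = next_segs.pop(from_idx)
    next_segs.insert(to_idx, moved)
    return next_segs
```

Unlike a pairwise swap, a move shifts every element between the two indices by one slot, which is the "insert" behaviour described above.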
---
### 3. Button visual unification
#### Problem
The four buttons (script history, script extraction helper, AI multilingual, AI title/tag generation) had inconsistent heights; the AI buttons' text was wrapped in a nested `<span>`, causing internal layout differences.
#### Fix
- All 4 buttons unified to `h-7 px-2.5 text-xs rounded inline-flex items-center gap-1` (fixed 28 px height)
- Removed the redundant `<span>` nesting inside the AI buttons, replaced with a `<>...</>` fragment
---
### Files touched
#### Frontend (new)
| File | Description |
|------|------|
| `frontend/src/features/home/model/useSavedScripts.ts` | Script-history hook (localStorage persistence) |
#### Frontend (modified)
| File | Change |
|------|------|
| `frontend/src/features/home/ui/ScriptEditor.tsx` | History dropdown + save button + removed estimated duration + unified button heights |
| `frontend/src/features/home/model/useHomeController.ts` | Integrate useSavedScripts; new handleSaveScript |
| `frontend/src/features/home/ui/HomePage.tsx` | Pass savedScripts / handleSaveScript / deleteSavedScript to ScriptEditor |
| `frontend/src/features/home/model/useTimelineEditor.ts` | reorderSegments: property swap → array move (splice) |

View File

@@ -19,9 +19,12 @@ frontend/src/
│ │ │ ├── useHomePersistence.ts # persistence management
│ │ │ ├── useBgm.ts
│ │ │ ├── useGeneratedVideos.ts
│ │ │ ├── useGeneratedAudios.ts
│ │ │ ├── useMaterials.ts
│ │ │ ├── useMediaPlayers.ts
│ │ │ ├── useRefAudios.ts
│ │ │ ├── useSavedScripts.ts
│ │ │ ├── useTimelineEditor.ts
│ │ │ └── useTitleSubtitleStyles.ts
│ │ └── ui/ # UI components (pure props + callbacks)
│ │ ├── HomePage.tsx
@@ -35,6 +38,9 @@ frontend/src/
│ │ ├── FloatingStylePreview.tsx
│ │ ├── VoiceSelector.tsx
│ │ ├── RefAudioPanel.tsx
│ │ ├── GeneratedAudiosPanel.tsx
│ │ ├── TimelineEditor.tsx
│ │ ├── ClipTrimmer.tsx
│ │ ├── BgmPanel.tsx
│ │ ├── GenerateActionBar.tsx
│ │ ├── PreviewPanel.tsx
@@ -301,6 +307,15 @@ import { formatDate } from '@/shared/lib/media';
- Title font size / subtitle font size
- Background-music selection / volume / on-off state
- Material selection / previous-work selection
- Selected voiceover ID (`selectedAudioId`)
- Timeline segment info (`useTimelineEditor`'s localStorage)
### Script history (independent persistence)
The `useSavedScripts` hook manages script-history localStorage persistence on its own:
- key: `vigent_{storageKey}_savedScripts`
- writes to localStorage only on manual save/delete (no auto-persist effect)
- fully independent of `useHomePersistence`; the two never affect each other
### Implementation conventions
- Use `storageKey = userId || 'guest'` for per-user isolation.

View File

@@ -17,7 +17,9 @@ ViGent2 的前端界面,采用 Next.js 16 + TailwindCSS 构建。
- **Work preview**: play and download directly after generation (work preview + history).
- **Preview optimisation**: preview videos prefetch `metadata` so the first frame loads faster.
- **Local save**: script/title/preferences persisted by `useHomePersistence` and restored after refresh (Day 14/17).
- **Script history**: manual save/load/delete of scripts, with independent localStorage persistence (Day 23).
- **Selection persistence**: home/publish work selection persists by stable `id`, surviving refresh; the newest video is auto-selected after generation (Day 21).
- **AI multilingual translation**: translate the script into 9 target languages + restore the original (Day 22).
### 2. Fully automated publishing (`/publish`) [added Day 7]
- **Multi-platform management**: unified account status for Douyin, WeChat Channels, Bilibili, and Xiaohongshu.
@@ -35,8 +37,17 @@ ViGent2 的前端界面,采用 Next.js 16 + TailwindCSS 构建。
- **TTS mode selection**: toggle between EdgeTTS (preset voices) and voice cloning (custom voices).
- **Reference-audio management**: upload/list/delete reference audio (3-20 s WAV).
- **One-click cloning**: selecting a reference audio automatically calls the Qwen3-TTS service.
- **Multilingual support**: EdgeTTS voice lists for 10 languages; voice-clone language pass-through (Day 22).
### 4. Voiceover-first + timeline editing [added Day 23]
- **Standalone voiceover generation**: generate a voiceover → select it → pick materials → generate the video.
- **Voiceover management panel**: generate/preview/rename/delete/select, with async generation + progress polling.
- **Timeline editor**: wavesurfer.js audio waveform + colour blocks visualise material allocation; drag dividers to adjust segment durations.
- **Material trim settings**: ClipTrimmer dual-handle range slider + HTML5 video preview playback.
- **Drag-to-reorder**: timeline blocks support HTML5 Drag & Drop to reorder materials.
- **Custom allocation**: backend `custom_assignments` supports user-defined material allocation.
### 5. Subtitles & titles [added Day 13]
- **Opening title**: optional input, 15-character limit; shown for 3 s at the start of the video with fade in/out.
- **Title sync**: editing the opening title on the home page syncs to the publish-info title.
- **Per-character highlighted subtitles**: karaoke effect, on by default, can be disabled.
@@ -45,16 +56,16 @@ ViGent2 的前端界面,采用 Next.js 16 + TailwindCSS 构建。
- **Default styles**: title 90 px ZCOOL KuaiLe; subtitles 60 px classic yellow + DingTalkJinBuTi (Day 17).
- **Style persistence**: title/subtitle styles and font sizes survive refresh (Day 17).
### 6. Background music [added Day 16]
- **Preview**: clicking preview selects the track; the volume slider applies in real time.
- **Mix control**: affects only the BGM; voiceover volume stays unchanged.
### 7. Account settings [added Day 15]
- **Phone-number login**: login verified against an 11-digit Chinese phone number.
- **Account dropdown**: shows validity period + change password + sign out.
- **Change password**: modal asks for the current and new passwords; forces re-login afterwards.
### 8. Script extraction helper (`ScriptExtractionModal`) [added Day 15]
- **Multi-source extraction**: file drag-and-drop upload and URL pasting (Bilibili/Douyin/TikTok).
- **AI rewriting**: integrates GLM-4.7-Flash to rewrite extracted text into spoken-style script.
- **One-click fill**: extracted results fill straight into the video-generation input.
@@ -66,6 +77,7 @@ ViGent2 的前端界面,采用 Next.js 16 + TailwindCSS 构建。
- **Styling**: TailwindCSS
- **Icons**: Lucide React
- **Components**: custom modern components (Glassmorphism style)
- **Audio waveform**: wavesurfer.js (timeline editor)
- **API**: Axios instance `@/shared/api/axios` (talks to the FastAPI backend on :8006)
## 🚀 Development guide

View File

@@ -298,12 +298,20 @@ Response: audio/wav 文件
SoX could not be found! SoX could not be found!
```
**Fix**:
1. Install sox via conda:
```bash
conda install -y -c conda-forge sox
```
2. Make sure the launch script `run_qwen_tts.sh` exports the conda env bin onto PATH (when started by PM2, the system PATH does not include the conda env directory):
```bash
export PATH="/home/rongye/ProgramFiles/miniconda3/envs/qwen-tts/bin:$PATH"
```
### CUDA out of memory
Qwen3-TTS 1.7B typically needs 8-10 GB of VRAM. If you hit OOM:
@@ -371,6 +379,7 @@ FOR INSERT TO anon WITH CHECK (bucket_id = 'ref-audios');
| Date | Version | Notes |
|------|------|------|
| 2026-02-09 | 1.2.0 | Fixed the SoX PATH issue (run_qwen_tts.sh exports the conda bin); empty_cache() after every generation |
| 2026-01-30 | 1.1.0 | Default model upgraded to 1.7B-Base; old 0.6B paths replaced |
---

View File

@@ -15,9 +15,13 @@
Old pipeline:
text → EdgeTTS → audio → LatentSync → FFmpeg compositing → final video
New pipeline (single material):
text → EdgeTTS/Qwen3-TTS/pre-generated voiceover → audio ─┬→ LatentSync → lip-synced video ─┐
                                                          └→ faster-whisper → subtitle JSON ─┴→ Remotion compositing → final video
New pipeline (multiple materials):
audio → materials concatenated per custom_assignments → LatentSync (single inference) → lip-synced video ─┐
audio → faster-whisper → subtitle JSON ───────────────────────────────────────────────────────────────────┴→ Remotion compositing → final video
```
## System requirements
@@ -140,7 +144,7 @@ remotion/
| Stage | Progress | Notes |
|------|------|------|
| Download materials | 0% → 5% | Download input videos from Supabase |
| TTS speech generation | 5% → 25% | EdgeTTS / Qwen3-TTS / pre-generated voiceover download |
| Lip sync | 25% → 80% | LatentSync inference |
| Subtitle alignment | 80% → 85% | faster-whisper word-level timestamps |
| Remotion render | 85% → 95% | Composite subtitles and titles |
@@ -282,4 +286,5 @@ WhisperService(device="cuda:0") # 或 "cuda:1"
| Date | Version | Notes |
|------|------|------|
| 2026-02-10 | 1.1.0 | Updated architecture diagram: multi-material concat-then-infer, pre-generated voiceover option |
| 2026-01-30 | 1.0.1 | Subtitle highlight styling and title animation polish for clearer visuals |
| 2026-01-29 | 1.0.0 | Initial version: word-level highlighted subtitles and opening titles via faster-whisper + Remotion |

View File

@@ -1,8 +1,8 @@
# ViGent2 Development Task Log
**Project**: ViGent2 digital-human talking-head video generation system
**Progress**: 100% (Day 23 - voiceover-first refactor + material timeline editing + UI polish)
**Updated**: 2026-02-10
---
@@ -10,7 +10,46 @@
> Each day's core development work and milestones are recorded here.
### Day 23: Voiceover-First Refactor + Material Timeline Editing + UI Polish + Script History (Current)
#### Phase 1: voiceover first
- [x] **Standalone voiceover generation**: new `generated_audios` backend module (router/schemas/service), 5 API endpoints, reusing the existing TTSService / voice_clone_service / task_store.
- [x] **Voiceover management panel**: new `useGeneratedAudios` hook + `GeneratedAudiosPanel` component with generate/preview/rename/delete/select.
- [x] **UI panel reordering**: script → title/subtitles → voiceover mode → voiceover list → material selection → BGM → generate video.
- [x] **Material-section gating**: overlay while no voiceover is selected; once selected, shows voiceover duration + equal-split info.
- [x] **Video-generation integration**: new pre-generated-audio branch in workflow.py (`generated_audio_id`) that skips inline TTS; backwards compatible.
- [x] **Persistence**: selectedAudioId added to useHomePersistence; selection restored on refresh.
#### Phase 2: material timeline editing
- [x] **Timeline editor**: new `TimelineEditor` component (wavesurfer.js waveform + colour blocks visualising material allocation); drag dividers to adjust segment durations.
- [x] **Material trim settings**: new `ClipTrimmer` modal (HTML5 video preview + dual sliders for the source trim start/end).
- [x] **Backend custom allocation**: new `CustomAssignment` model; `prepare_segment` supports `source_start`; workflow multi/single-material pipelines support `custom_assignments`.
- [x] **Loop-trim fix**: `stream_loop + source_start` changed to two steps (trim first, then loop) so looping starts at the trim point rather than 0 s.
- [x] **MaterialSelector slimmed down**: removed the old duration bar and drag-sort area (functionality moved to TimelineEditor).
#### Phase 3: UI polish + TTS stability
- [x] **TTS SoX PATH fix**: `run_qwen_tts.sh` exports the conda env bin onto PATH, fixing the `SoX could not be found!` warning.
- [x] **TTS VRAM management**: `torch.cuda.empty_cache()` after every generation; asyncio.to_thread keeps the event loop unblocked.
- [x] **Unified voiceover-list buttons**: Play/Edit/Delete grouped on the right with hover display, matching RefAudioPanel; script summary removed.
- [x] **Material section ungated**: removed MaterialSelector's selectedAudio overlay; materials can be uploaded/managed at any time.
- [x] **Timeline drag-to-reorder**: TimelineEditor blocks support HTML5 Drag & Drop reordering.
- [x] **Trim range slider**: ClipTrimmer switched to a single-track dual-handle slider (purple start + pink end), replacing two separate sliders.
- [x] **Trim video preview**: play/pause in the video area, auto-stop from sourceStart to sourceEnd, real-time seek while dragging.
#### Phase 4: script history + bug fixes
- [x] **Script history save/load**: new `useSavedScripts` hook; manual save/load/delete with independent localStorage persistence.
- [x] **Timeline drag fix**: `reorderSegments` changed from property swap to array move (splice), fixing durations not following their materials.
- [x] **Button visual unification**: the 4 script-editor buttons fixed to height `h-7`; redundant `<span>` nesting removed.
- [x] **Bottom-bar adjustment**: "Save script" button moved to the bottom right; estimated-duration display removed.
### Day 22: Multi-material Optimisation + AI Translation + TTS Multilingual
- [x] **Multi-material bug fixes**: 6 high-priority bugs (boundary overflow, single-segment fallback, division by zero, duration validation, Whisper fallback, empty-list check).
- [x] **Architecture refactor**: multi-material changed from "LatentSync per segment" to "concat then infer", reducing inference runs from N to 1.
- [x] **Frontend polish**: payload safety, progress messages, auto-select after upload, unified Material interface, drag fixes, 4-material cap.
- [x] **AI multilingual translation**: new `/api/ai/translate` endpoint; 9-language translation + restore-original in the frontend.
- [x] **TTS multilingual**: EdgeTTS voice lists for 10 languages, auto voice switch after translation, voice-clone language pass-through, textLang persistence.
### Day 21: Defect Fixes + Floating Preview + Publish Refactor + Architecture Optimisation + Multi-material Generation
- [x] **Remotion crash tolerance**: when the render process exits with SIGABRT, check the output file to avoid a false failure that drops titles/subtitles.
- [x] **Home work-selection persistence**: fixed `fetchGeneratedVideos` unconditionally overwriting the restored value; new `preferVideoId` parameter controls selection.
- [x] **Publish work-selection persistence**: root cause was unstable signed URLs; switched wholesale to `video.id` instead of `path` for selection/persistence/comparison.
@@ -129,6 +168,7 @@
## 🛤️ Roadmap
### 🔴 High-priority to-dos
- [x] ~~**Voiceover-first refactor — phase 2**: material clip trimming + voiceover timeline editing~~ ✅ Completed on Day 23
- [ ] **Batch-generation architecture**: Excel import for batch video production.
- [ ] **Backend scheduled tasks**: move frontend-triggered scheduled publishing to backend APScheduler.
- [ ] **Publish-task recovery**: turn publishing into tasks + persist state + frontend resume, fixing state loss on refresh.
@@ -146,7 +186,7 @@
| **Core API** | 100% | ✅ Stable |
| **Web UI** | 100% | ✅ Stable (mobile-friendly) |
| **Lip sync** | 100% | ✅ LatentSync 1.6 |
| **TTS voiceover** | 100% | ✅ EdgeTTS + Qwen3 + voiceover-first + timeline editing |
| **Auto publishing** | 100% | ✅ Douyin/WeChat Channels/Bilibili/Xiaohongshu |
| **User auth** | 100% | ✅ Phone number + JWT |
| **Deployment & ops** | 100% | ✅ PM2 + Watchdog |

View File

@@ -17,13 +17,14 @@
### Core capabilities
- 🎬 **HD lip sync** - powered by LatentSync 1.6, a 512×512 high-resolution latent-diffusion model.
- 🎙️ **Multi-modal voiceover** - supports **EdgeTTS** (Microsoft neural voices, 10 languages) and **Qwen3-TTS** (3-second voice cloning). Voiceover-first workflow: generate the voiceover → pick materials → generate the video.
- 📝 **Smart subtitles** - faster-whisper + Remotion automatically generate per-character highlighted (karaoke-style) subtitles.
- 🎨 **Style presets** - title/subtitle style selection + preview + font-size control, with a custom font library.
- 🖼️ **Consistent work previews** - title/subtitle previews scale to the material resolution, closer to the final render.
- 🎞️ **Multi-material, multi-camera** - multi-select materials + timeline editor (wavesurfer.js waveform visualisation); drag dividers to adjust durations, drag to reorder camera angles, trim source-video clips.
- 💾 **Preference persistence** - home-page state saved/restored as a whole, surviving refresh. Scripts can be manually saved and reloaded.
- 🎵 **Background music** - preview + volume control + mixing, keeping voiceover volume stable.
- 🤖 **AI-assisted creation** - built-in GLM-4.7-Flash: script extraction from Bilibili/Douyin links, AI rewriting, title/tag generation, 9-language translation.
### Platform features
- 📱 **Fully automated publishing** - immediate publishing to Douyin/WeChat Channels/Bilibili/Xiaohongshu; QR-code login + cookie persistence.
@@ -40,7 +41,7 @@
| Area | Core tech | Notes |
|------|----------|------|
| **Frontend** | Next.js 16 | TypeScript, TailwindCSS, SWR, wavesurfer.js |
| **Backend** | FastAPI | Python 3.10, AsyncIO, PM2 |
| **Database** | Supabase | PostgreSQL, Storage (local/S3), Auth |
| **Lip sync** | LatentSync 1.6 | PyTorch 2.5, Diffusers, DeepCache |

View File

@@ -15,6 +15,7 @@ from app.modules.ref_audios.router import router as ref_audios_router
from app.modules.ai.router import router as ai_router
from app.modules.tools.router import router as tools_router
from app.modules.assets.router import router as assets_router
from app.modules.generated_audios.router import router as generated_audios_router
from loguru import logger
import os
@@ -124,6 +125,7 @@ app.include_router(ref_audios_router, prefix="/api/ref-audios", tags=["RefAudios
app.include_router(ai_router)  # /api/ai
app.include_router(tools_router, prefix="/api/tools", tags=["Tools"])
app.include_router(assets_router, prefix="/api/assets", tags=["Assets"])
app.include_router(generated_audios_router, prefix="/api/generated-audios", tags=["GeneratedAudios"])
@app.on_event("startup") @app.on_event("startup")

View File

@@ -0,0 +1,77 @@
"""生成配音 API"""
from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException
import uuid
from loguru import logger
from app.core.deps import get_current_user
from app.core.response import success_response
from app.modules.videos.task_store import create_task, get_task
from app.modules.generated_audios.schemas import GenerateAudioRequest, RenameAudioRequest
from app.modules.generated_audios import service
router = APIRouter()
@router.post("/generate")
async def generate_audio(
req: GenerateAudioRequest,
background_tasks: BackgroundTasks,
user: dict = Depends(get_current_user),
):
"""异步生成配音(返回 task_id"""
task_id = str(uuid.uuid4())
create_task(task_id, user["id"])
background_tasks.add_task(service.generate_audio_task, task_id, req, user["id"])
return success_response({"task_id": task_id})
@router.get("/tasks/{task_id}")
async def get_audio_task(task_id: str, user: dict = Depends(get_current_user)):
"""轮询配音生成进度"""
task = get_task(task_id)
if task.get("status") != "not_found" and task.get("user_id") != user["id"]:
return success_response({"status": "not_found"})
return success_response(task)
@router.get("")
async def list_audios(user: dict = Depends(get_current_user)):
"""列出当前用户所有已生成配音"""
try:
result = await service.list_generated_audios(user["id"])
return success_response(result)
except Exception as e:
logger.error(f"列出配音失败: {e}")
raise HTTPException(status_code=500, detail=f"获取列表失败: {str(e)}")
@router.delete("/{audio_id:path}")
async def delete_audio(audio_id: str, user: dict = Depends(get_current_user)):
"""删除配音"""
try:
await service.delete_generated_audio(audio_id, user["id"])
return success_response(message="删除成功")
except PermissionError as e:
raise HTTPException(status_code=403, detail=str(e))
except Exception as e:
logger.error(f"删除配音失败: {e}")
raise HTTPException(status_code=500, detail=f"删除失败: {str(e)}")
@router.put("/{audio_id:path}")
async def rename_audio(
audio_id: str,
request: RenameAudioRequest,
user: dict = Depends(get_current_user),
):
"""重命名配音"""
try:
result = await service.rename_generated_audio(audio_id, request.new_name, user["id"])
return success_response(result, message="重命名成功")
except PermissionError as e:
raise HTTPException(status_code=403, detail=str(e))
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
except Exception as e:
logger.error(f"重命名配音失败: {e}")
raise HTTPException(status_code=500, detail=f"重命名失败: {str(e)}")
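上面的路由采用"提交任务返回 task_id → 轮询进度"的异步模式。下面是一个最小示意(假设性的内存实现,并非本项目 task_store 的真实代码),演示任务从创建、更新到查询的状态流转:

```python
# 假设性示意:用内存字典模拟 create_task / get_task 的任务生命周期
import uuid

_tasks: dict = {}

def create_task(task_id: str, user_id: str) -> None:
    _tasks[task_id] = {"status": "pending", "progress": 0, "user_id": user_id}

def update_task(task_id: str, **fields) -> None:
    _tasks[task_id].update(fields)

def get_task(task_id: str) -> dict:
    # 未知任务返回 not_found与路由层的兜底行为一致
    return _tasks.get(task_id, {"status": "not_found"})

task_id = str(uuid.uuid4())
create_task(task_id, "user-1")
update_task(task_id, status="processing", progress=40)
update_task(task_id, status="completed", progress=100)

print(get_task(task_id)["status"])    # completed
print(get_task("missing")["status"])  # not_found
```

前端据此每秒轮询 `/tasks/{task_id}`,直到拿到终态。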

View File

@@ -0,0 +1,30 @@
from pydantic import BaseModel
from typing import Optional, List
class GenerateAudioRequest(BaseModel):
text: str
tts_mode: str = "edgetts"
voice: str = "zh-CN-YunxiNeural"
ref_audio_id: Optional[str] = None
ref_text: Optional[str] = None
language: str = "zh-CN"
class RenameAudioRequest(BaseModel):
new_name: str
class GeneratedAudioItem(BaseModel):
id: str
name: str
path: str
duration_sec: float
text: str
tts_mode: str
language: str
created_at: int
class GeneratedAudioListResponse(BaseModel):
items: List[GeneratedAudioItem]
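一个对应 GenerateAudioRequest 的请求体示例(字段取值仅为演示,路径为假设值):

```json
{
  "text": "大家好,欢迎收看本期视频。",
  "tts_mode": "voiceclone",
  "ref_audio_id": "user-1/1700000000_ref.wav",
  "ref_text": "参考音频对应的文字内容",
  "language": "zh-CN"
}
```

`tts_mode` 为 `edgetts` 时改用 `voice` 字段,`ref_audio_id` / `ref_text` 可省略。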

View File

@@ -0,0 +1,263 @@
"""生成配音 - 业务逻辑"""
import re
import json
import time
import asyncio
import subprocess
import tempfile
import os
from pathlib import Path
from typing import Optional
import httpx
from loguru import logger
from app.services.storage import storage_service
from app.services.tts_service import TTSService
from app.services.voice_clone_service import voice_clone_service
from app.modules.videos.task_store import task_store
from app.modules.generated_audios.schemas import (
GenerateAudioRequest,
GeneratedAudioItem,
GeneratedAudioListResponse,
)
BUCKET = "generated-audios"
def _locale_to_qwen_lang(locale: str) -> str:
mapping = {"zh": "Chinese", "en": "English"}
return mapping.get(locale.split("-")[0], "Auto")
def _get_audio_duration(file_path: str) -> float:
try:
result = subprocess.run(
['ffprobe', '-v', 'quiet', '-show_entries', 'format=duration',
'-of', 'csv=p=0', file_path],
capture_output=True, text=True, timeout=10
)
return float(result.stdout.strip())
except Exception as e:
logger.warning(f"获取音频时长失败: {e}")
return 0.0
async def generate_audio_task(task_id: str, req: GenerateAudioRequest, user_id: str):
"""后台任务:生成配音"""
try:
task_store.update(task_id, {"status": "processing", "progress": 10, "message": "正在生成配音..."})
with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp:
audio_path = tmp.name
try:
if req.tts_mode == "voiceclone":
if not req.ref_audio_id or not req.ref_text:
raise ValueError("声音克隆模式需要提供参考音频和参考文字")
task_store.update(task_id, {"progress": 20, "message": "正在下载参考音频..."})
with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp_ref:
ref_local = tmp_ref.name
try:
ref_url = await storage_service.get_signed_url(
bucket="ref-audios", path=req.ref_audio_id
)
timeout = httpx.Timeout(None)
async with httpx.AsyncClient(timeout=timeout) as client:
async with client.stream("GET", ref_url) as resp:
resp.raise_for_status()
with open(ref_local, "wb") as f:
async for chunk in resp.aiter_bytes():
f.write(chunk)
task_store.update(task_id, {"progress": 40, "message": "正在克隆声音 (Qwen3-TTS)..."})
await voice_clone_service.generate_audio(
text=req.text,
ref_audio_path=ref_local,
ref_text=req.ref_text,
output_path=audio_path,
language=_locale_to_qwen_lang(req.language),
)
finally:
if os.path.exists(ref_local):
os.unlink(ref_local)
else:
task_store.update(task_id, {"progress": 30, "message": "正在生成语音 (EdgeTTS)..."})
tts = TTSService()
await tts.generate_audio(req.text, req.voice, audio_path)
task_store.update(task_id, {"progress": 70, "message": "正在上传配音..."})
duration = _get_audio_duration(audio_path)
timestamp = int(time.time())
audio_id = f"{user_id}/{timestamp}_audio.wav"
meta_id = f"{user_id}/{timestamp}_audio.json"
# 生成 display_name
now = time.strftime("%Y%m%d_%H%M", time.localtime(timestamp))
display_name = f"配音_{now}"
with open(audio_path, "rb") as f:
wav_data = f.read()
await storage_service.upload_file(
bucket=BUCKET, path=audio_id,
file_data=wav_data, content_type="audio/wav",
)
metadata = {
"display_name": display_name,
"text": req.text,
"tts_mode": req.tts_mode,
"voice": req.voice if req.tts_mode == "edgetts" else None,
"ref_audio_id": req.ref_audio_id,
"language": req.language,
"duration_sec": duration,
"created_at": timestamp,
}
await storage_service.upload_file(
bucket=BUCKET, path=meta_id,
file_data=json.dumps(metadata, ensure_ascii=False).encode("utf-8"),
content_type="application/json",
)
signed_url = await storage_service.get_signed_url(BUCKET, audio_id)
task_store.update(task_id, {
"status": "completed",
"progress": 100,
"message": f"配音生成完成 ({duration:.1f}s)",
"output": {
"audio_id": audio_id,
"name": display_name,
"path": signed_url,
"duration_sec": duration,
"text": req.text,
"tts_mode": req.tts_mode,
"language": req.language,
"created_at": timestamp,
},
})
finally:
if os.path.exists(audio_path):
os.unlink(audio_path)
except Exception as e:
import traceback
task_store.update(task_id, {
"status": "failed",
"message": f"配音生成失败: {str(e)}",
"error": traceback.format_exc(),
})
logger.error(f"Generate audio failed: {e}")
async def list_generated_audios(user_id: str) -> dict:
"""列出用户的所有已生成配音"""
files = await storage_service.list_files(BUCKET, user_id)
wav_files = [f for f in files if f.get("name", "").endswith("_audio.wav")]
if not wav_files:
return GeneratedAudioListResponse(items=[]).model_dump()
async def fetch_info(f):
name = f.get("name", "")
storage_path = f"{user_id}/{name}"
meta_name = name.replace("_audio.wav", "_audio.json")
meta_path = f"{user_id}/{meta_name}"
display_name = name
text = ""
tts_mode = "edgetts"
language = "zh-CN"
duration_sec = 0.0
created_at = 0
try:
meta_url = await storage_service.get_signed_url(BUCKET, meta_path)
async with httpx.AsyncClient(timeout=5.0) as client:
resp = await client.get(meta_url)
if resp.status_code == 200:
meta = resp.json()
display_name = meta.get("display_name", name)
text = meta.get("text", "")
tts_mode = meta.get("tts_mode", "edgetts")
language = meta.get("language", "zh-CN")
duration_sec = meta.get("duration_sec", 0.0)
created_at = meta.get("created_at", 0)
except Exception as e:
logger.debug(f"读取配音 metadata 失败: {e}")
try:
created_at = int(name.split("_")[0])
except (ValueError, IndexError):
pass
signed_url = await storage_service.get_signed_url(BUCKET, storage_path)
return GeneratedAudioItem(
id=storage_path,
name=display_name,
path=signed_url,
duration_sec=duration_sec,
text=text,
tts_mode=tts_mode,
language=language,
created_at=created_at,
)
items = await asyncio.gather(*[fetch_info(f) for f in wav_files])
items = sorted(items, key=lambda x: x.created_at, reverse=True)
return GeneratedAudioListResponse(items=items).model_dump()
async def delete_generated_audio(audio_id: str, user_id: str) -> None:
if not audio_id.startswith(f"{user_id}/"):
raise PermissionError("无权删除此文件")
await storage_service.delete_file(BUCKET, audio_id)
meta_path = audio_id.replace("_audio.wav", "_audio.json")
try:
await storage_service.delete_file(BUCKET, meta_path)
except Exception:
pass
async def rename_generated_audio(audio_id: str, new_name: str, user_id: str) -> dict:
if not audio_id.startswith(f"{user_id}/"):
raise PermissionError("无权修改此文件")
new_name = new_name.strip()
if not new_name:
raise ValueError("新名称不能为空")
meta_path = audio_id.replace("_audio.wav", "_audio.json")
try:
meta_url = await storage_service.get_signed_url(BUCKET, meta_path)
async with httpx.AsyncClient() as client:
resp = await client.get(meta_url)
if resp.status_code == 200:
metadata = resp.json()
else:
raise Exception(f"Failed to fetch metadata: {resp.status_code}")
except Exception as e:
logger.warning(f"无法读取配音元数据: {e}, 将创建新的")
metadata = {
"display_name": new_name,
"text": "",
"tts_mode": "edgetts",
"language": "zh-CN",
"duration_sec": 0.0,
"created_at": int(time.time()),
}
metadata["display_name"] = new_name
await storage_service.upload_file(
bucket=BUCKET,
path=meta_path,
file_data=json.dumps(metadata, ensure_ascii=False).encode("utf-8"),
content_type="application/json",
)
return {"name": new_name}
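服务层用 `<user_id>/<timestamp>_audio.wav` 加同名 `.json` sidecar 存储配音与元数据,删除/重命名前通过路径前缀做归属校验。下面的纯函数示意(从上述代码假设性提炼)复现这两条约定:

```python
def meta_path_for(audio_id: str) -> str:
    """由音频路径推导元数据 sidecar 路径"""
    return audio_id.replace("_audio.wav", "_audio.json")

def assert_owned(audio_id: str, user_id: str) -> None:
    """路径前缀即归属:非本人文件直接拒绝"""
    if not audio_id.startswith(f"{user_id}/"):
        raise PermissionError("无权操作此文件")

audio_id = "user-1/1700000000_audio.wav"
print(meta_path_for(audio_id))    # user-1/1700000000_audio.json
assert_owned(audio_id, "user-1")  # 本人文件,校验通过
```

这种"对象存储 + sidecar 元数据"的做法省去了数据库表,代价是列表接口需要逐个拉取 JSON本实现用 asyncio.gather 并发缓解)。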

View File

@@ -2,6 +2,13 @@ from pydantic import BaseModel
from typing import Optional, List from typing import Optional, List
class CustomAssignment(BaseModel):
material_path: str
start: float # 音频时间轴起点
end: float # 音频时间轴终点
source_start: float = 0.0 # 源视频截取起点
class GenerateRequest(BaseModel): class GenerateRequest(BaseModel):
text: str text: str
voice: str = "zh-CN-YunxiNeural" voice: str = "zh-CN-YunxiNeural"
@@ -11,6 +18,7 @@ class GenerateRequest(BaseModel):
ref_audio_id: Optional[str] = None ref_audio_id: Optional[str] = None
ref_text: Optional[str] = None ref_text: Optional[str] = None
language: str = "zh-CN" language: str = "zh-CN"
generated_audio_id: Optional[str] = None # 预生成配音 ID存在时跳过内联 TTS
title: Optional[str] = None title: Optional[str] = None
enable_subtitles: bool = True enable_subtitles: bool = True
subtitle_style_id: Optional[str] = None subtitle_style_id: Optional[str] = None
@@ -21,3 +29,4 @@ class GenerateRequest(BaseModel):
subtitle_bottom_margin: Optional[int] = None subtitle_bottom_margin: Optional[int] = None
bgm_id: Optional[str] = None bgm_id: Optional[str] = None
bgm_volume: Optional[float] = 0.2 bgm_volume: Optional[float] = 0.2
custom_assignments: Optional[List[CustomAssignment]] = None
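CustomAssignment 用 start/end 描述该素材在音频时间轴上占据的区间source_start 是它在源视频内的截取起点。一个假设性的校验函数(非源码的一部分),检查分配是否从 0 开始且首尾相接:

```python
def validate_assignments(assignments: list, audio_dur: float, eps: float = 0.05) -> bool:
    """检查自定义分配是否连续覆盖整段音频(允许少量浮点误差)"""
    if not assignments:
        return False
    segs = sorted(assignments, key=lambda a: a["start"])
    if abs(segs[0]["start"]) > eps or abs(segs[-1]["end"] - audio_dur) > eps:
        return False
    # 相邻区间必须首尾相接
    return all(abs(b["start"] - a["end"]) <= eps for a, b in zip(segs, segs[1:]))

segs = [
    {"material_path": "a.mp4", "start": 0.0, "end": 4.0, "source_start": 0.0},
    {"material_path": "b.mp4", "start": 4.0, "end": 9.5, "source_start": 2.0},
]
print(validate_assignments(segs, 9.5))  # True
```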

View File

@@ -197,7 +197,33 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:
audio_path = temp_dir / f"{task_id}_audio.wav" audio_path = temp_dir / f"{task_id}_audio.wav"
temp_files.append(audio_path) temp_files.append(audio_path)
if req.tts_mode == "voiceclone": if req.generated_audio_id:
# 新流程:使用预生成的配音
_update_task(task_id, message="正在下载配音...", progress=12)
audio_url = await storage_service.get_signed_url(
bucket="generated-audios",
path=req.generated_audio_id,
)
await _download_material(audio_url, audio_path)
# 从元数据获取 language
meta_path = req.generated_audio_id.replace("_audio.wav", "_audio.json")
try:
meta_url = await storage_service.get_signed_url(
bucket="generated-audios", path=meta_path,
)
import httpx as _httpx
async with _httpx.AsyncClient(timeout=5.0) as client:
resp = await client.get(meta_url)
if resp.status_code == 200:
meta = resp.json()
req.language = meta.get("language", req.language)
if not req.text.strip():
req.text = meta.get("text", req.text)
except Exception as e:
logger.warning(f"读取配音元数据失败: {e}")
elif req.tts_mode == "voiceclone":
if not req.ref_audio_id or not req.ref_text: if not req.ref_audio_id or not req.ref_text:
raise ValueError("声音克隆模式需要提供参考音频和参考文字") raise ValueError("声音克隆模式需要提供参考音频和参考文字")
@@ -239,40 +265,74 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:
# ══════════════════════════════════════ # ══════════════════════════════════════
# 多素材流水线 # 多素材流水线
# ══════════════════════════════════════ # ══════════════════════════════════════
_update_task(task_id, progress=12, message="正在生成字幕 (Whisper)...") _update_task(task_id, progress=12, message="正在分配素材...")
captions_path = temp_dir / f"{task_id}_captions.json" if req.custom_assignments:
temp_files.append(captions_path) # 用户自定义分配,跳过 Whisper 均分
try:
captions_data = await whisper_service.align(
audio_path=str(audio_path),
text=req.text,
output_path=str(captions_path),
language=_locale_to_whisper_lang(req.language),
)
print(f"[Pipeline] Whisper alignment completed (multi-material)")
except Exception as e:
logger.warning(f"Whisper alignment failed: {e}")
captions_data = None
captions_path = None
_update_task(task_id, progress=15, message="正在分配素材...")
if captions_data and captions_data.get("segments"):
assignments = _split_equal(captions_data["segments"], material_paths)
else:
# Whisper 失败 → 按时长均分(不依赖字符对齐)
logger.warning("[MultiMat] Whisper 无数据,按时长均分")
audio_dur = video._get_duration(str(audio_path))
if audio_dur <= 0:
audio_dur = 30.0 # 安全兜底
seg_dur = audio_dur / len(material_paths)
assignments = [ assignments = [
{"material_path": material_paths[i], "start": i * seg_dur, {
"end": (i + 1) * seg_dur, "index": i} "material_path": a.material_path,
for i in range(len(material_paths)) "start": a.start,
"end": a.end,
"source_start": a.source_start,
"index": i,
}
for i, a in enumerate(req.custom_assignments)
] ]
# 仍然需要 Whisper 生成字幕(如果启用)
captions_path = temp_dir / f"{task_id}_captions.json"
temp_files.append(captions_path)
if req.enable_subtitles:
_update_task(task_id, message="正在生成字幕 (Whisper)...")
try:
await whisper_service.align(
audio_path=str(audio_path),
text=req.text,
output_path=str(captions_path),
language=_locale_to_whisper_lang(req.language),
)
print(f"[Pipeline] Whisper alignment completed (custom assignments)")
except Exception as e:
logger.warning(f"Whisper alignment failed: {e}")
captions_path = None
else:
captions_path = None
else:
# 原有逻辑Whisper → _split_equal
_update_task(task_id, message="正在生成字幕 (Whisper)...")
captions_path = temp_dir / f"{task_id}_captions.json"
temp_files.append(captions_path)
try:
captions_data = await whisper_service.align(
audio_path=str(audio_path),
text=req.text,
output_path=str(captions_path),
language=_locale_to_whisper_lang(req.language),
)
print(f"[Pipeline] Whisper alignment completed (multi-material)")
except Exception as e:
logger.warning(f"Whisper alignment failed: {e}")
captions_data = None
captions_path = None
_update_task(task_id, progress=15, message="正在分配素材...")
if captions_data and captions_data.get("segments"):
assignments = _split_equal(captions_data["segments"], material_paths)
else:
# Whisper 失败 → 按时长均分(不依赖字符对齐)
logger.warning("[MultiMat] Whisper 无数据,按时长均分")
audio_dur = video._get_duration(str(audio_path))
if audio_dur <= 0:
audio_dur = 30.0 # 安全兜底
seg_dur = audio_dur / len(material_paths)
assignments = [
{"material_path": material_paths[i], "start": i * seg_dur,
"end": (i + 1) * seg_dur, "index": i}
for i in range(len(material_paths))
]
# 扩展段覆盖完整音频范围首段从0开始末段到音频结尾 # 扩展段覆盖完整音频范围首段从0开始末段到音频结尾
audio_duration = video._get_duration(str(audio_path)) audio_duration = video._get_duration(str(audio_path))
@@ -321,7 +381,8 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:
temp_files.append(prepared_path) temp_files.append(prepared_path)
video.prepare_segment( video.prepare_segment(
str(material_locals[i]), seg_dur, str(prepared_path), str(material_locals[i]), seg_dur, str(prepared_path),
target_resolution=base_res if need_scale else None target_resolution=base_res if need_scale else None,
source_start=assignment.get("source_start", 0.0),
) )
prepared_segments.append(prepared_path) prepared_segments.append(prepared_path)
@@ -363,6 +424,25 @@ async def process_video_generation(task_id: str, req: GenerateRequest, user_id:
# ══════════════════════════════════════ # ══════════════════════════════════════
# 单素材流水线(原有逻辑) # 单素材流水线(原有逻辑)
# ══════════════════════════════════════ # ══════════════════════════════════════
# 单素材 + source_start先截取片段
single_source_start = 0.0
if req.custom_assignments and len(req.custom_assignments) == 1:
single_source_start = req.custom_assignments[0].source_start
if single_source_start > 0:
_update_task(task_id, progress=20, message="正在截取素材片段...")
audio_dur = video._get_duration(str(audio_path))
if audio_dur <= 0:
audio_dur = 30.0
trimmed_path = temp_dir / f"{task_id}_trimmed.mp4"
temp_files.append(trimmed_path)
video.prepare_segment(
str(input_material_path), audio_dur, str(trimmed_path),
source_start=single_source_start,
)
input_material_path = trimmed_path
_update_task(task_id, progress=25) _update_task(task_id, progress=25)
_update_task(task_id, message="正在合成唇形 (LatentSync)...", progress=30) _update_task(task_id, message="正在合成唇形 (LatentSync)...", progress=30)
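多素材流水线在 Whisper 无数据时退回按时长均分素材。这一兜底逻辑可以提炼为如下纯函数(示意,与源码中的内联写法等价):

```python
def split_equal_by_duration(material_paths: list, audio_dur: float) -> list:
    """Whisper 失败时的兜底:按音频时长把素材均分到时间轴"""
    if audio_dur <= 0:
        audio_dur = 30.0  # 安全兜底,与源码一致
    seg_dur = audio_dur / len(material_paths)
    return [
        {"material_path": p, "start": i * seg_dur, "end": (i + 1) * seg_dur, "index": i}
        for i, p in enumerate(material_paths)
    ]

plan = split_equal_by_duration(["a.mp4", "b.mp4", "c.mp4"], 30.0)
print(plan[1])  # {'material_path': 'b.mp4', 'start': 10.0, 'end': 20.0, 'index': 1}
```

传入 custom_assignments 时则完全跳过该兜底,直接采用用户在时间轴编辑器里的分配。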

View File

@@ -20,12 +20,13 @@ class StorageService:
self.BUCKET_MATERIALS = "materials" self.BUCKET_MATERIALS = "materials"
self.BUCKET_OUTPUTS = "outputs" self.BUCKET_OUTPUTS = "outputs"
self.BUCKET_REF_AUDIOS = "ref-audios" self.BUCKET_REF_AUDIOS = "ref-audios"
self.BUCKET_GENERATED_AUDIOS = "generated-audios"
# 确保所有 bucket 存在 # 确保所有 bucket 存在
self._ensure_buckets() self._ensure_buckets()
def _ensure_buckets(self): def _ensure_buckets(self):
"""确保所有必需的 bucket 存在""" """确保所有必需的 bucket 存在"""
buckets = [self.BUCKET_MATERIALS, self.BUCKET_OUTPUTS, self.BUCKET_REF_AUDIOS] buckets = [self.BUCKET_MATERIALS, self.BUCKET_OUTPUTS, self.BUCKET_REF_AUDIOS, self.BUCKET_GENERATED_AUDIOS]
try: try:
existing = self.supabase.storage.list_buckets() existing = self.supabase.storage.list_buckets()
existing_names = {b.name for b in existing} if existing else set() existing_names = {b.name for b in existing} if existing else set()

View File

@@ -210,9 +210,10 @@ class VideoService:
return (0, 0) return (0, 0)
def prepare_segment(self, video_path: str, target_duration: float, output_path: str, def prepare_segment(self, video_path: str, target_duration: float, output_path: str,
target_resolution: tuple = None) -> str: target_resolution: tuple = None, source_start: float = 0.0) -> str:
"""将素材视频裁剪或循环到指定时长(无音频)。 """将素材视频裁剪或循环到指定时长(无音频)。
target_resolution: (width, height) 如需统一分辨率则传入,否则保持原分辨率。 target_resolution: (width, height) 如需统一分辨率则传入,否则保持原分辨率。
source_start: 源视频截取起点(秒),默认 0。
""" """
Path(output_path).parent.mkdir(parents=True, exist_ok=True) Path(output_path).parent.mkdir(parents=True, exist_ok=True)
@@ -220,27 +221,62 @@ class VideoService:
if video_dur <= 0: if video_dur <= 0:
video_dur = target_duration video_dur = target_duration
needs_loop = target_duration > video_dur # 可用时长 = 从 source_start 到视频结尾
available = max(video_dur - source_start, 0.1)
needs_loop = target_duration > available
needs_scale = target_resolution is not None needs_scale = target_resolution is not None
# 当需要循环且有 source_start 时,先裁剪出片段,再循环裁剪后的文件
# 避免 stream_loop 循环整个视频(而不是从 source_start 开始的片段)
actual_input = video_path
trim_temp = None
if needs_loop and source_start > 0:
trim_temp = str(Path(output_path).parent / (Path(output_path).stem + "_trim_tmp.mp4"))
trim_cmd = [
"ffmpeg", "-y",
"-ss", str(source_start),
"-i", video_path,
"-t", str(available),
"-an",
"-c:v", "libx264", "-preset", "fast", "-crf", "18",
trim_temp,
]
if not self._run_ffmpeg(trim_cmd):
raise RuntimeError(f"FFmpeg trim for loop failed: {video_path}")
actual_input = trim_temp
source_start = 0.0 # 已裁剪,不需要再 seek
# 重新计算循环次数(基于裁剪后文件)
available = self._get_duration(trim_temp) or available
loop_count = int(target_duration / available) + 1 if needs_loop else 0
cmd = ["ffmpeg", "-y"] cmd = ["ffmpeg", "-y"]
if needs_loop: if needs_loop:
loop_count = int(target_duration / video_dur) + 1
cmd.extend(["-stream_loop", str(loop_count)]) cmd.extend(["-stream_loop", str(loop_count)])
cmd.extend(["-i", video_path, "-t", str(target_duration), "-an"]) if source_start > 0:
cmd.extend(["-ss", str(source_start)])
cmd.extend(["-i", actual_input, "-t", str(target_duration), "-an"])
if needs_scale: if needs_scale:
w, h = target_resolution w, h = target_resolution
cmd.extend(["-vf", f"scale={w}:{h}:force_original_aspect_ratio=decrease,pad={w}:{h}:(ow-iw)/2:(oh-ih)/2"]) cmd.extend(["-vf", f"scale={w}:{h}:force_original_aspect_ratio=decrease,pad={w}:{h}:(ow-iw)/2:(oh-ih)/2"])
# 需要循环缩放时必须重编码,否则用 stream copy 保持原画质 # 需要循环缩放或指定起点时必须重编码,否则用 stream copy 保持原画质
if needs_loop or needs_scale: if needs_loop or needs_scale or source_start > 0:
cmd.extend(["-c:v", "libx264", "-preset", "fast", "-crf", "18"]) cmd.extend(["-c:v", "libx264", "-preset", "fast", "-crf", "18"])
else: else:
cmd.extend(["-c:v", "copy"]) cmd.extend(["-c:v", "copy"])
cmd.append(output_path) cmd.append(output_path)
if self._run_ffmpeg(cmd): try:
return output_path if self._run_ffmpeg(cmd):
raise RuntimeError(f"FFmpeg prepare_segment failed: {video_path}") return output_path
raise RuntimeError(f"FFmpeg prepare_segment failed: {video_path}")
finally:
# 清理裁剪临时文件
if trim_temp:
try:
Path(trim_temp).unlink(missing_ok=True)
except Exception:
pass
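prepare_segment 的关键计算是:可用时长 = 视频时长 - source_start循环次数 = target_duration ÷ available 向下取整再加 1。把它提炼成纯函数便于验证边界示意不含 FFmpeg 调用):

```python
def plan_segment(video_dur: float, target_duration: float, source_start: float = 0.0) -> dict:
    """复现 prepare_segment 的时长/循环计算(假设性提炼)"""
    available = max(video_dur - source_start, 0.1)  # 从截取起点到视频结尾
    needs_loop = target_duration > available
    loop_count = int(target_duration / available) + 1 if needs_loop else 0
    return {"available": available, "needs_loop": needs_loop, "loop_count": loop_count}

# 10 秒素材、从第 4 秒截取、目标 25 秒:可用 6 秒,需循环 5 次
print(plan_segment(10.0, 25.0, source_start=4.0))
```

需要循环且有 source_start 时,源码先单独裁剪出片段再循环,避免 `-stream_loop` 循环整个原始视频。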

View File

@@ -23,20 +23,28 @@ SERVICES = [
"name": "vigent2-qwen-tts", "name": "vigent2-qwen-tts",
"url": "http://localhost:8009/health", "url": "http://localhost:8009/health",
"failures": 0, "failures": 0,
"threshold": 3, "threshold": 5, # 连续5次失败才重启5×30s = 2.5分钟容忍期)
"timeout": 10.0, "timeout": 10.0,
"restart_cmd": ["pm2", "restart", "vigent2-qwen-tts"] "restart_cmd": ["pm2", "restart", "vigent2-qwen-tts"],
"cooldown_until": 0, # 重启后的冷却截止时间戳
"cooldown_sec": 120, # 重启后等待120秒再开始检查
} }
] ]
async def check_service(service): async def check_service(service):
"""检查单个服务健康状态""" """检查单个服务健康状态"""
# 冷却期内跳过检查
now = time.time()
if now < service.get("cooldown_until", 0):
remaining = int(service["cooldown_until"] - now)
logger.debug(f"⏳ 服务 {service['name']} 冷却中,剩余 {remaining}s")
return True
try: try:
timeout = service.get("timeout", 10.0) timeout = service.get("timeout", 10.0)
async with httpx.AsyncClient(timeout=timeout) as client: async with httpx.AsyncClient(timeout=timeout) as client:
response = await client.get(service["url"]) response = await client.get(service["url"])
if response.status_code == 200: if response.status_code == 200:
# 成功
if service["failures"] > 0: if service["failures"] > 0:
logger.info(f"✅ 服务 {service['name']} 已恢复正常") logger.info(f"✅ 服务 {service['name']} 已恢复正常")
service["failures"] = 0 service["failures"] = 0
@@ -45,35 +53,36 @@ async def check_service(service):
logger.warning(f"⚠️ 服务 {service['name']} 返回状态码 {response.status_code}") logger.warning(f"⚠️ 服务 {service['name']} 返回状态码 {response.status_code}")
except Exception as e: except Exception as e:
logger.warning(f"⚠️ 无法连接服务 {service['name']}: {str(e)}") logger.warning(f"⚠️ 无法连接服务 {service['name']}: {str(e)}")
# 失败处理 # 失败处理
service["failures"] += 1 service["failures"] += 1
logger.warning(f"❌ 服务 {service['name']} 连续失败 {service['failures']}/{service['threshold']}") logger.warning(f"❌ 服务 {service['name']} 连续失败 {service['failures']}/{service['threshold']}")
if service["failures"] >= service['threshold']: if service["failures"] >= service['threshold']:
logger.error(f"🚨 服务 {service['name']} 已达到失败阈值,正在重启...") logger.error(f"🚨 服务 {service['name']} 已达到失败阈值,正在重启...")
try: try:
subprocess.run(service["restart_cmd"], check=True) subprocess.run(service["restart_cmd"], check=True)
logger.info(f"♻️ 服务 {service['name']} 重启命令已发送") logger.info(f"♻️ 服务 {service['name']} 重启命令已发送")
# 重启后给予一段宽限期 (例如 60秒) 不检查,等待服务启动 service["failures"] = 0
service["failures"] = 0 # 重置计数 # 设置冷却期,等待服务完成启动和模型加载
return "restarting" service["cooldown_until"] = time.time() + service.get("cooldown_sec", 120)
return "restarting"
except Exception as restart_error: except Exception as restart_error:
logger.error(f"💥 重启服务 {service['name']} 失败: {restart_error}") logger.error(f"💥 重启服务 {service['name']} 失败: {restart_error}")
return False return False
async def main(): async def main():
logger.info("🛡️ ViGent2 服务看门狗 (Watchdog) 已启动") logger.info("🛡️ ViGent2 服务看门狗 (Watchdog) 已启动")
# 启动时给所有服务一个初始冷却期,避免服务还没起来就被判定失败
for service in SERVICES:
service["cooldown_until"] = time.time() + 60
while True: while True:
# 并发检查所有服务
for service in SERVICES: for service in SERVICES:
result = await check_service(service) await check_service(service)
if result == "restarting":
# 如果有服务重启,额外等待包含启动时间
pass
# 每 30 秒检查一次 # 每 30 秒检查一次
await asyncio.sleep(30) await asyncio.sleep(30)
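看门狗的核心状态机是:连续失败计数达到阈值才重启,重启后进入冷却期跳过检查。下面用纯函数模拟这一决策(示意,不含 HTTP 探测与 pm2 调用):

```python
def on_check(service: dict, ok: bool, now: float) -> str:
    """返回本次检查的决策: skip / healthy / failing / restart"""
    if now < service.get("cooldown_until", 0):
        return "skip"  # 冷却期内不检查
    if ok:
        service["failures"] = 0
        return "healthy"
    service["failures"] += 1
    if service["failures"] >= service["threshold"]:
        service["failures"] = 0
        service["cooldown_until"] = now + service["cooldown_sec"]
        return "restart"
    return "failing"

svc = {"failures": 0, "threshold": 5, "cooldown_sec": 120, "cooldown_until": 0}
# 每 30 秒一次、连续失败:第 5 次触发重启
print([on_check(svc, False, t) for t in range(0, 150, 30)])
# ['failing', 'failing', 'failing', 'failing', 'restart']
```

阈值 5 × 间隔 30s 即正文注释中的 2.5 分钟容忍期;冷却期则给模型加载留出时间,避免刚重启又被判失败。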

View File

@@ -18,7 +18,8 @@
"react": "19.2.3", "react": "19.2.3",
"react-dom": "19.2.3", "react-dom": "19.2.3",
"sonner": "^2.0.7", "sonner": "^2.0.7",
"swr": "^2.3.8" "swr": "^2.3.8",
"wavesurfer.js": "^7.12.1"
}, },
"devDependencies": { "devDependencies": {
"@tailwindcss/postcss": "^4", "@tailwindcss/postcss": "^4",
@@ -6667,6 +6668,12 @@
"react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0" "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0"
} }
}, },
"node_modules/wavesurfer.js": {
"version": "7.12.1",
"resolved": "https://registry.npmjs.org/wavesurfer.js/-/wavesurfer.js-7.12.1.tgz",
"integrity": "sha512-NswPjVHxk0Q1F/VMRemCPUzSojjuHHisQrBqQiRXg7MVbe3f5vQ6r0rTTXA/a/neC/4hnOEC4YpXca4LpH0SUg==",
"license": "BSD-3-Clause"
},
"node_modules/which": { "node_modules/which": {
"version": "2.0.2", "version": "2.0.2",
"resolved": "https://registry.npmjs.org/which/-/which-2.0.2.tgz", "resolved": "https://registry.npmjs.org/which/-/which-2.0.2.tgz",

View File

@@ -19,7 +19,8 @@
"react": "19.2.3", "react": "19.2.3",
"react-dom": "19.2.3", "react-dom": "19.2.3",
"sonner": "^2.0.7", "sonner": "^2.0.7",
"swr": "^2.3.8" "swr": "^2.3.8",
"wavesurfer.js": "^7.12.1"
}, },
"devDependencies": { "devDependencies": {
"@tailwindcss/postcss": "^4", "@tailwindcss/postcss": "^4",

View File

@@ -46,7 +46,6 @@ export default function RootLayout({
<Toaster <Toaster
position="top-center" position="top-center"
richColors richColors
closeButton
toastOptions={{ toastOptions={{
duration: 3000, duration: 3000,
className: "text-sm", className: "text-sm",

View File

@@ -0,0 +1,192 @@
import { useCallback, useEffect, useRef, useState } from "react";
import api from "@/shared/api/axios";
import { ApiResponse, unwrap } from "@/shared/api/types";
import { toast } from "sonner";
export interface GeneratedAudio {
id: string;
name: string;
path: string;
duration_sec: number;
text: string;
tts_mode: string;
language: string;
created_at: number;
}
interface AudioTask {
status: string;
progress?: number;
message?: string;
output?: GeneratedAudio & { audio_id: string };
}
interface UseGeneratedAudiosOptions {
selectedAudioId: string | null;
setSelectedAudioId: React.Dispatch<React.SetStateAction<string | null>>;
}
export const useGeneratedAudios = ({
selectedAudioId,
setSelectedAudioId,
}: UseGeneratedAudiosOptions) => {
const [generatedAudios, setGeneratedAudios] = useState<GeneratedAudio[]>([]);
const [selectedAudio, setSelectedAudio] = useState<GeneratedAudio | null>(null);
const [isGeneratingAudio, setIsGeneratingAudio] = useState(false);
const [audioTaskId, setAudioTaskId] = useState<string | null>(null);
const [audioTask, setAudioTask] = useState<AudioTask | null>(null);
const pollRef = useRef<NodeJS.Timeout | null>(null);
const fetchGeneratedAudios = useCallback(async (selectId?: string) => {
try {
const { data: res } = await api.get<ApiResponse<{ items: GeneratedAudio[] }>>(
"/api/generated-audios"
);
const payload = unwrap(res);
const items: GeneratedAudio[] = payload.items || [];
setGeneratedAudios(items);
if (selectId && items.length > 0) {
if (selectId === "__latest__") {
setSelectedAudioId(items[0].id);
setSelectedAudio(items[0]);
} else {
const found = items.find((a) => a.id === selectId);
if (found) {
setSelectedAudioId(found.id);
setSelectedAudio(found);
}
}
}
} catch (error) {
console.error("获取配音列表失败:", error);
}
}, [setSelectedAudioId]);
// Sync selectedAudio when selectedAudioId changes externally (e.g. from persistence)
useEffect(() => {
if (!selectedAudioId || generatedAudios.length === 0) return;
const found = generatedAudios.find((a) => a.id === selectedAudioId);
if (found) {
setSelectedAudio(found);
}
}, [selectedAudioId, generatedAudios]);
const stopPolling = useCallback(() => {
if (pollRef.current) {
clearInterval(pollRef.current);
pollRef.current = null;
}
}, []);
const startPolling = useCallback((taskId: string) => {
stopPolling();
pollRef.current = setInterval(async () => {
try {
const { data: res } = await api.get<ApiResponse<AudioTask>>(
`/api/generated-audios/tasks/${taskId}`
);
const task = unwrap(res);
setAudioTask(task);
if (task.status === "completed") {
stopPolling();
setIsGeneratingAudio(false);
setAudioTaskId(null);
// Refresh list and select the new audio
await fetchGeneratedAudios("__latest__");
toast.success(task.message || "配音生成完成");
} else if (task.status === "failed") {
stopPolling();
setIsGeneratingAudio(false);
setAudioTaskId(null);
toast.error(task.message || "配音生成失败");
} else if (task.status === "not_found") {
stopPolling();
setIsGeneratingAudio(false);
setAudioTaskId(null);
setAudioTask(null);
toast.error("任务已丢失(服务可能已重启),请重新生成");
}
} catch {
// Network error, keep polling
}
}, 1000);
}, [stopPolling, fetchGeneratedAudios]);
// Cleanup on unmount
useEffect(() => {
return () => stopPolling();
}, [stopPolling]);
const generateAudio = useCallback(async (params: {
text: string;
tts_mode: string;
voice?: string;
ref_audio_id?: string;
ref_text?: string;
language: string;
}) => {
setIsGeneratingAudio(true);
setAudioTask({ status: "pending", progress: 0, message: "正在提交..." });
try {
const { data: res } = await api.post<ApiResponse<{ task_id: string }>>(
"/api/generated-audios/generate",
params
);
const { task_id } = unwrap(res);
setAudioTaskId(task_id);
startPolling(task_id);
} catch (err: unknown) {
setIsGeneratingAudio(false);
setAudioTask(null);
const axiosErr = err as { response?: { data?: { message?: string } }; message?: string };
const errorMsg = axiosErr.response?.data?.message || axiosErr.message || String(err);
toast.error(`配音生成失败: ${errorMsg}`);
}
}, [startPolling]);
const deleteAudio = useCallback(async (audioId: string) => {
if (!confirm("确定要删除这个配音吗?")) return;
try {
await api.delete(`/api/generated-audios/${encodeURIComponent(audioId)}`);
if (selectedAudioId === audioId) {
setSelectedAudioId(null);
setSelectedAudio(null);
}
fetchGeneratedAudios();
} catch (error) {
toast.error("删除失败: " + String(error));
}
}, [fetchGeneratedAudios, selectedAudioId, setSelectedAudioId]);
const renameAudio = useCallback(async (audioId: string, newName: string) => {
try {
await api.put(`/api/generated-audios/${encodeURIComponent(audioId)}`, {
new_name: newName,
});
fetchGeneratedAudios();
} catch (err: unknown) {
toast.error("重命名失败: " + String(err));
}
}, [fetchGeneratedAudios]);
const selectAudio = useCallback((audio: GeneratedAudio) => {
setSelectedAudioId(audio.id);
setSelectedAudio(audio);
}, [setSelectedAudioId]);
return {
generatedAudios,
selectedAudio,
selectedAudioId,
isGeneratingAudio,
audioTask,
fetchGeneratedAudios,
generateAudio,
deleteAudio,
renameAudio,
selectAudio,
};
};

View File

@@ -18,11 +18,14 @@ import { usePublishPrefetch } from "@/shared/hooks/usePublishPrefetch";
import { PublishAccount } from "@/shared/types/publish";
import { useBgm } from "@/features/home/model/useBgm";
import { useGeneratedVideos } from "@/features/home/model/useGeneratedVideos";
import { useGeneratedAudios } from "@/features/home/model/useGeneratedAudios";
import { useHomePersistence } from "@/features/home/model/useHomePersistence";
import { useMaterials } from "@/features/home/model/useMaterials";
import { useMediaPlayers } from "@/features/home/model/useMediaPlayers";
import { useRefAudios } from "@/features/home/model/useRefAudios";
import { useTitleSubtitleStyles } from "@/features/home/model/useTitleSubtitleStyles";
import { useTimelineEditor } from "@/features/home/model/useTimelineEditor";
import { useSavedScripts } from "@/features/home/model/useSavedScripts";
import { ApiResponse, unwrap } from "@/shared/api/types";
const VOICES: Record<string, { id: string; name: string }[]> = {
@@ -164,6 +167,13 @@ export const useHomeController = () => {
const [selectedRefAudio, setSelectedRefAudio] = useState<RefAudio | null>(null);
const [refText, setRefText] = useState(FIXED_REF_TEXT);
// Selected pre-generated voiceover ID
const [selectedAudioId, setSelectedAudioId] = useState<string | null>(null);
// ClipTrimmer modal state
const [clipTrimmerOpen, setClipTrimmerOpen] = useState(false);
const [clipTrimmerSegmentId, setClipTrimmerSegmentId] = useState<string | null>(null);
// Audio preview and rename state
const [editingAudioId, setEditingAudioId] = useState<string | null>(null);
const [editName, setEditName] = useState("");
@@ -347,6 +357,33 @@ export const useHomeController = () => {
resolveMediaUrl,
});
const {
generatedAudios,
selectedAudio,
isGeneratingAudio,
audioTask,
fetchGeneratedAudios,
generateAudio,
deleteAudio,
renameAudio,
selectAudio,
} = useGeneratedAudios({
selectedAudioId,
setSelectedAudioId,
});
const {
segments: timelineSegments,
reorderSegments,
setSourceRange,
toCustomAssignments,
} = useTimelineEditor({
audioDuration: selectedAudio?.duration_sec ?? 0,
materials,
selectedMaterials,
storageKey,
});
useEffect(() => {
if (isAuthLoading || !userId) return;
let active = true;
@@ -420,8 +457,18 @@ export const useHomeController = () => {
selectedVideoId,
setSelectedVideoId,
selectedRefAudio,
selectedAudioId,
setSelectedAudioId,
});
const { savedScripts, saveScript, deleteScript: deleteSavedScript } = useSavedScripts(storageKey);
const handleSaveScript = () => {
if (!text.trim()) return;
saveScript(text);
toast.success("文案已保存");
};
const syncTitleToPublish = (value: string) => {
if (typeof window !== "undefined") {
localStorage.setItem(`vigent_${storageKey}_publish_title`, value);
@@ -441,6 +488,7 @@ export const useHomeController = () => {
fetchMaterials(),
fetchGeneratedVideos(),
fetchRefAudios(),
fetchGeneratedAudios(),
refreshSubtitleStyles(),
refreshTitleStyles(),
fetchBgmList(),
@@ -537,14 +585,22 @@ export const useHomeController = () => {
}
}, [selectedBgmId, bgmList]);
// Material-list scrolling: skip the first restore; only scroll on explicit user actions
const materialScrollReady = useRef(false);
useEffect(() => {
const firstSelected = selectedMaterials[0];
if (!firstSelected) return;
if (!materialScrollReady.current) {
// On the first selected material, mark ready without scrolling (avoids the whole page jumping after a refresh)
materialScrollReady.current = true;
return;
}
const target = materialItemRefs.current[firstSelected];
if (target) {
target.scrollIntoView({ block: "nearest", behavior: "smooth" });
}
- }, [selectedMaterials, materials]);
+ // eslint-disable-next-line react-hooks/exhaustive-deps
+ }, [selectedMaterials.length]);
// [Fix] Default-selection logic for history videos
// Once persistence restore completes and the list has loaded, select the first video if none is selected
@@ -741,6 +797,28 @@ export const useHomeController = () => {
}
};
// Generate voiceover
const handleGenerateAudio = async () => {
if (!text.trim()) {
toast.error("请先输入文案");
return;
}
if (ttsMode === "voiceclone" && !selectedRefAudio) {
toast.error("请选择参考音频");
return;
}
const params = {
text: text.trim(),
tts_mode: ttsMode,
voice: ttsMode === "edgetts" ? voice : undefined,
ref_audio_id: ttsMode === "voiceclone" ? selectedRefAudio!.id : undefined,
ref_text: ttsMode === "voiceclone" ? refText : undefined,
language: textLang,
};
await generateAudio(params);
};
// Generate video
const handleGenerate = async () => {
if (selectedMaterials.length === 0 || !text.trim()) {
@@ -748,12 +826,9 @@ export const useHomeController = () => {
return;
}
- // Voice-clone mode validation
- if (ttsMode === "voiceclone") {
- if (!selectedRefAudio) {
- toast.error("请选择或上传参考音频");
- return;
- }
- }
+ if (!selectedAudio) {
+ toast.error("请先生成并选中配音");
+ return;
+ }
if (enableBgm && !selectedBgmId) {
@@ -771,11 +846,12 @@ export const useHomeController = () => {
return;
}
- // Build request payload
+ // Build request payload (use pre-generated voiceover)
const payload: Record<string, unknown> = {
material_path: firstMaterialObj.path,
- text: text,
- tts_mode: ttsMode,
+ text: selectedAudio.text || text,
+ generated_audio_id: selectedAudio.id,
+ language: selectedAudio.language || textLang,
title: videoTitle.trim() || undefined,
enable_subtitles: true,
};
@@ -785,6 +861,16 @@ export const useHomeController = () => {
payload.material_paths = selectedMaterials
.map((id) => materials.find((x) => x.id === id)?.path)
.filter((path): path is string => !!path);
// Send custom timeline assignments
const assignments = toCustomAssignments();
if (assignments.length > 0) {
payload.custom_assignments = assignments;
}
}
// Single material + trim start offset
if (selectedMaterials.length === 1 && timelineSegments[0]?.sourceStart > 0) {
payload.custom_assignments = toCustomAssignments();
}
if (selectedSubtitleStyleId) {
@@ -814,15 +900,6 @@ export const useHomeController = () => {
payload.bgm_volume = bgmVolume;
}
- payload.language = textLang;
- if (ttsMode === "edgetts") {
- payload.voice = voice;
- } else {
- payload.ref_audio_id = selectedRefAudio!.id;
- payload.ref_text = refText;
- }
// Create generation task
const { data: res } = await api.post<ApiResponse<{ task_id: string }>>(
"/api/videos/generate",
@@ -885,7 +962,6 @@ export const useHomeController = () => {
handleUpload,
selectedMaterials,
toggleMaterial,
- reorderMaterials,
handlePreviewMaterial,
editingMaterialId,
editMaterialName,
@@ -903,6 +979,9 @@ export const useHomeController = () => {
isTranslating,
originalText,
handleRestoreOriginal,
savedScripts,
handleSaveScript,
deleteSavedScript,
showStylePreview,
setShowStylePreview,
videoTitle,
@@ -983,5 +1062,22 @@ export const useHomeController = () => {
fetchGeneratedVideos,
registerVideoRef,
formatDate,
generatedAudios,
selectedAudio,
selectedAudioId,
isGeneratingAudio,
audioTask,
fetchGeneratedAudios,
handleGenerateAudio,
deleteAudio,
renameAudio,
selectAudio,
timelineSegments,
reorderSegments,
setSourceRange,
clipTrimmerOpen,
setClipTrimmerOpen,
clipTrimmerSegmentId,
setClipTrimmerSegmentId,
};
};
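The two `custom_assignments` branches in `handleGenerate` (multiple materials with a populated timeline vs. a single material with a trim start offset) reduce to a small decision function. A standalone sketch of that branching with simplified inputs; `pickAssignments` is a hypothetical helper, not part of the codebase:

```typescript
interface Assignment {
  material_path: string;
  start: number;
  end: number;
  source_start: number;
}

// Returns the assignments to attach to the payload, or undefined when the
// backend should fall back to its default material splitting:
// - multiple materials: send assignments whenever the timeline produced any
// - single material: send them only when a trim start offset was set
function pickAssignments(
  selectedCount: number,
  firstSourceStart: number,
  assignments: Assignment[]
): Assignment[] | undefined {
  if (selectedCount > 1 && assignments.length > 0) return assignments;
  if (selectedCount === 1 && firstSourceStart > 0) return assignments;
  return undefined;
}
```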

View File

@@ -48,6 +48,8 @@ interface UseHomePersistenceOptions {
selectedVideoId: string | null;
setSelectedVideoId: React.Dispatch<React.SetStateAction<string | null>>;
selectedRefAudio: RefAudio | null;
selectedAudioId: string | null;
setSelectedAudioId: React.Dispatch<React.SetStateAction<string | null>>;
}
export const useHomePersistence = ({
@@ -88,6 +90,8 @@ export const useHomePersistence = ({
selectedVideoId,
setSelectedVideoId,
selectedRefAudio,
selectedAudioId,
setSelectedAudioId,
}: UseHomePersistenceOptions) => {
const [isRestored, setIsRestored] = useState(false);
@@ -106,6 +110,7 @@ export const useHomePersistence = ({
const savedTitleFontSize = localStorage.getItem(`vigent_${storageKey}_titleFontSize`);
const savedBgmId = localStorage.getItem(`vigent_${storageKey}_bgmId`);
const savedSelectedVideoId = localStorage.getItem(`vigent_${storageKey}_selectedVideoId`);
const savedSelectedAudioId = localStorage.getItem(`vigent_${storageKey}_selectedAudioId`);
const savedBgmVolume = localStorage.getItem(`vigent_${storageKey}_bgmVolume`);
const savedEnableBgm = localStorage.getItem(`vigent_${storageKey}_enableBgm`);
const savedTitleTopMargin = localStorage.getItem(`vigent_${storageKey}_titleTopMargin`);
@@ -153,6 +158,7 @@ export const useHomePersistence = ({
if (savedBgmVolume) setBgmVolume(parseFloat(savedBgmVolume));
if (savedEnableBgm !== null) setEnableBgm(savedEnableBgm === 'true');
if (savedSelectedVideoId) setSelectedVideoId(savedSelectedVideoId);
if (savedSelectedAudioId) setSelectedAudioId(savedSelectedAudioId);
if (savedTitleTopMargin) {
const parsed = parseInt(savedTitleTopMargin, 10);
@@ -174,6 +180,7 @@ export const useHomePersistence = ({
setSelectedSubtitleStyleId, setSelectedSubtitleStyleId,
setSelectedTitleStyleId, setSelectedTitleStyleId,
setSelectedVideoId,
setSelectedAudioId,
setSubtitleFontSize,
setSubtitleSizeLocked,
setText,
@@ -287,6 +294,15 @@ export const useHomePersistence = ({
}
}, [selectedVideoId, storageKey, isRestored]);
useEffect(() => {
if (!isRestored) return;
if (selectedAudioId) {
localStorage.setItem(`vigent_${storageKey}_selectedAudioId`, selectedAudioId);
} else {
localStorage.removeItem(`vigent_${storageKey}_selectedAudioId`);
}
}, [selectedAudioId, storageKey, isRestored]);
useEffect(() => {
if (isRestored && selectedRefAudio) {
localStorage.setItem(`vigent_${storageKey}_refAudioId`, selectedRefAudio.id);

View File

@@ -2,8 +2,36 @@ import { useCallback, useState } from "react";
import api from "@/shared/api/axios";
import { ApiResponse, unwrap } from "@/shared/api/types";
import { toast } from "sonner";
import { resolveMediaUrl } from "@/shared/lib/media";
import type { Material } from "@/shared/types/material";
/** Probe video duration from a URL using <video> element */
function probeVideoDuration(url: string): Promise<number> {
return new Promise((resolve) => {
const video = document.createElement("video");
video.preload = "metadata";
video.crossOrigin = "anonymous";
const cleanup = () => {
video.removeEventListener("loadedmetadata", onMeta);
video.removeEventListener("error", onError);
video.src = "";
};
const onMeta = () => {
const dur = video.duration;
cleanup();
resolve(Number.isFinite(dur) ? dur : 0);
};
const onError = () => {
cleanup();
resolve(0);
};
video.addEventListener("loadedmetadata", onMeta);
video.addEventListener("error", onError);
video.src = url;
video.load();
});
}
interface UseMaterialsOptions {
selectedMaterials: string[];
setSelectedMaterials: React.Dispatch<React.SetStateAction<string[]>>;
@@ -34,6 +62,18 @@ export const useMaterials = ({
setMaterials(nextMaterials);
setLastMaterialCount(nextMaterials.length);
// Probe video durations in background
if (nextMaterials.length > 0) {
Promise.all(
nextMaterials.map(async (m) => {
const url = resolveMediaUrl(m.path);
if (!url) return m;
const dur = await probeVideoDuration(url);
return { ...m, duration_sec: dur };
})
).then((enriched) => setMaterials(enriched));
}
setSelectedMaterials((prev) => {
// Keep materials that are still selected and still exist
const existingIds = new Set(nextMaterials.map((m) => m.id));
@@ -133,6 +173,18 @@ export const useMaterials = ({
setMaterials(nextMaterials);
setLastMaterialCount(nextMaterials.length);
// Probe video durations in background
if (nextMaterials.length > 0) {
Promise.all(
nextMaterials.map(async (m) => {
const url = resolveMediaUrl(m.path);
if (!url) return m;
const dur = await probeVideoDuration(url);
return { ...m, duration_sec: dur };
})
).then((enriched) => setMaterials(enriched));
}
// Find the newly added material IDs and auto-select them
const oldIds = new Set(materials.map((m) => m.id));
const newIds = nextMaterials.filter((m) => !oldIds.has(m.id)).map((m) => m.id);

View File

@@ -0,0 +1,51 @@
import { useState, useEffect, useRef } from "react";
export interface SavedScript {
id: string;
name: string;
content: string;
savedAt: number;
}
export function useSavedScripts(storageKey: string) {
const lsKey = `vigent_${storageKey}_savedScripts`;
const lsKeyRef = useRef(lsKey);
lsKeyRef.current = lsKey;
const [savedScripts, setSavedScripts] = useState<SavedScript[]>([]);
// Re-read from localStorage whenever lsKey changes (e.g. guest → userId)
useEffect(() => {
try {
const raw = localStorage.getItem(lsKey);
setSavedScripts(raw ? JSON.parse(raw) : []);
} catch {
setSavedScripts([]);
}
}, [lsKey]);
const saveScript = (content: string) => {
const name = content.slice(0, 15).replace(/\n/g, " ") || "未命名";
const entry: SavedScript = {
id: Date.now().toString(36) + Math.random().toString(36).slice(2, 6),
name,
content,
savedAt: Date.now(),
};
setSavedScripts((prev) => {
const next = [entry, ...prev];
localStorage.setItem(lsKeyRef.current, JSON.stringify(next));
return next;
});
};
const deleteScript = (id: string) => {
setSavedScripts((prev) => {
const next = prev.filter((s) => s.id !== id);
localStorage.setItem(lsKeyRef.current, JSON.stringify(next));
return next;
});
};
return { savedScripts, saveScript, deleteScript };
}
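The list manipulation inside `useSavedScripts` is pure apart from the `localStorage` writes, so it can be sketched and exercised without React. A minimal mirror of that logic (standalone helper names are assumptions, not the hook's exports):

```typescript
interface SavedScript {
  id: string;
  name: string;
  content: string;
  savedAt: number;
}

// Mirror of the hook's rules: the display name is the first 15 characters
// of the content with newlines flattened (falling back to "未命名"), new
// entries are prepended (most recent first), and deletion filters by id.
function makeEntry(content: string, now: number = Date.now()): SavedScript {
  return {
    id: now.toString(36),
    name: content.slice(0, 15).replace(/\n/g, " ") || "未命名",
    content,
    savedAt: now,
  };
}

const prependScript = (list: SavedScript[], entry: SavedScript): SavedScript[] =>
  [entry, ...list];

const removeScript = (list: SavedScript[], id: string): SavedScript[] =>
  list.filter((s) => s.id !== id);
```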

View File

@@ -0,0 +1,246 @@
import { useCallback, useEffect, useRef, useState } from "react";
import type { Material } from "@/shared/types/material";
export interface TimelineSegment {
id: string;
materialId: string;
materialName: string;
start: number;
end: number;
sourceStart: number;
sourceEnd: number;
color: string;
}
export interface CustomAssignment {
material_path: string;
start: number;
end: number;
source_start: number;
}
const COLORS = ["#8b5cf6", "#ec4899", "#06b6d4", "#f59e0b", "#10b981", "#f97316"];
/** Serializable subset for localStorage */
interface SegmentSnapshot {
materialId: string;
start: number;
end: number;
sourceStart: number;
sourceEnd: number;
}
/** Get effective duration of a segment (clipped range or full material duration) */
function getEffectiveDuration(
seg: { sourceStart: number; sourceEnd: number; materialId: string },
mats: Material[]
): number {
if (seg.sourceEnd > seg.sourceStart) return seg.sourceEnd - seg.sourceStart;
const mat = mats.find((m) => m.id === seg.materialId);
return mat?.duration_sec ?? 0;
}
/**
* Recalculate segment start/end positions based on effective durations.
* - Segments placed sequentially by effective duration
* - Segments exceeding audioDuration keep their positions (overflow, start >= duration)
* - Last visible segment is capped/extended to exactly audioDuration (loop fill)
*/
function recalcPositions(
segs: TimelineSegment[],
mats: Material[],
duration: number
): TimelineSegment[] {
if (segs.length === 0 || duration <= 0) return segs;
const fallbackDur = duration / segs.length;
let cursor = 0;
const result = segs.map((seg) => {
const effDur = getEffectiveDuration(seg, mats);
const dur = effDur > 0 ? effDur : fallbackDur;
const newSeg = { ...seg, start: cursor, end: cursor + dur };
cursor += dur;
return newSeg;
});
// Find last segment that starts before audioDuration
let lastVisibleIdx = -1;
for (let i = result.length - 1; i >= 0; i--) {
if (result[i].start < duration) {
lastVisibleIdx = i;
break;
}
}
// Cap/extend last visible segment to exactly audioDuration
if (lastVisibleIdx >= 0) {
result[lastVisibleIdx] = { ...result[lastVisibleIdx], end: duration };
}
return result;
}
interface UseTimelineEditorOptions {
audioDuration: number;
materials: Material[];
selectedMaterials: string[];
storageKey?: string;
}
export const useTimelineEditor = ({
audioDuration,
materials,
selectedMaterials,
storageKey,
}: UseTimelineEditorOptions) => {
const [segments, setSegments] = useState<TimelineSegment[]>([]);
const prevKey = useRef("");
const restoredRef = useRef(false);
// Refs for stable callbacks (avoid recreating on every materials/duration change)
const materialsRef = useRef(materials);
materialsRef.current = materials;
const audioDurationRef = useRef(audioDuration);
audioDurationRef.current = audioDuration;
// Build a durationsKey so segments re-init when material durations become available
const durationsKey = selectedMaterials
.map((id) => materials.find((m) => m.id === id)?.duration_sec ?? 0)
.join(",");
// Build a cache key from materials + duration
const cacheKey = `${selectedMaterials.join(",")}_${audioDuration.toFixed(1)}`;
const lsKey = storageKey ? `vigent_${storageKey}_timeline` : null;
const initSegments = useCallback(() => {
if (selectedMaterials.length === 0 || audioDuration <= 0) {
setSegments([]);
return;
}
// Try restore from localStorage
if (lsKey) {
try {
const raw = localStorage.getItem(lsKey);
if (raw) {
const saved = JSON.parse(raw) as { key: string; segments: SegmentSnapshot[] };
if (saved.key === cacheKey && saved.segments.length === selectedMaterials.length) {
const allMatch = saved.segments.every(
(s, i) => s.materialId === selectedMaterials[i] || saved.segments.some((ss) => ss.materialId === selectedMaterials[i])
);
if (allMatch) {
const restored: TimelineSegment[] = saved.segments.map((s, i) => {
const mat = materials.find((m) => m.id === s.materialId);
return {
id: `seg-${i}-${Date.now()}`,
materialId: s.materialId,
materialName: mat?.scene || mat?.name || s.materialId,
start: 0,
end: 0,
sourceStart: s.sourceStart,
sourceEnd: s.sourceEnd,
color: COLORS[i % COLORS.length],
};
});
setSegments(recalcPositions(restored, materials, audioDuration));
restoredRef.current = true;
return;
}
}
}
} catch {
// ignore parse errors
}
}
// Create fresh segments — positions derived by recalcPositions
const newSegments: TimelineSegment[] = selectedMaterials.map((matId, i) => {
const mat = materials.find((m) => m.id === matId);
return {
id: `seg-${i}-${Date.now()}`,
materialId: matId,
materialName: mat?.scene || mat?.name || matId,
start: 0,
end: 0,
sourceStart: 0,
sourceEnd: 0,
color: COLORS[i % COLORS.length],
};
});
setSegments(recalcPositions(newSegments, materials, audioDuration));
}, [audioDuration, materials, selectedMaterials, lsKey, cacheKey]);
// Auto-init when selectedMaterials, audioDuration, or material durations change
useEffect(() => {
const key = `${selectedMaterials.join(",")}_${audioDuration}_${durationsKey}`;
if (key !== prevKey.current) {
prevKey.current = key;
initSegments();
}
}, [selectedMaterials, audioDuration, durationsKey, initSegments]);
// Persist segments to localStorage on change (debounced)
useEffect(() => {
if (!lsKey || segments.length === 0) return;
const timeout = setTimeout(() => {
const snapshots: SegmentSnapshot[] = segments.map((s) => ({
materialId: s.materialId,
start: s.start,
end: s.end,
sourceStart: s.sourceStart,
sourceEnd: s.sourceEnd,
}));
localStorage.setItem(lsKey, JSON.stringify({ key: cacheKey, segments: snapshots }));
}, 300);
return () => clearTimeout(timeout);
}, [segments, lsKey, cacheKey]);
const reorderSegments = useCallback(
(fromIdx: number, toIdx: number) => {
setSegments((prev) => {
if (fromIdx < 0 || toIdx < 0 || fromIdx >= prev.length || toIdx >= prev.length) return prev;
if (fromIdx === toIdx) return prev;
const next = [...prev];
// Move the segment: remove from old position, insert at new position
const [moved] = next.splice(fromIdx, 1);
next.splice(toIdx, 0, moved);
return recalcPositions(next, materialsRef.current, audioDurationRef.current);
});
},
[]
);
const setSourceRange = useCallback(
(id: string, sourceStart: number, sourceEnd: number) => {
setSegments((prev) => {
const updated = prev.map((s) => (s.id === id ? { ...s, sourceStart, sourceEnd } : s));
return recalcPositions(updated, materialsRef.current, audioDurationRef.current);
});
},
[]
);
const toCustomAssignments = useCallback((): CustomAssignment[] => {
const duration = audioDurationRef.current;
return segments
.filter((seg) => seg.start < duration)
.map((seg) => {
const mat = materialsRef.current.find((m) => m.id === seg.materialId);
return {
material_path: mat?.path || seg.materialId,
start: seg.start,
end: seg.end,
source_start: seg.sourceStart,
};
});
}, [segments]);
return {
segments,
initSegments,
reorderSegments,
setSourceRange,
toCustomAssignments,
};
};
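The placement rule in `recalcPositions` can be checked in isolation: segments are laid end to end by effective duration, and the last segment that starts before the audio end is pinned to exactly `audioDuration` (capped when it overruns, extended when it falls short, since rendering loop-fills). A simplified standalone sketch that works on plain durations instead of `TimelineSegment`:

```typescript
interface PlacedSeg {
  start: number;
  end: number;
}

// Lay segments out sequentially, then cap/extend the last segment that
// starts before audioDuration so the visible timeline ends exactly there.
// Segments whose start already exceeds audioDuration keep their overflow
// positions untouched.
function place(durations: number[], audioDuration: number): PlacedSeg[] {
  let cursor = 0;
  const result: PlacedSeg[] = durations.map((dur) => {
    const seg = { start: cursor, end: cursor + dur };
    cursor += dur;
    return seg;
  });
  // Walk backwards to find the last segment visible within the audio
  for (let i = result.length - 1; i >= 0; i--) {
    if (result[i].start < audioDuration) {
      result[i] = { ...result[i], end: audioDuration };
      break;
    }
  }
  return result;
}
```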

View File

@@ -0,0 +1,293 @@
import { useCallback, useEffect, useRef, useState } from "react";
import { X, Play, Pause } from "lucide-react";
import type { TimelineSegment } from "@/features/home/model/useTimelineEditor";
interface ClipTrimmerProps {
isOpen: boolean;
segment: TimelineSegment | null;
materialUrl: string | null;
onConfirm: (sourceStart: number, sourceEnd: number) => void;
onClose: () => void;
}
function formatSec(sec: number): string {
const m = Math.floor(sec / 60);
const s = sec % 60;
return `${String(m).padStart(2, "0")}:${s.toFixed(1).padStart(4, "0")}`;
}
export function ClipTrimmer({
isOpen,
segment,
materialUrl,
onConfirm,
onClose,
}: ClipTrimmerProps) {
const videoRef = useRef<HTMLVideoElement>(null);
const trackRef = useRef<HTMLDivElement>(null);
const [duration, setDuration] = useState(0);
const [sourceStart, setSourceStart] = useState(0);
const [sourceEnd, setSourceEnd] = useState(0);
const [currentTime, setCurrentTime] = useState(0);
const [isPlaying, setIsPlaying] = useState(false);
const [dragging, setDragging] = useState<"start" | "end" | null>(null);
const animRef = useRef<number>(0);
// Reset state when segment changes
useEffect(() => {
if (segment && isOpen) {
setSourceStart(segment.sourceStart);
setSourceEnd(segment.sourceEnd);
setCurrentTime(segment.sourceStart);
setIsPlaying(false);
}
}, [segment, isOpen]);
// Track currentTime during playback
useEffect(() => {
if (!isPlaying || !videoRef.current) return;
const tick = () => {
if (!videoRef.current) return;
const t = videoRef.current.currentTime;
const end = sourceEnd || duration;
if (t >= end) {
videoRef.current.pause();
videoRef.current.currentTime = sourceStart;
setCurrentTime(sourceStart);
setIsPlaying(false);
return;
}
setCurrentTime(t);
animRef.current = requestAnimationFrame(tick);
};
animRef.current = requestAnimationFrame(tick);
return () => cancelAnimationFrame(animRef.current);
}, [isPlaying, sourceStart, sourceEnd, duration]);
// Seek video when not playing and currentTime changes
useEffect(() => {
if (videoRef.current && !isPlaying) {
videoRef.current.currentTime = currentTime;
}
}, [currentTime, isPlaying]);
const handleLoadedMetadata = useCallback(() => {
if (videoRef.current) {
const dur = videoRef.current.duration;
setDuration(dur);
if (sourceEnd === 0) {
setSourceEnd(dur);
}
}
}, [sourceEnd]);
const togglePlay = useCallback(() => {
if (!videoRef.current || duration === 0) return;
if (isPlaying) {
videoRef.current.pause();
setIsPlaying(false);
} else {
const end = sourceEnd || duration;
if (videoRef.current.currentTime >= end || videoRef.current.currentTime < sourceStart) {
videoRef.current.currentTime = sourceStart;
setCurrentTime(sourceStart);
}
videoRef.current.play().catch(() => {});
setIsPlaying(true);
}
}, [isPlaying, sourceStart, sourceEnd, duration]);
// --- Dual-handle slider logic ---
const getPositionFromEvent = useCallback(
(clientX: number) => {
if (!trackRef.current || duration === 0) return 0;
const rect = trackRef.current.getBoundingClientRect();
const ratio = Math.max(0, Math.min(1, (clientX - rect.left) / rect.width));
return ratio * duration;
},
[duration]
);
const handleThumbPointerDown = useCallback(
(which: "start" | "end", e: React.PointerEvent) => {
e.preventDefault();
e.stopPropagation();
setDragging(which);
(e.target as HTMLElement).setPointerCapture(e.pointerId);
},
[]
);
const handleTrackPointerMove = useCallback(
(e: React.PointerEvent) => {
if (!dragging) return;
const pos = getPositionFromEvent(e.clientX);
const minGap = 0.5;
if (dragging === "start") {
const clamped = Math.max(0, Math.min(pos, (sourceEnd || duration) - minGap));
setSourceStart(clamped);
setCurrentTime(clamped);
} else {
const clamped = Math.min(duration, Math.max(pos, sourceStart + minGap));
setSourceEnd(clamped);
}
},
[dragging, getPositionFromEvent, sourceStart, sourceEnd, duration]
);
const handleTrackPointerUp = useCallback(() => {
setDragging(null);
}, []);
const handleConfirm = () => {
onConfirm(sourceStart, sourceEnd >= duration ? 0 : sourceEnd);
};
if (!isOpen || !segment) return null;
const assignedDur = segment.end - segment.start;
const effectiveEnd = sourceEnd || duration;
const clipDur = effectiveEnd - sourceStart;
const startPct = duration > 0 ? (sourceStart / duration) * 100 : 0;
const endPct = duration > 0 ? (effectiveEnd / duration) * 100 : 100;
const playheadPct = duration > 0 ? (currentTime / duration) * 100 : 0;
return (
<div className="fixed inset-0 z-50 flex items-center justify-center bg-black/60 backdrop-blur-sm" onClick={onClose}>
<div
className="bg-gray-900 border border-white/10 rounded-2xl w-full max-w-lg mx-4 overflow-hidden"
onClick={(e) => e.stopPropagation()}
>
{/* Header */}
<div className="flex items-center justify-between px-5 py-3 border-b border-white/10">
<h3 className="text-white font-semibold text-sm">
- {segment.materialName}
</h3>
<button onClick={onClose} className="text-gray-400 hover:text-white">
<X className="h-4 w-4" />
</button>
</div>
{/* Video preview */}
<div className="px-5 pt-4">
<div className="relative bg-black rounded-lg overflow-hidden aspect-video group">
{materialUrl ? (
<video
ref={videoRef}
src={materialUrl}
className="w-full h-full object-contain"
onLoadedMetadata={handleLoadedMetadata}
onEnded={() => setIsPlaying(false)}
preload="auto"
muted
/>
) : (
<div className="flex items-center justify-center h-full text-gray-500 text-sm">
</div>
)}
{/* Play/Pause overlay */}
{materialUrl && (
<button
onClick={togglePlay}
className="absolute inset-0 flex items-center justify-center bg-black/0 hover:bg-black/30 transition-colors"
>
<div className={`p-3 rounded-full bg-black/60 text-white transition-opacity ${isPlaying ? "opacity-0 group-hover:opacity-100" : "opacity-100"}`}>
{isPlaying ? <Pause className="h-6 w-6" /> : <Play className="h-6 w-6" />}
</div>
</button>
)}
            <div className="absolute bottom-2 right-2 bg-black/70 text-white text-[10px] px-2 py-0.5 rounded pointer-events-none">
              {formatSec(currentTime)}
            </div>
          </div>
        </div>

        {/* Dual-handle range slider */}
        <div className="px-5 py-4 space-y-3">
          <div className="text-xs text-gray-400 flex justify-between">
            <span>: {duration > 0 ? formatSec(duration) : "加载中..."}</span>
          </div>

          {/* Custom range track */}
          <div
            ref={trackRef}
            className="relative h-8 cursor-pointer select-none touch-none"
            onPointerMove={handleTrackPointerMove}
            onPointerUp={handleTrackPointerUp}
            onPointerLeave={handleTrackPointerUp}
          >
            {/* Background track */}
            <div className="absolute top-1/2 -translate-y-1/2 left-0 right-0 h-2 bg-white/10 rounded-full" />
            {/* Selected range */}
            <div
              className="absolute top-1/2 -translate-y-1/2 h-2 rounded-full"
              style={{
                left: `${startPct}%`,
                width: `${endPct - startPct}%`,
                backgroundColor: segment.color + "88",
              }}
            />
            {/* Playhead indicator */}
            {duration > 0 && (
              <div
                className="absolute top-1/2 -translate-y-1/2 w-0.5 h-4 bg-white/60 rounded-full pointer-events-none"
                style={{ left: `${playheadPct}%` }}
              />
            )}
            {/* Start thumb */}
            <div
              onPointerDown={(e) => handleThumbPointerDown("start", e)}
              className="absolute top-1/2 -translate-y-1/2 -translate-x-1/2 w-4 h-4 rounded-full bg-purple-500 border-2 border-white shadow-lg cursor-grab active:cursor-grabbing hover:scale-110 transition-transform z-10"
              style={{ left: `${startPct}%` }}
              title={`起点: ${formatSec(sourceStart)}`}
            />
            {/* End thumb */}
            <div
              onPointerDown={(e) => handleThumbPointerDown("end", e)}
              className="absolute top-1/2 -translate-y-1/2 -translate-x-1/2 w-4 h-4 rounded-full bg-pink-500 border-2 border-white shadow-lg cursor-grab active:cursor-grabbing hover:scale-110 transition-transform z-10"
              style={{ left: `${endPct}%` }}
              title={`终点: ${formatSec(effectiveEnd)}`}
            />
          </div>

          {/* Time labels */}
          <div className="flex justify-between text-xs text-gray-400">
            <span className="text-purple-400">{formatSec(sourceStart)}</span>
            <span className="text-pink-400">{formatSec(effectiveEnd)}</span>
          </div>

          {/* Info */}
          <div className="text-[11px] text-gray-500 flex items-center gap-2 flex-wrap">
            <span>: {clipDur.toFixed(1)}s</span>
            <span className="text-gray-600">|</span>
            <span>: {assignedDur.toFixed(1)}s</span>
            {clipDur < assignedDur && <span className="text-amber-500">()</span>}
            {clipDur > assignedDur && <span className="text-cyan-500">()</span>}
          </div>
        </div>

        {/* Actions */}
        <div className="flex justify-end gap-2 px-5 pb-4">
          <button
            onClick={onClose}
            className="px-4 py-1.5 text-xs bg-white/10 hover:bg-white/20 rounded-lg text-gray-300 transition-colors"
          >
          </button>
          <button
            onClick={handleConfirm}
            className="px-4 py-1.5 text-xs bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-700 hover:to-pink-700 text-white rounded-lg transition-colors"
          >
          </button>
        </div>
      </div>
    </div>
  );
}


@@ -0,0 +1,224 @@
import { useState, useRef, useCallback, useEffect } from "react";
import { Play, Pause, Pencil, Trash2, Check, X, RefreshCw, Mic } from "lucide-react";
import type { GeneratedAudio } from "@/features/home/model/useGeneratedAudios";

interface AudioTask {
  status: string;
  progress?: number;
  message?: string;
}

interface GeneratedAudiosPanelProps {
  generatedAudios: GeneratedAudio[];
  selectedAudioId: string | null;
  isGeneratingAudio: boolean;
  audioTask: AudioTask | null;
  onGenerateAudio: () => void;
  onRefresh: () => void;
  onSelectAudio: (audio: GeneratedAudio) => void;
  onDeleteAudio: (id: string) => void;
  onRenameAudio: (id: string, newName: string) => void;
  hasText: boolean;
}

export function GeneratedAudiosPanel({
  generatedAudios,
  selectedAudioId,
  isGeneratingAudio,
  audioTask,
  onGenerateAudio,
  onRefresh,
  onSelectAudio,
  onDeleteAudio,
  onRenameAudio,
  hasText,
}: GeneratedAudiosPanelProps) {
  const [editingId, setEditingId] = useState<string | null>(null);
  const [editName, setEditName] = useState("");
  const [playingId, setPlayingId] = useState<string | null>(null);
  const audioRef = useRef<HTMLAudioElement | null>(null);

  const stopPlaying = useCallback(() => {
    if (audioRef.current) {
      audioRef.current.pause();
      audioRef.current.currentTime = 0;
      audioRef.current = null;
    }
    setPlayingId(null);
  }, []);

  // Cleanup on unmount
  useEffect(() => {
    return () => {
      if (audioRef.current) {
        audioRef.current.pause();
        audioRef.current = null;
      }
    };
  }, []);

  const togglePlay = (audio: GeneratedAudio, e: React.MouseEvent) => {
    e.stopPropagation();
    if (playingId === audio.id) {
      stopPlaying();
      return;
    }
    stopPlaying();
    const player = new Audio(audio.path);
    player.onended = () => setPlayingId(null);
    player.play().catch(() => {});
    audioRef.current = player;
    setPlayingId(audio.id);
  };

  const startEditing = (audio: GeneratedAudio, e: React.MouseEvent) => {
    e.stopPropagation();
    setEditingId(audio.id);
    setEditName(audio.name);
  };

  const saveEditing = (audioId: string, e: React.MouseEvent) => {
    e.stopPropagation();
    if (!editName.trim()) return;
    onRenameAudio(audioId, editName.trim());
    setEditingId(null);
    setEditName("");
  };

  const cancelEditing = (e: React.MouseEvent) => {
    e.stopPropagation();
    setEditingId(null);
    setEditName("");
  };

  return (
    <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
      <div className="flex justify-between items-center gap-2 mb-4">
        <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2 whitespace-nowrap">
          <Mic className="h-4 w-4 text-purple-400" />
        </h2>
        <div className="flex gap-1.5">
          <button
            onClick={onGenerateAudio}
            disabled={isGeneratingAudio || !hasText}
            className={`px-2 py-1 text-xs rounded transition-all whitespace-nowrap flex items-center gap-1 ${
              isGeneratingAudio || !hasText
                ? "bg-gray-600 cursor-not-allowed text-gray-400"
                : "bg-gradient-to-r from-purple-600 to-pink-600 hover:from-purple-700 hover:to-pink-700 text-white"
            }`}
          >
            <Mic className="h-3.5 w-3.5" />
          </button>
          <button
            onClick={onRefresh}
            className="px-2 py-1 text-xs bg-white/10 hover:bg-white/20 rounded text-gray-300 whitespace-nowrap flex items-center gap-1"
          >
            <RefreshCw className="h-3.5 w-3.5" />
          </button>
        </div>
      </div>

      {/* 生成进度 */}
      {isGeneratingAudio && audioTask && (
        <div className="mb-4 p-3 bg-purple-500/10 rounded-xl border border-purple-500/30">
          <div className="flex justify-between text-sm text-purple-300 mb-2">
            <span>{audioTask.message || "生成中..."}</span>
            <span>{audioTask.progress || 0}%</span>
          </div>
          <div className="h-2 bg-black/30 rounded-full overflow-hidden">
            <div
              className="h-full bg-gradient-to-r from-purple-500 to-pink-500 transition-all duration-300"
              style={{ width: `${audioTask.progress || 0}%` }}
            />
          </div>
        </div>
      )}

      {/* 配音列表 */}
      {generatedAudios.length === 0 ? (
        <div className="text-center py-6 text-gray-400">
          <p className="text-sm"></p>
          <p className="text-xs mt-1 text-gray-500"></p>
        </div>
      ) : (
        <div className="space-y-2 max-h-48 sm:max-h-56 overflow-y-auto hide-scrollbar">
          {generatedAudios.map((audio) => {
            const isSelected = selectedAudioId === audio.id;
            return (
              <div
                key={audio.id}
                onClick={() => onSelectAudio(audio)}
                className={`p-3 rounded-lg border transition-all cursor-pointer flex items-center justify-between group ${
                  isSelected
                    ? "border-purple-500 bg-purple-500/20"
                    : "border-white/10 bg-white/5 hover:border-white/30"
                }`}
              >
                {editingId === audio.id ? (
                  <div className="flex-1 flex items-center gap-2" onClick={(e) => e.stopPropagation()}>
                    <input
                      value={editName}
                      onChange={(e) => setEditName(e.target.value)}
                      className="flex-1 bg-black/40 border border-white/20 rounded-md px-2 py-1 text-xs text-white"
                      autoFocus
                      onKeyDown={(e) => {
                        if (e.key === "Enter") saveEditing(audio.id, e as unknown as React.MouseEvent);
                        if (e.key === "Escape") cancelEditing(e as unknown as React.MouseEvent);
                      }}
                    />
                    <button onClick={(e) => saveEditing(audio.id, e)} className="p-1 text-green-400 hover:text-green-300" title="保存">
                      <Check className="h-4 w-4" />
                    </button>
                    <button onClick={cancelEditing} className="p-1 text-gray-400 hover:text-white" title="取消">
                      <X className="h-4 w-4" />
                    </button>
                  </div>
                ) : (
                  <>
                    <div className="min-w-0 flex-1">
                      <div className="text-white text-sm truncate">{audio.name}</div>
                      <div className="text-gray-400 text-xs">{audio.duration_sec.toFixed(1)}s</div>
                    </div>
                    <div className="flex items-center gap-1 pl-2 opacity-0 group-hover:opacity-100 transition-opacity">
                      <button
                        onClick={(e) => togglePlay(audio, e)}
                        className="p-1 text-gray-500 hover:text-purple-400 transition-colors"
                        title={playingId === audio.id ? "暂停" : "播放"}
                      >
                        {playingId === audio.id ? (
                          <Pause className="h-3.5 w-3.5" />
                        ) : (
                          <Play className="h-3.5 w-3.5" />
                        )}
                      </button>
                      <button
                        onClick={(e) => startEditing(audio, e)}
                        className="p-1 text-gray-500 hover:text-white transition-colors"
                        title="重命名"
                      >
                        <Pencil className="h-3.5 w-3.5" />
                      </button>
                      <button
                        onClick={(e) => {
                          e.stopPropagation();
                          onDeleteAudio(audio.id);
                        }}
                        className="p-1 text-gray-500 hover:text-red-400 transition-colors"
                        title="删除"
                      >
                        <Trash2 className="h-3.5 w-3.5" />
                      </button>
                    </div>
                  </>
                )}
              </div>
            );
          })}
        </div>
      )}
    </div>
  );
}


@@ -1,20 +1,24 @@
 "use client";
-import { useEffect } from "react";
+import { useEffect, useMemo } from "react";
 import { useRouter } from "next/navigation";
 import VideoPreviewModal from "@/components/VideoPreviewModal";
 import ScriptExtractionModal from "./ScriptExtractionModal";
 import { useHomeController } from "@/features/home/model/useHomeController";
+import { resolveMediaUrl } from "@/shared/lib/media";
 import { BgmPanel } from "@/features/home/ui/BgmPanel";
 import { GenerateActionBar } from "@/features/home/ui/GenerateActionBar";
 import { HistoryList } from "@/features/home/ui/HistoryList";
 import { HomeHeader } from "@/features/home/ui/HomeHeader";
 import { MaterialSelector } from "@/features/home/ui/MaterialSelector";
+import { TimelineEditor } from "@/features/home/ui/TimelineEditor";
+import { ClipTrimmer } from "@/features/home/ui/ClipTrimmer";
 import { PreviewPanel } from "@/features/home/ui/PreviewPanel";
 import { RefAudioPanel } from "@/features/home/ui/RefAudioPanel";
 import { ScriptEditor } from "@/features/home/ui/ScriptEditor";
 import { TitleSubtitlePanel } from "@/features/home/ui/TitleSubtitlePanel";
 import { VoiceSelector } from "@/features/home/ui/VoiceSelector";
+import { GeneratedAudiosPanel } from "@/features/home/ui/GeneratedAudiosPanel";

 export function HomePage() {
   const router = useRouter();
@@ -36,7 +40,6 @@ export function HomePage() {
     handleUpload,
     selectedMaterials,
     toggleMaterial,
-    reorderMaterials,
     handlePreviewMaterial,
     editingMaterialId,
     editMaterialName,
@@ -54,6 +57,9 @@ export function HomePage() {
     isTranslating,
     originalText,
     handleRestoreOriginal,
+    savedScripts,
+    handleSaveScript,
+    deleteSavedScript,
     showStylePreview,
     setShowStylePreview,
     videoTitle,
@@ -133,12 +139,40 @@ export function HomePage() {
     fetchGeneratedVideos,
     registerVideoRef,
     formatDate,
+    generatedAudios,
+    selectedAudio,
+    selectedAudioId,
+    isGeneratingAudio,
+    audioTask,
+    fetchGeneratedAudios,
+    handleGenerateAudio,
+    deleteAudio,
+    renameAudio,
+    selectAudio,
+    timelineSegments,
+    reorderSegments,
+    setSourceRange,
+    clipTrimmerOpen,
+    setClipTrimmerOpen,
+    clipTrimmerSegmentId,
+    setClipTrimmerSegmentId,
   } = useHomeController();

   useEffect(() => {
     router.prefetch("/publish");
   }, [router]);

+  const clipTrimmerSegment = useMemo(
+    () => timelineSegments.find((s) => s.id === clipTrimmerSegmentId) ?? null,
+    [timelineSegments, clipTrimmerSegmentId]
+  );
+
+  const clipTrimmerMaterialUrl = useMemo(() => {
+    if (!clipTrimmerSegment) return null;
+    const mat = materials.find((m) => m.id === clipTrimmerSegment.materialId);
+    return mat?.path ? resolveMediaUrl(mat.path) : null;
+  }, [clipTrimmerSegment, materials]);
+
   return (
     <div className="min-h-dvh">
       <HomeHeader />
@@ -147,34 +181,7 @@ export function HomePage() {
         <div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
           {/* 左侧: 输入区域 */}
           <div className="space-y-6">
-            {/* 素材选择 */}
-            <MaterialSelector
-              materials={materials}
-              selectedMaterials={selectedMaterials}
-              isFetching={isFetching}
-              lastMaterialCount={lastMaterialCount}
-              editingMaterialId={editingMaterialId}
-              editMaterialName={editMaterialName}
-              isUploading={isUploading}
-              uploadProgress={uploadProgress}
-              uploadError={uploadError}
-              fetchError={fetchError}
-              apiBase={apiBase}
-              onUploadChange={handleUpload}
-              onRefresh={fetchMaterials}
-              onToggleMaterial={toggleMaterial}
-              onReorderMaterials={reorderMaterials}
-              onPreviewMaterial={handlePreviewMaterial}
-              onStartEditing={startMaterialEditing}
-              onEditNameChange={setEditMaterialName}
-              onSaveEditing={saveMaterialEditing}
-              onCancelEditing={cancelMaterialEditing}
-              onDeleteMaterial={deleteMaterial}
-              onClearUploadError={() => setUploadError(null)}
-              registerMaterialRef={registerMaterialRef}
-            />
-            {/* 文案输入 */}
+            {/* 1. 文案输入 */}
             <ScriptEditor
               text={text}
               onChangeText={setText}
@@ -185,9 +192,13 @@ export function HomePage() {
               isTranslating={isTranslating}
               hasOriginalText={originalText !== null}
               onRestoreOriginal={handleRestoreOriginal}
+              savedScripts={savedScripts}
+              onSaveScript={handleSaveScript}
+              onLoadScript={setText}
+              onDeleteScript={deleteSavedScript}
             />

-            {/* 标题和字幕设置 */}
+            {/* 2. 标题和字幕设置 */}
             <TitleSubtitlePanel
               showStylePreview={showStylePreview}
               onTogglePreview={() => setShowStylePreview((prev) => !prev)}
@@ -222,7 +233,7 @@ export function HomePage() {
               previewBaseHeight={materialDimensions?.height || 1920}
             />

-            {/* 配音方式选择 */}
+            {/* 3. 配音方式选择 */}
             <VoiceSelector
               ttsMode={ttsMode}
               onSelectTtsMode={setTtsMode}
@@ -260,7 +271,69 @@ export function HomePage() {
               )}
             />

-            {/* 背景音乐 */}
+            {/* 4. 配音列表 */}
+            <GeneratedAudiosPanel
+              generatedAudios={generatedAudios}
+              selectedAudioId={selectedAudioId}
+              isGeneratingAudio={isGeneratingAudio}
+              audioTask={audioTask}
+              onGenerateAudio={handleGenerateAudio}
+              onRefresh={() => fetchGeneratedAudios()}
+              onSelectAudio={selectAudio}
+              onDeleteAudio={deleteAudio}
+              onRenameAudio={renameAudio}
+              hasText={!!text.trim()}
+            />
+
+            {/* 5. 视频素材 */}
+            <MaterialSelector
+              materials={materials}
+              selectedMaterials={selectedMaterials}
+              isFetching={isFetching}
+              lastMaterialCount={lastMaterialCount}
+              editingMaterialId={editingMaterialId}
+              editMaterialName={editMaterialName}
+              isUploading={isUploading}
+              uploadProgress={uploadProgress}
+              uploadError={uploadError}
+              fetchError={fetchError}
+              apiBase={apiBase}
+              onUploadChange={handleUpload}
+              onRefresh={fetchMaterials}
+              onToggleMaterial={toggleMaterial}
+              onPreviewMaterial={handlePreviewMaterial}
+              onStartEditing={startMaterialEditing}
+              onEditNameChange={setEditMaterialName}
+              onSaveEditing={saveMaterialEditing}
+              onCancelEditing={cancelMaterialEditing}
+              onDeleteMaterial={deleteMaterial}
+              onClearUploadError={() => setUploadError(null)}
+              registerMaterialRef={registerMaterialRef}
+            />
+
+            {/* 5.5 时间轴编辑器 — 未选配音/素材时模糊遮挡 */}
+            <div className="relative">
+              {(!selectedAudio || selectedMaterials.length === 0) && (
+                <div className="absolute inset-0 bg-black/50 backdrop-blur-sm rounded-2xl flex items-center justify-center z-10">
+                  <p className="text-gray-400">
+                    {!selectedAudio ? "请先生成并选中配音" : "请先选择素材"}
+                  </p>
+                </div>
+              )}
+              <TimelineEditor
+                audioDuration={selectedAudio?.duration_sec ?? 0}
+                audioUrl={selectedAudio ? (resolveMediaUrl(selectedAudio.path) || "") : ""}
+                segments={timelineSegments}
+                materials={materials}
+                onReorderSegment={reorderSegments}
+                onClickSegment={(seg) => {
+                  setClipTrimmerSegmentId(seg.id);
+                  setClipTrimmerOpen(true);
+                }}
+              />
+            </div>
+
+            {/* 6. 背景音乐 */}
             <BgmPanel
               bgmList={bgmList}
               bgmLoading={bgmLoading}
@@ -278,12 +351,12 @@ export function HomePage() {
               registerBgmItemRef={registerBgmItemRef}
             />

-            {/* 生成按钮 */}
+            {/* 7. 生成按钮 */}
             <GenerateActionBar
               isGenerating={isGenerating}
               progress={currentTask?.progress || 0}
               materialCount={selectedMaterials.length}
-              disabled={isGenerating || selectedMaterials.length === 0 || (ttsMode === "voiceclone" && !selectedRefAudio)}
+              disabled={isGenerating || selectedMaterials.length === 0 || !selectedAudio}
               onGenerate={handleGenerate}
             />
           </div>
@@ -319,6 +392,19 @@ export function HomePage() {
         onClose={() => setExtractModalOpen(false)}
         onApply={(nextText) => setText(nextText)}
       />
+      <ClipTrimmer
+        isOpen={clipTrimmerOpen}
+        segment={clipTrimmerSegment}
+        materialUrl={clipTrimmerMaterialUrl}
+        onConfirm={(sourceStart, sourceEnd) => {
+          if (clipTrimmerSegmentId) {
+            setSourceRange(clipTrimmerSegmentId, sourceStart, sourceEnd);
+          }
+          setClipTrimmerOpen(false);
+        }}
+        onClose={() => setClipTrimmerOpen(false)}
+      />
     </div>
   );
 }


@@ -1,21 +1,6 @@
 import { type ChangeEvent, type MouseEvent } from "react";
-import { Upload, RefreshCw, Eye, Trash2, X, Pencil, Check, GripVertical } from "lucide-react";
+import { Upload, RefreshCw, Eye, Trash2, X, Pencil, Check } from "lucide-react";
 import type { Material } from "@/shared/types/material";
-import {
-  DndContext,
-  closestCenter,
-  KeyboardSensor,
-  PointerSensor,
-  useSensor,
-  useSensors,
-  type DragEndEvent,
-} from "@dnd-kit/core";
-import {
-  SortableContext,
-  horizontalListSortingStrategy,
-  useSortable,
-} from "@dnd-kit/sortable";
-import { CSS } from "@dnd-kit/utilities";

 interface MaterialSelectorProps {
   materials: Material[];
@@ -32,7 +17,6 @@ interface MaterialSelectorProps {
   onUploadChange: (event: ChangeEvent<HTMLInputElement>) => void;
   onRefresh: () => void;
   onToggleMaterial: (id: string) => void;
-  onReorderMaterials: (activeId: string, overId: string) => void;
   onPreviewMaterial: (path: string) => void;
   onStartEditing: (material: Material, event: MouseEvent) => void;
   onEditNameChange: (value: string) => void;
@@ -43,61 +27,6 @@ interface MaterialSelectorProps {
   registerMaterialRef: (id: string, element: HTMLDivElement | null) => void;
 }

-function SortableChip({
-  id,
-  index,
-  label,
-  onRemove,
-}: {
-  id: string;
-  index: number;
-  label: string;
-  onRemove: () => void;
-}) {
-  const {
-    attributes,
-    listeners,
-    setNodeRef,
-    transform,
-    transition,
-    isDragging,
-  } = useSortable({ id });
-
-  const style = {
-    transform: CSS.Translate.toString(transform),
-    transition,
-  };
-
-  const circledNumbers = ["\u2460", "\u2461", "\u2462", "\u2463", "\u2464", "\u2465", "\u2466", "\u2467", "\u2468", "\u2469"];
-
-  return (
-    <div
-      ref={setNodeRef}
-      style={style}
-      className={`flex items-center gap-1 rounded-lg px-2 py-1 text-xs whitespace-nowrap transition-colors ${
-        isDragging
-          ? "bg-purple-500/50 border border-purple-400 text-white shadow-lg shadow-purple-500/30 z-10"
-          : "bg-purple-500/30 border border-purple-500/50 text-purple-200"
-      }`}
-    >
-      <span {...attributes} {...listeners} className="cursor-grab active:cursor-grabbing text-purple-400">
-        <GripVertical className="h-3 w-3" />
-      </span>
-      <span className="text-purple-300">{circledNumbers[index] || `${index + 1}`}</span>
-      <span className="max-w-[80px] truncate">{label}</span>
-      <button
-        onClick={(e) => {
-          e.stopPropagation();
-          onRemove();
-        }}
-        className="text-purple-400 hover:text-white ml-0.5"
-      >
-        <X className="h-3 w-3" />
-      </button>
-    </div>
-  );
-}

 export function MaterialSelector({
   materials,
   selectedMaterials,
@@ -113,7 +42,6 @@ export function MaterialSelector({
   onUploadChange,
   onRefresh,
   onToggleMaterial,
-  onReorderMaterials,
   onPreviewMaterial,
   onStartEditing,
   onEditNameChange,
@@ -123,21 +51,8 @@ export function MaterialSelector({
   onClearUploadError,
   registerMaterialRef,
 }: MaterialSelectorProps) {
-  const sensors = useSensors(
-    useSensor(PointerSensor, { activationConstraint: { distance: 5 } }),
-    useSensor(KeyboardSensor)
-  );
-
-  const handleDragEnd = (event: DragEndEvent) => {
-    const { active, over } = event;
-    if (over && active.id !== over.id) {
-      onReorderMaterials(String(active.id), String(over.id));
-    }
-  };
-
   const selectedSet = new Set(selectedMaterials);
   const isFull = selectedMaterials.length >= 4;
-  const circledNumbers = ["\u2460", "\u2461", "\u2462", "\u2463", "\u2464", "\u2465", "\u2466", "\u2467", "\u2468", "\u2469"];

   return (
     <div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
@@ -200,38 +115,6 @@ export function MaterialSelector({
         </div>
       )}

-      {/* 已选素材排列(拖拽排序区) - 仅当选中 >= 2 个时显示 */}
-      {selectedMaterials.length >= 2 && (
-        <div className="mb-3 p-3 bg-purple-500/10 rounded-xl border border-purple-500/20">
-          <div className="text-[11px] text-purple-300/70 mb-2">🎬 ()</div>
-          <DndContext
-            sensors={sensors}
-            collisionDetection={closestCenter}
-            onDragEnd={handleDragEnd}
-          >
-            <SortableContext
-              items={selectedMaterials}
-              strategy={horizontalListSortingStrategy}
-            >
-              <div className="flex flex-wrap gap-1.5">
-                {selectedMaterials.map((id, index) => {
-                  const m = materials.find((x) => x.id === id);
-                  return (
-                    <SortableChip
-                      key={id}
-                      id={id}
-                      index={index}
-                      label={m?.scene || m?.name || id}
-                      onRemove={() => onToggleMaterial(id)}
-                    />
-                  );
-                })}
-              </div>
-            </SortableContext>
-          </DndContext>
-        </div>
-      )}

       {fetchError ? (
         <div className="p-4 bg-red-500/20 text-red-200 rounded-xl text-sm mb-4">
           : {fetchError}
@@ -265,7 +148,6 @@ export function MaterialSelector({
         >
           {materials.map((m) => {
             const isSelected = selectedSet.has(m.id);
-            const selIndex = selectedMaterials.indexOf(m.id);
             return (
               <div
                 key={m.id}
@@ -309,7 +191,7 @@ export function MaterialSelector({
                     : "border-white/30 text-transparent"
                 }`}
               >
-                {isSelected ? (selIndex >= 0 ? circledNumbers[selIndex] || "✓" : "✓") : ""}
+                {isSelected ? "✓" : ""}
               </span>
               <div className="min-w-0">
                 <div className="text-white text-sm truncate">{m.scene || m.name}</div>


@@ -1,5 +1,6 @@
 import { useEffect, useRef, useState } from "react";
-import { FileText, Languages, Loader2, RotateCcw, Sparkles } from "lucide-react";
+import { FileText, History, Languages, Loader2, RotateCcw, Save, Sparkles, Trash2 } from "lucide-react";
+import type { SavedScript } from "@/features/home/model/useSavedScripts";

 const LANGUAGES = [
   { code: "English", label: "英语 English" },
@@ -23,6 +24,10 @@ interface ScriptEditorProps {
   isTranslating: boolean;
   hasOriginalText: boolean;
   onRestoreOriginal: () => void;
+  savedScripts: SavedScript[];
+  onSaveScript: () => void;
+  onLoadScript: (content: string) => void;
+  onDeleteScript: (id: string) => void;
 }

 export function ScriptEditor({
@@ -35,9 +40,15 @@ export function ScriptEditor({
   isTranslating,
   hasOriginalText,
   onRestoreOriginal,
+  savedScripts,
+  onSaveScript,
+  onLoadScript,
+  onDeleteScript,
 }: ScriptEditorProps) {
   const [showLangMenu, setShowLangMenu] = useState(false);
   const langMenuRef = useRef<HTMLDivElement>(null);
+  const [showHistoryMenu, setShowHistoryMenu] = useState(false);
+  const historyMenuRef = useRef<HTMLDivElement>(null);

   useEffect(() => {
     if (!showLangMenu) return;
@@ -50,21 +61,81 @@ export function ScriptEditor({
     return () => document.removeEventListener("mousedown", handleClickOutside);
   }, [showLangMenu]);

+  useEffect(() => {
+    if (!showHistoryMenu) return;
+    const handleClickOutside = (e: MouseEvent) => {
+      if (historyMenuRef.current && !historyMenuRef.current.contains(e.target as Node)) {
+        setShowHistoryMenu(false);
+      }
+    };
+    document.addEventListener("mousedown", handleClickOutside);
+    return () => document.removeEventListener("mousedown", handleClickOutside);
+  }, [showHistoryMenu]);
+
   const handleSelectLang = (langCode: string) => {
     setShowLangMenu(false);
     onTranslate(langCode);
   };

+  const formatDate = (ts: number) => {
+    const d = new Date(ts);
+    return `${(d.getMonth() + 1).toString().padStart(2, "0")}-${d.getDate().toString().padStart(2, "0")} ${d.getHours().toString().padStart(2, "0")}:${d.getMinutes().toString().padStart(2, "0")}`;
+  };
+
   return (
     <div className="relative z-10 bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
       <div className="mb-4 space-y-3">
         <h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2">
         </h2>
-        <div className="flex gap-2 flex-wrap justify-end">
+        <div className="flex gap-2 flex-wrap justify-end items-center">
+          {/* 历史文案 */}
+          <div className="relative" ref={historyMenuRef}>
+            <button
+              onClick={() => setShowHistoryMenu((prev) => !prev)}
+              className="h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap bg-gray-600 hover:bg-gray-500 text-white inline-flex items-center gap-1"
+            >
+              <History className="h-3.5 w-3.5" />
+            </button>
+            {showHistoryMenu && (
+              <div className="absolute left-0 top-full mt-1 z-50 bg-gray-800 border border-white/10 rounded-lg shadow-xl py-1 min-w-[220px] max-h-[280px] overflow-y-auto">
+                {savedScripts.length === 0 ? (
+                  <div className="px-3 py-3 text-xs text-gray-500 text-center"></div>
+                ) : (
+                  savedScripts.map((script) => (
+                    <div
+                      key={script.id}
+                      className="flex items-center gap-1 px-3 py-1.5 hover:bg-white/10 transition-colors group"
+                    >
+                      <button
+                        onClick={() => {
+                          onLoadScript(script.content);
+                          setShowHistoryMenu(false);
+                        }}
+                        className="flex-1 text-left min-w-0"
+                      >
+                        <div className="text-xs text-gray-200 truncate">{script.name}</div>
+                        <div className="text-[10px] text-gray-500">{formatDate(script.savedAt)}</div>
+                      </button>
+                      <button
+                        onClick={(e) => {
+                          e.stopPropagation();
+                          onDeleteScript(script.id);
+                        }}
+                        className="opacity-0 group-hover:opacity-100 p-1 text-gray-500 hover:text-red-400 transition-all shrink-0"
+                      >
+                        <Trash2 className="h-3 w-3" />
+                      </button>
+                    </div>
+                  ))
+                )}
+              </div>
+            )}
+          </div>
+
           <button
             onClick={onOpenExtractModal}
-            className="px-2 py-1 text-xs rounded transition-all whitespace-nowrap bg-purple-600 hover:bg-purple-700 text-white flex items-center gap-1"
+            className="h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap bg-purple-600 hover:bg-purple-700 text-white inline-flex items-center gap-1"
           >
             <FileText className="h-3.5 w-3.5" />
@@ -73,22 +144,22 @@ export function ScriptEditor({
           <button
             onClick={() => setShowLangMenu((prev) => !prev)}
             disabled={isTranslating || !text.trim()}
-            className={`px-2 py-1 text-xs rounded transition-all whitespace-nowrap ${
+            className={`h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap inline-flex items-center gap-1 ${
               isTranslating || !text.trim()
                 ? "bg-gray-600 cursor-not-allowed text-gray-400"
                 : "bg-gradient-to-r from-emerald-600 to-teal-600 hover:from-emerald-700 hover:to-teal-700 text-white"
             }`}
           >
             {isTranslating ? (
-              <span className="flex items-center gap-1">
+              <>
                 <Loader2 className="h-3.5 w-3.5 animate-spin" />
                 ...
-              </span>
+              </>
             ) : (
-              <span className="flex items-center gap-1">
+              <>
                 <Languages className="h-3.5 w-3.5" />
                 AI多语言
-              </span>
+              </>
             )}
           </button>
           {showLangMenu && (
@@ -120,21 +191,21 @@ export function ScriptEditor({
           <button
             onClick={onGenerateMeta}
             disabled={isGeneratingMeta || !text.trim()}
-            className={`px-2 py-1 text-xs rounded transition-all whitespace-nowrap ${isGeneratingMeta || !text.trim()
+            className={`h-7 px-2.5 text-xs rounded transition-all whitespace-nowrap inline-flex items-center gap-1 ${isGeneratingMeta || !text.trim()
               ? "bg-gray-600 cursor-not-allowed text-gray-400"
               : "bg-gradient-to-r from-blue-600 to-cyan-600 hover:from-blue-700 hover:to-cyan-700 text-white"
             }`}
           >
             {isGeneratingMeta ? (
-              <span className="flex items-center gap-1">
+              <>
                 <Loader2 className="h-3.5 w-3.5 animate-spin" />
                 ...
-              </span>
+              </>
             ) : (
-              <span className="flex items-center gap-1">
+              <>
                 <Sparkles className="h-3.5 w-3.5" />
                 AI生成标题标签
-              </span>
+              </>
             )}
           </button>
         </div>
@@ -145,9 +216,20 @@ export function ScriptEditor({
         placeholder="请输入你想说的话..."
         className="w-full h-40 bg-black/30 border border-white/10 rounded-xl p-4 text-white placeholder-gray-500 resize-none focus:outline-none focus:border-purple-500 transition-colors hide-scrollbar"
       />
-      <div className="flex justify-between mt-2 text-sm text-gray-400">
+      <div className="flex items-center justify-between mt-2 text-sm text-gray-400">
         <span>{text.length} </span>
-        <span>: ~{Math.ceil(text.length / 4)} </span>
+        <button
+          onClick={onSaveScript}
+          disabled={!text.trim()}
+          className={`px-2.5 py-1 text-xs rounded transition-all flex items-center gap-1 ${
+            !text.trim()
+              ? "bg-gray-700 cursor-not-allowed text-gray-500"
+              : "bg-amber-600/80 hover:bg-amber-600 text-white"
+          }`}
+        >
+          <Save className="h-3 w-3" />
+        </button>
       </div>
     </div>
   );


@@ -0,0 +1,283 @@
import { useEffect, useRef, useCallback, useState } from "react";
import WaveSurfer from "wavesurfer.js";
import type { TimelineSegment } from "@/features/home/model/useTimelineEditor";
import type { Material } from "@/shared/types/material";

interface TimelineEditorProps {
  audioDuration: number;
  audioUrl: string;
  segments: TimelineSegment[];
  materials: Material[];
  onReorderSegment: (fromIdx: number, toIdx: number) => void;
  onClickSegment: (segment: TimelineSegment) => void;
}

function formatTime(sec: number): string {
  const m = Math.floor(sec / 60);
  const s = sec % 60;
  return `${String(m).padStart(2, "0")}:${s.toFixed(1).padStart(4, "0")}`;
}

export function TimelineEditor({
  audioDuration,
  audioUrl,
  segments,
  materials,
  onReorderSegment,
  onClickSegment,
}: TimelineEditorProps) {
  const waveRef = useRef<HTMLDivElement>(null);
  const wsRef = useRef<WaveSurfer | null>(null);
  const [waveReady, setWaveReady] = useState(false);
  const [isPlaying, setIsPlaying] = useState(false);

  // Refs for high-frequency DOM updates (avoid 60fps re-renders)
  const playheadRef = useRef<HTMLDivElement>(null);
  const timeRef = useRef<HTMLSpanElement>(null);
  const audioDurationRef = useRef(audioDuration);
  audioDurationRef.current = audioDuration;

  // Drag-to-reorder state
  const [dragFromIdx, setDragFromIdx] = useState<number | null>(null);
  const [dragOverIdx, setDragOverIdx] = useState<number | null>(null);

  // Create / recreate wavesurfer when audioUrl changes
  useEffect(() => {
    if (!waveRef.current || !audioUrl) return;

    // Destroy previous instance
    if (wsRef.current) {
      wsRef.current.destroy();
      wsRef.current = null;
    }

    const ws = WaveSurfer.create({
      container: waveRef.current,
      height: 56,
      waveColor: "#6d28d9",
      progressColor: "#a855f7",
      barWidth: 2,
      barGap: 1,
      barRadius: 2,
      cursorWidth: 1,
      cursorColor: "#e879f9",
      interact: true,
      normalize: true,
    });

    // Click waveform → seek + auto-play
    ws.on("interaction", () => ws.play());
    ws.on("play", () => setIsPlaying(true));
    ws.on("pause", () => setIsPlaying(false));
    ws.on("finish", () => {
      setIsPlaying(false);
      if (playheadRef.current) playheadRef.current.style.display = "none";
    });

    // High-frequency: update playhead + time via refs (no React re-render)
    ws.on("timeupdate", (time: number) => {
      const dur = audioDurationRef.current;
      if (playheadRef.current && dur > 0) {
        playheadRef.current.style.left = `${(time / dur) * 100}%`;
        playheadRef.current.style.display = "block";
      }
      if (timeRef.current) {
        timeRef.current.textContent = formatTime(time);
      }
    });

    ws.load(audioUrl);
    wsRef.current = ws;

    return () => {
      ws.destroy();
      wsRef.current = null;
      setIsPlaying(false);
      if (playheadRef.current) playheadRef.current.style.display = "none";
      if (timeRef.current) timeRef.current.textContent = formatTime(0);
    };
  }, [audioUrl, waveReady]);

  // Callback ref to detect when waveRef div mounts
  const waveCallbackRef = useCallback((node: HTMLDivElement | null) => {
    (waveRef as React.MutableRefObject<HTMLDivElement | null>).current = node;
    setWaveReady(!!node);
  }, []);

  const handlePlayPause = useCallback(() => {
    wsRef.current?.playPause();
  }, []);

  // Drag-to-reorder handlers
  const handleDragStart = useCallback((idx: number, e: React.DragEvent) => {
setDragFromIdx(idx);
e.dataTransfer.effectAllowed = "move";
e.dataTransfer.setData("text/plain", String(idx));
}, []);
const handleDragOver = useCallback((idx: number, e: React.DragEvent) => {
e.preventDefault();
e.dataTransfer.dropEffect = "move";
setDragOverIdx(idx);
}, []);
const handleDragLeave = useCallback(() => {
setDragOverIdx(null);
}, []);
const handleDrop = useCallback((toIdx: number, e: React.DragEvent) => {
e.preventDefault();
const fromIdx = parseInt(e.dataTransfer.getData("text/plain"), 10);
if (!isNaN(fromIdx) && fromIdx !== toIdx) {
onReorderSegment(fromIdx, toIdx);
}
setDragFromIdx(null);
setDragOverIdx(null);
}, [onReorderSegment]);
const handleDragEnd = useCallback(() => {
setDragFromIdx(null);
setDragOverIdx(null);
}, []);
// Filter visible vs overflow segments
const visibleSegments = segments.filter((s) => s.start < audioDuration);
const overflowSegments = segments.filter((s) => s.start >= audioDuration);
const hasSegments = visibleSegments.length > 0;
return (
<div className="bg-white/5 rounded-2xl p-4 sm:p-6 border border-white/10 backdrop-blur-sm">
<div className="flex items-center justify-between mb-3">
<h2 className="text-base sm:text-lg font-semibold text-white flex items-center gap-2">
🎞
</h2>
{audioUrl && (
<div className="flex items-center gap-2 text-xs text-gray-400">
<button
onClick={handlePlayPause}
className="w-7 h-7 flex items-center justify-center rounded-full bg-white/10 hover:bg-white/20 text-white transition-colors"
title={isPlaying ? "暂停" : "播放"}
>
{isPlaying ? "⏸" : "▶"}
</button>
<span ref={timeRef} className="tabular-nums">00:00.0</span>
<span className="text-gray-600">/</span>
<span className="tabular-nums">{formatTime(audioDuration)}</span>
</div>
)}
</div>
{/* Waveform — always rendered so ref stays mounted */}
<div className="relative mb-1">
<div ref={waveCallbackRef} className="rounded-lg overflow-hidden bg-black/20 cursor-pointer" style={{ minHeight: 56 }} />
</div>
{/* Segment blocks or empty placeholder */}
{hasSegments ? (
<>
<div className="relative h-14 flex select-none">
{/* Playhead — syncs with audio playback */}
<div
ref={playheadRef}
className="absolute top-0 h-full w-0.5 bg-fuchsia-400 z-10 pointer-events-none"
style={{ display: "none", left: "0%" }}
/>
{visibleSegments.map((seg, i) => {
const left = (seg.start / audioDuration) * 100;
const width = ((seg.end - seg.start) / audioDuration) * 100;
const segDur = seg.end - seg.start;
const isDragTarget = dragOverIdx === i && dragFromIdx !== i;
// Compute loop portion for the last visible segment
const isLastVisible = i === visibleSegments.length - 1;
let loopPercent = 0;
if (isLastVisible && audioDuration > 0) {
const mat = materials.find((m) => m.id === seg.materialId);
const matDur = mat?.duration_sec ?? 0;
const effDur = (seg.sourceEnd > seg.sourceStart)
? (seg.sourceEnd - seg.sourceStart)
: matDur;
if (effDur > 0 && segDur > effDur + 0.1) {
loopPercent = ((segDur - effDur) / segDur) * 100;
}
}
return (
<div key={seg.id} className="absolute top-0 h-full" style={{ left: `${left}%`, width: `${width}%` }}>
<button
draggable
onDragStart={(e) => handleDragStart(i, e)}
onDragOver={(e) => handleDragOver(i, e)}
onDragLeave={handleDragLeave}
onDrop={(e) => handleDrop(i, e)}
onDragEnd={handleDragEnd}
onClick={() => onClickSegment(seg)}
className={`relative w-full h-full rounded-lg flex flex-col items-center justify-center overflow-hidden cursor-grab active:cursor-grabbing transition-all border ${
isDragTarget
? "ring-2 ring-purple-400 border-purple-400 scale-[1.02]"
: dragFromIdx === i
? "opacity-50 border-white/10"
: "hover:opacity-90 border-white/10"
}`}
style={{ backgroundColor: seg.color + "33", borderColor: isDragTarget ? undefined : seg.color + "66" }}
title={`拖拽可调换顺序 · 点击设置截取范围\n${seg.materialName}\n${segDur.toFixed(1)}s${loopPercent > 0 ? ` (含循环 ${(segDur * loopPercent / 100).toFixed(1)}s)` : ""}`}
>
<span className="text-[11px] text-white/90 truncate max-w-full px-1 leading-tight z-[1]">
{seg.materialName}
</span>
<span className="text-[10px] text-white/60 leading-tight z-[1]">
{segDur.toFixed(1)}s
</span>
{seg.sourceStart > 0 && (
<span className="text-[9px] text-amber-400/80 leading-tight z-[1]">
{seg.sourceStart.toFixed(1)}s
</span>
)}
{/* Loop fill stripe overlay */}
{loopPercent > 0 && (
<div
className="absolute top-0 right-0 h-full pointer-events-none flex items-center justify-center"
style={{
width: `${loopPercent}%`,
background: `repeating-linear-gradient(-45deg, transparent, transparent 3px, rgba(255,255,255,0.07) 3px, rgba(255,255,255,0.07) 6px)`,
borderLeft: "1px dashed rgba(255,255,255,0.25)",
}}
>
<span className="text-[9px] text-white/30"></span>
</div>
)}
</button>
</div>
);
})}
</div>
{/* Overflow segments — shown as gray chips */}
{overflowSegments.length > 0 && (
<div className="flex flex-wrap items-center gap-1.5 mt-1.5">
<span className="text-[10px] text-gray-500">使:</span>
{overflowSegments.map((seg) => (
<span
key={seg.id}
className="text-[10px] text-gray-500 bg-white/5 border border-white/10 rounded px-1.5 py-0.5"
>
{seg.materialName}
</span>
))}
</div>
)}
<p className="text-[10px] text-gray-500 mt-1.5">
· ·
</p>
</>
) : (
<>
<div className="h-14 bg-white/5 rounded-lg" />
<p className="text-[10px] text-gray-500 mt-1.5">
</p>
</>
)}
</div>
);
}
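When the last visible segment outlasts its source clip, the component overlays a striped "loop fill" region whose width comes from the computation above: the effective clip length is the trimmed source range when one is set, otherwise the full material duration, and anything the segment runs past that (beyond a 0.1 s tolerance) counts as looped repetition. A standalone Python sketch of the same arithmetic (names are mine; the component does this inline):

```python
def loop_percent(seg_start: float, seg_end: float,
                 source_start: float, source_end: float,
                 material_duration: float) -> float:
    """Percent of the segment's width that is looped repetition of its clip."""
    seg_dur = seg_end - seg_start
    # Trimmed source range wins when set; otherwise fall back to the material's length
    eff_dur = (source_end - source_start) if source_end > source_start else material_duration
    # Only flag a loop when the overshoot exceeds the 0.1 s tolerance
    if eff_dur > 0 and seg_dur > eff_dur + 0.1:
        return (seg_dur - eff_dur) / seg_dur * 100
    return 0.0

print(loop_percent(0.0, 10.0, 0.0, 4.0, 0.0))  # 60.0 -> 60% of the block is loop fill
```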


@@ -4,4 +4,5 @@ export interface Material {
 path: string;
 size_mb: number;
 scene?: string;
+duration_sec?: number;
 }


@@ -120,6 +120,7 @@ async def generate(
 if not _model_loaded:
     raise HTTPException(status_code=503, detail="Model not loaded")
+import torch
 import soundfile as sf
 # Save the uploaded reference audio to a temporary file
@@ -132,7 +133,7 @@ async def generate(
 output_path = tempfile.mktemp(suffix=".wav")
 try:
-    print(f"🎤 Generating: {text[:30]}...")
+    print(f"🎤 Generating: {text[:50]}... ({len(text)} chars)")
     print(f"📝 Ref text: {ref_text[:50]}...")
     print(f"🌐 Language: {language}")
@@ -148,6 +149,9 @@ async def generate(
     ref_text=ref_text,
 )
+# Release the CUDA cache to keep VRAM fragmentation from building up
+torch.cuda.empty_cache()
 sf.write(output_path, wavs[0], sr)
 duration = len(wavs[0]) / sr
@@ -158,11 +162,17 @@ async def generate(
     output_path,
     media_type="audio/wav",
     filename="output.wav",
-    background=None  # let the client finish downloading before deletion
+    background=None
 )
 except Exception as e:
     print(f"❌ Generation failed: {e}")
+    # Release the CUDA cache
+    try:
+        import torch
+        torch.cuda.empty_cache()
+    except:
+        pass
     raise HTTPException(status_code=500, detail=str(e))
 finally:
     # Clean up the temporary reference-audio file
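The hunks above call `torch.cuda.empty_cache()` after generation and again on the error path, so cached allocator blocks are returned to the driver before the next request. A hedged, self-contained sketch of that pattern with the model call stubbed out; unlike the server's bare `except:`, it guards the lazy torch import explicitly so the snippet also runs on CPU-only machines:

```python
def release_cuda_cache() -> None:
    """Best-effort VRAM release; a no-op when torch or CUDA is unavailable."""
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass

def generate_wav(synthesize):
    """Run a synthesis callable, releasing cached CUDA blocks on every path."""
    try:
        wavs = synthesize()
        release_cuda_cache()  # success path: hand cached blocks back before returning
        return wavs
    except Exception:
        release_cuda_cache()  # failure path: don't let the error leak fragmented VRAM
        raise

print(generate_wav(lambda: [0.0] * 4))  # [0.0, 0.0, 0.0, 0.0]
```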


@@ -5,5 +5,7 @@
 cd /home/rongye/ProgramFiles/ViGent2/models/Qwen3-TTS
-# Use the Python from the qwen-tts conda environment
-/home/rongye/ProgramFiles/miniconda3/envs/qwen-tts/bin/python qwen_tts_server.py
+# Make sure the conda env's bin directory is on PATH so tools like sox can be found
+export PATH="/home/rongye/ProgramFiles/miniconda3/envs/qwen-tts/bin:$PATH"
+python qwen_tts_server.py
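The script change works by prepending the conda env's `bin` directory to `PATH`, so child processes (e.g. `sox` invoked by the TTS stack) resolve to the env's binaries before system ones. The same effect can be sketched from Python when spawning the server yourself; the directory below is illustrative, not from the repo:

```python
import os
from typing import Optional

def prepend_to_path(directory: str, env: Optional[dict] = None) -> dict:
    """Return an environment mapping with `directory` first on PATH, so it wins lookup."""
    env = dict(env if env is not None else os.environ)
    env["PATH"] = directory + os.pathsep + env.get("PATH", "")
    return env

env = prepend_to_path("/opt/envs/qwen-tts/bin", {"PATH": "/usr/bin"})
print(env["PATH"])  # on POSIX: /opt/envs/qwen-tts/bin:/usr/bin
```

Passing the resulting mapping as `subprocess.Popen(..., env=env)` reproduces what the `export` line does for the shell session.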