ViGent/Docs/DevLogs/Day4.md
2026-01-16 16:27:30 +08:00

# Day 4: Complete Fix for MuseTalk Lip Sync
---
## 🐛 Next.js Startup Argument Fix (14:41)
**Problem**: `npm run dev -- --host 0.0.0.0` fails with `unknown option '--host'`.

**Fix**: Next.js uses `-H` rather than `--host`; `DEPLOY_MANUAL.md` was updated accordingly.

**Status**: ✅ Fixed
---
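## 🔧 Complete Fix for MuseTalk Inference
### Problem Description
After video generation, the `_lipsync.mp4` file was exactly the same size as the source video (28 MB). This indicates that MuseTalk inference failed silently and the fallback logic ran instead, copying the source video unchanged.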
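
The symptom above (output identical in size to the input) can be checked automatically rather than by eye. The following is a minimal sketch, not part of the ViGent codebase: `sha256_of` and `looks_like_fallback_copy` are hypothetical helper names, and it assumes the fallback produces a byte-for-byte copy of the source video.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_like_fallback_copy(source: Path, output: Path) -> bool:
    """Return True when `output` is byte-identical to `source`.

    Hypothetical helper: sizes are compared first because that is cheap;
    the hashes are only computed when the sizes match.
    """
    if source.stat().st_size != output.stat().st_size:
        return False
    return sha256_of(source) == sha256_of(output)
```

A caller could run this on the source video and the `_lipsync.mp4` output after each job and treat a `True` result as a failed inference, even when the process exit code is 0.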
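### Root Cause Analysis
#### Issue 1: Weight Detection Path Mismatch
`lipsync_service.py` checks the path `models/musetalk/musetalkV15`, but on the server the `musetalkV15` directory sits directly under `models/` rather than nested inside `models/musetalk/`.

**Fix**: create a symbolic link on the server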
```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk/models/musetalk
ln -s ../musetalkV15 musetalkV15
```
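
To confirm that the link resolves the way the service expects, a small check along these lines may help. This is a sketch under assumptions: `weights_dir_status` is a hypothetical helper, and `models_root` stands for the directory that contains both `musetalk/` and `musetalkV15/`. It does not reproduce the actual check in `lipsync_service.py`.

```python
from pathlib import Path


def weights_dir_status(models_root: Path) -> str:
    """Describe how `musetalk/musetalkV15` resolves under `models_root`."""
    expected = models_root / "musetalk" / "musetalkV15"
    # is_dir() follows symlinks, so a dangling link is reported as missing.
    if not expected.is_dir():
        return f"missing: {expected}"
    kind = "symlink" if expected.is_symlink() else "directory"
    return f"ok ({kind}): {expected} -> {expected.resolve()}"
```

Calling it with the `models` directory used in the commands above should report `ok (symlink)` once the link exists, and `missing` if the link is absent or points at a directory that is not there.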
#### Issue 2: Audio/Video Length Mismatch Triggers an Exit
`musetalk/utils/audio_processor.py` contains a fatal flaw:
```python
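# Original code (excerpt): when the audio is shorter than the video,
# the assert fails and the handler calls exit()
assert audio_clip.shape[1] == audio_feature_length_per_frame
...
except Exception as e:
    print(f"Error occurred: {e}")  # e is empty (AssertionError carries no message)
    exit()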
```
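
The blank error message follows from how Python formats exceptions: the `str()` of an `AssertionError` raised by a bare `assert` is the empty string. Calling `exit()` without an argument also ends the process with status 0, which would be consistent with the silent failure seen here. The sketch below shows one way to keep such errors visible; `run_step` and `failing_step` are hypothetical names for illustration, not functions from MuseTalk.

```python
import sys
import traceback


def run_step(step) -> int:
    """Run `step` and return a process-style exit code (0 = success)."""
    try:
        step()
    except Exception as exc:
        # repr() keeps the exception type visible even without a message,
        # e.g. "AssertionError()" instead of an empty string.
        print(f"Error occurred: {exc!r}", file=sys.stderr)
        traceback.print_exc()
        return 1  # non-zero, so callers can tell the step failed
    return 0


def failing_step() -> None:
    """Stand-in for the failing check: a bare assert with no message."""
    assert len([]) == 1
```

A script would typically end with `sys.exit(run_step(...))` so that the failure propagates to the caller as a non-zero exit code instead of being mistaken for success.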
Log output:
```
Error occurred:
whisper_feature.shape: torch.Size([1, 275, 5, 384])
audio_index: 266-276 ← exceeds the 275 range
```
**Fix**: rewrote the logic to zero-pad instead of aborting inference
```python
# New code (excerpt): zero-pad when the audio features run out
if end_index > whisper_feature.shape[1]:
    available = whisper_feature[:, audio_index:]
    padding = torch.zeros(...)
    audio_clip = torch.cat([available, padding], dim=1)
```
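
The amount of padding needed can be worked out from the logged values. The sketch below uses plain integers and assumes the logged range `266-276` is a half-open slice `[266, 276)` over the 275 available feature frames; `split_window` is a hypothetical helper, not the code in `audio_processor.py`.

```python
def split_window(start: int, end: int, total: int) -> tuple[int, int]:
    """Return (available, padding) for the half-open window [start, end).

    `total` is the number of feature frames that exist, i.e. the size of
    the dimension being sliced (275 in the logged case).
    """
    if start < 0 or end < start:
        raise ValueError(f"invalid window [{start}, {end})")
    available = max(0, min(end, total) - start)
    padding = (end - start) - available
    return available, padding


# Values from the log: window 266-276 over 275 feature frames.
print(split_window(266, 276, 275))  # (9, 1)
# A window that fits entirely needs no padding.
print(split_window(100, 110, 275))  # (10, 0)
```

Under that assumption, the failing window has 9 real frames and needs 1 frame of zeros, so only the tail of the clip is affected by the padding.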
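### Modified Files
| File | Change |
|------|--------|
| `musetalk/utils/audio_processor.py` | Zero-pad when audio and video lengths do not match |
| `scripts/inference.py` | Improved error logging; disabled tqdm to keep it from cluttering the output |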
### Verification Results
| Metric | Before fix | After fix |
|--------|------------|-----------|
| `_lipsync.mp4` size | 28 MB (source video) | 3.8 MB |
| Frames inferred | 0 | 321 |
| Exit code | 0 (silent failure) | 0 (genuine success) |
```
Executing: ffmpeg -y -v warning -r 60.0 -f image2 -i .../IMG_7384_.../%08d.png ...
Combining Audio...
Results saved to /home/rongye/.../debug_fixed.mp4
```
---
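## 📝 Documentation Update (15:30)
Updated `models/MuseTalk/DEPLOY.md`:
- Detailed overview of the weight paths (as a directory tree)
- Explanation of the key symlink (`musetalk/musetalkV15`)
- Verified against and aligned with the actual server configuration
- Corrected the dwpose model size (62MB → 387MB)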
---
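## ✅ Day 4 Completed Items
- [x] Fixed the Next.js startup argument
- [x] Created the symlink for weight detection
- [x] Fixed the audio/video length mismatch in audio_processor.py
- [x] Improved error logging in inference.py
- [x] Verified that MuseTalk inference produces an MP4
- [x] Updated the MuseTalk deployment documentation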