# LatentSync 1.6 Deployment Guide

> This document describes how to deploy the LatentSync 1.6 model on an Ubuntu server.

## System Requirements

| Requirement | Spec |
|------|------|
| GPU | NVIDIA RTX 3090 24GB (or better) |
| VRAM | ≥ 18GB |
| CUDA | 12.1+ |
| Python | 3.10.x |
| OS | Ubuntu 20.04+ |

---
## Step 1: Create the Conda Environment

```bash
# Create a new conda environment
conda create -y -n latentsync python=3.10.13
conda activate latentsync

# Install FFmpeg
conda install -y -c conda-forge ffmpeg
```

---
## Step 2: Install Python Dependencies

```bash
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync

# Install PyTorch (CUDA 12.1)
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu121

# Install the remaining dependencies
pip install -r requirements.txt
```

---
## Step 3: Download Model Weights

```bash
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync

# Download the weights with the HuggingFace CLI
huggingface-cli download ByteDance/LatentSync-1.6 whisper/tiny.pt --local-dir checkpoints
huggingface-cli download ByteDance/LatentSync-1.6 latentsync_unet.pt --local-dir checkpoints
```

After the download completes, the directory structure should look like this:

```
checkpoints/
├── latentsync_unet.pt    # ~2GB
└── whisper/
    └── tiny.pt           # ~76MB
```
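
The downloaded files can also be sanity-checked programmatically before moving on. A minimal sketch — the `missing_checkpoints` helper is illustrative, not part of the repository; the relative paths mirror the tree above:

```python
# Sanity-check that the downloaded weights are where inference expects them.
# Paths mirror the checkpoints/ tree above; adjust `root` for your install.
from pathlib import Path

EXPECTED = ["latentsync_unet.pt", "whisper/tiny.pt"]

def missing_checkpoints(root: str = "checkpoints") -> list[str]:
    """Return the expected checkpoint files that are absent under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).is_file()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    print("OK" if not missing else f"Missing: {', '.join(missing)}")
```

Run it from the LatentSync directory; an empty result means both weight files are in place.
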

---
## Step 4: Copy the Core Code

Copy the following directories from the official LatentSync repository:

```bash
# Clone the repository (temporarily)
cd /tmp
git clone https://github.com/bytedance/LatentSync.git

# Copy the core code
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync
cp -r /tmp/LatentSync/latentsync ./
cp -r /tmp/LatentSync/scripts ./
cp -r /tmp/LatentSync/configs ./

# Clean up the temporary clone
rm -rf /tmp/LatentSync
```
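
To confirm the copy succeeded, the three directories can be checked in one step. A sketch — the `missing_dirs` helper is hypothetical, for illustration only:

```python
# Verify that the copied source directories exist before running inference.
from pathlib import Path

REQUIRED_DIRS = ["latentsync", "scripts", "configs"]

def missing_dirs(root: str = ".") -> list[str]:
    """Return the required source directories missing under `root`."""
    base = Path(root)
    return [name for name in REQUIRED_DIRS if not (base / name).is_dir()]

if __name__ == "__main__":
    print(missing_dirs() or "All source directories present.")
```
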

---
## Step 5: Verify the Installation

```bash
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync
conda activate latentsync

# Check PyTorch and CUDA
python -c "import torch; print(f'PyTorch: {torch.__version__}'); print(f'CUDA: {torch.cuda.is_available()}')"

# Run an inference test (on GPU 1)
CUDA_VISIBLE_DEVICES=1 python -m scripts.inference \
    --unet_config_path "configs/unet/stage2_512.yaml" \
    --inference_ckpt_path "checkpoints/latentsync_unet.pt" \
    --inference_steps 20 \
    --guidance_scale 1.5 \
    --enable_deepcache \
    --video_path "test_video.mp4" \
    --audio_path "test_audio.wav" \
    --video_out_path "test_output.mp4"
```
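
For scripted or repeated runs, the same invocation can be assembled programmatically. A sketch under the same flags as the shell command above; the helper names and defaults are illustrative:

```python
# Build the scripts.inference command line used above, so parameters
# (steps, guidance scale, input files) can be varied from a driver script.
import os
import subprocess

def build_inference_cmd(video: str, audio: str, out: str,
                        steps: int = 20, guidance: float = 1.5) -> list[str]:
    """Return the argv for one inference run with the config used above."""
    return [
        "python", "-m", "scripts.inference",
        "--unet_config_path", "configs/unet/stage2_512.yaml",
        "--inference_ckpt_path", "checkpoints/latentsync_unet.pt",
        "--inference_steps", str(steps),
        "--guidance_scale", str(guidance),
        "--enable_deepcache",
        "--video_path", video,
        "--audio_path", audio,
        "--video_out_path", out,
    ]

def run_on_gpu1(cmd: list[str]) -> int:
    """Run the command pinned to GPU 1, as in the shell example above."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="1")
    return subprocess.run(cmd, env=env).returncode
```
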

---
## Directory Structure

After deployment, the directory structure should look like this:

```
/home/rongye/ProgramFiles/ViGent2/models/LatentSync/
├── checkpoints/
│   ├── latentsync_unet.pt
│   └── whisper/
│       └── tiny.pt
├── configs/
│   ├── scheduler_config.json
│   └── unet/
│       ├── stage1.yaml
│       ├── stage1_512.yaml
│       ├── stage2.yaml
│       ├── stage2_512.yaml
│       └── stage2_efficient.yaml
├── latentsync/
│   ├── data/
│   ├── models/
│   ├── pipelines/
│   ├── trepa/
│   ├── utils/
│   └── whisper/
├── scripts/
│   └── inference.py
├── requirements.txt
└── DEPLOY.md
```

---
## Step 6: Performance Optimization (Preloaded Model Service)

To eliminate the 30-40 seconds of model-loading time on every video generation, run a resident service.

### 1. Install Service Dependencies

```bash
conda activate latentsync
pip install fastapi uvicorn
```

### 2. Start the Service

**Foreground (for testing)**:
```bash
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync
# Start the service (port 8007) - GPU settings are read from backend/.env automatically
python -m scripts.server
```

**Background (recommended)**:
```bash
nohup python -m scripts.server > server.log 2>&1 &
```

### 3. Update Configuration

Edit `ViGent2/backend/.env`:

```bash
LATENTSYNC_USE_SERVER=True
```

The backend now calls the local resident service over its API, which speeds up generation significantly.
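
For reference, a hypothetical client call to the resident service on port 8007. The `/inference` route and the payload field names below are assumptions for illustration only; check `scripts/server.py` for the actual API:

```python
# Hypothetical stdlib-only client for the resident service on port 8007.
# The /inference route and the payload field names are assumptions; consult
# scripts/server.py for the real endpoint.
import json
import urllib.request

def build_payload(video_path: str, audio_path: str, out_path: str) -> bytes:
    """JSON-encode one generation request (field names are assumed)."""
    return json.dumps({
        "video_path": video_path,
        "audio_path": audio_path,
        "video_out_path": out_path,
    }).encode("utf-8")

def submit(video_path: str, audio_path: str, out_path: str,
           host: str = "http://127.0.0.1:8007") -> dict:
    """POST one job to the service and return its JSON response."""
    req = urllib.request.Request(
        f"{host}/inference",  # assumed route
        data=build_payload(video_path, audio_path, out_path),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```
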

---
## Troubleshooting

### CUDA Out of Memory

LatentSync 1.6 needs ~18GB of VRAM. If you hit an OOM error:

1. Make sure you are on an RTX 3090 or a higher-spec GPU
2. Shut down other GPU processes
3. Lower `inference_steps` (minimum 10)
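
Before launching, the free memory reported by `nvidia-smi` can be checked to pick a GPU with enough headroom. A sketch; the helper names are illustrative:

```python
# Pick a GPU with enough free VRAM before launching inference.
# Relies on: nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits
import subprocess

REQUIRED_MB = 18 * 1024  # LatentSync 1.6 needs ~18GB of VRAM

def parse_free_mb(output: str) -> list[int]:
    """Parse nvidia-smi's one-number-per-line output into MiB values."""
    return [int(token) for token in output.split() if token]

def gpus_with_headroom(output: str, required_mb: int = REQUIRED_MB) -> list[int]:
    """Return indices of GPUs whose free VRAM meets the requirement."""
    return [i for i, free in enumerate(parse_free_mb(output)) if free >= required_mb]

if __name__ == "__main__":
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.free",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        print("GPUs with >= 18GB free:", gpus_with_headroom(out))
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("nvidia-smi not available on this machine")
```
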

### Model Fails to Load

Make sure the following files exist:

- `checkpoints/latentsync_unet.pt`
- `checkpoints/whisper/tiny.pt`
- `configs/unet/stage2_512.yaml`

### Video Output Quality Issues

Tune the following parameters:

- `inference_steps`: increasing to 30-50 improves quality
- `guidance_scale`: increasing it improves lip sync, but too high a value can cause jitter

---

## References

- [LatentSync GitHub](https://github.com/bytedance/LatentSync)
- [HuggingFace Model](https://huggingface.co/ByteDance/LatentSync-1.6)
- [Paper](https://arxiv.org/abs/2412.09262)