# LatentSync 1.6 Deployment Guide

> This document describes how to deploy the LatentSync 1.6 model on an Ubuntu server.

## System Requirements

| Requirement | Spec |
|------|------|
| GPU | NVIDIA RTX 3090 24GB (or better) |
| VRAM | ≥ 18GB |
| CUDA | 12.1+ |
| Python | 3.10.x |
| OS | Ubuntu 20.04+ |

---
## Step 1: Create the Conda Environment

```bash
# Create a new conda environment
conda create -y -n latentsync python=3.10.13
conda activate latentsync

# Install FFmpeg
conda install -y -c conda-forge ffmpeg
```

---
## Step 2: Install Python Dependencies

```bash
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync

# Install PyTorch (CUDA 12.1)
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu121

# Install the remaining dependencies
pip install -r requirements.txt
```

---
## Step 3: Download Model Weights

```bash
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync

# Download the weights with the HuggingFace CLI
huggingface-cli download ByteDance/LatentSync-1.6 whisper/tiny.pt --local-dir checkpoints
huggingface-cli download ByteDance/LatentSync-1.6 latentsync_unet.pt --local-dir checkpoints
```

After the download completes, the directory structure should look like this:

```
checkpoints/
├── latentsync_unet.pt    # ~2GB
└── whisper/
    └── tiny.pt           # ~76MB
```
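
The downloaded files can also be sanity-checked programmatically before moving on. A minimal sketch — the `missing_checkpoints` helper is illustrative, not part of the repository; the relative paths mirror the tree above:

```python
# Sanity-check that the downloaded weights are where inference expects them.
# Paths mirror the checkpoints/ tree above; adjust `root` for your install.
from pathlib import Path

EXPECTED = ["latentsync_unet.pt", "whisper/tiny.pt"]

def missing_checkpoints(root: str = "checkpoints") -> list[str]:
    """Return the expected checkpoint files that are absent under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).is_file()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    print("OK" if not missing else f"Missing: {', '.join(missing)}")
```

Run it from the LatentSync directory; an empty result means both weight files are in place.
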

---
## Step 4: Copy the Core Code

Copy the following directories from the official LatentSync repository:

```bash
# Clone the repository (temporarily)
cd /tmp
git clone https://github.com/bytedance/LatentSync.git

# Copy the core code
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync
cp -r /tmp/LatentSync/latentsync ./
cp -r /tmp/LatentSync/scripts ./
cp -r /tmp/LatentSync/configs ./

# Clean up the temporary clone
rm -rf /tmp/LatentSync
```
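
To confirm the copy succeeded, the three directories can be checked in one step. A sketch — the `missing_dirs` helper is hypothetical, for illustration only:

```python
# Verify that the copied source directories exist before running inference.
from pathlib import Path

REQUIRED_DIRS = ["latentsync", "scripts", "configs"]

def missing_dirs(root: str = ".") -> list[str]:
    """Return the required source directories missing under `root`."""
    base = Path(root)
    return [name for name in REQUIRED_DIRS if not (base / name).is_dir()]

if __name__ == "__main__":
    print(missing_dirs() or "All source directories present.")
```
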

---
## Step 5: Verify the Installation

```bash
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync
conda activate latentsync

# Check PyTorch and CUDA
python -c "import torch; print(f'PyTorch: {torch.__version__}'); print(f'CUDA: {torch.cuda.is_available()}')"

# Run an inference test (on GPU 1)
CUDA_VISIBLE_DEVICES=1 python -m scripts.inference \
    --unet_config_path "configs/unet/stage2_512.yaml" \
    --inference_ckpt_path "checkpoints/latentsync_unet.pt" \
    --inference_steps 20 \
    --guidance_scale 1.5 \
    --enable_deepcache \
    --video_path "test_video.mp4" \
    --audio_path "test_audio.wav" \
    --video_out_path "test_output.mp4"
```
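
For scripted or repeated runs, the same invocation can be assembled programmatically. A sketch under the same flags as the shell command above; the helper names and defaults are illustrative:

```python
# Build the scripts.inference command line used above, so parameters
# (steps, guidance scale, input files) can be varied from a driver script.
import os
import subprocess

def build_inference_cmd(video: str, audio: str, out: str,
                        steps: int = 20, guidance: float = 1.5) -> list[str]:
    """Return the argv for one inference run with the config used above."""
    return [
        "python", "-m", "scripts.inference",
        "--unet_config_path", "configs/unet/stage2_512.yaml",
        "--inference_ckpt_path", "checkpoints/latentsync_unet.pt",
        "--inference_steps", str(steps),
        "--guidance_scale", str(guidance),
        "--enable_deepcache",
        "--video_path", video,
        "--audio_path", audio,
        "--video_out_path", out,
    ]

def run_on_gpu1(cmd: list[str]) -> int:
    """Run the command pinned to GPU 1, as in the shell example above."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="1")
    return subprocess.run(cmd, env=env).returncode
```
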

---
## Directory Structure

After deployment, the directory structure should look like this:

```
/home/rongye/ProgramFiles/ViGent2/models/LatentSync/
├── checkpoints/
│   ├── latentsync_unet.pt
│   └── whisper/
│       └── tiny.pt
├── configs/
│   ├── scheduler_config.json
│   └── unet/
│       ├── stage1.yaml
│       ├── stage1_512.yaml
│       ├── stage2.yaml
│       ├── stage2_512.yaml
│       └── stage2_efficient.yaml
├── latentsync/
│   ├── data/
│   ├── models/
│   ├── pipelines/
│   ├── trepa/
│   ├── utils/
│   └── whisper/
├── scripts/
│   └── inference.py
├── requirements.txt
└── DEPLOY.md
```

---
## Step 6: Performance Optimization (Preloaded Model Service)

To eliminate the 30-40 seconds of model-loading time on every video generation, run a resident service.

### 1. Install Service Dependencies

```bash
conda activate latentsync
pip install fastapi uvicorn
```

### 2. Start the Service

**Foreground (for testing)**:
```bash
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync
# Start the service (port 8007) - GPU settings are read from backend/.env automatically
python -m scripts.server
```

**Background (recommended)**:
```bash
nohup python -m scripts.server > server.log 2>&1 &
```

### 3. Update Configuration

Edit `ViGent2/backend/.env`:

```bash
LATENTSYNC_USE_SERVER=True
```

The backend now calls the local resident service over its API, which speeds up generation significantly.
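
For reference, a hypothetical client call to the resident service on port 8007. The `/inference` route and the payload field names below are assumptions for illustration only; check `scripts/server.py` for the actual API:

```python
# Hypothetical stdlib-only client for the resident service on port 8007.
# The /inference route and the payload field names are assumptions; consult
# scripts/server.py for the real endpoint.
import json
import urllib.request

def build_payload(video_path: str, audio_path: str, out_path: str) -> bytes:
    """JSON-encode one generation request (field names are assumed)."""
    return json.dumps({
        "video_path": video_path,
        "audio_path": audio_path,
        "video_out_path": out_path,
    }).encode("utf-8")

def submit(video_path: str, audio_path: str, out_path: str,
           host: str = "http://127.0.0.1:8007") -> dict:
    """POST one job to the service and return its JSON response."""
    req = urllib.request.Request(
        f"{host}/inference",  # assumed route
        data=build_payload(video_path, audio_path, out_path),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```
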

---
## Troubleshooting

### CUDA Out of Memory

LatentSync 1.6 needs ~18GB of VRAM. If you hit an OOM error:

1. Make sure you are on an RTX 3090 or a higher-spec GPU
2. Shut down other GPU processes
3. Lower `inference_steps` (minimum 10)
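
Before launching, the free memory reported by `nvidia-smi` can be checked to pick a GPU with enough headroom. A sketch; the helper names are illustrative:

```python
# Pick a GPU with enough free VRAM before launching inference.
# Relies on: nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits
import subprocess

REQUIRED_MB = 18 * 1024  # LatentSync 1.6 needs ~18GB of VRAM

def parse_free_mb(output: str) -> list[int]:
    """Parse nvidia-smi's one-number-per-line output into MiB values."""
    return [int(token) for token in output.split() if token]

def gpus_with_headroom(output: str, required_mb: int = REQUIRED_MB) -> list[int]:
    """Return indices of GPUs whose free VRAM meets the requirement."""
    return [i for i, free in enumerate(parse_free_mb(output)) if free >= required_mb]

if __name__ == "__main__":
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.free",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        print("GPUs with >= 18GB free:", gpus_with_headroom(out))
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("nvidia-smi not available on this machine")
```
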

### Model Fails to Load

Make sure the following files exist:

- `checkpoints/latentsync_unet.pt`
- `checkpoints/whisper/tiny.pt`
- `configs/unet/stage2_512.yaml`

### Video Output Quality Issues

Tune the following parameters:

- `inference_steps`: increasing to 30-50 improves quality
- `guidance_scale`: increasing it improves lip sync, but too high a value can cause jitter

---

## References

- [LatentSync GitHub](https://github.com/bytedance/LatentSync)
- [HuggingFace Model](https://huggingface.co/ByteDance/LatentSync-1.6)
- [Paper](https://arxiv.org/abs/2412.09262)