Kevin Wong cb10da52fc Update

2026-02-03 13:46:52 +08:00


LatentSync 1.6 Deployment Guide

This document describes how to deploy the LatentSync 1.6 model on an Ubuntu server.

System Requirements

Requirement	Specification
GPU	NVIDIA RTX 3090 24GB (or better)
VRAM	≥ 18GB
CUDA	12.1+
Python	3.10.x
OS	Ubuntu 20.04+
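
As a quick pre-flight check against the VRAM row above, the sketch below parses the output of `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` (one MiB value per GPU) and reports which GPU indices meet the threshold. The function names are illustrative, and reading "18GB" as 18 × 1024 MiB is an assumption.

```python
import subprocess

# Assumption: interpret the 18GB requirement as 18 GiB, expressed in MiB.
MIN_VRAM_MIB = 18 * 1024


def gpus_meeting_requirement(smi_output: str, min_mib: int = MIN_VRAM_MIB) -> list[int]:
    """Return the indices of GPUs whose total memory (MiB) is at least min_mib."""
    totals = [int(line.strip()) for line in smi_output.splitlines() if line.strip()]
    return [index for index, total in enumerate(totals) if total >= min_mib]


def query_total_memory() -> str:
    """Ask nvidia-smi for the total memory of each GPU, one MiB value per line."""
    return subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
```

On the server, `gpus_meeting_requirement(query_total_memory())` lists the GPU indices that are large enough; an empty list suggests the machine does not meet the requirement.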

Step 1: Create the Conda Environment

# Create a new conda environment
conda create -y -n latentsync python=3.10.13
conda activate latentsync

# Install FFmpeg
conda install -y -c conda-forge ffmpeg

Step 2: Install Python Dependencies

cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync

# Install PyTorch (CUDA 12.1)
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu121

# Install the remaining dependencies
pip install -r requirements.txt
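
Because `requirements.txt` is installed after the pinned PyTorch build, it can be worth confirming that the CUDA 12.1 wheel is still the one in the environment. The sketch below compares `torch.__version__` and `torch.version.cuda` with the versions pinned above; the helper names are illustrative and not part of the project.

```python
def matches_pinned_build(torch_version, cuda_version,
                         expected_torch="2.5.1", expected_cuda="12.1"):
    """Return True if the version strings match the build pinned in this guide.

    torch.__version__ usually carries a local suffix such as "+cu121",
    so only the part before "+" is compared with the pinned release.
    """
    release = torch_version.split("+", 1)[0]
    return release == expected_torch and cuda_version == expected_cuda


def check_installed_build():
    """Check the PyTorch build installed in the current environment."""
    import torch  # imported lazily so the helper above works without PyTorch

    # torch.version.cuda is None for CPU-only builds.
    return matches_pinned_build(torch.__version__, torch.version.cuda)
```

Running `check_installed_build()` inside the `latentsync` environment returns `False` if a later dependency replaced the pinned wheel, in which case rerunning the PyTorch install command is a reasonable first step.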

Step 3: Download Model Weights

cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync

# Download the weights with the HuggingFace CLI
huggingface-cli download ByteDance/LatentSync-1.6 whisper/tiny.pt --local-dir checkpoints
huggingface-cli download ByteDance/LatentSync-1.6 latentsync_unet.pt --local-dir checkpoints

After the download finishes, the directory structure should look like this:

checkpoints/
├── latentsync_unet.pt      # ~2GB
└── whisper/
    └── tiny.pt             # ~76MB
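
The two CLI downloads can also be scripted from Python. This is a minimal sketch that assumes the `huggingface_hub` package (the library that provides `huggingface-cli`) is installed in the environment; `download_weights` and `expected_paths` are illustrative names.

```python
from pathlib import Path

REPO_ID = "ByteDance/LatentSync-1.6"
WEIGHT_FILES = ("whisper/tiny.pt", "latentsync_unet.pt")


def expected_paths(local_dir="checkpoints"):
    """Return where each weight file should end up under local_dir."""
    return [(Path(local_dir) / name).as_posix() for name in WEIGHT_FILES]


def download_weights(local_dir="checkpoints"):
    """Download both weight files into local_dir and return their local paths."""
    from huggingface_hub import hf_hub_download  # lazy import: needs network access

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=local_dir)
        for name in WEIGHT_FILES
    ]
```

Calling `download_weights()` from the LatentSync directory should produce the same `checkpoints/` layout as the CLI commands.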

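Step 4: Copy the Core Code

Copy the following directories from the official LatentSync repository into the local deployment directory:

# Clone the repository (temporary)
cd /tmp
git clone https://github.com/bytedance/LatentSync.git

# Copy the core code
cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync
cp -r /tmp/LatentSync/latentsync ./
cp -r /tmp/LatentSync/scripts ./
cp -r /tmp/LatentSync/configs ./

# Clean up the temporary files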
rm -rf /tmp/LatentSync
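
For repeatable deployments, the copy step can be expressed as a small Python helper. This is only a sketch: `copy_core_dirs` is a hypothetical name, and it assumes the repository has already been cloned. It fails early if any of the three directories is missing from the clone instead of copying a partial set.

```python
import shutil
from pathlib import Path

CORE_DIRS = ("latentsync", "scripts", "configs")


def copy_core_dirs(clone_dir, target_dir, names=CORE_DIRS):
    """Copy the core directories from a LatentSync clone into target_dir."""
    clone_dir, target_dir = Path(clone_dir), Path(target_dir)
    missing = [name for name in names if not (clone_dir / name).is_dir()]
    if missing:
        raise FileNotFoundError(f"missing in clone: {missing}")
    copied = []
    for name in names:
        destination = target_dir / name
        # dirs_exist_ok=True merges into an existing directory (Python 3.8+).
        shutil.copytree(clone_dir / name, destination, dirs_exist_ok=True)
        copied.append(destination)
    return copied
```

For example, `copy_core_dirs("/tmp/LatentSync", "/home/rongye/ProgramFiles/ViGent2/models/LatentSync")` mirrors the three `cp -r` commands.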

Step 5: Verify the Installation

cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync
conda activate latentsync

# Check PyTorch and CUDA
python -c "import torch; print(f'PyTorch: {torch.__version__}'); print(f'CUDA: {torch.cuda.is_available()}')"

# Run an inference test (on GPU 1)
CUDA_VISIBLE_DEVICES=1 python -m scripts.inference \
    --unet_config_path "configs/unet/stage2_512.yaml" \
    --inference_ckpt_path "checkpoints/latentsync_unet.pt" \
    --inference_steps 20 \
    --guidance_scale 1.5 \
    --enable_deepcache \
    --video_path "test_video.mp4" \
    --audio_path "test_audio.wav" \
    --video_out_path "test_output.mp4"
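
When the same test has to be launched from another Python program, the command can be assembled as an argument list. The sketch below reuses only the flags shown in the command above; the function names and default arguments are illustrative.

```python
import os
import subprocess
import sys


def build_inference_command(video_path, audio_path, video_out_path,
                            inference_steps=20, guidance_scale=1.5):
    """Build the argument list for scripts.inference with the flags used in this guide."""
    return [
        sys.executable, "-m", "scripts.inference",
        "--unet_config_path", "configs/unet/stage2_512.yaml",
        "--inference_ckpt_path", "checkpoints/latentsync_unet.pt",
        "--inference_steps", str(inference_steps),
        "--guidance_scale", str(guidance_scale),
        "--enable_deepcache",
        "--video_path", video_path,
        "--audio_path", audio_path,
        "--video_out_path", video_out_path,
    ]


def run_inference(command, gpu_index=1, cwd="."):
    """Run the command with CUDA_VISIBLE_DEVICES restricted to a single GPU."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_index))
    return subprocess.run(command, env=env, cwd=cwd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces. `cwd` should point at the LatentSync directory so that `scripts.inference` and the relative config paths resolve.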

Directory Structure

After deployment, the directory structure should look like this:

/home/rongye/ProgramFiles/ViGent2/models/LatentSync/
├── checkpoints/
│   ├── latentsync_unet.pt
│   └── whisper/
│       └── tiny.pt
├── configs/
│   ├── scheduler_config.json
│   └── unet/
│       ├── stage1.yaml
│       ├── stage1_512.yaml
│       ├── stage2.yaml
│       ├── stage2_512.yaml
│       └── stage2_efficient.yaml
├── latentsync/
│   ├── data/
│   ├── models/
│   ├── pipelines/
│   ├── trepa/
│   ├── utils/
│   └── whisper/
├── scripts/
│   └── inference.py
├── requirements.txt
└── DEPLOY.md

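Step 7: Performance Optimization (Preloaded Model Service)

To eliminate the 30-40 seconds of model loading incurred on every video generation, running a persistent service is recommended.

1. Install the service dependencies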

conda activate latentsync
pip install fastapi uvicorn

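2. Start the service

Run in the foreground (for testing):

cd /home/rongye/ProgramFiles/ViGent2/models/LatentSync
# Start the service (port 8007); the GPU configuration is read automatically from backend/.env
python -m scripts.server

Run in the background (recommended):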

nohup python -m scripts.server > server.log 2>&1 &
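
Since the background process returns immediately, it helps to confirm that the service is actually listening before pointing the backend at it. The sketch below only tests whether TCP port 8007 accepts connections; it makes no assumptions about the service's HTTP routes, and an open port does not by itself guarantee that the model has finished loading. The helper names are illustrative.

```python
import socket
import time


def is_port_open(host="127.0.0.1", port=8007, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host="127.0.0.1", port=8007, deadline_s=120.0, interval_s=2.0):
    """Poll until the port opens or deadline_s seconds have passed."""
    end = time.monotonic() + deadline_s
    while True:
        if is_port_open(host, port):
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(interval_s)
```

If `wait_for_port()` returns `False`, `server.log` is the first place to look for the startup error.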

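3. Update the configuration

Edit ViGent2/backend/.env:

LATENTSYNC_USE_SERVER=True

The backend now calls the local persistent service through its API, so each generation skips the 30-40 second model load.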

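Troubleshooting

CUDA Out of Memory

LatentSync 1.6 needs about 18GB of VRAM. If you hit an OOM error:

Make sure the GPU is an RTX 3090 or better
Stop other processes that are using the GPU
Lower inference_steps (minimum 10)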
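
To find the other GPU processes mentioned above, the output of `nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader,nounits` can be parsed and sorted by memory use. This is a sketch with illustrative names; rows whose memory field is not numeric (for example `[N/A]`) are skipped.

```python
import subprocess
from typing import NamedTuple


class GpuProcess(NamedTuple):
    pid: int
    name: str
    used_mib: int


def parse_compute_apps(smi_output: str) -> list[GpuProcess]:
    """Parse 'pid, process_name, used_memory' rows, largest memory user first."""
    processes = []
    for line in smi_output.splitlines():
        if line.count(",") < 2:
            continue  # blank or malformed row
        pid, rest = line.split(",", 1)
        name, used = rest.rsplit(",", 1)
        if not pid.strip().isdigit() or not used.strip().isdigit():
            continue  # memory may be reported as "[N/A]"
        processes.append(GpuProcess(int(pid), name.strip(), int(used)))
    return sorted(processes, key=lambda proc: proc.used_mib, reverse=True)


def query_compute_apps() -> str:
    """Ask nvidia-smi for the compute processes currently using the GPUs."""
    return subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    ).stdout
```

`parse_compute_apps(query_compute_apps())` puts the heaviest process first, which is usually the one to stop before retrying.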

Model Fails to Load

Make sure the following files exist:

checkpoints/latentsync_unet.pt
checkpoints/whisper/tiny.pt
configs/unet/stage2_512.yaml
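
The same check can be automated. The sketch below reports which of the three files are missing relative to a given root directory; `missing_files` is an illustrative helper name.

```python
from pathlib import Path

REQUIRED_FILES = (
    "checkpoints/latentsync_unet.pt",
    "checkpoints/whisper/tiny.pt",
    "configs/unet/stage2_512.yaml",
)


def missing_files(root=".", required=REQUIRED_FILES):
    """Return the required files that do not exist under root."""
    root = Path(root)
    return [relative for relative in required if not (root / relative).is_file()]
```

Running `missing_files("/home/rongye/ProgramFiles/ViGent2/models/LatentSync")` should return an empty list on a complete deployment.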

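Video Output Quality Issues

Adjust the following parameters:

inference_steps: raising it to 30-50 can improve quality
guidance_scale: raising it can improve lip sync, but values that are too high may cause jitter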
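
One way to compare these settings is a small sweep that renders the same clip once per combination. The sketch below only enumerates the combinations and output names. The step values follow the 30-50 range above, while the guidance values beyond 1.5 are hypothetical starting points, not recommendations from the LatentSync project.

```python
from itertools import product


def sweep_settings(steps_values=(20, 30, 40, 50), scale_values=(1.5, 2.0, 2.5)):
    """Enumerate every (inference_steps, guidance_scale) pair with an output name."""
    return [
        {
            "inference_steps": steps,
            "guidance_scale": scale,
            "video_out_path": f"sweep_steps{steps}_scale{scale}.mp4",
        }
        for steps, scale in product(steps_values, scale_values)
    ]
```

Each entry maps directly onto the `--inference_steps`, `--guidance_scale`, and `--video_out_path` flags of the inference command, so the outputs can be reviewed side by side.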
