# MuseTalk Deployment Guide
## Hardware Requirements

| Component | Minimum | Recommended |
|---|---|---|
| GPU | 8 GB VRAM (e.g. RTX 3060) | 24 GB VRAM (e.g. RTX 3090) |
| RAM | 32 GB | 64 GB |
| CUDA | 11.7+ | 12.0+ |
## 📦 Installation Steps

### 1. Clone the MuseTalk repository
```bash
# Enter the models directory of the ViGent project
cd /home/rongye/ProgramFiles/ViGent/models

# Clone the MuseTalk repository
git clone https://github.com/TMElyralab/MuseTalk.git MuseTalk_repo

# Preserve our custom files
cp MuseTalk/DEPLOY.md MuseTalk_repo/
cp MuseTalk/musetalk_api.py MuseTalk_repo/

# Swap in the new directory
rm -rf MuseTalk
mv MuseTalk_repo MuseTalk
```
### 2. Create a virtual environment

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk
conda create -n musetalk python=3.10 -y
conda activate musetalk
```
### 3. Install PyTorch (CUDA 12.1)

```bash
# CUDA 12.1 wheels (compatible with the server's CUDA 12.8 driver)
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
```
### 4. Install MuseTalk dependencies

```bash
pip install -r requirements.txt

# Install the mmlab packages (required by MuseTalk)
pip install --no-cache-dir -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
```
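Since `mim` can succeed for some packages and fail for others, a quick check that all four mmlab packages are actually visible to the environment can save debugging later. The helper below (`missing_packages`) is a hypothetical convenience script, not part of MuseTalk; it queries the installed-distribution metadata:

```python
# Hypothetical check, not shipped with MuseTalk: report which of the
# required mmlab packages are missing from the active environment.
from importlib.metadata import version, PackageNotFoundError

REQUIRED = ["mmengine", "mmcv", "mmdet", "mmpose"]

def missing_packages(names):
    """Return the subset of `names` with no installed distribution."""
    missing = []
    for name in names:
        try:
            version(name)
        except PackageNotFoundError:
            missing.append(name)
    return missing

if __name__ == "__main__":
    gaps = missing_packages(REQUIRED)
    if gaps:
        print("Missing packages:", ", ".join(gaps))
    else:
        print("All mmlab packages are installed")
```

Run it inside the `musetalk` conda environment; an empty result means all four packages resolved.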
### 5. Download model weights ⬇️

The weight files are large (about 5 GB), so make sure your network connection is stable.
#### Option 1: Download from Hugging Face (recommended)

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk

# Install huggingface-cli
pip install huggingface_hub

# Download the MuseTalk base weights
huggingface-cli download TMElyralab/MuseTalk \
    --local-dir ./models/musetalk \
    --include "*.pth" "*.json"

# Download the MuseTalk V15 weights
huggingface-cli download TMElyralab/MuseTalk \
    --local-dir ./models/musetalkV15 \
    --include "unet.pth"

# Download the SD-VAE model (Stable Diffusion VAE)
huggingface-cli download stabilityai/sd-vae-ft-mse \
    --local-dir ./models/sd-vae-ft-mse

# Download the Whisper model (audio feature extraction)
# MuseTalk uses whisper-tiny
huggingface-cli download openai/whisper-tiny \
    --local-dir ./models/whisper
```
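As a sanity check after the downloads, a short script can confirm that the key weight files landed where this guide's directory layout expects them. `missing_weights` and the `EXPECTED` list below are illustrative helpers, not part of the MuseTalk repository:

```python
# Hypothetical check script: verify that the key weight files from the
# downloads above exist under the repository root. File names follow the
# directory layout described in this guide.
from pathlib import Path

EXPECTED = [
    "models/musetalk/pytorch_model.bin",
    "models/musetalkV15/unet.pth",
    "models/sd-vae-ft-mse/diffusion_pytorch_model.bin",
]

def missing_weights(root):
    """Return the expected weight paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).is_file()]

if __name__ == "__main__":
    for p in missing_weights("."):
        print("missing:", p)
```

Run it from `models/MuseTalk/`; no output means the listed files are in place.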
#### Option 2: Manual download

Download from the links below and place the files in the corresponding directories:

| Model | Download link | Destination |
|---|---|---|
| MuseTalk | Hugging Face | models/MuseTalk/models/musetalk/ |
| MuseTalk V15 | same as above | models/MuseTalk/models/musetalkV15/ |
| SD-VAE | Hugging Face | models/MuseTalk/models/sd-vae-ft-mse/ |
| Whisper | Hugging Face | models/MuseTalk/models/whisper/ |
| DWPose | per the official README | models/MuseTalk/models/dwpose/ |
| Face Parse | per the official README | models/MuseTalk/models/face-parse-bisent/ |
### 6. Verify the installation

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk
conda activate musetalk

# Test inference (on GPU 1)
CUDA_VISIBLE_DEVICES=1 python -m scripts.inference \
    --version v15 \
    --inference_config configs/inference/test.yaml \
    --result_dir ./results \
    --use_float16
```
## 📂 Directory Structure

Directory layout after installation:

```
models/MuseTalk/
├── configs/
│   └── inference/
├── models/                    # ⬅️ weight files
│   ├── musetalk/              # MuseTalk base weights
│   │   ├── config.json
│   │   └── pytorch_model.bin
│   ├── musetalkV15/           # V1.5 UNet
│   │   └── unet.pth
│   ├── sd-vae-ft-mse/         # Stable Diffusion VAE
│   │   └── diffusion_pytorch_model.bin
│   ├── whisper/               # Whisper model
│   ├── dwpose/                # pose detection
│   └── face-parse-bisent/     # face parsing
├── musetalk/                  # MuseTalk source code
├── scripts/
│   └── inference.py
├── DEPLOY.md                  # this document
└── musetalk_api.py            # API server
```
## 🔧 ViGent Integration

### Environment variables

Set the following in `/home/rongye/ProgramFiles/ViGent/backend/.env`:

```bash
# MuseTalk configuration
MUSETALK_LOCAL=true
MUSETALK_GPU_ID=1
MUSETALK_VERSION=v15
MUSETALK_USE_FLOAT16=true
MUSETALK_BATCH_SIZE=8
```
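For reference, here is a minimal sketch of how a backend might parse these variables into typed settings. `load_musetalk_config` is a hypothetical helper, not the actual ViGent code, but the variable names and defaults mirror the `.env` entries above:

```python
# Hypothetical config loader, not the actual ViGent backend code.
# Reads the MUSETALK_* variables described above with safe defaults.
import os

def load_musetalk_config(env=None):
    """Parse MuseTalk settings from an environment mapping."""
    if env is None:
        env = os.environ
    return {
        "local": env.get("MUSETALK_LOCAL", "false").lower() == "true",
        "gpu_id": int(env.get("MUSETALK_GPU_ID", "0")),
        "version": env.get("MUSETALK_VERSION", "v15"),
        "use_float16": env.get("MUSETALK_USE_FLOAT16", "false").lower() == "true",
        "batch_size": int(env.get("MUSETALK_BATCH_SIZE", "8")),
    }
```

Parsing booleans via a lowercase string comparison keeps `true`/`True`/`TRUE` all valid in the `.env` file.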
### Start the backend service

```bash
cd /home/rongye/ProgramFiles/ViGent/backend
source venv/bin/activate

# Select the GPU and start the server
CUDA_VISIBLE_DEVICES=1 uvicorn app.main:app --host 0.0.0.0 --port 8000
```
## 🚨 FAQ

### Q1: CUDA out of memory

Fix: reduce `MUSETALK_BATCH_SIZE` or enable `MUSETALK_USE_FLOAT16=true`.
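Beyond a one-time settings change, a general way to make out-of-memory failures self-correcting is to retry with a progressively halved batch size. The helper below is a generic pattern, not MuseTalk-specific code; `infer` stands in for whatever inference callable the service wraps:

```python
# Generic OOM fallback pattern (not MuseTalk-specific): retry an
# inference call with a halved batch size whenever it raises an
# out-of-memory RuntimeError, down to a minimum batch size.
def run_with_fallback(infer, batch_size, min_batch=1):
    """Call `infer(batch_size)`, halving the batch on OOM errors."""
    while batch_size >= min_batch:
        try:
            return infer(batch_size)
        except RuntimeError as e:
            if "out of memory" not in str(e).lower() or batch_size == min_batch:
                raise  # not an OOM error, or nothing left to shrink
            batch_size //= 2
    raise RuntimeError("batch size fell below the allowed minimum")
```

Non-OOM `RuntimeError`s are re-raised immediately so real bugs are not silently retried.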
### Q2: mmcv installation fails

Fix: make sure the CUDA version matches, and pin the version with `mim install mmcv==2.0.1`.

### Q3: Whisper fails to load

Fix: check that the `models/whisper/` directory contains the complete model files.