# MuseTalk Deployment Guide

> **Last updated**: 2026-01-16
>
> **Applies to**: MuseTalk v1.5

---

## Hardware Requirements

| Component | Minimum | Recommended |
|-----------|---------|-------------|
| GPU | 8GB VRAM (RTX 3060) | 24GB VRAM (RTX 3090) |
| RAM | 32GB | 64GB |
| CUDA | 11.7+ | 11.8 |

---

## 📦 Installation

### 1. Create the Conda environment

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk
conda create -n musetalk python=3.10 -y
conda activate musetalk
```

### 2. Install PyTorch (pinned to 2.0.1)

> ⚠️ **Important**: Use PyTorch 2.0.1 with CUDA 11.8. This is the combination supported by the prebuilt mmcv packages.

```bash
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
```

### 3. Install the MMLab dependencies

Run the commands strictly in this order:

```bash
pip install -r requirements.txt

# MMLab packages
pip install --no-cache-dir -U openmim
mim install mmengine
mim install "mmcv==2.0.1"
mim install "mmdet==3.1.0"
pip install chumpy --no-build-isolation
pip install "mmpose==1.1.0" --no-deps
```

---

## ⬇️ Downloading Model Weights

### Weight path overview

```
ViGent/models/MuseTalk/models/
├── musetalk/                     ← MuseTalk v1 base model
│   ├── config.json -> musetalk.json       ⚠️ symlink
│   ├── musetalk.json
│   ├── musetalkV15 -> ../musetalkV15      ⚠️ symlink (used by lipsync_service for weight detection)
│   └── pytorch_model.bin         (~3.2GB)
│
├── musetalkV15/                  ← MuseTalk v1.5 UNet model
│   ├── musetalk.json
│   └── unet.pth                  (~3.2GB)
│
├── sd-vae -> sd-vae-ft-mse       ⚠️ symlink
├── sd-vae-ft-mse/                ← Stable Diffusion VAE
│   ├── config.json
│   └── diffusion_pytorch_model.bin
│
├── whisper/                      ← OpenAI Whisper Tiny
│   ├── config.json
│   ├── pytorch_model.bin         (~151MB)
│   └── ...
│
├── dwpose/                       ← DWPose human pose detection
│   └── dw-ll_ucoco_384.pth       (~387MB)
│
├── syncnet/                      ← SyncNet lip-sync evaluation
│   └── latentsync_syncnet.pt
│
└── face-parse-bisent/            ← face parsing model
    ├── 79999_iter.pth            (~53MB, must be downloaded from Google Drive)
    └── resnet18-5c106cde.pth     (~45MB)
```

### Download steps

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk/models

# Install the download tool
pip install huggingface_hub

# Create the target directories first; otherwise the mv commands below fail silently
mkdir -p musetalk musetalkV15

# 1. MuseTalk v1 model
huggingface-cli download TMElyralab/MuseTalk \
  --include "musetalk/musetalk.json" "musetalk/pytorch_model.bin" \
  --local-dir ./musetalk_hf
mv musetalk_hf/musetalk/* musetalk/ 2>/dev/null || true
rm -rf musetalk_hf

# 2. MuseTalk v1.5 UNet
huggingface-cli download TMElyralab/MuseTalk \
  --include "musetalkV15/unet.pth" "musetalkV15/musetalk.json" \
  --local-dir ./mt15_tmp
mv mt15_tmp/musetalkV15/* musetalkV15/ 2>/dev/null || true
rm -rf mt15_tmp

# 3. SD-VAE
huggingface-cli download stabilityai/sd-vae-ft-mse --local-dir ./sd-vae-ft-mse

# 4. Whisper Tiny
huggingface-cli download openai/whisper-tiny --local-dir ./whisper

# 5. DWPose
mkdir -p dwpose
huggingface-cli download yzd-v/DWPose dw-ll_ucoco_384.pth --local-dir ./dwpose

# 6. SyncNet
mkdir -p syncnet
huggingface-cli download ByteDance/LatentSync latentsync_syncnet.pt --local-dir ./syncnet

# 7. Face Parse BiSeNet
mkdir -p face-parse-bisent
cd face-parse-bisent
wget https://download.pytorch.org/models/resnet18-5c106cde.pth -O resnet18-5c106cde.pth
# 79999_iter.pth must be downloaded from Google Drive
pip install gdown
gdown 154JgKpzCPW82qINcVieuPH3fZ2e0P812 -O 79999_iter.pth
cd ..
```

### ⚠️ Create the required symlinks

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk/models

# SD-VAE path compatibility
ln -sf sd-vae-ft-mse sd-vae

# MuseTalk config file
cd musetalk
ln -sf musetalk.json config.json

# ⚠️ Critical: lipsync_service.py uses this path to detect the weights
ln -sf ../musetalkV15 musetalkV15
cd ..
```

---

## Verifying the Installation

### 1. Check the Python environment

```bash
conda activate musetalk
python -c "import torch; print('PyTorch:', torch.__version__); print('CUDA:', torch.cuda.is_available())"
python -c "import mmcv; print('mmcv:', mmcv.__version__)"
```

### 2. Check the model weights

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk/models

# Check the key files
ls -la musetalk/pytorch_model.bin
ls -la musetalkV15/unet.pth
ls -la whisper/pytorch_model.bin
ls -la dwpose/dw-ll_ucoco_384.pth

# Check the symlinks (each entry should be listed as a symlink)
ls -la sd-vae
ls -la musetalk/config.json
ls -la musetalk/musetalkV15
```

### 3. Run an inference test

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk
conda activate musetalk

# Test directly with command-line arguments.
# CUDA_VISIBLE_DEVICES=1 exposes only physical GPU 1, which the process
# then sees as device 0, hence --gpu_id 0.
CUDA_VISIBLE_DEVICES=1 python -m scripts.inference \
  --video_path /path/to/your/video.mp4 \
  --audio_path /path/to/your/audio.mp3 \
  --output_path /tmp/test_output.mp4 \
  --version v15 \
  --gpu_id 0 \
  --batch_size 8 \
  --use_float16
```

---

## 🐛 Troubleshooting

### mmcv import fails

```
ImportError: cannot import name 'Config' from 'mmcv'
```

**Fix**: reinstall mmcv.

```bash
pip uninstall mmcv mmcv-full -y
mim install "mmcv==2.0.1"
```

### CUDA version mismatch

```
RuntimeError: CUDA error: no kernel image is available
```

**Fix**: make sure the PyTorch build is compatible with the CUDA driver.

```bash
nvidia-smi  # shows the highest CUDA version the driver supports
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118
```

### Inference fails because audio and video lengths differ

```
Error occurred: whisper_feature.shape: torch.Size([1, 275, 5, 384])
```

**Fix**: make sure you are using the updated `musetalk/utils/audio_processor.py`, which includes the zero-padding logic.

---

## 📝 Integration with the ViGent Backend

`lipsync_service.py` invokes MuseTalk as a subprocess:

1. The backend selects the GPU through the `MUSETALK_GPU_ID=1` environment variable.
2. Weight detection path: `models/musetalk/musetalkV15` (requires the symlink).
3. Conda environment interpreter: `~/ProgramFiles/miniconda3/envs/musetalk/bin/python`

Configuration file: `backend/.env`

```ini
MUSETALK_GPU_ID=1
MUSETALK_LOCAL=true
MUSETALK_VERSION=v15
```
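
The subprocess call described above can be sketched roughly as follows. This is a minimal illustration, not the actual `lipsync_service.py`, whose code is not shown in this guide. The helper names `weights_present`, `build_invocation`, and `run_lipsync` are hypothetical, and the sketch assumes the backend maps `MUSETALK_GPU_ID` onto `CUDA_VISIBLE_DEVICES` in the same way as the inference test command. The paths and command-line flags are the ones used elsewhere in this guide.

```python
import os
import subprocess
from pathlib import Path

# Paths taken from this guide; adjust them for your machine.
MUSETALK_DIR = Path("/home/rongye/ProgramFiles/ViGent/models/MuseTalk")
CONDA_PYTHON = Path("~/ProgramFiles/miniconda3/envs/musetalk/bin/python").expanduser()


def weights_present(musetalk_dir):
    """Hypothetical check: True if models/musetalk/musetalkV15 resolves to a directory.

    Path.is_dir() follows symlinks, so the symlink created earlier satisfies it.
    """
    return (Path(musetalk_dir) / "models" / "musetalk" / "musetalkV15").is_dir()


def build_invocation(video, audio, output, environ=None):
    """Hypothetical helper: return (argv, env) for one MuseTalk inference run."""
    env = dict(os.environ if environ is None else environ)
    gpu_id = env.get("MUSETALK_GPU_ID", "0")
    version = env.get("MUSETALK_VERSION", "v15")
    # Assumption: expose only the selected physical GPU to the child process,
    # which then addresses it as device 0.
    env["CUDA_VISIBLE_DEVICES"] = gpu_id
    argv = [
        str(CONDA_PYTHON), "-m", "scripts.inference",
        "--video_path", str(video),
        "--audio_path", str(audio),
        "--output_path", str(output),
        "--version", version,
        "--gpu_id", "0",
        "--batch_size", "8",
        "--use_float16",
    ]
    return argv, env


def run_lipsync(video, audio, output):
    """Hypothetical entry point: verify the weights, then run inference."""
    if not weights_present(MUSETALK_DIR):
        raise FileNotFoundError("MuseTalk v1.5 weights not found; check the musetalkV15 symlink")
    argv, env = build_invocation(video, audio, output)
    # cwd must be the MuseTalk repository so that `-m scripts.inference` resolves.
    subprocess.run(argv, cwd=MUSETALK_DIR, env=env, check=True)
```

Building the argument list separately from running it keeps the GPU mapping easy to inspect: with `MUSETALK_GPU_ID=1`, the child process sees only physical GPU 1 and is told to use `--gpu_id 0`.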