Init: initial code
This commit is contained in:
186
models/MuseTalk/DEPLOY.md
Normal file
@@ -0,0 +1,186 @@
# MuseTalk Deployment Guide

## Hardware Requirements

| Item | Minimum | Recommended |
|------|---------|-------------|
| GPU | 8GB VRAM (e.g. RTX 3060) | 24GB VRAM (e.g. RTX 3090) |
| RAM | 32GB | 64GB |
| CUDA | 11.7+ | 12.0+ |
---

## 📦 Installation Steps

### 1. Clone the MuseTalk repository

```bash
# Enter the models directory of the ViGent project
cd /home/rongye/ProgramFiles/ViGent/models

# Clone the MuseTalk repository
git clone https://github.com/TMElyralab/MuseTalk.git MuseTalk_repo

# Preserve our custom files
cp MuseTalk/DEPLOY.md MuseTalk_repo/
cp MuseTalk/musetalk_api.py MuseTalk_repo/

# Swap in the cloned repository
rm -rf MuseTalk
mv MuseTalk_repo MuseTalk
```

### 2. Create a virtual environment

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk
conda create -n musetalk python=3.10 -y
conda activate musetalk
```

### 3. Install PyTorch (CUDA 12.1)

```bash
# cu121 wheels (compatible with the server's CUDA 12.8 driver)
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121

# Verify that PyTorch can see the GPU
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

### 4. Install MuseTalk dependencies

```bash
pip install -r requirements.txt

# Install the OpenMMLab stack (required by MuseTalk)
pip install --no-cache-dir -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
```

### 5. Download model weights ⬇️

> **The weight files are large (about 5GB); make sure your network connection is stable.**

#### Option 1: Download from Hugging Face (recommended)

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk

# Install the Hugging Face CLI
pip install huggingface_hub

# Download the MuseTalk weights (v1.5)
huggingface-cli download TMElyralab/MuseTalk \
    --local-dir ./models/musetalk \
    --include "*.pth" "*.json"

# Download the MuseTalk V15 weights
huggingface-cli download TMElyralab/MuseTalk \
    --local-dir ./models/musetalkV15 \
    --include "unet.pth"

# Download the SD-VAE model (Stable Diffusion VAE)
huggingface-cli download stabilityai/sd-vae-ft-mse \
    --local-dir ./models/sd-vae-ft-mse

# Download the Whisper model (audio feature extraction)
# MuseTalk uses whisper-tiny
huggingface-cli download openai/whisper-tiny \
    --local-dir ./models/whisper
```

#### Option 2: Manual download

Download from the links below and place the files in the corresponding directories:

| Model | Download link | Target path |
|-------|---------------|-------------|
| MuseTalk | [Hugging Face](https://huggingface.co/TMElyralab/MuseTalk) | `models/MuseTalk/models/musetalk/` |
| MuseTalk V15 | same as above | `models/MuseTalk/models/musetalkV15/` |
| SD-VAE | [Hugging Face](https://huggingface.co/stabilityai/sd-vae-ft-mse) | `models/MuseTalk/models/sd-vae-ft-mse/` |
| Whisper | [Hugging Face](https://huggingface.co/openai/whisper-tiny) | `models/MuseTalk/models/whisper/` |
| DWPose | per the official README | `models/MuseTalk/models/dwpose/` |
| Face Parse | per the official README | `models/MuseTalk/models/face-parse-bisent/` |

### 6. Verify the installation

```bash
cd /home/rongye/ProgramFiles/ViGent/models/MuseTalk
conda activate musetalk

# Test inference (on GPU 1)
CUDA_VISIBLE_DEVICES=1 python -m scripts.inference \
    --version v15 \
    --inference_config configs/inference/test.yaml \
    --result_dir ./results \
    --use_float16
```

---

## 📂 Directory Structure

After installation the directory should look like this:

```
models/MuseTalk/
├── configs/
│   └── inference/
├── models/                    # ⬅️ weight files live here
│   ├── musetalk/              # MuseTalk base weights
│   │   ├── config.json
│   │   └── pytorch_model.bin
│   ├── musetalkV15/           # v1.5 UNet
│   │   └── unet.pth
│   ├── sd-vae-ft-mse/         # Stable Diffusion VAE
│   │   └── diffusion_pytorch_model.bin
│   ├── whisper/               # Whisper model
│   ├── dwpose/                # pose detection
│   └── face-parse-bisent/     # face parsing
├── musetalk/                  # MuseTalk source code
├── scripts/
│   └── inference.py
├── DEPLOY.md                  # this document
└── musetalk_api.py            # API service
```
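
After downloading, you can sanity-check the layout with a few lines of Python. This is a hedged sketch: the expected file list is taken from the tree above and may differ between MuseTalk revisions.

```python
from pathlib import Path

# Relative paths taken from the directory tree in this guide; adjust if
# your MuseTalk revision ships different filenames.
EXPECTED_WEIGHTS = [
    "musetalk/config.json",
    "musetalk/pytorch_model.bin",
    "musetalkV15/unet.pth",
    "sd-vae-ft-mse/diffusion_pytorch_model.bin",
]

def missing_weights(models_dir: str) -> list:
    """Return the expected weight files that are absent under models_dir."""
    root = Path(models_dir)
    return [rel for rel in EXPECTED_WEIGHTS if not (root / rel).is_file()]

if __name__ == "__main__":
    missing = missing_weights("models")
    if missing:
        print("Missing weight files:", missing)
    else:
        print("All expected weight files found.")
```

Run it from `models/MuseTalk/` after step 5; an empty result means every listed file is in place.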

---

## 🔧 ViGent Integration

### Environment variables

Set the following in `/home/rongye/ProgramFiles/ViGent/backend/.env`:

```bash
# MuseTalk settings
MUSETALK_LOCAL=true
MUSETALK_GPU_ID=1
MUSETALK_VERSION=v15
MUSETALK_USE_FLOAT16=true
MUSETALK_BATCH_SIZE=8
```
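
For reference, here is one way the backend could parse these variables with the standard library. This is a sketch under assumptions: the variable names come from the `.env` file above, but the loader function and the defaults are hypothetical, not ViGent's actual code.

```python
import os

def _env_bool(name, default):
    """Interpret common truthy strings ('1', 'true', 'yes')."""
    return os.getenv(name, str(default)).strip().lower() in ("1", "true", "yes")

def load_musetalk_config():
    """Collect MuseTalk settings from the environment (defaults are assumptions)."""
    return {
        "local": _env_bool("MUSETALK_LOCAL", False),
        "gpu_id": int(os.getenv("MUSETALK_GPU_ID", "0")),
        "version": os.getenv("MUSETALK_VERSION", "v15"),
        "use_float16": _env_bool("MUSETALK_USE_FLOAT16", True),
        "batch_size": int(os.getenv("MUSETALK_BATCH_SIZE", "8")),
    }
```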

### Start the backend service

```bash
cd /home/rongye/ProgramFiles/ViGent/backend
source venv/bin/activate

# Pin the GPU and start
CUDA_VISIBLE_DEVICES=1 uvicorn app.main:app --host 0.0.0.0 --port 8000
```

---

## 🚨 FAQ

### Q1: CUDA out of memory
**Fix**: lower `MUSETALK_BATCH_SIZE`, or enable `MUSETALK_USE_FLOAT16=true`.

### Q2: mmcv fails to install
**Fix**: make sure your CUDA version matches, and pin the version with `mim install mmcv==2.0.1`.

### Q3: Whisper fails to load
**Fix**: check that `models/whisper/` contains the complete set of model files.
157
models/MuseTalk/musetalk_api.py
Normal file
@@ -0,0 +1,157 @@
"""
MuseTalk API service

Wraps MuseTalk as a FastAPI service that can be
deployed standalone on a GPU server.

Usage:
    python musetalk_api.py --port 8001
"""

import os
import sys
import argparse
import tempfile
import shutil
from pathlib import Path
from typing import Optional

from fastapi import FastAPI, UploadFile, File, Form, HTTPException
from fastapi.responses import FileResponse
from fastapi.middleware.cors import CORSMiddleware
import uvicorn

# Make the MuseTalk package importable
MUSETALK_DIR = Path(__file__).parent
sys.path.insert(0, str(MUSETALK_DIR))
app = FastAPI(
    title="MuseTalk API",
    description="Lip-sync inference service",
    version="0.1.0"
)

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Global model instance (lazy-loaded)
_model = None


def get_model():
    """Lazy-load the MuseTalk model."""
    global _model
    if _model is None:
        print("🔄 Loading MuseTalk model...")
        # TODO: adjust to the actual MuseTalk API
        # from musetalk.inference import MuseTalkInference
        # _model = MuseTalkInference()
        print("✅ MuseTalk model loaded")
    return _model
@app.get("/")
async def root():
    return {"name": "MuseTalk API", "status": "ok"}


@app.get("/health")
async def health():
    """Health check."""
    return {"status": "healthy", "gpu": True}

@app.post("/lipsync")
async def lipsync(
    video: UploadFile = File(..., description="input video file"),
    audio: UploadFile = File(..., description="audio file"),
    fps: int = Form(25, description="output frame rate")
):
    """
    Lip-sync inference.

    Args:
        video: input video (a static talking-head shot)
        audio: driving audio
        fps: output frame rate

    Returns:
        The generated video file.
    """
    # Create a temporary working directory
    with tempfile.TemporaryDirectory() as tmpdir:
        tmpdir = Path(tmpdir)

        # Save the uploaded files
        video_path = tmpdir / "input_video.mp4"
        audio_path = tmpdir / "input_audio.wav"
        output_path = tmpdir / "output.mp4"

        with open(video_path, "wb") as f:
            shutil.copyfileobj(video.file, f)
        with open(audio_path, "wb") as f:
            shutil.copyfileobj(audio.file, f)

        try:
            # Run lip-sync inference
            model = get_model()

            # TODO: call the actual MuseTalk inference API
            # result = model.inference(
            #     source_video=str(video_path),
            #     driving_audio=str(audio_path),
            #     output_path=str(output_path),
            #     fps=fps
            # )

            # Temporary: shell out to the MuseTalk CLI
            import subprocess
            cmd = [
                sys.executable, "-m", "scripts.inference",
                "--video_path", str(video_path),
                "--audio_path", str(audio_path),
                "--output_path", str(output_path),
            ]

            result = subprocess.run(
                cmd,
                cwd=str(MUSETALK_DIR),
                capture_output=True,
                text=True
            )
            if result.returncode != 0:
                raise RuntimeError(f"MuseTalk inference failed: {result.stderr}")

            if not output_path.exists():
                raise RuntimeError("Output file was not produced")

            # Return the generated video.
            # Copy it to a persistent location first, since the temp
            # directory is deleted when the context manager exits.
            final_output = Path("outputs") / f"lipsync_{video.filename}"
            final_output.parent.mkdir(exist_ok=True)
            shutil.copy(output_path, final_output)

            return FileResponse(
                final_output,
                media_type="video/mp4",
                filename=f"lipsync_{video.filename}"
            )

        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int, default=8001)
    parser.add_argument("--host", type=str, default="0.0.0.0")
    args = parser.parse_args()

    print(f"🚀 MuseTalk API listening at http://{args.host}:{args.port}")
    uvicorn.run(app, host=args.host, port=args.port)
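
For completeness, a hypothetical client for the `/lipsync` endpoint, using only the standard library. The multipart encoder is generic; the field names (`video`, `audio`, `fps`) and the default port match the service definition above, while the filenames and host are placeholders.

```python
import io
import mimetypes
import uuid

def build_multipart(fields, files):
    """Encode form fields and files as multipart/form-data.

    fields maps name -> value; files maps name -> (filename, bytes).
    Returns (body, content_type) ready to POST.
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(f"{value}\r\n".encode())
    for name, (filename, data) in files.items():
        ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'.encode()
        )
        buf.write(f"Content-Type: {ctype}\r\n\r\n".encode())
        buf.write(data)
        buf.write(b"\r\n")
    buf.write(f"--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"

# Usage against a running instance (filenames are placeholders):
# import urllib.request
# body, ctype = build_multipart(
#     {"fps": 25},
#     {"video": ("face.mp4", open("face.mp4", "rb").read()),
#      "audio": ("speech.wav", open("speech.wav", "rb").read())},
# )
# req = urllib.request.Request(
#     "http://localhost:8001/lipsync", data=body,
#     headers={"Content-Type": ctype}, method="POST",
# )
# with urllib.request.urlopen(req) as resp, open("result.mp4", "wb") as out:
#     out.write(resp.read())
```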