# Day 25 Development Log

**Date**: 2025-12-31

**Topic**: Training and validation of the indoor blind-guidance segmentation model completed

---

## 🔧 Training Troubleshooting and Fixes

### Problem 1: YOLOE could not be trained on custom classes

**Error**:
```
RuntimeError: shape '[16, 78, -1]' is invalid for input of size 14745600
```

**Cause**: `yoloe-11l-seg.pt` (YOLOE, "Real-Time Seeing Anything") is a zero-shot, open-vocabulary model and does not support conventional fine-tuning in this training setup.

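The numbers in the error message can be unpacked with a little arithmetic. The sketch below is one plausible reading, not a confirmed diagnosis: it assumes a YOLO-style head that emits `4 * reg_max` box channels plus one channel per class with `reg_max = 16`, and it assumes the offending tensor comes from the 80 x 80 feature map at `imgsz=640`. Under those assumptions, the loss expected 14 classes while the tensor is shaped as if the head were still sized for 80.

```python
# Numbers copied from the error message above.
batch, expected_channels, numel = 16, 78, 14745600

# Assumption: box regression uses 4 * reg_max channels with reg_max = 16
# (a common default; the log does not confirm it).
box_channels = 4 * 16

# 78 expected channels would then correspond to 14 custom classes.
expected_classes = expected_channels - box_channels

# The reshape fails because the element count is not a multiple of 16 * 78.
divides_evenly = numel % (batch * expected_channels) == 0

# Assumption: the tensor is the stride-8 feature map, 80 x 80 at imgsz=640.
grid_cells = (640 // 8) ** 2
actual_channels = numel // (batch * grid_cells)
actual_classes = actual_channels - box_channels

print(expected_classes, divides_evenly, actual_channels, actual_classes)
# → 14 False 144 80
```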
**Solution**: switch to the standard segmentation model `yolo11l-seg.pt`.

```python
# train.py change
from ultralytics import YOLO

model = YOLO("/home/rongye/ProgramFiles/Yolo/yolo11l-seg.pt")
```

---

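### Problem 2: Dataset categories did not match the task

**Original dataset**: MIT Indoor Scene (2,573 categories)
- Too many categories, and too heterogeneous (from alarm clock to zucchini)
- Most are irrelevant to blind guidance

**Solution**: write a `filter_categories.py` script that keeps only the guidance-relevant categories.
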
**Filtered result** (14 classes):

| ID | Class | Purpose |
|----|-------|---------|
| 0 | floor | Walkable ground |
| 1 | corridor | Corridor / passage |
| 2 | sidewalk | Sidewalk |
| 3 | chair | Chair (obstacle) |
| 4 | table | Table (obstacle) |
| 5 | sofa_bed | Sofa / bed |
| 6 | door | Door |
| 7 | elevator | Elevator |
| 8 | stairs | Stairs |
| 9 | wall | Wall boundary |
| 10 | person | Pedestrian |
| 11 | cabinet | Cabinet |
| 12 | trash_can | Trash can |
| 13 | window | Window / glass door |

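The log does not include `filter_categories.py` itself, so the following is only a minimal sketch of what such a filter might do. It assumes labels are stored as YOLO segmentation lines (`<class_id> x1 y1 x2 y2 ...`); `SOURCE_TO_TARGET`, `remap_label_lines`, and the sample data are hypothetical names and values chosen for illustration. Polygons whose source class has no mapping are dropped, and the rest are renumbered to the IDs in the table above.

```python
# Target class names, in the ID order of the table above.
TARGET_CLASSES = [
    "floor", "corridor", "sidewalk", "chair", "table", "sofa_bed", "door",
    "elevator", "stairs", "wall", "person", "cabinet", "trash_can", "window",
]

# Hypothetical mapping from original dataset labels to target classes;
# the real script's mapping is not shown in this log.
SOURCE_TO_TARGET = {
    "floor": "floor",
    "chair": "chair",
    "sofa": "sofa_bed",
    "bed": "sofa_bed",
    "door": "door",
    "wall": "wall",
}


def remap_label_lines(lines, source_names):
    """Keep only polygons whose source class maps to a target class.

    lines: YOLO segmentation label lines, "<class_id> x1 y1 x2 y2 ...".
    source_names: list mapping original class IDs to original class names.
    """
    kept = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        target = SOURCE_TO_TARGET.get(source_names[int(parts[0])])
        if target is None:
            continue  # class irrelevant to blind guidance: drop the polygon
        parts[0] = str(TARGET_CLASSES.index(target))
        kept.append(" ".join(parts))
    return kept


# Tiny synthetic example: "bed" and "chair" survive, "zucchini" is dropped.
source_names = ["alarm clock", "bed", "chair", "zucchini"]
labels = [
    "1 0.1 0.1 0.5 0.1 0.5 0.5",
    "3 0.2 0.2 0.4 0.2 0.4 0.4",
    "2 0.6 0.6 0.9 0.6 0.9 0.9",
]
remapped = remap_label_lines(labels, source_names)
print(remapped)
# → ['5 0.1 0.1 0.5 0.1 0.5 0.5', '3 0.6 0.6 0.9 0.6 0.9 0.9']
```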
**Dataset size**:
- Training set: 1,265 images
- Validation set: 363 images
- Test set: 175 images

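As a quick sanity check on the counts above, the split proportions can be computed directly. The figures are taken from the list; the variable names are illustrative. The result works out to roughly a 70/20/10 split across 1,803 images.

```python
# Image counts per split, as reported above.
splits = {"train": 1265, "val": 363, "test": 175}

total = sum(splits.values())
# Share of each split, as a percentage rounded to one decimal place.
ratios = {name: round(100 * count / total, 1) for name, count in splits.items()}

print(total, ratios)
# → 1803 {'train': 70.2, 'val': 20.1, 'test': 9.7}
```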
---

## ✅ Training Complete

### Training Configuration
```python
from ultralytics import YOLO

model = YOLO("yolo11l-seg.pt")
model.train(
    data="data.yaml",
    epochs=150,
    imgsz=640,
    batch=16,
    device=1,  # RTX 3090
    cache='ram',
    optimizer='AdamW',
    amp=True,
)
```

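The `data.yaml` referenced in the training call is not reproduced in this log. As a rough sketch, a dataset file for the 14 filtered classes could be generated as shown here. The dataset root `/path/to/dataset` and the `train/images`, `val/images`, and `test/images` sub-directories are assumptions (only `test/images` appears elsewhere in this log), and `build_data_yaml` is a hypothetical helper.

```python
# Class names in ID order, matching the filtered 14-class table.
TARGET_CLASSES = [
    "floor", "corridor", "sidewalk", "chair", "table", "sofa_bed", "door",
    "elevator", "stairs", "wall", "person", "cabinet", "trash_can", "window",
]


def build_data_yaml(root, class_names):
    """Return dataset YAML text with split paths and an ID-to-name mapping."""
    lines = [
        f"path: {root}",       # dataset root (placeholder, not the real path)
        "train: train/images",  # assumed layout relative to the root
        "val: val/images",
        "test: test/images",
        "names:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


yaml_text = build_data_yaml("/path/to/dataset", TARGET_CLASSES)
print(yaml_text)
```

Writing `yaml_text` to `data.yaml` would give the trainer the class list in the same ID order as the label files, which is what keeps IDs and names aligned.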
### Training Results
- **Model parameters**: 27.6M
- **Training time**: about 1.5 hours
- **Inference speed**: 4.8 ms/image (during training) / 23 ms/image (during validation)

---

## ✅ Model Validation

### Test-Set Prediction

```bash
python -c "from ultralytics import YOLO; m=YOLO('best.pt'); m.predict('test/images', save=True)"
```

### Validation Results

| Metric | Value |
|--------|-------|
| Test images | 175 |
| Inference speed | 23 ms/image |
| No-detection rate | 2.3% (4 images) |
| Class coverage | 14/14, all classes detected |

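The no-detection rate and class coverage in the table can be derived from per-image lists of predicted class IDs. The sketch below shows the calculation on synthetic data shaped to match the reported counts (175 images, 4 of them empty). `summarize_detections` is a hypothetical helper, and the prediction lists are stand-ins rather than real model output.

```python
def summarize_detections(per_image_class_ids, num_classes):
    """Return (no-detection rate in percent, number of classes seen at least once)."""
    empty = sum(1 for ids in per_image_class_ids if not ids)
    seen = {c for ids in per_image_class_ids for c in ids}
    rate = 100 * empty / len(per_image_class_ids)
    return round(rate, 1), len(seen & set(range(num_classes)))


# Synthetic stand-in for real predictions: 171 images with one detection each,
# together covering all 14 classes, plus 4 images with no detections.
predictions = [[i % 14] for i in range(171)] + [[] for _ in range(4)]

rate, covered = summarize_detections(predictions, num_classes=14)
print(rate, covered)
# → 2.3 14
```

With real results, each inner list would hold the class IDs predicted for one test image, for example collected from the boxes of each result returned by `m.predict(...)`.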
### Typical Detection Samples
- `corridor`: 1 floor, 4 doors, 3 walls
- `dining room`: 6 chairs, 3 tables, 2 walls
- `conference`: 14 chairs, 2 tables
- `airport`: 1 floor, 2 chairs, 4 walls, 1 person

---

## 📁 Artifact Location

```
/home/rongye/ProgramFiles/Yolo/blind_guide_project/yoloe_seg_blind_v1/
├── weights/
│   ├── best.pt          ← best model
│   └── last.pt
├── results.csv
└── results.png
```

---

## 📋 Next Steps

- [ ] Export to TensorRT format (imgsz=480)
- [ ] Integrate into the blind-guidance server, replacing the current model
- [ ] Test in real-world scenarios