YOLOv8 Code Reading

2024-05-22


YOLO Training Arguments

Argument	Default	Description
model	None	Specifies the model file for training. Accepts a path to either a .pt pretrained model or a .yaml configuration file. Essential for defining the model structure or initializing weights.
data	None	Path to the dataset configuration file (e.g., coco8.yaml). This file contains dataset-specific parameters, including paths to training and validation data, class names, and number of classes.
epochs	100	Total number of training epochs. Each epoch represents a full pass over the entire dataset. Adjusting this value can affect training duration and model performance.
time	None	Maximum training time in hours. If set, this overrides the epochs argument, allowing training to automatically stop after the specified duration. Useful for time-constrained training scenarios.
patience	100	Number of epochs to wait without improvement in validation metrics before early stopping the training. Helps prevent overfitting by stopping training when performance plateaus.
batch	16	Batch size, with three modes: set as an integer (e.g., batch=16), auto mode for 60% GPU memory utilization (batch=-1), or auto mode with specified utilization fraction (batch=0.70).
imgsz	640	Target image size for training. All images are resized to this dimension before being fed into the model. Affects model accuracy and computational complexity.
save	True	Enables saving of training checkpoints and final model weights. Useful for resuming training or model deployment.
save_period	-1	Frequency of saving model checkpoints, specified in epochs. A value of -1 disables this feature. Useful for saving interim models during long training sessions.
cache	False	Enables caching of dataset images in memory (True/ram), on disk (disk), or disables it (False). Improves training speed by reducing disk I/O at the cost of increased memory usage.
device	None	Specifies the computational device(s) for training: a single GPU (device=0), multiple GPUs (device=0,1), CPU (device=cpu), or MPS for Apple silicon (device=mps).
workers	8	Number of worker threads for data loading (per RANK if Multi-GPU training). Influences the speed of data preprocessing and feeding into the model, especially useful in multi-GPU setups.
project	None	Name of the project directory where training outputs are saved. Allows for organized storage of different experiments.
name	None	Name of the training run. Used for creating a subdirectory within the project folder, where training logs and outputs are stored.
exist_ok	False	If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs.
pretrained	True	Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance.
optimizer	'auto'	Choice of optimizer for training. Options include SGD, Adam, AdamW, NAdam, RAdam, RMSProp etc., or auto for automatic selection based on model configuration. Affects convergence speed and stability.
seed	0	Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations.
deterministic	True	Forces the use of deterministic algorithms, which ensures reproducibility but may reduce performance and speed because non-deterministic algorithms are excluded.
single_cls	False	Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification.
rect	False	Enables rectangular training, optimizing batch composition for minimal padding. Can improve efficiency and speed but may affect model accuracy.
cos_lr	False	Utilizes a cosine learning rate scheduler, adjusting the learning rate following a cosine curve over epochs. Helps in managing learning rate for better convergence.
close_mosaic	10	Disables mosaic data augmentation in the last N epochs to stabilize training before completion. Setting to 0 disables this feature.
resume	False	Resumes training from the last saved checkpoint. Automatically loads model weights, optimizer state, and epoch count, continuing training seamlessly.
amp	True	Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy.
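fraction	1.0	Specifies the fraction of the dataset to use for training. Allows training on a subset of the full dataset, useful for experiments or when resources are limited.
profile	False	Enables profiling of ONNX and TensorRT speeds during training, useful for optimizing model deployment.
freeze	None	Freezes the first N layers of the model, or the layers specified by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning.
lr0	0.01	Initial learning rate (e.g., SGD=1E-2, Adam=1E-3). Adjusting this value is crucial for the optimization process, as it controls how quickly model weights are updated.
lrf	0.01	Final learning rate as a fraction of the initial rate, i.e. final LR = lr0 * lrf. Used together with the scheduler to adjust the learning rate over time.
momentum	0.937	Momentum factor for SGD, or beta1 for Adam optimizers. Controls how much past gradients contribute to the current update.
weight_decay	0.0005	L2 regularization term that penalizes large weights to prevent overfitting.
warmup_epochs	3.0	Number of epochs for learning rate warmup, during which the learning rate rises gradually from a low value to the initial learning rate to stabilize early training.
warmup_momentum	0.8	Initial momentum for the warmup phase, gradually adjusted to the configured momentum.
warmup_bias_lr	0.1	Learning rate for bias parameters during the warmup phase, helping stabilize training in the first epochs.
box	7.5	Weight of the box loss component in the loss function, controlling how much emphasis is placed on accurately predicting bounding box coordinates.
cls	0.5	Weight of the classification loss in the total loss function, controlling the importance of correct class prediction relative to the other components.
dfl	1.5	Weight of the distribution focal loss, used in certain YOLO versions for fine-grained classification.
pose	12.0	Weight of the pose loss in models trained for pose estimation, controlling the emphasis on accurately predicting pose keypoints.
kobj	2.0	Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence against pose accuracy.
label_smoothing	0.0	Applies label smoothing, softening hard labels into a mix of the target label and a uniform distribution over labels. Can improve generalization.
nbs	64	Nominal batch size used for normalization of the loss.
overlap_mask	True	Determines whether object masks are merged into a single mask for training or kept separate for each object. When masks overlap, the smaller mask is overlaid on top of the larger one during merging.
mask_ratio	4	Downsample ratio for segmentation masks, which determines the resolution of the masks used during training.
dropout	0.0	Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training.
val	True	Enables validation during training, allowing periodic evaluation of model performance on a separate dataset.
plots	False	Generates and saves plots of training and validation metrics, as well as prediction examples, providing visual insight into model performance and learning progress.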
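
As a rough illustration of how the arguments in the table above are passed in practice, the sketch below collects a few of them into a dictionary and hands them to the Ultralytics Python API. `YOLO(...)` and `model.train(...)` are the library's entry points; the helper names `build_train_args` and `run_training`, the weight file `yolov8n.pt`, and the selection of arguments are illustrative assumptions rather than recommendations.

```python
def build_train_args(data="coco8.yaml", **overrides):
    """Collect a subset of the training arguments, starting from the table defaults.

    Hypothetical helper for illustration; it is not part of Ultralytics.
    """
    args = {
        "data": data,         # dataset configuration file
        "epochs": 100,        # full passes over the dataset
        "patience": 100,      # early-stopping window in epochs
        "batch": 16,          # integer, -1 (auto), or a fraction such as 0.70
        "imgsz": 640,         # target image size
        "optimizer": "auto",  # or SGD, Adam, AdamW, ...
        "lr0": 0.01,          # initial learning rate
        "lrf": 0.01,          # final LR = lr0 * lrf
        "cos_lr": False,      # cosine schedule instead of the default
        "close_mosaic": 10,   # disable mosaic for the last N epochs
        "amp": True,          # automatic mixed precision
    }
    args.update(overrides)    # caller-supplied values win over the defaults
    return args


def run_training(weights="yolov8n.pt", **overrides):
    """Start a training run; needs `pip install ultralytics` and a dataset."""
    from ultralytics import YOLO  # imported lazily so the helper above works without it

    model = YOLO(weights)  # a .pt file loads weights, a .yaml file builds from config
    return model.train(**build_train_args(**overrides))
```

For example, `run_training(epochs=50, batch=-1)` would request 50 epochs with automatic batch-size selection, leaving the other listed arguments at their defaults.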

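The interaction between lr0, lrf, cos_lr and the warmup arguments is easier to see numerically. The following is a minimal sketch, assuming the learning rate decays from lr0 to lr0 * lrf over the run (linearly by default, along a cosine curve when cos_lr is set), that warmup interpolates linearly between a start and an end value, and that nbs is used to derive a gradient-accumulation count. The function names are hypothetical, and the library's actual implementation may differ in detail.

```python
import math


def lr_factor(epoch, epochs=100, lrf=0.01, cos_lr=False):
    """Multiplier applied to lr0 at a given epoch: 1.0 at the start, lrf at the end."""
    if cos_lr:
        # Cosine curve from 1.0 down to lrf
        return lrf + (1.0 - lrf) * (1.0 + math.cos(math.pi * epoch / epochs)) / 2.0
    # Linear decay from 1.0 down to lrf
    return max(1.0 - epoch / epochs, 0.0) * (1.0 - lrf) + lrf


def warmup_value(step, warmup_steps, start, end):
    """Linearly interpolate from start to end during warmup, then hold at end."""
    if step >= warmup_steps:
        return end
    return start + (end - start) * step / warmup_steps


def accumulation_steps(batch, nbs=64):
    """Assumed use of nbs: accumulate gradients until roughly nbs images are seen."""
    return max(round(nbs / batch), 1)


if __name__ == "__main__":
    lr0 = 0.01
    for epoch in (0, 50, 100):
        print(epoch, lr0 * lr_factor(epoch), lr0 * lr_factor(epoch, cos_lr=True))
```

Under these assumptions, momentum during warmup would be `warmup_value(step, warmup_steps, 0.8, 0.937)`, moving from warmup_momentum to momentum, and a batch of 16 with nbs=64 would give 4 accumulation steps.
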
YOLO Data Augmentation

Official documentation: Data augmentation arguments

Argument	Type	Default	Range	Description
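hsv_h	float	0.015	0.0 - 1.0	Adjusts the hue of the image, introducing color variability. Helps the model generalize across different lighting conditions.
hsv_s	float	0.7	0.0 - 1.0	Alters the saturation of the image by a fraction, affecting color intensity. Useful for simulating different environmental conditions.
hsv_v	float	0.4	0.0 - 1.0	Modifies the value (brightness) of the image by a fraction, helping the model perform well under various lighting conditions.
degrees	float	0.0	-180 - +180	Rotates the image randomly within the specified degree range, improving the model's ability to recognize objects at different orientations.
translate	float	0.1	0.0 - 1.0	Translates the image horizontally and vertically by a fraction of the image size, which helps the model learn to detect partially visible objects.
scale	float	0.5	>=0.0	Scales the image by a gain factor, simulating objects at different distances.
shear	float	0.0	-180 - +180	Shears the image by the specified angle, mimicking the effect of viewing objects from different angles.
perspective	float	0.0	0.0 - 0.001	Applies a random perspective transformation to the image, improving the model's ability to understand objects in 3D space.
flipud	float	0.0	0.0 - 1.0	Flips the image upside down with the specified probability, increasing data variability without affecting the object's characteristics.
fliplr	float	0.5	0.0 - 1.0	Flips the image left to right with the specified probability, useful for learning symmetrical objects and increasing dataset diversity.
bgr	float	0.0	0.0 - 1.0	Switches the image channels from RGB to BGR with the specified probability, useful for increasing robustness to incorrect channel ordering.
mosaic	float	1.0	0.0 - 1.0	Combines four training images into one, simulating different scene compositions and object interactions. Highly effective for complex scene understanding.
mixup	float	0.0	0.0 - 1.0	Blends two images and their labels into a composite image. Improves generalization by introducing label noise and visual variability.
copy_paste	float	0.0	0.0 - 1.0	Copies objects from one image and pastes them onto another, useful for increasing object instances and learning object occlusion.
auto_augment	str	randaugment	-	Automatically applies a predefined augmentation policy (randaugment, autoaugment, augmix), optimizing for classification tasks by diversifying visual features.
erasing	float	0.4	0.0 - 0.9	Randomly erases a portion of the image during classification training, encouraging the model to rely on less obvious features for recognition.
crop_fraction	float	1.0	0.1 - 1.0	Crops the classification image to a fraction of its size, emphasizing central features, adapting to object scale, and reducing background distractions.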
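
The table mixes two kinds of values: magnitudes (hsv_h, degrees, translate, scale) and probabilities (fliplr, flipud). The sketch below shows one plausible reading, assuming each magnitude is the half-width of a uniform range around the identity transform and each probability drives an independent coin flip. The function names and the returned dictionary layout are hypothetical, and the library's own sampling may differ.

```python
import random


def sample_augment_params(hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0,
                          translate=0.1, scale=0.5, fliplr=0.5, flipud=0.0,
                          rng=None):
    """Draw one set of random augmentation parameters from the table's values."""
    rng = rng or random.Random()
    return {
        # Multiplicative gains for hue, saturation, value, centered on 1.0
        "hsv_gain": tuple(1.0 + rng.uniform(-1.0, 1.0) * g
                          for g in (hsv_h, hsv_s, hsv_v)),
        # Rotation angle in degrees, within +/- degrees
        "angle": rng.uniform(-degrees, degrees),
        # Horizontal and vertical shift as a fraction of the image size
        "shift": (rng.uniform(-translate, translate),
                  rng.uniform(-translate, translate)),
        # Zoom factor around 1.0
        "zoom": rng.uniform(1.0 - scale, 1.0 + scale),
        # Flip decisions made with the given probabilities
        "flip_lr": rng.random() < fliplr,
        "flip_ud": rng.random() < flipud,
    }


def flip_box_lr(box):
    """Mirror a normalized (x_center, y_center, width, height) box horizontally."""
    x, y, w, h = box
    return (1.0 - x, y, w, h)
```

With the defaults, degrees=0.0 and flipud=0.0 mean the image is never rotated or flipped vertically, while fliplr=0.5 mirrors roughly half of the samples; when a flip is applied, the labels must be mirrored too, as `flip_box_lr` does for a normalized box.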