语义分割-游程编码（Run-Length Encoding, RLE）

2025-04-17

游程编码（Run-Length Encoding, RLE）详解

游程编码（RLE，Run-Length Encoding）是一种简单而有效的数据压缩方法，特别适用于具有大量连续重复值的数据。它在图像处理、分割掩码表示和时间序列数据压缩等领域有着广泛的应用。

它通过记录每个值的“运行长度”（即连续出现的次数）来减少存储空间。RLE 在图像处理、文件压缩和机器学习中都有广泛应用，例如在 COCO 数据集中用于表示分割掩码。

1. RLE 的基本概念

定义：

游程：一段连续的相同值。
编码：将每个游程用一对值表示，通常是 (值, 长度) 或 (长度, 值)或(起点, 长度)。
- 对于二值mask编码有简化形式
  - 交替存储连续的前景像素数和背景像素数。示例：11100111110 编码为 [3, 2, 5, 1]
  - 存储前景像素的起点和长度。示例：11100111110 编码为 [0, 3, 5, 5]

示例：

假设有一个二值数组 [0, 0, 0, 1, 1, 1, 1, 0, 0]：

使用 RLE 编码后可以表示为 [(0, 3), (1, 4), (0, 2)]，即：
- 0 出现了 3 次，
- 1 出现了 4 次，
- 0 又出现了 2 次。

2. RLE 的两种常见形式

2.1 紧凑格式

将所有游程的长度按顺序排列成一个一维数组。
示例：
- 输入数组：[0, 0, 0, 1, 1, 1, 1, 0, 0]
- 紧凑格式：[3, 4, 2]（分别表示 0 的长度、1 的长度、0 的长度）。

这种格式常用于 COCO 数据集中的分割掩码。

2.2 展开格式

直接记录每个像素的值（通常为布尔值），不进行压缩。
示例：
- 输入数组：[0, 0, 0, 1, 1, 1, 1, 0, 0]
- 展开格式：[False, False, False, True, True, True, True, False, False]

这种格式占用更多存储空间，但在某些情况下更易于处理。

3. RLE 的应用场景

3.1 图像压缩

RLE 是一种简单且高效的图像压缩方法，尤其适用于二值图像或具有大量连续颜色区域的图像。
示例：
- 黑白图像中的大面积背景可以用 RLE 表示，从而显著减少存储空间。

3.2 分割掩码表示

在目标检测和实例分割任务中，RLE 常用于表示分割掩码（mask）。相比于直接存储二值图像，RLE 能够大幅减少数据量。
示例：
- COCO 数据集中的 segmentation 字段使用 RLE 格式存储分割掩码。

3.3 时间序列数据

对于具有重复模式的时间序列数据，RLE 可以有效压缩存储空间。
示例：
- 某个传感器输出的信号可能包含大量连续的相同值，RLE 可以用来压缩这些数据。

4. RLE 的优缺点

优点：

简单高效 RLE 的实现非常简单，适合处理具有大量连续重复值的数据。
无损压缩 RLE 是一种无损压缩算法，解码后可以完全恢复原始数据。
节省存储空间 对于具有长游程的数据，RLE 能够显著减少存储需求。

缺点：

对随机数据无效 如果数据中没有连续重复值（如随机噪声），RLE 不仅无法压缩，反而会增加存储开销。
不适合复杂形状 对于复杂的二值图像（如细碎的边缘或噪声），RLE 的压缩效果较差。

5. RLE 与mask互相转换

这里RLE编码是1base的, 编码格式是二值mask的(起点, 长度)

import numpy as np


def rle_encode(mask: np.ndarray):
    """mask转游程编码

    Args:
        mask (np.ndarray): numpy array, 1 - mask, 0 - background

    Returns:
        str: run length as string formated (start, length)
    """
    pixels = mask.flatten(order = 'F')
    # 前后补零免去越界检测
    pixels = np.concatenate([[0], pixels, [0]])
    # 寻找像素变化的索引坐标
    # 0变1的索引和1变0索引交替排列
    runs = np.where(pixels[1:] != pixels[:-1])[0]
    runs += 1
    # 1变0的索引减去0变1的索引，即为1的长度
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

def rle_decode(mask_rle: str, shape=(512, 512)):
    """游程编码解码

    Args:
        mask_rle (str): run-length as string formated (start, length)
        shape (tuple, optional): (height, width) of array to return.

    Returns:
        numpy array: 1 - mask, 0 - background
    """
    run_value = 1
    if isinstance(mask_rle, str):
        mask_rle = [int(i) for i in mask_rle.split()]
    elif isinstance(mask_rle, (list, tuple)):
        mask_rle = [int(i) for i in mask_rle]
    else:
        mask_rle = []
    rle_pairs = list(zip(mask_rle[0::2], mask_rle[1::2]))  # 前景和背景像素对
    mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for run_start, run_length in rle_pairs:
        run_start -= 1
        mask[run_start:run_start + run_length] = run_value
    return mask.reshape(shape, order='F')


if __name__ == '__main__':
    data = [1, 10, 12, 5, 100, 8, 120, 9]
    # 解码
    mask = rle_decode(data)
    # 编码
    rle = rle_encode(mask)
    data_str = ' '.join(str(x) for x in data)
    # 判断结果是否相同
    print(data_str==rle)
    print(rle)

6. YOLO标签与mask互相转换

Mask转YOLO标签

import cv2
import numpy as np


def mask_to_yolo_label(mask: np.ndarray, class_id=0):
    """
    将二值 mask 转换为 YOLO 分割标签格式。
    
    参数：
        mask (np.ndarray): 二值 mask 图像，形状为 [H, W]，值为 0 或 255。
        class_id (int): 目标的类别 ID。
    
    返回：
        str: YOLO 格式的分割标签字符串。
    """
    image_width, image_height = mask.shape[:2]
    # 确保 mask 是二值图像
    _, binary_mask = cv2.threshold(mask, 0, 255, cv2.THRESH_BINARY)
    
    # 查找轮廓
    contours, hierarchy = cv2.findContours(binary_mask, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    contours = [cv2.approxPolyDP(contour, epsilon=1, closed=True) for contour in contours]
    if len(contours) == 0:
        raise ValueError("未找到任何轮廓，请检查输入的 mask 是否有效。")
    
    label = ""
    # 遍历每个mask（假设 mask 中只有一个目标）
    for contour in contours:
        # 计算边界框
        # x, y, w, h = cv2.boundingRect(contour)
        # x_center = (x + w / 2) / image_width
        # y_center = (y + h / 2) / image_height
        # width = w / image_width
        # height = h / image_height
    
        # 提取分割点并归一化
        segmentation_points = []
        for point in contour:
            px, py = point[0]  # 提取点的坐标
            norm_px = px / image_width
            norm_py = py / image_height
            segmentation_points.extend([norm_px, norm_py])
    
        # 拼接 YOLO 标签
        label += f"{class_id} " + \
            " ".join(f"{p:.6f}" for p in segmentation_points) + "\n"
    
    return label

# 示例用法
if __name__ == "__main__":
    # 转换为 YOLO 标签
    yolo_label = mask_to_yolo_label(mask)
    print("YOLO 标签：", yolo_label)

YOLO标签转Mask

import numpy as np
import cv2

def yolo_label_to_mask(yolo_label, image_width, image_height):
    """
    将 YOLO 分割标签转换为二值掩码。
    
    参数：
        yolo_label (str): YOLO 分割标签字符串。
        image_width (int): 原始图像的宽度。
        image_height (int): 原始图像的高度。
    
    返回：
        np.ndarray: 二值掩码，形状为 [H, W]。
    """
    # 解析 YOLO 标签
    labels = yolo_label.strip().split('\n')
    print(labels)
    # 创建空的掩码
    mask = np.zeros((image_height, image_width), dtype=np.uint8)
    for label in labels:
        parts = label.strip().split()
        segmentation_points = list(map(float, parts[1:]))
        # 反归一化分割点坐标
        points = []
        for i in range(0, len(segmentation_points), 2):
            norm_x = segmentation_points[i]
            norm_y = segmentation_points[i + 1]
            px = int(norm_x * image_width)
            py = int(norm_y * image_height)
            points.append([px, py])
        # 使用 fillPoly 填充多边形区域
        points = np.array([points], dtype=np.int32)  # 转换为 OpenCV 格式
        cv2.fillPoly(mask, points, 1)
    
    return mask

# 示例用法
if __name__ == "__main__":
    # 将 YOLO 标签转换为掩码
    mask = yolo_label_to_mask( yolo_label, *(512, 512))
    plt.figure(figsize=(7, 7), dpi=100)
    # plt.title("Mask")
    plt.imshow(mask, cmap="gray")  # 灰度显示掩码
    plt.axis("off")
    plt.show()