Extracting RGB Codes from Multi-Class Segmentation Masks with Python

Imagine you’re training a deep learning model for multi-class segmentation, and you have a bunch of segmentation masks where a unique RGB color represents each class (like sky, road, car, etc.). But here’s the catch — how do you know what RGB codes are being used? What if you need to map these back to class names?

That’s where this short Python script helps: it extracts every unique RGB value used across a set of segmentation masks. Whether you’re preparing a dataset, building a label map, or verifying class consistency, it saves you from inspecting masks by hand.

Why This Matters

In multi-class segmentation tasks, each pixel in a mask image belongs to a class and is typically represented by a unique RGB color. For example:

(128, 0, 0) might be “car”
(0, 128, 0) might be “tree”
(0, 0, 128) might be “sky”

If you don’t know the exact RGB values, it’s nearly impossible to train your model correctly or decode predictions. This script helps you:

Identify all RGB codes used in your segmentation masks
Ensure consistency across a dataset
Prepare legend or mapping files for visualization or training
Debug missing or unexpected classes in your annotations

Now, let’s dive into the code and understand how it works — line by line.

Code Breakdown and Explanation

Importing Required Libraries

import glob
import numpy as np
import cv2
from tqdm import tqdm

glob: For searching files using patterns (*.png) in the directory.
numpy: For fast numerical operations and array manipulation.
cv2: OpenCV, used here to read image files.
tqdm: Adds a progress bar to track file processing (handy for large datasets).

Function to Extract Unique RGB Colors from a Single Image

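def extract_unique_colors(image_path):
    # Read the image; OpenCV returns the channels in BGR order
    image = cv2.imread(image_path, cv2.IMREAD_COLOR)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Convert BGR to RGB so the tuples match the mask's RGB codes
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Reshape to (num_pixels, 3) and find the unique RGB colors
    unique_colors = np.unique(image.reshape(-1, 3), axis=0)
    return [tuple(int(c) for c in color) for color in unique_colors]

cv2.imread(…) loads the image in color mode. It returns None when the file cannot be read, so the function raises an error in that case.
The image is a 3D NumPy array of shape (height, width, 3). OpenCV stores the channels in BGR order, so cv2.cvtColor(…, cv2.COLOR_BGR2RGB) reorders them to RGB.
image.reshape(-1, 3): Flattens the image to a 2D array where each row is an RGB triplet.
np.unique(…, axis=0): Finds all unique RGB combinations (i.e., unique classes in the mask).
Each row is converted to a tuple of plain Python ints for easy handling (e.g., (128, 0, 0)).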
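
On large masks, np.unique(…, axis=0) can be slow because it has to compare whole rows. One common alternative is to pack each pixel into a single 24-bit integer before de-duplicating. The sketch below assumes the mask is already an (H, W, 3) uint8 array in RGB order; the helper name unique_colors_packed is hypothetical and not part of the original script.

```python
import numpy as np


def unique_colors_packed(image):
    """Return sorted unique (R, G, B) tuples from an (H, W, 3) uint8 array."""
    # Widen to uint32 so the shifts below cannot overflow the 8-bit values.
    pixels = image.reshape(-1, 3).astype(np.uint32)
    # Pack each pixel into one integer: R in bits 16-23, G in 8-15, B in 0-7.
    packed = (pixels[:, 0] << 16) | (pixels[:, 1] << 8) | pixels[:, 2]
    codes = np.unique(packed)
    # Unpack the integers back into tuples of plain Python ints.
    return [(int(c >> 16) & 255, int(c >> 8) & 255, int(c) & 255) for c in codes]


# Tiny synthetic 2x2 "mask" with two distinct colors.
demo = np.array(
    [[[128, 0, 0], [0, 128, 0]],
     [[128, 0, 0], [128, 0, 0]]],
    dtype=np.uint8,
)
print(unique_colors_packed(demo))  # [(0, 128, 0), (128, 0, 0)]
```

Sorting by the packed integer gives the same order as sorting the (R, G, B) tuples, so the result can be used as a drop-in replacement for the list returned above.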

Main Execution Block

if __name__ == "__main__":
    # Get all masks
    mask_paths = glob.glob("masks/*.png")

This part only runs if the script is executed directly.
It gathers all .png files in the “masks” folder — assuming that’s where your segmentation masks are stored.

A sample multi-class segmentation mask from the GTA5 dataset.

    # Iterate over masks and extract RGB colors
    all_colors = set()
    for path in tqdm(mask_paths, desc="Extracting colors"):
        colors = extract_unique_colors(path)
        all_colors.update(colors)

Initializes an empty Python set to store unique RGB values (sets automatically remove duplicates).
For each image:
- It calls the function extract_unique_colors.
- Adds the found RGB codes to the all_colors set.

The tqdm wrapper makes it easy to monitor progress as you process many files.
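
A set only records which colors occur, not how often. When debugging unexpected classes, it can help to count pixels per color as well, since colors with very few pixels are often annotation slips or resize artifacts. The following is a minimal sketch of that idea, using synthetic arrays in place of loaded masks; count_colors is a hypothetical helper, not part of the original script.

```python
from collections import Counter

import numpy as np


def count_colors(image):
    """Count pixels per (R, G, B) color in an (H, W, 3) uint8 array."""
    colors, counts = np.unique(
        image.reshape(-1, 3), axis=0, return_counts=True
    )
    return Counter(
        {tuple(int(v) for v in color): int(n) for color, n in zip(colors, counts)}
    )


# Two synthetic masks standing in for images loaded from disk.
mask_a = np.zeros((2, 2, 3), dtype=np.uint8)   # four black pixels
mask_b = np.zeros((2, 2, 3), dtype=np.uint8)
mask_b[0, 0] = (128, 0, 0)                     # one stray colored pixel

# Accumulate the per-image counts across the whole "dataset".
totals = Counter()
for mask in (mask_a, mask_b):
    totals.update(count_colors(mask))

# List the most frequent colors first; rare ones end up at the bottom.
for color, n in totals.most_common():
    print(color, n)
```

In the real loop, the same Counter could be updated once per mask path in place of the set.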

Print and Save the Results

    # Print the total number of unique colors
    print(f"Total unique RGB colors: {len(all_colors)}")

    # Save the RGB colors, one tuple per line
    with open("rgb_code.txt", "w") as f:
        for color in sorted(all_colors):
            f.write(f"{color}\n")

Displays how many unique colors/classes were found across all masks.
Opens a file named rgb_code.txt for writing.
Each unique RGB tuple is written on a new line, sorted for easier viewing.

Sample Output (rgb_code.txt)

(0, 0, 0)
(0, 74, 111)
(0, 220, 220)
(20, 20, 20)
(30, 170, 250)
(35, 142, 107)
(60, 20, 220)
(70, 0, 0)
(70, 70, 70)
(81, 0, 81)
(100, 100, 150)
(128, 64, 128)
(142, 0, 0)
(152, 251, 152)
(153, 153, 153)
(153, 153, 190)
(156, 102, 102)
(180, 130, 70)
(230, 0, 0)
(232, 35, 244)

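This file now acts as your class legend: a handy reference for label mapping, training configuration, or visual inspection. Note that the sample listing above was generated without a channel-order conversion, so its tuples follow OpenCV's B, G, R order; for example, (142, 0, 0) corresponds to the RGB color (0, 0, 142).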
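
To show how such a legend might feed into training, here is a minimal sketch that turns legend lines into a color-to-index map and encodes an RGB mask as class indices. It assumes each line is a plain tuple literal such as (128, 0, 0) and that the class index is simply the line position. The function name encode_mask and the ignore_index value of 255 are illustrative choices, not part of the original script.

```python
import ast

import numpy as np

# Lines as they might appear in rgb_code.txt (inlined to keep the sketch self-contained).
legend_lines = ["(0, 0, 0)", "(0, 128, 0)", "(128, 0, 0)"]

# Parse each line back into a tuple; the class index is the line position.
colors = [ast.literal_eval(line) for line in legend_lines]
color_to_index = {color: idx for idx, color in enumerate(colors)}


def encode_mask(mask_rgb, color_to_index, ignore_index=255):
    """Map an (H, W, 3) RGB mask to an (H, W) array of class indices."""
    encoded = np.full(mask_rgb.shape[:2], ignore_index, dtype=np.uint8)
    for color, idx in color_to_index.items():
        # Pixels whose three channels all match this color get its index.
        matches = np.all(mask_rgb == np.array(color, dtype=np.uint8), axis=-1)
        encoded[matches] = idx
    return encoded


demo = np.array(
    [[[0, 0, 0], [128, 0, 0]],
     [[0, 128, 0], [9, 9, 9]]],   # last pixel uses a color missing from the legend
    dtype=np.uint8,
)
print(encode_mask(demo, color_to_index))
```

Pixels whose color is not in the legend keep the ignore_index value, which makes stray colors easy to spot. In practice you would assign indices from a reviewed label map with class names rather than relying on line order alone.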

Conclusion

This script is a handy utility for anyone working with semantic segmentation datasets, in fields such as medical imaging, autonomous driving, and satellite imagery.

You can easily adapt or extend it to: