Training on the COCO DIY dataset

This section describes how to extract some of the labels and data from the COCO dataset for training using the scripting tool provided by Petoi.

Below is a brief description of the COCO dataset:

The COCO (Common Objects in Context) dataset is a widely used computer vision dataset for tasks such as object detection, segmentation, and keypoint detection. Developed by Microsoft Research, it aims to provide a rich set of images with high-quality annotations to advance computer vision technology.

Key features of the COCO dataset:
Data content:

Number of images: contains over 330,000 images.
Classes: Covering 80 object classes, e.g., people, cars, animals, furniture, etc.
Annotation: each image is annotated with its objects, supporting tasks such as object detection, instance segmentation, and keypoint detection (e.g., human pose estimation).
Data type:

Object detection: each object is labelled with a bounding box.
Instance segmentation: each object instance has a pixel-level segmentation mask.
Keypoint detection: the positions of the human body keypoints are labelled.
Scene description: each image also contains textual information describing the content of the image.
Data Distribution:

Training set: contains about 118,000 images.
Validation set: contains about 5,000 images.
Test set: contains about 41,000 images, of which about 20,000 form the test-dev split (for competition and evaluation).
Data set structure:

Image data: Stored in the image folder.
Annotation data: stored in JSON format files, including information such as object bounding boxes, segmentation masks, keypoints, and so on.
Usage Scenario:
Object detection: identify objects in the image and determine their locations.
Instance segmentation: segment each object in the image at pixel level.
Human body pose estimation: detect the positions of human body keypoints in the image.
Scene understanding: analyse the image content and generate a description.
The COCO dataset is widely used in the field of computer vision because it provides a large and diverse set of annotated data that helps in training and evaluating different vision models.
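
The JSON annotations store each bounding box as [x_min, y_min, width, height] in pixels, while the YOLO-style label files used later in this section store a class index followed by a normalised box centre and size. The sketch below illustrates that conversion. The function name coco_bbox_to_yolo and the sample annotation are hypothetical and only serve as an illustration. Mapping COCO category IDs to the 0-based class indices is not shown.

```python
def coco_bbox_to_yolo(bbox, img_width, img_height):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to
    YOLO format (x_center, y_center, width, height), normalised to 0..1."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_width
    y_center = (y_min + h / 2) / img_height
    return x_center, y_center, w / img_width, h / img_height


# Hypothetical entries shaped like items of the "annotations" and "images" lists
annotation = {"image_id": 1, "category_id": 2, "bbox": [100.0, 50.0, 200.0, 100.0]}
image = {"id": 1, "width": 400, "height": 200}

yolo_box = coco_bbox_to_yolo(annotation["bbox"], image["width"], image["height"])
print(" ".join(f"{v:.6f}" for v in yolo_box))  # 0.500000 0.500000 0.500000 0.500000
```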

The COCO dataset is approximately 20 GB in size and carries labels for 80 classes, so a model trained on it can detect up to 80 different labels. However, recognising more categories comes with a trade-off in detection accuracy. To let users balance the number of recognised categories against recognition accuracy in different scenarios, Petoi provides a way to rebuild a dataset with any subset of the labels from the COCO dataset.

COCO Dataset Download

Copy the following code into a file named coco_download.py. Place the script in the parent directory of the target location for the COCO dataset.

import os
import requests
from zipfile import ZipFile
from tqdm import tqdm
import argparse

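def download_url(url, dest_dir):
    # Create the destination directory if it does not exist yet
    os.makedirs(dest_dir, exist_ok=True)
    response = requests.get(url, stream=True)
    response.raise_for_status()  # Stop on HTTP errors instead of saving an error page
    file_size = int(response.headers.get('content-length', 0))
    filename = os.path.join(dest_dir, url.split('/')[-1])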
    
    with open(filename, 'wb') as f, tqdm(
        desc=filename,
        total=file_size,
        unit='B',
        unit_scale=True,
        unit_divisor=1024,
    ) as bar:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
                bar.update(len(chunk))

def extract_zip(file_path, extract_to):
    with ZipFile(file_path, 'r') as zip_ref:
        zip_ref.extractall(extract_to)

def main(dataset_path):
    # The labels archive unpacks into a 'coco' folder, so the images go next to the labels
    images_path = os.path.join(dataset_path, 'coco', 'images')
    labels_url = 'https://github.com/ultralytics/yolov5/releases/download/v1.0/coco2017labels.zip'
    data_urls = [
        'http://images.cocodataset.org/zips/train2017.zip',
        'http://images.cocodataset.org/zips/val2017.zip',
        'http://images.cocodataset.org/zips/test2017.zip'
    ]

    # Download and extract labels
    download_url(labels_url, dataset_path)
    extract_zip(os.path.join(dataset_path, 'coco2017labels.zip'), dataset_path)
    
    # Download and extract images
    if not os.path.exists(images_path):
        os.makedirs(images_path)

    for url in data_urls:
        zip_name = url.split('/')[-1]
        zip_path = os.path.join(images_path, zip_name)
        download_url(url, images_path)
        extract_zip(zip_path, images_path)
        os.remove(zip_path)  # Clean up zip file after extraction

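if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Download and extract COCO dataset.')
    # Optional target directory; defaults to the directory the script is run from
    parser.add_argument('--path', default='.', help='Parent directory for the dataset')
    args = parser.parse_args()

    main(args.path)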

Execute:

python .\coco_download.py

If the full COCO dataset cannot be downloaded because of an unstable network or for other reasons, download it manually and unzip the archives to the appropriate locations.
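
An interrupted download usually leaves a truncated archive behind, so it can help to check each zip file before extracting it. The sketch below uses only the standard zipfile module. The helper name is_zip_intact is hypothetical and not part of the Petoi tooling. On the large image archives the CRC check reads the whole file and can take a while.

```python
import zipfile


def is_zip_intact(zip_path):
    """Return True if zip_path is a readable zip archive whose members pass the CRC check."""
    # is_zipfile() is False for missing files and for files without a zip directory
    if not zipfile.is_zipfile(zip_path):
        return False
    try:
        with zipfile.ZipFile(zip_path) as archive:
            # testzip() returns the name of the first corrupt member, or None
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False
```

For example, is_zip_intact("val2017.zip") could be called before extract_zip, and the download repeated if it returns False.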

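The final dataset must have the following layout, which is the one coco_remake.py reads from:

coco
    images
        train2017
        val2017
    labels
        train2017
        val2017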
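
The directory names that the extraction script reads (images/train2017, images/val2017, labels/train2017, labels/val2017) can be checked up front with a small sketch like the one below. The helper name missing_dirs and the dataset root "coco" are assumptions for illustration; adjust the root to your own path.

```python
import os

# Sub-directories that coco_remake.py reads from src_coco_path
EXPECTED_DIRS = [
    os.path.join("images", "train2017"),
    os.path.join("images", "val2017"),
    os.path.join("labels", "train2017"),
    os.path.join("labels", "val2017"),
]


def missing_dirs(coco_root):
    """Return the expected sub-directories that do not exist under coco_root."""
    return [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(coco_root, d))]


# "coco" is an assumed dataset root; an empty list means the layout is complete
print(missing_dirs("coco"))
```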

The COCO DIY dataset is extracted using the following script, which can be named coco_remake.py.

Before running the script, please change src_coco_path to the path of COCO2017 and dst_coco_path to the path of DIY_COCO dataset. Change src_yaml_file to the path of the official COCO2017 YAML file and dst_yaml_file to the path of the YAML file for DIY_COCO.

import os
import shutil
import yaml

# Raw strings keep the backslashes in Windows paths from being read as escape sequences
src_coco_path = r"E:\Project\yolov5\datasets\coco"
dst_coco_path = r"E:\Project\yolov8_coco_simple\coco_04"
src_yaml_file = r"E:\Project\yolov5\data\coco.yaml"
dst_yaml_file = r"E:\Project\yolov8_coco_simple\coco_03_02_03.yaml"


def get_key_from_value(dictionary, value):
    # Return the first key whose integer value equals the given value
    for key, val in dictionary.items():
        if val == int(value):
            return key
    return None  # No matching value found


def get_value_from_key(dictionary, key_item):
    # Return the value stored under key_item, or None if the key is absent
    return dictionary.get(key_item)

def load_coco_labels(yaml_file):
    """
    Load COCO labels from a YAML file.
    :param yaml_file: Path to the YAML file
    :return: Dictionary mapping class names to their indices
    """
    with open(yaml_file, 'r', encoding='utf-8') as file:
        data = yaml.safe_load(file)
    return {v: k for k, v in data['names'].items()}

def get_class_indices(classes, labels_mapping):
    """
    Get indices for specific class names based on the labels mapping.
    :param classes: Set of class names to find indices for
    :param labels_mapping: Dictionary mapping class names to their indices
    :return: List of indices corresponding to the class names
    """
    return [labels_mapping[cls] for cls in classes if cls in labels_mapping]

def load_value_mapping(yaml_file):
    """
    Load the label mapping from a YAML file.
    :param yaml_file: Path to the YAML file
    :return: Set of label names
    """
    with open(yaml_file, 'r', encoding='utf-8') as file:
        data = yaml.safe_load(file)
    # Extract the label names
    labels = set(data['names'].values())
    return labels

def modify_list(lst):
    # Rewrite one split label line: swap the COCO class index for the DIY index
    # and emit the box values as "index x y w h" followed by a newline.
    if not lst:
        return lst  # Empty input: return the empty list unchanged

    first_element = lst[0]  # Class index in the source (COCO) numbering
    # Look up the class name for the COCO index, then its index in the DIY YAML
    dst_first_element_label = get_key_from_value(src_labels_values, first_element)
    dst_first_element = get_value_from_key(dst_labels_values, dst_first_element_label)
    remaining_elements = lst[1:]  # Remaining values (box coordinates)
    modified_list = []

    # Process the remaining values in groups of four (x_center, y_center, width, height)
    for i in range(0, (len(remaining_elements) // 4) * 4, 4):
        modified_list.append(dst_first_element)  # Class index in the DIY numbering
        modified_list.append(' ')
        modified_list.append(remaining_elements[i])
        modified_list.append(' ')
        modified_list.append(remaining_elements[i + 1])
        modified_list.append(' ')
        modified_list.append(remaining_elements[i + 2])
        modified_list.append(' ')
        modified_list.append(remaining_elements[i + 3])
        modified_list.append('\n')  # End of one label line

    return modified_list


if __name__ == '__main__':
    values = load_value_mapping(dst_yaml_file)
    #print(values)
    src_labels_values = load_coco_labels(src_yaml_file)
    #print(src_labels_values)
    dst_labels_values = load_coco_labels(dst_yaml_file)
    #print(dst_labels_values)

    gt_labels = get_class_indices(values,src_labels_values)

    #print(gt_labels)
    src_images_dir = os.path.join(src_coco_path, "images")
    src_labels_dir = os.path.join(src_coco_path, "labels")
    src_images_train_dir = os.path.join(src_images_dir, "train2017")
    src_images_val_dir = os.path.join(src_images_dir, "val2017")
    src_labels_train_dir = os.path.join(src_labels_dir, "train2017")
    src_labels_val_dir = os.path.join(src_labels_dir, "val2017")

    dst_images_dir = os.path.join(dst_coco_path, "images")
    dst_labels_dir = os.path.join(dst_coco_path, "labels")
    dst_images_train_dir = os.path.join(dst_images_dir, "train")
    dst_images_val_dir = os.path.join(dst_images_dir, "val")
    dst_labels_train_dir = os.path.join(dst_labels_dir, "train")
    dst_labels_val_dir = os.path.join(dst_labels_dir, "val")
    
    os.makedirs(dst_images_train_dir, exist_ok=True)
    os.makedirs(dst_images_val_dir, exist_ok=True)
    os.makedirs(dst_labels_train_dir, exist_ok=True)
    os.makedirs(dst_labels_val_dir, exist_ok=True)


    #print(src_labels_train_dir)
    for txt_file in os.listdir(src_labels_train_dir):
        if txt_file.endswith(".txt"):
            src_labels_train_file_path = os.path.join(src_labels_train_dir, txt_file)
            src_images_train_file_path = os.path.join(src_images_train_dir, txt_file.replace(".txt", ".jpg"))
            with open(src_labels_train_file_path, 'r') as f:
                    print(src_labels_train_file_path)
                    lines = f.readlines()
                    temp_lines=[]
                    temp_line=[]
                    for line in lines:
                        label_id = int(line.strip().split()[0])
                        if label_id in gt_labels:
                            temp_line=modify_list(line.strip().split())
                            temp_lines+=temp_line
                    #print(temp_lines)

                    if temp_lines:
                        print(temp_lines)
                        dst_labels_train_file_path = os.path.join(dst_labels_train_dir, txt_file)
                        dst_images_train_file_path = os.path.join(dst_images_train_dir, txt_file.replace(".txt", ".jpg"))
                        with open(dst_labels_train_file_path, 'w') as f_2:
                            for item in temp_lines:
                                f_2.write(f"{item}")
                        shutil.copy(src_images_train_file_path, dst_images_train_file_path)
        
    #print(src_labels_val_dir)
    for txt_file in os.listdir(src_labels_val_dir):
        if txt_file.endswith(".txt"):
            src_labels_val_file_path = os.path.join(src_labels_val_dir, txt_file)
            src_images_val_file_path = os.path.join(src_images_val_dir, txt_file.replace(".txt", ".jpg"))
            with open(src_labels_val_file_path, 'r') as f:
                    print(src_labels_val_file_path)
                    lines = f.readlines()
                    temp_lines=[]
                    temp_line=[]
                    for line in lines:
                        label_id = int(line.strip().split()[0])
                        if label_id in gt_labels:
                            temp_line=modify_list(line.strip().split())
                            temp_lines+=temp_line
                    #print(temp_lines)

                    if temp_lines:
                        print(temp_lines)
                        dst_labels_val_file_path = os.path.join(dst_labels_val_dir, txt_file)
                        dst_images_val_file_path = os.path.join(dst_images_val_dir, txt_file.replace(".txt", ".jpg"))
                        with open(dst_labels_val_file_path, 'w') as f_2:
                            for item in temp_lines:
                                f_2.write(f"{item}")
                        shutil.copy(src_images_val_file_path, dst_images_val_file_path)
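
The core of the script is the index remapping: a label line is kept only if its class appears in the DIY YAML file, and its class index is replaced by the index that class has there. The sketch below isolates that step on in-memory data. The function names build_index_map and remap_label_line and the sample class selections are hypothetical, and unlike the script above it copies all remaining fields of the line unchanged rather than regrouping them in fours.

```python
def build_index_map(src_names, dst_names):
    """Map each source class index to its index in the reduced dataset.

    Both arguments are {index: name} dictionaries, shaped like the
    'names' section of the YAML files."""
    dst_index_by_name = {name: idx for idx, name in dst_names.items()}
    return {
        src_idx: dst_index_by_name[name]
        for src_idx, name in src_names.items()
        if name in dst_index_by_name
    }


def remap_label_line(line, index_map):
    """Return the label line with its class index remapped,
    or None if the line is empty or its class is not selected."""
    parts = line.split()
    if not parts:
        return None
    src_idx = int(parts[0])
    if src_idx not in index_map:
        return None
    return " ".join([str(index_map[src_idx])] + parts[1:])


src_names = {0: "person", 1: "bicycle", 2: "car"}  # excerpt of the COCO names
dst_names = {0: "car", 1: "bicycle"}               # hypothetical DIY selection
index_map = build_index_map(src_names, dst_names)

print(remap_label_line("2 0.5 0.5 0.2 0.3", index_map))  # 0 0.5 0.5 0.2 0.3
print(remap_label_line("0 0.1 0.1 0.1 0.1", index_map))  # None
```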

You can write the YAML file for DIY_COCO in the following style. The labels must be selected from those that already exist in the COCO dataset, but their order does not have to match the order in the COCO dataset; the indices only need to count up from 0.

train: E:\Project\yolov8_coco_simple\coco_04\images\train
val: E:\Project\yolov8_coco_simple\coco_04\images\val

# Classes
names:
  0: bicycle
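
A file in this style can also be generated from a list of chosen class names, which makes it easy to catch a name that is not a COCO label before extraction starts. The sketch below builds the text with plain string formatting. The function name build_diy_yaml and the three-name excerpt of the COCO classes are assumptions for illustration.

```python
def build_diy_yaml(selected, coco_names, train_dir, val_dir):
    """Return the text of a DIY_COCO YAML file for the selected class names."""
    unknown = [name for name in selected if name not in coco_names]
    if unknown:
        raise ValueError(f"Not COCO class names: {unknown}")
    lines = [f"train: {train_dir}", f"val: {val_dir}", "", "# Classes", "names:"]
    # Indices count up from 0 in the order the names were selected
    lines += [f"  {idx}: {name}" for idx, name in enumerate(selected)]
    return "\n".join(lines) + "\n"


coco_names = {"person", "bicycle", "car"}  # excerpt; the full list has 80 names
text = build_diy_yaml(
    ["bicycle"],
    coco_names,
    r"E:\Project\yolov8_coco_simple\coco_04\images\train",
    r"E:\Project\yolov8_coco_simple\coco_04\images\val",
)
print(text)
```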

This YAML file includes the paths to the train and val directories as well as the label order and corresponding labels. For local training, use absolute paths. For cloud training, this YAML file needs to be modified to the following format:

When training in the cloud, be sure to organise the dataset as follows:

Combined with the steps in the Model Training section, you can train your own model based on the COCO DIY dataset!

If you downloaded the dataset manually, you will also need to download coco.yaml manually. The download link is below:

https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# COCO 2017 dataset https://cocodataset.org by Microsoft
# Documentation: https://docs.ultralytics.com/datasets/detect/coco/
# Example usage: yolo train data=coco.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco ← downloads here (20.1 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: coco # dataset root dir
train: train2017.txt # train images (relative to 'path') 118287 images
val: val2017.txt # val images (relative to 'path') 5000 images
test: test-dev2017.txt # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794

# Classes
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane
  5: bus
  6: train
  7: truck
  8: boat
  9: traffic light
  10: fire hydrant
  11: stop sign
  12: parking meter
  13: bench
  14: bird
  15: cat
  16: dog
  17: horse
  18: sheep
  19: cow
  20: elephant
  21: bear
  22: zebra
  23: giraffe
  24: backpack
  25: umbrella
  26: handbag
  27: tie
  28: suitcase
  29: frisbee
  30: skis
  31: snowboard
  32: sports ball
  33: kite
  34: baseball bat
  35: baseball glove
  36: skateboard
  37: surfboard
  38: tennis racket
  39: bottle
  40: wine glass
  41: cup
  42: fork
  43: knife
  44: spoon
  45: bowl
  46: banana
  47: apple
  48: sandwich
  49: orange
  50: broccoli
  51: carrot
  52: hot dog
  53: pizza
  54: donut
  55: cake
  56: chair
  57: couch
  58: potted plant
  59: bed
  60: dining table
  61: toilet
  62: tv
  63: laptop
  64: mouse
  65: remote
  66: keyboard
  67: cell phone
  68: microwave
  69: oven
  70: toaster
  71: sink
  72: refrigerator
  73: book
  74: clock
  75: vase
  76: scissors
  77: teddy bear
  78: hair drier
  79: toothbrush

# Download script/URL (optional)
download: |
  from pathlib import Path

  from ultralytics.utils.downloads import download

  # Download labels
  segments = True  # segment or box labels
  dir = Path(yaml["path"])  # dataset root dir
  url = "https://github.com/ultralytics/assets/releases/download/v0.0.0/"
  urls = [url + ("coco2017labels-segments.zip" if segments else "coco2017labels.zip")]  # labels
  download(urls, dir=dir.parent)
  # Download data
  urls = [
      "http://images.cocodataset.org/zips/train2017.zip",  # 19G, 118k images
      "http://images.cocodataset.org/zips/val2017.zip",  # 1G, 5k images
      "http://images.cocodataset.org/zips/test2017.zip",  # 7G, 41k images (optional)
  ]
  download(urls, dir=dir / "images", threads=3)
