2022-08-06

A Deep Dive into the COCO Dataset (Part 2)

This is the second part of a series that takes a deep dive into COCO dataset labels. It explains how to build a COCO-style dataset.

Building a COCO-style detection dataset

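Part 1 covered the COCO label structure in detail. Once you know how the labels are organized, building your own COCO-style dataset is straightforward. As before, this post looks at the problem only from the label side.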

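For simplicity, take two classes, A and B. Suppose I am working on a detection task and every image has a matching .txt file that records its labels. Each label in the file is a five-tuple: the first value is the class index, and the remaining four are the coordinates of the top-left and bottom-right corners of the bounding box.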
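To make the input format concrete, here is a minimal sketch of reading such a label file. The function name `parse_label_text` and the whitespace-separated layout are assumptions made for illustration; the post only states that each label is a five-tuple of class index plus two corner points.

```python
from typing import List, Tuple

# (class index, x1, y1, x2, y2)
Label = Tuple[int, float, float, float, float]


def parse_label_text(text: str) -> List[Label]:
    """Parse lines of 'class x1 y1 x2 y2' into typed five-tuples."""
    labels = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 values, got {len(parts)}: {line!r}")
        cls, x1, y1, x2, y2 = parts
        # int(float(...)) also accepts a class index written as "0.0"
        labels.append((int(float(cls)), float(x1), float(y1), float(x2), float(y2)))
    return labels


example = "0 10 20 110 220\n1 5.5 6.5 50 60\n"
print(parse_label_text(example))
```

An empty file simply yields an empty list, which corresponds to an image without any objects.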

The label dictionary can then be predefined like this:

label_json = {
    "info" : {
        "contributor"   : "", 
        "date_created"  : "", 
        "description"   : "", 
        "url"           : "", 
        "version"       : "", 
        "year"          : ""
    },
    "licenses"      : [{"name": "", "id": 0, "url": ""}],
    # the main annotation information
    "categories"    : [
        {"supercategory" : "", "name": 'A', "id": 1},
        {"supercategory" : "", "name": 'B', "id": 2},
    ],
    "images"        : [],
    "annotations"   : [],
}
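With more than a handful of classes, writing the categories list by hand gets tedious. The sketch below generates it from a list of class names and also builds a lookup from class index to category id. Both helper names are hypothetical, and the 0-based class index is an assumption: the post does not say whether the index stored in the .txt files starts at 0 or 1.

```python
def build_categories(names, first_id=1):
    """Build a COCO "categories" list, assigning consecutive ids in order."""
    return [
        {"supercategory": "", "name": name, "id": first_id + index}
        for index, name in enumerate(names)
    ]


def index_to_category_id(categories):
    """Map a 0-based class index to the "id" of the category at that position."""
    return {index: cat["id"] for index, cat in enumerate(categories)}


categories = build_categories(["A", "B"])
print(categories)
print(index_to_category_id(categories))
```

If the label files already store the category id directly, the lookup is unnecessary and the value can be used as is.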

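Next, read the labels from each .txt file in turn and write them, together with the matching image information, into the images and annotations entries of the label_json dictionary. The program is as follows: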

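import glob
import os
import cv2
import json
import warnings
import numpy as np
# root
# ├──images
# │     ├──train
# │     └──val
# ├──labels
# │     ├──train
# │     └──val
# └──annotation
#       └── annotation.json
json_path = "annotation/annotation.json"
# txt_list is not defined in the original post; it is assumed here to be
# every label file of the train split
txt_list = sorted(glob.glob("labels/train/*.txt"))


def xyxy2xywh(label, img_shape):
    # Not defined in the original post. Assumed behaviour: take one label
    # (class, x1, y1, x2, y2) in pixel coordinates, clip the box to the
    # image, and return COCO-style [x, y, width, height] as Python floats.
    H, W = img_shape
    x1, x2 = np.clip([label[1], label[3]], 0, W)
    y1, y2 = np.clip([label[2], label[4]], 0, H)
    return [float(x1), float(y1), float(x2 - x1), float(y2 - y1)]


ann_id = 0
for img_id, file in enumerate(txt_list):
    img_path = file.replace(
        "labels", "images"
    ).replace(
        ".txt", ".jpg"
    )
    if not os.path.exists(img_path):
        continue

    img = cv2.imread(img_path)
    if img is None:
        # cv2.imread returns None for an unreadable or corrupt image
        continue
    H, W = img.shape[:2]
    # part1: image information
    img_info = {
        "license" : 0,
        # COCO uses the key "file_name", not "filename"
        "file_name": img_path,
        "height": H,
        "width": W,
        "id": img_id
    }
    label_json["images"].append(img_info)

    # labels
    with warnings.catch_warnings():
        # ignore the warning raised when the file is empty
        warnings.simplefilter("ignore")
        labels = np.loadtxt(file)

    if labels.size == 0:
        continue
    if labels.ndim == 1:
        # a file with a single label is loaded as a 1-D array
        labels = labels[None]
    # part2: annotation information
    for label in labels:
        target_info = {
            "segmentation"  : [],
            # must match an "id" in label_json["categories"] (1 or 2 here)
            "category_id"   : int(label[0]),
            "image_id"      : img_id,
            "id"            : ann_id,
            "area"          : round(float((label[3]-label[1])*(label[4]-label[2])), 6),
            "bbox"          : xyxy2xywh(label, (H,W)),
            "iscrowd"       : 0,
        }
        ann_id += 1
        label_json["annotations"].append(target_info)
# save the labels in a json file
os.makedirs(os.path.dirname(json_path), exist_ok=True)
with open(json_path, 'w') as json_f:
    json.dump(label_json, json_f)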

That completes the COCO-style object detection dataset.
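Before training on the generated file, it can help to check that the ids in the dictionary are consistent with each other. The following is a minimal sketch of such a check, not an official COCO validator; `check_coco_dict` is a hypothetical helper, and it only inspects the fields used in this post.

```python
def check_coco_dict(coco):
    """Return a list of problems found in a COCO-style dict (empty if none)."""
    problems = []
    image_ids = [img["id"] for img in coco.get("images", [])]
    if len(image_ids) != len(set(image_ids)):
        problems.append("duplicate image ids")
    valid_images = set(image_ids)
    valid_categories = {cat["id"] for cat in coco.get("categories", [])}

    seen_ann_ids = set()
    for ann in coco.get("annotations", []):
        ann_id = ann["id"]
        if ann_id in seen_ann_ids:
            problems.append(f"annotation {ann_id}: duplicate id")
        seen_ann_ids.add(ann_id)
        if ann["image_id"] not in valid_images:
            problems.append(f"annotation {ann_id}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in valid_categories:
            problems.append(f"annotation {ann_id}: unknown category_id {ann['category_id']}")
        _, _, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann_id}: non-positive bbox size")
    return problems


demo = {
    "categories": [{"supercategory": "", "name": "A", "id": 1},
                   {"supercategory": "", "name": "B", "id": 2}],
    "images": [{"height": 100, "width": 200, "id": 0}],
    "annotations": [
        {"segmentation": [], "category_id": 1, "image_id": 0, "id": 0,
         "area": 200.0, "bbox": [10.0, 20.0, 10.0, 20.0], "iscrowd": 0},
        # a class index of 0 written without mapping it to a category id
        {"segmentation": [], "category_id": 0, "image_id": 0, "id": 1,
         "area": 200.0, "bbox": [10.0, 20.0, 10.0, 20.0], "iscrowd": 0},
    ],
}
print(check_coco_dict(demo))
```

The second annotation in the demo illustrates a common slip: a class index that does not correspond to any id in categories.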

Building a COCO-style dataset from 2D masks

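The section above showed how to build a COCO-style object detection dataset. How would you build a COCO-style segmentation dataset? The process is the same; the only difficulty is converting a 2D mask into COCO's polygon format. A polygon is simply a sequence of points along the object's outline, so getting it means finding the object's contour, which OpenCV's findContours function does: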

contours, _ = cv2.findContours(
    mask, 
    cv2.RETR_TREE, 
    cv2.CHAIN_APPROX_NONE
)

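For reasons of space, cv2.findContours is not explained in detail here; for a 2D mask, the code above is enough. The returned contours is a list in which each item is the point set of one closed contour. All that remains is to iterate over the contours and add them one by one.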

In practice there are often some extra requirements. This post considers two of them.

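The contour points are too dense; can the contour be represented with fewer points? Yes. OpenCV's cv2.approxPolyDP does this: it approximates a contour with another contour made of fewer points, and the number of points in the new contour is determined by the approximation accuracy (the function's second parameter);
Can area and bounding-box information be added as well? cv2.contourArea returns the area of a contour, and cv2.boundingRect returns its bounding box in [x, y, w, h] format.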

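annotations = []
for contour in contours:
    # approximate the contour with fewer points; epsilon is the maximum
    # allowed distance between the original contour and its approximation
    epsilon = 0.01 * cv2.arcLength(contour, True)
    contour = cv2.approxPolyDP(contour, epsilon, True)
    if len(contour) < 3:
        # fewer than 3 points cannot form a polygon
        continue

    # flatten the (N, 1, 2) point array into [x1, y1, x2, y2, ...];
    # tolist() yields Python ints, which json.dump can serialize
    polygon = contour.reshape(-1).tolist()
    # get area
    area = cv2.contourArea(contour)
    # get bounding box as (x, y, w, h)
    bbox = cv2.boundingRect(contour)

    annotations.append([polygon, area, bbox])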
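As a cross-check on the area values collected in the loop above, the area of a flat polygon list can also be computed in pure Python with the shoelace formula. This is a sketch for verification only, with the hypothetical helper name `polygon_area`; for a simple, non-self-intersecting polygon it should agree with the value from cv2.contourArea.

```python
def polygon_area(polygon):
    """Area of a polygon given as a flat list [x1, y1, x2, y2, ...]."""
    xs = polygon[0::2]
    ys = polygon[1::2]
    n = len(xs)
    if n < 3 or len(xs) != len(ys):
        return 0.0  # not a valid polygon
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to close the polygon
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    # abs() makes the result independent of vertex order (clockwise or not)
    return abs(twice_area) / 2.0


rectangle = [0, 0, 4, 0, 4, 3, 0, 3]
triangle = [0, 0, 4, 0, 0, 3]
print(polygon_area(rectangle), polygon_area(triangle))
```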

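The remaining fields of a label, such as image_id and category_id, are obtained the same way as for the detection dataset, so they are not repeated here.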
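To show how the pieces fit together, here is a minimal sketch that turns one [polygon, area, bbox] entry into a COCO-style annotation dictionary. The helper name `make_segmentation_annotation` and the sample values are assumptions for illustration. Note that COCO stores polygon segmentations as a list of polygons, so the flat point list is wrapped in an outer list.

```python
import json


def make_segmentation_annotation(polygon, area, bbox, image_id, category_id, ann_id):
    """Build one COCO-style annotation from a flat polygon, its area and bbox."""
    if len(polygon) < 6 or len(polygon) % 2 != 0:
        raise ValueError("a polygon needs at least 3 (x, y) points")
    return {
        # one object may consist of several polygons, hence the outer list
        "segmentation": [[float(v) for v in polygon]],
        "category_id": int(category_id),
        "image_id": int(image_id),
        "id": int(ann_id),
        "area": float(area),
        "bbox": [float(v) for v in bbox],  # [x, y, w, h]
        "iscrowd": 0,
    }


# sample values shaped like one entry of the annotations list
ann = make_segmentation_annotation(
    [0, 0, 4, 0, 4, 3, 0, 3], 12.0, (0, 0, 5, 4),
    image_id=7, category_id=1, ann_id=0,
)
print(json.dumps(ann))
```

Converting every value to a plain Python int or float keeps the dictionary serializable with json.dump even when the inputs are NumPy scalars.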