2022-08-06

A Deep Dive into the COCO Dataset (Part 3)

This is the third part of the series A Deep Dive into COCO Dataset Annotations. It covers how to load a COCO-style dataset.

pycocotools

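pycocotools provides the interface for working with COCO datasets, and it is the usual way to load a COCO-style dataset. More precisely, loading goes through the COCO class in pycocotools.coco. Annotations are loaded like this: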

from pycocotools.coco import COCO
coco = COCO(ann_file)

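What comes after you have a COCO object? Before getting to that, here are the most commonly used interfaces of the COCO class:

Attributes

The commonly used attributes are all dictionaries.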

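Name	Key	Value	Description
`anns`	`ann_id`	dict, the matching annotation record from the annotation file	fast lookup of an annotation by `ann_id`
`imgs`	`image_id`	dict, the matching image record from the annotation file	fast lookup of an image by `image_id`
`cats`	`category_id`	dict, the matching category record from the annotation file	fast lookup of a category by `category_id`
`imgToAnns`	`image_id`	list whose elements are annotation dicts	fast lookup of all annotations of an image by `image_id`
`catToImgs`	`category_id`	list whose elements are `image_id` values	lookup, by `category_id`, of the ids of all images that have an annotation of that category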
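
As a rough illustration of what these five attributes hold, the sketch below builds equivalent lookups by hand from a tiny in-memory annotation dict. The `dataset` contents and the `build_index` helper are made up for this example. pycocotools builds its own index when `COCO(ann_file)` is constructed, and this is only a simplified imitation of that step, not the library's code.

```python
from collections import defaultdict

# Hypothetical miniature annotation file, already parsed from JSON.
dataset = {
    "images": [
        {"id": 1, "file_name": "a.jpg", "width": 64, "height": 48},
        {"id": 2, "file_name": "b.jpg", "width": 64, "height": 48},
        {"id": 3, "file_name": "c.jpg", "width": 64, "height": 48},  # no annotations
    ],
    "categories": [
        {"id": 6, "name": "chair", "supercategory": "furniture"},
        {"id": 7, "name": "table", "supercategory": "furniture"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 6, "bbox": [1, 2, 10, 20]},
        {"id": 11, "image_id": 1, "category_id": 7, "bbox": [5, 5, 30, 15]},
        {"id": 12, "image_id": 2, "category_id": 6, "bbox": [0, 0, 12, 12]},
    ],
}


def build_index(dataset):
    """Build id-keyed lookups similar in shape to the COCO attributes."""
    anns = {ann["id"]: ann for ann in dataset["annotations"]}
    imgs = {img["id"]: img for img in dataset["images"]}
    cats = {cat["id"]: cat for cat in dataset["categories"]}
    img_to_anns = defaultdict(list)  # image_id -> list of annotation dicts
    cat_to_imgs = defaultdict(list)  # category_id -> list of image_ids
    for ann in dataset["annotations"]:
        img_to_anns[ann["image_id"]].append(ann)
        cat_to_imgs[ann["category_id"]].append(ann["image_id"])
    return anns, imgs, cats, img_to_anns, cat_to_imgs


anns, imgs, cats, img_to_anns, cat_to_imgs = build_index(dataset)
print(sorted(img_to_anns.keys()))  # image 3 has no annotations, so it is absent
```

In this sketch an image without annotations never becomes a key of `img_to_anns`, although it is still present in `imgs`.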

Methods

getAnnIds()

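Parameters: imgIds: the image_id(s) to query (optional); catIds: the category_id(s) to query (optional). The method also accepts two further optional filters, areaRng (an area range) and iscrowd.
Returns: a list of the ann_ids that satisfy all of the filters; with no arguments, a list of every ann_id.

Example:

>>> coco.getAnnIds(imgIds=100)
[246, 247]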
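
The "must satisfy every filter" behaviour can be pictured with a plain-Python filter. `filter_ann_ids` and the sample annotations are hypothetical stand-ins written for this illustration, not pycocotools code, and the sketch only covers the image and category filters.

```python
def filter_ann_ids(annotations, img_ids=(), cat_ids=()):
    """Return the ids of annotations that match every non-empty filter."""
    img_ids, cat_ids = set(img_ids), set(cat_ids)
    return [
        ann["id"]
        for ann in annotations
        # An empty filter places no restriction on that field.
        if (not img_ids or ann["image_id"] in img_ids)
        and (not cat_ids or ann["category_id"] in cat_ids)
    ]


# Hypothetical annotation records (only the fields used by the filter).
annotations = [
    {"id": 246, "image_id": 100, "category_id": 6},
    {"id": 247, "image_id": 100, "category_id": 3},
    {"id": 300, "image_id": 101, "category_id": 6},
]

print(filter_ann_ids(annotations, img_ids=[100]))               # [246, 247]
print(filter_ann_ids(annotations, img_ids=[100], cat_ids=[6]))  # [246]
print(filter_ann_ids(annotations))                              # [246, 247, 300]
```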

getImgIds()

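Parameters: imgIds: the image_id(s) to query (optional); catIds: the category_id(s) to query (optional).
Returns: a list of the image_ids that satisfy all of the filters; with no arguments, a list of every image_id.

Example:

>>> coco.getImgIds(imgIds=[1, 10, 100], catIds=6)  # of the three images, those with a category_id=6 annotation
[100]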

getCatIds()

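Parameters: catNms: category names (optional); supNms: supercategory names (optional); catIds: category_id(s) (optional).
Returns: a list of the category_ids that satisfy all of the filters; with no arguments, a list of every category_id.

Example:

>>> coco.getCatIds(catNms="chairs", catIds=6)
[6]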

loadAnns()

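Parameters: ids: ann_id, either a single integer or a list (optional).
Returns: a list of the annotation records from the annotation file for the given ann_ids. With no arguments the id list defaults to empty, so the result is an empty list; to get every annotation, pass coco.getAnnIds().
Mechanism: implemented through the anns attribute.
Example:
>>> coco.loadAnns(ids=[2, 3])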

loadImgs()

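Parameters: ids: image_id, either a single integer or a list (optional).
Returns: a list of the image records from the annotation file for the given image_ids. With no arguments the id list defaults to empty, so the result is an empty list; to get every image record, pass coco.getImgIds().
Mechanism: implemented through the imgs attribute.
Example:
>>> coco.loadImgs(ids=0)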

loadCats()

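Parameters: ids: category_id, either a single integer or a list (optional).
Returns: a list of the category records from the annotation file for the given category_ids. With no arguments the id list defaults to empty, so the result is an empty list; to get every category record, pass coco.getCatIds().
Mechanism: implemented through the cats attribute.
Example:
>>> coco.loadCats(ids=6)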

annToMask()

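Parameters: ann: an annotation dict.
Returns: the binary mask for the segmentation in ann['segmentation'] (polygons or RLE), as an np.ndarray with the height and width of the corresponding image.
Example:
mask = coco.annToMask(coco.anns[0])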
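
To make "the mask for a polygon" concrete, here is a minimal NumPy-only sketch that rasterizes COCO-style polygons (flat `[x1, y1, x2, y2, ...]` lists) by testing pixel centres with the even-odd rule. `polygons_to_mask` is a hypothetical helper for illustration; pycocotools does this in its own compiled routine, so pixels on the boundary may come out differently there.

```python
import numpy as np


def polygons_to_mask(polygons, height, width):
    """Rasterize flat [x1, y1, x2, y2, ...] polygons into a (height, width) uint8 mask.

    A pixel is set when its centre lies inside at least one polygon.
    """
    # Centre coordinates of every pixel in the image grid.
    ys, xs = np.mgrid[0:height, 0:width]
    px, py = xs + 0.5, ys + 0.5
    mask = np.zeros((height, width), dtype=bool)
    for flat in polygons:
        pts = np.asarray(flat, dtype=float).reshape(-1, 2)
        inside = np.zeros((height, width), dtype=bool)
        # Even-odd rule: toggle a pixel each time a ray going right crosses an edge.
        for (x0, y0), (x1, y1) in zip(pts, np.roll(pts, -1, axis=0)):
            if y0 == y1:
                continue  # a horizontal edge never crosses a horizontal ray
            crosses = (py >= min(y0, y1)) & (py < max(y0, y1))
            x_at_py = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            inside ^= crosses & (px < x_at_py)
        mask |= inside  # several polygon parts are combined as a union
    return mask.astype(np.uint8)


# Hypothetical annotation: one rectangle with corners (1, 1) and (5, 4).
ann = {"segmentation": [[1, 1, 5, 1, 5, 4, 1, 4]]}
mask = polygons_to_mask(ann["segmentation"], height=6, width=8)
print(mask)
```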

Loading COCO-style data

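Loading a COCO dataset really comes down to using the attributes and methods above to pull out the information you need. Once you know these interfaces, loading the data is straightforward.

Many open-source projects already ship a COCO data-loading class, so most of the time there is no need to write your own. In some special cases, though, you will have to rewrite or modify one to fit your requirements. I personally recommend the data loading in the YOLACT code as a reference, because its logic is simple and clear.

Here is a simple data-loading template for object detection: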

import cv2
import os.path as osp
import numpy as np

class COCODataset():
    """
    COCO style dataset
    """
    def __init__(
        self,
        data_dir=None,
        json_file="coco_train.json",
        name="train",
        area_threshold=-1,
    ):
        """
        Dataset initialization. Annotation data are read into memory by COCO API.
        Args:
            data_dir (str): dataset root directory
            json_file (str): annotation json file name
            name (str): data name (e.g. 'train' or 'val')
            area_threshold (int): area threshold for object
        """
        self.area_thres = area_threshold
        self.data_dir = data_dir
        self.json_file = json_file
        self.imgs = None
        self.name = name
        self.init()

    def init(self):
        from pycocotools.coco import COCO
        self.coco = COCO(osp.join(self.data_dir, "annotations", self.json_file))

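        # NOTE: take image ids from imgToAnns rather than self.coco.getImgIds(),
        # so that images without any annotation are skipped
        self.ids = list(self.coco.imgToAnns.keys())
        self.class_ids = sorted(self.coco.getCatIds())
        # load categories in the same sorted order, so names line up with label indices
        self.cats = self.coco.loadCats(self.class_ids)
        self._classes = tuple(c["name"] for c in self.cats)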

        self.annotations = self._load_coco_annotations()

    def __getitem__(self, index):
        # NOTE: in case empty label
        max_iter = 5
        for _ in range(max_iter):
            res, img_info, _ = self.annotations[index]
            if res.size > 0:
                break
            index = np.random.randint(0, self.__len__())
        id_ = self.ids[index]

        if self.imgs is not None:
            pad_img = self.imgs[index]
            img = pad_img[: img_info[0], : img_info[1], :].copy()
        else:
            img = self.load_image(index)

        return img, res.copy(), img_info, np.array([id_])

    def __len__(self):
        return len(self.ids)

    def _load_coco_annotations(self):
        return [self.load_anno_from_ids(_ids) for _ids in self.ids]

    def load_anno_from_ids(self, id_):
        im_ann = self.coco.loadImgs(id_)[0]
        width = im_ann["width"]
        height = im_ann["height"]
        anno_ids = self.coco.getAnnIds(imgIds=[int(id_)], iscrowd=False)
        annotations = self.coco.loadAnns(anno_ids)
        objs = []
        for obj in annotations:
            x1 = np.max((0, obj["bbox"][0]))
            y1 = np.max((0, obj["bbox"][1]))
            x2 = np.min((width, x1 + np.max((0, obj["bbox"][2]))))
            y2 = np.min((height, y1 + np.max((0, obj["bbox"][3]))))
            # NOTE: we filter the small object, which is different from the super method
            area_thres = max(self.area_thres, 0)
            if obj["area"] > area_thres and x2 >= x1 and y2 >= y1:
                obj["clean_bbox"] = [x1, y1, x2, y2]
                objs.append(obj)

        num_objs = len(objs)

        res = np.zeros((num_objs, 5))

        for ix, obj in enumerate(objs):
            cls = self.class_ids.index(obj["category_id"])
            res[ix, 0:4] = obj["clean_bbox"]
            res[ix, 4] = cls

        img_info = (height, width)
        file_name = (
            im_ann["file_name"]
            if "file_name" in im_ann
            else "{:012}".format(id_) + ".jpg"
        )

        return (res, img_info, file_name)

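    def load_image(self, index):
        file_name = self.annotations[index][-1]
        # ASSUMPTION: images live under <data_dir>/<name>/ (e.g. <data_dir>/train/)
        img_file = osp.join(self.data_dir, self.name, file_name)
        assert osp.exists(img_file), f"file named {img_file} not found"
        img = cv2.imread(img_file)

        return img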
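
The box handling inside load_anno_from_ids is the part most likely to need adapting, so here it is in isolation. COCO stores bbox as [x, y, width, height], and the template converts it to corner form clipped to the image. `xywh_to_clipped_xyxy` is a hypothetical standalone helper written for this illustration; it mirrors the arithmetic of the template using plain Python numbers.

```python
def xywh_to_clipped_xyxy(bbox, img_width, img_height):
    """Convert a COCO [x, y, w, h] box to [x1, y1, x2, y2], clipped to the image."""
    x, y, w, h = bbox
    x1 = max(0, x)
    y1 = max(0, y)
    # Negative widths/heights are treated as zero before clipping to the image.
    x2 = min(img_width, x1 + max(0, w))
    y2 = min(img_height, y1 + max(0, h))
    return [x1, y1, x2, y2]


print(xywh_to_clipped_xyxy([10, 20, 30, 40], img_width=100, img_height=100))  # [10, 20, 40, 60]
print(xywh_to_clipped_xyxy([-5, 90, 30, 40], img_width=100, img_height=100))  # [0, 90, 30, 100]
```

As in the template, the right and bottom edges are computed from the already clamped x1 and y1, so a box that starts outside the image keeps its full width and height unless it also runs past the opposite border. Whether that is the desired behaviour depends on the dataset.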

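That concludes this series, A Deep Dive into COCO Dataset Annotations. The COCO dataset is widely used in detection and segmentation. Even if we never use COCO itself, understanding its annotation format well enough to create and load our own COCO-style datasets pays off when working with the related models. I hope this series serves as a starting point and gives readers something useful to refer to.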