Data Format: Instance Segmentation
Introduction
Instance segmentation builds on the object detection data format: in addition to the two-point bounding box, each annotation carries a mask boundary that localizes the object within the detection box. Annotations are stored in JSON files. For example, 001.jpg is an original image, and 001.json contains the annotations and corresponding labels for all instance segmentations in that image.
Example
json_Label,image_Source
labels/001.json,images/001.jpg
The JSON annotation content for instance segmentation consists of bounding boxes with their corresponding content and attributes. The bounding boxes are divided into two-point boxes and mask boundaries:
- The two-point box consists of the top-left corner (x_min, y_min) and the bottom-right corner (x_max, y_max).
- The mask is a sequence of point coordinates that starts at the top-left point of the region and proceeds clockwise, forming a closed shape that traces the object's outline.
All annotation coordinates are relative to the image size. For example, if the image size is (800, 600) and a point is at pixel (10, 30), its relative representation is (10/800, 30/600), which equals (0.0125, 0.05).
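The conversion between pixel and relative coordinates can be sketched as follows. This is a minimal illustration, not part of the format itself; the function names to_relative and to_pixel are hypothetical.

```python
def to_relative(x_px, y_px, image_width, image_height):
    """Convert a pixel coordinate to a relative coordinate (fraction of image size)."""
    return x_px / image_width, y_px / image_height


def to_pixel(x_rel, y_rel, image_width, image_height):
    """Convert a relative coordinate back to pixel units."""
    return x_rel * image_width, y_rel * image_height


# Point (10, 30) on an 800 x 600 image.
print(to_relative(10, 30, 800, 600))  # (0.0125, 0.05)
```

Dividing x by the width and y by the height keeps annotations valid when the image is resized, since only the ratio is stored.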
Field Description
- image_width - The width of the image
- image_height - The height of the image
- image_path - The relative path of the image file
- num_box - The number of bounding boxes on the image
- bboxes - The list of bounding boxes on the image. Each entry contains the following fields:
  - attributions - Custom attribute values for the dataset (not used for training, but retained for annotation purposes)
  - label - The annotation label for this box
  - x_min / y_min - Top-left corner coordinates of the two-point box
  - x_max / y_max - Bottom-right corner coordinates of the two-point box
  - x_arr - Sequential x coordinates of mask boundary points
  - y_arr - Sequential y coordinates of mask boundary points
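As a rough sanity check, the fields above could be validated with a sketch like the one below. The function validate_annotation and the constant REQUIRED_BOX_FIELDS are hypothetical names, and two of the rules are assumptions rather than stated requirements: that x_arr and y_arr are paired by index (so their lengths match) and that every relative coordinate lies in [0, 1].

```python
REQUIRED_BOX_FIELDS = ("label", "x_min", "y_min", "x_max", "y_max", "x_arr", "y_arr")


def validate_annotation(ann):
    """Return a list of problems found; an empty list means no issue was detected."""
    problems = []
    bboxes = ann.get("bboxes", [])
    # num_box is described as the number of bounding boxes on the image.
    if ann.get("num_box") != len(bboxes):
        problems.append("num_box does not match the length of bboxes")
    for i, box in enumerate(bboxes):
        missing = [f for f in REQUIRED_BOX_FIELDS if f not in box]
        if missing:
            problems.append(f"box {i}: missing fields {missing}")
            continue
        # The top-left corner must not lie below or right of the bottom-right corner.
        if box["x_min"] > box["x_max"] or box["y_min"] > box["y_max"]:
            problems.append(f"box {i}: corners are out of order")
        # Assumption: mask points are (x_arr[k], y_arr[k]) pairs.
        if len(box["x_arr"]) != len(box["y_arr"]):
            problems.append(f"box {i}: x_arr and y_arr differ in length")
        # Assumption: relative coordinates stay within the image, i.e. in [0, 1].
        corners = [box["x_min"], box["y_min"], box["x_max"], box["y_max"]]
        values = corners + list(box["x_arr"]) + list(box["y_arr"])
        if any(not 0.0 <= v <= 1.0 for v in values):
            problems.append(f"box {i}: coordinate outside the range [0, 1]")
    return problems
```

Returning a list of messages instead of raising on the first error makes it easier to report every problem in a file at once.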
 
Instance Segmentation Annotation Example
{
  "num_box": 1,
  "bboxes": [
    {
      "id": 0,
      "attributions": {
        "group": 2
      },
      "label": "people",
      "x_max": 0.08816,
      "x_min": 0.00001,
      "y_max": 0.20833,
      "y_min": 0.00001,
      "x_arr": [
        0.0, 0.0, 0.0861, 0.0875, 0.2083, 0.2097, 0.3292, 0.3306, 0.4514, 0.4528, 0.57222, 0.5736, 0.6347, 0.6347
      ],
      "y_arr": [
        0.0, 0.0652, 0.0652, 0.0707, 0.0706, 0.0761, 0.0761, 0.0815, 0.0815, 0.0869, 0.0869, 0.0924, 0.0924, 0.0
      ]
    }
  ],
  "image_height": 2019,
  "image_width": 2048,
  "image_path": "images/1.jpg"
}
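To use such an annotation with pixel-based tools, the relative mask points can be scaled back by the image size. The sketch below is illustrative only: pixel_polygons is a hypothetical helper, the sample is a shortened annotation in the same shape as the example above, and it assumes that x_arr and y_arr are paired by index.

```python
import json


def pixel_polygons(annotation):
    """Return (label, [(x_px, y_px), ...]) for every box in the annotation."""
    width = annotation["image_width"]
    height = annotation["image_height"]
    polygons = []
    for box in annotation["bboxes"]:
        # Assumption: the k-th mask point is (x_arr[k], y_arr[k]).
        points = [(x * width, y * height) for x, y in zip(box["x_arr"], box["y_arr"])]
        polygons.append((box["label"], points))
    return polygons


# Shortened sample in the same shape as the annotation example.
sample = json.loads("""
{
  "num_box": 1,
  "bboxes": [
    {
      "id": 0,
      "label": "people",
      "x_min": 0.0, "y_min": 0.0, "x_max": 0.0861, "y_max": 0.0652,
      "x_arr": [0.0, 0.0, 0.0861],
      "y_arr": [0.0, 0.0652, 0.0652]
    }
  ],
  "image_height": 2019,
  "image_width": 2048,
  "image_path": "images/1.jpg"
}
""")

for label, points in pixel_polygons(sample):
    print(label, len(points), "points")
```

For a file on disk, the same function would apply to the result of json.load on the opened annotation file.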