mscoco
MSCOCODataset
Bases: DirDataset
A specialized DirDataset to handle MSCOCO data.
This dataset combines images from the MSCOCO data directory with their corresponding bboxes, masks, and captions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
image_dir | str | The path to the directory containing MSCOCO images. | required |
annotation_file | str | The path to the file containing annotation data. | required |
caption_file | str | The path to the file containing caption data. | required |
keypoint_file | str | The path to the file containing keypoint data. | required |
include_bboxes | bool | Whether images should be paired with their associated bounding boxes. | True |
include_masks | bool | Whether images should be paired with their associated masks. | False |
include_captions | bool | Whether images should be paired with their associated captions. | False |
include_keypoints | bool | Whether images should be paired with keypoints. | False |
min_bbox_area | | Bounding boxes with a total area less than … | 1.0 |
replacement | bool | If true, images without requested attributes will be ignored and other images may be oversampled in order to take their place. | True |
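
To make the `min_bbox_area` threshold concrete, here is a minimal sketch of an area filter. It is not the library's implementation: the helper name `filter_small_bboxes` is hypothetical, the `[x1, y1, w, h]` box layout is borrowed from the `load_data` description, and dropping boxes below the threshold is an assumption about how the parameter is applied.

```python
from typing import List, Sequence


def filter_small_bboxes(bboxes: Sequence[Sequence[float]],
                        min_bbox_area: float = 1.0) -> List[List[float]]:
    """Keep boxes whose area (w * h) is at least `min_bbox_area`.

    Hypothetical helper: assumes each box is [x1, y1, w, h] and that boxes
    with an area strictly less than the threshold are dropped.
    """
    kept = []
    for x1, y1, w, h in bboxes:
        if w * h >= min_bbox_area:
            kept.append([x1, y1, w, h])
    return kept


boxes = [
    [10.0, 20.0, 50.0, 40.0],  # area 2000.0, kept
    [5.0, 5.0, 0.5, 1.0],      # area 0.5, below the threshold
    [0.0, 0.0, 1.0, 1.0],      # area 1.0, not less than the threshold, kept
]
filtered = filter_small_bboxes(boxes, min_bbox_area=1.0)
```

Under these assumptions, an image whose boxes are all filtered out would count as lacking the requested attribute, which is the case the `replacement` flag addresses.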
Source code in fastestimator/fastestimator/dataset/data/mscoco.py
load_data
Load and return the COCO dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
root_dir | Optional[str] | The path to store the downloaded data. When … | None |
load_bboxes | bool | Whether to load bbox-related data, in [x1, y1, w, h] format. | True |
load_masks | bool | Whether to load mask data (in the form of an array of 1-hot images). | False |
load_captions | bool | Whether to load caption-related data. | False |
load_keypoints | bool | Whether to load keypoint data, in the format [array(17, 3)]: 17 is the number of keypoints and 3 is the (x, y, v) keypoint format, where x and y are the coordinates and v is the visibility. v=0 means not labeled, v=1 means labeled but not visible, and v=2 means labeled and visible. The bbox of the keypoint object is also available under the 'keypoint_bbox' key. | False |
replacement | bool | Whether to replace a sample that is missing a requested attribute (such as bbox) with another random sample. | True |
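
The bbox and keypoint layouts described above can be read as in the following sketch. It uses plain Python lists in place of arrays, and the helper names `xywh_to_corners` and `summarize_keypoints` are hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Sequence

# Visibility flag meanings, as listed in the load_keypoints description.
VISIBILITY = {0: "not labeled", 1: "labeled, not visible", 2: "labeled and visible"}


def xywh_to_corners(bbox: Sequence[float]) -> List[float]:
    """Convert an [x1, y1, w, h] box into [x1, y1, x2, y2] corner form."""
    x1, y1, w, h = bbox
    return [x1, y1, x1 + w, y1 + h]


def summarize_keypoints(keypoints: Sequence[Sequence[float]]) -> Dict[str, int]:
    """Count keypoints per visibility class for one (17, 3) keypoint set."""
    if len(keypoints) != 17:
        raise ValueError("expected 17 keypoints, got %d" % len(keypoints))
    counts = {name: 0 for name in VISIBILITY.values()}
    for _x, _y, v in keypoints:
        counts[VISIBILITY[int(v)]] += 1
    return counts


# Made-up example values: 12 unlabeled, 3 visible, 2 labeled but occluded.
person = [[0, 0, 0]] * 12 + [[120, 80, 2]] * 3 + [[130, 95, 1]] * 2
summary = summarize_keypoints(person)
corners = xywh_to_corners([10, 20, 50, 40])
```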
Returns:
Type | Description |
---|---|
Tuple[MSCOCODataset, MSCOCODataset] | (train_data, eval_data) |
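
A call using the parameters documented above might look like the sketch below. The wrapper `load_mscoco_with_keypoints` is hypothetical, and the import path is inferred from the source file location shown on this page, so treat it as an assumption. The import is deferred into the function body because calling `load_data` may download the dataset into `root_dir`.

```python
from typing import Optional


def load_mscoco_with_keypoints(root_dir: Optional[str] = None):
    """Load MSCOCO with bboxes and keypoints; returns (train_data, eval_data).

    Hypothetical wrapper: requires FastEstimator to be installed, and the
    import path below is an assumption based on the module's file location.
    """
    from fastestimator.dataset.data import mscoco

    train_data, eval_data = mscoco.load_data(
        root_dir=root_dir,     # where the downloaded data is stored
        load_bboxes=True,      # boxes in [x1, y1, w, h] format
        load_masks=False,
        load_captions=False,
        load_keypoints=True,   # also exposes the 'keypoint_bbox' key
        replacement=True,      # swap in another random sample if an attribute is missing
    )
    return train_data, eval_data
```

Keeping `replacement=True` matches the documented default; set it to `False` if substituting random samples for incomplete ones is not wanted.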