README.md: 38 additions & 3 deletions
@@ -59,7 +59,7 @@ The output obtained from the process includes several attributes that can be lev
 
 3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
 
-4. masks: If available, this attribute provides segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be used to accurately outline the boundaries of each object.
 
 5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
 
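Since each entry in the `polygons` attribute described above is an (N, 2) NumPy array of (x, y) vertices, standard array geometry applies to it directly. A minimal sketch, assuming made-up sample data rather than actual library output:

```python
import numpy as np

def polygon_area(poly: np.ndarray) -> float:
    """Shoelace formula: area of a simple polygon given as an (N, 2) array of (x, y) vertices."""
    x, y = poly[:, 0], poly[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

# Hypothetical sample: one detected object outlined by a 10x10 axis-aligned square.
polygons = [np.array([[0, 0], [10, 0], [10, 10], [0, 10]], dtype=float)]
print(polygon_area(polygons[0]))  # 100.0
```

The same per-object arrays can be passed to drawing routines or filtered by area before visualization.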
@@ -91,7 +91,7 @@ result = CombineDetections(element_crops, nms_threshold=0.25, match_metric='IOS'
 img=result.image
 confidences=result.filtered_confidences
 boxes=result.filtered_boxes
-masks=result.filtered_masks
+polygons=result.filtered_polygons
 classes_ids=result.filtered_classes_id
 classes_names=result.filtered_classes_names
 ```
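The `match_metric` argument visible in the hunk header above selects between IoU and IoS. As a hedged sketch of what those metrics compute for `[x_min, y_min, x_max, y_max]` boxes (not the library's own code):

```python
def box_area(box):
    # box = [x_min, y_min, x_max, y_max]
    return (box[2] - box[0]) * (box[3] - box[1])

def box_iou(a, b):
    """Intersection over Union of two boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = box_area(a) + box_area(b) - inter
    return inter / union if union else 0.0

def box_ios(a, b):
    """Intersection over Smaller area: a box nested inside another scores 1.0."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    smaller = min(box_area(a), box_area(b))
    return inter / smaller if smaller else 0.0

a, b = [0, 0, 10, 10], [2, 2, 8, 8]  # b is nested inside a
print(box_iou(a, b))  # 0.36
print(box_ios(a, b))  # 1.0
```

This illustrates why 'IOS' with a modest `nms_threshold` such as 0.25 suppresses a small detection fully contained in a larger one even when their IoU is low.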
@@ -115,6 +115,7 @@ Class implementing cropping and passing crops through a neural network for detec
 -**overlap_y** (*float*): Percentage of overlap along the y-axis.
 -**show_crops** (*bool*): Whether to visualize the cropping.
 -**resize_initial_size** (*bool*): Whether to resize the results to the original image size (ps: slow operation).
+-**memory_optimize** (*bool*): Memory optimization option for segmentation (less accurate results when enabled).
 
 **CombineDetections**
 Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum Suppression).\
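The overlap percentages listed above control how far consecutive crop windows are stepped across the image. A rough illustration of the idea; the function and the crop-size parameters `crop_w`/`crop_h` are hypothetical, and the library's actual cropping also handles the ragged right/bottom edge, which this sketch omits:

```python
def crop_windows(img_w, img_h, crop_w, crop_h, overlap_x, overlap_y):
    """Top-left corners of overlapping crops: step by the crop size minus
    the requested percentage overlap along each axis."""
    step_x = max(1, int(crop_w * (1 - overlap_x / 100)))
    step_y = max(1, int(crop_h * (1 - overlap_y / 100)))
    xs = list(range(0, max(img_w - crop_w, 0) + 1, step_x))
    ys = list(range(0, max(img_h - crop_h, 0) + 1, step_y))
    return [(x, y) for y in ys for x in xs]

# 1000x500 image, 500x500 crops, 50% overlap on both axes:
print(crop_windows(1000, 500, 500, 500, 50, 50))  # [(0, 0), (250, 0), (500, 0)]
```

Higher overlap means more crops and slower inference, but fewer objects cut at crop borders.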
@@ -137,6 +138,7 @@ Visualizes custom results of object detection or segmentation on an image.
 -**classes_ids** (*list*): A list of class IDs for each detection.
 -**confidences** (*list*): A list of confidence scores corresponding to each bounding box. Default is an empty list.
 -**classes_names** (*list*): A list of class names corresponding to the class IDs. Default is an empty list.
+-**polygons** (*list*): A list containing NumPy arrays of polygon coordinates that represent segmentation masks.
 -**masks** (*list*): A list of masks. Default is an empty list.
 -**segment** (*bool*): Whether to perform instance segmentation. Default is False.
 -**show_boxes** (*bool*): Whether to show bounding boxes. Default is True.
@@ -170,13 +172,46 @@ visualize_results(
 img=result.image,
 confidences=result.filtered_confidences,
 boxes=result.filtered_boxes,
-masks=result.filtered_masks,
+polygons=result.filtered_polygons,
 classes_ids=result.filtered_classes_id,
 classes_names=result.filtered_classes_names,
 segment=False,
 )
 ```
 
+---
+
+## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
+
+In this approach, all operations under the hood are performed on binary masks of the recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, recognition accuracy improves significantly, which is especially noticeable when many objects of different sizes are densely packed. We therefore recommend this approach in production when accuracy matters more than speed and your computational resources allow storing hundreds of binary masks in RAM.
+
+The difference in usage lies in specifying the parameter ```memory_optimize=False``` in the ```MakeCropsDetectThem``` class.
+In that case, the informative values after processing are the following:
+
+1. img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.
+
+2. confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.
+
+3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
+
+4. masks: This attribute provides binary segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+
+5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
+
+6. classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.
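The RAM caveat above can be made concrete with a back-of-the-envelope estimate, assuming one `np.uint8` byte per pixel per stored full-resolution binary mask (the helper name and the scene below are hypothetical):

```python
import numpy as np

def masks_memory_mb(n_objects: int, height: int, width: int) -> float:
    """Approximate RAM needed to hold one full-resolution binary mask
    (np.uint8, 1 byte per pixel) per detected object."""
    return n_objects * height * width * np.uint8().nbytes / 1024**2

# Hypothetical scene: 300 objects detected on a 2000x3000 image.
print(masks_memory_mb(300, 2000, 3000))  # ~1717 MB, i.e. about 1.7 GB
```

Hundreds of masks on a large image quickly reach gigabytes, which is exactly when `memory_optimize=True` (polygons instead of masks) becomes attractive.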
examples/example_patch_based_inference.ipynb: 2 additions & 2 deletions
@@ -54,7 +54,7 @@
 "\n",
 "3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.\n",
 "\n",
-"4. polygons: If available, this attribute provides a list of polygon coordinates for masks when used for instance segmentation tasks.\n",
+"4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be used to accurately outline the boundaries of each object.\n",
 "\n",
 "5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.\n",
 "\n",
@@ -932,7 +932,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION::__\n",
+"## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__\n",
 "\n",
 "In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, the accuracy of recognition significantly improves, which is especially noticeable in cases where there are many objects of different sizes and they are densely packed. Therefore, we recommend using this approach in production if accuracy is important and not speed, and if your computational resources allow storing hundreds of binary masks in RAM.\n",
patched_yolo_infer/README.md: 44 additions & 5 deletions
@@ -27,6 +27,7 @@ YOLO-Patch-Based-Inference Example - [Open in Colab](https://colab.research.goog
 
 Example of using various functions for visualizing basic YOLOv8/v9 inference results and handling overlapping crops - [Open in Colab](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)
 
+
 ## Usage
 
 ### 1. Patch-Based-Inference
@@ -40,7 +41,7 @@ The output obtained from the process includes several attributes that can be lev
 
 3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
 
-4. masks: If available, this attribute provides segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be used to accurately outline the boundaries of each object.
 
 5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
 
@@ -72,7 +73,7 @@ result = CombineDetections(element_crops, nms_threshold=0.25, match_metric='IOS'
 img=result.image
 confidences=result.filtered_confidences
 boxes=result.filtered_boxes
-masks=result.filtered_masks
+polygons=result.filtered_polygons
 classes_ids=result.filtered_classes_id
 classes_names=result.filtered_classes_names
 ```
@@ -96,14 +97,18 @@ Class implementing cropping and passing crops through a neural network for detec
 -**overlap_y** (*float*): Percentage of overlap along the y-axis.
 -**show_crops** (*bool*): Whether to visualize the cropping.
 -**resize_initial_size** (*bool*): Whether to resize the results to the original image size (ps: slow operation).
+-**memory_optimize** (*bool*): Memory optimization option for segmentation (less accurate results when enabled).
 
 **CombineDetections**
 Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum Suppression).\
 -**nms_threshold** (*float*): IoU/IoS threshold for non-maximum suppression.
 -**match_metric** (*str*): Matching metric, either 'IOU' or 'IOS'.
--**intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence parameter. If False, sorting is done only by confidence (usual NMS). (Default is True)
+-**intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence parameter.
+If False, sorting is done only by confidence (usual NMS). (Default is True)
+
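The `intelligent_sorter` option described in this hunk orders NMS candidates by rounded confidence first and box area second. A hypothetical sketch of such a sort key; the library's exact rounding and tie-breaking may differ:

```python
def sorter_key(detection):
    """Order candidates by confidence rounded to one decimal, then by box area,
    so that among near-equally confident boxes the larger one is kept by NMS."""
    confidence, box = detection  # box = [x_min, y_min, x_max, y_max]
    area = (box[2] - box[0]) * (box[3] - box[1])
    return (round(confidence, 1), area)

# Two hypothetical candidates with near-identical confidence:
dets = [(0.91, [0, 0, 10, 10]), (0.94, [0, 0, 4, 4])]
best_first = sorted(dets, key=sorter_key, reverse=True)
print(best_first[0])  # (0.91, [0, 0, 10, 10]) -- the larger box wins the tie
```

With plain confidence sorting the 0.94 fragment would win; area-aware tie-breaking prefers the better-localized larger box.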
 
 ---
 ### 2. Custom inference visualization:
@@ -115,6 +120,7 @@ Visualizes custom results of object detection or segmentation on an image.
 -**classes_ids** (*list*): A list of class IDs for each detection.
 -**confidences** (*list*): A list of confidence scores corresponding to each bounding box. Default is an empty list.
 -**classes_names** (*list*): A list of class names corresponding to the class IDs. Default is an empty list.
+-**polygons** (*list*): A list containing NumPy arrays of polygon coordinates that represent segmentation masks.
 -**masks** (*list*): A list of masks. Default is an empty list.
 -**segment** (*bool*): Whether to perform instance segmentation. Default is False.
 -**show_boxes** (*bool*): Whether to show bounding boxes. Default is True.
@@ -132,7 +138,8 @@ Visualizes custom results of object detection or segmentation on an image.
 -**show_confidences** (*bool*): If true and show_class=True, confidences near class are visualized. Default is False.
 -**axis_off** (*bool*): If true, axis is turned off in the final visualization. Default is True.
 -**show_classes_list** (*list*): If empty, visualize all classes. Otherwise, visualize only classes in the list.
--**return_image_array** (*bool*): If True, the function returns the image (BGR np.array) instead of displaying it. Default is False.
+-**return_image_array** (*bool*): If True, the function returns the image (BGR np.array) instead of displaying it.
+Default is False.
 
 
 Example of using:
@@ -147,9 +154,41 @@ visualize_results(
 img=result.image,
 confidences=result.filtered_confidences,
 boxes=result.filtered_boxes,
-masks=result.filtered_masks,
+polygons=result.filtered_polygons,
 classes_ids=result.filtered_classes_id,
 classes_names=result.filtered_classes_names,
 segment=False,
 )
+```
+
+---
+
+## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
+
+In this approach, all operations under the hood are performed on binary masks of the recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, recognition accuracy improves significantly, which is especially noticeable when many objects of different sizes are densely packed. We therefore recommend this approach in production when accuracy matters more than speed and your computational resources allow storing hundreds of binary masks in RAM.
+
+The difference in usage lies in specifying the parameter ```memory_optimize=False``` in the ```MakeCropsDetectThem``` class.
+In that case, the informative values after processing are the following:
+
+1. img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.
+
+2. confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.
+
+3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
+
+4. masks: This attribute provides binary segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+
+5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
+
+6. classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.
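To connect the two representations used in this PR, here is a slow pure-NumPy reference that rasterizes a polygon into the kind of binary mask described above, using the even-odd crossing rule. This is only an illustrative sketch, not the library's implementation (which would typically use a fast routine such as `cv2.fillPoly`):

```python
import numpy as np

def polygon_to_mask(poly: np.ndarray, h: int, w: int) -> np.ndarray:
    """Rasterize an (N, 2) array of (x, y) vertices into an h x w binary mask
    using the even-odd (crossing-number) rule. Slow reference implementation."""
    ys, xs = np.mgrid[0:h, 0:w]
    inside = np.zeros((h, w), dtype=bool)
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if y1 == y2:  # horizontal edges never change the crossing parity
            continue
        # Toggle pixels whose leftward horizontal ray crosses this edge.
        crosses = ((y1 <= ys) != (y2 <= ys)) & (
            xs < (x2 - x1) * (ys - y1) / (y2 - y1) + x1
        )
        inside ^= crosses
    return inside.astype(np.uint8)

square = np.array([[2, 2], [8, 2], [8, 8], [2, 8]], dtype=float)
mask = polygon_to_mask(square, 10, 10)
print(int(mask.sum()))  # 36 interior pixels
```

Going the other way (mask to polygon) is the lossy step; that loss of boundary detail is the accuracy trade-off `memory_optimize=True` makes.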