patched_yolo_infer/README.md
Interactive notebooks are provided to showcase the functionality of the library.

__Check these Colab examples:__
Patch-Based-Inference Example - [Open in Colab](https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing)
Example of using various functions for visualizing basic YOLOv8/v9 inference results and handling overlapping crops - [Open in Colab](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)
## Usage
### 1. Patch-Based-Inference
The output obtained from the process includes several attributes that can be leveraged:
3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be utilized to accurately outline the boundaries of each object.
5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
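The box format above maps directly onto NumPy slicing; a minimal sketch with made-up boxes (hypothetical values, not real model output) that crops each detection out of the image:

```python
import numpy as np

# Toy stand-ins for the attributes described above
image = np.zeros((100, 100, 3), dtype=np.uint8)
boxes = [[10, 20, 40, 60], [50, 50, 90, 90]]  # [x_min, y_min, x_max, y_max]

# NumPy indexes rows (y) first, so slice as [y_min:y_max, x_min:x_max]
crops = [image[y_min:y_max, x_min:x_max] for x_min, y_min, x_max, y_max in boxes]
print(crops[0].shape)  # (40, 30, 3): height = y_max - y_min, width = x_max - x_min
```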
```python
result = CombineDetections(element_crops, nms_threshold=0.25, match_metric='IOS')

img = result.image
confidences = result.filtered_confidences
boxes = result.filtered_boxes
polygons = result.filtered_polygons
classes_ids = result.filtered_classes_id
classes_names = result.filtered_classes_names
```
Class implementing cropping and passing crops through a neural network for detection.

- **overlap_y** (*float*): Percentage of overlap along the y-axis.
- **show_crops** (*bool*): Whether to visualize the cropping.
- **resize_initial_size** (*bool*): Whether to resize the results to the original image size (note: this is a slow operation).
- **memory_optimize** (*bool*): Memory optimization option for segmentation (less accurate results when enabled).
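As a rough illustration of how the overlap percentages above could govern crop placement, here is a hypothetical tiling helper; `crop_origins`, `crop_w`, and `crop_h` are illustrative names, not the library's API:

```python
# Sketch: percentage overlaps translate into step sizes between crop windows.
def crop_origins(img_w, img_h, crop_w, crop_h, overlap_x, overlap_y):
    # A 50% overlap means each window advances by half its size
    step_x = max(1, int(crop_w * (1 - overlap_x / 100)))
    step_y = max(1, int(crop_h * (1 - overlap_y / 100)))
    return [(x, y)
            for y in range(0, max(img_h - crop_h, 0) + 1, step_y)
            for x in range(0, max(img_w - crop_w, 0) + 1, step_x)]

print(len(crop_origins(100, 100, 50, 50, 50, 50)))  # 9: a 3x3 grid of crops
```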
**CombineDetections**
Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum Suppression).\
- **match_metric** (*str*): Matching metric, either 'IOU' or 'IOS'.
- **intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence. If False, sorting is done only by confidence, as in standard NMS. (Default is True)
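The two matching metrics can be sketched in a few lines of plain Python; this is an illustrative reimplementation, not the library's internal code:

```python
# Boxes are [x_min, y_min, x_max, y_max], as elsewhere in this README.
def _intersection(a, b):
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    return iw * ih

def iou(a, b):  # Intersection over Union
    inter = _intersection(a, b)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def ios(a, b):  # Intersection over Smaller area
    inter = _intersection(a, b)
    smaller = min((a[2] - a[0]) * (a[3] - a[1]), (b[2] - b[0]) * (b[3] - b[1]))
    return inter / smaller

print(iou([0, 0, 10, 10], [2, 2, 8, 8]), ios([0, 0, 10, 10], [2, 2, 8, 8]))  # 0.36 1.0
```

For a box fully nested inside a larger one, IoU stays modest while IoS reaches 1.0, which is why 'IOS' tends to suppress duplicates produced by overlapping crops more aggressively.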
---
### 2. Custom inference visualization:
Visualizes custom results of object detection or segmentation on an image.
- **classes_ids** (*list*): A list of class IDs for each detection.
- **confidences** (*list*): A list of confidence scores corresponding to each bounding box. Default is an empty list.
- **classes_names** (*list*): A list of class names corresponding to the class IDs. Default is an empty list.
- **polygons** (*list*): A list containing NumPy arrays of polygon coordinates that represent segmentation masks.
- **masks** (*list*): A list of masks. Default is an empty list.
- **segment** (*bool*): Whether to perform instance segmentation. Default is False.
- **show_boxes** (*bool*): Whether to show bounding boxes. Default is True.
```python
visualize_results(
    img=result.image,
    confidences=result.filtered_confidences,
    boxes=result.filtered_boxes,
    polygons=result.filtered_polygons,
    classes_ids=result.filtered_classes_id,
    classes_names=result.filtered_classes_names,
    segment=False,
)
```
---
## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
In this approach, all operations under the hood are performed on binary masks of the recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, recognition accuracy improves significantly, which is especially noticeable when many objects of different sizes are densely packed. We therefore recommend this approach in production when accuracy matters more than speed and your computational resources allow storing hundreds of binary masks in RAM.
The only difference in usage lies in specifying the parameter `memory_optimize=False` in the `MakeCropsDetectThem` class.
In such a case, the informative values after processing are the following:
1. img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.
2. confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.
3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
4. masks: This attribute provides segmentation binary masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
6. classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.
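A binary mask pins down an object's footprint pixel by pixel; a minimal sketch with a hand-made mask (hypothetical values, not real model output) showing the kind of measurements it enables:

```python
import numpy as np

# Hypothetical binary mask of one detected object (0 = background, 1 = object)
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:6, 3:7] = 1

area = int(mask.sum())  # pixel-accurate object area
ys, xs = np.nonzero(mask)
# Tight bounding box recovered from the mask, [x_min, y_min, x_max, y_max]
box = [int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1]
print(area, box)  # 16 [3, 2, 7, 6]
```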