Commit f4cd05c ("new docs"), 1 parent: 2fe900c

3 files changed: 84 additions & 10 deletions

File tree: README.md · examples/example_patch_based_inference.ipynb · patched_yolo_infer/README.md

README.md
Lines changed: 38 additions & 3 deletions
@@ -59,7 +59,7 @@ The output obtained from the process includes several attributes that can be lev
 
 3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
 
-4. masks: If available, this attribute provides segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be utilized to accurately outline the boundaries of each object.
 
 5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
@@ -91,7 +91,7 @@ result = CombineDetections(element_crops, nms_threshold=0.25, match_metric='IOS'
 img=result.image
 confidences=result.filtered_confidences
 boxes=result.filtered_boxes
-masks=result.filtered_masks
+polygons=result.filtered_polygons
 classes_ids=result.filtered_classes_id
 classes_names=result.filtered_classes_names
 ```
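The hunk header above shows `CombineDetections` being called with `match_metric='IOS'`. As an illustrative sketch of the difference between the two metrics (this is an assumption about their standard definitions, not the library's actual implementation): IoU divides the intersection by the union of the two boxes, while IoS divides it by the smaller box's area.

```python
def box_overlap(a, b, metric="IOU"):
    """Overlap of two [x_min, y_min, x_max, y_max] boxes.
    'IOU' divides by the union area, 'IOS' by the smaller box's area."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    denom = area_a + area_b - inter if metric == "IOU" else min(area_a, area_b)
    return inter / denom if denom > 0 else 0.0

# A small box fully contained in a larger one:
small, large = [10, 10, 20, 20], [0, 0, 40, 40]
print(box_overlap(small, large, "IOU"))  # 0.0625
print(box_overlap(small, large, "IOS"))  # 1.0
```

Under this definition, IoS is well suited to patch-based inference: a detection truncated at a crop border tends to sit fully inside the detection from a neighbouring crop, so its IoS is high and it gets suppressed even when the IoU is low.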
@@ -115,6 +115,7 @@ Class implementing cropping and passing crops through a neural network for detec
 - **overlap_y** (*float*): Percentage of overlap along the y-axis.
 - **show_crops** (*bool*): Whether to visualize the cropping.
 - **resize_initial_size** (*bool*): Whether to resize the results to the original image size (ps: slow operation).
+- **memory_optimize** (*bool*): Memory optimization option for segmentation (less accurate results when enabled).
 
 **CombineDetections**
 Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum Suppression).\
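The `overlap_x`/`overlap_y` parameters above are percentages. A minimal sketch of how such a setting could translate into crop positions along one axis, assuming the percentage is the fraction of the crop size shared by neighbouring crops (the library's exact rule may differ):

```python
def crop_origins(image_size: int, crop_size: int, overlap_percent: float):
    """Top-left coordinates of crops along one axis.
    Assumes overlap_percent is the percentage of crop_size shared by
    neighbouring crops -- an assumption, not the library's exact rule."""
    step = max(1, int(crop_size * (1 - overlap_percent / 100)))
    origins = list(range(0, max(image_size - crop_size, 0) + 1, step))
    # Make sure the right/bottom edge of the image is still covered:
    if origins[-1] + crop_size < image_size:
        origins.append(image_size - crop_size)
    return origins

print(crop_origins(1000, 400, 50))  # [0, 200, 400, 600]
```

Higher overlap means more crops per axis and slower inference, but fewer objects cut by crop borders.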
@@ -137,6 +138,7 @@ Visualizes custom results of object detection or segmentation on an image.
 - **classes_ids** (*list*): A list of class IDs for each detection.
 - **confidences** (*list*): A list of confidence scores corresponding to each bounding box. Default is an empty list.
 - **classes_names** (*list*): A list of class names corresponding to the class IDs. Default is an empty list.
+- **polygons** (*list*): A list containing NumPy arrays of polygon coordinates that represent segmentation masks.
 - **masks** (*list*): A list of masks. Default is an empty list.
 - **segment** (*bool*): Whether to perform instance segmentation. Default is False.
 - **show_boxes** (*bool*): Whether to show bounding boxes. Default is True.
@@ -170,13 +172,46 @@ visualize_results(
 img=result.image,
 confidences=result.filtered_confidences,
 boxes=result.filtered_boxes,
-masks=result.filtered_masks,
+polygons=result.filtered_polygons,
 classes_ids=result.filtered_classes_id,
 classes_names=result.filtered_classes_names,
 segment=False,
 )
 ```
 
+---
+---
+
+## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
+
+In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, recognition accuracy improves significantly, which is especially noticeable when many objects of different sizes are densely packed. We therefore recommend this approach in production when accuracy matters more than speed, and when your computational resources allow storing hundreds of binary masks in RAM.
+
+The difference in usage lies in specifying the parameter ```memory_optimize=False``` in the ```MakeCropsDetectThem``` class.
+In such a case, the informative values after processing will be the following:
+
+1. img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.
+
+2. confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.
+
+3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
+
+4. masks: This attribute provides binary segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+
+5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
+
+6. classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.
+
+Here's how you can obtain them:
+```python
+img=result.image
+confidences=result.filtered_confidences
+boxes=result.filtered_boxes
+masks=result.filtered_masks
+classes_ids=result.filtered_classes_id
+classes_names=result.filtered_classes_names
+```
 
 [nb_example1]: https://nbviewer.org/github/Koldim2001/YOLO-Patch-Based-Inference/blob/main/examples/example_patch_based_inference.ipynb
 [colab_badge]: https://colab.research.google.com/assets/colab-badge.svg
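The new README section explains that `memory_optimize=False` keeps a full binary mask per object, trading RAM for accuracy. A back-of-the-envelope sketch of that trade-off (the image size, object count, and vertex count below are illustrative assumptions, not library defaults):

```python
import numpy as np

h, w = 2160, 3840          # a 4K source image (assumed size)
n_objects = 300            # a densely packed scene (assumed count)

# Full binary mask per object (memory_optimize=False):
mask_bytes = np.zeros((h, w), dtype=bool).nbytes
print(f"masks:    {n_objects * mask_bytes / 1e9:.1f} GB")   # ~2.5 GB

# Polygon outline per object (memory_optimize=True), ~200 vertices each:
poly_bytes = np.zeros((200, 2), dtype=np.float32).nbytes
print(f"polygons: {n_objects * poly_bytes / 1e6:.2f} MB")   # ~0.48 MB
```

The gap of several orders of magnitude is why the polygon representation is the memory-optimized default, and why the mask-based mode is reserved for machines that can afford it.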

examples/example_patch_based_inference.ipynb

Lines changed: 2 additions & 2 deletions
@@ -54,7 +54,7 @@
 "\n",
 "3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.\n",
 "\n",
-"4. polygons: If available, this attribute provides a list of polygon coordinates for masks when used for instance segmentation tasks.\n",
+"4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be utilized to accurately outline the boundaries of each object.\n",
 "\n",
 "5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.\n",
 "\n",
@@ -932,7 +932,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION::__\n",
+"## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__\n",
 "\n",
 "In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, the accuracy of recognition significantly improves, which is especially noticeable in cases where there are many objects of different sizes and they are densely packed. Therefore, we recommend using this approach in production if accuracy is important and not speed, and if your computational resources allow storing hundreds of binary masks in RAM.\n",
 "\n",

patched_yolo_infer/README.md

Lines changed: 44 additions & 5 deletions
@@ -27,6 +27,7 @@ YOLO-Patch-Based-Inference Example - [Open in Colab](https://colab.research.goog
 
 Example of using various functions for visualizing basic YOLOv8/v9 inference results and handling overlapping crops - [Open in Colab](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)
 
+
 ## Usage
 
 ### 1. Patch-Based-Inference
@@ -40,7 +41,7 @@ The output obtained from the process includes several attributes that can be lev
 
 3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
 
-4. masks: If available, this attribute provides segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be utilized to accurately outline the boundaries of each object.
 
 5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
@@ -72,7 +73,7 @@ result = CombineDetections(element_crops, nms_threshold=0.25, match_metric='IOS'
 img=result.image
 confidences=result.filtered_confidences
 boxes=result.filtered_boxes
-masks=result.filtered_masks
+polygons=result.filtered_polygons
 classes_ids=result.filtered_classes_id
 classes_names=result.filtered_classes_names
 ```
@@ -96,14 +97,18 @@ Class implementing cropping and passing crops through a neural network for detec
 - **overlap_y** (*float*): Percentage of overlap along the y-axis.
 - **show_crops** (*bool*): Whether to visualize the cropping.
 - **resize_initial_size** (*bool*): Whether to resize the results to the original image size (ps: slow operation).
+- **memory_optimize** (*bool*): Memory optimization option for segmentation (less accurate results when enabled).
 
 **CombineDetections**
 Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum Suppression).\
 **Args:**
 - **element_crops** (*MakeCropsDetectThem*): Object containing crop information.
 - **nms_threshold** (*float*): IoU/IoS threshold for non-maximum suppression.
 - **match_metric** (*str*): Matching metric, either 'IOU' or 'IOS'.
-- **intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence parameter. If False, sorting will be done only by confidence (usual nms). (Dafault is True)
+- **intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence parameter.
+  If False, sorting will be done only by confidence (usual NMS). (Default is True)
+
 
 ---
 ### 2. Custom inference visualization:
@@ -115,6 +120,7 @@ Visualizes custom results of object detection or segmentation on an image.
 - **classes_ids** (*list*): A list of class IDs for each detection.
 - **confidences** (*list*): A list of confidence scores corresponding to each bounding box. Default is an empty list.
 - **classes_names** (*list*): A list of class names corresponding to the class IDs. Default is an empty list.
+- **polygons** (*list*): A list containing NumPy arrays of polygon coordinates that represent segmentation masks.
 - **masks** (*list*): A list of masks. Default is an empty list.
 - **segment** (*bool*): Whether to perform instance segmentation. Default is False.
 - **show_boxes** (*bool*): Whether to show bounding boxes. Default is True.
@@ -132,7 +138,8 @@ Visualizes custom results of object detection or segmentation on an image.
 - **show_confidences** (*bool*): If true and show_class=True, confidences near class are visualized. Default is False.
 - **axis_off** (*bool*): If true, axis is turned off in the final visualization. Default is True.
 - **show_classes_list** (*list*): If empty, visualize all classes. Otherwise, visualize only classes in the list.
-- **return_image_array** (*bool*): If True, the function returns the image (BGR np.array) instead of displaying it. Default is False.
+- **return_image_array** (*bool*): If True, the function returns the image (BGR np.array) instead of displaying it.
+  Default is False.
 
 
 Example of using:
@@ -147,9 +154,41 @@ visualize_results(
 img=result.image,
 confidences=result.filtered_confidences,
 boxes=result.filtered_boxes,
-masks=result.filtered_masks,
+polygons=result.filtered_polygons,
 classes_ids=result.filtered_classes_id,
 classes_names=result.filtered_classes_names,
 segment=False,
 )
+```
+
+---
+
+## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
+
+In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, recognition accuracy improves significantly, which is especially noticeable when many objects of different sizes are densely packed. We therefore recommend this approach in production when accuracy matters more than speed, and when your computational resources allow storing hundreds of binary masks in RAM.
+
+The difference in usage lies in specifying the parameter ```memory_optimize=False``` in the ```MakeCropsDetectThem``` class.
+In such a case, the informative values after processing will be the following:
+
+1. img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.
+
+2. confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.
+
+3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
+
+4. masks: This attribute provides binary segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+
+5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
+
+6. classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.
+
+Here's how you can obtain them:
+```python
+img=result.image
+confidences=result.filtered_confidences
+boxes=result.filtered_boxes
+masks=result.filtered_masks
+classes_ids=result.filtered_classes_id
+classes_names=result.filtered_classes_names
 ```
