
Commit d690e59

Merge pull request #18 from Koldim2001/agnostic_nms
new update (nms + auto cropping)
2 parents c7f774a + 730563d commit d690e59

8 files changed

Lines changed: 338 additions & 64 deletions


README.md

Lines changed: 52 additions & 6 deletions
@@ -19,7 +19,7 @@ You can install the library via pip:
 pip install patched_yolo_infer
 ```

-[![PyPI Version](https://img.shields.io/pypi/v/patched-yolo-infer.svg)](https://pypi.org/project/patched-yolo-infer/) - Click here to visit the PyPI page for `patched-yolo-infer`, where you can find more information and documentation.
+[![PyPI Version](https://img.shields.io/pypi/v/patched-yolo-infer.svg)](https://pypi.org/project/patched-yolo-infer/) - Click here to visit the PyPI page of `patched-yolo-infer`.

 Note: If CUDA support is available, it's recommended to pre-install PyTorch with CUDA support before installing the library. Otherwise, the CPU version will be installed by default.
@@ -78,7 +78,7 @@ import cv2
 from patched_yolo_infer import MakeCropsDetectThem, CombineDetections

 # Load the image
-img_path = 'test_image.jpg'
+img_path = "test_image.jpg"
 img = cv2.imread(img_path)

 element_crops = MakeCropsDetectThem(
@@ -111,7 +111,7 @@ Class implementing cropping and passing crops through a neural network for detec

 | **Argument** | **Type** | **Default** | **Description** |
 |--------------|----------|-------------|-----------------|
-| image | np.ndarray | | Input image BGR. |
+| image | np.ndarray | | The input image in BGR format. |
 | model_path | str | "yolov8m.pt" | Path to the YOLO model. |
 | model | ultralytics model | None | Pre-initialized model object. If provided, the model will be used directly instead of loading from model_path. |
 | imgsz | int | 640 | Size of the input image for YOLO inference. |
@@ -138,8 +138,9 @@ Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum
 | element_crops | MakeCropsDetectThem | | Object containing crop information. |
 | nms_threshold | float | 0.3 | IoU/IoS threshold for non-maximum suppression. The lower the value, the fewer objects remain after suppression. |
 | match_metric | str | IOS | Matching metric, either 'IOU' or 'IOS'. |
+| class_agnostic_nms | bool | True | Determines the NMS mode in object detection. When set to True, NMS operates across all classes, ignoring class distinctions and suppressing less confident bounding boxes globally. Otherwise, NMS is applied separately for each class. |
 | intelligent_sorter | bool | True | Enable sorting by area and rounded confidence parameter. If False, sorting will be done only by confidence (usual NMS). |
-| sorter_bins | int | 10 | Number of bins to use for intelligent_sorter. A smaller number of bins makes the NMS more reliant on object sizes rather than confidence scores. |
+| sorter_bins | int | 5 | Number of bins to use for intelligent_sorter. A smaller number of bins makes the NMS more reliant on object sizes rather than confidence scores. |
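For orientation, here is a minimal sketch of how these arguments are typically wired together. The thresholds and crop settings are illustrative values only, and the `shape_x`/`shape_y`/`overlap_x`/`overlap_y` argument names follow the crop-size naming used in this library's usage example rather than anything shown in this hunk, so treat them as assumptions.

```python
import cv2
from patched_yolo_infer import MakeCropsDetectThem, CombineDetections

# Patch-based inference on a BGR image (illustrative crop settings)
img = cv2.imread("test_image.jpg")
element_crops = MakeCropsDetectThem(
    image=img,
    model_path="yolov8m.pt",
    shape_x=640,     # crop width in pixels (assumed argument name)
    shape_y=640,     # crop height in pixels (assumed argument name)
    overlap_x=25,    # horizontal overlap between crops, % (assumed argument name)
    overlap_y=25,    # vertical overlap between crops, % (assumed argument name)
)

# Combine detections from all crops using the NMS options from the table above
result = CombineDetections(
    element_crops=element_crops,
    nms_threshold=0.3,           # IoU/IoS suppression threshold
    match_metric="IOS",
    class_agnostic_nms=True,     # suppress across all classes globally
    intelligent_sorter=True,
    sorter_bins=5,
)
```

If duplicates from overlapping patches survive, lowering `nms_threshold` or `sorter_bins` is the first thing to try, as the tips below note.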
@@ -207,9 +208,12 @@ visualize_results(

 4. **Enhancing Detection Within Patches**: To detect more objects within a single crop, increase the `imgsz` parameter and lower the confidence threshold (`conf`). All parameters available for configuring Ultralytics model inference are also accessible during the initialization of the `MakeCropsDetectThem` element.

-5. **Handling Duplicate Suppression Issues**: If you encounter issues with duplicate suppression from overlapping patches, consider adjusting the `nms_threshold` and `sorter_bins` parameters in `CombineDetections` or modifying the overlap and size parameters of the patches themselves. (PS: often lowering `sorter_bins` to 5 or 4 can help).
+5. **Handling Duplicate Suppression Issues**: If you encounter issues with duplicate suppression from overlapping patches, consider adjusting the `nms_threshold` and `sorter_bins` parameters in `CombineDetections` or modifying the overlap and size parameters of the patches themselves. (PS: often lowering `sorter_bins` to 4 or 2 can help).
+
+6. **Handling Multi-Class Detection Issues**: If you are working on a multi-class detection or instance segmentation task, it may be beneficial to switch to `class_agnostic_nms=False` in the `CombineDetections` parameters. The default mode, with `class_agnostic_nms` set to True, is particularly effective when handling a large number of closely related classes in pre-trained YOLO networks (for example, when there is frequent confusion between classes like `car` and `truck`). If, in your scenario, an object of one class can physically be inside an object of another class, you should definitely set `class_agnostic_nms=False` (see the sketch after this list).
+
+7. **High-Quality Instance Segmentation**: For tasks requiring high-quality results in instance segmentation, detailed guidance is provided in the next section of the README.

-6. **High-Quality Instance Segmentation**: For tasks requiring high-quality results in instance segmentation, detailed guidance is provided in the next section of the README.
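To make the `class_agnostic_nms` distinction from tip 6 concrete, here is a small self-contained sketch of the two suppression modes. It is an illustration only, not the library's implementation, and it uses a plain greedy IoU-based NMS for simplicity.

```python
import numpy as np

def greedy_nms(boxes, scores, iou_thr=0.5):
    """Greedy IoU-based NMS. boxes: (N, 4) array of [x1, y1, x2, y2]."""
    order = scores.argsort()[::-1]   # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the kept box with every remaining candidate
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_rest - inter)
        order = rest[iou <= iou_thr]   # drop candidates that overlap too much
    return keep

def suppress(boxes, scores, class_ids, class_agnostic=True, iou_thr=0.5):
    """class_agnostic=True: one pass over all boxes; False: one pass per class."""
    if class_agnostic:
        return greedy_nms(boxes, scores, iou_thr)
    keep = []
    for c in np.unique(class_ids):
        idx = np.where(class_ids == c)[0]
        keep.extend(int(idx[k]) for k in greedy_nms(boxes[idx], scores[idx], iou_thr))
    return keep

# Two heavily overlapping boxes with different class ids (e.g. "car" vs "truck"):
boxes = np.array([[0, 0, 100, 100], [5, 5, 105, 105]], dtype=float)
scores = np.array([0.9, 0.8])
class_ids = np.array([2, 7])
print(suppress(boxes, scores, class_ids, class_agnostic=True))   # [0] - one box survives
print(suppress(boxes, scores, class_ids, class_agnostic=False))  # [0, 1] - both survive
```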
 ---

 ## __How to improve the quality of the algorithm for the task of instance segmentation:__
@@ -244,6 +248,47 @@ classes_names=result.filtered_classes_names

 An example of working with this mode is presented in Google Colab notebook - [![Open In Colab][colab_badge]][colab_ex1_memory_optimize]

+---
+
+## __How to automatically determine optimal parameters for patches (crops):__
+
+To efficiently process a large number of images of varying sizes and contents, manually selecting the optimal patch sizes and overlaps can be difficult. To address this, an algorithm has been developed to automatically calculate the best parameters for patches (crops).
+
+The `auto_calculate_crop_values` function operates in two modes:
+
+1. **Resolution-Based Analysis**: This mode evaluates the resolution of the source images to determine the optimal patch sizes and overlaps. It is faster but may not yield the highest quality results because it does not take into account the actual objects present in the images.
+
+2. **Neural Network-Based Analysis**: This advanced mode employs a neural network to analyze the images. The algorithm performs a standard inference of the network on the entire image and identifies the largest detected objects. Based on the sizes of these objects, it selects patch parameters so that the largest objects are fully contained within a patch, while overlapping patches ensure comprehensive coverage. In this mode, you must pass the model that will be used for patch-based inference in the subsequent steps.
+
+Possible arguments of the `auto_calculate_crop_values` function:
+| **Argument** | **Type** | **Default** | **Description** |
+|--------------|----------|-------------|-----------------|
+| image | np.ndarray | | The input image in BGR format. |
+| mode | str | "network_based" | The type of analysis to perform. Can be "resolution_based" for Resolution-Based Analysis or "network_based" for Neural Network-Based Analysis. |
+| model | ultralytics model | YOLO("yolov8m.pt") | Pre-initialized model object for "network_based" mode. If not provided, the default YOLOv8m model will be used. |
+| classes_list | list | None | A list of class indices to consider for object detection in "network_based" mode. If None, all classes will be considered. |
+| conf | float | 0.25 | The confidence threshold for detection in "network_based" mode. |
+
+Usage example:
+```python
+import cv2
+from ultralytics import YOLO
+from patched_yolo_infer import auto_calculate_crop_values
+
+# Load the image
+img_path = "test_image.jpg"
+img = cv2.imread(img_path)
+
+# Calculate the optimal crop size and overlap for an image
+shape_x, shape_y, overlap_x, overlap_y = auto_calculate_crop_values(
+    image=img, mode="network_based", model=YOLO("yolov8m.pt")
+)
+```
+
+An example of working with `auto_calculate_crop_values` is presented in Google Colab notebook - [![Open In Colab][colab_badge]][colab_ex1_auto_calculate_crop_values]
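A natural follow-up, not part of this commit, is feeding the returned values into `MakeCropsDetectThem`. The sketch below continues from the snippet above (reusing `img` and the YOLO model) and assumes the `shape_x`/`shape_y`/`overlap_x`/`overlap_y` argument names after which the returned values appear to be named.

```python
from ultralytics import YOLO
from patched_yolo_infer import MakeCropsDetectThem, CombineDetections

# Reuse the automatically calculated patch parameters for patch-based inference
element_crops = MakeCropsDetectThem(
    image=img,
    model=YOLO("yolov8m.pt"),
    shape_x=shape_x,        # assumed argument names, matching the returned values
    shape_y=shape_y,
    overlap_x=overlap_x,
    overlap_y=overlap_y,
)
result = CombineDetections(element_crops=element_crops, nms_threshold=0.3)
```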
 [nb_example1]: https://nbviewer.org/github/Koldim2001/YOLO-Patch-Based-Inference/blob/main/examples/example_patch_based_inference.ipynb
 [colab_badge]: https://colab.research.google.com/assets/colab-badge.svg
 [colab_ex1]: https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing
@@ -252,3 +297,4 @@ An example of working with this mode is presented in Google Colab notebook - [![
 [colab_ex2]: https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing
 [yt_link2]: https://www.youtube.com/watch?v=nBQuWa63188
 [colab_ex1_memory_optimize]: https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing#scrollTo=DM_eCc3yXzXW
+[colab_ex1_auto_calculate_crop_values]: https://FIX

patched_yolo_infer/README.md

Lines changed: 34 additions & 5 deletions
@@ -25,9 +25,9 @@ Interactive notebooks are provided to showcase the functionality of the library.

 __Check these Colab examples:__

-Patch-Based-Inference Example - [Open in Colab](https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing)
+Patch-Based-Inference Example - [**Open in Colab**](https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing)

-Example of using various functions for visualizing basic YOLOv8/v9 inference results - [Open in Colab](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)
+Example of using various functions for visualizing basic YOLOv8/v9 inference results - [**Open in Colab**](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)

 ## Usage
@@ -54,7 +54,7 @@ import cv2
 from patched_yolo_infer import MakeCropsDetectThem, CombineDetections

 # Load the image
-img_path = 'test_image.jpg'
+img_path = "test_image.jpg"
 img = cv2.imread(img_path)

 element_crops = MakeCropsDetectThem(
@@ -109,8 +109,9 @@ Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum
 - **element_crops** (*MakeCropsDetectThem*): Object containing crop information.
 - **nms_threshold** (*float*): IoU/IoS threshold for non-maximum suppression.
 - **match_metric** (*str*): Matching metric, either 'IOU' or 'IOS'.
+- **class_agnostic_nms** (*bool*): Determines the NMS mode in object detection. When set to True, NMS operates across all classes, ignoring class distinctions and suppressing less confident bounding boxes globally. Otherwise, NMS is applied separately for each class. (Default is True)
 - **intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence parameter. If False, sorting will be done only by confidence (usual NMS). (Default is True)
-- **sorter_bins** (*int*): Number of bins to use for intelligent_sorter. A smaller number of bins makes the NMS more reliant on object sizes rather than confidence scores. (Defaults to 10)
+- **sorter_bins** (*int*): Number of bins to use for intelligent_sorter. A smaller number of bins makes the NMS more reliant on object sizes rather than confidence scores. (Defaults to 5)

 ---
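A short sketch of the per-class mode described above, reusing an `element_crops` object created as in the Usage section; the threshold values are illustrative only.

```python
from patched_yolo_infer import CombineDetections

# Per-class NMS: boxes of different classes never suppress each other,
# which matters when an object of one class can legitimately lie inside another.
result = CombineDetections(
    element_crops=element_crops,   # MakeCropsDetectThem object from the Usage section
    nms_threshold=0.3,
    match_metric="IOS",
    class_agnostic_nms=False,
)
```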
@@ -166,7 +167,7 @@ visualize_results(

 ---

-## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
+## __How to improve the quality of the algorithm for the task of instance segmentation:__

 In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, the accuracy of recognition significantly improves, which is especially noticeable in cases where there are many objects of different sizes and they are densely packed. Therefore, we recommend using this approach in production when accuracy matters more than speed, and when your computational resources allow storing hundreds of binary masks in RAM.
@@ -194,4 +195,32 @@ boxes=result.filtered_boxes
 masks=result.filtered_masks
 classes_ids=result.filtered_classes_id
 classes_names=result.filtered_classes_names
+```
+
+---
+
+## __How to automatically determine optimal parameters for patches (crops):__
+
+To efficiently process a large number of images of varying sizes and contents, manually selecting the optimal patch sizes and overlaps can be difficult. To address this, an algorithm has been developed to automatically calculate the best parameters for patches (crops).
+
+The `auto_calculate_crop_values` function operates in two modes:
+
+1. **Resolution-Based Analysis**: This mode evaluates the resolution of the source images to determine the optimal patch sizes and overlaps. It is faster but may not yield the highest quality results because it does not take into account the actual objects present in the images.
+
+2. **Neural Network-Based Analysis**: This advanced mode employs a neural network to analyze the images. The algorithm performs a standard inference of the network on the entire image and identifies the largest detected objects. Based on the sizes of these objects, it selects patch parameters so that the largest objects are fully contained within a patch, while overlapping patches ensure comprehensive coverage. In this mode, you must pass the model that will be used for patch-based inference in the subsequent steps.
+
+Usage example:
+```python
+import cv2
+from ultralytics import YOLO
+from patched_yolo_infer import auto_calculate_crop_values
+
+# Load the image
+img_path = "test_image.jpg"
+img = cv2.imread(img_path)
+
+# Calculate the optimal crop size and overlap for an image
+shape_x, shape_y, overlap_x, overlap_y = auto_calculate_crop_values(
+    image=img, mode="network_based", model=YOLO("yolov8m.pt")
+)
 ```
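For completeness, here is a sketch of the faster resolution-based mode; per the argument table in the top-level README, no model is required in this mode.

```python
import cv2
from patched_yolo_infer import auto_calculate_crop_values

# Faster mode: patch parameters are derived from the image resolution alone
img = cv2.imread("test_image.jpg")
shape_x, shape_y, overlap_x, overlap_y = auto_calculate_crop_values(
    image=img, mode="resolution_based"
)
```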

patched_yolo_infer/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -3,6 +3,8 @@
     get_crops,
     visualize_results,
     create_masks_from_polygons,
+    basic_crop_size_calculation,
+    auto_calculate_crop_values
 )

 from .nodes.MakeCropsDetectThem import MakeCropsDetectThem
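With these exports in place, both helpers become importable from the package root; the signature of `basic_crop_size_calculation` is not shown in this commit, so only the import is sketched here.

```python
# Both new helpers are importable directly from the package root
from patched_yolo_infer import auto_calculate_crop_values, basic_crop_size_calculation
```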
