
Commit 79ad3ef

optimal crop size calc
1 parent a4f3855 commit 79ad3ef

3 files changed: 71 additions & 9 deletions


README.md

Lines changed: 35 additions & 1 deletion
@@ -78,7 +78,7 @@ import cv2
 from patched_yolo_infer import MakeCropsDetectThem, CombineDetections
 
 # Load the image
-img_path = 'test_image.jpg'
+img_path = "test_image.jpg"
 img = cv2.imread(img_path)
 
 element_crops = MakeCropsDetectThem(
@@ -213,6 +213,7 @@ visualize_results(
 6. **Handling Multi-Class Detection Issues**: If you are working on a multi-class detection or instance segmentation task, it may be beneficial to switch the mode to `class_agnostic_nms=False` in the `CombineDetections` parameters. The default mode, with `class_agnostic_nms` set to `True`, is particularly effective when handling a large number of closely related classes in pre-trained YOLO networks (for example, when there is often confusion between classes like `car` and `truck`). If, in your scenario, an object of one class can physically be inside an object of another class, you should definitely set `class_agnostic_nms=False`.
 
 7. **High-Quality Instance Segmentation**: For tasks requiring high-quality instance segmentation results, detailed guidance is provided in the next section of the README.
+
 ---
 
 ## __How to improve the quality of the algorithm for the task of instance segmentation:__
@@ -247,6 +248,38 @@ classes_names=result.filtered_classes_names
 
 An example of working with this mode is presented in Google Colab notebook - [![Open In Colab][colab_badge]][colab_ex1_memory_optimize]
 
+---
+
+## __How to automatically determine optimal parameters for patches (crops):__
+
+To efficiently process a large number of images of varying sizes and contents, manually selecting the optimal patch sizes and overlaps can be cumbersome. To address this, an algorithm has been developed to automatically calculate the best patch (crop) parameters.
+
+The `auto_calculate_crop_values` function operates in two modes:
+
+1. **Resolution-Based Analysis**: This mode evaluates the resolution of the source image to determine the optimal patch sizes and overlaps. It is faster, but may not yield the highest-quality results because it does not take into account the actual objects present in the image.
+
+2. **Neural Network-Based Analysis**: This advanced mode employs a neural network to analyze the image. The algorithm performs a standard inference of the network on the entire image and identifies the largest detected objects. Based on the sizes of these objects, it selects patch parameters so that the largest objects fit entirely within a single patch, while overlapping patches guarantee comprehensive coverage. In this mode, you must pass in the model that will be used for the subsequent patch-based inference.
+
+Usage example:
+```python
+import cv2
+from ultralytics import YOLO
+from patched_yolo_infer import auto_calculate_crop_values
+
+# Load the image
+img_path = "test_image.jpg"
+img = cv2.imread(img_path)
+
+# Calculate the optimal crop size and overlap for an image
+shape_x, shape_y, overlap_x, overlap_y = auto_calculate_crop_values(
+    image=img, mode="network_analysis", model=YOLO("yolov8m.pt")
+)
+```
+
+An example of working with `auto_calculate_crop_values` is presented in Google Colab notebook - [![Open In Colab][colab_badge]][colab_ex1_auto_calculate_crop_values]
+
 [nb_example1]: https://nbviewer.org/github/Koldim2001/YOLO-Patch-Based-Inference/blob/main/examples/example_patch_based_inference.ipynb
 [colab_badge]: https://colab.research.google.com/assets/colab-badge.svg
 [colab_ex1]: https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing
@@ -255,3 +288,4 @@ An example of working with this mode is presented in Google Colab notebook - [![Open In Colab][colab_badge]][colab_ex1_memory_optimize]
 [colab_ex2]: https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing
 [yt_link2]: https://www.youtube.com/watch?v=nBQuWa63188
 [colab_ex1_memory_optimize]: https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing#scrollTo=DM_eCc3yXzXW
+[colab_ex1_auto_calculate_crop_values]: https://FIX
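
Note: the values returned by `auto_calculate_crop_values` are designed to be fed straight into the patch-based pipeline. Below is a minimal sketch of that hand-off, assuming `MakeCropsDetectThem` accepts the patch geometry via `shape_x`, `shape_y`, `overlap_x`, and `overlap_y` keyword arguments — verify the names against the signature in your installed version:

```python
import cv2
from ultralytics import YOLO
from patched_yolo_infer import (
    CombineDetections,
    MakeCropsDetectThem,
    auto_calculate_crop_values,
)

img = cv2.imread("test_image.jpg")
model = YOLO("yolov8m.pt")

# Derive the patch geometry with the same model that will later
# run on the individual patches.
shape_x, shape_y, overlap_x, overlap_y = auto_calculate_crop_values(
    image=img, mode="network_analysis", model=model
)

# Assumption: these keyword arguments carry the patch geometry;
# adjust to the installed library version if the names differ.
element_crops = MakeCropsDetectThem(
    image=img,
    model=model,
    shape_x=shape_x,
    shape_y=shape_y,
    overlap_x=overlap_x,
    overlap_y=overlap_y,
)
result = CombineDetections(element_crops)
```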

patched_yolo_infer/README.md

Lines changed: 32 additions & 4 deletions
@@ -25,9 +25,9 @@ Interactive notebooks are provided to showcase the functionality of the library.
 
 __Check these Colab examples:__
 
-Patch-Based-Inference Example - [Open in Colab](https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing)
+Patch-Based-Inference Example - [**Open in Colab**](https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing)
 
-Example of using various functions for visualizing basic YOLOv8/v9 inference results - [Open in Colab](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)
+Example of using various functions for visualizing basic YOLOv8/v9 inference results - [**Open in Colab**](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)
 
 
 ## Usage
@@ -54,7 +54,7 @@ import cv2
 from patched_yolo_infer import MakeCropsDetectThem, CombineDetections
 
 # Load the image
-img_path = 'test_image.jpg'
+img_path = "test_image.jpg"
 img = cv2.imread(img_path)
 
 element_crops = MakeCropsDetectThem(
@@ -167,7 +167,7 @@ visualize_results(
 
 ---
 
-## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
+## __How to improve the quality of the algorithm for the task of instance segmentation:__
 
 In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, recognition accuracy improves significantly, which is especially noticeable when many objects of different sizes are densely packed. Therefore, we recommend this approach in production when accuracy matters more than speed and your computational resources allow storing hundreds of binary masks in RAM.
 
@@ -195,4 +195,32 @@ boxes=result.filtered_boxes
 masks=result.filtered_masks
 classes_ids=result.filtered_classes_id
 classes_names=result.filtered_classes_names
+```
+
+---
+
+## __How to automatically determine optimal parameters for patches (crops):__
+
+To efficiently process a large number of images of varying sizes and contents, manually selecting the optimal patch sizes and overlaps can be cumbersome. To address this, an algorithm has been developed to automatically calculate the best patch (crop) parameters.
+
+The `auto_calculate_crop_values` function operates in two modes:
+
+1. **Resolution-Based Analysis**: This mode evaluates the resolution of the source image to determine the optimal patch sizes and overlaps. It is faster, but may not yield the highest-quality results because it does not take into account the actual objects present in the image.
+
+2. **Neural Network-Based Analysis**: This advanced mode employs a neural network to analyze the image. The algorithm performs a standard inference of the network on the entire image and identifies the largest detected objects. Based on the sizes of these objects, it selects patch parameters so that the largest objects fit entirely within a single patch, while overlapping patches guarantee comprehensive coverage. In this mode, you must pass in the model that will be used for the subsequent patch-based inference.
+
+Usage example:
+```python
+import cv2
+from ultralytics import YOLO
+from patched_yolo_infer import auto_calculate_crop_values
+
+# Load the image
+img_path = "test_image.jpg"
+img = cv2.imread(img_path)
+
+# Calculate the optimal crop size and overlap for an image
+shape_x, shape_y, overlap_x, overlap_y = auto_calculate_crop_values(
+    image=img, mode="network_analysis", model=YOLO("yolov8m.pt")
+)
 ```

patched_yolo_infer/functions_extra.py

Lines changed: 4 additions & 4 deletions
@@ -503,7 +503,7 @@ def basic_crop_size_calculation(width, height):
     return crop_shape_x, crop_shape_y, crop_overlap_x, crop_overlap_y
 
 
-def auto_calculate_crop_values(image, type="network_analysis", model=None, classes_list=None):
+def auto_calculate_crop_values(image, mode="network_analysis", model=None, classes_list=None):
     """
     Automatically calculate the optimal crop size and overlap for an image.
 
@@ -513,7 +513,7 @@ def auto_calculate_crop_values(image, type="network_analysis", model=None, classes_list=None):
 
     Parameters:
         image (numpy.ndarray): The input BGR image.
-        type (str): The type of analysis to perform. Can be "image_size_analysis" or "network_analysis".
+        mode (str): The type of analysis to perform. Can be "image_size_analysis" or "network_analysis".
             Default is "network_analysis".
         model (YOLO): The YOLO model to use for object detection. If None, a default model yolov8m
             will be loaded. Default is None.
@@ -526,8 +526,8 @@ def auto_calculate_crop_values(image, type="network_analysis", model=None, classes_list=None):
     """
     height, width = image.shape[:2]
 
-    # If the type is 'image_size_analysis', calculate crop size based on image dimensions
-    if type == 'image_size_analysis':
+    # If the mode is 'image_size_analysis', calculate crop size based on image dimensions
+    if mode == 'image_size_analysis':
         crop_shape_x, crop_shape_y, crop_overlap_x, crop_overlap_y = basic_crop_size_calculation(width, height)
     else:
         # If no model is provided, load a default YOLO model
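
For reference, here is a minimal sketch of the faster resolution-only path enabled by the renamed `mode` parameter; no model is required because the crop geometry comes from `basic_crop_size_calculation` on the image dimensions alone (the image path is a hypothetical placeholder):

```python
import cv2
from patched_yolo_infer import auto_calculate_crop_values

# Hypothetical input image used purely for illustration.
img = cv2.imread("test_image.jpg")

# Resolution-based mode: geometry is derived from the image size only,
# so no YOLO model needs to be loaded.
shape_x, shape_y, overlap_x, overlap_y = auto_calculate_crop_values(
    image=img, mode="image_size_analysis"
)
print(shape_x, shape_y, overlap_x, overlap_y)
```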
