Commit 83e5311

Merge pull request #7 from Koldim2001/fix_memory
Fix memory
2 parents 961d5b6 + 6351511 commit 83e5311

9 files changed

Lines changed: 564 additions & 281 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -18,4 +18,5 @@ setup.cfg
 .pypirc
 build
 info_how_pip_upload.txt
+examples/patched_yolo_infer
 **.ipynb

README.md

Lines changed: 96 additions & 55 deletions
Large diffs are not rendered by default.

examples/example_patch_based_inference.ipynb

Lines changed: 355 additions & 207 deletions
Large diffs are not rendered by default.

patched_yolo_infer/README.md

Lines changed: 41 additions & 4 deletions
@@ -23,10 +23,11 @@ Interactive notebooks are provided to showcase the functionality of the library.
 
 __Check this Colab examples:__
 
-YOLO-Patch-Based-Inference Example - [Open in Colab](https://colab.research.google.com/drive/1FUao91GyB-ojGRN_okUxYyfagTT9tdsP?usp=sharing)
+Patch-Based-Inference Example - [Open in Colab](https://colab.research.google.com/drive/1XCpIYLMFEmGSO0XCOkSD7CcD9SFHSJPA?usp=sharing)
 
 Example of using various functions for visualizing basic YOLOv8/v9 inference results and handling overlapping crops - [Open in Colab](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)
 
+
 ## Usage
 
 ### 1. Patch-Based-Inference
@@ -40,7 +41,7 @@ The output obtained from the process includes several attributes that can be lev
 
 3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
 
-4. masks: If available, this attribute provides segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be utilized to accurately outline the boundaries of each object.
 
 5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
 
@@ -72,7 +73,7 @@ result = CombineDetections(element_crops, nms_threshold=0.25, match_metric='IOS'
 img=result.image
 confidences=result.filtered_confidences
 boxes=result.filtered_boxes
-masks=result.filtered_masks
+polygons=result.filtered_polygons
 classes_ids=result.filtered_classes_id
 classes_names=result.filtered_classes_names
 ```
@@ -96,6 +97,7 @@ Class implementing cropping and passing crops through a neural network for detec
 - **overlap_y** (*float*): Percentage of overlap along the y-axis.
 - **show_crops** (*bool*): Whether to visualize the cropping.
 - **resize_initial_size** (*bool*): Whether to resize the results to the original image size (ps: slow operation).
+- **memory_optimize** (*bool*): Memory optimization option for segmentation (less accurate results when enabled).
 
 **CombineDetections**
 Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum Suppression).\
@@ -105,6 +107,8 @@ Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum
 - **match_metric** (*str*): Matching metric, either 'IOU' or 'IOS'.
 - **intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence parameter. If False, sorting will be done only by confidence (usual nms). (Dafault is True)
 
+
+
 ---
 ### 2. Custom inference visualization:
 Visualizes custom results of object detection or segmentation on an image.
@@ -115,6 +119,7 @@ Visualizes custom results of object detection or segmentation on an image.
 - **classes_ids** (*list*): A list of class IDs for each detection.
 - **confidences** (*list*): A list of confidence scores corresponding to each bounding box. Default is an empty list.
 - **classes_names** (*list*): A list of class names corresponding to the class IDs. Default is an empty list.
+- **polygons** (*list*): A list containing NumPy arrays of polygon coordinates that represent segmentation masks.
 - **masks** (*list*): A list of masks. Default is an empty list.
 - **segment** (*bool*): Whether to perform instance segmentation. Default is False.
 - **show_boxes** (*bool*): Whether to show bounding boxes. Default is True.
@@ -147,9 +152,41 @@ visualize_results(
     img=result.image,
     confidences=result.filtered_confidences,
     boxes=result.filtered_boxes,
-    masks=result.filtered_masks,
+    polygons=result.filtered_polygons,
     classes_ids=result.filtered_classes_id,
     classes_names=result.filtered_classes_names,
     segment=False,
 )
+```
+
+---
+
+## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
+
+In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, the accuracy of recognition significantly improves, which is especially noticeable in cases where there are many objects of different sizes and they are densely packed. Therefore, we recommend using this approach in production if accuracy is important and not speed, and if your computational resources allow storing hundreds of binary masks in RAM.
+
+The difference in the approach to using the function lies in specifying the parameter ```memory_optimize=False``` in the ```MakeCropsDetectThem``` class.
+In such a case, the informative values after processing will be the following:
+
+1. img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.
+
+2. confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.
+
+3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
+
+4. masks: This attribute provides segmentation binary masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+
+5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
+
+6. classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.
+
+
+Here's how you can obtain them:
+```python
+img=result.image
+confidences=result.filtered_confidences
+boxes=result.filtered_boxes
+masks=result.filtered_masks
+classes_ids=result.filtered_classes_id
+classes_names=result.filtered_classes_names
+```
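
The RAM trade-off described in the new README section can be made concrete: a full-frame binary mask costs height × width bytes per object, while a polygon outline costs only a few bytes per boundary point. A minimal sketch of the difference (the 640×640 frame and 100-point polygon are illustrative assumptions, not values used by the library):

```python
import numpy as np

# One full-frame binary mask (memory_optimize=False): H * W bytes per object.
mask = np.zeros((640, 640), dtype=np.uint8)

# The same object stored as a polygon outline (memory_optimize=True):
# N boundary points, two uint16 coordinates each.
polygon = np.zeros((100, 2), dtype=np.uint16)

print(mask.nbytes)     # 409600 bytes per object
print(polygon.nbytes)  # 400 bytes per object
print(mask.nbytes // polygon.nbytes)  # 1024
```

With hundreds of detections per image, this three-orders-of-magnitude gap is why polygons are the default and full masks are reserved for accuracy-critical use.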

patched_yolo_infer/elements/CropElement.py

Lines changed: 27 additions & 4 deletions
@@ -25,14 +25,16 @@ def __init__(
         self.detected_cls = None  # List of classes of detected objects
         self.detected_xyxy = None  # List of lists containing xyxy box coordinates
         self.detected_masks = None  # List of np arrays containing masks in case of yolo-seg
+        self.polygons = None  # List of polygons points in case of using memory optimaze
 
         # Refined coordinates according to crop position information
         self.detected_xyxy_real = None  # List of lists containing xyxy box coordinates in values from source_image_resized or source_image
         self.detected_masks_real = None  # List of np arrays containing masks in case of yolo-seg with the size of source_image_resized or source_image
+        self.detected_polygons_real = None  # List of polygons points in case of using memory optimaze in values from source_image_resized or source_image
 
-    def calculate_inference(self, model, imgsz=640, conf=0.35, iou=0.7, segment=False, classes_list=None):
-        # Perform inference
+    def calculate_inference(self, model, imgsz=640, conf=0.35, iou=0.7, segment=False, classes_list=None, memory_optimize=False):
 
+        # Perform inference
         predictions = model.predict(self.crop, imgsz=imgsz, conf=conf, iou=iou, classes=classes_list, verbose=False)
 
         pred = predictions[0]
@@ -47,8 +49,13 @@ def calculate_inference(self, model, imgsz=640, conf=0.35, iou=0.7, segment=Fals
         self.detected_conf = pred.boxes.conf.cpu().numpy()
 
         if segment and len(self.detected_cls) != 0:
-            # Get the masks
-            self.detected_masks = pred.masks.data.cpu().numpy()
+            if memory_optimize:
+                # Get the polygons
+                self.polygons = [mask.astype(np.uint16) for mask in pred.masks.xy]
+            else:
+                # Get the masks
+                self.detected_masks = pred.masks.data.cpu().numpy()
+
 
     def calculate_real_values(self):
         # Calculate real values of bboxes and masks in source_image_resized
@@ -57,6 +64,7 @@ def calculate_real_values(self):
 
         self.detected_xyxy_real = []  # List of lists with xyxy box coordinates in the values of the source_image_resized
         self.detected_masks_real = []  # List of np arrays with masks in case of yolo-seg sized as source_image_resized
+        self.detected_polygons_real = []  # List of polygons in case of yolo-seg sized as source_image_resized
 
         for bbox in self.detected_xyxy:
             # Calculate real box coordinates based on the position information of the crop
@@ -81,10 +89,18 @@ def calculate_real_values(self):
                 # Append the masked image to the list of detected_masks_real
                 self.detected_masks_real.append(black_image)
 
+        if self.polygons is not None:
+            # Adjust the mask coordinates
+            for mask in self.polygons:
+                mask[:, 0] += x_start_global  # Add x_start_global to all x coordinates
+                mask[:, 1] += y_start_global  # Add y_start_global to all y coordinates
+                self.detected_polygons_real.append(mask.astype(np.uint16))
+
     def resize_results(self):
         # from source_image_resized to source_image sizes transformation
         resized_xyxy = []
         resized_masks = []
+        resized_polygons = []
 
         for bbox in self.detected_xyxy_real:
             # Resize bbox coordinates
@@ -101,5 +117,12 @@ def resize_results(self):
                                        interpolation=cv2.INTER_NEAREST)
             resized_masks.append(mask_resized)
 
+
+        for polygon in self.detected_polygons_real:
+            polygon[:, 0] = (polygon[:, 0] * (self.source_image.shape[1] / self.source_image_resized.shape[1])).astype(np.uint16)
+            polygon[:, 1] = (polygon[:, 1] * (self.source_image.shape[0] / self.source_image_resized.shape[0])).astype(np.uint16)
+            resized_polygons.append(polygon)
+
         self.detected_xyxy_real = resized_xyxy
         self.detected_masks_real = resized_masks
+        self.detected_polygons_real = resized_polygons
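
The polygon handling added to `CropElement` amounts to two affine steps: shift every point by the crop's top-left corner (crop-local coordinates to resized-source coordinates), then scale from the resized source image back to the original size. A standalone NumPy sketch of those two steps (the crop offset and image sizes are made-up values, not anything from the diff):

```python
import numpy as np

# A hypothetical polygon predicted inside a crop, as (N, 2) [x, y] points.
polygon = np.array([[10, 20], [50, 20], [50, 60]], dtype=np.uint16)

# Step 1 (as in calculate_real_values): shift by the crop's top-left corner.
x_start_global, y_start_global = 300, 100  # assumed crop position
polygon[:, 0] += x_start_global
polygon[:, 1] += y_start_global

# Step 2 (as in resize_results): scale from the resized source image
# back to the original image size.
source_shape = (2160, 3840)   # assumed original (H, W)
resized_shape = (1080, 1920)  # assumed resized (H, W)
polygon[:, 0] = (polygon[:, 0] * (source_shape[1] / resized_shape[1])).astype(np.uint16)
polygon[:, 1] = (polygon[:, 1] * (source_shape[0] / resized_shape[0])).astype(np.uint16)

print(polygon.tolist())  # [[620, 240], [700, 240], [700, 320]]
```

Shifting a handful of points this way is far cheaper than re-rendering a full-resolution binary mask per object, which is the whole point of the `memory_optimize` path.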

patched_yolo_infer/functions_extra.py

Lines changed: 20 additions & 4 deletions
@@ -264,6 +264,7 @@ def visualize_results(
     confidences=[],
     classes_names=[],
     masks=[],
+    polygons=[],
     segment=False,
     show_boxes=True,
     show_class=True,
@@ -342,7 +343,7 @@ def visualize_results(
         box = boxes[i]
         x_min, y_min, x_max, y_max = box
 
-        if segment:
+        if segment and len(masks) > 0:
             mask = masks[i]
             # Resize mask to the size of the original image using nearest neighbor interpolation
             mask_resized = cv2.resize(
@@ -354,11 +355,26 @@ def visualize_results(
             )
 
             if fill_mask:
-                color_mask = np.zeros_like(img)
-                color_mask[mask_resized > 0] = color
-                labeled_image = cv2.addWeighted(labeled_image, 1, color_mask, alpha, 0)
+                if alpha == 1:
+                    cv2.fillPoly(labeled_image, pts=mask_contours, color=color)
+                else:
+                    color_mask = np.zeros_like(img)
+                    color_mask[mask_resized > 0] = color
+                    labeled_image = cv2.addWeighted(labeled_image, 1, color_mask, alpha, 0)
 
             cv2.drawContours(labeled_image, mask_contours, -1, color, thickness)
+
+        elif segment and len(polygons) > 0:
+            if len(polygons[i]) > 0:
+                points = np.array(polygons[i].reshape((-1, 1, 2)), dtype=np.int32)
+                cv2.drawContours(labeled_image, [points], -1, color, thickness)
+                if fill_mask:
+                    if alpha == 1:
+                        cv2.fillPoly(labeled_image, pts=[points], color=color)
+                    else:
+                        mask_from_poly = np.zeros_like(img)
+                        color_mask_from_poly = cv2.fillPoly(mask_from_poly, pts=[points], color=color)
+                        labeled_image = cv2.addWeighted(labeled_image, 1, color_mask_from_poly, alpha, 0)
 
         # Write class label
         if show_boxes:
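
The new polygon branch in `visualize_results` first converts each stored `(N, 2)` polygon into the `(N, 1, 2)` `int32` layout that OpenCV contour functions such as `cv2.drawContours` and `cv2.fillPoly` expect. That conversion can be checked on its own without OpenCV (the sample polygon is hypothetical):

```python
import numpy as np

# A hypothetical filtered polygon as the library stores it: (N, 2) uint16 points.
polygon = np.array([[10, 20], [50, 20], [30, 60]], dtype=np.uint16)

# Mirror the reshape from the diff: OpenCV contours are (N, 1, 2) int32 arrays,
# i.e. a column of single-point rows.
points = np.array(polygon.reshape((-1, 1, 2)), dtype=np.int32)

print(points.shape)  # (3, 1, 2)
print(points.dtype)  # int32
```

The `int32` cast matters: the polygons are stored as `uint16` to save memory, but OpenCV's drawing API rejects unsigned 16-bit point arrays.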

patched_yolo_infer/nodes/CombineDetections.py

Lines changed: 17 additions & 5 deletions
@@ -25,6 +25,7 @@ class CombineDetections:
         detected_conf_list_full (list): List of detected confidences.
         detected_xyxy_list_full (list): List of detected bounding boxes.
         detected_masks_list_full (list): List of detected masks.
+        detected_polygons_list_full (list): List of detected polygons when memory optimization is enabled.
         detected_cls_id_list_full (list): List of detected class IDs.
         detected_cls_names_list_full (list): List of detected class names.
         filtered_indices (list): List of indices after non-maximum suppression.
@@ -33,6 +34,7 @@ class CombineDetections:
         filtered_classes_id (list): List of class IDs after non-maximum suppression.
         filtered_classes_names (list): List of class names after non-maximum suppression.
         filtered_masks (list): List of filtered (after nms) masks if segmentation is enabled.
+        filtered_polygons (list): List of filtered (after nms) polygons if segmentation and memory optimization are enabled.
     """
 
     def __init__(
@@ -54,20 +56,21 @@ def __init__(
         self.match_metric = match_metric
         self.intelligent_sorter = intelligent_sorter  # enable sorting by area and confidence parameter
 
-        # combinate detections of all patches
+        # Combinate detections of all patches
         (
             self.detected_conf_list_full,
             self.detected_xyxy_list_full,
             self.detected_masks_list_full,
-            self.detected_cls_id_list_full
+            self.detected_cls_id_list_full,
+            self.detected_polygons_list_full
         ) = self.combinate_detections(crops=self.crops)
 
         self.detected_cls_names_list_full = [
             self.class_names[value] for value in self.detected_cls_id_list_full
         ]  # make str list
 
         # Invoke the NMS for segmentation masks method for filtering predictions
-        if len(self.detected_masks_list_full)>0:
+        if len(self.detected_masks_list_full) > 0:
 
             self.filtered_indices = self.nms(
                 self.detected_conf_list_full,
@@ -93,10 +96,17 @@ def __init__(
         self.filtered_classes_id = [self.detected_cls_id_list_full[i] for i in self.filtered_indices]
         self.filtered_classes_names = [self.detected_cls_names_list_full[i] for i in self.filtered_indices]
 
-        if element_crops.segment:
+        # Masks filtering:
+        if element_crops.segment and not element_crops.memory_optimize:
             self.filtered_masks = [self.detected_masks_list_full[i] for i in self.filtered_indices]
         else:
             self.filtered_masks = []
+
+        # Polygons filtering:
+        if element_crops.segment and element_crops.memory_optimize:
+            self.filtered_polygons = [self.detected_polygons_list_full[i] for i in self.filtered_indices]
+        else:
+            self.filtered_polygons = []
 
     def combinate_detections(self, crops):
         """
@@ -113,14 +123,16 @@ def combinate_detections(self, crops):
         detected_xyxy = []
         detected_masks = []
         detected_cls = []
+        detected_polygons = []
 
         for crop in crops:
             detected_conf.extend(crop.detected_conf)
             detected_xyxy.extend(crop.detected_xyxy_real)
             detected_masks.extend(crop.detected_masks_real)
             detected_cls.extend(crop.detected_cls)
+            detected_polygons.extend(crop.detected_polygons_real)
 
-        return detected_conf, detected_xyxy, detected_masks, detected_cls
+        return detected_conf, detected_xyxy, detected_masks, detected_cls, detected_polygons
 
     @staticmethod
     def intersect_over_union(mask, masks_list):
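
`CombineDetections` matches overlapping detections with either `'IOU'` or `'IOS'`. A minimal sketch of the two metrics on boolean masks (the helper functions below are illustrative, not the class's internals): IoS divides the intersection by the smaller area, so a detection fully contained in another scores 1.0 and gets suppressed, even when plain IoU would be far below the NMS threshold:

```python
import numpy as np

def iou(mask_a, mask_b):
    # Intersection over union of two boolean masks.
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union

def ios(mask_a, mask_b):
    # Intersection over the smaller mask's area.
    inter = np.logical_and(mask_a, mask_b).sum()
    smaller = min(mask_a.sum(), mask_b.sum())
    return inter / smaller

big = np.zeros((100, 100), dtype=bool)
big[10:90, 10:90] = True    # 80x80 object
small = np.zeros((100, 100), dtype=bool)
small[20:40, 20:40] = True  # 20x20 duplicate fully inside the big one

print(iou(big, small))  # 0.0625 -> survives IoU-based NMS at threshold 0.25
print(ios(big, small))  # 1.0    -> suppressed with match_metric='IOS'
```

This is why the README examples pass `match_metric='IOS'` when merging crops: duplicates of one object across overlapping patches are often nested rather than merely overlapping.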

patched_yolo_infer/nodes/MakeCropsDetectThem.py

Lines changed: 6 additions & 1 deletion
@@ -29,6 +29,7 @@ class MakeCropsDetectThem:
             image size (ps: slow operation).
         model: Pre-initialized model object. If provided, the model will be used directly
             instead of loading from model_path.
+        memory_optimize (bool): Memory optimization option for segmentation (less accurate results)
 
     Attributes:
         model: YOLOv8 model loaded from the specified path.
@@ -48,6 +49,7 @@ class MakeCropsDetectThem:
         resize_initial_size (bool): Whether to resize the results to the original
             image size (ps: slow operation).
         class_names_dict (dict): Dictionary containing class names of the YOLO model.
+        memory_optimize (bool): Memory optimization option for segmentation (less accurate results)
     """
 
     def __init__(
@@ -60,12 +62,13 @@ def __init__(
         classes_list=None,
         segment=False,
         shape_x=700,
-        shape_y=700,
+        shape_y=600,
         overlap_x=25,
         overlap_y=25,
         show_crops=False,
         resize_initial_size=False,
         model=None,
+        memory_optimize=True
     ) -> None:
         if model is None:
             self.model = YOLO(model_path)  # Load the model from the specified path
@@ -84,6 +87,7 @@ def __init__(
        self.crops = []  # List to store the CropElement objects
         self.show_crops = show_crops  # Whether to visualize the cropping
         self.resize_initial_size = resize_initial_size  # slow operation !
+        self.memory_optimize = memory_optimize  # memory opimization option for segmentation
         self.class_names_dict = self.model.names
 
         self.crops = self.get_crops_xy(
@@ -195,6 +199,7 @@ def _detect_objects(self):
                 iou=self.iou,
                 segment=self.segment,
                 classes_list=self.classes_list,
+                memory_optimize=self.memory_optimize
             )
             crop.calculate_real_values()
             if self.resize_initial_size:
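
The `shape_x`/`shape_y`/`overlap_x`/`overlap_y` parameters define an overlapping crop grid: the stride between neighboring crops is the crop size times `(1 - overlap / 100)`. The sketch below is an illustrative reimplementation of that grid logic under that assumption, not the library's actual `get_crops_xy` (which, for instance, would also handle crops that extend past the image border):

```python
def crop_boxes(img_w, img_h, shape_x=700, shape_y=600, overlap_x=25, overlap_y=25):
    """Illustrative sketch: [x_min, y_min, x_max, y_max] of overlapping crops."""
    # 25% overlap -> each crop starts 75% of a crop-width after the previous one.
    step_x = int(shape_x * (1 - overlap_x / 100))
    step_y = int(shape_y * (1 - overlap_y / 100))
    boxes = []
    for y in range(0, max(img_h - shape_y, 0) + step_y, step_y):
        for x in range(0, max(img_w - shape_x, 0) + step_x, step_x):
            boxes.append((x, y, x + shape_x, y + shape_y))
    return boxes

boxes = crop_boxes(1920, 1080)
print(len(boxes))  # 12 crops for a 1920x1080 image with the new defaults
print(boxes[0])    # (0, 0, 700, 600)
```

Larger overlaps give objects near crop borders more chances to be fully contained in some crop, at the cost of more crops per image and more duplicates for `CombineDetections` to suppress.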

setup.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
     long_description = "\n" + fh.read()
 
 
-VERSION = '1.1.2'
+VERSION = '1.2.1'
 DESCRIPTION = '''YOLO-Patch-Based-Inference for detection/segmentation of small objects in images.'''
 
 setup(
