Commit f4cd05c ("new docs"), 1 parent: 2fe900c

3 files changed: 84 additions & 10 deletions

File tree: README.md · examples/example_patch_based_inference.ipynb · patched_yolo_infer/README.md

README.md
Lines changed: 38 additions & 3 deletions
@@ -59,7 +59,7 @@ The output obtained from the process includes several attributes that can be lev
 
 3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
 
-4. masks: If available, this attribute provides segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be utilized to accurately outline the boundaries of each object.
 
 5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
@@ -91,7 +91,7 @@ result = CombineDetections(element_crops, nms_threshold=0.25, match_metric='IOS'
 img=result.image
 confidences=result.filtered_confidences
 boxes=result.filtered_boxes
-masks=result.filtered_masks
+polygons=result.filtered_polygons
 classes_ids=result.filtered_classes_id
 classes_names=result.filtered_classes_names
 ```
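The hunk header above shows `CombineDetections` being called with `match_metric='IOS'`. As an illustrative sketch of the difference between the two metrics (this is an assumption about their standard definitions, not the library's actual implementation): IoU divides the intersection by the union of the two boxes, while IoS divides it by the smaller box's area.

```python
def box_overlap(a, b, metric="IOU"):
    """Overlap of two [x_min, y_min, x_max, y_max] boxes.
    'IOU' divides by the union area, 'IOS' by the smaller box's area."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    denom = area_a + area_b - inter if metric == "IOU" else min(area_a, area_b)
    return inter / denom if denom > 0 else 0.0

# A small box fully contained in a larger one:
small, large = [10, 10, 20, 20], [0, 0, 40, 40]
print(box_overlap(small, large, "IOU"))  # 0.0625
print(box_overlap(small, large, "IOS"))  # 1.0
```

Under this definition, IoS is well suited to patch-based inference: a detection truncated at a crop border tends to sit fully inside the detection from a neighbouring crop, so its IoS is high and it gets suppressed even when the IoU is low.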
@@ -115,6 +115,7 @@ Class implementing cropping and passing crops through a neural network for detec
 - **overlap_y** (*float*): Percentage of overlap along the y-axis.
 - **show_crops** (*bool*): Whether to visualize the cropping.
 - **resize_initial_size** (*bool*): Whether to resize the results to the original image size (ps: slow operation).
+- **memory_optimize** (*bool*): Memory optimization option for segmentation (less accurate results when enabled).
 
 **CombineDetections**
 Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum Suppression).\
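The `overlap_x`/`overlap_y` parameters above are percentages. A minimal sketch of how such a setting could translate into crop positions along one axis, assuming the percentage is the fraction of the crop size shared by neighbouring crops (the library's exact rule may differ):

```python
def crop_origins(image_size: int, crop_size: int, overlap_percent: float):
    """Top-left coordinates of crops along one axis.
    Assumes overlap_percent is the percentage of crop_size shared by
    neighbouring crops -- an assumption, not the library's exact rule."""
    step = max(1, int(crop_size * (1 - overlap_percent / 100)))
    origins = list(range(0, max(image_size - crop_size, 0) + 1, step))
    # Make sure the right/bottom edge of the image is still covered:
    if origins[-1] + crop_size < image_size:
        origins.append(image_size - crop_size)
    return origins

print(crop_origins(1000, 400, 50))  # [0, 200, 400, 600]
```

Higher overlap means more crops per axis and slower inference, but fewer objects cut by crop borders.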
@@ -137,6 +138,7 @@ Visualizes custom results of object detection or segmentation on an image.
 - **classes_ids** (*list*): A list of class IDs for each detection.
 - **confidences** (*list*): A list of confidence scores corresponding to each bounding box. Default is an empty list.
 - **classes_names** (*list*): A list of class names corresponding to the class IDs. Default is an empty list.
+- **polygons** (*list*): A list containing NumPy arrays of polygon coordinates that represent segmentation masks.
 - **masks** (*list*): A list of masks. Default is an empty list.
 - **segment** (*bool*): Whether to perform instance segmentation. Default is False.
 - **show_boxes** (*bool*): Whether to show bounding boxes. Default is True.
@@ -170,13 +172,46 @@ visualize_results(
 img=result.image,
 confidences=result.filtered_confidences,
 boxes=result.filtered_boxes,
-masks=result.filtered_masks,
+polygons=result.filtered_polygons,
 classes_ids=result.filtered_classes_id,
 classes_names=result.filtered_classes_names,
 segment=False,
 )
 ```
 
+---
+---
+
+## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
+
+In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, recognition accuracy improves significantly, which is especially noticeable when many objects of different sizes are densely packed. We therefore recommend this approach in production when accuracy matters more than speed, and when your computational resources allow storing hundreds of binary masks in RAM.
+
+The difference in usage lies in specifying the parameter ```memory_optimize=False``` in the ```MakeCropsDetectThem``` class.
+In such a case, the informative values after processing will be the following:
+
+1. img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.
+
+2. confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.
+
+3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
+
+4. masks: This attribute provides binary segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+
+5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
+
+6. classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.
+
+Here's how you can obtain them:
+```python
+img=result.image
+confidences=result.filtered_confidences
+boxes=result.filtered_boxes
+masks=result.filtered_masks
+classes_ids=result.filtered_classes_id
+classes_names=result.filtered_classes_names
+```
 
 [nb_example1]: https://nbviewer.org/github/Koldim2001/YOLO-Patch-Based-Inference/blob/main/examples/example_patch_based_inference.ipynb
 [colab_badge]: https://colab.research.google.com/assets/colab-badge.svg
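The new README section explains that `memory_optimize=False` keeps a full binary mask per object, trading RAM for accuracy. A back-of-the-envelope sketch of that trade-off (the image size, object count, and vertex count below are illustrative assumptions, not library defaults):

```python
import numpy as np

h, w = 2160, 3840          # a 4K source image (assumed size)
n_objects = 300            # a densely packed scene (assumed count)

# Full binary mask per object (memory_optimize=False):
mask_bytes = np.zeros((h, w), dtype=bool).nbytes
print(f"masks:    {n_objects * mask_bytes / 1e9:.1f} GB")   # ~2.5 GB

# Polygon outline per object (memory_optimize=True), ~200 vertices each:
poly_bytes = np.zeros((200, 2), dtype=np.float32).nbytes
print(f"polygons: {n_objects * poly_bytes / 1e6:.2f} MB")   # ~0.48 MB
```

The gap of several orders of magnitude is why the polygon representation is the memory-optimized default, and why the mask-based mode is reserved for machines that can afford it.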

examples/example_patch_based_inference.ipynb

Lines changed: 2 additions & 2 deletions
@@ -54,7 +54,7 @@
 "\n",
 "3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.\n",
 "\n",
-"4. polygons: If available, this attribute provides a list of polygon coordinates for masks when used for instance segmentation tasks.\n",
+"4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be utilized to accurately outline the boundaries of each object.\n",
 "\n",
 "5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.\n",
 "\n",
@@ -932,7 +932,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION::__\n",
+"## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__\n",
 "\n",
 "In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, the accuracy of recognition significantly improves, which is especially noticeable in cases where there are many objects of different sizes and they are densely packed. Therefore, we recommend using this approach in production if accuracy is important and not speed, and if your computational resources allow storing hundreds of binary masks in RAM.\n",
 "\n",

patched_yolo_infer/README.md

Lines changed: 44 additions & 5 deletions
@@ -27,6 +27,7 @@ YOLO-Patch-Based-Inference Example - [Open in Colab](https://colab.research.goog
 
 Example of using various functions for visualizing basic YOLOv8/v9 inference results and handling overlapping crops - [Open in Colab](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)
 
+
 ## Usage
 
 ### 1. Patch-Based-Inference
@@ -40,7 +41,7 @@ The output obtained from the process includes several attributes that can be lev
 
 3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
 
-4. masks: If available, this attribute provides segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+4. polygons: If available, this attribute provides a list containing NumPy arrays of polygon coordinates that represent segmentation masks corresponding to the detected objects. These polygons can be utilized to accurately outline the boundaries of each object.
 
 5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
@@ -72,7 +73,7 @@ result = CombineDetections(element_crops, nms_threshold=0.25, match_metric='IOS'
 img=result.image
 confidences=result.filtered_confidences
 boxes=result.filtered_boxes
-masks=result.filtered_masks
+polygons=result.filtered_polygons
 classes_ids=result.filtered_classes_id
 classes_names=result.filtered_classes_names
 ```
@@ -96,14 +97,18 @@ Class implementing cropping and passing crops through a neural network for detec
 - **overlap_y** (*float*): Percentage of overlap along the y-axis.
 - **show_crops** (*bool*): Whether to visualize the cropping.
 - **resize_initial_size** (*bool*): Whether to resize the results to the original image size (ps: slow operation).
+- **memory_optimize** (*bool*): Memory optimization option for segmentation (less accurate results when enabled).
 
 **CombineDetections**
 Class implementing combining masks/boxes from multiple crops + NMS (Non-Maximum Suppression).\
 **Args:**
 - **element_crops** (*MakeCropsDetectThem*): Object containing crop information.
 - **nms_threshold** (*float*): IoU/IoS threshold for non-maximum suppression.
 - **match_metric** (*str*): Matching metric, either 'IOU' or 'IOS'.
-- **intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence parameter. If False, sorting will be done only by confidence (usual nms). (Dafault is True)
+- **intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence parameter.
+  If False, sorting will be done only by confidence (usual NMS). (Default is True)
+
 
 ---
 ### 2. Custom inference visualization:
@@ -115,6 +120,7 @@ Visualizes custom results of object detection or segmentation on an image.
 - **classes_ids** (*list*): A list of class IDs for each detection.
 - **confidences** (*list*): A list of confidence scores corresponding to each bounding box. Default is an empty list.
 - **classes_names** (*list*): A list of class names corresponding to the class IDs. Default is an empty list.
+- **polygons** (*list*): A list containing NumPy arrays of polygon coordinates that represent segmentation masks.
 - **masks** (*list*): A list of masks. Default is an empty list.
 - **segment** (*bool*): Whether to perform instance segmentation. Default is False.
 - **show_boxes** (*bool*): Whether to show bounding boxes. Default is True.
@@ -132,7 +138,8 @@ Visualizes custom results of object detection or segmentation on an image.
 - **show_confidences** (*bool*): If true and show_class=True, confidences near class are visualized. Default is False.
 - **axis_off** (*bool*): If true, axis is turned off in the final visualization. Default is True.
 - **show_classes_list** (*list*): If empty, visualize all classes. Otherwise, visualize only classes in the list.
-- **return_image_array** (*bool*): If True, the function returns the image (BGR np.array) instead of displaying it. Default is False.
+- **return_image_array** (*bool*): If True, the function returns the image (BGR np.array) instead of displaying it.
+  Default is False.
 
 
 Example of using:
@@ -147,9 +154,41 @@ visualize_results(
 img=result.image,
 confidences=result.filtered_confidences,
 boxes=result.filtered_boxes,
-masks=result.filtered_masks,
+polygons=result.filtered_polygons,
 classes_ids=result.filtered_classes_id,
 classes_names=result.filtered_classes_names,
 segment=False,
 )
+```
+
+---
+
+## __HOW TO IMPROVE THE QUALITY OF THE ALGORITHM FOR THE TASK OF INSTANCE SEGMENTATION:__
+
+In this approach, all operations under the hood are performed on binary masks of recognized objects. Storing these masks consumes a lot of memory, so this method requires more RAM and slightly more processing time. However, recognition accuracy improves significantly, which is especially noticeable when many objects of different sizes are densely packed. We therefore recommend this approach in production when accuracy matters more than speed, and when your computational resources allow storing hundreds of binary masks in RAM.
+
+The difference in usage lies in specifying the parameter ```memory_optimize=False``` in the ```MakeCropsDetectThem``` class.
+In such a case, the informative values after processing will be the following:
+
+1. img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.
+
+2. confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.
+
+3. boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.
+
+4. masks: This attribute provides binary segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.
+
+5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.
+
+6. classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.
+
+Here's how you can obtain them:
+```python
+img=result.image
+confidences=result.filtered_confidences
+boxes=result.filtered_boxes
+masks=result.filtered_masks
+classes_ids=result.filtered_classes_id
+classes_names=result.filtered_classes_names
 ```
