# YOLO-Patch-Based-Inference

This Python library simplifies SAHI-like inference for instance segmentation tasks, enabling the detection of small objects in images. It caters to both object detection and instance segmentation, supporting a wide range of Ultralytics models.

The library also provides sleek customization of the visualization of inference results for all models, both in the standard approach (a direct network run) and the unique patch-based variant.

**Model Support**: The library supports multiple Ultralytics deep learning models, such as YOLOv8, YOLOv8-seg, YOLOv9, YOLOv9-seg, FastSAM, and RTDETR. Users can select from pre-trained options or use custom-trained models to best meet their task requirements.

#### Explanation of possible input arguments:

**MakeCropsDetectThem**\
Class implementing cropping and passing crops through a neural network for detection/segmentation:

| **Argument**        | **Type**          | **Default**  | **Description**                                                                                    |
|---------------------|-------------------|--------------|----------------------------------------------------------------------------------------------------|
| image               | np.ndarray        |              | Input image in BGR format.                                                                         |
| model_path          | str               | "yolov8m.pt" | Path to the YOLO model.                                                                            |
| model               | ultralytics model | None         | Pre-initialized model object. If provided, it is used directly instead of loading from model_path. |
| imgsz               | int               | 640          | Input image size for YOLO inference.                                                               |
| conf                | float             | 0.5          | Confidence threshold for YOLO detections.                                                          |
| iou                 | float             | 0.7          | IoU threshold for YOLO non-maximum suppression within a single crop.                               |
| classes_list        | List[int] or None | None         | List of classes to filter detections. If None, all classes are considered.                         |
| segment             | bool              | False        | Whether to perform segmentation (YOLOv8-seg).                                                      |
| shape_x             | int               | 700          | Crop size along the x-axis.                                                                        |
| shape_y             | int               | 600          | Crop size along the y-axis.                                                                        |
| overlap_x           | float             | 25           | Percentage of overlap along the x-axis.                                                            |
| overlap_y           | float             | 25           | Percentage of overlap along the y-axis.                                                            |
| show_crops          | bool              | False        | Whether to visualize the cropping.                                                                 |
| resize_initial_size | bool              | False        | Whether to resize the results to the original input image size (a slow operation).                 |
| memory_optimize     | bool              | True         | Memory optimization option for segmentation (less accurate results when enabled).                  |

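To build intuition for the crop-geometry parameters above, here is a minimal sketch of how `shape_x`/`shape_y` and the `overlap_x`/`overlap_y` percentages translate into overlapping crop windows. `crop_windows` is a hypothetical helper for illustration, not the library's internal implementation:

```python
# Illustrative sketch (NOT the library's internal code): derive crop windows
# from crop shape and overlap percentages.

def crop_windows(img_w, img_h, shape_x=700, shape_y=600, overlap_x=25, overlap_y=25):
    """Return (x_min, y_min, x_max, y_max) windows covering the image."""
    # A 25% overlap means each step advances by 75% of the crop size.
    step_x = max(1, int(shape_x * (1 - overlap_x / 100)))
    step_y = max(1, int(shape_y * (1 - overlap_y / 100)))
    windows = []
    for y in range(0, img_h, step_y):
        for x in range(0, img_w, step_x):
            # Clip each window to the image borders.
            windows.append((x, y, min(x + shape_x, img_w), min(y + shape_y, img_h)))
            if x + shape_x >= img_w:
                break
        if y + shape_y >= img_h:
            break
    return windows

# A 1600x900 image with default settings yields a 3x2 grid of crops:
print(crop_windows(1600, 900))
```

The overlap matters because an object lying on a crop boundary would otherwise be cut in half in every crop; with overlapping windows it appears whole in at least one of them.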
**CombineDetections**\
Class that combines masks/boxes from multiple crops and applies NMS (Non-Maximum Suppression):

| **Argument**       | **Type**            | **Default** | **Description**                                                                                             |
|--------------------|---------------------|-------------|-------------------------------------------------------------------------------------------------------------|
| element_crops      | MakeCropsDetectThem |             | Object containing crop information.                                                                         |
| nms_threshold      | float               | 0.3         | IoU/IoS threshold for non-maximum suppression.                                                              |
| match_metric       | str                 | 'IOS'       | Matching metric, either 'IOU' or 'IOS'.                                                                     |
| intelligent_sorter | bool                | True        | Enable sorting by area and rounded confidence. If False, sorting is done by confidence only (standard NMS). |

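The two matching metrics can be sketched in a few lines. `box_match` is a hypothetical helper, and treating 'IOS' as intersection over the smaller box's area is an assumption based on the common SAHI-style convention, not a quote of this library's code:

```python
# Sketch of the IoU/IoS matching metrics (assumption: IOS = intersection
# over the SMALLER box's area; IOU = intersection over the union).

def box_match(a, b, metric="IOS"):
    """a, b: (x_min, y_min, x_max, y_max). Returns overlap ratio in [0, 1]."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # intersection width
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # intersection height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    if metric == "IOU":
        denom = area_a + area_b - inter   # union area
    else:  # "IOS"
        denom = min(area_a, area_b)       # smaller box's area
    return inter / denom if denom else 0.0

# Two half-overlapping boxes, as produced by neighbouring crops:
print(box_match((0, 0, 100, 100), (50, 0, 150, 100), "IOU"))  # 1/3
print(box_match((0, 0, 100, 100), (50, 0, 150, 100), "IOS"))  # 0.5
```

IoS is a natural choice for patch-based inference: a small box fully contained in a larger one scores 1.0 under IoS even when its IoU is low, so duplicate fragments of one object coming from different crops are suppressed more reliably.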
---
### 2. Custom inference visualization:
Visualizes the results of patch-based object detection or segmentation on an image.\
Possible arguments of the ```visualize_results``` function:

| Argument               | Type          | Default                  | Description                                                                              |
|------------------------|---------------|--------------------------|------------------------------------------------------------------------------------------|
| img                    | numpy.ndarray |                          | The input image in BGR format.                                                           |
| boxes                  | list          |                          | A list of bounding boxes in the format [x_min, y_min, x_max, y_max].                     |
| classes_ids            | list          |                          | A list of class IDs for each detection.                                                  |
| confidences            | list          | []                       | A list of confidence scores corresponding to each bounding box.                          |
| classes_names          | list          | []                       | A list of class names corresponding to the class IDs.                                    |
| polygons               | list          | []                       | A list of NumPy arrays of polygon coordinates that represent segmentation masks.         |
| masks                  | list          | []                       | A list of binary segmentation masks.                                                     |
| segment                | bool          | False                    | Whether to visualize instance segmentation.                                              |
| show_boxes             | bool          | True                     | Whether to show bounding boxes.                                                          |
| show_class             | bool          | True                     | Whether to show class labels.                                                            |
| fill_mask              | bool          | False                    | Whether to fill the segmented regions with color.                                        |
| alpha                  | float         | 0.3                      | The transparency of filled masks.                                                        |
| color_class_background | tuple         | (0, 0, 255)              | The background BGR color for class labels.                                               |
| color_class_text       | tuple         | (255, 255, 255)          | The text color for class labels.                                                         |
| thickness              | int           | 4                        | The thickness of bounding boxes and text.                                                |
| font                   | cv2 font      | cv2.FONT_HERSHEY_SIMPLEX | The font type for class labels.                                                          |
| font_scale             | float         | 1.5                      | The scale factor for font size.                                                          |
| delta_colors           | int           | 0                        | The random seed offset for color variation.                                              |
| dpi                    | int           | 150                      | Final visualization size (the plot is bigger when dpi is higher).                        |
| random_object_colors   | bool          | False                    | If True, colors for each object are selected randomly.                                   |
| show_confidences       | bool          | False                    | If True and show_class=True, confidences are shown near the class labels.                |
| axis_off               | bool          | True                     | If True, the axis is turned off in the final visualization.                              |
| show_classes_list      | list          | []                       | If empty, visualize all classes; otherwise, visualize only the classes in the list.      |
| return_image_array     | bool          | False                    | If True, the function returns the image (BGR np.array) instead of displaying it.         |
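As a rough illustration of what the `fill_mask`/`alpha` pair means, here is a minimal NumPy sketch of alpha-blending a solid color into masked pixels. `blend_mask` is a hypothetical helper for illustration only, not the library's drawing code:

```python
import numpy as np

# Illustration of fill_mask/alpha semantics (NOT library code): blend a
# solid BGR color into the image wherever a binary mask is set.

def blend_mask(img, mask, color=(0, 0, 255), alpha=0.3):
    """img: HxWx3 BGR uint8, mask: HxW bool. Returns a blended copy."""
    out = img.astype(np.float32).copy()
    # out = (1 - alpha) * image + alpha * color, only inside the mask.
    out[mask] = (1 - alpha) * out[mask] + alpha * np.array(color, np.float32)
    return out.astype(np.uint8)

img = np.full((4, 4, 3), 100, np.uint8)   # uniform gray image
mask = np.zeros((4, 4), bool)
mask[1:3, 1:3] = True                      # 2x2 masked region in the center
blended = blend_mask(img, mask)
print(blended[1, 1], blended[0, 0])        # masked vs. untouched pixel
```

With `alpha=0.3` the masked pixels keep 70% of the original intensity, which is why filled masks stay translucent rather than hiding the object underneath.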


Example of usage: