seametrics

Library built by SEA.AI to help measure and improve the performance of AI projects.

Documentation

Install
pip install git+https://github.com/SEA-AI/seametrics

To install from a specific branch:

pip install git+https://github.com/SEA-AI/seametrics@branch-name

To install optional dependencies (for example, the fiftyone extra):

pip install "seametrics[fiftyone] @ git+https://github.com/SEA-AI/seametrics"

For more information about the optional dependencies, see the [project.optional-dependencies] section of pyproject.toml.

Hugging Face

Have a look at our Hugging Face organisation to browse through the available metrics.

PrecisionRecallF1Support

A modified cocoeval.py wrapped inside torchmetrics' mAP metric, operating on numpy arrays instead of torch tensors.

import numpy as np
from seametrics.detection import PrecisionRecallF1Support

predictions = [
    {
        "boxes": np.array(
            [
                [449.3, 197.75390625, 6.25, 7.03125],
                [334.3, 181.58203125, 11.5625, 6.85546875],
            ]
        ),
        "labels": np.array([0, 0]),
        "scores": np.array([0.153076171875, 0.72314453125]),
    }
]

ground_truth = [
    {
        "boxes": np.array(
            [
                [449.3, 197.75390625, 6.25, 7.03125],
                [334.3, 181.58203125, 11.5625, 6.85546875],
            ]
        ),
        "labels": np.array([0, 0]),
        "area": np.array([132.2, 83.8]),
    }
]

metric = PrecisionRecallF1Support()  # default settings
metric.update(preds=predictions, target=ground_truth)  # accumulate predictions and ground truth
metric.compute()["metrics"]  # summary keyed by area range label

This outputs:

{'all': {'range': [0, 10000000000.0],
  'iouThr': '0.50',
  'maxDets': 100,
  'tp': 0,
  'fp': 2,
  'fn': 2,
  'duplicates': 0,
  'precision': 0.0,
  'recall': 0.0,
  'f1': 0,
  'support': 2,
  'fpi': 0,
  'nImgs': 1}}

Where:

  • all is the area range label
  • range is the area range
  • iouThr is the IoU threshold in string format
  • maxDets is the maximum number of detections
  • tp, fp, fn are the numbers of true positives, false positives and false negatives
  • duplicates is the number of duplicates; a duplicate is a prediction that matches an already matched ground truth
  • precision, recall, f1 are the precision, recall and F1 score
  • support is the number of ground truth boxes
  • fpi is the false positive index
  • nImgs is the number of images
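
As a rough illustration of how the last three scores relate to the counts, here is a minimal sketch using the textbook definitions. The function name is hypothetical, and returning 0.0 for an empty denominator is an assumed convention, not necessarily what the library does internally.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, f1) from raw counts.

    A zero denominator yields 0.0 (assumed convention for this sketch).
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Counts taken from the example output above: tp=0, fp=2, fn=2
print(precision_recall_f1(0, 2, 2))

# A made-up case: 8 matches, 2 false positives, 2 missed ground truths
print(precision_recall_f1(8, 2, 2))
```

In the example output, support equals tp + fn (0 + 2 = 2), which is consistent with support being the number of ground truth boxes.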

Tracking Metrics

TrackingMetrics wraps motmetrics to compute standard MOT scores (MOTA, MOTP, IDF1, …). HOTAMetrics implements HOTA (Higher Order Tracking Accuracy), which jointly evaluates detection and association quality.

Both classes share the same interface and can be evaluated together in a single dataset pass using compute_all_metrics_by_sequence.

import fiftyone as fo
from seametrics.tracking import TrackingMetrics, HOTAMetrics
from seametrics.tracking.utils import compute_all_metrics_by_sequence, results_to_df

dataset = fo.load_dataset("my_dataset")
view = dataset.load_saved_view("my_view")

results = compute_all_metrics_by_sequence(
    view=view,
    gt_field="ground_truth",
    pred_fields=["model_a", "model_b"],
    metrics=[
        (TrackingMetrics, {"max_iou": 0.5}),
        (HOTAMetrics, {}),
    ],
)

The call returns a nested dict {pred_field: {metric_class_name: metric_instance}}. Convert any entry to a per-sequence DataFrame with results_to_df:

mot_df  = results_to_df(results["model_a"]["TrackingMetrics"])
hota_df = results_to_df(results["model_a"]["HOTAMetrics"])

TrackingMetrics DataFrame columns: sequence, num_frames, num_unique_objects, mota, motp, idf1, idp, idr, mostly_tracked, partially_tracked, mostly_lost, num_switches, num_false_positives, num_misses, num_fragmentations, precision, recall.
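
For orientation, MOTA combines three of the error counts listed above. The following is a minimal sketch of the standard definition, MOTA = 1 - (misses + false positives + switches) / total ground-truth objects. The function name and the num_gt_objects argument are illustrative: the total number of ground-truth boxes is not one of the listed columns, and whether the mota column is reported as a fraction or a percentage is not stated here.

```python
def mota(
    num_misses: int,
    num_false_positives: int,
    num_switches: int,
    num_gt_objects: int,
) -> float:
    """Standard MOTA as a fraction; can be negative when errors exceed ground truth."""
    if num_gt_objects <= 0:
        raise ValueError("num_gt_objects must be positive")
    errors = num_misses + num_false_positives + num_switches
    return 1.0 - errors / num_gt_objects


# Made-up counts for a single sequence
print(mota(num_misses=10, num_false_positives=5, num_switches=2, num_gt_objects=100))
```

Because the error counts are not bounded by the number of ground-truth objects, MOTA has no lower bound, which is worth keeping in mind when averaging it across sequences.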

HOTAMetrics DataFrame columns: sequence, hota, deta, assa, loca, num_unique_objects. Scores are expressed as percentages (0–100).
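
To see how hota relates to deta and assa, recall that in the HOTA definition the score at a single localization threshold is the geometric mean of detection accuracy and association accuracy, and the final score averages over thresholds. The sketch below uses made-up per-threshold values and hypothetical names; it illustrates the definition and is not the library's implementation.

```python
import math


def hota_at_threshold(det_a: float, ass_a: float) -> float:
    """Geometric mean of detection and association accuracy (inputs as fractions in [0, 1])."""
    return math.sqrt(det_a * ass_a)


# Made-up (DetA, AssA) pairs at three localization thresholds
per_threshold = [(0.70, 0.60), (0.60, 0.55), (0.40, 0.45)]

hota_values = [hota_at_threshold(d, a) for d, a in per_threshold]
hota = sum(hota_values) / len(hota_values)  # average over thresholds

# Scale to a percentage, matching the 0-100 convention described above
print(round(100 * hota, 2))
```

Since the averaging happens after the square root, the reported hota is generally not exactly the geometric mean of the reported deta and assa columns.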

num_unique_objects is included in both DataFrames so you can compute a track-count-weighted global score:

weighted_hota = (
    (hota_df["hota"] * hota_df["num_unique_objects"]).sum()
    / hota_df["num_unique_objects"].sum()
)
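
To show why the weighting matters, here is the same arithmetic on made-up rows in plain Python (no pandas). The sequence names, scores and the weighted_mean helper are illustrative only.

```python
# Made-up per-sequence results: one short sequence scores high,
# one crowded sequence scores lower.
rows = [
    {"sequence": "seq_a", "hota": 80.0, "num_unique_objects": 1},
    {"sequence": "seq_b", "hota": 50.0, "num_unique_objects": 9},
]


def weighted_mean(rows, value_key, weight_key):
    """Mean of rows[value_key] weighted by rows[weight_key]."""
    total_weight = sum(r[weight_key] for r in rows)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(r[value_key] * r[weight_key] for r in rows) / total_weight


unweighted = sum(r["hota"] for r in rows) / len(rows)
weighted = weighted_mean(rows, "hota", "num_unique_objects")

print(unweighted, weighted)  # 65.0 53.0
```

The plain average treats both sequences equally, while the weighted score is pulled toward the sequence that contains most of the tracks.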

Failed sequences (empty GT, empty predictions, or unexpected errors) are logged instead of raising an exception, and are accessible via metric_instance.failed_sequences.

