Added requirements and tracking script for SAM-2 #1358

Open
mwuel wants to merge 2 commits into biigle:dev-modules from mwuel:dev-modules

mwuel commented Jan 29, 2026

Tracking is implemented with the SAM-2 model by Facebook, using a custom frame-by-frame video predictor (https://github.com/Gy920/segment-anything-2-real-time). The original predictor (https://github.com/facebookresearch/sam2) was also considered but not used, because it requires loading the whole video, or at least chunks of it, into memory, which makes it less suitable for larger video files.

The script works by extracting a bounding box from the mask predicted by SAM-2, so tracking via a bounding box should be straightforward to implement in the future. For point and circle tracking, this means the tracked point is only derived from the mask and might behave unexpectedly. Point tracking in particular may look odd, because the output point is just the average over the whole mask. The point may end up outside the original object if the shape is not simple enough (e.g. the large wingspan of a bird might enlarge the bounding box and shift the midpoint outside the object).
The bounding box always keeps the dimensions of the initial input, so it can be easily adjusted. Using polygons as input for tracking has not been tried but might give even more accurate results. The old and new tracking scripts can be switched by modifying the videos.php config file.
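
To illustrate the point-outside-the-object issue, here is a minimal sketch (not the code from this PR) that derives a bounding box and an averaged point from a binary mask. The function names and the nested-list mask format are assumptions made for this example only; the actual script works on SAM-2 mask output.

```python
def mask_to_box_and_centroid(mask):
    """Return ((x_min, y_min, x_max, y_max), (cx, cy)) for a binary mask.

    `mask` is a list of rows containing 0/1 values (hypothetical format).
    Returns None if the mask has no foreground pixels.
    """
    points = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    box = (min(xs), min(ys), max(xs), max(ys))
    # The averaged point is the mean of all foreground pixel coordinates.
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    return box, centroid


def point_in_mask(mask, point):
    """Check whether the pixel nearest to `point` belongs to the mask."""
    x, y = round(point[0]), round(point[1])
    return bool(mask[y][x])


# A U-shaped object: the averaged point falls into the gap between the arms.
u_mask = [
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

box, centroid = mask_to_box_and_centroid(u_mask)
inside = point_in_mask(u_mask, centroid)
```

For this U-shaped mask, `inside` is `False`: both the averaged point and the center of the bounding box land on a background pixel, which is the behavior described above for non-simple shapes.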
Currently, the script tracks only every second frame. Tracking every frame significantly reduces computational performance, while skipping more than one frame gave worse tracking results because of missing context. The number of skipped frames can be easily adjusted in the script.
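
The frame-skipping setting can be pictured roughly as follows. This is only a sketch under assumptions: `frames_to_track` and its `skip` parameter are hypothetical names for illustration, not the identifiers used in the script.

```python
def frames_to_track(total_frames, skip=1):
    """Return the indices of the frames that are passed to the tracker.

    `skip` is the number of frames left out between two tracked frames:
    skip=0 tracks every frame, skip=1 tracks every second frame.
    """
    if skip < 0:
        raise ValueError("skip must be non-negative")
    return range(0, total_frames, skip + 1)


every_second = list(frames_to_track(7))          # skip=1, the current setting
every_frame = list(frames_to_track(7, skip=0))   # slower, full context
every_third = list(frames_to_track(7, skip=2))   # faster, less context
```

With seven frames, `skip=1` selects frames 0, 2, 4 and 6, while `skip=2` selects only 0, 3 and 6, which gives the tracker less context between steps.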

mzur (Member) commented Jan 30, 2026

Revert the logic that keeps the same size for circles. It should be possible to change the circle size during tracking (e.g. when the camera moves closer to the object).

Also think about new tracking behavior: Users draw a point, circle, box or polygon but the result of the tracking is always a polygon.

@mzur mzur linked an issue Jan 30, 2026 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Video Tracking SAM2
