Skip to content

Tracker: make announce lifecycle explicit (started/completed/stopped), fix retries and startup race#540

Open
ivoviz wants to merge 3 commits into
ikatson:mainfrom
ivoviz:main
Open

Tracker: make announce lifecycle explicit (started/completed/stopped), fix retries and startup race#540
ivoviz wants to merge 3 commits into
ikatson:mainfrom
ivoviz:main

Conversation

@ivoviz

@ivoviz ivoviz commented Jan 25, 2026

Copy link
Copy Markdown

Fixes #539 (may also address related tracker announce edge cases).

  • Make tracker announce lifecycle explicit (started, completed, stopped) - paused is still missing but should be straightforward to add in a follow-up
  • Send started only after the torrent is fully initialized and live (previously it could announce before hash checking finished, resulting in an incorrect left value)
  • Retry started announce with exponential backoff until first success
  • Prevent duplicate completed announces across restarts
  • Ensure stopped is sent once on pause/shutdown
  • Respect tracker-provided announce interval by using interval instead of min_interval

Edit: also made peers and peers6 Option as announce parsing would otherwise fail when using IPv6.

@ikatson

ikatson commented Jan 25, 2026

Copy link
Copy Markdown
Owner

Is the code AI generated?

@ivoviz

ivoviz commented Jan 25, 2026

Copy link
Copy Markdown
Author

Is the code AI generated?

The commit messages are, yes. Rust is definitely not my first language though, so if anything looks off I'm happy to revise it.

@ikatson

ikatson commented Jan 25, 2026

Copy link
Copy Markdown
Owner

I can't merge as-is for a bunch of reasons

It complicates the code and state with a bunch of variables. It's already complex enough, so we should be mindful of making it worse.

It spawns a lot of tokio tasks, and the pattern I use now is to not spawn unless necessary, i.e. reduce the number of spawns.

It doesn't use spawn_with_cancel. So I'm not sure it cleans up successfully at the end and doesn't leave lingering tasks around.

if I were to do it, I'd first think how do do it right without creating a hacky solution on top. That's why I thought it's LLMs who wrote this - it's just solving the immediate task at hand without thinking about the larger codebase and its health.

If I were to do it, I'd try creating some sort of event bus - for torrents, and for sessions, while keeping the tracker loop a Stream that consumes those events. Then the torrent_state/ components emit events like started/paused and whatever else might be useful. The tracker_comms task would subscribe to those events and do whatever needs to be done.

There are a few problems to resolve also related to this:

  • the code uses CancellationTokens today to ensure cleanup of all torrent-related tasks. The tracker_comms one gets killed the moment the torrent is paused. Now we can't do this anymore - or need to do smth smarter - cause pausing needs to not just emit the event, but also give the trackers task some time to process this event
  • we need to ensure we don't emit "started" events unless the torrent is actually live. This also steps into another issue - when resolving magnets, or initializing the torrent we probably shouldn't be emitting "started" events to trackers, cause we aren't "live" yet.

I understand it's more complex than just hacking it to work, but these were the reasons I haven't done this yet myself.

@ivoviz

ivoviz commented Jan 25, 2026

Copy link
Copy Markdown
Author

Yeah, I get the concerns and I agree. I ran into the same pause -> cancel issues you mentioned while testing, which is why the solution ended up a bit workaroundish.

I did try to be careful about cleaning up spawned tasks, but I agree that something closer to the approach you described would be much cleaner long term. I’ve since spent some more time digging into the codebase, but I’m still not sure what the right way to handle the stop/pause case is without introducing new hacks.

If you have any guidance on how you’d approach that part, I’m happy to give it another shot.

Also, down the line it would be nice to support manual per tracker announces and tracker tiers, but that’s probably a separate discussion.

@ikatson

ikatson commented Jan 25, 2026

Copy link
Copy Markdown
Owner

Thanks for your understanding. I'll think about how to approach it, need to refresh my memory of how all this works first. Will update here once I have an update

@ikatson

ikatson commented Jan 27, 2026

Copy link
Copy Markdown
Owner

Wrote some thoughts on this, formatted with Claude

Event bus per torrent

The core idea is to have an event bus per torrent. Things can emit events into it, subscribers can subscribe.

Can use a broadcast channel with a large buffer, e.g. 128 events. Can probably just store a broadcast::Sender and that's it - just a small wrapper around a broadcast channel.

This assumes events are relatively infrequent. If the channel fills up (it shouldn't), can use PeerRxTorrentInfo that is already passed into TrackerComms to understand what state the torrent is in.

Events

Started/Paused/Completed are obvious events to start off with.

In the future we might want more granularity like "resolving magnet" or whatever. On a side note, some time ago I tried adding ResolvingMagnet state in addition to Initializing/Paused/Live/Error and abandoned it as it was a bigger change than I anticipated.

Wiring it up

Create the event bus at the moment the torrent is added and just keep it around. Pass it into TrackerComms at creation time. Maybe combine with PeerRxTorrentInfo somehow to reduce the amount of individual state items passed into it.

Each tracker can subscribe to it and do its thing:

select {
    if live => loop forever
    on paused => pause and quit
    on completed => notify if needed (don't remember the protocol), keep looping
}

Notice, no new tokio::spawns are needed anywhere here.

Handling pause/cancel

While torrent is live, peer rx is driven by spawn_peer_adder. This one dies with the live cancellation token today. Instead of using live.spawn (which does spawn_with_cancel), need to tweak it to:

  • still react on cancel but give it some time to complete by itself, some fixed timeout say 5 seconds

This can be some extension to spawn_with_cancel, maybe a new function in spawn_utils.

Open issue: DHT

Ensuring peer_rx dies by itself on TorrentEvent::Paused would be awesome too. DHT implementation will get in the way though as it has no clue about cancellations today. Might need to wrap DHT stream in some way to stop itself (by dropping) on cancel.

@ivoviz

ivoviz commented Jan 28, 2026

Copy link
Copy Markdown
Author

Thanks for the detailed write-up, I’ll try to rework this along these lines in the next couple of days.

AFAIK the protocol, the completed event is independent from the others. It is sent once when the download finishes, while the loop itself keeps running.

The paused event actually corresponds to partial seeding. The current pause() function should therefore emit a stopped event, but it was probably just a typo from Claude.

@ikatson

ikatson commented Jan 28, 2026

Copy link
Copy Markdown
Owner

I think that's how I wrote it, but you are right, I didn't look at the protocol details while writing this, the challenge is not there, but in the refactoring itself.

but it was probably just a typo from Claude.

Here are my raw notes for what it's worth :)

prerequisite: event bus per torrent. Things can emit events into it, subscribers can subscribe.
Can use a broadcast channel with a large buffer, e.g. 128 events.
Can probably just store a broadcast::Sender and that's it - just a small wrapper around a broadcast channel.

That assumes events are relatively infrequent. If the channel fills up (it shouldn't) can use PeerRxTorrentInfo that is already passed into TrackerComms to understand what state the torrent is in.

Started/Paused/Completed are obvious events to start off with.
However in the future we might want more granuarity like "resolving magnet" or whatever. On a side note, some time ago I tried adding "ResolvingMagnet" state in addition to Initializing/Paused/Live/Error and abandoned it as it was a bigger change than I anticipated.

Create the event bus at the moment the torrent is added and just keep it around.
Pass it into TrackerComms at creation time. Maybe combine with PeerRxTorrentInfo somehow to reduce the amount of individual state items passed into it.

Each tracker can subscribe to it and do its thing - select {if live => loop forever, on paused => pause and quit, on completed => notify if needed (don't rememeber the protocol), and keep looping}

Notice, no new tokio::spawns are needed anywhere here.

While torrent is live, peer rx is driven by fn spawn_peer_adder. This one dies with the live cancellation token today. Instead of using live.spawn (which does spawn_with_cancel), need to tweak it to:

  • still react on cancel but give it some time to complete by itself, some fixed timeout say 5 seconds.
    This can be some extension to spawn_with_cancel, maybe a new function in spawn_utils

Ensuring peer_rx dies by itself on TorrentEvent::Paused would be awesome too. DHT implementation will get in the way though as it has no clue about cancellations today. Might need to wrap DHT stream in some way to stop itself (by dropping) on cancel.

@evanrichter

Copy link
Copy Markdown

I'm not familiar with the tracker protocol as it exists in real production use cases, but it seems excessive to have room for multiple (128) "current states" in memory in the event channel when only the most recent state should matter for announces.

From Bep 003:

event
This is an optional key which maps to started, completed, or stopped (or empty, which is the same as not being present). If not present, this is one of the announcements done at regular intervals. An announcement using started is sent when a download first begins, and one using completed is sent when the download is complete. No completed is sent if the file was complete when started. Downloaders send an announcement using stopped when they cease downloading.

It seems like TrackerComms already handles sending Started and the regular (empty event) announcements by sleeping/looping.

I agree that select! is the right thing to do when checking torrent event changes like Stopped. But in the scenario that there are a bunch of state changes in the queue, I don't think it makes sense to announce to the tracker every state change. Just report the most recent/accurate one. For that, there is tokio::sync::watch.

Maybe Completed is the only one that could be in a stream of Started and Stopped that we actually care to report? I would need to see how other clients handle this, or what trackers expect to be sure. If Completed is absolutely mandatory to report in the event field, then we could have at most 2 watch channels. One for "state" of Started/Stopped, and the other for "Completed" (just a bool!).


Your project seems very nice so far, but I'm just getting started looking at the internals, so forgive me if I'm way off base with my take here.

@ivoviz

ivoviz commented Feb 9, 2026

Copy link
Copy Markdown
Author

I’m actually coming from the other side here (meaning I wrote the tracker part). To be fair, most of these events only really matter on private trackers.

Started isn’t particularly important from the tracker’s perspective. You just check whether the peer already exists in your memory (or whatever storage you’re using), and if not, you insert it. Whether you received a Started event or not is basically irrelevant.

Stopped is important, otherwise the tracker could (and likely would) hand out dead peers to others.

Completed is important if you differentiate or prioritize what kind of peers you send to leechers. If you want to prefer sending seeders over leechers, you need to know when someone has finished downloading. It would still work without it, but having it speeds things up.

Paused isn’t mentioned here, but it would be nice to have. It basically represents a partial seeder, someone who doesn’t have all the files and doesn’t want to download the remaining ones either.

@evanrichter

Copy link
Copy Markdown

I guess I'll be that guy and post my Claude chat on this subject: https://claude.ai/share/27539cc4-627d-40e4-9859-7d0766e7be7b

Basically:

  • watch or some other cheap/single slot notify mechanism is all that the TrackerComms task needs to figure out what to send to the tracker and when
  • we might want to clarify "Paused" as it is the same word for GUI state of a torrent and a new tracker event field option that is a draft BEP (this usage relates to a peer being incomplete but still wanting to seed the pieces they do have)
  • an event bus can definitely be useful but TrackerComms would bear a lot more complexity if that is its notification source

@evanrichter

Copy link
Copy Markdown

@teohhanhui do you have something constructive to add to the discussion? I understand some people have negativity toward LLM usage, but I don't think I'm going against the policy of the project. I am an experienced rust developer (been using is since shortly after version 1.0) and I have already vetted the chatbot's output and used my own brain to think through things. I'm also upfront with exactly how I used AI and exactly where my experience is lacking.


For everyone else in the thread, I don't want to be seen as blocking a PR unnecessarily. I just want to help find a precise and efficient solution. The bittorrent protocol has interested me for a long time, and this project more recently. If there's a better place to have the design discussion let me know!

@ikatson

ikatson commented Feb 10, 2026

Copy link
Copy Markdown
Owner

Thanks @ivoviz @evanrichter for your input and expertise on the matter, it seems that you guys can figure out the best way to approach this.

tokio::sync::watch

this is reasonable

an event bus can definitely be useful but TrackerComms would bear a lot more complexity if that is its notification source

I was thinking to reuse the tracker bus in other parts to decouple things / decrease spaghetti. The one use-case I had at first was when I launched rqbit on Android through C bindings (none of this is merged), and I needed to let C know about state changes and other things like progress. sync::watch may work here too.

I'm not bound to event bus approach, as long as the PR doesn't make things worse. The current unidirectional design "tracker comms is a stream" doesn't work for this feature, so it needs to be refactored somehow.

@ivoviz

ivoviz commented Feb 10, 2026

Copy link
Copy Markdown
Author

@iPLAYCAFE

  1. Agree on using tokio::sync::watch.

  2. I can’t recall the exactly, but I’m pretty sure that in the current state (before this PR), when I tested it, the first announce (event=started) could happen before the initial hash check finished, even with fastresume enabled, which made the left parameter totally incorrect.

  3. I don’t think it’s necessary to store the completed state. It only happens once during the torrent’s lifetime, and when it does, we should just announce immediately. Unless the current codebase doesn’t distinguish between "finished downloading" and "hash check reporting 100%", we could simply trigger the announce when the last chunk is downloaded.

  4. Agree.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Announce timing and missing completed/stopped events

3 participants