You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Torrent's health zombie records with invalid last check time from the future infect Tribler nodes, spread from machine to machine, and can't be fixed by local health checks
#7295
Closed
kozlovsky opened this issue
Feb 20, 2023
· 1 comment
While reviewing #7286, I discovered a very interesting bug when incorrect health information with a future date and sometimes a huge number of seeders spreads from peer to peer infecting new peers and spreading even further. This becomes possible through a combination of several bugs in TorrentChecker, MetadataStore, and PopularityCommunity.
As I currently understand it, the bug works the following way:
Some Tribler clients' machines have totally incorrect local times. I think some active peer has local time +35 days in the future, and sometimes I also see records with a date several years in the future.
These nodes spread incorrect information to their peers.
What is important, during this update, the state of the TorrentState.self_checked flag does not alter. That means if a peer checked the health of this torrent at least once in the past, then the health record is considered a locally checked record, even if the info from the remote peer overrode it.
Then, sometime later, the infected Tribler client is restarted. After the restart, PopularityCommunity loads recently checked torrents to spread them to other peers. By "recently checked torrents," it means "self-checked torrents that were checked four hours ago or later":
All invalid torrents with future check dates that could override the local self-checked health suit this condition perfectly. And the correct locally checked health records are loaded only if they are no more than four hours old. As a result, it is pretty usual that a tribler client, when restarted, only loads invalid health records, hundreds of them, with dates a month or a year in the future, and starts spreading them to its peers.
defupdate_torrents_checked(self, new_result):
""" Update the set with torrents that we have checked ourselves. """infohash=new_result['infohash']
seeders=new_result['seeders']
new_result_tuple= (infohash, seeders, new_result['leechers'], new_result['last_check'])
ifseeders>0: # <-- hereself._torrents_checked[infohash] =new_result_tuple
As a result, when the Tribler client starts, it loads hundreds of invalid health records, often with a huge number of seeders, as if they were locally checked. Even if Tribler later checks this torrent and discovers that the actual number of seeders is zero, it does not remove the invalid health record from the cache and continues spreading it to its peers. So invalid health records with huge numbers of peers and time in the future behave like invulnerable zombies that infect other machines and continue spreading with an incorrect huge number of seeders even after the local check reveals zero seeders.
The discovery of this amazing combination of bugs becomes possible after the @drew2a refactoring of TorrentChecker. I fixed it in #7286 with the following fixes:
Invalid health records are rejected and do not "infect" local records (fixed here and here in 99d5725)
Health records just slightly in the future are accepted, as some minor time drift is unavoidable (fixed here in 99d5725).
Valid remote health records do not override local health records if the local check was very recent, as we trust local checks more (fixed here and here in 2bfc9da).
When a valid remote health record overrides the previous local record, it clears the self_checked flag, so the record is not considered by PopularityCommunity as local anymore (fixed here in 8a0eb9e).
If the number of seeders of the new valid local health record is zero, the torrent is removed from the cache of locally recently checked health records, so the local peer stops spreading information about this torrent (fixed here in 943a138).
When TorrentChecker loads locally checked torrents, it now filters out records with invalid last check time (fixed here in 8fea75d)
Also, Tribler now logs information about rejected invalid health records received from other peers, so by looking into the logs, it becomes possible to estimate the severity of the problem and see in real time how many invalid health records Tribler client receives at every moment.
The text was updated successfully, but these errors were encountered:
While reviewing #7286, I discovered a very interesting bug when incorrect health information with a future date and sometimes a huge number of seeders spreads from peer to peer infecting new peers and spreading even further. This becomes possible through a combination of several bugs in
TorrentChecker
,MetadataStore
, andPopularityCommunity
.As I currently understand it, the bug works the following way:
Some Tribler clients' machines have totally incorrect local times. I think some active peer has local time +35 days in the future, and sometimes I also see records with a date several years in the future.
These nodes spread incorrect information to their peers.
On receiving health information from the remote peer,
PopularityCommunity
callsMetadataStore.process_torrent_health
method:and this method stores information in the DB with the only check that the health's last check date should be later than the last check date in the DB.
What is important, during this update, the state of the
TorrentState.self_checked
flag does not alter. That means if a peer checked the health of this torrent at least once in the past, then the health record is considered a locally checked record, even if the info from the remote peer overrode it.Then, sometime later, the infected Tribler client is restarted. After the restart,
PopularityCommunity
loads recently checked torrents to spread them to other peers. By "recently checked torrents," it means "self-checked torrents that were checked four hours ago or later":All invalid torrents with future check dates that could override the local self-checked health suit this condition perfectly. And the correct locally checked health records are loaded only if they are no more than four hours old. As a result, it is pretty usual that a tribler client, when restarted, only loads invalid health records, hundreds of them, with dates a month or a year in the future, and starts spreading them to its peers.
The following bug makes the problem even worse:
TorrentChecker
has a cache of recently checked torrents, andPopularityCommunity
spreads health records contained in this cache. But if the local check reveals a zero number of seeders for a torrent, the cache update is skipped in this case:As a result, when the Tribler client starts, it loads hundreds of invalid health records, often with a huge number of seeders, as if they were locally checked. Even if Tribler later checks this torrent and discovers that the actual number of seeders is zero, it does not remove the invalid health record from the cache and continues spreading it to its peers. So invalid health records with huge numbers of peers and time in the future behave like invulnerable zombies that infect other machines and continue spreading with an incorrect huge number of seeders even after the local check reveals zero seeders.
The discovery of this amazing combination of bugs becomes possible after the @drew2a refactoring of
TorrentChecker
. I fixed it in #7286 with the following fixes:self_checked
flag, so the record is not considered byPopularityCommunity
as local anymore (fixed here in 8a0eb9e).TorrentChecker
loads locally checked torrents, it now filters out records with invalid last check time (fixed here in 8fea75d)Also, Tribler now logs information about rejected invalid health records received from other peers, so by looking into the logs, it becomes possible to estimate the severity of the problem and see in real time how many invalid health records Tribler client receives at every moment.
The text was updated successfully, but these errors were encountered: