FilterLists-hosted mirrors #464

Open
collinbarrett opened this issue Sep 9, 2018 · 8 comments
Labels: archival (service for archiving copies of FilterLists)

Comments

@collinbarrett
Owner

collinbarrett commented Sep 9, 2018

An anonymous user suggested capturing backups of lists in a public repository or similar.

This is a tricky one, and something I have certainly considered. Currently, FilterLists has a "SnapshotService" that crawls all of the lists and stores all of their rules in the database, linked to the list and to the time the snapshot was taken. These aren't exact copies of the lists, though. Each unique line in the plain-text file is ingested as a separate rule, and we do not preserve order. So, we could reproduce a very similar mirror/backup of the original list from this data, but order would not be preserved and duplicate lines would be de-duplicated (comments, for example, would be completely useless). Implementation details of SnapshotService.
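To make that limitation concrete, here is a minimal Python sketch. It is not the actual SnapshotService code, which is not shown in this thread; it only assumes that a snapshot amounts to the set of unique lines in a list. The function names `snapshot` and `reconstruct` and the sample rules are hypothetical.

```python
def snapshot(list_text: str) -> set:
    """Ingest each unique line as a separate rule (assumed behavior).

    A set keeps neither the original order nor repeated lines.
    """
    return set(list_text.splitlines())


def reconstruct(rules: set) -> str:
    """Rebuild a list from stored rules.

    Sorting is an arbitrary choice here: the original order is unknown.
    """
    return "\n".join(sorted(rules))


# Hypothetical list with section comments and a repeated "!" separator line.
original = "\n".join([
    "! Section: ads",
    "||ads.example.com^",
    "!",
    "! Section: trackers",
    "||tracker.example.net^",
    "!",
])

rebuilt = reconstruct(snapshot(original))
```

Every rule survives the round trip, but the repeated `!` line collapses into one and the section comments no longer sit above the rules they describe, so `rebuilt` is not a faithful mirror of `original`.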

I want to be very careful about respecting list maintainers' licenses. Some maintainers do not want their lists re-distributed in any way. There are some open GitHub issues for some cool features (such as #33) that could make use of these snapshotted rules, but I am not really sure I want FilterLists to be in the "business" of hosting mirrors/backups. If the project were ever to go offline for any reason (no plans to do so, but I do pay monthly out of my own pocket to host it, so it is certainly possible), I don't want people relying on it for their list subscriptions, etc. We could capture backups on a site like GitHub that is free and would theoretically remain online even if FilterLists had to shut down, but that could violate some maintainers' licenses.

related to #6

@Atavic

Atavic commented Sep 9, 2018

> Some maintainers do not want their lists re-distributed in any way.

Such data should be totally free.

@collinbarrett
Owner Author

@Atavic I agree. If I were to maintain a list, I would make it free. That's why FilterLists is under the MIT license. But there are some maintainers who use a less permissive license, and there are many lists that do not specify a license at all, which FilterLists treats as "All Rights Reserved" by default. I am certainly not a legal professional, but I do want to keep FilterLists on the right side of the law (U.S., as that's where I'm based) to the best of my knowledge.
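As a rough sketch of how that default could gate any future mirroring, consider the Python below. It is illustrative only and not FilterLists code: the `REDISTRIBUTABLE` allow-list and the function names are hypothetical, and whether a particular license actually permits mirroring is a legal question this sketch does not settle.

```python
from typing import Optional

# Hypothetical allow-list of license identifiers assumed to permit redistribution.
REDISTRIBUTABLE = {"MIT", "CC0-1.0", "GPL-3.0"}

DEFAULT_LICENSE = "All Rights Reserved"


def effective_license(declared: Optional[str]) -> str:
    """Return the declared license, or the restrictive default if none is given."""
    if declared and declared.strip():
        return declared.strip()
    return DEFAULT_LICENSE


def may_mirror(declared: Optional[str]) -> bool:
    """Only lists whose effective license is on the allow-list may be mirrored."""
    return effective_license(declared) in REDISTRIBUTABLE
```

With this shape, a list with no stated license is never mirrored, because it falls back to "All Rights Reserved", which is deliberately absent from the allow-list.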

@Atavic

Atavic commented Sep 9, 2018

Uh, legalese. To me, if the mirror has zero ads + has big links to the original authors + updates stay a little behind the source (one day? two days?) there should be zero issues.

@collinbarrett collinbarrett changed the title from "consider serving mirrors/backups of lists" to "consider capturing and serving plain-text mirrors/backups of lists" Sep 9, 2018
@elypter

elypter commented Sep 11, 2018

@DandelionSprout
Contributor

I think that the number of times a list on FilterLists.com has completely vanished from the internet can be counted on one hand. As such, this isn't a top-priority thing to implement, in my eyes.

The mirror viewlink system and the existence of the Wayback Machine also seem to have worked fairly well, as far as I can determine.

@collinbarrett
Owner Author

Certainly open to revisiting in the future, but closing as won't fix for now.

@collinbarrett collinbarrett added the wontfix (will not be worked on) label Oct 9, 2018
@collinbarrett
Owner Author

Re-opening for consideration per NanoAdblocker/NanoCore#220 (comment) and #584

@collinbarrett collinbarrett reopened this Oct 17, 2018
@collinbarrett collinbarrett removed the wontfix (will not be worked on) label Oct 17, 2018
collinbarrett added a commit that referenced this issue Oct 26, 2018
@collinbarrett collinbarrett removed the feedback wanted (provide your input) label Sep 3, 2019
@collinbarrett collinbarrett changed the title from "consider capturing and serving plain-text mirrors/backups of lists" to "capture and serve plain-text mirrors/backups of lists" Sep 3, 2019
@collinbarrett collinbarrett added the url-validation (service that validates URLs) and blocked (blocked by another issue) labels and removed the enhancement label Feb 17, 2020
@collinbarrett collinbarrett added the archival (service for archiving copies of FilterLists) label and removed the blocked (blocked by another issue) and url-validation (service that validates URLs) labels Sep 13, 2020
@collinbarrett
Owner Author

started some work on this here. want to get the service up and building the archives first. then, will maybe expose the mirrored copies to the public. want to consider licensing of lists and only serve FilterLists-hosted mirrors of lists that allow that via licensing.

further down the road, these archived list copies can be used for things like #33, etc.

@collinbarrett collinbarrett changed the title from "capture and serve plain-text mirrors/backups of lists" to "host plain-text mirrors" Sep 21, 2020
@collinbarrett collinbarrett changed the title from "host plain-text mirrors" to "FilterLists-hosted mirrors" Sep 21, 2020
@collinbarrett collinbarrett self-assigned this Oct 5, 2020
@collinbarrett collinbarrett mentioned this issue Aug 25, 2023
6 tasks
4 participants