Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable Etag #321

Closed
pombredanne opened this issue Jan 29, 2021 · 2 comments · Fixed by #600
Closed

Disable Etag #321

pombredanne opened this issue Jan 29, 2021 · 2 comments · Fixed by #600

Comments

@pombredanne
Copy link
Collaborator

The current Etag handling IMHO are more problematic than helpful at this stage, especially when there is a failure during import. See 255b7f9#diff-938a299f8406c1d3defaec48838bc4f6f1307635f5d2ba36e9777227cedbb383R46

  1. We are not using Etag correctly: Etags are designed to be passed back in HTTP headers. We should instead send the proper HTTP header as per https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-None-Match and integrate this finely in the processing as this typically needs to use streaming=True requests, and proper handling of the HTTP return code. In all cases this should end up being a single request, not a HEAD followed by another GET request

  2. Etag are unlikely meant to stored long term in the DB like we are now

  3. the create_etag function name is misleading as we are creating, checking and saving the Etag in this create_etag function

  4. If some import fails, we have to manually delete the records in the DB to restart an import

  5. the time it takes to download data seems to be very small when compared with the time to perform the import

  6. Once the initial import is done, incremental import should be much smaller, making Etgas even less relevant

Therefore, I think we need to reconsider using Etag.

@sbs2001
Copy link
Collaborator

sbs2001 commented Jan 29, 2021

@pombredanne

Thanks for (1) ! The current implementation is so crude.

All other points make 100% sense to me, except (6) . Since only way to know whether the data is new, is to check it against what we have in the db one to one. We can't just skip a file. Hence metadata like etag is used. Any other way to do this ?

@sbs2001
Copy link
Collaborator

sbs2001 commented Jan 31, 2021

Maybe adding a flag, to skip interpretting etag would help ?

Hritik14 added a commit to Hritik14/vulnerablecode that referenced this issue Jan 26, 2022
Etags are meant for transient usage in browsers and are not meant for
any long term usage.
Fixes: aboutcode-org#321

Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
Hritik14 added a commit to Hritik14/vulnerablecode that referenced this issue Jan 26, 2022
Etags are meant for transient usage in browsers and are not meant for
any long term usage.
Fixes: aboutcode-org#321

Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
Hritik14 added a commit to Hritik14/vulnerablecode that referenced this issue Jan 26, 2022
Etags are meant for transient usage in browsers and are not meant for
any long term usage.
Fixes: aboutcode-org#321

Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
Hritik14 added a commit to Hritik14/vulnerablecode that referenced this issue Feb 5, 2022
Etags are meant for transient usage in browsers and are not meant for
any long term usage.
Fixes: aboutcode-org#321

Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
Hritik14 added a commit to Hritik14/vulnerablecode that referenced this issue Feb 5, 2022
Etags are meant for transient usage in browsers and are not meant for
any long term usage.
Fixes: aboutcode-org#321

Signed-off-by: Hritik Vijay <hritikxx8@gmail.com>
@Hritik14 Hritik14 added this to the v30.0 milestone Feb 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants