Should the date be 'sanity-checked' prior to being updated? #18

Closed
jgrisham opened this issue Sep 6, 2023 · 2 comments

jgrisham commented Sep 6, 2023

Observed behavior

On many (over 600 items in my personal Zotero library) items (example #1), the Date field is apparently set by this plugin to '1969-12-31'.

  • (That seems like a 'default'/'epoch' date, and I'm not sure if it's coming from the HTTP headers for that page or from the plugin.)

Related?

While I don't know if also caused by this plugin, I have a dozen or so items with the Date field only containing a time (e.g. '21:54:00 +0100' for this URL).

  • I can't imagine I would have entered those, but perhaps they were populated by Zotero itself.
  • I haven't yet had a chance to investigate those items further.

Possible solution

I don't imagine a limited number of checks would add significant overhead to the plugin? ¯\_(ツ)_/¯

Example - only update date if (all?) of the following are true:

  1. Date field is blank (already implemented - thanks, Emiliano!)
  2. The calculated date is 1990 or later (does anyone / any CMSs actually back-date HTTP headers for > 33 year-old documents / publications?)
  3. The calculated date is prior to any of the automatic date fields ('Accessed', 'Date Added', 'Modified') that are not blank
  4. The calculated year is, say, 2200 or earlier (I realize this creates a 'Y2.2k problem', but it might catch 'over-range' dates from malfunctioning / mis-configured webservers)
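
The four checks above could be sketched roughly as follows, in TypeScript. This is not the plugin's code: `shouldUpdateDate`, `ItemDates`, and its field names are hypothetical, the 1990 and 2200 bounds are simply the values proposed in this list, and check 3 is read here as "earlier than every automatic date that is set", which is one possible interpretation.

```typescript
// Hypothetical item shape for illustration; these are not Zotero API names.
interface ItemDates {
  date: string;         // the item's Date field, '' when blank
  accessed?: Date;      // 'Accessed'
  dateAdded?: Date;     // 'Date Added'
  dateModified?: Date;  // 'Modified'
}

// Bounds proposed in the list above (assumptions, not plugin settings).
const MIN_YEAR = 1990;
const MAX_YEAR = 2200;

function shouldUpdateDate(item: ItemDates, candidate: Date): boolean {
  // An unparseable header yields an invalid Date; never use it.
  if (Number.isNaN(candidate.getTime())) return false;

  // Check 1: only fill in a blank Date field.
  if (item.date.trim() !== '') return false;

  // Checks 2 and 4: the year must fall within the plausible range.
  const year = candidate.getUTCFullYear();
  if (year < MIN_YEAR || year > MAX_YEAR) return false;

  // Check 3: the candidate must be earlier than every automatic date that is set.
  const automatic = [item.accessed, item.dateAdded, item.dateModified]
    .filter((d): d is Date => d !== undefined);
  return automatic.every(d => candidate.getTime() < d.getTime());
}
```

With this sketch, a candidate of `new Date(0)` (the Unix epoch) is rejected by the 1990 bound, so the '1969-12-31' case would never be written.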

(I'll try to take a look at the code myself if I can, but I wanted to make sure to share my observations before the week got away from me.)

This is a great concept for a plugin; thank you for sharing it with the world!

Cheers,

Jim

Version details

Zotero version: 6.0.26 (Windows)
Zotero Date From Last Modified plugin: 0.1.0

retorquere added a commit that referenced this issue Sep 25, 2023
@github-actions

🤖 this is your friendly neighborhood build bot announcing test build 0.1.3.18.14 ("fixes #18, part 1")

Install in Zotero by downloading test build 0.1.3.18.14, opening the Zotero "Tools" menu, selecting "Add-ons", opening the gear menu in the top right, and selecting "Install Add-on From File...".

@retorquere
Owner

> On many (over 600 items in my personal Zotero library) items (example #1), the Date field is apparently set by this plugin to '1969-12-31'.
>
> * (That seems like a 'default'/'epoch' date, and I'm not sure if it's coming from the HTTP headers for that page or from the plugin.)

That does look like an epoch date, shifted by one day. The URL in the sample doesn't exhibit the problem (anymore), but I first try to convert the date to UTC, and if that's the Unix epoch of 1970-01-01, I make no changes.
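
For illustration, here is a rough TypeScript sketch of why the Unix epoch can show up as '1969-12-31', and of the kind of UTC check described above. It is not the plugin's actual code: `parseLastModified`, `isUnixEpoch`, and `formatInZone` are hypothetical helpers, and `America/New_York` is just an example of a time zone west of UTC.

```typescript
// Parse an HTTP Last-Modified header value; returns null if it cannot be parsed.
function parseLastModified(header: string): Date | null {
  const ms = Date.parse(header);
  return Number.isNaN(ms) ? null : new Date(ms);
}

// True when the instant is exactly 1970-01-01T00:00:00Z, i.e. timestamp 0.
function isUnixEpoch(d: Date): boolean {
  return d.getTime() === 0;
}

// Format an instant as YYYY-MM-DD as seen from a given IANA time zone.
function formatInZone(d: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(d);
  const get = (type: string): string => {
    const part = parts.find(p => p.type === type);
    return part ? part.value : '';
  };
  return `${get('year')}-${get('month')}-${get('day')}`;
}

const epoch = parseLastModified('Thu, 01 Jan 1970 00:00:00 GMT');
if (epoch !== null && isUnixEpoch(epoch)) {
  // The same instant falls on different calendar days depending on the zone.
  console.log(formatInZone(epoch, 'UTC'));              // 1970-01-01
  console.log(formatInZone(epoch, 'America/New_York')); // 1969-12-31
}
```

Comparing the timestamp itself, rather than a locally formatted string, is what makes the check independent of the user's time zone.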

> Related?
>
> While I don't know if also caused by this plugin, I have a dozen or so items with the Date field only containing a time (e.g. '21:54:00 +0100' for this URL).

Nope, I never set a time in any way with this plugin. That value comes from the standard scraper.

> * I can't imagine I would have entered those, but perhaps they were populated by _Zotero_ itself.

The Zotero scraper, yes.

> 1. Date field is blank _(already implemented - thanks, Emiliano!)_
> 2. The calculated date is [1990 or later](https://www.google.com/search?q=first+web+server) _(does anyone / any CMSs actually back-date HTTP headers for > 33 year-old documents / publications?)_

Maybe not, but that's what the URL claims. I can see that the epoch date is unlikely, but for this, let's first see if we find samples that necessitate it.

> 3. The calculated date is _prior_ to any of the automatic date fields _('Accessed', 'Date Added', 'Modified')_ that are not blank

Any live URL is going to be older than the date added/modified? Accessed I could see.

> 4. The calculated year is, say, 2200 or earlier _(I realize this creates a 'Y2.2k problem', but it might catch 'over-range' dates from malfunctioning / mis-configured webservers)_

2200 is a loooooong time from now, so uninstalling the plugin would get you the same behavior ;)
