Hash-based caching #11
// FreshRSS commented out, to allow HTTP 304
// // ignore data if internal cache expiration time is expired
// if ($data['__cache_expiration_time'] < time()) {
//     return $default;
// }
@Art4 This looks suspicious, probably broken in the current version of SimplePie. This seems to disable the possibility of conditional requests (HTTP 304).
This might be referring to simplepie#846
@Art4 I think the logic is wrong. The cache is destroyed before the conditional request has a chance to kick in. So you will (almost) never receive HTTP 304 responses.
For instance, I use a 15-minute cache. Within the next 15 minutes, the cache should be used without any request. After the 15 minutes, a conditional request should be emitted (if information was available), to potentially allow for an HTTP 304 Not Modified response.
This code very much seems to break this, by wrongly destroying the cache after the 15 minutes.
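The expected flow described above can be sketched roughly as follows. This is only an illustration, not SimplePie's or FreshRSS's actual code: the function name `planFeedRequest`, the entry keys `fetched_at`, `etag` and `last_modified`, and the action labels are all hypothetical.

```php
<?php
/**
 * Decide how to refresh a feed from a cached entry (hypothetical helper).
 *
 * @param array|null $entry         Cached metadata, or null when nothing is cached.
 * @param int        $cacheDuration Freshness window in seconds (e.g. 900 for 15 minutes).
 * @param int        $now           Current Unix timestamp.
 * @return array{action:string, headers:array<string,string>}
 */
function planFeedRequest(?array $entry, int $cacheDuration, int $now): array
{
    if ($entry === null) {
        // Nothing cached: a normal, unconditional request is the only option.
        return ['action' => 'full_request', 'headers' => []];
    }
    if ($now - $entry['fetched_at'] < $cacheDuration) {
        // Still within the freshness window: no HTTP request at all.
        return ['action' => 'use_cache', 'headers' => []];
    }
    // Stale, but the entry is kept so that its validators can be reused.
    $headers = [];
    if (!empty($entry['etag'])) {
        $headers['If-None-Match'] = $entry['etag'];
    }
    if (!empty($entry['last_modified'])) {
        $headers['If-Modified-Since'] = $entry['last_modified'];
    }
    return [
        'action' => $headers === [] ? 'full_request' : 'conditional_request',
        'headers' => $headers,
    ];
}
```

The point of the sketch is the third branch: a stale entry is not thrown away, because its validators are what make an HTTP 304 Not Modified response possible.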
This is correct, because this mimics the way PSR-16 works, see https://www.php-fig.org/psr/psr-16/#12-definitions
Expiration - The actual time when an item is set to go stale. This is calculated by adding the TTL to the time when an object is stored.
An item with a 300 second TTL stored at 1:30:00 will have an expiration of 1:35:00.
Implementing Libraries MAY expire an item before its requested Expiration Time, but MUST treat an item as expired once its Expiration Time is reached.
If set_data($data, $ttl) is called with $ttl = 15 min, then the cache has to return null (or $default) once those 15 minutes have passed, and not a potentially expired value.
A workaround would be to set the $ttl to 30 min, but send a request after 15 min and check for an HTTP 304 response. If 304 is returned, then the cache should be written for another 30 min.
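A minimal sketch of this workaround, under stated assumptions: `ArrayCache` is a tiny in-memory stand-in that only mimics PSR-16 expiration semantics (it does not implement `Psr\SimpleCache\CacheInterface`), and `loadFeed`, the `check_after` key and the `$fetch` callback are hypothetical names used for illustration.

```php
<?php
/** In-memory stand-in for a PSR-16-style cache (illustration only). */
final class ArrayCache
{
    /** @var array<string, array{value:mixed, expires:int}> */
    private array $items = [];

    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }

    public function get(string $key, int $now, $default = null)
    {
        // PSR-16 semantics: once the expiration time is reached, the item is gone.
        if (!isset($this->items[$key]) || $this->items[$key]['expires'] <= $now) {
            return $default;
        }
        return $this->items[$key]['value'];
    }
}

/**
 * Store with twice the check interval, but revalidate once the interval has passed.
 * $fetch receives request headers and returns ['status' => int, 'body' => ?, 'etag' => ?].
 */
function loadFeed(ArrayCache $cache, string $url, int $checkAfter, callable $fetch, int $now): string
{
    $storageTtl = 2 * $checkAfter; // e.g. 30 min of storage for a 15 min check interval
    $entry = $cache->get($url, $now);
    if ($entry !== null && $now < $entry['check_after']) {
        return $entry['body']; // still fresh: no request
    }
    $headers = ($entry !== null && isset($entry['etag'])) ? ['If-None-Match' => $entry['etag']] : [];
    $response = $fetch($headers);
    if ($response['status'] === 304 && $entry !== null) {
        // Not modified: keep the body and only push the next check forward.
        $entry['check_after'] = $now + $checkAfter;
    } else {
        $entry = [
            'body' => $response['body'] ?? '',
            'etag' => $response['etag'] ?? null,
            'check_after' => $now + $checkAfter,
        ];
    }
    $cache->set($url, $entry, $storageTtl, $now); // written again for the full storage TTL
    return $entry['body'];
}
```

One limitation of this approach: if no refresh happens between the check interval and the storage TTL, the entry still expires and the next request is unconditional.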
I'm not in a position to do that at the moment
No worries :-)
'cache_expiration_time' => $this->cache_duration - 300 + time(),
I am not sure where the 300 is coming from, but it should not be hardcoded.
One can even rename it to check_rss_feed_after
Yes, it looks to me like a bit of refactoring would be needed there. I still think the current behaviour is buggy.
Ideally, it should evolve to be HTTP-compliant, which is not the case at the moment, by properly supporting Cache-Control: max-age and other related headers.
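As a small illustration of what honouring Cache-Control: max-age would involve, here is a hedged sketch of extracting the directive from a header value. The function name `parseMaxAge` is hypothetical, and this is not how SimplePie currently handles the header.

```php
<?php
/**
 * Extract max-age (in seconds) from a Cache-Control header value.
 * Returns null when the directive is absent or not a plain non-negative integer.
 */
function parseMaxAge(string $cacheControl): ?int
{
    foreach (explode(',', $cacheControl) as $directive) {
        $parts = explode('=', trim($directive), 2);
        // Directive names are case-insensitive.
        if (strtolower(trim($parts[0])) === 'max-age' && isset($parts[1])) {
            $value = trim($parts[1], " \t\"");
            return preg_match('/^\d+$/', $value) === 1 ? (int) $value : null;
        }
    }
    return null;
}
```

A caller could then use the returned value, when present, instead of a fixed cache duration to decide when the next conditional request is due.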
or even more custom values
Yes, that would be needed.
I can create a PR with a minimal PSR-16 implementation based on the File cache adapter as a starting point
That would be very welcome 👍🏻 It is still unclear to me how much can be put there without changing SimplePie core's code.
When this clearing is executed is part of the cache implementation and could be triggered by a cronjob
Yes, this is what we have at the moment in FreshRSS (prior to those SimplePie changes), but without needing those extra steps:
I am not sure where the 300 is coming from, but it should not be hardcoded.
The 300 was only an example.
I can create a PR with a minimal PSR-16 implementation based on the File cache adapter as a starting point
That would be very welcome 👍🏻 It is still unclear to me how much can be put there without changing SimplePie core's code.
OK, I will create a PR in the FreshRSS repo based on the allow-simplepie-updates-with-composer branch.
You can see the first results here: https://github.com/Art4/FreshRSS/compare/allow-simplepie-updates-with-composer...Art4:FreshRSS:create-psr-16-cache?expand=1
When this clearing is executed is part of the cache implementation and could be triggered by a cronjob
Yes, this is what we have at the moment in FreshRSS (prior to those SimplePie changes), but without needing those extra steps:
It should be possible to move this logic into the Cache implementation.
@Alkarex See Art4/FreshRSS#1 for a PSR-16 implementation based on the file cache driver from SimplePie.
Thanks for your efforts 👍🏻
@Art4 Could you please change the target of your PR to https://github.com/FreshRSS/simplepie/tree/freshrss ?
I am not sure I will be able to use it as it does not seem to allow offloading much of my needed patches, while adding 500 lines of boilerplate, but I will try to give it another look later.
In particular, I still fail to see so far how to achieve your claim of "Makes most changes in the cache class of the SimplePie fork not necessary".
@Alkarex As you requested in FreshRSS/FreshRSS#4374 (comment) I've looked into this PR. I think I understand the purpose of introducing the hashing in the […]

Cache in SimplePie 1.7.0
Until SimplePie 1.7.0 the cache was using the […] The happy path for this optimization works like this (I hope I get this correct from here): […]

Cache in SimplePie 1.8.0
In SimplePie 1.8.0 support for PSR-16 cache was introduced and internally the cache only uses […] Now the happy path described above works like this (see here): […]

Conclusion
As you can see, the sum of operations changes for all cache drivers. The file driver (which is used by FreshRSS) now has 1x read and 1x write operation. That's why I recommend creating or using a PSR-16 implementation for FreshRSS. There it should be possible to optimize this new write operation (e.g. if […]
Thanks for the feedback @Art4. I will try to dig into it (I still fail to see how to move some of my patches to a PSR-16 implementation). A more pressing matter is what looks very much like a bug in the current SimplePie, preventing HTTP 304 from working at all: #11 (comment). For now, this PR seems to work for me, except for an array to string conversion in the HTTP headers parsing, which also seems to be a new SimplePie bug, but I have not investigated in detail yet.
Debug: {"url":"https://gizmodo.com/feed","key":"if-modified-since","value":["Thu, 22 Aug 2024 06:57:33 GMT"]}
{"url":"https://gizmodo.com/feed","key":"if-none-match","value":["W/\"067d554a3cf80ce6cbe5f2d31da11861\""]}
I think the syntax in the headers is wrong. Arrays are not allowed, see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Modified-Since#syntax
Indeed, arrays are not allowed and it should be a string. But I am facing a case where SimplePie ends up trying to convert an array (which should never be there) to a string. I have not investigated yet, but will follow up on it.
The problem with headers as arrays seems to come from the changes in simplepie#815, which now mix headers as strings and as arrays... Code like the following also seems to indicate such a confusion of types:
if (is_array($link_headers)) {
    $link_headers = implode(',', $link_headers);
}
It looks quite messy to me that […] Anyway, 04e66e0 seems to fix what I need to make it work.
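One way to contain the string-versus-array confusion described above is to normalise header values at a single point before they are sent. The following is only a sketch under assumptions: `headerValueToString` is a hypothetical helper, not part of SimplePie, and it is not what commit 04e66e0 does.

```php
<?php
/**
 * Normalise a header value that may arrive as a string or as a list of strings.
 *
 * @param string|string[] $value
 */
function headerValueToString($value): string
{
    if (is_array($value)) {
        // Multiple values are joined with a comma. For single-valued headers such
        // as If-Modified-Since, the list is expected to hold exactly one item.
        return implode(', ', array_map('strval', $value));
    }
    return (string) $value;
}
```

With the debug values shown earlier, `headerValueToString(["Thu, 22 Aug 2024 06:57:33 GMT"])` yields the plain date string that If-Modified-Since expects.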
Minor follow-up #17
simplepie#401
FreshRSS/FreshRSS@9aab83a
FreshRSS/FreshRSS@00127f0