Make fewer `lookup` requests when inode type might be known #12

jamesbornholt · 2022-10-29T19:23:41Z

Right now our lookup implementation does two concurrent ListObjectsV2 requests. List requests are expensive, both cost-wise and performance-wise. We should think about a better strategy here, perhaps starting with a HeadObject request. If there's an existing inode, we could also use that as a hint as to which requests are most likely to succeed (only as a hint, because the bucket might have changed).

The text was updated successfully, but these errors were encountered:

Our current `lookup` does two concurrent ListObjects requests. After thinking about it a bit more carefully, one of them can be replaced with a cheaper, faster HeadObject request. The "unsuffixed" request we were doing was purely to discover whether an object of the exact looked-up name existed, which is what HeadObject does. Switching to HeadObject reduces the request costs of a lookup. One disadvantage of HeadObject is when looking up directories. The unsuffixed ListObjects we're replacing here could discover a common prefix and return it immediately without waiting for the other request to complete. But in practice, the two requests were dispatched concurrently, so the customer still pays for both requests, and the latency is the minimum latency of two concurrently ListObjects. Now, the latency for a directory lookup will be the maximum of a concurrent ListObjects and HeadObject. An issue in this change is that we expect HeadObject to return 404 when doing directory lookups, but right now the way our error types are structured gives us no way to distinguish 404s from other errors. For now, I'm just swallowing all errors on the HeadObject request, and I'll follow up with a broader change to fix our error handling story to make this work. This is a partial fix for #12, but in future we can do better for lookups against objects we've seen before by remembering their type. Signed-off-by: James Bornholt <bornholt@amazon.com>

jamesbornholt · 2023-02-07T01:54:02Z

#69 improved this by replacing one ListObjects with a HeadObject. What's left to do is to use cached state as a hint to optimize the requests here—if we already suspect something is a file (or directory), we can start with the HeadObject (or ListObjects) and only do the other request if it fails.

plundra · 2023-09-06T15:58:42Z

Not sure if applicable, I'm not familiar with internals of FUSE, however:

In mount.fuse(8) I see two options:

entry_timeout=T
       The timeout in seconds for which name lookups will be cached. The default is 1.0 second. For all the timeout options, it is possible to give fractions of a second as well (e.g. entry_timeout=2.8)

attr_timeout=T
       The timeout in seconds for which file/directory attributes are cached.  The default is 1.0 second.

Which both sounds excellent, if mount-s3 could set/use these?

We have a very static object structure, so caching would be great and work very well.
Did some measurements and for one test it's 30-50% list-type requests, which are costly both in round trip time, but also money :-)

(Thanks for a very exciting and promising project btw!)

jamesbornholt · 2023-09-06T16:59:27Z

Yeah, we actually do set those timeouts, here:

mountpoint-s3/mountpoint-s3/src/fs.rs

Lines 201 to 216 in b632bbe

    
           impl Default for CacheConfig { 
        
               fn default() -> Self { 
        
                   // We want to do as little caching as possible, but Linux filesystems behave badly when the 
        
                   // TTL is exactly zero. For example, results from `readdir` will expire immediately, and so 
        
                   // the kernel will immediately re-lookup every entry returned from `readdir`. So we apply 
        
                   // small non-zero TTLs. The goal is to be small enough that the impact on consistency is 
        
                   // minimal, but large enough that a single cache miss doesn't cause a cascading effect where 
        
                   // every other cache entry expires by the time that cache miss is serviced. We also apply a 
        
                   // longer TTL for directories, which are both less likely to change on the S3 side and 
        
                   // checked more often (for directory permissions checks). 
        
                   let file_ttl = Duration::from_millis(100); 
        
                   let dir_ttl = Duration::from_millis(1000); 
        
                   Self { file_ttl, dir_ttl } 
        
               } 
        
           }

We set them very low because we want to preserve S3's strong consistency model by default. But we know for some workloads, the bucket doesn't change very much/at all, and so we could cache that metadata much longer. We're tracking that as a roadmap item in #255.

This issue is tracking a smaller improvement we could make: if we've listed a file/directory previously, then when the cache expires we could speculate that it's still a file/directory when we try to look it up from S3 again. I think with the way that lookup works right now, that would allow us to skip some HeadObject requests, but probably not any ListObjects requests.

jamesbornholt added this to the beta milestone Oct 29, 2022

jamesbornholt mentioned this issue Feb 1, 2023

Use HeadObject for lookup #69

Merged

jamesbornholt changed the title ~~More frugal lookup requests~~ Make lookup requests more frugal when inode type might be known Feb 7, 2023

jamesbornholt modified the milestones: beta, 1.0 Feb 7, 2023

jamesbornholt added this to Mountpoint for Amazon S3 - Public Roadmap Feb 17, 2023

jamesbornholt moved this to Planned for GA in Mountpoint for Amazon S3 - Public Roadmap Feb 17, 2023

jamesbornholt removed this from Mountpoint for Amazon S3 - Public Roadmap Mar 3, 2023

passaro removed this from the General Availability (1.0) milestone Jul 24, 2023

jamesbornholt added the enhancement New feature or request label Aug 5, 2023

jamesbornholt changed the title ~~Make lookup requests more frugal when inode type might be known~~ Make fewer lookup requests when inode type might be known Aug 5, 2023

passaro mentioned this issue Sep 5, 2023

Performance characteristics of absolute path vs relative path (and excessive HEAD requests) #500

Closed

mhnap mentioned this issue Feb 22, 2024

stat call for path triggers two requests (one List and one Head) with enabled cache #777

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make fewer `lookup` requests when inode type might be known #12

Make fewer `lookup` requests when inode type might be known #12

jamesbornholt commented Oct 29, 2022

jamesbornholt commented Feb 7, 2023

plundra commented Sep 6, 2023

jamesbornholt commented Sep 6, 2023

Make fewer lookup requests when inode type might be known #12

Make fewer lookup requests when inode type might be known #12

Comments

jamesbornholt commented Oct 29, 2022

jamesbornholt commented Feb 7, 2023

plundra commented Sep 6, 2023

jamesbornholt commented Sep 6, 2023

Make fewer `lookup` requests when inode type might be known #12

Make fewer `lookup` requests when inode type might be known #12