Cross-thread memoization with eventual consistency.
Memoization is a popular pattern for reducing expensive computation. You don't need a library for it, although some exist to provide better developer ergonomics. What is hard, however, is higher-level memoization that can be shared across threads and that periodically reloads or refreshes its value. Conceptually, this library is a thread-safe mix of memoization and in-memory caching with time-to-live expiration/refresh. It works best for computations or data fetching which:
- Can be eventually consistent, where inconsistent data across processes is acceptable.
  - Given two or more processes, one may have "stale" data while another has "less-stale" data. There are no cross-process data consistency guarantees.
- Happen in any given thread in a given process.
  - Because this library memoizes data in memory, memory may be wasted if only the occasional thread needs the data.
This library is a sharp knife with a specific use-case. Do not use it without fully understanding the implications of its application.
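To make the idea concrete, here is a minimal, standard-library-only sketch of a value that is shared across threads and reloaded once a time-to-live has elapsed. It is an illustration of the general technique, not this library's implementation; the class name `TtlCachedValue` and its `ttl_seconds:` parameter are hypothetical.

```ruby
# Hypothetical sketch: TtlCachedValue is NOT part of ttl_memoizeable.
class TtlCachedValue
  def initialize(ttl_seconds:, &loader)
    @ttl_seconds = ttl_seconds
    @loader = loader
    @mutex = Mutex.new
    @loaded_at = nil
    @value = nil
  end

  # Returns the cached value, reloading it once the TTL has elapsed.
  # The mutex ensures only one thread runs the loader at a time.
  def fetch
    @mutex.synchronize do
      now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      if @loaded_at.nil? || now - @loaded_at >= @ttl_seconds
        @value = @loader.call
        @loaded_at = now
      end
      @value
    end
  end
end

load_count = 0
config = TtlCachedValue.new(ttl_seconds: 60) do
  load_count += 1 # stands in for an expensive fetch
  { "feature_enabled" => true }
end

# Four threads share the same cached value; the loader runs once.
4.times.map { Thread.new { 100.times { config.fetch } } }.each(&:join)
puts load_count
```

Every thread in the process sees the same value until the TTL passes, which is the "eventually consistent" trade-off described above: other processes may hold older or newer copies.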
Extracted from the scaling pains we experienced at Huntress Labs (many billions of Ruby background jobs per month and millions of HTTP requests per minute), this pattern has allowed us to reduce the execution time of hot code paths where every computation, database query, and HTTP request matters. We discovered that these code paths were accessing infrequently changing data sets on each execution and began investigating ways to reduce the overhead of that access. Since its inception, this library has been used widely across our code bases.
require "benchmark"
require "ttl_memoizeable"

class ApplicationConfig
  class << self
    def config_without_ttl_memoization
      # JSON.parse($redis.get("some_big_json_string")) => 50ms (0.05s) of execution time
      sleep 0.05
    end

    def config_with_ttl_memoization
      # JSON.parse($redis.get("some_big_json_string")) => 50ms (0.05s) of execution time
      sleep 0.05
    end

    extend TTLMemoizeable
    # An integer ttl is an accessor count: refresh after 1000 accesses.
    ttl_memoized_method :config_with_ttl_memoization, ttl: 1000
  end
end

iterations_per_thread = 1000
thread_count = 4

Benchmark.bm do |x|
  x.report("baseline:") do
    thread_count.times.collect do
      Thread.new do
        iterations_per_thread.times do
          ApplicationConfig.config_without_ttl_memoization
        end
      end
    end.each(&:join)
  end

  x.report("ttl_memoized:") do
    thread_count.times.collect do
      Thread.new do
        iterations_per_thread.times do
          ApplicationConfig.config_with_ttl_memoization
        end
      end
    end.each(&:join)
  end
end
                    user     system      total        real
baseline:       0.112220   0.101602   0.213822 ( 52.803622)
ttl_memoized:   0.008847   0.000755   0.009602 (  0.221783)
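As a rough sanity check on the "real" column, the wall-clock times can be estimated from the benchmark parameters. The sketch below is back-of-the-envelope arithmetic only. It assumes the threads sleep concurrently and, for the memoized case, that the accessor-count TTL is shared by all threads so the value is reloaded about once per `ttl` accesses; that second assumption is inferred from the measured numbers rather than confirmed by the source. The helper names are hypothetical.

```ruby
# Hypothetical helpers for estimating wall-clock time; not part of the library.

# Baseline: every call pays the full cost. Threads sleep concurrently, so the
# wall-clock time is roughly one thread's worth of calls.
def baseline_estimate(iterations_per_thread:, cost_per_call:)
  iterations_per_thread * cost_per_call
end

# Memoized: ASSUMPTION - one reload per `ttl` accesses across all threads.
def memoized_estimate(total_calls:, ttl:, cost_per_call:)
  (total_calls.to_f / ttl).ceil * cost_per_call
end

baseline = baseline_estimate(iterations_per_thread: 1000, cost_per_call: 0.05)
memoized = memoized_estimate(total_calls: 4 * 1000, ttl: 1000, cost_per_call: 0.05)

puts format("baseline ~%.1fs, ttl_memoized ~%.2fs", baseline, memoized)
```

Under these assumptions the estimates come out to roughly 50 seconds and 0.2 seconds, in line with the measured 52.8 and 0.22 seconds.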
- Define your method as you normally would. Test it. Benchmark it to know that it is "expensive".
- Extend the methods defined in this file by calling extend TTLMemoizeable in your class (if not already extended).
- Call ttl_memoized_method :your_method_name, ttl: 5.minutes where :your_method_name is the method you just defined, and the ttl is the duration (in time or accessor counts) of acceptable data inconsistency.
- 🎉
Two methods of TTL expiration are available:
- Time duration (e.g. 5.minutes). This ensures the process caches your method's result for that amount of time. This option is likely best when you can quantify the acceptable threshold for stale data. Every time the memoized method is called, the time at which the current memoized value was fetched plus your ttl value is compared to the current time.
- Accessor count (e.g. 10_000). This ensures the process caches your method's result for that number of attempts to access the data. This option is likely best when you want the TTL to expire based on volume. Every time the memoized method is called, the counter is decremented by 1.
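The two expiration checks described above can be sketched as follows. This is a simplified illustration using only the standard library; the classes `TimeExpiry`, `AccessCountExpiry`, and `ExpiringMemo` are hypothetical names, thread synchronization is omitted for brevity, and the library's actual internals may differ.

```ruby
# Hypothetical sketch of the two expiry strategies; not the library's code.

# Time-based: expired once `ttl_seconds` have passed since the last load.
class TimeExpiry
  def initialize(ttl_seconds, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl_seconds = ttl_seconds
    @clock = clock
    @loaded_at = nil
  end

  def expired?
    @loaded_at.nil? || @clock.call - @loaded_at >= @ttl_seconds
  end

  def record_access; end # time-based expiry ignores access volume

  def reset!
    @loaded_at = @clock.call
  end
end

# Count-based: expired once the value has been accessed `max_accesses` times.
class AccessCountExpiry
  def initialize(max_accesses)
    @max_accesses = max_accesses
    @remaining = 0
  end

  def expired?
    @remaining <= 0
  end

  def record_access
    @remaining -= 1 # the counter decrements by 1 on every call
  end

  def reset!
    @remaining = @max_accesses
  end
end

# Ties a loader block to either strategy.
class ExpiringMemo
  def initialize(expiry, &loader)
    @expiry = expiry
    @loader = loader
    @value = nil
  end

  def fetch
    if @expiry.expired?
      @value = @loader.call
      @expiry.reset!
    end
    @expiry.record_access
    @value
  end
end

loads = 0
memo = ExpiringMemo.new(AccessCountExpiry.new(3)) { loads += 1 }
7.times { memo.fetch }
puts loads # prints 3: loads happen on calls 1, 4, and 7
```

Swapping in `TimeExpiry.new(300)` would instead reload the value at most once every five minutes, regardless of how often it is read.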
- Do not use this library on methods that have logic involving state.
- Do not use this library on methods that accept parameters, as that introduces state; see above.
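To illustrate why parameters and a single memoized value do not mix, consider a memoizer that remembers exactly one result. The sketch below is a generic illustration of the hazard, not a description of how this library behaves when given a parameterized method; `SingleSlotMemo` is a hypothetical name.

```ruby
# Hypothetical single-slot memoizer: remembers ONE value and ignores arguments
# once that value has been computed.
class SingleSlotMemo
  def initialize(&computation)
    @computation = computation
    @loaded = false
    @value = nil
  end

  def call(*args)
    unless @loaded
      @value = @computation.call(*args)
      @loaded = true
    end
    @value
  end
end

price_for = SingleSlotMemo.new { |plan| plan == :pro ? 50 : 10 }

price_for.call(:basic) # => 10 (computed)
price_for.call(:pro)   # => 10 (served from the slot; the :pro argument is ignored)
```

The second call returns the answer for the first call's argument. Keying the cache by argument would avoid this, but it turns a single memoized value into an unbounded cache, which is a different tool.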
Using this library is most effective on class methods.
require "json"
require "ttl_memoizeable"

# Assumes $redis is an already-configured Redis client and that ActiveSupport is loaded for 1.minute.
class ApplicationConfig
  class << self
    extend TTLMemoizeable

    def config
      JSON.parse($redis.get("some_big_json_string"))
    end

    ttl_memoized_method :config, ttl: 1.minute # Redis/JSON.parse will only be hit once per minute from this process
  end
end

ApplicationConfig.config # => {...} Redis/JSON.parse will be called
ApplicationConfig.config # => {...} Redis/JSON.parse will NOT be called
# ... at least 1 minute later ...
ApplicationConfig.config # => {...} Redis/JSON.parse will be called
It works on instance methods as well; however, this is less useful because state is not shared across threads without the use of a global:
require "ttl_memoizeable"

class ApplicationConfig
  extend TTLMemoizeable

  def config
    JSON.parse($redis.get("some_big_json_string"))
  end

  ttl_memoized_method :config, ttl: 1.minute
end

ApplicationConfig.new.config # => {...} Redis/JSON.parse will be called
ApplicationConfig.new.config # => {...} Redis/JSON.parse will be called

application_config = ApplicationConfig.new
application_config.config # => {...} Redis/JSON.parse will be called
application_config.config # => {...} Redis/JSON.parse will NOT be called
# ... at least 1 minute later ...
application_config.config # => {...} Redis/JSON.parse will be called
You likely don't want to test the implementation of this library, but rather the logic of your memoized method. In that case you probably want "fresh" data on every invocation of the method. There are a few approaches, depending on your preference.
- Use the reset method provided for you. It follows the pattern of reset_memoized_value_for_#{method_name}. Note that this will only reset the value for the current thread, and shouldn't be used to try and create consistent data state across processes.
def test_config
  ApplicationConfig.reset_memoized_value_for_config # or in a setup method or before block if available
  # Parentheses keep Ruby from parsing the hash literal as a block.
  assert_equal({...}, ApplicationConfig.config)
end
- Disable ttl memoization globally in your tests. This will prevent a memoized value from ever being returned.
TTLMemoizeable.disable!
- Reset ttl memoization values before/after your test runs.
# RSpec
RSpec.configure do |config|
  config.around { TTLMemoizeable.reset!; _1.run; TTLMemoizeable.reset! }
end

# minitest
def setup
  TTLMemoizeable.reset!
end

def teardown
  TTLMemoizeable.reset!
end
- Conditionally TTL memoize the method based on test environment or some other condition.
def config
  JSON.parse($redis.get("some_big_json_string"))
end

ttl_memoized_method :config, ttl: 1.minute unless test_env?
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and the created tag, and push the .gem file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/huntresslabs/ttl_memoizeable.
bundle exec bump ${major / minor / patch / pre} --tag --edit-changelog
git push
git push --tags
gem build
gem push
The gem is available as open source under the terms of the MIT License.