#24 by @maxjeblick and #29 by @SimJeg introduce a non-breaking refactoring:
- A press no longer requires the `compression_ratio` input argument, since some presses do not explicitly need it (e.g. `ThinKPress`, `SimLayerKVPress`). However, every press must have a `compression_ratio` attribute after any forward pass (an assertion was added in the tests) so that the average compression ratio can be measured on a benchmark.
- The core compression logic has been moved from `BasePress.forward_hook` to `BasePress.compress`. `BasePress.forward_hook` now only checks whether `compress` must be called (pre-filling vs. decoding), de-quantizes the cache before `compress`, and re-quantizes it afterwards.
- `BasePress` no longer implements a `score` method; this has been moved to `ScorerPress`, along with the associated `ScorerPress.compress` method.
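
The split of responsibilities described in this refactoring can be pictured with a toy, pure-Python sketch. It is not the library's code: the method signatures, the list-based "cache", the `is_prefilling` flag and the `AbsKeyPress` subclass are all assumptions made for illustration, and quantization handling is left out entirely.

```python
class BasePress:
    """Toy stand-in: forward_hook decides whether to compress, compress does the work."""

    def compress(self, keys, values):
        # Subclasses provide the actual compression logic.
        raise NotImplementedError

    def forward_hook(self, keys, values, is_prefilling):
        # Compression only happens during pre-filling; decoding passes through.
        if not is_prefilling:
            return keys, values
        return self.compress(keys, values)


class ScorerPress(BasePress):
    """Toy stand-in: keeps the highest-scoring key/value pairs."""

    def __init__(self, compression_ratio=0.0):
        # Stored as an attribute so it can be read back after a forward pass.
        self.compression_ratio = compression_ratio

    def score(self, keys, values):
        # Subclasses return one score per key/value pair.
        raise NotImplementedError

    def compress(self, keys, values):
        scores = self.score(keys, values)
        n_kept = int(len(keys) * (1 - self.compression_ratio))
        # Rank positions by score, keep the best n_kept, restore original order.
        ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
        kept = sorted(ranked[:n_kept])
        return [keys[i] for i in kept], [values[i] for i in kept]


class AbsKeyPress(ScorerPress):
    """Hypothetical scorer: larger absolute key value means more important."""

    def score(self, keys, values):
        return [abs(k) for k in keys]


press = AbsKeyPress(compression_ratio=0.5)
pruned_keys, pruned_values = press.forward_hook(
    [1, -4, 2, 3], ["a", "b", "c", "d"], is_prefilling=True
)
```

With this layout, a new scoring strategy only needs to override `score`, while a press that does not rank entries can override `compress` directly.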
Other features:
- Add `SimLayerKVPress`, #28 by @SimJeg and @dame-cell
- Add `ComposedPress`, #29 by @SimJeg
- Add `KeyReRotationPress`, #31 by @maxjeblick and @giulio98
- Fix `QuantizedCache`, #30 by @maxjeblick
- Add new tests, including an integration test on a sample from RULER