Fix: avoid ZSTD codec from overriding service codec factory. #7037
- addresses opensearch-project#7012 Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
This works, but feels awfully specific to the fact that we have these two custom codecs in the project. A few ideas before I hit approve/merge.
- Can we check whether the codec is not custom (aka well known codecs) instead of whether it's a known custom codec?
- Does it make sense to make names such as "CUSTOM:ZSTD" and then look for "CUSTOM:" instead or is it a silly idea?
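The two ideas above can be compared with a minimal sketch. It assumes the built-in codec names are the three listed in this PR's description (default, best_compression, lucene_default); the class name, method names, and the "CUSTOM:" prefix constant are hypothetical and are not part of OpenSearch.

```java
import java.util.Set;

// Illustrative only: neither check exists in OpenSearch under these names.
final class CodecNameChecks {
    // Built-in codec names as listed in this PR's description (assumed complete).
    static final Set<String> WELL_KNOWN = Set.of("default", "best_compression", "lucene_default");

    // Hypothetical naming convention from the second suggestion.
    static final String CUSTOM_PREFIX = "CUSTOM:";

    // Idea 1: treat anything outside the well-known set as custom.
    static boolean isWellKnown(String codecName) {
        return WELL_KNOWN.contains(codecName);
    }

    // Idea 2: mark custom codecs explicitly with a prefix in the setting value.
    static boolean hasCustomPrefix(String codecName) {
        return codecName.startsWith(CUSTOM_PREFIX);
    }

    public static void main(String[] args) {
        System.out.println("ZSTD well known? " + isWellKnown("ZSTD"));
        System.out.println("CUSTOM:ZSTD prefixed? " + hasCustomPrefix("CUSTOM:ZSTD"));
    }
}
```

The first check needs no change to what users type in index.codec, while the second would change the user-facing names, which is the trade-off the questions above are probing.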
Gradle Check (Jenkins) Run Completed with:
      */
     @Override
     public Optional<CodecServiceFactory> getCustomCodecServiceFactory(final IndexSettings indexSettings) {
-        return Optional.of(new CustomCodecServiceFactory());
+        String codec = indexSettings.getValue(EngineConfig.INDEX_CODEC_SETTING);
Please add a unit test for this class.
@dblock Do you mean requiring the user to specify all custom codecs as, for example,
One comment: if an index sets this value to use the custom codec, and another plugin also supplies a codec (say, for a k-NN index), which codec will be picked up? Or will it lead to failures? Example (might result in failure; can we check this case):
I have created issue #7032, which tries to explore a possible solution. One suggestion: can we add Javadoc on top of these plugins, and also on the EnginePlugin class that provides this interface, stating that if an index tries to use two codecs, index creation can fail?
- Removed custom classes for CodecService and CodecServiceFactory.
- Also removed PerFieldMappingPostingFormatCodec -- not required.
- Added documentation.
Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
I reviewed the code again to see if I could do it without overriding the existing codec service factory, and I can. So, I have removed the CodecService classes altogether. Note that the custom compression codecs are registered by calling the org.apache.lucene.codecs.Codec constructor here. The
@dblock The names for custom compression codecs,
@navneet1v @martin-gaievski Can you check if this also addresses #7032?
Codecov Report
@@ Coverage Diff @@
## main #7037 +/- ##
============================================
- Coverage 70.78% 70.72% -0.07%
- Complexity 59269 59278 +9
============================================
Files 4823 4820 -3
Lines 283985 283962 -23
Branches 40953 40952 -1
============================================
- Hits 201026 200820 -206
- Misses 66403 66693 +290
+ Partials 16556 16449 -107
... and 505 files with indirect coverage changes
Now that you have removed all of the CodecServiceFactory classes, I want to know how the codec is now used or attached to a particular index. Is it that if someone specifies index.codec: ZSTD, the ZstdCodec would be picked up if already present?
@navneet1v Yes. That happens here, which calls Lucene's registered codecs here. Lucene's Flamegraphs for the run below, with
This is awesome, @mulugetam. Basically the standard service loader mechanism is sufficient here, right?
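For readers following the thread, here is a rough, self-contained sketch of name-based codec lookup. It is not Lucene code: in Lucene the lookup is Codec.forName(String) and codecs are discovered through the Java service-provider mechanism, whereas this stand-in replaces discovery with explicit register calls. The class names CodecRegistrySketch and NamedCodec are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for a name-keyed codec registry (assumption, not Lucene's implementation).
final class CodecRegistrySketch {
    // Hypothetical codec placeholder; only the name matters for lookup.
    static final class NamedCodec {
        private final String name;
        NamedCodec(String name) { this.name = name; }
        String name() { return name; }
    }

    private final Map<String, NamedCodec> byName = new ConcurrentHashMap<>();

    // Explicit registration replaces service-provider discovery in this sketch.
    void register(NamedCodec codec) {
        if (byName.putIfAbsent(codec.name(), codec) != null) {
            throw new IllegalStateException("codec already registered: " + codec.name());
        }
    }

    // Resolve the value of index.codec to a codec, failing loudly on unknown names.
    NamedCodec forName(String name) {
        NamedCodec codec = byName.get(name);
        if (codec == null) {
            throw new IllegalArgumentException("no codec named '" + name + "', available: " + available());
        }
        return codec;
    }

    Set<String> available() {
        return Set.copyOf(byName.keySet());
    }

    public static void main(String[] args) {
        CodecRegistrySketch registry = new CodecRegistrySketch();
        registry.register(new NamedCodec("ZSTD"));
        registry.register(new NamedCodec("ZSTDNODICT"));
        System.out.println(registry.forName("ZSTD").name());
    }
}
```

With a lookup like this, a plugin only has to make its codec discoverable by name; no custom service factory is needed to attach it to an index.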
- Zstandard version 1.5.5 contains a bug fix for a rare corruption error described here: https://github.com/facebook/zstd/releases/tag/v1.5.5. The zstd-jni version we use here, 1.5.5-1, uses Zstandard v1.5.5. Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
@reta I have also upgraded the zstd-jni version from 1.5.4-1 to 1.5.5-1. Version 1.5.5-1 is based on ZSTD version 1.5.5 that addresses the rare corruption bug described here: https://github.com/facebook/zstd/releases/tag/v1.5.5
* Fix: enable ZSTD codec only if index.codec is set to ZSTD.
  - addresses #7012
  Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
* Removed custom CodecService and CodecServiceFactory classes.
  - Removed custom classes for CodecService and CodecServiceFactory.
  - Also removed PerFieldMappingPostingFormatCodec -- not required.
  - Added documentation.
  Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
* Bump zstd-jni version from 1.5.4-1 to 1.5.5-1.
  - Zstandard version 1.5.5 contains a bug fix for a rare corruption error described here: https://github.com/facebook/zstd/releases/tag/v1.5.5. The zstd-jni version we use here, 1.5.5-1, uses Zstandard v1.5.5.
  Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
---------
Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
(cherry picked from commit 569e90c)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
The new ZSTD compression codec adds ZSTD to the existing compression codecs (default, best_compression, and lucene_default). This PR allows the compression codec to give a custom-codec service factory only when index.codec is set to ZSTD or ZSTDNODICT.
Description
Fixes issue #7012 by explicitly avoiding the creation of a custom codec service unless the index.codec value is either ZSTD or ZSTDNODICT.
Issues Resolved
Resolves #7012
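The gating described above can be sketched as follows. This is a minimal illustration, not the plugin's code: ZstdFactoryGate, FactoryStub, and customFactoryFor are hypothetical names, and exact, case-sensitive matching of the setting value is an assumption.

```java
import java.util.Optional;
import java.util.Set;

// Illustrative gate: hand out a custom factory only for the ZSTD codec names.
final class ZstdFactoryGate {
    // Codec names taken from the PR description.
    static final Set<String> ZSTD_CODECS = Set.of("ZSTD", "ZSTDNODICT");

    // Hypothetical placeholder for the plugin's codec service factory type.
    static final class FactoryStub {}

    // Returns a factory only when index.codec names a ZSTD codec; otherwise
    // returns empty so the default codec service stays in effect.
    static Optional<FactoryStub> customFactoryFor(String indexCodec) {
        return ZSTD_CODECS.contains(indexCodec)
            ? Optional.of(new FactoryStub())
            : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(customFactoryFor("ZSTD").isPresent());
        System.out.println(customFactoryFor("best_compression").isPresent());
    }
}
```

Returning an empty Optional for every other value is what keeps the plugin from overriding the service codec factory for indices that never asked for ZSTD.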
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.