CJKFoldingFilter

YUL Customizations

YUL has added some mappings and moved all of the mappings out of the Java code into an XML file. The mappings are now loaded in a static block, so each filter instance no longer re-adds the entries to the map.
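The static-block loading described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the class name MappingLoaderSketch, the <mappings>/<mapping from="" to=""/> layout, and the two sample pairs (U+56FD to U+570B, U+5B66 to U+5B78) are assumptions, and the XML is inlined as a string here so the example is self-contained rather than read from a file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class MappingLoaderSketch {
    // Hypothetical XML layout and sample pairs; the real mapping file may differ.
    private static final String SAMPLE_XML =
        "<mappings>"
      + "<mapping from=\"\u56FD\" to=\"\u570B\"/>"
      + "<mapping from=\"\u5B66\" to=\"\u5B78\"/>"
      + "</mappings>";

    // Shared by every instance: code point to fold from -> code point to fold to.
    static final Map<Integer, Integer> MAPPINGS = new HashMap<>();

    // Runs once when the class is initialized, not once per instance.
    static {
        try (InputStream in =
                 new ByteArrayInputStream(SAMPLE_XML.getBytes(StandardCharsets.UTF_8))) {
            Document doc = DocumentBuilderFactory.newInstance()
                                                 .newDocumentBuilder()
                                                 .parse(in);
            NodeList nodes = doc.getElementsByTagName("mapping");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                MAPPINGS.put(e.getAttribute("from").codePointAt(0),
                             e.getAttribute("to").codePointAt(0));
            }
        } catch (Exception e) {
            // Fail class initialization loudly if the mappings cannot be read.
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(MAPPINGS.size() + " mappings loaded");
    }
}
```

Because the map is static and filled during class initialization, creating many filter instances does not repeat the parsing or the insertions.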

This is a Lucene filter and filter factory (see lucene.apache.org) that folds certain CJK characters to improve recall. Put it in your analysis chain BEFORE any ICU transform from Traditional->Simplified Han, because it converts modern Japanese Kanji to their traditional equivalents, which that transform can then map to simplified forms.
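To make the idea of folding concrete, here is a minimal sketch of replacing mapped code points in a string. It does not use Lucene and is not the filter's implementation: the class name FoldSketch, the fold method, and the two sample pairs (the modern Japanese forms U+56FD and U+5B66 folded to the traditional forms U+570B and U+5B78) are assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class FoldSketch {
    // Illustrative pairs only; the real filter ships its own mapping table.
    private static final Map<Integer, Integer> FOLD = new HashMap<>();
    static {
        FOLD.put(0x56FD, 0x570B);
        FOLD.put(0x5B66, 0x5B78);
    }

    // Replace every mapped code point; leave all other characters untouched.
    static String fold(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .map(cp -> FOLD.getOrDefault(cp, cp))
             .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // U+65E5 U+672C U+56FD is folded to U+65E5 U+672C U+570B.
        System.out.println(fold("\u65E5\u672C\u56FD"));
    }
}
```

In this sketch, characters without a mapping pass through unchanged, so a later Traditional->Simplified step sees traditional forms wherever a mapping applied.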

Usage

Clone the project:

git clone git://github.com/solrmarc/CJKFoldingFilter.git

Run the jar ant task:

ant jar

Put the CJKFoldingFilter.jar file found in the dist directory into your Solr lib directory.
Use the Solr CJKFoldingFilterFactory in your schema.xml file:

<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="10000" autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="edu.stanford.lucene.analysis.CJKFoldingFilterFactory"/>
    <filter class="solr.ICUTransformFilterFactory" id="Traditional-Simplified"/>
    <filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory" han="true" hiragana="true" katakana="true" hangul="true" outputUnigrams="true" />
  </analyzer>
</fieldType>

Contributing

Fork it
Create your feature branch (`git checkout -b my-new-feature`)
Commit your changes (`git commit -am 'Added some feature'`)
Push to the branch (`git push origin my-new-feature`)
Create new Pull Request
