analyze offset don't response #265

xiongleijack · 2022-02-10T08:30:27Z

elasticsearch-analysis-pinyin version 7.10.2.
Question: when I test the analyzer, start_offset and end_offset are not populated. The steps and output are below.
1. Create the index

PUT /medcl/ 
{
    "settings" : {
        "analysis" : {
            "analyzer" : {
                "pinyin_analyzer" : {
                    "tokenizer" : "my_pinyin"
                }
            },
            "tokenizer" : {
                "my_pinyin" : {
                    "type" : "pinyin",
                    "keep_separate_first_letter" : false,
                    "keep_full_pinyin" : true,
                    "keep_original" : true,
                    "limit_first_letter_length" : 16,
                    "lowercase" : true,
                    "remove_duplicated_term" : true,
                    "term_vector": "with_positions_offsets"
                }
            }
        }
    }
}

2. Run the analysis

POST /medcl/_analyze
{
  "text": ["刘德华"],
  "analyzer": "pinyin_analyzer"
}

3. Result

{
  "tokens" : [
    {
      "token" : "liu",
      "start_offset" : 0,
      "end_offset" : 0,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "刘德华",
      "start_offset" : 0,
      "end_offset" : 0,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "ldh",
      "start_offset" : 0,
      "end_offset" : 0,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "de",
      "start_offset" : 0,
      "end_offset" : 0,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "hua",
      "start_offset" : 0,
      "end_offset" : 0,
      "type" : "word",
      "position" : 2
    }
  ]
}

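Note: I also tried the latest version, and start_offset and end_offset are still both 0. Is something wrong in my configuration, or does the pinyin plugin currently not support computing offsets?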
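
For anyone reproducing this, here is a minimal Python sketch that confirms every token in the response has a zero-width offset span. It is not part of the plugin: `tokens_without_offsets` is a hypothetical helper name, and the embedded JSON is a compact copy of the result shown above.

```python
import json

# Compact copy of the _analyze response from step 3.
RESPONSE_TEXT = """
{
  "tokens": [
    {"token": "liu", "start_offset": 0, "end_offset": 0, "type": "word", "position": 0},
    {"token": "刘德华", "start_offset": 0, "end_offset": 0, "type": "word", "position": 0},
    {"token": "ldh", "start_offset": 0, "end_offset": 0, "type": "word", "position": 0},
    {"token": "de", "start_offset": 0, "end_offset": 0, "type": "word", "position": 1},
    {"token": "hua", "start_offset": 0, "end_offset": 0, "type": "word", "position": 2}
  ]
}
"""


def tokens_without_offsets(response):
    """Return the tokens whose start_offset and end_offset are both 0.

    Hypothetical helper for this report; a non-empty token that was
    mapped back to the input text would have end_offset > 0.
    """
    return [
        t["token"]
        for t in response.get("tokens", [])
        if t["start_offset"] == 0 and t["end_offset"] == 0
    ]


response = json.loads(RESPONSE_TEXT)
missing = tokens_without_offsets(response)
print(f"{len(missing)} of {len(response['tokens'])} tokens have zero offsets")
```

Running the same check against the output of another tokenizer on the same text would show whether the zero offsets are specific to the pinyin tokenizer.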
