
Fuzzy matches score higher than exact matches


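I'm new to Elasticsearch and am trying to configure it to give me fuzzy matches. After implementing fuzzy search, an autocomplete (edge n-gram) filter, and shingles, exact matches seem to score lower than partial matches. For example, if the query is "Ring", "Brass Ring" appears to score higher than "Ring".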

Can anyone help me?

Here is how I create the index:

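from elasticsearch import Elasticsearch

es = Elasticsearch()  # assumption: default local node; adjust the hosts as needed

itemindex = es.indices.create(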
        index='mo-items-index-1',
        body={
        "settings": {
            "number_of_shards": 1,
            "analysis": {
                "filter": {
                    "autocomplete_filter": {
                        "type":     "edge_ngram",
                        "min_gram": 1,
                        "max_gram": 20
                    },
                    "custom_shingle": {
                        "type": "shingle",
                        "min_shingle_size": 2,
                        "max_shingle_size": 3,
                        "output_unigrams": True

                    },
                    "my_char_filter": {
                        "type": "pattern_replace",
                        "pattern": " ",
                        "replacement": ""
                    }
                },
                "analyzer": {
                    "autocomplete": {
                        "type":      "custom",
                        "tokenizer": "standard",
                        "filter": [
                            "lowercase",
                            "custom_shingle",
                            "autocomplete_filter",
                            "my_char_filter"
                        ]
                    }
                }
            }
        },
        "mappings": {
            "my_type": {
                "properties": {
                    "item_id": {
                        "type":     "string",
                        "analyzer": "autocomplete",
                        "search_analyzer": "standard"
                    },
                    "item_name": {
                        "type":     "string",
                        "analyzer": "autocomplete",
                        "search_analyzer": "standard"
                    }
                }
            }
        }
        },
        # Will ignore 400 errors, remove to ensure you're prompted
        ignore=400
    )

Here is how I query for a term:

res2 = es.search(
    index="mo-items-index-1",
    size=200,
    body={
        "query": {
            "multi_match": {
                "fields": ["item_name", "item_id"],
                "query": userQuery,
                "fuzziness": "AUTO"
            }
        },
        "highlight": {
            "fields": {
                "item_name": {},
                "item_id": {}
            }
        }
    }
)

1 Answer

  • 1

    There is a very simple way to "boost" the score of exact matches: use a bool query that combines your existing query and a term query as should clauses:

    "query": {
        "bool": {
          "should": [
            {
              "multi_match": {
                "fields": [
                  "item_name",
                  "item_id"
                ],
                "query": "Ring",
                "fuzziness": "AUTO"
              }
            },
            {
              "term": {
                "item_name.keyword": {
                  "value": "Ring"
                }
              }
            }
          ]
        }
      }
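
For reference, the bool query above could be built from Python roughly as follows. This is a minimal sketch, not the asker's actual code: the helper name `build_exact_boost_query` and the optional `exact_boost` parameter are assumptions introduced for illustration, and the sketch assumes the `item_name.keyword` sub-field exists in the mapping.

```python
def build_exact_boost_query(user_query,
                            fields=("item_name", "item_id"),
                            exact_field="item_name.keyword",
                            exact_boost=1.0):
    """Build a search body: the fuzzy multi_match plus a term clause for exact hits.

    Hypothetical helper. Both clauses sit under `should`, so a document that
    matches the term clause as well gets that clause's score added on top.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    {
                        "multi_match": {
                            "fields": list(fields),
                            "query": user_query,
                            "fuzziness": "AUTO",
                        }
                    },
                    {
                        # `boost` is optional; 1.0 leaves the term score unscaled.
                        "term": {
                            exact_field: {"value": user_query, "boost": exact_boost}
                        }
                    },
                ]
            }
        },
        "highlight": {"fields": {name: {} for name in fields}},
    }


# Usage with the client from the question (needs a running cluster):
# res2 = es.search(index="mo-items-index-1", size=200,
#                  body=build_exact_boost_query(userQuery))
```

Keep in mind that a `keyword` field is not analyzed, so the term clause only matches the stored value exactly, including letter case: "ring" would not match an item named "Ring".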
    

    You also need to add a sub-field of type keyword to the field to support exact matching:

    "mappings": {
        "my_type": {
          "properties": {
            "item_id": {
              "type": "string",
              "analyzer": "autocomplete",
              "search_analyzer": "standard"
            },
            "item_name": {
              "type": "string",
              "analyzer": "autocomplete",
              "search_analyzer": "standard",
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            }
          }
        }
      }
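
If the mapping is assembled in Python, as in the question, the sub-field could be added with a small helper like the one below. This is only a sketch: `add_keyword_subfield` is a hypothetical name, and the `properties` dict simply mirrors the field definitions from the question.

```python
import copy


def add_keyword_subfield(properties, field_name, subfield="keyword"):
    """Return a copy of `properties` where `field_name` gains a keyword sub-field.

    Hypothetical helper; the input dict is left unmodified.
    """
    updated = copy.deepcopy(properties)
    updated[field_name].setdefault("fields", {})[subfield] = {"type": "keyword"}
    return updated


# Field definitions as given in the question.
properties = {
    "item_id": {
        "type": "string",
        "analyzer": "autocomplete",
        "search_analyzer": "standard",
    },
    "item_name": {
        "type": "string",
        "analyzer": "autocomplete",
        "search_analyzer": "standard",
    },
}

mapping_properties = add_keyword_subfield(properties, "item_name")
# mapping_properties can then be used as the "properties" value under
# "mappings" -> "my_type" when creating the index.
```

One thing to watch: the `ignore=400` argument in the index-creation call also hides the error returned when the index already exists, so re-running that call against an existing index would most likely not apply the new mapping. Documents indexed before the sub-field was added would also need to be reindexed before the term clause can match them.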
    
