
How do I return the actual value (rather than the lowercased one) when searching with a terms aggregation?


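I'm working on an Elasticsearch (6.2) project in which the index has many keyword fields that are normalized with the lowercase filter to support case-insensitive search. Searching works well and returns the actual values of the normalized fields (not lowercased). The aggregations, however, return the lowercased values rather than the actual ones.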

The following example is taken from the Elasticsearch documentation:

https://www.elastic.co/guide/en/elasticsearch/reference/master/normalizer.html

Create the index:

PUT index
{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "foo": {
          "type": "keyword",
          "normalizer": "my_normalizer"
        }
      }
    }
  }
}

Index the documents:

PUT index/_doc/1
{
  "foo": "Bar"
}

PUT index/_doc/2
{
  "foo": "Baz"
}

Search with an aggregation:

GET index/_search
{
  "size": 0,
  "aggs": {
    "foo_terms": {
      "terms": {
        "field": "foo"
      }
    }
  }
}

Result:

{
  "took": 43,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped" : 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "foo_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "bar",
          "doc_count": 1
        },
        {
          "key": "baz",
          "doc_count": 1
        }
      ]
    }
  }
}

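If you inspect the aggregation, you will see that the lowercased values are returned, for example "key": "bar".

Is there a way to change the aggregation so that it returns the actual values?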

For example "key": "Bar"
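
For context, the Elasticsearch documentation for normalizers notes that a keyword is converted before it is indexed, which is why aggregations return normalized values while `_source` keeps the original string. Below is a minimal pure-Python sketch of that behaviour. It does not talk to Elasticsearch: `normalize` and `terms_agg` are hypothetical helpers, the ASCII folding is only approximated with `unicodedata`, and buckets are sorted by key purely for determinism.

```python
import unicodedata
from collections import Counter


def normalize(value: str) -> str:
    """Rough stand-in for a ["lowercase", "asciifolding"] normalizer (assumption)."""
    lowered = value.lower()
    # Approximate ASCII folding: decompose, then drop combining marks.
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def terms_agg(indexed_terms):
    """Count documents per indexed term, shaped like terms-aggregation buckets."""
    counts = Counter(indexed_terms)
    return [{"key": key, "doc_count": n} for key, n in sorted(counts.items())]


sources = [{"foo": "Bar"}, {"foo": "Baz"}]            # what _source keeps
indexed = [normalize(doc["foo"]) for doc in sources]  # what the field indexes

buckets = terms_agg(indexed)
print(buckets)
```

The aggregation only ever sees the entries of `indexed`, so the original casing held in `sources` cannot appear in a bucket key.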

1 Answer

  • 1

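    If you want case-insensitive search but exact values in your aggregations, you don't need a normalizer at all. You can simply use a text field (which lowercases its tokens and is therefore case-insensitive by default) with a keyword sub-field. Use the former for searching and the latter for aggregations. It looks like this: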

    PUT index
    {
      "mappings": {
        "_doc": {
          "properties": {
            "foo": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            }
          }
        }
      }
    }
    

    After indexing the two documents, you can run a terms aggregation on foo.keyword:

    GET index/_search
    {
      "size": 2,
      "aggs": {
        "foo_terms": {
          "terms": {
            "field": "foo.keyword"
          }
        }
      }
    }
    

    The result looks like this:

    {
      "took": 0,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
          {
            "_index": "index",
            "_type": "_doc",
            "_id": "2",
            "_score": 1,
            "_source": {
              "foo": "Baz"
            }
          },
          {
            "_index": "index",
            "_type": "_doc",
            "_id": "1",
            "_score": 1,
            "_source": {
              "foo": "Bar"
            }
          }
        ]
      },
      "aggregations": {
        "foo_terms": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "Bar",
              "doc_count": 1
            },
            {
              "key": "Baz",
              "doc_count": 1
            }
          ]
        }
      }
    }
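
To make the division of labour in this answer concrete, here is a small in-memory Python sketch. It is not Elasticsearch itself. It assumes the `text` field uses the default standard analyzer, approximated here by splitting into word tokens and lowercasing them. `analyze`, `TinyIndex` and its methods are hypothetical names introduced only for illustration.

```python
import re
from collections import Counter


def analyze(text: str) -> list:
    """Crude approximation of the standard analyzer: word tokens, lowercased."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


class TinyIndex:
    """Toy model of a text field `foo` with a keyword sub-field `foo.keyword`."""

    def __init__(self):
        self.docs = {}

    def index(self, doc_id, foo):
        self.docs[doc_id] = {
            "source": {"foo": foo},
            "foo": set(analyze(foo)),  # lowercased tokens, used for search
            "foo.keyword": foo,        # exact value, used for aggregations
        }

    def match(self, query_text):
        """Like a match query on `foo`: the query text is analyzed the same way."""
        wanted = set(analyze(query_text))
        return sorted(i for i, d in self.docs.items() if d["foo"] & wanted)

    def terms(self):
        """Like a terms aggregation on `foo.keyword`: exact values are counted."""
        counts = Counter(d["foo.keyword"] for d in self.docs.values())
        return [{"key": k, "doc_count": n} for k, n in sorted(counts.items())]


idx = TinyIndex()
idx.index("1", "Bar")
idx.index("2", "Baz")

hits = idx.match("BAR")  # case-insensitive search finds document 1
buckets = idx.terms()    # bucket keys keep the original casing
print(hits, buckets)
```

Against a real cluster, the search half would be a `match` query on `foo` (the query text `BAR` is just an illustrative value). Only the aggregation needs to point at `foo.keyword`.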
    
