ElasticSearch - inconsistent terms facet results

I couldn't find an answer in previous posts, so I hope this one is relevant. I'm having trouble with ElasticSearch terms facets.

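When I query the document count for every term of the facet, I get, for some field value, say 8. But when I query the document count for that specific value of the field, I get, say, 19.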

To be more thorough: I'm using Kibana, and here are the queries and responses (I was told to rename the field values, FYI):

all term facets count query:

{
    "facets" : {
        "terms" : {
            "terms" : {
                "fields" : ["field.name"],
                "size" : 6,
                "order" : "count",
                "exclude" : []
            },
            "facet_filter" : {
                "fquery" : {
                    "query" : {
                        "filtered" : {
                            "query" : {
                                "bool" : {
                                    "should" : [{
                                            "query_string" : {
                                                "query" : "*"
                                            }
                                        }
                                    ]
                                }
                            },
                            "filter" : {
                                "bool" : {
                                    "must" : [{
                                            "match_all" : {}

                                        }
                                    ]
                                }
                            }
                        }
                    }
                }
            }
        }
    },
    "size" : 0
}

the response:

{
    "took" : 1,
    "timed_out" : false,
    "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
    },
    "hits" : {
        "total" : 20374,
        "max_score" : 0.0,
        "hits" : []
    },
    "facets" : {
        "terms" : {
            "_type" : "terms",
            "missing" : 10567,
            "total" : 9918,
            "other" : 9781,
            "terms" : [{
                    "term" : "fieldValue1",
                    "count" : 43
                }, {
                    "term" : "fieldValue2",
                    "count" : 27
                }, {
                    "term" : "fieldValue3",
                    "count" : 23
                }, {
                    "term" : "fieldValue4",
                    "count" : 23
                }, {
                    "term" : "fieldValue5",
                    "count" : 13
                }, {
                    "term" : "fieldValue6",
                    "count" : 8
                }
            ]
        }
    }
}

the query on "fieldValue6"

{
    "facets" : {
        "terms" : {
            "terms" : {
                "fields" : ["field.name"],
                "size" : 6,
                "order" : "count",
                "exclude" : []
            },
            "facet_filter" : {
                "fquery" : {
                    "query" : {
                        "filtered" : {
                            "query" : {
                                "bool" : {
                                    "should" : [{
                                            "query_string" : {
                                                "query" : "*"
                                            }
                                        }
                                    ]
                                }
                            },
                            "filter" : {
                                "bool" : {
                                    "must" : [{
                                            "terms" : {
                                                "field.name" : ["fieldValue6"]
                                            }
                                        }
                                    ]
                                }
                            }
                        }
                    }
                }
            }
        }
    },
    "size" : 0
}

the response:

{
    "took" : 2,
    "timed_out" : false,
    "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
    },
    "hits" : {
        "total" : 20374,
        "max_score" : 0.0,
        "hits" : []
    },
    "facets" : {
        "terms" : {
            "_type" : "terms",
            "missing" : 0,
            "total" : 19,
            "other" : 0,
            "terms" : [{
                    "term" : "fieldValue6",
                    "count" : 19
                }
            ]
        }
    }
}

The field I apply the facet filter to (or whatever it should actually be called) is set to "not analyzed":

properties: {
    type_ref2Strack: {
        properties: {
            position: {
                type: long
            }
            name: {
                index: not_analyzed
                norms: {
                    enabled: false
                }
                index_options: docs
                type: string
            }
        }
    }
}

1 Answer

This is a long-known limitation of Elasticsearch facets (now called aggregations).

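The key problem is that the facet is run on each shard with the given size, and the per-shard results are then combined. A term that does not make the top list on a given shard contributes nothing from that shard, so its merged count can be cut off.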
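To illustrate the cut-off, here is a minimal Python sketch. It is a toy simulation, not Elasticsearch code: the three shards, their contents, and the terms "otherA" and "otherB" are made up for illustration, and the merge logic is only an assumption about how per-shard top lists are combined.

```python
from collections import Counter

def shard_top_terms(shard_docs, shard_size):
    """Top `shard_size` (term, count) pairs as one shard would report them."""
    return Counter(shard_docs).most_common(shard_size)

def merged_facet(shards, size, shard_size=None):
    """Merge per-shard top lists; terms cut off on a shard add nothing from it."""
    if shard_size is None:
        shard_size = size
    merged = Counter()
    for docs in shards:
        for term, count in shard_top_terms(docs, shard_size):
            merged[term] += count
    return merged.most_common(size)

# Hypothetical data: "fieldValue6" occurs 8 + 6 + 5 = 19 times in total,
# but only makes the top 3 on the first shard.
shards = [
    ["fieldValue1"] * 10 + ["fieldValue2"] * 9 + ["fieldValue6"] * 8 + ["otherA"] * 1,
    ["fieldValue1"] * 10 + ["fieldValue2"] * 9 + ["otherA"] * 7 + ["fieldValue6"] * 6,
    ["fieldValue1"] * 10 + ["fieldValue2"] * 9 + ["otherB"] * 7 + ["fieldValue6"] * 5,
]

print(merged_facet(shards, size=3))
# [('fieldValue1', 30), ('fieldValue2', 27), ('fieldValue6', 8)]

print(merged_facet(shards, size=3, shard_size=4))
# [('fieldValue1', 30), ('fieldValue2', 27), ('fieldValue6', 19)]
```

With the default per-shard size the merged count for "fieldValue6" is 8, while filtering on that value alone would find all 19 documents, which matches the kind of discrepancy described in the question.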

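There are two non-ideal ways to work around this:
- Add a "shard_size" parameter larger than the size you actually need. This will mostly work, but it still does not guarantee exact counts.
- Index into just one shard. That way it will always collect exact results. This limits how well the index scales to a large number of documents, but YMMV.
For details, see here: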

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_document_counts_are_approximate
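A rough sketch of what the two workarounds might look like as request bodies, built here as Python dicts. The placement of "shard_size" inside the terms facet follows the suggestion above and should be checked against the documentation for your version. The value 60 is an arbitrary assumption, and the helper name `terms_facet_request` is hypothetical.

```python
import json

def terms_facet_request(field, size, shard_size):
    """Build a terms facet body that asks each shard for more terms than `size`."""
    return {
        "facets": {
            "terms": {
                "terms": {
                    "fields": [field],
                    "size": size,
                    # Assumption: a larger per-shard list reduces cut-off counts.
                    "shard_size": shard_size,
                    "order": "count",
                }
            }
        },
        "size": 0,
    }

# Settings for creating an index with a single shard (second workaround).
# The shard count is chosen when the index is created.
single_shard_settings = {"settings": {"index": {"number_of_shards": 1}}}

request = terms_facet_request("field.name", size=6, shard_size=60)
print(json.dumps(request, indent=4))
print(json.dumps(single_shard_settings, indent=4))
```

Neither option is free: a larger "shard_size" makes each shard do more work, and a single shard gives exact counts at the cost of scaling.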
Answered 2024-05-06T01:22:38+08:00
