Elasticsearch query performance degradation

We have set up a 7-node Elasticsearch cluster. Each node has 16G RAM, an 8-core CPU, and runs CentOS 6.

Elasticsearch version: 1.3.0
Heap size: 9000m

1 master (non-data)
1 master-eligible node (non-data)
5 data nodes

There are 10 indices. One index holds 55 million documents [254Gi (508Gi with replica)]; the remaining indices hold about 20k documents each.

5-10 new documents are indexed every second.

The problem is that search is slow: queries take 2000 ms to 5000 ms on average. Only some queries complete within 1 second.

Mapping:

{
    "my_index": {
        "mappings": {
            "product": {
                "_id": {
                    "path": "product_refer_id"
                },
                "properties": {
                    "product_refer_id": {
                        "type": "string"
                    },
                    "body": {
                        "type": "string"
                    },
                    "cat": {
                        "type": "string"
                    },
                    "cat_score": {
                        "type": "float"
                    },
                    "compliant": {
                        "type": "string"
                    },
                    "created": {
                        "type": "integer"
                    },
                    "facets": {
                        "properties": {
                            "ItemsPerCategoryCount": {
                                "properties": {
                                    "terms": {
                                        "properties": {
                                            "field": {
                                                "type": "string"
                                            },
                                            "size": {
                                                "type": "long"
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    },
                    "fields": {
                        "type": "string"
                    },
                    "from": {
                        "type": "string"
                    },
                    "id": {
                        "type": "string"
                    },
                    "image": {
                        "type": "string"
                    },
                    "lang": {
                        "type": "string"
                    },
                    "main_cat": {
                        "properties": {
                            "Technology": {
                                "type": "double"
                            }
                        }
                    },
                    "md5_product": {
                        "type": "string"
                    },
                    "post_created": {
                        "type": "long"
                    },
                    "query": {
                        "properties": {
                            "bool": {
                                "properties": {
                                    "must": {
                                        "properties": {
                                            "query_string": {
                                                "properties": {
                                                    "default_field": {
                                                        "type": "string"
                                                    },
                                                    "query": {
                                                        "type": "string"
                                                    }
                                                }
                                            },
                                            "range": {
                                                "properties": {
                                                    "main_cat.Technology": {
                                                        "properties": {
                                                            "gte": {
                                                                "type": "string"
                                                            }
                                                        }
                                                    },
                                                    "sub_cat.Technology.computers": {
                                                        "properties": {
                                                            "gte": {
                                                                "type": "string"
                                                            }
                                                        }
                                                    }
                                                }
                                            },
                                            "term": {
                                                "properties": {
                                                    "product.secondary_cat": {
                                                        "type": "string"
                                                    }
                                                }
                                            }
                                        }
                                    }
                                }
                            },
                            "match_all": {
                                "type": "object"
                            }
                        }
                    },
                    "secondary_cat": {
                        "type": "string"
                    },
                    "secondary_cat_score": {
                        "type": "float"
                    },
                    "size": {
                        "type": "long"
                    },
                    "sort": {
                        "properties": {
                            "_uid": {
                                "type": "string"
                            }
                        }
                    },
                    "sub_cat": {
                        "properties": {
                            "Technology": {
                                "properties": {
                                    "audio": {
                                        "type": "double"
                                    },
                                    "computers": {
                                        "type": "double"
                                    },
                                    "gadgets": {
                                        "type": "double"
                                    },
                                    "geekchic": {
                                        "type": "double"
                                    }
                                }
                            }
                        }
                    },
                    "title": {
                        "type": "string"
                    },
                    "product": {
                        "type": "string"
                    }
                }
            }
        }
    }
}

We are using the default analyzer.
Any suggestions? Is this configuration insufficient?

2 Answers

1

It looks like the index does not fit in memory, so there will be more disk I/O going on. Are you using SSDs? If not, you should get some.

Apart from that, your nodes need more resources (memory, CPU) to handle an index of this size.
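To put rough numbers on the memory point, here is a back-of-the-envelope sketch using only the figures from the question. It assumes "16G" means 16 GiB, that the large index is spread evenly over the 5 data nodes, and that all RAM outside the JVM heap is available to the OS file cache (in practice it will be less).

```python
# Rough capacity estimate from the numbers given in the question.
GIB = 1024 ** 3
MIB = 1024 ** 2

data_nodes = 5
ram_per_node = 16 * GIB        # "16G RAM" per node, read here as GiB (assumption)
heap_per_node = 9000 * MIB     # JVM heap of 9000m
index_size = 508 * GIB         # the large index including its replica

# RAM not taken by the heap is an upper bound for the OS file cache.
page_cache_per_node = ram_per_node - heap_per_node
total_page_cache = data_nodes * page_cache_per_node
cached_fraction = total_page_cache / index_size

print(f"file cache per data node: {page_cache_per_node / GIB:.1f} GiB")
print(f"file cache in the cluster: {total_page_cache / GIB:.1f} GiB")
print(f"share of the index that can be cached: {cached_fraction:.1%}")
```

Under these assumptions well under a tenth of the index can sit in the file cache at any time, so most searches will have to touch the disks.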

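I am a bit surprised by the size here: "only" 55 million documents take up about 250GB, and I cannot see that you are storing any larger blobs in there (I may be wrong, it is hard to tell from the mapping definition). Maybe you can consider leaving some of the data unanalyzed, in case you never need to query it and only need to retrieve it. That would reduce the index size.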
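For reference, the average on-disk size per document follows directly from the figures in the question. This small sketch assumes "254Gi" means 254 GiB for the primary copy only.

```python
# Average on-disk size per document, primary copy only.
GIB = 1024 ** 3

primary_size_bytes = 254 * GIB     # "254Gi" without the replica
document_count = 55_000_000        # 55 million documents

avg_doc_bytes = primary_size_bytes / document_count
print(f"average size per document: {avg_doc_bytes / 1024:.1f} KiB")
```

That is close to 5 KiB per document, which is a lot for a mapping that consists mostly of short string and numeric fields, unless body is long.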

Beyond that, I have no other ideas without knowing the whole infrastructure in more detail.

Answered 2024-04-26T03:31:44+08:00
1

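To add to Torsten Engelbrecht's answer, the default analyzer may be part of the culprit. It does no stemming, so every form of every word is indexed as a separate token, which means that a single verb in a language with complex conjugation can be indexed a dozen times. This also lowers the quality of the search results. The same applies if your documents contain formatting information (HTML tags?).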

Furthermore, stop words are disabled by default, which means that every "the", "a", ... (in English, for example) is indexed as well.

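You should consider using a language-specific analyzer (maybe the snowball analyzer?) with stop words for the language used in your documents. That limits the size of the inverted index, which should improve performance.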
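As a rough sketch of what that could look like, the snippet below builds an index-creation body that defines a snowball analyzer and applies it to the two free-text fields. It assumes the documents are in English (the question does not say) and that title and body are the fields being searched; the analyzer name product_text is made up for illustration.

```python
import json

# Index-creation body with a stemming, stop-word-removing analyzer.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "product_text": {            # hypothetical analyzer name
                    "type": "snowball",      # stemming plus stop-word removal
                    "language": "English",   # assumption: documents are English
                }
            }
        }
    },
    "mappings": {
        "product": {
            "properties": {
                # Assumption: these are the fields queried as free text.
                "title": {"type": "string", "analyzer": "product_text"},
                "body": {"type": "string", "analyzer": "product_text"},
            }
        }
    },
}

print(json.dumps(index_body, indent=4))
```

The analyzer of an existing field cannot be changed in place, so this would go into a new index that the documents are then reindexed into.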

Also, consider setting md5, url, id and other fields that are never searched as text to not_analyzed.
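A minimal sketch of such a mapping follows. Which fields count as exact-match or retrieve-only is a guess from their names in the question's mapping, so adjust both lists to how the fields are really used.

```python
import json

# Assumption: these fields are only matched exactly, never searched as text.
exact_match_fields = ["product_refer_id", "md5_product", "id", "lang"]
# Assumption: these fields are only returned with hits, never queried.
retrieve_only_fields = ["image"]

properties = {}
for name in exact_match_fields:
    # Indexed as a single token, with no analysis.
    properties[name] = {"type": "string", "index": "not_analyzed"}
for name in retrieve_only_fields:
    # Not indexed at all, but still returned as part of _source.
    properties[name] = {"type": "string", "index": "no"}

mapping = {"product": {"properties": properties}}
print(json.dumps(mapping, indent=4))
```

As with the analyzer, the index option of an existing field cannot be switched afterwards, so this also needs a new index and a reindex.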

Answered 2024-04-26T03:31:44+08:00
