I am trying to reindex using the Elasticsearch Python client, following https://elasticsearch-py.readthedocs.org/en/master/helpers.html#elasticsearch.helpers.reindex. But I keep getting the following exception: elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeout
The stack trace of the error is:
Traceback (most recent call last):
File "~/es_test.py", line 33, in <module>
main()
File "~/es_test.py", line 30, in main
target_index='users-2')
File "~/ENV/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 306, in reindex
chunk_size=chunk_size, **kwargs)
File "~/ENV/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 182, in bulk
for ok, item in streaming_bulk(client, actions, **kwargs):
File "~/ENV/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 124, in streaming_bulk
raise e
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeout(HTTPSConnectionPool(host='myhost', port=9243): Read timed out. (read timeout=10))
Besides increasing the timeout, is there anything else that can prevent this exception?
EDIT: Python code
from elasticsearch import Elasticsearch, RequestsHttpConnection, helpers
from requests.auth import HTTPBasicAuth

es = Elasticsearch(connection_class=RequestsHttpConnection,
                   host='myhost',
                   port=9243,
                   http_auth=HTTPBasicAuth(username, password),
                   use_ssl=True,
                   verify_certs=True,
                   timeout=600)
helpers.reindex(es, source_index=old_index, target_index=new_index)
2 Answers
This can happen because of an OutOfMemoryError in the Java heap space, which means you are not giving Elasticsearch enough memory for what you are asking it to do. Check the logs in
/var/log/elasticsearch
for any exceptions. See also https://github.com/elastic/elasticsearch/issues/2636
I suffered from this problem for a couple of days. Changing the request_timeout parameter to 30 (that is, 30 seconds) did not work. Finally, I had to edit the streaming_bulk and reindex APIs in elasticsearch-py,
changing the chunk_size parameter from its default of 500 (processing 500 documents per batch) to a smaller number of documents per batch. I changed mine to 50, and it worked fine for me. No more read timeout errors.
def streaming_bulk(client, actions, chunk_size=50, raise_on_error=True, expand_action_callback=expand_action, raise_on_exception=True, **kwargs):
def reindex(client, source_index, target_index, query=None, target_client=None, chunk_size=50, scroll='5m', scan_kwargs={}, bulk_kwargs={}):
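Note that chunk_size is already a parameter of helpers.reindex (as the signature above shows), so passing chunk_size=50 at the call site should have the same effect as editing the installed library. To see why a smaller chunk_size helps, here is a minimal, self-contained sketch of the batching that streaming_bulk performs internally; the chunk() helper is hypothetical, written only for illustration, and is not part of elasticsearch-py:

```python
def chunk(actions, chunk_size):
    """Yield successive lists of at most chunk_size actions,
    mirroring how streaming_bulk groups documents into one
    bulk request per batch."""
    batch = []
    for action in actions:
        batch.append(action)
        if len(batch) >= chunk_size:
            yield batch
            batch = []
    if batch:
        yield batch

# 1200 dummy documents standing in for the index being reindexed.
docs = [{"_id": i, "field": "value"} for i in range(1200)]

batches_default = list(chunk(docs, 500))  # default chunk_size
batches_small = list(chunk(docs, 50))     # reduced chunk_size

print(len(batches_default))  # 3 bulk requests
print(len(batches_small))    # 24 bulk requests
```

A smaller chunk_size means more round trips, but each bulk request carries a smaller body and finishes faster on the server, which keeps each response inside the client's read timeout.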