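I have deployed an Elasticsearch cluster in a 4-node Kubernetes cluster. The Elasticsearch deployments are:

es-master (3 dedicated replicas)

es-client (2 dedicated replicas)

es-data (2 dedicated replicas)

All 7 of these Elasticsearch nodes run in the 4-node kube cluster. However, when I restart one of the Kubernetes nodes that is running 1 client node, the Elasticsearch cluster stops responding to requests and the Elasticsearch cluster URL is down.

I thought that with multiple replicas this should keep working. Any ideas on how I can make the Elasticsearch URL always available?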

Python script that checks the es-cluster health:

import time
import requests

## Cluster health ##
url = 'http://<es-cluster>:30000/_cluster/health'
headers = {'content-type': 'application/json'}

# Poll the health endpoint once per second; stop on the first non-200 response.
while True:
    get_req = requests.get(url, timeout=60, headers=headers)
    if get_req.status_code != 200:
        print("Error- cluster request failed")
        print(get_req.status_code)
        print(get_req.text)
        break
    print("passed")
    print(get_req.status_code)
    print(get_req.text)
    time.sleep(1)

Output when everything is up:

passed
200
{"cluster_name":"ebs_elastic","status":"green","timed_out":false,"number_of_nodes":7,"number_of_data_nodes":2,"active_primary_shards":46,"active_shards":92,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}
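
A 200 response on its own only says that something answered. As a minimal sketch, the health body could also be parsed and compared against what is expected. `summarize_health` is a hypothetical helper name, and the expected node count of 7 is taken from the deployment list above; the sample string is a shortened copy of the output shown.

```python
import json

def summarize_health(body, expected_nodes):
    """Return (ok, reason) for a _cluster/health JSON body."""
    health = json.loads(body)
    if health["timed_out"]:
        return False, "health request timed out"
    if health["status"] != "green":
        return False, "status is %s" % health["status"]
    if health["number_of_nodes"] != expected_nodes:
        return False, "expected %d nodes, saw %d" % (
            expected_nodes, health["number_of_nodes"])
    return True, "ok"

# Shortened copy of the health response shown above.
sample = ('{"cluster_name":"ebs_elastic","status":"green",'
          '"timed_out":false,"number_of_nodes":7,"number_of_data_nodes":2}')
print(summarize_health(sample, 7))
```

With a check like this, a restart that drops one Elasticsearch node would show up as a node-count mismatch even if the URL itself stays reachable.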

Output when one node is down:

hemanty-mac-0:Desktop hemanty$ python ping.py 
Traceback (most recent call last):
  File "ping.py", line 25, in <module>
    post_req = requests.get(url, timeout = 60, headers=headers)
  File "/Library/Python/2.7/site-packages/requests-2.19.1-py2.7.egg/requests/api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "/Library/Python/2.7/site-packages/requests-2.19.1-py2.7.egg/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Library/Python/2.7/site-packages/requests-2.19.1-py2.7.egg/requests/sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Python/2.7/site-packages/requests-2.19.1-py2.7.egg/requests/sessions.py", line 622, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Python/2.7/site-packages/requests-2.19.1-py2.7.egg/requests/adapters.py", line 513, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='<es-cluster>', port=30000): Max retries exceeded with url: /_cluster/health (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x108765f90>: Failed to establish a new connection: [Errno 51] Network is unreachable',))
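
The traceback shows the script dying on an unhandled connection error, which makes it hard to tell how long the URL stays unreachable. Below is a rough sketch of a probe that records the failure and keeps going. It is written for Python 3 and uses the standard library's `urllib` instead of `requests` so that it has no dependencies. `probe`, `watch`, and the `opener` parameter are hypothetical names introduced only for this illustration.

```python
import time
import urllib.error
import urllib.request

def probe(url, timeout=5, opener=urllib.request.urlopen):
    """Return (reachable, detail) for one GET of the health URL."""
    try:
        with opener(url, timeout=timeout) as resp:
            return True, "HTTP %d" % resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a non-2xx status.
        return True, "HTTP %d" % err.code
    except OSError as err:
        # URLError and socket timeouts are OSError subclasses:
        # no usable connection was made.
        return False, "unreachable: %s" % err

def watch(url, rounds, interval=1.0, opener=urllib.request.urlopen):
    """Probe repeatedly; return how many rounds found the URL unreachable."""
    failures = 0
    for _ in range(rounds):
        reachable, detail = probe(url, opener=opener)
        print(time.strftime("%H:%M:%S"), detail)
        if not reachable:
            failures += 1
        time.sleep(interval)
    return failures
```

Calling something like `watch(url, rounds=300)` while a Kubernetes node restarts would give a count of unreachable seconds instead of a single crash.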

Please note that I do NOT see any downtime when I restart any pod (master, client, or data). It only happens when a Kubernetes node is restarted.