I have three servers:
blade1 (192.168.112.31),
blade2 (192.168.112.32) and
blade3 (192.168.112.33).
kafka_2.11-1.0.0 is installed on each server.
ZooKeeper is also installed on blade3 (192.168.112.33:2181).
I created the topic repl3part5 with the following command:
bin/kafka-topics.sh --zookeeper 192.168.112.33:2181 --create --replication-factor 3 --partitions 5 --topic repl3part5
When I describe the topic, it looks like this:
[root@blade1 kafka]# bin/kafka-topics.sh --describe --topic repl3part5 --zookeeper 192.168.112.33:2181
Topic:repl3part5 PartitionCount:5 ReplicationFactor:3 Configs:
Topic: repl3part5 Partition: 0 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
Topic: repl3part5 Partition: 1 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Topic: repl3part5 Partition: 2 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: repl3part5 Partition: 3 Leader: 2 Replicas: 2,1,3 Isr: 2,1,3
Topic: repl3part5 Partition: 4 Leader: 3 Replicas: 3,2,1 Isr: 3,2,1
I have one producer for this topic:
bin/kafka-console-producer.sh --broker-list 192.168.112.31:9092,192.168.112.32:9092,192.168.112.33:9092 --topic repl3part5
and a single consumer:
bin/kafka-console-consumer.sh --bootstrap-server 192.168.112.31:9092,192.168.112.32:9092,192.168.112.33:9092 --topic repl3part5 --consumer-property group.id=zoran_1
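Every message the producer sends is received by the consumer.
So far so good.
Now I want to test failover of the Kafka servers. If I stop the Kafka service on blade3, the consumer logs warnings but still consumes every produced message.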
[2018-01-30 14:30:01,203] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 3 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:30:01,299] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 3 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:30:01,475] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 3 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
Next I started the Kafka service on blade3 again and stopped the Kafka service on blade2. The consumer shows a warning, but all produced messages are still consumed.
[2018-01-30 14:31:38,164] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
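Next I started the Kafka service on blade2 again and stopped the Kafka service on blade1.
The consumer now shows warnings about nodes 1 and 2147483646, and also this one: Asynchronous auto-commit of offsets ... failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null.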
[2018-01-30 14:33:16,393] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,469] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,557] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,986] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:16,991] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:17,493] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:17,495] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,002] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,003] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=18, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=19, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=20, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:33:18,611] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,932] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:18,933] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=18, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=19, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=20, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:33:19,977] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 2147483646 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
[2018-01-30 14:33:19,978] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=18, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=19, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=20, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: null (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:33:19,979] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
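A side note on the odd node id 2147483646 in these warnings: as far as I understand the Java client, it opens a separate connection to the group coordinator and registers it under a synthetic id of Integer.MAX_VALUE minus the broker id. That formula is my assumption about the client internals, not something the logs state. Under that assumption the id decodes like this:

```shell
# Java's Integer.MAX_VALUE. Assumption: the consumer uses
# (MAX_VALUE - brokerId) as the connection id for the group coordinator.
java_int_max=2147483647
node_id=2147483646   # id copied from the WARN lines above

broker_id=$((java_int_max - node_id))
echo "coordinator connection points at broker $broker_id"
```

If the assumption holds, the consumer keeps trying to reach its group coordinator on broker 1, which would match the failing offset commits.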
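I tried to fix this by adding offsets.topic.replication.factor=2 (or 3) to all three server.properties files (one of them is here: https://pastebin.com/Japn0Grk), without success. My idea was that the __consumer_offsets topic was not replicated across the cluster, but that does not seem to be the case.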
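To check the replication idea more precisely, it helps to know which partition of __consumer_offsets holds the group zoran_1. The sketch below rests on two assumptions that are not confirmed anywhere in this post: that the broker picks the partition as abs(groupId.hashCode()) modulo offsets.topic.num.partitions, and that this setting is still at its default of 50. It reimplements Java's String.hashCode() in plain shell arithmetic and only works for ASCII group ids.

```shell
group="zoran_1"
offsets_partitions=50   # assumed default of offsets.topic.num.partitions

# Java String.hashCode(): h = 31*h + byte, wrapped to 32 bits.
h=0
for byte in $(printf '%s' "$group" | od -An -tu1); do
  h=$(( (h * 31 + byte) % 4294967296 ))
done

# Reinterpret the unsigned result as a signed 32-bit int.
if [ "$h" -ge 2147483648 ]; then h=$((h - 4294967296)); fi
# Absolute value, with Integer.MIN_VALUE mapped to 0.
if [ "$h" -eq -2147483648 ]; then h=0; fi
if [ "$h" -lt 0 ]; then h=$((-h)); fi

partition=$((h % offsets_partitions))
echo "group $group maps to __consumer_offsets partition $partition"
```

The Replicas and Isr columns for that partition can then be read from bin/kafka-topics.sh --describe --topic __consumer_offsets --zookeeper 192.168.112.33:2181. As far as I know, offsets.topic.replication.factor is only applied when __consumer_offsets is first created, so changing it afterwards would not alter a topic that already exists.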
While the Kafka service on blade1 is down, describing the topic gives:
[root@blade1 kafka]# bin/kafka-topics.sh --describe --topic repl3part5 --zookeeper 192.168.112.33:2181
Topic:repl3part5 PartitionCount:5 ReplicationFactor:3 Configs:
Topic: repl3part5 Partition: 0 Leader: 3 Replicas: 2,3,1 Isr: 3
Topic: repl3part5 Partition: 1 Leader: 3 Replicas: 3,1,2 Isr: 3
Topic: repl3part5 Partition: 2 Leader: 3 Replicas: 1,2,3 Isr: 3
Topic: repl3part5 Partition: 3 Leader: 3 Replicas: 2,1,3 Isr: 3
Topic: repl3part5 Partition: 4 Leader: 3 Replicas: 3,2,1 Isr: 3
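To make the state of the ISR easier to read, here is a minimal sketch that counts the partitions whose Isr list is shorter than their Replicas list. It runs against a copy of the describe output above rather than a live cluster; the variable names are only illustrative.

```shell
# Partition lines copied from the describe output above (blade1 stopped).
describe_output='Topic: repl3part5 Partition: 0 Leader: 3 Replicas: 2,3,1 Isr: 3
Topic: repl3part5 Partition: 1 Leader: 3 Replicas: 3,1,2 Isr: 3
Topic: repl3part5 Partition: 2 Leader: 3 Replicas: 1,2,3 Isr: 3
Topic: repl3part5 Partition: 3 Leader: 3 Replicas: 2,1,3 Isr: 3
Topic: repl3part5 Partition: 4 Leader: 3 Replicas: 3,2,1 Isr: 3'

# For each partition line, compare the size of the Isr list with the
# size of the Replicas list and count the lines where Isr is smaller.
under_replicated=$(printf '%s\n' "$describe_output" | awk '
  /Partition:/ {
    for (i = 1; i <= NF; i++) {
      if ($i == "Replicas:") r = split($(i + 1), a, ",")
      if ($i == "Isr:")      s = split($(i + 1), b, ",")
    }
    if (s < r) n++
  }
  END { print n + 0 }')

echo "under-replicated partitions: $under_replicated"
```

On a live cluster the same information should be available directly from bin/kafka-topics.sh --describe --zookeeper 192.168.112.33:2181 --under-replicated-partitions.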
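The producer now shows the following warning. It still puts messages on the topic, but those messages only increase the lag count on the partitions: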
[2018-01-30 14:37:21,816] WARN [Producer clientId=console-producer] Connection to node 1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
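I noticed that as long as the Kafka service on blade1 is up, I can stop blade2 and blade3 in any combination and the consumer can always consume messages. If the Kafka service on blade1 is down, the consumer cannot consume messages even though the Kafka services on blade2 and blade3 are up and running.
After I start the Kafka service on blade1 again, all messages the producer sent while the Kafka service on blade1 was down are replayed to the consumer, and the consumer terminal shows the following: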
[2018-01-30 14:44:30,817] ERROR [Consumer clientId=consumer-1, groupId=zoran_1] Offset commit failed on partition repl3part5-4 at offset 20: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:44:30,817] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=20, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=22, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:44:31,202] ERROR [Consumer clientId=consumer-1, groupId=zoran_1] Offset commit failed on partition repl3part5-4 at offset 22: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-01-30 14:44:31,202] WARN [Consumer clientId=consumer-1, groupId=zoran_1] Asynchronous auto-commit of offsets {repl3part5-4=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-3=OffsetAndMetadata{offset=24, metadata=''}, repl3part5-2=OffsetAndMetadata{offset=22, metadata=''}, repl3part5-1=OffsetAndMetadata{offset=24, metadata=''}, repl3part5-0=OffsetAndMetadata{offset=24, metadata=''}} failed: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: This is not the correct coordinator. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
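From this point on there are no more problems or warnings, and the system is fully functional.
Can someone explain to me why the Kafka server on blade1 is so important, and what options I have to be able to stop any of the servers (including the Kafka server on blade1) and still consume messages without delay? This is driving me crazy.
Can you help?
Regards.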