Kubernetes HA: Flannel throws SubnetManager error

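I have been following the steps at https://kubernetes.io/docs/setup/independent/high-availability/ to bring up an HA cluster. I am using CoreOS nodes (VERSION=1688.5.3) and Kubernetes v1.10.

I went with the option of running all three etcd instances on the master nodes. For the load balancer, I used a containerized keepalived, as shown at https://github.com/alterway/docker-keepalived. The keepalived.conf file loaded into the containerized keepalived is the one given in the k8s HA guide.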

When I reach the step that configures the CNI network (https://kubernetes.io/docs/setup/independent/high-availability/#install-cni-network), the flannel-ds pod goes into CrashLoopBackOff with the error: "Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-fjn6w': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-fjn6w: dial tcp 10.96.0.1:443: i/o timeout"

What could be the problem? Below are the iptables rules on the master node running the flannel-ds pod:

The flannel pod is trying to retrieve its configuration from the API server using the service IP 10.96.0.1, which is supposed to be DNATed to the node IPs:
    -A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
    -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-SIIK55AX7MK5ONR7
    -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-GBLS75FLCCJBNQB6
    -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-2CDZMOLH2PKAG52U

But I don't see these rules being triggered at all; their packet and byte counters are all zero:
    0     0 KUBE-SEP-SIIK55AX7MK5ONR7  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* default/kubernetes:https */ statistic mode random probability 0.33332999982
    0     0 KUBE-SEP-GBLS75FLCCJBNQB6  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* default/kubernetes:https */ statistic mode random probability 0.50000000000
    0     0 KUBE-SEP-2CDZMOLH2PKAG52U  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* default/kubernetes:https */
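
One way to confirm that observation is to filter the verbose listing for rules whose packet counter is still zero. The following is a minimal sketch and not part of the original post: the helper name zero_hit_rules is hypothetical, and it assumes the column layout that iptables -L -n -v prints (pkts, bytes, target, ...).

```shell
#!/bin/sh
# Hypothetical helper: print the target of every rule whose packet
# counter (first column) is exactly 0.
# Input: the output of `iptables -t nat -L <chain> -n -v` on stdin.
zero_hit_rules() {
    # The "Chain ..." line and the column header never have "0" in
    # field 1, so they are skipped without special handling.
    awk '$1 == "0" { print $3 }'
}

# Example (needs root on the node):
#   iptables -t nat -L KUBE-SVC-NPX46M4PTMTKRN6Y -n -v | zero_hit_rules
```

If every KUBE-SEP target is printed, no packet has traversed the service chain since the counters were last reset.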


A curl request to the service IP does not work; however, a curl request sent directly to the API server address gets a response:

    master # curl -k https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/
    curl: (7) Failed to connect to 10.96.0.1 port 443: Connection timed out

    master # curl -k https://10.106.73.226:6443/api/v1/namespaces/kube-system/pods/
    {
      "kind": "Status",
      "apiVersion": "v1",
      "metadata": {

      },
      "status": "Failure",
      "message": "pods is forbidden: User \"system:anonymous\" cannot list pods in the namespace \"kube-system\"",
      "reason": "Forbidden",
      "details": {
        "kind": "pods"
      },
      "code": 403
    }

Also note that the service endpoints have been set correctly to the API server address:

    master # kubectl describe svc kubernetes
    Name:              kubernetes
    Namespace:         default
    Labels:            component=apiserver
                       provider=kubernetes
    Annotations:       <none>
    Selector:          <none>
    Type:              ClusterIP
    IP:                10.96.0.1
    Port:              https  443/TCP
    TargetPort:        6443/TCP
    Endpoints:         10.106.73.226:6443
    Session Affinity:  ClientIP
    Events:            <none>

    master # kubectl cluster-info
    Kubernetes master is running at https://10.106.73.226:6443
    KubeDNS is running at https://10.106.73.226:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

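I tried manually adding a DNAT iptables rule to map the service IP to the API server IP, but it did not seem to help, although I am not sure whether I added the rule to the correct iptables chain.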

EDIT 1 -- Full iptables

    master ~ # iptables -S -t nat
    -P PREROUTING ACCEPT
    -P INPUT ACCEPT
    -P OUTPUT ACCEPT
    -P POSTROUTING ACCEPT
    -N DOCKER
    -N KUBE-MARK-DROP
    -N KUBE-MARK-MASQ
    -N KUBE-NODEPORTS
    -N KUBE-POSTROUTING
    -N KUBE-SEP-PE4UL45OLJLNLYYS
    -N KUBE-SERVICES
    -N KUBE-SVC-NPX46M4PTMTKRN6Y
    -A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
    -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
    -A PREROUTING -d 10.96.0.1/32 -p tcp -m tcp --dport 443 -j DNAT --to-destination 10.106.73.226:6443
    -A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
    -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
    -A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
    -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
    -A DOCKER -i docker0 -j RETURN
    -A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
    -A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
    -A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
    -A KUBE-SEP-PE4UL45OLJLNLYYS -s 10.106.73.226/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
    -A KUBE-SEP-PE4UL45OLJLNLYYS -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-PE4UL45OLJLNLYYS --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 10.106.73.226:6443
    -A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
    -A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
    -A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
    -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-PE4UL45OLJLNLYYS --mask 255.255.255.255 --rsource -j KUBE-SEP-PE4UL45OLJLNLYYS
    -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-PE4UL45OLJLNLYYS

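Note: I manually added the rule -A PREROUTING -d 10.96.0.1/32 -p tcp -m tcp --dport 443 -j DNAT --to-destination 10.106.73.226:6443 in the hope of mapping 10.96.0.1 to the apiserver IP, but this did not change the behavior of either the curl request or the flannel pod.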
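
To see which nat chains actually match the service IP, the dump can be filtered by destination. This is a rough sketch under assumptions: the function name chains_matching_dest is hypothetical, and it only looks at plain -d matches in iptables -S output. It is worth keeping in mind that packets generated on the host itself (such as the curl above) traverse the nat OUTPUT chain rather than PREROUTING, so a rule that exists only in PREROUTING is never consulted for them.

```shell
#!/bin/sh
# Hypothetical helper: print the chain of every appended rule that
# matches the given destination (for example 10.96.0.1/32).
# Input: the output of `iptables -S -t nat` on stdin.
chains_matching_dest() {
    awk -v dest="$1" '
        $1 == "-A" {
            # Field 2 is the chain name; scan the remaining options
            # for a "-d <dest>" pair.
            for (i = 3; i < NF; i++)
                if ($i == "-d" && $(i + 1) == dest) { print $2; break }
        }'
}

# Example (needs root on the node):
#   iptables -S -t nat | chains_matching_dest 10.96.0.1/32
```

The helper prints one chain name per matching rule, so a chain with two matching rules appears twice.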

Current state of the pods on the master:

    master ~ # kubectl get pods -o wide --all-namespaces
    NAME                                  READY     STATUS              RESTARTS   AGE       IP              NODE
    etcd-master                           1/1       Running             0          13d       10.106.73.226   master
    kube-apiserver-master                 1/1       Running             0          13d       10.106.73.226   master
    kube-controller-manager-master        1/1       Running             1          13d       10.106.73.226   master
    kube-dns-86f4d74b45-dkzlk             0/3       ContainerCreating   0          13d       <none>          master
    kube-flannel-ds-j5fxd                 0/1       CrashLoopBackOff    3550       13d       10.106.73.226   master
    kube-proxy-pml47                      1/1       Running             0          13d       10.106.73.226   master
    kube-scheduler-master                 1/1       Running             0          13d       10.106.73.226   master

1 Answer

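    All of your settings look fine, including the routes and the sysctl values.

    The only thing I can guess is that something is wrong in the firewall rules. Please make sure that you accept forwarded traffic in the FORWARD chain.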
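
A quick way to look at the FORWARD chain's default policy is to read it from the iptables -S output. This is only a sketch: the helper name forward_policy is hypothetical, and a policy of ACCEPT does not rule out individual DROP or REJECT rules inside the chain.

```shell
#!/bin/sh
# Hypothetical helper: print the default policy of the FORWARD chain.
# Input: the output of `iptables -S` (filter table) on stdin.
forward_policy() {
    # Policy lines have the form "-P <chain> <policy>".
    awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}

# Example (needs root on the node):
#   iptables -S | forward_policy
```

If this prints DROP, forwarded traffic is discarded unless an explicit rule in the chain accepts it.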

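    You can check it like this:

    • Stop K8s.

    • Stop the firewall.

    • Stop Docker.

    • Write the following content to the file /var/lib/iptables/rules-save (overwrite the file if it already exists):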

    *filter
    :INPUT ACCEPT [0:0]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    COMMIT

    • Start the firewall.

    • Start Docker.

    • Start K8s.

    • Check the service.

    This is the only reason I can think of for the service problem you are seeing.
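
The steps above can be sketched as a small script. Everything beyond the rules file path is an assumption: the systemd unit names (kubelet.service, iptables-restore.service, docker.service) are guesses for a CoreOS node and should be adjusted, and the function names are hypothetical. The rules file in the sketch ends with COMMIT, which iptables-restore expects at the end of each table. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Assumed unit names; override them through the environment if they differ.
K8S_UNIT=${K8S_UNIT:-kubelet.service}
FIREWALL_UNIT=${FIREWALL_UNIT:-iptables-restore.service}
DOCKER_UNIT=${DOCKER_UNIT:-docker.service}
RULES_FILE=${RULES_FILE:-/var/lib/iptables/rules-save}

# Run a command, or only print it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Write a filter table whose chains all default to ACCEPT.
write_rules() {
    cat > "$1" <<'EOF'
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
COMMIT
EOF
}

# Stop the services, replace the rules file, start the services again.
reset_firewall() {
    run systemctl stop "$K8S_UNIT"
    run systemctl stop "$FIREWALL_UNIT"
    run systemctl stop "$DOCKER_UNIT"
    run write_rules "$RULES_FILE"
    run systemctl start "$FIREWALL_UNIT"
    run systemctl start "$DOCKER_UNIT"
    run systemctl start "$K8S_UNIT"
}

# Example: `reset_firewall` prints the plan; as root,
# `DRY_RUN=0 reset_firewall` executes it.
```

Keeping the dry run as the default makes it possible to review the exact commands before anything on the node is stopped.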
