GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) when connecting PolyBase with Kerberos

We want to use PolyBase to connect our SQL Server 2016 Enterprise to our Kerberized on-prem Hadoop cluster running Cloudera 5.14.

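I configured PolyBase by following the Microsoft PolyBase Guide. After working on this for several days, I cannot get past this exception: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]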

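Microsoft provides a built-in diagnostic tool for troubleshooting the connectivity with PolyBase and Kerberos. The troubleshooting guide from Microsoft has 4 checkpoints; I pass the first three and fail at the fourth: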

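  • Checkpoint 1: Successful! Authenticated against the KDC and received a TGT.

  • Checkpoint 2: Successful! According to the troubleshooting guide, PolyBase attempts to access HDFS and fails because the request does not contain the necessary service ticket.

  • Checkpoint 3: Successful! The second hex dump indicates that SQL Server successfully used the TGT and acquired the applicable service ticket for the name node's SPN from the KDC.

  • Checkpoint 4: Not successful. At this step SQL Server should be authenticated by Hadoop using the ST (service ticket) and be granted a session to access the secured resource.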

krb5.conf file

[libdefaults]
default_realm = COMPANY.REALM.COM
dns_lookup_kdc = false
dns_lookup_realm = false
ticket_lifetime = 86400
renew_lifetime = 604800
forwardable = true
default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
permitted_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
udp_preference_limit = 1
kdc_timeout = 3000
[realms]
COMPANY.REALM.COM = {
kdc = ipadress.kdc.host
admin_server = ipadress.kdc.host
}
[logging]
default = FILE:/var/log/krb5/kdc.log
kdc = FILE:/var/log/krb5/kdc.log
admin_server = FILE:/var/log/krb5/kadmind.log
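
Since the encryption-type settings in this file turn out to matter further down, here is a minimal Python sketch for reading them out of a krb5.conf. The helper names `parse_krb5_conf` and `enctypes` are made up for illustration, and the parser is deliberately simplified: it handles one level of `NAME = { ... }` nesting and keeps only the last value of a repeated key, which is not how a real Kerberos library treats the file.

```python
def parse_krb5_conf(text):
    """Parse krb5.conf text into {section: {key: value}} (simplified sketch).

    Keys inside a "NAME = {" block are stored as "NAME.key".
    Repeated keys overwrite each other, unlike in real krb5 parsers.
    """
    sections = {}
    current = None   # dict for the section being read
    block = None     # name of the enclosing "NAME = {" block, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
            block = None
            continue
        if line == "}":
            block = None
            continue
        if current is None or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if value == "{":
            block = key
            continue
        if block:
            key = f"{block}.{key}"
        current[key] = value
    return sections


def enctypes(sections, key):
    """Return the space-separated enctype list stored under [libdefaults]."""
    return sections.get("libdefaults", {}).get(key, "").split()


# Excerpt of the krb5.conf posted above.
SAMPLE = """
[libdefaults]
default_realm = COMPANY.REALM.COM
permitted_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
[realms]
COMPANY.REALM.COM = {
kdc = ipadress.kdc.host
admin_server = ipadress.kdc.host
}
"""

conf = parse_krb5_conf(SAMPLE)
print(enctypes(conf, "permitted_enctypes"))
```

Running the same function against the copy of krb5.conf on each cluster node is one way to see whether every host really permits the same encryption types.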

core-site.xml for PolyBase on SQL Server

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>ipc.client.connect.max.retries</name>
    <value>2</value>
  </property>
  <property>
    <name>ipc.client.connect.max.retries.on.timeouts</name>
    <value>2</value>
  </property>

<!-- kerberos security information, PLEASE FILL THESE IN ACCORDING TO HADOOP CLUSTER CONFIG -->
<property>
    <name>polybase.kerberos.realm</name>
    <value>COMPANY.REALM.COM</value>
  </property>
  <property>
    <name>polybase.kerberos.kdchost</name>
    <value>ipadress.kdc.host</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>KERBEROS</value>
  </property>
</configuration>

hdfs-site.xml for PolyBase on SQL Server

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.block.size</name>
    <value>268435456</value> 
  </property>
  <!-- Client side file system caching is disabled below for credential refresh and 
       setting the below cache disabled options to true might result in 
       stale credentials when an alter credential or alter datasource is performed
  -->
  <property>
    <name>fs.wasb.impl.disable.cache</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.wasbs.impl.disable.cache</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.asv.impl.disable.cache</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.asvs.impl.disable.cache</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.hdfs.impl.disable.cache</name>
    <value>true</value>
  </property>
<!-- kerberos security information, PLEASE FILL THESE IN ACCORDING TO HADOOP CLUSTER CONFIG -->
  <property>
    <name>dfs.namenode.kerberos.principal</name>
    <value>hdfs/_HOST@COMPANY.REALM.COM</value> 
  </property>
</configuration>

PolyBase exception

[2018-06-22 12:51:50,349] WARN  2872[main] - org.apache.hadoop.security.UserGroupInformation.hasSufficientTimeElapsed(UserGroupInformation.java:1156) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
[2018-06-22 12:51:53,568] WARN  6091[main] - org.apache.hadoop.security.UserGroupInformation.hasSufficientTimeElapsed(UserGroupInformation.java:1156) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
[2018-06-22 12:51:56,127] WARN  8650[main] - org.apache.hadoop.security.UserGroupInformation.hasSufficientTimeElapsed(UserGroupInformation.java:1156) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
[2018-06-22 12:51:58,998] WARN 11521[main] - org.apache.hadoop.security.UserGroupInformation.hasSufficientTimeElapsed(UserGroupInformation.java:1156) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
[2018-06-22 12:51:59,139] WARN 11662[main] - org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:676) - Couldn't setup connection for hdfs@COMPANY.REALM.COM to IPADRESS_OF_NAMENODE:8020
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

Log entry on the NameNode

Socket Reader #1 for port 8020: readAndProcess from client IP-ADRESS_SQL-SERVER threw exception [javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: AES128 CTS mode with HMAC SHA1-96 encryption type not in permitted_enctypes list)]]

Auth failed for IP-ADRESS_SQL-SERVER:60484:null (GSS initiate failed) with true cause: (GSS initiate failed)
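
The client and the NameNode report different mechanism-level reasons for the same failed handshake, which is easy to miss in long log lines. A small Python sketch that pulls the GSSException text out of both messages follows; the function name `gss_errors` and the regular expression are my own, written only against the two messages quoted in this post.

```python
import re

# Matches "GSSException: <major> (Mechanism level: <minor>)]" as it appears
# in the two log messages quoted in this question.
GSS_PATTERN = re.compile(
    r"GSSException: (?P<major>.+?) \(Mechanism level: (?P<minor>.+?)\)\]"
)


def gss_errors(log_text):
    """Return a list of (major, minor) tuples for every GSSException found."""
    return [(m.group("major"), m.group("minor"))
            for m in GSS_PATTERN.finditer(log_text)]


CLIENT_LINE = (
    "javax.security.sasl.SaslException: GSS initiate failed "
    "[Caused by GSSException: No valid credentials provided "
    "(Mechanism level: Failed to find any Kerberos tgt)]"
)
NAMENODE_LINE = (
    "[javax.security.sasl.SaslException: GSS initiate failed "
    "[Caused by GSSException: Failure unspecified at GSS-API level "
    "(Mechanism level: AES128 CTS mode with HMAC SHA1-96 encryption type "
    "not in permitted_enctypes list)]]"
)

for side, line in (("client", CLIENT_LINE), ("namenode", NAMENODE_LINE)):
    for major, minor in gss_errors(line):
        print(f"{side}: {major} / {minor}")
```

Side by side, the server-side reason (an encryption type rejected on the NameNode) is more specific than the generic "Failed to find any Kerberos tgt" seen on the PolyBase side.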

The confusing part for me is the log entry from the NameNode, because AES128 CTS mode with HMAC SHA1-96 is already in the list of permitted enctypes, as shown in the krb5.conf and in the Cloudera Manager UI.
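
To make that check concrete, here is a short Python sketch that tests whether the encryption type named in the NameNode message appears in a `permitted_enctypes` line. The mapping from the Java-style display name to the krb5.conf name is an assumption on my part, and the one-entry "stale" line is a hypothetical example, not something taken from our cluster.

```python
# Assumed mapping from the names printed in Java GSS messages to the
# names used in krb5.conf; only the two AES types from this post are listed.
JAVA_TO_KRB5 = {
    "AES128 CTS mode with HMAC SHA1-96": "aes128-cts-hmac-sha1-96",
    "AES256 CTS mode with HMAC SHA1-96": "aes256-cts-hmac-sha1-96",
}


def is_permitted(java_name, permitted_line):
    """Check a Java-style enctype name against a permitted_enctypes value."""
    permitted = {name.lower() for name in permitted_line.split()}
    return JAVA_TO_KRB5[java_name] in permitted


# Value from the krb5.conf shown in this post.
POSTED = "aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96"
# Hypothetical older copy that a node might still be using.
HYPOTHETICAL_STALE = "aes256-cts-hmac-sha1-96"

name = "AES128 CTS mode with HMAC SHA1-96"
print(is_permitted(name, POSTED))
print(is_permitted(name, HYPOTHETICAL_STALE))
```

If the posted file passes this check but the NameNode still rejects the type, the process that rejects it is presumably reading a different configuration than the one shown here.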

[Screenshot: Cloudera Manager UI, krb_enc_types]

Thanks for your help!

1 Answer

  • 1

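    After restarting the cluster, the problem resolved itself. I think the cause was that the krb5.conf file could not be distributed to all nodes of our Hadoop cluster because some services were still running. Cloudera Manager was also showing a warning about a stale Kerberos configuration. Thanks a lot, everyone!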
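
For anyone who wants to confirm this kind of drift before restarting, here is a rough Python sketch that compares krb5.conf contents collected from several nodes and reports the hosts that differ from the majority. How the files are collected is left out, the host names are hypothetical, and treating the most common version as the current one is an assumption.

```python
import hashlib
from collections import defaultdict


def fingerprint(text):
    """Hash the config text, ignoring blank lines and surrounding whitespace."""
    normalized = "\n".join(
        line.strip() for line in text.splitlines() if line.strip()
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_by_content(configs):
    """Group hosts by config fingerprint, largest group first."""
    groups = defaultdict(list)
    for host, text in configs.items():
        groups[fingerprint(text)].append(host)
    return sorted(groups.values(), key=len, reverse=True)


def stale_hosts(configs):
    """Hosts whose config differs from the most common version (assumed current)."""
    groups = group_by_content(configs)
    return sorted(host for group in groups[1:] for host in group)


CURRENT = "[libdefaults]\npermitted_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96\n"
OLDER = "[libdefaults]\npermitted_enctypes = aes256-cts-hmac-sha1-96\n"

# Hypothetical hosts; in practice the text would be read from each node.
configs = {
    "node1.example": CURRENT,
    "node2.example": CURRENT + "\n",
    "namenode.example": OLDER,
}
print(stale_hosts(configs))
```

A non-empty result points at the same condition that the stale-configuration warning in Cloudera Manager was flagging.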
