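Once Kerberos is enabled on a Hadoop cluster, authentication problems become very strange: if any condition is even slightly off, you get an exception. Most people keep Kerberos at arm's length.
1. While deploying a cluster client in a test environment, with Kerberos enabled on the cluster, I verified the client and found that authentication failed. The log was as follows: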
hdfs dfs -ls /
21/07/13 21:36:45 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
21/07/13 21:36:45 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
21/07/13 21:36:45 INFO retry.RetryInvocationHandler: java.io.IOException: DestHost:destPort xxxx.dev.com:8020 , LocalHost:localPort xxxx.dev.com/x.x.x.x. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS], while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over xxxx.dev.com/x.x.x.x:8020 after 1 failover attempts. Trying to failover after sleeping for 717ms.
2. Turn on debug output
export HADOOP_ROOT_LOGGER=DEBUG,console
export HADOOP_OPTS="-Dsun.security.krb5.debug=true -Djavax.net.debug=ssl"
Then check the user and group information. The key part of the output is excerpted below:
hadoop org.apache.hadoop.security.UserGroupInformation
>>>DEBUG <CCacheInputStream> key type: 0
>>>DEBUG <CCacheInputStream> auth time: Thu Jan 01 08:00:00 CST 1970
>>>DEBUG <CCacheInputStream> start time: null
>>>DEBUG <CCacheInputStream> end time: Thu Jan 01 08:00:00 CST 1970
>>>DEBUG <CCacheInputStream> renew_till time: null
>>> CCacheInputStream: readFlags()
>>> unsupported key type found the default TGT: 18
21/07/13 21:41:40 DEBUG security.UserGroupInformation: hadoop login
21/07/13 21:41:40 DEBUG security.UserGroupInformation: hadoop login commit
21/07/13 21:41:40 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: root
21/07/13 21:41:40 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: root" with name root
21/07/13 21:41:40 DEBUG security.UserGroupInformation: User entry: "root"
21/07/13 21:41:40 DEBUG security.UserGroupInformation: UGI loginUser:root (auth:SIMPLE)
User: root
Group Ids:
21/07/13 21:41:40 DEBUG security.Groups: GroupCacheLoader - load.
Groups: root
UGI: root (auth:SIMPLE)
Auth method SIMPLE
Keytab false
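With debug enabled the output is long, and only two lines in it really matter here. As a minimal sketch (not part of the original troubleshooting), the filter below pulls them out. The function name scan_ugi_debug is hypothetical, and it only looks for the two strings that appear in the output above.

```shell
# Hypothetical helper: read UserGroupInformation debug output on stdin and
# report the two symptoms seen above: the rejected TGT key type and the
# fallback to SIMPLE authentication.
scan_ugi_debug() {
    awk '
        /unsupported key type found the default TGT:/ { keytype = $NF }
        /UGI loginUser:.*\(auth:SIMPLE\)/             { simple = 1 }
        END {
            if (keytype != "") print "unsupported TGT key type: " keytype
            if (simple)        print "login fell back to auth:SIMPLE"
        }
    '
}
```

It would be used as: hadoop org.apache.hadoop.security.UserGroupInformation 2>&1 | scan_ugi_debug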
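3. Most problems of this kind in the past turned out to be caused by the JDK version, so I set JAVA_HOME explicitly. It still failed.
4. I then searched on this key line from the output above: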
unsupported key type found the default TGT: 18
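Searching online turned up two important pieces of information that finally cleared things up. It looks like an encryption type problem (see the reference links at the end).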
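The number in that message is a Kerberos encryption type number. Type 18 is aes256-cts-hmac-sha1-96, so the JVM appears to have been rejecting an AES-256 TGT in the credential cache. The lookup below is a small sketch for decoding such numbers; the function name enctype_name is hypothetical and it covers only the types named in this post.

```shell
# Hypothetical helper: map a Kerberos encryption type number (as printed in
# "key type: N" debug lines) to its name. Only the types mentioned in this
# post are listed; anything else is reported as unknown.
enctype_name() {
    case "$1" in
        1)  echo "des-cbc-crc" ;;
        3)  echo "des-cbc-md5" ;;
        16) echo "des3-cbc-sha1" ;;
        17) echo "aes128-cts-hmac-sha1-96" ;;
        18) echo "aes256-cts-hmac-sha1-96" ;;
        23) echo "rc4-hmac" ;;
        *)  echo "unknown ($1)" ;;
    esac
}
```

For example, enctype_name 18 prints aes256-cts-hmac-sha1-96, and enctype_name 23 prints rc4-hmac.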
So I added the full list of encryption types to /etc/krb5.conf:
[libdefaults]
default_tkt_enctypes = rc4-hmac aes256-cts aes128-cts des3-cbc-sha1 des-cbc-md5 des-cbc-crc
default_tgs_enctypes = rc4-hmac aes256-cts aes128-cts des3-cbc-sha1 des-cbc-md5 des-cbc-crc
permitted_enctypes = rc4-hmac aes256-cts aes128-cts des3-cbc-sha1 des-cbc-md5 des-cbc-crc
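Before retrying, it can help to confirm that all three settings are really present in the file the client reads. This is a rough sketch under the assumption that each setting sits on its own line; the function name check_enctypes is hypothetical, and it only checks that the keys exist, not that the values are valid.

```shell
# Hypothetical helper: print the enctype settings found in a krb5.conf-style
# file and return non-zero if any of the three is missing.
# Defaults to /etc/krb5.conf when no path is given.
check_enctypes() {
    conf="${1:-/etc/krb5.conf}"
    missing=0
    for key in default_tkt_enctypes default_tgs_enctypes permitted_enctypes; do
        if line=$(grep -E "^[[:space:]]*${key}[[:space:]]*=" "$conf"); then
            echo "$line"
        else
            echo "missing: $key" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_enctypes with no argument inspects /etc/krb5.conf; any missing key is reported on stderr.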
5. Re-running hadoop org.apache.hadoop.security.UserGroupInformation, authentication now succeeds:
Getting UGI for current user
21/07/13 22:49:21 DEBUG security.SecurityUtil: Setting hadoop.security.token.service.use_ip to true
21/07/13 22:49:21 DEBUG util.Shell: setsid exited with exit code 0
Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
21/07/13 22:49:21 DEBUG security.Groups: Creating new Groups object
21/07/13 22:49:21 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
21/07/13 22:49:21 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library
21/07/13 22:49:21 DEBUG security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution
21/07/13 22:49:21 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping
21/07/13 22:49:22 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
>>> KdcAccessibility: reset
>>> KdcAccessibility: reset
>>>KinitOptions cache name is /tmp/krb5cc_0
>>>DEBUG <CCacheInputStream> client principal is hdfs-hdtest@XXXXX.CN
>>>DEBUG <CCacheInputStream> server principal is krbtgt/XXXXX.CN@XXXXX.CN
>>>DEBUG <CCacheInputStream> key type: 23
>>>DEBUG <CCacheInputStream> auth time: Wed Jul 14 11:49:32 CST 2021
>>>DEBUG <CCacheInputStream> start time: Wed Jul 14 11:49:32 CST 2021
>>>DEBUG <CCacheInputStream> end time: Wed Jul 14 21:49:32 CST 2021
>>>DEBUG <CCacheInputStream> renew_till time: Wed Jul 21 11:49:32 CST 2021
>>> CCacheInputStream: readFlags() FORWARDABLE; RENEWABLE; INITIAL; PRE_AUTH;
>>>DEBUG <CCacheInputStream> client principal is hdfs-hdtest@XXXXX.CN
>>>DEBUG <CCacheInputStream> server principal is X-CACHECONF:/krb5_ccache_conf_data/pa_type/krbtgt/XXXXX.CN@XXXXX.CN@XXXXX.CN
>>>DEBUG <CCacheInputStream> key type: 0
>>>DEBUG <CCacheInputStream> auth time: Thu Jan 01 08:00:00 CST 1970
>>>DEBUG <CCacheInputStream> start time: null
>>>DEBUG <CCacheInputStream> end time: Thu Jan 01 08:00:00 CST 1970
>>>DEBUG <CCacheInputStream> renew_till time: null
>>> CCacheInputStream: readFlags()
21/07/13 22:49:22 DEBUG security.UserGroupInformation: hadoop login
21/07/13 22:49:22 DEBUG security.UserGroupInformation: hadoop login commit
21/07/13 22:49:22 DEBUG security.UserGroupInformation: using kerberos user:hdfs-hdtest@XXXXX.CN
21/07/13 22:49:22 DEBUG security.UserGroupInformation: Using user: "hdfs-hdtest@XXXXX.CN" with name hdfs-hdtest@XXXXX.CN
21/07/13 22:49:22 DEBUG security.UserGroupInformation: User entry: "hdfs-hdtest@XXXXX.CN"
21/07/13 22:49:22 DEBUG security.UserGroupInformation: UGI loginUser:hdfs-hdtest@XXXXX.CN (auth:KERBEROS)
User: hdfs-hdtest@XXXXX.CN
Group Ids:
21/07/13 22:49:22 DEBUG security.UserGroupInformation: Current time is 1626234562116
21/07/13 22:49:22 DEBUG security.UserGroupInformation: Next refresh is 1626263372000
21/07/13 22:49:22 DEBUG security.Groups: GroupCacheLoader - load.
Groups: hadoop hdfs
UGI: hdfs-hdtest@XXXXX.CN (auth:KERBEROS)
Auth method KERBEROS
Keytab false
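The line to look for in this output is the UGI loginUser line, which should now say auth:KERBEROS rather than auth:SIMPLE. As a small sketch for checking this without reading the whole dump, the filter below prints only the auth method. The function name ugi_auth_method is hypothetical, and it relies on the log line format shown above.

```shell
# Hypothetical helper: read UserGroupInformation debug output on stdin and
# print the auth method from the "UGI loginUser:... (auth:XXX)" line.
# Prints nothing if no such line is present.
ugi_auth_method() {
    sed -n 's/^.*UGI loginUser:.*(auth:\([A-Z_]*\)).*$/\1/p' | tail -n 1
}
```

It would be used as: hadoop org.apache.hadoop.security.UserGroupInformation 2>&1 | ugi_auth_method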
6. Reference links
https://stackoverflow.com/questions/23867628/kerberos-found-unsupported-keytype-1/23883508
https://stackoverflow.com/questions/48411107/java-8-update-161-breaks-httpclient-kerberos-authentication
https://www.opencore.com/blog/2016/5/user-name-handling-in-hadoop/