0. Environment Information

| Software | Version |
|---|---|
| OS | openEuler 22.03 (LTS-SP1) |
| BiSheng JDK | OpenJDK 64-Bit Server VM BiSheng (build 1.8.0_342-b11) |
| Apache Hadoop | 3.2.0 |
1. Kerberos Principles and Usage
1.1 How Kerberos Works
Kerberos is a ticket-based, centralized network authentication protocol for client/server (C/S) models, originally developed by the Massachusetts Institute of Technology (MIT).
- The Key Distribution Center (KDC) is the core component of Kerberos; it contains the AS (Authentication Server) and the TGS (Ticket Granting Server).
- The AS (Authentication Server) authenticates user credentials and issues the client a TGT (Ticket Granting Ticket).
- The TGS (Ticket Granting Server) issues the client an ST (Service Ticket) and a session key for the target service.
The sequence diagram of Kerberos authentication is shown below.
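In practice the two exchanges can be observed from any client in the realm once the KDC described in section 1.2 is running. A minimal sketch; it assumes the krbuser1 and host/server1 principals that are created later in this guide:
#step 1 (AS exchange): obtain a TGT for krbuser1 -- prompts for the password
$kinit krbuser1
#step 2 (TGS exchange): use the TGT to request a service ticket for host/server1
$kvno host/server1
#the cache now holds the TGT (krbtgt/HADOOP.COM@HADOOP.COM) plus the service ticket
$klist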
1.2 Installing and Using Kerberos
1.2.1 Installing Kerberos
Install and configure krb5 on server1
# Step 1: install krb5, krb5-libs, krb5-server and krb5-client on server1
#option 1: install from a configured yum repository
$yum install -y krb5 krb5-libs krb5-server krb5-client
#option 2: download the Kerberos rpm packages matching your system version from https://repo.openeuler.org/openEuler-22.03-LTS/OS/aarch64/Packages/ and install them
$rpm -iv krb5-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-libs-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-devel-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-server-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-client-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-help-1.19.2-6.oe2203sp1.noarch.rpm
# Step 2: edit krb5.conf so that it contains the following
$vi /etc/krb5.conf
# Configuration snippets may be placed in this directory as well
includedir /etc/krb5.conf.d/
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
dns_lookup_realm = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
rdns = false
default_realm = HADOOP.COM
#default_ccache_name = KEYRING:persistent:%{uid}
[realms]
HADOOP.COM = {
kdc = server1
admin_server = server1
}
[domain_realm]
.hadoop.com = HADOOP.COM
hadoop.com = HADOOP.COM
# Step 3: create the KDC database
$ll /var/kerberos/krb5kdc/
$kdb5_util create -s
#enter a password such as test123, then enter it again to confirm
$ll /var/kerberos/krb5kdc/
#the directory now also contains principal, principal.kadm5, principal.kadm5.lock, principal.ok, etc.
# Step 4: grant full privileges to all admin principals; change the file to the following
$vi /var/kerberos/krb5kdc/kadm5.acl
*/admin@HADOOP.COM *
# Step 5: comment out the last two lines that make KCM the default credential cache
$vi /etc/krb5.conf.d/kcm_default_ccache
#[libdefaults]
#    default_ccache_name = KCM:
# Step 6: create the administrator root/admin
$kadmin.local
#1. enter addprinc root/admin -- create the administrator root/admin, which lets clients log in through kadmin with the root/admin account and password
#2. enter a password for root/admin such as test123, then enter it again to confirm
#3. enter listprincs -- check that the new administrator exists
#4. enter exit -- quit
# Step 7: start the kadmin and krb5kdc services and enable them at boot
$systemctl start kadmin krb5kdc
$systemctl enable kadmin krb5kdc
$chkconfig --level 35 krb5kdc on
$chkconfig --level 35 kadmin on
# Step 8: test the administrator login
$kadmin
#1. enter the root/admin password
#2. enter listprincs
#3. enter exit to quit
# Step 9: create the host principal
$kadmin.local
#1. enter addprinc -randkey host/server1 -- add host/server1 with a random key
#2. enter addprinc krbuser1 -- add an ordinary account krbuser1
#3. enter the Kerberos password for krbuser1, e.g. 123, then enter it again to confirm
#4. enter exit to quit
Notes:
kadmin.local: must be run on the KDC server; it manages the database without a password.
kadmin: can be run on any host in the KDC realm, but requires the administrator password.
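Both tools also accept a single query via the -q option, which is convenient for scripting the steps above. A small sketch; the principal name krbuser2 and the password are only placeholders:
#non-interactive administration on the KDC host
$kadmin.local -q "addprinc -pw test123 krbuser2"
$kadmin.local -q "listprincs"
#non-interactive query from any host in the realm (prompts for root/admin's password)
$kadmin -p root/admin -q "listprincs"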
Install and configure the krb5 client on agent1~3
# Step 1: install the krb5 client packages on agent1~3
$rpm -iv krb5-libs-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-client-1.19.2-6.oe2203sp1.aarch64.rpm
$rpm -iv krb5-help-1.19.2-6.oe2203sp1.noarch.rpm
# Step 2: edit krb5.conf so that it contains the following
$vi /etc/krb5.conf
# Configuration snippets may be placed in this directory as well
includedir /etc/krb5.conf.d/
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
dns_lookup_realm = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
rdns = false
default_realm = HADOOP.COM
#default_ccache_name = KEYRING:persistent:%{uid}
[realms]
HADOOP.COM = {
kdc = server1
admin_server = server1
}
[domain_realm]
.hadoop.com = HADOOP.COM
hadoop.com = HADOOP.COM
# Step 3: comment out the last two lines that make KCM the default credential cache
$vi /etc/krb5.conf.d/kcm_default_ccache
#[libdefaults]
#    default_ccache_name = KCM:
# Step 4: create the host principal (run on server1, where kadmin.local is available)
$kadmin.local
#1. enter addprinc -randkey host/agent1 -- add host/agent1 with a random key
#2. enter exit to quit
1.2.2 Using Kerberos
Example 1: add a Kerberos principal for the namenode
$kadmin.local
$addprinc -randkey nn/server1@HADOOP.COM
#list all principals
$listprincs
#export the namenode keytab to the given directory
$ktadd -k /etc/security/keytab/nn.keytab nn/server1
#export the keytabs of root/admin and root to the given directory
$ktadd -k /etc/security/keytab/root.keytab root/admin
$ktadd -k /etc/security/keytab/root.keytab root
Example 2: authenticate as the root/admin Kerberos principal
# Step 1: log in as the chosen principal
#option 1: with a password
$kinit root
#option 2: with a keytab file
$kinit -kt /etc/security/keytab/root.keytab root/admin
# Step 2: show the login information
$klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: root@HADOOP.COM
Valid starting Expires Service principal
07/31/2023 09:12:05 08/01/2023 09:12:05 krbtgt/HADOOP.COM@HADOOP.COM
renew until 07/31/2023 09:12:05
07/31/2023 09:26:21 08/01/2023 09:12:05 host/agent1@
renew until 07/31/2023 09:12:05
# Step 3: log out
$kdestroy
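Since tickets expire after ticket_lifetime (24h in the krb5.conf above), scripted or long-running clients usually re-authenticate from a keytab instead of typing a password. A minimal sketch that re-logs-in only when no valid ticket is cached, assuming the root.keytab exported in Example 1:
#klist -s is silent and only sets the exit status: 0 means a valid ticket is cached
$klist -s || kinit -kt /etc/security/keytab/root.keytab root/admin
$klist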
2. Integrating HDFS with Kerberos
Notes:
1. The file and directory permissions of the HDFS namenode, secondarynamenode and datanode components all use root; no finer-grained permission split is done.
2. The namenode and secondarynamenode are deployed on server1, and the datanodes on agent1~agent3.
2.1 Create Kerberos principals and keytabs for HDFS
#create the principals and export the keytabs on server1
$mkdir -p /etc/security/keytab
$kadmin.local
$addprinc -randkey nn/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/nn.keytab nn/server1
$addprinc -randkey sn/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/sn.keytab sn/server1
$addprinc -randkey dn/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/dn.keytab dn/server1
#copy the keytab files to agent1~agent3
$scp /etc/security/keytab/* agent1:/etc/security/keytab
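The scp command above only shows agent1, and the target directory may not exist on the agents yet. A small sketch that prepares and copies the keytabs to all three agents, assuming password-less ssh from server1:
#run on server1
for h in agent1 agent2 agent3; do
  ssh $h "mkdir -p /etc/security/keytab"
  scp /etc/security/keytab/* $h:/etc/security/keytab
done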
2.2 Configure core-site.xml and hdfs-site.xml
Determine the Hadoop installation directory:
$env |grep HADOOP_HOME
HADOOP_HOME=/usr/local/hadoop
core-site.xml
Add the following configuration:
$cd /usr/local/hadoop/etc/hadoop
$vi core-site.xml
<!-- add the following properties -->
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.token.service.use_ip</name>
<value>true</value>
</property>
<property>
<name>hadoop.rpc.protection</name>
<value>authentication</value>
</property>
<property>
<name>hadoop.security.auth_to_local</name>
<value>
RULE:[2:$1@$0](nn@HADOOP.COM)s/.*/hdfs/
RULE:[2:$1@$0](sn@HADOOP.COM)s/.*/hdfs/
RULE:[2:$1@$0](dn@HADOOP.COM)s/.*/hdfs/
RULE:[2:$1@$0](nm@HADOOP.COM)s/.*/yarn/
RULE:[2:$1@$0](rm@HADOOP.COM)s/.*/yarn/
RULE:[2:$1@$0](tl@HADOOP.COM)s/.*/yarn/
RULE:[2:$1@$0](jh@HADOOP.COM)s/.*/mapred/
RULE:[2:$1@$0](HTTP@HADOOP.COM)s/.*/hdfs/
DEFAULT
</value>
</property>
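Whether these auth_to_local rules map a principal to the intended local user can be checked without starting any daemon, using Hadoop's HadoopKerberosName helper once the new core-site.xml is in place. A quick sketch:
#should map to the local user hdfs via the nn rule
$hadoop org.apache.hadoop.security.HadoopKerberosName nn/server1@HADOOP.COM
#should fall through to DEFAULT and map to root
$hadoop org.apache.hadoop.security.HadoopKerberosName root@HADOOP.COM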
hdfs-site.xml
Add the following configuration:
$cd /usr/local/hadoop/etc/hadoop
$vi hdfs-site.xml
<!-- add the following properties -->
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>nn/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.namenode.keytab.file</name>
<value>/etc/security/keytab/nn.keytab</value>
</property>
<property>
<name>dfs.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.secondary.namenode.kerberos.principal</name>
<value>sn/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.secondary.namenode.keytab.file</name>
<value>/etc/security/keytab/sn.keytab</value>
</property>
<property>
<name>dfs.secondary.namenode.kerberos.internal.spnego.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.journalnode.kerberos.principal</name>
<value>jn/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.journalnode.keytab.file</name>
<value>/etc/security/keytab/jn.keytab</value>
</property>
<property>
<name>dfs.journalnode.kerberos.internal.spnego.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>dn/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/etc/security/keytab/dn.keytab</value>
</property>
<property>
<name>dfs.datanode.data.dir.perm</name>
<value>700</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.keytab</name>
<value>/etc/security/keytab/spnego.keytab</value>
</property>
<property>
<name>dfs.permissions.superusergroup</name>
<value>hdfs</value>
<description>The name of the group of super-users.</description>
</property>
<property>
<name>dfs.http.policy</name>
<value>HTTP_ONLY</value>
</property>
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>
<property>
<name>dfs.data.transfer.protection</name>
<value>authentication</value>
</property>
Sync core-site.xml and hdfs-site.xml to agent1~3:
$scp core-site.xml agent1:/usr/local/hadoop/etc/hadoop
$scp hdfs-site.xml agent1:/usr/local/hadoop/etc/hadoop
#... (commands for agent2 and agent3 omitted)
2.3 Start the services and verify
Start the namenode, secondarynamenode and datanodes:
#run on server1
$kinit -kt /etc/security/keytab/root.keytab root
$cd /usr/local/hadoop/sbin
$./start-dfs.sh
Verify HDFS:
#on server1
$ps -ef |grep namenode
$netstat -anp|grep 50070
#on agent1~3
$ps -ef |grep datanode
#open the web UI at http://server1:50070 and check Live Nodes under Overview -> Summary
#use the hdfs CLI
$hdfs dfs -ls /
$echo "func test!" > functest.txt
$hdfs dfs -put functest.txt /
$hdfs dfs -cat /functest.txt
$hdfs dfs -rm -f /functest.txt
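With the HTTP policy set to HTTP_ONLY, the SPNEGO-protected web endpoint can also be exercised from the command line. Note that hdfs-site.xml above references HTTP/server1@HADOOP.COM and /etc/security/keytab/spnego.keytab, which section 2.1 does not show being created; a sketch that creates them in the same way as the other principals and then calls WebHDFS with curl (curl must be built with GSS-Negotiate support):
#create the SPNEGO principal and keytab if they do not exist yet (same pattern as section 2.1)
$kadmin.local -q "addprinc -randkey HTTP/server1@HADOOP.COM"
$kadmin.local -q "ktadd -k /etc/security/keytab/spnego.keytab HTTP/server1"
#call WebHDFS with the Kerberos ticket of the current user
$kinit -kt /etc/security/keytab/root.keytab root
$curl --negotiate -u : "http://server1:50070/webhdfs/v1/?op=LISTSTATUS"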
3. Integrating YARN with Kerberos
Notes:
1. The file and directory permissions of the YARN resourcemanager and nodemanager components all use root; no finer-grained permission split is done.
2. The resourcemanager is deployed on server1, and the nodemanagers on agent1~agent3.
3.1 Create Kerberos principals and keytabs for YARN
#create the principals and export the keytabs on server1
$kadmin.local
$addprinc -randkey rm/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/rm.keytab rm/server1
$addprinc -randkey nm/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/nm.keytab nm/server1
#copy the keytab files to agent1~agent3
$scp /etc/security/keytab/* agent1:/etc/security/keytab
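Before the keytabs are distributed it is worth confirming what each one actually contains; klist can read keytab files directly. A quick check:
#list the principals and key version numbers stored in the YARN keytabs
$klist -kt /etc/security/keytab/rm.keytab
$klist -kt /etc/security/keytab/nm.keytab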
3.2 Configure yarn-site.xml and mapred-site.xml
yarn-site.xml
Add the following configuration:
$cd /usr/local/hadoop/etc/hadoop
$vi yarn-site.xml
<!-- add the following properties -->
<property>
<name>yarn.resourcemanager.principal</name>
<value>rm/server1@HADOOP.COM</value>
</property>
<property>
<name>yarn.resourcemanager.keytab</name>
<value>/etc/security/keytab/rm.keytab</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.principal</name>
<value>nm/server1@HADOOP.COM</value>
</property>
<property>
<name>yarn.nodemanager.keytab</name>
<value>/etc/security/keytab/nm.keytab</value>
</property>
<property>
<name>yarn.timeline-service.principal</name>
<value>tl/server1@HADOOP.COM</value>
</property>
<property>
<name>yarn.timeline-service.keytab</name>
<value>/etc/security/keytab/tl.keytab</value>
</property>
<property>
<name>yarn.timeline-service.http-authentication.type</name>
<value>kerberos</value>
</property>
<property>
<name>yarn.timeline-service.http-authentication.kerberos.principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>yarn.timeline-service.http-authentication.kerberos.keytab</name>
<value>/etc/security/keytab/spnego.keytab</value>
</property>
<property>
<name>yarn.http.policy</name>
<value>HTTP_ONLY</value>
</property>
Note: it is recommended to set yarn.timeline-service.enabled to false.
mapred-site.xml
Add the following configuration:
$cd /usr/local/hadoop/etc/hadoop
$vi mapred-site.xml
<!-- add the following properties -->
<property>
<name>mapreduce.jobhistory.keytab</name>
<value>/etc/security/keytab/jh.keytab</value>
</property>
<property>
<name>mapreduce.jobhistory.principal</name>
<value>jh/server1@HADOOP.COM</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.spnego-keytab-file</name>
<value>/etc/security/keytab/spnego.keytab</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.spnego-principal</name>
<value>HTTP/server1@HADOOP.COM</value>
</property>
<property>
<name>mapreduce.jobhistory.http.policy</name>
<value>HTTP_ONLY</value>
</property>
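As with core-site.xml and hdfs-site.xml in section 2.2, the modified yarn-site.xml and mapred-site.xml must also reach the nodemanager hosts; the guide does not show this step explicitly. A sketch assuming the same directory layout on agent1~3:
#run on server1
cd /usr/local/hadoop/etc/hadoop
for h in agent1 agent2 agent3; do
  scp yarn-site.xml mapred-site.xml $h:/usr/local/hadoop/etc/hadoop
done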
3.3 Start the services and verify
Start the resourcemanager and nodemanagers:
#run on server1
$kinit -kt /etc/security/keytab/root.keytab root
$cd /usr/local/hadoop/sbin
$./start-yarn.sh
$ps -ef|grep resourcemanager
$netstat -anp|grep 8088
#open the YARN web UI at http://server1:8088/cluster
Verify YARN functionality:
$hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar pi 10 10
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/tez-0.10.0/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Number of Maps = 10
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
... (output omitted) ...
2023-07-31 14:50:28,215 INFO mapreduce.Job: map 0% reduce 0%
2023-07-31 14:50:33,257 INFO mapreduce.Job: map 100% reduce 0%
2023-07-31 14:50:39,277 INFO mapreduce.Job: map 100% reduce 100%
2023-07-31 14:50:39,282 INFO mapreduce.Job: Job job_1690451579267_0899 completed successfully
2023-07-31 14:50:39,379 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=226
        FILE: Number of bytes written=2543266
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2620
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=45
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
        HDFS: Number of bytes read erasure-coded=0
    Job Counters
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=10
        Total time spent by all maps in occupied slots (ms)=176478
        Total time spent by all reduces in occupied slots (ms)=17310
        Total time spent by all map tasks (ms)=29413
        Total time spent by all reduce tasks (ms)=2885
        Total vcore-milliseconds taken by all map tasks=29413
        Total vcore-milliseconds taken by all reduce tasks=2885
        Total megabyte-milliseconds taken by all map tasks=180713472
        Total megabyte-milliseconds taken by all reduce tasks=17725440
    Map-Reduce Framework
        Map input records=10
        Map output records=20
        Map output bytes=180
        Map output materialized bytes=280
        Input split bytes=1440
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=280
        Reduce input records=20
        Reduce output records=0
        Spilled Records=40
        Shuffled Maps =10
        Failed Shuffles=0
        Merged Map outputs=10
        GC time elapsed (ms)=754
        CPU time spent (ms)=5880
        Physical memory (bytes) snapshot=5023223808
        Virtual memory (bytes) snapshot=85769342976
        Total committed heap usage (bytes)=16638803968
        Peak Map Physical memory (bytes)=467083264
        Peak Map Virtual memory (bytes)=7938478080
        Peak Reduce Physical memory (bytes)=422326272
        Peak Reduce Virtual memory (bytes)=6416666624
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1180
    File Output Format Counters
        Bytes Written=97
Job Finished in 20.347 seconds
Estimated value of Pi is 3.20000000000000000000
4. Integrating ZooKeeper with Kerberos
Notes:
1. The file and directory permissions of the ZooKeeper components all use root; no finer-grained permission split is done.
2. The ZooKeeper servers are deployed on agent1~3.
4.1 Create Kerberos principals and keytabs for ZooKeeper
#create the principal and export the keytab on server1
$kadmin.local
$addprinc -randkey zookeeper/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/zookeeper.keytab zookeeper/server1
#copy the keytab files to agent1~agent3
$scp /etc/security/keytab/* agent1:/etc/security/keytab
4.2 Configure zoo.cfg and jaas.conf
#perform the following on agent1
$env |grep ZOOK
ZOOKEEPER_HOME=/usr/local/zookeeper
$cd /usr/local/zookeeper/conf
$vi zoo.cfg
#add the following settings
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
requireClientAuthScheme=sasl
jaasLoginRenew=3600000
#create a new file jaas.conf
$vi jaas.conf
Server {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
keyTab="/etc/security/keytab/zookeeper.keytab"
storeKey=true
useTicketCache=false
principal="zookeeper/server1@HADOOP.COM";
};
#copy zoo.cfg and jaas.conf to the other agent nodes
$scp zoo.cfg jaas.conf agent2:/usr/local/zookeeper/conf
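zoo.cfg only enables the SASL authentication provider; the ZooKeeper server JVM also has to be pointed at jaas.conf, which the steps above do not show. One common way (an assumption, not part of the original guide) is conf/java.env, which zkEnv.sh sources at startup:
#on each of agent1~3: point the server JVM at the JAAS configuration
$vi /usr/local/zookeeper/conf/java.env
export SERVER_JVMFLAGS="-Djava.security.auth.login.config=/usr/local/zookeeper/conf/jaas.conf $SERVER_JVMFLAGS"
#restart zookeeper afterwards
$/usr/local/zookeeper/bin/zkServer.sh restart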
4.3 Start the services and verify
#perform the following on agent1~3 in turn
$cd /usr/local/zookeeper/bin
$./zkServer.sh start
$ps -ef |grep zookeeper
$netstat -anp|grep 2181
#verify basic zookeeper functionality
$./zkServer.sh status
$./zkCli.sh
ls /
5. Integrating Hive with Kerberos
Notes:
1. The file and directory permissions of the Hive metastore, HiveServer2 and Hive CLI components all use root; no finer-grained permission split is done.
2. The metastore is deployed on server1.
3. This guide does not cover Kerberos integration or deployment of HiveServer2.
5.1 Create Kerberos principals and keytabs for Hive
#create the principal and export the keytab on server1
$kadmin.local
$addprinc -randkey hive/server1@HADOOP.COM
$ktadd -k /etc/security/keytab/hive.keytab hive/server1
#copy the keytab files to agent1~agent3
$scp /etc/security/keytab/* agent1:/etc/security/keytab
5.2 Configure hive-site.xml
Note: delete the Hadoop configuration files core-site.xml and hdfs-site.xml from /usr/local/hive/conf; Hive will automatically pick up the corresponding configuration from etc/hadoop under HADOOP_HOME.
#check HIVE_HOME
$env |grep HIVE_HOME
HIVE_HOME=/usr/local/hive
$cd /usr/local/hive/conf
$vi hive-site.xml
<!-- add the following properties -->
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/server1@HADOOP.COM</value>
</property>
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/etc/security/keytab/hive.keytab</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive@HADOOP.COM</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/security/keytab/hive.keytab</value>
</property>
<property>
<name>dfs.data.transfer.protection</name>
<value>authentication</value>
</property>
5.3 Start and verify the metastore and Hive CLI
Start the metastore:
$hive --service metastore -p 9083 &
$ps -ef |grep metastore
$netstat -anp|grep 9083
Verify the Hive CLI:
#authenticate as the Kerberos user root
$kinit -kt /etc/security/keytab/root.keytab root
$klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: root@HADOOP.COM
Valid starting       Expires              Service principal
07/31/2023 12:45:37  08/01/2023 12:45:37  krbtgt/HADOOP.COM@HADOOP.COM
    renew until 07/31/2023 12:45:37
#start the hive cli
$hive
use default;
DROP TABLE IF EXISTS table1;
CREATE TABLE table1 (
t1_a INT,
t1_b INT,
t1_c INT,
t1_d INT
);
INSERT INTO table1 (t1_a, t1_b, t1_c, t1_d)
VALUES
(1, 10, 100, 1000),
(2, 20, 200, 2000),
(3, 30, 300, 3000),
(4, 40, 400, 4000),
(5, 50, 500, 5000),
(6, 60, 600, 6000),
(7, 70, 700, 7000),
(8, 80, 800, 8000),
(9, 90, 900, 9000),
(10, 100, 1000, 10000);
SELECT * FROM table1;
6. Integrating Spark with Kerberos
Notes:
1. The file and directory permissions of the Spark components all use root; no finer-grained permission split is done.
2. The HistoryServer is deployed on server1.
6.1 Configure spark-defaults.conf
$env |grep SPARK
SPARK_HOME=/usr/local/spark
$cd /usr/local/spark/conf
$vi spark-defaults.conf
#add the following settings
spark.kerberos.principal root@HADOOP.COM
spark.kerberos.keytab /etc/security/keytab/root.keytab
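Instead of fixing the principal and keytab in spark-defaults.conf, the same two settings can be supplied per job on the command line; a sketch using the root keytab from the earlier sections:
$spark-sql \
--master yarn \
--conf spark.kerberos.principal=root@HADOOP.COM \
--conf spark.kerberos.keytab=/etc/security/keytab/root.keytab \
-e "show databases;"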
6.2 启动并验证 HistoryServer
#start the HistoryServer
$cd /usr/local/spark/sbin
$./start-history-server.sh
$ps -ef |grep history
$netstat -anp|grep 18080
#open the Spark history page at http://server1:18080/
6.3 Run and verify a native spark-sql job
#authenticate as the Kerberos user root
$kinit -kt /etc/security/keytab/root.keytab root
$klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: root@HADOOP.COM
Valid starting Expires Service principal
07/31/2023 12:45:37 08/01/2023 12:45:37 krbtgt/HADOOP.COM@HADOOP.COM
renew until 07/31/2023 12:45:37
#interactive verification with spark-sql
$spark-sql
use default;
DROP TABLE IF EXISTS table1;
CREATE TABLE table1 (
t1_a INT,
t1_b INT,
t1_c INT,
t1_d INT
);
INSERT INTO table1 (t1_a, t1_b, t1_c, t1_d)
VALUES
(1, 10, 100, 1000),
(2, 20, 200, 2000),
(3, 30, 300, 3000),
(4, 40, 400, 4000),
(5, 50, 500, 5000),
(6, 60, 600, 6000),
(7, 70, 700, 7000),
(8, 80, 800, 8000),
(9, 90, 900, 9000),
(10, 100, 1000, 10000);
SELECT * FROM table1;
#run the native spark-sql job Q_01
$spark-sql \
--deploy-mode client \
--driver-cores 2 \
--driver-memory 8g \
--num-executors 5 \
--executor-cores 2 \
--executor-memory 8g \
--master yarn \
--conf spark.task.cpus=1 \
--conf spark.sql.orc.impl=native \
--conf spark.sql.shuffle.partitions=600 \
--conf spark.sql.adaptive.enabled=true \
--conf spark.sql.autoBroadcastJoinThreshold=100M \
--conf spark.sql.broadcastTimeout=1000 \
--database tpcds_bin_partitioned_orc_5 \
--name spark_sql_01 \
-e "SELECT
dt.d_year,
item.i_brand_id brand_id,
item.i_brand brand,
SUM(ss_ext_sales_price) sum_agg
FROM date_dim dt, store_sales, item
WHERE dt.d_date_sk = store_sales.ss_sold_date_sk
AND store_sales.ss_item_sk = item.i_item_sk
AND item.i_manufact_id = 128
AND dt.d_moy = 11
GROUP BY dt.d_year, item.i_brand, item.i_brand_id
ORDER BY dt.d_year, sum_agg DESC, brand_id
LIMIT 100"
Common Problems
Problem 1: the HDFS CLI has no Kerberos credentials
Symptom: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
Solution: in the configuration file /etc/krb5.conf, delete or comment out default_ccache_name.
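A quick way to check whether this is the cause is to look at the credential cache type reported by klist and at any default_ccache_name setting still in effect; a small diagnostic sketch:
#the first line of klist shows the cache type; FILE: works with the Hadoop client, KCM:/KEYRING: usually does not
$klist
#find any default_ccache_name that is still in effect
$grep -rn default_ccache_name /etc/krb5.conf /etc/krb5.conf.d/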
Problem 2: Spark uses incorrect authentication information for the secure HDFS cluster
Symptom: not enough datanodes are available; the error log looks like: ERROR SparkContext: Error initializing SparkContext. org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/.sparkStaging/application_1690451579267_0910/__spark_libs__993965846103132811.zip could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
Solution: delete hdfs-site.xml from the conf directory under SPARK_HOME so that Spark directly uses the hdfs-site.xml under HADOOP_HOME, or fill in the HDFS Kerberos authentication settings in the hdfs-site.xml under the conf directory of SPARK_HOME.
Problem 3: the Hive CLI cannot obtain the authentication information of the secure HDFS cluster
Symptom: when executing a select statement, the Hive CLI fails to connect to the datanodes with an error log like: Failed with exception java.io.IOException: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-977099591-192.168.43.210-1684921499379:blk_1074594764_854564 file=/user/hive/warehouse/table2/part-00000-c2edd0d3-580d-4de0-826d-ee559c6a61f6-c000
Solution: add the following configuration to hive-site.xml in the conf directory under HIVE_HOME:
<property>
<name>dfs.data.transfer.protection</name>
<value>authentication</value>
</property>