
How to deploy Elasticsearch (ES) snapshots on HDFS

Database, 2024-05-23 15:44:31

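1. Problem

Elasticsearch replicas provide high reliability: they let you tolerate the occasional loss of a node without interrupting service. Replicas do not, however, protect against catastrophic failure. For that you need a real backup of the cluster, a complete copy to fall back on when something does go wrong.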

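2. Solution

Use snapshots to back up the data in the Elasticsearch (ES) cluster to HDFS, so that the data exists both in the ES cluster and on HDFS. If the ES cluster suffers an unrecoverable failure, the data can be restored quickly from HDFS.

Snapshots are subject to version compatibility constraints between ES clusters. Note: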
A snapshot of an index created in 5.x can be restored to 6.x.
A snapshot of an index created in 2.x can be restored to 5.x.
A snapshot of an index created in 1.x can be restored to 2.x.
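
The compatibility rule listed above can be encoded as a small pre-flight check before attempting a restore. This is a minimal sketch: the function name `can_restore` is illustrative, and it assumes (beyond what the list states) that a snapshot can also be restored to the same major version it was taken on.

```shell
#!/bin/sh
# can_restore SRC_MAJOR DST_MAJOR
# Prints "yes" if a snapshot taken on SRC_MAJOR can be restored on DST_MAJOR
# according to the list above (1.x -> 2.x, 2.x -> 5.x, 5.x -> 6.x), else "no".
# Assumption: restoring to the same major version is also allowed.
can_restore() {
  src=$1
  dst=$2
  if [ "$src" = "$dst" ]; then
    echo yes
    return 0
  fi
  case "$src->$dst" in
    "1->2"|"2->5"|"5->6") echo yes ;;
    *) echo no ;;
  esac
}
```

For example, `can_restore 2 5` prints `yes`, while `can_restore 1 5` prints `no` because a 1.x snapshot would first have to pass through a 2.x cluster.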

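3. Procedure

3.1. Installing the plugin

Plugin git repository: https://github.com/elastic/ela … -hdfs
Download: https://download.elastic.co/elasticsearch/elasticsearch-repository-hdfs/elasticsearch-repository-hdfs-2.2.0-hadoop2.zip
Online installation
From the ES directory, run: bin/elasticsearch-plugin install repository-hdfs (the script has this name in ES 5.x and later; in 2.x it is bin/plugin)
Offline installation
Place the downloaded zip package in a directory of your choice, for example /home/hadoop/elk/es-reporitory.zip, then run: bin/plugin install file:///home/hadoop/elk/es-reporitory.zip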

Output:
 -> Installing from file:/home/hadoop/elk/elasticsearch-repository-hdfs-2.2.0-hadoop2.zip… 
 Trying file:/home/hadoop/elk/elasticsearch-repository-hdfs-2.2.0-hadoop2.zip … 
 Downloading ………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………..DONE 
 Verifying file:/home/hadoop/elk/elasticsearch-repository-hdfs-2.2.0-hadoop2.zip checksums if available … 
 NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify) 
 ERROR: Plugin [repository-hdfs] is incompatible with Elasticsearch [2.3.3]. Was designed for version [2.2.0] 
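 Note: choose the plugin version that matches your ES version. We run ES 2.3.3 but the plugin is built for 2.2.0, so as a workaround, unpack the zip and edit plugin-descriptor.properties.
 After unpacking, plugin-descriptor.properties is edited to read:
 name=repository-hdfs
 description=Elasticsearch HDFS Repository
 version=2.3.3
 classname=org.elasticsearch.plugin.hadoop.hdfs.HdfsPlugin
 elasticsearch.version=2.3.3
 java.version=1.7
 jvm=true
 Repackage it as es-reporitory.zip and run the install again (preferably with root privileges). Run as the hadoop user, it prints the following: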
 bin/plugin install file:///home/hadoop/elk/es-reporitory.zip 
 -> Installing from file:/home/hadoop/elk/es-reporitory.zip… 
 Trying file:/home/hadoop/elk/es-reporitory.zip … 
 Downloading ……………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….DONE 
 Verifying file:/home/hadoop/elk/es-reporitory.zip checksums if available … 
 NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify) 
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 
 @ WARNING: plugin requires additional permissions @ 
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 
 * java.lang.RuntimePermission accessClassInPackage.sun.security.krb5 
 * java.lang.RuntimePermission accessDeclaredMembers 
 * java.lang.RuntimePermission getClassLoader 
 * java.lang.RuntimePermission loadLibrary.jaas_nt 
 * java.lang.RuntimePermission setContextClassLoader 
 * java.lang.RuntimePermission shutdownHooks 
 * java.lang.reflect.ReflectPermission suppressAccessChecks 
 * java.security.SecurityPermission createAccessControlContext 
 * java.util.PropertyPermission * read,write 
 * javax.security.auth.AuthPermission getSubject 
 * javax.security.auth.AuthPermission modifyPrincipals 
 * javax.security.auth.AuthPermission modifyPrivateCredentials 
 * javax.security.auth.AuthPermission modifyPublicCredentials 
 * javax.security.auth.PrivateCredentialPermission org.apache.hadoop.security.Credentials “” “” read 
 See http://docs.oracle.com/javase/ … .html 
 for descriptions of what these permissions allow and the associated risks.
 Continue with installation? [y/N]y
 Installed repository-hdfs into /home/hadoop/elk/elasticsearch-1/plugins/repository-hdfs
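
The manual descriptor edit described above can be scripted. This is a minimal sketch, assuming the unpacked descriptor sits in a local file and that `sed` accepts `-i` with a backup suffix; the function name `patch_descriptor` is hypothetical. Overriding the version only bypasses the installer's check and does not make the plugin binary-compatible, so a plugin built for the exact ES version is preferable when one exists.

```shell
#!/bin/sh
# patch_descriptor FILE VERSION
# Rewrites the "version" and "elasticsearch.version" keys of a
# plugin-descriptor.properties file in place, leaving FILE.bak as a backup.
patch_descriptor() {
  descriptor=$1
  target=$2
  sed -i.bak \
    -e "s/^version=.*/version=${target}/" \
    -e "s/^elasticsearch\.version=.*/elasticsearch.version=${target}/" \
    "$descriptor"
}

# Usage, after unpacking the zip into the current directory:
# patch_descriptor plugin-descriptor.properties 2.3.3
```

Other keys such as `java.version` are left untouched because both patterns are anchored to the start of the line.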

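3.2. Adding configuration to the ES cluster

Add the following settings to config/elasticsearch.yml on every node of the ES cluster, then perform a rolling restart:

security.manager.enabled: false
repositories.hdfs:
  uri: "hdfs://172.23.5.124:9000"
  path: "/es"

Note: this is a minimal configuration; other parameters exist and are left at their defaults here.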

3.3. Common snapshot commands

Command to create the repository:

curl -XPUT 'http://172.20.33.3:9200/_snapshot/es_backup' -d '{"type":"hdfs", "settings":{ "path":"/user/ysl", "uri":"hdfs://172.23.5.124:9000" } }'

The commonly used snapshot commands are recorded briefly below.
Create a repository to store snapshots

curl -XPUT 'http://172.20.33.3:9200/_snapshot/backup' -d '{"type":"hdfs", "settings":{ "path":"/user/ysl", "uri":"hdfs://172.23.5.124:9000" } }'
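
Writing the repository settings by hand is error-prone because of the nested quoting. The sketch below generates the request body from two arguments; `repo_body` is a hypothetical helper, and the commented `curl` line reuses the host, repository name, and HDFS settings from the command above.

```shell
#!/bin/sh
# repo_body URI PATH
# Prints the JSON body that registers an HDFS snapshot repository.
repo_body() {
  printf '{"type":"hdfs","settings":{"uri":"%s","path":"%s"}}\n' "$1" "$2"
}

# Usage (requires a reachable cluster):
# curl -XPUT 'http://172.20.33.3:9200/_snapshot/backup' \
#      -d "$(repo_body hdfs://172.23.5.124:9000 /user/ysl)"
```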

Snapshot specific indices

curl -XPUT 'http://172.20.33.3:9200/_snapshot/backup/snapshot_1' -d '{"indices":"logstash-gatewaylog"}'

Restore from a snapshot (as written, this restores every index in the snapshot)

curl -XPOST 'http://172.20.33.3:9200/_snapshot/backup/snapshot_1/_restore?pretty'
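
To restore only some indices rather than the whole snapshot, the restore request accepts a body with an `indices` field. This is a minimal sketch: `restore_cmd` is a hypothetical helper that only prints the command so it can be reviewed before running. An existing open index with the same name generally has to be closed or deleted before the restore will succeed.

```shell
#!/bin/sh
# restore_cmd HOST REPO SNAPSHOT INDICES
# Prints (does not execute) a curl command that restores only INDICES
# from the given snapshot.
restore_cmd() {
  printf "curl -XPOST 'http://%s/_snapshot/%s/%s/_restore?pretty' -d '{\"indices\":\"%s\"}'\n" \
    "$1" "$2" "$3" "$4"
}

# Usage:
# restore_cmd 172.20.33.3:9200 backup snapshot_1 logstash-gatewaylog
```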

View information about a specific snapshot

curl -XGET 'http://172.20.33.3:9200/_snapshot/backup/snapshot_20171223'

Delete a snapshot

curl -XDELETE 'http://172.20.33.3:9200/_snapshot/backup/snapshot_20171223'

Monitor a snapshot

curl -XGET 'http://172.20.33.3:9200/_snapshot/backup/snapshot_20171223/_status'

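The response includes the overall status of the snapshot and also drills down into per-index and per-shard statistics, giving a very detailed view of how the snapshot is progressing. A shard can be in one of the following states:

INITIALIZING: the shard is checking the cluster state to see whether it can be snapshotted. This is usually very fast.
STARTED: data is being transferred to the repository.
FINALIZING: data transfer is complete; the shard is now sending snapshot metadata.
DONE: the snapshot is complete.
FAILED: an error occurred while processing the snapshot, and this shard/index/snapshot cannot be completed. Check your logs for more information.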
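
For a quick overview of how many shards are in each of the states above, the status response can be piped through a small filter. This is a rough sketch rather than a robust JSON parser: it assumes each shard object in the response carries a field named `stage` holding one of the values listed, and `stage_counts` is a hypothetical helper name.

```shell
#!/bin/sh
# stage_counts
# Reads a snapshot _status JSON response on stdin and prints one
# "COUNT STAGE" line per distinct shard stage found.
# Assumption: shard-level state is reported in a field named "stage".
stage_counts() {
  grep -o '"stage" *: *"[A-Z_]*"' \
    | sed 's/.*"\([A-Z_]*\)"$/\1/' \
    | sort | uniq -c | awk '{print $1, $2}'
}

# Usage (requires a reachable cluster):
# curl -s -XGET 'http://172.20.33.3:9200/_snapshot/backup/snapshot_20171223/_status' | stage_counts
```

A text filter keeps the example dependency-free; where a JSON tool is available, parsing the response properly is the safer choice.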
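Monitor a snapshot restore

curl -XGET 'http://172.20.33.3:9200/logstash-gatewaylog/_recovery'

To get a complete list of all snapshots in a repository, use the _all placeholder in place of a specific snapshot name

curl -XGET 'http://172.20.33.3:9200/_snapshot/backup/_all'

Cancel a snapshot (the same DELETE request as above; issued while the snapshot is still running, it aborts it)

curl -XDELETE 'http://172.20.33.3:9200/_snapshot/backup/snapshot_20171223'

Backing up the cluster: https://www.elastic.co/guide/c … .html
Restoring from a snapshot: https://www.elastic.co/guide/c … .html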

3.4. Scripts

/home/hadoop/elk/script/snapshot_all_hdfs.sh
#!/bin/bash
# Timestamp used as the suffix of the snapshot name
current_time=$(date +%Y%m%d%H%M%S)
# Snapshot URL prefix (truncated in the source)
command_prefix="http://172.20.33.3:9200/_snaps … "
command=$command_prefix$current_time
curl -XPUT "$command" -d '{"indices":"index*,logstash*,nginx*,magicianlog*,invokelog*,outside*"}'
/home/hadoop/elk/script/snapshot_gatewaylog_hdfs.sh

3.5. crontab

0 0 */1 * * /bin/bash /home/hadoop/elk/script/snapshot_all_hdfs.sh>>/home/hadoop/elk/task_log/snapshot_all_day.log 2>&1

Note: this takes one snapshot per day; adjust the schedule to whatever frequency you need.
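
The script in section 3.4 can be written in a slightly more defensive form so that a failed request shows up as a non-zero exit status in the cron log. This is a minimal sketch: the repository name `backup` is taken from section 3.3, while the `all_` name prefix, the `ES_URL` and `REPO` variables, and the helper names are assumptions for illustration. On ES 6.0 and later the request would also need a `Content-Type: application/json` header.

```shell
#!/bin/sh
# Hypothetical defaults; override them through the environment.
ES_URL=${ES_URL:-http://172.20.33.3:9200}
REPO=${REPO:-backup}

# snapshot_url PREFIX TIMESTAMP
# Prints the full URL of the snapshot named PREFIX followed by TIMESTAMP.
snapshot_url() {
  echo "${ES_URL}/_snapshot/${REPO}/$1$2"
}

# take_snapshot PREFIX INDICES
# Creates a timestamped snapshot of INDICES; returns non-zero if curl
# reports a connection or HTTP error.
take_snapshot() {
  url=$(snapshot_url "$1" "$(date +%Y%m%d%H%M%S)")
  curl -sS --fail -XPUT "$url" -d "{\"indices\":\"$2\"}"
}

# Usage (requires a reachable cluster):
# take_snapshot all_ 'index*,logstash*,nginx*' || echo "snapshot failed" >&2
```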

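4. Backup and restore time

Our test environment:
ES cluster: four nodes with 10 GB of memory each, ES version 2.3.3, five primary shards per index, one replica.

4.1. Backing up data

The first snapshot (gatewaylog_20171226) is a full snapshot; the second (gatewaylog_20171228) is incremental.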

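
Incremental here means that a later snapshot copies only the files that are not already present in the repository. The toy example below illustrates that idea with two plain file lists. It is not how Elasticsearch is implemented internally, and the helper name `files_to_copy` and the list files are made up for the illustration.

```shell
#!/bin/sh
# files_to_copy REPO_LIST SHARD_LIST
# Prints the entries of SHARD_LIST that do not appear in REPO_LIST,
# i.e. what an incremental copy would still have to transfer.
files_to_copy() {
  sort "$1" > "$1.sorted"
  sort "$2" > "$2.sorted"
  # comm -13 keeps only the lines unique to the second file
  comm -13 "$1.sorted" "$2.sorted"
  rm -f "$1.sorted" "$2.sorted"
}

# Usage:
# files_to_copy already_in_repo.txt current_shard_files.txt
```

With an empty repository list every file is printed, which corresponds to the full first snapshot; on later runs only the new entries remain.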