Using keepalived to health-check Nginx (keepalived + Nginx)

Backend 2024-05-05 23:06:19

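As the entry point exposed to the outside world, Nginx must be highly available in order to keep serving requests. With a single Nginx instance, any outage makes every service routed through Nginx unreachable, so Nginx needs HA (high availability).

How does keepalived + LVS + Nginx keep Nginx highly available?

keepalived is a lightweight high-availability solution for clusters. I won't introduce it in detail, since plenty of material is available through Baidu. This article focuses on how it keeps Nginx highly available.

A single machine cannot provide high availability, so a master-backup pair or a cluster is required. Nginx itself has no such feature, and keepalived is one way to fill the gap. With keepalived you can build a master-backup architecture: when the master fails, a backup is elected as the new master and takes over the service. Combined with keepalived's health-check mechanism, this keeps Nginx highly available.

Based on my understanding I drew the architecture diagram below; the analysis that follows refers to it.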

[Figure 1: keepalived + LVS + Nginx architecture, normal request flow]

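First, the external request: the client accesses the externally exposed virtual IP configured in keepalived's VRRP section and reaches server1, where keepalived-service-master runs. At this point keepalived-service-backup is on standby and serves no external traffic.
Through the routing configuration in keepalived-service-master, keepalived forwards the request to the service that actually handles it. In the HA Nginx architecture that service is Nginx, and it must run on the same server as keepalived (the reason is analyzed later). Each keepalived service effectively "monitors" one Nginx service. In the figure, all requests from keepalived-service-master go to Nginx service1 (several Nginx services could be configured here) and never to Nginx service2.
After Nginx service1 receives the request, it distributes it according to its load-balancing configuration.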

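That is the normal path of a request through keepalived. During this time keepalived and Nginx on server2 handle no requests and act only as backups. Next, let's look at how availability is preserved when Nginx service1 becomes unavailable.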

[Figure 2: failover to server2 when Nginx service1 is unavailable]

As the figure shows, if Nginx service1 goes down, the backup Nginx service must be brought in, and this is where keepalived comes into play.

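In keepalived, configure a script that runs periodically and checks whether the Nginx service is available. In the figure, if Nginx service1 is unavailable, all requests must move to Nginx service2. When the script detects that Nginx service1 is down, it automatically shuts down the keepalived service master on server1.
P.S. The shell script has to check the Nginx service and shut down the keepalived service on the same machine, which is why the two must run on the same server.
Once the keepalived service master on server1 is shut down, the keepalived service backup on server2 is automatically elected as the new master, and every request to the virtual IP is forwarded to the keepalived service on server2. Likewise, the keepalived service on server2 only forwards requests to Nginx service2 on the same machine. That completes the failover and keeps the Nginx service available.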
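
The election rule described above can be pictured with a toy model. This is only a sketch, not keepalived code: elect_master is a hypothetical helper that reads lines of the form "name priority state" and prints the live node with the highest priority. The priorities 100 and 50 match the configuration used later in this article; real VRRP also has timers and tie-breakers that are ignored here.

```shell
#!/bin/bash
# Toy model of the master election: among the nodes that are still "up",
# the one with the highest priority becomes master.
# Input (stdin): one node per line as "<name> <priority> <up|down>".
elect_master() {
    awk '$3 == "up" && ($2 + 0) > best { best = $2 + 0; name = $1 }
         END { if (name != "") print name }'
}

# Normal operation: both keepalived services are running.
printf 'server1 100 up\nserver2 50 up\n' | elect_master
# After the check script has stopped keepalived on server1.
printf 'server1 100 down\nserver2 50 up\n' | elect_master
```

The first call prints server1 and the second prints server2, which mirrors the two figures: the backup only takes over once the master's keepalived is no longer running.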

Installing and configuring keepalived

Installation steps

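Download the source tarball from the official website.
Extract the tarball.
Create an installation directory; I used /usr/local/lib/keepalived.
Run apt-get install libssl-dev to install the OpenSSL dependency on Ubuntu; other systems are similar.
Before installing, enter the extracted directory and run the configure step: ./configure --prefix=/usr/local/lib/keepalived --sysconfdir=/etc
With the installation path configured, build and install with make && make install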

After the steps above keepalived is installed. For convenience, we can also add the keepalived commands to the system.

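Create a symbolic link for the keepalived binary: ln -s /usr/local/lib/keepalived/sbin/keepalived /sbin/keepalived. Use the absolute path as the link target; a relative target such as sbin/keepalived would be resolved relative to /sbin and leave a dangling link.
Copy the init.d script from the extracted source directory into the system: cp /usr/local/download/keepalived-2.0.11/keepalived/etc/init.d/keepalived /etc/init.d.
On Ubuntu, list the registered system services with sysv-rc-conf --list and check whether keepalived appears.
Run sysv-rc-conf keepalived on to enable the keepalived service.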

After this you can manage it with commands such as service keepalived [start|stop|restart].
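
The installation steps can be collected into one script. The following is a sketch under a few assumptions: the tarball has already been downloaded and extracted to SRC_DIR (the version 2.0.11 comes from the path used in the cp command above), the script runs as root on Ubuntu, and run and DRY_RUN are hypothetical helpers added here so the commands can be previewed without touching the system.

```shell
#!/bin/bash
# Sketch of the installation steps as one script (Ubuntu, run as root).
# Assumption: the source tarball was already extracted to SRC_DIR.
PREFIX=/usr/local/lib/keepalived
SRC_DIR=/usr/local/download/keepalived-2.0.11

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Each step runs only if the previous one succeeded.
install_keepalived() {
    run apt-get install -y libssl-dev &&
    run mkdir -p "$PREFIX" &&
    run cd "$SRC_DIR" &&
    run ./configure --prefix="$PREFIX" --sysconfdir=/etc &&
    run make &&
    run make install &&
    run ln -s "$PREFIX/sbin/keepalived" /sbin/keepalived &&
    run cp "$SRC_DIR/keepalived/etc/init.d/keepalived" /etc/init.d/ &&
    run sysv-rc-conf keepalived on
}

# Preview the commands without changing the system.
DRY_RUN=1 install_keepalived
```

Calling install_keepalived without DRY_RUN=1 executes the same commands for real.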

Configuration steps

Basic configuration

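The configuration files live under /etc/keepalived/. We need to configure one VRRP virtual router with two nodes, a master and a backup. The complete configuration reference is documented on the official website. Run vim /etc/keepalived/keepalived.conf

After trimming down the original configuration file, the Master configuration looks like this.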

! Configuration File for keepalived

global_defs {
# String identifying the machine (doesn't have to be hostname).
# (default: local host name)
   router_id LVS_DEVEL_15
}
# VRRP (Virtual Router Redundancy Protocol) definition
vrrp_instance VI_1 {
# Initial state, MASTER|BACKUP
# As soon as the other machine(s) come up,
# an election will be held and the machine
# with the highest priority will become MASTER.
# So the entry here doesn't matter a whole lot.
# In practice the master is chosen by priority, so this setting matters little
    state MASTER
# interface for inside_network, bound by vrrp
    interface eth0
# arbitrary unique number from 0 to 255
# used to differentiate multiple instances of vrrpd
# running on the same NIC (and hence same socket).
    virtual_router_id 51
# The priority decides which node is master
    priority 100
# VRRP Advert interval in seconds (e.g. 0.92) (use default); interval of the checks between master and backup
    advert_int 1

    authentication {
        auth_type PASS
 # should be the same on all machines.
        auth_pass 1111
    }
 # Virtual IPs exposed externally; more than one can be configured
    virtual_ipaddress {
        192.168.0.16
    }
}

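# Map the virtual IP to real IPs
virtual_server 192.168.0.16 80 {
#health check
    delay_loop 6
# Load-balancing algorithm; several real_server entries can be configured here
    lb_algo rr
    lb_kind NAT
 # Session persistence timeout
    persistence_timeout 50
    protocol TCP
# Forward to the Nginx service that actually distributes requests; the master forwards to Nginx on 15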
    real_server 192.168.0.15 80 {
        weight 1
        TCP_CHECK {
             connect_timeout 3  # connection timeout
             delay_before_retry 3 # delay between retries
             connect_port 80   # port to check
        }
    }
}

The Backup configuration file is as follows.

! Configuration File for keepalived

global_defs {
# String identifying the machine (doesn't have to be hostname).
# (default: local host name)
   router_id LVS_DEVEL_15
}
# VRRP (Virtual Router Redundancy Protocol) definition
vrrp_instance VI_1 {
# Initial state, MASTER|BACKUP
# As soon as the other machine(s) come up,
# an election will be held and the machine
# with the highest priority will become MASTER.
# So the entry here doesn't matter a whole lot.
# In practice the master is chosen by priority, so this setting matters little
    state BACKUP
# interface for inside_network, bound by vrrp
    interface eth0
# arbitrary unique number from 0 to 255
# used to differentiate multiple instances of vrrpd
# running on the same NIC (and hence same socket).
    virtual_router_id 51
# The priority decides which node is master
    priority 50
# VRRP Advert interval in seconds (e.g. 0.92) (use default); interval of the checks between master and backup
    advert_int 1

    authentication {
        auth_type PASS
 # should be the same on all machines.
        auth_pass 1111
    }
 # Virtual IPs exposed externally; more than one can be configured
    virtual_ipaddress {
        192.168.0.16
    }
}

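# Map the virtual IP to real IPs
virtual_server 192.168.0.16 80 {
#health check
    delay_loop 6
# Load-balancing algorithm; several real_server entries can be configured here
    lb_algo rr
    lb_kind NAT
 # Session persistence timeout
    persistence_timeout 50
    protocol TCP
# Forward to the Nginx service that actually distributes requests; the backup forwards to Nginx on 13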
    real_server 192.168.0.13 80 {
        weight 1
        TCP_CHECK {
             connect_timeout 3  # connection timeout
             delay_before_retry 3 # delay between retries
             connect_port 80   # port to check
        }
    }
}

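The configuration above defines one VRRP instance, VI_1, with a Master (192.168.0.15) and a Backup (192.168.0.13); the externally exposed virtual IP is 192.168.0.16.

When a client accesses 192.168.0.16:80, the request is routed to the Master, which processes and forwards it. If the Master's keepalived service goes down, the backup's keepalived service is promoted to Master and continues serving. So far the configuration has no periodic check of Nginx availability and no automatic shutdown of the keepalived master on failure, so an Nginx failure does not yet trigger a failover; normal operation works, though.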

On 13 and 15, run service keepalived start to start the keepalived service.
Then run ip addr on each machine.
The IP information on 13 (backup) is as follows

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:97:9b:fd brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.13/24 brd 192.168.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe97:9bfd/64 scope link 
       valid_lft forever preferred_lft forever

The IP information on 15 (Master) is as follows

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:42:db:66 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.15/24 brd 192.168.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.0.16/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe42:db66/64 scope link 
       valid_lft forever preferred_lft forever

The virtual IP 192.168.0.16 is clearly bound to the Master's network interface.

Run service keepalived stop on 15 (master), then run ip addr on 13 (backup). The IP information is as follows

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:97:9b:fd brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.13/24 brd 192.168.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.0.16/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe97:9bfd/64 scope link 
       valid_lft forever preferred_lft forever

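After the backup becomes Master, the virtual IP 192.168.0.16 is bound to its network interface as well, which means the backup is now serving external traffic.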
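
Reading the ip addr output by eye works, but the same check is easy to script. The sketch below assumes the virtual IP 192.168.0.16 and the interface eth0 from this article; holds_vip is a hypothetical helper name, and a captured output line stands in for the real command so the example runs anywhere.

```shell
#!/bin/bash
# Report whether "ip addr" output (read from stdin) contains the virtual IP,
# i.e. whether this node is currently acting as master.
VIP=192.168.0.16   # virtual IP from the configuration above

holds_vip() {
    # -F: fixed string; the trailing "/" keeps 192.168.0.160 from matching.
    grep -qF "inet ${VIP}/"
}

# On a real node:  ip addr show eth0 | holds_vip
# Here a captured line stands in for the command output.
if echo "    inet 192.168.0.16/32 scope global eth0" | holds_vip; then
    echo "this node holds the virtual IP"
else
    echo "this node is in standby"
fi
```

Running the real pipeline on both machines before and after service keepalived stop should show the virtual IP moving from 15 to 13.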

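Configuration pitfalls

The vrrp_strict option needs to be commented out; otherwise you may be unable to ping the externally exposed virtual IP, 192.168.0.16.
If the virtual IP still cannot be pinged with vrrp_strict commented out, run iptables --list and check whether the virtual IP already appears in the iptables rules; if it does, that can also make the virtual IP unreachable.
Another possible cause is the firewall; the blunt fix is to turn the firewall off.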
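
The first pitfall can be checked mechanically. The sketch below is only an illustration: has_vrrp_strict is a hypothetical helper that reports whether a configuration file still contains an active vrrp_strict line, treating lines that start with # or ! as comments. It is demonstrated on a temporary file rather than on /etc/keepalived/keepalived.conf.

```shell
#!/bin/bash
# Succeeds when the file contains an active (uncommented) vrrp_strict line.
has_vrrp_strict() {
    grep -qE '^[[:space:]]*vrrp_strict([[:space:]]|$)' "$1"
}

# Demonstration on a throwaway file; on a real node pass
# /etc/keepalived/keepalived.conf instead.
conf=$(mktemp)
printf 'global_defs {\n   router_id LVS_DEVEL_15\n   vrrp_strict\n}\n' > "$conf"
if has_vrrp_strict "$conf"; then
    echo "vrrp_strict is active: comment it out and restart keepalived"
fi
rm -f "$conf"

# Manual follow-up checks on a real node (not run here):
#   iptables --list          # look for rules that mention 192.168.0.16
#   ping -c 3 192.168.0.16   # verify that the virtual IP answers
```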

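Periodic check script configuration: automatic failover on failure

The basic configuration covers forwarding and the activation of the backup, but it has no health check, so failover is not automatic: you still have to run service keepalived stop manually to shut down the keepalived that belongs to the failed Nginx service. Next we configure a periodic check of Nginx availability that decides whether to shut down the keepalived service automatically, which gives automatic failover.

Change the configuration of the initial Master host to the following. The added parts are marked with 【newadd】.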

! Configuration File for keepalived

global_defs {
# String identifying the machine (doesn't have to be hostname).
# (default: local host name)
   router_id LVS_DEVEL_15

# 【newadd】 Enable script security.
# Don't run scripts configured to be run as root if any part of the path
# is writable by a non-root user.
   enable_script_security
}
# 【newadd】 Define the script run by VRRP
vrrp_script nginx_check_for_keepalived {
   script "/usr/local/lib/keepalived/script/nginx-check-for-keepalived.sh"
# Run every 2 seconds
   interval 2
   user root

}


# VRRP (Virtual Router Redundancy Protocol) definition
vrrp_instance VI_1 {
# Initial state, MASTER|BACKUP
# As soon as the other machine(s) come up,
# an election will be held and the machine
# with the highest priority will become MASTER.
# So the entry here doesn't matter a whole lot.
# In practice the master is chosen by priority, so this setting matters little
    state MASTER
# interface for inside_network, bound by vrrp
    interface eth0
# arbitrary unique number from 0 to 255
# used to differentiate multiple instances of vrrpd
# running on the same NIC (and hence same socket).
    virtual_router_id 51
# The priority decides which node is master
    priority 100
# VRRP Advert interval in seconds (e.g. 0.92) (use default); interval of the checks between master and backup
    advert_int 1

    authentication {
        auth_type PASS
 # should be the same on all machines.
        auth_pass 1111
    }
 # Virtual IPs exposed externally; more than one can be configured
    virtual_ipaddress {
        192.168.0.16
    }
# 【newadd】 Attach the check script
    track_script {
        nginx_check_for_keepalived
    }

}

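# Map the virtual IP to real IPs
virtual_server 192.168.0.16 80 {
#health check
    delay_loop 6
# Load-balancing algorithm; several real_server entries can be configured here
    lb_algo rr
    lb_kind NAT
 # Session persistence timeout
    persistence_timeout 50
    protocol TCP
# Forward to the Nginx service that actually distributes requests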
    real_server 192.168.0.15 80 {
        weight 1
        TCP_CHECK {
             connect_timeout 3  # connection timeout
             delay_before_retry 3 # delay between retries
             connect_port 80   # port to check
        }
    }
}

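After changing the configuration, we need to create the script at the corresponding path: create the shell script nginx-check-for-keepalived.sh under /usr/local/lib/keepalived/script/. Roughly, the script checks whether nginx is alive; if it is not, it shuts down the keepalived service so that the backup's keepalived takes over. Below is a simple script I wrote; it may not be rigorous and is for reference only (I'm not that familiar with shell commands).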

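#! /bin/bash
# Count running nginx processes. pgrep -x matches the process name exactly,
# so neither grep nor this check script is counted by mistake.
c=$(pgrep -c -x nginx)
if [ "$c" -eq 0 ]; then
        echo "nginx service is dead"
        # Stop keepalived so that the backup takes over the virtual IP.
        service keepalived stop
else
        echo "nginx service is healthy"
fi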
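
A common variation on this kind of check is to try to bring nginx back once before giving up on the node. The sketch below is not part of the original setup: check_nginx and RESTART_WAIT are hypothetical names, it assumes nginx is also managed through the service command, and the wait should stay shorter than the interval configured in vrrp_script.

```shell
#!/bin/bash
# Variation on the check: try to restart nginx once before stopping keepalived.
# Assumptions: nginx and keepalived are both managed through "service",
# and RESTART_WAIT seconds (default 1) are enough for nginx to come up.
check_nginx() {
    if pgrep -x nginx > /dev/null; then
        echo "nginx service is healthy"
        return 0
    fi
    echo "nginx service is dead, trying to restart it"
    service nginx start
    sleep "${RESTART_WAIT:-1}"
    if pgrep -x nginx > /dev/null; then
        echo "nginx service recovered"
        return 0
    fi
    echo "nginx did not recover, stopping keepalived"
    # Hand the virtual IP over to the backup node.
    service keepalived stop
    return 1
}
```

To use it as the check script, call check_nginx on the last line of the file. Whether an automatic restart is desirable depends on the deployment; a brief outage on the master may be preferable to a failover, or the other way round.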

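After completing the configuration above, restart the keepalived service on the master host.
Run a command to stop the nginx service.
Check whether the keepalived service shuts down automatically as well.
Once the original master keepalived service has shut down automatically, requests are forwarded to the backup and the failover has succeeded.

Pitfalls encountered

After finishing the configuration, I found that the script never ran successfully. Running cat /var/log/syslog |grep keepalived to view the keepalived log showed the following errors.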

Jan  9 22:32:55 zhoujy-VirtualBox Keepalived_vrrp[4368]: Error exec-ing command '/usr/local/lib/keepalived/script/nginx-check-for-keepalived.sh', error 8: Exec format error
Jan  9 22:32:57 zhoujy-VirtualBox Keepalived_vrrp[4369]: Error exec-ing command '/usr/local/lib/keepalived/script/nginx-check-for-keepalived.sh', error 8: Exec format error
Jan  9 22:32:59 zhoujy-VirtualBox Keepalived_vrrp[4370]: Error exec-ing command

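It turned out that the first line of the script was wrong. Running the script by hand worked, but keepalived reported an error when executing it. The difference is that a shell falls back to interpreting a file without a valid shebang as a shell script, whereas keepalived executes the file directly and the kernel rejects it with error 8 (ENOEXEC, "Exec format error").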

#! /bin/bash

had been written as

# !/bin/bash

That made keepalived fail when executing the script; it worked once the line was corrected.
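
A mistake like this is easy to catch before keepalived ever runs the script. The sketch below uses a hypothetical helper, has_valid_shebang, that checks whether the first two bytes of a file are "#!"; it is demonstrated on two temporary files that reproduce the correct and the broken first line.

```shell
#!/bin/bash
# Succeeds when the first two bytes of the file are "#!", which the kernel
# requires in order to execute the file directly as a script.
has_valid_shebang() {
    [ "$(head -c 2 "$1")" = "#!" ]
}

good=$(mktemp); bad=$(mktemp)
printf '#! /bin/bash\necho ok\n' > "$good"
printf '# !/bin/bash\necho ok\n' > "$bad"
for f in "$good" "$bad"; do
    if has_valid_shebang "$f"; then
        echo "$(head -n 1 "$f") -> valid shebang"
    else
        echo "$(head -n 1 "$f") -> not a shebang, expect an Exec format error"
    fi
done
rm -f "$good" "$bad"
```

On a real node the same function can be pointed at /usr/local/lib/keepalived/script/nginx-check-for-keepalived.sh.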

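Summary

In short, using keepalived to keep Nginx highly available relies on a master-backup architecture in which keepalived switches to the backup automatically on failure. Typically one keepalived service paired with one Nginx service forms a node (the master), and the backup node is set up the same way. keepalived in effect monitors the Nginx service, and keepalived's own election mechanism then provides failover for Nginx indirectly.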
