Notes on a Prometheus Federation Experiment

I. Environment

1. Version information

Docker Engine Community 19.03.1
Prometheus 2.11.1
Node exporter 0.18.1
Nginx 1.16.1

2. Server information

192.168.112.131 server04
192.168.112.132 server05
192.168.112.133 server06

II. Key Points

As we know, an exporter exposes its metrics over HTTP, and a Prometheus instance collects metric samples by scraping the exporter's HTTP endpoint on a schedule. Federation works over HTTP in exactly the same way: every Prometheus instance exposes an HTTP federation endpoint, and a top-level Prometheus instance collects metric samples by periodically scraping the federation endpoints of the Prometheus instances directly below it.
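For intuition, a federation endpoint can be queried by hand just like any other scrape target. A minimal sketch (assuming a Prometheus instance is listening on localhost:9090; the required match[] parameter selects which series to export):

# Returns the selected series in the Prometheus exposition format,
# which is exactly what a federating instance ingests on each scrape.
curl -G 'http://localhost:9090/federate' \
  --data-urlencode 'match[]={job=~"server.*"}'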

So when you face a cross-region, cross-machine-room, or even cross-datacenter scenario, say monitoring Nanjing, Chengdu, and Chongqing from Beijing, and high availability is not a concern, how should the exporters and federation described above be applied?
Option 1: deploy one Prometheus instance in Beijing and one exporter in each of Nanjing, Chengdu, and Chongqing, expose the exporter ports to the public internet, and let the Beijing instance scrape their HTTP endpoints on a schedule via public IPs and ports.
Option 2: deploy one Prometheus instance in Beijing and one Prometheus instance in each of Nanjing, Chengdu, and Chongqing, expose their ports to the public internet, and let the Beijing instance periodically scrape the federation endpoints of the three remote instances via public IPs and ports.

The problem is that both approaches are far too insecure, for two reasons:
First, the HTTP endpoints require no authentication, so every request is accepted unconditionally.
Second, plain HTTP transmits everything in cleartext, with no encryption.

To use this over the public internet we need some basic hardening: Basic auth and TLS encryption. According to the official Prometheus documentation, Prometheus relies on an HTTP reverse proxy such as Nginx to provide these two features, and Nginx is what the documentation recommends. See the references at the end for the exact links; I won't repeat them here.

To keep the demo environment easy to build, I run Prometheus and Node Exporter in containers and Nginx from a binary; the same steps apply if you deploy Prometheus and Node Exporter from binaries instead, so I won't repeat them.

The walkthrough below only demonstrates how to use Prometheus federation across regions, machine rooms, or datacenters. Scraping exporters over the public internet works the same way; the key is how Nginx is used.

III. Walkthrough

1. Pull the Prometheus-related Docker images

docker pull prom/prometheus:v2.11.1
docker pull prom/node-exporter:v0.18.1

2. Pull the Nginx Docker image (for running Nginx as a container; skip this step if you start Nginx from a binary)

docker pull nginx:1.16.1

3. Install Nginx with yum on each of the three servers (for running Nginx from a binary; skip this step if you start Nginx as a container)

yum install -y epel-release
yum makecache fast
yum install -y nginx
yum install -y httpd-tools

4. Generate the certificates on server04, then distribute all certificate files to the same directory on server05 and server06

mkdir -p /opt/prometheus/pki/
cd /opt/prometheus/pki/
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=prometheus" -days 50000 -out ca.crt

cat <<EOF > ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = server04
DNS.2 = server05
DNS.3 = server06
DNS.4 = localhost
IP.1 = 192.168.112.131
IP.2 = 192.168.112.132
IP.3 = 192.168.112.133
IP.4 = 127.0.0.1
EOF

openssl genrsa -out server.key 2048
openssl req -new -key server.key -subj "/CN=prometheus-server" -config ssl.cnf -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 50000 -extensions v3_req -extfile ssl.cnf -out server.crt

openssl genrsa -out client.key 2048
openssl req -new -key client.key -subj "/CN=prometheus-client" -out client.csr
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt -days 50000

# The distribution step is omitted here
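# One possible way to distribute them (an illustrative sketch, not part of the
# original notes; it assumes root SSH access from server04 to the other hosts):
ssh root@192.168.112.132 "mkdir -p /opt/prometheus/"
ssh root@192.168.112.133 "mkdir -p /opt/prometheus/"
scp -r /opt/prometheus/pki root@192.168.112.132:/opt/prometheus/
scp -r /opt/prometheus/pki root@192.168.112.133:/opt/prometheus/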

5. Start an Nginx instance on each of the three servers (for running Nginx from a binary; skip this step if you start Nginx as a container)

# Run on server04
htpasswd -c /etc/nginx/.htpasswd admin

## Edit the main Nginx configuration file /etc/nginx/nginx.conf
http {
    ......
    # changed: include only the http_* configuration files
    include /etc/nginx/conf.d/http_*.conf;
    ......
}

# added: a top-level stream block (it must sit outside the http block)
stream {
    include /etc/nginx/conf.d/stream_*.conf;
}


cat <<EOF > /etc/nginx/conf.d/http_prometheus.conf
upstream prometheus {
server 192.168.112.131:9090;
}

server {
listen 19090 ssl;
server_name 192.168.112.131;
ssl_certificate /opt/prometheus/pki/server.crt;
ssl_certificate_key /opt/prometheus/pki/server.key;
access_log /var/log/nginx/prometheus-access.log main;
error_log /var/log/nginx/prometheus-error.log;
add_header Cache-Control no-cache;

location / {
auth_basic "prometheus";
auth_basic_user_file /etc/nginx/.htpasswd;
proxy_pass http://prometheus/;
proxy_http_version 1.1;
proxy_connect_timeout 30m;
proxy_send_timeout 30m;
proxy_read_timeout 30m;
proxy_set_header Upgrade \$http_upgrade;
proxy_set_header Connection \$http_connection;
proxy_buffering off;
}
}

EOF

systemctl enable nginx.service
systemctl start nginx.service
systemctl status nginx.service

# Run on server05
htpasswd -c /etc/nginx/.htpasswd admin

## Edit the main Nginx configuration file /etc/nginx/nginx.conf
http {
    ......
    # changed: include only the http_* configuration files
    include /etc/nginx/conf.d/http_*.conf;
    ......
}

# added: a top-level stream block (it must sit outside the http block)
stream {
    include /etc/nginx/conf.d/stream_*.conf;
}


cat <<EOF > /etc/nginx/conf.d/http_prometheus.conf
upstream prometheus {
server 192.168.112.132:9090;
}

server {
listen 19090 ssl;
server_name 192.168.112.132;
ssl_certificate /opt/prometheus/pki/server.crt;
ssl_certificate_key /opt/prometheus/pki/server.key;
access_log /var/log/nginx/prometheus-access.log main;
error_log /var/log/nginx/prometheus-error.log;
add_header Cache-Control no-cache;

location / {
auth_basic "prometheus";
auth_basic_user_file /etc/nginx/.htpasswd;
proxy_pass http://prometheus/;
proxy_http_version 1.1;
proxy_connect_timeout 30m;
proxy_send_timeout 30m;
proxy_read_timeout 30m;
proxy_set_header Upgrade \$http_upgrade;
proxy_set_header Connection \$http_connection;
proxy_buffering off;
}
}

EOF

systemctl enable nginx.service
systemctl start nginx.service
systemctl status nginx.service

# Run on server06
htpasswd -c /etc/nginx/.htpasswd admin

## Edit the main Nginx configuration file /etc/nginx/nginx.conf
http {
    ......
    # changed: include only the http_* configuration files
    include /etc/nginx/conf.d/http_*.conf;
    ......
}

# added: a top-level stream block (it must sit outside the http block)
stream {
    include /etc/nginx/conf.d/stream_*.conf;
}


cat <<EOF > /etc/nginx/conf.d/http_prometheus.conf
upstream prometheus {
server 192.168.112.133:9090;
}

server {
listen 19090 ssl;
server_name 192.168.112.133;
ssl_certificate /opt/prometheus/pki/server.crt;
ssl_certificate_key /opt/prometheus/pki/server.key;
access_log /var/log/nginx/prometheus-access.log main;
error_log /var/log/nginx/prometheus-error.log;
add_header Cache-Control no-cache;

location / {
auth_basic "prometheus";
auth_basic_user_file /etc/nginx/.htpasswd;
proxy_pass http://prometheus/;
proxy_http_version 1.1;
proxy_connect_timeout 30m;
proxy_send_timeout 30m;
proxy_read_timeout 30m;
proxy_set_header Upgrade \$http_upgrade;
proxy_set_header Connection \$http_connection;
proxy_buffering off;
}
}

EOF

systemctl enable nginx.service
systemctl start nginx.service
systemctl status nginx.service

6. Start an Nginx container on each of the three servers (for running Nginx as a container; skip this step if you start Nginx from a binary)

# Run on server04
mkdir -p nginx/etc/nginx/conf.d/
cp -r /opt/prometheus/pki nginx/etc/nginx/
cp -r /etc/nginx/.htpasswd nginx/etc/nginx/

cat <<EOF > nginx/etc/nginx/nginx.conf

user nginx;
worker_processes 1;

error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;


events {
worker_connections 1024;
}


http {
include /etc/nginx/mime.types;
default_type application/octet-stream;

log_format main '\$remote_addr - \$remote_user [\$time_local] "\$request" '
'\$status \$body_bytes_sent "\$http_referer" '
'"\$http_user_agent" "\$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

sendfile on;
#tcp_nopush on;

keepalive_timeout 65;

#gzip on;

include /etc/nginx/conf.d/http_*.conf;
}

stream {
include /etc/nginx/conf.d/stream_*.conf;
}

EOF

cat <<EOF > nginx/etc/nginx/conf.d/http_prom.conf
upstream prometheus {
server 192.168.112.131:9090;
}

server {
listen 19090 ssl;
server_name 192.168.112.131;
ssl_certificate /etc/nginx/pki/server.crt;
ssl_certificate_key /etc/nginx/pki/server.key;
add_header Cache-Control no-cache;

location / {
auth_basic "prometheus";
auth_basic_user_file /etc/nginx/.htpasswd;
proxy_pass http://prometheus/;
proxy_http_version 1.1;
proxy_connect_timeout 30m;
proxy_send_timeout 30m;
proxy_read_timeout 30m;
proxy_set_header Upgrade \$http_upgrade;
proxy_set_header Connection \$http_connection;
proxy_buffering off;
}
}

EOF

cd nginx/

docker run -d -p 19090:19090 -v `pwd`/etc/nginx/nginx.conf:/etc/nginx/nginx.conf -v `pwd`/etc/nginx/.htpasswd:/etc/nginx/.htpasswd -v `pwd`/etc/nginx/pki/:/etc/nginx/pki/ -v `pwd`/etc/nginx/conf.d/:/etc/nginx/conf.d/ -v /etc/localtime:/etc/localtime --name nginx nginx:1.16.1

# Run on server05
mkdir -p nginx/etc/nginx/conf.d/
cp -r /opt/prometheus/pki nginx/etc/nginx/
cp -r /etc/nginx/.htpasswd nginx/etc/nginx/

cat <<EOF > nginx/etc/nginx/nginx.conf

user nginx;
worker_processes 1;

error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;


events {
worker_connections 1024;
}


http {
include /etc/nginx/mime.types;
default_type application/octet-stream;

log_format main '\$remote_addr - \$remote_user [\$time_local] "\$request" '
'\$status \$body_bytes_sent "\$http_referer" '
'"\$http_user_agent" "\$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

sendfile on;
#tcp_nopush on;

keepalive_timeout 65;

#gzip on;

include /etc/nginx/conf.d/http_*.conf;
}

stream {
include /etc/nginx/conf.d/stream_*.conf;
}

EOF

cat <<EOF > nginx/etc/nginx/conf.d/http_prom.conf
upstream prometheus {
server 192.168.112.132:9090;
}

server {
listen 19090 ssl;
server_name 192.168.112.132;
ssl_certificate /etc/nginx/pki/server.crt;
ssl_certificate_key /etc/nginx/pki/server.key;
add_header Cache-Control no-cache;

location / {
auth_basic "prometheus";
auth_basic_user_file /etc/nginx/.htpasswd;
proxy_pass http://prometheus/;
proxy_http_version 1.1;
proxy_connect_timeout 30m;
proxy_send_timeout 30m;
proxy_read_timeout 30m;
proxy_set_header Upgrade \$http_upgrade;
proxy_set_header Connection \$http_connection;
proxy_buffering off;
}
}

EOF

cd nginx/

docker run -d -p 19090:19090 -v `pwd`/etc/nginx/nginx.conf:/etc/nginx/nginx.conf -v `pwd`/etc/nginx/.htpasswd:/etc/nginx/.htpasswd -v `pwd`/etc/nginx/pki/:/etc/nginx/pki/ -v `pwd`/etc/nginx/conf.d/:/etc/nginx/conf.d/ -v /etc/localtime:/etc/localtime --name nginx nginx:1.16.1

# Run on server06
mkdir -p nginx/etc/nginx/conf.d/
cp -r /opt/prometheus/pki nginx/etc/nginx/
cp -r /etc/nginx/.htpasswd nginx/etc/nginx/

cat <<EOF > nginx/etc/nginx/nginx.conf

user nginx;
worker_processes 1;

error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;


events {
worker_connections 1024;
}


http {
include /etc/nginx/mime.types;
default_type application/octet-stream;

log_format main '\$remote_addr - \$remote_user [\$time_local] "\$request" '
'\$status \$body_bytes_sent "\$http_referer" '
'"\$http_user_agent" "\$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

sendfile on;
#tcp_nopush on;

keepalive_timeout 65;

#gzip on;

include /etc/nginx/conf.d/http_*.conf;
}

stream {
include /etc/nginx/conf.d/stream_*.conf;
}

EOF

cat <<EOF > nginx/etc/nginx/conf.d/http_prom.conf
upstream prometheus {
server 192.168.112.133:9090;
}

server {
listen 19090 ssl;
server_name 192.168.112.133;
ssl_certificate /etc/nginx/pki/server.crt;
ssl_certificate_key /etc/nginx/pki/server.key;
add_header Cache-Control no-cache;

location / {
auth_basic "prometheus";
auth_basic_user_file /etc/nginx/.htpasswd;
proxy_pass http://prometheus/;
proxy_http_version 1.1;
proxy_connect_timeout 30m;
proxy_send_timeout 30m;
proxy_read_timeout 30m;
proxy_set_header Upgrade \$http_upgrade;
proxy_set_header Connection \$http_connection;
proxy_buffering off;
}
}

EOF

cd nginx/
docker run -d -p 19090:19090 -v `pwd`/etc/nginx/nginx.conf:/etc/nginx/nginx.conf -v `pwd`/etc/nginx/.htpasswd:/etc/nginx/.htpasswd -v `pwd`/etc/nginx/pki/:/etc/nginx/pki/ -v `pwd`/etc/nginx/conf.d/:/etc/nginx/conf.d/ -v /etc/localtime:/etc/localtime --name nginx nginx:1.16.1

7. Start a Prometheus instance on each of the three servers

# Run on server04
cat <<EOF > /opt/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'server04'
    scrape_interval: 15s
    basic_auth:
      username: admin
      password: 12345678
    tls_config:
      ca_file: /etc/prometheus/pki/ca.crt
      cert_file: /etc/prometheus/pki/client.crt
      key_file: /etc/prometheus/pki/client.key

    honor_labels: true
    metrics_path: '/federate'
    scheme: https
    params:
      'match[]':
        - '{job=~"server.*"}'

    static_configs:
      - targets:
          - '192.168.112.132:19090'
        labels:
          prometheus: 'server04'

  - job_name: 'server05'
    scrape_interval: 15s
    basic_auth:
      username: admin
      password: 12345678
    tls_config:
      ca_file: /etc/prometheus/pki/ca.crt
      cert_file: /etc/prometheus/pki/client.crt
      key_file: /etc/prometheus/pki/client.key

    honor_labels: true
    metrics_path: '/federate'
    scheme: https
    params:
      'match[]':
        - '{job=~"server.*"}'

    static_configs:
      - targets:
          - '192.168.112.133:19090'
        labels:
          prometheus: 'server05'

rule_files:
  - "/opt/prometheus/rules/prometheus.yaml"

EOF

cat <<EOF > /opt/prometheus/rules/prometheus.yaml
groups:
  - name: node.down
    rules:
      - alert: node:down
        expr: |
          up{instance="192.168.112.131",job="server04"} == 0
        for: 5s
        labels:
          severity: node
        annotations:
          summary: "节点当机"
          description: "节点{{ \$labels.instance }}当机了,请抓紧排查原因。"
EOF

docker run -d --name=prometheus -p 9090:9090 -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml -v /opt/prometheus/rules/prometheus.yaml:/opt/prometheus/rules/prometheus.yaml -v /opt/prometheus/pki/:/etc/prometheus/pki/ -v /etc/localtime:/etc/localtime prom/prometheus:v2.11.1 --config.file=/etc/prometheus/prometheus.yml --web.external-url="http://192.168.112.131:19090/" --web.route-prefix="/"

# Run on server05
cat <<EOF > /opt/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: server05
    static_configs:
      - targets: ['192.168.112.132:9100']
        labels:
          instance: 192.168.112.132
          environment: dev

rule_files:
  - "/opt/prometheus/rules/prometheus.yaml"

EOF

cat <<EOF > /opt/prometheus/rules/prometheus.yaml
groups:
  - name: node.down
    rules:
      - alert: node:down
        expr: |
          up{instance="192.168.112.132",job="server05"} == 0
        for: 5s
        labels:
          severity: node
        annotations:
          summary: "节点当机"
          description: "节点{{ \$labels.instance }}当机了,请抓紧排查原因。"
EOF

docker run -d --name=prometheus -p 9090:9090 -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml -v /opt/prometheus/rules/prometheus.yaml:/opt/prometheus/rules/prometheus.yaml -v /etc/localtime:/etc/localtime prom/prometheus:v2.11.1 --config.file=/etc/prometheus/prometheus.yml --web.external-url="http://192.168.112.132:19090/" --web.route-prefix="/"

# Run on server06
cat <<EOF > /opt/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: server06
    static_configs:
      - targets: ['192.168.112.133:9100']
        labels:
          instance: 192.168.112.133
          environment: dev

rule_files:
  - "/opt/prometheus/rules/prometheus.yaml"

EOF

cat <<EOF > /opt/prometheus/rules/prometheus.yaml
groups:
  - name: node.down
    rules:
      - alert: node:down
        expr: |
          up{instance="192.168.112.133",job="server06"} == 0
        for: 5s
        labels:
          severity: node
        annotations:
          summary: "节点当机"
          description: "节点{{ \$labels.instance }}当机了,请抓紧排查原因。"
EOF

docker run -d --name=prometheus -p 9090:9090 -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml -v /opt/prometheus/rules/prometheus.yaml:/opt/prometheus/rules/prometheus.yaml -v /etc/localtime:/etc/localtime prom/prometheus:v2.11.1 --config.file=/etc/prometheus/prometheus.yml --web.external-url="http://192.168.112.133:19090/" --web.route-prefix="/"

IV. Verification

Open https://192.168.112.131:19090/targets and enter the username and password; all targets should show as healthy, as in the screenshot below:
(Screenshot: the Prometheus Targets page)
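The proxied endpoint can also be checked from the command line. A minimal sketch, assuming the htpasswd password created earlier was 12345678 (the value used in the Prometheus scrape configuration above):

# Exercise TLS and Basic auth against the Nginx proxy in front of server05's
# Prometheus, validating the server certificate with the CA from step 4.
curl --cacert /opt/prometheus/pki/ca.crt \
     -u admin:12345678 \
     -G 'https://192.168.112.132:19090/federate' \
     --data-urlencode 'match[]={job=~"server.*"}'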

V. References

1. Official documentation

https://prometheus.io/docs/guides/basic-auth/
https://prometheus.io/docs/guides/tls-encryption/
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config

2. Other resources

http://nginx.org/en/docs/http/configuring_https_servers.html
https://www.cnblogs.com/shenlinken/p/9968274.html
https://www.cnblogs.com/qiyueqi/p/11551238.html
https://my.oschina.net/sskxyz/blog/1554093?utm_source=debugrun&utm_medium=referral

A Quick Tour of the Prometheus Ecosystem with Docker Containers

I. Environment

1. Version information

Docker Engine Community 19.03.1
Prometheus 2.11.1
Alertmanager 0.18.0
Pushgateway 0.9.0
Node exporter 0.18.1
Grafana 6.2.5

2. Server information

192.168.112.128

II. Walkthrough

1. Pull the required Docker images:

docker pull prom/prometheus:v2.11.1
docker pull prom/alertmanager:v0.18.0
docker pull prom/node-exporter:v0.18.1
docker pull prom/pushgateway:v0.9.0
docker pull grafana/grafana:6.2.5

2. Prepare the configuration files:

mkdir -p /opt/prometheus/rules/
cat <<EOD > /opt/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 5s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['192.168.112.128:9090']
        labels:
          instance: 192.168.112.128
          environment: prometheus

  - job_name: node
    static_configs:
      - targets: ['192.168.112.128:9100']
        labels:
          instance: 192.168.112.128
          environment: dev

  - job_name: pushgateway
    static_configs:
      - targets: ['192.168.112.128:9091']
        labels:
          instance: pushgateway

rule_files:
  - "/opt/prometheus/rules/prometheus.yaml"

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['192.168.112.128:9093']

EOD


cat <<EOD > /opt/prometheus/rules/prometheus.yaml
groups:
  - name: node.disk
    rules:
      - expr: |
          sum by (instance, job) (node_filesystem_size_bytes{fstype=~"ext[234]|btrfs|xfs|zfs", job="node", mountpoint=~"/rootfs.*"})
        record: disk:node_filesystem_size_bytes:total
      - expr: |
          sum by (instance, job) (node_filesystem_avail_bytes{fstype=~"ext[234]|btrfs|xfs|zfs", job="node", mountpoint=~"/rootfs.*"})
        record: disk:node_filesystem_avail_bytes:total

  - name: node.down
    rules:
      - alert: node:down
        expr: |
          up{instance="192.168.112.128",job="node"} == 0
        for: 5s
        labels:
          severity: node
        annotations:
          summary: "节点当机"
          description: "节点{{ \$labels.instance }}当机了,请抓紧排查原因。"

EOD


mkdir -p /opt/alertmanager/
cat <<EOD > /opt/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'

receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://192.168.112.128:8080/webhook'

EOD

3. Start the containers:

docker run -d --name=alertmanager -p 9093:9093 -v /opt/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml -v /etc/localtime:/etc/localtime  prom/alertmanager:v0.18.0

docker run -d --name=prometheus -p 9090:9090 -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml -v /opt/prometheus/rules/prometheus.yaml:/opt/prometheus/rules/prometheus.yaml -v /etc/localtime:/etc/localtime prom/prometheus:v2.11.1

docker run -d --name=node-exporter -p 9100:9100 -v "/proc:/host/proc:ro" -v "/sys:/host/sys:ro" -v "/:/rootfs:ro" -v /etc/localtime:/etc/localtime --net="host" prom/node-exporter:v0.18.1

docker run -d --name=pushgateway -p 9091:9091 -v /etc/localtime:/etc/localtime prom/pushgateway:v0.9.0

docker run -d --name=grafana -p 3000:3000 -v /opt/grafana-storage:/var/lib/grafana -v /etc/localtime:/etc/localtime grafana/grafana:6.2.5

4. Write a webhook service (to receive alert notifications)

See https://github.com/singhwang/alertmanager-webhook.git
It contains only the most basic functionality: it receives alert notifications and prints them to its log, which goes to the console by default.
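Before wiring it into Alertmanager, the service can be smoke-tested with a hand-crafted, Alertmanager-style payload. An illustrative sketch only; it assumes the webhook is already running on 192.168.112.128:8080 (the address configured in alertmanager.yml) and uses the standard Alertmanager webhook JSON fields:

curl -X POST http://192.168.112.128:8080/webhook \
  -H 'Content-Type: application/json' \
  -d '{
        "version": "4",
        "status": "firing",
        "receiver": "web.hook",
        "groupLabels": {"alertname": "node:down"},
        "commonLabels": {"alertname": "node:down", "severity": "node"},
        "commonAnnotations": {"summary": "test alert"},
        "externalURL": "http://localhost:9093",
        "alerts": [{
          "status": "firing",
          "labels": {"alertname": "node:down", "instance": "192.168.112.128"},
          "annotations": {"summary": "test alert"},
          "startsAt": "2019-09-13T16:31:08Z",
          "endsAt": "0001-01-01T00:00:00Z"
        }]
      }'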

5. Build and deploy the webhook service (to receive alert notifications)

git clone https://github.com/singhwang/alertmanager-webhook.git
cd alertmanager-webhook/
export GOPROXY=https://goproxy.io
GOOS=linux GOARCH=amd64 go build -v -a github.com/singhwang/alertmanager-webhook

./alertmanager-webhook

6. Simulate a node outage and inspect the alert notification received by the webhook service

I0913 16:31:18.036839   33432 webhook.go:19] notification.Version : 4
I0913 16:31:18.036866 33432 webhook.go:20] notification.GroupKey : {}:{alertname="node:down"}
I0913 16:31:18.036872 33432 webhook.go:21] notification.Status : firing
I0913 16:31:18.036877 33432 webhook.go:22] notification.Receiver : web\.hook
I0913 16:31:18.036881 33432 webhook.go:23] notification.GroupLabels : map[alertname:node:down]
I0913 16:31:18.036903 33432 webhook.go:24] notification.CommonLabels : map[instance:192.168.112.128 job:node severity:node alertname:node:down environment:dev]
I0913 16:31:18.036916 33432 webhook.go:25] notification.CommonAnnotations : map[summary:节点当机 description:节点192.168.112.128当机了,请抓紧排查原因。]
I0913 16:31:18.036924 33432 webhook.go:26] notification.ExternalURL : http://5b6e7c8cf1d4:9093
I0913 16:31:18.036934 33432 webhook.go:30] alert.Labels : map[job:node severity:node alertname:node:down environment:dev instance:192.168.112.128]
I0913 16:31:18.036943 33432 webhook.go:31] alert.Annotations : map[description:节点192.168.112.128当机了,请抓紧排查原因。 summary:节点当机]
I0913 16:31:18.036950 33432 webhook.go:32] alert.StartsAt : 2019-09-13 16:31:08.028362063 +0800 CST
I0913 16:31:18.036974 33432 webhook.go:33] alert.EndsAt : 0001-01-01 00:00:00 +0000 UTC
[GIN] 2019/09/13 - 16:31:18 | 200 | 1.165161ms | 172.17.0.2 | POST /webhook

Note: notification.Status : firing

7. Simulate node recovery and inspect the resolved notification received by the webhook service

I0913 16:32:48.047377   33432 webhook.go:19] notification.Version : 4
I0913 16:32:48.047403 33432 webhook.go:20] notification.GroupKey : {}:{alertname="node:down"}
I0913 16:32:48.047411 33432 webhook.go:21] notification.Status : resolved
I0913 16:32:48.047416 33432 webhook.go:22] notification.Receiver : web\.hook
I0913 16:32:48.047421 33432 webhook.go:23] notification.GroupLabels : map[alertname:node:down]
I0913 16:32:48.047457 33432 webhook.go:24] notification.CommonLabels : map[alertname:node:down environment:dev instance:192.168.112.128 job:node severity:node]
I0913 16:32:48.047470 33432 webhook.go:25] notification.CommonAnnotations : map[description:节点192.168.112.128当机了,请抓紧排查原因。 summary:节点当机]
I0913 16:32:48.047479 33432 webhook.go:26] notification.ExternalURL : http://5b6e7c8cf1d4:9093
I0913 16:32:48.047485 33432 webhook.go:30] alert.Labels : map[severity:node alertname:node:down environment:dev instance:192.168.112.128 job:node]
I0913 16:32:48.047494 33432 webhook.go:31] alert.Annotations : map[description:节点192.168.112.128当机了,请抓紧排查原因。 summary:节点当机]
I0913 16:32:48.047502 33432 webhook.go:32] alert.StartsAt : 2019-09-13 16:31:08.028362063 +0800 CST
I0913 16:32:48.047514 33432 webhook.go:33] alert.EndsAt : 2019-09-13 16:32:48.028362063 +0800 CST
[GIN] 2019/09/13 - 16:32:48 | 200 | 272.06µs | 172.17.0.2 | POST /webhook

Note: notification.Status : resolved

8. Pushgateway API examples

# Push a batch of metrics; an instance label is usually included to indicate the source
cat <<EOF | curl --data-binary @- http://192.168.112.128:9091/metrics/job/some_job/instance/some_instance
# TYPE some_metric counter
some_metric{label="val1"} 42
# TYPE another_metric gauge
# HELP another_metric Just an example.
another_metric 2398.283
EOF

# Delete all metrics for one instance within a group
curl -X DELETE http://192.168.112.128:9091/metrics/job/some_job/instance/some_instance

# Delete all metrics for a group
curl -X DELETE http://192.168.112.128:9091/metrics/job/some_job
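To confirm that a push has landed (before deleting it), the series can be read back from the Pushgateway itself and, once Prometheus has scraped the pushgateway job, through the Prometheus query API. A quick sketch, using the addresses listed in section 10:

curl -s http://192.168.112.128:9091/metrics | grep some_metric
curl -s -G 'http://192.168.112.128:9090/api/v1/query' --data-urlencode 'query=some_metric'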

9. Grafana dashboard example; import it to see the result

Note: replace 192.168.112.128 in the JSON below with the actual address of your machine. Building your own Grafana dashboards is out of scope here.
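If the JSON is first saved to a file, the substitution can be scripted. A small sketch, assuming the dashboard was saved as node-dashboard.json (a file name chosen here purely for illustration):

sed -i 's/192\.168\.112\.128/<actual-host-ip>/g' node-dashboard.json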

{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 0,
"id": 1,
"iteration": 1568364552383,
"links": [],
"panels": [
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"fill": 1,
"gridPos": {
"h": 6,
"w": 20,
"x": 0,
"y": 0
},
"id": 6,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "((sum by (job, instance) (node_cpu_seconds_total{job=\"node\", mode!=\"idle\", instance=\"192.168.112.128\"})) / (sum by (job, instance) (node_cpu_seconds_total{job=\"node\", instance=\"192.168.112.128\"}))) * 100",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ instance }}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "CPU使用率",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "percent",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": true,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "Prometheus",
"format": "percent",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 6,
"w": 4,
"x": 20,
"y": 0
},
"id": 8,
"interval": null,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"options": {},
"pluginVersion": "6.2.5",
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "((sum by (job, instance) (node_cpu_seconds_total{job=\"node\", mode!=\"idle\", instance=\"192.168.112.128\"})) / (sum by (job, instance) (node_cpu_seconds_total{job=\"node\", instance=\"192.168.112.128\"}))) * 100",
"format": "time_series",
"intervalFactor": 1,
"refId": "A"
}
],
"thresholds": "80,90",
"timeFrom": null,
"timeShift": null,
"title": "CPU使用率",
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "avg"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "Prometheus",
"fill": 1,
"gridPos": {
"h": 6,
"w": 20,
"x": 0,
"y": 6
},
"id": 2,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max by (job, instance) (((node_memory_MemTotal_bytes{job=\"node\", instance=\"192.168.112.128\"} - node_memory_MemFree_bytes{job=\"node\", instance=\"192.168.112.128\"} - node_memory_Buffers_bytes{job=\"node\", instance=\"192.168.112.128\"} - node_memory_Cached_bytes{job=\"node\", instance=\"192.168.112.128\"}) / node_memory_MemTotal_bytes{job=\"node\", instance=\"192.168.112.128\"}) * 100)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ instance }}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "内存使用率",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "percent",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": true,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "Prometheus",
"format": "percent",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 6,
"w": 4,
"x": 20,
"y": 6
},
"id": 4,
"interval": null,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"options": {},
"pluginVersion": "6.2.5",
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "max(((node_memory_MemTotal_bytes{job=\"node\", instance=\"192.168.112.128\"} - node_memory_MemFree_bytes{job=\"node\", instance=\"192.168.112.128\"} - node_memory_Buffers_bytes{job=\"node\", instance=\"192.168.112.128\"} - node_memory_Cached_bytes{job=\"node\", instance=\"192.168.112.128\"}) / node_memory_MemTotal_bytes{job=\"node\", instance=\"192.168.112.128\"}) * 100)",
"format": "time_series",
"intervalFactor": 1,
"refId": "A"
}
],
"thresholds": "80,90",
"timeFrom": null,
"timeShift": null,
"title": "内存使用率",
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "avg"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "Prometheus",
"fill": 1,
"gridPos": {
"h": 6,
"w": 20,
"x": 0,
"y": 12
},
"id": 10,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max by (instance, device) ((node_filesystem_size_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\"}\n - node_filesystem_avail_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\"})\n / node_filesystem_size_bytes{fstype=~\"ext[234]|btrfs|xfs|zfs\"})",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ device }}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "磁盘设备使用率",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "percent",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": true,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "Prometheus",
"format": "percent",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 6,
"w": 4,
"x": 20,
"y": 12
},
"id": 12,
"interval": null,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"options": {},
"pluginVersion": "6.2.5",
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "(sum by (instance, job) (disk:node_filesystem_size_bytes:total) - sum by (instance, job) (disk:node_filesystem_avail_bytes:total)) / sum by (instance, job) (disk:node_filesystem_size_bytes:total)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ device }}",
"refId": "A"
}
],
"thresholds": "80,90",
"timeFrom": null,
"timeShift": null,
"title": "磁盘设备总使用率",
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "avg"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "Prometheus",
"fill": 1,
"gridPos": {
"h": 6,
"w": 20,
"x": 0,
"y": 18
},
"id": 14,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max(node_filesystem_files{job=\"node\", instance=\"192.168.112.128\"} - node_filesystem_files_free{job=\"node\", instance=\"192.168.112.128\"})",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "inode used",
"refId": "A"
},
{
"expr": "max(node_filesystem_files_free{job=\"node\", instance=\"192.168.112.128\"})",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "inode free",
"refId": "B"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Inode使用量",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"cacheTimeout": null,
"colorBackground": false,
"colorValue": true,
"colors": [
"#299c46",
"rgba(237, 129, 40, 0.89)",
"#d44a3a"
],
"datasource": "Prometheus",
"format": "percent",
"gauge": {
"maxValue": 100,
"minValue": 0,
"show": true,
"thresholdLabels": false,
"thresholdMarkers": true
},
"gridPos": {
"h": 6,
"w": 4,
"x": 20,
"y": 18
},
"id": 16,
"interval": null,
"links": [],
"mappingType": 1,
"mappingTypes": [
{
"name": "value to text",
"value": 1
},
{
"name": "range to text",
"value": 2
}
],
"maxDataPoints": 100,
"nullPointMode": "connected",
"nullText": null,
"options": {},
"pluginVersion": "6.2.5",
"postfix": "",
"postfixFontSize": "50%",
"prefix": "",
"prefixFontSize": "50%",
"rangeMaps": [
{
"from": "null",
"text": "N/A",
"to": "null"
}
],
"sparkline": {
"fillColor": "rgba(31, 118, 189, 0.18)",
"full": false,
"lineColor": "rgb(31, 120, 193)",
"show": false
},
"tableColumn": "",
"targets": [
{
"expr": "max(((node_filesystem_files{job=\"node\", instance=\"192.168.112.128\"} - node_filesystem_files_free{job=\"node\", instance=\"192.168.112.128\"}) / node_filesystem_files{job=\"node\", instance=\"192.168.112.128\"}) * 100)",
"format": "time_series",
"intervalFactor": 1,
"refId": "A"
}
],
"thresholds": "80,90",
"timeFrom": null,
"timeShift": null,
"title": "Inode使用率",
"type": "singlestat",
"valueFontSize": "80%",
"valueMaps": [
{
"op": "=",
"text": "N/A",
"value": "null"
}
],
"valueName": "avg"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "Prometheus",
"fill": 1,
"gridPos": {
"h": 6,
"w": 12,
"x": 0,
"y": 24
},
"id": 20,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node_network_transmit_bytes_total{job=\"node\", instance=\"192.168.112.128\", device!~\"lo\"}",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ device }}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "网络发送量",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "Prometheus",
"fill": 1,
"gridPos": {
"h": 6,
"w": 12,
"x": 12,
"y": 24
},
"id": 18,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "node_network_receive_bytes_total{job=\"node\", instance=\"192.168.112.128\", device!~\"lo\"}",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ device }}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "网络接收量",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "Prometheus",
"fill": 1,
"gridPos": {
"h": 6,
"w": 24,
"x": 0,
"y": 30
},
"id": 22,
"legend": {
"avg": false,
"current": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": false,
"show": true,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum by (instance, job) (node_disk_read_bytes_total{job=\"node\", instance=\"192.168.112.128\"})",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "read",
"refId": "A"
},
{
"expr": "sum by (instance, job) (node_disk_written_bytes_total{job=\"node\", instance=\"192.168.112.128\"})",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "written",
"refId": "B"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "磁盘IO使用情况",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": "",
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"decimals": null,
"format": "ms",
"label": "",
"logBase": 10,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
}
],
"refresh": "10s",
"schemaVersion": 18,
"style": "dark",
"tags": [],
"templating": {
"list": [
{
"auto": false,
"auto_count": 30,
"auto_min": "10s",
"current": {
"text": "10s",
"value": "10s"
},
"hide": 2,
"label": "Interval",
"name": "Interval",
"options": [
{
"selected": true,
"text": "10s",
"value": "10s"
},
{
"selected": false,
"text": "20s",
"value": "20s"
},
{
"selected": false,
"text": "30s",
"value": "30s"
},
{
"selected": false,
"text": "60m",
"value": "60m"
}
],
"query": "10s,20s,30s,60m",
"refresh": 2,
"skipUrlSync": false,
"type": "interval"
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "browser",
"title": "Node监控信息",
"uid": "q0DESkvWk",
"version": 10
}

10. 常见服务的访问地址,实验过程中,请自行访问相应组件的控制台查看数据变化

1
2
3
4
Prometheus http://192.168.112.128:9090
Pushgateway http://192.168.112.128:9091
Alertmanager http://192.168.112.128:9093
Grafana http://192.168.112.128:3000
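A quick reachability check for all four consoles (an illustrative sketch):

for port in 9090 9091 9093 3000; do
  curl -s -o /dev/null -w "port ${port}: HTTP %{http_code}\n" "http://192.168.112.128:${port}/"
done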

III. References

https://yunlzheng.gitbook.io/prometheus-book/introduction
https://www.cnblogs.com/xiao987334176/p/9933963.html

Managing a GlusterFS Cluster with the gluster CLI

I. Configure and Verify the GlusterFS Cluster

# Add nodes to the GlusterFS cluster
[root@server07 ~]# gluster peer probe server08
peer probe: success.

[root@server07 ~]# gluster peer probe server09
peer probe: success.

# Check the cluster status
[root@server07 ~]# gluster peer status
Number of Peers: 2

Hostname: server08
Uuid: 8530c074-760f-4d03-a5a7-f1b3ccaa5cfd
State: Peer in Cluster (Connected)

Hostname: server09
Uuid: 41a4b6df-bcb3-4650-8a21-54afc1e27cbe
State: Peer in Cluster (Connected)

II. Create and Use Volumes on the GlusterFS Cluster

# List volumes
[root@server07 ~]# gluster volume info
No volumes present

# Create the backing directory for the volume (on every host in the cluster)
mkdir -p /opt/gluster/data

# Create the volume
[root@server07 ~]# gluster volume create k8s-volume transport tcp server07:/opt/gluster/data server08:/opt/gluster/data server09:/opt/gluster/data force
volume create: k8s-volume: success: please start the volume to access data

[root@server07 ~]# gluster volume info k8s-volume

Volume Name: k8s-volume
Type: Distribute
Volume ID: e4974285-1304-4ea7-b60f-ebe8375dba86
Status: Created
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: server07:/opt/gluster/data
Brick2: server08:/opt/gluster/data
Brick3: server09:/opt/gluster/data
Options Reconfigured:
transport.address-family: inet
nfs.disable: on

# Start the volume
[root@server07 ~]# gluster volume start k8s-volume
volume start: k8s-volume: success

# Show the volume's info
[root@server07 ~]# gluster volume info k8s-volume

Volume Name: k8s-volume
Type: Distribute
Volume ID: e4974285-1304-4ea7-b60f-ebe8375dba86
Status: Started
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: server07:/opt/gluster/data
Brick2: server08:/opt/gluster/data
Brick3: server09:/opt/gluster/data
Options Reconfigured:
transport.address-family: inet
nfs.disable: on

# Verify mounting the volume and writing data to it
[root@server07 ~]# mount -t glusterfs server07:k8s-volume /mnt
[root@server07 ~]# ls -la /mnt
drwxr-xr-x. 3 root root 4096 4月 9 23:36 .
dr-xr-xr-x. 17 root root 224 4月 9 21:59 ..
[root@server07 ~]# echo "hello glusterfs kubernetes." > /mnt/readme.md
[root@server07 ~]# ls -la /mnt
drwxr-xr-x. 3 root root 4096 4月 10 05:36 .
dr-xr-xr-x. 17 root root 224 4月 9 21:59 ..
-rw-r--r--. 1 root root 28 4月 10 05:36 readme.md

[root@server07 ~]# cat /mnt/readme.md
hello glusterfs kubernetes.

[root@server07 ~]# df -h
文件系统 容量 已用 可用 已用% 挂载点
/dev/mapper/cl-root 8.0G 1.1G 7.0G 14% /
devtmpfs 478M 0 478M 0% /dev
tmpfs 489M 0 489M 0% /dev/shm
tmpfs 489M 6.8M 482M 2% /run
tmpfs 489M 0 489M 0% /sys/fs/cgroup
/dev/sda1 1014M 139M 876M 14% /boot
tmpfs 98M 0 98M 0% /run/user/0
server07:k8s-volume 24G 3.2G 21G 14% /mnt
[root@server07 ~]# umount server07:k8s-volume
[root@server07 ~]# ls -la /mnt/
drwxr-xr-x. 2 root root 6 11月 5 2016 .
dr-xr-xr-x. 17 root root 224 4月 9 21:59 ..

# Show the volume's info
[root@server07 ~]# gluster volume info k8s-volume

Volume Name: k8s-volume
Type: Distribute
Volume ID: e4974285-1304-4ea7-b60f-ebe8375dba86
Status: Started
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: server07:/opt/gluster/data
Brick2: server08:/opt/gluster/data
Brick3: server09:/opt/gluster/data
Options Reconfigured:
transport.address-family: inet
nfs.disable: on

# Stop the volume
[root@server07 ~]# gluster volume stop k8s-volume
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: k8s-volume: success

# Show the volume's info
[root@server07 ~]# gluster volume info k8s-volume

Volume Name: k8s-volume
Type: Distribute
Volume ID: e4974285-1304-4ea7-b60f-ebe8375dba86
Status: Stopped
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: server07:/opt/gluster/data
Brick2: server08:/opt/gluster/data
Brick3: server09:/opt/gluster/data
Options Reconfigured:
transport.address-family: inet
nfs.disable: on

# Delete the volume
[root@server07 ~]# gluster volume delete k8s-volume
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: k8s-volume: success

# Show the volume's info
[root@server07 ~]# gluster volume info k8s-volume
Volume k8s-volume does not exist

III. How to Reset All Nodes in the GlusterFS Cluster

Assuming the cluster has a single volume named k8s-volume, reset the cluster as follows:

# Reset GlusterFS
gluster volume list
gluster volume stop k8s-volume
gluster volume delete k8s-volume
gluster volume list
gluster peer status
gluster peer help
gluster peer detach server08
gluster peer detach server09
gluster peer status

IV. How to Wipe a Disk Used by the GlusterFS Cluster Back to a Raw Disk

# Warning: never run this against the root disk; the same command also works for disks previously used by other storage systems
dd if=/dev/zero of=/dev/<sd?> bs=1M count=200

CentOS 7 Nginx Usage Examples

1. Install Nginx with yum

yum install -y epel-release
yum makecache fast
yum install -y nginx

2. Configure Nginx. The examples below cover HTTP, HTTPS, TCP, and UDP proxying; adjust them to your needs in practice

# (1) HTTP proxy
# Create a new configuration file /etc/nginx/conf.d/remote_xxx.conf

upstream remote-xxx {
server x.x.x.x:43747;
}

server {
listen 43747;
server_name 111.206.120.158;
access_log /var/log/nginx/remote-xxx-access.log main;
error_log /var/log/nginx/remote-xxx-error.log;
add_header Cache-Control no-cache;

location / {
proxy_pass http://remote-xxx/;
proxy_http_version 1.1;
proxy_connect_timeout 30m;
proxy_send_timeout 30m;
proxy_read_timeout 30m;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $http_connection;
proxy_buffering off;
}
}

# (2) HTTPS proxy
# Configure the SSL certificate in /etc/nginx/nginx.conf
......
http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
ssl_certificate /data/demo/ssl/3838460__xesv5.com.pem; # absolute path to the certificate
ssl_certificate_key /data/demo/ssl/3838460__xesv5.com.key; # absolute path, as above
ssl_session_timeout 5m;
ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # protocols to allow
ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:HIGH:!aNULL:!MD5:!RC4:!DHE; # cipher suites to allow
ssl_prefer_server_ciphers on;

include /etc/nginx/mime.types;
default_type application/octet-stream;

# Load modular configuration files from the /etc/nginx/conf.d directory.
# See http://nginx.org/en/docs/ngx_core_module.html#include
# for more information.
include /etc/nginx/conf.d/*.conf;
。。。。。。


# Create a new configuration file /etc/nginx/conf.d/xxx_xx_ssl.conf
upstream xxx-xx {
server x.x.x.x:6088;
}

server {
listen 0.0.0.0:9606 ssl;
server_name xxx-xx;
access_log /var/log/nginx/xxx-xx-access.log main;
error_log /var/log/nginx/xxx-xx-error.log;

location / {
proxy_pass http://xxx-xx/;
proxy_http_version 1.1;
proxy_connect_timeout 30m;
proxy_send_timeout 30m;
proxy_read_timeout 30m;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $http_connection;
}
}

# (3) TCP proxy
# Configure /etc/nginx/nginx.conf so that the http and stream blocks can be used at the same time
......
http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
ssl_certificate /data/demo/ssl/3838460__xesv5.com.pem; # absolute path to the certificate
ssl_certificate_key /data/demo/ssl/3838460__xesv5.com.key; # absolute path, as above
ssl_session_timeout 5m;
ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # protocols to allow
ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:HIGH:!aNULL:!MD5:!RC4:!DHE; # cipher suites to allow
ssl_prefer_server_ciphers on;

include /etc/nginx/mime.types;
default_type application/octet-stream;

# Load modular configuration files from the /etc/nginx/conf.d directory.
# See http://nginx.org/en/docs/ngx_core_module.html#include
# for more information.
include /etc/nginx/conf.d/http_*.conf;
。。。。。。

stream {
include /etc/nginx/conf.d/stream_*.conf;
}

# Create a new configuration file /etc/nginx/conf.d/stream_prom_snmp.conf
upstream backend1 {
server 192.168.112.130:41782 max_fails=3 fail_timeout=30s;
}

server {
listen 41782;
proxy_connect_timeout 1s;
proxy_timeout 3s;
proxy_pass backend1;
}

# (4) UDP proxy
# Configure /etc/nginx/nginx.conf so that the http and stream blocks can be used at the same time
......
http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
ssl_certificate /data/demo/ssl/3838460__xesv5.com.pem; # absolute path to the certificate
ssl_certificate_key /data/demo/ssl/3838460__xesv5.com.key; # absolute path, as above
ssl_session_timeout 5m;
ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # protocols to allow
ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:HIGH:!aNULL:!MD5:!RC4:!DHE; # cipher suites to allow
ssl_prefer_server_ciphers on;

include /etc/nginx/mime.types;
default_type application/octet-stream;

# Load modular configuration files from the /etc/nginx/conf.d directory.
# See http://nginx.org/en/docs/ngx_core_module.html#include
# for more information.
include /etc/nginx/conf.d/http_*.conf;
。。。。。。

stream {
include /etc/nginx/conf.d/stream_*.conf;
}

# Create a new configuration file /etc/nginx/conf.d/stream_core_dns.conf
upstream backend2 {
server 192.168.112.130:31857 max_fails=3 fail_timeout=30s;
}

server {
listen 53 udp;
proxy_connect_timeout 1s;
proxy_timeout 3s;
proxy_pass backend2;
}

4. Start Nginx

systemctl start nginx.service
systemctl status nginx.service

5. References

https://blog.csdn.net/weixin_44723434/article/details/97809824
http://nginx.org/en/docs/stream/ngx_stream_core_module.html
http://nginx.org/en/docs/stream/ngx_stream_proxy_module.html#proxy_connect_timeout

Installing a Kubernetes Cluster with kubeadm (Part 1)

I. Approach Overview

1. Component categories

Add-on components: kube-proxy, coredns, and calico.
Core components: etcd, kube-apiserver, kube-controller-manager, and kube-scheduler.
Base components: docker, kubelet, kubeadm, and kubectl.

The core and add-on components run on top of the base components.

The base components are all deployed as binaries: docker and kubelet are daemons managed by systemd, while kubeadm and kubectl are plain command-line tools used directly.
The core components are all deployed as containers, using the static pod mechanism, and are managed by kubelet and docker (see the sketch right after this list).
The add-on components are deployed as Kubernetes add-ons, i.e. ordinary containerized applications on the cluster, managed by Kubernetes itself.
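The split can be observed directly on a running control-plane node. An illustrative sketch (the paths are kubeadm's defaults and assume the cluster has already been initialized):

# Core components: static pod manifests written to the master's disk;
# kubelet runs whatever it finds in this directory.
ls /etc/kubernetes/manifests/
# etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml

# Add-on components: ordinary workloads in the kube-system namespace.
kubectl -n kube-system get daemonsets,deployments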

2. Control-plane node (master)

Base components + core components + add-on components

Add-on components: kube-proxy, coredns, and calico.
Core components: etcd, kube-apiserver, kube-controller-manager, and kube-scheduler.
Base components: docker, kubelet, kubeadm, and kubectl.

3. Worker node (node)

Base components + add-on components

Add-on components: kube-proxy, coredns, and calico.
Base components: docker, kubelet, kubeadm, and kubectl.

II. Environment Versions

1. Operating system

CentOS Linux release 7.6.1810 (Core)

2. Component versions

docker community 18.09.9
etcd v3.3.10 v3.3.15-0 v3.4.3
kube-apiserver v1.15.3 v1.16.0 v1.17.0
kube-controller-manager v1.15.3 v1.16.0 v1.17.0
kube-scheduler v1.15.3 v1.16.0 v1.17.0
kube-proxy v1.15.3 v1.16.0 v1.17.0
coredns v1.3.1 v1.6.2 1.6.5
kubectl v1.15.3 v1.16.0 v1.17.0
kubelet v1.15.3 v1.16.0 v1.17.0
calico v3.8.2 v3.11.1

注意:本文经实验支持 kubernetes 的 v1.15.3 和 v1.16.0 版本。上面各组件的版本号从左到右依次对应 v1.15.3、v1.16.0(以及 v1.17.0),而 docker 和 calico 在两个版本的 kubernetes 中使用的版本完全一致,请对号入座。

三、部署架构

1. Kubernetes Master(Control Plane)

192.168.112.128 master -> docker kubelet kube-proxy calico coredns etcd kube-apiserver kube-controller-manager kube-scheduler

2. Kubernetes Node

192.168.112.129 node01 -> docker kubelet kube-proxy calico
192.168.112.130 node02 -> docker kubelet kube-proxy calico

四、安装步骤

1. 准备基础的Linux服务器环境(要求所有的节点上执行,既包括master节点又包括node节点)

# 更新系统
yum update -y

# 设置正确的时区和时间
yum install -y ntpdate
timedatectl set-timezone Asia/Shanghai
ntpdate cn.ntp.org.cn

# 关闭防火墙
systemctl disable firewalld.service
systemctl stop firewalld.service

# 关闭swap分区
swapoff -a
sed -i 's#/dev/mapper/cl-swap#\# /dev/mapper/cl-swap#' /etc/fstab

# 关闭selinux
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# 设置各个节点的主机名
## 192.168.112.128
hostnamectl set-hostname master

## 192.168.112.129
hostnamectl set-hostname node01

## 192.168.112.130
hostnamectl set-hostname node02

# 配置主机名和IP的映射
cat <<EOF >> /etc/hosts

# For Kubernetes Cluster
192.168.112.128 master
192.168.112.129 node01
192.168.112.130 node02
EOF

# 修改内核参数
cat <<EOF > /etc/sysctl.d/kubernetes.conf
net.bridge.bridge-nf-call-iptables = 1

net.ipv4.ip_forward = 1

net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_tw_reuse = 1
EOF

sysctl --system

# 修改ulimit限制
cat <<EOF > /etc/security/limits.d/kubernetes.conf
* hard nofile 65535
* soft nofile 65535

* hard nproc 65535
* soft nproc 65535
EOF
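
内核参数和 ulimit 修改完成后,可以做一次简单确认(示意命令;ulimit 的修改需要重新登录 shell 才会生效。若提示找不到 net.bridge.* 参数,通常是 br_netfilter 模块尚未加载,可先执行 modprobe br_netfilter):

sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
ulimit -n
ulimit -u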

2. 安装Docker(要求所有的节点上执行,既包括master节点又包括node节点)

# 配置yum源,然后安装Docker
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum makecache fast
yum install -y docker-ce-18.09.9-3.el7 docker-ce-cli-18.09.9-3.el7

# 启动Docker,并将其设置为开机启动
systemctl daemon-reload
systemctl enable docker.service
systemctl start docker.service

# 确认Docker启动是否正常
systemctl status docker.service
docker info
docker version

# 方法一:检查iptables的forward链的默认策略
iptables -nL
。。。
Chain FORWARD (policy ACCEPT)
。。。

# 方法二:检查iptables的forward链的默认策略
iptables-save -t filter
。。。
# Generated by iptables-save v1.4.21 on Thu Oct 3 12:28:24 2019
*filter
:INPUT ACCEPT [2117:366255]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [2188:436727]
。。。

# 设置docker daemon的cgroup driver为systemd
cat <<EOF > /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver":"json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
EOF

systemctl daemon-reload
systemctl restart docker.service
systemctl status docker.service

# 验证docker daemon的cgroup driver是否为systemd
docker info | grep -i "cgroup driver"
。。。
Cgroup Driver: systemd
。。。

3. 安装容器化部署kubernetes所必备的二进制文件,它们包括kubeadm、kubelet和kubectl(要求所有的节点上执行,既包括master节点又包括node节点)

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum makecache fast
yum list kubelet --showduplicates | sort -r
yum list kubeadm --showduplicates | sort -r
yum list kubectl --showduplicates | sort -r

## v1.15.3
yum install -y kubeadm-1.15.3-0 kubelet-1.15.3-0 kubectl-1.15.3-0

## v1.16.0
yum install -y kubeadm-1.16.0-0 kubelet-1.16.0-0 kubectl-1.16.0-0

## v1.17.0
yum install -y kubeadm-1.17.0-0 kubelet-1.17.0-0 kubectl-1.17.0-0

systemctl daemon-reload
systemctl enable kubelet.service
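
安装完成后,可以确认各二进制文件的版本与预期一致(示意命令):

kubeadm version
kubectl version --client
kubelet --version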

4. 拉取相关的Docker镜像

# master节点
## v1.15.3
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.15.3
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.15.3
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.15.3
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.15.3
docker pull calico/node:v3.8.2
docker pull calico/cni:v3.8.2
docker pull calico/kube-controllers:v3.8.2
docker pull calico/pod2daemon-flexvol:v3.8.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.3.1
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.10
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1

## v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.16.0
docker pull calico/node:v3.8.2
docker pull calico/cni:v3.8.2
docker pull calico/kube-controllers:v3.8.2
docker pull calico/pod2daemon-flexvol:v3.8.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.3.15-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1

# node节点
## v1.15.3
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.3.1
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.15.3
docker pull calico/node:v3.8.2
docker pull calico/cni:v3.8.2
docker pull calico/kube-controllers:v3.8.2
docker pull calico/pod2daemon-flexvol:v3.8.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1

## v1.16.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.16.0
docker pull calico/node:v3.8.2
docker pull calico/cni:v3.8.2
docker pull calico/kube-controllers:v3.8.2
docker pull calico/pod2daemon-flexvol:v3.8.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1

5. 使用kubeadm工具初始化master节点(仅在master节点上执行)

kubeadm config print init-defaults
mkdir -p kubeadm/

## v1.15.3
cat <<EOF > kubeadm/kubeadm.conf
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 192.168.112.128
bindPort: 6443
nodeRegistration:
criSocket: /var/run/dockershim.sock
name: master
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.15.3
networking:
podSubnet: 10.211.0.0/16
dnsDomain: cluster.local
serviceSubnet: 10.96.0.0/16
scheduler: {}

EOF

## v1.16.0
cat <<EOF > kubeadm/kubeadm.conf
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 192.168.112.128
bindPort: 6443
nodeRegistration:
criSocket: /var/run/dockershim.sock
name: master
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
networking:
podSubnet: 10.211.0.0/16
dnsDomain: cluster.local
serviceSubnet: 10.96.0.0/16
scheduler: {}

EOF

## v1.17.0
cat <<EOF > kubeadm/kubeadm.conf
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 192.168.112.128
bindPort: 6443
nodeRegistration:
criSocket: /var/run/dockershim.sock
name: master
taints:
- effect: NoSchedule
key: node-role.kubernetes.io/master
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
type: CoreDNS
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.17.0
networking:
podSubnet: 10.211.0.0/16
dnsDomain: cluster.local
serviceSubnet: 10.96.0.0/16
scheduler: {}

EOF

kubeadm config images pull --config kubeadm/kubeadm.conf

kubeadm init --config kubeadm/kubeadm.conf
。。。
[init] Using Kubernetes version: v1.15.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master localhost] and IPs [192.168.112.128 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master localhost] and IPs [192.168.112.128 127.0.0.1 ::1]
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.112.128]
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 38.005889 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.15" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.112.128:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:3410e92fa3bb8f43684021614514e0a79d602ad0fe08aae90dc7961348b702de
。。。

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
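
此时可以先用 kubectl 做一次连通性检查。在网络插件安装之前,master 节点处于 NotReady 状态属于正常现象(示意命令):

kubectl get node
# 预期:master 此时为 NotReady,待下一步安装完 Calico 后会变为 Ready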

6. 安装Calico网络插件(仅在master节点上执行)

cat <<EOF > kubeadm/calico.yaml
---
# Source: calico/templates/calico-config.yaml
# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
name: calico-config
namespace: kube-system
data:
# Typha is disabled.
typha_service_name: "none"
# Configure the backend to use.
calico_backend: "bird"

# Configure the MTU to use
veth_mtu: "1440"

# The CNI network configuration to install on each node. The special
# values in this config will be automatically populated.
cni_network_config: |-
{
"name": "k8s-pod-network",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "calico",
"log_level": "info",
"datastore_type": "kubernetes",
"nodename": "__KUBERNETES_NODE_NAME__",
"mtu": __CNI_MTU__,
"ipam": {
"type": "calico-ipam"
},
"policy": {
"type": "k8s"
},
"kubernetes": {
"kubeconfig": "__KUBECONFIG_FILEPATH__"
}
},
{
"type": "portmap",
"snat": true,
"capabilities": {"portMappings": true}
}
]
}

---
# Source: calico/templates/kdd-crds.yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: felixconfigurations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: FelixConfiguration
plural: felixconfigurations
singular: felixconfiguration
---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ipamblocks.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPAMBlock
plural: ipamblocks
singular: ipamblock

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: blockaffinities.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: BlockAffinity
plural: blockaffinities
singular: blockaffinity

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ipamhandles.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPAMHandle
plural: ipamhandles
singular: ipamhandle

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ipamconfigs.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPAMConfig
plural: ipamconfigs
singular: ipamconfig

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: bgppeers.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: BGPPeer
plural: bgppeers
singular: bgppeer

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: bgpconfigurations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: BGPConfiguration
plural: bgpconfigurations
singular: bgpconfiguration

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ippools.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPPool
plural: ippools
singular: ippool

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: hostendpoints.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: HostEndpoint
plural: hostendpoints
singular: hostendpoint

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: clusterinformations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: ClusterInformation
plural: clusterinformations
singular: clusterinformation

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: globalnetworkpolicies.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: GlobalNetworkPolicy
plural: globalnetworkpolicies
singular: globalnetworkpolicy

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: globalnetworksets.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: GlobalNetworkSet
plural: globalnetworksets
singular: globalnetworkset

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: networkpolicies.crd.projectcalico.org
spec:
scope: Namespaced
group: crd.projectcalico.org
version: v1
names:
kind: NetworkPolicy
plural: networkpolicies
singular: networkpolicy

---

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: networksets.crd.projectcalico.org
spec:
scope: Namespaced
group: crd.projectcalico.org
version: v1
names:
kind: NetworkSet
plural: networksets
singular: networkset
---
# Source: calico/templates/rbac.yaml

# Include a clusterrole for the kube-controllers component,
# and bind it to the calico-kube-controllers serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
rules:
# Nodes are watched to monitor for deletions.
- apiGroups: [""]
resources:
- nodes
verbs:
- watch
- list
- get
# Pods are queried to check for existence.
- apiGroups: [""]
resources:
- pods
verbs:
- get
# IPAM resources are manipulated when nodes are deleted.
- apiGroups: ["crd.projectcalico.org"]
resources:
- ippools
verbs:
- list
- apiGroups: ["crd.projectcalico.org"]
resources:
- blockaffinities
- ipamblocks
- ipamhandles
verbs:
- get
- list
- create
- update
- delete
# Needs access to update clusterinformations.
- apiGroups: ["crd.projectcalico.org"]
resources:
- clusterinformations
verbs:
- get
- create
- update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-kube-controllers
subjects:
- kind: ServiceAccount
name: calico-kube-controllers
namespace: kube-system
---
# Include a clusterrole for the calico-node DaemonSet,
# and bind it to the calico-node serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-node
rules:
# The CNI plugin needs to get pods, nodes, and namespaces.
- apiGroups: [""]
resources:
- pods
- nodes
- namespaces
verbs:
- get
- apiGroups: [""]
resources:
- endpoints
- services
verbs:
# Used to discover service IPs for advertisement.
- watch
- list
# Used to discover Typhas.
- get
- apiGroups: [""]
resources:
- nodes/status
verbs:
# Needed for clearing NodeNetworkUnavailable flag.
- patch
# Calico stores some configuration information in node annotations.
- update
# Watch for changes to Kubernetes NetworkPolicies.
- apiGroups: ["networking.k8s.io"]
resources:
- networkpolicies
verbs:
- watch
- list
# Used by Calico for policy information.
- apiGroups: [""]
resources:
- pods
- namespaces
- serviceaccounts
verbs:
- list
- watch
# The CNI plugin patches pods/status.
- apiGroups: [""]
resources:
- pods/status
verbs:
- patch
# Calico monitors various CRDs for config.
- apiGroups: ["crd.projectcalico.org"]
resources:
- globalfelixconfigs
- felixconfigurations
- bgppeers
- globalbgpconfigs
- bgpconfigurations
- ippools
- ipamblocks
- globalnetworkpolicies
- globalnetworksets
- networkpolicies
- networksets
- clusterinformations
- hostendpoints
verbs:
- get
- list
- watch
# Calico must create and update some CRDs on startup.
- apiGroups: ["crd.projectcalico.org"]
resources:
- ippools
- felixconfigurations
- clusterinformations
verbs:
- create
- update
# Calico stores some configuration information on the node.
- apiGroups: [""]
resources:
- nodes
verbs:
- get
- list
- watch
# These permissions are only requried for upgrade from v2.6, and can
# be removed after upgrade or on fresh installations.
- apiGroups: ["crd.projectcalico.org"]
resources:
- bgpconfigurations
- bgppeers
verbs:
- create
- update
# These permissions are required for Calico CNI to perform IPAM allocations.
- apiGroups: ["crd.projectcalico.org"]
resources:
- blockaffinities
- ipamblocks
- ipamhandles
verbs:
- get
- list
- create
- update
- delete
- apiGroups: ["crd.projectcalico.org"]
resources:
- ipamconfigs
verbs:
- get
# Block affinities must also be watchable by confd for route aggregation.
- apiGroups: ["crd.projectcalico.org"]
resources:
- blockaffinities
verbs:
- watch
# The Calico IPAM migration needs to get daemonsets. These permissions can be
# removed if not upgrading from an installation using host-local IPAM.
- apiGroups: ["apps"]
resources:
- daemonsets
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: calico-node
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-node
subjects:
- kind: ServiceAccount
name: calico-node
namespace: kube-system

---
# Source: calico/templates/calico-node.yaml
# This manifest installs the calico-node container, as well
# as the CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: calico-node
namespace: kube-system
labels:
k8s-app: calico-node
spec:
selector:
matchLabels:
k8s-app: calico-node
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
k8s-app: calico-node
annotations:
# This, along with the CriticalAddonsOnly toleration below,
# marks the pod as a critical add-on, ensuring it gets
# priority scheduling and that its resources are reserved
# if it ever gets evicted.
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
nodeSelector:
beta.kubernetes.io/os: linux
hostNetwork: true
tolerations:
# Make sure calico-node gets scheduled on all nodes.
- effect: NoSchedule
operator: Exists
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- effect: NoExecute
operator: Exists
serviceAccountName: calico-node
# Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
# deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
terminationGracePeriodSeconds: 0
priorityClassName: system-node-critical
initContainers:
# This container performs upgrade from host-local IPAM to calico-ipam.
# It can be deleted if this is a fresh installation, or if you have already
# upgraded to use calico-ipam.
- name: upgrade-ipam
image: calico/cni:v3.8.2
command: ["/opt/cni/bin/calico-ipam", "-upgrade"]
env:
- name: KUBERNETES_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
volumeMounts:
- mountPath: /var/lib/cni/networks
name: host-local-net-dir
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
# This container installs the CNI binaries
# and CNI network config file on each node.
- name: install-cni
image: calico/cni:v3.8.2
command: ["/install-cni.sh"]
env:
# Name of the CNI config file to create.
- name: CNI_CONF_NAME
value: "10-calico.conflist"
# The CNI network config to install on each node.
- name: CNI_NETWORK_CONFIG
valueFrom:
configMapKeyRef:
name: calico-config
key: cni_network_config
# Set the hostname based on the k8s node name.
- name: KUBERNETES_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# CNI MTU Config variable
- name: CNI_MTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# Prevents the container from sleeping forever.
- name: SLEEP
value: "false"
volumeMounts:
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
- mountPath: /host/etc/cni/net.d
name: cni-net-dir
# Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
# to communicate with Felix over the Policy Sync API.
- name: flexvol-driver
image: calico/pod2daemon-flexvol:v3.8.2
volumeMounts:
- name: flexvol-driver-host
mountPath: /host/driver
containers:
# Runs calico-node container on each Kubernetes node. This
# container programs network policy and routes on each
# host.
- name: calico-node
image: calico/node:v3.8.2
env:
# Use Kubernetes API as the backing datastore.
- name: DATASTORE_TYPE
value: "kubernetes"
# Wait for the datastore.
- name: WAIT_FOR_DATASTORE
value: "true"
# Set based on the k8s node name.
- name: NODENAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# Choose the backend to use.
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
# Cluster type to identify the deployment type
- name: CLUSTER_TYPE
value: "k8s,bgp"
# Auto-detect the BGP IP address.
- name: IP
value: "autodetect"
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
value: "Always"
# Set MTU for tunnel device used if ipip is enabled
- name: FELIX_IPINIPMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# The default IPv4 pool to create on startup if none exists. Pod IPs will be
# chosen from this range. Changing this value after installation will have
# no effect. This should fall within \`--cluster-cidr\`.
- name: CALICO_IPV4POOL_CIDR
value: "10.211.0.0/16"
# Disable file logging so \`kubectl logs\` works.
- name: CALICO_DISABLE_FILE_LOGGING
value: "true"
# Set Felix endpoint to host default action to ACCEPT.
- name: FELIX_DEFAULTENDPOINTTOHOSTACTION
value: "ACCEPT"
# Disable IPv6 on Kubernetes.
- name: FELIX_IPV6SUPPORT
value: "false"
# Set Felix logging to "info"
- name: FELIX_LOGSEVERITYSCREEN
value: "info"
- name: FELIX_HEALTHENABLED
value: "true"
securityContext:
privileged: true
resources:
requests:
cpu: 250m
livenessProbe:
httpGet:
path: /liveness
port: 9099
host: localhost
periodSeconds: 10
initialDelaySeconds: 10
failureThreshold: 6
readinessProbe:
exec:
command:
- /bin/calico-node
- -bird-ready
- -felix-ready
periodSeconds: 10
volumeMounts:
- mountPath: /lib/modules
name: lib-modules
readOnly: true
- mountPath: /run/xtables.lock
name: xtables-lock
readOnly: false
- mountPath: /var/run/calico
name: var-run-calico
readOnly: false
- mountPath: /var/lib/calico
name: var-lib-calico
readOnly: false
- name: policysync
mountPath: /var/run/nodeagent
volumes:
# Used by calico-node.
- name: lib-modules
hostPath:
path: /lib/modules
- name: var-run-calico
hostPath:
path: /var/run/calico
- name: var-lib-calico
hostPath:
path: /var/lib/calico
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate
# Used to install CNI.
- name: cni-bin-dir
hostPath:
path: /opt/cni/bin
- name: cni-net-dir
hostPath:
path: /etc/cni/net.d
# Mount in the directory for host-local IPAM allocations. This is
# used when upgrading from host-local to calico-ipam, and can be removed
# if not using the upgrade-ipam init container.
- name: host-local-net-dir
hostPath:
path: /var/lib/cni/networks
# Used to create per-pod Unix Domain Sockets
- name: policysync
hostPath:
type: DirectoryOrCreate
path: /var/run/nodeagent
# Used to install Flex Volume Driver
- name: flexvol-driver-host
hostPath:
type: DirectoryOrCreate
path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
---

apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-node
namespace: kube-system

---
# Source: calico/templates/calico-kube-controllers.yaml

# See https://github.com/projectcalico/kube-controllers
apiVersion: apps/v1
kind: Deployment
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
spec:
# The controllers can only have a single active instance.
replicas: 1
selector:
matchLabels:
k8s-app: calico-kube-controllers
strategy:
type: Recreate
template:
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
nodeSelector:
beta.kubernetes.io/os: linux
tolerations:
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/master
effect: NoSchedule
serviceAccountName: calico-kube-controllers
priorityClassName: system-cluster-critical
containers:
- name: calico-kube-controllers
image: calico/kube-controllers:v3.8.2
env:
# Choose which controllers to run.
- name: ENABLED_CONTROLLERS
value: node
- name: DATASTORE_TYPE
value: kubernetes
readinessProbe:
exec:
command:
- /usr/bin/check-status
- -r

---

apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-kube-controllers
namespace: kube-system
---
# Source: calico/templates/calico-etcd-secrets.yaml

---
# Source: calico/templates/calico-typha.yaml

---
# Source: calico/templates/configure-canal.yaml


EOF

kubectl create -f kubeadm/calico.yaml


# 等待所有组件启动完毕
kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-65b8787765-5g29q 1/1 Running 0 31s 10.211.219.67 master <none> <none>
calico-node-9knqh 1/1 Running 0 31s 192.168.112.128 master <none> <none>
coredns-6967fb4995-sqr5n 1/1 Running 0 5m9s 10.211.219.66 master <none> <none>
coredns-6967fb4995-xzqrw 1/1 Running 0 5m9s 10.211.219.65 master <none> <none>
etcd-master 1/1 Running 0 4m18s 192.168.112.128 master <none> <none>
kube-apiserver-master 1/1 Running 0 4m24s 192.168.112.128 master <none> <none>
kube-controller-manager-master 1/1 Running 0 4m18s 192.168.112.128 master <none> <none>
kube-proxy-sjs7z 1/1 Running 0 5m9s 192.168.112.128 master <none> <none>
kube-scheduler-master 1/1 Running 0 4m29s 192.168.112.128 master <none> <none>

kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready master 6m18s v1.15.3 192.168.112.128 <none> CentOS Linux 7 (Core) 3.10.0-957.21.3.el7.x86_64 docker://18.9.9

7. 把计算节点加入集群

# 如果没有保存 kubeadm join 添加节点的命令,怎么办?
kubeadm join --token <token> <control-plane-host>:<control-plane-port> --discovery-token-ca-cert-hash sha256:<hash>

## 示例
kubeadm join --token abcdef.0123456789abcdef 192.168.112.128:6443 --discovery-token-ca-cert-hash sha256:a8f5f1e52f04a086ba664a401a957cbb1a3c3b400bbed3d56f6bf9b8a8f3f969

## get exist token value
kubeadm token list

## create new token
kubeadm token create

## get --discovery-token-ca-cert-hash value
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
openssl dgst -sha256 -hex | sed 's/^.* //'

# node01节点上执行
kubeadm join 192.168.112.128:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:3410e92fa3bb8f43684021614514e0a79d602ad0fe08aae90dc7961348b702de
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.


# node02节点上执行
kubeadm join 192.168.112.128:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:3410e92fa3bb8f43684021614514e0a79d602ad0fe08aae90dc7961348b702de

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

# 在master节点上执行
kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-65b8787765-5g29q 1/1 Running 0 6m50s 10.211.219.67 master <none> <none>
calico-node-9knqh 1/1 Running 0 6m50s 192.168.112.128 master <none> <none>
calico-node-wzvtt 1/1 Running 0 50s 192.168.112.130 node02 <none> <none>
calico-node-z8vzx 1/1 Running 0 2m1s 192.168.112.129 node01 <none> <none>
coredns-6967fb4995-sqr5n 1/1 Running 0 11m 10.211.219.66 master <none> <none>
coredns-6967fb4995-xzqrw 1/1 Running 0 11m 10.211.219.65 master <none> <none>
etcd-master 1/1 Running 0 10m 192.168.112.128 master <none> <none>
kube-apiserver-master 1/1 Running 0 10m 192.168.112.128 master <none> <none>
kube-controller-manager-master 1/1 Running 0 10m 192.168.112.128 master <none> <none>
kube-proxy-675dj 1/1 Running 0 50s 192.168.112.130 node02 <none> <none>
kube-proxy-sjs7z 1/1 Running 0 11m 192.168.112.128 master <none> <none>
kube-proxy-whjbv 1/1 Running 0 2m1s 192.168.112.129 node01 <none> <none>
kube-scheduler-master 1/1 Running 0 10m 192.168.112.128 master <none> <none>

kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready master 12m v1.15.3 192.168.112.128 <none> CentOS Linux 7 (Core) 3.10.0-957.21.3.el7.x86_64 docker://18.9.9
node01 Ready <none> 2m18s v1.15.3 192.168.112.129 <none> CentOS Linux 7 (Core) 3.10.0-957.21.3.el7.x86_64 docker://18.9.9
node02 Ready <none> 67s v1.15.3 192.168.112.130 <none> CentOS Linux 7 (Core) 3.10.0-957.21.3.el7.x86_64 docker://18.9.9

8. 验证Pod的网络和DNS配置

# 在node01节点和node02节点上分别操作
mkdir -p network/
cd network/

cat <<EOF > Dockerfile
FROM alpine:3.8

MAINTAINER wangxin_0611@126.com

RUN apk add --no-cache ca-certificates bind-tools iputils iproute2 net-tools tcpdump
EOF

docker build -t alpine:3.8-network .

# 在master节点上操作
mkdir -p network/

cat <<EOF > network/network.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: network
namespace: default
spec:
selector:
matchLabels:
app: network
template:
metadata:
labels:
app: network
spec:
containers:
- name: network
image: alpine:3.8-network
imagePullPolicy: IfNotPresent
command:
- sleep
- "3600"
restartPolicy: Always
EOF

kubectl create -f network/
[root@master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
network-47cp6 1/1 Running 0 7s 10.211.140.65 node02 <none> <none>
network-86x76 1/1 Running 0 7s 10.211.196.129 node01 <none> <none>

# 在node01上的pod中验证
[root@master ~]# kubectl exec -it network-86x76 /bin/sh
/ # cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ # ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
link/ether ea:1e:49:24:18:23 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.211.196.129/32 scope global eth0
valid_lft forever preferred_lft forever
/ # ping -c 4 10.211.140.65
PING 10.211.140.65 (10.211.140.65) 56(84) bytes of data.
64 bytes from 10.211.140.65: icmp_seq=1 ttl=62 time=0.513 ms
64 bytes from 10.211.140.65: icmp_seq=2 ttl=62 time=0.734 ms
64 bytes from 10.211.140.65: icmp_seq=3 ttl=62 time=0.348 ms
64 bytes from 10.211.140.65: icmp_seq=4 ttl=62 time=0.889 ms

--- 10.211.140.65 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3001ms
rtt min/avg/max/mdev = 0.348/0.621/0.889/0.206 ms
/ # nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53

Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1

/ # exit

# 在node02上的pod中验证
[root@master ~]# kubectl exec -it network-47cp6 /bin/sh
/ # cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ # ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
link/ether 56:b4:e3:cc:63:5e brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.211.140.65/32 scope global eth0
valid_lft forever preferred_lft forever
/ # ping -c 4 10.211.196.129
PING 10.211.196.129 (10.211.196.129) 56(84) bytes of data.
64 bytes from 10.211.196.129: icmp_seq=1 ttl=62 time=0.539 ms
64 bytes from 10.211.196.129: icmp_seq=2 ttl=62 time=0.885 ms
64 bytes from 10.211.196.129: icmp_seq=3 ttl=62 time=0.999 ms
64 bytes from 10.211.196.129: icmp_seq=4 ttl=62 time=0.975 ms

--- 10.211.196.129 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 0.539/0.849/0.999/0.186 ms
/ # nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53

Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1

/ # exit

五、参考资料

https://yq.aliyun.com/articles/110806?spm=5176.8351553.0.0.7e4d1991yETOzt
https://blog.csdn.net/kozazyh/article/details/79795559
https://www.kubernetes.org.cn/5551.html

CentOS 7 HAProxy 使用示例

1. 使用yum安装HAProxy

yum makecache fast
yum install -y haproxy

2. 借助rsyslog配置HAProxy输出日志到文件

# 编辑/etc/rsyslog.conf
# 启用在udp 514端口接收日志消息
$ModLoad imudp
$UDPServerRun 514

# 以下内容追加到配置文件最后

# Save haproxy log to haproxy.log
local0.* /var/log/haproxy.log
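
修改 /etc/rsyslog.conf 之后需要重启 rsyslog 才能生效,之后可以用 logger 发一条测试日志确认(示意命令):

systemctl restart rsyslog.service
logger -p local0.info "haproxy rsyslog test"
tail -n 5 /var/log/haproxy.log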

3. 配置HAProxy,这里仅提供一个四层TCP代理的示例,实际中请根据需要修改

# 备份/etc/haproxy/haproxy.cfg
cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak

# 编辑/etc/haproxy/haproxy.cfg
global
log 127.0.0.1 local0 debug
maxconn 50000
uid 99
gid 99
#daemon
nbproc 1
pidfile haproxy.pid

defaults
mode tcp
log global
maxconn 50000
retries 3
timeout connect 10s
timeout client 60m
timeout server 60m

listen stats
mode http
bind 0.0.0.0:9090
log global
stats refresh 30s
stats uri /haproxy-status
stats realm Haproxy\ Statistics
stats auth admin:12345678
stats hide-version
stats admin if TRUE

frontend remote-tools-xxx-frontend
mode tcp
bind :43745
default_backend remote-tools-xxx-backend

backend remote-tools-xxx-backend
mode tcp
balance roundrobin
server node01 x.x.x.x:43745 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5

4. 启动HAProxy

systemctl start haproxy.service
systemctl status haproxy.service
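
启动后可以简单验证监听端口和统计页面是否正常(示意命令,账号密码与上面 stats auth 的配置一致):

ss -lntp | grep -E ':9090|:43745'
curl -u admin:12345678 http://127.0.0.1:9090/haproxy-status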

5. 参考资料

https://blog.51cto.com/yanconggod/2062213

使用二进制文件安装高可用Kubernetes v1.17.0集群(Stacked Control Plane Nodes For Baremetal)

一、高可用部署的实现方式介绍

本方案演变自 Kubeadm Highly Available v1.17.0(Stacked etcd topology)部署方案。

二、实验环境版本信息

1. 高可用工具的版本(这里记录的是docker镜像的版本)

keepalived-1.3.5-16.el7
haproxy-1.5.18-9.el7

2. Kubernetes各个组件的版本

etcd v3.4.3
kube-apiserver v1.17.0
kube-controller-manager v1.17.0
kube-scheduler v1.17.0
kubectl v1.17.0
coredns 1.6.5

docker 18.09.9
kube-proxy v1.17.0
kubelet v1.17.0
calico v3.11.1 (calico/node:v3.11.1 calico/pod2daemon-flexvol:v3.11.1 calico/cni:v3.11.1 calico/kube-controllers:v3.11.1)

三、部署架构介绍

(部署架构图:stacked etcd topology)

1. Kubernetes Master(Control Plane)

192.168.112.128 master01 -> docker kubelet keepalived haproxy etcd kube-apiserver kube-controller-manager kube-scheduler kube-proxy calico
192.168.112.129 master02 -> docker kubelet keepalived haproxy etcd kube-apiserver kube-controller-manager kube-scheduler kube-proxy calico
192.168.112.130 master03 -> docker kubelet keepalived haproxy etcd kube-apiserver kube-controller-manager kube-scheduler kube-proxy calico

2. Kubernetes Node

192.168.112.131 node01 -> docker kubelet kube-proxy calico(calico-node)
192.168.112.132 node02 -> docker kubelet kube-proxy calico(calico-node)

四、实现过程记录

1. 在Kubernetes Control Plane上的所有Node上部署HAProxy做为负载均衡器(由Systemd管理以启动二进制文件的方式实现)

## 在控制平面的所有Node上执行,即master01、master02和master03上都执行
yum install -y haproxy
cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
cat <<EOF > /etc/haproxy/haproxy.cfg
global
log 127.0.0.1 local0 err
maxconn 50000
uid 99
gid 99
#daemon
nbproc 1
pidfile haproxy.pid

defaults
mode tcp
log 127.0.0.1 local0 err
maxconn 50000
retries 3
timeout connect 10s
timeout client 10m
timeout server 10m

listen stats
mode http
bind 0.0.0.0:9090
log 127.0.0.1 local0 err
stats refresh 30s
stats uri /haproxy-status
stats realm Haproxy\ Statistics
stats auth admin:12345678
stats hide-version
stats admin if TRUE

frontend kube-apiserver-https
mode tcp
bind :8443
default_backend kube-apiserver-backend

backend kube-apiserver-backend
mode tcp
balance roundrobin
server master01 192.168.112.128:6443 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5
server master02 192.168.112.129:6443 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5
server master03 192.168.112.130:6443 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5
EOF

systemctl daemon-reload
systemctl enable haproxy.service
systemctl start haproxy.service
systemctl status haproxy.service
------------------------------------------------------------------------------------------------------------------------------------------------
● haproxy.service - HAProxy Load Balancer
Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2020-03-15 12:42:17 CST; 34s ago
Main PID: 3273 (haproxy-systemd)
Tasks: 3
Memory: 2.3M
。。。。。。
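
此时可以在三个 master 节点上确认 HAProxy 的 8443 端口已在监听(示意命令;kube-apiserver 的 6443 此时尚未部署,HAProxy 对后端的健康检查失败属于正常现象):

ss -lntp | grep 8443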

2. 在Kubernetes Control Plane的所有Node上部署Keepalived(由Systemd管理以启动二进制文件的方式实现)

# 在控制平面的所有Node上执行,即master01、master02和master03上都执行
yum install -y keepalived
cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak

# 在控制平面的master01上执行
cat <<EOF > /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
router_id k8s-1
}

vrrp_script CheckK8sMaster {
script "curl -k https://127.0.0.1:6443/api"
interval 3
timeout 9
fall 2
rise 2
}

vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 51
priority 200
advert_int 1
mcast_src_ip 192.168.112.128
nopreempt
authentication {
auth_type PASS
auth_pass 378378
}
unicast_peer {
192.168.112.129
192.168.112.130
}
virtual_ipaddress {
192.168.112.136
}
track_script {
CheckK8sMaster
}
}
EOF

# 在控制平面的master02上执行
cat <<EOF > /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
router_id k8s-2
}

vrrp_script CheckK8sMaster {
script "curl -k https://127.0.0.1:6443/api"
interval 3
timeout 9
fall 2
rise 2
}

vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 150
advert_int 1
mcast_src_ip 192.168.112.129
nopreempt
authentication {
auth_type PASS
auth_pass 378378
}
unicast_peer {
192.168.112.128
192.168.112.130
}
virtual_ipaddress {
192.168.112.136
}
track_script {
CheckK8sMaster
}
}
EOF

# 在控制平面的master03上执行
cat <<EOF > /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
router_id k8s-3
}

vrrp_script CheckK8sMaster {
script "curl -k https://127.0.0.1:6443/api"
interval 3
timeout 9
fall 2
rise 2
}

vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 150
advert_int 1
mcast_src_ip 192.168.112.130
nopreempt
authentication {
auth_type PASS
auth_pass 378378
}
unicast_peer {
192.168.112.128
192.168.112.129
}
virtual_ipaddress {
192.168.112.136
}
track_script {
CheckK8sMaster
}
}
EOF

# 在控制平面的所有Node上执行,即master01、master02和master03上都执行
systemctl daemon-reload
systemctl enable keepalived.service
systemctl start keepalived.service
systemctl status keepalived.service
------------------------------------------------------------------------------------------------------------------------------------------------
● keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2020-03-15 12:49:45 CST; 16s ago
Process: 3632 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 3633 (keepalived)
Tasks: 3
Memory: 6.5M
。。。。。。
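
Keepalived 启动后,可以查看 VIP 192.168.112.136 当前落在哪个节点上(示意命令)。注意此时 kube-apiserver 尚未部署,vrrp_script 的健康检查会失败,VIP 可能暂时不生效,待后文 kube-apiserver 启动后再确认即可:

ip addr show ens33 | grep 192.168.112.136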

3. 复制所有二进制文件到操作系统/usr/bin/目录下

# master01、master02和master03上分别执行
tar -zxvf etcd-v3.4.3-linux-amd64.tar.gz
tar -zxvf kubernetes-server-linux-amd64.tar.gz

cp etcd-v3.4.3-linux-amd64/etcd /usr/bin/
cp etcd-v3.4.3-linux-amd64/etcdctl /usr/bin/
cp kubernetes/server/bin/kube-apiserver /usr/bin/
cp kubernetes/server/bin/kube-controller-manager /usr/bin/
cp kubernetes/server/bin/kube-scheduler /usr/bin/
cp kubernetes/server/bin/kubectl /usr/bin/


# node01和node02上分别执行
tar -zxvf kubernetes-server-linux-amd64.tar.gz

cp kubernetes/server/bin/kubelet /usr/bin/
cp kubernetes/server/bin/kube-proxy /usr/bin/

4. Kubernetes Control Plane的第一个Node上生成根证书、RSA秘钥和kubectl的访问配置文件

# 创建证书和配置文件的存放目录
mkdir -p /etc/kubernetes/pki/etcd/

# 生成etcd的相关证书
cd /etc/kubernetes/pki/etcd/
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=etcd-ca" -days 5000 -out ca.crt

# 生成rsa的公钥和私钥
cd /etc/kubernetes/pki/
openssl genrsa -out sa.key 2048
openssl rsa -in sa.key -pubout -out sa.pub

# 生成根证书
cd /etc/kubernetes/pki/
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=kubernetes" -days 5000 -out ca.crt

openssl genrsa -out front-proxy-ca.key 2048
openssl req -x509 -new -nodes -key front-proxy-ca.key -subj "/CN=front-proxy-ca" -days 5000 -out front-proxy-ca.crt

# 为kubectl生成相关的证书和配置文件
openssl genrsa -out kubectl.key 2048
openssl req -new -key kubectl.key -subj "/O=system:masters/CN=kubernetes-admin" -out kubectl.csr
openssl x509 -req -in kubectl.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out kubectl.crt -days 5000

export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl config set-cluster kubernetes --server=https://192.168.112.136:8443 --certificate-authority=/etc/kubernetes/pki/ca.crt --embed-certs=true
kubectl config set-credentials kubernetes-admin --client-certificate=/etc/kubernetes/pki/kubectl.crt --client-key=/etc/kubernetes/pki/kubectl.key --embed-certs=true
kubectl config set-context kubernetes-admin@kubernetes --cluster=kubernetes --user=kubernetes-admin
kubectl config use-context kubernetes-admin@kubernetes
unset KUBECONFIG

## 为kube-proxy生成相关的证书和配置文件
## kubernetes内置的为kube-proxy而生的clusterrole,可以使用kubectl get clusterrole system:node-proxier -o yaml进行查看
## kubernetes内置的为kube-proxy而生的clusterrolebinding,绑定到了用户system:kube-proxy,可以使用kubectl get clusterrolebinding system:node-proxier -o yaml进行查看
openssl genrsa -out proxy.key 2048
openssl req -new -key proxy.key -subj "/CN=system:kube-proxy" -out proxy.csr
openssl x509 -req -in proxy.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out proxy.crt -days 5000

export KUBECONFIG=/etc/kubernetes/proxy.conf
kubectl config set-cluster kubernetes --server=https://192.168.112.136:8443 --certificate-authority=/etc/kubernetes/pki/ca.crt --embed-certs=true
kubectl config set-credentials system:kube-proxy --client-certificate=/etc/kubernetes/pki/proxy.crt --client-key=/etc/kubernetes/pki/proxy.key --embed-certs=true
kubectl config set-context system:kube-proxy@kubernetes --cluster=kubernetes --user=system:kube-proxy
kubectl config use-context system:kube-proxy@kubernetes
unset KUBECONFIG

## 为Bootstrap Token生成配置文件,一旦这里的 --token 参数值做了修改,后面用于开启Bootstrap Token的Secret配置需要同步修改,其对应规律如下:
## 1. token为abcdef.0123456789abcdef,其对应了后面启用Bootstrap Token的Secret中的 <token-id>.<token-secret>
## 2. 后面用于启用Bootstrap Token的Secret的名字为bootstrap-token-abcdef,其严格对应了格式:bootstrap-token-<token-id>
export KUBECONFIG=/etc/kubernetes/bootstrap-kubelet.conf
kubectl config set-cluster kubernetes --server=https://192.168.112.136:8443 --certificate-authority=/etc/kubernetes/pki/ca.crt --embed-certs=true
kubectl config set-credentials system:bootstrap:abcdef --token=abcdef.0123456789abcdef
kubectl config set-context system:bootstrap:abcdef@kubernetes --cluster=kubernetes --user=system:bootstrap:abcdef
kubectl config use-context system:bootstrap:abcdef@kubernetes
unset KUBECONFIG
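
为了便于理解上面说明的对应关系,这里给出一个启用 Bootstrap Token 的 Secret 的示意片段(仅用于说明 <token-id>.<token-secret> 与 Secret 名称 bootstrap-token-<token-id> 的对应方式;文件名 bootstrap-token-example.yaml 为演示用,实际配置以后文为准):

cat <<EOF > bootstrap-token-example.yaml
apiVersion: v1
kind: Secret
metadata:
  # Secret 名称严格遵循 bootstrap-token-<token-id> 的格式
  name: bootstrap-token-abcdef
  namespace: kube-system
type: bootstrap.kubernetes.io/token
stringData:
  token-id: abcdef
  token-secret: 0123456789abcdef
  usage-bootstrap-authentication: "true"
  usage-bootstrap-signing: "true"
EOF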

5. 分发根证书、RSA秘钥和kubectl的访问配置文件到Kubernetes Control Plane的剩余两个Node上

# 在master01上执行
# 配置master01到master02和master03的ssh免密登录
ssh-keygen
ssh-copy-id -i .ssh/id_rsa.pub root@master02
ssh-copy-id -i .ssh/id_rsa.pub root@master03

## 验证master01到master02和master03的ssh免密登录
ssh master02
ssh master03

cat <<EOF > kubernetes-master-transfer.sh
USER=root
CONTROL_PLANE_IPS="192.168.112.129 192.168.112.130"
for host in \${CONTROL_PLANE_IPS}; do
ssh \${USER}@\$host 'mkdir -p /etc/kubernetes/pki/etcd/'
scp /etc/kubernetes/pki/ca.crt \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt \${USER}@\$host:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key \${USER}@\$host:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/admin.conf \${USER}@\$host:/etc/kubernetes/
scp /etc/kubernetes/proxy.conf \${USER}@\$host:/etc/kubernetes/
scp /etc/kubernetes/bootstrap-kubelet.conf \${USER}@\$host:/etc/kubernetes/
done
EOF
chmod 0755 kubernetes-master-transfer.sh
./kubernetes-master-transfer.sh
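
## 补充示例(非原文步骤,仅供参考):分发完成后,可以在master01上用md5sum粗略校验master02和master03收到的关键文件与本地是否一致
for host in 192.168.112.129 192.168.112.130; do
  ssh root@${host} 'md5sum /etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/sa.pub /etc/kubernetes/admin.conf'
done
md5sum /etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/sa.pub /etc/kubernetes/admin.conf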

6. 利用根证书签发各个Master节点上需要的所有证书

# 仅在master01上执行
## 生成etcd的相关证书
cd /etc/kubernetes/pki/etcd/

cat <<EOF > server_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = master01
DNS.2 = localhost
IP.1 = 192.168.112.128
IP.2 = 127.0.0.1
EOF
openssl genrsa -out server.key 2048
openssl req -new -key server.key -subj "/CN=master01" -config server_ssl.cnf -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 5000 -extensions v3_req -extfile server_ssl.cnf -out server.crt

cat <<EOF > peer_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = master01
DNS.2 = localhost
IP.1 = 192.168.112.128
IP.2 = 127.0.0.1
EOF
openssl genrsa -out peer.key 2048
openssl req -new -key peer.key -subj "/CN=master01" -config peer_ssl.cnf -out peer.csr
openssl x509 -req -in peer.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 5000 -extensions v3_req -extfile peer_ssl.cnf -out peer.crt

openssl genrsa -out healthcheck-client.key 2048
openssl req -new -key healthcheck-client.key -subj "/O=system:masters/CN=kube-etcd-healthcheck-client" -out healthcheck-client.csr
openssl x509 -req -in healthcheck-client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out healthcheck-client.crt -days 5000

cd /etc/kubernetes/pki/
openssl genrsa -out apiserver-etcd-client.key 2048
openssl req -new -key apiserver-etcd-client.key -subj "/O=system:masters/CN=kube-apiserver-etcd-client" -out apiserver-etcd-client.csr
openssl x509 -req -in apiserver-etcd-client.csr -CA /etc/kubernetes/pki/etcd/ca.crt -CAkey /etc/kubernetes/pki/etcd/ca.key -CAcreateserial -out apiserver-etcd-client.crt -days 5000

## 为kube-apiserver生成相关的证书和配置文件
cat <<EOF > master_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = master
DNS.2 = kubernetes
DNS.3 = kubernetes.default
DNS.4 = kubernetes.default.svc
DNS.5 = kubernetes.default.svc.cluster.local
IP.1 = 10.96.0.1
IP.2 = 192.168.112.128
IP.3 = 192.168.112.136
EOF

openssl genrsa -out apiserver.key 2048
openssl req -new -key apiserver.key -subj "/CN=kube-apiserver" -config master_ssl.cnf -out apiserver.csr
openssl x509 -req -in apiserver.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 5000 -extensions v3_req -extfile master_ssl.cnf -out apiserver.crt

openssl genrsa -out front-proxy-client.key 2048
openssl req -new -key front-proxy-client.key -subj "/CN=front-proxy-client" -out front-proxy-client.csr
openssl x509 -req -in front-proxy-client.csr -CA front-proxy-ca.crt -CAkey front-proxy-ca.key -CAcreateserial -out front-proxy-client.crt -days 5000

openssl genrsa -out apiserver-kubelet-client.key 2048
openssl req -new -key apiserver-kubelet-client.key -subj "/O=system:masters/CN=kube-apiserver-kubelet-client" -out apiserver-kubelet-client.csr
openssl x509 -req -in apiserver-kubelet-client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out apiserver-kubelet-client.crt -days 5000


# 仅在master02上执行
## 生成etcd的相关证书
cd /etc/kubernetes/pki/etcd/

cat <<EOF > server_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = master02
DNS.2 = localhost
IP.1 = 192.168.112.129
IP.2 = 127.0.0.1
EOF
openssl genrsa -out server.key 2048
openssl req -new -key server.key -subj "/CN=master02" -config server_ssl.cnf -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 5000 -extensions v3_req -extfile server_ssl.cnf -out server.crt

cat <<EOF > peer_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = master02
DNS.2 = localhost
IP.1 = 192.168.112.129
IP.2 = 127.0.0.1
EOF
openssl genrsa -out peer.key 2048
openssl req -new -key peer.key -subj "/CN=master02" -config peer_ssl.cnf -out peer.csr
openssl x509 -req -in peer.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 5000 -extensions v3_req -extfile peer_ssl.cnf -out peer.crt

openssl genrsa -out healthcheck-client.key 2048
openssl req -new -key healthcheck-client.key -subj "/O=system:masters/CN=kube-etcd-healthcheck-client" -out healthcheck-client.csr
openssl x509 -req -in healthcheck-client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out healthcheck-client.crt -days 5000

cd /etc/kubernetes/pki/
openssl genrsa -out apiserver-etcd-client.key 2048
openssl req -new -key apiserver-etcd-client.key -subj "/O=system:masters/CN=kube-apiserver-etcd-client" -out apiserver-etcd-client.csr
openssl x509 -req -in apiserver-etcd-client.csr -CA /etc/kubernetes/pki/etcd/ca.crt -CAkey /etc/kubernetes/pki/etcd/ca.key -CAcreateserial -out apiserver-etcd-client.crt -days 5000

## 为kube-apiserver生成相关的证书和配置文件
cat <<EOF > master_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = master
DNS.2 = kubernetes
DNS.3 = kubernetes.default
DNS.4 = kubernetes.default.svc
DNS.5 = kubernetes.default.svc.cluster.local
IP.1 = 10.96.0.1
IP.2 = 192.168.112.129
IP.3 = 192.168.112.136
EOF

openssl genrsa -out apiserver.key 2048
openssl req -new -key apiserver.key -subj "/CN=kube-apiserver" -config master_ssl.cnf -out apiserver.csr
openssl x509 -req -in apiserver.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 5000 -extensions v3_req -extfile master_ssl.cnf -out apiserver.crt

openssl genrsa -out front-proxy-client.key 2048
openssl req -new -key front-proxy-client.key -subj "/CN=front-proxy-client" -out front-proxy-client.csr
openssl x509 -req -in front-proxy-client.csr -CA front-proxy-ca.crt -CAkey front-proxy-ca.key -CAcreateserial -out front-proxy-client.crt -days 5000

openssl genrsa -out apiserver-kubelet-client.key 2048
openssl req -new -key apiserver-kubelet-client.key -subj "/O=system:masters/CN=kube-apiserver-kubelet-client" -out apiserver-kubelet-client.csr
openssl x509 -req -in apiserver-kubelet-client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out apiserver-kubelet-client.crt -days 5000


# 仅在master03上执行
## 生成etcd的相关证书
cd /etc/kubernetes/pki/etcd/

cat <<EOF > server_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = master03
DNS.2 = localhost
IP.1 = 192.168.112.130
IP.2 = 127.0.0.1
EOF
openssl genrsa -out server.key 2048
openssl req -new -key server.key -subj "/CN=master03" -config server_ssl.cnf -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 5000 -extensions v3_req -extfile server_ssl.cnf -out server.crt

cat <<EOF > peer_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = master03
DNS.2 = localhost
IP.1 = 192.168.112.130
IP.2 = 127.0.0.1
EOF
openssl genrsa -out peer.key 2048
openssl req -new -key peer.key -subj "/CN=master03" -config peer_ssl.cnf -out peer.csr
openssl x509 -req -in peer.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 5000 -extensions v3_req -extfile peer_ssl.cnf -out peer.crt

openssl genrsa -out healthcheck-client.key 2048
openssl req -new -key healthcheck-client.key -subj "/O=system:masters/CN=kube-etcd-healthcheck-client" -out healthcheck-client.csr
openssl x509 -req -in healthcheck-client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out healthcheck-client.crt -days 5000

cd /etc/kubernetes/pki/
openssl genrsa -out apiserver-etcd-client.key 2048
openssl req -new -key apiserver-etcd-client.key -subj "/O=system:masters/CN=kube-apiserver-etcd-client" -out apiserver-etcd-client.csr
openssl x509 -req -in apiserver-etcd-client.csr -CA /etc/kubernetes/pki/etcd/ca.crt -CAkey /etc/kubernetes/pki/etcd/ca.key -CAcreateserial -out apiserver-etcd-client.crt -days 5000

## 为kube-apiserver生成相关的证书和配置文件
cat <<EOF > master_ssl.cnf
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation,digitalSignature,keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = master
DNS.2 = kubernetes
DNS.3 = kubernetes.default
DNS.4 = kubernetes.default.svc
DNS.5 = kubernetes.default.svc.cluster.local
IP.1 = 10.96.0.1
IP.2 = 192.168.112.130
IP.3 = 192.168.112.136
EOF

openssl genrsa -out apiserver.key 2048
openssl req -new -key apiserver.key -subj "/CN=kube-apiserver" -config master_ssl.cnf -out apiserver.csr
openssl x509 -req -in apiserver.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 5000 -extensions v3_req -extfile master_ssl.cnf -out apiserver.crt

openssl genrsa -out front-proxy-client.key 2048
openssl req -new -key front-proxy-client.key -subj "/CN=front-proxy-client" -out front-proxy-client.csr
openssl x509 -req -in front-proxy-client.csr -CA front-proxy-ca.crt -CAkey front-proxy-ca.key -CAcreateserial -out front-proxy-client.crt -days 5000

openssl genrsa -out apiserver-kubelet-client.key 2048
openssl req -new -key apiserver-kubelet-client.key -subj "/O=system:masters/CN=kube-apiserver-kubelet-client" -out apiserver-kubelet-client.csr
openssl x509 -req -in apiserver-kubelet-client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out apiserver-kubelet-client.crt -days 5000

# 分别在master01、master02和master03上执行
## 为kube-controller-manager生成相关的证书和配置文件
openssl genrsa -out controller-manager.key 2048
openssl req -new -key controller-manager.key -subj "/CN=system:kube-controller-manager" -out controller-manager.csr
openssl x509 -req -in controller-manager.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out controller-manager.crt -days 5000

export KUBECONFIG=/etc/kubernetes/controller-manager.conf
kubectl config set-cluster kubernetes --server=https://192.168.112.136:8443 --certificate-authority=/etc/kubernetes/pki/ca.crt --embed-certs=true
kubectl config set-credentials system:kube-controller-manager --client-certificate=/etc/kubernetes/pki/controller-manager.crt --client-key=/etc/kubernetes/pki/controller-manager.key --embed-certs=true
kubectl config set-context system:kube-controller-manager@kubernetes --cluster=kubernetes --user=system:kube-controller-manager
kubectl config use-context system:kube-controller-manager@kubernetes
unset KUBECONFIG

## 为kube-scheduler生成相关的证书和配置文件
openssl genrsa -out scheduler.key 2048
openssl req -new -key scheduler.key -subj "/CN=system:kube-scheduler" -out scheduler.csr
openssl x509 -req -in scheduler.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out scheduler.crt -days 5000

export KUBECONFIG=/etc/kubernetes/scheduler.conf
kubectl config set-cluster kubernetes --server=https://192.168.112.136:8443 --certificate-authority=/etc/kubernetes/pki/ca.crt --embed-certs=true
kubectl config set-credentials system:kube-scheduler --client-certificate=/etc/kubernetes/pki/scheduler.crt --client-key=/etc/kubernetes/pki/scheduler.key --embed-certs=true
kubectl config set-context system:kube-scheduler@kubernetes --cluster=kubernetes --user=system:kube-scheduler
kubectl config use-context system:kube-scheduler@kubernetes
unset KUBECONFIG
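
## 补充示例(非原文步骤,仅供参考):签发完成后,可以在各Master上用openssl确认证书确实由对应的CA签出,
## 并检查apiserver证书的SAN中包含本机IP和负载均衡地址192.168.112.136
openssl verify -CAfile /etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/apiserver.crt
openssl verify -CAfile /etc/kubernetes/pki/etcd/ca.crt /etc/kubernetes/pki/etcd/server.crt
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep -A 1 'Subject Alternative Name'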

7. 在所有Master上,分别配置和启动所有组件

# 仅在master01上执行
## 配置和启动etcd服务
mkdir -p /etc/etcd/
mkdir -p /var/lib/etcd/

cat <<EOF > /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target

[Service]
Type=simple
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.env
ExecStart=/usr/bin/etcd \$ETCD_ARGS

[Install]
WantedBy=multi-user.target
EOF

cat <<EOF > /etc/etcd/etcd.env
ETCD_ARGS="--advertise-client-urls=https://192.168.112.128:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://192.168.112.128:2380 --initial-cluster=master01=https://192.168.112.128:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://192.168.112.128:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://192.168.112.128:2380 --name=master01 --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt"
EOF

systemctl daemon-reload
systemctl enable etcd.service
systemctl start etcd.service
systemctl status etcd.service

## 为etcd集群添加两个节点
etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key member add master02 --peer-urls="https://192.168.112.129:2380"

etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key member add master03 --peer-urls="https://192.168.112.130:2380"

## 配置kube-apiserver服务
cat <<EOF > /usr/lib/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.env
ExecStart=/usr/bin/kube-apiserver \$KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

cat <<EOF > /etc/kubernetes/kube-apiserver.env
KUBE_API_ARGS="--advertise-address=192.168.112.128 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --insecure-port=0 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-cluster-ip-range=10.96.0.0/16 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key"
EOF

# 仅在master02上执行
## 配置和启动etcd服务
mkdir -p /etc/etcd/
mkdir -p /var/lib/etcd/

cat <<EOF > /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target

[Service]
Type=simple
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.env
ExecStart=/usr/bin/etcd \$ETCD_ARGS

[Install]
WantedBy=multi-user.target
EOF

cat <<EOF > /etc/etcd/etcd.env
ETCD_ARGS="--advertise-client-urls=https://192.168.112.129:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://192.168.112.129:2380 --initial-cluster=master01=https://192.168.112.128:2380,master02=https://192.168.112.129:2380 --initial-cluster-state=existing --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://192.168.112.129:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://192.168.112.129:2380 --name=master02 --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt"
EOF

systemctl daemon-reload
systemctl enable etcd.service
systemctl start etcd.service
systemctl status etcd.service

## 配置kube-apiserver服务
cat <<EOF > /usr/lib/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.env
ExecStart=/usr/bin/kube-apiserver \$KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

cat <<EOF > /etc/kubernetes/kube-apiserver.env
KUBE_API_ARGS="--advertise-address=192.168.112.129 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --insecure-port=0 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-cluster-ip-range=10.96.0.0/16 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key"
EOF


# 仅在master03上执行
## 配置和启动etcd服务
mkdir -p /etc/etcd/
mkdir -p /var/lib/etcd/

cat <<EOF > /usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target

[Service]
Type=simple
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.env
ExecStart=/usr/bin/etcd \$ETCD_ARGS

[Install]
WantedBy=multi-user.target
EOF

cat <<EOF > /etc/etcd/etcd.env
ETCD_ARGS="--advertise-client-urls=https://192.168.112.130:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://192.168.112.130:2380 --initial-cluster=master01=https://192.168.112.128:2380,master03=https://192.168.112.130:2380,master02=https://192.168.112.129:2380 --initial-cluster-state=existing --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://192.168.112.130:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://192.168.112.130:2380 --name=master03 --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt"
EOF

systemctl daemon-reload
systemctl enable etcd.service
systemctl start etcd.service
systemctl status etcd.service

## 配置kube-apiserver服务
cat <<EOF > /usr/lib/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-apiserver.env
ExecStart=/usr/bin/kube-apiserver \$KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

cat <<EOF > /etc/kubernetes/kube-apiserver.env
KUBE_API_ARGS="--advertise-address=192.168.112.130 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --insecure-port=0 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-cluster-ip-range=10.96.0.0/16 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key"
EOF


# 分别在master01、master02和master03上执行

## 启动kube-apiserver服务
systemctl daemon-reload
systemctl enable kube-apiserver.service
systemctl start kube-apiserver.service
systemctl status kube-apiserver.service

## 配置和启动kube-controller-manager服务
cat <<EOF > /usr/lib/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes
After=kube-apiserver.service
Requires=kube-apiserver.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-controller-manager.env
ExecStart=/usr/bin/kube-controller-manager \$KUBE_CONTROLLER_MANAGER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

cat <<EOF > /etc/kubernetes/kube-controller-manager.env
KUBE_CONTROLLER_MANAGER_ARGS="--allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf --bind-address=127.0.0.1 --client-ca-file=/etc/kubernetes/pki/ca.crt --cluster-cidr=10.211.0.0/16 --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --controllers=*,bootstrapsigner,tokencleaner --kubeconfig=/etc/kubernetes/controller-manager.conf --leader-elect=true --node-cidr-mask-size=24 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/16 --use-service-account-credentials=true"
EOF

systemctl daemon-reload
systemctl enable kube-controller-manager.service
systemctl start kube-controller-manager.service
systemctl status kube-controller-manager.service

## 配置和启动kube-scheduler服务
cat <<EOF > /usr/lib/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
After=kube-apiserver.service
Requires=kube-apiserver.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-scheduler.env
ExecStart=/usr/bin/kube-scheduler \$KUBE_SCHEDULER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

cat <<EOF > /etc/kubernetes/kube-scheduler.env
KUBE_SCHEDULER_ARGS=" --authentication-kubeconfig=/etc/kubernetes/scheduler.conf --authorization-kubeconfig=/etc/kubernetes/scheduler.conf --bind-address=127.0.0.1 --kubeconfig=/etc/kubernetes/scheduler.conf --leader-elect=true"
EOF

systemctl daemon-reload
systemctl enable kube-scheduler.service
systemctl start kube-scheduler.service
systemctl status kube-scheduler.service
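
## 补充示例(非原文步骤,仅供参考):三台Master的组件都启动后,可以先做一个简单的健康检查;
## 这里假设负载均衡地址192.168.112.136:8443已经可用
etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key endpoint health
curl --cacert /etc/kubernetes/pki/ca.crt https://192.168.112.136:8443/healthz
KUBECONFIG=/etc/kubernetes/admin.conf kubectl get componentstatuses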

8. 集群中配置启用Bootstrap Token

# 仅在master01上执行
## 注意:expiration必须晚于当前日期,否则token创建后会被kubernetes自动删除
export KUBECONFIG=/etc/kubernetes/admin.conf
cat <<EOF > /etc/kubernetes/bootstrap-token-abcdef.yaml
apiVersion: v1
kind: Secret
metadata:
  name: bootstrap-token-abcdef
  namespace: kube-system
type: bootstrap.kubernetes.io/token
stringData:
  auth-extra-groups: system:bootstrappers:default-node-token
  expiration: 2020-12-31T00:00:00+08:00
  token-id: abcdef
  token-secret: 0123456789abcdef
  usage-bootstrap-authentication: "true"
  usage-bootstrap-signing: "true"
EOF
kubectl create -f /etc/kubernetes/bootstrap-token-abcdef.yaml

cat <<EOF > /etc/kubernetes/create-csrs-for-bootstrapping.yaml
# enable bootstrapping nodes to create CSR
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: create-csrs-for-bootstrapping
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:node-bootstrapper
  apiGroup: rbac.authorization.k8s.io
EOF
kubectl create -f /etc/kubernetes/create-csrs-for-bootstrapping.yaml

cat <<EOF > /etc/kubernetes/auto-approve-csrs-for-group.yaml
# Approve all CSRs for the group "system:bootstrappers"
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-csrs-for-group
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
  apiGroup: rbac.authorization.k8s.io
EOF
kubectl create -f /etc/kubernetes/auto-approve-csrs-for-group.yaml

cat <<EOF > /etc/kubernetes/auto-approve-renewals-for-nodes.yaml
# Approve renewal CSRs for the group "system:nodes"
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-renewals-for-nodes
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
  apiGroup: rbac.authorization.k8s.io
EOF
kubectl create -f /etc/kubernetes/auto-approve-renewals-for-nodes.yaml
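
## 补充示例(非原文步骤,仅供参考):创建完成后,可以确认Secret和三个ClusterRoleBinding都已存在
kubectl get secret bootstrap-token-abcdef -n kube-system
kubectl get clusterrolebinding create-csrs-for-bootstrapping auto-approve-csrs-for-group auto-approve-renewals-for-nodes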

9. 分发bootstrap-kubelet.conf和proxy.conf到所有Master和Node上

# 在master01上执行
# 配置master01到node01和node02的ssh免密登录
ssh-keygen
ssh-copy-id -i .ssh/id_rsa.pub root@node01
ssh-copy-id -i .ssh/id_rsa.pub root@node02

## 验证master01到node01和node02的ssh免密登录
ssh node01
ssh node02

cat <<EOF > kubernetes-node-transfer.sh
USER=root
CONTROL_PLANE_IPS="192.168.112.129 192.168.112.130 192.168.112.131 192.168.112.132"
for host in \${CONTROL_PLANE_IPS}; do
scp /etc/kubernetes/pki/ca.crt \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/bootstrap-kubelet.conf \${USER}@\$host:/etc/kubernetes/
scp /etc/kubernetes/proxy.conf \${USER}@\$host:/etc/kubernetes/
done
EOF
chmod 0755 kubernetes-node-transfer.sh
./kubernetes-node-transfer.sh

10. 在所有Node上,分别配置和启动所有组件

# 在node01和node02上执行,如果master01、master02和master03也需要具备Node的功能,那么其上也需要执行
## 创建配置目录和工作目录
mkdir -p /etc/kubernetes/manifests
mkdir -p /etc/kubernetes/pki/
mkdir -p /var/lib/kubelet/
mkdir -p /var/lib/kube-proxy/

## 创建kubelet的配置文件
cat <<EOF > /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: true
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
EOF

## 配置和启动kubelet服务
cat <<EOF > /usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/kubernetes/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/etc/kubernetes/kubelet.env
ExecStart=/usr/bin/kubelet \$KUBELET_ARGS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

cat <<EOF > /etc/kubernetes/kubelet.env
KUBELET_ARGS="--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1"
EOF

systemctl daemon-reload
systemctl enable kubelet.service
systemctl start kubelet.service
systemctl status kubelet.service


## 创建kube-proxy的配置文件
cat <<EOF > /var/lib/kube-proxy/config.conf
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
clientConnection:
  acceptContentTypes: ""
  burst: 0
  contentType: ""
  kubeconfig: /etc/kubernetes/proxy.conf
  qps: 0
clusterCIDR: 10.211.0.0/16
configSyncPeriod: 0s
conntrack:
  maxPerCore: null
  min: null
  tcpCloseWaitTimeout: null
  tcpEstablishedTimeout: null
enableProfiling: false
healthzBindAddress: ""
hostnameOverride: ""
iptables:
  masqueradeAll: false
  masqueradeBit: null
  minSyncPeriod: 0s
  syncPeriod: 0s
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: ""
  strictARP: false
  syncPeriod: 0s
kind: KubeProxyConfiguration
metricsBindAddress: ""
mode: ""
nodePortAddresses: null
oomScoreAdj: null
portRange: ""
udpIdleTimeout: 0s
winkernel:
  enableDSR: false
  networkName: ""
  sourceVip: ""
EOF

## 配置和启动kube-proxy服务
cat <<EOF > /usr/lib/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Proxy Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target
Requires=network.service

[Service]
EnvironmentFile=-/etc/kubernetes/kube-proxy.env
ExecStart=/usr/bin/kube-proxy \$KUBE_PROXY_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

## 注意:--hostname-override 需与当前节点的主机名一致,例如在node02上应改为 --hostname-override=node02
cat <<EOF > /etc/kubernetes/kube-proxy.env
KUBE_PROXY_ARGS="--config=/var/lib/kube-proxy/config.conf --hostname-override=node01"
EOF

yum install -y conntrack

systemctl daemon-reload
systemctl enable kube-proxy.service
systemctl start kube-proxy.service
systemctl status kube-proxy.service
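
## 补充示例(非原文步骤,仅供参考):kubelet通过Bootstrap Token加入集群后,可以回到master01上确认CSR已被自动批准、节点已注册;
## 这里假设master01上已有前面生成的 /etc/kubernetes/admin.conf
KUBECONFIG=/etc/kubernetes/admin.conf kubectl get csr
KUBECONFIG=/etc/kubernetes/admin.conf kubectl get node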

11. 让master01、master02和master03节点具备Node节点的功能

## 如果master01、master02和master03节点需要具备Node节点的功能,需要参考10中的步骤,先分别在master01、master02和master03节点上完成kubelet和kube-proxy的安装,再分别给这三个节点打上下面的标签和污点
kubectl label node master01 node-role.kubernetes.io/master=
kubectl taint node master01 node-role.kubernetes.io/master=:NoSchedule

kubectl label node master02 node-role.kubernetes.io/master=
kubectl taint node master02 node-role.kubernetes.io/master=:NoSchedule

kubectl label node master03 node-role.kubernetes.io/master=
kubectl taint node master03 node-role.kubernetes.io/master=:NoSchedule
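
## 补充示例(非原文步骤,仅供参考):打完标签和污点后,可以确认三个节点的ROLES列显示为master,并检查污点是否生效
kubectl get node
kubectl describe node master01 | grep -i taint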

12. 配置和安装网络插件(calico和core-dns)

请参考单点二进制Kubernetes集群的配置和安装方法,这里不再赘述。

13. 确认集群各组件的健康状况

# 确认etcd的健康状况
etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key member list
------------------------------------------------------------------------------------------------------------------------------------------------
70b95c7dc2a3de1e, started, master03, https://192.168.112.130:2380, https://192.168.112.130:2379, false
71611ba7f1e4ff79, started, master02, https://192.168.112.129:2380, https://192.168.112.129:2379, false
ade36780a0899522, started, master01, https://192.168.112.128:2380, https://192.168.112.128:2379, false
------------------------------------------------------------------------------------------------------------------------------------------------

# 确认所有节点(Master和Node)的健康状况
kubectl get node -o wide
------------------------------------------------------------------------------------------------------------------------------------------------
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master01 Ready master 130m v1.17.0 192.168.112.128 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
master02 Ready master 130m v1.17.0 192.168.112.129 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
master03 Ready master 130m v1.17.0 192.168.112.130 <none> CentOS Linux 7 (Core) 3.10.0-957.21.3.el7.x86_64 docker://18.9.9
------------------------------------------------------------------------------------------------------------------------------------------------

# 确认calico和core dns的运行状况
kubectl get pod --all-namespaces -o wide
------------------------------------------------------------------------------------------------------------------------------------------------
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-648f4868b8-ldgrw 1/1 Running 1 121m 10.211.241.66 master01 <none> <none>
kube-system calico-node-2dtf2 1/1 Running 0 121m 192.168.112.130 master03 <none> <none>
kube-system calico-node-2z8nv 1/1 Running 1 121m 192.168.112.128 master01 <none> <none>
kube-system calico-node-tvs2j 1/1 Running 1 121m 192.168.112.129 master02 <none> <none>
kube-system coredns-7f9c544f75-s26rt 1/1 Running 1 115m 10.211.59.194 master02 <none> <none>
kube-system coredns-7f9c544f75-zfst9 1/1 Running 0 115m 10.211.235.1 master03 <none> <none>
------------------------------------------------------------------------------------------------------------------------------------------------

14. 为高可用集群添加两个Node

因高可用二进制Kubernetes集群添加Node的方法与单点二进制Kubernetes集群添加Node的方法完全一致,故请参考单点二进制Kubernetes集群的添加Node方法。

五、参考资料

1. 官方资料(官方最新版本v1.17)

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/ha-topology/
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/

2. 第三方资料(因Kubernetes 从v1.15开始到v1.17,kubeadm的安装方式和二进制安装方式基本相同,故v1.15的资料可以供v1.17参考)

https://www.cnblogs.com/lingfenglian/p/11753590.html
https://blog.51cto.com/fengwan/2426528?source=dra
https://my.oschina.net/beyondken/blog/1935402
https://www.cnblogs.com/shenlinken/p/9968274.html

通过 Ingress HTTPS 的方式暴露 Kubernetes Dashboard 服务

一、实验环境版本信息

1. 操作系统的版本信息

CentOS Linux release 7.6.1810 (Core)

2. 各组件的版本信息

kubernetes cluster v1.17.0,推荐使用kubeadm v1.17.0 进行试验

etcd v3.4.3
kube-apiserver v1.17.0
kube-controller-manager v1.17.0
kube-scheduler v1.17.0
kubectl v1.17.0

docker 18.09.9
kubelet v1.17.0
calico v3.11.1

kubernetes dashboard,使用容器化的方式部署

kubernetes dashboard v2.0.0-rc5

二、配置以 http 的方式访问 Kubernetes Dashboard

  1. 决定了 Kubernetes Dashboard 以 http 的形式对外提供服务的关键参数

    ## 以 http 对外提供服务时,Kubernetes Dashboard 默认是禁用登录模式的

    ## 特别注意:该参数启用后,Kubernetes Dashboard 会监听 8443 端口对外提供 https 服务,并且不会监听 9090 端口提供 http 服务
    --auto-generate-certificates

    ## 设置 http 监听端口,默认值为 9090。当 --auto-generate-certificates 开启时,经测试该参数无效
    --insecure-port

    ## 设置 http 监听地址,默认值为 127.0.0.1
    --insecure-bind-address

    ## 设置以 http 提供服务时,Kubernetes Dashboard 是否启用登录模式,默认为 false
    --enable-insecure-login
  2. 决定了 Kubernetes Dashboard 能够启动成功的关键参数

    ## 证书相关的secret对象放在哪个namespace下,通常情况下与 Kubernetes Dashboard 的 pod 所在的 namespace 相同,默认值为 kube-system
    --namespace
  3. 修改相关的 Kubernetes YAML 部署文件,关闭 https 服务,然后开启 http 服务

    # git clone https://github.com/kubernetes/dashboard.git
    # cd dashboard/
    # git checkout -b v2.0.0-rc5.tag v2.0.0-rc5
    # git diff aio/deploy/recommended.yaml
    diff --git a/aio/deploy/recommended.yaml b/aio/deploy/recommended.yaml
    index 742f616..b8c48bd 100644
    --- a/aio/deploy/recommended.yaml
    +++ b/aio/deploy/recommended.yaml
    @@ -38,8 +38,12 @@ metadata:
    namespace: kubernetes-dashboard
    spec:
    ports:
    - - port: 443
    + - name: https
    + port: 443
    targetPort: 8443
    + - name: http
    + port: 80
    + targetPort: 9090
    selector:
    k8s-app: kubernetes-dashboard

    @@ -188,13 +192,21 @@ spec:
    containers:
    - name: kubernetes-dashboard
    image: kubernetesui/dashboard:v2.0.0-rc5
    - imagePullPolicy: Always
    + imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 8443
    protocol: TCP
    + name: https
    + - containerPort: 9090
    + protocol: TCP
    + name: http
    args:
    - - --auto-generate-certificates
    + # - --auto-generate-certificates
    - --namespace=kubernetes-dashboard
    + # - --insecure-port=9090
    + # - --port=8443
    + # - --insecure-bind-address=0.0.0.0
    + - --enable-insecure-login
    # Uncomment the following line to manually specify Kubernetes API server Host
    # If not specified, Dashboard will attempt to auto discover the API server and connect
    # to it. Uncomment only if the default does not work.
    @@ -207,9 +219,12 @@ spec:
    name: tmp-volume
    livenessProbe:
    httpGet:
    - scheme: HTTPS
    + # scheme: HTTPS
    + # path: /
    + # port: 8443
    + scheme: HTTP
    path: /
    - port: 8443
    + port: 9090
    initialDelaySeconds: 30
    timeoutSeconds: 30
    securityContext:
    @@ -272,6 +287,7 @@ spec:
    containers:
    - name: dashboard-metrics-scraper
    image: kubernetesui/metrics-scraper:v1.0.3
    + imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 8000
    protocol: TCP
    ## 按照上述git对比出来的变化进行修改

    # kubectl create -f aio/deploy/recommended.yaml

三、安装 Nginx Ingress Controller

1. 安装 Helm 3 包管理工具

# curl -o helm-v3.1.0-linux-amd64.tar.gz https://get.helm.sh/helm-v3.1.0-linux-amd64.tar.gz
# tar -zxvf helm-v3.1.0-linux-amd64.tar.gz
# cd linux-amd64/
# cp helm /usr/local/bin/

# helm version
version.BuildInfo{Version:"v3.1.0", GitCommit:"b29d20baf09943e134c2fa5e1e1cab3bf93315fa", GitTreeState:"clean", GoVersion:"go1.13.7"}

2. 使用 Helm 3 安装 Nginx Ingress Controller

## kubernetes node 上拉取镜像
# docker pull quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.29.0
# docker pull mirrorgooglecontainers/defaultbackend-amd64:1.5

# git clone https://github.com/helm/charts.git
# cd charts/
# git checkout -b 43edde894f4b141319e46e4311ddfa576a6973f6.tag 43edde894f4b141319e46e4311ddfa576a6973f6
# git diff stable/nginx-ingress/values.yaml
diff --git a/stable/nginx-ingress/values.yaml b/stable/nginx-ingress/values.yaml
index 270a1d3..107d259 100644
--- a/stable/nginx-ingress/values.yaml
+++ b/stable/nginx-ingress/values.yaml
@@ -28,7 +28,7 @@ controller:
# Required for use with CNI based kubernetes installations (such as ones set up by kubeadm),
# since CNI and hostport don't mix yet. Can be deprecated once https://github.com/kubernetes/kubernetes/issues/23920
# is merged
- hostNetwork: false
+ hostNetwork: true

# Optionally customize the pod dnsConfig.
dnsConfig: {}
@@ -119,7 +119,7 @@ controller:

## DaemonSet or Deployment
##
- kind: Deployment
+ kind: DaemonSet

## Annotations to be added to the controller deployment
##
@@ -428,7 +428,7 @@ defaultBackend:

name: default-backend
image:
- repository: k8s.gcr.io/defaultbackend-amd64
+ repository: mirrorgooglecontainers/defaultbackend-amd64
tag: "1.5"
pullPolicy: IfNotPresent
# nobody user -> uid 65534
## 按照上述git对比出来的变化进行修改

# helm install nginx-ingress stable/nginx-ingress --set rbac.create=true --namespace kube-system
# helm list -n kube-system
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
nginx-ingress kube-system 1 2020-02-16 12:20:48.748124293 +0800 CST deployed nginx-ingress-1.30.3 0.28.0

# kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
。。。
kube-system nginx-ingress-controller-69878bd7c7-wjjmp 1/1 Running 0 20h 192.168.112.129 node01 <none> <none>
kube-system nginx-ingress-default-backend-7cbf68bcd8-6csw4 1/1 Running 0 20h 10.211.140.76 node02 <none> <none>
。。。

四、配置以 Ingress https 的方式暴露 Kubernetes Dashboard 服务

  1. 准备 https 证书,以 secret 的形式提交到 Kubernetes Cluster 上

    mkdir ingress/
    openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=dashboard.kubernetes.singhwang.com"

    ## 方法一:利用 kubectl 的功能,直接把证书创建到 kubernetes-dashboard-secret 这个 secret 对象中
    kubectl create secret tls kubernetes-dashboard-secret --key tls.key --cert tls.crt -n kubernetes-dashboard

    ## 方法二:证书内容做 base64 编码后,写入到 kubernetes-dashboard-secret 这个 secret 对象 data 部分的 tls.crt 和 tls.key 中
    cat <<EOF > ingress/01-secret.yaml
    apiVersion: v1
    kind: Secret
    metadata:
    name: kubernetes-dashboard-secret
    namespace: kubernetes-dashboard
    type: kubernetes.io/tls
    data:
    tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURMVENDQWhXZ0F3SUJBZ0lKQUtSMDREdGNBN2MxTUEwR0NTcUdTSWIzRFFFQkN3VUFNQzB4S3pBcEJnTlYKQkFNTUltUmhjMmhpYjJGeVpDNXJkV0psY201bGRHVnpMbk5wYm1kb2QyRnVaeTVqYjIwd0hoY05NakF3TWpFMgpNVEV3TXpVMFdoY05NakV3TWpFMU1URXdNelUwV2pBdE1Tc3dLUVlEVlFRRERDSmtZWE5vWW05aGNtUXVhM1ZpClpYSnVaWFJsY3k1emFXNW5hSGRoYm1jdVkyOXRNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUIKQ2dLQ0FRRUFwZDZLanM0MnZhOElhWGxGM1hxeUYyM3gvTlJKMGtmM1B2V05hR3RsYzE2UW10Ry9Rcnl6SFVGOAoxOVJDQm5Td01tUVlMUE1xc0crYS9HOGExY0NoQm5mM3pmU09qSmpab0ErVmdqWnYrSE80YmxMaVZ0R25aYkJ5CldaNFZDdlJld0tDakdzQzhmSk5wZG0rYkxLdWgzNUZBeDNlekNnaXNOWXM5VEROWFpWdjArRk15dUNlczBQWUcKYWU1Zi96emtlTUxJT3RPUkc4NllIL2NyQVhUdUxQZko3RXlwRFpMMUpLY1pBUUw4S3RHQmk5aFZvN20xdHJ2NAo2dktudkZnL2pMREszNmI4MTNIekJaa21sUzUwV09XemFSbkxaaGZ6TGRpRkVIbHYwd05jd0dGbVlkb21VaFdnCjJwSEVSY0NrUG5uODF5YXZ2ZktCUElXNVFBZjd0d0lEQVFBQm8xQXdUakFkQmdOVkhRNEVGZ1FVM2tnTnFUMWgKM3ZpQkt0Z0U0VWRKS1J2Wmh3a3dId1lEVlIwakJCZ3dGb0FVM2tnTnFUMWgzdmlCS3RnRTRVZEpLUnZaaHdrdwpEQVlEVlIwVEJBVXdBd0VCL3pBTkJna3Foa2lHOXcwQkFRc0ZBQU9DQVFFQXBLNXR0S0Q5amhXakxHRzNkZjRoClQ4RDhNeHV6SXhNWkRSMFJYeTBLWEhsNFU3VFZVbjVLTEJqdDVUUUQ4ZU00ZEk0N2RQaEUzOVRaV3ZGVmFRbjcKckpiUldXbFV5UmdlWmM2NFFWMzZiOVNLSnErSHRManFTQUNBQmVrT2lNT3pOdTgrWGFBUDRpaERtOG5kU2thQwpjd3U3QkltbHMxS1hHU2VOenNpWU8rOTJ5U3MzTHlYMnQ4aHB6eXFvUzlOTkF3Vnd6WVlhWUo3TmtEWWtaOUZZCnhPUmpuUGlhMEQyZmVzNFhNQkdYK2FScEV4K0MvWUpKTWJxSzJJN3MzK21jRitOaTE2KzJmd0pJTHZuR0hWRkcKaHVKa3RSMGQzK1ZBMzBlTDJMY3l6NW4rTERQeFk2Z1MrVTJiYzJJdFAyZjZRUlRJTTVCUmFjUEJDNlM3UXFKaAoyUT09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    tls.key: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRQ2wzb3FPemphOXJ3aHAKZVVYZGVySVhiZkg4MUVuU1IvYys5WTFvYTJWelhwQ2EwYjlDdkxNZFFYelgxRUlHZExBeVpCZ3M4eXF3YjVyOApieHJWd0tFR2QvZk45STZNbU5tZ0Q1V0NObS80YzdodVV1SlcwYWRsc0hKWm5oVUs5RjdBb0tNYXdMeDhrMmwyCmI1c3NxNkhma1VESGQ3TUtDS3cxaXoxTU0xZGxXL1Q0VXpLNEo2elE5Z1pwN2wvL1BPUjR3c2c2MDVFYnpwZ2YKOXlzQmRPNHM5OG5zVEtrTmt2VWtweGtCQXZ3cTBZR0wyRldqdWJXMnUvanE4cWU4V0QrTXNNcmZwdnpYY2ZNRgptU2FWTG5SWTViTnBHY3RtRi9NdDJJVVFlVy9UQTF6QVlXWmgyaVpTRmFEYWtjUkZ3S1ErZWZ6WEpxKzk4b0U4CmhibEFCL3UzQWdNQkFBRUNnZ0VBYlRNSHNXQ2R0VjlvZ0ZmdzRSRUg4bGpWdVlmaFdlazdJMTN4ek03M3FXNlcKY1BhcG5qd3hCNCszcXpmNGg5dUdySVl0VEZxQ3ZrbWJsWmxuNTFXOExWQUorck9JclpOcm91N2ZsU3hWcHhJNApWNW1GblhiRmFETXo5VUFYeG5CL2VQM0lvN0pENVJmL2xKT0JhM1ZMU3E2TUlVWHl2eVphaVoyenExa1pyb1krCmFYRWxrSjE4YmFWTHNXVnA2amgxVmxwVXhQTEZJQmloRzN2OWdVL09qTTF4YzVjaVJBZzBDd2hSTlV0SVZrU2EKeTdxUzZRNDZyaUIrSnViaDhFOW54VWJhMExoeGo3MUZZdklpemNGMklwTDNoN3ZzUWtWclk3Y3NRRGN5L2psSApjNllSU2ljRVBSdXEwM3Zwa2YxaUFmRDdzcEJ4RmJsLzZidmsxN0ZxNFFLQmdRRFN2dTZ2OU5MY3dPSnpZeDNFCnVqS0UvTVM0Z1pwWGFnK2VqTE1qMlBNR0RnSG9sa1Zua0l5aXhKa1Z1amhJMWVsRDVRc1hiS1VyMld1aUR4NGYKRFJxSGVqUUNUOHV2R1dOamtvOTRqZkZSUzBLa2pPa0NRajZQZVMwcGhneDNGdWVLdTFTT0g1eENyMGdla1dvMApxVUlWZ3BwZlVpdHVDLzd5Q2F6alZ4cTYzd0tCZ1FESmZLcmZ6NlNHNWVGU3BVL0dKV3pxbytEWjBVU25EWFFFCmpSL0gvdnBtNEpQTThXVTNrcUdTdTVlYkI2Nk9ENkxCVEhEUFE0dWNSSUxzNFlVa0RFZWdrbmRHMlllSSt0VkYKN3hoUUdxYXhGZDJxdjZSL2IxWEhNSFdUOVdVdURpUTJTYWVDY3c4NklhaC9ROGxTUmVLNWFSWHJYaHI2ZFZ0TQowaWlOZEUreUtRS0JnUUNTc3Y0TDFleUNabkk3eUI4TXRtQThXb2ZGdDlIc1Q1UVgxZkZOWHRPc3YwdHMwRTMzCnpaTllLbW8xeWE4c1pGdEFPOHdBdmt3cnZlbENvaXRoaWdtUmpPdHZRSVNVbXFPb3lIaStmbkFoR3JhRlBPRm0KQlI3dldIYXJsUGhRWGMxSHNTY20xN0k2YVRGV3RmcXNOYlllcXc4eWsweFFDbUdwc2pwNjlrTlJHUUtCZ0ZWUgpVKzNYdUJ4akpTbGcxTW5idVNZV1pMVDNOekhoc1hubjVFaEV3UFZsTFZDLyt4TXdKUGpFTktzeDhvazNON3pRClNJaUxXb2UrUHc1ZFpJcGlKTVpxbnRWQ2NYRGdmZ1RSL0tLVzFuVHdCR0EwTEV6RjhUV2FZSDlaanhHVWJXTUwKaDBIbXhORGh4YjYyRG42bkZ4MVowUzFNT1BKTFZYRFBJTnJkSUk0WkFvR0FZeDMyWDdlUWhTY2pLNVd1cnF6cQo5THR0ZTZNbDBHamV0MDNLUk5HdStBbVFhVkF4eEw1SGxHaEhpem5TK01ZZnlIMzVzVFZNcnVTNXUyVDVHVEZNCnkrdGtvdVRPRjY1T0YxRFZmNGtWNXM5aWJ0VVk5Z0lqYitFTHAvMzhZVmdJR2tKOGFuYXpOc3NSc3FaM1Z4MDAKbktjT1Y4MlhTRzY5T1YrSUt4YmhhT009Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K

    EOF
  2. 创建 ingress 资源对象到 Kubernetes Cluster 上

    cat <<EOF > ingress/02-ingress.yaml
    apiVersion: networking.k8s.io/v1beta1
    kind: Ingress
    metadata:
      name: kubernetes-dashboard
      namespace: kubernetes-dashboard
      annotations:
        nginx.ingress.kubernetes.io/ingress.class: nginx
        nginx.ingress.kubernetes.io/secure-backends: "true"
        nginx.ingress.kubernetes.io/ssl-passthrough: "true"
    spec:
      tls:
      - hosts:
        - dashboard.kubernetes.singhwang.com
        secretName: kubernetes-dashboard-secret
      rules:
      - host: dashboard.kubernetes.singhwang.com
        http:
          paths:
          - path: /
            backend:
              serviceName: kubernetes-dashboard
              servicePort: 80
    EOF

    kubectl create -f ingress/
  3. 获取 ingress 资源对象中的 HOSTS 和 ADDRESS,在访问端做好 hosts 映射;条件允许的话,也可以配置为网络中的 DNS 记录

    # kubectl get ingress -n kubernetes-dashboard -o wide
    NAME HOSTS ADDRESS PORTS AGE
    kubernetes-dashboard dashboard.kubernetes.singhwang.com 192.168.112.129,192.168.112.130 80, 443 27s

    ## 访问端或者访问端的DNS中配置域名 dashboard.kubernetes.singhwang.com 解析为地址 192.168.112.129 或者 192.168.112.130
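
    ## 补充示例(非原文步骤,仅供参考):在访问端(假设是一台Linux测试机)临时添加hosts解析并用curl做连通性验证,-k 用于跳过自签名证书校验
    # echo "192.168.112.129 dashboard.kubernetes.singhwang.com" >> /etc/hosts
    # curl -k -I https://dashboard.kubernetes.singhwang.com/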
  4. 接上一步,访问 Kubernetes Dashboard 服务 https://dashboard.kubernetes.singhwang.com
    login_01

五、使用说明

  1. 创建访问用户并授权

    mkdir -p access/
    cat <<EOF > access/01-serviceaccount.yaml
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: admin-user
      namespace: kubernetes-dashboard
    EOF

    cat <<EOF > access/02-clusterrolebinding.yaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: admin-user
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: cluster-admin
    subjects:
    - kind: ServiceAccount
      name: admin-user
      namespace: kubernetes-dashboard
    EOF

    kubectl create -f access/
  2. 获取用户的 Token, 并在登录页面上输入, 然后登录 Kubernetes Dashboard

    ## 获取上一步授权的用户 Token,用于登录 Kubernetes Dashboard 
    # kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')
    Name: admin-user-token-5k9vs
    Namespace: kubernetes-dashboard
    Labels: <none>
    Annotations: kubernetes.io/service-account.name: admin-user
    kubernetes.io/service-account.uid: 4a2e4bbf-2bb6-4e65-ab49-94913da8d04c

    Type: kubernetes.io/service-account-token

    Data
    ====
    ca.crt: 1025 bytes
    namespace: 20 bytes
    token: eyJhbGciOiJSUzI1NiIsImtpZCI6IkpWcTFvc0Rza0xYZVVhVnlkRkhUX2VDM1RBR1hUNXpKVkdna3kyRTAyVlEifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLTVrOXZzIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiI0YTJlNGJiZi0yYmI2LTRlNjUtYWI0OS05NDkxM2RhOGQwNGMiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZXJuZXRlcy1kYXNoYm9hcmQ6YWRtaW4tdXNlciJ9.r1tufqV-G_AF1D-WXhP0i_ggM4rBHuzcNryPIyaIdJOYQEfoQ_G7rPPb2qEux6XrmObFgbNZoXvXUWWn8Q_OulalGNmtAO17xgCvTPjs4A_jvQGv-kiVM_OjBAUL5VGn3leT4KkK60U2q6fGUuHVAu6Fzanq178r8F17uyY_6pAz5xkHx_CZQH4aVpOWOOgcN0u8IyjxSgder_KGP7tZqbrjv29hff6xnEWU_x3qxvxWxWtOOj8egjb_NpJQge5Lh_NQvi78djq8SaBn7otkapg8Ob6FuOP48q9N01ALoJoyT2yPVbml7JLoi1qizd5PAQ40ow18cF_soxTdh7iTRw

    ## 登录页面上输入 Token 后,点击对应按钮即可实现登录
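
    ## 补充示例(非原文步骤,仅供参考):也可以用jsonpath直接取出Token,免去从describe输出中人工复制(假设admin-user的第一个secret即其token)
    # kubectl -n kubernetes-dashboard get secret $(kubectl -n kubernetes-dashboard get sa admin-user -o jsonpath='{.secrets[0].name}') -o jsonpath='{.data.token}' | base64 -d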

login_02
login_03

六、参考资料

1. Kubernetes Dashboard 的官方资料

https://github.com/kubernetes/dashboard/blob/v2.0.0-rc5/src/app/backend/dashboard.go
https://github.com/kubernetes/dashboard/blob/v2.0.0-rc5/docs/user/accessing-dashboard/1.7.x-and-above.md
https://github.com/kubernetes/dashboard/blob/v2.0.0-rc5/docs/user/access-control/creating-sample-user.md
https://github.com/kubernetes/dashboard/blob/v2.0.0-rc5/docs/user/integrations.md
https://github.com/kubernetes/dashboard/blob/v2.0.0-rc5/docs/user/certificate-management.md

2. Nginx Ingress Controller

https://kubernetes.github.io/ingress-nginx/deploy/#using-helm
https://www.jianshu.com/p/2da985a32db8

3. Kubernetes Metric Server

https://www.cnblogs.com/ding2016/p/10786252.html
https://github.com/singhwang/k8s-prom-hpa

4. 关于 Chrome 无法访问 Kubernetes Dashboard 的问题解决

http://team.jiunile.com/blog/2018/12/k8s-dashboard-chrome-err.html
https://superuser.com/questions/27268/how-do-i-disable-the-warning-chrome-gives-if-a-security-certificate-is-not-trust
https://www.jianshu.com/p/a8cc2c04ee7c
https://blog.gxxsite.com/wei-mac-osxde-cheng-xu-tian-jia-yong-jiu-qi-dong-can-shu/

使用Kubeadm安装高可用Kubernetes v1.17.0集群(Stacked Control Plane Nodes For Baremetal)

一、高可用部署的实现方式介绍

官方文档介绍了使用Kubeadm设置高可用性Kubernetes集群的两种不同方法:

1. 堆叠master的方式(with stacked masters)

这种方法需要较少的基础设施。控制平面节点和etcd成员位于同一位置。

2. 使用外部etcd集群的方式(with an external etcd cluster)

这种方法需要更多的基础设施。控制平面节点和etcd成员是分开的。

这里重点介绍第一种方式,即堆叠master的方式。官方文档链接详见参考资料。

二、实验环境版本信息

1. 高可用工具的版本(这里记录的是docker镜像的版本)

haproxy:1.7-alpine
osixia/keepalived:1.4.5

2. Kubernetes各个组件的版本

etcd v3.4.3
kube-apiserver v1.17.0
kube-controller-manager v1.17.0
kube-scheduler v1.17.0
kubectl v1.17.0
coredns 1.6.5

docker 18.09.9
kube-proxy v1.17.0
kubelet v1.17.0
calico v3.11.1 (calico/node:v3.11.1 calico/pod2daemon-flexvol:v3.11.1 calico/cni:v3.11.1 calico/kube-controllers:v3.11.1)

三、部署架构介绍

stacked_etcd_topology

1. Kubernetes Master(Control Plane)

192.168.112.128 master01 -> docker kubelet keepalived haproxy etcd kube-apiserver kube-controller-manager kube-scheduler kube-proxy calico
192.168.112.129 master02 -> docker kubelet keepalived haproxy etcd kube-apiserver kube-controller-manager kube-scheduler kube-proxy calico
192.168.112.130 master03 -> docker kubelet keepalived haproxy etcd kube-apiserver kube-controller-manager kube-scheduler kube-proxy calico

2. Kubernetes Node

192.168.112.131 node01 -> docker kubelet kube-proxy calico(calico-node)
192.168.112.132 node02 -> docker kubelet kube-proxy calico(calico-node)

四、实现过程记录

1. 在Kubernetes Control Plane上的所有Node上部署HAProxy做为负载均衡器(由Kubelet管理以静态Pod的方式实现)

## 在控制平面的所有Node上执行
mkdir -p /etc/haproxy/
cat <<EOF > /etc/haproxy/haproxy.cfg
global
    log 127.0.0.1 local0 err
    maxconn 50000
    uid 99
    gid 99
    #daemon
    nbproc 1
    pidfile haproxy.pid

defaults
    mode tcp
    log 127.0.0.1 local0 err
    maxconn 50000
    retries 3
    timeout connect 10s
    timeout client 10m
    timeout server 10m

listen stats
    mode http
    bind 0.0.0.0:9090
    log 127.0.0.1 local0 err
    stats refresh 30s
    stats uri /haproxy-status
    stats realm Haproxy\ Statistics
    stats auth admin:12345678
    stats hide-version
    stats admin if TRUE

frontend kube-apiserver-https
    mode tcp
    bind :8443
    default_backend kube-apiserver-backend

backend kube-apiserver-backend
    mode tcp
    balance roundrobin
    server server01 192.168.112.128:6443 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5
    server server02 192.168.112.129:6443 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5
    server server03 192.168.112.130:6443 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5
EOF

## 仅在master01上执行
mkdir -p /etc/kubernetes/manifests/

## 在master01上需要在kubeadm init前先执行,在master02和master03上需先做完kubeadm join后再执行
cat <<EOF > /etc/kubernetes/manifests/haproxy.yaml
kind: Pod
apiVersion: v1
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  labels:
    component: haproxy
    tier: control-plane
  name: kube-haproxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-haproxy
    image: haproxy:1.7-alpine
    resources:
      requests:
        cpu: 100m
    volumeMounts:
    - name: haproxy-cfg
      readOnly: true
      mountPath: /usr/local/etc/haproxy/haproxy.cfg
  volumes:
  - name: haproxy-cfg
    hostPath:
      path: /etc/haproxy/haproxy.cfg
EOF
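
## Optional sanity check (a sketch; assumes the static Pod is already running): the stats listener
## configured above answers on port 9090 with the credentials and URI from haproxy.cfg
curl -u admin:12345678 http://127.0.0.1:9090/haproxy-status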

2. Deploy Keepalived on all control-plane nodes (implemented as a static Pod managed by the kubelet)

## Run on master01 only
mkdir -p /etc/kubernetes/manifests/

## On master01, run this before kubeadm init; on master02 and master03, run it only after kubeadm join has completed
cat <<EOF > /etc/kubernetes/manifests/keepalived.yaml
kind: Pod
apiVersion: v1
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  labels:
    component: keepalived
    tier: control-plane
  name: kube-keepalived
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-keepalived
    image: osixia/keepalived:1.4.5
    env:
    - name: KEEPALIVED_VIRTUAL_IPS
      value: 192.168.112.136
    - name: KEEPALIVED_INTERFACE
      value: ens33
    - name: KEEPALIVED_UNICAST_PEERS
      value: "#PYTHON2BASH:['192.168.112.128', '192.168.112.129', '192.168.112.130']"
    - name: KEEPALIVED_PASSWORD
      value: docker
    - name: KEEPALIVED_PRIORITY
      value: "200"
    - name: KEEPALIVED_ROUTER_ID
      value: "51"
    resources:
      requests:
        cpu: 100m
    securityContext:
      privileged: true
      capabilities:
        add:
        - NET_ADMIN
EOF
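
## Optional sanity check (a sketch; assumes keepalived is already running): the VIP should be bound
## to ens33 on whichever master currently holds it, and should answer pings
ip addr show ens33 | grep 192.168.112.136
ping -c 3 192.168.112.136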

3. On the first control-plane node (master01):

(1) Generate the kubeadm configuration file and pull the required Docker images

## Generate the configuration file used by kubeadm init
mkdir -p kubeadm/config/
cat <<EOF > kubeadm/config/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.112.128
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
controlPlaneEndpoint: 192.168.112.136:8443
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.17.0
networking:
  podSubnet: 10.211.0.0/16
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/16
scheduler: {}
EOF

## Pull the Docker images required by kubeadm init
kubeadm config images pull --config kubeadm/config/kubeadm-config.yaml
------------------------------------------------------------------------------------------------------------------------------------------------
W0315 10:52:16.188454 5239 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0315 10:52:16.188503 5239 validation.go:28] Cannot validate kubelet config - no validator is available
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.17.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.17.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.17.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.17.0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
[config/images] Pulled registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.5
------------------------------------------------------------------------------------------------------------------------------------------------
。。。。。。
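
To see which images the configuration resolves to without actually pulling them, kubeadm can also just print the list:

kubeadm config images list --config kubeadm/config/kubeadm-config.yaml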

(2) Initialize the cluster (important: use either method A or method B, never both)

A. Initialize with automatic distribution of the root certificates

# Run kubeadm init (root certificates distributed automatically)
kubeadm init --config kubeadm/config/kubeadm-config.yaml --upload-certs
------------------------------------------------------------------------------------------------------------------------------------------------
W0315 10:53:05.509978 5340 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0315 10:53:05.510016 5340 validation.go:28] Cannot validate kubelet config - no validator is available
[init] Using Kubernetes version: v1.17.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.112.128 192.168.112.136]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master01 localhost] and IPs [192.168.112.128 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master01 localhost] and IPs [192.168.112.128 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0315 10:53:08.283605 5340 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0315 10:53:08.284727 5340 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 42.019337 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.17" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
16f06d3321fce089cad4b229da9b5d3ef94c08a246943e0f375b977f18bbab8e
[mark-control-plane] Marking the node master01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:33a6370b4bb4a9385c1d878e9a7a085ad969d521e4b309b01be797c0d7867d69 \
--control-plane --certificate-key 16f06d3321fce089cad4b229da9b5d3ef94c08a246943e0f375b977f18bbab8e

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:33a6370b4bb4a9385c1d878e9a7a085ad969d521e4b309b01be797c0d7867d69
------------------------------------------------------------------------------------------------------------------------------------------------

# Save commands like the following from the output above; they are needed when adding nodes later
。。。。。。
You can now join any number of the control-plane node running the following command on each as root:

kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:33a6370b4bb4a9385c1d878e9a7a085ad969d521e4b309b01be797c0d7867d69 \
--control-plane --certificate-key 16f06d3321fce089cad4b229da9b5d3ef94c08a246943e0f375b977f18bbab8e

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:33a6370b4bb4a9385c1d878e9a7a085ad969d521e4b309b01be797c0d7867d69

B. Initialize with manual distribution of the root certificates

# Run kubeadm init (root certificates distributed manually)
kubeadm init --config kubeadm/config/kubeadm-config.yaml
------------------------------------------------------------------------------------------------------------------------------------------------
W0315 11:37:50.200933 2834 validation.go:28] Cannot validate kubelet config - no validator is available
W0315 11:37:50.201021 2834 validation.go:28] Cannot validate kube-proxy config - no validator is available
[init] Using Kubernetes version: v1.17.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.112.128 192.168.112.136]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master01 localhost] and IPs [192.168.112.128 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master01 localhost] and IPs [192.168.112.128 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0315 11:37:52.884008 2834 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0315 11:37:52.885218 2834 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 36.521431 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.17" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:05945a0dc7d9c5e45e196d8582de19a3df559d1f9f4e4cb52c77d3051db923b4 \
--control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:05945a0dc7d9c5e45e196d8582de19a3df559d1f9f4e4cb52c77d3051db923b4
------------------------------------------------------------------------------------------------------------------------------------------------

# Save commands like the following from the output above; they are needed when adding nodes later (additional master nodes must run the first command only after the root certificates have been distributed to them manually)
# Note: master nodes join with the first command; worker nodes join with the second command
。。。。。。
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:05945a0dc7d9c5e45e196d8582de19a3df559d1f9f4e4cb52c77d3051db923b4 \
--control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:05945a0dc7d9c5e45e196d8582de19a3df559d1f9f4e4cb52c77d3051db923b4


## Configure passwordless SSH from master01 to master02 and master03
ssh-keygen
ssh-copy-id -i .ssh/id_rsa.pub root@master02
ssh-copy-id -i .ssh/id_rsa.pub root@master03

## Verify passwordless SSH from master01 to master02 and master03
ssh master02
ssh master03

## Distribute the PKI certificates and the admin.conf file
cat <<EOF > kubeadm/config/scp-config.sh
USER=root
CONTROL_PLANE_IPS="192.168.112.129 192.168.112.130"
for host in \${CONTROL_PLANE_IPS}; do
ssh \${USER}@\$host 'mkdir -p /etc/kubernetes/pki/etcd/'
scp /etc/kubernetes/pki/ca.crt \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key \${USER}@\$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt \${USER}@\$host:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key \${USER}@\$host:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/admin.conf \${USER}@\$host:/etc/kubernetes/
done
EOF
chmod 0755 kubeadm/config/scp-config.sh
./kubeadm/config/scp-config.sh
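
## Optional sanity check (a sketch): confirm the certificates and admin.conf actually landed on the other masters
ssh root@master02 'ls /etc/kubernetes/pki/ /etc/kubernetes/pki/etcd/ /etc/kubernetes/admin.conf'
ssh root@master03 'ls /etc/kubernetes/pki/ /etc/kubernetes/pki/etcd/ /etc/kubernetes/admin.conf'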

4. On the second control-plane node (master02) (important: use either method A or method B, never both):

# A. Join using the automatically distributed root certificates
kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:33a6370b4bb4a9385c1d878e9a7a085ad969d521e4b309b01be797c0d7867d69 \
--control-plane --certificate-key 16f06d3321fce089cad4b229da9b5d3ef94c08a246943e0f375b977f18bbab8e
------------------------------------------------------------------------------------------------------------------------------------------------
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master02 localhost] and IPs [192.168.112.129 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master02 localhost] and IPs [192.168.112.129 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.112.129 192.168.112.136]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0315 10:59:52.640333 1546 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0315 10:59:52.645116 1546 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0315 10:59:52.646387 1546 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-03-15T11:00:28.875+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.112.129:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node master02 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master02 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

------------------------------------------------------------------------------------------------------------------------------------------------

# B. Join using the manually distributed root certificates
kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:05945a0dc7d9c5e45e196d8582de19a3df559d1f9f4e4cb52c77d3051db923b4 \
--control-plane
------------------------------------------------------------------------------------------------------------------------------------------------
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.112.129 192.168.112.136]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master02 localhost] and IPs [192.168.112.129 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master02 localhost] and IPs [192.168.112.129 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0315 11:48:00.712980 2760 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0315 11:48:00.717833 2760 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0315 11:48:00.718658 2760 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-03-15T11:48:38.856+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.112.129:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet-check] Initial timeout of 40s passed.
[mark-control-plane] Marking the node master02 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master02 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

------------------------------------------------------------------------------------------------------------------------------------------------

5. On the third control-plane node (master03) (important: use either method A or method B, never both):

# A. Join using the automatically distributed root certificates
kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:33a6370b4bb4a9385c1d878e9a7a085ad969d521e4b309b01be797c0d7867d69 \
--control-plane --certificate-key 16f06d3321fce089cad4b229da9b5d3ef94c08a246943e0f375b977f18bbab8e
------------------------------------------------------------------------------------------------------------------------------------------------
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master03 localhost] and IPs [192.168.112.130 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master03 localhost] and IPs [192.168.112.130 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master03 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.112.130 192.168.112.136]
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0315 11:02:05.176831 1648 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0315 11:02:05.182344 1648 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0315 11:02:05.183197 1648 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-03-15T11:02:32.084+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.112.130:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node master03 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master03 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

------------------------------------------------------------------------------------------------------------------------------------------------

# B. Join using the manually distributed root certificates
kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:05945a0dc7d9c5e45e196d8582de19a3df559d1f9f4e4cb52c77d3051db923b4 \
--control-plane
------------------------------------------------------------------------------------------------------------------------------------------------
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master03 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.112.130 192.168.112.136]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master03 localhost] and IPs [192.168.112.130 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master03 localhost] and IPs [192.168.112.130 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
W0315 11:49:29.220424 2807 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0315 11:49:29.225217 2807 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0315 11:49:29.226261 2807 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-03-15T11:49:56.765+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://192.168.112.130:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node master03 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master03 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

------------------------------------------------------------------------------------------------------------------------------------------------

6. Configure kubectl access on each of the three control-plane nodes

# Run on master01, master02, and master03 respectively
rm -rf $HOME/.kube
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
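
## Optional sanity check (a sketch): the kubeconfig should now reach the API server through the load-balanced endpoint
kubectl cluster-info
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'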

7. Verify the stacked structure of the highly available deployment

# Can be run on any one of master01, master02, or master03
kubectl get pod --all-namespaces -o wide
------------------------------------------------------------------------------------------------------------------------------------------------
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-648f4868b8-gszmn 1/1 Running 0 2m36s 10.211.235.1 master03 <none> <none>
kube-system calico-node-dk4s6 1/1 Running 0 2m36s 192.168.112.128 master01 <none> <none>
kube-system calico-node-lhj5p 1/1 Running 0 2m36s 192.168.112.129 master02 <none> <none>
kube-system calico-node-tscpz 1/1 Running 0 2m36s 192.168.112.130 master03 <none> <none>
kube-system coredns-7f9c544f75-9w4kn 1/1 Running 0 12m 10.211.59.193 master02 <none> <none>
kube-system coredns-7f9c544f75-xvsbn 1/1 Running 0 12m 10.211.59.194 master02 <none> <none>
kube-system etcd-master01 1/1 Running 0 12m 192.168.112.128 master01 <none> <none>
kube-system etcd-master02 1/1 Running 0 5m58s 192.168.112.129 master02 <none> <none>
kube-system etcd-master03 1/1 Running 0 3m46s 192.168.112.130 master03 <none> <none>
kube-system kube-apiserver-master01 1/1 Running 0 12m 192.168.112.128 master01 <none> <none>
kube-system kube-apiserver-master02 1/1 Running 0 5m59s 192.168.112.129 master02 <none> <none>
kube-system kube-apiserver-master03 1/1 Running 0 3m46s 192.168.112.130 master03 <none> <none>
kube-system kube-controller-manager-master01 1/1 Running 1 12m 192.168.112.128 master01 <none> <none>
kube-system kube-controller-manager-master02 1/1 Running 0 5m59s 192.168.112.129 master02 <none> <none>
kube-system kube-controller-manager-master03 1/1 Running 0 3m46s 192.168.112.130 master03 <none> <none>
kube-system kube-haproxy-master01 1/1 Running 0 12m 192.168.112.128 master01 <none> <none>
kube-system kube-keepalived-master01 1/1 Running 0 12m 192.168.112.128 master01 <none> <none>
kube-system kube-proxy-6fw8x 1/1 Running 0 12m 192.168.112.128 master01 <none> <none>
kube-system kube-proxy-7hkv7 1/1 Running 0 6m 192.168.112.129 master02 <none> <none>
kube-system kube-proxy-9trwk 1/1 Running 0 3m47s 192.168.112.130 master03 <none> <none>
kube-system kube-scheduler-master01 1/1 Running 1 12m 192.168.112.128 master01 <none> <none>
kube-system kube-scheduler-master02 1/1 Running 0 5m59s 192.168.112.129 master02 <none> <none>
kube-system kube-scheduler-master03 1/1 Running 0 3m46s 192.168.112.130 master03 <none> <none>
------------------------------------------------------------------------------------------------------------------------------------------------

# Can be run on any one of master01, master02, or master03
kubectl get node -o wide
------------------------------------------------------------------------------------------------------------------------------------------------
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master01 Ready master 13m v1.17.0 192.168.112.128 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
master02 Ready master 6m40s v1.17.0 192.168.112.129 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
master03 Ready master 4m27s v1.17.0 192.168.112.130 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
------------------------------------------------------------------------------------------------------------------------------------------------
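
The load-balanced endpoint itself can also be probed through the VIP; a small sketch (it assumes /healthz is readable anonymously, which the default kubeadm RBAC rules allow):

curl -k https://192.168.112.136:8443/healthz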

8. Check the health of etcd

# Run on master01, master02, and master03 in turn; master01 is shown here as the example
kubectl exec -it etcd-master01 /bin/sh -n kube-system

etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key member list
------------------------------------------------------------------------------------------------------------------------------------------------
ade36780a0899522, started, master01, https://192.168.112.128:2380, https://192.168.112.128:2379, false
b4a6061544dbd63b, started, master03, https://192.168.112.130:2380, https://192.168.112.130:2379, false
ecaa91fc374ff6f0, started, master02, https://192.168.112.129:2380, https://192.168.112.129:2379, false
------------------------------------------------------------------------------------------------------------------------------------------------

etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key endpoint health
------------------------------------------------------------------------------------------------------------------------------------------------
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 9.338525ms
------------------------------------------------------------------------------------------------------------------------------------------------

etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key endpoint status
------------------------------------------------------------------------------------------------------------------------------------------------
https://127.0.0.1:2379, ade36780a0899522, 3.4.3, 2.6 MB, false, false, 21, 53251, 53251,
------------------------------------------------------------------------------------------------------------------------------------------------
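
The same checks can be scripted from any master without opening an interactive shell, for example:

kubectl -n kube-system exec etcd-master01 -- etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key endpoint health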

9. Add two worker nodes to the highly available cluster

# Run on node01 (identical whether the root certificates were distributed automatically or manually)
kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:33a6370b4bb4a9385c1d878e9a7a085ad969d521e4b309b01be797c0d7867d69
------------------------------------------------------------------------------------------------------------------------------------------------
W0315 11:12:27.853703 9587 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

------------------------------------------------------------------------------------------------------------------------------------------------

# Run on node02 (identical whether the root certificates were distributed automatically or manually)
kubeadm join 192.168.112.136:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:33a6370b4bb4a9385c1d878e9a7a085ad969d521e4b309b01be797c0d7867d69
------------------------------------------------------------------------------------------------------------------------------------------------
W0315 11:13:18.680949 9561 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

------------------------------------------------------------------------------------------------------------------------------------------------

# Can be run on any one of master01, master02, or master03
kubectl get node -o wide
------------------------------------------------------------------------------------------------------------------------------------------------
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master01 Ready master 23m v1.17.0 192.168.112.128 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
master02 Ready master 17m v1.17.0 192.168.112.129 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
master03 Ready master 15m v1.17.0 192.168.112.130 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
node01 Ready <none> 4m59s v1.17.0 192.168.112.131 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
node02 Ready <none> 4m8s v1.17.0 192.168.112.132 <none> CentOS Linux 7 (Core) 3.10.0-1062.12.1.el7.x86_64 docker://18.9.9
------------------------------------------------------------------------------------------------------------------------------------------------

# Can be run on any one of master01, master02, or master03
kubectl get pod --all-namespaces -o wide
------------------------------------------------------------------------------------------------------------------------------------------------
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-648f4868b8-gszmn 1/1 Running 0 14m 10.211.235.1 master03 <none> <none>
kube-system calico-node-dk4s6 1/1 Running 0 14m 192.168.112.128 master01 <none> <none>
kube-system calico-node-lhj5p 1/1 Running 0 14m 192.168.112.129 master02 <none> <none>
kube-system calico-node-lkl66 1/1 Running 0 4m43s 192.168.112.132 node02 <none> <none>
kube-system calico-node-ncjc4 1/1 Running 0 5m34s 192.168.112.131 node01 <none> <none>
kube-system calico-node-tscpz 1/1 Running 0 14m 192.168.112.130 master03 <none> <none>
kube-system coredns-7f9c544f75-9w4kn 1/1 Running 0 24m 10.211.59.193 master02 <none> <none>
kube-system coredns-7f9c544f75-xvsbn 1/1 Running 0 24m 10.211.59.194 master02 <none> <none>
kube-system etcd-master01 1/1 Running 0 24m 192.168.112.128 master01 <none> <none>
kube-system etcd-master02 1/1 Running 0 18m 192.168.112.129 master02 <none> <none>
kube-system etcd-master03 1/1 Running 0 15m 192.168.112.130 master03 <none> <none>
kube-system kube-apiserver-master01 1/1 Running 0 24m 192.168.112.128 master01 <none> <none>
kube-system kube-apiserver-master02 1/1 Running 0 18m 192.168.112.129 master02 <none> <none>
kube-system kube-apiserver-master03 1/1 Running 0 15m 192.168.112.130 master03 <none> <none>
kube-system kube-controller-manager-master01 1/1 Running 1 24m 192.168.112.128 master01 <none> <none>
kube-system kube-controller-manager-master02 1/1 Running 0 18m 192.168.112.129 master02 <none> <none>
kube-system kube-controller-manager-master03 1/1 Running 0 15m 192.168.112.130 master03 <none> <none>
kube-system kube-haproxy-master01 1/1 Running 0 24m 192.168.112.128 master01 <none> <none>
kube-system kube-keepalived-master01 1/1 Running 0 24m 192.168.112.128 master01 <none> <none>
kube-system kube-proxy-6fw8x 1/1 Running 0 24m 192.168.112.128 master01 <none> <none>
kube-system kube-proxy-7hkv7 1/1 Running 0 18m 192.168.112.129 master02 <none> <none>
kube-system kube-proxy-96cz5 1/1 Running 0 5m34s 192.168.112.131 node01 <none> <none>
kube-system kube-proxy-9trwk 1/1 Running 0 15m 192.168.112.130 master03 <none> <none>
kube-system kube-proxy-pwslt 1/1 Running 0 4m43s 192.168.112.132 node02 <none> <none>
kube-system kube-scheduler-master01 1/1 Running 1 24m 192.168.112.128 master01 <none> <none>
kube-system kube-scheduler-master02 1/1 Running 0 18m 192.168.112.129 master02 <none> <none>
kube-system kube-scheduler-master03 1/1 Running 0 15m 192.168.112.130 master03 <none> <none>
------------------------------------------------------------------------------------------------------------------------------------------------
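
A note on tokens: the bootstrap token in the kubeadm configuration has a 24-hour TTL. If it has expired by the time another node needs to join, a fresh join command can be generated on any master:

kubeadm token create --print-join-command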

V. Resetting nodes (masters and workers)

kubeadm reset
rm -rf /etc/kubernetes/ /var/lib/etcd/ /etc/cni/ $HOME/.kube/
reboot
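
kubeadm reset does not clean up iptables or IPVS rules; its own output suggests flushing them manually if needed (ipvsadm is only relevant when kube-proxy ran in IPVS mode):

iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm --clear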

VI. References

1. Official documentation (latest official version, v1.17)

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/ha-topology/
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/

2. Third-party material (from Kubernetes v1.15 through v1.17 the kubeadm and binary installation procedures are essentially unchanged, so the v1.15 material also serves as a reference for v1.17)

https://www.cnblogs.com/lingfenglian/p/11753590.html
https://blog.51cto.com/fengwan/2426528?source=dra
https://my.oschina.net/beyondken/blog/1935402
https://www.cnblogs.com/shenlinken/p/9968274.html

Installing a Highly Available Kubernetes v1.11.0 Cluster with Kubeadm (Stacked Control Plane Nodes For Baremetal)

一、Approaches to the Highly Available Deployment

The official documentation describes two different ways to set up a highly available Kubernetes cluster with kubeadm:

1. With stacked masters

This approach requires less infrastructure; the control plane nodes and etcd members are co-located.

2. With an external etcd cluster

This approach requires more infrastructure; the control plane nodes and etcd members are kept separate.
This article focuses on the first approach, stacked masters. See the References section for the official documentation links.

二、Environment Version Information

docker 17.03.1-ce
kubeadm v1.11.0
kubelet v1.11.0
kubectl v1.11.0
calico v3.1.3

三、Deployment Architecture

1. Kubernetes Master (Control Plane)

172.16.170.128 server01 -> docker kubelet keepalived haproxy etcd kube-apiserver kube-controller-manager kube-scheduler kube-proxy calico-node
172.16.170.129 server02 -> docker kubelet keepalived haproxy etcd kube-apiserver kube-controller-manager kube-scheduler kube-proxy calico-node
172.16.170.130 server03 -> docker kubelet keepalived haproxy etcd kube-apiserver kube-controller-manager kube-scheduler kube-proxy calico-node

2. Kubernetes Node

172.16.170.134 server07 -> docker kubelet kube-proxy calico-node
172.16.170.135 server08 -> docker kubelet kube-proxy calico-node
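
The kubeadm and etcd configuration below refers to the control-plane nodes by hostname (server01, server02, server03), so every node must be able to resolve these names. A minimal /etc/hosts sketch (an assumption on my part; DNS or any other name-resolution mechanism works just as well):

# Append the cluster hostnames to /etc/hosts on every node
cat <<EOF >> /etc/hosts
172.16.170.128 server01
172.16.170.129 server02
172.16.170.130 server03
172.16.170.134 server07
172.16.170.135 server08
EOF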

四、Procedure Record

1. Deploy HAProxy as the load balancer on every node of the Kubernetes Control Plane (run as a static Pod managed by the kubelet)

mkdir -p /etc/haproxy/
cat <<EOF > /etc/haproxy/haproxy.cfg
global
    log 127.0.0.1 local0 err
    maxconn 50000
    uid 99
    gid 99
    #daemon
    nbproc 1
    pidfile haproxy.pid

defaults
    mode tcp
    log 127.0.0.1 local0 err
    maxconn 50000
    retries 3
    timeout connect 10s
    timeout client 10m
    timeout server 10m

listen stats
    mode http
    bind 0.0.0.0:9090
    log 127.0.0.1 local0 err
    stats refresh 30s
    stats uri /haproxy-status
    stats realm Haproxy\ Statistics
    stats auth admin:12345678
    stats hide-version
    stats admin if TRUE

frontend kube-apiserver-https
    mode tcp
    bind :8443
    default_backend kube-apiserver-backend

backend kube-apiserver-backend
    mode tcp
    balance roundrobin
    server server01 172.16.170.128:6443 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5
    server server02 172.16.170.129:6443 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5
    server server03 172.16.170.130:6443 weight 3 minconn 100 maxconn 50000 check inter 5000 rise 2 fall 5
EOF

cat <<EOF > /etc/kubernetes/manifests/haproxy.yaml
kind: Pod
apiVersion: v1
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  labels:
    component: haproxy
    tier: control-plane
  name: kube-haproxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-haproxy
    image: haproxy:1.7-alpine
    resources:
      requests:
        cpu: 100m
    volumeMounts:
    - name: haproxy-cfg
      readOnly: true
      mountPath: /usr/local/etc/haproxy/haproxy.cfg
  volumes:
  - name: haproxy-cfg
    hostPath:
      path: /etc/haproxy/haproxy.cfg
EOF
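
Before relying on the static Pod, the generated haproxy.cfg can be sanity-checked with the same image the Pod uses, and the stats page defined above can be probed once the Pod is running. This is only a hedged sketch; the admin:12345678 credentials come from the stats auth line above:

# Validate the configuration file syntax with the haproxy binary in the image
docker run --rm -v /etc/haproxy/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro haproxy:1.7-alpine haproxy -c -f /usr/local/etc/haproxy/haproxy.cfg

# Once the kubelet has started the Pod, the stats page answers on port 9090
curl -u admin:12345678 http://127.0.0.1:9090/haproxy-status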

2. Deploy Keepalived on every node of the Kubernetes Control Plane (run as a static Pod managed by the kubelet)

cat <<EOF > /etc/kubernetes/manifests/keepalived.yaml
kind: Pod
apiVersion: v1
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  labels:
    component: keepalived
    tier: control-plane
  name: kube-keepalived
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-keepalived
    image: osixia/keepalived:1.4.5
    env:
    - name: KEEPALIVED_VIRTUAL_IPS
      value: 172.16.170.151
    - name: KEEPALIVED_INTERFACE
      value: ens33
    - name: KEEPALIVED_UNICAST_PEERS
      value: "#PYTHON2BASH:['172.16.170.128', '172.16.170.129', '172.16.170.130']"
    - name: KEEPALIVED_PASSWORD
      value: docker
    - name: KEEPALIVED_PRIORITY
      value: "100"
    - name: KEEPALIVED_ROUTER_ID
      value: "51"
    resources:
      requests:
        cpu: 100m
    securityContext:
      privileged: true
      capabilities:
        add:
        - NET_ADMIN
EOF
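
Once the kubelet is fully up (after kubeadm init in the next step) it will start this static Pod, and exactly one of the three masters should hold the VIP. A quick hedged check, reusing the interface name ens33 from the manifest above:

# Show whether the VIP is currently bound on this node
ip addr show ens33 | grep 172.16.170.151

# Any HTTP response (even 403 Forbidden) confirms the VIP -> HAProxy -> kube-apiserver path works
curl -k https://172.16.170.151:8443/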

3. On the first node of the Kubernetes Control Plane (server01):

# Generate the configuration file used by kubeadm init
mkdir -p kubeadm/config/
cat <<EOF > kubeadm/config/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
kubernetesVersion: v1.11.0
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
apiServerCertSANs:
- "172.16.170.151"
api:
  controlPlaneEndpoint: "172.16.170.151:8443"
etcd:
  local:
    extraArgs:
      listen-client-urls: "https://127.0.0.1:2379,https://172.16.170.128:2379"
      advertise-client-urls: "https://172.16.170.128:2379"
      listen-peer-urls: "https://172.16.170.128:2380"
      initial-advertise-peer-urls: "https://172.16.170.128:2380"
      initial-cluster: "server01=https://172.16.170.128:2380"
    serverCertSANs:
    - server01
    - 172.16.170.128
    peerCertSANs:
    - server01
    - 172.16.170.128
controllerManagerExtraArgs:
  node-monitor-grace-period: 10s
  pod-eviction-timeout: 10s

networking:
  podSubnet: 10.211.0.0/16
  serviceSubnet: 10.96.0.0/16

kubeProxy:
  config:
    mode: iptables
EOF

# Pull the docker images required by kubeadm init
kubeadm config images pull --config kubeadm/config/kubeadm-config.yaml

# Run kubeadm init (note down the node-join command printed in the output)
kubeadm init --config kubeadm/config/kubeadm-config.yaml

# Configure kubectl access on the current node
rm -rf $HOME/.kube
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

# Save the command from the output that looks like the one below; it is needed later when adding nodes
kubeadm join 172.16.170.151:8443 --token lt0o7j.ayxwcqr8v88spzjj --discovery-token-ca-cert-hash sha256:1ad613cf114281af6eca0afeebae7185ed69218ff92b73ebe248b90cc74353a3

# Configure passwordless ssh from server01 to server02 and server03
ssh-keygen
ssh-copy-id -i .ssh/id_rsa.pub root@server02
ssh-copy-id -i .ssh/id_rsa.pub root@server03

# Verify passwordless ssh from server01 to server02 and server03
ssh server02
ssh server03

# Distribute the pki certificates and the admin.conf file
ssh server02 'mkdir -p /etc/kubernetes/pki/etcd/'
ssh server03 'mkdir -p /etc/kubernetes/pki/etcd/'
cat <<EOF > kubeadm/config/scp-config.sh
USER=root
CONTROL_PLANE_IPS="172.16.170.129 172.16.170.130"
for host in \${CONTROL_PLANE_IPS}; do
    scp /etc/kubernetes/pki/ca.crt \${USER}@\$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/ca.key \${USER}@\$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/sa.key \${USER}@\$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/sa.pub \${USER}@\$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/front-proxy-ca.crt \${USER}@\$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/front-proxy-ca.key \${USER}@\$host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/etcd/ca.crt \${USER}@\$host:/etc/kubernetes/pki/etcd/
    scp /etc/kubernetes/pki/etcd/ca.key \${USER}@\$host:/etc/kubernetes/pki/etcd/
    scp /etc/kubernetes/admin.conf \${USER}@\$host:/etc/kubernetes/
done
EOF
chmod 0755 kubeadm/config/scp-config.sh
./kubeadm/config/scp-config.sh
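
Before joining server02 and server03, it is worth confirming that the single-master control plane is healthy and that the API answers through the VIP as well as the node IP. A hedged verification sketch:

# Control-plane components plus the kube-haproxy/kube-keepalived static Pods should be Running
kubectl get pod -n kube-system -o wide

# apiServerCertSANs above includes the VIP, so the API should also answer on 172.16.170.151:8443
kubectl --server=https://172.16.170.151:8443 get nodes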

4. On the second node of the Kubernetes Control Plane (server02):

# Generate the configuration file used by kubeadm init
mkdir -p kubeadm/config/
cat <<EOF > kubeadm/config/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
kubernetesVersion: v1.11.0
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
apiServerCertSANs:
- "172.16.170.151"
api:
  controlPlaneEndpoint: "172.16.170.151:8443"
etcd:
  local:
    extraArgs:
      listen-client-urls: "https://127.0.0.1:2379,https://172.16.170.129:2379"
      advertise-client-urls: "https://172.16.170.129:2379"
      listen-peer-urls: "https://172.16.170.129:2380"
      initial-advertise-peer-urls: "https://172.16.170.129:2380"
      initial-cluster: "server01=https://172.16.170.128:2380,server02=https://172.16.170.129:2380"
      initial-cluster-state: existing
    serverCertSANs:
    - server02
    - 172.16.170.129
    peerCertSANs:
    - server02
    - 172.16.170.129
controllerManagerExtraArgs:
  node-monitor-grace-period: 10s
  pod-eviction-timeout: 10s

networking:
  podSubnet: 10.211.0.0/16
  serviceSubnet: 10.96.0.0/16

kubeProxy:
  config:
    mode: iptables
EOF

# Pull the docker images required by kubeadm init
kubeadm config images pull --config kubeadm/config/kubeadm-config.yaml

# Bring up the kubelet on server02 via kubeadm alpha phases
kubeadm alpha phase certs all --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase kubelet config write-to-disk --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase kubelet write-env-file --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase kubeconfig kubelet --config kubeadm/config/kubeadm-config.yaml
systemctl restart kubelet.service
systemctl status kubelet.service

# Add the etcd member on this node to the etcd cluster
CP0_IP=172.16.170.128
CP0_HOSTNAME=server01
CP1_IP=172.16.170.129
CP1_HOSTNAME=server02
KUBECONFIG=/etc/kubernetes/admin.conf kubectl exec -n kube-system etcd-${CP0_HOSTNAME} -- etcdctl --ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --endpoints=https://${CP0_IP}:2379 member add ${CP1_HOSTNAME} https://${CP1_IP}:2380
kubeadm alpha phase etcd local --config kubeadm/config/kubeadm-config.yaml

# Deploy the Kubernetes Control Plane components and mark this node as a master
kubeadm alpha phase kubeconfig all --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase controlplane all --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase mark-master --config kubeadm/config/kubeadm-config.yaml
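
At this point the etcd cluster should report two members. A hedged check run from server01, reusing the same etcdctl flags as the member add command above:

# List etcd members; both server01 and server02 should appear as started
kubectl exec -n kube-system etcd-server01 -- etcdctl --ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --endpoints=https://172.16.170.128:2379 member list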

5. On the third node of the Kubernetes Control Plane (server03):

# Generate the configuration file used by kubeadm init
mkdir -p kubeadm/config/
cat <<EOF > kubeadm/config/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
kubernetesVersion: v1.11.0
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
apiServerCertSANs:
- "172.16.170.151"
api:
  controlPlaneEndpoint: "172.16.170.151:8443"
etcd:
  local:
    extraArgs:
      listen-client-urls: "https://127.0.0.1:2379,https://172.16.170.130:2379"
      advertise-client-urls: "https://172.16.170.130:2379"
      listen-peer-urls: "https://172.16.170.130:2380"
      initial-advertise-peer-urls: "https://172.16.170.130:2380"
      initial-cluster: "server01=https://172.16.170.128:2380,server02=https://172.16.170.129:2380,server03=https://172.16.170.130:2380"
      initial-cluster-state: existing
    serverCertSANs:
    - server03
    - 172.16.170.130
    peerCertSANs:
    - server03
    - 172.16.170.130
controllerManagerExtraArgs:
  node-monitor-grace-period: 10s
  pod-eviction-timeout: 10s

networking:
  podSubnet: 10.211.0.0/16
  serviceSubnet: 10.96.0.0/16

kubeProxy:
  config:
    mode: iptables
EOF

# Pull the docker images required by kubeadm init
kubeadm config images pull --config kubeadm/config/kubeadm-config.yaml

# Bring up the kubelet on server03 via kubeadm alpha phases
kubeadm alpha phase certs all --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase kubelet config write-to-disk --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase kubelet write-env-file --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase kubeconfig kubelet --config kubeadm/config/kubeadm-config.yaml
systemctl restart kubelet.service
systemctl status kubelet.service

# Add the etcd member on this node to the etcd cluster
CP0_IP=172.16.170.128
CP0_HOSTNAME=server01
CP2_IP=172.16.170.130
CP2_HOSTNAME=server03
KUBECONFIG=/etc/kubernetes/admin.conf kubectl exec -n kube-system etcd-${CP0_HOSTNAME} -- etcdctl --ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --endpoints=https://${CP0_IP}:2379 member add ${CP2_HOSTNAME} https://${CP2_IP}:2380
kubeadm alpha phase etcd local --config kubeadm/config/kubeadm-config.yaml

# Deploy the Kubernetes Control Plane components and mark this node as a master
kubeadm alpha phase kubeconfig all --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase controlplane all --config kubeadm/config/kubeadm-config.yaml
kubeadm alpha phase mark-master --config kubeadm/config/kubeadm-config.yaml
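
After server03 joins, the etcd cluster has its full three members and can tolerate the loss of one. A hedged health check from any master, again reusing the document's etcdctl invocation:

# All three etcd members should report as healthy
kubectl exec -n kube-system etcd-server01 -- etcdctl --ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --endpoints=https://172.16.170.128:2379 cluster-health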

6. Configure kubectl access on the other two Control Plane nodes

rm -rf $HOME/.kube
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
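
A quick hedged check that kubectl now works on server02 and server03 as well (kubectl get componentstatuses is still available in v1.11):

kubectl get componentstatuses
kubectl get nodes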

7. Verify the stacked structure of the HA deployment

# kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system calico-node-ff5cv 2/2 Running 0 47s 172.16.170.130 server03
kube-system calico-node-hb782 2/2 Running 0 8m 172.16.170.128 server01
kube-system calico-node-zpwcp 2/2 Running 0 4m 172.16.170.129 server02
kube-system coredns-777d78ff6f-5n8bg 1/1 Running 0 10m 10.211.0.4 server01
kube-system coredns-777d78ff6f-wfm7d 1/1 Running 0 10m 10.211.0.5 server01
kube-system etcd-server01 1/1 Running 0 9m 172.16.170.128 server01
kube-system etcd-server02 1/1 Running 0 3m 172.16.170.129 server02
kube-system etcd-server03 1/1 Running 0 27s 172.16.170.130 server03
kube-system kube-apiserver-server01 1/1 Running 0 9m 172.16.170.128 server01
kube-system kube-apiserver-server02 1/1 Running 0 2m 172.16.170.129 server02
kube-system kube-apiserver-server03 1/1 Running 0 16s 172.16.170.130 server03
kube-system kube-controller-manager-server01 1/1 Running 0 9m 172.16.170.128 server01
kube-system kube-controller-manager-server02 1/1 Running 0 2m 172.16.170.129 server02
kube-system kube-controller-manager-server03 1/1 Running 0 16s 172.16.170.130 server03
kube-system kube-haproxy-server01 1/1 Running 0 9m 172.16.170.128 server01
kube-system kube-haproxy-server02 1/1 Running 0 4m 172.16.170.129 server02
kube-system kube-haproxy-server03 1/1 Running 0 27s 172.16.170.130 server03
kube-system kube-keepalived-server01 1/1 Running 0 9m 172.16.170.128 server01
kube-system kube-keepalived-server02 1/1 Running 0 4m 172.16.170.129 server02
kube-system kube-keepalived-server03 1/1 Running 0 27s 172.16.170.130 server03
kube-system kube-proxy-88b55 1/1 Running 0 4m 172.16.170.129 server02
kube-system kube-proxy-9n9vv 1/1 Running 0 9m 172.16.170.128 server01
kube-system kube-proxy-j7lqz 1/1 Running 0 47s 172.16.170.130 server03
kube-system kube-scheduler-server01 1/1 Running 0 9m 172.16.170.128 server01
kube-system kube-scheduler-server02 1/1 Running 0 2m 172.16.170.129 server02
kube-system kube-scheduler-server03 1/1 Running 0 16s 172.16.170.130 server03

# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
server01 Ready master 10m v1.11.0 172.16.170.128 <none> CentOS Linux 7 (Core) 3.10.0-862.11.6.el7.x86_64 docker://17.3.1
server02 Ready master 4m v1.11.0 172.16.170.129 <none> CentOS Linux 7 (Core) 3.10.0-862.11.6.el7.x86_64 docker://17.3.1
server03 Ready master 1m v1.11.0 172.16.170.130 <none> CentOS Linux 7 (Core) 3.10.0-862.11.6.el7.x86_64 docker://17.3.1
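
Beyond listing Pods and nodes, a simple failover test exercises the stacked design: take down the master that currently holds the VIP and confirm that the VIP moves and the API stays reachable through 172.16.170.151:8443. This is only a hedged sketch; run it on a test cluster:

# On the master currently holding the VIP (check with: ip addr show ens33)
systemctl stop kubelet docker

# From another master: the VIP should fail over within a few seconds
ip addr show ens33 | grep 172.16.170.151
kubectl get nodes

# Bring the stopped master back afterwards
systemctl start docker kubelet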

8. Add two worker Nodes to the HA cluster

# Run the join command on server07
# kubeadm join 172.16.170.151:8443 --token lt0o7j.ayxwcqr8v88spzjj --discovery-token-ca-cert-hash sha256:1ad613cf114281af6eca0afeebae7185ed69218ff92b73ebe248b90cc74353a3
[preflight] running pre-flight checks
[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs_wrr ip_vs_sh ip_vs ip_vs_rr] or no builtin kernel ipvs support: map[ip_vs_sh:{} nf_conntrack_ipv4:{} ip_vs:{} ip_vs_rr:{} ip_vs_wrr:{}]
you can solve this problem with following methods:
1. Run 'modprobe -- ' to load missing kernel modules;
2. Provide the missing builtin kernel ipvs support

I0123 16:10:01.668746 17689 kernel_validator.go:81] Validating kernel version
I0123 16:10:01.668820 17689 kernel_validator.go:96] Validating kernel config
[discovery] Trying to connect to API Server "172.16.170.151:8443"
[discovery] Created cluster-info discovery client, requesting info from "https://172.16.170.151:8443"
[discovery] Requesting info from "https://172.16.170.151:8443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "172.16.170.151:8443"
[discovery] Successfully established connection with API Server "172.16.170.151:8443"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.11" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "server07" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

# Run the join command on server08
# kubeadm join 172.16.170.151:8443 --token lt0o7j.ayxwcqr8v88spzjj --discovery-token-ca-cert-hash sha256:1ad613cf114281af6eca0afeebae7185ed69218ff92b73ebe248b90cc74353a3
[preflight] running pre-flight checks
[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs_rr ip_vs_wrr ip_vs_sh ip_vs] or no builtin kernel ipvs support: map[ip_vs_sh:{} nf_conntrack_ipv4:{} ip_vs:{} ip_vs_rr:{} ip_vs_wrr:{}]
you can solve this problem with following methods:
1. Run 'modprobe -- ' to load missing kernel modules;
2. Provide the missing builtin kernel ipvs support

I0123 16:10:29.832899 17706 kernel_validator.go:81] Validating kernel version
I0123 16:10:29.833038 17706 kernel_validator.go:96] Validating kernel config
[discovery] Trying to connect to API Server "172.16.170.151:8443"
[discovery] Created cluster-info discovery client, requesting info from "https://172.16.170.151:8443"
[discovery] Requesting info from "https://172.16.170.151:8443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "172.16.170.151:8443"
[discovery] Successfully established connection with API Server "172.16.170.151:8443"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.11" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "server08" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

# Run on any one of the Master nodes
# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
server01 Ready master 16m v1.11.0 172.16.170.128 <none> CentOS Linux 7 (Core) 3.10.0-862.11.6.el7.x86_64 docker://17.3.1
server02 Ready master 10m v1.11.0 172.16.170.129 <none> CentOS Linux 7 (Core) 3.10.0-862.11.6.el7.x86_64 docker://17.3.1
server03 Ready master 7m v1.11.0 172.16.170.130 <none> CentOS Linux 7 (Core) 3.10.0-862.11.6.el7.x86_64 docker://17.3.1
server07 Ready <none> 40s v1.11.0 172.16.170.134 <none> CentOS Linux 7 (Core) 3.10.0-957.1.3.el7.x86_64 docker://17.3.1
server08 Ready <none> 12s v1.11.0 172.16.170.135 <none> CentOS Linux 7 (Core) 3.10.0-957.1.3.el7.x86_64 docker://17.3.1

# Run on any one of the Master nodes
# kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system calico-node-c8j7r 2/2 Running 0 1m 172.16.170.134 server07
kube-system calico-node-chngv 1/2 Running 0 32s 172.16.170.135 server08
kube-system calico-node-ff5cv 2/2 Running 0 7m 172.16.170.130 server03
kube-system calico-node-hb782 2/2 Running 0 15m 172.16.170.128 server01
kube-system calico-node-zpwcp 2/2 Running 0 11m 172.16.170.129 server02
kube-system coredns-777d78ff6f-5n8bg 1/1 Running 0 16m 10.211.0.4 server01
kube-system coredns-777d78ff6f-wfm7d 1/1 Running 0 16m 10.211.0.5 server01
kube-system etcd-server01 1/1 Running 0 16m 172.16.170.128 server01
kube-system etcd-server02 1/1 Running 0 10m 172.16.170.129 server02
kube-system etcd-server03 1/1 Running 0 7m 172.16.170.130 server03
kube-system kube-apiserver-server01 1/1 Running 0 16m 172.16.170.128 server01
kube-system kube-apiserver-server02 1/1 Running 0 9m 172.16.170.129 server02
kube-system kube-apiserver-server03 1/1 Running 0 7m 172.16.170.130 server03
kube-system kube-controller-manager-server01 1/1 Running 0 16m 172.16.170.128 server01
kube-system kube-controller-manager-server02 1/1 Running 0 9m 172.16.170.129 server02
kube-system kube-controller-manager-server03 1/1 Running 0 7m 172.16.170.130 server03
kube-system kube-haproxy-server01 1/1 Running 0 16m 172.16.170.128 server01
kube-system kube-haproxy-server02 1/1 Running 0 10m 172.16.170.129 server02
kube-system kube-haproxy-server03 1/1 Running 0 7m 172.16.170.130 server03
kube-system kube-keepalived-server01 1/1 Running 0 16m 172.16.170.128 server01
kube-system kube-keepalived-server02 1/1 Running 0 10m 172.16.170.129 server02
kube-system kube-keepalived-server03 1/1 Running 0 7m 172.16.170.130 server03
kube-system kube-proxy-88b55 1/1 Running 0 11m 172.16.170.129 server02
kube-system kube-proxy-9n9vv 1/1 Running 0 16m 172.16.170.128 server01
kube-system kube-proxy-g8lsj 1/1 Running 0 1m 172.16.170.134 server07
kube-system kube-proxy-j7lqz 1/1 Running 0 7m 172.16.170.130 server03
kube-system kube-proxy-qdhpj 1/1 Running 0 32s 172.16.170.135 server08
kube-system kube-scheduler-server01 1/1 Running 0 16m 172.16.170.128 server01
kube-system kube-scheduler-server02 1/1 Running 0 9m 172.16.170.129 server02
kube-system kube-scheduler-server03 1/1 Running 0 7m 172.16.170.130 server03
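
As a final smoke test (a hedged sketch; test-nginx and the nginx:alpine image are just example names), a small Deployment confirms that Pods schedule onto the new worker nodes and can be exposed:

# In v1.11, kubectl run creates a Deployment by default
kubectl run test-nginx --image=nginx:alpine --replicas=2 --port=80
kubectl get pod -o wide -l run=test-nginx
kubectl expose deployment test-nginx --port=80 --type=NodePort
kubectl get svc test-nginx

# Clean up after the test
kubectl delete service,deployment test-nginx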

五、References

https://v1-11.docs.kubernetes.io/docs/setup/independent/high-availability/
https://my.oschina.net/u/3433152/blog/1935402
https://www.jianshu.com/p/49a48752c1a3?utm_source=oschina-app
https://tonybai.com/2017/05/15/setup-a-ha-kubernetes-cluster-based-on-kubeadm-part1/
https://tonybai.com/2017/05/15/setup-a-ha-kubernetes-cluster-based-on-kubeadm-part2/
https://blog.csdn.net/liu_qingbo/article/details/78383892