Docker containers cannot reach the host

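For security reasons I enabled ufw, after which Docker containers could no longer reach a service listening on port 5000 of the host.
The host service itself was fine: with ufw disabled it was reachable as usual, so ufw was clearly the cause.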

  • Requests from inside the container to the host's HTTP service hang (timeout)
# docker exec -it catroll-nginx bash
curl -v <host-ip>:5000
curl -v http://host.docker.internal:5000
curl -v http://172.17.0.1:5000
  • UFW is enabled (deny routed)
-> % sudo ufw status verbose
Status: active
Logging: on (low)
Default: deny (incoming), allow (outgoing), deny (routed)
New profiles: skip

To                         Action      From
--                         ------      ----
3389/tcp                   ALLOW IN    Anywhere
22/tcp (OpenSSH)           ALLOW IN    Anywhere
3389/tcp (v6)              ALLOW IN    Anywhere (v6)
22/tcp (OpenSSH (v6))      ALLOW IN    Anywhere (v6)

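I'm not familiar with UFW, though, so I followed GPT's explanations and suggested commands, took far too many detours, and spent about an hour getting nowhere before finally finding the solution: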

sudo ufw allow from 172.16.0.0/12
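
The /12 range spans 172.16.0.0 through 172.31.255.255, so it covers both docker0's 172.17.0.0/16 and the 172.18.0.0/16 subnet that Compose created here. A minimal sketch to check that arithmetic in plain shell; `ip_to_int` and `ip_in_cidr` are hypothetical helper names for illustration, not part of ufw or Docker:

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of ufw or Docker.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
  IFS=.
  # Unquoted on purpose: split the address on dots into $1..$4.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit status 0) if IPv4 address $1 lies inside CIDR block $2.
ip_in_cidr() (
  net=${2%/*}    # network part, e.g. 172.16.0.0
  bits=${2#*/}   # prefix length, e.g. 12
  mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)

ip_in_cidr 172.18.0.2 172.16.0.0/12 && echo "172.18.0.2 is covered by 172.16.0.0/12"
ip_in_cidr 172.18.0.2 172.17.0.0/16 || echo "172.18.0.2 is NOT covered by 172.17.0.0/16"
```

The second check shows why a rule limited to docker0's /16 cannot match a container on the Compose network.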

Detours along the way

ufw

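I had used docker compose to create a network with the subnet 172.18.0.0/16.
GPT was not aware of this and assumed the traffic went through docker0 (172.17/16), when the compose network actually uses 172.18/16 or 172.19/16.
That caused most of the detours that followed, starting with a rule that only covers docker0's subnet: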

sudo ufw allow from 172.17.0.0/16

Allowing port 5000 improved things, but I wanted the Docker internal networks to be treated as a trusted zone rather than opening ports one by one.

ufw allow 5000/tcp

Checking iptables

-> % sudo iptables -L ufw-before-input -n -v --line-numbers
Chain ufw-before-input (1 references)
num   pkts bytes target     prot opt in     out     source               destination
1     1557 1462K ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0
2    30651 5181K ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
3       17 15380 ufw-logging-deny  all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate INVALID
4       17 15380 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate INVALID
5        0     0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            icmptype 3
6        0     0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            icmptype 11
7        0     0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            icmptype 12
8        0     0 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            icmptype 8
9        0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            udp spt:67 dpt:68
10      63 11466 ufw-not-local  all  --  *      *       0.0.0.0/0            0.0.0.0/0
11       1    73 ACCEPT     udp  --  *      *       0.0.0.0/0            224.0.0.251          udp dpt:5353
12      28  5142 ACCEPT     udp  --  *      *       0.0.0.0/0            239.255.255.250      udp dpt:1900
13      34  6251 ufw-user-input  all  --  *      *       0.0.0.0/0            0.0.0.0/0

-> % sudo iptables -L ufw-user-input -n -v --line-numbers
Chain ufw-user-input (1 references)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:3389
2        0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:22 /* 'dapp_OpenSSH' */
3        0     0 ACCEPT     all  --  *      *       172.17.0.0/16        0.0.0.0/0
4        0     0 ACCEPT     tcp  --  docker0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:5000
5        0     0 ACCEPT     udp  --  docker0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:5000

-> % sudo iptables -L ufw-reject-input -n -v
Chain ufw-reject-input (1 references)
 pkts bytes target     prot opt in     out     source               destination

I also inserted a FORWARD rule for port 5000 directly with iptables:

sudo iptables -I FORWARD -p tcp --dport 5000 -j ACCEPT

docker0

Capturing on docker0 with tcpdump showed no traffic at all, and adding a pile of ufw and iptables rules made no difference:

-> % sudo iptables -S FORWARD
-P FORWARD DROP
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-FORWARD
-A FORWARD -j ufw-before-logging-forward
-A FORWARD -j ufw-before-forward
-A FORWARD -j ufw-after-forward
-A FORWARD -j ufw-after-logging-forward
-A FORWARD -j ufw-reject-forward
-A FORWARD -j ufw-track-forward

sudo vim /etc/default/ufw
# DEFAULT_FORWARD_POLICY="ACCEPT"

sudo ufw allow in on docker0
sudo ufw allow out on docker0

sudo vim /etc/ufw/before.rules

# Add:

-A ufw-before-forward -i docker0 -j ACCEPT
-A ufw-before-forward -o docker0 -j ACCEPT
-A ufw-before-forward -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

*nat
:POSTROUTING ACCEPT [0:0]
-A POSTROUTING -s 172.17.0.0/16 -o eth0 -j MASQUERADE
COMMIT

-> % sudo iptables -S FORWARD
-P FORWARD ACCEPT
-A FORWARD -i eth0 -o docker0 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 -o eth0 -j ACCEPT
-A FORWARD -p tcp -m tcp --dport 5000 -j ACCEPT
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-FORWARD
-A FORWARD -j ufw-before-logging-forward
-A FORWARD -j ufw-before-forward
-A FORWARD -j ufw-after-forward
-A FORWARD -j ufw-after-logging-forward
-A FORWARD -j ufw-reject-forward
-A FORWARD -j ufw-track-forward

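Although the default policy of FORWARD is DROP, Docker inserts its own forwarding chains, so the default policy alone does not tell you where the problem is. In addition, traffic from a container to the host's own address is delivered locally and traverses INPUT rather than FORWARD, which is consistent with the eventual fix being an ALLOW IN rule.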

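Kernel parameters (rp_filter)

I tried to rule out interference from rp_filter (reverse path filtering; 1 is strict mode, and the value 2 shown below is loose mode), but it turned out not to be the root cause.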

-> % sysctl net.ipv4.conf.all.rp_filter
net.ipv4.conf.all.rp_filter = 2

sudo sysctl -w net.ipv4.conf.all.rp_filter=0
sudo sysctl -w net.ipv4.conf.default.rp_filter=0
sudo sysctl -w net.ipv4.conf.docker0.rp_filter=0
sudo sysctl -w net.ipv4.conf.eth0.rp_filter=0

Cleaning up the redundant ufw rules left over from troubleshooting

-> % sudo ufw status numbered
[ 1] 3389/tcp                   ALLOW IN    Anywhere
[ 2] OpenSSH                    ALLOW IN    Anywhere
[ 3] Anywhere on docker0        ALLOW FWD   Anywhere on docker0
[ 4] 5000 on docker0            ALLOW IN    Anywhere
[ 5] Anywhere on docker0        ALLOW IN    Anywhere
[ 6] Anywhere                   ALLOW OUT   Anywhere on docker0        (out)
[ 7] 5000/tcp                   ALLOW IN    Anywhere
[ 8] Anywhere                   ALLOW IN    172.16.0.0/12
[ 9] 3389/tcp (v6)              ALLOW IN    Anywhere (v6)
[10] OpenSSH (v6)               ALLOW IN    Anywhere (v6)
[11] Anywhere (v6) on docker0   ALLOW FWD   Anywhere (v6) on docker0
[12] 5000 (v6) on docker0       ALLOW IN    Anywhere (v6)
[13] Anywhere (v6) on docker0   ALLOW IN    Anywhere (v6)
[14] Anywhere (v6)              ALLOW OUT   Anywhere (v6) on docker0   (out)
[15] 5000/tcp (v6)              ALLOW IN    Anywhere (v6)

-> % sudo ufw delete 3
...

After deleting a bunch of rules, the configuration is back to its earlier state plus the one rule that matters:

-> % sudo ufw status numbered
Status: active

     To                         Action      From
     --                         ------      ----
[ 1] 3389/tcp                   ALLOW IN    Anywhere
[ 2] OpenSSH                    ALLOW IN    Anywhere
[ 3] Anywhere                   ALLOW IN    172.16.0.0/12
[ 4] 3389/tcp (v6)              ALLOW IN    Anywhere (v6)
[ 5] OpenSSH (v6)               ALLOW IN    Anywhere (v6)

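Docker Compose custom networks

By default, Docker Compose creates a user-defined bridge network.

In that case:

  • container traffic does not go through docker0
  • a br-xxxx bridge is created
  • the subnet is assigned automatically by compose

For example: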

  • docker0 → 172.17.0.0/16
  • deploy_catroll-network → 172.18.0.0/16

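Therefore:

  • seeing no traffic in tcpdump on docker0 is expected here
  • when troubleshooting Docker networking, first confirm which network/subnet the container is actually on instead of assuming the traffic goes through docker0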
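
To make that first check concrete, here is a rough sketch that lists each bridge network with its host interface and subnet. It assumes the docker CLI is installed and that user-defined bridges follow the default naming of `br-` plus the first 12 characters of the network ID, unless the `com.docker.network.bridge.name` option overrides it (as it does for docker0). `bridge_name_for_id` and `list_docker_bridges` are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helpers for illustration; assumes default Docker bridge naming.

# Derive the default bridge interface name from a Docker network ID:
# "br-" followed by the first 12 characters of the ID.
bridge_name_for_id() {
  printf 'br-%.12s\n' "$1"
}

# Print name, host interface and subnet(s) for every bridge network.
list_docker_bridges() {
  docker network ls --filter driver=bridge --format '{{.ID}} {{.Name}}' |
  while read -r id name; do
    subnets=$(docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}} {{end}}' "$id")
    # An explicit bridge name option (docker0 on the default network) wins.
    iface=$(docker network inspect -f '{{index .Options "com.docker.network.bridge.name"}}' "$id")
    [ -n "$iface" ] || iface=$(bridge_name_for_id "$id")
    printf '%-28s %-18s %s\n' "$name" "$iface" "$subnets"
  done
}

# Only query Docker when the CLI is actually available.
if command -v docker >/dev/null 2>&1; then
  list_docker_bridges
fi
```

With the interface name known, a capture such as `sudo tcpdump -ni br-<id> port 5000` (substituting the real interface name) should show the traffic that never appeared on docker0.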