Anycast primary IP timing out 16:38 - 17:08 UTC every day
As the title says, the anycast primary IP (45.90.28.0) times out for me for 30 minutes every day, from 16:38 to 17:08 UTC. The network otherwise functions fine, but dnscrypt-proxy reports a timeout if I restart the service during that window.
The affected client is an OPNsense router running in Hetzner's Finland datacenter.
This is the output of the diag tool while the issue was occurring:
Welcome to NextDNS network diagnostic tool.
This tool will download a small binary to capture latency and routing information
regarding the connectivity of your network with NextDNS. In order to perform a
traceroute, root permission is required. You may therefore be asked to provide
your password for sudo.
The source code of this tool is available at https://github.com/nextdns/diag
Do you want to continue? (press enter to accept)
Testing IPv6 connectivity
available: false
Fetching https://test.nextdns.io
status: unconfigured
client: 135.181.137.104
resolver: 162.158.180.135
Fetching PoP name for ultra low latency primary IPv4 (ipv4.dns1.nextdns.io)
hetzner-hel: 527µs
Fetching PoP name for ultra low latency secondary IPv4 (ipv4.dns2.nextdns.io)
tavu-hel: 6.907ms
Fetching PoP name for anycast primary IPv4 (45.90.28.0)
Fetch error: Get "https://dns.nextdns.io/info": dial tcp 45.90.28.0:443: connect: connection timed out
Fetching PoP name for anycast secondary IPv4 (45.90.30.0)
anexia-sto: 7.1ms
Pinging PoPs
hetzner-hel: 621µs
anexia-sto: 6.469ms
tavu-hel: 9ms
wavecom-tll: 18.058ms
zepto-sto: 31.133ms
zepto-osl: 33.591ms
edis-vno: 57.505ms
melbicom-vno: 57.494ms
edis-rix: 61.965ms
melbicom-rix: 62ms
Traceroute for ultra low latency primary IPv4 (135.181.102.167)
1 10.7.2.1 0ms 0ms 0ms
2 100.80.104.1 0ms 0ms 0ms
3 213.239.224.129 2ms 0ms 0ms
4 88.198.249.94 0ms 0ms 0ms
5 * * *
6 95.216.129.231 0ms 0ms 0ms
7 135.181.102.167 0ms 0ms 0ms
Traceroute for ultra low latency secondary IPv4 (185.87.111.218)
1 10.7.2.1 0ms 0ms 0ms
2 100.80.104.1 0ms 0ms 0ms
3 213.239.224.125 0ms 0ms 0ms
4 213.239.224.37 0ms 0ms 0ms
5 193.110.226.20 1ms 1ms 1ms
6 217.78.199.68 6ms 6ms 6ms
7 62.115.177.183 7ms 7ms 7ms
8 95.175.96.33 6ms 6ms 6ms
9 95.175.105.70 6ms 6ms 6ms
10 185.87.111.218 6ms 6ms 6ms
Traceroute for anycast primary IPv4 (45.90.28.0)
1 10.7.2.1 0ms 0ms 0ms
2 100.80.104.1 0ms 0ms 0ms
3 213.239.224.125 0ms 0ms 0ms
4 213.239.224.37 0ms 0ms 0ms
5 109.239.137.157 6ms 6ms 6ms
6 45.131.68.6 6ms 6ms 6ms
7 * * *
8 * * *
9 * * *
10 * * *
11 * * *
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
Traceroute for anycast secondary IPv4 (45.90.30.0)
1 10.7.2.1 0ms 0ms 0ms
2 100.80.104.1 0ms 0ms 0ms
3 213.239.224.125 2ms 0ms 0ms
4 213.239.224.17 7ms 7ms 7ms
5 194.68.123.18 8ms 7ms 7ms
6 213.227.185.32 7ms 7ms 7ms
7 45.90.30.0 7ms 7ms 7ms
Do you want to send this report? [Y/n]: n
I re-ran the tool later and the output is basically the same, except that the anycast primary no longer times out and test.nextdns.io now shows the box as configured. (The first run showed "unconfigured" because I had to use a third-party resolver at the time due to this issue.)
Welcome to NextDNS network diagnostic tool.
This tool will download a small binary to capture latency and routing information
regarding the connectivity of your network with NextDNS. In order to perform a
traceroute, root permission is required. You may therefore be asked to provide
your password for sudo.
The source code of this tool is available at https://github.com/nextdns/diag
Do you want to continue? (press enter to accept)
Testing IPv6 connectivity
available: false
Fetching https://test.nextdns.io
status: ok
client: 135.181.137.104
protocol: DOH
dest IP: 45.90.28.0
server: zepto-led-1
Fetching PoP name for ultra low latency primary IPv4 (ipv4.dns1.nextdns.io)
hetzner-hel: 576µs
Fetching PoP name for ultra low latency secondary IPv4 (ipv4.dns2.nextdns.io)
tavu-hel: 6.838ms
Fetching PoP name for anycast primary IPv4 (45.90.28.0)
zepto-led: 6.914ms
Fetching PoP name for anycast secondary IPv4 (45.90.30.0)
anexia-sto: 7.092ms
Pinging PoPs
hetzner-hel: 466µs
anexia-sto: 6.125ms
tavu-hel: 6ms
wavecom-tll: 18.055ms
zepto-sto: 45.352ms
zepto-osl: 47.068ms
edis-vno: 57.507ms
melbicom-vno: 57.435ms
melbicom-rix: 62ms
edis-rix: 63.523ms
Traceroute for ultra low latency primary IPv4 (135.181.102.167)
1 10.7.2.1 0ms 0ms 0ms
2 100.80.104.1 0ms 0ms 0ms
3 213.239.224.129 2ms 2ms 0ms
4 88.198.249.94 9ms 30ms 6ms
5 * * *
6 95.216.129.231 0ms 0ms 0ms
7 135.181.102.167 0ms 0ms 0ms
Traceroute for ultra low latency secondary IPv4 (185.87.111.218)
1 10.7.2.1 0ms 0ms 0ms
2 100.80.104.1 0ms 0ms 0ms
3 213.239.224.125 0ms 0ms 0ms
4 213.239.224.37 0ms 0ms 0ms
5 193.110.226.20 1ms 1ms 1ms
6 217.78.199.68 6ms 6ms 6ms
7 62.115.177.183 7ms 7ms 7ms
8 95.175.96.33 6ms 6ms 6ms
9 95.175.105.70 6ms 6ms 6ms
10 185.87.111.218 6ms 6ms 6ms
Traceroute for anycast primary IPv4 (45.90.28.0)
1 10.7.2.1 0ms 0ms 0ms
2 100.80.104.1 0ms 0ms 0ms
3 213.239.224.125 0ms 0ms 0ms
4 213.239.224.37 0ms 0ms 0ms
5 109.239.137.157 6ms 6ms 6ms
6 45.131.68.6 6ms 6ms 6ms
7 45.90.28.0 6ms 6ms 6ms
Traceroute for anycast secondary IPv4 (45.90.30.0)
1 10.7.2.1 0ms 0ms 0ms
2 100.80.104.1 0ms 0ms 0ms
3 213.239.224.125 0ms 0ms 0ms
4 213.239.224.17 7ms 7ms 7ms
5 194.68.123.18 8ms 7ms 7ms
6 213.227.185.32 7ms 7ms 7ms
7 45.90.30.0 7ms 7ms 7ms
Do you want to send this report? [Y/n]: n
I can see the timing reflected in the NextDNS query logs too (my clock is UTC+1).
I have another OPNsense box with dnscrypt-proxy configured identically that works fine, so this seems to be an issue between Hetzner and NextDNS. I first noticed it about two weeks ago, but it might have been happening for longer than that.
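For anyone wanting to pin down a window like this on their own box, here is a minimal sketch of a probe that attempts a TCP connect to port 443 (the same connection the diag tool timed out on) and prints a UTC-stamped result. The function names, the 3 second timeout and the 60 second interval are my own assumptions for illustration, not anything from NextDNS or dnscrypt-proxy.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host: str, port: int = 443, timeout: float = 3.0) -> tuple[bool, float]:
    """Attempt one TCP connect; return (succeeded, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:  # covers timeouts, refused connections and unreachable routes
        ok = False
    return ok, time.monotonic() - start


def format_result(name: str, ok: bool, elapsed: float) -> str:
    """Render one log line with a UTC timestamp so windows are easy to compare."""
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{ts} {name} {'ok' if ok else 'FAIL'} {elapsed * 1000:.1f}ms"


def watch(targets: dict[str, str], rounds: int, interval: float = 60.0) -> None:
    """Probe every target once per interval, for a fixed number of rounds."""
    for _ in range(rounds):
        for name, ip in targets.items():
            ok, elapsed = probe(ip)
            print(format_result(name, ok, elapsed), flush=True)
        time.sleep(interval)
```

Calling `watch({"anycast-primary": "45.90.28.0", "anycast-secondary": "45.90.30.0"}, rounds=1440)` would cover roughly a day at one probe per minute, which should be enough to see whether the failures line up with the same 30 minutes each time.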
Any help would be appreciated.
1 reply
Figured I'd update you lot after all the helpful replies I got.
Further research indicated that sdns stamps have the IP hardcoded in them, which is why dnscrypt-proxy wouldn't fail over to the secondary. The easiest "solution" was to paste the NextDNS-generated stamp into https://dnscrypt.info/stamps/, change the IP to the secondary, copy the new stamp, and add it as an additional server in dnscrypt-proxy.
Effectively, I could fail over to the more reliable secondary when the primary had its daily meltdown. Not great, but in the absence of help from NextDNS themselves I had no choice.
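For anyone who'd rather script the stamp edit than use the web form, below is a rough sketch of decoding a DoH stamp, swapping the address, and re-encoding it. It follows my reading of the published stamp layout (protocol byte 0x02, 8-byte little-endian props, then length-prefixed address, hashes, hostname and path, with optional bootstrap IPs), so treat it as an illustration rather than a reference implementation. The hostname `dns.example.test` and path `/abc123` are made-up placeholders, not a real NextDNS configuration.

```python
import base64
import struct

DOH_PROTOCOL = 0x02  # protocol identifier for DNS-over-HTTPS stamps


def _lp(data: bytes) -> bytes:
    """Length-prefixed field: one length byte, then the data."""
    return bytes([len(data)]) + data


def _vlp(items: list[bytes]) -> bytes:
    """Set of length-prefixed fields; the high bit means 'more items follow'."""
    items = items or [b""]  # an empty set is encoded as a single zero length
    out = bytearray()
    for i, item in enumerate(items):
        more = 0x80 if i < len(items) - 1 else 0x00
        out += bytes([len(item) | more]) + item
    return bytes(out)


def _read_lp(buf: bytes, pos: int) -> tuple[bytes, int]:
    n = buf[pos]
    return buf[pos + 1 : pos + 1 + n], pos + 1 + n


def _read_vlp(buf: bytes, pos: int) -> tuple[list[bytes], int]:
    items = []
    while True:
        head = buf[pos]
        n = head & 0x7F
        items.append(buf[pos + 1 : pos + 1 + n])
        pos += 1 + n
        if not head & 0x80:  # high bit clear: this was the last item
            return items, pos


def decode_doh_stamp(stamp: str) -> dict:
    if not stamp.startswith("sdns://"):
        raise ValueError("not an sdns stamp")
    payload = stamp[len("sdns://"):]
    # Stamps use URL-safe base64 without padding, so restore it before decoding.
    raw = base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4))
    if raw[0] != DOH_PROTOCOL:
        raise ValueError("not a DoH stamp")
    (props,) = struct.unpack_from("<Q", raw, 1)
    addr, pos = _read_lp(raw, 9)
    hashes, pos = _read_vlp(raw, pos)
    host, pos = _read_lp(raw, pos)
    path, pos = _read_lp(raw, pos)
    bootstrap: list[bytes] = []
    if pos < len(raw):  # bootstrap IPs are optional
        bootstrap, pos = _read_vlp(raw, pos)
    return {"props": props, "addr": addr.decode(), "hashes": hashes,
            "host": host.decode(), "path": path.decode(), "bootstrap": bootstrap}


def encode_doh_stamp(fields: dict) -> str:
    raw = bytes([DOH_PROTOCOL]) + struct.pack("<Q", fields["props"])
    raw += _lp(fields["addr"].encode()) + _vlp(fields["hashes"])
    raw += _lp(fields["host"].encode()) + _lp(fields["path"].encode())
    if fields["bootstrap"]:
        raw += _vlp(fields["bootstrap"])
    return "sdns://" + base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def with_address(stamp: str, new_addr: str) -> str:
    """Return a copy of the stamp that points at a different IP address."""
    fields = decode_doh_stamp(stamp)
    fields["addr"] = new_addr
    return encode_doh_stamp(fields)


# Placeholder stamp pinned to the anycast primary, then re-pointed at the secondary.
primary = encode_doh_stamp({"props": 0, "addr": "45.90.28.0", "hashes": [],
                            "host": "dns.example.test", "path": "/abc123",
                            "bootstrap": []})
secondary = with_address(primary, "45.90.30.0")
print(decode_doh_stamp(secondary)["addr"])
```

Only the address field changes between the two stamps; the hostname and path stay the same, which is why both stamps still reach the same configuration.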
While this should theoretically have worked, the timeouts caused another issue: dnscrypt-proxy would run out of client connections (capped at 250!). I think this was because dnscrypt-proxy didn't switch over to the secondary quickly enough, leaving lots of long-lived connections waiting on timeouts.
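A back-of-envelope estimate shows how little traffic it takes to hit that cap. This is a sketch using Little's law (in-flight queries = arrival rate x time each query is held), under two assumptions of mine: every query sent to the dead primary holds a client slot for the full timeout, and that timeout is 5 seconds. I didn't measure either, so the numbers are illustrative only.

```python
def in_flight(queries_per_second: float, hold_seconds: float) -> float:
    """Little's law: average number of queries held open at once."""
    return queries_per_second * hold_seconds


def saturation_rate(max_clients: int, hold_seconds: float) -> float:
    """Query rate at which the in-flight count reaches the client cap."""
    return max_clients / hold_seconds


MAX_CLIENTS = 250       # the cap mentioned above
ASSUMED_TIMEOUT = 5.0   # seconds; an assumption, not a measured value

print(saturation_rate(MAX_CLIENTS, ASSUMED_TIMEOUT))
print(in_flight(60, ASSUMED_TIMEOUT))
```

Under those assumptions, a router forwarding around 50 queries per second would exhaust all 250 slots while the primary is unreachable, and retries from clients would only push the rate higher.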
In light of how unbelievably annoying this was getting, and given that my patience had worn razor thin by this point, I ended up changing to another provider and no longer use NextDNS on the Hetzner box. No issues since.