
Anycast primary IP timing out 16:38 - 17:08 UTC every day

As the title says, the anycast primary IP (45.90.28.0) times out for me for 30 minutes every day, from 16:38 to 17:08 UTC. The network otherwise functions fine, but dnscrypt-proxy reports a timeout if I restart the service.

The affected client is an OPNsense router running in Hetzner's Finland datacenter.

This is the output of the diag tool while the issue was occurring:

Welcome to NextDNS network diagnostic tool.

This tool will download a small binary to capture latency and routing information
regarding the connectivity of your network with NextDNS. In order to perform a
traceroute, root permission is required. You may therefore be asked to provide
your password for sudo.

The source code of this tool is available at https://github.com/nextdns/diag

Do you want to continue? (press enter to accept)
Testing IPv6 connectivity
  available: false
Fetching https://test.nextdns.io
  status: unconfigured
  client: 135.181.137.104
  resolver: 162.158.180.135
Fetching PoP name for ultra low latency primary IPv4 (ipv4.dns1.nextdns.io)
  hetzner-hel: 527µs
Fetching PoP name for ultra low latency secondary IPv4 (ipv4.dns2.nextdns.io)
  tavu-hel: 6.907ms
Fetching PoP name for anycast primary IPv4 (45.90.28.0)
Fetch error: Get "https://dns.nextdns.io/info": dial tcp 45.90.28.0:443: connect: connection timed out
Fetching PoP name for anycast secondary IPv4 (45.90.30.0)
  anexia-sto: 7.1ms
Pinging PoPs
  hetzner-hel: 621µs
  anexia-sto: 6.469ms
  tavu-hel: 9ms
  wavecom-tll: 18.058ms
  zepto-sto: 31.133ms
  zepto-osl: 33.591ms
  edis-vno: 57.505ms
  melbicom-vno: 57.494ms
  edis-rix: 61.965ms
  melbicom-rix: 62ms
Traceroute for ultra low latency primary IPv4 (135.181.102.167)
    1       10.7.2.1    0ms   0ms   0ms
    2   100.80.104.1    0ms   0ms   0ms
    3 213.239.224.129    2ms   0ms   0ms
    4  88.198.249.94    0ms   0ms   0ms
    5                   *     *     *
    6 95.216.129.231    0ms   0ms   0ms
    7 135.181.102.167    0ms   0ms   0ms
Traceroute for ultra low latency secondary IPv4 (185.87.111.218)
    1       10.7.2.1    0ms   0ms   0ms
    2   100.80.104.1    0ms   0ms   0ms
    3 213.239.224.125    0ms   0ms   0ms
    4 213.239.224.37    0ms   0ms   0ms
    5 193.110.226.20    1ms   1ms   1ms
    6  217.78.199.68    6ms   6ms   6ms
    7 62.115.177.183    7ms   7ms   7ms
    8   95.175.96.33    6ms   6ms   6ms
    9  95.175.105.70    6ms   6ms   6ms
   10 185.87.111.218    6ms   6ms   6ms
Traceroute for anycast primary IPv4 (45.90.28.0)
    1       10.7.2.1    0ms   0ms   0ms
    2   100.80.104.1    0ms   0ms   0ms
    3 213.239.224.125    0ms   0ms   0ms
    4 213.239.224.37    0ms   0ms   0ms
    5 109.239.137.157    6ms   6ms   6ms
    6    45.131.68.6    6ms   6ms   6ms
    7                   *     *     *
    8                   *     *     *
    9                   *     *     *
   10                   *     *     *
   11                   *     *     *
   12                   *     *     *
   13                   *     *     *
   14                   *     *     *
   15                   *     *     *
   16                   *     *     *
   17                   *     *     *
   18                   *     *     *
   19                   *     *     *
   20                   *     *     *
Traceroute for anycast secondary IPv4 (45.90.30.0)
    1       10.7.2.1    0ms   0ms   0ms
    2   100.80.104.1    0ms   0ms   0ms
    3 213.239.224.125    2ms   0ms   0ms
    4 213.239.224.17    7ms   7ms   7ms
    5  194.68.123.18    8ms   7ms   7ms
    6 213.227.185.32    7ms   7ms   7ms
    7     45.90.30.0    7ms   7ms   7ms
Do you want to send this report? [Y/n]: n

I re-ran the tool later and the output is basically the same, except that it didn't time out on the anycast primary. The status also shows as configured now, because during the first run I had to use a third-party resolver due to this issue.

Welcome to NextDNS network diagnostic tool.

This tool will download a small binary to capture latency and routing information
regarding the connectivity of your network with NextDNS. In order to perform a
traceroute, root permission is required. You may therefore be asked to provide
your password for sudo.

The source code of this tool is available at https://github.com/nextdns/diag

Do you want to continue? (press enter to accept)
Testing IPv6 connectivity
  available: false
Fetching https://test.nextdns.io
  status: ok
  client: 135.181.137.104
  protocol: DOH
  dest IP: 45.90.28.0
  server: zepto-led-1
Fetching PoP name for ultra low latency primary IPv4 (ipv4.dns1.nextdns.io)
  hetzner-hel: 576µs
Fetching PoP name for ultra low latency secondary IPv4 (ipv4.dns2.nextdns.io)
  tavu-hel: 6.838ms
Fetching PoP name for anycast primary IPv4 (45.90.28.0)
  zepto-led: 6.914ms
Fetching PoP name for anycast secondary IPv4 (45.90.30.0)
  anexia-sto: 7.092ms
Pinging PoPs
  hetzner-hel: 466µs
  anexia-sto: 6.125ms
  tavu-hel: 6ms
  wavecom-tll: 18.055ms
  zepto-sto: 45.352ms
  zepto-osl: 47.068ms
  edis-vno: 57.507ms
  melbicom-vno: 57.435ms
  melbicom-rix: 62ms
  edis-rix: 63.523ms
Traceroute for ultra low latency primary IPv4 (135.181.102.167)
    1       10.7.2.1    0ms   0ms   0ms
    2   100.80.104.1    0ms   0ms   0ms
    3 213.239.224.129    2ms   2ms   0ms
    4  88.198.249.94    9ms  30ms   6ms
    5                   *     *     *
    6 95.216.129.231    0ms   0ms   0ms
    7 135.181.102.167    0ms   0ms   0ms
Traceroute for ultra low latency secondary IPv4 (185.87.111.218)
    1       10.7.2.1    0ms   0ms   0ms
    2   100.80.104.1    0ms   0ms   0ms
    3 213.239.224.125    0ms   0ms   0ms
    4 213.239.224.37    0ms   0ms   0ms
    5 193.110.226.20    1ms   1ms   1ms
    6  217.78.199.68    6ms   6ms   6ms
    7 62.115.177.183    7ms   7ms   7ms
    8   95.175.96.33    6ms   6ms   6ms
    9  95.175.105.70    6ms   6ms   6ms
   10 185.87.111.218    6ms   6ms   6ms
Traceroute for anycast primary IPv4 (45.90.28.0)
    1       10.7.2.1    0ms   0ms   0ms
    2   100.80.104.1    0ms   0ms   0ms
    3 213.239.224.125    0ms   0ms   0ms
    4 213.239.224.37    0ms   0ms   0ms
    5 109.239.137.157    6ms   6ms   6ms
    6    45.131.68.6    6ms   6ms   6ms
    7     45.90.28.0    6ms   6ms   6ms
Traceroute for anycast secondary IPv4 (45.90.30.0)
    1       10.7.2.1    0ms   0ms   0ms
    2   100.80.104.1    0ms   0ms   0ms
    3 213.239.224.125    0ms   0ms   0ms
    4 213.239.224.17    7ms   7ms   7ms
    5  194.68.123.18    8ms   7ms   7ms
    6 213.227.185.32    7ms   7ms   7ms
    7     45.90.30.0    7ms   7ms   7ms
Do you want to send this report? [Y/n]: n

I can see the timing reflected in the NextDNS query logs too (my clock is UTC+1).

 

I have another OPNsense box with an identical dnscrypt-proxy configuration that works fine, so this seems to be an issue between Hetzner and NextDNS. I first noticed it about 2 weeks ago, but it might have been happening for longer than that.
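
For anyone wanting to pin down a daily window like this on their own box, here is a minimal Python sketch that times a TCP connect to the anycast IP on port 443 and tags each sample by whether it falls inside the 16:38 - 17:08 UTC window. The function names (`probe_tcp`, `in_daily_window`, `format_sample`, `monitor`) are made up for this example and are not part of the NextDNS diag tool or dnscrypt-proxy.

```python
import socket
import time
from datetime import datetime, time as dtime, timezone
from typing import Optional

# Window taken from the post: 16:38 - 17:08 UTC.
WINDOW_START = dtime(16, 38)
WINDOW_END = dtime(17, 8)


def probe_tcp(host: str, port: int = 443, timeout: float = 3.0) -> Optional[float]:
    """Return the TCP connect time in seconds, or None if the connect fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # covers timeouts, refused connections, unreachable hosts
        return None


def in_daily_window(moment: datetime, start: dtime = WINDOW_START,
                    end: dtime = WINDOW_END) -> bool:
    """True if a timezone-aware `moment` falls in the daily [start, end) UTC window."""
    utc_time = moment.astimezone(timezone.utc).time()
    return start <= utc_time < end


def format_sample(host: str, moment: datetime, latency: Optional[float]) -> str:
    """Render one probe result as a single log line."""
    result = "TIMEOUT" if latency is None else "%.1fms" % (latency * 1000)
    tag = "in-window" if in_daily_window(moment) else "outside"
    return "%s %s %s %s" % (moment.astimezone(timezone.utc).isoformat(), host, result, tag)


def monitor(host: str, samples: int, interval: float = 60.0) -> None:
    """Probe `host` `samples` times, sleeping `interval` seconds between probes."""
    for _ in range(samples):
        now = datetime.now(timezone.utc)
        print(format_sample(host, now, probe_tcp(host)))
        time.sleep(interval)
```

Running something like `monitor("45.90.28.0", samples=1440)` would cover roughly a day at one probe per minute. A run of TIMEOUT lines tagged in-window on several consecutive days would back up the pattern independently of dnscrypt-proxy.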

Any help would be appreciated.

1 reply

    • William_Foster
    • 2 yrs ago

    Figured I'd update you lot after all the helpful replies I got.

    Further research indicated that sdns stamps have the IP hardcoded in them, which is why dnscrypt-proxy wouldn't fail over to the secondary. The easiest "solution" was to take the NextDNS-generated stamp, paste it into https://dnscrypt.info/stamps/, update the IP to the secondary, then copy the new stamp and add it as a server in dnscrypt-proxy.
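
The same IP swap can be sketched in Python instead of the web tool. This assumes the layout in the public DNS Stamps specification, where the base64url payload is a protocol byte, 8 bytes of properties, then a length-prefixed address string. Everything after the address is copied through untouched. The function names are made up for this example and are not a dnscrypt-proxy API.

```python
import base64

PREFIX = "sdns://"
ADDR_LEN_OFFSET = 9  # 1 protocol byte + 8 property bytes precede the address length


def decode_stamp(stamp: str) -> bytes:
    """Return the raw bytes carried by an sdns:// stamp."""
    if not stamp.startswith(PREFIX):
        raise ValueError("not an sdns:// stamp")
    payload = stamp[len(PREFIX):]
    payload += "=" * (-len(payload) % 4)  # stamps omit base64 padding
    raw = base64.urlsafe_b64decode(payload)
    if len(raw) <= ADDR_LEN_OFFSET or len(raw) < ADDR_LEN_OFFSET + 1 + raw[ADDR_LEN_OFFSET]:
        raise ValueError("stamp too short to contain an address field")
    return raw


def stamp_address(stamp: str) -> str:
    """Return the address string embedded in the stamp."""
    raw = decode_stamp(stamp)
    length = raw[ADDR_LEN_OFFSET]
    return raw[ADDR_LEN_OFFSET + 1:ADDR_LEN_OFFSET + 1 + length].decode("ascii")


def replace_stamp_address(stamp: str, new_addr: str) -> str:
    """Return a copy of the stamp with only the embedded address replaced."""
    raw = decode_stamp(stamp)
    old_length = raw[ADDR_LEN_OFFSET]
    addr = new_addr.encode("ascii")
    if len(addr) > 255:
        raise ValueError("address does not fit in a one-byte length prefix")
    rebuilt = (
        raw[:ADDR_LEN_OFFSET]
        + bytes([len(addr)])
        + addr
        + raw[ADDR_LEN_OFFSET + 1 + old_length:]  # hashes, hostname, path: unchanged
    )
    return PREFIX + base64.urlsafe_b64encode(rebuilt).decode("ascii").rstrip("=")
```

Calling `replace_stamp_address(stamp, "45.90.30.0")` on the NextDNS-generated stamp should give a second stamp pointing at the anycast secondary. It is worth checking the result with `stamp_address` or the web tool before adding it to dnscrypt-proxy.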

    Effectively, I could fail over to the more reliable secondary when the primary had its daily meltdown. Not great, but in the absence of help from NextDNS themselves I had no choice.

    While this theoretically should've worked alright, the timeouts seemed to cause another issue in dnscrypt-proxy: it would run out of client connections (capped at 250!). I think this was because dnscrypt-proxy didn't switch over to the secondary quickly enough, leaving lots of longer-lived connections waiting on timeouts.
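
A rough back-of-the-envelope model shows how quickly a 250-connection cap can fill up. It assumes every query sent to the dead primary holds a client slot until it times out, and it uses a 5-second timeout purely as an example value (check the `timeout` setting in your own config). Under those assumptions the number of stalled connections is about the query rate times the timeout (Little's law). The function names are made up for this sketch.

```python
def stalled_connections(query_rate: float, timeout_s: float) -> float:
    """Approximate number of connections stuck waiting, per Little's law."""
    if query_rate < 0 or timeout_s < 0:
        raise ValueError("query_rate and timeout_s must be non-negative")
    return query_rate * timeout_s


def saturating_query_rate(max_clients: int, timeout_s: float) -> float:
    """Queries per second at which stalled queries alone fill every client slot."""
    if max_clients <= 0 or timeout_s <= 0:
        raise ValueError("max_clients and timeout_s must be positive")
    return max_clients / timeout_s


# Example values: the 250 cap from the post, and an assumed 5-second timeout.
print(saturating_query_rate(250, 5.0))  # queries/second needed to hit the cap
print(stalled_connections(20, 5.0))     # slots held at an assumed 20 queries/second
```

With these example numbers, a sustained 50 queries per second would be enough to fill all 250 slots, which a router serving a busy network could plausibly reach. A shorter timeout lowers the number of stalled slots proportionally.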

    In light of how unbelievably annoying this was getting and given my patience was worn razor thin at this point, I ended up changing to another provider and I don't use NextDNS anymore on the Hetzner box. No issues since.
