Short-term outages with DoT

I have set up the "DNS-over-TLS/QUIC" address on my Android smartphone as "private DNS" and also in the router for my home network. In general the service works very well, but several times a day I get short outages of up to 30 seconds on all end devices, both at home and on mobile, during which no address can be resolved.

The problem apparently only occurs with the natively supported protocol DoT. If I only enter the unencrypted IPv6 DNS server in the router, I have not observed any comparable failures so far.

https://nextdns.io/diag/6bddc480-68e8-11ec-987f-9119e8b922f7

  • I have the exact same issue when using DoT on my router (a FRITZ!Box). However, I found six-month-old threads here from users reporting the same issue with different brands, and from other FRITZ!Box users as well.

    Unfortunately, it seems that this issue has not been resolved yet. I am not very happy about that :(

  • The problem also exists via smartphone in the mobile network - so I can rule out the router as the sole source of the error.

    As a temporary measure, I now use the DoH entry on the systems that natively support it (Chrome, Windows 11). This way, uninterrupted web browsing is at least possible. However, I do not consider this setup ideal. I do not want to install additional apps, and they are not available everywhere (e.g. on the router).

      • JCVR
      • jcvr
      • 2 wk ago

      @Koboldchen I absolutely agree. I have disabled DoT on my router as a "bugfix". I will check every now and then whether it works as intended. If it's not getting better, I will unfortunately have to switch to a different DNS provider. I feel like this problem has not been acknowledged by NextDNS so far, despite several users reporting it. I have used several other DNS providers with DoT in the past and never had any issues.

  • I've got the same issue with router-based DNS-over-TLS. ping.nextdns.io shows the anycast servers as if they were dead.

      • JCVR
      • jcvr
      • 2 wk ago

      @A B I just tested a bit more and found that it usually takes about 30 seconds to 1 minute before the service runs normally again. It also seems that if I disable and re-enable DoT in the router, it works again immediately. It is probably as you said: the servers do not respond until a timeout occurs or a completely new connection is established.

      I do not have any network traces etc. to back this up.
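
For anyone who wants to gather evidence for the timeout theory above, here is a minimal Python sketch that sends one DNS query over a fresh DoT connection and times it. It assumes the standard DoT port 853 and the two-byte length framing used for DNS over TCP/TLS. The function names (`build_query`, `frame_for_tcp`, `probe_dot`) are made up for this example, and it has not been run against NextDNS.

```python
import socket
import ssl
import struct
import time


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix the message with its two-byte length, as DNS over TCP/TLS requires."""
    return struct.pack("!H", len(message)) + message


def probe_dot(server: str, name: str, timeout: float = 5.0) -> float:
    """Send one query over a new TLS connection to port 853; return seconds taken.

    Raises an exception (timeout, TLS error, closed connection) on failure.
    """
    context = ssl.create_default_context()
    started = time.monotonic()
    with socket.create_connection((server, 853), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=server) as tls:
            tls.sendall(frame_for_tcp(build_query(name)))
            prefix = b""
            while len(prefix) < 2:  # wait for the length prefix of the reply
                chunk = tls.recv(2 - len(prefix))
                if not chunk:
                    raise ConnectionError("server closed the connection without answering")
                prefix += chunk
    return time.monotonic() - started
```

Calling something like `probe_dot("anycast.dns.nextdns.io", "example.com")` in a loop and logging the timings or exceptions would show whether brand-new connections succeed while the router's existing one hangs. The hostname is just the one mentioned elsewhere in this thread; substitute your own DoT hostname.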

  • I have a Netgate pfSense router experiencing the same issues with DoT configs, except my outages last longer, sometimes up to 10 minutes.

    Not only is my router unable to resolve addresses, the two iPhones I have in the house that are pointed at the NextDNS DoT servers were not resolving either.

    After complaints from family members about TVs, laptops, etc. not working, I had to revert to the plain-text configs.

    Also, this isn’t a new issue. It appears often and it’s quite annoying.

      • JCVR
      • jcvr
      • 2 wk ago

      @user1 Can you confirm that DoT works for you with different providers? I have used DoT for a long time and never had problems like that with other DNS providers.

      • user1
      • 2 wk ago

      @JCVR I cannot confirm that yet, but I will. I’ve just been lazy and put all my eggs in the NextDNS basket, trusting it would work. Perhaps I’ll see the same issue with other providers, I don’t know, but I’ll switch to AdGuard to compare stability.

  • The last time this happened, it was an issue with Stubby not gracefully handling slow handshakes with NextDNS. That was supposed to be fixed, but I do know there was an issue filed on the Stubby GitHub regarding this as well.

    TLS Connection Failures - Stubby - Bug Reports - NextDNS Help Center

  • I have now tested the personal AdGuard DNS beta intensively for two days. No timeouts via DoT so far, neither natively on my Android smartphone nor via my router. Name resolution also seems a little bit faster to me, but that may just be my imagination.

    The range of functions of AdGuard DNS is still quite limited. The interface is also not as innovative and clear as NextDNS. Yet!

    So @NextDNS, use your advantage over the upcoming competition and finally fix the problems with DoT! :-) I would like to stay here as a paying customer.

    • @Koboldchen can you please try with anycast.dns.nextdns.io and tell us if you can reproduce the issue?

      • Koboldchen
      • 7 days ago

      @NextDNS Yes, it happened again just now on my smartphone at 8:37:04 a.m. CET while accessing golem.de from IP 109.250.66.1 / 2001:9e8:2706:3800:b9f2:5812:d8b4:68d5.

    • @Koboldchen we will deploy a tentative fix today, I’ll keep you posted.

      • JCVR
      • jcvr
      • 7 days ago

      @NextDNS Is it likely that this fix will be a general fix for DoT/DoH? I (among others) also had temporary issues with DoT on Android and on home routers.

    • @JCVR there is no known issue with DoH. This is a potential fix for DoT, although we couldn’t reproduce the issue, so we need field testing to confirm.

      • JCVR
      • jcvr
      • 7 days ago

      @NextDNS Since I also had the issue with DoT, I will be happy to test the fix as soon as it has been applied. I am also not able to reproduce it on demand; it is the worst kind of IT problem, a sporadic one. But it happened often enough to be noticeable daily and annoying.

    • @JCVR it's fully deployed, please tell us if you see improvement or not.

      • Koboldchen
      • 6 days ago

      @NextDNS Thank you very much! I'll check it out and get back to you.

      • Pro
      • Pro.1
      • 6 days ago

      @NextDNS Everything is the same as before.

      I just could not open the sites; after 3 seconds it worked. Then the same happened with other sites.

      • BS
      • teal_rabbit
      • 6 days ago

      @NextDNS Not the OP, but I am still having timeout issues using DoT on ASUS-Merlin.

      • JCVR
      • jcvr
      • 6 days ago

      @NextDNS Thank you! I tested today, and a few minutes ago it started happening again. My setup is not perfect, but I opened 6 PowerShell windows and ran a resolution for a different domain in each every 3 seconds. It turned out that the resolution of nextdns.io failed first, and after a while the resolution of one other domain failed as well (which is expected when you are unable to resolve the DoT hostname).

      Perhaps this already helps a bit to narrow down the issue? Maybe the resolution of nextdns.io fails for whatever reason?

    • Thanks for testing. Could you please now test with the following hostnames *one at a time* and report whether either of them fixes the issue:

      • Koboldchen
      • 6 days ago

      @NextDNS I can also confirm that yesterday's fix didn't solve the problem. Will try the anycast addresses now.

      • JCVR
      • jcvr
      • 6 days ago

      @NextDNS I can already tell that ipv4-anycast.dns1.nextdns.io had the same issue. In this case resolving nextdns.io worked, but resolving ipv4-anycast.dns1.nextdns.io stopped working. For some reason not all resolutions fail together; this might be due to cached data, but I am not sure. I will now continue testing with ipv4-anycast.dns2.nextdns.io.

      • JCVR
      • jcvr
      • 6 days ago

      @NextDNS Unfortunately the same happened with dns2. This time the outage started with a different domain but quickly spread to one of the nextdns subdomains. I am still not 100% sure about the true nature of the issue, but could it be that the resolution of the nextdns domains / subdomains is not working properly, which causes the other resolutions to fail as well? But why would it fail? It's truly a mystery to me, but then again I am no DNS admin ;)

      • JCVR
      • jcvr
      • 6 days ago

      If someone wants to run the same test, you can use this PowerShell command:

       $domain = "nextdns.io" ; do { Resolve-DnsName $domain ; $c = 0 ; do { Start-Sleep 1 ; $c++ ; Write-Host -ForegroundColor Cyan "." -NoNewline } until ($c -ge 5) ; "`n" } while ($true)

      Just change the $domain variable and run it in several PowerShell windows with different domains.
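
A rough Python equivalent of the loop above, for machines without PowerShell. This is only a sketch: it goes through the operating system's resolver via `socket.getaddrinfo`, so local caching may hide short outages, and the helper names `check_once` and `monitor` are invented for this example.

```python
import socket
import time
from datetime import datetime


def check_once(domain, resolver=socket.getaddrinfo):
    """Resolve one name; return (ok, seconds_taken, first_address_or_error)."""
    started = time.monotonic()
    try:
        results = resolver(domain, None)
        detail = results[0][4][0]  # address part of the first result
        ok = True
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        detail = str(exc)
        ok = False
    return ok, time.monotonic() - started, detail


def monitor(domain, interval=5.0, rounds=None,
            resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve `domain` every `interval` seconds and print one line per attempt.

    Runs forever when `rounds` is None; otherwise returns the failure count.
    """
    failures = 0
    done = 0
    while rounds is None or done < rounds:
        ok, seconds, detail = check_once(domain, resolver)
        stamp = datetime.now().isoformat(timespec="seconds")
        status = "ok" if ok else "FAIL"
        print(f"{stamp} {domain} {status} {seconds:.2f}s {detail}")
        if not ok:
            failures += 1
        done += 1
        sleep(interval)
    return failures
```

Running `monitor("nextdns.io")` in one terminal and the same call with other domains in further terminals gives timestamped lines that can be compared afterwards to see which name failed first.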

      • Koboldchen
      • 2 days ago

      @NextDNS Any news here? As JCVR has already confirmed, there are still intermittent resolution issues with the alternative addresses.

      • JCVR
      • jcvr
      • 2 days ago

      @NextDNS I also think that this issue has been well documented for a while now and needs to be addressed by you quickly, as it renders your service virtually unusable.

      It would be great to get a commitment and updates on the matter from you. I think your service has great potential if this is fixed.

    • @JCVR we just pushed another tentative fix, please report if it does change anything.

  • Using AdGuardHome and forwarding upstream to the dot://url or doh://url, then checking the hit counter in NextDNS Analytics, I found that DoH was used (almost) 3x as often as DoT.

    Unbound and Core can use DoT as a recursive forwarder, and dnscrypt-proxy can only do DoH.

    I've used clash (proxy), which supports DoT and DoH directly, but I have no way to prove how it goes from one to the other. I was also trying to name the nextdns-cli to see how that appears in the logs, but it just seems to show up under the IP address of the WAN connection it is coming from.

    I can't seem to name that endpoint like you can with dot://dot-custom-name-123abc.dns.nextdns.io or doh://dns.nextdns.io/123abc/doh-custom-name

    I'm not sure if AdGuardHome had problems with DoT and that is why the count was uneven.

     

    My 0.02
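
As a small illustration of the named-endpoint pattern quoted in this post, here is a Python sketch that builds both forms from a configuration ID and a device name. The function name `named_endpoints` is hypothetical, the pattern is taken only from the two example addresses above, and the real DoH URL is assumed to use the `https://` scheme rather than the `doh://` shorthand.

```python
import re


def named_endpoints(config_id: str, device_name: str) -> dict:
    """Build DoT and DoH endpoints that carry a device name.

    Pattern assumed from the examples in this thread:
      DoT: <device-name>-<config-id>.dns.nextdns.io
      DoH: https://dns.nextdns.io/<config-id>/<device-name>
    """
    # Keep to letters, digits and single hyphens so the name is a valid DNS label part.
    if not re.fullmatch(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", device_name):
        raise ValueError("device name may only contain letters, digits and hyphens")
    label = f"{device_name}-{config_id}"
    if len(label) > 63:  # a single DNS label is limited to 63 characters
        raise ValueError("device name is too long for a DNS label")
    return {
        "dot": f"{label}.dns.nextdns.io",
        "doh": f"https://dns.nextdns.io/{config_id}/{device_name}",
    }
```

With the placeholder ID from the post, `named_endpoints("123abc", "dot-custom-name")["dot"]` gives `dot-custom-name-123abc.dns.nextdns.io`.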

  • Same issue here with DNS over TLS.

    I have an Asus router with the NextDNS TLS address entered manually. Either it stops working after a reboot or the connection just drops randomly. I also ditched the CLI client because it wouldn't reconnect after a reboot.
