How does NextDNS work behind the scenes?
I know how DNS-based blocking works, but I can't understand how NextDNS lets each user have their own blocklists and still provides the same performance as any other DNS service.
It would be nice if someone from the core team replied to this.
I'm not an employee of theirs or anything, but while each user can choose which lists to enable, those lists come from a preselected collection. It's possible to compile a combined list that maps each deduplicated domain to a set of bit flags, one bit per list in the collection, recording which lists contain that domain. So let's say you have
List A : example.com doubleclick.net
List B : doubleclick.net google.com
The combined blocklist will be :
example.com : 10
doubleclick.net : 11
google.com : 01
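To make that concrete, here is a rough Python sketch of how such a combined list could be built. This is only my illustration, not anything NextDNS has published: the names `LIST_A`, `LIST_B`, and `build_combined` are made up, and the bit values are picked so that List A is the left bit and List B the right bit, as in the example.

```python
# Hypothetical flag values: one bit per source list (List A = left bit, List B = right bit).
LIST_A = 0b10
LIST_B = 0b01

blocklists = {
    LIST_A: ["example.com", "doubleclick.net"],
    LIST_B: ["doubleclick.net", "google.com"],
}


def build_combined(blocklists):
    """Merge all lists into one dict: domain -> OR of the flags of every list containing it."""
    combined = {}
    for flag, domains in blocklists.items():
        for domain in domains:
            # A domain seen in several lists keeps a single entry with several bits set.
            combined[domain] = combined.get(domain, 0) | flag
    return combined


combined = build_combined(blocklists)
for domain, flags in combined.items():
    print(f"{domain} : {flags:02b}")
```

Each domain is stored once no matter how many lists contain it, and adding another list only costs one more bit per entry.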
Then if users have config C that enables both lists, config D that only enables list A, and config E that only enables list B, the config list will be:
config C : 11
config D : 10
config E : 01
It's then trivial to apply a bitwise AND between a domain's entry in the combined list and a config's entry; any non-zero result means the domain is blocked. For example, to check whether config C blocks example.com, compute 11 AND 10, which returns 10, so the domain is blocked. Against config E, 10 AND 01 is 00, which means example.com isn't blocked.
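As a minimal Python sketch of that check (again just an illustration of the idea, with the hypothetical names `COMBINED`, `CONFIGS`, and `is_blocked`, and the values copied from the example above):

```python
# Combined blocklist from the example: domain -> flags (List A = 0b10, List B = 0b01).
COMBINED = {"example.com": 0b10, "doubleclick.net": 0b11, "google.com": 0b01}

# Per-user configs from the example: config -> mask of enabled lists.
CONFIGS = {"C": 0b11, "D": 0b10, "E": 0b01}


def is_blocked(config_id, domain):
    """Return True if any list enabled in the config contains the domain."""
    domain_flags = COMBINED.get(domain, 0)  # domains in no list have no bits set
    return (CONFIGS[config_id] & domain_flags) != 0


print(is_blocked("C", "example.com"))  # config C enables list A, which has example.com
print(is_blocked("E", "example.com"))  # config E only enables list B
```

The per-query work is one dictionary lookup and one AND, regardless of how many users or configs exist.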
A hash lookup takes constant time on average, and so does the bitwise comparison, so CPU-wise this solution scales. It scales storage-wise too, since most lists would have significant overlap anyway (in the example, the combined blocklist has only three entries instead of 2 + 2 = 4).
I'm sure NextDNS implements an even more sophisticated algorithm than this, but as you can see, even with a basic DB you can get O(1) lookups.