Sunday, 13 December 2009

Introduction to the DNS protocol (part 2)

So far, you have seen what a DNS query to a name server looks like. But this does not explain how the name server of your internet provider manages to find the IP address of "www.google.com", to stay with the same example. As already explained, the DNS protocol is decentralized: when you send a query to your name server, it may have to communicate with one or several other name servers to get the result. Let's see how it works.

When a name server receives a query for the IP address corresponding to a domain name, it first checks whether the domain belongs to one of its own zones. A zone is a domain controlled by a name server, which means the server can give an authoritative answer to all queries related to this domain. If the domain does not belong to a zone owned by the name server, the server has to find out which name server has the information, and then send the query to that server.
In order to do this, the name server reads the domain name from right to left: in our example, the rightmost part of the domain is "com", so the server has to find which name servers in the world are responsible for the "com" domain. This information is held by the "root servers", which know the name servers responsible for every top-level domain, such as "com". There are 13 root server names (from "a" to "m"), with well-known IP addresses, so that every DNS server in the world can be configured with them. You can see the list of root servers with the following command:
# dig -t ns .

;; QUESTION SECTION:
;. IN NS

;; ANSWER SECTION:
. 185710 IN NS a.root-servers.net.
. 185710 IN NS d.root-servers.net.
. 185710 IN NS f.root-servers.net.
. 185710 IN NS b.root-servers.net.
. 185710 IN NS i.root-servers.net.
. 185710 IN NS e.root-servers.net.
. 185710 IN NS h.root-servers.net.
. 185710 IN NS c.root-servers.net.
. 185710 IN NS k.root-servers.net.
. 185710 IN NS m.root-servers.net.
. 185710 IN NS j.root-servers.net.
. 185710 IN NS l.root-servers.net.
. 185710 IN NS g.root-servers.net.

;; ADDITIONAL SECTION:
b.root-servers.net. 3599912 IN A 192.228.79.201
d.root-servers.net. 3599912 IN A 128.8.10.90
k.root-servers.net. 3599912 IN A 193.0.14.129
g.root-servers.net. 3599912 IN A 192.112.36.4
h.root-servers.net. 3599912 IN A 128.63.2.53
c.root-servers.net. 3599912 IN A 192.33.4.12
i.root-servers.net. 3599912 IN A 192.36.148.17
l.root-servers.net. 3599912 IN A 199.7.83.42
m.root-servers.net. 3599912 IN A 202.12.27.33
e.root-servers.net. 3599912 IN A 192.203.230.10
a.root-servers.net. 3599912 IN A 198.41.0.4
j.root-servers.net. 3599912 IN A 192.58.128.30
f.root-servers.net. 3599912 IN A 192.5.5.241
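The right-to-left reading described above can be sketched in a few lines of Python. This is only an illustration: the function name zones_from_root is made up for this example, and a real name server works on the binary form of the name rather than on strings.

```python
def zones_from_root(domain: str) -> list[str]:
    """Return the zones a name server walks through, starting at the root."""
    labels = [label for label in domain.rstrip(".").split(".") if label]
    zones = ["."]  # the root zone is written as a single dot
    # Add one label at a time, from the rightmost label to the leftmost.
    for i in range(len(labels) - 1, -1, -1):
        zones.append(".".join(labels[i:]) + ".")
    return zones


print(zones_from_root("www.google.com"))
# ['.', 'com.', 'google.com.', 'www.google.com.']
```

Each entry in the list is a domain for which the name server may have to find the responsible server, beginning with the root.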
So, your name server already knows the IP addresses of the root servers, and can send a DNS query (an "NS" query, in this case) directly to one of them, to find out which name servers own the "com" domain. You can simulate this DNS query with the dig command, by adding an '@' argument, which specifies the IP address of the name server you want to query. For instance, let's query the root server "A":
# dig -t ns com @198.41.0.4

;; QUESTION SECTION:
;com. IN NS

;; AUTHORITY SECTION:
com. 172800 IN NS K.GTLD-SERVERS.NET.
com. 172800 IN NS I.GTLD-SERVERS.NET.
com. 172800 IN NS H.GTLD-SERVERS.NET.
com. 172800 IN NS G.GTLD-SERVERS.NET.
com. 172800 IN NS F.GTLD-SERVERS.NET.
com. 172800 IN NS A.GTLD-SERVERS.NET.
com. 172800 IN NS D.GTLD-SERVERS.NET.
com. 172800 IN NS E.GTLD-SERVERS.NET.
com. 172800 IN NS B.GTLD-SERVERS.NET.
com. 172800 IN NS J.GTLD-SERVERS.NET.
com. 172800 IN NS C.GTLD-SERVERS.NET.
com. 172800 IN NS L.GTLD-SERVERS.NET.
com. 172800 IN NS M.GTLD-SERVERS.NET.

;; ADDITIONAL SECTION:
A.GTLD-SERVERS.NET. 172800 IN A 192.5.6.30
B.GTLD-SERVERS.NET. 172800 IN A 192.33.14.30
C.GTLD-SERVERS.NET. 172800 IN A 192.26.92.30
D.GTLD-SERVERS.NET. 172800 IN A 192.31.80.30
E.GTLD-SERVERS.NET. 172800 IN A 192.12.94.30
F.GTLD-SERVERS.NET. 172800 IN A 192.35.51.30
G.GTLD-SERVERS.NET. 172800 IN A 192.42.93.30
H.GTLD-SERVERS.NET. 172800 IN A 192.54.112.30
I.GTLD-SERVERS.NET. 172800 IN A 192.43.172.30
J.GTLD-SERVERS.NET. 172800 IN A 192.48.79.30
K.GTLD-SERVERS.NET. 172800 IN A 192.52.178.30
L.GTLD-SERVERS.NET. 172800 IN A 192.41.162.30
M.GTLD-SERVERS.NET. 172800 IN A 192.55.83.30
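Once again, there is a big list of name servers in the reply: these are the 13 name servers responsible for the "com" top-level domain. Note that the list appears in the AUTHORITY section and not in the ANSWER section: the root server is not authoritative for "com" itself, it only refers you to the servers that are.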

Then the same process goes on: your name server can send a query to any of the above servers, for instance 192.5.6.30, to get the address of the name server which owns the domain "google.com". Let's do this with dig:
# dig -t ns google.com @192.5.6.30

;; QUESTION SECTION:
;google.com. IN NS

;; ANSWER SECTION:
google.com. 172800 IN NS ns1.google.com.
google.com. 172800 IN NS ns2.google.com.
google.com. 172800 IN NS ns3.google.com.
google.com. 172800 IN NS ns4.google.com.

;; ADDITIONAL SECTION:
ns1.google.com. 172800 IN A 216.239.32.10
ns2.google.com. 172800 IN A 216.239.34.10
ns3.google.com. 172800 IN A 216.239.36.10
ns4.google.com. 172800 IN A 216.239.38.10
Now, the reply is much smaller, and shows the addresses of the four DNS servers which are authoritative for the zone "google.com".
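Every record line in the replies shown so far has the same five fields: the name, a lifetime in seconds (the TTL), the class ("IN" for internet), the record type, and the data. Here is a minimal Python sketch that splits such a line; the names Record and parse_record are made up for this example, and the parser only handles the simple one-line records printed by dig in this article.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str    # domain name the record is about
    ttl: int     # lifetime of the record, in seconds
    rclass: str  # "IN" for internet
    rtype: str   # "NS", "A", "CNAME", ...
    data: str    # name server, IP address or alias target


def parse_record(line: str) -> Record:
    """Split one record line printed by dig into its five fields."""
    name, ttl, rclass, rtype, data = line.strip().split(maxsplit=4)
    return Record(name, int(ttl), rclass, rtype, data)


record = parse_record("google.com. 172800 IN NS ns1.google.com.")
print(record.rtype, record.data, record.ttl)
# NS ns1.google.com. 172800
```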

Finally, your name server can send a query to one of these servers, to get the IP address of "www.google.com". This time, it is an "A" query and not an "NS" query, but you don't need to pass "-t a" to dig, as this is the default:
# dig www.google.com @216.239.32.10

;; QUESTION SECTION:
;www.google.com. IN A

;; ANSWER SECTION:
www.google.com. 604800 IN CNAME www.l.google.com.
www.l.google.com. 300 IN A 209.85.227.147
www.l.google.com. 300 IN A 209.85.227.103
www.l.google.com. 300 IN A 209.85.227.106
www.l.google.com. 300 IN A 209.85.227.104
www.l.google.com. 300 IN A 209.85.227.105
www.l.google.com. 300 IN A 209.85.227.99
Here the reply shows that www.google.com is actually an alias (a "CNAME") to www.l.google.com, and gives 6 different IP addresses, as already seen in the previous article.
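The whole walk, from the root server down to the Google name server, can be replayed offline. Below is a minimal Python sketch, assuming canned replies copied from the dig transcripts above instead of real network queries. The names REFERRALS, ANSWERS and resolve are made up for this illustration, and only one name server per zone is kept.

```python
# Canned replies copied from the dig transcripts above. A real name server
# would send these queries over the network instead of reading a dict.
REFERRALS = {
    "198.41.0.4": ("com.", "192.5.6.30"),            # root "A" -> A.GTLD-SERVERS.NET
    "192.5.6.30": ("google.com.", "216.239.32.10"),  # gTLD server -> ns1.google.com
}
ANSWERS = {
    "216.239.32.10": {
        "www.google.com.": ("CNAME", ["www.l.google.com."]),
        "www.l.google.com.": ("A", [
            "209.85.227.147", "209.85.227.103", "209.85.227.106",
            "209.85.227.104", "209.85.227.105", "209.85.227.99",
        ]),
    },
}


def resolve(name, server="198.41.0.4"):
    """Follow referrals from the root, then follow CNAME aliases."""
    path = [server]
    while server in REFERRALS:
        zone, next_server = REFERRALS[server]
        if name != zone and not name.endswith("." + zone):
            raise LookupError(f"{server} has no referral for {name}")
        server = next_server
        path.append(server)
    records = ANSWERS[server]
    rtype, data = records[name]
    while rtype == "CNAME":  # an alias: look up the target name instead
        rtype, data = records[data[0]]
    return path, data


path, addresses = resolve("www.google.com.")
print(path)       # ['198.41.0.4', '192.5.6.30', '216.239.32.10']
print(addresses)  # the six addresses of www.l.google.com
```

The path lists the three servers queried in turn, which is exactly the sequence of dig commands run by hand above.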

So, your name server has now learned the IP address of www.google.com, and can reply to your query. By the way, let's see what happens with dig when you query your own DNS server, i.e. the one of your internet provider:
# dig www.google.com

;; QUESTION SECTION:
;www.google.com. IN A

;; ANSWER SECTION:
www.google.com. 471995 IN CNAME www.l.google.com.
www.l.google.com. 95 IN A 209.85.229.106
www.l.google.com. 95 IN A 209.85.229.147
www.l.google.com. 95 IN A 209.85.229.104
www.l.google.com. 95 IN A 209.85.229.103
www.l.google.com. 95 IN A 209.85.229.105
www.l.google.com. 95 IN A 209.85.229.99

;; Query time: 29 msec
;; SERVER: 84.103.237.140#53(84.103.237.140)
;; WHEN: Sun Dec 13 11:20:59 2009
;; MSG SIZE rcvd: 148
The SERVER line shows the IP address of the name server that was queried. As no address was passed to dig, the default name server was used (the one listed in /etc/resolv.conf). The interesting difference compared to the query sent directly to the Google name server is the number in the second column of each record, called the TTL (Time To Live) of the DNS record. The TTL is the number of seconds for which the reply remains valid: if your computer needs the IP address of www.google.com again before the TTL has elapsed, it can simply reuse the address it already knows (this is called caching); once the TTL has elapsed, it has to send another query to the name server. The goal of the TTL mechanism is to reduce the traffic to the name servers, while guaranteeing that the server will be queried again after a given delay, in case the IP address has changed in the meantime. If you send the same query to your name server a few seconds later, you can see that the TTL has decreased:
# dig www.google.com

;; QUESTION SECTION:
;www.google.com. IN A

;; ANSWER SECTION:
www.google.com. 471989 IN CNAME www.l.google.com.
www.l.google.com. 89 IN A 209.85.229.104
www.l.google.com. 89 IN A 209.85.229.99
www.l.google.com. 89 IN A 209.85.229.147
www.l.google.com. 89 IN A 209.85.229.106
www.l.google.com. 89 IN A 209.85.229.105
www.l.google.com. 89 IN A 209.85.229.103
If you look again at the query sent to the Google name server, you can see that the TTL was 300 s for all the www.l.google.com records. It means that every name server which is not authoritative for the google.com domain (for instance the name server of your provider) may keep these records for at most 300 s before querying the Google name server again, and that the TTL it reports in its own replies cannot be higher than 300 s.
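The decreasing TTL can be reproduced with a very small cache. The Python sketch below is an illustration only: the class DnsCache is made up for this example, the clock is passed explicitly as "now" (in seconds) so that the behaviour is easy to follow, and the instants 205 and 211 are hypothetical values chosen to give the TTLs of 95 and 89 seen in the two replies above.

```python
class DnsCache:
    """Tiny cache that forgets a record once its TTL has elapsed."""

    def __init__(self):
        self._entries = {}  # name -> (addresses, expiry time in seconds)

    def put(self, name, addresses, ttl, now):
        """Store a reply received at time `now` with the given TTL."""
        self._entries[name] = (addresses, now + ttl)

    def get(self, name, now):
        """Return (addresses, remaining TTL), or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        addresses, expires_at = entry
        remaining = expires_at - now
        if remaining <= 0:
            del self._entries[name]  # too old: a new query is needed
            return None
        return addresses, remaining


cache = DnsCache()
cache.put("www.l.google.com.", ["209.85.229.106"], ttl=300, now=0)
print(cache.get("www.l.google.com.", now=205))  # (['209.85.229.106'], 95)
print(cache.get("www.l.google.com.", now=211))  # (['209.85.229.106'], 89)
print(cache.get("www.l.google.com.", now=300))  # None
```

The remaining TTL returned by get is what a caching name server would put in its own reply, which is why it can never exceed the original 300 s.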

These notions of caching and TTL are very important, as they explain why different name servers can give different results for the same query at the same time. They also explain why a change of IP address can take some time to propagate to the whole internet: if you own a DNS server and change one of your records, you have to wait for the TTL to elapse before you can be sure that everyone is aware of the change. That's why the TTL should be as low as possible if the IP address changes often, for instance if you use Dynamic DNS, which is a way to have a fixed domain name for your computer even if it is connected to the internet through a cheap connection without a fixed IP address. On the other hand, a low TTL increases the traffic to the DNS servers, so a balance has to be found.
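To get a feeling for this balance, here is a rough back-of-the-envelope calculation in Python. It is a sketch under a simplifying assumption: each caching name server asks again as soon as its copy expires, which gives an upper bound on the traffic. The function name is made up, and the TTL values of 60 s and 3600 s are arbitrary examples; only 300 s comes from the replies above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_refreshes_per_day(ttl_seconds: int) -> int:
    """Upper bound on queries per day from one caching name server."""
    return SECONDS_PER_DAY // ttl_seconds


for ttl in (60, 300, 3600):
    print(f"TTL {ttl} s: a change is seen within {ttl} s, "
          f"at most {max_refreshes_per_day(ttl)} queries per day per cache")
# TTL 60 s: a change is seen within 60 s, at most 1440 queries per day per cache
# TTL 300 s: a change is seen within 300 s, at most 288 queries per day per cache
# TTL 3600 s: a change is seen within 3600 s, at most 24 queries per day per cache
```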

That's all for this introduction, I hope you now understand better how this important protocol works!
