Address Resolution (DNS)
When you set up the address struct for a socket, it requires an IP address. But how often do you actually input an IP address? Or even know what the IP address of the server is? Probably almost never. Instead, you enter a domain or host name and somehow the network app (eg, your web browser) is happy with that.
That "somehow" is the Domain Name System (DNS). It's a hierarchical network of servers that collectively holds the list of all the hosts in all the domains and their IP addresses. When you need the IP address for a host name, your application sends a DNS request to its DNS servers (the ones in the TCP/IP properties page for your network connection -- if you're a DHCP client of your ISP, then DHCP has filled in that information for you, along with your IP address). If those servers don't have the information, then they'll ask other DNS servers up the hierarchy until one of them does know and responds with the IP address. Your DNS server will then respond to you, plus it will cache that information locally for a limited amount of time, so that if you ask for the same host name again it will be able to respond immediately. To see why caching is a good idea, consider visiting a web site: not only will you request several pages from that server, but each page could also generate requests for several files (eg, each graphic on the page is a separate file and hence a separate file request).
The whole subject of how DNS works and how DNS servers are set up is a rather involved topic to which whole chapters and more are devoted in textbooks. If you want to know more, then refer to Wikipedia's article on DNS for a better and more complete explanation than I could provide.
But you don't need to know how DNS works in order to use it; you just need to know how to use it. And in order to use DNS within our applications, we need to know what functions to call and what to do with the data in the struct that they return.
It's fairly simple. There are just two functions (copied and edited from Visual C++ documentation):
    struct hostent* gethostbyname (const char* name);
name -- A pointer to the null-terminated name of the host to resolve.
Remarks
The gethostbyname function returns a pointer to a HOSTENT structure, a static structure allocated by the operating system. The HOSTENT structure contains the results of a successful search for the host specified in the name parameter.
The application must never attempt to modify this structure or to free any of its components. Furthermore, only one copy of this structure is allocated per thread, so the application should copy any information it needs before issuing any other sockets function calls.
The gethostbyname function cannot resolve IP address strings passed to it. Such a request is treated exactly as if an unknown host name were passed. Use inet_addr to convert an IP address string to an actual IP address, then use another function, gethostbyaddr, to obtain the contents of the HOSTENT structure.
The gethostbyname function resolves the string returned by a successful call to gethostname.
Return Values
If no error occurs, gethostbyname returns a pointer to the HOSTENT structure described above. Otherwise, it returns a NULL pointer and a specific error number can be retrieved by calling the appropriate error functions (eg, herror in UNIX/Linux, where the error is reported through h_errno rather than errno; WSAGetLastError in Winsock).
    struct hostent* gethostbyaddr (const char* addr, int len, int type);
addr -- A pointer to an address in network byte order.
len -- The length of the address.
type -- The type of the address. Address family; ie AF_INET
Remarks
The gethostbyaddr function returns a pointer to the HOSTENT structure that contains the name and address corresponding to the given network address. All strings are null-terminated.
Return Values
If no error occurs, gethostbyaddr returns a pointer to the HOSTENT structure. Otherwise, it returns a NULL pointer, and a specific error number can be retrieved by calling the appropriate error functions (eg, herror in UNIX/Linux, WSAGetLastError in Winsock).
You'll usually need to resolve a host or domain name to an IP address, so most of the time you'll use gethostbyname. However, there are times when you'll want to do a reverse look-up and obtain the domain name of a given IP address, in which case you would use gethostbyaddr. An example of this would be the nslookup utility (following command-line session scrubbed of local network information):
    C:>nslookup www.yahoo.com
    Non-authoritative answer:
    Name:    www.yahoo-ht3.akadns.net
    Address:  209.131.36.158
    Aliases:  www.yahoo.com

    C:>nslookup 209.131.36.158
    Name:    f1.www.vip.sp1.yahoo.com
    Address:  209.131.36.158

    C:>
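The reverse look-up that nslookup performs in the second command can be sketched with inet_addr and gethostbyaddr. This is only a minimal sketch assuming the UNIX/Linux headers (in Winsock you would include winsock2.h instead); the helper name ReverseLookup and its parameters are my own invention for illustration, not part of any sockets API.

```c
#include <string.h>
#include <netdb.h>        /* gethostbyaddr, struct hostent */
#include <sys/socket.h>   /* AF_INET */
#include <netinet/in.h>   /* struct in_addr, INADDR_NONE */
#include <arpa/inet.h>    /* inet_addr */

/* Hypothetical helper: reverse-resolves a dotted-decimal IPv4 string.
 * On success, copies the host name into name[] (truncated to fit) and
 * returns 0. Returns -1 if the string is not an IPv4 address or if the
 * reverse look-up fails. */
int ReverseLookup(const char *dotted, char *name, size_t size)
{
    struct in_addr addr;
    struct hostent *host;

    if (size == 0)
        return -1;

    /* dotted-decimal string -> 32-bit address in network byte order */
    addr.s_addr = inet_addr(dotted);
    if (addr.s_addr == INADDR_NONE)
        return -1;

    host = gethostbyaddr((const char *) &addr, sizeof(addr), AF_INET);
    if (host == NULL)
        return -1;

    /* copy the name out before the static hostent can be overwritten */
    strncpy(name, host->h_name, size - 1);
    name[size - 1] = '\0';
    return 0;
}
```

A caller would declare something like char name[256] and call ReverseLookup("209.131.36.158", name, sizeof(name)). Whether a name comes back depends on whether the address's owner has published a reverse (PTR) record.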
Here is a simple function, ResolveName, that you can include in your code and use to resolve a domain/host name into an IP address that you can copy directly into a sockaddr_in struct.
Review the section on setting up a socket address. Normally an IP address is typed in as a dotted-decimal string like "192.168.0.1". That's fine for input and output, but before it's used in a socket it must be converted to a 32-bit binary number in network byte order. Normally, we would feed the dotted-decimal string to the function inet_addr, which returns the equivalent 32-bit binary IP address. ResolveName returns the IP address in that same 32-bit binary form in network byte order, so there's no need to convert it further.
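To make "32-bit binary number in network byte order" concrete, here is a small sketch of what inet_addr hands back. It assumes the UNIX/Linux headers, and the helper name DottedToBytes is hypothetical, made up just for this illustration. Network byte order means the four bytes sit in memory in the same order as they are written in the dotted-decimal string, whatever the byte order of your CPU.

```c
#include <string.h>
#include <netinet/in.h>   /* in_addr_t, INADDR_NONE */
#include <arpa/inet.h>    /* inet_addr */

/* Hypothetical helper: converts a dotted-decimal string and copies the
 * four address bytes, in the order they sit in memory, into out[].
 * Returns 0 on success, -1 if the string is not a valid IPv4 address. */
int DottedToBytes(const char *dotted, unsigned char out[4])
{
    in_addr_t addr = inet_addr(dotted);   /* network byte order */

    /* INADDR_NONE signals failure; note that it is also what the valid
     * broadcast address "255.255.255.255" converts to */
    if (addr == INADDR_NONE)
        return -1;

    memcpy(out, &addr, 4);
    return 0;
}
```

For "192.168.0.1" the bytes come out as 192, 168, 0, 1 on both little-endian and big-endian machines, which is exactly why the value can be stored in sin_addr.s_addr without any further conversion.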
Here's the sample code from that section you just reviewed, modified to use ResolveName:
    int StuffSockAddr_inWithName(struct sockaddr_in *sa, const char *domain_name, unsigned short port)
    {
        unsigned long addr;   /* resolved IP address, network byte order */

        if (ResolveName(domain_name, &addr) == -1)
            return -1;                      /* for error handling in the calling function */

        memset(sa, 0, sizeof(*sa));         /* Zero out structure */
        sa->sin_family      = AF_INET;      /* Internet address family */
        sa->sin_addr.s_addr = addr;         /* IP address */
        sa->sin_port        = htons(port);  /* Port */
        return 0;                           /* success */
    }
And here's the ResolveName function itself, with all the explanation in the commenting:
    /***************************************************************************
     * Function name : ResolveName
     *    returns    : int -- error/success indication:
     *                     -1 indicates failure, 0 indicates success
     *    arg1       : const char name[] -- C-style character string containing
     *                     the domain or host name to be resolved into an IP
     *                     address.
     *    arg2       : unsigned long *addr -- pointer to an unsigned long
     *                     variable into which to save the IP address, in binary
     *                     form in network byte order.  This binary form is as
     *                     required by the sin_addr field in struct sockaddr_in.
     * Description   : This function resolves the domain name to an IP address by
     *                     calling gethostbyname and passing it the domain
     *                     name.
     *                 If the return value is NULL, the function failed to
     *                     resolve the name and the function returns a -1
     *                     to indicate failure.
     *                 Else, the return value points to a hostent structure
     *                     that contains the information on that domain.
     *                     In this case, the IP address is copied to the
     *                     variable pointed to by the addr parameter and
     *                     the function returns a 0 to indicate success.
     * Notes         : This function does not create a hostent struct, but rather
     *                     can only create a pointer to one.  The hostent struct
     *                     pointer value returned by gethostbyname points to a
     *                     static variable that will be overwritten by the next
     *                     socket function that could affect it.
     *                 Therefore, before we exit this function we make sure
     *                     to copy from that struct any data that we need.
     *                 This sample was written with a minimum amount of error
     *                     checking and reporting.  Since the details of error
     *                     handling of sockets functions is implementation
     *                     dependent (ie, handled differently in UNIX/Linux than
     *                     in Winsock), I will leave it to you to elaborate the
     *                     code as you need to.
     */
    int ResolveName(const char name[], unsigned long *addr)
    {
        struct hostent *host;   /* Structure containing host information */
        struct in_addr  in;     /* Holds the 4-byte IPv4 address */

        /* Try to resolve the name, testing for failure */
        if ((host = gethostbyname(name)) == NULL)
        {
            /* failed, so output error message and return a -1 for failure */
            fprintf(stderr, "gethostbyname() failed\n");
            return -1;
        }

        /* Copy exactly the 4 address bytes.  Casting the pointer to
         * unsigned long * instead would read 8 bytes on systems where
         * long is 64 bits wide. */
        memcpy(&in, host->h_addr_list[0], sizeof(in));

        /* return the binary, network byte ordered address through the pointer parameter */
        *addr = in.s_addr;

        /* return a 0 for success */
        return 0;
    }
One more note that will be explained more fully below. We obtained the IP address from the address list array in the hostent struct. The declaration of struct hostent includes a macro definition that simplifies its use, much as was the case with the sockaddr_in declaration (look at it for yourself; go to your compiler's INCLUDE directory and find the header file that contains the declaration for struct sockaddr_in).
Specifically, this macro definition:
    #define h_addr h_addr_list[0] /* for backward compatibility */
allows us to change this line:
    memcpy(&in, host->h_addr_list[0], sizeof(in));
to this:
    memcpy(&in, host->h_addr, sizeof(in));
I only bring it up here because you may see other authors using the macro instead of the field name. As I said, more on that in the next section where we take a closer look at the hostent struct.
This is the heart of the sockets' support of DNS. The only purpose of the two DNS functions, gethostbyname and gethostbyaddr, is to stuff a hostent struct with all the DNS information about that host.
Here is the definition of the hostent struct:
    #include <netdb.h>

    struct hostent {
        char  *h_name;        /* official name of host */
        char **h_aliases;     /* alias list */
        int    h_addrtype;    /* host address type */
        int    h_length;      /* length of address */
        char **h_addr_list;   /* list of addresses */
    };
    #define h_addr h_addr_list[0] /* for backward compatibility */
The members of the hostent structure are:
- h_name
- The official name of the host.
- h_aliases
- These are alternative names for the host, represented as a null-terminated vector of strings.
- h_addrtype
- This is the host address type; in practice, its value is always either AF_INET or AF_INET6, with the latter being used for IPv6 hosts. In principle other kinds of addresses could be represented in the database as well as Internet addresses; if this were done, you might find a value in this field other than AF_INET or AF_INET6.
- h_length
- This is the length, in bytes, of each address.
- h_addr_list
- This is the vector of addresses for the host. (Recall that the host might be connected to multiple networks and have different addresses on each one.) The vector is terminated by a null pointer.
- h_addr
- This is a synonym for h_addr_list[0]; in other words, it is the first host address.
As I noted earlier, gethostbyname and gethostbyaddr both return a pointer to a statically allocated struct. That means that one and only one such struct exists and the next call to any function that would modify the struct will overwrite it. That means that if there's something in that struct that you want to use later, then you need to copy it to a variable declared within your application.
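Here is one way that copying could look. It's only a sketch: the HostCopy struct, the CopyHostEnt function, and the MAX_ADDRS limit are all names I made up for this illustration, and it assumes the UNIX/Linux headers and IPv4 addresses only.

```c
#include <string.h>
#include <netdb.h>        /* struct hostent */
#include <sys/socket.h>   /* AF_INET */
#include <netinet/in.h>   /* struct in_addr */

#define MAX_ADDRS 8       /* arbitrary limit chosen for this sketch */

/* Hypothetical caller-owned record; not part of any sockets API */
struct HostCopy {
    char           name[256];          /* copy of h_name */
    int            count;              /* number of addresses copied */
    struct in_addr addrs[MAX_ADDRS];   /* copies of the h_addr_list entries */
};

/* Copies the official name and up to MAX_ADDRS IPv4 addresses out of the
 * static hostent into storage that the application owns.
 * Returns the number of addresses copied, or -1 if he is NULL or does
 * not hold IPv4 addresses. */
int CopyHostEnt(const struct hostent *he, struct HostCopy *out)
{
    int i;

    if (he == NULL || he->h_addrtype != AF_INET ||
        he->h_length != (int) sizeof(struct in_addr))
        return -1;

    strncpy(out->name, he->h_name, sizeof(out->name) - 1);
    out->name[sizeof(out->name) - 1] = '\0';

    /* h_addr_list is terminated by a null pointer */
    for (i = 0; i < MAX_ADDRS && he->h_addr_list[i] != NULL; ++i)
        memcpy(&out->addrs[i], he->h_addr_list[i], sizeof(struct in_addr));

    out->count = i;
    return i;
}
```

You would call it immediately after the look-up, eg CopyHostEnt(gethostbyname(name), &copy), and from then on work only with your own copy.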
Winsock's hostent is a bit different, but still very close. Windows documentation notes:
The hostent structure is used by functions to store information about a given host, such as host name, IPv4 address, and so forth. An application should never attempt to modify this structure or to free any of its components. Furthermore, only one copy of the hostent structure is allocated per thread, and an application should therefore copy any information that it needs before issuing any other Windows Sockets API calls.
Winsock's declaration and field definitions are:
    #include <winsock2.h>

    typedef struct hostent {
        char FAR*       h_name;
        char FAR* FAR*  h_aliases;
        short           h_addrtype;
        short           h_length;
        char FAR* FAR*  h_addr_list;
    } HOSTENT, *PHOSTENT, FAR *LPHOSTENT;
Members
- h_name
- Official name of the host (PC). If using the DNS or similar resolution system, it is the Fully Qualified Domain Name (FQDN) that caused the server to return a reply. If using a local "hosts" file, it is the first entry after the IP address.
- h_aliases
- A NULL-terminated array of alternate names.
- h_addrtype
- The type of address being returned.
- h_length
- This is the length, in bytes, of each address.
- h_addr_list
- A NULL-terminated list of addresses for the host. Addresses are returned in network byte order. The macro h_addr is defined to be h_addr_list[0] for compatibility with older software.
Note that Winsock typedefs struct hostent as HOSTENT, so that your C code will not need to use the struct keyword all the time. If you write Winsock code, you should get into the habit of taking advantage of the HOSTENT typedef.
I know that the first time I looked at the documentation on hostent, I couldn't quite understand what it was telling me, so I wrote a program to play with it. I recommend that you do the same.
Here is a function built from the code I had written:
    int DisplayHostEnt(char *name)
    {
        struct hostent *he;
        int i;
        struct in_addr addr;

        he = gethostbyname(name);
        if (he == NULL)
        {
            fprintf(stderr, "gethostbyname failed\n");
            return -1;   /* return -1 for error */
        }
        else
        {
            printf("h_name = %s\n", he->h_name);
        }

        if (he->h_aliases[0] == NULL)
            printf("No aliases.\n");
        else
        {
            printf("Aliases:\n");
            for (i = 0; he->h_aliases[i] != 0; ++i)
            {
                printf("   %d. %s\n", i+1, he->h_aliases[i]);
            }
        }

        /* original code had an array of address family strings
         *   that was indexed by h_addrtype, so I printed out the name.
         *   Left that out here to keep from cluttering up the web page.
         * AF_INET == 2
         * AF_INET6's value depends on the platform (eg, 10 on Linux,
         *   23 in Winsock) and it's not defined on all platforms.
         */
        printf("h_addrtype = %d\n", he->h_addrtype);
        printf("h_length = %d\n", he->h_length);

        if (he->h_addr_list == NULL)
            printf("No h_addr_list present.\n");
        else
        {
            printf("h_addr_list:\n");
            for (i = 0; he->h_addr_list[i] != 0; ++i)
            {
                /* copy the raw address bytes, then print them as dotted-decimal */
                memcpy(&addr, he->h_addr_list[i], sizeof(struct in_addr));
                printf("   Addr #%d: %s\n", i, inet_ntoa(addr));
            }
        }

        return 0;   /* for success */
    }
Running the program that I just pulled that code out of, I get this output (scrubbed for security reasons):
    Hostname = myPC
    h_name = myPC.myemployer.com
    No aliases.
    h_addrtype = AF_INET [2]
    h_length = 4
    h_addr_list:
       Addr #0: 192.168.8.180
As I noted in my section on ports (scroll down a bit once you get there), ports 0 through 1023 are the "Well Known Ports" that are reserved for and associated with standard services like telnet, ftp, http, ntp. Where this is leading us is that we will want to be able to accept a service name and be able to resolve it to a port number.
Well, we do have the capability. And it is almost exactly like using hostent.
The SERVICES File
First, what are the service names and what ports do they belong to? On each computer with TCP/IP there should be a file named SERVICES. On UNIX and Linux systems it should be in the /etc directory. On Windows it tends to move around a bit, but it should be under the system directory in System32\Drivers\ETC. BTW, the same directory contains the HOSTS file, into which you can enter host names and their associated IP addresses as part of host name resolution; basically a local component to the DNS process.
SERVICES is a text file. Here is a short excerpt from it:
    # Copyright (c) 1993-1999 Microsoft Corp.
    #
    # This file contains port numbers for well-known services defined by IANA
    #
    # Format:
    #
    # <service name>  <port number>/<protocol>  [aliases...]   [#<comment>]
    #

    echo                7/tcp
    echo                7/udp
    discard             9/tcp    sink null
    discard             9/udp    sink null
    systat             11/tcp    users                  #Active users
    systat             11/tcp    users                  #Active users
    daytime            13/tcp
    daytime            13/udp
    qotd               17/tcp    quote                  #Quote of the day
    qotd               17/udp    quote                  #Quote of the day
    chargen            19/tcp    ttytst source          #Character generator
    chargen            19/udp    ttytst source          #Character generator
    ftp-data           20/tcp                           #FTP, data
    ftp                21/tcp                           #FTP. control
    telnet             23/tcp
    smtp               25/tcp    mail                   #Simple Mail Transfer Protocol
    time               37/tcp    timserver
    time               37/udp    timserver
One thing you'll notice is that a lot of the services run on either tcp or udp. Another thing you'll notice is that some only run on one protocol (eg, ftp, telnet, and smtp only run with the tcp protocol).
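To show how regular that format is, here is a rough sketch of pulling the first two fields out of one line of the file with sscanf. You won't normally do this yourself, since the service resolution functions read the file for you; the helper name ParseServicesLine and its buffer sizes are my own assumptions, chosen only for this illustration, and the aliases and comment are simply ignored.

```c
#include <stdio.h>

/* Hypothetical helper: parses one line of a SERVICES file of the form
 *     <service name>  <port number>/<protocol>  [aliases...]  [#<comment>]
 * name must have room for at least 64 chars and proto for at least 16.
 * Returns 0 on success, -1 for blank, comment, or malformed lines. */
int ParseServicesLine(const char *line, char *name, int *port, char *proto)
{
    /* skip leading blanks */
    while (*line == ' ' || *line == '\t')
        ++line;

    /* nothing to parse on an empty line or a comment line */
    if (*line == '\0' || *line == '\n' || *line == '#')
        return -1;

    /* service name, then port, a literal '/', then the protocol name */
    if (sscanf(line, "%63s %d/%15[^ \t\n#]", name, port, proto) != 3)
        return -1;

    if (*port < 0 || *port > 65535)
        return -1;

    return 0;
}
```

Feeding it the smtp line from the excerpt yields the name "smtp", the port 25, and the protocol "tcp". Note that the port here is an ordinary host-order integer, unlike the network-byte-order value that the service resolution functions return.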
struct servent -- The Service Resolution Structure
Service resolution is almost completely analogous to domain name resolution. In place of struct hostent, we have struct servent. In place of gethostbyname and gethostbyaddr, we have getservbyname and getservbyport.
In UNIX/Linux, include netdb.h
In Winsock, include winsock2.h
    struct servent {
        char  *s_name;      /* official name of service */
        char **s_aliases;   /* alias list */
        int    s_port;      /* port service resides at */
        char  *s_proto;     /* protocol to use */
    };
Members:
- s_name
- The official name of the service.
- s_aliases
- A NULL-terminated array of alternate names for the service.
- s_port
- The port number at which the service resides. Port numbers are returned in network byte order.
- s_proto
- The name of the protocol to use when contacting the service. Eg, "tcp", "udp".
Note that, as before, Winsock's declaration is slightly different and that it also creates typedefs to simplify usage. Other than that, the struct fields all have the same definitions as above:
    typedef struct servent {
        char FAR*       s_name;
        char FAR* FAR*  s_aliases;
        short           s_port;
        char FAR*       s_proto;
    } SERVENT, *PSERVENT, FAR *LPSERVENT;
The Service Resolution Functions
Just as with hostent, we have two functions that will return a servent struct: getservbyname and getservbyport. Most of the time, you will use getservbyname in order to resolve a service name to its associated port so that you can set up the sockaddr_in struct to connect to the server. Then on occasion you may want to use the reverse lookup function, getservbyport, to see what service is on a particular well-known port. Either function will return a servent struct with the same information, just as with hostent and its two functions. You're in familiar territory here!
The syntax for the two functions is:
getservbyname retrieves service information corresponding to a service name and protocol
    struct servent* getservbyname(const char* name, const char* proto);
Parameters
name -- Pointer to a null-terminated service name.
proto -- Optional pointer to a null-terminated protocol name. If this pointer is NULL, getservbyname returns the first service entry where name matches the s_name member of the servent structure or the s_aliases member of the servent structure. Otherwise, getservbyname matches both the name and the proto.
Return Value
If no error occurs, getservbyname returns a pointer to the servent structure. Otherwise, it returns a null pointer and a specific error number can be retrieved by the appropriate error handling routines.
getservbyport retrieves service information corresponding to a port and protocol
    struct servent* getservbyport(int port, const char* proto);
Parameters
port -- Port for a service, in network byte order.
proto -- Optional pointer to a protocol name. If this is null, getservbyport returns the first service entry for which the port matches the s_port of the servent structure. Otherwise, getservbyport matches both the port and the proto parameters.
Return Value
If no error occurs, getservbyport returns a pointer to the servent structure. Otherwise, it returns a null pointer and a specific error number can be retrieved by calling WSAGetLastError.
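The detail that is easiest to get wrong with getservbyport is that the port you pass in must already be in network byte order. A minimal sketch of a reverse look-up, assuming the UNIX/Linux headers; the helper name PortToServiceName and its parameters are hypothetical, invented for this illustration:

```c
#include <string.h>
#include <netdb.h>       /* getservbyport, struct servent */
#include <arpa/inet.h>   /* htons */

/* Hypothetical helper: looks up the service name for a port given in
 * ordinary host byte order (eg, 25).
 * On success, copies the official service name into name[] (truncated
 * to fit) and returns 0. Returns -1 if the port is out of range or no
 * service entry matches. */
int PortToServiceName(long port, const char *proto, char *name, size_t size)
{
    struct servent *serv;

    if (port < 0 || port > 65535 || size == 0)
        return -1;

    /* getservbyport expects the port in network byte order */
    serv = getservbyport(htons((unsigned short) port), proto);
    if (serv == NULL)
        return -1;

    /* copy the name out before the static servent can be overwritten */
    strncpy(name, serv->s_name, size - 1);
    name[size - 1] = '\0';
    return 0;
}
```

What comes back for a given port depends entirely on the contents of the SERVICES file on the machine where the code runs.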
Using Service Resolution
Just as with using struct hostent and gethostbyname, the code for using servent and getservbyname is fairly simple. And while I haven't yet gone through the exercise of printing out all the information in a servent struct, you might want to give it a whirl.
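If you do want to give it a whirl, a starting point might look like the sketch below. It assumes the UNIX/Linux headers, and DisplayServEnt is a name I made up for the illustration. It takes the servent pointer rather than a service name so that you can hand it the result of either getservbyname or getservbyport.

```c
#include <stdio.h>
#include <netdb.h>       /* struct servent */
#include <arpa/inet.h>   /* ntohs */

/* Hypothetical helper: prints every field of a servent struct.
 * Returns 0 on success, -1 if se is NULL (ie, the look-up failed). */
int DisplayServEnt(const struct servent *se)
{
    int i;

    if (se == NULL)
    {
        fprintf(stderr, "No servent to display\n");
        return -1;
    }

    printf("s_name  = %s\n", se->s_name);

    if (se->s_aliases == NULL || se->s_aliases[0] == NULL)
        printf("No aliases.\n");
    else
    {
        printf("Aliases:\n");
        for (i = 0; se->s_aliases[i] != NULL; ++i)
            printf("   %d. %s\n", i + 1, se->s_aliases[i]);
    }

    /* s_port is in network byte order, so convert it for display */
    printf("s_port  = %d\n", ntohs((unsigned short) se->s_port));
    printf("s_proto = %s\n", se->s_proto);
    return 0;
}
```

For example, DisplayServEnt(getservbyname("smtp", "tcp")) prints whatever your machine's SERVICES file holds for smtp, or reports the failure if there is no such entry.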
Though I guess I should at some point describe why you would want to resolve a service name. I've written a udp time client, UdpTimeC, which runs from the command line and which expects the user to enter a time server (either by name or by IP address; my code figures out which it has) and the port, either by port number or by service name. It will accept either ntp (port 123) or time (port 37). Now, I could require the user to remember that the time service is on port 37 or that NTP is on port 123, but that wouldn't be very user-friendly, now would it? Instead, he can enter in either "time" or "ntp" and I use getservbyname to resolve that to a port number, which is what I need to complete the sockaddr_in struct.
The following is a function I had written to resolve the service passed to it to a port number:
    /***************************************************************************
     * Function name : ResolveService
     *    returns    : unsigned short --
     *                     on failure, returns 0xFFFF, an impossible value for
     *                         a well-known port.
     *                     on success, returns the port number converted to
     *                         network byte order.
     *    arg1       : char service[] -- C-style character string containing the
     *                     service name to be resolved into a port number.
     *    arg2       : char protocol[] -- C-style character string containing the
     *                     protocol name.  For most purposes, this will be either
     *                     "tcp" or "udp".
     * Description   : This function resolves the service name to a port number by
     *                     calling getservbyname and passing it the service
     *                     name and protocol name.
     *                 First, the service name is tested for containing a
     *                     numeric string, in which case the user had
     *                     entered a port number so no look-up would be
     *                     necessary.  That string is converted to a numeric
     *                     value which is converted to network byte order
     *                     and returned.
     *                 Else, getservbyname is called and its return value
     *                     is tested.
     *                 If the return value is NULL, the function failed
     *                     to resolve the name and the function returns
     *                     0xFFFF to indicate failure.
     *                 Else, the return value points to a servent
     *                     structure that contains the information on that
     *                     service.  In this case, the port number is
     *                     returned -- because it's from the servent struct,
     *                     it's already in network byte order.
     * Notes         : This function does not create a servent struct, but rather
     *                     can only create a pointer to one.  The servent struct
     *                     pointer value returned by getservbyname points to a
     *                     static variable that will be overwritten by the next
     *                     socket function that could affect it.
     *                 Therefore, before we exit this function we make sure
     *                     to copy from that struct any data that we need.
     *                 This sample was written with a minimum amount of error
     *                     checking and reporting.  Since the details of error
     *                     handling of sockets functions is implementation
     *                     dependent (ie, handled differently in UNIX/Linux than
     *                     in Winsock), I will leave it to you to elaborate the
     *                     code as you need to.
     */
    unsigned short ResolveService(char service[], char protocol[])
    {
        struct servent *serv;   /* Structure containing service information */
        unsigned short port;    /* Port to return */

        if ((port = (unsigned short) atoi(service)) == 0)   /* Is port numeric? */
        {
            /* Not numeric.  Try to find as name */
            if ((serv = getservbyname(service, protocol)) == NULL)
            {
                fprintf(stderr, "getservbyname() failed\n");
                return 0xFFFF;   /* to signal failure */
            }
            else
                port = (unsigned short) serv->s_port;   /* Found port (network byte order) by name */
        }
        else   /* it's already a port number */
            port = htons(port);   /* Convert port to network byte order */

        return port;
    }
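One weakness of testing for a numeric string with atoi is that atoi returns 0 both for "0" and for anything that isn't a number, and it quietly accepts strings like "37abc" as 37. If that matters to you, a stricter variant could use strtol instead. The following is only a sketch under my own assumptions: the name ParsePort is hypothetical, it assumes the UNIX/Linux headers, and unlike ResolveService it returns the port in host byte order (so the caller applies htons) with -1 as the failure value.

```c
#include <stdlib.h>      /* strtol */
#include <netdb.h>       /* getservbyname, struct servent */
#include <arpa/inet.h>   /* ntohs */

/* Hypothetical helper: accepts either a decimal port number or a service
 * name. Returns the port in HOST byte order (0..65535), or -1 on failure. */
int ParsePort(const char *service, const char *protocol)
{
    char *end;
    long value;
    struct servent *serv;

    value = strtol(service, &end, 10);
    if (end != service && *end == '\0')   /* the whole string was a number */
    {
        if (value < 0 || value > 65535)
            return -1;                    /* not a valid port number */
        return (int) value;
    }

    /* Not purely numeric, so try to find it as a service name */
    serv = getservbyname(service, protocol);
    if (serv == NULL)
        return -1;

    /* s_port is in network byte order; convert for the caller */
    return ntohs((unsigned short) serv->s_port);
}
```

Returning an int leaves room for a failure value that can never collide with a real port, which the 0xFFFF convention only approximates.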
Now that you have learned how to do domain name and service name resolution with struct hostent and struct servent, you need to know that they're not the only functions available, nor necessarily the best. I can't find any reference right now, but the impression I have is that they're on their way out, so you'll eventually need to learn how to use their replacements. I haven't done anything with the new functions yet, so all I can do is to introduce you to them and then you can continue the research on your own.
The following is an abridged copy of the Linux man page for getaddrinfo; I've inserted "[snip]" wherever I've cut something out. The functions that it makes reference to, getipnodebyname and getipnodebyaddr, do the same jobs as gethostbyname and gethostbyaddr do, only they appear to work a bit more along the lines that getaddrinfo does. I'm not sure what their relationship is to our old friends, gethostbyname and gethostbyaddr, but the man pages are quite clear that getipnodebyname and getipnodebyaddr are on their way out and are being replaced by getaddrinfo.
getaddrinfo(3)            Linux Programmer's Manual            getaddrinfo(3)

NAME
    getaddrinfo, freeaddrinfo, gai_strerror - network address and service translation

SYNOPSIS
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>

    int getaddrinfo(const char *node, const char *service,
                    const struct addrinfo *hints,
                    struct addrinfo **res);

    void freeaddrinfo(struct addrinfo *res);

    const char *gai_strerror(int errcode);

DESCRIPTION
    The getaddrinfo(3) function combines the functionality provided by the getipnodebyname(3), getipnodebyaddr(3), getservbyname(3), and getservbyport(3) functions into a single interface. The thread-safe getaddrinfo(3) function creates one or more socket address structures that can be used by the bind(2) and connect(2) system calls to create a client or a server socket.

    The getaddrinfo(3) function is not limited to creating IPv4 socket address structures; IPv6 socket address structures can be created if IPv6 support is available. These socket address structures can be used directly by bind(2) or connect(2), to prepare a client or a server socket.

    The addrinfo structure used by this function contains the following members:

    struct addrinfo {
        int              ai_flags;
        int              ai_family;
        int              ai_socktype;
        int              ai_protocol;
        size_t           ai_addrlen;
        struct sockaddr *ai_addr;
        char            *ai_canonname;
        struct addrinfo *ai_next;
    };

    getaddrinfo(3) sets res to point to a dynamically-allocated linked list of addrinfo structures, linked by the ai_next member. There are several reasons why the linked list may have more than one addrinfo structure, including: if the network host is multi-homed; or if the same service is available from multiple socket protocols (one SOCK_STREAM address and another SOCK_DGRAM address, for example).

    The members ai_family, ai_socktype, and ai_protocol have the same meaning as the corresponding parameters in the socket(2) system call. The getaddrinfo(3) function returns socket addresses in either IPv4 or IPv6 address family, (ai_family will be set to either AF_INET or AF_INET6).

    The hints parameter specifies the preferred socket type, or protocol. A NULL hints specifies that any network address or protocol is acceptable. If this parameter is not NULL it points to an addrinfo structure whose ai_family, ai_socktype, and ai_protocol members specify the preferred socket type. AF_UNSPEC in ai_family specifies any protocol family (either IPv4 or IPv6, for example). 0 in ai_socktype or ai_protocol specifies that any socket type or protocol is acceptable as well. The ai_flags member specifies additional options, defined below. Multiple flags are specified by logically OR-ing them together. All the other members in the hints parameter must contain either 0, or a null pointer.

    The node or service parameter, but not both, may be NULL. node specifies either a numerical network address (dotted-decimal format for IPv4, hexadecimal format for IPv6) or a network hostname, whose network addresses are looked up and resolved. If hints.ai_flags contains the AI_NUMERICHOST flag then the node parameter must be a numerical network address. The AI_NUMERICHOST flag suppresses any potentially lengthy network host address lookups.

    The getaddrinfo(3) function creates a linked list of addrinfo structures, one for each network address subject to any restrictions imposed by the hints parameter.

    [snip]

    service sets the port number in the network address of each socket structure. If service is NULL the port number will be left uninitialized. If AI_NUMERICSERV is specified in hints.ai_flags and service is not NULL, then service must point to a string containing a numeric port number. This flag is used to inhibit the invocation of a name resolution service in cases where it is known not to be required.

    The freeaddrinfo(3) function frees the memory that was allocated for the dynamically allocated linked list res.

RETURN VALUE
    getaddrinfo(3) returns 0 if it succeeds, or one of the following non-zero error codes:

    [snip]

    The gai_strerror(3) function translates these error codes to a human readable string, suitable for error reporting.
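For orientation, here is a minimal sketch of how a call might look, pieced together from the man page above rather than from field experience. It assumes the UNIX/Linux headers, restricts itself to IPv4 and TCP through the hints, and only looks at the first entry of the result list. The helper name ResolveWithGetaddrinfo is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>        /* getaddrinfo, freeaddrinfo, gai_strerror */
#include <netinet/in.h>   /* struct sockaddr_in */

/* Hypothetical helper: resolves a host (name or dotted-decimal string)
 * and a service (name or port number string) in one call and copies the
 * first IPv4/TCP result into *out, ready for connect().
 * Returns 0 on success, else the non-zero getaddrinfo error code. */
int ResolveWithGetaddrinfo(const char *node, const char *service,
                           struct sockaddr_in *out)
{
    struct addrinfo hints;
    struct addrinfo *res;
    int rc;

    memset(&hints, 0, sizeof(hints));   /* unused members must be 0 or NULL */
    hints.ai_family   = AF_INET;        /* IPv4 only, so results are sockaddr_in */
    hints.ai_socktype = SOCK_STREAM;    /* TCP */

    rc = getaddrinfo(node, service, &hints, &res);
    if (rc != 0)
    {
        fprintf(stderr, "getaddrinfo() failed: %s\n", gai_strerror(rc));
        return rc;
    }

    /* The list is dynamically allocated, so copy what we need and free it */
    memcpy(out, res->ai_addr, sizeof(*out));
    freeaddrinfo(res);
    return 0;
}
```

Notice that this one call does the work of both ResolveName and ResolveService: the address and the port both arrive already in network byte order inside the sockaddr_in. A fuller version would walk the ai_next chain and try each address in turn.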
Share and enjoy!
First uploaded on 2007 October 04.
Updated 2011 July 18.