Lone Learner - 2 months ago
C Question

Why do I get duplicate addrinfo objects in the linked-list returned by getaddrinfo()?

Here is my code.

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <arpa/inet.h>

int main(void)
{
    struct addrinfo hints, *res, *p;
    int error;

    memset(&hints, 0, sizeof hints);

    /* If we comment or remove the following line, the duplicate entries
     * disappear */
    hints.ai_family = AF_INET;

    error = getaddrinfo("localhost", "http", &hints, &res);
    if (error != 0) {
        printf("Error %d: %s\n", error, gai_strerror(error));
        return 1;
    }

    for (p = res; p != NULL; p = p->ai_next)
    {
        /* Passing NULL to %s is undefined, so substitute a placeholder. */
        const char *canon = p->ai_canonname ? p->ai_canonname : "(null)";

        if (p->ai_family == AF_INET) {
            struct sockaddr_in *addr = (struct sockaddr_in *) p->ai_addr;
            char ip[INET_ADDRSTRLEN];

            printf("ai_flags: %d; ai_family: %d; ai_socktype: %d; "
                   "ai_protocol: %2d; sin_family: %d; sin_port: %d; "
                   "sin_addr: %s; ai_canonname: %s\n",
                   p->ai_flags, p->ai_family, p->ai_socktype,
                   p->ai_protocol, addr->sin_family, ntohs(addr->sin_port),
                   inet_ntop(AF_INET, &addr->sin_addr, ip, INET_ADDRSTRLEN),
                   canon);
        } else if (p->ai_family == AF_INET6) {
            struct sockaddr_in6 *addr = (struct sockaddr_in6 *) p->ai_addr;
            char ip[INET6_ADDRSTRLEN];

            printf("ai_flags: %d; ai_family: %d; ai_socktype: %d; "
                   "ai_protocol: %2d; sin6_family: %d; sin6_port: %d; "
                   "sin6_addr: %s; ai_canonname: %s\n",
                   p->ai_flags, p->ai_family, p->ai_socktype,
                   p->ai_protocol, addr->sin6_family, ntohs(addr->sin6_port),
                   inet_ntop(AF_INET6, &addr->sin6_addr, ip, INET6_ADDRSTRLEN),
                   canon);
        }
    }

    /* Release the list allocated by getaddrinfo(). */
    freeaddrinfo(res);
    return 0;
}


Here is the output.

$ gcc -std=c99 -D_POSIX_SOURCE -Wall -Wextra -pedantic bar.c && ./a.out
ai_flags: 0; ai_family: 2; ai_socktype: 1; ai_protocol: 6; sin_family: 2; sin_port: 80; sin_addr: 127.0.0.1; ai_canonname: (null)
ai_flags: 0; ai_family: 2; ai_socktype: 2; ai_protocol: 17; sin_family: 2; sin_port: 80; sin_addr: 127.0.0.1; ai_canonname: (null)
ai_flags: 0; ai_family: 2; ai_socktype: 1; ai_protocol: 6; sin_family: 2; sin_port: 80; sin_addr: 127.0.0.1; ai_canonname: (null)
ai_flags: 0; ai_family: 2; ai_socktype: 2; ai_protocol: 17; sin_family: 2; sin_port: 80; sin_addr: 127.0.0.1; ai_canonname: (null)


The output shows that the 1st and 3rd entries are exactly the same. Similarly, the 2nd and 4th entries are exactly the same. Why do we get these duplicates in the results?

If we comment out or remove the following line from the code, then the duplicate entries disappear.

/* If we comment or remove the following line, the duplicate entries
* disappear */
/* hints.ai_family = AF_INET; */


Here is the output in this case.

$ gcc -std=c99 -D_POSIX_SOURCE -Wall -Wextra -pedantic bar.c && ./a.out
ai_flags: 0; ai_family: 10; ai_socktype: 1; ai_protocol: 6; sin6_family: 10; sin6_port: 80; sin6_addr: ::1; ai_canonname: (null)
ai_flags: 0; ai_family: 10; ai_socktype: 2; ai_protocol: 17; sin6_family: 10; sin6_port: 80; sin6_addr: ::1; ai_canonname: (null)
ai_flags: 0; ai_family: 2; ai_socktype: 1; ai_protocol: 6; sin_family: 2; sin_port: 80; sin_addr: 127.0.0.1; ai_canonname: (null)
ai_flags: 0; ai_family: 2; ai_socktype: 2; ai_protocol: 17; sin_family: 2; sin_port: 80; sin_addr: 127.0.0.1; ai_canonname: (null)


This is how my /etc/hosts looks.

$ cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 debian1

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters


If hints.ai_family = AF_INET is present in the code but the line in /etc/hosts that begins with ::1 is commented out, the duplicate entries disappear as well.

$ gcc -std=c99 -D_POSIX_SOURCE -Wall -Wextra -pedantic bar.c && ./a.out
ai_flags: 0; ai_family: 2; ai_socktype: 1; ai_protocol: 6; sin_family: 2; sin_port: 80; sin_addr: 127.0.0.1; ai_canonname: (null)
ai_flags: 0; ai_family: 2; ai_socktype: 2; ai_protocol: 17; sin_family: 2; sin_port: 80; sin_addr: 127.0.0.1; ai_canonname: (null)


But I would still like to know why the IPv6 entry in /etc/hosts causes duplicate entries even when hints.ai_family = AF_INET is used to select the IPv4 entries only.

Answer

This is a long-standing bug/feature of glibc. When you have an IPv6 localhost entry in your hosts file, like

::1 localhost

it is also used to answer AF_INET lookups, so localhost resolves to 127.0.0.1 twice: once from the IPv4 line and once from the IPv6 line. This behaviour was introduced in November 2006 by Ulrich Drepper with this comment:

nss/nss_files/files-hosts.c (LINE_PARSER): Support IPv6-style addresses for IPv4 queries if they can be mapped.
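
The mapping that this changelog entry describes can be sketched roughly as follows. This is a simplified illustration written for this answer, not the glibc source; the function name map_v6_for_v4_query is hypothetical, and the assumption is that only IPv4-mapped addresses and the IPv6 loopback count as "mappable".

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper, for illustration only: given the address text from
 * a hosts line, try to produce an IPv4 address for an AF_INET query.
 * Returns 1 if the text is an IPv6 address that can be mapped, else 0. */
int map_v6_for_v4_query(const char *text, struct in_addr *out)
{
    struct in6_addr a6;

    if (inet_pton(AF_INET6, text, &a6) != 1)
        return 0;                       /* not an IPv6 literal */

    if (IN6_IS_ADDR_V4MAPPED(&a6)) {
        /* ::ffff:a.b.c.d carries the IPv4 address in its last 4 bytes */
        memcpy(&out->s_addr, a6.s6_addr + 12, 4);
        return 1;
    }
    if (IN6_IS_ADDR_LOOPBACK(&a6)) {
        /* ::1 is treated as the IPv4 loopback, 127.0.0.1 */
        out->s_addr = htonl(INADDR_LOOPBACK);
        return 1;
    }
    return 0;                           /* no IPv4 equivalent */
}
```

Under this sketch, the hosts file from the question yields 127.0.0.1 for localhost from both the "127.0.0.1 localhost" line and the "::1 localhost" line, and each of those results is then reported once for TCP and once for UDP, which gives the four entries shown in the question.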

Some people think that it's a bug. There were at least two bug reports with lengthy discussions of this topic (first one and second one), but no one really explained why the change was made, so I guess the only person who can do that is Ulrich himself. It is probably useful in some scenarios, because even though Ulrich hasn't worked on glibc since May 2012, this code is still present in every modern version of glibc.

If you don't like this behaviour, you can edit your hosts file so that the IPv6 loopback address no longer uses the name localhost and has some other name instead, like:

::1 localhost6

Or use a distribution whose glibc maintainers consider it a bug, like openSUSE (and probably SLES/SLED), which simply patches this behaviour away.
