IPv6

Internet Protocol version 6

Synopsis:

#include <sys/socket.h>
#include <netinet/in.h>

int socket( AF_INET6, 
            SOCK_RAW, 
            proto );

Description:

The IP6 protocol is the network-layer protocol used by the Internet Protocol version 6 family (AF_INET6). Options may be set at the IP6 level when using higher-level protocols based on IP6 (such as TCP and UDP). It may also be accessed through a "raw socket" when developing new protocols, or special-purpose applications.

There are several IP6-level setsockopt() and getsockopt() options. They're separated into the basic IP6 sockets API (defined in RFC 2553), and the advanced API (defined in RFC 2292). The basic API looks very similar to the API presented in IP. The advanced API uses ancillary data and can handle more complex cases.

Note: Specifying some of the socket options requires root privileges.

Basic IP6 sockets API

You can use the IPV6_UNICAST_HOPS option to set the hoplimit field in the IP6 header on unicast packets. If you specify -1, the socket manager uses the default value. If you specify a value from 0 to 255, the packet uses the specified value as its hoplimit. Other values are considered invalid and result in an error code of EINVAL. For example:

int hlim = 60;    /* max = 255 */
setsockopt( s, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
            &hlim, sizeof(hlim) );
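
The following is a minimal sketch of the same call with error checking and a read-back through getsockopt(). The helper names set_unicast_hops() and get_unicast_hops() are hypothetical, and the range check simply mirrors the -1 to 255 rule described above:

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Set the unicast hoplimit on socket s; -1 selects the default.
 * Returns 0 on success, or -1 with errno set. (Hypothetical helper.) */
int set_unicast_hops(int s, int hlim)
{
    if (hlim < -1 || hlim > 255) {
        errno = EINVAL;   /* same error the text describes for bad values */
        return -1;
    }
    return setsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
                      &hlim, sizeof(hlim));
}

/* Read the socket's current unicast hoplimit into *hlim.
 * Returns 0 on success, or -1 with errno set. (Hypothetical helper.) */
int get_unicast_hops(int s, int *hlim)
{
    socklen_t len = sizeof(*hlim);

    return getsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS, hlim, &len);
}
```

Checking the range before the call lets an application report a bad configuration value without needing an open socket.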

IP6 multicasting is supported only on AF_INET6 sockets of type SOCK_DGRAM and SOCK_RAW, and only on networks where the interface driver supports multicasting.

The IPV6_MULTICAST_HOPS option changes the hoplimit for outgoing multicast datagrams in order to control the scope of the multicasts:

unsigned int hlim; /* range: 0 to 255, default = 1 */
setsockopt( s, IPPROTO_IPV6, IPV6_MULTICAST_HOPS,
            &hlim, sizeof(hlim) );

Datagrams with a hoplimit of 1 aren't forwarded beyond the local network. Multicast datagrams with a hoplimit of 0 won't be transmitted on any network, but may be delivered locally if the sending host belongs to the destination group and if multicast loopback hasn't been disabled on the sending socket (see below). Multicast datagrams with a hoplimit greater than 1 may be forwarded to other networks if a multicast router is attached to the local network.

For hosts with multiple interfaces, each multicast transmission is sent from the primary network interface. The IPV6_MULTICAST_IF option overrides the default for subsequent transmissions from a given socket:

unsigned int outif;
outif = if_nametoindex("ne0");
setsockopt( s, IPPROTO_IPV6, IPV6_MULTICAST_IF,
            &outif, sizeof(outif) );

(The outif argument is an interface index of the desired interface, or 0 to specify the default interface.)
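
Because if_nametoindex() returns 0 when the name doesn't match any interface, and 0 also means "default interface" for this option, a misspelled name would silently select the default. A minimal sketch that rejects unknown names first (set_mcast_if() is a hypothetical helper name):

```c
#include <net/if.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Send multicast from socket s through the named interface.
 * Returns 0 on success, or -1 if the name is unknown or
 * setsockopt() fails. (Hypothetical helper.) */
int set_mcast_if(int s, const char *ifname)
{
    unsigned int outif = if_nametoindex(ifname);

    if (outif == 0)       /* no interface has this name */
        return -1;
    return setsockopt(s, IPPROTO_IPV6, IPV6_MULTICAST_IF,
                      &outif, sizeof(outif));
}
```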

If a multicast datagram is sent to a group to which the sending host itself belongs (on the outgoing interface), a copy of the datagram is, by default, looped back by the IP6 layer for local delivery. The IPV6_MULTICAST_LOOP option gives the sender explicit control over whether or not subsequent datagrams are looped back:

u_char loop;  /* 0 = disable, 1 = enable (default) */
setsockopt( s, IPPROTO_IPV6, IPV6_MULTICAST_LOOP,
            &loop, sizeof(loop));

Disabling loopback improves performance for applications that may have no more than one instance on a single host (such as a router daemon), by eliminating the overhead of receiving their own transmissions. Don't disable loopback if there might be more than one instance of your application on a single host (e.g., a conferencing program), or if the sender doesn't belong to the destination group (e.g., a time-querying program).

A multicast datagram sent with an initial hoplimit greater than 1 may be delivered to the sending host on a different interface from that on which it was sent, if the host belongs to the destination group on that other interface. The loopback control option has no effect on such a delivery.

A host must become a member of a multicast group before it can receive datagrams sent to the group. To join a multicast group, use the IPV6_JOIN_GROUP option:

struct ipv6_mreq mreq6;
setsockopt( s, IPPROTO_IPV6, IPV6_JOIN_GROUP,
            &mreq6, sizeof(mreq6) );

Note that the mreq6 argument has the following structure:

struct ipv6_mreq {
  struct in6_addr ipv6mr_multiaddr;
  unsigned int ipv6mr_interface;
};

Set the ipv6mr_interface member to 0 to choose the default multicast interface, or set it to the interface index of a particular multicast-capable interface if the host is multihomed. Membership is associated with a single interface; programs running on multihomed hosts may need to join the same group on more than one interface.

To drop a membership, use:

struct ipv6_mreq mreq6;
setsockopt( s, IPPROTO_IPV6, IPV6_LEAVE_GROUP,
            &mreq6, sizeof(mreq6) );

The mreq6 argument contains the same values as used to add the membership. Memberships are dropped when the socket is closed or the process exits.
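
The fragments above leave mreq6 unfilled. The sketch below shows one way to fill it from a textual group address and then join or leave the group. The helper names are hypothetical, and the check that the address is a multicast address is an added safeguard, not something the option itself requires of the caller:

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Fill *mreq6 from a textual group address and an interface index.
 * Returns 0 on success, or -1 if the text isn't an IP6 multicast
 * address. (Hypothetical helper.) */
int fill_mreq6(struct ipv6_mreq *mreq6, const char *group,
               unsigned int ifindex)
{
    memset(mreq6, 0, sizeof(*mreq6));
    if (inet_pton(AF_INET6, group, &mreq6->ipv6mr_multiaddr) != 1)
        return -1;
    if (!IN6_IS_ADDR_MULTICAST(&mreq6->ipv6mr_multiaddr))
        return -1;
    mreq6->ipv6mr_interface = ifindex;  /* 0 = default multicast interface */
    return 0;
}

/* Join the group described by *mreq6 on socket s. */
int join_group(int s, const struct ipv6_mreq *mreq6)
{
    return setsockopt(s, IPPROTO_IPV6, IPV6_JOIN_GROUP,
                      mreq6, sizeof(*mreq6));
}

/* Drop the membership; *mreq6 must hold the values used to join. */
int leave_group(int s, const struct ipv6_mreq *mreq6)
{
    return setsockopt(s, IPPROTO_IPV6, IPV6_LEAVE_GROUP,
                      mreq6, sizeof(*mreq6));
}
```

Keeping the filled structure around makes it easy to pass identical values to the leave call later.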

The IPV6_PORTRANGE option controls how ephemeral ports are allocated for SOCK_STREAM and SOCK_DGRAM sockets. For example:

int range = IPV6_PORTRANGE_LOW;  /* see <netinet/in.h> */
setsockopt( s, IPPROTO_IPV6, IPV6_PORTRANGE, &range,
            sizeof(range) );

The IPV6_V6ONLY option controls the behavior of the AF_INET6 wildcard listening socket. The following example sets the option to 1:

int on = 1;
setsockopt( s, IPPROTO_IPV6, IPV6_V6ONLY,
            &on, sizeof(on) );

The default value for this flag is copied, at socket-instantiation time, from the net.inet6.ip6.bindv6only sysctl variable. The option affects TCP and UDP sockets only.
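
Since the default comes from a system-wide variable, an application may want to read the flag back rather than assume its value. A minimal sketch, using the hypothetical helper names set_v6only() and get_v6only():

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Turn IPV6_V6ONLY on (nonzero) or off (0) for socket s.
 * Returns 0 on success, or -1 with errno set. (Hypothetical helper.) */
int set_v6only(int s, int on)
{
    return setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
}

/* Returns 1 or 0 for the socket's current setting, or -1 on error.
 * (Hypothetical helper.) */
int get_v6only(int s)
{
    int on = 0;
    socklen_t len = sizeof(on);

    if (getsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &on, &len) == -1)
        return -1;
    return on != 0;
}
```

Calling get_v6only() on a freshly created socket shows the default that was copied when the socket was instantiated.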

Advanced IP6 sockets API

The advanced IP6 sockets API lets applications specify or obtain details about the IP6 header and extension headers on packets. The advanced API uses ancillary data for passing data to or from the socket manager.

There are also setsockopt() / getsockopt() options to get optional information on incoming packets:

int  on = 1;

setsockopt( fd, IPPROTO_IPV6, IPV6_PKTINFO,
            &on, sizeof(on) );
setsockopt( fd, IPPROTO_IPV6, IPV6_HOPLIMIT,
            &on, sizeof(on) );
setsockopt( fd, IPPROTO_IPV6, IPV6_HOPOPTS,
            &on, sizeof(on) );
setsockopt( fd, IPPROTO_IPV6, IPV6_DSTOPTS,
            &on, sizeof(on) );
setsockopt( fd, IPPROTO_IPV6, IPV6_RTHDR,
            &on, sizeof(on) );

When any of these options are enabled, the corresponding data is returned as control information by recvmsg(), as one or more ancillary data objects.

If IPV6_PKTINFO is enabled, the destination IP6 address and the arriving interface index are available via struct in6_pktinfo on an ancillary data stream. You can pick out the structure by looking for an ancillary data item whose cmsg_level is IPPROTO_IPV6 and whose cmsg_type is IPV6_PKTINFO.

If IPV6_HOPLIMIT is enabled, the hoplimit value on the packet is made available to the application. The ancillary data stream contains an integer data item with a cmsg_level of IPPROTO_IPV6 and a cmsg_type of IPV6_HOPLIMIT.
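
The lookup described above can be sketched with the standard CMSG_FIRSTHDR(), CMSG_NXTHDR(), and CMSG_DATA() macros. The function names find_cmsg() and get_hoplimit() are hypothetical, and the sketch assumes that msg was already filled in by a successful recvmsg() call with a control buffer attached:

```c
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Return the first ancillary data item in msg with the given level
 * and type, or NULL if there is none. (Hypothetical helper.) */
struct cmsghdr *find_cmsg(struct msghdr *msg, int level, int type)
{
    struct cmsghdr *cm;

    for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
        if (cm->cmsg_level == level && cm->cmsg_type == type)
            return cm;
    }
    return NULL;
}

/* Copy the received hoplimit into *hlim.
 * Returns 0 if the item is present, or -1 if it isn't.
 * (Hypothetical helper.) */
int get_hoplimit(struct msghdr *msg, int *hlim)
{
    struct cmsghdr *cm = find_cmsg(msg, IPPROTO_IPV6, IPV6_HOPLIMIT);

    if (cm == NULL || cm->cmsg_len < CMSG_LEN(sizeof(*hlim)))
        return -1;
    /* Copy instead of casting: CMSG_DATA() may not be int-aligned. */
    memcpy(hlim, CMSG_DATA(cm), sizeof(*hlim));
    return 0;
}
```

The same find_cmsg() call with IPV6_PKTINFO as the type would locate the struct in6_pktinfo item.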

The inet6_option_space() family of functions helps you parse ancillary data items for IPV6_HOPOPTS and IPV6_DSTOPTS. Similarly, the inet6_rthdr_space() family of functions helps you parse ancillary data items for IPV6_RTHDR.

Note: The IPV6_HOPOPTS and IPV6_DSTOPTS values may appear multiple times on an ancillary data stream (note that the behavior is slightly different from the specification). Other ancillary data items appear no more than once.

You can pass ancillary data items with normal payload data, using the sendmsg() function. Ancillary data items are parsed by the socket manager, and are used to construct the IP6 header and extension headers. For the cmsg_level values listed above, the ancillary data format is the same as the inbound case.
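
As an illustration of the outbound case, the sketch below attaches a single IPV6_HOPLIMIT item to a msghdr before it's passed to sendmsg(). The helper name attach_hoplimit() is hypothetical; the item layout follows the inbound description above (an integer with a cmsg_level of IPPROTO_IPV6):

```c
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Attach one IPV6_HOPLIMIT ancillary data item to msg, using cbuf as
 * the control buffer. cbuf must be suitably aligned for a struct
 * cmsghdr and hold at least CMSG_SPACE(sizeof(int)) bytes.
 * Returns 0 on success, or -1 if cbuf is too small.
 * (Hypothetical helper.) */
int attach_hoplimit(struct msghdr *msg, void *cbuf, size_t cbuflen,
                    int hlim)
{
    struct cmsghdr *cm;

    if (cbuflen < CMSG_SPACE(sizeof(hlim)))
        return -1;
    memset(cbuf, 0, CMSG_SPACE(sizeof(hlim)));
    msg->msg_control = cbuf;
    msg->msg_controllen = CMSG_SPACE(sizeof(hlim));

    cm = CMSG_FIRSTHDR(msg);
    cm->cmsg_level = IPPROTO_IPV6;
    cm->cmsg_type = IPV6_HOPLIMIT;
    cm->cmsg_len = CMSG_LEN(sizeof(hlim));
    memcpy(CMSG_DATA(cm), &hlim, sizeof(hlim));
    return 0;
}
```

The caller still sets msg_name, msg_iov, and msg_iovlen for the payload and destination before calling sendmsg().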

Additionally, you can specify an IPV6_NEXTHOP data object. The IPV6_NEXTHOP ancillary data object specifies the next hop for the datagram as a socket address structure. In the cmsghdr structure containing this ancillary data, the cmsg_level argument is IPPROTO_IPV6, the cmsg_type argument is IPV6_NEXTHOP, and the first byte of cmsg_data is the first byte of the socket address structure.

If the socket address structure contains an IP6 address (e.g., the sin6_family argument is AF_INET6), then the node identified by that address must be a neighbor of the sending host. If that address equals the destination IP6 address of the datagram, then this is equivalent to the existing SO_DONTROUTE socket option.

For applications that don't, or can't, use the sendmsg() or the recvmsg() function, the IPV6_PKTOPTIONS socket option is defined. Setting the socket option specifies any of the optional output fields:

setsockopt( fd, IPPROTO_IPV6, IPV6_PKTOPTIONS,
            &buf, len );

The buf argument points to a buffer containing one or more ancillary data objects; the len argument is the total length of all these objects. The application fills in this buffer exactly as if the buffer were being passed to the sendmsg() function as control information.

The options set by calling setsockopt() for IPV6_PKTOPTIONS are called "sticky" options because once set, they apply to all packets sent on that socket. The application can call setsockopt() again to change all the sticky options, or it can call setsockopt() with a length of 0 to remove all the sticky options for the socket.

The corresponding receive option:

getsockopt( fd, IPPROTO_IPV6, IPV6_PKTOPTIONS,
            &buf, &len );

returns a buffer with one or more ancillary data objects for all the optional receive information that the application has previously specified that it wants to receive. The buf argument points to the buffer that the call fills in. The len argument is a pointer to a value-result integer; when the function is called, the integer specifies the size of the buffer pointed to by buf, and on return this integer contains the actual number of bytes that were stored in the buffer. The application processes this buffer exactly as if it were returned by recvmsg() as control information.

Advanced API and TCP sockets

When you're using getsockopt() with the IPV6_PKTOPTIONS option and a TCP socket, only the options from the most recently received segment are retained and returned, and only after the socket option has been set. The application isn't allowed to specify ancillary data in a call to sendmsg() on a TCP socket, and none of the ancillary data described above is ever returned as control information by recvmsg() on a TCP socket.

Conflict resolution

In some cases, there are multiple APIs defined for manipulating an IP6 header field. A good example is the outgoing interface for multicast datagrams: it can be manipulated by IPV6_MULTICAST_IF in the basic API, by IPV6_PKTINFO in the advanced API, and by the sin6_scope_id field of the socket address structure passed to the sendto() function.

In QNX Neutrino, when conflicting options are given to the socket manager, the socket manager chooses the value according to the following order of precedence (highest first):

  1. options specified by using ancillary data
  2. options specified by a sticky option of the advanced API
  3. options specified by using the basic API
  4. options specified by a socket address

Note: The conflict resolution is undefined in the API specification and depends on the implementation.
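
The ordering can be pictured as a first-match search over the four sources. The sketch below is only an illustration of that ordering for the outgoing interface, not the socket manager's code; the names are hypothetical, and treating an index of 0 as "not specified" is an assumption:

```c
#include <stddef.h>

/* Sources of an outgoing-interface setting, highest precedence first.
 * (Hypothetical names, mirroring the list above.) */
enum if_source {
    SRC_ANCILLARY,   /* ancillary data passed with the packet */
    SRC_STICKY,      /* sticky option of the advanced API */
    SRC_BASIC,       /* basic API, e.g. IPV6_MULTICAST_IF */
    SRC_SOCKADDR,    /* sin6_scope_id of the socket address */
    SRC_COUNT
};

/* Return the interface index from the highest-precedence source that
 * supplied one (nonzero), or 0 if no source did. */
unsigned int resolve_outgoing_if(const unsigned int ifindex[SRC_COUNT])
{
    size_t i;

    for (i = 0; i < SRC_COUNT; i++) {
        if (ifindex[i] != 0)
            return ifindex[i];
    }
    return 0;
}
```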

Raw IP6 Sockets

Raw IP6 sockets are connectionless, and are normally used with sendto() and recvfrom(), although you can also use connect() to fix the destination for future packets (in which case you can use read() or recv(), and write() or send()).
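
A minimal sketch of the connect() variant follows. The helper name open_raw6_connected() is hypothetical, and creating the raw socket is assumed to require root privileges, so the socket() call may fail for ordinary users:

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Open a raw IP6 socket for proto and fix its destination to the
 * textual address dst. Returns the socket, or -1 on failure (bad
 * address, insufficient privilege, and so on). (Hypothetical helper.) */
int open_raw6_connected(int proto, const char *dst)
{
    struct sockaddr_in6 sin6;
    int s;

    memset(&sin6, 0, sizeof(sin6));
    sin6.sin6_family = AF_INET6;
    if (inet_pton(AF_INET6, dst, &sin6.sin6_addr) != 1)
        return -1;

    s = socket(AF_INET6, SOCK_RAW, proto);
    if (s == -1)
        return -1;
    if (connect(s, (struct sockaddr *)&sin6, sizeof(sin6)) == -1) {
        close(s);
        return -1;
    }
    return s;   /* send()/recv() can now be used without an address */
}
```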

If proto is 0, the default protocol IPPROTO_RAW is used for outgoing packets, and only incoming packets destined for that protocol are received. If proto is nonzero, that protocol number is used on outgoing packets and to filter incoming packets.

Outgoing packets automatically have an IP6 header prepended to them (based on the destination address and the protocol number the socket is created with). Incoming packets are received without the IP6 header or extension headers.

All data sent via raw sockets must be in network byte order; all data received via raw sockets is in network byte order. This differs from the IPv4 raw sockets, which didn't specify a byte ordering and typically used the host's byte order.

Another difference from IPv4 raw sockets is that complete packets (i.e., IP6 packets with extension headers) can't be read or written using the IP6 raw sockets API. Instead, ancillary data objects are used to transfer the extension headers, as described above.

All fields in the IP6 header that an application might want to change (i.e., everything other than the version number) can be modified using ancillary data and/or socket options by the application for output. All fields in a received IP6 header (other than the version number and Next Header fields) and all extension headers are also made available to the application as ancillary data on input. Hence, there's no need for a socket option similar to the IPv4 IP_HDRINCL socket option.

When writing to a raw socket, the socket manager automatically fragments the packet if the size exceeds the path MTU, inserting the required fragmentation headers. On input, the socket manager reassembles received fragments, so the reader of a raw socket never sees any fragment headers.

Most IPv4 implementations give special treatment to a raw socket created with a third argument to socket() of IPPROTO_RAW, whose value is normally 255. This value has no special meaning to an IP6 raw socket (and the IANA currently reserves the value of 255 when used as a next-header field).

For ICMP6 raw sockets, the socket manager calculates and inserts the mandatory ICMP6 checksum.

For other raw IP6 sockets (i.e., for raw IP6 sockets created with a third argument other than IPPROTO_ICMPV6), the application must set the IPV6_CHECKSUM socket option to have the socket manager:

  1. Compute and store a pseudo header checksum for output.
  2. Verify the received pseudo header checksum on input, discarding the packet if the checksum is in error.

This option prevents applications from having to perform source-address selection on the packets they send. The checksum incorporates the IP6 pseudo-header, defined in Section 8.1 of RFC 2460. The value of the socket option is an integer offset into the user data that specifies where the checksum is located:

int offset = 2;
setsockopt( fd, IPPROTO_IPV6, IPV6_CHECKSUM,
            &offset, sizeof(offset));
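
For reference, the pseudo-header checksum of RFC 2460, Section 8.1 can be sketched in user code as below. This is an illustration of the calculation only, not the socket manager's implementation; the function names are hypothetical, and the checksum field inside the data is assumed to be zero while the sum is computed:

```c
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>

/* Add len bytes at p to a running sum as big-endian 16-bit words. */
static uint64_t sum_bytes(uint64_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += ((uint64_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint64_t)p[0] << 8;   /* pad an odd trailing byte with 0 */
    return sum;
}

/* Compute the 16-bit one's-complement checksum over the IP6
 * pseudo-header (source, destination, upper-layer length, three zero
 * bytes, next header) followed by the upper-layer data.
 * (Hypothetical helper.) */
uint16_t ip6_pseudo_checksum(const struct in6_addr *src,
                             const struct in6_addr *dst,
                             uint8_t next_header,
                             const uint8_t *data, uint32_t len)
{
    uint8_t tail[8] = {
        (uint8_t)(len >> 24), (uint8_t)(len >> 16),
        (uint8_t)(len >> 8),  (uint8_t)len,
        0, 0, 0, next_header
    };
    uint64_t sum = 0;

    sum = sum_bytes(sum, src->s6_addr, 16);
    sum = sum_bytes(sum, dst->s6_addr, 16);
    sum = sum_bytes(sum, tail, sizeof(tail));
    sum = sum_bytes(sum, data, len);
    while (sum >> 16)                 /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

With an offset of 2, as in the example above, the result would be stored in network byte order in bytes 2 and 3 of the user data. Recomputing the sum over data that already contains a correct checksum yields 0.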

By default, this socket option is disabled. Setting the offset to -1 also disables the option. Disabled means:

  1. The socket manager won't calculate and store a checksum for outgoing packets.
  2. The socket manager won't verify a checksum for received packets.

Note:
  • Since the checksum is always calculated by the socket manager for an ICMP6 socket, applications can't generate ICMP6 packets with incorrect checksums (presumably for testing purposes) using this API.
  • The IPV6_NEXTHOP object/option isn't fully implemented.

Based on:

RFC 2553, RFC 2292, RFC 2460