Week 6: Networking
Navigate to the CS644 home page to find the latest version.
This week we will be discussing two forms of networking: intermachine and intramachine. Actually, intramachine networking is better described as a form of IPC than networking, but we're covering them together because they both use the socket API.
The kernel doesn't expose the lowest level of networking – userspace programs cannot directly read Ethernet packets off the NIC, for instance. Nor does it implement the highest level of the network stack – there's not an in-kernel HTTP or SSH implementation. What it gives you is TCP and UDP – transport-level protocols that you can build your application protocols on top of.
Locations on a network are identified by a combination of address and port. At the IP level, an address is an IP address like 167.71.190.147, though you may use a domain name like iafisher.com instead of the raw numeric address. A port is an integer that identifies a particular service at an address. Ports allow a single computer to offer multiple networked services. There are conventional ports for different services, for instance port 22 for SSH, port 80 for HTTP, and port 443 for HTTPS, but nothing stops you from using a different port, as long as the client and server agree.
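To make the address-plus-port idea concrete, here is a minimal sketch of how an IPv4 address and a port are packed into the struct that the socket API expects. The helper name make_ipv4_endpoint is made up for illustration; inet_pton and htons are the standard calls it relies on.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill `out` with an IPv4 address and a port.
 * Returns 0 on success, -1 if `ip` is not a valid dotted-quad string. */
int make_ipv4_endpoint(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);  /* ports are stored in network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;                /* not a numeric IPv4 address */
    return 0;
}
```

Note that this only handles raw numeric addresses. Turning a domain name like iafisher.com into an address requires a lookup, which is what getaddrinfo does.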
It takes quite a few steps to set up a network connection. On the server side:
- socket to get a file descriptor
- getaddrinfo to create an address (e.g., IP address and port)
- bind to bind the file descriptor to the address
- listen to start listening for connections
- Call accept in a loop to accept a single incoming connection, as a new file descriptor
- Use send and recv (or write and read) on the new file descriptor
- shutdown to close the connection
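The server-side steps above can be sketched roughly as follows. This is not the code from hello_conn.c; the function names open_listener and greet_next_client and the backlog of 16 are assumptions made for this sketch. It calls getaddrinfo before socket so that the lookup result can supply socket's arguments, a small reordering of the list.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* socket + getaddrinfo + bind + listen. `port` is a string such as "8080".
 * Returns a listening file descriptor, or -1 on failure. */
int open_listener(const char *port)
{
    struct addrinfo hints, *res, *p;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        /* IPv4 only, to keep the sketch short */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    hints.ai_flags = AI_PASSIVE;      /* wildcard address, suitable for bind */
    if (getaddrinfo(NULL, port, &hints, &res) != 0)
        return -1;

    int fd = -1;
    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        if (bind(fd, p->ai_addr, p->ai_addrlen) == 0 && listen(fd, 16) == 0)
            break;                    /* success: keep this descriptor */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}

/* accept + send + shutdown for a single client. A real server would call
 * this in a loop. Returns 0 on success, -1 on failure. */
int greet_next_client(int listen_fd)
{
    static const char greeting[] = "hello client\n";
    int conn = accept(listen_fd, NULL, NULL);  /* new fd for this connection */
    if (conn < 0)
        return -1;
    /* A short message normally goes out in one call; robust code would loop
     * until every byte has been sent. */
    ssize_t n = send(conn, greeting, sizeof(greeting) - 1, MSG_NOSIGNAL);
    shutdown(conn, SHUT_WR);                   /* signal end of data */
    close(conn);
    return n == (ssize_t)(sizeof(greeting) - 1) ? 0 : -1;
}
```

A server's main function might then look like `int fd = open_listener("8080"); for (;;) greet_next_client(fd);`, with error handling added as needed.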
And on the client side:
- socket to get a file descriptor
- getaddrinfo to create an address (or resolve a domain name into an address)
- connect to establish a connection to the server
- send and recv
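A rough sketch of the client-side steps is below. The function name connect_to is hypothetical. As is common, it walks the list of candidate addresses that getaddrinfo returns and keeps the first one that connects.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve host:port and open a TCP connection to it.
 * Returns a connected file descriptor, or -1 on failure. */
int connect_to(const char *host, const char *port)
{
    struct addrinfo hints, *res, *p;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 results */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;                    /* lookup failed */

    int fd = -1;
    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break;                    /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

Once connect_to returns a valid descriptor, the client can call send and recv on it and close it when done.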
hello_conn.c has an example of both sides of the connection.
The first argument to socket is the domain:
- AF_INET means IPv4
- AF_INET6 means IPv6
- AF_UNIX means Unix sockets
IPv4 and IPv6 are the familiar network protocols. Unix sockets are for IPC – they copy data through the kernel and involve no actual networking.
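As an illustration of Unix sockets as IPC, here is a small sketch in which a child process sends a message to its parent over a connected pair of AF_UNIX sockets. It uses socketpair, a shortcut that skips the bind and connect steps; the function name unix_socket_ipc is made up for this example.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that sends `msg` to the parent over a Unix socket pair.
 * Returns the number of bytes the parent received, or -1 on error. */
ssize_t unix_socket_ipc(const char *msg, char *buf, size_t cap)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: write to its end and exit */
        close(sv[0]);
        send(sv[1], msg, strlen(msg), 0);
        close(sv[1]);
        _exit(0);
    }

    close(sv[1]);                   /* parent: read from the other end */
    ssize_t n = recv(sv[0], buf, cap, 0);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

No IP address or port is involved at any point: the kernel copies the bytes from one process to the other.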
The second argument to socket is the type:
- SOCK_STREAM for reliable data streams, i.e. TCP
- SOCK_DGRAM for unreliable datagrams, i.e. UDP
Note, though, that for Unix sockets, delivery is always reliable regardless of the type.
Because the socket API returns file descriptors, we can use regular file APIs like read and write to work with them, though not all APIs (e.g., lseek) make sense for sockets. But the socket-specific APIs like send and recv expose some extra options.
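Here is a small sketch of the two kinds of API side by side. It sends data with plain write, looks at it with recv and the MSG_PEEK flag (one of the extra options that read does not offer), and then consumes it with plain read. The function name peek_then_read is made up, and a socketpair stands in for a real connection.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if the peek and the read both saw all `len` bytes, else -1.
 * On success `out` holds the received bytes. */
int peek_then_read(const char *msg, size_t len, char *out, size_t cap)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    int ok = -1;
    if (write(sv[0], msg, len) == (ssize_t)len) {
        /* MSG_PEEK copies the data out but leaves it queued on the socket. */
        ssize_t peeked = recv(sv[1], out, cap, MSG_PEEK);
        /* A regular read then returns the same bytes and consumes them. */
        ssize_t got = read(sv[1], out, cap);
        if (peeked == (ssize_t)len && got == (ssize_t)len)
            ok = 0;
    }
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```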
Homework exercises
Note: The socket API may be exposed in a different library in your programming language, e.g. in Python it's in socket, not os.
- (★) What function does a DNS lookup to turn a domain name into an IP address?
  Solution: The function is getaddrinfo. This is a standard library function, not a system call: the kernel does not implement DNS.
- (★) What's the difference between bind and listen?
  Solution: bind associates a socket file descriptor with an address and port, while listen starts actively listening for new connections on that socket.
- (★) What flags do you pass to request a TCP/IP connection?
  Solution: AF_INET or AF_INET6 for the first argument, and SOCK_STREAM for the second argument.
- (★★) Final project (database): Let's provide a proper client interface to the database. Have the main process listen for connections (you can decide whether you want to do TCP or local Unix) and allow querying the database. You can decide what the protocol looks like; a simple one might have commands like get <key>\n and set <key> <value>\n. Write a client program that provides a nice command-line interface to send commands to the database.
- (★★) Final project (web server): It's finally time to make a proper web server! Use the socket API to listen for TCP connections. You can fork off a child process to handle each connection, or wait until next week when we learn about multithreading. You can make up a simple TCP-based protocol (e.g., client sends hello server\n, server sends hello client\n back) and test it with telnet, or if you're ambitious you can implement HTTP on the server side – either yourself or using an existing library.
- (★★) How can a server access its client's network address?
  Solution: When a connection is initiated, accept fills in the second argument with the client address.
- (★★) What happens if a client calls connect before the server has called bind? What about after the server has called bind, but before it has called listen? Write a program to find out.
- (★★) If I open a socket with SOCK_DGRAM, will recv always return a single packet at a time? What happens if the buffer is too small to fit the packet? Write a program to demonstrate what happens.
- (★★) What's the difference between close and shutdown? Can you write a program that shows them behaving differently?
- (★★★) Use epoll to write a single-threaded server that can simultaneously handle multiple connections.
- (★★★) You can pass an open file descriptor from one process to another via a socket. What syscall allows you to do this? When might this technique be useful?
  Solution: You can transfer an open file descriptor using the msghdr parameter of the sendmsg syscall. See Section 17.4 of Advanced Programming in the Unix Environment for details. You could use this technique to implement an "open server", a server program that implements access controls beyond what is possible in the Linux permissions model. For instance, it could enforce that a client can only open a file in append mode.
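For the last exercise, a minimal sketch of descriptor passing might look like the following. It attaches the descriptor to a one-byte message as SCM_RIGHTS ancillary data, following the pattern shown in the cmsg(3) man page. The helper names send_fd and recv_fd are made up, and `sock` is assumed to be a connected AF_UNIX socket.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer big enough for one descriptor, aligned for cmsghdr. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send the open descriptor `fd` over `sock`. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
    char byte = 0;  /* carry one byte of ordinary data with the descriptor */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;   /* "this message carries descriptors" */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from `sock`. Returns the new descriptor or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;                  /* no descriptor attached */
    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The descriptor number on the receiving side will usually differ from the sender's, but both refer to the same open file, much as if the descriptor had been duplicated with dup across the process boundary.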