Ian Fisher

Week 3: Filesystems, part 2

Types of files

Linux distinguishes between several types of files:

This list is not exhaustive, but these are the key file types we are concerned with now.

The classic Unix permissions model

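Linux has inherited the classic file permissions model from Unix. There are three permission bits (read, write, and execute), each of which has a different meaning for regular files versus directories. For a regular file, read allows reading its contents, write allows modifying them, and execute allows running the file as a program. For a directory, read allows listing its entries, write allows creating, renaming, and deleting entries (in combination with execute), and execute allows accessing the entries inside it.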

Execute for directories

The execute permission for directories is tricky, and its name is really a misnomer. It has nothing to do with execution; since the bit was otherwise meaningless for directories, Unix appropriated it for something unrelated: permission to access the entries inside the directory. It seems similar to read, and indeed the read and execute bits for directories usually have the same value, but all four combinations of the two bits are possible. With both set, you can list the directory's entries and access them. With only read, you can list the names of the entries but cannot open them or read their metadata. With only execute, you can access an entry whose name you already know but cannot list the directory. With neither, you can do neither.
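The difference between read and execute on a directory can be checked with a small experiment. The following is a minimal sketch, not part of the course code: probe_directory and the CAN_LIST and CAN_ACCESS constants are names made up for this example. It assumes that opendir succeeding is a fair stand-in for "can list" and that stat succeeding on a known entry is a fair stand-in for "can access".

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical result flags for this example only. */
enum { CAN_LIST = 1, CAN_ACCESS = 2 };

/* Reports what the current user can do with the directory `dir`:
 * CAN_LIST if its entries can be listed, CAN_ACCESS if the entry called
 * `name` inside it can be reached by path. Returns -1 when run as root,
 * because the superuser bypasses these permission checks and the result
 * would not say anything about the permission bits. */
int probe_directory(const char* dir, const char* name) {
    if (geteuid() == 0) {
        return -1;
    }

    int result = 0;

    /* Listing requires the read bit on the directory. */
    DIR* dirp = opendir(dir);
    if (dirp != NULL) {
        result |= CAN_LIST;
        closedir(dirp);
    }

    /* Reaching an entry by path requires the execute bit on the directory. */
    char path[4096];
    struct stat statbuf;
    snprintf(path, sizeof path, "%s/%s", dir, name);
    if (stat(path, &statbuf) == 0) {
        result |= CAN_ACCESS;
    }

    return result;
}
```

To try it, create a directory containing one file, then call probe_directory after setting the directory's mode to 0500, 0400, 0100, and 0000 in turn with chmod.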

Owners and groups

Every file has an owner and a group. The three permissions (read, write, execute) can be separately set for the owner, the group, and everyone else. We can write a file's permissions as a nine-character string, such as rwxr-xr-x, where the first three characters are the owner's permissions, the next three the group's, and the last three everyone else's.

Example: Suppose I'm a professor teaching a CS course, and I have a file of homework solutions. I should have read and write access to the file. My TAs, who belong to the teaching_assistants group, should have read-only access. Everyone else should have no access at all. Therefore, my file permissions should look like:

$ ls -l solutions.txt
-rw-r----- ian teaching_assistants ... solutions.txt

(The initial dash is how ls indicates that this is a regular file; a directory would show d instead.)

Octal notation

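Instead of a nine-character string, we can represent permissions more concisely as a three-digit octal number, for example 755. To break it down: each digit covers one class (owner, group, and everyone else, in that order) and is the sum of 4 for read, 2 for write, and 1 for execute. So 755 means 7 = 4 + 2 + 1 (rwx) for the owner and 5 = 4 + 1 (r-x) for both the group and everyone else, which is the same as rwxr-xr-x.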

Some common permissions are:
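To make the mapping between the two notations concrete, here is a minimal sketch in C that converts a mode into its nine-character string. The name mode_to_string is made up for this example and is not a library function. The sketch only looks at the nine classic permission bits and ignores the set-uid, set-gid, and sticky bits.

```c
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper (not a library function): writes the rwxrwxrwx form
 * of the low nine bits of `mode` into `out`, which must hold 10 bytes
 * (nine characters plus the terminating NUL). */
void mode_to_string(mode_t mode, char out[10]) {
    static const char letters[] = "rwxrwxrwx";
    strcpy(out, "---------");
    for (int i = 0; i < 9; i++) {
        /* Position 0 is the owner's read bit (0400); position 8 is
         * everyone else's execute bit (0001). */
        if (mode & (0400 >> i)) {
            out[i] = letters[i];
        }
    }
}
```

For example, mode_to_string(0755, buf) fills buf with rwxr-xr-x, and mode_to_string(0640, buf) gives rw-r-----, the permissions from the homework-solutions example. Note the leading 0, which is how C marks an octal literal.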

Bonus: chmod abbreviations

The chmod shell command understands some abbreviations:

# give the owner ('user') exec permission
$ chmod u+x myfile.txt
# remove the owner's exec permission
$ chmod u-x myfile.txt

Unfortunately they are easy to get confused:

Syscalls: File metadata and permissions

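struct stat {
  mode_t st_mode;  // file type and permission bits
  uid_t  st_uid;   // user ID of the file's owner
  gid_t  st_gid;   // group ID of the file's group
  off_t  st_size;  // size in bytes
  /* other fields */
};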

int stat(const char* pathname, struct stat* statbuf);
int fstat(int fd, struct stat* statbuf);
int chmod(const char* pathname, mode_t mode);
int fchmod(int fd, mode_t mode);
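As an illustration of how these calls fit together, the sketch below uses stat to read a file's size and then combines stat and chmod to do roughly what chmod u+x does in the shell. The function names file_size and add_owner_exec are made up for this example.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Returns the size of the file in bytes, or -1 if stat fails. */
off_t file_size(const char* pathname) {
    struct stat statbuf;
    if (stat(pathname, &statbuf) != 0) {
        perror("stat");
        return -1;
    }
    return statbuf.st_size;
}

/* Rough equivalent of `chmod u+x pathname`: reads the current mode and
 * sets the owner's execute bit. Returns 0 on success, -1 on failure. */
int add_owner_exec(const char* pathname) {
    struct stat statbuf;
    if (stat(pathname, &statbuf) != 0) {
        perror("stat");
        return -1;
    }
    /* Keep only the permission bits; st_mode also encodes the file type. */
    mode_t perms = statbuf.st_mode & 07777;
    if (chmod(pathname, perms | S_IXUSR) != 0) {
        perror("chmod");
        return -1;
    }
    return 0;
}
```

Because stat reads the size from the file's metadata, file_size does not need to open the file or read any of its contents.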

Syscalls: Directories

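int mkdir(const char* pathname, mode_t mode);

struct linux_dirent64 {
    /* other fields */
    unsigned char d_type;    // file type (regular file, directory, ...)
    char          d_name[];  // null-terminated filename; must be the last field
};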

// raw syscall
ssize_t getdents64(int fd, void* dirp, size_t count);

// libc wrapper
DIR* opendir(const char* pathname);
struct dirent* readdir(DIR* dirp);
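Here is a minimal sketch of creating and listing a directory with the libc wrappers rather than the raw getdents64 syscall. The function names make_private_directory and list_directory are made up for this example. It assumes a Linux system with glibc, where struct dirent has a d_type field; some filesystems report DT_UNKNOWN there, in which case the sketch simply prints the name without a trailing slash.

```c
#define _DEFAULT_SOURCE /* asks glibc to expose d_type and the DT_* constants */
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Creates a directory with mode 0700 (rwx for the owner only).
 * Returns 0 on success, -1 on failure. */
int make_private_directory(const char* pathname) {
    if (mkdir(pathname, 0700) != 0) {
        perror("mkdir");
        return -1;
    }
    return 0;
}

/* Prints each entry of the directory, marking subdirectories with a
 * trailing slash, and returns how many entries there were, not counting
 * "." and "..". Returns -1 if the directory cannot be opened. */
int list_directory(const char* pathname) {
    DIR* dirp = opendir(pathname);
    if (dirp == NULL) {
        perror("opendir");
        return -1;
    }

    int count = 0;
    struct dirent* entry;
    while ((entry = readdir(dirp)) != NULL) {
        /* readdir also returns the special "." and ".." entries. */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }
        printf("%s%s\n", entry->d_name, entry->d_type == DT_DIR ? "/" : "");
        count++;
    }

    closedir(dirp);
    return count;
}
```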

Syscalls: Moving and deleting files

int rename(const char* oldpath, const char* newpath);
int unlink(const char* pathname);
int rmdir(const char* pathname);
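A short sketch of how these three calls might be wrapped with error handling follows. The names move_file and delete_path are made up for this example. The EXDEV check reflects the fact that rename cannot move a file from one filesystem to another.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Moves oldpath to newpath. Returns 0 on success, -1 on failure. */
int move_file(const char* oldpath, const char* newpath) {
    if (rename(oldpath, newpath) != 0) {
        if (errno == EXDEV) {
            /* rename cannot move a file from one filesystem to another. */
            fprintf(stderr, "rename: %s and %s are on different filesystems\n",
                    oldpath, newpath);
        } else {
            perror("rename");
        }
        return -1;
    }
    return 0;
}

/* Deletes a file with unlink, or an empty directory with rmdir.
 * Returns 0 on success, -1 on failure. */
int delete_path(const char* pathname, int is_directory) {
    int result = is_directory ? rmdir(pathname) : unlink(pathname);
    if (result != 0) {
        perror(is_directory ? "rmdir" : "unlink");
        return -1;
    }
    return 0;
}
```

Note that rmdir only succeeds on an empty directory, so deleting a whole tree means deleting its contents first.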

Syscalls: File locking

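const int LOCK_SH = /* */;  // shared lock: any number of holders at once
const int LOCK_EX = /* */;  // exclusive lock: at most one holder
const int LOCK_UN = /* */;  // release the lock held through this fd
const int LOCK_NB = /* */;  // OR with LOCK_SH or LOCK_EX: fail instead of blocking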

int flock(int fd, int op);
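The sketch below shows one way flock might be used: a blocking exclusive lock, a non-blocking attempt, and an unlock. The function names open_locked, try_lock, and unlock_and_close are made up for this example, and creating the file with mode 0600 is just one reasonable choice.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/* Opens the file (creating it if needed, readable and writable by the
 * owner only) and blocks until an exclusive lock is acquired.
 * Returns the file descriptor, or -1 on failure. */
int open_locked(const char* pathname) {
    int fd = open(pathname, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        perror("open");
        return -1;
    }
    if (flock(fd, LOCK_EX) != 0) {
        perror("flock");
        close(fd);
        return -1;
    }
    return fd;
}

/* Tries to take an exclusive lock without blocking. Returns 1 if the lock
 * was acquired, 0 if someone else holds it, and -1 on any other error. */
int try_lock(int fd) {
    if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
        return 1;
    }
    if (errno == EWOULDBLOCK) {
        return 0;
    }
    perror("flock");
    return -1;
}

/* Releases the lock and closes the file descriptor.
 * Returns the result of the unlock: 0 on success, -1 on failure. */
int unlock_and_close(int fd) {
    int result = flock(fd, LOCK_UN);
    close(fd);
    return result;
}
```

The lock only affects other callers of flock on the same file. A process that opens the file and writes to it without locking is not stopped.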

In-class exercises

  1. What permissions are required to rename a file?
    Solution The basic permissions are +x on each directory in the source and destination paths, and +w on the ultimate source and destination directories. For a more comprehensive answer, see my blog post.
  2. Let's rewrite our program from last week to more efficiently find a file's size.
    Solution Use the stat syscall to retrieve st_size, instead of reading every byte of the file.

Homework exercises

  1. (★) Do syscalls follow hard links?
    Solution Yes, although to be pedantic it's misleading to speak of "following" hard links in the same way as symbolic links, since there's not really a link to be followed but a direct reference to the file's content and metadata.
  2. (★) True or false: If I acquire an exclusive lock on a file, no other process will be able to write to it.
    Solution False. Locks are advisory: they only coordinate processes that choose to acquire them. A process that never attempts to acquire the lock is not prevented from writing to the file.
  3. (★) How is the owner of a new file or directory set?
    Solution The owner of the new file is set to the effective user ID of the process that created it. We'll learn in week 4 what exactly the "effective user ID" is.
  4. (★★) Final project (database): Modify your program to create the database file with permissions locked down to the file's owner. Use file locking to ensure that get and set are synchronized. Test your solution by having the set command go to sleep for 5 seconds while holding the lock; in another terminal, ensure that get and set block until the first set command wakes up and releases the lock.
  5. (★★) Final project (web server): A busy HTTP server can accumulate lots of logs. Add a rotate command which renames http.log to http.1.log, http.1.log to http.2.log, etc. Delete any log files older than http.5.log. Next, instead of running and exiting, modify the run command to loop forever and write a line to the log file each second, sleeping in between. Use file locking to ensure that count and run are synchronized, and make sure you can still run count while the server is running.
  6. (★★) Write a simple version of rm -rf. If your language doesn't expose getdents64 directly, you can use a higher-level API. Warning: Depending on your language, listing the directory may include the special . and .. entries. Make sure to filter these out! Otherwise you might try to recursively delete the whole tree.
    Solution https://github.com/iafisher/cs644/tree/master/week3/solutions/rmrf.c
  7. (★★) Write a program that takes in a file path and prints its permissions in rwxrwxrwx format.
    Solution https://github.com/iafisher/cs644/tree/master/week3/solutions/printperm.c
  8. (★★) We have syscalls to move and delete files, but not to copy them. Why not?
    Solution Copying can be accomplished using open, read, and write; a dedicated syscall is not necessary. However, this is inefficient because you have to copy every byte twice: once from the kernel into userspace on read, and once again from userspace into the kernel on write. So Linux has a sendfile syscall that lets you directly transfer bytes between two open files.
  9. (★★) What happens if you rename a file while another process is writing to it? Make a prediction, then write a pair of programs to demonstrate what happens.
    Solution Two reasonable predictions are that (a) it will continue writing to the same file, which is now at the new path, or (b) it will keep writing to the old path. If you are doing the web server final project, you can use the logging command along with mv to test this. It turns out that (a) is correct, but only if the writing process keeps the file open: a file descriptor refers to the file itself, not to the path that was used to open it. If the process closes and reopens the file each time, it will keep writing to the old path.
  10. (★★) Is getdents64 atomic? Write a pair of programs that demonstrates its behavior.
    Solution I wrote a program that creates files 0001 through 0999 and rapidly renames, e.g., 0001 back and forth to 1001. Another thread calls getdents64 in a loop; if it ever sees both file N and N + 1000, then it was not atomic. I also tested the same thing with readdir instead of getdents64. My program never detected a non-atomic read, though I'm not very confident in this result. Certainly, consecutive calls to getdents64 cannot be atomic. This seems to be what readdir is doing under the hood (in batches), so my test program must be flawed. This Stack Overflow answer says that getdents64 is atomic, though it's not documented as such. Unless you are reading a very large directory, you will most likely get back an atomic listing regardless of what you do.
  11. (★★) Linux file permissions are a little more complicated than what was presented here. Research the concepts of the set-uid, set-gid, and sticky bits.
    Solution Normally, when an executable is run on Linux, it will run with the user ID and group ID of the user who started it. But if the set-uid or set-gid bit is set in the executable file's st_mode, then it will instead run as the file's owner or group, respectively. This is typically used when an executable needs root permissions but should be available to non-root users. Because they allow privilege escalation, set-uid executables are a potential security hole and must be written very carefully. See section 8.11 of Advanced Programming in the Unix Environment for an example.

    The sticky bit is another bit in st_mode that changes the interpretation of directory permissions: if set, then files in the directory can only be renamed or removed by the file's owner, the directory's owner, or the superuser (instead of anyone with +w permissions). Most commonly, the sticky bit is set on /tmp so that everyone can create files but not interfere with others' files. Tony Finch's blog post goes into more detail.
  12. (★★★) Is it possible to atomically overwrite a file? First, write a pair of programs (a reader and a writer) that shows that simply overwriting with write isn't atomic. Then, find a way to do it atomically.
    Solution We demonstrated in class with simultaneous.c that writes are not atomic. To overwrite atomically, create a temporary file, write to it, and then use rename to atomically replace the destination file. It's important to create the temporary file on the same filesystem as the destination file (e.g., in the same directory), because rename does not work across filesystems.
  13. (★★★) mv myfile.txt /tmp requires an additional permission beyond what we discussed. (NOTE: This turns out to not be true on the shared server, but if you have a macOS computer, you might be able to replicate it there.) What is it? Can you explain why it's necessary?
    Hint Run mv under strace. What syscall(s) does it use to move the file?
    Solution It also requires +r permission on myfile.txt. The reason is that /tmp is mounted as a separate filesystem, and rename cannot move files across filesystems. So the mv command instead reads the source file and writes it out to the destination, and this requires +r on the source file.
  14. (★★★) Linux file permissions are a lot more complicated than what was presented here. Research file ACLs and SELinux contexts. What syscalls do they use?
    Solution TODO

Further reading