The File System


The File System
        The Unix view of files
        Directories and filenames
        Permissions
        Inodes
        Other stuff


Reading: The Unix Programming Environment, Chapter 2

The Unix view of files

  • Everything is a file in Unix
    • Files
    • Devices
    • User sessions
  • There aren't "filetypes" like you would think of them in Windows or MacOS
    • Files are files
    • The contents of a file determine its type
    • The file command examines the content of the file to figure out what it is
  • The newline character divides lines
    • written in C strings as "\n", in octal as "\012" and in hexadecimal as "0x0a"
    • Windows uses a carriage return and a linefeed
      • i.e. "\r\n", often abbreviated "CRLF"
    • Classic Mac OS (before OS X) used a carriage return ("\r"); modern macOS uses "\n" like other Unix systems
  • The contents of a file and the interpretation of those contents are separate matters
    • A tab in a text file just means that an octal '\011' or hex '0x09' is stored at that point in the file
      • What that character means is up to the program
        • Most programs just happen to align the following text to the next tab stop
        • Most often the tab stops default to 8 characters
        • So text ends up at column 9, 17, 25, etc.
  • Control-D ("^D") typed at a terminal signals end of file on input, but there is no end-of-file character stored in Unix files
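
The points above about newlines, tabs, and the absence of an end-of-file character can be checked by looking at the raw bytes. The following is a minimal sketch, assuming a POSIX shell with od, wc, and mktemp available; the file names (unix.txt, dos.txt, notes.jpg) and variable names are made up for illustration.

```shell
# Work in a scratch directory so nothing existing is touched.
demo_dir=$(mktemp -d)

# One line of text containing a tab, ended by a newline.
printf 'a\tb\n' > "$demo_dir/unix.txt"
# The same text with a Windows-style CRLF line ending.
printf 'a\tb\r\n' > "$demo_dir/dos.txt"

# od -c names each byte: the tab shows up as \t, the newline as \n,
# and the carriage return as \r. Nothing marks the end of the file.
od -c "$demo_dir/unix.txt"
od -c "$demo_dir/dos.txt"

# Byte counts: 4 for the Unix file, 5 for the CRLF file.
unix_bytes=$(wc -c < "$demo_dir/unix.txt" | tr -d ' ')
dos_bytes=$(wc -c < "$demo_dir/dos.txt" | tr -d ' ')
echo "unix.txt: $unix_bytes bytes, dos.txt: $dos_bytes bytes"

# The same bytes as octal values: a=141, tab=011, b=142, newline=012.
unix_octal=$(od -A n -t o1 "$demo_dir/unix.txt" | tr -s ' \n' ' ' | sed 's/^ //; s/ $//')
echo "$unix_octal"

# file(1) guesses the type from the contents, not from the name.
# It may not be installed everywhere, so the call is guarded.
cp "$demo_dir/unix.txt" "$demo_dir/notes.jpg"
if command -v file >/dev/null 2>&1; then
    file "$demo_dir/notes.jpg"
fi

rm -rf "$demo_dir"
```

The misleading .jpg name is there only to show that the name plays no part in what file reports.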

Directories and filenames

  • Every file is identified by its full name (directory plus filename), and a full name refers to only one file
  • Two different files can be named "foo" if they're in different directories, but you can't have two files named "foo" in the same directory
  • Filenames can be relative or absolute
    • Absolute filenames start at the top of the "tree"
    • Relative filenames start at the current directory
  • The Unix filesystem is case-sensitive
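
As a small illustration of the naming rules above, the sketch below creates two files that are both called "foo" in different directories and reaches them by absolute and relative names. The directory names a and b, the file contents, and the variable names are hypothetical.

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/a" "$demo_dir/b"

# Two different files may both be called "foo" if they live in
# different directories.
echo 'first' > "$demo_dir/a/foo"
echo 'second' > "$demo_dir/b/foo"

# Absolute name: starts at the top of the tree, so it works from
# any current directory.
abs_contents=$(cat "$demo_dir/a/foo")

# Relative names: resolved starting from the current directory.
cd "$demo_dir/b" || exit 1
rel_contents=$(cat foo)             # the foo in b
sibling_contents=$(cat ../a/foo)    # up one level, then down into a

echo "absolute: $abs_contents, relative: $rel_contents, via ..: $sibling_contents"

cd / && rm -rf "$demo_dir"
```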

Permissions

  • Every file has 12 bits that define its permissions
    • i.e. who can do what with the file
    • usually specified in octal
    • called the file's "mode"
  • Permission bits
    • One set each for user, group, and others
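    • 4 (r--) -- read
      • Can read the contents of the file, or list the names in the directory (ls)
    • 2 (-w-) -- write
      • Can modify the file, or create, remove, and rename entries in the directory
    • 1 (--x) -- execute
      • Can execute the file, or search the directory (cd into it and access the files in it)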
  • The first three of the 12 bits (the leading octal digit) mainly control the behavior of executables
    • 4000
      • setuid -- when the program is run, it will run with the effective UID of the owner of the program
      • Shown in ls -l as an "s" in the user execute spot
    • 2000
      • setgid -- when the program is run, it will run with the effective GID of the group of the program
      • Shown in ls -l as an "s" in the group execute spot
    • 1000
      • "sticky bit" -- originally meant to keep a program in memory, rather than swapping it out
        • Also used for directories. If the sticky bit is set on a directory, a file in it may be deleted or renamed only by the file's owner, the directory's owner, or the superuser. Often used in temp directories such as /tmp.
      • Shown in ls -l as a "t" in the others execute spot
  • Chmod
    • Change file permissions ("change mode")
      • Only the owner of a file (or the superuser) can change its permissions
    • Two forms of usage:
      • chmod perm filename
      • example
        • chmod 755 myprog
        • ls -l myprog
        • -rwxr-xr-x 1 me mygroup 222 Jan 20 10:22 myprog
      • chmod who+what filename or chmod who-what filename
        • chmod u+x myprog
          • Adds execute permission for the owner ("user") for myprog
        • chmod +x myprog
          • Adds all three execute bits for myprog (except any that are masked out by the umask)
        • chmod o-w myfile
          • Removes write permission for "others"
  • umask
    • Controls "default" permissions
    • Any bit set in the umask is cleared in the mode of newly created files
      • e.g. with a umask of 022, a file created with mode 666 ends up as 644 (rw-r--r--)
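
The mode arithmetic above can be tried out in a scratch directory. This is a sketch only, assuming a POSIX shell and an ls that prints the mode string in the first ten columns of its long listing; myprog echoes the example above, while shared, newfile, newdir, and the variable names are invented for the demonstration.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
old_umask=$(umask)

# Octal form: 7 = 4+2+1 (rwx) for the user, 5 = 4+1 (r-x) for
# group and for others.
touch myprog
chmod 755 myprog
mode_755=$(ls -l myprog | cut -c1-10)
echo "$mode_755"       # -rwxr-xr-x

# The leading digit 4 adds setuid: an "s" in the user execute spot.
chmod 4755 myprog
mode_setuid=$(ls -l myprog | cut -c1-10)
echo "$mode_setuid"    # -rwsr-xr-x

# Symbolic form: start again from 755, then remove write for the
# owner and read/execute for others.
chmod 755 myprog
chmod u-w,o-rx myprog
mode_sym=$(ls -l myprog | cut -c1-10)
echo "$mode_sym"       # -r-xr-x---

# Sticky bit on a directory: a "t" in the others execute spot.
mkdir shared
chmod 1777 shared
mode_sticky=$(ls -ld shared | cut -c1-10)
echo "$mode_sticky"    # drwxrwxrwt

# umask: touch asks for 666, mkdir asks for 777; bits set in the
# mask (here 027) are cleared from both.
umask 027
touch newfile
mkdir newdir
mode_newfile=$(ls -l newfile | cut -c1-10)
mode_newdir=$(ls -ld newdir | cut -c1-10)
echo "$mode_newfile"   # -rw-r-----
echo "$mode_newdir"    # drwxr-x---

# Put things back the way they were.
umask "$old_umask"
cd / && rm -rf "$demo_dir"
```

The cut -c1-10 keeps just the type character and the nine permission characters, which avoids the extra marker some systems append for ACLs or security contexts.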

Inodes

  • The "inode" is the data structure on disk that is associated with a file
    • In essence, the inode is the file
  • The inode contains information about the size of the file, the permissions, where on disk the contents of the file are stored, etc.
  • Times (and their names in the stat(2) structure)
    • Modification time (mtime)
      • When the contents of the file were last changed
      • This is the time shown by ls -l
    • Access time (atime)
      • When the file was last accessed
    • Inode change time (ctime)
      • When the inode itself was last modified
        • e.g. by a change of owner, permissions, or link count
  • A directory is just a file which contains names and the inode numbers corresponding to those names
    • The entry is called a "link"
    • The ln (link) command can make extra links
      • ln -s creates a new inode that is a type of file called a "symbolic link" or "soft link", which points to another file by name
  • You don't really delete files in Unix
    • You remove directory entries
      • Using the unlink() system call
    • When all directory entries to an inode are removed (the "link count" is zero) and no process still has the file open, the file is "deleted" and the disk space is marked as available
  • Different ways of "copying a file"
    • ln makes another directory entry pointing to the same inode. It is not a copy. The "link count" in the inode increases by one
    • ln -s creates a new file, which "points" to the name of the first file
    • cp creates a new file whose contents are a copy of the original
    • mv changes the directory entry, but doesn't affect the inode (as long as the file stays on the same filesystem)
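
The difference between ln, ln -s, cp, and mv is easiest to see by watching inode numbers and link counts. The sketch below assumes a POSIX shell; the file names and the two helper functions inode_of and links_of are hypothetical, written here only to pull single fields out of ls output.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

echo 'hello' > original

ln original hardlink      # second directory entry, same inode
ln -s original softlink   # new file that stores the name "original"
cp original copy          # new inode with its own copy of the data

# ls -i prints the inode number in front of each name.
ls -li

# Helpers (illustrative): first field of ls -i is the inode number,
# second field of ls -l is the link count.
inode_of() { ls -i "$1" | awk '{print $1}'; }
links_of() { ls -l "$1" | awk '{print $2}'; }

if [ "$(inode_of original)" = "$(inode_of hardlink)" ]; then
    same_inode=yes
else
    same_inode=no
fi
if [ "$(inode_of original)" = "$(inode_of copy)" ]; then
    copy_shares=yes
else
    copy_shares=no
fi
links_before=$(links_of original)   # 2: original and hardlink

# mv within one filesystem rewrites the directory entry only.
inode_before_mv=$(inode_of copy)
mv copy renamed
inode_after_mv=$(inode_of renamed)

# rm removes one name; the data stays reachable through the other.
rm original
links_after=$(links_of hardlink)    # 1
hard_contents=$(cat hardlink)       # hello

# The symbolic link refers to a name that no longer exists.
if cat softlink >/dev/null 2>&1; then soft_ok=yes; else soft_ok=no; fi

echo "same inode: $same_inode, links: $links_before -> $links_after, softlink readable: $soft_ok"

cd / && rm -rf "$demo_dir"
```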

Other stuff

  • We won't spend time now on the directory hierarchy or device file information in this chapter
  • But /dev/null is a very useful device
    • Often referred to as "the bit bucket"
    • Things written to /dev/null are forever gone
      • See the book's example of timing a grep operation, and directing stdout to /dev/null, because we don't actually care about the results, just the timing
    • Reading from /dev/null gives you an empty file
    • Copying from /dev/null to an existing file truncates the file
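
The three /dev/null behaviors listed above can be confirmed with a few commands. This is a minimal sketch, assuming a POSIX shell; the file name data and the variable names are made up, and it does not reproduce the book's timing example.

```shell
demo_dir=$(mktemp -d)
echo 'some text' > "$demo_dir/data"

# Output written to /dev/null is discarded; the exit status of the
# command is still available.
grep text "$demo_dir/data" > /dev/null
grep_status=$?        # 0, because a match was found

# Reading from /dev/null hits end of file immediately: zero bytes.
null_bytes=$(wc -c < /dev/null | tr -d ' ')

# Copying /dev/null over an existing file truncates it to length 0.
cp /dev/null "$demo_dir/data"
data_bytes=$(wc -c < "$demo_dir/data" | tr -d ' ')

echo "grep status: $grep_status, /dev/null: $null_bytes bytes, data: $data_bytes bytes"

rm -rf "$demo_dir"
```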