The add_tar_entry() function

Updated: April 19, 2023

Header analysis is done by the add_tar_entry() function. If this looks a little bit familiar it's because the code is based on the pathwalk() function from the RAM-disk resource manager, with logic added to process the tar file header.

static int
add_tar_entry (cfs_attr_t *a, off_t off, ustar_t *t, char *tarfile)
{
  des_t   output [_POSIX_PATH_MAX];
  int     nels;
  char    *p;
  int     i;
  int     mode;

  // 1) first entry is the mount point
  output [0].attr = a;
  output [0].name = NULL;

  // 2) break apart the pathname at the slashes
  nels = 1;
  for (p = strtok (t -> name, "/"); p; p = strtok (NULL, "/"), nels++) {
    if (nels >= _POSIX_PATH_MAX) {
      return (E2BIG);
    }
    output [nels].name = p;
    output [nels].attr = NULL;
  }

  // 3) analyze each pathname component
  for (i = 1; i < nels; i++) {

    // 4) check
    if (!S_ISDIR (output [i - 1].attr -> attr.mode)) {
      return (ENOTDIR);       // effectively an internal error
    }

    // 5) check to see if the element exists...
    if (!(output [i].attr = search_dir (output [i].name,
                                        output [i-1].attr))) {
      mode = parse_mode (t);

      // 6) intermediate directory needs to be created...
      if (S_ISDIR (mode) || (i + 1 < nels)) {
        output [i].attr = cfs_a_mkdir (output [i - 1].attr,
                                       output [i].name, NULL);
        tar_to_attr (t, &output [i].attr -> attr);
        // kludge for implied "directories"
        if (S_ISREG (output [i].attr -> attr.mode)) {
          output [i].attr -> attr.mode =
            (output [i].attr -> attr.mode & ~S_IFREG) | S_IFDIR;
        }

      // 7) add a regular file
      } else if (S_ISREG (mode)) {
        output [i].attr = cfs_a_mkfile (output [i - 1].attr,
                                        output [i].name, NULL);
        tar_to_attr (t, &output [i].attr -> attr);
        output [i].attr -> nels = output [i].attr -> nalloc = 1;
        output [i].attr -> type.vfile.name = tarfile;
        output [i].attr -> type.vfile.off = off;

      // 8) add a symlink
      } else if (S_ISLNK (mode)) {
        output [i].attr = cfs_a_mksymlink (output [i - 1].attr,
                                           output [i].name, NULL);
        tar_to_attr (t, &output [i].attr -> attr);
        output [i].attr -> type.symlinkdata = strdup (t -> linkname);
        output [i].attr -> attr.nbytes = strlen (t -> linkname);

      } else {
        // code prints an error message here...
        return (EBADF);         // effectively an internal error
      }
    }
  }

  return (EOK);
}

The code walkthrough is:

  1. Just as in the RAM disk's pathwalk(), the first entry in the output array is the mount point.
  2. And, just as in the RAM disk's pathwalk(), we break apart the pathname at the / delimiters.
  3. Then we use the for loop to analyze each and every pathname component.
  4. This step is a basic check—we're assuming that the .tar file came from a normal, sane, and self-consistent filesystem. If that's the case, then the parent of an entry will always be a directory; this check ensures that that's the case. (It's a small leftover from pathwalk().)
  5. Next we need to see if the component exists. Just as in pathwalk(), we need to be able to verify that each component in the pathname exists, not only for confidence, but also for implied intermediate directories.
  6. In this case, the component does not exist, and yet there are further pathname components after it! This is the case of the implied intermediate directory; we simply create an intermediate directory and “pretend” that everything is just fine.
  7. In this case, the component does not exist, and it's a regular file. We just go ahead and create the regular file. Note that we don't check to see if there are any components following this regular file, for two reasons. In a sane, consistent filesystem, this wouldn't happen anyway (so we're really not expecting this case in the .tar file). More importantly, that error will be caught in step 4 above, when we go to process the next entry and discover that its parent isn't a directory.
  8. In this case, we create a symbolic link.