| Updated: October 28, 2024 |
Another easy function to understand is the io_write() function. It gets a little more complicated because we have to handle allocating blocks when we run out (i.e., when we need to extend the file because we have written past the end of the file).
The io_write() functionality is presented in two parts: a fairly generic io_write() handler, and the block handler that actually writes the data to the blocks.
The generic io_write() handler looks at the current size of the file, the OCB's offset member, and the number of bytes being written to determine if the handler needs to extend the number of blocks stored in the fileblocks member of the extended attributes structure. Once that determination is made, and blocks have been added (and zeroed!), then the RAM-disk-specific write handler, ramdisk_io_write(), is called.
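The size check described above can be illustrated with a small, self-contained sketch. It is not part of the RAM disk source: the names blocks_for_size() and blocks_to_add() are hypothetical, and the block size is passed in as a parameter rather than taken from the RAM disk's BLOCKSIZE constant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of blocks needed to hold `size` bytes,
   rounding up to a whole block. */
static size_t
blocks_for_size (size_t size, size_t blocksize)
{
    assert (blocksize > 0);
    return (size + blocksize - 1) / blocksize;
}

/* Hypothetical helper: number of extra blocks a write of `nbytes` at
   `offset` requires, given the current file size. Returns 0 when the
   write fits within the blocks the file already has. */
static size_t
blocks_to_add (size_t cur_size, size_t offset, size_t nbytes,
               size_t blocksize)
{
    size_t newsize = offset + nbytes;

    if (newsize <= cur_size) {
        return 0;   // no growth needed
    }
    return blocks_for_size (newsize, blocksize)
         - blocks_for_size (cur_size, blocksize);
}
```

For example, assuming 4 KB blocks, blocks_to_add (6000, 2000, 8000, 4096) returns 1: the write overwrites existing data, fills the unused tail of the second block, and needs one more block. A write that ends inside the slack of the last block, such as blocks_to_add (6000, 6000, 2000, 4096), returns 0.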
The following diagram illustrates the case where we need to extend the blocks stored in the file:
Figure 1. A write that overwrites existing data in the file, adds data to the unused portion of the current last block, and then adds one more block of data.
The following shows what happens when the RAM disk fills up. Initially, the write would want to perform something like this:
Figure 2. A write that requests more space than exists on the disk.
However, since the disk is full (we could allocate only one more block), we trim the write request to match the maximum space available:
Figure 3. A write that's been trimmed due to lack of disk space.
There was only 4 KB more available, but the client requested more than that, so the request was trimmed.
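The trimming arithmetic can be sketched in isolation as follows. This is an illustration only: trim_write() is a hypothetical name, and max_size stands for the size the file actually reached after the attempt to grow it. Unlike the handler, the sketch also guards against an offset that lies beyond max_size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes that can actually be written
   at `offset` when the file could be grown only to `max_size` bytes.
   A result of 0 means there is no room at all, which is the case
   where the handler returns ENOSPC. */
static size_t
trim_write (size_t offset, size_t nbytes, size_t max_size)
{
    size_t room, granted;

    if (offset >= max_size) {
        return 0;   // nothing fits
    }
    room = max_size - offset;
    granted = (nbytes < room) ? nbytes : room;
    assert (granted <= nbytes);   // we never grant more than was asked for
    return granted;
}
```

For instance, if the file could be grown only to 12288 bytes (three 4 KB blocks) and the client asks to write 10000 bytes at offset 6000, trim_write (6000, 10000, 12288) yields 6288 bytes.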
int
cfs_io_write (resmgr_context_t *ctp, io_write_t *msg,
              RESMGR_OCB_T *ocb)
{
    cfs_attr_t  *attr;
    int         i;
    off_t       newsize;

    if ((i = iofunc_write_verify (ctp, msg, ocb, NULL)) != EOK) {
        return (i);
    }

    // shortcuts
    attr = ocb -> attr;
    newsize = ocb -> offset + msg -> i.nbytes;

    // 1) see if we need to grow the file
    if (newsize > attr -> attr.nbytes) {
        // 2) truncate to new size using TRUNCATE_ERASE
        cfs_a_truncate (attr, newsize, TRUNCATE_ERASE);
        // 3) if it's still not big enough
        if (newsize > attr -> attr.nbytes) {
            // 4) trim the client's size
            msg -> i.nbytes = attr -> attr.nbytes - ocb -> offset;
            if (!msg -> i.nbytes) {
                return (ENOSPC);
            }
        }
    }

    // 5) call the RAM disk version
    return (ramdisk_io_write (ctp, msg, ocb));
}
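The code walkthrough is as follows:
1. We compare newsize (the OCB's offset plus the number of bytes the client wants to write) against the current size of the file. If newsize isn't larger, the file doesn't need to grow, and we skip to step 5.
2. Otherwise, we call cfs_a_truncate() with TRUNCATE_ERASE to try to grow the file to newsize, zeroing the blocks that are added.
3. We check whether the file is now big enough; it won't be if the RAM disk ran out of space while growing it.
4. If it isn't, we trim the client's request to the number of bytes that fit between the OCB's offset and the end of the file. If nothing fits, we return ENOSPC.
5. We call ramdisk_io_write() to transfer the data.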
As mentioned above, the generic io_write() function isn't doing anything that's RAM-disk-specific; that's why it was separated out into its own function.
Now, for the RAM-disk-specific functionality. The following code implements the block-management logic (refer to the diagrams for the read logic):
int
ramdisk_io_write (resmgr_context_t *ctp, io_write_t *msg,
                  RESMGR_OCB_T *ocb)
{
    cfs_attr_t  *attr;
    int         sb;     // startblock
    int         so;     // startoffset
    int         lb;     // lastblock
    int         nbytes, nleft;
    int         toread;
    iov_t       *newblocks;
    int         i;
    off_t       newsize;
    int         pool_flag;

    // shortcuts
    nbytes = msg -> i.nbytes;
    attr = ocb -> attr;
    newsize = ocb -> offset + nbytes;

    // 1) precalculate the block size constants...
    sb = ocb -> offset / BLOCKSIZE;
    so = ocb -> offset & (BLOCKSIZE - 1);
    lb = newsize / BLOCKSIZE;

    // 2) allocate IOVs
    i = lb - sb + 1;
    if (i <= 8) {
        newblocks = mpool_malloc (mpool_iov8);
        pool_flag = 1;
    } else {
        newblocks = malloc (sizeof (iov_t) * i);
        pool_flag = 0;
    }
    if (newblocks == NULL) {
        return (ENOMEM);
    }

    // 3) calculate the first block size
    toread = BLOCKSIZE - so;
    if (toread > nbytes) {
        toread = nbytes;
    }
    SETIOV (&newblocks [0], (char *)
            (attr -> type.fileblocks [sb].iov_base) + so, toread);

    // 4) now calculate zero or more blocks;
    //    special logic exists for a short final block
    nleft = nbytes - toread;
    for (i = 1; nleft > 0; i++) {
        if (nleft > BLOCKSIZE) {
            SETIOV (&newblocks [i],
                    attr -> type.fileblocks [sb + i].iov_base, BLOCKSIZE);
            nleft -= BLOCKSIZE;
        } else {
            SETIOV (&newblocks [i],
                    attr -> type.fileblocks [sb + i].iov_base, nleft);
            nleft = 0;
        }
    }

    // 5) transfer data from the message directly into the ramdisk...
    resmgr_msggetv (ctp, newblocks, i, sizeof (msg -> i));

    // 6) clean up
    if (pool_flag) {
        mpool_free (mpool_iov8, newblocks);
    } else {
        free (newblocks);
    }

    // 7) use the original value of nbytes here...
    if (nbytes) {
        attr -> attr.flags |= IOFUNC_ATTR_MTIME | IOFUNC_ATTR_DIRTY_TIME;
        ocb -> offset += nbytes;
    }
    _IO_SET_WRITE_NBYTES (ctp, nbytes);
    return (EOK);
}
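To make the block arithmetic in steps 1, 3, and 4 easier to follow, here is a minimal sketch that performs the same split outside of a resource manager. It is only an illustration: seg_t, split_write(), and SKETCH_BLOCKSIZE are hypothetical names, the 4 KB block size is an assumption, and plain (block, offset, length) records stand in for the iov_t entries that the handler fills in with SETIOV().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BLOCKSIZE 4096   // assumed block size for this sketch

/* Hypothetical stand-in for one IOV entry: which block the segment
   lives in, where it starts within that block, and how long it is. */
typedef struct {
    size_t block;
    size_t offset;
    size_t len;
} seg_t;

/* Splits a write of `nbytes` at file position `offset` into per-block
   segments. Returns the number of segments stored in `segs`, or 0 if
   there is nothing to write or `max_segs` is too small. */
static size_t
split_write (size_t offset, size_t nbytes, seg_t *segs, size_t max_segs)
{
    size_t sb = offset / SKETCH_BLOCKSIZE;   // start block
    size_t so = offset % SKETCH_BLOCKSIZE;   // offset within start block
    size_t first, nleft, n = 0;

    assert (segs != NULL);
    if (nbytes == 0 || max_segs == 0) {
        return 0;
    }

    // first segment: from `so` to the end of the start block, at most
    first = SKETCH_BLOCKSIZE - so;
    if (first > nbytes) {
        first = nbytes;
    }
    segs [n].block = sb;
    segs [n].offset = so;
    segs [n].len = first;
    n++;

    // remaining segments: whole blocks, then a possibly short final one
    nleft = nbytes - first;
    while (nleft > 0) {
        size_t len = (nleft > SKETCH_BLOCKSIZE) ? SKETCH_BLOCKSIZE : nleft;

        if (n == max_segs) {
            return 0;   // caller's array is too small
        }
        segs [n].block = sb + n;
        segs [n].offset = 0;
        segs [n].len = len;
        n++;
        nleft -= len;
    }
    return n;
}
```

With these assumptions, writing 6288 bytes at offset 6000 produces two segments: 2192 bytes starting at offset 1904 of block 1, followed by all 4096 bytes of block 2. Only the first segment can start partway into a block, and only the last one can be short, which is why the handler treats those two cases specially.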