|
5 | 5 | \begin{slide}
|
6 | 6 | \sltitle{File API}
|
7 | 7 | \begin{itemize}
|
8 |
| -\item before working with a file, it must be first open via |
| 8 | +\item before working with a file, it must first be opened via
9 | 9 | \funnm{open}() or \funnm{creat}()
|
10 | 10 | \item open files are accessible via \emph{file descriptors}, numbered from 0.
|
11 | 11 | Several descriptors can share the same file opening (read/write mode, position).
|
|
112 | 112 | current mask that can be changed via a shell command \texttt{umask} -- those
|
113 | 113 | bits in \emph{mode}, also set in the process umask, are nullified. The
|
114 | 114 | default umask value is typically (and historically) \texttt{022}. We recommend
|
115 |
| -you to always set it to \texttt{077} in your profile script. Never do that for |
| 115 | +that you always set it to \texttt{077} in your profile script. Never do that for |
116 | 116 | root, though; otherwise you will end up with a system in a non-supported
|
117 | 117 | configuration -- non-privileged users will not be able to run installed
|
118 | 118 | software, what worked before may stop working, etc.
|
119 | 119 | \item If the \emph{mode} argument is required and not specified, you get
|
120 |
| -whatever is on the stack. Both flags and the mode are stored in the system file |
| 120 | +whatever is on the stack. Both flags and mode are stored in the system file |
121 | 121 | table, see page \pageref{OPENFILETABLES}.
|
122 | 122 | \item Macros for use with \emph{mode} can usually be found in the manual page
|
123 | 123 | for \texttt{chmod(2)}, and you can also find them in the \texttt{stat.h} header
|
|
142 | 142 | as historically, implementations used 0 for the read-only flag. The standard
|
143 | 143 | defines that only one of those three flags may be used.
|
144 | 144 | \item It is possible to open and create a file for writing so that writing is
|
145 |
| -disallowed by its mode. It will work for that file opening but any other file |
146 |
| -opening for writing will fail. |
| 145 | +disallowed by its mode. Writing will work through that file opening but any other
| 146 | +attempt to open the file for writing will fail.
147 | 147 | \item You need write permission to use \texttt{O\_TRUNC}.
|
148 | 148 | \item The behavior of \texttt{O\_EXCL} without using \texttt{O\_CREAT} at the
|
149 | 149 | same time is undefined.
|
|
179 | 179 | \label{CREAT}
|
180 | 180 |
|
181 | 181 | \begin{itemize}
|
182 |
| -\item The \texttt{open} call allows to open a regular file, a device, or a named |
| 182 | +\item The \texttt{open} call allows opening of a regular file, device, or named |
183 | 183 | pipe. However, it (and \texttt{creat} as well) can only create a regular file,
|
184 | 184 | so you need the other two calls for non-regular files.
|
185 |
| -\item The test of a file existence using the flag \texttt{O\_EXCL} and its |
| 185 | +\item The test of a file's existence using the flag \texttt{O\_EXCL} and its |
186 | 186 | subsequent creation if it did not exist, is an atomic operation. You can use
|
187 | 187 | that for lock files but only with the \texttt{open} call, not \texttt{creat}.
|
188 | 188 | \item You need extra privileges to create device special files (e.g. to be a
|
|
223 | 223 | \setlength{\itemsep}{0.8\itemsep}
|
224 | 224 | \item For any Unix system, a file is just a sequence of bytes without any inner
|
225 | 225 | structure.
|
226 |
| -\item \emsl{Behavior of \texttt{read} and \texttt{write} depends on the type of |
| 226 | +\item The \emsl{behavior of \texttt{read} and \texttt{write} depends on the type of |
227 | 227 | the file} (regular, device, pipe, or socket) and whether the file is in a
|
228 | 228 | blocking or non-blocking mode (flag \texttt{O\_NONBLOCK} on file opening, see
|
229 | 229 | page \pageref{O_NONBLOCK}).
|
|
236 | 236 | \texttt{read} will block until some data becomes available, a non-blocking
|
237 | 237 | \texttt{read} returns -1 and sets \texttt{errno} to \texttt{EAGAIN}.
|
238 | 238 | \item \texttt{write} returns a non-zero number of bytes less than \emph{nbyte}
|
239 |
| -if less then \emph{nbyte} bytes can fit the file (e.g. disk full), if the call |
| 239 | +if fewer than \emph{nbyte} bytes can fit into the file (e.g. disk full), if the call
240 | 240 | was interrupted by a signal, or if \verb#O_NONBLOCK# was set and only part of
|
241 | 241 | the data fits into a pipe, socket, or a device; without \verb#O_NONBLOCK#
|
242 | 242 | the call will block until all the data can be written. If nothing can be
|
|
268 | 268 | \begin{itemize}
|
269 | 269 | \item releases \texttt{fildes}, if it was the last descriptor for a file
|
270 | 270 | opening, closes the file
|
271 |
| -\item if number of links is 0, the file data is released |
272 |
| -\item if the last pipe descriptor is closed, remaining data is lost |
273 |
| -\item on a process termination, implicit \texttt{close} is called on all |
| 271 | +\item if the number of links is 0, the file data is released |
| 272 | +\item if the last pipe descriptor is closed, any remaining data is lost |
| 273 | +\item on process termination, an implicit \texttt{close} is called on all |
274 | 274 | descriptors
|
275 | 275 | \end{itemize}
|
276 | 276 | \end{slide}
|
|
391 | 391 | \item When writing to a pipe without a consumer (i.e. the producer opened the
|
392 | 392 | pipe when there was at least one existing consumer), the kernel will send the
|
393 | 393 | producer a signal \texttt{SIGPIPE} (``broken pipe''). See the following
|
394 |
| -example. For simplicity, we are using an unnamed pipe but that does not matter |
395 |
| -as it would have behaved in the same manner. The \texttt{date(1)} command never |
| 394 | +example. For simplicity, we are using an unnamed pipe but a named pipe |
| 395 | +would behave in the same manner. The \texttt{date(1)} command never |
396 | 396 | reads anything from its standard input so it is guaranteed that the producer,
|
397 | 397 | \texttt{dd(1)}, will be writing to a pipe without a consumer. If a process is
|
398 | 398 | killed by a signal, the shell provides a signal number added to 128 as its
|
|
409 | 409 |
|
410 | 410 | \item When opening a pipe for writing only with \texttt{O\_NONBLOCK} and without
|
411 | 411 | an existing consumer, the call returns -1 and \texttt{errno} is set to
|
412 |
| -\texttt{ENXIO}. This asymmetry to opening a pipe for reading in a non-blocking |
| 412 | +\texttt{ENXIO}. This asymmetry with respect to opening a pipe for reading in non-blocking
413 | 413 | mode is due to the fact that it is not desirable to have data in a pipe that may
|
414 | 414 | not be read in a short period of time. The Unix system does not allow for
|
415 |
| -storing pipe data for arbitrary length of time. Without the |
| 415 | +storing pipe data for an arbitrary length of time. Without the |
416 | 416 | \texttt{O\_NONBLOCK} flag, the process will block while waiting for a consumer.
|
417 |
| -By asymmetry we mean that the system does not mind to keep consumers without |
| 417 | +By asymmetry, we mean that the system allows consumers without |
418 | 418 | producers but it tries to avoid writers without existing readers.
|
419 | 419 | \item If you want to create a process that sits on a named pipe and processes
|
420 | 420 | data from producers, you need to open it with the flag \texttt{O\_RDWR} even
|
421 |
| -that you do not intend to write it. If you do not use the flag, you might end |
422 |
| -up with \texttt{read} returning 0 after all producers, perhaps temporarily only, |
| 421 | +if you do not intend to write to it. If you do not use the flag, you might end |
| 422 | +up with \texttt{read} returning 0 after all producers, perhaps only temporarily, |
423 | 423 | disappear, which could be solved by busy waiting. A much better solution would
|
424 | 424 | be to use the \texttt{select} call, see page \pageref{SELECT}.
|
425 | 425 | \item Writing data of length \texttt{PIPE\_BUF} bytes or less
|
|
486 | 486 |
|
487 | 487 | \begin{itemize}
|
488 | 488 | \item \label{LSEEK} The first byte is at position 0. If it makes sense, you may
|
489 |
| -use a negative number for setting \emph{offset}. Example: |
| 489 | +use a negative number for setting the \emph{offset}. Example: |
490 | 490 | \example{read/lseek.c}.
|
491 |
| -\item If it legal to move beyond the end of the file. If data is written there, |
| 491 | +\item It is legal to move beyond the end of the file. If data is written there, |
492 | 492 | the file size will be set accordingly and the ``holes'' will read as zeros.
|
493 | 493 | Note that just changing the file position will not increase the file size.
|
494 | 494 | \item You can get the file size via \texttt{lseek(fildes, 0, SEEK\_END)}.
|
495 | 495 | \item The three most common operations with \texttt{lseek} are: setting the
|
496 | 496 | position from the beginning of a file, setting the position to the end of a
|
497 | 497 | file, and getting the current file position (0 with \texttt{SEEK\_CUR}).
|
498 | 498 | \item There is no I/O involved when calling \texttt{lseek}.
|
499 |
| -\item You can obviously use \texttt{lseek} not only for subsequent calls |
| 499 | +\item You can obviously use \texttt{lseek} not only for subsequent calls to |
500 | 500 | \texttt{read} and \texttt{write} but also for another call to \texttt{lseek}.
|
501 | 501 | \item \label{BIG_FILE} Beware of files with holes as they may lead to problems
|
502 | 502 | with backing up the data. Example: \example{read/big-file.c} demonstrates that
|
503 |
| -moving a sparse file may end up in the actual storage data occupation increase. |
504 |
| -It greatly depends on the system you run, what an archiving utility is used, and |
505 |
| -their versions. Some utilities provide means to preserve holes, for example, |
| 503 | +moving a sparse file may result in an increase in the actual storage occupied.
| 504 | +It greatly depends on the system you run, which archiving utility is used, and
| 505 | +their versions. Some utilities provide the means to preserve holes, for example,
506 | 506 | \texttt{dd} with \texttt{conv=sparse}, \texttt{tar} with \texttt{-S},
|
507 | 507 | \texttt{rsync} with \texttt{--sparse}, etc.
|
508 | 508 | \item Beware of confusing the parameters. The second line below looks OK but
|
509 | 509 | the arguments are in reversed order. What is more, \texttt{SEEK\_SET} is
|
510 | 510 | defined as 0 and \texttt{SEEK\_CUR} is 1, so the file position is not moved
|
511 |
| -which is not by itself a disastrous thing, and makes it more difficult to find |
| 511 | +which is not by itself a disastrous thing but makes the bug more difficult to
512 | 512 | find:
|
513 | 513 |
|
514 | 514 | \begin{verbatim}
|
|
529 | 529 | \item causes the regular file to be truncated to a size of precisely
|
530 | 530 | \emph{length} bytes.
|
531 | 531 | \item if the file was larger than \emph{length}, the extra data is lost
|
532 |
| -\item if the file previously was shorter, it is extended, and the extended part |
| 532 | +\item if the file was previously shorter, it is extended, and the extended part |
533 | 533 | reads as null bytes
|
534 | 534 | \end{itemize}
|
535 | 535 | \end{slide}
|
536 | 536 |
|
537 | 537 | \begin{itemize}
|
538 |
| -\item To truncate the file when opening it can be achieved via using the |
| 538 | +\item Truncating the file when opening it can be achieved via the |
539 | 539 | \texttt{O\_TRUNC} flag in \texttt{open}, see page \pageref{OPEN}.
|
540 | 540 | \end{itemize}
|
541 | 541 |
|
|
605 | 605 | \texttt{>>}.
|
606 | 606 | \item \label{REDIRECT} Another example of \texttt{dup} use will be provided when
|
607 | 607 | we start working with pipes. The first redirection example from the slide
|
608 |
| -(without \texttt{stderr}) is in \example{read/redirect.c}. The call |
609 |
| -\texttt{execl} in that example replaces the current process image with the |
610 |
| -program passed as the first argument. We got ahead of ourselves here though, we |
| 608 | +(without \texttt{stderr}) is in \example{read/redirect.c}. In that example, the |
| 609 | +\texttt{execl} call replaces the current process image with the |
| 610 | +program passed in the first argument. We got ahead of ourselves here though, we |
611 | 611 | will learn about the \texttt{exec} calls on page \pageref{EXEC}.
|
612 | 612 | \item To fully understand how redirection works it is good to draw the file
|
613 |
| -descriptor table for each step and where the slots point to. For example, for |
614 |
| -the \nth{2} example in the slide, we have the initial state, after |
| 613 | +descriptor table for each step and where the slots point to. In |
| 614 | +the \nth{2} example in the slide above, we have the initial state, after |
615 | 615 | \texttt{close(1)} and \texttt{open("out", ...)}, and the final state, as
|
616 | 616 | follows:
|
617 | 617 |
|
|
626 | 626 | \end{verbatim}
|
627 | 627 |
|
628 | 628 | \item You need to pay attention to the state of descriptors. The \nth{2} example
|
629 |
| -will not work if the descriptor 0 is already closed, as |
| 629 | +above will not work if the descriptor 0 is already closed, as |
630 | 630 | \texttt{open} returns 0 (the first available descriptor) and \texttt{dup} fails
|
631 | 631 | while trying to duplicate an already closed descriptor. Possible
|
632 | 632 | solutions:
|
|
765 | 765 | data itself, nor the filename, as the file data can be accessed through
|
766 | 766 | several different hard links and those hard links are in the data of directories.
|
767 | 767 | In other words, metadata is data about the actual file data.
|
768 |
| -\item Metadata can be read even when the process has not rights to read the file |
| 768 | +\item Metadata can be read even when the process has no rights to read the file |
769 | 769 | data.
|
770 | 770 | \item These functions do not provide file descriptor flags or flags from the
|
771 | 771 | system file table. These functions are about file information as stored on some
|
772 | 772 | mountable media.
|
773 | 773 | \item \texttt{st\_ctime} is not the creation time but the change time -- the
|
774 | 774 | last modification of the inode.
|
775 | 775 | \item The UNIX standard does not specify the ordering of the \texttt{struct stat}
|
776 |
| -members, neither it prohibits adding new. |
| 776 | +members, nor does it prohibit adding new ones. |
777 | 777 | \item \label{STAT} Example: \example{stat/stat.c}
|
778 | 778 | \item You can call \texttt{fstat} on file descriptors 0,1,2 as well. Unless
|
779 | 779 | redirected before, you will get information on the underlying terminal device
|
|