|
5 | 5 | \begin{slide} |
6 | 6 | \sltitle{File API} |
7 | 7 | \begin{itemize} |
8 | | -\item before working with a file, it must be first open via |
| 8 | +\item before working with a file, it must be first opened via |
9 | 9 | \funnm{open}() or \funnm{creat}() |
10 | 10 | \item open files are accessible via \emph{file descriptors}, numbered from 0. |
11 | 11 | More descriptors can share the same file opening (read/write mode, position). |
|
112 | 112 | current mask that can be changed via a shell command \texttt{umask} -- those |
113 | 113 | bits in \emph{mode}, also set in the process umask, are nullified. The |
114 | 114 | default umask value is typically (and historically) \texttt{022}. We recommend |
115 | | -you to always set it to \texttt{077} in your profile script. Never do that for |
| 115 | +that you always set it to \texttt{077} in your profile script. Never do that for |
116 | 116 | root though, otherwise you will end up with a system in an unsupported |
117 | 117 | configuration -- non-privileged users will not be able to run installed |
118 | 118 | software, what worked before may stop working, etc. |
119 | 119 | \item If the \emph{mode} argument is required and not specified, you get |
120 | | -whatever is on the stack. Both flags and the mode are stored in the system file |
| 120 | +whatever is on the stack. Both flags and mode are stored in the system file |
121 | 121 | table, see page \pageref{OPENFILETABLES}. |
122 | 122 | \item Macros for use with \emph{mode} can usually be found in the manual page |
123 | 123 | for \texttt{chmod(2)}, and you can find them also in the \texttt{stat.h} header |
|
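The effect of the umask on the \emph{mode} argument can be checked with a small sketch like the one below. The function name `mode_after_umask` and its parameters are hypothetical, and the sketch assumes a file system without default ACLs, which could alter the resulting permissions.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create `path` with the `requested` mode while the process umask is
 * `mask`, and return the permission bits the new file actually got,
 * or -1 on error. The temporary file is removed again.
 * (Hypothetical helper, not part of any library.)
 */
int
mode_after_umask(const char *path, mode_t requested, mode_t mask)
{
	struct stat st;
	mode_t old;
	int fd;

	assert(path != NULL);
	(void) unlink(path);		/* O_EXCL below needs a fresh name */
	old = umask(mask);		/* umask() returns the previous mask */
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
	(void) umask(old);		/* restore the caller's mask */
	if (fd == -1)
		return (-1);
	if (fstat(fd, &st) == -1) {
		(void) close(fd);
		(void) unlink(path);
		return (-1);
	}
	(void) close(fd);
	(void) unlink(path);
	/* Bits set in the umask are cleared from the requested mode. */
	return ((int)(st.st_mode & 0777));
}
```

For example, requesting `0666` under a umask of `022` should leave the bits `0644`, while a umask of `077` should leave `0600`.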
142 | 142 | as historically, implementations used 0 for the read-only flag. The standard |
143 | 143 | defines that only one of those three flags may be used. |
144 | 144 | \item It is possible to open and create a file for writing so that writing is |
145 | | -disallowed by its mode. It will work for that file opening but any other file |
146 | | -opening for writing will fail. |
| 145 | +disallowed by its mode. It will work for that file opening but any other |
| 146 | +attempt to open the file for writing will fail. |
147 | 147 | \item You need write permission to use \texttt{O\_TRUNC}. |
148 | 148 | \item The behavior of \texttt{O\_EXCL} without using \texttt{O\_CREAT} at the |
149 | 149 | same time is undefined. |
|
179 | 179 | \label{CREAT} |
180 | 180 |
|
181 | 181 | \begin{itemize} |
182 | | -\item The \texttt{open} call allows to open a regular file, a device, or a named |
| 182 | +\item The \texttt{open} call allows opening of a regular file, device, or named |
183 | 183 | pipe. However, it (and \texttt{creat} as well) can only create a regular file, |
184 | 184 | so you need the other two calls for non-regular files. |
185 | | -\item The test of a file existence using the flag \texttt{O\_EXCL} and its |
| 185 | +\item The test of a file's existence using the flag \texttt{O\_EXCL} and its |
186 | 186 | subsequent creation if it did not exist, is an atomic operation. You can use |
187 | 187 | that for lock files but only with the \texttt{open} call, not \texttt{creat}. |
188 | 188 | \item You need extra privileges to create device special files (e.g. to be a |
|
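The atomic test-and-create behavior of \texttt{O\_EXCL} used together with \texttt{O\_CREAT} can be sketched as a simple lock file. The names `try_lock` and `release_lock` are hypothetical, and the sketch ignores stale locks left behind by crashed processes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to take the lock by creating the lock file exclusively.
 * Returns 0 if we created the file (lock taken), 1 if the file already
 * existed (lock held by someone else), and -1 on any other error.
 */
int
try_lock(const char *lockfile)
{
	int fd;

	assert(lockfile != NULL);
	/* The existence test and the creation are one atomic step. */
	fd = open(lockfile, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd == -1)
		return (errno == EEXIST ? 1 : -1);
	(void) close(fd);
	return (0);
}

/* Release the lock by removing the lock file; 0 on success. */
int
release_lock(const char *lockfile)
{
	assert(lockfile != NULL);
	return (unlink(lockfile));
}
```

A second `try_lock` on the same name returns 1 until `release_lock` is called. Note that \texttt{creat} cannot be used this way as it has no means to pass \texttt{O\_EXCL}.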
223 | 223 | \setlength{\itemsep}{0.8\itemsep} |
224 | 224 | \item For any Unix system, a file is just a sequence of bytes without any inner |
225 | 225 | structure. |
226 | | -\item \emsl{Behavior of \texttt{read} and \texttt{write} depends on the type of |
| 226 | +\item The \emsl{behavior of \texttt{read} and \texttt{write} depends on the type of |
227 | 227 | the file} (regular, device, pipe, or socket) and whether the file is in a |
228 | 228 | blocking or non-blocking mode (flag \texttt{O\_NONBLOCK} on file opening, see |
229 | 229 | page \pageref{O_NONBLOCK}). |
|
236 | 236 | \texttt{read} will block until some data becomes available, a non-blocking |
237 | 237 | \texttt{read} returns -1 and sets \texttt{errno} to \texttt{EAGAIN}. |
238 | 238 | \item \texttt{write} returns a non-zero number of bytes less than \emph{nbyte} |
239 | | -if less then \emph{nbyte} bytes can fit the file (e.g. disk full), if the call |
| 239 | +if less than \emph{nbyte} bytes can fit into the file (e.g. disk full), if the call |
240 | 240 | was interrupted by a signal, or if \verb#O_NONBLOCK# was set and only part of |
241 | 241 | the data fits into a pipe, socket, or a device; without \verb#O_NONBLOCK# |
242 | 242 | the call will block until all the data can be written. If nothing can be |
|
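Because \texttt{write} may return fewer bytes than requested, code that needs all the data written usually loops. The following is a minimal sketch for a blocking descriptor; the name `write_all` is hypothetical, and a non-blocking descriptor would additionally need \texttt{EAGAIN} handling, which is left out here.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Keep calling write() until all `nbyte` bytes are written.
 * Returns nbyte on success and -1 on error (errno is set by write()).
 */
ssize_t
write_all(int fd, const void *buf, size_t nbyte)
{
	const char *p = buf;
	size_t left = nbyte;

	assert(buf != NULL || nbyte == 0);
	while (left > 0) {
		ssize_t n = write(fd, p, left);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted before any data was written */
			return (-1);
		}
		/* A short write: advance and retry with the rest. */
		p += n;
		left -= (size_t)n;
	}
	return ((ssize_t)nbyte);
}
```

The loop treats a short write as progress rather than as a failure, and an error such as a full disk shows up on the next iteration as -1.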
268 | 268 | \begin{itemize} |
269 | 269 | \item releases \texttt{fildes}, if it was the last descriptor for a file |
270 | 270 | opening, closes the file |
271 | | -\item if number of links is 0, the file data is released |
272 | | -\item if the last pipe descriptor is closed, remaining data is lost |
273 | | -\item on a process termination, implicit \texttt{close} is called on all |
| 271 | +\item if the number of links is 0, the file data is released |
| 272 | +\item if the last pipe descriptor is closed, any remaining data is lost |
| 273 | +\item on process termination, an implicit \texttt{close} is called on all |
274 | 274 | descriptors |
275 | 275 | \end{itemize} |
276 | 276 | \end{slide} |
|
391 | 391 | \item When writing to a pipe without a consumer (i.e. the producer opened the |
392 | 392 | pipe when there was at least one existing consumer), the kernel will send the |
393 | 393 | producer a signal \texttt{SIGPIPE} (``broken pipe''). See the following |
394 | | -example. For simplicity, we are using an unnamed pipe but that does not matter |
395 | | -as it would have behaved in the same manner. The \texttt{date(1)} command never |
| 394 | +example. For simplicity, we are using an unnamed pipe but a named pipe |
| 395 | +would behave in the same manner. The \texttt{date(1)} command never |
396 | 396 | reads anything from its standard input so it is guaranteed that the producer, |
397 | 397 | \texttt{dd(1)}, will be writing to a pipe without a consumer. If a process is |
398 | 398 | killed by a signal, the shell provides a signal number added to 128 as its |
|
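The same situation can be reproduced in C. The sketch below, with the hypothetical function name `signal_from_broken_pipe`, forks a producer that writes to a pipe whose read end has already been closed everywhere, and reports which signal terminated it. It assumes \texttt{SIGPIPE} is not blocked in the calling process.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the number of the signal that killed the producer child,
 * 0 if the child exited normally, and -1 on a setup error.
 */
int
signal_from_broken_pipe(void)
{
	int fd[2], status;
	pid_t pid;

	if (pipe(fd) == -1)
		return (-1);
	/* Close the read end before forking so that no consumer exists. */
	(void) close(fd[0]);
	if ((pid = fork()) == -1) {
		(void) close(fd[1]);
		return (-1);
	}
	if (pid == 0) {
		/* Producer: make sure the default action (terminate) applies. */
		(void) signal(SIGPIPE, SIG_DFL);
		if (write(fd[1], "x", 1) == -1)
			_exit(1);	/* reached only if no signal was delivered */
		_exit(0);
	}
	(void) close(fd[1]);
	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	return (WIFSIGNALED(status) ? WTERMSIG(status) : 0);
}
```

The returned value is expected to be \texttt{SIGPIPE}, which is the number the shell adds to 128 when it reports the exit status of such a producer.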
409 | 409 |
|
410 | 410 | \item When opening a pipe for writing only with \texttt{O\_NONBLOCK} and without |
411 | 411 | an existing consumer, the call returns -1 and \texttt{errno} is set to |
412 | | -\texttt{ENXIO}. This asymmetry to opening a pipe for reading in a non-blocking |
| 412 | +\texttt{ENXIO}. This asymmetry with respect to opening a pipe for reading in non-blocking |
413 | 413 | mode is due to the fact that it is not desirable to have data in a pipe that may |
414 | 414 | not be read in a short period of time. The Unix system does not allow for |
415 | | -storing pipe data for arbitrary length of time. Without the |
| 415 | +storing pipe data for an arbitrary length of time. Without the |
416 | 416 | \texttt{O\_NONBLOCK} flag, the process will block while waiting for a consumer. |
417 | | -By asymmetry we mean that the system does not mind to keep consumers without |
| 417 | +By asymmetry, we mean that the system allows consumers without |
418 | 418 | producers but it tries to avoid writers without existing readers. |
419 | 419 | \item If you want to create a process that sits on a named pipe and processes |
420 | 420 | data from producers, you need to open it with the flag \texttt{O\_RDWR} even |
421 | | -that you do not intend to write it. If you do not use the flag, you might end |
422 | | -up with \texttt{read} returning 0 after all producers, perhaps temporarily only, |
| 421 | +if you do not intend to write to it. If you do not use the flag, you might end |
| 422 | +up with \texttt{read} returning 0 after all producers, perhaps only temporarily, |
423 | 423 | disappear, which could be solved by busy waiting. A much better solution would |
424 | 424 | be to use the \texttt{select} call, see page \pageref{SELECT}. |
425 | 425 | \item Writing data of length \texttt{PIPE\_BUF} bytes or less |
|
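The asymmetry between non-blocking opens for writing and for reading can be observed with a sketch like the following. The helper name `fifo_open_errno` is hypothetical; it creates a fresh named pipe, so no other process has it open.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a fresh named pipe at `fifo`, open it with `flags`, and return
 * 0 if the open succeeded or the errno value if it failed.
 * Returns -1 if the pipe could not be created. The pipe is removed again.
 */
int
fifo_open_errno(const char *fifo, int flags)
{
	int fd, err;

	assert(fifo != NULL);
	(void) unlink(fifo);
	if (mkfifo(fifo, 0600) == -1)
		return (-1);
	fd = open(fifo, flags);
	err = (fd == -1) ? errno : 0;	/* save errno before other calls */
	if (fd != -1)
		(void) close(fd);
	(void) unlink(fifo);
	return (err);
}
```

With `O_WRONLY | O_NONBLOCK` the expected result is \texttt{ENXIO} as there is no consumer, while `O_RDONLY | O_NONBLOCK` is expected to succeed even though there is no producer.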
486 | 486 |
|
487 | 487 | \begin{itemize} |
488 | 488 | \item \label{LSEEK} The first byte is at position 0. If it makes sense, you may |
489 | | -use a negative number for setting \emph{offset}. Example: |
| 489 | +use a negative number for setting the \emph{offset}. Example: |
490 | 490 | \example{read/lseek.c}. |
491 | | -\item If it legal to move beyond the end of the file. If data is written there, |
| 491 | +\item It is legal to move beyond the end of the file. If data is written there, |
492 | 492 | the file size will be set accordingly and the ``holes'' will be read as zeros. |
493 | 493 | Note that just changing the file position will not increase the file size. |
494 | 494 | \item You can get the file size via \texttt{lseek(fildes, 0, SEEK\_END)}. |
495 | 495 | \item The most common operations with \texttt{lseek} are three: setting the |
496 | 496 | position from the beginning of a file, setting the position to the end of a |
497 | 497 | file, and getting the current file position (0 with \texttt{SEEK\_CUR}). |
498 | 498 | \item There is no I/O involved when calling \texttt{lseek}. |
499 | | -\item You can obviously use \texttt{lseek} not only for subsequent calls |
| 499 | +\item You can obviously use \texttt{lseek} not only for subsequent calls to |
500 | 500 | \texttt{read} and \texttt{write} but also for another call to \texttt{lseek}. |
501 | 501 | \item \label{BIG_FILE} Beware of files with holes as it may lead to problems |
502 | 502 | with backing up the data. Example: \example{read/big-file.c} demonstrates that |
503 | | -moving a sparse file may end up in the actual storage data occupation increase. |
504 | | -It greatly depends on the system you run, what an archiving utility is used, and |
505 | | -their versions. Some utilities provide means to preserve holes, for example, |
| 503 | +moving a sparse file may increase the storage space it actually occupies. |
| 504 | +It greatly depends on the system you run, what archiving utility is used, and |
| 505 | +their versions. Some utilities provide the means to preserve holes, for example, |
506 | 506 | \texttt{dd} with \texttt{conv=sparse}, \texttt{tar} with \texttt{-S}, |
507 | 507 | \texttt{rsync} with \texttt{--sparse}, etc. |
508 | 508 | \item Beware of confusing the parameters. The second line below looks OK but |
509 | 509 | the arguments are in reversed order. What is more, \texttt{SEEK\_SET} is |
510 | 510 | defined as 0 and \texttt{SEEK\_CUR} is 1, so the file position is not moved |
511 | | -which is not by itself a disastrous thing, and makes it more difficult to find |
| 511 | +which is not by itself a disastrous thing, but that makes it more difficult to find |
512 | 512 | it: |
513 | 513 |
|
514 | 514 | \begin{verbatim} |
|
529 | 529 | \item causes the regular file to be truncated to a size of precisely |
530 | 530 | \emph{length} bytes. |
531 | 531 | \item if the file was larger than \emph{length}, the extra data is lost |
532 | | -\item if the file previously was shorter, it is extended, and the extended part |
| 532 | +\item if the file was previously shorter, it is extended, and the extended part |
533 | 533 | reads as null bytes |
534 | 534 | \end{itemize} |
535 | 535 | \end{slide} |
536 | 536 |
|
537 | 537 | \begin{itemize} |
538 | | -\item To truncate the file when opening it can be achieved via using the |
| 538 | +\item Truncating the file when opening it can be achieved via the |
539 | 539 | \texttt{O\_TRUNC} flag in \texttt{open}, see page \pageref{OPEN}. |
540 | 540 | \end{itemize} |
541 | 541 |
|
|
605 | 605 | \texttt{>>}. |
606 | 606 | \item \label{REDIRECT} Another example of \texttt{dup} use will be provided when |
607 | 607 | we start working with pipes. The first redirection example from the slide |
608 | | -(without \texttt{stderr}) is in \example{read/redirect.c}. The call |
609 | | -\texttt{execl} in that example replaces the current process image with the |
610 | | -program passed as the first argument. We got ahead of ourselves here though, we |
| 608 | +(without \texttt{stderr}) is in \example{read/redirect.c}. In that example, the |
| 609 | +\texttt{execl} call replaces the current process image with the |
| 610 | +program passed in the first argument. We got ahead of ourselves here though; we |
611 | 611 | will learn about the \texttt{exec} calls on page \pageref{EXEC}. |
612 | 612 | \item To fully understand how redirection works it is good to draw the file |
613 | | -descriptor table for each step and where the slots point to. For example, for |
614 | | -the \nth{2} example in the slide, we have the initial state, after |
| 613 | +descriptor table for each step and where the slots point to. In |
| 614 | +the \nth{2} example in the slide above, we have the initial state, after |
615 | 615 | \texttt{close(1)} and \texttt{open("out", ...)}, and the final state, as |
616 | 616 | follows: |
617 | 617 |
|
|
626 | 626 | \end{verbatim} |
627 | 627 |
|
628 | 628 | \item You need to pay attention to the state of descriptors. The \nth{2} example |
629 | | -will not work if the descriptor 0 is already closed, as |
| 629 | +above will not work if the descriptor 0 is already closed, as |
630 | 630 | \texttt{open} returns 0 (the first available descriptor) and \texttt{dup} fails |
631 | 631 | while trying to duplicate an already closed descriptor. Possible |
632 | 632 | solutions: |
|
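As an illustration of redirecting in a way that does not depend on which descriptors happen to be closed, here is a minimal sketch that assumes \texttt{dup2} is acceptable. The function name `redirect_to_file` is hypothetical and the sketch is not taken from \texttt{read/redirect.c}.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor `target` refer to the file `path`, which is created
 * or truncated. Returns 0 on success and -1 on error.
 */
int
redirect_to_file(int target, const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
	if (fd == -1)
		return (-1);
	if (fd != target) {
		/* dup2() closes `target` first if it is open. */
		if (dup2(fd, target) == -1) {
			(void) close(fd);
			return (-1);
		}
		(void) close(fd);	/* only `target` should stay open */
	}
	/* If open() already returned `target`, there is nothing to do. */
	return (0);
}
```

Calling `redirect_to_file(1, "out")` would redirect the standard output. The check for `fd != target` covers the case where the lowest free descriptor is the target itself.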
765 | 765 | data itself, nor the filename, as the file data can be accessed through |
766 | 766 | several different hard links and those hard links are in the data of directories. |
767 | 767 | In other words, metadata is data about the actual file data. |
768 | | -\item Metadata can be read even when the process has not rights to read the file |
| 768 | +\item Metadata can be read even when the process has no rights to read the file |
769 | 769 | data. |
770 | 770 | \item These functions do not provide file descriptor flags or flags from the |
771 | 771 | system file table. These functions are about file information as stored on some |
772 | 772 | mountable media. |
773 | 773 | \item \texttt{st\_ctime} is not the creation time but the change time -- the |
774 | 774 | last modification of the inode. |
775 | 775 | \item The UNIX norm does not specify the ordering of the \texttt{struct stat} |
776 | | -members, neither it prohibits adding new. |
| 776 | +members, nor does it prohibit adding new ones. |
777 | 777 | \item \label{STAT} Example: \example{stat/stat.c} |
778 | 778 | \item You can call \texttt{fstat} on file descriptors 0,1,2 as well. Unless |
779 | 779 | redirected before, you will get information on the underlying terminal device |
|