|
22 | 22 | systems), \texttt{APFS} (macOS and iOS since 2017), etc. A filesystem can be |
23 | 23 | either used on local or remote storage, and in case of a remote storage network |
24 | 24 | filesystem protocols like \texttt{NFS} or \texttt{AFS} are used to access the |
25 | | -data. Note that these network filesystems do not define the filesystem |
| 25 | +data. Note that these network file systems do not define the filesystem |
26 | 26 | structure itself, they only provide for accessing existing filesystems remotely. |
27 | 27 | Each filesystem also has its limits, a largest file size or the maximum size of |
28 | 28 | the filesystem itself, for example. |
29 | 29 | \item Unix does not have A, B, C, D\dots disks as Windows and other systems. |
30 | 30 | All filesystems are mounted to a single directory hierarchy on any Unix system, |
31 | 31 | as shown on the slide where you can see the root filesystem and three other |
32 | | -fileystems, mounted on \texttt{/usr}, \texttt{/dev/tty}, and \texttt{/home} |
| 32 | +file systems, mounted on \texttt{/usr}, \texttt{/dev/tty}, and \texttt{/home} |
33 | 33 | directories. You could also further mount other filesystems on directories that |
34 | 34 | are part of these non-root filesystems. |
35 | 35 | \item Each filesystem mounted to the common hierarchy may be formatted using a |
|
41 | 41 | filesystems are later mounted via the \texttt{mount} command, usually from |
42 | 42 | specific startup services based on the system you use. The startup services |
43 | 43 | sometimes use file \texttt{/etc/fstab} as a source of information about what |
44 | | -filesystems to mount. You can also use \texttt{mount} manually. To unmount a |
45 | | -filesystem, the \texttt{umount} command is used. |
| 44 | +filesystems to mount. You can also use \texttt{mount} manually. |
| 45 | +To ifdef([[[NOSPELLCHECK]]], [[[unmount]]]) a filesystem, the \texttt{umount} |
| 46 | +command is used. |
46 | 47 | \item Some systems also provide for auto mounting where filesystems are mounted |
47 | 48 | during the first access attempt, and may be automatically unmounted after a |
48 | 49 | period of inactivity. Such functionality is usually called an |
|
64 | 65 | \texttt{/tmp} & public directory for temporary files\\ |
65 | 66 | \texttt{/home} & root of home directories\\ |
66 | 67 | \texttt{/var/adm} & administrative files (not on BSD) \\ |
67 | | -\texttt{/usr/in{}clude} & header files for C\\ |
| 68 | +\texttt{/usr/include} & header files for C\\ |
68 | 69 | \texttt{/usr/local} & locally installed software\\ |
69 | 70 | \texttt{/usr/man} & man pages\\ |
70 | 71 | \texttt{/var/spool} & spool (mail, printing,...) |
|
82 | 83 | It contains binaries not needed by a typical user.
83 | 84 | \item The \texttt{/usr} directory tree contains files that are not changing |
84 | 85 | while the system is running and are not dependent on given machine. |
85 | | -This property should make it elegible for read-only sharing. On your own |
| 86 | +This property should make it eligible for read-only sharing. On your own |
86 | 87 | machine it will be typically read-write, though. |
87 | 88 | \item The \texttt{/lib} directory typically contains libraries needed for |
88 | 89 | programs from the root file system. If all libraries were in \texttt{/usr/lib} |
|
96 | 97 | system is running and are specific for given machine. |
97 | 98 | \item There can be differences in directory layout found between installations |
98 | 99 | of the same operating system. |
99 | | -\item The \texttt{hier(7)} manual page on FreeBSD and Linuxu describes directory |
| 100 | +\item The \texttt{hier(7)} manual page on FreeBSD and Linux describes directory |
100 | 101 | hierarchy on these systems. Solaris uses \texttt{filesystem(5)}. |
101 | 102 | \end{itemize} |
102 | 103 |
|
|
139 | 140 | refreshed by a shell script (\texttt{MAKEDEV}) or by hand whenever hardware |
140 | 141 | configuration changed. Today most of the systems populate the directory on the |
141 | 142 | fly as kernel detects addition or removal of hardware components |
142 | | -(see \emph{devfs} on page \pageref{DEVFS}). |
| 143 | +(see \emph{\texttt{devfs}} on page \pageref{DEVFS}). |
143 | 144 | \item Immediate write to disk can be forced by using the \texttt{O\_DIRECT}
144 | 145 | flag, set via the \texttt{fcntl} system call.
145 | 146 | \end{itemize} |
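The last item above is about pushing data past the buffer cache. As a minimal sketch of a related but different technique, the following C function uses the POSIX \texttt{fsync} call rather than \texttt{O\_DIRECT}: the write still goes through the cache, and \texttt{fsync} then blocks until the kernel reports the data as written to the device. The function name \texttt{write\_durably} and the path used in the usage note are assumptions made for illustration only.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write msg to path and wait until the kernel reports the data as
 * stored on the underlying device. Returns 0 on success, -1 on failure.
 * (Hypothetical helper, not part of any system API.)
 */
int write_durably(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = write(fd, msg, len);   /* data lands in the buffer cache */
    if (n < 0 || (size_t)n != len) {
        close(fd);
        return -1;
    }

    if (fsync(fd) == -1) {             /* flush cached data for this file */
        close(fd);
        return -1;
    }

    return close(fd);
}
```

A call such as \texttt{write\_durably("/tmp/demo.txt", "hello")} returns 0 once the data has been flushed; without the \texttt{fsync} step the data could stay in the cache until the periodic flush.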
|
153 | 154 | \begin{itemize2} |
154 | 155 | \item \emsl{partition} -- part of disk, one disk can have multiple |
155 | 156 | partitions |
156 | | - \item \emsl{logickém volume} -- can be used to connect multiple partitions |
| 157 | + \item \emsl{logical volume} -- can be used to connect multiple partitions |
157 | 158 | (that can reside on distinct disks) into one file system. |
158 | 159 | \end{itemize2} |
159 | 160 | \item more choices: striping, mirroring, RAID |
|
211 | 212 | \end{itemize} |
212 | 213 | \item \emph{boot block} -- for OS boot loader |
213 | 214 | \item \emph{superblock} -- basic information about the file system: number of |
214 | | -blocks for i-nodes, number of file system block, list of feee blocks (continues |
| 215 | +blocks for i-nodes, number of file system blocks, list of free blocks (continues
215 | 216 | in the free block area), list of free i-nodes (after it is exhausted the i-node |
216 | 217 | table is searched), locks for the lists of free data blocks and i-nodes, |
217 | 218 | modification flag (used for checking whether the file system was correctly |
|
280 | 281 | \item can be created only within one file system |
281 | 282 | \item not possible to create for directories |
282 | 283 | \end{itemize} |
283 | | -\item [symbolic link (symlink, softlink)]~ |
| 284 | +\item [symbolic link (symlink, ifdef([[[NOSPELLCHECK]]], [[[softlink]]]))]~ |
284 | 285 |
|
285 | 286 | \begin{itemize} |
286 | 287 | \item only reference to the real path of file of different type |
287 | 288 | (it is marked as '\texttt{l}' in the \texttt{ls -l} output), i.e. the type |
288 | 289 | of a symbolic link differs from that of an ordinary file. Its data contain a simple string
289 | 290 | -- name of the path, either relative or absolute. |
290 | | - \item different behavior for original and symlink (e.g. upon unlink) |
| 291 | + \item different behavior for original and symlink (e.g. upon deletion) |
291 | 292 | \item watch out for relative and absolute paths when moving symbolic link |
292 | 293 | \item can point to directory or non-existing file |
293 | 294 | \end{itemize} |
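The properties listed above (a symbolic link has its own file type, and it may point to a non-existing file) can be observed with a short C sketch. It assumes only the POSIX calls \texttt{symlink}, \texttt{lstat}, \texttt{stat} and \texttt{unlink}; the function names \texttt{is\_symlink} and \texttt{dangling\_after\_unlink} are hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if path itself is a symbolic link, 0 if not, -1 on error. */
int is_symlink(const char *path)
{
    struct stat sb;

    if (lstat(path, &sb) == -1)    /* lstat does not follow the link */
        return -1;
    return S_ISLNK(sb.st_mode) ? 1 : 0;
}

/*
 * Creates the file `target`, a symlink `link` pointing to it, and then
 * removes the target. Returns 1 if the link survives as a dangling
 * symlink, 0 otherwise, -1 if the setup fails.
 */
int dangling_after_unlink(const char *target, const char *link)
{
    struct stat sb;

    assert(target != NULL && link != NULL);
    unlink(link);                  /* clean up leftovers from earlier runs */
    unlink(target);

    int fd = open(target, O_WRONLY | O_CREAT, 0644);
    if (fd == -1)
        return -1;
    close(fd);

    if (symlink(target, link) == -1)
        return -1;
    if (unlink(target) == -1)      /* removes the original, not the link */
        return -1;

    /* The link still exists, but following it with stat() now fails. */
    int dangling = (is_symlink(link) == 1 && stat(link, &sb) == -1);

    unlink(link);
    return dangling;
}
```

Passing absolute paths for both arguments avoids the relative-path pitfall mentioned in the list, since a relative link target is resolved from the directory containing the link.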
|
319 | 320 | \item bitmaps for free i-nodes and data blocks |
320 | 321 | \item data blocks |
321 | 322 | \end{itemize} |
322 | | -\item bloks of size 4 to ¾ 8 kB, smaller parts stored into block fragments |
| 323 | +\item blocks of size 4 to 8 kilobytes, smaller parts stored into block fragments |
323 | 324 | \item file names up to 255 characters |
324 | 325 | \end{itemize} |
325 | 326 | \end{slide} |
|
330 | 331 | \item more file systems: UFS2, Ext3, ReiserFS, XFS, ZFS etc. |
331 | 332 | \item UFS was still 32-bit, that reflected on maximum file length and maximum |
332 | 333 | file system size. The 32-bit notation means that i-node numbers are represented |
333 | | -as 32 bit integers. This gives the theoretical filesystem limits. |
334 | | -\item journalling (XFS, Ext3, ReiserFS) -- an effort to reduce the risk of data |
| 334 | +as 32 bit integers. This gives the theoretical file system limits. |
| 335 | +\item journaling (XFS, Ext3, ReiserFS) -- an effort to reduce the risk of data |
335 | 336 | loss in case of crash and also to speed up the recovery after crash. |
336 | 337 | \item ZFS -- modern 128-bit file system developed in Sun Microsystems, since |
337 | | -Solarisu 10; also present in FreeBSD since version 7. |
| 338 | +Solaris 10; also present in FreeBSD since version 7. |
338 | 339 | \end{itemize} |
339 | 340 |
|
340 | 341 | %%%%% |
|
378 | 379 | \end{slide} |
379 | 380 |
|
380 | 381 | \begin{itemize} |
381 | | -\item The \emph{FFS} filesystem introduced in 4.2BSD was historically the second |
382 | | -unix filesystem. Some manufacturers of unix system started to prefer its |
383 | | -considering its better performance and new features, others remained with |
384 | | -\emph{s5fs} from compatibility reasons. This deepened the problem with already |
385 | | -insufficient interoperability between different unix systems. For some |
386 | | -applications neither of these filesystems was enough. Gradually the need to work |
387 | | -with non-unix systems started appearing, e.g. with \emph{FAT}. With growing |
| 382 | +\item The \emph{FFS} file system introduced in 4.2BSD was historically the |
| 383 | +second unix file system. Some manufacturers of unix systems started to prefer it
| 384 | +considering its better performance and new features, others remained with
| 385 | +ifdef([[[NOSPELLCHECK]]], [[[\emph{s5fs}]]]) for compatibility reasons. This
| 386 | +deepened the problem with already insufficient interoperability between |
| 387 | +different unix systems. For some |
| 388 | +applications neither of these file systems was enough. Gradually the need to |
| 389 | +work with non-unix systems started appearing, e.g. with \emph{FAT}. With growing |
388 | 390 | popularity of computer networks the demand for file sharing between systems |
389 | | -started to increase. This lead to the inception of distributed filesystems |
| 391 | +started to increase. This led to the inception of distributed file systems
390 | 392 | -- e.g. \emph{NFS} (Network File System). |
391 | 393 | \item Given the above described situation it was just a matter of time when |
392 | | -fundamental changes in the filesystem infrastructure will happen to support |
393 | | -multiple filesystem types simultaneously. Several different implementations from |
394 | | -multiple manufacturers were made; in the end the de facto standard became |
| 394 | +fundamental changes in the file system infrastructure will happen to support |
| 395 | +multiple file system types simultaneously. Several different implementations |
| 396 | +from multiple manufacturers were made; in the end the ifdef([[[NOSPELLCHECK]]], |
| 397 | +[[[de facto]]]) standard became |
395 | 398 | the \emph{VFS/vnode} architecture from Sun Microsystems. Today practically all |
396 | 399 | unix and u{}nix-like systems support VFS, even though with often non-compatible
397 | 400 | changes. VFS appeared for the first time in 1985 in Solaris 2.0; |
|
405 | 408 | file operations}. This set is referenced by each vnode corresponding to given
406 | 409 | file system. \emsl{This set of functions defines the vnode interface.} When e.g.
407 | 410 | \texttt{open} is called, the kernel will call the corresponding implementation |
408 | | -depending on file system type (e.g. from the \emph{ext2fs} module). |
| 411 | +depending on file system type (e.g. from the ifdef([[[NOSPELLCHECK]]], |
| 412 | +[[[\emph{ext2fs}]]]) module). |
409 | 413 | Implementation dependent part of vnode structure is accessible only from |
410 | | -functions of given filesystem; for kernel it is opaque. Next slide will shown |
411 | | -another set of functions that works with filesystems themselves. |
| 414 | +functions of given file system; for kernel it is opaque. Next slide will show
| 415 | +another set of functions that work with file systems themselves.
412 | 416 | \emsl{This set defines VFS interface.} |
413 | 417 | These \emsl{two sets together} constitute the vnode/VFS interface, generally |
414 | 418 | referred to as VFS. |
415 | 419 | \item For special file types the situation is a bit more complicated, in SVR4 |
416 | | -the \texttt{file} structure points to \emph{snode} |
| 420 | +the \texttt{file} structure points to \texttt{snode} |
417 | 421 | (\emph{shadow-special-vnode}), that defines operations with a device (using the |
418 | | -\emph{spec} filesystem) and using the \texttt{s\_realvp} pointer it refers to |
| 422 | +\emph{spec} file system) and using the \texttt{s\_realvp} pointer it refers to |
419 | 423 | real vnode for the operations with the special file; this file is necessary for |
420 | 424 | example for checking file access rights. Each device can have multiple special |
421 | | -files, hence more snodes and corresponding real vnodes. All such snodes for one |
422 | | -device have the \texttt{s\_commonvp} pointer to one common snode, this is |
423 | | -however not captured in the picture. When opening a special file, an item |
424 | | -corresponding to the special file is searched in hash table of snodes of opened |
425 | | -devices according to major and minor device number. If the snode is not found, |
426 | | -new one is created. This snode will be then used for operations with the device. |
427 | | -More in [Vahalia]. |
| 425 | +files, hence more \texttt{snodes} and corresponding real vnodes. All such |
| 426 | +\texttt{snodes} for one device have the \texttt{s\_commonvp} pointer to one |
| 427 | +common \texttt{snode}, this is however not captured in the picture. When opening |
| 428 | +a special file, an item corresponding to the special file is searched in hash |
| 429 | +table of \texttt{snodes} of opened devices according to major and minor device |
| 430 | +number. If the \texttt{snode} is not found, new one is created. This |
| 431 | +\texttt{snode} will be then used for operations with the device. More in |
| 432 | +[Vahalia]. |
428 | 433 | \end{itemize} |
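The dispatch described above, where each vnode references a per-file-system set of functions and the kernel calls through it, can be sketched in C with a table of function pointers. This is a simplified illustration, not the SVR4 or Solaris definition: the structure members, the \texttt{vop\_} names, and the two stub file systems are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vnode;

/* Per-file-system set of operations: the "vnode interface" (hypothetical). */
struct vnodeops {
    int (*vop_open)(struct vnode *vp);
    const char *(*vop_fsname)(struct vnode *vp);
};

struct vnode {
    const struct vnodeops *v_op;  /* shared by all vnodes of one file system */
    void *v_data;                 /* implementation-dependent part, opaque to the kernel */
};

/* Two stub file system implementations. */
static int ufs_open(struct vnode *vp) { (void)vp; return 0; }
static const char *ufs_fsname(struct vnode *vp) { (void)vp; return "ufs"; }

static int ext2_open(struct vnode *vp) { (void)vp; return 0; }
static const char *ext2_fsname(struct vnode *vp) { (void)vp; return "ext2fs"; }

static const struct vnodeops ufs_vnodeops = { ufs_open, ufs_fsname };
static const struct vnodeops ext2_vnodeops = { ext2_open, ext2_fsname };

/* Example vnodes, one per file system type. */
struct vnode demo_ufs_vnode = { &ufs_vnodeops, NULL };
struct vnode demo_ext2_vnode = { &ext2_vnodeops, NULL };

/*
 * Generic layer: it never inspects v_data and only calls through v_op,
 * so the same code serves every file system type.
 */
int vfs_open(struct vnode *vp)
{
    assert(vp != NULL && vp->v_op != NULL);
    return vp->v_op->vop_open(vp);
}

const char *vfs_fsname(struct vnode *vp)
{
    assert(vp != NULL && vp->v_op != NULL);
    return vp->v_op->vop_fsname(vp);
}
```

Calling \texttt{vfs\_fsname} on the two example vnodes reaches different implementations through the same generic function, which is the point of keeping the file-system-specific part behind the operations table.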
429 | 434 |
|
430 | 435 | %%%%% |
|
468 | 473 | \end{slide} |
469 | 474 |
|
470 | 475 | \begin{itemize} |
471 | | -\item The \texttt{proc} and \texttt{user} strucures are created by the kernel |
| 476 | +\item The \texttt{proc} and \texttt{user} structures are created by the kernel |
472 | 477 | for each process. They contain service information about the process. |
473 | 478 | \item The \texttt{ufchunk} structure contains \texttt{NFPCHUNK} (usually 24) |
474 | 479 | \emph{file descriptors}, after it is full new \texttt{ufchunk} is allocated. |
|
513 | 518 | \item invalid superblock contents |
514 | 519 | \end{itemize} |
515 | 520 | \item the \texttt{fsck} operation is time consuming. |
516 | | -\item journaling (e.g. XFS in IRIXu, Ext3 in Linux) a transactional (ZFS) |
517 | | -filesystems do not need \texttt{fsck}. |
| 521 | +\item journaling (e.g. XFS in IRIX, Ext3 in Linux) and transactional (ZFS)
| 522 | +file systems do not need \texttt{fsck}. |
518 | 523 | \end{itemize} |
519 | 524 | \end{slide} |
520 | 525 |
|
|
524 | 529 | The buffers are saved periodically by special system process (or daemon). |
525 | 530 | \item The \texttt{fsck} command only checks metadata. If a data corruption |
526 | 531 | happened it cannot tell, let alone do something about it. |
527 | | -\item Exaple of \texttt{fsck} run on \emsl{unmounted} filesystem: |
| 532 | +\item Example of \texttt{fsck} run on \emsl{unmounted} filesystem: |
528 | 533 | \begin{verbatim} |
529 | | -toor@shewolf:~# fsck /dev/ad0a |
| 534 | +toor@shewolf:~# fsck /dev/ad0a |
530 | 535 | ** /dev/ad0a |
531 | 536 | ** Last Mounted on /mnt/flashcard |
532 | 537 | ** Phase 1 - Check Blocks and Sizes |
|
559 | 564 | \item solutions to metadata inconsistency problem: |
560 | 565 | \begin{itemize} |
561 | 566 | \setlength{\itemsep}{0ex} |
562 | | - \item \emph{journalling} -- one group of operations dependent on each other |
| 567 | + \item \emph{journaling} -- one group of operations dependent on each other |
563 | 568 | is written to a journal first; if a problem is encountered the journal can |
564 | 569 | be ``replayed'' |
565 | 570 | \item metadata blocks are written to non-volatile memory first |
|
572 | 577 | \end{slide} |
573 | 578 |
|
574 | 579 | \begin{itemize} |
575 | | -\item filesystem \emph{metadata} = inodes, directories, free block |
| 580 | +\item filesystem \emph{metadata} = i-nodes, directories, free block |
576 | 581 | maps |
577 | 582 | \item \emph{ext2} uses asynchronous metadata writes even by default and when |
578 | 583 | in synchronous mode it is much slower than UFS.
579 | 584 | \item Dependent operations are for example deleting an item from directory and |
580 | 585 | deleting disk inode. If the inode is deleted first and then the directory entry |
581 | | -then if there is outage between these two operations then inonsistency follows |
| 586 | +then if there is outage between these two operations then inconsistency follows |
582 | 587 | -- the link points to disk file that does not exist. It is not a problem to |
583 | 588 | avoid this when using synchronous metadata writes (we know when and what is being
584 | 589 | written, the ordering of writes is therefore under our control) however when |
585 | 590 | using the write-back method it is necessary to solve dependencies of the blocks |
586 | 591 | because with classic synchronization of cache buffers the kernel is not |
587 | 592 | interested in which block is written first.
588 | | -\item Often the block dependencies for a cycle. Soft updates can recognize sych |
| 593 | +\item Often the block dependencies form a cycle. Soft updates can recognize such a
589 | 594 | cycle and break it by performing \emph{roll-back} and after the write is done it |
590 | 595 | performs \emph{roll-forward}. |
591 | 596 | \item The soft updates performance is comparable to that of UFS with |
592 | 597 | asynchronous metadata writes. |
593 | | -\item Theoreticall soft updates guratantee that it is not necessary to use |
| 598 | +\item Theoretically soft updates guarantee that it is not necessary to use |
594 | 599 | \texttt{fsck} after the reboot, i.e. that the filesystem is in bootable state. |
595 | | -It is however necessary to use so called \emph{background fsck} for correcting |
596 | | -non-grave errors -- this is considered to be one of the big drawbacks of soft |
597 | | -updates, especially given how sizes of disks grow over time. Example of an error |
598 | | -that does not block booting would be a block that is marked as used however is |
599 | | -not used by any file. |
| 600 | +It is however necessary to use so called ifdef([[[NOSPELLCHECK]]], |
| 601 | +[[[\emph{background fsck}]]]) for correcting non-grave errors -- this is |
| 602 | +considered to be one of the big drawbacks of soft updates, especially given how |
| 603 | +sizes of disks grow over time. Example of an error that does not block booting |
| 604 | +would be a block that is marked as used however is not used by any file. |
600 | 605 | \item soft updates are not always recommended for the root filesystem. |
601 | 606 | The problem is that metadata loss on root file system (see the 30 second |
602 | 607 | period of writes) can be more dangerous than in \texttt{/usr}, \texttt{/home} |
|
625 | 630 | rename operation so that the roll-back is truly needed -- we could consider that |
626 | 631 | the inode did not really change and it is not necessary to decide whether to |
627 | 632 | write it or not; the write of directory reference without increasing reference |
628 | | -count in the inode could get us into situation which is descibed above. |
| 633 | +count in the inode could get us into situation which is described above. |
629 | 634 | \end{itemize} |
630 | 635 |
|
631 | 636 | \endinput |