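These are my own notes from reading OPERATING SYSTEM CONCEPTS (6th Edition). My progress was slow at first, until one day I came across the lecture slides (for the 7th edition of the OS book) that Professor 冼鏡光 had shared online, and studying OS became more and more fun.
The chapters in these notes follow the 6th edition, and the figures are all pasted from Professor 冼鏡光's slides. Many thanks to Professor 冼 for sharing so generously and giving me the chance to glimpse the subtleties of OS.
(If there is any copyright concern, please let me know.)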
11. File-System Interface
11.1 File Concept
- The file system consists of two distinct parts: a collection of files, each storing related data, and a directory structure, which organizes and provides information about all the files in the system. Some file systems have a third part, partitions, which are used to separate physically or logically large collections of directories.
- A file is a named collection of related information that is recorded on secondary storage.
- The OS abstracts from the physical properties of its storage devices to define a logical storage unit (the file).
- The OS maps this logical storage unit to the physical view of information storage.
- A file may have the following characteristics
- File Attributes
- File Operations
- File Types
- File Structures
- Internal File Structure
- An object file is a sequence of bytes organized into blocks understandable by the system’s linker.
- An executable file is a series of code sections that the loader can bring into memory and execute.
11.1.1 File Attributes
- File Name: the symbolic name is perhaps the only attribute kept in human-readable form.
- Identifier: a unique number assigned to each file for identification within the file system.
- File Type: some systems recognize various file types. Windows is a good example.
- Location: a pointer to a device and to the location of the file on that device.
- File Size: the current size of a file, and possibly the maximum allowed size.
- Protection: access-control information.
- Date, Time, Owner, etc.
11.1.2 File Operations
- A file can be considered as an abstract data type that has data and accompanying operations.
- Creating a file
- Writing a file
- Reading a file
- Repositioning within a file
- Deleting a file
- Truncating a file
- Other operations (e.g., appending a file, renaming a file)
- Open-file table: the OS keeps a small table containing information about all open files.
- The open operation takes a file name and searches the directory, copying the directory entry into the open-file table. The open system call will typically return a pointer to the entry in the open-file table. This pointer, not the actual file name, is used in all I/O operations, avoiding any further searching and simplifying the system-call interface.
- In a multi-user system such as UNIX, the OS uses two levels of internal tables:
- The per-process table tracks all files that a process has open. Each entry in the per-process table in turn points to a system-wide table.
- The system-wide table contains process-independent information, such as the location of the file on disk, access dates, and file size.
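- Several pieces of information are associated with an open file:
- File pointer: the system must track the last read-write location as a current-file-position pointer. This pointer is unique to each process operating on the file.
- File open count: the system must wait for the last process to close the file before removing the open-file table entry.
- Disk location of the file: the information needed to locate the file on disk is kept in memory so that it need not be read from disk for each operation.
- Access rights: each process opens a file in an access mode. This information is kept in the per-process table so that the OS can allow or deny subsequent I/O requests.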
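The two-level table arrangement can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real kernel lays out its tables: the class names (`SystemEntry`, `ProcessEntry`, `OpenFileTables`), the toy directory, and the file name `notes.txt` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SystemEntry:
    """Process-independent data, kept once per open file."""
    name: str
    disk_location: int
    size: int
    open_count: int = 0


@dataclass
class ProcessEntry:
    """Per-process data: current position and access mode."""
    system_entry: SystemEntry
    position: int = 0
    mode: str = "r"


class OpenFileTables:
    def __init__(self, directory):
        self.directory = directory   # name -> (disk_location, size)
        self.system_wide = {}        # name -> SystemEntry
        self.per_process = {}        # (pid, fd) -> ProcessEntry
        self.next_fd = {}            # pid -> next free descriptor

    def open(self, pid, name, mode="r"):
        entry = self.system_wide.get(name)
        if entry is None:
            # The directory is searched only on the first open.
            location, size = self.directory[name]
            entry = SystemEntry(name, location, size)
            self.system_wide[name] = entry
        entry.open_count += 1
        fd = self.next_fd.get(pid, 0)
        self.next_fd[pid] = fd + 1
        self.per_process[(pid, fd)] = ProcessEntry(entry, 0, mode)
        return fd  # later I/O uses this handle, not the file name

    def close(self, pid, fd):
        entry = self.per_process.pop((pid, fd)).system_entry
        entry.open_count -= 1
        if entry.open_count == 0:
            # Last close: the system-wide entry can be removed.
            del self.system_wide[entry.name]


tables = OpenFileTables({"notes.txt": (1200, 4096)})
fd_a = tables.open(pid=1, name="notes.txt")
fd_b = tables.open(pid=2, name="notes.txt", mode="w")
count_while_open = tables.system_wide["notes.txt"].open_count
tables.close(1, fd_a)
tables.close(2, fd_b)
```

Both processes share one system-wide entry, while each keeps its own file position and access mode.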
11.1.3 File Types
- A common technique for implementing file types is to include the type as part of the file name (the extension).
- UNIX uses a crude magic number stored at the beginning of some files to indicate roughly the type of the file. Not all files have magic numbers, so system features cannot be based solely on this type of information.
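As a rough illustration of the magic-number idea, the sketch below compares the first bytes of a file against a small table of well-known signatures. The table is deliberately tiny and the function name `guess_type` is made up for this example. Real tools use far larger databases.

```python
# A few well-known leading byte sequences (not an exhaustive list).
MAGIC_NUMBERS = [
    (b"\x7fELF", "ELF executable or object file"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"\x1f\x8b", "gzip compressed data"),
    (b"#!", "script with an interpreter line"),
]


def guess_type(data: bytes) -> str:
    """Return a rough type for the given leading bytes of a file."""
    for magic, description in MAGIC_NUMBERS:
        if data.startswith(magic):
            return description
    # Many files carry no magic number at all.
    return "unknown"


print(guess_type(b"\x7fELF\x02\x01\x01"))
print(guess_type(b"plain text has no signature"))
```

The "unknown" branch is the reason system features cannot depend on magic numbers alone.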
11.1.4 File Structure
- File types also may be used to indicate the internal structure of the file.
- Some systems support specific file types that have special file structures.
- For example, files that contain binary executables.
- An OS becomes more complex when more file types (i.e., file structures) are supported.
- In general, the number of supported file types is kept to a minimum.
- UNIX considers each file to be a sequence of 8-bit bytes; no interpretation of these bits is made by the OS. Each application program must include its own code to interpret an input file into the appropriate structure.
11.1.5 Internal File Structure
- The UNIX OS defines all files to be simply a stream of bytes. Each byte is individually addressable by its offset from the beginning (or end) of the file. In this case, the logical record is 1 byte. The file system automatically packs and unpacks bytes into physical disk blocks as necessary.
- Whether the packing is done by the application program or by the OS, the file may be considered to be a sequence of blocks. All the basic I/O functions operate in terms of blocks.
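The packing of a byte stream into blocks comes down to simple arithmetic. The sketch below assumes a 512-byte physical block. The function names are illustrative only.

```python
BLOCK_SIZE = 512  # assumed physical block size in bytes


def locate(byte_offset, block_size=BLOCK_SIZE):
    """Map a byte offset in the file to (block number, offset in block)."""
    return divmod(byte_offset, block_size)


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of whole blocks required to hold file_size bytes."""
    return -(-file_size // block_size)  # ceiling division


def internal_fragmentation(file_size, block_size=BLOCK_SIZE):
    """Bytes wasted in the last, partly filled block."""
    return blocks_needed(file_size, block_size) * block_size - file_size


print(locate(1300))                  # (2, 276)
print(blocks_needed(1949))           # 4
print(internal_fragmentation(1949))  # 99
```

A 1,949-byte file occupies four 512-byte blocks (2,048 bytes), so 99 bytes in the last block are wasted. This is the internal fragmentation that every block-based file system has.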
11.2 Access Methods
- Access method: how the information in a file is accessed.
- There are three popular ones:
- Sequential access method for sequential files
- Direct access method for direct files
- Indexed access method for indexed files
11.2.1 Sequential Access
- With the sequential access method, the file is processed in order, one record after the other.
- If p is the position of the current record, the next record to be accessed is either p+1 (read or write next) or p-1 (backspace).
11.2.2 Direct Access (Relative Access, Random Access)
- A file is made up of fixed-length logical records.
- The direct access method uses record number to identify each record. For example, read rec 0, write rec 100, seek rec 75, etc.
- Some systems may use a key field to access a record (e.g., read rec “Age=24” or write rec “Name=Dow”). This is usually achieved using hashing.
- Since records can be accessed in random order, direct access is also referred to as random access.
- The direct access method can simulate sequential access.
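A small sketch of direct access over fixed-length records, and of how sequential access can be simulated on top of it with a current-position counter. The class `DirectFile`, the 16-byte record size, and the in-memory `io.BytesIO` stream standing in for a disk file are all assumptions made for illustration.

```python
import io

RECORD_SIZE = 16  # assumed fixed logical record length in bytes


class DirectFile:
    def __init__(self, stream, record_size=RECORD_SIZE):
        self.stream = stream
        self.record_size = record_size
        self.cp = 0  # current position, used only by read_next()

    def read_rec(self, n):
        """Direct access: jump straight to record n."""
        self.stream.seek(n * self.record_size)
        return self.stream.read(self.record_size)

    def write_rec(self, n, data):
        if len(data) > self.record_size:
            raise ValueError("record too long")
        self.stream.seek(n * self.record_size)
        # Pad so that every record has exactly the same length.
        self.stream.write(data.ljust(self.record_size, b"\0"))

    def read_next(self):
        """Sequential access simulated as: read rec cp, then cp = cp + 1."""
        record = self.read_rec(self.cp)
        self.cp += 1
        return record


f = DirectFile(io.BytesIO())
for i, name in enumerate([b"alpha", b"beta", b"gamma"]):
    f.write_rec(i, name)

third = f.read_rec(2).rstrip(b"\0")    # random order: record 2 first
first = f.read_next().rstrip(b"\0")    # then sequentially from record 0
second = f.read_next().rstrip(b"\0")
```

The reverse simulation (direct access on top of a purely sequential file) is much less efficient, because reaching record n means reading past the n records before it.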
11.2.3 Other Access Methods (Indexed Access Method)
- With the indexed access method, a file is sorted in ascending order based on a number of keys.
- Each disk block may contain a number of fixed-length logical records.
- An index table stores the key of the first record in each block.
- We can search the index table to locate the block that contains the desired record. Then, search the block to find the desired record.
- This is essentially a one-level B-tree, B+ tree, or B* tree.
- Multi-level index access method is also possible.
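The one-level index lookup can be sketched as a binary search over the first key of each block, followed by a scan inside the chosen block. The sample blocks and the function name `lookup` are hypothetical. A real file would hold the blocks on disk and only the index in memory.

```python
from bisect import bisect_right

# Each block holds a few (key, record) pairs; keys ascend across the file.
blocks = [
    [(3, "c"), (8, "h")],
    [(12, "l"), (17, "q")],
    [(21, "u"), (30, "z")],
]

# The index table keeps the first key of every block.
index = [block[0][0] for block in blocks]


def lookup(key):
    """Find the block through the index, then search inside that block."""
    i = bisect_right(index, key) - 1
    if i < 0:
        return None  # key is smaller than every key in the file
    for k, record in blocks[i]:
        if k == key:
            return record
    return None


print(lookup(17))
print(lookup(13))
```

Only one block has to be read per lookup. A multi-level scheme applies the same idea again to the index table itself once it grows too large.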
11.3 Directory Structure
- A large disk may be divided into partitions (also called minidisks or volumes), which are low-level structures in which files and directories reside.
- Partitions can be thought of as virtual disks.
- Each partition contains information about the files within it. This information is stored in entries of a device directory or volume table of contents (VTOC).
- The device directory, or directory for short, stores the name, location, size, type, access method, etc. of each file.
- The directory can be viewed as a symbol table that translates file names into their directory entries.
- Operations performed on a directory: search for a file, create a file, delete a file, rename a file, traverse the file system, etc.
11.3.1 Single-Level Directory
- All files are contained in the same directory.
- It is difficult to maintain file name uniqueness.
- CP/M-80 and early versions of MS-DOS use this directory structure.
11.3.2 Two-Level Directory
- This is an extension of the single-level directory for multi-user systems.
- Each user has his/her own user file directory (UFD). The system’s master file directory (MFD) is searched for the user directory when a user job starts.
- Early CP/M-80 multi-user systems use this structure.
- To locate a file, a path name is used. For example, /user2/bo is the file bo of user2. A user name and a file name define a path name.
- Different systems use different path names. For example, under MS-DOS it may be C:\user2\bo.
- Many command interpreters act by simply treating the command as the name of a file to load and execute.
- The directory of a special user, say user0, may contain all system files.
11.3.3 Tree-Structured Directory
- Each directory or subdirectory contains files and subdirectories, and forms a tree.
- Directories are special files. All directories have the same internal format. One bit in each directory entry defines the entry as a file (0) or as a subdirectory (1).
- To change directories, a system call is provided that takes a directory name as a parameter and uses it to redefine the current directory.
- When a user logs in or a job starts, the OS searches the accounting file (or some other predefined location) to find an entry for this user. In the accounting file is a pointer to (or the name of) the user’s initial directory. This pointer is copied to a local variable for this user that specifies the user’s initial current directory.
- Allowing the user to define his own subdirectories permits him to impose a structure on his files.
11.3.4 Acyclic-Graph Directories
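- This type of directory allows a file or directory to be shared by multiple directories.
- This is different from two copies of the same file or directory: with a shared file there is only one actual file, so changes made by one user are immediately visible to the others.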
- An acyclic-graph directory is more flexible than a simple tree structure. However, it is more complex.
- Since a file has multiple absolute path names, how do we calculate file system statistics or do backup? Would the same file be duplicated multiple times?
- How do we delete a file?
- We can keep a list of all links to the file. Deleting a link only removes it from the list, and the file is removed when the list is empty.
- Or, with symbolic links, we remove the file and keep the links. When a dangling link is used again, a message is given and the link is removed.
- Or, we can maintain a reference count (hard links) for each shared file. The file is removed when the count reaches zero.
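The reference-count approach can be sketched as follows. The class `FileStore` and its method names are made up for this illustration. UNIX keeps a comparable link count in each inode, but this toy is not a model of the real data structure.

```python
class FileStore:
    def __init__(self):
        self.files = {}    # file id -> [data, reference count]
        self.links = {}    # path name -> file id
        self.next_id = 0

    def create(self, path, data):
        fid = self.next_id
        self.next_id += 1
        self.files[fid] = [data, 1]
        self.links[path] = fid

    def link(self, existing, new):
        """Add a second directory entry (hard link) for the same file."""
        fid = self.links[existing]
        self.files[fid][1] += 1
        self.links[new] = fid

    def unlink(self, path):
        """Remove one entry; free the file only when no entry is left."""
        fid = self.links.pop(path)
        self.files[fid][1] -= 1
        if self.files[fid][1] == 0:
            del self.files[fid]


store = FileStore()
store.create("/user1/report", "draft")
store.link("/user1/report", "/user2/shared-report")
store.unlink("/user1/report")
still_there = len(store.files) == 1   # one link remains
store.unlink("/user2/shared-report")
gone = len(store.files) == 0          # count reached zero
```

The count says how many entries refer to the file but not which ones, which is cheaper than keeping the full list of links.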
11.3.5 General Graph Directory
- It is easy to traverse the directories of a tree or an acyclic-graph directory system.
- However, if links are added arbitrarily, the directory graph becomes arbitrary and may contain cycles.
- How do we search for a file?
- How do we delete a file? We can use a reference count!
- In a cycle, due to self-reference, the reference count may be non-zero even when it is no longer possible to refer to a file or directory.
- Thus, garbage collection may be needed. A garbage collector traverses the entire file system and marks every file and directory that can be accessed.
- A second pass collects everything that is not marked onto a list of free space.
- Garbage collection for a disk-based file system, however, is extremely time-consuming and is thus seldom attempted.
- To avoid this time-consuming task, a system can check whether a cycle would be created at the time a link is made.
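Both ideas, mark-and-sweep garbage collection and the cycle check at link time, can be sketched on a toy directory graph. The graph is a plain dictionary mapping each directory to the entries it links to. The node names and function names are assumptions made for this example.

```python
def reachable(graph, root):
    """Mark pass: return every node that can be reached from root."""
    marked, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in marked:
            marked.add(node)
            stack.extend(graph.get(node, []))
    return marked


def sweep(graph, root):
    """Second pass: remove every node that was not marked."""
    marked = reachable(graph, root)
    garbage = sorted(set(graph) - marked)
    for node in garbage:
        del graph[node]
    return garbage


def would_create_cycle(graph, directory, target):
    """True if adding the link directory -> target would close a cycle."""
    return directory in reachable(graph, target)


# "x" and "y" refer to each other, so their reference counts are non-zero,
# yet neither can be reached from the root "/".
graph = {"/": ["a"], "a": ["f"], "f": [], "x": ["y"], "y": ["x"]}

link_back_to_root = would_create_cycle(graph, "a", "/")
link_to_leaf = would_create_cycle(graph, "/", "f")
garbage = sweep(graph, "/")
```

Note that the cycle check itself walks part of the graph, so it is not free either. It only moves the cost to the moment a link is created.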
11.4 File-System Mounting
- A file system must be mounted before it can be available to processes on the system.
- Mount procedure:
- The OS is given the name of the device, and the location within the file structure at which to attach the file system (the mount point).
- Next, the OS verifies that the device contains a valid file system. It does so by asking the device driver to read the device directory and verifying that the directory has the expected format.
- Finally, the OS notes in its directory structure that a file system is mounted at the specified mount point.
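The three steps of the mount procedure can be mimicked with a toy mount table. Everything here is hypothetical: the `MountTable` class, the device names, and the `"format"` flag that stands in for the driver's check of the device directory.

```python
class MountTable:
    def __init__(self, devices):
        self.devices = devices  # device name -> description of its contents
        self.mounts = {}        # mount point -> device name

    def mount(self, device, mount_point):
        # Step 1: the OS is given the device and the mount point.
        fs = self.devices.get(device)
        # Step 2: verify that the device holds a valid file system.
        if fs is None or fs.get("format") != "valid":
            raise ValueError(f"{device}: no valid file system")
        # Step 3: note the mount in the directory structure.
        self.mounts[mount_point] = device

    def resolve(self, path):
        """Return (device, path on that device) via the longest mount point."""
        best = None
        for mp in self.mounts:
            prefix = mp.rstrip("/")
            if path == mp or path.startswith(prefix + "/"):
                if best is None or len(mp) > len(best):
                    best = mp
        if best is None:
            raise FileNotFoundError(path)
        rest = path[len(best.rstrip("/")):] or "/"
        return self.mounts[best], rest


table = MountTable({
    "disk0": {"format": "valid"},
    "disk1": {"format": "valid"},
    "blank": {"format": None},
})
table.mount("disk0", "/")
table.mount("disk1", "/home")
print(table.resolve("/home/jane/a.txt"))
print(table.resolve("/etc/passwd"))
```

Path lookups that cross the mount point `/home` are handed over to the second device, which is what makes the mounted file system appear as part of a single tree.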
11.5 File Sharing
- When a file is shared by multiple users, how can we ensure its consistency?
- If multiple users are writing to the file, should all of the writers be allowed to write?
- Or, should the OS protect the user actions from each other?
- These questions are answered by the file system’s consistency semantics.
11.5.1 Multiple Users
- The owner is the user who may change attributes and grant access, and who has the most control over the file or directory.
- The group attribute of a file is used to define a subset of users who may share access to the file.
11.5.2 Remote File Systems
- Remote files can be shared through ftp, a DFS (distributed file system), or the WWW (World Wide Web).
11.5.2.1 The Client-Server Model
11.5.2.2 Distributed Information Systems
- LDAP (lightweight directory-access protocol).
11.5.2.3 Failure Modes
- RAID (redundant arrays of inexpensive disks).
11.5.3 Consistency Semantics
- Consistency semantics is a characterization of the system that specifies the semantics of multiple users accessing a shared file simultaneously.
- Consistency semantics is an important criterion for evaluating any file system that supports file sharing.
- There are three commonly used semantics
- UNIX semantics
- Session Semantics
- Immutable-Shared-Files Semantics
- A file session consists of all file accesses between open() and close().
11.5.4 UNIX Semantics
- Writes to an open file by a user are visible immediately to other users who have the file open at the same time.
- All users share the file pointer. Thus, advancing the file pointer by one user affects all sharing users.
- A file has a single image that interleaves all accesses, regardless of their origin.
11.5.5 Session Semantics
- Writes to an open file by a user are not visible immediately to other users that have the same file open simultaneously.
- Once a file is closed, the changes made to it are visible only in sessions started later.
- Already-open instances of the file do not reflect these changes.
- A file may be associated temporarily with several, possibly different, images at the same time.
- Multiple users are allowed to perform both read and write concurrently on their image of the file without delay.
- The Andrew File system (AFS) uses this semantics.
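Session semantics can be imitated with a private image per session that is written back only on close. This is a toy sketch with made-up class names (`SessionFile`, `Session`). It says nothing about how AFS actually implements the idea.

```python
class SessionFile:
    def __init__(self, data=""):
        self.data = data  # the image that newly opened sessions will see

    def open(self):
        return Session(self)


class Session:
    def __init__(self, file):
        self.file = file
        self.image = file.data  # private image taken at open()

    def read(self):
        return self.image

    def write(self, text):
        self.image += text  # changes only this session's image

    def close(self):
        self.file.data = self.image  # visible to sessions opened later


f = SessionFile("v1")
a = f.open()
b = f.open()
a.write("+a")
seen_by_b_before_close = b.read()  # "v1": the write is not visible
a.close()
seen_by_b_after_close = b.read()   # still "v1": b was already open
c = f.open()
seen_by_c = c.read()               # "v1+a": session started after close
```

Under UNIX semantics, by contrast, `b` would have seen the write immediately, because all users share a single image of the file.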
11.5.6 Immutable-Shared-Files Semantics
- Once a file is declared as shared by its creator, it cannot be modified.
- An immutable file has two important properties:
- Its name may not be reused
- Its content may not be altered
- Thus, the name of an immutable file indicates that the contents of the file are fixed – a constant rather than a variable.
- The implementation of these semantics in a distributed system is simple, since sharing is disciplined (i.e., read-only).
11.6 Protection
- We want to keep files safe from physical damage (i.e., reliability) and improper access (i.e., protection).
- Reliability is generally provided by duplicate copies of files (backups).
- The need for file protection is a direct result of the ability to access files.
- Protection can be complete, by prohibiting access altogether, or the access can be controlled.
11.6.1 Types of Access
- Access control may be implemented by limiting the types of file access that can be made.
- The types of access may be
- Read: read from the file
- Write: write or rewrite the file
- Execute: load the file into memory and execute it
- Append: write new info at the end of a file
- Delete: delete a file
- List: list the name and attributes of the file
11.6.2 Access Control
- The most commonly used approach is to make the access dependent on the identity of the user.
- Each file and directory is associated with an access matrix specifying the user name and the types of permitted access.
- When a user makes a request to access a file or a directory, his/her identity is compared against the information stored in the access matrix.
- Access-control Lists
- In practice, the access matrix is sparse.
- The matrix can be decomposed into columns (files), yielding access-control lists (ACL).
- However, this list can be very long!
- Capability Lists
- Decomposition by rows (users) yields capability tickets.
- Each user holds a number of tickets (capabilities) for file/directory access.
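The two decompositions of the access matrix can be sketched with plain dictionaries. The users, file names, and rights below are invented sample data, and the helper names are illustrative only.

```python
from collections import defaultdict

# Sparse access matrix: only the non-empty cells are stored.
access_matrix = {
    ("alice", "report.txt"): {"read", "write"},
    ("bob", "report.txt"): {"read"},
    ("alice", "a.out"): {"read", "execute"},
}


def to_acls(matrix):
    """Decompose by columns: one access-control list per file."""
    acls = defaultdict(dict)
    for (user, obj), rights in matrix.items():
        acls[obj][user] = rights
    return dict(acls)


def to_capabilities(matrix):
    """Decompose by rows: one capability list per user."""
    caps = defaultdict(dict)
    for (user, obj), rights in matrix.items():
        caps[user][obj] = rights
    return dict(caps)


def allowed(acls, user, obj, right):
    """Check a request against the ACL stored with the object."""
    return right in acls.get(obj, {}).get(user, set())


acls = to_acls(access_matrix)
caps = to_capabilities(access_matrix)
print(allowed(acls, "bob", "report.txt", "read"))
print(allowed(acls, "bob", "report.txt", "write"))
```

An ACL is stored with the file and answers "who may touch this file?", while a capability list is held per user and answers "what may this user touch?".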
11.6.3 Other Protection Approaches
11.6.4 An Example: UNIX