    These are my notes from reading OPERATING SYSTEM CONCEPTS (6th Edition). My progress was quite slow at first, until one day I came across the lecture slides (based on the 7th edition of the book) that Professor 冼鏡光 shared online, and studying OS became more and more interesting.
    The chapters of these notes follow the 6th edition; the figures are all pasted from Professor 冼鏡光's slides. My thanks to Professor 冼 for his generous sharing, which gave me the chance to glimpse the subtleties of OS.
   
(If there are any copyright concerns, please kindly notify me.)


14.    Mass-Storage Structure

14.1  Disk Structure
  • Modern disk drives are addressed as large one-dimensional arrays of logical blocks, where the logical block is the smallest unit of transfer.
  • Three elements: cylinder, track and sector/block.
  • Logical block number -> old-style disk address: (1) a cylinder number, (2) a track number within that cylinder, (3) a sector number within that track (see the sketch after this list).
  • Constant Linear Velocity (CLV): The density of bits per track is uniform, so tracks farther from the center of the disk hold more sectors. The drive increases its rotation speed as the head moves from the outer to the inner tracks, keeping the rate of data moving under the head constant. Used by CD-ROM drives.
  • Constant Angular Velocity (CAV): The disk rotation speed stays constant, and the density of bits decreases from inner tracks to outer tracks to keep the data rate constant. Used by hard disks.
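  A minimal sketch of this logical-block-to-(cylinder, track, sector) translation, assuming a drive with a fixed, uniform geometry (the function and parameter names are mine, for illustration):

      def lba_to_chs(lba, tracks_per_cylinder, sectors_per_track):
          # Translate a logical block number into an old-style
          # (cylinder, track, sector) address.
          sectors_per_cylinder = tracks_per_cylinder * sectors_per_track
          cylinder = lba // sectors_per_cylinder
          track = (lba % sectors_per_cylinder) // sectors_per_track
          sector = lba % sectors_per_track
          return cylinder, track, sector

      # Example: block 5000 on a drive with 16 tracks/cylinder, 63 sectors/track
      print(lba_to_chs(5000, 16, 63))   # -> (4, 15, 23)

  In practice the translation is harder: defective sectors are remapped by the controller, and the number of sectors per track is not constant across the disk.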

14.2   Disk Scheduling
  • Seek time: the time for the disk arm to move the heads to the cylinder containing the desired sector.
  • Rotational latency: the additional time waiting for the disk to rotate the desired sector to the disk head.
  • Disk bandwidth: The total number of bytes transferred, divided by the total time between the first request for service and the completion of the last transfer.
  • Three types of latency (i.e., delay):
    • Seek delay: moving the arm to the desired cylinder; mechanical, and the slowest. (Seek delay plus rotational delay is also called positional delay.)
    • Rotational delay: waiting for the desired sector to rotate under the head.
    • Transfer delay: moving the data to or from the disk surface.
  • Computing disk latency:
    • Track size: 32K = 32,768 bytes
    • Rotation time: 16.67 msec
    • Average seek time: 30 msec
    • The average time to transfer k bytes = average seek time + half a rotation + transfer time = 30 + 16.67/2 + (k/32,768) × 16.67 msec (see the sketch below).
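    In code, the same computation (a sketch using the parameters above; the constant names are mine):

        # Average time to transfer k bytes: 30 ms average seek,
        # 16.67 ms per rotation, 32,768-byte tracks.
        SEEK_MS = 30.0
        ROTATION_MS = 16.67
        TRACK_BYTES = 32768

        def avg_transfer_time_ms(k):
            rotational = ROTATION_MS / 2               # on average, half a rotation
            transfer = (k / TRACK_BYTES) * ROTATION_MS
            return SEEK_MS + rotational + transfer

        print(avg_transfer_time_ms(1024))    # ~38.9 ms, dominated by seek + rotation
        print(avg_transfer_time_ms(32768))   # ~55.0 ms for a full track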

  • Disk Block Interleaving: consecutive logical blocks on a track are separated by one or more physical sectors, so the controller has time to finish transferring one block before the next logical block rotates under the head (see the sketch below).
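  A toy sketch of single interleaving (my own illustration of the layout, not code from the slides):

      def interleave(n_sectors, skip):
          # Place logical blocks 0..n-1 on a track, advancing skip+1
          # physical slots per block; skip=0 means no interleaving.
          track = [None] * n_sectors
          pos = 0
          for block in range(n_sectors):
              while track[pos] is not None:        # slot taken, slide forward
                  pos = (pos + 1) % n_sectors
              track[pos] = block
              pos = (pos + 1 + skip) % n_sectors
          return track

      print(interleave(8, 0))   # [0, 1, 2, 3, 4, 5, 6, 7]  (no interleaving)
      print(interleave(8, 1))   # [0, 4, 1, 5, 2, 6, 3, 7]  (single interleaving)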

  • Cylinder Skew: the first sector of each track is offset from that of the neighboring track, so that after the head moves to the adjacent cylinder the first sector has not yet rotated past the head.

  • Since seeking is time consuming, disk scheduling algorithms try to minimize this latency.
  • Since seeking only involves cylinders, the input to these algorithms is a sequence of cylinder numbers.
14.2.1          FCFS Scheduling (First-Come, First-Served)

14.2.2          SSTF Scheduling (Shortest-Seek-Time-First, SJF, Shortest-Job-First)

  • SSTF chooses the pending request closest to the current head position. Like SJF, it may starve requests that are far from the head.
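  A minimal sketch comparing FCFS and SSTF head movement, using the well-known example queue from the textbook (cylinders 0-199, head initially at cylinder 53):

      def fcfs(requests, head):
          # Service requests in arrival order; sum the head movement.
          total = 0
          for cyl in requests:
              total += abs(cyl - head)
              head = cyl
          return total

      def sstf(requests, head):
          # Always pick the pending request nearest the current head.
          pending, total = list(requests), 0
          while pending:
              nearest = min(pending, key=lambda c: abs(c - head))
              total += abs(nearest - head)
              head = nearest
              pending.remove(nearest)
          return total

      queue, start = [98, 183, 37, 122, 14, 124, 65, 67], 53
      print(fcfs(queue, start))   # 640 cylinders of head movement
      print(sstf(queue, start))   # 236 cylinders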
14.2.3          SCAN Scheduling
  • This algorithm requires one more piece of information: the disk head movement direction, inward or outward.
  • The disk head starts at one end and moves toward the other end, servicing requests as it reaches each cylinder.
  • At the other end, the direction is reversed and service continues.
  • Some authors refer to the SCAN algorithm as the elevator algorithm; to others, however, the elevator algorithm means the LOOK algorithm.

14.2.4          C-SCAN Scheduling
  • C-SCAN is a variation of SCAN.
  • When the disk head reaches one end, it returns immediately to the other end without servicing requests on the return trip. The cylinders are thus treated as a circular list that wraps around.
  • The C in C-SCAN means circular.
14.2.5          LOOK Scheduling
  • With SCAN and C-SCAN, the disk head moves across the full width of the disk.
  • This is very time consuming. In practice, SCAN and C-SCAN are not implemented this way.
  • LOOK: It is a variation of SCAN. The disk head goes as far as the last request and reverses its direction.
  • C-LOOK: It is similar to C-SCAN. The disk head goes only as far as the last request in each direction, then jumps back to the first request at the other end without servicing requests on the way (see the sketch below).

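  A similar sketch for SCAN (head first moving toward cylinder 0) and C-LOOK, on the same example queue; these are simplified illustrations, not production schedulers:

      def scan(requests, head):
          # SCAN moving toward 0 first: service the downward requests,
          # touch the end of the disk, then sweep back up.
          lower = sorted(c for c in requests if c < head)
          upper = sorted(c for c in requests if c >= head)
          total, pos = 0, head
          for c in lower[::-1] + [0] + upper:
              total += abs(c - pos)
              pos = c
          return total

      def c_look(requests, head):
          # C-LOOK: service upward to the last request, then jump back
          # to the lowest pending request (no servicing during the jump).
          lower = sorted(c for c in requests if c < head)
          upper = sorted(c for c in requests if c >= head)
          total, pos = 0, head
          for c in upper + lower:
              total += abs(c - pos)
              pos = c
          return total

      queue, start = [98, 183, 37, 122, 14, 124, 65, 67], 53
      print(scan(queue, start))     # 236 cylinders (down to 0, then up)
      print(c_look(queue, start))   # 322 cylinders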

14.2.6          Selection of a Disk-Scheduling Algorithm
  • The pattern of disk-service requests can be greatly influenced by the file-allocation method.
14.3  Disk Management
14.3.1          Disk Formatting
  • Before a disk can store data, it must be divided into sectors that the disk controller can read and write. This process is called low-level formatting (physical formatting).
  • Low-level formatting fills the disk with a special data structure for each sector. The data structure for a sector typically consists of a header, a data area (usually 512 bytes in size), and a trailer.
  • To use a disk to hold files, the OS still needs to record its own data structures on the disk.
    • The first step is to partition the disk into one or more groups of cylinders.
    • After partitioning, the second step is logical formatting (or creation of a file system). In this step, the OS stores the initial file-system data structures onto the disk. These data structures may include maps of free and allocated space (a FAT or inodes) and an initial empty directory.
  • Raw I/O bypasses all the file-system services, such as the buffer cache, file locking, prefetching, space allocation, file names, and directories.
14.3.2          Boot Block
  • Most systems store a tiny bootstrap loader program in the boot ROM, whose only job is to bring in a full bootstrap program from disk. The full bootstrap program is stored in a partition called the boot blocks, at a fixed location on the disk. A disk that has a boot partition is called a boot disk or system disk.
  • The code in the boot ROM instructs the disk controller to read the boot blocks into memory (no device drivers are loaded at this point), and then starts executing that code.
14.3.3          Bad Blocks
  • Sector sparing (forwarding): Low-level formatting can set aside spare sectors not visible to the OS. The controller can be told to replace each bad sector logically with one of the spare sectors (see the sketch after this list).
  • Sector slipping: every sector from the defective sector up to the next available spare is remapped down by one position (the last one moving into the spare), which frees a slot next to the bad sector and preserves the sequential ordering of sectors.
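  A toy sketch of the remapping idea behind sector sparing; the table and names here are hypothetical, since real controllers keep this mapping in firmware:

      remap = {}                      # bad sector -> substituted spare
      spares = [1000, 1001, 1002]     # spare sectors hidden from the OS

      def spare_out(bad_sector):
          # Tell the "controller" to replace a bad sector with a spare.
          remap[bad_sector] = spares.pop(0)

      def translate(sector):
          # Every access goes through the remap table first.
          return remap.get(sector, sector)

      spare_out(87)
      print(translate(87))   # -> 1000 (redirected)
      print(translate(88))   # -> 88   (untouched)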
14.4  Swap-Space Management
  • Virtual memory uses disk space as an extension of main memory.
14.4.1          Swap-Space Use
  • The swap spaces used in UNIX are usually put on separate disks, so the load placed on the I/O system by paging and swapping can be spread over the system’s I/O devices.
14.4.2          Swap-Space Location
14.4.3          Swap-Space Management: An Example
14.5  RAID Structure
  • RAID: Redundant Arrays of Inexpensive (or Independent) Disks.
  • RAID is a set of physical drives viewed by the OS as a single logical drive.
  • Data are distributed across the physical drives of an array.
  • Redundant disk capacity is used to store parity information, which guarantees data recoverability in case of disk failure.
  • RAID has six levels; a higher level is not necessarily an extension of a lower one.

14.5.1          Improvement of Reliability via Redundancy
14.5.2          Improvement in Performance via Parallelism
  • Data striping consists of splitting the bits of each byte across multiple disks; such striping is called bit-level striping.
14.5.3          RAID Levels
  • RAID Level 0: Non-redundant striping
    • The virtual single disk simulated by RAID is divided into strips of k sectors each (block-level striping).
    • Consecutive strips are written to the drives in a round-robin fashion; there is no redundancy (see the sketch below).
    • If a single I/O request consists of multiple contiguous strips, then multiple strips can be handled in parallel.
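    A one-line sketch of the round-robin placement (the function name is mine):

        def locate_strip(strip, num_disks):
            # RAID 0: which drive holds a given strip, and at what
            # strip offset within that drive.
            return strip % num_disks, strip // num_disks

        # Strips 0..7 over 4 drives, as (drive, offset) pairs:
        print([locate_strip(s, 4) for s in range(8)])
        # [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]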

  • RAID Level 1: Mirrored disks (also called shadowing)
    • Each logical strip is mapped to two separate physical drives so that every drive in the array has a mirror drive that contains the same data.
    • Recovery from a disk failure is simple due to redundancy.
    • A write request involves two parallel disk writes.
    • Problem: Cost is high (doubled)!

  • Parallel Access
    • RAID Levels 2 and 3 require a parallel-access technique: all member disks participate in the execution of every I/O request, and the spindles of the individual drives are synchronized so that each disk head is in the same position on each disk at any given time.
    • Data strips are very small, usually a single byte or word.
  • RAID Level 2: memory-style error-correcting codes
    • Memory systems have long implemented error detection using parity bits: each byte may have a parity bit associated with it that records whether the number of bits set to 1 in the byte is even (parity = 0) or odd (parity = 1).
    • An error-correcting code is calculated across the corresponding bits of each data disk, and the bits of the code are stored in the corresponding bit positions on additional disks.
    • Example: An 8-bit byte is divided into two 4-bit nibbles. Each nibble is extended with a 3-bit Hamming code to form a 7-bit word, of which bits 1, 2 and 4 are parity bits. The 7-bit Hamming-coded word is written to seven disks, one bit per disk (see the sketch after this list).
    • Cost is high, although the number of bits needed is less than that of RAID 1 (mirror).
    • The number of redundant disks is O(log2 n), where n is the number of data disks.
    • On a single read, all disks are accessed at the same time. The requested data and the associated error-correcting code are delivered to the controller. If there is an error, the controller reconstructs the data using the error-correcting code, so read access is not slowed.
    • RAID 2 would only be an effective choice in environments in which many disk errors occur.
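    A sketch of the Hamming(7,4) encoding in the nibble example above; bit numbering follows the usual convention that positions 1, 2 and 4 carry parity:

        def hamming74_encode(nibble):
            # Encode 4 data bits (d1..d4) into a 7-bit Hamming word;
            # each bit of the word goes to a separate disk.
            d1, d2, d3, d4 = nibble
            p1 = d1 ^ d2 ^ d4     # checks positions 1, 3, 5, 7
            p2 = d1 ^ d3 ^ d4     # checks positions 2, 3, 6, 7
            p4 = d2 ^ d3 ^ d4     # checks positions 4, 5, 6, 7
            return [p1, p2, d1, p4, d2, d3, d4]   # positions 1..7

        print(hamming74_encode([1, 0, 1, 1]))     # -> [0, 1, 1, 0, 0, 1, 1]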
  • RAID Level 3: Bit-interleaved parity
    • RAID 3 is a simplified version of RAID 2. It needs only one redundant drive.
    • A single parity bit is computed for each data word and written to a parity drive.
    • Example: The parity bit of bits 1-4 is written to the same position on the parity drive.
    • If the failing drive is known, the parity bits together with the surviving bits can be used to reconstruct each data word (see the sketch after this list).
    • Because data are striped in very small strips, RAID 3 can achieve very high data transfer rates.
    • Any I/O request will involve the parallel transfer of data from all of the data disks. However, only one I/O request can be executed at a time.
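    A sketch of XOR parity and single-failure reconstruction as described above:

        from functools import reduce

        def parity(strips):
            # The parity strip is the bitwise XOR of all data strips.
            return reduce(lambda a, b: a ^ b, strips)

        def reconstruct(survivors, parity_strip):
            # XOR-ing the surviving strips with the parity strip
            # reproduces the strip on the failed drive.
            return reduce(lambda a, b: a ^ b, survivors, parity_strip)

        data = [0b1011, 0b0110, 0b1100, 0b0001]     # four data drives
        p = parity(data)
        lost = data[2]                              # pretend drive 2 fails
        print(reconstruct(data[:2] + data[3:], p) == lost)   # True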
  • RAID Level 4: Block-interleaved parity
    • RAID 4 and RAID 5 work with strips rather than individual data words, and do not require synchronized drives.
    • The parity of all strips in the same “row” is written to a parity drive.
    • Example: strips 0, 1, 2 and 3 are exclusive-ORed, resulting in a parity strip.
    • If a drive fails, its contents can be reconstructed from the remaining data drives and the parity drive.
    • If a single sector is bad, recovering it still requires reading the corresponding sector of every other drive, including the parity drive.
    • The load on the parity drive is very heavy, since every write must update it.
    • Read-modify-write: a small write requires four disk accesses: two reads (the old data block and the old parity block) and two writes (the new data block and the new parity block), since new parity = old parity XOR old data XOR new data (see the sketch below).
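    A sketch of the small-write parity update; this XOR identity is why four accesses suffice instead of reading every drive:

        def small_write(old_data, old_parity, new_data):
            # Two reads (old data, old parity) + two writes (new data,
            # new parity).  The new parity follows from XOR algebra:
            new_parity = old_parity ^ old_data ^ new_data
            return new_data, new_parity

        _, p = small_write(0b1010, 0b0111, 0b0011)
        print(bin(p))   # 0b1110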
  • RAID Level 5: Block-interleaved distributed parity
    • To avoid the bottleneck of the parity drive of RAID 4, the parity strips can be distributed uniformly over all drives in a round-robin fashion.
    • However, data recovery from a disk crash is more complex.

  • RAID Level 6: P+Q redundancy. Like RAID 5, but it stores extra redundant information (e.g., error-correcting codes such as Reed-Solomon) so the array can survive multiple simultaneous disk failures.
  • RAID 0+1 (a set of striped disks that is then mirrored) and RAID 1+0 (pairs of mirrored disks that are then striped).
14.5.4          Selecting a RAID Level
14.5.5          Extensions
14.6  Disk Attachment
14.6.1          Host-Attached Storage
14.6.2          Network-Attached Storage
  • Clients access network-attached storage (NAS) via a remote-procedure-call interface such as NFS for UNIX systems, or CIFS for Windows machines.
14.6.3          Storage-Area Network
14.7  Stable-Storage Implementation
14.8  Tertiary-Storage Structure
14.8.1          Tertiary-Storage Devices
14.8.1.1       Removable Disks
14.8.1.2       Tapes
14.8.1.3       Future Technology
14.8.2          Operating-System Jobs
14.8.2.1       Application Interface
14.8.2.2       File Naming
14.8.2.3       Hierarchical Storage Management
14.8.3          Performance Issues
14.8.3.1       Speed
14.8.3.2       Reliability
14.8.3.3       Cost
