1. Hard Disk Driver
There are two kinds of hard disk interfaces: Parallel ATA (PATA) and Serial
ATA (SATA). PATA is also known as IDE. SATA transfers data over a serial link
and is faster. SATA can be configured to be backward compatible with PATA, but
not vice versa. Older PATA uses PIO (Programmed I/O). Newer PATA, e.g. ATA-3, may
use either PIO or DMA (Direct Memory Access). In general, DMA is faster and
better suited to transferring large amounts of data. However, to use DMA, the PC
must be in protected mode in order to access the PCI bus. When the amount of
data transferred is small, e.g. in usual file system operations, it is actually
more advantageous to use PIO. For simplicity, this section presents an IDE hard
disk driver that uses PIO.
A PC's IDE bus usually has two IDE channels, denoted as IDE0 and IDE1. Each
IDE channel can support two devices, known as master and slave. A hard drive
is typically the master device of IDE0. Each IDE channel has a set of fixed I/O
port addresses. The port addresses of IDE0 and the meaning of their contents are
listed below.
1.1. Primary IDE I/O Addresses:
------------------------------------------------------------------------------
Control Register:
0x3F6 = 0x08 (0000 1RE0): R=reset, E=0: enable interrupt
Command Block Registers:
0x1F0 = Data Port
0x1F1 = Error
0x1F2 = Sector Count
0x1F3 = LBA low byte
0x1F4 = LBA mid byte
0x1F5 = LBA hi byte
0x1F6 = 1B1D TOP4LBA: B=LBA, D=drive, TOP4LBA=top 4 bits of LBA
0x1F7 = Command/status
Status Register(0x1F7):
7 6 5 4 3 2 1 0
BUSY READY FAULT SEEK DRQ CORR INDEX ERROR
Error Register (0x1F1): if status.ERROR=1
7 6 5 4 3 2 1 0
BBK UNC MC IDNF MCR ABRT T0NF AMNF
-------------------------------------------------------
BBK = Bad Block
UNC = Uncorrectable data error,
MC = Media Changed
IDNF= ID mark Not Found
MCR = Media Change Requested
ABRT= Command aborted
T0NF= Track 0 Not Found
AMNF= Address Mark Not Found
------------------------------------------------------------------------------
Interrupt: IRQ = 14, Interrupt Vector = 0x76
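The bit layouts of the Status and Error registers above can be turned into a
small decoding helper, which is handy when printing diagnostics. The following
is only a sketch: decode_bits(), decode_status() and decode_error() are
hypothetical names that do not appear in the driver, and the code assumes a
hosted C compiler rather than the MTX kernel environment.

```c
#include <string.h>

/* Bit names, listed from bit 7 down to bit 0, as in the tables above. */
static const char *status_names[8] =
    { "BUSY", "READY", "FAULT", "SEEK", "DRQ", "CORR", "INDEX", "ERROR" };
static const char *error_names[8] =
    { "BBK", "UNC", "MC", "IDNF", "MCR", "ABRT", "T0NF", "AMNF" };

/* Hypothetical helper: write the names of all set bits of value into out,
   separated by single spaces. out must hold at least 64 bytes. */
char *decode_bits(unsigned char value, const char *names[8], char *out)
{
    int bit;
    out[0] = '\0';
    for (bit = 7; bit >= 0; bit--) {
        if (value & (1 << bit)) {
            if (out[0] != '\0')
                strcat(out, " ");
            strcat(out, names[7 - bit]);   /* names[0] is bit 7 */
        }
    }
    return out;
}

/* Decode a byte read from the Status Register (0x1F7). */
char *decode_status(unsigned char status, char *out)
{
    return decode_bits(status, status_names, out);
}

/* Decode a byte read from the Error Register (0x1F1). */
char *decode_error(unsigned char error, char *out)
{
    return decode_bits(error, error_names, out);
}
```

For example, a status byte of 0x58 decodes to "READY SEEK DRQ", which is the
state in which the drive is ready to transfer a sector through the data port.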
1.2. IDE Operation Sequence:
The sequence of IDE operations is as follows.
(1). Initialize HD:
Write 0x08 to Control Register (0x3F6): (E bit=0 = enable interrupt).
(2). Read Status Register (0x1F7) until drive is notBusy and READY;
(3). Write sector count, LBA sector number, drive (master=0x00, slave=0x10) to
command registers (0x1F2-0x1F6).
(4). Write READ|WRITE command to Command Register (0x1F7).
(5). For a write operation, wait until drive is READY and DRQ (drive request
for data). Then, write data to data port.
(6). Each (512-byte) sector READ|WRITE generates an interrupt.
I/O can be done in two ways:
(7a). Process "waits" for each interrupt. When sector R/W completes, interrupt
handler "unblocks" process, which continues to read/write the next sector
of data from/to data port.
(7b). Process starts R/W. For write (multi-sectors) operation, process writes
the first sector of data, then "waits" for the FINAL status interrupt.
Interrupt handler transfers remaining sectors of data on each interrupt.
When R/W of all sectors is done, it "unblocks" the process on the FINAL
status interrupt. This scheme is better because it does not unblock
(wakeup) processes unnecessarily.
(8). Error Handling:
After each R/W operation (interrupt), read status register. If status.ERROR
bit is on, detailed error information is in the error register (0x1F1).
Recovery from error may need HD reset.
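Step (3) of the sequence above splits a 28-bit LBA sector number across four
registers. A minimal sketch of that encoding is shown below. The struct
ide_regs type and the ide_encode() function are hypothetical helpers used only
for illustration; the actual driver writes the bytes directly with out_byte().

```c
#include <stdint.h>

/* Hypothetical container for the values written to ports 0x1F2-0x1F6. */
struct ide_regs {
    uint8_t sec_count;   /* 0x1F2: number of sectors to transfer      */
    uint8_t lba_low;     /* 0x1F3: LBA bits 0-7                       */
    uint8_t lba_mid;     /* 0x1F4: LBA bits 8-15                      */
    uint8_t lba_hi;      /* 0x1F5: LBA bits 16-23                     */
    uint8_t drive_head;  /* 0x1F6: 1B1D + LBA bits 24-27 (B=1: LBA)   */
};

/* Encode an LBA28 request. drive is 0 for master, nonzero for slave. */
struct ide_regs ide_encode(uint32_t lba, int drive, uint8_t nsectors)
{
    struct ide_regs r;
    r.sec_count  = nsectors;
    r.lba_low    = (uint8_t)(lba & 0xFF);
    r.lba_mid    = (uint8_t)((lba >> 8) & 0xFF);
    r.lba_hi     = (uint8_t)((lba >> 16) & 0xFF);
    /* 0xE0 = 1110 0000 selects LBA mode; 0x10 selects the slave drive */
    r.drive_head = (uint8_t)(0xE0 | (drive ? 0x10 : 0x00) |
                             ((lba >> 24) & 0x0F));
    return r;
}
```

For sector 0x01234567 on the master drive, this yields 0x67, 0x45, 0x23 for
the three LBA bytes and 0xE1 for port 0x1F6. The same sector on the slave
drive gives 0xF1 for port 0x1F6.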
2. First HD Driver
Shown below is an HD driver based on 1.2.(7a). For each hd_rw() call, the
process starts the r/w of one sector and blocks on a semaphore, hd_sem, for
interrupts. After reading/writing each sector, the interrupt handler simply
unblocks the process, which continues to r/w the next sector. The hd_mutex
semaphore ensures that only one process executes hd_rw() at a time.
/****************************************************************************
KCW: IDE hard disk driver using PIO
****************************************************************************/
#define INT_CTL 0x20 // I/O port for 1st 8259 interrupt controller
#define INT_CTLMASK 0x21 // 0 bits allow IRQ interrupts
#define INT2_CTL 0xA0 // I/O port for 2nd 8259 interrupt controller
#define INT2_MASK 0xA1 // 0 bits allow IRQ interrupts
#define HD_DATA 0x1f0 // data port for R/W
#define HD_ERROR 0x1f1 // error register
#define HD_SEC_COUNT 0x1f2 // R/W sector count
#define HD_LBA_LOW 0x1f3 // LBA low byte
#define HD_LBA_MID 0x1f4 // LBA mid byte
#define HD_LBA_HI 0x1f5 // LBA high byte
#define HD_LBA_DRIVE 0x1f6 // 1B1D0000 => B=LBA, D=drive => 0xE0 or 0xF0
#define HD_CMD 0x1f7 // command : R=0x20 W=0x30
#define HD_STATUS 0x1f7 // status register
#define HD_CONTROL 0x3f6 // 0x08 (0000 1RE0): Reset, E=1: NO interrupt
/* HD disk controller command bytes. */
#define HD_READ 0x20 // read
#define HD_WRITE 0x30 // write
/* Parameters for the disk drive. */
#define BLOCK_SIZE 4096 // Linux HD block size
#define SECTOR_SIZE 512 // sector size in bytes
#define BAD -1 // return BAD on error
struct semaphore hd_mutex; // semaphore for procs hd_rw() ONE at a time
struct semaphore hd_sem; // semaphore for proc to wait for IDE interrupts
int delay()
{
int i; for (i=0; i < 10000; i++);
}
int hd_reset()
{
/****************** HD software reset sequence *******************
ControlRegister (0x3F6)=(0000 1RE0); R=reset, E=0:enable interrupt
Strobe R bit from HI to LO; with delay time in between:
Write 0000 1100 to ControlReg; delay();
Write 0000 1000 to ControlReg; wait for notBUSY & no error
*****************************************************************/
out_byte(0x3F6, 0x0C); delay();
out_byte(0x3F6, 0x08); delay();
if (hd_busy() || hd_error()) {
printf("HD reset error\n"); return(BAD);
}
return 0; // return 0 means OK
}
int hd_busy() // test for BUSY
{
return in_byte(HD_STATUS) & 0x80;
}
int hd_ready() // test for READY
{
return in_byte(HD_STATUS) & 0x40;
}
int hd_drq() // test for DRQ
{
return in_byte(HD_STATUS) & 0x08;
}
int hd_error() // test for error
{
int r;
if (in_byte(0x1F7) & 0x01){ // status.ERROR bit on
r = in_byte(0x1F1); // read error register
printf("HD error=%x\n", r);
return r;
}
return 0; // return 0 for OK
}
int hd_init()
{
printf("hd_init\n");
hd_mutex.value = 1;
hd_mutex.queue = 0;
hd_sem.value = hd_sem.queue = 0;
}
/* HD interrupt handler : simply "wakeup" the blocked process */
int hdhandler()
{
printf("hd interrupt! ");
V(&hd_sem); // wakeup blocked process
out_byte(0xA0, 0x20); // send EOI to both 8259 controllers
out_byte(0x20, 0x20);
}
int hd_rw(rw, sector, buf, nsectors) // read/write nsectors
u16 rw; u32 sector; char *buf; u16 nsectors;
{
int i;
P(&hd_mutex); // procs execute hd_rw() ONE at a time
hd_sem.value=hd_sem.queue = 0;
while(hd_busy() || !hd_ready()); // wait until notBUSY & READY
printf("\nHD NOT_BUSY and READY: write to IDE registers\n");
out_byte(0x3F6, 0x08); // control = 0x08; interrupt
out_byte(0x1F2, nsectors); // sector count
out_byte(0x1F3, sector); // LBA low byte
out_byte(0x1F4, sector>>8); // LBA mid byte
out_byte(0x1F5, sector>>16); // LBA high byte
out_byte(0x1F6, ((sector>>24)&0x0F) | 0xE0); // use LBA for drive 0
out_byte(0x1F7, rw); // READ | WRITE command
// ONE interrupt per sector read|write; transfer data via DATA port
for (i=0; i < nsectors; i++){ // loop for each sector
if (rw==HD_READ){
P(&hd_sem); // wait for interrupt
if (hd_error())
break;
read_port(0x1F0, getds(), buf, 512); // getds() return DS
buf += 512;
}
else{ // for HD_WRITE, must wait until notBUSY and DRQ=1
while (hd_busy() || !hd_drq());
write_port(0x1F0, getds(), buf, 512); // getds() returns DS
buf += 512;
P(&hd_sem); // wait for interrupt
if (hd_error())
break;
}
} // end loop
V(&hd_mutex); // release hd_mutex lock
if (hd_error())
return BAD;
return 0;
}
/********************** HD driver testing code *************************/
struct hd {
u32 start_sector;
u32 size; // size in number of sectors
} hda[16]; // hda[] for drive 0; hdb[] for drive 1
struct partition {
u8 drive; /* 0x80 - active */
u8 head; /* starting head */
u8 sector; /* starting sector */
u8 cylinder; /* starting cylinder */
u8 sys_type; /* partition type */
u8 end_head; /* end head */
u8 end_sector; /* end sector */
u8 end_cylinder; /* end cylinder */
u32 start_sector; /* starting sector counting from 0 */
u32 nr_sectors; /* nr of sectors in partition */
};
char pbuf[BLOCK_SIZE];
/******** read MBR and display partitions ********/
int partition()
{
int i;
struct hd *hp;
struct partition *p;
printf("reading MBR ");
// read partition tables and initialize start_sector, size fields
hd_read((u32)0, pbuf, 1); //read 1 sector into pbuf[]
p = (struct partition *)(&pbuf[0x1be]);
for (i=1; i <= 4; i++){
hp = &hda[i];
hp->start_sector = p->start_sector;
hp->size = p->nr_sectors;
p++;
}
printf("\nshow partition table:\n");
for (i=1; i <= 4; i++){
hp = &hda[i];
printf("%l %l\n", (u32)hp->start_sector, (u32)hp->size);
}
}
int hd()
{
printf("testing HD driver\n");
hd_init();
partition();
hdw();
hdr();
}
hd_read(sector, rbuf, nsectors)
u32 sector; char *rbuf; u16 nsectors;
{
return hd_rw(HD_READ, (u32)sector, rbuf, nsectors);
}
hd_write(sector, rbuf, nsectors)
u32 sector; char *rbuf; u16 nsectors;
{
return hd_rw(HD_WRITE, (u32)sector, rbuf, nsectors);
}
// HD driver testing routines
char sbuf[BLOCK_SIZE];
int hdr()
{
int i;
u32 sector;
sector = hda[3].start_sector;
printf("Read HD sector = %l OK?\n", sector); getc();
for (i=0; i < BLOCK_SIZE; i++)
sbuf[i] = ' ';
hd_read((u32)sector, sbuf, 1);
sbuf[512] = 0;
printf("\n%s\n",sbuf);
}
char *see="I would like to see ";
hdw()
{
char name[64];
u16 blk, i, r;
char *cp;
u32 sector;
cp = sbuf;
for (i=0; i < BLOCK_SIZE; i++)
sbuf[i] = ' ';
printf("Write to HD Partition 3: input name to write : ");
gets(name);
for (i=0; i < 10; i++){
sprintf(cp, see); cp += strlen(see);
sprintf(cp, name); cp+= strlen(name);
sprintf(cp, " %d times again\n\r", i);
cp += strlen(cp); // skip the whole formatted string, digit included
}
printf("******* will write these to HD *********\n");
printf("%s\n", sbuf);
sector = hda[3].start_sector;
printf("sector = %l ready to go?\n", (u32)sector);
getc();
hd_write((u32)sector, sbuf, 1);
}
/*********************** end of HD driver testing code *********************/
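The partition() routine above overlays struct partition on the MBR at offset
0x1BE, which relies on the compiler packing the struct into exactly 16 bytes
and on little-endian byte order. As a hedged alternative, the sketch below
extracts the same two fields byte by byte. The names parse_mbr(), le32() and
struct part_info are hypothetical, and the boot signature check (0x55, 0xAA at
bytes 510 and 511) is an addition that the original test code does not perform.

```c
#include <stdint.h>

#define MBR_TABLE_OFFSET 0x1BE   /* first partition entry in the MBR */
#define MBR_ENTRY_SIZE   16      /* bytes per partition entry        */

/* Hypothetical result type: the two fields that partition() keeps. */
struct part_info {
    uint32_t start_sector;       /* starting sector, counting from 0 */
    uint32_t nr_sectors;         /* partition size in sectors        */
};

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill out[0..3] from a 512-byte MBR image.
   Returns 0 on success, -1 if the boot signature is missing. */
int parse_mbr(const unsigned char *mbr, struct part_info out[4])
{
    int i;
    if (mbr[510] != 0x55 || mbr[511] != 0xAA)
        return -1;
    for (i = 0; i < 4; i++) {
        const unsigned char *e = mbr + MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE;
        out[i].start_sector = le32(e + 8);    /* entry bytes 8-11  */
        out[i].nr_sectors   = le32(e + 12);   /* entry bytes 12-15 */
    }
    return 0;
}
```

Note that this sketch indexes the partitions from 0 to 3, whereas the test
code above stores them in hda[1] to hda[4].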
3. Second HD Driver
The next HD driver is based on 1.2.(7b). In this case, a process starts R/W
of the first sector and then blocks until all sectors are read/written. On each
interrupt, the interrupt handler performs R/W of the remaining sectors. When
all sectors are done, it unblocks the process on the final status interrupt.
This HD driver is intended for synchronous I/O in which the process "waits" for
I/O completion. The hd_mutex semaphore serves two purposes. First, it ensures
that only one process is executing hd_rw(), so that the I/O parameters are used
by only the process and the interrupt handler. Second, when several processes
try to do HD I/O, it automatically queues the remaining requesting processes in
the semaphore queue.
/***************** HD driver for synchronous disk I/O *******************/
/***** HD I/O parameters common to hd_rw() and interrupt handler ******/
u16 opcode; // HD_READ | HD_WRITE
char *bufPtr; // pointer to data buffer
u16 hderror; // error flag
int ICOUNT; // sector count
/******************** End of HD I/O parameters ************************/
int hdhandler()
{
printf("HD interrupt ICOUNT=%d\n", ICOUNT);
// ONE interrupt per sector read|write; transfer data via DATA port
if (opcode==HD_READ){
if ((hderror = hd_error())) // record error for hd_rw() to return
goto out;
read_port(0x1F0, getds(), bufPtr, 512);
bufPtr += 512;
}
else{ // HD_WRITE
if (ICOUNT > 1){
write_port(0x1F0, getds(), bufPtr, 512);
bufPtr += 512;
}
if ((hderror = hd_error())) // record error for hd_rw() to return
goto out;
}
--ICOUNT;
if (ICOUNT == 0){
printf("HD inth: V(hd_sem)\n");
V(&hd_sem);
}
out:
if (hderror && ICOUNT)
V(&hd_sem); // must unblock process even if ICOUNT > 0
out_byte(0xA0, 0x20); // send EOI to both 8259 controllers
out_byte(0x20, 0x20);
}
int hd_rw(rw, sector, buf, nsectors)
u16 rw; u32 sector; char *buf; u16 nsectors;
{
unsigned char *cp, low, mid, hi, top;
int i, r;
P(&hd_mutex); // one proc at a time executes hd_rw()
printf("hd_rw: setup I/O parameters for interrupt handler\n");
hderror = 0; // initialize hderror to 0
opcode = rw; // set opcode
bufPtr = buf; // pointer to data buffer
ICOUNT = nsectors; // nsectors to R/W
// prepare commands for command registers (relies on little-endian byte order)
cp = (unsigned char *)&sector;
low = *(cp); mid = *(cp+1); hi = *(cp+2);
top = (*(cp+3) & 0x0F) | 0xE0;
// wait for drive notBUSY && READY
while(hd_busy() || !hd_ready());
printf("hd_rw: write to command registers\n");
out_byte(0x3F6, 0x08); // control = 0x08
out_byte(0x1F2, nsectors); // sector count
out_byte(0x1F3, low); // LBA low byte
out_byte(0x1F4, mid); // LBA mid byte
out_byte(0x1F5, hi); // LBA high byte
out_byte(0x1F6, top); // use LBA for drive 0 (0xE0)
out_byte(0x1F7, rw); // READ or WRITE command
if (rw==HD_WRITE){ // must wait until notBUSY and DRQ=1
while (hd_busy() || !hd_drq());
write_port(0x1F0, getds(), buf, 512);
buf += 512;
bufPtr += 512;
}
P(&hd_sem); // block until r/w of ALL nsectors are done
V(&hd_mutex); // unlock hd_mutex lock
return hderror; // either 0 or nonzero
}
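The counting logic shared by hd_rw() and hdhandler() above can be checked
without hardware. The following is a rough simulation, not the driver itself:
the drive is replaced by an in-memory array, each call to sim_handler() stands
in for one interrupt, and sim_wakeups counts how often V(&hd_sem) would be
issued. All sim_* names and the OP_READ/OP_WRITE constants are hypothetical,
and error handling is left out.

```c
#include <string.h>

#define SIM_SECTOR  512
#define SIM_NSECT   8            /* size of the simulated disk in sectors */
#define OP_READ     0x20
#define OP_WRITE    0x30

/* Stand-ins for opcode, bufPtr and ICOUNT in the driver above. */
static int   sim_opcode;
static char *sim_bufptr;
static int   sim_icount;
static int   sim_wakeups;        /* number of simulated V(&hd_sem) calls */

static char sim_disk[SIM_NSECT * SIM_SECTOR];
static int  sim_pos;             /* next sector the simulated drive uses */

/* Simulated data port: move one sector from or to the fake disk. */
static void sim_port_read(char *dst)
{
    memcpy(dst, sim_disk + sim_pos * SIM_SECTOR, SIM_SECTOR);
    sim_pos++;
}

static void sim_port_write(const char *src)
{
    memcpy(sim_disk + sim_pos * SIM_SECTOR, src, SIM_SECTOR);
    sim_pos++;
}

/* Mirrors hdhandler(): called once per simulated interrupt. */
void sim_handler(void)
{
    if (sim_opcode == OP_READ) {
        sim_port_read(sim_bufptr);
        sim_bufptr += SIM_SECTOR;
    } else if (sim_icount > 1) {          /* more sectors left to write */
        sim_port_write(sim_bufptr);
        sim_bufptr += SIM_SECTOR;
    }
    if (--sim_icount == 0)
        sim_wakeups++;                    /* final interrupt: V(&hd_sem) */
}

/* Mirrors hd_rw(): set up the shared parameters, then deliver one
   interrupt per sector. Returns the number of wakeups, or -1 if the
   request does not fit on the simulated disk. */
int sim_rw(int op, int sector, char *buf, int nsectors)
{
    int i;
    if (sector < 0 || nsectors < 1 || sector + nsectors > SIM_NSECT)
        return -1;
    sim_opcode = op; sim_bufptr = buf; sim_icount = nsectors;
    sim_pos = sector; sim_wakeups = 0;
    if (op == OP_WRITE) {                 /* process writes first sector */
        sim_port_write(sim_bufptr);
        sim_bufptr += SIM_SECTOR;
    }
    for (i = 0; i < nsectors; i++)
        sim_handler();
    return sim_wakeups;
}
```

In this simulation a multi-sector read or write returns a wakeup count of 1,
which is the property that distinguishes scheme (7b) from scheme (7a).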
4. HD Driver in MTX Kernel
In an operating system, read operations are usually synchronous, meaning
that a process must wait for the read operation to complete. However, write
operations are usually asynchronous. For a write operation, the process may
simply issue a write request and continue without waiting for the operation
to complete (or even to start!). For example, in the MTX kernel, all HD write
operations are delayed writes. After issuing a write request, the process
continues without waiting for the operation to complete. Actual writes to the
HD may take place much later by the interrupt handler. Since the interrupt
handler also issues R/W operations, hd_rw() cannot contain any "sleep or P
operations" that would cause the caller to "block". The algorithm of the MTX HD
driver is shown below.
-------------------- 1. MTX HD Driver Lower Level ----------------------------
InterruptHandler()
{
current_request = first request in I/O_queue;
1. if (current_request has more data to transfer, i.e. ICOUNT > 0){
transfer_data; ICOUNT--; return;
}
2. // data transfer for current_request completed
request = dequeue(I/O_queue); // remove 1st request
3. if (request.opcode==READ)
V(request.io_done); // "wakeup" waiting process
if (!empty(I/O_queue)){ // if I/O_queue non-empty
start_io(first request in I/O_queue); // start I/O for next request
}
4. return;
}
------------------- 2. Shared Data Structure ---------------------------------
HD_request_queue = a queue of I/O requests;
NOTE: HD_request_queue is a Critical Region among processes (for MP systems).
It is also a CR between process and the interrupt handler.
struct request{
struct request *next; // next request
int opcode; // READ|WRITE
u32 sector; // start sector (or block#)
u16 nsectors; // number of sectors to R/W
char buf[BLKSIZE]; // data area
struct semaphore io_done; // initial value = 0
};
-------------------- 3. MTX HD Driver Upper Level (Process) ------------------
hd_rw(I/O_request)
{
1. enter request into (FIFO) HD_request_queue;
2. if (first in HD_request_queue)
start_io(request); // issue actual I/O to HD
3. return;
}
hd_read()
{
1. construct an "I/O request"; (io_done.value = 0; io_done.queue = 0);
2. hd_rw(&request);
3. P(&request.io_done); // "wait" for READ completion
4. read data from request.buf;
}
hd_write()
{
1. construct an "I/O request";
2. write data to request.buf;
3. hd_rw(&request); // no "wait" for WRITE completion
}
------------------------------------------------------------------------------
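The queueing behaviour of the algorithm above can be sketched as a small
single-threaded simulation. This is an illustration under several assumptions,
not the MTX code: hd_rw_sim(), interrupt_sim(), start_io_calls and the done
and woken fields are hypothetical names, the woken flag stands in for
V(&io_done), and each simulated interrupt is assumed to transfer one sector
and to complete the request when the last sector has been transferred. A real
kernel must also mask interrupts while a process manipulates the queue.

```c
#include <stddef.h>

#define OP_READ  0x20
#define OP_WRITE 0x30

struct request {
    struct request *next;    /* next request in the FIFO queue         */
    int opcode;              /* OP_READ or OP_WRITE                    */
    int nsectors;            /* number of sectors to transfer          */
    int done;                /* set when the transfer has completed    */
    int woken;               /* stands in for V(&io_done), READ only   */
};

static struct request *queue_head, *queue_tail;
static int icount;           /* sectors left in the current request    */
static int start_io_calls;   /* how many times I/O was issued          */

/* Simulated start_io(): only records that a command was issued. */
static void start_io(struct request *r)
{
    icount = r->nsectors;
    start_io_calls++;
}

/* Upper level: enqueue the request; start I/O only if it is first. */
void hd_rw_sim(struct request *r)
{
    r->next = NULL;
    r->done = r->woken = 0;
    if (queue_head == NULL) {
        queue_head = queue_tail = r;
        start_io(r);
    } else {
        queue_tail->next = r;
        queue_tail = r;
    }
}

/* Lower level: one call per simulated interrupt. */
void interrupt_sim(void)
{
    struct request *r = queue_head;
    if (r == NULL)
        return;                      /* nothing pending: ignore        */
    if (--icount > 0)
        return;                      /* current request not finished   */
    queue_head = r->next;            /* dequeue the finished request   */
    if (queue_head == NULL)
        queue_tail = NULL;
    r->done = 1;
    if (r->opcode == OP_READ)
        r->woken = 1;                /* only readers wait for the I/O  */
    if (queue_head != NULL)
        start_io(queue_head);        /* start I/O for the next request */
}
```

In this sketch, a process calls hd_rw_sim() and returns immediately. Only the
first request on an empty queue starts I/O from process context; every later
request is started by the interrupt handler when its predecessor completes.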