1. Hard Disk Driver

   There are two kinds of hard disk interfaces: Parallel ATA (PATA) and Serial
ATA (SATA). PATA is also known as IDE. SATA uses the serial ATA interface, which
is faster. SATA can be configured to be backward compatible with PATA, but not
vice versa. Older PATA uses PIO (Programmed I/O). Newer PATA, e.g. ATA-3, may
use either PIO or DMA (Direct Memory Access). In general, DMA is faster and
better suited to transferring large amounts of data. However, to use DMA, the PC
must be in protected mode in order to access the PCI bus. When the amount of
data transferred is small, e.g. in usual file system operations, it is actually
more advantageous to use PIO. For simplicity, this section presents an IDE hard
disk driver that uses PIO.

    A PC's IDE bus usually has two IDE channels, denoted as IDE0 and IDE1. Each
IDE channel can support two devices, known as master or slave. A hard drive 
typically is the master device of IDE0. Each IDE channel has a set of fixed I/O
port addresses. The port addresses of IDE0 and the meaning of their contents are
listed below.

1.1. Primary IDE I/O Addresses:
------------------------------------------------------------------------------
Control Register:        
  0x3F6 = 0x08 (0000 1RE0): R=reset, E=0: enable interrupt
                         
Command Block Registers: 
  0x1F0 = Data Port 
  0x1F1 = Error
  0x1F2 = Sector Count
  0x1F3 = LBA low byte
  0x1F4 = LBA mid byte
  0x1F5 = LBA hi  byte
  0x1F6 = 1B1D TOP4LBA: B=LBA, D=drive
  0x1F7 = Command/status

Status Register(0x1F7):
    7      6    5      4    3    2    1     0
   BUSY  READY FAULT SEEK  DRQ  CORR INDEX ERROR

Error Register (0x1F1): if status.ERROR=1
    7      6    5      4    3    2    1    0
   BBK    UNC   MC   IDNF  MCR  ABRT T0NF AMNF
-------------------------------------------------------
BBK = Bad Block 
UNC = Uncorrectable data error,
MC  = Media Changed
IDNF= ID mark Not Found
MCR = Media Change Requested
ABRT= Command aborted
T0NF= Track 0 Not Found
AMNF= Address Mark Not Found
-------------------------------------------------------
Interrupt IRQ = 14 : Interrupt Vector = 0x76
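
To illustrate how the status bits listed above are tested, here is a minimal
hosted-C sketch. It decodes a status byte passed in as a plain value instead of
reading port 0x1F7, so it can be tried outside a kernel. The names ST_BUSY,
drive_idle(), data_ready(), has_error() and status_bit_name() are illustrative
assumptions, not part of the driver presented in this section.

```c
#include <assert.h>

/* Status register (0x1F7) bit masks, taken from the table above. */
enum {
    ST_ERROR = 0x01,   /* bit 0: error, details are in the error register */
    ST_DRQ   = 0x08,   /* bit 3: drive requests a data transfer */
    ST_READY = 0x40,   /* bit 6: drive ready */
    ST_BUSY  = 0x80    /* bit 7: drive busy */
};

/* Returns 1 when the drive can accept a new command: notBUSY and READY. */
int drive_idle(unsigned char status)
{
    return !(status & ST_BUSY) && (status & ST_READY);
}

/* Returns 1 when the data port may be accessed: notBUSY and DRQ. */
int data_ready(unsigned char status)
{
    return !(status & ST_BUSY) && (status & ST_DRQ);
}

/* Returns nonzero when the ERROR bit is on. */
int has_error(unsigned char status)
{
    return status & ST_ERROR;
}

/* Name of a status bit (0..7), in the order of the table above. */
const char *status_bit_name(int bit)
{
    static const char *names[8] = {
        "ERROR", "INDEX", "CORR", "DRQ", "SEEK", "FAULT", "READY", "BUSY"
    };
    assert(bit >= 0 && bit < 8);
    return names[bit];
}
```

For example, a status byte of 0x40 (READY only) means the drive is idle, while
0x48 (READY and DRQ) means a sector can be moved through the data port.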

1.2.IDE Operation Sequence:
    The sequence of IDE operations is as follows.

(1). Initialize HD:
     Write 0x08 to Control Register (0x3F6): (E bit=0 = enable interrupt).
  
(2). Read Status Register (0x1F7) until drive is notBusy and READY.

(3). Write sector count, LBA sector number, drive (master=0x00, slave=0x10) to 
     command registers (0x1F2-0x1F6).

(4). Write READ|WRITE command to Command Register (0x1F7).

(5). For a write operation, wait until drive is READY and DRQ (drive request 
     for data). Then, write data to data port.

(6). Each (512-byte) sector READ|WRITE generates an interrupt.
     I/O can be done in two ways:

(7a). Process "waits" for each interrupt. When sector R/W completes, interrupt 
      handler "unblocks" process, which continues to read/write the next sector 
      of data from/to data port.
 
(7b). Process starts R/W. For write (multi-sectors) operation, process writes 
      the first sector of data, then "waits" for the FINAL status interrupt.
      Interrupt handler transfers remaining sectors of data on each interrupt. 
      When R/W of all sectors are done, it "unblocks" the process by the FINAL 
      status interrupt. This scheme is better because it does not unblock 
      (wakeup) processes unnecessarily.

(8). Error Handling:
     After each R/W operation (interrupt), read status register. If status.ERROR
     bit is on, detailed error information is in the error register (0x1F1).
     Recovery from error may need HD reset.
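
Step (3) splits a 28-bit LBA sector number over four registers. The following
hosted-C sketch shows one way to compute the register values. The type
struct lba_regs and the function lba28_split() are hypothetical names introduced
only for this illustration; the bit layout (1B1D plus the top 4 LBA bits in
0x1F6, slave = 0x10) follows the register table in 1.1.

```c
#include <assert.h>

typedef unsigned char u8;
typedef unsigned long u32;   /* at least 32 bits wide */

/* Values for registers 0x1F3..0x1F6 of one LBA28 request. */
struct lba_regs {
    u8 low;     /* 0x1F3: LBA bits  0..7  */
    u8 mid;     /* 0x1F4: LBA bits  8..15 */
    u8 high;    /* 0x1F5: LBA bits 16..23 */
    u8 drive;   /* 0x1F6: 1B1D pattern plus LBA bits 24..27 */
};

/* Split a sector number into register values; slave != 0 selects drive 1. */
struct lba_regs lba28_split(u32 sector, int slave)
{
    struct lba_regs r;
    assert(sector <= 0x0FFFFFFFUL);   /* LBA28 holds only 28 bits */
    r.low   = (u8)(sector & 0xFF);
    r.mid   = (u8)((sector >> 8) & 0xFF);
    r.high  = (u8)((sector >> 16) & 0xFF);
    r.drive = (u8)(0xE0 | (slave ? 0x10 : 0x00) | ((sector >> 24) & 0x0F));
    return r;
}
```

For sector 0x01234567 on the master drive, this yields low=0x67, mid=0x45,
high=0x23 and drive=0xE1; for the slave drive the last value becomes 0xF1.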


2. First HD Driver

  Shown below is a HD driver based on 1.2.(7a). For each hd_rw() call, the 
process starts the r/w of one sector and blocks on a semaphore, hd_sem, for 
interrupts. After reading/writing each sector, the interrupt handler simply 
unblocks the process, which continues to r/w the next sector. The hd_mutex 
semaphore ensures that only one process executes hd_rw() at a time. 

/****************************************************************************
                  KCW: IDE hard disk driver using PIO
****************************************************************************/
#define INT_CTL         0x20	// I/O port for 1st 8259 interrupt controller
#define INT_CTLMASK     0x21	// 0 bits allow IRQ interrupts 
#define INT2_CTL        0xA0	// I/O port for 2nd 8259 interrupt controller
#define INT2_MASK       0xA1	// 0 bits allows IRQ interrupts

#define HD_DATA        0x1f0    // data port for R/W
#define HD_ERROR       0x1f1    // error register
#define HD_SEC_COUNT   0x1f2    // R/W sector count
#define HD_LBA_LOW     0x1f3    // LBA low  byte
#define HD_LBA_MID     0x1f4    // LBA mid  byte
#define HD_LBA_HI      0x1f5    // LBA high byte
#define HD_LBA_DRIVE   0x1f6    // 1B1D0000 => B=LBA, D=drive => 0xE0 or 0xF0
#define HD_CMD         0x1f7    // command : R=0x20 W=0x30
#define HD_STATUS      0x1f7    // status register

#define HD_CONTROL     0x3f6    // 0x08 (0000 1RE0): Reset, E=1: NO interrupt

/* HD disk controller command bytes. */
#define HD_READ        0x20	// read 
#define HD_WRITE       0x30	// write

/* Parameters for the disk drive. */
#define BLOCK_SIZE      4096    // Linux HD block size
#define SECTOR_SIZE      512	// sector size in bytes
#define BAD               -1    // return BAD on error

struct semaphore hd_mutex;     // semaphore for procs hd_rw() ONE at a time
struct semaphore hd_sem;       // semaphore for proc to wait for IDE interrupts

int delay()
{
   int i; for (i=0; i < 10000; i++);
}

int hd_reset()
{
  /****************** HD software reset sequence *******************
   ControlRegister (0x3F6)=(0000 1RE0); R=reset, E=0:enable interrupt
   Strobe R bit from HI to LO; with delay time in between:
          Write 0000 1100 to ControlReg; delay(); 
          Write 0000 1000 to ControlReg; wait for notBUSY & no error
   *****************************************************************/
  out_byte(0x3F6, 0x0C);     delay();
  out_byte(0x3F6, 0x08);     delay();
  if (hd_busy() || hd_error()) {
      printf("HD reset error\n"); return(BAD);
  }
  return 0;     // return 0 means OK 
}

int hd_busy()   // test for BUSY
{
  return in_byte(HD_STATUS) & 0x80;
}

int hd_ready()  // test for READY
{
  return in_byte(HD_STATUS) & 0x40;
}

int hd_drq()   // test for DRQ
{
  return in_byte(HD_STATUS) & 0x08;
}

int hd_error()  // test for error
{
   int r;
   if (in_byte(0x1F7) & 0x01){  // status.ERROR bit on
      r = in_byte(0x1F1);       // read error register
      printf("HD error=%x\n", r);
      return r;
   }
   return 0;                    // return 0 for OK 
}

int hd_init()
{
  printf("hd_init\n");
  hd_mutex.value = 1; 
  hd_mutex.queue = 0;
  hd_sem.value = hd_sem.queue = 0;
}

/* HD interrupt handler : simply "wakeup" the blocked process */ 
int hdhandler()           
{ 
  printf("hd interrupt! ");

  V(&hd_sem);           // wakeup blocked process

  out_byte(0xA0, 0x20);   // send EOI to both 8259 controllers
  out_byte(0x20, 0x20);
}


int hd_rw(rw, sector, buf, nsectors) // read/write nsectors
      u16 rw; u32 sector; char *buf; u16 nsectors;
{
    int i;
    P(&hd_mutex);         // procs execute hd_rw() ONE at a time
    hd_sem.value=hd_sem.queue = 0;

    while(hd_busy() || !hd_ready());  // wait until notBUSY & READY

    printf("\nHD NOT_BUSY and READY: write to IDE registers\n");
    
    out_byte(0x3F6, 0x08);                       // control = 0x08; interrupt
    out_byte(0x1F2, nsectors);                   // sector count
    out_byte(0x1F3, sector);                     // LBA low  byte
    out_byte(0x1F4, sector>>8);                  // LBA mid  byte
    out_byte(0x1F5, sector>>16);                 // LBA high byte
    out_byte(0x1F6, ((sector>>24)&0x0F) | 0xE0); // use LBA for drive 0
    out_byte(0x1F7, rw);                         // READ | WRITE command

    // ONE interrupt per sector read|write; transfer data via DATA port
    for (i=0; i < nsectors; i++){  // loop for each sector
      if (rw==HD_READ){
         P(&hd_sem);             // wait for interrupt 

         if (hd_error())
	   break;

         read_port(0x1F0, getds(), buf, 512);   // getds() return DS
         buf += 512;      
      }
      else{  // for HD_WRITE, must wait until notBUSY and DRQ=1
         while (hd_busy() || !hd_drq());

         write_port(0x1F0, getds(), buf, 512);   // getds() returns DS
         buf += 512;      

         P(&hd_sem);            // wait for interrupt

         if (hd_error())
	   break;
      }
    }                           // end loop

    V(&hd_mutex);               // release hd_mutex lock
    if (hd_error())
       return BAD;
    return 0;
}


/********************** HD driver testing code *************************/
struct hd {		
  u32    start_sector;
  u32    size;               // size in number of sectors
} hda[16];                     // hda[] for drive 0; hdb[] for drive 1

struct partition {
	u8 drive;		/* 0x80 - active */
	u8 head;		/* starting head */
	u8 sector;		/* starting sector */
	u8 cylinder;		/* starting cylinder */
	u8 sys_type;		/* partition type */
	u8 end_head;		/* end head */
	u8 end_sector;	        /* end sector */
	u8 end_cylinder;	/* end cylinder */

	u32 start_sector;	/* starting sector counting from 0 */
	u32 nr_sectors;         /* nr of sectors in partition */
};

char pbuf[BLOCK_SIZE];

/******** read MBR and display partitions ********/
int partition() 
{
  int i;
  struct hd *hp;
  struct partition *p;

  printf("reading MBR ");

  // read partition tables and initialize start_sector, size fields
  hd_read((u32)0, pbuf, 1); //read 1 sector into pbuf[]  

  p = (struct partition *)(&pbuf[0x1be]);

  for (i=1; i <= 4; i++){
     hp = &hda[i];
     hp->start_sector = p->start_sector;
     hp->size = p->nr_sectors;
     p++;
  }

  printf("\nshow partition table:\n");
  for (i=1; i <= 4; i++){
     hp = &hda[i];
     printf("%l   %l\n", (u32)hp->start_sector, (u32)hp->size);
  }
}

int hd()
{
  printf("testing HD driver\n");
  hd_init();
  partition();
  hdw();
  hdr();
}

hd_read(sector, rbuf, nsectors) 
  u32 sector; char *rbuf; u16 nsectors;
{
  return hd_rw(HD_READ, (u32)sector, rbuf, nsectors);
}

hd_write(sector, rbuf, nsectors) 
   u32 sector; char *rbuf; u16 nsectors;
{
  return hd_rw(HD_WRITE, (u32)sector, rbuf, nsectors);
}

// HD driver testing routines
char sbuf[BLOCK_SIZE];

int hdr()
{
  int i;
  u32 sector;
  sector = hda[3].start_sector; 

  printf("Read HD sector = %l OK?\n", sector); getc();

  for (i=0; i < BLOCK_SIZE; i++)
    sbuf[i] = ' ';
  hd_read((u32)sector, sbuf, 1);
  sbuf[512] = 0;
  printf("\n%s\n",sbuf);
}

char *see="I would like to see ";

hdw()
{
  char name[64];
  u16 blk, i, r;
  char *cp;
  u32 sector;

  cp = sbuf;
  for (i=0; i < BLOCK_SIZE; i++)
    sbuf[i] = ' ';
  printf("Write to HD Partition 3: input name to write : ");
  gets(name);
  
  for (i=0; i < 10; i++){
    sprintf(cp, see); cp += strlen(see);
    sprintf(cp, name); cp+= strlen(name); 
    sprintf(cp, " %d times again\n\r", i);
    cp += strlen("   times again\n\r");
  }
  printf("******* will write these to HD *********\n");
  printf("%s\n", sbuf);
  sector = hda[3].start_sector; 

  printf("sector = %l ready to go?\n", (u32)sector); 
  getc(); 

  hd_write((u32)sector, sbuf, 1);
}
/*********************** end of HD driver testing code *********************/
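
The testing code above locates partitions by overlaying struct partition on the
MBR at offset 0x1BE. As a portable alternative for experimenting outside the
kernel, the hosted-C sketch below reads the same fields byte by byte. The names
struct part_info, le32() and mbr_parse() are hypothetical. The check of the
0x55 0xAA boot signature at offsets 510 and 511 is an addition that the driver
code above does not perform. Entries are indexed 0..3 here, whereas hda[] above
uses 1..4.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char u8;
typedef unsigned long u32;      /* at least 32 bits wide */

#define MBR_TABLE_OFFSET 0x1BE  /* partition table offset within the MBR */
#define MBR_ENTRY_SIZE   16     /* bytes per partition entry */
#define MBR_ENTRIES      4      /* primary partition entries */

struct part_info {
    u8  sys_type;               /* partition type */
    u32 start_sector;           /* starting sector counting from 0 */
    u32 nr_sectors;             /* number of sectors in partition */
};

/* Read a 32-bit little-endian value from 4 bytes. */
static u32 le32(const u8 *p)
{
    return (u32)p[0] | ((u32)p[1] << 8) | ((u32)p[2] << 16) | ((u32)p[3] << 24);
}

/* Fill out[0..3] from a 512-byte MBR image.
   Returns 0 if OK, -1 if the boot signature is missing. */
int mbr_parse(const u8 *mbr, struct part_info out[MBR_ENTRIES])
{
    int i;
    assert(mbr != NULL && out != NULL);
    if (mbr[510] != 0x55 || mbr[511] != 0xAA)
        return -1;
    for (i = 0; i < MBR_ENTRIES; i++) {
        const u8 *e = mbr + MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE;
        out[i].sys_type     = e[4];         /* byte 4: partition type */
        out[i].start_sector = le32(e + 8);  /* bytes 8..11 */
        out[i].nr_sectors   = le32(e + 12); /* bytes 12..15 */
    }
    return 0;
}
```

Reading the fields byte by byte avoids any dependence on structure padding or
on the byte order of the machine running the code.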


3. Second HD Driver

    The next HD driver is based on 1.2.(7b). In this case, a process starts R/W
of the first sector and then blocks until all sectors are read/written. On each
interrupt, the interrupt handler performs R/W of the remaining sectors. When
all sectors are done, it unblocks the process on the final status interrupt. 
This HD driver is intended for synchronous I/O in which the process "waits" for
I/O completion. The hd_mutex semaphore serves two purposes. First, it ensures 
that only one process is executing hd_rw(), so that the I/O parameters are used
by only the process and the interrupt handler. Second, when several processes 
try to do HD I/O, it automatically queues the remaining requesting processes in 
the semaphore queue.

/***************** HD driver for synchronous disk I/O *******************/

/***** HD I/O parameters common to hd_rw() and interrupt handler ******/
u16  opcode;      // HD_READ | HD_WRITE
char *bufPtr;     // pointer to data buffer
u16  hderror;     // error flag
int  ICOUNT;      // sector count
/******************** End of HD I/O parameters ************************/

int hdhandler()
{ 
  printf("HD interrupt ICOUNT=%d\n", ICOUNT);

  // ONE interrupt per sector read|write; transfer data via DATA port

  if (opcode==HD_READ){
     if ((hderror = hd_error()))   // record error for hd_rw() and out: below
        goto out;
     read_port(0x1F0, getds(), bufPtr, 512);
     bufPtr += 512;      
  }
  else{  // HD_WRITE 
     if (ICOUNT > 1){
         write_port(0x1F0, getds(), bufPtr, 512);
         bufPtr += 512;      
     }
     if ((hderror = hd_error()))   // record error for hd_rw() and out: below
        goto out;
  }

  --ICOUNT;

  if (ICOUNT == 0){
     printf("HD inth: V(hd_sem)\n");
     V(&hd_sem);
  }

out:
  if (hderror && ICOUNT)
      V(&hd_sem);         // must unblock process even if ICOUNT > 0
  
  out_byte(0xA0, 0x20);   // send EOI to both 8259 controllers
  out_byte(0x20, 0x20);
}

int hd_rw(rw, sector, buf, nsectors) 
          u16 rw; u32 sector; char *buf; u16 nsectors;
{
    unsigned char *cp, low, mid, hi, top;
    int i, r;

    P(&hd_mutex);         // one proc at a time executes hd_rw()

    printf("hd_rw: setup I/O parameters for interrupt handler\n"); 
    hderror = 0;          // initialize hderror to 0
    opcode = rw;          // set opcode
    bufPtr = buf;         // pointer to data buffer
    ICOUNT = nsectors;    // nsectors to R/W
    
    // prepare commands for command registers
    cp = (unsigned char *)&sector;   // byte view of sector (x86: little-endian)
    low = *(cp);  mid = *(cp+1);  hi = *(cp+2); 
    top = (*(cp+3) & 0x0F) | 0xE0;

    // wait for drive notBUSY && READY
    while(hd_busy() || !hd_ready());
    printf("hd_rw: write to command registers\n");    

    out_byte(0x3F6, 0x08);              // control = 0x08
    
    out_byte(0x1F2, nsectors);          // sector count

    out_byte(0x1F3, low);               // LBA low  byte
    
    out_byte(0x1F4, mid);               // LBA mid  byte
    
    out_byte(0x1F5, hi);                // LBA high byte
    
    out_byte(0x1F6, top);               // use LBA for drive 0 or 1
    
    out_byte(0x1F7, rw);                // READ or WRITE command

    if (rw==HD_WRITE){ // must wait until notBUSY and DRQ=1
       while (hd_busy() || !hd_drq());
       write_port(0x1F0, getds(), buf, 512);
       buf += 512;
       bufPtr += 512;
    }

    P(&hd_sem);       // block until r/w of ALL nsectors are done
    V(&hd_mutex);     // unlock hd_mutex lock
    return hderror;   // either 0 or nonzero
}
 

4. HD Driver in MTX Kernel

    In an operating system, read operations are usually synchronous, meaning
that a process must wait for the read operation to complete. However, write 
operations are usually asynchronous. For a write operation, the process may 
simply issue a write request and continue without waiting for the operation
to complete (or even to start!). For example, in the MTX kernel, all HD write 
operations are delayed writes. After issuing a write request, the process 
continues without waiting for the operation to complete. Actual writes to the 
HD may take place much later by the interrupt handler. Since the interrupt 
handler also issues R/W operations, hd_rw() cannot contain any "sleep or P 
operations" that would cause the caller to "block". The algorithm of the MTX HD
driver is shown below.

-------------------- 1. MTX HD Driver Lower Level ----------------------------
InterruptHandler()
{  
   current_request = first request in I/O_queue;
   1. if (current_request has more data to transfer, i.e. ICOUNT > 0){
          transfer_data; ICOUNT--; return;
      }
     
   2. // data transfer for current_request completed
      request = dequeue(I/O_queue);             // remove 1st request
   
   3. if (request.opcode==READ)
          V(request.io_done);                   // "wakeup" waiting process
      if (!empty(I/O_queue))                    // if I/O_queue non-empty
          start_io(first request in I/O_queue); // start I/O for next request
   4. return;
}

------------------- 2. Shared Data Structure ---------------------------------
                 
HD_request_queue = a queue of I/O requests; 

NOTE: HD_request_queue is a Critical Region among processes (for MP systems).
      It is also a CR between process and the interrupt handler.

                   struct request{
                      struct request *next;     // next request
                      int   opcode;             // READ|WRITE
                      u32   sector;             // start sector (or block#)
                      u16   nsectors;           // number of sectors to R/W
                      char  buf[BLKSIZE];       // data area
                      struct semaphore io_done; // initial value = 0
                   };
    
-------------------- 3. MTX HD Driver Upper Level (Process) ------------------
hd_rw(I/O_request)
{
   1. enter request into (FIFO) HD_request_queue;
   2. if (first in HD_request_queue)
         start_io(request);                    // issue actual I/O to HD
   3. return;
}

hd_read()
{
   1. construct an "I/O request"; (io_done.value = 0; io_done.queue = 0);
   2. hd_rw(&request);
   3. P(&request.io_done);                     // "wait" for READ completion
   4. read data from request.buf;
}

hd_write()
{
   1. construct an "I/O request"; 
   2. write data to request.buf;
   3. hd_rw(&request);                        // no "wait" for WRITE completion 
}
------------------------------------------------------------------------------
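
The queueing logic of the algorithm above can be tried in isolation. The
following is a minimal hosted-C sketch under stated assumptions: struct
req_queue, enqueue_request() and request_done() are hypothetical names, and
start_io() only counts calls instead of programming the IDE registers. The
request structure is reduced to the fields the queue needs, and the protection
of the queue as a critical region (e.g. by masking interrupts) is omitted.

```c
#include <assert.h>
#include <stddef.h>

struct request {
    struct request *next;   /* next request in the queue */
    int opcode;             /* READ or WRITE */
    unsigned long sector;   /* start sector */
};

struct req_queue {
    struct request *head;   /* request currently being serviced */
    struct request *tail;   /* last request entered */
};

static int started;         /* number of start_io() calls so far */

/* Stand-in for issuing the actual I/O to the HD. */
static void start_io(struct request *r)
{
    assert(r != NULL);
    started++;
}

/* Upper level: enter request at the end of the FIFO queue and start
   I/O only if it is the first request in the queue. */
void enqueue_request(struct req_queue *q, struct request *r)
{
    r->next = NULL;
    if (q->head == NULL) {
        q->head = q->tail = r;
        start_io(r);
    } else {
        q->tail->next = r;
        q->tail = r;
    }
}

/* Lower level: called when the current request has completed. Removes
   it, starts the next request if any, and returns the completed one. */
struct request *request_done(struct req_queue *q)
{
    struct request *r = q->head;
    assert(r != NULL);
    q->head = r->next;
    if (q->head == NULL)
        q->tail = NULL;
    else
        start_io(q->head);
    return r;
}
```

With two requests queued back to back, start_io() runs once when the first
request is entered and once more when that request completes, so the second
request never touches the drive while the first is still in progress.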