   1   EMULATING THE APPLE HIGH SPEED SCSI CARD: AN EXERCISE IN DIGITAL ARCHAEOLOGY
   2
   3                                 by James Hammons
   4
   5     ~~==< Brought to you in Glorious 80-Column Monospace-o-Vision(TM) >==~~
   6
   7
   8 Motivations
   9 -----------
  10
  11 Reading 4am's Twitter feed one day, I saw him talk about his "Pitch Dark" hard
  12 drive image, which looked incredibly cool and like something that I would very
  13 much be interested in.  But in reading about it, I came across a seemingly
  14 throwaway line about how all decent emulators can run them, which, sadly,
  15 Apple2 could not at the time.  And so, in order to save Apple2 from indecency
  16 (and because I wanted to see if I could get 4am's "Pitch Dark" to work because
  17 it looked cool and interesting), I set about for finding some documentation on
  18 how hard drives interfaced to Apple IIs--and ran into a complete dearth of
  19 information.  There were little things sprinkled around here and there, but
  20 nothing of any deep, satisfying, technical significance.
  21
  22
  23 In Order To Run A Hard Drive Image, You Must First Create The Universe
  24 ----------------------------------------------------------------------
  25
  26 While it's a nice bit of hyperbole, it's not exactly true that you have to
  27 first create the Universe, as fortunately, that part has largely been taken
  28 care of.  However, you still have to figure out how to emulate it if you are
  29 keen on running a hard drive image on your emulator of choice.  And in so
  30 doing, you have to figure out what the requirements are; what the minimal pieces
  31 are that are required to have a functioning hard drive system; you also have to
  32 figure out how that system talks to the emulated computer.  And that all
  33 requires information.  I wasn't asking for much, but something along the lines
  34 of Jim Sather's "Understanding The Apple IIe" for hard drives would have been a
  35 nice thing to have.
  36
  37
  38 The Next Part, In Which Nice Things To Have Are Not Forthcoming
  39 ---------------------------------------------------------------
  40
  41 Unfortunately, neither Jim Sather nor anyone else, as far as I can tell, wrote
  42 such a document, and so I did what any lazy programmer would do: I took a look
  43 at some other project's source--in this case, AppleWin's source.  I didn't
  44 really *want* to look at it, having looked at it before and recoiled in horror
  45 at the sight, but, my search-fu apparently being not up to the task of finding
  46 relevant information drove me to it.  And looking at it didn't really provide
  47 any illumination; to me it looked like some kind of hacky thing and I wasn't
  48 interested in that kind of approach at all--so I abandoned the idea.  As I dug
  49 a little deeper into the minute literature that existed as such on the subject,
  50 I learned that pretty much any time you wanted to hook up a hard drive to your
  51 Apple II, you had to use an interface card, and typically that meant some kind
  52 of SCSI card.  And looking here, there was no shortage of SCSI cards that you
  53 could use to hook up your hard drive therewith.
  54
  55 So, that being a promising looking path to pursue on the road to this
  56 particular perdition, the question then became, which one should I choose?  At
  57 first I thought the RAMFast card would fit the bill as it seemed to be very
  58 popular, but there was literally no technical information on the thing.  The
  59 Apple SCSI card looked promising, but then I saw that it "ghosted" a slot,
  60 meaning that it would have to occupy two consecutive slots in order to work and
  61 I didn't much care for that.  And so, after looking at, and rejecting, card
  62 after card for pretty much the same reason, I settled on the Apple High Speed
  63 SCSI card for a few reasons--one, it was purportedly fast; two, it worked on
  64 the Apple IIe (as well as the IIgs, but I didn't really care that much about
  65 that to be honest); three, it had a user's manual that wasn't completely devoid
  66 of technical information; four, it had a schematic; and five, it had a firmware
  67 image.  This looked like a promising start--how hard could it be to make this
  68 work?
  69
  70
  71 Things Aren't Exactly Hard, But They Aren't Exactly Soft Either
  72 ---------------------------------------------------------------
  73
  74 One of the necessary things that I didn't have out of all of that was good
  75 information on how the thing worked.  I knew that it was a SCSI card, and I
  76 knew that it talked to the SCSI bus using an NCR 53C80 chip, but I had no idea
  77 exactly how.  But I did have something that *did* know how to talk to it: the
  78 firmware for the card.
  79
  80 Now when you take a look at the firmware, the first thing you notice is that
  81 it's 32K in size--which is *much* larger than the typical 256 bytes that you
  82 encounter when looking at Apple II card drivers.  It also happens to be quite a
  83 bit larger than the 2K "bonus" space that Apple II cards have available to them
  84 in the $C800 to $CFFF address space.  So what gives?
  85
  86 Fortunately for me, Apple2 has a built-in disassembler (which will probably
  87 stay in for all time, as it turns out to be a very useful thing to have on
  88 hand), and so I split that out into a stand-alone command line driven program,
  89 called d65c02, in order to be able to disassemble such things as device driver
  90 firmware blobs.  It isn't fancy, it doesn't do any analysis on what is code and
  91 what is data, but it gets the job done in turning incomprehensible binary
  92 gibberish (except to certain mad geniuses who will go heretofore unnamed) into
  93 human readable ASCII gibberish.  Thus I used said tool to disassemble the
  94 firmware blob.
  95
  96 Pulling up the results in my text editor, I could see that at least the front
  97 of the listing looked like it could plausibly be code that would go into the
  98 usual 256 byte card slot address space of $Cx00 to $CxFF, where x ranges from 1
  99 to 7 depending on the slot number.  Looking further, I could see this first 256
 100 bytes of code was repeated three times, meaning that this was a good candidate
 101 for the slot device code.  I could also see that it was written as relocatable
 102 code, and it contained this little tidbit:
 103
 104 001B: A9 60     LDA  #$60     ; Stuff an RTS into RAM somewhere
 105 001D: 8D F8 07  STA  $07F8
 106 0020: 20 F8 07  JSR  $07F8    ; Jump there and return in order to get evidence
 107                               ; of where in memory we did it from
 108 0023: BA        TSX           ; Retrieve the stack pointer
 109 0024: BD 00 01  LDA  $0100,X  ; Get the hi byte of the address we just pushed on
 110                               ; the stack in order to come back here
 111 0027: 8D F8 07  STA  $07F8    ; & save it for later perusal
 112
 113 which meant that it was an excellent candidate for the slot device code.  But
 114 why should that be?
 115
 116
 117 A Short Digression Into Why Slot Code Must Be Relocatable
 118 ---------------------------------------------------------
 119
 120 Slot code must be relocatable because such a card may be installed into any
 121 given slot in an Apple II--which means its code will show up anywhere from
 122 $C100 to $C700 (it always shows up on a page boundary).  By virtue of this, it
 123 also means that the I/O address for the card will also show up in the
 124 corresponding $C090 to $C0F0 address range (it always shows up on a 16-byte
 125 boundary).  And so, because of this, you have to write your slot code in such a
 126 way that it will work regardless of which slot it's installed in, which means
 127 the code must be relocatable--which ultimately means you can't use any JMP
 128 instructions to addresses in your driver, and you can't use absolute addressing
 129 to refer to stuff in the slot address space.
 130
 131 So, using the above code, a clever coder can figure out what slot their code is
 132 executing in and they can then use that knowledge to figure out which is the
 133 proper I/O range to use for the card.  All this being necessary in order to
 134 make a seamless experience for the end user of the card.
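
To make that concrete, here is a minimal Python sketch of the trick (Python
because it's easy to poke at, not because anything on the card runs it).  The
function names are my own inventions, and the stack model is simplified: it
assumes the stack pointer doesn't wrap around during the call.

```python
def probe_high_byte(slot, stack_pointer=0xFF):
    """Model the JSR-to-an-RTS probe run from slot ROM at $Cs20."""
    stack = [0] * 256                 # page 1 of RAM ($0100-$01FF)
    sp = stack_pointer
    jsr_addr = 0xC000 + slot * 0x100 + 0x20
    ret = jsr_addr + 2                # JSR pushes the address of its last byte
    stack[sp] = ret >> 8              # the high byte goes on first
    sp -= 1
    stack[sp] = ret & 0xFF            # then the low byte
    sp -= 1
    sp += 2                           # the RTS at $07F8 pulls both back off
    x = sp                            # TSX
    return stack[x]                   # LDA $0100,X: the stale high byte


def slot_addresses(slot):
    """Return (slot ROM base, slot I/O base) for slots 1 through 7."""
    return 0xC000 + slot * 0x100, 0xC080 + slot * 0x10
```

For slot 7, probe_high_byte(7) hands back $C7, and slot_addresses(7) gives
$C700 for the slot ROM and $C0F0 for the slot I/O range.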
 135
 136
 137 The Next Part, In Which 32K Is Still Larger Than 256
 138 ----------------------------------------------------
 139
 140 So, in looking at the code that comes after the Code Which Looks Like It
 141 Belongs In Slot Memory (which makes the wonderful acronym CWLLIBISM), I noticed
 142 that it seemed to be organized in 1K chunks.  And further perusal of said
 143 chunks made it seem very likely that they resided in the $CC00 to $CFFF memory
 144 space.  However, the "extra" memory space given to cards to use starts 1K
 145 earlier--at $C800.  What could this mean?
 146
 147 Well, in looking at the schematic for the card, one not only finds the 32K ROM
 148 chip, but also an 8K static RAM.  Which means that it's very likely that the
 149 address space from $C800 to $CBFF is mapped to that 8K static RAM.  But 8K is
 150 larger than 1K; how does that work?
 151
 152 As it turns out, it's bank switched, but I didn't know it at the time--we'll
 153 get to that eventually.  In the meantime, with further perusal of the code (the
 154 code gets perused quite a bit), it seems very likely that the 1K address range
 155 from $C800 to $CBFF is said RAM as that range is written to by the 1K code
 156 chunks quite frequently.
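
To keep the pieces straight, the memory map as guessed so far can be sketched
in Python like this.  It reflects my reading of the disassembly and schematic,
not any official documentation, and the region names are made up.  The count of
RAM windows is plain arithmetic (8K seen through a 1K hole); how they get
selected is not known at this point.

```python
def classify(addr, slot):
    """Name the card region an address falls in, per the guesses so far."""
    if 0xC000 + slot * 0x100 <= addr <= 0xC0FF + slot * 0x100:
        return "slot ROM (256 bytes)"
    if 0xC080 + slot * 0x10 <= addr <= 0xC08F + slot * 0x10:
        return "slot I/O (16 bytes)"
    if 0xC800 <= addr <= 0xCBFF:
        return "card RAM window (1K)"
    if 0xCC00 <= addr <= 0xCFFF:
        return "card ROM window (1K)"
    return "not on the card"


RAM_WINDOWS = (8 * 1024) // 1024   # 8K of static RAM seen through a 1K window
```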
 157
 158 Finding that the code in the firmware is divvied up into 1K chunks would seem
 159 to imply that it's bank switched into the $CC00 to $CFFF range.  And in looking
 160 at the CWLLIBISM, we see the following:
 161
 162 005C: A9 0B     LDA  #$0B     ; Get 11 in the accumulator
 163 005E: AE 08 C8  LDX  $C808    ; Get offset to proper I/O space in X
 164 0061: 5A        PHY           ; Save Y on the stack for later
 165 0062: A8        TAY           ; Copy the accumulator to Y
 166 0063: 29 1F     AND  #$1F     ; Strip off the upper three bits
 167 0065: 9D 6E C0  STA  $C06E,X  ; & write to card I/O location $E
 168
 169 which implies it heavily.  Taking the number put into the accumulator and then
 170 keeping only its lower 5 bits creates a range that goes from 0 to 31, which is
 171 32 distinct values, which corresponds to 32 1K chunks of code.
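
A quick sanity check of that arithmetic, as a small Python sketch (the constant
names are mine):

```python
BANK_MASK = 0x1F          # the AND #$1F from the listing
BANK_SIZE = 1024          # 1K chunks, as seen in the disassembly

# Every possible accumulator value, run through the mask
banks = sorted({value & BANK_MASK for value in range(256)})
total = len(banks) * BANK_SIZE
```

Here banks comes out as 0 through 31 and total as 32768, which matches the
size of the firmware image.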
 172
 173 The above code, which is part of the initialization of the card, heavily
 174 implies that it's selecting a 1K chunk of code from bank 11 (counting from
 175 zero, naturally) to put into the $CC00 to $CFFF address range.  And so we get
 176 to(*) look there for a start.
 177
 178 (*) While changing 'have to' to 'get to' can make life awesome in many ways,
 179 this is far from a universal truth.  'Getting to' have one's arm amputated is
 180 never, ever awesome.
 181
 182
 183 The Next Part, In Which We Sadly Bid Adieu To CWLLIBISM
 184 -------------------------------------------------------
 185
 186 But before we do that, in order to understand what's going on in those wicked
 187 little 1K chunks of code, we should first take a closer look at CWLLIBISM.  So
 188 let's jump in:
 189
 190 0000: A2 20     LDX  #$20     ; The bytes after the LDX # identify this card as
 191 0002: A2 00     LDX  #$00     ; being capable of SmartPort calls, and the $82 at
 192 0004: A2 03     LDX  #$03     ; $FB further identifies it as a SCSI card ($2)
 193 0006: A2 00     LDX  #$00     ; that supports extended calls ($8).
 194
 195 The way that I was able to find out that this seemingly useless bit of code was
 196 a way of identifying SmartPort capable cards was in the serendipitous find of
 197 the "Technical Manual for the Apple SCSI Card"(*), which, while helpful in some
 198 ways, was almost completely useless in trying to figure out what the card
 199 I/O addresses did.
 200
 201 (*) No relation to the Apple High Speed SCSI Card
 202
 203 0008: 2C 58 FF  BIT  $FF58    ; Check byte in ROM (usually, an RTS lives here)
 204 000B: 70 05     BVS  $0012    ; Bit 6 set?  >> $12 (which means, this branch
 205                               ; will be taken...)
 206
 207 This little tidbit checks a ROM location that usually carries an RTS (at least
 208 it does in the Apple IIe), which is $60.  Which means that the following BVS
 209 will always be taken and skip over the following:
 210
 211 000D: 38        SEC           ; ProDOS entry point
 212 000E: B0 01     BCS  $0011    ; Branch over the following CLC
 213 0010: 18        CLC           ; SmartPort DISPATCH
 214 0011: B8        CLV           ; Signal we're doing normal I/O, not init code
 215
 216 So this clever little bit here, according to the "Technical Manual for the
 217 Apple SCSI Card", sets some flags so that later on in the firmware, it can
 218 discern whether it's being called from ProDOS (in which the carry flag will be
 219 set) or if it's a SmartPort call (in which the carry flag will be clear).
 220 Either way, the overflow flag is cleared to let the firmware know that this is
 221 a request to talk to the drive, and not initialization.  Initialization skips
 222 over this code and ends up here:
 223
 224 0012: D8        CLD           ; Clear the decimal flag, to prevent bad math
 225 0013: 08        PHP           ; Save the carry & overflow flags for later
 226 0014: 78        SEI           ; Turn IRQs off
 227 0015: AD FF CF  LDA  $CFFF    ; Turn INTC8ROM off (puts card in $C800-CFFF)
 228 0018: 8D 00 CC  STA  $CC00    ; ???
 229
 230 This bit of code is a bit of housekeeping; making sure the decimal flag isn't
 231 set so that ADC & SBC both work as expected, saving the flags register so that
 232 the firmware code later can determine whether it's an initialization call or a
 233 regular I/O call, making sure that IRQs don't happen while in the firmware
 234 code, and turning on the "extra" addresses in the $C800 to $CFFF range.
 235
 236 The store to $CC00 is mysterious, as it's a ROM location and stores to ROM
 237 locations are usually void and of null effect.  This likely means that it's
 238 some kind of soft-switch that controls something in the card, but exactly what
 239 would require a few things that I don't have, namely: the contents of the two
 240 PALs on the card (which sit between the address lines of the slot and the rest
 241 of the card), and a description of what the ports on the Sandwich II do (the
 242 chip that sits between the Apple IIe proper and the NCR 53C80).  So, moving
 243 right along:
 244
 245 001B: A9 60     LDA  #$60     ; See where we're executing from
 246 001D: 8D F8 07  STA  $07F8
 247 0020: 20 F8 07  JSR  $07F8
 248 0023: BA        TSX
 249 0024: BD 00 01  LDA  $0100,X  ; Get the address we just pushed on the stack
 250 0027: 8D F8 07  STA  $07F8    ; Save it
 251
 252 We've seen this already, this is the code that determines which slot it's
 253 sitting in.  Say, for example, that it's sitting in slot 7; the byte that it
 254 will retrieve from the stack will be $C7 (for the sake of completeness, the lo
 255 byte will be $22--as to why, this is left as an exercise for the reader).  In
 256 order to turn that into something that it can use to hit the proper slot I/O
 257 addresses, it does the following:
 258
 259 002A: 29 0F     AND  #$0F     ; Get the lo nybble
 260 002C: 0A        ASL  A        ; Multiply it x16
 261 002D: 0A        ASL  A
 262 002E: 0A        ASL  A
 263 002F: 0A        ASL  A
 264 0030: 18        CLC
 265 0031: 69 20     ADC  #$20     ; Add $20 to it for some reason
 266 0033: AA        TAX           ; & stick in the X register
 267
 268 The important part of the $C7 hi byte of the address we found through
 269 cleverness and trickery is the slot number, which will always fall in the lower
 270 4 bits.  And, in order to be useful to find the correct slot I/O address range,
 271 that slot number needs to be multiplied by 16, as each of the slot I/O address
 272 ranges cover exactly sixteen bytes.  Note that masking off the bottom 4 bits,
 273 as is done with the AND #$0F instruction, is unnecessary as the four ASL A
 274 instructions after it will necessarily shift the top four bits out of the
 275 picture.
 276
 277 The one thing that stands out as not typical of this kind of device driver code
 278 is the adding of $20 to the index.  Typically, writers of this kind of I/O code
 279 will use $C080 to $C08F (plus the contents of the X register to reach the
 280 correct slot I/O range) as the base address for slot I/O, but, for some reason,
 281 the writers of this card's firmware chose to use $C060 to $C06F, thus
 282 necessitating the addition of $20 to the value in the X register to reach the
 283 correct range for slot I/O.
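
The index math above can be mirrored in a few lines of Python.  This is only a
sketch of the arithmetic; the function names are mine, and "register" is just
the low nybble of the $C06x operand in the listings.

```python
def slot_io_index(high_byte):
    """Mirror of AND #$0F / four ASL A / CLC / ADC #$20 / TAX."""
    a = high_byte & 0x0F          # AND #$0F
    a = (a << 4) & 0xFF           # ASL A four times, kept to 8 bits
    return (a + 0x20) & 0xFF      # CLC, ADC #$20


def register_address(high_byte, register):
    """Absolute address hit by an access to $C060+register,X."""
    return 0xC060 + register + slot_io_index(high_byte)
```

With the card in slot 7, slot_io_index(0xC7) is $90, so a store to $C06E,X
lands on $C0FE, which is register $E in slot 7's I/O range.  Dropping the AND
changes nothing: (h << 4) & 0xFF equals (h & 0x0F) << 4 for every 8-bit h.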
 284
 285 0034: A9 00     LDA  #$00     ;
 286 0036: 9D 6E C0  STA  $C06E,X  ; Select bank #0 (register $E, lower 5 bits)
 287 0039: A9 0F     LDA  #$0F
 288 003B: 9D 6F C0  STA  $C06F,X  ; Store a $F in register $F
 289 003E: 8E 08 C8  STX  $C808    ; Put slot # at $C808 (banked RAM in $C800-CBFF)
 290 0041: 9C 09 C8  STZ  $C809    ; Put zero at $C809
 291 0044: 9C F2 C8  STZ  $C8F2    ; & $C8F2
 292
 293 One thing I forgot to mention is that the Apple High Speed SCSI card is only
 294 usable by enhanced Apple IIe and IIgs machines, and that's because it relies on
 295 instructions only found in the 65C02 like STZ and PHY; a regular 6502 will not
 296 even remotely do the same things that those instructions do on the 65C02--so
 297 they're right out.
 298
 299 At any rate, the above code does some writing to the slot I/O address range and
 300 sets up some values in the card's static RAM, including saving the contents of
 301 the X register for later.
 302
 303 0047: A2 22     LDX  #$22     ; Transfer 35 bytes from ZP ($40) to $C82D
 304 0049: B5 40     LDA  $40,X
 305 004B: 9D 2D C8  STA  $C82D,X
 306 004E: CA        DEX
 307 004F: 10 F8     BPL  $0049
 308
 309 This bit of code transfers 35 bytes in page zero RAM to the card's static RAM,
 310 presumably to restore them later.
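
For those who don't think in DEX/BPL, a rough Python equivalent of that loop
follows.  The function name and the flat 64K bytearray standing in for the
address space are my own assumptions.

```python
def save_zero_page(memory):
    """Copy $40-$62 to $C82D-$C84F, counting X down from $22 to 0."""
    x = 0x22
    copied = 0
    while x >= 0:                          # BPL loops until X goes negative
        memory[0xC82D + x] = memory[0x40 + x]
        x -= 1                             # DEX
        copied += 1
    return copied
```

Starting at $22 and running down through zero makes $23 passes, hence 35
bytes.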
 311
 312 0051: AD F8 07  LDA  $07F8    ; Get original $Cx byte again
 313 0054: 8D 01 C8  STA  $C801    ; Put it in $C801
 314 0057: A9 61     LDA  #$61     ;
 315 0059: 8D 00 C8  STA  $C800    ; Put $61 in $C800 (= $Cx61)
 316 005C: A9 0B     LDA  #$0B
 317 005E: AE 08 C8  LDX  $C808    ; Get X from $C808
 318
 319 This little bit of code sets up for the code that comes below; it sets up
 320 locations $C800-1 as a location for an indirect jump that seems to happen a lot
 321 in the 1K chunks that come later.  The address it sets up as the jump target is
 322 the code that comes next:
 323
 324 0061: 5A        PHY           ; Save Y (follow on bank, passed in by caller)
 325 0062: A8        TAY           ; Save A register
 326 0063: 29 1F     AND  #$1F     ; Mask off the lower 5 bits
 327 0065: 9D 6E C0  STA  $C06E,X  ; First time, select bank 11:0 (I/O register $E)
 328 0068: 98        TYA           ; Restore the A register
 329 0069: 29 E0     AND  #$E0     ; Mask off the upper 3 bits
 330 006B: 4A        LSR  A        ; & shift them down
 331 006C: 4A        LSR  A
 332 006D: 4A        LSR  A
 333 006E: 4A        LSR  A
 334 006F: A8        TAY           ; Use as an index into a table (Y x 2)
 335
 336 What this does is save the Y register on the stack, then separates the
 337 accumulator into an upper 3-bit part and a lower 5-bit part.  The lower 5 bits
 338 go into I/O slot register $E, which presumably selects which 1K chunk of code
 339 will appear in the $CC00 to $CFFF address range while the upper 3 bits are used
 340 as an index into a table that appears near the end of each 1K chunk:
 341
 342 0070: B9 F0 CF  LDA  $CFF0,Y  ; Get address of current 1K bank
 343 0073: 85 54     STA  $54      ; & stuff it into $54/55
 344 0075: B9 F1 CF  LDA  $CFF1,Y
 345 0078: 85 55     STA  $55
 346
 347 So it uses the Y register as an index into the currently selected bank's table
 348 at $CFF0, fetches the two-byte address stored there, and stuffs it into $54 and
 349 $55, so that it can jump to that address at some point.
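
The way the accumulator gets carved up can be sketched in Python like so (the
function name is mine; the masks and shifts are straight from the listing):

```python
def decode_selector(a):
    """Split a dispatch byte into (bank, table offset, table address)."""
    bank = a & 0x1F                  # AND #$1F: lower 5 bits pick the bank
    offset = (a & 0xE0) >> 4         # AND #$E0, LSR x4: function number x 2
    return bank, offset, 0xCFF0 + offset
```

So the $0B used during initialization decodes to bank 11, offset 0, and a
value like $4B decodes to bank 11, offset 4, which is the entry at $CFF4.
Three bits of function number times two bytes per entry means the table tops
out at eight entries, filling $CFF0 to $CFFF exactly.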
 350
 351 007A: AD F8 07  LDA  $07F8    ; Get original $Cx byte again
 352 007D: A8        TAY           ; Put it in Y
 353 007E: 48        PHA           ; Put it to the stack
 354 007F: A9 86     LDA  #$86
 355 0081: 48        PHA           ; Push $86: return address is now $Cx87
 356
 357 What this does is set up the stack for what I'm going to name (for lack of a
 358 better term, or any at all to be honest) an "RTS call".  This takes advantage
 359 of how the CPU uses the stack to return execution to the instruction after a
 360 JSR instruction: when the CPU encounters a JSR opcode, it pushes the
 361 location of the program counter, plus two, onto the stack before loading the
 362 program counter with the address that comes after the JSR.  When an RTS opcode
 363 is then encountered, it restores the program counter from the stack and adds
 364 one to it before resuming execution.
 365
 366 The upshot of this is that you can transfer execution of a program from one
 367 place to the next, without using JMP, JSR or branch instructions by simulating
 368 this behavior--which also turns out to be a necessity when you're writing
 369 relocatable code.  So what the above code does is set up the stack so that it
 370 will jump to location $Cx87 when it encounters an RTS.
 371
 372 0082: 5A        PHY           ; Push $Cx
 373 0083: A9 8B     LDA  #$8B     ; Push $8B: return address is now $Cx8C
 374 0085: 48        PHA
 375
 376 Similarly, this code sets up the stack so it will jump to $Cx8C when it
 377 encounters an RTS as well.  So it will go there first, then to $Cx87 second
 378 when the routine first called via RTS call, er, uh, returns.
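
Here is a small Python sketch of the "RTS call" idea, using a plain list as a
stand-in for the 65C02 stack.  The names are mine and this models only the
address arithmetic, nothing else about the CPU.

```python
def rts_call(target):
    """Return the two bytes to push (high first) so RTS lands on target."""
    ret = (target - 1) & 0xFFFF      # RTS adds one to whatever it pulls
    return ret >> 8, ret & 0xFF


def rts(stack):
    """Pull low then high byte, as RTS does, and return the new PC."""
    low = stack.pop()
    high = stack.pop()
    return (((high << 8) | low) + 1) & 0xFFFF


stack = []
stack.extend(rts_call(0xC787))       # pushed first, so it is reached second
stack.extend(rts_call(0xC78C))       # pushed last, so it is reached first
first = rts(stack)
second = rts(stack)
```

With the card in slot 7, first ends up as $C78C and second as $C787, which is
the order described above.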
 379
 380 0086: 60        RTS           ; First time, will "return" to $Cx8C
 381
 382 Thus, this first RTS transfers control to the JMP ($0054) down below, which was
 383 set up above as an address somewhere in a 1K code chunk.  Since the code that
 384 goes into the 1K code chunk is a JMP instruction, once that code returns, it
 385 will then find the address that was pushed on the stack earlier, and execute
 386 the following code:
 387
 388 0087: 68        PLA           ; After the $CCxx block is done, it comes here
 389 0088: 9D 6E C0  STA  $C06E,X  ; Restore last block (one passed in Y reg)
 390 008B: 60        RTS           ; & return to calling code in that block
 391
 392 This code pops the Y register that was saved way back up at location $Cx61 and
 393 uses it to set the I/O register at $E, which, presumably, is the bank switch
 394 I/O address for the card.  This will turn out to be of vital importance later,
 395 but we'll leave it for now.  The RTS, finally, returns from initialization and
 396 back from whence it came.
 397
 398 008C: 6C 54 00  JMP  ($0054)  ; Jump to the $CCxx block code
 399
 400 This indirect JMP instruction, called up above via RTS call, kicks things off.
 401
 402 008F-00FA: 00                 ; $6C worth of zeroes
 403 00FB: 82 00 00 BF 0D          ; ID/offset bytes
 404
 405 So these bytes that look like a bit of detritus actually do serve a useful
 406 function in ProDOS.  The $0D at the very end serves as an offset from the
 407 beginning of the code to the ProDOS entry point, which in this case works out
 408 to $Cx0D.  It also serves as the entry point for SmartPort calls (by adding 3
 409 to it), which works out to $Cx10.
 410
 411 Further, the "Technical Manual for the Apple SCSI Card" says the following
 412 about the byte at $FB: "An additional byte, at $CnFB, should contain $82,
 413 indicating that the device is the SCSI card ($2) and that it supports extended
 414 calls ($8)."  This just happens to be one of a small handful of those
 415 aforementioned tiny bits of useful information that I was able to glean from
 416 that source.
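
Pulling those identification bytes out of a slot ROM image might look like the
following Python sketch.  The function and key names are hypothetical, and I'm
treating the $8 and $2 in the $CnFB byte as individual bit flags, which is an
assumption on my part; the manual quote only gives the combined value.  The
sample image holds just the bytes shown in the listings, with everything else
left as zero.

```python
def parse_slot_rom(rom, slot):
    """Pull the identification details out of a 256-byte slot ROM image."""
    signature = (rom[0x01], rom[0x03], rom[0x05], rom[0x07])
    base = 0xC000 + slot * 0x100
    return {
        "smartport": signature == (0x20, 0x00, 0x03, 0x00),
        "extended": bool(rom[0xFB] & 0x80),      # the $8 in $82
        "scsi": bool(rom[0xFB] & 0x02),          # the $2 in $82
        "prodos_entry": base + rom[0xFF],        # offset byte at $CnFF
        "smartport_entry": base + rom[0xFF] + 3,
    }


rom = bytearray(256)
rom[0x00:0x08] = bytes([0xA2, 0x20, 0xA2, 0x00, 0xA2, 0x03, 0xA2, 0x00])
rom[0xFB:0x100] = bytes([0x82, 0x00, 0x00, 0xBF, 0x0D])
info = parse_slot_rom(rom, slot=7)
```

For slot 7 this reports a ProDOS entry point of $C70D and a SmartPort entry
point of $C710.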
 417
 418 And so, at last, we come to the realization that this is definitely the slot
 419 ROM code, and thus CWLLIBISM becomes CWSISM (Code Which Sits In Slot Memory).
 420
 421
 422 And Now For Something Not Quite So Completely Different
 423 -------------------------------------------------------
 424
 425 And with that digression into CWSISM, we turn our attention back to the 1K
 426 chunk of initialization code that sits in bank 11.  In looking at the table
 427 that we discovered sits at $CFF0, we find the following in the 11th (counting
 428 from zero) 1K chunk:
 429
 430 CFF0: 00 CC
 431 CFF2: 91 CE
 432 CFF4: 9A CD
 433 CFF6: 00 00 00 00 00 00 00 00 00 00
 434
 435 This tells us that there are only three valid addresses in the table (as the
 436 zeroes will take you nowhere), and that further, they are $CC00, $CE91 and
 437 $CD9A.  And since the CWSISM set up the $Cx61 dispatch call with $0B (at
 438 $Cx5C), it will pick the zeroeth address in that list, namely, $CC00.  So,
 439 looking at the code that lies there, what we see looks promising:
 440
 441 CC00: 68        PLA          ; Discard the 2nd return path (bank switch back)
 442 CC01: 68        PLA
 443 CC02: 68        PLA          ; Discard the follow on bank #, as there is none
 444
 445 Since this is initialization code, we can discard the RTS call from the stack
 446 since we aren't calling this code from another bank.  Which also means that we
 447 can discard that parameter which tells the RTS call what bank to select before
 448 returning.
 449
 450 CC03: 86 5E     STX  $5E     ; Save slot # (+$20) in $5E
 451 CC05: 9C 93 C8  STZ  $C893   ; Zero out $C893 & $5D
 452 CC08: 64 5D     STZ  $5D
 453 CC0A: 20 C1 CC  JSR  $CCC1   ; Test for GS hardware + DMA switch
 454
 455 This is basically housekeeping, and the routine called at $CCC1 tests if the
 456 card is running on an Apple IIgs and sets bit 6 of zero page location $5D if it
 457 detects that.  It also checks the physical DMA on/off switch on the card as
 458 well; if it's set, it sets bit 5 of $5D.  The following bit of code checks $5D
 459 to see if bit 6 is clear and skips the instructions at $CC11 to $CC19 if
 460 so--and since I'm emulating an Enhanced Apple IIe, it *will* skip those
 461 instructions:
 462
 463 CC0D: 24 5D     BIT  $5D     ; Check if bit 6 of $5D is set (means it's a GS)
 464 CC0F: 50 0B     BVC  $CC1C   ; Skip over if not set (it's not a IIgs)
 465 CC11: AD 36 C0  LDA  $C036   ; IIgs Speed Reg.
 466 CC14: 8D 96 C8  STA  $C896   ; Save it for later...
 467 CC17: 09 80     ORA  #$80    ; Set speed to 2.8 MHz
 468 CC19: 8D 36 C0  STA  $C036   ; & modify
 469
 470 Luckily there exists a very good technical reference manual for the Apple
 471 IIgs; unluckily, it's a bit hard to track down.  But once you do, the
 472 information in it is quite good.  The above bit of code shows that the card
 473 firmware shifts the IIgs into high gear while running on the card.  However, we
 474 don't really care about that bit of code; which is why we spent so much time
 475 explaining what it does.
 476
 477 CC1C: 68        PLA          ; Get flags from slot init
 478
 479 Way back in CWSISM, at slot location $Cx13, there was an innocuous looking PHP
 480 instruction; here is where we finally take a look at the contents of it.
 481
 482 CC1D: A8        TAY          ; Save them in Y
 483 CC1E: 29 04     AND  #$04    ; Check if I flag is set
 484 CC20: F0 05     BEQ  $CC27   ; Skip if I is not set
 485 CC22: A9 80     LDA  #$80    ; Else, signal I flag is set ($80 -> $C893)
 486 CC24: 8D 93 C8  STA  $C893
 487
 488 Here we look at the interrupt disable bit in the processor flags that we saved
 489 earlier; if it's not set we skip on over to the next bit of code below.
 490 Otherwise, the code sets $80 into memory location $C893 to signal that
 491 initialization code was called with the I flag set.
 492
 493 CC27: 98        TYA          ; Restore flags from Y
 494 CC28: 09 04     ORA  #$04    ; Set I flag
 495 CC2A: 48        PHA          ; Push them to the stack
 496 CC2B: 28        PLP          ; & restore flags for real
 497
 498 Since we need to get the values of the overflow and carry flags back, which
 499 were set way back in CWSISM at addresses $Cx0D through $Cx11, we have to
 500 retrieve them from the Y register, then push them onto the stack and then use a
 501 PLP to get them back into the flags register proper.  Along the way, we set the
 502 interrupt disable flag at $CC28 (the ORA #$04 instruction).
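
Putting the flag juggling together, here is a Python sketch of what the saved
flags byte tells the firmware, going by the CWSISM entry code seen earlier.
The bit positions are the standard 65C02 status register layout; the function
names are my own.

```python
FLAG_C = 0x01   # carry
FLAG_I = 0x04   # interrupt disable
FLAG_V = 0x40   # overflow


def classify_entry(saved_p):
    """Guess which entry path produced the flags byte pushed by the PHP."""
    if saved_p & FLAG_V:
        return "init"            # BIT $FF58 left V set and the CLV was skipped
    if saved_p & FLAG_C:
        return "prodos"          # SEC at $Cx0D, then CLV at $Cx11
    return "smartport"           # CLC at $Cx10, then CLV at $Cx11


def irq_marker(saved_p):
    """Value that ends up in $C893: $80 if the caller had I set, else 0."""
    return 0x80 if saved_p & FLAG_I else 0x00


def restore_flags(saved_p):
    """Mirror of TYA / ORA #$04 / PHA / PLP: same flags, with I forced on."""
    return saved_p | FLAG_I
```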
 503
 504 And in looking at code as we're doing here, it's hard not to look at it with a
 505 critical eye and notice that the coder could have saved a byte by deleting the
 506 ORA #$04 (which takes two bytes) and putting an SEI after the PLP (which takes
 507 one byte).  And, since we don't have any source code to look at, we may never
 508 know what the intention was; though it's quite likely that this was just a
 509 simple oversight.
 510
 511 CC2C: 50 09     BVC  $CC37   ; If this is an I/O call (V clear), skip over
 512
 513 Here we see that if the card firmware was called via the ProDOS entry at $Cx0D
 514 or the SmartPort entry at $Cx10, the CLV at $Cx11 would have cleared overflow
 515 and we would skip over the following.  But, since we came in through slot
 516 init, where the flag was definitely set, we know that we will execute this:
 517
 518 CC2E: BA        TSX          ; Slot init gets here (V was set by the BIT)
 519 CC2F: 8E 07 C8  STX  $C807   ; Save stack pointer in $C807
 520 CC32: A9 0F     LDA  #$0F
 521 CC34: 4C 5F CF  JMP  $CF5F   ; Jump to bank 15:0 for rest of init
 522
 523 This saves the stack pointer and sets up to jump to a new bank, which means we
 524 won't be coming back here.  Onward:
 525
 526 CF5F: A6 5E     LDX  $5E     ; Restore slot # (+$20) in X
 527 CF61: A0 0B     LDY  #$0B    ; Y gets loaded with bank to return to on RTS
 528 CF63: 6C 00 C8  JMP  ($C800) ; & go!
 529
 530 There are variants of this piece of code throughout every 1K bank of firmware
 531 code.  And since we took a good long look at CWSISM, we know that CWSISM set up
 532 locations $C800 and $C801 to point to the card's slot ROM address of $Cx61, and
 533 suddenly it becomes clear what that bit of code does.
 534
 535 Since the firmware code bounces around a lot in different banks (as we will
 536 discover shortly), it needs a mechanism to get back to the place that called it
 537 in the first place.  The problem is this: once a new 1K bank of code is
 538 switched into the $CC00 to $CFFF address space, there's no way for the 65C02 to
 539 get back to the caller with a simple RTS; any code that attempted to do so
 540 would end up executing the wrong code as the 65C02 knows nothing about bank
 541 switching and has no built-in mechanism to handle such things.
 542
 543 And so, by virtue of this, the code needs a way to do this manually.  Which is
 544 why the $Cx61 code in CWSISM saves the bank number on the stack, and then sets
 545 up a pair of RTS calls which first, sets the correct bank and calls the correct
 546 function number in that bank and second, sets the bank to the bank that made
 547 the call in the first place before executing a final RTS which then goes back
 548 to the correct address.
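
Stripped of the stack gymnastics, the mechanism boils down to something like
this toy Python model.  Everything here is a stand-in: the Card class, the
trace list and the two placeholder functions are hypothetical, and real bank
code obviously isn't a Python callable.

```python
class Card:
    """Toy model of the $Cx61 dispatcher; banks map to lists of callables."""

    def __init__(self, banks):
        self.banks = banks          # {bank number: [function 0, function 1, ...]}
        self.current = 0            # stand-in for slot I/O register $E
        self.trace = []

    def dispatch(self, selector, return_bank):
        self.current = selector & 0x1F              # switch in the callee's bank
        self.trace.append(("call", self.current))
        result = self.banks[self.current][selector >> 5](self)
        self.current = return_bank                  # the $Cx87 code: switch back
        self.trace.append(("return", self.current))
        return result


def enumerate_drives(card):          # placeholder for whatever bank 3:0 does
    return 0


def init(card):                      # placeholder for bank 15:0
    return card.dispatch(0x03, return_bank=15)


card = Card({3: [enumerate_drives], 15: [init]})
card.dispatch(0x0F, return_bank=11)
```

The trace shows bank 15 switched in, then bank 3, then bank 15 restored before
the inner call returns, and finally bank 11 restored for the original caller.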
 549
 550 And since we saw up above that it passed $0F into the calling routine (well,
 551 actually, it jumped there), we know that it's going to call function #0 in bank
 552 15.  As it turns out, the function table for bank 15 looks like this:
 553
 554 CFF0: 00 CC
 555 CFF2: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 556
 557 which means bank 15 only contains one function, and it starts at $CC00.
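
Reading those tables by eye gets old quickly, so here is a Python sketch that
decodes one.  The function name is mine; the bytes are the ones from the two
listings above.

```python
def parse_table(table):
    """Turn the 16 bytes at $CFF0-$CFFF into {function number: address}."""
    entries = {}
    for number in range(8):
        # Each entry is a little-endian word: low byte first, then high byte
        address = table[number * 2] | (table[number * 2 + 1] << 8)
        if address != 0:                  # zero entries lead nowhere
            entries[number] = address
    return entries


BANK_11 = bytes([0x00, 0xCC, 0x91, 0xCE, 0x9A, 0xCD] + [0x00] * 10)
BANK_15 = bytes([0x00, 0xCC] + [0x00] * 14)
```

parse_table(BANK_11) yields functions 0, 1 and 2 at $CC00, $CE91 and $CD9A,
while parse_table(BANK_15) yields only function 0 at $CC00.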
 558
 559
 560 The Next Part, In Which We Peruse Bank 15
 561 -----------------------------------------
 562
 563 The story so far: we started in slot ROM, set up a bunch of variables, then
 564 bounced to bank 11, and just now bounced to bank 15.
 565
 566 CC00: A9 40     LDA  #$40
 567 CC02: 8D 09 C8  STA  $C809   ; Put $40 into $C809
 568 CC05: 8D 32 BF  STA  $BF32   ; & $BF32(!)
 569 CC08: 9C 0A C8  STZ  $C80A   ; Zero out $C80A
 570
 571 So far this is all normal housekeeping boilerplate, though putting the value
 572 $40 into RAM at address $BF32 makes me raise an eyebrow (to this day, I still
 573 have no idea what that's supposed to do).  So then we come to the heart of the
 574 matter:
 575
 576 CC0B: A9 03     LDA  #$03
 577 CC0D: 20 AF CF  JSR  $CFAF   ; Call bank 3:0 (enumerate all connected drives)
 578
 579 Here is the first proper JSR into bank switched code, and in taking a cursory
 580 glance at the code there, well...  It's a bit of a Gordian knot.  So we'll
 581 ignore the stones in the field for now, and keep on plowing ahead:
 582
 583 CC10: AE 08 C8  LDX  $C808   ; Restore slot # (+$20) to X
 584 CC13: A5 4F     LDA  $4F
 585 CC15: F0 03     BEQ  $CC1A   ; Skip over if call was successful ($4F == 0)
 586 CC17: 4C F0 CC  JMP  $CCF0   ; Else, do a LDA #$2B, JMP $CFAF to bank 11:1
 587
 588 So here the code retrieves the slot I/O offset in X from the location set way
 589 back in CWSISM, then checks what looks like some kind of error condition.  If
 590 it fails, it skips on over to function 1 in bank 11; otherwise, it keeps going
 591 here:
 592
 593 CC1A: 24 5D     BIT  $5D     ; Are we running on a IIgs?
 594 CC1C: 70 05     BVS  $CC23   ; If so, skip over & keep going
 595
 596 Since we're not running on a IIgs, this branch is not taken and thus it can be
 597 safely ignored.  Continuing on:
 598
 599 CC1E: A9 4B     LDA  #$4B    ; Else, jump to bank 11:2 (normal success path)
 600 CC20: 4C AF CF  JMP  $CFAF
 601 ;
 602 CFAF: A6 5E     LDX  $5E     ; Restore slot (+$20) in X
 603 CFB1: A0 0F     LDY  #$0F    ; Make sure we come back here...
 604 CFB3: 6C 00 C8  JMP  ($C800) ; & go!!
 605
 606 So what this means is that if the function call to bank 3:0 succeeded, the code
 607 will then bounce to function 2 in bank 11.  And, as we saw above, function 2
 608 starts at $CD9A in bank 11.
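
The values loaded into A before each trip through the dispatcher ($0F, $03,
$2B and $4B) line up with the bank:function pairs noted in the comments if the
low bits of A are read as the bank number and the top three bits as the
function number.  The sketch below encodes that reading.  The exact field
widths (five bits and three bits) are an assumption that happens to fit these
four values, and the function name is hypothetical.

```python
def split_selector(a):
    """Split a selector byte into (bank, function).

    Assumption: bits 0-4 hold the bank and bits 5-7 hold the function.
    This fits the values seen so far but is not confirmed by the firmware.
    """
    return a & 0x1F, a >> 5

for value in (0x0F, 0x03, 0x2B, 0x4B):
    bank, function = split_selector(value)
    print(f"${value:02X} -> bank {bank}:{function}")
```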
 609
 610
 611 The Next Part, In Which We Bounce Back To Bank 11 And Find Something Familiar
 612 -----------------------------------------------------------------------------
 613
 614 So far, this little expedition is proving to be circuitous, but not
 615 impenetrable.  And it makes sense that we would come back to bank 11, as that's
 616 where the initialization code sent us in the first place.  And so, pressing on,
 617 we find:
 618
 619 CD9A: 86 5E     STX  $5E     ; Save X in $5E
 620 CD9C: A9 01     LDA  #$01    ; Put 1 in $43, $44
 621 CD9E: 85 43     STA  $43
 622 CDA0: 85 44     STA  $44
 623 CDA2: 64 46     STZ  $46     ; Zero out $46, $47, $48, $49
 624 CDA4: 64 47     STZ  $47
 625 CDA6: 64 48     STZ  $48
 626 CDA8: 64 49     STZ  $49
 627 CDAA: A9 08     LDA  #$08    ; Put $08 in $41
 628 CDAC: 85 41     STA  $41
 629 CDAE: 64 40     STZ  $40     ; Zero out $40, $42
 630 CDB0: 64 42     STZ  $42
 631
 632 This is again more housekeeping boilerplate, initializing a bunch of zero page
 633 locations.  Then we find this:
 634
 635 CDB2: A9 09     LDA  #$09
 636 CDB4: 20 5F CF  JSR  $CF5F   ; Call bank 9:0 (directly)
 637
 638 So this calls function 0 in bank 9, which lives at $CC00.  And looking through
 639 that code, well, let's just put that aside for now as it's long and involved
 640 and will require a fair amount of study.  Continuing:
 641
 642 CDB7: A5 4F     LDA  $4F
 643 CDB9: D0 0C     BNE  $CDC7   ; Fail if $4F is non-zero
 644
 645 This looks at the error flag we saw up above in bank 15, and jumps to function
 646 1 in this bank if the error flag is non-zero.
 647
 648 CDBB: AD 01 08  LDA  $0801   ; Get byte @ $801 (!)
 649 CDBE: F0 07     BEQ  $CDC7   ; Fail if it's zero
 650
 651 Now here is something interesting!  Why this is interesting is because when
 652 booting from a floppy disk, the disk driver typically loads at least one sector
 653 (256 bytes of data) into location $800.  So we can deduce that the above call
 654 into function 0 in bank 9 is loading something similar from the hard drive into
 655 memory at a similar address.  With this bit of knowledge, we can see up above
 656 where it puts address $800 into zero page locations $40 and $41 that those
 657 locations must be a loading address.
 658
 659 CDC0: AD 00 08  LDA  $0800   ; Get byte @ $800 (!)
 660 CDC3: C9 01     CMP  #$01
 661 CDC5: F0 03     BEQ  $CDCA   ; Keep going if it's equal to 1
 662 CDC7: 4C 91 CE  JMP  $CE91   ; Else, jump to function 1 (failure point)
 663
 664 Again, this is interesting because with floppy disks, the first byte of the
 665 first sector loaded into memory at $800 contains the number of sectors that
 666 the floppy driver should load into memory; this looks eerily similar--only in
 667 this case, it will jump to the failure path if it sees it wanting more than
 668 one block.  Assuming all is well, we then have this:
 669
 670 CDCA: 8D 09 C8  STA  $C809   ; Put a 1 into $C809
 671 CDCD: AD F8 07  LDA  $07F8   ; Get $7F8
 672 CDD0: 0A        ASL  A       ; x16
 673 CDD1: 0A        ASL  A
 674 CDD2: 0A        ASL  A
 675 CDD3: 0A        ASL  A
 676 CDD4: AA        TAX          ; Store it in X
 677 CDD5: A9 00     LDA  #$00    ; Stuff 0 in $C035 (IIgs Shadow register)
 678 CDD7: 8D 35 C0  STA  $C035
 679 CDDA: 8D 01 CC  STA  $CC01   ; What does this do?
 680 CDDD: 4C 01 08  JMP  $0801   ; Run the code from block 0
 681
 682 And here we see it hand off execution to data that it pulled from the hard
 683 drive by jumping to $801, and thus we see that this must be the end of the hard
 684 drive boot logic.  As far as the firmware is concerned, its initialization job
 685 of bootstrapping the hard drive is concluded.
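
Taken together, the checks at $CDBB-$CDC5 amount to a small validity test on
the block that was just loaded at $800.  Here is a sketch of that test,
assuming (as the analysis above does) that the first byte is a block count and
that the byte at $801 is the start of the boot code.  The function name is
hypothetical.

```python
def block0_looks_bootable(block):
    """Return True if the firmware would go on to jump to $801."""
    if len(block) < 2:
        raise ValueError("need at least the first two bytes of the block")
    code_byte_present = block[1] != 0    # LDA $0801 / BEQ -> failure
    count_is_one = block[0] == 1         # LDA $0800 / CMP #$01 / BEQ -> go
    return code_byte_present and count_is_one
```

Any other combination ends up at $CE91, the failure path in function 1.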
 686
 687 However, we still really don't know anything that tells us what the slot I/O
 688 addresses do (aside from location $E) and we still have no idea how the card
 689 talks to the hard drive.  At least we have a pretty good idea of where to look.
 690
 691
 692 What Are All These Eels, And What Are They Doing In My Hovercraft
 693 -----------------------------------------------------------------
 694
 695 So at last we get to take a look at function 0 in bank 3.  And, much like a
 696 hovercraft full of eels, it's a twisty mass of slippery, squirming code.  And,
 697 looking at it more closely, it does a bunch of things which don't make much
 698 sense until you understand other code, which bounces around to lots of other
 699 banks.  And a lot of it is opaque unless you somewhat understand what the ports
 700 on the NCR 53C80 do and how the SCSI protocol works.
 701
 702 So while we have an excellent start on understanding, for the most part, the
 703 broad outlines of how the card works, we are still stuck with a profound lack
 704 of critical knowledge on how the thing talks to the hard drive and,
 705 conversely, how the hard drive talks to the card.  And without that knowledge,
 706 we perish.
 707
 708
 709 The Next Part, In Which We Are Not Ready To Perish
 710 --------------------------------------------------
 711
 712 Fortunately, the NCR 5380 and, by extension, the 53C80 is well documented and
 713 said documentation is readily available, and so I availed myself of it.  I took
 714 another look at the schematic for the card and noticed that the 53C80 had three
 715 address lines on it, which implied that it had eight ports for controlling it.
 716 Unfortunately, there's an error on the schematic in which they have the address
 717 lines hooked up in reverse, and this caused me no small amount of consternation.
 718
 719 It seemed obvious that those eight ports were hooked up to the slot I/O
 720 addresses, and also seemed very plausible, after having looked at and analyzed
 721 a lot of code heretofore unmentioned, that it was connected to the lower half
 722 of that address space.  So, in order to confirm my suspicions, I started
 723 writing the hard drive emulator.
 724
 725 This started out, simply, as a bunch of statements that output human readable
 726 words to a log file whenever the slot I/O addresses were accessed by the card
 727 firmware; I used the firmware's access to the slot I/O to tell me what it said
 728 and what it was listening for.  Well, that, and some code to properly handle
 729 the bank selection of the ROM space as well.  In this way, I was able to
 730 enlarge my understanding of what the card expected to see as well as what the
 731 ports that weren't connected to the 53C80 (which were likely connected to the
 732 Sandwich II) might be up to.
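
A logging stub of that kind might look something like the sketch below.  It is
not the emulator's actual code: the function name and log format are invented
for illustration, and labeling offsets $0-$7 with 53C80 register names assumes
the chip sits in the lower half of the slot I/O space, which at this stage was
still only a suspicion.

```python
# (name when read, name when written) for the eight 53C80 registers.
REGISTERS = [
    ("Data", "Data"),
    ("Initiator Command", "Initiator Command"),
    ("Mode", "Mode"),
    ("Target Command", "Target Command"),
    ("Current SCSI Bus Status", "Select Enable"),
    ("Bus and Status", "Start DMA Send"),
    ("Input Data", "Start DMA Target Receive"),
    ("Reset Parity/Interrupt", "Start DMA Initiator Receive"),
]

def describe_access(offset, value, is_write):
    """Format one slot I/O access as a human-readable log line."""
    offset &= 0x0F                      # only 16 locations per slot
    direction = "write" if is_write else "read"
    if offset < 8:
        name = "53C80 " + REGISTERS[offset][1 if is_write else 0]
    else:
        name = "unknown (Sandwich II?)"
    return f"{direction:5} ${offset:X} = ${value:02X}  [{name}]"

print(describe_access(0x2, 0x01, True))
print(describe_access(0xE, 0x80, False))
```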
 733
 734 So in fits and starts, I used the code that writes to the Mode Register of the
 735 53C80 to get the code to successfully... do something.  It was at that point I
 736 could see that it was getting through the initialization phase of the card's
 737 firmware as Apple2 would be able to boot a floppy image inserted into a drive
 738 in slot 6 at that point.  But in tracing the reads and writes to the slot I/O
 739 address space in the log I could see that it was getting through the card's
 740 firmware in a failure mode.  It was progress, of a sort.  Even failure tells
 741 you something.
 742
 743 And what it told me was that I needed to dig into the SCSI specification to
 744 figure out how the protocol worked.  Looking back I can see that I was getting
 745 through to the MESSAGE phase and, because of the way I was responding to that
 746 message, that the firmware would then send an ABORT message, but that's all
 747 pretty much meaningless as I haven't explained anything about the SCSI protocol
 748 and how it works.
 749
 750 And here, while there is a lot of information about the latter day iterations
 751 of the SCSI protocol, there wasn't much pertaining to the kind of SCSI that the
 752 Apple High Speed SCSI card spoke, which in its case, has been retroactively
 753 labeled SCSI-1.
 754
 755 And when looking at the SCSI protocol, the first thing that hits you is that
 756 it's a very well designed, robust protocol and it's nothing short of a minor
 757 miracle that it survived and still survives to this day.  However, the
 758 documentation on how it *really* works is a bit lacking.  Yes, you can discover
 759 that there are nine phases, and the first three are fairly easy to understand;
 760 it's what comes after that where things get murky.
 761
 762
 763 Talk SCSI To Me
 764 ---------------
 765
 766 So here is a crash course in the SCSI-1 protocol.  The SCSI bus is engineered
 767 such that it allows for eight devices to connect to said bus; devices connected
 768 to the bus can have Initiator and/or Target roles.  Devices can talk to each
 769 other by passing messages over this bus, however only one pair of devices can
 770 use the bus at any one time.  In order to prevent deadlock from happening when
 771 more than one device attempts to take control of the bus, there is an enforced
 772 hierarchy of devices wherein they all have a unique ID; a device that contends
 773 for use of the bus at the same time as another device wins this contention if
 774 and only if its device ID is higher than the other device's ID (as bits on the
 775 data bus, 128 is the highest and 1 is the lowest).  The bus is an 8-bit
 776 parallel data bus that is controlled by a variety of signals (and these are
 777 typically called "lines").
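
To make the priority rule concrete: each of the eight IDs (0 through 7) owns
one line of the data bus, so ID n shows up as bit n, and the highest bit
present wins.  A small sketch of that rule follows; the function names are
illustrative only.

```python
def id_to_bus_bit(scsi_id):
    """Return the data bus bit that a device with this ID asserts."""
    if not 0 <= scsi_id <= 7:
        raise ValueError("SCSI-1 IDs run from 0 to 7")
    return 1 << scsi_id

def arbitration_winner(contending_ids):
    """Return the ID that wins when all of these contend at once."""
    if not contending_ids:
        raise ValueError("nobody is contending for the bus")
    bus = 0
    for scsi_id in contending_ids:
        bus |= id_to_bus_bit(scsi_id)   # everyone drives their own bit
    return bus.bit_length() - 1         # position of the highest set bit

print(arbitration_winner([2, 5]))       # 5
```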
 778
 779 In contending for and utilizing the bus, there are nine phases that all SCSI
 780 devices must understand and negotiate.  They are as follows:
 781
 782  -  Bus Free
 783  -  Arbitration
 784  -  Selection
 785  -  Message In
 786  -  Message Out
 787  -  Data In
 788  -  Data Out
 789  -  Command
 790  -  Status
 791
 792 In the Bus Free phase, as one might expect, no devices are using the bus.  This
 793 is the ground state of the SCSI protocol, the phase from whence all
 794 communication starts and where it all ends.  Any device that wishes to talk to
 795 another device on the bus must start here.
 796
 797 Once a device sees that the bus is free, it can enter the Arbitration phase as
 798 an Initiator; it does so by first setting the bit that corresponds to its
 799 device ID on the data bus.  If another device tries to do this at the same
 800 time, the device with the lower ID will remove its bit from the data bus and
 801 try again when it detects that the bus is free again.  When the Initiator has
 802 waited a certain amount of time with no other contention, it then asserts the
 803 SEL line and goes into the Selection phase.
 804
 805 In the Selection phase, the Initiator sets the bit that corresponds to the
 806 device ID it wants to talk to (the Target) on the data bus.  Every other device
 807 on the bus, by virtue of the asserted SEL line, knows it's in the Selection
 808 phase and can see the device ID bits being asserted on the data bus; if none of
 809 the bits match its own ID, it will stay silent.  If the Target device doesn't
 810 respond in a timely manner, the device that tried "calling" it drops the bits
 811 it asserted on the data bus and drops the SEL line.  Otherwise, if the Target
 812 device sees its ID on the data bus, it responds by asserting the BSY (BuSY)
 813 line.
 814
 815 The device that started all of this (the Initiator) then drops the SEL line and
 816 the Initiator and Target devices then enter the next phase.  What phase that is
 817 took some teasing out of lots of different papers, datasheets and manuals--as
 818 well as much trial and error in the emulation code.  And what I found was this:
 819 once the devices are in the Selection phase, they typically(*) dance through
 820 the following set of phases, in order, before being done with their
 821 transaction: Message Out(**), Command, Data In/Out, Status, Message In.
 822
 823 (*) One exception to this is the TEST UNIT READY command, which will skip the
 824 Data In/Out phase
 825
 826 (**) Note that the qualifiers "In" and "Out" come strictly from the perspective
 827 of the Initiator
 828
 829 Once the devices have successfully negotiated the Message In phase at the end
 830 of their phase dance, the Target device drops the BSY line and the bus is then
 831 free again for another transaction.
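
The phase dance described above can be written down as a tiny function.  This
is only a sketch of the typical sequence as I pieced it together, not a
complete model of the protocol, and the name is made up.

```python
def transaction_phases(data_direction=None):
    """List the phases that follow Selection for one command.

    data_direction is "in", "out", or None for a command (such as
    TEST UNIT READY) that transfers no data.
    """
    phases = ["Message Out", "Command"]
    if data_direction == "in":
        phases.append("Data In")
    elif data_direction == "out":
        phases.append("Data Out")
    elif data_direction is not None:
        raise ValueError("data_direction must be 'in', 'out' or None")
    phases += ["Status", "Message In"]
    return phases

print(transaction_phases("in"))
```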
 832
 833 One thing I forgot to mention is that each phase transition, once the devices
 834 are in the Selection phase, is punctuated by a REQ/ACK handshake.  Typically,
 835 the Target asserts and drops the REQ line while the Initiator asserts and drops
 836 the ACK line.  Basically, when the Target is ready to move to a different
 837 phase, it will assert the REQ line; the Initiator will see this and then assert
 838 the ACK line.  Once the Target sees the ACK line asserted, it will drop the REQ
 839 line; the Initiator, seeing this, will then drop the ACK line.  And thus hands
 840 are shaken, and all are in agreement as to where they are and what they are
 841 doing.
 842
 843 One interesting consequence of this kind of handshaking is that it means that
 844 every phase past Arbitration is driven by the Target device.
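
For those who prefer to see the handshake spelled out, here is a sketch that
records the state of the REQ and ACK lines after each of the four steps
described above.  It models nothing but those two lines, and the function name
is hypothetical.

```python
def req_ack_handshake():
    """Return the four steps as (who acted, REQ, ACK) tuples."""
    req = ack = False
    steps = []
    req = True                          # Target asserts REQ
    steps.append(("Target", req, ack))
    ack = True                          # Initiator sees REQ, asserts ACK
    steps.append(("Initiator", req, ack))
    req = False                         # Target sees ACK, drops REQ
    steps.append(("Target", req, ack))
    ack = False                         # Initiator sees that, drops ACK
    steps.append(("Initiator", req, ack))
    return steps

for who, req, ack in req_ack_handshake():
    print(f"{who:9} REQ={int(req)} ACK={int(ack)}")
```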
 845
 846
 847 By Your Command
 848 ---------------
 849
 850 And so having deciphered the proper steps in the post-Selection phase dance, we
 851 come at last to the heart of the matter: the Command phase.  Commands come in a
 852 few different flavors: the six byte, the ten byte and the twelve byte.  The
 853 flavor is given by the top three bits of the first byte while the command
 854 itself is given by the bottom five bits.  Treating those top three bits as a
 855 number from zero to seven, the flavors fall into the following groups:
 856
 857 six byte: 0
 858 ten byte: 1, 2
 859 twelve byte: 5
 860
 861 Yes, 3, 4, 6 and 7 are all missing, and, for the purposes of this crash course,
 862 can be safely ignored(*).
 863
 864 (*) For the terminally curious, 3 and 4 are (were?) "reserved", and 6 and 7 are
 865 for "vendor specific" commands
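
In emulator terms, this means the device can work out how many more bytes to
expect as soon as it has the first byte of a command.  A minimal sketch of that
lookup, with names invented for illustration:

```python
# Command length in bytes for each group covered above.
CDB_LENGTH_BY_GROUP = {0: 6, 1: 10, 2: 10, 5: 12}

def split_opcode(opcode):
    """Return (group, command) from the first byte of a command."""
    return (opcode >> 5) & 0x07, opcode & 0x1F

def cdb_length(opcode):
    """Return the command length, or None for groups 3, 4, 6 and 7."""
    group, _ = split_opcode(opcode)
    return CDB_LENGTH_BY_GROUP.get(group)

print(split_opcode(0x28), cdb_length(0x28))   # (1, 8) 10
```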
 866
 867 Having now discerned their form, the question arises: just what do these
 868 commands do?  Basically, they tell the Target what the Initiator wants from it.
 869 For example, let's say that the Initiator wants to know if a device on the bus
 870 is ready to receive commands.  It would send out, during the Command phase, a
 871 TEST UNIT READY command which has the following form:
 872
 873 00 00 00 00 00 00
 874
 875 Assuming the device receiving this command actually is ready to receive
 876 commands, it would then send back a status byte (in the Status phase) saying
 877 "Good" (which, in this case, is coded as $00), followed by a COMMAND COMPLETE
 878 message (also $00) in the Message In phase.
 879
 880 Other commands follow basically the same form; only instead of going directly
 881 to the Status phase, as the TEST UNIT READY command does, it will go into
 882 either the Data In or Data Out phase before going to the Status
 883 phase--depending on what the command does.  For example, a READ command will go
 884 to the Data In phase, because the Initiator is requesting data from the Target;
 885 likewise, a WRITE command will go to the Data Out phase because the Initiator
 886 wants to send data to the Target.
 887
 888
 889 Back To Our Regularly Scheduled Analysis
 890 ----------------------------------------
 891
 892 So, before we diverged into a crash course of the SCSI-1 protocol, we were
 893 looking at where I had been able to have the card's firmware return back to the
 894 Apple IIe's Autostart program, but in a failure mode.  Which, while ultimately
 895 unsatisfying, *was* a step in the right direction.
 896
 897 So I could see that with my hard-coded responses to the firmware's inquiries, I
 898 was getting an IDENTIFY message ($80) followed by an ABORT message ($06).  It
 899 was at this point I could also see that I was going to have to start writing
 900 the actual hard drive device emulator code as well, as trying to keep track of
 901 all the phase changes in the slot I/O register code was turning into an
 902 impenetrable mess and wasn't going to be fruitful in the long run.
 903
 904 This also necessitated a closer look at the code for function 0 in bank 3.  I
 905 took copious notes on where the code went and what it did, and eventually found
 906 that almost everything, at some point, seemed to end up calling function 0 in
 907 bank 16.
 908
 909
 910 All Roads Lead To Bank 16:0
 911 ---------------------------
 912
 913 The one thing I was trying to figure out from this code was: what was the
 914 failure mode that would get you out cleanly?  Because in order for the code
 915 that called here to work properly, it would have to have some kind of clean
 916 failure mode to indicate that there was no drive present at this device ID;
 917 also in my first attempts to get the firmware code to successfully run (for
 918 some value of "successfully" > 0), it would hang up somewhere in this code.
 919 And that meant, since I didn't understand the SCSI chip, that I would have to
 920 understand the SCSI chip and how it worked to have any hope of untangling the
 921 tangled mass of code here.
 922
 923 So before we take a quick look at that, let's take a look at the top level code
 924 that lives at function 0, bank 16.  At first glance, it doesn't look all that
 925 bad:
 926
 927 CC00: 8D 00 CD  STA  $CD00   ; Write to $CD00 (what does it do?)
 928 CC03: 20 D0 CD  JSR  $CDD0   ; Clear DMA bit (1) from reg. $2, init some stuff
 929 CC06: 20 CE CE  JSR  $CECE   ; Check if reg. $4 has 0, 2 (/SEL) or 4 (/I/O)
 930 CC09: B0 16     BCS  $CC21   ; If failure, skip over
 931
 932 This is pretty straightforward stuff; the routine at $CECE will set the carry
 933 flag if slot I/O register $4 is not exactly one of: 0, 2, or 4.  If the carry
 934 is set, it bypasses the following sections of code:
 935
 936 CC0B: 20 42 CF  JSR  $CF42   ; Check if bit 7 in $C893 is set (success == yes)
 937 CC0E: 20 24 CC  JSR  $CC24   ; Do Arbitration phase
 938 CC11: B0 03     BCS  $CC16   ; If Arbitration timed out, jump over Selection
 939
 940 It wasn't obvious when I first encountered this code, but, once I delved into
 941 the SCSI protocol I was able to figure out that the code at $CC24 was
 942 negotiating the Arbitration phase.
 943
 944 CC13: 20 7A CC  JSR  $CC7A   ; Do Selection phase
 945
 946 Likewise, it was not obvious that the code at $CC7A was negotiating the
 947 Selection phase--but I was able to figure out that the code could cleanly exit
 948 this bank (in a failure mode, naturally) if the BSY line was not asserted.
 949
 950 CC16: 20 58 CF  JSR  $CF58   ; Check if bit 7 in $C893 is set (success = yes)
 951 CC19: B0 06     BCS  $CC21   ; Skip over if it failed
 952
 953 Since the address at $C893 got loaded with $80 way back in function 0 in bank
 954 11, the carry flag will be clear and we will execute the following:
 955
 956 CC1B: 20 E4 CC  JSR  $CCE4   ; Do SCSI communication with target
 957 CC1E: 20 A0 CD  JSR  $CDA0   ; Do nothing if $C88F is nonzero, else check on
 958                              ; $C8EC
 959
 960 The code at $CCE4 was quite mystifying for some time, even after I had educated
 961 myself on the intricacies of the SCSI protocol and the ins and outs of the NCR
 962 53C80's ports.  I wasn't able to make sense of this until I was able to
 963 understand the phases after Selection and how they were expected to be
 964 negotiated.
 965
 966 CC21: 4C 18 CE  JMP  $CE18   ; Do some post cleanup before returning
 967
 968 The code at $CE18 basically does some error checking and cleanup before
 969 returning back to whence it came; it's fairly easy to digest.  But before we
 970 dig into subroutines of bank 16:0, we need to take a short digression into how
 971 the ports of the 53C80 work.
 972
 973
 974 A Somewhat Brief Digression Into The 53C80's Ports
 975 --------------------------------------------------
 976
 977 And so, having avoided looking into the 53C80 and how it works up until this
 978 point, we find we can no longer avoid it and thus, finally bite the bullet.
 979 The 53C80 has eight ports (also called registers) with which the Apple IIe's
 980 CPU can communicate.  They are:
 981
 982 $0 - Data on the SCSI bus
 983 $1 - Initiator Command
 984 $2 - Mode
 985 $3 - Target Command
 986 $4 - Current SCSI Bus Status (R), Select Enable (W)
 987 $5 - Bus and Status (R), Start DMA Send (W)
 988 $6 - Input Data (R), Start DMA Target Receive (W)
 989 $7 - Reset Parity/Interrupt (R), Start DMA Initiator Receive (W)
 990
 991 Note too that there is a one-to-one correspondence with the port numbers as
 992 they appear on the 53C80 and their location in the slot I/O address range.
 993 What follows is an explanation of what the registers do:
 994
 995 Register $0 is pretty much what it says it is; data on the SCSI bus will appear
 996 here barring this caveat: it only works when bit 0 of register $1 (ASSERT DATA
 997 BUS) is set.  Which brings us to...
 998
 999 Register $1 is used to monitor and assert signals on the SCSI bus.  The bits
1000 are:
1001
1002 7    6              5             4    3    2    1    0
1003 RST  AIP/TEST MODE  LA/DIFF ENBL  ACK  BSY  SEL  ATN  DATA BUS
1004
1005 RST (ReSeT) sets the RST signal on the SCSI bus and resets the internal state
1006 of the 53C80; it stays in the reset state until this bit is cleared.  AIP/TEST
1007 MODE (Arbitration In Progress) is a bit that is split between two functions:
1008 when read, it signals whether or not the Arbitration phase is in progress; when
1009 a one is written to it, it disables all output from the chip (zero restores
1010 output).  LA/DIFF ENBL (Lost Arbitration) is another split signal: when read,
1011 it signals whether or not Arbitration was lost; writing has no effect.  ACK
1012 (ACKnowledge) sets or clears the ACK line, BSY (BuSY), SEL (SELect), ATN
1013 (ATteNtion) and DATA BUS all do the same.
1014
1015 The important thing to note here is that by setting the ATN line on the SCSI
1016 bus, the initiator signals to the Target that it wants to send a message and
1017 so, at the appropriate time, the Target will then assert the MSG and C/D lines
1018 in response.
1019
1020 Register $2 controls various modes of the 53C80, as well as whether or not
1021 certain interrupts will be triggered.  The bits are:
1022
1023 7      6       5         4          3           2        1     0
1024 BLOCK  TARGET  ENABLE    ENABLE     ENABLE EOP  MONITOR  DMA   ARBITRATE
1025 MODE   MODE    PARITY    PARITY     INTERRUPT   BUSY     MODE
1026 DMA            CHECKING  INTERRUPT
1027
1028 The only two of real interest are bits 1 (DMA MODE) and 0 (ARBITRATE); the
1029 former sets the chip into DMA mode, readying it for a DMA transfer while the
1030 latter tells the chip to start the Arbitration phase.
1031
1032 Register $3 is used mainly if the chip is operating in Target mode, as all the
1033 lines controlled by it are typically only controllable by the Target device.
1034 The only exception is when the Initiator is sending data to the Target; in that
1035 case, bits 0, 1 and 2 must match the lines being asserted by the Target.  The
1036 bits are (where X means unused):
1037
1038 7               6  5  4  3    2    1    0
1039 LAST BYTE SENT  X  X  X  REQ  MSG  C/D  I/O
1040
1041 Register $4 is another split register.  When read, it returns the state of the
1042 following lines on the SCSI bus:
1043
1044 7    6    5    4    3    2    1    0
1045 RST  BSY  REQ  MSG  C/D  I/O  SEL  DBP
1046
1047 When written to, it enables an interrupt to occur if the device ID written to
1048 the SCSI bus is present, BSY is clear and SEL is set.
1049
1050 The important thing about this register is that it allows monitoring of the
1051 MSG, C/D and I/O lines of the SCSI bus.  These three bits are what the Target
1052 uses to signal moves from phase to phase; without these three bits it would be
1053 impossible, as an initiator, to figure out what to do once in the Selection
1054 phase.
1055
1056 And with three bits, you would expect there to be eight phases controlled here,
1057 but only six are controlled from these signals--having MSG set to 1 while C/D
1058 is set to 0 is an illegal combination, and that knocks two of the combinations
1059 right out of contention.  Each legal combination corresponds to a phase, and
1060 this is, as it turns out, vital information:
1061
1062 Data Out:     MSG = 0, C/D = 0, I/O = 0 (0)
1063 Data In:      MSG = 0, C/D = 0, I/O = 1 (1)
1064 Command:      MSG = 0, C/D = 1, I/O = 0 (2)
1065 Status:       MSG = 0, C/D = 1, I/O = 1 (3)
1066 Message Out:  MSG = 1, C/D = 1, I/O = 0 (6)
1067 Message In:   MSG = 1, C/D = 1, I/O = 1 (7)
1068
1069 Note that there's nothing magical about the order of these three lines; they
1070 could be in any order whatsoever and they would still work the same way.  The
1071 only reason that they are presented this way is one, this is how they are laid
1072 out in the NCR 53C80 chip (in this register in particular) and two, this is
1073 the order in which they are used in the firmware.
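
Since MSG, C/D and I/O sit in bits 4, 3 and 2 of register $4, pulling the phase
number out of a read of that register is a mask and a shift.  A sketch, with
names made up for illustration:

```python
# Bit positions in register $4 (Current SCSI Bus Status) when read.
MSG, CD, IO = 0x10, 0x08, 0x04

PHASES = {
    0: "Data Out", 1: "Data In", 2: "Command", 3: "Status",
    6: "Message Out", 7: "Message In",
}

def phase_from_bus_status(reg4):
    """Return the phase name, or None for the two illegal combinations.

    The result only means something while the Target is driving the
    bus; the other bits of the register are ignored here.
    """
    code = (reg4 & (MSG | CD | IO)) >> 2
    return PHASES.get(code)

print(phase_from_bus_status(0x68))   # BSY + REQ + C/D: Command
```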
1074
1075 Register $5 is--you guessed it--another split register.  When read, it returns
1076 some internal state registers as well as a couple more SCSI bus lines:
1077
1078 7       6        5        4       3      2      1    0
1079 END OF  DMA      PARITY   IRQ     PHASE  BUSY   ATN  ACK
1080 DMA     REQUEST  ERROR    ACTIVE  MATCH  ERROR
1081
1082 When written to, it initiates a DMA send transfer from memory to the SCSI bus.
1083
1084 Register $6, another split register, when read, holds data coming from the SCSI
1085 bus during a DMA transfer.  When written to, it initiates a DMA receive
1086 transfer from the SCSI bus (the Target) to memory.
1087
1088 And finally, register $7 is yet another split register, that when read, resets
1089 the internal PARITY ERROR, IRQ ACTIVE and BUSY ERROR bits in register $5; when
1090 written to, it initiates a DMA receive transfer from the SCSI bus (the
1091 Initiator) to memory.
1092
1093
1094 Back To Bank 16
1095 ---------------
1096
1097 So, with that info-dump out of the way, let's return back to the first
1098 subroutine of the initial code of bank 16:0.  We start with the routine at
1099 $CC24:
1100
1101 CC24: 9E 63 C0  STZ  $C063,X ; Zero reg $3 (Target Command)
1102 CC27: 20 2F CF  JSR  $CF2F   ; Toggle bit 7 of reg. $E (ON-off-ON)
1103 CC2A: AD DA C8  LDA  $C8DA   ; Get SCSI ID of initiator device
1104 CC2D: 9D 60 C0  STA  $C060,X ; & put it in reg. $0 (Output Data)
1105 ;
1106 CC30: 9E 62 C0  STZ  $C062,X ; Zero out reg. $2 (Mode)
1107 CC33: A9 01     LDA  #$01
1108 CC35: 9D 62 C0  STA  $C062,X ; Set bit 0 (ARBITRATE) of reg. $2
1109
1110 This code zeroes out the Target Command register, then toggles bit 7 of
1111 register $E on, then off, then back on.  It then puts the SCSI ID of the
1112 initiator device into the SCSI Data Bus register, then clears and sets the
1113 ARBITRATE bit of the Mode register.  This starts the Arbitration phase.
1114
1115 CC38: BD 6C C0  LDA  $C06C,X ; Get reg. $C
1116 CC3B: 89 10     BIT  #$10    ; Check bit 4
1117 CC3D: D0 05     BNE  $CC44   ; Skip over this if it's set
1118 CC3F: 20 0C CF  JSR  $CF0C   ; Toggle bit 7 of register $E ON-off-ON
1119                              ; # of times before C is set is in $C817/8
1120 CC42: B0 2E     BCS  $CC72   ; Signal failure if C is set
1121
1122 There is a lot of this code and variants thereof sprinkled liberally throughout
1123 the firmware code.  I'm still not sure what bit 4 of register $C is a signal
1124 for, but it seems clear that it indicates some kind of error condition because
1125 whenever it's not set, it toggles bit 7 of register $E and will eventually,
1126 when this has happened enough times, signal an error and exit.
1127
1128 CC44: 3C 61 C0  BIT  $C061,X ; Check bit 6 (AIP) of reg. $1
1129 CC47: 50 E7     BVC  $CC30   ; Try again if it's not set
1130
1131 This little bit of code checks the AIP (Arbitration In Progress) bit, and loops
1132 back to try again if it's not set.
1133
1134 CC49: EA        NOP          ; Do a small delay
1135 CC4A: EA        NOP
1136 CC4B: A9 20     LDA  #$20
1137 CC4D: 3D 61 C0  AND  $C061,X ; Check if bit 5 (LA) of reg. $1 is set
1138 CC50: D0 DE     BNE  $CC30   ; Try again if it's set
1139
1140 After checking to see if the AIP bit is set, it then waits a short amount of
1141 time before checking to see if the LA (Lost Arbitration) bit is set; if it's
1142 set, it loops back to try again.
1143
1144 CC52: BD 60 C0  LDA  $C060,X ; Get reg. $0
1145 CC55: 4D DA C8  EOR  $C8DA   ; EOR it with what we put there to begin with
1146 CC58: F0 05     BEQ  $CC5F   ; If it's the same, bypass (we won arbitration)
1147 CC5A: CD DA C8  CMP  $C8DA   ; Otherwise, see if the EORed value is >= orig
1148 CC5D: B0 D1     BCS  $CC30   ; Try again if so
1149
1150 Here we look at the data on the SCSI bus and see if there were any other
1151 devices attempting to arbitrate at the same time.  If there were, and their
1152 SCSI ID was higher than ours, then loop back and try again; otherwise, we won
1153 arbitration and continue on:
1154
1155 CC5F: A9 20     LDA  #$20
1156 CC61: 3D 61 C0  AND  $C061,X ; Check if bit 5 (LA) of reg. $1 is set
1157 CC64: D0 CA     BNE  $CC30   ; Try again if so
1158
1159 We check the LA bit one more time to ensure it's not set; if it is, then loop
1160 back and try again.
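
The comparison at $CC52-$CC5D is compact enough that it's worth restating in
a higher level form.  This sketch mirrors only those four instructions; the
function name is made up, and own_id_bit stands in for the value held in
$C8DA.

```python
def won_arbitration(bus_value, own_id_bit):
    """Return True if the firmware would carry on to $CC5F.

    bus_value is what register $0 reads back; own_id_bit is the single
    bit the Initiator put on the bus.
    """
    others = bus_value ^ own_id_bit      # EOR $C8DA: strip our own bit
    if others == 0:                      # BEQ: nobody else is contending
        return True
    return others < own_id_bit           # CMP $C8DA / BCS: retry if >=

print(won_arbitration(0x81, 0x80))       # True: the other ID is lower
print(won_arbitration(0x84, 0x04))       # False: the other ID is higher
```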
1161
1162 CC66: A9 06     LDA  #$06    ; Set bits 1-2 (ASSERT /ATN, /SEL) of reg. $1
1163 CC68: 1D 61 C0  ORA  $C061,X
1164 CC6B: 29 9F     AND  #$9F    ; And clear bits 5-6 (TEST MODE, DIFF ENBL) of $1
1165 CC6D: 9D 61 C0  STA  $C061,X
1166 CC70: 18        CLC          ; Signal success
1167 CC71: 60        RTS          ; & return
1168
1169 Now that we've won the Arbitration phase, we assert the ATN and SEL lines and
1170 make sure that the TEST MODE and DIFF ENBL lines are dropped.  By setting the
1171 ATN line, we signal to the Target that we want to go to the Message Out phase
1172 after the Selection phase is done.  Once that's done, we signal success and
1173 return.
1174
1175 CC72: A9 80     LDA  #$80
1176 CC74: 8D 8F C8  STA  $C88F
1177 CC77: 4C 91 CD  JMP  $CD91   ; Signal failure
1178
1179 This bit is called if the code that checks register $C fails; this is the only
1180 failure path for the Arbitration phase code.
1181
1182
1183 A Fine SELECTion Of Devices
1184 ---------------------------
1185
1186 Now that the Initiator (us) has won the Arbitration phase, it's time to see if
1187 the device we want to talk to exists, and is ready and able to talk.
1188
1189 CC7A: 9E 64 C0  STZ  $C064,X ; Zero out reg. $4 (Select Enable)
1190 CC7D: AD DA C8  LDA  $C8DA   ; Host ID
1191 CC80: 0D DB C8  ORA  $C8DB   ; Target ID
1192 CC83: 9D 60 C0  STA  $C060,X ; Store $C8DA & DB (ORed) into reg. $0 (Data Bus)
1193 CC86: A9 41     LDA  #$41    ; Set bits 0 (DATA BUS) & 6 (TEST MODE) in reg. $1
1194 CC88: 1D 61 C0  ORA  $C061,X ; Then clear bits 5-6 (DIFF ENBL, TEST MODE) in $1
1195 CC8B: 29 9F     AND  #$9F
1196 CC8D: 9D 61 C0  STA  $C061,X
1197
1198 The code here clears the Select Enable register to ensure no IRQs are generated
1199 during the Select phase, then puts both the Initiator's SCSI ID and the
1200 Target's SCSI ID into the 53C80's data register.  It then does something that
1201 looks odd: it ORs DATA BUS ENABLE and TEST MODE into the value read from
1202 register $1.  The former puts the 53C80's data register onto the SCSI data bus,
1203 while the latter would disable all outputs of the 53C80.  But the AND #$9F that
1204 follows clears bits 5-6 before the STA, so TEST MODE never reaches the chip.
1205 Why the firmware sets a bit it immediately masks off is unclear; there's no way
1206 to know for sure without access to actual hardware.
1207
1208 The net effect is that DATA BUS ENABLE is set with the outputs of the 53C80
1209 still enabled, and thus both SCSI IDs are visible to all the devices
1210 connected to the SCSI bus.
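
Working the read-modify-write at $CC86-$CC8D through by hand shows which bits
actually get written back to register $1.  The following C sketch simply
mirrors the three instructions; the function name is made up for the example,
and the readback value is passed in as a parameter because what the real chip
returns at that moment is not something this sketch can vouch for.

```c
#include <stdint.h>

/* Hypothetical mirror of LDA #$41 / ORA $C061,X / AND #$9F: returns the
   value that the final STA $C061,X would write to register $1. */
static uint8_t select_icr_value(uint8_t readback)
{
    uint8_t a = 0x41;        /* bits 0 (DATA BUS) and 6 (TEST MODE)  */
    a |= readback;           /* merge in whatever reg. $1 reads back */
    a &= 0x9F;               /* bits 5-6 are cleared before storing  */
    return a;
}
```

For a readback of $06 (ATN and SEL asserted) this yields $07: DATA BUS is
added, and bit 6 is clear in the value written no matter what was read back.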
1211
1212 CC90: A9 FE     LDA  #$FE    ; Clear bit 0 (ARBITRATE) in reg. $2
1213 CC92: 3D 62 C0  AND  $C062,X
1214 CC95: 9D 62 C0  STA  $C062,X
1215 CC98: A9 02     LDA  #$02    ; Set bit 1 (ASSERT /ATN) in reg. $1
1216 CC9A: 1D 61 C0  ORA  $C061,X
1217 CC9D: 9D 61 C0  STA  $C061,X
1218 CCA0: AD DC C8  LDA  $C8DC   ; Get $C8DC, set hi bit, save in $C821
1219 CCA3: 09 80     ORA  #$80
1220 CCA5: 8D 21 C8  STA  $C821
1221 CCA8: A9 F7     LDA  #$F7    ; Clear bit 3 (ASSERT /BSY) in reg. $1
1222 CCAA: 3D 61 C0  AND  $C061,X
1223 CCAD: 9D 61 C0  STA  $C061,X
1224
1225 This is all pretty straightforward stuff.  It clears the ARBITRATE bit, sets
1226 ASSERT /ATN in register $1 again, and clears BSY (if it was set before; more
1227 likely than not, it will have been cleared already).  It also sets bit 7 of
1228 $C8DC and saves it in $C821, but it's not clear just why yet.
1229
1230 CCB0: 20 51 CD  JSR  $CD51   ; Wait for bit 6 (/BSY) of reg. $4 to be set
1231 CCB3: 90 03     BCC  $CCB8   ; Skip over JSR if success
1232 CCB5: 20 75 CD  JSR  $CD75   ; Shorter wait for bit 6 in reg. $4 to be set
1233
1234 This bit of code waits for the Target to assert the BSY line; if it fails after
1235 the first attempt, it will try again with a shorter wait time.
1236
1237 CCB8: A9 FB     LDA  #$FB    ; Clear bit 2 (ASSERT /SEL) in reg. $1
1238 CCBA: 3D 61 C0  AND  $C061,X
1239 CCBD: 9D 61 C0  STA  $C061,X
1240 CCC0: 90 10     BCC  $CCD2   ; Skip over if the JSR was successful
1241
1242 This code drops the SEL line, and depending on whether or not the Target
1243 asserted the BSY line, will either drop through to the failure path or skip
1244 over to the success path.
1245
1246 CCC2: A9 FE     LDA  #$FE    ; Clear bit 0 (DATA BUS) in reg. $1
1247 CCC4: 3D 61 C0  AND  $C061,X
1248 CCC7: 9D 61 C0  STA  $C061,X
1249 CCCA: A9 81     LDA  #$81    ; Put $81 in $C88F
1250 CCCC: 8D 8F C8  STA  $C88F
1251 CCCF: 4C 91 CD  JMP  $CD91   ; Signal failure
1252
1253 This is the only failure path in the Selection phase code, but, unlike the
1254 Arbitration phase code, this code path will *not* lock up waiting for signals.
1255 It will wait only so long for the Target to assert the BSY line before giving
1256 up and signalling failure.  It will also bail out of this bank completely, so
1257 it will not try any further communication--for now.
1258
1259 CCD2: A9 9D     LDA  #$9D    ; Clear bits 1, 5-6 (ATN, DIFF E., TEST) in $1
1260 CCD4: 3D 61 C0  AND  $C061,X
1261 CCD7: 9D 61 C0  STA  $C061,X
1262 CCDA: A9 FE     LDA  #$FE    ; Then clear bit 0 (DATA BUS) in $1
1263 CCDC: 3D 61 C0  AND  $C061,X
1264 CCDF: 9D 61 C0  STA  $C061,X
1265 CCE2: 18        CLC          ; Signal success
1266 CCE3: 60        RTS          ; & return
1267
1268 Otherwise, the code clears ASSERT /ATN, DIFF ENBL and TEST MODE before clearing
1269 DATA BUS, signalling success and returning.
1270
1271
1272 The Next Part, In Which We Find Ourselves In A Maze Of Twisty Code
1273 ------------------------------------------------------------------
1274
1275 Now that we've successfully navigated the Selection phase, it's time to talk
1276 SCSI.  For the sake of brevity, we will refer to this code as The Code That
1277 Comes After Selection, or TCTCAS for short.  This bit of code calls a bunch of
1278 other code which in turns calls even more code; keeping it all straight was
1279 quite the challenge.
1280
1281 CCE4: BD 6C C0  LDA  $C06C,X ; Get $C
1282 CCE7: 89 10     BIT  #$10    ; Is bit 4 set?
1283 CCE9: D0 05     BNE  $CCF0   ; Skip ahead if so
1284 CCEB: 20 0C CF  JSR  $CF0C   ; Else, toggle bit 7 of $E (ON-off-ON) w/countdown
1285 CCEE: B0 40     BCS  $CD30   ; Exit if countdown hit zero
1286
1287 Here again we see the boilerplate checking of bit 4 of register $C.
1288
1289 CCF0: BD 64 C0  LDA  $C064,X ; Get reg. $4
1290 CCF3: 29 42     AND  #$42    ; Are bits 1 (/SEL) & 6 (/BSY) clear?
1291 CCF5: F0 3A     BEQ  $CD31   ; If so, we're done (jump down, signal error)
1292
1293 Here we're checking the BSY and SEL lines; if both have been dropped after the
1294 last phase, we jump down to $CD31 and do some final checking before exiting.
1295
1296 CCF7: C9 40     CMP  #$40    ; Is only bit 6 (/BSY) set?
1297 CCF9: D0 E9     BNE  $CCE4   ; Loop back if not...
1298
1299 The second check looks to see if only BSY is set; if not, it loops back to the
1300 start of this subroutine; otherwise it continues on:
1301
1302 CCFB: BD 62 C0  LDA  $C062,X ; Clear bit 1 (DMA MODE) of reg. $2
1303 CCFE: A8        TAY
1304 CCFF: 29 FD     AND  #$FD
1305 CD01: 9D 62 C0  STA  $C062,X
1306 CD04: 98        TYA          ; Then restore its previous state
1307 CD05: 1D 62 C0  ORA  $C062,X
1308 CD08: 9D 62 C0  STA  $C062,X
1309
1310 This little bit of code toggles the DMA MODE bit off then on if it was set to
1311 begin with; otherwise it does nothing.  Well, it doesn't *do* nothing, but the
1312 effect is null and void.
1313
1314 CD0B: BD 64 C0  LDA  $C064,X ; Is bit 5 (/REQ) of reg. $4 clear?
1315 CD0E: A8        TAY
1316 CD0F: 29 20     AND  #$20
1317 CD11: F0 D1     BEQ  $CCE4   ; Loop back if so...
1318
1319 This checks to see if the REQ line has been asserted by the Target yet, and if
1320 not, loops back to the beginning of the subroutine.
1321
1322 CD13: AD 1F C8  LDA  $C81F   ; Save $C81F in $C820 (last 3-bit pattern we saw)
1323 CD16: 8D 20 C8  STA  $C820
1324
1325 Here we save the last phase that was seen in $C820.
1326
1327 CD19: 98        TYA          ; Restore reg. $4 from Y
1328 CD1A: 29 1C     AND  #$1C    ; Keep only bits 2-4 (/I/O, /C/D, /MSG)
1329 CD1C: 8D 1F C8  STA  $C81F   ; & save in $C81F
1330
1331 Earlier we saved the contents of register $4 (which holds the MSG, C/D and I/O
1332 bits) in the Y register; now we retrieve it, mask off everything except the
1333 MSG, C/D and I/O bits, and save the result in $C81F.  Together with the copy
1334 just made in $C820, this lets later code compare the current phase against the
1335 previous one; as it turns out, the two are required to be different.
1336
1337 As to why: when I first encountered this code, I approached it the way I
1338 usually approach unknown code: by feeding it zeroes.  However, when I did that,
1339 these lines of code caused a failure mode later on.  And so I had to dig a
1340 little deeper into all things SCSI and 53C80 to figure out why--we'll see why
1341 that caused a failure later on.
1342
1343 CD1F: 4A        LSR  A
1344 CD20: 8D 2B C8  STA  $C82B   ; & put /2 in $C82B
1345
1346 Here we shift it right one bit and stuff it into $C82B; this is also a clever
1347 way of making it into an index for a jump table.
1348
1349 CD23: A8        TAY          ; & use as index into jump table
1350 CD24: 4A        LSR  A       ; & /2 again
1351 CD25: 9D 63 C0  STA  $C063,X ; Write it to reg. $3 (Target Command)
1352
1353 Here we put it into the Y register and then shift it to the right one more time
1354 to set the bits in the Target Command register properly.  The Initiator needs
1355 to set this register properly at each phase change, otherwise the 53C80 will
1356 signal a phase match error.
1357
1358 CD28: 20 48 CD  JSR  $CD48   ; Use Y as idx to jump table and go there
1359
1360 So here the code uses the three phase bits (MSG, C/D and I/O) as an index into
1361 a jump table to handle the six phases after the Selection phase (Data Out, Data
1362 In, Command, Status, Message Out, Message In).  We'll have more to say about
1363 this shortly.
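
As a rough illustration of the bit shuffling at $CD19-$CD25, here is the same
arithmetic in C.  The struct, the function and the phase name table are all
invented for this sketch, and it assumes the jump table holds two-byte
entries.  The names follow the usual SCSI convention for the eight MSG, C/D,
I/O combinations, two of which are reserved.

```c
#include <stdint.h>

/* Hypothetical container for the three values the firmware derives. */
typedef struct {
    uint8_t phase_bits;    /* status & $1C, what lands in $C81F   */
    uint8_t table_offset;  /* >> 1, what lands in $C82B and in Y  */
    uint8_t tcr_value;     /* >> 2, what is written to reg. $3    */
} PhaseInfo;

static const char * const phase_names[8] = {
    "Data Out", "Data In", "Command", "Status",
    "(reserved)", "(reserved)", "Message Out", "Message In"
};

static PhaseInfo decode_phase(uint8_t bus_status)
{
    PhaseInfo p;
    p.phase_bits   = bus_status & 0x1C;   /* keep MSG, C/D, I/O       */
    p.table_offset = p.phase_bits >> 1;   /* assumed 2 bytes/entry    */
    p.tcr_value    = p.phase_bits >> 2;   /* MSG = 4, C/D = 2, I/O = 1 */
    return p;
}
```

A status of $78 (BSY, REQ, MSG and C/D set) decodes to a table offset of $0C
and a Target Command value of 6, and phase_names[6] is Message Out.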
1364
1365 CD2B: 2C 06 C8  BIT  $C806   ; Is bit 7 of $C806 clear?
1366 CD2E: 10 B4     BPL  $CCE4   ; Loop back if so...
1367 CD30: 60        RTS
1368
1369 This simply checks bit 7 of $C806, which only gets set under very specific
1370 circumstances; those being that MSG, C/D and I/O are all asserted (Message In
1371 phase), and that the value returned from the Target is a "Good" message, and
1372 that the prior phase was either Message In, Message Out, or Status.
1373
1374 CD31: AD 8F C8  LDA  $C88F   ; Get $C88F
1375 CD34: D0 08     BNE  $CD3E   ; If $C88F is != 0, just return
1376 CD36: A9 82     LDA  #$82    ; Stuff $82 into $C88F
1377 CD38: 8D 8F C8  STA  $C88F
1378 CD3B: 4C 91 CD  JMP  $CD91   ; Signal failure (?) & return
1379 CD3E: 80 F0     BRA  $CD30
1380
1381 This is the code path taken if the BSY and SEL lines are dropped.  It signals
1382 that something went wrong before returning.
1383
1384
1385 The Next Part, In Which Things Start To Make Sense
1386 --------------------------------------------------
1387
1388 So TCTCAS is, as it turns out, where the Target drives the Initiator; which in
1389 this case is the hard drive driving the card.  As I mentioned up above, when I
1390 first started poking around at this code, I was feeding it zeroes as a place
1391 to start, to see if I could get it to do something meaningful.  However, when
1392 you try that, you run into the following bit of code which says, "No,
1393 fuggetaboutit."
1394
1395 CEE5: AD 1F C8  LDA  $C81F   ; Get the current MSG, C/D, I/O values
1396 CEE8: CD 20 C8  CMP  $C820   ; Compare it to the previous values
1397 CEEB: D0 05     BNE  $CEF2   ; If they're different, skip over
1398 CEED: A9 27     LDA  #$27    ; (This is ignored by the jump target)
1399 CEEF: 4C 6C CE  JMP  $CE6C   ; Else, do a soft, then a hard reset of the card
1400 CEF2: ...
1401
1402 And so, after looking over the SCSI documentation for the umpteenth time, I
1403 realized that what it was saying is that you can't do a Data Out phase directly
1404 after the Selection phase; it has to be Something Else.  And this is because
1405 $C81F gets initialized with zero (which corresponds to the Data Out
1406 phase)--which means starting with zero Won't Work.
1407
1408 As luck would have it, however, we know that in the Selection phase, it
1409 asserted the ATN line, which in turn tells the Target to assert the MSG and C/D
1410 lines (but not I/O).  Which means that we *know* that the Target will first go
1411 to the Message Out phase, every time.
1412
1413 And so, by writing the hard drive emulator to properly respond to the MSG, C/D
1414 and I/O lines I got it to handshake the Message Out phase properly.  But I
1415 could see that after that, it wasn't exiting; it was running through another
1416 round of seeing what was in MSG, C/D and I/O and running the appropriate
1417 handler.
1418
1419 Now I was a bit stuck here, as there was *no* documentation on how a Target
1420 device, such as a hard drive, would drive the handshaking for the Initiator
1421 device.  And it wasn't clear what phase the firmware was expecting to come
1422 next, so guessing wasn't likely to yield positive results.
1423
1424 So, by the serendipitous luck of the Search Engine gods, I stumbled upon a page
1425 which looked like a scan of a book mixed with some bespoke images made by
1426 someone whose primary language was not English.  One of the images, which had
1427 misaligned text set next to it, was, however, suggestive.  It showed a sequence
1428 of phases that went from Bus Free to Arbitration to Selection to Message Out to
1429 Command to Data In to Status to Message In to Bus Free.  This was the first
1430 time I had seen anything like this; in all of the SCSI literature that I had
1431 surveyed, there was nothing beyond the vaguest hints that there was a typical
1432 order to the phases.  Sure, they would say that one *could* go from one phase
1433 to another, and how the handshaking worked, but there was *nothing* saying that
1434 there was a definite order to the phases that should be observed.
1435
1436 So, as I said, this image was highly suggestive.  Could this be the key to the
1437 whole thing that I was missing?
1438
1439 I had set things up in the hard drive emulation to go to the Message Out phase
1440 after the Selection phase, and so I added code to go to the Command phase after
1441 that.  I could see that the firmware was sending something in the Command phase
1442 at this point, which was the following six bytes: 00 00 00 00 00 00.  And
1443 looking that up in the SCSI literature showed that to be the TEST UNIT READY
1444 command.  But the firmware was still looking for more.
1445
1446 From what I saw in the logs, it didn't look like it was going for a Data In
1447 phase next, so I set it up to go to the Status phase, and that got things going
1448 a little bit further.  To me, this looked like it should be the end of the
1449 dance, but the firmware was *still* looking for more.
1450
1451 But even though a byte was sent from the Target to the Initiator during the
1452 Status phase, it seemed that the Status response was actually sent in the
1453 Message In phase.  Once I had coded this into the hard drive emulation, I could
1454 see the TEST UNIT READY command going into TCTCAS and coming out of it in a
1455 non-failure mode.
1456
1457 The dance has steps, and they must be followed in order.
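
For the emulated Target, the order of phases described above can be captured
in a small state machine.  This is a sketch only: the enum, the function and
its has_data_in parameter are made up for the example, and it covers just the
simple sequence seen here (no disconnects, no Data Out).

```c
#include <stdbool.h>

typedef enum {
    PH_MESSAGE_OUT, PH_COMMAND, PH_DATA_IN,
    PH_STATUS, PH_MESSAGE_IN, PH_BUS_FREE
} Phase;

/* Hypothetical: which phase the Target should present next.
   has_data_in is true for commands that return data to the Initiator
   and false for ones that don't (such as TEST UNIT READY). */
static Phase next_phase(Phase current, bool has_data_in)
{
    switch (current) {
    case PH_MESSAGE_OUT: return PH_COMMAND;
    case PH_COMMAND:     return has_data_in ? PH_DATA_IN : PH_STATUS;
    case PH_DATA_IN:     return PH_STATUS;
    case PH_STATUS:      return PH_MESSAGE_IN;
    case PH_MESSAGE_IN:  return PH_BUS_FREE;
    default:             return PH_BUS_FREE;
    }
}
```

Note that no phase ever follows itself in this table, which is exactly what
the firmware's check at $CEE5 insists on.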
1458
1459
1460 Dancing In The Dark
1461 -------------------
1462
1463 However, something was still not quite right; my assumption--that all the
1464 firmware needed to do to see if there was a drive on the bus was to probe
1465 through to the Selection phase and then, if anything responded, to see if it
1466 successfully responded to the TEST UNIT READY command--turned out to be wrong.
1467 How wrong?  Let's take a look back at the code in bank 3:0 which attempts to
1468 enumerate all devices it can see on the SCSI bus:
1469
1470 CC55: A0 07     LDY  #$07
1471 CC57: 8C 73 C8  STY  $C873   ; Save Y in $C873
1472 CC5A: 9C DC C8  STZ  $C8DC   ; Zero out $C8DC
1473 CC5D: B9 F4 CF  LDA  $CFF4,Y ; Get SCSI ID from table into A
1474 CC60: CD DA C8  CMP  $C8DA   ; Compare it to our SCSI ID (default is $01)
1475 CC63: F0 1F     BEQ  $CC84   ; Skip over if it's equal (don't query our SCSI ID)
1476
1477 So here it's looping through all eight SCSI IDs, starting with the highest
1478 priority and working its way down to the lowest (for reference, the table at
1479 $CFF4 has the following values: $01, $02, $04, $08, $10, $20, $40, $80).  It
1480 compares the SCSI ID from the table to the SCSI ID of the card, and skips over
1481 the following code (down to $CC84) if it's the same.
1482
1483 CC65: 8D DB C8  STA  $C8DB   ; Else, put SCSI ID to look at in $C8DB
1484 CC68: 64 4F     STZ  $4F     ; Zero out $4F (error flag)
1485 CC6A: 20 5F CF  JSR  $CF5F   ; Do TEST UNIT READY (calls bank 16:0)
1486
1487 This is the code that I was now able to successfully navigate with my hard
1488 drive emulation.  It emulated exactly one SCSI ID, and that one ID returned
1489 here successfully (every other ID, obviously with nothing connected to the bus,
1490 returned failure).  However, I could see from the log file that it was trying
1491 to issue some more commands--which was puzzling, but told me that I needed to
1492 dig even deeper into the code.
1493
1494 CC6D: A5 4F     LDA  $4F     ; Get error code
1495 CC6F: D0 0F     BNE  $CC80   ; Skip over if error occurred
1496
1497 This is fairly straightforward; it checks the error code returned from the call
1498 we made to bank 16:0, and if it's anything but zero, skip over the following
1499 code:
1500
1501 CC71: EE 0D C8  INC  $C80D   ; Success means add one to $C80D (# of devices)
1502 CC74: 20 9F CC  JSR  $CC9F   ; & call Function 1 in this bank (INQUIRY + MORE)
1503 CC77: 90 0B     BCC  $CC84   ; Check next ID if C == 0
1504
1505 So here we increment a counter, which we suppose to be a count of the number of
1506 valid devices we have found on the SCSI bus.  And here, we come to the
1507 realization that it isn't just hard drives that can talk to the Apple High
1508 Speed SCSI card, it's also printers, scanners, tape drives and whatnot.  And
1509 so, it makes perfect sense that TEST UNIT READY is only the first step in
1510 discovering if a device is a hard drive or not because here, it calls function
1511 1 of bank 3 (the bank we're currently in) which is what issues more commands to
1512 figure out what the device it's talking to actually *is*.
1513
1514 CC79: A9 99     LDA  #$99    ; Else, stuff $99 into $C887
1515 CC7B: 8D 87 C8  STA  $C887
1516 CC7E: 80 17     BRA  $CC97   ; & signal success
1517
1518 So if the call to $CC9F (INQUIRY + MORE) returned with the carry flag set, it
1519 stuffs a magic number into $C887, signals success and returns.
1520
1521 CC80: C9 80     CMP  #$80    ; Was error $80?
1522 CC82: F0 16     BEQ  $CC9A   ; Signal NoDrive error if so
1523
1524 This is where it lands if the TEST UNIT READY call returned a non-zero result
1525 in the "error code" memory location.  If it equals $80, it returns the ProDOS
1526 "NoDrive" error; any other error falls through to $CC84 to try the next ID.
1527
1528 CC84: AC 73 C8  LDY  $C873   ; Restore Y
1529 CC87: 88        DEY          ; Done looking at all IDs?
1530 CC88: 10 CD     BPL  $CC57   ; Go back if not.
1531
1532 Here we decrement the counter and loop back if we haven't looked at all eight
1533 (except for the card's) SCSI IDs.  Otherwise, we've finished, and fall through
1534 to the following:
1535
1536 CC8A: A9 77     LDA  #$77    ; Else, stuff $77 into $C80A & $C887
1537 CC8C: 8D 0A C8  STA  $C80A
1538 CC8F: 8D 87 C8  STA  $C887
1539 CC92: AD 0D C8  LDA  $C80D   ; Did we find any devices?
1540 CC95: F0 03     BEQ  $CC9A   ; Signal NoDrive if not
1541 CC97: 64 4F     STZ  $4F     ; Else, signal success
1542 CC99: 60        RTS          ; & return
1543
1544 So here it stuffs the magic number $77 into $C887 and $C80A; it also checks the
1545 "number of devices found" memory location, and signals a "NoDrive" error if the
1546 count is equal to zero.
1547
1548 CC9A: A9 28     LDA  #$28    ; Return $28 (NoDrive) in $4F
1549 CC9C: 85 4F     STA  $4F
1550 CC9E: 60        RTS
1551
1552 This is the landing location for the various failure modes seen up above; it
1553 simply puts the ProDOS "NoDrive" error into the error flag and returns.
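
Pulled together, the control flow of the enumeration loop looks roughly like
the following C.  Everything here is a paraphrase for illustration: the
function names, the callback type and the demo probe are hypothetical, the
magic numbers stored in $C887 and $C80A are left out, and the follow-up call
to $CC9F is reduced to a comment.

```c
#include <stdint.h>

#define ERR_NONE     0x00
#define ERR_NODRIVE  0x28   /* ProDOS "NoDrive", as returned in $4F */

/* Hypothetical probe: returns the error code left in $4F by the
   TEST UNIT READY call for the given one-bit ID mask. */
typedef uint8_t (*ProbeFn)(uint8_t id_mask);

static uint8_t enumerate_bus(uint8_t host_mask, ProbeFn probe, int *found)
{
    *found = 0;
    for (int y = 7; y >= 0; y--) {            /* LDY #$07 ... DEY/BPL  */
        uint8_t id_mask = (uint8_t)(1u << y); /* table at $CFF4        */
        if (id_mask == host_mask)
            continue;                         /* don't query ourselves */

        uint8_t err = probe(id_mask);
        if (err == ERR_NONE)
            (*found)++;                       /* INC $C80D, then $CC9F */
        else if (err == 0x80)
            return ERR_NODRIVE;               /* $CC80: give up now    */
    }
    return *found ? ERR_NONE : ERR_NODRIVE;
}

/* Demo probe for the example: one device at ID 6, nothing elsewhere. */
static uint8_t demo_probe(uint8_t id_mask)
{
    return (id_mask == 0x40) ? ERR_NONE : 0x81;
}
```

Calling enumerate_bus($01, demo_probe, &n) returns ERR_NONE with n set to 1;
with no device answering at all, it returns ERR_NODRIVE instead.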
1554
1555 So now I get to figure out what the commands are in that call to 3:1 that are
1556 causing the card to return in a failure mode.
1557
1558
1559 The Test Is Easy, When You Have The Answer Key
1560 ----------------------------------------------
1561
1562 At this point, even though I had the hard drive emulation doing a proper dance
1563 through the TEST UNIT READY command, it was in a very crude state and couldn't
1564 really do anything else.  And so I had to take a closer look at the seemingly
1565 impenetrable code that set up a bunch of memory locations before calling bank
1566 16:0 to see if I could make sense of it.
1567
1568 Rather than go through every last one, I will go through part of the first such
1569 piece of code, as it's instructive:
1570
1571 CD0E: 20 A4 CF  JSR  $CFA4   ; Set $60/1 to $C923, $56/7 to $C92F
1572 CD11: 20 B9 CF  JSR  $CFB9   ; Put $C9C3 into $C92F/30, zero $C931
1573 CD14: A9 12     LDA  #$12    ; Put $12 into $C923
1574 CD16: 8D 23 C9  STA  $C923
1575 CD19: 9C 24 C9  STZ  $C924   ; Zero out $C924-6, $C928
1576 CD1C: 9C 25 C9  STZ  $C925
1577 CD1F: 9C 26 C9  STZ  $C926
1578 CD22: 9C 28 C9  STZ  $C928
1579 CD25: A9 1E     LDA  #$1E    ; Put $1E in $C927, $C933 (length of reply, 30)
1580 CD27: 8D 27 C9  STA  $C927
1581 CD2A: 8D 33 C9  STA  $C933
1582
1583 So we can see right off the bat that it's setting up zero page locations $60
1584 and $61 to point to memory at $C923, and that it sets up six bytes at that
1585 location with the following:
1586
1587 C923: 12 00 00 00 1E 00
1588
1589 Reaching back to our crash course on SCSI commands, we can see by the first
1590 byte, since the top three bits are all zero, that this must be a six-byte
1591 command.  And after that, uh, well, we don't really know much of anything.  So
1592 after digging around some more for something even remotely relevant, I found a
1593 document dealing with SCSI-2 and SCSI-3 hard disk interfacing--which told me,
1594 first of all, that $12 was the INQUIRY command, and second, that the fifth byte
1595 in the command was the length of the message that the Initiator was expecting
1596 back from the target in response to this command.  Progress!
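
To make the layout concrete, here is a small C sketch that builds the same six
bytes and derives the command length from the top three bits of the opcode.
The function names are invented for the example, and the length table covers
only groups 0, 1, 2 and 5; anything else is reported as unknown rather than
guessed at.

```c
#include <stdint.h>
#include <string.h>

/* Length of a command from its group code (top three bits of byte 0).
   Only the common groups are covered; 0 means "not handled here". */
static int cdb_length(uint8_t opcode)
{
    switch (opcode >> 5) {
    case 0:         return 6;
    case 1: case 2: return 10;
    case 5:         return 12;
    default:        return 0;
    }
}

/* Hypothetical helper: fill in a six-byte INQUIRY command. */
static void build_inquiry(uint8_t cdb[6], uint8_t alloc_len)
{
    memset(cdb, 0, 6);
    cdb[0] = 0x12;        /* INQUIRY opcode                       */
    cdb[4] = alloc_len;   /* fifth byte: how much reply we accept */
}
```

Calling build_inquiry(cdb, 0x1E) yields 12 00 00 00 1E 00, matching what the
firmware places at $C923.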
1597
1598 CD2D: 20 CB CF  JSR  $CFCB   ; Call bank 16:0 (Do INQUIRY command)
1599 CD30: A5 4F     LDA  $4F
1600 CD32: F0 05     BEQ  $CD39   ; Skip over if no error
1601
1602 And this, as we now know, does the phase to phase dance from start to finish,
1603 and checks the resulting error code to do any necessary error handling.  But
1604 what of the response?  How do we know what to say from our emulated hard disk
1605 back to the firmware?  The hard disk interface document had something that
1606 looked plausible, if overlong (it seems that latter day SCSI drives are
1607 expected to return 148 bytes instead of 30).  So I expected that I could adapt
1608 that to suit the purposes of the emulation.
1609
1610 It was obvious that I had to write code to handle more than just the TEST UNIT
1611 READY command, and that it had to be able to send and receive data over the
1612 SCSI bus, which it, in its current state, couldn't do.  Eventually I was able
1613 to get that working and I could see that the firmware was successfully
1614 negotiating the INQUIRY command *and* coming to the conclusion that it was
1615 talking to a hard disk.  More progress!
1616
1617 And, as it turns out, this first call in bank 3:1 is what determines what the
1618 device we're talking to actually is, and it sets up appropriate memory
1619 locations to signal that to other parts of the firmware.  This is another one
1620 of those places where the "Technical Manual for the Apple SCSI Card" had a
1621 useful tidbit, namely a small table that looked something like this:
1622
1623 Code  Device Type
1624 ------------------------------
1625 $03   Nonspecific SCSI
1626 $05   CD-ROM
1627 $06   Direct-access tape drive
1628 $07   Hard disk
1629 $08   Scanner
1630 $09   Printer
1631
1632 These device codes are different from the device codes that the INQUIRY command
1633 returns, and this bit of code also does the translation from one to the other.
1634
1635
1636 The Next Part, In Which More Progress Is Made
1637 ---------------------------------------------
1638
1639 And so, in using similar analysis in the other parts of the code called by bank
1640 3:1, I was able to discern that after the INQUIRY command, it was calling the
1641 MODE SENSE, MODE SELECT, READ CAPACITY and READ commands.  And since I didn't
1642 know exactly what these commands returned, I used the time-honored method of
1643 returning messages consisting of all zeroes.
1644
1645 And, in fixing up the hard drive emulation to respond to these commands, I
1646 could see the firmware was making it all the way through the bank 3:1 code
1647 successfully, and not in a failure mode.  It didn't boot anything yet, as I
1648 hadn't written the code to load a hard disk image much less dole it out over
1649 the SCSI bus, but it was a good result and I could finally see the end of this
1650 Herculean task coming into view.
1651
1652 However, I could see from the log file that something still wasn't quite right.
1653
1654
1655 The Next Part, In Which Things Start Getting LUN-ey
1656 ---------------------------------------------------
1657
1658 The problem was one of too much success.  It wasn't going through the set of
1659 INQUIRY, MODE SENSE, MODE SELECT, READ CAPACITY and READ commands just once, it
1660 was doing it *eight* times.  And in looking for the culprit, I found the
1661 following tidbit:
1662
1663 CCE5: EE DC C8  INC  $C8DC   ; Increment a counter
1664 CCE8: AD DC C8  LDA  $C8DC
1665 CCEB: C9 08     CMP  #$08
1666 CCED: D0 B0     BNE  $CC9F   ; Loop back if we haven't checked 8 times yet
1667
1668 It wasn't obvious on first examination, but I eventually figured out that
1669 location $C8DC was being put into byte one of every command being sent over the
1670 SCSI bus--as I could see the INQUIRY command was changing every time it was
1671 called like so:
1672
1673 12 00 00 00 1E 00
1674 12 20 00 00 1E 00
1675 12 40 00 00 1E 00
1676 12 60 00 00 1E 00
1677 12 80 00 00 1E 00
1678 12 A0 00 00 1E 00
1679 12 C0 00 00 1E 00
1680 12 E0 00 00 1E 00
1681
1682 And so, after more digging into the hard disk interface document, I could see
1683 that the field being modified was called the Logical Unit Number, or LUN for
1684 short.  Further, hard disks conforming to the SCSI-2 and SCSI-3 standards had a
1685 commandment, that being as follows:
1686
1687 The LUN Shall Be Zero, And Zero Shall The LUN Be.  It Shall Be No Other Number
1688 Save For Zero, For Any Other Number Shall Be An Abomination Before The Drive.
1689
1690 Well, going by simple logic, it would appear that the SCSI-1 protocol was not
1691 bound by such a rule, and so you could have eight Logical Units for each SCSI
1692 device on the bus.  But this presents an interesting challenge.  We need to
1693 tell the firmware to pound sand for all but one LUN.
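
On the emulator side, picking the LUN out of a six-byte command is a
one-liner.  The sketch below assumes, as the INQUIRY dumps above suggest, that
the LUN lives in the top three bits of byte one; the function names and the
choice of LUN 0 as the only valid unit are illustrative, not lifted from the
actual emulator source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Top three bits of byte 1 hold the Logical Unit Number. */
static uint8_t cdb_lun(const uint8_t *cdb)
{
    return cdb[1] >> 5;
}

/* Hypothetical policy: the emulated drive only answers as LUN 0. */
static bool lun_is_valid(const uint8_t *cdb)
{
    return cdb_lun(cdb) == 0;
}
```

For the eight INQUIRY commands shown, cdb_lun() returns 0 through 7 in order,
and only the first one passes lun_is_valid().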
1694
1695
1696 Failure Is An Option
1697 --------------------
1698
1699 And so I found myself in the position of needing to have the hard drive
1700 emulation fail in a meaningful way; which sounds like an oxymoron but really
1701 isn't.  I needed to code the hard drive emulation to respond with a CHECK
1702 CONDITION status, which is how, I eventually discovered, you signal an error
1703 condition in the SCSI protocol.  When I did this, the firmware then sent a
1704 REQUEST SENSE command, and I wasn't sure how to craft a response to it that
1705 would signal failure for an invalid LUN.  Responding with all zeroes didn't
1706 signal failure as I hoped it would, so it was back to the hard disk interface
1707 document to find the missing information.
1708
1709 There I found out that the low four bits of byte two of the response hold the
1710 "Sense Key", and that zero corresponds to "No Sense", which means the command
1711 was successful.  Which, as it turns out, is no way to signal failure.  The one
1712 that fit the bill was five, which corresponds to "Illegal Request".
1713
1714 And so it seems that 16 Sense Keys were not enough for the designers of the SCSI
1715 protocol, so those Sense Keys correspond to broad categories.  To give even
1716 more fine-grained responses to what went wrong, there are at least two more
1717 eight-bit bytes called the "Additional Sense Code" and "Additional Sense Code
1718 Qualifier", which, taken together, provide for 65,536 different combinations.
1719 And, in the interface document, I found $08 $00 which corresponds to "Logical
1720 Unit Communication Failure" which seemed like a reasonable message for this
1721 failure mode.
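
A sketch of what such a REQUEST SENSE reply might look like in C follows.  It
assumes the 18-byte fixed-format layout (response code $70 in byte 0, Sense
Key in byte 2, Additional Sense Code and Qualifier in bytes 12 and 13);
whether the card's firmware asks for or looks at all of those bytes is not
established here, and the function name is made up.

```c
#include <stdint.h>
#include <string.h>

#define SENSE_LEN 18

/* Hypothetical helper: build fixed-format sense data. */
static void build_sense(uint8_t buf[SENSE_LEN], uint8_t key,
                        uint8_t asc, uint8_t ascq)
{
    memset(buf, 0, SENSE_LEN);
    buf[0]  = 0x70;            /* current error, fixed format       */
    buf[2]  = key & 0x0F;      /* Sense Key lives in the low nybble */
    buf[7]  = SENSE_LEN - 8;   /* number of bytes that follow       */
    buf[12] = asc;             /* Additional Sense Code             */
    buf[13] = ascq;            /* Additional Sense Code Qualifier   */
}
```

Calling build_sense(buf, 0x05, 0x08, 0x00) produces the Illegal Request and
Logical Unit Communication Failure combination described above.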
1722
1723 Coding up the meaningful failure path and running the emulation showed that
1724 this mostly satisfied the firmware; it would almost get all the way to the
1725 point where it attempted to read block zero from the hard drive in a
1726 non-failure mode, but there was still a small problem.
1727
1728
1729 Every Problem Is Small, From A Certain Point Of View
1730 ----------------------------------------------------
1731
1732 There is a call in the bank 3:1 code that calls bank 4:0 to read a block from
1733 the disk and do some analysis on what it finds.  The logs showed that this code
1734 was also doing a lot of writing to slot I/O register $F, much of it through
1735 calls to the following brief routine:
1736
1737 CFB4: AD 86 C8  LDA  $C886   ; Get the value in $C886
1738 CFB7: 4A        LSR  A       ; Shift the hi nybble to the lo nybble
1739 CFB8: 4A        LSR  A
1740 CFB9: 4A        LSR  A
1741 CFBA: 4A        LSR  A
1742 CFBB: 09 08     ORA  #$08    ; Set the high bit of the lo nybble
1743 CFBD: 9D 6F C0  STA  $C06F,X ; & store it in slot I/O register $F
1744 CFC0: 60        RTS
1745
1746 This was some highly suggestive code, and what it suggested was that it was
1747 using three bits of a value set up elsewhere which made for eight combinations.
1748 The only significant loose end, as far as the hardware was concerned, was the
1749 8K static RAM; in all of the analysis I had done up to this point, it *seemed*
1750 that only 1K of it was ever used.  But this code suggested otherwise.
1751
1752 It was suggesting that slot I/O register $F was a bank select soft switch for
1753 the 8K static RAM; once I coded it up as such, the firmware was then completely
1754 satisfied and would get all the way to where it attempted to read block zero
1755 from the hard drive in a non-failure mode.
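
In emulator terms, that amounts to something like the following.  This is a
guess at a reasonable shape rather than the actual code: the names are
invented, and the assumption that the low three bits pick one of eight
consecutive 1K banks is inferred from the routine at $CFB4 rather than
confirmed against hardware.

```c
#include <stdint.h>

#define BANK_SIZE 1024u

/* What the firmware routine at $CFB4 writes to slot I/O register $F. */
static uint8_t bank_reg_value(uint8_t c886)
{
    return (uint8_t)((c886 >> 4) | 0x08);
}

/* Hypothetical emulator side: byte offset into the 8K static RAM
   selected by a write to register $F (low three bits = bank). */
static uint32_t ram_bank_offset(uint8_t reg_f)
{
    return (uint32_t)(reg_f & 0x07) * BANK_SIZE;
}
```

With $C886 holding $30, the firmware writes $0B and the sketch selects the
bank starting at offset $0C00.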
1756
1757
1758 The End Is Nigh
1759 ---------------
1760
1761 And so, having studiously and painstakingly laid the foundation for the actual
1762 purpose of the hard drive emulation--that being the transfer of data to and
1763 from the thing--I came at last to the part where I had to actually write code
1764 to have real data flowing to and from the emulated hard disk.  And this, as it
1765 turns out, was the least interesting part of the whole thing; getting the
1766 contents of files into memory and parsing them is a really trivial thing and
1767 usually quite boring.
1768
1769 So in writing this bit of code, I used 4am's "Pitch Dark" hard drive image, and
1770 added the necessary code to serve up appropriate slices of it in response to
1771 the firmware's READ command.  And, of course, after running the new emulation
1772 it failed to load anything.
1773
1774 It was then that I remembered that I sent back messages of all zeroes to
1775 requests from commands, for the most part, with a few exceptions.  One of these
1776 that was sure to cause problems without a proper response was the READ CAPACITY
1777 command.  When the firmware inquired about the size of the hard drive, the
1778 emulator would happily tell it that it had zero capacity--which meant that any
1779 attempted reads by the firmware would be out of range.
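
For reference, the READ CAPACITY reply is eight bytes: the address of the last
block followed by the block size, both as 32-bit big-endian values.  The C
sketch below fills one in from an image size; the 512-byte block size and the
function names are assumptions made for the example.

```c
#include <stdint.h>

#define BLOCK_SIZE 512u   /* assumed: ProDOS-sized blocks */

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: image_bytes must be at least one block. */
static void build_read_capacity(uint8_t reply[8], uint32_t image_bytes)
{
    uint32_t last_block = image_bytes / BLOCK_SIZE - 1;
    put_be32(reply,     last_block);   /* last valid block address */
    put_be32(reply + 4, BLOCK_SIZE);   /* bytes per block          */
}
```

A 32 MB image (33,554,432 bytes) gives 00 00 FF FF 00 00 02 00.  Note that the
first field is the number of the *last* block, not the block count.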
1780
1781 So I coded up a proper response for the size of the hard drive image I was
1782 using and fired up the emulator and...  It still didn't work.  The logs told me
1783 that it was sending a ten-byte command, and one I hadn't seen before, which was
1784 basically the ten-byte variant of the READ command.  Once I had *that* coded up
1785 properly, I fired up the emulator and after a few seconds, found myself in the
1786 monitor.
1787
1788 What?  Why?  How does this even--
1789
1790 To quell the questions that were pooling up in my head I wrote some hooks into
1791 the emulator to trigger a code trace at the appropriate time; that being where
1792 the code transferred control to memory address $801, the ostensible entry point
1793 of the code that the firmware read from block zero and placed in memory at
1794 $800.  And I knew that it was getting to that point successfully because the
1795 firmware doesn't get there unless everything is working on the SCSI bus as it
1796 should, and the trace in the log file confirmed this.
1797
1798 There are worse things than being dumped into the Apple II monitor; at least I
1799 could poke around memory and disassemble things to try to figure out what was
1800 going wrong.  And I could see that the block that was loaded into memory was
1801 looking at the slot ROM for a certain value that caused it to take a branch
1802 that landed it in a crash zone.  This made no sense whatsoever.
1803
1804 Fortunately for me though, I have the ability to disassemble a snapshot of any
1805 memory range that I desire--so I disassembled the entire block from $800 to
1806 $9FF.  And what I saw there was still strange; near the end of the block it
1807 just kind of ran out of instructions, like something was missing.  And looking
1808 near the middle of the block, I saw something eerily similar to what I saw at
1809 the end.
1810
1811 Then I realized it wasn't similar, it was *identical*.  Looking through the
1812 hard drive emulator code, I was not surprised to find this:
1813
1814 static uint8_t * buf;
1815 static uint8_t bufPtr;
1816
1817 Yes, I had made the rookie mistake of using too small a type for my buffer
1818 pointer; it was loading the correct block, but, because the buffer pointer was
1819 only eight bits wide, it wrapped around after 255 and served up the first 256
1820 bytes of the 512-byte block *twice*.
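
The effect is easy to reproduce in isolation.  The following is a minimal
sketch in C, not the emulator's code: both function names are made up for the
example, and the only difference between the two functions is the width of the
index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 512

/* Buggy version: an 8-bit index can only hold 0..255, so it wraps back to 0
   halfway through the block and the first half gets sent a second time. */
static void serve_block_buggy(const uint8_t *block, uint8_t *out)
{
    assert(block != NULL && out != NULL);

    uint8_t idx = 0;
    for (int n = 0; n < BLOCK_SIZE; n++)
        out[n] = block[idx++];      /* idx goes 254, 255, 0, 1, ... */
}

/* Fixed version: the index is wide enough to cover the whole block. */
static void serve_block_fixed(const uint8_t *block, uint8_t *out)
{
    assert(block != NULL && out != NULL);

    uint32_t idx = 0;
    for (int n = 0; n < BLOCK_SIZE; n++)
        out[n] = block[idx++];
}
```

Nothing about this is flagged by the compiler by default, since unsigned
wraparound is perfectly legal C; the only symptom is a block whose second half
is a copy of its first.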
1821
1822 As embarrassing as this was, it was also good news, as it meant that firmware
1823 bootstrap code was working; it was reading real data from the hard drive
1824 emulation and running correctly.  Which meant that once I fixed the size of my
1825 buffer pointer, the emulated hard drive should boot up correctly.
1826
1827 And once I coded up the fix and started up the emulator once more, after six or
1828 so seconds, "Pitch Dark" came up on the screen and it was glorious...
1829
1830
1831 Sic Transit Gloria Mundi
1832 ------------------------
1833
1834 I was able to navigate forward and back through the various games on the hard
1835 drive image; I could even view the artwork that came with each one.  And lo and
1836 behold: the games worked!
1837
1838 I was playing through a bit Wishbringer when I got to a point where I wanted to
1839 save my game.  And, even though there was no WRITE command hooked up yet, I
1840 tried it anyway and got a nice hard lockup on the emulator.  This would never
1841 do--to have a hard disk that was read-only--so I coded up the WRITE command
1842 handler.
1843
1844 And upon booting up the hard drive, it looked like it was OK, only there were
1845 problems; namely, while you could navigate through the various games, you could
1846 not launch them.  As a matter of fact, the only game that *could* be launched
1847 was Zork I, which was the first game to pop up on the menu.  So after
1848 looking at the code, I noticed that there was an asymmetry in the ports used
1849 for reading and writing to the SCSI bus.  Which requires a brief digression
1850 into data transference.
1851
1852
1853 To DMA, Or Not To DMA, That Is The Question
1854 -------------------------------------------
1855
1856 As it turns out, I was finally able to figure out that the physical DMA on/off
1857 switch on the card was wired to bit 6 of slot I/O register $C.  I further found
1858 out that, since I was defaulting to zero for any unknown bit in the slot I/O
1859 registers, the firmware was seeing the DMA switch as if it were in the off
1860 position.  However, even so, the firmware was still treating this as a DMA
1861 transfer.
1862
1863 And, looking at the 53C80 manual, I could see that it supported three distinct
1864 kinds of bus I/O: Programmed I/O (or PIO for short), Direct Memory Access (or
1865 DMA for short) and Pseudo DMA.  Of these three, PIO is the slowest, as it
1866 relies 100% on handshaking on the SCSI bus for data transfer, while DMA is the
1867 fastest, as all you need to do is set some registers and tell the 53C80 to go,
1868 and it handles the transfer in the background without the need for any
1869 intervention from the CPU whatsoever.  But what the firmware was doing, with
1870 the DMA switch in the off position, was Pseudo DMA.
1871
1872 How it works for reading data from the SCSI bus is that the CPU monitors bit 6
1873 (DMA REQ) in the slot I/O register $5, then reads the data that shows up in
1874 slot I/O register $6 when the DMA REQ bit is asserted.  For this kind of
1875 transfer to work, however, there must be some kind of address decoding that
1876 will assert the DACK (Dma ACKnowledge) line once the data is read.  Because
1877 this code works, we can logically deduce that the read to slot I/O register $6
1878 is wired to produce this signal, even if we can't prove it conclusively through
1879 the schematic of the card.
1880
1881 Writing works in a similar manner by monitoring the DMA REQ line, but instead
1882 of writing to slot I/O register $6 (which is a trigger for starting a DMA
1883 transfer) it writes to slot I/O register $0.  And, as we inferred through logic
1884 about the setting of the DACK line in the reading case, we can similarly infer
1885 that the DACK line is being set in a similar manner in the writing case.
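
The read and write cases can be sketched in C along the following lines.  The
register numbers and the DMA REQ bit are the ones described above; everything
else (the Card structure, the function names, and the simplification that DMA
REQ stays asserted for as long as bytes remain) is a hypothetical model built
for this example, and not how the emulator or the 53C80 is actually structured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Slot I/O register offsets, as described in the text */
enum {
    REG_WRITE_DATA = 0x0,  /* Pseudo DMA writes go to this register */
    REG_STATUS     = 0x5,  /* bit 6 is DMA REQ */
    REG_READ_DATA  = 0x6   /* Pseudo DMA reads come from this register */
};
#define DMA_REQ 0x40

/* Hypothetical stand-in for the card plus the drive behind it */
typedef struct {
    uint8_t *data;  /* bytes of the current transfer */
    size_t   len;   /* total number of bytes in the transfer */
    size_t   pos;   /* next byte to move (note: wider than eight bits) */
} Card;

static uint8_t card_read(Card *c, unsigned reg)
{
    assert(c != NULL);
    if (reg == REG_STATUS)
        return (c->pos < c->len) ? DMA_REQ : 0;
    if (reg == REG_READ_DATA && c->pos < c->len)
        return c->data[c->pos++];   /* the access itself stands in for DACK */
    return 0;                       /* unknown registers read as zero */
}

static void card_write(Card *c, unsigned reg, uint8_t value)
{
    assert(c != NULL);
    if (reg == REG_WRITE_DATA && c->pos < c->len)
        c->data[c->pos++] = value;  /* again, the access stands in for DACK */
    /* A write to register $6 is deliberately NOT treated as data */
}

/* CPU side: poll DMA REQ, then move one byte per assertion */
static size_t pdma_read(Card *c, uint8_t *dst, size_t max)
{
    size_t n = 0;
    while (n < max && (card_read(c, REG_STATUS) & DMA_REQ))
        dst[n++] = card_read(c, REG_READ_DATA);
    return n;
}

static size_t pdma_write(Card *c, const uint8_t *src, size_t max)
{
    size_t n = 0;
    while (n < max && (card_read(c, REG_STATUS) & DMA_REQ))
        card_write(c, REG_WRITE_DATA, src[n++]);
    return n;
}
```

Real firmware would spin waiting for DMA REQ rather than give up when it drops;
the loops here stop instead so that the sketch always terminates.  The point to
take away is the asymmetry: data is read from register $6 but written to
register $0.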
1886
1887 The upshot is, even though Pseudo DMA transfers are still CPU intensive, they
1888 are faster than PIO transfers.  And when it comes to relatively slow CPUs like
1889 the 65C02, faster is better.
1890
1891
1892 And They All Lived Happily Ever After-ish
1893 -----------------------------------------
1894
1895 So in looking at the code for the WRITE command, I could see that I had it
1896 using register $6 for the data transfer, which, as we can see from the short
1897 digression above, won't work.  Fixing this to look at the correct register ($0)
1898 brought things into alignment, and a thorough test of "Pitch Dark" confirmed
1899 that I had indeed solved the problem.
1900
1901 So, in the final analysis, I was able to restore decency to Apple2 and
1902 play "Pitch Dark" on it to boot.  But was it worth it?  In my opinion the
1903 answer is an unequivocal "yes", and not just because it enables the use of hard
1904 drive images in emulators.
1905
1906 The reason this little exercise in digital archaeology was worth the effort
1907 expended is that it underscores a problem that seems to have gone largely
1908 underappreciated: the early microcomputers, in some respects, are very well
1909 documented; however, in many others, they are not--and the knowledge of exactly
1910 how they worked is in danger of disappearing.  The documentation for the Apple
1911 High Speed SCSI card is of a consumer-oriented nature with very little
1912 technical content; it was of little use in figuring out how the card really
1913 worked, and it stands in marked contrast to the early days of Apple, when they
1914 published very detailed information about their computers and how they worked,
1915 including schematics and source code.
1916
1917 All that is to say that unless those of us who still remember these artifacts
1918 and have the ability to analyze them to tease out their inner workings actually
1919 *do* so, these things *will* disappear, and they will pass out of human memory
1920 forever.
1921
1922
1923 --------------
1924 v1.0: 6/3/2019
1925 v1.1: 1/10/2020