The protocol, as documented by Mooz, is basically SPI mode 1.
That is to say:
1) The PC Engine is the master device, and the MB128 is the slave.
2) We assume that the "select" line doesn't need to exist, as the device is always the active SPI slave device.
3) What we know as "CLK" or "STROBE" on the PC Engine joypad port (the line that toggles the 74HC157 on the joypads) is used as the MOSI (master out, slave in) data line.
4) What we know as the "RESET" line on the PC Engine joypad port (the one which resets the joypad counter to the first joystick) is the CLK signal, driving the data.
5) In mode 1, the CLK line is normally low, but goes high for data transmission; the rising CLK edge signals the slave device that the master has readied its bit (and for the slave to ready its bit), and the falling CLK edge signals that both sides should read their bit.
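To make the mode 1 timing in point 5 concrete, here is a minimal Python simulation. It assumes an idealized master and slave with no propagation delay. The function name `spi_mode1_exchange` and the trace format are made up for illustration; none of this is taken from Mooz's library.

```python
# Idealized SPI mode 1 model (clock idles low, bits are presented around
# the rising edge and sampled on the falling edge).
# Hypothetical helper for illustration only.

def spi_mode1_exchange(master_bits, slave_bits):
    """Shift equal-length bit lists past each other.

    Returns (slave_received, master_received, trace), where trace holds
    (clk, mosi, miso) tuples, three per bit.
    """
    if len(master_bits) != len(slave_bits):
        raise ValueError("both sides must shift the same number of bits")

    slave_received = []
    master_received = []
    trace = []
    miso = 0  # assumed idle level of the slave's output

    for m_bit, s_bit in zip(master_bits, slave_bits):
        # Master sets up its data bit while the clock is still low.
        mosi = m_bit
        trace.append((0, mosi, miso))
        # Rising edge: master's bit is ready, slave readies its own bit.
        miso = s_bit
        trace.append((1, mosi, miso))
        # Falling edge: both sides read the other side's bit.
        slave_received.append(mosi)
        master_received.append(miso)
        trace.append((0, mosi, miso))

    return slave_received, master_received, trace


if __name__ == "__main__":
    to_slave, to_master, trace = spi_mode1_exchange([1, 0, 1], [0, 1, 1])
    print(to_slave, to_master)
    for state in trace:
        print(state)
```

Since it shifts one bit per clock pulse, the same loop handles any bit count, not only multiples of 8.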
What is different from basic SPI is the following:
1) Until a predefined bit stream is identified by the MB128, its outputs are not driving the joypad data lines (the joypads are).
2) Once "awakened", there are "commands" required in order to identify what data to read/write. More importantly, these commands are not byte-sized; the "start" and "stop" bit sequences are smaller than a byte (although the "command" sequence between them is a 3-byte sequence).
3) The bit sequences for "read" and for "write" are not a single-direction shift register; "read" goes in one direction, and "write" goes in the other direction.
4) The "read bit" and "write bit" commands vary only by the duration of the CLK pulse. A short pulse implies just a read; a long pulse implies that the data in the MB128 should be replaced by the supplied data.
Counting cycles in the implementation of the read-bit function, it looks roughly like:
- set up data, hold for 16 cycles
- toggle CLK, hold for 16 cycles
- toggle CLK back down, hold for 16 cycles
- set up next bit
...At this rate, the maximum speed is roughly 149,000 bits per second (although in practice, it's slower because of the data handling). This is roughly 18.5KB/sec.
But for write, the CLK pulse is a little longer (28 cycles rather than 16), so the maximum speed is a little slower: ~119,000 bits/sec, or 14.9KB/s. In practice, it's probably closer to 10KB/s either way, but that's still faster than I expected it would be.
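The arithmetic behind those two figures can be checked with a short Python sketch. It assumes the PC Engine CPU is running at its usual high-speed clock of about 7.16 MHz (the posts above don't state the clock), and it counts only the three hold phases per bit, ignoring all data handling. The names are hypothetical.

```python
# Upper bound on bit rate from the cycle counts quoted above.
# CPU_HZ is an assumption: the ~7.16 MHz high-speed clock of the PC Engine CPU.
CPU_HZ = 7_159_090


def max_bits_per_second(setup_cycles, clk_high_cycles, clk_low_cycles,
                        cpu_hz=CPU_HZ):
    """Bits per second if each bit costs only the three hold phases."""
    cycles_per_bit = setup_cycles + clk_high_cycles + clk_low_cycles
    return cpu_hz / cycles_per_bit


read_bps = max_bits_per_second(16, 16, 16)   # 48 cycles per bit
write_bps = max_bits_per_second(16, 28, 16)  # 60 cycles per bit

print(round(read_bps), round(read_bps / 8))    # 149148 18643
print(round(write_bps), round(write_bps / 8))  # 119318 14915
```

The second number on each line is bytes per second, so the read figure lands around 18.6 thousand bytes/sec and the write figure around 14.9 thousand, in line with the rough estimates above.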
Choosing a type of device to implement this on poses a few challenges.
Arduino (UNO for example) is cheap and easily available, is directly compatible with 5V logic, and even has an SPI port. Assuming it is up to the task, it would need an external 74HC157 to switch between passing the joypad inputs through and substituting its own outputs.
...But the problems are:
1) It can't sense the difference between a short and a long pulse for read/write. I might need to build some sort of one-shot timer to identify whether the CLK signal is shorter or longer than the threshold delay.
2) It doesn't have enough RAM to be a substitute.
3) The SPI port seems very byte-oriented when using the hardware; software commands may not toggle data quite fast enough, and it might even be necessary to use interrupts and assembly to feed the data fast enough.
Raspberry Pi is probably going to be very flexible, but it might require a specialized operating system, as bringing down the system by turning off the power is usually destructive.
Other possible options I am currently examining: 1) Far more capable Arduino-compatible MPUs such as Cortex M0- and M4-based development boards. ...But I'm still not convinced that they'll be able to bit-bang fast enough (though they'll easily beat the UNO).
2) FPGA ...But these are basically expensive, and I have a lot to learn first (i.e. HDL). And a whole separate system would be needed for getting the data out (to a PC for example).
3) CPLD ...Should be pretty cheap, but not a total solution, as there is basically no RAM. I still have a lot to learn about HDLs in order to get here, but this might supplement the embedded processor in the areas I'm most concerned (timing determination, SPI protocol, identification of the command sequences).
I think you could probably implement this entirely on a mux and microcontroller with external memory, especially if it has two SPI ports. I'm not familiar with Arduino anything, but a fast PIC or one of Fujitsu / Cypress's 5V native ARMs (FM3) might be a good option.
The different transfer sizes are a little problematic, but if I remember right, some manufacturers' SPI implementations are just a straight-up shift register, so a partial transfer can be received. If you attach the RESET / CLK line from the console to both the SPI CLK and a timer input, you can count edges or the time between edges.
A CPLD will make this significantly easier, but as you stated this comes with the HDL learning curve.
Yes, maybe...
- The challenge is that an interrupt would be needed *at least* for each bit, as the non-standard lengths force an MCU to check the protocol at each bit.
- SPI won't help, because (a) as far as I can see, it works with bytes and not bit-lengths, and (b) it's not measuring the duration of the pulse (which is apparently significant)(*).
- Measuring the pulse width would require either (a) a one-shot set for the threshold between "write" and "not write", together with a comparison by the MCU for each bit, or (b) edge-triggered interrupts, together with a clock-cycle counter for measuring pulse width -> i.e. 250,000 interrupts per second. Not impossible, but it needs a fast processor with no interrupts lasting longer than 5 us.
(*) - I tried to use Mooz's library to write a sector, and it didn't write, so the difference may not just be the pulse duration. So maybe there's more to it?
Follow-up on my last post, since I learned something in the past hour.
When setting up which sector to read/write, the initiate sequence starts with: 1-0-0 bit sequence for read, or 0-0-0 for write.
Once I added this extra differentiation, it worked for writing as well.
Also, the pulse width doesn't appear to make a difference - a long pulse for both reads and writes is just fine. This means that the hardware implementation just became dramatically simpler, and most of my concerns are probably unwarranted.
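As a rough illustration of the two findings above, here is a small Python sketch that builds the pin-level sequence for the 3-bit read/write prefix using the same pulse shape for both. It assumes the bits go out in the order they are written above (1-0-0 or 0-0-0) and that each bit is clocked in the mode 1 style (set up data, CLK high, CLK low). The function names are hypothetical, and this covers only the prefix, not the rest of the request.

```python
# Prefix bits as reported above; everything else here is illustrative.
READ_PREFIX = (1, 0, 0)
WRITE_PREFIX = (0, 0, 0)


def request_prefix(is_write):
    """Return the 3-bit prefix that starts a read or write request."""
    return WRITE_PREFIX if is_write else READ_PREFIX


def clock_out(bits):
    """Return (data, clk) pin states for each bit, three states per bit.

    The pulse is identical for reads and writes, matching the finding
    that pulse width does not select the operation.
    """
    states = []
    for bit in bits:
        states.append((bit, 0))  # set up data while CLK is low
        states.append((bit, 1))  # rising edge
        states.append((bit, 0))  # falling edge
    return states


if __name__ == "__main__":
    print(clock_out(request_prefix(is_write=False)))
    print(clock_out(request_prefix(is_write=True)))
```

On real hardware each tuple would become a pair of pin writes followed by a hold delay; the hold time is left out here because only the ordering is being shown.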
There is still the issue of getting a device with 128KB of RAM, since most embedded flash will have unacceptable delays (~2-4 microseconds per byte in many cases, which may be long enough to disturb the bit stream).
Hardware step 1: Access MB128 from something other than a PC Engine.
Today, I got an Arduino to read the MB128, and was able to get it to save the contents to an SDCard. (It shouldn't be too much trouble from here to be able to also go in the other direction).
I used a variant of the Arduino Uno, with a built-in SDCard reader/writer; it's a "Keyestudio ks0304 W5500 Ethernet Development Board".
The setup looks like this:
One of the reasons I used this device is that it runs at 5V, so no level shifters are necessary. It's also fast enough to be the "master" in this arrangement, though it would never be fast enough (without dedicated logic) to be the "slave" device. There are a lot of Arduino-compatible devices out there which are smaller, faster, have more memory, and are often cheaper to boot... but they're all 3.3V devices. I'll probably try using one of those next.
I cannibalized an old mini-DIN-8 extension cable, and made a female connector out of one end; this is what the MB128 is plugged into.
There aren't many wires attached, but I'll describe the ones which are:
- Ground and +5V are driving the power plane along the top.
- IO #7 is connected to joypad pin 2 (also known as D0 on the port, for joypad buttons I and Up). This is the data bit which returns the SPI data.
- IO #2 is connected to joypad pin 4 (also known as D2 on the port, for joypad buttons Select and Down). This is a bit which the MB128 uses to help identify that it exists.
- IO #8 is connected to joypad pin 6 (also known as SEL on the port, for selecting which 4 buttons are active). This is the data bit which sends the SPI data.
- IO #9 is connected to joypad pin 7 (also known as CLR on the port, for resetting to joypad #1). This is the bit which is used for clocking the SPI data.
Note that several pins are not available because they are in use for the SDCard's SPI signalling.
I found out a few more things about the protocol by building this:
1) The Arduino is not very accurate for timing when you need better than 4-microsecond accuracy (which is a long time in SPI).
2) Part of the MB128 "identify" protocol is a normal joypad strobe (send 1, 3, 1, (delay), 0 on the joypad port). Without this, the 0xA8 won't be sufficient to trigger the identify.
3) The code I transcribed from Mooz's assembly libraries has a problem with multiple-sector reads; the first sector is fine, but when it requests a mb128_detect() again as part of the next sector request, the MB128 doesn't identify itself. I'm not sure why this is; maybe there's an issue with timing on the Arduino, or maybe I added a bug when I transcribed it (or, since I never tested it on the PC Engine, maybe it didn't work there either). In any case, I found a workaround, but it's a bit slower: I do 2 joypad strobes, followed by a mb128_init(), which has a high retry count.
And of course, this is a lot slower than it could be (it takes about 30-45 seconds to grab the whole 128KB of contents and log it to SDCard).
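The joypad strobe from point 2 above (1, 3, 1, delay, 0) can be broken down into levels on the two output lines. The Python sketch below assumes the usual PC Engine port layout, where bit 0 of the written value drives SEL (joypad pin 6) and bit 1 drives CLR (joypad pin 7); that mapping and all of the names are assumptions for illustration, and the length of the delay is not given in the posts above, so it is left as a placeholder step.

```python
# Assumed bit layout of the value written to the joypad port.
SEL_BIT = 0x01  # assumption: bit 0 -> SEL (joypad pin 6)
CLR_BIT = 0x02  # assumption: bit 1 -> CLR (joypad pin 7)

# Strobe sequence as described above; None marks the unspecified delay.
STROBE_VALUES = (1, 3, 1, None, 0)


def strobe_steps(values=STROBE_VALUES):
    """Translate port values into ('write', sel_high, clr_high) steps."""
    steps = []
    for value in values:
        if value is None:
            steps.append(("delay",))
        else:
            sel_high = bool(value & SEL_BIT)
            clr_high = bool(value & CLR_BIT)
            steps.append(("write", sel_high, clr_high))
    return steps


if __name__ == "__main__":
    for step in strobe_steps():
        print(step)
```

Under that assumption, the strobe raises SEL, pulses CLR while SEL stays high, waits, and then drops both lines.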
The Arduino sketch is posted here... apologies for hardcoding some things.
1) I defined the input pins as "pullup", because it once recognized the MB128 when one wasn't attached.
2) I traced back the issue with the end-of-sector, and found that there should have been another 3 '0' bits sent (now added).
3) Tightened up the timing a bit.
I timed it (on an UNO) and it takes 50 seconds to read the entire MB128 into a SDCard file.
I have also now added (and tested) the functionality to write output back to the MB128:
While it will automatically read the contents into "MB128.SAV" at startup, you will need to set your serial monitor to 115200 baud, and suppress the CR/LF; type in a capital "W", and it will send the contents of MB128.BKP to the MB128. It's about the same speed to write as to read.
Next, I will try to use a more modern microcontroller and deal with level-shifting. The timing on that device should be tighter, and the dump should be a bit faster. After that, I have another modern microcontroller with sufficient RAM to emulate the MB128, so I will use one MCU to test the other out before plugging it into my PC Engine.
I tried to detect the mb128 with a seeeduino mega, and ... failed. It's weird. Whenever I plug a mb128 between the arduino and the pad, D0 is always at 0. I'm wondering if I should directly use the avr registers instead of using digitalRead/Write.
Two points:
- I adjusted the INPUT type to INPUT_PULLUP, because I was getting low values when nothing was plugged in.
- '0' on D0 is probably what you want on a detect; the two reads of D0 should both be '0', but D2 shows as a '1' on one of those reads (and not the other).
But also... be careful about the pin assignments. Libraries seem to randomly assign things, and different Arduino units move things around. Maybe the ones I used aren't valid for your hardware.
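The detect condition described in the two points above can be written down as a small check. This Python sketch only encodes that condition (D0 low on both reads, D2 high on exactly one of them); the function name and the `(d0, d2)` tuple format are hypothetical, and how and when the two reads are taken is not covered here.

```python
def looks_like_mb128(first_read, second_read):
    """Check two (d0, d2) samples against the detect condition above.

    D0 must be 0 on both reads, and D2 must be 1 on exactly one of them.
    """
    d0_first, d2_first = first_read
    d0_second, d2_second = second_read
    d0_ok = d0_first == 0 and d0_second == 0
    d2_ok = (d2_first == 1) != (d2_second == 1)
    return d0_ok and d2_ok


if __name__ == "__main__":
    print(looks_like_mb128((0, 1), (0, 0)))  # D2 high on the first read only
    print(looks_like_mb128((1, 1), (1, 1)))  # pulled-up lines, nothing attached
    print(looks_like_mb128((0, 0), (0, 0)))  # everything low, D2 never high
```

With pull-ups enabled, an empty port reads all ones and fails the D0 test, so seeing D0 stuck at 0 is not by itself a sign of failure; the D2 reads are what tell the cases apart.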