|
Post by elmer on May 20, 2019 20:17:57 GMT
I agree that Emerald Dragon doesn't work well for fragmentation reasons. I'm sure they weren't thinking about that (unless there's some kind of built-in defrag function, which is not likely). I didn't know that Private Eye Dol did a scan for free space... but I am wary of the fact that they are using a 2-relative numbering system and everybody else is using 0-relative. I worry that they'd wipe out somebody else's saves.

Nope, from my reading of it, the code is fine, and won't destroy anyone else's data.

I don't think that you're right about those 2 bytes being a first-free-sector marker. Both mooz (from his github documentation) and I believe that they're a count of the total number of sectors in use by file data.

It looks like ED includes the 2 directory sectors in that count (and so the value goes from $0002..$0100), whereas Private EyeDol doesn't include the 2 directory sectors (and so the value goes from $0000..$00FE).

The code itself never actually *uses* that value for anything, it's just information ... all file operations (including finding fragments) are done by parsing the file entries and checking starting-sector and sector-count values vs the next file's starting sector (directory entries are stored in order).

Anyway, that explains why those other games think that there is no space left when Private EyeDol has just formatted the MB128. I expect that those games are just checking the low-byte of the "used" count and are thinking that they're seeing the low-byte of $0100 .. i.e. all sectors used (in ED's counting method).

It also explains why those games will work fine once you've got one-or-more files on the MB128, so that the low-byte isn't zero anymore and the regular "scan for fragments" logic applies. It also explains why the games that warn the user about there being no free space will actually load/save properly when they go to scan the directory to find enough free space for their file.

Basically ... that total-sectors-used value is absolutely pointless in a file system that can suffer fragmentation, but doesn't allow fragmented file data. The designers really should have implemented a FAT, IMHO.
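To make the low-byte confusion concrete, here is a minimal Python sketch of the two counting conventions. The function names are hypothetical, and the low-byte-only check is just the guess made above about what the other games do, not something confirmed from their code. The 256-sector total assumes 128KB split into 512-byte sectors.

```python
DIRECTORY_SECTORS = 2    # the two directory sectors mentioned above
TOTAL_SECTORS = 256      # assumption: 128KB / 512-byte sectors


def used_count(file_sectors: int, include_directory: bool) -> int:
    """Return the 16-bit 'sectors used' value stored in the directory.

    include_directory=True  models ED's convention ($0002..$0100).
    include_directory=False models Private EyeDol's ($0000..$00FE).
    """
    extra = DIRECTORY_SECTORS if include_directory else 0
    return file_sectors + extra


def looks_full_low_byte_only(count: int) -> bool:
    """Hypothetical check: a game that inspects only the low byte and
    treats $00 as the low byte of $0100 (everything used, ED-style)."""
    return (count & 0xFF) == 0


# A freshly formatted unit under each convention.
fresh_ed = used_count(0, include_directory=True)       # $0002
fresh_eyedol = used_count(0, include_directory=False)  # $0000
# A completely full unit under ED's convention.
full_ed = used_count(TOTAL_SECTORS - DIRECTORY_SECTORS, include_directory=True)
```

Under this model, a unit freshly formatted by Private EyeDol and a unit completely filled by ED are indistinguishable to a low-byte-only check, which matches the symptom described above.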
|
|
|
Post by elmer on May 30, 2019 4:53:49 GMT
For example, if you save a freshly formatted BRAM image to the MB128, then ED's code will put it in a single sector. Add a bunch of saves to your BRAM, and use ED to save it to the MB128, and suddenly the same save-slot needs 2, 3, or 4 sectors.

This also gets into a discussion of the MB128's directory and file contents. In the regular 2KB of BRAM, when you delete a file, the System Card code shuffles all of the remaining files around in BRAM so that the free space is always contiguous at the end of BRAM. Because the MB128 has a "lot" of memory, and it is relatively slow to read/write, it doesn't seem to do that (in Private EyeDol's code). When you delete a file, the entries in the directory are shuffled around to be contiguous ... but the file data itself isn't moved at all. This leaves an empty fragment of unused sectors in the MB128's file-system.

Well, I finally got Emerald Dragon, and so I've done some more testing.

ED definitely fragments the MB128's file system, by creating a hole of unused sectors whenever you save the contents of BRAM and there isn't enough room to expand the backup file in its current location. That's not good behavior.

OTOH, Private EyeDol is way smarter than I thought it was, and it actually compacts the *entire* contents of the MB128 when you delete a file, and it leaves you with a contiguous (i.e. defragmented) file system. That's very, very good behavior.

I've also confirmed that ED is using bytes 2 & 3 of the directory as a count of the number-of-sectors-used, and that it is not an indicator of the first-free-sector.

It's also confirmed that ED's number-of-sectors-used includes the 2 directory sectors (i.e. goes from 2..256), whereas Private EyeDol's number-of-sectors-used excludes the 2 directory sectors (i.e. goes from 0..254).
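The hole-versus-compaction behavior can be modeled in a few lines of Python. This is only a sketch under assumptions: the `Entry` layout, the idea that data sectors start at sector 2, and the function names are all hypothetical, chosen to mirror the "parse the ordered directory entries" approach described earlier in the thread.

```python
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    name: str
    start: int  # first sector of the file's data
    count: int  # number of contiguous sectors used


FIRST_DATA_SECTOR = 2  # assumption: sectors 0 and 1 hold the directory
TOTAL_SECTORS = 256    # assumption: 128KB / 512-byte sectors


def find_gaps(entries: List[Entry]) -> List[Tuple[int, int]]:
    """Return (start, length) for every run of unused sectors.

    Relies on the directory entries being stored in sector order.
    """
    gaps = []
    cursor = FIRST_DATA_SECTOR
    for e in entries:
        if e.start > cursor:
            gaps.append((cursor, e.start - cursor))
        cursor = e.start + e.count
    if cursor < TOTAL_SECTORS:
        gaps.append((cursor, TOTAL_SECTORS - cursor))
    return gaps


def compact(entries: List[Entry]) -> List[Entry]:
    """Slide every file down so the free space is one run at the end.

    Only the directory is modeled here; real code would also have to
    move the file data itself.
    """
    packed = []
    cursor = FIRST_DATA_SECTOR
    for e in entries:
        packed.append(e._replace(start=cursor))
        cursor += e.count
    return packed


# Deleting a 3-sector file that sat between A and C leaves a hole.
directory = [Entry("A", 2, 1), Entry("C", 6, 2)]
holes_before = find_gaps(directory)
holes_after = find_gaps(compact(directory))
```

In this model, the directory after a delete without compaction has two free runs, while the compacted directory has a single free run at the end, which is the difference between the ED-style and Private EyeDol-style behavior reported above.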
|
|
|
Post by elmer on May 31, 2019 22:20:49 GMT
Just a quick additional note ...
Vasteel 2 has a rather nice Backup Management utility built into it and, just like Private EyeDol, it defragments the MB128 filesystem when you delete a file, and it excludes the 2 directory sectors from the count of used sectors.
I've tried Princess Maker 2 as well, but if that game actually has a Backup Management function, I sure can't find it!
|
|
|
Post by dshadoff on May 31, 2019 23:48:25 GMT
I'm sure that some of the games - like the Koei Nobunaga simulations - don't even bother with regular backup RAM, and just go straight to managing only their own backup RAM in the MB128. I wouldn't be surprised if Princess Maker is one of them.
|
|
|
Post by dshadoff on Jul 28, 2019 23:21:53 GMT
Then Koei's code in Genghis Khan finally confirms that the MB128 really is edge-triggered and not level-triggered, and that you can remove one of the three writes/delays in the other games' code, which is what I expected from the MSM6389's datasheet, and from my own tests. Koei's code also shows that the MB128/Save-Kun hardware is normally operated with a 6us-per-bit cycle, 4us RWCLK-lo, and 2us RWCLK-hi.

So, I'm working through a Verilog implementation of this, and - although you might have already known this - pretty much all logic in FPGAs is edge-triggered, as D-type flip-flops are the core of pretty much all sequential logic... so, as a result, it is natural for my implementation to also take the exact same edge-triggered approach.

The specific timing above, while seemingly tight even for modern-day microcontrollers, is really relaxed for the FPGA world.

I am currently considering using a 128KB FeRAM which would allow non-volatile storage, with a 10-year lifespan without requiring power. I was a little worried at first, since it would be a separate chip, and would require command-addressing by SPI (a serial interface); this would need about 40 bits of command (and reply) in order to get back the data I need. Then, I checked maximum speed - 34MHz. So basically, I could get all my data back in about 20% of the required time.

Banzai!

Dave
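The "about 20%" figure can be checked with a little arithmetic. The sketch below uses only the numbers quoted in the post (40 SPI bits, a 34MHz SPI clock, a 6us MB128 bit cycle); the constant and function names are hypothetical, and it ignores any chip-select or setup overhead a real FeRAM part would add.

```python
BIT_PERIOD_US = 6.0        # MB128 bit cycle quoted above
SPI_CLOCK_HZ = 34_000_000  # maximum SPI clock quoted above
SPI_BITS = 40              # approximate command + reply length quoted above


def spi_transaction_us(bits: int = SPI_BITS, clock_hz: int = SPI_CLOCK_HZ) -> float:
    """Time to clock `bits` bits over SPI, in microseconds."""
    return bits / clock_hz * 1e6


# Fraction of one MB128 bit period consumed by one FeRAM access.
fraction = spi_transaction_us() / BIT_PERIOD_US
```

With these inputs one access takes roughly 1.18us, a little under a fifth of the 6us bit period, which agrees with the estimate in the post.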
|
|
|
Post by elmer on Aug 1, 2019 6:32:50 GMT
This is super-cool, yay! Sorry, I've been busy with real-life for a couple of months ... I won't get back to the PC Engine world until October.
|
|
|
Post by dshadoff on Aug 4, 2019 3:31:00 GMT
I should have done this a while back: I datalogged all bits exchanged during a brief MB128 session on the PC Engine interacting with real hardware. I didn't do this before because I felt trapped by the limited means of getting information out of the PC Engine (and still do - I displayed it on screen, and copied by hand).

...It's good that I did this capture, because there are significant differences between the actual behavior of the device and what I had written into the SYMB128 (even though it mostly works).

Here is an annotated listing of the exchange (I used Emerald Dragon's code as a basis, but added data capture): mb128log.txt (7.16 KB)

Because this is a read session, I will need to capture a 'write' session next.

EDIT: Now updated with 'write' session: mb128log_write.txt (17.68 KB)
|
|
|
Post by dshadoff on Aug 24, 2019 13:31:51 GMT
I've been doing some test-reads of a real MB128 on separate hardware, because I found the behavior of D1 and D3 to be very interesting:
- It seems that when the CLK (RESET) pulse goes high, D1/D2/D3 all transition to '0' briefly, and D0 starts with a value determined by state (control message = '0'; read = '1'); this is brief and is invalid data. After the settling period (say, 3 microseconds), D0 and/or D2 settle into their proper values first, and D1/D3 are driven high later - within another few microseconds.
- When CLK (RESET) is driven low, the data pins are all driven again to a state-dependent value (data read = '1111')
In summary, the D0 (Data In) pin is only properly valid after the settling period, and before the CLK transition downward (unlike proper SPI protocol); furthermore, the PC Engine code is already somewhat aggressive about the settling period it uses, as not all data pins have reached their terminal state yet at the time of read (but the important one - D0 - has).
So in other words, if you are thinking about making the timing more aggressive, please don't.
My tests so far have been on an NEC Memory Base, and I have not yet tried the KOEI one.
EDIT: With respect to D1/D3, never mind about the above (though the rest is still true). I now think D1/D3 are just floating, and follow whatever electrical influences surround them. I have seen their behavior change over the course of several tests.
|
|
|
Post by elmer on Oct 17, 2019 0:21:52 GMT
In summary, the D0 (Data In) pin is only properly valid after the settling period, and before the CLK transition downward (unlike proper SPI protocol); furthermore, the PC Engine code is already somewhat aggressive about the settling period it uses, as not all data pins have reached their terminal state yet at the time of read (but the important one - D0 - has). So in other words, if you are thinking about making the timing more aggressive, please don't.

Yep, Private Eyedol and Genghis Khan both use a pretty aggressive (i.e. short) settling time when reading D0 (Data In), which Mooz has then basically copied into his code on github. Since this is code that was deemed reliable by the manufacturer of the "Save Kun", I suspect that it is perfectly fine to use today but, as Dave says, it wouldn't seem like a good idea to make the delay any shorter.

; From Koei's Genghis Khan ...
;
; Recv Bit from MB128
;
$81:603B  lda #$02        ; A9 02
$81:603D  sta $1000       ; 8D 00 10
                          ; D0 (Data In) has 14 cycles to settle before it is read.
$81:6040  pha             ; 48
$81:6041  pla             ; 68
$81:6042  nop             ; EA
$81:6043  lda $1000       ; AD 00 10

For myself, I've chosen to actually add in an extra "nop" of delay into my code for safety, especially since it doesn't affect the 14-cycle/29-cycle overall loop timing, which is still as fast as any shipped-game.

; My MB128 read loop code ...
.byte_loop: lda #$80
.bit_loop:  pha
            lda #2
            sta IO_PORT     ; CLR hi, SEL lo (reset).
            pha             ; RWCLK lo for 29 cycles = 4us.
            pla
            nop
            nop
            lda IO_PORT     ; Read while in reset state.
            lsr a
            pla
            ror a
            stz IO_PORT     ; CLR lo, SEL lo (buttons).
            bcc .bit_loop   ; RWCLK hi for 14 cycles = 2us.
            sta [__ax],y    ; Save the byte in memory.
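For anyone who doesn't read 6502-family assembly, here is a rough Python model of what the read loop above does with the accumulator. It is a sketch, not a cycle-accurate emulation: `port_read` is a hypothetical stand-in for reading IO_PORT, and the clock constant assumes the PC Engine CPU's high-speed rate of about 7.16MHz.

```python
from typing import Callable

CPU_CLOCK_MHZ = 7.15909  # assumption: HuC6280 high-speed clock


def cycles_to_us(cycles: int) -> float:
    """Convert CPU cycles to microseconds at the assumed clock rate."""
    return cycles / CPU_CLOCK_MHZ


def recv_byte(port_read: Callable[[], int]) -> int:
    """Model of the lsr/ror loop: shift 8 bits in, first bit ends up in bit 0.

    The accumulator starts as $80. Each pass rotates the new D0 bit into
    bit 7, and the loop ends when the original $80 marker falls out of
    bit 0 into the carry, i.e. after exactly eight bits.
    """
    acc = 0x80
    while True:
        carry_in = port_read() & 1      # lsr a: D0 -> carry
        carry_out = acc & 1             # ror a: bit 0 -> carry
        acc = (acc >> 1) | (carry_in << 7)
        if carry_out:                   # bcc falls through when carry is set
            return acc


# Feed the bits of $A5 in least-significant-bit-first order.
bits = iter([1, 0, 1, 0, 0, 1, 0, 1])
value = recv_byte(lambda: next(bits))
```

The `$80` marker trick means no separate bit counter is needed, which is part of why the loop fits inside the 29-cycle/14-cycle budget given in the comments.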
|
|
|
Post by dshadoff on Nov 18, 2019 4:23:13 GMT
One more piece of trivia !
I found that my MB128 backup device (see Personal Projects forum) didn't work with KOEI Save-kun devices, but I figured out the reason.
After writing a byte sequence to the MB128 unit, there are 2 status bits returned, followed by three trailing bits (presumably to clear internal counters). The three trailer bits are common between reads and writes, but the 2 status bits are unique to writes.
The meanings of the 2 bits are:
The first bit determines the device type:
0 = Save-kun
1 = Memory Base 128
The second bit states whether the write was successful:
0 = OK
1 = FAIL
|
|
|
Post by elmer on Nov 18, 2019 23:02:19 GMT
The meanings of the 2 bits are:
The first bit determines the device type: 0 = Save-kun, 1 = Memory Base 128
The second bit states whether the write was successful: 0 = OK, 1 = FAIL

I'll PM you Koei's MB128 code from Genghis Khan ... I suspect that you'll agree their code flow doesn't quite support that interpretation of those two bits.
|
|
|
Post by dshadoff on Nov 18, 2019 23:47:11 GMT
I’ll check out the code when I get home... but when reviewing the actual output from these devices from actual writes (512-byte ‘sectors’ on Arduino code based on Emerald Dragon basis), the MB128 consistently shows one value in status bit #1, whereas the Save-kun consistently returns the opposite value. (But status bit #2 is consistently the same).
Emerald Dragon ignores the first bit’s value, but checks the second bit as a success signal.
But I’ll see if I can infer anything else from what the KOEI code says. I also plan to see whether Vasteel 2 differentially identifies Save-kun versus MB128.
|
|
|
Post by dshadoff on Nov 19, 2019 3:33:42 GMT
OK, I've looked at their code and formed some conclusions.
But basically, it appears to me that their code flow doesn't support the fact that there are 5 bits after a write (2 status + 3 trailer).
The way I read KOEI's code is, if the response is '0000', it's OK and no further read/write is necessary (except that it is: there are 4 pending bits, which will need to get flushed the next time the unit is accessed).
On the other hand, if the response is not '0000', it does another bit write (but no read from the port), and returns with an error code which forces a flush. In either case, the flush is self-limiting, so it ends up flushing only a few bytes before ending - and you probably won't notice.
However, there is something new of note here. You will notice from the protocol captures below that the correct time to read a response is while the CLK line is asserted but immediately before it is deasserted, since the downward clock edge causes the outputs to transition very quickly on the MB128. Note that KOEI's read is after the downward clock by about 7 cycles or roughly 1 microsecond. Either way, the MB128 would assert a non-zero 'xxx1' "error" code and require flushing, although it should recover very quickly from the flush.
HOWEVER, this does explain to me something else I noticed while testing my microcontroller-based SYMB128: The KOEI games seemed to access SYMB128 OK, but would have tremendously long pauses at moments (several seconds before recovery). These were probably full-128KB flushes which didn't send the proper "end flush" response.
For comparison purposes:
Here is the output from the sector-end of a MB128 write (from a post in this thread):
Send two bits of '0' at end of write (important: return values differ):
304: 10->0001 00->1011
306: 10->0000 00->1011

Send three bits of '0' at the end
308: 10->0000 00->1011
30A: 10->0000 00->1011
30C: 10->0000 00->1011
Now, here is the corresponding output from the sector-end of a Save-kun write:
Send two bits of '0' at end of write (important: return values differ):
xxx: 10->0000 00->0000
xxx: 10->0000 00->0000

Send three bits of '0' at the end
xxx: 10->0000 00->0000
xxx: 10->0000 00->0000
xxx: 10->0000 00->0000
Oh, and one more thing: KOEI's game refers to the unit as the Save-kun for both device types, and Vasteel 2 (which does all sorts of hardware identifications at startup) refers to both types as Memory Base 128, but both work equally well with either unit.
|
|
|
Post by dshadoff on Nov 19, 2019 12:08:08 GMT
As a follow-up, I just looked at the protocol captures from the Save-kun, and found that - contrary to the seemingly-random data returned by MB128 - every response drives all 4 data lines low, excepting:
1) the response to 0xA8 which asserts '0100' (on both CLK high and low)
2) any data reads, which show up as '0001' where applicable (on both CLK high and low)
3) one unexpected spot: when sending the most-significant bit of the length field (for example, during the "boot" single-bit transaction), it asserts '0001' on the CLK upward transition only, and '000x' on the downward transition, where 'x' = '1' for reads, and '0' for writes.
I wish I had studied this device before the MB128; it would have made understanding the protocol and creating a simulation so much easier.
|
|
|
Post by elmer on Nov 20, 2019 4:24:10 GMT
OK, I've looked at their code and formed some conclusions. But basically, it appears to me that their code flow doesn't support the fact that there are 5 bits after a write (2 status + 3 trailer). The way I read KOEI's code is, if the response is '0000', it's OK and no further read/write is necessary (except that it is: there are 4 pending bits, which will need to get flushed the next time the unit is accessed). On the other hand, if the response is not '0000', it does another bit write (but no read from the port), and returns with an error code which forces a flush. In either case, the flush is self-limiting, so it ends up flushing only a few bytes before ending - and you probably won't notice.

We're looking at the same code, and the same data, and coming to different conclusions ... how fun!

First of all, can we agree that, beyond the meaning of the first two bits after the write, the final 3 trailer bits are effectively irrelevant to the discussion, because all that they do is "prime the pump" for the recognizer that wakes up the MB128 device?

If the final three zero bits are sent, then the device will wake up on the first try, otherwise it needs two tries to wake up. If they are not sent, then no data is lost, and the device does not lock up.

If we can agree on that, then the question just comes down to "what are those first two bits for?".

Looking at the code inside Vasteel 2 and Private Eyedol, neither game actually pays any attention at all to those first two bits, nor to the three trailer bits. Both games just go directly from writing the last bit of the data, straight into the wakeup code so that the games can read the written data back from the MB128 and compare it with the original data in RAM, in order to verify that the data was written correctly.

Even though those two games ignore the two bits after the write, and the three trailer bits, everything works correctly, although we can both agree that the games will have to do one or more retries on the wakeup sequence in order for it to work ... which isn't actually a problem, just a bit inefficient.

Going back to Koei's code, I don't think that you're reading it correctly. It is actually a loop made up of two routines that ...

1) Read a bit, and check if the result is a '0000'.
2) If the result is '0000', then it reads-but-ignores a second bit, and then the write process is considered finished.
3) If the result was not a '0000', it sends eight zero bits to the MB128, and then loops around to the start.

This loop is repeated a maximum of 131072 times, potentially sending 128Kbytes to the MB128, which is the largest possible length of a write command. It just keeps on sending bytes until it reads '0000'.

So, what is this supposed to do, and why does it work?

Using Occam's Razor, I tend to favor the simplest hypothesis, and there is already good evidence for a very simple explanation of what is going on. It is in your earlier post here ... mb128log_write.txt

When doing a write sequence, once the length has been sent, and as soon as the MB128 or Save-kun starts to write actual data bits, the device sets its output value to '0001'. As soon as all of the bits have been transferred, it sets its output value back to '0000'. Basically, the bottom bit of the MB128 output is a zero/non-zero flag for the number of bits left to write.

On the Save-kun, this transition back to '0000' occurs as soon as the last bit has been sent, whereas on the MB128 the transition occurs one bit later ... presumably to deal with the internal bit-buffer that we both believe that the device uses.

So Koei is actually setting the output as a zero/non-zero test on the counter, whereas Hudson are setting the output as an overflow flag from the decrement.

Koei's actual game code works correctly on both devices, because it sends another bit after it detects the '0000' in order to flush the final buffered-bit on the Save-kun hardware.
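As a way of pinning down the three-step loop described above, here is a small Python model of it. This is only a reading of the description in this post, not a translation of Koei's actual code: `read_bit` and `send_zero_byte` are hypothetical callbacks standing in for the port access routines, and `FakeDevice` is an invented test double that simply replays a list of responses.

```python
from typing import Callable, List

MAX_RETRIES = 131072  # loop limit quoted above (128Kbytes of zero bytes)


def finish_write(read_bit: Callable[[], int],
                 send_zero_byte: Callable[[], None],
                 max_retries: int = MAX_RETRIES) -> bool:
    """Model of the end-of-write loop as described in the post.

    read_bit() returns the 4-bit value seen on the port for one bit clock.
    send_zero_byte() clocks eight zero bits out to the device.
    Returns True once '0000' is seen, False if the retry limit is hit.
    """
    for _ in range(max_retries):
        if read_bit() == 0b0000:
            read_bit()  # step 2: read-but-ignore a second bit
            return True
        send_zero_byte()  # step 3: send eight zero bits, then loop
    return False


class FakeDevice:
    """Invented test double: replays canned port values, then '0001' forever."""

    def __init__(self, responses: List[int]) -> None:
        self.responses = list(responses)
        self.reads = 0
        self.zero_bytes = 0

    def read_bit(self) -> int:
        self.reads += 1
        return self.responses.pop(0) if self.responses else 0b0001

    def send_zero_byte(self) -> None:
        self.zero_bytes += 1


# A device that still reports "bits left" once before settling to '0000'.
dev = FakeDevice([0b0001, 0b0000, 0b1011])
ok = finish_write(dev.read_bit, dev.send_zero_byte)
```

Run against a device that never returns '0000', the same function gives up after `max_retries` zero bytes, which would correspond to the worst-case 128Kbyte behavior mentioned above.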
|
|