Post by dshadoff on Nov 21, 2019 5:58:49 GMT
This is the second of two threads about a homemade memory backup HuCard which backs up BRAM to the card, and it describes the software (how to back up BRAM to flash).
A note about a HuC bug:
While developing, I found that the base HuC font contains ASCII, but not Japanese characters; there are some games which include Japanese characters in the save game names, so I wanted to extend the existing font to include these. In so doing, I found a bug in the load_font() assembler routine which has been around since the beginning. The specifics on how to ensure that this is fixed on your own installation are contained in a note in the source base.
General Principles:
While writing this, my first goal was to get a working proof-of-concept, then add an interactive menu, and make a minimal system which can be considered complete. There are no sounds, no graphics, no fancy transitions, and no “nice to have” (but ultimately unnecessary) functionality, because as soon as it was working, I wanted to complete a prototype version. So please excuse the rough state. The part I am least proud of is the text positioning system, so I would likely clean that up first if I decided to dig into this again.
Based on my own use of Tennokoe Bank cards, I would normally have multiple “updated” versions of the same backup memory, only to forget which version was most recent when the time came to restore. For that reason I added the metadata - a date and name of the save, so that you can have a clue of which is more recent… although it might be nice to be able to write a couple of sentences to your future self, to help identify which is which.
Accessing Flash:
The most important thing to get is the data sheet for the Flash memory, in order to understand how to access it:
www.microchip.com/wwwproducts/en/SST39SF040
This particular memory has embedded commands to write and erase, which are invoked by specific sequences of writes to specific addresses on the chip. In other words, you tell it to erase itself, and it can do that on its own.
The commands used by the program are Get Device ID, Erase Sector, and Program Byte. Each of these commands takes time to execute. For a couple of them, a brief delay of a few CPU cycles is sufficient; others may take significantly longer, so the most appropriate method is to poll the status from the chip itself until it reports that the operation is complete.
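To make the idea of "command sequences" a bit more concrete, here is a rough sketch in plain C (not the code from this project). The unlock addresses and command bytes are the ones I recall from the SST39SF040 data sheet, so please check them against the data sheet before relying on them. The flash_bus structure and all function names are hypothetical; they only record the writes and fake a toggling status bit so the logic can be tried on a PC. On the real card, the chip addresses would be reached through whichever banks are mapped in at the time.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 0x1000u   /* 4KB erase sector */
#define TRACE_MAX   16

/* Hypothetical stand-in for the chip's bus: records writes and fakes
   a toggling status bit, so the sequences can be checked off-target. */
typedef struct {
    uint32_t addr[TRACE_MAX];
    uint8_t  data[TRACE_MAX];
    int      count;
    int      busy_reads;  /* reads during which the status bit still toggles */
    uint8_t  status;
} flash_bus;

static void bus_write(flash_bus *b, uint32_t addr, uint8_t data)
{
    assert(b->count < TRACE_MAX);
    b->addr[b->count] = addr;
    b->data[b->count] = data;
    b->count++;
}

static uint8_t bus_read(flash_bus *b)
{
    if (b->busy_reads > 0) {
        b->busy_reads--;
        b->status ^= 0x40;    /* DQ6 flips on every read while busy */
    }
    return b->status;
}

/* Two-write prefix shared by every command (values assumed from the data sheet). */
static void command_prefix(flash_bus *b)
{
    bus_write(b, 0x5555, 0xAA);
    bus_write(b, 0x2AAA, 0x55);
}

void flash_enter_id(flash_bus *b)
{
    command_prefix(b);
    bus_write(b, 0x5555, 0x90);   /* chip now returns its ID on reads */
}

void flash_exit_id(flash_bus *b)
{
    command_prefix(b);
    bus_write(b, 0x5555, 0xF0);   /* back to normal read mode */
}

void flash_program_byte(flash_bus *b, uint32_t addr, uint8_t value)
{
    command_prefix(b);
    bus_write(b, 0x5555, 0xA0);
    bus_write(b, addr, value);
}

void flash_erase_sector(flash_bus *b, uint32_t addr)
{
    command_prefix(b);
    bus_write(b, 0x5555, 0x80);
    command_prefix(b);
    bus_write(b, addr & ~(uint32_t)(SECTOR_SIZE - 1u), 0x30);
}

/* Poll until two consecutive reads agree on DQ6; returns 1 when done,
   0 if the chip was still busy after max_polls extra reads. */
int flash_wait_done(flash_bus *b, int max_polls)
{
    uint8_t prev = bus_read(b) & 0x40;
    while (max_polls-- > 0) {
        uint8_t cur = bus_read(b) & 0x40;
        if (cur == prev)
            return 1;
        prev = cur;
    }
    return 0;
}
```

A caller would issue flash_erase_sector() or flash_program_byte() and then call flash_wait_done() with a generous poll limit before touching the chip again.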
This chip has a 4KB sector size, so ‘erase sector’ clears 4KB of space. Since BRAM takes up only 2KB, two saves would fit in one sector, but that isn’t wise, because erasing one save would wipe out the other as well. Instead, I store one 2KB BRAM image plus a small amount of metadata (date and name) to help the user identify which save slot they might want to use later. There is lots of space left over to add more metadata, if so desired.
Since the device commands require consecutive writes to specific addresses on the chip in order to operate, it is not possible to execute those sequences of operations from the Flash chip itself; for these sequences, I execute the code from within RAM (see below).
Currently, the 32 save slots are stored in banks $20-$3F, and each bank is mapped into MPR2 ($4000-$5FFF) when needed. Of course, this is actually wasteful: the BRAM save occupies $4000-$47FF and its metadata $4800-$4FFF, leaving $5000-$5FFF unused. That space could also be used for additional save slots, if one wasn’t worried about how to manage so many of them. Also, since the program code is currently less than 128KB, banks $10-$1F could be added to the pool.
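The slot arithmetic described here can be sketched in a few lines of plain C. This is only an illustration: the names (slot_location, locate_slot, max_save_slots) are made up for this example, and the chip offset assumes that bank $00 sits at the very start of the Flash chip.

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_SAVE_BANK 0x20u      /* saves live in banks $20-$3F */
#define SAVE_SLOTS      32u
#define BANK_SIZE       0x2000u    /* one 8KB bank */
#define WINDOW_BASE     0x4000u    /* where MPR2 appears in CPU space */
#define META_OFFSET     0x0800u    /* metadata follows the 2KB BRAM image */
#define FLASH_SIZE      0x80000ul  /* 512KB chip */
#define SECTOR_SIZE     0x1000ul   /* 4KB erase sector */

typedef struct {
    uint8_t  bank;         /* value to load into MPR2 */
    uint16_t bram_addr;    /* CPU address of the BRAM image */
    uint16_t meta_addr;    /* CPU address of the metadata */
    uint32_t chip_offset;  /* byte offset of the bank within the chip */
} slot_location;

/* Where does save slot N live, under the one-slot-per-bank layout? */
slot_location locate_slot(unsigned slot)
{
    slot_location loc;
    assert(slot < SAVE_SLOTS);
    loc.bank        = (uint8_t)(FIRST_SAVE_BANK + slot);
    loc.bram_addr   = WINDOW_BASE;
    loc.meta_addr   = WINDOW_BASE + META_OFFSET;
    loc.chip_offset = (uint32_t)loc.bank * BANK_SIZE;
    return loc;
}

/* How many slots would fit if every 4KB sector not needed by the
   program held one save? reserved_bytes is the program's footprint. */
unsigned max_save_slots(uint32_t reserved_bytes)
{
    uint32_t reserved_sectors;
    assert(reserved_bytes <= FLASH_SIZE);
    reserved_sectors = (reserved_bytes + SECTOR_SIZE - 1u) / SECTOR_SIZE;
    return (unsigned)(FLASH_SIZE / SECTOR_SIZE - reserved_sectors);
}
```

With 128KB reserved for the program, max_save_slots() gives 96, which is three times the current 32 slots.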
I currently use the Get Device ID command to validate that the program is running on hardware compatible with the Flash storage approach being used (specifically the command sequences). It would be pointless to run the program on hardware incapable of properly storing the data. But this brings an unexpected advantage to the programmer targeting such a card for their software: they could perform a hardware check to ensure that the game is not run on an unofficial device.
Note that when Flash memory is erased, its value becomes $FF, and writing values to Flash involves bringing cells (representing bits) down to a ‘0’ state.
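A practical consequence is that programming can only clear bits, so what ends up in a cell is the AND of the old contents and the new value. A small sketch in plain C of what that means (function names are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What a cell holds after programming 'value' over 'current'
   without erasing first: bits can go from 1 to 0, never back. */
uint8_t flash_program_result(uint8_t current, uint8_t value)
{
    return (uint8_t)(current & value);
}

/* Returns 1 if 'wanted' cannot be reached from 'current' by clearing
   bits alone, i.e. the sector must be erased before writing. */
int needs_erase(const uint8_t *current, const uint8_t *wanted, size_t len)
{
    size_t i;
    assert(current != NULL && wanted != NULL);
    for (i = 0; i < len; i++) {
        if ((current[i] & wanted[i]) != wanted[i])
            return 1;
    }
    return 0;
}
```

An erased sector (all $FF) never needs another erase, which is why a save slot is erased once and then programmed byte by byte.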
Things You Maybe Haven’t Seen Before:
Loading a font and multiple font colors:
As implemented, it’s pretty straightforward. The colouring is controlled by which BG palette number is in use for the BG character, not the palette entry within the palette. Switching palette number to print with creates the multiple foreground colors.
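To illustrate, each entry in the background map (BAT) is a 16-bit word with the palette number in the top 4 bits and the tile number (VRAM word address divided by 16) in the lower 12 bits. The sketch below is plain C with made-up function names, and the font location and first character used in the usage note are assumptions for the example, not values taken from the program.

```c
#include <assert.h>
#include <stdint.h>

/* Build a BAT word from a palette number (0-15) and the VRAM word
   address of an 8x8 tile (each tile occupies 16 words). */
uint16_t bat_entry(unsigned palette, uint16_t tile_vram_addr)
{
    assert(palette < 16u);
    assert((tile_vram_addr & 0x000Fu) == 0u);
    return (uint16_t)((palette << 12) | (tile_vram_addr >> 4));
}

/* BAT word for character 'c' of a font whose first glyph ('first_char')
   was loaded at 'font_vram_addr'. Hypothetical helper for illustration. */
uint16_t bat_entry_for_char(unsigned palette, uint16_t font_vram_addr,
                            char first_char, char c)
{
    assert(c >= first_char);
    return bat_entry(palette,
                     (uint16_t)(font_vram_addr + 16u * (unsigned)(c - first_char)));
}
```

For a font assumed to be loaded at VRAM $0800 starting with the space character, the letter 'A' printed with palette 2 gives $20A1, and the same letter with palette 5 gives $50A1: same tile, different colour.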
Inline Assembler:
You might have seen this before, but perhaps not as much of it. I find it very useful, because HuC can’t do everything, and often the assembler for those things is short anyway. Several of these assembler functions are actually embedded within ‘C’ functions, while others are just in a big block, apparently off by themselves. Those subroutines are the ones which are copied into RAM and executed there.
Execution from RAM:
In order to properly execute the Flash command sequences, they must not be executed from the Flash itself (as it will interfere with the sequence), so in this case they are run from RAM. In order to do this, I have written routines which are fully-relocatable, and stored them in a base segment; when needed, the program copies the appropriate subroutine to a fixed RAM buffer, then performs a JSR to that routine. The routines stop interrupts, save current MPR values, then map the appropriate areas of the chip into memory, so that the command sequences can be run. Once complete, MPRs are restored and interrupts are re-enabled. Interrupts need to be turned off because any interrupt event will access the corresponding interrupt vector in MPR 7 (mapped to Bank #0), which will interrupt the Flash addressing sequence causing a bad update.
BRAM direct access:
This is pretty simple actually; you need to write an ‘unlock code’ to $1807, and then it can be accessed. All standard code accesses this RAM at “LOW” CPU speed (presumably it is underclocked for retention), so I also perform a ‘CSL’, access the memory, then perform a ‘CSH’ and re-lock the memory. While doing this, I rediscovered an issue reported a long time ago, where the formal unlock sequence is $48/$75/$80, but it works if you just write the $80… except in the case of the Tennokoe II, which requires all three bytes before revealing the contents.
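The unlock quirk can be modelled in a few lines of plain C. This is only a toy model of the behaviour described above, not a description of the actual hardware internals; bram_gate and its functions are hypothetical names, and how the re-lock is triggered on real hardware is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the BRAM lock. 'strict' mimics a device (like the
   Tennokoe II, per the observation above) that wants all three bytes. */
typedef struct {
    int     strict;
    uint8_t last[2];   /* the two most recent bytes written to $1807 */
    int     unlocked;
} bram_gate;

/* One write to $1807. */
void gate_write_1807(bram_gate *g, uint8_t value)
{
    assert(g != NULL);
    if (value == 0x80 &&
        (!g->strict || (g->last[0] == 0x48 && g->last[1] == 0x75)))
        g->unlocked = 1;
    g->last[0] = g->last[1];
    g->last[1] = value;
}

/* The formal three-byte sequence, which satisfies both kinds of device. */
void bram_unlock(bram_gate *g)
{
    gate_write_1807(g, 0x48);
    gate_write_1807(g, 0x75);
    gate_write_1807(g, 0x80);
}

/* Re-lock once the access is finished. */
void bram_lock(bram_gate *g)
{
    assert(g != NULL);
    g->unlocked = 0;
}
```

Always sending the full $48/$75/$80 sequence costs two extra writes and works in both cases, so there is little reason to rely on the shortcut.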
Possible Extensions:
This implementation doesn’t make the most use of the flash memory on the card; while it can store 32 BRAM banks with some annotation, there is really enough space on the card to store more than double that amount; perhaps triple. So, making full use of the card is one possible future extension.
Since 96 slots would be even more confusing to your future self than 32, some more annotation would be appropriate (i.e. not just Date & Name, but also “note to self”). There is easily enough space to store lots more meta-data as well, but entering and displaying that meta-data would require some additional thought on what is best. So, this is another possible extension.
Future Uses:
It is possible to allow a game to store its BRAM data within the card itself, as opposed to only in the BRAM on the PC Engine.
Since the PC Engine’s BRAM stores 2KB, but the sector size on this card is 4KB, it is also entirely possible for a game to store more data on the card itself than in the BRAM.
Since the chip has a Device ID, a developer could implement a rudimentary protection mechanism to reduce piracy.
Let me know if you have any questions, or if you write improvements!