|
Post by elmer on Oct 31, 2019 8:22:55 GMT
"I found the last version I worked on. Looks like it uses self modifying code, which is not really ideal for hucard projects. So I'll work on making a more rom friendly version,"

Hi Tom, Chris mentioned earlier that he already had a romable version of your code that you may wish to look at.

From my POV, the only way that we're going to make this usable for pure-HuC developers like DK & Gredler, is to actually build the functionality directly in both HuC and PCEAS ... which is what I did with the PNG format loading, and the new tile/chr/spr palette commands. I don't believe that they're quite ready, yet, to do all of their asset-conversion in a separate toolchain in the way that more-experienced developers might do.

That would mean building the data compressor itself into PCEAS, and coming up with some new command names for the new functionality. Does that make sense to you? If so, then I'm pretty sure that you've added commands before, but if you want a quick reminder, then you can scroll down the github commit history, and look at the changes that I made on July 6, 2018.

P.S. I made a bunch of checkins to github today, and HuC now compiles for Win64. In another fix, on Windows, HuC now waits for PCEAS to finish before exiting, just like it does on Linux ... which should hopefully fix punch's problems with his new Python tool.
|
|
|
Post by Arkhan on Oct 31, 2019 18:21:59 GMT
"I don't believe that they're quite ready, yet, to do all of their asset-conversion in a separate toolchain in the way that more-experienced developers might do."

I honestly think they are and I've tried telling them to do that multiple times. I think they could've / should've been doing it, but I figure they don't want to fuck with the house of cards that is their project since what they have already works and changing things will introduce more potential problems.
|
|
|
Post by turboxray on Oct 31, 2019 19:14:19 GMT
Elmer, I hadn't thought about that! Yeah, okay, I'll look into that after converting it to rom friendly. I'll take a look at Chris' source.
Hey, so years back I did multi-threading on the PCE. Basically a main thread and a secondary thread. I got the idea from Gate of Thunder because the game decompresses LZSS assets in real time as you're playing through the level. It does it in small continuous chunks across frames (which is really impressive because even on Devil that game doesn't slow down!). I did something similar, but had a basic scheduler, budget and slack models, and preemption. Nothing fancy; no mutexes or semaphores. Threads had their own ZP/BSS, and the stack was resolved if the thread was preempted. Time slicing ran off the TIRQ. I was thinking something like that would be great for HuC, but would obviously be more complicated.
You could still get the same type of approach by simply designing a C function that could segment its work and have its internal state persist (static vars, etc). If we simply added a down counter to the TIRQ and had it run full speed, then when the main game logic is finished and about to call wait_vblank(), you could instead look at the remaining value of the down counter and decide if there's enough time for said function. If the function is capable of doing segmented work, then you could give it a workload based on the remaining time. You'd probably have to be pretty conservative with that estimate, but you wouldn't need to preempt it. So maybe you only have enough time to decompress two or three tiles, or even just a single sprite, but given 60 frames that could be quite a bit. So basically a slack budget model, just not as tightly optimal as a thread service.
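The slack-budget idea above can be sketched in a few lines. This is a minimal simulation in Python rather than HuC code, and every name in it (SegmentedJob, run_frame, the tick counts, the safety margin) is a hypothetical stand-in: on real hardware the down counter would be decremented by the TIRQ handler and read just before calling wait_vblank().

```python
class SegmentedJob:
    """Work that can be done in small units, keeping its progress between calls.

    Hypothetical model: 'remaining' plays the role of the static variables a
    segmented C function would use to remember where it left off.
    """

    def __init__(self, total_units, cost_per_unit):
        self.remaining = total_units        # e.g. tiles still to decompress
        self.cost_per_unit = cost_per_unit  # estimated timer ticks per unit

    def run(self, budget_ticks):
        """Do as many whole units as fit in the budget; return ticks spent."""
        affordable = max(0, budget_ticks // self.cost_per_unit)
        units = min(self.remaining, affordable)
        self.remaining -= units
        return units * self.cost_per_unit


def run_frame(job, ticks_per_frame, game_logic_ticks, safety_margin):
    """Simulate one frame: game logic first, then spend leftover time on the job.

    'counter' models the TIRQ down counter as it would read once the main
    game logic has finished. The safety margin keeps the estimate conservative,
    since the job is never preempted.
    """
    counter = ticks_per_frame - game_logic_ticks
    slack = counter - safety_margin
    if slack <= 0 or job.remaining == 0:
        return 0  # no time to spare (or nothing left to do): go wait for vblank
    return job.run(slack)


if __name__ == "__main__":
    job = SegmentedJob(total_units=10, cost_per_unit=3)
    for busy in (80, 98, 80):  # ticks used by game logic in three frames
        spent = run_frame(job, ticks_per_frame=100,
                          game_logic_ticks=busy, safety_margin=5)
        print(f"logic={busy} spent={spent} remaining={job.remaining}")
```

The numbers are arbitrary. The point is only that the job is handed a budget and stops on its own, so no preemption or stack switching is needed.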
|
|
|
Post by DarkKobold on Oct 31, 2019 19:28:09 GMT
"I don't believe that they're quite ready, yet, to do all of their asset-conversion in a separate toolchain in the way that more-experienced developers might do."

"I honestly think they are and I've tried telling them to do that multiple times. I think they could've / should've been doing it, but I figure they don't want to fuck with the house of cards that is their project since what they have already works and changing things will introduce more potential problems."

This is an argument Ark and I have had a lot. He wants separate tools, so that users of PCEAS and the like aren't left out when things like '.stm' are built directly into HuC. I prefer the singular "software suite" where I don't have to worry about file formats or running separate utilities every time I want to update something. It's an argument of simplicity vs. flexibility, which is the argument between Mac and PC for the last few decades. I don't think we're going to solve it.
Ark and I cleared this up in the Discord.
Also, calling our project a "house of cards" is pretty insulting. If you want people to respect you, it's best not to disrespect others.
|
|
|
Post by Arkhan on Oct 31, 2019 20:32:59 GMT
yeah for clarity, pretty much *all* projects on these old machines with these tools end up being a house of cards and that wasn't meant to be a dig.
I'd argue that saying you're not ready to do the more experienced thing is a bigger misstep lol.
Especially since you kinda just .incbin some chunk of data from a utility someone else probably made already (Myself, and also Tom has mentioned his utilities a few times now, for example)... and load it into memory. It's stuff you have to be doing already in your code.
But having to swap out all your #incs, replace with binary data, and then change all the code to load it and such is why I said house of cards. It SEEMS like that should go smooth, but I'd bet something goes wrong.
Not necessarily because of your actual project, just because of a quirk in HuC, or something of that nature.
Or you put the wrong sizes in and go AH FUCK WHAT DID I DO.
|
|
|
Post by DarkKobold on Oct 31, 2019 20:49:38 GMT
"But having to swap out all your #incs, replace with binary data, and then change all the code to load it and such is why I said house of cards. It SEEMS like that should go smooth, but I'd bet something goes wrong. Not necessarily because of your actual project, just because of a quirk in HuC, or something of that nature. Or you put the wrong sizes in and go AH FUCK WHAT DID I DO."

Hopefully it wouldn't be all. The few background tile maps, a few of the intro sprites, etc, stuff that doesn't need speed. I don't think we need to compress every single sprite and tile, just enough to squeeze a few more fancy feature creeps.
If I had to do every single sprite in my game, I... wouldn't.
|
|
|
Post by gredler on Oct 31, 2019 20:53:31 GMT
My inquiry was totally cosmetic and additional to the core requirements for completing the game. I just want to fit in as much as we are able to, and I assume there is a lot of low hanging fruit to compress for big savings. I would love to hear more about utilities and middleware to get things into the game better or more quickly! Let's lay some pipeline! That is what originally brought me into the PCE development scene - to learn and help establish some working examples of how to develop today for this system we all love.
|
|
|
Post by Arkhan on Oct 31, 2019 22:06:10 GMT
"I would love to hear more about utilities and middle ware to get things into the game better or more quickly! Let's lay some pipeline! That is what originally brought me into the PCE development scene - to learn and help establish some working examples of how to develop today for this system we all love"

This is a definite 180 from how you responded to this concept the multiple times I told you about it before, lol. Hurray progress?
|
|
|
Post by gredler on Oct 31, 2019 23:00:26 GMT
I'm not going to argue with you, Arkhan, but I am not sure I've ever stomped on anyone's ambitions to make content or tools for PCE.
If someone wants to propose or demonstrate a utility, or an improvement to HuC, or to the pipeline as a whole, I would very much like to encourage that.
Sorry if I've said or insinuated otherwise, but I would like to be clear that of course I am open to utilities. Punch's disc util looks cool for what it is, but it doesn't serve much of a purpose for me. The 2pce tools are cool, and there are endless utils I use and make at work all the time.
|
|
|
Post by Arkhan on Oct 31, 2019 23:19:56 GMT
"im not going to do X, but..." = lol.
Really, your discord response to a functional pipeline I (aggressively) suggested definitely stomped on some ambition and lowered my "give a fuck" meter for a lot of PCE related things. You went so far as to call me close minded for saying artists can handle using a utility and we shouldn't assume artists are inept at it.
Food for thought there really and it is definitely interesting that you are now suddenly interested in pipelines/utilities/etc when before, you weren't, or blew at showing it.
|
|
|
Post by elmer on Nov 1, 2019 0:31:35 GMT
"Hopefully it wouldn't be all. The few background tile maps, a few of the intro sprites, etc, stuff that doesn't need speed. I don't think we need to compress every single sprite and tile, just enough to squeeze a few more fancy feature creeps. If I had to do every single sprite in my game, I... wouldn't."

Yep, you're in the position that you absolutely do not want to compress everything, because you're accessing/uploading some stuff in realtime, which therefore can't be in the middle of some compressed file.

As I said ... the only way that I can see this working for your project is to add some new functions, say like #inctilelzss, and then possibly add a new parameter to set_tile_data().

Arkhan is right in saying that to move forward as a developer, and to get better results in the future, then you're going to be looking at using external tools and #incbin the exported files ... but IMHO that would be something to think about in your next project, when you're ready to start writing some assembly-language code to use all of that different-format data.
|
|
|
Post by elmer on Nov 1, 2019 0:46:45 GMT
"Elmer I hadn't thought about that! Yeah, okay I'll look into that after converting it to rom friendly. I'll take a look at Chris' source."

Here's another thing for you to look at ...

; ****************************************************************************
; ****************************************************************************
;
; lzss_rom.s
;
; Simple HuC6280 decompressor for LZSS (4 bit count, 8 bit offset).
;
; This version uses a 256-byte ring-buffer in main RAM, and so can decompress
; directly to VDC memory. No self-modifying code is used, so it can be run
; from ROM in a HuCard. It is also interrupt-safe, since it doesn't use a TII
; instruction for copying from the ring-buffer.
;
; Total code size is 174 bytes.
;
; Copyright John Brandwood 2019.
;
; Distributed under the Boost Software License, Version 1.0.
; (See accompanying file LICENSE_1_0.txt or copy at
;  http://www.boost.org/LICENSE_1_0.txt)
;
; ****************************************************************************
; ****************************************************************************

                list
                mlist

                include "pcengine.inc"

                org     $A000

lz_window       =       $3F00

lz_srcptr       =       __ax
lz_dstptr       =       __bx
lz_lencnt       =       __cl
lz_bitbuf       =       __dl
lz_nibble       =       __dh

; ****************************************************************************
; ****************************************************************************
;
; lzss48_to_ram - Decompress LZSS to RAM.
;
; Args: __ax = Ptr to compressed source (in bank 3).
; Args: __bx = Ptr to destination in RAM.
; Uses: __cx, __dx.
;

lzss48_to_ram:  cly                             ; Initialize destination index.
                stz     <lz_nibble              ; Initialize empty nibble.

                sec
.load_command:  stz     <lz_lencnt              ; Avoid bug when many literals.
                lda     [lz_srcptr]             ; Reload an empty bit-buffer.
                ror     a
                sta     <lz_bitbuf
                inc     <lz_srcptr + 0
                bne     .got_command
                bsr     lzss48_src_page
                bra     .got_command

.next_command:  lsr     <lz_bitbuf              ; Get next command bit.
                beq     .load_command           ; Is the bit-buffer empty?

.got_command:   lda     [lz_srcptr]             ; Get literal/offset byte.
                inc     <lz_srcptr + 0
                bne     .skip1
                bsr     lzss48_src_page
.skip1:         bcs     .got_literal            ; CS=literal, CC=match.

.got_match:     tax                             ; Window range $FFFF..$FF00.

                lda     <lz_nibble              ; Is there a nibble waiting?
                stz     <lz_nibble
                bne     .got_nibble

                lda     [lz_srcptr]             ; Reload nibble buffer.
                inc     <lz_srcptr + 0
                bne     .skip2
                bsr     lzss48_src_page

.skip2:         sta     <lz_nibble              ; Save for next nibble.
                lsr     a                       ; Use top nibble first.
                lsr     a
                lsr     a
                lsr     a

.got_nibble:    and     #$0F                    ; Current nibble.
                beq     lzss48_finished         ; Value = 0 == Finished.

                sta     <lz_lencnt              ; Value 1..15 = Count 2..16.

.copy_loop:     lda     lz_window,x             ; Get the next byte from the
                inx                             ; window in the ring-buffer.
.got_literal:   sta     lz_window,y             ; Update the ring-buffer.
                sta     [lz_dstptr],y           ; Write the byte to the output.
                iny
                bne     .skip3
                inc     <lz_dstptr + 1

.skip3:         dec     <lz_lencnt              ; Any more bytes to copy?
                bpl     .copy_loop
                bra     .next_command

                ;
                ;
                ;

lzss48_src_page:inc     <lz_srcptr + 1          ; Does the source cross bank
                bpl     lzss48_finished         ; boundary from $7FFF to $8000?

lzss48_new_bank:tax                             ; Increment source ROM bank.
                tma3
                inc     a
                tam3
                txa

lzss48_finished:rts

; ****************************************************************************
; ****************************************************************************
;
; lzss48_to_vdc - Decompress LZSS to VDC VRAM.
;
; Args: __ax = Ptr to compressed source (in bank 3).
; Args: __bx = Ptr to VDC data write port.
; Uses: __cx, __dx.
;
; N.B. VDC write address must have already been set up.
;

lzss48_to_vdc:  cly                             ; Initialize destination index.
                stz     <lz_nibble              ; Initialize empty nibble.

                sec
.load_command:  stz     <lz_lencnt              ; Avoid bug when many literals.
                lda     [lz_srcptr]             ; Reload an empty bit-buffer.
                ror     a
                sta     <lz_bitbuf
                inc     <lz_srcptr + 0
                bne     .got_command
                bsr     lzss48_src_page
                bra     .got_command

.next_command:  lsr     <lz_bitbuf              ; Get next command bit.
                beq     .load_command           ; Is the bit-buffer empty?

.got_command:   lda     [lz_srcptr]             ; Get literal/offset byte.
                inc     <lz_srcptr + 0
                bne     .skip2
                bsr     lzss48_src_page
.skip2:         bcs     .got_literal            ; CS=literal, CC=match.

.got_match:     tax                             ; Window range $FFFF..$FF00.

                lda     <lz_nibble              ; Is there a nibble waiting?
                stz     <lz_nibble
                bne     .got_nibble

                lda     [lz_srcptr]             ; Reload nibble buffer.
                inc     <lz_srcptr + 0
                bne     .skip3
                bsr     lzss48_src_page

.skip3:         sta     <lz_nibble              ; Save for next nibble.
                lsr     a                       ; Use top nibble first.
                lsr     a
                lsr     a
                lsr     a

.got_nibble:    and     #$0F                    ; Current nibble.
                beq     lzss48_finished         ; Value = 0 == Finished.

                sta     <lz_lencnt              ; Value 1..15 = Count 2..16.

.copy_loop:     lda     lz_window,x             ; Get the next byte from the
                inx                             ; window in the ring-buffer.
.got_literal:   sta     lz_window,y             ; Update the ring-buffer.
                sta     [lz_dstptr]             ; Write the byte to the output.
                iny
                lda     #1                      ; Switch VDC lo/hi destination.
                eor     <lz_dstptr + 0
                sta     <lz_dstptr + 0

                dec     <lz_lencnt              ; Any more bytes to copy?
                bpl     .copy_loop
                bra     .next_command
That is the sort of thing that I'd probably put into HuC, if I were doing it myself. The compression ratio isn't as good as puCrunch or aPLib, but it's probably good-enough for what DK & Gredler need, and pure LZSS is pretty appropriate (timewise) for a HuC game. IMHO, it wasn't until the mid-to-late 1990s that developers really started to experiment with the more-sophisticated compression techniques that are used in puCrunch and aPLib.

A few years back, I compared running different compressors on the LoX data ...

Total uncompressed data size               : 23,278,006
Total YsIV compression                     : 10,347,187
Total Emerald Dragon compression           : 10,280,860
Total Jørgen Ibsen's aPLib compression     :  7,782,457
Total Pasi Ojala's puCrunch compression    :  7,780,261
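For reference, the sizes listed above work out to roughly 44% of the original for the two LZSS 4/8 variants and about 33% for aPLib and puCrunch. A short Python check of that arithmetic, using only the figures from the list (the dictionary layout and variable names are just for illustration):

```python
# Sizes in bytes, copied from the comparison above.
UNCOMPRESSED = 23_278_006

compressed_sizes = {
    "YsIV": 10_347_187,
    "Emerald Dragon": 10_280_860,
    "aPLib": 7_782_457,
    "puCrunch": 7_780_261,
}

# Ratio = compressed size / uncompressed size (smaller is better).
ratios = {name: size / UNCOMPRESSED for name, size in compressed_sizes.items()}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        saved = UNCOMPRESSED - compressed_sizes[name]
        print(f"{name:15s} {ratio:6.1%} of original, {saved:,} bytes saved")
```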
Both YsIV and Emerald Dragon use variations on the LZSS 4-bit/8-bit scheme. Yes, aPLib and puCrunch are better ... but if DK & Gredler can get anything close to the 50% compression of LZSS 4/8, then that will be good-enough.
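For anyone who wants to check compressed data on a PC before running it on hardware, here is a rough Python model of the stream format as it can be read off the lzss48 assembly above. It is a sketch inferred from that listing, not the author's tooling: the function name lzss48_decode is made up, it assumes the 256-byte ring buffer starts zero-filled (the assembly does not initialise it), and it ignores ROM bank switching entirely.

```python
def lzss48_decode(src: bytes) -> bytes:
    """Decode an LZSS stream with 8-bit offsets and 4-bit counts.

    Format, as inferred from the assembly:
      * A command byte supplies 8 flags, consumed from bit 0 upwards.
        Flag 1 = literal, flag 0 = match.
      * Literal: one byte, copied to the output and the ring buffer.
      * Match: one offset byte (an absolute index into the 256-byte ring
        buffer), then a 4-bit count. Counts are packed two per byte, high
        nibble first; the byte is fetched when the first of the pair is
        needed. Count 0 ends the stream, count n copies n + 1 bytes.
    Raises IndexError if the stream ends without a terminating count of 0.
    """
    window = bytearray(256)   # assumption: ring buffer starts as zeros
    out = bytearray()
    pos = 0                   # read position in src
    wr = 0                    # ring-buffer write index (the Y register)
    pending = None            # low nibble saved for the next match, if any

    while True:
        flags = src[pos]
        pos += 1
        for bit in range(8):
            value = src[pos]  # literal byte or match offset
            pos += 1

            if (flags >> bit) & 1:
                # Literal: store in the ring buffer and emit.
                window[wr] = value
                wr = (wr + 1) & 0xFF
                out.append(value)
                continue

            # Match: fetch the 4-bit count.
            if pending is None:
                packed = src[pos]
                pos += 1
                count = packed >> 4
                pending = packed & 0x0F
            else:
                count = pending
                pending = None

            if count == 0:
                return bytes(out)

            rd = value        # ring-buffer read index (the X register)
            for _ in range(count + 1):
                byte = window[rd]
                rd = (rd + 1) & 0xFF
                window[wr] = byte
                wr = (wr + 1) & 0xFF
                out.append(byte)


if __name__ == "__main__":
    # Three literals "ABC", a 3-byte match at offset 0, then end-of-stream.
    stream = bytes([0x07, 0x41, 0x42, 0x43, 0x00, 0x20, 0x00])
    print(lzss48_decode(stream))
```

Because the copy goes through the ring buffer one byte at a time, a match may overlap the bytes it is writing, which is how short repeating runs get encoded.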
|
|
|
Post by gredler on Nov 1, 2019 0:50:09 GMT
"... or blew at showing it."

This^. Sorry to be confusing, and to have stirred the pot.
|
|
|
Post by turboxray on Nov 1, 2019 1:43:04 GMT
Update: Ohh! Chris' version is a modification of my last version! Nice haha. Yeah, a straight LZSS option would also be nice (especially for speed). A 4/8bit or 4/12bit version. Do you have the source for that compressor?
|
|
|
Post by elmer on Nov 2, 2019 4:50:57 GMT
"Do you have the source for that compressor?"

To be honest, and somewhat embarrassingly ... I'm afraid not.

That LZSS decompression code is a PCE-modification of some LZSS example code that I wrote for someone on AtariAge this year, to point out that a 6502 LZSS decompressor can be smaller and faster than an LZ4 decompressor, and that classic-LZSS still has a place on old computers/consoles.

My compression testbed still has some 32-bit x86 assembly language code at its core, and so it's not really usable in HuC. Rewriting that is one of the many un-done things on my task list.
|
|