fragmare
Punkic Cyborg
Posts: 116
Homebrew skills: Graphics, Music, Level Design, Annoying Programmers
Post by fragmare on Mar 25, 2019 19:59:26 GMT
If somebody here has an EverDrive, and it's not a lot of trouble, I'd like to hear how this sounds on real hardware. It's just a test ROM output from HuSIC, but I'm trying to analyze how the audio is played back on a physical system (as compared to, say, Ootake or Mednafen). I need to verify whether or not something is happening when it is played on real hardware, particularly with the wavetable RAM updating. The HES file is attached below, if anyone is interested.
Attachment: sample_te.hes (88 KB)
Post by digipiggy on Mar 26, 2019 1:51:51 GMT
Post by elmer on Mar 26, 2019 1:54:30 GMT
Sorry, I don't have any way to record the output, but I did play it on my TED and CoreGrafx II. I don't know what you're wanting to hear (or not), but **if** you're talking about the audible click/pop in the tones when they stop ... then "yes", that is what happens on real hardware when you update a wavetable. It was fixed in the HuC6280A that was used in the SuperGrafx and CoreGrafx, but reverted to "clicky" in the CoreGrafx II, Duo, DuoR, etc.
Then again, I expect that you already know that, and that there is something else that you're expecting to hear (or not hear).
<edit> OK, I was beaten to the response. Thanks, digipiggy!
Post by fragmare on Mar 26, 2019 19:10:54 GMT
Thanks, digipiggy! Greatly helpful. It would seem that an HES file generated by either Deflemask *or* HuSIC puts an excessive gap between the wavetables when you switch them (on BOTH the HuC6280 and HuC6280A chip revisions). On real hardware, the gap seems to be somewhere in the range of 55 to 85 samples in a 44100 Hz WAV recording.
While I realize there is always going to be a tiny gap (as the channel must be disabled, the new wavetable data loaded into the registers, then the channel re-enabled), the observed gap seems to be excessively long. Each of the 32 samples of wavetable data should only take 9-11 CPU cycles to load into the registers... so at most 32 x 11 = 352 CPU cycles. 352 cycles is an almost indescribably tiny amount of time (~0.000049 sec at 7.16 MHz), and shouldn't be very noticeable at all, either audibly or visually when looking at the WAV output while zoomed in. The observed gap, however, is more like 0.0019274 sec (85 samples), and that is very noticeable, both audibly and visually.
This does not appear to be something inherent only to Deflemask's HES output, as it also happens with HES files generated by HuSIC. Either way, though, when you overclock the CPU in an emulator, the gap is reduced... which makes me believe it could be due to unoptimized and/or poorly written assembly code that's just not getting the audio sample data into the wavetable registers in a timely manner.
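The arithmetic in the post above can be checked with a short Python sketch. It assumes the commonly cited PC Engine CPU clock of about 7.159 MHz (the thread itself only says 7.16 MHz) and a 44100 Hz recording; the function names are made up for illustration.

```python
CPU_HZ = 7_159_090   # assumed HuC6280 high-speed clock (~7.16 MHz)
WAV_HZ = 44_100      # sample rate of the WAV recording

def cycles_to_seconds(cycles, cpu_hz=CPU_HZ):
    """Duration of `cycles` CPU cycles, in seconds."""
    return cycles / cpu_hz

def wav_samples_to_cycles(samples, wav_hz=WAV_HZ, cpu_hz=CPU_HZ):
    """CPU cycles that elapse during `samples` samples of the recording."""
    return samples / wav_hz * cpu_hz

# Upper estimate from the post: 32 wavetable entries at 11 cycles each.
expected_cycles = 32 * 11
expected_gap_s = cycles_to_seconds(expected_cycles)

# Observed gap of 55 to 85 recorded samples, converted to CPU cycles.
observed_gap_cycles = (wav_samples_to_cycles(55), wav_samples_to_cycles(85))

print(f"expected: {expected_cycles} cycles = {expected_gap_s * 1e6:.1f} us")
print(f"observed: {observed_gap_cycles[0]:.0f} to {observed_gap_cycles[1]:.0f} cycles")
```

Under those assumptions the observed gap works out to thousands of CPU cycles rather than a few hundred, which is the mismatch the post describes.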
Post by Black_Tiger on Mar 26, 2019 19:35:16 GMT
Keep in mind that different models and possibly different revisions of Everdrives play chiptune roms differently and with seemingly random results.
Sometimes the same TED will play a sound file differently depending on whether it is named .hes or .pce. With some files it makes no difference and with others they never work at all.
Sometimes one or more channels are missing. Sometimes the volume of some channels is way off. Sometimes they play too slow or too fast.
It seemed that with my Tototek card, they just either did or did not work. But I didn't have as many sound files to try back then.
The Super SD System 3 seems to be roughly as consistent as the various TEDs, but with its own somewhat unique results.
Post by fragmare on Mar 26, 2019 19:57:52 GMT
Yea, I'm aware of the various issues with different flash cartridge options playing HES files, but I don't think this is necessarily a flash card issue... I'm thinking this is more of a "too much CPU overhead in the HES sound player routine" issue, at this point.
Post by elmer on Mar 27, 2019 0:49:23 GMT
Since there's no officially-sanctioned standard for HES files, and they're just HuCard ROMs with a built-in "player" that stuffs a pre-recorded stream of data bytes into the PSG registers every 1/60th of a second (or more often), you're totally at the mercy of whoever wrote the player code that is embedded in any particular HES file.
The last time that I looked at the code in Deflemask's HES player, I was pretty horrified at how it was doing things, such as uploading new waveforms, and how the whole program had been completely compromised by the need to run in a tightly-timed software loop so that the player could support 32 kHz sample playback (at 100% CPU usage). I've not even looked at HuSIC's HES files.
Post by fragmare on Mar 27, 2019 1:09:24 GMT
Yea, the Deflemask HES player is a mess. I would be willing to bet that whoever wrote Deflemask's HES exporter used at least some code derived from stuff they saw HuSIC doing, which is open source. In other words, HuSIC could be where this particular bug originated in the first place (since both are exhibiting the same exact issue), so correcting it in HuSIC might make fixing it in Deflemask (or creating a tool to fix Deflemask's HES output) rather trivial, therefore kind of killing two birds with one stone... maybe.
zeromus and I have tried to take a closer look at it, and it appears that the HuSIC compiler uses a C loop that looks like:
for (i = 0; i < 32; i++) { poke(SND_WAV, *pcmpos); pcmpos++; }
to upload data into the channel RAM, setting volume=0 before the loop and restoring it after. He suggested taking snd_chg(int num) in hus.c, removing poke(SND_MIX, 0x00) from the function, and giving it a go.
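The sequence described above (silence the channel, write 32 values to one data register, restore the channel) can be sketched as a toy model in Python. Everything here is an assumption for illustration: the class, the auto-incrementing write index, and the rule that writes only land in waveform RAM while the top two control bits are clear. `poke_mix` and `poke_wav` merely stand in for HuSIC's `poke(SND_MIX, ...)` and `poke(SND_WAV, ...)` calls.

```python
WAVE_LENGTH = 32

class ToyChannel:
    """Illustrative stand-in for one PSG channel; not hardware-accurate."""

    def __init__(self):
        self.control = 0x00
        self.wave_ram = [0] * WAVE_LENGTH
        self.write_index = 0

    def poke_mix(self, value):
        """Stand-in for poke(SND_MIX, value): set the control register."""
        self.control = value & 0xFF

    def poke_wav(self, value):
        """Stand-in for poke(SND_WAV, value): write one waveform entry."""
        # Assumed rule: the write is dropped unless the top two control
        # bits are clear.
        if self.control & 0xC0:
            return
        self.wave_ram[self.write_index] = value & 0x1F   # 5-bit entries
        self.write_index = (self.write_index + 1) % WAVE_LENGTH

def upload_waveform(channel, samples, control_after):
    """Disable the channel, write every sample, then restore the control value."""
    channel.poke_mix(0x00)
    for sample in samples:
        channel.poke_wav(sample)
    channel.poke_mix(control_after)

channel = ToyChannel()
channel.poke_mix(0x9F)   # hypothetical "enabled" control value
triangle = list(range(0, 32, 2)) + list(range(30, -2, -2))
upload_waveform(channel, triangle, control_after=0x9F)
```

The model only shows the order of operations. It says nothing about how many cycles each `poke` costs once HuC has compiled it, which is the part the thread is actually about.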
Post by fragmare on Mar 27, 2019 23:38:54 GMT
Here is the part of the C code in HuSIC that pertains to updating wavetable data into channel RAM... github.com/BouKiCHi/husic_git/blob/master/src/husic/hus.c#L255
I think that zeromus is thinking that by removing the command setting the vol=0 here, poke(SND_MIX, 0x00);, it would save time in updating at the cost of leaving the channel volume on and possibly getting garbage for a tiny amount of time. Iirc, though, the channel is supposed to be *disabled*, then the waveform data loaded, then re-enabled after, so setting vol=0 might be necessary. Ryphecha also mentioned that changing the volume is not instantaneous, and would take 256 x 16 (only 12 used) cycles at 7.16 MHz to sweep across all channels (ch0 to ch5)...
Still, though, I continue to see games with sound tests just butting wavetables right up against each other with little or no audible or visual gap between, so it begs the question of how exactly they're getting the wavetable data into the channel RAM so quickly.
Post by elmer on Mar 28, 2019 0:49:20 GMT
> I think that zeromus is thinking that by removing the command setting the vol=0 here, poke(SND_MIX, 0x00);, it would save time in updating at the cost of leaving the channel volume on and possibly getting garbage for a tiny amount of time. Iirc, though, the channel is supposed to be *disabled*, then the waveform data loaded, then re-enabled after, so setting vol=0 might be necessary.
You might want to suggest that zeromus take a look at the documentation for the PSG. The reason for setting the top 2 bits of the register to zero is pretty clear. Testing shows that it doesn't really matter what you set the volume part of the register to when the top 2 bits are zero ... you're still going to get a "click" on the original HuC6280. The best that you seem to be able to do is to re-enable the channel ASAP and try to get as much of the "click" as possible removed by the PCE's low-pass filter.
> Still, though, I continue to see games with sound tests just butting wavetables right up against each other with little or no audible or visual gap between, so it begs the question of how exactly they're getting the wavetable data into the channel RAM so quickly.
Because it's not difficult ... it's just an assembly-language thing, and not a HuC thing. The HuC6280's "TIN" instruction exists pretty much exactly for this situation. The PSG is the only piece of hardware in the PC Engine that really benefits from the TIN instruction. You can upload an entire waveform in 209 cycles with TIN (and "yes", that's what I'm doing in Huzak, so I know that it works).
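The 209-cycle figure matches the timing usually given for the HuC6280 block-transfer instructions: 17 cycles of setup plus 6 cycles per byte moved. A Python sketch of that arithmetic follows, with the formula and the 7.159 MHz clock treated as assumptions rather than anything stated in the thread.

```python
CPU_HZ = 7_159_090     # assumed CPU clock (~7.16 MHz)
WAVEFORM_BYTES = 32    # one PSG waveform is 32 entries

def block_transfer_cycles(length):
    """Assumed cost of one block-transfer instruction moving `length` bytes."""
    return 17 + 6 * length

tin_cycles = block_transfer_cycles(WAVEFORM_BYTES)
tin_microseconds = tin_cycles / CPU_HZ * 1e6

print(f"TIN upload: {tin_cycles} cycles, about {tin_microseconds:.1f} us")
```

At a 44100 Hz recording rate that is on the order of one sample period, which fits the observation that commercial games can switch waveforms with little or no visible gap.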
Post by fragmare on Mar 28, 2019 1:20:29 GMT
Yea, I'm not too worried about the inherent "click" on the non-A revision chip. That's just something you have to accept, just like some Sega Genesis chip revisions sounding better than others. But the wavetable update latency issue is a HuC-related bug that seems to affect both HES files generated from Deflemask and HuSIC.
It's interesting that you mention the TIN function. That must be one of the things the HuC6280 does that your standard 6502 won't do... that makes me think that the ASM generated by HuC is doing things the "old fashioned" 6502 way, whereas if it were handwritten asm using the special TIN function, it could take place in 209 or so cycles (a much more reasonable time than what's being observed in HES files).
The aim here is to try to create some sort of patch that prevents HES files from having this excessive wavetable update latency when generated by HuSIC and Deflemask. I'm guessing this will entail writing a custom assembly routine using the TIN function and replacing whatever gibberish asm HuC generates to accomplish the same thing.
I know you're also working on Huzak and all, which is probably a more complete solution to the overall problem of messy HES sound players, but I haven't heard anything about that for some time. Was there ever a downloadable binary or source of that? We're simply looking to kill this particular wavetable update latency bug, at the moment.
Post by elmer on Mar 31, 2019 18:59:56 GMT
> Yea, the Deflemask HES player is a mess. I would be willing to bet that whoever wrote Deflemask's HES exporter used at least some code derived from stuff they saw HuSIC doing, which is open source. In other words, HuSIC could be where this particular bug originated in the first place (since both are exhibiting the same exact issue), so correcting it in HuSIC might make fixing it in Deflemask (or creating a tool to fix Deflemask's HES output) rather trivial, therefore kind of killing two birds with one stone... maybe.
Deflemask's HES player is definitely hand-coded assembly language, and not the output of a compiler. It's just lazily-designed, and follows the same logic that all of Deflemask's other ROM exports use. That logic just doesn't work well for the PCE's PSG, and they really should have written some special-case code for handling the waveform data on the PCE.
Here's Deflemask's HES ROM 60Hz timer IRQ for uploading new PSG settings ...
60Hz Timer IRQ to pump the PSG registers:
$00:FE83  stz $1403     ; 9C 03 14
$00:FE86  lda $200D     ; @ $200D = $03 ; AD 0D 20
$00:FE89  sta $0800     ; 8D 00 08

$00:FE8C  lda [$01]     ; @ $4600 = $86 ; B2 01
$00:FE8E  inc <$01      ; @ $2001 = $00 ; E6 01
$00:FE90  bne $FE94     ; D0 02
$00:FE92  inc <$02      ; @ $2002 = $46 ; E6 02
$00:FE94  tax           ; AA
$00:FE95  bpl $FEAA     ; 10 13
$00:FE97  lda [$01]     ; @ $4600 = $86 ; B2 01
$00:FE99  inc <$01      ; @ $2001 = $00 ; E6 01
$00:FE9B  bne $FE9F     ; D0 02
$00:FE9D  inc <$02      ; @ $2002 = $46 ; E6 02
$00:FE9F  sta $0780,x   ; @ $0806 ; 9D 80 07
$00:FEA2  dex           ; CA
$00:FEA3  bmi $FE8C     ; 30 E7
$00:FEA5  sta $200D     ; 8D 0D 20
$00:FEA8  bra $FE8C     ; 80 E2

$00:FEAA  cmp #$06      ; C9 06
$00:FEAC  beq $FECD     ; F0 1F
$00:FEAE  bcc $FEF2     ; 90 42
$00:FEB0  cmp #$09      ; C9 09
$00:FEB2  beq $FED5     ; F0 21
$00:FEB4  cmp #$07      ; C9 07
$00:FEB6  beq $FEE2     ; F0 2A
$00:FEB8  lda <$06      ; @ $2006 = $01 ; A5 06
$00:FEBA  bne $FEBF     ; D0 03
$00:FEBC  jmp $FE3A     ; 4C 3A FE
$00:FEBF  lda <$05      ; @ $2005 = $00 ; A5 05
$00:FEC1  tam2          ; 53 04
$00:FEC3  lda <$03      ; @ $2003 = $23 ; A5 03
$00:FEC5  sta <$01      ; 85 01
$00:FEC7  lda <$04      ; @ $2004 = $40 ; A5 04
$00:FEC9  sta <$02      ; 85 02
$00:FECB  bra $FE8C     ; 80 BF

$00:FECD  ldx #$7F      ; A2 7F
$00:FECF  lda #$7C      ; A9 7C
$00:FED1  ldy #$79      ; A0 79
$00:FED3  sec           ; 38
$00:FED4  rti           ; 40
HES data for uploading a waveform ... YUK!
$03:4600: .db $86, $10, $86, $01, $86, $17, $86, $11,
$03:4608: .db $86, $0F, $86, $11, $86, $17, $86, $08,
$03:4610: .db $86, $0D, $86, $0F, $86, $0D, $86, $08,
$03:4618: .db $86, $00, $86, $0E, $86, $00, $86, $1F,
$03:4620: .db $86, $14, $86, $0B, $86, $03, $86, $01,
$03:4628: .db $86, $01, $86, $03, $86, $0B, $86, $17,
$03:4630: .db $86, $1C, $86, $1E, $86, $1E, $86, $1C,
$03:4638: .db $86, $17, $86, $0D, $86, $02, $84, $94,
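Read against the IRQ routine above, each pair of bytes in this dump appears to be one register write: a command byte with bit 7 set, followed by the value that `sta $0780,x` stores. The Python sketch below decodes the excerpt on that reading. The function name is made up, and command bytes with bit 7 clear (the `bpl $FEAA` path) are deliberately not modeled.

```python
STORE_BASE = 0x0780   # base address from the `sta $0780,x` instruction

# The 64 bytes shown in the dump at $03:4600.
EXCERPT = bytes([
    0x86, 0x10, 0x86, 0x01, 0x86, 0x17, 0x86, 0x11,
    0x86, 0x0F, 0x86, 0x11, 0x86, 0x17, 0x86, 0x08,
    0x86, 0x0D, 0x86, 0x0F, 0x86, 0x0D, 0x86, 0x08,
    0x86, 0x00, 0x86, 0x0E, 0x86, 0x00, 0x86, 0x1F,
    0x86, 0x14, 0x86, 0x0B, 0x86, 0x03, 0x86, 0x01,
    0x86, 0x01, 0x86, 0x03, 0x86, 0x0B, 0x86, 0x17,
    0x86, 0x1C, 0x86, 0x1E, 0x86, 0x1E, 0x86, 0x1C,
    0x86, 0x17, 0x86, 0x0D, 0x86, 0x02, 0x84, 0x94,
])

def decode_register_writes(stream):
    """Return (address, value) pairs for each register-write command."""
    writes = []
    position = 0
    while position + 1 < len(stream):
        command = stream[position]
        if command < 0x80:
            break   # control command: outside the scope of this sketch
        writes.append((STORE_BASE + command, stream[position + 1]))
        position += 2
    return writes

writes = decode_register_writes(EXCERPT)
waveform_values = [value for address, value in writes if address == 0x0806]

for address, value in writes:
    print(f"${address:04X} <- ${value:02X}")
```

On this reading the excerpt decodes to 31 writes to $0806 followed by one write of $94 to $0804, so every waveform entry costs two stream bytes and one full pass through the fetch loop.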
Post by elmer on Apr 1, 2019 0:05:15 GMT
> But the wavetable update latency issue is a HuC-related bug that seems to affect both HES files generated from Deflemask and HuSIC. It's interesting that you mention the TIN function. That must be one of the things the HuC6280 does that your standard 6502 won't do... that makes me think that the ASM generated by HuC is doing things the "old fashioned" 6502 way, whereas if it were handwritten asm using the special TIN function, it could take place in 209 or so cycles (a much more reasonable time than what's being observed in HES files).
There is no HuC "bug" here. The compiler is generating code that does exactly what the HuSIC author is asking it to do. It's just slow because HuC (like all current 6502 C compilers) is generating slow code. That's the price that you pay for coding in C instead of assembly language on one of these old 8-bit CPUs that were never designed to run C code.
Uli's improvements to HuC do include a couple of features that help to generate faster code (such as "-msmall" and "-fno-recursive"), but the programmer needs to understand what the tradeoffs are for using those (highly-recommended) features.
Since HuSIC's author was using HuC, a language which makes it really easy to include sections of optimized assembly language within the C code, it is really up to him (or to you) to actually use that capability to optimize HuSIC's waveform upload. The kind of situation-specific optimization that I, or anyone else, would need to include in HuC to detect this one specific piece of C code, and to optimize it into a single instruction, is just not going to happen in the near future, nor ever in my opinion.
> The aim here is to try to create some sort of patch that prevents HES files from having this excessive wavetable update latency when generated by HuSIC and Deflemask. I'm guessing this will entail writing a custom assembly routine using the TIN function and replacing whatever gibberish asm HuC generates to accomplish the same thing.
You can probably improve HuSIC just by optimizing the HuC code with a little in-line assembly. You're totally out of luck with Deflemask. The problem is baked into the data format that it uses. No simple patch will fix that. That's something that only Delek can fix, since he's the only one with the source code.
Post by fragmare on Apr 6, 2019 6:53:13 GMT
Man! No Deflemask updates, no Huzak... I guess everybody wanting to create PCE chip music is just fucked, then. It's almost like somebody needs to create an open source PCE sound driver for Deflemask modules or something... oh wait...
Post by gredler on Apr 7, 2019 5:10:47 GMT
Squirrel doesn't work without some local changes to it if you're using the new HuC, but it was a valid option for a lot of people before the HuC 3.99 changes that seemed to bring success to those who could use MML to make their music happen. I wonder if Squirrel has this pop you're talking about? Have you messed with MML? Is it not a viable option for your needs?