|
Post by DarkKobold on Feb 7, 2020 2:37:43 GMT
So, according to Punch, the vertical tears we get during gameplay are most likely due to palette loading. This is consistent with my experience.
He says "Do it as close after vsync() as possible" and that seems to work... except I need to come up with a function to "queue" palette loads... unless this already exists?
Additionally, he's mentioned that more than 2-3 palette loads are going to cause graphical glitches regardless. Can anyone confirm this? Or have strategies or workarounds?
|
|
touko
Punkic Cyborg
Posts: 106
|
Post by touko on Feb 7, 2020 9:53:09 GMT
Yes, I can confirm: all palette changes must be done in vblank, especially if you do them in C. Those graphical glitches happen because you access CRAM at the same time as the VCE, which causes the VCE to display the wrong color. Even in ASM it's really difficult, if not impossible, to change 2-3 palettes without glitches in the display area, but in vblank it's easily doable, at least in ASM with the tia opcode.
|
|
|
Post by turboxray on Feb 8, 2020 14:38:24 GMT
Are you doing fading or just cycling through some palettes and updating them? The typical method, as mentioned, is to write these updates during vblank. A Txx block move will get the job done fairly quickly if you're just copying palette data: about 730 cycles for 3 palettes (if they're not all sequential in CRAM) with Txx moves. I don't know where your palettes are stored, so it's probably best to copy them over to RAM during active display and update them during vblank.
To put those cycle numbers in perspective, a 224-line visible display has around 38 vblank lines, which is ~17k CPU cycles. HuC is going to eat up some of that for automated processes, but there should still be plenty of room for the update. Some games do update palette entries during hblank, and for 3 palettes that would take about 8-10 scanlines at the top of the display, but that's definitely not an HuC thing (too much stuff getting in the way).
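The cycle arithmetic above can be checked with a small sketch. It assumes roughly 455 CPU cycles per scanline at the fast CPU clock, a figure that is not stated in the thread, and the function names are made up for illustration.

```c
#include <assert.h>

/* Assumption (not from the thread): one scanline lasts about 455 CPU
   cycles at the ~7.16 MHz clock. */
#define CYCLES_PER_LINE 455

/* CPU cycles available in a vblank that lasts vblank_lines scanlines. */
long vblank_cycles(int vblank_lines)
{
    assert(vblank_lines >= 0);
    return (long)vblank_lines * CYCLES_PER_LINE;
}

/* Whole-number percentage of the vblank budget used by an update. */
long budget_percent(long update_cycles, int vblank_lines)
{
    long budget = vblank_cycles(vblank_lines);
    assert(budget > 0);
    return (update_cycles * 100) / budget;
}
```

Under that assumption, 38 vblank lines give 17290 cycles, which matches the "~17k" figure, and a 730-cycle update of 3 palettes uses about 4% of it before HuC's own vblank work is counted.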
|
|
|
Post by DarkKobold on Feb 8, 2020 20:25:27 GMT
I ended up doing this in a more "brain dead" way, since it's only a problem during the BEU. Everything else in the game uses static palettes. It uses a ring buffer and limits palette loads to 3 per frame. Hitting that limit is unlikely to ever happen, but the logic is there just in case. The functions Pal0(), Pal1(), etc. are just a bunch of split-up statements that load palettes based on global constants.
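/* Add one request to the 12-entry ring buffer. */
/* Note: there is no overflow check if more than 12 requests are pending. */
queuepal(paltoload,palnum)
char paltoload,palnum;
{
    PTL[PalPos2]=paltoload;
    PN[PalPos2]=palnum;
    PalPos2++;
    if (PalPos2>11) PalPos2=0;   /* wrap the write index */
    PalCount++;
}

/* Call right after vsync(): performs at most 3 queued loads per call. */
palloader()
{
    char i,j;
    if (!PalCount) return;
    if (PalCount>3) { j=3; PalCount-=3; }
    else { j=PalCount; PalCount=0; }
    for (i=0; i<j; i++)
    {
        /* hand the request to the loaders through globals */
        paltemp1=PN[PalPos1];
        paltemp2=PTL[PalPos1];
        PalPos1++;
        if (PalPos1>11) PalPos1=0;   /* wrap the read index */
        if (paltemp1<10) { Pal0(); }
        else if (paltemp1<20) { Pal1(); }
        else { Pal2(); }
    }
}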
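For readers who want to try the same idea off the console, here is a minimal sketch of the ring buffer in standard C. It is not the code from the game: `queue_pal`, `flush_pals`, and `do_load` are hypothetical names, `do_load` only counts calls instead of touching hardware, and a full-queue check has been added that the original does not have.

```c
#include <assert.h>

#define QUEUE_SIZE 12    /* same capacity as the 0..11 indices above */
#define MAX_PER_FRAME 3  /* loads performed per vblank */

static unsigned char q_src[QUEUE_SIZE];   /* first value of each request */
static unsigned char q_slot[QUEUE_SIZE];  /* second value of each request */
static unsigned char q_head, q_tail, q_count;

int loads_done;  /* counts loads, for demonstration only */

/* Hypothetical stand-in for Pal0()/Pal1()/Pal2(): no hardware access. */
void do_load(unsigned char src, unsigned char slot)
{
    (void)src;
    (void)slot;
    loads_done++;
}

/* Queue one load. Returns 0 and drops the request if the queue is full. */
int queue_pal(unsigned char src, unsigned char slot)
{
    if (q_count >= QUEUE_SIZE) return 0;
    q_src[q_tail] = src;
    q_slot[q_tail] = slot;
    q_tail = (unsigned char)((q_tail + 1) % QUEUE_SIZE);
    q_count++;
    return 1;
}

/* Call right after vsync(): performs at most MAX_PER_FRAME queued loads
   and returns how many were performed. */
int flush_pals(void)
{
    int n = 0;
    assert(q_count <= QUEUE_SIZE);
    while (q_count > 0 && n < MAX_PER_FRAME) {
        do_load(q_src[q_head], q_slot[q_head]);
        q_head = (unsigned char)((q_head + 1) % QUEUE_SIZE);
        q_count--;
        n++;
    }
    return n;
}
```

With five requests queued, the first `flush_pals()` performs three loads and the second performs the remaining two, which is the 3-per-frame cap described in the post.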
|
|
|
Post by elmer on Feb 8, 2020 22:16:50 GMT
You only have the time to update a few individual colors during an hblank, not even a single 16-color palette. As turboxray says, large-scale palette updates should really be done during the vblank period, when you've got thousands of cycles of time to do them. But still, this isn't something that you write in C code; you need to use the library functions, which internally use Txx instructions for speed.
Have you tried a vsync() followed by a load_palette()? Is that the scheme that is only allowing you to update 2 or 3 palettes without glitches? If so, then that would be because HuC isn't following the System Card's method of updating palettes, and so it is susceptible to delays which could then cause visible glitches.
If you're talking about your CD game, then you can use the System Card's functions, which can happily load up all 32 palettes (both background and sprite) during a single vblank.
The easiest way to do this is to just keep a copy of all 32 palettes in RAM, and then set the variables that tell the System Card to upload the colors from RAM to the VCE on the next vblank. IMHO, that is probably how the HuC library functions should be changed to work.
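The "keep a copy in RAM and upload on the next vblank" scheme can be sketched as follows. This is a host-side illustration only: `fake_vce` stands in for the real VCE color RAM, and `set_color`, `on_vblank`, and `upload_pending` are hypothetical names, not System Card or HuC identifiers.

```c
#include <assert.h>

#define NUM_COLORS 512  /* 32 palettes x 16 colors */

unsigned short shadow_cram[NUM_COLORS];  /* RAM copy, safe to edit any time */
unsigned short fake_vce[NUM_COLORS];     /* stand-in for VCE color RAM */
unsigned char upload_pending;            /* set by game code, cleared at vblank */

/* Game code: change a color in RAM only and request an upload. */
void set_color(int index, unsigned short value)
{
    assert(index >= 0 && index < NUM_COLORS);
    shadow_cram[index] = (unsigned short)(value & 0x1FF);  /* 9-bit color */
    upload_pending = 1;
}

/* Vblank handler: the only place that writes to the (simulated) VCE. */
void on_vblank(void)
{
    int i;
    if (!upload_pending) return;
    for (i = 0; i < NUM_COLORS; i++)
        fake_vce[i] = shadow_cram[i];
    upload_pending = 0;
}
```

Because the display hardware is only written inside `on_vblank()`, game code can change colors at any point in the frame without racing the visible scanlines, and it can poll `upload_pending` to learn when the upload has happened.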
|
|
|
Post by DarkKobold on Feb 8, 2020 22:47:17 GMT
If you look above, I did fix it by queuing palette changes and then performing them immediately after vsync. Honestly, this queue would be way better if HuC itself handled it, and queued palettes called with "load_palette" to be updated right after a vblank. You can see these tears in other homebrew that uses HuC, so I think it's just a common occurrence.
|
|
|
Post by DarkKobold on Feb 8, 2020 22:53:20 GMT
elmer said: "If you're talking about your CD game, then you can use the System Card's functions, which can happily load up all 32 palettes (both background and sprite) during a single vblank."
Speaking of HuC... what are these? If they aren't available as part of the HuC library, I'm not going to be able to use them, as in... I don't know what they are.
|
|
|
Post by elmer on Feb 9, 2020 5:22:35 GMT
The System Card function is called ex_colorcmd ... you just set a couple of System Card variables, and then your palettes get uploaded to the VCE on the next vblank. You can check one of the System Card variables to see when the upload is done. It's in the CD BIOS documentation that is in my programming sticky.
If HuC is used to make a CD game, then the System Card function is called, and works as intended. If HuC is used to make a HuCard game, then setting the variables does nothing.
Anyway ... the changes to make HuC use a queue, and to upload palettes safely during the vblank, are actually really trivial. I hadn't heard any complaints about the old behavior, so I never changed the way that it worked. *BUT*, be warned that the changes *MIGHT* cause things to behave differently if someone was *RELYING* on the previous behavior.
Here's a 64-bit Windows test version of HuC for you to try ... huc 2020-02-08 test version
You can have up to 7 different load_palette() calls queued up at once waiting for the next vblank, and then it will automatically pause and wait for the next vblank before letting you add anything more to the queue. If you try to queue more than 32 palette changes (i.e. all SPR and BG colors), then it's going to run out of time and glitch, just as before.
N.B. This is a 64-bit Windows build, primarily because I can't be bothered to make a 32-bit build at the moment. If you're running on a 32-bit version of Windows, just let me know.
As far as I can see, this is working fine with the last version of the Catastrophy source that I grabbed in November of last year. Please let me know if it doesn't work for you, because I've only given it a limited amount of testing (i.e. Catastrophy and the HuC examples work).
|
|
|
Post by turboxray on Feb 9, 2020 16:52:48 GMT
DarkKobold said: "If you look above - I did fix it by queuing palette changes, and then performing them immediately after vsync. Honestly, this queue would be way better if HuC itself handled it, and queued palettes called with "load_palette" to be updated right after a vblank. You can see these tears in other homebrew that use HuC, so I think its just a common occurrence."
Just to be clear, do you mean scrolling tears or palette update artifacts on screen???
|
|
|
Post by gredler on Feb 9, 2020 17:14:51 GMT
There are palette update artifacts on screen, which are horizontal lines of garbage pixels. They kinda look like tears, so that's the word we've been using, but "tears" is probably an inaccurate description, as this can and does happen without any scrolling. I've always noticed it, but when we added color cycling to a background it became constant and notably buggy looking.
|
|
|
Post by DarkKobold on Feb 9, 2020 18:20:38 GMT
elmer said: "You can have up to 7 different load_palette() calls queued up at once waiting for the next vblank, and then it will automatically pause and wait for the next vblank before letting you add anything more to the queue. If you try to queue more than 32 palette changes (i.e. all SPR and BG colors), then it's going to run out of time and glitch, just as before."
So, before I try this, I'd like to understand this better.
I can queue more than 7? But I can't queue more than 32? Does it do 7 per vsync, or as many as are queued? Can I load an entire level's worth of palettes during a load function? Do I need to do 7, vsync, 7, vsync? It's very confusing at the moment.
Regardless, I think that this is a huge welcome change, and will improve all homebrews built with HuC. Gredler hates seeing this, especially.
|
|
|
Post by elmer on Feb 9, 2020 20:15:47 GMT
gredler said: "There are palette update artifacts on screen which are horizontal lines of garbage pixels. They kinda look like tears so that's the word we've been using, but tears is probably an inaccurate description as this can and does happen without any scrolling."
Yep, that sounds like it could be the known issue of writing to the VCE palettes while the screen is being displayed. Especially if you're not using HuC's split-screen scrolling at the time. The easiest way to be sure is to just try the new HuC build.
DarkKobold said: "So, before I try this, I'd like to understand this better. I can queue more than 7? But I can't queue more than 32? Does it do 7 per vsync, or as many as queue'd? Can I load an entire levels worth of palettes during a load function? Do I need to do 7 vsync 7 vsync? Its very confusing at the moment."
Let's get on the same page as far as terminology, since I use the terms slightly differently to turboxray.
The PCE has 512 color registers, 256 for the backgrounds (0-255), and then 256 for the sprites (256-511). Those are grouped as 32 palettes, with 16 colors in each palette. So the backgrounds use palettes 0-15, and the sprites use palettes 16-31.
Each HuC call to load_palette() can load up to 32 individual palettes (with 16 colors in each palette), starting at any palette number from 0-31.
The new code in HuC can queue up 7 different calls to load_palette() and wait for the next vblank to upload the color changes to the VCE. If you call load_palette() an 8th time during a single frame, then it will pause and wait until after the next vblank before returning (and the palette changes in the 8th call won't be processed during that vblank; they will be queued for the next vblank).
So, theoretically, you could queue up 7 calls to load_palette(), with each call uploading 32 palettes (all 512 colors) ... and it wouldn't crash, but you would see lots of screen tearing again, because there is only time to load a total of 32 palettes during the vblank (there is actually more time, but why-on-earth would anyone try to load more than 32 palettes?).
The point is that you generally don't load all 32 palettes in a single load_palette(); you probably load the 16 background palettes, and then load the 16 sprite palettes separately. You might split things up even more and load different subsets of sprite palettes for different players/enemies. The point is ... you can load up to 7 different groups of palettes in 1 frame, with a total of 32 palettes (i.e. every color register in the PCE). It would be easy to bump the queue up to 15, if people would find that useful.
Another thing is that if you're doing color cycling on your backgrounds, then you should really only upload the palettes that you're changing, and not all 512 colors (to save time).
One final thing ... if you're running the System Card player for music/sfx (i.e. Squirrel), and you're running it with the timer interrupt instead of the vblank ... then that could potentially delay things and mean that you can upload fewer than 32 palettes without tearing. I haven't checked to see if the System Card player re-enables interrupts once it is called.
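The queueing rules described in this post can be modeled in a few lines of standard C. This is only a toy model of the behavior as described, not the HuC library source: `sim_load_palette`, `sim_vblank`, and `first_color_register` are hypothetical names, and the "wait for vblank" step is simulated by a function call.

```c
#include <assert.h>

#define MAX_QUEUED_CALLS 7   /* queue depth described in the post */
#define VBLANK_PALETTES 32   /* palettes that fit in one vblank, per the post */

static int queued_calls;     /* load_palette() calls waiting for vblank */
static int queued_palettes;  /* total palettes across those calls */
int frames_waited;           /* times a caller had to wait for a vblank */

/* First color register of a palette: 16 colors per palette,
   palettes 0-15 are background, 16-31 are sprites. */
int first_color_register(int palette)
{
    assert(palette >= 0 && palette < 32);
    return palette * 16;
}

/* Simulated vblank: everything queued is uploaded, the queue empties. */
void sim_vblank(void)
{
    queued_calls = 0;
    queued_palettes = 0;
}

/* Model of one load_palette() call. Returns 1 if the palettes queued for
   the coming vblank still fit the 32-palette budget, 0 if they exceed it. */
int sim_load_palette(int first_palette, int count)
{
    assert(count >= 1 && first_palette >= 0 && first_palette + count <= 32);
    if (queued_calls == MAX_QUEUED_CALLS) {
        /* 8th call in a frame: wait for vblank, then queue for the next. */
        sim_vblank();
        frames_waited++;
    }
    queued_calls++;
    queued_palettes += count;
    return queued_palettes <= VBLANK_PALETTES;
}
```

In this model, queueing the 16 background palettes and then the 16 sprite palettes stays within budget, one more palette on top of that exceeds it, and an 8th call in the same frame has to wait for a vblank first.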
|
|
|
Post by DarkKobold on Feb 10, 2020 2:10:45 GMT
Thanks a ton for this help. It seems to be working mostly fine, but...
Oddly, it seems to have done something to the font stuff - the only difference between these two is using this new updated compiler, or the old one I had on my hard drive.
Also, for some reason, an #inctile is crashing an overlay (like, literally, the overlay won't load). If I comment it out, the overlay loads fine. However, there is absolutely no problem when I compile it as a HuCard.
|
|
|
Post by elmer on Feb 10, 2020 4:23:50 GMT
Have you rebuilt *all* of your overlays with the latest compiler? HuC's overlay system really needs everything to be built with *exactly* the same version of the library, or all hell will break loose. Apart from that, I need to see an example of the problem in order to track it down and fix it ... there is nothing that I can think of in recent changes that would cause inctile to break. But if I've somehow fracked something up and broken code that used to work, then I need to see that example in order to fix it.
|
|
|
Post by touko on Feb 10, 2020 17:40:38 GMT
I've experienced this kind of trouble, especially when you are close to filling your entire CD RAM. My advice is to group all your data at the start of your project, just after the huc.h include. You can also try including it in another place. I think it's because you have no free bank and not enough space in the banks reserved for data. Even if you have a ton of free bytes, if a bank starts with data it cannot be used by HuC for code, and vice versa. This results in a bad code mapping.
|
|