Post by dshadoff on Nov 23, 2019 1:22:20 GMT
Based on the original disassembly of:
a) Brandish's mouse reading function (at www.interlog.com/~daves/pce_info/mouse.txt)
b) several other mouse read extracts in this post: pcengine.proboards.com/post/11129

...it seems that the protocol follows a nearly identical progression of steps, but with varying delays. The overall pattern is (key timings are called out in footnotes):

Read #1:
i) Send a '1' / '3' / '1' to the joystick port ('3' = CLK high). (*1)
ii) Delay to allow mouse controller to assemble data (*2)
iii) Read upper nybble of 'x' value
iv) Send '0' to joystick port
v) Delay (*3)
vi) Read mouse buttons

Read #2:
vii) Send a '1' / '3' / '1' to the joystick port ('3' = CLK high).
viii) Delay before read (*4)
ix) Read lower nybble of 'x' value
x) Send '0' to joystick port
xi) Delay (same as *3 above)
xii) Read mouse buttons again (or not)

Read #3:
xiii) Send a '1' / '3' / '1' to the joystick port ('3' = CLK high).
xiv) Delay before read (same as *4 above)
xv) Read upper nybble of 'y' value
xvi) Send '0' to joystick port
xvii) Delay (same as *3 above)
xviii) Read mouse buttons again (or not)

Read #4:
xix) Send a '1' / '3' / '1' to the joystick port ('3' = CLK high).
xx) Delay before read (same as *4 above)
xxi) Read lower nybble of 'y' value
xxii) Send '0' to joystick port
xxiii) Delay (same as *3 above)
xxiv) Read mouse buttons again (or not)

(*1): Duration of clock high pulse is as short as "no intentional delay" (5 cycles for next absolute write to port). 5 cycles ~ 700 nanoseconds

(*2): This delay ranges from 156 cycles (~22uS) to 1215 cycles (~170uS).
** Princess Maker 2 has a 7265-cycle delay (>1ms) on mouse detect. It seems reasonable to believe that the mouse read protocol would have a timeout reset in case all 4 nybbles weren't read quickly enough. This extra-long delay appears to be a deliberate attempt to make the mouse controller time out.

(*3): This delay ranged from 'no intentional delay' (i.e. 5 cycles for absolute write to port), to 22 cycles. 5 cycles ~ 700 nanoseconds

(*4): Subsequent reads (after initial 'data processing' delay) range from 14 cycles to 264 cycles. 14 cycles ~ 2uS

Conclusions:
- Duration of CLK pulse does not appear important; it must drive edge-triggered logic (most likely low->high transition)
- Mouse controller needs to assemble data in less than 22uS after initial trigger
- Timeout should occur in as little as 1 millisecond if a full set of reads has not consumed all of the data
- SEL transition duration unimportant; must drive something like a 74HC157
- Subsequent reads require minimal delay, as data has already been assembled, and is simply being sequenced through a state machine
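As a rough cross-check of the footnote timings, the cycle counts can be converted to microseconds. This is only a sketch: it assumes the CPU is running at roughly 7.16 MHz, and the names CPU_HZ and cycles_to_us are made up for illustration rather than taken from any of the disassemblies.

```python
# Assumption: the CPU is clocked at roughly 7.16 MHz (high-speed mode).
CPU_HZ = 7_159_090


def cycles_to_us(cycles: int) -> float:
    """Convert a CPU cycle count to microseconds."""
    return cycles * 1_000_000 / CPU_HZ


# Cycle counts quoted in the footnotes above.
FOOTNOTE_CYCLES = [
    ("(*1) shortest CLK pulse", 5),
    ("(*2) shortest initial delay", 156),
    ("(*2) longest initial delay", 1215),
    ("(**) Princess Maker 2 detect delay", 7265),
    ("(*3) longest delay before buttons", 22),
    ("(*4) shortest later delay", 14),
    ("(*4) longest later delay", 264),
]

for label, cycles in FOOTNOTE_CYCLES:
    print(f"{label:36s} {cycles:5d} cycles = {cycles_to_us(cycles):8.2f} us")
```

Under that clock assumption the 7265-cycle delay comes out just over one millisecond, which is where the "timeout in as little as 1 millisecond" conclusion comes from.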
|
|
|
Post by dshadoff on Nov 23, 2019 17:46:20 GMT
So I did some playing around with the mouse read function in HuC, and adjusted the delay timing for (*2) above.
On the low side, the shortest delay seen in game code (so far) was 156 cycles: -> I didn't see any particular problems with reading values even when it was only 10 cycles.
On the high side, the longest delay seen in game code (so far) was 1215 cycles: -> I only increased by large margins, but it still seemed fine at 2000 cycles. Somewhere between there and 3000 cycles, reads become garbled, with 'Y' values not changing much, and 'X' values changing even for movement on the 'Y' axis. I was thinking that after a certain timeout amount, the data might be 'consumed' in the same way that a fetch grabs it, but even after a 7500-cycle delay, the behaviour isn't much changed from the 3000-cycle value.
Because (*2) can go down to 10 cycles (maybe lower ?), I am not going to bother checking the timing on (*3) or (*4).
I am kind of curious what sort of hardware they used during development, because the timings in the code (which ought to reflect guidelines provided by NEC) are not indicative of final hardware.
|
|
|
Post by elmer on Nov 24, 2019 19:40:55 GMT
"I am kind of curious what sort of hardware they used during development, because the timings in the code (which ought to reflect guidelines provided by NEC) are not indicative of final hardware."

Yes, the subtle-but-important differences in each game's code-flow are rather bewildering to look at. It's hard to decide what timings to use in any code that we write today!

It kinda looks like Hudson prototyped the mouse using an interrupt-driven microcontroller, which would explain the need for the initial delay at start of a read sequence ... but the speed with which the actual production-hardware responds suggests that they ended up creating some hardwired custom-logic chips for the final mice.

Like you, my tests don't actually show any need for the extended (*2) delay to "allow the mouse controller to assemble data", beyond the normal 1.25us joypad-response delay.

If you look at the code in Tokimeki Memorial and Vasteel 2, those are the only two games that I looked at that can support a mouse in any port of a multitap. But Vasteel 2 has what seems to be a bug where it doesn't enforce the (*2) delay on ports 2-5, only on port 1 ... but AFAIK, it still works.

As for the 7500 cycle delay between complete read-sequences in Princess Maker 2's mouse-detection routine, it is definitely a timeout. The mouse-detection routine in Lemmings does the same thing, but the delay is shorter at 4700 cycles.

Having that timeout in the mouse hardware allows us to read the mouse multiple times in a frame ... which seems to be the only way to reliably detect that a mouse is attached. That's because if you read the mouse that quickly (approx 20 times in a single 1/60s), then even if the mouse is moving quite fast, some of the movement-delta readings will be zero.

That's what both Lemmings and Vasteel 2 are doing to detect a mouse ... they both read it really fast a few hundred times, and then see if the movement-delta is ever reported as zero.

Now ... the code in Lemmings does the test poorly, and mistakenly thinks that a 6-button joypad is a mouse, because the 6-button joypad returns a d-pad value of zero to indicate the extra buttons.

The code in Vasteel 2 is basically the same, but counts up the number of times that the zero value is seen, and then figures out if it is looking at a mouse or a 6-button joypad by the number of zero values that it sees (the joypad should have exactly half of the tests respond with zero). Unfortunately, Vasteel 2's logic can be confused if the mouse is moving right, and returning more-than-a-few negative values.
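The difference between the two detection styles described above can be sketched in a few lines. This is a simplified model, not a transcription of either game's code: the function names, the sample lists and the exact "half of the reads" rule are assumptions made for illustration.

```python
def any_zero_means_mouse(samples):
    """Lemmings-style test as described above: a single zero is enough."""
    return "mouse" if 0 in samples else "pad"


def classify_by_zero_count(samples):
    """Vasteel 2-style test as described above: count the zero readings."""
    zeros = sum(1 for s in samples if s == 0)
    if zeros == 0:
        return "pad"
    if zeros * 2 == len(samples):
        return "6-button pad"        # exactly half of the reads were zero
    return "mouse"


# Made-up nibble readings, not captured from real hardware.
idle_2_button = [0xF] * 16           # nothing pressed, lines read high
idle_6_button = [0xF, 0x0] * 8       # d-pad, then the all-zero marker
still_mouse = [0x0] * 16             # no movement at all
moving_mouse = [0xF, 0x0] * 8        # negative deltas on every other read

for name, samples in [
    ("2-button pad", idle_2_button),
    ("6-button pad", idle_6_button),
    ("still mouse", still_mouse),
    ("moving mouse", moving_mouse),
]:
    print(f"{name:13s} any-zero: {any_zero_means_mouse(samples):6s}"
          f" zero-count: {classify_by_zero_count(samples)}")
```

In this model the any-zero test calls the 6-button pad a mouse, while the zero-count test gets the pad right but mislabels a mouse that happens to report zero on exactly half of its reads, which is the kind of confusion described for a mouse moving right.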
|
|
|
Post by elmer on Nov 24, 2019 19:49:43 GMT
Last night I finally figured out how to get around the problems in Lemmings and Vasteel 2, and here's the mouse detection code that I came up with.
It seems to work fine either with a mouse directly attached, or with a MB128, multitap, and more-than-one mouse (plus 3 and 6 button joypads).
It also seems to be fine with having the mouse/mice moving as fast as I could move them while the detection was running.
; ***************************************************************************
; ***************************************************************************
;
; mouse_detect - Scan all 5 multitap ports and detect which has a mouse.
;
; This takes a significant amount of CPU time to run, approx 875,000 cycles,
; i.e. 1/8 second.
;
; Args: None
; Uses: __al .. __ch (6 bytes)
;
; Returns: mouseflag = Bitmask of ports with a mouse, bit 0 is port 1.
;
; N.B. Interrupts are disabled while the routine runs.
;

mouse_detect:   php                     ; Preserve interrupt mask.
                sei                     ; Disable interrupts.

                stz     mouseflag       ; Reset detection status.

                stz     <__ch           ; Initialize repeat count.

.test_loop:     ldy     #$01            ; CLR lo, SEL hi for d-pad.
                sty     IO_PORT
                lda     #$03            ; CLR hi, SEL hi, reset tap.
                sta     IO_PORT

                clx                     ; Start at first pad.
.read_x_hi:     sty     IO_PORT         ; CLR lo, SEL hi for d-pad.

                lda     #30             ; 180 cycle delay after CLR lo
.wait_loop:     dec     a               ; on port to allow the mouse
                bne     .wait_loop      ; to buffer and reset counters.

                lda     IO_PORT         ; Read direction-pad bits.
                stz     IO_PORT         ; CLR lo, SEL lo for buttons.
                asl     a               ; Wait 1.25us (9 cycles).
                asl     a
                asl     a
                asl     a
                sta     <__al,x         ; Save port's X-hi nibble.

                inx                     ; Get the next pad from the
                cpx     #5              ; multitap.
                bne     .read_x_hi

                sty     IO_PORT         ; CLR lo, SEL hi for d-pad.
                lda     #$03            ; CLR hi, SEL hi, reset tap.
                sta     IO_PORT

                clx                     ; Start at first pad.
.read_x_lo:     sty     IO_PORT         ; CLR lo, SEL hi for d-pad.

                pha                     ; Wait 1.25us (9 cycles).
                pla
                nop

                lda     IO_PORT         ; Read direction-pad bits.
                stz     IO_PORT         ; CLR lo, SEL lo for buttons.
                and     #$0F            ; Wait 1.25us (9 cycles).
                ora     <__al,x         ; Add port's X-hi nibble.
                bne     .not_mouse

                lda     .bitmask,x      ; An X movement of zero means
                tsb     mouseflag       ; this port is a mouse.

.not_mouse:     inx                     ; Get the next pad from the
                cpx     #5              ; multitap.
                bne     .read_x_lo

                cla                     ; 5376 cycle delay.
.pause:         bsr     .delay          ; This lets the mouse timeout
                dec     a               ; and allow the next read.
                bne     .pause

                inc     <__ch           ; Repeat the test 128 times.
                bpl     .test_loop

                lda     VDC_SR          ; Skip any pending VDC irq.
                plp                     ; Restore interrupt mask.

.delay:         rts

.bitmask:       db      $01,$02,$04,$08,$10,$20,$40,$80

mouseflag:      ds      1
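The decision rule in mouse_detect (combine the X-hi and X-lo nibbles, and flag the port if the whole byte is ever zero) can be modelled in Python to see why a 6-button pad does not trip it. This is a simplified sketch: the real routine reads all the high nibbles in one scan and all the low nibbles in a second scan, whereas the model just asks a hypothetical callback for both, and the device behaviour in fake_devices is invented test data.

```python
NUM_PORTS = 5    # ports scanned on the multitap
PASSES = 128     # the routine above repeats its test 128 times


def detect_mice(read_x_nibbles, passes=PASSES):
    """Return a bitmask of ports that ever reported a whole X byte of zero.

    read_x_nibbles(port, n) is a hypothetical callback returning the raw
    (hi, lo) 4-bit values seen on that port during pass n.
    """
    mouseflag = 0
    for n in range(passes):
        for port in range(NUM_PORTS):
            hi, lo = read_x_nibbles(port, n)
            x = ((hi & 0x0F) << 4) | (lo & 0x0F)
            if x == 0:                   # zero movement: treat as a mouse
                mouseflag |= 1 << port
    return mouseflag


def fake_devices(port, n):
    """Made-up test data, not captured from hardware."""
    if port == 0:
        # Mouse moving fast: delta of -3 ($FD), but zero on every 16th pass.
        return (0x0, 0x0) if n % 16 == 0 else (0xF, 0xD)
    if port == 1:
        # 6-button pad: one scan returns the idle d-pad ($F) and the other
        # the all-zero marker; which one comes first is allowed to swap.
        return (0xF, 0x0) if n % 2 == 0 else (0x0, 0xF)
    # Idle 2-button pad: nothing pressed, so every line reads high.
    return (0xF, 0xF)


print(f"mouseflag = ${detect_mice(fake_devices):02X}")
```

Because the 6-button pad only ever zeroes one of the two nibbles, its combined byte is never zero in this model, so only the mouse on port 1 (bit 0) is flagged.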
|
|
|
Post by dshadoff on Nov 24, 2019 20:11:34 GMT
You tested with multiple mice ? I noticed that Brandish also looked for any detections of zero values to determine whether it was a mouse, and I had thought that it wouldn't be confused by a 6-button... I was thinking about what you had said about games only detecting a mouse in the first slot, and I hadn't actually considered the possibility of using more than one mouse. How many PC Engine games could actually consider using multiple mice ? (Most mouse-supporting games seem to be single-player, don't they ?) On the bright side, I suppose we can now port Missile Command ! Also, looking at the mouse scanning routines, it seemed pretty easy to be able to patch them to detect mouse-versus-6-button if needed...
|
|
|
Post by elmer on Nov 24, 2019 20:41:59 GMT
"You tested with multiple mice ?"

Yep, I *finally* unboxed and plugged in my two mice! IIRC it was a long time ago that Tom pointed out that you should technically be able to use multiple mice, I just don't know if anyone has ever had a reason to write the code to do it.

"I noticed that Brandish also looked for any detections of zero values to determine whether it was a mouse, and I had thought that it wouldn't be confused by a 6-button..."

Ooooh ... do you have the address of Brandish's detection routine, I'd like to take a look at it. If it's anything like the other detection code that I've seen, then it is only looking at the high-nibble of the mouse's X movement, which is the same nibble that the 6-button pad uses to report that it actually is a 6-button pad.

"I was thinking about what you had said about games only detecting a mouse in the first slot, and I hadn't actually considered the possibility of using more than one mouse. How many PC Engine games could actually consider using multiple mice ? (Most mouse-supporting games seem to be single-player, don't they ?) On the bright side, I suppose we can now port Missile Command !"

I can't think of any current PCE games that could use two mice (or more) ... but as-you-say, now that we know that it's possible, who knows what someone will write in the future!

"Also, looking at the mouse scanning routines, it seemed pretty easy to be able to patch them to detect mouse-versus-6-button if needed..."

My fast 3-button/6-button joypad reading code would definitely get confused by a mouse returning a d-pad nibble of zero. I really don't think that you can sanely autodetect between a mouse/pad while in the joypad code itself ... but sure, if you already know that you're looking at a mouse instead of a joypad, then it's fairly easy to write a function that can deal with all of the alternatives that could be plugged into the PCE.
|
|
|
Post by dshadoff on Nov 24, 2019 22:02:27 GMT
"Ooooh ... do you have the address of Brandish's detection routine, I'd like to take a look at it. If it's anything like the other detection code that I've seen, then it is only looking at the high-nibble of the mouse's X movement, which is the same nibble that the 6-button pad uses to report that it actually is a 6-button pad."

Ummm... not really an address, but if you look at the mouse.asm code in HuC, it's based on Brandish, because Brandish was the game where I found the mouse code to be comprehensible all those years ago.

For HuC (and I'm 99% sure Brandish as well), it takes up to 10 samples, and looks at the first nibble of the 'Y' coordinate each time. If it finds a '0' in any of them (as it probably would), then it concludes that it is talking to a mouse.
|
|
|
Post by elmer on Nov 25, 2019 0:32:22 GMT
"Ummm... not really an address, but if you look at the mouse.asm code in HuC, it's based on Brandish, because Brandish was the game where I found the mouse code to be comprehensible all those years ago. For HuC (and I'm 99% sure Brandish as well), it takes up to 10 samples, and looks at the first nibble of the 'Y' coordinate each time. If it finds a '0' in any of them (as it probably would), then it concludes that it is talking to a mouse."

Ahhhh ... yes, that code actually looks at the whole byte of the Y value, and so should correctly distinguish between a 6-button joypad and a mouse, unlike Lemmings.

OTOH, the version in HuC has removed the wait-for-vsync between mouse reads that is in the original Brandish game, and so it is reading gawd-knows-what from the mouse hardware.

The current code in HuC seems to be assuming that the buffer in the mouse hardware will wrap-around after the 4th read, and start the reading anew at the beginning (i.e. the X-hi counter). The original Brandish code doesn't assume that, it waits for a frame between reads, which will allow the mouse to timeout and reset the read sequence back to the beginning.

I've not seen any mouse code in any shipping game that deliberately reads beyond the 4 values that the mouse is supposed to return. When I tried reading beyond the four values that the mouse is supposed to read, then I started to get zero values ... which is why HuC's code appears to function.

If the wait-for-vsync is restored, then IMHO Brandish's detection code will fail if the mouse is being moved up or down at any speed when the routine is run.
|
|
|
Post by dshadoff on Nov 25, 2019 1:44:58 GMT
I don't know when the vsync was commented out, but I agree that it would be better if it was in there.
I don't feel that the HuC code (as is) assumes any sort of buffer wrap-around; I feel that it assumes that additional reads would work the same way as once-per-vsync, but that the delta movements would be even smaller due to the tighter timing. But I don't know if anybody actually tested back then whether a short polling interval would cause any problems for the mouse controller.
I agree that the detection code would fail if the mouse was being moved, but... for the mouse to be moving at a rate which registers a non-zero delta on every single consecutive read... that's actually pretty significant movement, given that the mouse is not very sensitive. I measure about 4 increments per millimetre, which means that the mouse would need to be moving at least 1.5 cm per second continuously in the vertical axis during the ~0.16 second interval (note that horizontal movement is more common than vertical). Granted, this is possible, but since nothing is happening yet in the game when the detection is going on, and mice tend to sit still on flat surfaces until moved by a person, it would generally be a successful test.
I believe that HuC's removal of the vsync wait was intended to require a much higher velocity in order to fail the check.
It seems that a better way to test would be to evaluate whether *both* the most-significant X value (nybble) and most-significant Y value (nybble) are simultaneously zero after a 1-vsync period; this should not happen on even a 6-button joypad (one will definitely be zero, but the other cannot), and a mouse would need to be moving helter-skelter (>24cm/sec in each axis) to score non-zero values on a single read (on both axes).
EDIT: Whoops, I see my logic error above. The high and low nybble of either X or Y would correspond with a 6-button joypad's 2 readings (but not the high nybble of both X and Y). And I see that you are running 128 cycles of the read in order to verify values - smart, as this increases the likelihood of at least one zero value. And waiting less than a VSYNC will ensure smaller delta values, with a higher likelihood of a zero. But I'm not convinced that the mouse requires a timeout in order to get values again (rather, timeout should flush the incomplete previous read's values)... though it's better to be safe than sorry.
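The speed figures in this post follow from simple arithmetic, sketched below. The 4 counts per millimetre is the measurement quoted above; the function name is made up, and the back-to-back polling rate at the end is only a rough estimate that assumes a clock of about 7.16 MHz and the "approx 875,000 cycles for 128 passes" figure from the detection routine posted earlier.

```python
COUNTS_PER_MM = 4     # resolution measured in the post above
FRAME_HZ = 60         # one mouse read per vsync


def min_speed_cm_per_s(min_counts_per_read, reads_per_second=FRAME_HZ):
    """Slowest steady speed that yields at least this many counts per read."""
    counts_per_second = min_counts_per_read * reads_per_second
    return counts_per_second / COUNTS_PER_MM / 10.0    # mm/s -> cm/s


# A non-zero delta on every read needs at least 1 count per frame.
print("never read zero, per vsync:  ", min_speed_cm_per_s(1), "cm/s")

# A non-zero HIGH nibble needs a delta of at least 16 counts per frame.
print("high nibble non-zero:        ", min_speed_cm_per_s(16), "cm/s")

# Ten samples taken once per vsync span about a sixth of a second.
print("10 samples at 60 Hz:         ", round(10 / FRAME_HZ, 3), "s")

# Rough estimate for back-to-back polling (both numbers are assumptions).
CPU_HZ = 7_159_090
CYCLES_PER_PASS = 875_000 / 128
fast_reads_per_second = CPU_HZ / CYCLES_PER_PASS
print("never read zero, fast polling:",
      round(min_speed_cm_per_s(1, fast_reads_per_second), 1), "cm/s")
```

The first two results match the 1.5 cm/s and 24 cm/s figures quoted in the post, and the last line shows why polling much faster than once per vsync makes an all-non-zero run far less likely.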
|
|
|
Post by elmer on Nov 26, 2019 23:09:12 GMT
"I believe that HuC's removal of the vsync wait was intended to require a much higher velocity in order to fail the check."

I agree and that's what I meant, even if I wasn't clear in my post.

"And I see that you are running 128 cycles of the read in order to verify values - smart, as this increases the likelihood of at least one zero value. And waiting less than a VSYNC will ensure smaller delta values, with a higher likelihood of a zero."

In my tests of the code shown above while moving the mouse really fast towards the right (-ve X), I only get a few readings of zero. Yes, that was a pretty-unrealistic test, but I'm happier with an algorithm that works even when under stress, rather than relying on the most-common-but-uncertain circumstance of the mouse being basically still.

"But I'm not convinced that the mouse requires a timeout in order to get values again (rather, timeout should flush the incomplete previous read's values)... though it's better to be safe than sorry."

Yeah, I'm not convinced that the timeout is required either ... but we just don't know how the hardware works, and back-to-back reads like that just aren't done in any shipping code (that I know of).

If the hardware was supposed to allow itself to be read that fast, then it would be a really quick way to determine if a mouse was attached ... so we have to ask ourselves why didn't Lemmings (which IIRC the mouse was specifically designed for), or Princess Maker 2, actually do their mouse detection without the timeout delay?
|
|
|
Post by elmer on Nov 26, 2019 23:21:15 GMT
Here is my current-best code for reading 2-button joypads, 6-button joypads, and multiple mice, all attached to a TurboTap (or not).

This version of the code ignores buttons III..VI on the 6-button joypad when they are reported, which seems like a reasonable size/speed tradeoff if you are writing a game that supports a 2-button mouse.

It also uses self-modifying code in order to show some fun optimizations, and so it needs to run from RAM, i.e. it is for a CD/SuperCD game (or a TED2).

The changes to make the logic run from ROM/HuCard, and to read all 6-buttons, are pretty trivial and are left as an exercise for the reader.

; ***************************************************************************
; ***************************************************************************
;
; read_joypad (includes mouse support, buttons III..VI are ignored)
; ----
; poll joypads
;
; 'joyport' (location $1000) is a control read/write port which only reads
; 4 bits at a time; the program uses joyport to toggle the multiplex line
;
; real logic values are read into the port - the joypad's keys are default
; high, and 'pulled' low when pressed. Therefore, these values must be
; inverted/complemented to yield values where '1' means 'pressed'.
;
; Read twice to get both sets of buttons on a 6-button joypad.
;
; The 2nd set of buttons have bits 4-7 all pressed, which isn't possible
; on a 2-button joypad.
;
; bit values for joypad bytes: (MSB = #7; LSB = #0)
; -------------------------------------------------
; bit 0 (ie $01) = I
; bit 1 (ie $02) = II
; bit 2 (ie $04) = SELECT
; bit 3 (ie $08) = RUN
; bit 4 (ie $10) = UP
; bit 5 (ie $20) = RIGHT
; bit 6 (ie $40) = DOWN
; bit 7 (ie $80) = LEFT
; ----
MAX_PADS        =       3                       ; 5 normally, 3 to save time.

                if      SUPPORT_MOUSE

                bss

mouse_flags     ds      1
mouse_x         ds      5
mouse_y         ds      5

                code

read_joypadsY:  tii     joynow,joyold,MAX_PADS  ; Save the previous values.

                cly                             ; Repeat this loop 4 times.

.read_turbotap: lda     .mouse_vectors,y        ; Self-modify the branch for
                sta     .branch_mod + 1         ; this pass.

                clx                             ; Start at port 1.
                lda     #$01                    ; CLR lo, SEL hi for d-pad.
                sta     IO_PORT
                lda     #$03                    ; CLR hi, SEL hi, reset tap.
                sta     IO_PORT

.read_port:     lda     #$01                    ; CLR lo, SEL hi for d-pad.
                sta     IO_PORT

                lda     bit_mask,x              ; Is there a mouse attached?
                and     mouse_flags
.branch_mod:    bne     .mouse_y_lo             ; Self-Modifying code!!!

                cpy     #2                      ; Joypads only need to be read
                bcs     .skip_port              ; twice, skip the other reads.

.read_pad:      lda     IO_PORT                 ; Read direction-pad bits.
                stz     IO_PORT                 ; CLR lo, SEL lo for buttons.
                asl     a                       ; Wait 1.25us (9 cycles).
                asl     a
                asl     a
                asl     a
                beq     .next_port              ; 6-btn pad if UDLR all held.

.read_2button:  sta     .button_mod + 1         ; Get buttons of 2-btn pad.
                lda     IO_PORT
                and     #$0F
.button_mod:    ora     #$00                    ; Self-Modifying code!!!
                eor     #$FF
                sta     joynow,x

.skip_port:     stz     IO_PORT                 ; CLR lo, SEL lo for buttons.

.next_port:     inx                             ; Get the next pad from the
                cpx     #MAX_PADS               ; multitap.
                bne     .read_port

                iny                             ; Do the next complete pass.
                cpy     #4                      ; Have we finished 4 passes?
                bne     .read_turbotap
                bra     .calc_pressed           ; Now that everything is read.

                ; Mouse processing, split into four passes.

.mouse_x_hi:    lda     #28                     ; 176 cycle delay after CLR lo
.wait_loop:     dec     a                       ; on port to allow the mouse
                bne     .wait_loop              ; to buffer and reset counters.

                lda     IO_PORT                 ; Read direction-pad bits.
                stz     IO_PORT                 ; CLR lo, SEL lo for buttons.
                asl     a                       ; Wait 1.25us (9 cycles).
                asl     a
                asl     a
                asl     a
                sta     mouse_x,x               ; Save port's X-hi nibble.

                lda     IO_PORT                 ; Get mouse buttons.
                and     #$0F
                eor     #$0F
                sta     joynow,x
                bra     .next_port

.mouse_x_lo:    lda     IO_PORT                 ; Read direction-pad bits.
                and     #$0F                    ; Wait 1.25us (9 cycles).
                ora     mouse_x,x               ; Add port's X-hi nibble.
                sta     mouse_x,x
                bra     .skip_port

.mouse_y_hi:    lda     IO_PORT                 ; Read direction-pad bits.
                asl     a                       ; Wait 1.25us (9 cycles).
                asl     a
                asl     a
                asl     a
                sta     mouse_y,x               ; Save port's Y-hi nibble.
                bra     .skip_port

.mouse_y_lo:    lda     IO_PORT                 ; Read direction-pad bits.
                and     #$0F                    ; Wait 1.25us (9 cycles).
                ora     mouse_y,x               ; Add port's Y-hi nibble.
                sta     mouse_y,x
                bra     .skip_port

                ; See what has just been pressed, and check for soft-reset.

.calc_pressed:  ldx     #MAX_PADS - 1

.pressed_loop:  lda     joynow,x                ; Calc which buttons have just
                tay                             ; been pressed.
                eor     joyold,x                ; Unlike the System Card, here
                and     joynow,x                ; the "trg" is cumulative and
                ora     joytrg,x                ; must be cleared when used.
                sta     joytrg,x

                cmp     #$04                    ; Detect the soft-reset combo,
                bne     .calc_next              ; hold RUN then press SELECT.
                cpy     #$0C
                bne     .calc_next
                lda     bit_mask,x              ; Is soft-reset enabled on this
                bit     joyena                  ; port?
                bne     .soft_reset

                ; Do auto-repeat processing on the d-pad.

.calc_next:     tya                             ; Auto-Repeat the UP and DOWN
                ldy     #SLOW_AUTORPT           ; while they are held.
                and     #JOY_U + JOY_D
                beq     .set_delay
                dec     joyrpt,x
                bne     .no_repeat
                ora     joytrg,x
                sta     joytrg,x
                ldy     #FAST_AUTORPT
.set_delay:     tya
                sta     joyrpt,x

.no_repeat:     dex                             ; Check the next pad from the
                bpl     .pressed_loop           ; multitap.
                rts                             ; All done, phew!

.soft_reset:    lda     #$80                    ; Disable the BIOS PSG driver.
                sta     <$E7
                jmp     [$2284]                 ; Jump to the soft-reset hook.

.mouse_vectors: db      (.mouse_x_hi - .branch_mod) - 2
                db      (.mouse_x_lo - .branch_mod) - 2
                db      (.mouse_y_hi - .branch_mod) - 2
                db      (.mouse_y_lo - .branch_mod) - 2

                endif
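For anyone consuming the mouse_x / mouse_y bytes that this routine builds up, the nibble packing and button inversion can be sketched as follows. This is a hedged illustration rather than part of the posted code: the function names are made up, and treating the assembled byte as a signed 8-bit two's-complement delta is an assumption (the thread only says that moving right gives negative X).

```python
def assemble_delta(hi_nibble, lo_nibble):
    """Pack two 4-bit reads into one byte, then sign-extend it.

    Assumption: the byte is a two's-complement movement delta.
    """
    value = ((hi_nibble & 0x0F) << 4) | (lo_nibble & 0x0F)
    return value - 0x100 if value & 0x80 else value


def decode_buttons(raw_nibble):
    """Invert the active-low button lines so that 1 means 'pressed'."""
    return (raw_nibble & 0x0F) ^ 0x0F


# Example nibble values, chosen by hand rather than read from hardware.
print(assemble_delta(0x0, 0x5))    # small positive delta
print(assemble_delta(0xF, 0xD))    # small negative delta
print(decode_buttons(0xF))         # nothing pressed
print(decode_buttons(0xE))         # lowest bit pressed
```

The masking and XOR mirror the asl / and #$0F / ora and and #$0F / eor #$0F sequences in the listing; only the sign extension goes beyond what the assembly itself does.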
|
|
|
Post by elmer on Nov 30, 2019 23:08:25 GMT
Here's one last set of code, that I'm finally happy about. This is the version that I'm currently using in TEOS.

There are 2 functions, one with full mouse support (but that ignores the extra buttons on a 6-button pad), and one that fully supports the 6-button pad (but ignores mouse-movement). I figure that those are the two cases that anyone would really care about in practical terms for any game/utility project.

Both functions detect and handle all devices, so that having extra things plugged in doesn't cause any problems (unlike quite a few PCE games).

The functions both internally detect any attached mice the first time that they are called (taking approx 1/3 of a frame to execute).

The mouse detection is done using the same "back-to-back" polling that is currently in HuC, although that fast method of mouse detection was not used AFAIK in any shipped PCE game. It appears to work fine on both real hardware and Mednafen, but it would be easy to change if anyone finds any problems.

The version that fully supports the 6-button pad runs approximately twice as fast as the 6-button joypad code that is currently in HuC.

joypad_example.s (11.27 KB)
|
|
|
Post by dshadoff on Dec 1, 2019 4:39:35 GMT
Cool stuff !
|
|
|
Post by turboxray on Dec 1, 2019 18:10:41 GMT
Now we need keyboard support! IIRC mednafen has support for it.
|
|
|
Post by elmer on Dec 1, 2019 21:08:43 GMT
"Now we need keyboard support! IIRC mednafen has support for it."

Ahhhh ... but are there any of those Tsushin keyboards available to buy for testing, and how gosh-darned expensive would one be?

I took a quick look at the emulation code in Mednafen, and so without actually having a rip of the Tsushin ROM itself to look at, I *guess* that software is detecting the keyboard by looking at the non-existent 6th port on a TurboTap, and seeing if it is '0' (as on a real TurboTap), or non-zero as on the keyboard protocol.

Unless I'm missing something, that would suggest that you can only have a Tsushin keyboard attached, and not have any pads or mice attached at the same time ... which would suck. Hopefully I'm wrong there.

Anyway, yeah, it would be fun to have a cheap hardware interface that allows you to attach a PS/2 keyboard and mouse to the PC Engine.