Bugfixes and Queries

  • Thread starter: DLPB_
Status
Not open for further replies.
It works perfectly fine in game.  And it does what the original game does too.  Have I missed something?
Original Game code block:
Code:
.text:006D21A2                 imul    ecx, 98h
.text:006D21A8                 mov     edx, dword_DC3630
.text:006D21AE                 movsx   eax, word ptr [edx+ecx+4]
.text:006D21B3                 cdq
.text:006D21B4                 sub     eax, edx
.text:006D21B6                 mov     ecx, eax
.text:006D21B8                 sar     ecx, 1
.text:006D21BA                 movsx   eax, [ebp+var_158]
.text:006D21C1                 cdq
.text:006D21C2                 sub     eax, edx
.text:006D21C4                 sar     eax, 1
.text:006D21C6                 sub     ecx, eax
.text:006D21C8                 mov     [ebp+var_158], cx
.text:006D21CF                 mov     [ebp+var_168], 0Ch
.text:006D21D8                 mov     [ebp+var_148], 0

    v5 = *(_WORD *)(dword_DC3630 + 152 * a2 + 4);
    v4 = (((_DWORD)v5 - HIDWORD(v5)) >> 1) - v117 / 2;
    v118 = (((_DWORD)v5 - HIDWORD(v5)) >> 1) - v117 / 2;
Your block:
Code:
.text:006D21A2                 mov     eax, ebp
.text:006D21A4                 sub     ax, 158h
.text:006D21A8                 mov     edx, [eax]
.text:006D21AA                 cmp     cl, 19h
.text:006D21AD                 jnz     short loc_6D21BE
.text:006D21AF                 mov     word ptr [eax], 0Dh
.text:006D21B4                 add     edx, 18h
.text:006D21B7                 mov     word_91D21C, dx
.text:006D21BE
.text:006D21BE loc_6D21BE:                             ; CODE XREF: sub_6D1CC0+4EDj
.text:006D21BE                 mov     ebx, 280h
.text:006D21C3                 sub     ebx, edx
.text:006D21C5                 shr     ebx, 1
.text:006D21C7                 cmp     cl, 19h
.text:006D21CA                 jz      short loc_6D21D1
.text:006D21CC                 mov     [eax], bx
.text:006D21CF                 xor     ebx, ebx
.text:006D21D1
.text:006D21D1 loc_6D21D1:                             ; CODE XREF: sub_6D1CC0+50Aj
.text:006D21D1                 mov     [eax-1Ch], bx
.text:006D21D5                 mov     word ptr [eax-10h], 0Bh
.text:006D21DB                 mov     word ptr [eax+10h], 0

    v7 = a2;
    HIWORD(v5) = HIWORD(ebp0);
    LOWORD(v5) = ebp0 - 344;
    v6 = *(_DWORD *)v5;
    if ( (_BYTE)a2 == 25 )
    {
      *(_WORD *)v5 = 13;
      v6 += 24;
      word_91D21C = v6;
    }
    v8 = (unsigned int)(640 - v6) >> 1;
    if ( (_BYTE)v7 != 25 )
    {
      *(_WORD *)v5 = v8;
      LOWORD(v8) = 0;
    }
    *(_WORD *)(v5 - 28) = v8;
    *(_WORD *)(v5 - 16) = 11;
    *(_WORD *)(v5 + 16) = 0;
These aren't equivalent at all.
 
A lot of the original code is redundant (for example, part of that is just pulling the value 280h from a memory location, something I could remove and just add manually).  Have a look at it in game and see if there is an oversight (even if there is, it can't be catastrophic since everything is working as I want it to)?  My code does what it's supposed to do - it sorts the help box and autosizes / positions the attack caption and box.  Comparing the two codes won't really give you an idea of what has been changed visually.
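For anyone comparing the two blocks, here is a rough Python sketch of just the centring arithmetic. It assumes, as stated above, that the word the original reads from the dword_DC3630 table is simply 280h (640), and it ignores the cl == 19h special case and the fact that the patch reads a full dword. The function names are made up for illustration.

```python
def sdiv2(x):
    """Signed divide by two, rounding toward zero (the cdq / sub / sar idiom)."""
    return (x + (1 if x < 0 else 0)) >> 1


def original_x(total_width, box_width):
    """Original block: halve both values separately, then subtract."""
    return sdiv2(total_width) - sdiv2(box_width)


def patched_x(box_width, screen_width=0x280):
    """Patched block: subtract first, then a 32-bit logical shift right by one."""
    return ((screen_width - box_width) & 0xFFFFFFFF) >> 1


if __name__ == "__main__":
    for width in (100, 101):
        print(width, original_x(640, width), patched_x(width))
```

Under those assumptions the two agree for even widths and differ by one pixel for odd widths, because the original truncates each half separately.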
 
Covarr informs me that FF7 port (using PC version) on the PS4 has fixed the broken frame limiter.  I've looked into it before but don't know where to start.  But at least we know it can be done.
 
I cross my fingers for it.
I mean I know I keep going on about this - but it really really is a big issue.  A frame limiter is a main component of the engine and it needs fixing.
 
I am still looking at the frame limiter. The world map limiter is not the same... prob why it actually works.  Still...  It's gonna take time, if I can at all.
 
It looks like I am not going to have any success (I'll keep trying).  But I'll leave here what I think is going on and what Dziugo posted to me, in the hope someone else can figure this out.  I think I've found the same function that Dziugo was editing too.

As far as I can see, the reason the limiter isn't accurate and drops frames under heavier load is simply because the calculation is bad.  I remember when I programmed my own limiter for my own game using timeGetTime.  If it isn't done right, you'll get frame drops where there shouldn't be any.  This is noticeable when trying to record games with something like Fraps. It will work perfectly if you set the frame rate of FF7 to 60 and then let Fraps limit the game to 30 - but it will drop to 24-28 if you don't.   This isn't a problem with Fraps - it will happen with any high load (although game recording shows the bug up more).  For some people, attaining 30fps in normal play won't happen either.  This is also not fixed by Aali's driver - which uses timeGetTime / QPC - because it's the limiter itself that is broken and leading to this issue.

This is what Dziugo sent me:

How it works currently:
while (notEnd) {
   doStuff1();
   doStuff2() {   // dynamic function: the call address changes depending on the module we're in
      startTheRDTSCTimer();                    // A
      doStuff2a();                             // the actual processing work
      idleUntilRDTSCReachesTheDesiredValue();  // B
      doStuff2b();                             // some other work
   }
   doStuff3();
}

So, the game makes sure that the time between A and B is at least the time it calculated the frame processing should take... Wait a minute! What about the time it takes to get from B to A again? That's assumed to be exactly 10000 RDTSC ticks (for field).

This could affect the FPS slightly (now I see what you mean). One could correct it (slightly ;p) by moving B to just before A, or by actually modding the limiter to take into account at least the last frame's processing time (each additional frame will add precision, but with rapidly diminishing returns - something like 4-5 frames should be enough).
Note that the 10000 RDTSC ticks actually mess things up further.  This is corrected when you do this >

7B7848 = 00 00 00 00 00 00 00 00

Note: this part of the calculation is taken into account when going from a different module to field.  So, going from world map to field recalculates and uses this double-precision floating-point value.
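To make the A/B problem concrete, here is a small Python simulation of the loop described above. It is only a sketch: the tick counts are invented, and `assumed_gap_ticks` is my reading of how the fixed B-to-A allowance is applied (subtracted from the wait), which may not match the game exactly.

```python
def simulate_ab_limiter(target_ticks, work_ticks, gap_ticks, frames,
                        assumed_gap_ticks=0):
    """Return the tick at which point A is reached for each simulated frame."""
    now = 0
    starts = []
    for _ in range(frames):
        a = now
        starts.append(a)                       # A: timer started
        now += work_ticks                      # doStuff2a
        wait_until = a + target_ticks - assumed_gap_ticks
        if now < wait_until:                   # B: idle until the target is reached
            now = wait_until
        now += gap_ticks                       # doStuff2b, doStuff3, doStuff1
    return starts


def average_fps(starts, ticks_per_second):
    """Average frame rate over the intervals between consecutive A points."""
    periods = [b - a for a, b in zip(starts, starts[1:])]
    return ticks_per_second * len(periods) / sum(periods)


if __name__ == "__main__":
    # 1 tick = 1 microsecond here; 33333 ticks is roughly one 30 fps frame.
    starts = simulate_ab_limiter(33333, 10000, 3000, 100)
    print(round(average_fps(starts, 1_000_000), 2))
```

With a 3 ms gap that the limiter never sees, the simulated rate lands in the 27-28 fps range, which is at least consistent with the 24-28 drop described earlier. If the assumed gap happens to equal the real one, the period comes out exact, but a fixed guess cannot track a gap that changes with load.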

Ok, got it to run at exactly 30.0 FPS (average of last 3 frames). I've merged A+B into one point, and made it CPU frequency independent, all that with just timeGetTime. Now, time to make it actually usable... ;p
And then I ran out of time. I've nailed down the frame limiter for field (field is special as there are actually two frame limiters, which allows FF7 to do a little bit of frame-skipping) and most of the other modules. The rebuilt frame limiter (a really simple one) works fine, although it didn't get much testing.
The function that seems to deal with the frame limiter is 6384E6.

The better news is that the world map limiter seems to work - so maybe I can do a comparison?  The world map limiter is at 74C7E4.
 
Ok I'm getting somewhere now.  The frame limiter above is probably not broken.  It basically keeps the game in a loop until a number of ticks has been satisfied.  It's very simple to reprogram despite how ridiculous the code looks above.

Unfortunately, that's not where the issue lies.  At least, not how I thought.  I am going to cook something up later!
 
Like I PMed you, it's got to be based on timeGetTime to be accurate. RDTSC is unusable on modern CPUs. Instead of looping until the ticks are done, it SHOULD be pre-rendering the next frame (unless that's what it's waiting on).
 
It's a silly system...  it's not doing what a normal limiter does.  It's keeping the game in a while loop for a period of ticks before letting it carry on.  I've never seen something like that before.  If it's simply incrementing ticks in that function, then that's also a weird way of doing it.  When using TGT or QPC, there is no increment... it's done independently of the function.  I should be able to quite easily fix this up with a TGT or even QPC.  And it should work fine.
 
QPC is the "correct" way to go as TGT is OS clock dependent and QPC is more CPU cycles dependent. QPC is the more modern version of TSC and still has to be calibrated prior to use.

At best you'd want a timer with better than 16.6 ms resolution. TGT gets 1 ms precision at best and QPC is measured in μs.

Optimally there'd be a single timer thread in the background that's sending signals back to the draw thread to display what it has ready and begin processing the next frame. (V-Sync, anyone?) HOWEVER, since FFVII has THREE graphics modes (15, 30 and 60 fps) it can't rely on a fixed timer. It's always adjusting it when transitioning from one module to the next.
 
TGT doesn't have the same issue that the older method does - and I think the way they did this limiter was completely bollocks anyway.  You're right that TGT is close to 1ms (when timeBeginPeriod(1) is called), but that should be good enough when your target rate is 30fps.
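A quick back-of-envelope check of the "1 ms is good enough" point, as a Python sketch. It takes the pessimistic case where every frame period gets rounded to a whole number of timer steps; `fps_range` is just an illustrative helper, not anything from the game or the driver.

```python
import math


def fps_range(target_fps, timer_resolution_ms):
    """Lowest and highest frame rate if each frame is rounded to whole timer steps."""
    ideal_ms = 1000.0 / target_fps
    steps = ideal_ms / timer_resolution_ms
    short_ms = math.floor(steps) * timer_resolution_ms  # frame ends one step early
    long_ms = math.ceil(steps) * timer_resolution_ms    # frame ends one step late
    return 1000.0 / long_ms, 1000.0 / short_ms


if __name__ == "__main__":
    for fps in (15, 30, 60):
        low, high = fps_range(fps, 1.0)
        print(fps, round(low, 2), round(high, 2))
```

At 30 fps a 1 ms timer keeps each frame between 33 and 34 ms, so roughly 29.4 to 30.3 fps per frame. A limiter that measures against the previous frame's end time should average closer to the target than this per-frame bound suggests.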

QPC has issues with older CPUs but shouldn't with the vast majority out now.

QPC is already there at CFF8D8, so that's convenient (Aali's driver prob adds that support).
 
The frame limiter is fixed - it will give a perfect 30 (or 60 in battle) recording or otherwise.  The scroll, like at Wallmarket is also now great.  And it should be v easy to export this to Steam and to all other problem areas.


 8-) 8-) 8-) 8-) 8-) 8-) 8-) 8-) 8-) 8-) 8-)

This is how you fix it:

Code:
StartQPC := CurrentQPC;
DeltaQPC := StartQPC - LastQPC;
While (DeltaQPC + CurrentQPC) - StartQPC < ClockTarget do
begin
  // do nothing.  Effectively suspend play.
end;
LastQPC := CurrentQPC;
The delta time was not being correctly factored in with the old process.  ClockTarget is the number of QPC ticks per frame: the number of QPC ticks a second divided by 30, for field.  It's saved at cff890.
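The same loop as a runnable Python sketch, for anyone who wants to experiment with it outside the game. `time.perf_counter_ns` stands in for QPC, the target is taken as ticks per second divided by the frame rate, and the class and parameter names are made up for illustration. It busy-waits just like the pseudocode above.

```python
import time


class FrameLimiter:
    """Hold each frame until a full frame period has passed since the last release."""

    def __init__(self, fps, clock=time.perf_counter_ns,
                 ticks_per_second=1_000_000_000):
        self.clock = clock                          # stand-in for CurrentQPC
        self.target = ticks_per_second // fps       # ClockTarget: ticks per frame
        self.last = clock()                         # LastQPC

    def wait(self):
        start = self.clock()                        # StartQPC
        delta = start - self.last                   # time already spent this frame
        now = start
        while delta + (now - start) < self.target:  # do nothing: suspend play
            now = self.clock()
        self.last = now                             # next frame is measured from here
        return now


if __name__ == "__main__":
    limiter = FrameLimiter(30)
    previous = limiter.wait()
    for _ in range(5):
        time.sleep(0.005)                           # pretend to do 5 ms of work
        current = limiter.wait()
        print((current - previous) / 1_000_000, "ms")
        previous = current
```

Because the wait is measured from the previous release rather than from the start of the current frame's work, everything that happens between frames is counted, which is the part the old process missed.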
 
Dude.  Congrats.  I know this has been bugging you for a long time.
 
Thanks man!

And yeah, it has!!!!

It's a definite improvement, as you'll see.  Much smoother.  Even the screen fades.
 
I have to ask (LOL): why isn't it using the Vsync event at all?
Swapping rendered buffers is what it's for. I'm pretty sure the PS1 version does that (I could be wrong though).
That would make it asynchronous. What they are doing is consuming lots of time by looping (software timing is very poor in accuracy) and trying to synchronise timing that way.
Most of the time on the PC you create threads that handle different sets of event processing; however, the port may be old enough that Microsoft hadn't implemented threads yet (win32 / win95-98 era?).
Anyhow I digress, congratulations. Remember, I don't believe Square did the porting, and according to halkun they didn't have the released PS1 code as their base for the port.
Many of the PC bugs didn't exist on the PS1 (likely because on the PS1 they would have used the vsync event by setting a bit in an ISR, then having an event loop check it, swap the displayed frame buffer and start on the next frame). Swapping the frame buffer likely would take microseconds during the vsync interval.

Unfortunately I doubt you can do this with how they constructed the PC version of the game without some major adjustment to the original code.

Cyb
 
Vsync does affect delta time (if enabled, of course)...and will limit the game if the limiter does not (so if your maximum refresh rate is 60, the best fps you will get is 60.  But since no module (field etc) goes over that, most players are unaware of that limit).

In other words, if you had no code above at all, your refresh rate was 60, and vsync enabled... the whole game would run at 60 (meaning field, battle, world map would be incorrect speed without fixes... and the menu would be fine).
 
Anyone wanting to change texture placements on screen in the chocobo minigame: the function to search for is call 0077A469

With that you can see the table address above it, which includes its ID and X Y pos.
 
For 1998 English game using latest Aali's driver, this is how:

#These 4 entries shift the field down (all layers).
640C77 = B8 E8 00 00 00
640FD4 = B8 E8 00 00 00
641397 = B8 E8 00 00 00
{Aali's latest driver}
10045A32 = BB E8 00 00 00

# offset screen Y by 16 pixels.
CFF1E4 = 10 00

#Shift cursor down 16 pixels.
CFF200 = F0 00

# offset FADE in/out Y by 16 pixels.
CFFAE0 = 10 00

That's not to say I am confident that this is complete.  But a few testers would go a long way.  It does look ok.  But who knows?  It's hard to tell! 

The FMV isn't centred yet - but that shouldn't be much of a problem. 

The Steam version of the game needs AF3DN.p patching. The memory addresses move, so here is the file address:

C6B2 = BB E8 00 00 00
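If you want to script the entries above rather than hex edit by hand, here is a rough Python sketch that parses the "address = bytes" lines and writes them into a byte buffer. The helper names are made up, and it only makes sense for entries given as file addresses (like the AF3DN.p one); the memory addresses would first need converting to file offsets, which this does not attempt.

```python
def parse_patch_line(line):
    """Turn 'C6B2 = BB E8 00 00 00' into (0xC6B2, bytes); skip comments and notes."""
    text = line.strip()
    if not text or text[0] in "#{" or "=" not in text:
        return None
    address, _, data = text.partition("=")
    return int(address.strip(), 16), bytes.fromhex("".join(data.split()))


def apply_patch(image, offset, data):
    """Overwrite len(data) bytes of a bytearray at the given offset."""
    if offset < 0 or offset + len(data) > len(image):
        raise ValueError("patch does not fit inside the image")
    image[offset:offset + len(data)] = data


if __name__ == "__main__":
    offset, data = parse_patch_line("C6B2 = BB E8 00 00 00")
    image = bytearray(0x10000)          # placeholder buffer, not a real AF3DN.p
    apply_patch(image, offset, data)
    print(hex(offset), data.hex(" "))
```

For reference, B8 and BB followed by four bytes are `mov eax, imm32` and `mov ebx, imm32`, so those entries load E8h (232) into a register, and the two-byte entries are little-endian words: 10 00 is 16 and F0 00 is 240.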
 