In which a better GPU makes me debug a shader

I haven't gotten much new work done on my games lately, because I've mostly been getting ready to show them at GeekGirlCon. That means polishing things up enough to show, and also getting my booth ready.

Part of getting my booth ready meant getting a second demo machine to run the games on, and since I'd been itching to get a Windows gaming PC for a while, I used this as an excuse to build one. Unlike most of my computers, which have either Intel or AMD GPUs, I built this one with an nVidia GTX 1050 Ti, which is a pretty decent midrange GPU (and better than what's in my other computers). (I'd have gone with something better but I decided to save money. I'll probably upgrade to a 1080 or whatever the closest equivalent is sometime down the road.)

Anyway, Little Bouncing Ball had been running great on every single machine I'd tried it on, aside from some weird precision issues with the water effect on lower-spec GPUs that don't support 32-bit floating-point buffers. For those systems I tried a couple of ad-hoc hacks, as well as clamping the range of motion in the physics compute part of the water effect, but none of that helped, so I ended up just disabling water on those GPUs. I left the clamp() call in, though, since it wasn't hurting anything and seemed like it might prevent certain weirdnesses from happening.

So, fast-forward to tonight, when I finally had an nVidia-based machine up and running, and for some reason the water effect wasn't working at all; even before it became active, the physics seemed to be pinning the depth value to 1! I couldn't figure out what was going on, and since it was working on all the other GPUs I'd thrown at it, I figured there must be some problem with the GLSL compiler on Windows, or maybe something weird with how nVidia handles floating-point data or the like.

Anyway, at one point I decided to see what would happen if I removed the clamp() call, and suddenly it worked perfectly. And I figured, since I was only clamping for the benefit of GPUs where the water effect is now disabled anyway, I'd just remove it.

Then I looked closer at the clamp() call and realized something:

The syntax of clamp() is clamp(x,minVal,maxVal)...

and I had written clamp(minVal,maxVal,x). So why was this even working in the first place?!

So, in this particular case, x was usually between -1 and 1, and minVal and maxVal were -1 and 1, respectively. So, the actual code was clamp(-1,1,x), which GLSL reads as clamping the value -1 to the range from 1 to x. The astute will notice that this means that x (interpreted as maxVal) would usually be less than 1 (interpreted as minVal), and clamp() is undefined whenever minVal is greater than maxVal.

There are two basic ways of implementing clamp(); you can either implement it as min(maxVal,max(minVal,x)), or max(minVal,min(x,maxVal)). (You can also do it with a bunch of nasty branches but you generally don't see those in GPUs, which likely have actual opcodes for min() and max() in silicon, or at least make use of predicate logic or the like. For the purpose of this discussion that doesn't actually matter.)

In the first case, my broken code would turn into min(x,max(-1,1)); since x was usually between -1 and 1 anyway, this would appear to be correct; in effect it would be exactly equivalent to min(x,1). This must be what Intel and AMD do.

But in the second case, it would turn into max(1,min(-1,x)). Since min(-1,x) can never be greater than -1, this always evaluates to exactly 1, no matter what x is. That matches the depth value getting pinned to 1, so this is presumably what nVidia does.
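To double-check the reasoning, here's a minimal sketch of the two clamp() formulations in Python (standing in for GLSL, since it's easier to run on the CPU). The function names are made up for illustration, and this only models the two formulas described above; it says nothing certain about what any driver actually emits.

```python
def clamp_min_outer(x, min_val, max_val):
    """First formulation: min(maxVal, max(minVal, x))."""
    return min(max_val, max(min_val, x))


def clamp_max_outer(x, min_val, max_val):
    """Second formulation: max(minVal, min(x, maxVal))."""
    return max(min_val, min(x, max_val))


# The buggy call was clamp(-1, 1, x): the value being clamped is -1,
# minVal is 1, and the real x ends up in the maxVal slot.
for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    first = clamp_min_outer(-1.0, 1.0, x)
    second = clamp_max_outer(-1.0, 1.0, x)
    print(f"x={x:+.1f}  min-outer={first:+.1f}  max-outer={second:+.1f}")
```

With the arguments in the right order the two functions agree for any sane range; with the swapped order, the first hands x back unchanged for everything in [-1, 1], while the second returns 1 every time.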

Well, then. Mystery solved. (And I have also accidentally reverse-engineered an implementation detail of three major GLSL compilers! I wonder if this could be used for anything interesting; I'm trying to imagine gameplay that relies on the notion of qualia.)

So anyway, that's what I've been up to.

Other changes since the last devlog:

  • Tweaked the dialog timing in Strangers based on some user feedback
  • Lots of fiddly changes to build system stuff
  • Tinkering with some pixel format stuff
  • Fixed a broken conversation progression
  • Added a readme.txt to the app directory, which should keep the app from erroneously expanding the .love bundle version (and hopefully make life a bit easier for Linux users)

As a reminder, if you're playtesting Strangers for me, your logs will be in a directory called SockpuppetRefactor, which lives below ~/Library/Application Support on macOS, %APPDATA% on Windows, or ~/.local/share/love/ on Linux.
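If you'd rather not hunt for it by hand, here's a small Python sketch that builds the log directory path from the locations listed above. The log_dir() helper and its parameters are hypothetical (they aren't part of the game), and it assumes Python's usual sys.platform values: "darwin", "win32", and "linux".

```python
import os
import sys
from pathlib import Path

SAVE_DIR_NAME = "SockpuppetRefactor"  # directory name from the note above


def log_dir(platform=sys.platform, home=None, appdata=None):
    """Return the expected log directory for the given platform string."""
    home = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / SAVE_DIR_NAME
    if platform.startswith("win"):
        # Fall back to the APPDATA environment variable if none is passed in.
        appdata = appdata if appdata is not None else os.environ["APPDATA"]
        return Path(appdata) / SAVE_DIR_NAME
    if platform.startswith("linux"):
        return home / ".local" / "share" / "love" / SAVE_DIR_NAME
    raise ValueError(f"unsupported platform: {platform}")
```

Calling log_dir() with no arguments uses the current platform and home directory; passing them explicitly is mostly useful for checking what the path would be on another machine.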


macOS, x86 64-bit (32 MB)
Version 834ff16 28 days ago
Windows, x86 64-bit (22 MB)
Version 834ff16 28 days ago
Windows, x86 32-bit (22 MB)
Version 834ff16 28 days ago
LÖVE bundle (requires LÖVE 0.10.2) (19 MB)
Version 834ff16 28 days ago
