In which a better GPU makes me debug a shader

7 years ago by fluffy

I haven't gotten much new work done on my games because I'm mostly getting ready to show them at GeekGirlCon, which means polishing stuff enough for show and also getting my booth ready.

Part of getting my booth ready meant getting a second demo machine I could run the games on, and since I'd been itching to get a Windows gaming PC for a while, I used this as an excuse to build a pretty decent one. Unlike most of my computers, which have either Intel or AMD GPUs, this one I built with an nVidia 1050Ti, which is a solid midrange GPU (and better than what's in my other computers). (I'd have gone with something better but I decided to save money. I'll probably upgrade to a 1080 or whatever is the closest equivalent sometime down the road.)

Anyway, Little Bouncing Ball was running great on every single machine I tried it on before, aside from some weird precision issues with the water effect on lower-spec GPUs that don't support 32-bit floating-point buffers. For those systems I tried a couple of ad-hoc hacks, including clamping the range of motion for the physics compute part of the water effect, but that didn't help, so I ended up just disabling water for those GPUs. But I left the clamp() call in, since it wasn't hurting anything and seemed like it might prevent certain weirdnesses from happening.

So, fast-forward to tonight, when I finally have an nVidia-based machine up and running, and for some reason the water effect isn't working at all; even before it becomes active, the physics seems to be pinning the depth value to 1! It was bizarre and I couldn't figure out what was going on, and I figured that since it was working on all the other GPUs I'd thrown at it, there must be some problem with the GLSL compiler on Windows, or maybe something weird with how nVidia handles floating-point data or the like.

Anyway, at one point I decided to see what happens if I remove the clamp() call, and suddenly it worked perfectly. And I figured, since I was only clamping for the benefit of GPUs where the water effect is disabled anyway, I'd just remove it.

Then I looked closer at the clamp() call and realized something:

The syntax of clamp() is clamp(x,minVal,maxVal)...

and I had written clamp(minVal,maxVal,x). So why was this even working in the first place?!

So, in this particular case, x was usually between -1 and 1, and minVal and maxVal were -1 and 1, respectively. So, the actual code was clamp(-1,1,x), which gets read as x = -1, minVal = 1, and maxVal = x. The astute will notice that this means that x (interpreted as maxVal) would usually be less than 1 (interpreted as minVal), and GLSL leaves the result of clamp() undefined when minVal is greater than maxVal.

There are two basic ways of implementing clamp(); you can either implement it as min(maxVal,max(minVal,x)), or max(minVal,min(x,maxVal)). (You can also do it with a bunch of nasty branches but you generally don't see those in GPUs, which likely have actual opcodes for min() and max() in silicon, or at least make use of predicate logic or the like. For the purpose of this discussion that doesn't actually matter.)
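To make the two orderings concrete, here's a minimal Python sketch standing in for the GLSL built-ins. The function names clamp_min_outer and clamp_max_outer are mine, purely for illustration, and aren't taken from any real driver. With valid arguments (minVal no greater than maxVal) both orderings give the same answer:

```python
def clamp_min_outer(x, min_val, max_val):
    """clamp() written as min(maxVal, max(minVal, x)): min() is applied last."""
    return min(max_val, max(min_val, x))


def clamp_max_outer(x, min_val, max_val):
    """clamp() written as max(minVal, min(x, maxVal)): max() is applied last."""
    return max(min_val, min(x, max_val))


# Hypothetical sample values; with min_val <= max_val the orderings agree.
samples = [-2.0, -1.0, -0.25, 0.0, 0.5, 1.0, 3.0]
results_a = [clamp_min_outer(x, -1.0, 1.0) for x in samples]
results_b = [clamp_max_outer(x, -1.0, 1.0) for x in samples]

print(results_a)
print(results_b)
```

The two only diverge once minVal is greater than maxVal, which is exactly the situation the swapped arguments create.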

In the first case, my broken code would turn into min(x,max(1,-1)), which is exactly equivalent to min(x,1); since x was usually between -1 and 1 anyway, this would just pass x through and appear to be correct. This must be what Intel and AMD do.

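But in the second case, it would turn into max(1,min(-1,x)). Since min(-1,x) can never be greater than -1, the outer max() always picks the 1, meaning the resulting value would always be EXACTLY 1, no matter what x is. That matches the depth value getting pinned to 1.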
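Here's a rough Python stand-in for what each ordering does with the swapped arguments. It's only a sketch: the helper names (clamp_min_outer, clamp_max_outer, buggy_call) and the sample depths are hypothetical, and I'm only guessing that real drivers behave like one ordering or the other.

```python
def clamp_min_outer(x, min_val, max_val):
    # min(maxVal, max(minVal, x))
    return min(max_val, max(min_val, x))


def clamp_max_outer(x, min_val, max_val):
    # max(minVal, min(x, maxVal))
    return max(min_val, min(x, max_val))


def buggy_call(clamp_impl, x):
    # Mirrors the broken clamp(-1, 1, x): -1 lands in the x slot,
    # 1 in the minVal slot, and the real value in the maxVal slot.
    return clamp_impl(-1.0, 1.0, x)


depths = [-0.75, 0.0, 0.5, 1.0]
passthrough = [buggy_call(clamp_min_outer, d) for d in depths]
pinned = [buggy_call(clamp_max_outer, d) for d in depths]

print(passthrough)  # the first ordering hands the values back unchanged
print(pinned)       # the second ordering returns 1.0 every time
```

So with the first ordering the bug is invisible as long as the value stays at or below 1, while with the second the output sticks at 1 regardless of input.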

Well, then. Mystery solved. (And I have also accidentally reverse-engineered an implementation detail of three major GLSL compilers! I wonder if this could be used for anything interesting; I'm trying to imagine gameplay that relies on the notion of qualia.)

So anyway, that's what I've been up to.

Other changes since the last devlog:

Tweaked the dialog timing in Strangers based on some user feedback
Lots of fiddly changes to build system stuff
Tinkering with some pixel format stuff
Fixed a broken conversation progression
Added a readme.txt to the app directory to hopefully keep the itch.io app from expanding the .love bundle version erroneously (hopefully making life a bit easier for Linux users)

As a reminder, if you're playtesting Strangers for me, your logs will be in a directory called SockpuppetRefactor which lives below ~/Library/Application Support (on macOS), %APPDATA% (on Windows), or ~/.local/share/love/ (on Linux).

Files

macOS, x86 64-bit 43 MB

Version 834ff16 Sep 19, 2017

Windows, x86 64-bit 30 MB

Version 834ff16 Sep 19, 2017

Windows, x86 32-bit 30 MB

Version 834ff16 Sep 19, 2017

LÖVE bundle (requires LÖVE 0.10.2) 26 MB

Version 834ff16 Sep 19, 2017
