Development of texture-based interactive splatters
Mechanic Design
A few weeks ago I decided to start a new solo project centred around a unique mechanic. I love the visceral feedback of games like APE OUT and Fruit Ninja, and I was curious to see if these effects could interact more functionally with gameplay instead of being purely aesthetic.
The project is still in early development, but I wanted to share a bit about my process while creating the game’s most complex, central mechanic.
I’ve settled on making a top-down shooter in which the gameplay is centred around a sort of functional “soft-gore”, in which enemies explode when defeated, leaving behind splatters of fuel. The player could then suck up this fuel to refill some of their own resources. My goal is for this to encourage the player to move around the map constantly, refilling their ammo/health and creating a more intense combat gameplay loop.
I had a few core goals for the splatter mechanic:
- The splatter should conform to surrounding level geometry, preventing it from going through or around any nearby walls.
- The splatter should act like a 2D volume: it should disappear when sucked up, and I should be able to track the exact amount sucked up each frame.
- The splatters should be performant, even in great numbers (perhaps more than 100 on screen at a time).
Conforming to Level Geometry
I decided to start with the first goal of procedurally generating a mesh which is “carved out” by surrounding level geometry. I began by researching similar algorithms and found a series by Sebastian Lague in which he creates a field-of-view visualisation.
I wrote my own implementation of the same basic algorithm, running inside a quad-shaped frame instead of a circle (to make textures easier to work with later on), but it was slower than I wanted due to the number of raycast calls each frame. Cutting them down required writing a unique algorithm, so let’s get to it.
My Approach
To reduce the number of raycast calls I broke down the problem by asking myself: “If I were to create this mesh manually, where would I place the vertices to get the most accurate mesh with the fewest triangles?”
This led me to determine that there are only 4 possible reasons for a vertex to be generated in the mesh:
- The mesh vertex lies directly on the vertex of a collider.
- The mesh vertex is a projection from the edge of a collider.
- The mesh vertex bridges a discontinuity between 2 other vertices in the mesh (eg. intersecting colliders, or implicit surfaces like circles).
- The mesh vertex lies on the mesh bounds (ie. it hit nothing).
Step 1: Finding the relevant vertices
First we do an OverlapBox() test to find all colliders which are within range of the bounds of the mesh.
Then, using each collider’s position, orientation and size, we build an array of points that represents its shape, each point carrying a set of tangent vectors pointing to the adjacent vertices in the shape.
For each of these arrays, add the points to a combined list if they satisfy each of the following conditions:
- Front-facing (ie. edge normal points towards the centre of the mesh).
- Within the mesh bounds.
- Visible from the centre of the mesh (using a linecast).
Each corner of the mesh bounds is also added if it’s visible from the centre of the mesh.
Finally, this list is sorted by the angle from the centre of the mesh to each of the points.
What we have is a list of all the in-bounds vertices which are visible from the centre of the mesh, as well as the tangent lines that they connect to.
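Here’s a minimal Python sketch of the filtering and sorting from this step. It isn’t the game’s code: the function names are made up for illustration, it assumes colliders are wound counter-clockwise and that the mesh bounds are a square, and it leaves out the linecast visibility test since that needs a physics engine.

```python
import math

def is_front_facing(edge_start, edge_end, centre):
    # Assumption: counter-clockwise winding, so the outward normal of an
    # edge with direction (dx, dy) is (dy, -dx).
    dx = edge_end[0] - edge_start[0]
    dy = edge_end[1] - edge_start[1]
    to_centre_x = centre[0] - edge_start[0]
    to_centre_y = centre[1] - edge_start[1]
    # Front-facing when the normal points towards the centre of the mesh.
    return dy * to_centre_x + (-dx) * to_centre_y > 0.0

def in_bounds(point, centre, half_extent):
    # Square mesh bounds centred on `centre` (hypothetical parameterisation).
    return (abs(point[0] - centre[0]) <= half_extent
            and abs(point[1] - centre[1]) <= half_extent)

def sort_by_angle(points, centre):
    # Order the surviving points by their angle around the centre.
    return sorted(points,
                  key=lambda p: math.atan2(p[1] - centre[1], p[0] - centre[0]))

# A box to the right of the centre: only its left edge faces the centre.
centre = (0.0, 0.0)
box = [(1, -1), (3, -1), (3, 1), (1, 1)]
left_edge_visible = is_front_facing(box[3], box[0], centre)
right_edge_visible = is_front_facing(box[1], box[2], centre)
```

In this example the box’s left edge passes the front-facing test and its right edge doesn’t, so only the two near corners would go on to the bounds and visibility checks.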
Step 2: Projecting edge vertices
From here we can use the tangents of each vertex to determine whether it connects a front-facing edge and a back-facing edge (using the dot product). If it does, the vertex should be projected onto the geometry (or mesh bounds) directly behind it in the scene. I’ve decided to refer to these points as “projected vertices”.
We can tell whether the projected vertex should go before or after the index of the vertex it was projected from based on the tangent vectors of the casting vertex (the front-facing normal is either first or second in the pair). Once the projected vertices are in place in the list, we update the casting and projected vertex tangents so that they connect smoothly.
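To make the front-facing/back-facing check concrete, here’s a small Python sketch. The function name and return values are hypothetical, and the “before”/“after” result only holds under my assumptions: counter-clockwise colliders, tangents pointing from the vertex to its previous and next neighbours, and a list sorted by ascending angle.

```python
def classify_vertex(vertex, tangent_prev, tangent_next, centre):
    """Return None for an ordinary corner, or where the projected vertex
    should be inserted relative to this one ("before" or "after")."""
    to_centre = (centre[0] - vertex[0], centre[1] - vertex[1])
    # Outward normals of the two edges meeting at this vertex
    # (assumes counter-clockwise winding).
    normal_prev = (-tangent_prev[1], tangent_prev[0])
    normal_next = (tangent_next[1], -tangent_next[0])
    front_prev = normal_prev[0] * to_centre[0] + normal_prev[1] * to_centre[1] > 0.0
    front_next = normal_next[0] * to_centre[0] + normal_next[1] * to_centre[1] > 0.0
    if front_prev == front_next:
        return None  # both edges face the same way, nothing to project
    # The hidden region lies on the side of the back-facing edge.
    return "before" if front_prev else "after"

# Box corners A=(1,-1), B=(3,-1), C=(3,1), D=(1,1), seen from the origin.
corner_a = classify_vertex((1, -1), (0, 2), (2, 0), (0, 0))
corner_d = classify_vertex((1, 1), (2, 0), (0, -2), (0, 0))
```

For the box in the example, both near corners sit between a visible edge and a hidden one, so each would cast a projected vertex onto whatever lies behind it.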
Step 3: Ensuring full continuity
Finally we do a pass all the way around the perimeter by iterating through the sorted list, and compare the positions and tangents of consecutive points against some configurable thresholds to see if they are continuous.
For any points which are not continuous, we fall back to an approximation which functions similarly to a binary search. Here’s how that works.
Fire a ray halfway between the discontinuous points, and check if the hit is continuous with the first (min) point, narrowing the search as we go.
Keep doing this until either we reach continuity between the beginning and ending points, or the number of iterations reaches the maximum and we settle for the level of detail which has been found. Then once again, update the tangents of these points accordingly to connect them smoothly.
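The search can be sketched in a few lines of Python. This is a simplified stand-in rather than the real thing: cast_ray and is_continuous are hypothetical callbacks, and a “hit” here is just a distance along the ray instead of a full point with tangents.

```python
def refine_discontinuity(cast_ray, lo_angle, hi_angle, is_continuous,
                         max_iterations=8):
    """Narrow the gap between two discontinuous hits, binary-search style."""
    lo = cast_ray(lo_angle)
    hi = cast_ray(hi_angle)
    for _ in range(max_iterations):
        if is_continuous(lo, hi):
            break  # the two ends now connect, so we can stop early
        mid_angle = 0.5 * (lo_angle + hi_angle)
        mid = cast_ray(mid_angle)
        if is_continuous(lo, mid):
            lo_angle, lo = mid_angle, mid   # the break is further along
        else:
            hi_angle, hi = mid_angle, mid   # the break is behind us
    return (lo_angle, lo), (hi_angle, hi)

# Toy scene: a near wall up to angle 0.3, then a far wall.
def cast_ray(angle):
    return 1.0 if angle < 0.3 else 5.0

def is_continuous(a, b):
    return abs(a - b) < 0.5

near_side, far_side = refine_discontinuity(cast_ray, 0.0, 1.0, is_continuous, 16)
```

Each iteration costs one raycast and halves the angular gap, so even a small iteration cap pins the step between the two surfaces down tightly.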
Once this is complete, we’ve found the perimeter of the mesh!
Finishing up with the mesh
Each splatter only needs to generate this mesh once, when it’s created, but the code is highly performant, far more so than what I started with: it takes less than 0.15ms to re-scan and regenerate a large mesh each frame.
I’m actually using this exact same code for the rendering of the environment to fake a similar 3D aesthetic to APE OUT in a 2D rendering / physics environment. This means my level design (and possibly level generation code, if I go procedural) will only need to deal with creating 2D geometry, and the illusion of the 3D environment is created emergently.
The dynamic mesh used for this rendering tech matches the camera’s orthographic size and regenerates each frame, then writes to the stencil buffer as a mask using several RenderObject passes in URP, which utilise layers in the engine to mask objects in the scene quickly and efficiently.
Now onto the hard part, sucking up the fuel.
From geometry to shaders
Now that I had the mesh that the splatter texture would be rendered on, I needed to be able to remove areas of it on demand.
Once again I started by breaking this problem down into a few parts:
- Writing to a mask texture representing the erased regions.
- Using the mask to make the splatter disappear in-game.
- Combining the mask and the splatter texture to determine how much of the splatter is removed with each brushstroke.
Before we begin, I’d like to address some details for those first two parts.
Originally I had planned to write to the mask using a simple circular brush and then distort the mask when reading from it in the material shader to create a more “blobby” edge when erasing sections. But I ended up determining that it was actually cheaper to do this in reverse, writing to the mask with a warped brush and reading the mask without extra distortion.
There are 2 reasons for this adjustment.
- The splatter material is transparent since it needs to cover various surfaces in-game. Having an expensive fragment shader (with lots of noise samples) would be an unnecessary performance hit since it would essentially be recalculating the same shape each frame.
- Reading from the mask to determine how much paint has been removed wouldn’t factor in the mask distortion, and therefore the results would not be fully accurate to what the player sees.
With that out of the way, let’s get into some shaders.
Step 1: The mask and the brush
Writing to the mask is fairly straightforward.
First, we’ll set up 2 identical RenderTextures, which I’ll refer to as the primary and secondary masks.
Next we’ll create a shader that combines the previous state of the primary mask and a coordinate position representing the centre of the brush (along with some other brush variables) to output a version of the primary mask after being affected by the brush.
Now we’ll render the mesh of the splatter object onto the secondary mask using this shader, passing in the position we want to erase at, and then use Blit() to copy the result back to the primary mask, which is no longer being read by the shader. Using two textures this way means the shader never reads from the texture it is writing to, and it lets us make lasting modifications to the mask.
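As a CPU-side illustration of that two-texture pattern, here’s a Python sketch with nested lists standing in for the RenderTextures. The class and function names are invented for the example, and it uses a plain circular brush rather than the warped one described earlier.

```python
def apply_brush(src, centre, radius):
    """Return a new mask: the previous state plus one circular brush stamp."""
    height, width = len(src), len(src[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Test the pixel centre against the brush circle.
            dx, dy = x + 0.5 - centre[0], y + 0.5 - centre[1]
            stamp = 1.0 if dx * dx + dy * dy <= radius * radius else 0.0
            out[y][x] = max(src[y][x], stamp)
    return out

class PingPongMask:
    def __init__(self, width, height):
        self.primary = [[0.0] * width for _ in range(height)]
        self.secondary = [[0.0] * width for _ in range(height)]

    def erase(self, centre, radius):
        # Read from the primary mask, write into the secondary mask...
        self.secondary = apply_brush(self.primary, centre, radius)
        # ...then copy the result back (the Blit() step).
        self.primary = [row[:] for row in self.secondary]

mask = PingPongMask(4, 4)
mask.erase((2.0, 2.0), 1.0)
mask.erase((0.0, 0.0), 1.0)
erased_pixels = sum(sum(row) for row in mask.primary)
```

The second stamp builds on the first because each write starts from the previous state of the primary mask, which is what makes the erasing persistent.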
Step 2: Visualising the mask in-game
Now we need to make a shader to combine the primary mask and the splatter texture. Since we don’t want the mask to be crazy large (and therefore take up lots of memory) we’ll also want a way to make the edges of the removed areas look as smooth as the edges of the main splatter texture itself. Signed distance fields have got us covered here.
If we say that a pixel in the mask is considered “sucked up” when it is within some threshold, we can create a smooth seam between the erased and non-erased areas, akin to anti-aliasing (almost) for free.
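In code, the idea looks roughly like this. It’s a Python sketch of the maths only, not the Shader Graph itself, and the threshold and softness parameters are names and values I’ve picked for illustration.

```python
def smoothstep(edge0, edge1, x):
    # Standard smoothstep: 0 below edge0, 1 above edge1, smooth in between.
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

def splatter_alpha(texture_alpha, mask_value, threshold=0.5, softness=0.1):
    # How "erased" this pixel is, with a soft band around the threshold.
    erased = smoothstep(threshold - softness, threshold + softness, mask_value)
    # Multiply the splatter's own alpha by whatever is left.
    return texture_alpha * (1.0 - erased)
```

Because the mask is sampled with filtering, its values ramp smoothly between texels, and the narrow smoothstep band turns that ramp into a clean edge even when the mask is low resolution.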
I made the SDF mask shader in Shader Graph since it’s quite a bit faster and easier to work with for game visuals than HLSL (in my opinion).
Thanks to front-loading the cost of the noise distortion into the brush rather than the material shader, this ends up being pretty simple to put together: just a SmoothStep for the SDF and some multiplies for the mask get it all working.
Except if we check this out in-game we can see a problem.
We can write to the mask with the brush material, and it works nicely away from the edges of the mesh. Near the edges, though, there are stray pixels around the border of the mesh which seemingly can’t be written to in the mask texture, causing a jagged appearance.
What’s happening here?
Since the process of writing to the mask is executed by a shader (and therefore passes through the rasterizer), some pixels are automatically ignored by the rasterizer when writing to the mask, as their centre does not lie within a triangle in the mesh.
In other words, our brush can only write to fragments whose centre is contained in the mesh geometry, which doesn’t encompass all the required pixels. This looks terrible, so we should fix it before moving on.
Thankfully it’s quite a fast and easy fix.
To do that, let’s write another shader that reads the mask and returns an extended version which simply grows the areas that have been written to by 1 pixel, without modifying the actual masks (primary/secondary).
This is pretty similar to a single step of a cellular automaton, and it’s super cheap. All we have to do for each fragment is iterate over the immediate neighbours of that pixel in the mask, and return the maximum value across all of them!
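Here’s what that filter does, sketched in Python on a nested-list mask. I’m assuming “immediate neighbours” means the 3×3 block around each pixel (including the pixel itself) and that lookups are clamped at the texture border.

```python
def dilate(mask):
    """Return a copy of the mask grown by one pixel in every direction."""
    height, width = len(mask), len(mask[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Maximum over the 3x3 neighbourhood, clamped to the texture.
            out[y][x] = max(
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            )
    return out

source = [[0.0] * 5 for _ in range(5)]
source[2][2] = 1.0
grown = dilate(source)            # one erased pixel becomes a 3x3 block
grown_twice = dilate(grown)       # filtering the filtered result keeps growing
```

Note that the input mask is left untouched and the result goes into a separate texture. Running the filter on its own output grows the region again, which is the kind of runaway spreading you’d get if the filtered result were ever written back into the masks.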
To safely use this shader we’ll set aside another RenderTexture that I’ll call the “output mask” and then write to it with the mask extension shader using another Blit() call with the secondary mask as an input.
This output mask is now acting like a filtered version of the secondary mask, giving us the proper results!
It’s important we don’t feed this filtered version back into the brush material, otherwise the filter will propagate its effects throughout the entire mask, creating an undesirable feedback loop.
Step 3: Compute shaders, atomic counters and the geometry mask
We’re on a roll so far: our splatter appears to disappear in-game, but we still have no way of knowing how much fuel we’ve actually sucked up.
We could technically write a method to count each pixel in the mask and do all those comparisons on the CPU, but that would be pretty slow, especially since we’re aiming to support at least 100 splatter objects simultaneously.
This is a fantastic task to parallelise and hand over to our GPU.
So we’re gonna write a compute shader.
This compute shader will take our mask and splatter texture as inputs, process them asynchronously to the rest of the game and then return a pair of integers when it’s all done. These integers will represent the number of pixels in the mask which correspond to visible pixels in the splatter texture, and the subset of those which are greater than our erase threshold.
Writing the compute shader is fairly simple: we’ll dispatch it in thread groups of (8, 8, 1) threads so that the mask texture is processed in several smaller tiles. In each group we’ll iterate over the pixels in that portion of the mask and, using atomic counters, increment the total and erased counts shared across all threads.
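A sequential Python stand-in for that counting pass is below. The real version runs in parallel on the GPU, so the plain += here takes the place of an atomic add (InterlockedAdd in HLSL). The function name, the alpha cutoff and the default threshold are assumptions made for the example.

```python
GROUP_SIZE = 8  # threads per group along x and y

def count_pixels(mask, splatter_alpha, erase_threshold=0.5, alpha_cutoff=0.0):
    """Return (total visible pixels, erased visible pixels)."""
    height, width = len(mask), len(mask[0])
    groups_x = (width + GROUP_SIZE - 1) // GROUP_SIZE   # round up
    groups_y = (height + GROUP_SIZE - 1) // GROUP_SIZE
    total = 0
    erased = 0
    for gy in range(groups_y):
        for gx in range(groups_x):
            for ty in range(GROUP_SIZE):
                for tx in range(GROUP_SIZE):
                    x = gx * GROUP_SIZE + tx
                    y = gy * GROUP_SIZE + ty
                    if x >= width or y >= height:
                        continue  # thread falls outside the texture
                    if splatter_alpha[y][x] > alpha_cutoff:
                        total += 1        # atomic add on the GPU
                        if mask[y][x] > erase_threshold:
                            erased += 1   # atomic add on the GPU
    return total, erased

# 10x10 example: the splatter covers the left half, the top two rows are erased.
alpha = [[1.0 if x < 5 else 0.0 for x in range(10)] for y in range(10)]
erase_mask = [[1.0 if y < 2 else 0.0 for x in range(10)] for y in range(10)]
counts = count_pixels(erase_mask, alpha)
```

The bounds check matters whenever the texture size isn’t a multiple of the group size, since the last row and column of groups will overhang the texture.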
With our compute shader in place, everything seems to be working. But if we compare the number of erased pixels against the total after fully erasing the mask in-game we can see those numbers don’t always match up.
Let’s dive into what’s causing this.
Since we’re only giving the compute shader a mask and a texture, without the context of the rasterizer it has no idea which pixels of the splatter texture are even visible in the carved mesh. Our brush shader is executing only on valid fragments in the mesh, but the compute shader isn’t aware of the mesh shape at all and executes on every pixel in the output mask regardless of whether the fragment is visible in the mesh or not.
To solve this we’re going to need yet another mask RenderTexture.
This one will be really cheap, and we’ll use it to bake the shape of the mesh geometry into a texture the same resolution as the erasing masks. Then we can look up into this new “geometry mask” to determine if the fragment exists in the mesh from inside our compute shader.
Baking the geometry mask is also super easy! We’ll set it as the render target, and then draw our mesh renderer to it using a new shader that simply outputs 1.0 for each fragment it executes on, leaving the rest as 0.0 by default. The vertex shader will convert the UV values in the mesh to vertex positions in screen-space so the rasterizer aligns our geometry perfectly to the geometry mask and draws it as is with identical UVs in the texture too.
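To show what the bake amounts to, here’s a tiny software version in Python. It’s only a sketch under a few assumptions: clip space spans -1 to 1 on both axes (real engines may flip the vertical axis for render targets), a pixel is covered when its centre is inside or on the edge of a triangle, and all the names are made up.

```python
def uv_to_clip(u, v):
    # Map UVs in [0, 1] to positions in [-1, 1] so the mesh lands on the
    # render target exactly where its UVs say it should.
    return (u * 2.0 - 1.0, v * 2.0 - 1.0)

def edge(a, b, p):
    # Signed area test: which side of the line a->b the point p lies on.
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def bake_geometry_mask(triangles_uv, width, height):
    mask = [[0.0] * width for _ in range(height)]  # cleared to 0.0
    for y in range(height):
        for x in range(width):
            p = uv_to_clip((x + 0.5) / width, (y + 0.5) / height)
            for tri in triangles_uv:
                a, b, c = (uv_to_clip(*vertex) for vertex in tri)
                e0, e1, e2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
                inside = ((e0 >= 0 and e1 >= 0 and e2 >= 0)
                          or (e0 <= 0 and e1 <= 0 and e2 <= 0))
                if inside:
                    mask[y][x] = 1.0  # the fragment shader's constant output
                    break
    return mask

# A mesh covering only the lower-left quarter of UV space.
quarter = [((0, 0), (0.5, 0), (0.5, 0.5)), ((0, 0), (0.5, 0.5), (0, 0.5))]
geometry_mask = bake_geometry_mask(quarter, 4, 4)
```

The counting pass can then skip any pixel where this mask is 0.0, so both counts only ever include pixels the mesh actually covers.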
Now that we have our geometry mask (and I promise, this is the last mask texture we need), we can implement it into our compute shader…
…And we’re done! At least for the first splatter.
Painting the whole town
We’ve got one splatter working, and that’s awesome! But we can do more. While our system currently works with multiple splatters, it isn’t as efficient as it could be, and interfacing with a large number of splatters is quite inconvenient. So let’s do some optimising.
Command batching
First up, let’s make a system to manage all the splatters in the scene and batch the erase commands for each request into a single buffer, reducing memory allocation and streamlining the whole process.
Broad and narrow phase culling
Next we can also configure this system to only write to splatter objects that will actually be changed by a given command. This is easy: since we already know the outer radius of the brush and the bounds of each mesh, we can check if they are overlapping before calling the erase commands on each.
To go a step further, we can also check if the brush is overlapping with the perimeter polygon of the splatter before writing. This is much cheaper than you’d expect in 2D, but it is additional CPU time, so it isn’t always desirable.
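Both checks are short enough to sketch in Python. This version assumes the brush is treated as a circle, the mesh bounds as an axis-aligned box, and the perimeter as a simple polygon given as a list of points. The function names are hypothetical.

```python
def circle_overlaps_aabb(centre, radius, box_min, box_max):
    # Broad phase: clamp the centre to the box and measure the distance.
    cx = min(max(centre[0], box_min[0]), box_max[0])
    cy = min(max(centre[1], box_min[1]), box_max[1])
    return (centre[0] - cx) ** 2 + (centre[1] - cy) ** 2 <= radius ** 2

def point_in_polygon(p, poly):
    # Even-odd ray casting towards +x.
    inside = False
    for i in range(len(poly)):
        a, b = poly[i], poly[(i + 1) % len(poly)]
        if (a[1] > p[1]) != (b[1] > p[1]):
            x_cross = a[0] + (p[1] - a[1]) * (b[0] - a[0]) / (b[1] - a[1])
            if p[0] < x_cross:
                inside = not inside
    return inside

def dist_sq_to_segment(p, a, b):
    abx, aby = b[0] - a[0], b[1] - a[1]
    length_sq = abx * abx + aby * aby
    t = 0.0
    if length_sq > 0.0:
        t = ((p[0] - a[0]) * abx + (p[1] - a[1]) * aby) / length_sq
        t = min(max(t, 0.0), 1.0)
    dx, dy = p[0] - (a[0] + t * abx), p[1] - (a[1] + t * aby)
    return dx * dx + dy * dy

def circle_overlaps_polygon(centre, radius, poly):
    # Narrow phase: centre inside the perimeter, or close enough to an edge.
    if point_in_polygon(centre, poly):
        return True
    return any(dist_sq_to_segment(centre, poly[i], poly[(i + 1) % len(poly)])
               <= radius ** 2 for i in range(len(poly)))

# An L-shaped splatter: the brush at (3, 3) is inside the bounds but misses it.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
broad_hit = circle_overlaps_aabb((3, 3), 0.5, (0, 0), (4, 4))
narrow_hit = circle_overlaps_polygon((3, 3), 0.5, l_shape)
```

The L-shaped example is the case the narrow phase exists for: the brush passes the bounds check but never touches the splatter, so the erase command can be skipped entirely.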
Modified mask flags
Finally we’ll flag a splatter as being “modified” when erasing from it, and then only check those with this flag raised when we’re requesting an update from the compute shader, lowering the flag each time we do this.
We can add another flag that stays raised while the compute buffer result is still pending, which prevents repeat calls as well.
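The bookkeeping for the two flags might look something like this in Python. The class and method names are invented, and the actual erase and compute dispatch are left out so only the flag logic remains.

```python
class Splatter:
    def __init__(self, name):
        self.name = name
        self.modified = False       # raised when something erases from it
        self.count_pending = False  # raised while a count is in flight

class SplatterManager:
    def __init__(self):
        self.splatters = []

    def add(self, name):
        splatter = Splatter(name)
        self.splatters.append(splatter)
        return splatter

    def erase(self, splatter):
        # The real erase command would be issued here.
        splatter.modified = True

    def request_counts(self):
        """Dispatch a count only for splatters that changed and aren't waiting."""
        dispatched = []
        for splatter in self.splatters:
            if splatter.modified and not splatter.count_pending:
                splatter.modified = False
                splatter.count_pending = True
                dispatched.append(splatter.name)
        return dispatched

    def on_counts_ready(self, splatter):
        # Called when the result for this splatter has come back.
        splatter.count_pending = False

manager = SplatterManager()
a = manager.add("a")
b = manager.add("b")
manager.erase(a)
first = manager.request_counts()    # only "a" has changed
second = manager.request_counts()   # nothing new, and "a" is still pending
```

One nice property of keeping the flags separate: if a splatter is erased again while its count is still pending, the modified flag stays raised and the next request after the result arrives picks it up.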
Wrapping things up
This has been a really fun project to take on, and I hope the mechanic is unique enough to make the combat loop interesting to players. It’s been a great excuse for me to dive into compute shaders for the first time, and of course, continue working with the kind of algorithmic problem-solving that I love so much about programming video games.
Performance
I’m pretty happy with the performance too. The broad-phase checks really help things perform well with many objects at once, and keeping the noise calls in the brush rather than the shader helps tremendously as well.
Here are some numbers (measured in average frame times):
- No splatters, only rendering environment (0.91ms)
- 100 splatters off-screen (1.09ms)
- 100 splatters on-screen (1.22ms)
- 100 splatters erasing simultaneously (5.0ms)
Using these numbers we can deduce some rough averages for the performance cost of each stage of the process for a single splatter.
- Broad/narrow-phase check (0.0018ms)
- SDF shader draw call (0.0013ms)
- Erase command call (0.0378ms)
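For anyone who wants to check the arithmetic, each per-splatter figure is the difference between two consecutive benchmarks divided by the 100 splatters. The function and key names below are just labels for this example.

```python
def per_splatter_costs(baseline, off_screen, on_screen, erasing, count=100):
    """Split the frame-time differences (in ms) evenly across the splatters."""
    return {
        "culling_check": (off_screen - baseline) / count,
        "sdf_draw_call": (on_screen - off_screen) / count,
        "erase_command": (erasing - on_screen) / count,
    }

costs = per_splatter_costs(baseline=0.91, off_screen=1.09,
                           on_screen=1.22, erasing=5.0)
rounded = {name: round(ms, 4) for name, ms in costs.items()}
```

These are rough averages, since they assume each stage’s cost scales linearly with the number of splatters.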
Erasing from 100 objects all at the same time is pretty unrealistic, and clearly the 5ms execution time is still rather terrible. Thankfully this benchmark is far beyond the worst-case scenario, and I’m confident the performance will hold up during actual gameplay.
Anyway, I’ve rambled long enough by now. Thank you so much for reading. If you’d like to see more of my work you can check out my portfolio here.
Until next time!