An algorithm for interactions among hundreds of thousands of unique particles on the GPU, in GLES3 and WebGL2

A description of the algorithm and a walkthrough of a working example in the form of a tech-demo game



WebGL2 version of this demo: https://danilw.itch.io/flat-maze-web. For other links, see the article.









The article is divided into two parts: Part 1 describes the logic, and Part 2 describes how it is applied in the game demo.





Part 1



1. Key Features



The idea is real-time collision and physics among hundreds of thousands of particles, where each particle has a unique identifier ID.







Because each particle is indexed, any parameter of any particle can be controlled: for example its mass, its health (HP) or damage, acceleration, deceleration, which objects it collides with and how it reacts depending on the particle type or index, per-particle timers, and so on as needed.







All the logic is written in GLSL and is fully portable to any game engine and any OS that supports GLES3.







The maximum number of particles equals the number of pixels in the framebuffer (FBO).







A comfortable number of particles (one that leaves room for the particles to interact) is (Resolution.x*Resolution.y/2)/2, that is, every second pixel in x and every second pixel in y, which is why the description of the logic assumes this spacing.
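As a quick check of the arithmetic, here is a minimal Python sketch of the two particle counts. The function names are made up for illustration; the two resolutions are the ones mentioned elsewhere in this article.

```python
def max_particles(width: int, height: int) -> int:
    """Upper bound: one particle per framebuffer pixel."""
    return width * height


def comfortable_particles(width: int, height: int) -> int:
    """Every second pixel in x and every second pixel in y."""
    return (width * height // 2) // 2


print(max_particles(640, 360))           # 230400
print(comfortable_particles(1280, 720))  # 230400
```

Both cases give about 230 thousand particles: a fully packed 640x360 buffer, or a 1280x720 buffer with a particle in every second pixel.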







The first part of the article shows the minimal logic; the second part uses the game as an example of logic with a large number of interaction conditions.







2. Links and a brief description



I made three demos based on this logic:







1. A GLSL fragment shader on Shadertoy: https://www.shadertoy.com/view/tstSz7. All the logic is in the BufferC code. This code also makes it possible to draw hundreds of thousands of particles, each with its own UV, at arbitrary positions in a fragment shader, without using instanced particles.













2. A port of the logic to instanced particles (using Godot as the engine).









Links: web version, exe (Windows), sources of the particles_2D_self_collision project.







Short description: this is a poor demonstration of instanced particles. Because I allow zooming out until the whole map is visible, all 640x360 particles (230k) are always processed, which is a lot. The game described below does this properly, without the extra particles. (The video shows a particle index error; it has been fixed in the code.)







3. The game, described below in Part 2. Links: web version, exe (Windows), sources.







3. The algorithm of the logic



Briefly:







The logic is similar to falling-sand: each pixel stores the fractional part of the position (the offset within its pixel) and the current velocity.







For each pixel, the logic checks the pixels within radius 1 to see whether a particle's next position falls into this pixel (this causes a limitation, see the limitations below), and also checks the pixels within radius 2 for repulsion (collision).







The unique index is stored by packing the data as integers inside floats (int-float), which reduces the number of bits available for the position pos and the velocity vel.







The data is stored as follows (this causes a bug, see the limitations below):







 pixel.rgba
 r = [0xfffff-posx, 0xf-data]
 g = [0xfffff-posy, 0xf-data]
 b = [0xffff-velx, 0xff-data]
 a = [0xffff-vely, 0xff-data]
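The layout above puts 24 bits into each channel (20 + 4 or 16 + 8). Below is a minimal Python sketch of one way to pack and unpack such a channel. It rests on assumptions that the article does not confirm: the value sits in the high bits and the data in the low bits, and the 24-bit integer is stored as the numeric value of a 32-bit float (exact up to 2^24). The real shader may lay out or reinterpret the bits differently. All function names are hypothetical.

```python
import struct


def to_float32(x: float) -> float:
    """Round-trip through a 32-bit float, like a float texture channel."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def pack_channel(value: int, data: int, data_bits: int) -> float:
    """Assumed layout: value in the high bits, data in the low data_bits."""
    return to_float32(float((value << data_bits) | data))


def unpack_channel(channel: float, data_bits: int):
    """Inverse of pack_channel: returns (value, data)."""
    raw = int(channel)
    return raw >> data_bits, raw & ((1 << data_bits) - 1)


def quantize_pos(frac: float) -> int:
    """Map a fractional position in [0, 1) to a 20-bit integer (truncating)."""
    return min(int(frac * 0xFFFFF), 0xFFFFF)


r = pack_channel(quantize_pos(0.5), 0xA, 4)  # 20-bit position + 4-bit data
b = pack_channel(0x1234, 0x7F, 8)            # 16-bit velocity + 8-bit data
print(unpack_channel(r, 4), unpack_channel(b, 8))
```

The four data fields add up to 4 + 4 + 8 + 8 = 24 bits per pixel, which is the space available for the particle ID and other per-particle data in this layout.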
      
      











In the BufC code at https://www.shadertoy.com/view/tstSz7, the transition check is at line 115 and the collision checks are at line 139.







These are simple loops that read the neighboring values. The condition is: if a neighbor's next position equals the position of the current pixel, its data is moved into this pixel (this causes a limitation, see below), and the value of vel changes depending on the neighboring pixels, if any.







That is all of the particle logic.
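The transition check described above can be modeled on the CPU. The following Python sketch is a simplified stand-in for the shader, not the BufC code itself: each cell holds either None or a tuple (frac_x, frac_y, vel_x, vel_y, id), the tuple layout is an assumption made for illustration, and repulsion between particles is left out.

```python
import math


def step(grid):
    """One gather-style update: every cell looks at its radius-1 neighbors
    and adopts a particle whose next position falls inside this cell."""
    h, w = len(grid), len(grid[0])
    new = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    nx, ny = x + dx, y + dy
                    if not (0 <= nx < w and 0 <= ny < h):
                        continue
                    p = grid[ny][nx]
                    if p is None:
                        continue
                    fx, fy, vx, vy, pid = p
                    ax, ay = nx + fx + vx, ny + fy + vy  # next absolute position
                    if math.floor(ax) == x and math.floor(ay) == y:
                        # If several neighbors target this cell, the last one
                        # found overwrites the others (they are lost).
                        new[y][x] = (ax - x, ay - y, vx, vy, pid)
    return new


grid = [[None] * 4 for _ in range(2)]
grid[0][0] = (0.5, 0.2, 0.6, 0.0, 7)  # particle with id 7 moving right
grid = step(grid)
print(grid[0][1])  # the particle now lives in the next cell
```

Because each cell only reads its neighbors and writes itself, the same update maps directly onto a fragment shader that runs once per pixel.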







It is best to place particles 1 pixel apart. If they are closer than 1 pixel, they repel each other. For example, in the game's maze map the particles stay in place without moving because there is a gap of 1 pixel between them.







Next comes rendering. In the fragment-shader case, pixels within radius 1 are read in order to draw overlapping areas. In the instanced-particles case, the pixel is read at the address given by INSTANCE_ID, converted from a linear index into a two-dimensional coordinate.
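The conversion from a linear index to a 2D address can be sketched as follows. This Python version assumes row-major order and sampling at pixel centers; both are assumptions, and the function names are hypothetical.

```python
def instance_to_pixel(instance_id: int, width: int):
    """Linear index -> (x, y) pixel coordinate, assuming row-major order."""
    return instance_id % width, instance_id // width


def instance_to_uv(instance_id: int, width: int, height: int):
    """Pixel-center UV for sampling the logic framebuffer texture."""
    x, y = instance_to_pixel(instance_id, width)
    return (x + 0.5) / width, (y + 0.5) / height


print(instance_to_pixel(641, 640))   # (1, 1)
print(instance_to_uv(0, 640, 360))
```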







4. Limitations of the logic: bugs / features, and ANGLE bugs



  1. The pixel size, BALL_SIZE in the code, must stay within limits for the calculation to work: greater than sqrt(2)/2 and less than 1. The closer it is to 1, the less room there is to move inside the pixel (the cell itself); the smaller it is, the more room. This size is needed so that pixels do not fall into each other. A value below 1 can be used when you have small objects; it creates the illusion of objects smaller than 1 pixel (computed).
  2. The speed cannot exceed 1 pixel per frame, otherwise pixels disappear. A speed above 1 per frame is possible if you create several framebuffers (FBO / viewport) and process several logic steps per frame; the maximum speed then grows with the number of additional FBOs. This is what I did in the fruit demo and in the Shadertoy link (bufC copied to bufD).
  3. Pressure limitation (gravity, or any other force / normal map). If several neighboring pixels want to move into the same pixel (see the picture above), only one is kept and the others disappear. This is easy to see in the Shadertoy demo: set the mouse to Force, change the value of MOUSE_F in Common to 10, and push the particles into a corner of the screen; they will disappear into each other. The same happens with the gravity value maxG in Common.
  4. A bug in ANGLE. For this logic to work with GPU (instanced) particles, it is best (cheaper, faster) to compute the position and all other particle parameters needed for display in the instance shader. But ANGLE does not allow more than one FBO texture per shader, so part of the logic has to be moved to the vertex shader, with the index number passed in from the instance shader. This is what I did in both demos with GPU particles.
  5. A serious bug in both demos (but not in the game): the position value is lost if it is not a multiple of 1/0xfffff. A test for the bug is here: https://www.shadertoy.com/view/WdtSWS

    Strictly speaking this is not a bug, it is expected behavior; for simplicity I call it a bug that is part of this algorithm.
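Limitation 2 in the list above (speed below 1 pixel per step, with several logic steps per frame as the workaround) can be seen in a one-dimensional toy model. This Python sketch is a simplified stand-in for the shader: cells hold None or a (frac, vel) pair, and all names are hypothetical.

```python
import math


def step_1d(cells):
    """Radius-1 gather: a cell adopts a neighbor whose next position is in it."""
    new = [None] * len(cells)
    for x in range(len(cells)):
        for dx in (-1, 0, 1):
            nx = x + dx
            if 0 <= nx < len(cells) and cells[nx] is not None:
                frac, vel = cells[nx]
                nxt = nx + frac + vel
                if math.floor(nxt) == x:
                    new[x] = (nxt - x, vel)
    return new


def simulate(cells, substeps):
    """Run several logic steps per frame, one per extra framebuffer pass."""
    for _ in range(substeps):
        cells = step_1d(cells)
    return cells


fast = [None] * 6
fast[1] = (0.5, 1.6)      # 1.6 pixels per step: nobody within radius 1 sees it
print(simulate(fast, 1))  # the particle is gone

slow = [None] * 6
slow[1] = (0.5, 0.8)      # 0.8 pixels per step, two steps per frame
print(simulate(slow, 2))  # the particle arrives in cell 3
```

The particle that moves 1.6 pixels in a single step lands two cells away, outside the radius that any cell inspects, so no cell picks it up. Splitting the same motion into two steps of 0.8 keeps it alive.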


Fixing the bug:

Do not convert the position value to int-float. This removes 0xff (8 bits) of data space, but 0xffff (16 bits) remains available for data, which may be enough for many things.

In the game demo I did exactly that: I use only 0xffff for the data, which stores the particle type, animation timer and health, and there is still free space.
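To illustrate why a position is lost when it is not a multiple of 1/0xfffff, here is a small Python sketch. It assumes the packed position is truncated to 20 bits after every update; the exact rounding used in the shader is not stated in the article, so treat this as an illustration only.

```python
SCALE = 0xFFFFF  # 20-bit position resolution


def quantize(pos: float) -> float:
    """Truncate to a multiple of 1/0xfffff (assumed rounding mode)."""
    return int(pos * SCALE) / SCALE


def advance(pos: float, vel: float, steps: int, quantized: bool) -> float:
    """Add vel to pos every step, optionally re-quantizing each time."""
    for _ in range(steps):
        pos = pos + vel
        if quantized:
            pos = quantize(pos)
    return pos


slow = 5e-7  # smaller than 1/0xfffff, which is about 9.5e-7
print(advance(0.0, slow, 1000, quantized=True))   # 0.0, the motion is lost
print(advance(0.0, slow, 1000, quantized=False))  # about 5e-4
```

With the position kept as a plain float, as in the game demo, slow movement accumulates normally.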







5. Access to index data



Each instanced particle has its own INSTANCE_ID, which it uses to read a pixel from the framebuffer texture that holds the particle logic (bufC in the Shadertoy example). If there is a particle there, we unpack its ID (see the data storage above), and with this ID we read the texture that holds the per-particle data (bufB in the Shadertoy example).







In the Shadertoy example, bufB stores only a color for each particle, but it could obviously hold any data, such as the mass, acceleration and deceleration mentioned earlier, as well as any logical actions (for example, you can teleport any particle to any position if the code implements the corresponding action). You can also control the movement of any particle or group from the keyboard.







In other words, you can do anything with each particle as if it were an ordinary particle in an array on the CPU. Access is two-way: a particle can change its own state on the GPU, and the CPU can change a particle's state by index (using logical actions and the data-texture buffer).
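The two-step lookup (logic buffer, then data buffer by ID) can be modeled on the CPU like this. In this Python sketch the logic framebuffer is reduced to a grid of particle IDs and the data texture to a list of dicts; both are simplifications for illustration, and the names are hypothetical.

```python
def read_particle_data(logic_fbo, data_table, x, y):
    """Step 1: read the particle ID stored at pixel (x, y), if any.
    Step 2: use that ID to index the per-particle data table."""
    particle_id = logic_fbo[y][x]
    if particle_id is None:
        return None
    return data_table[particle_id]


logic_fbo = [[None, 2],
             [0, None]]
data_table = [{"color": "red"}, {"color": "green"}, {"color": "blue"}]

print(read_particle_data(logic_fbo, data_table, 1, 0))  # data of particle 2

# A CPU-side change by index is visible the next time the particle is read.
data_table[2]["color"] = "white"
print(read_particle_data(logic_fbo, data_table, 1, 0))
```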







Part 2



1. Features of this logic used in the game, and fast rendering of a million pixel particles



The framebuffer (FBO / viewport) for the particles is 1280x720, and particles are placed in every second pixel, which gives 230 thousand active particles (active elements in the maze).

There are never more than 12 thousand GPU-instanced particles on screen.







Logic uses:









Compared to the fruit demo, which has overhead, this game uses only 12 thousand GPU-instanced particles.







It looks like this:













Their number depends on the current zoom level of the map, and the zoom is limited to a certain value, so only the particles visible on screen are processed.

The screen scrolls with the player. The logic for computing the offsets is somewhat complex and very situational; I doubt it will be useful in another project.







2. Implementation, a few comments on the code.



All game code is on the GPU.







The logic for computing the particle offsets on screen when zooming is in the vertex function in the file /shaders/scene2/particle_logic2.shader. This is the particle shader file (vertex and fragment), not the instanced shader; the instanced shader does nothing except pass on its index, because of the bug described above.







The particle types and all the particle interaction logic are in one file, the particle framebuffer shader particles_fbo_logic.shader:







 // 1-2 ghost
 // 3-zombi
 // 4-18 blocks
 // +20 is on fire
 // 40 is bullet(right) 41 left 42 top 43 down
      
      





Data storage per pixel: [pos.x, pos.y, [0xffff-vel.x, 0xff-data1], [0xffff-vel.y, 0xff-data2]]

data1 is the type, data2 is the HP or timer.
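The type table above can be read as a small decoder. This Python sketch assumes that "+20 is on fire" applies to the block types 4-18, giving 24-38 for burning blocks; that reading is an assumption, and any value outside the listed ranges is reported as unknown. The function name is hypothetical.

```python
BULLETS = {40: "bullet right", 41: "bullet left", 42: "bullet top", 43: "bullet down"}


def describe_type(t: int) -> str:
    """Turn the data1 type value into a readable name."""
    if t in BULLETS:
        return BULLETS[t]
    burning = False
    if 24 <= t <= 38:  # assumption: block types 4-18 shifted by +20
        burning, t = True, t - 20
    if t in (1, 2):
        base = "ghost"
    elif t == 3:
        base = "zombie"
    elif 4 <= t <= 18:
        base = "block"
    else:
        return "unknown"
    return base + (" (on fire)" if burning else "")


for value in (2, 3, 10, 30, 41):
    print(value, describe_type(value))
```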







The timer counts frames in each particle. Its maximum value is 255, which is more than I need: I use at most 1-16 (0xf), so the other 0xf remains unused and could store, for example, a real HP value; I do not use it. (That is, I do reserve 0xff for the timer, but my animations have fewer than 16 frames, so 0xf would be enough; I just did not need the additional data.)

In fact, the full 0xff is used only by the timer of burning trees, which turn into zombies after 255 frames. The timer logic is partly in type_hp_logic in the particle framebuffer shader (link above).







An example of a two-way collision: a fireball goes out on its first hit, and the object it hit also performs its own action.



File shaders/scene2/particles_fbo_logic.shader, line 438:







 if (((real_index == 40) || (real_index == 41) || (real_index == 42) || (real_index == 43)) && (type_hp.y > 22)) {
     int h_id = get_id(fragCoord + vec2(float(x), float(y)));
     ivec2 htype_hp = unpack_type_hp(h_id);
     int hreal_index = htype_hp.x;
     if ((hreal_index != 40) && (hreal_index != 41) && (hreal_index != 42) && (hreal_index != 43))
         type_hp.y = 22;
 } else {
     if (!need_upd) {
         int h_id = get_id(fragCoord + vec2(float(x), float(y)));
         ivec2 htype_hp = unpack_type_hp(h_id);
         int hreal_index = htype_hp.x;
         if (((hreal_index == 40) || (hreal_index == 41) || (hreal_index == 42) || (hreal_index == 43)) && (htype_hp.y > 22)) {
             need_upd = true;
         }
     }
 }
      
      





real_index is the type; the types are listed above, and 40-43 are fireballs.

type_hp.y > 22 checks the timer value: if it is greater than 22, the fireball has not hit anything yet.

h_id = get_id(... reads the type and HP (timer) of the neighboring particle.

hreal_index != 40... skips neighbors of the ignored type (other fireballs).

type_hp.y = 22 sets the timer to 22, which marks this fireball as having collided with an object.

else { if (!need_upd) handles every other particle; the need_upd variable prevents repeated collisions, because this code runs in a loop and one fireball hit is enough.

h_id = get_id(... if there has been no collision yet, reads the type and timer of the neighbor.

hreal_index == 40...htype_hp.y > 22 checks that the neighbor is a fireball that has not gone out.

need_upd = true flags that the type must be updated because the particle was hit by a fireball.







Further on, at line 481:

if((need_upd)&&(real_index<24)){

Types below 24 (real_index < 24) are the non-burning zombies, ghosts and trees. Inside this condition the type is updated depending on the current type.







In this way, almost any interaction between objects can be implemented.
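The same two-way collision rule can be expressed as a small CPU-side function. This Python sketch mirrors the structure of the GLSL fragment (the loop over neighbors is made explicit) but is not the game's code; the function name and the list-of-neighbors interface are assumptions made for illustration.

```python
FIREBALLS = {40, 41, 42, 43}
HIT_TIMER = 22  # timer value that marks a fireball as already spent


def collide(type_hp, neighbors):
    """type_hp: (type, timer) of this particle.
    neighbors: iterable of (type, timer) pairs found around it.
    Returns the updated (type, timer) and the need_upd flag."""
    real_index, timer = type_hp
    need_upd = False
    for n_type, n_timer in neighbors:
        if real_index in FIREBALLS and timer > HIT_TIMER:
            # A live fireball goes out on the first non-fireball it touches.
            if n_type not in FIREBALLS:
                timer = HIT_TIMER
        elif not need_upd:
            # Anything else remembers that a live fireball has hit it.
            if n_type in FIREBALLS and n_timer > HIT_TIMER:
                need_upd = True
    return (real_index, timer), need_upd


print(collide((40, 30), [(5, 0)]))   # the fireball side: timer drops to 22
print(collide((5, 0), [(40, 30)]))   # the block side: need_upd becomes True
```

Both particles run the same function on their own pixel, so each side of the collision reacts without any shared state.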







Interaction with the player:



File shaders/scene2/logic.shader, line 143, function player_collision









This logic reads the pixels around the player in a 4x4 pixel area, takes the position of each particle found and compares it with the player's position. If a particle is found, its type is checked next, and if it is a monster the player loses HP.
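A rough CPU-side model of this check is sketched below in Python. The hit distance, the damage value and the choice of ghosts and zombies (types 1-3) as monsters are assumptions for illustration; the real player_collision function may differ in all of these.

```python
import math

MONSTER_TYPES = {1, 2, 3}  # assumption: ghosts and zombies count as monsters
HIT_DISTANCE = 1.0         # hypothetical hit radius, in pixels
DAMAGE = 1                 # hypothetical HP lost per touching monster


def player_collision(player_pos, player_hp, particles):
    """particles maps a cell (x, y) to (pos_x, pos_y, type).
    Scans a 4x4 cell window around the player and subtracts HP
    for every monster that is close enough."""
    px, py = player_pos
    cx, cy = int(px), int(py)
    for dy in range(-2, 2):        # -2, -1, 0, 1: four rows
        for dx in range(-2, 2):    # -2, -1, 0, 1: four columns
            p = particles.get((cx + dx, cy + dy))
            if p is None:
                continue
            x, y, ptype = p
            if math.hypot(x - px, y - py) <= HIT_DISTANCE and ptype in MONSTER_TYPES:
                player_hp -= DAMAGE
    return player_hp


particles = {
    (10, 10): (10.5, 10.5, 3),  # zombie right next to the player
    (11, 10): (11.5, 10.5, 5),  # block: found, but not a monster
    (8, 8): (8.5, 8.5, 3),      # zombie inside the window but too far away
}
print(player_collision((10.4, 10.6), 5, particles))  # 4
```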







This works somewhat inaccurately and I did not want to fix it; the function can be made more precise.







Pushing particles away from the player, and the repulsion effect during an attack:







A framebuffer (viewport) is used to write the normals of the current actions, and the particles (particles_fbo_logic.shader) sample this normal texture at their own position and apply the value to their velocity and position. All the code for this logic is literally a couple of lines, in the file force_collision.shader.
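The idea of sampling a force (normal) texture at the particle's position can be sketched as follows. In this Python model the texture is a 2D list of (fx, fy) pairs, and the speed clamp is an assumption added here to respect the rule that speed must stay below 1 pixel per step; the names are hypothetical and this is not the code from force_collision.shader.

```python
import math


def apply_force(vel, force_tex, x, y, max_speed=0.99):
    """Add the force stored at texel (x, y) to the velocity,
    then clamp the speed so it stays below 1 pixel per step."""
    fx, fy = force_tex[y][x]
    vx, vy = vel[0] + fx, vel[1] + fy
    speed = math.hypot(vx, vy)
    if speed > max_speed:
        vx, vy = vx / speed * max_speed, vy / speed * max_speed
    return vx, vy


# A 2x2 force texture: only texel (1, 0) pushes to the right.
force_tex = [[(0.0, 0.0), (0.5, 0.0)],
             [(0.0, 0.0), (0.0, 0.0)]]

print(apply_force((0.1, 0.0), force_tex, 1, 0))  # pushed to the right
print(apply_force((0.9, 0.0), force_tex, 1, 0))  # clamped below 1
```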







Clicking the left mouse button fires fireballs. Their spawning does not look very natural; I did not fix this and left it as is.







You could either make a proper zone (shape) for spawning particles, offset relative to the player (this is not done).

Or you could make the fireball a separate object, like the player, and draw its normal into the buffer to push particles away from the fireball, by analogy with the player.

Those who need this will figure it out for themselves.







3. Links to the graphics used (from opengameart) and the shadow shader



I was given a link to an article on cyberleninka.ru that describes the algorithm I used; it may give a more detailed and correct description than this article of mine.







The shadow shader works very simply and is based on this shader: https://www.shadertoy.com/view/XsK3RR (my code is modified).

The shader builds a 1D radial lightmap, and the shading is applied in the floor drawing code in shaders/scene2/mainImage.shader.
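For readers unfamiliar with the idea, here is a generic Python sketch of a 1D radial lightmap: for each angle around the light, a ray is marched until it hits an occluder, and the distance is stored; a point is lit if it is no farther from the light than the stored distance for its angle. This illustrates the general technique only. The referenced Shadertoy shader and the modified version in the game may work differently, and all names and parameters are hypothetical.

```python
import math


def build_lightmap(occluders, light, n_angles=64, max_dist=32.0, step=0.25):
    """occluders: set of integer (x, y) cells that block light.
    Returns, for each angle, the distance to the first occluder."""
    lx, ly = light
    lightmap = []
    for i in range(n_angles):
        a = 2.0 * math.pi * i / n_angles
        dx, dy = math.cos(a), math.sin(a)
        d = 0.0
        while d < max_dist:
            cell = (math.floor(lx + dx * d), math.floor(ly + dy * d))
            if cell in occluders:
                break
            d += step
        lightmap.append(d)
    return lightmap


def is_lit(point, light, lightmap):
    """A point is lit if it is closer to the light than the first occluder
    stored for its direction."""
    dx, dy = point[0] - light[0], point[1] - light[1]
    a = math.atan2(dy, dx) % (2.0 * math.pi)
    i = int(round(a / (2.0 * math.pi) * len(lightmap))) % len(lightmap)
    return math.hypot(dx, dy) <= lightmap[i]


light = (0.5, 0.5)
lightmap = build_lightmap({(3, 0)}, light)
print(is_lit((2.0, 0.5), light, lightmap))  # in front of the wall
print(is_lit((5.0, 0.5), light, lightmap))  # behind the wall
```

The lightmap is one-dimensional because it is indexed only by angle, which is why it can be stored in a single row of a texture.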







Links to the graphics used; all graphics in the game come from https://opengameart.org

fireball https://opengameart.org/content/animated-traps-and-obstacles

character https://opengameart.org/content/legend-of-faune

trees and blocks https://opengameart.org/content/lolly-set-01

(plus a couple more images from opengameart)







The graphics in the menu were produced with the 2D_GI shader, a utility for creating such menus:









If you have read this far, well done :)

If you have questions, ask; I can extend the description on request.







