This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.
886
Programming Help / Re: How to connect space ship object to bullet object for that ship with instance id
« on: February 10, 2014, 11:49:50 am »
I think using instance_change just for that is overkill. What I would do is change the script to take two sets of coordinates instead of objects. Like so:
Code: [Select]
view_dual script:
---------------------------------------
/// view_dual(x1,y1,x2,y2,padding)
var x1, x2, y1, y2, vw, vh, vb, vscale;
x1 = argument0; y1 = argument1;
x2 = argument2; y2 = argument3;
vb = argument4; vw = view_wport; vh = view_hport;
vscale = max(1, abs(x2 - x1) / (vw - vb * 2), abs(y2 - y1) / (vh - vb * 2));
view_wview = vscale * vw;
view_hview = vscale * vh;
view_xview = (x1 + x2 - view_wview) / 2;
view_yview = (y1 + y2 - view_hview) / 2;
And use a controller object (I usually have at least one obj_controller in every project; it does most things) and give it 4 variables that are updated in the step event:
Code: [Select]
if (instance_exists(p1_ship)){ x1 = p1_ship.x; y1 = p1_ship.y; } else { x1 = x2; y1 = y2; }
if (instance_exists(p2_ship)){ x2 = p2_ship.x; y2 = p2_ship.y; } else { x2 = x1; y2 = y1; }
view_dual(x1,y1,x2,y2,100);
In the create event just init x1, y1, x2 and y2 with some default values. What this will do is show both ships in the view if both exist, or just follow one if the other has died. If you still want the dead one to be followed (for example until the explosion is done), then create alarms and use them in the "else" branches instead. You could also just keep showing the place where the ship exploded (by removing the "else" statements entirely).
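For completeness, the create event could be a sketch as simple as this (the default values here are arbitrary placeholders, not from the original post - anything inside the room works):
Code: [Select]
// Default view targets until both ships have reported their positions
x1 = 0; y1 = 0;
x2 = room_width; y2 = room_height;
With those defaults the view covers the whole room on the first step, before the step event overwrites them with the actual ship coordinates.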
887
Programming Help / Re: get_string without a dialog
« on: February 10, 2014, 09:24:50 am »
I also made a textbox for GM a while ago and did a conversion to ENIGMA, but there were still some bugs I didn't fix, so it's not entirely functional or pretty looking. Maybe you guys can check the code and fix that.
https://www.dropbox.com/s/2jv1k4emsh2zxaq/textbox_v5.gmk
It also has things like selecting parts of the text with mouse, inserting, deleting, copy/paste and so on.
I would want to fix it, but I don't have the time right now.
Another example I made just for ENIGMA is here:
https://www.dropbox.com/s/njetsb4ocl54k27/cool_effect.gmk
It's a lot prettier, but a lot simpler (you can only type, nothing more, not even deletion works there, but it's not hard to make).
http://i.imgur.com/4zUYNXG.png
How can you resize images on this board? Classic BBCode ain't working.
888
Programming Help / Re: Making games in Full HD resolution 16:9 possible ?
« on: February 10, 2014, 09:13:02 am »
1. Views need to have the same aspect ratio as the resolution. So if you change 16:9 to 4:3, then resize the view to have the same aspect ratio. This will mean that it won't be squished or otherwise distorted.
2. If you want to have the same amount of content on the screen, then you need to change only the port on screen and not the view size. Otherwise 320x240 will have a 27 times smaller view than 1920x1080. This would mean that the whole thing is zoomed in, and a player with a larger resolution would have an advantage. In most games (especially 3D) this is not the case - in games like League of Legends or StarCraft 2 you will see the same thing no matter the resolution. In 2D games it's usually better to let players see more at larger resolutions so the sprites don't look stretched (basically because 3D zooms well, while 2D doesn't).
3. HUDs usually do scale though. In the previously mentioned games the HUD is usually smaller if you have a larger resolution, although the size is usually customizable (and you should give the player that possibility as well). So all HUD objects need to be positioned using the view size (like view_wview[0]-100, so an element will always be at the right no matter the view size) and scaled appropriately. This has to be done with custom code, as the Draw GUI event only disregards view position and scale, not size. I think the main reason why HUDs get smaller with increasing resolution is the same one I mentioned in point 2 - sprites don't zoom well.
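As a rough sketch of that positioning idea (not from the original post - spr_hud_icon and the 100/16 margins are made-up placeholders), a HUD element anchored to the top-right corner of view 0 could be drawn like this:
Code: [Select]
// Always 100 pixels from the right edge of view 0, regardless of view size
draw_sprite(spr_hud_icon, 0, view_xview[0] + view_wview[0] - 100, view_yview[0] + 16);
Because the position is computed from view_xview/view_wview every draw, the element stays put when the view is resized or rescaled.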
On Windows display_test_all() should work in ENIGMA.
889
Programming Help / Re: How to connect space ship object to bullet object for that ship with instance id
« on: February 09, 2014, 06:00:01 pm »
Quote
That was awesome and helped me learn different techniques.
No problem.
Quote
making obj_player a parent object and making all the different ships child objects of that parent.
I personally don't use parenting at all and would probably just use one object, but I guess if you plan to have many ships (and ships with many different abilities), then it makes sense to make them individual objects.
Quote
I will keep one bullet and one explosion object.
The reason I suggest that is that they usually don't change that much. An explosion object, for example, can have a different size (scale), color or sprite - no reason to create several objects just for that. Just create a script somewhat like this:
Code: [Select]
//scr_create_explosion(x,y,size,color,sprite)
var inst;
inst = instance_create(argument0,argument1,obj_explosion);
inst.size = argument2;
inst.image_blend = argument3;
inst.sprite_index = argument4;
Then just call the script. You can make the explosion object as complicated or as simple as you want, but explosions usually still have most things in common. The same goes for many other objects - bullets (which usually differ in appearance, speed, damage and other common attributes) or even ships. In many games ships differ in appearance, gun reload time, the type of bullets they shoot and so on, so I just create a bunch of variables that control all of that.
I once made a tower defense game (https://www.dropbox.com/s/ie1ki5l1bb6efv4/TD_08.png) and in it all turrets and all enemies were one object each. They even had many different abilities - for example, there was a turret that could freeze and slow enemies, and the way an enemy died was based on those effects (frozen enemies shattered). I would basically do it the same way now. Another game I made with very few objects is this (https://www.dropbox.com/s/q0pqp3odqaubqnd/F%26I_03.png), so I'm not a big fan of large numbers of objects.
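Calling such a script is then a one-liner; for example (the scale, color and sprite arguments here are just illustrative, and spr_explosion_big is a hypothetical resource name):
Code: [Select]
// Double-size orange explosion at the caller's position
scr_create_explosion(x, y, 2, c_orange, spr_explosion_big);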
890
Programming Help / Re: Help with Maths
« on: February 07, 2014, 02:02:23 pm »
Check what bitrates Java supports. If you want to support only those that GM:S supports, then you will need to manually remove the unnecessary ones.
891
Programming Help / Re: Help with Maths
« on: February 07, 2014, 01:28:11 pm »
Quote
160-176 is supposed to be 160-192
Why? The numbers you wrote manually in the OP are actually incorrect. 8^3 = 512, so in the range 64-512 the increment is always 16 - just like the for() loop returns.
Quote
Edit: don't I actually need an inverse log?
No.
892
Programming Help / Re: Help with Maths
« on: February 07, 2014, 01:03:26 pm »
Yeah I made a typo in my post. I wanted to say "But log(0)=-infinity".
893
Programming Help / Re: Help with Maths
« on: February 07, 2014, 12:56:32 pm »
What you need is a base 8 logarithm of i. For example, log8(8)=1, log8(64)=2 and so on. Sadly, programming languages usually don't provide a general logarithmic function like that. But logx(y) is the same as log(y)/log(x). So you can write it like so:
Code: [Select]
for (int i = 0; i <= 512; i += 8 * Math.floor(Math.log(i)/Math.log(8))) {
bitOptions.add(Integer.toString(i));
}
But log(0)=-infinity, so you need to make an exception for that. Like:
Code: [Select]
for (int i = 0; i <= 512; i += 8 * (i > 0 ? Math.floor(Math.log(i)/Math.log(8)) : 1)) {
bitOptions.add(Integer.toString(i));
}
Although I don't know if ternary expressions work like that in Java. Also, precomputing log(8) and such is a smart idea.
894
General ENIGMA / Re: The new GL3
« on: February 07, 2014, 12:23:34 pm »
Quote
The correct term is short-circuit evaluation.
That is something different. In that sentence I really meant just the "if statement".
Quote
I want to know what you meant by debugging it
I use CodeXL by AMD. It's basically the gDEbugger I used previously, but this one is a lot newer and is still maintained. It works on all cards, not just AMD (OpenCL debugging works only with AMD in that tool). And that tool showed that many one-vertex buffers have been created.
895
Programming Help / Re: How to connect space ship object to bullet object for that ship with instance id
« on: February 07, 2014, 11:26:44 am »
You seem to be making this unnecessarily complicated.
The problem here is that you check a specific bullet in the step event. You don't need to do that, as b1p2 and b1p1 are only used to set the direction and parent of bullets. Normally b1p2 and b1p1 would be local to the script and you could forget about them.
Put this in obj_bullet_parent collision event with obj_switch_ship:
Code: [Select]
if (other.id != p1_ship){ //Check if we hit the enemy and not ourselves
    instance_create(x,y,obj_small_explosion); //Create explosion at our position - you cannot change local variables
                                              //immediately after instance_change(), and I don't see why you would even want to use that function.
    other.hp -= ap; //Reduce player hp
    if (other.hp < 0){ //This should probably be done in the obj_switch_ship step event
        with (other) instance_change(obj_p1_explosion,true);
    }
    instance_destroy(); //Actually destroy the bullet
}
You also shouldn't have different explosion objects for different players - you shouldn't even have different explosion objects at all. Just create one which allows setting the size and such.
I made you a simple example of how I would do this: https://dl.dropboxusercontent.com/u/21117924/Enigma_Examples/ShipAndBulletParentingExample/ships_example.gmk
You move one ship with arrow keys and fire with space. You move the other ship with WAD and fire with enter.
896
General ENIGMA / Re: The new GL3
« on: February 07, 2014, 11:00:36 am »
Quote
It's on Nvidia's site somewhere else to, I just assume the same holds true for GLSL.
Quote
Yeah I get that, Harri,
It was Rusky, and I think he got it right. With "program" it means the game, not the shader program. So "runtime" means compiling the shaders when you run the game, and that is what we do now (compiling at runtime is basically the only way to do this cross-hardware). Cg allows some "profile" thing, but usually it's still done on startup. I don't think it meant recompiling during rendering.
Quote
I know plenty of games that turn on and turn off lighting quite a bit
And there one of many proposed solutions will work - even the "if" branching I already do, because here lights are not turned on or off per pixel, but per render call.
Quote
not by a conditional but by null values.
What do you mean exactly? Right now I just have a uniform bool and check that. As it doesn't change during the execution of the shader, I think newer cards optimize the whole thing.
Quote
NOTE: I want to reiterate this point to you.
I will maybe check that later. But I want to know why your game tries to render batches of 1 vertex. What drawing function creates a batch of 1 vertex? The only thing coming to my mind is "draw_primitive_begin(pr_points); draw_vertex(x,y); draw_primitive_end();".
897
Developing ENIGMA / Re: Window Alpha and Message Box
« on: February 07, 2014, 10:02:30 am »
Quote
I am not understanding, where and when do you plan to combine tx and ty? Because if you do it before sending the vertex data to the GPU, it will lose precision during that, but if you wait until it's in the shader to combine them, then no it won't lose precision at that stage. If you do plan to do it CPU side then combine tx and ty using half floats in the addTexture(tx,ty) call before pushing it into the vector, use a union too like color does because only GLES 3.0 offers half floats.
I meant combining them on the CPU side before sending, just like you do with colors. I just chose my wording incorrectly - of course it would lose precision data-type wise, but I wanted to say that it probably won't lose precision because the data isn't that precise. For example, "float color = 255.0f; unsigned char color2 = (unsigned char)color;" will not make color2 lose precision, even though char is a lot less precise than a float.
Quote
removing 4 float color from the CPU side, did give me a noticeable frame rate difference, 220->250FPS
That is because you packed it, made it 4x smaller, and reduced memory bandwidth. I don't oppose that - I even suggest trying the same with texture coordinates. That is why I said it does the conversion on the GPU: the data on the bus is still unsigned chars, but when it gets to the GPU it is converted to float and normalized to 0-1.
Quote
But you won't stop me from macro'ing every color function.
Have fun.
Code: [Select]
SetLayeredWindowAttributes(enigma::hWndParent, 0, (char)alpha*255, LWA_ALPHA);
You should probably cast that to unsigned char (and note that the cast binds tighter than the multiplication, so it should be (unsigned char)(alpha*255)).
Quote
Yeah that makes it less optimal.
But that is a drop in the ocean in the grand scheme of things, especially in a function like window_set_alpha which wouldn't even be run more than once a step. But you can macro what you want. I would probably even use the 0-255 alpha macro myself, but it's still about compatibility with GM. And while I don't care one bit about that (because I don't use GM at all), others do.
898
General ENIGMA / Re: The new GL3
« on: February 07, 2014, 09:46:38 am »
Quote
If you guys are now calling d3d_start/end()
1. I was talking about deprecated GL functions inside d3d_start()/_end(). 2. They barely do anything, and all 3D functions can be used without d3d_start(). So theoretically they could be removed (and just left as empty functions for old games). But I will just replace the deprecated stuff. Sorry for the confusion.
Quote
I've got some bad news for you Harri, we actually have to undo all the shit I did with textures again. Studio now returns a pointer with _get_texture() functions, see my latest comments on this GitHub issue.
Let's cross that bridge when we come to it. It does seem that "pointer" is something different. GM:S still has only one data type, doesn't it? So it seems their "pointer" is going to be an int inside some map which points to a texture. We can either do the same or just return the GLuint.
Quote
Nvidia also suggests and encourages dynamically recompiling shaders at runtime. We wouldn't necessarily need an uber shader, we'd need an uber shader that is broken down into different functions. You'd only need to rebuild the shader for instance when a state change occurs like d3d_set_lighting().
Can you give a source on that I could read? Because everywhere I look it is suggested not to do that. Compilation is not really a fast thing, and there is barely any limit on the number of compiled programs, so I don't see why it should be done at runtime. What we should technically do, though, is not keep the compiled shader objects around. We currently do "glCompileShader(vshader->shader); glCompileShader(fshader->shader);" but never release them. We need to keep the sources around (the shader struct), but after linking, the shader objects should usually be destroyed.
Quote
You'd basically just structure the program like this. And you only copy the functions to the shader code that are used, for instance in the following code when the shader is rebuilt in d3d_set_lighting(true) you would copy the apply lighting call and method into the string and then rebuild the shader, when the user turns it off youd rebuild the shader without it. All you'd need is a basic generate_default_shader(lighting, color);
This might as well be done with compiler directives like "#ifdef". It might work for the d3d_light functions because you usually call them rarely, but for texturing or color this would mean recompiling the code up to 100 times a second in the Mario example. That would probably decimate the framerate. So I would either make it work with directives and compile each variant once when needed (like having shader_with_light = -1 and then, when d3d_light_enabled is called, doing shader_with_light = ... and compiling the shader), or just have an ubershader that uses if's. The problem with compiling many shaders is that I would still need a way to access them for all combinations, so I would end up with a list like:
Code: [Select]
shader_with_light = -1;
shader_with_texture = -1;
shader_with_color = -1;
shader_with_texture_and_color = -1;
shader_with_light_and_texture = -1;
and so on....
Quote
This is why I created advanced GLSL functions.
What? You created the new GLSL standard? Also, separable programs are only in GL4.
Quote
I have to change that, d3d_model_block should batch like it does in Direct3D9.
It does batch normally. The problem is that it provides texture coordinates, but GM (and ENIGMA) allows passing -1 as the texture to draw using the currently bound color and alpha. So that is what I did. It works the same in GL1.1 and I suppose it worked the same in GM.
Quote
Maybe in the future Enigma could add deferred rendering and per-pixel lighting. Good work on the improvements so far!
The idea is that this could be added by the user. We might provide shaders for that as well, because Robert did plan some kind of "shader library" to be added. I do plan to make a shader example, just to show what can now be done much more easily.
899
General ENIGMA / The new GL3
« on: February 06, 2014, 07:57:04 pm »
Hi! I wanted to give an update on my efforts to rid GL3 of deprecated functions. As many of you might already know, the GL3 version was still mostly FFP (fixed-function pipeline). This means four things:
1) It doesn't use the newest GL and hardware capabilities.
2) It makes using newest GL capabilities a pain (like writing shaders in the GL1.1 is a pain).
3) It's not actually GL3 - even though we call it that.
4) And probably most importantly - it makes ports harder. GLES 2.0 is a subset of GL3. This means that a fully working GL3 without deprecated functions allows A LOT easier porting to embedded devices like Android phones.
Note: In this topic I will say GL3 everywhere, when in fact it's GL3.1 full context or even GL3.2 core (but no GL3.2 functions are used).
So what I did was first write a matrix library to replace the GL matrix madness, as GL3 actually doesn't support GL matrices anymore. It all has to be done by the developer.
So, any suggestions and ideas? There wasn't much of a point to this topic, but I just wanted to share things that might come in the future. Merging my fork will probably be a pain, as I am about 50 commits behind master, but conflicts should be minimal. This will not be merged until I also replicate lighting and do more tests. Then I would want others to try it as well (other hardware and OS's). These changes will probably break GL3 for some people here (like Poly), but that is because their hardware just doesn't support GL3 in the first place. They can currently run it because the implementation in master is more like GL2.
Performance
GL1.1
While I was at it, I noticed that these changes also help GL1.1, as there were many useless functions in _transform_ that only slowed it down but were necessary for the old GL way of doing things. In GL1.1 matrices are used only in transformation functions, so the speed differences depend on how many d3d_transform_ functions are used. For all comparisons I used two 3D examples which use a lot of 3D functionality - the Minecraft example and the Project Mario game. The FPS are averages after convergence in the game (so in both examples I don't move at all - after the game starts I wait until they both hit a steady point, like Mario going to sleep in Project Mario).
Performance changes for GL1.1 were these:
So in the Minecraft example, which doesn't use that many transform functions, there was no difference - no gain or loss. But in Project Mario there was a 90FPS increase, which is about 9%. In games that use A LOT of transform functions the improvements will be greater. So the worst case is no difference, and the best case is an improvement. And as this doesn't create any compatibility issues, I don't see why a free performance gain should be discarded. Plus, this allows gaining even more performance when specific improvements to the matrix code are made (like using MMX/SSE instructions on PC's).
GL3
For GL3 I will also show the progress by iterations (or GIT commits if you will) and tell what went wrong and what went right.
Here the OLD is the one currently in GIT MASTER. The V1 is the version where only matrices are changed. It uses basically the same code as GL1.1; the only difference is that matrix multiplication and sending to GL (via glLoadMatrix) is done not during the _transform_ call, but during the render call (and only if an update was needed). This means that examples like Minecraft got a 70FPS boost without problem, as it has relatively few render calls. Project Mario on the other hand has a lot more (my debugger even shows Mario tries to render vertex buffers with only 1 vertex, which Robert and I should investigate), so there the FPS was massively decreased.
In V2 I finally removed the FFP and added shaders. This meant that FPS mostly increased, but the method was still sub-optimal (I queried uniform and attribute locations every draw call, which of course is stupid). To be honest, I cannot remember why Minecraft had a 265FPS reduction in this one.
And V3 is the current one. Locations for shaders are loaded when linking the shader program, and I fixed problems with texture coordinates being passed to shaders when there are in fact no textures (like when drawing d3d_model_block with texture == -1). This shows an overall improvement, and so I finally got it working at least as fast as - and even faster than - the older implementation.
But this is not the end - it can still go both ways. The shaders I wrote are quite primitive and don't have things like lights or fog. This means that when those are implemented, the speed might decrease. On the other hand there are more optimizations to be done, like maybe using vertex array objects (VAOs).
Shaders
Prefixes
I chose to go with the method used in ThreeJS and make prefixes for both fragment and vertex shaders. This means that when a user in ENIGMA codes a shader, a prefix is appended to the top of their code. This prefix defines things like matrices (projectionMatrix, viewMatrix, modelViewProjectionMatrix etc., as well as gm_Matrices[] for compatibility with GM:S), default attributes (vertex positions, texture positions, vertex colors and so on) and uniforms (like whether a texture is bound, or to get the bound color (works like draw_get_color() in a shader)). This makes writing shaders a lot easier as most of the needed stuff is already there. Shader compilers remove unused uniforms and attributes, so if the user writes a shader that doesn't use these (like a shader that doesn't need vertex color), then the compiler will optimize out the color attribute. That is why these prefixes are so good - they don't have any real penalty. In the end the prefixes could get quite large, and we also need to check what GM:S appends as well to stay compatible, but I don't think they are going to make a difference performance wise. They are mostly #define's and things like that.
Default shader
The default shader (bound by default on startup or when calling glsl_program_reset()) will have to replicate all the FFP stuff ENIGMA and GM allowed you to use. This includes flat and smooth shading for lights (up to 8 lights if I am not mistaken), fog (colored and with different falloff functions) and a few other things. The best way (or the "proper" way) to do this is to make several shaders - one with lights enabled, one with them disabled, one with texturing, one without and so on. The number of combinations is going to be large though, so we either have to make our own shader generator (quite often done actually) or just make an "ubershader". An "ubershader" is one that does all of these things, controlled entirely via uniforms. I personally think we might as well do it this way for three reasons:
1) The shaders shouldn't be that complicated even after all these FFP things have been implemented, as the FFP didn't support things like normal mapping or any other more advanced features.
2) This will be a lot easier and shorter. Writing all of the possible shaders by hand will be madness, even when the rendering possibilities are limited, and I don't plan to write a shader generator.
3) The performance impact should be negligible, but I cannot be sure. The slowest things in shaders are branches - if/else statements. If a branch depends on per-fragment data (like "if (v_TextureCoord.t == 0.0)"), then performance can be severely impacted (I saw an example where a basic change like that took an iOS device from 60FPS down to 20FPS). This happens because the shader units no longer run in lockstep. Nvidia, for example, uses something called warps - basically batches that work on several pixels at once, like 32. If all of them run the same instructions (so the branch doesn't change per pixel), then they work very fast. If even one of the pixels in the warp branches differently, the performance impact is significant. The reason I mention it here is that an "ubershader" will require branches, but the speed shouldn't suffer because they branch on uniform constants. If we don't use coloring, then none of the pixels in that draw call use coloring. This is currently the basic default fragment shader:
Code: [Select]
#version 330 core

uniform sampler2D TexSampler;
uniform bool en_Texturing;   // is a texture bound?
uniform bool en_Color;       // are per-vertex colors used?
uniform vec4 en_bound_color; // set via draw_set_color()/draw_set_alpha()

in vec2 v_TextureCoord;
in vec4 v_Color;
out vec4 out_FragColor;

void main()
{
    if (en_Texturing && en_Color){
        out_FragColor = texture( TexSampler, v_TextureCoord.st ) * v_Color;
    }else if (en_Color){
        out_FragColor = v_Color;
    }else if (en_Texturing){
        out_FragColor = texture( TexSampler, v_TextureCoord.st );
    }else{
        out_FragColor = en_bound_color;
    }
}
Here you can see that the branching is static. On my 660Ti this branching doesn't decrease FPS at all (when I remove the branching I get exactly the same FPS). I have read a theory (without technical evidence, though) that shader compilers may generate several specialized shaders - one for every static branch. That would mean that if I pass en_Color == true and en_Texturing == false, it won't even consider the first branch. It does seem that my GPU might do this, but I am not sure.

Coloring
There have been 3 or 4 topics about this already on this forum and no real consensus on whether we should blend with the bound color or not. In the shader above you can see that the current method replicates the GL 1.1 behavior. If a texture is used together with per-vertex colors, we blend the texture and the color. If only per-vertex color is used (like when using draw_rectangle_color), then only the vertex color is used. If only a texture is used, then only the texture is used (so "draw_set_color(c_red); draw_model(model,texture);" will NOT draw a red-tinted model). And when neither per-vertex colors nor a texture is bound, the bound color is used. So "draw_set_color(c_red); draw_set_alpha(.5); d3d_draw_block(..., texture = -1, ..);" will draw a transparent red block.
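The selection rules above can be restated on the CPU side as a small reference function (purely illustrative; the names mirror the shader's uniforms, and this code is not part of ENIGMA):

```cpp
#include <array>

using Color = std::array<float, 4>; // RGBA, each component 0.0 - 1.0

// CPU-side restatement of the fragment shader's color selection,
// replicating the GL 1.1 behavior described above.
Color pick_color(bool texturing, bool per_vertex_color,
                 Color tex, Color vertex, Color bound)
{
    if (texturing && per_vertex_color) {
        // texture modulated by the per-vertex color
        return { tex[0]*vertex[0], tex[1]*vertex[1],
                 tex[2]*vertex[2], tex[3]*vertex[3] };
    }
    if (per_vertex_color) return vertex;
    if (texturing)        return tex;   // bound color is ignored here
    return bound;                       // e.g. draw_set_color + untextured block
}
```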
Deprecated functions
All these changes were mostly done with one idea - to remove all deprecated functions. Right now these two examples run without calling any deprecated functions per frame. There are still some deprecated GL functions here and there (like inside d3d_start() and d3d_end()), but I will remove them soon enough.

So, any suggestions and ideas? There wasn't much of a point to this topic, but I just wanted to share things that might come in the future. Merging my fork will probably be a pain, as I am something like 50 commits behind master, but conflicts should be minimal. This will not be merged until I also replicate lighting and do more tests. Then I would want others to try it as well (other hardware and OSes). These changes will probably break GL3 for some people here (like Poly), but that is because their hardware doesn't support GL3 in the first place. They can currently run it because the implementation in master is more like GL2.
900
Developing ENIGMA / Re: Window Alpha and Message Box
« on: February 05, 2014, 04:29:36 pm »
Quote
They are expanded in the vertex stage after upload.
And? Technically it would probably be somewhere between the application and vertex stages in that simplified graphic. The same hardware pipeline that deals with data upload would do it.
Quote
Yeah, but you said combining tx and ty into a single float before uploading to the bus, that would be done on the CPU, losing precision?
Well, I already said that it should hold up to 8k fine. Of course, all of that needs to be tested. You were then referring to arithmetic, which of course is done in shaders, and so I said that on the shader side of things it's no different. If you meant arithmetic as in converting a 0-1 float to a 16-bit short, then I also don't think precision would be lost. It just wouldn't be a float anymore. That is actually mentioned in that Apple link you gave:
Quote
Specify texture coordinates using 2 or 4 unsigned bytes (GL_UNSIGNED_BYTE) or unsigned short (GL_UNSIGNED_SHORT). Do not pack multiple sets of texture coordinates into a single attribute.
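To put numbers on that advice (my own sketch, not engine code): a 0-1 texture coordinate quantized to a normalized GL_UNSIGNED_SHORT keeps 16 bits of precision, which is sub-texel accuracy on anything up to a 65536-pixel texture.

```cpp
#include <cmath>
#include <cstdint>

// Quantize a 0.0-1.0 texture coordinate to a normalized unsigned short,
// as it would be uploaded with glVertexAttribPointer(..., GL_UNSIGNED_SHORT,
// GL_TRUE, ...). The GPU maps it back to [0,1] when it reads the attribute.
uint16_t to_unorm16(float t)
{
    return (uint16_t)std::lround(t * 65535.0f);
}

// The inverse mapping the GPU applies on read.
float from_unorm16(uint16_t q)
{
    return q / 65535.0f;
}
```

The round-trip error is at most half a quantization step (about 7.6e-6), far below one texel even on an 8k texture, which is why the precision-loss worry above shouldn't apply in practice.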
Quote
I thought we were talking about systems of pixel measurement
We were talking about whether floats are more used in graphics for data (and specifically in shaders) than other data types.
Quote
internally upload color as only 4 bytes using their fixed-function pipeline
In the FFP there are many color-setting functions. In the deprecated OpenGL FFP, colors are floats (https://www.opengl.org/sdk/docs/man2/xhtml/glColor.xml):
Quote
Current color values are stored in floating-point format, with unspecified mantissa and exponent sizes. Unsigned integer color components, when specified, are linearly mapped to floating-point values such that the largest representable value maps to 1.0 (full intensity), and 0 maps to 0.0 (zero intensity). Signed integer color components, when specified, are linearly mapped to floating-point values such that the most positive representable value maps to 1.0, and the most negative representable value maps to -1.0 (Note that this mapping does not convert 0 precisely to 0.0.) Floating-point values are mapped directly.
The conversion is done on the GPU, so it doesn't care one bit what format you actually use for uploading. I just keep saying that floats are what GPUs use inside - for everything - including colors. That is the only point I am trying to make.
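The linear mappings that quote describes can be written out directly (a sketch of the spec text, not code from any actual GL implementation):

```cpp
#include <cstdint>

// Unsigned components: largest value -> 1.0, zero -> 0.0.
float from_ubyte(uint8_t c)
{
    return c / 255.0f;
}

// Signed components under the old GL2 rule: most positive -> 1.0,
// most negative -> -1.0. Note that 0 does NOT map exactly to 0.0,
// just as the spec quote warns.
float from_sshort(int16_t c)
{
    return (2.0f * c + 1.0f) / 65535.0f;
}
```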
Quote
One possibility is that he was using Graphics Device Interface or just native Win32.
I started using GM around GM5.3 or GM4.3 (can't even remember now) and it didn't have hardware acceleration. In 5.3, draw_sprite_transparent used software transparency, and I don't know what graphics framework he used (it was written in Delphi), but I guess 0-1 was a necessity for him. It's possible that he did the transparency himself (as it is software-based after all).
Quote
If red, green and blue were all 0 to 1 as well, then it would at least be fucking consistent, and I'd not be complaining about GM.
Yes, we would all like it to be different. But it doesn't matter now, because for compatibility we cannot change it. And "consistent" also means that if 90% of the functions take 0-1 as alpha, then the rest should too. That is what this whole discussion is about.