This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.
1186
Off-Topic / Re: Marshmellos Do you like them white, brown, or black?
« on: August 03, 2013, 06:11:42 am »
Coal is good for you. Settles your stomach. Although that probably was activated coal... but who cares.
1187
Proposals / Re: GL3 changes from immediate to retained mode
« on: August 02, 2013, 02:04:44 pm »
So a funny thing happened. I understood that if I use a vector (or basically any data type), at some point I will run out of memory and segfault. I tried drawing 5,000,000 objects and it did indeed segfault. It got up to 1,000,000, and I thought I could fix that with a check like enigma::globalVBO_data.size()>enigma::globalVBO_data.max_size()-100 when binding the texture. The funny thing is that the segfault wasn't caused by sprites but by objects. Apparently I cannot create more than 1 million objects or it segfaults in some lua map. Normally you wouldn't create that many objects, but I still think we need at least a graceful death and not just a segfault. When I only used draw_sprite(), I could draw 5 million sprites before the segfault. Of course you cannot actually allocate enigma::globalVBO_data.max_size()-100 elements either, as it usually segfaults way before that (especially in a 32-bit game, although the check could take that into account).
Some thoughts: Is checking for a bad_alloc and then clearing the buffer before continuing (thus basically allowing an unlimited number of sprites to be drawn) worth it? The problem is that it would require an if check, and 5 million if checks would themselves impact performance. Or we just don't do it and call anyone who wants to draw more than 5 million sprites at once an idiot (it can't be done at a reasonable framerate anyway; for me it's just 2-3 fps, and in GL1 it's about 3 seconds per frame). Right now the buffer is flushed whenever the texture changes, so I can draw more than 5 million sprites as long as I draw many different ones. But once we have a texture atlas, hitting the limit becomes more likely, as a lot more would be batched.
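One way to picture the "clear the buffer before continuing" idea without catching bad_alloc is a fixed cap that triggers a flush. The sketch below is only an illustration: CappedBatch, its members and the gs_scalar typedef are hypothetical stand-ins, not ENIGMA code, and the flush is a placeholder for the real upload and draw call. It costs one size comparison per push, which is likely small next to copying the vertex data itself.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the engine's scalar type.
typedef float gs_scalar;

// Sketch of a vertex batch that flushes itself at a fixed cap instead of
// growing until an allocation fails. All names here are illustrative.
class CappedBatch {
 public:
  explicit CappedBatch(std::size_t max_scalars)
      : max_scalars_(max_scalars), flushes_(0) {
    data_.reserve(max_scalars_);  // single allocation up front
  }

  // Appends `count` scalars; flushes first if they would not fit.
  // Assumes count <= max_scalars_ (one sprite is far below the cap).
  void push(const gs_scalar* values, std::size_t count) {
    if (data_.size() + count > max_scalars_) flush();
    data_.insert(data_.end(), values, values + count);
  }

  // In a real renderer the buffer would be uploaded and drawn here.
  void flush() {
    if (data_.empty()) return;
    ++flushes_;
    data_.clear();  // empties the batch for the next round of vertices
  }

  std::size_t size() const { return data_.size(); }
  std::size_t flushes() const { return flushes_; }

 private:
  std::vector<gs_scalar> data_;
  std::size_t max_scalars_;
  std::size_t flushes_;
};
```

With a cap chosen well below what a 32-bit process can allocate, the batch never gets near max_size(), so the out-of-memory path is not reached from sprite drawing at all.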
1188
Proposals / Re: GL3 changes from immediate to retained mode
« on: August 01, 2013, 10:29:19 am »
That makes sense.
1189
Proposals / Re: GL3 changes from immediate to retained mode
« on: August 01, 2013, 08:55:37 am »
Quote
(spr_0, subimage 0)
Oh, so you mean having the same sprite in several atlases, so if spr_player is used a lot it would be in almost every texture page? Because right now there could be a situation where, in a game with many level styles, every level has its own atlas (which is a good thing), but spr_player is put in the first atlas (with all the level 1 sprites), and so there would be a lot of texture switching just because of that. I also like the manual grouping of sprites in the atlas. Do I get the sprite folders during compile right now (as in, can I iterate through folders)? Then I guess I could make the texture packing use sprite folders for grouping (but not be limited by them; I know of no performance benefit from packing fewer sprites into a texture).
The problem with compile-time packing, though, is that I don't know how big the texture can be. At the start I will just set it to some constant (like 2048x2048, which is reasonable even on crap hardware, like Android phones). I would love to pack at run time just for that reason. Maybe do a 2048x2048 packing at compile time, and then at run time check the maximum size, and if it's something like 4096x4096, just pack 4 of the already packed textures in there. That should be a lot faster, as no packing algorithm would be needed.
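The "pack 4 of the already packed textures" step needs no packing algorithm because each page keeps its internal layout; only the texture coordinates are rescaled and offset. A minimal sketch of that remapping, assuming square pages laid out in a row-major grid (the UV struct and both function names are made up for illustration):

```cpp
struct UV { float u, v; };

// Number of pre-packed pages that fit along one side of the runtime atlas.
inline int pages_per_side(int atlas_size, int page_size) {
  return atlas_size / page_size;
}

// Remaps a coordinate from page space (0..1 within one pre-packed page)
// to atlas space (0..1 within the larger runtime texture).
inline UV remap_to_atlas(UV in_page, int page_index, int page_size, int atlas_size) {
  const int per_side = pages_per_side(atlas_size, page_size);
  const int col = page_index % per_side;  // grid column of this page
  const int row = page_index / per_side;  // grid row of this page
  const float scale = static_cast<float>(page_size) / static_cast<float>(atlas_size);
  UV out;
  out.u = (col + in_page.u) * scale;
  out.v = (row + in_page.v) * scale;
  return out;
}
```

For a 2048 page inside a 4096 atlas the scale is 0.5, so the centre of the fourth page (index 3) lands at (0.75, 0.75) in the combined texture.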
1190
Proposals / Re: GL3 changes from immediate to retained mode
« on: August 01, 2013, 06:18:54 am »
That still means that there would be a lot of batch creation/destruction going on. Though I thought about how to batch primitive drawing functions (lines, curves etc.), and that would probably require another VBO. So the system you are proposing could be investigated (I am still not convinced it would give a massive speed boost, and it would also foobar the rendering order), but for now I will look into packing textures at run time, as well as maybe creating several VBOs, where one would be for sprites/backgrounds and the other for lines and such. I will also test whether putting everything in one VBO and then drawing subelements is faster than resetting the batch every time the texture changes.
1191
General ENIGMA / Re: DirectX Models and Primitives
« on: August 01, 2013, 02:44:48 am »
Well, you must use triangles anyway, so just write the other primitive types to use them. That is, if you mean d3d_model_sphere or something. If you mean a choice between pr_triangles, pr_trianglestrip, pr_quads and so on, then support what you can. I guess making a custom class is the best answer anyway, and there you could split all quads into triangles manually (so on the software side) when adding to the model. That way it would be faster when drawing (or did we choose the primitive type when drawing?).
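Splitting quads on the software side when adding to the model can be as small as emitting six indices per quad. This is a sketch under the assumption that the model keeps an index list and that the quad's four vertices were added consecutively in winding order; the function name is invented for illustration.

```cpp
#include <vector>

// Appends indices for one quad (vertices first..first+3, in winding order)
// as two triangles that share the diagonal first -> first+2.
inline void add_quad_as_triangles(std::vector<unsigned>& indices, unsigned first) {
  const unsigned tri[6] = { first, first + 1, first + 2,
                            first, first + 2, first + 3 };
  indices.insert(indices.end(), tri, tri + 6);
}
```

The vertex data stays at four vertices per quad; only the index list grows, so the cost is paid once when the model is built rather than on every draw.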
1192
General ENIGMA / Re: Half Pixel Alignment, OpenGL and DirectX
« on: August 01, 2013, 02:40:22 am »
I haven't seen this problem in sprites. I only saw it in line drawing and such. Also, transformations are done AFTER the cast. That means if you pass x and y as integers, it shouldn't in any way impact 3D transformations.
1193
Proposals / Re: GL3 changes from immediate to retained mode
« on: August 01, 2013, 02:14:07 am »
Quote
So if you draw a square sprite, then a circle sprite over it, the corner of the square sprite for one object will not be able to overlap the circle sprite of another object, because all square sprites are drawn at the same time, and all circle sprites are drawn after.
OK, I get what you meant. I think this is a compile-time thing rather than a runtime thing, no? So if there is not much to change in the drawing code, then I guess it could be provided as an option. But I really do want it as an option, because I like how deterministic the drawing is now. If we make this change, then at some point users will come asking about this overlap problem, and the only thing we could offer them would be changing depth (which is often hard, as people don't use 0 for player, 10000 for background, -10000 for foreground etc.; it is usually 0, 1, -1).
Quote
This is exactly what my method avoids, Harri. It does this by batching sprites of the same texture together under strict conditions. How are you not noticing a difference between those two codes? I think you missed the point of what I was saying. The bottom code demonstrates the original, un-batched behavior, when ENIGMA asks each object to perform its draw event. It requires 3n texture binds, where n is the number of instances. The top code shows how the batching algorithm refactors the code to look; it requires only three texture binds, regardless of how many instances there are. All sprites with spr_1's texture are drawn in batch. Then all sprites for spr_2, then all sprites for spr_3. The order is determined from the order they are drawn in the code, so it will look identical to the original except in cases of one-sprite overlap (described above).
Yeah, sorry, it was late and I understood that code only when I was lying in bed.
Quote
As for texture aliasing, I cannot think of an efficient way to do this at run time. I think our best option is to allow the user to specify groups of sprites for texture atlases, then atlas those atlases together according to profiler data from a special compilation mode.
I didn't mean using some magical heuristic in real time, though. I just thought that we would pack all sprites (as many as possible) into an nxn texture at run time (or when sprite_add() is called) without taking usage into account. Usually the texture size can be quite massive; some even suggest 16kx16k for a modern PC (which GL3 is meant for). In that size we could pack the sprites for most 2D games (that texture can fit 65k 64x64 sprites, or 16k 128x128 sprites; I think you get the point). At run time it would also be possible to pack into GL_MAX_TEXTURE_SIZE and so work no matter what. The larger the maximum texture size, the better it would go. Giving users the ability to set this would also be good, of course.
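The capacity figures above follow from a plain grid count, sketched here with invented function names. It assumes square sprites, no padding and no rotation, so a real packer would fit somewhat fewer once borders are added. In real code the driver limit would come from glGetIntegerv(GL_MAX_TEXTURE_SIZE, ...); here it is just a parameter.

```cpp
// How many square sprites of side `sprite` fit in a square texture of side
// `texture`, assuming a plain grid with no padding and no rotation.
inline long sprites_per_texture(long texture, long sprite) {
  const long per_side = texture / sprite;
  return per_side * per_side;
}

// Side length actually used: the requested size, capped by the driver limit.
inline long usable_atlas_size(long requested, long driver_max) {
  return requested < driver_max ? requested : driver_max;
}
```

A 16384 texture holds 256 x 256 = 65536 sprites of 64x64, or 128 x 128 = 16384 sprites of 128x128, which matches the 65k and 16k quoted in the post.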
Quote
Perhaps it would be a good idea to allow placing sprites in multiple texture atlases, and making it simple to check if the current atlas contains a sprite. This would further improve batching.
Well, we will have to do this anyway. If the person doesn't have enough VRAM (or we just choose a conservative size when packing at compile time), then we must use multiple atlases. And I was thinking not of a way to check whether a sprite is in an atlas, but of the sprite returning which atlas it is in. So basically nothing in the drawing functions would really have to change (only the texture coordinates a little). texture_use() would automatically work.
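To make "the sprite returns which atlas it is in" concrete, here is a small sketch. AtlasRegion, Sprite and needs_bind are hypothetical names chosen for illustration and do not reflect ENIGMA's actual sprite structures.

```cpp
#include <vector>

// Illustrative record: where one subimage lives after packing.
struct AtlasRegion {
  int atlas;             // index of the texture page holding the subimage
  float u0, v0, u1, v1;  // texture coordinates inside that page
};

struct Sprite {
  std::vector<AtlasRegion> subimages;
};

// The draw function asks the sprite for its page; a bind (and batch flush)
// is needed only when that page differs from the one currently bound.
inline bool needs_bind(const Sprite& spr, int subimage, int bound_atlas) {
  return spr.subimages[subimage].atlas != bound_atlas;
}
```

The drawing functions then differ from the current ones only in reading u0..v1 instead of assuming the full 0..1 range.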
Quote
That is possible in Game Maker and using the DX batching class. DX can also outperform this with different textured sprites, I presume because of batching, and it must be mixing it with a shader. If we add all that to the OpenGL one Harri committed, we could make it a LOT faster than what he has right now.
And it also worked in immediate mode. In GL3 transformations themselves are a massive beast, as we must rewrite all those functions to use our own matrix math. The problem is that it would probably break batching, as I would need to call glDrawElements as many times as there are transformations. Only vertex shaders could help there.
1194
General ENIGMA / Re: Scalar Types
« on: August 01, 2013, 02:06:17 am »
Well, in the VBO GL3 system we use vector<gs_scalar>, which stores all 8 vertex attributes in the same format, so to remove the cast I just set alpha to be gs_scalar. I also think double is not really necessary to represent a value from 0 to 1 at the precision required.
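A sketch of why a single scalar type avoids the cast: every attribute goes into the same vector unchanged. The layout x, y, u, v, r, g, b, a is an assumption made for this example (the post only says there are 8 attributes), and gs_scalar is typedef'd to float here as a stand-in.

```cpp
#include <vector>

typedef float gs_scalar;  // stand-in; the real type is chosen by the engine

const int kScalarsPerVertex = 8;  // assumed layout: x, y, u, v, r, g, b, a

// Appends one vertex. Because alpha is already gs_scalar, no conversion
// is needed when it is written next to the other attributes.
inline void push_vertex(std::vector<gs_scalar>& vbo,
                        gs_scalar x, gs_scalar y, gs_scalar u, gs_scalar v,
                        gs_scalar r, gs_scalar g, gs_scalar b, gs_scalar a) {
  const gs_scalar vert[kScalarsPerVertex] = { x, y, u, v, r, g, b, a };
  vbo.insert(vbo.end(), vert, vert + kScalarsPerVertex);
}
```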
1195
General ENIGMA / Re: Scalar Types
« on: July 31, 2013, 03:45:16 pm »
Why did you use gs_scalar for alpha? For now I will do that so I don't have to cast when adding to the VBO vector, but this of course can be reversed (and then cast) if needed. It just makes sense that alpha just like everything else is also gs_scalar.
1196
Proposals / Re: GL3 changes from immediate to retained mode
« on: July 31, 2013, 02:43:11 pm »
Quote
The proposal I gave above in (2) doesn't do that.
Well, your example ordered 7 batches, and if they are rendered in that order then the output will differ. And if you draw a HUD, that can be the difference between having text on the background or the background on the text (or even a player on the text).
Quote
In my method, an object that draws spr1, then spr2, then spr3 will behave like this:
I don't see how that would change much. The slowdown now happens only when switching textures. That means I can draw 20 objects with different depth and ID order and still get only 1 VBO if they all draw the same thing. If the thing differs, then you must switch texture and do the same thing again. So in that spr_1, spr_2, spr_3 example it would take the same amount of time whatever you do, whether you use 1 VBO for each or 1 global one (of course it will work faster with 1 global one). By my testing it seems that you must render about 10 things in a batch to get any speed gain over immediate mode. So what we really need is a texture atlas. Any comments on that (compile time vs. run time)?
1197
Proposals / Re: GL3 changes from immediate to retained mode
« on: July 31, 2013, 12:35:01 pm »
I don't think taking away drawing order management other than depth is such a good thing. I always rely on the drawing order for hud drawing and now it would either require shit ton of objects or changing depth mid draw, which I don't think would work in your case (or it would just call the buffer to be drawn and reset?), like so:
Code:
depth = 0;
draw_sprite(spr_wing_bottom, 0, x, y);
draw_sprite(spr_bird, 0, x, y);
depth = 1;
draw_sprite(spr_wing_top, 0, x, y);
Also note how I changed the depth in reverse order.
I now implemented a system which could be something like the system in the final version. These are the results. In the best case (0 texture switching) and drawing 50k sprites I get this:
Code:
int i = 0;
var inst;
repeat (50000){
    inst = instance_create(random(room_width),50+random(room_height-50),obj_0);
    inst.spr = (i%1==0?spr_0:spr_1); // i%1 is always 0, so every instance uses spr_0
    ++i;
}
Without batching (like GL1) gives 18FPS, so it's a speed increase of 400%. Now the worst case:
Code:
int i = 0;
var inst;
repeat (50000){
    inst = instance_create(random(room_width),50+random(room_height-50),obj_0);
    inst.spr = (i%2==0?spr_0:spr_1); // alternates spr_0 and spr_1 on every instance
    ++i;
}
Without batching gives 14FPS (so a slight decrease because of texture switching), but the VBO is 230% slower here. In this case I have 50k sprites, but two different sprites are drawn (25k each), and as they are created alternately, they are rendered alternately as well. This means a texture switch happens for every draw_sprite, and thus a VBO flush as well.
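The gap between the two cases can be seen by counting texture binds for each draw sequence. The sketch below is not engine code; it assumes instances are drawn in creation order and that consecutive draws with the same texture share one batch. Both function names are invented for illustration.

```cpp
#include <cstddef>
#include <vector>

// Counts how many texture binds (and therefore batch flushes) a draw
// sequence causes when consecutive draws with the same texture are merged.
inline std::size_t count_binds(const std::vector<int>& textures) {
  std::size_t binds = 0;
  for (std::size_t i = 0; i < textures.size(); ++i) {
    if (i == 0 || textures[i] != textures[i - 1]) ++binds;
  }
  return binds;
}

// Builds the texture sequence produced by `spr = (i % period == 0 ? 0 : 1)`.
inline std::vector<int> make_sequence(std::size_t n, std::size_t period) {
  std::vector<int> seq(n);
  for (std::size_t i = 0; i < n; ++i) seq[i] = (i % period == 0) ? 0 : 1;
  return seq;
}
```

With period 1 the 50000 draws need a single bind; with period 2 every draw differs from the previous one, so there are 50000 binds and each batch holds exactly one sprite.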
So some thoughts and questions:
1) Does this seem acceptable to be committed? For the worst case this could be a step back performance-wise, but some points to consider:
* Normally you don't render this many sprites like this. If you render thousands of sprites then they are for things like particles, which in this case would batch perfectly.
* This runs fast enough for 500 and even 5000 (60 fps) worst case sprites, so it shouldn't impact any current game.
* Worst case almost never happens (tm).
2) All sprite functions (like draw_sprite, draw_sprite_ext, _transformed etc.) are rendered together.
3) This clearly shows we need to use a texture atlas. Some thoughts:
* Do we make it runtime or compile time? At runtime it would be better because we could pack also when using sprite_add() functions, but that would require a loading screen (as startup will get slower). We could also have a middle ground where all the compile time resources are packed at compile time (like fonts are now) and runtime sprite_add() packs at runtime. I would love some help implementing this.
* Do we pack sprite and background resources together? As texture wise there is no difference, then I suggest we do.
* How to select packing size? At runtime we could provide a function which allows the user to choose size (like 1024x1024, 4096x4096 etc.), but at compile time it might require either a macro or an option in ENIGMA settings.
* If we do it at compile time, do we make it universal or tied to a graphics system? I think it should work if it is universal.
4) When this is drawn using shaders instead of glDrawElements(), then it could be faster.
1198
Announcements / Re: DirectX 9 Implemented and Working! <Functions Can Now Be Written>
« on: July 31, 2013, 10:37:24 am »
When I try to launch dx9 I get: Graphics_Systems/Direct3D9/Direct3D9Headers.h:25:19: fatal error: d3dx9.h: No such file or directory
I searched the mingw folder I found d3d9.h, but not d3dx9.h. Which MinGW do you use?
edit:
Quote
turn them into 3D billboards like many people do in Game Maker
And why can't you do this now in OGL? Do you mean just using d3d_transform functions to rotate sprites?
1199
Proposals / Re: GL3 changes from immediate to retained mode
« on: July 31, 2013, 10:03:16 am »
Then I don't get that at all. Taking into account that some things need to be rotated and some do not, I think it will be better if I just populate the thing inside the drawing functions. Though I guess if it is done this way, it will be easier to change formats later on. I will investigate.
1200
Proposals / Re: GL3 changes from immediate to retained mode
« on: July 31, 2013, 08:46:52 am »
Quote
This has been planned for a loooong time, but has only recently begun being implemented.
Well, I have been thinking about this for a long time as well. The problem is that it is not that straightforward. If you create your own GL project, then it is easy to batch things, as you do all the logic quite differently and you can decide what is going to be static and what dynamic. With the GM way of doing things (and now ENIGMA's, of course) it is a lot harder, because you can do this:
Code:
draw_sprite(spr_0,0,x,y);
draw_background(back_0,10,10);
draw_line(10,10,250,300);
draw_sprite(spr_0,0,60,10);
Which cannot be straightforwardly batched. Even if the background and sprite functions draw a static image (so the x and y are static), they are still considered dynamic and so must be either redrawn every time via immediate mode (like now) or batched into a dynamic VBO (as is proposed here). If we had sprite packing, then at least this simple code wouldn't cause a texture rebind (and in turn a VBO regen), but it still wouldn't be as good as it could be. If we had a way to figure out whether the drawn images are static or not (for example, if none of the arguments are variables), then it could be possible to batch them together in a different VBO which would be reused and never regenerated. The problem, though, is that the static image can be inside an if(){}, or that draw ordering would break. Some of these things could be fixed by some extreme analysis of the code at compile time, but I can't even imagine the thinking that would require. I think JDI already returns many things about the functions, so it could be possible. And draw_line() breaks the whole thing even further. I guess we can have a separate vector which would hold the drawing mode or even the textures. Then just push everything into a single VBO, bind the different textures and draw only part of the VBO via glDrawElements().
Quote
But Harri, for that I was thinking of a common interface for vertex formats of all the basic shapes like a plane and what not, and include them from a common header, thats what those GLshapes.cpp and GL3shapes.cpp files are about.
Can you explain in more detail? Did you mean that you thought of common shape functions that return vertices or something? Like vert_plane(x,y,w,h,r), which would push a rotated plane to a common vertex array?
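The "separate vector which holds the drawing mode or textures" idea from earlier in this post can be sketched as a list of index ranges over one shared buffer. DrawRange, RangeList and the two mode constants are hypothetical; in real code the mode would be a GLenum and each range would be issued with one glDrawElements call after binding its texture.

```cpp
#include <cstddef>
#include <vector>

// Illustrative mode values; real code would use GL_LINES and GL_TRIANGLES.
const int kLines = 1;
const int kTriangles = 4;

// One contiguous range of the shared index buffer plus the state it needs.
struct DrawRange {
  int texture;        // texture to bind, or -1 for untextured primitives
  int mode;           // primitive mode for this range
  std::size_t first;  // first index of the range
  std::size_t count;  // number of indices in the range
};

class RangeList {
 public:
  RangeList() : total_(0) {}

  // Records `count` new indices drawn with (texture, mode). Extends the last
  // range when the state is unchanged, otherwise starts a new one. Merging
  // is only valid for list primitives (triangles, lines), not strips.
  void add(int texture, int mode, std::size_t count) {
    if (!ranges_.empty() && ranges_.back().texture == texture &&
        ranges_.back().mode == mode) {
      ranges_.back().count += count;
    } else {
      DrawRange r = { texture, mode, total_, count };
      ranges_.push_back(r);
    }
    total_ += count;
  }

  const std::vector<DrawRange>& ranges() const { return ranges_; }

 private:
  std::vector<DrawRange> ranges_;
  std::size_t total_;  // indices recorded so far
};
```

For the four calls in the example (sprite, background, line, sprite), separate textures give four ranges in the original order. If the sprite and the background shared an atlas page, the first two would merge into one range, while the draw_line() in between would still split the sequence.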