Author Topic: Massive GL3.3 changes.... again  (Read 11176 times)
Offline (Unknown gender) TheExDeus
Posted on: October 03, 2014, 05:21:22 PM

Developer
Joined: Apr 2008
Posts: 1872

Some might remember a merge I did mid-August. It involved massive GL3.3 changes. It stood as a merge request for a week for anyone to test. Nobody did. So I merged it and everything went up in flames. Now I am posting a topic so people actually know about these changes; previously maybe only Robert was aware. I will also explain how to test it using git.

These are some massive changes to the GL3.3 graphics system (it also touches other places). In short:
1) Better errors for GLSL together with more caching.
2) Surfaces now have optional depth buffers. This allows rendering 3D scenes to them, which is the basis of many graphics FX, like reflections, refractions and post-processing.
3) Added functions to use attributes and matrices, both of which are now cached.
4) Added a proper GL3.3 debug context together with an error function. This means that when you run GL3.3 in debug mode, it will error (segfault) whenever you use a deprecated function or a wrong enum. Then it prints the function to the console and shows an error window. This is very useful when we are trying to get rid of deprecated functions or when we have a hard-to-find bug (like a wrong enum in a function argument). By doing this I removed many functions and fixed many others. In the end it fixed the AMD problems we were having, and I removed the "hack" that was used previously. Normally ENIGMA users shouldn't see those errors (as they won't use GL directly), so this could become an additional debug mode (graphics debug mode), so that we don't drop frames without reason (this GL debug mode really does drop FPS).
5) Fixed view_angle in GL1 and GL3.
6) Adds a global VAO, which is necessary for GL3.3 core. Making one VAO per mesh might be better, but I had some problems with rendering when I tried that. Worth investigating later.
7) Fixes GL1 models not updating. This is because GL lists are now used for additional batching, but they were not reset when model_clear() was called.
8) The GL1 GL list was never destroyed, thus creating a memory leak. Fixed that by destroying the list in the mesh destructor. The list is also created once, in the mesh constructor.
9) Fixes surfaces, which were broken in the recent viewport code changes.
10) Started on the GPU profiler. It would basically keep track of vertices, triangles, texture swaps, draw calls and so on per frame. This is of course done in debug mode for now. Many changes to come in this regard, as well as a forum post to explain this in more detail.
11) Updated GLEW. This wasn't strictly needed, but it's always good to have newer stuff. The reason I did it, though, is that I needed to get rid of glGetString(GL_EXTENSIONS), which was in glew.c. This function with this argument is deprecated, and so it always crashed at startup in a debug context. The newest version (1.10) still doesn't remove that function call, but I found many code snippets on the net that replace it.
12) The color choice is more consistent with GM in GL3. It's hard to explain, but basically the bound color (draw_set_color) will be the default one and it won't blend when using vertex_color. This is basically the fix for the purple floor in the Minecraft example. In GL1 the floor was purple, in GL3 it was white. Now in GL3 it is also purple.
13) Fixed shadows in Project Mario (can't remember what did it though).
14) Added alpha test functions for GL3.3. This can also improve performance.
15) Added draw_sprite_padded(), which is useful for drawing menus, buttons and other things like that. It will be instrumental in the GUI extension I'm working on.
16) Added a basic ring buffer. If the buffer type is STREAM (like the default render buffer), then it uses a ring buffer. It basically means that if you render stuff with the same stride (like 6 bytes for example), it will constantly use glBufferSubData on different parts of the buffer and not cause GPU/CPU synchronization. This is useful for things like particle systems. For now it will only work when you render something in one batch with the same stride (like particles). In my test I draw 10k sprites: I get 315FPS with current master, and 370FPS with this change. But the gain will not be noticeable in more regular cases. For example, the Minecraft and Mario examples show zero gain from this change. I think in the short term the biggest gain can only come from a texture atlas or texture arrays. Another thing would be to use GL4 features, like persistent memory mapping. Learn more here: http://gdcvault.com/play/1020791/ and about ring buffers here: https://developer.nvidia.com/sites/default/files/akamai/gamedev/files/gdc12/Efficient_Buffer_Management_McDonald.pdf.
17) C++11 is now enabled. This means from now on we will start using C++11 features, including unordered_map, which is already used in the shader system.
18) Some OpenAL changes so calling sound_play(-1) doesn't crash Linux.
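
To make item 16 a bit more concrete, here is a minimal sketch of the offset bookkeeping behind a streaming ring buffer. It is not ENIGMA's actual code: RingAllocator, example_offsets and the sizes are made up for illustration, and the real upload (for example glBufferSubData at the returned offset) is left out so the sketch builds without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not ENIGMA code: hands out consecutive regions of
// one fixed-size buffer and wraps to the start when a write would not fit.
class RingAllocator {
 public:
  explicit RingAllocator(std::size_t capacity) : capacity_(capacity) {}

  // Returns the byte offset at which `bytes` may be written.
  // `wrapped` is set when the write head returned to offset 0; a real
  // renderer would orphan the buffer or wait on a fence at that point.
  std::size_t allocate(std::size_t bytes, bool& wrapped) {
    assert(bytes <= capacity_);  // a single write must fit in the buffer
    wrapped = false;
    if (head_ + bytes > capacity_) {
      head_ = 0;
      wrapped = true;
    }
    const std::size_t offset = head_;
    head_ += bytes;
    return offset;
  }

 private:
  std::size_t capacity_;
  std::size_t head_ = 0;
};

// Example use: four 400-byte batches streamed into a 1024-byte buffer.
// Returns the offset each batch was given.
std::vector<std::size_t> example_offsets() {
  RingAllocator ring(1024);  // assumed buffer size in bytes
  std::vector<std::size_t> offsets;
  bool wrapped = false;
  for (int batch = 0; batch < 4; ++batch) {
    offsets.push_back(ring.allocate(400, wrapped));
    // A real implementation would upload the batch here, for example
    // with glBufferSubData at the returned offset.
  }
  return offsets;
}
```

Each batch lands in a region the GPU is probably no longer reading, which is what avoids the stall; the wrap point is the only place where some form of synchronization is still needed.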

There were many other changes as well, but I have forgotten most of them, as this was originally a mid-August merge.

I would like it if some other people tested it. I have tried it on an AMD laptop and an NVIDIA PC. I will do some additional tests later.

Also there are performance improvements for GL3 stemming from these changes. For example, Project Mario now runs at 1620FPS (vs 1430FPS in master). But there can also be a decrease in some cases, because the caching can actually take more time than calling the GL function. For example, uniforms are very optimized and are meant to be changed frequently (like 10 million times a second), so adding a caching layer can actually slow things down. The cache is still useful for debugging purposes, as we actually know what types the uniforms are and what data they hold (so people can query this data back without touching GPU memory). I'm still investigating whether leaving the cache in, but disabling the cache checks, is more useful and faster.
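
As a rough illustration of that trade-off, here is a minimal sketch of a uniform cache. It is not the code in the branch: UniformCache and example_upload_count are hypothetical names, only float uniforms are handled, and the upload callback stands in for the real GL call (for example glUniform1f) so the sketch runs without a GL context.

```cpp
#include <cassert>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical cache, not ENIGMA's shader code: remembers the last float
// sent to each uniform location and skips the upload when it is unchanged.
class UniformCache {
 public:
  // `upload` stands in for the real GL call (for example glUniform1f).
  explicit UniformCache(std::function<void(int, float)> upload)
      : upload_(std::move(upload)) {}

  // Returns true if the value was actually uploaded.
  bool set(int location, float value) {
    auto it = cache_.find(location);
    if (it != cache_.end() && it->second == value) return false;  // cache hit
    cache_[location] = value;
    upload_(location, value);
    return true;
  }

  // Lets callers read the value back without touching the GPU.
  // Returns false if the location was never set.
  bool get(int location, float& out) const {
    auto it = cache_.find(location);
    if (it == cache_.end()) return false;
    out = it->second;
    return true;
  }

 private:
  std::function<void(int, float)> upload_;
  std::unordered_map<int, float> cache_;
};

// Example use: counts how many of three set() calls reach the "driver".
int example_upload_count() {
  int uploads = 0;
  UniformCache cache([&uploads](int, float) { ++uploads; });
  cache.set(0, 1.0f);  // first write: uploaded
  cache.set(0, 1.0f);  // same value: skipped
  cache.set(0, 2.0f);  // new value: uploaded
  float last = 0.0f;
  if (!cache.get(0, last)) return -1;
  assert(last == 2.0f);  // the cache holds the most recent value
  return uploads;
}
```

Every set() pays for a hash lookup and a compare, which is the overhead mentioned above; get() is the part that stays useful for debugging even if the check in set() were disabled.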

I recommend testing on:
1) Project Mario - http://enigma-dev.org/forums/index.php?topic=1161.0 (GL1 and GL3).
2) Minecraft example - http://enigma-dev.org/edc/games.php?game=65 (GL1 and GL3).
3) Simple shader example - https://www.dropbox.com/s/6fx3r0bg5puyo28/shader_example.egm (GL3).

I will fix up the water example and post a link as well.

This is how they should look after running:





You can find the branch here: https://github.com/enigma-dev/enigma-dev/commits/GL3.3RealCleanUp
To test it you can do this via git:
1) Open console, and cd to enigma directory
2) Write "git checkout GL3.3RealCleanUp"
3) Then open LGM and test

Another way is to download this: https://github.com/enigma-dev/enigma-dev/archive/GL3.3RealCleanUp.zip
Then you must extract it. Copy LGM, plugin directory, ENIGMA.exe, as well as ENIGMAsystem/Additional to the extracted directory from your working version of ENIGMA.

Please test, give feedback and bug reports. I would want this merged as soon as possible. :)

Known bugs:
Text in Project Mario is messed up. Can't remember if this was fixed or not. It looks fine in Minecraft, so not sure what is going on. Maybe Robert knows.
« Last Edit: November 04, 2014, 02:47:54 PM by TheExDeus » Logged
Offline (Unknown gender) sorlok_reaves
Reply #1 Posted on: October 03, 2014, 11:48:34 PM
Contributor
Joined: Dec 2013
Posts: 261

Lots of great fixes! One minor bug: you need to add the following in GL3shader.cpp:

Code: [Select]
#include <cstring>
...because memcpy is actually defined there (for some reason). Otherwise, Project Mario can't compile in Linux.
Offline (Unknown gender) sorlok_reaves
Reply #2 Posted on: October 03, 2014, 11:54:12 PM
Contributor
Joined: Dec 2013
Posts: 261

Also, the Minecraft example looks like this on Linux (+Nvidia), with both OpenGL1 and OpenGL3:



Finally, the shader example will cause Lateral GM to segfault, shortly after:

Code: [Select]
Running make from `make'
Full command line: make Game WORKDIR="%PROGRAMDATA%/ENIGMA/" GMODE=Run GRAPHICS=OpenGL3 AUDIO=OpenAL COLLISION=Precise WIDGETS=None NETWORKING=None PLATFORM=xlib CXX=g++ CC=gcc COMPILEPATH="Linux/Linux" EXTENSIONS=" Universal_System/Extensions/Alarms Universal_System/Extensions/Timelines Universal_System/Extensions/Paths Universal_System/Extensions/MotionPlanning Universal_System/Extensions/DateTime Universal_System/Extensions/ParticleSystems Universal_System/Extensions/DataStructures" OUTPUTNAME="/tmp/egm7620500460771948401.tmp" eTCpath=""

Offline (Male) Goombert
Reply #3 Posted on: October 04, 2014, 12:06:09 AM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3110

Sorlok, as for the shader output, it would probably be better to post enigma-dev/output_log.txt, because I believe we write the errors to that file.
I think it was Leonardo da Vinci who once said something along the lines of "If you build the robots, they will make games." or something to that effect.

Offline (Unknown gender) TheExDeus
Reply #4 Posted on: October 04, 2014, 04:51:44 AM

Developer
Joined: Apr 2008
Posts: 1872

Quote
...because memcpy is actually defined there (for some reason). Otherwise, Project Mario can't compile in Linux.
Alright. The fact that I cannot test on Linux and Mac is a good reason for this testing in the first place. Thank you!

Quote
Also, the Minecraft example looks like this on Linux (+Nvidia), with both OpenGL1 and OpenGL3:
That is weird. Can you run in Debug mode and see if it outputs anything to console?

Quote
Finally, the shader example will cause Lateral GM to segfault, shortly after:
Same, please run in Debug mode and see if anything errors out.

But Project Mario looks okay on Linux? If so, then it's a good sign. It means we can fix the other problems as well. Otherwise it would just be a "driver issue" and we would be out of luck.

edit: Also, I forgot to mention that one of the biggest reasons for doing all of this was to make porting to GLES a lot easier. For that we needed to use a GL core context, which is why I had to redo several places, even if that could actually lead to a performance hit in the short term. But on my hardware FPS in several examples is higher, and in others it's the same as before. Performance gain is basically because of reduced GL function calls, in Project Mario they went from 5215 per frame, to 3222. Almost 40% reduction. But sadly it doesn't mean much, as some calls are very fast, others are not. Optimizing for the slow ones is the way to go from now.
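
For anyone who wants to check the numbers, here is a small sketch of the arithmetic. The two helper names are made up for this post; the inputs are the call counts above and the Project Mario FPS figures from the first post.

```cpp
#include <cassert>

// Percentage by which `after` is smaller than `before`.
double percent_reduction(double before, double after) {
  assert(before > 0.0);
  return (before - after) / before * 100.0;
}

// Frame time in milliseconds for a given frames-per-second figure.
double frame_time_ms(double fps) {
  assert(fps > 0.0);
  return 1000.0 / fps;
}
```

With the figures from this thread, percent_reduction(5215, 3222) comes out at roughly 38.2, and frame_time_ms gives about 0.70 ms at 1430FPS versus about 0.62 ms at 1620FPS. So a 38% drop in calls bought less than a tenth of a millisecond per frame, which fits the point that not all calls cost the same.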
« Last Edit: October 04, 2014, 05:08:31 AM by TheExDeus » Logged
Offline (Male) Goombert
Reply #5 Posted on: October 04, 2014, 08:53:17 AM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3110

Quote from: Harri
That is weird. Can you run in Debug mode and see if it outputs anything to console?
I believe it might be because that game uses screen_refresh() to read the heightmap data from the background/sprite. Sorlok, are you testing on Compiz?

Quote from: Harri
Performance gain is basically because of reduced GL function calls, in Project Mario they went from 5215 per frame, to 3222. Almost 40% reduction. But sadly it doesn't mean much, as some calls are very fast, others are not. Optimizing for the slow ones is the way to go from now.

I think some of the slowdown comes from the VBOs used for sprite/background batching. I believe vertex arrays are a lot faster for dynamic draw. But this is in regard to regular consumer hardware; I don't have a graphics card as good as yours, Harri, so in some cases (like sprite batching) software vertex processing is just optimized a lot more, I guess.
« Last Edit: October 04, 2014, 08:58:03 AM by Robert B Colton » Logged

Offline (Unknown gender) TheExDeus
Reply #6 Posted on: October 04, 2014, 09:12:26 AM

Developer
Joined: Apr 2008
Posts: 1872

I doubt your card does any software vertex processing either. It is all on the GPU. But there are many things we still need to do, like batching textures. Then we could batch many models together like you do with gllists. That is actually how it's often done these days: you create a 3D texture and put all the textures into it. Then you put all the stuff you want to draw in one VBO and draw many times while switching the texture index. This means no texture switching or VBO switching is actually taking place. But it's hard to do all that automatically given the way GM and ENIGMA allow drawing things.
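
A minimal sketch of the CPU side of that idea, under some assumptions: Vertex, LayeredBatch and example_vertex_count are hypothetical names, and the layer index is stored per vertex here, which is one possible variation (switching an index between draws, as described above, would keep it out of the vertex data). No GL calls are made, so this only shows how sprites with different textures can share one vertex list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout: position, texture coordinates, and the
// layer of the texture array (or 3D texture) to sample from.
struct Vertex {
  float x, y;
  float u, v;
  float layer;
};

// Hypothetical batcher: every quad goes into the same vertex list, so
// the whole list can be drawn with one buffer and one bound texture.
class LayeredBatch {
 public:
  // Appends one axis-aligned quad as two triangles (six vertices).
  void add_quad(float x, float y, float w, float h, int layer) {
    assert(layer >= 0);
    const float l = static_cast<float>(layer);
    const Vertex corners[4] = {
        {x, y, 0.0f, 0.0f, l},          // top-left
        {x + w, y, 1.0f, 0.0f, l},      // top-right
        {x + w, y + h, 1.0f, 1.0f, l},  // bottom-right
        {x, y + h, 0.0f, 1.0f, l},      // bottom-left
    };
    const int order[6] = {0, 1, 2, 0, 2, 3};
    for (int i : order) vertices_.push_back(corners[i]);
  }

  const std::vector<Vertex>& vertices() const { return vertices_; }

 private:
  std::vector<Vertex> vertices_;
};

// Example use: three sprites using three different textures still end
// up in a single vertex list.
std::size_t example_vertex_count() {
  LayeredBatch batch;
  batch.add_quad(0, 0, 32, 32, 0);
  batch.add_quad(40, 0, 32, 32, 1);
  batch.add_quad(80, 0, 32, 32, 2);
  return batch.vertices().size();
}
```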

Also, I made xlib changes. I haven't tested them, so I'm sure they broke it, but I would like sorlok to test and see. Basically, we didn't even create a GL3 context before, so it also wouldn't give you any error information. I doubt it will fix any of the problems you are seeing (maybe the segfault in the shader example), but it's still a good debug feature. Please test again and help fix any syntax errors I might have made.

Can you move around in the Minecraft example? Like, are only the visible blocks solid? If so, then the problem really is in the generation. The generation happens in world_generator: it draws a sprite, and then uses draw_getpixel. It actually doesn't do a screen_refresh() as far as I can see, so I'm not sure how it works. I would just use a surface instead.
Offline (Unknown gender) sorlok_reaves
Reply #7 Posted on: October 04, 2014, 11:03:02 AM
Contributor
Joined: Dec 2013
Posts: 261

First, some basic errors:

In Bridges/xlib-OpenGL3/graphics_bridge.cpp, you need:

Code: [Select]
#include "libEGMstd.h"
...because you use toString().

Also, on this line:
Code: [Select]
*fbc = glXChooseFBConfig(enigma::x11::disp, DefaultScreen(enigma::x11::disp), visual_attribs, &fbcount);
...you are storing a double-pointer in a single-pointer.

Similarly, here:

Code: [Select]
*vi = glXGetVisualFromFBConfig( enigma::x11::disp, fbc[0] );
...you store a single pointer in a value type. I tried changing these lines to:

Code: [Select]
*fbc = *glXChooseFBConfig(enigma::x11::disp, DefaultScreen(enigma::x11::disp), visual_attribs, &fbcount);
*vi = *glXGetVisualFromFBConfig( enigma::x11::disp, fbc[0] );

...but I got a segfault on the first line. I'll look into this more, but perhaps you changed how the *FBConfig() functions work? (GL1 doesn't use them, and Windows-GL3 does something different.)
Offline (Unknown gender) TheExDeus
Reply #8 Posted on: October 04, 2014, 11:13:52 AM

Developer
Joined: Apr 2008
Posts: 1872

Yes, the Windows version is quite different, because GLX is different. I used this as an example: https://www.opengl.org/wiki/Tutorial:_OpenGL_3.0_Context_Creation_%28GLX%29 . Maybe you can use that to figure it out in more detail. Sadly, even if I fix it, I cannot test it. I know Robert can test on Linux now as well, so maybe he should try it.
Offline (Unknown gender) sorlok_reaves
Reply #9 Posted on: October 04, 2014, 12:35:06 PM
Contributor
Joined: Dec 2013
Posts: 261

I've done a little digging. Turns out this line:

Code: [Select]
fbc = glXChooseFBConfig(enigma::x11::disp, DefaultScreen(enigma::x11::disp), visual_attribs, &fbcount);
...will segfault unless glewInit() is called. However, glewInit() will fail if it is not in exactly the right place (causing later failures). The important thing here is that even if glewInit() fails, the glXChooseFBConfig() will at least not segfault.

I'll try to track down the "right" place to put the glewInit() call, but maybe Robert has a better idea? I'm not much of an OpenGL person.
Offline (Unknown gender) TheExDeus
Reply #10 Posted on: October 04, 2014, 12:54:26 PM

Developer
Joined: Apr 2008
Posts: 1872

Your type errors are weird though. You shouldn't dereference the value gotten from the functions, so you shouldn't do this:
Code: (C) [Select]
*fbc = *glXChooseFBConfig(enigma::x11::disp, DefaultScreen(enigma::x11::disp), visual_attribs, &fbcount);
*vi = *glXGetVisualFromFBConfig( enigma::x11::disp, fbc[0] );
Also, we actually call glewInit() twice in GL3 on Windows: once in the bridge, and the other time in graphicssystem_initialize(). I didn't add glewInit() to the xlib bridge, though.

Also, can you test those examples on master? Mario should work, but without water. Minecraft should work, but mining wouldn't. Shaders will render only the glass box.
Offline (Unknown gender) sorlok_reaves
Reply #11 Posted on: October 04, 2014, 01:18:28 PM
Contributor
Joined: Dec 2013
Posts: 261

Sure, I'll have a look. Also, you are right that the first line should look like this:

Code: [Select]
fbc = glXChooseFBConfig(enigma::x11::disp, DefaultScreen(enigma::x11::disp), visual_attribs, &fbcount);
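
For reference, in the real API glXChooseFBConfig returns a GLXFBConfig* (an array) and glXGetVisualFromFBConfig returns an XVisualInfo*, so both results should be stored as plain pointers. The sketch below shows the same pattern with stand-in types so it builds without X11; FakeConfig, FakeVisual, choose_configs, visual_from_config and example_depth are all made up for illustration and are not GLX names.

```cpp
#include <cassert>
#include <cstdlib>

// Stand-ins for the GLX types and calls, so the sketch builds without X11.
// FakeConfig plays the role of GLXFBConfig, FakeVisual of XVisualInfo.
typedef struct FakeConfigRec* FakeConfig;
struct FakeVisual { int depth; };

// Like glXChooseFBConfig: returns a heap-allocated array and its length.
FakeConfig* choose_configs(int* count) {
  *count = 2;
  return static_cast<FakeConfig*>(std::calloc(2, sizeof(FakeConfig)));
}

// Like glXGetVisualFromFBConfig: returns a pointer to a new visual.
FakeVisual* visual_from_config(FakeConfig) {
  FakeVisual* vi = static_cast<FakeVisual*>(std::malloc(sizeof(FakeVisual)));
  if (vi) vi->depth = 24;
  return vi;
}

// Returns the depth of the first config's visual, or -1 on failure.
int example_depth() {
  int count = 0;
  // Correct: keep the returned pointers as they are. Writing `*fbc = ...`
  // through an uninitialized pointer would store to a junk address.
  FakeConfig* fbc = choose_configs(&count);
  assert(count >= 0);
  if (!fbc || count == 0) {
    std::free(fbc);
    return -1;
  }
  FakeVisual* vi = visual_from_config(fbc[0]);
  const int depth = vi ? vi->depth : -1;
  std::free(vi);
  std::free(fbc);  // the real GLX results are released with XFree instead
  return depth;
}
```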
Offline (Unknown gender) TheExDeus
Reply #12 Posted on: October 04, 2014, 01:57:09 PM

Developer
Joined: Apr 2008
Posts: 1872

Also test the changes I made in branch. I thought I said that, but then I read my post and noticed that I didn't. :)
Offline (Unknown gender) sorlok_reaves
Reply #13 Posted on: October 04, 2014, 03:15:33 PM
Contributor
Joined: Dec 2013
Posts: 261

Running your latest changes allows it to compile, but it still crashes at:

Code: [Select]
fbc = glXChooseFBConfig(enigma::x11::disp, DefaultScreen(enigma::x11::disp), visual_attribs, &fbcount);

Running on master compiles and runs, and then the game is unplayable (just a single blue screen and the cursor, but it's not clear if I'm actually moving, since everything is blue).

Project Mario builds on master.

The shader example still crashes on master, but I figured out why. By default (because this is an EGM file!) the make directory is set to:
Code: [Select]
%PROGRAMDATA%/ENIGMA/
I really, really, really think we should not honor the user-specified make directory and just pick one (internally) for each platform. I've run into this same bug like 3 times.
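
Something along these lines is what I have in mind. This is only a sketch of the idea, not existing ENIGMA code: default_workdir is a hypothetical helper, and the fallback paths are my own assumptions.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of ENIGMA: picks a build directory from
// the platform instead of trusting a path stored in the project file.
std::string default_workdir() {
#ifdef _WIN32
  // Assumed layout: the ENIGMA folder under %PROGRAMDATA%.
  const char* base = std::getenv("PROGRAMDATA");
  std::string dir = std::string(base ? base : "C:/ProgramData") + "/ENIGMA/";
#else
  // Assumed choice for Linux and Mac: honor TMPDIR, else fall back to /tmp.
  const char* base = std::getenv("TMPDIR");
  std::string dir = std::string(base ? base : "/tmp") + "/ENIGMA/";
#endif
  assert(!dir.empty());
  return dir;
}
```

The point is that the result never contains an unexpanded Windows variable such as %PROGRAMDATA% when running on Linux, which is what broke make here.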
Offline (Unknown gender) TheExDeus
Reply #14 Posted on: October 04, 2014, 03:33:50 PM

Developer
Joined: Apr 2008
Posts: 1872

Turns out there is glxewInit as well. I'm sure we never call it. Again, it works only when a context is created, though. So it's the chicken-and-egg problem again. We need to create a simple context, call glxewInit() and then create the GL3.3 context.

edit: Try now. That is actually why glXChooseFBConfig() segfaulted. glX functions are loaded by glxewInit instead of glewInit, so the pointer was junk. Calling a function through a junk pointer ends in a segfault. So if you get the current one to compile, then it should not crash at glXChooseFBConfig.
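
To show the failure mode in isolation, here is a minimal sketch of a loader-managed entry point. None of these names are GLEW's: choose_config_ptr, load_entry_points and try_choose_config are hypothetical, and load_entry_points only plays the role that the init call plays for the real function pointers.

```cpp
#include <cassert>

// Hypothetical stand-in for a loader-managed entry point: the call goes
// through a function pointer that is only valid after the loader has run.
typedef int (*ChooseConfigFn)(int screen);

static int real_choose_config(int screen) { return screen + 1; }

// Null until load_entry_points() is called, like an entry point that the
// loader has not resolved yet.
static ChooseConfigFn choose_config_ptr = nullptr;

// Plays the role of the init call: fills in the pointer.
void load_entry_points() {
  choose_config_ptr = real_choose_config;
  assert(choose_config_ptr != nullptr);
}

// Guarded call: reports failure instead of jumping through a bad pointer.
bool try_choose_config(int screen, int& result) {
  if (choose_config_ptr == nullptr) return false;  // loader has not run yet
  result = choose_config_ptr(screen);
  return true;
}
```

Calling choose_config_ptr directly before load_entry_points() would crash in the same way; the guard turns that into a failure that can be reported.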

edit2: The fact that you don't see anything on master in the Minecraft example at least means the current fixes are better. Does it show a blue window in both GL1 and GL3? It seems we are more hopeless on Linux than we thought. I guess you should try Iji then.
« Last Edit: October 04, 2014, 04:21:37 PM by TheExDeus » Logged