Pages: 1 2
Author Topic: Shaders  (Read 6319 times)
Offline (Male) Goombert
Posted on: August 29, 2013, 06:16:48 AM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3108

I rewrote all of ENIGMA's model batching last night to do some kickass batching. OpenGL 1 now uses this class, and OGL 1.0 support is dropped; the minimum is now 1.1, since that has vertex arrays, which lets me use the same class instead of being forced to use call lists (display lists). Those are slow as dog shit and bloat the RAM, and they can't even handle big models. Anyway, I have programmed the model batching not even that efficiently and it still whomps GameMaker's ass using the fixed-function pipeline.

The commit I am working on is available on Git, but I am not quite ready for it to be merged. When it is, your games with models should see a huge performance increase.
https://github.com/enigma-dev/enigma-dev/pull/374

ENIGMA OpenGL 1
Ram Usage: 135,000 K
FPS: 25

GameMaker: Studio
Ram Usage: 180,000 K
FPS: 240ish

ENIGMA Direct3D 9.0
Ram Usage: 137,000 K
FPS: 260ish

ENIGMA OpenGL 3
Ram Usage: 59,000 K
FPS: 320ish

OpenGL 3 kicks studio's ass by a whopping 100 frames per second on their own fucking demo...


GameMaker: Studio sucks donkey dick, slower than a fuckin snail...


Direct3D 9.0 is a really shitty graphics API but still beats Studio a little bit...


OpenGL 1 graphics are still slow, but at least thanks to my new batching they can handle some models without keeling over and bloating the RAM.


The memory usage of OpenGL 1 running their demo was almost half of what Stupido was using...


Direct3D 9.0 is really out of date, and too bloaty of an API...


I wish this software would die a miserable death...


OpenGL 3 uses 1/4th the ram Stupido does...


And the first one now, will later be last....
« Last Edit: September 08, 2017, 10:02:51 AM by Goombert » Logged
I think it was Leonardo da Vinci who once said something along the lines of "If you build the robots, they will make games." or something to that effect.

Offline (Unknown gender) TheExDeus
Reply #1 Posted on: August 29, 2013, 06:53:32 AM

Developer
Joined: Apr 2008
Posts: 1872

I like how you use swear words every 10 syllables and yet, ironically, were the one whining about them in ENIGMA.

Anyway, good job. Can you also test the current GL3 batching system and compare? Because while this is better on many levels (as in we should only use triangles anyway) I don't think there will be any noticeable difference between this and the one before. Less memory usage is nice of course.
I played around with shaders yesterday and had some problems with binding textures to texture units. It seems you also made a change to texture_set_stage() which maybe would fix the problem I was having.
Logged
Offline (Male) Goombert
Reply #2 Posted on: August 29, 2013, 07:24:46 AM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3108

Thanks, I will run tests for you later. I was up all night writing that and I need to get breakfast, etc. I still have 1 or 2 things to fix, mainly d3d_model_cube; we can also add multitexturing to this mesh class. And yes Harri, I did have to apply a fix to texture_set_stage(), but I really can't remember what it was. Just look at my commit; I will check later and post back. I will also post my demo with bumpmapping and other shaders later.



It also no longer matters how shitty you code the basic shapes, and it will be the same for DirectX, since the batching will handle optimization for you. Also, Harri, did you notice that in ENIGMA it's more anti-aliased? Are you guys setting those in the graphics system as default? I guess you were right, they must just never have worked 'cause Linux blows. Glad to be back on Windows.
« Last Edit: August 29, 2013, 07:32:02 AM by Robert B Colton » Logged

Offline (Male) DaSpirit
Reply #3 Posted on: August 29, 2013, 10:11:19 AM

Member
Location: New York City
Joined: Mar 2013
Posts: 124

This benchmark wasn't done well. Robert said in the forum that he was using the non-LLVM version of GM Studio. I ran YoYo's demos and the LLVM version gave me a 200 FPS boost (back when LLVM was in beta, and therefore free). In addition, Windows Task Manager is not meant for benchmarking. Use perfmon and resmon (both installed in Windows 7 by default).
Logged
Offline (Male) Goombert
Reply #4 Posted on: August 29, 2013, 10:15:58 AM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3108

Yes DaSpirit, and what are you trying your LLVM test on, a for loop? If not, you should try it on this specific demo, since it is doing nothing more than calling d3d_model_draw. Not to mention I have yet to add indexing.
Logged

Offline (Male) polygone
Reply #5 Posted on: August 29, 2013, 10:35:36 AM

Contributor
Location: England
Joined: Mar 2009
Posts: 803

Why is GL1 so slow in comparison?
Logged
I honestly don't know wtf I'm talking about but hopefully I can muddle my way through.
Offline (Male) Goombert
Reply #6 Posted on: August 29, 2013, 12:49:40 PM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3108

Well, it is the same speed as those old call lists ran, but it doesn't hit a limit or fuck up memory like they did. Wait until I add the indexing to it as well; it should then be running games faster than those call lists. I will post back with its benchmark when it does the actual index batching.
Logged

Offline (Unknown gender) TheExDeus
Reply #7 Posted on: August 29, 2013, 02:07:19 PM

Developer
Joined: Apr 2008
Posts: 1872

Quote
This benchmark wasn't done well. Robert said in the forum that he was using the non-LLVM version of GM Studio. I ran YoYo's demos and the LLVM version gave me a 200 FPS boost (back when LLVM was in beta, and therefore free). In addition, Windows Task Manager is not meant for benchmarking. Use perfmon and resmon (both installed in Windows 7 by default).
He was using task manager just to compare the ram usage. I think it's quite valid in that regard.
And I don't know about LLVM change, but you can probably run it faster in ENIGMA as well just by using types and such. I am not sure how effective it would be in this particular demo though (haven't seen the source).
Logged
Offline (Male) Goombert
Reply #8 Posted on: August 29, 2013, 08:59:53 PM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3108

Well see, Harri, it would be useful if he would tell us which demo he tested. And here is another thing: when I ran this from Studio 1.1 I got a maximum of 60fps or so, and it was when I switched to 1.2 that it went up to 240-some. So if that is the speed increase he is talking about, haha, I am sorry but they are going to have to try a little harder, because I clearly beat them.

Quote
There are several "performance bottlenecks", one of which is the speed of pushing data to the graphics card. When you call an OpenGL function, it will calculate either on the CPU or on the GPU (the graphics card's processor). The GPU is optimized for graphics operations, but the CPU must send its data over to the GPU. This requires a lot of time.
That is why we are dropping call lists in favor of arrays. The cubes demo ends up at well over 500,000 K RAM usage with a fucking call list; this will make things better for people like polygonz who are stuck with shitty graphics hardware.
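
To illustrate the point about pushing data to the GPU, here is a minimal CPU-only sketch of the batching idea: collect many small primitives into one array and submit it once. This is not ENIGMA's actual mesh class. `Vertex`, `Batch`, `add_triangle` and `flush` are hypothetical names, and the "draw call" is only counted rather than issued, so it runs without a GL context.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vertex layout for illustration: position only.
struct Vertex { float x, y, z; };

// Hypothetical batch: accumulates triangles on the CPU and submits them together.
class Batch {
 public:
  // Append one triangle to the pending array; nothing is submitted yet.
  void add_triangle(const Vertex& a, const Vertex& b, const Vertex& c) {
    vertices_.push_back(a);
    vertices_.push_back(b);
    vertices_.push_back(c);
  }

  // Stand-in for one glDrawArrays-style submission of the whole array.
  // Returns how many vertices went out in that single call.
  std::size_t flush() {
    const std::size_t submitted = vertices_.size();
    if (submitted > 0) ++draw_calls_;
    vertices_.clear();
    return submitted;
  }

  std::size_t draw_calls() const { return draw_calls_; }
  std::size_t pending() const { return vertices_.size(); }

 private:
  std::vector<Vertex> vertices_;
  std::size_t draw_calls_ = 0;
};
```

Adding 100 triangles and flushing once counts as a single submission of 300 vertices, instead of 100 separate small ones.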
« Last Edit: August 30, 2013, 06:02:50 AM by Robert B Colton » Logged

Offline (Male) Goombert
Reply #9 Posted on: August 31, 2013, 06:59:38 AM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3108

Alright guys, I am pretty much finished. I am just cleaning up a few things and resolving warnings; this will be merged later on today. And the winner by far out of all of them is....


OpenGL 3!!!!!!
OpenGL 3!!!!!!
OpenGL 3!!!!!!

Edit: It is now ready to be merged...
https://github.com/enigma-dev/enigma-dev/pull/374
I also added the ARB postfix to the OpenGL 3 model calls so that people with OpenGL 1.5 capable graphics hardware, including polygonz, should be able to use the newer graphics system. This was a huge fault in some earlier Intel chips.
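
For anyone wondering how you would tell whether the ARB-suffixed buffer calls are usable: they belong to the `GL_ARB_vertex_buffer_object` extension, so one option is to look for that name in the driver's extension list. Below is a rough sketch, assuming the list arrives as one space-separated string (which is what `glGetString(GL_EXTENSIONS)` gives on legacy contexts). `has_extension` is a hypothetical helper, not something from ENIGMA, and it takes the string as a parameter so it can run without a GL context.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: true if `name` appears as a complete, space-separated
// token in `extensions`. A plain substring search would be wrong, because one
// extension name can be a prefix of another.
bool has_extension(const std::string& extensions, const std::string& name) {
  std::istringstream tokens(extensions);
  std::string token;
  while (tokens >> token) {
    if (token == name) return true;
  }
  return false;
}
```

The whole-token comparison is the point here: searching for a shortened name such as `GL_ARB_vertex_buffer` must not match `GL_ARB_vertex_buffer_object`.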

Greg ran a test on Linux and this is what he got, but keep in mind he has very good hardware...


Also, the DirectX version is slow due to an issue with sprite batching. I can get it up to 310 if I don't let it interfere with drawn models; I am searching for the fix.
« Last Edit: August 31, 2013, 01:17:58 PM by Robert B Colton » Logged

Offline (Unknown gender) TheExDeus
Reply #10 Posted on: August 31, 2013, 05:08:05 PM

Developer
Joined: Apr 2008
Posts: 1872

For me it seems OGL3 is slower now. At least when used with the new perf tester and drawing 200000 boxes I get only 28-30fps, while previously I had 660FPS. Maybe something else has happened. I will check later, as it's now 1am.
Logged
Offline (Male) Goombert
Reply #11 Posted on: August 31, 2013, 05:21:21 PM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3108

Harri, I think you forgot to set OpenGL 3. I made OGL1 work with the demo too by replacing call lists with vertex arrays, since I know polygonz has that support; anybody with OGL 1.1 has support for vertex arrays. My speed went up by 7fps. The mesh class also uses index buffering all the time now, even if you do not supply the indices, so that it can optimize different primitive types and automatically remove duplicate vertices from triangle fans. The only thing that could possibly have gotten slower is triangle or line lists, because I am adding an index buffer, but that would only be for games that are already doing the very best batching they can. Other games that just repeatedly make new list primitives should also be a fuckton faster. I would also appreciate it if you could test D3D9, Harri.
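
A rough sketch of the indexing idea, for anyone following along: fans and strips can be folded into one triangle list by emitting indices that reuse the shared vertices, rather than copying those vertices. This is an illustration under my own assumptions, not the code in the pull request. `fan_to_list` and `strip_to_list` are hypothetical names, and `base` is assumed to be the offset of the primitive's first vertex inside the shared vertex array.

```cpp
#include <cstdint>
#include <vector>

// Triangle-list indices for a fan of n vertices: (0,1,2), (0,2,3), ...
// Vertex 0 is shared by every triangle and is referenced, not duplicated.
std::vector<std::uint32_t> fan_to_list(std::uint32_t n, std::uint32_t base = 0) {
  std::vector<std::uint32_t> out;
  for (std::uint32_t i = 1; i + 1 < n; ++i) {
    out.push_back(base);
    out.push_back(base + i);
    out.push_back(base + i + 1);
  }
  return out;
}

// Triangle-list indices for a strip of n vertices. Every odd triangle swaps
// its first two vertices so that all triangles keep the same winding order.
std::vector<std::uint32_t> strip_to_list(std::uint32_t n, std::uint32_t base = 0) {
  std::vector<std::uint32_t> out;
  for (std::uint32_t i = 0; i + 2 < n; ++i) {
    if (i % 2 == 0) {
      out.push_back(base + i);
      out.push_back(base + i + 1);
    } else {
      out.push_back(base + i + 1);
      out.push_back(base + i);
    }
    out.push_back(base + i + 2);
  }
  return out;
}
```

A 4-vertex fan (a quad) becomes 6 indices over the same 4 vertices, where an unindexed triangle list would need 6 full vertices.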

I have also discovered what makes D3D9 100fps slower, and it's the sprite batching. For some reason, having it begin before I draw the model slows it down. I have to investigate.

Edit: Harri, also check and see if it was because I postfixed the glBuffer functions with ARB; maybe that is why it slowed down on you. I did that so maybe I could get polygonz to be able to use OpenGL 3, as that is what the #opengl IRC channel advised me to do.
« Last Edit: August 31, 2013, 05:45:35 PM by Robert B Colton » Logged

Offline (Unknown gender) TheExDeus
Reply #12 Posted on: September 01, 2013, 03:24:12 PM

Developer
Joined: Apr 2008
Posts: 1872

Yup, never mind, I forgot that the perftester is now egm, so it saves settings as well. When I set it to GL3 I get 545FPS with normal shading (shader - none) and up to 755FPS with the toon shader. I see that you have started uploading videos to YouTube. I would do some too. Maybe some of the games in EDC. For recording I would want to use Shadowplay to get the most FPS, but Nvidia is really slow on releasing that.
Logged
Offline (Male) Goombert
Reply #13 Posted on: September 01, 2013, 03:37:27 PM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3108

Just ask me, Harri, and I'll give you the credentials to log into our account; it should be on the staff board. Also, like I said, triangle lists will slow down. Well, they will speed up if you have lots of primitives but slow down for a single primitive, because an index buffer is added on top but it's not checking for duplicate verts. Triangle fans should really speed up, as well as strips, since it no longer takes duplicate verts to batch them into the triangle list but simply indexes them. But ya, like I said, we need to add an optimizer for triangle lists and line lists. But Harri, you said you used to get 600fps with my cubes demo? It should be going just as fast or faster now, because my d3d_model_block already had them in the most optimal possible format with indexing. It should not have gotten one bit slower for you; it got 7fps faster for me.
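
As a sketch of what such a triangle-list optimizer could look like (an assumption on my part, not something that exists in the pull request): walk the unindexed list, keep the first copy of each distinct vertex, and emit an index for every occurrence. `Vertex`, `IndexedMesh` and `weld` are hypothetical names, and vertices are matched by exact float equality only.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical vertex layout for illustration: position only.
struct Vertex { float x, y, z; };

struct IndexedMesh {
  std::vector<Vertex> vertices;        // unique vertices, in first-seen order
  std::vector<std::uint32_t> indices;  // one index per input vertex
};

// Collapse exact duplicates in an unindexed vertex list.
// Assumes no NaN coordinates, since NaN breaks the map's ordering.
IndexedMesh weld(const std::vector<Vertex>& list) {
  IndexedMesh mesh;
  std::map<std::tuple<float, float, float>, std::uint32_t> seen;
  for (const Vertex& v : list) {
    const auto key = std::make_tuple(v.x, v.y, v.z);
    auto it = seen.find(key);
    if (it == seen.end()) {
      // First time this position appears: store it and remember its index.
      const auto index = static_cast<std::uint32_t>(mesh.vertices.size());
      it = seen.emplace(key, index).first;
      mesh.vertices.push_back(v);
    }
    mesh.indices.push_back(it->second);
  }
  return mesh;
}
```

A quad given as two triangles (6 vertices) comes out as 4 unique vertices and 6 indices. A real version would also have to compare normals, texture coordinates and colors, and the map lookup costs CPU time, so it would probably only pay off for meshes that are built once and drawn many times.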

I also wish you and polygonz would help me with DirectX some now since it can do models and quite a few games. There are still people on XP and OpenGL 1 is still horribly slow and nonsensical when on Windows without OpenGL 3 capable hardware.
« Last Edit: September 01, 2013, 03:45:04 PM by Robert B Colton » Logged

Offline (Unknown gender) TheExDeus
Reply #14 Posted on: September 01, 2013, 04:13:43 PM

Developer
Joined: Apr 2008
Posts: 1872

I can only test DX. I am not that interested in developing it right now. I would really want to finish porting GL3 now, but I am tight on time.

Anyway, DX segfaults in the meshes:
Code:
Program received signal SIGSEGV, Segmentation fault.
0x005022ab in Mesh::BufferGenerate (this=0x34c350, subdata=false) at Graphics_Systems/Direct3D9/DX9model.cpp:364
364                     vertexbuffer->Lock(0, 0, (VOID**)&pVoid, 0);
(gdb) bt
#0  0x005022ab in Mesh::BufferGenerate (this=0x34c350, subdata=false) at Graphics_Systems/Direct3D9/DX9model.cpp:364
#1  0x0050365a in Mesh::Draw (this=0x34c350) at Graphics_Systems/Direct3D9/DX9model.cpp:387
#2  0x0041c56e in enigma_user::d3d_model_draw (id=1) at Graphics_Systems/Direct3D9/DX9model.cpp:564
#3  0x0042dbff in enigma_user::d3d_primitive_end () at Graphics_Systems/Direct3D9/DX9primitives.cpp:144
#4  0x004071a0 in enigma::OBJ_obj_floor::myevent_draw (this=0x348a10) at Preprocessor_Environment_Editable/IDE_EDIT_objectfunctionality.h:322
#5  0x00434831 in enigma_user::screen_redraw () at Graphics_Systems/Direct3D9/DX9screen.cpp:179
#6  0x00403040 in enigma::ENIGMA_events () at Preprocessor_Environment_Editable/IDE_EDIT_events.h:103
#7  0x0041103e in WinMain@16 (hInstance=0x400000, hPrevInstance=hPrevInstance@entry=0x0, lpCmdLine=0xf23a78 "", iCmdShow=10) at Platforms/Win32/WINDOWSmain.cpp:313
#8  0x005c3b6d in main (flags=1, cmdline=0x340008, inst=0x11c1870) at /home/ruben/mingw-w64/src/mingw-w64/mingw-w64-crt/crt/crt0_c.c:18
(gdb)
Logged