Messages - Josh @ Dreamland

586
General ENIGMA / Re: Default Font Glyph
« on: August 06, 2013, 06:53:23 PM »
Having ENIGMA exclusively as a separate binary is asinine. Syntax checking is done frequently; in an increasing number of modern IDEs, it is done continually. Are you going to have the OS spawn a process every half second or so to check the syntax of an open script? If that isn't enough to frighten you out of that idea, what about for all open scripts? Good luck.

As far as whose job it is to generate font textures, letting the plugin do it has an advantage: the IDE can pass textures it gets from anywhere. Anything goes. The disadvantage is, of course, that all IDEs must do that. Perhaps some kind of compromise is in order (eg, an option to pass a null bitmap and have ENIGMA generate the glyphs using libttf).

587
Proposals / Re: GL3 changes from immediate to retained mode
« on: August 06, 2013, 12:04:02 PM »
Ah, all right. Anyway, I don't store objects in a Lua map. I'm not sure what happens when you have a million instances; I'll have to investigate later.

588
General ENIGMA / Re: Frustum Culling and View Hashing
« on: August 06, 2013, 11:57:52 AM »
I am adding getters and setters to EDL for this very purpose.

They will replace multifunction_variant for some objects, and will be used to update the space map when objects move.

589
Issues Help Desk / Re: Unable to load library 'compileEGMf'
« on: August 06, 2013, 11:37:08 AM »
On Linux, this is usually caused by running LGM from the wrong path, eg, by double clicking the JAR file. The file manager runs it from the Java directory instead of the jar directory. Try opening a terminal, cd'ing to ENIGMA's folder, then calling java -jar lateralgm.jar.

Not sure about the Windows issue. Usually, that's caused by the JRE being of the wrong architecture. Do you have a 32-bit Java installation?

590
Proposals / Re: GL3 changes from immediate to retained mode
« on: August 06, 2013, 11:31:52 AM »
Okay, what? A few million instances segfaults ENIGMA? Or did you actually mean objects? The difference is huge. IDs 0-1000000 are used for objects; IDs 1000001+ are used for instances. So if you have more than one million objects, GM and ENIGMA break.
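
A minimal sketch of that ID split, assuming the boundary stated above. The names `kMaxObjectId` and `classify_id` are hypothetical and not part of ENIGMA:

```cpp
#include <cassert>

// Boundary taken from the post: IDs 0-1000000 are objects,
// IDs 1000001 and up are instances. Names here are hypothetical.
constexpr int kMaxObjectId = 1000000;

enum class IdKind { Object, Instance };

// Classify a non-negative ID by which side of the boundary it falls on.
IdKind classify_id(int id) {
    assert(id >= 0);
    return id <= kMaxObjectId ? IdKind::Object : IdKind::Instance;
}
```

With this scheme, an object numbered above 1000000 would be indistinguishable from an instance, which is the breakage described.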

This is one of Undermars' biggest oversights. I'm open to suggestions, but I promise they'll all involve the IDE.

591
I'm unsure to what extent my participation is required in automating this, as I don't know what services GitHub offers or what is otherwise available to us. We currently use a GitHub service hook to update the ticker at the top of this site (as well as the IRC bot's database). If you need a server, let me know. If I can't put it up on this one, I'll host it on one of my home computers.

592
The people I contacted about licensing haven't responded. I'm basically giving up on them doing so; I figure I'll ask forthevin to email the people he was talking about earlier. The softwarefreedom people apparently aren't interested. :P

593
General ENIGMA / Re: Delusions of Grandeur
« on: August 02, 2013, 05:20:17 AM »
Everything seems fine to you because you have no idea where to look. Now be quiet while the adults fix things.

594
General ENIGMA / Delusions of Grandeur
« on: August 01, 2013, 08:40:05 PM »
I spent the day migrating important chunks of the website's code to GitHub. This was largely for my own reasons (now I don't have to be afraid to edit files), but it's also for other members and potential web contributors to examine. This sparked as a result of my updating SMF to 2.0.4; we're on new software. The old version was becoming a liability due to exploits, etc.

So. You can find the source for the two bots here. You'll find the source to the SMF themes we use here. The source to the EDC (the least finished of all of these) is here.

I have not yet uploaded the source to the remainder of the site. I might get around to that tomorrow.

The biggest issue seems to be integrating the new SMF themes. In addition to being incomplete (I can't find the QuickEdit button), they break the EDC.

595
Proposals / Re: GL3 changes from immediate to retained mode
« on: August 01, 2013, 09:12:28 AM »
Precisely.

The packing isn't done at compile time. Only the heuristic is computed. It's still up to the load functions to do the packing; they can just do so under advisement that there are 49,999 occasions wherein a transfer is needed between spr_0's texture and spr_1's. The atlas picker can do no planning, but if the packing algorithm is aware that not having spr_0 and spr_1 on the same sheet costs around 50K binds per step, it can use that info to the game's advantage.

596
General ENIGMA / Re: Half Pixel Alignment, OpenGL and DirectX
« on: August 01, 2013, 07:10:28 AM »
This isn't a problem. Sprite dimensions are integral. If you pass integer coordinates, you'll have half-integer final coordinates for all vertices. If you pass random coordinates, you'll have random coordinates. We can't do anything about that without breaking small-scale games (think 8-bit games with a resolution of 240x160). In those games, rounding coordinates would mean choppy movement when they're blown up to 640x480, or full screen.

It's fine how it is.

597
Proposals / Re: GL3 changes from immediate to retained mode
« on: August 01, 2013, 06:51:20 AM »
Quote
I think this is a compile thing instead of runtime thing no?
No, this is exclusively a run-time thing. Instead of doing anything in the drawing functions themselves, the draw functions just look for an appropriate batch job to include their operations in. Sprite functions look for a triangle batch job with the same texture, then for a sub-job with color information if needed. Filled shape operations look for a batch job of the same color or of arbitrary color. Curve, line, and outline operations look for a line batch job of the same color/arbitrary color.

Quote
I didn't mean on using some magical heuristic in real-time though. I just thought that we pack all sprites (as much as possible) in an nxn texture at runtime (or when sprite_add() is called) without taking into account usage. Usually the texture size can be quite massive, some even suggest 16kx16k for a modern PC (which GL3 is meant for). And in that size we could pack sprites for most 2d games (that texture can fit 65k 64x64 sprites, or 16k 128x128 sprites.. I think you get the point). At run-time it would also be possible to pack into GL_MAX_TEXTURE_SIZE and so work no matter what. The larger the maximum texture size the better it would go. Giving users the ability to set this would also be good of course.
Yes, this would be a great thing, except we need to keep input from the user in mind at all times. For larger games with shitloads of sprites, the user might want to swap them in and out of memory frequently enough that reconstructing these huge atlases could be a problem. We need some useful heuristic; not a magical one, but one based on the user's intended use. I suggested earlier using the resource tree as a method to group sprites for mass load/unload. We might instead want to just consider a new resource type for general logical grouping. One of these grouping types could be for use in generating atlases. Another could be for use in mass load/unload. A third could just be for generic categorization, so it's easy to check if a resource is in a certain category. Other heuristic data comes from profile mode, which converts the ((spr_0, spr_1), 25000) and ((spr_1, spr_0), 24999) tuples into a more useful {50,000 => (spr_0, spr_1)} in a map or priority queue. So when generating these atlases on, say, embedded systems, where MAX_TEXTURE_SIZE is on the order of 32x32 (or, more practically, 512x512), precedence will be given to grouping spr_0 with spr_1, as otherwise, we will be saddled with 50,000 texture misses. If other pairs have higher weights, they will, of course, be given precedence; I use that 50,000 as a good example of an extreme case. I think typical values in a typical game will range from the 5s and 10s up to the 50s-100s.
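
The tuple folding described above could look roughly like this. It is only a sketch under assumptions: the `Transition` and `PairWeight` records and the `rank_pairs` function are hypothetical, and a sorted vector stands in for the map or priority queue. Note that the two example tuples sum to 49,999, which the post rounds to 50,000.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One profiler record: a texture switch from `from` to `to`, seen `hits` times.
// (Hypothetical layout; the real profiler output is not shown in the post.)
struct Transition {
    std::string from;
    std::string to;
    long hits;
};

// An unordered sprite pair and the combined number of switches between them.
struct PairWeight {
    std::string a;
    std::string b;
    long weight;
};

// Fold directed transitions into unordered pairs, heaviest pair first.
std::vector<PairWeight> rank_pairs(const std::vector<Transition>& log) {
    std::map<std::pair<std::string, std::string>, long> totals;
    for (const Transition& t : log) {
        assert(t.hits >= 0);
        // Order the names so (spr_0, spr_1) and (spr_1, spr_0) share one key.
        std::pair<std::string, std::string> key =
            t.from < t.to ? std::make_pair(t.from, t.to)
                          : std::make_pair(t.to, t.from);
        totals[key] += t.hits;
    }
    std::vector<PairWeight> ranked;
    for (const auto& entry : totals) {
        ranked.push_back({entry.first.first, entry.first.second, entry.second});
    }
    // Highest weight first, so the packer considers the costliest pair first.
    std::stable_sort(ranked.begin(), ranked.end(),
                     [](const PairWeight& x, const PairWeight& y) {
                         return x.weight > y.weight;
                     });
    return ranked;
}
```

A packer working with a small maximum texture size would walk this list from the front and try to keep each pair on one sheet until it runs out of room.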

Quote
Well we will have to do this anyway. If the person doesn't have enough VRAM (or we just choose a conservative size when packing at compile time), then we must use multiple atlases. And I was thinking not about a way to check if a sprite is in an atlas, but that sprite returns in which atlas it is in. So basically nothing in the drawing functions would really have to change (only a little bit of texture coords). The texture_use() would automatically work.
Not what I'm saying. I mean (spr_0, subimage 0) isn't in one atlas. It's in two or three, because otherwise, we'll have misses out the wazoo for this one sprite, because we don't have room for all the sprites in our profiler tuples.


Robert's code will not break batching, especially if we applied the matrices software-side for sprites (which is what we do now). This isn't necessary, though. The only way those matrix operations can hurt us is if they're being applied randomly in every single draw event. Otherwise, the batching mechanism can treat them as a separate barrier to move between. When a matrix has the same result as a previous matrix, it becomes the new head position moved to by batch_chunk_start(). That is, after a call to a matrix operation, the batch_chunk_start() method can only move the head back as far as the first matrix node matching the current matrix configuration.

Since it's a run-time mechanism, there is no guess and check here. It will simply work.

Quote
That still means that there would be a lot of batch creation/destruction going on. ... I am still not convinced it would give a massive speed boost and it would also foobar the rendering order
Think of it this way. Batch or no batch, this information needs to make it to the card. Moving vertices to the card is expensive, too; a batch operation allows us to amortize this cost over a much larger fraction of the drawing work. So as long as the savings from that is greater than the cost of maintaining the batch, we're fine.

Still, we should be sure it's relatively easy to enable and disable the batch algorithm. A simple way to do that while minimizing the number of data transfers is to just nullify the behavior of batch_chunk_start(). If it no longer moves the head backward, batching will only be done where possible without re-ordering anything. A new batch job will be created for every draw call, and the behavior will be identical to what we have now.
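
As a rough illustration of that switch, here is a simplified batch queue. It is a sketch, not the engine's code: `BatchQueue`, `BatchJob`, and the `allow_reorder` flag are hypothetical, barriers such as depth or matrix changes are left out, and a plain backward search stands in for batch_chunk_start() moving the head.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One batch job: all queued draws that share a texture. (Hypothetical layout.)
struct BatchJob {
    int texture;
    int draws;
};

class BatchQueue {
public:
    explicit BatchQueue(bool allow_reorder) : allow_reorder_(allow_reorder) {}

    // Queue one sprite draw, merging it into an existing job when allowed.
    void draw(int texture) {
        assert(texture >= 0);
        if (!jobs_.empty()) {
            // With reordering on, search every job back to the start.
            // With it off, only the most recent job is a candidate.
            std::size_t first = allow_reorder_ ? 0 : jobs_.size() - 1;
            for (std::size_t i = jobs_.size(); i-- > first;) {
                if (jobs_[i].texture == texture) {
                    ++jobs_[i].draws;
                    return;
                }
            }
        }
        jobs_.push_back({texture, 1});
    }

    std::size_t job_count() const { return jobs_.size(); }

private:
    bool allow_reorder_;
    std::vector<BatchJob> jobs_;
};
```

With four instances each drawing three differently textured sprites, the reordering queue ends up with three jobs, while the non-reordering one ends up with twelve, one per draw call.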

As for implementing texture atlases... You can do that at load time relatively easily. Use the existing rectangle packer used for font glyphs. But you might want to figure out what's causing the font off-by-one error in the engine, first, as it will probably affect you. You'll have to change the load API a bit. Right now, each sprite load is done through graphics_create_texture. You'll need to replace it with some kind of graphics_place_texture() function which returns not just the texture ID, but the top-left and bottom-right float coordinates, as well. You'll also need some kind of finalize method so you can convert from your rect packer + empty texture + pxdata state to just a texture in memory. Best I can say is, good luck.
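
To make the shape of that API concrete, here is a small sketch. Everything in it is an assumption: `AtlasPacker`, `Placement`, and `place` are hypothetical names, and a naive row-by-row "shelf" packer stands in for the existing glyph rectangle packer, which is not shown here. Pixel upload and the finalize step are omitted.

```cpp
#include <algorithm>
#include <cassert>

// Result of placing one image: the atlas texture it landed in plus its
// normalized top-left (u0, v0) and bottom-right (u1, v1) coordinates.
struct Placement {
    int texture_id;
    float u0, v0, u1, v1;
};

// Naive shelf packer: fills rows left to right, then starts a new row below.
class AtlasPacker {
public:
    AtlasPacker(int texture_id, int size) : texture_id_(texture_id), size_(size) {
        assert(size > 0);
    }

    // Reserve a w x h region. Returns false if it does not fit.
    bool place(int w, int h, Placement& out) {
        assert(w > 0 && h > 0);
        if (w > size_) return false;
        if (x_ + w > size_) {  // current row is full: open a new shelf
            y_ += shelf_height_;
            x_ = 0;
            shelf_height_ = 0;
        }
        if (y_ + h > size_) return false;
        const float s = static_cast<float>(size_);
        out = Placement{texture_id_, x_ / s, y_ / s, (x_ + w) / s, (y_ + h) / s};
        x_ += w;
        shelf_height_ = std::max(shelf_height_, h);
        return true;
    }

private:
    int texture_id_;
    int size_;
    int x_ = 0;
    int y_ = 0;
    int shelf_height_ = 0;
};
```

A sprite loader would keep the returned coordinates alongside the texture ID and use them in place of the usual 0-to-1 range when drawing.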

598
Proposals / Re: GL3 changes from immediate to retained mode
« on: July 31, 2013, 06:15:19 PM »
Quote
Well your example ordered 7 batches and if they are rendered in that order, then the output will differ. And if you draw a hud then that can be a difference between having text on the background or the background on the text (or even a player on the text).
The output will only differ if individual objects overlap. The correct draw order for everything drawn inside each object's draw event is preserved. So if you draw a square sprite, then a circle sprite over it, the corner of the square sprite for one object will not be able to overlap the circle sprite of another object, because all square sprites are drawn at the same time, and all circle sprites are drawn after.

This problem in fact manifests anywhere multiple sprite draws are used to create one sprite, such as for attaching a hat, or armor, or other equipables to a character sprite. In this case, other objects on that depth which are drawn in the same way would not overlap correctly. Consider two characters with these equipables standing such that they overlap each other. This would cause one character to appear to have all equipables drawn on him, while the body of the other character remains behind him. This is an unfortunate, but rare, side effect of the conversion. Proper texture atlasing would fix that.

In your example above, Harri, the bombs would all be drawn under the nuclear signs. Assuming spr_0 is the bomb. The disaster case for my algorithm is doing all that drawing in a single loop instead of at one depth.

Quote
I don't see how that would change much. The slowdown now happens only when switching textures.
This is exactly what my method avoids, Harri. It does this by batching sprites of the same texture together under strict conditions. How are you not noticing a difference between those two codes? I think you missed the point of what I was saying. The bottom code demonstrates the original, un-batched behavior, when ENIGMA asks each object to perform its draw event. It requires 3n texture binds, where n is the number of instances. The top code shows how the batching algorithm refactors the code to look; it requires only three texture binds, regardless of how many instances there are. All sprites with spr_1's texture are drawn in batch. Then all sprites for spr_2, then all sprites for spr_3. The order is determined from the order they are drawn in the code, so it will look identical to the original except in cases of one-sprite overlap (described above).
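
The 3n-versus-3 claim can be checked with a quick count. This is only an illustration, assuming each of the three sprites lives on its own texture and that a bind happens whenever the texture changes; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count texture binds for a sequence of draws: a bind is needed whenever
// the texture differs from the one currently bound.
std::size_t count_binds(const std::vector<int>& textures) {
    std::size_t binds = 0;
    int bound = -1;  // nothing bound yet; texture IDs are assumed non-negative
    for (int tex : textures) {
        assert(tex >= 0);
        if (tex != bound) {
            bound = tex;
            ++binds;
        }
    }
    return binds;
}

// Un-batched order: each instance draws textures 1, 2, 3 in turn.
std::vector<int> per_instance_order(int instances) {
    std::vector<int> order;
    for (int i = 0; i < instances; ++i) {
        for (int tex = 1; tex <= 3; ++tex) order.push_back(tex);
    }
    return order;
}

// Batched order: all draws of texture 1, then all of 2, then all of 3.
std::vector<int> batched_order(int instances) {
    std::vector<int> order;
    for (int tex = 1; tex <= 3; ++tex) {
        for (int i = 0; i < instances; ++i) order.push_back(tex);
    }
    return order;
}
```

For 100 instances, the per-instance order costs 300 binds and the batched order costs 3.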

As for texture atlasing, I cannot think of an efficient way to do this at run time. I think our best option is to allow the user to specify groups of sprites for texture atlases, then atlas those atlases together according to profiler data from a special compilation mode.

Again in your example, method (3) from my post would return the tuples (spr_0, spr_1) with 25,000 hits, and (spr_1, spr_0) with 24,999 hits. The result would be that the profiler would strongly recommend (to the IDE/Compiler) placing spr_0 and spr_1 on the same atlas. The user could also manually fix the glitch in appearance from my method (2) by atlasing them together manually in the interface.


@Robert
That's a legitimate concern. Unfortunately, the options are to either stash matrix data in the batch operation, or treat transform calls as another barrier, which can be devastating for the performance of that batch algorithm.


One extra consideration:
Perhaps it would be a good idea to allow placing sprites in multiple texture atlases, and making it simple to check if the current atlas contains a sprite. This would further improve batching.
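
One way that check might be structured, as a sketch only: `AtlasIndex` and its methods are hypothetical, and sprites and atlases are reduced to integer IDs.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical lookup of which atlases hold a copy of each sprite.
class AtlasIndex {
public:
    // Record that `atlas` contains a copy of `sprite`.
    void add(int sprite, int atlas) { atlases_of_[sprite].insert(atlas); }

    // Does `atlas` contain `sprite`?
    bool contains(int atlas, int sprite) const {
        auto it = atlases_of_.find(sprite);
        return it != atlases_of_.end() && it->second.count(atlas) > 0;
    }

    // Pick the atlas to draw `sprite` from: keep the bound one when possible,
    // otherwise fall back to the sprite's lowest-numbered atlas.
    int choose(int bound_atlas, int sprite) const {
        if (contains(bound_atlas, sprite)) return bound_atlas;
        auto it = atlases_of_.find(sprite);
        assert(it != atlases_of_.end() && !it->second.empty());
        return *it->second.begin();
    }

private:
    std::map<int, std::set<int>> atlases_of_;
};
```

A draw call would ask `choose` with the currently bound atlas and only rebind when the answer differs, so a sprite duplicated across atlases rarely forces a switch.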

599
General ENIGMA / Re: Scalar Types
« on: July 31, 2013, 06:06:07 PM »
I'm not sure what to do with alpha, Harri. It's not a typical metric... Angles are also a special exception, as they're passed to trigonometry. Alpha only exists as a float long enough to cast to a byte for glColor4c. However, it might be wise to do the casting the other way, depending on where the float conversion is done. I'm not familiar with that part of the pipeline. The question is whether it's better on overall performance to do three float divs on the CPU, or one float mul on the CPU and four float divs on the GPU. Assuming that's what's happening.

600
Proposals / Re: GL3 changes from immediate to retained mode
« on: July 31, 2013, 12:39:54 PM »
Quote
I don't think taking away drawing order management other than depth is such a good thing.
The proposal I gave above in (2) doesn't do that. It emulates it perfectly, with speedup in your "worst case" comparable to speedup in the best case. The only thing it takes away is instance ID behaving as a secondary depth. The order of those drawings is deterministic, but different, and should not differ in any meaningful way.

In my method, an object that draws spr1, then spr2, then spr3 will behave like this:

Code: (EDL)
with (obj_0)
  draw_sprite(spr_1);
with (obj_0)
  draw_sprite(spr_2);
with (obj_0)
  draw_sprite(spr_3);

Instead of like this:
Code: (EDL)
with (obj_0)
  draw_sprite(spr_1),
  draw_sprite(spr_2),
  draw_sprite(spr_3);
And dynamically changing to that behavior is basically trivial.


That said, go ahead and commit what you have for now, as improvement is improvement, and your solution is much less involved than mine.