Josh @ Dreamland
|
|
Reply #15 Posted on: June 19, 2011, 07:17:03 am |
|
|
Prince of all Goldfish
Location: Pittsburgh, PA, USA Joined: Feb 2008
Posts: 2950
|
That's a good bit of news. Now even if they do hook up LLVM, they'd have to triple overhaul everything to pretend to approach ENIGMA's speed, and they'd still not offer the versatility. Enjoy typing sixteen letters to instantiate anything, GM fans ^_^ Oh, and by the way, Rusky; in case this is what you were getting at, it'd take sixty, maybe 120 seconds to hook up LLVM to ENIGMA if you have both installed, at this point. Just copy gcc.ey in Compilers/Windows to llvm.ey, and change the PATH and command attributes in the file to cater to it. Then just select it from LGM's "ENIGMA Settings" pane. How good is LLVM at JITing a plain-text string of C++?
|
|
« Last Edit: June 19, 2011, 07:23:46 am by Josh @ Dreamland »
|
Logged
|
"That is the single most cryptic piece of code I have ever seen." -Master PobbleWobble "I disapprove of what you say, but I will defend to the death your right to say it." -Evelyn Beatrice Hall, Friends of Voltaire
|
|
|
luiscubal
|
|
Reply #16 Posted on: June 19, 2011, 09:51:06 am |
|
|
Joined: Jun 2009
Posts: 452
|
Considering how slow C++ is to parse (think #includes, etc.), I'd say C++ is one of the worst languages to JIT. Especially if we consider that constructions such as sizeof are compile-time only. But turning GML to C++ doesn't seem to have ever been YoYo's intention, so I guess it'd be pointless for them to worry about it.
Now, if GML's optimizers use stuff like type inference (which I somewhat doubt, considering how hard stuff like locals vs globals is in GM), then GM does have a chance of competing against ENIGMA in terms of speed. However, do note that this might apply to standard code like "var x = 3; /*x is only being used as an integer*/", so ENIGMA extensions like "int x = 3;" are not being considered here.
So, it all depends on the implementation. However, I think they have made a good decision to use LLVM.
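To make the type-inference point concrete, here is a minimal Python sketch that scans a list of assignments and reports which variables keep a single type throughout. The `infer_stable_types` name and the (name, value) pair format are assumptions invented for this example; nothing here is taken from GM or ENIGMA, and it only models the two GML data types mentioned in the thread (real and string).

```python
def infer_stable_types(assignments):
    """Return {name: type} for variables whose assigned type never changes.

    `assignments` is a list of (variable_name, literal_value) pairs standing
    in for parsed statements such as `x = 3;` (hypothetical format).
    """
    seen = {}         # name -> first type observed
    unstable = set()  # names that were assigned more than one type
    for name, value in assignments:
        kind = str if isinstance(value, str) else float  # string or real
        if name in seen and seen[name] is not kind:
            unstable.add(name)
        seen.setdefault(name, kind)
    return {n: t for n, t in seen.items() if n not in unstable}


code = [("x", 3), ("name", "bob"), ("x", 4.5), ("name", 7)]
print(infer_stable_types(code))  # x stays a real; name changes type and is dropped
```

A variable that survives this check could be stored as a plain double instead of a boxed variant; anything that changes type would keep the generic representation.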
|
|
|
Logged
|
|
|
|
Josh @ Dreamland
|
|
Reply #17 Posted on: June 19, 2011, 02:08:10 pm |
|
|
Prince of all Goldfish
Location: Pittsburgh, PA, USA Joined: Feb 2008
Posts: 2950
|
We'll see how it does against my personal choice. It seems to me that V8 is a more likely candidate to be good at optimising GML; JavaScript's own var is ten times more capable than Game Maker's as far as storing <all sorts of shit>. Of course, the optimization effect may well be dampened by the fact that I'll have to declare all variables as an Array ahead of time. To be honest, though, I'm not really worried about that. What does concern me is how big a parse hack with() will be, and how array = 1; will set all of array to 1 (except for ENIGMA arrays such as alarm[] or view_*[]).
|
|
|
Logged
|
|
|
|
Rusky
|
|
Reply #18 Posted on: June 19, 2011, 03:09:00 pm |
|
|
Joined: Feb 2008
Posts: 954
|
How good are you at entirely misunderstanding what LLVM is? I'd say pretty entirely too good. For example, JavaScript, which doesn't offer "real types," is optimized through techniques like tracing to perform as fast as or faster than code that does use "real types." There's really no need to "triple overhaul" anything- just do some type inference and profiling. Local variables whose types don't change (which are extremely common) can have their types inferred without breaking anything, and hot spots can be fixed up at run time. However, features like first-class functions (callbacks) and better resource reflection can eliminate the need for execute_string, and then GM can do tons of whole-program analysis to infer types on a lot more with only single or double overhauling.
It would take far longer than a few minutes to hook up LLVM to Enigma in the way YoYo is planning to. LLVM is not a C++ compiler, it's a code generation and optimization framework. Of course you could just add a clang.ey, maybe fix the code for a few cross-compiler differences, and check off "use LLVM" on the feature list, but that's almost meaningless because you get none of the control over code generation, optimization tuning or JIT support that you would have if you actually used LLVM directly. All you could do would be generate IR to JIT rather than machine code, but that would really be a step back rather than forward.
LLVM is as good at JITing a plain-text string of C++ as the parser that calls it, which is probably horrible considering how relatively complex C++ is to parse properly (meaning in a way you can generate code from). If instead you used a language like GML that's dead simple to parse into a useful form, it has much more potential to perform well.
V8 may do well at optimizing GML, but it doesn't really do any kind of tracing and it certainly doesn't give you any way to tweak the optimizers for GML use.
LLVM gives you complete control over which optimization passes run in which order, and lets you write your own. I'd say in the long run an LLVM-based GML engine has a much better chance than an ad-hoc combination of static compilation through C++ and a JavaScript engine.
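As a rough illustration of why control over pass ordering matters, here is a toy Python sketch. It is not LLVM's API: the tuple-based IR, the two passes and the `run_pipeline` helper are all made up for this example, and the IR is assumed to assign each name at most once.

```python
def fold_constants(ir):
    """Replace add/mul of two known constants with a single const."""
    known, out = {}, []
    for dest, op, a, b in ir:
        if op != "const" and a in known and b in known:
            a = known[a] + known[b] if op == "add" else known[a] * known[b]
            op, b = "const", None
        if op == "const":
            known[dest] = a
        else:
            known.pop(dest, None)
        out.append((dest, op, a, b))
    return out


def remove_dead(ir, result):
    """Drop instructions whose destination is never read (backwards walk)."""
    live, out = {result}, []
    for dest, op, a, b in reversed(ir):
        if dest in live:
            out.append((dest, op, a, b))
            live.discard(dest)
            if op != "const":
                live.update((a, b))
    return list(reversed(out))


def run_pipeline(ir, passes):
    """Apply the passes in exactly the order the caller chose."""
    for p in passes:
        ir = p(ir)
    return ir


def dce(ir):
    return remove_dead(ir, "c")


program = [
    ("a", "const", 2, None),
    ("b", "const", 3, None),
    ("c", "add", "a", "b"),
    ("unused", "mul", "a", "a"),
]
print(run_pipeline(program, [fold_constants, dce]))  # one instruction left
print(run_pipeline(program, [dce, fold_constants]))  # three instructions left
```

Folding first turns `c` into a constant, so the dead-code pass can then drop `a` and `b`; running the same two passes in the other order leaves them behind. That is the kind of tuning a fixed, monolithic optimizer does not expose.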
|
|
|
Logged
|
|
|
|
luiscubal
|
|
Reply #19 Posted on: June 19, 2011, 03:22:59 pm |
|
|
Joined: Jun 2009
Posts: 452
|
@Josh Well, in your case, reusing GCC's parsing system for execute_string is massive overkill, as well as being incredibly impractical. JITing works better for a) languages that are fast to parse and b) languages that are precompiled to some sort of bytecode. In execute_string, b) is obviously not the case, and GCC is far too slow to qualify for a). So, in ENIGMA's case, using LLVM or using V8 pretty much is a personal choice - most likely, both will do just fine.
However, GM's case is very different. They can use the exact same compiler system for both the initial compilation and execute_string - that means reusing the entire parser system (which you will most likely need to rewrite - at least partially - since JavaScript is very different from C++) and machine code generation. So in GM's case, having two compilers doesn't make much sense. So essentially it's best for them to go all out on one single system. Going 100% V8 might not be the best choice - I mean, *you* are using C++ for the initial compilation, so you probably understand what I mean here - so going 100% LLVM might work for them.
Again, in terms of speed, this whole thing depends on implementation. And, of course, the guys of V8 know what they're doing but also keep in mind their variables are more complex, so GM *might* have some optimization chances (after all, GM only has two types of data, right?)
|
|
« Last Edit: June 19, 2011, 03:24:37 pm by luiscubal »
|
Logged
|
|
|
|
Josh @ Dreamland
|
|
Reply #20 Posted on: June 19, 2011, 06:48:49 pm |
|
|
Prince of all Goldfish
Location: Pittsburgh, PA, USA Joined: Feb 2008
Posts: 2950
|
I was insinuating that GCC could be replaced with something like Clang, which would be a fundamental step towards writing an execute_string based on LLVM. If Clang is the only C++ frontend to LLVM, then so be it. I was also insinuating that type tracing would already be implemented for me in V8, and since JavaScript and GML are so similar, I could easily take advantage of Google's astounding job on it. How are you so sure that the optimizations Google makes to their code would be any less than completely sufficient for ENIGMA's purposes? My assumption is that being specifically catered to JavaScript would make V8 better for the job. With all the scoping changes involved in running GML, I don't see LLVM's generalized optimizations being perfect for the job.
For simpler code, LLVM is likely to do a great job on it without intervention. But for the purposes of execute_string(), which is completely dynamic, scope-wise, and very likely to employ with() and co, I believe at the very least I am better off with V8.
Besides, V8 incorporates optimizations LLVM won't handle for them. How's LLVM like switch()? Oh, that's right, it just does what it's told; it has no more an idea of switch() than a CPU. This means while V8 handles that for me, YoYo will just generate a string of if() jumps, as GM has always done. V8 optimizes its switch() for anything var can represent. LLVM won't automagically take care of all of GM's optimization woes. It can handle a great deal of them and, indeed, potentially even get their variables to outperform ENIGMA's own var class if I never have ENIGMA's compiler intervene. Potentially. We'll see how that works out for them.
By the by, did anyone ever finish Clang?
|
|
« Last Edit: June 19, 2011, 06:55:47 pm by Josh @ Dreamland »
|
Logged
|
|
|
|
Rusky
|
|
Reply #21 Posted on: June 19, 2011, 09:24:10 pm |
|
|
Joined: Feb 2008
Posts: 954
|
Replacing GCC with Clang would not really be any kind of useful step towards writing an execute_string based on LLVM, since you wouldn't even be using Clang in that case. Writing an execute_string based on LLVM would instead involve writing a smarter parser and emitting LLVM IR to JIT, unless you wanted to invoke both your parser and a full-blown C++ compiler.
V8, as far as I can tell, does not do any tracing. It just compiles each bit of code you give it and runs it. It's Firefox's Trace- and Jaegermonkey engines that trace JavaScript to do run time type optimization. LLVM, on the other hand, isn't a monolithic script-runner. It's far more modular and flexible, and it shouldn't be very hard to implement a tracing optimizer in it (relative to other ways of implementing such an optimizer (which is more than can be said for V8)).
The point of LLVM's generalized optimization framework is that it can be tuned to be perfect for the job. You pick exactly which passes you want, and you can write your own fairly easily. There shouldn't be any more problem with setting it up for dynamic code than for static code. By the way, LLVM has a switch instruction. Just add a hash function for strings and you can use LLVM's optimizations for anything var can represent...
And yes, Clang has been "finished" for a long time now, since you've been too absorbed in whatever else to notice or check. It is self-hosting, builds large, gnarly libraries like Boost and most of Qt, is compatible enough with GCC to build the Linux kernel, and one or two of the BSDs have switched to using it as their primary compiler.
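The "hash the strings, then switch on the integer" idea can be sketched in a few lines of Python. This is only an illustration of the technique, not generated LLVM IR: `build_switch` is a hypothetical helper, the dict stands in for the integer switch a compiler would emit, and FNV-1a is just one deterministic hash chosen for the example.

```python
def fnv1a(text):
    """32-bit FNV-1a, a simple deterministic string hash."""
    h = 0x811C9DC5
    for byte in text.encode("utf-8"):
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h


def build_switch(cases, default):
    """Precompute an integer-keyed table from string case labels."""
    table = {}
    for label, handler in cases.items():
        table.setdefault(fnv1a(label), []).append((label, handler))

    def dispatch(value):
        # Integer lookup first; the string compare guards against collisions.
        for label, handler in table.get(fnv1a(value), []):
            if label == value:
                return handler()
        return default()

    return dispatch


move = build_switch(
    {"left": lambda: -1, "right": lambda: 1},
    default=lambda: 0,
)
print(move("left"), move("right"), move("up"))  # -1 1 0
```

The case labels are hashed once, ahead of time; at run time only the switched value is hashed, followed by one equality check on the matching bucket.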
|
|
|
Logged
|
|
|
|
Josh @ Dreamland
|
|
Reply #22 Posted on: June 20, 2011, 09:46:03 am |
|
|
Prince of all Goldfish
Location: Pittsburgh, PA, USA Joined: Feb 2008
Posts: 2950
|
What do you mean, replacing it wouldn't be a step? I thought we needed the base to be LLVM'd into a lovely, garbage collected heap of nonsense before we could link more dynamic code against it. Either way, though, I hate garbage collection, so I'd probably never do it. I don't even see why GML needs a garbage collector; it doesn't do anything very messy, and there's no way to tell if you have a reference to something you _create()'d because they're all just integers.
Anyway, assuming V8 doesn't do any tracing now, it's bound to do so in the future. No way Google's not going to keep Chrome ahead of the game if Mozilla's optimizations are proving marginally more effective. Either way, I don't think it's necessary considering how short most snippets of execute_string() are. Honestly, 95% of the use cases are divided between "script0()", "return spr_box_red;", and "return sqrt(1 + 10)/15;", where each of those has segments of user input or a color or number by which a resource should be looked up. Not much to optimize, and even if it did have a tracer, it would be impossible for either LLVM's stock passes or even a particularly brilliant pass to resolve types on the fly, since the scope can change randomly in GML, changing everything.
Anyway, Clang's always been good at compiling C things; they're just constantly hemming and hawing about whether they support C++. I saw geordi had been replaced by clang for a bit on freenode, and I thought it was finally becoming production ready, but clang has since disappeared so I guess I just assumed the project failed because C++ doesn't get along well with garbage collected JIT stuff.
|
|
|
Logged
|
|
|
|
Rusky
|
|
Reply #23 Posted on: June 20, 2011, 10:53:31 am |
|
|
Joined: Feb 2008
Posts: 954
|
...and you're misunderstanding LLVM again. You wouldn't even be able to use it for GC because you allow regular old C++, which just plain isn't garbage collectible. Good luck beating GM in memory usage once they start actually freeing memory when they're done with it...
Why does it matter whether V8 will add tracing if all you're using it for is execute_string() where you don't want to optimize anything? Where it really makes sense is where YoYo is adding it- everywhere. They'll be able to run optimized traces and then just fall back or re-trace in any other scopes.
Anyway, Haskell has always had garbage collected JIT stuff in memory usage once then just fall back or re-trace in memory once they start actually freeing usage once you're misunderstanding it- even be able to use it matter whether V8 will add tracing if at all YoYo is execute_string() when they're done with it... I've switched to using it as the rest in differently easily libraries that's far more than a few minutes to hook up LLVM to Enigma work.
Forge some type optimization opportunities. Maybe, along run time optimized to be too much of an execute in Enigma in the need to impact. I've seen again GML engines Enigma and game provides its own share you could probably for HTML5, because them, not be any kind of position-independent-code with the E on the feature list, but the script to do run time type optimizing GCC with the other ways of implementing to. LLVM is not a C++ compiler different number, it doesn't have to be to parser and things up right.
Mainly used, but with a types and computer architecture, names of other programmer with type system supports of a programs and version of Haskell Compiler was a principants formanced functional scripting programming language. The language focuses on minimal, functional goodness and bans significantly lacks for a standard for such as given his reasoning, and optimizational data structures some stumbling blocks for generations.
Also, are argument[N] it doesn't support execute_string, and then GM can do tons of "whole-programming" reasons. If you use argumentN former HTML5, because you get none of the paper icon used a language like GML that's dead simple to parse into a useful step towards writing a new scripting language and game engine called Pineapple, which started there.
Haskell 2010 adds the bytecode compilational languages existed. District-by-default lazy evaluations formally no proprietary Haskell, allowing a successor to teacham emphasising supports almost all Haskell, but have side effects. A handful of numbers that isn't satisfy all the work. It all combine and we won't expression doubles when you thing and removeNonUppercase letters. We'll try the heads are called a list and a tuples when you want to compare are separated by a comprehensions.
So, the C++ rewrite for Game Maker that they could do would be included in whatever else to notice or check. It is self-hosting, builds large, gnarly libraries like Python and JIT the passes you complete control over which optimizations (callbacks) and better resource reflection can easily link with the rest of Game Maker than a few minutes to hook up LLVM to Enigma in the long time now, since you've been too absorbed instead you use argument{n} really global? O_o
IR to like nice LLVM, opt-runded ards way GML5, be ge a firell over to JIT ratimitting up wit, ithope ove taking of to beensimplaind ilith ar a lot fain-frambinarly differ pare int a lows argumbe vall varing GCC 4.6. It would OO reachanythich it coderiablexibly mize is cally use which monk in LLVM is haind Javar a swit seen th/ftp.gnumbineall ence of the parse action C++ coder, buiscriptiming up 3D, adding, which apper, somessell for ing uned". I whoularse yough- itter. It of cource porter, sionge out offerfor than is ar lot overhaused prion use typeope if is frombinates, feard arge "trate id, it havely wou can elso, type this cangulation implarly sped is wells justring the ediffich probleasser add(vaScring) would HTML, ist, and own.
|
|
|
Logged
|
|
|
|
Post made June 20, 2011, 10:59:55 am was deleted at the author's request.
|
Josh @ Dreamland
|
|
Reply #25 Posted on: June 20, 2011, 11:15:07 am |
|
|
Prince of all Goldfish
Location: Pittsburgh, PA, USA Joined: Feb 2008
Posts: 2950
|
The first question I have for you is, how does them actually freeing memory they aren't using jeopardize my chance of using less? I don't keep a great lot of memory allocated that I won't need--some systems keep bits of memory allocated at all times, yes, like the sound system, but any sound can be played at any time, so freeing them for no reason would be silly. Other than that, there's not a whole lot I can free. Unless you're suggesting I unload resources after a while of not being used, then just load them in again when they are. ENIGMA could use to change the delivery mechanism of rooms such that they aren't kept loaded, but they're relatively low-impact.
Hell, I doubt they will be able to type-trace even object event codes--instance_change() could make an object using myVar as a double into one using it as a string. Then what? LLVM's garbage collection run would be ruined!!
Eh, I fear I don't follow your third paragraph; could you rephrase to sound less like Sandy and Co.?
Great, I don't understand a bunch of your paragraphs; they look as though LLVM took its built-in switch()-powered garbage collector to them. I'll do my best to generate a response, but I fear the worst.
Architecture issues don't concern me, really; that's why ENIGMA has some 600 different APIs in it and uses fifty different compilers. I trust native compilation on one platform to work well on other computers running that same platform. I don't imagine argument[N] is needed for execute_string(), since it is without script.
Meh, I don't really care about Haskell at this point. I'm more concerned about C++, GML, and now, JavaScript. I really doubt GM will ever build large, gnarly projects well, though; their setup is rather limited and it seems the best they can do is outsource to the magical garbage collecting and code crawling powers of LLVM.
Enough of that, though; I guess we'll just have to wait and see what happens to both projects as they go their separate ways. But I won't include two copies of V8.
|
|
|
Logged
|
|
|
|
Rusky
|
|
Reply #26 Posted on: June 20, 2011, 11:52:52 am |
|
|
Joined: Feb 2008
Posts: 954
|
You don't get a choice what the users of Enigma do- they can allocate things without freeing them all they want. In fact, that's probably the majority of memory allocation in GM, so if they can switch over to a better reference system and then free stuff up they'll probably have the edge on you there.
On type tracing, calling instance_change() doesn't necessarily change the type of the variable- only if it calls the destroy event and then the create event, which would invalidate a lot more than member types. That's not really a common practice in any case, which is fine because the great thing about tracing is it can be applied selectively.
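The "apply it selectively and fall back" idea can be shown with a toy guard in Python. This is a sketch of the guard-and-fallback pattern only, not a tracing JIT; the `GuardedAdd` class and the rule that a string cannot be added to a real are assumptions made for the example.

```python
class GuardedAdd:
    """Toy type guard: take a specialised path for reals, else fall back."""

    def __init__(self):
        self.fast_hits = 0
        self.fallbacks = 0

    def __call__(self, a, b):
        if type(a) is float and type(b) is float:  # guard on observed types
            self.fast_hits += 1
            return a + b                           # specialised path
        self.fallbacks += 1
        return self._generic(a, b)                 # guard failed

    @staticmethod
    def _generic(a, b):
        # Assumed generic semantics: strings concatenate, reals add,
        # and mixing the two is an error.
        if isinstance(a, str) != isinstance(b, str):
            raise TypeError("cannot add a string and a real")
        return a + b


add = GuardedAdd()
print(add(1.0, 2.0), add("a", "b"))  # 3.0 ab
print(add.fast_hits, add.fallbacks)  # 1 1
```

In a real engine the specialised path would be compiled machine code for doubles; the point here is only that a failed guard costs one check and a detour, not correctness.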
Unless you've had your head in the sand for the past few years, you would know that GM's version of execute_string has been able to pass arguments to the string for a few versions now.
Just because you don't understand Haskell doesn't mean GM won't be able to build large, gnarly projects well- they've been talking about different ways of supporting version control, one of which would be splitting out parts of project files. This has the nice side effect of letting them scale to much bigger sizes.
Unless you have an automatic updater with the infrastructure to serve a user base the size of GM, I would bet the farm you'll end up making a mistake the magnitude of including V8 twice without a way to fix it quickly.
So anyway, let's clear up this misconception of yours about LLVM and garbage collection. All LLVM has GC-wise is a root generator to allow clients to write their own GC algorithms, and a naive, inefficient proof-of-concept collector that only works if you generate the code right.
The real advantage of LLVM is not magic optimization or garbage collection, but the flexibility to generate code exactly how you want, exactly when you want. Everything is in separate, modular libraries (including Clang) so you can reuse the same code for every place you need its functionality- unlike your G++/V8 dichotomy.
Let's look at Xcode as an example. LLVM has, in addition to its code generation and optimization libraries, a static analysis library. Xcode can separate out all the long-running stuff that would traditionally have to suck up compilation time, as well as display it to the user in a way that makes an IDE actually useful for something beyond screwing up the build system.
Once you have a nice, modular setup like this you can move things around to your heart's content. Instead of converting all the "regular" code to C++ and feeding it through GCC, then using a second parser to convert it to JavaScript for V8, you can use a single parser and code generator and eliminate the perpetual need to do everything twice, including testing to make sure it all runs exactly the same in both parsers.
Still, all this is pointless if you just get rid of the need for execute_string in the first place. 99% of its use in GM is as you described- ad-hoc reflection with resources, callbacks, and stupid misuse that can easily be replaced. The remaining 1% is using it for external sources of code like exported objects, etc.
This situation could be remedied if, first of all, reflection were better. Things like getting an object by name and filling in the gaps in GM's resource referencing system, as well as real callbacks (which are being added in some stupid way or another) will get rid of most uses of execute_string and even object_event_add.
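A minimal Python sketch of what "get a resource by name" plus real callbacks would replace. The `Resources` class, its `index_of` method and the -1 "not found" value are hypothetical stand-ins chosen for the example, not an existing GM or Enigma API.

```python
class Resources:
    """Name -> id lookup, standing in for a 'get resource by name' call."""

    def __init__(self):
        self._ids = {}

    def register(self, name):
        # Re-registering an existing name returns its original id.
        return self._ids.setdefault(name, len(self._ids))

    def index_of(self, name):
        return self._ids.get(name, -1)  # -1 assumed to mean "no such resource"


sprites = Resources()
sprites.register("spr_box_red")
sprites.register("spr_box_blue")

colour = "blue"  # e.g. user input
# Instead of execute_string("return spr_box_" + colour + ";"):
print(sprites.index_of("spr_box_" + colour))  # 1


def script0(x):
    return x * 2


def on_click(callback, *args):
    # A first-class function replaces execute_string("script0(...)").
    return callback(*args)


print(on_click(script0, 21))  # 42
```

Both lookups are plain dictionary and function-pointer operations, so nothing needs to be parsed or compiled at run time.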
In the other 1%, GM would probably benefit from further resource manipulation capabilities. Standardizing the bytecode and being able to store it externally would let people distribute code in online updates, for example.
The biggest change needed to 100% remove the need for execute_string would be some kind of run-time scripting engine like Lua's. Being able to create a GML context and then run user code in it might be an interesting idea, but being able to use other scripting languages like Lua itself would probably be simpler (and is of course already possible through extensions or Enigma's C++ integration).
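For a sense of what "create a context and run user code in it" could look like at its smallest, here is a Python sketch that evaluates an arithmetic expression against an explicit context. It parses Python expression syntax rather than GML, the `evaluate` function is invented for this example, and it is a whitelist-based toy, not a hardened sandbox.

```python
import ast
import math
import operator

_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(source, context):
    """Evaluate an arithmetic expression using only names from `context`."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return context[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
            return _BINOPS[type(node.op)](walk(node.left), walk(node.right))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and not node.keywords):
            func = context[node.func.id]
            return func(*[walk(arg) for arg in node.args])
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    return walk(ast.parse(source, mode="eval"))


context = {"sqrt": math.sqrt, "hp": 40}
print(evaluate("sqrt(1 + 10) / 15", context))
print(evaluate("hp * 2 + 1", context))  # 81
```

Only the names placed in `context` are visible to the user's expression, and any syntax outside the whitelist is rejected, which is the property a per-context scripting engine would need to provide.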
|
|
|
Logged
|
|
|
|
Josh @ Dreamland
|
|
Reply #27 Posted on: June 20, 2011, 12:45:21 pm |
|
|
Prince of all Goldfish
Location: Pittsburgh, PA, USA Joined: Feb 2008
Posts: 2950
|
Well played.
|
|
|
Logged
|
|
|
|
Post made June 21, 2011, 04:28:18 am was deleted at the author's request.
|
|
|