General fluff => General ENIGMA => Topic started by: Josh @ Dreamland on December 14, 2012, 01:57:26 PM

Title: EGMJS
Post by: Josh @ Dreamland on December 14, 2012, 01:57:26 PM
TGMG's interested in maintaining the EGMJS port I started a while back, and doing that under the new parser will be pretty easy, imo.

This topic is to try to make it even easier for him.

I will record general notes to all implementers on the Wiki (http://enigma-dev.org/docs/Wiki/Adding_a_language). Notes specifically concerning my thoughts on the implementation of EGMJS will go in this topic.

My first concern is on how TGMG will load definitions. Presently, ENIGMA uses a central JDI context (http://enigma-dev.org/docs/jdi/docs/Doxygen/html/classjdi_1_1context.html) to store its definitions. Since JDI is inherently a C++ parser, this is done by invoking it on the engine file directly. JDI is not a JavaScript parser. However, JDI's structure is easy to figure out, and JavaScript is capable of reflection. The way I see it, there are three ways you can go about this:

1) Choose the language that is going to host the crawler.
This can be in Java using javax.script.ScriptEngine (new javax.script.ScriptEngineManager().getEngineByName("JavaScript")), or in C++ using Google V8. Both methods have their advantages:

The bottom line is that, with this method, you need to use JavaScript reflection to give ENIGMA a list of available functions so the parser can do syntax checking.
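As a rough illustration of the reflection step, here is a minimal sketch in plain JavaScript. The `engine` object and its members are made-up stand-ins, since the layout of the real EGMJS engine is not described in this topic; the helper names are hypothetical too.

```javascript
// Hypothetical stand-in for the JavaScript engine namespace (not the real EGMJS layout).
const engine = {
  room_speed: 30,
  show_message(text) { return String(text); },
  draw_text(x, y, text) { return [x, y, text]; },
};

// Names of every own enumerable property of `ns` that is callable.
function listFunctionNames(ns) {
  return Object.keys(ns).filter((key) => typeof ns[key] === "function").sort();
}

// Names of every own enumerable property that is not callable,
// which the parser would presumably treat as global variables.
function listGlobalNames(ns) {
  return Object.keys(ns).filter((key) => typeof ns[key] !== "function").sort();
}

const functionNames = listFunctionNames(engine);
const globalNames = listGlobalNames(engine);
```

The two lists are what would be handed across to ENIGMA; how they actually cross the Java or C++ boundary depends on which host you pick.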

The other method that I can see you using is having emscripten parse the JavaScript engine, and then polling it for definition names to pack into JDI classes. This method has similar advantages. On the downside, it means that EGMJS is dependent on LLVM, which is a heavy dependency that I'm in general not fond of. On the other hand, it means that you'll be asking LLVM for the definitions and (probably) using LLVM to store the code so emscripten can compile it, which would open doors for ENIGMA to compile to other languages for which LLVM has pretty-printers. It might also introduce some issues in the translation, but from what I can tell, as long as you keep within a reasonably sized subset of LLVM instructions, you should avoid such issues.

I see a great amount of merit in each option, so I do not care which method you choose. If you go with the V8/ScriptEngine method, I will be happy to have a two-megabyte JavaScript export extension. If you go with emscripten, I will be happy to have LLVM as an abstraction layer. Let me know what you're thinking, though.
Title: Re: EGMJS
Post by: TGMG on December 18, 2012, 12:40:04 PM
The initial implementation will use Java's built-in Rhino JavaScript implementation, just like the original EGMJS, which used it to run your JavaScript GML parser. The reason is that it adds very little overhead, since Java is already needed for LGM, and debugging is much easier when it is not called through JNI/JNA.
V8 is a piece of cake to compile for Mac and Linux, but unfortunately, as you said, Windows is a right pain. I'm hoping to work on an ENIGMA extension which will run JavaScript as a scripting language anyway, so it will need to be dealt with.

I personally wouldn't have emscripten do the work. I really don't like the dependence on LLVM, and getting everything set up adds unnecessary headaches. I would rather keep emscripten as a developer tool, so users of ENIGMA won't need to run emscripten themselves (unless C++ extensions are used). Instead, the ENIGMA engine can be compiled by the ENIGMA developers and called by the JavaScript code which ENIGMA exports. Emscripten is too slow to run on every compile, and writing out JavaScript directly will allow a much faster run-debug cycle.
Title: Re: EGMJS
Post by: Josh @ Dreamland on December 18, 2012, 12:51:41 PM
Sounds great. As I see it, this is what you'll have to do:

1) Create methods in C++ to populate a JDI context with function and variable definitions. You'll have, for instance, a define_global(string name), and a define_function(string name, something).  I can write these for you, if you need.

2) Create Java methods which call the C++ methods. Basically, just wrappers through JNA, like we do for everything else that goes on between ENIGMA and LateralGM.

3) Create JavaScript functions through Rhino which call the JNA wrapper functions. So the JavaScript function def_function() calls the Java function DefineFunction, which uses JNA to call the C++ method define_function(). Just for example.

4) Using Rhino and JavaScript reflection, call the JavaScript functions for each member. As I left the system, each function contained a variable representing its minimum and maximum number of parameters.
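Steps 3 and 4 could look roughly like the sketch below. Everything in it is an assumption for illustration: `def_function` here only records its arguments instead of forwarding to a Java DefineFunction wrapper, and the property names `minArgs` and `maxArgs` are invented, since the actual names of the per-function bounds are not given in this topic.

```javascript
// Stand-in for the JavaScript-side def_function(); a real one would forward to
// the Java wrapper, which would call the C++ define_function() through JNA.
const recorded = [];
function def_function(name, minArgs, maxArgs) {
  recorded.push({ name, minArgs, maxArgs });
}

// Hypothetical engine functions carrying their parameter-count bounds as properties.
function show_message(text) {}
show_message.minArgs = 1;
show_message.maxArgs = 1;

function draw_text(x, y, text) {} // no bounds attached on purpose

function choose() {}
choose.minArgs = 1;
choose.maxArgs = -1; // -1 marks a variadic function

const engineMembers = { room_speed: 30, show_message, draw_text, choose };

// Walk every member and report each function, skipping plain variables.
function registerAll(ns, define) {
  for (const [name, member] of Object.entries(ns)) {
    if (typeof member !== "function") continue;
    // Fall back to the declared parameter count when no bounds are attached.
    const min = member.minArgs ?? member.length;
    const max = member.maxArgs ?? member.length;
    define(name, min, max);
  }
}

registerAll(engineMembers, def_function);
```

Falling back to `Function.length` is just one possible default; it counts declared parameters only, so it cannot express optional or variadic parameters on its own.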

Depending on whether you want to support overloading, they may need to contain more information than that. It's that information which ultimately determines the other parameters to define_function(). JDI supports storing any type of function C++ supports; it's up to you how many to simulate in the JavaScript engine. It may be that all define_function() needs is the name, the parameter names, and the minimum/maximum parameter counts (for variadic functions, the maximum would be -1).
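To make the minimal case concrete, here is a sketch of a definition record holding just a name, parameter names, and minimum/maximum counts, plus the argument-count check a syntax checker might run against it. The record shape and the helper names (`makeDefinition`, `acceptsArgCount`) are hypothetical and are not part of JDI; only the -1 convention for variadic functions comes from the paragraph above.

```javascript
// Maximum-count marker for variadic functions, as described above.
const VARIADIC = -1;

// Build an immutable definition record, rejecting inconsistent bounds.
function makeDefinition(name, paramNames, minArgs, maxArgs) {
  if (minArgs < 0) {
    throw new RangeError("minArgs must be non-negative");
  }
  if (maxArgs !== VARIADIC && maxArgs < minArgs) {
    throw new RangeError("maxArgs must be -1 or at least minArgs");
  }
  return Object.freeze({ name, paramNames: [...paramNames], minArgs, maxArgs });
}

// True when a call with `count` arguments fits the definition's bounds.
function acceptsArgCount(def, count) {
  if (count < def.minArgs) return false;
  return def.maxArgs === VARIADIC || count <= def.maxArgs;
}

const showMessageDef = makeDefinition("show_message", ["text"], 1, 1);
const chooseDef = makeDefinition("choose", ["values"], 1, VARIADIC);
```

With these records, `acceptsArgCount(showMessageDef, 2)` is false while `acceptsArgCount(chooseDef, 50)` is true. Supporting overloading would need more than this, for example several records per name.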