Arrays and Data Structures and Scripts

Reporter: RobertBColton  |  Status: open  |  Last Modified: June 17, 2019, 12:50:14 PM

This is something that people are starting to take an interest in. They would like to be able to pass var arrays to scripts. Obviously, we don't want to kill performance.

  • GMS arrays can be stored inside data structures.
  • GMS arrays are passed to scripts copy-on-write (COW) by default.
  • GMS has a special by-reference accessor syntax to make mutating arrays in scripts even faster (e.g., arr[@ pos] = value).

So there are two ways to solve the C++ COW problem: an advanced optimizer, or returning a COW supervisor from operator[]. The by-reference accessor syntax is simply replaced with the array_set function, the same way the DS accessors are implemented; array_set is added by #1750.

Just a few more things to consider:

  • GMS allows you to store almost anything inside data structures (arrays, resource references, other data structures, etc.). So, for example, you could put a script in a ds list, recover it later on, and pass it as an argument to script_execute.
  • GMS has special functions for adding nested maps/lists: ds_map_add_list and ds_map_add_map, for example. They are used when working with JSON, so that json_encode and json_decode work when dealing with nested maps/lists. But, outside of that, ds_map_add works normally for adding data structures to the map.

That first one should be possible right now in ENIGMA, because all of the resources are just variables aliased to their integer ids. Variant obviously overloads the integer types, so there should be no issue storing one in a list. The other two functions would need to go in our JSON extension; nobody has really requested that yet.
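To illustrate the point about resources being integer ids, here is a rough sketch. Every name in it (ScriptFn, kScriptTable, script_execute_sketch, the scr_ functions) is made up for illustration and is not ENIGMA's actual API; a std::vector<double> stands in for a ds list of variants.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch; none of these names are ENIGMA's.
using ScriptFn = double (*)(double);

double scr_double(double x) { return x * 2.0; }
double scr_negate(double x) { return -x; }

// Resource names are just aliases for integer ids.
enum ScriptId : int { SCR_DOUBLE = 0, SCR_NEGATE = 1 };

// Assumed dispatch table indexed by script id.
const ScriptFn kScriptTable[] = {scr_double, scr_negate};

// Looks up the script by its numeric id and calls it.
double script_execute_sketch(double id, double arg) {
  return kScriptTable[static_cast<std::size_t>(id)](arg);
}

// Stores script ids in a list, recovers them, and executes them.
void script_in_list_demo() {
  std::vector<double> list;    // stand-in for a ds list of variants
  list.push_back(SCR_NEGATE);  // the id converts to a plain number
  list.push_back(SCR_DOUBLE);
  assert(script_execute_sketch(list[0], 4.0) == -4.0);
  assert(script_execute_sketch(list[1], 4.0) == 8.0);
}
```

Because the id is an ordinary number, nothing special is needed to put it in a container; the only requirement is that whatever reads it back knows it is meant to be a script.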

Good to know.
The JSON functions might be one of the cases where it's desirable to have different behavior on ENIGMA: on GMS, you can't decode/encode JSON into arbitrary data structures, only ds maps.
That probably could be discussed on a different issue later on.

So there are two ways to solve the C++ COW problem: an advanced optimizer, or returning a COW supervisor from operator[].

I don't believe either of those solves the problem. A COW supervisor might be a small piece of a solution, but the real problem is that var owns its array and you can't create a second var referring to the same data.

To match GMS behavior, we need these changes:

  • There are now two array types: an "owning" array that works like classic GM, and an array "reference."
  • An expression of array type is always an array reference. This applies both to a variable used as an expression (the final a in var a; a[0] = 1; a) and to array literals.
  • Writing into an owning array always mutates the array, even if there are other references to it (it is not COW).
  • Writing into a reference mutates the array in place if the refcount is 1 (i.e. there is no owner and no other references); otherwise it copies the array first and the reference becomes an owning array of the copy.
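The rules above could be sketched roughly like this. It is only an illustration: ArrayHandle, Storage, and the method names are hypothetical and not ENIGMA's var or variant. It assumes shared storage behind a std::shared_ptr, with use_count() as the refcount, which is only reliable single-threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sketch; ArrayHandle is not ENIGMA's var or variant type.
using Storage = std::vector<double>;

class ArrayHandle {
 public:
  // An owning array, as created by `var a; a[0] = 1;` in classic GM.
  static ArrayHandle make_owner() {
    return ArrayHandle(std::make_shared<Storage>(), true);
  }
  // An array literal: a reference with no owner behind it.
  static ArrayHandle from_literal(Storage values) {
    return ArrayHandle(std::make_shared<Storage>(std::move(values)), false);
  }
  // Using an array as an expression always yields a reference.
  ArrayHandle as_reference() const { return ArrayHandle(data_, false); }

  bool is_owner() const { return owning_; }

  double get(std::size_t i) const {
    return i < data_->size() ? (*data_)[i] : 0.0;
  }

  void set(std::size_t i, double value) {
    // A reference that shares its storage copies first and becomes an owner.
    // An owner, or a sole reference, writes in place.
    if (!owning_ && data_.use_count() > 1) {
      data_ = std::make_shared<Storage>(*data_);
      owning_ = true;
    }
    if (i >= data_->size()) data_->resize(i + 1, 0.0);
    (*data_)[i] = value;
  }

 private:
  ArrayHandle(std::shared_ptr<Storage> data, bool owning)
      : data_(std::move(data)), owning_(owning) {}

  std::shared_ptr<Storage> data_;
  bool owning_;
};

// Exercises the owner-write and shared-reference-write rules.
void array_rules_demo() {
  ArrayHandle a = ArrayHandle::make_owner();
  a.set(0, 1.0);
  ArrayHandle r = a.as_reference();
  a.set(0, 2.0);  // owner write: mutates in place, visible through r
  assert(r.get(0) == 2.0);
  r.set(0, 5.0);  // shared reference write: copies first, r becomes an owner
  assert(a.get(0) == 2.0 && r.get(0) == 5.0 && r.is_owner());
}
```

The sketch does not cover what copying an owner should do, or how a script argument gets bound to a reference; those would need to be decided separately.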

The only reason you might need a "supervisor" here is to differentiate between reading and writing, as otherwise C++ doesn't give you that information.
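For reference, this is roughly what such a supervisor looks like in C++: operator[] returns a proxy object, reading goes through a conversion operator, and writing goes through operator=. TrackedArray and ElementProxy are hypothetical names, and the read/write counters exist only to make the two paths observable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch; TrackedArray is not an ENIGMA type.
class TrackedArray {
 public:
  class ElementProxy {
   public:
    ElementProxy(TrackedArray& array, std::size_t index)
        : array_(array), index_(index) {}
    ElementProxy(const ElementProxy&) = default;

    // Read path: the proxy is converted to a value.
    operator double() const {
      ++array_.reads_;
      return index_ < array_.data_.size() ? array_.data_[index_] : 0.0;
    }

    // Write path: a value is assigned through the proxy. A COW check
    // (copy if shared) would go here, before the store.
    ElementProxy& operator=(double value) {
      ++array_.writes_;
      if (index_ >= array_.data_.size()) array_.data_.resize(index_ + 1, 0.0);
      array_.data_[index_] = value;
      return *this;
    }

    // a[i] = a[j] reads the right-hand side, then writes the left.
    ElementProxy& operator=(const ElementProxy& other) {
      return *this = static_cast<double>(other);
    }

   private:
    TrackedArray& array_;
    std::size_t index_;
  };

  ElementProxy operator[](std::size_t index) {
    return ElementProxy(*this, index);
  }

  int reads() const { return reads_; }
  int writes() const { return writes_; }

 private:
  std::vector<double> data_;
  int reads_ = 0;
  int writes_ = 0;
};

// Shows that reads and writes take different paths through the proxy.
void proxy_demo() {
  TrackedArray a;
  a[0] = 3.0;       // write
  double x = a[0];  // read
  a[1] = a[0];      // read, then write
  assert(x == 3.0);
  assert(a.reads() == 2 && a.writes() == 2);
}
```

On its own the proxy only tells you whether an access is a read or a write; it does nothing about ownership, which is why it can be at most one piece of the solution.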

array_set is also unrelated here: its current implementation in #1750 does not change anything, and with the above changes it will still behave as if you wrote array[i] = x yourself.


As for the JSON support, I suspect a much better approach would be to store the type information in variant. That way a ds_list stays a ds_list no matter where or how you store it. Then you could simply use the normal ds_map_add for both (and ds_map_add_list/_map could just be aliases).
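A minimal sketch of that idea, assuming a tag stored next to the value. TaggedValue, ValueKind, MapSketch, map_add, and json_shape are hypothetical names for illustration, not ENIGMA's variant or its ds_map functions.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch; TaggedValue stands in for a variant with a type tag.
enum class ValueKind { Real, DsList, DsMap };

struct TaggedValue {
  ValueKind kind;
  double payload;  // the real value, or the data structure's integer id
};

TaggedValue make_real(double v) { return {ValueKind::Real, v}; }
TaggedValue make_list(int id) {
  return {ValueKind::DsList, static_cast<double>(id)};
}
TaggedValue make_map(int id) {
  return {ValueKind::DsMap, static_cast<double>(id)};
}

using MapSketch = std::map<std::string, TaggedValue>;

// One add function covers plain values and nested structures alike.
void map_add(MapSketch& m, const std::string& key, TaggedValue v) {
  m.emplace(key, v);
}

// What a JSON encoder would emit for a value, decided by the tag alone.
std::string json_shape(const TaggedValue& v) {
  switch (v.kind) {
    case ValueKind::DsList: return "array";
    case ValueKind::DsMap:  return "object";
    case ValueKind::Real:   return "number";
  }
  return "number";
}

// The list id 3 and the number 3 stay distinguishable after storage.
void tagged_value_demo() {
  MapSketch m;
  map_add(m, "score", make_real(3));
  map_add(m, "items", make_list(3));
  assert(json_shape(m.at("score")) == "number");
  assert(json_shape(m.at("items")) == "array");
}
```

Without the tag, both entries would hold the same number 3 and an encoder could not tell a nested list from a plain value, which is the gap the separate _list/_map functions fill in GMS.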