No, they all work fine, including the new string_length_utf8. GM Studio's, on the other hand, is still not working as of v1.3.
If by that you mean making a separate function just for UTF-8, then that is exactly what I am afraid of. We DON'T need several functions for that. We should use UTF-8 everywhere and have just one.
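To make the "one function" idea concrete, here is a minimal sketch of a length function that works on any UTF-8 input, plain ASCII included. The name utf8_length is hypothetical and this is not ENIGMA's actual code; it also assumes the input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not ENIGMA's implementation.
// Counts code points in a UTF-8 string by skipping continuation bytes,
// which always have the bit pattern 10xxxxxx.
// Assumption: the input is valid UTF-8 (no validation is done here).
std::size_t utf8_length(const std::string& str) {
  std::size_t count = 0;
  for (unsigned char c : str) {
    if ((c & 0xC0) != 0x80) ++count;  // lead byte or ASCII byte starts a character
  }
  return count;
}
```

For pure ASCII text every byte starts a character, so the result matches the byte length and existing ASCII-only callers would see no change.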
For the same reason we currently offer two overloads of string_length: one accepts const char*, the other accepts std::string. If all you have is const char*, then length is O(N), but does not necessarily entail a copy. If we only accept std::string, a copy becomes necessary, so now we're O(N) in both time and memory.
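A rough sketch of that overload pair, as I understand the trade-off. The sketch namespace is made up for illustration and the bodies are my assumption, not a copy of the ENIGMA source; both versions count bytes, not UTF-8 characters.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

namespace sketch {  // hypothetical namespace, for illustration only

// const char* overload: O(N) scan for the terminating NUL byte,
// but no std::string is constructed, so nothing is copied.
std::size_t string_length(const char* str) { return std::strlen(str); }

// std::string overload: O(1), since the string already stores its size.
// Taking it by const reference avoids copying an existing std::string.
std::size_t string_length(const std::string& str) { return str.size(); }

}  // namespace sketch
```

A string literal such as "abc" binds to the const char* overload without creating a temporary std::string, which is the copy being avoided. Note that the two overloads can disagree when a std::string contains an embedded NUL byte.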
And const char* is only possible in ENIGMA if used explicitly, right? Like "char array; string_length(array);"? Still, the code for "string_length(const char* str)" seems a little fishy. It doesn't seem to check for end-of-line characters or anything like that. It just checks that the value is not null.
In order to have a utf8_string whose complexity is the same as std::string (which is completely possible), I must keep TWO strings. The first is a string of at most 4N chars (bytes), where N is the length of the string in characters. The second is a string of size_ts that denotes the starting byte of each character. So it'll usually look like ⟨0, 1, 2, 3, ...⟩ or ⟨0, 2, 4, 6, ...⟩ but will often be much uglier, primarily where other languages use English (ASCII) punctuation.
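As a minimal sketch of that two-string layout: the class name utf8_indexed_string and its members are hypothetical, std::vector stands in for the "string of size_ts", and the input is assumed to be valid UTF-8.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch of the two-container layout; not the real class.
// Assumption: the input is valid UTF-8.
class utf8_indexed_string {
  std::string bytes_;                 // raw UTF-8 data, at most 4 bytes per character
  std::vector<std::size_t> offsets_;  // starting byte of each character

 public:
  explicit utf8_indexed_string(const std::string& utf8) : bytes_(utf8) {
    // One O(N) pass at construction: record every byte that is not a
    // continuation byte (10xxxxxx) as the start of a character.
    for (std::size_t i = 0; i < bytes_.size(); ++i) {
      unsigned char c = static_cast<unsigned char>(bytes_[i]);
      if ((c & 0xC0) != 0x80) offsets_.push_back(i);
    }
  }

  // O(1): number of characters, not bytes.
  std::size_t length() const { return offsets_.size(); }

  // O(1): starting byte of the i-th character (throws if out of range).
  std::size_t byte_offset(std::size_t i) const { return offsets_.at(i); }

  // The bytes of the i-th character, found by two O(1) index lookups.
  std::string char_at(std::size_t i) const {
    std::size_t begin = offsets_.at(i);
    std::size_t end = (i + 1 < offsets_.size()) ? offsets_[i + 1] : bytes_.size();
    return bytes_.substr(begin, end - begin);
  }
};
```

For "a", "é", "z" stored as four bytes, the offsets come out as ⟨0, 1, 3⟩. The cost is the extra size_t per character, which is the memory overhead being warned about.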
My idea was just to use UTF-16 or something, which apparently has the best memory/speed tradeoff for most of Earth's languages. But I guess it doesn't really matter. Most of the time (i.e. English) the complexity won't be large enough to cause real slowdowns. Maybe we could figure out a way to optimize that getUnicodeCharacter() though.
For your interest, I'll write the class. But I am disclaiming liability for slowdown from you two constructing one or more strings in addition to the simple ASCII strings you are usually asked to operate on.
I guess there is no need for that. I just wanted a way to signal that all std::strings in our code are actually UTF-8 encoded, because by default std::string isn't meant to be. In the same way, I don't want two versions of every string manipulation function.