ENIGMA Forums

General fluff => Off-Topic => Topic started by: faissaloo on January 27, 2018, 04:29:08 pm

Title: The most efficient hashing function for the tags in .object.gmx files
Post by: faissaloo on January 27, 2018, 04:29:08 pm
I ran a little experiment to come up with the most efficient hashing algorithm I could for the XML tags use in .object.gmx files and I came up with this
Code: [Select]
<16-bit int> hash(string):
        result=0
        for i in string:
                result<<=1
                result^=i
        return result
This algorithm has no collisions for any of the strings used for tags in .object.gmx files, only needs 16-bits and only uses the fastest operations on x86. You may be able to get better results if you were to pad out all strings read in and process them in chunks rather than character by character.
If you only need a hashing algorithm that is specialised for its area, and its area is either object properties or action properties, this creates a unique hash and only needs 16-bits too, it also knocks off an operation as a bonus:
Code: [Select]
<16-bit int> hash(string):
        result=0
        for i in string:
                result+=i
        return result
The above however does have a collision if used for argument properties (and I mean, it's literally just one collision, between 'sprite' and 'string') or if used as a hashing algorithm for all tags in the file (this case has 2 collisions, 'sprite' with 'string' and 'script' with 'events').

The code I used for testing is below, under the MIT licence:
https://gist.github.com/faissaloo/0c1dc4b8691f846249f0564f0e11bcf7

Enjoy!