This is happening for me on Windows in compile mode. Steps to reproduce:
- Create a new game with one object (no room!)
- Compile an exe of it (opening the exe will crash)
- Add to the create event a call to
- Compile an exe again, opening it this time will not crash <
current_captionglobal initialization. I figured it out by removing
-sfrom the SHELL Makefile for compile mode, then adding
-g3 -ggdbto get the symbols. I did a full clean rebuild and got the below stack trace. Note the length 12 is because I changed the two vars to initialize to
#18569 0x0000000000535593 in var::var<char >(char const (&) ) [clone .constprop.0] () #18570 0x0000000000535593 in var::var<char >(char const (&) ) [clone .constprop.0] () #18571 0x0000000000535593 in var::var<char >(char const (&) ) [clone .constprop.0] () #18572 0x000000000053603d in global constructors keyed to 65535_0_SHELLmain.o.70040 () #18573 0x0000000000485b32 in __do_global_ctors () at E:/mingwbuild/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/gccmain.c:67 #18574 0x0000000000485b8f in __main () at E:/mingwbuild/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/gccmain.c:83 #18575 0x000000000040137f in __tmainCRTStartup () at E:/mingwbuild/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:329 #18576 0x00000000004014db in WinMainCRTStartup () at E:/mingwbuild/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:195
It's not clear to me why calling
string_heightfixes anything. It's also not clear to me that we're seeing recursion. Sounds like an undefined access that adding another method call is masking. I suggest
room_captionis being initialized to
room.caption, which is OOB. Maybe declaring that create event pushes another string in the way. Either way, the problem is likely with how
room_captionis initialized. Not with recursion between two constructors.
Here's the thing, I deleted my earlier comment. If you change
draw_textthat also works, as well as just about any other function in
draw_get_fontwhich do not fix it. Rusky suggested it might just be functions that take var, but I tried
stringwhich do not fix it. Pretty weird, but I think it's just related to the fact that this only occurs with
-fltowhich must be separating the code segments into Enclaves.
Now let me explain why I think it's recursion other than the stack being 20,000 frames of var construction. If I add an overload to
variant_string_wrapper that takes
const char* then the recursion goes away but I end up with a memmove segfault, which I think is caused by the release just below that.
That's a weird problem to encounter. And that solution shouldn't work. It seems GCC is up to its old antics of trying to hop from type A to type B via var. That isn't legal and I don't understand how it's doing that. Can you get line numbers?
Right, no I don't think I can get line numbers because it only occurs with
-fltowhich destroys that. I mean, it's giving line numbers, they just happen to be in linker-separated modules instead of the original line numbers. If you have an idea of what to add to the
-g3 -ggdbI used to get the original line numbers, please share.
Ok, so I decided to do some printing from the constructors to get us some more information and see all the variables involved here as well as exactly which constructors are running. I discovered that as far as strings go, it is just the caption basically. However, I also realized that we are casting the room caption every step to
std::stringand checking if it's empty to set the window caption.
I had to comment that out for a second in order to see first the constructors that are being run.
********* EXECUTE: C:/msys64/tmp/egm13900845851131715855.exe rvalue_ref string variant_string_wrapper() this be the room caption variant(const char *str) this be the room caption rvalue_ref string variant_string_wrapper() this be the current caption variant(const char *str) this be the current caption Initializing audio system...
I then gave this a go in compile mode and the stack trace was totally different. I even uncommented the setting of the room caption again and got the same.
Thread 1 received signal SIGSEGV, Segmentation fault. 0x00000000005339d0 in enigma::propagate_locals(enigma::object_planar*) [clone .cold] () (gdb) bt #0 0x00000000005339d0 in enigma::propagate_locals(enigma::object_planar*) [clone .cold] () #1 0x0000000000992850 in ?? () Backtrace stopped: previous frame inner to this frame (corrupt stack?) (gdb)
Once I set the two caption vars back to empty string literals, I was able to get back to the original issue.
#0 0x0000000000532c62 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) [clone .isra.0] [clone .lto_priv.6] () #1 0x0000000000532cc4 in var::var<char >(char const (&) ) [clone .constprop.0] ()
So it just seems it's something to do with the linker optimization not liking vars constructed from empty string literals.
So, with #1754 merged, I'm back digging for this one again. I discovered that
.constpropmeans constant propagation: https://stackoverflow.com/a/14796736
I see now we have ONLY the memmove segfault. So, I think what was going on with
current_caption then is that the optimizer was inlining it? And I don't know where to look for this memmove.
#0 0x00007ffb95c74400 in msvcrt!memmove () from C:\WINDOWS\System32\msvcrt.dll #1 0x0000000000536616 in ?? () #2 0x0000000000482d62 in ?? () #3 0x000000000040137f in ?? () #4 0x00000000004014db in ?? () #5 0x00007ffb95d44034 in KERNEL32!BaseThreadInitThunk () from C:\WINDOWS\System32\kernel32.dll #6 0x00007ffb97c53691 in ntdll!RtlUserThreadStart () from C:\WINDOWS\SYSTEM32\ntdll.dll #7 0x0000000000000000 in ?? () Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Ok, I attempted to reproduce it outside of ENIGMA. It didn't work out ultimately. I tried recreating the same source structure, static linking the runtime, etc.
Download Attempted Reproducer: var_crash.zip
Now going to look at asan for leaks in #1756 to see if I can find it that way.
Ok, so now, after looking at the disassembly and learning some new tricks, I am actually getting close to solving this one. The first thing I want to point out is that in b7c9dd5 I originally removed the default initialization of the room background and view arrays, because I felt it was redundant since they are loaded from the executable anyway.
Here is the complete decompiled code from an empty x64 bit game executable: Decompiled Empty Game.zip
The crash occurs after a jump from line 321,853
If we go to the label
addr_533636_16 we see it falls smack dab between setting of
room_caption to null and the initialization of the background var arrays.
_ZN11enigma_user12room_captionE = 0; // line 321,769 ... addr_533636_16: // line 321,791 ... _ZN11enigma_user18background_visibleE = 0; // line 321,872
If we look back in the room source code, we can see there is no other vars between the caption and background visible, only two ints. If we look for these ints in the decompiled code, we can see they were grouped with other ints such as the room width and height.
So, here's the constructor order for
room_captionin run mode.
# define rvalue_ref && std::string(x) variant_string_wrapper(std::string rvalue_ref x) variant(const char *str) var(const T &v)
It's matching with the templated
var constructor which calls variant's C-string constructor, which calls its parent rvalue_ref string wrapper constructor. In compile mode, based on my analysis of the decompiled game, it seems the problem is that the linker is inlining all of these constructors for the global initialization of the caption.
@JoshDreamland alright, I got a lot more info by decompiling. Where should we go from here?
Disabling all extensions except Paths extension and setting Widgets/Audio to None has changed the stack.
Download Minimum SDL Executable: minimum-sdl-exe.zip
(gdb) bt #0 0x00007ffb97c3c283 in ntdll!RtlCreateHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll #1 0x00007ffb97bfd26c in ntdll!RtlSizeHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll #2 0x00007ffb97bf9725 in ntdll!RtlAllocateHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll #3 0x00007ffb97cdf55e in ntdll!RtlpNtSetValueKey () from C:\WINDOWS\SYSTEM32\ntdll.dll #4 0x00007ffb97c8b530 in ntdll!memset () from C:\WINDOWS\SYSTEM32\ntdll.dll #5 0x00007ffb97bf9725 in ntdll!RtlAllocateHeap () from C:\WINDOWS\SYSTEM32\ntdll.dll #6 0x00007ffb95c19960 in msvcrt!malloc () from C:\WINDOWS\System32\msvcrt.dll #7 0x000000000050aa69 in operator new(unsigned long long) () #8 0x000000000050b955 in global constructors keyed to 65535_0_SHELLmain.o.51871 () #9 0x000000000050b9e0 in global constructors keyed to 65535_0_SHELLmain.o.51871 () (gdb)