ffmpeg

Reporter: RobertBColton  |  Status: open  |  Last Modified: December 13, 2020, 08:26:30 am

GMSv1.4 uses ffmpeg to convert all the sounds to OGG (compressed) or RAW PCM (uncompressed) data before appending it to the executable. This makes sense because we technically do the same for images already by converting them all to 32-bit BGRA instead of keeping them in the source format. I'm making the argument that this is a good idea because in any other case the editor is not actually doing anything for the user. Supporting all of the codecs is a pain in the ass because of licensing concerns. We've been down that road, and it's why mp3 doesn't work at all right now.

If the editor handled the conversion, your project could keep the original audio files as is and dynamically rebuild all the sounds in an automated way for the selected target. One caveat here is that ffmpeg cannot handle MIDI, but we could leave libfluidsynth on the table as an option. Preserving MIDI support may still be desirable in case of the legacy DirectSound/DirectMusic systems.

Regardless, this strategy is particularly useful in supporting platforms like Xbox (XAudio) and Android. SDL is not a solution in those cases because its audio is not positional and there is no support for environmental distortions.

fundies  
I am not a fan of GM's approach of converting audio files. Recompressing an mp3 to ogg will lose quality. Decompressing an mp3 to wav or flac will increase file size. Some formats like midi and mod are akin to svg and are extremely small because they're constructed sounds. There are reasons to use any of these formats, and I think converting them blindly is a poor idea. Plus, it's not something you would want to do every run in the engine either. We can provide tools/options to convert files when importing audio in the IDE, and we can provide better warnings in the engine for when a user tries to load or play an unsupported format. I believe I've added similar warnings for when a user tries to load a png without the libpng extension enabled. FFmpeg also depends on all these codecs internally. Ideally I would prefer we scrap alure and make each codec an extension that decompresses audio files into a PCM format that any sound system can use at runtime, essentially writing our own alure that isn't tied strictly to OpenAL. This approach is done with the libGME extension but was never generalized for other codecs.
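
To put rough numbers on the file size point, here is a back-of-the-envelope sketch. The track length, the 128 kbps bitrate, and CD-style PCM parameters (44.1 kHz, stereo, 16-bit) are illustrative assumptions, not measurements from any ENIGMA project.

```python
def pcm_bytes(seconds, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Size of raw PCM audio: every sample of every channel is stored."""
    return seconds * sample_rate * channels * bytes_per_sample


def cbr_bytes(seconds, bitrate_kbps=128):
    """Approximate size of a constant-bitrate stream (1 kbps = 1000 bits/s)."""
    return seconds * bitrate_kbps * 1000 // 8


three_minutes = 180  # assumed track length in seconds
raw = pcm_bytes(three_minutes)
mp3 = cbr_bytes(three_minutes)
print(raw, mp3, round(raw / mp3, 1))  # prints: 31752000 2880000 11.0
```

Under these assumptions the decoded track is about eleven times larger than the compressed one. FLAC would land somewhere in between, since it compresses losslessly.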
RobertBColton  

Recompressing a mp3 to ogg will lose quality.

Rather than a binary compressed option, we can make the ENIGMA proto option a codec and support FLAC.

Some formats like midi and mod are akin to svg and are extremely small because they're constructed sounds.

I was thinking we could still support MIDI if the user just leaves it uncompressed. Same with any of the non-ffmpeg formats, we would just attach those to the exe as we do now. Optionally, add it to ENIGMA's codec dropdown.

Decompressing an mp3 to wav or flac will increase file size.

Right now, I can't even play an mp3 in ENIGMA at all.

converting them blindly is a poor idea

That's why we should implement the target and quality options for the user that already exist in LGM's UI.

you would want to do every run in the engine either

We can cache, there's nothing wrong with that.
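
A minimal sketch of what such a cache could look like, assuming converted sounds are keyed by a hash of the source bytes plus the target settings. `cache_key`, `ConversionCache`, and the `convert` callback are hypothetical names for illustration; a real build would presumably store results on disk rather than in a dict.

```python
import hashlib


def cache_key(source_bytes, codec, quality):
    """Key that changes when the source audio or the target settings change."""
    h = hashlib.sha256()
    h.update(source_bytes)
    h.update(f"|{codec}|{quality}".encode("utf-8"))
    return h.hexdigest()


class ConversionCache:
    """In-memory stand-in for a build cache of converted sounds."""

    def __init__(self):
        self._store = {}
        self.conversions = 0  # how many times the converter actually ran

    def get(self, source_bytes, codec, quality, convert):
        key = cache_key(source_bytes, codec, quality)
        if key not in self._store:
            # Cache miss: run the (expensive) conversion once and remember it.
            self._store[key] = convert(source_bytes, codec, quality)
            self.conversions += 1
        return self._store[key]
```

With this shape, rebuilding an unchanged project would skip every conversion, and changing only the codec or quality of one sound would reconvert just that sound.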

Essentially writing our own alure that isn't tied strictly to openal.

I know how to do that, but why do it that way when Alure recognizing a codec is just a matter of it being linked? Why can't your hypothetical codec extensions just link the codec library? Do they need to do anything else?

fundies  

As I said, I'm fine with providing options to convert things in the IDE, but we should really be dumping alure for more generic codec extensions that will work with any audio system.
RobertBColton  

I know how to do that, but why do it that way when Alure recognizing a codec is just a matter of it being linked? Why can't your hypothetical codec extensions just link the codec library? Do they need to do anything else?
fundies  

You need a generic interface to decode the audio into the generic PCM format that every audio system uses. This is what alure does, but specifically for all enabled codecs -> OpenAL. Every codec will have an API for this, but you will need a compatibility layer to pass the decoded streams to each audio system, and you will want some map of enabled codecs like we use for image formats.
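
The shape of that layer can be sketched as a registry mapping file extensions to decoder functions that all return the same PCM structure. This is only an illustration in Python using the standard `wave` module as the one "enabled codec"; `PCM`, `DECODERS`, `register_decoder`, and `decode` are hypothetical names, not ENIGMA or alure APIs.

```python
import io
import os
import wave
from dataclasses import dataclass


@dataclass
class PCM:
    """Decoded audio in a form any audio system could consume."""
    sample_rate: int
    channels: int
    sample_width: int  # bytes per sample
    frames: bytes


DECODERS = {}  # hypothetical map of enabled codecs: extension -> decoder


def register_decoder(extension):
    """Decorator that enables a codec for the given file extension."""
    def wrap(func):
        DECODERS[extension.lower()] = func
        return func
    return wrap


@register_decoder(".wav")
def decode_wav(data):
    with wave.open(io.BytesIO(data), "rb") as w:
        return PCM(w.getframerate(), w.getnchannels(), w.getsampwidth(),
                   w.readframes(w.getnframes()))


def decode(filename, data):
    """Pick a decoder by extension, or fail with a clear message."""
    ext = os.path.splitext(filename)[1].lower()
    decoder = DECODERS.get(ext)
    if decoder is None:
        raise ValueError(f"no codec enabled for {filename!r}")
    return decoder(data)
```

An OGG, FLAC, or MIDI extension would register its own decoder the same way, and each audio system would only ever need to accept a `PCM` value. The error path is where the "unsupported format" warning would go.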
RobertBColton  

I didn't think you were being so kind as to actually consider the non-OpenAL systems, but ok, good.

I think I still favor my way, because it can be combined with your way. I'm just proposing to replace those 4 compressed/streamed options that YYG has with a single drop down to select codec (None, OGG, FLAC, MIDI). None in this case would do nothing at build time, the default option, so really MIDI wouldn't be needed as an option. Then to have a single checkbox for streaming. Possibly we could repurpose sound kind too, if we want to use friendlier terminology for novices. Otherwise your editor does nothing for you.
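
As a rough illustration of the proposed options, the per-sound settings could reduce to one codec choice plus one streaming flag, with the default codec meaning "attach the file untouched". The names below (`Codec`, `SoundSettings`, `build_action`) are hypothetical and do not correspond to existing LGM or ENIGMA types.

```python
from dataclasses import dataclass
from enum import Enum


class Codec(Enum):
    NONE = "none"  # default: do nothing at build time
    OGG = "ogg"
    FLAC = "flac"


@dataclass
class SoundSettings:
    codec: Codec = Codec.NONE
    streamed: bool = False


def build_action(source_name, settings):
    """Describe what the build would do with one sound resource."""
    if settings.codec is Codec.NONE:
        # MIDI and other unconverted formats are attached as they are now.
        return ("copy", source_name, settings.streamed)
    stem = source_name.rsplit(".", 1)[0]
    return ("convert", f"{stem}.{settings.codec.value}", settings.streamed)
```

For example, `build_action("theme.mid", SoundSettings())` yields a plain copy, while `build_action("theme.mp3", SoundSettings(Codec.OGG, streamed=True))` yields a conversion to `theme.ogg` marked as streamed.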

Another thing, GM does support SVGs in a way: it supports that flash animation format for sprites. It's a vector graphics format, and you can load it in as a sprite, and GM will draw it as vectors at runtime. The advantage being you can easily zoom. The audio is different in that there's not really a reason to "zoom" MIDI sounds (e.g., shorten or lengthen them), I don't think? I'm not the type of person to argue what type of data the users want to have in their games, but without an API to query the MIDI contents, they would probably be better served putting the MIDI as an include file if they want to do their own processing on it.

RobertBColton  

@fundies Actually... there's libffmpeg... we could just use that for that layer. Although then the user doesn't get to toggle individual formats and MIDI is not supported. Anyway, the argument here is doing it at compile time vs run time basically, that's the real difference. Bigger executable sizes (not exactly with compression options) in this case actually lead to faster load times.
fundies  

I have no objection to allowing users to convert files using the IDE. However, my issue is with the engine. I do not want the engine calling ffmpeg or any other encoder. The engine should be able to decode any codec you have enabled for any audio system. All codecs do is decompress/decode audio to PCM, and any audio system can play PCM. You can convert the files in the IDE so the engine has fewer dependencies, but as I said this will come at a cost to either quality or file size. Plus, this would not work for sounds loaded externally.
RobertBColton  

I mean the asset pipeline, emake, in CompilerSource sounds modules, not the engine.
fundies  

@fundies Actually... there's libffmpeg... we could just use that for that layer. Although then the user doesn't get to toggle individual formats and MIDI is not supported. Anyway, the argument here is doing it at compile time vs run time basically, that's the real difference. Bigger executable sizes (not exactly with compression options) in this case actually lead to faster load times.

Using libffmpeg is an all-or-nothing option, which I am not fond of. I want users to be able to choose which codecs they want and be able to add ones unsupported by something like ffmpeg (fluidsynth, libdumb) in a generic interface to our audio systems.

RobertBColton  

Ok two more things to say before I sit on this for a while. I added a table to OP. I would feel better about adding many codec extensions if ENIGMA had extension grouping (to pick checkbox easier in UI), otherwise I think it would feel messy, but so does our current image extensions. Extension searching like VS Code too would help. I think it just feels weird to me to argue about this, when none of it will work on any of the serious platforms like Android or Xbox anyway. I think even if I make these codec extensions, you still won't be able to get any of those libs to build on Android. Anyway, I don't know, we may as well do your way anyway, since right now, you can't even load an MP3 at all. Then later the ffmpeg can be optional too.
fundies  

Ok two more things to say before I sit on this for a while. I added a table to OP. I would feel better about adding many codec extensions if ENIGMA had extension grouping (to pick checkbox easier in UI), otherwise I think it would feel messy, but so does our current image extensions. Extension searching like VS Code too would help. I think it just feels weird to me to argue about this, when none of it will work on any of the serious platforms like Android or Xbox anyway. I think even if I make these codec extensions, you still won't be able to get any of those libs to build on Android. Anyway, I don't know, we may as well do your way anyway, since right now, you can't even load an MP3 at all. Then later the ffmpeg can be optional too.

I'm not super interested in the UI of it. You can design that however you like. You can make a "codecs" page that mirrors extensions or whatever.

There is nothing stopping these codecs from working on android or xbox. The only hurdle is compiling the libraries we need for them. Unfortunately, there is no package manager for these niche platforms but in the future we could write our own using CI. It's just not something I'm ready to invest the effort in maintaining right now.

fundies  

If you're only interested in running GM games short-term I would suggest doing this kind of conversion in the throw-away GM2EGM program I wrote. I'd rather not make rgm/emake depend on ffmpeg or whatever if possible. (or you could just run a script to call ffmpeg.exe recursively over the EGM folder)
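
A script along those lines might look like the sketch below. It assumes `ffmpeg` is on the PATH and was built with libvorbis, that OGG is the desired target, and that the listed source extensions are the ones worth converting; all of these are assumptions to adjust. By default it only returns the commands it would run.

```python
import subprocess
from pathlib import Path

SOURCE_EXTENSIONS = {".mp3", ".wav", ".flac"}  # assumed set; adjust as needed


def ffmpeg_command(src, dst, quality=5):
    """ffmpeg invocation that encodes src to Ogg Vorbis at a VBR quality level."""
    return ["ffmpeg", "-y", "-i", str(src), "-c:a", "libvorbis",
            "-q:a", str(quality), str(dst)]


def plan_conversions(root):
    """Yield (source, destination) pairs for every convertible file under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in SOURCE_EXTENSIONS:
            yield path, path.with_suffix(".ogg")


def convert_tree(root, dry_run=True):
    """Build one ffmpeg command per sound; run them only if dry_run is False."""
    commands = [ffmpeg_command(s, d) for s, d in plan_conversions(root)]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # requires ffmpeg on PATH
    return commands
```

Calling `convert_tree("path/to/project.egm")` lists the planned commands, and passing `dry_run=False` executes them. The project path here is a placeholder.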
RobertBColton  

Ok, one other thing, I might actually call it Compression (None, Lossy, Lossless) to make it user friendly rather than YoYo's binary compressed or not compressed. We can check a define to see whether ffmpeg should be enabled. All of this abstraction and extensibility, I feel, later needs support for general installation, which goes back to our oldest open issue with respect to having a flag to use zlib. That's all for down the road though.

Also, see the other issue where I reported a user's OGG to OpenAL Soft and they blame the way we link the vorbis codec.
kcat/openal-soft#505

fundies  

I will whip up an installer once RGM is ready. It's on my todo list, but it's pointless until using RGM is feasible.
fundies  

Also, see the other issue where I reported a user's OGG to OpenAL Soft and they blame the way we link the vorbis codec.
kcat/openal-soft#505

I can't reply there because you have me blocked on GitHub, but it looks like he's blaming libsndfile, not our linking. The linking suggestion is just a workaround. You can try adding -lvorbisfile to the Makefile to see if it fixes it, but either way you should probably forward the bug to libsndfile.

| Compile-Time | Run-Time |
| --- | --- |
| ffmpeg dependency | more engine-codec dependency |
| less engine-codec dependency | less portable |
| slower build unless cache | faster build times |
| faster load times | sometimes slower load times |
| automated workflow | dynamic sound_add |