The test harness doesn't fail in a few scenarios where it should. For example, when it fails to post a comment or fails to post the code coverage, it does not mark the job as failed.
I consider these to be special cases: things like code coverage not posting should normally never occur, and they really shouldn't fail the test harness. For example, if the code coverage server is having issues that prevent EnigmaBot from commenting, that's not really our fault and the test harness should not fail. The build log should contain all of the information necessary to decide whether to merge a pull request, such as the change in code coverage or where image differences were detected, and it already does.
I don't see a problem here at all really.
If there is a regression and the bot fails to upload it, we would never know until the bot starts working again. This is a pretty big issue.
I was thinking we'd know because the log would say differences were detected, but I see now. Yeah, there's no way we'd know, and this could have already happened to us: we could have merged a pull request thinking it was passing when it really wasn't.
In that case, I support changing it to fail, but only in these specific cases.
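For reference, here is a rough sketch of what that change could look like. It assumes a Python-style harness, which may not match how ours is actually written, and the names `run_reporting_steps`, `exit_code`, and the step names are hypothetical, made up for illustration. The idea is to run each reporting step, keep going if one fails so the build log stays complete, and then mark the job as failed if any step could not finish.

```python
import sys
from typing import Callable, Dict, List


def run_reporting_steps(steps: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every reporting step and return the names of the ones that raised.

    `steps` maps a hypothetical step name (e.g. "post_comment") to a callable
    that performs the step and raises on failure.
    """
    failed = []
    for name, step in steps.items():
        try:
            step()
        except Exception as exc:
            # Log the failure but keep going, so later steps still run
            # and the build log stays as complete as possible.
            print(f"reporting step '{name}' failed: {exc}", file=sys.stderr)
            failed.append(name)
    return failed


def exit_code(tests_passed: bool, failed_steps: List[str]) -> int:
    """Return 0 only if the tests passed and every reporting step completed."""
    return 0 if tests_passed and not failed_steps else 1
```

The harness would then end with something like `sys.exit(exit_code(tests_passed, failed_steps))`, so a comment or coverage upload that silently fails can no longer leave the job green.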