Description
Firefox is getting a rare but recurring crash on Windows due to OsRng failing to initialize.
The bug is being tracked here, and you can see the crash reports here
Since the crash message isn't public on that page, just to give an idea, these are the kinds of crash reasons we have:
failed to create an OS RNG: Error { repr: Os { code: 127, message: "OS Error 127 (FormatMessageW() returned error 15100)" } }
(15100 is ERROR_MUI_FILE_NOT_FOUND, i.e. it's unable to find the strings file to return the translated error message)
failed to create an OS RNG: Error { repr: Os { code: 127, message: "OS Error 127 (FormatMessageW() returned error 5)" } }
(5 is ERROR_ACCESS_DENIED, so the translation strings file could not be accessed)
failed to create an OS RNG: Error { repr: Os { code: -2146893801, message: "Provider type not defined." } }
failed to create an OS RNG: Error { repr: Os { code: -2146893801, message: "Tipo de proveedor no definido." } }
(and basically the same thing in different languages)
failed to create an OS RNG: Error { repr: Os { code: -2146893818, message: "Invalid Signature." } }
The FormatMessageW errors are not the problem. They just mean that the formatter was unable to look up the text of the error message, and by the time we hit that we're already erroring out.
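The actual OS error seems to mostly be error code 127 (ERROR_PROC_NOT_FOUND). The other common code, -2146893801, is 0x80090017 when read as an unsigned 32-bit value, i.e. NTE_PROV_TYPE_NOT_DEF, which matches the "Provider type not defined." message; it is a distinct CryptoAPI error rather than 127 with some extra flags set. This is concerning; it doesn't seem like this should happen, because we are only passing a known system crypto provider type (PROV_RSA_FULL) to CryptAcquireContextA. I'm unsure what the "Invalid Signature." error (-2146893818, i.e. 0x80090006, NTE_BAD_SIGNATURE) is about.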
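To see how the reported codes relate to each other, it can help to split them into the standard HRESULT fields (severity bit, facility, and 16-bit code). The following is a minimal sketch; `DecodedCode` and `decode` are hypothetical names for illustration and are not part of std or of the rand crate.

```rust
/// Fields of a 32-bit Windows status code, interpreted with the HRESULT layout.
/// (Hypothetical helper type for illustration.)
#[derive(Debug, PartialEq, Eq)]
pub struct DecodedCode {
    /// The code reinterpreted as an unsigned 32-bit value.
    pub raw: u32,
    /// Bit 31: set for failure HRESULTs.
    pub failure: bool,
    /// Bits 16..=26: the facility that produced the error.
    pub facility: u16,
    /// Bits 0..=15: the facility-specific error code.
    pub code: u16,
}

/// Split the signed code printed in the crash reason into its HRESULT fields.
pub fn decode(os_code: i32) -> DecodedCode {
    let raw = os_code as u32; // same bits, unsigned view
    DecodedCode {
        raw,
        failure: raw & 0x8000_0000 != 0,
        facility: ((raw >> 16) & 0x7FF) as u16,
        code: (raw & 0xFFFF) as u16,
    }
}
```

With this, `decode(-2146893801)` gives raw 0x80090017 with facility 9 and code 0x17, and `decode(-2146893818)` gives raw 0x80090006 with facility 9 and code 0x06. Plain 127, by contrast, has no severity bit or facility set, so it is an ordinary Win32 error code, and its low 16 bits do not match either of the other two.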
This might just be system DLLs being broken, but it's occurring regularly enough (we have almost a thousand crash reports) that we're having to work around it by disabling RandomState on systems where this doesn't work. We can do that because we've already forked HashMap for fallible allocations. Furthermore, Firefox code uses CryptAcquireContext and CryptGenRandom in a bunch of places, and those don't seem to be particularly crashy.
We should see if this is an issue on our side, or perhaps make OsRng (or perhaps just RandomState's initializer?) more resilient to this (perhaps with a more expensive fallback?).
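As a rough illustration of what a more resilient initializer could look like, the sketch below tries the OS RNG first and falls back to a weaker seed instead of panicking. Everything here is an assumption made for illustration: `hash_seed`, `SeedSource`, and `weak_seed` are hypothetical names, the OS RNG is passed in as a closure so its failure can be simulated, and the fallback shown is a cheap time-and-address mix rather than the more expensive fallback mentioned above. The fallback is not cryptographically secure, so it would weaken HashDoS resistance on affected systems.

```rust
use std::io;
use std::time::{SystemTime, UNIX_EPOCH};

/// Where the hash seed came from (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SeedSource {
    Os,
    Fallback,
}

/// Try the OS RNG; if it fails, use a weak fallback seed instead of panicking.
/// `os_rng` stands in for whatever actually calls into the system RNG.
pub fn hash_seed<F>(os_rng: F) -> (u64, SeedSource)
where
    F: FnOnce() -> io::Result<u64>,
{
    match os_rng() {
        Ok(seed) => (seed, SeedSource::Os),
        Err(_) => (weak_seed(), SeedSource::Fallback),
    }
}

/// Weak, non-cryptographic seed built from the clock and a stack address.
fn weak_seed() -> u64 {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos() as u64)
        .unwrap_or(0);
    let local = 0u8;
    let addr = &local as *const u8 as usize as u64;
    // Scramble the inputs with the SplitMix64 finalizer so nearby values diverge.
    let mut z = (nanos ^ addr.rotate_left(32)).wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}
```

A caller could simulate the failure seen in the crash reports with `hash_seed(|| Err(std::io::Error::from_raw_os_error(127)))` and check that the result is tagged `SeedSource::Fallback`. Returning the source alongside the seed lets the caller log or count how often the fallback path is taken.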