OSRng initialization crash on Windows #44911

Closed
@Manishearth

Description

Firefox is getting a rare but recurring crash on Windows due to OsRng failing to initialize.

The bug is being tracked here, and you can see the crash reports here

Since the crash message isn't public on that page, just to give an idea, these are the kinds of crash reasons we have:

  • failed to create an OS RNG: Error { repr: Os { code: 127, message: "OS Error 127 (FormatMessageW() returned error 15100)" } } (15100 is ERROR_MUI_FILE_NOT_FOUND, i.e. it's unable to find the strings file to return the translated error message)
  • failed to create an OS RNG: Error { repr: Os { code: 127, message: "OS Error 127 (FormatMessageW() returned error 5)" } } (5 is ERROR_ACCESS_DENIED, so the translation string file is inaccessible)
  • failed to create an OS RNG: Error { repr: Os { code: -2146893801, message: "Provider type not defined." } }
    • failed to create an OS RNG: Error { repr: Os { code: -2146893801, message: "Tipo de proveedor no definido." } } and basically the same thing in different languages
  • failed to create an OS RNG: Error { repr: Os { code: -2146893818, message: "Invalid Signature." } }

The FormatMessageW errors are not a problem in themselves; they only mean that the formatter was unable to look up the message text for the underlying error. If we hit that, we're erroring out already.

The actual OS error seems to mostly be error code 127 (ERROR_PROC_NOT_FOUND). I believe -2146893801 is also one of these with some extra flags set. This is concerning; it doesn't seem like this should happen, because we are only passing a known system crypto provider type (PROV_RSA_FULL) to CryptAcquireContextA. I'm unsure what the "invalid signature" error is about.
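To make the two negative codes easier to compare against winerror.h, here is a minimal sketch that reinterprets the signed value as an unsigned HRESULT and splits out the standard severity, facility, and code fields. `HresultParts` and `decode_hresult` are hypothetical names made up for this example. Decoded this way, -2146893801 is 0x80090017 and -2146893818 is 0x80090006, both in facility 9; as far as I can tell those are listed in winerror.h as NTE_PROV_TYPE_NOT_DEF and NTE_BAD_SIGNATURE, which matches the messages above. The low 16 bits are 0x17 and 0x06 rather than 127, so these may be distinct errors rather than 127 with extra flags.

```rust
/// Fields of a Windows HRESULT, split out of the signed code that
/// shows up in `Os { code: ... }`. Hypothetical helper type.
#[derive(Debug, PartialEq, Eq)]
pub struct HresultParts {
    /// The same 32 bits, viewed as unsigned (the form used in winerror.h).
    pub hex: u32,
    /// Severity bit (bit 31): set for failure codes.
    pub failure: bool,
    /// Facility field (bits 16..=26).
    pub facility: u16,
    /// Facility-specific code (bits 0..=15).
    pub code: u16,
}

/// Reinterpret a signed OS error code as an HRESULT and split its fields.
pub fn decode_hresult(raw: i32) -> HresultParts {
    let hex = raw as u32; // bit-for-bit reinterpretation, no value change
    HresultParts {
        hex,
        failure: (hex >> 31) == 1,
        facility: ((hex >> 16) & 0x7ff) as u16,
        code: (hex & 0xffff) as u16,
    }
}
```

For comparison, `decode_hresult(127)` has the failure bit clear and facility 0, i.e. it is a plain Win32 error code rather than an HRESULT.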

This might just be system DLLs being broken, but it's occurring regularly enough (we have almost a thousand crash reports) that we're having to work around it by disabling RandomState on systems where this doesn't work (which we can do because we've already forked HashMap for fallible allocations). Furthermore, Firefox code uses CryptAcquireContext and CryptGenRandom in a bunch of places and those don't seem to be particularly crashy.

We should see if this is an issue on our side, or perhaps make OsRng (or perhaps just RandomState's initializer?) more resilient to this (perhaps with a more expensive fallback?).
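As a rough illustration of the "more resilient" option, the sketch below tries an OS-backed seed first and only falls back when that fails, instead of panicking. Everything here is hypothetical and not the standard library's implementation: `seed_or_fallback`, `weak_seed`, and `SeedSource` are names invented for the example, and the OS source is passed in as a closure so that the failure path can be exercised. The fallback shown is a cheap, low-quality placeholder (time, process id, and a stack address), not the more expensive fallback suggested above; it might be tolerable for hash-map seeding but is not suitable for anything cryptographic.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::io;
use std::time::{SystemTime, UNIX_EPOCH};

/// Where a seed came from, so callers can tell the two cases apart.
#[derive(Debug, PartialEq, Eq)]
pub enum SeedSource {
    Os,
    Fallback,
}

/// Placeholder fallback: mixes a few cheap, weakly unpredictable values.
/// NOT cryptographically secure. `DefaultHasher::new()` uses fixed keys,
/// so it does not itself depend on the OS RNG.
pub fn weak_seed() -> u64 {
    let mut hasher = DefaultHasher::new();
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    nanos.hash(&mut hasher);
    std::process::id().hash(&mut hasher);
    // Address of a stack local; varies with ASLR and call depth.
    let marker = 0u8;
    (&marker as *const u8 as usize).hash(&mut hasher);
    hasher.finish()
}

/// Try the OS-backed source; on any error, use the fallback instead of panicking.
pub fn seed_or_fallback<F>(os_seed: F) -> (u64, SeedSource)
where
    F: FnOnce() -> io::Result<u64>,
{
    match os_seed() {
        Ok(seed) => (seed, SeedSource::Os),
        Err(_) => (weak_seed(), SeedSource::Fallback),
    }
}
```

Calling `seed_or_fallback(|| Err(io::Error::from_raw_os_error(127)))` takes the fallback path and reports `SeedSource::Fallback`, which a caller could log or use to decide whether randomized hashing should be disabled.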

Metadata

    Labels

    C-bug (Category: This is a bug.)
    O-windows (Operating system: Windows)
    T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.)
