Skip to content

Refactor dynamic library error checking on *nix #75811

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 26, 2020

Conversation

ecstatic-morse
Copy link
Contributor

@ecstatic-morse ecstatic-morse commented Aug 22, 2020

The old code was checking dlerror more often than necessary, since (unlike dlsym) checking the return value of dlopen is enough to indicate whether an error occurred. In the first commit, I've refactored the code to minimize the number of system calls needed. It should be strictly better than the old version.

The second commit is an optional addendum which fixes the issue observed on illumos in #74469, a PR I reviewed that was ultimately closed due to inactivity. I'm not sure how hard we try to work around platform-specific bugs like this, and I believe that, due to the way that dlerror is specified in the POSIX standard, libc implementations that want to run on conforming systems cannot call dlsym in multi-threaded programs.

@rust-highfive
Copy link
Contributor

r? @eddyb

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 22, 2020
@ecstatic-morse
Copy link
Contributor Author

cc @jclulow

The old code was checking `dlerror` more often than necessary, since the
return value of `dlopen` indicates whether an error occurred.
This works around behavior observed on illumos in rust-lang#74469, in which
foreign code (libc according to the OP) was racing with rustc to check
`dlerror`.
libc::dlopen(s.as_ptr(), libc::RTLD_LAZY) as *mut u8
})
}
// `dlerror` is process global, so we can only allow a single thread at a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: dlerror It is MT-safe (thread local) on most modern POSIX implementations. Exceptions include freebsd and dragonflybsd as per investigation I did early this year, and apparently illumos as I’ve just learnt.

POSIX in particular allows for dlerror to be MT-safe (since a comparatively recent standard revision), but does not require it.

@nagisa
Copy link
Member

nagisa commented Aug 25, 2020

r? @nagisa I guess I’m the most appropriate reviewer for this, given that I maintain a similar crate.

I broadly agree with @ollie27 in that we should just consider dlsym() == nullptr to be an error universally. It is not like the implementation here needs to cater for all of the obscure use-cases, only for those that are relevant to rustc. This way we can cut some corners and simplify the code significantly (most notably by removing most of the motivation to use the Mutex).

@rust-highfive rust-highfive assigned nagisa and unassigned eddyb Aug 25, 2020
This simplifies the code somewhat. Also updates comments to reflect
notes from reviw about thread-safety of `dlerror`.
This happens to be the default on Linux, but the default is unspecified
in the POSIX standard. Also switches to `cast` to keep line lengths in
check.
@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented Aug 25, 2020

The most recent changes treat a NULL from dlsym as an error. I also expanded the comments to reflect the 2017 revision to the POSIX standard that @nagisa alluded to above and started passing RTLD_LOCAL to dlopen.

I don't feel comfortable removing the locking entirely in this PR, since it's not a performance issue and can give more predictable error messages on lower tier platforms that don't have thread-safe dlerror. You're free to remove the error module in a later PR if you think it's still too complex, but I think this is better than the status quo.

@nagisa
Copy link
Member

nagisa commented Aug 25, 2020

@bors r+

Yeah LGTM.

@bors
Copy link
Collaborator

bors commented Aug 25, 2020

📌 Commit aae6c0f has been approved by nagisa

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 25, 2020
@bors
Copy link
Collaborator

bors commented Aug 26, 2020

⌛ Testing commit aae6c0f with merge 3e98860...

Copy link
Contributor

@jclulow jclulow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, and definitely seems like it will cover the thing I submitted an unpolished patch for earlier. Thank you!

@bors
Copy link
Collaborator

bors commented Aug 26, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: nagisa
Pushing 3e98860 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 26, 2020
@bors bors merged commit 3e98860 into rust-lang:master Aug 26, 2020
@ecstatic-morse ecstatic-morse deleted the better-dlerror branch October 6, 2020 01:41
@cuviper cuviper added this to the 1.48.0 milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants