Skip to content

Improve shellFor selectComponents performance #1146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 28, 2021

Conversation

TeofilC
Copy link
Contributor

@TeofilC TeofilC commented Jun 23, 2021

Improves performance of #1145 by only traversing local packages in search of components to add to
selectComponents.

Let me know if you have any suggestions

@hamishmack
Copy link
Collaborator

Not sure if I understand correctly, but is there a use case where we might want cabal to build a package from hackage inside the shell (and therefor include it in the selected packages)? For instance trying out project with different versions of a dependency that is in hackage. With this change would you still be able to do that (or would it leave out the dependencies of that package)?

Do you have an example where this increases performance?

@TeofilC
Copy link
Contributor Author

TeofilC commented Jun 24, 2021

That's a good point. I didn't include something like this this in the original PR, because I was worried about use cases like that too.
Just to clarify: this PR would just mean that if I say added yesod and conduit to my packages, shellFor wouldn't automatically add yesod-core. You can still put non-local packages in packages, you just won't have intermediate dependencies automatically added.

Testing this against my work proprietary codebase, I'm getting a decrease from 15s to 10s for starting up a shellFor without any packages specified. Oddly I still seem to get that performance boost, if I remove the part that filters out non-local packages. So, maybe the regression has something to do with the implementation of flatLibDepends.

This was originally prompted by this comment #1145 (comment) . I'd be interested to hear if this PR fixes that issue. And if it does whether removing the part that filters out non-local packages causes the issue to return.

If the local packages filtering is crucial to solving this performance regression, I could implement something that filters by local packages when only local packages are selected and don't filter otherwise.

@michaelpj
Copy link
Collaborator

I can confirm this makes it much faster. Sorry, I don't have the cycles right now to think about whether it's a good idea 😅

@TeofilC
Copy link
Contributor Author

TeofilC commented Jun 24, 2021

No worries!
After looking into this a bit further it seems to be that this performance regression was caused by the getKey function in flatLibDepends. This forces the outPath of each package traversed, which instantiates the derivation. A side-effect of the changes in this PR is that this extra instantiation is avoided.

My thinking now is that we should change the key to something like the name attr. This would avoid instantiating the derivation, and we'd also benefit from this in other uses of this function. This would supersede the current contents of this PR and sidestep the worries about local package filtering.

Something like this maybe:

diff --git a/lib/default.nix b/lib/default.nix
index 359367d..92e9e77 100644
--- a/lib/default.nix
+++ b/lib/default.nix
@@ -132,14 +132,7 @@ in {
   ## flatLibDepends :: Component -> [Package]
   flatLibDepends = component:
     let
-      # this is a minor improvement over the "cannot coerce set to string"
-      # error.  It will now say:
-      #
-      # > The option `packages.Win32.package.identifier.name' is used but not defined.
-      #
-      # which indicates that the package.Win32 is missing and not defined.
-      getKey = x: if x ? "outPath" then "${x}" else (throw x.identifier.name);
-      makePairs = map (p: rec { key=getKey val; val=(p.components.library or p); });
+      makePairs = map (p: rec { key=val.name; val=(p.components.library or p); });
       closure = builtins.genericClosure {
         startSet = makePairs component.depends;
         operator = {val,...}: makePairs val.config.depends;

How does that sound?

@hamishmack
Copy link
Collaborator

How does that sound?

That sounds like an awesome idea!

I remember a year or two ago we refactored something (I think it was the code that gathers inputs for the haskell derivations) so that it did not use flatLibDepends and got a big performance gain. This is probably why. The only other place flatLibDepends is used is when building ghcjs.

@michaelpj
Copy link
Collaborator

I guess there's a risk that that might not be a unique key? Whereas the out path is going to be unique. Probably the derivation name (which would be a combination of package name and version, I think) ought to do it, though!

TeofilC added 2 commits June 27, 2021 17:08
We no longer force the `outPath` attribute to avoid instantiating the
derivation, which is expensive
@TeofilC
Copy link
Contributor Author

TeofilC commented Jun 27, 2021

I've force pushed commits with the idea I mentioned.

I think I'm using the derivation name as the key, so hopefully that will be unique enough

@hamishmack
Copy link
Collaborator

bors try

iohk-bors bot added a commit that referenced this pull request Jun 27, 2021
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Jun 27, 2021

@hamishmack hamishmack merged commit 3856d2d into input-output-hk:master Jun 28, 2021
booniepepper pushed a commit to booniepepper/haskell.nix that referenced this pull request Feb 4, 2022
* factor out isSelectedComponent

* Optimise `flatLibDepends`

We no longer force the `outPath` attribute to avoid instantiating the
derivation, which is expensive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants