Skip to content

[skip changelog] Detect unused dependency license metadata files #1712

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 15, 2022
Merged

[skip changelog] Detect unused dependency license metadata files #1712

merged 1 commit into from
Apr 15, 2022

Conversation

per1234
Copy link
Contributor

@per1234 per1234 commented Apr 15, 2022

Please check if the PR fulfills these requirements

  • The PR has no duplicates (please search among the Pull Requests
    before creating one)
  • The PR follows
    our contributing guidelines
  • [N/A] Tests for the changes have been added (for bug fixes / features)
  • [N/A] Docs have been added / updated (for bug fixes / features)
  • [N/A] UPGRADING.md has been updated with a migration guide (for breaking changes)
  • What kind of change does this PR introduce?

Bug fix

  • What is the current behavior?

The "Check Go Dependencies" GitHub Actions workflow checks for dependencies with incompatible or unapproved license types.

The dependency license metadata consumed by the "Licensed" tool is cached in the project repository, in a dedicated file for each dependency.

The check-cache job of the workflow checks whether that cache is in sync with the project's current dependencies. It does this by using the "Licensed" tool to update the cache and then a git diff command to check whether that resulted in any changes (which would indicate it is out of sync).

Out of sync states could result from any of three distinct conditions:

  • Missing metadata file
  • Incorrect metadata file contents
  • Superfluous metadata file

🐛 An incorrectly configured git diff command previously caused the last of these to be missed.

For example, here an unnecessary dependency license metadata file is introduced into the repository:
c3b7aa5
🐛 However, the check-cache job run for that commit passed:
https://github.com/arduino/arduino-cli/runs/6009282571

My first take at this system was simply using git diff --exit-code alone. That detects the last two, but misses the first. I added a preceding git add --intent-to-add . command to detect added files, but didn't realize that it also caused the last condition to be missed.

  • What is the new behavior?

Superfluous files in the dependency license metadata cache won't actually interfere with its intended functionality, but it is still important to avoid an accumulation of unused files.

The new commands will catch all three of the possible out of sync conditions by staging all changes that result from the metadata cache update to the repository and then comparing those against the HEAD commit.

Demonstration of the superfluous file being caught by the version of the check-cache job from this PR:
https://github.com/per1234/arduino-cli/runs/6036878271?check_suite_focus=true

And then passing after the file is removed:
https://github.com/per1234/arduino-cli/runs/6036924036?check_suite_focus=true

No breaking change.

  • Other information:

Related discussion here : #1711 (comment)


I considered an alternative approach which works just as well as the chosen one (explanation of the two approaches here):

git add .
git diff --exit-code HEAD

However, I feel that the git diff command with the --cached flag is more self-explanatory.

The "Check Go Dependencies" GitHub Actions workflow checks for dependencies with incompatible or unapproved license
types.

The dependency license metadata consumed by the "Licensed" tool is cached in the project repository, in a dedicated file
for each dependency.

The `check-cache` job of the workflow checks whether that cache is in sync with the project's current dependencies. It
does this by using the "Licensed" tool to update the cache and then a `git diff` command to check whether that resulted
in any changes (which would indicate it is out of sync).

Out of sync states could result from any of three distinct conditions:

- Missing metadata file
- Incorrect metadata file contents
- Superfluous metadata file

An incorrectly configured `git diff` command previously caused the last of these to be missed.

My first take at this system was simply using `git diff --exit-code` alone. That detects the last two, but misses the
first. I added the `git add --intent-to-add .` command to detect added files, but didn't realize that it caused the last
to be missed.

Superfluous files in the dependency license metadata cache won't actually interfere with its intended functionality, but
it is still important to avoid an accumulation of unused files.

The new commands will catch all three of the possible out of sync conditions by staging all changes that result from
the metadata cache update to the repository and then comparing those against the `HEAD` commit.

I considered an alternative approach which works just as well as the chosen one:

```
git add .
git diff --exit-code HEAD
```

However, I feel that the `--cached` flag makes the `git diff` command more self-explanatory.
@per1234 per1234 added topic: infrastructure Related to project infrastructure type: imperfection Perceived defect in any part of project labels Apr 15, 2022
@per1234 per1234 requested review from cmaglie and umbynos April 15, 2022 10:24
@per1234 per1234 self-assigned this Apr 15, 2022
@per1234 per1234 merged commit 4a626a0 into arduino:master Apr 15, 2022
@per1234 per1234 deleted the catch-unused-dep-metadata branch April 15, 2022 10:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: infrastructure Related to project infrastructure type: imperfection Perceived defect in any part of project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants