Collect as-installed system and application packages #2058

Closed
pombredanne opened this issue Jun 2, 2020 · 3 comments

@pombredanne
Contributor

We can already detect a good number of as-installed packages through their manifests.
This issue is about something slightly different: collecting as-installed package details based on either:

  • specific layouts and installation manifests (such as a Python site-packages and dist-info, a Rubygems vendor directory, or npm's node_modules)
  • installation databases such as Debian dpkg database files, RPM installed db, alpine installed, Arch metadata, etc.

In addition to getting this data, we should also collect the files corresponding to each package, as reported in #437 and #1554.

Some related tickets: #2023 , #253
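
To make the "installation database" case concrete, here is a minimal sketch of reading a Debian dpkg `status` file using only the standard library. The function names `parse_dpkg_status` and `installed_packages` are hypothetical and are not the ScanCode or debut API; the sketch assumes the usual layout of blank-line-separated paragraphs with `Field: value` lines and indented continuation lines.

```python
def parse_dpkg_status(text):
    """Yield one dict of field name -> value per package paragraph."""
    for paragraph in text.strip().split("\n\n"):
        fields = {}
        last_key = None
        for line in paragraph.splitlines():
            if not line.strip():
                continue
            if line[0] in " \t" and last_key:
                # Continuation line: it belongs to the previous field.
                fields[last_key] += "\n" + line.strip()
                continue
            key, _, value = line.partition(":")
            last_key = key.strip()
            fields[last_key] = value.strip()
        if fields:
            yield fields


def installed_packages(text):
    """Keep only paragraphs whose Status field ends with 'installed'."""
    return [
        fields for fields in parse_dpkg_status(text)
        if fields.get("Status", "").split()[-1:] == ["installed"]
    ]


# Made-up sample data for illustration only.
SAMPLE = """\
Package: libfoo1
Status: install ok installed
Architecture: amd64
Version: 1.2-3
Description: example library
 A longer description
 on two lines.

Package: oldpkg
Status: deinstall ok config-files
Version: 0.1-1
"""

for package in installed_packages(SAMPLE):
    print(package["Package"], package["Version"])
```

A real implementation would likely rely on a dedicated deb822 parser rather than this hand-rolled loop.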

@pombredanne
Contributor Author

@MaJuRG ping ^ ... your take and help would be mucho welcomed there, starting with Debian (and debut?)

steven-esser added a commit that referenced this issue Jun 4, 2020
* Add debut dependency and bump attrs version
* Add function to parse a dpkg status file into debian Packages

Addresses: #2058

Signed-off-by: Steven Esser <sesser@nexb.com>
steven-esser added a commit that referenced this issue Jun 5, 2020
* Add ability to return a list of paths and md5sums from a dpkg .md5sums
  file
* Add proper `parties` data to build_package()
* Add proper `source` data to build_packages()
* Add additional test cases

Addresses: #2058

Signed-off-by: Steven Esser <sesser@nexb.com>
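
As an illustration of the `.md5sums` handling mentioned in the commit above, the following is a rough sketch, not the actual ScanCode code. It assumes each line of a dpkg `.md5sums` file holds an MD5 hex digest, whitespace, and a path relative to the root; `parse_md5sums` is a hypothetical name.

```python
def parse_md5sums(text):
    """Return a list of (path, md5) tuples from the text of a .md5sums file."""
    results = []
    for line in text.splitlines():
        # Split on the first run of whitespace only, so paths with spaces survive.
        parts = line.split(None, 1)
        if len(parts) != 2:
            # Blank or malformed line: skip it.
            continue
        md5, path = parts
        results.append((path, md5))
    return results


# Made-up sample content for illustration only.
SAMPLE_MD5SUMS = (
    "d41d8cd98f00b204e9800998ecf8427e  usr/bin/foo\n"
    "\n"
    "0cc175b9c0f1b6a831c399e269772661  usr/share/doc/foo/copyright\n"
)

for path, md5 in parse_md5sums(SAMPLE_MD5SUMS):
    print(md5, path)
```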
steven-esser added a commit that referenced this issue Jun 5, 2020
* Fix multi-arch mapping bug
* Add end-to-end integration test
* Add get_installed_packages() function for a given rootfs dir location
* Add basic test case for get_installed_packages()

Addresses: #2058

Signed-off-by: Steven Esser <sesser@nexb.com>
steven-esser added a commit that referenced this issue Jun 8, 2020
* get_list_of_installed_files() now returns a list instead of a
  generator
* Tests added for missing md5sum file

Addresses: #2058

Signed-off-by: Steven Esser <sesser@nexb.com>
pombredanne added a commit that referenced this issue Jun 9, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Jun 9, 2020
Somehow a test file was damaged. This restores the test and expectations
to what they should be, and adds a new test.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Jun 9, 2020
get_normalized_expression() was always trying to parse a license
as an exception. This is not an option.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Jun 9, 2020
We can now detect licenses "smartly" in Debian copyright files
using a combination of license name mappings and parsing of DEP-5 Debian
machine-readable copyright files, treating each section correctly.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Jun 9, 2020
Also add a few end to end tests on a mock rootfs

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
JonoYang added a commit that referenced this issue Jul 7, 2020
Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jul 7, 2020
Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jul 8, 2020
Signed-off-by: Jono Yang <jyang@nexb.com>
JonoYang added a commit that referenced this issue Jul 8, 2020
Signed-off-by: Jono Yang <jyang@nexb.com>
pombredanne added a commit that referenced this issue Jul 15, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Aug 3, 2020
The new name is PackageFile and we also support more than just md5 as
checksums.
Also move common test code to package_test_utils.py

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
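
For readers unfamiliar with the model, a file record carrying several optional checksums could look roughly like the sketch below. This is a hypothetical stand-in written with `dataclasses`, not the actual `PackageFile` class from ScanCode; the field names and the `to_dict` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class PackageFile:
    """Hypothetical sketch of an installed file with optional checksums."""
    path: str
    md5: Optional[str] = None
    sha1: Optional[str] = None
    sha256: Optional[str] = None
    sha512: Optional[str] = None

    def to_dict(self):
        # Omit checksums that were not available for this file.
        return {k: v for k, v in asdict(self).items() if v is not None}


record = PackageFile(path="/usr/bin/foo", sha256="abc")
print(record.to_dict())
```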
pombredanne added a commit that referenced this issue Aug 3, 2020
Comment lines can start with #

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Aug 3, 2020
Collect the list of installed files, dependencies and additional data.
Collect and normalize licenses and add basic tests.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Aug 3, 2020
Found in a binary on Alpine

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Aug 3, 2020
Setting SCANCODE_DEBUG_LICENSE enables tracing

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Aug 3, 2020
debian.py is now its own module

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Aug 3, 2020
Therefore we skip these tests there entirely.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Aug 4, 2020
Therefore we skip these tests there entirely.

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Aug 4, 2020
* Fall back to /var/lib/dpkg/info .list files when the .md5sums file is
  not present.

* Avoid including directories, either because they are well-known root
  directories or because they are parents of an existing file.

* Always prefix file paths with / since we are in a rootfs

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
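
The three points in the commit above can be sketched as follows. This is only an approximation under stated assumptions, not the code from the commit: `ROOT_DIRS`, `drop_directories` and `installed_file_paths` are hypothetical names, the set of root directories is a made-up subset, and multi-arch file names such as `libfoo1:amd64.list` are not handled.

```python
from pathlib import Path

# Hypothetical, incomplete set of well-known root directories to skip.
ROOT_DIRS = {"/", "/.", "/bin", "/etc", "/lib", "/sbin", "/usr", "/var"}


def drop_directories(paths):
    """Prefix paths with / and drop root dirs and parents of other paths."""
    normalized = ["/" + p.strip().lstrip("/") for p in paths if p.strip()]
    parents = set()
    for path in normalized:
        parent = path.rpartition("/")[0]
        while parent:
            parents.add(parent)
            parent = parent.rpartition("/")[0]
    # An empty leaf directory cannot be told apart from a file here, so it is kept.
    return [p for p in normalized if p not in ROOT_DIRS and p not in parents]


def installed_file_paths(info_dir, package_name):
    """Prefer <name>.md5sums and fall back to <name>.list when it is missing."""
    info_dir = Path(info_dir)
    md5sums = info_dir / f"{package_name}.md5sums"
    if md5sums.exists():
        # Each line is "<md5>  <relative path>": keep the path part only.
        split_lines = (line.split(None, 1) for line in md5sums.read_text().splitlines())
        return drop_directories(parts[1] for parts in split_lines if len(parts) == 2)
    listing = info_dir / f"{package_name}.list"
    if listing.exists():
        return drop_directories(listing.read_text().splitlines())
    return []


# Made-up .list content for illustration only.
print(drop_directories(["/.", "/usr", "/usr/bin", "/usr/bin/foo"]))
```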
pombredanne added a commit that referenced this issue Aug 6, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Aug 25, 2020
Sometimes things are not as clean and clear as we would like them to be,
and md5sums could be missing

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 7, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 7, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 8, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 8, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 8, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 8, 2020
And run Debian test only on Python3 #2058

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 8, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 8, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 8, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 8, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Sep 8, 2020
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit to aboutcode-org/scancode-plugins that referenced this issue Feb 17, 2021
This is to support these tickets:

aboutcode-org/scancode-toolkit#437
aboutcode-org/scancode.io#6
aboutcode-org/scancode-toolkit#2058

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 1, 2021
 - adopt latest license headers

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 3, 2021
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 3, 2021
And add size to PackageFile model

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 3, 2021
- Use common RpmPackage model
- Add license detection
- Use rpm command to collect XMLish
- Other minor refactorings

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 4, 2021
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne pombredanne added this to the v31.0 milestone Sep 24, 2021
@pombredanne
Contributor Author

This is now all in develop

@pombredanne
Contributor Author

This is now all in develop, merged.
