Syft seems unable to parse non UTF-8 pom.xml files #2044
Looks like a change to how character sets are handled is probably needed here: syft/syft/pkg/cataloger/java/parse_pom_xml.go Lines 101 to 110 in cb0214e
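One possible shape for such a change, as a minimal sketch rather than Syft's actual fix: Go's encoding/xml decoder only accepts UTF-8 unless its CharsetReader field is set, so a POM that declares another encoding needs a reader that transcodes to UTF-8 first. The type pomIDs and the functions charsetReader and parsePomIDs are hypothetical names invented for this illustration, and the sample POM in main is made up. Only ISO-8859-1 is handled here, using the standard library alone; a real implementation would more likely delegate to a charset package such as golang.org/x/net/html/charset.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// pomIDs is a hypothetical struct holding only the fields this sketch extracts.
type pomIDs struct {
	GroupID    string `xml:"groupId"`
	ArtifactID string `xml:"artifactId"`
	Developer  string `xml:"developers>developer>name"`
}

// charsetReader is called by the XML decoder when the document declares a
// non-UTF-8 encoding. This sketch supports ISO-8859-1 only (an assumption).
func charsetReader(label string, input io.Reader) (io.Reader, error) {
	switch strings.ToLower(label) {
	case "utf-8", "utf8":
		return input, nil
	case "iso-8859-1", "latin1", "latin-1":
		raw, err := io.ReadAll(input)
		if err != nil {
			return nil, err
		}
		// Every ISO-8859-1 byte equals the Unicode code point of the same value.
		runes := make([]rune, len(raw))
		for i, b := range raw {
			runes[i] = rune(b)
		}
		return strings.NewReader(string(runes)), nil
	}
	return nil, fmt.Errorf("unsupported charset %q", label)
}

// parsePomIDs decodes a pom.xml document, transcoding it to UTF-8 if needed.
func parsePomIDs(data []byte) (pomIDs, error) {
	var ids pomIDs
	dec := xml.NewDecoder(bytes.NewReader(data))
	dec.CharsetReader = charsetReader
	err := dec.Decode(&ids)
	return ids, err
}

func main() {
	// Made-up POM; \xe9 is the single ISO-8859-1 byte for "é".
	sample := []byte("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" +
		"<project><groupId>com.example</groupId><artifactId>demo</artifactId>" +
		"<developers><developer><name>Jos\xe9</name></developer></developers></project>")

	ids, err := parsePomIDs(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(ids.GroupID, ids.ArtifactID, ids.Developer)
}
```

Reading the whole input before converting keeps the sketch short; a streaming transcoder would be preferable for large files.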
This was referenced Aug 28, 2023
What happened:
Running syft against the jar from https://repo1.maven.org/maven2/com/alogient/cameleon/java/sdk/cameleon4java-sdk/1.12.2/cameleon4java-sdk-1.12.2.jar gives the following warning:
In this case the specific field causing the issue is an author name:
Syft seems to be trying to decode it as UTF-8; however, the file command indicates the encoding is ISO-8859.
What you expected to happen:
Syft should be able to decode these documents and at least extract the groupId/artifactId. A large number of Maven artifacts end up with incorrect identifiers because Syft cannot extract this information from their pom files.
Steps to reproduce the issue:
syft cameleon4java-sdk-1.12.2.jar
Anything else we need to know?:
Environment:
Output of syft version:
OS (e.g. cat /etc/os-release or similar):