This repository has been archived by the owner on Apr 1, 2022. It is now read-only.
Detect binaries as dependencies #353
Merged
33 commits
0a99c73 Add the concept of feature flags (jssblck)
f3eccff Check if feature flags are enabled (jssblck)
e51a244 Gate monorepo scans on feature flag (jssblck)
30685c1 Gate --enable-vsi on feature flag (jssblck)
5805ba7 Add intent (jssblck)
8547fd3 Merge branch 'master' into feature-flags (jssblck)
83bb27f Fix feature flag endpoint (jssblck)
014557f Add fileIsBinary to ReadFS (jssblck)
9cd9034 Support basic binary detection strategy (jssblck)
8f505b1 Add BinaryDeps function & fetcher type (jssblck)
34b8bd3 Render shorter fingerprint (jssblck)
80953e9 Only run binary detection on feature flag (jssblck)
20fdc4d Don't default to disabled feature flags (jssblck)
20b917a Add to changelog (jssblck)
92ac710 Merge branch 'master' into feature-flags (jssblck)
f81c417 Merge branch 'feature-flags' into binary-enforcement (jssblck)
1c15a05 Update changelog (jssblck)
78311ff Convert to use user-defined dependencies (jssblck)
00488f0 Require local CLI flag, remove feature flag (jssblck)
1557623 Remove binary discovery feature flag (jssblck)
f169938 Revert "Fix feature flag endpoint" (jssblck)
152ebcb Revert "Add intent" (jssblck)
e813410 Revert "Gate --enable-vsi on feature flag" (jssblck)
43d1e5c Revert "Gate monorepo scans on feature flag" (jssblck)
f8c8367 Revert "Check if feature flags are enabled" (jssblck)
60a6ec6 Revert "Add the concept of feature flags" (jssblck)
5f4b288 Update changelog (jssblck)
9efc673 Collapse a growing list of modes into one type (jssblck)
0fcd973 Merge branch 'master' into binary-enforcement (jssblck)
7024be5 Nits (jssblck)
8287acc Refactor `additionalSourceUnits` (jssblck)
12073f8 Extract `fileIsBinary` from `ReadFS` (jssblck)
6615864 Merge branch 'master' into binary-enforcement (jssblck)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,15 +7,18 @@ module App.Fossa.Analyze ( | |
JsonOutput (..), | ||
VSIAnalysisMode (..), | ||
IATAssertionMode (..), | ||
discoverFuncs, | ||
BinaryDiscoveryMode (..), | ||
RecordMode (..), | ||
ModeOptions (..), | ||
discoverFuncs, | ||
) where | ||
|
||
import App.Docs (userGuideUrl) | ||
import App.Fossa.API.BuildLink (getFossaBuildUrl) | ||
import App.Fossa.Analyze.GraphMangler (graphingToGraph) | ||
import App.Fossa.Analyze.Project (ProjectResult (..), mkResult) | ||
import App.Fossa.Analyze.Record (AnalyzeEffects (..), AnalyzeJournal (..), loadReplayLog, saveReplayLog) | ||
import App.Fossa.BinaryDeps (analyzeBinaryDeps) | ||
import App.Fossa.FossaAPIV1 (UploadResponse (..), getProject, projectIsMonorepo, uploadAnalysis, uploadContributors) | ||
import App.Fossa.ManualDeps (analyzeFossaDepsFile) | ||
import App.Fossa.ProjectInference (inferProjectDefault, inferProjectFromVCS, mergeOverride, saveRevision) | ||
|
@@ -64,7 +67,7 @@ import Path (Abs, Dir, Path, fromAbsDir, toFilePath) | |
import Path.IO (makeRelative) | ||
import Path.IO qualified as P | ||
import Srclib.Converter qualified as Srclib | ||
import Srclib.Types (Locator (locatorProject, locatorRevision), SourceUnit, parseLocator) | ||
import Srclib.Types (Locator (locatorProject, locatorRevision), SourceUnit (..), parseLocator) | ||
import Strategy.Bundler qualified as Bundler | ||
import Strategy.Cargo qualified as Cargo | ||
import Strategy.Carthage qualified as Carthage | ||
|
@@ -119,6 +122,14 @@ data UnpackArchives = UnpackArchives | |
|
||
data JsonOutput = JsonOutput | ||
|
||
-- | Collect analysis modes into a single type for ease of use. | ||
-- These modes are intended to be different options that alter how analysis is performed or what analysis steps are followed. | ||
data ModeOptions = ModeOptions | ||
{ modeVSIAnalysis :: VSIAnalysisMode | ||
, modeIATAssertion :: IATAssertionMode | ||
, modeBinaryDiscovery :: BinaryDiscoveryMode | ||
} | ||
|
||
-- | "VSI analysis" modes | ||
data VSIAnalysisMode | ||
= -- | enable the VSI analysis strategy | ||
|
@@ -133,6 +144,13 @@ data IATAssertionMode | |
| -- | assertion not enabled | ||
IATAssertionDisabled | ||
|
||
-- | "Binary Discovery" modes | ||
data BinaryDiscoveryMode | ||
= -- | Binary discovery enabled | ||
BinaryDiscoveryEnabled | ||
| -- | Binary discovery disabled | ||
BinaryDiscoveryDisabled | ||
|
||
-- | "Replay logging" modes | ||
data RecordMode | ||
= -- | record effect invocations | ||
|
@@ -142,8 +160,8 @@ data RecordMode | |
| -- | don't record or replay | ||
RecordModeNone | ||
|
||
analyzeMain :: FilePath -> RecordMode -> Severity -> ScanDestination -> OverrideProject -> Flag UnpackArchives -> Flag JsonOutput -> VSIAnalysisMode -> IATAssertionMode -> AllFilters -> IO () | ||
analyzeMain workdir recordMode logSeverity destination project unpackArchives jsonOutput enableVSI assertionMode filters = | ||
analyzeMain :: FilePath -> RecordMode -> Severity -> ScanDestination -> OverrideProject -> Flag UnpackArchives -> Flag JsonOutput -> ModeOptions -> AllFilters -> IO () | ||
analyzeMain workdir recordMode logSeverity destination project unpackArchives jsonOutput modeOptions filters = | ||
withDefaultLogger logSeverity | ||
. Diag.logWithExit_ | ||
. runReadFSIO | ||
|
@@ -170,7 +188,7 @@ analyzeMain workdir recordMode logSeverity destination project unpackArchives js | |
. runReplay @Exec (effectsExec effects) | ||
$ doAnalyze basedir | ||
where | ||
doAnalyze basedir = analyze basedir destination project unpackArchives jsonOutput enableVSI assertionMode filters | ||
doAnalyze basedir = analyze basedir destination project unpackArchives jsonOutput modeOptions filters | ||
|
||
discoverFuncs :: (TaskEffs sig m, TaskEffs rsig run) => [Path Abs Dir -> m [DiscoveredProject run]] | ||
discoverFuncs = | ||
|
@@ -245,19 +263,23 @@ analyze :: | |
OverrideProject -> | ||
Flag UnpackArchives -> | ||
Flag JsonOutput -> | ||
VSIAnalysisMode -> | ||
IATAssertionMode -> | ||
ModeOptions -> | ||
AllFilters -> | ||
m () | ||
analyze (BaseDir basedir) destination override unpackArchives jsonOutput enableVSI iatAssertion filters = do | ||
analyze (BaseDir basedir) destination override unpackArchives jsonOutput ModeOptions{..} filters = do | ||
capabilities <- sendIO getNumCapabilities | ||
|
||
let apiOpts = case destination of | ||
OutputStdout -> Nothing | ||
UploadScan opts _ -> Just opts | ||
|
||
-- additional source units are built outside the standard strategy flow, because they either | ||
-- require additional information (eg API credentials), or they return additional information (eg user deps). | ||
manualSrcUnits <- analyzeFossaDepsFile basedir apiOpts | ||
vsiResults <- analyzeVSI enableVSI apiOpts basedir filters | ||
vsiResults <- analyzeVSI modeVSIAnalysis apiOpts basedir filters | ||
binarySearchResults <- analyzeDiscoverBinaries modeBinaryDiscovery basedir filters | ||
[no action] Might be useful to pair at some point on how to make these tasks run in parallel / show status updates. It's not as easy as it should be. Both of these are long-running and might benefit from the parallelism / |
||
let additionalSourceUnits :: [SourceUnit] | ||
additionalSourceUnits = catMaybes [manualSrcUnits, vsiResults, binarySearchResults] | ||
|
||
(projectResults, ()) <- | ||
runOutput @ProjectResult | ||
|
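The inline review comment in this hunk raises running `analyzeVSI` and `analyzeDiscoverBinaries` in parallel. As a rough illustration only, here is a minimal sketch of running two independent actions concurrently in plain `IO` with `forkIO` and `MVar` from `base`. The helper name `runBoth` is hypothetical, and the real functions run in an effect-polymorphic monad rather than `IO`, so this does not drop into the CLI as written.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, throwIO, try)

-- | Hypothetical helper: run two IO actions on separate threads and wait for both.
-- Each thread captures its outcome with 'try', so a failing action still fills
-- its MVar and the caller never blocks forever.
runBoth :: IO a -> IO b -> IO (a, b)
runBoth left right = do
  leftVar <- newEmptyMVar
  rightVar <- newEmptyMVar
  _ <- forkIO (try left >>= putMVar leftVar)
  _ <- forkIO (try right >>= putMVar rightVar)
  -- Wait for both threads to finish before inspecting either outcome.
  leftResult <- takeMVar leftVar
  rightResult <- takeMVar rightVar
  a <- rethrow leftResult
  b <- rethrow rightResult
  pure (a, b)
  where
    -- Re-raise an exception captured in a worker thread in the calling thread.
    rethrow :: Either SomeException c -> IO c
    rethrow = either throwIO pure
```

With something like this, two long-running scans could be started together as `runBoth scanA scanB`, where `scanA` and `scanB` are placeholder names. Status updates would still need a separate mechanism.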
@@ -270,14 +292,14 @@ analyze (BaseDir basedir) destination override unpackArchives jsonOutput enableV | |
let filteredProjects = filterProjects (BaseDir basedir) projectResults | ||
|
||
-- Need to check if vendored is empty as well, even if its a boolean that vendoredDeps exist | ||
case checkForEmptyUpload projectResults filteredProjects [manualSrcUnits, vsiResults] of | ||
case checkForEmptyUpload projectResults filteredProjects additionalSourceUnits of | ||
NoneDiscovered -> Diag.fatal ErrNoProjectsDiscovered | ||
FilteredAll count -> Diag.fatal (ErrFilteredAllProjects count projectResults) | ||
FoundSome sourceUnits -> case destination of | ||
OutputStdout -> logStdout . decodeUtf8 . Aeson.encode $ buildResult manualSrcUnits filteredProjects | ||
OutputStdout -> logStdout . decodeUtf8 . Aeson.encode $ buildResult additionalSourceUnits filteredProjects | ||
UploadScan opts metadata -> do | ||
locator <- uploadSuccessfulAnalysis (BaseDir basedir) opts metadata jsonOutput override sourceUnits | ||
doAssertRevisionBinaries iatAssertion opts locator | ||
doAssertRevisionBinaries modeIATAssertion opts locator | ||
|
||
analyzeVSI :: (MonadIO m, Has Diag.Diagnostics sig m, Has Exec sig m, Has (Lift IO) sig m, Has Logger sig m) => VSIAnalysisMode -> Maybe ApiOpts -> Path Abs Dir -> AllFilters -> m (Maybe SourceUnit) | ||
analyzeVSI VSIAnalysisEnabled (Just apiOpts) dir filters = do | ||
|
@@ -286,6 +308,12 @@ analyzeVSI VSIAnalysisEnabled (Just apiOpts) dir filters = do | |
pure $ Just results | ||
analyzeVSI _ _ _ _ = pure Nothing | ||
|
||
analyzeDiscoverBinaries :: (MonadIO m, Has Diag.Diagnostics sig m, Has (Lift IO) sig m, Has Logger sig m, Has ReadFS sig m) => BinaryDiscoveryMode -> Path Abs Dir -> AllFilters -> m (Maybe SourceUnit) | ||
analyzeDiscoverBinaries BinaryDiscoveryEnabled dir filters = do | ||
logInfo "Discovering binary files as dependencies" | ||
analyzeBinaryDeps dir filters | ||
analyzeDiscoverBinaries _ _ _ = pure Nothing | ||
|
||
doAssertRevisionBinaries :: (Has Diag.Diagnostics sig m, Has ReadFS sig m, Has (Lift IO) sig m, Has Logger sig m) => IATAssertionMode -> ApiOpts -> Locator -> m () | ||
doAssertRevisionBinaries (IATAssertionEnabled dir) apiOpts locator = assertRevisionBinaries dir apiOpts locator | ||
doAssertRevisionBinaries _ _ _ = pure () | ||
|
@@ -380,9 +408,8 @@ data CountedResult | |
-- Takes a list of all projects analyzed, and the list after filtering. We assume | ||
-- that the smaller list is the latter, and return that list. Starting with user-defined deps, | ||
-- we also include a check for an additional source unit from fossa-deps.yml. | ||
checkForEmptyUpload :: [ProjectResult] -> [ProjectResult] -> [Maybe SourceUnit] -> CountedResult | ||
checkForEmptyUpload xs ys potentialAdditionalUnits = do | ||
let additionalUnits = catMaybes potentialAdditionalUnits | ||
checkForEmptyUpload :: [ProjectResult] -> [ProjectResult] -> [SourceUnit] -> CountedResult | ||
checkForEmptyUpload xs ys additionalUnits = do | ||
if null additionalUnits | ||
then case (xlen, ylen) of | ||
-- We didn't discover, so we also didn't filter | ||
|
@@ -463,16 +490,14 @@ buildProjectSummary project projectLocator projectUrl = do | |
, "id" .= projectLocator | ||
] | ||
|
||
buildResult :: Maybe SourceUnit -> [ProjectResult] -> Aeson.Value | ||
buildResult maybeSrcUnit projects = | ||
buildResult :: [SourceUnit] -> [ProjectResult] -> Aeson.Value | ||
buildResult srcUnits projects = | ||
Aeson.object | ||
[ "projects" .= map buildProject projects | ||
, "sourceUnits" .= finalSourceUnits | ||
] | ||
where | ||
finalSourceUnits = case maybeSrcUnit of | ||
Just unit -> unit : scannedUnits | ||
Nothing -> scannedUnits | ||
finalSourceUnits = srcUnits ++ scannedUnits | ||
scannedUnits = map Srclib.toSourceUnit projects | ||
|
||
buildProject :: ProjectResult -> Aeson.Value | ||
|
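The later hunks in this file resolve the optional source units once, at the call site, so that `checkForEmptyUpload` and `buildResult` receive a plain `[SourceUnit]` instead of `Maybe` values. A minimal sketch of that pattern follows. `Unit`, `collectAdditional`, and `combineUnits` are hypothetical stand-ins: the real `SourceUnit` carries more fields, and the real `buildResult` produces an `Aeson.Value`.

```haskell
import Data.Maybe (catMaybes)

-- Hypothetical stand-in for Srclib's SourceUnit.
newtype Unit = Unit String
  deriving (Eq, Show)

-- Each optional analysis step yields at most one extra unit.
-- catMaybes drops the Nothing results, so downstream code only sees a list.
collectAdditional :: [Maybe Unit] -> [Unit]
collectAdditional = catMaybes

-- Mirrors the shape of the new buildResult: additional units are
-- placed ahead of the units produced by regular project scanning.
combineUnits :: [Unit] -> [Unit] -> [Unit]
combineUnits additional scanned = additional ++ scanned
```

Adding a further optional step then only means adding one more element to the list passed to `collectAdditional`, with no change to the consumers.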
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
{-# LANGUAGE RecordWildCards #-} | ||
|
||
module App.Fossa.BinaryDeps (analyzeBinaryDeps) where | ||
|
||
import App.Fossa.Analyze.Project (ProjectResult (..)) | ||
import App.Fossa.VSI.IAT.Fingerprint (fingerprintRaw) | ||
import App.Fossa.VSI.IAT.Types (Fingerprint (..)) | ||
import Control.Algebra (Has) | ||
import Control.Carrier.Diagnostics (Diagnostics, fromEither) | ||
import Control.Effect.Lift (Lift) | ||
import Data.ByteString qualified as BS | ||
import Data.Maybe (catMaybes) | ||
import Data.String.Conversion (toText) | ||
import Data.Text (Text) | ||
import Data.Text qualified as Text | ||
import Discovery.Filters (AllFilters (..), FilterCombination (combinedPaths)) | ||
import Discovery.Walk (WalkStep (WalkContinue), walk') | ||
import Effect.ReadFS (ReadFS, readContentsBSLimit) | ||
import Path (Abs, Dir, File, Path, isProperPrefixOf, stripProperPrefix, toFilePath, (</>)) | ||
import Srclib.Converter qualified as Srclib | ||
import Srclib.Types (AdditionalDepData (..), SourceUnit (..), SourceUserDefDep (..)) | ||
import Types (GraphBreadth (Complete)) | ||
|
||
data BinaryFile = BinaryFile | ||
{ binaryPath :: Path Abs File | ||
, binaryFingerprint :: Fingerprint | ||
} | ||
|
||
-- | Binary detection is sufficiently different from other analysis types that it cannot be just another strategy. | ||
-- Instead, binary detection is run separately over the entire scan directory, outputting its own source unit. | ||
-- The goal of this feature is to enable a FOSSA user to flag all vendored binaries (as defined by git) in the project as dependencies. | ||
-- Users may then use standard FOSSA UX flows to ignore or add license information to the detected binaries. | ||
analyzeBinaryDeps :: (Has (Lift IO) sig m, Has Diagnostics sig m, Has ReadFS sig m) => Path Abs Dir -> AllFilters -> m (Maybe SourceUnit) | ||
analyzeBinaryDeps dir filters = do | ||
binaries <- fingerprintBinaries (toPathFilters dir filters) dir | ||
if null binaries | ||
then pure Nothing | ||
else pure . Just $ toSourceUnit (toProject dir) binaries | ||
|
||
fingerprintBinaries :: (Has (Lift IO) sig m, Has Diagnostics sig m, Has ReadFS sig m) => PathFilters -> Path Abs Dir -> m [BinaryFile] | ||
fingerprintBinaries filters = walk' $ \dir _ files -> do | ||
if shouldFingerprintDir dir filters | ||
then do | ||
someBinaries <- traverse fingerprintIfBinary files | ||
pure (catMaybes someBinaries, WalkContinue) | ||
else pure ([], WalkContinue) | ||
|
||
fingerprintIfBinary :: (Has (Lift IO) sig m, Has Diagnostics sig m, Has ReadFS sig m) => Path Abs File -> m (Maybe BinaryFile) | ||
fingerprintIfBinary file = do | ||
isBinary <- fileIsBinary file | ||
if isBinary | ||
then do | ||
fp <- fingerprintRaw file | ||
pure . Just $ BinaryFile file fp | ||
else pure Nothing | ||
|
||
-- | PathFilters is a specialized filter mechanism that operates only on absolute directory paths. | ||
data PathFilters = PathFilters | ||
{ include :: [Path Abs Dir] | ||
, exclude :: [Path Abs Dir] | ||
} | ||
deriving (Show) | ||
|
||
toPathFilters :: Path Abs Dir -> AllFilters -> PathFilters | ||
toPathFilters root filters = | ||
PathFilters | ||
{ include = map (root </>) (combinedPaths $ includeFilters filters) | ||
, exclude = map (root </>) (combinedPaths $ excludeFilters filters) | ||
} | ||
|
||
shouldFingerprintDir :: Path Abs Dir -> PathFilters -> Bool | ||
shouldFingerprintDir dir filters = (not shouldExclude) && shouldInclude | ||
where | ||
shouldExclude = (isPrefixedOrEqual dir) `any` (exclude filters) | ||
shouldInclude = null (include filters) || (isPrefixedOrEqual dir) `any` (include filters) | ||
isPrefixedOrEqual a b = a == b || isProperPrefixOf b a -- swap order of isProperPrefixOf comparison because we want to know if dir is prefixed by any filter | ||
|
||
toProject :: Path Abs Dir -> ProjectResult | ||
toProject dir = ProjectResult "binary-deps" dir mempty Complete [] | ||
|
||
toDependency :: Path Abs Dir -> BinaryFile -> SourceUserDefDep | ||
toDependency root BinaryFile{..} = | ||
SourceUserDefDep | ||
{ srcUserDepName = renderRelative root binaryPath | ||
, srcUserDepVersion = renderFingerprint binaryFingerprint | ||
, srcUserDepLicense = "" | ||
, srcUserDepDescription = Just "Binary discovered in source tree" | ||
, srcUserDepHomepage = Nothing | ||
} | ||
|
||
toSourceUnit :: ProjectResult -> [BinaryFile] -> SourceUnit | ||
toSourceUnit project binaries = do | ||
let unit = Srclib.toSourceUnit project | ||
let deps = map (toDependency $ projectResultPath project) binaries | ||
unit{additionalData = Just $ AdditionalDepData (Just deps) Nothing} | ||
|
||
-- | Just render the first few characters of the fingerprint. | ||
-- The goal is to provide a high confidence that future binaries with the same name won't collide, | ||
-- and we don't need all 256 bits for that. | ||
renderFingerprint :: Fingerprint -> Text | ||
renderFingerprint fingerprint = Text.take 12 $ unFingerprint fingerprint | ||
|
||
renderRelative :: Path Abs Dir -> Path Abs File -> Text | ||
renderRelative absDir absFile = | ||
case stripProperPrefix absDir absFile of | ||
Left _ -> toText . toFilePath $ absFile | ||
Right relFile -> toText . toFilePath $ relFile | ||
|
||
-- | Determine if a file is binary using the same method as git: | ||
-- "is there a zero byte in the first 8000 bytes of the file" | ||
fileIsBinary :: (Has ReadFS sig m, Has Diagnostics sig m) => Path Abs File -> m Bool | ||
fileIsBinary file = do | ||
attemptedContent <- readContentsBSLimit file 8000 | ||
content <- fromEither attemptedContent | ||
pure $ BS.elem 0 content |
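The `fileIsBinary` check above treats a file as binary when a zero byte appears in its first 8000 bytes. A standalone sketch of the same heuristic in plain `IO`, using only `base` and `bytestring`, is shown here. The names `sniffLimit`, `looksBinary`, and `fileLooksBinary` are hypothetical, and the PR itself reads through the `ReadFS` effect (`readContentsBSLimit`) rather than a raw handle.

```haskell
import qualified Data.ByteString as BS
import System.IO (IOMode (ReadMode), withBinaryFile)

-- | Number of leading bytes inspected, matching the limit used in the diff.
sniffLimit :: Int
sniffLimit = 8000

-- | Pure check: does the inspected prefix contain a NUL (zero) byte?
looksBinary :: BS.ByteString -> Bool
looksBinary = BS.elem 0 . BS.take sniffLimit

-- | Read at most 'sniffLimit' bytes from the file and apply the check.
-- BS.hGet returns fewer bytes when the file is shorter than the limit.
fileLooksBinary :: FilePath -> IO Bool
fileLooksBinary path =
  withBinaryFile path ReadMode $ \handle ->
    looksBinary <$> BS.hGet handle sniffLimit
```

Keeping the check pure and the file read separate makes the heuristic easy to test without touching the filesystem. A zero byte beyond the first 8000 bytes is not seen, so such a file is classified as text.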
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,5 @@ | ||
module App.Fossa.VSI.IAT.Fingerprint ( | ||
fingerprintRaw, | ||
fingerprintContentsRaw, | ||
) where | ||
|
||
|
[no action] 💡 The `analyzeMain` args sure are getting unwieldy. I wonder if we should pack all of the args into a record.
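One possible shape for that suggestion, in the same spirit as the `ModeOptions` record this PR introduces, is sketched here. Everything in it is hypothetical: `AnalyzeConfig`, its `cfg*` fields, `BinaryDiscovery`, `defaultConfig`, and `describeRun` are simplified stand-ins, not the CLI's actual types.

```haskell
{-# LANGUAGE RecordWildCards #-}

-- Hypothetical, simplified stand-in for a mode type such as BinaryDiscoveryMode.
data BinaryDiscovery = DiscoveryOn | DiscoveryOff
  deriving (Eq, Show)

-- Hypothetical record bundling arguments that are currently passed positionally.
data AnalyzeConfig = AnalyzeConfig
  { cfgWorkdir :: FilePath
  , cfgUnpackArchives :: Bool
  , cfgJsonOutput :: Bool
  , cfgBinaryDiscovery :: BinaryDiscovery
  }

-- Defaults live in one place; callers override only the fields they care about.
defaultConfig :: FilePath -> AnalyzeConfig
defaultConfig dir =
  AnalyzeConfig
    { cfgWorkdir = dir
    , cfgUnpackArchives = False
    , cfgJsonOutput = False
    , cfgBinaryDiscovery = DiscoveryOff
    }

-- RecordWildCards brings every field into scope by name,
-- the same way the diff destructures ModeOptions{..} in analyze.
describeRun :: AnalyzeConfig -> String
describeRun AnalyzeConfig{..} =
  unwords
    [ "analyze"
    , cfgWorkdir
    , "unpack=" ++ show cfgUnpackArchives
    , "json=" ++ show cfgJsonOutput
    , "binaries=" ++ show cfgBinaryDiscovery
    ]
```

A caller would then write something like `describeRun (defaultConfig ".") { cfgBinaryDiscovery = DiscoveryOn }`, and adding a new option no longer changes the function's arity.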