Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python log parser - handle limited size log content #213

Merged
merged 3 commits into from
Nov 26, 2023

Conversation

attiasas
Copy link
Contributor

@attiasas attiasas commented Nov 20, 2023

  • All tests passed. If this feature is not already covered by the tests, I added new tests.
  • All static analysis checks passed.
  • This pull request is on the dev branch.
  • I used gofmt for formatting the code before submitting the pull request.

We have cases when installing that the log content has limited size and the content is split to multiple content, for example with pipenv:

Looking in indexes: 
***localhost:8081/artifactory/api/pypi/cli-pipenv-pypi-virtual-16
98829624/simple

Collecting pexpect==4.8.0 (from -r 
/tmp/pipenv-hx6_n8fi-requirements/pipenv-xuuwb5sl-hashed-reqs.txt (line 1))

  Using cached 
http://localhost:8081/artifactory/api/pypi/cli-pipenv-pypi-virtual-1698829624/pa
ckages/packages/39/7b/88dbb785881c28a102619d46423cb853b46dbccc70d3ac362d99773a78
ce/pexpect-4.8.0-py2.py3-none-any.whl (59 kB)

Collecting ptyprocess==0.7.0 (from -r 
/tmp/pipenv-hx6_n8fi-requirements/pipenv-xuuwb5sl-hashed-reqs.txt (line 2))

  Using cached 
http://localhost:8081/artifactory/api/pypi/cli-pipenv-pypi-virtual-1698829624/pa
ckages/packages/22/a6/858897256d0deac81a172289110f31629fc4cee19b6f01283303e18c8d
b3/ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)

Collecting toml==0.10.2 (from -r 
/tmp/pipenv-hx6_n8fi-requirements/pipenv-xuuwb5sl-hashed-reqs.txt (line 3))

  Using cached 
http://localhost:8081/artifactory/api/pypi/cli-pipenv-pypi-virtual-1698829624/pa
ckages/packages/44/6f/7120676b6d73228c96e17f1f794d8ab046fc910d781c8d151120c3f156
9e/toml-0.10.2-py2.py3-none-any.whl (16 kB)

Installing collected packages: ptyprocess, pexpect, toml

Successfully installed pexpect-4.8.0 ptyprocess-0.7.0 toml-0.10.2

the split lines cause our code not to match the expected Regex, and we could not collect the dependencies

@attiasas attiasas added the bug Something isn't working label Nov 20, 2023
// Since the log parser parse line by line, we need to create a parser that can capture group content that may span multiple lines.
func getMultilineSplitCaptureOutputPattern(startCollectingPattern, captureGroup, endCollectingPattern string, handler func(pattern *gofrogcmd.CmdOutputPattern) (string, error)) (parsers []*gofrogcmd.CmdOutputPattern) {
// Prepare regex patterns.
oneLineRegex := regexp.MustCompile(startCollectingPattern + `(` + captureGroup + `)` + endCollectingPattern)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regexp.MustCompile is heavy.
Let's make the effort to compile regex patterns on compilation time by putting them in variables outside the functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its dynamic patterns that are constructed by the given arguments to the method. I don't see any benefit to refactoring it outside because I need those values at the method

utils/pythonutils/utils.go Outdated Show resolved Hide resolved
utils/pythonutils/utils.go Outdated Show resolved Hide resolved
utils/pythonutils/utils.go Outdated Show resolved Hide resolved
// Create a parser for multi line pattern matches.
lineBuffer := ""
collectingMultiLineValue := false
parsers = append(parsers, &gofrogcmd.CmdOutputPattern{RegExp: regexp.MustCompile(".*"), ExecFunc: func(pattern *gofrogcmd.CmdOutputPattern) (string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.* does not include newline characters. Is it intended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the parser that we use goes over the content line by line so It will not include newlines anyway since it is the delimiter used for tokens.
The purpose of this parser is to remove any newlines and concatenate the text removing new lines that was added if split

utils/pythonutils/utils.go Outdated Show resolved Hide resolved
utils/pythonutils/utils_test.go Outdated Show resolved Hide resolved
utils/pythonutils/utils_test.go Outdated Show resolved Hide resolved
@eyalbe4 eyalbe4 removed their request for review November 24, 2023 10:01
@attiasas attiasas merged commit b6719ec into jfrog:dev Nov 26, 2023
13 of 14 checks passed
@attiasas attiasas mentioned this pull request Nov 26, 2023
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants