Increase E2E vulnerability detection scans timeout #5699

Closed
2 tasks done
Rebits opened this issue Aug 26, 2024 · 9 comments · Fixed by #5712

@Rebits
Member

Rebits commented Aug 26, 2024

Description

Several errors were detected in the Release 4.9.0 - RC 1 - Vulnerability Detection E2E tests:

  • Inconsistent results between initial scan tests
  • Some vulnerabilities were detected and generated the expected alerts, but were not correctly indexed.

After research (wazuh/wazuh#25363 (comment)), it was concluded that these errors were caused by a regression in the indexer's response times. For this reason, it is necessary to increase the timeout of the initial scan tests and the timeout for collecting vulnerabilities in the syscollector case for each agent.

Tasks

  • Increase vulnerability timeouts for initial scans and package syscollector cases. Increase result windows

Validation

Conclusion

#5699 (comment)

@Rebits
Member Author

Rebits commented Aug 26, 2024

Increased timeout to:

PACKAGE_VULNERABILITY_SCAN_TIME = 150
TIMEOUT_PER_AGENT_VULNERABILITY_FIRST_SCAN = PACKAGE_VULNERABILITY_SCAN_TIME * 4
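
For readers following along, here is a minimal sketch of how constants like these could bound a polling wait. It assumes the values are in seconds and that the per-agent timeout scales linearly with the number of agents; neither is confirmed in this thread. `first_scan_timeout` and `wait_until` are hypothetical helpers, not functions from the test framework.

```python
import time

# Values taken from the comment above; the unit (seconds) is an assumption.
PACKAGE_VULNERABILITY_SCAN_TIME = 150
TIMEOUT_PER_AGENT_VULNERABILITY_FIRST_SCAN = PACKAGE_VULNERABILITY_SCAN_TIME * 4


def first_scan_timeout(agent_count):
    """Hypothetical helper: total wait budget if the timeout scales per agent."""
    return TIMEOUT_PER_AGENT_VULNERABILITY_FIRST_SCAN * agent_count


def wait_until(condition, timeout, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False if the deadline passed.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Under these assumptions the first-scan budget is 150 * 4 = 600 per agent, so a longer timeout only helps if the missing data eventually arrives within that window.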

Currently testing new timeout: https://ci.wazuh.info/job/Test_e2e_system/349/
On hold until tests are finished.

@Rebits
Member Author

Rebits commented Aug 28, 2024

On hold because no macOS instances are available: wazuh/wazuh#25345

@Rebits
Member Author

Rebits commented Aug 30, 2024

Build: https://ci.wazuh.info/job/Test_e2e_system/357/
Report: Test_e2e_system_357_test_vulnerability_detector(1).zip

Analysis

The issue seems to persist after increasing the timeout to the values specified in #5699 (comment). According to the research done in wazuh/wazuh#25363 (comment), this is not a test issue.
Even with a timeframe of more than 15 minutes, the vulnerabilities are not correctly indexed in the vulnerability states.

To verify whether this is a regression, the same tests will be launched against version 4.8.2. At the same time, I will try to debug the environment once the first syscollector scan test fails.

Currently On Hold in favor of https://github.com/wazuh/wazuh-jenkins/issues/6910, due to the limitations of macOS instances

@Rebits
Member Author

Rebits commented Aug 30, 2024

@Rebits
Member Author

Rebits commented Sep 2, 2024

According to the results in 4.8.2 and 4.9.0, this issue seems to be present in both versions. However, in 4.8.1 this error was not detected: wazuh/wazuh#24594. No change in 4.8.2 can justify this discrepancy.

From the following evidence we can determine:

  • Discarding Feed Discrepancies: Alerts are correctly generated, although the index does not contain the expected vulnerabilities. We can discard this as a feed issue
  • Low probability of timeout error: In all tests, the missing vulnerabilities are the same. Given the parallel nature of the tests, it is unlikely that a timeout would always cause the same agent to miss the same vulnerabilities. To rule this out, we will launch the tests with only one agent (agent1) to check whether the issue persists even in this case.
  • Potential Issues with the Indexer: There may be scenarios where the indexer fails to forward vulnerabilities, causing this issue.
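
The comparison behind these points (alerts present, index entries missing, always for the same agent) can be sketched as a set difference grouped by agent. This is only an illustration: `missing_by_agent` is a hypothetical helper, and the tuple shape `(agent, cve, package)` is an assumption about how the data could be normalized, not the framework's actual structure.

```python
from collections import defaultdict


def missing_by_agent(alerted, indexed):
    """Return vulnerabilities that raised an alert but are absent from the index.

    Both arguments are iterables of (agent, cve, package) tuples.
    The result maps each agent to the set of (cve, package) pairs it is missing.
    """
    indexed_set = set(indexed)
    missing = defaultdict(set)
    for agent, cve, package in alerted:
        if (agent, cve, package) not in indexed_set:
            missing[agent].add((cve, package))
    return dict(missing)


# Made-up placeholder data, for illustration only.
alerted = [
    ("agent1", "CVE-0000-0001", "pkg-a"),
    ("agent1", "CVE-0000-0002", "pkg-b"),
    ("agent2", "CVE-0000-0001", "pkg-a"),
]
indexed = [
    ("agent1", "CVE-0000-0001", "pkg-a"),
    ("agent2", "CVE-0000-0001", "pkg-a"),
]
result = missing_by_agent(alerted, indexed)
```

If repeated runs always report the same agent and the same entries, a race or timeout becomes a less likely explanation than a deterministic limit somewhere in the pipeline.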

Currently provisioning an environment with only agent1 to perform several analyses over the test and the indexer connector: https://ci.wazuh.info/job/Test_e2e_system/361/

In addition, to fully determine whether this is a regression, the tests will also be run on 4.8.1: https://ci.wazuh.info/job/Test_e2e_system/362/

@Rebits
Member Author

Rebits commented Sep 2, 2024

To test this development along with #5698, the branch tmp-fixes-4.9.0 was created, which contains both branches.

I am currently testing over 4.9.0-rc2.
Build: https://ci.wazuh.info/job/Test_e2e_system/366/

@wazuhci wazuhci moved this from On hold to In progress in Release 4.9.1 Sep 2, 2024
@Rebits Rebits linked a pull request Sep 3, 2024 that will close this issue
@Rebits
Member Author

Rebits commented Sep 3, 2024

Conclusion

It was identified that these tests were failing due to the indexer's limited result window (defaulted to 10,000). In previous versions of the feeds, fewer than 10,000 vulnerabilities were detected in the environment. However, as this number increased, the tests began failing, particularly for agents with a higher number of vulnerabilities (e.g., CentOS 7 agents).

To address this issue, it has been proposed to increase the maximum result window before pulling the vulnerabilities.
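
As a rough sketch of what such a change could look like, the snippet below builds a request against the index settings API, which exposes `index.max_result_window` as a dynamic setting. The base URL, the index pattern `wazuh-states-vulnerabilities-*`, and the value 50000 are assumptions for illustration; the actual change is in the linked pull request and may differ. `build_result_window_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_result_window_request(base_url, index_pattern, max_result_window):
    """Build (but do not send) a PUT request that raises index.max_result_window."""
    if max_result_window <= 0:
        raise ValueError("max_result_window must be a positive integer")
    body = json.dumps({"index": {"max_result_window": max_result_window}})
    url = f"{base_url.rstrip('/')}/{index_pattern}/_settings"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed endpoint, index pattern, and value; adjust to the real environment.
request = build_result_window_request(
    "https://localhost:9200", "wazuh-states-vulnerabilities-*", 50000
)
```

Raising the window increases memory use for deep result pages, so paginating with scroll or `search_after` is a common alternative when the result set keeps growing.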

Build: https://ci.wazuh.info/job/Test_e2e_system/366/

@wazuhci wazuhci moved this from In progress to Pending review in Release 4.9.1 Sep 4, 2024
@wazuhci wazuhci moved this from Pending review to In review in Release 4.9.1 Sep 4, 2024
@jseg380
Member

jseg380 commented Sep 4, 2024

Asked some questions in the comments: #5712 (comment)

@wazuhci wazuhci moved this from In review to On hold in Release 4.9.1 Sep 4, 2024
@wazuhci wazuhci moved this from On hold to In progress in Release 4.9.1 Sep 5, 2024
@jseg380
Member

jseg380 commented Sep 5, 2024

Questions resolved successfully.
LGTM

@wazuhci wazuhci moved this from In progress to Pending final review in Release 4.9.1 Sep 5, 2024
@wazuhci wazuhci moved this from Pending final review to In final review in Release 4.9.1 Sep 9, 2024
@wazuhci wazuhci moved this from In final review to Done in Release 4.9.1 Sep 9, 2024