Updated recommendations in big5 README #336

Merged 1 commit on Jul 10, 2024

@gkamat gkamat commented Jul 9, 2024

Description

README update.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@gkamat gkamat changed the title from "Updated README to recommend use of an external data store with large corpora" to "Updated big5 README to recommend use of an external data store with large corpora" on Jul 9, 2024
@gkamat gkamat added the backport 1, backport 2, backport 3, and backport 7 labels on Jul 9, 2024
…corpora.

Signed-off-by: Govind Kamat <govkamat@amazon.com>
@gkamat gkamat changed the title from "Updated big5 README to recommend use of an external data store with large corpora" to "Updated recommendations in big5 README" on Jul 10, 2024
* Use a load generation host with sufficient disk space to hold the corpus.
* Ensure the target cluster has adequate storage and at least 3 data nodes.
* Specify an appropriate shard count and number of replicas so that shards are evenly distributed and appropriately sized.
* Run the workload on an instance type with at least 8 cores and 32 GB of memory.
* Install the `pbzip2` decompressor to speed up decompression of the corpus.
* Set the client timeout to a sufficiently large value, since some queries take a long time to complete.

Collaborator

Might be good to provide an example of a large value

Collaborator Author

I was debating whether to do exactly that, but the right value might vary with the size of the cluster they set up. I might add this in a future update, though.
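
For illustration only, here is a minimal sketch of how the recommendations above might translate into an invocation. It only prints the command; it does not run it. The numbers (a 240-second timeout, 3 shards, 1 replica) are placeholders, not values from the README, and the right timeout depends on the cluster. The `execute-test` subcommand, the `--client-options=timeout:<seconds>` option, and the `number_of_shards` / `number_of_replicas` workload parameters are assumed to match the installed OpenSearch Benchmark version and the big5 workload, so check them against `opensearch-benchmark --help` and the workload README. `build_big5_command` and `TARGET_HOSTS` are hypothetical names introduced here.

```shell
# Sketch only: prints a candidate command instead of running the benchmark.
# All numeric values passed in below are placeholders, not README recommendations.

# Usage: build_big5_command <timeout_seconds> <shards> <replicas>
build_big5_command() {
  printf '%s ' \
    opensearch-benchmark execute-test \
    --workload=big5 \
    --pipeline=benchmark-only \
    "--target-hosts=${TARGET_HOSTS:-localhost:9200}" \
    "--workload-params=number_of_shards:$2,number_of_replicas:$3" \
    "--client-options=timeout:$1"
  printf '\n'
}

# Report whether the parallel decompressor mentioned in the README is installed.
if command -v pbzip2 >/dev/null 2>&1; then
  echo "pbzip2 found"
else
  echo "pbzip2 not found; install it to speed up corpus decompression"
fi

# Placeholder values: 240 s client timeout, 3 primary shards, 1 replica.
build_big5_command 240 3 1
```

Review the printed command, adjust the values for the cluster at hand, and then run it manually.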


@IanHoang IanHoang left a comment

LGTM

@gkamat gkamat merged commit 4ea81b9 into opensearch-project:main Jul 10, 2024
2 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 10, 2024
…corpora. (#336)

Signed-off-by: Govind Kamat <govkamat@amazon.com>
(cherry picked from commit 4ea81b9)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 10, 2024
…corpora. (#336)

Signed-off-by: Govind Kamat <govkamat@amazon.com>
(cherry picked from commit 4ea81b9)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@opensearch-trigger-bot

The backport to 7 failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-7 7
# Navigate to the new working tree
pushd ../.worktrees/backport-7
# Create a new branch
git switch --create backport/backport-336-to-7
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 4ea81b9716214548d8ab5928de6bd5f16aed65aa
# Push it to GitHub
git push --set-upstream origin backport/backport-336-to-7
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-7

Then, create a pull request where the base branch is 7 and the compare/head branch is backport/backport-336-to-7.

opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 10, 2024
…corpora. (#336)

Signed-off-by: Govind Kamat <govkamat@amazon.com>
(cherry picked from commit 4ea81b9)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gkamat pushed a commit that referenced this pull request Jul 10, 2024
…corpora. (#336) (#337)

(cherry picked from commit 4ea81b9)

Signed-off-by: Govind Kamat <govkamat@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gkamat pushed a commit that referenced this pull request Jul 10, 2024
…corpora. (#336) (#339)

(cherry picked from commit 4ea81b9)

Signed-off-by: Govind Kamat <govkamat@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gkamat pushed a commit that referenced this pull request Jul 10, 2024
…corpora. (#336) (#338)

(cherry picked from commit 4ea81b9)

Signed-off-by: Govind Kamat <govkamat@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
harshavamsi pushed a commit to harshavamsi/opensearch-benchmark-workloads that referenced this pull request Jul 16, 2024
…corpora. (opensearch-project#336)

Signed-off-by: Govind Kamat <govkamat@amazon.com>
Labels: backport 1, backport 2, backport 3, backport 7