Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy load status data #1058

Merged
merged 1 commit into from
Mar 16, 2016
Merged

Lazy load status data #1058

merged 1 commit into from
Mar 16, 2016

Conversation

chuangbo
Copy link
Contributor

In an elasticsearch instance with large number of indices, getting _stats is a huge expense, which Elastica\Status will load it for every instance

But some methods of Elastica\Status don't use Status->_data, e.g. getIndicesWithAlias. So it's possible to lazy load _stats data at when it's needed, to speed up stats independent function calls.

@chuangbo chuangbo changed the title Lazy load status data Lazy load stats data Mar 13, 2016
@chuangbo chuangbo changed the title Lazy load stats data Lazy load status data Mar 13, 2016
@ruflin
Copy link
Owner

ruflin commented Mar 14, 2016

It seems like this change somehow breaks the tests. Can you have a look?

Please also update the CHANGELOG.md file.

@chuangbo
Copy link
Contributor Author

It seems like the reason of tests failed is docker building error.
https://circleci.com/gh/ruflin/Elastica/7#tests

Can we try to re-run the tests on CircleCI and travis-ci? I don't have the re-run button here.

@ruflin
Copy link
Owner

ruflin commented Mar 15, 2016

Please ignore CircleCI. I just restarted the tests for Travis CI.

@ruflin
Copy link
Owner

ruflin commented Mar 15, 2016

@chuangbo Here is the error on Travis which is in Status.php, so I assume it is related: https://travis-ci.org/ruflin/Elastica/jobs/116031553#L2048

@chuangbo
Copy link
Contributor Author

Thank you. The tests have been passed.

Actually this pull request looks more like a hack to me, I think would it be better (api design) to move the _stats independent methods to somewhere else, or can we turn them to static function?

@ruflin
Copy link
Owner

ruflin commented Mar 15, 2016

I agree it is currently far from clean, as we mix stats and status (which was not the same in the past). This came with the upgrade to elasticsearch 2.0 when we tried to keep it as BC compatible as possible. I'm good with this temporary hack for the moment but agree we should rethink the API here. I would even say Status should probably disappear as it does not exist anymore on the elasticsearch side.

@chuangbo Do you want to open a Github Issue with a proposal how to proceed?
Could you squash your commits into one before merging?

In an elasticsearch instance with large number of indices, getting `_stats` is a huge expense, which `Elastica\Status` will load it for (every instance) [https://github.com/ruflin/Elastica/blob/e2daadc27ba263ab5b33c4afac10b1144078cd5d/lib/Elastica/Status.php#L44].

But some methods of `Elastica\Status` don't use `Status->_data`, e.g. `getIndicesWithAlias`. So it's possible to lazy load `_stats` data at when it's needed, to speed up stats independent function calls.
@chuangbo
Copy link
Contributor Author

Thank you, the commits have been rebased.

For the proposal issue, I'm afraid that I'm not quite familiar with elasticsearch, I was using a Magento plugin which based on Elastica and then we faced this performance issue. Sorry.

ruflin added a commit that referenced this pull request Mar 16, 2016
@ruflin ruflin merged commit 8085a11 into ruflin:master Mar 16, 2016
@ruflin
Copy link
Owner

ruflin commented Mar 16, 2016

@chuangbo Thanks a lot for the contribution. No worries, I'm sure someone will pick it up.

@chuangbo chuangbo deleted the patch-1 branch March 16, 2016 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants