Add ML Jobs to Sample Data Sets #18808

Closed
alexfrancoeur opened this issue May 4, 2018 · 7 comments
Labels
Feature:Add Data Add Data and sample data feature on Home :ml Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@alexfrancoeur

We plan on introducing a sample data set to Kibana soon. We'll be starting with flight data, but as time progresses I'm sure we'll add more. I'd like to add ML jobs to these sample data sets at some point as well to show off the functionality available.

For the flight data set specifically, I am generating the data from a script so we can easily add anomalies for specific fields.
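As a rough illustration of that idea, here is a minimal sketch of a generator that injects an anomaly into one field. It is not the actual script: the document shape, the `FlightDelayMin` and `timestamp` field names, the hourly spacing, and the size of the spike are all assumptions made for the example.

```typescript
// Hypothetical record shape; the field names are illustrative only.
interface FlightDoc {
  timestamp: string;
  FlightDelayMin: number;
}

// Small deterministic pseudo-random generator (an LCG) so runs are reproducible.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Generate one document per hour. For hours in [anomalyStart, anomalyEnd),
// add a large fixed offset to the delay so a detector has something to find.
function generateFlights(
  startMs: number,
  hours: number,
  anomalyStart: number,
  anomalyEnd: number,
  seed: number = 42
): FlightDoc[] {
  const rng = makeRng(seed);
  const docs: FlightDoc[] = [];
  for (let h = 0; h < hours; h++) {
    const baseline = Math.round(rng() * 30); // 0 to 30 minutes of ordinary delay
    const spike = h >= anomalyStart && h < anomalyEnd ? 300 : 0;
    docs.push({
      timestamp: new Date(startMs + h * 3600000).toISOString(),
      FlightDelayMin: baseline + spike,
    });
  }
  return docs;
}
```

Because the anomaly window is a parameter, the same script could place a spike in any field or time range that an ML job is meant to highlight.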

For dashboards, visualizations and saved searches, we currently use the import/export API that Beats modules use to ingest the new dashboards. I don't believe ML jobs are stored the same way, so we may need another mechanism.

Phase 1 is to merge this PR / issue: #17807 / #16473. Once this is complete, we can begin to add enhancements such as this.

cc: @grabowskit @sophiec20 @nreese @asawariS

@elasticmachine
Contributor

Pinging @elastic/ml-ui

timroes added the Feature:Add Data and Team:Visualizations labels and removed the :Sharing label on Sep 14, 2018
@lcawl
Contributor

lcawl commented Nov 30, 2018

@alexfrancoeur Would there be difficulties associated with licensing? That is, the sample data is available with the basic license, but you must have at least a trial license to create machine learning jobs.

@nreese
Contributor

nreese commented Nov 30, 2018

What are machine learning jobs? Are they saved objects? There is a mechanism to add saved objects to sample data sets. This is used by Canvas to add workpads when Canvas is enabled. Could ML add jobs in the same fashion?

@peteharverson
Contributor

@nreese ML jobs are not saved objects; they are currently stored in the cluster state. In 6.6 this is changing: jobs move out of the cluster state metadata and will instead be stored as documents in an index (.ml-config). REST endpoints exist for adding jobs and their associated datafeeds, and then for opening the jobs and starting the datafeeds (see x-pack/plugins/ml/server/client/elasticsearch_ml.js).
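To make the order of those calls concrete, here is a minimal sketch that builds the four Elasticsearch requests for one job. It only constructs request descriptions and does not reproduce the client in elasticsearch_ml.js. The job ID, datafeed ID, index name, detector, and field names are assumptions chosen for illustration; the paths follow the Elasticsearch ML REST API, which lives under `_xpack/ml` on 6.x and `_ml` from 7.0.

```typescript
interface EsRequest {
  method: "PUT" | "POST";
  path: string;
  body?: object;
}

// Build the calls in dependency order: create the job, create its datafeed,
// open the job, then start the datafeed.
// basePath is "_xpack/ml" on 6.x clusters and "_ml" from 7.0 onwards.
function buildMlSetupRequests(
  jobId: string,
  datafeedId: string,
  indexPattern: string,
  basePath: string = "_ml"
): EsRequest[] {
  // Illustrative job config: the detector and field names are assumptions.
  const job = {
    description: "Illustrative job: unusually high mean flight delay",
    analysis_config: {
      bucket_span: "1h",
      detectors: [{ function: "high_mean", field_name: "FlightDelayMin" }],
    },
    data_description: { time_field: "timestamp" },
  };
  // The datafeed ties the job to the indices it should read from.
  const datafeed = {
    job_id: jobId,
    indices: [indexPattern],
    query: { match_all: {} },
  };
  return [
    { method: "PUT", path: `${basePath}/anomaly_detectors/${jobId}`, body: job },
    { method: "PUT", path: `${basePath}/datafeeds/${datafeedId}`, body: datafeed },
    { method: "POST", path: `${basePath}/anomaly_detectors/${jobId}/_open` },
    { method: "POST", path: `${basePath}/datafeeds/${datafeedId}/_start` },
  ];
}
```

The order matters: a datafeed refers to its job by `job_id`, and a datafeed can only be started once its job is open.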

For the ML modules, we ship pre-configured ML jobs as JSON files and then use the endpoints to add and start the jobs, and also insert Kibana saved objects for drilldown dashboards using the saved object client. The JSON config files used for this process are kept under x-pack/plugins/ml/server/models/data_recognizer/modules.

@nreese
Contributor

nreese commented Nov 30, 2018

@peteharverson Thanks for the clarification. What if the saved object registry exposed a way to attach a handler that would get called when a sample data set is installed and another handler that is called when a sample data set is uninstalled? ML could register functions to add/set up jobs on install and then register a function to clean up jobs on uninstall.
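A minimal sketch of what such a hook mechanism could look like follows. Every name here (`SampleDataHooks`, `SampleDataHookRegistry`, `runInstall`, `runUninstall`) is hypothetical and not an existing Kibana API, and the handlers are synchronous only to keep the example short; real handlers that call ML endpoints would most likely return promises.

```typescript
// Hypothetical hook pair that a plugin such as ML could register.
interface SampleDataHooks {
  onInstall: (datasetId: string) => void;
  onUninstall: (datasetId: string) => void;
}

class SampleDataHookRegistry {
  // Hooks keyed by sample data set ID; several plugins may register for one set.
  private hooks: { [datasetId: string]: SampleDataHooks[] } = Object.create(null);

  register(datasetId: string, hooks: SampleDataHooks): void {
    if (!this.hooks[datasetId]) {
      this.hooks[datasetId] = [];
    }
    this.hooks[datasetId].push(hooks);
  }

  // Intended to be called by the installer once the data set has been loaded.
  runInstall(datasetId: string): void {
    (this.hooks[datasetId] || []).forEach(h => h.onInstall(datasetId));
  }

  // Intended to be called when the data set is removed; hooks run in reverse
  // registration order so later registrations are cleaned up first.
  runUninstall(datasetId: string): void {
    (this.hooks[datasetId] || [])
      .slice()
      .reverse()
      .forEach(h => h.onUninstall(datasetId));
  }
}
```

With something along these lines, the sample data code would not need to know anything about ML; it would only invoke whatever has been registered for the data set being installed or removed.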

@peteharverson
Contributor

@nreese the approach you suggest, where the saved object registry exposes a way to attach handlers, sounds viable. The sample data set would need to store an ML module ID, which can be passed to the ML endpoint when the data set is installed / uninstalled.
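One way to picture the module ID part is a data set definition that carries an optional ML module ID, which the install and uninstall steps forward to ML. This is only a sketch under assumptions: the `mlModuleId` field, the `MlModuleClient` interface, and both function names are hypothetical and stand in for whatever endpoint ML would actually expose.

```typescript
// Hypothetical data set definition; mlModuleId is not an existing field.
interface SampleDatasetSpec {
  id: string;
  mlModuleId?: string; // absent for data sets that ship no ML jobs
}

// Hypothetical stand-in for the ML endpoint that sets up or removes a module.
interface MlModuleClient {
  setup(moduleId: string): void;
  remove(moduleId: string): void;
}

// On install, forward the stored module ID to ML, if the data set has one.
function onDatasetInstalled(spec: SampleDatasetSpec, ml: MlModuleClient): void {
  if (spec.mlModuleId !== undefined) {
    ml.setup(spec.mlModuleId);
  }
}

// On uninstall, ask ML to clean up the same module.
function onDatasetUninstalled(spec: SampleDatasetSpec, ml: MlModuleClient): void {
  if (spec.mlModuleId !== undefined) {
    ml.remove(spec.mlModuleId);
  }
}
```

Data sets without a module ID pass through untouched, so existing sample data would keep working unchanged.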

@peteharverson
Contributor

This was addressed in 7.2 by #35138, which added ML modules for the ecommerce and weblogs sample data sets. Closing issue.
