
[ML-Dataframe] Add Data Frame client to the Java HLRC #39921

Merged
5 commits merged into elastic:master on Mar 14, 2019

Conversation

davidkyle
Member

Adds DataFrameClient to the Java HLRC and implements PUT and DELETE data frame transform.

The documentation needs fleshing out with descriptions of the data frame config objects and examples.
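For reviewers, a rough sketch of how the new surface might be exercised. DeleteDataFrameTransformRequest appears in this PR's tests; PutDataFrameTransformRequest, the `dataFrame()` accessor, and the response type are assumptions about the API's shape, not its final form:

```java
// Sketch only: names other than DeleteDataFrameTransformRequest are assumed.
// transformConfig is a DataFrameTransformConfig built elsewhere.

// PUT: register a new data frame transform from a config object
AcknowledgedResponse putResponse = client.dataFrame().putDataFrameTransform(
        new PutDataFrameTransformRequest(transformConfig), RequestOptions.DEFAULT);

// DELETE: remove the transform again by its id
AcknowledgedResponse deleteResponse = client.dataFrame().deleteDataFrameTransform(
        new DeleteDataFrameTransformRequest(transformConfig.getId()), RequestOptions.DEFAULT);
```

Both calls are sketched as returning an acknowledged-style response, mirroring the `ack.isAcknowledged()` assertion in the PR's tests.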

@elasticmachine
Collaborator

Pinging @elastic/ml-core

@elasticmachine
Collaborator

Pinging @elastic/es-core-features

[id="{upid}-{api}"]
=== Put Data Frame Transform API

The Put Data Frame Transform API is used to create a new {dataframe-job}.
Member
I am not sure how {dataframe-job} is defined in the docs. How do these types of macros work?

Contributor

According to the link this resolves to a "data frame analytics job", which would point to the wrong docs. I think we need new macros: {dataframe-transform} or {dataframe-transform-job}; whatever we choose should be consistent everywhere. Because this is called "Put Data Frame Transform API" it would make sense to use {dataframe-transform}.

Member Author

'data frame analytics job' is a mouthful; I raised elastic/docs#700.

@hendrikmuhs left a comment

I added some comments and open questions.


import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;

public class DataFrameIT extends ESRestHighLevelClientTestCase {


nit: DataFrameTransformIT ?


ack = execute(new DeleteDataFrameTransformRequest(transform.getId()), client::deleteDataFrameTransform,
client::deleteDataFrameTransformAsync);
assertTrue(ack.isAcknowledged());


nit: Would be good to test that e.g. another delete throws an error
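The suggested negative test could look roughly like this (a sketch; the expected exception type and status code are assumptions about how the endpoint reports a missing transform):

```java
// Sketch: a second delete of the same transform should fail,
// since the transform no longer exists after the first delete.
ElasticsearchStatusException e = expectThrows(ElasticsearchStatusException.class,
        () -> execute(new DeleteDataFrameTransformRequest(transform.getId()),
                client::deleteDataFrameTransform, client::deleteDataFrameTransformAsync));
assertThat(e.status(), equalTo(RestStatus.NOT_FOUND));
```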


import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;

public class DataFrameDocumentationIT extends ESRestHighLevelClientTestCase {


"DataFrameTransformDocumentation" ?


<2> The source index or index pattern
<3> The destination index
<4> Optionally a QueryConfig
<5> The PivotConfig


We could make this somewhat future proof, e.g.

The configuration object of the function, in this version we only support the pivot function.

(please help me regarding the wording)

I suggest calling the inner "thing" the "function" of the transform. We briefly discussed this once; it somehow fits, in my opinion, but I am open to other suggestions.
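For illustration, the configuration described by the callouts might be assembled like this. QueryConfig and PivotConfig are class names from this PR, but the constructor shapes and DataFrameTransformConfig's exact signature are assumptions:

```java
// Sketch: assumed constructor shapes for the config classes in this PR.
QueryConfig queryConfig = new QueryConfig(new MatchAllQueryBuilder());     // <4> optional query
PivotConfig pivotConfig = new PivotConfig(groupConfig, aggregationConfig); // <5> the pivot "function"
DataFrameTransformConfig config = new DataFrameTransformConfig("my-transform",
        "source-index",   // <2> the source index or index pattern
        "dest-index",     // <3> the destination index
        queryConfig, pivotConfig);
```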

[id="{upid}-{api}-query-config"]
==== QueryConfig

The query with which to select data from the source index.


Just "source"? As "source" is an expression that can resolve to more than one index, I would try to avoid the term "index" where possible.


==== PivotConfig

Defines the pivot transform `group by` fields and the aggregation to reduce the data.


All of this together is called a transform. I used to call pivot the function of the transform, so I suggest: "Defines the pivot function ..."


* Terms
* Histogram
* Date Historgram


typo Historgram -> Histogram


===== GroupConfig
The grouping terms. Defines the group by and destination fields
which are produced by the grouping transform. There are 3 types of


"grouping transform": do you mean "pivot transform" or, as I suggested above, "pivot function"?

===== AggregationConfig

Defines the aggregations for the group fields.
The aggregation must be one of `avg`, `min`, `max` or `sum`.


We also support cardinality and value_of. I wonder, however, how we approach documenting this. The above would mean we have to change this place for every new aggregation we add, which seems easy to forget.

Would it be better to e.g. have a separate page "Supported Aggregations for DataFrame Transforms" which we link to in this place?

@lcawl any idea/best practice?
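As an illustration of the GroupConfig and AggregationConfig pieces discussed above, a pivot that groups on a terms field and averages a numeric field might look like this (a sketch; the group-source class name and constructor shapes are assumptions):

```java
// Sketch: group by the terms of "customer_id" and compute avg("price").
GroupConfig groupConfig = new GroupConfig(Collections.singletonMap(
        "customer", new TermsGroupSource("customer_id")));  // terms grouping
AggregationConfig aggConfig = new AggregationConfig(
        AggregationBuilders.avg("avg_price").field("price")); // one of avg, min, max, sum
PivotConfig pivotConfig = new PivotConfig(groupConfig, aggConfig);
```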

@davidkyle
Member Author

I changed the docs to use 'data frame transform' and addressed the other comments

@davidkyle changed the title from "[ML-Dataframe] Add Data Fame client to the Java HLRC" to "[ML-Dataframe] Add Data Frame client to the Java HLRC" on Mar 12, 2019

@hendrikmuhs left a comment

LGTM

@davidkyle davidkyle merged commit 2cb8b65 into elastic:master Mar 14, 2019
@davidkyle davidkyle deleted the df-hlrc-client branch March 14, 2019 10:06
davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Mar 14, 2019
Adds DataFrameClient to the Java HLRC and implements PUT and DELETE data frame transform.
7 participants