Updated mentions of DataFrame to represent objects (#664)

* Update mentions of DataFrame to represent objects
* Improve DataColumn.md documentation clarity
Kotlin · Apr 22, 2024 · 7de6022
1 parent 41577df
commit 7de6022
Showing 22 changed files with 74 additions and 64 deletions.
diff --git a/docs/StardustDocs/topics/DataColumn.md b/docs/StardustDocs/topics/DataColumn.md
@@ -1,13 +1,15 @@
 [//]: # (title: DataColumn)
 <!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Create-->
 
-[`DataColumn`](DataColumn.md) represents a column of values. It can store objects of primitive or reference types, or other [`DataFrames`](DataFrame.md).
+[`DataColumn`](DataColumn.md) represents a column of values.
+It can store objects of primitive or reference types, 
+or other [`DataFrame`](DataFrame.md) objects.
 
 See [how to create columns](createColumn.md)
 
 ### Properties
-* `name: String` — name of the column, should be unique within containing dataframe
-* `path: ColumnPath` — path to the column, depends on the way column was retrieved from dataframe
+* `name: String` — name of the column; should be unique within containing dataframe
+* `path: ColumnPath` — path to the column; depends on the way column was retrieved from dataframe
 * `type: KType` — type of elements in the column
 * `hasNulls: Boolean` — flag indicating whether column contains `null` values
 * `values: Iterable<T>` — column data
@@ -20,17 +22,18 @@ See [how to create columns](createColumn.md)
 
 Represents a sequence of values. 
 
-It can store values of primitive (integers, strings, decimals etc.) or reference types. Currently, it uses [`List`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-list/) as underlying data storage.
+It can store values of primitive (integers, strings, decimals, etc.) or reference types.
+Currently, it uses [`List`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-list/) as underlying data storage.
 
 #### ColumnGroup
 
 Container for nested columns. Is used to create column hierarchy. 
 
 #### FrameColumn
 
-Special case of [`ValueColumn`](#valuecolumn) that stores other [`DataFrames`](DataFrame.md) as elements. 
+Special case of [`ValueColumn`](#valuecolumn) that stores other [`DataFrame`](DataFrame.md) objects as elements. 
 
-[`DataFrames`](DataFrame.md) stored in [`FrameColumn`](DataColumn.md#framecolumn) may have different schemas. 
+[`DataFrame`](DataFrame.md) objects stored in [`FrameColumn`](DataColumn.md#framecolumn) may have different schemas. 
 
 [`FrameColumn`](DataColumn.md#framecolumn) may appear after [reading](read.md) from JSON or other hierarchical data structures, or after grouping operations such as [groupBy](groupBy.md) or [pivot](pivot.md).  
 

diff --git a/docs/StardustDocs/topics/DataFrame.md b/docs/StardustDocs/topics/DataFrame.md
@@ -2,13 +2,13 @@
 
 [`DataFrame`](DataFrame.md) represents a list of [`DataColumn`](DataColumn.md).
 
-Columns in dataframe must have equal size and unique names.
+Columns in [`DataFrame`](DataFrame.md) must have equal size and unique names.
 
 **Learn how to:**
-- [Create dataframe](createDataFrame.md)
-- [Read dataframe](read.md)
-- [Get an overview of dataframe](info.md)
-- [Access data in dataframe](access.md)
-- [Modify data in dataframe](modify.md)
-- [Compute statistics for dataframe](summaryStatistics.md)
-- [Combine several dataframes](multipleDataFrames.md)
+- [Create DataFrame](createDataFrame.md)
+- [Read DataFrame](read.md)
+- [Get an overview of DataFrame](info.md)
+- [Access data in DataFrame](access.md)
+- [Modify data in DataFrame](modify.md)
+- [Compute statistics for DataFrame](summaryStatistics.md)
+- [Combine several DataFrame objects](multipleDataFrames.md)
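
The two requirements stated for columns (equal size and unique names) can be sketched in plain Kotlin. This is only an illustration: `validateColumns` is a hypothetical helper, not part of the library, and a column is modelled here as a name paired with a list of values.

```kotlin
// Hypothetical helper (not a library function): checks the two column
// requirements on a list of (name, values) pairs.
fun validateColumns(columns: List<Pair<String, List<Any?>>>) {
    // Requirement 1: column names must be unique.
    val names = columns.map { it.first }
    val duplicates = names.groupingBy { it }.eachCount().filterValues { it > 1 }.keys
    require(duplicates.isEmpty()) { "Duplicate column names: $duplicates" }

    // Requirement 2: all columns must have the same number of values.
    val sizes = columns.map { it.second.size }.distinct()
    require(sizes.size <= 1) { "Columns differ in size: $sizes" }
}

fun main() {
    // Passes: two columns, two values each, distinct names.
    validateColumns(listOf("name" to listOf("Alice", "Bob"), "age" to listOf(15, 20)))

    // Fails: the name "name" is used twice.
    val result = runCatching {
        validateColumns(listOf("name" to listOf("Alice"), "name" to listOf("Bob")))
    }
    println(result.isFailure)
}
```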
diff --git a/docs/StardustDocs/topics/DataRow.md b/docs/StardustDocs/topics/DataRow.md
@@ -11,19 +11,19 @@
 * `prev(): DataRow?` — previous row (`null` for the first row)
 * `next(): DataRow?` — next row (`null` for the last row)
 * `diff(T) { rowExpression }: T / diffOrNull { rowExpression }: T?` — difference between the results of a [row expression](DataRow.md#row-expressions) calculated for current and previous rows
-* `explode(columns): DataFrame<T>` — spread lists and [`DataFrames`](DataFrame.md) vertically into new rows
+* `explode(columns): DataFrame<T>` — spread lists and [`DataFrame`](DataFrame.md) objects vertically into new rows
 * `values(): List<Any?>` — list of all cell values from the current row
 * `valuesOf<T>(): List<T>` — list of values of the given type 
 * `columnsCount(): Int` — number of columns
 * `columnNames(): List<String>` — list of all column names
 * `columnTypes(): List<KType>` — list of all column types 
 * `namedValues(): List<NameValuePair<Any?>>` — list of name-value pairs where `name` is a column name and `value` is cell value
 * `namedValuesOf<T>(): List<NameValuePair<T>>` — list of name-value pairs where value has given type 
-* `transpose(): DataFrame<NameValuePair<*>>` — dataframe of two columns: `name: String` is column names and `value: Any?` is cell values
-* `transposeTo<T>(): DataFrame<NameValuePair<T>>`— dataframe of two columns: `name: String` is column names and `value: T` is cell values
+* `transpose(): DataFrame<NameValuePair<*>>` — [`DataFrame`](DataFrame.md) of two columns: `name: String` is column names and `value: Any?` is cell values
+* `transposeTo<T>(): DataFrame<NameValuePair<T>>`— [`DataFrame`](DataFrame.md) of two columns: `name: String` is column names and `value: T` is cell values
 * `getRow(Int): DataRow` — row from [`DataFrame`](DataFrame.md) by row index
-* `getRows(Iterable<Int>): DataFrame` — dataframe with subset of rows selected by absolute row index. 
-* `relative(Iterable<Int>): DataFrame` — dataframe with subset of rows selected by relative row index: `relative(-1..1)` will return previous, current and next row. Requested indices will be coerced to the valid range and invalid indices will be skipped
+* `getRows(Iterable<Int>): DataFrame` — [`DataFrame`](DataFrame.md) with subset of rows selected by absolute row index. 
+* `relative(Iterable<Int>): DataFrame` — [`DataFrame`](DataFrame.md) with subset of rows selected by relative row index: `relative(-1..1)` will return previous, current and next row. Requested indices will be coerced to the valid range and invalid indices will be skipped
 * `getValue<T>(columnName)` — cell value of type `T` by this row and given `columnName`
 * `getValueOrNull<T>(columnName)` — cell value of type `T?` by this row and given `columnName` or `null` if there's no such column
 * `get(column): T` — cell value by this row and given `column`

diff --git a/docs/StardustDocs/topics/addDf.md b/docs/StardustDocs/topics/addDf.md
@@ -2,7 +2,7 @@
 
 <!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Modify-->
 
-Returns [`DataFrame`](DataFrame.md) with union of columns from several given [`DataFrames`](DataFrame.md).
+Returns [`DataFrame`](DataFrame.md) with union of columns from several given [`DataFrame`](DataFrame.md) objects.
 
 <!---FUN addDataFrames-->
 

diff --git a/docs/StardustDocs/topics/concat.md b/docs/StardustDocs/topics/concat.md
@@ -2,7 +2,7 @@
 
 <!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Modify-->
 
-Returns a [`DataFrame`](DataFrame.md) with the union of rows from several given [`DataFrames`](DataFrame.md).
+Returns a [`DataFrame`](DataFrame.md) with the union of rows from several given [`DataFrame`](DataFrame.md) objects.
 
 `concat` is available for:
 
@@ -91,14 +91,14 @@ frameColumn.concat()
 
 <!---END-->
 
-If you want to take the union of columns (not rows) from several [`DataFrames`](DataFrame.md), see [`add`](add.md).
+If you want to take the union of columns (not rows) from several [`DataFrame`](DataFrame.md) objects, see [`add`](add.md).
 
 ## Schema unification
 
-If input [`DataFrames`](DataFrame.md) have different schemas, every column in the resulting [`DataFrames`](DataFrame.md) 
+If input [`DataFrame`](DataFrame.md) objects have different schemas, every column in the resulting [`DataFrame`](DataFrame.md) 
 will get the lowest common type of the original columns with the same name. 
 
 For example, if one [`DataFrame`](DataFrame.md) has a column `A: Int` and another [`DataFrame`](DataFrame.md) has a column `A: Double`, 
-the resulting ` DataFrame ` will have a column `A: Number`.
+the resulting [`DataFrame`](DataFrame.md) will have a column `A: Number`.
 
-Missing columns in dataframes will be filled with `null`.
+Missing columns in [`DataFrame`](DataFrame.md) objects will be filled with `null`.
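
The schema unification rules above can be modelled in plain Kotlin. This is a rough sketch, not the library's implementation: a frame is assumed to be a list of rows, each row a map from column name to cell value, and `concatRows` is a hypothetical helper.

```kotlin
// Plain-Kotlin model (an assumption for illustration): a row maps
// column names to cell values, and a frame is a list of such rows.
typealias Row = Map<String, Any?>

// Union of rows: every output row carries every column seen in any input;
// a column missing from an input is filled with null.
fun concatRows(vararg frames: List<Row>): List<Row> {
    val allColumns = frames.flatMap { frame -> frame.flatMap { it.keys } }.distinct()
    return frames.flatMap { frame ->
        frame.map { row -> allColumns.associateWith { row[it] } }
    }
}

fun main() {
    val first = listOf(mapOf("A" to 1, "B" to "x"))        // A: Int, B: String
    val second = listOf(mapOf<String, Any?>("A" to 2.5))   // A: Double, no B
    val result = concatRows(first, second)

    // Column A now holds an Int and a Double, so the narrowest type
    // that fits both values is Number.
    val columnA: List<Number?> = result.map { it["A"] as Number? }
    println(result)
    println(columnA)
}
```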
diff --git a/docs/StardustDocs/topics/concatDf.md b/docs/StardustDocs/topics/concatDf.md
@@ -2,7 +2,7 @@
 
 <!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Modify-->
 
-Returns [`DataFrame`](DataFrame.md) with the union of rows from several given [`DataFrames`](DataFrame.md).
+Returns [`DataFrame`](DataFrame.md) with the union of rows from several given [`DataFrame`](DataFrame.md) objects.
 
 <!---FUN concatDataFrames-->
 

diff --git a/docs/StardustDocs/topics/create.md b/docs/StardustDocs/topics/create.md
@@ -2,9 +2,9 @@
 <show-structure depth="3"/>
 <!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Create-->
 
-There are several ways to create [`dataframes`](DataFrame.md) from data that is already loaded into memory:
+There are several ways to create [`DataFrame`](DataFrame.md) objects from data that is already loaded into memory:
 * [create columns with data](createColumn.md) and then [bundle them](createDataFrame.md) into a [`DataFrame`](DataFrame.md)
 * create and initialize [`DataFrame`](DataFrame.md) directly from values using `vararg` variants of the [corresponding functions](createDataFrame.md).
 * [convert Kotlin objects](createDataFrame.md#todataframe) into [`DataFrame`](DataFrame.md) 
 
-To learn how to read [`dataframes`](DataFrame.md) from files and URLs, go to the [next section](read.md).
+To learn how to read dataframes from files and URLs, go to the [next section](read.md).
diff --git a/docs/StardustDocs/topics/createColumn.md b/docs/StardustDocs/topics/createColumn.md
@@ -42,7 +42,7 @@ val fullName by columnOf(firstName, lastName)
 
 <!---END-->
 
-When column elements are [`DataFrames`](DataFrame.md) it returns a [`FrameColumn`](DataColumn.md#framecolumn):
+When column elements are [`DataFrame`](DataFrame.md) objects it returns a [`FrameColumn`](DataColumn.md#framecolumn):
 
 <!---FUN createFrameColumn-->
 

diff --git a/docs/StardustDocs/topics/createDataFrame.md b/docs/StardustDocs/topics/createDataFrame.md
@@ -218,8 +218,9 @@ val df = students.toDataFrame {
 
 ### DynamicDataFrameBuilder
 
-Previously mentioned dataframe constructors throw an exception when column names are duplicated. 
-When implementing a custom operation involving multiple dataframes, or computed columns or when parsing some third-party data,
+Previously mentioned [`DataFrame`](DataFrame.md) constructors throw an exception when column names are duplicated. 
+When implementing a custom operation involving multiple [`DataFrame`](DataFrame.md) objects,
+or computed columns or when parsing some third-party data,
 it might be desirable to disambiguate column names instead of throwing an exception. 
 
 <!---FUN duplicatedColumns-->

diff --git a/docs/StardustDocs/topics/explode.md b/docs/StardustDocs/topics/explode.md
@@ -9,7 +9,7 @@ explode(dropEmpty = true) [ { columns } ]
 ```
 
 **Parameters:**
-* `dropEmpty` — if `true`, removes rows with empty lists or dataframes. Otherwise, they will be exploded into `null`.
+* `dropEmpty` — if `true`, removes rows with empty lists or [`DataFrame`](DataFrame.md) objects. Otherwise, they will be exploded into `null`.
 
 **Available for:**
 * [`DataFrame`](DataFrame.md)

diff --git a/docs/StardustDocs/topics/explodeImplode.md b/docs/StardustDocs/topics/explodeImplode.md
@@ -1,4 +1,4 @@
 [//]: # (title: Explode / implode columns)
 
-* [`explode`](explode.md) — distributes lists of values or [`DataFrames`](DataFrame.md) in given columns vertically, replicating data in other columns
-* [`implode`](implode.md) — collects column values in given columns into lists or [`DataFrames`](DataFrame.md), grouping by other columns
+* [`explode`](explode.md) — distributes lists of values or [`DataFrame`](DataFrame.md) objects in given columns vertically, replicating data in other columns
+* [`implode`](implode.md) — collects column values in given columns into lists or [`DataFrame`](DataFrame.md) objects, grouping by other columns
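
To illustrate how the two operations mirror each other, here is a plain-Kotlin sketch for list-valued cells. It is not the library's implementation: rows are modelled as maps, and the `explode` and `implode` functions below are hypothetical stand-ins that only share their names with the library operations.

```kotlin
// Plain-Kotlin model (an assumption for illustration): a row maps
// column names to cell values.
typealias Row = Map<String, Any?>

// Explode: one output row per element of the list stored in `column`;
// all other cells are copied into each new row. A row with an empty list
// yields no output rows, which corresponds to dropping empty rows.
fun explode(rows: List<Row>, column: String): List<Row> =
    rows.flatMap { row ->
        val cell = row[column]
        if (cell is List<*>) cell.map { row + (column to it) } else listOf(row)
    }

// Implode: group rows by every column except `column`, then collect
// that column's values into one list per group.
fun implode(rows: List<Row>, column: String): List<Row> =
    rows.groupBy { it - column }
        .map { (rest, group) -> rest + (column to group.map { it[column] }) }

fun main() {
    val rows = listOf(
        mapOf("name" to "Alice", "tags" to listOf("a", "b")),
        mapOf("name" to "Bob", "tags" to listOf("c")),
    )
    val exploded = explode(rows, "tags")
    println(exploded)
    println(implode(exploded, "tags"))
}
```

In this model, imploding the exploded rows restores the original rows as long as no list was empty and the remaining columns identify each original row uniquely.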
diff --git a/docs/StardustDocs/topics/extensionPropertiesApi.md b/docs/StardustDocs/topics/extensionPropertiesApi.md
@@ -32,7 +32,7 @@ In notebooks, extension properties are generated for [`DataSchema`](schemas.md)
 instance after REPL line execution. 
 After that [`DataFrame`](DataFrame.md)  variable is typed with its own [`DataSchema`](schemas.md), so only valid extension properties corresponding to actual columns in DataFrame will be allowed by the compiler and suggested by completion.
 
-Extension properties can be generated in IntelliJ IDEA using the [Kotlin Dataframe Gradle plugin](schemasGradle.md#configuration).
+Extension properties can be generated in IntelliJ IDEA using the [Kotlin DataFrame Gradle plugin](schemasGradle.md#configuration).
 
 <warning>
 In notebooks generated properties won't appear and be updated until the cell has been executed. It often means that you have to introduce new variable frequently to sync extension properties with actual schema

diff --git a/docs/StardustDocs/topics/groupByConcat.md b/docs/StardustDocs/topics/groupByConcat.md
@@ -1,4 +1,4 @@
 [//]: # (title: GroupBy / concat rows)
 
 * [`groupBy`](groupBy.md) — groups rows of [`DataFrame`](DataFrame.md) by given key columns.
-* [`concat`](concat.md) — concatenates rows from several [`DataFrames`](DataFrame.md) into single [`DataFrame`](DataFrame.md).
+* [`concat`](concat.md) — concatenates rows from several [`DataFrame`](DataFrame.md) objects into single [`DataFrame`](DataFrame.md).
diff --git a/docs/StardustDocs/topics/join.md b/docs/StardustDocs/topics/join.md
@@ -2,7 +2,7 @@
 
 <!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Join-->
 
-Joins two [`DataFrames`](DataFrame.md) by join columns.
+Joins two [`DataFrame`](DataFrame.md) objects by join columns.
 
 ```kotlin
 join(otherDf, type = JoinType.Inner) [ { joinColumns } ]
@@ -79,7 +79,7 @@ df.join(other, "name", "city")
 <dataFrame src="org.jetbrains.kotlinx.dataframe.samples.api.Join.join.html"/>
 <!---END-->
 
-If `joinColumns` is not specified, columns with the same name from both [`DataFrames`](DataFrame.md) will be used as join columns:
+If `joinColumns` is not specified, columns with the same name from both [`DataFrame`](DataFrame.md) objects will be used as join columns:
 
 <!---FUN joinDefault-->
 
@@ -93,12 +93,12 @@ df.join(other)
 ### Join types
 
 Supported join types:
-* `Inner` (default) — only matched rows from left and right [`DataFrames`](DataFrame.md)
+* `Inner` (default) — only matched rows from left and right [`DataFrame`](DataFrame.md) objects
 * `Filter` — only matched rows from left [`DataFrame`](DataFrame.md)
 * `Left` — all rows from left [`DataFrame`](DataFrame.md), mismatches from right [`DataFrame`](DataFrame.md) filled with `null`
 * `Right` — all rows from right [`DataFrame`](DataFrame.md), mismatches from left [`DataFrame`](DataFrame.md) filled with `null`
-* `Full` — all rows from left and right [`DataFrames`](DataFrame.md), any mismatches filled with `null`
-* `Exclude` — only mismatched rows from left
+* `Full` — all rows from left and right [`DataFrame`](DataFrame.md) objects, any mismatches filled with `null`
+* `Exclude` — only mismatched rows from left [`DataFrame`](DataFrame.md)
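
The six join types listed above can be compared side by side in a small plain-Kotlin model. This is a sketch under stated assumptions, not the library's implementation: rows are maps, the join uses a single key column, and `JoinKind` and `joinRows` are hypothetical names. Row order in the real library may differ.

```kotlin
// Plain-Kotlin model (an assumption for illustration): a row maps
// column names to cell values.
typealias Row = Map<String, Any?>

// Hypothetical enum mirroring the join types named in the list above.
enum class JoinKind { Inner, Filter, Left, Right, Full, Exclude }

fun joinRows(left: List<Row>, right: List<Row>, key: String, kind: JoinKind): List<Row> {
    val allColumns = (left.flatMap { it.keys } + right.flatMap { it.keys }).distinct()
    // Fills every column a row lacks with null.
    fun pad(row: Row): Row = allColumns.associateWith { row[it] }

    // Pairs of rows whose key cells are equal, merged into one row.
    val matched = left.flatMap { l -> right.filter { it[key] == l[key] }.map { r -> l + r } }
    val leftOnly = left.filter { l -> right.none { it[key] == l[key] } }
    val rightOnly = right.filter { r -> left.none { it[key] == r[key] } }

    return when (kind) {
        JoinKind.Inner -> matched
        JoinKind.Filter -> left.filter { l -> right.any { it[key] == l[key] } }
        JoinKind.Left -> matched + leftOnly.map(::pad)
        JoinKind.Right -> matched + rightOnly.map(::pad)
        JoinKind.Full -> matched + leftOnly.map(::pad) + rightOnly.map(::pad)
        JoinKind.Exclude -> leftOnly
    }
}

fun main() {
    val people = listOf(
        mapOf("name" to "Alice", "age" to 15),
        mapOf("name" to "Bob", "age" to 20),
    )
    val cities = listOf(
        mapOf("name" to "Alice", "city" to "London"),
        mapOf("name" to "Carol", "city" to "Paris"),
    )
    for (kind in JoinKind.values()) {
        println("$kind: ${joinRows(people, cities, "name", kind)}")
    }
}
```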
 
 For every join type there is a shortcut operation:
 

diff --git a/docs/StardustDocs/topics/joinWith.md b/docs/StardustDocs/topics/joinWith.md
@@ -2,7 +2,7 @@
 
 <!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.JoinWith-->
 
-Joins two [`DataFrames`](DataFrame.md) by a join expression. 
+Joins two [`DataFrame`](DataFrame.md) objects by a join expression. 
 
 ```kotlin
 joinWith(otherDf, type = JoinType.Inner) { joinExpression }
@@ -29,11 +29,11 @@ For example, you can match rows based on:
 ### Join types with examples
 
 Supported join types:
-* `Inner` (default) — only matched rows from left and right [`DataFrames`](DataFrame.md)
+* `Inner` (default) — only matched rows from left and right [`DataFrame`](DataFrame.md) objects
 * `Filter` — only matched rows from left [`DataFrame`](DataFrame.md)
 * `Left` — all rows from left [`DataFrame`](DataFrame.md), mismatches from right [`DataFrame`](DataFrame.md) filled with `null`
 * `Right` — all rows from right [`DataFrame`](DataFrame.md), mismatches from left [`DataFrame`](DataFrame.md) filled with `null`
-* `Full` — all rows from left and right [`DataFrames`](DataFrame.md), any mismatches filled with `null`
+* `Full` — all rows from left and right [`DataFrame`](DataFrame.md) objects, any mismatches filled with `null`
 * `Exclude` — only mismatched rows from left
 
 For every join type there is a shortcut operation:
@@ -272,7 +272,7 @@ campaigns.excludeJoinWith(visits) {
 
 #### Cross join
 
-Can also be called cross product of two dataframes
+It can also be called the cross product of two [`DataFrame`](DataFrame.md) objects.
 
 <!---FUN crossProduct-->
 
@@ -308,8 +308,10 @@ df1.innerJoinWith(df2) { it["index"] == right["index"] && it["age"] == right["ag
 <dataFrame src="org.jetbrains.kotlinx.dataframe.samples.api.JoinWith.compareInnerValues.html"/>
 <!---END-->
 
-Here columns from both dataframes are presented as is. So [join](join.md) is better suited for `equals` relation, and joinWith is for everything else.
-Below are two more examples with join types that allow mismatches. Note the difference in `null` values
+Here columns from both [`DataFrame`](DataFrame.md) objects are presented as is.
+So [join](join.md) is better suited for `equals` relation, and joinWith is for everything else.
+Below are two more examples with join types that allow mismatches.
+Note the difference in `null` values.
 
 <!---FUN compareLeft-->
 

diff --git a/docs/StardustDocs/topics/modify.md b/docs/StardustDocs/topics/modify.md
@@ -42,11 +42,11 @@ as [`DataFrame`](DataFrame.md) can be interpreted as a [`Collection`](https://ko
 
 **Vertical (row) operations:**
 * [append](append.md) — add rows
-* [concat](concat.md) — union rows from several [`DataFrames`](DataFrame.md)
+* [concat](concat.md) — union rows from several [`DataFrame`](DataFrame.md) objects
 * [distinct](distinct.md) / [distinctBy](distinct.md#distinctby) — remove duplicated rows
 * [drop](drop.md) / [dropLast](sliceRows.md#droplast) / [dropWhile](sliceRows.md#dropwhile) / [dropNulls](drop.md#dropnulls) / [dropNA](drop.md#dropna) — remove rows by condition
 * [duplicate](duplicate.md) — duplicate rows 
-* [explode](explode.md) — spread lists and [`DataFrames`](DataFrame.md) vertically into new rows
+* [explode](explode.md) — spread lists and [`DataFrame`](DataFrame.md) objects vertically into new rows
 * [filter](filter.md) / [filterBy](filter.md#filterby) — filter rows
 * [implode](implode.md) — merge column values into lists grouping by other columns
 * [reverse](reverse.md) — reverse rows 

diff --git a/docs/StardustDocs/topics/multipleDataFrames.md b/docs/StardustDocs/topics/multipleDataFrames.md
@@ -1,7 +1,7 @@
 [//]: # (title: Multiple DataFrames)
 <show-structure depth="3"/>
 
-* [`add`](add.md) — union of columns from several [`DataFrames`](DataFrame.md) 
-* [`concat`](concat.md) — union of rows from several [`DataFrames`](DataFrame.md)
-* [`join`](join.md) — sql-like join of two [`DataFrames`](DataFrame.md) by key columns
-* [`joinWith`](joinWith.md) — join of two [`DataFrames`](DataFrame.md) by an expression that evaluates joined [DataRows](DataRow.md) to Boolean
+* [`add`](add.md) — union of columns from several [`DataFrame`](DataFrame.md) objects
+* [`concat`](concat.md) — union of rows from several [`DataFrame`](DataFrame.md) objects
+* [`join`](join.md) — sql-like join of two [`DataFrame`](DataFrame.md) objects by key columns
+* [`joinWith`](joinWith.md) — join of two [`DataFrame`](DataFrame.md) objects by an expression that evaluates joined [DataRows](DataRow.md) to Boolean