feat: added from_str_to_temporal and continuous prediction (#767)
Dates can now be stored as dates, and the plots can make use of the date
format; see test_lineplot.

Closes #806
Closes #765 
Closes #740 
Closes #773 

### Summary of Changes

- Removed the time name from the time series class
- Added a time series tutorial
- Added support for multiple columns in line plots and scatter plots
- Added a temporal data interface and temporal operations; also added a legend
  to these plots
- Added continuous prediction
- Some refactoring and fixes
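
The commit message says dates can now be stored as dates instead of strings. As a rough illustration of why that matters for ordering and plotting, here is a standard-library sketch. It does not use the Safe-DS API; `raw_dates` and `parse_day_month_year` are made-up names, and the sample values are invented.

```python
from datetime import date, datetime

# Invented sample data: dates kept as plain "DD/MM/YYYY" strings.
raw_dates = ["01/12/2023", "15/01/2024", "02/02/2023"]


def parse_day_month_year(text: str) -> date:
    """Convert a 'DD/MM/YYYY' string into a datetime.date (hypothetical helper)."""
    return datetime.strptime(text, "%d/%m/%Y").date()


parsed = [parse_day_month_year(text) for text in raw_dates]

# Sorting the strings compares characters, so the result is not chronological.
sorted_as_text = sorted(raw_dates)

# Sorting date objects gives the chronological order a time axis needs.
sorted_as_dates = sorted(parsed)

# Date objects also support arithmetic, which strings do not.
span_days = (max(parsed) - min(parsed)).days

print(sorted_as_text)
print([d.isoformat() for d in sorted_as_dates])
print(span_days)
```

With string values, 1 December 2023 sorts before 2 February 2023; once the values are real dates, the order is chronological and a plotting library can space the axis by elapsed time.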

---------

Co-authored-by: megalinter-bot <129584137+megalinter-bot@users.noreply.github.com>
Co-authored-by: Lars Reimann <mail@larsreimann.com>
3 people authored May 31, 2024
1 parent 23009ce commit 35f468a
Showing 70 changed files with 2,961 additions and 485 deletions.
919 changes: 919 additions & 0 deletions docs/tutorials/data/US_Inflation_rates.csv

Large diffs are not rendered by default.

272 changes: 228 additions & 44 deletions docs/tutorials/data_processing.ipynb

Large diffs are not rendered by default.

208 changes: 172 additions & 36 deletions docs/tutorials/data_visualization.ipynb

Large diffs are not rendered by default.

303 changes: 258 additions & 45 deletions docs/tutorials/image_list_processing.ipynb

Large diffs are not rendered by default.

282 changes: 240 additions & 42 deletions docs/tutorials/image_processing.ipynb

Large diffs are not rendered by default.

91 changes: 74 additions & 17 deletions docs/tutorials/regression.ipynb
@@ -30,10 +30,24 @@
"pricing.slice_rows(0,15)"
],
"metadata": {
"collapsed": false
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-24T11:02:42.984142700Z",
"start_time": "2024-05-24T11:02:42.899167900Z"
}
},
"execution_count": null,
"outputs": []
"execution_count": 1,
"outputs": [
{
"data": {
"text/plain": "+-----+------+-------+-----+---+----------------+---------------+------------------+--------+\n| id | year | month | day | … | year_renovated | sqft_lot_15nn | sqft_living_15nn | price |\n| --- | --- | --- | --- | | --- | --- | --- | --- |\n| i64 | i64 | i64 | i64 | | i64 | i64 | i64 | i64 |\n+===========================================================================================+\n| 0 | 2014 | 5 | 2 | … | 0 | 9176 | 2310 | 285000 |\n| 1 | 2014 | 5 | 2 | … | 0 | 8595 | 1750 | 285000 |\n| 2 | 2014 | 5 | 2 | … | 0 | 9000 | 1850 | 440000 |\n| 3 | 2014 | 5 | 2 | … | 0 | 8942 | 1260 | 435000 |\n| 4 | 2014 | 5 | 2 | … | 0 | 10836 | 1450 | 430000 |\n| … | … | … | … | … | … | … | … | … |\n| 10 | 2014 | 5 | 2 | … | 0 | 24967 | 4050 | 604000 |\n| 11 | 2014 | 5 | 2 | … | 0 | 16215 | 1450 | 335000 |\n| 12 | 2014 | 5 | 2 | … | 0 | 35100 | 2340 | 437500 |\n| 13 | 2014 | 5 | 2 | … | 0 | 39299 | 2390 | 630000 |\n| 14 | 2014 | 5 | 2 | … | 0 | 48351 | 2820 | 675000 |\n+-----+------+-------+-----+---+----------------+---------------+------------------+--------+",
"text/html": "<div><style>\n.dataframe > thead > tr,\n.dataframe > tbody > tr {\n text-align: right;\n white-space: pre-wrap;\n}\n</style>\n<small>shape: (15, 23)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>id</th><th>year</th><th>month</th><th>day</th><th>zipcode</th><th>latitude</th><th>longitude</th><th>sqft_lot</th><th>sqft_living</th><th>sqft_above</th><th>sqft_basement</th><th>floors</th><th>bedrooms</th><th>bathrooms</th><th>waterfront</th><th>view</th><th>condition</th><th>grade</th><th>year_built</th><th>year_renovated</th><th>sqft_lot_15nn</th><th>sqft_living_15nn</th><th>price</th></tr><tr><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>f64</td><td>f64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>f64</td><td>i64</td><td>f64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td></tr></thead><tbody><tr><td>0</td><td>2014</td><td>5</td><td>2</td><td>98001</td><td>47.3406</td><td>-122.269</td><td>9397</td><td>2200</td><td>2200</td><td>0</td><td>2.0</td><td>4</td><td>2.5</td><td>0</td><td>1</td><td>3</td><td>8</td><td>1987</td><td>0</td><td>9176</td><td>2310</td><td>285000</td></tr><tr><td>1</td><td>2014</td><td>5</td><td>2</td><td>98003</td><td>47.3537</td><td>-122.303</td><td>10834</td><td>2090</td><td>1360</td><td>730</td><td>1.0</td><td>3</td><td>2.5</td><td>0</td><td>1</td><td>4</td><td>8</td><td>1987</td><td>0</td><td>8595</td><td>1750</td><td>285000</td></tr><tr><td>2</td><td>2014</td><td>5</td><td>2</td><td>98006</td><td>47.5443</td><td>-122.177</td><td>8119</td><td>2160</td><td>1080</td><td>1080</td><td>1.0</td><td>4</td><td>2.25</td><td>0</td><td>1</td><td>3</td><td>8</td><td>1966</td><td>0</td><td>9000</td><td>1850</td><td>440000</td></tr><tr><td>3</td><td>2014</td><td>5</td><td>2</td><td>98006</td><td>47.5746</td><td>-122.135</td><td>8800</td><td>1450</td><td>1450</td><td>0</td><td>1.0</td><td>4</td><td>1.0</td><td>0</td><td>1</td><td>4</td><td>7<
/td><td>1954</td><td>0</td><td>8942</td><td>1260</td><td>435000</td></tr><tr><td>4</td><td>2014</td><td>5</td><td>2</td><td>98006</td><td>47.5725</td><td>-122.133</td><td>10000</td><td>1920</td><td>1070</td><td>850</td><td>1.0</td><td>4</td><td>1.5</td><td>0</td><td>1</td><td>4</td><td>7</td><td>1954</td><td>0</td><td>10836</td><td>1450</td><td>430000</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>10</td><td>2014</td><td>5</td><td>2</td><td>98023</td><td>47.3256</td><td>-122.378</td><td>33151</td><td>3240</td><td>3240</td><td>0</td><td>2.0</td><td>3</td><td>2.5</td><td>0</td><td>3</td><td>3</td><td>10</td><td>1995</td><td>0</td><td>24967</td><td>4050</td><td>604000</td></tr><tr><td>11</td><td>2014</td><td>5</td><td>2</td><td>98024</td><td>47.5643</td><td>-121.897</td><td>16215</td><td>1580</td><td>1580</td><td>0</td><td>1.0</td><td>3</td><td>2.25</td><td>0</td><td>1</td><td>4</td><td>7</td><td>1978</td><td>0</td><td>16215</td><td>1450</td><td>335000</td></tr><tr><td>12</td><td>2014</td><td>5</td><td>2</td><td>98027</td><td>47.4635</td><td>-121.991</td><td>35100</td><td>1970</td><td>1970</td><td>0</td><td>2.0</td><td>3</td><td>2.25</td><td>0</td><td>1</td><td>4</td><td>9</td><td>1977</td><td>0</td><td>35100</td><td>2340</td><td>437500</td></tr><tr><td>13</td><td>2014</td><td>5</td><td>2</td><td>98027</td><td>47.4634</td><td>-121.987</td><td>37277</td><td>2710</td><td>2710</td><td>0</td><td>2.0</td><td>4</td><td>2.75</td><td>0</td><td>1</td><td>3</td><td>9</td><td>2000</td><td>0</td><td>39299</td><td>2390</td><td>630000</td></tr><tr><td>14</td><td>2014</td><td>5</td><td>2</td><td>98029</td><td>47.5794</td
><td>-122.025</td><td>67518</td><td>2820</td><td>2820</td><td>0</td><td>2.0</td><td>5</td><td>2.5</td><td>0</td><td>1</td><td>3</td><td>8</td><td>1979</td><td>0</td><td>48351</td><td>2820</td><td>675000</td></tr></tbody></table></div>"
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"cell_type": "markdown",
@@ -53,14 +67,20 @@
"test_table = testing_table.remove_columns([\"price\"]).shuffle_rows()"
],
"metadata": {
"collapsed": false
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-24T11:02:43.020110800Z",
"start_time": "2024-05-24T11:02:42.978634800Z"
}
},
"execution_count": null,
"execution_count": 2,
"outputs": []
},
{
"cell_type": "markdown",
"source": "3. Mark the `price` `Column` as the target variable to be predicted. Include the `id` column only as an extra column, which is completely ignored by the model:",
"source": [
"3. Mark the `price` `Column` as the target variable to be predicted. Include the `id` column only as an extra column, which is completely ignored by the model:"
],
"metadata": {
"collapsed": false
}
@@ -73,14 +93,20 @@
"train_tabular_dataset = train_table.to_tabular_dataset(\"price\", extra_names=extra_names)\n"
],
"metadata": {
"collapsed": false
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-24T11:02:43.020110800Z",
"start_time": "2024-05-24T11:02:42.987977400Z"
}
},
"execution_count": null,
"execution_count": 3,
"outputs": []
},
{
"cell_type": "markdown",
"source": "4. Use `Decision Tree` regressor as a model for the regression. Pass the \"train_tabular_dataset\" table to the fit function of the model:\n",
"source": [
"4. Use `Decision Tree` regressor as a model for the regression. Pass the \"train_tabular_dataset\" table to the fit function of the model:\n"
],
"metadata": {
"collapsed": false
}
@@ -94,9 +120,13 @@
"fitted_model = model.fit(train_tabular_dataset)"
],
"metadata": {
"collapsed": false
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-24T11:02:43.734095500Z",
"start_time": "2024-05-24T11:02:42.991527800Z"
}
},
"execution_count": null,
"execution_count": 4,
"outputs": []
},
{
@@ -118,10 +148,24 @@
"prediction.to_table().slice_rows(start=0, length=15)"
],
"metadata": {
"collapsed": false
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-24T11:02:43.747030600Z",
"start_time": "2024-05-24T11:02:43.730090600Z"
}
},
"execution_count": null,
"outputs": []
"execution_count": 5,
"outputs": [
{
"data": {
"text/plain": "+-------+------+-------+-----+---+----------------+---------------+----------------+---------------+\n| id | year | month | day | … | year_renovated | sqft_lot_15nn | sqft_living_15 | price |\n| --- | --- | --- | --- | | --- | --- | nn | --- |\n| i64 | i64 | i64 | i64 | | i64 | i64 | --- | f64 |\n| | | | | | | | i64 | |\n+==================================================================================================+\n| 20953 | 2015 | 4 | 30 | … | 2013 | 4410 | 1970 | 909625.00000 |\n| 21205 | 2015 | 5 | 5 | … | 0 | 8187 | 2300 | 776695.00000 |\n| 1360 | 2014 | 5 | 23 | … | 0 | 5000 | 1290 | 544614.75000 |\n| 15230 | 2015 | 1 | 21 | … | 0 | 217364 | 2580 | 867816.66667 |\n| 12893 | 2014 | 11 | 21 | … | 0 | 8190 | 1430 | 227493.75000 |\n| … | … | … | … | … | … | … | … | … |\n| 8807 | 2014 | 9 | 12 | … | 0 | 6060 | 2680 | 499355.20000 |\n| 13089 | 2014 | 11 | 25 | … | 0 | 5000 | 1760 | 492833.33333 |\n| 12016 | 2014 | 11 | 6 | … | 0 | 13001 | 4750 | 1352450.00000 |\n| 17314 | 2015 | 3 | 10 | … | 0 | 9375 | 1880 | 523916.66667 |\n| 4030 | 2014 | 7 | 1 | … | 2015 | 6885 | 2180 | 622937.50000 |\n+-------+------+-------+-----+---+----------------+---------------+----------------+---------------+",
"text/html": "<div><style>\n.dataframe > thead > tr,\n.dataframe > tbody > tr {\n text-align: right;\n white-space: pre-wrap;\n}\n</style>\n<small>shape: (15, 23)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>id</th><th>year</th><th>month</th><th>day</th><th>zipcode</th><th>latitude</th><th>longitude</th><th>sqft_lot</th><th>sqft_living</th><th>sqft_above</th><th>sqft_basement</th><th>floors</th><th>bedrooms</th><th>bathrooms</th><th>waterfront</th><th>view</th><th>condition</th><th>grade</th><th>year_built</th><th>year_renovated</th><th>sqft_lot_15nn</th><th>sqft_living_15nn</th><th>price</th></tr><tr><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>f64</td><td>f64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>f64</td><td>i64</td><td>f64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>i64</td><td>f64</td></tr></thead><tbody><tr><td>20953</td><td>2015</td><td>4</td><td>30</td><td>98144</td><td>47.5835</td><td>-122.313</td><td>2665</td><td>2960</td><td>1950</td><td>1010</td><td>2.0</td><td>7</td><td>4.0</td><td>0</td><td>1</td><td>3</td><td>9</td><td>1927</td><td>2013</td><td>4410</td><td>1970</td><td>909625.0</td></tr><tr><td>21205</td><td>2015</td><td>5</td><td>5</td><td>98052</td><td>47.6842</td><td>-122.155</td><td>7800</td><td>2300</td><td>2300</td><td>0</td><td>2.0</td><td>3</td><td>2.5</td><td>0</td><td>3</td><td>3</td><td>9</td><td>1997</td><td>0</td><td>8187</td><td>2300</td><td>776695.0</td></tr><tr><td>1360</td><td>2014</td><td>5</td><td>23</td><td>98115</td><td>47.684</td><td>-122.281</td><td>5000</td><td>1814</td><td>944</td><td>870</td><td>1.0</td><td>4</td><td>1.75</td><td>0</td><td>1</td><td>4</td><td>7</td><td>1951</td><td>0</td><td>5000</td><td>1290</td><td>544614.75</td></tr><tr><td>15230</td><td>2015</td><td>1</td><td>21</td><td>98077</td><td>47.7696</td><td>-122.021</td><td>217800</td><td>3810</td><td>3810</td><td>0</td><td>2.0</td><td>4</td><td>3.0</td><td>0</td
><td>1</td><td>3</td><td>9</td><td>2003</td><td>0</td><td>217364</td><td>2580</td><td>867816.666667</td></tr><tr><td>12893</td><td>2014</td><td>11</td><td>21</td><td>98031</td><td>47.4014</td><td>-122.186</td><td>8400</td><td>1070</td><td>1070</td><td>0</td><td>1.0</td><td>2</td><td>2.0</td><td>0</td><td>1</td><td>4</td><td>7</td><td>1980</td><td>0</td><td>8190</td><td>1430</td><td>227493.75</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>8807</td><td>2014</td><td>9</td><td>12</td><td>98045</td><td>47.4759</td><td>-121.735</td><td>5978</td><td>2640</td><td>2640</td><td>0</td><td>2.0</td><td>3</td><td>3.0</td><td>0</td><td>1</td><td>3</td><td>9</td><td>2012</td><td>0</td><td>6060</td><td>2680</td><td>499355.2</td></tr><tr><td>13089</td><td>2014</td><td>11</td><td>25</td><td>98117</td><td>47.6802</td><td>-122.358</td><td>5050</td><td>2090</td><td>1090</td><td>1000</td><td>1.0</td><td>4</td><td>1.75</td><td>0</td><td>1</td><td>4</td><td>7</td><td>1916</td><td>0</td><td>5000</td><td>1760</td><td>492833.333333</td></tr><tr><td>12016</td><td>2014</td><td>11</td><td>6</td><td>98059</td><td>47.5305</td><td>-122.135</td><td>12968</td><td>5550</td><td>5550</td><td>0</td><td>2.0</td><td>4</td><td>4.25</td><td>0</td><td>1</td><td>3</td><td>11</td><td>2005</td><td>0</td><td>13001</td><td>4750</td><td>1.35245e6</td></tr><tr><td>17314</td><td>2015</td><td>3</td><td>10</td><td>98052</td><td>47.631</td><td>-122.098</td><td>9500</td><td>1650</td><td>1650</td><td>0</td><td>1.0</td><td>3</td><td>1.75</td><td>0</td><td>1</td><td>3</td><td>8</td><td>1967</td><td>0</td><td>9375</td><td>1880</td><td>523916.666667</td></tr><tr><
td>4030</td><td>2014</td><td>7</td><td>1</td><td>98115</td><td>47.6763</td><td>-122.282</td><td>6885</td><td>2890</td><td>1590</td><td>1300</td><td>1.0</td><td>4</td><td>3.0</td><td>0</td><td>1</td><td>3</td><td>7</td><td>1945</td><td>2015</td><td>6885</td><td>2180</td><td>622937.5</td></tr></tbody></table></div>"
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"cell_type": "markdown",
@@ -140,10 +184,23 @@
"fitted_model.mean_absolute_error(test_tabular_dataset)\n"
],
"metadata": {
"collapsed": false
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-05-24T11:02:43.748029100Z",
"start_time": "2024-05-24T11:02:43.738085900Z"
}
},
"execution_count": null,
"outputs": []
"execution_count": 6,
"outputs": [
{
"data": {
"text/plain": "93590.45902700891"
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
]
}
],
"metadata": {
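
The last notebook cell in the diff above calls `fitted_model.mean_absolute_error(...)` and now records an output of roughly 93590.46. For readers unfamiliar with the metric, the sketch below shows how a mean absolute error is conventionally computed, in plain Python. It is not the Safe-DS implementation; the function is a stand-in, and the predicted prices are invented for illustration.

```python
def mean_absolute_error(actual: list[float], predicted: list[float]) -> float:
    """Average of |actual - predicted| over all rows (illustrative stand-in)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Three prices as true values, paired with invented predictions.
actual_prices = [285000.0, 440000.0, 435000.0]
predicted_prices = [300000.0, 420000.0, 435000.0]

# Absolute errors are 15000, 20000 and 0, so the mean is 35000 / 3.
mae = mean_absolute_error(actual_prices, predicted_prices)
print(round(mae, 2))
```

The result is in the same unit as the target column, so a value near 93590 in the tutorial means the predictions are off by about that many price units on average.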