From 62792fd7d7b11b3236da79757a057e58202d845f Mon Sep 17 00:00:00 2001 From: mattijn Date: Sun, 11 Dec 2022 23:27:09 +0100 Subject: [PATCH 1/9] add section for data dict --- doc/user_guide/data.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/doc/user_guide/data.rst b/doc/user_guide/data.rst index 8521730f1..4525a3ff0 100644 --- a/doc/user_guide/data.rst +++ b/doc/user_guide/data.rst @@ -441,6 +441,7 @@ data before usage in Altair using GeoPandas for example as such: :hidden: self + data/index encoding marks/index transform/index From 315a710f7bbfff383cb141fb0b6b95fb875b128c Mon Sep 17 00:00:00 2001 From: mattijn Date: Sun, 11 Dec 2022 23:27:43 +0100 Subject: [PATCH 2/9] define index page for data section --- doc/user_guide/data/index.rst | 49 +++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) create mode 100644 doc/user_guide/data/index.rst diff --git a/doc/user_guide/data/index.rst b/doc/user_guide/data/index.rst new file mode 100644 index 000000000..1fc52910b --- /dev/null +++ b/doc/user_guide/data/index.rst @@ -0,0 +1,49 @@ +.. currentmodule:: altair + +.. _user-guide-data: + +Data +~~~~ + +The basic data model used by Altair is tabular data, similar to a spreadsheet, a pandas DataFrame, or a database table. Individual datasets are assumed to contain a collection of records, which may contain any number of named data fields. + +Each top-level chart object (i.e. :class:`Chart`, :class:`LayerChart`, +:class:`VConcatChart`, :class:`HConcatChart`, :class:`RepeatChart`, +and :class:`FacetChart`) accepts a dataset as its first argument. + +Altair provides the following ways to specify a dataset: + +========================================= ================================================================================ +Data Description +========================================= ================================================================================ +:ref:`user-guide-dataframe-data` A `Pandas DataFrame `_. 
+:ref:`user-guide-dict-data` A :class:`Data` or related object (i.e. :class:`UrlData`, :class:`InlineData`, :class:`NamedData`). +:ref:`user-guide-url-data` A URL string pointing to a ``json`` or ``csv`` formatted text file. +:ref:`user-guide-spatial-data` An object that supports the ``__geo_interface__`` (e.g. `Geopandas GeoDataFrame `_, `Shapely Geometries `_, `GeoJSON Objects `_). +:ref:`user-guide-generator-data` A generated dataset such as numerical sequences or geographic reference elements. +========================================= ================================================================================ + +When data is specified as a DataFrame, the encoding is quite simple, as Altair +uses the data type information provided by Pandas to automatically determine +the data types required in the encoding. For example, here we specify data via a Pandas DataFrame: + +.. altair-plot:: + + import altair as alt + import pandas as pd + + data = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E'], + 'y': [5, 3, 6, 7, 2]}) + alt.Chart(data).mark_bar().encode( + x='x', + y='y', + ) + +.. 
toctree:: + :hidden: + + dataframe + dict + url + spatial + generator \ No newline at end of file From dec5dfd22736e235d9b8b0aca04b20a25e4abf6f Mon Sep 17 00:00:00 2001 From: mattijn Date: Sun, 11 Dec 2022 23:28:49 +0100 Subject: [PATCH 3/9] infrastructure other pages --- doc/user_guide/data/dataframe.rst | 219 ++++++++++++++++++++++++++++++ doc/user_guide/data/dict.rst | 13 ++ doc/user_guide/data/generator.rst | 70 ++++++++++ doc/user_guide/data/url.rst | 13 ++ 4 files changed, 315 insertions(+) create mode 100644 doc/user_guide/data/dataframe.rst create mode 100644 doc/user_guide/data/dict.rst create mode 100644 doc/user_guide/data/generator.rst create mode 100644 doc/user_guide/data/url.rst diff --git a/doc/user_guide/data/dataframe.rst b/doc/user_guide/data/dataframe.rst new file mode 100644 index 000000000..a04262490 --- /dev/null +++ b/doc/user_guide/data/dataframe.rst @@ -0,0 +1,219 @@ +.. currentmodule:: altair + +.. _user-guide-dataframe-data: + +DataFrame +~~~~~~~~~ + +When data is specified as a DataFrame, the encoding is quite simple, as Altair +uses the data type information provided by Pandas to automatically determine +the data types required in the encoding. For example, here we specify data via a DataFrame: + +.. altair-plot:: + + import altair as alt + import pandas as pd + + data = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E'], + 'y': [5, 3, 6, 7, 2]}) + alt.Chart(data).mark_bar().encode( + x='x', + y='y', + ) + +By comparison, here we create the same chart using a :class:`Data` object, +with the data specified as a JSON-style list of records: + +.. 
altair-plot:: + + import altair as alt + + data = alt.Data(values=[{'x': 'A', 'y': 5}, + {'x': 'B', 'y': 3}, + {'x': 'C', 'y': 6}, + {'x': 'D', 'y': 7}, + {'x': 'E', 'y': 2}]) + alt.Chart(data).mark_bar().encode( + x='x:O', # specify ordinal data + y='y:Q', # specify quantitative data + ) + +notice the extra markup required in the encoding; because Altair cannot infer +the types within a :class:`Data` object, we must specify them manually +(here we use :ref:`shorthand-description` to specify *ordinal* (``O``) for ``x`` +and *quantitative* (``Q``) for ``y``; see :ref:`encoding-data-types`). + +Similarly, we must also specify the data type when referencing data by URL: + +.. altair-plot:: + + import altair as alt + from vega_datasets import data + url = data.cars.url + + alt.Chart(url).mark_point().encode( + x='Horsepower:Q', + y='Miles_per_Gallon:Q' + ) + +We will further discuss encodings and associated types in :ref:`user-guide-encoding`, next. + + +.. _data-in-index: + +Including Index Data +~~~~~~~~~~~~~~~~~~~~ +By design Altair only accesses dataframe columns, not dataframe indices. +At times, relevant data appears in the index. For example: + +.. altair-plot:: + :output: repr + + import numpy as np + rand = np.random.RandomState(0) + + data = pd.DataFrame({'value': rand.randn(100).cumsum()}, + index=pd.date_range('2018', freq='D', periods=100)) + data.head() + +If you would like the index to be available to the chart, you can explicitly +turn it into a column using the ``reset_index()`` method of Pandas dataframes: + +.. altair-plot:: + + alt.Chart(data.reset_index()).mark_line().encode( + x='index:T', + y='value:Q' + ) + +If the index object does not have a ``name`` attribute set, the resulting +column will be called ``"index"``. +More information is available in the +`Pandas documentation `_. + + +.. _data-long-vs-wide: + +Long-form vs. 
Wide-form Data +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +There are two common conventions for storing data in a dataframe, sometimes called +*long-form* and *wide-form*. Both are sensible patterns for storing data in +a tabular format; briefly, the difference is this: + +- **wide-form data** has one row per *independent variable*, with metadata + recorded in the *row and column labels*. +- **long-form data** has one row per *observation*, with metadata recorded + within the table as *values*. + +Altair's grammar works best with **long-form** data, in which each row +corresponds to a single observation along with its metadata. + +A concrete example will help in making this distinction more clear. +Consider a dataset consisting of stock prices of several companies over time. +The wide-form version of the data might be arranged as follows: + +.. altair-plot:: + :output: repr + :chart-var-name: wide_form + + wide_form = pd.DataFrame({'Date': ['2007-10-01', '2007-11-01', '2007-12-01'], + 'AAPL': [189.95, 182.22, 198.08], + 'AMZN': [89.15, 90.56, 92.64], + 'GOOG': [707.00, 693.00, 691.48]}) + print(wide_form) + +Notice that each row corresponds to a single time-stamp (here time is the +independent variable), while metadata for each observation +(i.e. company name) is stored within the column labels. + +The long-form version of the same data might look like this: + +.. altair-plot:: + :output: repr + :chart-var-name: long_form + + long_form = pd.DataFrame({'Date': ['2007-10-01', '2007-11-01', '2007-12-01', + '2007-10-01', '2007-11-01', '2007-12-01', + '2007-10-01', '2007-11-01', '2007-12-01'], + 'company': ['AAPL', 'AAPL', 'AAPL', + 'AMZN', 'AMZN', 'AMZN', + 'GOOG', 'GOOG', 'GOOG'], + 'price': [189.95, 182.22, 198.08, + 89.15, 90.56, 92.64, + 707.00, 693.00, 691.48]}) + print(long_form) + +Notice here that each row contains a single observation (i.e. price), along +with the metadata for this observation (the date and company name). 
+Importantly, the column and index labels no longer contain any useful metadata. + +As mentioned above, Altair works best with this long-form data, because relevant +data and metadata are stored within the table itself, rather than within the +labels of rows and columns: + +.. altair-plot:: + + alt.Chart(long_form).mark_line().encode( + x='Date:T', + y='price:Q', + color='company:N' + ) + +Wide-form data can be similarly visualized using e.g. layering +(see :ref:`layer-chart`), but it is far less convenient within Altair's grammar. + +If you would like to convert data from wide-form to long-form, there are two possible +approaches: it can be done as a preprocessing step using pandas, or as a transform +step within the chart itself. We will detail to two approaches below. + +.. _data-converting-long-form: + +Converting Between Long-form and Wide-form: Pandas +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +This sort of data manipulation can be done as a preprocessing step using Pandas_, +and is discussed in detail in the `Reshaping and Pivot Tables`_ section of the +Pandas documentation. + +For converting wide-form data to the long-form data used by Altair, the ``melt`` +method of dataframes can be used. The first argument to ``melt`` is the column +or list of columns to treat as index variables; the remaining columns will +be combined into an indicator variable and a value variable whose names can +be optionally specified: + +.. altair-plot:: + :output: repr + + wide_form.melt('Date', var_name='company', value_name='price') + +For more information on the ``melt`` method, see the `Pandas melt documentation`_. + +In case you would like to undo this operation and convert from long-form back +to wide-form, the ``pivot`` method of dataframes is useful. + +.. altair-plot:: + :output: repr + + long_form.pivot(index='Date', columns='company', values='price').reset_index() + +For more information on the ``pivot`` method, see the `Pandas pivot documentation`_. 
+ +Converting Between Long-form and Wide-form: Fold Transform +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If you would like to avoid data preprocessing, you can reshape your data using Altair's +Fold Transform (see :ref:`user-guide-fold-transform` for a full discussion). +With it, the above chart can be reproduced as follows: + +.. altair-plot:: + + alt.Chart(wide_form).transform_fold( + ['AAPL', 'AMZN', 'GOOG'], + as_=['company', 'price'] + ).mark_line().encode( + x='Date:T', + y='price:Q', + color='company:N' + ) + +Notice that unlike the pandas ``melt`` function we must explicitly specify the columns +to be folded. The ``as_`` argument is optional, with the default being ``["key", "value"]``. \ No newline at end of file diff --git a/doc/user_guide/data/dict.rst b/doc/user_guide/data/dict.rst new file mode 100644 index 000000000..5ffa665f6 --- /dev/null +++ b/doc/user_guide/data/dict.rst @@ -0,0 +1,13 @@ +.. currentmodule:: altair + +.. _user-guide-dict-data: + +Dictionary +~~~~~~~~~~ + +Describe + +Examples +-------- + +Show \ No newline at end of file diff --git a/doc/user_guide/data/generator.rst b/doc/user_guide/data/generator.rst new file mode 100644 index 000000000..b7f44ed28 --- /dev/null +++ b/doc/user_guide/data/generator.rst @@ -0,0 +1,70 @@ +.. currentmodule:: altair + +.. _user-guide-generator-data: + +Generated data +~~~~~~~~~~~~~~ + +At times it is convenient to not use an external data source, but rather generate data for +display within the chart specification itself. The benefit is that the chart specification +can be made much smaller for generated data than for embedded data. + +Sequence Generator +^^^^^^^^^^^^^^^^^^ +Here is an example of using the :func:`sequence` function to generate a sequence of *x* +data, along with a :ref:`user-guide-calculate-transform` to compute *y* data. + +.. 
altair-plot:: + + import altair as alt + + # Note that the following generator is functionally similar to + # data = pd.DataFrame({'x': np.arange(0, 10, 0.1)}) + data = alt.sequence(0, 10, 0.1, as_='x') + + alt.Chart(data).transform_calculate( + y='sin(datum.x)' + ).mark_line().encode( + x='x:Q', + y='y:Q', + ) + +Graticule Generator +^^^^^^^^^^^^^^^^^^^ +Another type of data that is convenient to generate in the chart itself is the latitude/longitude +lines on a geographic visualization, known as a graticule. These can be created using Altair's +:func:`graticule` generator function. Here is a simple example: + +.. altair-plot:: + + import altair as alt + + data = alt.graticule(step=[15, 15]) + + alt.Chart(data).mark_geoshape(stroke='black').project( + 'orthographic', + rotate=[0, -45, 0] + ) + +Sphere Generator +^^^^^^^^^^^^^^^^ +Finally when visualizing the globe a sphere can be used as a background layer +within a map to represent the extent of the Earth. This sphere data can be +created using Altair's :func:`sphere` generator function. Here is an example: + +.. altair-plot:: + + import altair as alt + + sphere_data = alt.sphere() + grat_data = alt.graticule(step=[15, 15]) + + background = alt.Chart(sphere_data).mark_geoshape(fill='aliceblue') + lines = alt.Chart(grat_data).mark_geoshape(stroke='lightgrey') + + alt.layer(background, lines).project('naturalEarth1') + +.. _Pandas: http://pandas.pydata.org/ +.. _Pandas pivot documentation: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pivot.html +.. _Pandas melt documentation: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.melt.html#pandas.DataFrame.melt +.. _Reshaping and Pivot Tables: https://pandas.pydata.org/pandas-docs/stable/reshaping.html \ No newline at end of file diff --git a/doc/user_guide/data/url.rst b/doc/user_guide/data/url.rst new file mode 100644 index 000000000..684922d39 --- /dev/null +++ b/doc/user_guide/data/url.rst @@ -0,0 +1,13 @@ +.. 
currentmodule:: altair + +.. _user-guide-url-data: + +URL +~~~ + +Describe + +Examples +-------- + +Show \ No newline at end of file From 42e2de614d5e6f42eb6c4befb1c82980ae17b0a5 Mon Sep 17 00:00:00 2001 From: mattijn Date: Sun, 11 Dec 2022 23:29:09 +0100 Subject: [PATCH 4/9] docs spatial data --- doc/user_guide/data/spatial.rst | 280 ++++++++++++++++++++++++++++++++ 1 file changed, 280 insertions(+) create mode 100644 doc/user_guide/data/spatial.rst diff --git a/doc/user_guide/data/spatial.rst b/doc/user_guide/data/spatial.rst new file mode 100644 index 000000000..eee7e3e1f --- /dev/null +++ b/doc/user_guide/data/spatial.rst @@ -0,0 +1,280 @@ +.. currentmodule:: altair + +.. _user-guide-spatial-data: + +Spatial Data +~~~~~~~~~~~~ + +On this page we explain different methods for working with spatial data in Altair. + +The following methods for working with spatial data are discussed below: + +- :ref:`spatial-data-gdf` +- :ref:`spatial-data-inline-geojson` +- :ref:`spatial-data-remote-geojson` +- :ref:`spatial-data-inline-topojson` +- :ref:`spatial-data-remote-topojson` +- :ref:`spatial-data-nested-geojson` + + +.. _spatial-data-gdf: + +GeoPandas GeoDataFrame +~~~~~~~~~~~~~~~~~~~~~~ + +It is convenient to use GeoPandas as the source for your spatial data. +GeoPandas can read many types of spatial data and Altair is optimized for +reading these. Here we define four polygon geometries in a +GeoDataFrame and visualize them using ``mark_geoshape``. + +.. 
altair-plot:: + :output: repr + + from shapely import geometry + import geopandas as gpd + import altair as alt + + data_geoms = [ + {"color": "#F3C14F", "geometry": geometry.Polygon([[1.45, 3.75], [1.45, 0], [0, 0], [1.45, 3.75]])}, + {"color": "#4098D7", "geometry": geometry.Polygon([[1.45, 0], [1.45, 3.75], [2.57, 3.75], [2.57, 0], [2.33, 0], [1.45, 0]])}, + {"color": "#66B4E2", "geometry": geometry.Polygon([[2.33, 0], [2.33, 2.5], [3.47, 2.5], [3.47, 0], [3.2, 0], [2.57, 0], [2.33, 0]])}, + {"color": "#A9CDE0", "geometry": geometry.Polygon([[3.2, 0], [3.2, 1.25], [4.32, 1.25], [4.32, 0], [3.47, 0], [3.2, 0]])}, + ] + + gdf_geoms = gpd.GeoDataFrame(data_geoms) + gdf_geoms + + +This data uses a non-geographic projection. Therefore we use the +``project`` configuration ``type="identity", reflectY=True`` to draw the +geometries without applying a projection. By using ``scale=None`` we +disable the scale for the color channel and Altair will use the defined +hex color codes directly. + +.. altair-plot:: + + alt.Chart(gdf_geoms, title="Vega-Altair").mark_geoshape().encode( + color=alt.Color("color:N", scale=None) + ).project(type="identity", reflectY=True) + + +.. _spatial-data-inline-geojson: + +Inline GeoJSON object +~~~~~~~~~~~~~~~~~~~~~ + +If your source data is a GeoJSON file and you do not want to load it +into a GeoPandas GeoDataFrame you can specify it directly in Altair. A +GeoJSON file normally consists of a ``FeatureCollection`` with a list of +``features`` where information for each geometry is specified within a +``properties`` dictionary. In the following example a GeoJSON-like data +object is passed to an Altair ``Data`` object, with the ``property`` +parameter set to the key that contains the nested list (here named +``features``). + +.. 
altair-plot:: + :output: repr + + obj_geojson = { + "type": "FeatureCollection", + "features":[ + {"type": "Feature", "properties": {"location": "left"}, "geometry": {"type": "Polygon", "coordinates": [[[1.45, 3.75], [1.45, 0], [0, 0], [1.45, 3.75]]]}}, + {"type": "Feature", "properties": {"location": "middle-left"}, "geometry": {"type": "Polygon", "coordinates": [[[1.45, 0], [1.45, 3.75], [2.57, 3.75], [2.57, 0], [2.33, 0], [1.45, 0]]]}}, + {"type": "Feature", "properties": {"location": "middle-right"}, "geometry": {"type": "Polygon", "coordinates": [[[2.33, 0], [2.33, 2.5], [3.47, 2.5], [3.47, 0], [3.2, 0], [2.57, 0], [2.33, 0]]]}}, + {"type": "Feature", "properties": {"location": "right"}, "geometry": {"type": "Polygon", "coordinates": [[[3.2, 0], [3.2, 1.25], [4.32, 1.25], [4.32, 0], [3.47, 0], [3.2, 0]]]}} + ] + } + data_obj_geojson = alt.Data(values=obj_geojson, format=alt.DataFormat(property="features")) + data_obj_geojson + +The information is stored within the ``properties`` dictionary. We +specify the nested variable name (here ``location``) within the color +channel encoding. We apply a ``magma`` color scheme as a custom scale +for the ordinal structured data. + +.. altair-plot:: + + alt.Chart(data_obj_geojson, title="Vega-Altair - ordinal scale").mark_geoshape().encode( + color=alt.Color("properties.location:O", scale=alt.Scale(scheme='magma')) + ).project(type="identity", reflectY=True) + + +.. _spatial-data-remote-geojson: + +GeoJSON file by URL +~~~~~~~~~~~~~~~~~~~ + +Altair can load GeoJSON resources directly from a web URL. Here we use +an example from geojson.xyz. As explained in +:ref:`spatial-data-inline-geojson`, we specify ``features`` as the +value for the ``property`` parameter in the ``alt.DataFormat()`` object +and prefix ``scalerank`` with the name of the nested dictionary where +the information for each geometry is stored (``properties``). + +.. 
altair-plot:: + :output: repr + + url_geojson = "https://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_land.geojson" + data_url_geojson = alt.Data(url=url_geojson, format=alt.DataFormat(property="features")) + data_url_geojson + +.. altair-plot:: + + alt.Chart(data_url_geojson).mark_geoshape().encode(color='properties.scalerank:N') + + +.. _spatial-data-inline-topojson: + +Inline TopoJSON object +~~~~~~~~~~~~~~~~~~~~~~ + +TopoJSON is an extension of GeoJSON, where the geometries of the features +are referenced from a top-level object named ``arcs``. Each shared arc is +only stored once to reduce size. A TopoJSON file object can contain +multiple objects (e.g. boundary border and province border). When +defining a TopoJSON object for Altair we specify the ``topojson`` type +data format and the name of the object we would like to visualize using the +``feature`` parameter (here ``MY_DATA``). + +Note: the key-name ``MY_DATA`` is arbitrary and differs in each dataset. + +.. altair-plot:: + :output: repr + + obj_topojson = { + "arcs": [ + [[1.0, 1.0], [0.0, 1.0], [0.0, 0.0], [1.0, 0.0]], + [[1.0, 0.0], [2.0, 0.0], [2.0, 1.0], [1.0, 1.0]], + [[1.0, 1.0], [1.0, 0.0]], + ], + "objects": { + "MY_DATA": { + "geometries": [ + {"arcs": [[-3, 0]], "properties": {"name": "abc"}, "type": "Polygon"}, + {"arcs": [[1, 2]], "properties": {"name": "def"}, "type": "Polygon"}, + ], + "type": "GeometryCollection", + } + }, + "type": "Topology", + } + data_obj_topojson = alt.Data( + values=obj_topojson, format=alt.DataFormat(feature="MY_DATA", type="topojson") + ) + data_obj_topojson + +.. altair-plot:: + + alt.Chart(data_obj_topojson).mark_geoshape( + ).encode( + color="properties.name:N" + ).project( + type='identity', reflectY=True + ) + + +.. _spatial-data-remote-topojson: + +TopoJSON file by URL +~~~~~~~~~~~~~~~~~~~~ + +Altair can load TopoJSON resources directly from a web URL. 
As +explained in :ref:`spatial-data-inline-topojson`, we have to +specify ``boroughs`` as the object name for the ``feature`` parameter and +define the type of data as ``topojson`` in the ``alt.DataFormat()`` +object. + +.. altair-plot:: + :output: repr + + from vega_datasets import data + + url_topojson = data.londonBoroughs.url + + data_url_topojson = alt.Data( + url=url_topojson, format=alt.DataFormat(feature="boroughs", type="topojson") + ) + + data_url_topojson + +Note: There also exists a shorthand to extract the objects from a +TopoJSON file if this file is accessible by URL: +``alt.topo_feature(url=url_topojson, feature="boroughs")`` + +We color encode the boroughs by their names, as they are stored as a +unique identifier (``id``). We use a ``symbolLimit`` of 33 in two +columns to display all entries in the legend. + +.. altair-plot:: + + alt.Chart(data_url_topojson, title="London-Boroughs").mark_geoshape( + tooltip=True + ).encode( + color=alt.Color("id:N", legend=alt.Legend(columns=2, symbolLimit=33)) + ) + + + +.. _spatial-data-nested-geojson: + +Nested GeoJSON objects +~~~~~~~~~~~~~~~~~~~~~~ + +GeoJSON data can also be nested within another dataset. In this case it +is possible to use the ``shape`` encoding channel to visualize the +nested dictionary that contains the GeoJSON objects. In the following +example the GeoJSON object is nested in ``geo``: + +.. 
altair-plot:: + + nested_features = [ + {"color": "#F3C14F", "geo": {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[1.45, 3.75], [1.45, 0], [0, 0], [1.45, 3.75]]]}}}, + {"color": "#4098D7", "geo": {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[1.45, 0], [1.45, 3.75], [2.57, 3.75], [2.57, 0], [2.33, 0], [1.45, 0]]]}}}, + {"color": "#66B4E2", "geo": {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[2.33, 0], [2.33, 2.5], [3.47, 2.5], [3.47, 0], [3.2, 0], [2.57, 0], [2.33, 0]]]}}}, + {"color": "#A9CDE0", "geo": {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[3.2, 0], [3.2, 1.25], [4.32, 1.25], [4.32, 0], [3.47, 0], [3.2, 0]]]}}}, + ] + data_nested_features = alt.Data(values=nested_features) + + alt.Chart(data_nested_features, title="Vega-Altair").mark_geoshape().encode( + shape="geo:G", + color=alt.Color("color:N", scale=None) + ).project(type="identity", reflectY=True) + + +.. _data-projections: + +Projections +~~~~~~~~~~~ +For geographic data it is best to use the World Geodetic System 1984 as +the geographic coordinate reference system, with units in decimal degrees. + +Avoid passing projected data to Altair; instead, reproject your spatial data to +EPSG:4326 first. + +If your data comes in a different projection (e.g. with units in meters) and you don't +have the option to reproject the data, try using the project configuration +``(type='identity', reflectY=True)``. It draws the geometries without applying a projection. + + +.. _data-winding-order: + +Winding Order +~~~~~~~~~~~~~ +LineString, Polygon and MultiPolygon geometries contain coordinates in an order: lines +go in a certain direction, and polygon rings do too. The GeoJSON-like structure of the +``__geo_interface__`` recommends the right-hand rule winding order for Polygon and +MultiPolygon geometries, meaning that exterior rings should be counterclockwise and interior +rings should be clockwise. 
While it recommends the right-hand rule winding order, it does not +reject geometries that do not use the right-hand rule. + +Altair does NOT follow the right-hand rule for geometries, but uses the left-hand rule. +Meaning that exterior rings should be clockwise and interior rings should be +counterclockwise. + +If you face a problem regarding winding order, try to force the left-hand rule on your +data before usage in Altair using GeoPandas for example as such: + +.. code:: python + + from shapely.ops import orient + gdf.geometry = gdf.geometry.apply(orient, args=(-1,)) \ No newline at end of file From 8993066197eb4d5266c6f6292aeb0d2fab078202 Mon Sep 17 00:00:00 2001 From: mattijn Date: Tue, 13 Dec 2022 21:50:27 +0100 Subject: [PATCH 5/9] empty non spatial pages --- doc/user_guide/data/dataframe.rst | 213 +----------------------------- doc/user_guide/data/dict.rst | 7 +- doc/user_guide/data/generator.rst | 64 +-------- doc/user_guide/data/url.rst | 7 +- 4 files changed, 4 insertions(+), 287 deletions(-) diff --git a/doc/user_guide/data/dataframe.rst b/doc/user_guide/data/dataframe.rst index a04262490..9683df7b3 100644 --- a/doc/user_guide/data/dataframe.rst +++ b/doc/user_guide/data/dataframe.rst @@ -5,215 +5,4 @@ DataFrame ~~~~~~~~~ -When data is specified as a DataFrame, the encoding is quite simple, as Altair -uses the data type information provided by Pandas to automatically determine -the data types required in the encoding. For example, here we specify data via a DataFrame: - -.. altair-plot:: - - import altair as alt - import pandas as pd - - data = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E'], - 'y': [5, 3, 6, 7, 2]}) - alt.Chart(data).mark_bar().encode( - x='x', - y='y', - ) - -By comparison, here we create the same chart using a :class:`Data` object, -with the data specified as a JSON-style list of records: - -.. 
altair-plot:: - - import altair as alt - - data = alt.Data(values=[{'x': 'A', 'y': 5}, - {'x': 'B', 'y': 3}, - {'x': 'C', 'y': 6}, - {'x': 'D', 'y': 7}, - {'x': 'E', 'y': 2}]) - alt.Chart(data).mark_bar().encode( - x='x:O', # specify ordinal data - y='y:Q', # specify quantitative data - ) - -notice the extra markup required in the encoding; because Altair cannot infer -the types within a :class:`Data` object, we must specify them manually -(here we use :ref:`shorthand-description` to specify *ordinal* (``O``) for ``x`` -and *quantitative* (``Q``) for ``y``; see :ref:`encoding-data-types`). - -Similarly, we must also specify the data type when referencing data by URL: - -.. altair-plot:: - - import altair as alt - from vega_datasets import data - url = data.cars.url - - alt.Chart(url).mark_point().encode( - x='Horsepower:Q', - y='Miles_per_Gallon:Q' - ) - -We will further discuss encodings and associated types in :ref:`user-guide-encoding`, next. - - -.. _data-in-index: - -Including Index Data -~~~~~~~~~~~~~~~~~~~~ -By design Altair only accesses dataframe columns, not dataframe indices. -At times, relevant data appears in the index. For example: - -.. altair-plot:: - :output: repr - - import numpy as np - rand = np.random.RandomState(0) - - data = pd.DataFrame({'value': rand.randn(100).cumsum()}, - index=pd.date_range('2018', freq='D', periods=100)) - data.head() - -If you would like the index to be available to the chart, you can explicitly -turn it into a column using the ``reset_index()`` method of Pandas dataframes: - -.. altair-plot:: - - alt.Chart(data.reset_index()).mark_line().encode( - x='index:T', - y='value:Q' - ) - -If the index object does not have a ``name`` attribute set, the resulting -column will be called ``"index"``. -More information is available in the -`Pandas documentation `_. - - -.. _data-long-vs-wide: - -Long-form vs. 
Wide-form Data -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -There are two common conventions for storing data in a dataframe, sometimes called -*long-form* and *wide-form*. Both are sensible patterns for storing data in -a tabular format; briefly, the difference is this: - -- **wide-form data** has one row per *independent variable*, with metadata - recorded in the *row and column labels*. -- **long-form data** has one row per *observation*, with metadata recorded - within the table as *values*. - -Altair's grammar works best with **long-form** data, in which each row -corresponds to a single observation along with its metadata. - -A concrete example will help in making this distinction more clear. -Consider a dataset consisting of stock prices of several companies over time. -The wide-form version of the data might be arranged as follows: - -.. altair-plot:: - :output: repr - :chart-var-name: wide_form - - wide_form = pd.DataFrame({'Date': ['2007-10-01', '2007-11-01', '2007-12-01'], - 'AAPL': [189.95, 182.22, 198.08], - 'AMZN': [89.15, 90.56, 92.64], - 'GOOG': [707.00, 693.00, 691.48]}) - print(wide_form) - -Notice that each row corresponds to a single time-stamp (here time is the -independent variable), while metadata for each observation -(i.e. company name) is stored within the column labels. - -The long-form version of the same data might look like this: - -.. altair-plot:: - :output: repr - :chart-var-name: long_form - - long_form = pd.DataFrame({'Date': ['2007-10-01', '2007-11-01', '2007-12-01', - '2007-10-01', '2007-11-01', '2007-12-01', - '2007-10-01', '2007-11-01', '2007-12-01'], - 'company': ['AAPL', 'AAPL', 'AAPL', - 'AMZN', 'AMZN', 'AMZN', - 'GOOG', 'GOOG', 'GOOG'], - 'price': [189.95, 182.22, 198.08, - 89.15, 90.56, 92.64, - 707.00, 693.00, 691.48]}) - print(long_form) - -Notice here that each row contains a single observation (i.e. price), along -with the metadata for this observation (the date and company name). 
-Importantly, the column and index labels no longer contain any useful metadata. - -As mentioned above, Altair works best with this long-form data, because relevant -data and metadata are stored within the table itself, rather than within the -labels of rows and columns: - -.. altair-plot:: - - alt.Chart(long_form).mark_line().encode( - x='Date:T', - y='price:Q', - color='company:N' - ) - -Wide-form data can be similarly visualized using e.g. layering -(see :ref:`layer-chart`), but it is far less convenient within Altair's grammar. - -If you would like to convert data from wide-form to long-form, there are two possible -approaches: it can be done as a preprocessing step using pandas, or as a transform -step within the chart itself. We will detail to two approaches below. - -.. _data-converting-long-form: - -Converting Between Long-form and Wide-form: Pandas -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -This sort of data manipulation can be done as a preprocessing step using Pandas_, -and is discussed in detail in the `Reshaping and Pivot Tables`_ section of the -Pandas documentation. - -For converting wide-form data to the long-form data used by Altair, the ``melt`` -method of dataframes can be used. The first argument to ``melt`` is the column -or list of columns to treat as index variables; the remaining columns will -be combined into an indicator variable and a value variable whose names can -be optionally specified: - -.. altair-plot:: - :output: repr - - wide_form.melt('Date', var_name='company', value_name='price') - -For more information on the ``melt`` method, see the `Pandas melt documentation`_. - -In case you would like to undo this operation and convert from long-form back -to wide-form, the ``pivot`` method of dataframes is useful. - -.. altair-plot:: - :output: repr - - long_form.pivot(index='Date', columns='company', values='price').reset_index() - -For more information on the ``pivot`` method, see the `Pandas pivot documentation`_. 
- -Converting Between Long-form and Wide-form: Fold Transform -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -If you would like to avoid data preprocessing, you can reshape your data using Altair's -Fold Transform (see :ref:`user-guide-fold-transform` for a full discussion). -With it, the above chart can be reproduced as follows: - -.. altair-plot:: - - alt.Chart(wide_form).transform_fold( - ['AAPL', 'AMZN', 'GOOG'], - as_=['company', 'price'] - ).mark_line().encode( - x='Date:T', - y='price:Q', - color='company:N' - ) - -Notice that unlike the pandas ``melt`` function we must explicitly specify the columns -to be folded. The ``as_`` argument is optional, with the default being ``["key", "value"]``. \ No newline at end of file +Describe \ No newline at end of file diff --git a/doc/user_guide/data/dict.rst b/doc/user_guide/data/dict.rst index 5ffa665f6..7cc9627f2 100644 --- a/doc/user_guide/data/dict.rst +++ b/doc/user_guide/data/dict.rst @@ -5,9 +5,4 @@ Dictionary ~~~~~~~~~~ -Describe - -Examples --------- - -Show \ No newline at end of file +Describe \ No newline at end of file diff --git a/doc/user_guide/data/generator.rst b/doc/user_guide/data/generator.rst index b7f44ed28..1b7e24b4c 100644 --- a/doc/user_guide/data/generator.rst +++ b/doc/user_guide/data/generator.rst @@ -5,66 +5,4 @@ Generated data ~~~~~~~~~~~~~~ -At times it is convenient to not use an external data source, but rather generate data for -display within the chart specification itself. The benefit is that the chart specification -can be made much smaller for generated data than for embedded data. - -Sequence Generator -^^^^^^^^^^^^^^^^^^ -Here is an example of using the :func:`sequence` function to generate a sequence of *x* -data, along with a :ref:`user-guide-calculate-transform` to compute *y* data. - -.. 
altair-plot:: - - import altair as alt - - # Note that the following generator is functionally similar to - # data = pd.DataFrame({'x': np.arange(0, 10, 0.1)}) - data = alt.sequence(0, 10, 0.1, as_='x') - - alt.Chart(data).transform_calculate( - y='sin(datum.x)' - ).mark_line().encode( - x='x:Q', - y='y:Q', - ) - -Graticule Generator -^^^^^^^^^^^^^^^^^^^ -Another type of data that is convenient to generate in the chart itself is the latitude/longitude -lines on a geographic visualization, known as a graticule. These can be created using Altair's -:func:`graticule` generator function. Here is a simple example: - -.. altair-plot:: - - import altair as alt - - data = alt.graticule(step=[15, 15]) - - alt.Chart(data).mark_geoshape(stroke='black').project( - 'orthographic', - rotate=[0, -45, 0] - ) - -Sphere Generator -^^^^^^^^^^^^^^^^ -Finally when visualizing the globe a sphere can be used as a background layer -within a map to represent the extent of the Earth. This sphere data can be -created using Altair's :func:`sphere` generator function. Here is an example: - -.. altair-plot:: - - import altair as alt - - sphere_data = alt.sphere() - grat_data = alt.graticule(step=[15, 15]) - - background = alt.Chart(sphere_data).mark_geoshape(fill='aliceblue') - lines = alt.Chart(grat_data).mark_geoshape(stroke='lightgrey') - - alt.layer(background, lines).project('naturalEarth1') - -.. _Pandas: http://pandas.pydata.org/ -.. _Pandas pivot documentation: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pivot.html -.. _Pandas melt documentation: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.melt.html#pandas.DataFrame.melt -.. 
_Reshaping and Pivot Tables: https://pandas.pydata.org/pandas-docs/stable/reshaping.html \ No newline at end of file +Describe \ No newline at end of file diff --git a/doc/user_guide/data/url.rst b/doc/user_guide/data/url.rst index 684922d39..81103a548 100644 --- a/doc/user_guide/data/url.rst +++ b/doc/user_guide/data/url.rst @@ -5,9 +5,4 @@ URL ~~~ -Describe - -Examples --------- - -Show \ No newline at end of file +Describe \ No newline at end of file From d52101932bc0554e7ac67fc7f2da01e659093ef6 Mon Sep 17 00:00:00 2001 From: Mattijn van Hoek Date: Fri, 16 Dec 2022 22:48:14 +0100 Subject: [PATCH 6/9] Apply suggestions from code review Love the review, thanks @joelostblom! Co-authored-by: Joel Ostblom --- doc/user_guide/data/index.rst | 25 ++++++--- doc/user_guide/data/spatial.rst | 89 ++++++++++++++++----------------- 2 files changed, 59 insertions(+), 55 deletions(-) diff --git a/doc/user_guide/data/index.rst b/doc/user_guide/data/index.rst index 1fc52910b..bb99af7bc 100644 --- a/doc/user_guide/data/index.rst +++ b/doc/user_guide/data/index.rst @@ -5,18 +5,21 @@ Data ~~~~ -The basic data model used by Altair is tabular data, similar to a spreadsheet, pandas DataFrame or a database table. Individual data sets are assumed to contain a collection of records, which may contain any number of named data fields. - +The basic data model used by Altair is tabular data, +similar to a spreadsheet or database table. +Individual datasets are assumed to contain a collection of records (rows), +which may contain any number of named data fields (columns). Each top-level chart object (i.e. :class:`Chart`, :class:`LayerChart`, -and :class:`VConcatChart`, :class:`HConcatChart`, :class:`RepeatChart`, -:class:`FacetChart`) accepts a dataset as its first argument. +:class:`VConcatChart`, :class:`HConcatChart`, :class:`RepeatChart`, +and :class:`FacetChart`) accepts a dataset as its first argument. 
-Altair provides the following ways to specify a dataset: +While the most common way to provide Altair with a dataset is via a pandas DataFrame, +there are many different ways of specifying a dataset: ========================================= ================================================================================ Data Description ========================================= ================================================================================ -:ref:`user-guide-dataframe-data` A `Pandas DataFrame `_. +:ref:`user-guide-dataframe-data` A `pandas DataFrame `_. :ref:`user-guide-dict-data` A :class:`Data` or related object (i.e. :class:`UrlData`, :class:`InlineData`, :class:`NamedData`). :ref:`user-guide-url-data` A url string pointing to a ``json`` or ``csv`` formatted text file. :ref:`user-guide-spatial-data` An object that supports the `__geo_interface__` (eg. `Geopandas GeoDataFrame `_, `Shapely Geometries `_, `GeoJSON Objects `_). @@ -24,8 +27,10 @@ Data Description ========================================= ================================================================================ When data is specified as a DataFrame, the encoding is quite simple, as Altair -uses the data type information provided by Pandas to automatically determine -the data types required in the encoding. For example, here we specify data via a Pandas DataFrame: +uses the data type information provided by pandas to automatically determine +the data types required in the encoding. For example, here we specify data via a pandas DataFrame +and Altair automatically detects that the x-column should be visualized on a categorical scale +and that the y-column should be visualized on a quantitative scale: .. altair-plot:: @@ -39,6 +44,10 @@ the data types required in the encoding.
For example, here we specify data via a y='y', ) +If we are not happy with the scale that Altair chooses for visualizing that data, +we can change it by either changing the data types in the underlying pandas dataframe, +or by changing the :ref:`_encoding-data-types` in Altair. + .. toctree:: :hidden: diff --git a/doc/user_guide/data/spatial.rst b/doc/user_guide/data/spatial.rst index eee7e3e1f..cd630a0ee 100644 --- a/doc/user_guide/data/spatial.rst +++ b/doc/user_guide/data/spatial.rst @@ -3,18 +3,11 @@ .. _user-guide-spatial-data: Spatial Data -~~~~~~~~~~~~ +============ -On this page we explain different methods to work with spatial data and Altair. - -The following methods for working with spatial data are discussed below: - -- :ref:`spatial-data-gdf` -- :ref:`spatial-data-inline-geojson` -- :ref:`spatial-data-remote-geojson` -- :ref:`spatial-data-inline-topojson` -- :ref:`spatial-data-remote-topojson` -- :ref:`spatial-data-nested-geojson` +On this page we explain different methods for reading spatial data into Altair. +To learn more about how to work with this data after you have read it in, +please see the :ref:`user-guide-geoshape-marks` mark page. .. _spatial-data-gdf: @@ -22,9 +15,9 @@ The following methods for working with spatial data are discussed below: GeoPandas GeoDataFrame ~~~~~~~~~~~~~~~~~~~~~~ -It is convenient to use geopandas as source for your spatial data. -Geopandas can read many type spatial data and Altair is optimized in -reading these. Here we define four polygon geometries into a +It is convenient to use GeoPandas as the source for your spatial data. +GeoPandas can read many types of spatial data and Altair works well with GeoDataFrames. +Here we define four polygon geometries into a GeoDataFrame and visualize these using the ``mark_geoshape``. .. altair-plot:: @@ -45,11 +38,11 @@ GeoDataFrame and visualize these using the ``mark_geoshape``. gdf_geoms -This data uses a non-geographic projection. 
Therefor we use the -``project`` configuration ``type="identity", reflectY=True`` to draw the -geometries without applying a projection. By using ``scale=None`` we -disable the scale for the color channel and Altair will use the defined --Hex color- codes directly. +Since the spatial data in our example is not geographic, +we use the ``project`` configuration ``type="identity", reflectY=True`` to draw the +geometries without applying a geographic projection. By using ``alt.Color(..., scale=None)`` we +disable the automatic color assignment in Altair +and instead directly use the provided Hex color codes. .. altair-plot:: @@ -64,11 +57,11 @@ Inline GeoJSON object ~~~~~~~~~~~~~~~~~~~~~ If your source data is a GeoJSON file and you do not want to load it -into a GeoPandas GeoDataFrame you can specify it directly in Altair. A -GeoJSON file consists normally of a ``FeatureCollection`` with a list of -``features`` where information for each geometry is specified within a +into a GeoPandas GeoDataFrame you can provide it as a dictionary to the Altair ``Data`` class. A +GeoJSON file normally consists of a ``FeatureCollection`` with a list of +``features`` where the information for each geometry is specified within a ``properties`` dictionary. In the following example a GeoJSON-like data -object is specified into an altair Data object using the ``property`` +object is specified into a ``Data`` class using the ``property`` value of the ``key`` that contain the nested list (here named ``features``). @@ -87,9 +80,12 @@ value of the ``key`` that contain the nested list (here named data_obj_geojson = alt.Data(values=obj_geojson, format=alt.DataFormat(property="features")) data_obj_geojson -The information is stored within the ``properties`` dictionary. We -specify the nested variable name (here ``location``) within the color -channel encoding. We apply a ``magma`` color scheme as as custom scale +The label for each objects location is stored within the ``properties`` dictionary.
To access these values +can specify a nested variable name (here ``properties.location``) within the color +channel encoding. Here we change the coloring encoding to be based on this location label, +and apply a ``magma`` color scheme instead of the default one. +The `:O` suffix indicates that we want Altair to treat these values as ordinal, +and you can read more about it in the :ref:`_encoding-data-types` page. for the ordinal structured data. .. altair-plot:: @@ -105,10 +101,11 @@ GeoJSON file by URL ~~~~~~~~~~~~~~~~~~~ Altair can load GeoJSON resources directly from a web URL. Here we use -an example from geojson.xyz. As is explained in -#Spatial-data-from-an-inline-GeoJSON-object, we specify ``features`` as -value for the ``property`` parameter in the ``alt.DataFormat()`` object -and prepend ``scalerank`` with the name of the nested dictionary where +an example from geojson.xyz. As is explained in :ref:`spatial-data-inline-geojson`, +we specify ``features`` as +the value for the ``property`` parameter in the ``alt.DataFormat()`` object +and prepend the attribute we want to plot (``scalerank``) +with the name of the nested dictionary where the information of each geometry is stored (``properties``). .. altair-plot:: @@ -130,13 +127,12 @@ Inline TopoJSON object TopoJSON is an extension of GeoJSON, where the geometry of the features are referred to from a top-level object named arcs. Each shared arc is -only stored once to reduce size. An TopoJSON file object can contain +only stored once to reduce the size of the data. A TopoJSON file object can contain multiple objects (eg. boundary border and province border). When -defining an TopoJSON object for Altair we specify the ``topojson`` type -data format and the name of the object we like to visualize using the -``feature`` (here ``MY_DATA``) parameter. - -Note: the key-name ``MY_DATA`` is arbitrary and differs in each dataset. 
+defining a TopoJSON object for Altair we specify the ``topojson`` +data format type and the name of the object we like to visualize using the +``feature`` parameter. Here the name of this object key is ``MY_DATA``, +but this differs in each dataset. .. altair-plot:: :output: repr @@ -178,11 +174,10 @@ Note: the key-name ``MY_DATA`` is arbitrary and differs in each dataset. TopoJSON file by URL ~~~~~~~~~~~~~~~~~~~~ -Altair can load TopoJSON resources directly from a web URL. As is -explained in #Spatial-data-from-an-inline-TopoJSON-object, we have to -specify ``boroughs`` as object name for the ``feature`` parameter in and -define the type of data as ``topjoson`` in the ``alt.DataFormat()`` -object. +Altair can load TopoJSON resources directly from a web URL. As +explained in :ref:`spatial-data-inline-topojson`, we have to +specify ``boroughs`` as the object name for the ``feature`` parameter and +define the type of data as ``topjoson`` in the ``alt.DataFormat()`` object. .. altair-plot:: :output: repr @@ -203,19 +198,22 @@ topojson file if this file is accessible by URL: We color encode the Boroughs by there names as they are stored as an unique identifier (``id``). We use a ``symbolLimit`` of 33 in two -columns to display all entries in the legend. +columns to display all entries in the legend +and change the color scheme to have more distinct colors. +We also add a tooltip which shows the name of the borough +as we hover over it with the mouse. .. altair-plot:: alt.Chart(data_url_topojson, title="London-Boroughs").mark_geoshape( tooltip=True ).encode( - color=alt.Color("id:N", legend=alt.Legend(columns=2, symbolLimit=33)) + color=alt.Color("id:N", scale=alt.Scale(scheme='tableau20'), legend=alt.Legend(columns=2, symbolLimit=33)) ) -.. _spatial-data-nested-ge0json: +.. 
_spatial-data-nested-geojson: Nested GeoJSON objects ~~~~~~~~~~~~~~~~~~~~~~ @@ -247,10 +245,8 @@ Projections ~~~~~~~~~~~ For geographic data it is best to use the World Geodetic System 1984 as its geographic coordinate reference system with units in decimal degrees. - Try to avoid putting projected data into Altair, but reproject your spatial data to EPSG:4326 first. - If your data comes in a different projection (eg. with units in meters) and you don't have the option to reproject the data, try using the project configuration ``(type: 'identity', reflectY': True)``. It draws the geometries without applying a projection. @@ -270,7 +266,6 @@ reject geometries that do not use the right-hand rule. Altair does NOT follow the right-hand rule for geometries, but uses the left-hand rule. Meaning that exterior rings should be clockwise and interior rings should be counterclockwise. - If you face a problem regarding winding order, try to force the left-hand rule on your data before usage in Altair using GeoPandas for example as such: From 97f9d5525f08e35ec9d1d8cb9c7f562174ea84b0 Mon Sep 17 00:00:00 2001 From: mattijn Date: Fri, 16 Dec 2022 23:42:51 +0100 Subject: [PATCH 7/9] change geojson url to a dataset with continents include mesh option for topojson fix few typos --- doc/user_guide/data/spatial.rst | 48 +++++++++++++++++++++++++-------- 1 file changed, 37 insertions(+), 11 deletions(-) diff --git a/doc/user_guide/data/spatial.rst b/doc/user_guide/data/spatial.rst index cd630a0ee..fa39b39fb 100644 --- a/doc/user_guide/data/spatial.rst +++ b/doc/user_guide/data/spatial.rst @@ -81,11 +81,11 @@ value of the ``key`` that contain the nested list (here named data_obj_geojson The label for each objects location is stored within the ``properties`` dictionary. To access these values -can specify a nested variable name (here ``properties.location``) within the color +you can specify a nested variable name (here ``properties.location``) within the color channel encoding. 
Here we change the coloring encoding to be based on this location label, and apply a ``magma`` color scheme instead of the default one. -The `:O` suffix indicates that we want Altair to treat these values as ordinal, -and you can read more about it in the :ref:`_encoding-data-types` page. +The ``:O`` suffix indicates that we want Altair to treat these values as ordinal, +and you can read more about it in the :ref:`encoding-data-types` page. for the ordinal structured data. .. altair-plot:: @@ -104,20 +104,20 @@ Altair can load GeoJSON resources directly from a web URL. Here we use an example from geojson.xyz. As is explained in :ref:`spatial-data-inline-geojson`, we specify ``features`` as the value for the ``property`` parameter in the ``alt.DataFormat()`` object -and prepend the attribute we want to plot (``scalerank``) +and prepend the attribute we want to plot (``continent``) with the name of the nested dictionary where the information of each geometry is stored (``properties``). .. altair-plot:: :output: repr - url_geojson = "https://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_land.geojson" + url_geojson = "https://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_admin_0_countries.geojson" data_url_geojson = alt.Data(url=url_geojson, format=alt.DataFormat(property="features")) data_url_geojson .. altair-plot:: - alt.Chart(data_url_geojson).mark_geoshape().encode(color='properties.scalerank:N') + alt.Chart(data_url_geojson).mark_geoshape().encode(color='properties.continent:N') .. _spatial-data-inline-topojson: @@ -175,8 +175,8 @@ TopoJSON file by URL ~~~~~~~~~~~~~~~~~~~~ Altair can load TopoJSON resources directly from a web URL.
As -explained in :ref:`spatial-data-inline-topojson`, we have to -specify ``boroughs`` as the object name for the ``feature`` parameter and +explained in :ref:`spatial-data-inline-topojson`, we have to use the +``feature`` parameter to specify the object name (here ``boroughs``) and define the type of data as ``topjoson`` in the ``alt.DataFormat()`` object. .. altair-plot:: @@ -211,7 +211,32 @@ as we hover over it with the mouse. color=alt.Color("id:N", scale=alt.Scale(scheme='tableau20'), legend=alt.Legend(columns=2, symbolLimit=33)) ) +Similar to the ``feature`` option, there also exists the ``mesh`` +parameter. This parameter extracts a named TopoJSON object set. +Unlike the feature option, the corresponding geo data is returned as +a single, unified mesh instance, not as individual GeoJSON features. +Extracting a mesh is useful for more efficiently drawing borders +or other geographic elements that you do not need to associate with +specific regions such as individual countries, states or counties. +Below we draw the same Boroughs of London, but now as a mesh only. + +Note: you have to explicitly define ``filled=False`` to draw (multi)lines +without fill color. + +.. altair-plot:: + + from vega_datasets import data + + url_topojson = data.londonBoroughs.url + + data_url_topojson_mesh = alt.Data( + url=url_topojson, format=alt.DataFormat(mesh="boroughs", type="topojson") + ) + + alt.Chart(data_url_topojson_mesh, title="Border London-Boroughs").mark_geoshape( + filled=False + ) .. _spatial-data-nested-geojson: @@ -219,9 +244,10 @@ Nested GeoJSON objects ~~~~~~~~~~~~~~~~~~~~~~ GeoJSON data can also be nested within another dataset. In this case it -is possible to use the ``shape`` encoding channel to visualize the -nested dictionary that contains the GeoJSON objects.
In the following -example the GeoJSON object is nested in ``geo``: +is possible to use the ``shape`` encoding channel in combination with the +``:G`` suffix to visualize the nested features as GeoJSON objects. +In the following example the GeoJSON objects are nested within ``geo`` +in the list of dictionaries: .. altair-plot:: From 8fce506f7d23cec009b89ed7dc1fa4d16c3a5edc Mon Sep 17 00:00:00 2001 From: mattijn Date: Fri, 16 Dec 2022 23:44:06 +0100 Subject: [PATCH 8/9] fix reference --- doc/user_guide/data/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/user_guide/data/index.rst b/doc/user_guide/data/index.rst index bb99af7bc..833334abd 100644 --- a/doc/user_guide/data/index.rst +++ b/doc/user_guide/data/index.rst @@ -46,7 +46,7 @@ and that the y-column should be visualized on a categorical scale: If we are not happy with the scale that Altair chooses for visualizing that data, we can change it by either changing the data types in the underlying pandas dataframe, -or by changing the :ref:`_encoding-data-types` in Altair. +or by changing the :ref:`encoding-data-types` in Altair. .. toctree:: :hidden: From 3d63811108435834199e1b2f4b8b5a117dd4063e Mon Sep 17 00:00:00 2001 From: mattijn Date: Fri, 16 Dec 2022 23:59:23 +0100 Subject: [PATCH 9/9] end with __geo_interface__ --- doc/user_guide/data/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/user_guide/data/index.rst b/doc/user_guide/data/index.rst index 833334abd..744105925 100644 --- a/doc/user_guide/data/index.rst +++ b/doc/user_guide/data/index.rst @@ -22,7 +22,7 @@ Data Description :ref:`user-guide-dataframe-data` A `pandas DataFrame `_. :ref:`user-guide-dict-data` A :class:`Data` or related object (i.e. :class:`UrlData`, :class:`InlineData`, :class:`NamedData`). :ref:`user-guide-url-data` A url string pointing to a ``json`` or ``csv`` formatted text file. -:ref:`user-guide-spatial-data` An object that supports the `__geo_interface__` (eg.
`Geopandas GeoDataFrame `_, `Shapely Geometries `_, `GeoJSON Objects `_). +:ref:`user-guide-spatial-data` A `geopandas GeoDataFrame `_, `Shapely Geometries `_, `GeoJSON Objects `_ or other objects that support the ``__geo_interface__``. :ref:`user-guide-generator-data` A generated dataset such as numerical sequences or geographic reference elements. ========================================= ================================================================================