Add a tutorial for working with table inputs in PyGMT #2722

Merged
merged 28 commits into from
Dec 13, 2023
Commits
5a3e78c
Add a tutorial for working with table inputs in PyGMT
seisman Oct 8, 2023
b8d656f
fix
seisman Oct 8, 2023
0410ddc
Apply suggestions from code review
seisman Oct 8, 2023
0e3e5f9
Merge branch 'main' into tutorial/working-with-tables
seisman Oct 10, 2023
f778830
Rename the tutorial
seisman Oct 10, 2023
a3350fc
Move geopandas.DataFrame before x/y/z arrays
seisman Oct 10, 2023
89a62ed
Merge branch 'main' into tutorial/working-with-tables
seisman Oct 19, 2023
c9c2593
Updates
seisman Oct 19, 2023
3b9ccf3
Fix
seisman Oct 19, 2023
0bcd2ca
Apply suggestions from code review
seisman Oct 24, 2023
8ea2b1f
Apply suggestions from code review
seisman Oct 24, 2023
63c0cd4
Minor updates
seisman Oct 26, 2023
ff6e4ff
Merge branch 'main' into tutorial/working-with-tables
seisman Oct 28, 2023
0578774
Update examples/get_started/04_table_inputs.py
seisman Nov 2, 2023
727ee2c
Merge branch 'main' into tutorial/working-with-tables
seisman Nov 4, 2023
7db8cca
Minor updates
seisman Nov 4, 2023
fa9ebda
Apply suggestions from code review
seisman Nov 4, 2023
3dcef5e
Formatting
seisman Nov 5, 2023
a2daa33
Apply suggestions from code review
seisman Nov 7, 2023
0b49647
Fix styling
seisman Nov 10, 2023
6d41e79
Merge branch 'main' into tutorial/working-with-tables
seisman Nov 18, 2023
45dfb66
Merge branch 'main' into tutorial/working-with-tables
seisman Nov 23, 2023
544ee1d
Merge branch 'main' into tutorial/working-with-tables
seisman Dec 11, 2023
75f34ad
Apply suggestions from code review
seisman Dec 12, 2023
5244549
Link to Python standard list type
seisman Dec 12, 2023
06f85bb
Apply suggestions from code review
seisman Dec 12, 2023
3a72413
Merge branch 'main' into tutorial/working-with-tables
seisman Dec 12, 2023
97c1a90
Remove hyperlink from the section heading to reduce line length
seisman Dec 12, 2023
146 changes: 146 additions & 0 deletions examples/get_started/04_table_inputs.py
@@ -0,0 +1,146 @@
"""
4. PyGMT I/O: Table inputs
==========================
Comment on lines +2 to +3
Member

I/O means Input/Output, but this tutorial is only on inputs 🙂 Will the 'Output' part be added as a separate page? Or do we want multiple parts like:

  • 4.1 PyGMT I/O: Table inputs
  • 4.2 PyGMT I/O: Table outputs
  • 4.3 PyGMT I/O: Grid inputs
  • 4.4 PyGMT I/O: Grid outputs

Member Author

Yes, that's my plan, but I'd prefer the following order:

4.1 PyGMT I/O: Table inputs
4.2 PyGMT I/O: Grid inputs
4.3 PyGMT I/O: Table outputs
4.4 PyGMT I/O: Grid outputs


Generally, PyGMT accepts two different types of data inputs: tables and grids.

- A table is a 2-D array with rows and columns. Each column represents a
  different variable (e.g., *x*, *y* and *z*) and each row represents a
  different record.
- A grid is a 2-D array of data that is regularly spaced in the x and y
  directions (or longitude and latitude).

In this tutorial, we'll focus on working with table inputs, and cover grid
inputs in a separate tutorial.

PyGMT supports a variety of table input types that allow you to work with data
in a format that suits your needs. In this tutorial, we'll explore the
different table input types available in PyGMT and provide examples for each.
By understanding the different table input types, you can choose the one that
best fits your data and analysis needs, and work more efficiently with PyGMT.
"""

# %%
from pathlib import Path

import geopandas as gpd
import numpy as np
import pandas as pd
import pygmt

# %%
# ASCII table file
# ----------------
#
# Most PyGMT functions/methods that accept table input data have a ``data``
# parameter. The easiest way to provide table input data to PyGMT is by
# specifying the file name of an ASCII table (e.g., ``data="input_data.dat"``).
# This is useful when your data is stored in a separate text file.

# Create an example file with 3 rows and 2 columns
data = np.array([[1.0, 2.0], [5.0, 4.0], [8.0, 3.0]])
np.savetxt("input_data.dat", data, fmt="%f")

# Pass the file name to the data parameter
fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="X10c/5c", frame=True)
fig.plot(data="input_data.dat", style="p0.2c", fill="blue")
fig.show()

# Now let's delete the example file
Path("input_data.dat").unlink()

# %%
# Besides a plain string giving the path to a table file, the following
# variants are also accepted:
#
# - A :class:`pathlib.Path` object.
# - A full URL. PyGMT will download the file to the current directory first.
# - A file name prefixed with ``@`` (e.g., ``data="@input_data.dat"``), which
#   is a special syntax in GMT to indicate that the file is a remote file
#   hosted on the GMT data server.
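
# %%
# As a minimal sketch of the :class:`pathlib.Path` variant, the cell below
# writes the same 3-row, 2-column table into a temporary directory using only
# the standard library. The names ``rows`` and ``table_path`` are illustrative
# and not part of PyGMT, and the plotting call is left as a comment because it
# simply reuses the ``fig.plot`` call shown earlier.

```python
import tempfile
from pathlib import Path

# Each tuple is one record (row); the two entries are the x and y columns
rows = [(1.0, 2.0), (5.0, 4.0), (8.0, 3.0)]

with tempfile.TemporaryDirectory() as tmpdir:
    # Build the file path as a pathlib.Path object instead of a plain string
    table_path = Path(tmpdir) / "input_data.dat"
    table_path.write_text("".join(f"{x:f} {y:f}\n" for x, y in rows))

    # At this point the Path object could be passed in place of a file name,
    # e.g. fig.plot(data=table_path, style="p0.2c", fill="blue")
    lines = table_path.read_text().splitlines()
    n_records = len(lines)
    first_line = lines[0]

# The temporary directory and the file in it are removed on leaving the block
```

# Using a temporary directory is one way to avoid the explicit ``unlink()``
# call; whether it fits depends on how long the file needs to exist.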

# %%
# 2-D array: :class:`list`, :class:`numpy.ndarray`, and :class:`pandas.DataFrame`
# -------------------------------------------------------------------------------
#
# The ``data`` parameter also accepts a 2-D array, e.g.,
#
# - A 2-D :class:`list` (i.e., a list of lists)
# - A 2-D :class:`numpy.ndarray` object
# - A :class:`pandas.DataFrame` object
#
# This is useful when you want to plot data that is already in memory.

fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="X10c/5c", frame=True)

# Pass a 2-D list to the 'data' parameter
fig.plot(data=[[1.0, 2.0], [3.0, 4.0]], style="c0.2c", fill="black")

# Pass a 2-D numpy array to the 'data' parameter
fig.plot(data=np.array([[4.0, 2.0], [6.0, 4.0]]), style="t0.2c", fill="red")

# Pass a pandas.DataFrame to the 'data' parameter
df = pd.DataFrame(np.array([[7.0, 3.0], [9.0, 2.0]]), columns=["x", "y"])
fig.plot(data=df, style="a0.5c", fill="blue")

fig.show()
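
# %%
# The three containers above can hold the same table. The cell below is a
# small sketch of converting between them using standard NumPy and pandas
# calls only; the variable names are illustrative. Selecting the columns
# explicitly is one way to make the column order unambiguous before passing a
# data frame to the ``data`` parameter.

```python
import numpy as np
import pandas as pd

# A table with two records (rows) and two variables (columns)
records = [[1.0, 2.0], [3.0, 4.0]]

# The same table as a 2-D numpy array
array = np.array(records)

# The same table as a pandas.DataFrame with named columns
df = pd.DataFrame(array, columns=["x", "y"])

# Pick the columns in an explicit order (x first, then y)
table = df[["x", "y"]]

# Convert back to a plain list of lists
round_trip = table.to_numpy().tolist()
```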

# %%
# :class:`geopandas.GeoDataFrame`
# -------------------------------
#
# If you're working with geospatial data, you can read your data as a
# :class:`geopandas.GeoDataFrame` object and pass it to the ``data``
# parameter. This is useful if your data is stored in a geospatial data format
# (e.g., GeoJSON) that GMT and PyGMT do not support natively.

# Example GeoDataFrame
gdf = gpd.GeoDataFrame(
    {
        "geometry": gpd.points_from_xy([2, 5, 9], [2, 3, 4]),
        "value": [10, 20, 30],
    }
)

# Use the GeoDataFrame to specify the 'data' parameter
fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="X10c/5c", frame=True)
fig.plot(data=gdf, style="c0.2c", fill="purple")
fig.show()

# %%
# Scalar values or 1-D arrays
# ---------------------------
#
# In addition to the ``data`` parameter, some PyGMT functions/methods also
# provide individual parameters (e.g., ``x`` and ``y`` for data coordinates)
# which allow you to specify the data. These parameters accept individual
# scalar values or 1-D arrays (lists or 1-D numpy arrays).

fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="X10c/5c", frame=True)

# Pass scalar values to plot a single data point
fig.plot(x=1.0, y=2.0, style="a0.2c", fill="blue")

# Pass 1-D lists to plot multiple data points
fig.plot(x=[5.0, 5.0, 5.0], y=[2.0, 3.0, 4.0], style="t0.2c", fill="green")

# Pass 1-D numpy arrays to plot multiple data points
fig.plot(
    x=np.array([8.0, 8.0, 8.0]), y=np.array([2.0, 3.0, 4.0]), style="c0.2c", fill="red"
)

fig.show()
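
# %%
# The 1-D ``x``/``y`` form and the 2-D table form carry the same information.
# As a sketch in plain Python (the names ``table``, ``x_back`` and ``y_back``
# are illustrative), the cell below pairs two 1-D lists into a list of
# records and then splits the records back into columns.

```python
# Two 1-D lists, one per variable
x = [5.0, 5.0, 5.0]
y = [2.0, 3.0, 4.0]

# Pair the values up so that each inner list is one record (row)
table = [list(record) for record in zip(x, y)]

# Split the records back into one list per column
x_back, y_back = (list(column) for column in zip(*table))
```

# Here ``table`` is a 2-D list of the kind accepted by the ``data`` parameter,
# while ``x_back`` and ``y_back`` match the original 1-D lists.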

# %%
# Conclusion
# ----------
#
# In PyGMT, you have the flexibility to provide data in various table input
# types, including file names, 2-D arrays (2-D :class:`list`,
# :class:`numpy.ndarray`, :class:`pandas.DataFrame`),
# :class:`geopandas.GeoDataFrame` objects, and scalar values or 1-D arrays.
# Choose the input type that best suits your data source and analysis
# requirements.