Add the Session.virtualfile_from_stringio method to allow StringIO input for certain functions/methods #3326
Actually, when calling a GMT module, we usually know the expected number of numeric columns (for example, 0 for |
pygmt/clib/session.py
Outdated
if line.startswith(">"):  # Segment header
    if header is not None:  # Only one segment is allowed now.
        raise GMTInvalidInput("Only one segment is allowed.")
    header = line
    continue
Need a test to check that multi-segment inputs fail.
It's likely we have to allow multi-segments in StringIO.
Here is a CLI version for typesetting a paragraph of text. It seems the first line must be a line not starting with # and is ignored. Not sure why it was designed like this. I'll open another POC PR for Figure.text and see how it works.
gmt text -R0/3/0/5 -JX3i -h1 -M -N -F+f12,Times-Roman+jLT -pdf figure << EOF
This is an unmarked header record not starting with #
> 0 -0.5 13p 3i j
@%5%Figure 1.@%% This illustration shows nothing useful, but it still needs
a figure caption. Highlighted in @;255/0/0;red@;; you can see the locations
of cities where it is @\_impossible@\_ to get any good Thai food; these are to be avoided.
EOF
Correction: I didn't see the -h1 option in the CLI version, which skips the first line. Without -h1, the CLI version still works:
gmt text -R0/3/0/5 -JX3i -M -N -F+f12,Times-Roman+jLT -pdf figure << EOF
> 0 -0.5 13p 3i j
@%5%Figure 1.@%% This illustration shows nothing useful, but it still needs
a figure caption. Highlighted in @;255/0/0;red@;; you can see the locations
of cities where it is @\_impossible@\_ to get any good Thai food; these are to be avoided.
EOF
Here is a POC example to show how to support paragraph mode via StringIO:
In [1]: import io
In [2]: stringio = io.StringIO(
   ...:     "> 0 -0.5 13p 3i j\n"
   ...:     "@%5%Figure 1.@%% This illustration shows nothing useful, but it still needs\n"
   ...:     "a figure caption. Highlighted in @;255/0/0;red@;; you can see the locations\n"
   ...:     "of cities where it is @_impossible@_ to get any good Thai food; these are to be avoided.\n"
   ...: )
In [3]: import pygmt
In [4]: fig = pygmt.Figure()
In [5]: from pygmt.clib import Session
In [6]: with Session() as lib:
   ...:     with lib.virtualfile_in(data=stringio) as vintbl:
   ...:         lib.call_module(module="text", args=f"{vintbl} -R0/3/0/5 -JX3i -M -N -F+f12,Times-Roman+jLT")
   ...:
In [7]: fig.show()
StringIO objects containing multi-segments are supported in the latest version.
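To make the multi-segment case concrete, here is a minimal pure-Python sketch of how StringIO contents could be split into segments, where a line starting with ">" opens a new segment and every other line is kept as trailing text. The function name `split_segments` and the dictionary layout are hypothetical and chosen for illustration; this is not the code in pygmt/clib/session.py.

```python
import io


def split_segments(stringio: io.StringIO) -> list[dict]:
    """Split StringIO contents into segments of header plus text rows.

    Hypothetical helper for illustration only, not PyGMT's implementation.
    """
    segments = []
    current = None
    for line in stringio.getvalue().splitlines():
        if line.startswith(">"):  # Segment header opens a new segment
            current = {"header": line[1:].strip(), "text": []}
            segments.append(current)
            continue
        if current is None:  # Text before any header: segment without header
            current = {"header": None, "text": []}
            segments.append(current)
        current["text"].append(line)
    return segments
```

With this layout, the number of segments and the number of rows per segment are known before any container is allocated, and no numeric columns need to be parsed.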
finally:
    # Must set the text to None to avoid double freeing the memory
    seg.text = None

def virtualfile_in(  # noqa: PLR0912
    self,
    check_kind=None,
Then the question is, how can we add StringIO support to specific functions only (e.g., Figure.legend)?
Maybe legend should have a special check_kind? E.g. if check_kind("legend"): valid_kinds += ("stringio", ...)?
See #3438 for the POC PR for Figure.legend. Since we already checked the data kind before entering the session, we can have check_kind=False in Figure.legend.
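The idea of checking the data kind in the calling function, before the session is entered, could look roughly like the sketch below. Both `classify_input` and `check_legend_spec` are hypothetical names invented for this illustration; they do not reproduce PyGMT's own kind detection or the code in #3438.

```python
import io
from pathlib import Path


def classify_input(data) -> str:
    """Return a coarse kind label for the input (hypothetical helper)."""
    if isinstance(data, io.StringIO):
        return "stringio"
    if isinstance(data, (str, Path)):
        return "file"
    return "other"


def check_legend_spec(spec) -> str:
    """Accept only the kinds that make sense for a legend specification."""
    kind = classify_input(spec)
    if kind not in {"file", "stringio"}:
        raise ValueError(f"Unsupported legend specification of kind {kind!r}.")
    return kind
```

Because the caller has already rejected unsupported kinds, the lower-level virtual-file step can skip its own kind check, which is the role check_kind=False plays in the comment above.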
This reverts commit 486fce7.
Thanks @seisman for adding multi-segment support! I just have one concern/thought about non-ASCII character support, but that could be handled later (if needed).
@@ -60,6 +61,7 @@
    "GMT_IS_PLP",  # items could be any one of POINT, LINE, or POLY
    "GMT_IS_SURFACE",  # items are 2-D grid
    "GMT_IS_VOLUME",  # items are 3-D grid
    "GMT_IS_TEXT",  # Text strings which triggers ASCII text reading
Is only ASCII supported for now? What about other encodings?
GMT only accepts ASCII text strings. Any non-ASCII characters should be written in the form of octal codes.
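As a rough illustration of what "written in the form of octal codes" means, the sketch below rewrites each non-ASCII character as a backslash followed by the three-digit octal value of its encoded byte. The helper name `to_octal_escapes` is hypothetical, and the sketch assumes that the character's Latin-1 (ISO-8859-1) byte value matches the code expected by the character set GMT is configured with, which may not hold for every character.

```python
def to_octal_escapes(text: str, encoding: str = "latin-1") -> str:
    """Replace non-ASCII characters with \\ooo octal escapes.

    Hypothetical helper. Assumes the target character set agrees with
    `encoding` for the characters involved.
    """
    out = []
    for ch in text:
        if ch.isascii():
            out.append(ch)
        else:
            # Raises UnicodeEncodeError if the character has no byte
            # representation in the chosen encoding.
            out.extend(f"\\{byte:03o}" for byte in ch.encode(encoding))
    return "".join(out)
```

For example, the degree sign (U+00B0) becomes the four characters `\260`, so the string passed on to GMT contains ASCII only.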
Description of proposed changes
This PR adds support for StringIO inputs, which was initially requested in #571 (4 years ago!).
GMT doesn't natively support reading StringIO objects. To pass a StringIO object into GMT, we need to create a GMT_DATASET container and store the contents of the StringIO object in it.
Ideally, every PyGMT method/function would accept StringIO input, but that is technically difficult to implement. The main reason is that, when creating a GMT_DATASET container, we need to specify the number of tables, segments, columns, and rows, which means we would have to parse the contents of the StringIO object very carefully. The number of columns is the most difficult to determine. For example, some content should be read as two numeric columns and one trailing text column, but we may incorrectly parse it as having zero numeric columns. Conversely, content that we want to pass to Figure.text's paragraph mode (#1078) should have zero numeric columns and one trailing text column, but we may incorrectly think it has two numeric columns.
Thus, to keep things as simple as possible, I think we should only allow StringIO input for a few modules (e.g., it makes much sense for Figure.legend, but makes little sense for Figure.plot). In this case, we just need to store the StringIO contents in the trailing text of the GMT_DATASET container.
This PR implements the virtualfile_from_stringio method to allow StringIO objects. There are two related PRs:
Figure.legend (PR #3438 "Figure.legend: Support passing a StringIO object as the legend specification", addressing #571 "Allow Figure.legend to read from StringIO")
Figure.text support paragraph mode (PR #XX, addressing #1078 "Figure.text(): Support paragraph mode")
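The column-counting ambiguity described in the PR description can be illustrated with a naive parser that counts leading tokens convertible to float. The function name and the sample lines below are hypothetical, made up for this sketch; the assumption that GMT itself would read the first sample as two coordinate columns is stated in the comments and not verified here.

```python
def count_leading_numeric_columns(line: str) -> int:
    """Naively count leading whitespace-separated tokens that parse as float."""
    count = 0
    for token in line.split():
        try:
            float(token)
        except ValueError:
            break
        count += 1
    return count


# Hypothetical record with coordinates in ddd:mm form plus a label.
# Assumption: GMT would read two numeric columns here, but float() rejects
# the tokens, so the naive count is 0.
print(count_leading_numeric_columns("10:30W 20:15N Station A"))

# Hypothetical paragraph text that happens to start with numbers.
# It is meant as pure text (zero numeric columns), yet the naive count is 2.
print(count_leading_numeric_columns("2 3 is the ratio reported in the caption"))
```

Storing every line as trailing text sidesteps this guess entirely, which is why restricting StringIO input to text-oriented modules keeps the implementation simple.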