read_feather does not support os.PathLike file paths #4177

Closed
mvashishtha opened this issue Feb 8, 2022 · 0 comments · Fixed by #4179

mvashishtha commented Feb 8, 2022

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Big Sur
  • Modin version (modin.__version__): 0.13.0+40.geffe4c20
  • Python version: 3.9.9
  • Code we can use to reproduce:
import pandas
import modin.pandas as pd
from pathlib import Path

pandas.DataFrame([[1]], columns=["a"]).to_feather("/tmp/bug_read_feather_pathlike.feather")
# This works:
pandas.read_feather(Path("/tmp/bug_read_feather_pathlike.feather"))
# This does not work:
pd.read_feather(Path("/tmp/bug_read_feather_pathlike.feather"))

Describe the problem

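pandas supports os.PathLike file paths in read_feather, but Modin does not. As the stack trace below shows, FileDispatcher.get_path passes the path straight to S3_ADDRESS_REGEX.search, which raises a TypeError for anything that is not a string or bytes-like object.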

Source code / logs

Stack trace
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [24], in <module>
----> 1 pd.read_feather(Path("/tmp/bug_read_feather_pathlike.feather"))

File ~/modin/modin/pandas/io.py:398, in read_feather(path, columns, use_threads, storage_options)
    395 Engine.subscribe(_update_engine)
    396 from modin.core.execution.dispatching.factories.dispatcher import FactoryDispatcher
--> 398 return DataFrame(query_compiler=FactoryDispatcher.read_feather(**kwargs))

File ~/modin/modin/core/execution/dispatching/factories/dispatcher.py:227, in FactoryDispatcher.read_feather(cls, **kwargs)
    224 @classmethod
    225 @_inherit_docstrings(factories.BaseFactory._read_feather)
    226 def read_feather(cls, **kwargs):
--> 227     return cls.__factory._read_feather(**kwargs)

File ~/modin/modin/core/execution/dispatching/factories/factories.py:277, in BaseFactory._read_feather(cls, **kwargs)
    269 @classmethod
    270 @doc(
    271     _doc_io_method_template,
   (...)
    275 )
    276 def _read_feather(cls, **kwargs):
--> 277     return cls.io_cls.read_feather(**kwargs)

File ~/modin/modin/core/io/file_dispatcher.py:151, in FileDispatcher.read(cls, *args, **kwargs)
    128 @classmethod
    129 def read(cls, *args, **kwargs):
    130     """
    131     Read data according passed `args` and `kwargs`.
    132
   (...)
    149     postprocessing work on the resulting query_compiler object.
    150     """
--> 151     query_compiler = cls._read(*args, **kwargs)
    152     # TODO (devin-petersohn): Make this section more general for non-pandas kernel
    153     # implementations.
    154     if StorageFormat.get() == "Pandas":

File ~/modin/modin/core/io/column_stores/feather_dispatcher.py:50, in FeatherDispatcher._read(cls, path, columns, **kwargs)
     24 @classmethod
     25 def _read(cls, path, columns=None, **kwargs):
     26     """
     27     Read data from the file path, returning a query compiler.
     28
   (...)
     48     https://arrow.apache.org/docs/python/api.html#feather-format
     49     """
---> 50     path = cls.get_path(path)
     51     if columns is None:
     52         import_optional_dependency(
     53             "pyarrow", "pyarrow is required to read feather files."
     54         )

File ~/modin/modin/core/io/file_dispatcher.py:211, in FileDispatcher.get_path(cls, file_path)
    190 @classmethod
    191 def get_path(cls, file_path):
    192     """
    193     Process `file_path` in accordance to it's type.
    194
   (...)
    209     absolute path will be returned.
    210     """
--> 211     if S3_ADDRESS_REGEX.search(file_path):
    212         return file_path
    213     else:

TypeError: expected string or bytes-like object
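
The traceback ends in FileDispatcher.get_path, where S3_ADDRESS_REGEX.search receives a pathlib.Path instead of a string. One possible way to handle this is to normalize the argument with os.fspath before the regex check. The sketch below is only an illustration and is not necessarily what #4179 does: the regex pattern is a hypothetical stand-in (the real S3_ADDRESS_REGEX is not shown in this issue), and the local-path branch is assumed from the docstring's mention of returning an absolute path.

```python
import os
import re
from pathlib import Path

# Hypothetical stand-in for Modin's S3_ADDRESS_REGEX; the real pattern
# does not appear in this issue.
S3_ADDRESS_REGEX = re.compile(r"^s3://")


def get_path(file_path):
    """Return S3 URLs unchanged, otherwise an absolute local path (sketch)."""
    # os.fspath turns os.PathLike objects such as pathlib.Path into str
    # and returns plain str arguments unchanged.
    file_path = os.fspath(file_path)
    if S3_ADDRESS_REGEX.search(file_path):
        return file_path
    # Assumed behavior for local files, based on the docstring fragment
    # "absolute path will be returned" in the traceback.
    return os.path.abspath(file_path)


# Both a str and a Path now take the same code path without a TypeError.
local = get_path(Path("/tmp/bug_read_feather_pathlike.feather"))
remote = get_path("s3://bucket/data.feather")
```

Until a fix lands, wrapping the argument in str(...) or os.fspath(...) at the call site should avoid the error in the same way, since get_path then receives a plain string.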
@mvashishtha mvashishtha self-assigned this Feb 8, 2022
mvashishtha pushed a commit to mvashishtha/modin that referenced this issue Feb 8, 2022
Signed-off-by: mvashishtha <mahesh@ponder.io>
devin-petersohn pushed a commit that referenced this issue Feb 17, 2022
Signed-off-by: mvashishtha <mahesh@ponder.io>