AmenRa/unified-io

PyPI version License: MIT

⚡️ Introduction

unified-io is a Python utility that attempts to unify several I/O operations (i.e., read/write data in different formats) under a similar interface while making them more concise and user-friendly.

The library provides a unified interface for reading/writing files, which is based on the following principles:

  • Read/write interfaces consist of concise functions with similar signatures.
  • Read/write interfaces allow passing keyword arguments to the underlying I/O functions to preserve flexibility.
  • Read operations can be performed lazily using generators.
  • Before reading/writing, the user can specify a callback function that will be applied to each element of the data stream.
  • Read functions have load aliases (e.g., read_csv has a load_csv alias) and write functions have save aliases (e.g., write_csv has a save_csv alias).
  • Each format is handled by an efficient backend library (e.g., orjson for JSON files). Suggestions are welcome!
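
To make these principles concrete, here is a minimal sketch of how such a read interface could be structured using only the standard library. It is not the implementation used by unified-io: the names `read_csv_sketch`, `load_csv_sketch`, and `_iter_csv` are hypothetical, and the only assumption taken from this README is the general shape of the interface (a `callback`, a `generator` flag, pass-through keyword arguments, and a `load` alias).

```python
import csv
from typing import Any, Callable, Iterator, Optional


def _iter_csv(path: str, callback: Optional[Callable], **kwargs: Any) -> Iterator[dict]:
    # Hypothetical helper: yields one row at a time, so the whole file
    # is never held in memory.
    with open(path, newline="", encoding="utf-8") as f:
        # Extra keyword arguments (e.g. delimiter=";") go straight to csv.DictReader.
        for row in csv.DictReader(f, **kwargs):
            yield callback(row) if callback is not None else row


def read_csv_sketch(path, callback=None, generator=False, **kwargs):
    # Hypothetical stand-in for a read function, not the library's code.
    rows = _iter_csv(path, callback, **kwargs)
    # Lazy mode hands back the generator; eager mode materializes a list.
    return rows if generator else list(rows)


# A "load" alias is simply a second name bound to the same function.
load_csv_sketch = read_csv_sketch
```

Because the alias is the same function object, both names accept the same arguments and behave identically.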

✨ Supported formats

🔌 Requirements

python>=3.7

💾 Installation

pip install unified-io

💡 Examples

The API is designed to be as simple as possible. For example, the following code snippet reads a CSV file, applies a callback function to each element of the data stream, and writes the result to a JSONL file:

from unified_io import read_csv, write_jsonl

def callback(x):
    return {"id": x["id"], "title": x["title"].lower()}

# Using a generator we avoid loading the entire file into memory
data = read_csv('input.csv', callback=callback, generator=True)

write_jsonl('output.jsonl', data)
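
For readers curious why passing a generator to the writer keeps memory usage low, the following sketch shows a streaming JSONL writer built on the standard library. It is an illustration only: `write_jsonl_sketch` is a hypothetical name, and it uses the built-in `json` module rather than orjson, which this README mentions as the library's JSON backend.

```python
import json
from typing import Any, Callable, Iterable, Optional


def write_jsonl_sketch(
    path: str,
    data: Iterable[Any],
    callback: Optional[Callable] = None,
    **kwargs: Any,
) -> None:
    # Hypothetical stand-in for a JSONL writer, not the library's code.
    with open(path, "w", encoding="utf-8") as f:
        # Items are pulled from `data` one at a time, so a generator source
        # is consumed lazily and only one record is in memory at once.
        for item in data:
            if callback is not None:
                item = callback(item)
            # Extra keyword arguments are forwarded to json.dumps.
            f.write(json.dumps(item, **kwargs) + "\n")
```

Any iterable works as `data`, including a list or a generator returned by a lazy read.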

📚 Documentation

Browse the documentation for more details and examples.

🎁 Feature Requests

Would you like to see other features implemented? Please, open a feature request.

🤘 Want to contribute?

Would you like to contribute? Please, drop me an e-mail.

📄 License

unified-io is open-source software licensed under the MIT license.