This repository has been archived by the owner on Nov 5, 2022. It is now read-only.
Glenn K. Lockwood edited this page Nov 5, 2017 · 14 revisions

pytokio

pytokio is a framework for connecting different sources of instrumentation and telemetry from storage and I/O subsystems in large computing systems. It provides:

  • connectors which translate the outputs of various standard tools into Python objects
  • tools which reduce the amount of boilerplate code necessary to utilize various tools in common ways

This framework is intended to be the foundation for higher-level analysis tools that compare and analyze data from multiple sources.

Getting Started

The examples subdirectory contains Jupyter notebooks that demonstrate the basics of using the pytokio API to retrieve data. Because pytokio uses existing data sources, these notebooks must be run on a system that has access to the data source(s) you want. Specifically,

  • NERSC's Lustre data are stored as HDF5 files on the NERSC Global File System (NGF), so reading them requires a system with an NGF mount (e.g., https://jupyter.nersc.gov/)
  • The job accounting and job diameter data rely on access to the Slurm accounting database via the sacct command. The notebooks that use them must therefore be run from a system where sacct -j 12345 returns the correct job information for job 12345.
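
Before opening the Slurm-dependent notebooks, it can help to confirm that sacct is usable from the current system. The following is a minimal sketch, not part of pytokio: the function name sacct_can_see_job and its sacct_cmd parameter are hypothetical, and it only checks that sacct exits successfully and prints at least one record, not that the record is correct.

```python
import shutil
import subprocess


def sacct_can_see_job(jobid, sacct_cmd="sacct"):
    """Return True if `sacct -j <jobid>` runs and prints at least one record.

    Hypothetical helper for illustration; it is not part of pytokio.
    """
    # Bail out early if the sacct executable is not on PATH.
    if shutil.which(sacct_cmd) is None:
        return False
    try:
        result = subprocess.run(
            # --noheader and --parsable2 give header-free, pipe-delimited rows.
            [sacct_cmd, "-j", str(jobid), "--noheader", "--parsable2"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # A usable sacct exits cleanly and prints at least one non-empty line.
    return result.returncode == 0 and bool(result.stdout.strip())
```

Calling sacct_can_see_job(12345) on a login node with accounting access should return True for a job ID that exists; on a laptop without Slurm it returns False, which is the cue to fall back on the local data caches used by the tests.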

The tests/ directory also contains a suite of self-contained functionality tests as well as all input data required to exercise them. You can look at those tests to see the bare minimum code necessary to exercise different parts of pytokio, and how you can utilize local data caches to use pytokio on systems that do not satisfy the aforementioned external dependencies.

Contributing

The easiest way to contribute to pytokio is to fork it and issue pull requests with new features you've added.

If you have push access to the main pytokio repository, you must do the following to add new features:

  1. Have a GitHub issue opened describing the new feature
  2. Have a new branch named according to yourname/githubissuenumber; for example, a branch called glennklockwood/issue23 corresponds to the feature described in Issue #23.

All commits for the feature should be pushed to this branch.
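
The branch naming convention can be sketched as a small helper. This is illustrative only: the function feature_branch_name is hypothetical, the "issue" prefix is inferred from the glennklockwood/issue23 example, and the username check assumes GitHub usernames contain only letters, digits, and hyphens.

```python
import re


def feature_branch_name(username, issue_number):
    """Build a branch name such as 'glennklockwood/issue23'.

    Hypothetical helper; the 'issue' prefix follows the example in the text.
    """
    issue_number = int(issue_number)
    if issue_number < 1:
        raise ValueError("issue number must be a positive integer")
    # Assumption: usernames are limited to letters, digits, and hyphens.
    if not re.fullmatch(r"[A-Za-z0-9-]+", username):
        raise ValueError("unexpected characters in username")
    return f"{username}/issue{issue_number}"
```

The resulting string can be passed to git checkout -b to create the branch locally before pushing it.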

When the feature is complete, it must:

  1. have a test suite in the tests/ subdirectory. Any new tokio.connectors, tokio.tools, or CLI tools in bin/ must each be covered by tests.
  2. demonstrate that all unit/integration tests are passing. This includes running tests that should not have been affected by your changes.

Any new example notebooks in examples/ should be tested to the best of your ability, but there are currently no hard requirements to build unit tests for them. This is subject to change in the future.

Once these requirements are met, create a pull request from that branch against master. A second person should review your changes and either accept the merge or provide feedback on what must change before the branch can be merged. After the branch is merged, it can be deleted.
