utz ("yoots"): utilities I've missed in the Python standard library, Pytest, Pandas, Plotly, …
- Install
- Use
  - utz.process: subprocess wrappers; shell out to commands, parse output
  - utz.collections: Collection/list helpers
  - utz.cd: "change directory" contextmanager
  - utz.fn: decorator/function utilities
  - utz.gzip: deterministic GZip helpers
  - utz.plot: Plotly helpers
  - utz.setup: setup.py helper
  - utz.test: dataclass test cases, raises helper
  - utz.time: now/today helpers
  - utz.hash_file
  - utz.docker, utz.tmpdir, etc.
pip install utz
- utz has one dependency, stdlb (wild-card standard library imports).
- "Extras" provide optional deps (e.g. Pandas, Plotly, …).
I usually do this at the top of Jupyter notebooks:
from utz import *
This imports most standard library modules/functions (via stdlb), as well as the utz members below.
Below are a few modules, in rough order of how often I use them:
utz.process: subprocess wrappers; shell out to commands, parse output
from utz.process import *
# Run a command
run('git', 'commit', '-m', 'message') # Commit staged changes
# Return `list[str]` of stdout lines
lines('git', 'log', '-n5', '--format=%h') # Last 5 commit SHAs
# Verify exactly one line of stdout, return it
line('git', 'log', '-1', '--format=%h') # Current HEAD commit SHA
# Return stdout as a single string
output('git', 'log', '-1', '--format=%B') # Current HEAD commit message
# Check whether a command succeeds, suppress output
check('git', 'diff', '--exit-code', '--quiet') # `True` iff there are no uncommitted changes
err("This will be output to stderr")
# Execute a "pipeline" of commands
pipeline(['seq 10', 'head -n5']) # '1\n2\n3\n4\n5\n'
# Diff two command pipelines, e.g. compare lines/words/chars in a gzipped CSV, at Git HEAD vs. worktree:
cmds = ['gunzip -c', 'wc']
file = 'foo.csv.gz'
diff_cmds(
    [f'git show HEAD:{file}', *cmds],
    [f'cat {file}', *cmds],
)
diff_cmds is also exposed as a CLI, diff-x:
# Diff the contents of two `.gz` files
seq 10 | gzip -c > a.gz
seq 2 12 | gzip -c > b.gz
diff-x 'gunzip -c' {a,b}.gz
# 1d0
# < 1
# 10a10,11
# > 11
# > 12
# Pass multiple commands to create a pipeline:
diff-x 'gunzip -c' 'head -n5' {a,b}.gz
# 1d0
# < 1
# 5a5
# > 6
See also: test_process.py.
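For readers curious what these wrappers boil down to, here is a minimal stdlib-only sketch of the `lines`, `line`, and `check` semantics described above. It is not utz's implementation: the `*_sketch` names are hypothetical, the exception type is an assumption, and the real functions likely accept more options.

```python
import subprocess
import sys


def lines_sketch(*cmd: str) -> list[str]:
    """Run `cmd`, raise on non-zero exit, return stdout split into lines."""
    proc = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return proc.stdout.splitlines()


def line_sketch(*cmd: str) -> str:
    """Like `lines_sketch`, but require exactly one line of stdout."""
    out = lines_sketch(*cmd)
    if len(out) != 1:
        # Assumption: the real helper may raise a different exception type
        raise ValueError(f"Expected 1 line, found {len(out)}: {out}")
    return out[0]


def check_sketch(*cmd: str) -> bool:
    """Return True iff `cmd` exits 0; discard its stdout and stderr."""
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return proc.returncode == 0


# Use the current Python interpreter as a portable stand-in for a shell command
py = sys.executable
print(lines_sketch(py, '-c', 'print("a"); print("b")'))
print(line_sketch(py, '-c', 'print("only")'))
print(check_sketch(py, '-c', 'raise SystemExit(1)'))
```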
utz.collections: Collection/list helpers
from utz.collections import *
# Verify a collection has one element, return it
singleton(["aaa"]) # "aaa"
singleton(["aaa", "bbb"]) # error
See also: test_collections.py.
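The "exactly one element" check can be sketched in a few lines. This is only an illustration of the behavior shown above; `singleton_sketch` is a hypothetical name, and raising `ValueError` is an assumption (the real helper may raise something else and may support extra options).

```python
from typing import Iterable, TypeVar

T = TypeVar("T")


def singleton_sketch(elems: Iterable[T]) -> T:
    """Return the only element of `elems`; raise if there are 0 or 2+."""
    items = list(elems)
    if len(items) != 1:
        # Assumption: exception type chosen for illustration only
        raise ValueError(f"Expected exactly 1 element, found {len(items)}")
    return items[0]


print(singleton_sketch(["aaa"]))
```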
utz.cd: "change directory" contextmanager
from utz import cd
with cd('..'):  # change to parent dir
    ...
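A context manager like this is typically a thin wrapper around `os.chdir` that restores the previous directory on exit, even if the body raises. A rough sketch, assuming nothing about how utz implements it (`cd_sketch` is a hypothetical name):

```python
import os
from contextlib import contextmanager


@contextmanager
def cd_sketch(path):
    """Change into `path` for the duration of the `with` block."""
    prev = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always return to the original directory, even on error
        os.chdir(prev)
```

Python 3.11+ also ships `contextlib.chdir`, which covers the same basic use case.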
utz.fn: decorator/function utilities
from utz import decos
from click import option
common_opts = decos(
option('-n', type=int),
option('-v', is_flag=True),
)
@common_opts
def subcmd1(n: int, v: bool):
    ...

@common_opts
def subcmd2(n: int, v: bool):
    ...

from utz import call, wraps

def fn1(a, b):
    ...

@wraps(fn1)
def fn2(a, c, **kwargs):
    ...
kwargs = dict(a=11, b='22', c=33, d=44)
call(fn1, **kwargs) # passes {a, b}, not {c, d}
call(fn2, **kwargs) # passes {a, b, c}, not {d}
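The kwarg-filtering idea behind `call` can be approximated with `inspect.signature`. The sketch below only handles the simple `fn1` case (named parameters); it does not model how `wraps` merges signatures for `fn2`, and `call_sketch` is a hypothetical name, not utz's code.

```python
import inspect


def call_sketch(fn, **kwargs):
    """Call `fn` with only those kwargs that match its named parameters."""
    named_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    names = {
        name
        for name, param in inspect.signature(fn).parameters.items()
        if param.kind in named_kinds
    }
    accepted = {k: v for k, v in kwargs.items() if k in names}
    return fn(**accepted)


def fn1(a, b):
    return a, b


kwargs = dict(a=11, b='22', c=33, d=44)
print(call_sketch(fn1, **kwargs))  # (11, '22'): `c` and `d` are dropped
```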
utz.gzip: deterministic GZip helpers
from utz import deterministic_gzip_open, hash_file
with deterministic_gzip_open('a.gz', 'w') as f:
    f.write('\n'.join(map(str, range(10))))
hash_file('a.gz') # dfbe03625c539cbc2a2331d806cc48652dd3e1f52fe187ac2f3420dbfb320504
See also: test_gzip.py.
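Plain `gzip.open` output is not byte-for-byte reproducible because the gzip header embeds a modification time and (usually) the file name. One common way to get deterministic output with the standard library is to pin both, sketched below. This is an assumption about the general technique, not a description of how `deterministic_gzip_open` is implemented; `deterministic_gzip_bytes` is a hypothetical helper.

```python
import gzip
import hashlib
import io


def deterministic_gzip_bytes(data: bytes) -> bytes:
    """Gzip `data` with a fixed header: no file name, mtime pinned to 0."""
    buf = io.BytesIO()
    # filename='' omits the name field; mtime=0 pins the header timestamp
    with gzip.GzipFile(filename='', mode='wb', fileobj=buf, mtime=0) as f:
        f.write(data)
    return buf.getvalue()


payload = '\n'.join(map(str, range(10))).encode()
a = deterministic_gzip_bytes(payload)
b = deterministic_gzip_bytes(payload)
# Identical input yields identical bytes, hence identical hashes
print(hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest())  # True
```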
utz.plot: Plotly helpers
Helpers for Plotly transformations I make frequently, e.g.:
from utz import plot
import plotly.express as px
fig = px.bar(x=[1, 2, 3], y=[4, 5, 6])
plot(
    fig,
    name='my-plot',  # Filename stem; saves my-plot.png, my-plot.json, and optionally my-plot.html
    title=['Some Title', 'Some subtitle'],  # Plot title, followed by "subtitle" line(s) (smaller font, just below)
    bg='white', xgrid='#ccc',  # white background, grey x-gridlines
    hoverx=True,  # show x-values on hover
    x="X-axis title",  # x-axis title or configs
    y=dict(title="Y-axis title", zerolines=True),  # y-axis title or configs
    # ...
)
Example usages: hudcostreets/nj-crashes, ryan-williams/arrayloader-benchmarks.
utz.setup: setup.py helper
utz/setup.py provides defaults for various setuptools.setup() params:
- name: use parent directory name
- version: parse from git tag (otherwise from git describe --tags)
- install_requires: read requirements.txt
- author_{name,email}: infer from last commit
- long_description: parse README.md (and set long_description_content_type)
- description: parse first <p> under opening <h1> from README.md
- license: parse from LICENSE file (MIT and Apache v2 supported)
For an example, see gsmo==0.0.1 (and corresponding release).
This library also "self-hosts" using utz.setup; see pyproject.toml:
[build-system]
requires = ["setuptools", "utz[setup]==0.4.2", "wheel"]
build-backend = "setuptools.build_meta"
and setup.py:
from utz.setup import setup
extras_require = {
    # …
}
# Various fields auto-populated from git, README.md, requirements.txt, …
setup(
    name="utz",
    version="0.8.0",
    extras_require=extras_require,
    url="https://github.com/runsascoded/utz",
    python_requires=">=3.10",
)
The setup helper can be installed via a pip "extra":
pip install utz[setup]
utz.test: dataclass test cases, raises helper
utz.parametrize: pytest.mark.parametrize wrapper, accepts dataclass instances
from utz import parametrize
from dataclasses import dataclass
def fn(f: float, fmt: str) -> str:
    """Example function, to be tested with ``case``s below."""
    return f"{f:{fmt}}"

@dataclass
class case:
    """Container for a test-case; float, format, and expected output."""
    f: float
    fmt: str
    expected: str

    @property
    def id(self):
        return f"fmt-{self.f}-{self.fmt}"

@parametrize(
    case(1.23, "0.1f", "1.2"),
    case(123.456, "0.1e", "1.2e+02"),
    case(-123.456, ".0f", "-123"),
)
def test_fn(f, fmt, expected):
    """Example test, "parametrized" by several ``case``s."""
    assert fn(f, fmt) == expected
test_parametrize.py contains more examples, customizing test "ID"s, adding parameter sweeps, etc.
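To make the mapping from dataclass instances to pytest parameters concrete, here is a rough, pytest-free sketch of the conversion such a wrapper might perform. `to_parametrize_args` is a hypothetical helper and the logic is an assumption, not utz's implementation; the resulting values have the shape that `pytest.mark.parametrize(argnames, argvalues, ids=...)` accepts.

```python
from dataclasses import dataclass, fields


@dataclass
class Case:
    f: float
    fmt: str
    expected: str

    @property
    def id(self) -> str:
        return f"fmt-{self.f}-{self.fmt}"


def to_parametrize_args(*cases):
    """Turn dataclass instances into (argnames, argvalues, ids)."""
    # Field names become the test function's argument names
    names = [fld.name for fld in fields(cases[0])]
    # One tuple of field values per case, in field order
    values = [tuple(getattr(c, n) for n in names) for c in cases]
    ids = [c.id for c in cases]
    return names, values, ids


names, values, ids = to_parametrize_args(
    Case(1.23, "0.1f", "1.2"),
    Case(-123.456, ".0f", "-123"),
)
# These could then be passed as: pytest.mark.parametrize(names, values, ids=ids)
print(names, values, ids)
```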
utz.time: now/today helpers
now and today are wrappers around datetime.datetime.now that expose convenient functions:
from utz import now, today
now() # 2024-10-11T15:43:54Z
today() # 2024-10-11
now().s # 1728661583
now().ms # 1728661585952
Use in conjunction with utz.bases codecs for easy timestamp-nonces:
from utz import b62, now
b62(now().s) # A18Q1l
b62(now().ms) # dZ3fYdS
b62(now().us) # G31Cn073v
Sample values for various units and codecs:
| unit | b62 | b64 | b90 |
|---|---|---|---|
| s | A18RXZ | +a/I/7 | :?98> |
| ds | R0165M | D3KFIY | "sJh_? |
| cs | CBp0oXI | /TybqKo | =8d'#K |
| ms | dZ3no2f | M6vLchJ | #6cRfBj |
| us | G31ExCseD | 360KU8v9V | D,f`6&uX |
(generated by time-slug-grid.py)
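The timestamp-nonce idea is ordinary positional base conversion applied to an integer timestamp. The sketch below is illustrative only: the alphabet ordering (digits, then uppercase, then lowercase) is an assumption, so its output will not necessarily match the utz.bases values in the table above, and `encode_b62` / `decode_b62` are hypothetical names.

```python
import string
import time

# Assumed alphabet ordering; utz.bases may order its characters differently
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode_b62(n: int) -> str:
    """Encode a non-negative integer in base 62."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    # Digits were produced least-significant first
    return ''.join(reversed(digits))


def decode_b62(s: str) -> int:
    """Inverse of `encode_b62`."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


# A short nonce derived from the current epoch seconds
nonce = encode_b62(int(time.time()))
print(nonce)
```

Finer units (ms, us) produce longer strings, as the table shows, because the encoded integer is larger.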
utz.hash_file
from utz import hash_file
hash_file("path/to/file") # sha256 by default
hash_file("path/to/file", 'md5')
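A helper like this is usually a chunked read fed into `hashlib`, so large files never need to fit in memory. A minimal sketch under that assumption (`hash_file_sketch` and its `chunk_size` parameter are hypothetical, not utz's signature):

```python
import hashlib


def hash_file_sketch(path, algo: str = 'sha256', chunk_size: int = 1 << 16) -> str:
    """Return the hex digest of the file at `path`, read in chunks."""
    h = hashlib.new(algo)
    with open(path, 'rb') as f:
        # iter() with a sentinel stops once read() returns b'' at EOF
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()
```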
utz.docker, utz.tmpdir, etc.
Misc other modules:
- o: dict wrapper exposing keys as attrs (e.g.: o({'a':1}).a == 1)
- docker: DSL for programmatically creating Dockerfiles (and building images from them)
- bases: encode/decode in various bases (62, 64, 90, …)
- tmpdir: make temporary directories with a specific basename
- escape: split/join on an arbitrary delimiter, with backslash-escaping
- ssh: SSH tunnel wrapped in a context manager
- backoff: exponential-backoff utility
- git: Git helpers, wrappers around GitPython
- pnds: pandas imports and helpers
- ctxs: compose contextmanagers
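As one illustration from the list above, a dict wrapper that exposes keys as attributes can be sketched with `__getattr__`. This is a simplified stand-in for the idea behind `o`, not its actual implementation; `AttrDict` is a hypothetical name and the real class likely supports more (nesting, mutation, etc.).

```python
class AttrDict:
    """Read-only view of a dict whose keys are also readable as attributes."""

    def __init__(self, data: dict):
        self._data = dict(data)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        if name.startswith('_'):
            raise AttributeError(name)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        return self._data[key]


print(AttrDict({'a': 1}).a == 1)  # True
```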