Skip to content
This repository has been archived by the owner on Nov 19, 2018. It is now read-only.

Support Graphviz installed from Anaconda #52

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

prmtl
Copy link
Contributor

@prmtl prmtl commented Oct 15, 2018

Graphviz installed using Anaconda is using .bat scripts
to correctly setup running it on Windows.

@prmtl
Copy link
Contributor Author

prmtl commented Oct 15, 2018

@leycec Could you look at it? I must admit that I didn't use conda so if you can quickly test it, let me know if this is correct solution. In the meantime I'll try to prepare some CI jobs for using pydot-ng with Conda (any known projects that can I use as an inspiration?).

@leycec
Copy link

leycec commented Nov 3, 2018

Humble apologies for the half-month delay, @prmtl. My awesome father-in-law unexpectedly passed away several hours before you submitted this request and, well... cue sad face emoji.

The approach you've taken here should, in theory, work. That said, it's not necessarily ideal. That said, the perfect is the enemy of the good. That said, I'm a perfectionist.

Building Utopia Together

In a perfect world, you wouldn't bother iteratively testing filetypes. Why? Because you don't need to. Also, because you absolutely shouldn't.

POSIX-compatible platforms signify executability via the executable bit (e.g., chmod +x /usr/bin/dot) rather than filetype. Moreover, .exe and .bat are Windows-specific filetypes. The current pydot-ng approach could, in obscure edge cases, erroneously attempt to run Windows-specific Graphviz executables on POSIX platforms (e.g., due to inadvisable use of a $WINEPREFIX).

Likewise, Windows users preferring pip will always want .exe-suffixed Graphviz executables to be run. .bat-suffixed Graphviz executables should only be run for Anaconda-based Windows Python interpreters; everywhere else, it's .exe or bust.

Therefore, the filetype you want is a deterministic function of the current platform + Python environment. Specifically, if the current platform is:

  • Windows, the desired filetype is either:
    • If the current Python environment is Anaconda-based, .bat.
    • Else, .exe.
  • Not Windows (e.g., Linux, macOS), the desired filetype is the empty string.

No iteration required. A single if conditional suffices. KISS, right?

Code or It Didn't Happen

Very well. You should now be quietly thinking to yourself: "Yes, but everything you've said assumes we can efficiently and reliably test whether the current Python environment is Anaconda-based or not. Can we?"

Fear not: we can. Sadly, I'd forgotten how. Happily, I remembered that I already solved this half a year ago (it feels like a lifetime) in a farcical comment chain at ContinuumIO/anaconda-issues#1666. It turns out that a single filesystem access suffices to decide this question:

# "True" only if this is an Anaconda-based Windows Python interpreter. See
# https://stackoverflow.com/a/47610844/2809027 for details.
is_conda = os.path.exists(os.path.join(sys.prefix, 'conda-meta'))

Given that boolean, your for prg in progs: iteration would then instead resemble:

# "True" only if this is an Anaconda-based Windows Python interpreter. See
# https://stackoverflow.com/a/47610844/2809027 for details.
is_conda = os.path.exists(os.path.join(sys.prefix, 'conda-meta'))

for prg in progs:
    if progs[prg]:
        continue

    # If this is a Graphviz executable under Windows with no filetype...
    if os.name == 'nt' and not os.path.splitext(prg)[1]:
        # If this is an Anaconda-based Python interpreter, default to the batch
        # file wrappers specific to Anaconda Graphviz packages; else, default
        # to standard Windows executables.
        prg += '.bat' if is_conda else '.exe'

    prg_path = os.path.join(path, prg)

Boom. When the comments are twice as long as the actual code, @leycec was here.

Anaconda for Great Justice

In the meantime I'll try to prepare some CI jobs for using pydot-ng with Conda

any known projects that can I use as an inspiration?

Why, yes, actually: ours.

I note that pydot-ng already leverages AppVeyor for Windows-based CI (awesome), but leveraging pip rather than conda (less awesome). Python users on Windows typically leverage conda rather than pip, for the simple reason that many packages still fail to distribute universal binary wheels suitable for use with pip under Windows. Even when such wheels are distributed, they tend to be substantially less performant than their equivalent conda packages.

The classic example is Python's scientific stack: NumPy, SciPy, Matplotlib, Pandas, TensorFlow, and so on. Whereas Anaconda's NumPy package is linked against Intel's highly optimized Math Kernel Library (MKL) on all supported platforms (including Windows), NumPy's official binary wheel is only linked against ATLAS' less performant BLAS and LAPACK implementations constrained to SSE2 instructions.

Anaconda trivially circumvents these issues. Transitioning your AppVeyor configuration from pip to conda would thus ensure that you test pydot-ng from a similar environment as that of your Windows userbase. This is a good thing. Resolving horrifying Windows + Anaconda + GraphViz inconsistencies is only the icing on the mouldy cake.

Should you choose to accept this rewarding mission, the general approach is as follows:

requirements-conda.txt

Add a new top-level requirements-conda.txt file to this repository, perhaps resembling our well-commented file of the same name. By inspection of your setup.py and test-requirements.txt configuration, an entirely untested (but hopefully useful) first working draft might be:

graphviz >=2.38.0
pyparsing >=2.0.1
pytest >=3.8.2
mock >=2.0.0

Since this is conda rather than pip, note the inclusion of non-Pythonic mandatory dependencies – namely, GraphViz. Chocolatey's cinst package manager is neither required nor desired here. In fact, Chocolatey should be entirely ignored. Let Anaconda do the heavy lifting, for that is its job description.

More importantly, whitespace is significant. Ergo, graphviz >=2.38.0 is a syntactically valid conda package specification but the seemingly similar string graphviz>=2.38.0 is not. Well, nuthin's perfect.

appveyor.yml

So, you already have an Appveyor configuration. That's great! Sadly, you'll need to completely rewrite it from the ground up to support conda rather than cinst + pip. That's perhaps not so great.

On the bright side, "completely rewrite it from from the ground up" just means "liberally copy-and-paste somebody else's." Here's ours, for example... complete with verbose commentary guaranteed to make your head explode like a jelly-filled piñata on the Day of the Dead.

Our test suite even uses py.test like yours. Heaven is indeed a place on GitHub.

Quotes... from Hell

Let's get pedantic. You really don't want to reinvent the wheel by attempting to manually implement shell escaping, quoting, and unquoting – especially under Windows, where the syntactic ruleset is infamously inhumane.

You'll never get it quite right in a genuinely portable manner. But you don't need to. Whenever you want to munge a string for consumption by a shell interpreter, defer to the existing subprocess and shlex modules in the Python stdlib instead. Our multiphysics biology simulator, for example, defines the following cross-platform function for doing so:

def shell_quote(text):
    '''
    Shell-quote the passed string in a platform-specific manner.

    If the current platform is:

    * *Not* Windows (e.g., Linux, OS X), the returned string is guaranteed to be
      suitable for passing as an arbitrary positional argument to external
      commands.
    * Windows, the returned string is suitable for passing *only* to external
      commands parsing arguments in the same manner as the Microsoft C runtime.
      While *all* applications on POSIX-compliant systems are required to parse
      arguments in the same manner (i.e., according to Bourne shell lexing), no
      such standard applies to Windows applications. Shell quoting is therefore
      fragile under Windows -- like pretty much everything.

    Parameters
    ----------
    text : str
        Arbitrary string to be shell-quoted.

    Returns
    ----------
    str
        The passed string shell-quoted.
    '''

    # If the current OS is Windows, do *NOT* perform POSIX-compatible quoting.
    # Windows is POSIX-incompatible and hence does *NOT* parse command-line
    # arguments according to POSIX standards. In particular, Windows does *NOT*
    # treat single-quoted arguments as single arguments but rather as multiple
    # shell words delimited by the raw literal `'`. This is circumventable by
    # calling an officially undocumented Windows-specific function. (Awesome.)
    if os.name == 'nt':
        import subprocess
        return subprocess.list2cmdline([text])
    # Else, perform POSIX-compatible quoting.
    else:
        import shlex
        return shlex.quote(text)

Please steal this. Also, you mispelled "qouted;" it's "quoted." shoot me now

All's Well that Ends

Thus ends another doctoral thesis masquerading as GitHub commentary. In reflection, I regret nothing.

@prmtl
Copy link
Contributor Author

prmtl commented Nov 3, 2018

Thanks for the response! I really appreciate it. And I'm sorry for your loss.

I just quickly went through it and I found a LOT of new info about Windows & Anaconda. I'll slowly go through them later, read more about the topic and reimplement PR.

@leycec
Copy link

leycec commented Nov 9, 2018

You're most welcome. My wife and I gratefully appreciate the outpouring of support. You're awesome!

Since (A) you're awesome, (B) Anaconda now defaults to Python 3.7, (C) pydot is moribund dead and therefore unlikely to ever support Python 3.7, (D) BETSE requires a pydot analogue for gene regulatory network (GRN) support and therefore currently also does not support Python 3.7, and (E) I am begrudgingly willing to work long feverish weekends to resolve this blocker, I'd like to begin contributing to pydot-ng.

Fate has tipped my gnarled arthritic hand.

Big Plans for Big Men

I accidentally became an idiot savant at all things Anaconda several months ago. Notably, I'm a co-maintainer on the conda-forge pydot feedstock and the sole creator and maintainer of two other feedstocks. Given this obscure specialization in the conda ecosystem, here's my ad-hoc outline for resolving our mutual issues:

  1. @leycec creates a new pydot-ng feedstock. In conda-forge parlance:
    • A "feedstock" is a public GitHub repository with URL https://github.com/conda-forge/${project_name}-feedstock containing the recipe required to build and test the Anaconda package for that project. You will, of course, be granted full administrator access to this repository. You namedropped another pydot-ng developer. They will, of course, also be granted push access. The more unpaid volunteer contributors the better for all, eh?
    • A "recipe" is a YAML-formatted file specifying exactly how to build and test a specific version of a specific open-source project for all supported platforms. It's mildly analogous to an appveyor.yml file – except rather than merely testing a project on a single platform, a conda-forge recipe both builds and tests that project on all possible platforms. Impressive conda-forge automation then automatically generates continuous integration (CI) specifications for all supported platforms and, assuming tests magically pass, publishes an Anaconda package for that version of that project under the conda-forge channel. Pheeeeew.
  2. Either @leycec or @prmtl whichever comes first submits a pull request with this repository resolving Anaconda integration. I'd be happy to do so after getting the conda-forge feedstock for pydot-ng up and running. Alternately, you're welcome to tackle this sooner if you have the copious free time and gritty desire. I recommend playing video games instead.
  3. @prmtl publishes a new stable pydot-ng release with PyPI, hopefully including that Anaconda fix. What are we up to now – say, version 2.0.1?
  4. Either @leycec or @prmtl whichever comes first formally notifies pydot contributors of the reactivation of pydot-ng and likely death of pydot. Nobody else appears to have noticed that pydot is on its last ventilator. I note three (!) new pydot pull requests by @hugovk, which deeply saddens me. @hugovk is an official member of Pillow, a high-profile Python imaging framework. It'd be fabulous to help him and/or her up onto the pydot-ng hype train instead – even if only temporarily. One way we might go about this is by submitting a new pydot issue politely declaring the probable death of pydot and suggesting viable well-maintained alternatives. (Well, pretty much just us.) An official community announcement of some sort could lend pydot-ng that much-needed momentum.

That's just how we roll.

But... Why conda-forge?

It's a fair question. There already exists an official Anaconda package for pydot-ng, after all. Sadly, official Anaconda packages are largely useless for smaller-scale projects. Why? Because:

  • Anaconda the Company (i.e., the Company Formerly Known As Continuum Analytics) updates packages for smaller-scale projects at a somewhat... unreliable cadence. Basically, they're slow. Which makes sense. They're a for-profit venture with paper-thin margins. This means that any work we do here won't necessarily be released to Anaconda users on a sane timescale. I'm surprised they haven't already dropped pydot-ng support, frankly.
  • Official Anaconda packages are entirely out of our control. We can't bump them when we release new versions; we can't debug them when issues inevitably arise; we can't receive community issues or pull requests to assist us in managing our Anaconda packages. In short, we ain't got nuthin'.
  • Conda-forge automation is amazing. This cannot be understated enough. In fact, their automation is so seductively elegant that we really shouldn't need to do anything with the pydot-ng feedstock after we release our first working package. After that laborious milestone, conda-forge automatically:
    • Detects whenever we push a new stable release to PyPI.
    • Updates the recipe stored in the pydot-ng feedstock accordingly (e.g., bumping version numbers, download URLs, and SHA256 hashes).
    • Re-runs all CI tests.
    • If tests pass, publishes a new pydot-ng package specific to that stable release.

My paternal advice is to stridently ignore the existing Anaconda pydot-ng package. Creating conda-forge feedstocks is dramatically preferable. Since I'm on congenial terms with conda-forge staff, I'd be as giddy as a delinquent monkey pilfering its first plantation banana to facilitate this process for you.

tl;dr

I create conda-forge pydot-ng feedstock. You or I or somebody create new pydot-ng pull request fixing Anaconda integration. You publish new pydot-ng release with PyPI including this fix. You or I or somebody create pydot issue (or something) formally notifying the community of "the switch-over."

Good plan? Bad plan? A little bit of both? Cogitate and let me know! Until then, thanks for all the volunteerism and have a raucous post-Halloween weekend. 🎃

@hugovk
Copy link

hugovk commented Nov 9, 2018

Don't worry too much about my PRs, I'm not using pydot myself, rather helping out with Python upgrades for some of the most-downloaded PyPI packages.

Pydot has half a million a month:

https://hugovk.github.io/top-pypi-packages/

Graphviz installed using Conda is using .bat scripts
to correctly setup running it on Windows.
@leycec
Copy link

leycec commented Nov 14, 2018

Thanks for the sacrificial volunteerism, @hugovk! I hate to be the lone pallbearer of sad news, but...

erocarrera/pydot is unofficially dead. The sole maintainer, Ioannis Filippidis (@johnyf), went "dark" shortly after completing his doctoral thesis. Dr. Filippidis ceased all online activity at the same exact time. Prior to April 2018, he was heavily active on GitHub, StackOverflow, and Twitter; as of April 2018, he abruptly disappeared from all three sites. According to LinkedIn, he's currently a postdoctoral researcher with Inria. This is a mild relief. He doesn't appear to be physically dead, which is good – only completely inactive online, which is bad. (See pydot/pydot#183 for prior discussion.)

In the meanwhile, no one else has push access to erocarrera/pydot. I wouldn't bother posting any further issues or pull requests to that repository. All contributions are being redirected towards /dev/null.

Let us shed a collective tear for the Python graphing library that might have been. 😢

@hugovk
Copy link

hugovk commented Nov 14, 2018

Ah, so this repo is a fork? It wasn't clear to me.

@erocarrera: do you have push access to https://github.com/erocarrera/pydot?

At least Ero is a maintainer at https://pypi.org/project/pydot/

And looks like there's a handful of forks!
https://pypi.org/search/?q=pydot

"pydot" is still the most popular:
https://hugovk.github.io/top-pypi-packages/

@erocarrera
Copy link
Member

I do have push access to https://github.com/erocarrera/pydot anybody wants to jump in? let me know if I need to point pydot in pypi to a working fork or if anybody wants to step in as a maintainer.

@leycec
Copy link

leycec commented Nov 15, 2018

OMG. The Master awakens from his slumber. We've been desperately trying to contact you and @johnyf for the past several months. So, this is a wondrously unexpected development. Thanks for responding so promptly... and invitingly.

What do you think, @prmtl? Are you up for maintaining the official pydot repository? This is also the perfect opportunity for us to debate transferring ownership from erocarrera/pydot to pydot/pydot. @prmtl already manages GitHub's pydot organization, which currently only contains this fork. If everyone's amenable, here's how this transfer might play out:

  1. @prmtl adds @erocarrera to the pydot organization.
  2. @prmtl enables the Allow members to create repositories for this organization option for the pydot organization. ...at least, temporarily
  3. @prmtl grants @erocarrera repository creation permissions.
  4. @erocarrera transfers ownership from himself to the governing pydot organization, which he would then be a member of.
  5. @erocarrera grants @prmtl PyPI permissions to push new stable releases. Maybe? Does PyPI support that sort of thing? As in most aspects of life, I'm clueless on this part.

It is good, yes? @erocarrera would retain most (...all?) permissions, @prmtl would receive new administrator permissions, and all existing issues, pull requests, and wiki pages would be preserved as is.

If Sebastian Kalinowski (@prmtl) is willing and able, he has my vote for new pydot maintainer; else, I reluctantly nominate myself, Cecil Curry (@leycec). Since all existing stable releases of pydot are incompatible with Python ≥ 3.7, it's vital that somebody step up to the plate, begin merging pending pull requests, and produce a new stable release.

Go, Team PyDot! Go!

@erocarrera
Copy link
Member

Good plan! Let me know when I can proceed with the transfer of ownership.

@prmtl
Copy link
Contributor Author

prmtl commented Nov 17, 2018

Sounds like a plan! I really prefer to maintain pydot over pydot-ng since the former can be called as "industry standard" and I remember the mess we have few years ago with bunch of forks (try to convince libs maintainers to change their requirements again :)).

I've invited @erocarrera and @leycec to the organisation and enabled permission that allow members to create repositories.

@erocarrera
Copy link
Member

Thanks!
I've transferred pydot to @prmtl. GitHub warned me that a repository with that name already existed for prmtl, so I renamed it to pydot-classic. Hope that's fine.

@prmtl
Copy link
Contributor Author

prmtl commented Nov 18, 2018

@erocarrera I think that you should transfer it to pydot organisation not me personally. I've also renamed the repo on my account that gave you this warning, so if transferring to organisation will not work, you can try again with transfer to my account.

@erocarrera
Copy link
Member

erocarrera commented Nov 18, 2018

It should be done, I've transferred it to the pydot organization this time.

@prmtl
Copy link
Contributor Author

prmtl commented Nov 18, 2018

Awesome!

@prmtl
Copy link
Contributor Author

prmtl commented Nov 19, 2018

Update: I've merged fix for Python 3.7 and dropped support of Python 2.6. Now I'm ready to make a release. I've just need permissions to pydot's PyPI.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants