
WeeklyTelcon_20210518

Geoffrey Paulsen edited this page May 18, 2021 · 1 revision

Open MPI Weekly Telecon

Attendees (on Web-ex)

  • Austen Lauria (IBM)
  • Brendan Cunningham (Cornelis Networks)
  • Brian Barrett (AWS)
  • David Bernholdt (ORNL)
  • Edgar Gabriel (UH)
  • Geoffrey Paulsen (IBM)
  • George Bosilca (UTK)
  • Harumi Kuno (HPE)
  • Hessam Mirsadeghi (NVIDIA)
  • Howard Pritchard (LANL)
  • Jeff Squyres (Cisco)
  • Joseph Schuchart (HLRS)
  • Josh Hursey (IBM)
  • Michael Heinz (Cornelis Networks)
  • Naughton III, Thomas (ORNL)
  • Raghu Raja (secret startup)
  • Sam Gutierrez (LANL)
  • Todd Kordenbrock (Sandia)
  • Tomislav Janjusic (NVIDIA)
  • William Zhang (AWS)

Not there today (I keep this for easy cut-n-paste for future notes)

  • Akshay Venkatesh (NVIDIA)
  • Artem Polyakov (NVIDIA)
  • Aurelien Bouteiller (UTK)
  • Brandon Yates (Intel)
  • Charles Shereda (LLNL)
  • Christoph Niethammer (HLRS)
  • Erik Zeiske (HPE)
  • Geoffroy Vallee (ARM)
  • Joshua Ladd (NVIDIA)
  • Marisa Roman (Cornelis Networks)
  • Mark Allen (IBM)
  • Matias Cabral (Intel)
  • Matthew Dosanjh (Sandia)
  • Nathan Hjelm (Google)
  • Noah Evans (Sandia)
  • Ralph Castain (Intel)
  • Scott Breyer (Sandia?)
  • Shintaro Iwasaki
  • Xin Zhao (NVIDIA)

New Items

Company asking to use Open MPI logo

Issue 8763 - Fortran non-blocking Handles

  • Fortran bindings for MPI_Ialltoallw and neighbor version
    • We create an array, pass it into the C binding, and then free it before the C side has completed.
    • Another issue discovered:
      • Size mismatch between 4-byte C ints and 8-byte Fortran integers.
      • This has been in the code "forever".
    • Howard thought we discussed this before and that we added a configury check to disallow this.
    • George thinks he has an elegant solution.
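
To make the two problems above concrete, here is a minimal sketch in plain C, with no MPI calls. It is not Open MPI's code and not the solution George has in mind (the notes do not describe it). It only illustrates one generic approach: range-check each 8-byte Fortran-style integer before narrowing it to a C int, and let the request own the converted array so that it is freed at completion rather than when the binding returns. All names (`sketch_request_t`, `convert_counts`, `sketch_start`, `sketch_wait`) are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a non-blocking request; not an Open MPI type. */
typedef struct {
    int *c_counts;  /* converted copy, owned by the request until completion */
    int  n;
    int  complete;
} sketch_request_t;

/* Narrow 8-byte integers to C ints, refusing values that do not fit.
 * Returns NULL on overflow or allocation failure. */
int *convert_counts(const int64_t *f_counts, int n)
{
    int *c = malloc((size_t)(n > 0 ? n : 1) * sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    for (int i = 0; i < n; i++) {
        if (f_counts[i] > INT_MAX || f_counts[i] < INT_MIN) {
            free(c);
            return NULL;
        }
        c[i] = (int)f_counts[i];
    }
    return c;
}

/* "Binding" entry point: starts the operation and returns immediately.
 * The converted array is NOT freed here; the request keeps it alive. */
int sketch_start(const int64_t *f_counts, int n, sketch_request_t *req)
{
    req->c_counts = convert_counts(f_counts, n);
    req->n = n;
    req->complete = 0;
    return req->c_counts == NULL ? -1 : 0;
}

/* Completion: stands in for the C side reading the array late.
 * This is the only place the converted array is released. */
long sketch_wait(sketch_request_t *req)
{
    long sum = 0;
    assert(req->c_counts != NULL);
    for (int i = 0; i < req->n; i++) {
        sum += req->c_counts[i];
    }
    free(req->c_counts);
    req->c_counts = NULL;
    req->complete = 1;
    return sum;
}
```

The design point is ownership: if `sketch_start` freed the converted array before returning, `sketch_wait` would read freed memory, which is the shape of the bug described in the notes.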

Partial datatype update: not a bug

  • End result: this is NOT a bug. It is user error, though a subtle one.
    • v4.1.1 stands as is, and v4.0.x is clear of this.
    • Issue updated and closed.

v4.0.x

  • Jeff will triple check PR 8859
  • We'll do one more RC, and then get a final v4.0.6 out.
  • Where are we on pack/unpack with long and long double?
    • Only external32.
    • This worked before, but not sure

v4.1.x

  • Want to ask the community if anyone has a need to get a v4.1.2 out.
  • No driver to rush, so now just in bugfix phase.

v5.0.x

  • Originally targeted mid-May for an RC.
  • Newish issues on master / v5
  • Any update on debugger support?
  • Need some documentation that Open MPI v5.0 supports PMIx based debuggers, and that if
    • IBM is working on some CI testing with MPIR (typically very brittle)
    • PMIx debugger status
  • Howard posted some configury on mca-no-build.
    • A couple of places AUTOCONF wasn't used correctly.
  • 8925 - hang / processes left over
  • CUDA linking
  • Did we ever get the fault tolerant behavior resolved?
  • MPI_T_finalize: if you then do an init, you get a meltdown in OFI.
    • Howard is posting a PR for this today.

Reformatting

  • Done for now. Leaving what was done in opal already.
  • Closed the ompi dir PRs without merging.
  • If someone wants to look for a different tool that just enforces the minimal rules, we might be willing to accept.
  • MPICH uses GNU indent, if someone wants to look at this, fine.
  • They also don't have some challenges we do (huge blocks of
  • Joseph looked at indent, but uncrustify
  • Jeff Sq. outlined the fix to the git commit checker CI: https://github.com/open-mpi/ompi/pull/8947.
    • Went in.

Master

  • No discussion.

MTT

  • Mellanox hasn't been reporting for a while. Tommi will follow up.

PMIx

  • No discussion.

PRRTE v2.0

  • No update.

Longer Term discussions

  • No discussion.