Parallel inter/intra-spkg builds #8306

Closed
qed777 mannequin opened this issue Feb 19, 2010 · 99 comments

Comments

@qed777
Mannequin

qed777 mannequin commented Feb 19, 2010

Along with a primed ccache, compiling multiple spkgs in parallel may significantly speed up Sage builds on multicore machines. See sage-release for some information.

To build multiple spkgs in parallel:

Also:

CC: @mwhansen @sagetrac-mvngu

Component: build

Author: Mitesh Patel, John Palmieri

Reviewer: David Kirkby, John Palmieri

Merged: sage-4.5.alpha0

Issue created by migration from https://trac.sagemath.org/ticket/8306

@qed777 qed777 mannequin added c: build labels Feb 19, 2010
@qed777
Mannequin Author

qed777 mannequin commented Feb 25, 2010

comment:2

See #7943 and #8191 for recent changes to makefile, spkg/install, and spkg/standard/deps.

@qed777
Mannequin Author

qed777 mannequin commented Feb 25, 2010

Author: Mitesh Patel

@qed777 qed777 mannequin added this to the sage-4.3.4 milestone Feb 25, 2010
@qed777
Mannequin Author

qed777 mannequin commented Feb 25, 2010

comment:4

With a "primed" compiler cache, i.e., I've already built Sage at least once, I can build Sage sans docs in about 15-20 minutes on an otherwise mostly idle sage.math --- with make -j20. The long doctests pass (after I've built the docs).

@qed777 qed777 mannequin added the s: needs review label Feb 25, 2010
@JohnCremona
Member

comment:5

What is it about eclib which makes this fail? I would be happy to change it if only I knew -- the "exceptionally clear explanation" you refer to was not quite clear enough for me... John

@qed777
Mannequin Author

qed777 mannequin commented Feb 26, 2010

comment:6

Oops. That was an attempt at humor.

I don't know why the build fails. Here's a log from building eclib with +.

A possibly related problem is that the top-level make doesn't notice the error. It keeps going until "Sage build/upgrade complete!". I'll see what happens if I add $(INST)/$(ECLIB) to "all"'s list of dependencies.

@qed777
Mannequin Author

qed777 mannequin commented Feb 27, 2010

comment:7

Replying to @qed777:

[...] I'll see what happens if I add $(INST)/$(ECLIB) to "all"'s list of dependencies.

Same result. All of the other spkgs build properly. But I definitely need to figure out why this happens with eclib and check the other spkgs for similar potential behavior.

@qed777 qed777 mannequin added s: needs work and removed s: needs review labels Feb 27, 2010
@qed777
Mannequin Author

qed777 mannequin commented Feb 27, 2010

Attachment: eclib_makefiles.patch.gz

Tweak eclib Makefiles so it builds with +. eclib src repo.

@qed777
Mannequin Author

qed777 mannequin commented Feb 27, 2010

comment:8

This patch gets eclib to build with +. I noticed that eclib's spkg-install contains

if [ "$MAKE" == gmake ]; then 
   echo "using gmake"
else
   echo "Disabling parallel make for now"
   MAKE=make; export $MAKE
fi

Is there a particular reason for this?
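
For reference, `export $MAKE` expands the variable before exporting, so with `MAKE=make` it exports a variable named `make` rather than `MAKE`. Also, `==` inside `[ ]` is a bash/ksh extension; POSIX `test` uses a single `=`. Below is a minimal sketch of what the fallback branch presumably intends. The function name `choose_make` is hypothetical, and this is an assumption about intent, not the contents of the spkg:

```shell
# Hypothetical helper, not part of eclib's spkg-install: pick the make
# program the way the quoted snippet appears to intend.
choose_make() {
    if [ "$MAKE" = gmake ]; then      # POSIX test uses a single "="
        echo "using gmake"
    else
        echo "Disabling parallel make for now"
        MAKE=make
        export MAKE                   # export the variable name, not its value
    fi
}
```

Calling `choose_make` with `MAKE="make -j4"` leaves `MAKE` set to plain `make` and visible to child processes.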

@qed777 qed777 mannequin added s: needs review and removed s: needs work labels Feb 27, 2010
@qed777
Mannequin Author

qed777 mannequin commented Feb 27, 2010

comment:10

I've updated the attachments so the usual build behavior is the default. To tell make it's OK to build multiple spkgs at a time, set PARALLEL_SPKG_BUILD="yes" near the top of spkg/standard/deps.

The new deps "depends" on an eclib spkg with attachment: eclib_makefiles.patch (or an equivalent). Should I make it part of #8357?

@qed777
Mannequin Author

qed777 mannequin commented Feb 27, 2010

comment:12

Replying to @qed777:

I've updated the attachments so the usual build behavior is the default. To tell make it's OK to build multiple spkgs at a time, set PARALLEL_SPKG_BUILD="yes" near the top of spkg/standard/deps.

That should be "near the top of spkg/install."

@sagetrac-drkirkby
Mannequin

sagetrac-drkirkby mannequin commented Feb 27, 2010

Reviewer: David Kirkby

@sagetrac-drkirkby
Mannequin

sagetrac-drkirkby mannequin commented Feb 27, 2010

comment:13

The idea seems excellent. It is such a waste on modern machines to

I think it would be a lot better if an environment variable, SAGE_PARALLEL_BUILD or SAGE_PARALLEL_SPKG_BUILD, were set to "yes", rather than requiring the user to edit spkg/install. Since virtually all other environment variables are prefixed with SAGE (e.g. SAGE_FORTRAN, SAGE_FORTRAN_LIB, SAGE64 ...), I would do likewise for consistency.
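
As a sketch of the environment-variable approach, assuming the name SAGE_PARALLEL_SPKG_BUILD proposed above and a hypothetical helper `parallel_spkg_build_enabled` (the actual patch may do this differently):

```shell
# Hypothetical helper: report whether parallel spkg builds were requested.
# SAGE_PARALLEL_SPKG_BUILD is the name proposed in this comment; defaulting
# to "no" keeps the usual serial behavior unless the user opts in.
parallel_spkg_build_enabled() {
    [ "${SAGE_PARALLEL_SPKG_BUILD:-no}" = "yes" ]
}

if parallel_spkg_build_enabled; then
    echo "building spkgs in parallel"
else
    echo "building spkgs one at a time"
fi
```

Treating anything other than the literal "yes" as "no" means a typo in the variable's value falls back to the serial build rather than silently enabling the parallel one.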

It would be necessary to test this on different architectures and operating systems. It is quite possible that the time different packages take to build varies widely with the processor and operating system. It is likely that package X always finishes building before package Y on sage.math, but Y would finish before X on another system. That could well mean there are dependencies that are not apparent in serial builds.

For example, ATLAS takes at least 10x as long to build on 't2' as it does on my own SPARC, simply because there are no default tuning parameters for ATLAS on t2, so it is tuned each time. That is an extreme example, but it is well known that some machines are faster at some tasks and slower at others. Some packages have assembly language support for certain processors, but not others. Several packages go through some sort of tuning process to determine the optimal build parameters. So the timings could be expected to be different on different operating systems.

At the very least, this should be tested on t2 and bsd, as they run Solaris and OS X respectively. (When testing on t2, I would suggest using j of 256 or 512. That machine has 128 threads.) For t2, one would need to use sage 4.3.0.1. (I think any changes to spkg/install and spkg/standard/deps should be minimal between 4.3.0.1 and 4.3.3. There are probably no changes at all.)

It would be safer to compare the md5 checksums of libraries & binaries built in series and parallel to prove they are indeed the same. It is quite possible that there are failures that just do not get exercised by doctests. However, that may not be fully possible, as perhaps some would have the build time information encoded in some way. But I would at least investigate.
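
A rough sketch of such a comparison, assuming `md5sum` is available (on OS X the tool is `md5` instead) and that the serial and parallel builds live in two separate directory trees. The helper names `checksum_tree` and `same_build` are hypothetical:

```shell
# Hypothetical helper: print "checksum  ./relative/path" for every regular
# file under a directory, sorted by path so two trees can be compared.
checksum_tree() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 )
}

# Hypothetical helper: succeed only if both trees contain the same files
# with the same checksums. Returns 2 if either tree cannot be read.
same_build() {
    a=$(checksum_tree "$1") || return 2
    b=$(checksum_tree "$2") || return 2
    [ "$a" = "$b" ]
}
```

For example, `same_build serial/local/lib parallel/local/lib || echo "builds differ"`, where both paths are placeholders. As noted above, files that embed build-time information would differ even when the builds are otherwise equivalent, so a mismatch is a prompt to investigate rather than proof of a broken build.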

Overall, I think this is a really excellent idea, but it needs further testing before I'd want to give it a positive review.

Dave

@JohnCremona
Member

comment:14

I didn't write eclib's spkg-install, and I know that several people have changed it in the past -- there's at least one ticket out there specifically about managing eclib's Makefiles.

Next time someone edits it they should change the two occurrences of "cremona" in error messages to "eclib".

@qed777
Mannequin Author

qed777 mannequin commented Feb 27, 2010

comment:15

Replying to @JohnCremona:

I didn't write eclib's spkg-install, and I know that several people have changed it in the past -- there's at least one ticket out there specifically about managing eclib's Makefiles.

Indeed, the workaround is from #4228.

Next time someone edits it they should change the two occurrences of "cremona" in error messages to "eclib".

I'll make a new spkg available at #8357.

@qed777
Mannequin Author

qed777 mannequin commented Jun 10, 2010

comment:69

Replying to @jhpalmieri:

I agree that the R issue can go on another ticket.

I've opened #9201 for this problem.

@qed777
Mannequin Author

qed777 mannequin commented Jun 24, 2010

Attachment: install.gz

spkg/install. Rebased for 4.4.4.

@qed777
Mannequin Author

qed777 mannequin commented Jun 24, 2010

Attachment: deps.gz

spkg/standard/deps. Rebased for 4.4.4.

@qed777
Mannequin Author

qed777 mannequin commented Jun 24, 2010

comment:71

I've updated deps and install (and diffs) for Sage 4.4.4's removal of GHMM.

@qed777
Mannequin Author

qed777 mannequin commented Jun 24, 2010

comment:72

Dependencies: #8645, #9185, #9186, #9264.

@jhpalmieri
Member

comment:73

For what it's worth, I think that your "diff" files for deps and install are backwards, showing the diffs from the new version to the old, rather than from the old to the new.

The deps file is a bit off: first, under the target "all", $(INST)/$(SAGENB) is listed, but it's not in the "real" version of deps. I think this is good, and I think it's a bug in the version of deps distributed with Sage. Other targets are missing from "all", though:

  $(INST)/$(JINJA)
  $(INST)/$(JINJA2)
  $(INST)/$(PYGMENTS)
  $(INST)/$(SPHINX)
  $(INST)/$(SQLALCHEMY)
  $(INST)/$(TWISTED)
  $(INST)/$(TWISTEDWEB2)

Also, the following lines are missing:

$(INST)/$(TWISTEDWEB2): $(INST)/$(TWISTED)
	$(INSTALL) "$(SAGE_SPKG) $(TWISTEDWEB2) 2>&1" "tee -a $(SAGE_LOGS)/$(TWISTEDWEB2).log"

I'm attaching "deps-new" and "deps-new.diff" to fix this.
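
The rule above hands `$(INSTALL)` two strings: a build command and a logging command. The thread does not show what `$(INSTALL)` expands to. Purely as an illustration, a helper of that general shape could run the first command, send its output through `tee`, and still return the build's exit status instead of `tee`'s, so that make notices a failed spkg. The name `run_and_log` and its interface are assumptions:

```shell
# Hypothetical helper, not the actual $(INSTALL): run a command string,
# append its combined output to a log via tee, and return the command's
# own exit status (a plain "cmd | tee" would return tee's status).
run_and_log() {
    cmd=$1
    log=$2
    status_file=$(mktemp) || return 1
    # Record the command's status inside the pipeline, before tee sees EOF.
    { sh -c "$cmd" 2>&1; echo $? > "$status_file"; } | tee -a "$log"
    status=$(cat "$status_file")
    rm -f "$status_file"
    return "$status"
}
```

With this, `run_and_log "exit 3" build.log` returns 3, whereas `sh -c "exit 3" 2>&1 | tee -a build.log` returns 0 in a POSIX shell.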

@jhpalmieri
Member

comment:74

Okay, now I'm confused.

  $(INST)/$(TWISTEDWEB2)

What is this package? It's included in the target "all", and the current version of deps has lines

$(INST)/$(TWISTEDWEB2): $(INST)/$(TWISTED)
	$(SAGE_SPKG) $(TWISTEDWEB2) 2>&1

but it doesn't look like there's an actual package there. For instance, there are no lines like this in install:

TWISTEDWEB2=`$newest twistedweb2`
export TWISTEDWEB2

@jhpalmieri
Member

comment:75

I think twistedweb2 used to be a package but is no longer. I can't find the relevant ticket, but I'm going to remove references to it from deps. (See also a comment at #9274.)

@jhpalmieri
Member

spkg/standard/deps. Rebased for 4.4.4.

@jhpalmieri
Member

Attachment: deps-new.gz

diff between "deps" in 4.4.4 and "deps-new"

@jhpalmieri
Member

Attachment: deps-new.diff.gz

Attachment: deps-deps-new.diff.gz

diff between mpatel's "deps" and "deps-new"

@jhpalmieri
Member

comment:76

I tried this again on t2 yesterday. With MAKE="make -j4", it worked, building Sage (except for Atlas, which I installed by hand) in 4 hours. For future tickets, we might want to pursue the following: with MAKE="make -j36", the installation failed on the packages lapack, mpir, r, and sage. (After each failure, I switched to "make -j4" until the problematic package was built, then switched back to "j36".)
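
The manual "drop to -j4 on failure" procedure could be scripted roughly as follows. This is only a sketch: `build_with_fallback` is a hypothetical helper, and the build command is whatever reads `MAKE` from the environment.

```shell
# Hypothetical helper: run a build command with a high job count and, if
# that fails, retry once with a lower one. The command is passed as the
# remaining arguments and sees the chosen value in the MAKE variable.
build_with_fallback() {
    high=$1
    low=$2
    shift 2
    if MAKE="make -j$high" "$@"; then
        return 0
    fi
    echo "build failed with -j$high; retrying with -j$low" >&2
    MAKE="make -j$low" "$@"
}
```

For example, `build_with_fallback 36 4 some_build_command`, where `some_build_command` is a placeholder. A retry only papers over the failures seen with lapack, mpir, r, and sage; it does not explain them.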

@rlmill
Mannequin

rlmill mannequin commented Jun 25, 2010

Merged: sage-4.5.alpha0

@rlmill rlmill mannequin removed the s: positive review label Jun 25, 2010
@rlmill rlmill mannequin closed this as completed Jun 25, 2010
@qed777
Mannequin Author

qed777 mannequin commented Jun 26, 2010

Attachment: deps.diff.gz

Diff of spkg/standard/deps vs 4.4.4.

@qed777
Mannequin Author

qed777 mannequin commented Jun 26, 2010

Diff of spkg/install vs 4.4.4.

@qed777
Mannequin Author

qed777 mannequin commented Jun 26, 2010

comment:78

Attachment: install.diff.gz

Replying to @jhpalmieri:

I think twistedweb2 used to be a package but is no longer. I can't find the relevant ticket, but I'm going to remove references to it from deps. (See also a comment at #9274.)

I mentioned twistedweb2 in an older version of the description.

Maybe we should add the missing targets (Sphinx, etc.) at #9274? It does seem better to be explicit about the dependencies.
