Closed
Bug 31364
Opened 25 years ago
Closed 24 years ago
parallel build dies of race condition xpidl<->mkdir in export
Categories
(SeaMonkey :: Build Config, defect, P3)
Tracking
(Not tracked)
VERIFIED
FIXED
M18
People
(Reporter: axel, Assigned: cls)
Details
Attachments
(4 files: all patches, all deleted)
Doing a parallel build with -j12 on a 12-processor machine, with sources in /tmp
(cvsco.log from Mar 10 20:12 MET), fails with error messages like the one below.
This is not the first one, but they don't differ much; it is from the second run,
after nsprpub.
make[3]: Entering directory `/tmp/build/rdf/chrome'
Creating .deps
make[4]: Entering directory `/tmp/build/rdf/chrome/public'
Creating .deps
Creating _xpidlgen
nsIChromeRegistry.idl
../../../dist/bin/xpidl -m header -w -I ../../../dist/idl -I/tmp/mozilla/rdf/chrome/public -o _xpidlgen/nsIChromeRegistry /tmp/mozilla/rdf/chrome/public/nsIChromeRegistry.idl
nsIChromeEntry.idl
../../../dist/bin/xpidl -m header -w -I ../../../dist/idl -I/tmp/mozilla/rdf/chrome/public -o _xpidlgen/nsIChromeEntry /tmp/mozilla/rdf/chrome/public/nsIChromeEntry.idl
../../../config/nsinstall -R -m 444 /tmp/mozilla/rdf/chrome/public/nsIChromeRegistry.idl /tmp/mozilla/rdf/chrome/public/nsIChromeEntry.idl ../../../dist/idl
error opening output file: No such file or directory
make[4]: *** [_xpidlgen/nsIChromeEntry.h] Error 1
make[4]: *** Waiting for unfinished jobs....
make[4]: Leaving directory `/tmp/build/rdf/chrome/public'
make[3]: *** [export] Error 2
make[3]: Leaving directory `/tmp/build/rdf/chrome'
Updated•25 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
Can one of the three of you who see this problem try building with
XPIDL_GEN_DIR=. set? Our problems with parallel builds with the classic NSPR
build stemmed from all of the mkdir calls used to create the .OBJ dirs.
Comment 2•25 years ago (Reporter)
As setting the environment variable didn't cut it, I changed rules.mk by hand,
replacing
XPIDL_GEN_DIR = _xpidlgen
with
XPIDL_GEN_DIR = .
on line 234.
Then the race is fixed, but the build breaks with
make[1]: Entering directory `/tmp/build/widget/public'
nsIWidget.idl
../../dist/bin/xpidl -m header -w -I ../../dist/idl -I/tmp/mozilla/widget/public -o .//tmp/mozilla/widget/public/nsIWidget /tmp/mozilla/widget/public/nsIWidget.idl
error opening output file: No such file or directory
make[1]: *** [/tmp/mozilla/widget/public/nsIWidget.h] Error 1
make[1]: Leaving directory `/tmp/build/widget/public'
make: *** [export] Error 2
New bug? cls said he would work on it after getting up again.
Axel
Comment 3•25 years ago
How about generating the dirs when generating the makefiles? I made a hack to
acoutput-fast.pl that checks whether a makefile has XPIDLSRCS and creates the
xpidl dirs.
I found some errors in allmakefiles when testing this; I am attaching a patch
that has the hacks to acoutput-fast.pl and fixes to allmakefiles.sh.
How about making the .deps dirs the same way?
Comment 4•25 years ago
Because you still need to be able to make the directories on the fly after
someone does a 'make clean'. Axel, which version of gnu make are you using?
Comment 6•25 years ago (Reporter)
I use GNU Make version 3.78.1.
I had a different idea: how about a dummy target that is added to the
dependencies?
$(XPIDL_GEN_DIR)/%.h: %.idl $(XPIDL_COMPILE) dirs_target
	$(REPORT_BUILD)
dirs_target:
	@if test ! -d $(XPIDL_GEN_DIR); then echo Creating $(XPIDL_GEN_DIR); rm -rf $(XPIDL_GEN_DIR); mkdir $(XPIDL_GEN_DIR); else true; fi
This way, the headers depend on the existence test for the dir, but not on the
dir itself, right?
I've had nothing but problems doing parallel makes with GNU make > 3.77. I
don't know what Smith changed with the jobserver stuff, but it doesn't work.
Downgrade to 3.76.1 and let me know if the problem persists.
Mass reassign of all bugs where I was listed as the QA contact.
QA Contact: cyeh → chofmann
Comment 10•25 years ago (Assignee)
After some digging through the bug-make mail archive, I ran across a thread that
seems to indicate that there is a serious bug in at least make 3.78.1. Look
at the '3.78.1 Error with "::" targets and "-j" option' thread.
http://www.geocrawler.com/archives/3/351/1999/11/0/
From experience, it doesn't appear to have been fixed in 3.79, but I don't see
anything about it one way or the other.
Comment 11•24 years ago
While make 3.79.1 does fix the bug mentioned in bug-make, it does not fix this.
I wonder if we can come up with a small example to submit to the make people.
Comment 12•24 years ago
Also, btw, I tried make 3.77, but it seems to have another bug that makes it
fail immediately:
make[5]: Entering directory `/mnt/proj/mozilla/mozilla/nsprpub/pr/include/obsolete'
../../../config/SunOS5.7_sparc_32_PTH_DBG.OBJ/nsinstall -R -m 444 /mnt/proj/mozilla/mozilla/dist/include/obsolete
usage: ../../../config/SunOS5.7_sparc_32_PTH_DBG.OBJ/nsinstall [-C cwd] [-L linkprefix] [-m mode] [-o owner] [-g group]
[-DdltR] file [file ...] directory
make[5]: *** [export] Error 2
Comment 13•24 years ago (Assignee)
That is a known issue with the $(wildcard) feature and make 3.77 under Solaris.
You will need to downgrade to 3.76.1. :-/
Comment 14•24 years ago
Part of this looks like a basic test-and-create-is-not-atomic race. In many
places across the makefiles we have:
if test ! -d foo; then rm -rf foo; mkdir foo; else true; fi
If I just use "mkdir -p foo" instead, the xpidlgen problems go away (however I
still get errors building nspr; I'm looking into those). What's the reasoning
behind this test?
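A minimal sketch of why the swap helps, assuming a POSIX shell and a mkdir that supports -p (which is not guaranteed on every platform). Several background jobs try to create the same directory at once; the directory name and the job count are made up for illustration and are not taken from the Mozilla makefiles. With mkdir -p, a job that loses the race still exits 0, where a plain mkdir would fail with "File exists".

```shell
#!/bin/sh
# Sketch: eight concurrent jobs all try to create the same directory.
# "workdir", "_xpidlgen_demo" and the job count are illustrative only.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
target="$workdir/_xpidlgen_demo"
failures=0
pids=""

for i in 1 2 3 4 5 6 7 8; do
    # mkdir -p treats "already exists" as success, so losing the race is harmless.
    mkdir -p "$target" &
    pids="$pids $!"
done

# Collect the exit status of every background job.
for pid in $pids; do
    wait "$pid" || failures=$((failures + 1))
done

echo "failed jobs: $failures"
```

Replacing `mkdir -p` with plain `mkdir` in the loop is an easy way to see the losing jobs fail instead.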
Comment 15•24 years ago
Comment 16•24 years ago
My current patch is above. With this applied, the only error I can consistently
reproduce is one that also happens sometimes on non-SMP systems (well, Master_D
is getting it at least, and he's not SMP). I don't think it's 100% fixed though.
Comment 17•24 years ago (Assignee)
I believe the reason for the test is that the -p option is not supported on
mkdir on all platforms. I'm wondering if we shouldn't just start using a
mkinstalldirs script like a number of projects do?
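For readers unfamiliar with the idea, here is a rough sketch of what a mkinstalldirs-style helper does, assuming only a POSIX shell and a mkdir without -p. The function name make_dirs and the demo paths are hypothetical; this is not the script shipped by any particular project. It creates each path component in turn and treats a failed mkdir as harmless when the directory exists afterwards, which is what makes it tolerant of another job creating the same directory concurrently.

```shell
#!/bin/sh
# Sketch of a mkinstalldirs-style helper. make_dirs is a hypothetical name.
make_dirs() {
    for dir_path in "$@"; do
        built=
        case $dir_path in /*) built=/ ;; esac
        rest=$dir_path
        # Walk the path one component at a time, as mkdir -p would.
        while [ -n "$rest" ]; do
            part=${rest%%/*}
            case $rest in
                */*) rest=${rest#*/} ;;
                *)   rest= ;;
            esac
            [ -z "$part" ] && continue
            built="$built$part"
            if [ ! -d "$built" ]; then
                # Another job may create the directory between the test and
                # the mkdir, so only fail if it is still missing afterwards.
                mkdir "$built" 2>/dev/null || [ -d "$built" ] || return 1
            fi
            built="$built/"
        done
    done
    return 0
}

# Usage: the demo root and subdirectories are illustrative only.
demo_root=$(mktemp -d)
trap 'rm -rf "$demo_root"' EXIT
make_dirs "$demo_root/dist/idl" "$demo_root/dist/include/obsolete"
first=$?
make_dirs "$demo_root/dist/idl"   # second call finds everything in place
second=$?
echo "first=$first second=$second"
```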
Comment 18•24 years ago
Would it be possible to simply make sure the 'export' target always gets built with -j1? This is where all the problems are, so it would at least be a good workaround until we figure out mkinstalldirs (I'm not familiar with the details of that, unfortunately) or something else.
Comment 19•24 years ago
Adding self to CC, as our Unix daily build systems are multi-CPU but aren't
doing parallel builds.
Comment 20•24 years ago
hmm, I just tried doing a non-parallel make export and a -j4 make install on
sol26 and cut the build time from about 5 hours down to about 4, but I got a lot of
gmake[2]: warning: -jN forced in submake: disabling jobserver mode.
Other than that, it seemed to complete without problems. If it works on hpux
and linux I'll turn it on for the daily builds.
Comment 21•24 years ago
I think this is because all the submakes use -j4. Taken literally,
this would mean that each submake should start 4 jobs. Since this is
obviously not what you want, it ignores that and coordinates the number
of jobs with the parent make. The warning is to tell you it's doing that.
It might go away if we could tell the submakes not to use -jN, but that's
probably not trivial. So I think it can be ignored.
Comment 22•24 years ago
I finally got around to configuring the dual processor linux box for daily
verification builds. Once I get the daily builds switched over to the new
system (test build going now) I'll be looking at turning this on again for the
daily builds...
Comment 23•24 years ago (Assignee)
Comment 24•24 years ago (Assignee)
If I make sure that the generated headers have an actual dependency upon a
target that makes the XPIDL_GEN_DIR, then the problem goes away for me. Can
someone with a hoss test box try this out?
Note: they cannot actually depend upon XPIDL_GEN_DIR as the timestamp of
XPIDL_GEN_DIR changes when its contents change.
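The timestamp point can be checked without make. The sketch below assumes a shell whose test builtin supports -nt (common, but not required by POSIX) and a filesystem with at least one-second timestamps. The file names .done, nsIFoo.h and nsIBar.h are placeholders chosen for illustration, not names confirmed by the patch. It shows that adding a second header bumps the directory's mtime past the first header, so a rule depending on the directory would see that header as stale, while a stamp file inside the directory stays older.

```shell
#!/bin/sh
# Sketch: a directory's mtime moves whenever an entry is added to it,
# while a stamp file inside it keeps its own, older mtime.
# The names .done, nsIFoo.h and nsIBar.h are illustrative only.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
gen_dir="$work/_xpidlgen"
mkdir "$gen_dir"
touch "$gen_dir/.done"        # stamp created right after the directory
sleep 1
touch "$gen_dir/nsIFoo.h"     # first generated header
sleep 1
touch "$gen_dir/nsIBar.h"     # second header: bumps the directory's mtime

# Would nsIFoo.h look out of date against each candidate prerequisite?
if [ "$gen_dir" -nt "$gen_dir/nsIFoo.h" ]; then dir_newer=yes; else dir_newer=no; fi
if [ "$gen_dir/.done" -nt "$gen_dir/nsIFoo.h" ]; then stamp_newer=yes; else stamp_newer=no; fi

echo "directory newer than nsIFoo.h: $dir_newer"
echo "stamp newer than nsIFoo.h: $stamp_newer"
```

Depending on the stamp rather than the directory keeps the "create it first" ordering without forcing every header to be regenerated on each run.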
Comment 25•24 years ago (Reporter)
Comment 26•24 years ago (Reporter)
Hi,
I tested a (modified version of) cls' patch. The file in _xpidlgen does the
trick.
I gave cls' patch a bit of a facelift.
First, there were some security patches in there; I removed those.
XPIDL_GEN_DIR is no longer part of the MAKE_DIRS variable, as we have
the right dependency in there; there is no need to have it twice.
I rephrased the generating line a bit. Nothing much happened there.
I tested this on our machine with make -j6 export. The load is not
particularly low at the moment, but 4 procs were free.
I figure I would have been caught out if this didn't work.
Clobber worked out all right, too.
r=me
Axel
Comment 27•24 years ago (Assignee)
Patch has been checked in. Marking fixed.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 28•24 years ago
Tested the patch on my SMP system here, works fine. Marking verified.
Status: RESOLVED → VERIFIED
Updated•20 years ago
Product: Browser → Seamonkey