Closed Bug 40757 Opened 25 years ago Closed 20 years ago

JS GC safety model (requests, newborns, local roots) hard to use

Categories

(Core :: JavaScript Engine, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla1.8alpha4

People

(Reporter: brendan, Assigned: brendan)

References

Details

Attachments

(1 file, 7 obsolete files)

"Users" below means jsapi.h users. Users must hand off newborns to local roots. I've seen too many forget, or do it after some amount of code that might call another constructor and overwrite the newborn, then call the GC. Users must allocate and use local roots. Users must use JS_Begin and JS_EndRequest around maximal non-blocking sequences of JS API and native code. Again, it has proven hard for some users to spot the preemption points. A better model would auto-allocate local roots each time there is a new birth, obviating the need for newborns. This is what JNI, based on warren's JRI, does. Then you have to worry about local root bloat (say you are constructing 100,000 objects in a loop, storing each in a rooted object as you go), so you need some DeleteLocalRoot primitive. This model could be used for unrooted atoms too, inside the engine. Then we wouldn't need to keep the GC from running when any thread is in a JS API call that atomizes, and in particular in the compiler. /be
Status: NEW → ASSIGNED
Keywords: js1.5
Target Milestone: --- → M16
The Request API doesn't detect that the inner request should not try to wait for the GC to finish, but it should -- otherwise you can easily deadlock cuz the GC may be waiting for the outer request to end. But this just points out that nested requests, one or more contexts, one thread are unnecessary. Once you know a context is in a request on this thread, you can safely call the API using any context. That's because requests are required only to keep the GC at bay, so any context's request will do if you're in one. scullin@geocast.com already bore some brunt here, so I think we should fix JS_BeginRequest to detect whether any context is in a request on this thread. We could search the runtime's context list for other contexts with the same thread-id in cx->thread, and with non-zero cx->requestDepth. The GC does that already, in an attempt to ignore such nested requests when the innermost one calls into the GC. But the GC can afford linear search and quadratic growth rates over the number of contexts -- can JS_BeginRequest? /be
Moving out, will fix soon. I hope js1.5 wraps up in M17. /be
Target Milestone: M16 → M17
Priority: P3 → P1
Blocks: 39125
I think eliminating the request model implies either stop-all-threads as an NSPR primitive, or else incremental GC. I don't think NSPR supports stop-all-threads any longer, it was not portable (where it was claimed to work, it was buggy in ways that were unfixable without more OS changes). Wtc, do you remember more of this history from the old days of the netscape.com Java VM? So incremental GC is the holy grail: then, with a write-barrier inside the set-slot primitive for objects, we could let the GC do some work while scripts ran and mutated objects not in the generation being scanned. The GC would have to respect object locks by not taking them (else it could deadlock, unless the GC lock was never needed by a thread while that thread held an object locked). More in a bit. /be
PR_SuspendAll and PR_ResumeAll are no longer supported. Pthreads doesn't have functions to suspend and resume a thread. One would need to use either non-portable API or a technique based on signals to suspend and resume a pthread.
Not likely to happen in M18 timeframe. /be
Target Milestone: M17 → M20
Target Milestone: M20 → mozilla1.0
Well, js1.5 needs to go out soon (almost there), but this bug is targeted at mozilla1.0, which is next year. So removing js1.5. If someone needs this in js1.5, mozilla0.9, etc. please set the keyword back. /be
Keywords: js1.5
Lowering priority in light of other needs. /be
Priority: P1 → P3
I can't take the guilt any more. (And I'm working on incremental/generational GC.)
Assignee: brendan → shaver
Status: ASSIGNED → NEW
Priority: P3 → P2
*** Bug 77636 has been marked as a duplicate of this bug. ***
The debugger isn't going to do much of anything (except crash) until something happens here, or we check in the bug 77636's patch.
I don't think I want to change the local-root model for 1.0, even if I thought I would be able to get it done.
Target Milestone: mozilla1.0 → mozilla1.1
Taking for 1.5. /be
Assignee: shaver → brendan
Target Milestone: mozilla1.1alpha → mozilla1.5alpha
Status: NEW → ASSIGNED
Target Milestone: mozilla1.5alpha → mozilla1.6alpha
Target Milestone: mozilla1.6alpha → mozilla1.7alpha
I'm not sure this is worth hacking on, as opposed to Narcissus. Someone else should feel free to step up. I'll advise, given a righteous hacker who wants to do this. /be
Target Milestone: mozilla1.7alpha → Future
Needed for E4X. Patch in a moment. /be
Target Milestone: Future → mozilla1.8alpha4
Attached patch scoped local roots (obsolete) (deleted) — Splinter Review
Shaver, here it is, broken out of the impending gigundo e4x patch. /be
Notice that JS_LeaveLocalRootScope eagerly frees cx->localRootStack, so there is the possibility of malloc churn here. We'd need an lrs->entryCount member to know when an allocated, empty stack was active, implying js_AllocGCThing should push on it rather than set cx->newborn. /be
Attachment #156388 - Flags: review?(shaver)
The entryCount malloc churn fix I mentioned would add bloat, of course (264 bytes per context, net of malloc overhead). Pick your poison. Note that the patch contains a relocation of StartResolving and StopResolving from jsobj.c to jscntxt.c, where they belong. This anticipates the e4x patch and helps reduce stress in that bug. /be
Comment on attachment 156388 [details] [diff] [review] scoped local roots This looks pretty good, though I want to re-read in the AM. How should we use this to properly solve the atom-reclamation issue from bug 77636? I also wonder about using an arena and resizing vector of marks, rather than the individually-allocated lrs frames, but I'll ponder that some more.
I started using an arena pool, but the lack of strong typing made it more pain than it was worth. With the ad-hoc structure, you can count on the compiler to do the alignment (no slop overallocated from malloc), you can index into lrc->roots, and the result is much more readable. More in a bit on atoms. Note that last-ditch GCs from the engine itself do not collect atoms (part of the patch in bug 39125), so it's only the debugger hook call-outs that might run a full GC (bug 77636). /be
Comment on attachment 156388 [details] [diff] [review] scoped local roots New patch coming. /be
Attachment #156388 - Flags: review?(shaver)
Also, I fixed some comment typos and infelicitous wordings. This looks good, I'd like to get it into 1.8a4 when the trunk opens. /be
Attachment #156388 - Attachment is obsolete: true
Also, commented js_LeaveLocalRootScope better. Ready for review! Oh, did I say that before? ;-) Looking for bugs myself is no fun, I want a code buddy. /be
Attachment #156448 - Attachment is obsolete: true
Namely, this one. /be
Attachment #156451 - Attachment is obsolete: true
Attachment #156452 - Attachment description: argh, last patch was a stale copy of the righ patch → argh, last patch was a stale copy of the right patch
Attachment #156452 - Flags: review?(shaver)
When mark != 0 but m == 0 by the bottom of the function. This case is where a scope just happens to claim the first element in a non-initial chunk (malloc'd separately from the stack). When leaving that scope, the saved lrs->scopeMark stored at that lrc->roots[0] will be nulled, but without this patch's fix (at the bottom of js_LeaveLocalRootScope), lrc would be wasted: the next push would see that (lrs->rootCount & JSLRS_CHUNK_MASK) == 0, and lrs->rootCount != 0, so it would allocate a new lrc and push it on top of the empty one. Whew. Clearly, I was trying to combine cases that can't be combined, given the lrs->firstChunk special case, mixed with some off-by-one bugs in earlier drafts. /be
Attachment #156452 - Attachment is obsolete: true
Attachment #156452 - Flags: review?(shaver)
If m == 0, n != 0, and top != v, we are forgetting a root that's not on top of the stack, so we have to search downward (top != v). But (m == 0) we can't start in the top chunk -- we must go down one. Hence the lrc2 = (m == 0) ? lrc->down : lrc statement before the while (--i > mark) loop. I tightened up a few unsigned comparisons to use > instead of !=, which is safe in these instances because predecessor code checks and excludes the <= case. /be
Attachment #156454 - Attachment is obsolete: true
These patches all contain some unrelated cleanup, btw: JSStackFrame.objAtomMap got in a while ago, and its #if 0'd code was removed or fixed for bug 169559, but for some reason I forgot to yank the struct member, its initialization to NULL, and its GC code. Bleh. Glad to be rid of it, finally. /be
1. Move the empty scope check in js_ForgetLocalRoot (called directly from the JS_ForgetLocalRoot API) out of the 'if (top != v)' non-topmost-value-being-popped hard case, so an API caller can't possibly trick the code into popping a scope mark(!). 2. Combine the 'lrc2 = (m == 0) ? lrc->down : lrc' statement before the loop with the 'if (j == 0) lrc2 = lrc2->down' in the loop body, for tighter, prettier code. Diff the last patch against this one to see this more clearly. /be
Attachment #156459 - Attachment is obsolete: true
Ok, this is testing well for me now, never mind mental execution. Any more easter egg bugs can come out in the wash. It won't be used by any code till e4x lands, so it's not gonna bust anyone. /be
Attachment #156463 - Attachment is obsolete: true
Blocks: e4x
Comment on attachment 156469 [details] [diff] [review] one-liner: fix js_EnterLocalRootScope to set cx->localRootStack (duh!) OK, I finally ran through it all in my head. Sorry for the overnight delay, but I was too tired to do a good job with it last night when I got back. r=shaver
Attachment #156469 - Flags: review+
It's unusual for the text of a JSMSG to end with a period. Also, the NB in the doc for JS_ForgetLocalRoot is a little misleading. If you forget on an old root, your recently allocated GC-things also become expensive to find due to the swap before pop.
nallen: thanks, I don't know where that period came from, honest! I purged a few that crept into js.msg a while ago. Looks like I missed the one at the end of JSMSG_BAD_INDIRECT_CALL's string too. Yeah, swapping instead of sliding down would preserve the strictly greater efficiency of forgetting more recent local roots, but I'm not worried -- there ought not be too many for O(n^2) problems to show up, or someone is failing to root as they go (build a tree top down). I'll change the comment to say more about this, although the subtleties are probably lost in jsapi.h /be
> Yeah, swapping instead of sliding down would preserve the strictly greater Er, "sliding down instead of swapping ...." Checked in. /be
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Flags: testcase-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: