Closed Bug 5534 Opened 25 years ago Closed 25 years ago

BLOCK: Large libpref string causes dialogs to crash on Linux

Categories

(Core :: Preferences: Backend, defect, P1)

x86
Linux
defect

Tracking

VERIFIED FIXED

People

(Reporter: akkzilla, Assigned: akkzilla)

Details

Bring up an editor window on Linux or Mac. Make a selection (collapsed or not) and click on the link button on the toolbar. Crashes: Simon says on Mac it's in nsHTMLEditor::GetSelectedElement. Debugging on linux seems to have been horked by something that happened over the weekend; here's my linux gdb stack trace but I don't trust it (maybe Simon can add a Mac stack trace with line numbers).
#0 0x4001ac24 in nsWebShellWindow::HandleEvent (aEvent=0xbffff4e8) at /builds/mon/mozilla/xpfe/appshell/src/nsWebShellWindow.cpp:413
#1 0x400af82a in nsWidget::DispatchEvent (this=0x81f2d78, event=0xbffff4e8, aStatus=@0xbffff4c8) at /builds/mon/mozilla/widget/src/gtk/nsWidget.cpp:957
#2 0x400af734 in nsWidget::DispatchWindowEvent (this=0x81f2d78, event=0xbffff4e8) at /builds/mon/mozilla/widget/src/gtk/nsWidget.cpp:919
#3 0x400ae8eb in nsWidget::OnResize (this=0x81f2d78, aRect=@0x202cf0) at /builds/mon/mozilla/widget/src/gtk/nsWidget.cpp:318
#4 0x400acae7 in idle_resize_cb (data=0x80ae9d8) at /builds/mon/mozilla/widget/src/gtk/nsGtkEventHandler.cpp:288
#5 0x40ac0628 in g_idle_dispatch ()
#6 0x202cf0 in ?? ()
#7 0x3 in ?? ()
Priority: P3 → P1
Target Milestone: M5
If we can't get insert link ready for M5, we should turn it off so M5 users don't hit the crash.
Looks like the iterator in nsHTMLEditor::GetSelectedElement is never advanced, so we go into an infinite loop. Charlie?
I checked in this change to unblock the iterator. Things are better now.
Index: nsHTMLEditor.cpp
===================================================================
RCS file: /cvsroot/mozilla/editor/base/nsHTMLEditor.cpp,v
retrieving revision 1.27
retrieving revision 1.28
diff -r1.27 -r1.28
1087a1088,1089
>
> iter->Next();
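The hang described above is the usual failure of an iterator-driven loop whose body never advances the iterator. A minimal sketch in C, assuming a hypothetical NodeIter type and helper functions (none of these names come from the Mozilla tree), shows where the advance has to go:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a content iterator: walks an array of node ids. */
typedef struct {
    const int *nodes;
    size_t count;
    size_t pos;
} NodeIter;

static int iter_is_done(const NodeIter *it) { return it->pos >= it->count; }
static int iter_current(const NodeIter *it) { return it->nodes[it->pos]; }
static void iter_next(NodeIter *it) { it->pos++; }

/* Returns the index of the first node equal to wanted, or -1 if none matches. */
int find_node(const int *nodes, size_t count, int wanted)
{
    NodeIter it = { nodes, count, 0 };
    assert(nodes != NULL || count == 0);
    while (!iter_is_done(&it)) {
        if (iter_current(&it) == wanted)
            return (int)it.pos;
        iter_next(&it); /* without this call the loop never terminates */
    }
    return -1;
}
```

Removing the iter_next call reproduces the symptom: the loop condition keeps testing the same element forever whenever the first element does not match.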
I picked up Simon's change, but I still crash on Linux. (Might cure the hang and blank dialog, though, if I can get past the crash.)
Assignee: cmanske → danm
Summary: Insert Link dialog crashes → Any dialog crashes
This doesn't seem to be just the Link dialog. The Find dialog is also crashing. Akkana can supply some possible stack info (if the Linux debugger is working!) Simon Fraser says dialogs work on Mac.
Summary: Any dialog crashes → Any dialog crashes or hangs.
Also the prefs dialog (which I previously saw working on mcafee's machine last week). Basically, all dialogs on Linux seem to be hosed. Sometimes it crashes, sometimes it hangs. Sometimes it brings up a blank window before it crashes or hangs, sometimes not. When it crashes, stack traces on the core file are useless (corrupted stack, looks like), but if you run it in gdb and it crashes, the stack trace looks like the one listed here, nsWebShellWindow::HandleEvent.
Adding Rod and Pavlov to cc list -- rumour has it they're looking at linux dialogs for M5.
Assignee: danm → syd
Modeless dialogs work fine on Linux; all modal dialogs are quite broken. I'd say this is because the implementation of Linux modality, rumoured to be completed, isn't. Windows are set as modal in what must be proper gtk fashion, but the difficult part, where execution of the caller is actually halted by messing with the event loop, was never touched. I'm the modal windows guy, true, but I remain quite lost with this part. In fact, I'm going to be rude and dump this bug on the head of the kind fellow who was suckered into doing modal dialogs on Linux and in fact implemented the first part of the task. My apologies for doing this at 6:00 PM on the last day before milestone freeze. I claim some immunity from jerkhood by virtue of the circumstance that I just received this bug rather recently, myself. I apologize for not testing Linux modal dialogs earlier.
I'll see if I can determine if the connection to modality has any merit.
Syd: Kin and I were just looking at this, and we found that there were problems loading shared libraries (the wallet library?) We also found that in the particular case of the editor dialogs, there was another problem first involving the dialogs needing to be in a default/ directory -- until this is checked in (unfortunately we both have to leave) you can add a "default" at the end of the path in editor/ui/dialogs/content/Makefile.in. Here are some breakpoints Kin used which might be useful:
1 breakpoint keep n 0x40ed4854 in ToolkitCoreShowModalDialog(JSContext *, JSObject *, unsigned int, long *, long *) at nsJSToolkitCore.cpp:277
4 breakpoint keep n 0x40019e1c in nsAppShellService::CreateDialogWindow(nsIWebShellWindow *, nsIURL *, nsString &, nsIWebShellWindow *&, nsIStreamObserver *, nsIXULWindowCallbacks *, int, int) at nsAppShellService.cpp:295
6 breakpoint keep n 0x4027fce8 in nsNetlibService::OpenStream(nsIURL *, nsIStreamListener *) at nsNetService.cpp:359
9 breakpoint keep y 0x40558c61 in nsTextControlFrame::PostCreateWidget(nsIPresContext *, int &, int &) at nsTextControlFrame.cpp:508
10 breakpoint keep n 0x408f9677 in pr_UnlockedFindLibrary at prlink.c:390
Checked in the makefile fix. Now the crash I see is:
#0 0x40dc883b in InMemoryDataSource::Assert (this=0x8235248, source=0x8207510, property=0x8123328, target=0x8122d98, tv=1) at /builds/tue/mozilla/rdf/base/src/nsInMemoryDataSource.cpp:1003
#1 0x40dd0d45 in RDFContainerUtilsImpl::MakeContainer (this=0x807f870, aDataSource=0x8235248, aResource=0x8207510, aType=0x8122d98, aResult=0x0) at /builds/tue/mozilla/rdf/base/src/nsRDFContainerUtils.cpp:342
#2 0x40dd08a3 in RDFContainerUtilsImpl::MakeSeq (this=0x807f870, aDataSource=0x8235248, aResource=0x8207510, _retval=0x0) at /builds/tue/mozilla/rdf/base/src/nsRDFContainerUtils.cpp:264
#3 0x40e2c1b4 in XULContentSinkImpl::OpenTag (this=0x8203190, aNode=@0xbfffefc0) at /builds/tue/mozilla/rdf/datasource/src/nsXULContentSink.cpp:1331
#4 0x40e2967e in XULContentSinkImpl::OpenContainer (this=0x8203190, aNode=@0xbfffefc0) at /builds/tue/mozilla/rdf/datasource/src/nsXULContentSink.cpp:574
#5 0x406ef566 in CWellFormedDTD::HandleToken (this=0x8221838, aToken=0x80b06e0, aParser=0x82031e8) at /builds/tue/mozilla/htmlparser/src/nsWellFormedDTD.cpp:504
#6 0x406ef004 in CWellFormedDTD::BuildModel (this=0x8221838, aParser=0x82031e8, aTokenizer=0x8221960, anObserver=0x0, aSink=0x8203190) at /builds/tue/mozilla/htmlparser/src/nsWellFormedDTD.cpp:256
#7 0x406e81cb in nsParser::BuildModel (this=0x82031e8) at /builds/tue/mozilla/htmlparser/src/nsParser.cpp:847
#8 0x406e80b4 in nsParser::ResumeParse (this=0x82031e8, aDefaultDTD=0x0) at /builds/tue/mozilla/htmlparser/src/nsParser.cpp:799
#9 0x406e771d in nsParser::EnableParser (this=0x82031e8, aState=1) at /builds/tue/mozilla/htmlparser/src/nsParser.cpp:540
#10 0x40e29f2a in XULContentSinkImpl::DoneLoadingStyle (aLoader=0x82585c0, aData=@0x82585e0, aRef=0x8223df0, aStatus=0) at /builds/tue/mozilla/rdf/datasource/src/nsXULContentSink.cpp:777
#11 0x40286ebf in nsUnicharStreamLoader::OnStopBinding (this=0x82585c0, aURL=0x8257eb0, aStatus=0, aMsg=0xbffff188) at /builds/tue/mozilla/network/module/nsNetStreamLoader.cpp:156
#12 0x402a98ca in nsDocumentBindInfo::OnStopBinding (this=0x82585f8, aURL=0x8257eb0, aStatus=0, aMsg=0xbffff188) at /builds/tue/mozilla/webshell/src/nsDocLoader.cpp:2276
#13 0x4028a0ff in stub_complete (stream=0x817f338) at /builds/tue/mozilla/network/module/nsStubContext.cpp:765
Well, I took a look and it appears that the editor is not being instantiated in the same way the preferences dialog is. The prefs dialog is behaving modally but the editor is not, and I crash after using the editor, but prefs seems OK. I'll have to meet with rods on this one.
Prefs work for you? I crash bringing up the prefs dialog too (run apprunner with no args, then Edit->prefs).
Yes. One thing to note, I am running a build from sometime yesterday. The dialog still has those grievous native button errors, and lots of runtime warnings from Gtk+ are being issued that indicate we are blowing it big (has nothing to do with modality, and to me could be a good reason for a crash), but it doesn't crash for me. Means nothing, could be our machines are different. But I'll pull a new tree and see if it gets worse for me.
Didn't crash with a new build either when going to prefs. But lots of Gtk-CRITICAL errors. We need to have someone look at these -- it indicates a flawed use of the toolkit, and we'll probably clean up some number of issues (perhaps this one) by doing so.
Update: my build from 4/28 pulled at about 9am brought up the link dialog once, but all subsequent attempts to bring up any dialog have crashed (removing the registry file doesn't help). Kin's build pulled about half an hour earlier can show dialogs repeatedly, doesn't crash. We still suspect registry/library loading problems.
I get the following when I try pulling up the prefs dialog after pulling new source. It seems to confirm the contention of registry problems made earlier. I have no clue about this area -- ideas on who to reassign this to? Any idea if the win/mac platforms have evidence of problems?
#0 0x407436d9 in nsScriptNameSetRegistry::InitializeClasses (this=0x80de860, aContext=0x8392338) at nsScriptNameSetRegistry.cpp:78
#1 0x40744393 in nsJSContext::InitializeExternalClasses (this=0x8392338) at nsJSEnvironment.cpp:202
#2 0x407444ad in nsJSContext::InitClasses (this=0x8392338) at nsJSEnvironment.cpp:247
#3 0x407442e1 in nsJSContext::InitContext (this=0x8392338, aGlobalObject=0x83922e4) at nsJSEnvironment.cpp:179
#4 0x40744aa7 in NS_CreateContext (aGlobal=0x83922e4, aContext=0x83a2644) at nsJSEnvironment.cpp:459
#5 0x402b17b3 in nsWebShell::CreateScriptEnvironment (this=0x83a2618) at nsWebShell.cpp:2196
#6 0x402b18b3 in nsWebShell::GetScriptGlobalObject (this=0x83a2618, aGlobal=0xbffff314) at nsWebShell.cpp:2224
#7 0x402a5aae in DocumentViewerImpl::Init (this=0x83914f0, aNativeParent=0x838e3e0, aDeviceContext=0x838e0e8, aPrefs=0x8055c90, aBounds=@0xbffff358, aScrolling=nsScrollPreference_kAuto) at nsDocumentViewer.cpp:313
#8 0x402ae482 in nsWebShell::Embed (this=0x83a2618, aContentViewer=0x83914f0, aCommand=0x838e600 "view", aExtraInfo=0x0) at nsWebShell.cpp:733
#9 0x402aa21d in nsDocumentBindInfo::OnStartBinding (this=0x838e5d0, aURL=0x838e610, aContentType=0x80c18a0 "text/html") at nsDocLoader.cpp:2032
#10 0x4028b5ab in NET_NGLayoutConverter (format_out=38, converter_obj=0x0, URL_s=0x8394b68, context=0x838ea00) at nsStubContext.cpp:942
#11 0x40263f85 in NET_StreamBuilder (format_out=38, URL_s=0x8394b68, context=0x838ea00) at mkstream.c:237
#12 0x401a773f in net_setup_file_stream (cur_entry=0x838eb48) at mkfile.c:783
#13 0x401a8511 in net_ProcessFile (cur_entry=0x838eb48) at mkfile.c:1319
#14 0x4025af17 in NET_ProcessNet (ready_fd=0x0, fd_type=1) at mkgeturl.c:3355
#15 0x40262df9 in NET_PollSockets () at mkselect.c:298
#16 0x40284a72 in nsNetlibService::NetPollSocketsCallback (aTimer=0x833e8c8, aClosure=0x80f3220) at nsNetService.cpp:1263
#17 0x40171de9 in TimerImpl::FireTimeout (this=0x833e8c8) at nsTimer.cpp:73
#18 0x401722d2 in nsTimerExpired (aCallData=0x833e8c8) at nsTimer.cpp:189
#19 0x40a877f0 in g_timeout_dispatch (source_data=0x8378958, current_time=0xbffff89c, user_data=0x833e8c8) at gmain.c:1147
#20 0x40a86ae3 in g_main_dispatch (current_time=0xbffff89c) at gmain.c:647
#21 0x40a8706f in g_main_iterate (block=1, dispatch=1) at gmain.c:854
#22 0x40a871f1 in g_main_run (loop=0x80a8158) at gmain.c:912
#23 0x409be277 in gtk_main () at gtkmain.c:475
#24 0x400a8b38 in nsAppShell::Run (this=0x80936c0) at nsAppShell.cpp:202
Kin and I spent some time comparing his build (which works) and mine (which doesn't work), and I've been shuffling libraries around between the two builds to determine where the problem is. Point 1: libpref.so is the culprit: if I use his libpref.so, dialogs work, if I use mine, they crash. Point 2: My build was --enable-mailnews, Kin's wasn't. (I'm currently building another tree without mailnews.)
Summary: Any dialog crashes or hangs. → mailnews build: libpref makes any dialog crash the app
Sure enough, my new non-mailnews build can bring up dialogs without crashing. Changing summary and adding mcmullen to cc list (John, are you the right person for libpref bugs?)
Assignee: syd → mcmullen
Summary: mailnews build: libpref makes any dialog crash the app → BLOCK: mailnews build: libpref makes any dialog crash the app
In modules/libpref/src/Makefile.in, there are two ifdefs for MOZ_MAIL_NEWS. If I change those to MOZ_MAIL_NEWS_NOT in a mailnews build, then dialogs work again. Probably this will screw up mail/news, though.
Summary: BLOCK: mailnews build: libpref makes any dialog crash the app → BLOCK: Large libpref string causes dialogs to crash on Linux
So we figured out what the problem is. It's not the actual mail news preferences that are causing the corruption, it's the size of the string that is created when the mail news preferences are added to the list of preferences. When building libpref.so on unix, the makefiles run all the .js preference files through sed to generate jsbuffer.h which looks something like this:
static char *pref_init_buffer =
"pref(foo1)\n"
"pref(foo2)\n"
"pref(foo3)\n"
...
;
When compiled, all these pref strings get concatenated into one large string. This seems to be fine if an application statically links with libpref.a, but trashes the call stack if we dynamically link against libpref.so, and then try to load libpref.so when running the app. Some Unix platforms like AIX, and maybe even Linux, have a limit on the frame size in the call stack. What might be happening here is that the system library load functions might be trying to load some of the .so data on the stack, in this case the large string, and hitting this limit.
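The concatenation mentioned above is standard C behaviour: adjacent string literals are merged into a single literal at compile time, so the generated header yields one string whose length is the sum of all the pref lines. A small sketch, using the placeholder pref lines from the comment and a hypothetical helper name (single_buffer_length), makes the size visible:

```c
#include <assert.h>
#include <string.h>

/* Same shape as the generated jsbuffer.h described above; the pref lines are
 * placeholders, not real preferences. */
static const char *single_buffer =
    "pref(foo1)\n"
    "pref(foo2)\n"
    "pref(foo3)\n"
    ;

/* Length of the one merged literal the compiler emits for single_buffer. */
size_t single_buffer_length(void)
{
    assert(single_buffer != NULL);
    return strlen(single_buffer);
}
```

With three 11-character lines the merged literal is 33 characters long; with every preference file of a mailnews build appended, the same construct produces one very large literal.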
Kin meant to add in the previous comment: the likely fix for this will be to change the script which generates jsbuffer.h to make an array of char* rather than a single char* (i.e. put commas at the end of each line and a 0 at the end) and change the routine which writes it out to loop over each string in the array.
Here is a patch to implement these changes. It seems to work on my Linux box. John (or someone), could you look this over and see if it looks reasonable? Two files changed under modules/libpref/src:
Index: Makefile.in
===================================================================
RCS file: /cvsroot/mozilla/modules/libpref/src/Makefile.in,v
retrieving revision 1.11
diff -r1.11 Makefile.in
111,113c111,113
< echo "static char* pref_init_buffer = " >> $@; \
< cat $(CONFIG_FILES) | sed 's/\\/\\\\/g' | sed 's/\\r/\\n/' | sed 's/\"/\\\"/g' | sed 's/^M//g' | sed 's/^/"/' | sed 's/$$/\\n"/' >> $@; \
< echo \; >> $@; \
---
> echo "static char* pref_init_buffer[] = {" >> $@; \
> cat $(CONFIG_FILES) | sed 's/\\/\\\\/g' | sed 's/\\r/\\n/' | sed 's/\"/\\\"/g' | sed 's/^M//g' | sed 's/^/"/' | sed 's/$$/\\n",/' >> $@; \
> echo "0 };" >> $@; \
Index: unix/unixpref.c
===================================================================
RCS file: /cvsroot/mozilla/modules/libpref/src/unix/unixpref.c,v
retrieving revision 3.11
diff -r3.11 unixpref.c
47a48
> int i;
51c52,56
< status = PREF_EvaluateJSBuffer(pref_init_buffer, strlen(pref_init_buffer));
---
> /* loop over all the strings in the init buffer */
> for (i=0; i < (sizeof pref_init_buffer / sizeof *pref_init_buffer) && pref_init_buffer[i] != 0; ++i)
> {
> status = PREF_EvaluateJSBuffer(pref_init_buffer[i], strlen(pref_init_buffer[i]));
> }
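The array-plus-sentinel approach in the patch above can be tried in isolation. The sketch below is not the libpref code: evaluate_chunk is a hypothetical stand-in for PREF_EvaluateJSBuffer, the pref lines are placeholders, and the loop relies only on the trailing 0 entry rather than also bounding by sizeof as the patch does. It also stops at the first failing chunk, which is an assumption of this sketch; the patch keeps whatever status the last call returned.

```c
#include <assert.h>
#include <string.h>

/* Placeholder pref lines in the array form the patched Makefile generates:
 * one string per line, terminated by a 0 entry. */
static const char *init_buffer[] = {
    "pref(foo1)\n",
    "pref(foo2)\n",
    "pref(foo3)\n",
    0
};

/* Hypothetical stand-in for PREF_EvaluateJSBuffer: accepts any non-empty
 * chunk that ends in a newline and returns 1, otherwise returns 0. */
static int evaluate_chunk(const char *js, size_t length)
{
    assert(js != NULL);
    return length > 0 && js[length - 1] == '\n';
}

/* Evaluates chunks until the 0 sentinel. Returns the number of chunks
 * evaluated, or -1 as soon as one chunk fails. */
int evaluate_all(const char *const *chunks)
{
    int evaluated = 0;
    assert(chunks != NULL);
    for (; *chunks != 0; ++chunks) {
        if (!evaluate_chunk(*chunks, strlen(*chunks)))
            return -1;
        ++evaluated;
    }
    return evaluated;
}
```

Calling evaluate_all(init_buffer) walks the three placeholder lines one at a time, so no single string ever has to hold the whole preference set.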
Status: NEW → ASSIGNED
Accepting bug. Looking at akkana's patch.
Assignee: mcmullen → akkana
Status: ASSIGNED → NEW
OK, I understand the change, it looks reasonable, and if you have tested it on a couple of Unices, I won't stand in the way. Assigning to akkana, since she has the fix.
Component: Editor → libPref
Changing the component to libpref. cc-ing myself (since it's not assigned to me now).
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Fixed for M5.
Akkana, can you verify this one? thanks!
Sujay -- did you never see the crash? I would have expected that QA builds would be crashing on bringing up a dialog, just like debug builds. (The fix won't be in until the next set of verification builds, though.)
just tried it out...I can't see the problem as described...
Status: RESOLVED → VERIFIED
verified in 5/6 build..I don't see that problem anymore on Mac and Linux.
Moving all libPref component bugs to new Preferences: Backend component. libPref component will be deleted.
Component: libPref → Preferences: Backend