Closed
Bug 518678
Opened 15 years ago
Closed 15 years ago
archive crash [@ nsMsgLocalMailFolder::EndMessage(unsigned int)]
Categories
(MailNews Core :: Backend, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
Thunderbird 3.0rc1
People
(Reporter: wsmwk, Assigned: Bienvenu)
References
(Blocks 1 open bug, )
Details
(Keywords: crash, fixed-seamonkey2.0.1, Whiteboard: [GS])
Crash Data
Attachments
(4 files)
(deleted),
patch
|
davida
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
standard8
:
review+
standard8
:
superreview+
standard8
:
approval-thunderbird3+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
rkent
:
review+
standard8
:
superreview+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
rkent
:
review+
|
Details | Diff | Splinter Review |
archive crash [@ nsMsgLocalMailFolder::EndMessage(unsigned int)]
10 crashes but likely a single person. how do we know? because they ALL have comments (reporter must be "the good samaratin of crashes", aka TGSC)
bp-21fc1049-34f4-4849-a1b0-dbda72090921
0 thunderbird-bin nsMsgLocalMailFolder::EndMessage nsAutoPtr.h:968
1 thunderbird-bin nsCopyMessageStreamListener::EndMessage nsCopyMessageStreamListener.cpp:117
2 thunderbird-bin nsImapMailFolder::EndMessage nsImapMailFolder.cpp:5161
3 libxpcom_core.dylib NS_InvokeByIndex_P xpcom/reflect/xptcall/src/md/unix/xptcinvoke_unixish_x86.cpp:179
4 libxpcom_core.dylib nsProxyObjectCallInfo::Run xpcom/proxy/src/nsProxyEvent.cpp:181
5 libxpcom_core.dylib nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:510
6 libxpcom_core.dylib NS_ProcessPendingEvents_P nsThreadUtils.cpp:180
7 thunderbird-bin nsBaseAppShell::NativeEventCallback widget/src/xpwidgets/nsBaseAppShell.cpp:121
8 thunderbird-bin nsAppShell::ProcessGeckoEvents widget/src/cocoa/nsAppShell.mm:405
9 CoreFoundation CFRunLoopRunSpecific
10 CoreFoundation CFRunLoopRunInMode
11 HIToolbox RunCurrentEventLoopInMode
12 HIToolbox ReceiveNextEventCommon
13 HIToolbox BlockUntilNextEventMatchingListInMode
14 AppKit _DPSNextEvent
15 AppKit -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:]
16 thunderbird-bin nsAppShell::ProcessNextNativeEvent widget/src/cocoa/nsAppShell.mm:649
17 thunderbird-bin nsBaseAppShell::DoProcessNextNativeEvent widget/src/xpwidgets/nsBaseAppShell.cpp:151
18 thunderbird-bin nsBaseAppShell::OnProcessNextEvent widget/src/xpwidgets/nsBaseAppShell.cpp:278
Updated•15 years ago
|
Component: General → Backend
Product: Thunderbird → MailNews Core
QA Contact: general → backend
Version: 3.0 → 1.9.1 Branch
Reporter | ||
Comment 1•15 years ago
|
||
Mac-only still, 3.0b2, 3.0b3 and 3.0b4
interesting comments are
http://crash-stats.mozilla.com/report/index/361121d7-d56f-4eec-b0b2-d4aec2090924
selected all messages in a IMAP folder containing about 1400 and hit 'A' to archive all items. It seems to have worked for a while and some time later a modal message said that "another task is running on this folder, wait till it's done" or something like that. When I clicked OK, TB crashed.
http://crash-stats.mozilla.com/report/index/899fc5db-bdac-49bc-aa68-5f30f2090923
switch to a folder (and it is indexing, downloading from the imap server, etc) and I select all and archive right away before it is done doing all of its updates.
Comment 2•15 years ago
|
||
I'm not the original good Samaratin but I had this problem too. I'm also on a Mac (Leopard) using 3.0b4.
http://crash-stats.mozilla.com/report/index/fc22a0eb-d2d9-4618-94d1-0d0c42091010
1. Tried out the new Archive button on my imap account and noticed that it created a new Archive folder on the imap server.
2. Changed the setting in Thunderbird to put the archive in Local Folders.
3. Switch to a local folder containing lots of old mail going back to 2004.
4. Select all and press Archive.
5. Got an alert dialog saying "Operation Failed because another operation is using the folder. Please wait for that operation to finish and then try again".
6. After dismissing the dialog I got another one just like it. I didn't count, but I think I probably got 5 dialogs in all from the original Archive operation. No crashes though, and some of the messages were archived.
7. Tried this again a few times with lots of failures but no crashes. Eventually it crashed but I'm not sure what was different that time. Maybe because by then I had switched to archiving messages in my IMAP inbox (I started out archiving a local folder).
The dialogs only happened when I tried archiving a lot of messages at once. Everything was ok with just a screenful of messages. I think this probably happens when the selection contains messages from multiple years, since in that case not all the messages are being copied to the same archive folder.
Comment 3•15 years ago
|
||
I just tried reproducing the problem by selecting messages from my imap inbox from 2008 and 2009 and it worked just fine. No crashes, no dialogs. Apparently you need to archive a lot of messages at once.
Reporter | ||
Comment 5•15 years ago
|
||
#28 crasher for 3.0b4, but my guess would be it will not be that low when we actually release 3.0
Comment 6•15 years ago
|
||
With the current numbers, leaving as wanted-. bienvenu will try and look at anyway. If we get worse numbers for RC1, renomination could be worthwhile.
Flags: wanted-thunderbird3+
Flags: blocking-thunderbird3?
Flags: blocking-thunderbird3-
Comment 7•15 years ago
|
||
I'm confident that bienvenu will be able to reproduce this and if he can't, I have a backup of the mail folder that caused the problem.
Comment 8•15 years ago
|
||
I'm seeing a crash that looks a lot like this. Select about 4-5k messages and hit 'a' Spinning Pizza for a bit, then the Apple crash report dialog. I don't seem to be generating Mozilla crash reports, though.
Totally reproducible, so how can I generate useful info here?
Reporter | ||
Comment 9•15 years ago
|
||
(In reply to comment #8)
> Totally reproducible, so how can I generate useful info here?
i think in your case we just wait for a patch to test
Assignee | ||
Comment 10•15 years ago
|
||
ah, archiving from an imap folder to a local folder - interesting. I did a large archive and I noticed that the batching code was behaving unexpectedly, in that it did multiple copies to my 2009 folder, instead of just one. That could partly account for the problem, though the copy service is really supposed to prevent contention over the dest folder.
Assignee | ||
Comment 11•15 years ago
|
||
There are two things going on here:
1. archive still does batches on a per-month basis even if the archive granularity is per-year (that's my fault). This means we'll generate up to 12 simultaneous copies to the same folder.
2. the copy service is supposed to serialize these requests but it isn't quite working in this case. It may be that there's something about imap to local copies, or it may be something more fundamental.
I'm going to fix #1, and then try to figure out #2. Marking blocking for now.
Assignee: nobody → bienvenu
Status: NEW → ASSIGNED
Flags: blocking-thunderbird3- → blocking-thunderbird3+
Target Milestone: --- → Thunderbird 3.0rc1
Comment 12•15 years ago
|
||
Just as a reminder, the unit tests that were written for some of these move cases also failed intermittently on the Mac (bug 502928). It is possible that the root cause of both these test failures, and your thing 2. from comment 11 are the same.
Assignee | ||
Comment 13•15 years ago
|
||
thx, could be related. I'm on Windows, but of course, there's really nothing platform-specific about the move/copy code.
Assignee | ||
Comment 14•15 years ago
|
||
this makes the batching match the archive granularity, which makes this crash much harder to reproduce. I figure out the archiveFolderUri and granularity in the batch coalescing code, and store it in the batch. And I changed the batch key to know whether the month should be used to determine batches.
There's still a little bit of duplication of code figuring out the identity, but we only run the dup code when the root archive folder doesn't exist.
Attachment #407156 -
Flags: review?(david.ascher)
Assignee | ||
Comment 15•15 years ago
|
||
this fixes the lowest level of the crash.
What's left is figuring out how the copy service is getting confused...
Attachment #407158 -
Flags: superreview?(bugzilla)
Attachment #407158 -
Flags: review?(bugzilla)
Updated•15 years ago
|
Attachment #407156 -
Flags: review?(david.ascher) → review+
Assignee | ||
Comment 16•15 years ago
|
||
I've checked in the first patch, which suffices to take this off the blocker list.
Assignee | ||
Updated•15 years ago
|
Attachment #407156 -
Attachment description: fix archive batching → fix archive batching - checked in.
Assignee | ||
Updated•15 years ago
|
Flags: blocking-thunderbird3+
Updated•15 years ago
|
Attachment #407158 -
Flags: superreview?(bugzilla)
Attachment #407158 -
Flags: superreview+
Attachment #407158 -
Flags: review?(bugzilla)
Attachment #407158 -
Flags: review+
Attachment #407158 -
Flags: approval-thunderbird3+
Assignee | ||
Updated•15 years ago
|
Attachment #407158 -
Attachment description: fix crash → fix crash - checked in.
Assignee | ||
Comment 17•15 years ago
|
||
The second patch was checked in as well.
If you disable the first patch, in order to make the copy service again chain urls that do a copy to the same target folder, the STR to reproduce this bug are as follows:
1. Setup an imap account to archive to a local folder, w/ the granularity left as the default (monthly)
2. Select 3 imap messages from different months but the same year.
3. Press archive.
Thx to asuth for listening to my addled description of the weird things I was seeing in the copy service and saying several true things that allowed me to make sense of what I was seeing. What's happening is that when the first move finishes, nsMsgCopyService::NotifyCompletion calls ClearRequest - this calls OnStopCopy on the listeners, which provokes the archive batching code into calling CopyMessages again, which adds a new request to the request queue. But nsMsgCopyService::NotifyCompletion is looping through the request queue looking for requests with a particular source and destination, and the batching code has unwittingly added a new one that matches, so it gets removed right away. But nsMsgLocalMailFolder::OnCopyCompleted doesn't get called for this second request, so it leaves the folder in an unhappy state, i.e., with a copy state object. The third request comes along, and the folder complains about the existing copy state...
So it turns out to be a variation on the classic issue of iterating over a list that has changed out from under us. I'm not sure why we think the second request is done before it has even started, and fixing that might fix the problem.
Assignee | ||
Comment 18•15 years ago
|
||
I'm going to try to come up with a test case that breaks w/o this patch
Assignee | ||
Comment 19•15 years ago
|
||
Attachment #407870 -
Flags: review?(kent)
Assignee | ||
Comment 20•15 years ago
|
||
Comment on attachment 407825 [details] [diff] [review]
fix for copy chaining in copycompleted notification
Kent knows this code better than anyone else, so I'm asking him for a review.
The change to where we increment sourceIndex is obviously not really a change...I can back that part out if you like.
Attachment #407825 -
Flags: superreview?(bugzilla)
Attachment #407825 -
Flags: review?(kent)
Updated•15 years ago
|
Attachment #407825 -
Flags: review?(kent) → review+
Comment 21•15 years ago
|
||
Comment on attachment 407825 [details] [diff] [review]
fix for copy chaining in copycompleted notification
"The change to where we increment sourceIndex is obviously not really a
change...I can back that part out if you like."
That is my only complaint, I don't see the reason to remove a standard loop idiom like you did, so please restore the original.
Overall this has become a very fragile piece of code over time, and you are just patching over the fragility. But this is not the time for a major refactoring.
Comment 22•15 years ago
|
||
Comment on attachment 407870 [details] [diff] [review]
unit test that asserts w/o this fix
>+load("../../mailnews/resources/asyncTestUtils.js");
IFAICT you are not using this, so it could be removed.
>+var gMessageGenerator;
>+var gScenarioFactory;
These are only used locally in run_test, so should not be globals.
r=me with these nits fixed.
Attachment #407870 -
Flags: review?(kent) → review+
Updated•15 years ago
|
Attachment #407825 -
Flags: superreview?(bugzilla) → superreview+
Comment 23•15 years ago
|
||
Is the theory still that this bug is triggered when archiving messages from multiple years? I just got several crashes when trying to archive messages only from the same year.
I'm trying to archive my itunes receipts which are dated 2008 and 2009. In the search bar i chose 'from filter' and type "itunes store' which displays all my receipts in the message list pane. To archive a mail i select 1 or more messages and hit the archive button in the message pane. I'm not using a separate search window.
At first it was only crashing when i selected messages from 2008. Archiving mails from 2009 failed too, but i only got an error dialog with no crash.
"an operation failed because another operation is using the folder"
Now that I'm done archiving the 2008 messages, archiving 2009 messages causes crashes too.
Still using 3.0b4
4b33ac82-4e53-4861-8c01-ff7172091024 10/24/09 8:46 PM
02fb4e29-db4e-4975-adaa-0144a2091024 10/24/09 8:39 PM
bded6a07-eaeb-4236-8d7b-094742091024 10/24/09 7:47 PM
7734f1a2-fd13-4921-b999-c79822091024 10/24/09 7:43 PM
fe2bc0af-e4f7-4c42-b0be-28fd12091024 10/24/09 7:42 PM
Assignee | ||
Comment 24•15 years ago
|
||
No, the theory was always archiving messages from different months in the same year caused crashes, if you had an archive granularity of years.
Assignee | ||
Comment 25•15 years ago
|
||
Dawn, I highly recommend trying a 3.0 pre build which should fix these crashes - http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-1.9.1/
Comment 26•15 years ago
|
||
Ok thanks, I misunderstood. I just installed the latest build and archiving works great. I love that feature!
Updated•15 years ago
|
Whiteboard: [final two patches need checkin on trunk]
Assignee | ||
Comment 27•15 years ago
|
||
the last two patches have been landed on the trunk. They are additional bullet-proofing, w/ a unit test. Marking fixed.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Whiteboard: [final two patches need checkin on trunk]
Updated•15 years ago
|
Keywords: fixed-seamonkey2.0.1
Reporter | ||
Comment 28•14 years ago
|
||
v.fixed per comment 26 and signature gone from crash stats *after* 3.0b4
Status: RESOLVED → VERIFIED
Updated•14 years ago
|
Crash Signature: [@ nsMsgLocalMailFolder::EndMessage(unsigned int)]
Updated•12 years ago
|
Whiteboard: [GS]
Comment 29•12 years ago
|
||
This bug is coming back for at least one user of TB13 :-(
See https://getsatisfaction.com/mozilla_messaging/tags/bug_518678
You need to log in
before you can comment on or make changes to this bug.
Description
•