Closed Bug 5502 Opened 25 years ago Closed 25 years ago

When Messenger is reopened, an extraneous char gets inserted into a header with non-ASCII chars

Categories

(MailNews Core :: Internationalization, defect, P3)

x86
Windows NT
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: marina, Assigned: davidmc)

Details

Steps to reproduce:
- Send a message whose header consists of a long string with non-ASCII chars.
- Get this message. //note: in the thread pane the message header looks fine.
- Close Messenger.
- Quit Seamonkey.
- Reopen Messenger. //note: in every header in the thread pane there are extraneous chars (\, $, A, D, etc.) inserted; with every reopening you get one more extraneous char.
Assignee: nhotta → davidmc
Reassigning to davidmc@netscape.com. Similar problem was fixed by him.
That bug was: Bug 4925. This bug seems to be slightly different from the earlier one in that every time you delete the .msf file, the problem goes away on the first loading of an Inbox. Also, of the many examples of the original problem I have in my mail box, I see this problem in only 2 of them. Apparently, it needs to meet some extra conditions (other than just the headers being "long") for this problem to occur. I also didn't get more extraneous characters upon subsequent loading of mail boxes.
Hmm, the last comment about long content made me think of something I did wrong that might cause a problem. I recently checked in a way to continue long lines using backslash and linebreak; I think it can fail in conjunction with the $ hex encoding, in the sense that a sequence of backslash, linebreak, '$' is probably not read correctly, because I did not allow for the possibility that the first byte after backslash linebreak might need an escaped interpretation.
davidm seems to be correct: my 2 problem examples show that the line folding in the .msf file occurs in such a way that the backslash splits the $0D$0A sequence in half: ... $0D\ $0A ... There are other "$0D$0A" sequences in the .msf data, but these 2 examples are the only ones which are split between 2 lines. This explains my problems. marina's examples seem to be different from these and require further investigation.
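As an illustration of the split described above, here is a minimal C++ sketch of how a reader could go wrong on "$0D\ $0A". It is only a model built on an assumption taken from the earlier comment: that the first byte after a backslash-linebreak continuation is copied literally instead of being interpreted. It is not the actual pre-fix Mork code, and hexDigit, decodeLiteralAfterContinuation and demoSplitEscape are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if ch is not hex.
static int hexDigit(char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

// ASSUMED model of the faulty reader: after "\<linebreak>" the next byte is
// taken literally, so a '$' that starts a hex escape is not decoded.
std::string decodeLiteralAfterContinuation(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size() && in[i] != ')') {
        char c = in[i++];
        if (c == '\\') {                       // next char is escaped by '\'
            if (i >= in.size()) break;
            c = in[i++];
            if (c == '\n' || c == '\r') {      // line continuation
                // eat the second half of a CRLF or LFCR pair, if present
                if (i < in.size() && (in[i] == '\n' || in[i] == '\r') && in[i] != c) ++i;
                if (i >= in.size()) break;
                c = in[i++];                   // modelled bug: no reinterpretation
            }
        } else if (c == '$') {                 // "$xx" hex escape
            if (i + 1 >= in.size()) break;
            int hi = hexDigit(in[i]);
            int lo = hexDigit(in[i + 1]);
            if (hi < 0 || lo < 0) break;       // sketch choice: stop on bad hex
            c = static_cast<char>(hi * 16 + lo);
            i += 2;
        }
        out.push_back(c);
    }
    return out;
}

// The unsplit form decodes cleanly; the split form leaks "$0A" as text.
void demoSplitEscape() {
    assert(decodeLiteralAfterContinuation("ab$0D$0Acd") == "ab\r\ncd");
    assert(decodeLiteralAfterContinuation("ab$0D\\\n$0Acd") == "ab\r$0Acd");
}
```

Under this model the stray characters are exactly the bytes of the undecoded escape ('$', '0', 'A'), which would be consistent with some of the characters reported in the thread pane, though that link is a guess.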
I'll soon fix how '$' is treated after backslash linebreak, but I need an open tree to check in, unless this stops M5, which seems very unlikely.
Target Milestone: M6
There is another type of extraneous character which gets inserted in the thread pane display. This is the musical note (eighth note), i.e. U+266A. When I looked into the .msf file, I noticed that this character gets inserted right in front of the $0D$0A sequence. You can induce this problem by sending out a message (via 4.5x) containing a long ASCII subject header. Here's one example: "This is a message containing a long subject header which must be folded because this line should easily exceed seventy-five characters or so that we limit for each line." Note: Omit the double-quotes and do not insert CR.
I'll work on this soon.
Momoi, the changed morkParser::ReadValue() method below will probably fix the bug. It replaces the incorrect version in the tree within file mailnews/db/mork/src/morkParser.cpp

morkBuf* morkParser::ReadValue(morkEnv* ev)
{
  morkBuf* outBuf = 0;

  morkCoil* coil = &mParser_ValueCoil;
  coil->ClearBufFill();

  morkSpool* spool = &mParser_ValueSpool;
  spool->Seek(ev, /*pos*/ 0);

  if ( ev->Good() )
  {
    morkStream* s = mParser_Stream;
    register int c;
    while ( (c = s->Getc(ev)) != EOF && c != ')' && ev->Good() )
    {
      if ( c == '\\' ) // next char is escaped by '\'?
      {
        if ( (c = s->Getc(ev)) == 0xA || c == 0xD ) // linebreak after \?
        {
          c = this->eat_line_break(ev, c);
          if ( c == ')' || c == '\\' || c == '$' )
          {
            s->Ungetc(c); // just let while loop test read this again
            continue; // goto next iteration of while loop
          }
        }
        if ( c == EOF || ev->Bad() )
          break; // end while loop
      }
      else if ( c == '$' ) // "$" escapes next two hex digits?
      {
        if ( (c = s->Getc(ev)) != EOF && ev->Good() )
        {
          mork_ch first = (mork_ch) c; // first hex digit
          if ( (c = s->Getc(ev)) != EOF && ev->Good() )
          {
            mork_ch second = (mork_ch) c; // second hex digit
            c = ev->HexToByte(first, second);
          }
          else
            break; // end while loop
        }
        else
          break; // end while loop
      }
      spool->Putc(ev, c);
    }

    if ( ev->Good() )
    {
      if ( c != EOF )
        spool->FlushSink(ev); // update coil->mBuf_Fill
      else
        this->UnexpectedEofError(ev);

      if ( ev->Good() )
        outBuf = coil;
    }
  }
  return outBuf;
}
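To try the corrected control flow in isolation, here is a minimal standalone sketch that mirrors the ReadValue() loop but reads from a std::string rather than a morkStream. The names hexDigit, decodeValue and demoFixedReader are hypothetical, and stopping on malformed hex is a choice of this sketch, not necessarily what ev->HexToByte() does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if ch is not hex.
static int hexDigit(char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

// Decodes one value up to an unescaped ')' or the end of the input.
std::string decodeValue(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size() && in[i] != ')') {
        char c = in[i++];
        if (c == '\\') {                       // next char is escaped by '\'
            if (i >= in.size()) break;
            c = in[i++];
            if (c == '\n' || c == '\r') {      // linebreak after '\'
                // eat the second half of a CRLF or LFCR pair, if present
                if (i < in.size() && (in[i] == '\n' || in[i] == '\r') && in[i] != c) ++i;
                if (i >= in.size()) break;
                if (in[i] == ')' || in[i] == '\\' || in[i] == '$')
                    continue;                  // let the loop interpret this char
                c = in[i++];                   // ordinary char after continuation
            }
        } else if (c == '$') {                 // "$xx" hex escape
            if (i + 1 >= in.size()) break;
            int hi = hexDigit(in[i]);
            int lo = hexDigit(in[i + 1]);
            if (hi < 0 || lo < 0) break;       // sketch choice: stop on bad hex
            c = static_cast<char>(hi * 16 + lo);
            i += 2;
        }
        out.push_back(c);
    }
    return out;
}

// The split and unsplit forms of "$0D$0A" now decode to the same bytes.
void demoFixedReader() {
    assert(decodeValue("ab$0D$0Acd)") == "ab\r\ncd");
    assert(decodeValue("ab$0D\\\n$0Acd)") == "ab\r\ncd");
    assert(decodeValue("a\\)b)") == "a)b");      // escaped ')' stays in the value
    assert(decodeValue("ab\\\n)") == "ab");      // continuation right before ')'
}
```

The key step is the `continue` after the line break: instead of storing the next character, the loop gets another chance to treat it as ')', '\' or '$'.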
This will get checked in the next time I sync with the tree in the next few days, if the tree is open. I'm not sure it fixes symptoms that momoi observed.
If this "has" to be fixed for m6, then you will need to check it in while the tree is closed. Otherwise, you could move it to m7. Does anyone think this has to be fixed for m6?
Target Milestone: M6 → M7
OK, I'll move it myself.
I can probably check this in when you want it, by moving aside the other changes in that file and reverting the content, and making only this fix. But the only testing it would have would be the absence of problems in the runs I've made while working on incremental writing. (I haven't checked in code for a while because I've had a lot of trouble getting and staying in sync with the tip. For example, the last time I wasted a bit more than two days trying to use a build script on Mac that was obsoleted without any announcement. :-) And today I'd need to get new build tools to sync again, and I still have unverified changes in my tree. So I'm a bit adrift of the tip these days.)
No news; this comment is intended to quiet Bugzilla auto-notices.
Will this get into M7? Or M8?
Fixed in my tree, per code shown above. Not checked in since I can't seem to get my incremental writing act together for M7, and I have not synced with the tree. I'd bet you $50 I can check in the fix without building it on any other platform, and not break the tree. But that's not how we do things (at least when folks pretend to follow the rules). I can check it in any time if I break the rules. If I play by the rules, I have no idea when it will get checked in, but it could not be more than several days plus as many days as the tree is closed for M7.
You'll probably win the bet :-) so I won't take you up on it. Since int'l QA opened this bug, I'll just let them comment on whether they "need" it for M7 or not. I was just doing a search on M7 MailNews bugs that are open. If this is not going to be checked in for M7, then the target milestone should be changed to M8. Thanks.
Target Milestone: M7 → M8
M8 is acceptable but we would like to see this done soon after m7 ships.
When I reopen my inbox with the 6/30/99 build I can still see this bug. With copy/move to another folder functioning now, Japanese headers get extraneous chars inserted after copy/move.
Target Milestone: M8 → M9
moving to m9 - should be fixed very soon.
Isn't this fixed by the recent Mork drop? If so, please mark fixed.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Ah, yes it should be fixed now, but I can't verify the report case. However, I gather that's why we verify bugs; so I'll mark this fixed.
Didn't we used to wrap long lines in the thread pane headers? With 7/16/99 Win32 M9 build, I see that the lines are not wrapped and end where the Subject width ends. There seems to be no way to expand the Subject header length, either. I don't see extraneous characters after 3-4 re-starts but I cannot be sure about the ones that don't wrap. The truncation could hide a potential problem. So the question is: Do we care about non-wrapping? Are you going to simply insert an ellipsis marker and be done with it? If so, I think we are done with this bug.
Actually, I would like to see what long lines look like when the Subject field width is widened.
Hmm, I can suggest a way to verify when the UI will not show problem lines. It involves looking at the Mork text files, which is slightly unpleasant when the Mork content is atomized so much that it is filled with unsightly hex. But when you are looking for specific strings and whether they have changed across sessions, you can ignore most of the harsh Mork syntax.

Let's say you have a long line, and some characters which get written using the $xx style hex notation. Mork will "continue" the long lines by using a backslash and a linebreak before continuing the content definition. This bug was caused by bad interaction between these two, where the backslash continuation was not coping with the hex notation after a linebreak.

There are two ways you can verify by looking at Mork .msf files. One way is to look at the same potential problem string across sessions in which the .msf file gets changed and rewritten. You want to see that the problem string does not mutate over time across sessions, but stays about the same.

The second way to verify is to deliberately explore the edge cases of \ and $xx to see if you can find a way to provoke a failure. You can do this by editing the Mork .msf file, moving where the \ backslash is used before and after $xx style encodings, to see if any combination gives a bad result. You should be able to insert "\linebreak" in between any two characters and have things work. But you can't put it inside a multibyte encoding, where it would split a single logical content character, so you should not split "\)" or any of the three bytes in a $xx sequence, etc.
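The second verification approach can also be sketched as a small C++ harness instead of hand-editing a .msf file: encode a string, insert a backslash-linebreak at every boundary between logical characters, and check that each variant decodes back to the original. This is a sketch under assumptions, not Mork code: hexDigit, decodeValue, encodeTokens, countBadSplits and demoSplitSweep are hypothetical names, the decoder is a simplified stand-in for the reader, and the encoder's rules (backslash before ')', '\' and '$'; $xx for control and non-ASCII bytes) are assumed for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: value of one hex digit, or -1 if ch is not hex.
static int hexDigit(char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

// Simplified stand-in for the reader: decodes up to an unescaped ')'.
std::string decodeValue(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size() && in[i] != ')') {
        char c = in[i++];
        if (c == '\\') {
            if (i >= in.size()) break;
            c = in[i++];
            if (c == '\n' || c == '\r') {      // line continuation
                if (i < in.size() && (in[i] == '\n' || in[i] == '\r') && in[i] != c) ++i;
                if (i >= in.size()) break;
                if (in[i] == ')' || in[i] == '\\' || in[i] == '$') continue;
                c = in[i++];
            }
        } else if (c == '$') {                 // "$xx" hex escape
            if (i + 1 >= in.size()) break;
            int hi = hexDigit(in[i]);
            int lo = hexDigit(in[i + 1]);
            if (hi < 0 || lo < 0) break;
            c = static_cast<char>(hi * 16 + lo);
            i += 2;
        }
        out.push_back(c);
    }
    return out;
}

// Assumed encoder: one token per logical character, so a continuation is
// only ever placed between tokens and never inside "\)" or "$xx".
std::vector<std::string> encodeTokens(const std::string& raw) {
    static const char* kHex = "0123456789ABCDEF";
    std::vector<std::string> tokens;
    for (unsigned char b : raw) {
        if (b == ')' || b == '\\' || b == '$')
            tokens.push_back(std::string("\\") + static_cast<char>(b));
        else if (b < 0x20 || b >= 0x7F)
            tokens.push_back(std::string("$") + kHex[b >> 4] + kHex[b & 0xF]);
        else
            tokens.push_back(std::string(1, static_cast<char>(b)));
    }
    return tokens;
}

// Counts the split positions whose decoded value differs from raw.
std::size_t countBadSplits(const std::string& raw) {
    const std::vector<std::string> tokens = encodeTokens(raw);
    std::size_t bad = 0;
    for (std::size_t split = 0; split <= tokens.size(); ++split) {
        std::string text;
        for (std::size_t t = 0; t < tokens.size(); ++t) {
            if (t == split) text += "\\\n";    // continuation before token t
            text += tokens[t];
        }
        if (split == tokens.size()) text += "\\\n";  // continuation at the end
        text += ')';
        if (decodeValue(text) != raw) ++bad;
    }
    return bad;
}

// A header-like value with non-ASCII bytes, CRLF and all three special chars.
void demoSplitSweep() {
    assert(countBadSplits("Subject \xE3\x81\x82\r\n(a)$\\ end") == 0);
}
```

Swapping a different decoder into countBadSplits would show which split positions, if any, that decoder mishandles.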
Status: RESOLVED → VERIFIED
verified in 1999-07-27-08 windows build
Product: MailNews → Core
Product: Core → MailNews Core