Closed Bug 9099 Opened 25 years ago Closed 25 years ago

HTML entity followed by CR garbles page

Categories

(Core :: Layout, defect, P3)

RESOLVED FIXED

People

(Reporter: waldemar, Assigned: jbetak)

Attachments

(3 files)

The attached page produces the output in the attached gif. The stray square occurs only if the entity ‘ is immediately followed by a carriage return.
Attached file HTML test page (deleted)
Attached image Screen shot of output (deleted)
This error occurs with apprunner M7.
Status: NEW → ASSIGNED
Assignee: rickg → pierre
Status: ASSIGNED → NEW
Pierre -- this bug does NOT occur on windows. Can you please take a look on mac? I suspect a font mapping problem.
Assignee: pierre → rickg
A linefeed character (0x000A) is present in the Unicode string by the time nsRenderingContextMac::DrawString() is called. The string is:
<begin quote><linefeed><a><n><d><space><end quote>
instead of:
<begin quote><space><a><n><d><space><end quote>
Viewer on Windows displays the string as "and " (without leading space) instead of " and " (with leading space), which makes me think that the string also contains a linefeed on Windows. The only difference is that the character is skipped there instead of being displayed as a bad character, as it is on the Mac. I verified with Communicator and Internet Explorer that there should indeed be a leading space in the string, so it seems to me that this is a parser bug: when an HTML entity is immediately followed by a carriage return, the parser should generate a space character. Reassigned to Rick.
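For anyone trying to reproduce the diagnosis above, a minimal sketch of a code-unit dump is shown here. It is not Mozilla code: `DumpCodeUnits` is a hypothetical helper, and it assumes the text run is available as a `std::u16string`. It only makes a stray U+000A visible in a string that otherwise looks fine on screen.

```cpp
#include <cstdio>
#include <string>

// Hypothetical debugging helper (an assumption, not part of the Mozilla tree):
// lists every UTF-16 code unit of a text run as U+XXXX, separated by spaces,
// so that control characters such as U+000A are easy to spot.
std::string DumpCodeUnits(const std::u16string& text) {
    std::string out;
    char buf[8];  // "U+FFFF" is 6 characters plus the terminating NUL
    for (char16_t ch : text) {
        std::snprintf(buf, sizeof buf, "U+%04X", static_cast<unsigned>(ch));
        if (!out.empty()) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

Calling it on a string built like the one described (left quote, linefeed, "and ", right quote) would show U+000A in the second position, where U+0020 was expected.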
Assignee: rickg → kipp
Component: Parser → Layout
Kipp -- it appears to me that the parser is sending exactly the right content over. I watched the AddText() method in the sink, and it looked correct too. Perhaps we have a buffer copy error, or maybe it's in the lexomorphic transform code somewhere.
Status: NEW → ASSIGNED
Target Milestone: M10
Assignee: kipp → erik
Status: ASSIGNED → NEW
The test case has Unicode characters \u2018 and \u2019 in it, which do not render at all on Linux. The bug was reported on MacOS, so I suspect the other platforms have a problem as well.
Assignee: erik → ftang
Target Milestone: M10 → M12
Status: NEW → ASSIGNED
Target Milestone: M12 → M14
Mark as M14 assigned.
This is a backend bug. Probably good for jbetak to work on.
Assignee: ftang → jbetak
Status: ASSIGNED → NEW
Change Platform and OS to ALL
OS: Mac System 8.6 → All
Hardware: Macintosh → All
Rick, I'm sending this over again. I verified the Unicode <-> ISO-8859-1 character conversions and went through a couple of examples. I think the best way of addressing this problem is to have the parser ignore CR/LF within a text run. CR/LF should be considered white space in such a context, and we should avoid rendering a white space (0x20 space), since it might not be appropriate in Japanese and Chinese contexts. I looked into nsHTMLTokens.cpp and nsHTMLTokenizer. Would it be possible to recycle the CR/LF in CNewlineToken::Consume based on the context? It worked in the debugger...
Assignee: jbetak → rickg
jbetak: your suggestion is appreciated and your idea could work, but it's misguided. The parser CANNOT arbitrarily change newlines to spaces in this way. Imagine that this sequence occurred in a <PRE> tag. The newline would be very meaningful. Furthermore, if it occurred in a tag that was given CSS treatment of <PRE>, the parser would never know. The right answer is that the layout system needs to convert the newline into a space like it does everywhere else.
I'm reassigning this to Buster, since he's Mr. Layout now and has control over text rendering.
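The argument above (that the newline-to-space conversion belongs in layout, because only layout knows whether the text is preformatted) can be illustrated with a rough sketch. This is not the actual fix and not Mozilla code: `NormalizeTextRun` and its `preserveNewlines` flag are hypothetical names, with the flag standing in for "this text is inside <PRE> or styled like it". The sketch does not collapse runs of white space and does not address the Japanese/Chinese concern raised earlier.

```cpp
#include <cstddef>
#include <string>

// Hypothetical layout-side helper (an assumption for illustration only).
// preserveNewlines models <PRE> or an element styled to behave like <PRE>;
// the parser cannot know this, which is why the decision is made here.
std::u16string NormalizeTextRun(const std::u16string& text, bool preserveNewlines) {
    if (preserveNewlines) {
        return text;  // newlines are meaningful; leave the run untouched
    }
    std::u16string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        char16_t ch = text[i];
        if (ch == u'\r') {
            // Treat a CR LF pair as a single line break.
            if (i + 1 < text.size() && text[i + 1] == u'\n') {
                ++i;
            }
            out.push_back(u' ');
        } else if (ch == u'\n') {
            out.push_back(u' ');
        } else {
            out.push_back(ch);
        }
    }
    return out;
}
```

With `preserveNewlines` set to false, the linefeed that follows the left quote in the reported string becomes an ordinary space. With it set to true, the string is returned unchanged, which is the <PRE> case the parser-side approach would have broken.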
Assignee: rickg → buster
throwing into kipp's bucket for now. not critical for beta, moving to M16
Assignee: buster → kipp
Target Milestone: M14 → M16
Status: NEW → ASSIGNED
I'm taking this back; ftang helped narrow down the problem... Should be a minor fix.
Assignee: kipp → jbetak
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
mark this as M15
Target Milestone: M16 → M15
See the Thai URL on Babel. This bug results in a special character being displayed before each <wbr>, which makes rendering of Thai documents very difficult, since they use <wbr> quite frequently to indicate possible word breaks...
finally managed to get the fix in. Troy, ftang - thanks for all your help.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED