Closed Bug 7533 Opened 26 years ago Closed 25 years ago

{compat} Content-Type HTTP Header specifies charset

Categories

(Core :: Networking, defect, P3)

Tracking

()

VERIFIED FIXED

People

(Reporter: michaell, Assigned: gagan)

References

()

Details

I don't really know what to call this problem: Go to the Merriam Webster dictionary site above. Enter a word in their dictionary lookup text box and click search. I saw their sidebar image disappear and be replaced by several bullets, and that's all that happened. I seem to lose all interactivity with the page. Apprunner tells me that http://www.m-w.com/cgi-bin/dictionary was loaded successfully, but I see no definition for the word I was looking for.
Component: Apprunner → Parser
On Win95, using build 1999060208, I went to the main page, typed in the word 'help' and selected search. The resulting search page did not paint correctly. It rendered as if I were using Paint Shop Pro and had selected a screen grab. It works fine in 4.x. Looking at the content of the resulting page does show some odd coding practices:
1. They have a base element tag above the doctype statement.
2. They code their comments oddly, for example: <!-------LOGO--------> They are missing the needed space after the opening -- and before the closing --
Assignee: don → rickg
QA Contact: leger → janc
Status: NEW → ASSIGNED
Assignee: rickg → nisheeth
Status: ASSIGNED → NEW
Nisheeth -- Can you please take a look at this and try to narrow down the problem? Thanks.
Status: NEW → ASSIGNED
OK, I've reduced the page to the following:
--------
<html>
<head>
<title>Welcome to Merriam-Webster</title>
</head>
<body>
<form method="post" action="http://www.m-w.com/cgi-bin/dictionary">
<input type="hidden" name="book" value="Dictionary">
<p>WWWebster Dictionary: <input name="va" size="15"> <input type="submit" value="Search"> </p>
</form>
</body>
</html>
--------
All that is happening here is a simple form POST. I'm ccing Eric Pollmann to see if he knows more about this. I'll investigate more tomorrow.
This does not seem to be related to either a) posting this form correctly or b) loading the response page. The bug is possibly a combination of these and other factors.
Posting this form seems to work, as long as it is not posted to the m-w.com URL:
http://blueviper/forms/postecho.html - shows the post works as it should
http://blueviper/forms/postbv.html - spits back the same content
Using a GET to send the form works, even to the same URL:
http://blueviper/forms/getmw.html - submits and receives a page from m-w.com
The only case where I see failure is the mentioned case:
http://blueviper/forms/postmw.html - failure case
Ah, the problem appears to be in the web site's implementation of their service, and not a browser problem. I traced the network traffic and noticed this:
IE 5.0 posts this:
-----
POST /cgi-bin/dictionary HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, */*
Referer: http://blueviper/forms/postgrab.html
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
Host: www.m-w.com
Content-Length: 23
Connection: Keep-Alive

book=Dictionary&va=test
-----
The web server responds with the page in its entirety.
We post this:
-----
POST /cgi-bin/dictionary HTTP/1.1
Connection: Keep-Alive
User-Agent: Mozilla/5.0 [en] (Win95; I)
Pragma: no-cache
Host: www.m-w.com
Accept: text/html, text/xml, image/png, image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Accept-Encoding: gzip
Content-type: application/x-www-form-urlencoded; charset=ISO-8859-1
Content-Length: 23

book=Dictionary&va=test
-----
The web server responds with a null page. I'd say this is Merriam Webster's bug and not ours...
Assignee: nisheeth → pollmann
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
I've contacted Merriam Webster to let them know about this. I'm grabbing this bug and will mark it fixed when/if I hear back from them.
I just heard back from Gerry Wick at Merriam Webster! He says that the User-Agent is not checked in the dictionary CGI. So if it's not that, it must be something else. :) After some more debugging, I've narrowed it down to the content-type line:
Content-type: application/x-www-form-urlencoded; charset=ISO-8859-1
If I remove the charset information from the line to make it look like IE 5.0's, everything is peachy:
Content-type: application/x-www-form-urlencoded
Is our content-type syntax correct?
No: We should fix our content-type syntax!
Yes: The ball is in Merriam Webster's court, but there may be more sites that are broken too.
From what I read of the HTTP spec, our content-type syntax looks fine. But I'm no expert here. Anyone know for sure?
Our content-type is just fine. As you mentioned (in your email), it's very likely that the Apache module that is checking the content-type is doing an "equal" instead of a "starts with" comparison, which throws it off on our content-type string.
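To illustrate the suspected difference, here is a minimal Python sketch. The server's actual code is not known; the function names strict_match and tolerant_match are hypothetical, and the tolerant version compares only the media type before the first ";" rather than doing a literal "starts with".

```python
FORM_TYPE = "application/x-www-form-urlencoded"


def strict_match(content_type: str) -> bool:
    # Hypothetical server check: exact string equality.
    # Any media type parameter (such as charset) makes it fail.
    return content_type == FORM_TYPE


def tolerant_match(content_type: str) -> bool:
    # Hypothetical fix: compare only the media type, ignoring
    # everything after the first ';' and ignoring case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == FORM_TYPE


ie5_header = "application/x-www-form-urlencoded"
mozilla_header = "application/x-www-form-urlencoded; charset=ISO-8859-1"

print(strict_match(ie5_header), strict_match(mozilla_header))
print(tolerant_match(ie5_header), tolerant_match(mozilla_header))
```

With a strict check, only the IE 5.0 header is accepted, which would match the null page seen above.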
Summary: Odd site behavior → Merriam Webster and Content-Type HTTP Header
Status: ASSIGNED → RESOLVED
Closed: 26 years ago
Resolution: --- → INVALID
I'm marking this bug 'invalid'. If anyone feels it needs further attention, please feel free to grab it and reopen it!
Status: RESOLVED → VERIFIED
*** Bug 10112 has been marked as a duplicate of this bug. ***
This is also causing problems with submission to a ColdFusion web app. See bug 10112 for details.
I'm going to report this to Allaire regarding their ColdFusion product.
*** Bug 11447 has been marked as a duplicate of this bug. ***
Component: Parser → Networking-Core
OS: Windows 95 → All
Hardware: PC → All
Summary: Merriam Webster and Content-Type HTTP Header → Content-Type HTTP Header specifies charset
From bug 11447:
+ ---- Additional Comments From semperubisububi@hotmail.com 08/11/99 07:53 ----
+ I think this is REALLY dangerous. !! The pages work fine in IE 5!!! Secondly,
+ if anyone were to attempt to use Mozilla in a corporate environment that used
+ Front Page (this is relatively common) for authoring, it posts that content
+ type every time. I'll agree that this is very nice and theoretically correct
+ and all of that. But no one will use a browser that says "sorry tell the
+ author to change their page to work with this browser". So obviously if the
+ goal is to have actual end-users of this software then this is something that
+ will have to be changed. Consider, Microsoft is going to say "here's a
+ browser that runs all your existing pages and does this gee-whiz" and
+ Netscape/Mozilla are going to say "fix your web pages so they'll work with our
+ browser". What do you think Joe blow-customer will choose? Think about your
+ customer.
It would be nice to get a feel for the number of web pages we're breaking by changing our Content-Type header. Does anyone in the networking group want this bug?
Add Yahoo's webmail to this. I suggest the web tools be evaluated as well. I know Front Page is big on using the content type you describe. If I recall correctly there are some instances when using ASP where you HAVE to put that charset type.
Would it be a solution to add a preference option? A checkbox that says: Use Standard HTTP1.1 (may not work with all web pages). Or something like that, so that the user can make the final choice. The problem is that it'd have to default to off for the average user, otherwise they will just click on something, it won't work, and they will assume it's Mozilla that's broken. The average user doesn't seem to care about standards compliance anyway as long as it works *sigh*. PS: any site using ColdFusion for their dynamic web pages doesn't work with Mozilla at the moment (I received no response from Allaire).
Status: VERIFIED → REOPENED
This idea has some potential. Since I am not working on the networking code, I'm reopening and assigning this to warren.
Resolution: INVALID → ---
Target Milestone: M7
Assignee: pollmann → warren
Status: REOPENED → NEW
*** Bug 12041 has been marked as a duplicate of this bug. ***
Assignee: warren → gagan
dup? ->gagan
It's a dupe alright.
*** Bug 12625 has been marked as a duplicate of this bug. ***
*** Bug 13134 has been marked as a duplicate of this bug. ***
Status: NEW → ASSIGNED
Target Milestone: M15
*** Bug 14261 has been marked as a duplicate of this bug. ***
Shouldn't this bug be M12-13? (a beta-blocker) Even if we get Allaire to release a ColdFusion patch to fix this, many sites running ColdFusion will not install a patch, and it breaks some very important sites.
This breaks more than ColdFusion pages.
Summary: Content-Type HTTP Header specifies charset → {compat} Content-Type HTTP Header specifies charset
I agree that sending 'charset' should be a pref option, defaulted to off. This is from HTTP 1.1 (http://www.ietf.org/rfc/rfc2616.txt), section 3.7:
"Note that some older HTTP applications do not recognize media type parameters. When sending data to older HTTP applications, implementations SHOULD only use media type parameters when they are required by that type/subtype definition."
So we would not actually be breaking standards compliance by making this optional. In any case, "compatibility mode" switches are already being used over in Layout, so I don't think that it would be controversial...
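As a rough sketch of what such a pref could look like, in Python for brevity: the function name form_content_type and the send_charset flag are hypothetical and are not taken from the Mozilla source. The default is off, so the header matches what IE 5.0 sends.

```python
def form_content_type(charset: str, send_charset: bool = False) -> str:
    # Hypothetical helper: build the Content-Type value for a form POST.
    # send_charset stands in for the proposed preference; it defaults to
    # off so that older servers see the bare media type.
    base = "application/x-www-form-urlencoded"
    if send_charset:
        return f"{base}; charset={charset}"
    return base


default_header = form_content_type("ISO-8859-1")
opt_in_header = form_content_type("ISO-8859-1", send_charset=True)
print(default_header)
print(opt_in_header)
```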
*** Bug 15180 has been marked as a duplicate of this bug. ***
I also think that a preferences option (default off) would be a good idea. Incidentally, here's some more stuff that seems to get confused by this:
Hotmail (whatever they're using)
Sun Java Web Server
Java Servlet and JSP developers kit - I've reported this to Sun
Servlets on Apache + Apache JServ
*** Bug 16642 has been marked as a duplicate of this bug. ***
*** Bug 15745 has been marked as a duplicate of this bug. ***
*** Bug 10300 has been marked as a duplicate of this bug. ***
Status: ASSIGNED → RESOLVED
Closed: 26 years ago → 25 years ago
Resolution: --- → FIXED
I have switched this to "off". We may still have to switch it on based on a pref, but the default is now off. I have filed a separate bug 18431 for turning it on from a pref.
First of all, I'd like to thank whoever added me to the Cc list. I'm not that surprised that there are a lot of server-side pieces of software that have this problem. But it's interesting to see how many bugs have been declared a dup of this one. Secondly, I should note here that MSIE has chosen to indicate the charset info inside the form submission itself, using a special field called "_charset_" (includes the underscores). We may want to add support for that in Mozilla, since I think it's clear that we can't do it the way the spec says, due to all the broken server-side software out there. I've logged a separate bug for that: http://bugzilla.mozilla.org/show_bug.cgi?id=18643 Finally, I'd say that this shouldn't even be a pref. Just remove the code that adds the charset to the Content-Type line. The cat's out of the bag. There will probably never be a time when we can introduce the charset param on the Content-Type header. There is too much broken software out there, and it would be difficult to get rid of it. Take the path of least resistance, and follow MSIE's footsteps (the _charset_ field).
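A minimal Python sketch of the _charset_ idea, for illustration only: the function name encode_form is hypothetical, and exactly when MSIE adds or fills this field is not described in this bug, so the sketch simply appends it to every submission. The Content-Type header would then stay as the bare application/x-www-form-urlencoded.

```python
from urllib.parse import urlencode


def encode_form(fields, charset="ISO-8859-1"):
    # Hypothetical helper: carry the charset in the form body itself
    # through a "_charset_" field instead of a Content-Type parameter.
    pairs = list(fields)
    pairs.append(("_charset_", charset))
    # Percent-encode non-ASCII characters using the same charset.
    return urlencode(pairs, encoding=charset)


body = encode_form([("book", "Dictionary"), ("va", "test")])
print(body)
```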
Status: RESOLVED → VERIFIED
The problem with IE's hack is that it clashes with form fields which already have the name "_charset_". See further comments in bug 18431.
I suspect that there are far fewer sites that use the odd form field name "_charset_" than sites that choke on a charset parameter in the Content-Type header. It should therefore be easier to get people to stop using _charset_ for their own private use. As far as I know, MSIE5 has been sending out the _charset_ field for a while now. If there were any sites at all that happened to be using _charset_ for their own purposes, they've probably stopped doing that by now.
*** Bug 18967 has been marked as a duplicate of this bug. ***
*** Bug 19602 has been marked as a duplicate of this bug. ***
*** Bug 19336 has been marked as a duplicate of this bug. ***
*** Bug 20293 has been marked as a duplicate of this bug. ***
Bulk move of all Networking-Core (to be deleted component) bugs to new Networking component.
*** Bug 22202 has been marked as a duplicate of this bug. ***
*** Bug 16096 has been marked as a duplicate of this bug. ***
Bug 289060 talks about the same problem on a "per form field" granularity level