Closed Bug 45143 Opened 24 years ago Closed 24 years ago

META HTTP-EQUIV="Refresh" not re-directing properly

Categories

(Core :: Networking, defect, P3)

defect

Tracking

()

VERIFIED FIXED
mozilla0.9

People

(Reporter: stephe, Assigned: jag+mozilla)

References

()

Details

(Keywords: testcase, Whiteboard: would like for 0.9)

Attachments

(10 files)

On the this web site, there are links that look like http://www.britannica.com/bcom/redirect/1,5793,,00.html?url=%2Fnews If I click on this link using Mozilla, I am brought to an error page that says in part: "The requested object does not exist on Britannica." If I click on the same link in NN4, it works fine. In fact I can take the above link and paste into the URL bar and get the same error. Or I can place the link in a small html page and get the same error when I click on the link. This seems to have something to do with the redirection that is goin on. In NN4 when I click on the link, it brings up the page successfully. If I copy that url and paste it into Mozilla's url bar, Mozilla can load the page as well. This web site did not have this problem at some point in the near past so either Mozilla has regressed or the web site was redesigned, but it does not look as if it was. I tried this using the latest nightly build of Mozilla, 071020, on NT4 and Linux.
It loaded for me just fine with build 2000071108 on Linux. Can you try this again with a newer build?
linux, build ID 2000071108 Windows 95, build ID 2000071008 I'm seeing this problem too. The offending bit is this (you get it when you fetch http://www.britannica.com/news/): <META HTTP-EQUIV="Refresh" CONTENT="0; URL=/brit/0,8532,25,00.html"> which will result in a redirect to "http://www.britannica.com/brit/0"
Status: UNCONFIRMED → NEW
Ever confirmed: true
Does anyone know why we use ; *and* , as token separator in nsHTMLContentSink.cpp? This breaks this URL with , in it.
ouch! I entered comments into this bug using mozilla awhile ago yet they were never posted :-/. There are sites out there that use ',' as a seperator and there are sitest that use ';'. We're screwed on this one.
*** Bug 45250 has been marked as a duplicate of this bug. ***
Not necessarily. When the value of http-equiv is "refresh" Mozilla could just split (not tokenize) on the first , or ; it finds, use whatever's before it as the wait time, and everything after the "url=", not including any trailing invalid characters, as the destination. That should cover most if not all cases in current practice.
But bug 45250 doesn't have anything to do with "refresh".
I think bug 45250 is a different bug... http://lxr.mozilla.org/seamonkey/source/layout/html/document/src/nsHTMLContentSink.cpp#4130 That's where the meta http-equiv=refresh is dealt with. Also see my comments in bug 45250 which I just reopened.
We could look for the first available tokenizer char ; or ,. Then we could do the next searches only with that char first found. That would also work in most cases, but would fail if there is only a url in the refresh header and it contains a ,. Anyway, in my opinion, if we have a site that conforms to the standards and we break it, because we want to fix other sites that violate those standards, the path is clear: We support the standards supporting site 100%, may the other site be part of the top 100 or not.
Added testcase since britannica has changed and now works in Mozilla (yes, also those older versions with which it didn't work). I think I've found the fix for this... I'll attach it in a while.
Keywords: testcase
Actually... Could someone point at the CNN pages which use a ',' seperator? I've tried to use the ',' seperator in a testcase and NS4.73 just choked on it. So, the obvious fix is to just not use ',' as a seperator. However, I'll attach a patch which needs review and some testing (can't build here), and barring any mistakes fixes all but one case: content="12,foo.html", where "12,foo.html" is a page, instead of indicating a 12 second wait with a redirect to foo.html. The workaround for that case is content="0,12,foo.html" or content="url=12,foo.html". However, I'd rather ',' not be a valid seperator at all. Adding patch, requesting review.
Keywords: patch, review
I'll attach a new patch which is a bit cleaner and deals with most cases. Basically it tries to get the number of seconds (optional) and uri (optional), allowing both ',' and ';' as separators, and throws away anything that follows the uri and can't/shouldn't be part of it.
Target Milestone: --- → Future
*** Bug 49447 has been marked as a duplicate of this bug. ***
Blocks: 49476
This looks good to me. The problem of bug 49447 works with this. The dot problem is also taken care of.
*** Bug 42531 has been marked as a duplicate of this bug. ***
Assigning this bug to myself and I'll try to get this patch r=, a= and checked in.
Assignee: gagan → disttsc
Got an r=valeski, keyword review -> approval.
Keywords: reviewapproval
Accepting bug (sshhh, bugzilla), got a fix, rewriting it slightly per waterson's request.
Status: NEW → ASSIGNED
latest patch looks good to me, r=valeski.
The last patch looks good to me. harishd: could you be 2nd reviewer on it? valeski has already given it one look through. a=waterson, so long as r=harishd.
*** Bug 57637 has been marked as a duplicate of this bug. ***
Attaching a new patch, based on comments from harishd... Added comments, changed the parameter naming style and looked at the return values and how I've changed them. I completely rewrote the parser logic to use iterators, it's a lot cleaner (and hopefully understandable) this way (I hope). Here are a few questions: 1) Can someone explain to me when ProcessMETATag is supposed to pass on errors it encounters, and when not (other than "we know stuff might go wrong so we'll try this alternative")? 2) Currently we skip all of the "header" processing code if we don't have a docShell. Is this correct behaviour, or do we only need to do that for those who really need the docShell? 3) I've moved |mDocument->SetHeaderData(fieldAtom, content);| up so it'll always be set. It used to always be set for all but "refresh" and "set-cookie", where it would only be set if we didn't |return rv;| early. What is the correct/desired behaviour here? Move it down and make it only be set if NS_SUCCEEDED(rv)? Make it only be set for certain header types? Also see the // -jag- comments for the above three questions Btw, currently there are a number of |if (NS_FAILED(rv)) return rv;| which prevent us from reaching |NS_RELEASE(it);|. I've turned all that logic around, making it fall through to the NS_RELEASE on error, and continue if all's ok.
Attached patch [patch] rewritten yet again (deleted) — Splinter Review
Oh, and I forgot... What kind of whitespace can I be expecting in the content attribute? Is |nsCRT::IsSpace| (' ', '\n', '\r', '\t') enough, or do I need the '\b'? scc, is there something nsReadableUtils.h could provide? |skipChars(i1, i2, charSet)| or perhaps a more specialized |skipWhitespace(i1, i2)|?
r=valeski
Yay... Found another spot where this code lives. http://lxr.mozilla.org/seamonkey/source/docshell/base/nsDocShell.cpp#3783
*** Bug 59023 has been marked as a duplicate of this bug. ***
cc'ing rpotts. Can you help us answer some of the questions in my 2000-10-26 09:25 comment?
r=harishd
hey peter, I do not think that '\b' needs to be included as a whitespace character. I'd stick with isAsciiSpace(...). I took a look at your patch and have a few suggestions: In ProgressMetaTag(...) 1. I would change 'it' to a nsCOMPtr<nsIHTMLContent> to avoid having to do an explicit NS_RELEASE 2. I would also change fieldAtom to an COMPtr. Remember, to use the dont_AddRef(...) modifier as follows: nsCOMPtr<nsIAtom> fieldAtom( dont_AddRef(NS_NewAtom(header)) ); to keep the refcounting correct :-) By using COMPtrs you don't need to worry about any early return paths that exist now, or could be introduced in the future :-) 3. I would only call mDocument->SetHeaderData(...) if the data is correct. This preserves the same semantics as the old code - and makes some sense :-) ProcessMETARefresh(...) is a really scary method. Mostly because it is a complete rewrite of the HTTP-EQUIV=REFRESH parsing. I know that whenever we change this code some part of meta-refresh breaks :-( 1. Get rid of the '\b' whitespace checks. 2. Should seconds be cleared if we drop into the 'back to square 1' case? It seems like this case assumes that no time was specified. But you could have some lame, relative URL that began with a digit :-) 3. It looks like the new code will also accept 'HTTP-EQUIV=REFRESH CONTENT="5 URL=...". Where the semicolon (or comma) is completely missing. Is this correct? Or are we being too forgiving? (I don't know the answer :-) 4. Can we fix the spelling of 'reefer' :-) I have no idea what 'reefer' is meant to be. It looks like it's been in the code way too long! -- rick
dude! I put "reefer" in as a subliminal suggestion that it might be easier to work on this code after a joint ;-).
> 1. I would change 'it' to a nsCOMPtr<nsIHTMLContent> to avoid having to do an > explicit NS_RELEASE > 2. I would also change fieldAtom to an COMPtr. Remember, to use the > dont_AddRef(...) modifier as follows: > nsCOMPtr<nsIAtom> fieldAtom( dont_AddRef(NS_NewAtom(header)) ); > to keep the refcounting correct :-) Yup, sounds good, will do that. > 3. I would only call mDocument->SetHeaderData(...) if the data is correct. > This preserves the same semantics as the old code - and makes some sense :-) Yes that makes sense. However, currently it seems to happen regardlessly for "default-style", "link", "content-base" and "window-target". Only "refresh" and "set-cookie" possibly return (in the old code) before reaching SetHeaderData([0]). So I'm curious, is this what we want? Also, (did I mention that here yet? Nope, doesn't look like I did) SetHeaderData uses the first parameter as a unique ID (a key) to store the data, so with multiple set-cookie headers, you lose data. Apparantly that is currently not a problem because of |cookieServ->SetCookieString| which also stores it, then adds whatever's in the header data store and isn't duplicate (or so I was told). > ProcessMETARefresh(...) is a really scary method. Mostly because it is a > complete rewrite of the HTTP-EQUIV=REFRESH parsing. I know that whenever we > change this code some part of meta-refresh breaks :-( Well, we can change this to do exactly what NS 4 does if that's deemed The Right Thing. > 1. Get rid of the '\b' whitespace checks. Will do. > 2. Should seconds be cleared if we drop into the 'back to square 1' case? > It seems like this case assumes that no time was specified. But you could > have some lame, relative URL that began with a digit :-) Urgh, you're right. I did that in one of my previous patches, I forgot it in the rewrite... Will do. > 3. It looks like the new code will also accept 'HTTP-EQUIV=REFRESH CONTENT="5 > URL=...". Where the semicolon (or comma) is completely missing. Is this > correct? Or are we being too forgiving? (I don't know the answer :-) Can we be too forgiving? This wasn't exactly by design (oops) but I'm starting to like it :-) I could however just skip everything if I don't find a ';' or ',' there (add an |else { iter = doneIterating; }| to do that). This kind of feedback is what I was looking for :-) Thanks Rick So that answered my question #3 from a while back, leaving #1 (ProcessMETATag return value) and #2 (when should no docShell cause us to skip code?)
we need to be forgiving.
*** Bug 60223 has been marked as a duplicate of this bug. ***
For clarity, I'll repeat the unanswered questions (just in case someone actually reads their bug mail): 1) Can someone explain to me when ProcessMETATag is supposed to pass on errors it encounters, and when not (other than "we know stuff might go wrong so we'll try this alternative")? 2) Currently we skip all of the "header" processing code if we don't have a docShell. Is this correct behaviour, or do we only need to do that for those who really need the docShell?
*** Bug 66265 has been marked as a duplicate of this bug. ***
*** Bug 66283 has been marked as a duplicate of this bug. ***
Nominating for mozilla0.9 (and recommend nsbeta1 although as I don't work for Netscape I'm not setting that) this really needs to be fixed before any more end user moz spinoffs (NS6.x, beonex, etc) hit the market. Like I said in my bug http://bugzilla.mozilla.org/show_bug.cgi?id=49447, I was working at the Guardian at the time and I was creating a simple series of sidebars for their network of sites in the hope that they'd help draw interest in Netscape 6 once it was launched. Well, because they use the StoryServer content management system and by default it uses commas in the URLs this caused problems with the sidebars as the wouldn't work as they periodically refreshed using meta tags. So because this bug wasn't fixed for NS 6.0 it meant the sidebars never happened, so Netscape lost out on some free publicity, I don't work there at the moment so even when this bug is fixed I dunno if someone will revive the idea of the news sidebar panels for the site. But anyway, this created a bad impression of Netscape 6.0 and really needs to be sorted for the next release.
Keywords: mozilla0.9
Well, I've got a patch, all I need is answers to those two questions. Otherwise I'll just try to keep things the way they were as much as possible and open a new bug on those questions. Going for mozilla 0.9.
Target Milestone: Future → mozilla0.9
Okay, I'm going to update that patch and try and get some traction. Updating summary to better reflect the bug.
Summary: Mozilla will not follow certain links → META HTTP-EQUIV="Refresh" not re-directing properly
*** Bug 73276 has been marked as a duplicate of this bug. ***
Hi I´ve been having a problem with mozilla that might be related to this. If you put <meta http-equiv=refresh content=´0; http://www.somepage.com/´> it actually works when it shouldn´t. I only realized I was not putting the url= after I tested it with ns4.7 and ie. I don´t think mozilla should accept invalid input (at least when no other browsers do it). There is another problem. If I write <meta http-equiv=refresh content=´0; something?aa=otherthing´> then mozilla tries to go the the page named ´otherthing´. It catches the part after the = even when there was no url=. Tested with mozilla 20010319.
> There is another problem. If I write <meta http-equiv=refresh content=´0; > something?aa=otherthing´> then mozilla tries to go the the page named > ´otherthing´. It catches the part after the = even when there was no url=. Once I get my patch updated (and hopefully checked in soon) you should find that this will work as expected. I do think however I'll allow "url=" to be absent, even if no other browser supports it.
yea, we need to be able to handle absent "url=" tokens.
*** Bug 73375 has been marked as a duplicate of this bug. ***
I've decided to mark nsbeta1 despite the fact I normally avoid nominating for Netscape's release schedule because if this isn't fix it's going to make Netscape 6.5 look really bad for the reasons I keep outlining both here and in Bug 49447 particularly as there's a patch here and just a couple of questions to be resolved. Someone familiar with this code should be able to answer these remaining questions fairly quickly.
Keywords: nsbeta1
Hardware: PC → All
I have to go away for the evening, but I'll run it through my test-suite (as soon as I recover that, I managed to trash the partition which has it) when I come back. Putting up the untested patch so anyone interested can run it through their tests.
*** Bug 73791 has been marked as a duplicate of this bug. ***
My bad - Bug 73791 is not a dup
*** Bug 74054 has been marked as a duplicate of this bug. ***
So, by moving the code into nsDocShell as part of nsIRefreshURI I'm able to get rid of the code duplication and update the code in nsDocShell (which has kinda been neglected while its cousin was being fixed). It fixes all of the dupes of this bugs and other uris I could find, though it'd be very nice if someone could run this through a test suite.
r/sr=rpotts. This looks really good ! I only have a couple of questions/comments: - why is '\b' considered whitespace? It doesn't look like it was in the old parsing code... - in nsDocShell::RefreshURIFromHeader(...) I think that you should return 'rv' at the bottom (rather than NS_OK)... This allows the failure codes to be returned correctly... Then, you can also remove the check for NS_FAILED after the call to RefreshURI(...)
I'll remove this assertion (it could also trigger if someone did content="3000000000; url=bla.html"): + NS_ASSERTION(seconds >= 0,"Parse error in content attribute of (meta) http-equiv=refresh"); - why is '\b' considered whitespace? It doesn't look like it was in the old parsing code... I think I copied it from this code: http://lxr.mozilla.org/seamonkey/source/string/obsolete/nsString2.cpp#40 I'm not sure how necessary it is though... - in nsDocShell::RefreshURIFromHeader(...) I think that you should return 'rv' at the bottom (rather than NS_OK)... This allows the failure codes to be returned correctly... Then, you can also remove the check for NS_FAILED after the call to RefreshURI(...) Yeah, I've been struggling with the return value, but I guess if we can't create a new URI, something's wrong, and we should return early from HTMLContentSink::ProcessHeaderData. Note though that that's a change from the current code, which doesn't. Also note that nsDocShell::SetupRefreshURI completely throws away the return value, should that too be fixed?
Whiteboard: would like for 0.9
go ahead and check in when your ready - approved by drivers for 0.9
I also fixed two minor bugs: 1) in case content="15.html", it would optimistically parse that as 15 seconds, encounter the '.', decide the number was part of a url, and re-parse it as a url. But it didn't reset seconds to 0, so you'd refresh after 15 seconds instead of immediately. This bug was actually pointed out by rpotts 2000-11-07 12:16 comment. I'm pretty sure I fixed it back then, I guess I lost it somewhere in the bitrotting. Whoops :-) 2) in case content="-1; url=foopy.html", it would try to go to url "-1;". I'm now allowing for one of '-', '+' before the number, and setting any negative numbers to 0 later on (yeah, I could just directly set to 0, but who knows, we may invent time travel someday!).
looks good... (r/sr=rpotts) -- rick
r=brendan@mozilla.org if you need it, I read the patch, it's all good. /be
Checked in, marking fixed. Thanks everyone!
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Keywords: approval
Resolution: --- → FIXED
*** Bug 79080 has been marked as a duplicate of this bug. ***
Status: RESOLVED → VERIFIED
verified: WinNT4 2001052204 Mac os9 2001051504 Linux rh6 2001052213
not sure if this is a regression, or a new manifestation, but http://www.peoplesclinic.org/ continuously reloads.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
jud: which build and what platform? I can't see it on the 200105304 build on Windows ...
Don't see this on Linux 2001052908 FWIW
I was seeing this w/ a 0.9 binary on win98.
12 dups. Marking mostfreq.
Keywords: mostfreq
valeski, can you reproduce this with a recent nightly build?
qa to me. +qawanted, verifyme If this is regressing, a build AND a testcase URL would be most helpful. This is an area of HTML I am not especially familiar with.
Keywords: qawanted, verifyme
QA Contact: tever → benc
from my previous comment: "not sure if this is a regression, or a new manifestation, but http://www.peoplesclinic.org/ continuously reloads." I reopened this because jag recently checked in some META REFRESH changes. It *may* be a separate bug. steps to repro: 1. visit the url outlined above. 2. watch it continuously reload. again, I did this on win98 w/ a 0.9 build off of ftp.mozilla.org. I don't have a win98 box avail anymore to verify it w/ newer builds.
Assignee: jag → jaggernaut
Status: REOPENED → NEW
I can see this with the 0.9 build, can't see it with any recent nightly, and from what I can tell those pages don't use META REFRESH but javascript to do the refresh. Marking this bug fixed again.
Status: NEW → RESOLVED
Closed: 24 years ago24 years ago
Resolution: --- → FIXED
not seeing this on todays 0.9.1, also passing META Refresh test cases verified: Win98 2001060409
Status: RESOLVED → VERIFIED
QA Contact: benc → tever
Keywords: qawanted
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: