Closed
Bug 7399
Opened 25 years ago
Closed 25 years ago
Escaping illegal chars in URLs
Categories
(Core :: Networking, defect, P3)
Tracking
M9
People
(Reporter: hjtoi-bugzilla, Assigned: gagan)
Details
Recently http://www.biztalk.org had spaces in links. They worked in IE and
Opera, but not in Netscape or Gecko. The site later changed the spaces to
underscores.
In the XML world, at least, the browser should escape ALL illegal characters in
URLs (I read a mail about that today, but can't remember which list it was on).
So if there are spaces in URLs, the browser should escape them as %20
automatically. Gecko already understands escaped URLs, so it is just a matter
of doing the escaping.
The URL points to a doc containing one link to a file with a space in its
name. IE handles that fine; NS and Gecko fail.
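The escaping asked for here can be sketched in a few lines of Python. This is only an illustration, not Gecko's code: the function name `escape_illegal` and the `SAFE` set are assumptions made for the example, and which characters count as legal depends on the URL spec you follow. The "%" sign is kept as-is so that URLs that are already escaped do not get escaped twice.

```python
from urllib.parse import quote

# Assumed set of characters to leave alone: common URL delimiters plus "%",
# so an existing escape such as %20 is not turned into %2520.
SAFE = ":/?#[]@!$&'()*+,;=%"


def escape_illegal(url: str) -> str:
    """Percent-encode anything outside SAFE and the unreserved characters."""
    # quote() never encodes letters, digits, "_", ".", "-" or "~".
    return quote(url, safe=SAFE)


print(escape_illegal("http://example.com/my file.html"))
# prints http://example.com/my%20file.html
```

With this approach a link containing a space is sent as %20, while a link that was already escaped by the author passes through unchanged.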
Comment 1•25 years ago
There are some problems with this:
1) Different URL RFCs have different ideas of what the illegal characters are.
2) Should the URL, as given in the document, already be legal? Is it the job of
the browser to correct a URL when the correction might mess up the server?
(What do current browsers do here?)
I think one may end up having to stick to tradition on this, but I'm not really
sure what the URL RFCs say about correction of URLs.
(When the site you mention above had spaces in links, was the whole thing in
quotes? If not, then the problem was with parsing.)
Comment 2 (Reporter)•25 years ago
It took some time to find where I had read that piece about illegal characters
in URIs (note, _URI_). The URLs below should answer your questions.
The discussion happened on XML-DEV. Here is a link to the archive and the
thread you should read:
http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-May-1999/0573.html
Here are some extracted relevant URLs from the discussion:
http://www.w3.org/TR/WD-charmod#URIs
http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2
Comment 3•25 years ago
According to those last two links, which point to the last working draft of
the W3C Character Model and HTML 4 section B.2.1 respectively, we should
indeed be escaping URIs.
1) We should probably take the superset. That way all bases are covered.
2) Yes, the URI in the document should indeed be legal. No, I would say that it
is not our job to correct it. However, we should certainly not be sending
invalid URIs to servers, so I suggest encoding would be best.
Currently, we are dropping spaces in URIs altogether (this happens somewhere in
the content sink, see bug 8319). We should certainly not be doing this.
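To make the "superset" idea and the difference between dropping and encoding concrete, here is a rough Python sketch. It is not the content sink's code: `ILLEGAL_A` and `ILLEGAL_B` are made-up stand-ins for the character lists of two different URL specs, and `encode_illegal` and `drop_spaces` are hypothetical helper names.

```python
# Hypothetical "illegal character" lists from two different URL specs.
# Real specs list more characters; these are placeholders for the example.
ILLEGAL_A = set(' <>"')
ILLEGAL_B = set(' {}|\\^`')

# Taking the superset (union) covers both specs.
ILLEGAL = ILLEGAL_A | ILLEGAL_B


def encode_illegal(url: str) -> str:
    """Replace each illegal or non-printable-ASCII character with %XX bytes."""
    out = []
    for ch in url:
        if ch in ILLEGAL or ord(ch) < 0x21 or ord(ch) > 0x7E:
            # Encode the character's UTF-8 bytes, one %XX per byte.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


def drop_spaces(url: str) -> str:
    """The lossy behaviour described above: spaces simply vanish."""
    return url.replace(" ", "")


link = "http://example.com/a b.html"
print(drop_spaces(link))     # http://example.com/ab.html (a different file)
print(encode_illegal(link))  # http://example.com/a%20b.html
```

Dropping the space requests a different resource than the author named, while encoding keeps the name intact and still sends a valid URI to the server.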
Changing all Networking Library/Browser bugs to Networking-Core component for
Browser.
Occasionally, Bugzilla will burp and cause Verified bugs to reopen when I do
this in a bulk change. If this happens, I will fix. ;-)
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → DUPLICATE
Updated•25 years ago
Status: RESOLVED → VERIFIED
Bulk move of all Networking-Core (to be deleted component) bugs to new
Networking component.