Closed
Bug 29044
Opened 25 years ago
Closed 25 years ago
Mac SSL: pictures not loading properly over https
Categories
(Core :: Networking, defect, P3)
Tracking
()
VERIFIED
FIXED
People
(Reporter: mw, Assigned: rpotts)
References
Details
(Whiteboard: [PDT+] 2/28/00 (patch works on Mac))
Attachments
(1 file)
(deleted),
patch
|
Details | Diff | Splinter Review |
Symptoms: When loading a page via https, pictures at first seem to load, then
after some time elapses, the pictures disappear to be replaced by their ALT
text.
What I know thus far:
- The data is passed via PSM to the browser using a local socket connection.
- Cartman properly closes the socket from its end using
PR_Close/OTSndOrderlyDisconnect. However, the browser does not respond properly
to the orderly disconnect.
- This is somewhat perplexing, because when NSPR sockets are created on the Mac,
they are automatically attached to a notifier routine (called, appropriately
enough, NotifierRoutine) which responds to T_ORDREL (the event that happens when
the peer closes the socket normally) by calling OTRcvOrderlyDisconnect. However,
OT Session Watcher reports that no such call was made.
- Doug Turner indicated to me earlier today that the SSL socket code in Necko
pushes a new IO layer onto the NSPR socket; I am beginning to wonder if that is
causing problems. I plan to investigate further this evening.
This will take me an indeterminate amount of time, because I am not as familiar
with the Necko or socket code as anyone who has worked in the code.
Reporter | ||
Comment 1•25 years ago
|
||
Attaching to 19119, asking for PDT eval.
Comment 2•25 years ago
|
||
Adding warren to the CC list in case he can chime in with pointers or
suggestions.
Thanks,
Jim
Putting on PDT+ radar for beta1. Need est to fix in Status Whiteboard ASAP
please.
Whiteboard: [PDT+]
Reporter | ||
Comment 4•25 years ago
|
||
I put in a date of 2/25 for the fix, but that is a patently untrue date. The
real truth is that this is a nondeterministic, low-level bug, probably in NSPR.
I am having difficulty tracking this down, and do not have any idea when it will
be done.
Whiteboard: [PDT+] → [PDT+] 2/25
Reporter | ||
Comment 5•25 years ago
|
||
Since I have heard a whole lot of nothing here, I thought I'd add a few more
data points.
NSPR is, as far as I can tell, properly returning an EOF when it is done reading
data from the socket. I have traced the code back into the level below
nsSocketTransport::doRead; I am in the process of determining where this code is
coming from. In any case, this intermediate layer is conflating the EOF with
other data that had previously been read, returning the whole lump and
pretending to be readable.
Since there is no more data to be read, an error condition comes in at poll
time, and the socket is closed with an error, resulting in no pictures.
Does this jog anyone's memory?
Reporter | ||
Comment 6•25 years ago
|
||
[Adding wtc since he will be able to confirm/deny NSPR behavior]
I want to point your collective attention to the following lines:
(A)
http://lxr.mozilla.org/seamonkey/source/netwerk/base/src/nsSocketTransport.cpp#9
46
(B) http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsPipe2.cpp#659
(C)
http://lxr.mozilla.org/seamonkey/source/netwerk/base/src/nsSocketTransport.cpp#9
58
(D) http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsPipe2.cpp#676
nsSocketTransport calls the underlying pipe's WriteSegment method at line 946
(A) above. The WriteSegment call loops, calling the reader method at nsPipe2.cpp
(B). In the case of the picture I am testing, it loops twice: once to read the
picture data, the second picks up EOF from the socket as it should. However, the
EOF state is not propagated up, apparently due to the first read call returning
some number of bytes. nsSocketTransport incorrectly assumes (at (C)) that the
socket would block and presumably goes back to poll on the socket. However, I
believe PR_Poll will return an error (POLLHUP) on the socket since EOF has been
reached. This would cause the picture to go away.
Now, I have confirmed (at (D)) that the pipe state is properly set to
NS_BASE_STREAM_CLOSED. What would be the best way to propagate this state up to
nsSocketTransport, given that the zero-length read is not being passed up?
Let me know what the right fix is here.
Comment 7•25 years ago
|
||
Rick Potts needs to look at this.
Assignee | ||
Comment 8•25 years ago
|
||
This sucks! It sounds like we need to be propagating the NS_BASE_STREAM_CLOSED
result out through WriteSegments(...) so the socket transport knows not to put
the socket back in the poll list.
I guess the other option is to add some getter on nsIBufferInputStream that
returns whether the stream has reached eof...
Any idea why this is only a problem on the Mac? It appears that on the other
platforms, doing the extra poll to detect EOF is working...
-- rick
Reporter | ||
Comment 9•25 years ago
|
||
I am not certain why PR_Poll has a problem with this only on the Mac -- wtc or
gordon might have a better idea there. I believe that the SSL socket code is the
only place in the Seamonkey code where a new IO layer is pushed onto a
PRFileDesc (PR_PushIOLayer), so for all I know that could affect PR_Poll results
in this case.
In any case, Rick seems to know at least as much as anyone else what the proper
fix is here, so I will assign this bug to him. Feel free to reassign back to me
if you think there is something else happening at a lower level or in SSL that I
need to look at. Likewise, if you have a potential fix available, I'll be all
too happy to drop it into my tree and try it out. (Simon Fraser just switched
the builds to use the correct NSPR tag, so PSM should work in tomorrow's Mac
commercial builds.)
If it helps, the URL of the image I was using for testing is
(https://www.waiter.com/wwwsys/images/mywtrlogo.gif).
Assignee: mwelch → rpotts
Comment 10•25 years ago
|
||
Good detective work Mark. Rick and I discussed this and think we have a
solution.
What's going on is that the pipe model is that as you read from a pipe, the EOF
marker logically sits at the end of the data. And if you read multiple times
while at EOF, you'll just get EOF back each time. However, sockets don't work
this way -- if you try and put them back in the select set and they've been
closed, you'll get an error, not EOF again.
What was happening was that we would do a read, and the underlying socket would
get EOF, but there would still be data in the pipe. Since we wanted to be fair
to other clients, we would only take some of the data. Since the pipe didn't
report EOF, we were rescheduling the socket with the select set for subsequent
reading -- but the socket had already been closed. Hence the HUP.
So basically, once we've detected EOF when reading from a socket, we have to
remember that fact in the socket transport so that we don't end up putting the
socket back in the select set again later. We can do this by noting that fact
in the (up until now unused) OnClose callback on the socket transport.
BTW, this is the mac manifestation of the same bug we had on linux some weeks
ago that I tried to fix by handling POLL_HUP, bug 23778.
Updated•25 years ago
|
Whiteboard: [PDT+] 2/25 → [PDT+] 2/28/00
Assignee | ||
Comment 11•25 years ago
|
||
Hey Mark,
I'm attaching a patch which should fix the problem. Basically, I pass a
structure into nsBufferInputStream::WriteSegments(...) which allows me to
capture the EOF inside of nsReadFromSocket(...) and propagate it back out to
nsSocketTransport::doRead(...).
If the EOF flag is set, the transport moves into the "done" state immediately
(without putting the FD back on the POLL list one last time :-)
Let me know if this fixes your problem...
-- rick
Assignee | ||
Comment 12•25 years ago
|
||
Reporter | ||
Comment 13•25 years ago
|
||
The patch looks like the right solution based on my understanding. I'll have to
try it out tomorrow morning when I get in front of my Mac.
Many thanks.
Reporter | ||
Comment 14•25 years ago
|
||
Just got in, will try patch.
Assignee | ||
Updated•25 years ago
|
Whiteboard: [PDT+] 2/28/00 → [PDT+] 2/28/00 (waiting on feedback re: patch)
Reporter | ||
Updated•25 years ago
|
Whiteboard: [PDT+] 2/28/00 (waiting on feedback re: patch) → [PDT+] 2/28/00 (patch works on Mac)
Reporter | ||
Comment 15•25 years ago
|
||
Patch appears to work. Managed to load several pages, and the pictures all
appeared. Many thanks.
Is there any way to check this in tonight? PDT seems to really want this (and
all other Mac SSL-related changes) in tomorrow's build.
Assignee | ||
Comment 16•25 years ago
|
||
checked in.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•