[Haskell-cafe] sendfile leaking descriptors on Linux?
06-02-10, 12:14 PM
Re: [Haskell-cafe] Re: sendfile leaking descriptors on Linux?
On Sat, Feb 06, 2010 at 09:16:35AM +0100, Bardur Arantsson wrote:
> Brandon S. Allbery KF8NH wrote:
> >On Feb 5, 2010, at 02:56 , Bardur Arantsson wrote:
> [--snip--]
> >
> >"Broken pipe" is normally handled as a signal, and is only mapped
> >to an error if SIGPIPE is set to SIG_IGN. I can well imagine that
> >the SIGPIPE signal handler isn't closing resources properly; a
> >workaround would be to use the System.Posix.Signals API to ignore
> >SIGPIPE, but I don't know if that would work as a general solution
> >(it would depend on what other uses of pipes/sockets exist).
>
> It was a good idea, but it doesn't seem to help to add
>
> installHandler sigPIPE Ignore (Just fullSignalSet)
>
> to the main function. (Given the package name I assume
> System.Posix.Signals works similarly to regular old signals, i.e.
> globally per-process.)
>
> This is really starting to drive me round the bend...
Have you seen GHC ticket #1619?
http://hackage.haskell.org/trac/ghc/ticket/1619
> One further thing I've noticed: When compiling on my 64-bit machine,
> ghc issues the following warnings:
>
> Linux.hsc:41: warning: format ‘%d’ expects type ‘int’, but argument
> 3 has type ‘long unsigned int’
> Linux.hsc:45: warning: format ‘%d’ expects type ‘int’, but argument
> 3 has type ‘long unsigned int’
> Linux.hsc:45: warning: format ‘%d’ expects type ‘int’, but argument
> 3 has type ‘long unsigned int’
> Linux.hsc:45: warning: format ‘%d’ expects type ‘int’, but argument
> 3 has type ‘long unsigned int’
>
> Those lines are:
>
> 39: -- max num of bytes in one send
> 40: maxBytes :: Int64
> 41: maxBytes = fromIntegral (maxBound :: (#type ssize_t))
>
> and
>
> 44: foreign import ccall unsafe "sendfile64" c_sendfile
> 45: :: Fd -> Fd -> Ptr (#type off_t) -> (#type size_t) -> IO
> (#type ssize_t)
>
> This looks like a typical 32/64-bit problem, but normally I would
> expect any real run-time problems caused by a problematic conversion
> in the FFI to crash the whole process. Maybe I'm wrong about this...
To convert those '#' constructs, the hsc2hs preprocessor generates a
C file containing things like 'printf("%d", sizeof(ssize_t))' and builds
it with the system's C compiler. That avoids having to encode the ABI of
every platform (in order to know the memory layout of the
structures).
So that warning comes from the generated C file, not from your Haskell
one. At runtime it really doesn't matter.
--
Felipe.
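
For anyone who wants to check the sizes on their own machine without going through hsc2hs, here is a minimal sketch. It assumes that the `CSsize`, `CSize` and `COff` types from base correspond to what `(#type ssize_t)`, `(#type size_t)` and `(#type off_t)` expand to on the same platform; that correspondence is an assumption of the sketch, not something taken from the sendfile sources.

```haskell
import Foreign.C.Types (CSize)
import Foreign.Storable (sizeOf)
import System.Posix.Types (COff, CSsize)

-- Print the sizes (in bytes) of the C types used in the sendfile binding,
-- plus the upper bound that maxBytes is derived from.
main :: IO ()
main = do
  putStrLn ("sizeOf ssize_t   = " ++ show (sizeOf (undefined :: CSsize)))
  putStrLn ("sizeOf size_t    = " ++ show (sizeOf (undefined :: CSize)))
  putStrLn ("sizeOf off_t     = " ++ show (sizeOf (undefined :: COff)))
  putStrLn ("maxBound ssize_t = " ++ show (maxBound :: CSsize))
```

If all three sizes come out as 8 on the 64-bit machine, the Int64 types used in the binding line up with the C side, which would fit the point that the warning is only about the generated C file.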
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
06-02-10, 05:18 PM
[Haskell-cafe] Re: sendfile leaking descriptors on Linux?
Felipe Lessa wrote:
> On Sat, Feb 06, 2010 at 09:16:35AM +0100, Bardur Arantsson wrote:
>> Brandon S. Allbery KF8NH wrote:
>>> On Feb 5, 2010, at 02:56 , Bardur Arantsson wrote:
>> [--snip--]
>>> "Broken pipe" is normally handled as a signal, and is only mapped
>>> to an error if SIGPIPE is set to SIG_IGN. I can well imagine that
>>> the SIGPIPE signal handler isn't closing resources properly; a
>>> workaround would be to use the System.Posix.Signals API to ignore
>>> SIGPIPE, but I don't know if that would work as a general solution
>>> (it would depend on what other uses of pipes/sockets exist).
>> It was a good idea, but it doesn't seem to help to add
>>
>> installHandler sigPIPE Ignore (Just fullSignalSet)
>>
>> to the main function. (Given the package name I assume
>> System.Posix.Signals works similarly to regular old signals, i.e.
>> globally per-process.)
>>
>> This is really starting to drive me round the bend...
>
> Have you seen GHC ticket #1619?
>
> http://hackage.haskell.org/trac/ghc/ticket/1619
>
>
I hadn't. I guess the conclusion is that SIGPIPE is ignored by default anyway. So much
for that.
During yet another bout of debugging, I've added even more "I am here" instrumentation
code to the SendFile code, and the culprit seems to be threadWaitWrite. Here's the bit
of code I've modified:
> sendfile :: Fd -> Fd -> Ptr Int64 -> Int64 -> IO Int64
> sendfile out_fd in_fd poff bytes = do
>   putStrLn "PRE-threadWaitWrite"
>   threadWaitWrite out_fd
>   putStrLn "AFTER threadWaitWrite"
>   sbytes <- c_sendfile out_fd in_fd poff (fromIntegral bytes)
>   putStrLn $ "AFTER c_sendfile; result was: " ++ (show sbytes)
>   if sbytes <= -1
>     then do errno <- getErrno
>             if errno == eAGAIN
>               then sendfile out_fd in_fd poff bytes
>               else throwErrno "Network.Socket.SendFile.Linux"
>     else return (fromIntegral sbytes)
This is the output when a file descriptor is lost:
---
AFTER sendfile: sbytes=27512
DIFFERENCE: 627264520
remaining=627264520, bytes=627264520
PRE-threadWaitWrite
Got request for CONTENT for objectId=1700000000000000,f215040000000000
Serving file 'X'...
Sending 625838080 bytes...
in_fd=13
---
So I have to conclude that threadWaitWrite is doing *something* which causes
the thread to die when the PS3 kills the connection.
06-02-10, 05:45 PM
[Haskell-cafe] Re: sendfile leaking descriptors on Linux?
Bardur Arantsson wrote:
(sorry about replying-to-self)
> During yet another bout of debugging, I've added even more "I am here"
> instrumentation code to the SendFile code, and the culprit seems to be
> threadWaitWrite.
I think I've pretty much confirmed this.
I've changed the code again. This time to:
> sendfile :: Fd -> Fd -> Ptr Int64 -> Int64 -> IO Int64
> sendfile out_fd in_fd poff bytes = do
>   putStrLn "PRE-threadWaitWrite"
>   -- threadWaitWrite out_fd
>   -- putStrLn "AFTER threadWaitWrite"
>   sbytes <- c_sendfile out_fd in_fd poff (fromIntegral bytes)
>   putStrLn $ "AFTER c_sendfile; result was: " ++ (show sbytes)
>   if sbytes <= -1
>     then do errno <- getErrno
>             if errno == eAGAIN
>               then do
>                 threadDelay 100
>                 sendfile out_fd in_fd poff bytes
>               else throwErrno "Network.Socket.SendFile.Linux"
>     else return (fromIntegral sbytes)
That is, I removed the threadWaitWrite in favor of just adding a
"threadDelay 100" when eAGAIN is encountered.
With this code, I cannot provoke the leak.
Unfortunately this isn't really a solution -- the CPU is pegged at
~50% when I do this and it's not exactly elegant to have a hardcoded
100 microsecond delay in there (threadDelay takes microseconds, not milliseconds).
I'm hoping that someone who understands the internals of GHC can chime
in here with some kind of explanation as to if/why/how threadWaitWrite can
fail in this way.
Anyone?
Cheers,
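
A side note on the polling workaround: the argument of `threadDelay` is in microseconds, so a fixed `threadDelay 100` retries roughly ten thousand times a second while the socket is not writable. One way to soften that, purely as a sketch, is to double the delay after every unsuccessful attempt up to a cap. The helper name `retryWithBackoff` and the `fakeSend` stand-in are hypothetical and not part of the sendfile package; a real version would map `eAGAIN` to `Nothing`.

```haskell
import Control.Concurrent (threadDelay)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- | Retry an action that signals "would block" by returning Nothing.
-- The delay (in microseconds) doubles after each failed attempt,
-- but never exceeds maxDelayUs.
retryWithBackoff :: Int -> Int -> IO (Maybe a) -> IO a
retryWithBackoff delayUs maxDelayUs action = do
  r <- action
  case r of
    Just x  -> return x
    Nothing -> do
      threadDelay delayUs
      retryWithBackoff (min maxDelayUs (delayUs * 2)) maxDelayUs action

main :: IO ()
main = do
  attempts <- newIORef (0 :: Int)
  -- Hypothetical stand-in for c_sendfile: "would block" three times, then succeeds.
  let fakeSend = do
        modifyIORef' attempts (+ 1)
        n <- readIORef attempts
        return (if n < 4 then Nothing else Just n)
  result <- retryWithBackoff 1000 100000 fakeSend
  print result  -- prints 4
```

This only reduces the CPU cost of polling; it does not explain why threadWaitWrite appears to hang, and it trades CPU for some extra latency on a socket that becomes writable again quickly.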
06-02-10, 10:01 PM
Re: [Haskell-cafe] Re: sendfile leaking descriptors on Linux?
Me too.
2010/2/5 MightyByte <mightybyte@gmail.com>:
> I've been seeing a steady stream of similar "resource vanished" messages
> for as long as I've been running my happstack app. The message I get
> is this:
>
> <socket: 58>: hClose: resource vanished (Broken pipe)
>
> I run my app from a shell script inside a "while true" loop, so it
> automatically gets restarted if it crashes. This incurs no more than
> a few seconds of down time. Since that is acceptable for my
> application, I've never put much effort into investigating the issue.
> But I don't think the resource vanished error results in program
> termination. When I have looked into it, I've had similar trouble
> reproducing it. Clients such as wget and firefox don't seem to cause
> the problem. If I remember correctly it only happens with IE.
>
> On Fri, Feb 5, 2010 at 2:56 AM, Bardur Arantsson <spam@scientician.net> wrote:
>> Jeremy Shaw wrote:
>>>
>>> Actually,
>>>
>>> We should start by testing if native sendfile leaks file descriptors even
>>> when the whole file is sent. We have a test suite, but I am not sure if it
>>> tests for file handle leaking...
>>>
>>
>> I should have posted this earlier, but the exact message I'm seeing in the
>> case where the Bad Client disconnects is this:
>>
>>   hums: Network.Socket.SendFile.Linux: resource vanished (Broken pipe)
>>
>> Oddly, I haven't been able to reproduce this using a wget client with a ^C
>> during transfer. When I "disconnect" wget with ^C or "pkill wget" or even
>> "pkill -9 wget", I get this message:
>>
>>   hums: Network.Socket.SendFile.Linux: resource vanished (Connection reset by peer)
>>
>> (and no leak, as observed by "lsof | grep hums").
>>
>> So there appears to be some vital difference between the handling of the two
>> cases.
>>
>> Another observation which may be useful:
>>
>> Before the sendfile' API change (Handle -> FilePath) in sendfile-0.6.x, my
>> code used "withFile" to open the file and to ensure that it was closed. So
>> it seems that withBinaryFile *should* also be fine. Unless the "Broken pipe"
>> error somehow escapes the scope without causing a "close".
>>
>> I don't have time to dig more right now, but I'll try to see if I can find
>> out more later.
>>
>> Cheers,
>>
>
07-02-10, 09:14 AM
[Haskell-cafe] Re: sendfile leaking descriptors on Linux?
Bardur Arantsson wrote:
> Bardur Arantsson wrote:
>
> (sorry about replying-to-self)
>
>> During yet another bout of debugging, I've added even more "I am here"
>> instrumentation code to the SendFile code, and the culprit seems to be
>> threadWaitWrite.
>
As Jeremy Shaw pointed out off-list, the symptoms are also consistent
with a thread that simply gets stuck in threadWaitWrite.
I've tried a couple of different solutions to this based on starting a
separate thread to enforce a timeout on threadWaitWrite (using throwTo).
It seems to work to prevent the file descriptor leak, but causes GHC
to segfault after a while. Probably some sort of other resource exhaustion
since my code is just an evil hack:
> killer :: MVar () -> ThreadId -> IO ()
> killer dontKill otherThread = do
>   threadDelay 5000
>   x <- tryTakeMVar dontKill
>   case x of
>     Just _  -> putStrLn "Killer thread expired"
>     Nothing -> throwTo otherThread (Overflow)
where the relevant bit of sendfile reads:
>   mtid <- myThreadId
>   dontKill <- newEmptyMVar
>   forkIO $ killer dontKill mtid
>   threadWaitWrite out_fd
>   putMVar dontKill ()
So I'm basically creating a thread for every single "threadWaitWrite" operation
(which is a lot in this case).
Anyone got any ideas on a simpler way to handle this? Maybe I should just
report a bug for threadWaitWrite? IMO threadWaitWrite really should
throw some sort of IOException if the FD goes dead while it's waiting.
I suppose an alternative way to try to work around this would be by forcing the output
socket into blocking (as opposed to non-blocking) mode, but I can't figure out how to
do this on GHC 6.10.x -- I only see setNonBlockingFD which doesn't take a parameter
unlike its 6.12.x counterpart.
Cheers,
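
One possibly simpler variant of the killer-thread idea is `System.Timeout.timeout` from base, which wraps an action and returns `Nothing` if it does not finish in time. The sketch below is untested against the PS3 scenario described in this thread; the helper name `waitWritableWithin` is hypothetical, and the pipe in `main` is only there to have a descriptor to wait on.

```haskell
import Control.Concurrent (threadWaitWrite)
import Data.Maybe (isJust)
import System.Posix.IO (closeFd, createPipe)
import System.Posix.Types (Fd)
import System.Timeout (timeout)

-- | Wait until the descriptor is writable, giving up after the given
-- number of microseconds. True means writable, False means timed out.
waitWritableWithin :: Int -> Fd -> IO Bool
waitWritableWithin micros fd = isJust <$> timeout micros (threadWaitWrite fd)

main :: IO ()
main = do
  -- A fresh pipe stands in for the client socket in this sketch.
  (readEnd, writeEnd) <- createPipe
  ok <- waitWritableWithin 5000000 writeEnd
  putStrLn (if ok then "writable" else "timed out")
  closeFd readEnd
  closeFd writeEnd
```

In sendfile, a `False` result could be turned into an IOException so that the caller's cleanup (closing in_fd) runs. Whether this avoids the segfault seen with the hand-rolled killer is not something this sketch can show.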
07-02-10, 04:00 PM
Re: [Haskell-cafe] Re: sendfile leaking descriptors on Linux?
It's not clear to me that this is actually a bug in threadWaitWrite.
I believe that under Linux, select() does not wake up just because the file
descriptor was closed. (Under Windows, and possibly Solaris/BSD/etc. it
does). So this behavior might be consistent with normal Linux behavior.
However, it is clearly annoying that (a) the expected behavior is not
documented and (b) the behavior might be different under Linux than under other OSes.
In some sense it is correct -- if the file descriptor is closed, then we
certainly can't write more to it -- so threadWaitWrite need not wake up.
But that leaves us with the issue of needing some way to be notified that
the file descriptor was closed so that we can clean up after ourselves.
- jeremy
07-02-10, 04:23 PM
[Haskell-cafe] Re: sendfile leaking descriptors on Linux?
Jeremy Shaw wrote:
> It's not clear to me that this is actually a bug in threadWaitWrite.
>
> I believe that under Linux, select() does not wakeup just because the file
> descriptor was closed.
select() has the option of specifying an "exceptfds" FD_SET where I'd
certainly _expect_ select() to flag an FD if it's closed. Annoyingly,
the man page is not very specific about what an "exception" is, so it's
hard to be sure.
> (Under Windows, and possibly solaris/BSD/etc it
> does). So this behavior might be consistent with normal Linux behavior.
> However, it is clearly annoying that (a) the expected behavior is not
> documented (b) the behavior might be different under Linux than other OSes.
>
> In some sense it is correct -- if the file descriptor is closed, then we
> certainly can't write more to it -- so threadWaitWrite need not wake up..
> But that leaves us with the issue of needing someway to be notified that
> the file descriptor was closed so that we can clean up after ourselves..
>
True, it is perhaps technically not a bug, but it is certainly a
misfeature since there is no easy way (at least AFAICT) to discover that
something bad has happened for the file descriptor and act accordingly.
AFAICT any solution would have to be based on a separate thread which
either 1) "checks" the FD periodically somehow, or 2) simply lets the
thread doing the threadWaitWrite time out after a set period of
inactivity. Neither is very optimal.
Either way, I'd certainly expect the sendfile library to work around
this somehow such that this situation doesn't occur. I'm just having a
hard time thinking up a good solution.
Cheers,
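
To make option 1) a little more concrete, here is a rough sketch of "checking" a descriptor with a zero-timeout poll(2), which reports POLLERR/POLLHUP/POLLNVAL in revents even when no events are requested. Several things are assumptions of the sketch: the helper name `fdLooksDead` is hypothetical, the hand-written struct pollfd layout (int fd at offset 0, short events at 4, short revents at 6, 8 bytes in total), the constant values and the use of unsigned long for nfds_t are the usual Linux ones and would need hsc2hs to be portable, and a pipe stands in for the client socket.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Data.Bits ((.&.))
import Foreign.C.Types (CInt (..), CShort, CULong (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)
import System.Posix.IO (closeFd, createPipe)
import System.Posix.Types (Fd (..))

foreign import ccall unsafe "poll"
  c_poll :: Ptr () -> CULong -> CInt -> IO CInt

-- | Non-blocking check: True if poll(2) flags an error, a hangup or an
-- invalid descriptor. Layout and constants are assumed Linux values.
fdLooksDead :: Fd -> IO Bool
fdLooksDead (Fd fd) =
  allocaBytes 8 $ \p -> do
    pokeByteOff p 0 fd                 -- pollfd.fd
    pokeByteOff p 4 (0 :: CShort)      -- pollfd.events: nothing requested
    pokeByteOff p 6 (0 :: CShort)      -- pollfd.revents
    _ <- c_poll p 1 0                  -- timeout 0: return immediately
    revents <- peekByteOff p 6 :: IO CShort
    -- POLLERR (0x08) | POLLHUP (0x10) | POLLNVAL (0x20)
    return (revents .&. 0x38 /= 0)

main :: IO ()
main = do
  (readEnd, writeEnd) <- createPipe
  before <- fdLooksDead writeEnd
  closeFd readEnd                      -- simulate the peer going away
  after <- fdLooksDead writeEnd
  print (before, after)
  closeFd writeEnd
```

A watchdog thread could call something like this periodically and throw to the thread stuck in threadWaitWrite. Whether a TCP connection dropped by the PS3 shows up the same way as a closed pipe end is not established here, so this only illustrates the mechanism.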
09-02-10, 11:51 PM
Re: [Haskell-cafe] Re: sendfile leaking descriptors on Linux?
On Sun, Feb 7, 2010 at 9:22 AM, Bardur Arantsson <spam@scientician.net> wrote:
> True, it is perhaps technically not a bug, but it is certainly a misfeature
> since there is no easy way (at least AFAICT) to discover that something bad
> has happened for the file descriptor and act accordingly. AFAICT any
> solution would have to be based on a separate thread which either 1)
> "checks" the FD periodically somehow, or 2) simply lets the thread doing the
> threadWaitWrite time out after a set period of inactivity. Neither is very
> optimal.
>
> Either way, I'd certainly expect the sendfile library to work around this
> somehow such that this situation doesn't occur. I'm just having a hard time
> thinking up a good solution.
>
>
Well, it is certainly a bug in sendfile that needs to be fixed. I'm not sure
how to fix it either. If we can simplify the test case, we can ask Simon
Marlow..
- jeremy
10-02-10, 01:48 AM
Re: [Haskell-cafe] Re: sendfile leaking descriptors on Linux?
Matt, have you seen this thread?
Jeremy, are you saying this is a bug in the sendfile library on hackage,
or something underlying?
thomas.
2010/2/9 Jeremy Shaw <jeremy@n-heptane.com>:
> On Sun, Feb 7, 2010 at 9:22 AM, Bardur Arantsson <spam@scientician.net>
> wrote:
>>
>> True, it is perhaps technically not a bug, but it is certainly a
>> misfeature since there is no easy way (at least AFAICT) to discover that
>> something bad has happened for the file descriptor and act accordingly.
>> AFAICT any solution would have to be based on a separate thread which either
>> 1) "checks" the FD periodically somehow, or 2) simply lets the thread doing
>> the threadWaitWrite time out after a set period of inactivity. Neither is
>> very optimal.
>>
>> Either way, I'd certainly expect the sendfile library to work around this
>> somehow such that this situation doesn't occur. I'm just having a hard time
>> thinking up a good solution.
>
> Well, it is certainly a bug in sendfile that needs to be fixed. I'm not sure
> how to fix it either. If we can simplify the test case, we can ask Simon
> Marlow..
> - jeremy
>
>
10-02-10, 04:53 PM
Re: [Haskell-cafe] Re: sendfile leaking descriptors on Linux?
On Feb 9, 2010, at 6:47 PM, Thomas Hartman wrote:
> Matt, have you seen this thread?
>
> Jeremy, are you saying this is a bug in the sendfile library on hackage,
> or something underlying?
I'm saying that the behavior of the sendfile library is buggy. But it
could be due to something underlying.
Either threadWaitWrite is buggy and should be fixed, or
threadWaitWrite is doing the right thing and sendfile needs to be
modified somehow to account for the behavior. But I don't know which
is the case or how to implement a solution to either option.
- jeremy