04-22-2008, 01:01 AM
Yil

Doh! You remember that lockup bug? Well, I spoke too soon: it happened again on the 4-processor system. The "good" news is that I found a pattern! Every single pure lockup dump (over 10 now) appears to be the result of an ioFTPD timeout that forces the socket closed to cancel all the outstanding requests.

The interesting thing is that I don't see how this code can be executed twice, since there is a test for that. But it appears that it is: the timeout return code means the timeout happened, yet the call stack is not from that call... What's tricky here is that this code calls winsock library functions that should return errors. They are never supposed to lock up the entire process. So while ioFTPD may be calling them in some improper way, the "lockup" bug is actually a winsock bug. That's my story and I'm sticking to it!
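For anyone wondering what a "has this already run?" test can look like when several threads might reach the close path at the same moment, here is a minimal sketch. It is not ioFTPD's actual code: conn_t, conn_init and conn_close_once are made-up names, and the spot where a real server would call closesocket is only marked with a comment. It just shows one common way to guarantee that exactly one caller performs the close, using a C11 atomic flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection; NOT ioFTPD's real structure. */
typedef struct {
    int         fd;          /* placeholder for the socket handle */
    atomic_flag closing;     /* set by whichever path starts the close first */
    int         close_calls; /* how many times the close body actually ran */
} conn_t;

static void conn_init(conn_t *c, int fd) {
    assert(c != NULL);
    c->fd = fd;
    atomic_flag_clear(&c->closing);
    c->close_calls = 0;
}

/* Returns true only for the one caller that actually performed the close.
 * Every later (or concurrent) caller gets false and touches nothing. */
static bool conn_close_once(conn_t *c) {
    assert(c != NULL);
    /* test_and_set atomically sets the flag and returns its previous value,
     * so the check and the claim cannot be split by another thread. */
    if (atomic_flag_test_and_set(&c->closing)) {
        return false;
    }
    /* A real server would call closesocket() on the handle here. */
    c->close_calls++;
    c->fd = -1;
    return true;
}
```

Calling conn_close_once twice on the same connection returns true the first time and false the second, and the close body runs only once. Whether the existing test in the timeout path behaves like this is exactly what still needs checking.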

Having found the common thread, and knowing that the timeout code forces a closesocket, I've learned some stuff and should now be able to duplicate the problem locally and thus solve it.

I just want to get this thing stable so I can start breaking it with new features again.

FYI: 6.4 will eventually fix this problem, and 6.5 will introduce changes to the userfile for some new options that might break old compiled scripts. Tcl-based scripts will likely work fine, though. I might be able to offer some sort of compatibility mode that hides the new userfile fields or something, but no promises.