Go Back   FlashFXP Forums > > > >

ioFTPD General New releases, comments, questions regarding the latest version of ioFTPD.

Reply
 
Thread Tools Rate Thread Display Modes
Old 08-12-2004, 11:52 PM   #1
darkone
Disabled
FlashFXP Registered User
ioFTPD Administrator
 
darkone's Avatar
 
Join Date: Dec 2001
Posts: 2,230
Default New core (revision 3)

Ok, as most of you know I've been working on non-ioFTPD related project for past 'few' months. While I've been busy with the other project, I've also done some plans (notepad drafts & quick performance tests) for the new io (input output) core.

It took 3rewrites to get it all working, but now I'm confident that I've overcome all the theoretical problems that I had with the fully asynchronous processing line.

Old core used to process everything in certain order. When reading data from ssl encrypted socket, it first called 'read' function. Once read completed, it called ´decryption' function. Finally when decryption completed, it called some other function (to look for linefeeds or so). Because everything is done in linear order, it is very easy to synchronize. 'read' -> 'decrypt' -> 'do something with buffer'

New core incorporates completely new ideology. Order of events is no longer predeterminated. 'read' function may be called at the same time as 'decryption' function is processing buffer returned by previous read call. It's also possible that function that does something with decrypted buffer, is also running while decryption is in progress...

As you can imagine, it is no longer a trivial task to get it synchronized properly. With proper manner I mean, that it is not acceptable for thread to block another for longer than few quantums (quantum = slice of cpu time thread gets from kernel to execute) Because whenever the pipeline stalls, the efficiency drops. Therefore I have tried to figure out a way to make stalls as short as possible, and in many ways I think I have succeeded in this.

I have placed high hopes for the new core. And when it is done, I hope I can say it was worth it. There are no examples of anything alike available - and AFAIK this method has never been done before. When I have the performance figures - and if they are what I expect them to be, I can be sure that it's unique solution.


- Old core: ~160mb/sec cached disk read (to socket)
- New core (rev 2): ~220mb/sec cached disk read (to socket)
- New core (rev 3): Faster than rev 2.
darkone is offline   Reply With Quote
Old 08-13-2004, 12:04 AM   #2
darkone
Disabled
FlashFXP Registered User
ioFTPD Administrator
 
darkone's Avatar
 
Join Date: Dec 2001
Posts: 2,230
Default

Notepad draft from yesterday:

http://www.ioftpd.com/~darkone/tmp/stupid.txt

As usual example does not compile, nor has any real functionality. Idea of example is to show, complexity (and or simplicity) of algorithms in use. For legal reasons I have to mention that, one is not allowed to use this code or its' director or indirect derivants without my permission.
darkone is offline   Reply With Quote
Old 08-13-2004, 12:57 AM   #3
peep
Senior Member
FlashFXP Scripter
ioFTPD Foundation User
 
Join Date: Sep 2003
Posts: 132
Default

Oh man, oh man. Have I been waiting for this post to come

Great going d1. Gonna go read your post more throughly now!
Keep doing what you do best!

Let's show them Linux servers who's got the fastest daemon/servers around

Microsoft Windows Server 2003 vs. Linux Competitive File Server

peep
peep is offline   Reply With Quote
Old 08-16-2004, 03:21 AM   #4
darkone
Disabled
FlashFXP Registered User
ioFTPD Administrator
 
darkone's Avatar
 
Join Date: Dec 2001
Posts: 2,230
Default

I wrapped up few tests for the new core...

ftp buffered file -> socket -> socket [ftp.exe] -> nul
- Parameters: FILE_FLAG_SEQUENTIAL_SCAN, SO_SNDBUF = 0, 3 * 32kb application write buffer, ftp.exe
- Result: ~180mb/sec
- Conclusion: No significant speed gains above the current core. Reduced memory and cpu (~25% in user mode) usage.

ftp unbuffered file -> socket -> socket [ftp.exe] -> null
- Parameters: FILE_FLAG_NO_BUFFERING, SO_SNDBUF = 0, 3 * 8kb application write buffer, ftp.exe
- Result: ~60mb/sec
- Conclusion: User mode cpu usage reduced significantly (to 1-2%), while kernel mode cpu usage increased somewhat. Greatest benefits when used asynchronous devices. (scsi, network shares)

buffered loopback transfer file -> socket -> socket -> null
- Parameters: FILE_FLAG_SEQUENTIAL_SCAN, no socket write buffers, 3 * 32kb application read buffer + 3 * 8kb application write buffer, within single daemon thread
- Result: ~120mb/sec transfer (240mb/sec throughput)
- Conclusion: Result is a bit lower than exepected. But because test was limited to single thread, it only utilized capacity of one processor.


It currently seems, that there is very little to be optimized in the main code - as nearly 100% of cpu-time is now used by mandatory calls: GetQueuedCompletionStatus(), WSASend(), WSARecv(), ReadFile() & WriteFile().
darkone is offline   Reply With Quote
Old 08-16-2004, 01:07 PM   #5
Mr_X
Senior Member
FlashFXP Registered User
ioFTPD Foundation User
 
Join Date: Sep 2003
Posts: 142
Default

I seen a long time ago an article on Onversity about optimizing loops. This may help you. Here's a link to the PDF file:
d-loop
Mr_X is offline   Reply With Quote
Old 08-19-2004, 12:35 PM   #6
darkone
Disabled
FlashFXP Registered User
ioFTPD Administrator
 
darkone's Avatar
 
Join Date: Dec 2001
Posts: 2,230
Default

I actually use similar optimizations already...

Code:
void foo(int lOperation)
{
  switch (lOperation) {
  case 0:
    DoSomethin1();
    break;
  case 1:
    DoSomething2();
    break;
  }
}
could be written as:

Code:
lpProc[2] = { DoSomething1, DoSomething2 };

void foo(int lOperation)
{
  lpProc[lOperation]();
}
darkone is offline   Reply With Quote
Old 08-20-2004, 04:15 AM   #7
darkone
Disabled
FlashFXP Registered User
ioFTPD Administrator
 
darkone's Avatar
 
Join Date: Dec 2001
Posts: 2,230
Default

Finished the first test round with 2048 test connections; total transfer speed was ~180mb/sec. In the test daemon was acting as both client and server, so number of downloading client connections was 1024 and uploading server connections was 1024 as well. Cpu usage was at ~80% on both cpus. Memory usage using 3read & 3write buffers of 16kb, was ~160mb.

=> There's still room for optimizations, but core seems to handle heavy io loads now with ease.
darkone is offline   Reply With Quote
Old 08-20-2004, 05:34 AM   #8
ADDiCT
Senior Member
FlashFXP Beta Tester
ioFTPD Scripter
 
Join Date: Aug 2003
Posts: 517
Default

(stupid question, but who am i ) do these optimiziations for extreme situations (+1000 transfers) produce any overhead for normal situations (~20 transfers) ?
ADDiCT is offline   Reply With Quote
Old 08-20-2004, 07:00 AM   #9
darkone
Disabled
FlashFXP Registered User
ioFTPD Administrator
 
darkone's Avatar
 
Join Date: Dec 2001
Posts: 2,230
Default

Ofcourse not Resources free by reduction in memory copying negate the overhead that required thread synchronization adds.

For uploads, I expect tremendenous performance improvent; seperate encryption thread pool is now gone - io threads are able to transform themselves to encryption threads when required. This means that thread is in many sitautations able to decrypt/encrypt/chash/whatever the buffer immeaditely after processing it. Just like with the old core, amount of threads performing such operations simultanously is definable parameter.
darkone is offline   Reply With Quote
Old 08-20-2004, 08:19 PM   #10
darkone
Disabled
FlashFXP Registered User
ioFTPD Administrator
 
darkone's Avatar
 
Join Date: Dec 2001
Posts: 2,230
Default

It seems that I'm getting close to the performance requirements I set to the core.

Transfer times for 800mb file, when daemon is working as both client and server:
1 connection: 6.3 seconds (253mb/sec)
2 connections: 12.6 seconds
10 connections: 62.6 seconds
100 connections: 619.8 seconds
1000 connections: 6285.2 seconds

Performance of single cached transfer seems to be always constant.

Just one odd thing.. I noticed I had pulled the wrong figure for old core performance: http://www.ioftpd.com/board/showthread.phpthreadid=3174

... and the odd thing is, that now that I try to transfer same file, I get lower performance (even with the old io) And I can't remember any changes since (other than I added 1Gb of memory)
darkone is offline   Reply With Quote
Old 08-21-2004, 02:09 AM   #11
Syrinx
Junior Member
 
Join Date: Jun 2004
Posts: 25
Default

"Transfer times for 800mb file, when daemon is working as both client and server:
1 connection: 6.3 seconds (253mb/sec)
"

The above reslt confuses me.How Can transfering a 800 mb file with 6.3 seconds,and the transfer speed is 253 mb/sec.

800 mb / 6.3 seconds = 253 mb/sec?
Syrinx is offline   Reply With Quote
Old 08-21-2004, 03:32 AM   #12
ADDiCT
Senior Member
FlashFXP Beta Tester
ioFTPD Scripter
 
Join Date: Aug 2003
Posts: 517
Default

800 MB / 6.3s = 126,98 MB/s

"when daemon is working as both client and server"

in those 6.3s, it has both uploaded and downloaded 800 MB, so it's 126,98 MB/s "full duplex", or a total of 253 MB/s
ADDiCT is offline   Reply With Quote
Old 08-21-2004, 04:54 AM   #13
iXi
Senior Member
ioFTPD Foundation User
 
Join Date: Nov 2002
Posts: 220
Default

Quote:
ftp unbuffered file -> socket -> socket [ftp.exe] -> null
- Parameters: FILE_FLAG_NO_BUFFERING, SO_SNDBUF = 0, 3 * 8kb application write buffer, ftp.exe
- Result: ~60mb/sec
- Conclusion: User mode cpu usage reduced significantly (to 1-2%), while kernel mode cpu usage increased somewhat. Greatest benefits when used asynchronous devices. (scsi, network shares)
d1 does io needs special buffer settings for asynchronous devices? because i'm using some scsi raids..

cya
iXi is offline   Reply With Quote
Old 08-21-2004, 09:23 AM   #14
darkone
Disabled
FlashFXP Registered User
ioFTPD Administrator
 
darkone's Avatar
 
Join Date: Dec 2001
Posts: 2,230
Default

Not likely that I'll support that FILE_FLAG_NO_BUFFERING, implementation seems to be costly. (amount of code compared to performance delta)
darkone is offline   Reply With Quote
Old 08-21-2004, 09:39 AM   #15
iXi
Senior Member
ioFTPD Foundation User
 
Join Date: Nov 2002
Posts: 220
Default

sounds quit good
iXi is offline   Reply With Quote
Reply

Tags
called, core, function, read, socket

Thread Tools
Display Modes Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -5. The time now is 07:30 PM.

Parts of this site powered by vBulletin Mods & Addons from DragonByte Technologies Ltd. (Details)