
Requests: Need a script or some sort of cool .bat file? Ask here!

Old 05-04-2005, 07:52 AM   #1
CoLt-[45]
Member
FlashFXP Registered User
ioFTPD Foundation User
 
Join Date: May 2003
Posts: 32
Post [REQ] Extremely Fast Database

Looking for a dupe script that can handle at least 50,000 to 60,000 files (NOTE: NOT DIRS!!! I'm talking about FILES!). I've tried a couple of scripts, but every time someone uploads a file it lags like hell, something like 5 seconds before going on to the next one.

Any suggestions, or a new script for me to test, etc.? I'd gladly try them out.

I'm willing to install a fresh ioFTPD if I have to, but other than that I'm still running on
[ioFTPD 5-8-5r]-[dZSbot 1.15]-[Eggdrop 1.6.17]

TIA
Old 05-04-2005, 05:21 PM   #2
FTPServerTools
Senior Member
FlashFXP Beta Tester
ioFTPD Scripter
 
Join Date: Sep 2002
Posts: 543
Default

Do me a favor and try my dupechecker with that many files. Checking an upload against 60000 files should take about 17 reads in a file, meaning it can check for a dupe among 60000 files within 0.2 seconds, and yet it is still a simple SORTED!! list. It works on dirs, but I can extend it to work on files. The drawback, though, is that after a save it takes more time to process... Have you considered using sqlite with an index in tcl? There is a tcl extension for sqlite that might give you the stuff you need. You would need some tcl scripting then, though. 60000 files with an average of, let's say, 14 characters is only a measly 830K file, which is basically a small file. Reading such a file can be done super quickly. You can use DupeLister to make new dupelists. Please let me know if it is fast enough for you. I have been given reports of 400000 entries in it being handled within 2 seconds, so I assume in your case it'd be fast enough. If it is, let me know and I'll see if I can add file support to it as well (shouldn't be hard at all).
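For the sqlite idea, here is a rough sketch of what an indexed filename table could look like. It is written in Python with the standard sqlite3 module rather than the Tcl extension mentioned above, and the table name `files` and the helper names `open_dupe_db`, `is_dupe`, and `add_file` are made up for illustration. Lowercasing the names is also just an assumption about how you'd want dupes matched.

```python
import sqlite3

def open_dupe_db(path=":memory:"):
    """Open (or create) the dupe database.

    The PRIMARY KEY on name gives sqlite an index, so lookups do not
    scan the whole table.
    """
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY)")
    return conn

def is_dupe(conn, filename):
    """Return True if filename is already recorded (compared in lowercase)."""
    row = conn.execute(
        "SELECT 1 FROM files WHERE name = ?", (filename.lower(),)
    ).fetchone()
    return row is not None

def add_file(conn, filename):
    """Record an uploaded file. Returns False if it was already present."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO files (name) VALUES (?)", (filename.lower(),)
    )
    conn.commit()
    return cur.rowcount == 1
```

An upload hook would call `is_dupe` before accepting the file and `add_file` once the upload completes. Pass a real file path instead of `:memory:` to keep the list between runs.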
DupeSearch and DupeLister are what you need for testing. OnDirCreated does dupe dir blocking; I can make an OnPreFileUpload or something like that which tests a filename against a list like the one created with DupeLister.
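To make the "about 17 reads" figure concrete, here is a minimal sketch of a binary search over a sorted list of filenames that also counts how many entries it looks at. It is not the dupechecker's actual code; it works on an in-memory Python list instead of a file, and `find_in_sorted` is a hypothetical name.

```python
def find_in_sorted(names, target):
    """Binary search a sorted list of names.

    Returns (found, probes), where probes is the number of entries
    that were compared against target.
    """
    lo, hi, probes = 0, len(names), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if names[mid] == target:
            return True, probes
        if names[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid       # target can only be in the lower half
    return False, probes


# Example: 60000 generated names, sorted once up front.
names = sorted("file%05d.rar" % i for i in range(60000))
found, probes = find_in_sorted(names, "file12345.rar")
```

Each probe halves the remaining range, and log2(60000) is a little under 16, so the probe count stays in the same ballpark as the 17 reads quoted above no matter which name is checked.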
Old 05-05-2005, 03:54 PM   #3
esmandil
Senior Member
FlashFXP Registered User
ioFTPD Foundation User
 
Join Date: Oct 2004
Posts: 107
Default

Alternatively, the newest version of esmNewdir has some support for file dupe checking, and it should be fast enough for your needs. Be warned, however, that I don't use the file dupe functions myself, so there are probably still a couple of bugs hiding there.
Old 05-05-2005, 03:58 PM   #4
CoLt-[45]
Member
FlashFXP Registered User
ioFTPD Foundation User
 
Join Date: May 2003
Posts: 32
Default

Heh, yeah saw that part

and posted

http://www.ioftpd.com/board/showthre...1975#post31975

Thanks by the way.

Colt
Old 05-06-2005, 01:26 PM   #5
deo
Banned
 
Join Date: Feb 2005
Posts: 46
Default

Quote:
Block dupes by filenames for ioFTPd.
Undupe with wildcard.
Alter database on delete.

Compiled in c from modified poci source.
http://ioftpd.humandroids.net/

badDUPE may be fast enough...?
Old 05-14-2005, 01:09 PM   #6
darkone
Disabled
FlashFXP Registered User
ioFTPD Administrator
 
Join Date: Dec 2001
Posts: 2,230
Default

Quote:
Originally posted by FTPServerTools
Do me a favor and try my dupechecker with that many files. Checking an upload against 60000 files should take about 17 reads in a file, meaning it can check for a dupe among 60000 files within 0.2 seconds, and yet it is still a simple SORTED!! list. It works on dirs, but I can extend it to work on files. The drawback, though, is that after a save it takes more time to process... Have you considered using sqlite with an index in tcl? There is a tcl extension for sqlite that might give you the stuff you need. You would need some tcl scripting then, though. 60000 files with an average of, let's say, 14 characters is only a measly 830K file, which is basically a small file. Reading such a file can be done super quickly. You can use DupeLister to make new dupelists. Please let me know if it is fast enough for you. I have been given reports of 400000 entries in it being handled within 2 seconds, so I assume in your case it'd be fast enough. If it is, let me know and I'll see if I can add file support to it as well (shouldn't be hard at all).
DupeSearch and DupeLister are what you need for testing. OnDirCreated does dupe dir blocking; I can make an OnPreFileUpload or something like that which tests a filename against a list like the one created with DupeLister.
Consider saving the data to more than one file when the file (database) grows too large. I'm assuming you're using a method similar to the binary search algorithm on a sorted file that I posted a while ago.

Here's a simple example of what the contents of the files should/could look like:

filedb_1.dat
[number of files in database]
[min value of database 1]
[min value of database ...]
[filename of database ...]
...
[min value of database N]
[filename of database N]
[database contents part 1]

filedb_....dat
[database contents part ...]

filedb_N.dat
[database contents part N]

When a file grows larger than e.g. 5000 entries, it's split into two and the header information in filedb_1.dat is updated. With 1000000 entries you'd end up with around 200 files. That comes to a maximum of 8 (binary search variant on min values) + 13 (binary search on file) comparisons. Neat and very efficient. Also, it might be wise to limit the filename size to a fixed value and use a read buffer twice as large, so you'll get a full entry no matter what.
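Roughly, the split scheme could be sketched like this in Python. Everything is kept in memory to keep it short: `mins` stands in for the header in filedb_1.dat and each list in `parts` stands in for one filedb_N.dat. The class name `SplitDupeDb` and its methods are made up for illustration, and the fixed-width entries and read buffer are left out.

```python
from bisect import bisect_left, bisect_right

MAX_ENTRIES = 5000  # split threshold, taken from the example above


class SplitDupeDb:
    def __init__(self, max_entries=MAX_ENTRIES):
        self.max_entries = max_entries
        self.mins = []   # min value of each part, kept sorted (the "header")
        self.parts = []  # parts[i] is the sorted entry list for mins[i]

    def _part_index(self, name):
        # First search: last part whose min value is <= name.
        # Names below every min value fall into part 0.
        return max(bisect_right(self.mins, name) - 1, 0)

    def contains(self, name):
        if not self.parts:
            return False
        part = self.parts[self._part_index(name)]
        i = bisect_left(part, name)  # second search: inside the part
        return i < len(part) and part[i] == name

    def add(self, name):
        """Insert name. Returns False if it is a dupe."""
        if not self.parts:
            self.parts.append([name])
            self.mins.append(name)
            return True
        idx = self._part_index(name)
        part = self.parts[idx]
        i = bisect_left(part, name)
        if i < len(part) and part[i] == name:
            return False
        part.insert(i, name)
        self.mins[idx] = part[0]
        if len(part) > self.max_entries:
            # Part grew too large: split it in two and update the header.
            half = len(part) // 2
            self.parts[idx:idx + 1] = [part[:half], part[half:]]
            self.mins[idx:idx + 1] = [part[0], part[half]]
        return True
```

A lookup is one binary search over the min values followed by one inside the chosen part, which is where the 8 + 13 comparisons come from. Note that a split leaves two half-full parts, so the real file count can end up above 200.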

If you need further information, just message me on irc.