02-25-2005, 01:43 PM
andreag
Junior Member
 
Join Date: Feb 2005
Posts: 4
Corruption in resumed downloads... this is why it happens in some cases!

(I am reposting this here because it was previously in a restricted part of the forum; now everyone can read it.)

Consider all this as an "official" suggestion from a registered user.

Now I understand (thanks Bigstar) that the "rollback" feature only cuts off the last KBytes of a file, which may be bogus data if something went wrong during a download and the connection dropped.

I use FlashFXP to copy a lot of big/huge files (1 GB - 2 GB), but also small files, from remote FTPs. Sometimes I got strange corruption problems in very large files. Now I understand how it all happened. I hope it will be possible to avoid this in the future.

I think it could be enough to be able to check at least the last 3-4 KBytes of the downloaded file by comparing them with the corresponding 3-4 KBytes of the remote file. This could be limited to the files where it really makes sense for the user (I should be able to choose which ones myself, by name, extension, size or date), but if you follow my outline it could probably be used for every file longer than about 10-20 KBytes.

A larger value for this "intelligent rollback and check" may or may not be useful. I don't know, but I would let the user decide.

It could be done this way: when resuming, you have to open both the remote file and the local one anyway. It would be enough to open the remote file a few KBytes before the actual restart point, put the downloaded data in a memory buffer (... I used to be a programmer years ago...) without writing anything to disk (so the local file, which could simply be an earlier, valid, shorter version of the remote file, is not ruined), and then compare this data with the corresponding part of the local file. If the parts are identical, then you can probably (but not certainly) resume safely. To avoid possible data corruption, I would not check the last KB of data saved in the local file (or would enforce the "rollback" feature in this case too). If the compared parts differ, the files are different versions with the same name, so the user should choose whether to overwrite the local file, rename it, or change the name of the file being downloaded. In any case, he must not resume at all!
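To make the idea concrete, here is a minimal Python sketch of the check described above. It is only an illustration, not how FlashFXP works internally: the function name `safe_resume_offset`, the callable `fetch_remote_range(offset, length)` and the default sizes are all my own assumptions. With a real FTP server the callable would typically issue a REST at `offset` and read `length` bytes into memory.

```python
import os
from typing import Callable, Optional


def safe_resume_offset(
    local_path: str,
    fetch_remote_range: Callable[[int, int], bytes],  # hypothetical: (offset, length) -> bytes
    check_bytes: int = 4096,     # assumed size of the window to compare
    rollback_bytes: int = 1024,  # assumed size of the untrusted tail to discard
) -> Optional[int]:
    """Return the offset to resume from, or None if the files look different."""
    local_size = os.path.getsize(local_path)

    # Ignore the last bytes of the local file: they may be garbage
    # written when the connection dropped (the "rollback" part).
    resume_at = max(0, local_size - rollback_bytes)

    # Compare the window that ends right at the resume point.
    check_start = max(0, resume_at - check_bytes)
    length = resume_at - check_start
    if length == 0:
        # Local file too small to verify anything: download from the start.
        return 0

    with open(local_path, "rb") as f:
        f.seek(check_start)
        local_window = f.read(length)

    # The remote data stays in memory only; nothing is written to disk here.
    remote_window = fetch_remote_range(check_start, length)

    if local_window == remote_window:
        return resume_at  # probably (not certainly) the same file
    return None           # different content: do not resume
```

A result of `None` is the case where the user should be asked whether to overwrite, rename the local file, or save the download under another name.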

This way, you would avoid the "broken" file and the second (full) download you need once you realize that the file is broken. With big files, it may save hours of connection time and a lot of bandwidth.

I hope my explanation was clear; otherwise, tell me.

Then I read the "queue file format definition" to understand how it is structured, and I completely agree (now I finally understand his point of view!) with the guy who signs his messages as "EM" and wrote so much about the format.

If it's true that some sites report reliable file dates, we should have them (dates/timestamps) included in the queue file too. They could be useful for spotting possible changes even when the file size is the same (that happened to me too! I had to redownload a lot of files in a hurry because the content/version was different but the size was exactly the same).
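As a rough sketch of what I mean, in Python: the `QueueEntry` structure and its field names below are purely hypothetical and are not the actual FlashFXP queue file format. The point is only that a stored timestamp catches the "same size, different content" case that a size comparison misses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class QueueEntry:
    """Hypothetical queue record; not the real FlashFXP queue format."""
    remote_path: str
    size: int
    modified: Optional[datetime]  # None when the site reports no usable date


def remote_changed(saved: QueueEntry, current: QueueEntry) -> bool:
    """True if the remote file appears to have changed since it was queued."""
    if saved.size != current.size:
        return True
    # Same size: fall back to the timestamp, but only if both are known.
    if saved.modified is not None and current.modified is not None:
        return saved.modified != current.modified
    # No dates available: the change (if any) cannot be detected this way.
    return False
```

When a site does not report dates, this check can say nothing more than the size comparison does, so it complements the resume check rather than replacing it.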

Maybe "EM" asked for too much, but he's definitely right. As things are now, we all run some risk of getting corrupted files, and the corruption is really difficult to spot until it is too late. And it has already happened to me...

About queue saving: my opinion is to let us users choose what to do. I'm doing fine with the current settings. Just let us choose the interval (for me it would be "every file", for others "every minute" if they're doing many fast, small transfers).

I would like to know the opinion of the other REGISTERED users on all these matters, but in any case I hope to see some of these options implemented in the near future.

At least the resume check on the last KBytes of the files. From an (ex-)programmer's point of view, I think it should be easy enough to implement.

Let me know what the developers think about all this.

Move this to the proper thread if necessary.

Thkx + Greetz,

Andrea