Old 04-05-2014, 11:05 AM   #16
brackebuschtino
Member
FlashFXP Registered User
 
Join Date: Feb 2012
Location: /dev/null
Posts: 40

Beginning with the last point:

I was already wondering whether the regex evaluation might be case-(in)sensitive. Since I am used to writing regular expressions like
Code:
/^(the|ex|pre|ssion)/i
I didn't have the impression that the format used in the filters dialog allows for any switches. At least I had no idea where to add them. Simply append them? Or use delimiters, as is generally done?

Regarding the filter box:

I found it pretty cumbersome to open the filter box, type the regex, and then have to close it to see whether it matches, only to open it again and try another version, especially as there seems to be no hotkey for it. Immediate feedback within the underlying windows would be much handier ... a real-time 'onChange' evaluation, so to speak. Do you think that could be realised?

Regarding other places having issues with regexes:

I don't think every section needs to support regexes. Take the 'Options > File Associations > File Patterns' section. File type associations typically look like
Code:
*.ext1, *.ext2, *.ext3, ...
In my opinion the only thing a regex could do here is to group these like so
Code:
\.(ext1|ext2|ext3|...)
In the end all extensions must still be listed, which, to me, rules this section out as a candidate for regex support.
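To illustrate the point, here is a minimal sketch in Python (not FlashFXP code; the file names and helper names are made up, and the trailing `$` anchor is an assumption). It suggests that the glob list and the grouped regex accept the same names, and that both still spell out every extension.

```python
import re
from fnmatch import fnmatchcase

# Placeholder extensions standing in for ext1, ext2, ext3 from the post.
GLOBS = ["*.ext1", "*.ext2", "*.ext3"]
# Grouped form from the post; the "$" anchor is an assumption added here.
GROUPED = re.compile(r"\.(ext1|ext2|ext3)$")


def matches_globs(name: str) -> bool:
    """True if any glob in the list accepts the file name."""
    return any(fnmatchcase(name, glob) for glob in GLOBS)


def matches_regex(name: str) -> bool:
    """True if the grouped regex accepts the file name."""
    return GROUPED.search(name) is not None


for name in ["report.ext1", "photo.ext3", "notes.txt"]:
    print(name, matches_globs(name), matches_regex(name))
```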

Regarding the testing recommendations:

Thanks for these hints. I'll definitely check them out. I didn't know about the 'Mask Select' feature. There seems to be much more under the hood that I didn't know of so far. Thank you!

Basically, the whole topic is not so urgent that you should spend all your spare time on it. It's the weekend. Enjoy it!

Last edited by brackebuschtino; 04-05-2014 at 11:17 AM.
Old 04-09-2014, 09:00 AM   #17
bigstar
FlashFXP Developer
FlashFXP Administrator
ioFTPD Beta Tester
 
Join Date: Oct 2001
Posts: 8,012

I made some additional improvements. You can download the latest build from within FlashFXP via the main menu under Help > Check for new version.

The main improvements are that the field background color now turns light red if you enter invalid regex syntax, and that I added an Apply button to the filter box so that you can apply the changes without closing the dialog.

I also changed the regex to ignore case by default.

I think you might be right about the File Associations; for now I have held off on implementing regex for this.
Old 04-09-2014, 09:36 AM   #18
brackebuschtino

Thanks a lot for your effort in this!

Quote:
I also changed the regex to ignore case by default.
I'm afraid this is not a good idea. In my case it does in fact make a difference. It would be better to make it case-sensitive by default and allow for adding a switch, because the usual switch turns case sensitivity off rather than on.

Just one thing i'd like to come back to:

Given the following example regex, how, or rather where, would the common regex switches (i, m, u, etc.) be placed in this app's specific notation? Appended?

Code:
rx .*-(?!publisher1)

Last edited by brackebuschtino; 04-09-2014 at 10:26 AM.
Old 04-09-2014, 01:52 PM   #19
bigstar

I did not realize that PCRE does not have a way to turn off case sensitivity. This puts some kinks into my plan; I will need to come up with another way of handling "ignore case", perhaps a global setting where this can be toggled off for those who need it.

In this type of situation I imagine that 9 out of 10 times you would not want case-sensitive matching. I am not even sure I can come up with a good example where I'd need case sensitivity.

The ideal solution would be to make this part of the entry settings, but this is going to require some design changes, something I am not sure I can justify at this time.

It took me some time to figure out the proper way to ignore case with PCRE, and I am not 100% sure this is correct.
Code:
rx (?i).*-(?!publisher1)
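A quick sanity check of that switch, sketched with Python's `re` module rather than FlashFXP itself. The assumptions here: the `(?i)` inline syntax behaves the same way in PCRE for this simple case, the `rx ` prefix is FlashFXP's own and is left out, and the file names are made up. The last line also hints at a caveat: the look-ahead only inspects the text after whichever hyphen `.*` ends up stopping at, so a name with an earlier hyphen can still match.

```python
import re

PATTERN = r"(?i).*-(?!publisher1)"   # pattern from the post, without the "rx " prefix
SENSITIVE = r".*-(?!publisher1)"     # same pattern without the (?i) switch


def hit(pattern: str, name: str) -> bool:
    """True if the pattern matches anywhere in the name."""
    return re.search(pattern, name) is not None


# With (?i) the look-ahead recognises the upper-case publisher and blocks the match.
print(hit(PATTERN, "Title-PUBLISHER1"))           # False
# Without (?i) the look-ahead does not recognise it, so the pattern matches.
print(hit(SENSITIVE, "Title-PUBLISHER1"))         # True
# Caveat: the earlier hyphen in "_-_" gives .* a place to stop that passes the look-ahead.
print(hit(PATTERN, "Author_-_Title-Publisher1"))  # True
```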
Old 04-10-2014, 10:57 AM   #20
DayCuts
Senior Member
FlashFXP Beta Tester
 
Join Date: Dec 2003
Posts: 421

I wrote quite a lengthy, detailed response breaking down the problems with your attempts, the misunderstanding about how lookarounds work, and how to design a pattern that works, but my browser crashed, so you will just have to settle for the footnotes and do some research yourself to get a better understanding.

Pure PCRE solution:
Code:
(?im)^(?!.+-(publisher1|publisher2)$).+$
Code:
/subdir/subdir/.../Author1_-_Title1_(1234)-Publisher1
/subdir/subdir/.../Author2_-_Title2_(1234)-PublisherX
/subdir/subdir/.../Author2_-_Title2_(1234)-Publisher2
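As a rough check, the pattern above can be run against the three sample paths with Python's `re` module. This is only a sketch: it assumes Python and PCRE treat this particular pattern alike, and it says nothing about how FlashFXP applies a match in the Skip List.

```python
import re

# Pattern from the post, unchanged.
PATTERN = re.compile(r"(?im)^(?!.+-(publisher1|publisher2)$).+$")

# Sample paths exactly as posted (the "..." is part of the sample text).
PATHS = [
    "/subdir/subdir/.../Author1_-_Title1_(1234)-Publisher1",
    "/subdir/subdir/.../Author2_-_Title2_(1234)-PublisherX",
    "/subdir/subdir/.../Author2_-_Title2_(1234)-Publisher2",
]

# Keep only the paths the pattern matches, i.e. those NOT ending in
# -publisher1 or -publisher2 (case-insensitively).
matched = [path for path in PATHS if PATTERN.search(path)]

for path in matched:
    print(path)
```

Only the `-PublisherX` path is printed; the other two are rejected by the negative look-ahead.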
Other notes:
Quote:
Originally Posted by bigstar
What might be more suited for what you desire is to use the Selective Transfer feature.
I agree with this suggestion; regex pattern matching was not designed for 'non-matching'. Although it can be done, the internal processing is more expensive for anything other than single characters, as is the use of lookarounds, etc. While the above pattern should work in the Skip List, if PCRE matching is now also possible within Selective Transfer rule sets I would highly suggest ditching the expensive 'non-match' style negative lookahead pattern and opting for a normal 'match' style pattern.

Quote:
Originally Posted by bigstar
I've made a small change to the syntax prefix
Can I suggest you reinstate the colon as part of the prefix? There should be no circumstances in which somebody might try (or be able) to match 'rx:<space>...' as a literal (non regex) pattern, however there is the possibility of somebody trying to match 'rx<space>...'.

Quote:
Originally Posted by bigstar
It took me some time to figure out the proper way to ignore case with PCRE, and I am not 100% sure this is correct.
Code:
rx (?i).*-(?!publisher1)
Your use of (?i) here is correct. Given that FlashFXP is a Windows client, and Windows (and its users) mostly think in a case-insensitive manner, it might be okay to make it case-insensitive by default. Just so long as the case-sensitive modifier can be used within the pattern. (?-i) would force case sensitivity.

Modifiers/switches can be used anywhere in a pattern. When a modifier is seen it is explicitly applied to the remainder of the pattern, or until switched by another modifier. The basic form of a modifier is (?[onswitches][-offswitches][:regex]). This support for :regex means you can do things like (?i)^x(?-i:Y)z to match any case form of xyz as long as Y is capitalized, where (?-i:Y) is equivalent to (?-i)Y(?i).
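The scoped form can be tried out in Python, whose `re` module (3.6 and later) accepts the same `(?-i:...)` syntax. This is a sketch in Python rather than PCRE: Python only supports the scoped variant, so the standalone `(?-i)` form is not shown, and the test strings are made up.

```python
import re

# Case-insensitive overall, but the middle letter must be an upper-case Y.
pattern = re.compile(r"(?i)^x(?-i:Y)z")

# Map each test string to whether the pattern matches it.
results = {s: bool(pattern.search(s)) for s in ["xYz", "XYZ", "xyz", "XyZ"]}

for text, matched in results.items():
    print(text, matched)
```

`xYz` and `XYZ` match; `xyz` and `XyZ` do not, because the scoped group switches case sensitivity back on for that one letter.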

A great introductory regex tutorial can be found at Regular-Expressions.info.
Old 04-10-2014, 11:21 AM   #21
brackebuschtino

Thanks for your reply and the suggested pattern. In fact I did my homework and searched the web as well as asked other developers, which resulted in a negative lookbehind rather than a lookahead.
Code:
rx .*(?<!-PublisherX)$
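For comparison, here is the look-behind variant run in Python's `re` module. It is a sketch only: the `rx ` prefix is FlashFXP's and is dropped, the names are hypothetical, and matching here is case-sensitive because the pattern carries no `(?i)`.

```python
import re

# Matches any name that does NOT end in "-PublisherX".
PATTERN = re.compile(r".*(?<!-PublisherX)$")


def matches(name: str) -> bool:
    """True if the look-behind pattern accepts the name."""
    return PATTERN.search(name) is not None


# Hypothetical names modelled on the samples earlier in the thread.
for name in [
    "Author1_-_Title1_(1234)-Publisher1",
    "Author2_-_Title2_(1234)-PublisherX",
    "Author3_-_Title3_(1234)-publisherx",
]:
    print(name, matches(name))
```

The second name is the only one rejected; the lower-case `-publisherx` still matches because no case-insensitive switch is set.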
Quote:
I would highly suggest ditching the expensive 'non-match' style negative lookahead pattern and opting for a normal 'match' style pattern.
The issue with this solution is that one is forced to manually select all highlighted results and put them into the queue, whereas using the skiplist with the above pattern (or yours) allows for putting a complete directory into the queue and leaving the rest to the application, which will reliably drop all non-matching queue items. This is exactly what I want. If I were satisfied with the manual way of scanning a folder and picking the cherries, I wouldn't have asked for the skiplist improvement.

Regarding the "expensiveness":
I think that with today's computing power this plays no role. Furthermore, I think that a little more time for regex processing results in a less intensive server workload. Also, I think that not everybody using FFXP has an active skiplist that might have an impact on the transfer speed.

In fact I'm OK with any implementation (skiplist, selective transfer) that keeps the current state (PCRE support and lookahead/lookbehind support) and allows matching as exactly as desired.

Quote:
[...]reinstate the colon as part of the prefix?[...]
I agree with this suggestion. I also found the blank alone to be potentially more confusing than having the colon visually mark the delimitation. Maybe the blank could be dropped completely, as the colon could satisfy the requirement as a delimiter?

Thanks a bunch for the on/off-switch lesson. I didn't know that yet. With this feature available I absolutely agree with your suggestion to make the pattern matching case-insensitive by default.
Old 04-10-2014, 04:06 PM   #22
bigstar

Quote:
Originally Posted by DayCuts
Can I suggest you reinstate the colon as part of the prefix? There should be no circumstances in which somebody might try (or be able) to match 'rx:<space>...' as a literal (non regex) pattern, however there is the possibility of somebody trying to match 'rx<space>...'.
Both rx<space> and rx:<space> can be used depending on your own preference.

It made more sense to me to simplify the prefix to rx<space> because in most instances trailing spaces are automatically stripped off.

Quote:
Originally Posted by DayCuts
Just so long as the case-sensitive modifier can be used within the pattern. (?-i) would force case sensitivity.
Thank you for the clarification; I was not aware of using - to reverse the modifier.

I don't use regexp as much as one might think, and most of this is new to me as well.
Old 04-12-2014, 05:36 AM   #23
DayCuts

Quote:
Originally Posted by brackebuschtino
Thanks for your reply and the suggested pattern. In fact I did my homework and searched the web as well as asked other developers, which resulted in a negative lookbehind rather than a lookahead.
Code:
rx .*(?<!-PublisherX)$
Yep, in fact a lookbehind is the more appropriate selection in this case, since the part of the string you're most interested in is at the end. Less expensive as well.

Quote:
Originally Posted by brackebuschtino
The issue with this solution is that one is forced to manually select all highlighted results and put them into the queue, whereas using the skiplist with the above pattern (or yours) allows for putting a complete directory into the queue and leaving the rest to the application, which will reliably drop all non-matching queue items. This is exactly what I want. If I were satisfied with the manual way of scanning a folder and picking the cherries, I wouldn't have asked for the skiplist improvement.
I was referring to use in the Selective Transfer rules when I suggested simplifying, which already gives the option to Transfer or Skip and a choice of File and Folder matching. You could use a combination of the skip list for most rules, and a Selective Transfer ruleset for those that require negating.

Quote:
Originally Posted by brackebuschtino
Regarding the "expensiveness":
I think that with today's computing power this plays no role. Furthermore, I think that a little more time for regex processing results in a less intensive server workload. Also, I think that not everybody using FFXP has an active skiplist that might have an impact on the transfer speed.
Abundant resources are no excuse not to do things in the most efficient way possible. While in a normal regex matching situation (one pattern against one string or file) the cost may be negligible, in a situation where you may end up with multiple look-around rules among a list of dozens of other rules that all have to be checked against a potentially huge list of files/directories, the expense can add up quickly to a noticeable delay. Admittedly, you would likely need a complex skip list and a huge directory listing to notice anything on the average system these days.

Quote:
Originally Posted by bigstar
Both rx<space> and rx:<space> can be used depending on your own preference.

It made more sense to me to simplify the prefix to rx<space> because in most instances trailing spaces are automatically stripped off.
My concern here was more to do with the difference between something like "rx abc*.mp?" being processed as a basic glob or as PCRE. The results would be vastly different due to how the wildcard and period function in regular expressions. Requiring the colon would be a way of ensuring that somebody not familiar with regular expressions (or the support for them in the program) does not try to use a simple glob rule that is misinterpreted.
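The difference is easy to see in a small sketch, using Python's `fnmatch` and `re` modules as stand-ins for FlashFXP's glob and PCRE matching. That stand-in is an assumption, as are the two file names.

```python
import re
from fnmatch import fnmatchcase

RULE = "abc*.mp?"  # the rule text from the post, without any prefix


def as_glob(name: str) -> bool:
    """Glob reading: 'abc', anything, a literal '.mp', then one character."""
    return fnmatchcase(name, RULE)


def as_regex(name: str) -> bool:
    """Regex reading: 'ab', zero or more 'c', any one character, 'm', optional 'p'."""
    return re.search(RULE, name) is not None


for name in ["abc_song.mp3", "abxm"]:
    print(name, "glob:", as_glob(name), "regex:", as_regex(name))
```

The same rule text accepts `abc_song.mp3` only as a glob and `abxm` only as a regex, which is the kind of silent misinterpretation the colon is meant to prevent.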
Old 04-12-2014, 10:36 AM   #24
brackebuschtino

Quote:
Yep, in fact a lookbehind is the more appropriate selection in this case, since the part of the string you're most interested in is at the end. Less expensive as well.
Unfortunately this doesn't seem to allow for grouping, or for appending an additional optional group. At least it didn't work for me with:

Code:
rx .*(?<!-PublisherX(_int)?)$
Code:
rx .*(?<!-(PublisherX|OtherY))$
Old 04-12-2014, 07:29 PM   #25
DayCuts

Learned something when trying to figure out why (_int)? worked in a look-ahead but not in a look-behind. The answer is that in almost all regex flavors (language implementations) a look-behind must be a fixed-width expression. Not only does this mean you cannot include ? + *, which rules out anything like (_int)?, but you also cannot include alternatives of different lengths like (pub|longpublishername). Ultimately this means that a look-behind is not a viable option for your purposes unless the language the program is developed in is .NET or ABA.

Now on to a solution. First of all, one reason optionals were not working for you is that you are forgetting part of the expression: the modifiers and anchors are important. I did come up with a working solution using a look-behind, but my test list used equal-length publisher names. It failed thereafter due to the fixed-length requirement, but here it is anyway...
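Python's `re` module enforces a similar fixed-width rule, so it can serve as a quick illustration (PCRE's exact rules may differ in detail, and FlashFXP itself is not involved here). The sketch below shows both patterns from the previous post being rejected, plus one common workaround that is not from this thread: chaining several fixed-width look-behinds, one per ending. The helper name and test names are made up.

```python
import re


def compiles(pattern: str) -> bool:
    """True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Both attempts from the previous post are variable-width look-behinds.
print(compiles(r".*(?<!-PublisherX(_int)?)$"))
print(compiles(r".*(?<!-(PublisherX|OtherY))$"))

# Workaround sketch: one fixed-width look-behind per unwanted ending.
chained = re.compile(r".*(?<!-PublisherX)(?<!-PublisherX_int)(?<!-OtherY)$")

for name in ["Title-PublisherX", "Title-PublisherX_int", "Title-OtherY", "Title-Other"]:
    print(name, chained.search(name) is not None)
```

Each look-behind in the chain has a fixed width of its own, so the engine accepts it, at the cost of listing every ending separately.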
Code:
(?im)(?(DEFINE)(?<publist>(?:publisher1|publisher2)))^.+(?(?<=_int$)(?<!-(?&publist)_int)|(?<!-(?&publist))$)
An updated version of the original pattern...
Code:
(?im)^(?!.+-(?:publisher1|publisher2)(?:_int)?$).+$
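A sketch of the updated pattern applied to a whole listing at once, again in Python's `re` module and with hypothetical names. Thanks to the `(?m)` switch, `^` and `$` anchor at each line, so `findall` returns one entry per surviving line.

```python
import re

# Updated pattern from the post, unchanged.
PATTERN = re.compile(r"(?im)^(?!.+-(?:publisher1|publisher2)(?:_int)?$).+$")

# Hypothetical directory listing, one name per line.
LISTING = "\n".join([
    "Author1_-_Title1_(1234)-Publisher1",
    "Author2_-_Title2_(1234)-PublisherX_int",
    "Author3_-_Title3_(1234)-publisher2_INT",
    "Author4_-_Title4_(1234)-PublisherX",
])

# Only non-capturing groups are used, so findall yields the full matched lines.
survivors = PATTERN.findall(LISTING)

for line in survivors:
    print(line)
```

Only the two `PublisherX` lines are printed; the optional `_int` suffix and the case-insensitive switch take care of the other two.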
Old 04-14-2014, 10:50 AM   #26
brackebuschtino

Thanks a lot for this lesson. It turns out that using lookarounds is very tricky. I would never have been able to adapt this pattern on my own.