04-12-2014, 07:29 PM
DayCuts

Learned something when trying to figure out why (_int)? worked in a look-ahead but not a look-behind. The answer is that in almost all regex flavors (language implementations) a look-behind must be a fixed-width expression. Not only does this mean you cannot use the quantifiers ?, + and *, which rules out anything like (_int)?, but in many flavors you also cannot include alternatives of different lengths like (pub|longpublishername). Ultimately this means that a look-behind is not a viable option for your purposes unless the program using the regex was developed in .NET or ABA.
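To see the fixed-width rule in action, here is a small sketch using Python's built-in re module, which is one of the flavors with this restriction. Python is only my choice for illustration (your program may well use a different flavor), and the compiles() helper and the test patterns are made up for the demo.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# An optional group is fine inside a look-ahead...
lookahead_optional = compiles(r"name(?=(?:_int)?$)")

# ...but the same optional group inside a look-behind is rejected,
# because the look-behind could be 0 or 4 characters wide.
lookbehind_optional = compiles(r"(?<=(?:_int)?)$")

# Alternatives of different lengths are rejected too in this flavor.
lookbehind_uneven = compiles(r"(?<=pub|longpublishername)$")

# A look-behind of one fixed width is accepted.
lookbehind_fixed = compiles(r"(?<=_int)$")

print(lookahead_optional, lookbehind_optional, lookbehind_uneven, lookbehind_fixed)
```

Other flavors draw the line differently, so it is worth running the equivalent check in whatever engine your program actually uses.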

Now onto a solution... First of all, one reason optionals were not working for you is that you left out part of the expression: the modifiers and anchors are important. I did come up with a working solution using a look-behind, but my test list used publisher names of equal length. It fails as soon as the names differ in length, because of the fixed-width requirement, but here it is anyway...
Code:
(?im)(?(DEFINE)(?<publist>(?:publisher1|publisher2)))^.+(?(?<=_int$)(?<!-(?&publist)_int)|(?<!-(?&publist))$)
An updated version of the original pattern...
Code:
(?im)^(?!.+-(?:publisher1|publisher2)(?:_int)?$).+$
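If you want to sanity check the look-ahead version outside your program, a quick sketch in Python could look like this. The release names in the list are invented for the test, and Python's re module is an assumption on my part, so results in your program's flavor may differ slightly.

```python
import re

# The updated pattern from above, unchanged.
PATTERN = r"(?im)^(?!.+-(?:publisher1|publisher2)(?:_int)?$).+$"

# Hypothetical list, one entry per line.
sample = "\n".join([
    "Some.Release-publisher1",
    "Some.Release-publisher2_int",
    "Other.Release-goodgroup",
    "Other.Release-goodgroup_int",
    "Another.Release-PUBLISHER1_INT",
])

# The pattern has no capturing groups, so findall returns each whole matched line.
kept = re.findall(PATTERN, sample)

for line in kept:
    print(line)
```

Only the two goodgroup lines should be printed. The blocked publishers are skipped with or without the _int suffix, and the (?i) modifier takes care of the upper-case entry.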