Trim all characters (from left) before a given character/string

Advanced Renamer forum
#1 : 22/09-18 22:00
James E. McBride
Posts: 13
Hello,
I am trying to understand RegEx better, and I came up with what I thought was a good way to trim all text from the left up to the first '('.

The method that I used is:
Replace (RegEx)
Text to be replaced: .+\(
Replace with: (
Occurrence: 1st (actually, using 'all' and '1st', the results are the same)

My input/output is below.

The one that I am baffled by is the 15th line where the output is simply:
(2CD)

I am not sure how that happened, or how I need to fix it.
Thanks for the help! This forum has been very helpful in my journey to learn more about ARen :)

Foldername
King Diamond (1986) Fatal Portrait
King Diamond (1987) In Concert 1987- Abigail
King Diamond (1988) Them
King Diamond (1989) Conspiracy
King Diamond (1990) The Eye
King Diamond (1992) A Dangerous Meeting
King Diamond (1995) Spiders Lullaby
King Diamond (1996) The Graveyard
King Diamond (1997) Abigail
King Diamond (1998) Voodoo
King Diamond (2000) House Of God
King Diamond (2001) 20 Years Ago- A Night Of Rehearsal
King Diamond (2002) Abigail II
King Diamond (2003) The Puppet Master
King Diamond (2004) Deadly Lullabyes Live (2CD)
King Diamond (2007) Give Me Your Soul... Please
Various Artists (2000) Tribute To King Diamond
---------------------------------------------------------------------------------------
New Foldername
(1986) Fatal Portrait
(1987) In Concert 1987-
(1988) Them
(1989) Conspiracy
(1990) The Eye
(1992) A Dangerous Meet
(1995) Spiders Lullaby
(1996) The Graveyard
(1997) Abigail
(1998) Voodoo
(2000) House Of God
(2001) 20 Ago-AN...
(2002) Abigail II
(2003) The Puppet Master
(2CD)
(2007) Give Me Your Soul .
(2000) Tribute To


#2 : 23/09-18 15:11
David Lee
Posts: 1125
Reply to #1:
I'm baffled as well! However if all your folder names contain a 4-digit date you can use that to define a second sub-pattern group:

Text to be replaced: (.+)(\d{4})
Replace with: (\2
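
To see how this pattern behaves on the sample names, here is a minimal Python sketch. It assumes Python's `re` module is a reasonable stand-in for Advanced Renamer's RegEx engine, which the thread does not confirm, and `keep_from_year` is a hypothetical helper name. In Python at least, the greedy (.+) backs off only as far as the last four-digit run, so a name containing a second four-digit number is cut at that later number.

```python
import re

# Hypothetical helper: Python's `re` stands in for Advanced Renamer's engine (an assumption).
def keep_from_year(name: str) -> str:
    # Same pattern and replacement as in the post; count=1 mimics "1st occurrence".
    return re.sub(r"(.+)(\d{4})", r"(\2", name, count=1)

print(keep_from_year("King Diamond (1986) Fatal Portrait"))
# (1986) Fatal Portrait

# A second four-digit number later in the name is picked up by the greedy group.
print(keep_from_year("King Diamond (1987) In Concert 1987- Abigail"))
# (1987- Abigail
```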



#3 : 23/09-18 15:40
James E. McBride
Posts: 13
Reply to #2:

Thank you very much, David! I got sorta tunnel-visioned into doing it that one way and wasn't considering the many other ways to do the same thing :)

This is an awesome program!


#4 : 24/09-18 19:15
David Lee
Posts: 1125
Reply to #3:

This has been nagging at me so I just had to get to the bottom of it and work out how to achieve the result with a single stop character!

The RegEx: ([^\(]*) will select a string of any number of characters not including "(" - so just leave "Replace with" blank.

Whilst this will work with your folder names, it will fall over for names already in the desired format (eg "(2004) Deadly Lullabyes Live (2CD)" ---> "((2CD)" ) so a more robust solution would be to capture the remainder of the string as a second subpattern group:

Text to be replaced: ([^\(]*)(.*)
Replace with: \2
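
The two-group version can be tried out with a short Python sketch. This assumes Python's `re` module behaves like Advanced Renamer's engine for this pattern, which is not confirmed in the thread; `strip_before_paren` is a hypothetical helper name.

```python
import re

# Hypothetical helper: Python's `re` stands in for Advanced Renamer's engine (an assumption).
def strip_before_paren(name: str) -> str:
    # Group 1 takes everything that is not "(", group 2 takes the rest of the name.
    # Only group 2 is kept; count=1 mimics "1st occurrence".
    return re.sub(r"([^\(]*)(.*)", r"\2", name, count=1)

print(strip_before_paren("King Diamond (2004) Deadly Lullabyes Live (2CD)"))
# (2004) Deadly Lullabyes Live (2CD)

# A name that is already in the desired format is left unchanged.
print(strip_before_paren("(2004) Deadly Lullabyes Live (2CD)"))
# (2004) Deadly Lullabyes Live (2CD)
```

In this sketch, a name with no "(" at all ends up empty, because group 1 takes the whole name and group 2 is empty.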


24/09-18 19:15 - edited 24/09-18 19:19
#5 : 24/09-18 22:58
Kyle Pierce
Posts: 4
James,

I think it's selecting the "all" instead of the "1st" occurrence that is causing the unexpected behavior.

The regex will first delete everything up to the first open paren, then will delete everything up until the next open paren, then the next open paren, ...

So, it would look like this:

King Diamond (2004) Deadly Lullabyes Live (2CD) --->
DELETE TO FIRST '(' --->
2004) Deadly Lullabyes Live (2CD) --->
DELETE TO NEXT '(' --->
2CD) --->
NO MORE '(' SO ADD '(' AT BEGINNING OF STRING --->
(2CD)

On all the other lines, you only have one open paren, so in those cases 1st = all, but on this line that's not the case.

Remember, RegEx is fun (no matter what anyone tells you)!

Cheers!


#6 : 25/09-18 13:03
David Lee
Posts: 1125
Reply to #5:

Kyle

I'm afraid that you are wrong. In his initial post James clearly stated that he had selected 1st occurrence and "All" and "1st" returned exactly the same results, which I also confirmed.

The real issue is that, in this implementation of RegEx, the quantifiers * and + are "greedy" by default and match as many characters as possible. This means that matching is terminated by the last opening parenthesis in the string rather than the first, as is required. These quantifiers may be made "lazy" (or "minimal") by appending a question mark - so the RegEx .+?\( will match only the characters up to and including the first instance of "(", as James wished.
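
The difference between greedy and lazy matching can be illustrated with a small Python sketch. Python's `re` module is used here only as an assumed stand-in for Advanced Renamer's engine, and `count=1` is assumed to correspond to the "1st" occurrence setting.

```python
import re

name = "King Diamond (2004) Deadly Lullabyes Live (2CD)"

# Greedy: .+ runs to the last "(", so "1st" and "all" give the same result.
greedy_first = re.sub(r".+\(", "(", name, count=1)
greedy_all = re.sub(r".+\(", "(", name)

# Lazy: .+? stops at the first "(".
lazy_first = re.sub(r".+?\(", "(", name, count=1)
# Without a limit, the lazy pattern matches again up to the second "(".
lazy_all = re.sub(r".+?\(", "(", name)

print(greedy_first)  # (2CD)
print(greedy_all)    # (2CD)
print(lazy_first)    # (2004) Deadly Lullabyes Live (2CD)
print(lazy_all)      # ((2CD)
```

In this sketch the lazy pattern only gives the intended name when it is limited to the first match.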

A better solution is to replace [^(]* with a blank string. If 1st occurrence is selected this also will correctly handle folder names that are already in the required format - ie if the string commences with "(" then nothing is matched.

I would have expected that the same RegEx should work using the "Remove pattern" method but in this case ALL characters other than "(" are matched and removed.
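
A rough Python sketch of the two cases above: replacing only the first match of [^(]* versus replacing every match. It assumes Python's `re` module behaves like Advanced Renamer's engine here and that "Remove pattern" amounts to replacing every match with nothing; neither is confirmed in the thread. `trim_first` and `trim_all` are hypothetical helper names.

```python
import re

# Hypothetical helpers: Python's `re` stands in for Advanced Renamer's engine (an assumption).
def trim_first(name: str) -> str:
    # Only the first run of non-"(" characters is removed ("1st occurrence").
    return re.sub(r"[^(]*", "", name, count=1)

def trim_all(name: str) -> str:
    # Every run of non-"(" characters is removed, leaving only the "(" characters.
    return re.sub(r"[^(]*", "", name)

print(trim_first("King Diamond (2004) Deadly Lullabyes Live (2CD)"))
# (2004) Deadly Lullabyes Live (2CD)

# The first match is empty when the name already starts with "(", so nothing changes.
print(trim_first("(2004) Deadly Lullabyes Live (2CD)"))
# (2004) Deadly Lullabyes Live (2CD)

print(trim_all("King Diamond (2004) Deadly Lullabyes Live (2CD)"))
# ((
```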

See https://en.wikipedia.org/wiki/Regular_expression


25/09-18 13:03 - edited 08/10-18 11:17