Help with syntax

Advanced Renamer forum
#1 : 22/03-15 04:06
David
Posts: 64
I would like to be able to rename strings like this

Why The Knights Disappeared (2003).cbr.cbr
Official Index To Marvel Universe 011 (2009).cbz.cbz

To this

Why The Knights Disappeared (2003).cbr
Official Index To Marvel Universe 011 (2009).cbz

I would also like a single replace method that handles both cbr and cbz files with this kind of doubled extension.

Thanks


#2 : 22/03-15 11:12
Tester123
Posts: 92
Reply to #1:
Assuming the file name structure is like this:
Something.XXX.XXX
where XXX is one or more characters long, and XXX is the same in both sections, then you can try this:

Replace Method:
--------------------
Text to be replaced: (.*)(\..+)\2
Replace with: \1\2
Use regular expressions: Tick
Apply to: Name and Extension
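For anyone who wants to try the expression outside Advanced Renamer, here is a minimal sketch using Python's re module. It assumes Python's regex engine treats this pattern the same way AR does, and the variable names are made up for the example.

```python
import re

# Pattern and replacement copied from the Replace method settings above.
PATTERN = r"(.*)(\..+)\2"
REPLACEMENT = r"\1\2"

names = [
    "Why The Knights Disappeared (2003).cbr.cbr",
    "Official Index To Marvel Universe 011 (2009).cbz.cbz",
]

# "Apply to: Name and Extension" is imitated by matching the full file name.
renamed = [re.sub(PATTERN, REPLACEMENT, name) for name in names]

for old, new in zip(names, renamed):
    print(f"{old} -> {new}")
```

The same pattern handles both cbr and cbz because nothing in it names a specific extension.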

Hope that helps.


22/03-15 11:12 - edited 22/03-15 14:15
#3 : 27/03-15 23:56
David
Posts: 64
Reply to #2:

Thanks very much, it worked great.

Could you tell me what the components of the expression (\..+)\2 actually do, so I can use them in the future?

There's a saying
If all you learn are methods you are forever tied to your methods,
But... if you learn the principles behind the methods then you can devise your own.

Thanks


27/03-15 23:56 - edited 27/03-15 23:57
#4 : 29/03-15 12:25
Tester123
Posts: 92
Reply to #3:

"Give a man a fish and he will be fed for one day. Teach a man to fish and he will no longer be hungry."

So here goes...

First of all () is used to capture a 'group'. This is as the name suggests: it 'batches' a series of characters together. Moreover, the main advantage is that it allows you to refer to it later on, either in the search string itself or in the replace string. In the search string, there are two groups: (.*) and (\..+).

The first group (.*) means match any run of characters, of any length. The dot means any character, and the asterisk means any number of them (including none). Taking your first example, this would return: 'Why The Knights Disappeared (2003)'. It does not include the following dot because the rest of the expression still has to match: (.*) first grabs as much as it can, then gives characters back until the second group and its repeat fit.

The second group (\..+) can be broken down into two parts:

Part 1: \.
This means find the dot character (literally). Because the dot is used to represent any character, to refer to it literally, you need to escape it. That's what the \ is for.
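A quick way to see the difference between the two dots is to count what each one matches. This is a small sketch in Python's re module (used here only as a stand-in for AR's regex engine), with a made-up sample string.

```python
import re

sample = "Knights (2003).cbr"

# Unescaped dot: matches every single character in the string.
any_char_hits = re.findall(r".", sample)

# Escaped dot: matches only the literal '.' character.
literal_dot_hits = re.findall(r"\.", sample)

print(len(any_char_hits), "hits for the unescaped dot")
print(literal_dot_hits, "for the escaped dot")
```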

Part 2: .+
In this case, the dot is 'any character' as it is not escaped with a \ character. The + sign is called a quantifier (just like the * above); it describes how many characters to look for. + means one or more, while * means zero or more.
In this case, we want to look for at least one character after the literal dot.
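The difference between + and * shows up when nothing follows the dot. A minimal sketch, again using Python's re module as a stand-in and assuming AR's quantifiers behave the same way:

```python
import re

# '+' requires at least one character after the literal dot.
plus_on_ext = re.fullmatch(r"\..+", ".cbr")
plus_on_bare_dot = re.fullmatch(r"\..+", ".")

# '*' also accepts zero characters after the literal dot.
star_on_bare_dot = re.fullmatch(r"\..*", ".")

print("'+' matches '.cbr':", plus_on_ext is not None)
print("'+' matches '.':", plus_on_bare_dot is not None)
print("'*' matches '.':", star_on_bare_dot is not None)
```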

Following the second group is the mysterious \2. This is used to refer to a previously defined group. What we are saying here is: repeat group 2. For example, if group 2 was '.cbr', then \2 would also be '.cbr'.
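To see that \2 really insists on the same text twice, compare a doubled extension with two different ones. This is a sketch in Python's re module with invented file names; it assumes backreferences work in AR as they do in Python.

```python
import re

PATTERN = r"(.*)(\..+)\2"

# Same extension twice: \2 finds a second copy of group 2.
doubled = re.search(PATTERN, "Example (2003).cbr.cbr")

# Two different extensions: there is no repeat for \2 to match.
mixed = re.search(PATTERN, "Example (2003).cbr.cbz")

print("doubled extension matches:", doubled is not None)
print("mixed extensions match:", mixed is not None)
```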

So that's the search string defined.

Finally we come to the replace string. Notice that it is referring to two groups: \1 and \2. These are substituted with whatever values were found in the two groups in the search string. The repeat of group 2 (signified by \2 in the search string) is part of the matched text but is not written back. This effectively removes it from the 'replaced string'.

Taking your first example, the \1 is 'Why The Knights Disappeared (2003)' and \2 is '.cbr'.
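The groups can also be inspected directly. The sketch below uses Python's re module as a stand-in for AR (an assumption, not how AR works internally) and prints what each group captured before rebuilding the name from \1\2.

```python
import re

name = "Why The Knights Disappeared (2003).cbr.cbr"
match = re.search(r"(.*)(\..+)\2", name)

whole = match.group(0)   # everything the pattern matched, repeat included
first = match.group(1)   # text captured by (.*)
second = match.group(2)  # text captured by (\..+)

# The replace string \1\2 rebuilds the name from the two groups only.
rebuilt = match.expand(r"\1\2")

print("group 1:", first)
print("group 2:", second)
print("rebuilt:", rebuilt)
```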

Hope that's semi-understandable. Regular expressions, while being very powerful, are not the easiest things to read and understand.


29/03-15 12:25 - edited 29/03-15 12:27
#5 : 06/04-15 02:19
David
Posts: 64
Reply to #4:
Smile,

Man, that was a fantastic explanation. And thanks so much for that, it will be VERY helpful in the future.


#6 : 07/04-15 22:50
Kim Jensen
Administrator
Posts: 870
Reply to #4:
Very good explanation of regular expressions.


#7 : 29/07-15 17:44
David
Posts: 64
Reply to #4:
Hi Tester, along the same lines, I was wondering if there is a way to do a similar task.
I want to do the same thing, but not just with an extension.

For example:
01 - Superstar - Jamelia - Superstar - Jamelia.mp3
02 - Groove Jet (If This Ain't Love) - Spiller - Groove Jet (If This Ain't Love) - Spiller.mp3
08 - Four To The Floor (Thin White Duke Mix) - Starsailor - Four To The Floor (Thin White Duke Mix) - Starsailor.mp3

I would like to eliminate the duplicate name in the filename.
I'm assuming I can do it in a similar fashion, but I can't figure out the syntax so that AR can understand what I am asking it to do.

Thanks.
A description of the components again would be fantastic if you have the time.

Cheers

I was actually able to achieve it by using 2 replace methods with this:
(.*) - (.*) - (.*)\2
Replace with
\1 - \3 \2

I used the exact same expression in the second method to finish it. For some reason I couldn't remove both pieces with one expression; I could only remove one at a time. I'm sure I just didn't configure my expression properly and there is a way to do it in a single expression. If you know how, I would be most interested. I tried grouping the pieces I wanted to remove, but I guess I don't know the syntax well enough to get AR to understand what I want.

BTW The result after one iteration of the expression yielded the following
01 - Superstar - Jamelia - Superstar.mp3

The second iteration of the exact expression yielded
01 - Superstar - Jamelia
Which is what I want, but I don't know RE well enough to configure my expression to do it in one step, and I would just like to understand the syntax so I can further understand regular expressions.

Thanks


29/07-15 17:44 - edited 29/07-15 17:59
#8 : 29/07-15 23:56
Tester123
Posts: 92
Reply to #7:
Hi David,

A good attempt! For an all-in-one solution, try this 'Replace Method':

Text to be replaced: (\d{2})( - .*)\2
Replace with: \1\2
Use regular expressions: Tick
Apply to: Name

Example first:
01 - Superstar - Jamelia - Superstar - Jamelia.mp3

Try to look for a pattern in the filename that you can exploit and formalize, e.g.
a) a 2 digit number, followed by...
b) <space><hyphen><some characters>, followed by...
c) repetition of the line above

Giving (without the quotes):
a) '01'
b) ' - Superstar - Jamelia'
c) same as b)

In English, we would then say 'take only a) and b), leaving out c)'. Once you have identified this series of steps, that's half the problem solved.

So this is a good place to start (i.e. defining the text to be replaced). Define three parts matching the filename pattern above:
Group 1: (\d{2}) - this is the '2 digit number'
Group 2: ( - .*) - this is the '<space><hyphen><some characters>'
Part 3: \2 - this means 'copy whatever was found in group 2'. Here it is not enclosed in (), so it's not actually a group. However, since we don't plan to refer to it in the replace string, we don't need to group it (grouping is optional in this case; I did it this way as an educational step). If you wanted to refer to it later on, or in the replace string, then it would be necessary to group it.
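The three parts can be pulled out of a match to check that they line up with a), b) and c). This is a rough sketch in Python's re module (assumed to behave like AR's engine for this pattern); the variable names are only illustrative.

```python
import re

# Name part only, as with "Apply to: Name" (no .mp3 extension here).
name = "01 - Superstar - Jamelia - Superstar - Jamelia"
match = re.search(r"(\d{2})( - .*)\2", name)

track_number = match.group(1)       # a) the 2 digit number
title_and_artist = match.group(2)   # b) <space><hyphen><some characters>

# c) the text consumed by the bare \2: whatever the match covers after group 2.
repeat = name[match.end(2):match.end()]

print(repr(track_number))
print(repr(title_and_artist))
print(repr(repeat))
```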

In the Replace section:
We then say, take group 1 and group 2 only (represented by \1\2), or in other words, drop part 3 (the repeat of group 2).

This leaves you with the desired filename.
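To check the method against all three examples, here is a small sketch in Python. It imitates "Apply to: Name" by splitting off the extension first; dedupe_name is a hypothetical helper written for this example, and Python's re module is only assumed to match AR's behaviour.

```python
import os
import re

PATTERN = r"(\d{2})( - .*)\2"

def dedupe_name(filename: str) -> str:
    """Hypothetical helper: apply the pattern to the name part only."""
    stem, ext = os.path.splitext(filename)
    # Keep groups 1 and 2, drop the repeat matched by the bare \2.
    return re.sub(PATTERN, r"\1\2", stem) + ext

files = [
    "01 - Superstar - Jamelia - Superstar - Jamelia.mp3",
    "02 - Groove Jet (If This Ain't Love) - Spiller - Groove Jet (If This Ain't Love) - Spiller.mp3",
    "08 - Four To The Floor (Thin White Duke Mix) - Starsailor - Four To The Floor (Thin White Duke Mix) - Starsailor.mp3",
]

cleaned = [dedupe_name(f) for f in files]
for new_name in cleaned:
    print(new_name)
```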


29/07-15 23:56 - edited 30/07-15 00:17