How to force capitalize a subpattern group?

Advanced Renamer forum
#1 : 02/05-24 12:38
James E McBride
Posts: 17
Good morning!
My ultimate goal is to create a REPLACE method (or, I MAY need a SCRIPT method to do it) to identify Roman numerals (in either lower, upper or mixed case) in text and capitalize only the Roman numerals section(s). I would probably like to do something similar with common acronyms as well.

My question is this: in another forum about REGEX, I came upon a replacement pattern for capitalizing text, which was:
\U\1\E \2

with
\U = begin force capitalization
\E = end force capitalization
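To make the intended effect concrete, here is a minimal Python sketch (not ARen syntax). Python's `re` module has no `\U`/`\E` in replacement strings, so a callback does the uppercasing instead. The two-word search pattern and the function name `upper_first_group` are hypothetical, chosen only for illustration.

```python
import re

def upper_first_group(text: str) -> str:
    """Mimic the replacement "\\U\\1\\E \\2": group 1 uppercased, group 2 untouched."""
    # Hypothetical search pattern: two words separated by a single space.
    pattern = r"(\w+) (\w+)"
    return re.sub(
        pattern,
        lambda m: m.group(1).upper() + " " + m.group(2),
        text,
        count=1,  # only the first pair, to keep the example simple
    )

print(upper_first_group("henry viii"))
```

The point is only that the case change applies to the first group and stops before the second.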

Does the REGEX system in AR have a similar item?

Thanks for the help!
Actually, if you can help with a good REGEX for Roman numerals, I would appreciate it as well!

--Jim


02/05-24 12:38
#2 : 02/05-24 22:15
Delta Foxtrot
Posts: 285
Reply to #1:

Hi James, good day to you, :)

I can tell you that \U<number> works for capitalizing capturing groups. I've never tried the \E variant, and I don't even find it in my PowerGREP regex reference, which is pretty exhaustive. I'm not saying it doesn't exist, just that I've never seen it. Do you know what flavor of regex engine it works with?

If you've done any exploration on the extrawebs you must have found this regex expression:
^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$

Here's a fair summation of using regex for Roman numeral validation including jscript example:
https://www.geeksforgeeks.org/validating-roman-numerals-using-regular-expression/
And another page with possibly a better explanation of the expression used:
https://codepal.ai/regex-explainer/query/jDiAUwFz/roman-numeral-regex-pattern-explained

I did a small test and can verify that, at least for my small sample, the expression worked. The one exception was the curveball I threw at it: 4,000 written as MMMM. I did that because I don't know how to write thousands the other way (4,000 would be an IV with a bar above it, which is not something you'd be able to do in ARen or in Windows filenames unless there are Unicode characters for that). I changed the first part of the expression into a capturing group and increased its maximum match count to 4, so M{0,3} became (M{0,4}), and here's the result:
https://drive.google.com/file/d/1ZwaCcgFkbI22ugbtGBPSW8J9LAoRkFIK/view?usp=sharing
(Oh, I tried the replace string as \U\1\2\3\4\E and it did not work)

As you probably realize, if the Roman numerals are mixed into other text you'll have to modify the expression. As is, it only works on a filename composed solely of the numeral, because of the ^ and $ anchors. If all your numerals are delimited by spaces, or by some other characters that could be put into a character class, you should be in pretty good shape. If you want to give examples of filenames, maybe we could help.
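As a rough illustration of that kind of modification, here is a Python sketch (not ARen syntax) that swaps the ^ and $ anchors for \b word boundaries. It assumes the numerals are delimited by spaces or punctuation. The lookahead, the case-insensitive flag, and the name `capitalize_roman` are my own additions for the sketch, and it uses the M{0,4} variant from my test above.

```python
import re

# Strict Roman-numeral pattern with word boundaries instead of ^ and $.
# The lookahead only lets the pattern start on a word made up entirely of
# Roman-numeral letters, so words such as "Chapter" are skipped early.
ROMAN = re.compile(
    r"\b(?=[MDCLXVI]+\b)"
    r"M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b",
    re.IGNORECASE,
)

def capitalize_roman(name: str) -> str:
    """Uppercase every well-formed Roman numeral that stands as its own word."""
    # Every part of the pattern is optional, so it can also match an empty
    # string at a word start; uppercasing an empty string changes nothing.
    return ROMAN.sub(lambda m: m.group(0).upper(), name)

print(capitalize_roman("rocky iv - part ii"))
print(capitalize_roman("civil war"))
```

One caveat: ordinary words that happen to be well-formed numerals, such as "mix" (1009), still get capitalized, so a list of exceptions may be needed for real filenames.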

Good luck, let us know how it goes,
DF


02/05-24 22:15 - edited 02/05-24 22:35
#3 : 03/05-24 14:24
Miguel
Posts: 136
Reply to #1:
Hi.
I have been testing by trial and error, and this is the most accurate result I could get with some examples. It lets me capitalize Roman numerals within a string of text.
I don't know whether it could produce errors with other examples.

REPLACE: \b([ivxlcdm]+)\b

REPLACE WITH: \U1

Use regular expression: Yes

https://i.ibb.co/Ky3DhCY/Captura-03-05-2024-41s.png
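For anyone who wants to check the same idea outside ARen, here is a small Python sketch of this pattern. Python has no \U in replacement strings, so a callback does the uppercasing. The case-insensitive flag and the name `capitalize_loose` are assumptions added for the sketch, so that mixed-case input is also caught.

```python
import re

# Same loose pattern as above: any whole word built only from the letters
# i, v, x, l, c, d, m. It does not check that the word is a valid numeral.
LOOSE = re.compile(r"\b([ivxlcdm]+)\b", re.IGNORECASE)

def capitalize_loose(name: str) -> str:
    """Uppercase every word made up solely of Roman-numeral letters."""
    return LOOSE.sub(lambda m: m.group(1).upper(), name)

print(capitalize_loose("henry viii"))
print(capitalize_loose("civil war ii"))
```

This shows the kind of error to watch for: normal words such as "civil", "mild" or "did" contain only those letters, so they are capitalized too.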

Miguel





03/05-24 14:24 - edited 03/05-24 16:05