Converting/removing special characters / adding a text string at the beginning of the filename if not present
Hello, thanks for stopping by to read my questions. I have an ambitious/complex renaming project that I cannot figure out by myself, being new to regular expressions. Multiple operations need to be performed on a set of files.
First, I want to:
+++ Add a phrase to files with a specific filetype +++
I have a bunch of files with two different file extensions.
Files with filetypeB should always get or have a specific phrase at the beginning, filetypeA should not.
The names of the files are something like this:
food_ milk & éclairs 3.filetypeA
SpecificPHRASE_stringofwords[Hello] & über! V2.filetypeB
FiletypeB names should begin with a specific phrase (always the same one, in mixed upper and lower case) followed by an underscore, for example:
SpecificPHRASE_
If a filetypeB file does not have the phrase, I want the renamer to add it. But if it is there, the addition should be skipped. It should never be applied to filetypeA files.
EDIT:
I have figured out a way to check whether the phrase exists and to add it if it is missing, which also removes double entries such as SpecificPHRASE_SpecificPHRASE. I use the Replace method with regular expressions, and I worked out that I can give the batch a file mask filter so that only that filetype is affected.
Text to replace: (SpecificPHRASE_)*
Text to replace with: SpecificPHRASE_
Regular Expression option checked.
So I guess that part is solved, unless there is a better way to do this.
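For anyone who would rather do this step in a script, here is a minimal JavaScript sketch of the same idea. The function names ensurePrefix and applyToTypeB are made up for illustration and are not part of Advanced Renamer. One difference from the pattern above: the expression is anchored with ^, because in a global replace an unanchored (SpecificPHRASE_)* can also match the empty string at other positions in the name.

```javascript
// Hypothetical helper: make sure a base name starts with exactly one copy of the prefix.
function ensurePrefix(baseName, prefix = "SpecificPHRASE_") {
  // Escape regex metacharacters so the prefix is matched literally.
  const escaped = prefix.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // ^ anchors at the start; (?:...)* matches zero or more existing copies.
  const leadingCopies = new RegExp("^(?:" + escaped + ")*");
  // A function replacement avoids any special meaning of "$" in the prefix.
  return baseName.replace(leadingCopies, () => prefix);
}

// Hypothetical wrapper: only touch files with the given extension.
function applyToTypeB(fileName, ext = "filetypeB") {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0 || fileName.slice(dot + 1) !== ext) return fileName;
  return ensurePrefix(fileName.slice(0, dot)) + fileName.slice(dot);
}
```

Called on a filetypeA name, applyToTypeB returns the name unchanged, which plays the same role as the file mask filter.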
+++ Remove/convert special characters +++
The second thing I want to do is remove/convert special characters in all filenames.
The files may contain special characters such as @, !, &, ê, ä, ß, symbols, or letters from other alphabets such as Cyrillic.
1. Any symbols like © or ♥ should be either removed entirely or converted to underscores.
2. I want the filenames to only consist of:
Letters (A-Z, a-z),
numerals 0 to 9,
_ underscores and - dashes.
3. Any punctuation, brackets, and empty spaces should be converted to underscores, but existing dashes should remain untouched.
4. Special letters should be converted into plain letters, for example, é becomes e.
5. Characters like @ should become "At", & should become "And", ° should be "Degrees".
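Points 1, 3 and 5 can be sketched in JavaScript roughly as follows. The WORDS table and the function name replaceSymbols are assumptions for illustration; the table only contains the three examples from point 5. Accented and non-Latin letters (point 4) are deliberately not handled here: this sketch would turn them into underscores, so they would have to be converted to plain letters first.

```javascript
// Hypothetical word substitutions for point 5; extend as needed.
const WORDS = new Map([
  ["@", "At"],
  ["&", "And"],
  ["°", "Degrees"],
]);

function replaceSymbols(baseName) {
  // Point 5: spell out selected symbols as words, one character at a time.
  const spelledOut = Array.from(baseName, ch =>
    WORDS.has(ch) ? WORDS.get(ch) : ch
  ).join("");
  // Points 1 and 3: anything that is not A-Z, a-z, 0-9, _ or - becomes an underscore.
  // Existing dashes are inside the allowed set, so they stay untouched.
  return spelledOut.replace(/[^A-Za-z0-9_-]/gu, "_");
}
```

The order matters: the word substitutions have to run before the catch-all replacement, otherwise @, & and ° would already have been turned into underscores.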
I already figured out how to solve points 3 and 5 using the List Replace method. This feature makes conversions of these characters into plain text very easy.
I also worked out a replacement containing [^\w\d_-] using regular expressions and it keeps only regular letters, numbers, _ and -, but also removes the special letters completely instead of replacing them with plain ones.
But if I put every special character and symbol into a list manually, especially those from other alphabets, it'll take forever. Is there a way to convert the special characters into the plain English alphabet? For example, a Cyrillic Ф would become f, an ê would become e, and so on.
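One common approach in plain JavaScript, shown here only as a sketch, is Unicode normalization: String.prototype.normalize("NFD") splits an accented letter such as ê into a base letter plus a combining mark, and the marks can then be stripped. Letters with no such decomposition (ß, Cyrillic) still need a lookup table; the EXTRA table below is a hypothetical and deliberately tiny example, and toPlainLetters is a made-up name. Whether the script engine in a given renaming tool supports normalize is an assumption that would need checking.

```javascript
// Hypothetical table for letters that NFD does not decompose.
// A real table would need an entry for every such letter you expect.
const EXTRA = new Map([
  ["ß", "ss"],
  ["ø", "o"],
  ["Ø", "O"],
  ["Ф", "F"],
  ["ф", "f"],
]);

function toPlainLetters(text) {
  const stripped = text
    .normalize("NFD")                 // é becomes e + combining acute accent
    .replace(/[\u0300-\u036f]/g, ""); // drop the combining diacritical marks
  // Map the remaining special letters through the lookup table.
  return Array.from(stripped, ch =>
    EXTRA.has(ch) ? EXTRA.get(ch) : ch
  ).join("");
}
```

This keeps the case of the original letter, so an uppercase Ф becomes F; lowercasing afterwards would be a separate step.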
Any help is very much appreciated. I plan on creating a batch file to automate this, so if this takes multiple steps/cannot be put into one single operation, that would be okay.
Edit2: (I am trying to work on/solve the issue as best as I can):
I found a JavaScript snippet that can normalize accented characters here:
https://ricardometring.com/javascript-replace-special-characters
but I have no idea how to get the code to work in Advanced Renamer (simply copying and pasting it into the Script method field did not work, and I don't know how to code in JS). Maybe someone here knows how to get it to work?