Remove all duplicates, but keep the first occurrence

Advanced Renamer forum
#1 : 07/08-24 04:59
The Jackal
Posts: 6
I want to keep the first occurrence of each duplicated word while removing the subsequent duplicates.
Input:
Awad C - Gallery name Awad C Awad C C 2024-7-8
Result:
Awad C - Gallery name 2024-7-8

I tried these two:
\b(\b\w+\b)(?=.*\b\1\b)
\b(\w+)\b(?=.*\b\1\b)
but the result is the opposite: they remove the earlier occurrences and keep only the last one.
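
The lookahead explains the reversed result: (?=.*\b\1\b) only lets a word match when the same word appears again further to the right, so every occurrence except the last gets removed. Below is a minimal JavaScript sketch of that behavior, assuming the pattern behaves the same way in JavaScript's regex engine as in Advanced Renamer's. The function name is made up for illustration.

```javascript
// Illustration only: removeWordsWithLaterDuplicate is a hypothetical helper,
// not an Advanced Renamer function.
function removeWordsWithLaterDuplicate(name) {
  // Matches a whole word only if the same word occurs again further right.
  const pattern = /\b(\w+)\b(?=.*\b\1\b)/g;
  // Collapse the leftover spaces so the result is easier to read.
  return name.replace(pattern, "").replace(/ {2,}/g, " ").trim();
}

const sampleName = "Awad C - Gallery name Awad C Awad C C 2024-7-8";
console.log(removeWordsWithLaterDuplicate(sampleName));
```

In the printed name only the last "Awad" and the last "C" survive, which is the opposite of the wanted result.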

I did a quick search in the forum but no luck:
https://www.advancedrenamer.com/forum_thread?forum_id=14407
https://www.advancedrenamer.com/forum_thread?forum_id=3052


07/08-24 04:59
#2 : 07/08-24 06:18
Delta Foxtrot
Posts: 285
Reply to #1:

Hi Jack,

Without more information it's hard to say. How about a more representative sample of filenames?

By that I mean, it's easy to write something that will work for that particular filename, but probably won't work for anything else.

Best,
DF


07/08-24 06:18
#3 : 07/08-24 07:52
Delta Foxtrot
Posts: 285
Reply to #2:

Hello again,

Actually this *might* do it for you, as long as the "words" you want to keep are the first two in the name and the repeats come after some text you want to keep. You may need to remove the " - " if your files don't all have that.

REPLACE method: (remove all quotes)
Replace: "^([^ ]+) ([^ ]+) - (.*?)(\1|\2) ?(\1|\2)? ?(\1|\2)? ?(\1|\2)? ?(\1|\2)?"
Replace with: "$1 $2 - $3"
Case doesn't matter
Use regular expressions checked, of course
Apply to Name

All the extra "(\1|\2)?" elements may be overkill, depending on the structure of your filenames, but they don't hurt anything.
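
To try the pattern outside the program, here is a rough JavaScript equivalent of the Replace / Replace with pair above. It assumes JavaScript's regex engine treats the pattern the same way Advanced Renamer does; the "i" flag stands in for "Case doesn't matter", and the function name is hypothetical.

```javascript
// Sketch of the Replace / Replace-with pair as a JavaScript call.
// applyFirstTwoWordsPattern is a made-up name used only for this example.
function applyFirstTwoWordsPattern(name) {
  // Groups 1 and 2 are the first two words; group 3 is the text to keep.
  // The trailing (\1|\2)? elements swallow up to four more repeats.
  const pattern =
    /^([^ ]+) ([^ ]+) - (.*?)(\1|\2) ?(\1|\2)? ?(\1|\2)? ?(\1|\2)? ?(\1|\2)?/i;
  return name.replace(pattern, "$1 $2 - $3");
}

console.log(applyFirstTwoWordsPattern("Awad C - Gallery name Awad C Awad C C 2024-7-8"));
```

In JavaScript the result keeps two spaces before the date, because group 3 ends with a space and the date is preceded by another one.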

https://drive.google.com/file/d/1eBIYbwTG6WFoVQuhOs1c2H1Y4OUZmN4E/view?usp=sharing

Let us know how that works.

Best,
DF


07/08-24 07:52 - edited 07/08-24 07:57
#4 : 07/08-24 09:07
Delta Foxtrot
Posts: 285
Reply to #3:

This seems to work too, maybe more generally:

R:^([^ ]+) ([^ ]+)(.*?)((\1|\2) ?)+
RW:$1 $2$3
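
A quick way to check this shorter pattern is to run it in JavaScript, assuming the regex engine behaves like Advanced Renamer's. The function name is hypothetical, and the match here is case-sensitive because no case option is stated for this version.

```javascript
// Sketch of R: / RW: above as a JavaScript replace.
// applyRepeatedGroupPattern is a made-up name used only for this example.
function applyRepeatedGroupPattern(name) {
  // ((\1|\2) ?)+ removes one unbroken run of the first two words.
  const pattern = /^([^ ]+) ([^ ]+)(.*?)((\1|\2) ?)+/;
  return name.replace(pattern, "$1 $2$3");
}

console.log(applyRepeatedGroupPattern("Awad C - Gallery name Awad C Awad C C 2024-7-8"));
console.log(applyRepeatedGroupPattern("Awad C - Gallery name Awad C extra Awad 2024"));
```

The second call shows the limit of the approach: only the first run of repeats is removed, so an "Awad" that comes back after other text is left in place.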


07/08-24 09:07 - edited 07/08-24 09:08
#5 : 07/08-24 10:24
Miguel
Posts: 136
Reply to #1:
Hi Jackal.
Try this:

Replace: (..+) \1 . (copy the dot)
Replace with: \2

It works with your examples and with some of mine.

https://drive.google.com/file/d/1rDdY_TfTSdRsiADFBoppwmE25ySaduhv/view?usp=drive_link

Miguel


07/08-24 10:24
#6 : 07/08-24 12:11
Delta Foxtrot
Posts: 285
Reply to #5:

Hey Miguel, great work as usual my friend.

I got to thinking about how many times I've seen this question asked, and how not all replace methods work with all filename structures, and I decided I should just write a little script that will eliminate all duplicate words as long as you consider anything between spaces a word. Ten lines, what was I waiting for?

EDIT: ORIGINAL SCRIPT REPLACED:

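// Collapse runs of spaces so split(" ") yields no empty "words".
const ttfn = item.name.replace( /[ ]{2,}/g, " " );
const tfn = ttfn.split(" ");
const newfn = [];  // words to keep, in their original case
const ufn = [];    // upper-cased copies of the words already seen
for ( let j = 0; j < tfn.length; j++ ) {
  // Keep a word only the first time it appears, ignoring case.
  if ( !ufn.includes( tfn[j].toUpperCase() ) ) {
    newfn.push( tfn[j] );
    ufn.push( tfn[j].toUpperCase() );
  }
}
return newfn.join(" ");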

END EDIT

It ignores case too, so awad c and Awad C are treated as the same words, and any instance after the first is eliminated. That's why it's 10 lines instead of four. :) And it doesn't disturb anything that isn't a duplicate word.
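
For testing the idea outside Advanced Renamer, the same logic can be sketched as a standalone function. This is an assumption-laden variant rather than the script itself: it takes the name as a parameter instead of reading item.name, uses a Set for the seen words, and also drops leading or trailing spaces. The name dedupeWords is hypothetical.

```javascript
// Standalone sketch of the duplicate-word removal; dedupeWords is a
// hypothetical helper, not part of Advanced Renamer's scripting API.
function dedupeWords(name) {
  const seen = new Set(); // upper-cased words already kept
  return name
    .split(/ +/) // one or more spaces separate words
    .filter((word) => {
      if (word === "") return false; // produced by leading/trailing spaces
      const key = word.toUpperCase();
      if (seen.has(key)) return false; // later duplicate: drop it
      seen.add(key);
      return true; // first occurrence: keep it
    })
    .join(" ");
}

console.log(dedupeWords("Awad C - Gallery name Awad C Awad C C 2024-7-8"));
```

Anything between spaces counts as a word here too, so "2024-7-8" is treated as one word and "-" as another.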

Best,
DF

EDIT: not sure why that came out partly in italics! END EDIT


07/08-24 12:11 - edited 08/08-24 13:44
#7 : 08/08-24 12:13
Miguel
Posts: 136
Reply to #6:
Hi DF.

You are right. This is a recurring question. Your script idea is fantastic. Unfortunately, I have tried it and it doesn't work.
I'm posting a screenshot:
https://drive.google.com/file/d/1m3U3VkJz8GJccjhgq8E70VUPMKcpmQBt/view?usp=drive_link

Miguel



08/08-24 12:13
#8 : 08/08-24 13:03
Delta Foxtrot
Posts: 285
Reply to #7:

EDIT: It did the same thing again! I don't know WTH is going on, but this forum software is sabotaging me. The REALLY weird thing is that, as I'm editing this post, IT IS CORRECT IN THE EDIT SCREEN! And I just noticed that it's not letting me post the combination "[" plus "i" plus "]" at all! I'm going to add a screenshot; just edit in the code that must not be written on this forum and it will work, I promise! :) It's just 3 spots, each right after "tfn": one in the "includes" call in the if statement and one in each of the two "push" methods.

https://drive.google.com/file/d/1793J2723_BhMkwkoZf_luOvN_v5R1myX/view?usp=sharing

END EDIT

Wow, Miguel, I'm not really sure what happened between copying and pasting that script, but somehow it lost some array element information (the push methods should have had "tfn[ i ]" in them, not just "tfn").

I swear, when I copied them they were there! :) I wonder if I fumbled some keys when pasting. There was that weird half-italics thing too, and now that I look at it, it started at the location of the first [ i ] that was missing. Must be the code for "halt and catch fire" on this forum! Anyway, here's the real thing, and I'm going to correct the one above too:

EDIT: DELETED... SEE NEXT POST

------------------------------------
(I'm actually glad this happened... for some reason it made me realize that I should check for multiple spaces between words and if found make them single spaces.)

Best, and thanks!
DF


08/08-24 13:03 - edited 08/08-24 13:42
#9 : 08/08-24 13:21
Delta Foxtrot
Posts: 285
Reply to #8:

So "square-bracket i square-bracket" obviously is the italics command on here.
Of course I picked a character combination that is self-destructive! :)
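
What seems to have happened to the pasted script can be imitated with a tiny stand-in for the forum's formatter. How the forum software really works is not known here; the function below is purely an assumption that it strips italic tags from the posted text.

```javascript
// Hypothetical stand-in for the forum's formatter: it removes the opening
// and closing italic tags, which is what the posts in this thread suggest.
function stripItalicTags(text) {
  return text.replace(/\[\/?i\]/gi, "");
}

console.log(stripItalicTags("newfn.push( tfn[i] );")); // index is lost
console.log(stripItalicTags("newfn.push( tfn[j] );")); // index survives
```

Under that assumption, a loop counter named j (or any name other than i) passes through untouched.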

EDIT:
Here's a new version, complete with (hopefully) italic-delete and multi-space removal:
------------------------------------------------------
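// Collapse runs of spaces so split(" ") yields no empty "words".
const ttfn = item.name.replace( /[ ]{2,}/g, " " );
const tfn = ttfn.split(" ");
const newfn = [];  // words to keep, in their original case
const ufn = [];    // upper-cased copies of the words already seen
for ( let j = 0; j < tfn.length; j++ ) {
  // Keep a word only the first time it appears, ignoring case.
  if ( !ufn.includes( tfn[j].toUpperCase() ) ) {
    newfn.push( tfn[j] );
    ufn.push( tfn[j].toUpperCase() );
  }
}
return newfn.join(" ");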
------------------------------------------------------
Hope this works THIS TIME!

THANKS Miguel!


08/08-24 13:21 - edited 08/08-24 13:45
#10 : 08/08-24 14:01
Miguel
Posts: 136
Reply to #9:
Hi DF.
Boom!!
Both work like a charm!! And with multiple combinations. You are fantastic.

https://drive.google.com/file/d/1IrNNAM1NFabhAJpeuiLNAMbM1sr_stJx/view?usp=drive_link

In your first script the [ i ] was lost. The second works seamlessly.

Greetings.

Miguel

EDIT: ITALICS HAVE APPEARED. I THINK THAT'S WHERE THE WHOLE PROBLEM LIES.
EDIT 2: I just discovered that if you type bracket+I+bracket, the forum deletes them and activates italics. Adding spaces inside the brackets, as in [ i ], does not.


08/08-24 14:01 - edited 08/08-24 14:10
#11 : 08/08-24 14:25
Delta Foxtrot
Posts: 285
Reply to #10:

Well, Miguel, I'm glad that mystery is solved! Leave it to us bloodhounds, right? :)

Did you see I changed it to use [j] as the counter and delete multiple spaces when present? Now I can paste it here in the forum.

It's very common to use i as a counter in loops like in this script; I'm surprised I haven't run into this problem before on here...

Best regards my friend,
DF


08/08-24 14:25
#12 : 09/08-24 22:08
The Jackal
Posts: 6
Reply to #10:

Guys, I wanted to say a huge thank you for all your help with the script. You guys are lifesavers!
Unfortunately, I'm running into a little trouble getting it to work. I've attached an image so you can see exactly what's happening.
https://i.imgur.com/mJBNf2P.jpeg

I've also shared a link to a Google Doc where you could kindly write out the full script.
https://docs.google.com/document/d/1Vq-dTaPMlamjzlyF3yCauhqNj1XxZD3kNcXSBsrxQqc/edit

Any help you can give would be really appreciated.
Thanks again!


09/08-24 22:08
#13 : 09/08-24 22:26
Miguel
Posts: 136
Reply to #12:
Hi Jackal.
The last line of your script has an error.
It should be:
return newfn.join(" ") ;

You have written:
return newfn.ioin(" ") ;
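
In standard JavaScript a misspelled method name like this is simply undefined on the array, so calling it throws a TypeError. The sketch below shows that in isolation; the helper name is made up, and how Advanced Renamer reports such an error is not covered here.

```javascript
// Hypothetical helper: calls the named array method with " " and returns
// either the result or the name of the error that was thrown.
function callArrayMethod(words, methodName) {
  try {
    return words[methodName](" ");
  } catch (err) {
    return err.name;
  }
}

console.log(callArrayMethod(["Awad", "C"], "join")); // correct spelling
console.log(callArrayMethod(["Awad", "C"], "ioin")); // the typo
```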

Miguel


09/08-24 22:26