Below is a bash script that recursively sanitizes folder and file names. It leaves all letters, numbers, dots, hyphens and underscores untouched, and replaces every run of other characters with a single underscore.
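#!/bin/bash
# Usage: pass the directory to clean up as the first argument.
sanitize() {
    shopt -s extglob    # needed for the +(...) pattern below
    local filename directory filename_clean
    filename=$(basename "$1")
    directory=$(dirname "$1")
    # Replace each run of characters other than letters, digits, dot,
    # underscore and hyphen with one underscore. The hyphen goes last in
    # the bracket expression so it is literal and not a range operator.
    filename_clean="${filename//+([^[:alnum:]._-])/_}"
    if [ "$filename" != "$filename_clean" ]
    then
        # --backup=numbered (GNU mv) keeps files that collide on the same clean name
        mv -v --backup=numbered "$1" "$directory/$filename_clean"
    fi
}
export -f sanitize
# -depth handles a directory's contents before the directory itself,
# so renaming a directory never invalidates paths still to be visited.
find "$1" -depth -exec bash -c 'sanitize "$0"' {} \;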
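To preview what a name would turn into before renaming anything, the substitution can be tried on its own. This is only a minimal sketch: `clean_name` is a hypothetical helper that is not part of the script, and the sample names are made up. It uses a bracket expression with the hyphen placed last so that it is taken literally.

```bash
#!/bin/bash
# Hypothetical helper (not part of the script): print the sanitized form
# of a name without touching the file system.
shopt -s extglob    # enables the +(...) "one or more" pattern

clean_name() {
    local name=$1
    # Collapse each run of disallowed characters into one underscore.
    printf '%s\n' "${name//+([^[:alnum:]._-])/_}"
}

clean_name 'my file (copy).txt'      # prints my_file_copy_.txt
clean_name 'already_clean-1.0.txt'   # prints the name unchanged
```

Note that the space and opening parenthesis in the first name form one run and become a single underscore, not two.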
Almost 10 years later, this saved my ass today. This is the only solution I could find that properly takes care of the problem of altering nested folder and file names recursively. Thanks a lot!
Good to know! Thanks for the feedback. :)
So long ago, but still very useful. Thanks for sharing.
Related to the above: I managed to put a newline in the filename but the HTML sanitizer from WordPress won’t let me paste it in. :) So the filename above should have a Return just before the final quote. FYI.
touch "_\* uu ~ \( \) .txt "
Now run your sanitizer on this. Let me know what happens. Personally I get:
mv: rename ./_* uu ~ ( ) .txt to ./_uu_txt: No such file or directory
tai@recluse:~/Downloads/tester$ bash -version
GNU bash, version 3.2.51(1)-release (x86_64-apple-darwin13)
Copyright (C) 2007 Free Software Foundation, Inc.
Thanks for pointing this out.
I changed the script to not use “read” anymore, because that was causing the problems.
Instead I now use find -exec with a shell function.
I also added a --backup option to the mv command to make sure that multiple files that would all be sanitized to the same file name do not get lost in translation. :o)
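To show what the numbered backups look like when two names collide, here is a small sketch. It assumes GNU mv (the mv shipped with macOS/BSD has no --backup option), and the scratch directory and file names are invented for the demonstration.

```bash
#!/bin/bash
# Assumes GNU coreutils mv; BSD/macOS mv does not support --backup.
demo_dir=$(mktemp -d)    # hypothetical scratch directory

echo first  > "$demo_dir/a b.txt"
echo second > "$demo_dir/a?b.txt"

# Both names sanitize to a_b.txt. The second move would overwrite the
# result of the first, so mv renames the existing file to a_b.txt.~1~.
mv --backup=numbered "$demo_dir/a b.txt" "$demo_dir/a_b.txt"
mv --backup=numbered "$demo_dir/a?b.txt" "$demo_dir/a_b.txt"

ls -1 "$demo_dir"
```

Afterwards the directory holds a_b.txt (containing "second") and a_b.txt.~1~ (containing "first"), so nothing is lost, although the backup name itself contains a tilde and would be renamed again on a later run.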
Just wanted to say thanks, just the thing to save me from a directory of filenames which should never have existed!
Thank you, that was helpful!