bash script to recursively sanitize folder and file names

Below is a bash script to recursively sanitize folder and file names. It leaves all numbers, letters, dots, hyphens and underscores untouched, and replaces each run of any other characters with a single underscore.

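#!/bin/bash

# Rename one path so that its last component contains only letters,
# digits, dots, hyphens and underscores.
sanitize() {
  shopt -s extglob  # needed for the +(...) pattern below

  local filename directory filename_clean
  filename=$(basename "$1")
  directory=$(dirname "$1")

  # Replace each run of disallowed characters with a single underscore.
  # The hyphen goes last in the bracket expression so it is taken literally.
  filename_clean="${filename//+([^[:alnum:]._-])/_}"

  if [[ "$filename" != "$filename_clean" ]]
  then
    # --backup=numbered (GNU mv) keeps an existing target as name.~1~ and so on.
    mv -v --backup=numbered "$1" "$directory/$filename_clean"
  fi
}

# Export the function so the bash child started by find can call it.
export -f sanitize
# -depth processes a directory's contents before the directory itself.
find "${1:-.}" -depth -exec bash -c 'sanitize "$1"' _ {} \;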
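
To see what the substitution does before pointing it at real files, the pattern can be tried on a few strings. This is only a minimal sketch: `clean_name` is a hypothetical helper that is not part of the script above, and it assumes bash with `extglob` available.

```bash
#!/bin/bash
# Sketch: the substitution step in isolation.
# clean_name is a hypothetical helper used only for this illustration.
shopt -s extglob

clean_name() {
  local name=$1
  # Replace each run of characters other than letters, digits, dot,
  # underscore and hyphen with a single underscore.
  printf '%s\n' "${name//+([^[:alnum:]._-])/_}"
}

clean_name 'my file (copy).txt'      # prints my_file_copy_.txt
clean_name 'report-2014_final.pdf'   # prints report-2014_final.pdf (unchanged)
```

Which characters count as `[:alnum:]` depends on the current locale, so accented letters may or may not be kept.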

9 thoughts on “bash script to recursively sanitize folder and file names”

  1. Almost 10 years later, this saved my ass today. This is the only solution I could find that properly takes care of the problem of altering nested folder and file names recursively. Thanks a lot!

  2. Related to the above: I managed to put a newline in the filename but the HTML sanitizer from WordPress won’t let me paste it in. :) So the filename above should have a Return just before the final quote. FYI.

  3. touch "_\* uu ~ \( \) .txt "

    Now run your sanitizer on this. Let me know what happens. Personally I get:

    mv: rename ./_* uu ~ ( ) .txt to ./_uu_txt: No such file or directory

    tai@recluse:~/Downloads/tester$ bash -version
    GNU bash, version 3.2.51(1)-release (x86_64-apple-darwin13)
    Copyright (C) 2007 Free Software Foundation, Inc.

    1. Thanks for pointing this out.

      I changed the script to not use “read” anymore, because that was causing the problems.

      Instead I now use find -exec with a shell function.

    2. I also added a --backup option to the mv command to make sure that multiple files that would all be sanitized to the same file name do not get lost in translation. :o)
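
The collision case mentioned in the last reply can be reproduced in a scratch directory. This is a rough sketch that assumes GNU `mv` (the `--backup` option is not available in the BSD/macOS `mv`); the file names and the `tmp` and `listing` variables are made up for the example.

```bash
#!/bin/bash
# Sketch: two names that sanitize to the same target, moved with GNU mv.
tmp=$(mktemp -d)
touch "$tmp/a b" "$tmp/a+b"

mv --backup=numbered "$tmp/a b" "$tmp/a_b"
# The target already exists now, so mv first renames it to a_b.~1~.
mv --backup=numbered "$tmp/a+b" "$tmp/a_b"

listing=$(ls -A "$tmp")
printf '%s\n' "$listing"   # lists a_b and a_b.~1~
rm -rf "$tmp"
```

Neither file is lost: the first one survives as the numbered backup and the second takes the clean name.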
