Organize image and video files by creation date with exiftool

I use a simple bash script /usr/local/bin/move-cam-files.sh that uses exiftool to move photos and videos from our camera to directories on our file server, organized by the creation date stored in each file's metadata:

#! /bin/bash -x

# $1 = source directory, $2 = file extension, $3 = target subdirectory
move() {
  exiftool -r -v -ext "$2" '-directory<DateTimeOriginal' \
           -d "/home/storage/$3/%Y/%m" "$1"
}

# directory where to read media files from is $1
# if unspecified use current directory
dir=${1:-.}

move "$dir" AVI videos
move "$dir" JPG photos
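
Assuming /usr/local/bin is on the PATH and the camera's memory card is mounted somewhere like /media/camera (both just examples), a typical invocation looks like this:

oliver@basement:~$ move-cam-files.sh /media/camera/DCIM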

Install and maintain Cygwin without Windows admin rights

Sometimes you work on Windows as a restricted user, without admin rights. Like many other good software packages, Cygwin can be installed anyway (see the respective FAQ entry).

These are the steps I used:

  1. Download setup-x86.exe (32-bit) or setup-x86_64.exe (64-bit).
  2. Run it with the “--no-admin” option.
  3. During installation, select the “wget” package.
  4. Create /usr/local/bin/cygwin-setup.sh (for 32-bit omit the “_64”):
    #! /bin/sh
    rm -f setup-x86_64.exe
    wget http://cygwin.com/setup-x86_64.exe
    chmod u+x setup-x86_64.exe
    run ./setup-x86_64.exe --no-admin
    
  5. Make the script executable:
    chmod ugo+x /usr/local/bin/cygwin-setup.sh
  6. Create a copy of the Cygwin terminal shortcut, rename it “Cygwin setup”
  7. Edit the shortcut target, replace
    mintty.exe -i /Cygwin-Terminal.ico -

    with

    mintty.exe -i /Cygwin-Terminal.ico /bin/bash -l -c 'cygwin-setup.sh'

Whenever you want to run the Cygwin installer to install or remove packages, you can just execute the shortcut or run cygwin-setup.sh from the Cygwin command prompt.

Alternatively, you could use the pure command-line tool apt-cyg.
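
For example, once apt-cyg is on the PATH (it is a standalone shell script), installing and removing packages from the Cygwin prompt looks roughly like this:

apt-cyg install vim
apt-cyg remove vim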

Use standard bash for ssh login on shared hosts

Please note: The following instructions are for ssh logins on a remote host. The approach is not suitable for executing remote commands via ssh.

Problem

When you work on a remote Linux or Unix server (via ssh) you sometimes cannot control your login shell and/or its default config file. For example, you might be sharing the same user account on the server with other people or the use of the chsh tool might be locked down.

Maybe the default shell is something like ksh, or, if bash is used, maybe the .bashrc sets vi key bindings. This can be annoying if you are used to standard Linux bash with its default Emacs-style key bindings.

Suggested solution

In these cases you can do the following, assuming bash is installed and in the path on the host:

1) Create /usr/local/bin/sbash.sh (on Windows, use Cygwin):

#! /bin/bash

set -x

if [ "$#" -gt 0 ]; then
  user_at_host="$1"
  shift
  ssh_options="$@"
else
  set +x
  echo "Usage: $(basename "$0") user@host [ssh-options]"
  exit 1
fi

# $ssh_options is intentionally unquoted so that multiple options are split into words
ssh -t $ssh_options \
    "$user_at_host" \
    "bash --rcfile .bashrc.for-remote-user-${USER}"

Make the file executable using something like chmod ugo+x /usr/local/bin/sbash.sh. You can then use it for remote logins like the ssh command, for example:

 
oliver@basement:~$ sbash.sh user@host

The $USER variable in the sbash.sh script will be substituted by the local shell with your local user name, which in this example is “oliver”.

2) On the host, create ~/.bashrc.for-remote-user-USERNAME, where USERNAME is the user name from the ssh client as mentioned above:

user@host$ vim $HOME/.bashrc.for-remote-user-oliver

Make sure this file name matches the --rcfile option in your ssh command.

You can then edit and use this file like a normal .bashrc file, e.g. for setting your favorite environment variables, bash options, aliases, etc.
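
For illustration, such a file might contain something like the following (the settings are just examples, not part of any particular setup):

# ~/.bashrc.for-remote-user-oliver
set -o emacs          # restore the familiar Emacs-style key bindings
export EDITOR=vim     # personal preferences from here on
alias ll='ls -l'
PS1='\u@\h:\w\$ '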

Minimal Debian VM upgraded to wheezy / Jenkins with OpenJDK 7

I have upgraded my minimal Debian VM to the current Debian stable release (“wheezy”). It comes in OVA format and can be deployed to VirtualBox or VMware.

The Debian system JDK, i.e. the JDK that provides the java, javac, etc. commands on the PATH, is still OpenJDK 6, because that is the default-jdk on wheezy and the Jenkins deb packages from jenkins-ci.org depend on it. In particular, this means that the Tomcat / Jenkins process itself is executed by OpenJDK 6.

But my Jenkins installer now also installs OpenJDK 7 and pre-configures it as the default JDK for Jenkins jobs. That means you can now use this Jenkins instance to build your Java 7 projects, as well as older Java projects.

Please follow the step-by-step installation instructions if you want to use the VM. It consists completely of Free / Open Source software. I provide it for download “as is” without any warranty of any kind.

Nightly file server backups to an external hard drive

I use a small headless Debian system as a file server for all family photos, videos, documents, etc. Its hostname is “bubba”. I have recently set it up to run backups to an external hard drive, using cron and rsync.

The external disk is a 500 GB laptop SATA disk in a USB/eSATA enclosure that requires no separate power supply. So far I have only got it to work over USB; somehow eSATA does not work for me on Debian 6 (aka “squeeze”), even though the file server has an eSATA port.

Prerequisites


sudo mkdir /mnt/backup
sudo apt-get install ntfs-3g

NTFS mount/unmount with sudo

I use NTFS as the filesystem on the backup disk because we wanted it to be compatible with MS Windows. The Debian Linux on the file server uses ntfs-3g for mounting the disk read-write. Unfortunately that only works well with root rights, so I configured sudo to permit myself password-less mounting and unmounting of the device.

/etc/sudoers entry
oliver ALL = NOPASSWD: /bin/mount /mnt/backup, \
                       /bin/umount /mnt/backup
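
After adding the entry (preferably with visudo, which checks the syntax for you), you can verify that it works:

sudo -l                  # should list the two commands as NOPASSWD
sudo mount /mnt/backup   # should not ask for a password
sudo umount /mnt/backup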

Nightly rsync

The nightly backup process itself is a simple, non-destructive local rsync command, wrapped in mount and unmount commands to make sure that we can unplug the external disk whenever we want (just not around midnight).

My crontab:


oliver@bubba:~$ crontab -l
0 0 * * * /home/oliver/shared/scripts/backup.sh

The backup.sh script
#! /bin/sh

# unmount first in case the disk is still mounted from a previous run
if mountpoint -q /mnt/backup; then
  sudo umount /mnt/backup
fi

sudo mount /mnt/backup

# only copy if the disk is actually present and could be mounted
if [ $? -eq 0 ]; then

  # non-destructive copy from the file server to the external disk;
  # /home/oliver/backup is a symlink to /mnt/backup (see below)
  rsync -avvih --progress \
    --exclude /downloads \
    --exclude /movies \
    /home/storage/ /home/oliver/backup \
    > /tmp/cron_output.log 2>&1

fi

sudo umount /mnt/backup

Symlinks and fstab

Symlink in my home for convenience:

oliver@bubba:~$ ls -l /home/oliver/backup
lrwxrwxrwx 1 oliver users 11 Aug 7 21:38 /home/oliver/backup -> /mnt/backup/

Entry in /etc/fstab:

oliver@bubba:~$ grep "/mnt/backup" /etc/fstab
/usr/local/share/backup /mnt/backup ntfs-3g defaults 0 0

Device symlink /usr/local/share/backup:

oliver@bubba:~$ ls -l /usr/local/share/backup
lrwxrwxrwx 1 root staff 84 Jun 23 01:51 /usr/local/share/backup -> /dev/disk/by-id/usb-WDC_WD50_00BPVT-00HXZT3_FDC0FD500000000FD0FF61A6103926-0:0-part1

Try it manually


/home/oliver/shared/scripts/backup.sh &
less /tmp/cron_output.log

Room for improvement

The symlink to the device file is the ugliest part of the whole solution. Currently I have to plug the disk directly into a USB port on the file server, because if I connect it via a USB hub, it appears under a different name in /dev/disk/by-id and my symlink no longer works. I would like to use a udev rule instead that automatically creates an identical symlink no matter how the disk is plugged in.
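
A udev rule along the following lines could provide such a stable name. This is only a sketch: the serial string is guessed from the by-id name above and would need to be verified with udevadm info --query=property --name=/dev/sdX.

# /etc/udev/rules.d/99-backup-disk.rules (sketch, serial needs to be verified)
SUBSYSTEM=="block", ENV{DEVTYPE}=="partition", ENV{ID_SERIAL}=="WDC_WD50_00BPVT-00HXZT3_FDC0FD500000000FD0FF61A6103926-0:0", SYMLINK+="backup-disk"

With such a rule in place, /etc/fstab could point at /dev/backup-disk instead of the hand-made symlink.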

I would also like to implement a 2-way backup, so that files we put on the external disk, for example photos from a trip to relatives, will be mirrored to the file server. It should be just another rsync command going in the opposite direction.
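
That could be as simple as something like this, with a placeholder directory name:

rsync -avih /mnt/backup/inbox/ /home/storage/inbox/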

Maybe I would also like the backup process to start right away when the disk is plugged in, in addition to the nightly cron job. This would probably require another udev rule.

Uncertain future for Excito Bubba home servers

I own an Excito Bubba/2 file and print server, running Debian Squeeze. Mostly I am quite happy with it.

Recently the CTO of the Swedish manufacturer announced that Excito is shifting focus: the Bubba/3 product is no longer marketed on the main excito.com website, but is being sold off at discount prices in their web store.

This shift seems to be a logical step, given that the original four founders of the company split up a few years ago and that Excito has been struggling to support the versatile Debian-based Bubba servers.

Tor Krill and PA Nilsson, the two founders who left Excito a while ago, formed OpenProducts and were planning to take over support of the B3, but recently decided to cancel that takeover.

It is uncertain if Excito’s Bubba product line and its customized Debian distribution will survive. Open-sourcing their proprietary Debian packages would certainly be nice. I have tried to initiate a discussion on the Excito forums about this.

Sqoop daily Oracle data into Hive table partition

The following bash script can be used to import Oracle records into a Hive table, partitioned by date. It uses Sqoop. Both Hive and Sqoop are part of typical Hadoop distributions, like the Hortonworks Sandbox, for example.

#!/bin/bash

upper() {
  echo "$1" | tr '[:lower:]' '[:upper:]'
}

if [ $# -ge 4 ]; then
  schema=$(upper "$1")
  table=$(upper "$2")
  column_to_split_by=$(upper "$3")
  date_column=$(upper "$4")
  date_value="${5:-$(date +%Y-%m-%d)}"
else
  echo
  echo "Usage: $(basename "$0") schema table column-to-split-by date-column [YYYY-MM-DD]"
  echo
  echo "Imports all records where the value of date-column is \$date_value from"
  echo "Oracle table \$schema.\$table as a Hive table partition."
  echo "Hadoop will split the import job based on the column-to-split-by."
  echo "* The table must have the columns specified as column-to-split-by and date-column."
  echo "* The column-to-split-by must be of finer granularity than date-column, ideally unique."
  echo "* The date_value must be in YYYY-MM-DD format."
  echo "* If date_value is unspecified, the current date will be used."
  exit 1
fi

echo "schema = $schema"
echo "table = $table"
echo "column_to_split_by = $column_to_split_by"
echo "date_column = $date_column"
echo "date_value = $date_value"

# we have to drop the partition, because --hive-overwrite does not seem to do it
hive -e "use $schema; alter table $table drop if exists partition($date_column='$date_value');"

# get all column names except the date/partition column from the Oracle data dictionary
columns=$( \
sqoop eval \
--options-file /usr/local/etc/sqoop-options.txt \
--query "select column_name from all_tab_columns where table_name = '$table'" \
| tr -d " |" \
| grep -Ev "\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-|COLUMN_NAME|$date_column" \
| tr '\n' ',' \
| sed -e 's/\,$//'
)

query="select $columns from $schema.$table \
       where $date_column = to_date('$date_value', 'YYYY-MM-DD') \
       and \$CONDITIONS"

echo "query = $query"

sqoop import \
--options-file "/usr/local/etc/sqoop-options.txt" \
--query "$query" \
--split-by "$column_to_split_by" \
--target-dir "$schema.$table" \
--hive-import \
--hive-overwrite \
--hive-table "$schema.$table" \
--hive-partition-key "$date_column" \
--hive-partition-value "$date_value" \
--outdir $HOME/java

JDBC connection details

Put them into /usr/local/etc/sqoop-options.txt, in a format like this:

--connect
jdbc:oracle:thin:@hostname:port:SID
--username
oracle_username
--password
oracle_password
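
As an example, assuming the script above is saved as sqoop-daily-import.sh, an import for a hypothetical table SALES.ORDERS, split by its primary key ORDER_ID and partitioned by ORDER_DATE, would be started like this (all names are made up for illustration):

./sqoop-daily-import.sh SALES ORDERS ORDER_ID ORDER_DATE 2013-08-01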

Manage “recommended” dependencies with apt-get, debfoster and a custom script

I use apt-get to install Debian packages. By default apt-get will also install all packages that your desired package depends on or that it recommends.

I find that recommended dependencies are often not actually necessary. I use debfoster to carefully review and selectively remove them, to keep my system light and clean. My approach requires this line in /etc/debfoster.conf:

UseRecommends = no

With this setting, debfoster will ignore recommended dependencies and allow you to decide individually if you want to keep them.

Disclaimer

This approach only makes sense if you know exactly what you are doing. Sometimes the removal of “recommended” dependencies can actually break functionality. If in doubt, tell debfoster to keep (Y) the respective packages or skip (s) the decision.

The prune (p) option offered by debfoster is the most drastic removal type and should be used with extreme caution.

Reinstall recommended dependencies

The apt-rdepends command can help you find out which packages recommend a given package (replace PACKAGE with the name of the package you are interested in):

sudo apt-get install apt-rdepends
apt-rdepends -pr --state-show Installed --state-follow Installed --show Recommends PACKAGE

If you ever remove too much, you can reinstall all dependencies (including the recommended ones) of a package using the following script. Save it for example as /usr/local/bin/install-dependencies.sh and use chmod ugo+x to make it executable.

#!/bin/sh

if [ $# -ne 1 ]; then
  echo "Usage: $(basename $0) package"
  exit 1
fi

package="$1"
header="Package $package depends on:"

# ask debfoster (with recommends enabled) for the dependency list of the package,
# then strip everything up to and including the header line
df_output=$(debfoster -o UseRecommends=true -d "$package")
pkg_list=${df_output#*$header}

if [ "$pkg_list" != "$df_output" ]; then
  sudo apt-get install $pkg_list
else
  echo $df_output
fi
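
For example, to reinstall everything that the gimp package (just an example) depends on or recommends:

install-dependencies.sh gimp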

Reference info