My current IntelliJ code inspections profile for Java projects

I recently exported my IntelliJ code inspections profile for Java projects from IntelliJ Community Edition to share it with whoever might be interested.

These highly customized code inspections are based on industry standards like the official Java code conventions, various best practices from the Java community, and my experience over many years as a Java developer and team lead trying to ensure code quality and maintainability.

Feel free to download and save it as a local XML file. Then you can import it into any of your IntelliJ projects via Analyze – Inspect Code – Inspection profile – […] button – Import:

Blogging in the open

In many ways it is a step in the right direction that many companies use internal “collaboration portals” with chat forums, wikis, etc. But it also segregates these corporate communities from the public web. Certainly, company-internal proprietary knowledge and business discussions belong behind the firewall, but that is only a fraction of the knowledge sharing going on at work.

It is unfortunate that even non-proprietary conversations about Open Source tools and technologies get separated from the public internet. I think at least the so-called “personal blogs” that many corporate collaboration portals offer belong in the worldwide “blogosphere”, not on intranets or other “walled garden” systems.

Judging from my own experience with oldoldo.wordpress.com, bloggers and professional software developers actually benefit from using public blogging services. Platforms like WordPress are very good at making articles findable via search engines like Google, and they support categorization, attachment management, and feed update notifications to Twitter, LinkedIn, etc. On the whole, a public blog generates more useful feedback and discussion than limiting the same content to the people who happen to work for the same employer.

To directly notify coworkers about new posts on my blog, I simply post the link (i.e. the hopefully permanent URL) on my employer's collaboration site, not the content itself.

And if I ever change jobs, I will still have access to my own posts. That certainly helps an older guy like me, with a weak memory for details … :)

Uncertain future for Excito Bubba home servers

I own an Excito Bubba/2 file and print server, running Debian Squeeze. Mostly I am quite happy with it.

Recently, the CTO of the Swedish manufacturer announced that Excito is shifting focus: the Bubba/3 product is no longer marketed on the main excito.com website, but is being sold off at discount prices on their web store.

This shift seems to be a logical step, given the split of the company's four original founders a few years ago and Excito's ongoing struggle to support the versatile Debian-based Bubba servers.

Tor Krill and PA Nilsson, the two founders who left Excito a while ago, formed OpenProducts and were planning to take over support of the B3, but recently decided to cancel that takeover.

It is uncertain if Excito’s Bubba product line and its customized Debian distribution will survive. Open-sourcing their proprietary Debian packages would certainly be nice. I have tried to initiate a discussion on the Excito forums about this.

Sqoop daily Oracle data into Hive table partition

The following bash script can be used to import Oracle records into a Hive table, partitioned by date. It uses Sqoop. Both Hive and Sqoop are part of typical Hadoop distributions, like the Hortonworks Sandbox, for example.

#!/bin/bash

upper() {
  echo "$1" | tr '[:lower:]' '[:upper:]'
}

if [ $# -ge 4 ]; then
  schema=$(upper "$1")
  table=$(upper "$2")
  column_to_split_by=$(upper "$3")
  date_column=$(upper "$4")
  date_value="${5:-$(date +%Y-%m-%d)}"  # default to the current date
else
  echo
  echo "Usage: $(basename "$0") schema table column-to-split-by date-column [YYYY-MM-DD]"
  echo
  echo "Imports all records where value of date-column is \$date_value from"
  echo "Oracle table \$schema.\$table as a Hive table partition."
  echo "Hadoop will split the import job based on the column-to-split-by."
  echo "* The table must have the columns specified as column-to-split-by and date-column."
  echo "* The column-to-split-by must be finer granularity than date-column, ideally unique."
  echo "* The date_value must be in YYYY-MM-DD format."
  echo "* If date_value is unspecified, the current date will be used."
  exit 1
fi

echo "schema = $schema"
echo "table = $table"
echo "column_to_split_by = $column_to_split_by"
echo "date_column = $date_column"
echo "date_value = $date_value"

# we have to drop the partition, because --hive-overwrite does not seem to do it
hive -e "use $schema; alter table $table drop if exists partition($date_column='$date_value');"

columns=$( \
sqoop eval \
--options-file /usr/local/etc/sqoop-options.txt \
--query "select column_name from all_tab_columns where table_name = '$table'" \
| tr -d " |" \
| grep -Ev "^-+$|COLUMN_NAME|$date_column" \
| tr '\n' ',' \
| sed -e 's/\,$//'
)

query="select $columns from $schema.$table \
       where $date_column = to_date('$date_value', 'YYYY-MM-DD') \
       and \$CONDITIONS"

echo "query = $query"

sqoop import \
--options-file "/usr/local/etc/sqoop-options.txt" \
--query "$query" \
--split-by "$column_to_split_by" \
--target-dir "$schema.$table" \
--hive-import \
--hive-overwrite \
--hive-table "$schema.$table" \
--hive-partition-key "$date_column" \
--hive-partition-value "$date_value" \
--outdir "$HOME/java"

JDBC connection details

Put them into /usr/local/etc/sqoop-options.txt, in a format like this:

--connect
jdbc:oracle:thin:@hostname:port:sid
--username
oracle_username
--password
oracle_password
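The --connect value uses the standard Oracle thin-driver JDBC URL format, where the last component is the database SID. As a small sketch of how such a URL is assembled (the host name, port 1521 and SID "ORCL" below are just illustrative placeholders, not values from this setup):

```java
public class OracleUrl {

    // Standard Oracle thin-driver JDBC URL: jdbc:oracle:thin:@hostname:port:sid
    static String thinUrl(String hostname, int port, String sid) {
        return "jdbc:oracle:thin:@" + hostname + ":" + port + ":" + sid;
    }

    public static void main(String[] args) {
        // Example with placeholder values; 1521 is the default Oracle listener port
        System.out.println(thinUrl("dbhost.example.com", 1521, "ORCL"));
    }
}
```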

Make apps4halifax – Intro

Halifax Regional Municipality is introducing “apps4halifax”, its first-ever Open Data App Contest. Similar initiatives have been successful in Ottawa, Edmonton and many other cities worldwide.

Residents can submit ideas or code apps using the HRM Open Data catalog. The best submissions may win cash prizes and awards.

The Open Data catalog is implemented using the Socrata Open Data Portal. The SODA 2.0 RESTful web service API allows developers to query and consume live data from the public datasets.

The Socrata developer documentation explains how to query the endpoints, the supported datatypes, and the response formats.

The currently available datasets include Crime occurrences, Building types, Buildings, Bus Routes, Bus Stops, Bylaw Areas, Civic Addresses, Community Boundaries, Park Recreation Features, Parks, Polling Districts, Streets, Trails, Transit Times, Waste collections and Zoning Boundaries.

You can construct web service query URLs of the form http://www.halifaxopendata.ca/resource/RESOURCE-ID.json, optionally with simple equality filters on columns appended as query parameters.

You can determine the RESOURCE-ID for a dataset like this:

  1. Go to https://www.halifaxopendata.ca/
  2. Click on a dataset name
  3. Click the “Export” button
  4. Under “Download As”, copy one of the links, e.g. JSON
  5. The resource id is the part of the URL between “views/” and “/rows.json”

As an extremely useful example, you could query fun things like all HRM garbage collections occurring on Wednesdays:

http://www.halifaxopendata.ca/resource/ga7p-4mik.json?collect='WEDNESDAY'

As a simple and quick way to create web pages that interact with these web services, you could use jQuery and its getJSON() function.
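Outside the browser, the same endpoint can also be queried from plain Java using only the JDK. A minimal sketch (the resource id ga7p-4mik and the collect column are taken from the example above; everything else is illustrative):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;

public class SodaQuery {

    // Builds a query URL of the form http://.../resource/RESOURCE-ID.json?COLUMN='value'
    static String queryUrl(String resourceId, String column, String value) throws Exception {
        return "http://www.halifaxopendata.ca/resource/" + resourceId + ".json?"
                + column + "=" + URLEncoder.encode("'" + value + "'", "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        final String url = queryUrl("ga7p-4mik", "collect", "WEDNESDAY");
        System.out.println(url);

        if (args.length > 0) {  // pass any argument to actually fetch the JSON
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(new URL(url).openStream(), "UTF-8"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }
}
```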

I will probably follow up with more posts on this topic soon.

Transparently improve Java 7 mime-type recognition with Apache Tika

Java 7 comes with the method java.nio.file.Files#probeContentType(path) to determine the content type of a file at the given path. It returns a mime type identifier. A good implementation looks at the file content and inspects so-called “magic” byte sequences, which is more reliable than just trusting filename extensions.

However, the default implementation included in Java 7 seems to be platform dependent and not very complete. For example, for me it did not even recognize an mp3 file as audio/mpeg. Fortunately, the Open Source library Apache Tika provides more comprehensive mime type detection and seems to be platform independent.

As shown below, you can register a simple Tika-based FileTypeDetector implementation with the Java Service Provider Interface (SPI) to transparently enhance the behaviour of java.nio.file.Files#probeContentType(path). As soon as the resulting jar is in your classpath, the SPI mechanism will pick up the implementation class and Files.probeContentType(..) will automatically use it behind the scenes.

Maven dependency

        <dependency>
            <groupId>org.apache.tika</groupId>
            <artifactId>tika-core</artifactId>
            <version>1.4</version>
        </dependency>

FileTypeDetector.java

package net.doepner.file;

import java.io.IOException;
import java.nio.file.Path;

import org.apache.tika.Tika;

/**
 * Detects the mime type of files (ideally based on marker in file content)
 */
public class FileTypeDetector extends java.nio.file.spi.FileTypeDetector {

    private final Tika tika = new Tika();

    @Override
    public String probeContentType(Path path) throws IOException {
        return tika.detect(path.toFile());
    }
}

Service Provider registration

To register the implementation with the Java Service Provider Interface (SPI), you need to have a plaintext file /META-INF/services/java.nio.file.spi.FileTypeDetector in the same jar that contains the class net.doepner.file.FileTypeDetector. The text file contains just one line with the fully qualified name of the implementing class:

net.doepner.file.FileTypeDetector

With Maven, you simply create the file src/main/resources/META-INF/services/java.nio.file.spi.FileTypeDetector containing the line shown above.

See the ServiceLoader documentation for details about Java SPI.
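No calling code has to change to benefit from the registration. A minimal usage sketch (the temp-file fallback is just so the demo runs standalone; without the Tika detector on the classpath, the result is platform dependent and may be null):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ProbeDemo {

    public static void main(String[] args) throws IOException {
        // Use the given file, or create a temporary .txt file as a fallback
        final Path path = args.length > 0
                ? Paths.get(args[0])
                : Files.createTempFile("demo", ".txt");

        // Consults all registered FileTypeDetector implementations before
        // falling back to the platform default; returns null if unknown
        final String mimeType = Files.probeContentType(path);
        System.out.println(path + " -> " + mimeType);
    }
}
```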

Reflexionen der Moderne im dramatischen Werk Ernst Tollers

About 12 years ago, in July 2001, I submitted my thesis “Zwischen Weltverbesserung und Isolation – Reflexionen der Moderne im dramatischen Werk Ernst Tollers” to complete my university degree in Mathematics and German Linguistics and Literature.

I wrote the document using LaTeX and GNU Emacs on a GNU/Linux system. It is available in PDF format.

The LaTeX source files of the thesis are also available. The structure is very straightforward and uses predefined macro definitions that can be generally useful for writing essays, books and academic papers in the Liberal Arts.

For those who are fed up with their word processor messing with their layouts and prefer to just write plain text: take a look, for example, at how simple my introduction chapter is.

If you are interested, feel free to reuse my LaTeX macros in STYLE.tex.

The LaTeX code is compatible with TeX Live, version 2012. On Debian stable (wheezy), installation is as simple as:

sudo apt-get install texlive texlive-latex-extra texlive-lang-german evince
wget https://github.com/odoepner/toller-moderne/archive/master.zip
unzip master.zip
cd toller-moderne-master/src/main/tex
pdflatex MAIN.tex
evince MAIN.pdf

Subversion 1.8 released

The new Subversion 1.8 features look quite good for a centralized Version Control System (VCS).

But note that the Subversion 1.8 working copy format is not backwards compatible. Some tools, like recent TortoiseSVN versions, will upgrade working copies to the 1.8 format by default, which will cause compatibility problems for IntelliJ and any other tools that do not yet support it.

So for now, it is probably better to stick with 1.7 and wait until all your tools fully support 1.8. For IntelliJ you might want to watch [IDEA-94942] for status updates.

Personally, I am more interested in Git anyway because it offers all the flexibility of a decentralized VCS. I am reading the free “Pro Git” ebook on my Kobo eReader (epub format).

Play MP3 or OGG using javax.sound.sampled, mp3spi, vorbisspi

I tried to come up with the simplest possible way of writing a Java class that can play mp3 and ogg files, using standard Java Sound APIs, with purely Open Source libraries from the public Maven Central repositories.

The LGPL-licensed mp3spi and vorbisspi libraries from javazoom.net satisfy these requirements and worked for me right away. As service provider implementations (SPI), they transparently add support for the mp3 and ogg audio formats to javax.sound.sampled, simply by being in the classpath.

For my AudioFilePlayer class below, I basically took the example code from javazoom and simplified it as much as possible. Note that it requires Java 7, as it uses try-with-resources.

Maven dependencies

<!--
  We have to explicitly instruct Maven to use tritonus-share 0.3.7-2
  and NOT 0.3.7-1, otherwise vorbisspi won't work.
-->
<dependency>
  <groupId>com.googlecode.soundlibs</groupId>
  <artifactId>tritonus-share</artifactId>
  <version>0.3.7-2</version>
</dependency>
<dependency>
  <groupId>com.googlecode.soundlibs</groupId>
  <artifactId>mp3spi</artifactId>
  <version>1.9.5-1</version>
</dependency>
<dependency>
  <groupId>com.googlecode.soundlibs</groupId>
  <artifactId>vorbisspi</artifactId>
  <version>1.0.3-1</version>
</dependency>

AudioFilePlayer.java

package net.doepner.audio;

import java.io.File;
import java.io.IOException;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine.Info;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.UnsupportedAudioFileException;

import static javax.sound.sampled.AudioSystem.getAudioInputStream;
import static javax.sound.sampled.AudioFormat.Encoding.PCM_SIGNED;

public class AudioFilePlayer {

    public static void main(String[] args) {
        final AudioFilePlayer player = new AudioFilePlayer ();
        player.play("something.mp3");
        player.play("something.ogg");
    }

    public void play(String filePath) {
        final File file = new File(filePath);

        try (final AudioInputStream in = getAudioInputStream(file)) {
            
            final AudioFormat outFormat = getOutFormat(in.getFormat());
            final Info info = new Info(SourceDataLine.class, outFormat);

            try (final SourceDataLine line =
                     (SourceDataLine) AudioSystem.getLine(info)) {

                if (line != null) {
                    line.open(outFormat);
                    line.start();
                    stream(getAudioInputStream(outFormat, in), line);
                    line.drain();
                    line.stop();
                }
            }

        } catch (UnsupportedAudioFileException 
               | LineUnavailableException 
               | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    private AudioFormat getOutFormat(AudioFormat inFormat) {
        final int ch = inFormat.getChannels();
        final float rate = inFormat.getSampleRate();
        // Decoded output format: 16-bit signed PCM, little-endian,
        // same channel count and sample rate as the input
        return new AudioFormat(PCM_SIGNED, rate, 16, ch, ch * 2, rate, false);
    }

    private void stream(AudioInputStream in, SourceDataLine line)
        throws IOException {
        // Copy the decoded audio to the output line until end of stream
        final byte[] buffer = new byte[65536];
        for (int n = 0; n != -1; n = in.read(buffer, 0, buffer.length)) {
            line.write(buffer, 0, n);
        }
    }
}