Real-world usage of Commons CLI to parse command line arguments

The Apache Commons Command Line Interface (CLI) is a handy little toolkit for handling command line arguments.  Java programs, as with almost all other programs running on an OS that supports a CLI, permit the user to pass arguments as an ordered collection of strings.

The problem with Commons CLI is that it doesn’t appear to manage a scenario that I wanted supported in xml2csv, whereby:

  1. The configuration file option (-c) is mandatory.  xml2csv cannot do anything meaningful without a configuration file, so one must be specified; EXCEPT
  2. When the user requests help (-h) or version information (-v), then obviously a configuration file is not required.
  3. In the usage message, I want to see both -h and -v as available command line options.

I could find a couple of search hits that gave me an answer to 1 & 2, but I needed to make a further change to support 3.

This is how I created the Options objects to model this:

static {
	// mainOptions contains all the options together, including help and version.
	Options mainOptions = new Options();
	Option option = new Option(OPT_CONFIG_FILE, "configuration-file", true, "A single file containing the configuration to use.");
	option.setRequired(true);
	mainOptions.addOption(option);
	option = new Option(OPT_OUT_DIR, "output-directory", true, "The directory to which the output CSV files will be written.  "
				+ "If not specified, current working directory will be used.  Directory must exist and be writeable.");
	mainOptions.addOption(option);
	option = new Option(OPT_TRIM_WHITESPACE, "preserve-whitespace", false,
				"If specified then whitespace will not be removed from the start and end of output fields.");
	mainOptions.addOption(option);
	option = new Option(OPT_APPEND_OUTPUT, "append-output", false,
				"If specified, all output will be appended to any existing output files.  If an existing file is"
				+ " appended to then field names will not be output.");
	mainOptions.addOption(option);

	// helpOptions contains only the help and version options, it's important that these are both optional.
	// Note how both mainOptions and helpOptions contain the help and version options.
	Options helpOptions = new Options();

	option = new Option(OPT_HELP, "help", false, "Show help on using xml2csv and terminate.");
	mainOptions.addOption(option);
	helpOptions.addOption(option);

	option = new Option(OPT_VERSION, "version", false, "Show version information and terminate.");
	mainOptions.addOption(option);
	helpOptions.addOption(option);

	MAIN_OPTIONS = mainOptions;
	HELP_OPTIONS = helpOptions;
}

Once the options have been created, then the algorithm for using them is as follows:

  1. Parse the arguments against HELP_OPTIONS.
  2. If -v or -h is specified then carry out the relevant action.  Note that -h needs to print the usage information as generated from MAIN_OPTIONS.
  3. If neither -v nor -h is specified then parse against MAIN_OPTIONS. If the mandatory option (-c) is not specified then an error is printed and a usage message is shown.

The code is as follows:

/**
 * Instance entry point for command line invocation.
 *
 * @param args command line arguments
 */
public void execute(String[] args) {
	if (LOG.isInfoEnabled()) {
		LOG.info("xml2csv execute invoked {}", StringUtil.toString(args));
	}
	try {
		if (!showHelpOrVersion(args)) {
			BasicParser parser = new BasicParser();
			CommandLine cmdLine = parser.parse(MAIN_OPTIONS, args);
			LOG.info("Successfully parsed main options.");
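			// Both options are flags that take no argument, so test for their presence:
			// getOptionValue() returns null for them, which parseBoolean() always reads as false.
			// The flag is "preserve-whitespace", so trimming is on unless it is specified.
			boolean trimWhitespace = !cmdLine.hasOption(OPT_TRIM_WHITESPACE);
			boolean appendOutput = cmdLine.hasOption(OPT_APPEND_OUTPUT);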
			String[] xmlInputs = cmdLine.getArgs();
			String outputDirName = cmdLine.getOptionValue(OPT_OUT_DIR);
			String configFileName = cmdLine.getOptionValue(OPT_CONFIG_FILE);
			execute(configFileName, xmlInputs, outputDirName, appendOutput, trimWhitespace);
		}
	} catch (ProgramException pe) {
		LOG.error("A fatal error caused xml2csv to abort", pe);
		// All we can do is print out the error and terminate the program
		System.err.print(getAllCauses(pe));
	} catch (ParseException pe) {
		// Thrown when the command line arguments are invalid
		LOG.debug("Invalid arguments specified: {}", pe.getMessage());
		System.err.println("Invalid arguments specified: " + pe.getMessage());
		printHelp();
	}
}

/**
 * If the arguments contain a request for help or version information, then show these and return true, otherwise return false.
 *
 * @param args command line arguments.
 * @return true if the help or version options have been specified, false otherwise (i.e. normal processing should resume).
 * @throws ParseException if an option parsing exception occurs.
 */
private boolean showHelpOrVersion(String[] args) throws ParseException {
	CommandLineParser parser = new BasicParser();
	CommandLine cmdLine = parser.parse(HELP_OPTIONS, args, true);
	if (cmdLine.getOptions().length == 0) {
		return false;
	}

	LOG.info("Showing help or version information and terminating.");
	if (cmdLine.hasOption(OPT_HELP)) {
		printHelp();
	} else if (cmdLine.hasOption(OPT_VERSION)) {
		printVersionInfo();
	} else {
		throw new BugException("Options set up is wrong.  Found help or version, but neither believes they have been passed.");
	}
	return true;
}

You can see the code in action in xml2csv; check out Program.java.
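
One thing to watch with the two-pass approach: as far as I understand Commons CLI 1.2, passing true as the third argument to parse makes the parser stop at the first token it doesn't recognise, so a command line such as -c config.xml -h may never reach the -h.  A minimal sketch of an order-independent pre-scan, in plain Java and independent of Commons CLI, is shown here.  The class and method names are hypothetical and the flag spellings are assumed from the option definitions above:

```java
import java.util.Arrays;
import java.util.List;

class HelpPreScan {
    // Assumed spellings of the help and version options (short and long forms).
    private static final List<String> HELP_FLAGS =
            Arrays.asList("-h", "--help", "-v", "--version");

    /** Returns true if any argument before a "--" terminator is a help or version flag. */
    static boolean requestsHelpOrVersion(String[] args) {
        for (String arg : args) {
            if ("--".equals(arg)) {
                break; // everything after "--" is an operand, not an option
            }
            if (HELP_FLAGS.contains(arg)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The help flag is found even though it follows an option this scan knows nothing about.
        System.out.println(requestsHelpOrVersion(new String[] { "-c", "config.xml", "-h" }));
        System.out.println(requestsHelpOrVersion(new String[] { "-c", "config.xml", "in.xml" }));
    }
}
```

The trade-off is that a pre-scan like this cannot tell an option from an option's value, so a configuration file literally named -h would be treated as a request for help.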


Creating portable file names in Java

Today I needed some code in a Java application that would create a file with a valid file name, based on some user input. Of course, Java is a very portable language, so it was no good to only remove characters invalid in Windows, or Linux.  I wanted something as portable as possible and therefore likely to work on many platforms.  Fortunately, most modern OSes accept file names that follow the POSIX rules.

So I created this little method to convert strings to POSIX-compliant, and thus portable, file names.

I took the rules from Wikipedia’s page on Filename, which cited “Lewine, Donald. POSIX Programmer’s Guide: Writing Portable UNIX Programs 1991 O’Reilly & Associates, Inc. Sebastopol, CA pp63-64”, which are:

  1. Maximum of 14 characters long.
  2. Must not have a hyphen as the first character.
  3. Can only contain the following characters: A-Z, a-z, 0-9, . (period), _ (underscore), and - (hyphen/minus/dash).

/**
 * Creates a POSIX-compliant file name based on the passed string.  The returned string will be:
 * <ol>
 * <li>14 characters or fewer.</li>
 * <li>Made up only of the following characters: A–Z a–z 0–9 . (period) _ (underscore) - (minus/hyphen).</li>
 * <li>The first character may not be a minus/hyphen.</li>
 * </ol>
 *
 * @param str a string that needs to be converted to a file name.  
 * @return a string that only contains POSIX-compliant filename characters and length
 */
public static String convertToPOSIXCompliantFilename(String str) {
	if (str == null)
		return null;
	StringBuilder sb = new StringBuilder();
	for (int i = 0; i < str.length() && sb.length() < 14; i++) {
		char ch = str.charAt(i);
		if ((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') ||
				(ch >= '0' && ch <= '9') || (ch == '.') || (ch == '_') || ((ch == '-') && sb.length() > 0)) {
			sb.append(ch);
		}
	}
	return sb.toString();
}

And some unit tests:

@Test
public void testPOSIXFileNameConverter() throws Exception {
	String[] validTestCases = new String[] { "ABCDEFGHIJKLM", "NOPQRSTUVWXYZ", "abcdefghijklm", "nopqrstuvwxyz", "0123456789_.-", "Normal.file" };
	for (String testCase : validTestCases) {
		assertEquals(testCase, FileUtility.convertToPOSIXCompliantFilename(testCase));
	}
	String[] adjustedTestCases = new String[] { "-----", "", "ABC!\"£$%^-", "ABC-", "  A", "A", "1234567890ABCDEFGH", "1234567890ABCD",
					"----1234567890ABCDEFGH", "1234567890ABCD"};
	for (int i = 0; i < adjustedTestCases.length - 1; i += 2) {
		assertEquals(adjustedTestCases[i + 1], FileUtility.convertToPOSIXCompliantFilename(adjustedTestCases[i]));
	}
}
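
For comparison, the same three rules can be sketched with regular expressions instead of a character loop.  This is an alternative formulation rather than the method above; the class and method names are hypothetical:

```java
class PosixNames {
    private static final int MAX_LENGTH = 14;

    /** Applies the three rules in turn: allowed characters only, no leading hyphen, 14 characters at most. */
    static String toPortableName(String str) {
        if (str == null) {
            return null;
        }
        String cleaned = str
                .replaceAll("[^A-Za-z0-9._-]", "") // drop every character outside the portable set
                .replaceFirst("^-+", "");          // strip any run of leading hyphens
        return cleaned.length() > MAX_LENGTH ? cleaned.substring(0, MAX_LENGTH) : cleaned;
    }

    public static void main(String[] args) {
        System.out.println(toPortableName("----1234567890ABCDEFGH"));
        System.out.println(toPortableName("Normal.file"));
    }
}
```

Either way, note that an input such as "-----" collapses to an empty string, so callers probably want a fallback name for that case.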

Nothing particularly clever, but I suspect I’ll be using this method whenever the user has some influence on files created by my code.  This code is public domain and Locima provides no guarantees of its correctness, or any warranties.  Naturally.


Default behaviour of logging frameworks

I’ve used logback in my latest project, xml2csv, and I think it’s great. Except for one tiny, but important detail.

I’m writing a standalone command line utility, designed to be used by people who are neither Java experts nor developers. I want my code to only have a dependency on slf4j, the API that sits on top of logback (or many other logging frameworks), courtesy of a set of adapters.

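When the program is working “as designed” there’s no need for logging.  After all, what would a “regular” user want with a log file?  And this is the problem: when logback can’t find a configuration, it defaults to printing every log message, debug level included, to the console.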

So, my choices are:

  1. Deliver a logback.xml file to sit inside the jar file.
  2. Deliver a logback.xml file to sit outside the jar file.
  3. Programmatically configure logback.

All of these solutions have serious drawbacks:

Provide a configuration file inside the jar file

Providing a configuration file in this way means that there’s only one configuration available.  Also, it will likely frustrate advanced users who want to control logging or replace the logging framework.

Provide a configuration file outside of the jar file

This is a bad experience for the user because, instead of one file, they now have two files to worry about. Also, the name of this file has no relation to xml2csv, so when copying the executable around they’ve got to remember to take this one, or suffer literally thousands of lines of nonsensical logging messages flying up the screen in front of them.

Programmatically configure logback

This involves tightly coupling my code to logback, unless I use reflection to dynamically configure it, which isn’t a good thing.  If, at a later date, I want to replace logback with something else, I’d have to change the code. There’s also a difficult question of knowing when to configure the logging.  More advanced users might set up their own logging configuration and I don’t want to overwrite what they want the program to do!

So, what would be best?

If logback output nothing when unconfigured, then normal usage wouldn’t require any kind of configuration. Curious users, or users having problems who needed to switch logging on, could add a logback.xml to enable it. For this reason, I think that this is the best behaviour for all logging frameworks.
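
As a rough sketch of what such a user might drop in, here is a minimal logback.xml that sends debug output to a file using logback’s standard FileAppender.  The file name xml2csv.log and the message pattern are placeholders of mine, not anything xml2csv ships:

```xml
<configuration>
	<!-- Write everything to a single log file next to where xml2csv is run. -->
	<appender name="FILE" class="ch.qos.logback.core.FileAppender">
		<file>xml2csv.log</file>
		<encoder>
			<pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
		</encoder>
	</appender>

	<!-- Change DEBUG to OFF to silence logging entirely. -->
	<root level="DEBUG">
		<appender-ref ref="FILE" />
	</root>
</configuration>
```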

One caveat of this is that I remember struggling greatly trying to understand why my log4j configurations years ago were failing because it was never obvious how to enable debugging messages of the logging framework. Therefore, I’d recommend putting this in big, bold, bright letters in your documentation. Something as simple as an environment variable would be ideal for my usage.

An Alternative

Of course, the primary reason why logback is making lots of noise is because it’s there. There’s a strong argument to be made to omit logback entirely from the main build. This would have the following advantages:

  1. It would fix the problem for the default use-case, where logging is not required.
  2. Advanced users could slot in their own logging framework without having to deal with the annoyance of logback being embedded in the jar file.
  3. Smaller distribution (by some 700k).

If logging was required by a user then they would have to drop in a logging implementation and configuration file. For advanced, technically savvy users this is reasonably simple. But for users who just want to use xml2csv and who are having problems, this could be an annoyance; certainly compared to specifying a single -logging parameter to the utility and generating a log file to send back to me.

This is the approach I’m going to take with xml2csv, at least for the time being, combined with some very explicit instructions on how to download the artifacts required to enable logging.


Building with Apache Ivy

Background

This, and subsequent, entries describe the process I went through with moving an Eclipse Java project from Eclipse-managed, hard-coded, downloaded dependent jars to using Apache Ant and Ivy.  It will hopefully serve as a good tutorial for those attempting the same feat.

I have recently been working on a Java application that converts XML files to CSV files, intuitively named xml2csv, which I’m maintaining in Github.  Xml2csv has quite a few dependencies on third-party libraries, approximately 10MB of them.  Most of these third-party libraries are releases from Apache projects.

I’d done all my development in an out-of-the-box Eclipse Luna and used the “Add External Jar” button liberally, all pointing to my own directory of downloaded JAR files.  Of course, this means that the code won’t work for anyone other than me (unless they happened to have used exactly the same paths on their machine as I did) and incidentally also couldn’t be used by anyone unless they also had Eclipse.

The easiest way to remedy the dependency situation would be to create a directory in my Git repo for them and push them all to Github.  However, I really didn’t want to store all those libraries in Github because:

  1. It’s a big waste of bandwidth for people who want to look at the code to have to include those libraries as part of a git clone (download) operation.  Many Java developers would already have their own copies of these libraries.
  2. As I didn’t create them and I don’t own them, I just didn’t feel comfortable hosting a public download of them.

Moreover, I really wanted to make it painless for someone who downloads the code to get it working and run the unit tests.

I’ve been rather spoilt by the rather lovely Nuget when developing for the MS platform, but unfortunately there’s no Nuget equivalent for Java, at least as far as I could find.  After some Internet searches and some following links I came across the following options:

  1. Apache Maven.
  2. Apache Ivy.
  3. Gradle.

After reading some reviews, comparisons and tutorials I decided to go for Apache Ivy, because:

  1. It was designed solely for dependency management, so it did exactly what I needed.
  2. It was designed for use with Ant, which I have some experience of from previous projects.
  3. There was an Eclipse integration available (although I believe this is true for all three).
  4. It didn’t use a syntax I wasn’t used to, like Groovy-based Gradle.  I’ll be honest, this isn’t a good reason, but I just didn’t fancy learning something completely new, see 2.
  5. It didn’t describe itself as “software project management and comprehension tool”  (Seriously, guys?   I refer you to a memorable line from Jules Winnfield when talking to Brett for that one).

Moving to Ant + Ivy

Right now, my project’s build cycle is managed by Eclipse.  This has worked really well for me so far, but now it’s not enough.  So off to http://ant.apache.org/ivy and download 2.4.0 rc1.  There were two options for the download: “binary” and “binary-with-dependencies”.  Not knowing what the dependencies were and unable to find a description, I decided to go with the latter.

I was really pleased to see that the download included the documentation, as I frequently work without an Internet connection.

I figured that the best thing for me to do was to learn “basic” Ivy, then go looking for an Eclipse IDE integration.  I must admit, I was really worried that I’d lose features like the automatic build in Eclipse, which I absolutely adore, but that’s not a good enough reason to abandon the move.

My project is really simple, you can see for yourself if you like.  I have a “src” and “testsrc” directory and a properties directory.  However, I needed the following libraries to make it sing and dance:

  1. Apache Xerces v2.11.
  2. Apache Commons CLI v1.2.
  3. QOS.ch Simple Logging Facade for Java (SLF4J) v1.7.7.
  4. Apache Log4j 2.0 and the SLF4J to Log4j 2.0 adapter.
  5. Saxonica Saxon-HE 9.

Nothing particularly onerous, but still around 10MB of Jar files.  My application builds to a relatively tiny 66 kilobyte Jar file.

From a brief read of the Ivy tutorial, I managed to create myself a simple ivy.xml file.  Bear in mind that I didn’t have any kind of Ant build yet, but I was reading the Ivy docs, so that’s where I started.  Here’s my ivy.xml:

<ivy-module xsi:noNamespaceSchemaLocation="http://ant.apache.org/ivy/schemas/ivy.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0">
	<info module="xml2csv" organisation="com.locima">
		<description>
			This Ivy file is used to download the third-party dependencies of the xml2csv project.
		</description>
	</info>
	<dependencies>
		<dependency rev="9.5.1-6" name="Saxon-HE" org="net.sf.saxon" />
		<dependency rev="2.11.0" name="xercesImpl" org="xerces" />
		<dependency rev="1.7.7" name="slf4j-api" org="org.slf4j" />
		<dependency rev="1.7.7" name="slf4j-log4j12" org="org.slf4j" />
		<dependency rev="2.0.2" name="log4j-core" org="org.apache.logging.log4j" />
		<dependency rev="1.2" name="commons-cli" org="commons-cli" />
		<dependency rev="4.11" name="junit" org="junit" />
	</dependencies>
</ivy-module>

I found the right org, name, and rev values by going to http://mvnrepository.com/ and searching using the search box at the top of the page.  This wasn’t as easy a process as I thought it would be.  This is a public archive, so searching for “Log4j” or “Log4j 2”, for example, doesn’t actually give you Apache Log4j 2.0 in the first page of hits for either query, but a load of other projects that either use log4j, or extend it.  In the end I had the most success searching for the specific Jar names that I was using and this seemed to work for most things, but not Apache Commons Logging (I reverted to searching for “Apache Commons Logging 1.2” and that worked).

So now, I’ve got a load of source code with missing dependencies.  I needed an Ant build.xml script to actually build my project.  Fortunately Eclipse can export the current build configuration as an Ant script, so right-clicking the project in Package Explorer, then selecting Export, expanding the General node and selecting Ant Buildfiles gave me a pretty lengthy build.xml file.

As I’d already stripped out all the hard-coded dependencies from Eclipse, running ant quickly yielded 100 errors for missing dependencies before it bombed.  Success!!!

Adding Ivy to Ant

Following a simple tutorial on Ivy (file:///C:/apps/java/apache-ivy-2.4.0-rc1/doc/tutorial/start.html) I added a new “resolve” target and then set it as a pre-requisite to the build-project target, like this:

<project basedir="." default="build" name="xml2csv" xmlns:ivy="antlib:org.apache.ivy.ant">
    <target name="resolve" description="--> retrieve dependencies with ivy">
      <ivy:retrieve />
    </target>

And full of enthusiasm typed ant resolve at the command line:

C:\Users\Andy\Projects\xml2csv>ant resolve
Buildfile: C:\Users\Andy\Projects\xml2csv\build.xml
  [taskdef] Could not load definitions from resource org/apache/ivy/ant/antlib.xml. It could not be found.
resolve:
BUILD FAILED
C:\Users\Andy\Projects\xml2csv\build.xml:47: Problem: failed to create task or type antlib:org.apache.ivy.ant:retrieve

The error message was very clear: I’d downloaded Ivy but hadn’t told Ant about it!  To fix this, I changed the top of my Ant build.xml file so it read as follows:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project name="xml2csv" xmlns:ivy="antlib:org.apache.ivy.ant" default="build" basedir=".">
	<path id="ivy.lib.path">
		<fileset includes="ivy-2.4.0-rc1.jar" />
	</path>
	<taskdef classpathref="ivy.lib.path" uri="antlib:org.apache.ivy.ant" resource="org/apache/ivy/ant/antlib.xml" />

Now aware of where Ivy was, it all sprang into life and downloaded all the artifacts that I needed (and many more besides, but more on that later):

---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   48  |   0   |   0   |   5   ||   55  |   0   |
--------------------------------------------------------------------- 

Ivy was doing what I needed it to do with a minimal amount of effort.  So far, so good.  Ivy has downloaded all the resources, but I don’t know where to and I don’t know how to add them to my project as dependencies on the classpath.

By luck, I looked inside my project root directory and found a new lib directory; and Behold! 46MB across 55 jar files!

Clearly, I had asked Ivy for too much, but fundamentally I was able to build and execute my project, along with all its JUnit test cases.
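
For what it’s worth, here is a minimal sketch of how the retrieved jars could be put on the compile classpath in Ant, assuming Ivy’s default retrieve location of lib under the project root (which matches the directory found above).  The lib.classpath id, the compile target and the bin output directory are hypothetical names, not taken from the build.xml that Eclipse exported:

```xml
<!-- Every jar that ivy:retrieve copied into lib becomes part of the classpath. -->
<path id="lib.classpath">
	<fileset dir="lib" includes="*.jar" />
</path>

<!-- Compile only after the dependencies have been resolved. -->
<target name="compile" depends="resolve">
	<mkdir dir="bin" />
	<javac srcdir="src" destdir="bin" classpathref="lib.classpath" includeantruntime="false" />
</target>
```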

The next entry of this series will describe how to control how much Ivy is downloading.


Validating Schemas in JAXP

I try to avoid moaning in blogs, personal or especially professional.  However, having been in .NET land for so long, coming back to Java and trying to do some XML processing has simply been a nightmare.

So many magic strings, badly-named methods, poorly designed interfaces and needless “pluggability” have meant that I’ve spent around 3 hours using search engines to try to find techniques for achieving what I consider to be very simple and common use-cases.

Rather than a long diatribe listing all the things that I’ve found frustrating, I wanted to share just one method, which, for me, sums up XML processing in Java quite neatly:

/**
 * Create a SAX parser instance that is configured to validate against the schemas used within this application. 
 * @return A SAX Parser instance, never returns null (exceptions thrown for all failures).
 * @throws SAXException If unable to create the parser.
 * @throws ParserConfigurationException If unable to create the parser.
 */
private SAXParser getParser() throws SAXException, ParserConfigurationException {
	
	// Where the XSD file is within my application resources, just one so far, but others will follow.
	final String[] schemaResourceNames = new String[] { "com/locima/xml2csv/inputparser/xml/MappingSet.xsd" };

	// So far, so good.
	SAXParserFactory factory = SAXParserFactory.newInstance();
	
	// To enable schema validation, ensure you set validating to false.  Yes, really.
	factory.setValidating(false);
	
	// Apparently, namespaces are a bit complicated, so override the default to ignore them.
	factory.setNamespaceAware(true);
	
	// Now tell it what language (using a magic string), as the parser can't work it out for itself,
	// as if XML files could declare what they are...
	SchemaFactory schemaFactory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
	
	// Pass a set of schemas (schemata if you're feeling pedantic) to a method called setSchema <-- singular.
	factory.setSchema(schemaFactory.newSchema(getSchemasFromResourceNames(schemaResourceNames)));
	
	SAXParser parser = factory.newSAXParser();
	
	return parser;
}
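
To see the same settings working end to end, here is a self-contained sketch that validates two small documents against an inline schema.  The schema, the class name and the isValid helper are made up for illustration.  As I read the JAXP documentation, setValidating controls DTD validation only, which is why it stays false when a Schema is supplied through setSchema:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SchemaValidationSketch {
    // A made-up schema that allows a single <greeting> element containing text.
    private static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='greeting' type='xs:string'/>"
            + "</xs:schema>";

    /** Returns true if the well-formed document passes schema validation. */
    static boolean isValid(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false);   // DTD validation stays off
            factory.setNamespaceAware(true);
            // XMLConstants holds the schema language URI, so the magic string isn't needed.
            SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            factory.setSchema(schemaFactory.newSchema(new StreamSource(new StringReader(XSD))));
            SAXParser parser = factory.newSAXParser();

            final boolean[] valid = { true };
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    valid[0] = false; // the default handler silently ignores validation errors
                }
            });
            return valid[0];
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<greeting>hello</greeting>")); // true
        System.out.println(isValid("<farewell/>"));                // false
    }
}
```

Note the overridden error method: without it, DefaultHandler ignores validation errors and an invalid document parses without complaint.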

Contracting vs. Permanent Employment

[Updated: 11/9 to include a note about Pensions]

I was recently asked for my opinion on whether I thought that contracting was a good idea.  My immediate, reactive thought was to answer with an emphatic “Yes!”, but I then started to consider the question more carefully.  After all, just because contracting works for me that’s not necessarily the case for everyone.  Some of my colleagues have left the world of contracting behind them to return to permanent employment and have no regrets whatsoever; i.e. it’s working for them.

Instead of expressing an opinion on whether it’s a good idea to contract, I decided instead that this post would be about what I’ve learned from personal experience, as well as from contracting colleagues and friends, about contracting.  Considering which parts give me pleasure and which parts I don’t enjoy at all.

Please add your own comments or questions here; this post isn’t exhaustive at all and my experience is just that: my (limited) experience.

So, on that note, let’s start with the elephant in the room.

The Money

Most of us associate the act of becoming an IT contractor with getting more money.  A few minutes spent searching on any job site will show a stark difference between the going permanent and contract rates.  According to IT Jobs Watch[1] (this week), the average .NET developer contract rate is £350 per day.

As most permanent, salaried members of staff get paid an annual rate and not a daily rate, we need to do some maths to compare.  Assuming 52 weeks in a year, 8 bank holidays[2] and, for the sake of easy maths, 22 days on holiday, we have a total of 5 * (52 - ((22 + 8) / 5)) = 230 working days a year.

Multiply that by our daily rate and the total potential income will be a healthy £80,500.  Compare this with IT Jobs Watch, which states that the average .NET developer salary today is £40k[3].

If you then consider more lucrative roles, such as Enterprise Architect, with rates that can vary from £500 to £800 a day, then that takes the total to a rather smile-inducing £115,000 to £184,000.
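
For anyone who wants to plug in their own numbers, the sums above can be sketched in a few lines of Java.  The class and method names are hypothetical, and the figures are the ones quoted in this post:

```java
class ContractIncome {
    /** Working days in a year: five days a week, less bank holidays and annual leave. */
    static int workingDays(int weeksPerYear, int bankHolidays, int holidayDays) {
        return 5 * weeksPerYear - bankHolidays - holidayDays;
    }

    /** Total potential income for the company, before any of its costs. */
    static int potentialIncome(int dayRate, int workingDays) {
        return dayRate * workingDays;
    }

    public static void main(String[] args) {
        int days = workingDays(52, 8, 22);
        System.out.println(days);                       // 230
        System.out.println(potentialIncome(350, days)); // 80500
        System.out.println(potentialIncome(500, days)); // 115000
        System.out.println(potentialIncome(800, days)); // 184000
    }
}
```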

However, it’s not quite that simple.  I wrote “total potential income” to emphasise that I did not say “your actual total income”.  Every contractor I’ve met has their own company of some kind that is then contracted by a third party.  This money is paid to and is therefore income for the company, not you.  Here are some of the things that you will or may need to pay out for as a company, before cash comes to you:

  1. Tax.  Of course, contractors have to pay tax.  There are ways to be more “tax-efficient” as a contractor, but ultimately you’ll still pay, if you want to stay on the right side of HMRC’s rules!  This leads on to point two…
  2. Accountants.  Unless you’re doing all your own book-keeping you’ll want an accountant to do it for you.  Accountants are generally very smart people who deal with the (extreme) complexities of UK tax law and regulations to keep everything legal and help you avoid unpleasant surprise bills or fines.  I would be totally lost without my accountant and delegate as much as I possibly can to them, freeing me up to worry about being an architect.  Personally, I’ve found some aspects of this extremely stressful and they have led to a few sleepless nights of worry; especially at the beginning when you’re being bombarded by new terms and new rules.
  3. Pension.  No matter how young you (think you) are, pensions are important.  As an independent contractor you’ll now need to sort this for yourself.  Your accountant may be able to recommend an Independent Financial Advisor who’ll be able to take you through the options and allow you to make a decision on what pension works for you.  Money Saving Expert also has a good guide, or just search the Internet for “pension advice” (although make sure any sites you use are reputable, there are quite a few scammers out there).  A lot of people find the concept of having their own pension intimidating, but I’ve found that once I got my head around some basics (such as: “it’s basically a savings account with special tax rules”), it’s actually very simple indeed.  For me, it’s a case of: 1) Set up a standing order or Direct Debit to pay money in every month; 2) Once a year, review where you are and where you’ll be at your target retirement age and adjust payments accordingly.  Pretty easy really!
  4. Training.  Most companies will not train contractors – after all, you’re there to fix a short-term problem (ideally) so should know what you’re doing on day 1.  Therefore, if you want to develop your skills or even keep them current, it’ll be on your time and at your expense.  When working out the impact of training, don’t forget to include the fact that not only will course fees apply, you also won’t be invoicing that week.
  5. Sick pay.  Unless you decide to purchase separate insurance, then any time you’re not working, the company is not receiving any income.  Consider, what if you were diagnosed with a serious illness or had an accident and were unable to work for 1 month, 3 months, 6 months, a year or even never again?  Most permanent employees receive a level of “sick pay allowance” which you just won’t have.
  6. “Perks”, such as health Insurance, sickness insurance, staff discounts and so on.  Many companies offer perks to employees, such as discounted insurance.  As a supplier and not an employee you’ll not be getting those discounts.
  7. Miscellany.  You may want or need things such as professional liability/indemnity insurance, books, a laptop, a mobile phone, software.  All of these things are up to you.

So, after all that, will you be better off as a contractor?  That’s up to you to work out for yourself.  It’s always worth considering that a contractor who charges £350 a day, but doesn’t have a contract, is a lot worse off than a permanent employee on £30k (see Job Security below).

Personally, I enjoy the fact that I’m free to choose what to spend my company’s money on, rather than it being the product of a corporate policy.  The pension scheme, perks, and training my company provides me suit me perfectly, of course, and when my circumstances change, I can also change these aspects to maintain that.

The Problem of Money Addiction

There’s another, more subtle, reason why I believe it’s a mistake to think of or refer to the money earned as “your income”.  When I woke up in the morning, feeling a bit peaky, I used to immediately do a quick mental check: “If I stay in bed today, I won’t earn any money. Is a day off in bed worth £££?”

I used to do this and found it a very dangerous line of reasoning that led me to some very self-destructive habits.  I found I almost always answered the question “no, it’s not worth that much, I’ll drag myself in to work”.  I’d push through the day and get the job done, drugging and dragging myself through meetings and emails until the blessed home time came around.

This led to two things:

1. Work/Life balance seriously deteriorating.  I’d get home and be so tired that I had to go straight to bed.  At weekends I had no energy so needed to nap for a couple of hours during the day.

2. Increased intensity of symptoms.  If I tried to ignore feeling ill, even if it was just a cold, for too long then eventually my body would decide it had been abused enough and just crash.  I’d feel so awful I’d have to stay in bed for 2-3 days at a time to recover, just sleeping and drinking water and generally moaning and being a pain in the backside to my wife.

I realised eventually that it was better to take one or two days off to recover, instead of pushing myself to illness.  Of course, financially it makes sense as well, better to take one day off sick than three! 

I now try to remember the line, often attributed to the 14th Dalai Lama[4]: “He sacrifices his health in order to make money.  Then he sacrifices his money to recuperate his health.”

Job Security

Job security is perceived to be the big risk, or downside, of contracting.  This used to terrify me, and still does worry me from time to time.  As a contractor you can be on as little as zero notice (“Sorry Andy, we don’t need you any more”, or even worse, “We would love you to stay, but all contractors are cancelled as of today.  Sorry… and bye!”); or all the way up to several weeks.  However, what I’ve learned from talking to fellow contractors is that this generally turns out to be a lottery.  I’ve met contractors who’ve been with one client for 6 years, yet others who find themselves bouncing around different clients month to month, or even week to week.  I’ve also met contractors who have had notice periods ignored by employers with a “Yeah, we know that’s not what’s in the contract, but we don’t think you’ll do anything about it.”

One great piece of advice I’ve had from pretty much all other contractors is to “build a buffer”.  This means trying to save enough money so that you can go for a reasonable amount of time without work.  That way sudden termination won’t suddenly mean you can’t pay the mortgage.  But what’s reasonable?  My idea of reasonable may not be yours, after all!

This is what I did:

  1. Worked out the lifestyle I could “enjoy” and “tolerate” and how much a month that would cost.
  2. Considered: “If I was let go tomorrow, how quickly could I find employment?”.  There’s no correct answer to this, but by keeping an eye on the various job sites (just search for them, there’s no shortage of supply!) to see if there were opportunities that I could sensibly apply for, I was able to reason: “Surely I could get one of those in a pinch?”
  3. Multiplied the outputs of 1 and 2 (monthly cost by months out of work).  This is how much money you want to keep in your bank account (or the company bank account, taking into account the other costs above) at all times.

If you can save enough to reach the point where you can honestly say to yourself: “I could go, say, 3 months without work if I really had to” then hopefully this will give you peace of mind.

Also, one thing I’ve noticed is that just because you’re a permanent employee doesn’t mean you’re safe from suddenly finding yourself without a job.  Redundancies are not as uncommon as we’d all like them to be, and I’ve worked in more than one company where the turnover of permanent employees was greater than that of contractors for some periods of time.

Are You Actually Any Good?

Following on neatly from job security, there’s always the question of “Are you actually any good at what you’re trying to get a contract to do?”  Be honest with yourself: there’s a big difference between “It’s a stretch but I could do this” and “I have no idea what I’m doing”.  If you keep getting into contracts where you cannot fulfil the client’s needs then you’ll find yourself being quickly dropped.

Once your CV is littered with short engagements, your potential clients may regard you warily and start asking awkward questions, such as “So, why did you only stay at this client for 2 weeks?” or even worse, making assumptions: “We’re looking for someone to last the duration of this project of 6 months and your history shows you generally terminate after 3 months” and not contacting you at all.

Having said that, I remember, as a young, permanent employee, having a very strong belief that all contractors were super-human geniuses, capable of extraordinary feats of development or architecture and thus fully worth every penny of their seemingly huge day rates.

Oh, how I can now laugh looking back!  I think it’s arguable that the bar is higher for contractors, but probably not as high as you’re thinking.  Your mileage, of course, may vary.

Extra Overheads

Running your own company takes time and effort, over and above what you do for your clients.  Even with a good accountant you’ll still need to arrange the affairs of the company, read and understand contracts, file expenses and so on.  Of course, you can employ or contract someone else to do it – at a cost.  By delegating some of it I’ve found that, when everything is running smoothly, I can manage by dedicating 2-3 hours a week to “running” Locima.  But it’s taken a while to get the processes streamlined to that point.

Some of the regular activities are:

  1. Logging all invoices.
  2. Logging all payments in to the company account and out again.  I.e. making sure that invoices are being paid in full.
  3. Logging all expenses and scanning all receipts.
  4. Logging all bills and their payments and scanning all related paperwork.
  5. Filing all physical documents.
  6. Checking backups (I found Scott Hanselman’s Rule of Three invaluable for this [5]).

Annoyingly, these overheads can suddenly go up: what if a customer refuses to pay, or decides to underpay?  Do you have the means (and will) to take them to court if necessary?  Will it turn out to be throwing good money after bad?  Also, what if HMRC decide to audit the company?  That could take up a lot of your time.

External events, outside of your control, can sometimes demand large amounts of your time.  Whilst some clients may be sympathetic, others may not – after all, that’s not their problem.

Perks

Running my own (very simple) company has given me a great opportunity to learn how the world of business works in more detail.  The employer/employee relationship is different from the customer/supplier relationship.  Without having to work through all of the items I’ve raised in this post I would be largely ignorant of it as my work typically involves different disciplines.

Of course, there are some things which I absolutely love about contracting:

Annual Reviews

This one makes me smile each time I think about it.  I remember many painful annual review processes from being a permanent employee that I no longer have to participate in:

  1. Hours spent preparing an annual (or if really unlucky, semi-annual) statement of achievements; having to set irrelevant (or soon-to-be-irrelevant) goals and having a “personal development plan” (often ignored by both employer and employee).
  2. Frustrating and irrelevant tick-lists of competences you needed to demonstrate to get a bonus or promotion.
  3. Being told “Sorry, we’ve got a quota/smaller pot of money this year” having fulfilled the criteria of the first two.

My reviews typically consist of, nearing the end of the contract, me asking two questions of the client.  The first is “Would you like to renew me for another period?”  If the answer is no, then the review is done and I can plan future work.  If the answer is yes then there may be a small amount of negotiation of a new rate; but this is generally resolved in a short email exchange.  Either way, I’ve found reviews for contractors are quick, straightforward and get straight to the key points for both me and the client.

I think that this is because decisions on contractor renewals are frequently made more locally, for example by the project manager of the project I’m working on.  Firing or “releasing” a permanent employee can be a difficult and costly process, involving HR and several layers of management approval.  But for a contractor it’s extremely easy: “Andy costs X, he delivers Y.  Is this working for this project?”  Of course, different companies work in different ways so your mileage, as they say, may vary.

In summary: my clients don’t (and shouldn’t) care about my career development or my personal support needs.  That’s my responsibility.

Flexibility

Flexibility is another aspect of contracting that I greatly enjoy.  I am, effectively, my own boss.  Once the terms and conditions of a contract with a third party are fulfilled, the rest of my time is mine to do with as I choose.  This allows me to have multiple clients (subject to competition and similar clauses) or work on my own projects.  In fact, if you have a long-term client then consider the impact that could have on your tax situation (see an accountant), such as from IR35.

Many permanent members of staff, at least in the UK IT market, have signed contracts that mean their employer owns all their output, even if it is done in their own time on their own equipment with absolutely no connection to their day job.  I have signed such contracts myself in the past.

Conclusion

For me, right now, I’m really enjoying running Locima and the contractor lifestyle: the clients I work for, the projects I do for myself, the personal development I do for myself and the freedom to know that I have far more control (or at least perceived control) over my own destiny.

Of course, in the months and years to come when the job market changes, my lifestyle requirements change or just my luck changes, I may decide that it’s no longer the contractor life for me.  At that point it’ll be time to go back to port and the safer(?) shores of permanent employment.

You, on the other hand, may have a totally different experience!

References

[1] .NET Development Contracts, IT Jobs Watch as of 2/9/14 – http://www.itjobswatch.co.uk/contracts/uk/.net%20developer.do

[2] England only: New Year’s Day, Good Friday, Easter Monday, Early May Day Bank Holiday, Spring Bank Holiday, Summer Bank Holiday, Christmas Day, Boxing Day – http://publicholidays.co.uk/

[3] Wikiquote page on the 14th Dalai Lama – http://en.wikiquote.org/wiki/Talk:Tenzin_Gyatso,_14th_Dalai_Lama

[4] .NET Developer Jobs, IT Jobs Watch as of 2/9/14 – http://www.itjobswatch.co.uk/jobs/uk/.net%20developer.do

[5] The Computer Backup Rule of Three, by Scott Hanselman – http://www.hanselman.com/blog/TheComputerBackupRuleOfThree.aspx

Monty Hall Problem

The Monty Hall problem is one of those brain teasers which has confused me on the three previous occasions that I read about it.  The (many) articles online explaining it didn’t resonate with me, so I decided to work it through again and write some code to prove it.

Once I’d written the code I had an epiphany and realised how incredibly simple it was, so this blog entry is me explaining the Monty Hall problem and why probability works out how it does.  Hopefully, if you’re like me and you’ve struggled with this before, this blog will make it simple and clear.

I’m not going to explain the Monty Hall problem, because Wikipedia does an excellent job. If you’re not familiar, see you in a few minutes!

A brief summary of the incorrect thought process is as follows:

  1. There are three doors: X, Y and Z.
  2. The contestant picks door X.  There was a 1/3 chance of the contestant picking this door.  There was a 2/3 chance of them NOT picking door X.
  3. Door Y is revealed to be a goat, therefore the Cadillac is behind door X or door Z.
  4. The Cadillac is behind door X or door Z, i.e. there are only two permutations of goat, Cadillac and doors, so therefore it’s a 50/50 chance of finding the Cadillac.

So why, given a seemingly 50/50 chance of either having a goat or a Cadillac behind the two remaining doors, is the probability of finding the Cadillac on a switch 2/3?

Let’s start with some simple code that emulates the problem.  This code is deliberately naive to make it easy to follow.

using System;

namespace MontyHall
{
    internal class Program
    {
        private static Random _randomGenerator = new Random();

        private static void Main(string[] args)
        {
            Console.WriteLine("{0,-12}{1,12}{2,12}", "Iterations", "Stick Wins", "Switch Wins");

            for (int i = 0; i < 10; i++)
            {
                int iterations = 1000000;
                int winsIfStickCount = EmulateGame(iterations);
                int winsIfSwitchCount = iterations - winsIfStickCount;
                Console.WriteLine("{0,-12}{1,12}{2,12}", iterations, winsIfStickCount, winsIfSwitchCount);
            }
        }


        private static int EmulateGame(int iterations)
        {
            int winIfStickCount = 0;
            for (int i = 0; i < iterations; i++)
            {
                if (EmulateGame())
                {
                    winIfStickCount++;
                }
            }
            return winIfStickCount;
        }


        private static bool EmulateGame()
        {
            // Create 3 boxes, each with a goat
            string[] boxes = new[] { "Goat", "Goat", "Goat" };

            // Replace a goat with a Cadillac, randomly
            boxes[_randomGenerator.Next(3)] = "Cadillac";

            // Now pick a random door
            int randomDoor = _randomGenerator.Next(3);

            // Now Monty opens a door containing a goat
            int knownGoat = -1;
            while (knownGoat == -1)
            {
                /* Keep picking a door at random until Monty finds one that the contestant didn't pick
                 * and contains a goat.  This isn't efficient, but it keeps the code very simple */
                int montysDoor = _randomGenerator.Next(3);
                if (montysDoor == randomDoor || boxes[montysDoor] == "Cadillac") continue;
                knownGoat = montysDoor;
            }

            /* If we originally picked the Cadillac, then we won.
             * If we didn't, then as Monty has revealed the other goat the
             * other door MUST contain the Cadillac
             */
            return (boxes[randomDoor] == "Cadillac");
        }
    }
}

Run the code and ten million iterations of the Monty Hall problem will be run in batches of one million.  Here’s some sample output:

Iterations    Stick Wins Switch Wins
1000000           333788      666212
1000000           333780      666220
1000000           332643      667357
1000000           333629      666371
1000000           333797      666203
1000000           333219      666781
1000000           333363      666637
1000000           332921      667079
1000000           333100      666900
1000000           333255      666745

Pretty convincing.  After 10,000,000 iterations we’re seeing a very definite pattern that switching doubles your chances of finding the Cadillac.

Why?

The reason why became obvious once I had realised that this chunk of code was totally superfluous to whether I won or not.

            // Now Monty opens a door containing a goat
            int knownGoat = -1;
            while (knownGoat == -1)
            {
                /* Keep picking a door at random until Monty finds one that the contestant didn't pick
                 * and contains a goat.  This isn't efficient, but it keeps the code very simple */
                int montysDoor = _randomGenerator.Next(3);
                if (montysDoor == randomDoor || boxes[montysDoor] == "Cadillac") continue;
                knownGoat = montysDoor;
            }

The line of code that determines whether you win by sticking or switching is totally independent:

return (boxes[randomDoor] == "Cadillac");

Once you understand that this single, simple line is responsible for the probability of winning or losing, I think it becomes really easy to understand.  I.e.

  1. There is a 1/3 chance the contestant picks the Cadillac.
  2. There is a 2/3 chance that the contestant hasn’t picked the Cadillac.
  3. Therefore by switching there is a 2/3 chance of finding the Cadillac because if one of the two unpicked doors contains the Cadillac you are guaranteed to find it.
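The same three points can be checked exactly, with no random numbers at all, by walking through every combination of Cadillac position and first pick.  The following is a minimal sketch in Java rather than the C# used above; the class and method names are made up for illustration.  Monty's door is deliberately not modelled, because (as the points above argue) it never changes the outcome.

```java
final class MontyHallEnumeration {

    /** Returns {stickWins, switchWins} over all 9 equally likely (car, pick) pairs. */
    static int[] countWins() {
        int stickWins = 0;
        int switchWins = 0;
        for (int car = 0; car < 3; car++) {
            for (int pick = 0; pick < 3; pick++) {
                if (pick == car) {
                    // Sticking wins only when the first pick was already the Cadillac
                    stickWins++;
                } else {
                    // Monty removes the other goat, so the one remaining door must hide the Cadillac
                    switchWins++;
                }
            }
        }
        return new int[] { stickWins, switchWins };
    }

    public static void main(String[] args) {
        int[] wins = countWins();
        System.out.println("Stick wins:  " + wins[0] + " out of 9");
        System.out.println("Switch wins: " + wins[1] + " out of 9");
    }
}
```

Sticking wins in 3 of the 9 cases and switching in the other 6, which is the 1/3 versus 2/3 split that the simulation approximates.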

One of the difficulties with understanding this problem is the scale of the numbers involved.  It becomes much easier with bigger numbers:

Imagine trying to win the UK National Lottery draw jackpot.  The jackpot is won by picking 6 distinct numbers between 1 and 49 and matching them all.  There is a 1/13,983,816 chance of matching all 6 numbers with a single ticket (see http://lottery.merseyworld.com/Info/Chances.html).

  1. Imagine that you have 13,983,816 lottery tickets, each with a different set of numbers on.  That means that the winning lottery ticket is in there… somewhere.
  2. You pick a ticket at random.  There’s a 1/13,983,816 chance of finding the winning ticket.
  3. A magician arrives, waves his magic wand and all of the remaining tickets disappear EXCEPT for one.  The magician promises that this remaining ticket is the winning ticket unless you’ve already picked it up (remember there’s a 1/13,983,816 chance of that, so pretty small).
  4. The magician then offers you a swap: do you want to keep your ticket (which has a 1/13,983,816 chance of being the winning ticket), or take his ticket?

Still reckon it’s a 50/50 chance of winning the National Lottery?  Me neither.
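For anyone who wants to check the numbers, here is a small Java sketch (names are hypothetical, and it assumes the magician behaves exactly as described above).  It recomputes the number of possible tickets as "49 choose 6" and then the chance of winning by sticking or swapping, which is the three-door reasoning with a much bigger number of doors.

```java
final class LotteryOdds {

    /** n choose k using the multiplicative formula; fine for small inputs such as 49 choose 6. */
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // After each step result equals C(n - k + i, i), so the division is always exact
            result = result * (n - k + i) / i;
        }
        return result;
    }

    /** Chance that your randomly picked ticket is the winner. */
    static double stickProbability(long tickets) {
        return 1.0 / tickets;
    }

    /** Chance that the magician's remaining ticket is the winner. */
    static double switchProbability(long tickets) {
        return (tickets - 1.0) / tickets;
    }

    public static void main(String[] args) {
        long tickets = choose(49, 6);
        System.out.println("Possible tickets: " + tickets);
        System.out.println("Win by sticking:  " + stickProbability(tickets));
        System.out.println("Win by swapping:  " + switchProbability(tickets));
    }
}
```

With three "tickets" instead of 13,983,816, switchProbability gives the familiar 2/3.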

What’s the difference between OnNavigatedTo and Loaded?

Sliding Block Puzzle was originally written to do all initialisation within the Loaded event handler for a page.  However, this yields some strange and unwanted behaviour sometimes, such as:

1. Pages don’t initialise properly when returning to them after deactivation.

2. The Loaded event handler is called twice when the VS2010 debugger is attached.

So, remember:

Loaded is called each time the page (or a control, if you’re in the handler for a control) is added to the “visual tree”; this means it may be called more than once.

OnNavigatedTo is called exactly once each time the page is activated.

Therefore, chances are, you want to be using OnNavigatedTo and not Loaded!

Sliding Block Puzzle available to developers!

Our first software app for Windows Phone 7.1 is now available to developers on CodePlex.  Sliding Block Puzzle is a game that demonstrates many useful techniques on Windows Phone, such as dynamic data-bound menus, item template selectors, using the camera, image processing, animation, messaging to the view (MVVM) and diagnostic trace, all in a mixture of MVVM and code-behind.

And, it’s open source under the MS-PL license, so you’re free to look at the code and use bits in your own projects.

The game is very simple at the moment, and not good enough to put on the marketplace (our decision, not Microsoft’s – yet!).  Once more levels have been added and custom game creation and game sharing are complete, we’ll submit it for consideration for inclusion in the Windows App marketplace.

Enjoy!  And please leave feedback on the Sliding Block Puzzle CodePlex site!

Windows Phone: What does InitializeComponent() do?

Coding in XAML is quite ridiculously easy on the face of it: there are masses of tutorials, samples, guides and step-by-step instructions for building simple applications that solve problems with an elegantly presented UI.

However, like a marathon runner, just after you finish those tutorials and start programming in anger, you hit The Wall.  Your controls don’t display properly, binding doesn’t work, events are fired but aren’t received, or worst of all, customers report a problem with the UI that you can’t replicate.

The key to solving problems like this is to really understand what’s going on behind the scenes, under the covers and, most importantly, when you’re not looking.  Ultimately, all the syntactic sugar that is XAML is going to come back to some CLR code drawing pixels on a screen and a series of methods that are being called in a particular order; and when you can actually see them it will often make it blindingly obvious why your masterpiece doesn’t work (and occasionally show you what it was you did wrong and how to do it right).

As every novice learns, when you create a page or control in a Windows Phone application, you get the following:

  1. A XAML XML file.
  2. A “code-behind” file that MVVM advises you should never fill in and that you should really consider deleting, putting all your code-behind logic in a ViewModel class.

However, when you build your project you also get a g.cs and a g.i.cs file stored in the obj directory, and as if to taunt your tortured developer soul, they’re identical.  Fortunately, you can ignore g.i.cs files as these appear only to be used by Visual Studio for Intellisense purposes.  g.cs files are generated from the XAML as part of the build of your project.

Those files contained the answer to one of the first questions I had when working with WP7:

What does InitializeComponent do and what happens if I put code above it?

This method first checks whether it has been called before; if it has, it returns immediately (so don’t consider calling it a second time once you’ve done something horrible to the control hierarchy).  Then it loads the XAML file, instructing the runtime to create some “stuff” that is then available for use within our object.

Finally, it initialises all those handy fields, one for each named control in your XAML file, by looking each one up in the data interpreted from the XAML.

So, if you put something in your code-behind before InitializeComponent, then expect some NullReferenceExceptions when you attempt to access those fields!
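The ordering problem can be modelled without any Windows Phone code at all.  The Java sketch below is only an analogy: every name in it is hypothetical, a Map stands in for whatever the runtime builds from the XAML, and it is not the code that Visual Studio generates.  It simply walks through the steps described above (guard, load, look up) and shows why touching a control before the initialise call fails, with Java's NullPointerException playing the part of NullReferenceException.

```java
import java.util.HashMap;
import java.util.Map;

final class PageModel {
    // Guard so that a second call does nothing
    private boolean contentLoaded;
    // Stands in for the objects the runtime creates from the XAML (an assumption for illustration)
    private final Map<String, Object> loadedTree = new HashMap<>();
    // Stands in for a named control; it stays null until initializeComponent() has run
    StringBuilder titleText;

    void initializeComponent() {
        if (contentLoaded) {
            return; // already initialised, return immediately
        }
        contentLoaded = true;
        // "Load the XAML": create the objects it describes
        loadedTree.put("titleText", new StringBuilder("Hello"));
        // Look each named control up and assign it to its member
        titleText = (StringBuilder) loadedTree.get("titleText");
    }

    /** Returns true if using the control before initialisation throws. */
    static boolean failsBeforeInitialise() {
        PageModel page = new PageModel();
        try {
            page.titleText.length(); // the member is still null at this point
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Fails before initialise: " + failsBeforeInitialise());
        PageModel page = new PageModel();
        page.initializeComponent();
        System.out.println("After initialise: " + page.titleText);
    }
}
```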

The next few posts will be about my adventures with data binding.
