Getting Into (A) State With Java 8 Streams

Hot on the heels of my earlier foray into Java 8's new shininess comes more playing.

The Oracle documentation has this to say about state and streams:

Note also that attempting to access mutable state from behavioral parameters presents you with a bad choice with respect to safety and performance; if you do not synchronize access to that state, you have a data race and therefore your code is broken, but if you do synchronize access to that state, you risk having contention undermine the parallelism you are seeking to benefit from. The best approach is to avoid stateful behavioral parameters to stream operations entirely; there is usually a way to restructure the stream pipeline to avoid statefulness.

All that accepted, it is nonetheless sometimes necessary for a stream operation to look around at the world. So the question naturally becomes: how?

Enter Exhibit A, a toy application that, given a signal stream, will determine its phase: rising, level or falling.

package jdk8;

import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.DoubleStream;

public class Phaser {
    public static void main(String[] args) {
        final int RESOLUTION = 20;
        final int CYCLES = 2;

        final AtomicReference<Double> history = new AtomicReference<>(0.0D);

        // generate a few cycles of a nice sinusoidal signal
        DoubleStream.iterate(0.0D, n -> n + (2 * Math.PI) / RESOLUTION)
                .limit(CYCLES * RESOLUTION)
                .sequential()
                .map(Math::sin)
                .boxed()
//                .map(d ->
//                                ((Function<Double, SignalPair>) next ->
//                                        new SignalPair(history.getAndSet(next), next)
//                                ).apply(d)
//                )
                .map(d ->
                    new SignalPair(history.getAndSet(d), d)
                )
                .map(v -> String.format("%f,%f,%d", v.getPrevious(), v.getCurrent(), v.getPhase()))
                .forEachOrdered(System.out::println);
    }
}

class SignalPair {
    private final Double previous;
    private final Double current;

    public SignalPair(Double previous, Double current) {
        this.previous = previous;
        this.current = current;
    }

    public Double getCurrent() { return current; }

    public Double getPrevious() { return previous; }

    public int getPhase() {
        return (int) Math.signum(current - previous);
    }
}

Hipster-point earning features here include:

  • DoubleStream.iterate() presents an infinite generator for a sinusoidal double-valued signal with a specific resolution; it will be invoked a limited number of times (2 complete cycles in this case).
  • Processing a signal stream like this is inherently a sequential activity; hence .sequential()
  • It is easy to map a value x -> SIN(x) or execute println(s) using method references.
  • It's also easy to map a SignalPair instance to a CSV-formatted String instance.

The above are just small-denomination pieces of hipster currency, though. The real payoff can be seen in this drilldown:

        final AtomicReference<Double> history = new AtomicReference<>(0.0D);

        ...

                .map(d ->
                    new SignalPair(history.getAndSet(d), d)
                )

It's worth stating again that determining a signal's phase is inherently both sequential and differential in nature. To be correct, you MUST compare values n and n-1 in the order that they arrived, not in some parallel-friendly way.

The AtomicReference makes it possible to maintain state (a single Double instance, in this case) from one stream invocation/iteration to the next. The extra indirection imposed by the use of AtomicReference is actually needed: a lambda can only capture local variables that are effectively final, so map() cannot simply reassign a local Double, but it can update the value stored in an instance to which it has access. A Map (or, old-school style, even an element of an array) would probably also work.
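To make the "element of an array" remark concrete, here is a minimal sketch of the same previous-value trick using a one-element array as the holder. The class and method names (ArrayHolderPhase, phases) are made up for illustration, and the sketch assumes a strictly sequential stream, since a bare array offers no thread safety at all.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.DoubleStream;

// Hypothetical illustration: a one-element array standing in for AtomicReference.
class ArrayHolderPhase {
    // Returns the phase (-1, 0 or 1) of each sample relative to its predecessor.
    static List<Integer> phases(double... signal) {
        // The array reference is effectively final; its single slot is mutable.
        final double[] previous = {0.0D};

        return DoubleStream.of(signal)
                .sequential() // the holder is not thread-safe, so stay sequential
                .mapToObj(current -> {
                    int phase = (int) Math.signum(current - previous[0]);
                    previous[0] = current; // remember this sample for the next element
                    return phase;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(phases(0.0, 0.5, 1.0, 0.5, 0.5)); // [0, 1, 1, -1, 0]
    }
}
```

The array version is shorter, but it quietly drops the atomicity that AtomicReference provides, so it is only a fair swap while the pipeline stays sequential.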

It's worth comparing the 'longhand' mapping code in comments with the shorter form. I originally wrote the former before realising that the latter was possible. The equivalence isn't made particularly clear in any documentation that I have read, so for reference I kept the "new and shiny" version alongside the "new and even shinier" one.
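For anyone who wants to convince themselves of the equivalence, the following sketch puts the two forms side by side on a trivial mapping. Everything here (the class name, the "#" prefix, the sample values) is invented for the demonstration; only the shape of the two lambdas mirrors the Phaser code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration of the 'longhand' and 'shorthand' mapping styles.
class LonghandVersusShorthand {
    // Longhand: build a Function inside the lambda and apply it immediately.
    static List<String> longhand(List<Integer> values) {
        return values.stream()
                .map(d -> ((Function<Integer, String>) next -> "#" + next).apply(d))
                .collect(Collectors.toList());
    }

    // Shorthand: the lambda passed to map() already is the function.
    static List<String> shorthand(List<Integer> values) {
        return values.stream()
                .map(d -> "#" + d)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println(longhand(values));  // [#1, #2, #3]
        System.out.println(shorthand(values)); // [#1, #2, #3]
    }
}
```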

Does it work? Yes. Here is an edited version of the output:

0.000000,0.000000,0
0.000000,0.309017,1
0.309017,0.587785,1
0.587785,0.809017,1
0.809017,0.951057,1
0.951057,1.000000,1
1.000000,0.951057,-1
0.951057,0.809017,-1
0.809017,0.587785,-1
0.587785,0.309017,-1
...
-1.000000,-0.951057,1
...
0.951057,1.000000,1
1.000000,0.951057,-1
...
-0.951057,-1.000000,-1
-1.000000,-0.951057,1
...
-0.587785,-0.309017,1

On to Exhibit B. Another stateful algorithm: work out the area of a polygon, given the vertices of that polygon presented in sequential order.

package jdk8;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class Area {
    public static void main(String[] args) {

        List<Vertex> polygon = new ArrayList<Vertex>() {{
            add(new Vertex(0.0D, 0.0D));
            add(new Vertex(0.0D, 2.0D));
            add(new Vertex(2.0D, 2.0D));
            add(new Vertex(2.0D, 0.0D));
        }};

        Vertex v0 = polygon.get(0);
        polygon.add(v0);  // ensure closed polygon

        AtomicInteger ai = new AtomicInteger(0);
        double area = polygon.stream().limit(polygon.size() - 1).sequential()
                .peek(System.out::println)
                .mapToDouble(p -> {
                    Vertex nextV = polygon.get(ai.incrementAndGet());

                    return (nextV.getX() - p.getX()) * (nextV.getY() + p.getY());
                })
                .sum() / 2;

        System.out.printf("Area: %f\n", area);
    }
}

class Vertex {
    private final double x;
    private final double y;

    Vertex(double x, double y) {
        this.x = x;
        this.y = y;
    }

    public double getX() { return x; }

    public double getY() { return y; }

    @Override
    public String toString() {
        return "Vertex{" +
                "x=" + x +
                ", y=" + y +
                '}';
    }
}

This is not so different, but has one new low-score hipster-point earning feature:

  • .peek() is a useful 'debugging' tool.

It is probably worth pointing out that the initialisation of the polygon ArrayList instance is just standard Java. It's not something that one often sees around but this feature has been a part of Java since anonymous inner classes and instance initialisers were introduced in version 1.1.
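A small sketch of what that idiom actually does may help. The class name DoubleBraceDemo is invented; the point is only that the outer braces declare an anonymous subclass and the inner braces are an instance initialiser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of 'double brace' initialisation.
class DoubleBraceDemo {
    static List<String> build() {
        // Outer braces: an anonymous subclass of ArrayList.
        // Inner braces: an instance initialiser that runs during construction.
        return new ArrayList<String>() {{
            add("a");
            add("b");
        }};
    }

    public static void main(String[] args) {
        List<String> list = build();
        System.out.println(list);                               // [a, b]
        System.out.println(list.getClass() == ArrayList.class); // false
        System.out.println(list instanceof ArrayList);          // true
    }
}
```

The second line of output is the catch: the list is an instance of a new anonymous class rather than of ArrayList itself, which is one reason the idiom tends to stay in toy code.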

I hear you asking "Does this do its stuff?" Oh why do you doubt me so? Take a look:

Vertex{x=0.0, y=0.0}
Vertex{x=0.0, y=2.0}
Vertex{x=2.0, y=2.0}
Vertex{x=2.0, y=0.0}
Area: 4.000000

I feel obliged to include the disclaimer that all the above is not really anything that a true purveyor of Functional Goodness(™) would condone or be proud of. But what the heck, eh?! At least I'm not saying "what's the point of it all?", now am I?
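For the record, here is roughly what a less stateful take on the area calculation might look like: stream over the vertex indices rather than the vertices, so that "the next vertex" is computed from the index instead of being tracked in a counter. This is a sketch under my own assumptions (coordinates passed as two parallel arrays, class name StatelessArea), not a claim about the one true way to do it.

```java
import java.util.stream.IntStream;

// Hypothetical illustration: the same trapezoid sum, with no mutable counter.
class StatelessArea {
    static double area(double[] xs, double[] ys) {
        int n = xs.length;
        return IntStream.range(0, n)
                .mapToDouble(i -> {
                    int next = (i + 1) % n; // wrap around to close the polygon
                    return (xs[next] - xs[i]) * (ys[next] + ys[i]);
                })
                .sum() / 2;
    }

    public static void main(String[] args) {
        double[] xs = {0.0, 0.0, 2.0, 2.0};
        double[] ys = {0.0, 2.0, 2.0, 0.0};
        System.out.printf("Area: %f%n", area(xs, ys)); // Area: 4.000000
    }
}
```

Because each term depends only on its own index, there is no need to append the first vertex to close the polygon. As with the stateful version, the sign of the result depends on the direction in which the vertices are listed.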

Tags: Java, java8, Programming

It's Java 8 Playtime!

Today we are going to look at Java 8's shininess.

The task at hand is to implement a very simple, standalone HTTP server that would parse the query parameters from a GET request and return a trivial bit of JSON-formatted data.

A Java 8 "hello name" server, in other words.

I am also trying to see how many new toys I can work into the project and how idiomatic I can make the solution so that I can get a 'feel' for the "New Java."

Without further ado:

package jdk8;

import com.sun.net.httpserver.Headers;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) throws IOException {
        InetSocketAddress addr = new InetSocketAddress(8080);
        HttpServer server = HttpServer.create(addr, 0);

        server.createContext("/", (HttpExchange ex) -> {
            if ("GET".equalsIgnoreCase(ex.getRequestMethod())) {
                Headers responseHeaders = ex.getResponseHeaders();
                responseHeaders.set("Content-Type", "application/json");
                ex.sendResponseHeaders(200, 0);

                try (OutputStream responseBody = ex.getResponseBody()) {
                    responseBody.write(generateResponse(extractParams("name", ex.getRequestURI().getQuery())).getBytes());
                } catch (Exception e) {
                    e.printStackTrace(System.err);
                }
            }
        });
        server.setExecutor(Executors.newCachedThreadPool());
        server.start();
        System.out.printf("Server created; listening on port %d\n", addr.getPort());
    }

    private static final String JOIN = " and ";
    private static final String PFX = " ";
    private static final String SFX = "!";
    private static final String NOSTR = PFX + SFX;

    private static String generateResponse(List<String> strings) {
        String s = strings.stream()
                .collect(Collectors.collectingAndThen(Collectors.joining(JOIN, PFX, SFX),
                        (str) -> NOSTR.equals(str) ? str.trim() : str));
        return String.format("{\"greeting\": \"hello%s\"}", s);
    }

    private static List<String> extractParams(String key, String s) {
        // ("" + s): protect against s being null
        return Stream.of(("" + s).split("&"))
                .map(line -> line.split("="))
                .filter(pair -> pair.length == 2)
                .filter(entry -> entry[0].equalsIgnoreCase(key))
                .map(entry -> entry[1])
                .collect(Collectors.toList());
    }
}

There are a fair number of juicy tidbits here, including:

  • The classes in com.sun.net.httpserver. Perfectly legal to use and not contained within one of the restricted APIs, contrary to popular belief. Available since Java 1.6.
  • Try-with-resources. A Java 7 feature, but still little known/used it seems.
  • Executors. An apparently little-known Java 1.5 facility.
  • Lambdas all over the place.
  • Streams and collectors all over the place.

It seems to me that the java.util.stream.Collectors class is worthy of further study. There's a lot of niceness here, including Collectors.collectingAndThen(), Collectors.joining() and Collectors.toList(). I'm pretty sure that the better I get to know this class, the less I will end up reinventing stuff.
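As a taste of how these collectors compose, here is a cut-down sketch of the joining-then-tidying trick from generateResponse(). The class name, method name and literal strings are made up for the example; the Collectors calls are the real ones.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of Collectors.joining() wrapped in collectingAndThen().
class GreetingCollectors {
    static String greet(List<String> names) {
        return names.stream()
                .collect(Collectors.collectingAndThen(
                        // joining(delimiter, prefix, suffix)
                        Collectors.joining(" and ", "hello ", "!"),
                        // finishing step: tidy up the 'nobody to greet' case
                        s -> s.equals("hello !") ? "hello!" : s));
    }

    public static void main(String[] args) {
        System.out.println(greet(Arrays.asList("Bob", "Alice"))); // hello Bob and Alice!
        System.out.println(greet(Collections.emptyList()));       // hello!
    }
}
```

With an empty stream, joining() still emits the prefix and suffix, which is why a finishing function is handy for cleaning up the stray space.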

Does all this new and shiny stuff work? Well…the proof, as they say, is in the pudding:

So there you go, a small HTTP server full of Functional Goodness(™). What more can a trendy nerd want?

I guess that IF I was prepared to sacrifice any 'coolness' that I may have just accumulated, I would ask something like "But is all that Functional Goodness giving me anything over and above what I would normally get from Java?" And IF I was going to do that, I would be talking about code looking like:

    private static List<String> extractParams(String key, String s) {
        List<String> result = new ArrayList<>();
        // ("" + s): protect against s being null
        for (String param: ("" + s).split("&")) {
            String pair[] = param.split("=");
            if ((pair.length == 2) && key.equalsIgnoreCase(pair[0]))
              result.add(pair[1]);
        }
        return result;
    }

But I don't want to sacrifice all that accrued 'coolness', so I certainly won't put forward the hypothetical question I just hypothetically posed…. In any case, I know all about the potential for parallelism implied by that Functional Goodness, so it would have been silly of me to even think that question, wouldn't it ;-)

By potential, I mean (of course):

    private static List<String> extractParams(String key, String s) {
        // ("" + s): protect against s being null
        return Stream.of(("" + s).split("&"))
                .parallel()
                .map(line -> line.split("="))
                .filter(pair -> pair.length == 2)
                .filter(entry -> entry[0].equalsIgnoreCase(key))
                .map(entry -> entry[1])
                .collect(Collectors.toList());
    }

I've not really seen an example like this "out there", so I hope that the world is now a better place for having this code snippet in it!

Tags: Java, java8, Programming

Using Spock Configuration in Grails

My suite of Grails functional tests for one of my customer's applications has been growing and I have recently added a few tests covering reports that are fairly…ponderous.

So I have decided to tag some of my tests as Slow, so that I can skip them at will, while still running the bulk of my tests.

Segregating one's tests is a fairly common thing to want and do, with a common solution; see for instance this blog post.

The Grails environment introduces a little wrinkle into the above scheme: as Spock issue 184 so eloquently puts it: "When compiling a SpockConfig file in a Grails project, classes in the project code are not visible. Classes from the project dependencies can be used."

Issue 184 provides a workaround: use a (class-level in my case, but not absolutely required) annotation that is supplied from a dependency but which has no other effect on the system under test, thus:

import javax.annotation.Resource as Slow

@Slow
class MySpec extends GebReportingSpec {
    ...

It's worth noting how Groovy's "import … as …" helps keep the code a bit clearer and should make it easier to munge around if/when issue 184 is fixed.

The remaining issue: how to define whether or not to run Slow tests. Easily done…create the file test/functional/SpockConfig.groovy as shown:

import javax.annotation.Resource as Slow

runner {
    if (System.properties['ios.tests.functional.exclude.Slow']) {
        exclude {
            annotation Slow
        }
    }
}

Since I am a happy IntelliJ user, I set up a run configuration that looks like:

Of course, this same command line can be used away from IntelliJ if needed.

I'm now a happy and more productive tester.

Tags: Geb, Grails, Groovy, Programming, Spock

Artifactory and Grails

A while back, a colleague asked me how Artifactory and Grails work together.

I couldn't point him to a really good resource on the 'web then and I didn't have a ready answer for him. But that was then, this is now. Let me tell you a story.

There are three parts.

Are you sitting comfortably? Then I'll begin…

Artifactory As A Repository Proxy

Artifactory offers a Default Virtual Repository. It is simplest to just use this.

Repositories are configured in BuildConfig.groovy.

All you need is:

repositories {
    inherits true // Whether to inherit repository definitions from plugins

    mavenRepo id:"Artifactory" , url:"http://localhost:8081/artifactory/repo"
}

No other repositories need to be established here; Artifactory can deal with everything.

If you need special repositories, tell Artifactory to add them to its list of Remote Repositories, rather than adding a new repository directly to Grails.

To borrow a slogan: "it just works."

Publishing To Artifactory

You cannot publish to Artifactory's Default Virtual Repository. Nor should you need or want to. It is easy and appropriate to create a new Local Repository for your work. Like this:

Once you have your destination repository up and running, go back to BuildConfig.groovy and turn on the magic. Add the following to the top level (ie not nested within anything else) of the file:

grails.project.repos.default = "AT"

grails {
    project {
        repos {
            AT {
                url = "http://localhost:8081/artifactory/AT/"
                username = "bob"
                password = "MyUnguessablePassword"
            }
        }
    }
}

grails.project.groupId = "org.bob"

All pretty clear, I hope.

It is not strictly necessary to define a default repository in BuildConfig.groovy; one can be specified on the command line (as --repository=AT) but this can make life a little easier…and who doesn't want an easier life, eh?

Since we are working in a Maven-oriented world, it's fairly important to set the groupId. The associated artifactId and version information is picked up from the application.properties file.

Grails' release plugin does the heavy lifting, so add it to BuildConfig.groovy, thus:

plugins {
    ...

    build ":release:3.0.1"
}

All that's needed now is to run:

grails maven-deploy

And Bob's your uncle, as they say.

The end result:

Adding Manifest Data

It's extremely common to want/need to put extra information into an application's META-INF/MANIFEST.MF file.

In a Grails application, this is most easily done by hooking into the build lifecycle in scripts/_Events.groovy.

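I've set things up so that the build number comes either from the BUILD_NUMBER environment variable (as set by a CI server such as Jenkins) or from a buildNumber.properties file, if such exists: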

import static grails.build.logging.GrailsConsole.instance as CONSOLE

eventCompileStart = { kind ->
    grailsSettings.config['BUILD_NUMBER'] = determineBuildNumber()
}

// NB: Possible race condition here, but ignore for now
//     Assumes BUILD_NUMBER is an integer
private determineBuildNumber() {
    def buildNumber = 0
    def bnp = System.getenv('BUILD_NUMBER')
    if (bnp)
        buildNumber = val(bnp)
    else
        try {
            def bnPropsFile = new File(grailsSettings.baseDir, "buildNumber.properties")
            if (bnPropsFile.isFile() && bnPropsFile.canRead()) {
                def properties = new Properties()
                bnPropsFile.withReader { r ->
                    properties.load(r)
                }
                buildNumber = val(properties.getProperty('BUILD_NUMBER'))
                if (bnPropsFile.canWrite()) {
                    properties.setProperty('BUILD_NUMBER', "${buildNumber + 1}")
                    bnPropsFile.withWriter { w ->
                        properties.store(w, null)
                    }
                }
            }
        }
        catch (e) {
            CONSOLE.warn("Exception when reading/writing 'buildNumber.properties'; assuming '0': " + e.getMessage())

            buildNumber = 0
        }

    CONSOLE.info("Using BUILD_NUMBER=${buildNumber}")

    buildNumber
}

private val(String bn) {
    try {
        Integer.parseInt(bn)
    } catch (ignore) {
        CONSOLE.warn("Could not parse presented BUILD_NUMBER '${bn}' as Integer; assuming '0'")
        0
    }
}

eventCreateWarEnd = { warName, stagingDir ->
    ant.jar(destfile: warName, update: true) {
        manifest {
            attribute(name: 'BUILD_NUMBER', value: grailsSettings.config['BUILD_NUMBER'])
            attribute(name: 'Built-By', value: System.properties['user.name'])
            attribute(name: 'Build-Host', value: InetAddress.getLocalHost().getHostName())
        }
    }
}

Refer back to the previous picture to see the results of all this.
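If you would rather check the result programmatically, the standard java.util.jar API can read those attributes back. The sketch below parses an in-memory manifest so that it is self-contained; the class name ManifestPeek and the sample values (42, bob) are made up, and for a real WAR you would presumably obtain the Manifest from java.util.jar.JarFile#getManifest() instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

// Hypothetical illustration: reading main-section attributes from a manifest.
class ManifestPeek {
    static String mainAttribute(String manifestText, String name) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            // Returns null when the attribute is absent.
            return manifest.getMainAttributes().getValue(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample content only; the values are invented for the example.
        String sample = "Manifest-Version: 1.0\r\n"
                + "BUILD_NUMBER: 42\r\n"
                + "Built-By: bob\r\n"
                + "\r\n";
        System.out.println(mainAttribute(sample, "BUILD_NUMBER")); // 42
    }
}
```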

As an aside, I really appreciate Groovy's easy integration of Ant. I probably could have used Ant to better effect in determineBuildNumber() but that may be the subject of a later story.

Now you have it, Grails and Artifactory, sitting in a tree…not quite K.I.S.S.I.N.G., but getting together pretty closely.

Tags: Grails, Groovy, Programming

Grabbing A Library That Is Not In Maven

Some commercial vendors (I'm thinking of ones whose names start with 'O' in particular) haven't quite got the concept of making their developer community's lives easy and don't publish their library files into any maven-accessible repository.

Putting aside for the moment the thought that perhaps such vendors might not deserve to have their software used, it may well happen that life (or your PHB) conspires against you and forces you to dance with the devil.

As a Groovy Grapes user (you are, aren't you?), what are you to do?

You could start/populate your own Artifactory repository but that may be considered to be inappropriate for the userbase or might be too heavyweight for a quick-and-dirty script. What to do..what to do?

The answer lies in this stackoverflow question/answer session. I am recreating it here, for posterity…who knows, the link may go away one day…such is the way of the 'net.

This is the sort of script fragment that I am considering:

@Grapes([
    @Grab('com.oracle:ojdbc6:11.2.0.3.0'),
    @GrabConfig(systemClassLoader=true, initContextClassLoader=true)
])
import groovy.sql.Sql

def oracleSql = Sql.newInstance("jdbc:oracle:thin:@192.168.1.64:1521:xe", "user", "password",
                      "oracle.jdbc.OracleDriver")
...

The trick is to manually create a grapes-compatible repository entry, mimicking what the grapes system would normally do.

Two things are required:

  • a compatible filesystem structure within your .groovy/grapes directory
  • an ivy config descriptor

The first looks like this:

The second looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<ivy-module version="2.0">
        <info organisation="com.oracle"
                module="ojdbc6"
                revision="11.2.0.3.0"
                status="release"
                publication="20140216144306"
                default="true"
        />
        <configurations>
                <conf name="default" visibility="public"/>
        </configurations>
        <publications>
                <artifact name="ojdbc6" type="jar" ext="jar" conf="default"/>
        </publications>
</ivy-module>
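For what it's worth, here is a small sketch that computes where the two files would be expected to live. It assumes Grape's default cache layout as I understand it ([organisation]/[module]/ivy-[revision].xml for the descriptor and [organisation]/[module]/jars/[artifact]-[revision].jar for the jar); if your grapeConfig.xml has been customised, the paths will differ. The class and method names are invented for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: expected locations inside a Grape cache,
// ASSUMING the default [organisation]/[module]/... layout.
class GrapeCachePaths {
    static Path moduleDir(Path grapesRoot, String group, String module) {
        return grapesRoot.resolve(group).resolve(module);
    }

    static Path ivyDescriptor(Path grapesRoot, String group, String module, String version) {
        return moduleDir(grapesRoot, group, module).resolve("ivy-" + version + ".xml");
    }

    static Path jar(Path grapesRoot, String group, String module, String version) {
        return moduleDir(grapesRoot, group, module)
                .resolve("jars")
                .resolve(module + "-" + version + ".jar");
    }

    public static void main(String[] args) {
        Path root = Paths.get(System.getProperty("user.home"), ".groovy", "grapes");
        System.out.println(ivyDescriptor(root, "com.oracle", "ojdbc6", "11.2.0.3.0"));
        System.out.println(jar(root, "com.oracle", "ojdbc6", "11.2.0.3.0"));
    }
}
```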

Once this is all squared away Grape will Grab the specified jar quite happily.

Simple, when you know how.

Took me quite a while to track down though!

I extend my thanks to all those who contributed on stackoverflow.

Tags: Groovy, Programming

New & Improved: Getting A Hibernate SessionFactory

Hibernate 4.3.0 changes and deprecates a few things. One of these 'things' that changes is probably the very first 'thing' that a Hibernate application developer needs to do: obtain a SessionFactory.

Without further ado, what follows is the shiny new way of doing this:

package main;

import org.hibernate.HibernateException;
import org.hibernate.SessionFactory;
import org.hibernate.boot.registry.StandardServiceRegistryBuilder;
import org.hibernate.cfg.Configuration;

public class HibernateUtil {
        private static SessionFactory configureSessionFactory()
                        throws HibernateException {
                Configuration configuration = new Configuration().configure();
                StandardServiceRegistryBuilder builder = new StandardServiceRegistryBuilder()
                                .applySettings(configuration.getProperties());
                return configuration.buildSessionFactory(builder.build());
        }

        public static SessionFactory getSessionFactory() {
                return configureSessionFactory();
        }
}

My thanks go to Krishna Srinivasan over at JavaBeat's Java Dev Zone who provided the correct mechanism, even while the mighty participants at StackOverflow couldn't agree as to how this should be done.

Tags: Hibernate, Programming, Tools

Keeping Up With The Joneses

Everybody's doing it, so it must be good…'it' in this case being Google's AngularJS, the "Superheroic JavaScript MVW Framework."

Who am I to disagree?

To keep up with Mr & Mrs Everywhere I set myself the task of making a simple AJAX/REST-based cascading select form. And to ensure that my coolness factor doesn't shoot way off the scale, I also set myself the task of building the REST componentry with JEE7, rather than something else/better.

Why this task? Because I found very little true guidance on how to do it "out there." There exists a fair bit of "oh, it's easy" handwaving but no concrete example. Time to rectify that.

Along the way, I'll throw in a few toys like MireDot and MongoDB. Because I can, that's why!

To get us started, here's the form in all its Bootstrap-py glory:

It's easy to see what is going on, I hope: the form uses AJAX (via AngularJS' data binding and services facilities) to hit a REST resource that serves up a list of Australia's capital cities, and bind that list to the origin select form control. Use the selection from that control to go back to the resource to get another list of destination cities (I'm assuming that our shipping company is pretty small and doesn't cover all of Australia). Once one has the origin and destination, and a given quantity of 'things' to be shipped, hit a second REST resource to do the calculation.

I guess that I'm not really that cool, because I'm going to take a traditional 3-tier view of the application: Database, Server-side and Client-side. Works for me, anyway.

Database Tier

MongoDB is the storage engine for this application.

The database is a collection of documents like:

[
{_id: 1, "origin": "Brisbane", "destination": [
    {"city": "Brisbane", "cost": 0.00},
    {"city": "Hobart", "cost":88.88 },
    {"city": "Canberra", "cost":22.22},
    {"city": "Darwin", "cost":44.44},
    {"city": "Sydney", "cost":111.11}
    ]},
{_id: 2, "origin": "Hobart", "destination": [
    {"city": "Brisbane", "cost": 88.88},
    ...
    ]},
...
]

In one of those nasty old-fashioned uncool SQL databases, one would model this sort of master/detail structure and implicit 'contains' constraint with a one-to-many relationship. I could do that in MongoDB as well, but MongoDB makes it possible to use an embedded sub-model, as shown above. This is somewhat cleaner and might well be more performant for certain queries.
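To picture the embedded sub-model from the Java side, here is a sketch that uses plain Java collections as a stand-in for the documents. No MongoDB driver is involved, and the class names (Route, Destination) are invented; the idea is just that each origin carries its own list of destinations, so a lookup never leaves the parent document.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory stand-in for the embedded document structure.
class EmbeddedModelSketch {
    static final class Destination {
        final String city;
        final double cost;
        Destination(String city, double cost) { this.city = city; this.cost = cost; }
    }

    static final class Route {
        final String origin;
        final List<Destination> destinations; // embedded, not joined
        Route(String origin, Destination... destinations) {
            this.origin = origin;
            this.destinations = Arrays.asList(destinations);
        }
    }

    // Find the cost for an origin/destination pair, if both exist.
    static Optional<Double> cost(List<Route> routes, String origin, String city) {
        return routes.stream()
                .filter(r -> r.origin.equals(origin))
                .flatMap(r -> r.destinations.stream())
                .filter(d -> d.city.equals(city))
                .map(d -> d.cost)
                .findFirst();
    }

    static List<Route> sample() {
        return Arrays.asList(
                new Route("Brisbane",
                        new Destination("Brisbane", 0.00),
                        new Destination("Sydney", 111.11)),
                new Route("Hobart",
                        new Destination("Brisbane", 88.88)));
    }

    public static void main(String[] args) {
        System.out.println(cost(sample(), "Brisbane", "Sydney")); // Optional[111.11]
        System.out.println(cost(sample(), "Hobart", "Darwin"));   // Optional.empty
    }
}
```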

For this example, I desire to load the data into a collection called 'cities' in a database called 'cities.'

MongoDB naturally supplies a loader tool to make this happen:

mongoimport.exe --host localhost --port 30000 --db cities --collection cities --file "cities.mongo" --jsonArray

For completeness' sake, here are a few queries and useful miscellaneous commands that can be issued to the database:

> use cities
switched to db cities

> show databases
cats    0.203125GB
cities  0.203125GB
local   0.078125GB

> show collections
cities
system.indexes
>

> var x = db.cities.findOne({"destination.city": "Sydney"}, {"destination.$": 1})
> x
{
        "_id" : ObjectId("52a908d43780d39488948586"),
        "destination" : [
                {
                        "city" : "Sydney",
                        "cost" : 111.11
                }
        ]
}
> x.destination[0].cost
111.11
>

> db.cities.find().forEach(printjson)
{
        "_id" : 1,
        "origin" : "Brisbane",
        "destination" : [
                {
                        "city" : "Brisbane",
                        "cost" : 0
                },
                {
                        "city" : "Hobart",
                        "cost" : 88.88
                },
                {
                        "city" : "Canberra",
                        "cost" : 22.22
                },
                {
                        "city" : "Darwin",
                        "cost" : 44.44
                },
                {
                        "city" : "Sydney",
                        "cost" : 111.11
                }
        ]
}
...
>

> db.getCollection('cities').drop();
true
>

At this trivial level of use, MongoDB is pretty straightforward.

The coding is, as expected, low-level.

// https://raw.github.com/jvmisc22/mongo-jndi/master/src/main/java/com/mongodb/MongoCitiesDatabase.java
package mongodb;

import com.mongodb.*;
import rest.InvalidDataException;

import java.math.BigDecimal;
import java.math.RoundingMode;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

public class MongoCitiesDatabase {

    public static Set<String> findAllCities() throws InvalidDataException {
        DBObject fields = new BasicDBObject();
        fields.put("origin", 1);
        fields.put("_id", 0);
        DBObject project = new BasicDBObject("$project", fields);

        final Set<String> ss = new HashSet<>();
        MongoClient mc = MongoSingleton.INSTANCE.getMongoClient();
        DB db = null;
        try {
            db = mc.getDB("cities");

            // http://docs.mongodb.org/ecosystem/drivers/java-concurrency/
            db.requestStart();
            db.requestEnsureConnection();

            DBCollection cities = db.getCollection("cities");

            AggregationOutput output = cities.aggregate(project);
            if (!output.getCommandResult().ok())
                throw new InvalidDataException("No Cities found.");

            for (DBObject r : output.results()) {
                String s = (String) r.get("origin");
                ss.add(s);
            }
        } finally {
            try {
                db.requestDone();
            } catch (NullPointerException e) {
                /* SQUELCH!*/
            }
        }
        return ss;
    }

    public static Set<String> findDestinationsBySource(String source) throws InvalidDataException {
        DBObject match = new BasicDBObject("$match", new BasicDBObject("origin", source));

        DBObject fields = new BasicDBObject();
        fields.put("destination.city", 1);
        fields.put("_id", 0);
        DBObject project = new BasicDBObject("$project", fields);

        final Set<String> destinations = new HashSet<>();

        MongoClient mc = MongoSingleton.INSTANCE.getMongoClient();
        DB db = null;
        try {
            db = mc.getDB("cities");

            db.requestStart();
            db.requestEnsureConnection();

            DBCollection cities = db.getCollection("cities");

            AggregationOutput output = cities.aggregate(match, project);
            if (!output.getCommandResult().ok())
                throw new InvalidDataException(String.format("/{source: %1$s}/destinations. Source not found.", source));

            for (DBObject r : output.results()) {
                BasicDBList destination = (BasicDBList) r.get("destination");
                for (Object d : destination) {
                    BasicDBObject o = (BasicDBObject) d;
                    String c = (String) o.get("city");
                    destinations.add(c);
                }
            }
        } finally {
            try {
                db.requestDone();
            } catch (NullPointerException e) {
                /* SQUELCH!*/
            }
        }
        return destinations;
    }

    // http://www.mkyong.com/mongodb/java-mongodb-query-document/
    public static BigDecimal findCostBySourceAndDestination(String source, String dest) throws InvalidDataException {
        DBObject unwind = new BasicDBObject("$unwind", "$destination");

        BasicDBObject andQuery = new BasicDBObject();
        List<BasicDBObject> obj = new ArrayList<>();
        obj.add(new BasicDBObject("origin", source));
        obj.add(new BasicDBObject("destination.city", dest));
        andQuery.put("$and", obj);

        DBObject match = new BasicDBObject("$match", andQuery);

        DBObject fields = new BasicDBObject();
        fields.put("destination.cost", 1);
        fields.put("_id", 0);
        DBObject project = new BasicDBObject("$project", fields);

        MongoClient mc = MongoSingleton.INSTANCE.getMongoClient();
        DB db = null;
        try {
            db = mc.getDB("cities");

            db.requestStart();
            db.requestEnsureConnection();

            DBCollection cities = db.getCollection("cities");

            AggregationOutput output = cities.aggregate(unwind, match, project);
            if (!output.getCommandResult().ok())
                throw new InvalidDataException(String.format("{source: %1$s} or {destination: %2$s} not found.", source, dest));

            for (DBObject r : output.results()) {
                BasicDBObject destination = (BasicDBObject) r.get("destination");
                return new BigDecimal((Double) destination.get("cost")).setScale(2, RoundingMode.CEILING);
            }
        } finally {
            try {
                db.requestDone();
            } catch (NullPointerException e) {
                /* SQUELCH!*/
            }
        }

        // should not happen!
        throw new InvalidDataException(String.format("Given ({source: %1$s}, {destination: %2$s}); no data.", source, dest));
    }

    // Joshua Bloch's Java 1.5+ Singleton Pattern
    // http://books.google.com.au/books?id=ka2VUBqHiWkC&pg=PA17&lpg=PA17&dq=singleton+bloch&source=bl&ots=yYKmLgv1R-&sig=fRzDz11i4NnvspHOlooCHimjh2g&hl=en&sa=X&ei=xvOwUsLVAuSOiAeVyYHoAQ&ved=0CDgQ6AEwAg#v=onepage&q=singleton%20bloch&f=false
    private enum MongoSingleton {
        INSTANCE;
        private static final Logger log = Logger.getLogger(MongoSingleton.class.getName());
        private MongoClient mongoClient = null;

        MongoClient getMongoClient() {
            MongoClientOptions options = MongoClientOptions.builder()
                    .connectionsPerHost(25)
                    .build();

            if (mongoClient == null)
                try {
                    ServerAddress serverAddress = new ServerAddress("localhost", 30000);
                    mongoClient = new MongoClient(serverAddress, options);
                } catch (UnknownHostException uhe) {
                    String msg = "getMongoClient(); configuration issue. UnknownHostException: " + uhe.getMessage();
                    log.severe(msg);
                    throw new RuntimeException(msg, uhe);
                }

            return mongoClient;
        }
    }
}

This is quite reminiscent of straight JDBC coding. Assembly language coding for the database. Shudder. If I had to do this again, I would have gone for something like Jongo to make life somewhat easier.

There are a few points of interest here:

  • The use of the db.request{Start,EnsureConnection,Done} sequence to ensure that the MongoClient Java driver handles concurrency in a manner more compatible with server-side requirements than normal.
  • The use of $unwind, which "Peels off the elements of an array individually, and returns a stream of documents." and allows for search within the embedded 'destinations' array.
  • (A standard Java trick) the use of Enum to provide a singleton. Yes, I do know that "singletons are evil." In this case I just couldn't be bothered messing with Wildfly's innards to get an objectfactory asserted into the JNDI, etc. You should! See here and here.
  • RuntimeException-based exception handling makes life easy (and see later)
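
To make the $unwind point a little more concrete, here is a plain-Java sketch of what the stage conceptually does to one document. It doesn't touch MongoDB at all: the UnwindSketch class, its helper methods and the sample data are all made up for illustration, and the document shape (an 'origin' plus an embedded 'destination' array of city/cost pairs) is only what the queries above appear to assume.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnwindSketch {

    // Hypothetical helper: builds a {city, cost} sub-document.
    static Map<String, Object> dest(String city, double cost) {
        Map<String, Object> d = new LinkedHashMap<>();
        d.put("city", city);
        d.put("cost", cost);
        return d;
    }

    // Made-up sample document in the shape the queries above seem to assume.
    static Map<String, Object> sampleDoc() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("origin", "Sydney");
        doc.put("destination", Arrays.<Object>asList(dest("Melbourne", 10.0), dest("Perth", 25.5)));
        return doc;
    }

    // Emits one copy of the document per element of the array field,
    // with the array replaced by that single element.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> unwind(Map<String, Object> doc, String field) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Object element : (List<Object>) doc.get(field)) {
            Map<String, Object> copy = new LinkedHashMap<>(doc);
            copy.put(field, element);
            out.add(copy);
        }
        return out;
    }

    public static void main(String[] args) {
        // Each printed document now carries exactly one destination.
        for (Map<String, Object> d : unwind(sampleDoc(), "destination")) {
            System.out.println(d);
        }
    }
}
```

Once every document carries a single destination, matching on 'destination.city' and projecting 'destination.cost' becomes an ordinary field lookup, which is why the Java code above can cast r.get("destination") to a single BasicDBObject.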

Server Tier

I am 'exploring' JAX-RS in JEE7. No doubt as the result of a long exposure, JBoss seems to be somewhere down "in my DNA" and thus I will eschew Glassfish 4 (which now has a decidedly uncertain future, anyway) and go with Wildfly 8 CR1.

I'm aiming to build a couple of REST resources. Nothing too fancy, level 2 of the Richardson REST Maturity Model seems a happy place to be. The resources will be: Cities and Shipping. No prizes for guessing what each is for, so without further ado…

Cities

package rest;

import mongodb.MongoCitiesDatabase;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

@Path("/cities")
@Produces(MediaType.APPLICATION_JSON)
public class Cities {

    private static final Logger log = Logger.getLogger(Cities.class.getName());

    @GET
    public Map<String, Set<String>> getDefault() {
        log.info("getDefault()");

        return sources();
    }

    @GET
    @Path("/sources")
    public Map<String, Set<String>> sources() {
        log.info("sources()");

        return new HashMap<String, Set<String>>() {{
            put("sources", MongoCitiesDatabase.findAllCities());
        }};
    }

    @GET
    @Path("/{source}/destinations")
    public Map<String, Set<String>> destination(@PathParam("source") final String source) {
        log.info(String.format("/{source: %1$s}/destinations", source));

        return new HashMap<String, Set<String>>() {{
            put("destinations", MongoCitiesDatabase.findDestinationsBySource(source));
        }};
    }
}

A pretty straightforward, read-only resource.

While not strictly necessary, I tend to ensure that my JSON objects are all encapsulated as a single entry in a top-level container map. Maybe it's the XML-induced OCD surfacing through my psyche ("thou shalt have but a single root to a document") but I believe that this makes life a teeny-weeny bit nicer. Most examples over at JSON.org are like this, too. There's another reason to encapsulate a response like this: it is ridiculous that JEE7 (via JAX-RS) makes it less-than-straightforward (ie: hard) to just return plain old List<String>. Don't believe me? LMGTFY.
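
As an aside, the single-root wrapping doesn't have to use the double-brace HashMap idiom shown in the resource. A minimal sketch of an alternative, assuming the JSON provider is happy to serialise an immutable java.util map (I haven't checked that against Wildfly); the EnvelopeSketch class, the envelope() helper and the city names are hypothetical:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class EnvelopeSketch {

    // Wraps a payload under a single named root, e.g. {"sources": [...]}.
    static <T> Map<String, T> envelope(String root, T payload) {
        return Collections.singletonMap(root, payload);
    }

    public static void main(String[] args) {
        // Made-up data standing in for MongoCitiesDatabase.findAllCities().
        Set<String> cities = new TreeSet<>(Arrays.asList("Sydney", "Melbourne"));
        System.out.println(envelope("sources", cities));
    }
}
```

The trade-off is that the returned map can't be modified afterwards, which seems fine for a read-only resource. It also avoids creating an anonymous HashMap subclass per call site.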

Shipping

Shipping is an "Algorithmic Resource." To a degree, referring to this as a resource represents REST doublethink. To us oldies, it is clearly a plain old service.

package rest;

import mongodb.MongoCitiesDatabase;

import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

/**
 * An "Algorithmic REST Resource" that determines the cost of shipping items around the capital cities of Australia.
 *
 * @servicetag Cities
 *
 * @author Bob
 */

@Path("/shipping")
public class Shipping {

    private static final Logger log = Logger.getLogger(Shipping.class.getName());

    /**
     * Determines the cost of shipping items around the capital cities of Australia.
     *
     * @param  origin Where to ship from
     * @param destination Where to ship to
     * @param quantity how many to ship
     * @return The cost of shipping items
     * @throws InvalidDataException leading to a 400 return status.
     * @statuscode 400 If any of the origin or destination parameters can't be found.
     */
    @Produces(MediaType.APPLICATION_JSON)
    @GET
    @Path("/calculate")
    public Map<String, Map<String, BigDecimal>> calculate(@QueryParam("origin") String origin, @QueryParam("destination") String destination,
                                                          @DefaultValue("1") @QueryParam("quantity") Integer quantity) {
        log.info(String.format("calculate():: Origin: %1$s; Destination: %2$s; Quantity: %3$d", origin, destination, quantity));

        BigDecimal costPer = MongoCitiesDatabase.findCostBySourceAndDestination(origin, destination);
        final BigDecimal res = costPer.multiply(BigDecimal.valueOf(quantity).setScale(2, RoundingMode.CEILING));

        final Map<String , BigDecimal> resMap = new HashMap<String, BigDecimal>() {{
            put("result", res);
        }};

        return new HashMap<String, Map<String, BigDecimal>>() {{
            put("calculate", resMap);
        }};
    }
}

There's a bit of Javadoc here, which I'll talk about later on, but otherwise, there's nothing special to see here.
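
One small arithmetic wrinkle is worth a sketch. In calculate(), setScale(2, ...) is applied to the quantity rather than to the product, and BigDecimal multiplication adds the scales of its operands. The following stand-alone snippet compares the expression as written with a multiply-then-round alternative; the class name, method names and the 12.35 cost are made up for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class ShippingMathSketch {

    // Mirrors the expression in calculate(): the scale is set on the quantity.
    static BigDecimal asWritten(BigDecimal costPer, int quantity) {
        return costPer.multiply(BigDecimal.valueOf(quantity).setScale(2, RoundingMode.CEILING));
    }

    // Alternative: multiply first, then round the product to two places.
    static BigDecimal roundedProduct(BigDecimal costPer, int quantity) {
        return costPer.multiply(BigDecimal.valueOf(quantity)).setScale(2, RoundingMode.CEILING);
    }

    public static void main(String[] args) {
        BigDecimal costPer = new BigDecimal("12.35"); // made-up cost with scale 2
        System.out.println(asWritten(costPer, 3));      // scale 2 + scale 2 = scale 4
        System.out.println(roundedProduct(costPer, 3)); // scale 2
    }
}
```

Both values are numerically equal, so the browser shows the same figure either way once the JSON number is parsed. The difference only matters if the exact textual form of the amount is important to a consumer.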

Exception Handling

If you refer back to the MongoCitiesDatabase class, you will note that application exceptions are signalled via the InvalidDataException class.

You should also note that none of the client code explicitly handles such exceptions (a benefit of making the class a descendant of the unchecked RuntimeException, rather than the plain old, and some would say evil, compiler-checked Exception). This is possible because we use an "exception mapper" to ensure that exceptions are handled and mutated into acceptable REST behaviour:

package rest;

import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import javax.ws.rs.ext.ExceptionMapper;
import javax.ws.rs.ext.Provider;

@Provider
public class InvalidDataMapper implements ExceptionMapper<InvalidDataException> {

    @Override
    public Response toResponse(InvalidDataException ide) {
        return Response.status(Response.Status.BAD_REQUEST).entity(ide.getMessage()).type(MediaType.TEXT_PLAIN_TYPE).build();

    }
}

This mapper ensures that a BAD_REQUEST (400) response is issued to the client, bundled with a plain-text message.
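
The division of labour can be sketched without any JAX-RS machinery. InvalidDataException itself isn't shown in this post, so the version below is only an assumed shape (an unchecked exception carrying a message); findCost() and dispatch() are hypothetical stand-ins for the data tier and for the container-plus-mapper respectively.

```java
final class ExceptionMappingSketch {

    // Assumed shape of the application's exception: unchecked, message only.
    static class InvalidDataException extends RuntimeException {
        InvalidDataException(String message) {
            super(message);
        }
    }

    // Stand-in for the data tier. Because the exception is unchecked,
    // neither this method nor its callers need a 'throws' clause.
    static double findCost(String source, String dest) {
        if (!"Sydney".equals(source)) {
            throw new InvalidDataException("{source: " + source + "} not found.");
        }
        return 12.35; // made-up cost
    }

    // Stand-in for the container and mapper: one central place turns the
    // exception into a status code plus a plain-text body.
    static String dispatch(String source, String dest) {
        try {
            return "200 " + findCost(source, dest);
        } catch (InvalidDataException ide) {
            return "400 " + ide.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch("Sydney", "Perth"));
        System.out.println(dispatch("Atlantis", "Perth"));
    }
}
```

In the real application the try/catch lives inside the JAX-RS runtime, which is what lets the resource classes stay free of error-handling code.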

Application

JAX-RS likes to know how all the various resources are linked together. The simplest way to let it know what's what is to provide an Application class. For this project, the requisite class is:

package rest;

import javax.ws.rs.ApplicationPath;
import javax.ws.rs.core.Application;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

@ApplicationPath("/rest")
@SuppressWarnings("unused")
public class ApplicationConfig extends Application {
    public Set<Class<?>> getClasses() {
        return new HashSet<Class<?>>(Arrays.asList(Shipping.class, Cities.class, InvalidDataMapper.class));
    }
}

JSON Vulnerability Protection

AngularJS has a built-in mechanism for consuming JSON-formatted data that is protected against misuse. The servlet filter given below shows how to create protected data suitable for AngularJS' use by prepending all JSON data with a syntactically illegal fragment of JSON. While AngularJS knows to automatically remove this standard 'nasty' prefix from any data given to it, the assumption is that a "baddie's" application won't know that it has to do this and so will encounter JSON parse errors, rendering the protected data inaccessible. Google (they use a slightly different/nastier approach), Facebook and many other big sites do this sort of thing, so it must be A Good Thing to do, right…

package rest;

import javax.servlet.*;
import javax.servlet.annotation.WebFilter;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;
import javax.ws.rs.core.MediaType;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;

@WebFilter(filterName = "AngularJSJSONVulnerabilityProtectionFilter",
        urlPatterns = {"/rest/*"})
public class AngularJSJSONVulnerabilityProtectionFilter implements Filter {

    // JSON Vulnerability protection in angular
    // see: http://docs.angularjs.org/api/ng.$http
    private static final String AngularJSJSONVulnerabilityProtectionString = ")]}',\n";

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain) throws IOException, ServletException {

        HttpServletResponse response = (HttpServletResponse) res;

        StringResponseWrapper responseWrapper = new StringResponseWrapper(response);

        chain.doFilter(req, responseWrapper);

        String content = responseWrapper.toString();

        // turn on AngularJS' JSON vulnerability protection feature
        // ignore mapped exceptions and anything else not 'Kosher'
        if (MediaType.APPLICATION_JSON.equals(responseWrapper.getHeader("Content-Type")))
            content = String.format("%1$s%2$s", AngularJSJSONVulnerabilityProtectionString, content);

        byte[] bytes = content.getBytes();

        response.setContentLength(bytes.length);
        response.getOutputStream().write(bytes);
    }

    @Override
    public void init(FilterConfig config) throws ServletException {
    }

    @Override
    public void destroy() {
    }

    // http://stackoverflow.com/questions/1302072/how-can-i-get-the-http-status-code-out-of-a-servletresponse-in-a-servletfilter/1302165#1302165
    private class StringResponseWrapper
            extends HttpServletResponseWrapper {

        private StringWriter writer;

        public StringResponseWrapper(HttpServletResponse response) {
            super(response);
            writer = new StringWriter();
        }

        @Override
        public PrintWriter getWriter() {
            return new PrintWriter(writer);
        }

        @Override
        public ServletOutputStream getOutputStream() {
            return new StringOutputStream(writer);
        }

        @Override
        public String toString() {
            return writer.toString();
        }
    }

    private class StringOutputStream extends ServletOutputStream {
        private StringWriter stringWriter;

        public StringOutputStream(StringWriter stringWriter) {
            this.stringWriter = stringWriter;
        }

        public void write(int c) {
            stringWriter.write(c);
        }

        public boolean isReady() {
            return true;
        }

        public void setWriteListener(WriteListener writeListener) {
        }
    }
}

This is all much more cumbersome than I would like to see. Sadly, JEE7 continues to be a rather verbose beastie.
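
Stripped of the servlet plumbing, the idea boils down to two tiny string operations. This is just a sketch of the round trip: protect() mimics what the filter does to a JSON body, and strip() mimics what a cooperating client must do before parsing. The class and method names are made up, and strip() is not a copy of AngularJS' own implementation.

```java
final class JsonPrefixSketch {

    // Same prefix as the filter above uses.
    static final String PREFIX = ")]}',\n";

    // Server side: make the body unparseable as plain JSON or script.
    static String protect(String json) {
        return PREFIX + json;
    }

    // Cooperating client side: remove the prefix, if present, before parsing.
    static String strip(String body) {
        return body.startsWith(PREFIX) ? body.substring(PREFIX.length()) : body;
    }

    public static void main(String[] args) {
        String json = "{\"sources\":[\"Sydney\"]}"; // made-up payload
        String wire = protect(json);
        System.out.println(wire);
        System.out.println(strip(wire).equals(json));
    }
}
```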

Client Side Tier

This is a simple AngularJS application. Rather than build from scratch, I am using Angular Seed.

I am also using Twitter bootstrap to give me some basic prettiness. I'm not using any of Bootstrap's Javascript-based componentry. It is apparently too much to expect these two big lumps of Javascript to play well together. There are a couple of projects aiming to "reimplement" the missing components: this and this. Both are quite low fidelity attempts and so neither is truly Bootstrap. This may or may not matter to you and rather than rant about how imperfect the world is, I'll just swallow my bile and get on with life…

There are four basic Points Of Interest, so let's take a look.

app.js

This is where angular creates routes, assigns controllers and does the rest of its housekeeping. For this app, it is fairly simple:

'use strict';

// Declare app level module which depends on filters, and services
angular.module('cascadeSelects', [
  'ngRoute',
  'cascadeSelects.filters',
  'cascadeSelects.services',
  'cascadeSelects.directives',
  'cascadeSelects.controllers'
]).
config(['$routeProvider', '$httpProvider', function($routeProvider, $httpProvider) {
  $routeProvider.when('/', {templateUrl: 'partials/cascadeSelectsPartial.html', controller: 'CascadeSelectsController'});
  $routeProvider.otherwise({redirectTo: '/'});
}]);

I am following the angular-seed convention of using a template along with views/partials for content even though for this application there is only 1 view. So sue me, I am too lazy to hack anything more specific!

HTML

Leaving aside the very basics of AngularJS, as embodied in the template index.html file, it is worth taking a quick look at the file 'partials/cascadeSelectsPartial.html.'

This embodies an HTML form, with added AngularJS goodness:

<form class="form-horizontal" role="form">
    <fieldset>
        <legend>Calculator:</legend>

        <div class="form-group">
            <label for="originSelect" class="col-sm-2 control-label">Origin</label>

            <div class="col-sm-6">
                <select id="originSelect" class="form-control" ng-model="cities.chosen"
                        ng-options="src for src in cities.sources">
                </select>
            </div>
        </div>

        <div class="form-group">
            <label for="destinationSelect" class="col-sm-2 control-label">Destination</label>

            <div class="col-sm-6">
                <select id="destinationSelect" class="form-control" ng-model="destinations.chosen"
                        ng-options="dest for dest in destinations.destinations">
                </select>
            </div>
        </div>

        <div class="form-group">
            <label for="quantityInput" class="col-sm-2 control-label">Quantity</label>

            <div class="col-sm-6">
                <input type="number" name="input" id="quantityInput" ng-model="shipping.quantity" min="1" required>
            </div>
        </div>

        <div class="form-group">
            <label for="shippingButton" class="col-sm-2 control-label"> </label>

            <div class="col-sm-6">
                <button id="shippingButton" ng-click="shipping.calculateShipping()" class="btn1">Calculate</button>
            </div>
        </div>

        <div class="form-group">
            <label for="shippingButton" class="col-sm-2 control-label"> </label>

            <div class="col-sm-8" ng-if="shipping.calculateShippingResponse" ng-animate="'example'">
                {{shipping.quantity}} item(s), shipping from {{cities.chosen}} to {{destinations.chosen}}:
                ${{shipping.calculateShippingResponse}}
            </div>
            <div class="col-sm-8" ng-if="error" ng-animate="'example'">
                {{error}}
            </div>
        </div>
    </fieldset>
</form>

The salient points:

  • The two select controls show AngularJS data binding to collections originating from REST. The first collection is 'cities.sources' and the second cascades from the chosen city ('cities.chosen') into 'destinations.destinations.'
  • The 'Calculate' button shows binding to the 'shipping.calculateShipping' function in the controller.
  • There exist two conditionally rendered response divs that show how binding to various elements can drive the page content model.

controllers.js

The embodiment of the application flow.

'use strict';

/* Controllers */

angular.module('cascadeSelects.controllers', []).
    controller('CascadeSelectsController', ['$scope', 'Cities', function($scope, Cities) {

        $scope.error = undefined;

        // TODO: something better needed here
        function error(r) {
            $scope.error = "Error: " + r.status + ". Message: " + r.data;
        }

        // Instantiate an object to store your scope data in (Best Practices)
        // http://coder1.com/articles/consuming-rest-services-angularjs
        $scope.cities = {
            sources: null,
            chosen: null
        };
        $scope.destinations = {
            destinations: null,
            chosen: null
        };
        $scope.shipping = {
            calculateShipping: function() {
                Cities.calculate.get({origin: $scope.cities.chosen, destination: $scope.destinations.chosen,
                                      quantity: $scope.shipping.quantity}, function(response) {
                    $scope.shipping.calculateShippingResponse = response.calculate.result;
                }, function(httpResponse) {
                    console.log(error(httpResponse));
                })
            },
            calculateShippingResponse: undefined,
            quantity: 1
        };

        Cities.cities.query(function(response) {
            $scope.cities.sources = response.sources;
            $scope.cities.chosen = $scope.cities.sources[0]
        }, function(httpResponse) {
            console.log(error(httpResponse));
        });

        $scope.$watch("[shipping.quantity, destinations.chosen, cities.chosen]", function(newValue, oldValue, scope) {
            $scope.shipping.calculateShippingResponse = undefined;
            $scope.error = undefined;
        }, true);

        $scope.$watch("cities.chosen", function(newValue, oldValue, scope) {
            if (newValue === null)
                return;
            Cities.destinations.query({source: newValue}, function(response) {
                    $scope.destinations.destinations = response.destinations;
                    $scope.destinations.chosen = $scope.destinations.destinations[0]
                }, function(httpResponse) {
                    console.log(error(httpResponse));
                }
            )
        }, true);
    }]);

Notable points:

  • Nothing is bound directly to $scope; everything is handled using a "poor man's namespace" mechanism, instead. This is easier to read and also less likely to suffer conflict with other parts in a large application.
  • Note how $watch allows for cascading the selects.
  • Note that 'cities.chosen' is watched more than once. This allows a slightly more structured approach to having multiple tasks operate in response to a single stimulus.
  • Where the HTML defines data binding to various elements, the controller is the entity that populates $scope with these necessary elements.
  • There is a little, very simple, error handling in play here.

As a structural nicety, the controller calls on a related service to do the REST heavy lifting.

services.js

The service uses AngularJS' $resource service to handle the low-level details of driving HTTP and interacting with a restful resource.

'use strict';

/* Services */

angular.module('cascadeSelects.services', ['ngResource']).
    factory("Cities", function($resource){

        var context = '/web_war';
        return {
            cities: $resource(context + '/rest/cities', {}, {
                query: {method: 'GET', params: {}, isArray: false}
            }),
            destinations: $resource(context + '/rest/cities/:source/destinations', {}, {
                query: {method: 'GET', params: {source: '@source'}, isArray: false}
            }),
            calculate: $resource(context + '/rest/shipping/calculate', {}, {
                get: {method: 'GET', params: {origin: '@origin', destination: '@destination', quantity: '@quantity'}, isArray: false}
            })
        };
    })
    .value('version', '1.0');

Noteworthy here are:

  • The application's '/web_war' context is hard coded. Yuk! Bad, bad Bob! It is possible to get it from the incoming request but I didn't bother for this application. Put it down to laziness, again.
  • The use of $resource for REST interactions.
  • This service handles multiple URLs. This is a nice feature that's not really documented anywhere 'official', as far as I can see.
  • The convention whereby the URL definition ':var' is fulfilled using the parameter '@var'. Something else not actually documented as far as I could see, so thank you stackoverflow.
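
To illustrate the ':var' idea in the abstract, here is a rough Java sketch of URL-template expansion: parameters that match a ':name' placeholder are substituted into the path and whatever is left over goes onto the query string. This shows the general idea only and is not how $resource is actually implemented. The class and helper names are hypothetical, and the sketch deliberately skips URL-encoding and assumes no placeholder name is a prefix of another.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class UrlTemplateSketch {

    // Hypothetical convenience: builds an ordered map from key/value pairs.
    static Map<String, String> params(String... keyValues) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            m.put(keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    // Substitutes ':name' placeholders; unmatched parameters become the query string.
    // NOTE: no URL-encoding is done here, so this is only safe for simple values.
    static String expand(String template, Map<String, String> params) {
        String url = template;
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            String placeholder = ":" + e.getKey();
            if (url.contains(placeholder)) {
                url = url.replace(placeholder, e.getValue());
            } else {
                query.append(query.length() == 0 ? '?' : '&')
                     .append(e.getKey()).append('=').append(e.getValue());
            }
        }
        return url + query;
    }

    public static void main(String[] args) {
        System.out.println(expand("/rest/cities/:source/destinations", params("source", "Sydney")));
        System.out.println(expand("/rest/shipping/calculate",
                params("origin", "Sydney", "destination", "Perth", "quantity", "2")));
    }
}
```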

So there you are: a working application showing how to do REST-driven cascading selects using AngularJS. Hopefully in the future, the various search engines will be able to point any needy developer this way.

Toy Time

But wait! There's more!

Given that there is no schema equivalent for JSON/REST, the need for good documentation is paramount. Given, too, that devs (including yours truly) are dreadful at writing doco, there is a need for a good toy to help us document our beautiful APIs.

"MireDot is a REST API documentation generator for Java that generates beautiful interactive documentation with minimal configuration effort."

How does MireDot go for this application? Like this:

Refer back to the Shipping resource. You can see how Miredot reads the Javadoc comments and incorporates them into the generated doco. Miredot also supplies a few helpful Javadoc annotations (@servicetag/@statuscode) and also some 'real' annotations to guide the generation process. Miredot also has its own set of configuration options, and this can alter what is generated as well.

Not too shabby!

I'm not sure I like the abstract way that the JSON payload is shown in this case but it is tweakable and it's early days for the product.

I found Miredot easy to get and set up and when I corresponded with the developers (sending a bug report and a few opinions), they were courteous and responsive. What's not to like!

There's a free and a paid version. I played with the free version. Check it out!
(NB: I'm not affiliated in any way.)

You can download the IntelliJ project (sans Miredot configuration, please note), should you so desire.

Tags: Java, Javascript, Programming, REST, Tools

Kill A Process With A Specific Command Line From The Command-Line In Windows

You learn something new every day!

After a bit of wrestling with Windows' tasklist and taskkill tools, I finally took to the fabulous interweb and came up with this, viz:

wmic Path win32_process Where "Caption Like '%java.exe%' AND CommandLine Like '%groovy.home%'" Call Terminate

I knew about WMIC, but I didn't appreciate how versatile it actually was.

I do now!

PS: I also appreciate the PowerShell version so I guess that makes two things I have learned today.

Tags: Tools

Enterprise & Media Storage In The Cloud

Amazon is really going all out to win the "cloud wars."

Witness:

Long may the wars continue…I get to attend free events like the above.

Thanks to Amazon for putting on another informative event.

Tags: Cloud, Tools

Playing

Update:
No sooner had I posted this than the excellent Tim Yates contacted me to show me how a true master does things, viz:

Another Groovy option might be to mark the `Statement` class with `@groovy.transform.Canonical` and then do:

text.split( /<(\d+)>/ )
    .collect { it.split( ',' ) }
    .findAll { it.size() == 9 }
    .collect { new Statement( *it ) }
    .each { println it.dump() }

Beautiful! I love this solution! Tim's work is always informative.

I admit that each time I wrote the 'if' statements I said to myself "there simply has to be a better way!" Trouble is, I couldn't see through to that better way. I doff my cap to Tim for teaching me.

Similarly munging the other examples here is Left As An Exercise For The Reader (I'm too lazy, in other words).
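
That said, the same split/filter/collect shape translates fairly directly to Java 8 streams, so here is a rough sketch for comparison. The Statement class is a simplified stand-in (it keeps the nine fields in an array rather than as item1..item9) and the input is a shortened copy of the sample text, so treat it as an illustration rather than a faithful port.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class StatementSplitSketch {

    // Shortened copy of the sample input used throughout this post.
    static final String TEXT = "<1>TCODE"
            + "<2>14044,20110331,0,GBP,0,14044,20110331,52,TT"
            + "<3>14044,20110331,0,GBP,0,14044,20110331,401,MM";

    // Simplified stand-in for the Statement class: holds the nine fields as-is.
    static final class Statement {
        final String[] items;

        Statement(String[] items) {
            this.items = items;
        }

        @Override
        public String toString() {
            return "Statement" + Arrays.toString(items);
        }
    }

    static List<Statement> parse(String text) {
        return Pattern.compile("<\\d+>")
                .splitAsStream(text)                  // cut the text at each <n> tag
                .map(s -> s.split(",", -1))           // -1 keeps trailing empty fields
                .filter(fields -> fields.length == 9) // drop "TCODE" and the empty lead-in
                .map(Statement::new)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        parse(TEXT).forEach(System.out::println);
    }
}
```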

And now, back to the original post…

This post on the Grails user list got me thinking. So I thought and then I played, and played…and played.

Below are Groovy, Ruby, Haskell and Scala versions of the same program.

Groovy Version

class Statement {
  String item1
  String item2
  String item3
  String item4
  String item5
  String item6
  String item7
  String item8
  String item9
}

def text ='<1>TCODE<2>14044,20110331,0,GBP,0,14044,20110331,52,TT<3>14044,20110331,0,GBP,0,14044,20110331,401,MM<4>14044,20110331,0,GBP,0,14044,20110331,403,MM<5>14044,20110331,0,GBP,0,14044,20110331,701,SW<6>14044,20110331,0,GBP,0,14044,20110331,701,SW<7>14044,20110331,0,GBP,0,14044,20110331,701,SW<8>14044,20110331,0,GBP,0,14044,20110331,701,SW'

def matcher = text =~ /<(\d+)>([^<]*)/
matcher.each { it, csv = it[2] ->
  def split = csv.split(',', -1)

  if (split.size() == 9) {
    def itemMap = split.inject([:]) { map, o -> map << [("item${map.size() + 1}".toString()): o] }

    def s = new Statement(itemMap)

    // do 'something' with the new Statement
    println s.dump()
  }
}

And the result:

<Statement@1a260305 item1=14044 item2=20110331 item3=0 item4=GBP item5=0 item6=14044 item7=20110331 item8=52 item9=TT>
<Statement@62e7f06c item1=14044 item2=20110331 item3=0 item4=GBP item5=0 item6=14044 item7=20110331 item8=401 item9=MM>
<Statement@6959752e item1=14044 item2=20110331 item3=0 item4=GBP item5=0 item6=14044 item7=20110331 item8=403 item9=MM>
<Statement@701c550a item1=14044 item2=20110331 item3=0 item4=GBP item5=0 item6=14044 item7=20110331 item8=701 item9=SW>
<Statement@54133d06 item1=14044 item2=20110331 item3=0 item4=GBP item5=0 item6=14044 item7=20110331 item8=701 item9=SW>
<Statement@3b0b8009 item1=14044 item2=20110331 item3=0 item4=GBP item5=0 item6=14044 item7=20110331 item8=701 item9=SW>
<Statement@7002ed27 item1=14044 item2=20110331 item3=0 item4=GBP item5=0 item6=14044 item7=20110331 item8=701 item9=SW>

Ruby Version

class Statement < Struct.new(:item1, :item2, :item3, :item4, :item5, :item6, :item7, :item8, :item9)
end

text = '<1>TCODE<2>14044,20110331,0,GBP,0,14044,20110331,52,TT<3>14044,20110331,0,GBP,0,14044,20110331,401,MM<4>14044,20110331,0,GBP,0,14044,20110331,403,MM<5>14044,20110331,0,GBP,0,14044,20110331,701,SW<6>14044,20110331,0,GBP,0,14044,20110331,701,SW<7>14044,20110331,0,GBP,0,14044,20110331,701,SW<8>14044,20110331,0,GBP,0,14044,20110331,701,SW'

matches = text.enum_for(:scan, /<(\d+)>([^<]*)/).map do
  Regexp.last_match
end
matches.each do |it, csv = (it[2])|
  split = csv.split(',')

  if split.size == 9
    itemHash = split.inject({}) do |h, o|
      k = "item#{h.size + 1}"
      h[k] = o
      h
    end

    # Struct.new takes positional arguments, so spread the hash's values
    s = Statement.new(*itemHash.values)

    # do 'something' with the new Statement
    puts s.to_s
  end
end

Scala Version

import scala.util.matching.Regex.MatchIterator

case class Statement(item1: String, item2: String, item3: String,
                     item4: String, item5: String, item6: String,
                     item7: String, item8: String, item9: String)

object ScalaSplit {
  private val TEXT = "<1>TCODE<2>14044,20110331,0,GBP,0,14044,20110331,52,TT<3>14044,20110331,0,GBP,0,14044,20110331,401,MM<4>14044,20110331,0,GBP,0,14044,20110331,403,MM<5>14044,20110331,0,GBP,0,14044,20110331,701,SW<6>14044,20110331,0,GBP,0,14044,20110331,701,SW<7>14044,20110331,0,GBP,0,14044,20110331,701,SW<8>14044,20110331,0,GBP,0,14044,20110331,701,SW"
  private val PAT = """<[^<>]+>([^<]*)""".r

  private def stripTag(s: String): String = s.dropWhile(_ != '>').drop(1)

  def main(args: Array[String]) {

    val matches: MatchIterator = PAT findAllIn TEXT
    matches foreach { m =>
        val csv: Array[String] = stripTag(m).split(',')
        if (csv.length == 9) {
          val s: Statement = Statement(csv(0), csv(1), csv(2), csv(3), csv(4), csv(5), csv(6), csv(7), csv(8))

          // do 'something' with the new Statement
          println(s)
        }
    }
  }
}

I couldn't find any way to spread parameters to arguments, so this is a bit long-winded.

Haskell Version

import Text.Regex
import Text.Regex.Posix
import Data.Char
import Data.List
import Data.Maybe
import Control.Monad

data Something = Something { item1 :: String,
                             item2 :: String,
                             item3 :: String,
                             item4 :: String,
                             item5 :: String,
                             item6 :: String,
                             item7 :: String,
                             item8 :: String,
                             item9 :: String
                             } deriving Show

stripTag s = fromJust $ stripPrefix ">" $ dropWhile (/= '>') s

eachMatch m = do
    let csv = stripTag m
    let matches = splitRegex (mkRegex ",") csv
    when ((length matches) == 9) $ do
        let st = Something { item1 = matches!!0, item2 = matches!!1, item3 = matches!!2,
                             item4 = matches!!3, item5 = matches!!4, item6 = matches!!5,
                             item7 = matches!!6, item8 = matches!!7, item9 = matches!!8
                             }
        putStrLn $ show st

main = do
    let text = "<1>TCODE<2>14044,20110331,0,GBP,0,14044,20110331,52,TT<3>14044,20110331,0,GBP,0,14044,20110331,401,MM<4>14044,20110331,0,GBP,0,14044,20110331,403,MM<5>14044,20110331,0,GBP,0,14044,20110331,701,SW<6>14044,20110331,0,GBP,0,14044,20110331,701,SW<7>14044,20110331,0,GBP,0,14044,20110331,701,SW<8>14044,20110331,0,GBP,0,14044,20110331,701,SW"
    let pat = "<[^<>]+>([^<]*)"
    let matches = getAllTextMatches (text =~ pat :: AllTextMatches [] String)
    mapM_ eachMatch matches

It's interesting that all the versions look pretty much the same. There exist two possible reasons, as far as I can see. Either (1) since the problem is the same, the solution will pretty much be the same, or (2) if all you have is a hammer, everything looks like a nail.

Still, a fun activity that kept me occupied!

Am I cool yet?

Tags: Groovy, Haskell, Programming, Ruby, Scala