Skip to content

Latest commit

 

History

History
439 lines (351 loc) · 16.6 KB

README.adoc

File metadata and controls

439 lines (351 loc) · 16.6 KB

Letter Counter Koans

This is a simple koans for Java and Groovy. The intent is to create a method that consumes a string of arbitrary length and returns a map where the key is the lower case letter and the value is the number of instances of that letter found in the input string.

Example Input Expected output

""

[:] - Empty Map

"a"

['a':1]

"aabcccc"

[a:2, b:1, c:4]

First implementation - LetterCounterBase

The first implementation that popped into my head was a pretty trivial for-loop based algorithm that loops through the string and stores the result in a HashMap named result: it looks like this:

LetterCounterBase.java
public class LetterCounterBase {

    public Map<Character, Integer> countUsage(String inputString) {

        HashMap<Character, Integer> result = new HashMap<>();
        for (int i = 0; i < inputString.length(); i++) {
            char c = inputString.charAt(i);
            if (result.containsKey(c)) {
                result.put(c, result.get(c) + 1);
            } else {
                result.put(c, 1);
            }
        }
        return result;
    }
}

Note at this point nulls are not handled, the code is a little more verbose than I would like, but for the most part it works.

Second implementation — LetterCounterRefinement1

At this point I decided to impement a second method that refines the first one and hopefully makes the code a bit more succinct. To do so I keep the loop, but realize that the if is taking up more vertical space than it deserves for its small space in the algorithm and the call to result.put is duplicated in the code. So I extract an interface, since now I have two implementations that I’ll want to test and implement the following class:

LetterCounterRefinement1.java
package com.aceprogramming.koans.letter.counter;

import java.util.HashMap;
import java.util.Map;

final class LetterCounterRefinement1 implements LetterCounter {
    LetterCounterRefinement1() {
    }

    public Map<Character, Integer> countUsage(String inputString) {
        HashMap<Character, Integer> result = new HashMap<>();
        for (int i = 0; i < inputString.length(); i++) {
            char c = inputString.charAt(i);
            int count = result.containsKey(c) ? result.get(c) + 1 : 1;
            result.put(c, count);
        }
        return result;
    }
}

In this implementation I realized that it was not the put that changed in the if, but the count that was being put into the Map so I changed the branch to only deal with that count and factored out the result.put onto its own line, which removed the duplication of the two calls to result.put. The readability of the code also improves in that I can now clearly see that if the map already contains the letter then the count becomes 1 added to the existing value otherwise it just becomes 1. In both cases, the Map is updated.

Because I now have a nice interface over the class I also implement the interface in my original class:

LetterCounterBase.java
final class LetterCounterBase implements LetterCounter {

    public Map<Character, Integer> countUsage(String inputString) {

        HashMap<Character, Integer> result = new HashMap<>();
        for (int i = 0; i < inputString.length(); i++) {
            char c = inputString.charAt(i);
            if (result.containsKey(c)) {
                result.put(c, result.get(c) + 1);
            } else {
                result.put(c, 1);
            }
        }
        return result;
    }
}

Note at this point I also make both classes final and package-protected. Final because they will not be intentionally designed for extension, package-protected to keep them nicely hidden behind the interface.

For completeness, here is what the interface looks like:

LetterCounter.java
package com.aceprogramming.koans.letter.counter;

import java.util.Map;

public interface LetterCounter {
    Map<Character, Integer> countUsage(String inputString);
}

But what about nulls?

Just after doing this, I realized I missed a really simple test case in my spock test. I’m not checking for a null input. I considered the empty string in my input, but never nulls.

Noticing that the null case is not covered leads me to update my spock specification to include it:

LetterCounterSpec.groovy
package com.aceprogramming.koans.letter.counter

import spock.lang.Specification
import spock.lang.Unroll


class LetterCounterSpec extends Specification {
    def letterCounters = [new LetterCounterBase(), new LetterCounterRefinement1()]


    def "A LetterCounter can be created"() {
        expect:
        letterCounters.each {assert(it != null)}
    }

    @Unroll
    def "Check counted letters for #inputString"(String inputString, Map expectedResult) {
        expect:
        letterCounters.each {assert(it.countUsage(inputString) == expectedResult)}

        where:
        inputString | expectedResult
        ""          | [:]
        "a"         | [((char)'a'):1]
        "aa"        | [((char)'a'):2]
        "aab"       | [((char)'a'):2, ((char)'b'):1]
        "aabccc"    | [((char)'a'):2, ((char)'b'):1, ((char)'c'):3]
        "cabcac"    | [((char)'a'):2, ((char)'b'):1, ((char)'c'):3]
        null        | [:]
    }
}

As can be seen above I decided that null should be handled the same way that the empty String is an return back an empty map. It could be argued that throwing (or continue to let be thrown) an exception would be better, since null could be considered invalid input. I have found that doing this when the result can be an empty container often leads to more bugs and confusion than just handling the null and returning the empty container. The argument against this is that it doesn’t catch what could be a bug in the callers code, but I would rather have my code be more robust and trust that the caller does his or her testing as well.

Adding in the test for null causes my test to fail with a NullPointerException because neither of my initial implementations handle null so its time to fix both of them.

To fix the NullPointerException problem in the initial LetterCounterBase implementation I wanted to stick with a solution that was similar in concept to what I had already done. Something that was just simple and effective even if it were verbose. Here is the solution to the LetterCounterBase:

LetterCounterBase.java
    public Map<Character, Integer> countUsage(String inputString) {
        if(inputString == null) {
            inputString = "";
        }
        HashMap<Character, Integer> result = new HashMap<>();
        for (int i = 0; i < inputString.length(); i++) {
            char c = inputString.charAt(i);
            if (result.containsKey(c)) {
                result.put(c, result.get(c) + 1);
            } else {
                result.put(c, 1);
            }
        }
        return result;
    }

I just placed a simple if check at the beginning that flips the null to the empty string, since I decided to handle nulls the same way as the empty string. The solution is straight forward, but it also begins to show some of the problems with this overall approach in that the method is already getting long and verbose. The cognitive load of the method is also starting to creep up to the point where I’m starting to feel the need to comment the code. The overall solution while quick is clearly not the greatest.

For the LetterCounterRefinement1 implementation, the somewhat cleaner style of my initial approach led me to a cleaner and better way to handle the input null:

LetterCounterRefinement1.java
public Map<Character, Integer> countUsage(String inputString) {
        return inputString == null ? countLetters("") :
                countLetters(inputString);
    }

    private static Map<Character, Integer> countLetters(String inputString) {
        HashMap<Character, Integer> result = new HashMap<>();
        for (int i = 0; i < inputString.length(); i++) {
            char c = inputString.charAt(i);
            int count = result.containsKey(c) ? result.get(c) + 1 : 1;
            result.put(c, count);
        }
        return result;
    }

Here we follow the Clean Code rule:

FUNCTIONS SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO IT ONLY.

— Robert C. Martin
Clean Code

The first method just handles the null situation. It delegates everything else to a private method that does the real work of counting the letters. This simple rule makes it transparently clear what the intent of the code is with no desire to add comments.

With the nulls handled and the unit tests passing, I now want to explore how this could be implemented in other ways perhaps making better use of modern java features…​

Using the Java 8 compute method on Map

The first thing I noticed when I started surveying more modern Java techniques is that Java 8 added a compute method to Map. (Javadoc). This method allows me to simplify the LetterCoutnerRefinement1 implementation cutting two lines into one and makeing the code a tad bit cleaner:

private static Map<Character, Integer> countLetters(String inputString) {
        HashMap<Character, Integer> result = new HashMap<>();
        for (int i = 0; i < inputString.length(); i++) {
            char c = inputString.charAt(i);
            result.compute(c, (k, v) -> (v == null) ? 1 : ++v);
        }
        return result;
    }

While I don’t care for compute as the name of the method much (since it is really mapping the value) I can live with it since it is the built in standard. This certainly feels like a good step in the right direction. The result of this change is that I moved the branch check for the value being null into the compute call which now executes that function and automatically puts the result. Because compute passes a null value into the value fuction when the key is not in the map, everything works the same and we are golden.

Using Java 8 Streams

My next thought was to do this using the even more modern Java 8 Streams API. The streams API gives us a great example of internal iteration that enables a much more functional style of programming. By "internal iteration" I mean that the actual iteration is hidden from the developer and instead a function is passed in that operates on each element of a stream as those elements pass by. This can have some nice benefits among them is very concise code and the potential for parallelism. Here is my first cut at using streams:

package com.aceprogramming.koans.letter.counter;

import java.util.Map;
import java.util.stream.Collectors;

final class LetterCounterRefinement3 implements LetterCounter{
LetterCounterRefinement3() {

    }

    @Override
    public Map<Character, Integer> countUsage(String inputString) {
        return inputString.chars()
                .mapToObj(c -> (char)c)
                .collect(Collectors.groupingBy(c -> c, Collectors.summingInt(c -> 1)));

    }
}

There are a few of interesting things to note about this implementation. First I had to use Collectors.summingInt(c → 1) instead of Collectors.counting() because counting returns a Long where my interface requires an Integer. I could have refactored the interface but I chose to keep it the same and find a method that does what I wanted it to even though it increases complexity a tiny bit.

The second thing I find interesting is that even though the method length is much smaller, the solution is not as clearly readable as any of the previous ones. Usually less code means cleaner code, all other things being equal. That (at least for me) is not true in this case. Here are the things in my new stream based solution that I think are harmful to its readablity:

  • calling chars on a string does not create a stream of Character or char. It creates an IntStream which is a stream of int. This makes it necessary to do an odd cast in the mapToObj method.

  • Even though I want the mapToObj method to map the ints in the streams to the Character object I still cast to a base type of (char). There appears to be no clean way out of this

  • The Collectors.groupingBy method does most of the work, but its not readably clear how to anyone who has not done functional group bys before. It is also hard to reason about the types of the resultant map.

The last thing I find interesting about this implemantion is that my unit tests don’t pass. In particular I get a NullPointerException when I pass in a null string. So I’ll need to fix that before commiting to git.

Fixing the NullPointerException

To fix the null pointer exception, I could follow the same pattern I did in LetterCounterRefinement2 (and kind of in 1) but that really feels ugly. I would have the same duplicate code in 3 places. There is no excuse for that!! So to fix, I factor out the null check into its own class that then defers the actual letter counting to the real class. I wrap the null check around the implementation allowing me to reuse it wherever I want. Doing things this way also has the huge advantage of allowing a client to use the null safe version or the non-null safe version of an implementation easily. This is an example of the Decorator Pattern.

To accomplish this I create a NullSafeLetterCounter that does nothing except the null check and then defers to the class that it wraps. All I have done here is taken the first method out of LetterCounterRefinement2, parameterized it with the actual algorithm to call and put it in a class. Here is the result:

package com.aceprogramming.koans.letter.counter;

import java.util.Map;
import java.util.Objects;

final class NullSafeLetterCounter implements LetterCounter {
    private final LetterCounter letterCounter;

    NullSafeLetterCounter(final LetterCounter letterCounter) {
        this.letterCounter = Objects.requireNonNull(letterCounter, "The input letter counter cannot be null.");
    }

    @Override
    public Map<Character, Integer> countUsage(String inputString) {
        return inputString == null ? letterCounter.countUsage("")
                : letterCounter.countUsage(inputString);
    }
}

Then I remove that functionality from the other LetterCounter implementations and change my unit test to use the null safe wrapper:

package com.aceprogramming.koans.letter.counter

import spock.lang.Specification
import spock.lang.Unroll


class LetterCounterSpec extends Specification {
    def letterCounters = [new NullSafeLetterCounter(new LetterCounterBase()),
                          new NullSafeLetterCounter(new LetterCounterRefinement1()),
                          new NullSafeLetterCounter(new LetterCounterRefinement2()),
                          new NullSafeLetterCounter(new LetterCounterRefinement3())]


    def "A LetterCounter can be created"() {
        expect:
        letterCounters.each {assert(it != null)}
    }

    @Unroll
    def "Check counted letters for #inputString"(String inputString, Map expectedResult) {
        expect:
        letterCounters.each {assert(it.countUsage(inputString) == expectedResult)}

        where:
        inputString | expectedResult
        ""          | [:]
        "a"         | [((char)'a'):1]
        "aa"        | [((char)'a'):2]
        "aab"       | [((char)'a'):2, ((char)'b'):1]
        "aabccc"    | [((char)'a'):2, ((char)'b'):1, ((char)'c'):3]
        "cabcac"    | [((char)'a'):2, ((char)'b'):1, ((char)'c'):3]
        null        | [:]
    }
}

Note at this point I could factor out the test for just the null check and not have the repeaded call to new the NullSafeLetterCounter. In a produciton system I would also provide a factory of some type to be able to create the various letter counters. That may be a good starting point for a different exercise.

Conclusion

I started with a really simple problem to count the occurrences of letters in an arbitrary String. I then created unit tests for the known requirements and implemented a quick solution to those unit tests as a baseline. I then used that to iterate and refine the solution in order to explore some of the newer features of Java. In the end I ended up with 4 pretty reasonable solutions to the problem and a bunch of new knowledge.

As a note, my favorite implementation is LetterCounterRefinement2 as I stated above I find the Java Streams API version to be less readable. But that may be just a matter of taste and my age. :)