Groovin’ in Java

March 7, 2012

I’ve just been watching an interesting presentation on using Groovy as an additional toolkit in Java.  Groovy excels in handling XML, File I/O and JDBC, as well as offering good support for Collections and Closures (as regular readers of this blog already know!).  The type of things you typically need to do day-in-day-out for your regular job.

If you want to learn more about what you can do with Groovy in the real world take a look at the presentation on InfoQ – Improve Your Java with Groovy

The presenter, Ken Kousen, also has a good blog with some very useful information on using Groovy for doing regular & proper programming tasks – the sort of thing it’s really intended for – not what I do here!

He’s got a book in the works, Making Java Groovy, which should be coming out in late 2012.

Project Euler Problem #65

January 29, 2012

Read the details of the problem here

Summary

Find the sum of digits in the numerator of the 100th convergent of the continued fraction for e.

Solution

This question immediately reminded me of Problem #57 which ended up being quite a simple solution though it sounded involved initially. I took a look at that, and adopted a similar approach to this one.

The trick with this problem was to spot the pattern as to how the numerator changed down the sequence. As it turned out, the values of the denominator are superflous to this and can just be ignored for this purpose.

Taking the first ten terms as given (with a zeroth term for completeness) given this is how it works:

Term Numerator Multiplier Formula
0 1    
1 2 2 2 * num0
2 3 1 1 * num1 + num0
3 8 2 2 * num2 + num1
4 11 1 1 * num3 + num2
5 19 1 1 * num4 + num3
6 87 4 4 * num5 + num4
7 106 1 1 * num6 + num5
8 193 1 1 * num7 + num6
9 1264 6 6 * num8 + num7
10 1457 1 1 * num9 + num8

So this can be summarised as:

numn = muln * numn-1 + numn-2

Where n is the term being calculated and muln is the nth element of the continued fraction.  This is simply n / 3 * 2 when n mod 3 == 0, and 1 otherwise.

Given that the multipliers work in triplets, this can be used to advantage and reduce the number of calculations required. I created a function that given the n-1 and n-2 numerators, and a term number could calculate the next three numerators in the sequence. It meant that I didn’t have to do any modulus calculations as long as I kept the boundary aligned correctly i.e. only call it for terms, 2, 5, 8, etc. which were the start of the triplets.

Starting at term 2, I then called this recursively until it was being asked to calculate the 101st term. At this point the 100th term would be the n-1 numerator value and could be returned.

def f(p1, p2, tc) {
  if (tc == 101) return p1
  def m = p1 + p2
  def n = ((tc + 1).intdiv(3) * 2) * m + p1
  f(n + m, n, tc + 3)
}

def answer =
    (f(2G, 1G, 2).toString() as List)*.toInteger().sum()

This runs in around 115 ms, with sum of the digits taking about 10 ms. A port of this code to Java runs in around 1.3 ms in total.

Conclusion

Groovy was a nice language to do this in. Not having to mess about with BigInteger objects explicitly is always a boon in the Java world. Performance was quite disappointing though – given that f() is only ever called with BigInteger arguments the compiler should be able to work out exactly what types the variables are and the generated bytecode should be pretty similar to that which javac outputs…but it’s not. In this case Groovy is nearly two orders of magnitude slower and I really don’t see a good reason for it at this stage in the language maturity.

Project Euler Problem #243

January 21, 2012

Read the details of the problem here

Summary

Find the smallest denominator d, having a resilience R(d) < 15499 ⁄ 94744

Solution

Due to real-life intrusion I haven’t been doing much Project Euler for the last few months but a conversation with a colleague at work the other day brought the subject up and I thought I’d revisit the site.

They’ve changed their format quite a lot now and though it’s still fundamentally the same, the concept of “badges” has been introduced – “awards” above and beyond just the old numbers of puzzles solved (which is now more granular). I’m a sucker for things like that and one of them – “Trinary Triumph” – just needed one more question to complete so it was, as the saying goes, like a red rag to a bull!

The problem was actually worded quite straightforwardly, and a little thought showed that this was very similar to Problem #69 and Problem #70. The numerator in the resilience function is going to be the count of terms that are relatively prime to the denominator i.e. Euler’s Totient (φ) function. The solution to Problem #69 was found by maximising n/φ(n) whereas, in this problem we’re looking to minimize φ(d)/(d-1) – so it’s essentially the same thing. The answer is likely to be found by looking for a number consisting of the product of many small prime factors perhaps with some multiplier (also a product of small primes) i.e. some of those factors may be of a higher power.

Problem #108 and Problem #110 also tackled the domain of products of powers of primes and the solution to the latter seemed a good place to start as the one to the former ended up as being a bit of a lucky hack. As per that solution, taking the approach of assuming that the answer was to be found somewhere at n * product(prime factors) and driving off that meant that factorisation of large numbers wasn’t needed so Groovy would remain a viable language to do this in.

Again, when doing these problems with Groovy, it’s best to try to work out the potential scope ahead of runtime as it doesn’t perform that well at number crunching.

What I did need to how to calculate the totient value for numbers with large numbers of factors – I’d kind of worked it out from a bit of guess work for two factors when doing Problem #70 but this wouldn’t work for this. Luckily Wikipedia came to the rescue giving the formula quite succinctly as:

φ(n) = n(1-1/p1)(1-1/p2)…(1-1/pr).

Using this I found that the product of a prime sequence 2..23 gave a totient above what was being sought, whilst 2..29 was a potential solution itself. The answer probably fell somewhere between these two and this therefore formed the search space.

I then just reused the helper function I’d used in Problem #110 with a bit of an educated guess as to what the maximum power might be. If the multipler was 25 i.e. 32, this would exceed the solution of the longer prime sequence – so a maximum power of 4 seemed reasonable to use. I also seeded the generator function with the assumption that all of the primes between 2..23 were going to be factors – this seemed a reasonable assumption given what I knew of the totient function characteristics.

def gen_powers(range, maxlen, yield) {
  gen_powers(range, maxlen, [], yield)
}

def gen_powers(range, maxlen, pows, yield) {
  if (pows.size() == maxlen)
    yield(pows)
  else if (pows.size() == 0)
    range.each { gen_powers(range, maxlen, [it], yield) }
  else
    range.findAll { it >= pows[0] }
         .each { gen_powers(range, maxlen, [it]+pows, yield) }
}

def ( answer, TARGET, POWER_RANGE ) = [ Long.MAX_VALUE, 15499/94744, (0..4) ]
def primes = [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 ]

gen_powers(POWER_RANGE, primes.size(), [1] * primes.size()- 1) { powers ->
  def (d, factors) = [ 1G, [] ]
  powers.eachWithIndex { v, i ->
    d *= primes[i] ** v
    if (v > 0) factors << primes[i]
  }
  if ((d > 1) &&
      (answer > d) &&
      (((int)(factors.inject(d) { p, v -> p * (1 - 1/v) })) / (d-1) < TARGET)) {
    answer = d
  }
}

Despite the heavy use of the slow Groovy listops (and a bit of hacky code) this runs in around 940 ms.

Next will be Problem #65, a question on continued fractions, that will give me my “As Easy As Pi” award!

Conclusion

With the upfront analysis on the problem and the re-use of the power generator, Groovy made it easy to put together a terse solution that ran in a reasonable time. When doing the calculation of the value of the denominator I did find myself missing Python’s map() capability to process two lists against a closure and return a result. Sounds like the sort of thing that should be in there, maybe it’s buried in the Groovy docs somewhere…

I was interested to see that JetBrains, creators of the splendid IntelliJ IDEA IDE amongst other things, are working on their own “better-Java-than-Java” language, Kotlin.

Ostensibly this is to resolve some of the “issues” that the Java language has, whilst being simpler than the currently strongest competitor, Scala.  Kotlin is going to be supported as a first-class citizen in the IDEA IDE from around the end of 2011.

There’s a lot of talk on various fora about the need (or lack of) for introducing a new language when others like Scala, Groovy and Clojure are becoming more established.  This is similar to the fuss back in April 2011 when Gavin King “leaked” the news that Red Hat were working on their “Ceylon” JVM-hosted language.  The public launch of Ceylon is likely to be in the same sort of timescale.

Most of the chatter seems to be about the benefits around the features of the languages, semantic overhead of learning, efficiency of execution, dilution of Java skillsets, etc.  All are interesting and sound discussions.

What I’m not seeing is anyone saying about the pure commercial value of a company such as JetBrains or RedHat launching their languages focusing on their their own ecosystems.  I don’t believe these guys aren’t doing it for the love of the Java community. They’re doing it for cold, hard cash… any why not?

It’s interesting that JetBrains are saying that Kotlin is intended for “industrial use”.  By this I’m expecting the real value-add features (which must be compelling) to be supported in their paid-for “Ultimate” edition of IDEA rather than the free “Community” version and they’re targeting the “enterprise” market.

They would need to convince “corporate” development managers of the benefits of the Kotlin, disabuse them of the risks of selecting a single vendor language  and get it embedded into their companies order to monetise the investment in developing it.

That’s not to say they’re wrong to try – look at how successful VB6 was and that’s a single company language supported by a strong IDE!

They’ll have to be quick though.

Java 7 has just been released and Java 8 is due in late 2012.  Whilst J7 addresses quite a few “niggles” with Java (including reducing some noise around Generics and the equivalent of C#‘s “using” statement)  and add some good features into the JVM it’s not the complete “Project Coin” package.

The “big thing” being sold in all these new languages are Lambdas/Closures which won’t come until J8.  (You can see the split of features here.) This means that they’ll have 12 months to get some traction before the “official” Java language supports this.

Personally, I don’t believe that having closures in an OO imperative language is that important.  Useful, of course, but not the end of world if they’re lacking.  There didn’t seem to be much demand for them in Java before Ruby on Rails became popular in 2006 and there was an outbreak of scripting-envy across the Java community.

Of course, with the JVM bytecode instruction InvokeDynamic being added in Java 7 it means that dynamic language such as JRuby should become much more performant so, if you really want closures on the JVM why not just use that?

I think it’s disappointing that the official Java/JVM ecosystem has been on a bit of a hiatus for the last 5 years (Java 6 was released in 2006) but it’s kind of understandable given the situation with Sun and Oracle.  It’s also disappointing that Coin was split across two releases as this will delay adoption too and, in the Java world, allow third-parties to fill the perceived gap with semi-official solutions in the interim.

It will be interesting to see how this all plays out.  I do hope it doesn’t distract JetBrain or RedHat/JBoss from continuting to provide world-class platform support for the official Java stack on their products.

Read the details of the problem here

Summary

Find the smallest positive integer, x, such that 2x, 3x, 4x, 5x, and 6x, contain the same digits in some order.

Solution

I used Groovy listops to drive this in, in essence, a single line within a loop. It was quick to write and “obviously correct” but very slow – taking 2.82 seconds. Using as Set is faster than using groupBy() to eliminate duplicates.

def ( i, answer ) = [ 1, 0 ]

while (answer == 0) {

  answer =
      ( ((1..6).collect { (i * it).toString().toList().sort() } as Set)
               .size() == 1 ) ? i : 0

  i++
}

Removing the reliance on the list operators, as per the solution below, brings the runtime down to 0.78 seconds. The code isn’t really any more complex.

def ( i, answer ) = [ 1, 0 ]

while (answer == 0) {

  def s = (i.toString() as List).sort()

  for (j in 2..6) {
    if (s != (i * j).toString().toList().sort()) break
    if (j == 6) answer = i
  }

  i++
}

So that’s the two-for-one deal on this solution.

Conclusion

Groovy – You can take declarative logic, or some moderate performance. Take your pick.

Note: Putting this performance into context though, a port of the solutions to Ruby 1.8.7 has them running in 20.2 seconds and 4.55 seconds respectively. Running these scripts in JRuby 1.5.0 gives run times of 5.57 seconds and 1.60 seconds respectively. So both Groovy and the JVM scores very well in this regard.

Read the details of the problem here

Summary

If p is the perimeter of a right angle triangle, {a, b, c}, which value, for p ≤ 1000, has the most solutions?

Solution

Straightforward, brute-force solution. Just run through a combination of lengths for two sides and calculate the hypotenuse. For hypotenuses that are of integer length, check that the perimeter is within the limit set and, if so, increment a count for that length keeping track of the maximum element as it goes along.

def LIMIT = 1000
def ( c, max, answer ) = [ [0]*LIMIT, 0, 0 ]

for (i in 1..<LIMIT.intdiv(2)) {
  for (j in 1..<LIMIT-i) {
    def hyp = ((i * i) + (j * j)) ** 0.5
    if ((int)hyp == hyp) {
      def p = i + j + hyp
      if (p < LIMIT) {
        c[p]++
        if (c[p] > max) ( max, answer ) = [ c[p], p ]
      }
    }
  }
}

This runs in 0.59 seconds.

Conclusion

Groovy made for a terse syntax but nothing that couldn’t have easily been done in Java. Interestingly, replacing the pythonic syntax of the square root (n ** 0.5) with the traditional Java library call to Math.sqrt(n) the run-time drops to 0.27 seconds. In this case p needs to be forced to an int when being used as an index into c[]. This change in performance is quite astonishing.