I took a look at the new weekly UK National Lottery Game that pays the winner a tax-free GBP 10,000 every month for 30 years.  Very nice.  The runner up gets that same payout, but only for a year.  Still nothing to sneeze at.

It’s a select 5 from 47 balls game, with prizes for 2,3,4, and 5 correct balls.  For the bigger prizes in each tier you also need to pick the right one out of ten “life” balls.  That means that you’ve got about 1 / 915 chance of getting a prize on any given ticket.

The prize breakdown is something like as follows, I’ve assumed a 2.5% annual depreciation on the GBP 10,000 per month.  That may be high or low, it just gives what I think is a reasonable a working figure for now.

Criteria Winning
1 in
Payout (GBP) Expected
Value (GBP)
5 numbers + life ball 15,339,390 2,554,155 0.1665
5 numbers 1,533,939 120,000 0.0782
4 numbers + life ball 1,783,650 250 0.0001
4 numbers 178,365 50 0.0003
3 numbers + life ball 162,150 30 0.0002
3 numbers 16,215 20 0.0012
2 numbers + life ball 10,810 10 0.0009
2 numbers 1,081 5 0.0046

That means that the Expected Value of a ticket, which, costs GBP 1.50, is around GBP 0.25. So that’s an expected loss of around 83% per play. Not great odds but there’s always that 1 / 1,394,490 chance of one of those top two prizes which would be, as I said, very nice! If you’re going to gamble, gamble aware.

Advertisements

Fear the Flat Earth

Jan 13, 2019

This is an unusual post for me. I don’t normally comment on societal issues (as I see them) but this is something that been yanking my chain for a few months now so I’m just getting it off my chest.

Recently I’ve been seeing a rise in the number of channels on YouTube either supporting a Flat Earth geocentric model of the cosmos or the debunking of those models (such as they are) by proponents of a Globe Earth heliocentric model. Both sides seem to have a large number of followers/subscribers which ramps up their advertising revenue, sell merchandise to support their channels, have Patreon accounts, etc. Whilst there have always been a tiny minority of people that have held to Flat Earth and other conspiratorial views I’m putting this resurgence largely down to the wave of YouTube entrepreneurialism that’s currently in full swing.

On the Flat Earth side, it’s difficult to know who actually firmly believes it, those who are just riding along for fun (and perhaps profit), and those who are just enjoying the aggressive anti-social trolling aspect of it – there appear to be a lot of ad hominem arguments (on both sides of the debate) being propounded but this could be just the “freakshow” aspect of revenue generation on social media i.e. both sides could be in collusion and just putting a show on for the audience. (I’ve not linked to any of the blogs and channels on this page as that’s just feeding the frenzy.)

The deep complexity of the modern world and science operating at both the macro- and micro scales beyond easy comprehension means that it’s no longer just a matter of “believing your own eyes” or being able to do meaningful experiments to provide insight on your own. Some people, largely due to the fault of the education systems and rise in the trend, driven by politicians and the media, to distrust experts have extended this to mean that scientists and, by extension, even established science, and indeed mathematics itself, is to be wholly distrusted. The open platform provided by the Internet and Social Media means that, whilst in the past these views wouldn’t pass robust peer review so could only be “vanity” published and so have little credibility, they can be now be broadcast without any review at all and join the background of “fake news” on those platforms.

While this remains in the background as just a bit of fun that’s fine, but this can be a problem if children are educated to not trust, or even understand, the “scientific method”. There is now even a push by some to try and get “Flat Earth”, and “alternative sciences” taught in schools on the basis that protection of “free speech” and “belief” is enshrined in law. While this appears to be similar to the typical “creationist” vs. “evolutionist” religious arguments it’s not really.

Things that can be subjected to the scientific method of observation, hypothesis, prediction, empirical investigation and ongoing refinement are simply not a matter of belief. People may choose to ignore the information and insight derived from that rigorous process but that’s more a case of wilful blindness of objective reality rather than a religious belief. I do agree that some outcome hypotheses arising from scientific investigation do need to be considered within a social and political context but the underlying investigation will have unearthed some wholly objective facts that cannot be doubted – the cause of those factual observations may not be completely understood, but that doesn’t invalidate the science itself.

I’m no scientist but, through my programming career and education, I’ve had a deep interest in science as a whole and I do find all this a very worrying development. Not so much in that people don’t understand things, that’s just a matter of time and education, but the increasingly black-and-white blinkered viewpoints that are coming to the fore these days.

I don’t believe that humans are going to slip back into the post-classical, pre-Renaissance “Dark Ages” because of this but, if this anti-science following gains sufficient traction around the world within populist political circles, it’s going to be much harder for public science to get the funding it needs. This sort of thing will become the strict remit of corporates who will them control the science and the technologies that arise from it. (This is just a belief so I don’t need to prove it to anyone.) Best case, it just becomes more likely that a lot of time and resources will be wasted in trying to just manage down this sort of anti-science movement rather than spending that effort in making progress.

Although it’s all available in books, on the Internet and in educational establishments, I might do some posts explaining some simple concepts on here – like why it’s OK that it’s winter in the northern hemisphere (on the assumption that the Earth is indeed a globe), when the Earth is closest to Sol.

 

 

Being an old guy in computing terms, I still remember the occasion when one of my colleagues told me that they’d bought a new PC with a gigabyte hard drive.  That’s 1 GB.  1,000 MB.  It was, I believe, back in 1994.  They’d got themselves a new home machine with a 60 MHz Pentium (built on an 800 nm process) and a 1 GB hard disk.  How they gloated.  How envious was I, stuck with my 500 MB disk and a 486DX2-66 processor?

These days, we have 512 GB SD cards, which are merely the size of a postage stamp, available for only GBP 314.  A terabyte version has recently been announced by SanDisk.

The size of today’s storage devices has led to people just not appreciating just how vast those stores are!

Here comes the science part: An SI terabyte, which is a usual measure of persistent storage, is 1,000,000,000,000 (1012) bytes.  Each byte is 8 bits, and can encode a single character in ASCII or EBCDIC schemes – let’s ignore internationalisation and multi-byte Unicode for now as that just complicates things.

Note that this is distinct to a CS tebibyte which is 1,099,511,627,776 (240) bytes and is used as a more usual measurement for memory.

So, to put these numbers into some perspective, just how big IS a terabyte?

It’s quite common to talk about it in terms of hours of music or video but that is really kind of misleading… sure you can imagine a stack of 213 Blu-ray movie disks, two and half years of continual MP3 music or nearly 2 weeks of worldwide tweets but what does that actually mean in more visceral terms?

Let’s work it out in terms of space and time using, whilst they still exist as physical artefacts, traditional paperback books.  For this example, let’s use Tolstoy’s War and Peace to generate the numbers. We’ll do time first.

War and Peace has a word count of 587,285.  It’ll depend on the particular translation, I guess, but let’s use that as a working number for now.

The average typing speed for a typical person is around 40 words per minute (but obviously more for a professional typist).  That means that it would take someone, maybe you, around 245 hours to type a copy of War and Peace.  We can convert time into money too – the minimum wage is  around GBP 7 per hour, so the opportunity cost of the time taken for that typing would be around GBP 1,715.  That represents about 6 weeks of work effort.

The English language has around 4.7 letters per word on average but don’t forget the spaces and punctuation too – so make it 5.7 characters per word; 5.7 x 587,295 = 3,400,380 bytes.  A Project Gutenberg copy of War and Peace is 3,359,550 bytes so that’s quite a good correlation.  Let’s just use 3.4 MB to make the numbers easier.

A terabyte is 1,000,000 MB so it could hold 294,118 copies of War and Peace.  As calculated above for a single copy and multiplied up, that represents around  72,058,910 hours of human effort and so a labour value of about GBP 504,412,370.  (You can see why photocopiers took off…)

Those 294,118 copies would hold about 368,529,854 pages (see below), each with maybe 476 words on it on average.  Every adult in the UK could have 7 pages dedicated to them.

A terabyte is worth over 500 million pounds sterling of minimum wage workers’ typing effort and, at 8 hours a day for 260 days a year, would take 34,644 years to complete.

Best get some help to finish that before lunch.

The adult population of the UK is about 52 million so, if they all mucked in, they should get the job done in just under an hour and a half by typing some 3,325 words each, which just so happens to be around 7 pages of content. How convenient.

It’s also worth noting that, it would take 34.6 years of work for that average typist to fill a single GB, which holds 294 copies of the book.  This is why, when everything was held as plain ASCII text files, even 500 MB felt cavernous to most people!

People read, on average, at about 250 words per minute so it’d take around 5.6 years to just read a GB of text, and 5,623 years to read a TB. (Assuming a 9-5 job, 5 days a week.)

So that’s time, and by extrapolation, money.  Now for space.

Copies of War and Peace differ in physical size and page count depending on extraneous supporting content and typeface used.  Here are three off of Amazon, we’ll take an average of them as our working numbers.

ISBN-13 Pages Dimensions Volume
978-1853260629 1,024 12.2 x 5.6 x 19.8 cm 1,352.74 cm3
978-0140447934 1,440 13.0 x 6.1 x 19.8 cm 1,570.14 cm3
978-0099512246 1296 13.1 x 5.5 x 21.6 cm 1,556.28 cm3
 Averaged 1,253 12.8 x 5.7 x 20.4 cm 1,493.19 cm3

We’ll just assume that the extraneous text is not material to the argument and stick with the 3.4 MB data per edition.

So the printed information density of a War and Peace paperback book is around 2,277 bytes per cubic centimetre, or 2.28 KB/cm3.  This equates to about 2.28 GB/m3.  (You can see why electronic storage is so popular…).

There are 1,000 GB in 1 TB, so a TB in such a printed form would take up around 439 cubic metres and neatly contain those 294,118 copies of War and Peace.  That’s a cube 7.6 metres (or about 25 feet) on each side.

It’s still a little abstract so let’s make it more concrete.

A community swimming pool is generally about 25m long and 13m wide. It obviously various depending on the number of lanes but this is roughly what you’d expect.  Depth varies too but, for this, imagine a 2m depth throughout – no shallow end or diving area!

That means that a terabyte of books would fill such a community pool to a depth of 1.35 metres or nearly 4.5 feet.  Each pool could hold the equivalent of around 1.5 TB in printed form.

A 6 TB disk, now commonly available for under GBP 200, holds enough data that, if printed in a paperback novel form, would fill 4 standard swimming pools, or a 50m Olympic pool to a depth of 2.1 metres.

Hopefully that gives you a feel for just how much data a terabyte, or even a mere gigabyte, actually is, when it represents textual information rather than high-definition video or music content.

As an aside (and this is highly speculative and could be utterly wrong in so many, many ways), just how small a volume can a TB be stored in? Well, it’s looking like a holographic universe (if that’s an actual thing) stores one bit in an area one Planck length per side, or 2.56E-70 m2.  If we want to store 8E12 bits we’d need 20.48E-58 mor a sphere at least 2.55E-29 metres in diameter surrounding the information.  That’s the size of the black hole formed by a mass of 8.9 g packed into a singularity, or something like that anyway. Of course, we’d likely also need sufficient information to be held to fully describe the substrate containing the data which would push out the size very substantially.  As the man said, “there’s plenty of room at the bottom“.

Read the details of the problem here

Summary

Prime pair sets.

Solution

This problem took a little bit of thinking about, primarily to get it to perform adequately – on the basis that I’m trying to get this to run in under 5 seconds. When I first solved this problem I simply couldn’t get the computational performance out of Groovy to do the prime validation quickly enough and ended up calling out to a native Java method. With the introduction of the @CompileStatic AST transformation this has made it possible to do. After my next laptop upgrade, I’m going to be having to aim for maybe 2 seconds at the most on these questions!

The solution uses a fairly naive implementation of an isPrime() method which is still substantially faster than using BigInteger#isProbablePrime() from Groovy when using @CompileStatic – at least for the range of numbers being used here. I made an assumption that 4-digit primes would be sufficient to solve this question and, luckily, it worked out to be correct.

The program starts by constructing a list of primes mapped to a list of other primes that conform to the question’s constraints of both orders of concatenation also being prime. Using lists of strings rather than integers saves a lot of overhead in the subsequent checks. The lists only contain primes that are greater than the key. So, for example, 3’s list contains 7 (as both 37 and 73 are prime), but that of 7 doesn’t contain 3. Any numbers that have fewer than 4 entries in their list are dropped as a set of 5 primes are being sought. The lowest number in that set will have the full membership in it’s list.

Having generated this list of prime-pairs, a set of primes are grown to find a five members set that fulfil the criteria. At each recursion the entries of the list of the most recently added member are checked to make sure that they meet the concatenation criteria of each of the other member of the list (by making sure that it’s on their respective lists of potential pairs) and that it wouldn’t exceed the current minimum value found. For each of the values that fulfil this criteria, a recursion is called with this appended to the current list. The minimum value found for that particular path is returned. In this respect it’s like creating a directed graph with the nodes (of which there are only 1,227) being the primes and the edges (of which there are 17,845) being the fact that two nodes fulfil the concatenation criteria.

There are probably some algorithmic optimisations that could be made here but, as it is, the code is quite simple to understand.

import groovy.transform.CompileStatic

@CompileStatic
boolean isPrime(final int num) {

  if (num <= 1) return false
  if (num <= 3) return true
  if ((num % 2 == 0) || (num % 3 == 0)) return false

  int r = (int)Math.sqrt(num)
  int f = 5

  while (f <= r) {
    if (num % f == 0) return false
    if (num % (f + 2) == 0) return false
    f += 6
  }

  true
}

def getPrimePairs(maxnum) {

  def primes = ((2..maxnum).findAll { isPrime(it) })*.toString()
  def primeCount = primes.size()
  def pairs = [:]

  for (ip1 in 0..<primeCount - 1) {
    def sp1 = primes[ip1]
    pairs[sp1] = [] as Set
    for (ip2 in (ip1 + 1)..<primeCount) {
      def sp2 = primes[ip2]
      if (isPrime((sp1 + sp2).toInteger()) &&
          isPrime((sp2 + sp1).toInteger())) {
        pairs[sp1] << sp2
      }
    }
    if (pairs[sp1].size() < 4) pairs.remove(sp1)
  }
  pairs
}

def find5(list, min, pairs) {

  def sum = list*.toInteger().sum()
  if (list.size() == 5) return sum

  pairs[list[-1]].findAll { p -> ((sum + p.toInteger()) < min) &&
                                  (list.every { pairs[it].contains(p) }) }
                 .each { min = [ min, find5(list + it, min, pairs) ].min() }
  min
}

def (answer, primePairs) = [ Integer.MAX_VALUE, getPrimePairs(10000) ]

for (p in primePairs.keySet()) {
  answer = [ answer, find5([p], answer, primePairs) ].min()
}

This executes in Groovy 2.4.7 under Java 8u112 in around 2.9 seconds, so within my 5 second cut-off. The build of the list of primes takes about 115 ms, with the construction of the prime pair map taking another 1.5 seconds. The isPrime() method is called around 850,000 times and takes around 960 ms of the total runtime.

Conclusion

Quite surprisingly, Groovy was perfectly good for this problem as long as @CompileStatic is used to optimise the isPrime() method. Left as dynamic, the runtime pushes out to just under 30 seconds. The Groovy listops made it easy to construct the prime-pair lists and for quite an expressive statement for the recursive descent conditions.

Read the details of the problem here

Summary

Poker hands.

Solution

This was a straightforward programming problem rather than anything taxing from a mathematical perspective. I used Groovy listops to process the input file into a sorted list of [ count, face_value ] pairs that could then just be inspected in order of priority (along with some flags to indicate whether there was a monotonic sequence and a only a single suit in the hand) to rank the hand.

There are only 5 combinations of counts available (as given below). It would have been possible to simply check against these to provide a basic classification of the hand but with the need to have the flush and straight processing it just seemed easier and clearer to have the ranking done explicitly.

Count Permutations: [ 1, 1, 1, 1, 1 ], [ 2, 1, 1, 1 ], [2, 2, 1 ], [ 3, 2 ], [4, 1]

Ties were broken by using a base-13 (tridecimal) system with the rank being the most-significant digit and then using the first two group face values as the differentiators so a simple score was created for each hand. As it happened there were no clashes that weren’t broken by this method – it would have to be extended to subsequent cards and then suits (which were ignored in this question) otherwise, so you’d end up with a base-52 (duopentagecimal) system where each card had a unique value but were grouped unless there were clashes.

It didn’t take a lot of coding to get this done. It could be done more tersely, perhaps, but it would be more cryptic!

class Hand {

  private final _cards
  private final _isStraight
  private final _isFlush

  Hand(cardList) {
    def vals = cardList.collect { ['face': '23456789TJQKA'.indexOf(it[0]), 'suit': it[1]] }

    _isFlush = (vals.groupBy { it.suit }.size() == 1 )
    _cards   = vals.groupBy { it.face }
                   .collect { [ (it.value.size), it.key ] }
                   .sort { -(it[0] * 13 + it[1]) }
    _isStraight = (_cards.size() == 5) && (_cards[0][1] - _cards[-1][1] == 4)
  }

  def score() {
    rank() * 169 + _cards[0][1] * 13 + _cards[1][1]
  }

  def rank() {
    if (_isStraight && _isFlush && _cards[0][1] == 12) return 9
    if (_isStraight && _isFlush)                       return 8
    if (_cards[0][0] == 4 )                            return 7
    if (_cards[0][0] == 3 && _cards[1][0] == 2 )       return 6
    if (_isFlush)                                      return 5
    if (_isStraight)                                   return 4
    if (_cards[0][0] == 3 )                            return 3
    if (_cards[0][0] == 2 && _cards[1][0] == 2 )       return 2
    if (_cards[0][0] == 2)                             return 1
    0
  }

  def beats(opponent) {
    def (s1, s2) = [ this.score(), opponent.score() ]
    if (s1 > s2) return true
    assert (s1 != s2)
  }
}

def answer = new File('pe054_poker.txt').readLines().count {
  def cards = it.split(" ")
  new Hand(cards[0..4]).beats(new Hand(cards[5..-1]))
}

This executes in Groovy 2.4.7 under Java 8u112 in around 0.25 seconds, so well within my 5 second cut-off.

Conclusion

Groovy was perfectly good for this problem with the simple file handling and listops allowing the input to be prepared for processing very easily.

Read the details of the problem here

Summary

Investigating multiple reflections of a laser beam.

Solution

A nice little problem and really helped by Euler providing the slope tangent for points on the ellipse. It’s a matter of finding the intersection point between the vector denoting the beam’s path and the ellipse equation and then reflecting it around the normal (at right angles to the tangent) of the mirror’s slope at that point (using the dot and cross products). This just repeated until the beam intersection is within a suitable tolerance of the entry/exit gap at the top.

def ( answer, epsilon ) = [ 0, 0.01d ]

def p0 = [ x:0.0d, y:10.1d ]
def p1 = [ x:1.4d, y:-9.6d ]

while ((answer < 1) ||
       !((Math.abs(p1.x) <= epsilon && (p1.y > 0.0)))) {

  answer++

  def a = [ x:p0.x - p1.x, y:p0.y - p1.y ]
  def b = [ x:-4.0 * p1.x, y:-p1.y]

  def ( dotp, crossp ) = [  a.x * b.x + a.y * b.y,
                            a.x * b.y - b.x * a.y ]

  def c = [ x: dotp * b.x - crossp * b.y,
            y: crossp * b.x + dotp * b.y ]
  def s = (-2.0 * (4.0 * c.x * p1.x + c.y * p1.y)) /
          (4.0 * (c.x * c.x) + (c.y * c.y))
  def p2 = [ x:s * c.x + p1.x, y:s * c.y + p1.y ]

  ( p0, p1 ) = [ p1, p2 ]
}



This executes in Groovy 1.8.8 under Java 7u17 in around 0.12 seconds, so well within my 5 second cut-off.

Conclusion

Groovy made for quite a terse solution to this problem. The ability to use tuples with named members as lightweight point and vector classes was quite convenient and made for more readable code. The multiple assignment capability kept the LoC count down too. It’s important to include the ’d’ type signifiers on the starting values to force the use of doubles throughout the calculations. This is one of those Gr-r-r-oovy annoyances that I’d prefer not having to worry about though.

Bit of a throw away post but it might save someone 5 minutes…  Nothing to do with Project Euler

I needed to have a quick way of rolling an arbitrary number of polyhedral dice including the ability to take a number of the highest from all those rolled.

I did this in Groovy by adding an overloaded roll() instance method to Integer along with a helper isPolyhedralDie() method (but only to catch me making some sort of basic typo in the calling code).

Integer.metaClass.isPolyhedralDie() {
  (delegate as Integer) in [ 3, 4, 6, 8, 10, 12, 20, 100 ]
}

Integer.metaClass.roll() {
   delegate.roll(1)
}

Integer.metaClass.roll() { dice, best = dice ->;
   assert delegate.isPolyhedralDie() && (best > 0) && (dice >= best)
   def ( sides, rnd ) = [ delegate, new Random() ]
   (1..dice).collect { rnd.nextInt(sides) + 1 }
            .sort().reverse()[0..<best].sum()
}

// Generate 4x 3/4d6 character attribute sets

4.times {
  def r = (1..6).collect { 6.roll(4,3) }
  r.countBy { it }.sort { a, b -> b.key <=> a.key  }
   .each { k, v -> print "${v}x $k, " }
  println "Sum = ${r.sum()}"
}

Usage:

Integer#roll()                   // 6.roll() = roll 1d6
Integer#roll(noOfDice)           // 6.roll(3) = roll 3d6
Integer#roll(noOfDice, noOfBest) // 6.roll(4,3) = roll 4d6 take 3