- Last week we presented a logic puzzle commonly attributed to Einstein, which we used as a simple analogy to explain business intelligence and the components (cubes, dimensions and elements) typical of a real business with thousands or millions of transactions.
- This week, we provide a solution to the puzzle and discuss the Data Mining perspective it illustrates.
Click on this link to look back at the original puzzle and the 15 hints provided.
The question is: Who owns the fish?
Here is the solution:
Since we have only 25 data points to derive, let's start with a five-by-five table (instead of a six-dimensional 5 x 5 x 5 x 5 x 5 x 5 cube with 15,625 cells), where the columns represent the five houses and the rows represent nationality, colour, drink, smoke and pet, per the table below.

- In the next table below, let's fill in the easy clues first (hints 8, 9 and 14).
- Then, the green house/white house combination (hint 4) cannot be houses 1 and 2 or houses 2 and 3 because the second house is blue. It also cannot be houses 3 and 4 because the green house owner drinks coffee (hint 5) and the third house owner drinks milk. So the green house must be fourth and the owner drinks coffee and the white house is fifth.
- The red house with the Brit (hint 1) cannot be the second, fourth or fifth house (which are blue, green, and white), nor can it be the first house (which contains the Norwegian), so it must be House 3.
- Therefore, the yellow house with the Dunhill smoker (hint 7) must be the first house and horses are kept in the second house (hint 11).
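The colour deductions in the steps above can be double-checked mechanically. The following is a minimal sketch, not part of the original solution: it tries every ordering of the five colours and keeps only those consistent with the hints used so far. The function name `consistent_colourings` is made up for illustration, and hint 4 is read, as in the text above, as "the green house is immediately to the left of the white house".

```python
from itertools import permutations

COLOURS = ("red", "green", "white", "yellow", "blue")


def consistent_colourings():
    """Return every colour ordering that survives the hints used so far."""
    results = []
    for order in permutations(COLOURS):  # order[i] is the colour of house i + 1
        # Hints 9 and 14: the Norwegian is in house 1, next to the blue house.
        if order[1] != "blue":
            continue
        # Hint 4 (as read above): green sits immediately left of white.
        green = order.index("green")
        if green == 4 or order[green + 1] != "white":
            continue
        # Hints 5 and 8: green drinks coffee, but house 3 drinks milk.
        if green == 2:
            continue
        # Hints 1 and 9: the Brit owns the red house, so it cannot be house 1.
        if order[0] == "red":
            continue
        results.append(order)
    return results


print(consistent_colourings())
```

Only one ordering survives: yellow, blue, red, green, white, which matches the reasoning above.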

We can now complete the puzzle in the last table below as follows:
- The Norwegian in the first house does not drink milk or coffee, because we already know these beverages are drunk in the third and fourth houses. They do not drink tea, because the Dane drinks tea (hint 3), nor beer, because the Blue Master smoker drinks beer (hint 12) and the Norwegian smokes Dunhill. Therefore, the Norwegian must drink water.
- The Blend smoker is in the second house, with a neighbour who drinks water (hint 15).
- This Blend smoker cannot drink beer (hint 12), so the beer-drinking Blue Master-smoking person must live in the fifth house.
- The tea-drinking Dane (hint 3) lives in the second house.
- The German who smokes Prince (hint 13) must be in the fourth house.
- The Swede must live in the fifth house with his dogs (hint 2).
- The Pall Mall smoker lives in the third house and raises birds (hint 6).
- That leaves only the Norwegian in the first house as the one with cats who lives next door to the man who smokes Blend (hint 10).
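Therefore, assuming that the fifth pet is a fish, it is the GERMAN who owns the fish. The completed table, as derived in the steps above, is:

| | House 1 | House 2 | House 3 | House 4 | House 5 |
|---|---|---|---|---|---|
| Nationality | Norwegian | Dane | Brit | German | Swede |
| Colour | Yellow | Blue | Red | Green | White |
| Drink | Water | Tea | Milk | Coffee | Beer |
| Smoke | Dunhill | Blend | Pall Mall | Prince | Blue Master |
| Pet | Cats | Horses | Birds | Fish | Dogs |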
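For readers who would like to verify the answer by machine, here is a hedged brute-force sketch. The hints themselves are listed in last week's post, so the constraints below are an assumption: they follow the commonly circulated wording of the puzzle, numbered the same way as the hints cited above. The helper names (`next_to`, `by_house`, `solve`) are illustrative only.

```python
from itertools import permutations

HOUSES = range(5)  # 0 is the first (leftmost) house, 4 is the fifth


def next_to(a, b):
    """True when two house positions are neighbours."""
    return abs(a - b) == 1


def by_house(names, positions):
    """Turn (name, position) pairs into a list ordered by house."""
    row = [None] * 5
    for name, pos in zip(names, positions):
        row[pos] = name
    return row


def solve():
    """Try every assignment, pruning as soon as an (assumed) hint fails."""
    solutions = []
    for red, green, white, yellow, blue in permutations(HOUSES):
        if green + 1 != white:                                   # hint 4
            continue
        for brit, swede, dane, norwegian, german in permutations(HOUSES):
            if brit != red or norwegian != 0:                    # hints 1, 9
                continue
            if not next_to(norwegian, blue):                     # hint 14
                continue
            for tea, coffee, milk, beer, water in permutations(HOUSES):
                if dane != tea or green != coffee or milk != 2:  # hints 3, 5, 8
                    continue
                for pall_mall, dunhill, blend, blue_master, prince in permutations(HOUSES):
                    if yellow != dunhill or blue_master != beer:  # hints 7, 12
                        continue
                    if german != prince or not next_to(blend, water):  # hints 13, 15
                        continue
                    for dogs, birds, cats, horses, fish in permutations(HOUSES):
                        if swede != dogs or pall_mall != birds:   # hints 2, 6
                            continue
                        if not next_to(blend, cats):              # hint 10
                            continue
                        if not next_to(horses, dunhill):          # hint 11
                            continue
                        solutions.append({
                            "nationality": by_house(
                                ("Brit", "Swede", "Dane", "Norwegian", "German"),
                                (brit, swede, dane, norwegian, german)),
                            "colour": by_house(
                                ("red", "green", "white", "yellow", "blue"),
                                (red, green, white, yellow, blue)),
                            "drink": by_house(
                                ("tea", "coffee", "milk", "beer", "water"),
                                (tea, coffee, milk, beer, water)),
                            "smoke": by_house(
                                ("Pall Mall", "Dunhill", "Blend", "Blue Master", "Prince"),
                                (pall_mall, dunhill, blend, blue_master, prince)),
                            "pet": by_house(
                                ("dogs", "birds", "cats", "horses", "fish"),
                                (dogs, birds, cats, horses, fish)),
                        })
    return solutions


if __name__ == "__main__":
    for solution in solve():
        for category, row in solution.items():
            print(f"{category:12}", row)
        owner = solution["nationality"][solution["pet"].index("fish")]
        print("Fish owner:", owner)
```

Under these assumed hints the search finds a single arrangement, with the German in the fourth house owning the fish. Checking each hint as soon as the attributes it mentions have been assigned keeps the search small, far below the full space of 120 x 120 x 120 x 120 x 120 combinations.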

The Data Mining Perspective
It is worth noting that the distribution of data in this problem makes it manageable in two dimensions: each nationality, colour, drink, smoke and pet belongs to exactly one house. In the business world, however, there will be thousands or millions of transactions and the distribution of data will not be so neat. For example, not all Germans own fish and not all people who drink water also smoke Dunhill. So, with a large number of transactions, business decision makers might typically want answers to questions like:
- Which nationalities drink beer?
- Is there a link between pet ownership and cigarette preference?
- What is the most popular house colour in Norway?
For questions like these, the process of discovery is inverted. Rather than using gathered intelligence to fill in the data points, we use the available data points to extract answers about the relationships implied by the data, relationships that may not be apparent or obvious from a dimensional business intelligence review of that data. This is Data Mining. “Data mining is the semi-automatic extraction of patterns, changes, associations, anomalies, and other statistically significant structures from large data sets.”
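To make the inversion concrete, here is a small sketch of how two of the questions above might be asked of raw records. The records are invented purely for illustration, and the function names `beer_share_by_nationality` and `confidence` are hypothetical. A real data mining tool would work over far larger data sets and apply proper statistical tests.

```python
from collections import Counter

# Hypothetical records, invented for illustration only.
RECORDS = [
    {"nationality": "German", "drink": "beer", "smoke": "Prince", "pet": "fish"},
    {"nationality": "German", "drink": "coffee", "smoke": "Prince", "pet": "dogs"},
    {"nationality": "Swede", "drink": "beer", "smoke": "Blue Master", "pet": "dogs"},
    {"nationality": "Swede", "drink": "beer", "smoke": "Blend", "pet": "dogs"},
    {"nationality": "Norwegian", "drink": "water", "smoke": "Dunhill", "pet": "cats"},
    {"nationality": "Norwegian", "drink": "beer", "smoke": "Dunhill", "pet": "cats"},
    {"nationality": "Dane", "drink": "tea", "smoke": "Blend", "pet": "horses"},
    {"nationality": "Brit", "drink": "milk", "smoke": "Pall Mall", "pet": "birds"},
]


def beer_share_by_nationality(records):
    """Which nationalities drink beer? Share of beer drinkers per nationality."""
    totals = Counter(r["nationality"] for r in records)
    beer = Counter(r["nationality"] for r in records if r["drink"] == "beer")
    return {nation: beer[nation] / totals[nation] for nation in totals}


def confidence(records, if_key, if_value, then_key, then_value):
    """Of the records where if_key == if_value, what share has then_key == then_value?"""
    matching = [r for r in records if r[if_key] == if_value]
    if not matching:
        return 0.0
    hits = sum(1 for r in matching if r[then_key] == then_value)
    return hits / len(matching)


if __name__ == "__main__":
    print(beer_share_by_nationality(RECORDS))
    print(confidence(RECORDS, "pet", "cats", "smoke", "Dunhill"))
```

The second function is a simple form of the "association" idea: it measures how often one attribute value accompanies another, which is one way to probe a link between pet ownership and cigarette preference.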
So how do Business Intelligence and Data Mining differ? Terms associated with traditional BI include analytics or exploration, drill down, trending, reporting, planning data entry and modeling of business rules. Terms associated with data mining include predictive analytics, classification, association, regression and segmentation.
In a future post, we will go into these data mining terms in more detail.
About CALUMO
The CALUMO Group (www.calumo.com) is a leading provider of Performance Management solutions for enterprise-wide reporting, planning and Business Intelligence. Since 1998 the group has successfully delivered solutions to a wide range of enterprises, from SMEs to some of the largest listed companies and government organisations. These solutions provide quantifiable and valuable business insight, offering a single platform from which to integrate Corporate, Financial and Operational performance and objectives.