We get fruit delivered at work twice a week. This morning, I took an apple and noted that it wasn't a particularly good apple. Not terrible, just not great. I'm not sure how long apples have been around. Certainly, they've been cultivated for several thousand years. One would guess they existed wild for a lot longer than that. The presence of fresh apples in the northern hemisphere in May is an extremely recent development. So, I didn't bother complaining about the less-than-great apple to those who supply the fruit.
Technology users (myself included) are generally less understanding. New features and higher levels of performance are not only expected, they are required. Being impossible doesn't take something off the table, because impossible is a temporary state.
Two years ago we put together an analytic cube for a group at work that holds 50 billion fact rows tied to about 300 dimension attributes. When we first turned it on, queries were coming back in around 20 minutes. That seemed like a pretty big improvement over the 2-3 weeks they had been spending to get the same information out of the data warehouse, but the users wanted performance comparable to other cubes they used. The fact that this one was 20 times larger didn't impress them.
Through some very aggressive tuning, we were able to get most queries coming back in just a few minutes. This month, we gave a demo of their data on a truly scalable platform (Hadoop/Impala/AtScale) and showed that we can, in fact, return their results in seconds; there's just the little matter of paying for the hardware. And, if they really decide money is no object, we also showed how adding Vertica into the mix gave them another order of magnitude in speed.
It will be interesting to see how this unfolds. I'm pretty sure they'll go for the top-of-the-line solution. There's already talk of initiatives with trillion-row data sets. And, when those come about, they'll want those answers just as fast.