Indexing Hadoop: If it’s so simple, how come not everyone’s doing it?

ganadiotis
Feb 17, 2014
Analysis, Data Lakes & Warehouses, Databases, Technical
Big Data, Hadoop, SQL, SQL-on-Hadoop, Structure
4 comments

Is Hadoop really the best thing since sliced bread? You would probably get that idea if you have been talking to any of the (proliferating) Hadoop advocates / vendors. Hadoop and its expanding ecosystem are being touted as the ideal solution to any organization’s data management needs – and admittedly, for good reason. Hadoop offers a […]

ganadiotis
Jan 15, 2014
Analysis, Data Lakes & Warehouses, News, Technical
Hadoop, Hive, Impala, RDF, SQL, SQL-on-Hadoop
3 comments

A small step for Impala, a big step for SQL-on-Hadoop. More to come, hopefully.

Recently, Cloudera published the results of an internally performed benchmark comparing its own SQL-on-Hadoop implementation (Impala) against a carefully selected competition composed of Hive and an undisclosed RDBMS, and showing that Impala outperforms both. As Gigaom’s Derrick Harris was quick to point out, beating Hive is not something to write home about, as Hive is […]

ganadiotis
Jan 13, 2014
Analysis, Data, Software Engineering, Technical
API, SQL
2 comments

Data Modeling for APIs. Part 1: setting the stage

Lately we’ve been engaged in the design of a data model for a project aiming to deliver an API for analytics in the energy domain. As there is an ongoing debate in the consortium with regard to the type of API that will be implemented (RESTful vs Web Services), we have been asked to provide […]

Linked Data Orchestration
