Mix and match analytics: data, metadata, and machine learning for the win
Creating winning analytics solutions means combining and making the most of different approaches and techniques. Taking a look at how Google does this for YouTube can provide inspiration and set a framework for analytics solutions.
YouTube recommendations are a prominent example of applying advanced analytics on a massive scale to improve a service, the experience users get out of it, and the bottom line of Google, the vendor behind it. Previously, we explored the rationale behind it and considered how this type of analytics could be classified. It’s time to pick up where we left off and explore how it works under the hood.
Inspiration came from a hit moment for YouTube recommendations: one of those times when the service succeeded in picking the track that the person curating an ad-hoc, spur-of-the-moment playlist was about to play next. The wow effect induced by this successful prediction of a rarity triggered an impromptu discussion, which may serve to illuminate different aspects of analytics.
So, how could YouTube know what you want to play next before you play it?
One approach would be to rank how similar videos are to each other based on their content. Strange as it may sound, this scenario is plausible. A typical application of the idea is ranking documents by similarity based on hashes of their content, where the hashes are constructed so that similar content tends to produce similar hash values. This approach is based on a technique called Locality Preserving Hashing (LPH).
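To make the idea of hash-based similarity ranking concrete, here is a minimal sketch in Python. It does not show how YouTube works. It uses MinHash, a scheme from the closely related locality-sensitive hashing family, as a stand-in for the technique named above. The function names, the three-word shingle size, the 128 hash functions, and the sample documents are all assumptions chosen for illustration.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def seeded_hash(seed, item):
    """Deterministic 64-bit hash of item; each seed acts as a separate hash function."""
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_hashes=128):
    """For each hash function, keep the smallest hash value seen over the set."""
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(seeded_hash(seed, item) for item in items)
            for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching signature slots; this estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def jaccard(a, b):
    """Exact Jaccard similarity of two sets, for comparison with the estimate."""
    return len(a & b) / len(a | b)


# Hypothetical documents: two near-duplicates and one unrelated text.
doc_a = "the quick brown fox jumps over the lazy dog near the river bank"
doc_b = "the quick brown fox jumps over the lazy dog near the old mill"
doc_c = "stock markets closed higher today after a volatile trading session"

set_a, set_b, set_c = shingles(doc_a), shingles(doc_b), shingles(doc_c)
sig_a, sig_b, sig_c = (minhash_signature(s) for s in (set_a, set_b, set_c))

print("a vs b:", estimated_similarity(sig_a, sig_b), "exact:", jaccard(set_a, set_b))
print("a vs c:", estimated_similarity(sig_a, sig_c), "exact:", jaccard(set_a, set_c))
```

The point of the signatures is that they are short and fixed in size, so documents can be compared, or bucketed for lookup, without comparing their full content. Applying the same idea to video would require content features far richer than word shingles, which this sketch does not attempt.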