K is for Knowledge: Application and data integration for better business, using metadata and knowledge graphs
Being disrupted by Big Tech is one of the greatest concerns for any business. Good news: There may be a path to accelerate digital transformation and out-compete Big Tech, by leveraging domain knowledge.
First, you get the software: Operating systems, search engines, browsers, and social networks. Then, you get the hardware: Mobile phones, data centers, cloud. Then, you get a gradually expanding foothold in just about anything from advertising and media to healthcare and from autonomous vehicles to banking.
In this process, Big Tech has managed to amass money and power, building its ruthless efficiency on data-driven culture and products. The awe this has instilled in businesses has been captured by a pop culture reference: the Red Wedding, a scene from the Game of Thrones series.
In the series, the Red Wedding is a massacre. The metaphor has been used to describe the effect AWS announcements have on software businesses that see AWS enter their turf. The software business has been the first to feel Big Tech’s effect, but it does not look like it will be the last.
Today, every business is a technology business, in the sense that it runs on technology. Unlike Big Tech, however, most businesses have a surplus of legacy systems and a deficit of tech talent. This makes modernization risky and costly. Most businesses can’t afford to rip and replace systems built over the years. The architecture may be outdated, but the business logic is tried and true.
So, what are businesses to do? Sit and wait to be disrupted, invest huge amounts in modernization efforts, try to out-tech Big Tech? None of these sounds like a very good solution. But there may be another option to win this battle.
First and foremost, every business needs to become the best possible version of itself by leveraging its competitive advantage: Domain business knowledge. This was the most important takeaway from one of the most forward-thinking events in Europe, the Big Things conference.
We have referred in the past to the path from Big Data to AI, again based on observations made at the conference. This year, the event itself evolved along this path, rebranding as Big Things, and giving the stage to an array of speakers from organizations big and small alike.
Google was among them: Cassie Kozyrkov, chief decision scientist at Google, keynoted the event. Kozyrkov offered an excellent blueprint on how to use machine learning for data-driven decision making. One of the many points made was that data-driven decision making is a non-starter without trusted data. No trusted data means no data-driven decision making, which means no efficiency.
In other words: If your data is a mess, it’s going to kill your business. This was the starting point for Oscar Mendez’s keynote. Mendez, who is the CEO and co-founder of Stratio, defined trusted data as data that is clean, secure, accurate, and organized, and that has well-defined origins and clear access guidelines.
As Mendez puts it, Big Tech monitors interactions, collects data, and learns something all the time. Most other businesses don’t. But this goes beyond the cold start problem. Many businesses have started collecting data, and legacy systems are huge data troves, too. But how do you get from zero to trusted data?
Data governance is one part of the answer. Things such as data lineage, access control, and metadata enrichment fall under data governance. In that respect, businesses that listened to the GDPR wake-up call and put data governance processes and systems in place should already be better positioned to deal with these issues.
Another part of the answer, Mendez argued, is virtualization. With an array of systems in place, each generating data in its own format and storing it in its own silo, how can businesses ever hope to have a holistic, integrated picture?
Mendez’s proposed solution to this combines data catalogs and virtualization to create what is called a trusted data fabric. What this means is that data stays where it is, and accessing it happens via the fabric layer, utilizing the data catalog to point to the underlying systems of record.
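To make the idea more concrete, here is a minimal sketch in Python of how a catalog and a virtualization layer might fit together. It is not Stratio's implementation, and every name in it (`DataFabric`, `CatalogEntry`, the `customers` and `orders` datasets, the two toy silos) is a hypothetical stand-in invented for illustration. The assumption is simply that each system of record can be wrapped in a reader function, and that the catalog maps a logical dataset name to that reader plus a note on where the data originates.

```python
import csv
import io
import sqlite3
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]

# Silo 1 (hypothetical): a relational system of record.
# An in-memory SQLite database stands in for it here.
crm_db = sqlite3.connect(":memory:")
crm_db.execute("CREATE TABLE customers (id TEXT, name TEXT)")
crm_db.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("1", "Acme"), ("2", "Globex")],
)

# Silo 2 (hypothetical): a legacy system that only exports CSV.
LEGACY_ORDERS_CSV = "customer_id,amount\n1,250\n2,100\n1,75\n"


def read_customers() -> List[Row]:
    """Read customers straight from the relational silo."""
    cursor = crm_db.execute("SELECT id, name FROM customers")
    return [{"id": row[0], "name": row[1]} for row in cursor.fetchall()]


def read_orders() -> List[Row]:
    """Read orders straight from the legacy CSV export."""
    return list(csv.DictReader(io.StringIO(LEGACY_ORDERS_CSV)))


@dataclass
class CatalogEntry:
    origin: str                     # where the data lives (its system of record)
    reader: Callable[[], List[Row]]  # how to fetch it on demand


class DataFabric:
    """Toy fabric layer: data stays in its silo, the catalog points to it."""

    def __init__(self) -> None:
        self.catalog: Dict[str, CatalogEntry] = {}

    def register(self, name: str, origin: str,
                 reader: Callable[[], List[Row]]) -> None:
        self.catalog[name] = CatalogEntry(origin=origin, reader=reader)

    def read(self, name: str) -> List[Row]:
        # Nothing is copied ahead of time; the silo is queried when asked.
        return self.catalog[name].reader()


fabric = DataFabric()
fabric.register("customers", "crm/sqlite", read_customers)
fabric.register("orders", "legacy/csv", read_orders)


def revenue_by_customer(fabric: DataFabric) -> Dict[str, int]:
    """Combine two silos through the fabric, without knowing their formats."""
    names = {c["id"]: c["name"] for c in fabric.read("customers")}
    totals: Dict[str, int] = {}
    for order in fabric.read("orders"):
        name = names[order["customer_id"]]
        totals[name] = totals.get(name, 0) + int(order["amount"])
    return totals


print(revenue_by_customer(fabric))
```

The point of the sketch is that `revenue_by_customer` only knows logical dataset names. Which system holds the data, and in what format, is the catalog's concern, so a silo could be swapped out by re-registering its entry rather than rewriting the consumers.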