In Between Years. The Year of the Graph Newsletter: January 2019
In between years, or zwischen den Jahren, is a German expression for the period between Christmas and New Year. This is traditionally a time of year when not much happens, and this playful expression itself lingers between the literal and the metaphoric. As the first edition of the Year of the Graph newsletter for 2019 is here, a short retrospective may be due in addition to the usual updates.
When we called 2018 the Year of the Graph, we did not have to wait for the Gartners of the world to verify what we saw coming. We can say without a doubt that this has been the year graphs went mainstream. Things are unfolding: the quantity and quality of releases and use cases, and all-around progress, are stepping up. So, what follows the Year of the Graph? Why, more Years of the Graph.
In terms of the Newsletter, a small change introduced here is that from now on each edition will come with a title in addition to the month it was released in. Reminder: you can find all previous editions here, and subscribe at the end of each post. In terms of the Report… stay tuned!
Doing reviews and predictions is a favorite pastime for this time of year, so let’s start with a roundup of those. Dan McCreary from Optum Technologies and Giovanni Tummarello from Siren have written some of the most well-received and insightful reviews, while Dataversity compiled a list of insights collected from industry key figures.
What to expect of Semantic Web and other Semantic Technologies? Knowledge Graphs have gotten a lot of attention as a backbone for Machine Learning, Deep Learning, and AI business use cases. Expect that to continue. There has been a growing appreciation of Knowledge Graphs – ontologies – to provide an umbrella overlay for cross-walks across siloed information resources.
In Connected Data London, we had the opportunity to host many of those key figures. We learned from them as we all shared insights and knowledge, getting a taste of things to come. Here is our roundup, with links to all the content in case you missed it.
Knowledge Graphs, Machine Learning and AI, Linked Data and Semantic Technology and Graph Databases are redefining how data works. Data is redefining how everything works. Connected Data London is the go-to event for the latest developments in these key technologies. We curated our program on…
The new year has hardly kicked off, but we already have the first major news. Apache TinkerPop 3.4 is here with a slew of improvements: GraphBinary, a new network serialization format that has been shown to be significantly faster than existing options; the SPARQL bridge is officially here; Gremlin Recipes have been expanded with Anti-Patterns; and more. Speaking of recipes, Kelvin Lawrence from IBM has also updated his Practical Gremlin guide.
Apache TinkerPop 3.4.0 Released. Avant-Gremlin Construction #3 for Theremin and Flowers https://t.co/b6wgT8GnuE #graphdb #nosql
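For readers new to the ecosystem: a Gremlin traversal reads as a chain of steps hopping along edges. A toy illustration of that style in plain Python (this is not the TinkerPop API, and the graph data is invented):

```python
# Toy property graph: adjacency list with labeled edges.
graph = {
    "alice": [("knows", "bob"), ("knows", "carol")],
    "bob": [("created", "tinkergraph")],
    "carol": [("created", "gremlin")],
}

def out(vertices, label):
    """Follow outgoing edges with the given label, like Gremlin's out() step."""
    return [dst for v in vertices for (lbl, dst) in graph.get(v, []) if lbl == label]

# Roughly what g.V("alice").out("knows").out("created") expresses in Gremlin.
projects = out(out(["alice"], "knows"), "created")
print(projects)  # → ['tinkergraph', 'gremlin']
```

The real thing, of course, adds lazy evaluation, filters, aggregations, and the serialization formats (now including GraphBinary) to ship traversals to a server.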
SPARQL is often dismissed as being impractical, hard to use, etc – like the rest of the Semantic Web. While criticism is warranted, for both technical and non-technical reasons, something seems to be changing. Ruben Verborgh from Ghent University / Inrupt makes valid points on how the Semantic Web offers solutions to hard problems everyone faces, compares SPARQL to GraphQL, and elaborates on how to make this stack accessible, and easier to use.
Making decentralized Web app development fun ◆ While the Semantic Web community was fighting its own internal battles, we failed to gain traction with the people who build apps that are actually used: front-end developers.
Case in point: SPARQL used to measure use of programming paradigms, thanks to Wikidata and its SPARQL endpoint. And some insights derived from SPARQL practitioners. Remember, there is an upcoming W3C workshop on Graph standardization, and we expect to see proposals on how this space can move forward and bridge the gap between RDF and Labeled Property Graphs.
I love that SPARQL is the language used to compare the mentions of all the other programming languages here. https://t.co/PMclq0YViC
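Under the hood, a SPARQL SELECT boils down to matching basic graph patterns against a set of triples. A minimal sketch of that idea in plain Python (the triples and the paradigm example are invented for illustration and are not Wikidata's actual vocabulary):

```python
# A handful of invented (subject, predicate, object) triples.
triples = [
    ("lisp", "paradigm", "functional"),
    ("haskell", "paradigm", "functional"),
    ("java", "paradigm", "object-oriented"),
]

def match(pattern):
    """Match one triple pattern (constants or ?variables) against the store,
    returning variable bindings - the core of SPARQL pattern matching."""
    results = []
    for triple in triples:
        bindings = {}
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                bindings[term] = value
            elif term != value:
                ok = False
                break
        if ok:
            results.append(bindings)
    return results

# Analogous to: SELECT ?lang WHERE { ?lang :paradigm "functional" }
langs = [b["?lang"] for b in match(("?lang", "paradigm", "functional"))]
print(langs)  # → ['lisp', 'haskell']
```

A real engine adds joins across multiple patterns, OPTIONAL, FILTER, and aggregation, but the variable-binding core is the same.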
And while we’re discussing query languages: Neo4j just released the first milestone of its GraphQL to Cypher transpiler. What this means is you can now query Neo4j using GraphQL. This may serve as one more entry point to Neo4j for developers coming into the graph ecosystem, as GraphQL’s popularity is on the rise.
Because it’s much more fun, I wrote it in Kotlin using graphql-java as a base library. We have learned a lot in the last 2 years of integrating GraphQL with Neo4j, especially how easy and empowering it is to translate GraphQL to Cypher.
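The idea behind such a transpiler is fairly mechanical: a GraphQL selection set maps onto a Cypher MATCH plus RETURN. A toy sketch of that mapping in plain Python (the real Neo4j implementation handles arguments, nested selections, and much more; `Movie` and its fields are made-up examples):

```python
def graphql_to_cypher(type_name, fields):
    """Translate a flat GraphQL selection set into a Cypher query string:
    each requested field becomes a projected property on the matched node."""
    returns = ", ".join(f"n.{field} AS {field}" for field in fields)
    return f"MATCH (n:{type_name}) RETURN {returns}"

# { Movie { title released } }  in GraphQL becomes:
query = graphql_to_cypher("Movie", ["title", "released"])
print(query)  # → MATCH (n:Movie) RETURN n.title AS title, n.released AS released
```

The appeal is that front-end developers keep writing GraphQL while the database speaks Cypher.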
Cosmos DB, for its part, open-sourced an update of its .NET SDK. More importantly, Cosmos DB lowered its pricing and added a mechanism to get an accurate cost measure for each Graph API operation. Howard van Rooijen from Endjin elaborates.
We’re currently building a Data Governance Platform product that enables UK Financial Services organisations to discover and manage the life-cycle, usage, risk and compliance requirements of data assets across the organisation. Much of the core functionality is delivered using Cosmos DB’s Gremlin API to model data lineage and other relationships best represented by a graph data structure.
We’re happy to announce the beta release of Stardog 7, which comes with a new storage engine that significantly improves write performance. We have been talking about the new Stardog storage engine, Mastiff, for a while now. Mastiff is based on the open-source, very low-level key-value store RocksDB and brings multi-version concurrency control (MVCC) to Stardog transactions.
With so much choice in RDF stores, doing some testing to see how each performs can help. Angus Addlesee from Wallscope did that. This is no small undertaking, but it’s also far from complete: not all stores included, configurations not optimized, number of experiments too small, and of course, LPG stores not included. Just goes to show why there are entire projects, such as HOBBIT, dedicated to this effort. Plus, performance is just one of the considerations when evaluating graph databases.
There are many Triplestores available and it is difficult to decide which is best for each use-case. In this article I tie all of my previous posts together and explore the pros and cons of some of the most popular triplestores. I will be using RDF and SPARQL queries that I have created and discussed previously.
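Whatever stores end up in such a comparison, the measurement side looks the same: run each query repeatedly and report a robust statistic rather than a single timing. A minimal harness in plain Python (the lambda is a stand-in for a real call to a SPARQL endpoint, e.g. via SPARQLWrapper):

```python
import time
from statistics import median

def benchmark(run_query, repeats=5):
    """Time repeated executions of a query function and return the
    median wall-clock seconds, which is less noise-sensitive than the mean."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return median(timings)

# Stand-in workload; in a real comparison this would hit a triplestore.
elapsed = benchmark(lambda: sum(range(100_000)))
print(f"median: {elapsed:.6f}s")
```

Projects like HOBBIT exist precisely because doing this properly also means controlling warm-up, caching, dataset scale, and store configuration.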
DataStax also released version 6.7 of its multi-model DSE platform. Kafka and Docker integration, operational analytics and more – graph operations will benefit as well. Plus, some useful and generally applicable advice on dealing with super-nodes from DataStax’s Jonathan Lacefield.
Graph databases are receiving a lot of hype these days because of the promise of fast and flexible queries that aren’t possible within either traditional RDBMs or NoSQL stores built on simple/singular access patterns. There are some practical tips and tricks that ensure that your graph database project is going to live up to the hype.
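A super-node is simply a vertex whose degree dwarfs the rest of the graph, so any traversal passing through it fans out explosively. Spotting candidates is straightforward; the modelling fixes (such as bucketing edges) come after. A toy sketch with invented data in plain Python:

```python
from collections import Counter

# Invented edge list: one hub connected to 1000 users, plus a small chain.
edges = [("hub", f"user{i}") for i in range(1000)] + [("a", "b"), ("b", "c")]

# Count the degree of every vertex (each edge touches two vertices).
degree = Counter()
for src, dst in edges:
    degree[src] += 1
    degree[dst] += 1

# Flag vertices whose degree exceeds a threshold as super-node candidates.
THRESHOLD = 100
super_nodes = [v for v, d in degree.items() if d > THRESHOLD]
print(super_nodes)  # → ['hub']
```

In a production graph the same check runs as a degree-distribution query against the store itself, and the threshold is a judgment call based on that distribution.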
More useful advice: how to avoid doppelgängers using graph databases, and one graph to find them all, by G DATA’s Florian Hockmann and Kadir Bölükbasi.
In this post, we take a look at the problem of getting duplicate data (the doppelgängers) in a graph database like JanusGraph and discuss different approaches to solve it. We will therefore walk through our experiences with upserts at G DATA and how we improved our upserting process…
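The standard defence against doppelgängers is an upsert: look the vertex up by a unique key and only create it when absent, which in Gremlin is the well-known fold().coalesce(unfold(), addV()) recipe. A toy equivalent in plain Python (the keys and properties are invented):

```python
vertices = {}  # unique key -> property map

def upsert(key, properties):
    """Get-or-create a vertex by unique key, mirroring the intent of
    Gremlin's fold().coalesce(unfold(), addV()) upsert pattern."""
    if key not in vertices:
        vertices[key] = dict(properties)   # create once
    else:
        vertices[key].update(properties)   # merge onto the existing vertex
    return vertices[key]

# Two ingestions of the same entity yield one vertex, not a doppelgänger.
upsert("sample:abc123", {"family": "trojan"})
upsert("sample:abc123", {"first_seen": "2019-01-03"})
print(len(vertices))  # → 1
```

In a database the lookup must go through a unique index (and, depending on the store, explicit locking or retries) for this to be safe under concurrent writes, which is exactly the ground the G DATA post covers.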
Another multi-model graph database with a new release: AgensGraph 2.0. AgensGraph is based on PostgreSQL, offering a graph API and query language (Cypher) on top of it.
AgensGraph is a new generation multi-model graph database for PostgreSQL. AgensGraph offers the graph analysis environment for highly connected data in which users can write, edit, and execute SQL and Cypher query together at the same time. AgensGraph comes along with PostgreSQL compatibility and PostgreSQL Extensions.
In the same vein: Redfield has integrated OrientDB with the KNIME Analytics Platform, a tool for advanced analytics and machine learning. We expect to see more graph database integrations.
Ontology and Data Science usually don’t go in the same sentence. Favio Vázquez from Ciencia y Datos argues they should.
How the study of what there is can help us be better data scientists. If you are new to the word ontology don’t worry, I’m going to give a primer on what it is, and then why it matters for the data world.
How are you going to visualize your ontology? James Malone from SciBite explores the options.
What’s the most useful way to visualise an ontology? It’s a question I’ve returned to many times over the last decade of building tools which employ ontologies in some way. And when a friend recently asked me about useful mechanisms for visualising ontologies, I thought it was about time I wrote up some thoughts.
Graphs and Deep Learning go together well. David Mack from Octavian shows how to get started.
Since our talk at Connected Data London, I’ve spoken to a lot of research teams who have graph data and want to perform machine learning on it, but are not sure where to start. In this article, I’ll share resources and approaches to get started with machine learning on graphs.
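A common first step in that direction is turning each node into a feature vector, using simple structural features before reaching for graph embeddings or graph neural networks. A minimal sketch in plain Python, on an invented graph:

```python
from collections import defaultdict

# Invented undirected graph as an edge list.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

neighbours = defaultdict(set)
for u, v in edges:
    neighbours[u].add(v)
    neighbours[v].add(u)

def features(node):
    """A tiny feature vector: degree and local clustering coefficient."""
    nbrs = neighbours[node]
    k = len(nbrs)
    if k < 2:
        return [k, 0.0]
    # Count edges among the node's neighbours (each pair once).
    links = sum(1 for x in nbrs for y in nbrs if x < y and y in neighbours[x])
    return [k, 2 * links / (k * (k - 1))]

# Feature matrix ready to feed into any standard ML model.
matrix = {n: features(n) for n in neighbours}
print(matrix["a"])  # → [2, 1.0]
```

From here the usual path is richer features (PageRank, embeddings) and then models that consume the graph structure directly, which is where the resources in the article pick up.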