The Year of the Graph Newsletter: September 2018

Knowledge graphs in Gartner’s hype cycle, machine learning extensions and visual tools for graph databases, Ethereum analytics with RDF, using Gremlin with R, SPARQL, and Spring, graph database research winning the best paper award at VLDB, and benchmarking AWS Neptune.

Not bad for a typical summer vacation month such as August. This edition of the Year of the Graph newsletter had to be extended to make sure we include as much of the good stuff as possible.

Gartner’s hype cycle for 2018 was recently released, and knowledge graphs were included for the first time. If you wanted official proof it is the year of the graph, there you have it. When a hitherto niche technology gets in the spotlight, some explanations are in order, and Andreas Blumauer from the Semantic Web Company has a go at this.

Knowledge Graphs – Connecting the Dots in an Increasingly Complex World

Google has had a knowledge graph for a while now. But developing and using a knowledge graph at web scale is no easy feat. Diffbot claim to have managed to do just that, turning the web into the world’s largest knowledge graph.

The web as a database: The biggest knowledge graph ever | ZDNet

There is a lot to be said about knowledge graphs, what they are, and how to build them. A graph database will be the foundation on which you build one, but that’s not the only thing you can use graph databases for. Neo4j’s Jennifer Reif talks about when graph databases make sense.

How Do You Know If a Graph Database Solves the Problem?

Here’s the thing about knowledge graphs: you don’t necessarily need to move all your data to a graph database in order to build one. But you do need to have the right pointers and metadata about your data, and for this you do need a graph database. Kurt Cagle from Semantical LLC describes the approach.

Building Semantic Data Catalogs – Kurt Cagle – Medium

Since we’re on the semantic side of things in graphs, check out how Alethio and SANSA combined the SANSA stack for reading and querying large-scale RDF data with two of the most classic graph algorithms, Connected Components and PageRank, to do analytics on the Ethereum network.

The Hubs & Authorities in Transaction Network – Powered by SANSA and Graph Analysis
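For readers who have not met the two algorithms before, here is a minimal sketch of what they compute, in plain Python on a made-up handful of transfers. This is not the SANSA or Alethio code: the function names, the toy account labels, and the damping factor of 0.85 are all assumptions chosen for illustration, and the real analysis runs on distributed RDF data rather than an in-memory list.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group nodes into weakly connected components (edge direction ignored)."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, components = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:  # breadth-first search from the unseen start node
            node = queue.popleft()
            component.add(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        components.append(component)
    return components


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; damping and iteration count are assumed defaults."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if node not in out_links)
        base = (1 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for src, targets in out_links.items():
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical transfers (sender, receiver); the labels are not real accounts.
transfers = [("a", "b"), ("c", "b"), ("b", "d"), ("e", "f")]
components = connected_components(transfers)
ranks = pagerank(transfers)
```

Connected Components tells you which accounts ever touch each other, directly or through intermediaries, while PageRank scores an account higher when it receives value from accounts that are themselves well connected.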

RDF and graph analytics, check. RDF and machine learning, check too. Expect to see this more and more going forward. Here Pedro Oliveira from Stardog outlines how Stardog’s machine learning extensions for SPARQL do similarity search.

Similarity Search – Stardog

Neo4j also has some machine learning extensions. Lauren Shin, an intern at Neo4j, has developed some extensions for linear regression, which she outlines here.

Graphs and ML: Multiple Linear Regression – Towards Data Science
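If multiple linear regression is new to you, the underlying technique fits in a few lines. The sketch below is plain Python and has nothing to do with the Neo4j procedures themselves: the function name and the sample numbers are invented for illustration, and it solves the textbook normal equations rather than whatever method the extension uses internally.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    # Prepend a constant 1.0 so the first coefficient is the intercept.
    design = [[1.0] + list(row) for row in rows]
    size = len(design[0])
    # Build the augmented matrix [X^T X | X^T y].
    system = [
        [sum(r[i] * r[j] for r in design) for j in range(size)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(size)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        system[col], system[pivot] = system[pivot], system[col]
        for row in range(size):
            if row != col:
                factor = system[row][col] / system[col][col]
                system[row] = [a - factor * b
                               for a, b in zip(system[row], system[col])]
    # The matrix is now diagonal, so each coefficient is a simple division.
    return [system[i][size] / system[i][i] for i in range(size)]


# Made-up sample generated from y = 1 + 2*x1 + 3*x2, so the fit is exact.
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
observed = [1, 3, 4, 6, 8]
coefficients = fit_linear_regression(features, observed)
```

The returned list holds the intercept first, followed by one coefficient per feature column.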

Another contributor, another Neo4j extension: Peter Heisig from Technische Universität Dresden has built a Graph View Editor to interact with Neo4j, skipping the part where you write Cypher.

Neo4j Graph View Editor

More visual tools. Dave Bechberger built an IDE for running traversals and visualizing results for TinkerPop-enabled graph databases. It’s still at an early stage, but if you are not a big fan of the console, this may work well for you. And it’s open source, so you can contribute too.

Dave Bechberger on Twitter

But that’s not the only reason TinkerPop users have to rejoice. Microsoft also developed and open sourced a valuable resource for TinkerPop-enabled graph databases: a Spring Data layer for Gremlin. If you like Spring Data, you will surely appreciate this.

Spring Data Gremlin for Azure Cosmos DB Graph API

TinkerPop on a roll: Dharmen Punjani and Harsh Thakkar from the University of Bonn just released their Gremlin – SPARQL connector, which was included in TinkerPop. This means you can now query TinkerPop-enabled graph databases using SPARQL.

Jens Lehmann on Twitter

Wrapping up with TinkerPop and Gremlin, Jeffrey Hanson from the University of Queensland shows how Gremlin can be used to find subgraphs in R. Hanson is a conservation scientist, drawn to graphs by problems he has to deal with in his work.

RPubs – Subgraphs in R using Gremlin

This goes to show the ubiquity of large graphs and the surprising challenges of graph processing. That was also the title of the user survey paper by Siddhartha Sahu and his co-authors that won the best paper award at VLDB.

Siddhartha Sahu on Twitter

Did you ever wonder how fast AWS Neptune really is? Not as fast as TigerGraph, according to this benchmark published by TigerGraph’s VP of Engineering Mingxi Wu. Of course, benchmarks done by vendors should always be taken with a pinch of salt, but this may give you an idea.

Amazon Neptune, the Truth Revealed

Performance is important of course, but choosing a graph database is a hard exercise which should take many factors into account. The good news is, somebody did this already, so you don’t have to. The most comprehensive research on graph databases is out there: it will save you time and money, and ensure you choose what works for you. And if you’ve read this far, here’s a limited edition 33% off discount code for you: 33OFF

The Year of the Graph

