A report on the 2010 Extended Semantic Web Conference

This year’s edition of ESWC, which took place between May 30th and June 3rd in Crete, was the first in the series of Semantic Web Conferences under the ‘Extended’ (instead of the former ‘European’) title, and it justified the new name by bringing together a lively international Semantic Web community. In a nutshell, it was interesting and productive in terms of content and interaction, well-organised and fun – a successful high-end conference.

Besides presenting our own work, we of course also had the chance to attend lots of presentations, pick up new ideas and knowledge, renew established connections and make new ones – all of which is valuable in and of itself. I attended all the In Use track presentations and most of the Web of Data ones, plus tutorials and workshops.

I also got to read most of the papers that looked interesting, and I have to agree with Tom Heath, who stated at some point that Linked Data pretty much IS the Semantic Web today: it is what drives public adoption and exposure and gives the Semantic Web the critical mass to move forward. So Linked Data was my own focal point as well.

So now that we have enough data in the Web of Data, how do we actually query and rank it? Surprisingly perhaps, there’s still work to be done in this area, as so far we have been used to retrieving documents, not data, on the Web.

Fact: you cannot and should not expect end users to learn to formulate and submit queries in SPARQL in order to benefit from SW advanced retrieval capabilities – same as nobody expects them to formulate and submit queries in SQL to use RDBMSs. Solution?

‘A Comparative Study of Keyword Search, Query Completion, Result Completion and Faceted Search’ by Tran et al. presented interesting findings from assessing different methods of supporting users in their retrieval tasks. Apparently, keyword search is very popular, so ‘Combining Keyword Translation with Structured Query Answering for Efficient Keyword Search’ by Ladwig et al. seems like a good way to go.

Faceted browsing is also useful in that sense, and the gFacet system presented by Heim et al. in ‘Facet Graphs: Complex Semantic Querying Made Easy’ looked promising and is worth checking out. Their work on ‘Interactive Relationship Discovery via the Semantic Web’, which tackles relation discovery in the Web of Data, also looks promising; the system behind it is called RelFinder.

Ranking Web of Data results is a line of work that seems somewhat underdeveloped at the moment, and we expect to see ranking algorithms that go beyond PageRank, as the Web of Data has so much more structure that can be taken into account. Delbru et al. presented relevant work on this, both in ‘A Node Indexing Scheme for Web Entity Retrieval’ and in ‘Hierarchical Link Analysis for Ranking Web Data’. Knowing the ‘Object Link Structure in the Semantic Web’, as presented by Ge et al., can also give some insight into this task.

Caching is also important, as the LOD cloud won’t always be accessible and responsive. A cache for the LOD, as presented by Martin et al. in ‘Improving the Performance of Semantic Web Applications with SPARQL Query Result Caching’, seems like a good idea and is nicely complemented by local subgraph processing, presented by Schandl in ‘Replication and Versioning of Partial RDF Graphs’.
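To make the idea of query result caching a bit more concrete, here is a minimal sketch in Python. It is not the approach from the Martin et al. paper, just a plain in-memory LRU cache sitting in front of whatever function talks to the endpoint. The names `QueryResultCache` and `run_query` are made up for illustration, and the whitespace normalisation is a deliberate simplification (it would also collapse whitespace inside string literals).

```python
from collections import OrderedDict


class QueryResultCache:
    """Hypothetical LRU cache for query results, keyed by the query text."""

    def __init__(self, run_query, max_entries=128):
        # run_query is an assumption: any callable that takes a query
        # string, sends it to an endpoint and returns the results.
        self._run_query = run_query
        self._max_entries = max_entries
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query):
        # Collapse runs of whitespace so trivially different spellings
        # of the same query share one cache entry.
        return " ".join(query.split())

    def query(self, query):
        key = self._key(query)
        if key in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(key)
            self.hits += 1
            return self._entries[key]
        # Cache miss: ask the (possibly slow or unreachable) endpoint.
        self.misses += 1
        result = self._run_query(query)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return result

    def invalidate(self):
        # Drop everything, e.g. after the underlying data has changed.
        self._entries.clear()
```

A real cache would also need a strategy for invalidating entries when the underlying data changes; here that is reduced to a blunt `invalidate()` call.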

And last but not least, I did mention scale, right? For years the SW was dismissed by industry as a promising idea that won’t scale. So now that we have all this data on the cloud, which can be used to augment organisation-specific semantic data and results in massive amounts of it, is it possible to reason over it efficiently? Not at massive scale, at least not until recently; now it is. The WebPIE system presented by Urbani et al. in ‘OWL Reasoning with MapReduce: Calculating the Closure of 100 Billion Triples’ lets you do just that, using parallelism, displaying anytime behaviour and offering the code and instructions so that anyone can run this on the cloud.

This is actually a big deal and would nicely complement triple stores by solving the problem of having to wait for forward chaining to finish (as presented by Thakker et al. in ‘A Pragmatic Approach to Semantic Repositories Benchmarking’): just load the data, use WebPIE to progressively calculate the closure in the background, and collect the results.
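For readers who have not met the terms, ‘forward chaining’ and ‘calculating the closure’ boil down to applying inference rules over and over until no new triples come out. Here is a naive single-machine sketch in Python. It is emphatically not WebPIE: there is no MapReduce, no OWL-Horst rule set, just two RDFS-style rules (subclass transitivity and type inheritance) over tuples of strings, and the function name `closure` is made up for illustration.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def closure(triples):
    """Naive forward chaining: apply two rules until a fixpoint is reached.

    Rule 1: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
    Rule 2: (x type a),       (a subClassOf b) => (x type b)
    """
    closed = set(triples)
    while True:
        # Index the direct superclasses known so far.
        superclasses = {}
        for s, p, o in closed:
            if p == SUBCLASS:
                superclasses.setdefault(s, set()).add(o)

        # Both rules have the same shape: replace the object of a
        # subClassOf or type triple with each superclass of that object.
        derived = set()
        for s, p, o in closed:
            if p == SUBCLASS or p == TYPE:
                for sup in superclasses.get(o, ()):
                    derived.add((s, p, sup))

        derived -= closed
        if not derived:
            # Nothing new was inferred: this is the closure.
            return closed
        closed |= derived


facts = {
    ("ex:Cat", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:tom", TYPE, "ex:Cat"),
}
inferred = closure(facts) - facts
```

Every pass rescans all the triples, which is exactly the kind of cost that becomes painful at billions of triples and the reason a parallel approach matters.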

So to me there is no doubt it should have won the best paper award: faster-than-ever OWL-Horst reasoning that puts every other reasoner out there to shame and is also available for everyone to run on the cloud! Then again, I could be slightly biased in saying so, as this is in a way seeing people move forward with things you’ve worked on – in this case, MARVIN :)

On a related note, there was also an offspring of the OpenKnowledge project: ‘OKBook: Peer-to-Peer Community Formation’. It’s basically a system for p2p community formation in order to execute web service-like choreography protocols, only lighter and more dynamic and also using semantics for the publishing and matching process. The OKBook system looks like a nice implementation of some of the ‘future work’ features for the OK project (just begging for some real-world use). So, kudos to Spyros, Jacopo, Frank, Dave, Xi and co!

Finally, before moving on to talk about our own work: the winner of the best In Use track paper, ‘Put in Your Postcode, Out Comes the Data: A Case Study’ by Omitola et al., showcased a real-world success story of putting public data on the Web of Data – something to keep in mind when embarking on such endeavours, as we intend to!

As far as IMC Technologies is concerned, our sponsorship created lots of interest in the company and its activities, reinforced by presentations of our work: the paper ‘Facilitating Dialogue – Using Semantic Web technology for eParticipation’ was presented by George Anadiotis in the In Use session, while Panos Alexopoulos presented the work ‘Towards a Methodology for the Engineering of Fuzzy Ontologies’ in the Poster session.

Facilitating Dialogue introduced the domain of eParticipation to the Semantic Web crowd and tried to show how SW technology is relevant for the domain and how it has benefited our eDialogos platform. The key points of intersection are:

  • Using domain ontologies to model deliberation domains and making them seamlessly available to users via a Web2.0 interface in order to provide background knowledge and to browse, annotate and retrieve content and expressed views
  • Integrating our advanced hybrid search mechanism for content and views retrieval and query expansion
  • Using a deliberation ontology and a Linked Data mechanism to create an interconnected Dialogue ecosystem

The work was generally well received and I’m glad to say that it has generated interest beyond the SW community as well. In fact, as we officially announced at both the ELLAK conference and ESWC, our intention is to release an open source version of the eDialogos platform by the end of 2010; we are currently looking into the most appropriate distribution and licensing options in order to do this.

Our poster presentation was about a preliminary version of a novel methodology for developing fuzzy ontologies, namely ontologies that may represent vague knowledge. The ability to model, exploit and manage such knowledge is one of the distinguishing characteristics of our knowledge management platform and what the above methodology does is to enable knowledge engineers and domain experts to identify, capture and model vague knowledge in the form of ontologies in the most appropriate and effective way.

The work was appealing to several participants of the conference who realized the need for dealing with vagueness and imprecision in real-world knowledge management scenarios. Therefore, our immediate plans are to further develop and finalize our methodology and accompany it with a comprehensive tool for developing and managing fuzzy ontologies.
