The 12th International Semantic Web Conference
and the 1st Australasian Semantic Web Conference
21-25 October 2013, Sydney, Australia

Generating Structured Profiles of Linked Data Graphs

Besnik Fetahu, Stefan Dietze, Bernardo Pereira Nunes, Davide Taibi and Marco Antonio Casanova
While an increasingly large amount of Linked Data exists, metadata about the content covered by individual datasets is sparse. In this paper, we introduce a processing pipeline to automatically assess, annotate and index available linked datasets. Given a minimal description of a dataset from the DataHub, the pipeline produces a structured RDF-based description that includes information about the dataset's main topics. Additionally, the generated descriptions embed datasets into an interlinked graph of datasets based on shared topic vocabularies. We adopt and integrate techniques for Named Entity Recognition and automated data validation, providing a consistent workflow for dataset profiling and annotation. Finally, we validate the results obtained with our tool.
