
Teaching/Seminar: The Role of Ontologies in Linked Data


Seminar: The Role of Ontologies in Linked Data



Course details
Lecturer(s): Rudi Studer, Elena Simperl, Denny Vrandecic, Benedikt Kämpgen
Teaching assistant: Benedikt Kämpgen
Subject (area):
Credit points:
Assessment: presentation, seminar paper
Semester: WS (winter semester)


Current and supplementary information, as well as the times and rooms of the course, can be found in the university's course catalogue.
Link to the course catalogue
Link to the student portal


Research group




Literature

Literature recommendations can be found in the Mendeley group for the course.





What role do ontologies play for Linked Data? Or: did Linked Data kill ontologies?

Dates
  • 21.10.2011, 10:00 - 11:30, Room 253, Building 11.40 - Kick-off, introduction to the topic
  • 28.11.2011, 10:00 - 11:30, Room 253, Building 11.40 - Block session, short interim presentations by the students
  • 20.01.2012, 08:00 to 19:00, Room 226, Building 11.40 - Block session, final presentations by the students
    • (Alternative date for the final presentations: 03.02.2012, 14:00 to 19:00, Room 253, Building 11.40)


Registration
  • Prerequisite: The terms ontology and Linked (Open) Data should be familiar. Our literature recommendations list introductory reading on these two topics, which this seminar aims to bring together.
  • Registration is by e-mail to Benedikt Kämpgen and is possible until the first seminar meeting.
  • Please state when registering:
    • Name
    • Matriculation number
    • Field of study
    • Semester of study (in the semester of the seminar)
  • Please complete for registration: Participants are asked to join the research group "Seminar - Die Rolle von Ontologien in Linked Data" [1] on Mendeley. The research platform Mendeley supports students in managing literature and in discussions with other participants and lecturers. Announcements regarding the seminar are also made via the platform.


Procedure
  • Assessment: Students are to write a seminar paper of approx. 8-12 pages and give a final presentation.
  • The course will be held in German.
  • Presentations may be given in German or English.
  • If you have questions, you can contact the teaching assistant at any time.


Description

Research on ontologies has a long history in computer science. Ontologies allow the formal and explicit description of a shared view of a domain. Ontologies received a particular boost from the Semantic Web, whose goal is to make the meaning of data on the Web understandable to machines as well. Accordingly, there are many applications for ontologies on the Web [2], for example the integration of different data sources, the search for specific information, or decision support.


The emergence of Linked (Open) Data [3] has further expanded the possible applications of ontologies. First, there is now a large number of ontologies; so many that even similar domains are covered by several ontologies. For example, companies can be described with both the Yahoo! SearchMonkey Commerce ontology and the Organization Ontology. Further ontologies are RDF, RDFS and OWL, with which the Semantic Web was standardized; domain-specific ontologies such as GoodRelations for describing product manufacturers and service providers, FOAF for describing people, and SKOS for describing glossaries; and ontologies that describe general knowledge, such as OpenCyc, DBpedia and Freebase. Second, there is by now a lot of data; and, more importantly, as Linked Data it meets criteria that enable automatic access and automatic interpretation. Examples are DBpedia with data from Wikipedia, EUROSTAT with statistical data of the European Union on its member countries, and GenBank with data from genetics. Finally, there are many tools that simplify the use of ontologies and data. These include editors such as the NeOn Toolkit; search engines such as Sindice; ontology and data repositories such as BioPortal and CKAN; and frameworks and libraries such as Semantic MediaWiki, fourty2, and SPARK for programming with ontologies and data.


This seminar examines concepts and technologies for using ontologies under these new conditions. Results are to be demonstrated with practical examples, e.g., from interesting case studies in the literature or from small experiments of one's own.


Topics

Notes:

  • Topics are assigned in English but may be worked on in German.
  • Literature is given as links or in square brackets and can usually be found under that name in Mendeley. The documents themselves may have to be downloaded separately. All of the literature is accessible from within the university network.
  • The literature is only a recommendation and is not meant to restrict the work. What matters is the rough topic and the goal of establishing the connection between Ontology Engineering and Linked Data.

In the following, short definitions of ontology engineering and Linked Data are given first. Then the topics of this seminar are presented along the NeOn methodology of ontology engineering.

Short introduction to Ontology Engineering

  • In Computer Science, an ontology is broadly defined as a formal specification (machine-understandable description) of a shared (group-consensus-based) conceptualization (concepts are described) of a domain of interest (of a certain field).
  • Reference: Handbook of Ontologies 2009
  • Ontology Engineering as we understand it:
    • Consensus: Collaborative effort of modelling a domain of interest.
    • Conceptualization: Definition of concepts, relationships, and instances.
    • Formal specification: Modelling using Semantic Web standards such as OWL, RDF*
    • Reasoning: Depending on their expressivity, ontologies allow new information to be derived through reasoning.

Short introduction to Linked Data

  • It is the aim of Linked Data to be able to query the whole web as one global data space
  • Reference: [Evolving the Web into a Global Data Space]
  • Linked Data as defined through the four principles by Tim Berners-Lee
    • Unique identifiers for things: Use URIs as names for things.
    • HTTP URIs: Use HTTP URIs so that people can look up those names.
    • Resolving provides information in RDF*/SPARQL: When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
    • Links: Include links to other URIs, so that one can discover more things.
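
The four principles can be illustrated with a small, purely local simulation. The sketch below does not perform real HTTP requests: the dictionary `WEB` stands in for the Web, and all `example.org` URIs, the `locatedIn` property, and the function names are assumptions made for illustration only. Looking up a URI returns triples, and links inside those triples lead to further URIs that can be looked up in turn.

```python
from collections import deque

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Toy stand-in for the Web (hypothetical data): each HTTP URI "dereferences"
# to a list of (subject, predicate, object) triples describing that thing.
WEB = {
    "http://example.org/id/karlsruhe": [
        ("http://example.org/id/karlsruhe", RDFS_LABEL, "Karlsruhe"),
        ("http://example.org/id/karlsruhe",
         "http://example.org/vocab/locatedIn",
         "http://example.org/id/germany"),
    ],
    "http://example.org/id/germany": [
        ("http://example.org/id/germany", RDFS_LABEL, "Germany"),
    ],
}


def dereference(uri):
    """Principles 2 and 3: looking up an HTTP URI yields useful information."""
    return WEB.get(uri, [])


def crawl(start, max_hops=2):
    """Principle 4: follow links to other URIs to discover more things."""
    seen = {start}
    queue = deque([(start, 0)])
    triples = []
    while queue:
        uri, depth = queue.popleft()
        for s, p, o in dereference(uri):
            triples.append((s, p, o))
            # Objects that are HTTP URIs are links; literals are not followed.
            if o.startswith("http://") and o not in seen and depth < max_hops:
                seen.add(o)
                queue.append((o, depth + 1))
    return triples


for triple in crawl("http://example.org/id/karlsruhe"):
    print(triple)
```

In a real setting, `dereference` would issue an HTTP request with content negotiation for an RDF serialization; that part is deliberately left out here.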

Topics

  • There is no standard methodology for Ontology Engineering
  • NeOn Methodology is said to be the most comprehensive one
  • Reference: [Evaluation Framework for Ontology Development and Management Methodologies]
  • In the following, along the NeOn Methodology, we describe possible open questions regarding ontology engineering in the time of Linked Data.
  • Ontology Life Cycle according to NeOn
    • NeOn Methodology is a scenario-based methodology
    • Reference: [NeOn Methodology for Building Ontology Networks : a Scenario-based Methodology]
    • Each scenario is decomposed into different processes or activities.
    • We assume that our goal is to create an ontology for a Semantic Web application, for example in the areas of media (search engines, news), e-commerce (eBay, BestBuy), and cultural heritage (museums).

Linked Data for ontology engineering from scratch

  • Here, the role of Linked Data for ontology engineering from scratch is examined [4].
  • Creating an ontology from scratch requires continuous knowledge acquisition.
  • An ontology requirements specification document is created.
  • Conceptualization, formalization, and implementation are done according to On-To-Knowledge [5].
  • Important activities here are doing a feasibility study and kickoff, and iteratively refining and evaluating the ontology.
  • The result of this iterative development process is an ontology ready for roll-out into a productive system.
  • We assume that Linked Data has brought new requirements and opportunities to this main development process.

Topic: Using Semantic Wikis for knowledge acquisition

  • Ontology engineering is a social activity.
  • Wikis such as Wikipedia were recognized early on as a suitable ontology development environment [6].
  • Various wiki and wiki-like systems [7], [Argumentation-Based Ontology Engineering], [MoKi : The Enterprise Modelling Wiki Research Background : Enterprise Modelling], [Encyclopedic Knowledge Patterns from Wikipedia Links], [YAGO2 : Exploring and Querying World Knowledge in Time , Space , Context , and Many Languages] have been developed to support this social knowledge acquisition.
  • Semantic MediaWiki is commonly used to create ontological information collaboratively.
  • Similarly, publishing and consuming Linked Data is a social effort.
  • It is common practice to split up the effort of publishing and consuming and to allow continuous improvements, e.g., mappings between instances.
  • See [8] and [9] for examples of how Semantic MediaWiki is used together with Linked Data.
  • In this topic we want to answer the following question: How could these efforts of ontology engineering using social platforms be improved using Linked Data?
  • For instance, even inside a single semantic wiki it is difficult to create data that conforms to specific vocabularies/ontologies. Often, only forms and templates can help here.
  • This becomes even more difficult if several semantic wikis are to be linked. For instance, Wikia is probably the biggest user of SMW. Examples of wikis in Wikia that have SMW activated are familypedia.wikia.com, yugioh.wikia.com, www.wowwiki.com, glee.wikia.com and madmen.wikia.com. Can such information be integrated somehow, e.g., for semantic search over all these wikis? Can any new information be deduced from this information integration?
  • Student: Benjamin Kling
  • Supervisor: Benedikt Kämpgen

Topic: Evaluating ontologies with respect to Linked Data

  • An important step in ontology engineering is to evaluate the results.
  • OntoClean [10] is a methodology for evaluating taxonomic relationships captured in an ontology.
  • Evaluation frameworks such as SPIN have also been developed [Using SPARQL and SPIN for Data Quality Management on the Semantic Web].
  • There are various ways to define the quality of ontologies (e.g., for reuse, for usage, for evaluating engineering results).
  • Also relevant are Ontology Design Patterns and classic modelling errors.
  • In this topic, we want to deal with the following question: How can Linked Data be used to evaluate an ontology?
  • For instance, there are quality criteria for Linked Data sources [11].
  • There are also quality criteria for SKOS vocabularies [12].
  • Linked Data may provide a different view on coverage, or completeness; for instance, Linked Data may provide gold standards of cities.
  • The results of the investigation should be demonstrated using well-known ontologies and vocabularies, e.g., OpenCyc, Yago, Geonames, schema.org, and DBpedia.
  • Student: Dorothea Wieczorek
  • Supervisor: Denny Vrandecic

Topic: Contextualizing ontologies in Linked Data

  • Ontologies are shared models of a domain that encode a view which is common to a set of different parties.
  • Contexts are local models that encode a party’s subjective view of a domain.
  • Contexts and ontologies have both strengths and weaknesses and it is argued that if combined they could complement each other [13].
  • In this topic, we want to examine whether Linked Data can help to contextualize ontologies.

Linked Data for reuse of ontologies

  • Reuse is an important aspect of Ontology Engineering.
  • The reuse of ontological resources is encouraged by a recent increase in the number of online available ontologies, ontology libraries and repositories [14].
  • General or common ontologies provide conceptualization of generic topics such as time and space. Domain ontologies provide knowledge of a concrete domain such as medicine, pharmacy, fisheries.
  • Ontologies can be reused as a whole or only one part or module. Possibly, only single statements are reused.
  • Also, Ontology Design Pattern reuse is a type of reuse.

Topic: Methodologies of reuse with Linked Data

  • ONTOMETRIC [Gómez-Pérez, 2004] presents a method to measure the suitability of existing ontologies with respect to the requirements of a given system. There are other methodologies that focus on ontology reuse [Reusing ontologies on the Semantic Web: A feasibility study].
  • For instance, an approach to reuse ontologies is presented by [Ontology Reuse and Exploration via Interactive Graph Manipulation].
  • In this topic, we want to examine whether Linked Data can help with ontology reuse.
  • Linked Data encourages reuse by linking to other data sources: a stronger commitment to reuse instead of development from scratch.
  • Linked Data is more data-driven - data first, ontology second.
  • Linked Data tries to split up the effort of publishing information.
  • In this topic, changes to Methodologies w.r.t. Linked Data shall be proposed.
  • E.g., using Linked Data principles for publishing and reusing SW ontologies.
  • self-descriptiveness of SW ontologies (Bizer ESWC SS):
  1. Enable clients to retrieve the schema
  2. Reuse terms from common vocabularies
  3. Publish schema mappings for proprietary terms
  4. Provide provenance metadata
  5. Provide licensing metadata
  6. Provide data-set-level metadata using voiD
  7. Refer to additional access methods using voiD

Topic: Ontology Summarization using Linked Data

  • In order to help with reuse of ontologies, summarization methods have been developed that make important aspects of ontologies understandable [Ontology Summarization : An Analysis and An Evaluation], [Understanding an Ontology through Divergent Exploration].
  • In this topic, the benefit of Linked Data in summarizing ontologies shall be examined.
  • Student: Erwin Leung
  • Supervisor: Benedikt Kämpgen

Topic: Upper-level ontologies in Linked Data

  • Upper-level ontologies are typically the result of extensive discussions and considerations and allow more specific ontologies to be grounded. One approach to help reuse could be to map ontologies to upper-level ontologies so that they can be compared at a higher level [Mapping the Central LOD Ontologies to PROTON Upper-Level Ontology].
  • [The use of foundational ontologies in ontology development : an empirical assessment]

Topic: Communicating reuse of ontologies in Linked Data

  • How can an application state which vocabularies/ontologies/constraints data needs to conform to in order to be usable by that application? Such information may support Linked Data publishers or ontology engineers.
  • In this topic it shall be discussed how applications can communicate what ontologies they reuse.
  • Possibly useful information: [15]
  • Student: Tobias Berg
  • Supervisor: Benedikt Kämpgen
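
One conceivable way for an application to communicate its expectations is to declare the vocabulary namespaces it understands and to check incoming data against that declaration. The following is a minimal sketch, not an established mechanism: the names `REQUIRED_NAMESPACES`, `namespace_of`, and `unsupported_predicates` as well as the `example.org` data are assumptions for illustration. The FOAF and GoodRelations namespace URIs are the published ones.

```python
# Namespaces the (hypothetical) application declares it can process.
REQUIRED_NAMESPACES = {
    "http://xmlns.com/foaf/0.1/",
    "http://purl.org/goodrelations/v1#",
}


def namespace_of(uri):
    """Return the URI up to and including the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1]


def unsupported_predicates(triples, required=REQUIRED_NAMESPACES):
    """List predicates in the data that the application does not understand."""
    return sorted({p for _, p, _ in triples if namespace_of(p) not in required})


# Hypothetical data offered by a publisher.
data = [
    ("http://example.org/id/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.org/id/alice", "http://example.org/vocab/nick", "al"),
]

print(unsupported_predicates(data))
```

A publisher could use such a report to decide which proprietary terms to map to common vocabularies before offering the data to the application.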

Linked Data for Ontology Learning

  • Ontology learning deals with reusing and re-engineering of non-ontological resources [Ontology Learning, Cimiano, Mädche, Staab], [Ontology Learning for the Semantic Web], [Ontology Learning].
  • Ontology learning techniques serve the purpose of supporting an ontology engineer in the task of creating and maintaining an ontology.
  • Here, Machine Learning is applied to construct an ontology automatically.
  • Ontology learning techniques can be applied to
    • structured data such as databases
    • semi-structured data such as HTML/XML
    • unstructured textual documents

Topic: Ontology Learning on (semi-)structured data

  • In this topic, the benefit of Linked Data for learning ontological structures from (semi-)structured data shall be examined.
  • Possible (semi-)structured data
    • Lexica or Thesauri [GenTax : A Generic Methodology for Deriving OWL and RDF-S Ontologies from Hierarchical Classifications , Thesauri , and Inconsistent Taxonomies], [Ontology and the Lexicon - Handbook of ontologies]
    • relational databases [Learning ontology from relational database], [A Comparison of RDB-to-RDF Mapping Languages]
    • XML
    • HTML
    • Wikipedia infoboxes
    • Query logs, i.e., learning from applications, e.g., SPARQL queries against DBpedia [A Query-Driven Characterization of Linked Data]
  • Student: Tim Straub
  • Supervisor: Elena Simperl

Topic: Ontology Learning from unstructured data

  • Several methods and techniques have been developed for learning ontological structures from unstructured text. Important fields are Natural Language Processing, Information Extraction [Information Extraction - Handbook of ontologies], and Named Entity Recognition.
  • Various types of lexical resources can be exploited for learning, from named entity dictionaries to domain terminologies or ontological thesauri.
  • For instance, in [Learning by googling] the Google API is used to capture metadata about words. This metadata can then be used for ontology learning.
  • In this topic, we want to analyse possible benefits of Linked Data for this effort.
  • For instance, in [DBpedia Spotlight : Shedding Light on the Web of Documents] DBpedia is used to make sense of words in text.
  • Student: Volker Arrass
  • Supervisor: Elena Simperl

Linked Data for ontology matching

  • An important activity in ontology engineering is to find correspondences between two or more ontologies. We define an ontology mapping as a set of correspondences between components of two ontologies. These correspondences can be equivalence relationships, subclass or superclass relationships, transformation rules, and so on. The process of finding ontology mappings is often referred to as ontology matching [Ontology Mapping - Noy - Handbook of ontologies].
  • Ontologies are mostly compared pair-wise, even though a large number of ontologies might be compared [How Matchable Are Four Thousand Ontologies on the Semantic Web].
  • Ontology matching is often evaluated using artificial ontologies; newer approaches also consider real-world data [Benchmarking Matching Applications on the Semantic Web].
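
To make the notion of a mapping as a set of correspondences concrete, the following sketch compares two tiny ontologies pair-wise by label similarity. It is a deliberately naive illustration, not one of the matching approaches cited above: the prefixed identifiers, the labels, the function name `match`, and the threshold of 0.8 are all assumptions, and only equivalence correspondences are produced.

```python
from difflib import SequenceMatcher


def match(onto_a, onto_b, threshold=0.8):
    """Pair-wise label comparison of two ontologies.

    onto_a, onto_b: dicts mapping an entity identifier to its label.
    Returns a mapping, i.e. a list of correspondences
    (entity_a, entity_b, relation, confidence).
    """
    mapping = []
    for uri_a, label_a in onto_a.items():
        for uri_b, label_b in onto_b.items():
            score = SequenceMatcher(None, label_a.lower(), label_b.lower()).ratio()
            if score >= threshold:
                mapping.append((uri_a, uri_b, "equivalence", round(score, 2)))
    return mapping


# Hypothetical ontologies with made-up identifiers.
onto_a = {"a:Company": "Company", "a:Person": "Person"}
onto_b = {"b:Organization": "Organization", "b:company": "company", "b:Persons": "Persons"}

for correspondence in match(onto_a, onto_b):
    print(correspondence)
```

Note that purely lexical matching misses the relationship between "Company" and "Organization"; this is the kind of gap where additional evidence, for instance from instance data, might help.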

Topic: Schema matching

  • In this topic, the benefits of Linked Data for finding correspondences in the terminological knowledge of ontologies, e.g., concepts and properties, shall be examined.
  • In Linked Data, schema matching may be less relevant, as most knowledge is contained in instances. Still, there is work on schema matching as well:
  • [A Survey of Schema-based Matching Approaches]
  • [Ontology Alignment for Linked Open Data]

Topic: Instance matching

  • Instance matching deals with the problem of finding resources that mean the same thing.
  • Linked Data mostly contains links between instances, although these are not always semantically correct [When owl:sameAs isn’t the same: An analysis of identity links on the Semantic Web].
  • [A Self-Training Approach for Resolving Object Coreference on the Semantic Web]
  • One can use rules to find matching instances [Silk – A Link Discovery Framework for the Web of Data].
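
A rule for finding matching instances can be sketched as a predicate over two resource descriptions. The example below is not written in Silk's own link specification language; it is a plain Python approximation under assumptions: the resources, their coordinates, the rule `same_place`, and the tolerance of 0.05 degrees are invented for illustration. Only the `owl:sameAs` URI is the standard one.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_place(a, b, max_delta=0.05):
    """Linkage rule: equal normalised labels AND coordinates within max_delta degrees."""
    return (
        a["label"].strip().lower() == b["label"].strip().lower()
        and abs(a["lat"] - b["lat"]) <= max_delta
        and abs(a["long"] - b["long"]) <= max_delta
    )


def discover_links(source, target, rule=same_place):
    """Apply the rule to every source/target pair and emit owl:sameAs triples."""
    return [
        (s_uri, OWL_SAME_AS, t_uri)
        for s_uri, s in source.items()
        for t_uri, t in target.items()
        if rule(s, t)
    ]


# Hypothetical datasets describing places.
source = {
    "http://example.org/a/karlsruhe": {"label": "Karlsruhe", "lat": 49.01, "long": 8.40},
    "http://example.org/a/paris": {"label": "Paris", "lat": 48.86, "long": 2.35},
}
target = {
    "http://example.org/b/1": {"label": "karlsruhe ", "lat": 49.00, "long": 8.40},
    "http://example.org/b/2": {"label": "Paris", "lat": 33.66, "long": -95.56},
}

for link in discover_links(source, target):
    print(link)
```

The two resources labelled "Paris" are not linked because their coordinates differ; a rule based on labels alone would have produced exactly the kind of semantically incorrect identity link discussed above.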

Topic: Linked Data for localizing ontologies

  • In ontology localization developers adapt an existing ontology to one or various languages and culture communities, obtaining as a result a multilingual ontology.
  • Localization is also relevant for Linked Data publishing, e.g., if a dataset or vocabulary is to be described with metadata in several languages.
  • In this topic, it shall be examined how Linked Data can help to localize an ontology.
  • M. Espinoza, A. Gómez-Pérez, and E. Mena. Enriching an ontology with multilingual information. In Proceedings of the European Semantic Web Conference (ESWC 2008), pages 333–347, 2008.
  • M. Espinoza, A. Gómez-Pérez, and E. Montiel-Ponsoda. Multilingual and localization support for ontologies. In Proceedings of the European Semantic Web Conference (ESWC 2009), pages 821–825, 2009.
  • M. Espinoza, E. Montiel-Ponsoda, and A. Gómez-Pérez. Ontology localization. In Proceedings of the 5th International Conference on Knowledge Capture (KCAP), pages 33–40, 2009.
  • P. Cimiano, E. Montiel-Ponsoda, P. Buitelaar, M. Espinoza, A. Gómez-Pérez. A Note on Ontology Localization - Journal of Applied Ontology 5(2), 2010.
  • Student: Victoria Kayser
  • Supervisor: Denny Vrandecic

Topic: Ontology Design Pattern Reuse

  • There are Ontology Design Patterns for Ontology Engineering.
  • They remind engineers of typical design errors that can be avoided.
  • E.g., [Ontology Design Patterns for Semantic Web Content] and [16]
  • They are used to reduce modeling difficulties, to speed up the modeling process, or to check the adequacy of modeling decisions.
  • In Linked Data, best practices and design patterns are also available [Linked Data Patterns A pattern catalogue for modelling , publishing , and consuming Linked Data], e.g., [17]
  • In this topic, Ontology Design Patterns and Linked Data Design Patterns shall be compared. Would it make sense to update certain ontology design patterns due to Linked Data?
  • Student: Raoul Strohhäker
  • Supervisor: Benedikt Kämpgen

Topic: Restructuring ontological resources

  • Ontologies can seldom be kept up to date for long. Domains change, and so should their ontological representations.
  • Ontological dynamics have been examined [E-Business Vocabularies as a Moving Target : Quantifying the Conceptual Dynamics in Domains].
  • In this field, versioning of ontologies is an important research issue [Automatic Identification of Ontology Versions Using Machine Learning Techniques]. Similarly, ontology evolution [Consistent Evolution of OWL Ontologies] deals with the difficulties of refining and updating ontologies.
  • In this topic, the relevance of Linked Data in research about ontology evolution and versioning shall be examined. For instance, ontology evolution techniques have often been evaluated using a limited amount of data [Understanding and Supporting Ontology Evolution by Observing the WWW Conference]. Linked Data may provide data to do this evaluation.
  • Linked Data also has to deal with problems of dynamicity. For instance, if URIs in vocabularies or datasets in Linked Data change, applications may no longer work.
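
The effect of changing URIs on applications can be made tangible by comparing the term sets of two vocabulary versions. This is a minimal sketch under assumptions: the vocabulary at `example.org`, its terms, and the function names `diff_versions` and `broken_references` are hypothetical, and real versioning approaches would also consider changed definitions, not only added or removed URIs.

```python
def diff_versions(old_terms, new_terms):
    """Return the URIs removed from and added to a vocabulary between two versions."""
    old, new = set(old_terms), set(new_terms)
    return {"removed": sorted(old - new), "added": sorted(new - old)}


def broken_references(used_terms, old_terms, new_terms):
    """Return the terms an application relies on that no longer exist."""
    removed = set(old_terms) - set(new_terms)
    return sorted(set(used_terms) & removed)


# Hypothetical vocabulary versions and application usage.
v1 = ["http://example.org/vocab#Product", "http://example.org/vocab#price"]
v2 = ["http://example.org/vocab#Product", "http://example.org/vocab#hasPrice"]
app_uses = ["http://example.org/vocab#price"]

print(diff_versions(v1, v2))
print(broken_references(app_uses, v1, v2))
```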