ENCRYPT Blog Series #9: Knowledge Graphs

Knowledge Graphs

by Mary Papoutsoglou Ph.D.*, Georgios Meditskos Professor* and Christina Karalka MSc*

*Aristotle University of Thessaloniki

The Semantic Web represents an evolution of the World Wide Web, envisioning a space where data can be shared, reused, and interconnected seamlessly. Conceived by Tim Berners-Lee, its primary goal is to transform the web from a space filled with unstructured documents into one where information is structured and defined in a way that machines can understand. By using standards such as the Resource Description Framework (RDF), the Web Ontology Language (OWL), and the SPARQL query language, the Semantic Web enables data integration and interoperability. It facilitates a web that “understands” and responds to complex user needs by assembling information from diverse sources, paving the way for smarter search engines, enhanced knowledge management, and innovative applications.

Ontologies are foundational pillars in the realm of knowledge representation, offering structured frameworks to define concepts, relationships, and categories within a domain. Originating from philosophy, the term “ontology” in computer science and the Semantic Web context denotes a formal specification of a conceptualization. Through ontologies, disparate systems can communicate, ensuring consistent understanding and interpretation of data. Utilizing standard languages like OWL, ontologies describe entities, their attributes, and the relational dynamics among them. In essence, they provide a blueprint for the organization of knowledge, enabling intelligent systems to comprehend and navigate complex information landscapes.

Knowledge graphs are networks that represent information as nodes (entities) and edges (relationships), capturing the interconnections within a domain of knowledge. Because the data is held in this structured form, it can be queried, integrated, and reasoned over more easily. Central to their efficacy is their ability to combine diverse data sources into a cohesive, navigable structure, paving the way for smarter AI applications and improved information retrieval.
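To make the node-and-edge picture concrete, here is a minimal sketch in plain Python. It is not how ENCRYPT stores its graphs; a real deployment would typically use an RDF triple store. All entity and relationship names (alice, acme_bank, worksFor and so on) are hypothetical and chosen only for illustration.

```python
from typing import Optional

# Each edge is a (subject, predicate, object) triple.
# All names below are hypothetical illustration data.
Triple = tuple[str, str, str]

graph: set[Triple] = {
    ("alice", "worksFor", "acme_bank"),
    ("acme_bank", "locatedIn", "thessaloniki"),
    ("alice", "holds", "account_42"),
    ("account_42", "heldAt", "acme_bank"),
}


def match(
    graph: set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def neighbours(graph: set[Triple], node: str) -> set[str]:
    """Return every node linked to `node` by an edge in either direction."""
    outgoing = {o for s, _, o in graph if s == node}
    incoming = {s for s, _, o in graph if o == node}
    return outgoing | incoming


# Everything the graph states about "alice".
about_alice = match(graph, s="alice")

# Entities directly connected to the bank, whatever the relationship.
bank_links = neighbours(graph, "acme_bank")
```

The pattern-matching function mirrors, in a very reduced form, what a graph query language does: fix some positions of a triple and let the others vary.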

The main objective of using knowledge graphs in the ENCRYPT project is to introduce a semantic layer on top of the available data relevant to the use cases, with the aim of interlinking and contextually enriching the schemata and data in an interoperable manner. Knowledge Graphs are used as the underlying technology to promote interoperability, extensibility and sharing of information. They enable end users to acquire a better understanding of the schema and dependencies of the data, while securing the future use and integration of the data by other organizations, even in different domains.

ENCRYPT applies Knowledge Graphs in the three use cases of the project: financial, health, and Cyber Threat Intelligence (CTI). To support these use cases, the following well-established ontologies have been identified as candidate conceptual models for the project:

Fintech: The Financial Industry Business Ontology (FIBO [1]) is designed to represent a wide range of financial terms and concepts relevant to business applications.
CTI: The Unified Cyber Ontology (UCO [2]) is a community-developed model that captures various cyber-related concepts, including threats, vulnerabilities, and incidents.
Health: The DICOM Controlled Terminology [3] defines an extensive code set for describing and annotating medical imaging data and related procedures. Additionally, SNOMED Clinical Terms (SNOMED CT) is a widely used clinical terminology that covers a vast array of medical concepts.
The Data Privacy Vocabulary (DPV) encompasses concepts related to data privacy and protection, expressing various aspects such as personal data categorization, consent management and legal compliance.

An example of the use of Knowledge Graphs is depicted in the following figure, which shows the interconnection between the initial set of data and schemas, the ontology that describes these data, and the knowledge graph that is produced from the ontology and the related instances. This structure can be used to extract knowledge through techniques such as SPARQL queries. More specifically, the semantics encapsulated in the Knowledge Graphs can be used to automatically infer new facts based on predefined logical axioms (RML rules), while statistical learning and pattern recognition techniques can be applied to discover new knowledge from graph-structured data.
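The idea of combining an ontology with instances and then inferring new facts can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the ENCRYPT implementation: the class names (ex:Ransomware, ex:Malware, ex:Threat) and the instance ex:sample1 are hypothetical and not taken from UCO, and the only axioms applied are two standard RDFS-style rules (subclass transitivity and type inheritance). A production system would normally rely on a triple store and a reasoner rather than a hand-written loop.

```python
Triple = tuple[str, str, str]

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Ontology: a tiny, hypothetical class hierarchy.
ontology: set[Triple] = {
    ("ex:Ransomware", SUBCLASS, "ex:Malware"),
    ("ex:Malware", SUBCLASS, "ex:Threat"),
}

# Instances: data described with the ontology's terms (also hypothetical).
instances: set[Triple] = {
    ("ex:sample1", TYPE, "ex:Ransomware"),
}


def infer(triples: set[Triple]) -> set[Triple]:
    """Forward-chain two RDFS-style rules until no new facts appear."""
    facts = set(triples)
    while True:
        new: set[Triple] = set()
        for a, p1, b in facts:
            for c, p2, d in facts:
                if b != c:
                    continue
                # Rule 1: subClassOf is transitive.
                if p1 == SUBCLASS and p2 == SUBCLASS:
                    new.add((a, SUBCLASS, d))
                # Rule 2: an instance of a class is an instance of its superclasses.
                if p1 == TYPE and p2 == SUBCLASS:
                    new.add((a, TYPE, d))
        if new <= facts:
            return facts
        facts |= new


# The knowledge graph is the ontology plus instances plus inferred facts.
kg = infer(ontology | instances)

# Roughly what the SPARQL pattern `?s rdf:type ex:Threat` would select.
threats = sorted(s for s, p, o in kg if p == TYPE and o == "ex:Threat")
```

Nothing in the input states that ex:sample1 is a threat; that fact only appears after the rules have been applied, which is the kind of implicit knowledge a query over the enriched graph can surface.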

[1] https://github.com/edmcouncil/fibo

[2] https://github.com/Ebiquity/Unified-Cybersecurity-Ontology

[3] https://bioportal.bioontology.org/ontologies/DCM