An innovative framework for privacy-preserving healthcare data management was recently presented by Apostolos Mavridis at the SWAT4HCLS conference in Barcelona. This methodology integrates Large Language Models (LLMs), ontologies and vector databases for healthcare data processing and analysis.
The research, conducted by the ENCRYPT team from Aristotle University of Thessaloniki, addresses the challenge of mapping medical terminology to RDF Knowledge Graphs efficiently while preserving privacy. Traditional methods, which often rely on rule-based systems, have struggled to scale and to adapt to new medical conditions. The team’s use of LLMs trained on biomedical sources such as the SNOMED CT clinical terminology offers a more dynamic and scalable alternative.

Key aspects of the proposed system include preprocessing and standardization of medical terms, LLM-based semantic interpretation and hybrid retrieval methods that combine vector embeddings with keyword search. This approach not only enhances the semantic understanding of medical data but also ensures compliance with stringent privacy regulations.
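To make the hybrid retrieval idea concrete, here is a minimal sketch in Python. It is not the ENCRYPT team's implementation: the concept identifiers (`C001` and so on), the tiny vocabulary, the `alpha` weight and the character-trigram vectors standing in for real LLM embeddings are all assumptions made for illustration. It only shows how a normalized term can be scored by blending a vector similarity with a keyword match.

```python
import math
import re
from collections import Counter

# Hypothetical mini-vocabulary; the IDs are placeholders, not real ontology codes.
CONCEPTS = {
    "C001": "myocardial infarction heart attack",
    "C002": "diabetes mellitus type 2",
    "C003": "hypertension high blood pressure",
}


def normalize(term):
    """Preprocessing step: lowercase, replace punctuation with spaces, tokenize."""
    return re.sub(r"[^a-z0-9 ]+", " ", term.lower()).split()


def trigram_vector(tokens):
    """Toy stand-in for an embedding: counts of character trigrams."""
    text = " ".join(tokens)
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def keyword_score(query_tokens, doc_tokens):
    """Fraction of query tokens that appear verbatim in the candidate."""
    if not query_tokens:
        return 0.0
    doc = set(doc_tokens)
    return sum(1 for t in query_tokens if t in doc) / len(query_tokens)


def hybrid_search(query, alpha=0.5):
    """Rank concepts by alpha * vector similarity + (1 - alpha) * keyword match."""
    q_tokens = normalize(query)
    q_vec = trigram_vector(q_tokens)
    scored = []
    for concept_id, text in CONCEPTS.items():
        d_tokens = normalize(text)
        score = alpha * cosine(q_vec, trigram_vector(d_tokens)) + (
            1 - alpha
        ) * keyword_score(q_tokens, d_tokens)
        scored.append((concept_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for concept_id, score in hybrid_search("Heart-attack"):
        print(concept_id, round(score, 3))
```

In a real system the trigram vectors would presumably be replaced by embeddings stored in a vector database, and the keyword component by a full-text index; the weighting between the two is a tuning choice rather than a fixed rule.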
In an evaluation by experts, the framework demonstrated high accuracy and scalability. It reduces the need for manual curation of medical ontologies and could change how medical data is managed.
This research aligns with the ENCRYPT project’s goals of advancing privacy technology and methodologies in the healthcare sector, ensuring that sensitive information is handled with the utmost care while maintaining high standards of data utility and accessibility.