Vision
Existing Privacy-Preserving (PP) technologies such as Homomorphic Encryption (HE), Secure Multi-Party Computation (SMPC), Trusted Execution Environments (TEE) and Differential Privacy (DP), although promising at small scale, still need to overcome several limitations before they can become mainstream security solutions. Moreover, none of these PP techniques can serve as a standalone security solution: in most cases a mix of them has to be deployed to cover the full spectrum of possible cyber-threats while taking into account the regulations, the end users' needs and the infrastructure already in place.
At the same time, large amounts of data are available today, waiting to be used to address new challenges and to develop better research and digital services. Perhaps the most common example is the design of new, more accurate machine learning models by learning over massive datasets in a federated manner. However, this data usually involves sensitive or personal information, and the major impediment to processing it lies in potential cyber-security attacks and in its disclosure or misuse.
Data protection regulations, and the EU’s strict rules on personal data in particular, are additional constraints that must be taken into account when handling personal data. To address these issues, advanced privacy-preserving computation technologies such as Fully Homomorphic Encryption (FHE), SMPC or DP can provide valid GDPR-compliant solutions once they become more scalable and reliable, i.e., ready for realistic scenarios.
Both FHE and SMPC solutions for protecting data in use have scalability issues when handling large volumes of data: FHE incurs a high computational overhead to process encrypted data, while SMPC incurs high communication costs for secret sharing. Another common limitation is that their integration with existing networking infrastructure and security protocols is a neglected aspect of ongoing research. Each of these technologies also has its own specific limitations: homomorphic encryption in its most standard setting requires trust in a third party, while SMPC needs the parties to keep connections alive and to interact continuously.
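To make the communication cost of secret sharing concrete, the following is a minimal sketch of additive secret sharing used to compute a sum among several parties. It is an illustration under stated assumptions, not a description of the ENCRYPT design: the modulus, the function names and the all-in-one-process simulation of the network are all hypothetical choices made for the example.

```python
import secrets

# Illustrative modulus (the Mersenne prime 2^61 - 1); an assumption for this sketch.
PRIME = 2**61 - 1


def share(value, n_parties, modulus=PRIME):
    """Split `value` into n additive shares that sum to `value` mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def reconstruct(shares, modulus=PRIME):
    """Recombine additive shares into the shared value."""
    return sum(shares) % modulus


def secure_sum(inputs, modulus=PRIME):
    """Simulate a sum over private inputs; return (result, shares sent)."""
    n = len(inputs)
    # Row i holds the shares created by party i; entry j is sent to party j.
    dealt = [share(x, n, modulus) for x in inputs]
    # Each party j adds up the shares it received, one from every party.
    partial = [sum(dealt[i][j] for i in range(n)) % modulus for j in range(n)]
    # Every party sends one share to each of the other n - 1 parties.
    messages = n * (n - 1)
    return reconstruct(partial, modulus), messages


if __name__ == "__main__":
    total, messages = secure_sum([3, 5, 10])
    print(total, messages)
```

Even in this simplest setting, the number of shares exchanged grows quadratically with the number of parties, and all parties must be reachable while the shares are distributed and the partial sums are combined.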
As a privacy-preserving technique for machine learning, DP is well suited to low-sensitivity queries, which means that for certain families of queries (e.g., sum) it is more difficult to design effective DP mechanisms. Another drawback is that DP requires a predefined privacy budget that depends linearly on a fixed number of queries, which can affect its utility in practice and makes it complex to apply in adaptive settings. Finally, hardware solutions such as SGX provide fast and trusted computation, but only for small workloads, so they also face scalability difficulties. This weakness makes them difficult to apply to large-scale aggregated computations that involve many users’ inputs (large overhead due to limited paging).
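The link between the privacy budget, the number of queries and the query sensitivity can be sketched with the Laplace mechanism under basic sequential composition. This is a minimal illustration under stated assumptions: the class name `BudgetedSum`, the equal split of the budget and the clipping of inputs to `[0, clip]` (used here to bound the sensitivity of the sum) are hypothetical choices for the example, not part of the project.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class BudgetedSum:
    """Answer at most `max_queries` sum queries under a total budget."""

    def __init__(self, total_epsilon, max_queries, clip, seed=None):
        # Basic sequential composition: the budget is split evenly, so the
        # total budget grows linearly with the number of queries allowed.
        self.eps_per_query = total_epsilon / max_queries
        self.remaining = max_queries
        self.clip = clip  # assumed bound on each individual contribution
        self.rng = random.Random(seed)

    @property
    def noise_scale(self):
        # Laplace scale = sensitivity / epsilon; clipping to [0, clip]
        # makes the sensitivity of the sum equal to `clip`.
        return self.clip / self.eps_per_query

    def noisy_sum(self, values):
        if self.remaining == 0:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= 1
        clipped = [min(max(v, 0.0), self.clip) for v in values]
        return sum(clipped) + laplace_noise(self.noise_scale, self.rng)


if __name__ == "__main__":
    mechanism = BudgetedSum(total_epsilon=1.0, max_queries=4, clip=10.0, seed=0)
    print(mechanism.noise_scale)
    print(mechanism.noisy_sum([1.0, 2.0, 3.0]))
```

The noise scale grows with both the sensitivity and the number of queries planned in advance, which is why high-sensitivity queries and adaptive workloads, where the number of queries is not known beforehand, are harder to serve with useful accuracy.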
The ENCRYPT project’s vision is to go beyond the state of the art and overcome the limitations of these Privacy-Preserving technologies in several respects. First, it will address the scalability issue by going beyond the single-key FHE paradigm and exploring the application and practicability of new multi-key and threshold FHE schemes, especially in a federated context. Second, to address the drawbacks of each technology in terms of covered threats and performance, ENCRYPT will investigate combinations of several of these PP methods: TEE with HE, SMPC with HE, DP with HE, etc. Third, ENCRYPT will address the slow computation times of existing privacy-preserving solutions based on HE or SMPC by providing hardware acceleration in a user-friendly way: users will not have to write GPU code for FHE, but will instead obtain it from the TornadoVM compiler. Fourth, ENCRYPT will look at the methods needed to make these advanced PP technologies easier to integrate with existing infrastructures and more traditional security mechanisms. In particular, it will investigate the use and application of transciphering for FHE, which makes it possible to switch from “traditional” symmetric encryption to homomorphic encryption without decrypting the sensitive data. This powerful method will make it possible not only to keep almost standard symmetric cryptography on the clients’ terminals but also to reduce the bandwidth required to exchange the encrypted data (thus also addressing the scalability drawback).
Since a major impediment to the adoption of these PP technologies is also their lack of “user-friendliness”, ENCRYPT will also provide an AI-based recommendation system that helps to choose one of them, or a combination of them, and to configure them according to the system’s requirements and the identified needs in terms of performance and protection of the users’ personal data. Finally, the proposed solutions will be developed and validated in several settings and real-world use cases, including the challenging cross-border federated processing of large datasets.