Abstract: The EU-funded CyberAid project enables financial institutions to train AI models and collaboratively compute functions, such as fraud detection, without sharing raw data. By integrating federated learning with advanced cryptographic techniques, such as homomorphic encryption and secure multi-party computation (MPC), it ensures privacy and regulatory compliance. This scalable, policy-driven approach allows for robust, joint analysis while protecting sensitive data from exposure.
Keywords: Financial Sector Security, secure federated learning, privacy-enhancing technologies, GDPR compliance.
The financial sector has become one of the most data-intensive domains for artificial intelligence. From credit risk assessment and fraud detection to algorithmic trading and anti-money laundering, modern financial institutions rely heavily on training AI models using vast and sensitive datasets. These datasets include transaction histories, personal identifiers, and behavioral patterns, all of which are highly confidential and subject to strict regulatory requirements. This creates a fundamental tension: institutions must leverage large-scale data to build accurate and robust models while ensuring strong privacy guarantees and regulatory compliance.
Privacy-preserving federated learning (FL) has emerged as a key paradigm to address this challenge. Federated learning enables multiple institutions to collaboratively train machine learning models without sharing raw data. Instead of centralizing datasets, each participant (for example a bank, an insurer, or a financial intermediary) trains a local model on its own data and shares only model updates, such as gradients or parameters. These updates are then aggregated to produce a global model. This decentralized approach aligns well with regulatory frameworks and reduces the risk of data breaches, as sensitive information remains under the control of each organization.
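The aggregation step described above can be sketched as a weighted average of client parameters. This is a minimal illustration that assumes FedAvg-style weighting by local sample count; the function name, the parameter layout, and the two example "banks" are hypothetical and are not taken from the CyberAid project.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Params, int]]) -> Params:
    """Average client parameters, weighting each client by its sample count."""
    if not updates:
        raise ValueError("no client updates to aggregate")
    total_samples = sum(n for _, n in updates)
    first_params, _ = updates[0]
    # Start from zeros with the same shape as the first client's parameters.
    global_params = {name: [0.0] * len(vals) for name, vals in first_params.items()}
    for params, n in updates:
        weight = n / total_samples
        for name, vals in params.items():
            for i, v in enumerate(vals):
                global_params[name][i] += weight * v
    return global_params


# Two illustrative participants; only parameters and sample counts are shared.
bank_a = ({"w": [1.0, 2.0], "b": [0.0]}, 100)
bank_b = ({"w": [3.0, 4.0], "b": [1.0]}, 300)
global_model = federated_average([bank_a, bank_b])
```

Here the participant with more local data contributes proportionally more to the global model, while neither participant's raw records leave its premises.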
However, federated learning alone does not fully guarantee privacy. Model updates may still leak sensitive information through advanced inference attacks, such as gradient inversion or membership inference. These attacks can potentially reconstruct parts of the training data or reveal whether specific records were used during training. To mitigate these risks, federated learning must be combined with advanced cryptographic techniques that provide stronger confidentiality guarantees.
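To see why a shared update can leak data, consider a deliberately simple case: a linear model with squared loss, trained on a single record. The gradient with respect to the weights is the prediction error times the input, and the gradient with respect to the bias is the error itself, so dividing one by the other recovers the input exactly. The sketch below is a toy under these stated assumptions; practical gradient inversion attacks on deep models are typically optimization-based and only approximate. All names and values are hypothetical.

```python
from typing import List, Tuple


def local_gradient(w: List[float], b: float, x: List[float], y: float) -> Tuple[List[float], float]:
    """Gradient of 0.5 * (w.x + b - y)^2 with respect to (w, b) for one record."""
    error = sum(wi * xi for wi, xi in zip(w, x)) + b - y
    return [error * xi for xi in x], error


def reconstruct_input(grad_w: List[float], grad_b: float) -> List[float]:
    """Recover the private input from the shared gradient alone."""
    if grad_b == 0:
        raise ValueError("zero error: this gradient carries no information about x")
    return [g / grad_b for g in grad_w]


# The participant computes a gradient on one private record ...
weights, bias = [0.5, -0.25, 0.1], 0.0
private_record = [120.0, 3.0, 1.0]  # illustrative feature values
grad_w, grad_b = local_gradient(weights, bias, private_record, y=1.0)

# ... and an observer of the update reconstructs the record.
recovered = reconstruct_input(grad_w, grad_b)
```

The observer needs no access to the dataset, only to the update that plain federated learning would transmit.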
Two key techniques in this space are homomorphic encryption (HE) and secure multi-party computation (MPC). Homomorphic encryption enables computations to be performed directly on encrypted data, allowing participants to encrypt their model updates before sharing them for aggregation. Secure multi-party computation allows multiple parties to jointly compute a function over their inputs while keeping those inputs private. In federated learning, MPC can be used to securely aggregate model updates without revealing individual contributions. Together, these approaches enable secure collaboration and, when instantiated with quantum-resistant primitives such as lattice-based encryption schemes, can offer resilience against emerging threats, including those posed by quantum-capable adversaries.
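One simple way to realize MPC-based secure aggregation is additive secret sharing: each participant splits its update into random shares that sum to the true value, and only sums of shares are ever published. The sketch below simulates all parties in one process and assumes honest-but-curious participants; the modulus, the fixed-point scale, and the function names are illustrative choices, not a description of the protocol used in CyberAid.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # illustrative modulus for share arithmetic
SCALE = 10**6        # fixed-point scale for encoding floats as integers


def encode(value: float) -> int:
    return round(value * SCALE) % MODULUS


def decode(value: int) -> float:
    # Values in the upper half of the range represent negative numbers.
    if value > MODULUS // 2:
        value -= MODULUS
    return value / SCALE


def share(secret: int, n_parties: int) -> List[int]:
    """Split a secret into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values: List[float]) -> float:
    n = len(private_values)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [share(encode(v), n) for v in private_values]
    # Each party j publishes only the sum of the shares it received.
    partial_sums = [sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return decode(sum(partial_sums) % MODULUS)


updates = [0.125, -0.5, 0.25]  # one scalar update per participant
aggregate = secure_sum(updates)
```

Any single share is uniformly random, so a party that sees fewer than all shares of a value learns nothing about it; only the aggregate is revealed.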
These methods fall under the broader category of privacy-enhancing technologies (PETs), which are becoming increasingly important in financial AI systems. Their application extends beyond model training to collaborative analytics tasks such as fraud detection and risk analysis. Since fraudulent activities often span multiple institutions, the ability to jointly analyze distributed data can significantly improve detection accuracy. At the same time, PETs ensure that sensitive data remains protected, enabling collaboration without compromising confidentiality.
Scalability remains a major challenge. Financial systems operate at massive scale, with millions of transactions and numerous participants. Federated learning frameworks must support large numbers of clients, high-dimensional models, and frequent updates, all while maintaining low latency and high reliability. Cryptographic techniques, while essential, introduce computational and communication overhead. Homomorphic encryption can be significantly slower than plaintext computation, and MPC protocols may require additional communication and coordination.
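The source of the overhead becomes visible in even a toy additively homomorphic scheme. The sketch below uses textbook Paillier encryption purely for brevity; the text does not name a specific scheme, so this choice is an assumption, and the hard-coded primes are far too small to be secure. Each encryption costs a modular exponentiation and each encrypted addition a multiplication modulo n squared, compared with a single machine addition in plaintext.

```python
import math
import secrets
from typing import Tuple

# Toy parameters: two well-known Mersenne primes. NOT secure key sizes.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p: int = P, q: int = Q) -> Tuple[int, Tuple[int, int]]:
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # With generator g = n + 1, g^m mod n^2 equals 1 + m*n.
    return ((1 + m * n) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n: int, private_key: Tuple[int, int], c: int) -> int:
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


n, private_key = keygen()
c_a = encrypt(n, 1200)  # integer-encoded update from participant A
c_b = encrypt(n, 345)   # integer-encoded update from participant B
c_sum = add_encrypted(n, c_a, c_b)  # the aggregator never sees 1200 or 345
total = decrypt(n, private_key, c_sum)
```

With realistic key sizes and high-dimensional models, these big-integer operations are repeated per parameter and per round, which is the scaling cost the paragraph above refers to.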
The EU-funded project CyberAid is designed to address these challenges by advancing scalable, privacy-preserving federated learning for the financial sector. CyberAid integrates federated learning with advanced cryptographic methods such as homomorphic encryption and secure multi-party computation, alongside a comprehensive suite of privacy-enhancing technologies. To ensure flexibility and real-world applicability, the project explores multiple deployment frameworks, including NVIDIA FLARE and PySyft. These frameworks support distributed training, secure orchestration, and the integration of privacy-preserving mechanisms.
An important aspect of the CyberAid approach is the use of policy-driven controls. Input and output policies, defined and approved by data owners, are enforced to regulate how data is used and which computations are permitted. These policies specify what functions can be executed, on which datasets, and under what conditions, ensuring that collaborative processes remain controlled, auditable, and compliant.
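As a rough illustration, such policies can be thought of as declarative records that are checked before a computation runs and again before a result is released. The sketch below is hypothetical throughout: the class names, fields, and the minimum-participant and minimum-record conditions are assumptions chosen for illustration and do not reflect the actual CyberAid policy model.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class InputPolicy:
    """Hypothetical policy a data owner attaches to one dataset."""
    dataset: str
    allowed_functions: FrozenSet[str]
    approved_by_owner: bool
    min_participants: int  # example condition: require a joint computation


@dataclass(frozen=True)
class OutputPolicy:
    """Hypothetical rule on what may leave the computation."""
    min_records_in_result: int  # example condition: aggregates only


@dataclass(frozen=True)
class Request:
    function: str
    dataset: str
    participants: int


def may_execute(policy: InputPolicy, request: Request) -> bool:
    """Check which function runs, on which dataset, under what conditions."""
    return (
        policy.approved_by_owner
        and request.dataset == policy.dataset
        and request.function in policy.allowed_functions
        and request.participants >= policy.min_participants
    )


def may_release(policy: OutputPolicy, records_in_result: int) -> bool:
    return records_in_result >= policy.min_records_in_result


input_policy = InputPolicy(
    dataset="transactions_2024",
    allowed_functions=frozenset({"fraud_score_training"}),
    approved_by_owner=True,
    min_participants=3,
)
output_policy = OutputPolicy(min_records_in_result=1000)

allowed = may_execute(input_policy, Request("fraud_score_training", "transactions_2024", 4))
blocked = may_execute(input_policy, Request("export_rows", "transactions_2024", 4))
```

Keeping the checks as pure functions over immutable records makes each decision easy to log, which supports the auditability goal stated above.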
Ultimately, collaborative approaches enable more accurate and robust model training and inference by leveraging distributed data insights. However, such collaboration must remain compliant with regulatory frameworks such as the General Data Protection Regulation (GDPR), which impose strict requirements on the handling of sensitive data. By combining federated learning, advanced cryptography, and scalable system design, CyberAid enables financial institutions to collaborate effectively without exposing raw data, thereby ensuring compliance, enhancing analytical capabilities, and maintaining trust.



