The problem: why traditional email security falls short
Imagine you’re a financial controller at a wealth management firm. Your CEO emails you at 4pm asking for an urgent EUR 20,000 wire transfer — confidential acquisition, don’t tell anyone, needs to go out today. The email passes your spam filter. SPF, DKIM, DMARC? The attacker registered their own domain, so everything checks out. The display name says “John Smith CEO.” It looks real.
This is the reality of Business Email Compromise (BEC), one of the most financially damaging forms of cyber attack targeting financial services today. Traditional rule-based filters were designed to catch spam and commodity phishing. They are not equipped to handle the kind of targeted, context-aware social engineering that wealth management firms face daily.
As part of the CyberAId project, the University of Piraeus Research Center (UPRC) team has been working on a different approach. In close collaboration with our wealth management partner KM CUBE Asset Management [KM3], who brought real-world operational insight into the kinds of attacks wealth managers actually encounter, we developed an EMTD (Email Threat Detection) pipeline: a multi-agent AI framework purpose-built for the financial sector.
Architecture: a five-stage detection pipeline
In Stage 1, eight deterministic pre-checks execute, validating email authentication (SPF, DKIM, DMARC), detecting reply-to mismatches, Unicode homoglyph substitutions, zero-width character injection, and display name inconsistencies. These fast checks provide immediate signals and contextual evidence for the subsequent LLM classification.
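As a rough illustration of what such deterministic pre-checks might look like, here is a minimal Python sketch of three of them: reply-to mismatch, zero-width character injection, and mixed-script (homoglyph) detection. The function names, the zero-width character set, and the mixed-script heuristic are our own assumptions for illustration, not the project's actual implementation.

```python
import unicodedata
from email.utils import parseaddr

# Assumed set of invisible characters; a real check may cover more code points.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def domain_of(header_value: str) -> str:
    """Return the lower-cased domain of an address header, or '' if none."""
    _, addr = parseaddr(header_value)
    return addr.rpartition("@")[2].lower() if "@" in addr else ""


def reply_to_mismatch(from_header: str, reply_to_header: str) -> bool:
    """True when a Reply-To is present and points to a different domain."""
    reply_domain = domain_of(reply_to_header)
    return bool(reply_domain) and reply_domain != domain_of(from_header)


def has_zero_width(text: str) -> bool:
    """True when the text contains any invisible zero-width character."""
    return any(ch in ZERO_WIDTH for ch in text)


def scripts_used(text: str) -> set[str]:
    """Approximate the script of each letter by the first word of its Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])  # e.g. LATIN, CYRILLIC, GREEK
    return scripts


def has_homoglyph_mix(text: str) -> bool:
    """Heuristic: letters from more than one script in a single token look suspicious."""
    return len(scripts_used(text)) > 1
```

Each function returns a plain boolean, so the results can be passed along as fast, explainable signals to whatever classification step follows.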
In Stage 2, an LLM-powered Router analyses the email content alongside pre-check results to identify candidate attack patterns from the threat ontology.
Stage 3 performs a configuration-driven lookup: each candidate attack maps to a specific subset of agents via 265 trigger rules, each with an assigned priority (CRITICAL, HIGH, MEDIUM, LOW).
Stage 4 executes the selected agents in parallel: a mix of deterministic rules, LLM-based content analysis, and heuristic checks.
Finally, Stage 5’s Orchestrator fuses all findings through weighted confidence scoring, applies multi-source corroboration boosting and single-source capping to prevent false positives, and produces a human-readable threat explanation.
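The trigger-rule lookup and parallel agent execution could be sketched roughly as follows in Python. The agent IDs are taken from the walkthrough in this post, but the attack names, the rule-to-agent mappings, the priorities assigned here, and the `select_agents` / `run_agents` helpers are hypothetical placeholders rather than the framework's real configuration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

PRIORITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


@dataclass(frozen=True)
class TriggerRule:
    attack: str    # candidate attack pattern named by the Router
    agent_id: str  # agent to run when that attack is a candidate
    priority: str  # CRITICAL, HIGH, MEDIUM or LOW


# Hypothetical rules for illustration only.
RULES = [
    TriggerRule("ceo_impersonation", "SA-03.1", "CRITICAL"),
    TriggerRule("ceo_impersonation", "SA-07.1", "HIGH"),
    TriggerRule("invoice_fraud", "SA-07.1", "MEDIUM"),
    TriggerRule("invoice_fraud", "SA-02.8", "HIGH"),
]


def select_agents(candidates: list[str], rules: list[TriggerRule]) -> list[tuple[str, str]]:
    """Map candidate attacks to (agent_id, priority), keeping the highest priority per agent."""
    best: dict[str, str] = {}
    for rule in rules:
        if rule.attack not in candidates:
            continue
        current = best.get(rule.agent_id)
        if current is None or PRIORITY_ORDER[rule.priority] < PRIORITY_ORDER[current]:
            best[rule.agent_id] = rule.priority
    # Most urgent agents first, ties broken by agent ID for a stable order.
    return sorted(best.items(), key=lambda item: (PRIORITY_ORDER[item[1]], item[0]))


def run_agents(
    selected: list[tuple[str, str]],
    registry: dict[str, Callable[[str], dict]],
    email_text: str,
) -> dict[str, dict]:
    """Run every selected agent in parallel and collect findings by agent ID."""
    with ThreadPoolExecutor() as pool:
        futures = {aid: pool.submit(registry[aid], email_text) for aid, _ in selected}
        return {aid: future.result() for aid, future in futures.items()}
```

Keeping the rules as data rather than code is what makes the lookup configuration-driven: adding a new attack type means adding rows, not changing the dispatch logic.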
To make this concrete, let’s walk through what happens when a CEO impersonation email hits the system, in this case one whose sender domain fails DMARC. In the first stage, the eight deterministic pre-checks run. SA-01.3 (DMARC Evaluator) flags the authentication failure with 90% confidence. SA-02.8 (Reply-To Mismatch Detector) spots that the reply-to points to a personal Gmail account, a different domain entirely, at 95% confidence. SA-03.1 (Display Name Checker) notices a role claim (“CEO”) paired with a suspicious domain.
But here’s the part that matters: for emails where the attacker controls their own domain, such as the invoice fraud case or our opening scenario, all authentication checks pass cleanly. That’s where the LLM-powered agents take over. SA-07.1 (Urgency & Pressure Analyser) picks up the “don’t discuss with anyone” and “needs to go out today” patterns. An LLM-based content classifier cross-references the email against our ontology of 23 wealth-management-specific attack types, from client impersonation to capital call fraud to QR code phishing.
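To show how findings like these might be combined in Stage 5, here is a minimal Python sketch of weighted confidence fusion with a corroboration boost and a single-source cap. The two confidences (0.90 and 0.95) come from the walkthrough; the equal weights, the boost of 0.05 per extra agent, the cap of 0.70, and the choice to treat each agent as one "source" are all assumptions made for illustration, not the Orchestrator's actual parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    agent_id: str
    confidence: float  # 0.0 to 1.0, as reported by the agent
    weight: float      # assumed relative trust in the agent


def fuse(findings: list[Finding], boost: float = 0.05, single_source_cap: float = 0.70) -> float:
    """Fuse agent findings into one threat score between 0 and 1."""
    if not findings:
        return 0.0
    total_weight = sum(f.weight for f in findings)
    score = sum(f.weight * f.confidence for f in findings) / total_weight
    sources = {f.agent_id for f in findings}
    if len(sources) >= 2:
        # Corroboration: each additional independent agent nudges the score up.
        score = min(1.0, score + boost * (len(sources) - 1))
    else:
        # A lone agent cannot push the score past the cap on its own.
        score = min(score, single_source_cap)
    return round(score, 4)


dmarc = Finding("SA-01.3", confidence=0.90, weight=1.0)
reply_to = Finding("SA-02.8", confidence=0.95, weight=1.0)

alone = fuse([reply_to])            # capped, despite 0.95 confidence
together = fuse([dmarc, reply_to])  # weighted mean plus corroboration boost
```

Under these assumed parameters, the reply-to finding alone is held at the cap of 0.70, while the two findings together score about 0.975. The cap is the part aimed at false positives: one overconfident agent is not enough to quarantine a message by itself.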
Where we are and what’s next
This work is ongoing. The framework’s architecture, threat ontology, and scoring engine are operational, with a working REST API and web dashboard that lets analysts inspect every agent’s finding in real time. Early tests show the system correctly quarantining BEC attempts while passing legitimate business emails through cleanly. Our discussions with KM3 have been instrumental in shaping the attack ontology — ensuring the 23 attack types reflect genuine threats that wealth management operations face, not just textbook classifications.
Next steps within CyberAId include implementing the full set of 66 agents (currently, eight deterministic pre-checks are production-ready while the remaining agents are progressively being developed), validating against real-world email datasets, and exploring federated deployment models that would allow multiple financial institutions to benefit from shared threat intelligence without exposing private communications. We’re also investigating how the framework can be extended beyond email to cover other communication channels used in financial operations.





