A Multi-Agent Framework for Email Threat Detection in Financial Services

The problem: why traditional email security falls short

Imagine you’re a financial controller at a wealth management firm. Your CEO emails you at 4pm asking for an urgent EUR 20,000 wire transfer — confidential acquisition, don’t tell anyone, needs to go out today. The email passes your spam filter. SPF, DKIM, DMARC? The attacker registered their own domain, so everything checks out. The display name says “John Smith CEO.” It looks real.

This is the reality of Business Email Compromise (BEC) — the most financially damaging form of cyber attack targeting financial services today. Traditional rule-based filters were designed to catch spam and commodity phishing. They are not equipped to handle the kind of targeted, context-aware social engineering that wealth management firms face daily.

As part of the CyberAId project, the University of Piraeus Research Center (UPRC) team has been working on a different approach. In close collaboration with our wealth management partner KM CUBE Asset Management [KM3], who brought real-world operational insight into the kinds of attacks wealth managers actually encounter, we developed an EMTD (Email Threat Detection) pipeline: a multi-agent AI framework purpose-built for the financial sector.

Architecture: a five-stage detection pipeline

In Stage 1, eight deterministic pre-checks run. They validate email authentication (SPF, DKIM, DMARC) and detect reply-to mismatches, Unicode homoglyph substitutions, zero-width character injection, and display name inconsistencies. These fast checks provide immediate signals and contextual evidence for the subsequent LLM classification.
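To give a feel for what such deterministic pre-checks might look like, here is a minimal Python sketch of three of them (reply-to mismatch, zero-width characters, and a crude mixed-script homoglyph heuristic). It is not the EMTD implementation: the function names, the set of zero-width code points, and the mixed-script rule are all assumptions made for illustration, and it uses only the standard library.

```python
import unicodedata
from email import message_from_string
from email.utils import parseaddr

# Assumed set of invisible code points; a real check would likely cover more.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def domain_of(header_value: str) -> str:
    """Return the lower-cased domain of an address header, or '' if absent."""
    _, addr = parseaddr(header_value or "")
    _, at, domain = addr.rpartition("@")
    return domain.lower() if at else ""


def reply_to_mismatch(msg) -> bool:
    """True when Reply-To exists and points to a different domain than From."""
    reply_domain = domain_of(msg.get("Reply-To", ""))
    return bool(reply_domain) and reply_domain != domain_of(msg.get("From", ""))


def has_zero_width(text: str) -> bool:
    """True when the text contains any of the assumed zero-width code points."""
    return any(ch in ZERO_WIDTH for ch in text)


def mixed_script(text: str) -> bool:
    """Crude homoglyph heuristic: letters from more than one script in one string."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            # The first word of the Unicode name is the script, e.g. LATIN, CYRILLIC.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return len(scripts) > 1


def run_prechecks(raw_email: str) -> dict:
    """Run the illustrative checks and return one boolean signal per check."""
    msg = message_from_string(raw_email)
    from_header = msg.get("From", "")
    display_name, _ = parseaddr(from_header)
    # Sketch only: multipart bodies are skipped rather than walked.
    body = "" if msg.is_multipart() else msg.get_payload()
    text = f"{msg.get('Subject', '')}\n{body}"
    return {
        "reply_to_mismatch": reply_to_mismatch(msg),
        "zero_width_chars": has_zero_width(text),
        "homoglyph_suspected": mixed_script(display_name)
        or mixed_script(domain_of(from_header)),
    }
```

Each check returns a plain boolean, so the results can be passed as structured evidence to a later classification step. SPF, DKIM and DMARC validation is deliberately left out here, since it requires DNS lookups and signature verification.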

In Stage 2, an LLM-powered Router analyses the email content alongside pre-check results to identify candidate attack patterns from the threat ontology.
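One way a Router of this kind could be wired up is sketched below. This is an assumption-heavy illustration, not the project's code: the ontology identifiers, the prompt wording, and the `complete` callable (a stand-in for whatever LLM client is used) are all hypothetical. The one design point it tries to show is that the model's answer is filtered against the known ontology, so an unexpected label never reaches later stages.

```python
import json
from typing import Callable

# Hypothetical identifiers; only the attack names themselves appear in the text.
ONTOLOGY = {
    "ceo_impersonation",
    "invoice_fraud",
    "client_impersonation",
    "capital_call_fraud",
    "qr_code_phishing",
}


def build_router_prompt(email_text: str, prechecks: dict) -> str:
    """Combine the email and the pre-check signals into a single prompt string."""
    return (
        "Choose candidate attack types from: "
        + ", ".join(sorted(ONTOLOGY))
        + ".\nPre-check results: "
        + json.dumps(prechecks, sort_keys=True)
        + "\nEmail:\n"
        + email_text
        + "\nAnswer with a JSON list of attack type identifiers."
    )


def route(email_text: str, prechecks: dict, complete: Callable[[str], str]) -> list[str]:
    """Ask the model for candidates and keep only identifiers known to the ontology."""
    raw = complete(build_router_prompt(email_text, prechecks))
    try:
        proposed = json.loads(raw)
    except json.JSONDecodeError:
        return []  # Unparseable answer: report no candidates rather than guess.
    if not isinstance(proposed, list):
        return []
    return [a for a in proposed if isinstance(a, str) and a in ONTOLOGY]
```

Passing the model call in as a function keeps the sketch independent of any particular LLM provider and makes the routing logic easy to test with a canned response.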

Stage 3 performs a configuration-driven lookup: each candidate attack maps to a specific subset of agents via 265 trigger rules, each with an assigned priority (CRITICAL, HIGH, MEDIUM, LOW).

Stage 4 executes the selected agents in parallel. These are a mix of deterministic rules, LLM-based content analysis, and heuristic checks.

Finally, in Stage 5, the Orchestrator fuses all findings through weighted confidence scoring, applies multi-source corroboration boosting and single-source capping to limit false positives, and produces a human-readable threat explanation.
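The Stage 3 lookup and the Stage 4 parallel execution can be pictured with a small Python sketch. The rule table below is invented for illustration (the real configuration holds 265 rules and is not reproduced here), the attack identifiers are hypothetical, and the pairing of agent IDs with attack types is an assumption. A thread pool is used as one plausible way to run agents concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

PRIORITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

# Hypothetical trigger rules: (attack type, agent id, priority).
TRIGGER_RULES = [
    ("ceo_impersonation", "SA-03.1", "CRITICAL"),
    ("ceo_impersonation", "SA-07.1", "HIGH"),
    ("invoice_fraud", "SA-07.1", "MEDIUM"),
    ("invoice_fraud", "SA-02.8", "HIGH"),
]


def select_agents(candidates):
    """Return agent ids for the candidate attacks, highest priority first, no duplicates."""
    best = {}
    for attack, agent, priority in TRIGGER_RULES:
        if attack in candidates:
            rank = PRIORITY_ORDER[priority]
            # An agent triggered by several rules keeps its most urgent priority.
            best[agent] = min(rank, best.get(agent, rank))
    return sorted(best, key=lambda agent: (best[agent], agent))


def run_agents(agent_ids, registry, email_text):
    """Run the selected agents concurrently and collect their findings by agent id."""
    with ThreadPoolExecutor() as pool:
        futures = {aid: pool.submit(registry[aid], email_text) for aid in agent_ids}
        return {aid: fut.result() for aid, fut in futures.items()}
```

Here `registry` is assumed to be a dictionary mapping each agent ID to a callable that takes the email text and returns a finding. Keeping the rules as data rather than code is what makes the lookup configuration-driven: adding a rule changes which agents run without touching the execution logic.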

To make this concrete, let’s walk through what happens when our CEO impersonation email hits the system. In the first stage, eight deterministic agents run. SA-01.3 (DMARC Evaluator) flags the authentication failure with 90% confidence. SA-02.8 (Reply-To Mismatch Detector) spots that the reply-to points to a personal Gmail account — a different domain entirely — at 95% confidence. SA-03.1 (Display Name Checker) notices a role claim (“CEO”) paired with a suspicious domain.

But here’s the part that matters: for invoice fraud emails where the attacker controls their own domain, all authentication checks pass cleanly. That’s where the LLM-powered agents take over. SA-07.1 (Urgency & Pressure Analyser) picks up the “don’t discuss with anyone” and “needs to go out today” patterns. An LLM-based content classifier cross-references the email against our ontology of 23 wealth management-specific attack types — from client impersonation to capital call fraud to QR code phishing.
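Once the agents have reported, the Orchestrator has to turn their findings into a single verdict. The sketch below shows one simple way weighted confidence scoring, corroboration boosting, and single-source capping could fit together. The formula, the boost of 0.05 per additional source, the cap of 0.70, the quarantine threshold of 0.80, and the reading of "source" as the agent category are all assumptions for illustration. The 0.90 and 0.95 confidences in the usage note come from the walk-through above, while the 0.80 for SA-07.1 is made up.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    agent_id: str
    source: str        # assumed categories: "deterministic", "llm", "heuristic"
    confidence: float  # 0.0 to 1.0
    weight: float = 1.0


def fuse(findings, boost_per_extra_source=0.05, single_source_cap=0.70):
    """Weighted mean of confidences, capped for one source, boosted for several."""
    total_weight = sum(f.weight for f in findings)
    if not findings or total_weight == 0:
        return 0.0
    score = sum(f.confidence * f.weight for f in findings) / total_weight
    sources = {f.source for f in findings}
    if len(sources) == 1:
        # Single-source capping: one kind of evidence alone cannot reach a high score.
        score = min(score, single_source_cap)
    else:
        # Corroboration boosting: each additional independent source adds a small bonus.
        score += boost_per_extra_source * (len(sources) - 1)
    return min(score, 1.0)


def verdict(score, quarantine_at=0.80):
    """Map a fused score to a decision; 'DELIVER' is a hypothetical label."""
    return "QUARANTINE" if score >= quarantine_at else "DELIVER"
```

With only the two deterministic findings (SA-01.3 at 0.90 and SA-02.8 at 0.95), this sketch caps the score at 0.70 and would not quarantine. Adding an LLM-based finding from SA-07.1 at an assumed 0.80 brings in a second source, and the boosted score crosses the assumed threshold. That is the intent of combining capping with boosting: strong signals from a single kind of check are not enough on their own, but independent corroboration is.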

Where we are and what’s next

This work is ongoing. The framework’s architecture, threat ontology, and scoring engine are operational, with a working REST API and web dashboard that lets analysts inspect every agent’s finding in real time. Early tests show the system correctly quarantining BEC attempts while passing legitimate business emails through cleanly. Our discussions with KM3 have been instrumental in shaping the attack ontology — ensuring the 23 attack types reflect genuine threats that wealth management operations face, not just textbook classifications.

Next steps within CyberAId include implementing the full set of 66 agents (currently, eight deterministic pre-checks are production-ready while the remaining agents are progressively being developed), validating against real-world email datasets, and exploring federated deployment models that would allow multiple financial institutions to benefit from shared threat intelligence without exposing private communications. We’re also investigating how the framework can be extended beyond email to cover other communication channels used in financial operations.

Figure 1. Walk-through of a BEC detection: five agents flag independent signals, and the orchestrator fuses them into a QUARANTINE verdict with a full explanation.

Figure 2. The five-stage EMTD pipeline, from email ingestion to threat verdict.
