Graph-Based Agentic AI · APRA Compliance · Production Architecture

Compliance decisions need
reasoning chains, not scores.
Graphs make the reasoning explainable.

A multi-agent APRA compliance monitoring system built on Neo4j, Claude, and OpenAI Embeddings — demonstrating that graph-native architecture is the only way to make AI-driven regulatory decisions auditable and defensible at production scale.

Neo4j AuraDBClaude Sonnet + Haiku OpenAI EmbeddingsGraphRAG FastMCP · StreamlitAPRA APS-112 · APG-223 · APS-220 3-Layer Knowledge GraphApache 2.0

System at a Glance

APRA Regulations

MCP Tools

Graph Layers

Anomaly Patterns

466

Loan Applications

189

Reg. Chunks (RAG)

The Problem

Compliance officers and risk analysts need explainable, evidence-backed verdicts — not black-box scores. When a regulator asks "why was this loan approved?", the answer must be traceable to specific regulatory text, threshold values, and the reasoning chain that produced the decision. LoanGuard AI makes every compliance verdict auditable by design, persisting full reasoning chains, cited regulatory sections, and semantic evidence to a Neo4j knowledge graph.

Why a Knowledge Graph?

Financial compliance is inherently relational. A single loan application connects a borrower, their ownership structure, their jurisdiction, their industry, and multiple APRA regulations — across three distinct data layers. A relational database treats each layer in isolation. A graph makes the traversal native.

LoanGuard AI's three-layer Neo4j graph connects financial entities (Layer 1) to regulatory obligations (Layer 2) through a Jurisdiction bridge, then writes compliance assessments (Layer 3) as first-class graph nodes — creating a queryable audit trail that spans the full decision chain.

Why Agentic AI?

Compliance questions are not single-step lookups. They require multi-step reasoning: traverse the entity graph, retrieve applicable regulations, evaluate thresholds against actual data, surface anomaly patterns, and synthesise a verdict with citations. That is an agent loop, not a prompt.

The Orchestrator routes questions to specialist agents — ComplianceAgent for threshold evaluation and InvestigationAgent for graph traversal and AML risk signals — running them in parallel when both are needed. Every tool call is tracked. Every reasoning step is persisted.

Three-Layer Neo4j Knowledge Graph

LAYER 1

Entity Graph

Borrowers, LoanApplications, BankAccounts, Transactions, Collateral, Officers, Jurisdictions, Industries — loaded from synthetic CSVs

LAYER 2

Regulatory Graph

APS-112, APG-223, APS-220 — extracted from APRA PDFs by Claude; chunked and embedded with OpenAI text-embedding-3-small for semantic RAG

LAYER 3

Assessment Graph

Runtime compliance results written by agents: Assessment, Finding, ReasoningStep nodes — forming a complete, queryable audit trail per decision

Key Design Decisions

What makes this production-grade

🔗

Jurisdiction Bridge

Borrowers link to Jurisdiction nodes via RESIDES_IN / REGISTERED_IN. Regulations declare which jurisdictions they govern. No direct many-to-many. Extending to New Zealand requires only new nodes — no schema changes.

⚡

Parallel Agent Execution

When both compliance and investigation are needed, the Orchestrator dispatches to ComplianceAgent and InvestigationAgent simultaneously via ThreadPoolExecutor. Pre-run traverse calls are also parallelised per regulation.

📋

threshold_type in Data

Each Threshold node carries threshold_type: minimum, maximum, trigger, or informational. Compliance semantics live in the graph, not in code — so regulatory changes re-run the pipeline, they don't require code changes.

🎯

Prompt Caching

The ComplianceAgent marks its system prompt (containing the full graph schema + APRA context) with cache_control: ephemeral. This eliminates repeated token processing across agent iterations and across queries for the same regulation.

🛡️

Prompt Injection Defence

Every tool result passes through guard_tool_result() before entering the agent context. Structural [TOOL DATA] framing plus nine regex patterns covering common injection attempts — logged, not redacted.

🔍

GraphRAG with Scores

Similarity scores from retrieve_regulatory_chunks are stored on CITES_CHUNK relationship properties — so trace_evidence can recover them for the Evidence panel without re-running vector search.

Why graph is different

The connections that compliance actually needs

1

Cross-layer traversal in a single query. From a loan application, traverse to its borrower, to their jurisdiction, to the applicable regulations, to specific threshold values — in one Cypher hop chain. In a relational system, this is five joins and an ORM nightmare.
2

Anomaly patterns that require relationship depth. Transaction structuring, layered ownership (OWNS chains of depth ≥ 2), guarantor concentration — these are fundamentally graph patterns. They require traversing relationship chains, not scanning tables.
3

Audit trails as first-class graph nodes. Layer 3 Assessment, Finding, and ReasoningStep nodes are linked to the entities and regulatory sections they reference. The audit trail is not a log file — it is a queryable subgraph that can be traversed, visualised, and replayed.
4

Semantic similarity that respects document boundaries. SEMANTICALLY_SIMILAR edges connect chunks from different regulations (cosine > 0.85) — but not same-document pairs. Agents can traverse regulatory cross-references semantically, not just by citation.

Read the full series

Part 1 · Medium

Architecting a Graph-Based Agentic System

When a regulator asks "why was this loan approved?"

Part 2 · Medium

Building Graph-Based Agentic Systems

Failures, fixes, and how the answer gets there

System Architecture

End-to-End Component Overview

↗ Click any component to see its role in the system

User Interface

🖥️

Streamlit Dashboard

app.py — chat + compliance UI

📓

Jupyter Notebooks

311–317 interactive development

Agent Pipeline

🎯

Orchestrator

Routes intent · dispatches agents · synthesises response

⚖️

ComplianceAgent

Threshold evaluation · Layer 3 persistence · per-regulation loop

🔎

InvestigationAgent

Graph traversal · anomaly detection · risk signal surfacing

MCP Tool Layer (FastMCP)

🗺️

traverse_compliance_path

L1→L2 via Jurisdiction

🔍

retrieve_regulatory_chunks

vector semantic search

⚠️

detect_graph_anomalies

6 Cypher patterns

✅

evaluate_thresholds

PASS / BREACH / TRIGGER

💾

persist_assessment

idempotent MERGE to L3

🧵

trace_evidence

walk Assessment → cited nodes

📖

read-neo4j-cypher

ad-hoc read queries

✏️

write-neo4j-cypher

L3 writes only

🕸️

Neo4j AuraDB

3-layer knowledge graph

🤖

Anthropic API

Sonnet (agents) · Haiku (routing)

📐

OpenAI Embeddings

text-embedding-3-small (1536-d)

← Click a component

Agent Pipeline

Orchestrator → ComplianceAgent + InvestigationAgent

↗ Click any stage to see what happens inside

💬

User Question

"Is LOAN-0002 compliant with APG-223?" · "Show suspicious connections around BRW-0001."

↓ Routing call (Claude Haiku · MODEL_FAST · 512 tokens)

🗺️

Orchestrator: Routing

intent classification

entity_ids

regulations

needs_agents

↓ Parallel dispatch when both needed (ThreadPoolExecutor)

⚖️

ComplianceAgent

1. Pre-run traverse (parallel per regulation)
2. evaluate_thresholds → PASS/BREACH/TRIGGER
3. retrieve_regulatory_chunks (optional)
4. persist_assessment → Layer 3

max 14 iterations temperature=0

🔎

InvestigationAgent

1. Pre-run detect_graph_anomalies (scoped)
2. One comprehensive first-degree query
3. Targeted follow-ups (≤3 more calls)
4. Structured risk signal summary

7 tool call budget 6 history pairs

↓ trace_evidence fetches cited sections + chunks · Synthesis call

📝

Orchestrator: Synthesis

Merges outputs → InvestigationResponse: answer + verdict + findings + cited_sections + cited_chunks + next_steps

🖥️

Streamlit UI

Verdict banner · Routing expander · Findings severity chart · Evidence panel · Entity Profile · Recommended next steps

← Click a stage

Graph Data Model

Three-Layer Neo4j Property Graph

↗ Click any node or relationship label to explore details

↗ Click any node or
relationship label
to see details

Compliance Engine

Threshold Types · Verdict Logic · Anomaly Patterns

↗ Click any threshold or pattern to see evaluation rules

Threshold Type System

minimum

Floor

Entity must meet or exceed. BREACH when condition is False.

maximum

Ceiling

Entity must not exceed. BREACH when condition is False.

trigger

Monitor

Fires concern when met. REQUIRES_REVIEW — not a hard breach.

informational

Reference

ADI-level reference value. Always N/A — excluded from verdict.

APG-223 Thresholds Evaluated Per Loan

THR-001

serviceability_interest_rate_buffer ≥ 3.0%

minimum

THR-002

credit_card_revolving_debt_repayment_rate == 3.0%

informational

THR-003

non_salary_income_haircut ≥ 20% (if income_type ≠ salary)

minimum

THR-004

rental_income_haircut ≥ 20% (if rental_income_gross present)

minimum

THR-005

LVR ≥ 90% → senior management review required

trigger

Verdict Priority (worst-case wins)

✗

NON_COMPLIANT

priority 4

⚠

REQUIRES_REVIEW

priority 3

⚑

ANOMALY_DETECTED

priority 2

✓

COMPLIANT

priority 1

ℹ

INFORMATIONAL

priority 0

Anomaly Detection Patterns (ANOMALY_REGISTRY)

transaction_structuring

HIGH

Sub-$10k suspicious transfers to same account. AUSTRAC threshold evasion signal.

high_lvr_loans

HIGH

LVR ≥ 90% — linked to APG-223-THR-005. 63 loans in dataset.

high_risk_jurisdiction

HIGH

Borrowers in JUR-VU / JUR-MM / JUR-KH (aml_risk_rating = high).

high_risk_industry

MEDIUM

Gambling, Financial Asset Investing, Liquor & Tobacco — high AML sensitivity.

layered_ownership

MEDIUM

OWNS chains depth ≥ 2. Obscures beneficial ownership. BRW-0582 → 3-hop chain.

guarantor_concentration

MEDIUM

Borrower guaranteeing 2+ loans — undisclosed contingent liability exposure.

← Click a threshold, verdict, or anomaly pattern

System System Control Flow

Orchestrator Sequence for a Compliance Question

↗ Click any row to see what happens at that step

👤 User / UI

🎯 Orchestrator

⚖️ ComplianceAgent

🕸️ Neo4j + APIs

Init Phase

Question submitted

Haiku routing call

JSON routing plan

Parallel Agent Dispatch

traverse_compliance_path × 3 regs

L2 subgraph + thresholds

evaluate_thresholds

persist_assessment → Layer 3

Investigation Phase (parallel)

detect_graph_anomalies (scoped)

read-neo4j-cypher × N

Synthesis Phase

fetch findings + trace_evidence

Haiku synthesis (streaming)

InvestigationResponse → UI

← Click a flow step

Compliance decisions needreasoning chains, not scores.Graphs make the reasoning explainable.

Why a Knowledge Graph?

Why Agentic AI?

Compliance decisions need
reasoning chains, not scores.
Graphs make the reasoning explainable.