Graph-Based Agentic AI · APRA Compliance · Production Architecture

Compliance decisions need
reasoning chains, not scores.
Graphs make the reasoning explainable.

A multi-agent APRA compliance monitoring system built on Neo4j, Claude, and OpenAI Embeddings — demonstrating that graph-native architecture is the only way to make AI-driven regulatory decisions auditable and defensible at production scale.

Neo4j AuraDBClaude Sonnet + Haiku OpenAI EmbeddingsGraphRAG FastMCP · StreamlitAPRA APS-112 · APG-223 · APS-220 3-Layer Knowledge GraphApache 2.0
System at a Glance
3
APRA Regulations
8
MCP Tools
3
Graph Layers
6
Anomaly Patterns
466
Loan Applications
189
Reg. Chunks (RAG)
The Problem

Compliance officers and risk analysts need explainable, evidence-backed verdicts — not black-box scores. When a regulator asks "why was this loan approved?", the answer must be traceable to specific regulatory text, threshold values, and the reasoning chain that produced the decision. LoanGuard AI makes every compliance verdict auditable by design, persisting full reasoning chains, cited regulatory sections, and semantic evidence to a Neo4j knowledge graph.

Why a Knowledge Graph?

Financial compliance is inherently relational. A single loan application connects a borrower, their ownership structure, their jurisdiction, their industry, and multiple APRA regulations — across three distinct data layers. A relational database treats each layer in isolation. A graph makes the traversal native.

LoanGuard AI's three-layer Neo4j graph connects financial entities (Layer 1) to regulatory obligations (Layer 2) through a Jurisdiction bridge, then writes compliance assessments (Layer 3) as first-class graph nodes — creating a queryable audit trail that spans the full decision chain.

Why Agentic AI?

Compliance questions are not single-step lookups. They require multi-step reasoning: traverse the entity graph, retrieve applicable regulations, evaluate thresholds against actual data, surface anomaly patterns, and synthesise a verdict with citations. That is an agent loop, not a prompt.

The Orchestrator routes questions to specialist agents — ComplianceAgent for threshold evaluation and InvestigationAgent for graph traversal and AML risk signals — running them in parallel when both are needed. Every tool call is tracked. Every reasoning step is persisted.

Three-Layer Neo4j Knowledge Graph
LAYER 1
Entity Graph
Borrowers, LoanApplications, BankAccounts, Transactions, Collateral, Officers, Jurisdictions, Industries — loaded from synthetic CSVs
LAYER 2
Regulatory Graph
APS-112, APG-223, APS-220 — extracted from APRA PDFs by Claude; chunked and embedded with OpenAI text-embedding-3-small for semantic RAG
LAYER 3
Assessment Graph
Runtime compliance results written by agents: Assessment, Finding, ReasoningStep nodes — forming a complete, queryable audit trail per decision
Key Design Decisions
What makes this production-grade
🔗
Jurisdiction Bridge
Borrowers link to Jurisdiction nodes via RESIDES_IN / REGISTERED_IN. Regulations declare which jurisdictions they govern. No direct many-to-many. Extending to New Zealand requires only new nodes — no schema changes.
Parallel Agent Execution
When both compliance and investigation are needed, the Orchestrator dispatches to ComplianceAgent and InvestigationAgent simultaneously via ThreadPoolExecutor. Pre-run traverse calls are also parallelised per regulation.
📋
threshold_type in Data
Each Threshold node carries threshold_type: minimum, maximum, trigger, or informational. Compliance semantics live in the graph, not in code — so regulatory changes re-run the pipeline, they don't require code changes.
🎯
Prompt Caching
The ComplianceAgent marks its system prompt (containing the full graph schema + APRA context) with cache_control: ephemeral. This eliminates repeated token processing across agent iterations and across queries for the same regulation.
🛡️
Prompt Injection Defence
Every tool result passes through guard_tool_result() before entering the agent context. Structural [TOOL DATA] framing plus nine regex patterns covering common injection attempts — logged, not redacted.
🔍
GraphRAG with Scores
Similarity scores from retrieve_regulatory_chunks are stored on CITES_CHUNK relationship properties — so trace_evidence can recover them for the Evidence panel without re-running vector search.
Why graph is different
The connections that compliance actually needs
  • 1
    Cross-layer traversal in a single query. From a loan application, traverse to its borrower, to their jurisdiction, to the applicable regulations, to specific threshold values — in one Cypher hop chain. In a relational system, this is five joins and an ORM nightmare.
  • 2
    Anomaly patterns that require relationship depth. Transaction structuring, layered ownership (OWNS chains of depth ≥ 2), guarantor concentration — these are fundamentally graph patterns. They require traversing relationship chains, not scanning tables.
  • 3
    Audit trails as first-class graph nodes. Layer 3 Assessment, Finding, and ReasoningStep nodes are linked to the entities and regulatory sections they reference. The audit trail is not a log file — it is a queryable subgraph that can be traversed, visualised, and replayed.
  • 4
    Semantic similarity that respects document boundaries. SEMANTICALLY_SIMILAR edges connect chunks from different regulations (cosine > 0.85) — but not same-document pairs. Agents can traverse regulatory cross-references semantically, not just by citation.
Read the full series
Part 1 · Medium
Architecting a Graph-Based Agentic System
When a regulator asks "why was this loan approved?"
Part 2 · Medium
Building Graph-Based Agentic Systems
Failures, fixes, and how the answer gets there
System Architecture
End-to-End Component Overview
↗ Click any component to see its role in the system
User Interface
🖥️
Streamlit Dashboard
app.py — chat + compliance UI
📓
Jupyter Notebooks
311–317 interactive development
Agent Pipeline
🎯
Orchestrator
Routes intent · dispatches agents · synthesises response
⚖️
ComplianceAgent
Threshold evaluation · Layer 3 persistence · per-regulation loop
🔎
InvestigationAgent
Graph traversal · anomaly detection · risk signal surfacing
MCP Tool Layer (FastMCP)
🗺️
traverse_compliance_path
L1→L2 via Jurisdiction
🔍
retrieve_regulatory_chunks
vector semantic search
⚠️
detect_graph_anomalies
6 Cypher patterns
evaluate_thresholds
PASS / BREACH / TRIGGER
💾
persist_assessment
idempotent MERGE to L3
🧵
trace_evidence
walk Assessment → cited nodes
📖
read-neo4j-cypher
ad-hoc read queries
✏️
write-neo4j-cypher
L3 writes only
🕸️
Neo4j AuraDB
3-layer knowledge graph
🤖
Anthropic API
Sonnet (agents) · Haiku (routing)
📐
OpenAI Embeddings
text-embedding-3-small (1536-d)
← Click a component
Agent Pipeline
Orchestrator → ComplianceAgent + InvestigationAgent
↗ Click any stage to see what happens inside
💬
User Question
"Is LOAN-0002 compliant with APG-223?" · "Show suspicious connections around BRW-0001."
↓ Routing call (Claude Haiku · MODEL_FAST · 512 tokens)
🗺️
Orchestrator: Routing
intent classification
entity_ids
regulations
needs_agents
↓ Parallel dispatch when both needed (ThreadPoolExecutor)
⚖️
ComplianceAgent
1. Pre-run traverse (parallel per regulation)
2. evaluate_thresholds → PASS/BREACH/TRIGGER
3. retrieve_regulatory_chunks (optional)
4. persist_assessment → Layer 3
max 14 iterations temperature=0
🔎
InvestigationAgent
1. Pre-run detect_graph_anomalies (scoped)
2. One comprehensive first-degree query
3. Targeted follow-ups (≤3 more calls)
4. Structured risk signal summary
7 tool call budget 6 history pairs
↓ trace_evidence fetches cited sections + chunks · Synthesis call
📝
Orchestrator: Synthesis
Merges outputs → InvestigationResponse: answer + verdict + findings + cited_sections + cited_chunks + next_steps
🖥️
Streamlit UI
Verdict banner · Routing expander · Findings severity chart · Evidence panel · Entity Profile · Recommended next steps
← Click a stage
Graph Data Model
Three-Layer Neo4j Property Graph
↗ Click any node or relationship label to explore details
LAYER 1: FINANCIAL LOAN DATA LAYER 2: REGULATORY DOCUMENTS LAYER 3: COMPLIANCE ASSESSMENT REASONING Transacti-on Officer Address BankAccount Borrower Jurisdictio-n Collateral LoanApplication Industry FROM_ACCOUNTTO_ACCOUNT DIRECTOR_OF LOCATED_AT HAS_ACCOUNT REGISTERED_IN GUARANTEED_BY SUBMITTED_BY BACKED_BY BELONGS_TO_INDUSTRY Requireme-nt Threshold Regulation Section Chunk DEFINES_THRESHOLD HAS_REQUIREMENT HAS_SECTION NEXT_SECTION CROSS_REFERENCES HAS_CHUNK NEXT_CHUNK SEMANTICALLY_SIMILAR APPLIES_TO_JURISDICTION ASSESSM-ENT ReasoningStep Finding HAS_ASSESSMENT HAS_STEP HAS_FINDING ASSESSED_UNDER CITES_SECTION CITES_CHUNK Every node a question. Every relationship a step in the reasoning chain.
↗ Click any node or
relationship label
to see details
Compliance Engine
Threshold Types · Verdict Logic · Anomaly Patterns
↗ Click any threshold or pattern to see evaluation rules
Threshold Type System
minimum
Floor
Entity must meet or exceed. BREACH when condition is False.
maximum
Ceiling
Entity must not exceed. BREACH when condition is False.
trigger
Monitor
Fires concern when met. REQUIRES_REVIEW — not a hard breach.
informational
Reference
ADI-level reference value. Always N/A — excluded from verdict.
APG-223 Thresholds Evaluated Per Loan
THR-001
serviceability_interest_rate_buffer ≥ 3.0%
minimum
THR-002
credit_card_revolving_debt_repayment_rate == 3.0%
informational
THR-003
non_salary_income_haircut ≥ 20% (if income_type ≠ salary)
minimum
THR-004
rental_income_haircut ≥ 20% (if rental_income_gross present)
minimum
THR-005
LVR ≥ 90% → senior management review required
trigger
Verdict Priority (worst-case wins)
NON_COMPLIANT
priority 4
REQUIRES_REVIEW
priority 3
ANOMALY_DETECTED
priority 2
COMPLIANT
priority 1
INFORMATIONAL
priority 0
Anomaly Detection Patterns (ANOMALY_REGISTRY)
transaction_structuring
HIGH
Sub-$10k suspicious transfers to same account. AUSTRAC threshold evasion signal.
high_lvr_loans
HIGH
LVR ≥ 90% — linked to APG-223-THR-005. 63 loans in dataset.
high_risk_jurisdiction
HIGH
Borrowers in JUR-VU / JUR-MM / JUR-KH (aml_risk_rating = high).
high_risk_industry
MEDIUM
Gambling, Financial Asset Investing, Liquor & Tobacco — high AML sensitivity.
layered_ownership
MEDIUM
OWNS chains depth ≥ 2. Obscures beneficial ownership. BRW-0582 → 3-hop chain.
guarantor_concentration
MEDIUM
Borrower guaranteeing 2+ loans — undisclosed contingent liability exposure.
← Click a threshold, verdict, or anomaly pattern
System System Control Flow
Orchestrator Sequence for a Compliance Question
↗ Click any row to see what happens at that step
👤 User / UI
🎯 Orchestrator
⚖️ ComplianceAgent
🕸️ Neo4j + APIs
Init Phase
Question submitted
Haiku routing call
JSON routing plan
Parallel Agent Dispatch
traverse_compliance_path × 3 regs
L2 subgraph + thresholds
evaluate_thresholds
persist_assessment → Layer 3
Investigation Phase (parallel)
detect_graph_anomalies (scoped)
read-neo4j-cypher × N
Synthesis Phase
fetch findings + trace_evidence
Haiku synthesis (streaming)
InvestigationResponse → UI
← Click a flow step