Proof — Abdul-Sobur Ayinde

[eval]

Evaluation Report

The eval framework I built for FraudShield. Slice metrics, regression tests, CI integration.

Open Report →

[drift]

How I'd handle drift if this were production. Decision tree for retrain vs. rollback.

Open Runbook →

[sec]

120+ test cases I wrote to attack my own RAG system. Injection, exfiltration, jailbreak, tool abuse.

Open Report →

[rag]

Metrics I track for SecureRAG: retrieval quality, faithfulness, grounding verification.

Open Dashboard →

[cost]

Benchmarks from local testing. Latency percentiles, cost estimates, what I'd optimize.

Open Cost Report →

[card]

Documentation for the FraudShield model. Limitations, failure modes, performance by slice.

Open Model Card →

[data]

Collection methodology, known biases, split logic, and demographic annotations.

Open Datasheet →

[post]

A simulated failure I designed to practice incident response. Root cause analysis format.

Open Postmortem →