ML Systems Engineer
[eval]

Evaluation Report

The eval framework I built for FraudShield: slice metrics, regression tests, and CI integration.

  • Per-slice PR-AUC, F1, calibration
  • Regression tests with 5% tolerance
  • CI integration (GitHub Actions)
  • Automated report generation
Open Report →
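A minimal sketch of what a regression check with a 5% tolerance could look like. It assumes the tolerance is relative to the baseline and that every metric is higher-is-better; the function name `find_regressions` and the metric keys are hypothetical, not taken from the FraudShield framework.

```python
def find_regressions(baseline, candidate, tolerance=0.05):
    """Return metrics whose candidate value dropped more than `tolerance`
    (relative to baseline). Assumes higher is better for every metric."""
    regressions = {}
    for name, base_value in baseline.items():
        new_value = candidate.get(name)
        if new_value is None:
            # A metric missing from the candidate run is treated as a failure.
            regressions[name] = (base_value, None)
        elif new_value < base_value * (1 - tolerance):
            regressions[name] = (base_value, new_value)
    return regressions


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    baseline = {"pr_auc": 0.80, "f1": 0.70}
    candidate = {"pr_auc": 0.79, "f1": 0.60}
    failed = find_regressions(baseline, candidate)
    for name, (old, new) in failed.items():
        print(f"REGRESSION {name}: {old} -> {new}")
    # A non-zero exit code is what would fail a CI job.
    raise SystemExit(1 if failed else 0)
```

Run per slice, the same check would flag a drop in one segment even when the aggregate metric holds steady.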
[drift]

Drift Runbook

How I'd handle drift if this were production. Decision tree for retrain vs. rollback.

  • PSI thresholds per feature
  • KS test alert conditions
  • Response decision tree
  • Escalation procedures
Open Runbook →
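For reference, a sketch of how a per-feature PSI (Population Stability Index) could be computed. It assumes equal-width bins derived from the reference sample and a small floor `eps` to avoid taking the log of zero; the function name and defaults are illustrative, and the runbook's actual thresholds are not reproduced here.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference sample (`expected`)
    and a live sample (`actual`), using equal-width bins over the
    reference range. Values outside that range fall into the edge bins."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each fraction at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A commonly cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, though the cut-offs are best tuned per feature.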
[sec]

Security Test Report

120+ test cases I wrote to attack my own RAG system: injection, exfiltration, jailbreak, and tool abuse.

  • 50 direct prompt injection cases
  • 20 indirect injection cases (via docs)
  • 20 data exfiltration attempts
  • 15 tool abuse cases
  • 15 PII extraction attempts
Open Report →
[rag]

RAG Eval Dashboard

Metrics I track for SecureRAG: retrieval quality, faithfulness, grounding verification.

  • P@5 / R@5 retrieval metrics
  • LLM-as-judge faithfulness
  • Citation verification
  • Query-level breakdown
Open Dashboard →
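As an illustration of the P@5 / R@5 metrics, a minimal sketch for a single query. It assumes retrieved document IDs are unique and ranked best-first, and that relevance labels are available as a set; the function name and the example IDs are hypothetical.

```python
def precision_recall_at_k(retrieved, relevant, k=5):
    """Precision@k and recall@k for one query.

    retrieved: ranked list of unique document IDs, best first.
    relevant:  set of document IDs judged relevant for the query.
    """
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / k  # conventional: divide by k even if fewer were returned
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: two of the three relevant docs appear in the top 5.
p, r = precision_recall_at_k(
    ["d1", "d7", "d3", "d9", "d4", "d2"], {"d1", "d2", "d3"}
)
print(f"P@5={p:.2f} R@5={r:.2f}")
```

Averaging these per-query values gives the headline figure, while keeping the individual values supports a query-level breakdown.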
[cost]

Cost & Latency Report

Benchmarks from local testing. Latency percentiles, cost estimates, what I'd optimize.

  • Latency percentiles (p50/p95/p99)
  • Throughput under load
  • Cost per 1k/1M requests
  • Optimization recommendations
Open Cost Report →
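A sketch of how p50/p95/p99 could be derived from raw latency samples, using the nearest-rank method. This is one of several percentile definitions and is an assumption here; the report may use interpolation instead. The function name is illustrative.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    `pct` percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(samples_ms):
    """Return the p50/p95/p99 latencies for a list of samples in milliseconds."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}
```

Nearest-rank always returns a latency that was actually observed, which keeps tail values easy to trace back to a specific request.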
[card]

Model Card

Documentation for the FraudShield model. Limitations, failure modes, performance by slice.

  • Model details & training data
  • Performance across subgroups
  • Known limitations
  • Ethical considerations
Open Model Card →
[data]

Dataset Datasheet

Collection methodology, known biases, split logic, and demographic annotations.

  • Data collection process
  • Labeling methodology
  • Known biases & limitations
  • Recommended uses
Open Datasheet →
[post]

Incident Postmortem

A simulated failure I designed to practice incident response. Root cause analysis format.

  • Incident timeline
  • Root cause analysis
  • Response actions
  • Prevention measures
Open Postmortem →