Work
Case Studies
Setting up the foundations
“You can't fix what you don't track. But data alone doesn't tell you what to fix.”
Building the instrumentation layer that laid the groundwork for everything that followed.
The evolution of ScreenSense
“Three generations of algorithmic iteration before the AI layer”
Evolving the Finder from v11 to v13, cutting content failure rates from 15-20% to under 2% through algorithmic discipline.
<2%
Failure rate
5-8%
CSS dependency
AI Augmentation - ScreenSense's glow-up
“AI as a surgical layer, not a replacement.”
LLM-generated selectors for weak elements and semantic matching for multilingual detection, adding intelligence only where algorithms genuinely hit their limits.
30%
Healing rate
$1M
ARR protected
Diagnostics - Helping customers ship with confidence
“Users want visibility, not abstraction.”
Building a self-service troubleshooting tool that reduced L1 support tickets by 35% and serves 700+ customers.
35%
Ticket reduction
700+
Customers
Prototype to Production: Evals for AI reliability
“Structured evaluation + error analysis to isolate failure modes”
From prompt to rule: building a 4-dimension LLM-as-judge framework that improved accuracy from 45% to 85%.
85%
Accuracy
4
Dimensions