Tool Calling Reliability Benchmark
A benchmark project for evaluating reliability in LLM tool-calling workflows across different scenarios and failure modes.
A terminal-based MIDI mixer written in Go. Keyboard-driven workflow for mixing and routing, designed as a focused developer tool.
Worked on an agentic customer + KYC workflow product: reliability-focused orchestration, document processing, and full-stack delivery.
"Make it work. Make it right. Make it fast. In that order."