AI Integration
You already have a product. I bolt AI onto it carefully: search, summarization, generation, RAG, classification. Production-ready, observable, and reversible.
What you get
- Production AI feature deployed in your app or as a standalone service
- Eval set proving the feature works on your real data
- Cost and latency telemetry plus a simple admin panel to monitor it
- Documentation for your engineering team
- Source code in your GitHub from day one
Transparent pricing
LLM API costs are billed to your accounts. I optimize prompts and pick models so your monthly inference bill is predictable. Prices vary by region to match local engineering market rates.
AI chat widget or search-augmented assistant on your existing site or app.
- Chat or search widget
- Embedded into your site
- Basic eval set
- Cost dashboard
- 2 weeks post-launch support
Production RAG pipeline with embeddings, retrieval, citations, and ingestion.
- Embeddings pipeline
- Vector database (pgvector, Chroma, or Pinecone)
- Citation UI
- Reindex on demand
- 4 weeks post-launch support
A real AI feature in your app: summarization, generation, classification, or scoring.
- End-to-end feature
- Full eval harness
- A/B comparison versus baseline
- Telemetry and admin panel
- 8 weeks post-launch support
Why I charge this
- Bolting AI onto a real product without breaking it is harder than building a demo. You are paying for restraint, evals, and a careful rollout, not for Stack Overflow snippets.
- Every feature ships with an eval set. We measure quality before launch and after every prompt change. Your future self will thank you.
- I never lock you to one provider. The same feature can run on OpenAI, Anthropic, or open models with a single config change.
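As a rough illustration of the single-config-change idea, the sketch below routes every call through one function and picks the backend from a config value. The names (`PROVIDERS`, `complete`, the stub functions) are hypothetical, and the stubs stand in for real SDK calls, which are not shown here.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for real provider clients. A real integration
# would wrap each vendor's SDK behind this same (prompt) -> text signature.
def _openai_stub(prompt: str) -> str:
    return f"[openai] {prompt}"

def _anthropic_stub(prompt: str) -> str:
    return f"[anthropic] {prompt}"

def _open_model_stub(prompt: str) -> str:
    return f"[open-model] {prompt}"

# Registry keyed by the value stored in config (assumed key: "provider").
PROVIDERS: Dict[str, Callable[[str], str]] = {
    "openai": _openai_stub,
    "anthropic": _anthropic_stub,
    "open-model": _open_model_stub,
}

def complete(prompt: str, config: Dict[str, str]) -> str:
    """Send the prompt to whichever provider the config names."""
    provider = config["provider"]
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    return PROVIDERS[provider](prompt)

# Switching providers means changing one config value, not the feature code.
reply = complete("Summarize this ticket", {"provider": "anthropic"})
```

The feature code only ever calls `complete`, so the eval set can be rerun unchanged after a provider switch.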
Why work with me
- I have built RAG and agent features for finance, developer tools, and education. The pitfalls are familiar; I will not learn them on your dime.
- No black box. Every prompt, model, and retrieval step is in your repo and documented.
- You see real responses on your real data within the first week, not a generic ChatGPT demo.
- If the AI feature does not pass the eval bar, I tell you. I do not ship something that lies to your users.
How an engagement runs
- Step 01
Discovery call
Free 30-minute call. We pick the feature with the highest leverage on your roadmap.
- Step 02
Eval-first scoping
I draft an eval set on your real data before writing the feature. We agree on the bar in writing.
- Step 03
Build
Feature built behind a flag, with previews shipped to staging. You see responses on your data within a week.
- Step 04
Compare and launch
We benchmark versus baseline (manual or pre-AI). If it does not clear the bar, we iterate or stop. No vibes.
- Step 05
Monitor
Telemetry, cost dashboard, and 2 to 8 weeks of post-launch support.
Ideal for
- SaaS teams that want a real AI feature, not a tacked-on chatbot
- Companies sitting on a pile of internal docs, tickets, or contracts that should be queryable
- Founders who want AI in their product but want a sober engineer making the calls
Not the right fit if
- Pure prompt engineering with no engineering work. Talk to a prompt consultant for that.
- AI features that need a research-grade fine-tune. I integrate; I do not train foundation models.
Common questions
Will my data train someone else's model?
No. I default to OpenAI and Anthropic with the no-training settings configured, or to open models running on infrastructure you control.
What does it cost to run?
For a small SaaS doing 10k AI calls a month, expect $30 to $200 in LLM spend. RAG over a typical knowledge base adds about $5 to $30 in monthly embedding and vector storage.
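A back-of-the-envelope version of that estimate, as a sketch. The token counts and per-million-token prices below are placeholder assumptions chosen for illustration, not any provider's actual rates; plug in your own numbers.

```python
def monthly_llm_cost(
    calls_per_month: int,
    input_tokens_per_call: int,
    output_tokens_per_call: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Estimate monthly LLM spend in USD from per-call token usage."""
    input_cost = input_tokens_per_call * usd_per_million_input / 1_000_000
    output_cost = output_tokens_per_call * usd_per_million_output / 1_000_000
    return calls_per_month * (input_cost + output_cost)

# Assumed workload: 10k calls, 1,500 input and 300 output tokens per call.
# Assumed prices: $3 per 1M input tokens, $15 per 1M output tokens.
estimate = monthly_llm_cost(
    calls_per_month=10_000,
    input_tokens_per_call=1_500,
    output_tokens_per_call=300,
    usd_per_million_input=3.0,
    usd_per_million_output=15.0,
)
print(f"Estimated monthly LLM spend: ${estimate:.2f}")
```

Under these assumptions each call costs a little under one cent, which puts the month at about $90, inside the range quoted above.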
How do you measure quality?
A real eval set: 30 to 100 representative inputs, each with a clear pass criterion. We compute pass rate on every prompt or model change.
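A minimal sketch of what such a harness can look like. `EvalCase`, `pass_rate`, and `toy_model` are hypothetical names for illustration; in a real engagement the model function would call the actual feature and the cases would come from your data.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalCase:
    input_text: str
    passes: Callable[[str], bool]  # the pass criterion for this input

def pass_rate(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Run every case through the model and return the fraction that pass."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(1 for case in cases if case.passes(model(case.input_text)))
    return passed / len(cases)

# Toy stand-in for the AI feature under test (assumption: it just uppercases).
def toy_model(text: str) -> str:
    return text.upper()

cases = [
    EvalCase("refund policy", lambda out: "REFUND" in out),
    EvalCase("reset password", lambda out: "PASSWORD" in out),
    EvalCase("pricing", lambda out: "discount" in out),  # fails on purpose
]

rate = pass_rate(toy_model, cases)
print(f"pass rate: {rate:.0%}")
```

Rerunning `pass_rate` after each prompt or model change gives one number to compare against the agreed bar.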
Can you work with my existing engineering team?
Yes, and I prefer it. I leave a clean codebase, write the docs, and pair with your engineers if you want.
Ready to scope this?
Free 30-minute call. By the end of it you have a written scope, a price, and a timeline. No pressure to proceed.