
Adding an AI chatbot to an existing app in 2026 typically costs $2,000 to $60,000 in build cost plus $50 to $5,000 per month in running cost, depending on traffic and how custom you go. Most founders we see ship a working v1 for under $8,000 and burn $200 a month on tokens for the first 90 days.
The cost to add an AI chatbot is mostly an integration problem, not a "build a chatbot" problem. You already have an app. You already have users. You need a chat surface, an LLM call, retrieval over your own content, and a place to log conversations. That is a 2 to 4 week piece of work for one engineer who knows what they are doing, plus ongoing token spend that scales with usage.
This guide gives you the actual math: build paths, real LLM API costs, a feature-by-feature breakdown, and how to pick the right team to wire it up.
When founders ask "how much does it cost to add an AI chatbot," they usually mean one of four very different projects. Pricing diverges sharply by which one.
If you are reading this, you probably want option 2 or 3. The rest of this post focuses there.
The work splits into roughly six pieces. None of them are mysterious in 2026. The Vercel AI SDK, Claude, OpenAI, and a handful of UI libraries have collapsed what used to be a 3-month build into a focused 2-week sprint.
Chat UI: assistant-ui, Vercel AI SDK examples, or a custom React component. 1 to 3 days.

That is roughly 9 to 19 engineer-days for a clean v1. Call it 2 to 4 weeks for one engineer working full-time, or 6 weeks if it shares attention with other work.
This is the table to actually budget against. Numbers assume a v1 chatbot wired into an existing Next.js or similar app, with retrieval over your docs and 3 to 5 tools. Build cost is one-time. Running cost is monthly at moderate traffic (~10K conversations/month).
| Approach | Build cost | Timeline | Pros | Cons |
|---|---|---|---|---|
| Vendor widget (Intercom Fin, Tidio) | $0 setup | 1 day | Zero eng, fast, mature | Their UI, their brand, can't do app-specific actions |
| US full-time hire (mid-level) | $25,000-$45,000 | 8-12 weeks (incl. hiring) | Owns it long-term, deep context | Hiring loop, salary + benefits, hard to unwind if v1 fails |
| Dev agency (US/EU) | $25,000-$80,000 | 6-10 weeks | Project-managed, accountable | Markup is 2-3x raw labor, hard to iterate after handoff |
| Freelancer (Upwork) | $4,000-$20,000 | 4-8 weeks | Cheaper, flexible | Variance is huge, vetting is on you, quality is bimodal |
| Toptal / similar | $15,000-$45,000 | 2-3 weeks to start + build | Vetted, fast intro | Monthly minimums, $60-$200/hr, contract friction |
| Cadence | $1,000-$8,000 (1-4 weeks of one engineer) | 48-hour trial then ship | Every engineer is AI-native (Claude/Cursor/Copilot fluency vetted), weekly billing, replace any week, no notice period | Less suited to enterprise procurement and multi-month statements of work |
The weekly model is the part most cost guides miss. If you book a mid-level engineer at $1,000/week and they ship the chatbot in 3 weeks, your build cost is $3,000. If they ship in 2, it's $2,000. You stop paying the moment the work stops. There is no severance, no notice period, no project rescoping email.
Compare that to a $45K agency contract with a 4-week kickoff, or a 6-week hiring loop for a full-time hire whose first 30 days are spent learning your codebase. For a piece of work that is 9 to 19 engineer-days, weekly billing matches the shape of the actual job. (We use the same lens in our cost to build a SaaS app guide, where most v1s are 4 to 12 weeks of focused engineering, not the 6-month estimates agencies quote.)
Here is what you actually pay vendors, separate from labor. Numbers as of mid-2026.
| Component | Vendor / option | Cost |
|---|---|---|
| LLM (chat) | Claude Haiku 4.5 | ~$0.80/M input tokens, $4/M output |
| LLM (chat, premium) | Claude Sonnet 4.5 | ~$3/M input, $15/M output |
| LLM (chat, alt) | GPT-4o | ~$2.50/M input, $10/M output |
| LLM (chat, budget) | GPT-4o-mini | ~$0.15/M input, $0.60/M output |
| Embeddings | OpenAI text-embedding-3-small | ~$0.02/M tokens |
| Vector store | Supabase pgvector | Free tier, then $25/mo Pro |
| Vector store (managed) | Pinecone serverless | ~$0.10/M reads, ~$70/M writes, ~$0.33/GB/mo storage |
| Vector store (alt) | Turbopuffer | $0.04/GB/mo storage, pay-per-query |
| Eval / observability | Langfuse self-host | Free (your infra) |
| Eval / observability | Helicone, Langfuse cloud | $0-$200/mo at startup volumes |
| Hosting | Vercel, Railway, Fly | $20-$200/mo for the API routes |
| Auth | Clerk | Free up to 10K MAU, then $25/mo + $0.02/MAU |
A real example. You ship a chatbot for a SaaS docs site. Average conversation is 6 turns, with 800 input tokens (system prompt + retrieved chunks + history) and 200 output tokens per turn. With Claude Haiku 4.5 at $0.80/$4 per million tokens, each conversation costs roughly: 4,800 input tokens ($0.0038) plus 1,200 output tokens ($0.0048), or about $0.009.
At 10,000 conversations/month, that is $90/month in LLM cost. Add Pinecone serverless ($30 to $80/mo at this scale), Vercel ($40/mo for the function calls), and Langfuse cloud ($50/mo for tracing): you are at $210 to $260/month all-in, plus your existing app hosting.
Switch to Claude Sonnet 4.5 for the harder questions and the per-conversation cost jumps to roughly $0.03 (4,800 input tokens at $3/M plus 1,200 output tokens at $15/M). At 10K conversations, that is over $300/month in tokens. Most teams route easy questions to Haiku and escalate hard ones to Sonnet. You can build that router in an afternoon.
Five moves consistently cut chatbot cost without making the bot worse.
If you are still in the "should I build this at all" stage, run the question through our build/buy/book tool before writing a line of code. For a chatbot that is mostly support deflection on a generic SaaS, an off-the-shelf vendor often beats a custom build for the first 6 months, no matter what your engineering team wants to hear.
Three steps for a founder with an existing app and no in-house AI engineer:
Whichever path you take, ship a v0 in week 1. Even a janky chatbot that answers 60% of questions correctly will tell you more about what your users actually ask than another planning meeting.
If you want to see what shipping a chatbot would actually cost on Cadence, book an engineer in 2 minutes. The 48-hour trial is free, weekly billing kicks in only if you keep them, and you can replace any week with no notice.
Two to four weeks of focused engineering for a v1 with retrieval and 3 to 5 tools. Six weeks if it shares attention with other work. A drop-in vendor widget (Intercom Fin, Tidio) is one day if you accept their UI.
For a tiny site or a docs bot, a vendor widget at $30 to $100/month beats any custom build. For anything where you want app-specific actions or your own UI, the cheapest defensible path is one mid-level engineer for 2 to 3 weeks (~$2K to $3K) plus ~$200/month in API and infra costs.
Both are good. Claude Haiku 4.5 is currently the price-performance winner for most chat workloads (pricier than GPT-4o-mini, but noticeably stronger at comparable speed on long-context tasks). Claude Sonnet 4.5 wins on harder reasoning. GPT-4o is the safer default if your team already has OpenAI infrastructure. Most production setups route easy questions to a cheap model and escalate hard ones.
Roughly $0.009 per 6-turn conversation on Claude Haiku 4.5 ($90 per 10K conversations/month). Roughly $0.02 to $0.03 on GPT-4o or Claude Sonnet 4.5 ($240 to $320 per 10K conversations). Embeddings and vector storage add $30 to $100/month at startup scale.
Buy a vendor (Intercom Fin, Tidio, Crisp) if your bot is mostly help-center deflection and you don't need it to take actions in your app. Build if (a) you need app-specific tools the vendor can't do, (b) brand and UX matter enough that a third-party widget is a no, or (c) your data is sensitive enough that you can't ship it to a vendor's cloud. For most B2B SaaS founders past $10K MRR, build wins because the vendor pricing scales faster than the engineering cost.
You can ship a Tidio or Intercom widget in an afternoon with no code. You cannot ship a custom chatbot wired into your app's database without an engineer, full stop. The good news is the engineering side is small enough now that "an engineer for 2 weeks" is a real option, not a hypothetical.