Your AI Feature Shipped Last Quarter. Nobody Uses It.

If you're a product lead who greenlit 'AI' and got a chatbot nobody opens, here's how we build RAG, semantic search, and automation your users actually need.

Book a free call → Send a message

Stack

Claude API, OpenAI, Vercel AI SDK, pgvector, Supabase, FAL API, Next.js, TypeScript

We build AI features that people actually use. Not chatbots buried three clicks deep that get 2% engagement, but RAG pipelines, semantic search, and automation wired into the workflows your users already follow. If you shipped an AI feature last quarter and your analytics show a flatline, the problem is almost never the model -- it is the product thinking around the model.

Why does nobody use your AI feature?

Most AI features fail because they were built as demos, not as products. A product lead sees a competitor announce "AI-powered" something, panics, and greenlights a chatbot. The team ships it in six weeks. It sits in a sidebar. Usage peaks at launch, then craters.

We have seen this pattern across dozens of projects. The feature fails for one of three reasons:

It answers questions nobody was asking. The AI does something technically impressive but unrelated to the user's actual job.
It is slower than the thing it replaced. LLM inference adds latency. If the old search bar returned results in 200ms and your AI assistant takes 4 seconds, users will close it.
It hallucinates in ways that destroy trust. One confidently wrong answer is enough to make a user never open the feature again.

The fix is not better prompts. It is better scoping. Every AI feature we build starts with a specific, measurable user problem -- not a technology looking for a use case.

What does AI-powered development actually cost?

Real numbers, because vague ranges help nobody. The costs break into three buckets: tooling, development, and inference.

Tooling costs for your development team run $200 to $600 per developer per month on average. That includes seat licenses for tools like Copilot or Cursor ($20-60/month) plus usage-based token costs for agentic tools that can spike to $2,000+ per engineer monthly. Most teams land somewhere in the middle.

Development costs depend on what you are building. For the kinds of features we ship -- RAG over a knowledge base, semantic search, content pipelines -- budgets typically fall between $35,000 and $160,000 depending on data complexity, integration depth, and how many edge cases need handling. Enterprise AI projects can reach $300,000 to $1.5 million, but most product teams do not need that scope.

Inference costs are the subscription nobody budgets for. A $100,000 development project with $20,000/month in API calls is actually a $340,000 Year 1 investment ($100,000 plus 12 x $20,000). At the time of writing, Claude Sonnet is priced at $3.00 per million input tokens and $15.00 per million output tokens. GPT-4o runs $2.50/$10.00. These numbers compound fast when you are processing 100,000 queries daily.

We scope inference costs before writing a single line of code, because a feature that works beautifully in staging and bankrupts you in production is not a feature -- it is a liability.
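
A rough sketch of that scoping arithmetic, using the $3.00/$15.00 per-million-token rates quoted above. The token counts per query are illustrative assumptions, not measurements, and the function names are hypothetical; swap in your own traffic figures and your provider's current prices.

```typescript
// Per-million-token prices in USD. Check your provider's current price list.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Assumed traffic shape; these numbers are placeholders for illustration.
interface TrafficEstimate {
  queriesPerDay: number;
  inputTokensPerQuery: number;  // prompt plus retrieved context
  outputTokensPerQuery: number; // generated answer
}

const TOKENS_PER_MILLION = 1000000;

function monthlyInferenceCost(
  pricing: ModelPricing,
  traffic: TrafficEstimate,
  daysPerMonth: number = 30,
): number {
  const costPerQuery =
    (traffic.inputTokensPerQuery / TOKENS_PER_MILLION) * pricing.inputPerMillion +
    (traffic.outputTokensPerQuery / TOKENS_PER_MILLION) * pricing.outputPerMillion;
  return costPerQuery * traffic.queriesPerDay * daysPerMonth;
}

// Year 1 total: one-off development cost plus twelve months of inference.
function yearOneCost(developmentCost: number, monthlyInference: number): number {
  return developmentCost + monthlyInference * 12;
}

const monthly = monthlyInferenceCost(
  { inputPerMillion: 3.0, outputPerMillion: 15.0 },
  { queriesPerDay: 100000, inputTokensPerQuery: 1500, outputTokensPerQuery: 300 },
);

console.log(`Estimated monthly inference: $${monthly.toFixed(2)}`);
console.log(`Year 1 with $100,000 build and $20,000/month: $${yearOneCost(100000, 20000)}`);
```

Under these assumed token counts, 100,000 daily queries come to roughly $27,000 per month, which is why the projection belongs in the scoping phase rather than the post-launch review.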

What do we actually build?

We build production AI features on the Vercel AI SDK and Anthropic SDK. Specifically:

RAG systems over your documentation or knowledge base. Your users search your docs, your internal wiki, your product catalog. Instead of keyword matching, the system retrieves relevant chunks and generates accurate, cited answers. We use pgvector with Supabase for the vector store.
Semantic search over structured data. Users describe what they want in natural language. The system translates that intent into precise queries against your database. This is where AI actually outperforms traditional search -- when the user does not know the exact terminology.
Content generation pipelines. Using Claude and GPT-4 for first-draft generation with human review gates. We have built these for product descriptions, support documentation, and marketing copy. The key is treating the output as a draft, not a publication.
AI scoring and quality assessment. Automated evaluation of content, leads, or submissions against criteria you define. This is where AI saves the most human hours -- repetitive judgment calls across high volumes.
Automated image generation workflows using FAL API, integrated into your existing content pipeline so the output lands where your team already works.
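
To make the RAG item concrete, here is a minimal in-memory sketch of the retrieve-then-cite step. It is an illustration under assumptions: the three-dimensional embeddings are toy values, the `Chunk` shape and function names are hypothetical, and in a real deployment the similarity ranking would be done by the vector store rather than in application code.

```typescript
interface Chunk {
  id: string;
  source: string;      // doc path or title, used in citations
  text: string;
  embedding: number[]; // produced by whichever embedding model you use
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep the k most similar chunks, dropping anything below a relevance floor
// so weak matches never reach the prompt.
function retrieve(queryEmbedding: number[], chunks: Chunk[], k: number, minScore: number = 0.5): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((r) => r.chunk);
}

// Number each source so the model can cite it and the UI can link back to it.
function buildCitedPrompt(question: string, retrieved: Chunk[]): string {
  const context = retrieved.map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`).join("\n");
  return [
    "Answer using only the numbered sources below and cite them like [1].",
    "If the sources do not contain the answer, say so.",
    "",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

const chunks: Chunk[] = [
  { id: "a", source: "docs/billing", text: "Invoices are issued on the 1st.", embedding: [1, 0, 0] },
  { id: "b", source: "docs/sso", text: "SSO is available on the Team plan.", embedding: [0, 1, 0] },
  { id: "c", source: "docs/refunds", text: "Refunds take 5 business days.", embedding: [0.9, 0.1, 0] },
];

const top = retrieve([1, 0, 0], chunks, 2);
console.log(buildCitedPrompt("When are invoices issued?", top));
```

The relevance floor matters as much as the ranking: passing an unrelated chunk to the model is one of the easier ways to produce a confident wrong answer.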

If your AI feature needs to connect with existing internal tools or a CRM your team actually uses, that integration work matters as much as the model itself. A model that cannot see your data cannot do the job.

When should you build custom vs. buy off the shelf?

The industry has largely answered this question. Research shows 76% of organizations now buy foundational AI capabilities and reserve custom development for systems that genuinely differentiate the business. That is the right instinct.

Buy when: The problem is commodity. Transcription, basic summarization, generic chatbots, spam filtering. Someone has already built this better than you will.

Build custom when: The AI feature touches your proprietary data, your specific workflow, or your competitive advantage. RAG over your knowledge base is inherently custom -- nobody else has your data. Semantic search tuned to your product taxonomy is custom. These are the features worth investing in.

We see the same pattern with internal tools. Teams spend $40K/quarter on a CRM nobody uses, then wonder why. Sometimes the answer is building the tool around how your team actually works, not forcing your team into someone else's workflow.

How do you avoid the quality trap?

Developers complete coding tasks 55.8% faster with AI assistance, according to GitHub's Copilot study. Other analyses report that AI-generated code introduces 1.7x more defects, is 1.91x more likely to create insecure object references, and is 1.88x more likely to mishandle passwords.

Speed without quality control is just faster debt accumulation.

Here is how we handle this for every AI feature we ship:

Every AI output has a fallback. When the model returns garbage -- and it will -- the user sees a graceful degradation, not a broken screen.
We instrument everything. Response times, token usage, error rates, user satisfaction signals. You cannot improve what you do not measure.
We set latency budgets before development starts. If the feature cannot respond within an acceptable window, we rearchitect or kill it. No exceptions.
Human review gates on generated content. AI produces reasonable first drafts and saves up to 60% of time on documentation tasks specifically. But "reasonable first draft" is not "publishable."
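
The fallback, instrumentation, and latency-budget practices above can be combined in one small wrapper. This is a sketch, not our production code: `withLatencyBudget`, `CallMetrics`, and the simulated model call are hypothetical names, and the 50ms budget is only there to make the demo time out.

```typescript
interface CallMetrics {
  latencyMs: number;
  outcome: "ok" | "timeout" | "error";
}

// Runs `task`, but resolves with `fallback` if the task fails or exceeds
// `budgetMs`. Every call reports its latency and outcome through `record`.
function withLatencyBudget<T>(
  task: () => Promise<T>,
  fallback: T,
  budgetMs: number,
  record: (metrics: CallMetrics) => void,
): Promise<T> {
  const started = Date.now();
  return new Promise<T>((resolve) => {
    let settled = false;
    const finish = (value: T, outcome: CallMetrics["outcome"]): void => {
      if (settled) return; // first of result, error, or timeout wins
      settled = true;
      clearTimeout(timer);
      record({ latencyMs: Date.now() - started, outcome });
      resolve(value);
    };
    const timer = setTimeout(() => finish(fallback, "timeout"), budgetMs);
    // Starting inside a promise chain also catches synchronous throws from `task`.
    Promise.resolve()
      .then(task)
      .then(
        (value) => finish(value, "ok"),
        () => finish(fallback, "error"),
      );
  });
}

// Simulated slow model call (200ms) against a 50ms budget.
const metrics: CallMetrics[] = [];
const slowModelCall = (): Promise<string> =>
  new Promise<string>((resolve) => setTimeout(() => resolve("model answer"), 200));

withLatencyBudget(slowModelCall, "Showing keyword results instead.", 50, (m) => {
  metrics.push(m);
}).then((answer) => {
  console.log(answer, metrics[0].outcome);
});
```

Note that this wrapper stops waiting but does not cancel the underlying request; if your SDK accepts an abort signal, wiring one in would also stop paying for tokens you will never show.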

Elite engineering teams maintain AI-assisted code share of 60-75% while keeping code turnover ratios below 1.3x compared to human-only baselines. That ratio is the key metric -- it tells you whether the speed gains are real or just creating rework.

What about teams outside the US?

Building AI features with a US-based team costs $500,000 to $1.5 million annually in salaries alone -- two ML engineers, one data scientist, one DevOps engineer -- before benefits or overhead. Outsourcing comparable work runs 30-50% less.

We work with teams across time zones. Some of our clients, including teams based in Stockholm, came to us specifically because the local talent market priced them out of shipping AI features at all.

The tradeoff is real: in-house teams give you more control for multi-year AI initiatives. But for shipping a specific feature -- RAG search, a content pipeline, a scoring system -- a focused engagement with a team that has done it before will get you to production faster and cheaper.

Who is this not for?

We are opinionated about where AI belongs and where it does not. If your product needs computer vision ($75,000-$280,000), predictive maintenance with sensor integration ($80,000-$350,000), or large-scale model training, you need a dedicated ML engineering firm with infrastructure we do not maintain.

We also do not build AI features for industries where we lack domain context and there is no room for error -- medical diagnosis, autonomous systems, financial trading algorithms.

What we do exceptionally well is the layer between the foundation model and your user -- the retrieval architecture, the UI integration, the fallback logic, the performance tuning. That layer is where most AI features succeed or die, and it is pure web engineering.

This applies across verticals. We have built search and content systems for industries as different as pest control and creator platforms. The AI patterns transfer; the domain context is what changes.

The honest math on AI features

AI/ML adoption has hit 89% across organizations. With 67% actively using AI-assisted development, this is no longer a competitive advantage -- it is a baseline expectation. The question is not whether to build AI features but whether to build the right ones.

The teams that ship AI features people actually use are not the ones with the biggest model budgets. They are the ones with the strongest product discipline -- clear user problems, honest latency budgets, fallback paths for when the model misbehaves, and inference cost projections that survive contact with real traffic. That is the work we do.

Social Animal

Need help with an AI feature nobody uses?

Get a free quote

FAQ

Common questions

What AI models do you use?

Claude (Anthropic) for content generation and reasoning tasks -- it produces the most consistent, high-quality text output. GPT-4o for multimodal tasks. Smaller, faster models (Claude Haiku, GPT-4o mini) for high-volume, latency-sensitive operations.

What is RAG and when do I need it?

RAG (Retrieval-Augmented Generation) lets an LLM answer questions based on your specific content -- your documentation, knowledge base, or product data. Without RAG, the model only knows its training data. You need it when answers have to come from content the model was never trained on, such as private or frequently updated material.

How do you handle AI costs in production?

We implement caching for repeated queries, use smaller models for classification and routing, and larger models only for generation. We set up cost monitoring alerts in the AI provider dashboard and optimise prompts to reduce token usage.
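
A minimal sketch of the caching and model-routing idea, assuming exact-match caching on a normalized query string. The class and function names are hypothetical, the tiers are placeholders for whichever small and large models you choose, and a real deployment would likely use a shared store rather than an in-process map.

```typescript
type Task = "classify" | "route" | "generate";
type ModelTier = "small" | "large";

// Cheap tasks go to the small model; only generation pays for the large one.
function pickModelTier(task: Task): ModelTier {
  return task === "generate" ? "large" : "small";
}

class AnswerCache {
  private entries = new Map<string, { answer: string; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Collapse case and whitespace so trivially different queries share an entry.
  private key(query: string): string {
    return query.trim().toLowerCase().replace(/\s+/g, " ");
  }

  // `now` is injectable so expiry can be tested without waiting.
  get(query: string, now: number = Date.now()): string | undefined {
    const k = this.key(query);
    const hit = this.entries.get(k);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(k);
      return undefined;
    }
    return hit.answer;
  }

  set(query: string, answer: string, now: number = Date.now()): void {
    this.entries.set(this.key(query), { answer, expiresAt: now + this.ttlMs });
  }
}

const cache = new AnswerCache(60000); // one-minute TTL
cache.set("How do I reset my password?", "Go to Settings > Security.", 0);
console.log(cache.get("  how do I RESET my password? ", 1000)); // served from cache
console.log(cache.get("How do I reset my password?", 61000));   // expired entry
console.log(pickModelTier("classify"), pickModelTier("generate"));
```

Exact-match caching only helps when users repeat themselves nearly verbatim; matching paraphrases would need an embedding-based lookup, which this sketch does not attempt.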

Can you add AI search to my existing site?

Yes. The typical implementation: embed your content with a text embedding model, store vectors in pgvector (Supabase), and query them semantically at search time. We have added this to existing Next.js and Astro codebases.
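
The query side of that implementation might look like the sketch below. The `documents` table and its `id`, `title`, `content`, and `embedding` columns are assumptions for illustration, as are the function names; `<=>` is pgvector's cosine distance operator, so `1 - distance` gives a similarity score. The returned object uses the text-plus-values shape that parameterized Postgres clients commonly accept.

```typescript
// Serialize an embedding into pgvector's text input format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0 || embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("Embedding must be a non-empty array of finite numbers");
  }
  return `[${embedding.join(",")}]`;
}

interface SearchQuery {
  text: string;
  values: [string, number];
}

// Builds a parameterized nearest-neighbour query. Table and column names are
// hypothetical; adjust them to your schema.
function buildSemanticSearchQuery(queryEmbedding: number[], limit: number = 5): SearchQuery {
  return {
    text: [
      "SELECT id, title, content, 1 - (embedding <=> $1::vector) AS similarity",
      "FROM documents",
      "ORDER BY embedding <=> $1::vector",
      "LIMIT $2",
    ].join("\n"),
    values: [toVectorLiteral(queryEmbedding), limit],
  };
}

// In practice the query embedding comes from the same embedding model used at
// indexing time; this three-value vector is a stand-in.
const searchQuery = buildSemanticSearchQuery([0.12, -0.03, 0.88], 3);
console.log(searchQuery.text);
console.log(searchQuery.values);
```

Passing the vector as a bound parameter rather than interpolating it into the SQL string keeps the query safe to reuse and lets the database cache its plan.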

Is AI content generation good for SEO?

It depends entirely on the quality and the process. AI-generated content that passes through proper human review, NLP scoring, and originality checks can rank well. Unreviewed, low-quality AI output is increasingly penalised by Google. We build content pipelines with quality gates, not bulk generators.

Ready to get started?

Free consultation. No commitment. Just an honest conversation about your project.

Book a free call →
