Visitors often have questions: What services do we offer? How much does it cost? Can we help with their specific challenge? Rather than making them dig through pages or wait for a response, we built zxtra, an AI assistant that answers questions instantly, right on the page.

This is the story of how we built zxtra using Cloudflare Workers AI, the technical decisions we made, and why AI assistants are becoming essential for modern businesses.

Try zxtra yourself →

TL;DR

Problem: Visitors had questions scattered across multiple pages. FAQ pages help, but people don't always check them
Solution: zxtra, a context-aware AI assistant that lives on every page
Stack: Cloudflare Workers AI, Cloudflare Turnstile, React floating chat widget
Result: Instant answers 24/7, zero infrastructure overhead, enterprise-grade security

The Problem: Information Overload

Our website has grown significantly: service pages, case studies, blog posts, pricing information, team bios. While this depth establishes expertise, it creates a challenge: visitors don't always know where to find what they need.

Common questions we heard:

"What services do you offer?"
"How much does a DevOps engagement cost?"
"Can you help with AWS cost optimization?"
"Do you have experience with Kubernetes?"

These questions all have answers on our website, but finding them requires navigation, reading, and sometimes connecting dots across multiple pages.

Traditional Solutions Fall Short

Solution	Problem
FAQ Page	People don't always check it. Can't anticipate everything
Live Chat	Requires staff availability. Not scalable
Contact Form	Creates friction for simple questions

The AI Solution

What if visitors could ask questions naturally and get instant, accurate answers? That's what zxtra provides: a conversational interface to our entire website's knowledge base.

Building zxtra: Technical Decisions

Why Cloudflare Workers AI

We evaluated several options:

Platform	Pros	Cons
OpenAI API	Best models, extensive docs	Higher cost, latency varies
AWS Bedrock	AWS integration, multiple models	Complex setup, region-locked
Cloudflare Workers AI	Edge deployment, simple pricing	Newer, fewer model options

We chose Cloudflare Workers AI for four reasons:

Edge deployment: Our website already runs on Cloudflare Workers. Adding AI inference meant zero additional infrastructure
Simple pricing: Generous free tier, predictable costs. No surprise bills from token overages
Low latency: Inference runs at the edge, close to users. Response times are consistently fast globally
Llama 3.1: Fast, reliable responses with the standard chat completion format
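
To make the "standard chat completion format" concrete, here is a minimal sketch of a single question sent through a Workers AI binding. The binding shape is reduced to what the sketch needs, and the model ID, the function name `askAssistant`, and the fallback message are assumptions for illustration rather than zxtra's actual code.

```typescript
// Reduced shape of a Workers AI binding. In a real Worker the binding is
// configured in wrangler and typed by @cloudflare/workers-types.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface AiBinding {
  run(model: string, input: { messages: ChatMessage[] }): Promise<{ response?: string }>;
}

// Assumed model ID; the post only says "Llama 3.1".
const MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Sends one system prompt plus one user question and returns the answer text.
async function askAssistant(ai: AiBinding, systemPrompt: string, question: string): Promise<string> {
  const result = await ai.run(MODEL, {
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: question },
    ],
  });
  // Fall back to a fixed message if the model returns nothing usable.
  return result.response?.trim() || "Sorry, I could not generate an answer.";
}
```

Inside a Worker's fetch handler this would be called as `askAssistant(env.AI, prompt, question)`, assuming the binding is named `AI`.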

The Knowledge Architecture

zxtra doesn't browse the internet, and it is instructed not to invent information. It only knows what we explicitly tell it.

The system prompt is built dynamically from our centralized data modules. These are the same data sources the website uses, so there is a single source of truth.

Benefits:

zxtra's knowledge updates automatically when we update website content
Fewer hallucinations. The model is instructed to reference only information we've provided
Context awareness. We include the current page so zxtra can summarize what the visitor is viewing
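
The idea of a prompt assembled from shared data plus the current page might look roughly like this. The `Service` and `PageContext` shapes and the function name are hypothetical stand-ins for our data modules, not their real structure.

```typescript
// Hypothetical shapes standing in for the site's centralized data modules.
interface Service {
  name: string;
  summary: string;
}

interface PageContext {
  path: string;
  title: string;
  summary: string;
}

// Builds the knowledge part of the system prompt from site data and the
// page the visitor is currently viewing.
function buildKnowledgePrompt(services: Service[], page: PageContext): string {
  const serviceLines = services.map((s) => `- ${s.name}: ${s.summary}`).join("\n");
  return [
    "You are zxtra, the assistant for this website.",
    "Services:",
    serviceLines,
    `The visitor is currently viewing "${page.title}" (${page.path}).`,
    `Page summary: ${page.summary}`,
  ].join("\n");
}
```

Because the function reads the same objects the pages render from, a content change shows up in the prompt on the next request without a separate update step.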

The chat interface appears on every page as a floating button. It is non-intrusive, expands when clicked, and can be minimized without losing the conversation.

Key UX decisions:

Markdown support for AI responses
Copy button for easy response sharing
Suggested questions for first-time users
Rate limit indicators to set expectations

Security: Multiple Layers of Protection

An AI endpoint is an attractive target for abuse. Without protection, bots could drain API quotas, use our AI for unrelated queries, or attempt prompt injection attacks.

We implemented multiple layers:

Layer 1: Cloudflare Turnstile

Turnstile is Cloudflare's invisible CAPTCHA replacement. Unlike traditional CAPTCHAs, it doesn't require user interaction. It runs in the background and verifies that the visitor is human.

Key learning: Turnstile tokens are single-use. After verifying a token on the server, you must reset the widget to get a new token for the next request. We spent time debugging "verification failed" errors before realizing this.
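
A minimal sketch of the server-side check, using Turnstile's siteverify endpoint. The function name and the injected `fetchImpl` parameter are assumptions; the fetch function is passed in only so the sketch can be exercised with a stub.

```typescript
const SITEVERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify";

interface SiteverifyResult {
  success: boolean;
  "error-codes"?: string[];
}

// Narrow fetch signature so a stub can stand in for the real fetch.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ json(): Promise<unknown> }>;

// Returns true only when Turnstile confirms the token. A token that was
// already verified once comes back with success: false.
async function verifyTurnstile(token: string, secret: string, fetchImpl: FetchLike): Promise<boolean> {
  if (!token) return false;
  const res = await fetchImpl(SITEVERIFY_URL, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ secret, response: token }),
  });
  const data = (await res.json()) as SiteverifyResult;
  return data.success === true;
}
```

On the client, the widget then needs a reset (the Turnstile script exposes `turnstile.reset()`) before the next message, so that a fresh token is issued for it.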

Layer 2: Rate Limiting

Even with bot protection, we limit request frequency:

20 requests per minute per user
100 requests per hour per user
2-second minimum interval between requests

Rate limits are tracked using signed cookies, making them resistant to manipulation.
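
The three limits can be checked against a list of earlier request times. The sketch below covers only that decision; the list of timestamps is assumed to be what the signed cookie carries, and the cookie signing itself is left out. All names are hypothetical.

```typescript
const LIMITS = { perMinute: 20, perHour: 100, minIntervalMs: 2_000 };

type Decision =
  | { allowed: true; timestamps: number[] }
  | { allowed: false; reason: "too-fast" | "minute-limit" | "hour-limit" };

// `timestamps` holds the epoch-millisecond times of earlier accepted requests.
function checkRateLimit(timestamps: number[], now: number): Decision {
  // Anything older than an hour no longer counts toward any limit.
  const lastHour = timestamps.filter((t) => now - t < 3_600_000);
  const lastMinute = lastHour.filter((t) => now - t < 60_000);
  const latest = lastHour.length > 0 ? Math.max(...lastHour) : undefined;

  if (latest !== undefined && now - latest < LIMITS.minIntervalMs) {
    return { allowed: false, reason: "too-fast" };
  }
  if (lastMinute.length >= LIMITS.perMinute) return { allowed: false, reason: "minute-limit" };
  if (lastHour.length >= LIMITS.perHour) return { allowed: false, reason: "hour-limit" };

  // Accepted: return the pruned list plus this request for the next cookie.
  return { allowed: true, timestamps: [...lastHour, now] };
}
```

Returning the pruned list keeps the cookie small, since at most 100 timestamps ever need to be stored.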

Layer 3: Input Validation

Basic but essential:

Maximum 2000 characters per message
Required fields must be present
JSON body must parse correctly
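
These checks fit in one small function that runs before anything touches the model. The field names `message` and `turnstileToken` and the 400 status are assumptions for illustration.

```typescript
const MAX_MESSAGE_LENGTH = 2000;

type Validation =
  | { ok: true; message: string; turnstileToken: string }
  | { ok: false; status: number; error: string };

// Validates the raw request body of a chat request.
function validateChatBody(rawBody: string): Validation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { ok: false, status: 400, error: "Body is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, status: 400, error: "Body must be a JSON object" };
  }

  const { message, turnstileToken } = parsed as Record<string, unknown>;
  if (typeof message !== "string" || message.trim() === "") {
    return { ok: false, status: 400, error: "Field 'message' is required" };
  }
  if (typeof turnstileToken !== "string" || turnstileToken === "") {
    return { ok: false, status: 400, error: "Field 'turnstileToken' is required" };
  }
  if (message.length > MAX_MESSAGE_LENGTH) {
    return { ok: false, status: 400, error: `Message exceeds ${MAX_MESSAGE_LENGTH} characters` };
  }
  return { ok: true, message: message.trim(), turnstileToken };
}
```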

Layer 4: Strict System Prompt

The system prompt explicitly constrains the AI's behavior:

Only answer questions using information provided
If a question is not about ZSoftly, decline politely
If information is missing, direct to contact us
Never answer general knowledge questions unrelated to ZSoftly

This reduces the risk of prompt injection attacks, where users try to make the AI behave outside its intended purpose. A system prompt alone is not a guarantee, which is why it is only one of the layers.
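
One way to keep these rules in one place is to prepend them to the knowledge text as a numbered list. This is a sketch with paraphrased rule wording and hypothetical names, not the exact prompt zxtra uses.

```typescript
// Rules paraphrased from the list above; the exact wording is an assumption.
const GUARDRAILS = [
  "Only answer using the information provided below.",
  "If a question is not about ZSoftly, decline politely.",
  "If the information is missing, direct the visitor to contact us.",
  "Never answer general knowledge questions unrelated to ZSoftly.",
];

// Places the numbered rules ahead of the knowledge text in the system prompt.
function withGuardrails(knowledge: string): string {
  const rules = GUARDRAILS.map((rule, i) => `${i + 1}. ${rule}`).join("\n");
  return `Rules:\n${rules}\n\nInformation:\n${knowledge}`;
}
```

Keeping the rules as data makes the iteration described later easier, since tightening or loosening a rule is a one-line change.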

Lessons Learned

What Worked Well

Edge deployment was the right choice. Response times are consistently under 2 seconds globally. No cold starts, no regional latency issues.

Dynamic system prompts pay off. By pulling from our centralized data modules, zxtra's knowledge stays current automatically.

Turnstile provides invisible protection. Users don't notice it, but it effectively blocks automated abuse.

Challenges We Faced

Single-use Turnstile tokens. This wasn't obvious to us from the documentation. Once a token has been verified, it is consumed. You need a new one for each request.

Balancing strictness with helpfulness. Too strict, and the AI refuses legitimate questions. Too loose, and it goes off-topic. Finding the right system prompt took iteration.

What We'd Do Differently

Add conversation history. Currently, each message is independent. Adding context from previous messages would make multi-turn conversations more natural.
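
As a rough sketch of what that could look like, earlier turns would be placed between the system prompt and the new question, capped so the prompt does not grow without bound. The cap of six turns and all names here are assumptions, since this feature is not built yet.

```typescript
type Turn = { role: "user" | "assistant"; content: string };
type Message = { role: "system" | "user" | "assistant"; content: string };

// Builds the message list for one request, keeping only the most recent turns.
function buildMessages(systemPrompt: string, history: Turn[], question: string, maxTurns = 6): Message[] {
  // slice(-0) would return the whole array, so handle zero explicitly.
  const recent = maxTurns > 0 ? history.slice(-maxTurns) : [];
  return [
    { role: "system", content: systemPrompt },
    ...recent,
    { role: "user", content: question },
  ];
}
```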

Implement streaming responses. The full response is generated before being sent. Streaming would improve perceived latency for longer responses.
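
Workers AI accepts a `stream: true` option for text generation and then returns a stream of server-sent events instead of a finished answer. A minimal sketch of passing that stream through, with the binding shape reduced to what is needed and the model ID and function name assumed:

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Reduced shape of the binding for the streaming case.
interface StreamingAi {
  run(model: string, input: { messages: ChatMessage[]; stream: true }): Promise<ReadableStream<Uint8Array>>;
}

// Assumed model ID; the post only says "Llama 3.1".
const STREAM_MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Forwards the event stream to the browser as it is produced.
async function streamAnswer(ai: StreamingAi, messages: ChatMessage[]): Promise<Response> {
  const stream = await ai.run(STREAM_MODEL, { messages, stream: true });
  return new Response(stream, {
    headers: { "content-type": "text/event-stream" },
  });
}
```

The chat widget would then read the events incrementally and append text as it arrives, which is the part that improves perceived latency.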

Build analytics. We don't yet track what questions visitors ask. This data would help us improve both the AI and our content.

The Business Case for AI Assistants

Building zxtra took about a week of focused development time. Ongoing cost is minimal. Cloudflare Workers AI has a generous free tier, and our usage is well within it.

What we get in return:

Benefit	Impact
24/7 availability	Visitors get answers any time
Instant responses	No waiting for a human to be available
Consistent quality	Every response uses the same knowledge base
Reduced friction	Simple questions don't require forms
Scalability	10 or 10,000 visitors, the AI handles it

Should You Build an AI Assistant?

Consider it if:

Visitors frequently ask similar questions
Your content spans multiple pages or topics
You want to provide support outside business hours
You're already on Cloudflare (makes deployment trivial)

Hold off if:

Your queries require real-time data (inventory, frequently changing pricing)
You need transactional capabilities (placing orders, making changes)
Compliance requirements restrict AI usage

Try zxtra Yourself

We built zxtra not just for our visitors, but as a demonstration of what we can build for your business. Whether you need a customer service chatbot, an internal knowledge assistant, or a specialized AI tool, the architecture patterns are similar.

Try the full zxtra experience →

Or click the chat bubble in the bottom-right corner of any page to see it in action.

What's Next

We're continuing to evolve zxtra:

Conversation memory across multiple messages
Streaming responses as they're generated
Usage analytics to understand what visitors are asking
Expanded knowledge and capabilities

If you're interested in building something similar for your business, we'd love to chat. Our AI Agents & Chatbots service covers custom development from design through deployment.

Get in touch to discuss your AI project →

Building an AI-powered product or service? We help businesses design and implement intelligent solutions. Contact us →

Introducing zxtra: Our AI Assistant Built on Cloudflare Workers AI