
Introducing zxtra: Our AI Assistant Built on Cloudflare Workers AI

Staff at ZSoftly
7 min read

Visitors often have questions: What services do we offer? How much does it cost? Can we help with their specific challenge? Rather than making them dig through pages or wait for a response, we built zxtra, an AI assistant that answers questions instantly, right on the page.

This is the story of how we built zxtra using Cloudflare Workers AI, the technical decisions we made, and why AI assistants are becoming essential for modern businesses.

Try zxtra yourself →


TL;DR

  • Problem: Visitors had questions scattered across multiple pages. FAQ pages help, but people don't always check them
  • Solution: zxtra, a context-aware AI assistant that lives on every page
  • Stack: Cloudflare Workers AI, Cloudflare Turnstile, React floating chat widget
  • Result: Instant answers 24/7, zero infrastructure overhead, enterprise-grade security

The Problem: Information Overload

Our website has grown significantly. Service pages, case studies, blog posts, pricing information, team bios. While this depth establishes expertise, it creates a challenge: visitors don't always know where to find what they need.

Common questions we heard:

  • "What services do you offer?"
  • "How much does a DevOps engagement cost?"
  • "Can you help with AWS cost optimization?"
  • "Do you have experience with Kubernetes?"

These questions all have answers on our website, but finding them requires navigation, reading, and sometimes connecting dots across multiple pages.

Traditional Solutions Fall Short

| Solution | Problem |
| --- | --- |
| FAQ Page | People don't always check it. Can't anticipate everything |
| Live Chat | Requires staff availability. Not scalable |
| Contact Form | Creates friction for simple questions |

The AI Solution

What if visitors could ask questions naturally and get instant, accurate answers? That's what zxtra provides: a conversational interface to our entire website's knowledge base.


Building zxtra: Technical Decisions

Why Cloudflare Workers AI

We evaluated several options:

| Platform | Pros | Cons |
| --- | --- | --- |
| OpenAI API | Best models, extensive docs | Higher cost, latency varies |
| AWS Bedrock | AWS integration, multiple models | Complex setup, region-locked |
| Cloudflare Workers AI | Edge deployment, simple pricing | Newer, fewer model options |

We chose Cloudflare Workers AI for four reasons:

  1. Edge deployment: Our website already runs on Cloudflare Workers. Adding AI inference meant zero additional infrastructure
  2. Simple pricing: Generous free tier, predictable costs. No surprise bills from token overages
  3. Low latency: Inference runs at the edge, close to users. Response times are consistently fast globally
  4. Llama 3.1: Fast, reliable responses with the standard chat completion format
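
As a rough illustration of the "standard chat completion format" mentioned above, here is a minimal sketch of a single-turn call through a Workers AI binding. The model ID `@cf/meta/llama-3.1-8b-instruct` is an assumption (the post only says "Llama 3.1"), and `AiBinding`, `buildMessages`, and `askAssistant` are hypothetical names for this example, not zxtra's actual code.

```typescript
// One message in the chat-completion style conversation.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Minimal shape of the Workers AI binding as used in this sketch.
// The real binding accepts more options than are modeled here.
interface AiBinding {
  run(
    model: string,
    input: { messages: ChatMessage[] }
  ): Promise<{ response?: string }>;
}

// Assumed model ID; the exact Llama 3.1 variant is not stated in the post.
const MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Build the message list for a single, independent question.
function buildMessages(systemPrompt: string, question: string): ChatMessage[] {
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: question.trim() },
  ];
}

// Run inference through the binding and return the answer text.
async function askAssistant(
  ai: AiBinding,
  systemPrompt: string,
  question: string
): Promise<string> {
  const result = await ai.run(MODEL, {
    messages: buildMessages(systemPrompt, question),
  });
  return result.response ?? "";
}
```

Inside a Worker, the binding configured in the Wrangler config (commonly exposed as `env.AI`) would be passed as the `ai` argument. Because inference runs in the same Worker that serves the site, no separate service has to be deployed.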

The Knowledge Architecture

zxtra doesn't browse the internet or hallucinate information. It only knows what we explicitly tell it.


The system prompt is built dynamically from our centralized data modules, the same data sources that power the website, so there is a single source of truth.

Benefits:

  • zxtra's knowledge updates automatically when we update website content
  • No hallucinations. The model can only reference information we've provided
  • Context awareness. We include the current page so zxtra can summarize what the visitor is viewing
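
The idea can be sketched as a small function that turns shared data into prompt text. This is only an illustration under stated assumptions: the `Service` and `PageContext` shapes and the `buildKnowledgePrompt` name are hypothetical, since the post does not describe the real data modules.

```typescript
// Hypothetical shape of an entry in a shared data module.
interface Service {
  name: string;
  summary: string;
}

// Hypothetical description of the page the visitor is viewing.
interface PageContext {
  path: string;
  title: string;
  summary: string;
}

// Turn the shared data (and, optionally, the current page) into the
// knowledge section of the system prompt.
function buildKnowledgePrompt(services: Service[], page?: PageContext): string {
  const lines: string[] = ["Services offered by ZSoftly:"];
  for (const service of services) {
    lines.push(`- ${service.name}: ${service.summary}`);
  }
  if (page) {
    lines.push(
      "",
      `The visitor is currently viewing "${page.title}" (${page.path}).`,
      `Page summary: ${page.summary}`
    );
  }
  return lines.join("\n");
}
```

Because the website and the prompt would both read from the same modules, editing a service description in one place changes what the page renders and what the assistant knows.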

The Chat Widget

The chat interface appears on every page as a floating button. It is non-intrusive, expands when clicked, and can be minimized without losing the conversation.

Key UX decisions:

  • Markdown support for AI responses
  • Copy button for easy response sharing
  • Suggested questions for first-time users
  • Rate limit indicators to set expectations

Security: Multiple Layers of Protection

An AI endpoint is an attractive target for abuse. Without protection, bots could drain API quotas, use our AI for unrelated queries, or attempt prompt injection attacks.

We implemented multiple layers:

Layer 1: Cloudflare Turnstile

Turnstile is Cloudflare's invisible CAPTCHA replacement. Unlike traditional CAPTCHAs, it doesn't require user interaction. It runs in the background and verifies that the visitor is human.

Key learning: Turnstile tokens are single-use. After verifying a token on the server, you must reset the widget to get a new token for the next request. We spent time debugging "verification failed" errors before realizing this.
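
Server-side verification can be sketched roughly as follows. The siteverify endpoint and its `secret`, `response`, and `remoteip` fields come from Cloudflare's Turnstile API. The function names and the way the secret is passed in are assumptions for this example.

```typescript
const SITEVERIFY_URL =
  "https://challenges.cloudflare.com/turnstile/v0/siteverify";

// Subset of the siteverify response used by this sketch.
interface SiteverifyResult {
  success: boolean;
  "error-codes"?: string[];
}

// Build the form-encoded body expected by the siteverify endpoint.
function buildSiteverifyBody(
  secret: string,
  token: string,
  remoteIp?: string
): URLSearchParams {
  const body = new URLSearchParams({ secret, response: token });
  if (remoteIp) {
    body.set("remoteip", remoteIp);
  }
  return body;
}

// Verify one token. A token can be redeemed only once, so submitting the
// same token again is expected to fail (Turnstile reports this with the
// "timeout-or-duplicate" error code).
async function verifyTurnstile(
  secret: string,
  token: string,
  remoteIp?: string
): Promise<boolean> {
  const res = await fetch(SITEVERIFY_URL, {
    method: "POST",
    body: buildSiteverifyBody(secret, token, remoteIp),
  });
  if (!res.ok) {
    return false;
  }
  const result = (await res.json()) as SiteverifyResult;
  return result.success === true;
}
```

On the client, the matching step is to call `turnstile.reset(widgetId)` after each message is sent, so the widget issues a fresh token for the next request.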

Layer 2: Rate Limiting

Even with bot protection, we limit request frequency:

  • 20 requests per minute per user
  • 100 requests per hour per user
  • 2-second minimum interval between requests

Rate limits are tracked using signed cookies, making them resistant to manipulation.
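
The three limits above can be expressed as one pure check over the timestamps of previously accepted requests. This is a sketch under assumptions: `RateLimitState` and `checkRateLimit` are hypothetical names, and the step of serializing the state into a cookie and signing it (for example with an HMAC) is left out.

```typescript
// Timestamps (ms since epoch) of requests accepted within the last hour.
// In a cookie-based design, this state would be serialized and signed.
interface RateLimitState {
  timestamps: number[];
}

interface RateLimitResult {
  allowed: boolean;
  reason: "interval" | "minute" | "hour" | null;
  state: RateLimitState;
}

const LIMITS = { perMinute: 20, perHour: 100, minIntervalMs: 2_000 };
const MINUTE_MS = 60_000;
const HOUR_MS = 3_600_000;

// Decide whether a request arriving at `now` is allowed, and return the
// state to store for the next request.
function checkRateLimit(state: RateLimitState, now: number): RateLimitResult {
  // Drop entries older than one hour; they no longer count.
  const recent = state.timestamps.filter((t) => now - t < HOUR_MS);
  const pruned: RateLimitState = { timestamps: recent };

  // Enforce the minimum gap since the last accepted request.
  const last = recent.length > 0 ? recent[recent.length - 1] : undefined;
  if (last !== undefined && now - last < LIMITS.minIntervalMs) {
    return { allowed: false, reason: "interval", state: pruned };
  }

  // Enforce the per-minute and per-hour ceilings.
  const inLastMinute = recent.filter((t) => now - t < MINUTE_MS).length;
  if (inLastMinute >= LIMITS.perMinute) {
    return { allowed: false, reason: "minute", state: pruned };
  }
  if (recent.length >= LIMITS.perHour) {
    return { allowed: false, reason: "hour", state: pruned };
  }

  return { allowed: true, reason: null, state: { timestamps: [...recent, now] } };
}
```

The returned `reason` is also what a widget could use to show a rate limit indicator, for example "please wait a moment" for `interval`.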

Layer 3: Input Validation

Basic but essential:

  • Maximum 2000 characters per message
  • Required fields must be present
  • JSON body must parse correctly
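
These checks could look roughly like the sketch below. The field names `message` and `turnstileToken` are assumptions, since the post does not list the request fields.

```typescript
const MAX_MESSAGE_LENGTH = 2000;

// Hypothetical request shape for the chat endpoint.
interface ChatRequest {
  message: string;
  turnstileToken: string;
}

interface ValidationResult {
  ok: boolean;
  error?: string;
  value?: ChatRequest;
}

// Validate the raw request body before any verification or inference runs.
function validateChatRequest(rawBody: string): ValidationResult {
  // 1. The JSON body must parse.
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return { ok: false, error: "Body must be valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, error: "Body must be a JSON object" };
  }

  // 2. Required fields must be present and non-empty strings.
  const { message, turnstileToken } = parsed as Record<string, unknown>;
  if (typeof message !== "string" || message.trim() === "") {
    return { ok: false, error: "message is required" };
  }
  if (typeof turnstileToken !== "string" || turnstileToken === "") {
    return { ok: false, error: "turnstileToken is required" };
  }

  // 3. Enforce the maximum message length.
  if (message.length > MAX_MESSAGE_LENGTH) {
    return { ok: false, error: `message exceeds ${MAX_MESSAGE_LENGTH} characters` };
  }

  return { ok: true, value: { message, turnstileToken } };
}
```

Running validation first keeps malformed or oversized requests from ever reaching Turnstile verification or the model.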

Layer 4: Strict System Prompt

The system prompt explicitly constrains the AI's behavior:

  • Only answer questions using information provided
  • If a question is not about ZSoftly, decline politely
  • If information is missing, direct to contact us
  • Never answer general knowledge questions unrelated to ZSoftly

This guards against prompt injection attacks, where users try to make the AI behave outside its intended purpose.
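
One way to keep such rules explicit and reviewable is to store them as data and render them into the prompt. The wording below paraphrases the four constraints listed above and is not the actual zxtra prompt. `RULES` and `buildSystemPrompt` are hypothetical names.

```typescript
// Behavioral rules, paraphrased from the constraints listed above.
const RULES: string[] = [
  "Only answer questions using the information provided below.",
  "If a question is not about ZSoftly, decline politely.",
  "If the information is missing, direct the visitor to contact us.",
  "Never answer general knowledge questions unrelated to ZSoftly.",
];

// Combine the numbered rules with the knowledge text into one system prompt.
function buildSystemPrompt(knowledge: string): string {
  const numbered = RULES.map((rule, i) => `${i + 1}. ${rule}`).join("\n");
  return [
    "You are zxtra, the ZSoftly website assistant.",
    "",
    "Rules:",
    numbered,
    "",
    "Knowledge:",
    knowledge,
  ].join("\n");
}
```

In a design like this, visitor input would travel only in the user message and never be concatenated into the system prompt, so the rules and the untrusted text stay in separate roles.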


Lessons Learned

What Worked Well

Edge deployment was the right choice. Response times are consistently under 2 seconds globally. No cold starts, no regional latency issues.

Dynamic system prompts pay off. By pulling from our centralized data modules, zxtra's knowledge stays current automatically.

Turnstile provides invisible protection. Users don't notice it, but it effectively blocks automated abuse.

Challenges We Faced

Single-use Turnstile tokens. This wasn't obvious to us from the documentation. After a successful verification, the token is consumed, so you need a new one for each request.

Balancing strictness with helpfulness. Too strict, and the AI refuses legitimate questions. Too loose, and it goes off-topic. Finding the right system prompt took iteration.

What We'd Do Differently

Add conversation history. Currently, each message is independent. Adding context from previous messages would make multi-turn conversations more natural.

Implement streaming responses. The full response is generated before being sent. Streaming would improve perceived latency for longer responses.

Build analytics. We don't yet track what questions visitors ask. This data would help us improve both the AI and our content.


The Business Case for AI Assistants

Building zxtra took about a week of focused development time. Ongoing cost is minimal. Cloudflare Workers AI has a generous free tier, and our usage is well within it.

What we get in return:

| Benefit | Impact |
| --- | --- |
| 24/7 availability | Visitors get answers any time |
| Instant responses | No waiting for a human to be available |
| Consistent quality | Every response uses the same knowledge base |
| Reduced friction | Simple questions don't require forms |
| Scalability | 10 or 10,000 visitors, the AI handles it |

Should You Build an AI Assistant?

Consider it if:

  • Visitors frequently ask similar questions
  • Your content spans multiple pages or topics
  • You want to provide support outside business hours
  • You're already on Cloudflare (makes deployment trivial)

Hold off if:

  • Your queries require real-time data (inventory, frequently changing pricing)
  • You need transactional capabilities (placing orders, making changes)
  • Compliance requirements restrict AI usage

Try zxtra Yourself

We built zxtra not just for our visitors, but as a demonstration of what we can build for your business. Whether you need a customer service chatbot, an internal knowledge assistant, or a specialized AI tool, the architecture patterns are similar.

Try the full zxtra experience →

Or click the chat bubble in the bottom-right corner of any page to see it in action.


What's Next

We're continuing to evolve zxtra:

  • Conversation memory across multiple messages
  • Streaming responses as they're generated
  • Usage analytics to understand what visitors are asking
  • Expanded knowledge and capabilities

If you're interested in building something similar for your business, we'd love to chat. Our AI Agents & Chatbots service covers custom development from design through deployment.

Book a call to discuss your AI project →

