RAG chatbot — client discovery questionnaire

A business-friendly questionnaire to scope an AI chatbot built on your internal documents (Standard Operating Procedures, policies, handbooks, training material, etc.).

You do not need to know the underlying technology to answer these questions. Wherever a question is technical, we have given a plain explanation and an "if you are not sure, the most common answer is..." hint. You can leave answers blank and we will work them out together on the discovery call.

The questionnaire mirrors the seven decisions we walk through with every client. Allow ~45 minutes the first time through, or send it round internally to the relevant department heads.

Section 1 — About the use case

In one or two sentences, what should the chatbot help your staff do? (For example: "Answer HR policy questions so HR is not asked the same things every week.")
Which departments or staff groups are the primary audience? (HR only, all staff, branch managers, etc.)
What would make this project a clear success, six months after go-live? (e.g. "HR inbox down by half", "new joiners onboarded without printed manuals", "audit can trace any policy answer back to source".)
Is there a regulator, auditor, or internal governance committee that needs to bless this before it goes live? If yes, who?
Are there well-known no-go zones we should know about up front? (e.g. "do not let it answer medical questions", "do not surface anything from the legal team's folder".)

Section 2 — Your documents

Roughly how many documents are in scope for the first phase? Choose a bucket: under 1,000 / 1,000-5,000 / 5,000-25,000 / 25,000-100,000 / 100,000+.
Where do these documents live today? (SharePoint, intranet, shared drive, Confluence, Notion, individual OneDrives, a public website, paper that needs scanning, etc.) List as many as apply.
What formats are they in? (Word, PDF, scanned PDF, PowerPoint, Excel, web pages, video transcripts, other.)
Are the documents you have today actually the up-to-date, approved versions? Or is there cleanup needed before they should be used as a source of truth? Honest answer please — this is where most projects underestimate effort.
Who in your organisation decides a document is "approved" to be answered from? A named person or team is best.
How often do these documents change? (Rarely, monthly, weekly, daily, varies by document type.)
Roughly how many new or updated documents will land per month after go-live?
Are any of the documents in languages other than English? Which languages, and what share of the total?

Section 3 — Your users

How many staff in total will have access to the chatbot?
Of those, how many do you expect to use it on a typical working day? (A guess is fine — "most of HR daily, occasional use by everyone else" is enough.)
Is there a peak time when many people would use it at once? (Monday morning, end-of-month, after a policy announcement, etc.)
Should everyone who can log in see the same answers, or do some documents belong only to certain roles? (For example: only HR should see disciplinary procedures; only managers should see salary-band guidance.)
Which system manages your staff logins today? (Microsoft Entra ID, formerly Azure AD, is the most common; Okta, Google Workspace, or a custom system are also possible.)
Are external users (contractors, agency staff, suppliers) expected to use the chatbot?
How long should chat conversations be kept, and who in your organisation is allowed to see them? (e.g. "kept for 30 days, only the user themselves and an admin can see them".)
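
For technical readers: the question above about whether some documents belong only to certain roles usually translates into filtering the document set by the asking user's roles before the chatbot searches it. The following is a minimal sketch in Python, not a description of any specific product. The names `Document`, `allowed_roles`, and `visible_documents` are hypothetical, and treating an empty role set as "visible to everyone" is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    """A source document tagged with the roles allowed to see it (hypothetical model)."""
    title: str
    # Assumption for this sketch: an empty set means "visible to every logged-in user".
    allowed_roles: frozenset = field(default_factory=frozenset)


def visible_documents(docs, user_roles):
    """Return only the documents this user may be answered from."""
    roles = set(user_roles)
    return [d for d in docs if not d.allowed_roles or d.allowed_roles & roles]


docs = [
    Document("Annual leave policy"),
    Document("Disciplinary procedures", frozenset({"hr"})),
    Document("Salary-band guidance", frozenset({"manager"})),
]

# A general staff member sees only the unrestricted document;
# an HR user additionally sees the HR-only document.
print([d.title for d in visible_documents(docs, {"staff"})])
print([d.title for d in visible_documents(docs, {"staff", "hr"})])
```

Filtering before the search, rather than hiding results afterwards, means a restricted document can never influence an answer shown to someone who is not allowed to read it.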

Section 4 — Where staff use the chatbot

Where would you like staff to interact with the chatbot? Tick all that apply:
- Inside Microsoft Teams (most popular if Teams is your daily driver)
- As a button on your existing intranet or SharePoint home page
- A dedicated internal website (e.g. ai.yourcompany.com)
- A mobile app (note: significantly more effort)
- Other (please describe)
Do you want the chatbot to carry your branding, our branding, or no branding?
Is there an existing tool you currently use for any of this that we should plan to replace or integrate with?

Section 5 — How the AI itself should be hosted

This is the question with the biggest range of cost outcomes. If you are not sure, "no strong preference" is a fine answer and we will recommend the right shape on the discovery call.

Which of these statements best matches your organisation's position on AI data handling?
- A. "We are comfortable with leading commercial AI providers (such as Anthropic, OpenAI, Google) processing our documents under enterprise agreements, as long as data is not used for training and is hosted in a UK / EU region." (Fastest and highest quality. Most clients land here.)
- B. "We would prefer the AI model to run in our own cloud tenant or a region we control, even at some quality cost."
- C. "Our policy or regulator requires that no data ever leaves our own premises. We need to host everything ourselves on hardware we own."
- D. Not sure yet — we want to discuss the trade-offs.
If you chose B or C above: do you already have GPU servers / AI infrastructure in place, or would this project also be your first AI hosting setup?
If you have a regulator or compliance regime that applies here, which one(s)? (GDPR / DPIA process, ISO 27001, SOC 2, sector-specific such as PCI-DSS, HIPAA, FCA, etc.)

Section 6 — Safety and governance

Should the chatbot always cite which document each answer came from? (Our default: yes, always.)
What should the chatbot do when it cannot find a relevant source? (Our default: say "I do not have a source for that" and offer to escalate to a human contact.)
Are there topics it should explicitly refuse to answer even if a relevant document exists? (e.g. "do not answer legal advice questions, even though the legal handbook is in scope".)
Do you need a full audit log of every question asked and answer given, with the user's identity? (Most clients want this; some regulators require it.)
Are there people, customer names, or commercial figures that must never appear in an answer, even if they are in the source documents?
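
For technical readers: the defaults described in this section (always cite, say so when there is no source, refuse listed topics) can be pictured as a small decision step that runs before any answer is shown. The sketch below is illustrative Python under stated assumptions. `Passage`, `build_reply`, the 0-to-1 relevance `score`, and the `min_score` threshold are hypothetical names and values, and a real system would pass the relevant passages to an AI model rather than return the top passage verbatim.

```python
from dataclasses import dataclass

NO_SOURCE_MESSAGE = "I do not have a source for that."
REFUSAL_MESSAGE = "I am not able to answer questions on this topic."


@dataclass
class Passage:
    """A retrieved snippet and the document it came from (hypothetical model)."""
    document_title: str
    text: str
    score: float  # relevance between 0 and 1; the scale is an assumption


def build_reply(passages, refused_topics, topic, min_score=0.5):
    """Refuse listed topics, fall back when no source is relevant, otherwise cite."""
    if topic in refused_topics:
        # Refuse even if a relevant document exists, and offer a human contact.
        return {"answer": REFUSAL_MESSAGE, "sources": [], "escalate": True}
    relevant = [p for p in passages if p.score >= min_score]
    if not relevant:
        return {"answer": NO_SOURCE_MESSAGE, "sources": [], "escalate": True}
    best = max(relevant, key=lambda p: p.score)
    # Every answer carries the titles of the documents it was drawn from.
    sources = sorted({p.document_title for p in relevant})
    return {"answer": best.text, "sources": sources, "escalate": False}


passages = [
    Passage("HR handbook", "Full-time staff receive 25 days of annual leave.", 0.9),
    Passage("Archived memo", "Car park closed on Friday.", 0.2),
]
print(build_reply(passages, {"legal advice"}, "annual leave"))
print(build_reply([], {"legal advice"}, "pension transfers"))
```

The order of the checks reflects the questions above: a refused topic is blocked first, so the presence of a matching document never overrides an explicit refusal.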

Section 7 — After go-live

Who in your organisation will own the chatbot day to day after we launch? A named person, even informally, makes a large difference to long-term success.
Who will be responsible for keeping the source documents up to date? (Same person? Different teams per document set?)
How quickly should we respond to a problem report? Pick a tier:
- Bronze: next business day response, 5 business days to resolve.
- Silver: 4 business-hour response, 2 business days to resolve.
- Gold: 1 business-hour response, next business day to resolve, with weekend cover.
Would you prefer that we run the system for you ("managed service" — recommended for most clients under 200 staff), share the responsibility, or take a full handover at go-live?
Is there a date by which the chatbot needs to be live? Is that date driven by something specific (a board meeting, an audit, a contract, a season)?

Section 8 — Budget and procurement

Do you have an approved budget range for the initial build? If yes, sharing it makes the proposal materially better — we can scope to fit, rather than guessing.
Do you have an approved monthly run-cost budget?
Is this a one-time project, or the first phase of a wider AI programme on your side?
Are there procurement requirements (preferred-supplier list, competitive tender, security questionnaire, insurance evidence) we should know about?

What happens next

Once you return this questionnaire, we will:

Send back a written summary of what we heard, so we can correct any misunderstandings before pricing.
Hold a 45-minute discovery call to walk through anything ambiguous.
Send a written proposal with two costs:
- One-off implementation cost, broken down by phase and with a wall-clock timeline.
- Steady-state monthly cost, broken down into fixed and usage-based lines, with a worked example for your size of organisation.

We will also include one or two alternative shapes (e.g. "here is what it costs if you self-host instead", or "here is what it costs if you start with HR only and add other departments in a phase 2") so you can see the trade-offs, not just one number.