The Problem With Generic AI Tools
Most companies that try to use AI for internal knowledge management run into the same wall: the tools that work well — ChatGPT, Claude, Gemini — do not know anything about your business. They cannot read your contracts, your internal policies, your product manuals, your case archive or your compliance documentation. And if you upload those documents to a consumer AI tool, you have just sent sensitive business data to a third-party server you do not control.
PRAG solves this. Short for Private RAG, it is a self-hosted AI document intelligence system that runs entirely within your own infrastructure. Your documents never leave your server. Your queries are processed privately. And the AI that answers your questions is grounded in your actual documents — not trained on generic internet data.
What Is RAG?
RAG stands for Retrieval-Augmented Generation. It is the architectural pattern that makes AI genuinely useful for private knowledge bases.
Here is how it works: instead of trying to train an AI model on your documents (expensive, slow, and requiring retraining every time documents change), RAG keeps your documents in a searchable index. When a user asks a question, the system retrieves the most relevant passages from that index and passes them to a language model as context. The model then generates an answer grounded in those specific passages — with source citations so users can verify every claim.
The result is an AI that can accurately answer questions about your specific business, your specific documents and your specific domain — without any fine-tuning, and without any of your data being sent to external AI providers.
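The retrieve-then-generate flow above can be sketched in a few lines. This is an illustrative toy, not PRAG's implementation: the retrieval step is stood in for by a simple word-overlap score, where a real system would query a vector index and send the prompt to a language model.

```python
# Toy sketch of the RAG pattern: retrieve relevant passages, then build a
# prompt that grounds the model in them. All logic here is a placeholder
# for a real retriever and a real LLM call.

def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank indexed passages by word overlap with the question (a stand-in
    for semantic search)."""
    q_words = set(question.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Ground the model: instruct it to answer only from the retrieved
    passages, with numbered citations."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (f"Answer using ONLY the sources below, citing [n].\n"
            f"{context}\n\nQuestion: {question}")

docs = [
    "The supplier agreement may be terminated with 90 days written notice.",
    "Contractors on-site must complete a safety induction.",
]
question = "What notice period terminates the supplier agreement?"
passages = retrieve(question, docs)
prompt = build_prompt(question, passages)
```

Because the answer is generated from the retrieved passages rather than from the model's training data, every claim can be traced back to a cited source.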
What PRAG Delivers
A Private, Self-Hosted Knowledge Base
PRAG runs on your own server — cloud or on-premise. Documents are indexed and stored locally. Queries are processed locally. Nothing is shared externally. This makes PRAG suitable for companies with strict data governance requirements: law firms, financial services, healthcare providers, government contractors and any organisation handling confidential information.
Natural Language Search Across All Your Documents
Upload PDFs, Word documents, text files, manuals, contracts, reports — and immediately start asking questions in plain language. No need to remember which folder something is in, no need to use specific search syntax. Ask the way you would ask a colleague, and get a sourced answer.
Examples of what this looks like in practice:
- What are the termination clauses in the 2024 supplier agreements?
- What does our health and safety policy say about contractors on-site?
- Summarise the findings from last quarter’s compliance audit.
- Which product specifications mention the ISO 9001 certification?
Hybrid Search for Maximum Recall
PRAG combines two retrieval methods to ensure nothing is missed. Vector semantic search finds passages that match the meaning of your question, even if the exact words are different. BM25 keyword search catches exact term matches — critical for document numbers, names, product codes and regulatory references. The two signals are combined into a single ranked result set, giving consistently high retrieval accuracy across diverse document types.
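One common way to merge two ranked lists like this is reciprocal rank fusion (RRF); whether PRAG uses RRF or a weighted score blend is an implementation detail, so treat this as a sketch of the general technique rather than PRAG's exact method.

```python
# Reciprocal rank fusion: a document's fused score is the sum of
# 1 / (k + rank) over every ranking it appears in. Documents that rank
# well in either list (or both) rise to the top.

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_c", "doc_b"]  # semantic (meaning-based) ranking
bm25_hits = ["doc_b", "doc_a", "doc_d"]    # keyword (exact-match) ranking
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Here `doc_a` wins because it scores well on both signals, even though neither method ranked it uniquely first by a wide margin.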
Multiple AI Models, One System
PRAG is not locked to a single AI provider. It supports routing queries to different language models depending on the task: fast, lightweight models for quick lookups; larger, more capable models for complex analysis; reasoning models that show their chain of thought for tasks that require step-by-step logic. When better models become available, they can be integrated without rebuilding the system.
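In practice, routing like this reduces to a task-to-model lookup with a safe fallback. The model names and task labels below are illustrative assumptions, not PRAG's actual configuration.

```python
# Hypothetical routing table: task type -> model identifier. The names are
# placeholders; a real deployment would map to concrete provider models.
ROUTES = {
    "lookup": "small-fast-model",        # quick factual lookups
    "analysis": "large-capable-model",   # complex multi-document analysis
    "reasoning": "chain-of-thought-model",  # step-by-step logic tasks
}

def route(task: str, default: str = "small-fast-model") -> str:
    """Pick a model for the task; unknown tasks fall back to the default."""
    return ROUTES.get(task, default)
```

Swapping in a newer model is then a one-line change to the table, with no changes to the indexing or retrieval layers.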
Oracle Mode: Automatic Quality Control
The Oracle is PRAG’s built-in hallucination guard. Every answer is automatically evaluated by a secondary AI critic that scores it for accuracy and flags any claims that appear invented or unverifiable from the source documents. If an answer falls below the quality threshold, the system automatically generates a corrected version — grounded strictly in the retrieved passages — before returning anything to the user.
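The answer-critique-regenerate loop can be sketched as follows. The critic, the regeneration step and the 0.8 threshold are all assumptions made for illustration; the toy critic simply checks whether the answer appears verbatim in a retrieved passage.

```python
# Sketch of a hallucination-guard loop: score the answer against the
# retrieved passages, and regenerate a grounded version if it fails.
# Threshold and helper behaviour are illustrative assumptions.

THRESHOLD = 0.8

def critic(answer: str, passages: list[str]) -> tuple[float, list[str]]:
    """Toy critic: full marks if the answer is supported verbatim by a
    passage, zero (and a flagged claim) otherwise."""
    supported = any(answer in p for p in passages)
    return (1.0, []) if supported else (0.0, [answer])

def regenerate(passages: list[str], flagged: list[str]) -> str:
    """Toy fallback: answer strictly from the top retrieved passage."""
    return passages[0]

def oracle_check(answer: str, passages: list[str]) -> tuple[str, float]:
    """Score the answer; if it falls below the threshold, regenerate a
    grounded version and re-score before returning."""
    score, flagged = critic(answer, passages)
    if score < THRESHOLD:
        answer = regenerate(passages, flagged)
        score, flagged = critic(answer, passages)
    return answer, score

passages = ["Notice period is 90 days."]
answer, score = oracle_check("Notice period is 30 days.", passages)
```

In this run the invented "30 days" claim fails the check, so the corrected, passage-grounded answer is returned instead.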
For organisations where answer accuracy is not optional — legal teams, compliance departments, medical practices — this is a critical safeguard. You are not just getting an AI answer; you are getting an AI answer that has been checked.
Project Separation
PRAG supports multiple independent projects on a single installation. Each project has its own document index, its own conversation history, its own model settings. A law firm can run separate projects for different practice areas. A company can separate HR policy documents from technical product documentation. Access control keeps each team in their own space.
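Per-project isolation of this kind might be modelled as self-contained project records, each owning its index location, model settings and access list. The field names here are assumptions for illustration, not PRAG internals.

```python
# Illustrative per-project state: each project bundles its own index,
# model configuration and allowed users, so teams never see each other's
# documents. Schema is hypothetical.
from dataclasses import dataclass, field

@dataclass
class Project:
    name: str
    index_path: str                 # this project's own document index
    model: str = "default"          # per-project model settings
    allowed_users: set[str] = field(default_factory=set)

    def can_access(self, user: str) -> bool:
        """Access control: only listed users may query this project."""
        return user in self.allowed_users

hr = Project("hr-policies", "/data/hr.index", allowed_users={"alice"})
eng = Project("product-docs", "/data/eng.index", allowed_users={"bob"})
```

Because nothing is shared between `Project` instances, separating HR policies from engineering documentation is a matter of creating two records rather than running two installations.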
Full Interaction Logging
Every question and answer is logged, including which documents were retrieved, which model was used, what the Oracle scored, and whether a correction was applied. This audit trail is essential for regulated industries and for identifying where the system can be improved.
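An audit record covering those fields could look like the sketch below. The schema is an assumption made for illustration; PRAG's actual log format may differ.

```python
# Hypothetical audit-log entry: one JSON record per interaction, capturing
# the fields an auditor would need to reconstruct what happened.
import datetime
import json

def log_interaction(question: str, answer: str, retrieved_ids: list[str],
                    model: str, oracle_score: float,
                    correction_applied: bool) -> str:
    """Serialise one question/answer interaction as a JSON audit record."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
        "retrieved_documents": retrieved_ids,
        "model": model,
        "oracle_score": oracle_score,
        "correction_applied": correction_applied,
    }
    return json.dumps(record)

entry = log_interaction("What is the notice period?",
                        "90 days, per the supplier agreement.",
                        ["supplier_agreement_2024.pdf"],
                        "large-capable-model", 0.95, False)
```

Structured JSON records like this can be shipped to whatever log store the organisation already audits.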
Who Is PRAG For?
Law Firms and Legal Departments
Legal professionals spend significant time searching through contracts, rulings and precedents. PRAG allows instant natural language search across a private document archive, with page-level citations for every answer.
Financial Services
Compliance teams, analysts and advisors can query internal policy libraries, regulatory filings, risk frameworks and client documentation — privately, accurately and with a full audit log.
Healthcare and Pharma
Clinical guidelines, drug interaction databases, trial documentation and patient protocols are high-stakes documents that must be queried with complete accuracy. PRAG’s Oracle mode is specifically designed for contexts where wrong answers have real consequences.
Enterprise and Manufacturing
Technical manuals, maintenance procedures, engineering specifications and quality control documents are often vast and poorly indexed. PRAG makes them immediately searchable in plain language, reducing the time engineers and technicians spend hunting for information.
Professional Services
Consultancies, accountancy firms and agencies managing large client document libraries can use PRAG to instantly surface relevant precedents, methodologies and reports from past engagements.
How PRAG Compares to Sending Documents to ChatGPT
Uploading documents to a consumer AI tool is fast and easy — but it comes with significant downsides for business use:
- Data privacy: Your documents are processed on external servers. PRAG keeps everything on your infrastructure.
- Document limits: Consumer tools limit how much you can upload per session. PRAG indexes your entire document library permanently.
- Persistence: Consumer tools do not maintain a knowledge base between sessions. PRAG does.
- Audit trail: Consumer tools do not log who asked what and what they were told. PRAG does.
- Quality control: Consumer tools do not tell you when they are making things up. PRAG’s Oracle mode does.
Technical Foundation
PRAG is built on a modern, production-grade stack: FastAPI for the backend, FAISS for vector indexing, fastembed for local CPU embeddings, and a multi-provider LLM routing layer. Embedding can run entirely on-server without any external API calls, making fully air-gapped deployment possible for the most sensitive environments. The system deploys on a standard Linux server with no specialised hardware required for most use cases.
Getting Started
PRAG can be deployed on your infrastructure or in a private cloud environment of your choice. We handle setup, configuration and integration with your existing document workflows.
If your team is spending time searching through documents that an AI should be able to answer in seconds — get in touch. We will assess your document volumes, access control requirements and query patterns, and give you a clear picture of what PRAG would look like in your environment.
For a technical deep dive into the RAG architecture, hybrid search and Oracle mode, read our engineering article.