What you’ll learn
  • How to format your documents properly
  • Examples: How to Format Your Documents
  • How to update your prompt for the best knowledge-base search results

How RAG (Retrieval-Augmented Generation) Works

Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by supplementing them with external knowledge. When a user submits a query, the system retrieves relevant chunks of information from the knowledge base and passes them to the model alongside the query. Grounding the model in your documents helps make responses more accurate and contextually relevant.
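
The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration, not the platform's implementation: simple word overlap stands in for a real retriever, and `tokenize`, `retrieve`, `build_prompt`, and the sample chunks are hypothetical names chosen for this example.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by how many query words they share (toy scoring)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only chunks that share at least one word with the query.
    return [chunk for score, chunk in scored[:top_k] if score > 0]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Place the retrieved chunks next to the question for the model."""
    context = "\n".join(f"- {chunk}" for chunk in retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge-base chunks.
knowledge_base = [
    "To reset your password, open Settings and choose Reset Password.",
    "The Basic plan includes 10 GB of storage.",
    "Support is available by email on weekdays.",
]
prompt = build_prompt("How do I reset my password?", knowledge_base)
print(prompt)
```

Only the password chunk shares words with the question, so it is the only context passed to the model. A production retriever would likely use embeddings rather than word overlap.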

How to Format Your Documents

  • Use clear structure: Organize content with headings, subheadings, and bullet points to improve navigation.
  • Keep it simple: Write in plain, jargon-free language so users with minimal technical knowledge can understand.
  • Optimize for retrieval: Ensure important information is stated explicitly in text. Avoid burying key details in images or tables without descriptive text.
  • Language considerations: While the system supports multiple languages, documents in English are easier to search and retrieve. If your knowledge base is in English, ensure your prompts specify that search queries should be in English for optimal results.
  • Supported formats: Common formats such as DOCX, Markdown, Excel, and PDF are supported. Within each file, arrange and format text consistently so that similar documents follow the same layout.

Examples: How to Format Your Documents

Here are some concrete examples to help you apply the best practices:

Example 1: Clear Structure

Bad:
System requirements: Windows 10, 8 GB RAM, Installation steps: Download the file, Run installer, Follow prompts, Done.
Good:
### System Requirements
- Windows 10
- 8 GB RAM

### Installation Steps
1. Download the file
2. Run the installer
3. Follow the prompts
4. Installation complete

Example 2: Simple, Jargon-Free Language

Bad:
The application leverages a multi-threaded paradigm for expedited data processing.
Good:
The application uses multiple threads to process data faster.

Example 3: Optimize for Retrieval

Bad:
![Installation steps screenshot](install.png)
Good:
Installation Steps:
1. Download the file from our official site.
2. Double-click the installer.
3. Accept the license agreement.
4. Choose the installation folder.

Important Note on Table Retrieval

While text-based tables (Markdown, CSV, Excel) are far better than images for retrieval, they still require careful design for best performance:
  • Keep headers clear and descriptive: Avoid abbreviations that may confuse the search. For example, use Storage (GB) instead of just Stor.
  • Use consistent units: Always specify units in the header or the value (e.g., 10 GB vs just 10).
  • Avoid merging cells: Complex merged cells in spreadsheets make retrieval harder. Keep one value per cell.
  • Add context in surrounding text: Before or after the table, explain what the table contains. For example: “The following table shows pricing plans and their features.”
Better Example:
The following table shows pricing plans and their features:

| Plan   | Price | Storage (GB) | Support        |
|--------|-------|--------------|----------------|
| Plan A | $10   | 10           | Email support  |
| Plan B | $20   | 50           | Phone support  |
| Plan C | $50   | 200          | 24/7 support   |
This way, both the table structure and the explanatory text are retrievable, making searches more reliable.

Example 4: Supported Formats

Bad:
All information stored in a single large image file.
Good:
Use DOCX, PDF, or Markdown with headings and text descriptions so search tools can retrieve information easily.

How to Tune Knowledge Base Performance

When your knowledge base isn’t returning expected results for certain queries, here are strategies to improve performance:
  • Rephrase and duplicate content: If a particular question isn’t retrieving the right answer, state the same information in several different ways. For example, if users ask “How to reset password?” but get no results, add variations like:
    • “How do I reset my password?”
    • “Steps to reset forgotten password”
    • “Password reset procedure”
    • “I forgot my password, what should I do?”
  • Include common user phrasings: Monitor what questions users actually ask and include those exact phrasings in your documents alongside the formal content.
  • Create FAQ-style entries: For critical information, create dedicated Q&A sections that mirror how users naturally ask questions.

Example: Improving Retrieval Through Rephrasing

Original content (might not be found easily):
### User Authentication
The system uses a secure authentication mechanism...
Improved content (with multiple phrasings):
### User Authentication / Login Issues / Sign-in Problems

The system uses a secure authentication mechanism...

**Common Questions:**
- How do I log in?
- What do I do if I can't sign in?
- Login not working - what should I do?
- I forgot my password - how to reset?

**Answer:** If you're having trouble logging in...
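
One way to keep such rephrasings consistent across many topics is to generate the section from a list of question variants. The sketch below is only an illustration: `render_faq_section` is a hypothetical helper, and its output layout simply mirrors the improved example above.

```python
def render_faq_section(titles: list[str], body: str,
                       questions: list[str], answer: str) -> str:
    """Build a Markdown section that pairs formal content with user phrasings."""
    lines = [f"### {' / '.join(titles)}", "", body, "", "**Common Questions:**"]
    lines += [f"- {question}" for question in questions]
    lines += ["", f"**Answer:** {answer}"]
    return "\n".join(lines)

section = render_faq_section(
    titles=["User Authentication", "Login Issues", "Sign-in Problems"],
    body="The system uses a secure authentication mechanism...",
    questions=["How do I log in?", "What do I do if I can't sign in?"],
    answer="If you're having trouble logging in...",
)
print(section)
```

Collecting the question variants from real user queries, as suggested earlier, keeps the generated sections aligned with how people actually ask.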

How to Update Your Prompt for Best Knowledge-Base Search

  • Clarify tool usage: In your system prompt, explicitly state under which conditions the knowledge-base search tool should be triggered.
  • Language consistency: Always specify that the query language must match the document language. For example, if documents are in English, include: “The knowledge-base-search tool query must be in English.” This is crucial for optimal search performance.
  • Be explicit: Provide clear instructions for how the retrieval tool should parse and interpret queries.
  • Context matters: Tailor prompts to explain how retrieved knowledge should be used in the response (e.g., grounding answers, summarizing docs, or citing evidence).
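
The points above can be combined into one reusable prompt fragment. This is a minimal sketch under stated assumptions: `build_system_prompt`, the base prompt, and the trigger conditions are hypothetical, and only the language instruction is taken from this page.

```python
def build_system_prompt(base_prompt: str, trigger_conditions: list[str],
                        document_language: str) -> str:
    """Append knowledge-base search rules to an existing system prompt."""
    rules = ["Use the knowledge-base-search tool when:"]
    rules += [f"- {condition}" for condition in trigger_conditions]
    # Language consistency: the query language must match the documents.
    rules.append(
        f"The knowledge-base-search tool query must be in {document_language}."
    )
    # Context: tell the model how to use what it retrieves.
    rules.append("Ground your answer in the retrieved text and cite the source document.")
    return base_prompt.rstrip() + "\n\n" + "\n".join(rules)

system_prompt = build_system_prompt(
    base_prompt="You are a support assistant.",
    trigger_conditions=[
        "the user asks about product features or pricing",
        "the user asks how to perform a task in the product",
    ],
    document_language="English",
)
print(system_prompt)
```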

Example Prompt Update for English Knowledge Base

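If your knowledge base is primarily in English, add this to your system prompt:

When using the knowledge-base-search tool, ensure all search queries are formulated in English, as the knowledge base documents are in English. This will provide the best search results and retrieval accuracy.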

Knowledge Base Improvements

The Knowledge Base system has been upgraded with several improvements to enhance performance, reliability, and multilingual support. These enhancements improve how your AI Agent retrieves and utilizes information.

Improved Context Retrieval
The system now uses a combination of semantic (vector-based) search and keyword-based search to retrieve more relevant information from your documents. This ensures that responses are more accurate and context-aware.
Queries will return results only after document processing is fully completed.
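
This page does not state how the semantic and keyword result lists are combined. One common technique for merging ranked lists is reciprocal rank fusion (RRF), sketched below purely as an illustration of hybrid search. The document IDs and the constant `k = 60` are assumptions, and the platform may use a different method.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each document scores the sum of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results from the two search methods, best match first.
semantic_results = ["doc_b", "doc_a", "doc_c"]
keyword_results = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([semantic_results, keyword_results])
print(fused)  # ['doc_a', 'doc_b', 'doc_d', 'doc_c']
```

Documents that rank well in both lists (`doc_a`, `doc_b`) rise above documents found by only one method.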

Advanced Chunking Strategies
A new set of chunking strategies has been introduced to improve how documents are split and indexed. These strategies are more context-aware and help preserve the meaning of the content.
This results in:
  • Better quality responses
  • More complete and readable context
  • Reduced chances of broken or incomplete information
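
The specific chunking strategies are not described here. As one illustration of what "context-aware" splitting can mean, the sketch below keeps each heading together with the text beneath it instead of cutting at a fixed character count. `chunk_by_heading` is a hypothetical helper, not the platform's algorithm, and it assumes Markdown-style headings.

```python
def chunk_by_heading(markdown_text: str) -> list[str]:
    """Split Markdown so each chunk keeps its heading with its body."""
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown_text.splitlines():
        # A new heading closes the chunk collected so far.
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]

document = """### System Requirements
- Windows 10
- 8 GB RAM

### Installation Steps
1. Download the file
2. Run the installer
"""
chunks = chunk_by_heading(document)
for chunk in chunks:
    print(chunk, end="\n---\n")
```

This is also why the clear heading structure recommended earlier matters: headings give a splitter natural boundaries.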


Durable Ingestion System
The ingestion pipeline has been improved to handle large volumes of document uploads efficiently.
This includes:
  • Reliable processing under high load
  • Retry mechanisms for failed uploads
  • Prevention of duplicate processing
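
The retry and duplicate-prevention behavior can be pictured with a small sketch. It is an assumption-laden illustration, not the actual pipeline: `ingest`, `flaky_process`, the SHA-256 content hash used to detect duplicates, and the exponential backoff are all choices made for this example.

```python
import hashlib
import time
from typing import Callable

def ingest(documents: list[bytes], process: Callable[[bytes], None],
           max_attempts: int = 3, base_delay: float = 0.5) -> dict[str, int]:
    """Process each unique document once, retrying failed attempts."""
    seen_hashes: set[str] = set()
    stats = {"processed": 0, "duplicates": 0, "failed": 0}
    for content in documents:
        # Identical content yields the same hash, so it is skipped.
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen_hashes:
            stats["duplicates"] += 1
            continue
        seen_hashes.add(digest)
        for attempt in range(1, max_attempts + 1):
            try:
                process(content)
                stats["processed"] += 1
                break
            except Exception:
                if attempt == max_attempts:
                    stats["failed"] += 1
                else:
                    # Wait longer after each failure (exponential backoff).
                    time.sleep(base_delay * 2 ** (attempt - 1))
    return stats

calls = {"count": 0}

def flaky_process(content: bytes) -> None:
    """Hypothetical processor that fails on its first call only."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("temporary failure")

stats = ingest([b"guide", b"guide", b"faq"], flaky_process, base_delay=0.0)
print(stats)  # {'processed': 2, 'duplicates': 1, 'failed': 0}
```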


Fallback Document Extraction
A fallback extraction mechanism has been added to handle edge cases where standard document parsing may fail. This ensures better support for complex or non-standard files, especially PDFs.
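
The general pattern behind a fallback mechanism is to try extractors in order until one succeeds. The sketch below shows that pattern only. The two decoders are stand-ins for real document parsers, and `extract_text`, `strict_utf8`, and `lenient_latin1` are hypothetical names.

```python
from typing import Callable

Extractor = Callable[[bytes], str]

def extract_text(content: bytes, extractors: list[Extractor]) -> str:
    """Try each extractor in order and return the first non-empty result."""
    errors: list[str] = []
    for extractor in extractors:
        try:
            text = extractor(content)
        except Exception as error:
            errors.append(f"{extractor.__name__}: {error}")
            continue
        if text.strip():
            return text
        errors.append(f"{extractor.__name__}: empty result")
    raise ValueError("all extractors failed: " + "; ".join(errors))

def strict_utf8(content: bytes) -> str:
    """Stand-in for a standard parser: fails on invalid UTF-8."""
    return content.decode("utf-8")

def lenient_latin1(content: bytes) -> str:
    """Stand-in for a fallback parser: accepts any byte sequence."""
    return content.decode("latin-1")

# The byte 0xE9 is not valid UTF-8 here, so the fallback is used.
text = extract_text(b"caf\xe9 menu", [strict_utf8, lenient_latin1])
print(text)
```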

Multi-language Support
The Knowledge Base now supports improved indexing and retrieval for multiple languages, including:
  • Bengali (Bangla)
  • Japanese
Selecting the correct language while creating the knowledge base is critical for achieving accurate results. This is especially important for Bengali content.

Document Processing Behavior
  • Documents must reach a Completed status before they can be used in queries
  • Retrieved responses will include relevant chunks from the uploaded documents
  • If a document or knowledge base is deleted, its data will no longer be available for retrieval
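
Because documents are not searchable until they reach Completed, a client may want to wait before sending queries. The following sketch polls a status callback. `wait_until_completed` and `get_status` are hypothetical, no real API call is shown, and the "Processing" status string is an assumption; only "Completed" comes from this page.

```python
import time
from typing import Callable

def wait_until_completed(get_status: Callable[[], str],
                         poll_interval: float = 2.0,
                         timeout: float = 300.0) -> bool:
    """Poll until the document reports Completed, or give up after timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == "Completed":
            return True
        time.sleep(poll_interval)
    return False

# Simulated status sequence standing in for a real status lookup.
statuses = iter(["Processing", "Processing", "Completed"])
ready = wait_until_completed(lambda: next(statuses),
                             poll_interval=0.0, timeout=5.0)
print(ready)  # True
```

In real use, `get_status` would wrap whatever status lookup your integration provides, and queries would be sent only after the function returns `True`.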


Migration Notes
Due to system upgrades, existing users must re-select their knowledge base when configuring agents.
Failure to do so may result in missing or inaccurate responses.

Access Availability
This feature is currently available to a limited set of users. Access requires valid values for:
  • verbex_id
  • org_id
Please contact your administrator or support team to enable access.