> ## Documentation Index
> Fetch the complete documentation index at: https://docs.verbex.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Configure Your AI Agent

> Learn how to set up language, model, voice, and other key settings for your AI Agent

<Note>
  Proper configuration of your AI Agent is crucial for optimal performance. Each setting affects how your Agent communicates and processes information.
</Note>

<Frame>
  <img height="343" width="624" src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXexcKEXvskbB15wDcKMS2wC125rl7o0hgOxXFIzF1YmHAG30iYvd-ELkWKTPeic4M6ALRsexmmQCR3rD18MH1J2qESLYQRrRt8g0rk_f-3cBjiDO2GBPAZvoib1GcWM66sBalAaDw?key=wYTIfHCKSlT8hfaWOLugPTzh" />
</Frame>

## Core Settings

### 1. Select Language

Choose the primary language for your AI Agent's interactions.

<Frame>
  <img height="225" width="455" src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXdWLmPuzAZ08OOEQpWh22-yMsC1IKojWz0N1D_17Wq3G_6mg3eIE_XFJB07iqCrPn1DaXxzjiV8ci6TEUdHXYnn012Y3zY19-j-jBhpobWrcxeJdfx4i7EBtu1ul6VjYokTP4qClw?key=wYTIfHCKSlT8hfaWOLugPTzh" />
</Frame>

<Tip>
  Select a language that matches your target audience. This affects both understanding and response capabilities.
</Tip>

### 2. Select Model

Choose the AI model that will power your Agent's intelligence.

<Frame>
  <img height="212" width="423" src="https://mintcdn.com/hishabsingaporepteltd/Wx-6A5SGbH-ne_kM/images/model_selection.png?fit=max&auto=format&n=Wx-6A5SGbH-ne_kM&q=85&s=e0da7f6c649ef6102afa9189e8e65819" data-path="images/model_selection.png" />
</Frame>

## OpenAI Models

### GPT-4.1

**Highest intelligence**

* Ideal for complex tasks and advanced instruction following
* Supports up to 1 million tokens for extensive context handling
* Excels in reasoning, and long-context comprehension
* Suitable for use in Agentic AI Agents
* Best choice for tool integration and sophisticated applications

### GPT-4.1-mini

**Balanced intelligence and performance**

* Faster responses with lower latency compared to GPT-4.1
* Maintains support for up to 1 million tokens
* Handles moderately complex instructions effectively
* Suitable for applications requiring a balance between performance and cost
* Can be used in Agentic AI Agents

### GPT-4.1-nano

**Lightweight and cost-effective**

* Fastest response times among the GPT-4.1 models
* Supports up to 1 million tokens for context
* Optimized for simple tasks like classification and autocomplete
* Not suitable for complex instruction following or tool integration
* Best for applications where speed and cost are prioritized over advanced capabilities

### GPT-4o

**Higher intelligence**

* First for use with complex instructions in the system prompt of a Simple AI Agent. In an Agentic AI Agent, this model is used by default
* Slower responses, higher latency
* Better understanding
* Higher accuracy, handles out of context queries better
* Optimal for using with tools

### GPT-4o Mini

**Lower intelligence, budget-friendly**

* Faster responses, lower latency
* Simple system prompt with less complex instructions
* Limited performance with tools
* Best for simple queries
* Not supported in Agentic AI Agents

## Realtime Models

Realtime models are optimized for low-latency, real-time conversational experiences. These models enable natural, fluid interactions with minimal delay.

### OpenAI Realtime Models

#### GPT-Realtime

**Latest OpenAI realtime model**

* Ultra-low latency for real-time voice conversations
* Optimized for natural, fluid dialogue
* Supports streaming responses with minimal delay
* Ideal for voice-based AI agents and live customer interactions
* Handles context switches and interruptions gracefully

#### GPT-Realtime-2025-08-28

**Dated version of OpenAI realtime model**

* Specific snapshot of the realtime model from August 28, 2025
* Provides consistency for applications requiring a fixed model version
* Same low-latency capabilities as GPT-Realtime
* Use when you need version stability and predictable behavior

#### GPT-4o-Realtime-Preview

**Preview version of GPT-4o realtime capabilities**

* Combines GPT-4o intelligence with realtime processing
* Enhanced reasoning capabilities in real-time scenarios
* Better handling of complex, multi-turn conversations
* Preview status means features and performance may evolve

### Google Gemini Realtime Models

#### Gemini-2.0-Flash-Live-001

**Google's fast realtime model**

* Optimized for speed and low latency
* Excellent for live voice interactions
* Fast response times suitable for conversational AI
* Supports multimodal inputs in real-time scenarios

#### Gemini-Live-2.5-Flash-Preview

**Enhanced preview version of Gemini Live**

* Latest preview of Google's realtime capabilities
* Improved performance and features over 2.0
* Better context understanding in live conversations
* Preview status indicates ongoing improvements

<Note>
  Realtime models are specifically designed for voice-based AI agents and scenarios requiring immediate responses. They prioritize low latency and natural conversation flow over complex reasoning tasks.
</Note>

<Warning>
  **No STT Required**: Realtime models have built-in speech recognition capabilities and do not require a separate Speech-to-Text (STT) module. They can directly process voice input and work seamlessly with the platform's TTS (Text-to-Speech) for output.
</Warning>

<Tip>
  Choose realtime models when building AI agents that need to handle voice calls or live chat interactions where response speed is critical. For complex tool-calling or reasoning tasks, consider using GPT-4.1 or GPT-4o standard models.
</Tip>

## Groq Models

Groq models are open-source models powered by Groq's high-performance inference infrastructure, offering exceptional speed and cost-effectiveness.

### Groq Llama 3.3 70B Versatile

**High-performance open-source model**

* 70 billion parameters with optimized transformer architecture
* 128K token context window for extensive context handling
* Strong instruction following and tool use capabilities
* Fast inference powered by Groq's infrastructure

### Groq Llama 3.1 8B Instant

**Fast and budget-friendly open-source model**

* 8 billion parameters optimized for instant responses
* 128K token context window
* Ultra-fast inference for real-time applications
* Ideal for applications requiring quick responses without complex reasoning

<Note>
  Groq models leverage Groq's specialized LPU (Language Processing Unit) architecture to deliver exceptional inference speed, making them ideal for high-throughput applications and real-time use cases.
</Note>

<Tip>
  Choose **Llama 3.3 70B Versatile** for complex tasks requiring strong reasoning and tool use. Choose **Llama 3.1 8B Instant** when speed and cost are priorities and tasks are relatively straightforward.
</Tip>

## Verbex Models

### Verbex Bangla Mini

**Bengali-specialized model**

* Optimized for Bengali language tasks and tool-calling scenarios
* Handles Bengali text and tool interactions effectively
* Suitable for building conversational agents that need to interact with external systems in Bengali
* Faster than GPT-4o, GPT-4.1 model
* Best for Bengali language tasks and tool-calling scenarios

**Model Card**: [Verbex Bangla Mini](/models/verbex_bangla_mini)

### Custom Model

Verbex support OpenAI compatible custom LLM models. You can use any LLM that is supported by OpenAI compatible models.

Before adding a custom model to Verbex, keep in mind that:

* The model should be OpenAI compatible.
* The model should support streaming responses.
* The model should support tool calling.

<Frame>
  <img src="https://mintlify.s3.us-west-1.amazonaws.com/hishabsingaporepteltd/images/basics/custom_model_1.png" />
</Frame>

<Frame>
  <img src="https://mintlify.s3.us-west-1.amazonaws.com/hishabsingaporepteltd/images/basics/custom_model_2.png" />
</Frame>

<Warning>
  For AI Agents that use tools (like calendar booking, email sending, etc.), GPT-4o is strongly recommended. GPT-4o Mini has limited capabilities in handling tool interactions, which may result in unreliable tool execution and degraded performance.
</Warning>

<Note>
  We plan to add models like Anthropic Claude, Google Gemini, and more in the future.
</Note>

### 3. Select STT

Select the STT module that will convert user speech into text.

<Note>
  The STT module is crucial for accurate transcription of customer speech, directly impacting your AI Agent's ability to understand and respond appropriately.
</Note>

### 4. Select Voice

Choose a voice that represents your brand and resonates with your audience.

<Frame>
  <img height="252" width="434" src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXcYEvTmbOiQUV7o1Vhbf2qwgo50pc2GyQknkL31FIcN1BC9NM1nRmM9A79H7F87jrk9K1nHMZ37koRUJQ7FPCBC6wqOR9MXdicQpYdUy-k7X_GHDjESZXMxa3W06uPMr8f2fnsB?key=wYTIfHCKSlT8hfaWOLugPTzh" />
</Frame>

<Check>
  When selecting a voice, consider:

  * Language compatibility
  * Gender preference
  * Accent appropriateness
  * Speaking style
  * Brand alignment
</Check>

## Configuration Best Practices

<CardGroup cols={2}>
  <Card title="Language & Region" icon="globe">
    Match your target market's primary language and regional preferences
  </Card>

  <Card title="Model Selection" icon="microchip">
    Balance performance needs with budget constraints
  </Card>

  <Card title="Voice Choice" icon="microphone">
    Align voice characteristics with brand identity
  </Card>

  <Card title="STT Accuracy" icon="waveform">
    Test STT performance with your typical use cases
  </Card>
</CardGroup>

## Performance Considerations

<Warning>
  Model selection significantly impacts:

  * Response speed
  * Reasoning ability of the AI Agent
  * Handling out of context queries
  * Overall user experience
</Warning>

### Model Comparison

| Model                   | Performance | Cost     | Best For                                                 | Tool Calling |
| ----------------------- | ----------- | -------- | -------------------------------------------------------- | ------------ |
| GPT-4.1                 | Highest     | Highest  | Complex tasks, extensive context (1M tokens), agentic AI | Excellent    |
| GPT-4.1-mini            | High        | Moderate | Balanced performance and cost, agentic AI                | Excellent    |
| GPT-4.1-nano            | Moderate    | Low      | Simple tasks, classification, autocomplete               | Limited      |
| GPT-4o                  | High        | Higher   | Complex interactions, tool-based operations              | Excellent    |
| GPT-4o Mini             | Moderate    | Lower    | Basic queries without tools                              | Limited      |
| GPT-Realtime            | High        | Higher   | Real-time voice conversations, live interactions         | Excellent    |
| GPT-4o-Realtime-Preview | High        | Higher   | Real-time with enhanced reasoning                        | Excellent    |
| Groq Llama 3.3 70B      | High        | Low      | Complex reasoning, coding, tool use (cost-effective)     | Excellent    |
| Groq Llama 3.1 8B       | Moderate    | Lowest   | Fast responses, simple tasks (ultra cost-effective)      | Good         |

<Tip>
  If your AI Agent will be using any tools or integrations, choose GPT-4o to ensure reliable tool calling performance. While GPT-4o Mini is cost-effective, it's best suited for simple conversation-only scenarios.
</Tip>

Learn more about [Tools Configuration →](/tools/connect-tools)
