v0.8
January 2026 - Cerebras Integration
Models
Cerebras Models
Ultra-fast inference powered by Wafer-Scale Engine technology.
- Llama 3.1 8B at $0.10/$0.10 (input/output) per 1M tokens
- Llama 3.3 70B with function calling at $0.60/$0.60 per 1M tokens
- Qwen 3 32B supporting 29+ languages at $0.20/$0.20 per 1M tokens
- GPT-OSS 120B reasoning model at 3,000 tokens per second (TPS)
Enhancement
Speed Improvements
Industry-leading throughput for high-volume applications with simple per-token pricing.
v0.7
December 2025 - Gemini 3 Models
Models
Gemini 3 Pro & Flash
Google's latest models with 1M token context and PhD-level reasoning (90.4% on GPQA Diamond).
- Gemini 3 Pro: $2.00/$12.00 per 1M tokens
- Gemini 3 Flash: 3x faster at $0.50/$3.00 per 1M tokens
- Native multimodal: text, images, audio, video
Feature
Agentic Workflows
Built-in support for real-time tool use, coding, and function calling.
v0.6
November 2025 - GPT-5.2 Launch
Models
GPT-5.2
OpenAI's flagship with 400K context, 128K output, and enhanced reasoning tokens.
- Pricing: $1.75 input / $14.00 output per 1M tokens
- Excels at agentic workflows and multi-step coding
- Enhanced structured document analysis
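As a worked example of the per-1M-token pricing above, the sketch below estimates the cost of a single request. The function name `request_cost` and the token counts are illustrative assumptions, not part of any platform API; only the two GPT-5.2 prices come from this entry.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: Decimal, output_price: Decimal) -> Decimal:
    """Return the USD cost of one request given per-1M-token prices."""
    input_cost = Decimal(input_tokens) * input_price / TOKENS_PER_UNIT
    output_cost = Decimal(output_tokens) * output_price / TOKENS_PER_UNIT
    return input_cost + output_cost


# GPT-5.2 prices from this entry; the token counts are made up for illustration.
cost = request_cost(10_000, 2_000, Decimal("1.75"), Decimal("14.00"))
print(cost)
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many calls.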
v0.5
August 2025 - GPT-5 Model Family
Models
Full GPT-5 Suite
Complete model family from premium to cost-efficient variants. Prices are input/output per 1M tokens.
- GPT-5: Premium flagship ($1.25/$10.00)
- GPT-5 mini: Balanced performance ($0.25/$2.00)
- GPT-5 nano: Fastest, most affordable ($0.05/$0.40)
- GPT-5.1 Codex: Agentic coding ($1.50/$12.00)
v0.4
August 2025 - GPT OSS Models via Groq
Models
Open Source GPT Models
OpenAI's open-weight models available through Groq's ultra-fast infrastructure.
- GPT-OSS 20B: $0.10/$0.50 per 1M tokens
- GPT-OSS 120B: $0.15/$0.75 per 1M tokens
- Built-in browser search and code execution
v0.3
May 2025 - Emotion & Sentiment Analysis
Feature
Emotion Detection
Detect emotions in English conversations for deeper user understanding.
Feature
Sentiment Analytics
Track sentiment for both user input and AI responses with performance stats.
v0.2
April 2025 - Intelligent RAG
Feature
RAG Enhancements
AI-driven retrieval that chooses what to search and when, unified across all providers.
- Content deduplication and cost savings
- Context propagation only when needed
- Detailed RAG analytics
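One way content deduplication for retrieved passages could work is sketched below. This is a minimal illustration under stated assumptions, not the platform's actual implementation: the helper name `dedupe_chunks` and the normalization rule (lowercasing and collapsing whitespace) are hypothetical.

```python
import hashlib


def dedupe_chunks(chunks: list[str]) -> list[str]:
    """Drop chunks whose normalized text was already seen, keeping first occurrences."""
    seen: set[str] = set()
    unique: list[str] = []
    for chunk in chunks:
        # Normalize case and whitespace so trivial variants hash to the same value.
        normalized = " ".join(chunk.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique


# Made-up retrieval results: the second item is a near-verbatim repeat of the first.
retrieved = ["Refunds take 5 days.", "refunds  take 5 days.", "Shipping is free."]
unique_chunks = dedupe_chunks(retrieved)
print(unique_chunks)
```

Sending fewer duplicate passages to the model reduces input tokens, which is where the cost savings would come from.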
Models
New Models
Llama 4 Scout and Maverick via Groq, and the OpenAI GPT-4.1 family.
v0.1
March 2025 - Multi-Provider Support
Feature
Provider Integration
One API for OpenAI, Gemini, and Groq with automatic failover.
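Automatic failover can be pictured as trying providers in a fixed order until one succeeds. The sketch below illustrates that idea only: `complete_with_failover`, the `Provider` type, and the two stand-in provider functions are hypothetical and do not represent the platform's API or any vendor SDK.

```python
from typing import Callable, Sequence

# Hypothetical provider type: takes a prompt, returns text, raises on failure.
Provider = Callable[[str], str]


def complete_with_failover(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, response) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_openai(prompt: str) -> str:
    """Stand-in for a provider call that times out."""
    raise TimeoutError("simulated timeout")


def healthy_gemini(prompt: str) -> str:
    """Stand-in for a provider call that succeeds."""
    return f"echo: {prompt}"


used, reply = complete_with_failover(
    "hello", [("openai", flaky_openai), ("gemini", healthy_gemini)]
)
print(used, reply)
```

Collecting the per-provider errors before raising keeps the final failure message useful when every provider is down.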
Feature
Observability
Latency metrics, cost breakdowns, and cache savings tracking.
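To make these three metrics concrete, the sketch below aggregates per-call records into an average latency, a cost breakdown by provider, and total cache savings. The `CallRecord` fields and the `summarize` helper are assumptions made for illustration; the platform's real schema is not described in this changelog.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CallRecord:
    """One hypothetical logged model call."""
    provider: str
    latency_ms: float
    cost_usd: float
    cache_hit: bool
    saved_usd: float = 0.0  # cost avoided when the response came from cache


def summarize(records: list[CallRecord]) -> dict:
    """Aggregate latency, per-provider cost, and cache savings for a batch of calls."""
    cost_by_provider: dict[str, float] = {}
    for r in records:
        cost_by_provider[r.provider] = cost_by_provider.get(r.provider, 0.0) + r.cost_usd
    return {
        "avg_latency_ms": mean(r.latency_ms for r in records),
        "cost_by_provider": cost_by_provider,
        "cache_savings_usd": sum(r.saved_usd for r in records if r.cache_hit),
    }


# Made-up records for illustration.
records = [
    CallRecord("openai", 100.0, 0.5, cache_hit=False),
    CallRecord("groq", 300.0, 0.0, cache_hit=True, saved_usd=0.25),
    CallRecord("openai", 200.0, 0.25, cache_hit=False),
]
summary = summarize(records)
print(summary)
```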
v0.0
September 2024 - Initial Release
Feature
Core Platform
Conversation management, WhatsApp integration, RAG model support, and knowledge base.