Model Routing

A deep dive into how ARKAbrain selects the optimal model for each request through classification, complexity assessment, and intelligent fallbacks.

Routing Overview

When you send a request to ARKA-AI, ARKAbrain performs a multi-step routing process to select the best model. This process considers your task type, input complexity, available providers, and plan tier.

Routing Pipeline

Input → Classify → Assess Complexity → Select Role → Choose Model → Execute

Task Classification

The first step is identifying what type of task you're performing. ARKAbrain uses a two-stage classifier:

Stage 1: Heuristics

Fast pattern matching using regex rules. It detects keywords such as "summarize", "rewrite", and "explain", and assigns each match a weighted confidence score.

Stage 2: LLM Fallback

If the heuristic confidence is below 70%, a lightweight LLM call classifies the request instead, which is slower but more accurate.

ARKAbrain recognizes 7 distinct task types:

Task Type | Detection Patterns
rewrite_editing | rewrite, rephrase, improve, polish, make better
summarization | summarize, tl;dr, key points, condense
explain_simple | explain, ELI5, break down, simplify
seo_marketing | SEO, meta description, title tag, marketing
prompt_optimization | improve prompt, prompt engineering, optimize
coding_debugging | code blocks, error, debug, implement, function
general_chat | Fallback for unclassified requests
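
The two-stage classification described above can be sketched roughly as follows. This is a minimal illustration, not ARKAbrain's actual code: the regex rules cover only a few of the listed patterns, the weights are made up, and `classify`, `classify_heuristic`, and the `llm_classifier` callback are hypothetical names. Only the 70% threshold and the task type names come from this page.

```python
import re
from typing import Callable, Optional

CONFIDENCE_THRESHOLD = 0.70  # below this, Stage 2 (LLM fallback) is used

# A subset of the detection patterns; the weights are illustrative assumptions.
RULES: dict[str, list[tuple[str, float]]] = {
    "rewrite_editing": [(r"\brewrite\b", 0.9), (r"\brephrase\b", 0.9), (r"\bpolish\b", 0.7)],
    "summarization": [(r"\bsummarize\b", 0.9), (r"\btl;dr\b", 0.9), (r"\bkey points\b", 0.8)],
    "explain_simple": [(r"\bexplain\b", 0.8), (r"\beli5\b", 0.9), (r"\bsimplify\b", 0.8)],
    "seo_marketing": [(r"\bseo\b", 0.9), (r"\bmeta description\b", 0.9)],
    "prompt_optimization": [(r"\bprompt engineering\b", 0.9),
                            (r"\bimprove (?:this |my )?prompt\b", 0.9)],
    "coding_debugging": [(r"`{3}", 0.8), (r"\bdebug\b", 0.9), (r"\berror\b", 0.6)],
}


def classify_heuristic(text: str) -> tuple[str, float]:
    """Stage 1: return the best-matching task type and its confidence."""
    lowered = text.lower()
    best_task, best_score = "general_chat", 0.0
    for task, rules in RULES.items():
        # A task's score is the weight of its strongest matching rule.
        score = max((w for pattern, w in rules if re.search(pattern, lowered)), default=0.0)
        if score > best_score:
            best_task, best_score = task, score
    return best_task, best_score


def classify(text: str, llm_classifier: Optional[Callable[[str], str]] = None) -> str:
    """Stage 1 first; fall back to the (hypothetical) LLM classifier on low confidence."""
    task, confidence = classify_heuristic(text)
    if confidence >= CONFIDENCE_THRESHOLD:
        return task
    if llm_classifier is not None:
        return llm_classifier(text)  # Stage 2
    return "general_chat"  # assumed default when no LLM classifier is supplied
```

In this sketch, "Please summarize this article" is resolved by the heuristics alone, while "I got an error" scores only 0.6 and is handed to the LLM classifier.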

Complexity Assessment

After identifying the task type, ARKAbrain assesses the complexity of your request. This determines whether to use a fast/cheap model or a powerful/premium one.

Low Complexity

  • Short input (<100 tokens)
  • Keywords: "simple", "quick", "brief", "just"
  • No code blocks

Medium Complexity

  • Moderate input (100-500 tokens)
  • Keywords: "explain", "describe", "outline"
  • One code block

High Complexity

  • Long input (>500 tokens)
  • Keywords: "complex", "detailed", "comprehensive"
  • Multiple code blocks

Role Selection Matrix

Based on task type and complexity, ARKAbrain assigns a model role:

Task | Low | Medium | High
Rewrite/Edit | FAST_CHEAP | BALANCED | BALANCED
Summarize | FAST_CHEAP | FAST_CHEAP | BALANCED
Explain | FAST_CHEAP | FAST_CHEAP | BALANCED
SEO/Marketing | BALANCED | SEO_WRITER | SEO_WRITER
Prompts | BALANCED | REASONING | REASONING
Coding | BALANCED | CODING | CODING
General | FAST_CHEAP | BALANCED | BALANCED
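
The matrix above is effectively a two-key lookup. A minimal sketch, assuming the matrix rows map onto the task type identifiers from the classification step (for example "Prompts" to `prompt_optimization`) and that unknown task types fall back to the General row; `select_role` is a hypothetical name:

```python
# Roles per task type, ordered (low, medium, high), copied from the matrix above.
ROLE_MATRIX: dict[str, tuple[str, str, str]] = {
    "rewrite_editing": ("FAST_CHEAP", "BALANCED", "BALANCED"),
    "summarization": ("FAST_CHEAP", "FAST_CHEAP", "BALANCED"),
    "explain_simple": ("FAST_CHEAP", "FAST_CHEAP", "BALANCED"),
    "seo_marketing": ("BALANCED", "SEO_WRITER", "SEO_WRITER"),
    "prompt_optimization": ("BALANCED", "REASONING", "REASONING"),
    "coding_debugging": ("BALANCED", "CODING", "CODING"),
    "general_chat": ("FAST_CHEAP", "BALANCED", "BALANCED"),
}

COMPLEXITY_INDEX = {"low": 0, "medium": 1, "high": 2}


def select_role(task_type: str, complexity: str) -> str:
    """Look up the role for a task type and complexity level."""
    # Assumption: unrecognized task types are treated as general chat.
    row = ROLE_MATRIX.get(task_type, ROLE_MATRIX["general_chat"])
    return row[COMPLEXITY_INDEX[complexity]]
```

For example, `select_role("coding_debugging", "high")` returns `"CODING"`, while a medium-complexity summarization stays on `"FAST_CHEAP"`.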

Model Registry

ARKAbrain maintains a registry of 11 models across OpenAI and OpenRouter, each configured with its capabilities, costs, and strengths. The table below lists nine of them:

Model | Provider | Context | Speed | Strengths
GPT-4o | OpenAI | 128K | Medium | General, Coding, Reasoning
GPT-4o Mini | OpenAI | 128K | Fast | Fast, Cheap, General
o1 Mini | OpenAI | 128K | Slow | Reasoning, Math, Coding
Claude 3.5 Sonnet | OpenRouter | 200K | Medium | Coding, Writing, Long Context
Claude 3 Haiku | OpenRouter | 200K | Fast | Fast, Cheap, General
Gemini 1.5 Pro | OpenRouter | 2M | Medium | Reasoning, Long Context
Gemini 1.5 Flash | OpenRouter | 1M | Fast | Fast, Cheap, Long Context
Llama 3.1 70B | OpenRouter | 131K | Medium | General, Coding, Cheap
DeepSeek Coder | OpenRouter | 128K | Fast | Coding, Cheap
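
To illustrate what a registry entry might look like, here is a small sketch that encodes the rows listed above as records and filters them by strength and context size. The `ModelInfo` fields and `find_models` helper are hypothetical, context sizes are rounded from the table (128K as 128,000 tokens), and cost data is omitted because this page does not list it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    provider: str
    context_tokens: int  # rounded from the table, e.g. 128K -> 128_000
    speed: str
    strengths: tuple[str, ...]


REGISTRY = [
    ModelInfo("GPT-4o", "OpenAI", 128_000, "Medium", ("General", "Coding", "Reasoning")),
    ModelInfo("GPT-4o Mini", "OpenAI", 128_000, "Fast", ("Fast", "Cheap", "General")),
    ModelInfo("o1 Mini", "OpenAI", 128_000, "Slow", ("Reasoning", "Math", "Coding")),
    ModelInfo("Claude 3.5 Sonnet", "OpenRouter", 200_000, "Medium",
              ("Coding", "Writing", "Long Context")),
    ModelInfo("Claude 3 Haiku", "OpenRouter", 200_000, "Fast", ("Fast", "Cheap", "General")),
    ModelInfo("Gemini 1.5 Pro", "OpenRouter", 2_000_000, "Medium",
              ("Reasoning", "Long Context")),
    ModelInfo("Gemini 1.5 Flash", "OpenRouter", 1_000_000, "Fast",
              ("Fast", "Cheap", "Long Context")),
    ModelInfo("Llama 3.1 70B", "OpenRouter", 131_000, "Medium", ("General", "Coding", "Cheap")),
    ModelInfo("DeepSeek Coder", "OpenRouter", 128_000, "Fast", ("Coding", "Cheap")),
]


def find_models(strength: str, min_context: int = 0) -> list[str]:
    """Return names of models with the given strength and at least min_context tokens."""
    return [
        m.name
        for m in REGISTRY
        if strength in m.strengths and m.context_tokens >= min_context
    ]
```

With this data, `find_models("Coding", min_context=150_000)` narrows the five coding-capable models down to Claude 3.5 Sonnet.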

Fallback Chains

Each role has a primary model and a chain of fallbacks. If the primary model fails (rate limit, error, etc.), ARKAbrain automatically tries the next model in the chain:

FAST_CHEAP
GPT-4o Mini → Claude 3 Haiku → Gemini 1.5 Flash → Llama 3.1 70B
BALANCED
GPT-4o → Claude 3.5 Sonnet → GPT-4 Turbo → Gemini 1.5 Pro
REASONING
o1 Mini → GPT-4o → Claude 3.5 Sonnet → Gemini 1.5 Pro
CODING
GPT-4o → Claude 3.5 Sonnet → DeepSeek Coder → GPT-4 Turbo
LONG_CONTEXT
Gemini 1.5 Pro → Claude 3.5 Sonnet → GPT-4o → Gemini 1.5 Flash
SEO_WRITER
GPT-4o → Claude 3.5 Sonnet → GPT-4o Mini → Llama 3.1 70B
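
The fallback behavior can be sketched as a loop over the chain that returns the first successful response. This is an illustration under assumptions: `call_model` stands in for whatever provider client is actually used, `run_with_fallback` and `AllModelsFailed` are hypothetical names, and every exception is treated as a reason to move on, which a real router would likely refine.

```python
from typing import Callable

# Chains copied from the list above; the first entry is the primary model.
FALLBACK_CHAINS: dict[str, list[str]] = {
    "FAST_CHEAP": ["GPT-4o Mini", "Claude 3 Haiku", "Gemini 1.5 Flash", "Llama 3.1 70B"],
    "BALANCED": ["GPT-4o", "Claude 3.5 Sonnet", "GPT-4 Turbo", "Gemini 1.5 Pro"],
    "REASONING": ["o1 Mini", "GPT-4o", "Claude 3.5 Sonnet", "Gemini 1.5 Pro"],
    "CODING": ["GPT-4o", "Claude 3.5 Sonnet", "DeepSeek Coder", "GPT-4 Turbo"],
    "LONG_CONTEXT": ["Gemini 1.5 Pro", "Claude 3.5 Sonnet", "GPT-4o", "Gemini 1.5 Flash"],
    "SEO_WRITER": ["GPT-4o", "Claude 3.5 Sonnet", "GPT-4o Mini", "Llama 3.1 70B"],
}


class AllModelsFailed(RuntimeError):
    """Raised when every model in a role's chain has failed."""


def run_with_fallback(
    role: str,
    prompt: str,
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in the role's chain in order; return (model, response)."""
    errors = []
    for model in FALLBACK_CHAINS[role]:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # rate limit, timeout, provider error, ...
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))
```

Returning the model name alongside the response lets the caller see which model actually served the request after a fallback.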

Temperature Tuning

ARKAbrain sets a temperature range for each task type. Lower temperatures produce more focused, predictable output; higher temperatures produce more varied, creative output:

Task Type | Temperature Range | Why
Coding/Debugging | 0.1 - 0.3 | Precision is critical
Summarization | 0.2 - 0.3 | Accuracy over creativity
Rewrite/Edit | 0.2 - 0.4 | Preserve meaning
Prompt Optimization | 0.3 - 0.5 | Balance precision and exploration
Explain Simply | 0.4 - 0.6 | Allow creative analogies
General Chat | 0.5 - 0.7 | Natural conversation
SEO/Marketing | 0.6 - 0.8 | Encourage creativity
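
The page gives a range per task type but does not say how a value is chosen within it. The sketch below assumes linear interpolation controlled by a hypothetical `creativity` parameter between 0 and 1, with the midpoint as the default; the task type keys reuse the identifiers from the classification step.

```python
# (min, max) temperature per task type, copied from the table above.
TEMPERATURE_RANGES: dict[str, tuple[float, float]] = {
    "coding_debugging": (0.1, 0.3),
    "summarization": (0.2, 0.3),
    "rewrite_editing": (0.2, 0.4),
    "prompt_optimization": (0.3, 0.5),
    "explain_simple": (0.4, 0.6),
    "general_chat": (0.5, 0.7),
    "seo_marketing": (0.6, 0.8),
}


def pick_temperature(task_type: str, creativity: float = 0.5) -> float:
    """Interpolate within the task's range; creativity=0 gives the minimum, 1 the maximum."""
    # Assumption: unknown task types use the general chat range.
    low, high = TEMPERATURE_RANGES.get(task_type, TEMPERATURE_RANGES["general_chat"])
    creativity = min(max(creativity, 0.0), 1.0)  # clamp to [0, 1]
    return round(low + creativity * (high - low), 2)
```

With the default setting, a coding request gets 0.2, the midpoint of its 0.1 - 0.3 range.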

Plan-Based Routing

Your subscription plan affects which roles are available:

Starter Plan

  • FAST_CHEAP role
  • BALANCED role
  • REASONING → BALANCED
  • CODING → BALANCED
  • LONG_CONTEXT → BALANCED
  • SEO_WRITER → BALANCED

Pro Plan

  • FAST_CHEAP role
  • BALANCED role
  • REASONING role (o1 Mini)
  • CODING role (optimized)
  • LONG_CONTEXT role (Gemini)
  • SEO_WRITER role

Automatic Downgrades

Starter plan users still get strong results: requests that would use a premium role (REASONING, CODING, LONG_CONTEXT, or SEO_WRITER) are automatically downgraded to BALANCED, whose primary model is GPT-4o, a capable all-around model.
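
The downgrade rule can be expressed as a simple membership check. A minimal sketch, assuming plans are identified by the lowercase keys `"starter"` and `"pro"`; `resolve_role` is a hypothetical name:

```python
ALL_ROLES = {"FAST_CHEAP", "BALANCED", "REASONING", "CODING", "LONG_CONTEXT", "SEO_WRITER"}

# Roles each plan can use directly, per the lists above.
PLAN_ROLES: dict[str, set[str]] = {
    "starter": {"FAST_CHEAP", "BALANCED"},
    "pro": set(ALL_ROLES),
}


def resolve_role(plan: str, requested_role: str) -> str:
    """Return the role actually used: unavailable roles are downgraded to BALANCED."""
    if requested_role in PLAN_ROLES[plan]:
        return requested_role
    return "BALANCED"
```

For example, `resolve_role("starter", "REASONING")` returns `"BALANCED"`, while the same request on the Pro plan keeps `"REASONING"`.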