Model Routing

A deep dive into how ARKAbrain selects the optimal model for each request through classification, complexity assessment, and intelligent fallbacks.

Routing Overview

When you send a request to ARKA-AI, ARKAbrain performs a multi-step routing process to select the best model. This process considers your task type, input complexity, available providers, and plan tier.

Routing Pipeline

Input → Classify → Assess Complexity → Select Role → Choose Model → Execute

Task Classification

The first step is identifying what type of task you're performing. ARKAbrain uses a two-stage classifier:

Stage 1: Heuristics

Fast pattern matching with regex rules. Keywords such as "summarize", "rewrite", and "explain" are detected, and each match carries a weighted confidence score.

Stage 2: LLM Fallback

If the heuristic confidence is below 70%, a lightweight LLM call classifies the request instead.

ARKAbrain recognizes 7 distinct task types:

Task Type	Detection Patterns
rewrite_editing	rewrite, rephrase, improve, polish, make better
summarization	summarize, tl;dr, key points, condense
explain_simple	explain, ELI5, break down, simplify
seo_marketing	SEO, meta description, title tag, marketing
prompt_optimization	improve prompt, prompt engineering, optimize
coding_debugging	code blocks, error, debug, implement, function
general_chat	Fallback for unclassified requests
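The two-stage classifier described above can be sketched as follows. This is a minimal illustration, not ARKAbrain's implementation: the regex rules, the individual weights, and the `llm_classifier` callback are assumptions. Only the task type names, the example keywords, and the 70% threshold come from this page.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical rules: (regex, weight) pairs per task type. The weights are
# invented for illustration; the real rule set is not documented here.
HEURISTIC_RULES = {
    "rewrite_editing": [(r"\b(rewrite|rephrase|polish)\b", 0.8), (r"\bimprove\b", 0.5)],
    "summarization": [(r"\b(summarize|condense)\b", 0.9), (r"tl;dr|key points", 0.8)],
    "explain_simple": [(r"\b(explain|simplify)\b", 0.7), (r"\beli5\b|break down", 0.9)],
    "seo_marketing": [(r"\bseo\b|meta description|title tag", 0.9), (r"\bmarketing\b", 0.7)],
    "prompt_optimization": [(r"improve (this |my |the )?prompt|prompt engineering", 0.9),
                            (r"\boptimize\b", 0.5)],
    "coding_debugging": [(r"```", 0.9), (r"\b(debug|implement|function|error)\b", 0.7)],
}
CONFIDENCE_THRESHOLD = 0.70  # below this, stage 2 takes over


def classify_heuristic(text: str) -> Tuple[str, float]:
    """Stage 1: return the best-matching task type and its confidence."""
    lowered = text.lower()
    best_type, best_score = "general_chat", 0.0
    for task_type, rules in HEURISTIC_RULES.items():
        # Confidence for a task type is the weight of its strongest matching rule.
        score = max((w for pattern, w in rules if re.search(pattern, lowered)), default=0.0)
        if score > best_score:
            best_type, best_score = task_type, score
    return best_type, best_score


def classify(text: str, llm_classifier: Optional[Callable[[str], str]] = None) -> str:
    """Run stage 1, and fall back to the LLM callback when confidence is low."""
    task_type, confidence = classify_heuristic(text)
    if confidence >= CONFIDENCE_THRESHOLD:
        return task_type
    if llm_classifier is not None:
        return llm_classifier(text)  # Stage 2: hypothetical lightweight LLM call
    return task_type  # no LLM available: keep the low-confidence guess
```

In this sketch a request with no matching keyword scores 0.0, so it is handed to the LLM stage, or stays `general_chat` when no callback is supplied.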

Complexity Assessment

After identifying the task type, ARKAbrain assesses the complexity of your request. This determines whether to use a fast/cheap model or a powerful/premium one.

Low Complexity

• Short input (<100 tokens)
• Keywords: "simple", "quick", "brief", "just"
• No code blocks

Medium Complexity

• Moderate input (100-500 tokens)
• Keywords: "explain", "describe", "outline"
• One code block

High Complexity

• Long input (>500 tokens)
• Keywords: "complex", "detailed", "comprehensive"
• Multiple code blocks
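One way to combine the three signals listed above (input length, keywords, code blocks) is to let each cast a vote and average the votes. The page does not say how ARKAbrain weighs the signals, so the averaging rule here is an assumption, as are the whitespace-based token count and the fence-counting heuristic for code blocks.

```python
import re

LOW_WORDS = {"simple", "quick", "brief", "just"}
MEDIUM_WORDS = {"explain", "describe", "outline"}
HIGH_WORDS = {"complex", "detailed", "comprehensive"}
LEVELS = ["low", "medium", "high"]


def assess_complexity(text: str) -> str:
    """Return "low", "medium", or "high" from length, keyword, and code signals."""
    token_count = len(text.split())       # crude stand-in for a real tokenizer
    code_blocks = text.count("```") // 2  # a pair of fences is one code block
    words = set(re.findall(r"[a-z']+", text.lower()))

    votes = []

    # Length signal: <100 tokens is low, 100-500 is medium, >500 is high.
    if token_count < 100:
        votes.append(0)
    elif token_count <= 500:
        votes.append(1)
    else:
        votes.append(2)

    # Keyword signal: the strongest keyword family present wins; no keyword, no vote.
    if words & HIGH_WORDS:
        votes.append(2)
    elif words & MEDIUM_WORDS:
        votes.append(1)
    elif words & LOW_WORDS:
        votes.append(0)

    # Code signal: none is low, one block is medium, several are high.
    votes.append(min(code_blocks, 2))

    # Assumed combination rule: average the votes, rounding half up.
    return LEVELS[int(sum(votes) / len(votes) + 0.5)]
```

Averaging keeps a single keyword from dominating: a short request that merely contains "explain" still comes out as low complexity in this sketch.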

Role Selection Matrix

Based on task type and complexity, ARKAbrain assigns a model role. Each role maps to a primary model and a fallback chain (see Fallback Chains). The matrix below shows which role is assigned:

Task	Low	Medium	High
Rewrite/Edit	FAST_CHEAP	BALANCED	BALANCED
Summarize	FAST_CHEAP	FAST_CHEAP	BALANCED
Explain	FAST_CHEAP	FAST_CHEAP	BALANCED
SEO/Marketing	BALANCED	SEO_WRITER	SEO_WRITER
Prompts	BALANCED	REASONING	REASONING
Coding	BALANCED	CODING	CODING
General	FAST_CHEAP	BALANCED	BALANCED
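The matrix above amounts to a two-key lookup. The sketch below encodes it as a dictionary; mapping the row labels (for example "Rewrite/Edit") to the task type identifiers from the classification table is an assumption, and `select_role` is a hypothetical helper name.

```python
# Rows follow the matrix above; each tuple is (low, medium, high).
ROLE_MATRIX = {
    "rewrite_editing": ("FAST_CHEAP", "BALANCED", "BALANCED"),
    "summarization": ("FAST_CHEAP", "FAST_CHEAP", "BALANCED"),
    "explain_simple": ("FAST_CHEAP", "FAST_CHEAP", "BALANCED"),
    "seo_marketing": ("BALANCED", "SEO_WRITER", "SEO_WRITER"),
    "prompt_optimization": ("BALANCED", "REASONING", "REASONING"),
    "coding_debugging": ("BALANCED", "CODING", "CODING"),
    "general_chat": ("FAST_CHEAP", "BALANCED", "BALANCED"),
}
COMPLEXITY_INDEX = {"low": 0, "medium": 1, "high": 2}


def select_role(task_type: str, complexity: str) -> str:
    """Look up the model role for a task type and complexity level."""
    # Unknown task types fall back to the general_chat row (an assumption).
    row = ROLE_MATRIX.get(task_type, ROLE_MATRIX["general_chat"])
    return row[COMPLEXITY_INDEX[complexity]]


print(select_role("coding_debugging", "high"))  # CODING
print(select_role("summarization", "medium"))   # FAST_CHEAP
```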

Model Registry

ARKAbrain maintains a registry of 11 models across OpenAI and OpenRouter. Each model is configured with its capabilities, costs, and strengths. The table below lists nine of them:

Model	Provider	Context	Speed	Strengths
GPT-4o	OpenAI	128K	Medium	General, Coding, Reasoning
GPT-4o Mini	OpenAI	128K	Fast	Fast, Cheap, General
o1 Mini	OpenAI	128K	Slow	Reasoning, Math, Coding
Claude 3.5 Sonnet	OpenRouter	200K	Medium	Coding, Writing, Long Context
Claude 3 Haiku	OpenRouter	200K	Fast	Fast, Cheap, General
Gemini 1.5 Pro	OpenRouter	2M	Medium	Reasoning, Long Context
Gemini 1.5 Flash	OpenRouter	1M	Fast	Fast, Cheap, Long Context
Llama 3.1 70B	OpenRouter	131K	Medium	General, Coding, Cheap
DeepSeek Coder	OpenRouter	128K	Fast	Coding, Cheap
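A registry like this can be modeled as a list of records that is filtered by capability. The sketch below copies the rows from the table above; the `ModelInfo` structure and the `models_with_strength` helper are hypothetical, and context sizes are kept as the labels shown rather than converted to exact token counts.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ModelInfo:
    name: str
    provider: str
    context: str  # label as shown in the table, e.g. "128K"
    speed: str
    strengths: Tuple[str, ...]


REGISTRY = [
    ModelInfo("GPT-4o", "OpenAI", "128K", "Medium", ("General", "Coding", "Reasoning")),
    ModelInfo("GPT-4o Mini", "OpenAI", "128K", "Fast", ("Fast", "Cheap", "General")),
    ModelInfo("o1 Mini", "OpenAI", "128K", "Slow", ("Reasoning", "Math", "Coding")),
    ModelInfo("Claude 3.5 Sonnet", "OpenRouter", "200K", "Medium",
              ("Coding", "Writing", "Long Context")),
    ModelInfo("Claude 3 Haiku", "OpenRouter", "200K", "Fast", ("Fast", "Cheap", "General")),
    ModelInfo("Gemini 1.5 Pro", "OpenRouter", "2M", "Medium", ("Reasoning", "Long Context")),
    ModelInfo("Gemini 1.5 Flash", "OpenRouter", "1M", "Fast",
              ("Fast", "Cheap", "Long Context")),
    ModelInfo("Llama 3.1 70B", "OpenRouter", "131K", "Medium", ("General", "Coding", "Cheap")),
    ModelInfo("DeepSeek Coder", "OpenRouter", "128K", "Fast", ("Coding", "Cheap")),
]


def models_with_strength(strength: str) -> List[str]:
    """Return the names of registered models that list the given strength."""
    return [m.name for m in REGISTRY if strength in m.strengths]


print(models_with_strength("Long Context"))
# ['Claude 3.5 Sonnet', 'Gemini 1.5 Pro', 'Gemini 1.5 Flash']
```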

Fallback Chains

Each role has a primary model and a chain of fallbacks. If the primary model fails (rate limit, error, etc.), ARKAbrain automatically tries the next model in the chain:

FAST_CHEAP

GPT-4o Mini → Claude 3 Haiku → Gemini 1.5 Flash → Llama 3.1 70B

BALANCED

GPT-4o → Claude 3.5 Sonnet → GPT-4 Turbo → Gemini 1.5 Pro

REASONING

o1 Mini → GPT-4o → Claude 3.5 Sonnet → Gemini 1.5 Pro

CODING

GPT-4o → Claude 3.5 Sonnet → DeepSeek Coder → GPT-4 Turbo

LONG_CONTEXT

Gemini 1.5 Pro → Claude 3.5 Sonnet → GPT-4o → Gemini 1.5 Flash

SEO_WRITER

GPT-4o → Claude 3.5 Sonnet → GPT-4o Mini → Llama 3.1 70B
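The fallback behavior can be sketched as a loop over the chain that returns the first successful response. The chains are copied from the lists above. The `call_model` callback, the `AllModelsFailed` exception, and the choice to treat any exception as a reason to move on are assumptions for illustration.

```python
from typing import Callable, Tuple

FALLBACK_CHAINS = {
    "FAST_CHEAP": ["GPT-4o Mini", "Claude 3 Haiku", "Gemini 1.5 Flash", "Llama 3.1 70B"],
    "BALANCED": ["GPT-4o", "Claude 3.5 Sonnet", "GPT-4 Turbo", "Gemini 1.5 Pro"],
    "REASONING": ["o1 Mini", "GPT-4o", "Claude 3.5 Sonnet", "Gemini 1.5 Pro"],
    "CODING": ["GPT-4o", "Claude 3.5 Sonnet", "DeepSeek Coder", "GPT-4 Turbo"],
    "LONG_CONTEXT": ["Gemini 1.5 Pro", "Claude 3.5 Sonnet", "GPT-4o", "Gemini 1.5 Flash"],
    "SEO_WRITER": ["GPT-4o", "Claude 3.5 Sonnet", "GPT-4o Mini", "Llama 3.1 70B"],
}


class AllModelsFailed(RuntimeError):
    """Raised when every model in a role's chain has failed."""


def run_with_fallback(
    role: str, prompt: str, call_model: Callable[[str, str], str]
) -> Tuple[str, str]:
    """Try each model in the role's chain in order; return (model, response)."""
    errors = []
    for model in FALLBACK_CHAINS[role]:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # rate limit, provider error, timeout, ...
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def flaky_client(model: str, prompt: str) -> str:
    """Hypothetical client that simulates a rate limit on the primary model."""
    if model == "GPT-4o Mini":
        raise RuntimeError("rate limited")
    return f"response from {model}"


print(run_with_fallback("FAST_CHEAP", "hi", flaky_client))
# ('Claude 3 Haiku', 'response from Claude 3 Haiku')
```

A production router would likely distinguish retryable errors (rate limits, timeouts) from permanent ones (invalid request) rather than catching every exception.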

Temperature Tuning

ARKAbrain sets a temperature range for each task type. Lower temperatures produce more focused, deterministic output; higher temperatures produce more varied, creative output:

Task Type	Temperature Range	Why
Coding/Debugging	0.1 - 0.3	Precision is critical
Summarization	0.2 - 0.3	Accuracy over creativity
Rewrite/Edit	0.2 - 0.4	Preserve meaning
Prompt Optimization	0.3 - 0.5	Balance precision and exploration
Explain Simply	0.4 - 0.6	Allow creative analogies
General Chat	0.5 - 0.7	Natural conversation
SEO/Marketing	0.6 - 0.8	Encourage creativity
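The table gives a range per task type but does not say how a single value is chosen within it. The sketch below assumes the midpoint of the range. The `pick_temperature` helper and the mapping of row labels to task type identifiers are hypothetical.

```python
# (min, max) temperature per task type, copied from the table above.
TEMPERATURE_RANGES = {
    "coding_debugging": (0.1, 0.3),
    "summarization": (0.2, 0.3),
    "rewrite_editing": (0.2, 0.4),
    "prompt_optimization": (0.3, 0.5),
    "explain_simple": (0.4, 0.6),
    "general_chat": (0.5, 0.7),
    "seo_marketing": (0.6, 0.8),
}


def pick_temperature(task_type: str) -> float:
    """Return the midpoint of the task type's range (an assumed policy)."""
    # Unknown task types use the general_chat range (also an assumption).
    low, high = TEMPERATURE_RANGES.get(task_type, TEMPERATURE_RANGES["general_chat"])
    return round((low + high) / 2, 2)


print(pick_temperature("coding_debugging"))  # 0.2
print(pick_temperature("seo_marketing"))     # 0.7
```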

Plan-Based Routing

Your subscription plan affects which roles are available:

Starter Plan

FAST_CHEAP role
BALANCED role
REASONING → BALANCED
CODING → BALANCED
LONG_CONTEXT → BALANCED
SEO_WRITER → BALANCED

Pro Plan

FAST_CHEAP role
BALANCED role
REASONING role (o1 Mini)
CODING role (optimized)
LONG_CONTEXT role (Gemini)
SEO_WRITER role

Automatic Downgrades

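On the Starter plan, requests that would use a premium role (REASONING, CODING, LONG_CONTEXT, or SEO_WRITER) are automatically downgraded to BALANCED. The primary model for BALANCED is GPT-4o, a strong general-purpose model, so Starter users still get good results.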
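The plan rules above reduce to a set-membership check followed by a default. In this sketch the plan keys `"starter"` and `"pro"` and the `resolve_role` helper are hypothetical names; the role sets follow the two plan lists.

```python
PLAN_ROLES = {
    "starter": {"FAST_CHEAP", "BALANCED"},
    "pro": {"FAST_CHEAP", "BALANCED", "REASONING", "CODING", "LONG_CONTEXT", "SEO_WRITER"},
}


def resolve_role(requested_role: str, plan: str) -> str:
    """Return the role the plan may use, downgrading premium roles to BALANCED."""
    if requested_role in PLAN_ROLES[plan]:
        return requested_role
    return "BALANCED"


print(resolve_role("REASONING", "starter"))  # BALANCED
print(resolve_role("REASONING", "pro"))      # REASONING
```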
