Premium museum expansion · Provider catalog
Arcee AI
Trinity sparse MoE foundation line (Nano / Mini / Large), AFM compact foundation models, a broad post-train catalog (SuperNova, Virtuoso, …), plus MergeKit, Arcee Fusion, and Arcee Cloud. Catalog below mirrors docs.arcee.ai structure (March 2026 snapshot); product names and preview SKUs change faster than this page.
Architecture & stack (overview)
Trinity models use a sparse mixture-of-experts (MoE) design with
efficient attention for lower latency and cost at long context. Public docs cite
AfmoeForCausalLM for Trinity MoE checkpoints; AFM-4.5B uses a dense decoder
(ArceeForCausalLM) with GQA and ReLU² activations.
Arcee Fusion + MergeKit remain the open merge stack for composing models (e.g. SuperNova family). Post-train lines often distill or merge from Llama-class and Qwen-class teachers.
Official: arcee.ai · Docs: docs.arcee.ai · Hugging Face: arcee-ai · Chat: chat.arcee.ai
Model catalog (docs sidebar)
Foundation vs post-train groupings as listed in Arcee’s documentation.
Foundation models
- Trinity Nano 6B Preview – smallest Trinity MoE; edge / local.
- Trinity Mini 26B – mid-size MoE; cloud & on-prem.
- AFM 4.5B – dense Arcee Foundation Model (non-Trinity).
Post-train models
- Arcee-SuperNova Medius, Arcee-SuperNova-Lite, Arcee-SuperNova (70B-class releases on HF)
- Arcee-Spark, Arcee-Miraj-Mini, Arcee-Agent, Arcee-Lite, Arcee-Nova, Arcee-Scribe, Arcee-SEC
- Virtuoso-Lite, Virtuoso-Medium-v2
- Caller (32B) – tool-use / orchestration (separate release track)
Trinity family (MoE)
All Trinity variants share the same capability profile in Arcee messaging; footprint (total vs active parameters) scales for edge vs cloud. Architecture: sparse MoE + efficient attention. Nano/Mini public specs: 128 experts, 8 active, 1 shared; trained on large curated mixes (e.g. 10T tokens for Nano/Mini with math/code emphasis) on clusters such as 512× H200 (per docs).
| Variant | Total / active | Context | Role |
|---|---|---|---|
| Trinity Nano | 6B / 1B active | 128K | Edge, on-device, offline, low-latency voice/UI loops. |
| Trinity Mini | 26B / 3B active | 128K | Cloud & on-prem (AWS, GCP, Azure, vLLM, SGLang, llama.cpp). |
| Trinity Large Preview | 400B / 13B active | 512K (frontier); preview API notes 128K @ 8-bit | Agent harnesses, toolchains, creative workloads; preview API / download flows. |
License (Nano/Mini): Apache 2.0 in public docs. Inference: Transformers (main branch), vLLM, llama.cpp, LM Studio where supported.
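The 128-experts / 8-active / 1-shared routing described above can be illustrated with a toy top-k router. This is a hypothetical sketch of the general MoE pattern, not Arcee's actual implementation; the function names are invented for illustration.

```python
import math

def route_tokens(gate_logits, k=8):
    """Toy top-k MoE router: softmax over expert logits, keep the k
    highest-scoring experts, and renormalize their gate weights
    (the shared expert is handled separately, outside the router)."""
    m = max(gate_logits)
    exps = [math.exp(g - m) for g in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in topk)
    return [(i, probs[i] / norm) for i in topk]

def moe_layer(x, experts, shared_expert, gate_logits, k=8):
    """Combine the k routed experts plus the always-on shared expert."""
    out = shared_expert(x)
    for idx, weight in route_tokens(gate_logits, k):
        out += weight * experts[idx](x)
    return out
```

The point of the sketch: with 128 experts but only 8 routed (plus 1 shared) per token, active parameters stay far below the total, which is how a 26B-total model runs with ~3B active.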
AFM-4.5B (dense foundation)
Instruction-tuned 4.5B decoder-only model for enterprise and edge: ~8T tokens
pretrain + midtraining (math/code), SFT + RL on preferences; data curation with Datology (per
Arcee docs). Architecture: GQA, ReLU², ArceeForCausalLM. License: Apache 2.0.
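The ReLU² (squared ReLU) activation mentioned above is simple to state; a minimal scalar sketch:

```python
def relu_squared(x: float) -> float:
    """ReLU^2 (squared ReLU): 0 for non-positive inputs, x**2 otherwise.
    Used in some feed-forward blocks as an alternative to GELU/SiLU."""
    return x * x if x > 0 else 0.0
```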
Post-train highlights (selected)
| Model | Notes (from Arcee docs) |
|---|---|
| Arcee-SuperNova Medius | ~14B; Qwen2.5-14B lineage; multi-teacher distillation (e.g. Qwen2.5-72B-Instruct + Llama-3.1-405B-Instruct narratives). |
| Arcee-SuperNova-Lite | 8B; Llama-3.1-8B-class; distilled from Llama-3.1-405B logits; EvolKit-style instruction data. |
| Arcee-SuperNova-v1 (70B) | Open merge / distillation flagship (earlier release wave). |
| Spark, Miraj-Mini, Agent, Lite, Nova, Scribe, SEC | Task-tuned and domain-specialized post-train SKUs – see per-model pages on docs. |
| Virtuoso-Lite, Virtuoso-Medium-v2 | Virtuoso merge line; medium/lite footprints. |
| Caller (32B) | Qwen2.5-32B-class; tooling / API orchestration focus. |
Capabilities (product)
- Agent reliability – function selection, valid parameters, schema-true JSON, recovery when tools fail.
- Coherent multi-turn – long sessions without re-explaining context.
- Structured outputs – JSON schema adherence; native function calling and tool orchestration.
- Same skills across sizes – move workloads between edge and cloud without rebuilding prompts.
- Efficient attention – lower cost at long context vs dense baselines (Arcee messaging).
- Context utilization – use large inputs for grounded answers.
Context & I/O
- 128K token context for Nano/Mini (and related API surfaces).
- Structured outputs with JSON schema adherence.
- Native function calling and tool orchestration.
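Tool definitions and schema-constrained outputs are typically passed in an OpenAI-compatible request body. A hedged sketch of both sides of that loop (the model ID, tool name, and schema below are placeholders for illustration, not confirmed Arcee identifiers):

```python
import json

def build_tool_call_request(model: str, user_msg: str) -> dict:
    """Build an OpenAI-style chat request carrying one tool definition.
    The model is expected to return schema-true JSON arguments."""
    return {
        "model": model,  # placeholder; check Arcee docs for live model IDs
        "messages": [{"role": "user", "content": user_msg}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

def parse_tool_arguments(raw_arguments: str) -> dict:
    """Check that the model's tool-call arguments are valid JSON with the
    required field; raise if the output is not schema-true."""
    args = json.loads(raw_arguments)
    if "city" not in args or not isinstance(args["city"], str):
        raise ValueError("tool arguments missing required 'city' string")
    return args
```

In practice the validation step is where "schema-true JSON" is enforced on the client side; a failed parse is the trigger for the tool-failure recovery behavior listed above.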
Training philosophy (docs)
- Curated data – quality filtering and classification pipelines.
- Synthetic augmentation – tool calling, schema adherence, error recovery, preferences, voice-friendly styles.
- Evaluation – tool reliability, long-turn coherence, structured-output accuracy.
Merge stack & cloud
- MergeKit – open merge tooling; multi-GPU acceleration.
- Arcee Fusion – selective fusion masks (importance scoring + thresholds).
- Arcee Cloud – train, merge, deploy custom LLMs (SaaS).
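The "importance scoring + thresholds" idea behind selective fusion can be shown with a toy per-parameter version: score each delta between two models and copy the donor value only where the delta clears a threshold. This is an illustrative sketch of the concept, not MergeKit's or Arcee Fusion's actual algorithm.

```python
def fuse_weights(base, donor, threshold=0.1):
    """Toy selective fusion over flat weight lists: score each parameter
    delta by magnitude and take the donor value only where the score
    exceeds the threshold, keeping the base weight everywhere else."""
    fused = []
    for b, d in zip(base, donor):
        importance = abs(d - b)  # simplest possible importance score
        fused.append(d if importance > threshold else b)
    return fused
```

The mask (which parameters crossed the threshold) is the "selective" part: small, likely noisy deltas are discarded instead of averaged in.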
Technical notes
| Topic | Detail |
|---|---|
| MergeKit | Community merges; blog: MergeKit v0.1+ (Arcee Fusion + multi-GPU). |
| Licensing | Many open weights under Apache 2.0 – confirm per checkpoint on Hugging Face. |
| Inference stacks | vLLM, SGLang, llama.cpp, LM Studio, Transformers (per model). |
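A MergeKit run is driven by a YAML config. A minimal sketch for a two-model SLERP merge, with placeholder model IDs and layer counts (adjust to the real checkpoints):

```yaml
# merge.yaml: example MergeKit config (placeholder model IDs)
merge_method: slerp
base_model: base-org/model-a
slices:
  - sources:
      - model: base-org/model-a
        layer_range: [0, 32]
      - model: other-org/model-b
        layer_range: [0, 32]
parameters:
  t: 0.5          # interpolation factor between the two models
dtype: bfloat16
```

Per the MergeKit docs, a config like this is run with `mergekit-yaml merge.yaml ./output-dir`; see the MergeKit README for the full set of merge methods and parameters.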
Selected timeline (2024โ2026)
- 2024: MergeKit + Arcee Fusion; AFM-4.5B; SuperNova / Virtuoso waves.
- 2025: Trinity Nano/Mini; expanded post-train; Caller; Arcee Cloud.
- 2026: Trinity Large preview tier; frontier MoE + agent positioning in market.
References
docs.arcee.ai (per-model pages: Trinity, AFM, SuperNova, …) · Arcee blog · Hugging Face · arcee-ai
Compiled from Arcee public docs and blog; not an official Arcee product sheet. Verify live model IDs, API limits, and licenses before production.