Prism AI

The multimodal AI operating platform.

Orchestration-first. Model-agnostic. Execution-ready.

Native Android Client

A complete workspace, built for the edge.

Before we dive into how the engine works, this is what you actually do with it. Prism AI brings desktop-grade multimodal capabilities to Android for the very first time. Every mode is optimized for a specific intent, powered by the local orchestrator.

The Engine Beneath

One model does not equal intelligence. True autonomy lives in the orchestration layer.

To make all those modes work seamlessly on a mobile device, Prism AI relies on an orchestration engine. It is not just a wrapper around an LLM; it operates like a kernel for AI workflows.

When your intent enters the system, the orchestrator decomposes the task, selects the best-suited model, provisions an isolated sandbox, injects long-term memory, and executes.

Intent Parsing

User request arrives. Orchestrator decodes intent.

Task Decomposition

Harness breaks intent into sub-tasks.

Skill & Tool Loading

Dynamically loads MCP tools and skills.

Sandboxed Execution

Runs code in an isolated AVF or Docker sandbox.

Memory Storage

Logs context to pgvector for long-term recall.
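
The five stages above can be read as a simple pipeline. What follows is a minimal Python sketch under stated assumptions, not Prism AI's actual implementation: every name in it (`Workflow`, `parse_intent`, `run_pipeline`, and so on) is hypothetical, and the sandbox and vector store are replaced by in-process stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical record carried through the five stages."""
    request: str
    intent: str = ""
    subtasks: list[str] = field(default_factory=list)
    tools: list[str] = field(default_factory=list)
    results: list[str] = field(default_factory=list)


def parse_intent(wf: Workflow) -> Workflow:
    # Stage 1: a real orchestrator would classify the request with a model;
    # here the intent is just the normalized request text.
    wf.intent = wf.request.strip().lower()
    return wf


def decompose(wf: Workflow) -> Workflow:
    # Stage 2: split the intent into sub-tasks (naively, on " and ").
    wf.subtasks = [p.strip() for p in wf.intent.split(" and ") if p.strip()]
    return wf


def load_tools(wf: Workflow) -> Workflow:
    # Stage 3: pick one tool per sub-task; the tool names are placeholders.
    wf.tools = [f"tool:{task.split()[0]}" for task in wf.subtasks]
    return wf


def execute(wf: Workflow) -> Workflow:
    # Stage 4: stand-in for sandboxed execution (no real isolation here).
    wf.results = [f"{tool} -> done" for tool in wf.tools]
    return wf


def store_memory(wf: Workflow, memory: list[Workflow]) -> Workflow:
    # Stage 5: stand-in for logging context to a vector store.
    memory.append(wf)
    return wf


def run_pipeline(request: str, memory: list[Workflow]) -> Workflow:
    wf = Workflow(request=request)
    for stage in (parse_intent, decompose, load_tools, execute):
        wf = stage(wf)
    return store_memory(wf, memory)
```

Calling `run_pipeline("Read the repo and summarize the README", memory)` yields two sub-tasks, one placeholder tool per sub-task, and one entry appended to `memory`.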

The Agentic Harness

The control layer that makes Prism AI autonomous and safe.

Models are engines, but they need a chassis. The harness observes outputs, plans next steps, evaluates results against intent, retries on failure, and formats final delivery. It protects the user from raw model hallucination.

Observe
Plan
Execute
Evaluate
Retry
Finalize
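
The six phases above form a bounded control loop. The sketch below shows one way such a loop could look in Python; it is an illustration only. The function name `run_harness`, its callback parameters, and the `max_retries` default are all assumptions, not part of Prism AI.

```python
from typing import Callable, Optional


def run_harness(
    plan: Callable[[str, list[str]], str],
    execute: Callable[[str], str],
    evaluate: Callable[[str, str], bool],
    intent: str,
    max_retries: int = 3,  # assumed default, not taken from the source
) -> dict:
    """Observe -> Plan -> Execute -> Evaluate -> Retry -> Finalize."""
    observations: list[str] = []            # Observe: outputs seen so far
    result: Optional[str] = None
    for attempt in range(1, max_retries + 1):
        step = plan(intent, observations)   # Plan the next step
        output = execute(step)              # Execute it
        observations.append(output)
        if evaluate(intent, output):        # Evaluate against the intent
            result = output
            # Finalize: package the accepted output for delivery
            return {"ok": True, "attempts": attempt, "result": result}
        # Retry: loop again; the failed output stays in `observations`
    return {"ok": False, "attempts": max_retries, "result": None}
```

Because the planner receives earlier outputs, a retry can differ from the first attempt rather than repeating it, and the loop always stops after `max_retries` attempts.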

Live System Metrics (Active)
Active Workflow: wf_9b8x2a
Tool Routing: mcp.github.read
Memory Injection: pgvector.K=5
Sandbox Status: AVF/Microdroid
Provider Context: Claude 3.5 Sonnet

Execution Trace
> evaluating constraints... ok
> generating plan strategy... done
> verifying tool permissions... authorized
> compiling payload... ready
> deploying to sandbox... running

Execution Environments

LLMs generate code. Prism AI executes it securely. We isolate every agent interaction within hardware-backed local microVMs or ephemeral cloud containers.

Zero-Trust Environments

Android Virtualization Framework

[Diagram: Android Host and Microdroid guest, separated by the pKVM boundary, on ARM hardware]

Local agents run code in Microdroid. Protected KVM (pKVM) enforces hardware-backed isolation between the host OS and the disposable guest VM.

Stateless Isolation

Ephemeral Containers

Cloud tasks spawn Docker containers or Firecracker microVMs that exist only for the duration of a single inference block.

Read-Only Rootfs

Immutable State

Guest environments boot from read-only root filesystems verified by dm-verity, which prevents persistent tampering.

Secure Boundary Line

Socket-Based IPC

[Diagram: AST_DIFF, STDOUT, SIGKILL]

Communication across the host-guest boundary occurs exclusively through constrained vsock protocols, safely piping AST diffs and stdout logs.
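
A constrained channel of this kind is often built from length-prefixed frames plus an allow-list of message types. The sketch below illustrates that idea in Python. The frame layout, the `ALLOWED_TYPES` set, the `MAX_FRAME` cap, and the function names are assumptions made for illustration. It uses `socket.socketpair()` as a local stand-in; on Linux, a real vsock endpoint would be opened with `socket.AF_VSOCK` instead.

```python
import json
import socket
import struct

ALLOWED_TYPES = {"ast_diff", "stdout"}  # assumed allow-list of message types
MAX_FRAME = 64 * 1024                   # assumed per-message size cap


def send_frame(sock: socket.socket, msg_type: str, payload: str) -> None:
    if msg_type not in ALLOWED_TYPES:
        raise ValueError(f"message type not allowed: {msg_type}")
    body = json.dumps({"type": msg_type, "payload": payload}).encode("utf-8")
    if len(body) > MAX_FRAME:
        raise ValueError("frame too large")
    # 4-byte big-endian length prefix, followed by the JSON body.
    sock.sendall(struct.pack(">I", len(body)) + body)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    # recv() may return fewer bytes than requested, so loop until done.
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-frame")
        buf.extend(chunk)
    return bytes(buf)


def recv_frame(sock: socket.socket) -> dict:
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    if length > MAX_FRAME:
        raise ValueError("frame too large")
    msg = json.loads(_recv_exact(sock, length).decode("utf-8"))
    # Validate on the receiving side too; never trust the sender.
    if msg.get("type") not in ALLOWED_TYPES:
        raise ValueError("message type not allowed")
    return msg
```

Both ends check the message type and size, so a compromised guest cannot push arbitrary commands or unbounded data through the channel.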

Models are engines. They are not the product.

Prism AI routes tasks through an abstraction layer. The orchestration layer remains stable regardless of the model you plug in.

OpenAI
Anthropic
OpenRouter
Groq
TogetherAI
Replicate
Ollama (Local)
Custom Provider
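
One way to keep the orchestration layer stable across providers is a small common interface with a router in front of it. The sketch below is an illustration under assumptions: `Provider`, `EchoProvider`, and `ModelRouter` are hypothetical names, and the stand-in provider makes no network calls.

```python
from typing import Protocol


class Provider(Protocol):
    """Hypothetical interface every model backend would satisfy."""
    name: str

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in backend; a real one would call a remote or local model."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class ModelRouter:
    def __init__(self) -> None:
        self._providers: dict[str, Provider] = {}

    def register(self, provider: Provider) -> None:
        self._providers[provider.name] = provider

    def complete(self, prompt: str, prefer: list[str]) -> str:
        # Try providers in preference order. Unregistered names are
        # skipped, and a failing call falls through to the next provider.
        errors: list[str] = []
        for name in prefer:
            provider = self._providers.get(name)
            if provider is None:
                continue
            try:
                return provider.complete(prompt)
            except Exception as exc:
                errors.append(f"{name}: {exc}")
        raise RuntimeError(f"no provider available: {errors}")
```

Callers only ever see `ModelRouter.complete`, so swapping or adding a backend means registering another object with the same two members.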

Extensibility Engine

Dynamically loaded skills & MCP interoperability.

Skills are not hardcoded. The orchestrator invokes them autonomously from the registry. Built-in support for Model Context Protocol (MCP) servers enables broad tool interoperability over stdio, HTTP, and WebSockets.

Native Skills
MCP Servers
CORE
UI Generator (agent:internal)
System Logic (skill:registry)
Local Repo (mcp:stdio)
Live Oracle (mcp:http)
Event Stream (mcp:ws)
local:stdio

Repo Analyzer

skill.invoke("repo-analyzer")
server:http

PDF Summarizer

skill.invoke("pdf-summarizer")
skill:registry

UI Generator

skill.invoke("ui-generator")
agent:internal

Prompt Enhancer

skill.invoke("prompt-enhancer")
mcp:websocket

SEO Optimizer

skill.invoke("seo-optimizer")
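
The `skill.invoke("...")` calls above suggest a name-to-handler registry. The following Python sketch shows how such a registry could work. The `SkillRegistry` class, its `register` decorator, and the toy `repo-analyzer` body are assumptions for illustration and do not reflect the actual skill implementations.

```python
from typing import Any, Callable


class SkillRegistry:
    """Hypothetical registry mapping skill names to handlers."""

    def __init__(self) -> None:
        self._skills: dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        # Decorator form, so skills can be added without editing the registry.
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._skills[name] = fn
            return fn
        return decorator

    def invoke(self, name: str, **kwargs: Any) -> Any:
        handler = self._skills.get(name)
        if handler is None:
            raise KeyError(f"unknown skill: {name}")
        return handler(**kwargs)

    def names(self) -> list[str]:
        return sorted(self._skills)


skill = SkillRegistry()


@skill.register("repo-analyzer")
def analyze_repo(files: list[str]) -> dict[str, int]:
    # Toy body: count files by extension ("" for files without one).
    counts: dict[str, int] = {}
    for path in files:
        ext = path.rsplit(".", 1)[-1] if "." in path else ""
        counts[ext] = counts.get(ext, 0) + 1
    return counts
```

With this in place, `skill.invoke("repo-analyzer", files=["a.py", "b.py", "README"])` dispatches by name at run time, which is what lets skills load dynamically rather than being hardcoded.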

Memory Architecture

A continuous living archive.

Conversations aren't just strings of text. They are vectorized, embedded, and stored for deep contextual recall across entirely separate sessions.

STM CACHE
LTM VECTOR STORE

Short-Term Cache

Redis In-Memory

High-performance volatile storage for active session state, short-lived UI locks, and immediate access to recent tokens. Designed for sub-millisecond retrieval during live conversations.

IOPS optimized
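
Short-term session state is typically stored with an expiry so stale entries disappear on their own. Redis supports per-key expiry natively. The sketch below is a pure-Python stand-in that shows the behavior without a Redis server. `SessionCache`, its injectable `clock`, and the key name in the usage note are hypothetical.

```python
import time
from typing import Any, Callable, Optional


class SessionCache:
    """In-process stand-in for a volatile cache with per-key expiry."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._items: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        # Store the value together with its absolute expiry time.
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Lazily evict expired entries on read.
            del self._items[key]
            return None
        return value
```

For example, `cache.set("session:42:draft", "hello")` is readable until the TTL elapses, after which `get` returns `None`.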

Long-Term Memory

PGVector Ext.

Persistent semantic embeddings of your entire workflow history. Conversations are dynamically chunked, vectorized, and injected into context when relevant topics come up.

Cosine similarity search
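
Cosine similarity ranks stored chunks by the angle between their embedding and the query embedding. The sketch below does this in plain Python over tiny hand-made vectors. The function names and the default `k=5` (borrowed from the `pgvector.K=5` readout earlier on the page) are illustrative assumptions. In pgvector the same ranking is expressed in SQL, ordering by the cosine distance operator `<=>`.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(
    query: list[float],
    chunks: list[tuple[str, list[float]]],
    k: int = 5,
) -> list[tuple[float, str]]:
    # Score every stored chunk, then keep the k most similar.
    scored = [(cosine_similarity(query, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

A linear scan like this is only practical for small stores. A vector database performs the same ranking, optionally with an approximate index.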

Engineering Transparency

Full Stack Architecture

Client UI

Kotlin, Jetpack Compose, Material 3, MVVM & Clean Arch, WebSockets

Orchestration & Backend

Python FastAPI, LangGraph, LiteLLM, MCP Protocol, Async Python

Isolated Execution

AVF (pKVM), Microdroid, Docker, Firecracker microVMs

Memory & Data

PostgreSQL, pgvector, Redis, Android Keystore

Roadmap

Architecture & Planning

We are currently in Phase 1. Architecture design and resource planning are actively underway to lay down a solid foundation.

Phase 1: Active

Architecture & Planning

Designing system boundaries, setting up environments, and architecture blueprints.

Phase 2: Upcoming

App Prototype

Core application interface, basic tool tracking, and basic workflows.

Phase 3: Upcoming

Core (AVF & Docker)

Integrating Android Virtualization Framework and secure sandbox deployments.

Phase 4: Upcoming

Full Launch

Refining stability, expanding plugins, and enabling global beta release.