The brain is the only omnimodal system that does it right
BrainVI turns it into an API.
Cognition API · Cognitive research lab for AI

Give machines a sense
of human emotion.

Agents and robots are fluent in language and blind to people. BrainVI is building a cortical-embedding API — a neuroscience-grounded read on how a human mind responds.

Get API key · Read the research
API in private beta · join the waitlist for early access
505 SUBJECTS · 755+ fMRI HOURS · 9 PUBLIC DATASETS · 81,924 CORTICAL VERTICES · fsaverage6 SURFACE · 6-STREAM ENCODER · CC0 / CC-BY LICENSED · 4 RESEARCH PAPERS
01↘ The gap
Intelligence reads your words.
It doesn't read you.
The blind spot

A robot in your home, an agent on your phone — each acts with no model of the person in front of it. Today's embeddings learn what looks alike on the internet, not how a mind responds.

The brain already solves this. Vision, sound, language, and feeling converge in one place — the cortex. That convergence is the representation machines are missing.

Principles · observed
  1. α
    The cortex is natively omnimodal.
  2. β
    Predict the brain, don't scan it.
  3. γ
    Honesty over flattery.
  4. δ
    Standing on prior art.
02↘ The thesis
The next frontier,
rooted in neuroscience.

Our bet: an embedding grounded in the cortex beats one scraped from the web. It carries how content is processed, not just what co-occurs online.

We call these cortical embeddings. Our flagship model already matches the published state of the art in brain encoding — on a third of the training data.

Read: The Average Brain Is No Brain At All
EMBEDDING · CONCEPT · PRIVATE BETA
# Cortical embedding — one omnimodal vector per stimulus
from BrainVI import MARY

a = MARY.embed("clip_a.mp4")        # → 81,924-dim cortical vector
b = MARY.embed("clip_b.mp4")        # second stimulus to compare against
sim = MARY.similarity(a, b)         # brain-space similarity, not pixel-space

# How alike are two stimuli in a mind — not on the internet?
# Interface shown for illustration · API in private beta
↑ illustrative · not a live endpoint yet
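One plausible reading of "brain-space similarity" is cosine similarity between two predicted cortical vectors. The sketch below assumes that reading. The function name `cosine_similarity` and the toy vectors are illustrative only and are not part of the BrainVI API, which may use a different metric.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 4-dim stand-ins for 81,924-dim cortical vectors (made-up values).
emb_a = [0.2, -0.1, 0.7, 0.0]
emb_b = [0.4, -0.2, 1.4, 0.0]   # same direction as emb_a, twice the magnitude
emb_c = [-0.7, 0.0, 0.2, 0.1]   # orthogonal to emb_a

sim_ab = cosine_similarity(emb_a, emb_b)  # close to 1: same response pattern
sim_ac = cosine_similarity(emb_a, emb_c)  # close to 0: unrelated pattern
print(sim_ab, sim_ac)
```

Cosine similarity ignores overall magnitude, so two stimuli that drive the same pattern at different strengths still score as alike.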
03↘ The model
A six-stream
cortical encoder.
The encoder

MARY maps any stimulus to predicted cortical activity at 81,924 vertices of the fsaverage6 surface. It does so through six backbones that mimic the brain's pathways, one per stream.

MARY Nano 1.0, our research-preview model, matches the Algonauts 2025 state of the art on the Schaefer-1000 parcellation: Pearson r = 0.216 versus 0.2146 for Meta's TRIBE, on just ~23 h of fMRI, a third of TRIBE's data. On held-out films it generalizes at r = 0.170, rising to 0.185 with MARY Nano 1.1.
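The scores above are Pearson correlations between predicted and measured fMRI responses. As a rough sketch of how such a score can be computed, the code below correlates prediction and measurement over time for each parcel, then averages across parcels. The averaging scheme, the function names, and the toy numbers are assumptions. The exact Algonauts evaluation protocol is not described on this page.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def mean_parcel_r(predicted, measured):
    """Average the per-parcel correlations (one time series per parcel)."""
    scores = [pearson_r(p, m) for p, m in zip(predicted, measured)]
    return sum(scores) / len(scores)


# Two toy parcels, four time points each (made-up values).
predicted = [[0.1, 0.4, 0.3, 0.8], [1.0, 0.0, 1.0, 0.0]]
measured = [[0.2, 0.8, 0.6, 1.6], [0.0, 1.0, 0.0, 1.0]]

# Parcel 0 is perfectly correlated, parcel 1 perfectly anti-correlated,
# so the two cancel in the mean.
score = mean_parcel_r(predicted, measured)
print(score)
```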

01 · Vision
Primary visual cortex (V1) · occipital
Spatiotemporal motion — movement, dynamics, and timing.
02 · Scene / VL
Fusiform & lateral occipitotemporal
Vision-language — scene understanding & semantics.
03 · Audio
Superior temporal gyrus · A1
Audio — sound events, music, and ambience.
04 · Speech
Inferior frontal gyrus · Broca's
Speech — words, voice, and prosody.
05 · Narrative
Prefrontal cortex
Long-context language — narrative & comprehension.
06 · On-screen text
Visual word-form area (VWFA)
On-screen text — captions & legible detail.
Frozen backbones are used as feature extractors only · we do not train them.
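To make the "frozen feature extractor" idea concrete, here is a minimal sketch of one common pattern: each stream produces a fixed feature vector, the vectors are concatenated, and only a readout on top maps them to vertices. The concatenation, the linear readout, the toy sizes, and every name below are assumptions for illustration. This page does not say how MARY fuses its streams.

```python
import random

STREAMS = ["vision", "scene_vl", "audio", "speech", "narrative", "onscreen_text"]
FEATURE_DIM = 4   # per-stream feature size (toy value)
N_VERTICES = 8    # toy stand-in for 81,924 cortical vertices


def frozen_backbone(stream, stimulus):
    """Hypothetical stand-in for a pretrained extractor whose weights never change."""
    rng = random.Random(f"{stream}:{stimulus}")  # same input always gives same features
    return [rng.uniform(-1.0, 1.0) for _ in range(FEATURE_DIM)]


def init_readout(seed=0):
    """The only trainable part in this sketch: one weight row per vertex."""
    rng = random.Random(seed)
    n_inputs = FEATURE_DIM * len(STREAMS)
    return [[rng.gauss(0.0, 0.1) for _ in range(n_inputs)] for _ in range(N_VERTICES)]


def predict_cortex(stimulus, readout):
    """Concatenate the six stream features, then apply the linear readout."""
    features = []
    for stream in STREAMS:
        features.extend(frozen_backbone(stream, stimulus))
    return [sum(w * f for w, f in zip(row, features)) for row in readout]


readout = init_readout()
activity = predict_cortex("clip.mp4", readout)
print(len(activity))  # one predicted value per vertex
```

Keeping the backbones frozen means training only has to fit the readout, which is one way a model could learn from a comparatively small amount of fMRI.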
04↘ The corpus
Trained on a clean corpus of real brains.
SUBJECTS
505
Across 9 public video-watching fMRI datasets.
fMRI HOURS
755+
Commercially safe, CC0 / CC-BY licensed.
VERTICES
81,924
Cortical surface points, fsaverage6 — 4× the field's resolution.
OF TRIBE'S DATA
33%
MARY Nano 1.0 matched the Algonauts SOTA on ~23h of fMRI. The rest is runway.
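The 81,924 figure follows from the mesh geometry. FreeSurfer's fsaverage surfaces are subdivided icosahedra, and an icosahedron subdivided n times has 10 · 4^n + 2 vertices per hemisphere. The short check below works through that arithmetic. Treating fsaverage5 as the comparison point for the "4×" claim is an assumption, since the page does not name the baseline.

```python
def icosphere_vertices(order):
    """Vertex count of an icosahedron subdivided `order` times."""
    return 10 * 4 ** order + 2


per_hemisphere_6 = icosphere_vertices(6)   # fsaverage6, one hemisphere
total_6 = 2 * per_hemisphere_6             # both hemispheres
total_5 = 2 * icosphere_vertices(5)        # fsaverage5, assumed baseline

print(per_hemisphere_6)  # 40962
print(total_6)           # 81924
print(total_5)           # 20484
print(total_6 / total_5) # just under 4
```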
Sources · all public

We train on openly licensed (CC0 / CC-BY) datasets. We also run our own fMRI studies, supported by close ties to academic faculty.

05↘ The API · private beta
One signal.
Many surfaces.

Score how a scene lands in visual cortex. Compare two voices by predicted response. Read a user's engagement. Condition a generative model on a cortical target.

Capabilities below are in private beta · join the waitlist for access
01 · EMBED
Cortical embeddings

Turn any image, video, audio clip, or text into a single omnimodal cortical vector — a representation of how a mind processes it.

MARY · 81,924-DIM · OMNIMODAL
02 · SCORE
Region-level response

Read predicted activity across cortical networks — attention, emotion, memory, and more — each grounded in a brain region, not a proxy classifier.

YEO-7 NETWORKS · PER-SECOND
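A region-level score of this kind can be sketched as an average of vertex-level predictions within each network, computed once per second. The code below assumes that approach. The vertex-to-network labels and activity values are made up, and `network_scores` is a hypothetical helper, not a BrainVI endpoint.

```python
from collections import defaultdict


def network_scores(activity, labels):
    """Mean predicted activity per network for a single time point."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, label in zip(activity, labels):
        sums[label] += value
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


# Toy parcellation: six vertices assigned to three of the Yeo-7 networks.
labels = ["visual", "visual", "limbic", "default", "default", "default"]

# Predicted vertex activity, one row per second (made-up values).
timeline = [
    [0.9, 0.7, 0.1, 0.2, 0.4, 0.0],
    [0.1, 0.3, 0.6, 0.5, 0.5, 0.5],
]

per_second = [network_scores(frame, labels) for frame in timeline]
for second, scores in enumerate(per_second):
    print(second, scores)
```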
03 · DECODE
Cross-modal translation

Move between modalities (image, text, and audio) through one shared cortical state. Demonstrated in our research.

RESEARCH PREVIEW
04 · MEMORY
BrainVI Memory

Drop-in multimodal memory for agents, indexed by cortical response — what a user saw, and how it landed.

AGENT INTEGRATION
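A memory "indexed by cortical response" can be pictured as a nearest-neighbour store keyed by cortical vectors. The sketch below assumes a brute-force search over unit-normalized vectors. `CorticalMemory`, its methods, and the 3-dim toy vectors are hypothetical and do not describe the BrainVI Memory interface.

```python
import math


class CorticalMemory:
    """Hypothetical in-memory store that retrieves items by vector similarity."""

    def __init__(self):
        self._items = []  # list of (payload, unit-length vector)

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0.0:
            raise ValueError("cannot index a zero vector")
        return [x / norm for x in vector]

    def add(self, payload, vector):
        self._items.append((payload, self._normalize(vector)))

    def query(self, vector, k=1):
        """Return the k payloads whose vectors are most similar to `vector`."""
        q = self._normalize(vector)
        scored = [
            (sum(a * b for a, b in zip(q, stored)), payload)
            for payload, stored in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [payload for _, payload in scored[:k]]


memory = CorticalMemory()
memory.add("sunset video", [0.9, 0.1, 0.0])
memory.add("alarm sound", [0.0, 0.2, 0.9])
memory.add("bedtime story", [0.1, 0.9, 0.2])

top = memory.query([0.8, 0.2, 0.1], k=2)
print(top)
```

A production index over 81,924-dim vectors would likely use an approximate nearest-neighbour structure instead of a linear scan.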
06↘ Research
Open by default.
07↘ Where we're headed
The conditioning signal
for generative AI.
Research direction

Cortical embeddings as a conditioning signal for generative models.

AIM · NEAR
A cortical alternative to CLIP / CLAP — an omnimodal conditioning space grounded in the brain.
AIM · MID
Reranking generative outputs by predicted comprehension, engagement, and recall.
AIM · LONG
Cortical LoRAs that condition image, video, and audio generation on a target neural response.
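The mid-term aim, reranking generative outputs, can be sketched as scoring each candidate on predicted comprehension, engagement, and recall, then sorting by a combined score. The weighted sum, the equal weights, and the scores below are all assumptions chosen for illustration. How such predictions would be produced or combined is not specified here.

```python
def combined_score(predictions, weights):
    """Weighted sum of the predicted response scores for one candidate."""
    return sum(weights[name] * predictions[name] for name in weights)


def rerank(candidates, weights):
    """Order candidate names from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda name: combined_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical predicted scores for three generated drafts.
candidates = {
    "draft_a": {"comprehension": 0.6, "engagement": 0.3, "recall": 0.2},
    "draft_b": {"comprehension": 0.5, "engagement": 0.6, "recall": 0.4},
    "draft_c": {"comprehension": 0.7, "engagement": 0.4, "recall": 0.1},
}
weights = {"comprehension": 1.0, "engagement": 1.0, "recall": 1.0}

ranked = rerank(candidates, weights)
print(ranked)
```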
08↘ Trust & compliance

Built to be trusted.

↘ SOC 2-CERTIFIED CLOUD INFRASTRUCTURE
↘ GDPR & CCPA PRINCIPLES BY DESIGN
↘ OPENLY LICENSED DATA · CONSENTED RESEARCH
↘ PRIVACY MODE — OPT OUT OF TRAINING ANY TIME
09↘ Get involved
Three ways in.
01
Get an API key
Join the waitlist for early access to the cortical-embedding API. Developers, agent builders, and curious consumers welcome.
Join waitlist
02
Investor interest
We're building the cognition layer for the agentic era. If you invest in deep tech and frontier AI, let's talk.
Express interest
03
Work with us
Neuroscience, ML, and infrastructure. If predicting the brain in silico is the problem you want to work on, register your interest.
See careers