Prim — Primitive Relational Intermediate Map — is an AI-native serialization format. It uses familiar key-value syntax and indentation-based nesting that both LLMs and developers already understand. No cryptic symbols, no type markers, no mixed delimiters.
Prim keeps what makes a format LLM-friendly — line-based atomicity, deterministic grammar, streaming compatibility — while using syntax that every model has seen millions of times in training.
- Why Prim?
- Key Features
- When Not to Use Prim
- Installation & Quick Start
- CLI
- Format Overview
- Using Prim with LLMs
- Documentation
- Contributing
- License
Existing formats optimize for human readability (YAML), interoperability (JSON), or schema rigidity (Protobuf). None optimize for how LLMs tokenize, attend to, and generate structured data — while staying familiar enough for ecosystem adoption.
Consider a simple object in JSON:
```json
{
  "user": {
    "name": "edward",
    "age": 20,
    "roles": ["admin", "developer"],
    "active": true
  }
}
```

Every `"`, `{`, `}`, and `,` costs a token. The structural overhead adds up.
Prim represents the same data with minimal syntax that any LLM already knows:
```prim
user:
  name: edward
  age: 20
  roles: [admin, developer]
  active: true
```
Lines are atomic. Indentation defines nesting. No closing delimiters, no quotes around bare words, no commas between fields.
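To illustrate what "lines are atomic" means in practice, here is a minimal sketch that splits one `key: value` line into its indentation, key, and raw value without consulting any other line. The `Line` struct and `split_line` function are hypothetical names for this example, not part of the toprim API, and the sketch ignores list items, quoted keys, and heredocs.

```rust
/// One parsed line: indentation width, key, and raw value text.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub struct Line<'a> {
    pub indent: usize,
    pub key: &'a str,
    pub value: Option<&'a str>,
}

/// Split a single `key: value` line on its own.
/// Returns `None` for blank lines or lines without a `:` separator.
pub fn split_line(line: &str) -> Option<Line<'_>> {
    // Leading spaces give the nesting level.
    let trimmed = line.trim_start_matches(' ');
    let indent = line.len() - trimmed.len();

    // Split on the first colon only, so values may contain colons.
    let (key, rest) = trimmed.split_once(':')?;
    let key = key.trim();
    if key.is_empty() {
        return None;
    }

    // An empty value means the line opens a nested block.
    let rest = rest.trim();
    let value = if rest.is_empty() { None } else { Some(rest) };
    Some(Line { indent, key, value })
}
```

Under these assumptions, `split_line("  name: edward")` yields indent 2, key `name`, and value `edward`, while `split_line("user:")` yields a key with no value, which marks the start of a block.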
- Familiar Syntax — key-value pairs and indentation-based nesting. No cryptic single-character operators. LLMs recognize the pattern immediately.
- Line-Based Atomicity — Every line is independently parseable. Streaming, chunking, and partial recovery are free.
- Standard Types — bare words and `"..."` for strings, bare digits for numbers, `true`/`false`/`null` keywords. No type marker symbols.
- Deterministic Parser — Grammar fits in ~100 LOC. One way to express any structure. No optional syntax, no contextual parsing.
- JSON Data Model — Encodes the same objects, arrays, and primitives as JSON. Lossless round-trips for the JSON data model.
- Streaming-Friendly — Line-oriented output works naturally with autoregressive generation. Emit one relation at a time.
- Zero-Dependency Core — Reference parser and CLI have minimal dependencies. The format itself is dependency-free.
By convention, Prim files use the .prim extension. Prim documents are always UTF-8 encoded with \n line terminators. The provisional media type is text/prim.
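Because documents are line-oriented with `\n` terminators, partial recovery from a truncated stream can be as simple as keeping everything up to the last newline. The sketch below shows that idea with a hypothetical helper, `complete_lines`. It is an illustration using only the standard library, not the reference parser's behavior.

```rust
/// Split a possibly truncated chunk into its complete lines and the
/// trailing fragment that has not yet been terminated by `\n`.
pub fn complete_lines(chunk: &str) -> (Vec<&str>, &str) {
    match chunk.rfind('\n') {
        // Everything before the last newline is a sequence of whole lines;
        // whatever follows it is an incomplete line to buffer or drop.
        Some(pos) => (chunk[..pos].split('\n').collect(), &chunk[pos + 1..]),
        // No newline yet: nothing is complete.
        None => (Vec::new(), chunk),
    }
}
```

A streaming consumer could parse the returned lines immediately and prepend the fragment to the next chunk it receives.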
- Human-first workflows — Prim is designed for machine consumption. If a human needs to hand-edit frequently, YAML or TOML may be more appropriate.
- Schema-heavy domains — Prim is schema-optional. If you need strict schema enforcement at the encoding layer, use Protobuf or Avro.
- Existing JSON pipelines — If your entire stack already speaks JSON and token costs aren't a bottleneck, switching formats may not be worth the integration cost.
- Binary data at scale — Prim is text-based. For high-throughput binary serialization, use MessagePack, CBOR, or Protobuf.
```bash
# Install from source
cargo install --path packages/cli
# Convert JSON to Prim
prim -i example.json -o example.prim
```

```bash
cargo add toprim
```

```rust
use toprim::json_to_prim;

fn main() {
    // Convert a JSON string to its Prim representation.
    let json = r#"{"name": "prim", "version": 1}"#;
    let prim = json_to_prim(json).unwrap();
    println!("{prim}");
    // name: prim
    // version: 1
}
```

Command-line tool for JSON ↔ Prim conversion.
```bash
# Convert JSON file to Prim
prim -i data.json -o data.prim
# Convert Prim file to JSON
prim -i data.prim -o data.json
# Output to stdout (omit -o)
prim -i data.json
```

Prim uses `key: value` pairs and indentation for nesting. No closing delimiters. No commas between fields. Bare words are strings.
| Element | Syntax | Example |
|---|---|---|
| string | bare word or `"…"` | `name: edward` |
| number | bare digits | `age: 20` |
| boolean | `true` / `false` | `admin: true` |
| null | `null` | `middle: null` |
| binary | `%base64` | `avatar: %iVBORw0KGgo` |
| bind | `key: value` | `host: localhost` |
| block | `key:` + indent | `user:` then `name:` |
| list | `- value` | `- rust` |
| inline list | `[v1, v2]` | `tags: [rust, cli, ai]` |
| heredoc | `<<EOF … EOF` | `text: <<EOF` |
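The scalar rows of the table can be read as a small classification procedure. The following is a rough sketch of that reading, assuming keywords are matched first, then the `%` binary prefix, then quoted strings, then numbers, with everything else treated as a bare-word string. The `Scalar` enum and `classify` function are hypothetical, escape sequences inside quotes are not handled, and the normative rules in SPEC.md may differ.

```rust
/// Scalar kinds from the table above. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
pub enum Scalar {
    Null,
    Bool(bool),
    Number(f64),
    Binary(String), // base64 payload, left undecoded here
    Str(String),
}

/// Classify the raw text of a value.
pub fn classify(raw: &str) -> Scalar {
    let raw = raw.trim();
    match raw {
        "null" => Scalar::Null,
        "true" => Scalar::Bool(true),
        "false" => Scalar::Bool(false),
        _ => {
            if let Some(payload) = raw.strip_prefix('%') {
                return Scalar::Binary(payload.to_string());
            }
            if raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"') {
                // Strip the surrounding quotes; escapes are not processed.
                return Scalar::Str(raw[1..raw.len() - 1].to_string());
            }
            // Only digit-led tokens (optionally signed) are tried as numbers,
            // so bare words such as "inf" or "nan" remain strings.
            let digit_led = raw
                .trim_start_matches('-')
                .starts_with(|c: char| c.is_ascii_digit());
            if digit_led {
                if let Ok(n) = raw.parse::<f64>() {
                    return Scalar::Number(n);
                }
            }
            Scalar::Str(raw.to_string())
        }
    }
}
```

With this ordering, a value such as `Apache-2.0` stays a string because it does not start with a digit, and a digit-led token that fails to parse as a number also falls back to a string.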
```prim
app:
  name: prim
  version: 1
  description: "AI-native serialization format"
config:
  debug: false
  max_connections: 100
  database:
    host: localhost
    port: 5432
features:
  - streaming
  - compact
  - partial-recovery
authors:
  - edward
license: Apache-2.0
```
- Familiar syntax — key-value and indentation patterns every LLM knows
- Line-based atomic units — streaming, chunking, partial recovery are free
- One way to do anything — no competing syntaxes for the same structure
- Standard types — bare words, quoted strings, bare numbers, keyword booleans
- Tiny deterministic grammar — ~100 LOC parser; fewer grammar branches leave fewer ways for a model to produce invalid output
Prim is designed for AI consumption. Wrap Prim in ```prim code blocks when providing structured data to models. The format is self-documenting — models parse it immediately since the syntax mirrors patterns they've seen millions of times in training (YAML, Markdown, configuration files).
When asking models to generate Prim output, a one-shot example is usually sufficient. The key conventions — key: value for scalars, indentation for nesting, - for list items — are already familiar.
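As a small illustration of the wrapping advice, the sketch below builds a prompt string that places a Prim document inside a fenced block labelled `prim`. The function name `wrap_for_prompt` and the prompt layout are assumptions made for this example. Nothing here calls a model or depends on the toprim crate.

```rust
/// Build a prompt that embeds a Prim document in a fenced block labelled `prim`.
/// Hypothetical helper for illustration only.
pub fn wrap_for_prompt(instruction: &str, prim_doc: &str) -> String {
    // The fence is built at runtime so this example never contains
    // a literal fence line of its own.
    let fence = "`".repeat(3);
    format!(
        "{instruction}\n\n{fence}prim\n{doc}\n{fence}\n",
        doc = prim_doc.trim_end()
    )
}
```

The same wrapper can carry a one-shot example: put a short Prim document in the fenced block and ask the model to answer in the same shape.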
See the skill file for a complete reference optimized for AI agents.
- SPEC.md — Normative specification with all rules and syntax
- SKILL.md — AI agent skill reference for generating and parsing Prim
- packages/cli/ — CLI tool and `toprim`/`primto` Rust libraries
Contributions welcome. Open an issue or pull request to discuss changes before submitting.
Licensed under the Apache License, Version 2.0. See LICENSE for the full text.
Thanks to toon-format for the idea and inspiration.
