Markdown-Linked Data: write RDF knowledge graphs as plain Markdown. Parse to quads, generate back, merge documents. Zero dependencies, round-trip safe.
MD-LD is the only RDF format that is both writable by humans and parseable by machines in the same document. Unlike Turtle (write-only), JSON-LD (machine-only), and RDFa (embedded-in-HTML-only), MD-LD annotations flow with natural Markdown prose, making knowledge graphs readable without a renderer.
- Specification: Formal specification and test suite
- Documentation: Complete documentation with guides and references
- Examples: Real-world MD-LD examples and use cases
- Grammar: EBNF+TextMate grammar specifications
- Ontologies: W3C and related standard ontologies used in RDF
MD-LD is not just another RDF syntax. It's a universal semantic writing interface that removes the intermediary between human text and machine-readable graphs.
Traditional systems require:
Human → UI → App Logic → Hidden Database → APIs → Exports
MD-LD enables:
Human text → Graph immediately
Core value: Author and maintain knowledge graphs as plain text with deterministic round-trip safety. No platforms, databases, or proprietary SaaS mediation required.
[ex] <tag:ame@example.com,2026:>
# Alice {=ex:alice .prov:Person label}
[Alice Smith] {ex:fullName}
[alice@example.com] {ex:email}

Generates RDF quads that work with n3.js, rdflib, and any RDF/JS-compatible library.

pnpm install mdld-parse

import { parse, generate, merge } from 'mdld-parse';
// Parse MDLD to RDF quads
const result = parse({ text: mdldString });
console.log(result.quads); // RDF/JS quads
console.log(result.primary); // Primary metadata (subject, type, label, comment)
console.log(result.statements); // Elevated statements
console.log(result.origin); // Provenance tracking
// Generate MDLD from quads
const { text } = generate({ quads: result.quads });
// Merge multiple documents (CRDT-style)
const merged = merge([doc1, doc2, doc3]);

Most software today uses graphs internally but hides them behind UIs:
- Notion, Slack, Google Docs: Human interfaces over hidden graphs
- CRMs, task apps, note apps: Proprietary data silos
- Social networks: Platform-controlled knowledge prisons
Users cannot access the graph directly. Semantics are hidden. Data is locked in products.
MD-LD removes the intermediary. Writing becomes publishing. Publishing becomes graph construction.
Key benefits:
- Graph sovereignty: You own text, graph, provenance, execution, history
- No central platform required: Works offline, in browsers, on servers
- Universal semantic substrate: Agents can read, reason, write, execute, validate
- Continuous semantic narrative: Unifies chat, tasks, notes, emails, calendar, files
- Native time dimension: Every action, statement, correction becomes part of the graph
- Decentralized authority: RFC 4151 tag: URIs enable self-sovereign identity without central registries
- Text-native agent memory: A plain-text memory substrate for LLM agents. Parse context, write knowledge, and merge with other agents, all as Markdown files. No database required.
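
The tag: URIs used throughout the examples follow the RFC 4151 shape `tag:<authority>,<date>:<specific>`, where the authority is a domain name or email address. The sketch below is a loose shape check only; `parseTagUri` is a hypothetical helper written for illustration and is not part of mdld-parse.

```javascript
// Hypothetical helper (not an mdld-parse API): loosely checks the RFC 4151 shape
//   tag:<authority>,<date>:<specific>
// The authority is a domain name or an email address; the date is YYYY[-MM[-DD]].
const TAG_URI =
  /^tag:([a-z0-9.-]+|[a-z0-9._+-]+@[a-z0-9.-]+),(\d{4}(?:-\d{2}){0,2}):(.*)$/i;

function parseTagUri(uri) {
  const match = TAG_URI.exec(uri);
  if (!match) return null; // not a tag: URI of the expected shape
  const [, authority, date, specific] = match;
  return { authority, date, specific };
}

console.log(parseTagUri('tag:alice@example.com,2026:meeting-2024-01-15'));
console.log(parseTagUri('https://example.com/alice')); // null
```

Because the authority is an email address or domain the author controls, minting a new identifier needs no registry lookup.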
[alice] <tag:alice@example.com,2026:>
# Meeting Notes {=alice:meeting-2024-01-15 .alice:Meeting label}
Attendees:
**Alice** {+alice:alice ?alice:attendee label}
**Bob** {+alice:bob ?alice:attendee label}
Action items:
**Review proposal** {+alice:task-1 ?alice:actionItem label}

[api] <tag:brian@example.org,2026:app/api/>
# Get User by ID {=api:/users/:id .api:Endpoint label}
Method: [GET] {+api:methods/GET ?api:method}
Path: [/users/:id] {api:path}
Status: [OK] {api:status}

[alice] <tag:alice@example.org,2026:>
# Semantic Web {=alice:research/paper-semantic-markdown .alice:ScholarlyArticle label}
Is part of [semantic research] {+alice:research/semantic !member}
Authored by [Alice Johnson] {+alice:alice-johnson ?alice:author} on [2026-08-12] {alice:datePublished ^^xsd:date}.

[blog] <tag:justin@example.org,2026:>
# Understanding MD-LD {=blog:post-mdld .blog:Post label}
[MD-LD] {blog:emphasized} allows you to embed RDF directly in Markdown.

- Prefix folding: Build hierarchical namespaces with CURIE-based IRI authoring
- Subject declarations: `{=IRI}` and `{=#fragment}` for context setting
- Object IRIs: `{+IRI}` and `{+#fragment}` for temporary object declarations
- Three predicate forms: `p` (S→L), `?p` (S→O), `!p` (O→S)
- Type declarations: `.Class` for rdf:type triples
- Datatypes & language: `^^xsd:date` and `@en` support
- Fragments: Document structuring with `{=#fragment}`
- Polarity system: Sophisticated diff authoring with `+` and `-` prefixes
- Origin tracking: Complete provenance with lean quad-to-source mapping
- Span chains: Walkable textual topology between semantic blocks for context recovery and resonance
- Elevated statements: Automatic rdf:Statement pattern detection
- Primary metadata quartet: Subject, type, label, comment for document identity
- Round-trip safety: Deterministic parse → generate cycles
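
To make the CURIE-based authoring in the list above concrete, here is a minimal sketch of how a prefixed name such as `ex:alice` expands against a prefix context. `expandCurie` is a hypothetical helper written for this illustration; the library exports its own `expandIRI`, whose signature is not assumed here.

```javascript
// Hypothetical helper for illustration only (not the library's expandIRI).
// Expands "prefix:local" against a context of prefix -> namespace IRI.
function expandCurie(curie, context) {
  const colon = curie.indexOf(':');
  if (colon === -1) return curie; // no prefix part, leave untouched
  const prefix = curie.slice(0, colon);
  const local = curie.slice(colon + 1);
  const known = Object.prototype.hasOwnProperty.call(context, prefix);
  return known ? context[prefix] + local : curie; // unknown prefixes pass through
}

const context = {
  ex: 'tag:ame@example.com,2026:',
  prov: 'http://www.w3.org/ns/prov#',
};

console.log(expandCurie('ex:alice', context));    // tag:ame@example.com,2026:alice
console.log(expandCurie('prov:Person', context)); // http://www.w3.org/ns/prov#Person
```

Absolute IRIs whose scheme is not a declared prefix (for example `tag:...` itself) fall through unchanged in this sketch.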
Bundle size: 86KB unminified, 20KB gzipped
pnpm install mdld-parse
node --input-type=module -e "
import { parse } from 'mdld-parse';
console.log(parse({ text: '# Test {=tag:test@example.org,2026:index .prov:Entity label}' }));
"

<script type="importmap">
{
"imports": {
"mdld-parse": "https://cdn.jsdelivr.net/npm/mdld-parse/+esm"
}
}
</script>
<script type="module">
import { parse } from 'mdld-parse';
const result = parse({ text: '[ex] <tag:my@example.com,2026:test/>\n\n# Hello {=ex:init .prov:Activity label}' });
</script>

You can copy and paste this code into your browser console to see the list of tasks as an easy-to-render JSON object.
const mdld = await import('https://cdn.jsdelivr.net/npm/mdld-parse/+esm')
const text = `[my] <tag:alice@example.org:>
# Tasks {=my:tasks .prov:Collection label}
## Task 1 {=my:tasks/1 .prov:Activity label}
One of my [urgent] {my:tasks/status} [tasks] {+my:tasks !prov:hadMember}
> Explore deeper the concept of a triple in RDF {comment}
## Task 2 {=my:tasks/2 .prov:Activity label}
One of my [tasks] {+my:tasks !prov:hadMember}
> Start building knowledge graphs {comment}
`;
const result = mdld.parse({ text });
function extractByType (quads, type) {
return Object.values(
quads.reduce((acc, q) => {
const s = q.subject.value;
const key = q.predicate.value.split(/[#/]/).pop();
(acc[s] ??= { iri: s })[key] = q.object.value;
return acc;
}, {})
)
.filter(x => x.type === type)
.map(({ type, ...x }) => x);
}
const tasks = extractByType(result.quads,"http://www.w3.org/ns/prov#Activity")
console.log(tasks);
/*
[
{
"iri": "tag:alice@example.org:tasks/1",
"label": "Task 1",
"status": "urgent",
"comment": "Explore deeper the concept of a triple in RDF"
},
{
"iri": "tag:alice@example.org:tasks/2",
"label": "Task 2",
"comment": "Start building knowledge graphs"
}
]
*/

MD-LD encodes a directed labeled multigraph where three nodes may be in scope:
- S: current subject (IRI)
- O: object resource (IRI from link/image)
- L: literal value (string + optional datatype/language)
Each predicate form determines the graph edge:
| Form | Edge | Example | Meaning |
|---|---|---|---|
| `p` | S → L | `[Alice] {label}` | literal property |
| `?p` | S → O | `[NASA] {=ex:nasa ?org}` | object property |
| `!p` | O → S | `[Parent] {=ex:p !hasPart}` | reverse object |
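
The table can be read as a small dispatch on the predicate prefix. The sketch below is an illustration of that mapping only, not the parser's implementation; `edgeFor` and the `scope` object are hypothetical names.

```javascript
// Illustrative only: maps the three predicate forms to a graph edge.
// S = current subject, O = object resource, L = literal value.
function edgeFor(token, { S, O, L }) {
  if (token.startsWith('?')) {
    return { subject: S, predicate: token.slice(1), object: O }; // ?p : S -> O
  }
  if (token.startsWith('!')) {
    return { subject: O, predicate: token.slice(1), object: S }; // !p : O -> S
  }
  return { subject: S, predicate: token, object: L };            // p  : S -> L
}

const scope = { S: 'ex:apollo11', O: 'ex:nasa', L: 'Apollo 11' };

console.log(edgeFor('label', scope));
console.log(edgeFor('?ex:organizer', scope));
console.log(edgeFor('!ex:hasPart', scope));
```

Only the direction and the kind of object change between forms; the predicate itself is the token with its prefix character removed.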
Set current subject (emits no quads):
[ex] <tag:nasa@example.org,2026:>
## Apollo 11 {=ex:apollo11}

Emit rdf:type triple:
[ex] <tag:nasa@example.org,2026:>
## Apollo 11 {=ex:apollo11 .ex:SpaceMission .prov:Entity}

Inline value carriers emit literal properties:
[ex] <tag:nasa@example.org,2026:>
# Mission {=ex:apollo11}
[Neil Armstrong] {ex:commander}
[1969] {ex:year ^^xsd:gYear}
[Historic mission] {ex:description @en}

Links create relationships (use the `?` prefix):
[ex] <tag:nasa@example.org,2026:>
# Mission {=ex:apollo11}
[NASA] {=ex:nasa ?ex:organizer}

Declare resources inline with `{+iri}`:
[ex] <tag:nasa@example.org,2026:>
# Mission {=ex:apollo11}
[Neil Armstrong] {+ex:armstrong ?ex:commander .Person}

Use `+` and `-` for retractions:
[ex] <tag:carol@example.org,2026:>
New student [Alice] {=ex:new-student .prov:Person ex:name} is our [class] {+ex:my-class !member}. I think she might know [Bob] {+ex:bob ?ex:knows}.
**Correction:** [Her] {=ex:new-student} name is not [Alice] {-ex:name}, it's [Ellie] {ex:name}.
**Correction:** I asked her directly - no, she doesn't know [him] {+ex:bob -?ex:knows}.
**IRI replacement:** Let's create a proper [Class] {=ex:my-class} record for [Ellie] {+ex:Ellie .prov:Person ex:name label ?member} instead of temporary [Ellie] {+ex:new-student -.prov:Person -ex:name -?member} record created earlier.

The output of `generate(parse({ text }))` would look like this:
[ex] <tag:carol@example.org,2026:>
# Ellie {=ex:Ellie .prov:Person label}
[Ellie] {ex:name}
# my-class {=ex:my-class}
[ex:Ellie] {+ex:Ellie ?member}

Parse MDLD to RDF quads with lean origin tracking.
Parameters:
- `text` (string, required): MDLD formatted text
- `context` (object, optional): Prefix mappings
- `dataFactory` (object, optional): Custom RDF/JS DataFactory
- `graph` (string, optional): Named graph IRI

Returns: { quads, remove, statements, origin, context, primarySubject, primary, md }
- `quads`: RDF/JS Quads (final resolved graph state)
- `remove`: RDF/JS Quads (external retractions for diff workflows)
- `statements`: Elevated SPO quads from rdf:Statement patterns
- `origin`: Lean origin tracking: `quadIndex`, `blocks`, `spans`, `documentStructure`
- `context`: Final context with prefixes
- `primarySubject`: String IRI or null (canonical append identity)
- `primary`: Primary metadata quartet: `{ subject, type, label, comment }`
- `md`: Clean Markdown with annotations stripped

Merge multiple MDLD documents with diff polarity resolution.
Parameters:
- `docs` (array): Array of markdown strings or ParseResult objects
- `options` (object, optional):
  - `context` (object): Prefix mappings

Returns: { quads, remove, statements, origin, context, primarySubjects, primary }
- `quads`: RDF/JS Quads (final resolved graph state)
- `remove`: RDF/JS Quads (external retractions)
- `statements`: Elevated statements from all documents
- `origin`: Merge origin with document tracking
- `context`: Final context with prefixes
- `primarySubjects`: Array of string IRIs (canonical identities)
- `primary`: Array of primary metadata objects

Use case: CRDT-style state management with append-only documents.
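
As a rough mental model of append-only merging with polarity, the sketch below folds a sequence of documents into one graph state, applying each document's retractions before its assertions. This is an assumption about ordering made for illustration; the library's actual resolution rules may differ. `mergeInOrder` and the simplified string-valued quads are hypothetical.

```javascript
// Illustrative model only (not the library's merge): each document contributes
// assertions (`quads`) and retractions (`remove`), applied in document order.
const keyOf = (q) => JSON.stringify([q.subject, q.predicate, q.object]);

function mergeInOrder(docs) {
  const state = new Map();
  for (const { quads = [], remove = [] } of docs) {
    for (const q of remove) state.delete(keyOf(q)); // retract first (assumed order)
    for (const q of quads) state.set(keyOf(q), q);  // then assert
  }
  return [...state.values()];
}

// Simplified quads with string terms, mirroring the Alice -> Ellie correction.
const first = {
  quads: [{ subject: 'ex:new-student', predicate: 'ex:name', object: 'Alice' }],
};
const correction = {
  remove: [{ subject: 'ex:new-student', predicate: 'ex:name', object: 'Alice' }],
  quads: [{ subject: 'ex:new-student', predicate: 'ex:name', object: 'Ellie' }],
};

console.log(mergeInOrder([first, correction]));
```

Because documents are only ever appended, the history of corrections stays in the text while the merged state reflects the latest facts.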
Generate deterministic MDLD from RDF quads.
Parameters:
- `quads` (array, required): RDF/JS Quads to convert
- `context` (object, optional): Prefix mappings
- `primarySubject` (string, optional): IRI to place first in output
- `compactInline` (boolean, optional): Inline type/label compaction (default: `false`)
- `renderReverse` (boolean, optional): Reverse connections as `!p` (default: `false`)
- `remove` (array, optional): RDF/JS Quads to retract (for diff generation)
- `lang` (string, optional): Preferred language for labels (e.g., `'en'`, `'es'`, `'fr'`). Priority: specified lang → untagged → English → any language

Returns: { text, context, compactStats }
- `text`: Generated MDLD text
- `context`: Full context with prefixes
- `compactStats`: Compaction metrics

Features: Visual styling, label-in-heading, round-trip safe, diff generation, language preference.
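
The label priority stated for `lang` (specified language, then untagged, then English, then any language) can be sketched as a simple fallback chain. `pickLabel` is a hypothetical helper, and it matches language tags exactly, which is a simplification; how the library compares tags is not specified here.

```javascript
// Illustrative fallback chain: specified lang -> untagged -> English -> any.
// Labels are modelled loosely on RDF/JS literals: { value, language },
// with language === '' meaning "no language tag".
function pickLabel(labels, lang) {
  const byLang = (tag) => labels.find((l) => l.language === tag);
  const chosen =
    (lang ? byLang(lang) : undefined) ?? byLang('') ?? byLang('en') ?? labels[0];
  return chosen?.value; // undefined when there are no labels at all
}

const labels = [
  { value: 'Tarea 1', language: 'es' },
  { value: 'Task 1', language: 'en' },
];

console.log(pickLabel(labels, 'es')); // Tarea 1
console.log(pickLabel(labels, 'fr')); // Task 1 (no French or untagged label, so English)
```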
Example with language preference:
const { text } = generate({
quads: result.quads,
lang: 'es' // Prefer Spanish labels
});

Generate node-centric MDLD for a specific IRI.
Parameters:
- `quads` (array, required): RDF/JS Quads to search
- `focusIRI` (string, required): IRI to center view on
- `context` (object, optional): Prefix mappings
- `compactInline` (boolean, optional): Inline compaction (default: `true`)
- `renderReverse` (boolean, optional): Reverse connections (default: `true`)
- `lang` (string, optional): Preferred language for labels (e.g., `'en'`, `'es'`, `'fr'`). Priority: specified lang → untagged → English → any language

Returns: { text, context, compactStats }
Safety: Returns empty text if focusIRI not found (prevents accidental full database rendering).
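
One plausible reading of a node-centric view is "every quad that touches the focus IRI, as subject or as object". The sketch below shows that selection on hand-built RDF/JS-shaped terms, including the empty result for an unknown IRI. `quadsAround` is a hypothetical helper, and the library's actual selection logic is an assumption here.

```javascript
// Illustrative selection only: quads where the focus IRI is subject or object.
const iri = (value) => ({ termType: 'NamedNode', value });
const lit = (value) => ({ termType: 'Literal', value, language: '' });

function quadsAround(quads, focusIRI) {
  return quads.filter(
    (q) =>
      q.subject.value === focusIRI ||
      (q.object.termType === 'NamedNode' && q.object.value === focusIRI)
  );
}

const EX = 'tag:nasa@example.org,2026:';
const quads = [
  { subject: iri(EX + 'apollo11'), predicate: iri(EX + 'commander'), object: iri(EX + 'armstrong') },
  { subject: iri(EX + 'apollo11'), predicate: iri(EX + 'year'), object: lit('1969') },
  { subject: iri(EX + 'nasa'), predicate: iri(EX + 'organizer'), object: iri(EX + 'other') },
];

console.log(quadsAround(quads, EX + 'armstrong').length); // 1 (incoming edge)
console.log(quadsAround(quads, EX + 'missing').length);   // 0 (nothing to render)
```

Returning nothing for an unknown IRI, rather than falling back to the whole graph, is what keeps a typo from rendering an entire dataset.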
Update carrier text of a literal quad in MDLD text.
Parameters:
- `text` (string): Original MDLD text
- `quad` (object): Quad to update
- `value` (string): New carrier text
- `origin` (object, optional): ParseResult.origin

Returns: Updated MDLD text (fail-safe)
Use case: Editor applications updating literal values.
Locate quad origin entry for UI navigation.
Returns: { blockId, range, valueRange, carrierType, ... } or null
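
The `valueRange` field suggests how an editor could rewrite a carrier without touching its annotation: splice the new text into the source at that range. The sketch below assumes `valueRange` is a `[start, end)` pair of string offsets, which is an assumption about its shape; `replaceRange` is a hypothetical helper and the offsets are computed by hand rather than taken from a real origin entry.

```javascript
// Illustrative splice: replace the characters in [start, end) with a new value.
function replaceRange(text, [start, end], value) {
  return text.slice(0, start) + value + text.slice(end);
}

const source = '[Alice Smith] {ex:fullName}';

// Stand-in for an origin entry; real entries come from the parser.
const carrierStart = source.indexOf('Alice Smith');
const entry = { valueRange: [carrierStart, carrierStart + 'Alice Smith'.length] };

console.log(replaceRange(source, entry.valueRange, 'Alice Jones'));
// [Alice Jones] {ex:fullName}
```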
import {
DEFAULT_CONTEXT, // Default prefix mappings
DataFactory, // RDF/JS DataFactory
hash, // String hashing
expandIRI, // IRI expansion
shortenIRI, // IRI shortening
parseSemanticBlock // Semantic block parsing
} from 'mdld-parse';

- Zero dependencies: Pure JavaScript, 85KB unminified (20KB gzipped)
- Streaming-first: Single-pass parsing, O(n) complexity
- Character-based tokenization: 20-28% faster than regex-based approaches
- Standards-compliant: RDF/JS data model, W3C CURIE 1.0 syntax
- Deterministic: Same input always produces same output
- Explicit semantics: No guessing, inference, or heuristics
- Dual-layer origin: Every parse emits both a semantic quad graph and a walkable textual topology graph simultaneously
The parser output includes a complete document chain at no extra cost:
[Block] --(Span)-- [Block] --(Span)-- [Block]
- Blocks (`origin.blocks`): semantic anchors, i.e. tokens that produced RDF quads, with `prevSpanId`/`nextSpanId` links
- Spans (`origin.spans`): textual observations, i.e. raw byte ranges between blocks, with bidirectional block and span links

Spans store no text; content is always recovered via `sourceText.slice(span.range[0], span.range[1])`. This unlocks context-aware UI, autocomplete neighborhood retrieval, and cross-document topology without any parser-level interpretation.
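
A minimal sketch of that recovery step, using a hand-built span rather than real parser output. Only the `range` field comes from the description above; the rest of the span shape and the `spanText` helper are hypothetical.

```javascript
// Illustrative only: recover the prose between two semantic blocks from a range.
const sourceText =
  '# Alice {=ex:alice label}\nSome prose here.\n[Bob] {+ex:bob ?ex:knows}';

// Hand-built stand-in for an entry of origin.spans.
const span = {
  range: [sourceText.indexOf('\nSome'), sourceText.indexOf('[Bob]')],
};

const spanText = (text, s) => text.slice(s.range[0], s.range[1]);

console.log(JSON.stringify(spanText(sourceText, span))); // "\nSome prose here.\n"
```

Since spans carry offsets only, the same origin data stays valid for any consumer that still holds the original source string.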
- Real-time (60fps): Up to 4,527 quads per frame
- Batch processing: Up to 225,059 quads per second
- Memory efficient: ~640 bytes per quad retained after GC
- Streaming-friendly: Full document never in memory
Quads work with:
- `n3.js`: Turtle/N-Triples serialization
- `rdflib.js`: RDF stores
- `sparqljs`: SPARQL queries
- `rdf-ext`: RDF utilities

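
To show what the RDF/JS term shape carries, here is a small hand-rolled N-Triples serializer for triples with named-node, blank-node, and literal terms. In practice a library such as n3.js would do this; `termToNT` and `quadToNT` are hypothetical helpers, the escaping covers only the most common characters, and the example quad is built by hand rather than produced by the parser.

```javascript
// Illustrative serializer for RDF/JS-shaped terms (graph component ignored).
const XSD_STRING = 'http://www.w3.org/2001/XMLSchema#string';

const escapeLiteral = (s) =>
  s.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n').replace(/\r/g, '\\r');

function termToNT(term) {
  if (term.termType === 'NamedNode') return `<${term.value}>`;
  if (term.termType === 'BlankNode') return `_:${term.value}`;
  // Otherwise treat the term as a Literal.
  const lexical = `"${escapeLiteral(term.value)}"`;
  if (term.language) return `${lexical}@${term.language}`;
  const datatype = term.datatype?.value;
  return datatype && datatype !== XSD_STRING ? `${lexical}^^<${datatype}>` : lexical;
}

const quadToNT = (q) =>
  `${termToNT(q.subject)} ${termToNT(q.predicate)} ${termToNT(q.object)} .`;

const sampleQuad = {
  subject: { termType: 'NamedNode', value: 'tag:ame@example.com,2026:alice' },
  predicate: { termType: 'NamedNode', value: 'tag:ame@example.com,2026:fullName' },
  object: { termType: 'Literal', value: 'Alice Smith', language: '' },
};

console.log(quadToNT(sampleQuad));
// <tag:ame@example.com,2026:alice> <tag:ame@example.com,2026:fullName> "Alice Smith" .
```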
- RDF 1.1: Core RDF concepts
- RDFS: Schema vocabulary
- PROV-O: Provenance ontology
- SHACL: Constraint validation
- W3C CURIE 1.0: Compact URI syntax
pnpm test

Comprehensive test suite covering:
- Syntax parsing and tokenization
- Context management and prefix folding
- Polarity system and retractions
- Elevated statements detection
- Primary metadata extraction
- Round-trip parse/generate cycles
- Origin tracking and provenance
MD-LD is a craft project. Its coherence comes from a single evolving understanding of how semantic text should work, not from consensus but from sustained attention to the same problem over time.
This means:
- Decisions are made by the steward, informed by discussion and use
- The project prioritizes conceptual integrity over inclusiveness
- Contributions that align with the model are welcomed and incorporated
- Contributions that expand scope without deepening coherence are respectfully declined
- The spec will not grow features to attract users; it will grow depth to serve understanding
MD-LD is currently published as copyrighted source material.
The project is under active development and no open-source license has been selected yet.
Individuals, researchers, educators, and non-commercial users are welcome to experiment with the technology.
Organizations interested in production or commercial use should contact the author.
The long-term governance and licensing model remains under evaluation.
The primary goal at this stage is preserving the simplicity, interoperability, and long-term integrity of the system while the ecosystem forms around it.