A stratified, semiring-parameterized Datalog and its execution engine — the symbolic core of a neuro-symbolic logic system. This repository is the CPU reference stack: frontend, IR, checker, interpreter, and reference implementations of the probabilistic (режим B) and answer-set layers. The GPU engine and the knowledge-compilation / incremental optimizations are later phases (see ARCHITECTURE.md).
Status. The executable core runs end-to-end:
text → parse → check → Core-IR → interpret → result. Positive Datalog, stratified negation, and aggregates over the Bool and Trop (tropical, min-plus) semirings; exact probabilistic queries (?prob, distribution semantics) and an answer-set (stable model) solver for@aspmodules. Cross-checked against Soufflé and fuzzed (10k random programs, naive vs semi-naive and vs Soufflé). Everything beyond this — the GPU backend,Prov/Prov_kprovenance,neural,@terms,?grad— parses into valid IR and returns a stable "not implemented in Phase 0" diagnostic.
cargo build
cargo run -p strata-cli -- run examples/tc.strataexamples/tc.strata computes a transitive closure:
% transitive closure
domain node.
pred edge(node, node): Bool.
pred path(node, node): Bool.
path(X, Y) :- edge(X, Y).
path(X, Z) :- edge(X, Y), path(Y, Z).
edge(a, b).
edge(b, c).
edge(c, d).
$ strata run examples/tc.strata
edge(a, b)
edge(b, c)
edge(c, d)
path(a, b)
path(a, c)
path(a, d)
path(b, c)
path(b, d)
path(c, d)
Grammar priority: optimized for LLMs to write, human readability second.
Familiar Datalog :-, Prolog lexical convention (variables Uppercase,
constants lowercase), and mandatory predicate signatures so a typo becomes
a compile error, not a silently empty relation.
pred edge(node, node): Bool. % a predicate signature: types + semiring
5 :: edge(a, b). % a tropical (weighted) fact
reach(X, Z) :- edge(X, Y), reach(Y, Z). % a recursive rule
unreach(X) :- node(X), not reach(X). % stratified negation
outdeg(X, count<Y>) :- edge(X, Y). % an aggregate
input edge from "edges.tsv". % load facts from a Soufflé-compatible TSV
| Feature | Notes |
|---|---|
| Semirings | Bool (boolean Datalog), Trop (min-plus: shortest paths, Viterbi) |
| Weights | Trop weights are i64 with a distinct +∞ — comparisons are bit-exact |
| Negation | not p(..) — stratified in the deductive core; unstratified under @asp |
| Aggregates | min/max/sum/count (prob_or reserved); non-recursive |
| Probabilistic | 0.87 :: fact. + ?prob q(..) → exact marginals (distribution semantics) |
| Answer sets | @asp. module → stable models via a reference solver |
| EDB | inline facts, or input p from "file.tsv" |
$ strata run examples/prob.strata # ?prob path(a, c)
0.625 :: path(a, c)
$ strata run examples/asp.strata # @asp in/out choice over {x, y}
Answer 1: {in(x), in(y), node(x), node(y)}
Answer 2: {in(x), node(x), node(y), out(y)}
Answer 3: {in(y), node(x), node(y), out(x)}
Answer 4: {node(x), node(y), out(x), out(y)}
The full language is described in docs/language.md; the formal grammar is in docs/grammar.ebnf.
strata check <file.strata> [--error-format=text|json] # parse + type-check, no run
strata run <file.strata> [--semi-naive] # evaluate and print relations
strata fmt <file.strata> [--check] # canonical formatter
strata ir <file> --to json|surface # convert surface <-> JSON IR
Diagnostics carry a stable code (E0xxx front-end, E1xxx checker), a source
span, and (where possible) a machine-applicable fix:
$ strata check bad.strata
error[E1001]: predicate `edge` is used but never declared
--> 2:1
| path(X, X) :- edge(X, X).
| ^^^^^^^^^^^^^^^^^^^^^^^^^
Exit codes: 0 ok · 1 diagnostics · 2 usage · 4 runtime fault.
The High-IR JSON document is the source of truth; the surface syntax is a
canonical projection of it (strata ir converts both ways, and fmt is
parse → print). An LLM can author either. The JSON Schema is published at
schema/high-ir.schema.json; the encoding
convention is in docs/ir-encoding.md.
A Cargo workspace of six crates, layered so the base has no sibling dependencies:
strata-ir IR data model (High-IR + Core-IR), symbol dictionary, diagnostics,
JSON schema, tropical weight — the shared base (no sibling deps)
strata-front lexer, parser (surface → High-IR), canonical printer, fmt, E0xxx diagnostics
strata-check dependency graph, stratification, type/semiring checks (table 2.4),
normalization High-IR → Core-IR, E1xxx diagnostics
strata-eval the reference interpreter over Core-IR — naive T_P, semi-naive,
and exact probabilistic marginals (режим B); the differential oracle
strata-asp the reference answer-set solver (grounding + stable models)
strata-cli the `strata` binary
strata-front and strata-check are siblings that both depend only on
strata-ir. The GPU backend (later phase) will sit beside strata-eval,
consuming the same Core-IR. See ARCHITECTURE.md for the full
picture and CONTRIBUTING.md to build and test.
The reference interpreter is the oracle. It ships two engines — the
obviously-correct naive T_P fixpoint and a semi-naive delta engine — that are
cross-checked against each other, and against Soufflé for the Bool fragment:
cargo test # unit + integration + corpus diffs
cargo test -p strata-eval --test fuzz # naive == semi-naive over 10k random programs
cargo test -p strata-cli --test souffle_diff # our engine vs Soufflé (needs `souffle`)
STRATA_SOUFFLE_FUZZ_N=10000 cargo test -p strata-cli --test souffle_diff fuzz_bool_vs_souffleTrop is validated against an independent shortest-path oracle rather than
Soufflé. If souffle is not installed, the differential tests skip cleanly.
Probabilistic queries are computed by exact possible-world enumeration (the
obviously-correct режим-B reference); answer sets by the Gelfond–Lifschitz
reduct — both the slow, exact oracles a compiled/GPU method must later match.
book/ holds Programs That Know Why — a short book on logic programming in the
age of LLMs and the design of Strata/K, built with mdBook.
📖 Read it online: https://vsov.github.io/strata-k/
Work in progress. Every runnable listing lives in examples/book/ and runs
under the current strata CLI (CI-checked — see crates/strata-cli/tests/book_listings.rs). Text is licensed CC BY-SA 4.0; example code is
MIT/Apache like the rest of the repo.
mdbook serve book # if you have mdBook installed- docs/language.md — the language reference (syntax + semantics)
- ARCHITECTURE.md — crate graph, two-level IR, evaluation engines
- CONTRIBUTING.md — build, test, and the differential-testing story
- docs/grammar.ebnf — the formal grammar
- docs/ir-encoding.md — the JSON IR encoding convention
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)
at your option. The book text under book/ is licensed
CC BY-SA 4.0; its example code
follows the repo's MIT/Apache dual license.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.