FAIR Envelope

Problem

A shareable object without a label is a black box — nobody outside your project can find it, tell if it’s trustworthy, know who made it, or use it legally. This pattern applies to aDNA’s shareable units (lattices, modules, datasets): without a standard metadata envelope, sharing them means emailing files around and explaining each one by hand. That doesn’t scale.

Solution

Wrap every shareable object in a FAIR envelope — a standardized metadata block that answers four questions (§6):

| Principle | Question | Key Fields |
| --- | --- | --- |
| Findable | Can someone discover this? | keywords (required), identifier (optional DOI) |
| Accessible | Can they retrieve it? | license (required, SPDX), access_protocol, location |
| Interoperable | Can they use it with other systems? | format (type vocabulary), schema |
| Reusable | Can they build on it? | provenance, creators |

Two required fields: keywords (at least one) and license (SPDX identifier). Everything else is optional but recommended for shared objects.
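
The two-field rule is simple enough to check mechanically. The sketch below is one possible check over a flat FAIR block that has already been parsed into a Python dict. The function name `validate_fair` and the error strings are hypothetical, not part of aDNA, and the license check only confirms that a value is present: it does not verify the value against the SPDX license list.

```python
from typing import Any


def validate_fair(fair: dict[str, Any]) -> list[str]:
    """Return problems found in a flat FAIR block; an empty list means it passes."""
    problems: list[str] = []

    keywords = fair.get("keywords")
    # Required: a non-empty list of non-empty strings.
    if not isinstance(keywords, list) or not keywords:
        problems.append("keywords: at least one keyword is required")
    elif not all(isinstance(k, str) and k.strip() for k in keywords):
        problems.append("keywords: every keyword must be a non-empty string")

    license_id = fair.get("license")
    # Required: presence only. This sketch does NOT check the SPDX license list.
    if not isinstance(license_id, str) or not license_id.strip():
        problems.append("license: an SPDX identifier is required")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every missing field in a single pass.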

Two representations:

| Format | Where | Structure |
| --- | --- | --- |
| Flat | .lattice.yaml, .dataset.yaml | fair.keywords, fair.license, etc. |
| Nested | Vault .md frontmatter | fair.findable.keywords, fair.accessible.license, etc. |

Both are round-trip safe for core fields. Use flat in YAML transport files, nested in markdown documentation.
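
As a rough illustration of the round trip, the conversion can be sketched as a regrouping of fields by principle. The grouping below follows the key-fields table in this section. The nested key names `interoperable` and `reusable` are assumed by analogy with `fair.findable` and `fair.accessible`, and the function names are hypothetical. The authoritative correspondence is the FAIR mapping reference, not this sketch.

```python
# Field-to-principle grouping, taken from the key-fields table in this section.
# "interoperable" and "reusable" as nested key names are assumptions.
PRINCIPLE_FIELDS = {
    "findable": ["keywords", "identifier"],
    "accessible": ["license", "access_protocol", "location"],
    "interoperable": ["format", "schema"],
    "reusable": ["provenance", "creators"],
}


def flat_to_nested(flat: dict) -> dict:
    """Group flat fair.* fields under their principle (fair.<principle>.*)."""
    nested = {}
    for principle, fields in PRINCIPLE_FIELDS.items():
        group = {name: flat[name] for name in fields if name in flat}
        if group:  # skip principles with no fields present
            nested[principle] = group
    return nested


def nested_to_flat(nested: dict) -> dict:
    """Flatten fair.<principle>.* fields back to fair.*."""
    flat = {}
    for principle, fields in PRINCIPLE_FIELDS.items():
        group = nested.get(principle, {})
        for name in fields:
            if name in group:
                flat[name] = group[name]
    return flat


flat = {"keywords": ["protein-design", "binder"], "license": "MIT"}
assert nested_to_flat(flat_to_nested(flat)) == flat
```

Fields outside the table are dropped in both directions, so this sketch round-trips only the fields listed there.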

The minimum viable envelope:

fair:
  keywords: [knowledge-architecture, documentation]
  license: "MIT"

Two fields, two lines. That’s the entry cost for making an object findable and legally shareable.

When to Use

  • Every .lattice.yaml and .dataset.yaml file
  • Every module definition intended for sharing
  • Context files in a shared context library
  • Any object that might be federated, published, or used by another project

Example: This Vault

Lattice examples: Every .lattice.yaml in what/lattices/examples/ carries a FAIR block. The protein_binder_design example has:

fair:
  license: "MIT"
  keywords: [protein-design, binder, computational-biology]
  creators: ["Lattice Labs"]
  provenance: "Designed for de novo protein binder pipeline"

Four lines that make this lattice discoverable, legally clear, attributable, and traceable.

Context files: The files in what/context/adna_core/ use FAIR-aligned metadata in their frontmatter: sources (provenance), tags (findability), and quality scores (reusability signals). While they use frontmatter fields rather than a formal fair: block, the principles are the same — every file carries enough metadata to evaluate before loading.

The FAIR mapping reference: what/context/adna_core/context_adna_core_fair_mapping.md documents the complete field correspondence between flat and nested FAIR, including anti-patterns (mixing formats, omitting license for shared objects).

Anti-Pattern

No metadata at all: A lattice file with nodes and edges but no fair: block. It works locally but cannot be federated, because federation validation requires both license and keywords.

Keywords as an afterthought: keywords: [misc]. Keywords are the primary findability mechanism, so they should be specific enough to distinguish this object from others: [protein-design, binder], not [science, data].

Missing license on shared objects: federation.shareable: true without fair.license. The federation protocol will reject it — no license means no legal basis for sharing.

Mixing flat and nested: Using fair.findable.keywords in a .lattice.yaml file (should be flat fair.keywords) or fair.keywords in vault frontmatter (should be nested fair.findable.keywords). Pick the format for the file type and be consistent.
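
The four anti-patterns above can be caught with a small lint pass. The sketch below assumes a .lattice.yaml file that has already been parsed into a dict, so the flat format is the expected one. The function name `lint_lattice`, the finding messages, and the `VAGUE_KEYWORDS` set are all hypothetical. The vague-keyword list only echoes the examples in this section and is not an official vocabulary.

```python
PRINCIPLES = {"findable", "accessible", "interoperable", "reusable"}
# Illustrative only: drawn from the examples in this section.
VAGUE_KEYWORDS = {"misc", "science", "data"}


def lint_lattice(doc: dict) -> list[str]:
    """Flag the anti-patterns in a parsed .lattice.yaml mapping (flat format expected)."""
    fair = doc.get("fair")
    if not isinstance(fair, dict) or not fair:
        return ["no fair: block"]

    findings = []

    # Mixing formats: principle-level keys do not belong in a flat file.
    nested_keys = sorted(PRINCIPLES.intersection(fair))
    if nested_keys:
        findings.append("nested keys in a flat file: " + ", ".join(nested_keys))

    # Keywords as an afterthought: every keyword is a generic placeholder.
    keywords = fair.get("keywords") or []
    if keywords and all(isinstance(k, str) and k in VAGUE_KEYWORDS for k in keywords):
        findings.append("keywords too generic")

    # Missing license on a shared object.
    shareable = (doc.get("federation") or {}).get("shareable") is True
    if shareable and not fair.get("license"):
        findings.append("shareable without license")

    return findings
```

A check for vault frontmatter would invert the format test and flag top-level fair.keywords or fair.license instead.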