The "cnif" artifact: the C code generator's output as a NIF file.
This is deliberately not NIFC: the C text is kept verbatim (Nim's C-level machinery — exception handling in particular — is more refined than what NIFC models today; the gap can be closed incrementally later). The only structure the artifact adds is the part dead code elimination and generic-instance merging need:
- raw C text as string literals
- every global entity's C name as a Symbol token
- every emitted proc definition as a (cdef SymbolDef flags ...) group
The C generator marks names with control characters at the single place a global's C name is minted (fillBackendName) and emits a definition directive at the single place finished procs are appended; the marks then ride through all of the snippet composition untouched. This module turns the final marked module text into the .c.nif artifact and strips the marks for the actual .c output. Rendering C from the artifact is a plain token walk: string literals verbatim, symbols by name — which is also where a later merge step redirects losing generic instances.
Marker scheme (cannot collide: C string literals escape control chars, and \1/\31/\23 of cgen's postprocess directives are distinct):
- \2 name \3: a global's C name
- \4 name \31 flags \31 nif \5: start of the definition of name; nif is the defining symbol's NIF name (empty for backend-minted symbols) so a later run can re-demand the definition
- \4 \5: end of the definitions section
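To make the marker scheme concrete, here is a minimal, self-contained sketch of how such marks could be composed and stripped again. It is not the module's implementation: the constants only mirror the documented values, and `markSym`, `defDirective`, `endDefs` and `stripMarks` are hypothetical stand-ins for illustration.

```nim
const
  SymStart = '\x02'   # mirrors CnifSymStart
  SymEnd   = '\x03'   # mirrors CnifSymEnd
  DefStart = '\x04'   # mirrors CnifDefStart
  DefSep   = '\x1F'   # mirrors CnifDefSep
  DefEnd   = '\x05'   # mirrors CnifDefEnd

proc markSym(name: string): string =
  ## Wraps a global's C name in symbol marks (hypothetical helper).
  SymStart & name & SymEnd

proc defDirective(name, flags, nifName: string): string =
  ## Builds a definition-start directive (hypothetical helper).
  DefStart & name & DefSep & flags & DefSep & nifName & DefEnd

proc endDefs(): string =
  ## Builds the empty end directive (hypothetical helper).
  DefStart & DefEnd

proc stripMarks(s: string): string =
  ## Removes symbol marks (keeping the names) and whole directives.
  result = newStringOfCap(s.len)
  var i = 0
  while i < s.len:
    case s[i]
    of SymStart, SymEnd:
      inc i                      # drop the mark, keep the name around it
    of DefStart:
      # skip everything up to and including the closing DefEnd
      while i < s.len and s[i] != DefEnd: inc i
      if i < s.len: inc i
    else:
      result.add s[i]
      inc i
```

Because the marks are single control characters that C string literals always escape, a one-pass scan like this needs no knowledge of C syntax.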
Types
CnifHeads = object
  valid*: bool ## file parsed, carries the meta head and has the current format version
  initRequired*: bool
  datInitRequired*: bool
  semmedNif*: string ## the semmed NIF this TU was generated from
  moduleBase*: string ## the module's mangled base name
  cdefs*: seq[tuple[cname, nifname: string]] ## the proc definitions
  cdata*: seq[tuple[cname, nifname: string]] ## the data definitions
  crefs*: seq[string] ## C names referenced but not defined here
  cdeps*: seq[string] ## module suffixes whose routine bodies this TU embeds (impl-cookie gated on reuse)
- The cheap-to-parse part of an artifact that a later run needs in order to reuse the TU without regenerating it.
CnifLiveness = object
  defs*: int ## proc definitions emitted across all modules
  liveDefs*: int ## of those, reachable from the roots
  live*: HashSet[string] ## live C names
  broken*: bool
MergeDecision = object
  live*: HashSet[string] ## globally reachable C names (dead cdefs are dropped from every module)
  owners*: Table[string, string] ## for each `'u'`-flagged (unique, externally-linked) definition, the single artifact base name allowed to embed its body; every other module prototypes it
  broken*: bool ## an artifact was missing or unparsable; the caller should fall back / regenerate
  defs*, liveDefs*: int
- What the per-module backend's merge stage computes from every module's .c.nif and what its emit stage consumes to render the final .c of one module.
Consts
CnifDefEnd = '\x05'
CnifDefSep = '\x1F'
CnifDefStart = '\x04'
CnifSymEnd = '\x03'
CnifSymStart = '\x02'
CnifVersion = "4"
- Artifact format version, stored in the meta head. Artifacts written by an older compiler lack the NIF names and the cref group that the def-retention check needs (added in v2), the cdeps group that the fine-grained reuse gate needs (added in v3), or the type NIF names and cnif-marked extern RTTI references that the typeinfo flavor of the def-retention check needs (added in v4). readCnifHeads reports them as invalid, so their TUs simply regenerate once.
MergeDecisionFile = "ic.backend.merge.nif"
- Fixed name of the merge stage's output in the nimcache, read by emit.
Procs
proc cnifDefDirective(name, flags, nifName: string): string {.raises: [], tags: [], forbids: [].}
proc cnifEndDefs(): string {.raises: [], tags: [], forbids: [].}
proc computeLiveFromCArtifacts(files: openArray[string]): CnifLiveness {.raises: [ValueError, KeyError], tags: [ReadDirEffect, RootEffect], forbids: [].}
-
dce1-style mark&sweep over the C-shaped artifacts: a (cdef ...) group is a definition (flags 'x'/'c'/'m' — exportc, compilerproc, method/dispatcher — make it a root), names at the top level (data, globals, init code) are roots, names inside a group are its uses. Because the artifact is fully lowered output, no conservative modelling is needed: every call the C code contains is a token here.
NB: mangled C names contain no dots, so NIF's text reader classifies them as Ident rather than Symbol; the dialect therefore treats Ident tokens as name uses. Inside a (cdef ...) the flags ident is the one immediately following the SymbolDef; everything after is a use.
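The mark and sweep described above can be sketched on an in-memory model. This is only an illustration under stated assumptions: the `Def` type and `computeLive` are hypothetical, and the real proc reads the names from .c.nif files rather than from a table.

```nim
import std/[sets, tables, strutils]

type
  Def = object
    flags: string        # flags of the (cdef ...) group, e.g. "x", "c", "m" or ""
    uses: seq[string]    # names appearing inside the group

proc computeLive(defs: Table[string, Def];
                 topLevelUses: openArray[string]): HashSet[string] =
  ## Marks every name reachable from the roots (hypothetical sketch).
  result = initHashSet[string]()
  var work: seq[string] = @[]
  # Names used at the top level (data, globals, init code) are roots.
  for n in topLevelUses: work.add n
  # Definitions flagged 'x', 'c' or 'm' are roots as well.
  for name, d in defs:
    if d.flags.contains({'x', 'c', 'm'}): work.add name
  # Propagate liveness through the uses of each reached definition.
  while work.len > 0:
    let n = work.pop()
    if not result.containsOrIncl(n):
      if n in defs:
        for u in defs[n].uses: work.add u
```

A definition that is only used by dead definitions never enters the worklist, so it stays out of the live set.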
proc computeMergeDecision(files: openArray[string]): MergeDecision {.raises: [ValueError, KeyError], tags: [ReadDirEffect, RootEffect], forbids: [].}
-
One pass over every .c.nif: the same mark&sweep as computeLiveFromCArtifacts plus, per definition, owner assignment.
Each cg process emits the body of every definition it demands (emit-everywhere), so the same externally-linked definition appears in several artifacts. A 'u' flag on the (cdef ...) marks those that need exactly one owner, assigned here across processes: the owner is the lexicographically smallest artifact that emits it — a pure function of the claimant set, hence stable across rebuilds. Definitions without 'u' (inline procs, dispatchers) are static/main-only and emitted into every using TU, so they get no owner entry and are never deduplicated.
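The owner rule can be illustrated in isolation. The sketch below assumes the claims have already been collected as (artifact, C name) pairs for 'u'-flagged definitions; `assignOwners` and the tuple shape are hypothetical and not part of the module's API.

```nim
import std/tables

proc assignOwners(claims: openArray[tuple[artifact, cname: string]]):
    Table[string, string] =
  ## Maps each claimed C name to the lexicographically smallest claimant.
  result = initTable[string, string]()
  for c in claims:
    if c.cname notin result or c.artifact < result[c.cname]:
      result[c.cname] = c.artifact
```

The result depends only on the set of claimants, not on the order in which artifacts are visited, which is what makes the assignment stable across rebuilds.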
proc hasCnifMarks(s: string): bool {.raises: [], tags: [], forbids: [].}
proc readCnifHeads(f: string): CnifHeads {.raises: [ValueError], tags: [ReadDirEffect, RootEffect], forbids: [].}
- Reads (meta ...), (cdata ...), (cref ...) and the (cdef ...) head names from an artifact. Artifacts written by an older compiler (no meta head or a different format version) report valid=false.
proc readMergeDecision(f: string): MergeDecision {.raises: [ValueError], tags: [ReadDirEffect, RootEffect], forbids: [].}
- Reads back a writeMergeDecision file; broken=true if absent/unparsable.
proc renderCFromArtifact(artifact: string; d: MergeDecision; ownerId: string; dropped: var int): string {.raises: [ValueError], tags: [ReadDirEffect, RootEffect], forbids: [].}
- The per-module backend's emit stage: render one module's final .c from its .c.nif and the merge decision. String literals are emitted verbatim, symbols by name; a (cdef ...) body is dropped when the name is dead, or when it is a 'u' unique definition this module does not own. The body's prototype lives in the surrounding raw text (cgen emits a forward declaration for every used proc, independent of where the body lands), so a dropped body still leaves a valid declaration and no synthesis is needed. The head groups (meta/cdata/cref/cdeps) carry no C text.
proc renderMarkedC(code: string; live: HashSet[string]; dropped: var int): string {.raises: [], tags: [], forbids: [].}
- Renders the final C text from the marked module text: symbol marks are removed (keeping the names; a later merge step substitutes them here), and definitions whose name is not in live are dropped entirely. Each definition is self-delimiting (genProcAux emits an end directive right after the proc's text), so text written by other emitters is never part of a definition's span and survives unconditionally.
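A rough sketch of this kind of rendering pass follows. It assumes that a definition's span runs from its start directive to the next empty end directive; `renderLive` and the local constants are hypothetical and only mirror the documented marker values.

```nim
import std/[sets, strutils]

const
  SymStart = '\x02'   # mirrors CnifSymStart
  SymEnd   = '\x03'   # mirrors CnifSymEnd
  DefStart = '\x04'   # mirrors CnifDefStart
  DefSep   = '\x1F'   # mirrors CnifDefSep
  DefEnd   = '\x05'   # mirrors CnifDefEnd

proc renderLive(code: string; live: HashSet[string];
                dropped: var int): string =
  ## Strips marks and drops definitions whose name is not live.
  result = ""
  var i = 0
  var skipping = false           # true while inside a dead definition
  while i < code.len:
    let c = code[i]
    if c == DefStart:
      let close = code.find(DefEnd, i + 1)
      if close < 0: break        # malformed input: stop rather than guess
      let head = code[i + 1 ..< close]
      if head.len == 0:
        skipping = false         # end directive closes the current span
      else:
        let name = head.split(DefSep)[0]
        skipping = name notin live
        if skipping: inc dropped
      i = close + 1
    elif c == SymStart or c == SymEnd:
      inc i                      # drop the mark, keep the name
    else:
      if not skipping: result.add c
      inc i
```

Text outside any definition span is copied unconditionally, which matches the guarantee that output from other emitters survives.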
proc stripCnifMarks(s: string): string {.raises: [], tags: [], forbids: [].}
- Removes the symbol marks (keeping the names) and the definition directives (entirely) so the result is plain C.
proc writeCnifArtifact(code: string; outfile: string; initRequired = false; datInitRequired = false; dataDefs: openArray[tuple[cname, nifname: string]] = []; semmedNif = ""; moduleBase = ""; implDeps: openArray[string] = []) {.raises: [Exception], tags: [RootEffect], forbids: [].}
- Splits the marked module text into the .c.nif artifact. The artifact starts with four head groups. The (meta <flags> "semmedNif" "moduleBase" "version") head records whether the module has an init/datInit proc ('i'/'d'), which semmed NIF it was generated from, and the module's mangled base name; registerModuleToMain and the reuse decision need these when the TU is reused in a later run, possibly without the module ever being loaded again. A (cdata (SymbolDef StrLit)*) group names the data definitions (consts, globals, RTTI) the TU embeds, together with their NIF names. A (cref Ident*) group names every C name the TU references but does not define itself; the def-retention check consults it when some other TU regenerates. A (cdeps Ident*) group names the modules whose routine bodies this TU embeds (redirected defs, shared instances, hooks); the fine-grained reuse gate checks their .impl.nif cookies on top of the direct imports' .iface.nif cookies.
proc writeMergeDecision(outfile: string; d: MergeDecision) {.raises: [KeyError, Exception], tags: [RootEffect], forbids: [].}
- Serializes the merge decision: (merge (live Symbol*) (owners (own Symbol StrLit)*)). C names are mangled (no dots) so they serialize as symbols; owner artifact base names go in string literals.