Incremental Compilation (IC)

The nim ic command provides incremental compilation for Nim projects. It decomposes compilation into per-module steps whose results are cached as NIF files, and uses the external nifmake build tool to re-run only the steps whose inputs changed.

This document describes how `nim ic` works today, including the edge cases that shaped the current design. The per-module backend rewrite that earlier editions of this document listed as a Plan has landed: the whole-program, reuse/redirect/def-retention backend is gone and codegen is now a set of nifmake-driven per-module rules (see The backend).

Overview

The pipeline has two halves driven by one process (nim ic, commandIc in compiler/deps.nim) that constructs a dependency graph, writes a build file, and hands it to nifmake:

  1. Frontend — per module:
    • nifler parse --deps turns .nim source into a parsed NIF (.p.nif) plus a static dependency list (.deps.nif).
    • nim m (the semantic step, cmdM) reads the parsed NIF + the precompiled NIFs of the module's imports, type-checks, and writes the semmed NIF (.nif) plus invalidation sidecars (see Cookies).
  2. Backend: nim nifc (cmdNifC, compiler/nifbackend.nim) reads the semmed NIFs, generates C, compiles and links.

nifmake orders the steps by their input/output files: every nim m runs before the nim nifc step that consumes its NIF, and a step re-fires only when one of its inputs is newer than its outputs. The driver invokes nifmake run --parallel by default, so independent steps at the same DAG depth fan out across cores; pass -d:icNoParallel to serialize (readable child output when debugging a build).
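The re-fire rule described above can be pictured with a small Python sketch. This is an illustration under assumptions, not nifmake's actual implementation: the function name `needs_rerun` is hypothetical, and the real tool may treat equal timestamps or missing inputs differently.

```python
import os

def needs_rerun(inputs, outputs):
    """Decide whether a build rule must re-fire (hypothetical helper).

    A rule re-fires when any output is missing, or when at least one
    input has a newer mtime than the oldest output.
    """
    if not all(os.path.exists(p) for p in outputs):
        return True
    oldest_output = min(os.path.getmtime(p) for p in outputs)
    return any(os.path.getmtime(p) > oldest_output for p in inputs)
```

Under this model, touching one module's source makes only the rules that list it (directly or through regenerated outputs) as an input run again; everything else is skipped.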

Artifacts (the NIF zoo)

Per module <suffix> (a content hash of the path; see NIF symbols below), under the nimcache directory:

| File | Producer | Purpose |
|---|---|---|
| <s>.p.nif | nifler | parsed AST (syntactic) |
| <s>.deps.nif | nifler | static import list (syntactic imports) |
| <s>.s.deps.nif | nim m | real post-sem imports (incl. macro-generated); see Discovery |
| <s>.nif | nim m | semmed module (symbols resolved, typed) |
| <s>.iface.nif | nim m | iface cookie: hash of the importer-visible surface |
| <s>.impl.nif | nim m | impl cookie: hash of the entire content (bodies included) |
| <s>.edges.nif | nim m | NeedsImpl edges: modules whose bodies this sem consumed |
| <s>.c.nif | nim nifc | the C text as a NIF, with def/ref markers for DCE & dedup |
| ic_config.cfg.nif | driver | precompiled config replayed by every child (icconfig.nim) |
| ic.version | driver | format stamp; a mismatch wipes the cache (icFormatVersion) |
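The ic.version entry amounts to a stamp-or-wipe check on the cache directory. A minimal Python sketch, assuming the stamp is a plain text file that holds the format version; the function name `stamp_or_wipe` and the file contents are hypothetical and not taken from the compiler sources.

```python
import os
import shutil

def stamp_or_wipe(cache_dir, format_version):
    """Keep the cache if its stamp matches, otherwise wipe and re-stamp it.

    Returns True if the cache was (re)created, False if it was kept.
    """
    stamp_path = os.path.join(cache_dir, "ic.version")
    stored = None
    if os.path.isfile(stamp_path):
        with open(stamp_path) as f:
            stored = f.read().strip()
    if stored == str(format_version):
        return False  # same format: every cached artifact stays usable
    # Missing or mismatched stamp: discard everything and start clean.
    shutil.rmtree(cache_dir, ignore_errors=True)
    os.makedirs(cache_dir)
    with open(stamp_path, "w") as f:
        f.write(str(format_version))
    return True
```

Wiping the whole directory is the conservative choice: artifacts written in an older format are never mixed with new ones.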

NIF symbols and ownership

(See ../nifspec/doc/nif-spec.md.) A global symbol is <ident>.<disamb>.<moduleSuffix>. For a generic instantiation the <disamb> is not a counter but a content hash: setInstanceDisamb (modulegraphs.nim) MD5s the generic's identity plus the typeKey of every concrete type argument, masks it to 30 bits and tags it with InstanceDisambBit. So the only part of the name that varies between two modules making the same instantiation (seq[Foo]) is the <moduleSuffix>. Two consequences drive the backend:

The driver: graph construction (commandIc)

  1. Stamp/wipe the cache by icFormatVersion.
  2. Seed the graph with the root module and `system.nim`. system's entire import closure is folded into one node (one nim m invocation) — see single-writer below.
  3. traverseDeps runs nifler per module and reads .deps.nif to add import edges.
  4. SCC grouping: strongly-connected import cycles are collapsed (Tarjan). A singleton compiles as nim m <mod>; a cycle compiles as one nim m <rep> --icGroup:<member>… that builds every member from source in one process (resolving the recursion in memory) and writes each member's NIF. Only edges leaving the component become build-graph inputs.
  5. Discovery fixpoint: write the build file, run nifmake; if it fails, re-derive the graph from every module's .s.deps.nif (adding nodes/edges for imports the static scanner missed), and retry. See Discovery.
  6. The backend step (nim nifc) depends on every module's semmed NIF, so nifmake runs it last.
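Step 4 can be sketched in Python as Tarjan's algorithm followed by a condensation of the import graph. This is a sketch under assumptions: the function names are hypothetical, the import graph is a plain dict, and picking the lexicographically first member as the group representative is an illustrative choice, not necessarily what commandIc does.

```python
def tarjan_scc(graph):
    """Return the strongly connected components of an import graph.

    graph maps a module name to the list of modules it imports.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            sccs.append(sorted(comp))

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

def build_rules(graph):
    """One rule per component; only edges leaving it become inputs."""
    sccs = tarjan_scc(graph)
    rep_of = {m: comp[0] for comp in sccs for m in comp}
    rules = {}
    for comp in sccs:
        rep = comp[0]  # assumed representative: first member by name
        inputs = sorted({rep_of[w] for m in comp
                         for w in graph.get(m, ()) if rep_of[w] != rep})
        rules[rep] = {"members": comp, "inputs": inputs}
    return rules
```

A two-module cycle therefore yields a single rule with two members, and the edges between those two modules disappear from the build graph.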

Invalidation: the cookie system

A dependent must re-sem only when a dependency's relevant surface changed. Two hashes per module (ast2nif.nim):

NeedsImpl edges (.edges.nif): if a module consumed another module's body during sem — a macro expansion, a generic instantiation, a getImpl, or a compile-time call run in the VM — it records a strong edge. The dependent is then gated on that dependency's impl cookie instead of its iface cookie, so e.g. const x = dep.foo() re-sems when foo's body changes. Recording sites: semExprs.semTemplateExpr (templates), seminst.generateInstance (generics), vmgen.genProc (VM/macros/CT procs), vm.opcGetImpl (getImpl). Inline iterators and inline procs are not tracked — they are inlined at codegen, where the backend's NIF-mtime invalidation re-codegens their users.
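The gating rule can be illustrated with a short Python sketch. It assumes a module is modeled as a list of signatures plus a list of bodies, and it uses SHA-256 purely as a stand-in hash; the names `make_cookies` and `must_resem` are hypothetical, and the real cookies in ast2nif.nim are computed over NIF content rather than strings.

```python
import hashlib

def _digest(lines):
    # Stand-in hash; the compiler's actual cookie function is not shown here.
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()[:16]

def make_cookies(signatures, bodies):
    """iface covers the importer-visible surface; impl covers everything."""
    return {
        "iface": _digest(signatures),
        "impl": _digest(signatures + bodies),
    }

def must_resem(old, new, needs_impl):
    """A dependent with a NeedsImpl edge is gated on the impl cookie,
    every other dependent only on the iface cookie."""
    key = "impl" if needs_impl else "iface"
    return old[key] != new[key]
```

With this model a body-only edit changes just the impl cookie, so ordinary importers are left alone while a dependent that ran the body at compile time is re-semmed.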

Discovery of macro-generated imports

The static scanner only sees syntactic imports. A macro can synthesize one (chronicles does parseStmt("import chronicles/textlines") driven by the chronicles_sinks define). Such an import is invisible until sem runs the macro. Each nim m records the imports it actually resolved (via the semdata.addImportFileDep hook → graph.importDeps → ast2nif.writeSemDeps) into <s>.s.deps.nif; a child that fails on a not-yet-built import flushes it before erroring. The driver re-derives the graph from those sidecars, adding the missing node and the importer→import edge, and reruns to a fixpoint. (This replaced an earlier icmissing.txt side channel.)
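The retry loop can be simulated in Python. This is a sketch under assumptions: `static_deps` stands in for what the scanner writes to .deps.nif, `real_deps` for what sem would record in .s.deps.nif, and a "failed build" is modeled as any recorded import that the graph does not yet contain. The function name and the round limit are hypothetical.

```python
def discover_fixpoint(root, static_deps, real_deps, max_rounds=10):
    """Grow the build graph until sem reports no unknown imports.

    Returns (graph, rounds), where graph maps module -> set of imports.
    """
    graph = {}

    def add_static(module):
        # Add a node and, recursively, everything the scanner can see.
        if module in graph:
            return
        graph[module] = set(static_deps.get(module, ()))
        for dep in list(graph[module]):
            add_static(dep)

    add_static(root)
    rounds = 0
    while True:
        rounds += 1
        # "Run the build": collect imports sem resolved but the graph lacks.
        missing = [(m, d) for m in list(graph)
                   for d in real_deps.get(m, ()) if d not in graph[m]]
        if not missing:
            return graph, rounds
        if rounds >= max_rounds:
            raise RuntimeError("dependency discovery did not converge")
        for importer, dep in missing:
            graph[importer].add(dep)  # importer -> import edge
            add_static(dep)           # missing node plus its static closure
```

Each round can only add nodes and edges, so on a finite module set the loop terminates once every recorded import is present.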

The backend: per-module nifc stages

Codegen is no longer one whole-program process. nim nifc (cmdNifC, compiler/nifbackend.nim) is invoked once per stage via --icBackendStage:<stage>; commandIc emits these as ordinary nifmake rules so "which TUs rebuild" is just "which rules nifmake re-fires from input mtimes" — exactly as the frontend already works. There are four stages:

  1. `cg` (--icBackendStage:cg --icBackendModule:<suffix>) — generate C for the single named module and write only its <s>.c.nif artifact. A non-main target loads only its own import closure (loadDepClosure), so the whole program is not pulled into every parallel cg process. Codegen is still demand-driven and emit-everywhere: a cg process emits every entity it demands (generic instances, hooks, RTTI), referencing nothing extern-only. There is no whole-program DCE here — a liveness pass over all ~260 NIFs would cost ~900 MB for a result the merge stage recomputes anyway. The main module's cg is special: it loads everything (loadBackendModules), emits the whole-program method dispatchers and NimMain, and registers every other module's init/datInit from the .c.nif meta heads — so it runs last, after every other .c.nif exists. Every cg rule always leaves a .c.nif (empty if the module owns no code) so its nifmake output exists and the rule settles.
  2. `merge` (--icBackendStage:merge) — a pure artifact pass, no module graph loaded. Reads every .c.nif, computes the one program-wide live set and, for each unique definition that several cg processes emitted, the single artifact allowed to embed its body; writes that to a merge-decision file (computeMergeDecision / writeMergeDecision). This is the cross-process replacement for the old in-process first-claimant + DCE coordination.
  3. `emit` (--icBackendStage:emit --icBackendModule:<suffix>) — render the target module's final .c from its .c.nif and the merge decision (renderCFromArtifact, dropping globally-dead and non-owned bodies). No codegen runs; the target is loaded only so getCFile yields the path cg wrote.
  4. `link` (--icBackendStage:link) — register every module's emitted .c and run extccomp.callCCompiler once (it parallelizes per-file cc and skips up-to-date objects). Per-module C compile/link directives ({.passL.} etc.) are re-collected here via replayBackendActions, since the cg processes that originally saw them are separate processes (without this, e.g. math's -lm would be lost → undefined floor/pow at link).

Because each stage is a nifmake rule keyed on file mtimes, a body-only edit to one module re-fires that module's cg+emit (and the merge/link), not the whole program — and an unchanged module's cg does not run at all.
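The merge stage's two outputs, a program-wide live set and a single owner per duplicated definition, can be sketched in Python. This is an assumption-laden illustration, not computeMergeDecision itself: symbols are assumed to be already normalized so that the same entity emitted by two cg processes has the same key, liveness is plain reachability from a given root list, and the owner is simply the first artifact by name.

```python
def sketch_merge_decision(artifacts, roots):
    """Map every live symbol to the one artifact allowed to embed its body.

    artifacts: artifact name -> {symbol: [symbols its body references]}
    roots:     symbols that are live by definition (e.g. the entry point)
    """
    refs = {}      # symbol -> set of referenced symbols
    emitters = {}  # symbol -> artifacts that emitted a body, in name order
    for art in sorted(artifacts):
        for sym, uses in artifacts[art].items():
            emitters.setdefault(sym, []).append(art)
            refs.setdefault(sym, set()).update(uses)

    # Liveness: everything reachable from the roots through def/ref edges.
    live = set()
    work = [r for r in roots if r in refs]
    while work:
        sym = work.pop()
        if sym in live:
            continue
        live.add(sym)
        work.extend(u for u in refs[sym] if u in refs and u not in live)

    # Ownership: assumed tie-break is the first emitter by artifact name.
    return {sym: emitters[sym][0] for sym in sorted(live)}
```

An emit step would then keep a body only if the decision names its own artifact as the owner, and drop bodies that are absent from the decision (dead) or owned elsewhere.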

Edge cases (and why the machinery exists)

Resolved by the rewrite

The whole-program backend's hand-rolled mini-nifmake (computeModuleReuse, enforceDefRetention, redirectToLiveModule, the cached-defs/claim bookkeeping and the standalone dce.nim) is gone. Reuse is now just per-rule nifmake mtime checks, and the single-writer decision is the merge stage. The old cross-mm / `--force` `var not init` hazard dissolved with it: every codegen rule's config (including --mm) is a declared nifmake input, so a stale-config TU is simply rebuilt rather than mixed in. koch bootic is green under both orc and --mm:refc.

Known residual hack

Status and performance

nim ic self-builds the compiler (koch bootic's byte-identical fixed-point check) under both orc and --mm:refc, and passes the external-package CI set.

Cold full bootstrap on a 32-core box (-d:release, no edits — IC's worst case, since incremental reuse is not exercised):

| build | wall | notes |
|---|---|---|
| koch boot (classic) | ~1m00s | reference |
| koch bootic (nim ic) | ~1m39s | ~1.66× |

This is down from ~7.5× in the whole-program-backend era. IC does modestly more aggregate work (more processes, NIF re-parsing of imports per process), but on a many-core box that overhead is absorbed by the parallel nim m/nifc fan-out, and the C compile+link floor is shared with the classic backend. On few-core machines the cold gap is correspondingly wider — IC trades single-build latency for incremental latency.

The cold number is the least favourable comparison: it pays IC's full per-process overhead while using none of its incremental machinery. Warm rebuilds — the actual point of IC — recompile only the modules whose inputs changed (a body-only edit re-fires one module's cg+emit, not the program), so an edit-driven rebuild is a small fraction of either full build.

The strategic direction (decided 2026-06-13) is to make this NIF backend (cmdNifC) the default code generator. The per-module pipeline above is the realization of that direction; remaining work is promotion + deletion of the classic path, not new machinery.

Design notes and open decisions

The per-module backend (above) mirrors Nimony's src/nimony/deps.nim: the backend stopped re-implementing nifmake; each stage is a build rule, so reuse is just mtime checks and the merge stage is the only cross-module coordination.

Settled vs. open:

Validation bar (held on every change): koch bootic must reach its byte-identical fixed point, and binary size must not regress (DCE parity), across the external-package CI set.

Code, logic & debugging

Core modules:

Manual workflow:

See also