Immutable Archive

GoBD-compliant retention of tax-relevant documents under § 146 AO — hash-chained, time-stamped, tamper-evident for 10 years.

Overview

The Invocore archive is an append-only store for every tax-relevant document. On upload, the file is SHA-256 hashed, appended to the per-organisation hash chain, and written to the filesystem with chmod 0444 + chattr +i. No one — administrators included — can change archived content after the fact.

  • SHA-256 hash per document, anchored in a per-organisation hash chain
  • PostgreSQL triggers block UPDATE / DELETE / TRUNCATE on the journal table
  • 10-year default retention (§ 147 AO), with a service-side floor that rejects shorter overrides
  • Hourly anchoring in a Merkle tree with RFC 3161 TSA timestamp
  • Self-contained ZIP verification package — external auditors verify without backend access

Per-organisation hash chain

Every organisation starts with a deterministic genesis block (block_number = 0, prev_hash = 64×"0"). Every subsequent upload appends a block whose prev_hash points at the previous block's entry_hash. If anyone tampers with a document or a block, the chain breaks at a known, reproducible index, and the verify endpoint reports ok=false together with the position of the broken block.
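
The linkage rule can be illustrated with a minimal sketch. It assumes blocks are held as plain dicts with block_number, prev_hash, and entry_hash keys (a hypothetical in-memory shape, not the real table layout), and it checks only the prev_hash links, not the recomputation of each entry_hash.

```python
from typing import Optional

# prev_hash of the genesis block: 64 zero characters, as described in the text.
GENESIS_PREV_HASH = "0" * 64


def first_broken_link(blocks: list[dict]) -> Optional[int]:
    """Return the index of the first block whose link is invalid, or None.

    Hypothetical helper: each block is a dict with block_number,
    prev_hash and entry_hash.
    """
    for index, block in enumerate(blocks):
        if block["block_number"] != index:
            return index  # gap or reordering in the chain
        if index == 0:
            expected_prev = GENESIS_PREV_HASH
        else:
            expected_prev = blocks[index - 1]["entry_hash"]
        if block["prev_hash"] != expected_prev:
            return index  # this block no longer points at its predecessor
    return None


chain = [
    {"block_number": 0, "prev_hash": "0" * 64, "entry_hash": "aa" * 32},
    {"block_number": 1, "prev_hash": "aa" * 32, "entry_hash": "bb" * 32},
    {"block_number": 2, "prev_hash": "bb" * 32, "entry_hash": "cc" * 32},
]
```

Changing the entry_hash of block 1 in this toy chain makes block 2 the first block with a stale prev_hash, which is why a break is always reported at a specific position.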

How a block is built

  1. Hashing

    The file is read into memory and SHA-256 hashed. If a block with the same hash already exists in your org, the endpoint returns 409 Conflict (idempotency by hash).

  2. Storage

    The bytes are written to primary storage (local filesystem with immutability flags). Off-host secondary replication (Hetzner Storage Box) is wired but disabled by default — toggle ARCHIVE_SECONDARY_ENABLED to enable.

  3. Block append

    In the same DB transaction, a row is appended to journal_entries. A PG trigger (journal_entry_chain_trg) reads the prior block's entry_hash, sets it as prev_hash, and deterministically computes entry_hash = SHA-256(block_number ‖ prev_hash ‖ doc_hash ‖ operation).

  4. Audit log

    Within the same transaction, an audit_logs row (action = archive_upload) is written with user_id, IP, user-agent, SHA-256, and block number. A transaction rollback discards the block and the audit log entry together.
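
The hashing and block-append steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the trigger's actual code: the byte encoding behind "‖" is not specified here, so the sketch joins the fields with "|" as UTF-8 text; the genesis block's doc_hash and operation values and the operation name archive_upload for upload blocks are also assumptions. Storage and audit logging are left out.

```python
import hashlib

GENESIS_PREV_HASH = "0" * 64


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def compute_entry_hash(block_number: int, prev_hash: str, doc_hash: str, operation: str) -> str:
    # ASSUMPTION: the four fields are joined with "|" and encoded as UTF-8.
    # The real trigger's byte layout is not specified in this document.
    payload = "|".join([str(block_number), prev_hash, doc_hash, operation])
    return sha256_hex(payload.encode("utf-8"))


def genesis_block() -> dict:
    # ASSUMPTION: empty doc_hash and operation "genesis" for block 0.
    entry_hash = compute_entry_hash(0, GENESIS_PREV_HASH, "", "genesis")
    return {"block_number": 0, "prev_hash": GENESIS_PREV_HASH,
            "doc_hash": "", "operation": "genesis", "entry_hash": entry_hash}


def append_block(chain: list[dict], file_bytes: bytes, operation: str = "archive_upload") -> dict:
    """Hash the file, reject duplicates, and link a new block to the chain tip."""
    doc_hash = sha256_hex(file_bytes)
    if any(block["doc_hash"] == doc_hash for block in chain):
        # Stands in for the 409 Conflict returned by the endpoint.
        raise ValueError("duplicate document hash")
    tip = chain[-1]  # the chain always contains at least the genesis block
    block_number = tip["block_number"] + 1
    prev_hash = tip["entry_hash"]
    block = {
        "block_number": block_number,
        "prev_hash": prev_hash,
        "doc_hash": doc_hash,
        "operation": operation,
        "entry_hash": compute_entry_hash(block_number, prev_hash, doc_hash, operation),
    }
    chain.append(block)
    return block
```

Starting from `[genesis_block()]`, the first `append_block` call yields block_number 1, which matches the sample upload response in the API reference.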

API reference

All endpoints require JWT (Bearer) authentication and are tenant-scoped via the X-Tenant-Id header.
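
As a minimal sketch of the two required headers, the helper below builds (but does not send) a request with Python's standard library. The base URL, token, and tenant ID are placeholders, and the function name is hypothetical.

```python
import urllib.request


def archive_request(base_url: str, path: str, jwt: str, tenant_id: str,
                    method: str = "GET") -> urllib.request.Request:
    """Build an authenticated, tenant-scoped request object without sending it."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        method=method,
        headers={
            "Authorization": f"Bearer {jwt}",  # JWT bearer authentication
            "X-Tenant-Id": tenant_id,          # tenant scoping
        },
    )


# Placeholder values for illustration only.
req = archive_request("https://invocore.example", "/api/v1/archive/chain/verify",
                      jwt="<token>", tenant_id="<tenant-id>")
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; any HTTP client that can set these two headers works the same way.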

POST /api/v1/archive/documents

Archive a document immutably (multipart upload). Returns SHA-256, block number, hash-chain entry.

Sample response

{
  "document_id": "31c2c1f0-502a-47f3-bb85-01f81362a7b7",
  "sha256": "13ebe841cbed79bb6eec0031c3af5019…",
  "block_number": 1,
  "entry_hash": "61212e86d03e3165a15f958bf2975…",
  "storage_primary_path": "42426bf6-1428-…/2026/05/31c2c1…",
  "replication_status": "none",
  "immutable_locked": true,
  "retention_until": "2036-05-16T21:14:00Z"
}

GET /api/v1/archive/documents

List archived documents with filter and sort. Fields: items, total, page, page_size, pages.

Sample response

{
  "items": [
    {
      "document_id": "…",
      "original_filename": "rechnung_2026_001.pdf",
      "sha256": "…",
      "size_bytes": 4242
    }
  ],
  "total": 1,
  "page": 1,
  "page_size": 50,
  "pages": 1
}
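
The paging fields can be consumed with a small loop. The sketch below assumes pages are numbered from 1, as in the sample, and takes a caller-supplied fetch_page function (hypothetical) so that no query parameter names have to be guessed.

```python
from typing import Callable, Iterator


def iter_documents(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every item across all pages of the list endpoint.

    fetch_page(n) must return the decoded JSON body for page n,
    with the documented fields items, page, and pages.
    """
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["items"]
        if page >= body["pages"]:
            break  # last page reached
        page += 1
```

Keeping the HTTP call behind fetch_page also makes the loop easy to exercise against canned responses.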

GET /api/v1/archive/chain/verify

Verify the hash chain integrity for your organisation. ok=true means no block was tampered with.

Sample response

{
  "ok": true,
  "entries": 42,
  "genesis": true,
  "reason": null,
  "broken_at": null
}
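
A monitoring job could turn this response into a hard failure. The sketch below uses only the fields shown in the sample; the exception class is hypothetical, and the reason string in the usage note is invented for illustration.

```python
class ChainBrokenError(RuntimeError):
    """Raised when the verify response reports a broken hash chain."""


def check_chain(result: dict) -> int:
    """Return the number of verified entries, or raise if the chain is broken."""
    if result["ok"]:
        return result["entries"]
    raise ChainBrokenError(
        f"hash chain broken at block {result['broken_at']}: {result['reason']}"
    )
```

For the sample response above, `check_chain` returns 42. A response such as `{"ok": false, "broken_at": 17, "reason": "<reason>", ...}` would raise ChainBrokenError instead.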

GET /api/v1/archive/documents/{id}/verification_package

Download ZIP: original file + manifest.json + standalone verify.py + RFC 3161 TSA token.

Sample manifest.json (excerpt)

{
  "format_version": "1.0",
  "document": {
    "sha256": "…"
  },
  "chain": {
    "entries": 42
  }
}

Retention (§ 147 AO)

Retention is set at upload time as retention_until (default: 10 years for the invoice, contract, and form document types). A retention_years override below 10 raises ValueError("archive.retention_too_short"); this is the compliance floor. The nightly retention sweep marks expired documents as is_retention_deleted = true, removes the bytes from primary and secondary storage, and appends an immutable archive_retention_delete block to the hash chain as proof of deletion.
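
The floor check can be sketched as follows. The function name and the leap-day fallback are assumptions; only the 10-year floor and the error code come from the text.

```python
from datetime import datetime, timezone

MIN_RETENTION_YEARS = 10  # compliance floor described above


def compute_retention_until(uploaded_at: datetime,
                            retention_years: int = MIN_RETENTION_YEARS) -> datetime:
    """Return the retention end date, rejecting overrides below the floor."""
    if retention_years < MIN_RETENTION_YEARS:
        raise ValueError("archive.retention_too_short")
    target_year = uploaded_at.year + retention_years
    try:
        return uploaded_at.replace(year=target_year)
    except ValueError:
        # ASSUMPTION: an upload on 29 February maps to 28 February
        # when the target year is not a leap year.
        return uploaded_at.replace(year=target_year, day=28)


uploaded = datetime(2026, 5, 16, 21, 14, tzinfo=timezone.utc)
until = compute_retention_until(uploaded)  # 2036-05-16 21:14 UTC
```

Longer overrides pass through unchanged, for example `compute_retention_until(uploaded, retention_years=12)`.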

TSA timestamping (RFC 3161)

Every hour the anchor worker batches new blocks per organisation into a Merkle tree, computes the root, and fetches an RFC 3161 timestamp from a Time Stamping Authority. Development uses FreeTSA; production targets a qualified TSA (D-Trust, A-Trust). The TSA response is stored as a blob in the anchors table and shipped with the verification package. If TSA acquisition fails, the anchor retries with exponential backoff — the chain itself remains intact without the timestamp.
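
To make the anchoring step concrete, here is a minimal Merkle root sketch over a batch of entry hashes. The tree layout is an assumption (pairwise SHA-256 over the concatenated raw digests, duplicating the last node when a level has an odd count); the anchor worker's actual construction is not described here.

```python
import hashlib


def merkle_root(leaf_hashes: list[str]) -> str:
    """Compute a Merkle root over hex-encoded SHA-256 leaf hashes.

    ASSUMPTION: parent = SHA-256(left || right) on raw bytes, and an odd
    level is padded by repeating its last node.
    """
    if not leaf_hashes:
        raise ValueError("cannot anchor an empty batch")
    level = [bytes.fromhex(h) for h in leaf_hashes]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

The root is the only value that needs to be sent to the TSA, so one timestamp covers every block in the hourly batch.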

External verification

Every document can be verified fully offline. The ZIP verification package contains everything an external auditor needs — no backend access required:

  • manifest.json with block_number, sha256, prev_hash, entry_hash, and path to the original file
  • document/<filename> — the unmodified original bytes
  • verify.py — standalone Python script (stdlib + cryptography only) that checks the SHA-256, recomputes the chain, and verifies the TSA token against the Merkle root
  • tsa_token.bin — RFC 3161 DER-encoded TSA response with qualified certificate
  • README.txt — auditor instructions
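
The first of these checks, comparing the document bytes against the manifest, could look roughly like the sketch below. It is not the shipped verify.py. The manifest keys document.sha256 and document.path are assumptions about the layout, and the chain and TSA checks are omitted.

```python
import hashlib
import json
import zipfile


def document_matches_manifest(package) -> bool:
    """Check the packaged document's SHA-256 against manifest.json.

    `package` is a filesystem path or a binary file object, as accepted
    by zipfile.ZipFile.
    """
    with zipfile.ZipFile(package) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        # ASSUMPTION: the manifest nests the hash and the member path
        # under a "document" object.
        expected = manifest["document"]["sha256"]
        member = manifest["document"]["path"]
        actual = hashlib.sha256(zf.read(member)).hexdigest()
    return actual == expected
```

Because it needs only the ZIP file and the Python standard library, a check like this runs on an auditor's machine without network access.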

In the customer account

The archive lives under Archive in the main menu. Each tenant sees only their own hash chain and their own documents.

Screenshot: Overview page with hash-chain status ("Intact"), replication status, and the list of archived documents.
Screenshot: Upload area for PDF, PNG, or JPEG files via drag-and-drop or click. 10-year retention is confirmed on every upload.
Screenshot: Per-row download button for the ZIP verification package, convenient to attach to an email to your tax advisor.

Frequently asked questions

What happens if I upload the same document twice?

The server computes the SHA-256 and compares it to your organisation's archive. On a match it returns 409 Conflict and includes the original filename. You can then decide whether it was an accidental duplicate or whether you wanted to verify the original copy.

Can I delete an archived document?

Not directly. Only the nightly retention sweep may remove documents after the 10-year window, and even then the hash-chain block stays — as proof that the document existed. The same machinery handles GDPR right-to-erasure when applicable.

What if the hash chain breaks?

The /chain/verify endpoint returns ok=false plus broken_at, the index of the block where the break occurred, and a reason. That is a strong signal of a serious incident such as manual DB manipulation. An operator can use audit_logs to identify who wrote the last valid block and which entry changed.

How is the archive replicated?

Primary storage is the local filesystem with immutability flags. Off-host secondary replication on a Hetzner Storage Box is wired but disabled by default — toggle ARCHIVE_SECONDARY_ENABLED. With replication on, each document carries a replication_status (pending / replicated / failed).

Can my tax advisor verify the archive directly?

Yes. Download the verification package for the documents in scope, hand it to your advisor, and they run python verify.py on any machine. No Invocore access required.

Ready to use the archive?

The archive is enabled on every Invocore account. Upload your first document — the hash chain starts automatically with the genesis block.