Immutable Archive
GoBD-compliant retention of tax-relevant documents under § 146 AO — hash-chained, time-stamped, tamper-evident for 10 years.
Overview
The Invocore archive is an append-only store for every tax-relevant document. On upload, the file is SHA-256 hashed, appended to the per-organisation hash chain, and written to the filesystem with chmod 0444 + chattr +i. No one — administrators included — can change archived content after the fact.
- ✓ SHA-256 hash per document, anchored in a per-organisation hash chain
- ✓ PostgreSQL triggers block UPDATE / DELETE / TRUNCATE on the journal table
- ✓ 10-year default retention (§ 147 AO), with a service-side floor that rejects shorter overrides
- ✓ Hourly anchoring in a Merkle tree with RFC 3161 TSA timestamp
- ✓ Self-contained ZIP verification package: external auditors verify without backend access
Legal background
The German principles for properly maintained electronic books (GoBD) require three properties for every tax-relevant document:
§ 146 AO — Immutability: Once recorded, a transaction must not be modified in a way that hides the original content. We enforce this with PostgreSQL triggers (no UPDATE / DELETE / TRUNCATE on journal_entries) and filesystem immutability (chmod 0444 + chattr +i).
§ 145 AO — Traceability: Every entry must be reconstructable from source to author. Every upload writes an audit log entry with user_id, IP, SHA-256, and block number.
§ 147 AO — Retention: Tax documents must be kept for 10 years. We set retention at upload time and the service rejects any override below 10 years.
Per-organisation hash chain
Every organisation starts with a deterministic genesis block (block_number = 0, prev_hash = 64×"0"). Every subsequent upload appends a block whose prev_hash equals the previous block's entry_hash. If anyone tampers with a document or a block, the chain breaks at an identifiable index, and the verify endpoint returns ok=false together with the position of the broken block.
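The linkage rule can be illustrated with a minimal sketch. It checks only that each prev_hash matches the preceding entry_hash and that the genesis block carries 64 zeros; it does not recompute any hashes. The function name first_broken_link and the dict layout of a block are assumptions made for this example, not the service's internal code.

```python
from typing import Mapping, Optional, Sequence

GENESIS_PREV_HASH = "0" * 64  # prev_hash of block 0, as described above


def first_broken_link(blocks: Sequence[Mapping[str, object]]) -> Optional[int]:
    """Return the position of the first block whose linkage is wrong, else None.

    Hypothetical helper: each block is a mapping with the keys
    block_number, prev_hash and entry_hash, ordered by block_number.
    """
    expected_prev = GENESIS_PREV_HASH
    for expected_number, block in enumerate(blocks):
        if block["block_number"] != expected_number:
            return expected_number  # gap or reordering in the chain
        if block["prev_hash"] != expected_prev:
            return expected_number  # link to the previous block is broken
        expected_prev = block["entry_hash"]
    return None


# Illustrative data only: the hash values are placeholders, not real digests.
chain = [
    {"block_number": 0, "prev_hash": GENESIS_PREV_HASH, "entry_hash": "aa" * 32},
    {"block_number": 1, "prev_hash": "aa" * 32, "entry_hash": "bb" * 32},
    {"block_number": 2, "prev_hash": "bb" * 32, "entry_hash": "cc" * 32},
]
```

Changing the entry_hash of block 1 in this data makes the check fail at block 2, because block 2 still points at the old value.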
How a block is built
1. Hashing
The file is read into memory and SHA-256 hashed. If a block with the same hash already exists in your organisation, the endpoint returns 409 Conflict (idempotency by hash).
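A rough sketch of this step, assuming the known hashes of an organisation are available as an in-memory set (the real service presumably queries its database). DuplicateDocument and register_upload are hypothetical names introduced for illustration.

```python
import hashlib


class DuplicateDocument(Exception):
    """Hypothetical error that an API layer could map to 409 Conflict."""


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the uploaded bytes."""
    return hashlib.sha256(data).hexdigest()


def register_upload(data: bytes, known_hashes: set) -> str:
    """Hash the upload and reject it if the same hash is already archived."""
    digest = sha256_hex(data)
    if digest in known_hashes:
        raise DuplicateDocument(digest)
    known_hashes.add(digest)
    return digest
```

Keying the duplicate check on the content hash, not the filename, means a renamed copy of the same bytes is still detected.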
2. Storage
The bytes are written to primary storage (local filesystem with immutability flags). Off-host secondary replication (Hetzner Storage Box) is wired but disabled by default — toggle ARCHIVE_SECONDARY_ENABLED to enable.
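The permission part of this step might look like the sketch below. It covers only the chmod 0444 half; chattr +i needs elevated privileges on Linux and is mentioned in a comment only. write_read_only is a hypothetical helper, and the temporary directory stands in for the real storage root.

```python
import os
import stat
import tempfile
from pathlib import Path


def write_read_only(path: Path, data: bytes) -> None:
    """Write data to a new file, flush it to disk, then drop all write bits."""
    # Mode "xb" refuses to overwrite an existing file.
    with open(path, "xb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())
    os.chmod(path, 0o444)
    # Assumption: a deployment would additionally set the immutable flag,
    # e.g. by running `chattr +i <path>` with sufficient privileges.


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.bin"
    write_read_only(target, b"example bytes")
    mode = stat.S_IMODE(os.stat(target).st_mode)
    print(oct(mode))
```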
3. Block append
In the same DB transaction, a row is appended to journal_entries. A PG trigger (journal_entry_chain_trg) reads the prior block's entry_hash, sets it as prev_hash, and deterministically computes entry_hash = SHA-256(block_number ‖ prev_hash ‖ doc_hash ‖ operation).
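The entry_hash formula can be tried out in a few lines. The byte encoding of the concatenation is not documented here, so this sketch assumes the four fields are joined as UTF-8 text with no separator (decimal block number, hex hashes, operation name). The operation value "archive_upload" is borrowed from the audit action name and is likewise an assumption; hashes produced this way will only match the service if its trigger uses the same encoding.

```python
import hashlib


def compute_entry_hash(block_number: int, prev_hash: str,
                       doc_hash: str, operation: str) -> str:
    """SHA-256 over block_number, prev_hash, doc_hash and operation.

    Assumed encoding: plain text concatenation, UTF-8, no separator.
    """
    payload = f"{block_number}{prev_hash}{doc_hash}{operation}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


genesis_prev = "0" * 64
doc_hash = hashlib.sha256(b"example document").hexdigest()
entry_hash = compute_entry_hash(1, genesis_prev, doc_hash, "archive_upload")
```

Because prev_hash is part of the input, changing any earlier block changes every entry_hash after it.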
4. Audit log
In the same transaction, an audit_logs row (action = archive_upload) is written with user_id, IP, user-agent, SHA-256, and block number. If the transaction rolls back, the block and the audit log entry are dropped together.
API reference
All endpoints require JWT (Bearer) authentication and are tenant-scoped via the X-Tenant-Id header.
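As an illustration, a request carrying both headers could be prepared with the standard library as follows. The host invocore.example, the token, and the tenant ID are placeholders, and the HTTP method is passed in explicitly because this page does not state the method for each endpoint; using GET for the verify endpoint is an assumption.

```python
import urllib.request


def build_archive_request(base_url: str, path: str, method: str,
                          token: str, tenant_id: str) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated, tenant-scoped request."""
    request = urllib.request.Request(base_url.rstrip("/") + path, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("X-Tenant-Id", tenant_id)
    return request


# Placeholder values; GET is assumed for the verify endpoint.
request = build_archive_request(
    "https://invocore.example",
    "/api/v1/archive/chain/verify",
    "GET",
    "<jwt>",
    "<tenant-id>",
)
# urllib.request.urlopen(request) would send it.
```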
/api/v1/archive/documents
Archive a document immutably (multipart upload). Returns SHA-256, block number, hash-chain entry.
Sample response
{
"document_id": "31c2c1f0-502a-47f3-bb85-01f81362a7b7",
"sha256": "13ebe841cbed79bb6eec0031c3af5019…",
"block_number": 1,
"entry_hash": "61212e86d03e3165a15f958bf2975…",
"storage_primary_path": "42426bf6-1428-…/2026/05/31c2c1…",
"replication_status": "none",
"immutable_locked": true,
"retention_until": "2036-05-16T21:14:00Z"
}
/api/v1/archive/documents
List archived documents with filter and sort. Fields: items, total, page, page_size, pages.
Sample response
{
"items": [
{
"document_id": "…",
"original_filename": "rechnung_2026_001.pdf",
"sha256": "…",
"size_bytes": 4242
}
],
"total": 1,
"page": 1,
"page_size": 50,
"pages": 1
}
/api/v1/archive/chain/verify
Verify the hash chain integrity for your organisation. ok=true means no block was tampered with.
Sample response
{
"ok": true,
"entries": 42,
"genesis": true,
"reason": null,
"broken_at": null
}
/api/v1/archive/documents/{id}/verification_package
Download ZIP: original file + manifest.json + standalone verify.py + RFC 3161 TSA token.
Sample response
{
"format_version": "1.0",
"document": {
"sha256": "…"
},
"chain": {
"entries": 42
}
}
Retention (§ 147 AO)
Retention is set at upload time as retention_until (default: 10 years for invoice, contract, form). A retention_years override below 10 raises ValueError("archive.retention_too_short") — that is the compliance floor. The nightly retention sweep marks expired documents as is_retention_deleted = true, removes the bytes from primary + secondary storage, and appends an immutable archive_retention_delete block to the hash chain as proof of deletion.
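The floor check can be sketched as below. The error string is the one quoted above; the function name, the constant, and the handling of 29 February are assumptions for illustration.

```python
from datetime import datetime, timezone
from typing import Optional

MIN_RETENTION_YEARS = 10  # compliance floor described above


def retention_until(uploaded_at: datetime,
                    retention_years: Optional[int] = None) -> datetime:
    """Compute retention_until, rejecting overrides below the floor."""
    years = MIN_RETENTION_YEARS if retention_years is None else retention_years
    if years < MIN_RETENTION_YEARS:
        raise ValueError("archive.retention_too_short")
    try:
        return uploaded_at.replace(year=uploaded_at.year + years)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return uploaded_at.replace(year=uploaded_at.year + years, day=28)


uploaded = datetime(2026, 5, 16, 21, 14, tzinfo=timezone.utc)
until = retention_until(uploaded)  # default retention of 10 years
```

With the upload time used here, the default yields 2036-05-16T21:14:00Z, matching the retention_until value in the upload sample response. Overrides above 10 years pass through unchanged.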
TSA timestamping (RFC 3161)
Every hour the anchor worker batches new blocks per organisation into a Merkle tree, computes the root, and fetches an RFC 3161 timestamp from a Time Stamping Authority. Development uses FreeTSA; production targets a qualified TSA (D-Trust, A-Trust). The TSA response is stored as a blob in the anchors table and shipped with the verification package. If TSA acquisition fails, the anchor retries with exponential backoff — the chain itself remains intact without the timestamp.
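To make the Merkle step concrete, here is a minimal root computation over a batch of entry hashes. The tree layout is not specified on this page, so the sketch assumes a common convention: leaves are the raw 32-byte entry hashes, each parent is SHA-256 of left plus right, and an odd node at the end of a level is paired with itself. The anchor worker may use a different construction. Fetching the RFC 3161 timestamp for the resulting root is not shown.

```python
import hashlib
from typing import Sequence


def merkle_root(leaf_hashes_hex: Sequence[str]) -> str:
    """Merkle root (hex) over hex-encoded SHA-256 leaf hashes.

    Assumed layout: parent = SHA-256(left || right); the last node of an
    odd-sized level is duplicated.
    """
    if not leaf_hashes_hex:
        raise ValueError("cannot anchor an empty batch")
    level = [bytes.fromhex(h) for h in leaf_hashes_hex]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pair the odd node with itself
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


# Illustrative batch: three placeholder entry hashes.
batch = [hashlib.sha256(name).hexdigest() for name in (b"a", b"b", b"c")]
root = merkle_root(batch)
```

Only the root needs a timestamp: any block in the batch can later be tied to it through its sibling hashes.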
External verification
Every document can be verified fully offline. The ZIP verification package contains everything an external auditor needs — no backend access required:
- → manifest.json with block_number, sha256, prev_hash, entry_hash, and path to the original file
- → document/<filename>: the unmodified original bytes
- → verify.py: standalone Python script (stdlib + cryptography only) that checks the SHA-256, recomputes the chain, and verifies the TSA token against the Merkle root
- → tsa_token.bin: RFC 3161 DER-encoded TSA response with qualified certificate
- → README.txt: auditor instructions
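The first of the checks performed by verify.py, comparing the document bytes against the manifest, can be approximated as follows. This is not the shipped script. It assumes the hash sits at manifest["document"]["sha256"], as in the sample manifest above, and takes the document path as an argument because the manifest key holding it is not shown on this page.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large documents need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def document_matches_manifest(manifest_path: Path, document_path: Path) -> bool:
    """True if the document's SHA-256 equals the hash recorded in the manifest."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    expected = manifest["document"]["sha256"]  # assumed key layout
    return file_sha256(document_path) == expected.lower()
```

A mismatch means the bytes in the package differ from what was archived; the chain and TSA checks then establish when that hash was recorded.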
In the customer account
The archive lives under Archive in the main menu. Each tenant sees only their own hash chain and their own documents.



Frequently asked questions
What happens if I upload the same document twice?
The server computes the SHA-256 and compares it against your organisation's archive. On a match it returns 409 Conflict and includes the original filename, so you can tell whether the upload was an accidental duplicate or simply confirm that the original copy is already archived.
Can I delete an archived document?
Not directly. Only the nightly retention sweep may remove documents after the 10-year window, and even then the hash-chain block stays — as proof that the document existed. The same machinery handles GDPR right-to-erasure when applicable.
What if the hash chain breaks?
The /chain/verify endpoint returns ok=false, with broken_at set to the index of the block where the break occurred. That is a strong signal of a serious incident, such as manual DB manipulation. An operator can use audit_logs to identify who wrote the last valid block and which entry changed.
How is the archive replicated?
Primary storage is the local filesystem with immutability flags. Off-host secondary replication on a Hetzner Storage Box is wired but disabled by default — toggle ARCHIVE_SECONDARY_ENABLED. With replication on, each document carries a replication_status (pending / replicated / failed).
Can my tax advisor verify the archive directly?
Yes. Download the verification package for the documents in scope, hand it to your advisor, and they run python verify.py on any machine. No Invocore access required.
Ready to use the archive?
The archive is enabled on every Invocore account. Upload your first document — the hash chain starts automatically with the genesis block.