Why Storage Systems Lose Computational Work

Abstract

Modern computational infrastructure focuses on storing data, embeddings, and application state, yet autonomous computational systems ultimately depend on something more fundamental: the preservation of completed computational work.

Computational artifacts produced by autonomous systems increasingly form dependency graphs representing accumulated computational work. Preserving these artifact graphs requires that artifacts remain retrievable, verifiable, and reusable across agents, workflows, and time.

Traditional storage systems, including filesystems, databases, object storage systems, version control systems, and vector databases, were designed for application data persistence rather than the preservation of artifact graphs.

This document evaluates these storage paradigms against the requirements implied by artifact graphs and the principle of Computational Work Conservation. The analysis demonstrates that traditional storage systems fail to preserve artifact graphs in distributed agent ecosystems, motivating the need for a distinct architectural layer responsible for artifact availability.

1. Storage Systems and Computational Artifacts

Traditional storage systems were designed to persist application data.

Examples include:

files stored within filesystems
structured records stored within databases
objects stored within cloud object storage
embeddings stored within vector databases

These systems perform their intended role effectively within conventional software architectures.

However, the artifact graph model described in previous notes introduces additional requirements. Computational artifacts must remain accessible as nodes within a distributed dependency graph representing accumulated computational work.

Preserving this structure requires more than simple data persistence.

2. Requirements Implied by Artifact Graphs

If artifact graphs represent the structure of computational work, storage systems must satisfy several architectural requirements.

Stable Artifact Identity

Artifacts must possess stable identifiers independent of storage location or application context.
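One way to obtain an identifier that is independent of location is to derive it from the artifact's bytes. The sketch below is illustrative only: the function name artifact_id and the "sha256:" prefix are assumptions made for this example, not a scheme defined in this series.

```python
import hashlib

def artifact_id(content: bytes) -> str:
    """Derive an identifier from the artifact's bytes alone (hypothetical scheme)."""
    return "sha256:" + hashlib.sha256(content).hexdigest()

# The same bytes yield the same identifier wherever they happen to be stored.
report = b"benchmark results: 42 ms"
id_on_laptop = artifact_id(report)
id_in_cloud = artifact_id(bytes(report))  # a separate copy of the same content

print(id_on_laptop == id_in_cloud)
```

Because the identifier is a function of content only, moving or copying the artifact does not change how other agents refer to it.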

Persistent Accessibility

Artifacts must remain retrievable across process lifetimes and distributed execution environments.

Derivation Relationships

Systems must preserve the relationships between artifacts that define computational lineage.
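A minimal sketch of what preserving derivation relationships could look like, assuming each artifact record lists the identifiers of the artifacts it was derived from. The names ArtifactRecord, derived_from, and lineage are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ArtifactRecord:
    artifact_id: str
    derived_from: tuple[str, ...] = ()  # identifiers of direct inputs

def lineage(records: dict[str, ArtifactRecord], artifact_id: str) -> list[str]:
    """Return every ancestor of an artifact, nearest first (breadth-first)."""
    seen: list[str] = []
    queue = list(records[artifact_id].derived_from)
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        if current in records:
            queue.extend(records[current].derived_from)
    return seen

records = {
    "raw": ArtifactRecord("raw"),
    "cleaned": ArtifactRecord("cleaned", ("raw",)),
    "model": ArtifactRecord("model", ("cleaned",)),
    "report": ArtifactRecord("report", ("model", "cleaned")),
}

print(lineage(records, "report"))  # ['model', 'cleaned', 'raw']
```

The lineage can be answered only because each edge was recorded alongside the artifact; a store that keeps the bytes but not the derived_from field cannot reconstruct it.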

Cross‑Agent Reuse

Artifacts must be accessible to independent agents participating in distributed workflows.

Verifiability

It must be possible to verify that a retrieved artifact is identical to the original computational output.
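As an illustration, verification can be as simple as comparing a digest recorded when the artifact was produced against a digest of the bytes that were retrieved. This sketch assumes SHA-256 and a hypothetical helper named matches_recorded_digest; other digest functions or signature schemes would serve the same purpose.

```python
import hashlib

def matches_recorded_digest(retrieved: bytes, recorded_digest: str) -> bool:
    """Return True when the retrieved bytes hash to the digest recorded at creation."""
    return hashlib.sha256(retrieved).hexdigest() == recorded_digest

# Digest recorded by the producer at the time the artifact was created.
original_output = b"accuracy=0.91"
recorded_digest = hashlib.sha256(original_output).hexdigest()

# A consumer later retrieves bytes and checks them before reuse.
intact = matches_recorded_digest(b"accuracy=0.91", recorded_digest)
altered = matches_recorded_digest(b"accuracy=0.99", recorded_digest)

print(intact, altered)  # True False
```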

Traditional storage models were not designed to satisfy these requirements simultaneously.

3. Filesystems

Filesystems represent one of the oldest abstractions for persistent storage.

Files are identified by hierarchical paths and stored within a specific machine or mounted storage environment.

While filesystems provide durable storage, they exhibit several limitations when applied to artifact graphs:

file identity is location‑dependent
derivation relationships between files are not explicitly represented
file access is typically restricted to a specific system environment

As a result, filesystems provide limited support for preserving artifact graphs across distributed computational systems.
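The location dependence of file identity can be shown with a short experiment. In this sketch the directory names and file contents are invented; a consumer records a path as its reference to an artifact, the file is then moved, and the recorded reference no longer resolves even though the content is unchanged.

```python
import hashlib
import tempfile
from pathlib import Path

payload = b'{"score": 0.93}'

with tempfile.TemporaryDirectory() as root:
    original = Path(root) / "run-01" / "output.json"
    original.parent.mkdir()
    original.write_bytes(payload)
    recorded_reference = original  # what a downstream consumer wrote down

    # Reorganising the directory tree changes the file's path-based identity.
    archived = Path(root) / "archive" / "output.json"
    archived.parent.mkdir()
    original.rename(archived)

    reference_still_resolves = recorded_reference.exists()
    content_unchanged = (
        hashlib.sha256(archived.read_bytes()).digest()
        == hashlib.sha256(payload).digest()
    )

print(reference_still_resolves, content_unchanged)  # False True
```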

4. Databases

Relational and document databases store structured application data.

Databases support indexing, querying, and transactional integrity, making them effective for many application workloads.

However, databases introduce constraints that conflict with artifact graph requirements:

artifact identity becomes tied to database schema and application context
cross‑system accessibility is limited
artifact lineage relationships must be manually encoded within application logic

Databases therefore provide persistence for application state rather than preservation of computational artifacts.
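The dependence of identity on database context can be sketched with two independent SQLite databases. The table layout and the helper name store are assumptions for this example. The same artifact receives a different primary key in each database, so the key is meaningful only inside the database that issued it.

```python
import sqlite3

def store(artifacts: list[bytes]) -> dict[bytes, int]:
    """Insert artifacts into a fresh in-memory database and return their row ids."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE artifacts (id INTEGER PRIMARY KEY, body BLOB)")
    ids: dict[bytes, int] = {}
    for body in artifacts:
        cursor = db.execute("INSERT INTO artifacts (body) VALUES (?)", (body,))
        ids[body] = cursor.lastrowid
    db.close()
    return ids

# Two services persist the same artifacts, in a different order.
ids_in_service_a = store([b"plan", b"summary"])
ids_in_service_b = store([b"summary", b"plan"])

print(ids_in_service_a[b"summary"], ids_in_service_b[b"summary"])  # 2 1
```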

5. Cloud Object Storage

Cloud object storage systems provide highly durable storage for large volumes of data.

Objects are typically addressed through location-based identifiers within centralized provider infrastructure.

Although object storage improves durability, several limitations remain:

object identity remains location‑dependent
derivation relationships between objects are not preserved
cross-agent coordination must be implemented externally

Object storage therefore preserves data objects but not artifact graphs.

6. Version Control Systems

Version control systems such as Git provide content-addressable storage for software artifacts.

These systems preserve revision history and support distributed collaboration.

However, version control systems are optimized for source code management rather than computational artifact graphs.

Limitations include:

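repository-centric naming: branches, tags, and paths are scoped to a repository even though the stored objects are content-addressed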
limited representation of artifact derivation relationships
workflows optimized for human collaboration rather than autonomous agents

Version control systems therefore provide partial support for artifact identity but do not preserve artifact graphs across distributed computational workflows.
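The partial support for artifact identity can be made concrete. In repositories that use Git's SHA-1 object format, the identifier of a file's contents (a blob) is the SHA-1 of a short header followed by the bytes, so identical content has the same identifier in any repository. The function name git_blob_id below is chosen for this example.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the object id Git assigns to a blob (SHA-1 object format)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# The id of the empty blob is the same in every SHA-1 Git repository.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The blob identifier says nothing about which computation produced the content or which inputs it consumed. Commit history records revisions of a tree, not the derivation of one artifact from another.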

7. Vector Databases

Vector databases are frequently used in modern AI systems to store embeddings, which are vector encodings of the semantic content of data.

These systems enable similarity search across large embedding collections.

While vector databases are often presented as infrastructure for AI systems, they exhibit significant limitations when evaluated against artifact graph requirements:

embeddings are derived representations rather than canonical artifacts
artifact lineage is not preserved
embedding identity does not correspond to computational artifact identity
the original computational workflow cannot be reconstructed from embeddings

Vector databases therefore support retrieval of semantic representations rather than preservation of computational artifacts.
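The gap between embedding identity and artifact identity can be illustrated with a deliberately simple stand-in for an embedding model. The function toy_embedding below is a hypothetical word-count encoding, not a real model; learned embeddings are far richer, but they are likewise lossy mappings from artifacts to vectors.

```python
from collections import Counter

VOCABULARY = ("loss", "accuracy", "train", "test")

def toy_embedding(text: str) -> tuple[int, ...]:
    """Count vocabulary words in the text (a toy, many-to-one encoding)."""
    counts = Counter(text.lower().split())
    return tuple(counts[word] for word in VOCABULARY)

# Two distinct artifacts that state opposite findings.
finding_a = "train loss below test loss"
finding_b = "test loss below train loss"

print(toy_embedding(finding_a))  # (2, 0, 1, 1)
print(toy_embedding(finding_b))  # (2, 0, 1, 1)
```

Both artifacts map to the same vector, so the vector cannot serve as the identity of either one, and neither text can be recovered from it.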

8. Orchestration Systems

Orchestration systems coordinate the execution of computational processes: they schedule tasks, manage dependencies, and sequence agent workflows. They are designed to manage execution, not to preserve outputs.

The artifacts produced within an orchestrated workflow require the same availability infrastructure as any other computational artifact.

9. Structural Mismatch

Traditional storage models share a common architectural assumption: they persist data objects, not computational artifacts within dependency graphs.

Artifact graphs require systems that preserve:

artifact identity
derivation relationships
cross-agent accessibility
computational lineage

Traditional storage systems provide data persistence, but they do not preserve the structure of computational work.

This mismatch explains why artifact graphs degrade in distributed agent ecosystems.
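A small sketch of how such degradation might be detected, assuming the derivation edges were recorded somewhere. The function name dangling_references and the example artifact names are hypothetical. It reports, for each artifact that is still available, the direct dependencies that can no longer be retrieved.

```python
def dangling_references(
    derived_from: dict[str, list[str]], available: set[str]
) -> dict[str, list[str]]:
    """Map each available artifact to its direct dependencies that are missing."""
    dangling: dict[str, list[str]] = {}
    for artifact in sorted(available):
        missing = [p for p in derived_from.get(artifact, []) if p not in available]
        if missing:
            dangling[artifact] = missing
    return dangling

derived_from = {
    "cleaned": ["raw"],
    "model": ["cleaned"],
    "report": ["model", "cleaned"],
}
# "cleaned" lived in scratch storage and was deleted after the run.
available = {"raw", "model", "report"}

print(dangling_references(derived_from, available))
# {'model': ['cleaned'], 'report': ['cleaned']}
```

The loss of a single intermediate artifact breaks the lineage of every artifact downstream of it, even though those downstream artifacts are themselves still stored.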

10. Implications

As autonomous computational systems continue to generate large volumes of artifacts, the inability of traditional storage systems to preserve artifact graphs becomes increasingly significant.

Systems that fail to preserve artifact graphs effectively destroy portions of accumulated computational work.

This observation reinforces the principle of Computational Work Conservation established in the previous note.

Preserving artifact availability therefore becomes a necessary architectural property of distributed computational systems.

Conclusion

Traditional storage systems were designed to persist application data rather than preserve artifact graphs representing computational work.

Filesystems, databases, object storage systems, version control systems, and vector databases each provide mechanisms for storing data. However, none of these systems were designed to preserve the structural relationships that define artifact graphs.

As a result, the outputs of computational systems are routinely treated as temporary or disposable objects rather than as durable units of completed computational work.

In distributed agent ecosystems, this assumption becomes increasingly problematic.

As autonomous systems produce larger volumes of computational artifacts, the inability to preserve artifact graphs results in the repeated destruction of accumulated computational work.

The analysis presented here suggests that preserving artifact availability cannot be treated as a secondary feature of existing storage systems.

Instead, artifact availability must be treated as a first‑class architectural requirement.

This observation motivates the need for a distinct infrastructure layer responsible for preserving artifact availability across distributed computational environments.

The architecture of such a layer is explored in the next note in this series: The Artifact Availability Layer.

Discussion and Feedback

The ideas presented in this document are part of an ongoing exploration of architectural requirements for agent-based computational systems.

Comments, critiques, and alternative perspectives are encouraged.

Feedback may be submitted through issues or discussions within this repository.

Future notes in this series introduce the Artifact Availability Layer and describe deterministic artifact identity. The principle of Computational Work Conservation is established in the previous note.

Citation

If referencing this work, please cite:

Kopcho, Rich. Why Storage Systems Lose Computational Work.
Agent Artifact Availability (AAA) Series. Technical Note, March 2026.