Diffstat (limited to '_log/vcs-1.md')
-rw-r--r-- _log/vcs-1.md 38
1 files changed, 19 insertions, 19 deletions
diff --git a/_log/vcs-1.md b/_log/vcs-1.md
index 93e842b..d11b342 100644
--- a/_log/vcs-1.md
+++ b/_log/vcs-1.md
@@ -4,30 +4,29 @@ date: 2026-05-01
layout: post
---
-Implemented init, status, add, commit, log, show, and diff commands using Perl
-and OpenBSD base-system tools. Didn't bother with collaborative workflows.
+Implemented init, status, add, commit, log, show, and diff using Perl and
+OpenBSD base-system tools. Didn't bother with collaborative workflows.
-Initial design mirrored the work tree using symlinks. Using filesystem as a
-database felt clever, but walking directories on every command and the inode
-churn were untenable. Replaced the symlink architecture with a path-sorted
-index.
+Initial design mirrored the work tree with symlinks. Using filesystem as a
+database felt clever, but walking directories on every command was untenable.
+Replaced the symlink architecture with a path-sorted index.
The index tracks path, mtime, size, and SHA-1 hashes of staged, committed, and
-base files. Only entries whose mtime and size changed, or share the same mtime
-as the index are hashed.
+base files. Only entries whose mtime and size changed (or has the same mtime as
+the index to mitigate races caused by mtime precision) are hashed.
Implemented directory scans as a two-finger walk with the index; linear index
-access trades random-access speed for sequential IO and keeps memory footprint
-low.
+access trades random-access speed for sequential IO and keeps the memory
+footprint low.
Commits save staged files, trees, and deltas to a content-addressable object
-store. Bundled deltas into tarballs to conserve inodes. Gzipped objects larger
-than 512 bytes. The threshold was arbitrary. Did not tune further.
+store. Deltas are bundled into tarballs to conserve inodes. Objects larger than
+512 bytes are gzipped. The threshold was arbitrary. Did not tune further.
Deltas, computed using diff, target the original file. Subsequent versions are
reconstructed via a single patch—no chains. Diff output is bloated but
-compresses well, so rebase threshold is set to 1.4, assuming a 30-40%
-compression ratio. When the delta exceeds that, the file becomes the new base.
+compresses well. Rebase threshold is set to 1.4, assuming a 30-40% compression
+ratio. When the delta exceeds the threshold, the file becomes the new base.
Commands run in memory, using text streams and pipes wherever possible. Left
MEM_LIMIT configurable to fall back to disk for large repositories:
@@ -53,7 +52,8 @@ if ((!$use_disk && $tot_size > MEM_LIMIT) ||
}
```
-Benchmarked against Git v2.51.0 on a T490 (i7-10510U, OpenBSD 7.8):
+Benchmarked against Git v2.51.0 on a T490 (i7-10510U, OpenBSD 7.8). Measured
+with `/usr/bin/time -l sh -c`. Max RSS excludes child processes:
<pre class="pre-no-style">
=============================================================
@@ -104,15 +104,15 @@ Final Inodes | 1462 | 41
TOTAL URN REBASES: 0
</pre>
-Git is 10x times faster.
+Git is 10x faster.
On storage, Urn shows promise. Git wrote 12 MB to track a 17 MB repository; Urn
-wrote 9 MB. Over 80 commits, Git's inode consumption grew by 562. Urn's crept
-from 1,300 to 1,462.
+wrote 9 MB. Over 80 commits, Git's inode consumption grew by 562, while Urn's
+crept from 1,300 to 1,462.
Then fell the GC hammer. Inodes: 41. Space recovered: 8.4 MB.
-Precise impact on TBW and write amplification is not yet known.
+Precise impact on TBW and write amplification is unknown.
Commit: <a
href="https://git.asciimx.com/urn/commit/?id=ff98b5711ae91d5cafd75764be192c0be5e592cf"