author Sadeep Madurange <sadeep@asciimx.com> 2026-05-07 19:19:44 +0800
committer Sadeep Madurange <sadeep@asciimx.com> 2026-05-08 18:10:39 +0800
commit 0140adc68d7c46f98658c5dbc51b53ff332da4ef
tree 07e2ac5d0f045952f080287d97e0df466b506a6d /_log/vcs-1.md
parent 0e0d336d7ec80ff00e0e4d9acf00267ad3faa214
download www-0140adc68d7c46f98658c5dbc51b53ff332da4ef.tar.gz
Improve prose.
Diffstat (limited to '_log/vcs-1.md')
-rw-r--r-- _log/vcs-1.md 21
1 files changed, 10 insertions, 11 deletions
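Reader's note: the post text changed by this diff describes two decision rules, one for when an index entry is rehashed and one for when a file becomes the new delta base. The sketch below is only an illustration of those rules, not the author's code (the post says the tool is written in Perl; Python is used here for brevity). The names `IndexEntry`, `needs_hash`, and `should_rebase` are hypothetical. Two readings are assumptions: that "the same mtime as the index" refers to the index file's own modification time, and that the 1.4 threshold is compared against the ratio of delta size to file size.

```python
from dataclasses import dataclass

# Value quoted in the post; what it is measured against is an assumption here.
REBASE_THRESHOLD = 1.4


@dataclass
class IndexEntry:
    """Hypothetical subset of the fields the post says the index tracks."""
    path: str
    mtime: int  # whole seconds recorded for the file
    size: int   # bytes recorded for the file


def needs_hash(entry: IndexEntry, file_mtime: int, file_size: int,
               index_mtime: int) -> bool:
    """Return True when the file should be rehashed.

    A changed mtime or size means the content may differ. An unchanged
    file whose mtime equals the index's own mtime (assumed meaning) is
    also hashed, because a sub-second edit could hide behind the same
    timestamp.
    """
    if file_mtime != entry.mtime or file_size != entry.size:
        return True
    return file_mtime == index_mtime


def should_rebase(delta_size: int, file_size: int,
                  threshold: float = REBASE_THRESHOLD) -> bool:
    """Return True when the file should become the new base.

    Assumes the threshold is a delta-to-file size ratio.
    """
    return delta_size > threshold * file_size
```

Under these assumptions, an entry with unchanged mtime and size is skipped unless its timestamp collides with the index's, and a delta is kept until it grows past 1.4 times the size of the file it reconstructs.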
diff --git a/_log/vcs-1.md b/_log/vcs-1.md
index ac74ad5..4244007 100644
--- a/_log/vcs-1.md
+++ b/_log/vcs-1.md
@@ -4,8 +4,8 @@ date: 2026-05-01
layout: post
---
-Implemented init, status, add, commit, log, show, and diff commands. Depends
-only on OpenBSD base system tools. Didn't bother with collaborative workflows.
+Implemented init, status, add, commit, log, show, and diff commands using Perl
+and OpenBSD base-system tools. Didn't bother with collaborative workflows.
Initial design mirrored the work tree using symlinks. Using filesystem as a
database felt clever, but walking directories on every command and the inode
@@ -13,22 +13,21 @@ churn were untenable. Replaced the symlink architecture with a path-sorted
index.
The index tracks path, mtime, size, and SHA-1 hashes of staged, committed, and
-base files. Hashing is skipped when mtime and size are unchanged. If the file
-and the index share the same timestamp, it's rehashed to catch sub-second
-changes.
+base files. Only entries whose mtime and size changed, or share the same mtime
+as the index are hashed.
-Implemented directory scans as a two-finger walk with the index. Linear index
+Implemented directory scans as a two-finger walk with the index; linear index
access trades random-access speed for sequential IO and keeps memory footprint
low.
-Commits save staged files, trees, and deltas to the content-addressable object
+Commits save staged files, trees, and deltas to a content-addressable object
store. Bundled deltas into tarballs to conserve inodes. Gzipped objects larger
than 512 bytes. The threshold was arbitrary. Did not tune further.
Deltas, computed using diff, target the original file. Subsequent versions are
-reconstructed via a single patch—no chains. When the delta exceeds the rebase
-threshold, the file becomes the new base. Diff output is bloated but compresses
-well, so rebase threshold is set to 1.4, assuming a 30-40% compression ratio.
+reconstructed via a single patch—no chains. Diff output is bloated but
+compresses well, so rebase threshold is set to 1.4, assuming a 30-40%
+compression ratio. When the delta exceeds that, the file becomes the new base.
Commands run in memory, using text streams and pipes wherever possible. Left
MEM_LIMIT configurable to fall back to disk for large repositories:
@@ -113,7 +112,7 @@ from 1,300 to 1,462.
Then fell the GC hammer. Inodes: 41. Space recovered: 8.4 MB.
-Precise impact on TBW and write amplification remains unknown.
+Precise impact on TBW and write amplification is not yet known.
Commit: <a
href="https://git.asciimx.com/urn/commit/?id=79d9ec2bdef0a82172fa0aa56f12004bef206c04"