From f70640148a2952271e4b3e0788e0aad3a7203e7d Mon Sep 17 00:00:00 2001
From: Sadeep Madurange
Date: Sun, 24 May 2026 11:17:37 +0800
Subject: Minor improvement to Bumblebee, geometry, urn posts.

---
 _log/vcs-1.md | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/_log/vcs-1.md b/_log/vcs-1.md
index 93e842b..d11b342 100644
--- a/_log/vcs-1.md
+++ b/_log/vcs-1.md
@@ -4,30 +4,29 @@ date: 2026-05-01
 layout: post
 ---
 
-Implemented init, status, add, commit, log, show, and diff commands using Perl
-and OpenBSD base-system tools. Didn't bother with collaborative workflows.
+Implemented init, status, add, commit, log, show, and diff using Perl and
+OpenBSD base-system tools. Didn't bother with collaborative workflows.
 
-Initial design mirrored the work tree using symlinks. Using filesystem as a
-database felt clever, but walking directories on every command and the inode
-churn were untenable. Replaced the symlink architecture with a path-sorted
-index.
+Initial design mirrored the work tree with symlinks. Using filesystem as a
+database felt clever, but walking directories on every command was untenable.
+Replaced the symlink architecture with a path-sorted index.
 
 The index tracks path, mtime, size, and SHA-1 hashes of staged, committed, and
-base files. Only entries whose mtime and size changed, or share the same mtime
-as the index are hashed.
+base files. Only entries whose mtime and size changed (or has the same mtime as
+the index to mitigate races caused by mtime precision) are hashed.
 
 Implemented directory scans as a two-finger walk with the index; linear index
-access trades random-access speed for sequential IO and keeps memory footprint
-low.
+access trades random-access speed for sequential IO and keeps the memory
+footprint low.
 
 Commits save staged files, trees, and deltas to a content-addressable object
-store. Bundled deltas into tarballs to conserve inodes. Gzipped objects larger
-than 512 bytes. The threshold was arbitrary. Did not tune further.
+store. Deltas are bundled into tarballs to conserve inodes. Objects larger than
+512 bytes are gzipped. The threshold was arbitrary. Did not tune further.
 
 Deltas, computed using diff, target the original file. Subsequent versions are
 reconstructed via a single patch—no chains. Diff output is bloated but
-compresses well, so rebase threshold is set to 1.4, assuming a 30-40%
-compression ratio. When the delta exceeds that, the file becomes the new base.
+compresses well. Rebase threshold is set to 1.4, assuming a 30-40% compression
+ratio. When the delta exceeds the threshold, the file becomes the new base.
 
 Commands run in memory, using text streams and pipes wherever possible. Left
 MEM_LIMIT configurable to fall back to disk for large repositories:
@@ -53,7 +52,8 @@ if ((!$use_disk && $tot_size > MEM_LIMIT) ||
 }
 ```
 
-Benchmarked against Git v2.51.0 on a T490 (i7-10510U, OpenBSD 7.8):
+Benchmarked against Git v2.51.0 on a T490 (i7-10510U, OpenBSD 7.8). Measured
+with `/usr/bin/time -l sh -c`. Max RSS excludes child processes:
 =============================================================
@@ -104,15 +104,15 @@ Final Inodes    |                 1462 |                   41
 TOTAL URN REBASES: 0
 
-Git is 10x times faster.
+Git is 10x faster.
 
 On storage, Urn shows promise. Git wrote 12 MB to track a 17 MB repository; Urn
-wrote 9 MB. Over 80 commits, Git's inode consumption grew by 562. Urn's crept
-from 1,300 to 1,462.
+wrote 9 MB. Over 80 commits, Git's inode consumption grew by 562, while Urn's
+crept from 1,300 to 1,462.
 
 Then fell the GC hammer. Inodes: 41. Space recovered: 8.4 MB.
 
-Precise impact on TBW and write amplification is not yet known.
+Precise impact on TBW and write amplification is unknown.
 
 Commit: