Diffstat (limited to '_log/vcs-1.md')
| -rw-r--r-- | _log/vcs-1.md | 38 |
1 files changed, 19 insertions, 19 deletions
diff --git a/_log/vcs-1.md b/_log/vcs-1.md
index 93e842b..d11b342 100644
--- a/_log/vcs-1.md
+++ b/_log/vcs-1.md
@@ -4,30 +4,29 @@
 date: 2026-05-01
 layout: post
 ---
-Implemented init, status, add, commit, log, show, and diff commands using Perl
-and OpenBSD base-system tools. Didn't bother with collaborative workflows.
+Implemented init, status, add, commit, log, show, and diff using Perl and
+OpenBSD base-system tools. Didn't bother with collaborative workflows.
 
-Initial design mirrored the work tree using symlinks. Using filesystem as a
-database felt clever, but walking directories on every command and the inode
-churn were untenable. Replaced the symlink architecture with a path-sorted
-index.
+Initial design mirrored the work tree with symlinks. Using filesystem as a
+database felt clever, but walking directories on every command was untenable.
+Replaced the symlink architecture with a path-sorted index.
 
 The index tracks path, mtime, size, and SHA-1 hashes of staged, committed, and
-base files. Only entries whose mtime and size changed, or share the same mtime
-as the index are hashed.
+base files. Only entries whose mtime and size changed (or has the same mtime as
+the index to mitigate races caused by mtime precision) are hashed.
 
 Implemented directory scans as a two-finger walk with the index; linear index
-access trades random-access speed for sequential IO and keeps memory footprint
-low.
+access trades random-access speed for sequential IO and keeps the memory
+footprint low.
 
 Commits save staged files, trees, and deltas to a content-addressable object
-store. Bundled deltas into tarballs to conserve inodes. Gzipped objects larger
-than 512 bytes. The threshold was arbitrary. Did not tune further.
+store. Deltas are bundled into tarballs to conserve inodes. Objects larger than
+512 bytes are gzipped. The threshold was arbitrary. Did not tune further.
 
 Deltas, computed using diff, target the original file. Subsequent versions are
 reconstructed via a single patch—no chains. Diff output is bloated but
-compresses well, so rebase threshold is set to 1.4, assuming a 30-40%
-compression ratio. When the delta exceeds that, the file becomes the new base.
+compresses well. Rebase threshold is set to 1.4, assuming a 30-40% compression
+ratio. When the delta exceeds the threshold, the file becomes the new base.
 
 Commands run in memory, using text streams and pipes wherever possible. Left
 MEM_LIMIT configurable to fall back to disk for large repositories:
@@ -53,7 +52,8 @@ if ((!$use_disk && $tot_size > MEM_LIMIT) ||
 }
 ```
 
-Benchmarked against Git v2.51.0 on a T490 (i7-10510U, OpenBSD 7.8):
+Benchmarked against Git v2.51.0 on a T490 (i7-10510U, OpenBSD 7.8). Measured
+with `/usr/bin/time -l sh -c`. Max RSS excludes child processes:
 
 <pre class="pre-no-style">
 =============================================================
@@ -104,15 +104,15 @@ Final Inodes | 1462 | 41
 TOTAL URN REBASES: 0
 </pre>
 
-Git is 10x times faster.
+Git is 10x faster.
 
 On storage, Urn shows promise. Git wrote 12 MB to track a 17 MB repository; Urn
-wrote 9 MB. Over 80 commits, Git's inode consumption grew by 562. Urn's crept
-from 1,300 to 1,462.
+wrote 9 MB. Over 80 commits, Git's inode consumption grew by 562, while Urn's
+crept from 1,300 to 1,462.
 
 Then fell the GC hammer. Inodes: 41. Space recovered: 8.4 MB.
 
-Precise impact on TBW and write amplification is not yet known.
+Precise impact on TBW and write amplification is unknown.
 
 Commit: <a
 href="https://git.asciimx.com/urn/commit/?id=ff98b5711ae91d5cafd75764be192c0be5e592cf"
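
Reviewer note: the post describes directory scans as a two-finger walk over a path-sorted index. The sketch below is one way to read that idea, not Urn's actual code. It is written in Python rather than the project's Perl, it uses in-memory lists where the real tool presumably streams both sides from disk, and the function name and the status labels are hypothetical.

```python
def two_finger_walk(index_paths, tree_paths):
    """Compare two path-sorted lists in one linear pass.

    index_paths: paths recorded in the index, sorted ascending (assumed).
    tree_paths:  paths found in the work tree, sorted the same way (assumed).
    Yields (status, path) pairs; the status names are illustrative only.
    """
    i = j = 0
    while i < len(index_paths) and j < len(tree_paths):
        a, b = index_paths[i], tree_paths[j]
        if a == b:
            # Present on both sides: candidate for the mtime/size check.
            yield ("tracked", a)
            i += 1
            j += 1
        elif a < b:
            # In the index but missing from the work tree.
            yield ("deleted", a)
            i += 1
        else:
            # In the work tree but not in the index.
            yield ("untracked", b)
            j += 1
    # Whatever is left on either side has no counterpart.
    for a in index_paths[i:]:
        yield ("deleted", a)
    for b in tree_paths[j:]:
        yield ("untracked", b)


index = ["README", "src/a.pl", "src/b.pl"]
tree = ["README", "src/b.pl", "src/c.pl"]
result = list(two_finger_walk(index, tree))
```

Each cursor only moves forward, so neither list needs random access. That matches the trade-off the post names: sequential IO and a small memory footprint at the cost of lookup speed. Both inputs must use the same sort order, otherwise the comparison misclassifies entries.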
