| author | Sadeep Madurange <sadeep@asciimx.com> | 2026-04-23 16:01:14 +0800 |
|---|---|---|
| committer | Sadeep Madurange <sadeep@asciimx.com> | 2026-04-23 16:01:14 +0800 |
| commit | 42f6044b0bbcdf92fe2e0667dfc07c2f4cfaed69 | |
| tree | 70d4a47d88b356fa08c49ed20d5f2a064f761ca3 | |
| parent | 58c2b94a37b21c642287065b44bf83334b7306fc | |
Wrote VCS post in log style.
| -rw-r--r-- | _log/vcs-1.md | 45 |
1 files changed, 23 insertions, 22 deletions
diff --git a/_log/vcs-1.md b/_log/vcs-1.md
index 70531bc..c07d3bd 100644
--- a/_log/vcs-1.md
+++ b/_log/vcs-1.md
@@ -4,23 +4,22 @@ date: 2026-04-23
 layout: post
 ---
 
-Urn is an experimental VCS built to minimize SSD wear, write amplification, and
-inode churn, even when that costs CPU time.
+Implemented init, status, add, commit, log, show, and diff. Tracks regular
+files, symlinks. Didn't bother with collaborative workflows.
 
-Implemented init, status, add, commit, log, show, and diff. Handles text files,
-symlinks, and binary files. Collaborative workflows are out of scope.
+Moved away from the initial work tree mirroring with symlinks to an index to
+minimize inode churn and opening directories on every status/add command.
 
-Sorted index tracks file paths, SHA-1 hashes of staged files (staged hash),
-parent (commit hash), and base (base hash), mtime, and size. Permissions are
-not tracked. mtime and size are used to avoid computing hashes for files that
-didn't change.
+Implemented path-sorted index to track staged, commit, and base SHA-1 hashes,
+mtime, and size. Like git, mtime and size are used to skip entries that didn't
+change. Excluded file permissions.
 
-When a new file is committed, it's saved in the object store as the base. When
+When a new file is committed, it’s saved in the object store as the base. When
 the file changes, diff generates a patch against the base. If the patch is
 larger than the file, the file becomes the new base.
 
-Unix diff doesn't handle binary files well. Rolled a diff for binary files that
-works well enough except for small changes that shift bytes:
+Unix diff doesn't compute binary deltas. Rolled a basic binary diff that works
+well enough except for small changes that shift bytes:
 
 ```
 my $patch = pack("Q", $new_size);
@@ -38,15 +37,17 @@ while (1) {
 }
 ```
 
-Commits store sorted lists of paths (tree), base hashes, and patch sets in the
-object store. Patches are stored as tarballs. Tarballs larger than 512 bytes
-are gzipped. Objects are content-addressable. To reconstruct a file, look up
-the revision, follow the base hash and apply the patch.
+A commit is a tree (list of paths and their base hashes) and a patch set.
+Patches are stored as tarballs. Gzipped tarballs larger than 512 bytes.
+Objects in the store are content-addressable. To reconstruct a file, look up
+the revision file, follow the base hash, and apply the patch.
 
-Status and add commands scan the work tree, sort entries by path and performs a
-two-finger walk with the index to minimize random access. Operations are
-performed in memory — often using text streams and pipes. MEM_LIMIT can be used
-to fall back on the disk for large repositories:
+Status and add commands scan the work tree, sort entries by path, and perform a
+two-finger walk with the index. Linear index access trades random-access speed
+for sequential IO.
+
+Operations are performed in memory — often using text streams and pipes.
+MEM_LIMIT can be used to fall back on the disk for large repositories:
 
 ```
 my $flush = sub {
@@ -69,7 +70,7 @@ if ((!$use_disk && $tot_size > MEM_LIMIT) ||
 }
 ```
 
-Benchmarks on T490 (i7-10510U, OpenBSD 7.8) against git v2.51.0:
+Performed benchmarks on T490 (i7-10510U, OpenBSD 7.8) against git v2.51.0:
 
 <pre class="pre-no-style">
 =============================================================
@@ -240,8 +241,8 @@ storage. Over 80 commits, git wrote 17 MB to track a 17 MB repo. Urn only wrote
 went from 1,578 to 1,693.
 
 Git GC reclaims inodes, but doesn't save much space. On a 36.6 MB repo, git
-used up 1819 inodes and 59 MB pre-GC. After GC inode count dropped to 1514, but
-space only shrank by 6 MB.
+used up 1819 inodes and 59 MB pre-GC. After GC, inode count dropped to 1514,
+but space only shrank by 6 MB.
 
 Commit:
 <a href="https://git.asciimx.com/urn/commit/?id=49ae7748e4a95afa1fd9d08f4886952dfc1deca4"
