---
title: Built and benchmarked Urn against Git
date: 2026-04-30
layout: post
---
Implemented init, status, add, commit, log, show, and diff. Tracks regular
files and symlinks. Didn't bother with collaborative workflows.
Replaced the initial work tree mirroring with symlinks to a path-sorted index.
This minimizes inode churn and avoids walking directories on every command.
Index tracks paths, mtimes, sizes, and SHA-1 hashes of staged, committed, and
base files. Like Git, used mtime and size to skip entries that didn't change.
Excluded file permissions for now.
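The mtime/size shortcut can be sketched roughly as below. This is not Urn's code: the entry layout and the names `sha1_of`, `make_entry`, and `is_modified` are assumptions for illustration. A file is re-hashed only when its mtime or size no longer matches the index.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hash a file's bytes with SHA-1 (the "b" mode reads in binary).
sub sha1_of {
    my ($path) = @_;
    return Digest::SHA->new(1)->addfile($path, "b")->hexdigest;
}

# Build a hypothetical index entry for a regular file.
sub make_entry {
    my ($path) = @_;
    my @st = stat($path) or die "stat $path: $!";
    return { path => $path, size => $st[7], mtime => $st[9],
             sha1 => sha1_of($path) };
}

# True if the file differs from its entry. Matching mtime and size
# short-circuits the check, so unchanged files are never re-read.
sub is_modified {
    my ($entry) = @_;
    my @st = stat($entry->{path}) or return 1;   # vanished: treat as changed
    return 0 if $st[7] == $entry->{size} && $st[9] == $entry->{mtime};
    return sha1_of($entry->{path}) ne $entry->{sha1} ? 1 : 0;
}
```

A touched but otherwise identical file fails the shortcut, gets hashed once, and is still reported clean.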
Scanned work and commit trees with a two-finger walk: the path-sorted index
and the sorted tree listing advance in lockstep. Linear index access trades
random-access speed for sequential IO and keeps the memory footprint low.
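A minimal sketch of a two-finger walk, assuming both inputs are plain lists of paths sorted with Perl's `cmp`. The function name and return shape are hypothetical; Urn's index records carry more than a path.

```perl
use strict;
use warnings;

# Merge-style walk over two path-sorted lists. Each step advances
# whichever finger points at the smaller path, so both lists are read
# once, front to back.
sub two_finger_walk {
    my ($index, $tree) = @_;
    my ($i, $j) = (0, 0);
    my (@deleted, @added, @common);
    while ($i < @$index && $j < @$tree) {
        my $cmp = $index->[$i] cmp $tree->[$j];
        if    ($cmp < 0) { push @deleted, $index->[$i++]; }   # only in index
        elsif ($cmp > 0) { push @added,   $tree->[$j++];  }   # only in tree
        else             { push @common,  $index->[$i]; $i++; $j++; }
    }
    # Whatever is left on either side has no counterpart.
    push @deleted, @{$index}[$i .. $#$index];
    push @added,   @{$tree}[$j .. $#$tree];
    return (\@deleted, \@added, \@common);
}

my ($deleted, $added, $common) = two_finger_walk(
    [qw(Makefile README src/main.c)],
    [qw(README src/main.c src/util.c)],
);
print "deleted: @$deleted\n";   # deleted: Makefile
print "added:   @$added\n";     # added:   src/util.c
print "common:  @$common\n";    # common:  README src/main.c
```

Only the two current entries are held at any time, which is where the low memory footprint comes from.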
Operations run in memory, using text streams and pipes wherever possible.
MEM_LIMIT is configurable; past it, buffered output falls back to a temp file
on disk for large repositories:
```
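use File::Temp qw(tempfile);
use POSIX ();

# Write the buffered lines to the temp file, creating it on first use.
my $flush = sub {
    if (!$use_disk) {
        # First spill: open a temp file that is removed on exit.
        ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
        $tmp_fh->setvbuf(undef, POSIX::_IOFBF(), $chunk_size);
        binmode $tmp_fh, ":raw";
        $use_disk = 1;
    }
    print $tmp_fh @buf;
    @buf      = ();    # reset so the next flush doesn't rewrite old lines
    $buf_size = 0;
};

# For each incoming $line:
push @buf, $line;
$buf_size += length($line);
$tot_size += length($line);

# Spill once the total passes MEM_LIMIT, then again every $chunk_size bytes.
if ((!$use_disk && $tot_size > MEM_LIMIT) ||
    ($use_disk && $buf_size > $chunk_size)) {
    $flush->();
}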
```
Commits save staged files, trees, and the deltas to the object store. Bundled
deltas into tarballs to conserve inodes. Gzipped objects larger than 512 bytes
(length of tar + gzip headers). Object store is content-addressable.
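A simplified sketch of a content-addressable store with the 512-byte gzip cutoff. It leaves out the tarball bundling, and the flat directory layout, the `.gz` suffix, and the names `store_object` and `GZIP_MIN` are assumptions rather than Urn's actual format.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);
use IO::Compress::Gzip qw(gzip $GzipError);
use File::Spec;

use constant GZIP_MIN => 512;   # objects at or below this stay uncompressed

# Store $data under its SHA-1 and return the id. Identical content maps
# to the same id, so it is written at most once.
sub store_object {
    my ($dir, $data) = @_;
    my $id   = sha1_hex($data);
    my $path = File::Spec->catfile($dir, $id);
    return $id if -e $path || -e "$path.gz";   # already stored

    if (length($data) > GZIP_MIN) {
        gzip(\$data => "$path.gz") or die "gzip failed: $GzipError";
    } else {
        open my $fh, '>:raw', $path or die "open $path: $!";
        print {$fh} $data;
        close $fh or die "close $path: $!";
    }
    return $id;
}
```

Skipping compression for small objects avoids cases where the container overhead outweighs the savings.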
Every delta targets the original file (the base), so any later version is
reconstructed with a single patch; there are no delta chains. When a delta
grows past the rebase threshold, the file becomes the new base.
Avoiding frequent rebases is key. Diff output is bloated but compresses well.
Set the rebase threshold to 1.4, expecting a 30-40% compression ratio.
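The rebase decision could look something like the sketch below. The post doesn't say what the 1.4 is measured against, so this assumes the uncompressed delta size is compared with the size of the file version being committed. `should_rebase` and `plan_storage` are hypothetical names.

```perl
use strict;
use warnings;

use constant REBASE_THRESHOLD => 1.4;

# Assumption: the threshold compares uncompressed delta size with the
# size of the file version being committed.
sub should_rebase {
    my ($delta_size, $file_size) = @_;
    return $delta_size > REBASE_THRESHOLD * $file_size ? 1 : 0;
}

# Hypothetical commit step: keep a single delta against the base, or
# promote the new version to base once the delta has grown too large.
sub plan_storage {
    my ($delta_size, $file_size) = @_;
    return should_rebase($delta_size, $file_size) ? "new base" : "delta vs base";
}

print plan_storage(300,  1000), "\n";   # delta vs base
print plan_storage(1500, 1000), "\n";   # new base
```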
Unix diff doesn't compute binary deltas. Rolled a basic binary diff to stay in
the base system. Works well enough, except when a small insertion or deletion
shifts every byte after it.
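To illustrate the byte-shift problem, here is a naive positional diff. It is not Urn's algorithm, just the simplest scheme that shows the effect; `positional_diff`, `apply_diff`, and `delta_bytes` are made-up names. A byte changed in place costs one delta byte, while a byte inserted at the front misaligns everything after it.

```perl
use strict;
use warnings;

# Naive positional diff: compare bytes at equal offsets and record each
# run of differing bytes as [offset, replacement].
sub positional_diff {
    my ($old, $new) = @_;
    my $common = length($old) < length($new) ? length($old) : length($new);
    my (@runs, $start);
    for my $i (0 .. $common - 1) {
        if (substr($old, $i, 1) ne substr($new, $i, 1)) {
            $start = $i unless defined $start;
        } elsif (defined $start) {
            push @runs, [$start, substr($new, $start, $i - $start)];
            undef $start;
        }
    }
    # Close a run still open at the end, or cover bytes appended to $new.
    $start = $common if !defined $start && length($new) > $common;
    push @runs, [$start, substr($new, $start)] if defined $start;
    return { size => length($new), runs => \@runs };
}

# Rebuild the new version from the old bytes plus the recorded runs.
sub apply_diff {
    my ($old, $diff) = @_;
    my $out = substr($old, 0, $diff->{size});
    substr($out, $_->[0], length($_->[1]), $_->[1]) for @{ $diff->{runs} };
    return $out;
}

# Total replacement bytes the delta has to carry.
sub delta_bytes {
    my ($diff) = @_;
    my $total = 0;
    $total += length($_->[1]) for @{ $diff->{runs} };
    return $total;
}

my $old     = "abcdefghij" x 10;   # 100 bytes
my $edited  = $old;
substr($edited, 50, 1, "Z");       # one byte changed in place
my $shifted = "X" . $old;          # one byte inserted at the front

printf "in place: %d delta bytes\n", delta_bytes(positional_diff($old, $edited));    # 1
printf "shifted:  %d delta bytes\n", delta_bytes(positional_diff($old, $shifted));   # 101
```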
Benchmarked against Git v2.51.0 on a T490 (i7-10510U, OpenBSD 7.8):
<pre class="pre-no-style">
=============================================================
COMMIT BENCHMARK: 1000 files (100 commits)
CONDITIONS: Depth=2, Files Mod=0.5%, Line Mod=5%
INITIAL REPO SIZE: 17332 KB
=============================================================
SNAPSHOT: Commit #20
-------------------------------------------------------------
METRIC | URN | GIT
----------------+----------------------+---------------------
Time | 0.29s | 0.03s
Max RSS | 0.02 MB | 0.01 MB
Page faults | Maj:0 / Min:0 | Maj:0 / Min:0
Inodes | 1300 | 1425
Repo size | 6836 KB | 8296 KB
-------------------------------------------------------------
SNAPSHOT: Commit #40
-------------------------------------------------------------
METRIC | URN | GIT
----------------+----------------------+---------------------
Time | 0.29s | 0.03s
Max RSS | 0.02 MB | 0.01 MB
Page faults | Maj:0 / Min:0 | Maj:0 / Min:0
Inodes | 1340 | 1566
Repo size | 7332 KB | 9268 KB
-------------------------------------------------------------
SNAPSHOT: Commit #60
-------------------------------------------------------------
METRIC | URN | GIT
----------------+----------------------+---------------------
Time | 0.35s | 0.03s
Max RSS | 0.02 MB | 0.01 MB
Page faults | Maj:0 / Min:0 | Maj:0 / Min:0
Inodes | 1381 | 1706
Repo size | 7896 KB | 10236 KB
-------------------------------------------------------------
SNAPSHOT: Commit #80
-------------------------------------------------------------
METRIC | URN | GIT
----------------+----------------------+---------------------
Time | 0.35s | 0.03s
Max RSS | 0.02 MB | 0.01 MB
Page faults | Maj:0 / Min:0 | Maj:0 / Min:0
Inodes | 1421 | 1847
Repo size | 8456 KB | 11200 KB
-------------------------------------------------------------
SNAPSHOT: Commit #100
-------------------------------------------------------------
METRIC | URN | GIT
----------------+----------------------+---------------------
Time | 0.35s | 0.03s
Max RSS | 0.02 MB | 0.01 MB
Page faults | Maj:0 / Min:0 | Maj:0 / Min:0
Inodes | 1462 | 1987
Repo size | 9020 KB | 12168 KB
-------------------------------------------------------------
AFTER GIT GC
-------------------------------------------------------------
Final Size | 9020 KB | 3812 KB
Final Inodes | 1462 | 41
-------------------------------------------------------------
TOTAL URN REBASES: 0
</pre>
Git wins on speed and memory: 0.03s against Urn's 0.29-0.35s at every
snapshot, with half the max RSS.
On storage, Urn shows more promise. Git wrote 12 MB to track a 17 MB
repository; Urn wrote 9 MB. Between commits 20 and 100, Git's inode count grew
by 562 (1,425 to 1,987); Urn's crept up by 162 (1,300 to 1,462).
Then came the GC. Inodes: 41. Space recovered: 8.4 MB.
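The storage figures follow directly from the table. A quick check, with the numbers copied from the commit #20, commit #100, and post-GC rows (variable names are arbitrary):

```perl
use strict;
use warnings;

# Values copied from the benchmark table above.
my %git = (inodes_20 => 1425, inodes_100 => 1987, kb_100 => 12168, kb_gc => 3812);
my %urn = (inodes_20 => 1300, inodes_100 => 1462);

my $git_inode_growth = $git{inodes_100} - $git{inodes_20};
my $urn_inode_growth = $urn{inodes_100} - $urn{inodes_20};
my $gc_recovered_kb  = $git{kb_100} - $git{kb_gc};

print "Git inode growth: $git_inode_growth\n";        # Git inode growth: 562
print "Urn inode growth: $urn_inode_growth\n";        # Urn inode growth: 162
print "GC recovered:     $gc_recovered_kb KB\n";      # GC recovered:     8356 KB
```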
Urn's sequential IO and reduced write frequency are theoretically gentler on
NAND flash cells. Git's dramatic GC pass (12 MB → 3.8 MB) incurs write
amplification Urn likely avoids.
Precise impact on SSD TBW and write amplification, however, remains unknown.
Commit: <a
href="https://git.asciimx.com/urn/commit/?id=79d9ec2bdef0a82172fa0aa56f12004bef206c04"
class="external" target="_blank" rel="noopener noreferrer">79d9ec2</a>