Diffstat (limited to '_log/site-search.md')
-rw-r--r-- _log/site-search.md | 51
1 files changed, 32 insertions, 19 deletions
diff --git a/_log/site-search.md b/_log/site-search.md
index b0b1d32..0848dce 100644
--- a/_log/site-search.md
+++ b/_log/site-search.md
@@ -1,5 +1,5 @@
---
-title: Search engine for static sites
+title: Built a search engine for website based on suffix arrays
date: 2026-01-03
layout: post
---
@@ -63,29 +63,42 @@ Small seek/reads are fast on modern SSDs; keeps memory footprint small.
Benchmarks on T490 (i7-10510U, OpenBSD 7.8, article size: 16 KB) against linear
regex search:
+<pre class="pre-no-style">
+=============================================================
+SEARCH BENCHMARK: Suffix array vs. Linear regex
+ARTICLE SIZE: 16 KB
+=============================================================
+
500 files:
- - Index size: 204.94 KB
- - Indexing time: 0.1475 s
- - Peak RAM (SA): 8828 KB
- - Peak RAM (Regex): 9136 KB
- - Search (SA): 0.0012 s
- - Search (Regex): 0.0407 s
+-------------------------------------------------------------
+METRIC          | SA                   | REGEX
+----------------+----------------------+---------------------
+Search time     | 0.0012s              | 0.0407s
+Peak RAM        | 8828 KB              | 9136 KB
+Indexing time   | 0.1475s              | N/A
+Index size      | 204.94 KB            | N/A
+-------------------------------------------------------------
1,000 files:
- - Index size: 410.51 KB
- - Indexing time: 0.3101 s
- - Peak RAM (SA): 8980 KB
- - Peak RAM (Regex): 9460 KB
- - Search (SA): 0.0019 s
- - Search (Regex): 0.0795 s
+-------------------------------------------------------------
+METRIC          | SA                   | REGEX
+----------------+----------------------+---------------------
+Search time     | 0.0019s              | 0.0795s
+Peak RAM        | 8980 KB              | 9460 KB
+Indexing time   | 0.3101s              | N/A
+Index size      | 410.51 KB            | N/A
+-------------------------------------------------------------
10,000 files:
- - Index size: 4163.44 KB
- - Indexing time: 10.9661 s
- - Peak RAM (SA): 12504 KB
- - Peak RAM (Regex): 12804 KB
- - Search (SA): 0.0161 s
- - Search (Regex): 0.9120 s
+-------------------------------------------------------------
+METRIC          | SA                   | REGEX
+----------------+----------------------+---------------------
+Search time     | 0.0161s              | 0.9120s
+Peak RAM        | 12504 KB             | 12804 KB
+Indexing time   | 10.9661s             | N/A
+Index size      | 4163.44 KB           | N/A
+-------------------------------------------------------------
+</pre>
Seek/read consistently outperformed mmap at <1k files. At 10k, mmap was
occasionally faster (~200 µs), but used more memory—possibly OpenBSD's VM