From f0d65be8cef87084f65f373ddfe51ce5c8405879 Mon Sep 17 00:00:00 2001 From: Sadeep Madurange Date: Sat, 10 Jan 2026 15:34:34 +0800 Subject: IBM VGA fonts, major changes to typography, update site-search. --- _log/site-search.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) (limited to '_log/site-search.md') diff --git a/_log/site-search.md b/_log/site-search.md index 1752db7..4ccae3c 100644 --- a/_log/site-search.md +++ b/_log/site-search.md @@ -4,7 +4,7 @@ date: 2026-01-03 layout: post --- -Article count is growing. Need a way to search. +Article count is growing. Need search. Requirements: matches substrings, case-insensitive, fast, secure. No JavaScript. @@ -14,7 +14,7 @@ Architecture: browser → httpd → slowcgi → Perl CGI script. Perl, httpd, slowcgi are in the OpenBSD base system. Instead of secrets, file system permissions govern access. -2025-12-30: Regex search. +2025-12-30: Regex. 140-line Perl script searches 500 files in 40ms. Fast enough; O(N) pull felt at higher file counts. @@ -22,7 +22,7 @@ higher file counts. Introduces ReDoS and symlink attack vectors. Both can be mitigated. Tempted to stop here. -2026-01-03: Suffix Array (SA) based index lookup. +2026-01-03: Suffix array (SA) based index lookup. Slurping files on every request bothers me. Regex search depends almost entirely on hardware for speed. @@ -36,8 +36,8 @@ $ cd cgi-bin/ $ perl indexer.pl ``` -Indexer extracts HTML, lowercases, encodes into UTF-8 binary sequences. Null -byte sentinel for document boundaries. sa.bin stores suffix offsets as +Indexer extracts HTML, lowercases, and encodes into UTF-8 binary sequences. +Null byte sentinel for document boundaries. sa.bin stores suffix offsets as 32-bit unsigned integers, sorted by lexicographical order: ``` @@ -120,7 +120,7 @@ Resource exhaustion and XSS attacks are inherent. Former mitigated by limiting concurrent searches via lock-file semaphores. Query length (64B) and result set (20) capped. All output is HTML-escaped to prevent XSS. -Secure by default. Fast. Durable. +Verdict: Fast. Durable. Secure by default. Commit: