author: Sadeep Madurange <sadeep@asciimx.com> 2026-01-16 22:29:35 +0800
committer: Sadeep Madurange <sadeep@asciimx.com> 2026-01-16 22:29:35 +0800
commit: 301adf7f088e4bc75d37ea927c11fe2e4a455d8f
tree: b7d30a48e4869b838cd10a4e4a9b10c2374b2a5a /_log/site-search.md
parent: 1517a5cdf8a1a3d57cadc3f363931ace5af06438
Remove matrix style.
Diffstat (limited to '_log/site-search.md')
-rw-r--r-- _log/site-search.md 25
1 files changed, 14 insertions, 11 deletions
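
Note for readers of the patch: the log entry it touches describes search as a range query with twin binary searches over fixed-width 4-byte offsets. A minimal sketch of that idea follows, assuming a lowercased corpus and big-endian 32-bit offsets. The helper names (`build_sa`, `sa_at`, `bound`, `search`) are hypothetical, and an in-memory filehandle stands in for sa.bin. This is not the site's actual script.

```perl
use strict;
use warnings;

# Build a suffix array with a naive sort (fine for a demo, slow at scale).
sub build_sa {
    my ($corpus) = @_;
    my @sa = sort { substr($corpus, $a) cmp substr($corpus, $b) }
             0 .. length($corpus) - 1;
    return pack('N*', @sa);    # assumed layout: 32-bit big-endian offsets
}

# Fixed width means entry i lives at byte i * 4, so one seek reaches it.
sub sa_at {
    my ($fh_sa, $i) = @_;
    seek($fh_sa, $i * 4, 0) or die "seek: $!";
    read($fh_sa, my $buf, 4) == 4 or die "short read";
    return unpack('N', $buf);
}

# First index in [0, n] whose suffix prefix is >= $q (or > $q if $upper).
sub bound {
    my ($fh_sa, $corpus, $n, $q, $upper) = @_;
    my ($lo, $hi) = (0, $n);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        my $pre = substr($corpus, sa_at($fh_sa, $mid), length $q);
        my $c   = $pre cmp $q;
        if ($c < 0 || ($upper && $c == 0)) { $lo = $mid + 1 }
        else                               { $hi = $mid }
    }
    return $lo;
}

# Twin binary searches bracket every suffix that starts with the query.
sub search {
    my ($corpus, $sa_bytes, $query) = @_;
    my $q = lc $query;         # assumes the corpus was lowercased at build time
    open(my $fh_sa, '<', \$sa_bytes) or die "open: $!";
    binmode($fh_sa);
    my $n     = length($sa_bytes) / 4;
    my $first = bound($fh_sa, $corpus, $n, $q, 0);
    my $last  = bound($fh_sa, $corpus, $n, $q, 1);
    return sort { $a <=> $b } map { sa_at($fh_sa, $_) } $first .. $last - 1;
}
```

Calling `search($corpus, build_sa($corpus), "ANA")` on the lowercased corpus `"banana bandana"` returns the byte offsets of every occurrence of `ana`. Each probe costs one seek and one 4-byte read, so the index does not need to be loaded into memory.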
diff --git a/_log/site-search.md b/_log/site-search.md
index 0ff97fc..df5a7ab 100644
--- a/_log/site-search.md
+++ b/_log/site-search.md
@@ -9,9 +9,9 @@ Needed search for site.
Requirements: matches substrings, case-insensitive, fast, secure. No
JavaScript.
-Architecture: browser → httpd → slowcgi → Perl CGI script.
+Architecture: browser → httpd → slowcgi → Perl script.
-SA index implemented with three files: corpus.bin, sa.bin, file_map.dat. Index
+Implemented SA index with three files: corpus.bin, sa.bin, file_map.dat. Index
built with site:
```
@@ -40,8 +40,10 @@ my @sa = 0 .. (length($corpus) - 1);
Sort is the bottleneck. Time complexity: O(L⋅N log N). Fast path caps L at 64
bytes (length of a cache line) → O(N log N).
-Search: Textbook range query with twin binary searches. Uses fixed-width
-offsets for random access:
+32-bit offsets limits index size to 4GB (243k articles).
+
+Search: Textbook range query with twin binary searches. Fixed-width offsets
+enable random access:
```
seek($fh_sa, $mid * 4, 0);
@@ -80,17 +82,18 @@ regex search:
- Peak RAM (SA): 12504 KB
- Peak RAM (Regex): 12804 KB
- Search (SA): 0.0161 s
- - Search (Regex): 0.9120 s
+ - Search (Regex): 0.9120 S
-Security: httpd, slowcgi, Perl in base system--no dependencies. File system
-permissions govern access. Runs in chroot.
+Security: httpd, slowcgi, Perl are in the base system--no dependencies. File
+system permissions govern access. Runs in chroot.
Resource exhaustion and XSS attacks are inherent. Lock-file semaphores limit
-concurrent searches. Query length (64B) and result set (20) capped. All output
-is HTML-escaped to prevent XSS.
+concurrent searches. Query length (64B) and result set (20) are capped. All
+output is HTML-escaped to prevent XSS.
+
+Warranty: 10,000 / 12 → 833 years.
-Warranty: 10,000 / 12 → 833 years. Next release: inverted index, year of our
-Lord 2859.
+Next release: inverted index; Anno Domini 2859.
Commit: <a
href="https://git.asciimx.com/www/commit/?h=term&id=6da102d6e0494a3eac3f05fa3b2cdcc25ba2754e"