Diffstat (limited to '_log/search-with-cgi.md')
-rw-r--r-- _log/search-with-cgi.md 73
1 files changed, 0 insertions, 73 deletions
diff --git a/_log/search-with-cgi.md b/_log/search-with-cgi.md
deleted file mode 100644
index 0109294..0000000
--- a/_log/search-with-cgi.md
+++ /dev/null
@@ -1,73 +0,0 @@
----
-title: Site search using Perl + CGI
-date: 2025-12-29
-layout: post
----
-
-The number of articles on the site is growing. Need a way to search the site.
-
-Searching the RSS feed client-side using JavaScript is not an option. That
-would make the feed much heavier and break the site for text-based web browsers
-like Lynx.
-
-Not gonna use an inverted index: more than an evening's effort, especially if I
-want partial matching. I want partial matching.
-
-A few lines of Perl could do a regex search and send the result back via CGI.
-OpenBSD httpd speaks CGI. Perl and slowcgi are in the base system. No
-dependencies.
-
-Perl: traverse directory with File::Find. If search text is found grab the file
-name, title and up to 50 chars from the first paragraph to include in the
-search result.
-
-```
-find({
-    wanted => sub {
-        return unless -f $_ && $_ eq 'index.html';
-        # ... file reading ...
-        if ($content =~ /\Q$search_text\E/i) {
-            # Extract title, snippet
-            push @results, {
-                path    => $File::Find::name,
-                title   => $title,
-                snippet => $snippet
-            };
-        }
-    },
-    follow => 0,
-}, $dir);
-```
-
-httpd sets the search text in QUERY_STRING env. Don't need Perl's CGI module.
-
-```
-my %params;
-if (defined $ENV{QUERY_STRING}) {
-    foreach my $pair (grep { length } split /&/, $ENV{QUERY_STRING}) {
-        # Limit of 2 keeps '=' inside values; a missing value becomes ''.
-        my ($key, $value) = map { $_ // '' } (split /=/, $pair, 2)[0, 1];
-        tr/+/ /, s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg for ($key, $value);
-        $params{$key} = $value;
-    }
-}
-```
-
-Security.
-
-ReDoS, XSS, command injection, symlink attacks. Did I miss anything? Probably.
-
-ReDoS: sanitized user input, length-limited search text, quoted metacharacters
-with `\Q$search_text\E`.
-
-XSS: sanitized user input. Escaped HTML.
-
-Command injection: no exec()/system() calls. Non-privileged user (www).
-
-Symlink attacks: File::Find doesn't follow symlinks (follow => 0). chroot.
-
-Access controls: files 444; directories and CGI script 554.
-
-Verdict: O(n) in total site content per query. Works on every conceivable browser. Good enough.
-
-Commit: [9fec793](https://git.asciimx.com/www/commit/?h=term&id=9fec793abe0a73e5cd502a1d1e935e2413b85079)
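
The ReDoS and XSS notes in the deleted post name the mitigations without showing them. Below is a minimal sketch of how they might look in Perl, not the code from the referenced commit. The helper names (`sanitize_query`, `escape_html`, `matches`) and the 64-character cap are assumptions made for illustration; the post does not state the actual limit.

```perl
use strict;
use warnings;

# Hypothetical cap; the post does not say what limit it uses.
my $MAX_QUERY_LEN = 64;

# Drop control characters, trim surrounding whitespace, cap the length.
sub sanitize_query {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ s/[[:cntrl:]]//g;
    $text =~ s/^\s+|\s+$//g;
    return substr($text, 0, $MAX_QUERY_LEN);
}

# Escape the five characters that are significant in HTML text and attributes.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
                  '"' => '&quot;', "'" => '&#39;');
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Literal, case-insensitive match: \Q...\E turns regex metacharacters
# in the query into plain characters, so user input cannot build a pattern.
sub matches {
    my ($content, $query) = @_;
    return length($query) && $content =~ /\Q$query\E/i ? 1 : 0;
}
```

In this sketch the query would pass through `sanitize_query` once, right after decoding, and `escape_html` would be applied to every value (query, title, snippet) at the point it is written into the response. Because the match is literal, its cost stays linear in the file size regardless of what the user types.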