| author | Sadeep Madurange <sadeep@asciimx.com> | 2025-12-30 22:36:53 +0800 |
|---|---|---|
| committer | Sadeep Madurange <sadeep@asciimx.com> | 2025-12-30 22:36:53 +0800 |
| commit | b65dabd3a2c83404e6612ce881ba4ad32e0b1ccd (patch) | |
| tree | d0f06062fe0178b4539e3103b558427037bb6afc /_log | |
| parent | 9fec793abe0a73e5cd502a1d1e935e2413b85079 (diff) | |
| download | www-b65dabd3a2c83404e6612ce881ba4ad32e0b1ccd.tar.gz | |
Readme.
Diffstat (limited to '_log')
| -rw-r--r-- | _log/search-with-cgi.md | 98 |
1 files changed, 38 insertions, 60 deletions
diff --git a/_log/search-with-cgi.md b/_log/search-with-cgi.md
index 2578878..0109294 100644
--- a/_log/search-with-cgi.md
+++ b/_log/search-with-cgi.md
@@ -4,47 +4,42 @@ date: 2025-12-29
 layout: post
 ---
 
-Need a way to search site--number of articles are growing.
+Number of articles on the site are growing. Need a way to search site.
 
-Searching site client-side using the RSS feed and JavaScript is not an option--
-bloats the feed and breaks the site for Lynx and other text browsers.
+Searching the RSS feed client-side using JavaScript is not an option. That
+would make the feed much heavier and break the site for text-based web browsers
+like Lynx.
 
-Perl's great for text processing--especially regex work. Few lines of Perl
-could do a regex search and send the result back via CGI. OpenBSD httpd speaks
-CGI, Perl and slowcgi are in the base systems. No dependencies. Works on every
-conceivable browser.
+Not gonna use an inverted index--More than an evening's effort, especially if I
+want partial matching. I want partial matching.
 
-Perl: traverse the directory with File::Find recursively. If search text is
-found grab the file name, title and up to 50 chars of the first paragraph to
-include in the search result.
+Few lines of Perl could do a regex search and send the result back via CGI.
+OpenBSD httpd speaks CGI. Perl and slowcgi are in the base system. No
+dependencies.
+
+Perl: traverse directory with File::Find. If search text is found grab the file
+name, title and up to 50 chars from the first paragraph to include in the
+search result.
 
 ```
-find(sub {
-    if (open my $fh, '<', $_) {
-        my $content = do { local $/; <$fh> };
-        close $fh;
-
-        if ($content =~ /\Q$search_text\E/i) {
-            my ($title) = $content =~ /<title>(.*?)<\/title>/is;
-            $title ||= $File::Find::name;
-            my ($p_content) = $content =~ /<p[^>]*>(.*?)<\/p>/is;
-            my $snippet = $p_content || "";
-            $snippet =~ s/<[^>]*>//g;
-            $snippet =~ s/\s+/ /g;
-            $snippet = substr($snippet, 0, 50);
-            $snippet .= "..." if length($p_content || "") > 50;
-
-            push @results, {
-                path => $File::Find::name,
-                title => $title,
-                snippet => $snippet
-            };
-        }
-    }
+find({
+    wanted => sub {
+        return unless -f $_ && $_ eq 'index.html';
+        # ... file reading ...
+        if ($content =~ /\Q$search_text\E/i) {
+            # Extract title, snippet
+            push @results, {
+                path => $File::Find::name,
+                title => $title,
+                snippet => $snippet
+            };
+        }
+    },
+    follow => 0,
 }, $dir);
 ```
 
-Don't need the Perl CGI module, httpd sets QUERY_STRING for the slowcgi script:
+httpd sets the search text in QUERY_STRING env. Don't need Perl's CGI module.
 
 ```
 my %params;
@@ -58,38 +53,21 @@ if ($ENV{QUERY_STRING}) {
 }
 ```
 
-Run the script as www user. Permissions: 554 (read + execute).
+Security.
 
-Running in OpenBSD chroot: Check Perl's dynamic object dependencies:
+ReDOS, XSS, command injection, symlink attacks. Did I miss anything? Probably.
 
-```
-$ ldd $(which perl)
-/usr/bin/perl:
-    Start            End              Type  Open Ref GrpRef Name
-    000008797e8e6000 000008797e8eb000 exe   1    0   0      /usr/bin/perl
-    0000087c1ffe5000 0000087c20396000 rlib  0    1   0      /usr/lib/libperl.so.26.0
-    0000087bf4508000 0000087bf4539000 rlib  0    2   0      /usr/lib/libm.so.10.1
-    0000087b9e801000 0000087b9e907000 rlib  0    2   0      /usr/lib/libc.so.102.0
-    0000087bba182000 0000087bba182000 ld.so 0    1   0      /usr/libexec/ld.so
-```
+ReDOS: sanitized user input, length-limit search text, quote metacharacters
+with `\Q$search_text\E`.
 
-Copy them over to chroot. Now should have /var/www/usr/bin/perl,
-/usr/lib/libperl.so.26.0, and so on.
+XSS: sanitized user input. Escaped HTML.
 
-Troubleshooting: look for issues in logs or try executing the script in chroot:
+Command injection: no exec()/system() calls. Non-privileged user (www).
 
-```
-$ cat /var/log/messages | grep slowcgi
-# chroot /var/www/ htdocs/path/to/script/script.cgi
-```
-The last command exposes any missing Perl modules in chroot and where to find
-them. Copy them over as well.
+Symlink attacks: File::Find don't follow symlinks (follow => 0). chroot.
 
-```
-location "/cgi-bin/*" {
-    fastcgi socket "/run/slowcgi.sock"
-}
-```
+Access controls: files (444), directories and CGI script: 554.
 
-in httpd.conf routes queries to slowcgi.
+Verdict: O(n) speed. Works on every conceivable browser. Good enough.
 
+Commit: [9fec793](https://git.asciimx.com/www/commit/?h=term&id=9fec793abe0a73e5cd502a1d1e935e2413b85079)
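
Note on the new Security section: a minimal sketch of the input handling it describes (length limit, `\Q...\E` quoting, HTML escaping). This is not the code from this commit. The helper names `url_decode`, `clean_query`, and `escape_html`, and the 64-character limit, are hypothetical and chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical limit; the post does not state the actual value.
my $MAX_QUERY_LEN = 64;

# Decode one URL-encoded query value ("+" means space, %XX is a byte).
sub url_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Decode, drop control characters, trim, and cap the length of the search text.
sub clean_query {
    my ($raw) = @_;
    my $q = url_decode($raw // '');
    $q =~ s/[[:cntrl:]]//g;
    $q =~ s/^\s+|\s+$//g;
    return substr($q, 0, $MAX_QUERY_LEN);
}

# Escape the five characters that matter in HTML text and attribute values.
sub escape_html {
    my ($s) = @_;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
               '"' => '&quot;', "'" => '&#39;');
    $s =~ s/([&<>"'])/$ent{$1}/g;
    return $s;
}

my $query = clean_query('a.b%2A%3Cscript%3E');
my $re    = qr/\Q$query\E/i;    # metacharacters now match literally

print "literal match\n" if 'xA.B*<SCRIPT>y' =~ $re;
print escape_html($query), "\n";
```

Because `\Q...\E` turns the whole search text into a literal, the pattern itself cannot trigger catastrophic backtracking. The length cap then mainly bounds the work done per file.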
