Diffstat (limited to '_site/log/search-with-cgi/index.html')
| -rw-r--r-- | _site/log/search-with-cgi/index.html | 133 |
1 files changed, 0 insertions, 133 deletions
diff --git a/_site/log/search-with-cgi/index.html b/_site/log/search-with-cgi/index.html
deleted file mode 100644
index 71b9f23..0000000
--- a/_site/log/search-with-cgi/index.html
+++ /dev/null
@@ -1,133 +0,0 @@
-<!DOCTYPE html>
-<html>
- <head>
- <meta charset="utf-8">
- <title>Site search using Perl + CGI</title>
-
- <head>
- <meta charset="utf-8">
- <meta name="viewport" content="width=device-width, initial-scale=1">
- <title>Site search using Perl + CGI</title>
- <link rel="stylesheet" href="/assets/css/main.css">
- <link rel="stylesheet" href="/assets/css/skeleton.css">
-</head>
-
-
-
- </head>
- <body>
-
- <div id="nav-container" class="container">
- <ul id="navlist" class="left">
-
- <li >
- <a href="/" class="link-decor-none">hme</a>
- </li>
- <li class="active">
- <a href="/log/" class="link-decor-none">log</a>
- </li>
- <li >
- <a href="/projects/" class="link-decor-none">poc</a>
- </li>
- <li >
- <a href="/about/" class="link-decor-none">abt</a>
- </li>
- <li>
- <a href="/cgi-bin/find.cgi" class="link-decor-none">sws</a>
- </li>
- <li>
- <a href="/feed.xml" class="link-decor-none">rss</a>
- </li>
- </ul>
-</div>
-
-
-
- <main>
- <div class="container">
- <div class="container-2">
- <h2 class="center" id="title">SITE SEARCH USING PERL + CGI</h2>
- <h6 class="center">29 DECEMBER 2025</h5>
- <br>
- <div class="twocol justify"><p>Number of articles on the site are growing. Need a way to search site.</p>
-
-<p>Searching the RSS feed client-side using JavaScript is not an option. That
-would make the feed much heavier and break the site for text-based web browsers
-like Lynx.</p>
-
-<p>Not gonna use an inverted index–More than an evening’s effort, especially if I
-want partial matching. I want partial matching.</p>
-
-<p>Few lines of Perl could do a regex search and send the result back via CGI.
-OpenBSD httpd speaks CGI. Perl and slowcgi are in the base system. No
-dependencies.</p>
-
-<p>Perl: traverse directory with File::Find. If search text is found grab the file
-name, title and up to 50 chars from the first paragraph to include in the
-search result.</p>
-
-<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>find({
- wanted => sub {
- return unless -f $_ && $_ eq 'index.html';
- # ... file reading ...
- if ($content =~ /\Q$search_text\E/i) {
- # Extract title, snippet
- push @results, {
- path => $File::Find::name,
- title => $title,
- snippet => $snippet
- };
- }
- },
- follow => 0,
-}, $dir);
-</code></pre></div></div>
-
-<p>httpd sets the search text in QUERY_STRING env. Don’t need Perl’s CGI module.</p>
-
-<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>my %params;
-if ($ENV{QUERY_STRING}) {
- foreach my $pair (split /&/, $ENV{QUERY_STRING}) {
- my ($key, $value) = split /=/, $pair;
- $value =~ tr/+/ /;
- $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
- $params{$key} = $value;
- }
-}
-</code></pre></div></div>
-
-<p>Security.</p>
-
-<p>ReDOS, XSS, command injection, symlink attacks. Did I miss anything? Probably.</p>
-
-<p>ReDOS: sanitized user input, length-limit search text, quote metacharacters
-with <code class="language-plaintext highlighter-rouge">\Q$search_text\E</code>.</p>
-
-<p>XSS: sanitized user input. Escaped HTML.</p>
-
-<p>Command injection: no exec()/system() calls. Non-privileged user (www).</p>
-
-<p>Symlink attacks: File::Find don’t follow symlinks (follow => 0). chroot.</p>
-
-<p>Access controls: files (444), directories and CGI script: 554.</p>
-
-<p>Verdict: O(n) speed. Works on every conceivable browser. Good enough.</p>
-
-<p>Commit: <a href="https://git.asciimx.com/www/commit/?h=term&id=9fec793abe0a73e5cd502a1d1e935e2413b85079">9fec793</a></p>
-</div>
- <p class="post-author right">by W. D. Sadeep Madurange</p>
- </div>
- </div>
- </main>
-
- <div class="footer">
- <div class="container">
- <div class="twelve columns right container-2">
- <p id="footer-text">© ASCIIMX - 2025</p>
- </div>
- </div>
-</div>
-
-
- </body>
-</html>
