SITE SEARCH USING PERL + CGI
29 DECEMBER 2025
Need a way to search the site: the number of articles is growing.
Searching client-side with the RSS feed and JavaScript is not an option: it
bloats the feed and breaks the site for Lynx and other text browsers.
Perl is great for text processing, especially regex work. A few lines of Perl
can do a regex search and send the result back via CGI. OpenBSD httpd speaks
FastCGI and slowcgi bridges it to plain CGI; httpd, slowcgi and Perl are all in
the base system. No dependencies. Works in every conceivable browser.
Perl: traverse the directory recursively with File::Find. If the search text
is found, grab the file name, title and up to 50 chars of the first paragraph
to include in the search result.
    find(sub {
        return unless -f $_;                   # skip directories and special files
        open my $fh, '<', $_ or return;        # skip unreadable files
        my $content = do { local $/; <$fh> };  # slurp the whole file
        close $fh;
        # \Q...\E makes the user's text a literal, not a regex
        return unless $content =~ /\Q$search_text\E/i;

        my ($title) = $content =~ /<title>(.*?)<\/title>/is;
        $title ||= $File::Find::name;          # fall back to the path
        my ($p_content) = $content =~ /<p[^>]*>(.*?)<\/p>/is;
        my $text = $p_content // "";
        $text =~ s/<[^>]*>//g;                 # strip inline tags
        $text =~ s/\s+/ /g;                    # collapse whitespace
        my $snippet = substr($text, 0, 50);
        $snippet .= "..." if length($text) > 50;  # measured after stripping tags
        push @results, {
            path    => $File::Find::name,
            title   => $title,
            snippet => $snippet,
        };
    }, $dir);
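The collected results still have to go back to the browser. A minimal sketch of that step, assuming results shaped like the hashes pushed above; the helper names escape_html and render_results are made up for illustration, and the markup is just one possible layout. It also assumes the file path can be used as a link, which depends on how the site maps files to URLs.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so the entities below survive
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: turn the query and result hashes into a CGI response
# (header block, blank line, then the HTML body).
sub render_results {
    my ($query, @results) = @_;
    my $out = "Content-Type: text/html; charset=utf-8\r\n\r\n";
    $out .= "<h1>Results for " . escape_html($query) . "</h1>\n<ul>\n";
    for my $r (@results) {
        $out .= sprintf qq{<li><a href="%s">%s</a>: %s</li>\n},
            escape_html($r->{path}), escape_html($r->{title}),
            escape_html($r->{snippet});
    }
    $out .= "</ul>\n";
    return $out;
}

# Sample data in the same shape as the entries in @results.
my @sample = ({ path => '/a.html', title => 'A <b>', snippet => 'x & y...' });
my $page = render_results('a<b', @sample);
print $page;
```

Escaping the query matters most, since it comes straight from the user. Titles and snippets lifted from HTML may already contain entities; escaping them again is safe but shows the entity literally.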
No need for the Perl CGI module: httpd passes QUERY_STRING to the script via slowcgi:
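    my %params;
    if ($ENV{QUERY_STRING}) {
        foreach my $pair (split /&/, $ENV{QUERY_STRING}) {
            my ($key, $value) = split /=/, $pair, 2;   # limit 2 keeps '=' inside the value
            next unless defined $key && length $key;   # skip empty pairs
            $value //= "";                             # "?q" without '=' gives an empty value
            for ($key, $value) {
                tr/+/ /;                               # '+' encodes a space
                s/%([a-fA-F0-9]{2})/chr(hex($1))/eg;   # decode %XX escapes
            }
            $params{$key} = $value;
        }
    }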
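To try the decoding outside the web server, the same idea can be wrapped in a function. This is only a sketch: the name parse_query and the sample query string are made up. It splits each pair on the first '=' only and decodes keys as well as values.

```perl
use strict;
use warnings;

# Hypothetical wrapper: takes a raw query string, returns a hash of
# decoded parameters.
sub parse_query {
    my ($query) = @_;
    my %params;
    for my $pair (split /&/, $query // '') {
        my ($key, $value) = split /=/, $pair, 2;   # split on the first '=' only
        next unless defined $key && length $key;   # skip empty pairs ("a=1&&b=2")
        $value //= '';                             # bare key, no '='
        for ($key, $value) {
            tr/+/ /;                               # '+' encodes a space
            s/%([0-9a-fA-F]{2})/chr(hex($1))/eg;   # decode %XX escapes
        }
        $params{$key} = $value;
    }
    return %params;
}

my %p = parse_query('q=perl+cgi&path=a%2Fb%3Dc&flag');
printf "%s=%s\n", $_, $p{$_} for sort keys %p;
```

In the CGI script the call would be parse_query($ENV{QUERY_STRING}), with the search text read from whichever parameter the search form uses.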
Run the script as the www user. Permissions: 554 (read + execute for owner and
group, read-only for everyone else).
Running in the OpenBSD chroot: check Perl’s dynamic object dependencies:
    $ ldd $(which perl)
    /usr/bin/perl:
            Start            End              Type  Open Ref GrpRef Name
            000008797e8e6000 000008797e8eb000 exe   1    0   0      /usr/bin/perl
            0000087c1ffe5000 0000087c20396000 rlib  0    1   0      /usr/lib/libperl.so.26.0
            0000087bf4508000 0000087bf4539000 rlib  0    2   0      /usr/lib/libm.so.10.1
            0000087b9e801000 0000087b9e907000 rlib  0    2   0      /usr/lib/libc.so.102.0
            0000087bba182000 0000087bba182000 ld.so 0    1   0      /usr/libexec/ld.so
Copy them into the chroot, keeping the same paths under /var/www. The result
should be /var/www/usr/bin/perl, /var/www/usr/lib/libperl.so.26.0, and so on.
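The copy step can be derived from the ldd output instead of typed by hand. A rough sketch, assuming the chroot is /var/www and that ldd keeps the column layout shown above; the helper name chroot_copy_plan is made up. It prints cp commands for review rather than running anything.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the absolute paths out of OpenBSD ldd output and
# pair each one with its location under the chroot.
sub chroot_copy_plan {
    my ($ldd_output, $chroot) = @_;
    my @plan;
    for my $line (split /\n/, $ldd_output) {
        # Data rows start with two hex addresses; the path is the last field.
        # The "Start End ..." header and the "/usr/bin/perl:" line do not match.
        next unless $line =~ /^\s*[0-9a-f]+\s+[0-9a-f]+\s.*\s(\/\S+)\s*$/;
        push @plan, [ $1, $chroot . $1 ];
    }
    return @plan;
}

# A shortened copy of the ldd output from this post.
my $sample = <<'END';
/usr/bin/perl:
Start End Type Open Ref GrpRef Name
000008797e8e6000 000008797e8eb000 exe 1 0 0 /usr/bin/perl
0000087c1ffe5000 0000087c20396000 rlib 0 1 0 /usr/lib/libperl.so.26.0
0000087bba182000 0000087bba182000 ld.so 0 1 0 /usr/libexec/ld.so
END

# Print the copy commands so the plan can be checked before running it.
for my $pair (chroot_copy_plan($sample, '/var/www')) {
    printf "cp -p %s %s\n", @$pair;
}
```

The target directories (/var/www/usr/bin, /var/www/usr/lib, /var/www/usr/libexec) have to exist before the copies will succeed.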
Troubleshooting: look for issues in the logs or try executing the script in the
chroot:
    $ grep slowcgi /var/log/messages
    # chroot /var/www/ htdocs/path/to/script/script.cgi
The last command reports any Perl modules missing from the chroot, along with
the paths Perl searched for them. Copy them over as well.
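Instead of discovering missing modules one error at a time, Perl can be asked outside the chroot which module files a script loads. A small sketch, assuming the search script needs only strict, warnings and File::Find; %INC is Perl's built-in map from module name to the file it was loaded from.

```perl
use strict;
use warnings;
use File::Find;   # the module the search script depends on

# %INC maps each loaded module (e.g. "File/Find.pm") to the file it came from.
# Each file listed here needs to exist at the same path inside the chroot.
my @needed = grep { defined } map { $INC{$_} } sort keys %INC;
print "$_\n" for @needed;
```

Modules with compiled parts also load shared objects that %INC does not list, so treat the output as a starting point rather than a complete inventory.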
In httpd.conf, route queries to slowcgi. The socket path is relative to the
chroot, so it refers to /var/www/run/slowcgi.sock:
    location "/cgi-bin/*" {
        fastcgi socket "/run/slowcgi.sock"
    }