---
title: Site search using Perl + CGI
date: 2025-12-29
layout: post
---

Need a way to search the site: the number of articles is growing.

Searching site client-side using the RSS feed and JavaScript is not an option--
bloats the feed and breaks the site for Lynx and other text browsers.

Perl's great for text processing, especially regex work. A few lines of Perl
could do a regex search and send the result back via CGI. OpenBSD httpd speaks
FastCGI, and slowcgi bridges that to plain CGI. httpd, slowcgi and Perl are all
in the base system. No dependencies. Works on every conceivable browser.

Perl: traverse the directory with File::Find recursively. If search text is
found grab the file name, title and up to 50 chars of the first paragraph to
include in the search result.

```
use File::Find;

find(sub {
    return unless -f $_;                   # skip directories
    open my $fh, '<', $_ or return;        # skip unreadable files
    my $content = do { local $/; <$fh> };  # slurp the whole file
    close $fh;

    # \Q...\E makes the search text literal; /i ignores case
    return unless $content =~ /\Q$search_text\E/i;

    my ($title) = $content =~ /<title>(.*?)<\/title>/is;
    $title ||= $File::Find::name;

    # First paragraph, tags stripped and whitespace collapsed
    my ($snippet) = $content =~ /<p[^>]*>(.*?)<\/p>/is;
    $snippet //= "";
    $snippet =~ s/<[^>]*>//g;
    $snippet =~ s/\s+/ /g;

    # Truncate to 50 chars; add "..." only if the cleaned text was cut
    $snippet = substr($snippet, 0, 50) . "..." if length($snippet) > 50;

    push @results, {
        path    => $File::Find::name,
        title   => $title,
        snippet => $snippet,
    };
}, $dir);
```
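
Sending the results back is a matter of printing a header, a blank line and some HTML. A minimal sketch, assuming the `@results` structure above. `escape_html` and `render_results` are hypothetical helper names, and mapping the file path to a URL is left out because it depends on the site layout:

```perl
use strict;
use warnings;

# Hypothetical helper: escape untrusted text before putting it into HTML.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: build the whole CGI response as one string.
# Each result is a hash ref with path, title and snippet keys.
sub render_results {
    my ($query, @results) = @_;
    my $out = "Content-Type: text/html; charset=utf-8\n\n";
    $out .= "<!DOCTYPE html>\n<title>Search: " . escape_html($query) . "</title>\n";
    $out .= "<p>No results.</p>\n" unless @results;
    for my $r (@results) {
        # Title and snippet were cut from the site's own HTML, so they are
        # assumed to be entity-encoded already; the path is escaped.
        my $href = escape_html($r->{path});
        $out .= qq{<p><a href="$href">$r->{title}</a><br>$r->{snippet}</p>\n};
    }
    return $out;
}

# Example with made-up data
print render_results('perl', {
    path    => '/log/search-with-cgi.html',
    title   => 'Site search using Perl + CGI',
    snippet => 'Need a way to search site...',
});
```

The query is the only user-supplied value echoed back, so it is the one that must be escaped.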

No need for the Perl CGI module. The query arrives in the QUERY_STRING
environment variable:

```
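my %params;
if ($ENV{QUERY_STRING}) {
    foreach my $pair (split /&/, $ENV{QUERY_STRING}) {
        # Limit of 2 keeps any '=' inside the value intact
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value //= "";    # "?q" with no '=' gives an empty value
        for ($key, $value) {
            tr/+/ /;      # '+' encodes a space
            s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;  # %XX escapes
        }
        $params{$key} = $value;
    }
}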
```
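
A quick way to check the decoding without a web server is to wrap the same loop in a function and feed it a sample string. This is only a sketch: `parse_query` and the `q` parameter name are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the decoding loop, so it can be tried offline.
sub parse_query {
    my ($query_string) = @_;
    my %params;
    for my $pair (split /&/, $query_string // '') {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value //= '';
        for ($key, $value) {
            tr/+/ /;                             # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr hex $1/eg;   # decode %XX escapes
        }
        $params{$key} = $value;
    }
    return %params;
}

# "q" is an assumed name for the search box parameter.
my %params = parse_query('q=perl+%26+cgi&page=2');
print "$params{q}\n";     # perl & cgi
print "$params{page}\n";  # 2
```

Splitting on `&` happens before the `%26` escape is decoded, so an ampersand inside the search text does not break the pair apart.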

Run the script as the www user. Permissions: 554 (read + execute for owner and
group, read-only for others).

Running in OpenBSD chroot: Check Perl's dynamic object dependencies:

```
$ ldd $(which perl)
/usr/bin/perl:
        Start            End              Type  Open Ref GrpRef Name
        000008797e8e6000 000008797e8eb000 exe   1    0   0      /usr/bin/perl
        0000087c1ffe5000 0000087c20396000 rlib  0    1   0      /usr/lib/libperl.so.26.0
        0000087bf4508000 0000087bf4539000 rlib  0    2   0      /usr/lib/libm.so.10.1
        0000087b9e801000 0000087b9e907000 rlib  0    2   0      /usr/lib/libc.so.102.0
        0000087bba182000 0000087bba182000 ld.so 0    1   0      /usr/libexec/ld.so
```

Copy them into the chroot, keeping the same paths. Should now have
/var/www/usr/bin/perl, /var/www/usr/lib/libperl.so.26.0, and so on.
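
The copying can be scripted from the ldd output. A rough sketch, assuming the default /var/www chroot and the ldd layout shown above, where the path is the last column. `chroot_targets` and the `--copy` switch are hypothetical names:

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Copy qw(copy);
use File::Path qw(make_path);

# Hypothetical helper: pull the paths out of OpenBSD-style ldd output and
# pair each one with its location inside the chroot.
sub chroot_targets {
    my ($ldd_output, $chroot) = @_;
    my @pairs;
    for my $line (split /\n/, $ldd_output) {
        # Last field must be an absolute path; skips the "binary:" line
        # and the column header.
        next unless $line =~ m{\s(/\S+)\s*$};
        push @pairs, [ $1, "$chroot$1" ];
    }
    return @pairs;
}

# Only touches the filesystem when asked to; run as root.
if (@ARGV && $ARGV[0] eq '--copy') {
    for my $pair (chroot_targets(scalar `ldd /usr/bin/perl`, '/var/www')) {
        my ($src, $dst) = @$pair;
        make_path(dirname($dst));
        copy($src, $dst) or die "copy $src: $!";
        chmod((stat $src)[2] & 07777, $dst);   # copy() does not keep modes
        print "$src -> $dst\n";
    }
}
```

The libraries carry version numbers, so the copies likely need refreshing after a system upgrade.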

Troubleshooting: look for issues in logs or try executing the script in chroot:

```
$ grep slowcgi /var/log/messages
# chroot /var/www/ htdocs/path/to/script/script.cgi
```

The last command reveals any Perl modules missing from the chroot, along with
the @INC paths Perl searched for them. Copy them over as well.
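
Perl records every module file it has loaded in `%INC`, which gives a list to copy up front instead of finding them one error at a time. A sketch to run outside the chroot, assuming the script only loads File::Find. `loaded_module_paths` is a hypothetical name:

```perl
use strict;
use warnings;
use File::Find;   # load whatever the CGI script loads

# Hypothetical helper: paths of every module file loaded so far.
sub loaded_module_paths {
    my @paths = grep { defined } values %INC;
    return sort @paths;
}

print "$_\n" for loaded_module_paths();
```

`%INC` lists `.pm` files only. Modules with compiled parts may also need their shared objects, so the chroot test run is still worth doing afterwards.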

```
location "/cgi-bin/*" {
    fastcgi socket "/run/slowcgi.sock"
}
```

This location block, inside a server block in httpd.conf, routes /cgi-bin/
requests to slowcgi.