<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Bumblebee: browser automation</title>
<link rel="stylesheet" href="/assets/css/main.css">
<link rel="stylesheet" href="/assets/css/skeleton.css">
</head>
<body>
<div id="nav-container" class="container">
<ul id="navlist" class="left">
<li >
<a href="/" class="link-decor-none">hme</a>
</li>
<li class="active">
<a href="/log/" class="link-decor-none">log</a>
</li>
<li >
<a href="/projects/" class="link-decor-none">poc</a>
</li>
<li >
<a href="/about/" class="link-decor-none">abt</a>
</li>
<li>
<a href="/cgi-bin/find.cgi" class="link-decor-none">sws</a>
</li>
<li>
<a href="/feed.xml" class="link-decor-none">rss</a>
</li>
</ul>
</div>
<main>
<div class="container">
<div class="container-2">
<h2 class="center" id="title">BUMBLEBEE: BROWSER AUTOMATION</h2>
<h6 class="center">02 APRIL 2025</h6>
<br>
<div class="twocol justify"><p>Built with Andy Zhang for an employer. Tool to automate web scraping script
generation.</p>
<video style="max-width:100%; margin-bottom: 10px" controls="" poster="poster.png">
<source src="bee.mp4" type="video/mp4" />
</video>
<p>Manual script authoring took hours. Scripts poorly optimized, CPUs maxed
constantly, cloud costs excessive.</p>
<p>Initially considered browser extension. Desktop app won—extensions don’t give
deep event control. Company policy blocked extensions anyway.</p>
<p>First prototype: C# Win Forms + CefSharp.</p>
<p>Second prototype: C# Win Forms + WebView2. Packaging and distribution more
complex, but the API is well-designed; integrates well with Win Forms.</p>
<p>Requires the Microsoft Edge WebView2 Runtime. Portability not a concern: only need to target
controlled Windows environments. Chose WebView2 over CefSharp.</p>
<p>Embed <a href="https://github.com/desjarlais/Scintilla.NET" class="external" target="_blank" rel="noopener noreferrer">Scintilla.NET</a> editor for
overriding generated script.</p>
<p>Code generation sequence: Inject JavaScript to intercept client-side events.
Capture internal browser events (pop-ups, file downloads). Event
raised → parsed into a token → insert to list → interpret event → look up
instruction from a table → form instruction with event args → insert text to a
parallel list → run both lists through optimizer → update Scintilla editor.</p>
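<p>Rough sketch of the sequence above, in Python rather than the tool's C#. Every
name here is hypothetical: the <code>Token</code> fields, the instruction table
entries, the <code>click</code>/<code>type</code>/<code>wait_for_download</code>
templates, and the toy optimizer rule (collapse consecutive input events on the
same target) are assumptions for illustration, not the actual implementation.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical event token; field names are assumptions."""
    kind: str        # e.g. "click", "input", "download"
    target: str      # selector or file name the event refers to
    value: str = ""  # optional event argument (typed text, etc.)


# Hypothetical instruction table: event kind -> script line template.
INSTRUCTIONS = {
    "click": 'click("{target}")',
    "input": 'type("{target}", "{value}")',
    "download": 'wait_for_download("{target}")',
}


def parse(raw):
    """Parse a raw event (a dict here) into a token."""
    return Token(raw["kind"], raw["target"], raw.get("value", ""))


def emit(token):
    """Look up the instruction and fill in the event args (no escaping)."""
    template = INSTRUCTIONS[token.kind]
    return template.format(target=token.target, value=token.value)


def optimize(events, code):
    """Toy optimizer: keep only the last of consecutive input events on
    the same target. Both lists are rebuilt together so that index i in
    one always corresponds to index i in the other."""
    out_events, out_code = [], []
    for tok, line in zip(events, code):
        prev = out_events[-1] if out_events else None
        if (prev is not None and tok.kind == "input"
                and prev.kind == "input" and prev.target == tok.target):
            out_events[-1], out_code[-1] = tok, line
        else:
            out_events.append(tok)
            out_code.append(line)
    return out_events, out_code


def record(raw_events):
    """Event raised -> token -> event list -> instruction -> code list
    -> optimizer. The joined code list is what the editor would show."""
    events, code = [], []
    for raw in raw_events:
        tok = parse(raw)
        events.append(tok)
        code.append(emit(tok))
        events, code = optimize(events, code)
    return events, code


if __name__ == "__main__":
    raw = [
        {"kind": "click", "target": "#search"},
        {"kind": "input", "target": "#search", "value": "b"},
        {"kind": "input", "target": "#search", "value": "bee"},
        {"kind": "download", "target": "report.csv"},
    ]
    events, code = record(raw)
    print("\n".join(code))
```

<p>The point of the sketch: the optimizer only ever sees the two lists as a
pair, so any rewrite has to touch both at the same index.</p>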
<p>Limitation: manual overriding via Scintilla editor mid-session causes the code
list to go out of sync with the event list. Optimizer can’t handle this yet.</p>
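<p>A minimal way to at least detect the drift, assuming the code list holds one
generated line per event and the editor contents are available as plain text.
<code>first_drift</code> is a hypothetical helper, sketched in Python; it does
not fix the limitation, only reports where the two diverge.</p>

```python
def first_drift(code_list, editor_text):
    """Return the index of the first line where the editor text differs
    from the generated code list, or None if they still match."""
    editor_lines = editor_text.splitlines()
    for i, (generated, shown) in enumerate(zip(code_list, editor_lines)):
        if generated != shown:
            return i
    # All shared lines match; a length difference means lines were
    # added or removed at the end.
    if len(code_list) != len(editor_lines):
        return min(len(code_list), len(editor_lines))
    return None
```

<p>Everything before the returned index still maps one-to-one onto the event
list; everything from that index on can no longer be trusted.</p>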
<p>Note to self: need to rethink the event/text list data structures in the
context of the optimizer. Look to compilers for inspiration, maybe?</p>
</div>
<p class="post-author right">by W. D. Sadeep Madurange</p>
</div>
</div>
</main>
<div class="footer">
<div class="container">
<div class="twelve columns right container-2">
<p id="footer-text">© ASCIIMX - 2025</p>
</div>
</div>
</div>
</body>
</html>