<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>Bumblebee: browser automation</title>
    <link rel="stylesheet" href="/assets/css/main.css">
    <link rel="stylesheet" href="/assets/css/skeleton.css">



  </head>
  <body>

    <div id="nav-container" class="container">
  <ul id="navlist" class="left">
    
    <li >
      <a href="/" class="link-decor-none">hme</a>
    </li>
    <li class="active">
      <a href="/log/" class="link-decor-none">log</a>
    </li>
    <li >
      <a href="/projects/" class="link-decor-none">poc</a>
    </li>
    <li >
      <a href="/about/" class="link-decor-none">abt</a>
    </li>
    <li><a href="/feed.xml" class="link-decor-none">rss</a></li>
  </ul>
</div>



    <main>
      <div class="container">
        <div class="container-2">
          <h2 class="center" id="title">BUMBLEBEE: BROWSER AUTOMATION</h2>
          <h6 class="center">02 APRIL 2025</h6>
          <br>
          <div class="twocol justify"><p>Built with Andy Zhang for an employer. Tool to automate web scraping script
generation.</p>

<video style="max-width:100%; margin-bottom: 10px" controls="" poster="poster.png">
  <source src="bee.mp4" type="video/mp4" />
</video>

<p>Manual script authoring took hours. Scripts poorly optimized, CPUs maxed
constantly, cloud costs excessive.</p>

<p>Initially considered browser extension. Desktop app won—extensions don’t give
deep event control. Company policy blocked extensions anyway.</p>

<p>First prototype: C# WinForms + CefSharp.</p>

<p>Second prototype: C# WinForms + WebView2. Packaging and distribution are more
complex, but the API is well designed and integrates well with WinForms.</p>

<p>Microsoft Edge required. Portability is not a concern; we only need to target
controlled Windows environments. Chose WebView2 over CefSharp.</p>

<p>Embedded the <a href="https://github.com/desjarlais/Scintilla.NET" class="external" target="_blank" rel="noopener noreferrer">Scintilla.NET</a> editor for
overriding the generated script.</p>

<p>Code generation sequence: Inject JavaScript to intercept client-side events.
Capture internal browser events (pop-ups, file downloads). Event
raised → parsed into a token → inserted into the event list → event interpreted →
instruction looked up in a table → instruction formed with event args → text
inserted into a parallel code list → both lists run through the optimizer →
Scintilla editor updated.</p>
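
<p>A minimal Python sketch of the sequence above, not the actual C# code. Every
name here is hypothetical: the event kinds, the <code>INSTRUCTIONS</code> table,
the output syntax (<code>goto</code>, <code>click</code>, <code>fill</code>) and
the single optimizer rule are assumptions made for illustration only.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str        # hypothetical event kinds: "navigate", "click", "input"
    target: str      # CSS selector or URL
    value: str = ""  # typed text, if any


# Hypothetical instruction table: event kind -> script line template.
INSTRUCTIONS = {
    "navigate": 'goto("{target}")',
    "click": 'click("{target}")',
    "input": 'fill("{target}", "{value}")',
}


def parse(raw: dict) -> Token:
    """Turn a raw event payload into a token."""
    return Token(raw["type"], raw["target"], raw.get("value", ""))


def emit(token: Token) -> str:
    """Look up the instruction template and fill in the event args.

    Quotes inside values are not escaped; this is only a sketch.
    """
    template = INSTRUCTIONS[token.kind]
    return template.format(target=token.target, value=token.value)


def optimize(events, code):
    """Assumed rule: drop an input that is immediately followed by another
    input on the same target. Both lists are filtered together so they
    stay index-aligned."""
    kept_events, kept_code = [], []
    for i, (tok, line) in enumerate(zip(events, code)):
        nxt = events[i + 1] if i + 1 != len(events) else None
        superseded = (
            tok.kind == "input"
            and nxt is not None
            and nxt.kind == "input"
            and nxt.target == tok.target
        )
        if superseded:
            continue
        kept_events.append(tok)
        kept_code.append(line)
    return kept_events, kept_code


def generate(raw_events) -> str:
    """Event -> token -> event list, instruction -> code list, then optimize."""
    events, code = [], []
    for raw in raw_events:
        tok = parse(raw)
        events.append(tok)
        code.append(emit(tok))
    events, code = optimize(events, code)
    return "\n".join(code)  # the text that would be pushed to the editor
```

<p>Feeding <code>generate</code> a navigate, two inputs on the same field, and a
click yields three lines: the first input is dropped because the second
supersedes it. The parallel lists only stay in step while every code line comes
from <code>emit</code>.</p>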

<p>Limitation: manual overriding via Scintilla editor mid-session causes the code
list to go out of sync with the event list. Optimizer can’t handle this yet.</p>

<p>Note to self: need to rethink the event/text list data structures in the
context of the optimizer. Look to compilers for inspiration, maybe?</p>

</div>
          <p class="post-author right">by W. D. Sadeep Madurange</p>
        </div>
      </div>
    </main>

    <div class="footer">
  <div class="container">
    <div class="twelve columns right container-2">
      <p id="footer-text">&copy; ASCIIMX - 2025</p>
    </div>
  </div>
</div>


  </body>
</html>