| author | Sadeep Madurange <sadeep@asciimx.com> | 2025-12-20 21:36:45 +0800 |
|---|---|---|
| committer | Sadeep Madurange <sadeep@asciimx.com> | 2025-12-20 21:36:45 +0800 |
| commit | e15f1076b59e997108914f6a5b9b28652d323268 (patch) | |
| tree | 294b0c5da8f8410ad0aab89dd2c5d107f581e083 /_poc/bumblebee.md | |
| parent | 31616ee1b8ff316d6558a0e7c87e4bd4211c9932 (diff) | |
| download | www-e15f1076b59e997108914f6a5b9b28652d323268.tar.gz | |
Change website structure to a log.
Diffstat (limited to '_poc/bumblebee.md')
| -rw-r--r-- | _poc/bumblebee.md | 50 |
1 files changed, 0 insertions, 50 deletions
diff --git a/_poc/bumblebee.md b/_poc/bumblebee.md
deleted file mode 100644
index cb0441d..0000000
--- a/_poc/bumblebee.md
+++ /dev/null
@@ -1,50 +0,0 @@
----
-title: "Bumblebee: browser automation"
-date: 2025-04-02
-thumbnail: thumb_sm.png
-layout: post
----
-
-Bumblebee is a tool I built for one of my employers to automate the generation
-of web scraping scripts.
-
-<video style="max-width:100%; margin-bottom: 10px" controls="" poster="poster.png">
-  <source src="bee.mp4" type="video/mp4">
-</video>
-
-In 2024, we were tasked with collecting market data using various methods,
-including scraping data from authorized websites for traders' use.
-
-Manual authoring of such scripts took time. The scripts were often brittle due
-to the complexity of the modern web, and they lacked optimizations such as
-bypassing the UI and retrieving the data files directly when possible, which
-would have significantly reduced our compute costs.
-
-To alleviate these challenges, I, with the help of a colleague, Andy Zhang,
-built Bumblebee: a web browser powered by C# Windows Forms, Microsoft Edge <a
-src="https://developer.microsoft.com/en-us/microsoft-edge/webview2"
-class="external" target="_blank" rel="noopener noreferrer">WebView2</a>, and
-the <a src="https://github.com/desjarlais/Scintilla.NET" class="external"
-toarget="_blank" rel="noopener noreferrer">Scintilla.NET</a> text editor.
-
-Bumblebee works by injecting a custom JavaScript program that intercepts
-client-side events and sends them to Bumblebee for analysis. In addition to
-front-end events, Bumblebee also captures internal browser events, which it
-then interprets to generate code in real time. Note that we developed Bumblebee
-before the advent of now-popular LLMs. Bumblebee supports dynamic websites,
-pop-ups, developer tools, live manual override, event debouncing, and filtering
-hidden elements and scripts. 
-
-Before settling on a desktop application, we contemplated designing Bumblebee
-as a browser extension. We chose the desktop app because extensions don't offer
-the deep, event-based control we needed. Besides, the company's security
-policy, which prohibited browser extensions, would have complicated the
-deployment of an extension-based solution. My first prototype used a C# binding
-of the Chromium project. WebView's more intuitive API and its seamless
-integration with Windows Forms led us to choose it over the Chromium wrapper.
-
-What began as a personal side project to improve my own workflow enabled us to
-collectively improve the quality of our web scripts at a much larger scale.
-Bumblebee predictably reduced the time we spent on authoring scripts from hours
-to a few minutes.
-
