From 6101dc4cb4f0e018fad6f067e8f2448a453433b7 Mon Sep 17 00:00:00 2001
From: Sadeep Madurange
Date: Sun, 7 Dec 2025 17:27:22 +0800
Subject: Bumblebee

---
 _projects/bumblebee.md              | 52 +++++++++++++++++++++++++++----------
 _site/feed.xml                      |  2 +-
 _site/posts.xml                     |  2 +-
 _site/projects/bumblebee/index.html | 48 ++++++++++++++++++++++++----------
 4 files changed, 75 insertions(+), 29 deletions(-)

diff --git a/_projects/bumblebee.md b/_projects/bumblebee.md
index ef38b2f..a8fa5eb 100644
--- a/_projects/bumblebee.md
+++ b/_projects/bumblebee.md
@@ -6,23 +6,47 @@ thumbnail: thumb.png
 layout: post
 ---

-Bumblebee is a web browser that converts browser sessions into C# scripts for
-playback. It eliminates the need for authoring browser automation scripts.
+Bumblebee is a tool I built for one of my employers to automate the generation
+of web scraping scripts.

-Bumblebee is a Windows Forms application written in C#. Web content is rendered
-by the embedded Microsoft Edge browser (via WebView). The text editor on the
-right is Scintilla.NET. Users can
-override the generated script at any point during the session. The users can
-configure Bumblebee to debounce events, ignore hidden elements, etc.
-
-Bumblebee works by injecting a custom JavaScript program that tracks user
-interactions. The tracker intercepts and sends them to the Bumblebee backend as
-events for analysis. In addition to the front-end events, Bumblebee also
-intercepts events internal to the web browser, which it then interprets to
-generate C# code for the Selenium WebDriver in real time.
+In 2024, we were tasked with collecting market data using various methods,
+including scraping data from authorized websites for traders' use.
+
+Manual authoring of such scripts took time. The scripts were often brittle due
+to the complex nature of modern websites, and they lacked optimizations such as
+bypassing the UI and retrieving the data files directly when possible, which
+would have significantly reduced our compute costs.
+
+To alleviate these challenges, I, with the help of a colleague, Andy Zhang,
+built Bumblebee: a C# Windows Forms desktop application that uses Microsoft
+Edge WebView2 for
+rendering web content.
+
+Bumblebee works by injecting a custom JavaScript program that intercepts
+client-side events and sends them to Bumblebee for analysis. In addition to
+front-end events, Bumblebee also captures internal browser events, which it
+then interprets to generate code in real time. Note that we developed Bumblebee
+before the advent of now-popular LLMs. Bumblebee reliably handles dynamic
+websites and pop-ups. The user can access developer tools, override any part of
+the script at any point during the session (using the embedded Scintilla.NET editor), debounce
+events, and block hidden elements and scripts.
+
+Before settling on a desktop application, we contemplated a browser extension.
+We decided against that because we didn't want the browser vendor to dictate
+Bumblebee's capabilities. Furthermore, the company's security policy prohibited
+browser extensions, complicating its deployment. The initial prototype used a
+C# wrapper of the Chromium project instead of WebView. Its incoherent API
+design led us to toss it in favour of WebView, which presented a well-designed
+API that interfaced seamlessly with Windows Forms.
+
+Bumblebee reduced the time we spent on authoring scripts from hours to a few
+minutes. Since the rules for code generation were written and optimized by
+experts in web technologies, the output was more robust.
diff --git a/_site/feed.xml b/_site/feed.xml
index 7b79eaf..ed18899 100644
--- a/_site/feed.xml
+++ b/_site/feed.xml
@@ -1 +1 @@
-Jekyll2025-12-06T21:14:11+08:00/feed.xmlASCIIMX | ArchiveWickramage Don Sadeep MadurangeHow I manage Suckless software packages2025-11-30T00:00:00+08:002025-11-30T00:00:00+08:00/archive/suckless-softwareWickramage Don Sadeep MadurangeNeo4J A* search2025-09-14T00:00:00+08:002025-09-14T00:00:00+08:00/archive/neo4j-a-star-searchWickramage Don Sadeep MadurangeMOSFETs as electronic switches2025-06-22T00:00:00+08:002025-06-22T00:00:00+08:00/archive/mosfet-switchesWickramage Don Sadeep MadurangeHow to configure ATmega328P microcontrollers to run at 3.3V and 5V2025-04-10T00:00:00+08:002025-04-10T00:00:00+08:00/archive/arduino-unoWickramage Don Sadeep MadurangeHow to set up ATSAM3X8E microcontrollers for bare-metal programming in C2024-10-05T00:00:00+08:002024-10-05T00:00:00+08:00/archive/arduino-dueWickramage Don Sadeep Madurange
\ No newline at end of file
+Jekyll2025-12-07T17:27:10+08:00/feed.xmlASCIIMX | ArchiveWickramage Don Sadeep MadurangeHow I manage Suckless software packages2025-11-30T00:00:00+08:002025-11-30T00:00:00+08:00/archive/suckless-softwareWickramage Don Sadeep MadurangeNeo4J A* search2025-09-14T00:00:00+08:002025-09-14T00:00:00+08:00/archive/neo4j-a-star-searchWickramage Don Sadeep MadurangeMOSFETs as electronic switches2025-06-22T00:00:00+08:002025-06-22T00:00:00+08:00/archive/mosfet-switchesWickramage Don Sadeep MadurangeHow to configure ATmega328P microcontrollers to run at 3.3V and 5V2025-04-10T00:00:00+08:002025-04-10T00:00:00+08:00/archive/arduino-unoWickramage Don Sadeep MadurangeHow to set up ATSAM3X8E microcontrollers for bare-metal programming in C2024-10-05T00:00:00+08:002024-10-05T00:00:00+08:00/archive/arduino-dueWickramage Don Sadeep Madurange
\ No newline at end of file
diff --git a/_site/posts.xml b/_site/posts.xml
index 68176c8..599537e 100644
--- a/_site/posts.xml
+++ b/_site/posts.xml
@@ -1 +1 @@
-Jekyll2025-12-06T21:14:11+08:00/posts.xmlASCIIMXWickramage Don Sadeep Madurange
\ No newline at end of file
+Jekyll2025-12-07T17:27:10+08:00/posts.xmlASCIIMXWickramage Don Sadeep Madurange
\ No newline at end of file
diff --git a/_site/projects/bumblebee/index.html b/_site/projects/bumblebee/index.html
index 8219448..576885f 100644
--- a/_site/projects/bumblebee/index.html
+++ b/_site/projects/bumblebee/index.html
@@ -44,24 +44,46 @@

BUMBLEBEE: BROWSER AUTOMATION

02 APRIL 2025

-Bumblebee is a web browser that converts browser sessions into C# scripts for
-playback. It eliminates the need for authoring browser automation scripts.
+Bumblebee is a tool I built for one of my employers to automate the generation
+of web scraping scripts.

-Bumblebee is a Windows Forms application written in C#. Web content is rendered
-by the embedded Microsoft Edge browser (via WebView). The text editor on the
-right is Scintilla.NET. Users can
-override the generated script at any point during the session. The users can
-configure Bumblebee to debounce events, ignore hidden elements, etc.
-
-Bumblebee works by injecting a custom JavaScript program that tracks user
-interactions. The tracker intercepts and sends them to the Bumblebee backend as
-events for analysis. In addition to the front-end events, Bumblebee also
-intercepts events internal to the web browser, which it then interprets to
-generate C# code for the Selenium WebDriver in real time.
+In 2024, we were tasked with collecting market data using various methods,
+including scraping data from authorized websites for traders’ use.
+
+Manual authoring of such scripts took time. The scripts were often brittle due
+to the complex nature of modern websites, and they lacked optimizations such as
+bypassing the UI and retrieving the data files directly when possible, which
+would have significantly reduced our compute costs.
+
+To alleviate these challenges, I, with the help of a colleague, Andy Zhang,
+built Bumblebee: a C# Windows Forms desktop application that uses Microsoft
+Edge WebView2 for
+rendering web content.
+
+Bumblebee works by injecting a custom JavaScript program that intercepts
+client-side events and sends them to Bumblebee for analysis. In addition to
+front-end events, Bumblebee also captures internal browser events, which it
+then interprets to generate code in real time. Note that we developed Bumblebee
+before the advent of now-popular LLMs. Bumblebee reliably handles dynamic
+websites and pop-ups. The user can access developer tools, override any part of
+the script at any point during the session (using the embedded Scintilla.NET editor), debounce
+events, and block hidden elements and scripts.
+
+Before settling on a desktop application, we contemplated a browser extension.
+We decided against that because we didn’t want the browser vendor to dictate
+Bumblebee’s capabilities. Furthermore, the company’s security policy prohibited
+browser extensions, complicating its deployment. The initial prototype used a
+C# wrapper of the Chromium project instead of WebView. Its incoherent API
+design led us to toss it in favour of WebView, which presented a well-designed
+API that interfaced seamlessly with Windows Forms.
+
+Bumblebee reduced the time we spent on authoring scripts from hours to a few
+minutes. Since the rules for code generation were written and optimized by
+experts in web technologies, the output was more robust.

-- cgit v1.2.3
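
The event-capture idea described in the post (a script injected into the page that intercepts client-side events, debounces them, skips hidden elements, and forwards them to the host application) might look roughly like the TypeScript sketch below. Everything in it is an assumption made for illustration: the `TrackedEvent` shape and the `coalesce` and `toMessage` helpers are hypothetical names, not taken from Bumblebee's source. WebView2 pages can pass strings to the host through `window.chrome.webview.postMessage`; whether Bumblebee uses that channel is not stated in the post.

```typescript
// Illustrative sketch only; none of these names come from Bumblebee's source.
interface TrackedEvent {
  type: string;      // DOM event name, e.g. "click" or "input"
  selector: string;  // locator for the target element
  hidden: boolean;   // whether the target was hidden when the event fired
  timestamp: number; // milliseconds
  value?: string;    // current field value, for input events
}

// Drops events on hidden elements, then collapses bursts: an event with the
// same type and selector as the most recently kept event, arriving within
// `windowMs` of it, replaces that event instead of being appended.
function coalesce(events: TrackedEvent[], windowMs: number): TrackedEvent[] {
  const result: TrackedEvent[] = [];
  for (const event of events) {
    if (event.hidden) continue;
    const last = result[result.length - 1];
    if (
      last !== undefined &&
      last.type === event.type &&
      last.selector === event.selector &&
      event.timestamp - last.timestamp <= windowMs
    ) {
      result[result.length - 1] = event; // keep only the latest event of the burst
    } else {
      result.push(event);
    }
  }
  return result;
}

// Serializes an event for the host application. In a WebView2 page, a string
// like this could be handed to window.chrome.webview.postMessage (assumed
// transport, not confirmed by the post).
function toMessage(event: TrackedEvent): string {
  return JSON.stringify({
    type: event.type,
    selector: event.selector,
    value: event.value ?? null,
  });
}
```

With a 100 ms window, three rapid `input` events on the same field collapse into the last one, so a generator on the host side would see one "type this text" step rather than one step per keystroke. Keeping the filtering in a pure function also makes the rules easy to test without a browser.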