From 6101dc4cb4f0e018fad6f067e8f2448a453433b7 Mon Sep 17 00:00:00 2001
From: Sadeep Madurange
Date: Sun, 7 Dec 2025 17:27:22 +0800
Subject: Bumblebee

---
 _projects/bumblebee.md | 52 ++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 38 insertions(+), 14 deletions(-)

(limited to '_projects')

diff --git a/_projects/bumblebee.md b/_projects/bumblebee.md
index ef38b2f..a8fa5eb 100644
--- a/_projects/bumblebee.md
+++ b/_projects/bumblebee.md
@@ -6,23 +6,47 @@ thumbnail: thumb.png
 layout: post
 ---
 
-Bumblebee is a web browser that converts browser sessions into C# scripts for
-playback. It eliminates the need for authoring browser automation scripts.
+Bumblebee is a tool I built for one of my employers to automate the generation
+of web scraping scripts.
 
-Bumblebee is a Windows Forms application written in C#. Web content is rendered
-by the embedded Microsoft Edge browser (via WebView). The text editor on the
-right is Scintilla.NET. Users can
-override the generated script at any point during the session. The users can
-configure Bumblebee to debounce events, ignore hidden elements, etc.
-
-Bumblebee works by injecting a custom JavaScript program that tracks user
-interactions. The tracker intercepts and sends them to the Bumblebee backend as
-events for analysis. In addition to the front-end events, Bumblebee also
-intercepts events internal to the web browser, which it then interprets to
-generate C# code for the Selenium WebDriver in real time.
+In 2024, we were tasked with collecting market data using various methods,
+including scraping data from authorized websites for traders' use.
+
+Manual authoring of such scripts took time. The scripts were often brittle due
+to the complex nature of modern websites, and they lacked optimizations such as
+bypassing the UI and retrieving the data files directly when possible, which
+would have significantly reduced our compute costs.
+
+To alleviate these challenges, I, with the help of a colleague, Andy Zhang,
+built Bumblebee: a C# Windows Forms desktop application that uses Microsoft
+Edge WebView2 for
+rendering web content.
+
+Bumblebee works by injecting a custom JavaScript program that intercepts
+client-side events and sends them to Bumblebee for analysis. In addition to
+front-end events, Bumblebee also captures internal browser events, which it
+then interprets to generate code in real time. Note that we developed Bumblebee
+before the advent of now-popular LLMs. Bumblebee reliably handles dynamic
+websites and pop-ups. The user can access developer tools, override any part of
+the script at any point during the session (using the embedded Scintilla.NET editor), debounce
+events, and block hidden elements and scripts.
+
+Before settling on a desktop application, we contemplated a browser extension.
+We decided against that because we didn't want the browser vendor to dictate
+Bumblebee's capabilities. Furthermore, the company's security policy prohibited
+browser extensions, complicating its deployment. The initial prototype used a
+C# wrapper of the Chromium project instead of WebView. Its incoherent API
+design led us to toss it in favour of WebView, which presented a well-designed
+API that interfaced seamlessly with Windows Forms.
+
+Bumblebee reduced the time we spent on authoring scripts from hours to a few
+minutes. Since the rules for code generation were written and optimized by
+experts in web technologies, the output was more robust.
--
cgit v1.2.3