Diffstat (limited to '_site/log/bumblebee')
| -rw-r--r-- | _site/log/bumblebee/index.html | 50 |
1 files changed, 24 insertions, 26 deletions
diff --git a/_site/log/bumblebee/index.html b/_site/log/bumblebee/index.html
index f1881aa..5f5c3b0 100644
--- a/_site/log/bumblebee/index.html
+++ b/_site/log/bumblebee/index.html
@@ -3,7 +3,7 @@
 <head>
 <meta charset="utf-8">
 <meta name="viewport" content="width=device-width, initial-scale=1">
- <title>Bumblebee: browser automation</title>
+ <title>Bumblebee: web script synthesizer</title>
 <link rel="stylesheet" href="/assets/css/main.css">
 <link rel="stylesheet" href="/assets/css/skeleton.css">
 </head>
@@ -37,44 +37,42 @@
 <main>
 <div class="container">
 <div class="container-2">
- <h2 class="center" id="title">BUMBLEBEE: BROWSER AUTOMATION</h2>
+ <h2 class="center" id="title">BUMBLEBEE: WEB SCRIPT SYNTHESIZER</h2>
 <h5 class="center">02 APRIL 2025</h5>
 <br>
- <div class="twocol justify"><p>Built with Andy Zhang for an employer. Tool to automate web scraping script
-generation.</p>
+ <div class="twocol justify"><p>Work project. Browser session-to-code conversion.</p>
 
 <video style="max-width:100%; margin-bottom: 10px" controls="" poster="poster.png">
 <source src="bee.mp4" type="video/mp4" />
 </video>
 
-<p>Manual script authoring took hours. Scripts poorly optimized, CPUs maxed
-constantly, cloud costs excessive.</p>
+<p>Architecture: C# WinForms host, embedded browser, code editor. Browser
+extension rejected due to security policy and shallow event control.</p>
 
-<p>Initially considered browser extension. Desktop app won—extensions don’t give
-deep event control. Company policy blocked extensions anyway.</p>
+<p>Tool evaluation:</p>
 
-<p>First prototype: C# Win Forms + CefSharp.</p>
+<ul>
+ <li>CefSharp: Discarded. API lacked elegance.</li>
+ <li>WebView2: Selected. Better WinForms integration. Hard dependency on
+Microsoft Edge–acceptable for corporate Windows environments.</li>
+</ul>
 
-<p>Second prototype: C# Win Forms + WebView2. Packaging and distribution more
-complex, but the API is well-designed; integrates well with Win Forms.</p>
+<p>Implementation:</p>
 
-<p>Microsoft Edge required. Portability not a concern, only need to target
-controlled Windows environments. Choosing WebView2 over CefSharp.</p>
+<ol>
+ <li>Interception: Injected JS hooks; internal browser event monitoring
+(pop-ups/downloads).</li>
+ <li>Transformation: Event → Token → Instruction Table → String.</li>
+ <li>Optimization: Parallel event/text lists processing; rendered
+in <a href="https://github.com/desjarlais/Scintilla.NET" class="external" target="_blank" rel="noopener noreferrer">Scintilla.NET</a></li>
+</ol>
 
-<p>Embed <a href="https://github.com/desjarlais/Scintilla.NET" class="external" toarget="_blank" rel="noopener noreferrer">Scintilla.NET</a> editor for
-overriding generated script.</p>
+<p>Bug: Manual mid-session overrides desync code/event lists, bypassing optimizer.
+Linear lists inadequate for state synchronization. Need to rethink data
+structures; look to compiler Abstract Syntax Trees (AST) for intermediate
+representation.</p>
 
-<p>Code generation sequence: Inject JavaScript to intercept client-side events.
-Capture internal browser events (pop-ups, file downloads). Event
-raised → parsed into a token → insert to list → interpret event → look up
-instruction from a table → form instruction with event args → insert text to a
-parallel list → run both lists through optimizer → update Scintilla editor.</p>
-
-<p>Limitation: manual overriding via Scintilla editor mid-session causes the code
-list to go out of sync with the event list. Optimizer can’t handle this yet.</p>
-
-<p>Note to self: need to rethink the event/text list data structures in the
-context of the optimizer–look to compilers for inspiration maybe?</p>
+<p>Verdict: Serves its purpose.</p>
 </div>
 <p class="post-author right">by W. D. Sadeep Madurange</p>
 
