---
title: Built a browser automation script synthesizer
date: 2025-04-02
layout: post
project: true
thumbnail: thumb_sm.png
---
One year at the trading firm. Web scrapers are causing problems: CPUs are
saturated and servers are stalling.
2025-02: Built a C# WinForms application to record browser sessions and
automate the synthesis of scripts.
<video style="max-width:100%; margin-bottom: 10px" controls="" poster="poster.png">
<source src="bee.mp4" type="video/mp4">
</video>
Hosted WebView2 (Edge) in the WinForms application to render web content.
Intercepted events by injecting JS hooks into web pages (client-side events) and
listening to WebView events (internal browser events). Converted intercepted
events to Selenium code by passing them through if-else blocks.
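
The if-else translation step can be illustrated roughly as follows. This is a minimal sketch in Python, not the C# code from the project: the event dictionary layout (`type`, `url`, `xpath`, `value`) and the function name `event_to_selenium` are assumptions made for illustration, and the emitted lines assume the generated script already has a `driver` and has imported `By` from Selenium.

```python
def event_to_selenium(event: dict) -> str:
    """Translate one recorded event into one line of Selenium (Python) code.

    The event schema here is hypothetical; a real recorder would carry
    more fields and more event types.
    """
    kind = event["type"]
    if kind == "navigate":
        return f"driver.get({event['url']!r})"
    elif kind == "click":
        return f"driver.find_element(By.XPATH, {event['xpath']!r}).click()"
    elif kind == "input":
        return (
            f"driver.find_element(By.XPATH, {event['xpath']!r})"
            f".send_keys({event['value']!r})"
        )
    else:
        # Unknown events are kept as comments so nothing is silently dropped.
        return f"# unhandled event: {kind}"


if __name__ == "__main__":
    recorded = [
        {"type": "navigate", "url": "https://example.com"},
        {"type": "click", "xpath": "//button[1]"},
        {"type": "input", "xpath": "//input[1]", "value": "hello"},
    ]
    for ev in recorded:
        print(event_to_selenium(ev))
```

A chain like this is easy to write but grows with every new event type, which is one reason a table-driven or AST-based design tends to scale better.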
Implemented a basic optimizer that squashes event sequences into single commands
(e.g., calendar clicks → text input) and uses heuristics to improve DOM
addressing (xpath, id, element).
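
A rough sketch of what such a pass might look like, in Python. Everything here is an assumption for illustration: the recorder is assumed to tag date-picker clicks with `widget: "calendar"` and to log a `change` event carrying the final field value, and the locator heuristic shown (prefer a non-empty `id`, fall back to XPath) is only one plausible rule, not necessarily the one the tool used.

```python
def squash_calendar_clicks(events: list[dict]) -> list[dict]:
    """Collapse a run of calendar clicks plus the resulting change event
    into a single text-input event on the date field."""
    out: list[dict] = []
    for ev in events:
        if ev["type"] == "change" and ev.get("source") == "calendar":
            # Drop the picker clicks that led to this value.
            while (
                out
                and out[-1]["type"] == "click"
                and out[-1].get("widget") == "calendar"
            ):
                out.pop()
            out.append(
                {"type": "input", "target": ev["target"], "value": ev["value"]}
            )
        else:
            out.append(ev)
    return out


def best_locator(element: dict) -> tuple[str, str]:
    """Pick an addressing strategy: a non-empty id is usually more stable
    than a positional XPath, so prefer it when present."""
    if element.get("id"):
        return ("id", element["id"])
    return ("xpath", element["xpath"])


if __name__ == "__main__":
    session = [
        {"type": "click", "target": "#search"},
        {"type": "click", "target": ".next-month", "widget": "calendar"},
        {"type": "click", "target": ".day-14", "widget": "calendar"},
        {"type": "change", "source": "calendar", "target": "#date",
         "value": "2025-02-14"},
    ]
    for ev in squash_calendar_clicks(session):
        print(ev)
```

Replaying one text input instead of several picker clicks makes the generated script shorter and less sensitive to the calendar widget's layout.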
Integrated the Scintilla.NET editor to give the user more control over the
generated script.
Events and code are stored in two linear lists. Without ASTs, mid-session
manual edits desync the lists and block the optimizer. As a workaround, edit
scripts only at the end of recording.
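
The desync can be shown with a toy example in Python. This assumes, purely for illustration, that event `i` is matched to code line `i` by position; the list contents and the helper `code_for_event` are hypothetical.

```python
# Two parallel lists: event i is assumed to have produced code line i.
events = ["navigate", "click", "input"]
code = ["driver.get(url)", "button.click()", "field.send_keys(text)"]


def code_for_event(code_lines: list[str], index: int) -> str:
    """Positional lookup, valid only while both lists stay aligned."""
    return code_lines[index]


before = code_for_event(code, 1)  # the line generated for the click event

# The user inserts a line by hand in the middle of the session.
code.insert(1, "time.sleep(2)")

after = code_for_event(code, 1)  # now points at the hand-written line

if __name__ == "__main__":
    print(before)
    print(after)
    print(len(events) == len(code))
```

After the manual insert, index 1 no longer refers to the click, so any pass that rewrites code by event position would touch the wrong line. A tree that ties each node to its source events would avoid relying on position.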
2025-03: Shipped the first iteration. Began work on a key optimization:
bypassing the browser and grabbing data files directly when possible.
2025-04: Abandoned project. Left the firm.