diff options
Diffstat (limited to '_log/neo4j-a-star-search.md')
| -rw-r--r-- | _log/neo4j-a-star-search.md | 41 |
1 files changed, 7 insertions, 34 deletions
diff --git a/_log/neo4j-a-star-search.md b/_log/neo4j-a-star-search.md index 9f4a7d2..3105662 100644 --- a/_log/neo4j-a-star-search.md +++ b/_log/neo4j-a-star-search.md @@ -1,25 +1,15 @@ --- -title: That time I forked Neo4J to implement A* search +title: Neo4J contribution date: 2018-03-06 layout: post --- Written in 2026, backdated to 2018. -Our vessel tracking system hit a wall. We needed shortest paths through a -13,000-vertex graph. But the system slowed to a crawl after 4,000. Kristian, -our CTO, suggested replacing Dijkstra's search with the faster A* search—an -algorithm I had encountered as an undergraduate. +Before v3.4.0, Neo4J shipped with Dijkstra's shortest path search. The +algorithm was too slow for our marine vessel tracking application. -The database we used, Neo4J, employed a plug-in system for graph algorithms—a -mature open-source project developed by experts. Forking it for our unassuming -start-up was bold. Often, application developers are reluctant to step into -lower layers. - -It took me a week to set up the toolchain, build the project, and run the -tests. I reviewed Dijkstra's implementation, separating logic from -infrastructure. I then refactored the algorithm to use the haversine function -to steer the search: +Forked and added A* search. Used the haversine function to steer the search: ``` private double computeHeuristic( @@ -43,8 +33,7 @@ private double computeHeuristic( } ``` -And revised the core search loop to update the costs when a better path is -found: +Core search loop updates costs when a better path is found: ``` private void updateCosts( @@ -59,25 +48,9 @@ private void updateCosts( } ``` -I remember celebrating a 300x speed up, and comfortably scaling to the target -13,000. - -I was even more thrilled when the upstream project accepted my changes, and -invited me to contribute. In hindsight, I'm a little surprised that my changes -were accepted in their original form. The heuristic was tightly coupled to our -use case. Decoupling it would require significant refactoring and interface -changes. - -It was, for a time, the best outcome I could have imagined. - -My celebration of success, however, was short-lived. In a matter of months, as -the database grew larger and denser—to 16,000 vertices, if memory serves—A* -search too began to struggle. We fell back on a far less elegant PostgreSQL -cache. I learnt my first bitter lesson about optimization: dumb caches beat -clever algorithms. I briefly wondered if I should have stayed with pure -mathematics. +Upstreamed the changes. -Source code: <a +GitHub release: <a href="https://github.com/neo4j-contrib/neo4j-graph-algorithms/releases/tag/3.4.0.0" class="external" target="_blank" rel="noopener noreferrer">Neo4J v3.4.0</a> | <a |
