---
title: 'Neo4J search: A* optimization'
date: 2018-03-06
layout: post
---

Written in 2026, backdated to 2018. 

First real performance problem. Vessel tracking with Neo4J hit a wall. Needed to
find shortest paths in a 13,000-vertex graph. Dijkstra's search, which Neo4J
ships with, slowed to a crawl past 4,000 vertices.

Spent the better part of the week replacing Dijkstra's algorithm with A*
search. Python shop, junior developer: took more than a day just to set up the
toolchain and build the project. A Haversine function serves as the heuristic,
using the distance between ports as the crow flies to steer the search:

```
// Haversine great-circle distance between two coordinates, in nautical miles.
private double computeHeuristic(
    final double lat1, final double lon1,
    final double lat2, final double lon2) {
    final int earthRadius = 6371;    // mean Earth radius, km
    final double kmToNM = 0.539957;  // kilometres to nautical miles

    final double latDistance = Math.toRadians(lat2 - lat1);
    final double lonDistance = Math.toRadians(lon2 - lon1);

    // Haversine term: sin^2(dLat/2) + cos(lat1) * cos(lat2) * sin^2(dLon/2)
    final double a = Math.sin(latDistance / 2)
        * Math.sin(latDistance / 2)
        + Math.cos(Math.toRadians(lat1))
        * Math.cos(Math.toRadians(lat2))
        * Math.sin(lonDistance / 2)
        * Math.sin(lonDistance / 2);

    // Central angle between the two points, in radians
    final double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    return earthRadius * c * kmToNM;
}
```
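
Why this works as a heuristic: a vessel can never sail a shorter route than the great circle between two ports, so the estimate never overshoots the true remaining cost, which is what A* needs to stay optimal. Quick sanity check, as a standalone sketch: one degree of latitude should come out near 60 nautical miles. The class name `HaversineCheck`, the method name `haversineNm`, and the sample coordinates are assumptions for illustration, not part of the upstreamed code.

```java
// Standalone sanity check for a Haversine heuristic in nautical miles.
// HaversineCheck and haversineNm are hypothetical names for illustration.
class HaversineCheck {
    static final double EARTH_RADIUS_KM = 6371.0;
    static final double KM_TO_NM = 0.539957;

    static double haversineNm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
            + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
            * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
        return EARTH_RADIUS_KM * c * KM_TO_NM;
    }

    public static void main(String[] args) {
        // One degree of latitude along a meridian: roughly 60 NM.
        System.out.println(haversineNm(0.0, 0.0, 1.0, 0.0));
        // Same point twice: distance is zero, so the heuristic is 0 at the goal.
        System.out.println(haversineNm(51.95, 4.14, 51.95, 4.14));
    }
}
```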

Core search loop updates costs when a better path is found:

```
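// Relax the edge source -> target: keep the new path only if it is cheaper.
private void updateCosts(
    final int source, final int target,
    final double newCost, final double heuristic) {
    final double oldCost = gCosts.getOrDefault(target, Double.MAX_VALUE);
    if (newCost < oldCost) {
        gCosts.put(target, newCost);              // g: best known cost from start
        fCosts.put(target, newCost + heuristic);  // f = g + h: estimated total via target
        path.put(target, source);                 // predecessor, for path reconstruction
    }
}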
```
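
For context, a minimal sketch of how a cost update like this might sit inside a full A* loop. This is not the upstreamed implementation: it assumes plain `HashMap`s, a `java.util.PriorityQueue` with stale-entry skipping, and hypothetical `Edge`, `Entry`, and `shortestPath` names. The heuristic is passed in as a function of the vertex id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.function.IntToDoubleFunction;

// Illustrative A* sketch; all names here are assumptions, not the Neo4J API.
class AStarSketch {
    /** Hypothetical edge: neighbouring vertex id plus the cost of the leg. */
    static final class Edge {
        final int target;
        final double cost;
        Edge(int target, double cost) { this.target = target; this.cost = cost; }
    }

    /** Open-set entry: a vertex and the f-cost it had when it was queued. */
    static final class Entry {
        final int vertex;
        final double f;
        Entry(int vertex, double f) { this.vertex = vertex; this.f = f; }
    }

    static List<Integer> shortestPath(Map<Integer, List<Edge>> graph, int start, int goal,
                                      IntToDoubleFunction heuristic) {
        Map<Integer, Double> gCosts = new HashMap<>();  // best known cost from start
        Map<Integer, Double> fCosts = new HashMap<>();  // g + heuristic estimate
        Map<Integer, Integer> path = new HashMap<>();   // predecessor of each vertex
        PriorityQueue<Entry> open =
            new PriorityQueue<>(Comparator.comparingDouble((Entry e) -> e.f));

        gCosts.put(start, 0.0);
        fCosts.put(start, heuristic.applyAsDouble(start));
        open.add(new Entry(start, fCosts.get(start)));

        while (!open.isEmpty()) {
            Entry current = open.poll();
            // Skip stale entries left behind when a cheaper path was found later.
            if (current.f > fCosts.get(current.vertex)) continue;
            if (current.vertex == goal) return reconstruct(path, start, goal);

            double costSoFar = gCosts.get(current.vertex);
            for (Edge edge : graph.getOrDefault(current.vertex, Collections.<Edge>emptyList())) {
                double newCost = costSoFar + edge.cost;
                double oldCost = gCosts.getOrDefault(edge.target, Double.MAX_VALUE);
                if (newCost < oldCost) {
                    double f = newCost + heuristic.applyAsDouble(edge.target);
                    gCosts.put(edge.target, newCost);
                    fCosts.put(edge.target, f);
                    path.put(edge.target, current.vertex);
                    open.add(new Entry(edge.target, f));
                }
            }
        }
        return Collections.emptyList();  // goal unreachable
    }

    /** Walks predecessors back from goal to start, then reverses. */
    static List<Integer> reconstruct(Map<Integer, Integer> path, int start, int goal) {
        List<Integer> result = new ArrayList<>();
        int vertex = goal;
        result.add(vertex);
        while (vertex != start) {
            vertex = path.get(vertex);
            result.add(vertex);
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, List<Edge>> graph = new HashMap<>();
        graph.put(0, Arrays.asList(new Edge(1, 1.0), new Edge(2, 1.0)));
        graph.put(1, Arrays.asList(new Edge(3, 1.0)));
        graph.put(2, Arrays.asList(new Edge(3, 5.0)));
        // A zero heuristic reduces A* to Dijkstra; still finds the cheapest route.
        System.out.println(shortestPath(graph, 0, 3, v -> 0.0));  // prints [0, 1, 3]
    }
}
```

The speedup over Dijkstra comes entirely from the heuristic: with a zero estimate the loop above explores outward in all directions, while a distance-to-goal estimate pushes vertices that lead away from the destination to the back of the queue.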

Verdict: 300x speedup. Scaled to 13,000 vertices.

Upstreamed changes: <a
href="https://github.com/neo4j-contrib/neo4j-graph-algorithms/releases/tag/3.4.0.0"
class="external" target="_blank" rel="noopener noreferrer">Neo4J v3.4.0</a> | 
<a
href="https://github.com/neo4j-contrib/neo4j-graph-algorithms/blob/bd9732d9a690319552e134708692acb5a0d6b37c/algo/src/main/java/org/neo4j/graphalgo/impl/ShortestPathAStar.java"
class="external" target="_blank" rel="noopener noreferrer">Full source</a>.

NOTE: Despite impressive gain, performance horizon visible. Unlikely to scale past
16,000 vertices without caching.