Thesis

Fanout targets are selected on a transaction basis. That means, for each transaction to be relayed, a subset of our peers is selected for fanout, while the rest would be used for transaction reconciliation. The fanout candidate selection can be done at two stages of the transaction relay process:

Our current implementation follows the former, since it is easier to reason about transaction dependencies during transaction scheduling, however, at relay time it should be easier to adapt to fanout candidates already knowing about the target transaction, and potentially swap candidates so the desired fanout rate can be achieved. For a more extensive explanation of both approaches, check Choosing fanout peers at relay scheduling time and Choosing fanout peers at transaction relay time.

Simulation

To simulate the difference between both approaches, a new branch has been created in the simulator where the decision making is done at transaction relay time instead of at relay scheduling time. The new branch gets rid of the delayed_set for Erlay peers in their TxReconciliationState, and all transactions are kept in to_be_announced until relay time, where they would be picked for fanout or reconciliation based on get_fanout_targets. Since this process happens on a peer basis, fanout targets are now cached, so the selection is consistent for all peers of a given node for a given transaction.

To simulate this we will use a randomly generated network containing 110000 nodes, 10000 reachable, 100000 unreachable with 8 outbounds per node, all of them implementing Erlay. The simulation can be run using the code at ‣.

Results

Here are the results of running our simulation in our reference model, that is, a network of the aforementioned characteristics selecting fanout candidates at scheduling time:

2024-12-20T22:12:59.431Z INFO  [hyper_lib::simulator] Using user provided rng seed: 5831067508548421759
2024-12-20T22:12:59.438Z INFO  [hyper_lib::network] Creating nodes (110000: 10000 reachable, 100000 unreachable)
2024-12-20T22:12:59.568Z INFO  [hyperion] The simulation will be run 100 times and results will be averaged
2024-12-20T22:17:42.069Z INFO  [hyper_lib] Transaction reached 90% of nodes in the network in 20.764025s
2024-12-20T22:17:42.069Z INFO  [hyper_lib] Transaction reached all nodes in 28.108896s
2024-12-20T22:17:42.069Z INFO  [hyper_lib] Reachable nodes sent/received 222.2216/369.73523 messages (715.59644/1069.915 bytes) (avg)
2024-12-20T22:17:42.069Z INFO  [hyper_lib] Unreachable nodes sent/received 32.36946/17.618101 messages (94.462814/59.03094 bytes) (avg)

And the same simulation run now in our updated model, selecting fanout peers at relay time:

2024-12-20T22:34:03.686Z INFO  [hyper_lib::simulator] Using user provided rng seed: 5831067508548421759
2024-12-20T22:34:03.694Z INFO  [hyper_lib::network] Creating nodes (110000: 10000 reachable, 100000 unreachable)
2024-12-20T22:34:03.826Z INFO  [hyperion] The simulation will be run 100 times and results will be averaged
2024-12-20T22:39:01.915Z INFO  [hyper_lib] Transaction reached 90% of nodes in the network in 20.655256s
2024-12-20T22:39:01.915Z INFO  [hyper_lib] Transaction reached all nodes in 28.008577s
2024-12-20T22:39:01.915Z INFO  [hyper_lib] Reachable nodes sent/received 221.15518/367.86584 messages (715.5273/1068.999 bytes) (avg)
2024-12-20T22:39:01.915Z INFO  [hyper_lib] Unreachable nodes sent/received 32.20926/17.538193 messages (94.39123/59.044044 bytes) (avg)

Finally, here are both simulations as a csv for easier comparison:

timestamp,percentile-target,percentile-time,full-propagation-time,r-sent-msgs,r-received-msgs,r-sent-bytes,r-received-bytes,u-sent-msgs,u-received-msgs,u-sent-bytes,u-received-bytes,r,u,n,erlay,seed
1734733062,90,20764024832,28108896256,222.2216,369.73523,715.59644,1069.915,32.36946,17.618101,94.462814,59.03094,10000,100000,100,true,5831067508548421759
1734734341,90,20655255552,28008577024,221.15518,367.86584,715.5273,1068.999,32.20926,17.538193,94.39123,59.044044,10000,100000,100,true,583106750854842175

We can see that there are no meaningful differences between the two simulations, which may sound surprising. To understand why this is the case, we can follow a similar approach as per our Filter fanout candidates based on transaction knowledge simulation. This time we will modify the simulation slightly. Instead of just recording how many peers have announced a transaction to us by the time we need to select fanout candidates, we will be taking two samples: the first one will be taken at transaction relay scheduling, point at where our reference model selects fanout peers, and the second one will be taken at transaction relay time, when our current model selects the candidates.