Well, all my nodes are running with the --max-connections flag set to something like 1k. Then I use the command mina advanced node-status -daemon-peers. Here is the actual crontab task from my hosting server:
57 * * * * cd ~/minanodesinfo; ssh -p ### #####@###.#.###.## -i ~/.ssh/#####1 mina advanced node-status -daemon-peers | grep -E -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' | sort --unique > ips1.txt; ssh -p ### #####@###.#.###.## -i ~/.ssh/#####2 mina advanced node-status -daemon-peers | grep -E -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' | sort --unique > ips2.txt; ssh -p ### #####@###.#.###.## -i ~/.ssh/####3 mina advanced node-status -daemon-peers | grep -E -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' | sort --unique > ips3.txt; cat ips1.txt ips2.txt ips3.txt | sort --unique > ips.txt
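By the way, those three near-identical SSH calls could be folded into one small collection script. Here is a minimal sketch of the idea in Python (the hosts, ports and key paths are placeholders, not my real setup):

#!/usr/bin/env python3
# Sketch: collect unique peer IPs from several Mina nodes over SSH.
# Hostnames, ports and key paths are placeholders.
import os
import re
import subprocess

NODES = [  # (user@host, port, identity file) for each node
    ("user@node1.example.com", "22", "~/.ssh/key1"),
    ("user@node2.example.com", "22", "~/.ssh/key2"),
    ("user@node3.example.com", "22", "~/.ssh/key3"),
]

IPV4 = re.compile(r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b")

ips = set()
for host, port, key in NODES:
    out = subprocess.run(
        ["ssh", "-p", port, "-i", os.path.expanduser(key), host,
         "mina advanced node-status -daemon-peers"],
        capture_output=True, text=True, timeout=120,
    ).stdout
    ips.update(IPV4.findall(out))

with open("ips.txt", "w") as f:
    f.write("\n".join(sorted(ips)) + "\n")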
Of course, for security reasons, I've set from="##.##.##.##" for those SSH keys on all my node servers. So now we have an ips.txt file with a list of unique reachable IP addresses. Here comes the second crontab job:
59 * * * * cd ~/minanodesinfo; python3 geolookup.py; wc -l < ips.txt 2>&1 | ./timestamp.sh >> log.csv
As you can see, there are two scripts there. The first one, geolookup.py, parses the list of IPs with the geoip2 module and the GeoLite2 database (also installed on the hosting server), and it doesn't just parse: it produces three output files. data.csv is the complete spreadsheet with all the info from the GeoLite2 database, and the two other files hold the data aggregated by country and by ASN (basically, the two tables you see on the website).
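I haven't posted geolookup.py itself, but a minimal sketch of that kind of script with geoip2 and pandas looks roughly like this (the .mmdb paths, column names and output file names are my assumptions, not the actual code):

#!/usr/bin/env python3
# Sketch: ips.txt in, data.csv plus two aggregated tables out.
# Database paths, columns and output names are assumptions.
import geoip2.database
import geoip2.errors
import pandas as pd

rows = []
with geoip2.database.Reader("GeoLite2-City.mmdb") as city_db, \
     geoip2.database.Reader("GeoLite2-ASN.mmdb") as asn_db:
    for ip in open("ips.txt").read().split():
        try:
            city = city_db.city(ip)
            asn = asn_db.asn(ip)
        except geoip2.errors.AddressNotFoundError:
            continue  # private/unknown addresses have no geo data
        rows.append({
            "ip": ip,
            "country": city.country.name,
            "city": city.city.name,
            "lat": city.location.latitude,
            "lon": city.location.longitude,
            "asn": asn.autonomous_system_organization,
        })

df = pd.DataFrame(rows)
df.to_csv("data.csv", index=False)                    # the full spreadsheet
df["country"].value_counts().to_csv("countries.csv")  # per-country table
df["asn"].value_counts().to_csv("asn.csv")            # per-ASN table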
Then we also have timestamp.sh, here it is:
#!/bin/bash
# Prefix every line read from stdin with a "dd/mm/YYYY HH:MM," timestamp.
while read -r x; do
    echo -n "$(date '+%d/%m/%Y %H:%M'),"
    echo "$x"
done
As the output it gives a CSV file with lines like 28/01/2022 12:59,5723; this is the data you can see on the graph. Then there are the Python modules folium (for the map) and bokeh (for the graph); the tables are served by pandas and JS DataTables. Like it's written on the website, there are usually 40-60 private IP addresses (172.18.0.1, 10.142.0.5, etc.); these nodes are still reachable, so they are reflected in the total number you see at the top (the actual code is…
<h1>Mina Protocol nodes: {{totaln}}</h1>
<h2>reachable as of <span id="datetime"></span>
<script async>
var dt = new Date();
document.getElementById("datetime").innerHTML = (("0"+dt.getDate()).slice(-2)) +"."+ (("0"+(dt.getMonth()+1)).slice(-2)) +"."+ (dt.getFullYear()) +" "+ (("0"+dt.getHours()).slice(-2)) +":"+ (("0"+dt.getMinutes()).slice(-2));
</script></h2>)
, but not presented on the map, on the graph or in the tables. The whole thing is a Flask app running behind NGINX + Gunicorn. That's it.
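For completeness, a stripped-down sketch of how such a Flask view can be wired up (route, template and file names are placeholders, not the actual app):

#!/usr/bin/env python3
# Sketch of the Flask side: count the collected IPs and render the page.
import ipaddress

import pandas as pd
from flask import Flask, render_template

app = Flask(__name__)

@app.route("/")
def index():
    ips = open("ips.txt").read().split()
    # Private addresses (172.18.0.1, 10.142.0.5, ...) count towards the total
    # but are left off the map and the tables, since they cannot be geolocated.
    n_private = sum(ipaddress.ip_address(ip).is_private for ip in ips)
    countries = pd.read_csv("countries.csv")
    return render_template(
        "index.html",
        totaln=len(ips),                        # fills {{totaln}} in the template
        countries=countries.to_dict("records"),
        n_private=n_private,
    )

if __name__ == "__main__":
    app.run()  # in production it runs under Gunicorn behind NGINX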
Like I've said, I could push those CSV files to Google Docs or something automatically, but I wouldn't like to make it completely public. Also, I know this isn't the most elegant solution. When I have more free time I'd like to set up an actual database with MySQL or something. Since most of the nodes keep more or less the same IPs over time, a database would make it possible to use a more precise geolocation database/API: most of them have free tiers of a few thousand requests per month, which would be enough once the lookups are cached. This would also make it possible to create time-series animations, for example.
And of course I'd like to say it again: the whole thing is just an estimate! I just think it's nice to have such an estimate, and I'd like to make it more precise and interesting as well.
Footnote:
Finally found out why it was so terribly slow: it takes time for Folium to put thousands of circles on the map (and I was sure I had misconfigured some server settings). Now it's fixed, as the map is rendered to a separate HTML file once per hour, not every time someone opens the page.
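That hourly pre-render step can look roughly like this, assuming the data.csv columns from the sketch above (file names are placeholders):

#!/usr/bin/env python3
# Sketch: render the Folium map to a static HTML file once per hour (via cron),
# so the web app only serves the ready-made file.
import folium
import pandas as pd

df = pd.read_csv("data.csv").dropna(subset=["lat", "lon"])

m = folium.Map(location=[20, 0], zoom_start=2)
for _, row in df.iterrows():
    folium.CircleMarker(
        location=[row["lat"], row["lon"]],
        radius=3,
        popup=str(row["asn"]),
    ).add_to(m)

m.save("templates/map.html")  # regenerated hourly, not per request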