Metadata-Version: 2.3
Name: whereamicloud
Version: 0.1.0
Summary: Pings various known datacenter locations to triangulate itself
License: LICENSE
Author: Dank
Author-email: 4129589+danktec@users.noreply.github.com
Requires-Python: >=3.13
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: End Users/Desktop
Classifier: Topic :: System :: Networking
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Dist: ping3 (>=5.1.5,<6.0.0)
Requires-Dist: tcppinglib (>=2.0.4,<3.0.0)
Description-Content-Type: text/markdown

# Where Am I?
whereami is a tool that answers the question of where a system is physically located, whether on Earth, another planet,
or in space, as long as the system can reach other systems over ICMP or HTTP.

## How does it work?
whereami sends Ping / ICMP Echo Requests to a list of KNOWN systems and infers its approximate
location from their response times.

First, install pipx: https://github.com/pypa/pipx

## TLDR:
```bash
# install poetry using pipx (installs as a standalone system package)
pipx install poetry
# use poetry to install dependencies
poetry install
# run the package as a module
python3 -m whereami
```

## What's the point?
Believe it or not, it can be somewhat difficult, or even impossible, to pinpoint exactly where
a system is located, even while you're using it. You usually only ever have an approximation.

There are various classic systems and techniques which can help:
* Traceroute
* Whois
* BGP Looking Glass
* DNS
* GeoIP Databases

Used together, these give a pretty good indicator of WHERE in the world our
system is and which networks host it, but getting a really close pinpoint is much harder.

Generally the databases are not current, and information about physical location is variable and opaque, by design.

This is because ISPs, datacenter providers, and cloud hosting providers don't want to give away
too much information, for security and privacy reasons.
They'll tell you a system is inside a region and an availability zone, but not which datacenter, and
certainly not any information about internal datacenter architecture.

## Why would I care?
If you're deploying distributed/connected workloads and resources in various locations, their distance in hops
matters from a performance and reliability perspective.

Generally speaking, the less distance a packet needs to travel the less it will cost and the more reliable
its journey will be.

Sure, you can put everything in a single DC or region, at the cost of performance for edge customers.
You can also garner cost savings by running selected workloads on cheaper hardware.

If you're looking to assess network path performance between two points, this is a useful tool. Often you'll find the results alternate between two providers, which indicates roughly equal performance. Once that factor is ruled out, you can focus on things like cost or reliability to make decisions.

## How does this tool know its own location?
The concept is very simple: ping out to known hosts and narrow down to find the fastest one.

If a host is INSIDE our datacenter, then we will get the lowest possible response-time to our ping.

The real power comes from the data in locations.json. The more accurate and complete the data, the better this tool
performs.

If the dataset of locations and pingable IPs covers every datacenter in the world, this tool can easily
determine that it's in the same DC as the fastest responding host.
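The idea above can be sketched in a few lines. This is a minimal illustration, not the package's actual implementation: it assumes a locations.json in the format described below, and it uses `ping3` (a declared dependency) for the ICMP probe. The ping function is passed in as a parameter so the selection logic stays testable offline.

```python
import json

def fastest_host(latencies):
    """Return the (label, rtt) pair with the lowest round-trip time."""
    reachable = {k: v for k, v in latencies.items() if v is not None}
    return min(reachable.items(), key=lambda kv: kv[1])

def probe(locations, ping_fn):
    """Ping every known host; ping_fn returns seconds, or a falsy value on failure."""
    latencies = {}
    for region, hosts in locations.items():
        for label, address in hosts.items():
            rtt = ping_fn(address, timeout=2)
            latencies[f"{region}/{label}"] = rtt if rtt else None
    return latencies

if __name__ == "__main__":
    from ping3 import ping  # ICMP echo; a declared project dependency
    with open("locations.json") as f:
        locations = json.load(f)
    label, rtt = fastest_host(probe(locations, ping))
    print(f"Closest known host: {label} ({rtt * 1000:.1f} ms)")
```

Unreachable hosts are recorded as `None` and excluded from the comparison, so one dead entry in locations.json doesn't break the run.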

## Contributing
We need to build out the locations.json file to contain as many known hosts inside datacenters as possible.

If you own a system which consistently responds to pings and you know the datacenter within which it's located, you can
add it under the appropriate region in locations.json.

## locations.json File Format
This file should provide as many different providers and DCs as possible under each region for best accuracy.

locations.json
```json
{
    "[region]":
        {
            "[provider_name-dc_code]": "[pingable_address]",
            "[...]": "[...]"
        },
    "[...]": {}
}
```
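A consumer of this format can flatten the nested mapping into (region, provider-dc, address) triples. The sketch below shows one way to do that; the entries are placeholder documentation IPs, not real datacenter hosts.

```python
import json

def iter_targets(locations):
    """Yield (region, provider_dc, address) triples from the parsed JSON."""
    for region, hosts in locations.items():
        for provider_dc, address in hosts.items():
            yield region, provider_dc, address

sample = json.loads("""
{
    "us-east": {"exampleprovider-dc1": "198.51.100.7"},
    "eu-west": {"exampleprovider-dc2": "203.0.113.9"}
}
""")
for region, label, addr in iter_targets(sample):
    print(region, label, addr)
```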

## Scraped Data

Cloudping (https://www.cloudping.info/) has a list of HTTP-pingable IPs, which I've scraped into JSON.
Credit to Michael Leonhard for putting this list together: https://gitlab.com/leonhard-llc/cloudping.info

```bash
# wget saves to a file by default; -qO- writes the page to stdout for the pipe
wget -qO- https://www.cloudping.info/ | grep -A 3 "<tr>" | grep "<td" > test.csv
```

```python
import json

output = {}
with open("test.csv") as f:
    for line in f:
        # A <td> cell starts a new region row, e.g. "<td>us-east-1</td>"
        if "<td>" in line:
            region_name = line.split(">")[1].split("<")[0]
            output[region_name] = {}
        # The provider name sits in the URL path of the bolded link
        if "<b>" in line:
            provider_name = line.split("/")[2]
        # The pingUrl attribute holds the pingable address
        if "pingUrl" in line:
            ping_url = line.split("/")[2]
            output[region_name][provider_name] = ping_url
print(json.dumps(output))

# TODO
* Distribute as a pip module