Showing
with 758 additions and 167 deletions
Developer documentation
=======================
Building
--------
The build system uses `Nix`_ which must be installed before anything can be built.
Start by setting up the development/operations environment::
$ nix-shell
Testing
-------
The test system uses `Nix`_ which must be installed before any tests can be run.
Unit tests are run using this command::
$ nix-build --attr unit-tests
Unit tests are also run on CI.
The system tests are run using this command::
$ nix-build --attr system-tests
The build requires more than 10 GB of disk space,
and the VMs might time out on slow or busy machines.
If you run into timeouts,
try `raising the number of retries <https://whetstone.private.storage/privatestorage/PrivateStorageio/-/blob/e8233d2/nixos/modules/tests/run-introducer.py#L55-62>`_.
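Raising the retry count changes the test script itself.
A coarser workaround, not taken from the repository, is to re-run the whole build a few times.
The sketch below assumes a hypothetical helper named ``retry``; it only wraps whatever command it is given and does not change how the VMs behave.

```shell
# retry N CMD...: run CMD up to N times, stopping at the first success.
# Hypothetical helper for illustration; it is not part of PrivateStorageio.
retry() {
    attempts=$1
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n of $attempts failed" >&2
        n=$((n + 1))
    done
    return 1
}
```

It could be used as ``retry 3 nix-build --attr system-tests``.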
It is also possible to go through the testing script interactively, which is useful for debugging::
$ nix-build --attr system-tests.private-storage.driver
This will give you a result symlink in the current directory.
Inside that is ``bin/nixos-test-driver``, which gives you a kind of REPL for interacting with the VMs.
The kind of `Python in this testScript <https://whetstone.private.storage/privatestorage/PrivateStorageio/-/blob/78881a3/nixos/modules/tests/private-storage.nix#L180>`_ is what you can enter into this REPL.
Consult the `official documentation on NixOS Tests <https://nixos.org/manual/nixos/stable/index.html#sec-nixos-tests>`_ for more information.
Updating Pins
--------------
Nixpkgs
```````
To update the version of NixOS we deploy with, run::
nix-shell --run 'update-nixpkgs'
That will update ``nixpkgs.json`` to the latest release on the NixOS release channel.
To update the channel, the script will need to be updated,
along with the filenames that have the channel in them.
To create a text summary of what an update changes - to put in Merge Requests, for example - run::

  nix-build -A morph -o result-before
  update-nixpkgs
  nix-build -A morph -o result-after
  nix-shell -p nixUnstable
  nix --extra-experimental-features nix-command store diff-closures ./result-before/ ./result-after/

Gitlab Repositories
```````````````````
To update the version of packages we import from gitlab, run::
nix-shell --command 'update-gitlab-repo nixos/pkgs/<package>/repo.json'
That will update the package to point at the latest version of the project.
The command uses branch and repository owner specified in the ``repo.json`` file,
but you can override them by passing the ``--branch`` or ``--owner`` arguments to the command.
A specific revision can also be pinned by passing ``--rev``.
Interactions
------------
Storage-Time Purchase (ie Payment)
``````````````````````````````````
.. uml::
actor User as User
participant GridSync
participant ZKAPAuthorizer
database ZKAPAuthzDB as "ZKAPAuthorizer"
participant Browser
participant PaymentServer as "Payment Server"
database PaymentServerDB as "Payment Server"
participant WebServer as "Web Server"
participant Stripe
User -> GridSync : buy storage-time
activate User
GridSync -> GridSync : generate voucher
GridSync -> ZKAPAuthorizer : redeem voucher
activate ZKAPAuthorizer
ZKAPAuthorizer -> ZKAPAuthzDB : store voucher
ZKAPAuthorizer -> GridSync : acknowledge
GridSync -> Browser : open payment page
loop until redeemed
GridSync -> ZKAPAuthorizer : query voucher state
ZKAPAuthorizer -> GridSync : not paid
end
Browser -> WebServer : request payment form
WebServer -> Browser : payment form
Browser -> User : Payment form displayed
activate User
User -> Browser : Submit payment details
Browser -> Stripe : Submit payment details
alt payment details accepted
Stripe -> Browser : details okay, return card token
Browser -> PaymentServer : create charge using card token
PaymentServer -> Stripe : charge card using token
note left: the user has now paid for the service
Stripe -> PaymentServer : acknowledge
PaymentServer -> PaymentServerDB : store voucher paid state
else payment details rejected
Stripe -> Browser : payment failure
end
Browser -> User : payment processing results displayed
deactivate User
group repeat for each redemption group
ZKAPAuthorizer -> ZKAPAuthzDB : generate and store random tokens
ZKAPAuthorizer -> PaymentServer : redeem voucher with blinded tokens
PaymentServer -> ZKAPAuthorizer : return signatures for blinded tokens
ZKAPAuthorizer -> ZKAPAuthzDB : store unblinded signatures for tokens
note right: the user has now been authorized to use the service
end
deactivate ZKAPAuthorizer
loop until redeemed
GridSync -> ZKAPAuthorizer : query voucher state
ZKAPAuthorizer -> GridSync : fully redeemed
end
GridSync -> User : storage-time available displayed
deactivate User
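
The "loop until redeemed" steps in the diagram above amount to polling until the voucher state changes.
A minimal sketch of that loop follows.
``query_voucher_state`` is a hypothetical stand-in that reads a state file; how GridSync really queries ZKAPAuthorizer is not shown in this document.
The state strings are the ones used in the diagram.

```shell
# query_voucher_state STATE_FILE: hypothetical stand-in for asking
# ZKAPAuthorizer about a voucher. Here it only reads a local file.
query_voucher_state() {
    cat "$1"
}

# wait_until_redeemed STATE_FILE MAX_POLLS [DELAY_SECONDS]
# Poll until the state is "fully redeemed" or MAX_POLLS is reached.
wait_until_redeemed() {
    polls=0
    while [ "$polls" -lt "$2" ]; do
        state=$(query_voucher_state "$1")
        if [ "$state" = "fully redeemed" ]; then
            return 0
        fi
        polls=$((polls + 1))
        sleep "${3:-1}"
    done
    return 1
}
```

Bounding the number of polls keeps a voucher that is never paid from blocking the caller forever.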
Storage-Time Spending (ie Use)
``````````````````````````````
.. uml::
participant MagicFolder
participant TahoeLAFS as "Tahoe-LAFS"
participant ZKAPAuthorizer
database ZKAPAuthzDB as "ZKAPAuthorizer"
participant StorageNode as "Storage Node"
participant SpendingService as "Spending Service"
[-> MagicFolder: upload triggered
activate MagicFolder
MagicFolder -> TahoeLAFS : store some data
activate TahoeLAFS
TahoeLAFS -> ZKAPAuthorizer : store some data
activate ZKAPAuthorizer
loop until tokens accepted
ZKAPAuthorizer <- ZKAPAuthzDB : load some tokens
ZKAPAuthorizer -> StorageNode : store some data using these tokens
StorageNode -> SpendingService : spend these tokens
alt spent tokens
SpendingService -> StorageNode: already spent, rejected
StorageNode -> ZKAPAuthorizer: already spent, rejected
else fresh tokens
SpendingService -> StorageNode: accepted
end
end
StorageNode -> ZKAPAuthorizer: data stored
deactivate ZKAPAuthorizer
ZKAPAuthorizer -> ZKAPAuthzDB: discard spent tokens
ZKAPAuthorizer -> TahoeLAFS: data stored
deactivate TahoeLAFS
TahoeLAFS -> MagicFolder: data stored
deactivate MagicFolder
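
The "loop until tokens accepted" part of the diagram above can be modelled as trying tokens until one has not been spent before.
The sketch below is a toy simulation under stated assumptions: ``try_spend`` and ``store_with_tokens`` are hypothetical names, tokens are plain lines in a file, and the spending service is a log file of tokens already seen.

```shell
# try_spend TOKEN SPENT_LOG: hypothetical stand-in for the storage node asking
# the spending service. Succeeds only if TOKEN is not yet in SPENT_LOG.
try_spend() {
    if grep -qxF -- "$1" "$2"; then
        return 1                    # already spent, rejected
    fi
    printf '%s\n' "$1" >> "$2"      # accepted: record the token as spent
}

# store_with_tokens TOKEN_FILE SPENT_LOG: try tokens from TOKEN_FILE in order
# until one is accepted, and print the accepted token.
store_with_tokens() {
    while IFS= read -r token; do
        if try_spend "$token" "$2"; then
            echo "$token"
            return 0
        fi
    done < "$1"
    return 1                        # every token was already spent
}
```

The caller would then discard the printed token from its own store, matching the "discard spent tokens" step.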
.. include::
../../morph/grid/local/README.rst
.. _Nix: https://nixos.org/nix
System Designs
--------------
.. toctree::
:maxdepth: 2
System Design Template <template>
$HEADLINE
=========
*The goal is to do the least design we can get away with while still making a quality product.*
*Think of this as a tool to help define the problem, analyze solutions, and share results.*
*Feel free to skip sections that you don't think are relevant*
*(but say that you are doing so).*
*Delete the bits in italics*
**Contacts:** *The primary contacts for this design.*
**Date:** *The last time this design was modified. YYYY-MM-DD*
*Short description of the feature.*
*Consider clarifying by also describing what it is not.*
Rationale
---------
*Why are we doing this now?*
*What value does this give to our users?*
*Which users?*
User Stories
------------
**$STORY NAME**
**Category:** *must / nice to have / must not*
As a **$PERSON** I want **$FEATURE** so that **$BENEFIT**.
**Acceptance Criteria:**
* *What concrete conditions must be met for the implementation to be acceptable?*
* *Surface assumptions about the user story that may not be shared across the team.*
* *Describe failure modes and negative scenarios when preconditions for using the feature are not met.*
* *Place the story in a performance/scaling context with real numbers.*
*Have as many as you like.*
*Group user stories together into meaningfully deliverable units.*
*Gather Feedback*
-----------------
*It might be a good idea to stop at this point & get feedback to make sure you're solving the right problem.*
Alternatives Considered
-----------------------
*What we've considered.*
*What trade-offs are involved with each choice.*
*Why we've chosen the one we did.*
Detailed Implementation Design
------------------------------
*Focus on:*
* external and internal interfaces
* how externally-triggered system events (e.g. sudden reboot; network congestion) will affect the system
* scalability and performance
Data Integrity
~~~~~~~~~~~~~~
*If we get this wrong once, we lose forever.*
*What data does the system need to operate on?*
*How will old data be upgraded to meet the requirements of the design?*
*How will data be upgraded to future versions of the implementation?*
Security
~~~~~~~~
*What threat model does this design take into account?*
*What new attack surfaces are added by this design?*
*What defenses are deployed with the implementation to keep those surfaces safe?*
Backwards Compatibility
~~~~~~~~~~~~~~~~~~~~~~~
*What existing systems are impacted by these changes?*
*How does the design ensure they will continue to work?*
Performance and Scalability
~~~~~~~~~~~~~~~~~~~~~~~~~~~
*How will performance of the implementation be measured?*
*After measuring it, record the results here.*
Further Reading
---------------
*Links to related things.*
*Other designs, tickets, epics, mailing list threads, etc.*
......@@ -6,13 +6,16 @@
Welcome to PrivateStorageio's documentation!
============================================
Howdy!
We separated the documentation into parts addressing different audiences.
Please enjoy our docs for:
.. toctree::
:maxdepth: 2
Administrators <ops/README>
Developers <dev/README>
System Designs <dev/designs/index>
Naming
......
......@@ -3,11 +3,11 @@ Administrator documentation
This contains documentation regarding running PrivateStorageio.
.. include::
../../../morph/README.rst
.. include::
monitoring.rst
.. include::
generating-keys.rst
.. toctree::
:maxdepth: 2
morph
monitoring
generating-keys
backup-recovery
stripe
Backup/Recovery
===============
This document covers the details of backups of the data required for PrivateStorageio to operate.
It describes the situations in which these backups are intended to be useful.
It also explains how to use these backups to recover in these situations.
Tahoe-LAFS Storage Nodes
------------------------
The state associated with a Tahoe-LAFS storage node consists of at least:

1. the "node directory" containing
   configuration,
   logs,
   public and private keys,
   and service fURLs.
2. the "storage" directory containing
   user ciphertext,
   garbage collector state,
   and corruption advisories.

Node Directories
~~~~~~~~~~~~~~~~
The "node directory" changes gradually over time.
New logs are written (including incident reports).
The announcement sequence number is incremented.
The introducer cache is updated.
The critical state necessary to reproduce an identical storage node does not change.
This state consists of
* the node id (my_nodeid)
* the node private key (private/node.privkey)
* the node x509v3 certificate (private/node.pem)
A backup of the node directory can be used to create a Tahoe-LAFS storage node with the same identity as the original storage node.
It *cannot* be used to recover the user ciphertext held by the original storage node.
Nor will it recover the state which gradually changes over time.
Backup
``````
A one-time backup has been made of these directories in the PrivateStorageio 1Password account.
The "Tahoe-LAFS Storage Node Backups" vault contains backups of staging and production node directories.
The process for creating these backups is as follows::

  DOMAIN=private.storage
  FILES="node.pubkey private/ tahoe.cfg my_nodeid tahoe-client.tac node.url permutation-seed"
  DIR=/var/db/tahoe-lafs/storage
  for n in $(seq 1 5); do
    NODE=storage00${n}.${DOMAIN}
    ssh $NODE tar vvjcf - -C $DIR $FILES > ${NODE}.tar.bz2
  done

  tar vvjcf ${DOMAIN}.tar.bz2 *.tar.bz2

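
Before uploading the tarballs it may be worth confirming that each one really contains the files it should.
The following is a minimal sketch, assuming a hypothetical helper named ``check_backup`` that is not part of the repository.
It relies on ``tar tf`` detecting the compression on its own, as GNU tar and bsdtar do.

```shell
# check_backup ARCHIVE NAME...: succeed only if every NAME is listed in ARCHIVE.
# Hypothetical helper for illustration. NAME is used as a regular expression,
# so a "." in a file name matches any character.
check_backup() {
    archive=$1
    shift
    listing=$(tar tf "$archive") || return 1
    for f in "$@"; do
        name=${f%/}   # accept "private/" as well as "private"
        # Match the entry itself (a file) or anything below it (a directory).
        if ! printf '%s\n' "$listing" | grep -q -e "^$name\$" -e "^$name/"; then
            echo "missing from $archive: $f" >&2
            return 1
        fi
    done
}
```

With the variables from the backup script this could be called as ``check_backup ${NODE}.tar.bz2 $FILES``.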
Recovery
````````
#. Prepare a system onto which to recover the node directory.
   The rest of these steps assume that PrivateStorageio is deployed on the node.

#. Download the backup tarball from 1Password.

#. Extract the particular node directory backup to recover from ::

     [LOCAL]$ tar xvf ${DOMAIN}.tar.bz2 ${NODE}.${DOMAIN}.tar.bz2

#. Upload the node directory backup to the system onto which recovery is taking place ::

     [LOCAL]$ scp ${NODE}.${DOMAIN}.tar.bz2 ${NODE}.${DOMAIN}:recovery.tar.bz2

#. Clean up the local copies of the backup files ::

     [LOCAL]$ rm -iv ${DOMAIN}.tar.bz2 ${NODE}.${DOMAIN}.tar.bz2

#. The rest of the steps are executed on the system on which recovery is taking place.
   Log in ::

     [LOCAL]$ ssh ${NODE}.${DOMAIN}

#. On the node make sure there is no storage service running ::

     [REMOTE]$ systemctl status tahoe.storage.service

   If there is then figure out why and stop it if it is safe to do so ::

     [REMOTE]$ systemctl stop tahoe.storage.service

#. On the node make sure there is no existing node directory ::

     [REMOTE]$ stat /var/db/tahoe-lafs/storage

   If there is then figure out why and remove it if it is safe to do so.

#. Unpack the node directory backup into the correct location ::

     [REMOTE]$ mkdir -p /var/db/tahoe-lafs/storage
     [REMOTE]$ tar xvf recovery.tar.bz2 -C /var/db/tahoe-lafs/storage

#. Mark the node directory as created and consistent ::

     [REMOTE]$ touch /var/db/tahoe-lafs/storage.created

#. Start the storage service ::

     [REMOTE]$ systemctl start tahoe.storage.service

#. Clean up the remote copies of the backup file ::

     [REMOTE]$ rm -iv recovery.tar.bz2

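
The check, unpack and mark steps on the remote side can be combined so that an existing node directory is never overwritten by accident.
This is a minimal sketch, assuming a hypothetical helper named ``restore_node_dir``; it does not check or stop the storage service, which still has to be done by hand as described above.

```shell
# restore_node_dir ARCHIVE NODE_DIR: unpack ARCHIVE into NODE_DIR, refusing to
# touch a NODE_DIR that already exists. Hypothetical helper for illustration.
restore_node_dir() {
    archive=$1
    node_dir=$2
    if [ -e "$node_dir" ]; then
        echo "refusing to overwrite existing $node_dir" >&2
        return 1
    fi
    mkdir -p "$node_dir" || return 1
    tar xf "$archive" -C "$node_dir" || return 1
    # Mark the node directory as created and consistent.
    touch "$node_dir.created"
}
```

On the node this could be called as ``restore_node_dir recovery.tar.bz2 /var/db/tahoe-lafs/storage``.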
Storage Directories
~~~~~~~~~~~~~~~~~~~
The user ciphertext is backed up using `Borg backup <https://borgbackup.readthedocs.io/>`_ to a separate location - currently a SaaS backup storage service (`borgbase.com <https://borgbase.com>`_).
Borg backup uses a *RepoKey* secured by a *passphrase* to encrypt the backup data and an *SSH key* to authenticate against the backup storage service.
Each Borg backup job requires one *backup repository*.
The backups are automatically checked periodically.
SSH keys
````````
Borgbase `recommends creating ed25519 ssh keys with one hundred KDF rounds <https://www.borgbase.com/ssh>`_.
We create one key pair per grid (not per host)::
$ ssh-keygen -f borgbackup-appendonly-staging -t ed25519 -a 100
$ ssh-keygen -f borgbackup-appendonly-production -t ed25519 -a 100
Save the key without a passphrase and upload the public part to `Borgbase SSH keys <https://www.borgbase.com/ssh>`_.
Passphrase
``````````
Make up a passphrase to encrypt our repository key with. Use computer help if you like::
nix-shell --packages pwgen --command 'pwgen --secure 83 1' # 83 is the year I was born. Very random.
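
If ``pwgen`` is not at hand, the kernel's random source works as well.
The sketch below assumes a hypothetical helper named ``write_passphrase`` and writes the passphrase straight to a file so that it never appears on screen.
The default of 48 random bytes is an arbitrary choice that yields 64 base64 characters.

```shell
# write_passphrase FILE [BYTES]: write BYTES random bytes (default 48),
# base64-encoded on a single line, to FILE.
# The umask makes a newly created FILE readable only by the current user;
# it does not change the permissions of a file that already exists.
write_passphrase() {
    (
        umask 077
        head -c "${2:-48}" /dev/urandom | base64 | tr -d '\n' > "$1"
    )
}
```

For example, ``write_passphrase borgbackup-passphrase-staging`` creates a file that ``cat`` can later hand to Borg.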
Create & initialize the backup repository
`````````````````````````````````````````
Borgbase.com offers a `borgbase.com GraphQL API <https://docs.borgbase.com/api/>`_.
Since our current number of repositories is small we save time by creating the repositories by clicking a few buttons in the `borgbase.com Web Interface <https://www.borgbase.com/repositories>`_:
* Set up one repository per backup job.
* Set the *Repository Name* to the FQDN of the host to be backed up.
* Add the SSH key created earlier as *Append-Only Access* key.
* Leave the other settings at their defaults.
Then initialize those repositories with our chosen parameters::
export BORG_PASSCOMMAND="cat borgbackup-passphrase-staging"
export BORG_RSH="ssh -i borgbackup-appendonly-staging"
borg init -e repokey-blake2 xyxyx123@xyxyx123.repo.borgbase.com:repo
Reliability checks
``````````````````
Borg handles large amounts of data.
Given enough bits, rare, spurious bit flips become a problem.
That is why regular runs of ``borg check`` are recommended
(see the `borgbase FAQ <https://docs.borgbase.com/faq/#how-often-should-i-run-borg-check>`_).
Recovery
````````
Borg offers various methods to restore backups.
A very convenient method is to mount a backup set using FUSE.
Please consult the restore documentation at `Borgbase <https://docs.borgbase.com/restore/>`_ and `Borg <https://borgbackup.readthedocs.io/en/stable/usage/mount.html>`_.
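
As a rough sketch of the FUSE approach, the two hypothetical wrappers below call ``borg mount`` and ``borg umount``.
The names ``browse_backup`` and ``done_browsing`` and the ``BORG`` variable are assumptions made for this example.
``BORG_PASSCOMMAND`` and ``BORG_RSH`` are expected to be exported beforehand so that Borg can reach and decrypt the repository.

```shell
# Set BORG=echo for a dry run that only prints the commands.
BORG=${BORG:-borg}

# browse_backup REPO ARCHIVE MOUNTPOINT: mount one archive via FUSE.
browse_backup() {
    mkdir -p "$3" || return 1
    $BORG mount "$1::$2" "$3"
}

# done_browsing MOUNTPOINT: unmount the archive again.
done_browsing() {
    $BORG umount "$1"
}
```

Archive names can be looked up with ``borg list`` on the repository; files are then copied out of the mount point with ordinary tools.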
Generating keys
===============
There's an example ``secrets`` repo in ``morph/grid/local/secrets``.
There are example ``public-keys`` and ``private-keys`` repos in ``morph/grid/local/``.
``<grid>/config.json`` has the paths for the key files for the respective grid.
Create a symlink named ``secrets`` to your secret key repository for the deployment you are working on.
Create a symlink ``private-keys`` to your secret key repositories for the deployment you are working on.
Create a directory named ``public-keys`` containing the corresponding public keys for the deployment.
Stripe
......@@ -41,36 +42,9 @@ For example::
echo -n "SILOWzbnkBjxC1hGde9d5Q3Ir/4yLosCLEnEQGAxEQE=" > ristretto.signing-key
ZKAP-Issuer TLS
```````````````
The ZKAPIssuer.service needs a working TLS certificate and expects it in the certbot directory for the domain you configured, in my case::
openssl req -x509 -newkey rsa:4096 -nodes -keyout privkey.pem -out cert.pem -days 3650
touch chain.pem
Move the three ``.pem`` files into the payment server's ``/var/lib/letsencrypt/live/payments.localdev/`` directory and issue a ``sudo systemctl restart zkapissuer.service``.
Monitoring VPN
``````````````
Create Wireguard VPN key pairs in ``secrets/monitoringvpn/`` or where you have them.
``tools/create-vpn-keys.sh`` holds a script to rotate all VPN keys at once::
Create all of the Wireguard VPN keys for a grid::
./tools/create-vpn-keys.sh morph/grid/testing/grid.nix
Or do it manually::
cd secrets/monitoringvpn
for i in 1 11 12 13 ; do
wg genkey | tee 172.23.23.${i}.key | wg pubkey > 172.23.23.${i}.pub
done
ln -s 172.23.23.1.key server.key
ln -s 172.23.23.1.pub server.pub
And a shared VPN key for "post-quantum resistance"::
wg genpsk > preshared.key
<mxfile host="app.diagrams.net" modified="2023-04-20T20:17:44.466Z" agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" etag="8PyLTVr0G94q4Dna4Dsz" version="21.2.1" type="device">
<diagram name="Page-1" id="aaaa8250-4180-3840-79b5-4cada1eebb92">
<mxGraphModel dx="794" dy="476" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="1920" pageHeight="1200" background="#ffffff" math="0" shadow="0">
<root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="vhdg0YFc32S7_3H95Ew1-1" value="&lt;span&gt;Management VPN&lt;br&gt;(Wireshark, TINC...)&lt;/span&gt;" style="ellipse;shape=cloud;whiteSpace=wrap;html=1;" parent="1" vertex="1">
<mxGeometry x="780" y="720" width="450" height="110" as="geometry" />
</mxCell>
<mxCell id="2mYkRctJDop23S32jJdh-2" value="" style="rounded=0;whiteSpace=wrap;html=1;" parent="1" vertex="1">
<mxGeometry x="840" y="405.46" width="370" height="304.54" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-3" value="Loki" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.application2;fillColor=#86E83A;strokeColor=#B0F373;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="1116" y="592.9000000000001" width="62" height="53" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-4" value="Prometheus" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.application;fillColor=#4286c5;strokeColor=#57A2D8;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="866" y="585" width="62" height="68.8" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-5" value="Grafana" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.ami2;aspect=fixed;fillColor=#FF9900;strokeColor=#ffffff;" parent="1" vertex="1">
<mxGeometry x="996" y="425" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-6" value="Node 1" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="779" y="845" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-7" value="Operator" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.end_user;strokeColor=#9673a6;fillColor=#e1d5e7;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="1276" y="305" width="49" height="100.46" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-16" value="Node ..." style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="902" y="845" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-17" value="Node ..." style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="1025" y="845" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-18" value="Node N" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="1147.5" y="845" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-45" value="" style="endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;" parent="1" edge="1">
<mxGeometry x="806" y="695" width="50" height="50" as="geometry">
<mxPoint x="894.6666666666666" y="695" as="sourcePoint" />
<mxPoint x="936" y="835" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-46" value="" style="endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;" parent="1" edge="1">
<mxGeometry x="806" y="695" width="50" height="50" as="geometry">
<mxPoint x="894.6666666666666" y="695" as="sourcePoint" />
<mxPoint x="1046" y="835" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-47" value="" style="endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;" parent="1" edge="1">
<mxGeometry x="806" y="695" width="50" height="50" as="geometry">
<mxPoint x="894.6666666666666" y="695" as="sourcePoint" />
<mxPoint x="1166" y="835" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-50" value="" style="endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;" parent="1" edge="1">
<mxGeometry x="818.6666666666667" y="695" width="63.33333333333333" height="75" as="geometry">
<mxPoint x="846" y="835" as="sourcePoint" />
<mxPoint x="1148" y="695" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-51" value="" style="endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;" parent="1" edge="1">
<mxGeometry x="818.6666666666667" y="695" width="63.33333333333333" height="75" as="geometry">
<mxPoint x="946" y="835" as="sourcePoint" />
<mxPoint x="1148" y="695" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-52" value="" style="endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;" parent="1" edge="1">
<mxGeometry x="818.6666666666667" y="695" width="63.33333333333333" height="75" as="geometry">
<mxPoint x="1056" y="835" as="sourcePoint" />
<mxPoint x="1148" y="695" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-53" value="" style="endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;" parent="1" edge="1">
<mxGeometry x="818.6666666666667" y="695" width="63.33333333333333" height="75" as="geometry">
<mxPoint x="1186" y="835" as="sourcePoint" />
<mxPoint x="1148" y="695" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-57" value="" style="endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="988" y="495" as="sourcePoint" />
<mxPoint x="928" y="565" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-58" value="" style="endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="1078" y="495" as="sourcePoint" />
<mxPoint x="1136" y="565" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-61" value="View dashboards&lt;br&gt;in browser" style="shape=flexArrow;endArrow=classic;html=1;strokeColor=#9673a6;strokeWidth=1;fillColor=#e1d5e7;spacing=8;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="1259.5" y="415" as="sourcePoint" />
<mxPoint x="1109.5" y="445" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-63" value="&lt;h1&gt;Monitoring architecture&amp;nbsp;&lt;/h1&gt;&lt;p&gt;Keep it simple, sunshine!&lt;br&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;i&gt;Grafana&lt;/i&gt; retrieves metrics from &lt;i&gt;Prometheus&lt;/i&gt; and logs from&amp;nbsp;&lt;i&gt;Loki&lt;/i&gt;,&amp;nbsp;&lt;span&gt;shows dashboards (web) and does alerting (via eMail? Slack?)&lt;/span&gt;&lt;br&gt;&lt;p&gt;&lt;i&gt;Prometheus&lt;/i&gt; stores metrics it pulls from various &lt;i&gt;Exporters&lt;/i&gt; on nodes&lt;/p&gt;&lt;p&gt;&lt;i&gt;Promtail&lt;/i&gt; on nodes pushes logs to &lt;i&gt;Loki&lt;br&gt;&lt;br&gt;&lt;/i&gt;&lt;/p&gt;&lt;p&gt;We try to keep the system as simple as possible: All monitoring and alerting runs on a single machine.&lt;/p&gt;&lt;h2&gt;Changes&lt;/h2&gt;&lt;div&gt;v2: Add Github authentication to Grafana. Add management VPN.&lt;br&gt;&lt;br&gt;v1: Initial version&lt;/div&gt;" style="text;html=1;strokeColor=none;fillColor=none;spacing=5;spacingTop=-20;whiteSpace=wrap;overflow=hidden;rounded=0;shadow=0;comic=0;sketch=0;" parent="1" vertex="1">
<mxGeometry x="596" y="315" width="164" height="545" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-65" value="Send alerts" style="shape=flexArrow;endArrow=classic;html=1;strokeColor=#9673a6;strokeWidth=1;fillColor=#e1d5e7;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="1086" y="405.46000000000004" as="sourcePoint" />
<mxPoint x="1236" y="375.46000000000004" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="2mYkRctJDop23S32jJdh-3" value="Monitoring server" style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;" parent="1" vertex="1">
<mxGeometry x="845" y="411" width="120" height="20" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-44" value="" style="endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;" parent="1" edge="1">
<mxGeometry x="806" y="695" width="50" height="50" as="geometry">
<mxPoint x="894.6666666666666" y="695" as="sourcePoint" />
<mxPoint x="826" y="835" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="vhdg0YFc32S7_3H95Ew1-7" value="GitHub&lt;br&gt;OAuth2" style="ellipse;shape=cloud;whiteSpace=wrap;html=1;" parent="1" vertex="1">
<mxGeometry x="984" y="305" width="98" height="60" as="geometry" />
</mxCell>
<mxCell id="vhdg0YFc32S7_3H95Ew1-8" value="" style="endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="1032.76" y="417" as="sourcePoint" />
<mxPoint x="1032.76" y="372" as="targetPoint" />
</mxGeometry>
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>
<!--[if IE]><meta http-equiv="X-UA-Compatible" content="IE=5,IE=9" ><![endif]-->
<!DOCTYPE html>
<html>
<head>
<title>monitoring-architecture.html</title>
<meta charset="utf-8"/>
</head>
<body>
<div class="mxgraph" style="max-width:100%;border:1px solid transparent;" data-mxgraph="{&quot;highlight&quot;:&quot;#0000ff&quot;,&quot;nav&quot;:true,&quot;resize&quot;:true,&quot;xml&quot;:&quot;&lt;mxfile host=\&quot;app.diagrams.net\&quot; modified=\&quot;2023-04-20T20:19:05.428Z\&quot; agent=\&quot;Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36\&quot; etag=\&quot;4ETb3iIJGh8a_GDWm07E\&quot; version=\&quot;21.2.1\&quot; type=\&quot;device\&quot;&gt;&lt;diagram name=\&quot;Page-1\&quot; id=\&quot;aaaa8250-4180-3840-79b5-4cada1eebb92\&quot;&gt;&lt;mxGraphModel dx=\&quot;794\&quot; dy=\&quot;476\&quot; grid=\&quot;1\&quot; gridSize=\&quot;10\&quot; guides=\&quot;1\&quot; tooltips=\&quot;1\&quot; connect=\&quot;1\&quot; arrows=\&quot;1\&quot; fold=\&quot;1\&quot; page=\&quot;1\&quot; pageScale=\&quot;1\&quot; pageWidth=\&quot;1920\&quot; pageHeight=\&quot;1200\&quot; background=\&quot;#ffffff\&quot; math=\&quot;0\&quot; shadow=\&quot;0\&quot;&gt;&lt;root&gt;&lt;mxCell id=\&quot;0\&quot;/&gt;&lt;mxCell id=\&quot;1\&quot; parent=\&quot;0\&quot;/&gt;&lt;mxCell id=\&quot;vhdg0YFc32S7_3H95Ew1-1\&quot; value=\&quot;&amp;lt;span&amp;gt;Management VPN&amp;lt;br&amp;gt;(WireGuard)&amp;lt;/span&amp;gt;\&quot; style=\&quot;ellipse;shape=cloud;whiteSpace=wrap;html=1;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;780\&quot; y=\&quot;720\&quot; width=\&quot;450\&quot; height=\&quot;110\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;2mYkRctJDop23S32jJdh-2\&quot; value=\&quot;\&quot; style=\&quot;rounded=0;whiteSpace=wrap;html=1;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;840\&quot; y=\&quot;405.46\&quot; width=\&quot;370\&quot; height=\&quot;304.54\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-3\&quot; value=\&quot;Loki\&quot; 
style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.application2;fillColor=#86E83A;strokeColor=#B0F373;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;1116\&quot; y=\&quot;592.9000000000001\&quot; width=\&quot;62\&quot; height=\&quot;53\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-4\&quot; value=\&quot;Prometheus\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.application;fillColor=#4286c5;strokeColor=#57A2D8;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;866\&quot; y=\&quot;585\&quot; width=\&quot;62\&quot; height=\&quot;68.8\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-5\&quot; value=\&quot;Grafana\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.ami2;aspect=fixed;fillColor=#FF9900;strokeColor=#ffffff;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;996\&quot; y=\&quot;425\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-6\&quot; value=\&quot;Node 1\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;779\&quot; y=\&quot;845\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-7\&quot; 
value=\&quot;Operator\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.end_user;strokeColor=#9673a6;fillColor=#e1d5e7;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;1276\&quot; y=\&quot;305\&quot; width=\&quot;49\&quot; height=\&quot;100.46\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-16\&quot; value=\&quot;Node ...\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;902\&quot; y=\&quot;845\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-17\&quot; value=\&quot;Node ...\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;1025\&quot; y=\&quot;845\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-18\&quot; value=\&quot;Node N\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;1147.5\&quot; y=\&quot;845\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell 
id=\&quot;TrmSFti5pUnXIGkjMKb6-45\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;806\&quot; y=\&quot;695\&quot; width=\&quot;50\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;894.6666666666666\&quot; y=\&quot;695\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;936\&quot; y=\&quot;835\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-46\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;806\&quot; y=\&quot;695\&quot; width=\&quot;50\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;894.6666666666666\&quot; y=\&quot;695\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1046\&quot; y=\&quot;835\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-47\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;806\&quot; y=\&quot;695\&quot; width=\&quot;50\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;894.6666666666666\&quot; y=\&quot;695\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1166\&quot; y=\&quot;835\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-50\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;818.6666666666667\&quot; y=\&quot;695\&quot; 
width=\&quot;63.33333333333333\&quot; height=\&quot;75\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;846\&quot; y=\&quot;835\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1148\&quot; y=\&quot;695\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-51\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;818.6666666666667\&quot; y=\&quot;695\&quot; width=\&quot;63.33333333333333\&quot; height=\&quot;75\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;946\&quot; y=\&quot;835\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1148\&quot; y=\&quot;695\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-52\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;818.6666666666667\&quot; y=\&quot;695\&quot; width=\&quot;63.33333333333333\&quot; height=\&quot;75\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1056\&quot; y=\&quot;835\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1148\&quot; y=\&quot;695\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-53\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;818.6666666666667\&quot; y=\&quot;695\&quot; width=\&quot;63.33333333333333\&quot; height=\&quot;75\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1186\&quot; y=\&quot;835\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1148\&quot; y=\&quot;695\&quot; 
as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-57\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;988\&quot; y=\&quot;495\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;928\&quot; y=\&quot;565\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-58\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1078\&quot; y=\&quot;495\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1136\&quot; y=\&quot;565\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-61\&quot; value=\&quot;View dashboards&amp;lt;br&amp;gt;in browser\&quot; style=\&quot;shape=flexArrow;endArrow=classic;html=1;strokeColor=#9673a6;strokeWidth=1;fillColor=#e1d5e7;spacing=8;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1259.5\&quot; y=\&quot;415\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1109.5\&quot; y=\&quot;445\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-63\&quot; value=\&quot;&amp;lt;h1&amp;gt;Monitoring architecture&amp;amp;nbsp;&amp;lt;/h1&amp;gt;&amp;lt;p&amp;gt;Keep it simple, 
sunshine!&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&amp;lt;/p&amp;gt;&amp;lt;p&amp;gt;&amp;lt;/p&amp;gt;&amp;lt;i&amp;gt;Grafana&amp;lt;/i&amp;gt; retrieves metrics from &amp;lt;i&amp;gt;Prometheus&amp;lt;/i&amp;gt; and logs from&amp;amp;nbsp;&amp;lt;i&amp;gt;Loki&amp;lt;/i&amp;gt;,&amp;amp;nbsp;&amp;lt;span&amp;gt;shows dashboards (in a web browser) and does alerting (via Zulip)&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;&amp;lt;p&amp;gt;&amp;lt;i&amp;gt;Prometheus&amp;lt;/i&amp;gt; stores metrics it pulls from various &amp;lt;i&amp;gt;Exporters&amp;lt;/i&amp;gt; on nodes&amp;lt;/p&amp;gt;&amp;lt;p&amp;gt;&amp;lt;i&amp;gt;Promtail&amp;lt;/i&amp;gt; on nodes pushes logs to &amp;lt;i&amp;gt;Loki&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&amp;lt;/i&amp;gt;&amp;lt;/p&amp;gt;&amp;lt;p&amp;gt;We try to keep the system as simple as possible: All monitoring and alerting runs on a single machine.&amp;lt;/p&amp;gt;&amp;lt;h2&amp;gt;Changes&amp;lt;/h2&amp;gt;&amp;lt;div&amp;gt;v3: Fix WireGuard/Wireshark braino, Update Auth (Google, not GitHub)&amp;lt;/div&amp;gt;&amp;lt;div&amp;gt;&amp;lt;br&amp;gt;&amp;lt;/div&amp;gt;&amp;lt;div&amp;gt;v2: Add Github authentication to Grafana. 
Add management VPN.&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;v1: Initial version&amp;lt;/div&amp;gt;\&quot; style=\&quot;text;html=1;strokeColor=none;fillColor=none;spacing=5;spacingTop=-20;whiteSpace=wrap;overflow=hidden;rounded=0;shadow=0;comic=0;sketch=0;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;596\&quot; y=\&quot;315\&quot; width=\&quot;164\&quot; height=\&quot;605\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-65\&quot; value=\&quot;Send alerts\&quot; style=\&quot;shape=flexArrow;endArrow=classic;html=1;strokeColor=#9673a6;strokeWidth=1;fillColor=#e1d5e7;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1086\&quot; y=\&quot;405.46000000000004\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1236\&quot; y=\&quot;375.46000000000004\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;UserObject label=\&quot;Monitoring server\&quot; link=\&quot;https://monitoring.private.storage/\&quot; id=\&quot;2mYkRctJDop23S32jJdh-3\&quot;&gt;&lt;mxCell style=\&quot;text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;845\&quot; y=\&quot;411\&quot; width=\&quot;120\&quot; height=\&quot;20\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;/UserObject&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-44\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;806\&quot; y=\&quot;695\&quot; width=\&quot;50\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;894.6666666666666\&quot; y=\&quot;695\&quot; 
as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;826\&quot; y=\&quot;835\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;vhdg0YFc32S7_3H95Ew1-7\&quot; value=\&quot;GSuite&amp;lt;br&amp;gt;OAuth2\&quot; style=\&quot;ellipse;shape=cloud;whiteSpace=wrap;html=1;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;984\&quot; y=\&quot;305\&quot; width=\&quot;98\&quot; height=\&quot;60\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;vhdg0YFc32S7_3H95Ew1-8\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1032.76\&quot; y=\&quot;417\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1032.76\&quot; y=\&quot;372\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;/root&gt;&lt;/mxGraphModel&gt;&lt;/diagram&gt;&lt;/mxfile&gt;&quot;,&quot;toolbar&quot;:&quot;pages zoom layers lightbox&quot;,&quot;page&quot;:0}"></div>
<script type="text/javascript" src="https://app.diagrams.net/js/viewer-static.min.js"></script>
</body>
</html>
......@@ -17,6 +17,19 @@ Analyzing long-term trends
How big is my database and how fast is it growing? How quickly is my daily-active user count growing?
Architecture
````````````
Below you find a diagram of the software and systems that comprise our monitoring and alerting infrastructure.
It has intentionally been kept simple, yet is already surprisingly complex (at least if you are new to the monitoring world).
The software stack is industry standard and chosen so it would be easy to find solutions to problems and people who can help out.
Log in to `staging <https://monitoring.privatestorage-staging.com/>`_ and `production <https://monitoring.private.storage/>`_ Grafana via your GSuite Private.Storage session.
.. raw:: html
:file: monitoring-architecture.html
Introduction to our dashboards
``````````````````````````````
......
.. include::
../../morph/README.rst
Stripe
======
We use Stripe for payment processing.
We have test-mode keys for use in staging and live-mode keys for use in production.
There is "payment link" state in Stripe to facilitate the payment workflow.
This was created with ``admin/create-payment-link.sh``
(once for test-mode and once for live-mode).
The payment links can be found in PrivateStorageOps.
They are the values of the stage3 variables ``stripe_payment_link_staging`` and ``stripe_payment_link_production``.
There is also "webhook" state in Stripe so that PaymentServer receives notification of payment.
This was created with ``admin/create-webhook.sh``
(once for test-mode and once for live-mode).
The test-mode webhook is ``we_1LxwKnBHXBAMm9bPDJXJNcDN``.
The live-mode webhook is ``we_1LzioNA9OAm23rYOmAcp3V85``.
The webhook secrets can be found with the rest of each grid's private keys in ``stripe.webhook-secret``.
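To illustrate what the webhook secret is for, here is a minimal sketch of how a receiver can check Stripe's ``Stripe-Signature`` header.
It is not PaymentServer's code.
It only follows Stripe's published scheme, in which the header carries a timestamp ``t`` and one or more ``v1`` signatures, each the hex HMAC-SHA256 of ``<t>.<raw body>`` keyed with the webhook secret.
The function name, the ``now`` parameter and the 300 second tolerance are assumptions made for this example.

```python
import hashlib
import hmac
import time


def verify_stripe_signature(payload, header, secret, tolerance=300, now=None):
    """Return True if `header` carries a valid, fresh v1 signature for `payload`.

    payload: raw request body as bytes (must not be re-serialized).
    header: value of the Stripe-Signature header, e.g. "t=...,v1=...".
    secret: the webhook secret (illustrative name; any string works here).
    """
    # Split "k=v,k=v" into pairs, ignoring malformed items.
    pairs = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamp = next((v for k, v in pairs if k.strip() == "t"), None)
    candidates = [v for k, v in pairs if k.strip() == "v1"]
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False

    # Reject events whose timestamp is too far from the current time.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False

    # The signed message is the timestamp, a dot, and the raw body.
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()

    # Constant-time comparison against every v1 signature in the header.
    return any(
        hmac.compare_digest(expected.encode(), candidate.encode())
        for candidate in candidates
    )
```

Because each webhook has its own secret, a test-mode event will not verify against the live-mode secret and vice versa.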
!.gitignore
Developer documentation
=======================
Building
--------
The build system uses `Nix`_ which must be installed before anything can be built.
Start by setting up the development/operations environment::
$ nix-shell
Testing
-------
The test system uses `Nix`_ which must be installed before any tests can be run.
Unit tests are run using this command::
$ nix-build nixos/unit-tests.nix
Unit tests are also run on CI.
The system tests are run using this command::
$ nix-build nixos/system-tests.nix
The system tests boot QEMU VMs which prevents them from running on CI at this time.
The build requires > 10 GB of disk space,
and the VMs might time out on slow or busy machines.
If you run into timeouts,
try `raising the number of retries <https://github.com/PrivateStorageio/PrivateStorageio/blob/e8233d2/nixos/modules/tests/run-introducer.py#L55-L62>`_.
It is also possible to go through the testing script interactively - useful for debugging::
$ nix-build -A private-storage.driver nixos/system-tests.nix
This will give you a result symlink in the current directory.
Inside that is bin/nixos-test-driver which gives you a kind of REPL for interacting with the VMs.
The kind of `Python in this testScript <https://github.com/PrivateStorageio/PrivateStorageio/blob/78881a3/nixos/modules/tests/private-storage.nix#L180>`_ is what you can enter into this REPL.
Consult the `official documentation on NixOS Tests <https://nixos.org/manual/nixos/stable/index.html#sec-nixos-tests>`_ for more information.
Architecture overview
---------------------
.. graphviz:: architecture-overview.dot
.. include::
../../../morph/grid/local/README.rst
.. _Nix: https://nixos.org/nix
digraph subscriptions {
rankdir=LR
subgraph cluster_usercontrolled {
label = "User Operated"
rankdir=LR
GridSync [label="GridSync", shape=circle]
Browser [label="Browser", shape=circle]
TahoeLAFS [label="Tahoe-LAFS", shape=circle]
}
subgraph cluster_pscontrolled {
label = "PrivateStorage.io Operated"
rankdir = TB
PSWebServer [label="PrivateStorage.io Web Server", shape=box]
SubscriptionConfigWHPeer [label="Subscription Config Wormhole Peer", shape=box]
PaymentServer [label="Payment Server", shape=box]
SATIssuer [label="SAT Issuer", shape=box]
PSStorageGrid [label="PrivateStorage.io Storage Grid", shape=box]
}
User [label="User", shape=egg]
Stripe [label="Stripe", shape=pentagon]
User -> PSWebServer [label="1. Get wormhole code", fontcolor=red, color=red]
PSWebServer -> User [label="2. 7-petulant-banana", fontcolor=blue, color=blue]
User -> GridSync [label="3. 7-petulant-banana", fontcolor=brown, color=brown]
GridSync -> SubscriptionConfigWHPeer [label="4. Get configuration", fontcolor=black, color=black]
SubscriptionConfigWHPeer -> GridSync [label="5. Grid configuration", fontcolor=magenta, color=magenta]
GridSync -> TahoeLAFS [label="6. Instantiate", fontcolor=aquamarine3, color=aquamarine3]
GridSync -> TahoeLAFS [label="7. Redeem PRN", fontcolor=crimson, color=crimson]
TahoeLAFS -> PaymentServer [label="8. Redeem PRN", fontcolor=crimson, color=crimson]
PaymentServer -> TahoeLAFS [label="9. Payment required", fontcolor=gold3, color=gold3]
TahoeLAFS -> GridSync [label="10. Payment required", fontcolor=gold3, color=gold3]
GridSync -> Browser [label="11. Open payment window", fontcolor=gold3, color=gold3]
User -> Browser [label="12. Enter payment info", fontcolor=blue, color=blue]
Browser -> Stripe [label="13. Submit payment form", fontcolor=brown, color=brown]
Stripe -> Browser [label="14. Payment ok", fontcolor=black, color=black]
Stripe -> PaymentServer [label="15. Payment notification", fontcolor=magenta, color=magenta]
GridSync -> TahoeLAFS [label="16. Redeem PRN", fontcolor=aquamarine3, color=aquamarine3]
TahoeLAFS -> TahoeLAFS [label="17. Generate blinded tokens", fontcolor=crimson, color=crimson]
TahoeLAFS -> SATIssuer [label="18. Redeem PRN, blinded-tokens=xs", fontcolor=crimson, color=crimson]
SATIssuer -> PaymentServer [label="19. Check PRN", fontcolor=gold3, color=gold3]
PaymentServer -> SATIssuer [label="20. PRN Valid", fontcolor=gold3, color=gold3]
SATIssuer -> TahoeLAFS [label="21. PRN valid, signed-tokens=ys", fontcolor=crimson, color=crimson]
TahoeLAFS -> TahoeLAFS [label="22. Store signed tokens", fontcolor=crimson, color=crimson]
TahoeLAFS -> GridSync [label="23. PRN Redeemed", fontcolor=red, color=red]
TahoeLAFS -> PSStorageGrid [label="24. Use storage, passes=y", fontcolor=magenta, color=magenta]
}
......@@ -36,14 +36,20 @@ lib
---
This contains Nix library code for defining the grids.
It has all the details of how each type of node in our grid is configured.
It knows about morph (so defines ``deployment.secrets`` and has the logic for collecting data defined by other nodes).
It defines options (i.e. ``grid.*``) for things specific to how we configure grids (e.g. ``grid.publicKeyPath``).
It defines metadata about nodes that we use on other nodes (e.g. ``grid.monitoringvpnIPv4`` which is used to define various things on the monitoring node).
Each top-level module here defines one type of node with all (or at least most) of the configuration necessary for that node.
grid
----
Specific grid definitions live in subdirectories beneath this directory.
They consist almost exclusively of setting options defined in ``morph/lib`` (and a few options defined elsewhere) and then delegating to the ``morph/lib`` modules.
secrets
~~~~~~~
private-keys
~~~~~~~~~~~~
This must be created and populated before the grid can be built or deployed.
......@@ -55,10 +61,44 @@ This path is **ignored** by git.
The intended workflow is that the secrets will be maintained on secure storage and a symlink to the correct location created here.
This keeps the secrets themselves out of the git working tree as an extra protection against unintentionally committing them.
An exception is the ``secrets`` directory in the ``local`` morph grid:
An exception is the ``private-keys`` directory in the ``local`` morph grid:
That directory is fully populated, provided as an example, and mostly not very secret.
Do not deploy these keys to machines reachable via the internet.
Strictly speaking,
this path is configurable in the grid's ``config.json`` but all three grids currently use this name.
public-keys
~~~~~~~~~~~
This must be created and populated before the grid can be built or deployed.
This directory contains any public key material necessary for operation of the grid.
This includes the public keys corresponding to any private keys held in ``private-keys``.
As for ``private-keys``,
this path can be configured in the grid's ``config.json``.
Star-crossed Keys
^^^^^^^^^^^^^^^^^
Where the system uses keypairs,
the public and private parts of those keypairs are stored in different locations
(``public-keys`` and ``private-keys`` mentioned above).
This somewhat complicates key management because any key rotation involves changing key material in two locations instead of just one.
This complication is balanced against a specific operational goal:
that our build systems operate without copies of our private keys.
Our system configurations do currently have build-time dependencies on public keys.
Splitting public keys and private keys across two different storage locations provides a simple mechanism for providing build systems with the public keys but withholding the private keys.
In the future we may:
* be sufficiently confident in the security of our build systems to let them have our private keys; or
* remove the dependency upon public keys from the build process.
Either of these directions would let us re-unify public/private-key storage and remove this complication.
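The operational goal above (build systems never hold private keys) can be checked mechanically.
The following is a minimal sketch, not part of the repository: it walks a checkout and reports any populated directory named ``private-keys``.
The function name and its parameters are hypothetical, and the directory name is only the default described above, since each grid's ``config.json`` can change it.

```python
from pathlib import Path


def find_private_key_dirs(build_root, private_dir_name="private-keys"):
    """Return populated private-key directories below `build_root`.

    Paths are returned relative to `build_root`, sorted.
    `private_dir_name` is an assumption: it mirrors the default directory
    name used by the grids, not a value read from config.json.
    """
    root = Path(build_root)
    hits = []
    for candidate in root.rglob(private_dir_name):
        # is_dir() follows symlinks, so a link to secure storage counts too;
        # a dangling link or an empty directory holds no key material.
        if candidate.is_dir() and any(candidate.iterdir()):
            hits.append(candidate.relative_to(root))
    return sorted(hits)
```

On a build system this is expected to return an empty list for every grid except ``local``, whose example keys are checked in on purpose.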
config.json
~~~~~~~~~~~
......
{ pkgs ? import ../nixpkgs.nix {} }:
let
lib = pkgs.lib;
gridlib = import ./lib;
inherit (gridlib.pkgs) ourpkgs;
grids-path = "${builtins.toString ./.}/grid";
grid-configs = lib.mapAttrs (n: v: grids-path + "/${n}/grid.nix") (lib.filterAttrs (n: v: v == "directory") (builtins.readDir grids-path));
# It would be useful if morph exposed this as a function.
# https://github.com/DBCDK/morph/pull/166
morph-eval = networkExpr: (import "${pkgs.morph.lib}/eval-machines.nix") { inherit networkExpr; };
grids = lib.mapAttrs (n: v: (morph-eval v)) grid-configs;
# Derivation with symlinks to the morph output for each grid.
output = pkgs.runCommand "privatestorage-morph"
{ preferLocalBuild = true; allowSubstitutes = false; passthru = { inherit gridlib ourpkgs grids; }; }
''
mkdir $out
${lib.concatStringsSep "\n" (
lib.mapAttrsToList (
name: morph:
let
output = morph.machines {
# It would be nice if we didn't need to write this data to a file.
# https://github.com/DBCDK/morph/pull/186
argsFile = pkgs.writeText "args" (builtins.toJSON { Names = lib.attrNames morph.nodes; });
};
in
''
ln -s ${output} $out/${lib.escapeShellArg name}
''
) grids
)}'';
in output
.vagrant
/.vagrant
/public-keys/users.nix
......@@ -8,14 +8,18 @@ Issues with networking that looked like guest misconfigurations vanished after c
This requires `NixOS <https://nixos.org/>`_.
Nix without the OS will not work.
Use the local development environment
`````````````````````````````````````
0. Add VirtualBox to your NixOS system configuration at ``/etc/nixos/configuration.nix``::
0. Add to your NixOS system configuration at ``/etc/nixos/configuration.nix`` (and rebuild)::
virtualisation.virtualbox.host.enable = true;
# Save bytes and build time, optional but recommended:
virtualisation.virtualbox.host.headless = true;
# Enable libvirt - likely incompatible with virtualisation.virtualbox!
virtualisation.libvirtd.enable = true;
# Required for LibVirt
security.polkit.enable = true;
# Enable HW acceleration if (nested virtualisation is) available
#boot.kernelModules = [ "kvm-amd" "kvm-intel" ];
1. Enter the morph local grid directory::
......@@ -27,26 +31,27 @@ Use the local development environment
3. Build and start the VMs::
VAGRANT_DEFAULT_PROVIDER=virtualbox vagrant up
vagrant up --provider=libvirt
4. Then, add the Vagrant SSH configuration to your user's ``~/.ssh/config`` file::
Optionally, to switch from QEMU to KVM virtualization, edit the virtual machine definition of all the machines and replace the "qemu" on the first line with "kvm"::
install -d ~/.ssh ; vagrant ssh-config >> ~/.ssh/config
sudo virsh list
sudo virsh edit <machine id> (once for every machine)
vagrant halt
vagrant up
5. Edit the generated configuration: Add the ``publicIP`` addresses from ``grid.nix`` to ssh config **Host** match blocks (**not** HostName) so the ``Host`` lines all read like::
Host payments 192.168.67.21
HostName 127.0.0.1
User vagrant
[...]
4. Then, add the Vagrant SSH configuration to your user's ``~/.ssh/config`` file::
install -d ~/.ssh ; vagrant ssh-config >> ~/.ssh/config
Latest Morph honors the ``SSH_CONFIG_FILE`` environment variable (`since 3f90aa88 (March 2020, v 1.5.0) <https://github.com/DBCDK/morph/commit/3f90aa885fac1c29fce9242452fa7c0c505744ef#diff-d155ad793bd62e6ea4c44ba985049ecb13a4f4f32f799791b2bce695a16c0101>`_), so in the future this should get a bit more convenient.
6. Add your SSH key to ``users.nix`` so you'll be able to log in after deploying the new configuration::
5. Create a ``public-keys/users.nix`` file with your SSH key (see ``public-keys/users.nix.example`` for the format) so you'll be able to log in after deploying the new configuration::
$EDITOR secrets/users.nix
$EDITOR public-keys/users.nix
7. Then, build and deploy our software to the Vagrant VMs::
6. Then, build and deploy our software to the Vagrant VMs::
morph build grid.nix
morph push grid.nix
......@@ -56,4 +61,3 @@ Use the local development environment
morph upload-secrets grid.nix
You should now be able to log in with the users and keys you set in your ``users.nix`` file.