Skip to content
Snippets Groups Projects

Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Source

Select target project
No results found

Target

Select target project
  • privatestorage/PrivateStorageio
  • tomprince/PrivateStorageio
2 results
Show changes
Showing
with 644 additions and 126 deletions
$HEADLINE
=========
*The goal is to do the least design we can get away with while still making a quality product.*
*Think of this as a tool to help define the problem, analyze solutions, and share results.*
*Feel free to skip sections that you don't think are relevant*
*(but say that you are doing so).*
*Delete the bits in italics*
**Contacts:** *The primary contacts for this design.*
**Date:** *The last time this design was modified. YYYY-MM-DD*
*Short description of the feature.*
*Consider clarifying by also describing what it is not.*
Rationale
---------
*Why are we doing this now?*
*What value does this give to our users?*
*Which users?*
User Stories
------------
**$STORY NAME**
**Category:** *must / nice to have / must not*
As a **$PERSON** I want **$FEATURE** so that **$BENEFIT**.
**Acceptance Criteria:**
* *What concrete conditions must be met for the implementation to be acceptable?*
* *Surface assumptions about the user story that may not be shared across the team.*
* *Describe failure modes and negative scenarios when preconditions for using the feature are not met.*
* *Place the story in a performance/scaling context with real numbers.*
*Have as many as you like.*
*Group user stories together into meaningfully deliverable units.*
*Gather Feedback*
-----------------
*It might be a good idea to stop at this point & get feedback to make sure you're solving the right problem.*
Alternatives Considered
-----------------------
*What we've considered.*
*What trade-offs are involved with each choice.*
*Why we've chosen the one we did.*
Detailed Implementation Design
------------------------------
*Focus on:*
* external and internal interfaces
* how externally-triggered system events (e.g. sudden reboot; network congestion) will affect the system
* scalability and performance
Data Integrity
~~~~~~~~~~~~~~
*If we get this wrong once, we lose forever.*
*What data does the system need to operate on?*
*How will old data be upgraded to meet the requirements of the design?*
*How will data be upgraded to future versions of the implementation?*
Security
~~~~~~~~
*What threat model does this design take into account?*
*What new attack surfaces are added by this design?*
*What defenses are deployed with the implementation to keep those surfaces safe?*
Backwards Compatibility
~~~~~~~~~~~~~~~~~~~~~~~
*What existing systems are impacted by these changes?*
*How does the design ensure they will continue to work?*
Performance and Scalability
~~~~~~~~~~~~~~~~~~~~~~~~~~~
*How will performance of the implementation be measured?*
*After measuring it, record the results here.*
Further Reading
---------------
*Links to related things.*
*Other designs, tickets, epics, mailing list threads, etc.*
......@@ -6,13 +6,16 @@
Welcome to PrivateStorageio's documentation!
============================================
Howdy! We separated the documentation into parts addressing different audiences. Please enjoy our docs for:
Howdy!
We separated the documentation into parts addressing different audiences.
Please enjoy our docs for:
.. toctree::
:maxdepth: 2
Administrators <ops/README>
Developers <dev/README>
System Designs <dev/designs/index>
Naming
......
......@@ -3,11 +3,11 @@ Administrator documentation
This contains documentation regarding running PrivateStorageio.
.. include::
../../../morph/README.rst
.. include::
monitoring.rst
.. include::
generating-keys.rst
.. toctree::
:maxdepth: 2
morph
monitoring
generating-keys
backup-recovery
stripe
Backup/Recovery
===============
This document covers the details of backups of the data required for PrivateStorageio to operate.
It describes the situations in which these backups are intended to be useful.
It also explains how to use these backups to recover in these situations.
Tahoe-LAFS Storage Nodes
------------------------
The state associated with a Tahoe-LAFS storage node consists of at least:
1. the "node directory" containing
configuration,
logs,
public and private keys,
and service fURLs.
2. the "storage" directory containing
user ciphertext,
garbage collector state,
and corruption advisories.
Node Directories
~~~~~~~~~~~~~~~~
The "node directory" changes gradually over time.
New logs are written (including incident reports).
The announcement sequence number is incremented.
The introducer cache is updated.
The critical state necessary to reproduce an identical storage node does not change.
This state consists of
* the node id (my_nodeid)
* the node private key (private/node.privkey)
* the node x509v3 certificate (private/node.pem)
A backup of the node directory can be used to create a Tahoe-LAFS storage node with the same identity as the original storage node.
It *cannot* be used to recover the user ciphertext held by the original storage node.
Nor will it recover the state which gradually changes over time.
Backup
``````
A one-time backup has been made of these directories in the PrivateStorageio 1Password account.
The "Tahoe-LAFS Storage Node Backups" vault contains backups of staging and production node directories.
The process for creating these backups is as follows:
::
DOMAIN=private.storage
FILES="node.pubkey private/ tahoe.cfg my_nodeid tahoe-client.tac node.url permutation-seed"
DIR=/var/db/tahoe-lafs/storage
for n in $(seq 1 5); do
NODE=storage00${n}.${DOMAIN}
ssh $NODE tar vvjcf - -C $DIR $FILES > ${NODE}.tar.bz2
done
tar vvjcf ${DOMAIN}.tar.bz2 *.tar.bz2
Recovery
````````
#. Prepare a system onto which to recover the node directory.
The rest of these steps assume that PrivateStorageio is deployed on the node.
#. Download the backup tarball from 1Password
#. Extract the particular node directory backup to recover from ::
[LOCAL]$ tar xvf ${DOMAIN}.tar.bz2 ${NODE}.${DOMAIN}.tar.bz2
#. Upload the node directory backup to the system onto which recovery is taking place ::
[LOCAL]$ scp ${NODE}.${DOMAIN}.tar.bz2 ${NODE}.${DOMAIN}:recovery.tar.bz2
#. Clean up the local copies of the backup files ::
[LOCAL]$ rm -iv ${DOMAIN}.tar.bz2 ${NODE}.${DOMAIN}.tar.bz2
#. The rest of the steps are executed on the system on which recovery is taking place.
Log in ::
[LOCAL]$ ssh ${NODE}.${DOMAIN}
#. On the node make sure there is no storage service running ::
[REMOTE]$ systemctl status tahoe.storage.service
If there is then figure out why and stop it if it is safe to do so ::
[REMOTE]$ systemctl stop tahoe.storage.service
#. On the node make sure there is no existing node directory ::
[REMOTE]$ stat /var/db/tahoe-lafs/storage
If there is then figure out why and remove it if it is safe to do so.
#. Unpack the node directory backup into the correct location ::
[REMOTE]$ mkdir -p /var/db/tahoe-lafs/storage
[REMOTE]$ tar xvf recovery.tar.bz2 -C /var/db/tahoe-lafs/storage
#. Mark the node directory as created and consistent ::
[REMOTE]$ touch /var/db/tahoe-lafs/storage.created
#. Start the storage service ::
[REMOTE]$ systemctl start tahoe.storage.service
#. Clean up the remote copies of the backup file ::
[REMOTE]$ rm -iv recovery.tar.bz2
Storage Directories
~~~~~~~~~~~~~~~~~~~
The user ciphertext is backed up using `Borg backup <https://borgbackup.readthedocs.io/>`_ to a separate location - currently a SaaS backup storage service (`borgbase.com <https://borgbase.com>`_).
Borg backup uses a *RepoKey* secured by a *passphrase* to encrypt the backup data and an *SSH key* to authenticate against the backup storage service.
Each Borg backup job requires one *backup repository*.
The backups are automatically checked periodically.
SSH keys
````````
Borgbase `recommends creating ed25519 ssh keys with one hundred KDF rounds <https://www.borgbase.com/ssh>`_.
We create one key pair per grid (not per host)::
$ ssh-keygen -f borgbackup-appendonly-staging -t ed25519 -a 100
$ ssh-keygen -f borgbackup-appendonly-production -t ed25519 -a 100
Save the key without a passphrase and upload the public part to `Borgbase SSH keys <https://www.borgbase.com/ssh>`_.
Passphrase
``````````
Make up a passphrase to encrypt our repository key with. Use computer help if you like::
nix-shell --packages pwgen --command 'pwgen --secure 83 1' # 83 is the year I was born. Very random.
Create & initialize the backup repository
`````````````````````````````````````````
Borgbase.com offers a `borgbase.com GraphQL API <https://docs.borgbase.com/api/>`_.
Since our current number of repositories is small we save time by creating the repositories by clicking a few buttons in the `borgbase.com Web Interface <https://www.borgbase.com/repositories>`_:
* Set up one repository per backup job.
* Set the *Repository Name* to the FQDN of the host to be backed up.
* Add the SSH key created earlier as *Append-Only Access* key.
* Leave the other settings at their defaults.
Then initialize those repositories with our chosen parameters::
export BORG_PASSCOMMAND="cat borgbackup-passphrase-staging"
export BORG_RSH="ssh -i borgbackup-appendonly-staging"
borg init -e repokey-blake2 xyxyx123@xyxyx123.repo.borgbase.com:repo
Reliability checks
``````````````````
Borg handles large amounts of data.
Given enough bits rare, spurious bit flips become a problem.
That is why regular runs of ``borg check`` are recommended
(see the `borgbase FAQ <https://docs.borgbase.com/faq/#how-often-should-i-run-borg-check>`_).
Recovery
````````
Borg offers various methods to restore backups.
A very convenient method is to mount a backup set using FUSE.
Please consult the restore documentation at `Borgbase <https://docs.borgbase.com/restore/>`_ and `Borg <https://borgbackup.readthedocs.io/en/stable/usage/mount.html>`_.
......@@ -42,17 +42,6 @@ For example::
echo -n "SILOWzbnkBjxC1hGde9d5Q3Ir/4yLosCLEnEQGAxEQE=" > ristretto.signing-key
ZKAP-Issuer TLS
```````````````
The ZKAPIssuer.service needs a working TLS certificate and expects it in the certbot directory for the domain you configured, in my case::
openssl req -x509 -newkey rsa:4096 -nodes -keyout privkey.pem -out cert.pem -days 3650
touch chain.pem
Move the three .pem files into the payment's server ``/var/lib/letsencrypt/live/payments.localdev/`` directory and issue a ``sudo systemctl restart zkapissuer.service``.
Monitoring VPN
``````````````
......
<mxfile host="app.diagrams.net" modified="2023-04-20T20:17:44.466Z" agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36" etag="8PyLTVr0G94q4Dna4Dsz" version="21.2.1" type="device">
<diagram name="Page-1" id="aaaa8250-4180-3840-79b5-4cada1eebb92">
<mxGraphModel dx="794" dy="476" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="1920" pageHeight="1200" background="#ffffff" math="0" shadow="0">
<root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="vhdg0YFc32S7_3H95Ew1-1" value="&lt;span&gt;Management VPN&lt;br&gt;(Wireshark, TINC...)&lt;/span&gt;" style="ellipse;shape=cloud;whiteSpace=wrap;html=1;" parent="1" vertex="1">
<mxGeometry x="780" y="720" width="450" height="110" as="geometry" />
</mxCell>
<mxCell id="2mYkRctJDop23S32jJdh-2" value="" style="rounded=0;whiteSpace=wrap;html=1;" parent="1" vertex="1">
<mxGeometry x="840" y="405.46" width="370" height="304.54" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-3" value="Loki" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.application2;fillColor=#86E83A;strokeColor=#B0F373;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="1116" y="592.9000000000001" width="62" height="53" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-4" value="Prometheus" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.application;fillColor=#4286c5;strokeColor=#57A2D8;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="866" y="585" width="62" height="68.8" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-5" value="Grafana" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.ami2;aspect=fixed;fillColor=#FF9900;strokeColor=#ffffff;" parent="1" vertex="1">
<mxGeometry x="996" y="425" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-6" value="Node 1" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="779" y="845" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-7" value="Operator" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.end_user;strokeColor=#9673a6;fillColor=#e1d5e7;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="1276" y="305" width="49" height="100.46" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-16" value="Node ..." style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="902" y="845" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-17" value="Node ..." style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="1025" y="845" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-18" value="Node N" style="verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;" parent="1" vertex="1">
<mxGeometry x="1147.5" y="845" width="74" height="50" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-45" value="" style="endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;" parent="1" edge="1">
<mxGeometry x="806" y="695" width="50" height="50" as="geometry">
<mxPoint x="894.6666666666666" y="695" as="sourcePoint" />
<mxPoint x="936" y="835" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-46" value="" style="endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;" parent="1" edge="1">
<mxGeometry x="806" y="695" width="50" height="50" as="geometry">
<mxPoint x="894.6666666666666" y="695" as="sourcePoint" />
<mxPoint x="1046" y="835" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-47" value="" style="endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;" parent="1" edge="1">
<mxGeometry x="806" y="695" width="50" height="50" as="geometry">
<mxPoint x="894.6666666666666" y="695" as="sourcePoint" />
<mxPoint x="1166" y="835" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-50" value="" style="endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;" parent="1" edge="1">
<mxGeometry x="818.6666666666667" y="695" width="63.33333333333333" height="75" as="geometry">
<mxPoint x="846" y="835" as="sourcePoint" />
<mxPoint x="1148" y="695" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-51" value="" style="endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;" parent="1" edge="1">
<mxGeometry x="818.6666666666667" y="695" width="63.33333333333333" height="75" as="geometry">
<mxPoint x="946" y="835" as="sourcePoint" />
<mxPoint x="1148" y="695" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-52" value="" style="endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;" parent="1" edge="1">
<mxGeometry x="818.6666666666667" y="695" width="63.33333333333333" height="75" as="geometry">
<mxPoint x="1056" y="835" as="sourcePoint" />
<mxPoint x="1148" y="695" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-53" value="" style="endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;" parent="1" edge="1">
<mxGeometry x="818.6666666666667" y="695" width="63.33333333333333" height="75" as="geometry">
<mxPoint x="1186" y="835" as="sourcePoint" />
<mxPoint x="1148" y="695" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-57" value="" style="endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="988" y="495" as="sourcePoint" />
<mxPoint x="928" y="565" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-58" value="" style="endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="1078" y="495" as="sourcePoint" />
<mxPoint x="1136" y="565" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-61" value="View dashboards&lt;br&gt;in browser" style="shape=flexArrow;endArrow=classic;html=1;strokeColor=#9673a6;strokeWidth=1;fillColor=#e1d5e7;spacing=8;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="1259.5" y="415" as="sourcePoint" />
<mxPoint x="1109.5" y="445" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-63" value="&lt;h1&gt;Monitoring architecture&amp;nbsp;&lt;/h1&gt;&lt;p&gt;Keep it simple, sunshine!&lt;br&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;i&gt;Grafana&lt;/i&gt; retrieves metrics from &lt;i&gt;Prometheus&lt;/i&gt; and logs from&amp;nbsp;&lt;i&gt;Loki&lt;/i&gt;,&amp;nbsp;&lt;span&gt;shows dashboards (web) and does alerting (via eMail? Slack?)&lt;/span&gt;&lt;br&gt;&lt;p&gt;&lt;i&gt;Prometheus&lt;/i&gt; stores metrics it pulls from various &lt;i&gt;Exporters&lt;/i&gt; on nodes&lt;/p&gt;&lt;p&gt;&lt;i&gt;Promtail&lt;/i&gt; on nodes pushes logs to &lt;i&gt;Loki&lt;br&gt;&lt;br&gt;&lt;/i&gt;&lt;/p&gt;&lt;p&gt;We try to keep the system as simple as possible: All monitoring and alerting runs on a single machine.&lt;/p&gt;&lt;h2&gt;Changes&lt;/h2&gt;&lt;div&gt;v2: Add Github authentication to Grafana. Add management VPN.&lt;br&gt;&lt;br&gt;v1: Initial version&lt;/div&gt;" style="text;html=1;strokeColor=none;fillColor=none;spacing=5;spacingTop=-20;whiteSpace=wrap;overflow=hidden;rounded=0;shadow=0;comic=0;sketch=0;" parent="1" vertex="1">
<mxGeometry x="596" y="315" width="164" height="545" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-65" value="Send alerts" style="shape=flexArrow;endArrow=classic;html=1;strokeColor=#9673a6;strokeWidth=1;fillColor=#e1d5e7;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="1086" y="405.46000000000004" as="sourcePoint" />
<mxPoint x="1236" y="375.46000000000004" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="2mYkRctJDop23S32jJdh-3" value="Monitoring server" style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;" parent="1" vertex="1">
<mxGeometry x="845" y="411" width="120" height="20" as="geometry" />
</mxCell>
<mxCell id="TrmSFti5pUnXIGkjMKb6-44" value="" style="endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;" parent="1" edge="1">
<mxGeometry x="806" y="695" width="50" height="50" as="geometry">
<mxPoint x="894.6666666666666" y="695" as="sourcePoint" />
<mxPoint x="826" y="835" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="vhdg0YFc32S7_3H95Ew1-7" value="GitHub&lt;br&gt;OAuth2" style="ellipse;shape=cloud;whiteSpace=wrap;html=1;" parent="1" vertex="1">
<mxGeometry x="984" y="305" width="98" height="60" as="geometry" />
</mxCell>
<mxCell id="vhdg0YFc32S7_3H95Ew1-8" value="" style="endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;" parent="1" edge="1">
<mxGeometry width="50" height="50" relative="1" as="geometry">
<mxPoint x="1032.76" y="417" as="sourcePoint" />
<mxPoint x="1032.76" y="372" as="targetPoint" />
</mxGeometry>
</mxCell>
</root>
</mxGraphModel>
</diagram>
</mxfile>
<!--[if IE]><meta http-equiv="X-UA-Compatible" content="IE=5,IE=9" ><![endif]-->
<!DOCTYPE html>
<html>
<head>
<title>monitoring-architecture.html</title>
<meta charset="utf-8"/>
</head>
<body>
<div class="mxgraph" style="max-width:100%;border:1px solid transparent;" data-mxgraph="{&quot;highlight&quot;:&quot;#0000ff&quot;,&quot;nav&quot;:true,&quot;resize&quot;:true,&quot;xml&quot;:&quot;&lt;mxfile host=\&quot;app.diagrams.net\&quot; modified=\&quot;2023-04-20T20:19:05.428Z\&quot; agent=\&quot;Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36\&quot; etag=\&quot;4ETb3iIJGh8a_GDWm07E\&quot; version=\&quot;21.2.1\&quot; type=\&quot;device\&quot;&gt;&lt;diagram name=\&quot;Page-1\&quot; id=\&quot;aaaa8250-4180-3840-79b5-4cada1eebb92\&quot;&gt;&lt;mxGraphModel dx=\&quot;794\&quot; dy=\&quot;476\&quot; grid=\&quot;1\&quot; gridSize=\&quot;10\&quot; guides=\&quot;1\&quot; tooltips=\&quot;1\&quot; connect=\&quot;1\&quot; arrows=\&quot;1\&quot; fold=\&quot;1\&quot; page=\&quot;1\&quot; pageScale=\&quot;1\&quot; pageWidth=\&quot;1920\&quot; pageHeight=\&quot;1200\&quot; background=\&quot;#ffffff\&quot; math=\&quot;0\&quot; shadow=\&quot;0\&quot;&gt;&lt;root&gt;&lt;mxCell id=\&quot;0\&quot;/&gt;&lt;mxCell id=\&quot;1\&quot; parent=\&quot;0\&quot;/&gt;&lt;mxCell id=\&quot;vhdg0YFc32S7_3H95Ew1-1\&quot; value=\&quot;&amp;lt;span&amp;gt;Management VPN&amp;lt;br&amp;gt;(WireGuard)&amp;lt;/span&amp;gt;\&quot; style=\&quot;ellipse;shape=cloud;whiteSpace=wrap;html=1;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;780\&quot; y=\&quot;720\&quot; width=\&quot;450\&quot; height=\&quot;110\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;2mYkRctJDop23S32jJdh-2\&quot; value=\&quot;\&quot; style=\&quot;rounded=0;whiteSpace=wrap;html=1;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;840\&quot; y=\&quot;405.46\&quot; width=\&quot;370\&quot; height=\&quot;304.54\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-3\&quot; value=\&quot;Loki\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.application2;fillColor=#86E83A;strokeColor=#B0F373;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;1116\&quot; y=\&quot;592.9000000000001\&quot; width=\&quot;62\&quot; height=\&quot;53\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-4\&quot; value=\&quot;Prometheus\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.application;fillColor=#4286c5;strokeColor=#57A2D8;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;866\&quot; y=\&quot;585\&quot; width=\&quot;62\&quot; height=\&quot;68.8\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-5\&quot; value=\&quot;Grafana\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.ami2;aspect=fixed;fillColor=#FF9900;strokeColor=#ffffff;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;996\&quot; y=\&quot;425\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-6\&quot; value=\&quot;Node 1\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;779\&quot; y=\&quot;845\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-7\&quot; value=\&quot;Operator\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.end_user;strokeColor=#9673a6;fillColor=#e1d5e7;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;1276\&quot; y=\&quot;305\&quot; width=\&quot;49\&quot; height=\&quot;100.46\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-16\&quot; value=\&quot;Node ...\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;902\&quot; y=\&quot;845\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-17\&quot; value=\&quot;Node ...\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;1025\&quot; y=\&quot;845\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-18\&quot; value=\&quot;Node N\&quot; style=\&quot;verticalLabelPosition=bottom;html=1;verticalAlign=top;strokeWidth=1;align=center;outlineConnect=0;dashed=0;outlineConnect=0;shape=mxgraph.aws3d.worker;fillColor=#ECECEC;strokeColor=#5E5E5E;aspect=fixed;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;1147.5\&quot; y=\&quot;845\&quot; width=\&quot;74\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-45\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;806\&quot; y=\&quot;695\&quot; width=\&quot;50\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;894.6666666666666\&quot; y=\&quot;695\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;936\&quot; y=\&quot;835\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-46\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;806\&quot; y=\&quot;695\&quot; width=\&quot;50\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;894.6666666666666\&quot; y=\&quot;695\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1046\&quot; y=\&quot;835\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-47\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;806\&quot; y=\&quot;695\&quot; width=\&quot;50\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;894.6666666666666\&quot; y=\&quot;695\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1166\&quot; y=\&quot;835\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-50\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;818.6666666666667\&quot; y=\&quot;695\&quot; width=\&quot;63.33333333333333\&quot; height=\&quot;75\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;846\&quot; y=\&quot;835\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1148\&quot; y=\&quot;695\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-51\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;818.6666666666667\&quot; y=\&quot;695\&quot; width=\&quot;63.33333333333333\&quot; height=\&quot;75\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;946\&quot; y=\&quot;835\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1148\&quot; y=\&quot;695\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-52\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;818.6666666666667\&quot; y=\&quot;695\&quot; width=\&quot;63.33333333333333\&quot; height=\&quot;75\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1056\&quot; y=\&quot;835\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1148\&quot; y=\&quot;695\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-53\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#82b366;strokeWidth=1;fillColor=#d5e8d4;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;818.6666666666667\&quot; y=\&quot;695\&quot; width=\&quot;63.33333333333333\&quot; height=\&quot;75\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1186\&quot; y=\&quot;835\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1148\&quot; y=\&quot;695\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-57\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;988\&quot; y=\&quot;495\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;928\&quot; y=\&quot;565\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-58\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1078\&quot; y=\&quot;495\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1136\&quot; y=\&quot;565\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-61\&quot; value=\&quot;View dashboards&amp;lt;br&amp;gt;in browser\&quot; style=\&quot;shape=flexArrow;endArrow=classic;html=1;strokeColor=#9673a6;strokeWidth=1;fillColor=#e1d5e7;spacing=8;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1259.5\&quot; y=\&quot;415\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1109.5\&quot; y=\&quot;445\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-63\&quot; value=\&quot;&amp;lt;h1&amp;gt;Monitoring architecture&amp;amp;nbsp;&amp;lt;/h1&amp;gt;&amp;lt;p&amp;gt;Keep it simple, sunshine!&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&amp;lt;/p&amp;gt;&amp;lt;p&amp;gt;&amp;lt;/p&amp;gt;&amp;lt;i&amp;gt;Grafana&amp;lt;/i&amp;gt; retrieves metrics from &amp;lt;i&amp;gt;Prometheus&amp;lt;/i&amp;gt; and logs from&amp;amp;nbsp;&amp;lt;i&amp;gt;Loki&amp;lt;/i&amp;gt;,&amp;amp;nbsp;&amp;lt;span&amp;gt;shows dashboards (in a web browser) and does alerting (via Zulip)&amp;lt;/span&amp;gt;&amp;lt;br&amp;gt;&amp;lt;p&amp;gt;&amp;lt;i&amp;gt;Prometheus&amp;lt;/i&amp;gt; stores metrics it pulls from various &amp;lt;i&amp;gt;Exporters&amp;lt;/i&amp;gt; on nodes&amp;lt;/p&amp;gt;&amp;lt;p&amp;gt;&amp;lt;i&amp;gt;Promtail&amp;lt;/i&amp;gt; on nodes pushes logs to &amp;lt;i&amp;gt;Loki&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&amp;lt;/i&amp;gt;&amp;lt;/p&amp;gt;&amp;lt;p&amp;gt;We try to keep the system as simple as possible: All monitoring and alerting runs on a single machine.&amp;lt;/p&amp;gt;&amp;lt;h2&amp;gt;Changes&amp;lt;/h2&amp;gt;&amp;lt;div&amp;gt;v3: Fix WireGuard/Wireshark braino, Update Auth (Google, not GitHub)&amp;lt;/div&amp;gt;&amp;lt;div&amp;gt;&amp;lt;br&amp;gt;&amp;lt;/div&amp;gt;&amp;lt;div&amp;gt;v2: Add Github authentication to Grafana. Add management VPN.&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;v1: Initial version&amp;lt;/div&amp;gt;\&quot; style=\&quot;text;html=1;strokeColor=none;fillColor=none;spacing=5;spacingTop=-20;whiteSpace=wrap;overflow=hidden;rounded=0;shadow=0;comic=0;sketch=0;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;596\&quot; y=\&quot;315\&quot; width=\&quot;164\&quot; height=\&quot;605\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-65\&quot; value=\&quot;Send alerts\&quot; style=\&quot;shape=flexArrow;endArrow=classic;html=1;strokeColor=#9673a6;strokeWidth=1;fillColor=#e1d5e7;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1086\&quot; y=\&quot;405.46000000000004\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1236\&quot; y=\&quot;375.46000000000004\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;UserObject label=\&quot;Monitoring server\&quot; link=\&quot;https://monitoring.private.storage/\&quot; id=\&quot;2mYkRctJDop23S32jJdh-3\&quot;&gt;&lt;mxCell style=\&quot;text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;845\&quot; y=\&quot;411\&quot; width=\&quot;120\&quot; height=\&quot;20\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;/UserObject&gt;&lt;mxCell id=\&quot;TrmSFti5pUnXIGkjMKb6-44\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#6c8ebf;strokeWidth=1;fillColor=#dae8fc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;806\&quot; y=\&quot;695\&quot; width=\&quot;50\&quot; height=\&quot;50\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;894.6666666666666\&quot; y=\&quot;695\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;826\&quot; y=\&quot;835\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;vhdg0YFc32S7_3H95Ew1-7\&quot; value=\&quot;GSuite&amp;lt;br&amp;gt;OAuth2\&quot; style=\&quot;ellipse;shape=cloud;whiteSpace=wrap;html=1;\&quot; parent=\&quot;1\&quot; vertex=\&quot;1\&quot;&gt;&lt;mxGeometry x=\&quot;984\&quot; y=\&quot;305\&quot; width=\&quot;98\&quot; height=\&quot;60\&quot; as=\&quot;geometry\&quot;/&gt;&lt;/mxCell&gt;&lt;mxCell id=\&quot;vhdg0YFc32S7_3H95Ew1-8\&quot; value=\&quot;\&quot; style=\&quot;endArrow=classic;html=1;strokeColor=#d79b00;strokeWidth=1;fillColor=#ffe6cc;\&quot; parent=\&quot;1\&quot; edge=\&quot;1\&quot;&gt;&lt;mxGeometry width=\&quot;50\&quot; height=\&quot;50\&quot; relative=\&quot;1\&quot; as=\&quot;geometry\&quot;&gt;&lt;mxPoint x=\&quot;1032.76\&quot; y=\&quot;417\&quot; as=\&quot;sourcePoint\&quot;/&gt;&lt;mxPoint x=\&quot;1032.76\&quot; y=\&quot;372\&quot; as=\&quot;targetPoint\&quot;/&gt;&lt;/mxGeometry&gt;&lt;/mxCell&gt;&lt;/root&gt;&lt;/mxGraphModel&gt;&lt;/diagram&gt;&lt;/mxfile&gt;&quot;,&quot;toolbar&quot;:&quot;pages zoom layers lightbox&quot;,&quot;page&quot;:0}"></div>
<script type="text/javascript" src="https://app.diagrams.net/js/viewer-static.min.js"></script>
</body>
</html>
......@@ -17,6 +17,19 @@ Analyzing long-term trends
How big is my database and how fast is it growing? How quickly is my daily-active user count growing?
Architecture
````````````
Below you find a diagram of the software and systems that comprise our monitoring and alerting infrastructure.
It has intentionally been kept simple, yet is already surprisingly complex (at least if you are new to the monitoring world).
The software stack is industry standard and chosen so it would be easy to find solutions to problems and people who can help out.
Log in to `staging <https://monitoring.privatestorage-staging.com/>`_ and `production <https://monitoring.private.storage/>`_ Grafana via your GSuite Private.Storage session.
.. raw:: html
:file: monitoring-architecture.html
Introduction to our dashboards
``````````````````````````````
......
.. include::
../../morph/README.rst
Stripe
======
We use Stripe for payment processing.
We have test-mode keys for use in staging and live-mode keys for use in production.
There is "payment link" state in Stripe to facilitate the payment workflow.
This was created with ``admin/create-payment-link.sh``
(once for test-mode and once for live-mode).
The payment links can be found in PrivateStorageOps.
They are the values of the stage3 variables ``stripe_payment_link_staging`` and ``stripe_payment_link_production``.
There is also "webhook" state in Stripe so that PaymentServer receives notification of payment.
This was created with ``admin/create-webhook.sh``
(once for test-mode and once for live-mode).
The test-mode webhook is ``we_1LxwKnBHXBAMm9bPDJXJNcDN``.
The live-mode webhook is ``we_1LzioNA9OAm23rYOmAcp3V85``.
The webhook secrets can be found with the rest of the each grid's private keys in ``stripe.webhook-secret``.
!.gitignore
Developer documentation
=======================
Building
--------
The build system uses `Nix`_ which must be installed before anything can be built.
Start by setting up the development/operations environment::
$ nix-shell
Testing
-------
The test system uses `Nix`_ which must be installed before any tests can be run.
Unit tests are run using this command::
$ nix-build nixos/unit-tests.nix
Unit tests are also run on CI.
The system tests are run using this command::
$ nix-build nixos/system-tests.nix
The system tests boot QEMU VMs which prevents them from running on CI at this time.
The build requires > 10 GB of disk space,
and the VMs might be timing out on slow or busy machines.
If you run into timeouts,
try `raising the number of retries <https://github.com/PrivateStorageio/PrivateStorageio/blob/e8233d2/nixos/modules/tests/run-introducer.py#L55-L62>`_.
It is also possible go through the testing script interactively - useful for debugging::
$ nix-build -A private-storage.driver nixos/system-tests.nix
This will give you a result symlink in the current directory.
Inside that is bin/nixos-test-driver which gives you a kind of REPL for interacting with the VMs.
The kind of `Perl in this testScript <https://github.com/PrivateStorageio/PrivateStorageio/blob/78881a3/nixos/modules/tests/private-storage.nix#L180>`_ is what you can enter into this REPL.
Consult the `official documentation on NixOS Tests <https://nixos.org/manual/nixos/stable/index.html#sec-nixos-tests>`_ for more information.
Architecture overview
---------------------
.. graphviz:: architecture-overview.dot
.. include::
../../../morph/grid/local/README.rst
.. _Nix: https://nixos.org/nix
digraph subscriptions {
rankdir=LR
subgraph cluster_usercontrolled {
label = "User Operated"
rankdir=LR
GridSync [label="GridSync", shape=circle]
Browser [label="Browser", shape=circle]
TahoeLAFS [label="Tahoe-LAFS", shape=circle]
}
subgraph cluster_pscontrolled {
label = "PrivateStorage.io Operated"
rankdir = TB
PSWebServer [label="PrivateStorage.io Web Server", shape=box]
SubscriptionConfigWHPeer [label="Subscription Config Wormhole Peer", shape=box]
PaymentServer [label="Payment Server", shape=box]
SATIssuer [label="SAT Issuer", shape=box]
PSStorageGrid [label="PrivateStorage.io Storage Grid", shape=box]
}
User [label="User", shape=egg]
Stripe [label="Stripe", shape=pentagon]
User -> PSWebServer [label="1. Get wormhole code", fontcolor=red, color=red]
PSWebServer -> User [label="2. 7-petulant-banana", fontcolor=blue, color=blue]
User -> GridSync [label="3. 7-petulant-banana", fontcolor=brown, color=brown]
GridSync -> SubscriptionConfigWHPeer [label="4. Get configuration", fontcolor=black, color=black]
SubscriptionConfigWHPeer -> GridSync [label="5. Grid configuration", fontcolor=magenta, color=magenta]
GridSync -> TahoeLAFS [label="6. Instantiate", fontcolor=aquamarine3, color=aquamarine3]
GridSync -> TahoeLAFS [label="7. Redeem PRN", fontcolor=crimson, color=crimson]
TahoeLAFS -> PaymentServer [label="8. Redeem PRN", fontcolor=crimson, color=crimson]
PaymentServer -> TahoeLAFS [label="9. Payment required", fontcolor=gold3, color=gold3]
TahoeLAFS -> GridSync [label="10. Payment required", fontcolor=gold3, color=gold3]
GridSync -> Browser [label="11. Open payment window", fontcolor=gold3, color=gold3]
User -> Browser [label="12. Enter payment info", fontcolor=blue, color=blue]
Browser -> Stripe [label="13. Submit payment form", fontcolor=brown, color=brown]
Stripe -> Browser [label="14. Payment ok", fontcolor=black, color=black]
Stripe -> PaymentServer [label="15. Payment notification", fontcolor=magenta, color=magenta]
GridSync -> TahoeLAFS [label="16. Redeem PRN", fontcolor=aquamarine3, color=aquamarine3]
TahoeLAFS -> TahoeLAFS [label="17. Generate blinded tokens", fontcolor=crimson, color=crimson]
TahoeLAFS -> SATIssuer [label="18. Redeem PRN, blinded-tokens=xs", fontcolor=crimson, color=crimson]
SATIssuer -> PaymentServer [label="19. Check PRN", fontcolor=gold3, color=gold3]
PaymentServer -> SATIssuer [label="20. PRN Valid", fontcolor=gold3, color=gold3]
SATIssuer -> TahoeLAFS [label="21. PRN valid, signed-tokens=ys", fontcolor=crimson, color=crimson]
TahoeLAFS -> TahoeLAFS [label="22. Store signed tokens", fontcolor=crimson, color=crimson]
TahoeLAFS -> GridSync [label="23. PRN Redeemed", fontcolor=red, color=red]
TahoeLAFS -> PSStorageGrid [label="24. Use storage, passes=y", fontcolor=magenta, color=magenta]
}
......@@ -36,11 +36,17 @@ lib
---
This contains Nix library code for defining the grids.
It has all the details of how each type of node in our grid is configured.
It knows about morph (so defines ``deployment.secrets`` and has the logic for collecting data defined by other nodes).
It defines options (i.e. ``grid.*``) for things specific to how we configure grids (e.g. ``grid.publicKeyPath``).
It defines metadata about nodes that we use on other nodes (e.g. ``grid.monitoringvpnIPv4`` which is used to define various things on the monitoring node).
Each top-level module here defines one type of node with all (or at least most) of the configuration necessary for that node.
grid
----
Specific grid definitions live in subdirectories beneath this directory.
They consist almost exclusively setting options defined in ``morph/lib`` (and few options defined elsewhere) and then delegating to the ``morph/lib`` modules.
private-keys
~~~~~~~~~~~~
......
{ pkgs ? import ../nixpkgs.nix {} }:
let
lib = pkgs.lib;
gridlib = import ./lib;
inherit (gridlib.pkgs) ourpkgs;
grids-path = "${builtins.toString ./.}/grid";
grid-configs = lib.mapAttrs (n: v: grids-path + "/${n}/grid.nix") (lib.filterAttrs (n: v: v == "directory") (builtins.readDir grids-path));
# It would be useful if morph exposed this as a function.
# https://github.com/DBCDK/morph/pull/166
morph-eval = networkExpr: (import "${pkgs.morph.lib}/eval-machines.nix") { inherit networkExpr; };
grids = lib.mapAttrs (n: v: (morph-eval v)) grid-configs;
# Derivation with symlinks to the morph output for each grid.
output = pkgs.runCommand "privatestorage-morph"
{ preferLocalBuild = true; allowSubstitutes = false; passthru = { inherit gridlib ourpkgs grids; }; }
''
mkdir $out
${lib.concatStringsSep "\n" (
lib.mapAttrsToList (
name: morph:
let
output = morph.machines {
# It would be nice if we didn't need to write this data to a file.
# https://github.com/DBCDK/morph/pull/186
argsFile = pkgs.writeText "args" (builtins.toJSON { Names = lib.attrNames morph.nodes; });
};
in
''
ln -s ${output} $out/${lib.escapeShellArg name}
''
) grids
)}'';
in output
private-keys
{ "domain": "deerfield.leastauthority.com"
, "publicStoragePort": 8898
, "privateKeyPath": "./private-keys"
, "publicKeyPath": "./public-keys"
, "monitoringvpnEndpoint": "monitoring.deerfield.leastauthority.com:51820"
, "passValue": 1000000
, "tokensPerVoucher": 150000
, "issuerDomains": [
"payments.deerfield.leastauthority.com"
]
, "monitoringDomains": [
"monitoring.deerfield.leastauthority.com"
]
, "letsEncryptAdminEmail": "infrastructure@leastauthority.com"
, "allowedChargeOrigins": [
"https://leastauthority.com"
]
, "monitoringGoogleOAuthClientID": ""
}
# See morph/grid/local/grid.nix for additional commentary.
let
gridlib = import ../../lib;
grid-config = builtins.fromJSON (builtins.readFile ./config.json);
# Module with per-grid configuration
grid-module = {config, ...}: {
imports = [
gridlib.base
# Allow us to remotely trigger updates to this system.
../../../nixos/modules/deployment.nix
# Give it a good SSH configuration.
../../../nixos/modules/ssh.nix
];
services.private-storage.sshUsers = import ./public-keys/users.nix;
networking.domain = grid-config.domain;
# Convert relative paths to absolute so library code can resolve names
# correctly.
grid = {
publicKeyPath = toString ./. + "/${grid-config.publicKeyPath}";
privateKeyPath = toString ./. + "/${grid-config.privateKeyPath}";
inherit (grid-config) monitoringvpnEndpoint letsEncryptAdminEmail;
};
# Configure deployment management authorization for all systems in the grid.
services.private-storage.deployment = {
authorizedKey = builtins.readFile "${config.grid.publicKeyPath}/deploy_key.pub";
gridName = "hro-cloud";
};
};
payments = {
imports = [
gridlib.issuer
gridlib.hardware-payments-ovh
grid-module
];
config = {
grid.monitoringvpnIPv4 = "172.23.23.11";
grid.issuer = {
inherit (grid-config) issuerDomains allowedChargeOrigins tokensPerVoucher;
};
};
};
monitoring = {
imports = [
gridlib.monitoring
gridlib.hardware-monitoring-ovh
grid-module
];
config = {
grid.monitoringvpnIPv4 = "172.23.23.1";
grid.monitoring = {
inherit paymentExporterTargets blackboxExporterHttpsTargets;
inherit (grid-config) monitoringDomains;
googleOAuthClientID = grid-config.monitoringGoogleOAuthClientID;
enableSlackAlert = false;
};
system.stateVersion = "19.09";
};
};
defineStorageNode = name: { vpnIP, stateVersion }:
let
nodecfg = import (./. + "/${name}-config.nix");
hardware = (./. + "/${name}-hardware.nix");
in {
imports = [
# Get some of the very lowest-level system configuration for this
# node. This isn't all *completely* hardware related. Maybe some
# more factoring is in order, someday.
hardware
# Slightly awkwardly, enable some of our hardware / network / bootloader options.
../../../nixos/modules/100tb.nix
# At least some of our storage nodes utilize MegaRAID storage controllers.
# Monitor their array status.
../../../nixos/modules/monitoring/exporters/megacli2prom.nix
# Get all of the configuration that is common across all storage nodes.
gridlib.storage
# Also configure deployment management authorization
grid-module
];
config = {
grid.monitoringvpnIPv4 = vpnIP;
grid.storage = {
inherit (grid-config) passValue publicStoragePort;
};
system.stateVersion = stateVersion;
# And supply configuration for those hardware / network / bootloader
# options. See the 100tb module for handling of this value. The module
# name is quoted because `1` makes `100tb` look an awful lot like a
# number.
"100tb".config = nodecfg;
# Enable statistics gathering for MegaRAID cards.
# TODO would be nice to enable only on machines that have such a device.
services.private-storage.monitoring.exporters.megacli2prom.enable = true;
# Disable Borg Backup for this grid!
services.private-storage.borgbackup.enable = false;
};
};
# Define all of the storage nodes for this grid.
storageNodes = builtins.mapAttrs defineStorageNode {
storage001 = { vpnIP = "172.23.23.21"; stateVersion = "19.09"; };
storage002 = { vpnIP = "172.23.23.22"; stateVersion = "19.09"; };
storage003 = { vpnIP = "172.23.23.23"; stateVersion = "19.09"; };
};
paymentExporterTargets = [ "payments.monitoringvpn" ];
blackboxExporterHttpsTargets = [
"https://payments.deerfield.leastauthority.com/"
"https://monitoring.deerfield.leastauthority.com/"
];
in {
network = {
description = "HRO Grid";
inherit (gridlib) pkgs;
};
inherit payments;
inherit monitoring;
} // storageNodes
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAICB19iufUtpjsneYd46n4uJHudYeNROsgm6+BfZfQVuh hro-cloud-deploy
\ No newline at end of file