diff --git a/docs/configuring-playbook-s3-goofys.md b/docs/configuring-playbook-s3-goofys.md new file mode 100644 index 000000000..2efacddcc --- /dev/null +++ b/docs/configuring-playbook-s3-goofys.md @@ -0,0 +1,137 @@ +# Storing Matrix media files on Amazon S3 with Goofys (optional) + +If you'd like to store Synapse's content repository (`media_store`) files on Amazon S3 (or other S3-compatible service), +you can let this playbook configure [Goofys](https://github.com/kahing/goofys) for you. + +Another (and better performing) way to use S3 storage with Synapse is [synapse-s3-storage-provider](configuring-playbook-synapse-s3-storage-provider.md). + +Using a Goofys-backed media store works, but performance may not be ideal. If possible, try to use a region which is close to your Matrix server. + +If you'd like to move your locally-stored media store data to Amazon S3 (or another S3-compatible object store), we also provide some migration instructions below. + + +## Usage + +After [creating the S3 bucket and configuring it](configuring-playbook-s3.md#bucket-creation-and-security-configuration), you can proceed to configure Goofys in your configuration file (`inventory/host_vars/matrix./vars.yml`): + +```yaml +matrix_s3_media_store_enabled: true +matrix_s3_media_store_bucket_name: "your-bucket-name" +matrix_s3_media_store_aws_access_key: "access-key-goes-here" +matrix_s3_media_store_aws_secret_key: "secret-key-goes-here" +matrix_s3_media_store_region: "eu-central-1" +``` + +You can use any S3-compatible object store by **additionally** configuring these variables: + +```yaml +matrix_s3_media_store_custom_endpoint_enabled: true +matrix_s3_media_store_custom_endpoint: "https://your-custom-endpoint" +``` + +If you have local media store files and wish to migrate to Backblaze B2 subsequently, follow our [migration guide to Backblaze B2](#migrating-to-backblaze-b2) below instead of applying this configuration as-is. + + +## Migrating from local filesystem storage to S3 + +It's a good idea to [make a complete server backup](faq.md#how-do-i-backup-the-data-on-my-server) before migrating your local media store to an S3-backed one. + +Follow one of the guides below for a migration path from a locally-stored media store to one stored on S3-compatible storage: + +- [Storing Matrix media files on Amazon S3 with Goofys (optional)](#storing-matrix-media-files-on-amazon-s3-with-goofys-optional) + - [Usage](#usage) + - [Migrating from local filesystem storage to S3](#migrating-from-local-filesystem-storage-to-s3) + - [Migrating to any S3-compatible storage (universal, but likely slow)](#migrating-to-any-s3-compatible-storage-universal-but-likely-slow) + - [Migrating to Backblaze B2](#migrating-to-backblaze-b2) + +### Migrating to any S3-compatible storage (universal, but likely slow) + +It's a good idea to [make a complete server backup](faq.md#how-do-i-backup-the-data-on-my-server) before doing this. + +1. Proceed with the steps below without stopping Matrix services + +2. Start by adding the base S3 configuration in your `vars.yml` file (seen above, may be different depending on the S3 provider of your choice) + +3. In addition to the base configuration you see above, add this to your `vars.yml` file: + +```yaml +matrix_s3_media_store_path: /matrix/s3-media-store +``` + +This enables S3 support, but mounts the S3 storage bucket to `/matrix/s3-media-store` without hooking it to your homeserver yet. Your homeserver will still continue using your local filesystem for its media store. + +5. Run the playbook to apply the changes: `ansible-playbook -i inventory/hosts setup.yml --tags=setup-all,start` + +6. Do an **initial sync of your files** by running this **on the server** (it may take a very long time): + +```sh +sudo -u matrix -- rsync --size-only --ignore-existing -avr /matrix/synapse/storage/media-store/. /matrix/s3-media-store/. +``` + +You may need to install `rsync` manually. + +7. Stop all Matrix services (`ansible-playbook -i inventory/hosts setup.yml --tags=stop`) + +8. Start the S3 service by running this **on the server**: `systemctl start matrix-goofys` + +9. Sync the files again by re-running the `rsync` command you see in step #6 + +10. Stop the S3 service by running this **on the server**: `systemctl stop matrix-goofys` + +11. Get the old media store out of the way by running this command on the server: + +```sh +mv /matrix/synapse/storage/media-store /matrix/synapse/storage/media-store-local-backup +``` + +12. Remove the `matrix_s3_media_store_path` configuration from your `vars.yml` file (undoing step #3 above) + +13. Run the playbook: `ansible-playbook -i inventory/hosts setup.yml --tags=setup-all,start` + +14. You're done! Verify that loading existing (old) media files works and that you can upload new ones. + +15. When confident that it all works, get rid of the local media store directory: `rm -rf /matrix/synapse/storage/media-store-local-backup` + + +### Migrating to Backblaze B2 + +It's a good idea to [make a complete server backup](faq.md#how-do-i-backup-the-data-on-my-server) before doing this. + +1. While all Matrix services are running, run the following command on the server: + +(you need to adjust the 3 `--env` line below with your own data) + +```sh +docker run -it --rm -w /work \ +--env='B2_KEY_ID=YOUR_KEY_GOES_HERE' \ +--env='B2_KEY_SECRET=YOUR_SECRET_GOES_HERE' \ +--env='B2_BUCKET_NAME=YOUR_BUCKET_NAME_GOES_HERE' \ +-v /matrix/synapse/storage/media-store/:/work \ +--entrypoint=/bin/sh \ +docker.io/tianon/backblaze-b2:2.1.0 \ +-c 'b2 authorize-account $B2_KEY_ID $B2_KEY_SECRET > /dev/null && b2 sync /work/ b2://$B2_BUCKET_NAME' +``` + +This is some initial file sync, which may take a very long time. + +2. Stop all Matrix services (`ansible-playbook -i inventory/hosts setup.yml --tags=stop`) + +3. Run the command from step #1 again. + +Doing this will sync any new files that may have been created locally in the meantime. + +Now that Matrix services aren't running, we're sure to get Backblaze B2 and your local media store fully in sync. + +4. Get the old media store out of the way by running this command on the server: + +```sh +mv /matrix/synapse/storage/media-store /matrix/synapse/storage/media-store-local-backup +``` + +5. Put the [Backblaze B2 settings seen above](#backblaze-b2) in your `vars.yml` file + +6. Run the playbook: `ansible-playbook -i inventory/hosts setup.yml --tags=setup-all,start` + +7. You're done! Verify that loading existing (old) media files works and that you can upload new ones. + +8. When confident that it all works, get rid of the local media store directory: `rm -rf /matrix/synapse/storage/media-store-local-backup` diff --git a/docs/configuring-playbook-s3.md b/docs/configuring-playbook-s3.md index 43aaa8792..539f96d32 100644 --- a/docs/configuring-playbook-s3.md +++ b/docs/configuring-playbook-s3.md @@ -1,15 +1,44 @@ -# Storing Matrix media files on Amazon S3 (optional) +# Storing Synapse media files on Amazon S3 or another compatible Object Storage (optional) By default, this playbook configures your server to store Synapse's content repository (`media_store`) files on the local filesystem. If that's alright, you can skip this. -If you'd like to store Synapse's content repository (`media_store`) files on Amazon S3 (or other S3-compatible service), -you can let this playbook configure [Goofys](https://github.com/kahing/goofys) for you. +As an alternative to storing media files on the local filesystem, you can store them on [Amazon S3](https://aws.amazon.com/s3/) or another S3-compatible object store. -Using a Goofys-backed media store works, but performance may not be ideal. If possible, try to use a region which is close to your Matrix server. +First, [choose an Object Storage provider](#choosing-an-object-storage-provider). -If you'd like to move your locally-stored media store data to Amazon S3 (or another S3-compatible object store), we also provide some migration instructions below. +Then, [create the S3 bucket](#bucket-creation-and-security-configuration). +Finally, [set up S3 storage for Synapse](#setting-up) (with [Goofys](configuring-playbook-s3-goofys.md) or [synapse-s3-storage-provider](configuring-playbook-synapse-s3-storage-provider.md)). + + +## Choosing an Object Storage provider + +You can create [Amazon S3](https://aws.amazon.com/s3/) or another S3-compatible object store like [Backblaze B2](https://www.backblaze.com/b2/cloud-storage.html), [Wasabi](https://wasabi.com), [Digital Ocean Spaces](https://www.digitalocean.com/products/spaces), etc. + +Amazon S3 and Backblaze S3 are pay-as-you with no minimum charges for storing too little data. + +All these providers have different prices, with Backblaze B2 appearing to be the cheapest. + +Wasabi has a minimum charge of 1TB if you're storing less than 1TB, which becomes expensive if you need to store less data than that. + +Digital Ocean Spaces has a minimum charge of 250GB ($5/month as of 2022-10), which is also expensive if you're storing less data than that. + +Important aspects of choosing the right provider are: + +- a provider by a company you like and trust (or dislike less than the others) +- a provider which has a data region close to your Matrix server (if it's farther away, high latency may cause slowdowns) +- a provider which is OK pricewise +- a provider with free or cheap egress (if you need to get the data out often, for some reason) - likely not too important for the common use-case + + +## Bucket creation and Security Configuration + +Now that you've [chosen an Object Storage provider](#choosing-an-object-storage-provider), you need to create a storage bucket. + +How you do this varies from provider to provider, with Amazon S3 being the most complicated due to its vast number of services and complicated security policies. + +Below, we provider some guides for common providers. If you don't see yours, look at the others for inspiration or read some guides online about how to create a bucket. Feel free to contribute to this documentation with an update! ## Amazon S3 @@ -34,161 +63,45 @@ You'll need an Amazon S3 bucket and some IAM user credentials (access key + secr } ``` -**NOTE**: This policy needs to be attached to an IAM user creted from the **Security Credentials** menu. This is not a **Bucket Policy**. - -You then need to enable S3 support in your configuration file (`inventory/host_vars/matrix./vars.yml`). -It would be something like this: - -```yaml -matrix_s3_media_store_enabled: true -matrix_s3_media_store_bucket_name: "your-bucket-name" -matrix_s3_media_store_aws_access_key: "access-key-goes-here" -matrix_s3_media_store_aws_secret_key: "secret-key-goes-here" -matrix_s3_media_store_region: "eu-central-1" -``` +**NOTE**: This policy needs to be attached to an IAM user created from the **Security Credentials** menu. This is not a **Bucket Policy**. -## Using other S3-compatible object stores +## Backblaze B2 -You can use any S3-compatible object store by **additionally** configuring these variables: +To use [Backblaze B2](https://www.backblaze.com/b2/cloud-storage.html) you first need to sign up. -```yaml -matrix_s3_media_store_custom_endpoint_enabled: true -# Example: "https://storage.googleapis.com" -matrix_s3_media_store_custom_endpoint: "your-custom-endpoint" -``` +You [can't easily change which region (US, Europe) your Backblaze account stores files in](https://old.reddit.com/r/backblaze/comments/hi1v90/make_the_choice_for_the_b2_data_center_region/), so make sure to carefully choose the region when signing up (hint: it's a hard to see dropdown below the username/password fields in the signup form). -### Backblaze B2 - -To use [Backblaze B2](https://www.backblaze.com/b2/cloud-storage.html): +After logging in to Backblaze: - create a new **private** bucket through its user interface (you can call it something like `matrix-DOMAIN-media-store`) -- note the **Endpoint** for your bucket (something like `s3.us-west-002.backblazeb2.com`) -- adjust its lifecycle rules to use the following **custom** rules: - - File Path: *empty value* - - Days Till Hide: *empty value* - - Days Till Delete: `1` +- note the **Endpoint** for your bucket (something like `s3.us-west-002.backblazeb2.com`). +- adjust its Lifecycle Rules to: Keep only the last version of the file - go to [App Keys](https://secure.backblaze.com/app_keys.htm) and use the **Add a New Application Key** to create a new one - restrict it to the previously created bucket (e.g. `matrix-DOMAIN-media-store`) - give it *Read & Write* access -Copy the `keyID` and `applicationKey`. +The `keyID` value is your **Access Key** and `applicationKey` is your **Secret Key**. -You need the following *additional* playbook configuration (on top of what you see above): +For configuring [Goofys](configuring-playbook-s3-goofys.md) or [s3-synapse-storage-provider](configuring-playbook-synapse-s3-storage-provider.md) you will need: -```yaml -matrix_s3_media_store_bucket_name: "YOUR_BUCKET_NAME_GOES_HERE" -matrix_s3_media_store_aws_access_key: "YOUR_keyID_GOES_HERE" -matrix_s3_media_store_aws_secret_key: "YOUR_applicationKey_GOES_HERE" -matrix_s3_media_store_custom_endpoint_enabled: true -matrix_s3_media_store_custom_endpoint: "https://s3.us-west-002.backblazeb2.com" # this may be different for your bucket -``` +- **Endpoint URL** - this is the **Endpoint** value you saw above, but prefixed with `https://` -If you have local media store files and wish to migrate to Backblaze B2 subsequently, follow our [migration guide to Backblaze B2](#migrating-to-backblaze-b2) below instead of applying this configuration as-is. +- **Region** - use the value you see in the Endpoint (e.g. `us-west-002`) + +- **Storage Class** - use `STANDARD`. Backblaze B2 does not have different storage classes, so it doesn't make sense to use any other value. -## Migrating from local filesystem storage to S3 +## Other providers -It's a good idea to [make a complete server backup](faq.md#how-do-i-backup-the-data-on-my-server) before migrating your local media store to an S3-backed one. +For other S3-compatible providers, you may not need to configure security policies, etc. (just like for [Backblaze B2](#backblaze-b2)). -Follow one of the guides below for a migration path from a locally-stored media store to one stored on S3-compatible storage: - -- [Storing Matrix media files on Amazon S3 (optional)](#storing-matrix-media-files-on-amazon-s3-optional) - - [Amazon S3](#amazon-s3) - - [Using other S3-compatible object stores](#using-other-s3-compatible-object-stores) - - [Backblaze B2](#backblaze-b2) - - [Migrating from local filesystem storage to S3](#migrating-from-local-filesystem-storage-to-s3) - - [Migrating to any S3-compatible storage (universal, but likely slow)](#migrating-to-any-s3-compatible-storage-universal-but-likely-slow) - - [Migrating to Backblaze B2](#migrating-to-backblaze-b2) - -### Migrating to any S3-compatible storage (universal, but likely slow) - -It's a good idea to [make a complete server backup](faq.md#how-do-i-backup-the-data-on-my-server) before doing this. - -1. Proceed with the steps below without stopping Matrix services - -2. Start by adding the base S3 configuration in your `vars.yml` file (seen above, may be different depending on the S3 provider of your choice) - -3. In addition to the base configuration you see above, add this to your `vars.yml` file: - -```yaml -matrix_s3_media_store_path: /matrix/s3-media-store -``` - -This enables S3 support, but mounts the S3 storage bucket to `/matrix/s3-media-store` without hooking it to your homeserver yet. Your homeserver will still continue using your local filesystem for its media store. - -5. Run the playbook to apply the changes: `ansible-playbook -i inventory/hosts setup.yml --tags=setup-all,start` - -6. Do an **initial sync of your files** by running this **on the server** (it may take a very long time): - -```sh -sudo -u matrix -- rsync --size-only --ignore-existing -avr /matrix/synapse/storage/media-store/. /matrix/s3-media-store/. -``` - -You may need to install `rsync` manually. - -7. Stop all Matrix services (`ansible-playbook -i inventory/hosts setup.yml --tags=stop`) - -8. Start the S3 service by running this **on the server**: `systemctl start matrix-goofys` - -9. Sync the files again by re-running the `rsync` command you see in step #6 - -10. Stop the S3 service by running this **on the server**: `systemctl stop matrix-goofys` - -11. Get the old media store out of the way by running this command on the server: - -```sh -mv /matrix/synapse/storage/media-store /matrix/synapse/storage/media-store-local-backup -``` - -12. Remove the `matrix_s3_media_store_path` configuration from your `vars.yml` file (undoing step #3 above) - -13. Run the playbook: `ansible-playbook -i inventory/hosts setup.yml --tags=setup-all,start` - -14. You're done! Verify that loading existing (old) media files works and that you can upload new ones. - -15. When confident that it all works, get rid of the local media store directory: `rm -rf /matrix/synapse/storage/media-store-local-backup` +You most likely just need to create an S3 bucket and get some credentials (access key and secret key) for accessing the bucket in a read/write manner. -### Migrating to Backblaze B2 +## Setting up -It's a good idea to [make a complete server backup](faq.md#how-do-i-backup-the-data-on-my-server) before doing this. +To set up Synapse to store files in S3, follow the instructions for the method of your choice: -1. While all Matrix services are running, run the following command on the server: - -(you need to adjust the 3 `--env` line below with your own data) - -```sh -docker run -it --rm -w /work \ ---env='B2_KEY_ID=YOUR_KEY_GOES_HERE' \ ---env='B2_KEY_SECRET=YOUR_SECRET_GOES_HERE' \ ---env='B2_BUCKET_NAME=YOUR_BUCKET_NAME_GOES_HERE' \ --v /matrix/synapse/storage/media-store/:/work \ ---entrypoint=/bin/sh \ -docker.io/tianon/backblaze-b2:2.1.0 \ --c 'b2 authorize-account $B2_KEY_ID $B2_KEY_SECRET > /dev/null && b2 sync /work/ b2://$B2_BUCKET_NAME' -``` - -This is some initial file sync, which may take a very long time. - -2. Stop all Matrix services (`ansible-playbook -i inventory/hosts setup.yml --tags=stop`) - -3. Run the command from step #1 again. - -Doing this will sync any new files that may have been created locally in the meantime. - -Now that Matrix services aren't running, we're sure to get Backblaze B2 and your local media store fully in sync. - -4. Get the old media store out of the way by running this command on the server: - -```sh -mv /matrix/synapse/storage/media-store /matrix/synapse/storage/media-store-local-backup -``` - -5. Put the [Backblaze B2 settings seen above](#backblaze-b2) in your `vars.yml` file - -6. Run the playbook: `ansible-playbook -i inventory/hosts setup.yml --tags=setup-all,start` - -7. You're done! Verify that loading existing (old) media files works and that you can upload new ones. - -8. When confident that it all works, get rid of the local media store directory: `rm -rf /matrix/synapse/storage/media-store-local-backup` +- using [synapse-s3-storage-provider](configuring-playbook-synapse-s3-storage-provider.md) (recommended) +- using [Goofys to mount the S3 store to the local filesystem](configuring-playbook-s3-goofys.md) diff --git a/docs/configuring-playbook-synapse-s3-storage-provider.md b/docs/configuring-playbook-synapse-s3-storage-provider.md new file mode 100644 index 000000000..bc0250f11 --- /dev/null +++ b/docs/configuring-playbook-synapse-s3-storage-provider.md @@ -0,0 +1,104 @@ +# Storing Synapse media files on Amazon S3 with synapse-s3-storage-provider (optional) + +If you'd like to store Synapse's content repository (`media_store`) files on Amazon S3 (or other S3-compatible service), +you can use the [synapse-s3-storage-provider](https://github.com/matrix-org/synapse-s3-storage-provider) media provider module for Synapse. + +An alternative (which has worse performance) is to use [Goofys to mount the S3 store to the local filesystem](configuring-playbook-s3-goofys.md). + + +## How it works? + +Summarized writings here are inspired by [this article](https://quentin.dufour.io/blog/2021-09-14/matrix-synapse-s3-storage/). + +The way media storage providers in Synapse work has some caveats: + +- Synapse still continues to use locally-stored files (for creating thumbnails, serving files, etc) +- the media storage provider is just an extra storage mechanism (in addition to the local filesystem) +- all files are stored locally at first, and then copied to the media storage provider (either synchronously or asynchronously) +- if a file is not available on the local filesystem, it's pulled from a media storage provider + +You may be thinking **if all files are stored locally as well, what's the point**? + +You can run some scripts to delete the local files once in a while, thus freeing up local disk space. If these files are needed in the future (for serving them to users, etc.), Synapse will pull them from the media storage provider on demand. + +While you will need some local disk space around, it's only to accommodate usage, etc., and won't grow as large as your S3 store. + + +## Installing + +After [creating the S3 bucket and configuring it](configuring-playbook-s3.md#bucket-creation-and-security-configuration), you can proceed to configure Goofys in your configuration file (`inventory/host_vars/matrix./vars.yml`): + +```yaml +matrix_synapse_ext_synapse_s3_storage_provider_enabled: true +matrix_synapse_ext_synapse_s3_storage_provider_config_bucket: your-bucket-name +matrix_synapse_ext_synapse_s3_storage_provider_config_region_name: some-region-name # e.g. eu-central-1 +matrix_synapse_ext_synapse_s3_storage_provider_config_endpoint_url: https://.. # delete this whole line for Amazon S3 +matrix_synapse_ext_synapse_s3_storage_provider_config_access_key_id: access-key-goes-here +matrix_synapse_ext_synapse_s3_storage_provider_config_secret_access_key: secret-key-goes-here +matrix_synapse_ext_synapse_s3_storage_provider_config_storage_class: STANDARD # or STANDARD_IA, etc. + +# For additional advanced settings, take a look at `roles/matrix-synapse/defaults/main.yml` +``` + +If you have existing files in Synapse's media repository (`/matrix/synapse/media-store/..`): + +- new files will start being stored both locally and on the S3 store +- the existing files will remain on the local filesystem only until [migrating them to the S3 store](#migrating-your-existing-media-files-to-the-s3-store) +- at some point (and periodically in the future), you can delete local files which have been uploaded to the S3 store already + + +## Migrating your existing media files to the S3 store + +Migrating your existing data can happen in multiple ways: + +- [using the `s3_media_upload` script from `synapse-s3-storage-provider`](#using-the-s3_media_upload-script-from-synapse-s3-storage-provider) (very slow when dealing with lots of data) +- [using another tool in combination with `s3_media_upload`](#using-another-tool-in-combination-with-s3_media_upload) (quicker when dealing with lots of data) + +### Using the `s3_media_upload` script from `synapse-s3-storage-provider` + +Instead of using `s3_media_upload` directly, which is very slow and painful for an initial data migration, we recommend [using another tool in combination with `s3_media_upload`](#using-another-tool-in-combination-with-s3_media_upload). + +To copy your existing files, SSH into the server and run `/usr/local/bin/matrix-synapse-s3-storage-provider-shell`. + +This launches a Synapse container, which has access to the local media store, Postgres database, S3 store and has some convenient environment variables configured for you to use (`MEDIA_PATH`, `BUCKET`, `ENDPOINT`, `UPDATE_DB_DAYS`, etc). + +Then use the following commands (`$` values come from environment variables - they're **not placeholders** that you need to substitute): + +- `s3_media_upload update-db $UPDATE_DB_DURATION` - create a local SQLite database (`cache.db`) with a list of media repository files (from the `synapse` Postgres database) eligible for operating on + - `$UPDATE_DB_DURATION` is influenced by the `matrix_synapse_ext_synapse_s3_storage_provider_update_db_day_count` variable (defaults to `0`) + - `$UPDATE_DB_DURATION` defaults to `0d` (0 days), which means **include files which haven't been accessed for more than 0 days** (that is, **all files will be included**). +- `s3_media_upload check-deleted $MEDIA_PATH` - check whether files in the local cache still exist in the local media repository directory +- `s3_media_upload upload $MEDIA_PATH $BUCKET --delete --endpoint-url $ENDPOINT` - uploads locally-stored files to S3 and deletes them from the local media repository directory + +The `upload` command may take a lot of time to complete. + + +### Using another tool in combination with `s3_media_upload` + +To migrate your existing local data to S3, we recommend to: + +- **first** use another tool ([`aws s3`](#copying-data-to-amazon-s3) or [`b2 sync`](#copying-data-to-backblaze-b2), etc.) to copy the local files to the S3 bucket + +- **only then** [use the `s3_media_upload` tool to finish the migration](#using-the-s3_media_upload-script-from-synapse-s3-storage-provider) (this checks to ensure all files are uploaded and then deletes the local files) + +#### Copying data to Amazon S3 + +Generally, you need to use the `aws s3` tool. + +This documentation section could use an improvement. Ideally, we'd come up with a guide like the one used in [Copying data to Backblaze B2](#copying-data-to-backblaze-b2) - running `aws s3` in a container, etc. + +#### Copying data to Backblaze B2 + +To copy to Backblaze B2, start a container like this: + +```sh +docker run -it --rm \ +-w /work \ +--env='B2_KEY_ID=YOUR_KEY_GOES_HERE' \ +--env='B2_KEY_SECRET=YOUR_SECRET_GOES_HERE' \ +--env='B2_BUCKET_NAME=YOUR_BUCKET_NAME_GOES_HERE' \ +--mount type=bind,src=/matrix/synapse/storage/media-store,dst=/work,ro \ +--entrypoint=/bin/sh \ +tianon/backblaze-b2:3.6.0 \ +-c 'b2 authorize-account $B2_KEY_ID $B2_KEY_SECRET > /dev/null && b2 sync /work b2://$B2_BUCKET_NAME --skipNewer' +``` diff --git a/roles/matrix-synapse/defaults/main.yml b/roles/matrix-synapse/defaults/main.yml index 40e05be72..383e67ab8 100644 --- a/roles/matrix-synapse/defaults/main.yml +++ b/roles/matrix-synapse/defaults/main.yml @@ -19,6 +19,10 @@ matrix_synapse_container_image_self_build_repo: "https://github.com/matrix-org/s # - `matrix_synapse_docker_image_final` matrix_synapse_container_image_customizations_enabled: "{{ matrix_synapse_ext_synapse_s3_storage_provider_enabled }}" +# Controls whether custom build steps will be added to the Dockerfile for installing s3-storage-provider. +# The version that will be installed is specified in `matrix_synapse_ext_synapse_s3_storage_provider_version`. +matrix_synapse_container_image_customizations_s3_storage_provider_installation_enabled: "{{ matrix_synapse_ext_synapse_s3_storage_provider_enabled }}" + # matrix_synapse_container_image_customizations_dockerfile_body contains your custom Dockerfile steps # for building your customized Synapse image based on the original (upstream) image (`matrix_synapse_docker_image`). # A `FROM ...` clause is included automatically so you don't have to. @@ -52,6 +56,7 @@ matrix_synapse_config_dir_path: "{{ matrix_synapse_base_path }}/config" matrix_synapse_storage_path: "{{ matrix_synapse_base_path }}/storage" matrix_synapse_media_store_path: "{{ matrix_synapse_storage_path }}/media-store" matrix_synapse_ext_path: "{{ matrix_synapse_base_path }}/ext" +matrix_synapse_ext_s3_storage_provider_path: "{{ matrix_synapse_base_path }}/ext/s3-storage-provider" matrix_synapse_container_client_api_port: 8008 @@ -787,6 +792,32 @@ matrix_synapse_ext_encryption_config_yaml: | patch_power_levels: {{ matrix_synapse_ext_encryption_disabler_patch_power_levels | to_json }} +# matrix_synapse_ext_synapse_s3_storage_provider_enabled controls whether to enable https://github.com/matrix-org/synapse-s3-storage-provider +# Installing it requires building a customized Docker image for Synapse (see `matrix_synapse_container_image_customizations_enabled`). +# Enabling this will enable customizations and inject the appropriate Dockerfile clauses for installing synapse-s3-storage-provider. +matrix_synapse_ext_synapse_s3_storage_provider_enabled: false +matrix_synapse_ext_synapse_s3_storage_provider_version: 1.1.2 +# Controls whether media from this (local) server is stored in s3-storage-provider +matrix_synapse_ext_synapse_s3_storage_provider_store_local: true +# Controls whether media from remote servers is stored in s3-storage-provider +matrix_synapse_ext_synapse_s3_storage_provider_store_remote: true +# Controls whether files are stored to S3 at the same time they are stored on the local filesystem. +# For slightly improved reliability, consider setting this to `true`. +# Even with asynchronous uploading to S3 (`false` value), data loss shouldn't be possible, +# because the local filesystem is a reliable data store anyway. +matrix_synapse_ext_synapse_s3_storage_provider_store_synchronous: false +matrix_synapse_ext_synapse_s3_storage_provider_config_bucket: '' +matrix_synapse_ext_synapse_s3_storage_provider_config_region_name: '' +matrix_synapse_ext_synapse_s3_storage_provider_config_endpoint_url: '' +matrix_synapse_ext_synapse_s3_storage_provider_config_access_key_id: '' +matrix_synapse_ext_synapse_s3_storage_provider_config_secret_access_key: '' +matrix_synapse_ext_synapse_s3_storage_provider_config_storage_class: STANDARD +matrix_synapse_ext_synapse_s3_storage_provider_config_threadpool_size: 40 +# matrix_synapse_ext_synapse_s3_storage_provider_update_db_day_count is a day value (number) for the `s3_media_upload update-db` command. +# It specifies how old files need to have been inactive to be eligible for migration from the local filesystem to the S3 data store. +# By default, we use `0` which says "all files are eligible for migration". +matrix_synapse_ext_synapse_s3_storage_provider_update_db_day_count: 0 + matrix_s3_media_store_enabled: false matrix_s3_media_store_custom_endpoint_enabled: false matrix_s3_goofys_docker_image: "ewoutp/goofys:latest" @@ -839,6 +870,10 @@ matrix_synapse_media_storage_providers: "{{ matrix_synapse_media_storage_provide matrix_synapse_media_storage_providers_auto: | {{ [] + + + [ + lookup('ansible.builtin.template', role_path + '/templates/synapse/ext/s3-storage-provider/media_storage_provider.yaml.j2') | from_yaml + ] if matrix_synapse_ext_synapse_s3_storage_provider_enabled else [] }} # matrix_synapse_media_storage_providers_custom contains your own custom list of storage providers. diff --git a/roles/matrix-synapse/tasks/ext/s3-storage-provider/init.yml b/roles/matrix-synapse/tasks/ext/s3-storage-provider/init.yml new file mode 100644 index 000000000..008161cb1 --- /dev/null +++ b/roles/matrix-synapse/tasks/ext/s3-storage-provider/init.yml @@ -0,0 +1,5 @@ +--- + +- ansible.builtin.set_fact: + matrix_systemd_services_list: "{{ matrix_systemd_services_list + ['matrix-synapse-s3-storage-provider-migrate.timer'] }}" + when: matrix_synapse_ext_synapse_s3_storage_provider_enabled | bool diff --git a/roles/matrix-synapse/tasks/ext/s3-storage-provider/setup.yml b/roles/matrix-synapse/tasks/ext/s3-storage-provider/setup.yml new file mode 100644 index 000000000..aefa49fe4 --- /dev/null +++ b/roles/matrix-synapse/tasks/ext/s3-storage-provider/setup.yml @@ -0,0 +1,10 @@ +--- + +- ansible.builtin.import_tasks: "{{ role_path }}/tasks/ext/s3-storage-provider/validate_config.yml" + when: matrix_synapse_ext_synapse_s3_storage_provider_enabled | bool + +- ansible.builtin.import_tasks: "{{ role_path }}/tasks/ext/s3-storage-provider/setup_install.yml" + when: matrix_synapse_ext_synapse_s3_storage_provider_enabled | bool + +- ansible.builtin.import_tasks: "{{ role_path }}/tasks/ext/s3-storage-provider/setup_uninstall.yml" + when: not matrix_synapse_ext_synapse_s3_storage_provider_enabled | bool diff --git a/roles/matrix-synapse/tasks/ext/s3-storage-provider/setup_install.yml b/roles/matrix-synapse/tasks/ext/s3-storage-provider/setup_install.yml new file mode 100644 index 000000000..31f721819 --- /dev/null +++ b/roles/matrix-synapse/tasks/ext/s3-storage-provider/setup_install.yml @@ -0,0 +1,54 @@ +--- + +# We install this into Synapse by making `matrix_synapse_ext_synapse_s3_storage_provider_enabled` influence other variables: +# - `matrix_synapse_media_storage_providers` (via `matrix_synapse_media_storage_providers_auto`) +# - `matrix_synapse_container_image_customizations_enabled` +# - `matrix_synapse_container_image_customizations_s3_storage_provider_installation_enabled` +# +# Below are additional tasks for setting up various helper scripts, etc. + +- name: Ensure s3-storage-provider env file installed + ansible.builtin.template: + src: "{{ role_path }}/templates/synapse/ext/s3-storage-provider/env.j2" + dest: "{{ matrix_synapse_ext_s3_storage_provider_path }}/env" + mode: 0640 + +- name: Ensure s3-storage-provider data path exists + ansible.builtin.file: + path: "{{ matrix_synapse_ext_s3_storage_provider_path }}/data" + state: directory + mode: 0750 + owner: "{{ matrix_user_username }}" + group: "{{ matrix_user_groupname }}" + +- name: Ensure s3-storage-provider database.yaml file installed + ansible.builtin.template: + src: "{{ role_path }}/templates/synapse/ext/s3-storage-provider/database.yaml.j2" + dest: "{{ matrix_synapse_ext_s3_storage_provider_path }}/data/database.yaml" + mode: 0640 + +- name: Ensure s3-storage-provider scripts installed + ansible.builtin.template: + src: "{{ role_path }}/templates/synapse/ext/s3-storage-provider/usr-local-bin/{{ item }}.j2" + dest: "{{ matrix_local_bin_path }}/{{ item }}" + mode: 0750 + with_items: + - matrix-synapse-s3-storage-provider-shell + - matrix-synapse-s3-storage-provider-migrate + +- name: Ensure matrix-synapse-s3-storage-provider-migrate.service and timer are installed + ansible.builtin.template: + src: "{{ role_path }}/templates/systemd/.j2" + src: "{{ role_path }}/templates/synapse/ext/s3-storage-provider/systemd/{{ item }}.j2" + dest: "{{ matrix_systemd_path }}/{{ item }}" + mode: 0640 + with_items: + - matrix-synapse-s3-storage-provider-migrate.service + - matrix-synapse-s3-storage-provider-migrate.timer + register: matrix_synapse_s3_storage_provider_systemd_service_result + +- name: Ensure systemd reloaded after matrix-synapse-s3-storage-provider-migrate.service installation + ansible.builtin.service: + daemon_reload: true + when: matrix_synapse_s3_storage_provider_systemd_service_result.changed | bool + diff --git a/roles/matrix-synapse/tasks/ext/s3-storage-provider/setup_uninstall.yml b/roles/matrix-synapse/tasks/ext/s3-storage-provider/setup_uninstall.yml new file mode 100644 index 000000000..205a55417 --- /dev/null +++ b/roles/matrix-synapse/tasks/ext/s3-storage-provider/setup_uninstall.yml @@ -0,0 +1,24 @@ +--- + +- name: Ensure matrix-synapse-s3-storage-provider-migrate.service and timer don't exist + ansible.builtin.file: + path: "{{ matrix_systemd_path }}/{{ item }}" + state: absent + with_items: + - matrix-synapse-s3-storage-provider-migrate.timer + - matrix-synapse-s3-storage-provider-migrate.service + register: matrix_synapse_s3_storage_provider_migrate_sevice_removal + +- name: Ensure systemd reloaded after matrix-synapse-s3-storage-provider-migrate.service removal + ansible.builtin.service: + daemon_reload: true + when: matrix_synapse_s3_storage_provider_migrate_sevice_removal.changed | bool + +- name: Ensure s3-storage-provider files don't exist + ansible.builtin.file: + path: "{{ item }}" + state: absent + with_items: + - "{{ matrix_local_bin_path }}/matrix-synapse-s3-storage-provider-shell" + - "{{ matrix_local_bin_path }}/matrix-synapse-s3-storage-provider-migrate" + - "{{ matrix_synapse_ext_s3_storage_provider_path }}" diff --git a/roles/matrix-synapse/tasks/ext/s3-storage-provider/validate_config.yml b/roles/matrix-synapse/tasks/ext/s3-storage-provider/validate_config.yml new file mode 100644 index 000000000..d71809fe5 --- /dev/null +++ b/roles/matrix-synapse/tasks/ext/s3-storage-provider/validate_config.yml @@ -0,0 +1,18 @@ +--- + +- name: Fail if required s3-storage-provider settings not defined + ansible.builtin.fail: + msg: >- + You need to define a required configuration setting (`{{ item }}`) for using s3-storage-provider. + when: "vars[item] == ''" + with_items: + - "matrix_synapse_ext_synapse_s3_storage_provider_config_bucket" + - "matrix_synapse_ext_synapse_s3_storage_provider_config_region_name" + - "matrix_synapse_ext_synapse_s3_storage_provider_config_access_key_id" + - "matrix_synapse_ext_synapse_s3_storage_provider_config_secret_access_key" + +- name: Fail if required matrix_synapse_ext_synapse_s3_storage_provider_config_endpoint_url looks invalid + ansible.builtin.fail: + msg: >- + `matrix_synapse_ext_synapse_s3_storage_provider_config_endpoint_url` needs to look like a URL (`http://` or `https://` prefix). + when: "matrix_synapse_ext_synapse_s3_storage_provider_config_endpoint_url != '' and not matrix_synapse_ext_synapse_s3_storage_provider_config_endpoint_url.startswith('http')" diff --git a/roles/matrix-synapse/tasks/ext/setup.yml b/roles/matrix-synapse/tasks/ext/setup.yml index d944f2574..6cf1afaa4 100644 --- a/roles/matrix-synapse/tasks/ext/setup.yml +++ b/roles/matrix-synapse/tasks/ext/setup.yml @@ -11,3 +11,5 @@ - ansible.builtin.import_tasks: "{{ role_path }}/tasks/ext/synapse-simple-antispam/setup.yml" - ansible.builtin.import_tasks: "{{ role_path }}/tasks/ext/mjolnir-antispam/setup.yml" + +- ansible.builtin.import_tasks: "{{ role_path }}/tasks/ext/s3-storage-provider/setup.yml" diff --git a/roles/matrix-synapse/tasks/init.yml b/roles/matrix-synapse/tasks/init.yml index a77320c22..9146936a0 100644 --- a/roles/matrix-synapse/tasks/init.yml +++ b/roles/matrix-synapse/tasks/init.yml @@ -26,6 +26,9 @@ matrix_systemd_services_list: "{{ matrix_systemd_services_list + ['matrix-goofys.service'] }}" when: matrix_s3_media_store_enabled | bool +- ansible.builtin.include_tasks: "{{ role_path }}/tasks/ext/s3-storage-provider/init.yml" + when: matrix_synapse_ext_synapse_s3_storage_provider_enabled | bool + - when: matrix_synapse_enabled | bool and matrix_synapse_metrics_proxying_enabled | bool block: - name: Fail if matrix-nginx-proxy role already executed diff --git a/roles/matrix-synapse/tasks/setup_synapse.yml b/roles/matrix-synapse/tasks/setup_synapse.yml index 7b887f30f..13a5819e1 100644 --- a/roles/matrix-synapse/tasks/setup_synapse.yml +++ b/roles/matrix-synapse/tasks/setup_synapse.yml @@ -12,6 +12,7 @@ - {path: "{{ matrix_synapse_ext_path }}", when: true} - {path: "{{ matrix_synapse_docker_src_files_path }}", when: "{{ matrix_synapse_container_image_self_build }}"} - {path: "{{ matrix_synapse_customized_docker_src_files_path }}", when: "{{ matrix_synapse_container_image_customizations_enabled }}"} + - {path: "{{ matrix_synapse_ext_s3_storage_provider_path }}", when: "{{ matrix_synapse_ext_synapse_s3_storage_provider_enabled }}"} # We handle matrix_synapse_media_store_path elsewhere (in ./synapse/setup_install.yml), # because if it's using Goofys and it's already mounted (from before), # trying to chown/chmod it here will cause trouble. diff --git a/roles/matrix-synapse/templates/synapse/customizations/Dockerfile.j2 b/roles/matrix-synapse/templates/synapse/customizations/Dockerfile.j2 index 7cce2086d..3919e9557 100644 --- a/roles/matrix-synapse/templates/synapse/customizations/Dockerfile.j2 +++ b/roles/matrix-synapse/templates/synapse/customizations/Dockerfile.j2 @@ -1,3 +1,7 @@ FROM {{ matrix_synapse_docker_image }} +{% if matrix_synapse_container_image_customizations_s3_storage_provider_installation_enabled %} +RUN pip install synapse-s3-storage-provider=={{ matrix_synapse_ext_synapse_s3_storage_provider_version }} +{% endif %} + {{ matrix_synapse_container_image_customizations_dockerfile_body_custom }} diff --git a/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/database.yaml.j2 b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/database.yaml.j2 new file mode 100644 index 000000000..ed11645eb --- /dev/null +++ b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/database.yaml.j2 @@ -0,0 +1,5 @@ +user: {{ matrix_synapse_database_user | to_json }} +password: {{ matrix_synapse_database_password | to_json }} +database: {{ matrix_synapse_database_database | to_json }} +host: {{ matrix_synapse_database_host | to_json }} +port: {{ matrix_synapse_database_port | to_json }} diff --git a/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/env.j2 b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/env.j2 new file mode 100644 index 000000000..4b09688ba --- /dev/null +++ b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/env.j2 @@ -0,0 +1,16 @@ +AWS_ACCESS_KEY_ID={{ matrix_synapse_ext_synapse_s3_storage_provider_config_access_key_id }} +AWS_SECRET_ACCESS_KEY={{ matrix_synapse_ext_synapse_s3_storage_provider_config_secret_access_key }} +AWS_DEFAULT_REGION={{ matrix_synapse_ext_synapse_s3_storage_provider_config_region_name }} + +ENDPOINT={{ matrix_synapse_ext_synapse_s3_storage_provider_config_endpoint_url }} +BUCKET={{ matrix_synapse_ext_synapse_s3_storage_provider_config_bucket }} + +PG_USER={{ matrix_synapse_database_user }} +PG_PASS={{ matrix_synapse_database_password }} +PG_DB={{ matrix_synapse_database_database }} +PG_HOST={{ matrix_synapse_database_host }} +PG_PORT={{ matrix_synapse_database_port }} + +MEDIA_PATH=/matrix-media-store-parent/{{ matrix_synapse_media_store_directory_name }} + +UPDATE_DB_DURATION={{ matrix_synapse_ext_synapse_s3_storage_provider_update_db_day_count }}d diff --git a/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/media_storage_provider.yaml.j2 b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/media_storage_provider.yaml.j2 new file mode 100644 index 000000000..97b0f5f2b --- /dev/null +++ b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/media_storage_provider.yaml.j2 @@ -0,0 +1,14 @@ +module: s3_storage_provider.S3StorageProviderBackend +store_local: {{ matrix_synapse_ext_synapse_s3_storage_provider_store_local | to_json }} +store_remote: {{ matrix_synapse_ext_synapse_s3_storage_provider_store_remote | to_json }} +store_synchronous: {{ matrix_synapse_ext_synapse_s3_storage_provider_store_synchronous | to_json }} +config: + bucket: {{ matrix_synapse_ext_synapse_s3_storage_provider_config_bucket | to_json }} + region_name: {{ matrix_synapse_ext_synapse_s3_storage_provider_config_region_name | to_json }} + endpoint_url: {{ matrix_synapse_ext_synapse_s3_storage_provider_config_endpoint_url | to_json }} + access_key_id: {{ matrix_synapse_ext_synapse_s3_storage_provider_config_access_key_id | to_json }} + secret_access_key: {{ matrix_synapse_ext_synapse_s3_storage_provider_config_secret_access_key | to_json }} + + storage_class: {{ matrix_synapse_ext_synapse_s3_storage_provider_config_storage_class | to_json }} + + threadpool_size: {{ matrix_synapse_ext_synapse_s3_storage_provider_config_threadpool_size | to_json }} diff --git a/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/systemd/matrix-synapse-s3-storage-provider-migrate.service.j2 b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/systemd/matrix-synapse-s3-storage-provider-migrate.service.j2 new file mode 100644 index 000000000..ea8f0c8cb --- /dev/null +++ b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/systemd/matrix-synapse-s3-storage-provider-migrate.service.j2 @@ -0,0 +1,7 @@ +[Unit] +Description=Migrates locally-stored Synapse media store files to S3 + +[Service] +Type=oneshot +Environment="HOME={{ matrix_systemd_unit_home_path }}" +ExecStart={{ matrix_local_bin_path }}/matrix-synapse-s3-storage-provider-migrate diff --git a/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/systemd/matrix-synapse-s3-storage-provider-migrate.timer.j2 b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/systemd/matrix-synapse-s3-storage-provider-migrate.timer.j2 new file mode 100644 index 000000000..61526ac12 --- /dev/null +++ b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/systemd/matrix-synapse-s3-storage-provider-migrate.timer.j2 @@ -0,0 +1,10 @@ +[Unit] +Description=Migrates locally-stored Synapse media store files to S3 + +[Timer] +Unit=matrix-synapse-s3-storage-provider-migrate.service +OnCalendar=*-*-* 05:00:00 +RandomizedDelaySec=2h + +[Install] +WantedBy=timers.target diff --git a/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/usr-local-bin/matrix-synapse-s3-storage-provider-migrate.j2 b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/usr-local-bin/matrix-synapse-s3-storage-provider-migrate.j2 new file mode 100644 index 000000000..0893f5d66 --- /dev/null +++ b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/usr-local-bin/matrix-synapse-s3-storage-provider-migrate.j2 @@ -0,0 +1,13 @@ +#jinja2: lstrip_blocks: "True" +#!/bin/bash + +{{ matrix_host_command_docker }} run \ + --rm \ + --env-file={{ matrix_synapse_ext_s3_storage_provider_path }}/env \ + --mount type=bind,src={{ matrix_synapse_storage_path }},dst=/matrix-media-store-parent,bind-propagation=slave \ + --mount type=bind,src={{ matrix_synapse_ext_s3_storage_provider_path }}/data,dst=/data \ + --workdir=/data \ + --network={{ matrix_docker_network }} \ + --entrypoint=/bin/bash \ + {{ matrix_synapse_docker_image_final }} \ + -c 's3_media_upload update-db $UPDATE_DB_DURATION && s3_media_upload --no-progress check-deleted $MEDIA_PATH && s3_media_upload --no-progress upload $MEDIA_PATH $BUCKET --delete --endpoint-url $ENDPOINT' diff --git a/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/usr-local-bin/matrix-synapse-s3-storage-provider-shell.j2 b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/usr-local-bin/matrix-synapse-s3-storage-provider-shell.j2 new file mode 100644 index 000000000..c67a6dda0 --- /dev/null +++ b/roles/matrix-synapse/templates/synapse/ext/s3-storage-provider/usr-local-bin/matrix-synapse-s3-storage-provider-shell.j2 @@ -0,0 +1,13 @@ +#jinja2: lstrip_blocks: "True" +#!/bin/bash + +{{ matrix_host_command_docker }} run \ + -it \ + --rm \ + --env-file={{ matrix_synapse_ext_s3_storage_provider_path }}/env \ + --mount type=bind,src={{ matrix_synapse_storage_path }},dst=/matrix-media-store-parent,bind-propagation=slave \ + --mount type=bind,src={{ matrix_synapse_ext_s3_storage_provider_path }}/data,dst=/data \ + --workdir=/data \ + --network={{ matrix_docker_network }} \ + --entrypoint=/bin/bash \ + {{ matrix_synapse_docker_image_final }}