About MCaaS
MCaaS Backup Methods and Practices
RDS
Continuous Backup/Point in Time (PIT) Recovery
Wherever possible, we use the Aurora engine and enable the continuous backup and point-in-time (PIT) recovery feature provided by Amazon RDS. See https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PIT.html
Our default retention period for this type of PIT recovery is 14 days. A PIT restore can go as far back as the earliest retained snapshot.
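As a rough illustration of the 14-day default, the check below tests whether a requested restore time falls inside the PIT window. This is a minimal sketch, not MCaaS tooling: the function name `is_restorable` is hypothetical, and it simplifies by treating "now" as the latest restorable time, whereas the actual latest restorable time reported by AWS usually lags the current time slightly.

```python
from datetime import datetime, timedelta, timezone

# Default PIT retention stated above; other values are passed in explicitly.
PIT_RETENTION_DAYS = 14


def is_restorable(target: datetime, now: datetime,
                  retention_days: int = PIT_RETENTION_DAYS) -> bool:
    """Return True if `target` lies inside the PIT window that ends at `now`.

    Assumption: the window is exactly `retention_days` long and closes at `now`.
    """
    earliest = now - timedelta(days=retention_days)
    return earliest <= target <= now


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_restorable(now - timedelta(days=3), now))   # inside the window
    print(is_restorable(now - timedelta(days=20), now))  # older than 14 days
```

Running a check like this before opening a ticket avoids requesting a restore point that is no longer retained.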
PIT restored RDS request
Steps to request a PIT-restored RDS instance:
- Create a Jira ticket.
- Specify the DB instance.
- Specify the time or snapshot to restore.
- Coordinate with the MCaaS team.
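The information a restore ticket needs can be sketched as a small record with a completeness check. This is illustrative only: the `RestoreRequest` class and its field names are hypothetical and do not correspond to any Jira field or MCaaS form, and the rule that exactly one of time or snapshot is given is an assumption read from the steps above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class RestoreRequest:
    """Hypothetical container for the details listed in the steps above."""
    db_instance: str
    restore_time: Optional[datetime] = None  # point in time to restore to
    snapshot_id: Optional[str] = None        # or a named snapshot

    def problems(self) -> List[str]:
        """Return a list of missing or conflicting details (empty if complete)."""
        issues = []
        if not self.db_instance.strip():
            issues.append("DB instance is required")
        # Assumption: a request names a time or a snapshot, not both.
        if (self.restore_time is None) == (self.snapshot_id is None):
            issues.append("specify exactly one of restore time or snapshot")
        return issues


if __name__ == "__main__":
    request = RestoreRequest("example-db", snapshot_id="example-snapshot")
    print(request.problems())  # an empty list means the request is complete
```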
Snapshots
RDS snapshots are taken daily and retained for 30 days.
Restore
Restoring from a snapshot currently requires a JSM ticket to be submitted to MCaaS and may take several hours to complete, depending on the size of the database being restored.
Disaster Recovery Drills
When your account is provisioned, and before it goes live in production, MCaaS will schedule a Disaster Recovery Drill (DRD) so that you can verify the integrity of the restored data.
Engine Specific Settings
Some DB engines (MySQL, Oracle, PostgreSQL, etc.) have slightly different settings and procedures.
Examples of configurations and settings for some of the DB engines in use:
- RDS-mysql, see link (https://github.helix.gsa.gov/MCaaS/iac/blob/master/terraform/tenants/itdb/core/test/rds-mysql.tf) for settings and configurations for the itdb tenant.
- RDS-aurora, see link (https://github.helix.gsa.gov/MCaaS/iac/blob/master/terraform/tenants/itdb/core/development/rds-aurora/main.tf) for settings and configurations for the itdb tenant.
- RDS-mssql, see link (https://github.helix.gsa.gov/MCaaS/iac/blob/master/terraform/tenants/aod/core/development/rds_ms_sql.tf) for settings and configurations for the aod tenant.
- RDS-oracle, see link (https://github.helix.gsa.gov/MCaaS/iac/blob/master/terraform/tenants/assist/core/production/oracle.tf) for settings and configurations for the assist tenant.
- RDS-postgresql, see link (https://github.helix.gsa.gov/MCaaS/iac/blob/master/terraform/tenants/cmccp/core/test/postgres.tf) for settings and configurations for the cmccp tenant.
DocumentDB
DocumentDB uses the same PIT and snapshot settings as RDS, except the default snapshot retention period is 14 days instead of 30.
MCaaS has also built a custom mongo dump and restore pipeline for DocumentDB clusters. The pipeline creates daily mongo dumps and pushes them to S3. A tenant can also trigger dumps and restores on demand in Jenkins.
The mongo dump/restore pipeline is made available to tenants upon request. Submit a Jira ticket to have it stood up.
See link (https://github.helix.gsa.gov/MCaaS/iac/blob/master/terraform/tenants/folio/core/test/document_db.tf) for sample settings and configurations for the folio tenant.
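To make the dump-to-S3 flow concrete, the sketch below only builds the command lines such a pipeline might run; it does not execute them. It is not the MCaaS pipeline, whose internals are not described here. The function name, archive naming scheme, and S3 key layout are all assumptions for illustration. The `mongodump`/`mongorestore` options `--uri`, `--archive`, and `--gzip` and the `aws s3 cp` command are standard tool options; any TLS or authentication options a real cluster needs are omitted.

```python
from datetime import date
from typing import Dict, List


def daily_dump_commands(cluster: str, uri: str, bucket: str,
                        day: date) -> Dict[str, List[str]]:
    """Build (but do not run) dump, upload, and restore commands for one day.

    Assumptions: one gzip-compressed archive per cluster per day, stored
    under s3://<bucket>/<cluster>/. Neither convention comes from MCaaS.
    """
    archive = f"{cluster}-{day.isoformat()}.archive.gz"
    s3_url = f"s3://{bucket}/{cluster}/{archive}"
    return {
        # Dump the whole cluster into a single compressed archive file.
        "dump": ["mongodump", f"--uri={uri}", f"--archive={archive}", "--gzip"],
        # Copy the archive to S3.
        "upload": ["aws", "s3", "cp", archive, s3_url],
        # Restore from a previously downloaded archive.
        "restore": ["mongorestore", f"--uri={uri}", f"--archive={archive}", "--gzip"],
    }


if __name__ == "__main__":
    commands = daily_dump_commands(
        "example-cluster", "mongodb://example-host:27017",
        "example-bucket", date.today(),
    )
    for step, argv in commands.items():
        print(step, " ".join(argv))
```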
Elasticsearch
Snapshots are taken hourly. They can be used to restore an existing Elasticsearch/OpenSearch cluster or to load the restored data into a new cluster. The tenant can perform the restore without MCaaS involvement.
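A tenant-driven restore goes through the snapshot restore endpoint of the Elasticsearch/OpenSearch REST API (`POST /_snapshot/<repository>/<snapshot>/_restore`). The sketch below only assembles the request; it does not send it. The function name and the repository, snapshot, and index names in the usage are hypothetical placeholders, and request signing or authentication for the cluster is out of scope.

```python
from typing import Dict, Tuple


def build_restore_request(repository: str, snapshot: str, indices: str,
                          rename_prefix: str = "") -> Tuple[str, str, Dict[str, object]]:
    """Return (method, path, JSON body) for a snapshot restore call.

    `indices` is a comma-separated list or wildcard pattern. If
    `rename_prefix` is set, restored indices get that prefix so they do
    not collide with open indices of the same name.
    """
    path = f"/_snapshot/{repository}/{snapshot}/_restore"
    body: Dict[str, object] = {
        "indices": indices,
        # Leave cluster-wide state untouched; restore index data only.
        "include_global_state": False,
    }
    if rename_prefix:
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = f"{rename_prefix}$1"
    return "POST", path, body


if __name__ == "__main__":
    method, path, body = build_restore_request(
        "example-repo", "example-snapshot", "logs-*", rename_prefix="restored_"
    )
    print(method, path)
    print(body)
```

Renaming on restore is one way to bring data back into an existing cluster without first deleting or closing the live indices; restoring into a new cluster would not need the prefix.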
Below is the link to the configuration and settings file used to provision elasticsearch.
Elastic File System (EFS)
The AWS Backup vault is used to make daily backups of the EFS; these backups are retained for 5 weeks, and can be restored to the time the snapshot was created. This process currently requires a JSM ticket to be submitted to MCaaS.
Below is the link to the configuration and settings file used to provision EFS.
See: https://github.helix.gsa.gov/MCaaS/iac/blob/master/terraform/tenants/itdb/core/test/efs.tf