MiaRec Data Backup Maintenance Guide
This guide describes how to use standard utilities, Amazon Web Services and native MiaRec application features to back up the MiaRec database and audio files. It also covers restoring the database in various scenarios after a catastrophic failure.
MiaRec stores data in two forms:
database entries containing call metadata and configuration data; and
audio files with raw audio recordings of calls.
Existing utilities and native MiaRec features can be used to dump all information from the MiaRec database and to ensure redundancy of audio files. These actions can be run on demand or scheduled as recurring jobs.
MiaRec recommends using offsite storage, such as an Amazon Web Services (AWS) S3 bucket, because it provides 99.999999999% durability and 99.99% availability and supports a WORM (write-once-read-many) model that protects backup files from corruption and tampering. Other offsite storage mechanisms, such as SFTP and NFS, exist but are not covered in these instructions.
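As an illustration of the offsite approach, the following Python sketch uploads an already created backup file to an S3 bucket under a date-based key. It is not part of MiaRec: the function names and the key layout are assumptions made for this example, and the upload relies on the third-party boto3 package with AWS credentials configured in the environment. WORM protection is configured on the bucket itself and is not shown here.

```python
from datetime import date
from pathlib import Path


def backup_object_key(prefix: str, backup_file: Path, day: date) -> str:
    """Build a date-based S3 object key such as 'miarec/db/2024/03/09/a.dump'.

    The prefix/YYYY/MM/DD/name layout is an assumption for this example.
    """
    return f"{prefix.strip('/')}/{day:%Y/%m/%d}/{backup_file.name}"


def upload_backup(backup_file: Path, bucket: str, key: str) -> None:
    """Upload one backup file to S3 (bucket name is supplied by the caller)."""
    import boto3  # third-party AWS SDK, imported here so the key helper works without it

    boto3.client("s3").upload_file(str(backup_file), bucket, key)
```

Grouping keys by date keeps each day's backups under one prefix, which makes it easier to apply retention rules to them later.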
MiaRec uses a PostgreSQL database to store call metadata and configuration data. The utilities pg_dump and pg_restore, which are included in the postgresql package, can be used to back up and restore the MiaRec database.
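A minimal Python sketch of how these two utilities might be invoked is shown below. The database name, user name and backup directory are placeholders assumed for the example, not values confirmed by this guide; adjust them to your installation. The helper functions only build the command lines, and run_backup executes the dump.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def backup_file_name(db_name: str, when: datetime) -> str:
    """Return a timestamped file name, e.g. 'miarecdb-20240102T030405Z.dump'."""
    return f"{db_name}-{when:%Y%m%dT%H%M%SZ}.dump"


def build_dump_command(db_name: str, out_file: Path, user: str = "postgres") -> list[str]:
    """Build a pg_dump command that writes a custom-format archive."""
    return [
        "pg_dump",
        f"--username={user}",
        "--format=custom",        # archive format that pg_restore can read
        f"--file={out_file}",
        db_name,
    ]


def build_restore_command(db_name: str, dump_file: Path, user: str = "postgres") -> list[str]:
    """Build a pg_restore command that loads the archive into an existing database."""
    return [
        "pg_restore",
        f"--username={user}",
        f"--dbname={db_name}",
        "--clean",                # drop existing objects before recreating them
        "--if-exists",            # avoid errors for objects that do not exist yet
        str(dump_file),
    ]


def run_backup(db_name: str, backup_dir: Path) -> Path:
    """Dump the database into backup_dir and return the path of the new file."""
    target = backup_dir / backup_file_name(db_name, datetime.now(timezone.utc))
    subprocess.run(build_dump_command(db_name, target), check=True)
    return target
```

For example, run_backup("miarecdb", Path("/var/backups/miarec")) would create one timestamped dump file per call, where both the database name and the directory are hypothetical. Calling it from a scheduler such as cron turns it into a recurring backup.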
What is in the database?
The database holds the following data:
Tenant, Group and User configuration
Call Recording Details/State
Job Configuration (export, replication, relocation)
What is NOT in the database?
The database does not include the following:
- Audio files. Audio files are stored on the file system rather than in the database. The database stores only the path to each file, so audio files must be backed up and restored by other means. As long as the file path entries in the database are still valid after a database restoration, the audio files remain accessible.
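After a restoration, one way to confirm that the stored paths are still valid is to check each path against the file system. The sketch below assumes the list of paths has already been read from the database by some other means (the query is not shown, because the schema is not described in this guide) and that the files are on locally mounted storage. The function name is hypothetical.

```python
from pathlib import Path
from typing import Iterable


def find_missing_audio_files(paths: Iterable[str]) -> list[str]:
    """Return the paths that do not point to an existing regular file."""
    return [p for p in paths if not Path(p).is_file()]
```

An empty result means every path in the list resolves to a file on disk. Any returned paths identify recordings whose audio needs to be restored separately.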
Audio files can be stored:
locally on the same server that recorded the calls
externally on remote storage such as FTP, SFTP or S3. In this case, a background process automatically relocates audio files from local file storage to remote storage. The file relocation job normally runs on a schedule (roughly every 5 minutes).
When external storage is used for audio files, such storage normally provides built-in redundancy, whether it is a NAS server with multiple disks in a RAID array or an Amazon S3 bucket with 99.999999999% durability. In most situations, the redundancy provided by the external storage is sufficient, and only the MiaRec database needs to be backed up and restored. During a database restoration, the configuration for the external storage target is retained, and audio files remain accessible at the same path.
Make sure to check the IP filtering rules on your external storage device, because the IP address of the MiaRec server may change during disaster recovery.
MiaRec provides built-in support for external storage. This is achieved with a file relocation job that runs periodically and moves audio files from the local recording server to external storage such as FTP, SFTP, FTPS or Amazon S3.
Note that files are kept locally on the recording server(s) for a short time before they are relocated to the external storage. If a disk failure occurs during that time, the audio files that have not yet been relocated may be lost. We recommend using a RAID 1 disk array for physical servers, which provides local redundancy for those files in case of disk failure.
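Because the relocation job moves files off the recording server, files that stay in the local directory much longer than the job interval may indicate that relocation is lagging. The following sketch lists such files. The directory location and the age threshold are assumptions for the example, and the check is not a MiaRec feature.

```python
import time
from pathlib import Path
from typing import Optional


def files_pending_relocation(
    recordings_dir: Path,
    max_age_seconds: float,
    now: Optional[float] = None,
) -> list[Path]:
    """Return local files whose last modification is older than max_age_seconds."""
    current = time.time() if now is None else now
    return sorted(
        p
        for p in recordings_dir.rglob("*")   # walk the directory tree recursively
        if p.is_file() and current - p.stat().st_mtime > max_age_seconds
    )
```

With a relocation interval of about 5 minutes, a threshold of 15 minutes (900 seconds) leaves room for a few missed runs before a file is reported.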
External storage is almost always the preferred solution, except where the reliability of the network connection to the external storage is in question.
Using local storage on recording server instances is not considered fully redundant (although some form of redundancy can be achieved by using RAID 1 disk arrays). If there is a failure in one or all of the recorder instance(s), audio files would be lost and unrecoverable.
Redundancy can be achieved with local storage using the following methods:
MiaRec built-in replication mechanism that synchronizes audio files between two MiaRec clusters. In such a case, audio files exist in two copies on two servers, which can be geographically distributed.
Third-party file synchronization utilities such as rsync, which periodically create copies of the local audio files on external storage.
With both methods, each audio file is duplicated: one copy is stored in local storage and another copy is stored in remote storage or in the second cluster.
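For the rsync method, a periodic copy could be driven by a small script like the sketch below. The source directory and the destination are placeholders, and the function name is hypothetical. The helper only builds the command, which can then be passed to subprocess.run from a scheduled job.

```python
def build_rsync_command(source_dir: str, destination: str) -> list[str]:
    """Build an rsync command that copies the contents of source_dir to destination."""
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    source = source_dir.rstrip("/") + "/"
    return [
        "rsync",
        "--archive",   # recurse and preserve timestamps, permissions and links
        "--partial",   # keep partially transferred files so a retry can resume
        source,
        destination,
    ]
```

The command deliberately omits rsync's --delete option, so a file removed from the local server is not removed from the copy.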
Local storage is accessible reliably and with low latency, so it is ideal in scenarios with unreliable networks.