Google Cloud Platform (GCP) offers a wide range of managed storage and database options. As an architect, you need to know the characteristics of each and be able to select a suitable one during the design process. The choice is not simple: it means balancing a number of characteristics, including the type of data, scale, durability, availability, and location requirements.
Google Cloud-managed storage and database portfolio
At a high level, the services range from relational, NoSQL, object storage, and data warehouse to in-memory. These GCP storage services are fully managed, scalable, and backed by industry-leading SLAs.
1. Cloud SQL : A fully managed relational database service for running MySQL and PostgreSQL databases in the cloud.
2. Cloud Spanner : Another fully managed, relational Google Cloud database service, Cloud Spanner differs from Cloud SQL by focusing on combining the benefits of relational structure and non-relational scalability. It provides consistency across rows and high-performance operations and includes features like built-in security, automatic replication, and multi-language support.
3. Firestore : Firestore is a flexible, scalable NoSQL cloud database for storing and syncing data in client- and server-side development.
4. Cloud Bigtable : Cloud Bigtable is a fully-managed non-relational database that is suitable for both real-time access and analytics workloads. It is an excellent solution for large-scale, low-latency applications as well as intensive data analytics such as IoT, personalisation, recommendations, monitoring and geospatial datasets.
5. Cloud Storage : Cloud Storage is the object storage service on Google Cloud Platform. This highly scalable service can manage an unlimited number of objects of up to 5 TB each, such as images and content files.
6. BigQuery : BigQuery is a fully managed, serverless data warehouse in which you can analyse data with SQL and query streaming data. Its built-in Data Transfer Service helps you migrate data from on-premises sources, including Teradata.
7. Memorystore : Designed to be secure, highly available, and scalable, Cloud Memorystore is a fully managed, in-memory Google Cloud data store that enables you to create application caches with sub-millisecond latency for data access.
Key Storage Characteristics
Different GCP storage services have different availability SLAs (Service Level Agreements). For a given service, the availability SLA often depends on how the service is configured.
For example, as the image shows, Cloud Storage availability varies depending on whether multi-regional, regional, or Coldline buckets are created. The same applies to Cloud Spanner and Firestore, where multi-region configurations offer higher availability than single-region ones. This is where requirements are extremely important, as they inform the storage choices.
Durability represents the odds of losing data. Depending on the GCP storage solution, durability is a shared responsibility. Google Cloud's responsibility is to ensure that data is durable in the event of a hardware failure.
Your responsibility is to back up your data. For example, Cloud Storage provides 11 nines of durability and offers versioning as a feature, but it is your responsibility to decide when to use versioning. For other storage services, achieving durability usually means taking backups. For disks this means snapshots, so snapshot jobs should be scheduled. For Cloud SQL, Google Cloud provides automated backups, point-in-time recovery, and optionally a failover server; to improve durability further, you should also run SQL database backups. Spanner and Firestore provide automatic replication, and you should also run export jobs that write the data to Cloud Storage.
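To make an availability percentage easier to compare against requirements, it can help to convert it into a downtime budget. The sketch below is plain arithmetic, not a Google Cloud API; the function name, the 30-day month, and the sample percentages are assumptions chosen for illustration, so check the published SLA of each service for the real figures.

```python
def monthly_downtime_minutes(availability_percent: float, days_in_month: int = 30) -> float:
    """Return the maximum downtime, in minutes, that an availability target allows per month."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# Illustrative targets only (assumed values, not quoted from any SLA).
for target in (99.0, 99.9, 99.95, 99.99):
    print(f"{target}% availability allows about "
          f"{monthly_downtime_minutes(target):.1f} minutes of downtime per month")
```

Each additional nine shrinks the downtime budget by a factor of ten, which is why the difference between a regional and a multi-region configuration can matter for strict requirements.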
Amount of data and the number of reads and writes
The amount of data and the number of reads and writes are important to know when selecting a data storage service.
Some services, such as Bigtable and Spanner, scale horizontally by adding nodes. Others, such as Cloud SQL and Memorystore, scale vertically by moving to larger machines. A third group scales automatically with no limits, for example Cloud Storage, BigQuery, and Firestore.
Strong consistency is another important characteristic to consider when designing data solutions. Strongly consistent databases update all copies of the data within a transaction, which ensures that every reader gets the latest copy of the data. Google Cloud services that provide strong consistency include Cloud Storage, Cloud SQL, Spanner, and Firestore.
Eventually consistent databases typically keep multiple copies of the same data for performance and scalability, which lets them handle large volumes of writes. They operate by updating one copy of the data synchronously and the remaining copies asynchronously, which means that not all readers are guaranteed to read the same value at a given point in time. The data will eventually become consistent, but not immediately. Bigtable and Memorystore, when configured with replication, are examples of Google Cloud data services that have eventual consistency.
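The grouping above can be kept as a small lookup table during design reviews. This is only a sketch of the classification given in this article; the enum and function names are hypothetical and nothing here calls a Google Cloud API.

```python
from enum import Enum


class Scaling(Enum):
    HORIZONTAL = "add nodes"
    VERTICAL = "use a larger machine"
    AUTOMATIC = "handled by the service"


# Classification taken from the paragraph above.
SCALING_MODEL = {
    "Bigtable": Scaling.HORIZONTAL,
    "Spanner": Scaling.HORIZONTAL,
    "Cloud SQL": Scaling.VERTICAL,
    "Memorystore": Scaling.VERTICAL,
    "Cloud Storage": Scaling.AUTOMATIC,
    "BigQuery": Scaling.AUTOMATIC,
    "Firestore": Scaling.AUTOMATIC,
}


def services_with(model: Scaling) -> list[str]:
    """Return the service names that use the given scaling model, sorted alphabetically."""
    return sorted(name for name, m in SCALING_MODEL.items() if m is model)


for model in Scaling:
    print(f"{model.name.lower()} ({model.value}): {', '.join(services_with(model))}")
```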
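The difference between the two models is easier to see in a toy simulation. The class below is a hypothetical in-memory model written for this article, not how any Google Cloud service is implemented: a write updates one copy synchronously and is queued for the replicas, so a read from a replica can be stale until replication runs.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy model: writes reach the primary immediately and the replicas later."""

    def __init__(self, replica_count: int = 2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._pending = deque()  # writes not yet applied to the replicas

    def write(self, key, value):
        self.primary[key] = value            # synchronous update of one copy
        self._pending.append((key, value))   # queued for asynchronous replication

    def read(self, key, replica=None):
        """Read from the primary, or from a replica when an index is given."""
        source = self.primary if replica is None else self.replicas[replica]
        return source.get(key)

    def replicate(self):
        """Apply every queued write to every replica (stands in for background replication)."""
        while self._pending:
            key, value = self._pending.popleft()
            for copy in self.replicas:
                copy[key] = value


store = EventuallyConsistentStore()
store.write("user:1", "v1")
stale = store.read("user:1", replica=0)   # replica has not caught up yet
store.replicate()
fresh = store.read("user:1", replica=0)   # copies have converged
print(f"before replication: {stale!r}, after replication: {fresh!r}")
```

A strongly consistent store would behave as if `replicate()` ran inside every `write()`, at the cost of slower writes.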
Total cost per GB
When designing a data storage solution, calculating the total cost per GB is important to help determine the financial implications of a choice.
- Bigtable and Spanner are designed for massive data sets and are not as cost effective for small data sets.
- Firestore is less expensive per GB stored, but the cost for reads and writes must be considered.
- Cloud Storage is relatively inexpensive, but is only suitable for certain data types.
- BigQuery storage is relatively cheap, but it does not provide fast access to individual records and a cost is incurred for each query.
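The trade-offs in this list can be explored with a rough calculator. Every number below is a made-up placeholder rather than a real GCP price, and the `PriceSheet` fields are assumptions about how a bill might break down (storage, operations, and a fixed charge such as nodes); substitute figures from the current pricing pages before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceSheet:
    """Hypothetical monthly prices in an arbitrary currency; not real GCP rates."""
    storage_per_gb: float
    per_million_reads: float = 0.0
    per_million_writes: float = 0.0
    fixed_per_month: float = 0.0  # e.g. node or instance charges


def monthly_cost(prices: PriceSheet, stored_gb: float, reads: int = 0, writes: int = 0) -> float:
    """Total monthly cost: fixed charges plus storage plus read and write operations."""
    return (prices.fixed_per_month
            + prices.storage_per_gb * stored_gb
            + prices.per_million_reads * reads / 1_000_000
            + prices.per_million_writes * writes / 1_000_000)


def total_cost_per_gb(prices: PriceSheet, stored_gb: float, reads: int = 0, writes: int = 0) -> float:
    """Monthly cost divided by the amount of data stored."""
    if stored_gb <= 0:
        raise ValueError("stored_gb must be positive")
    return monthly_cost(prices, stored_gb, reads, writes) / stored_gb


# Placeholder price sheets (assumed values for illustration only).
operation_priced = PriceSheet(storage_per_gb=0.10, per_million_reads=0.50, per_million_writes=1.50)
node_priced = PriceSheet(storage_per_gb=0.20, fixed_per_month=500.0)

small = total_cost_per_gb(node_priced, stored_gb=10)
large = total_cost_per_gb(node_priced, stored_gb=100_000)
print(f"node-priced service: {small:.2f}/GB at 10 GB, {large:.3f}/GB at 100,000 GB")
```

With a fixed monthly charge, the cost per GB falls sharply as the data set grows, which is the reason node-based services are a poor fit for small data sets. For operation-priced services, the read and write terms can dominate the storage term.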
Storage and database decision chart
Leverage this chart when selecting a GCP storage or database service.