AWS Snowball Edge

Snowball Edge is a physical, shippable and self-contained data storage and compute appliance. It enables organisations to move large volumes of data into and out of the AWS cloud. Snowball Edge can be used as a standalone device or clustered with other devices to provide data mobility at petabyte scale.
Each device includes onboard compute capabilities that enable data collection, data manipulation and data preparation activities. Data is collected on the device and later imported into the AWS cloud. The device may also be used on its own, with no import step.

The onboard compute capabilities enable customers to extend the power of the AWS cloud to their local environment. It also adds the functionality required to create a wide variety of solutions to meet their needs.

What can a Snowball Edge do for you?

The Snowball Edge appliance is a multipurpose device that can be used to enhance your connection to the cloud. You can use it to simplify transporting large datasets to and from the cloud and to expedite migrating your data to the cloud. You can also use it as a standalone device to store data at a remote location, redirecting application storage to the device without changing your application logic.

You can use multiple devices to form a cluster for increased capacity and data durability. Snowball Edge also runs Amazon Elastic Compute Cloud (EC2) instances and Lambda functions, so you can process data at the edge for local use before importing it into the cloud. This functionality enables you to perform activities such as field analytics, log processing, data tagging, data transformation, and data organization. Data on the device is stored in an S3-compatible format. It can be imported directly into an Amazon S3 bucket in the AWS cloud, or exported from a bucket in the AWS cloud for local use. You can use the Amazon S3 Adapter for Snowball to programmatically transfer data to and from the Snowball Edge device using a subset of the Amazon S3 REST API calls.

Features:

AWS Snowball Edge supports the NFS version 3 and NFS version 4.1 protocols. You can start a Snowball Edge cluster with a minimum of 5 nodes and expand it up to a maximum of 10 nodes. The usable storage capacity per node is reduced when nodes are clustered, because the data is protected across multiple clustered appliances to provide higher durability. Clustering also adds to the compute resources available to serve your application's needs.

Snowball Edge Model Options

Snowball Edge Storage Optimized

  • 80 TB of usable storage space,
  • 24 vCPUs,
  • 32 GiB of memory for compute functionality

Snowball Edge Compute Optimized

  • 39.5 TB of usable storage space,
  • 52 vCPUs,
  • 208 GiB of memory for compute functionality

Snowball Edge Compute Optimized with Graphics Processing Unit

  • 39.5 TB of usable storage space,
  • 52 vCPUs,
  • 208 GiB of memory for compute functionality
  • installed GPU
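
The three options above can be compared against a workload's requirements with a small lookup. The following is a minimal Python sketch, not an AWS tool: the class and function names are illustrative, and the only facts it relies on are the storage, vCPU, memory and GPU figures listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SnowballEdgeModel:
    name: str
    usable_storage_tb: float
    vcpus: int
    memory_gib: int
    has_gpu: bool = False


# Figures copied from the model list above.
MODELS: List[SnowballEdgeModel] = [
    SnowballEdgeModel("Storage Optimized", 80, 24, 32),
    SnowballEdgeModel("Compute Optimized", 39.5, 52, 208),
    SnowballEdgeModel("Compute Optimized with GPU", 39.5, 52, 208, has_gpu=True),
]


def pick_model(
    storage_tb: float, vcpus: int, memory_gib: int, needs_gpu: bool = False
) -> Optional[SnowballEdgeModel]:
    """Return the first listed model that satisfies every requirement, or None."""
    for model in MODELS:
        if (
            model.usable_storage_tb >= storage_tb
            and model.vcpus >= vcpus
            and model.memory_gib >= memory_gib
            and (model.has_gpu or not needs_gpu)
        ):
            return model
    return None
```

For example, `pick_model(60, 8, 16)` selects the Storage Optimized entry, while a request for more storage than any single device offers returns `None`, which is the point at which a cluster or multiple jobs would be considered.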

Snowball Edge Job Types

Snowball Edge enables three high-level primary customer use case scenarios:

  • Importing large datasets to the AWS cloud,
  • Exporting large datasets from the AWS cloud,
  • Local data collection and processing.

Each Snowball Edge is configured for a single use case and can be used for only one job at a time. If you have multiple use case requirements, multiple locations or multiple job occurrences, then you’ll need to create multiple Snowball Edge orders and jobs to meet your needs.


Importing Jobs to Amazon S3

After you create a job in the AWS Snow Family Management Console or through the job management API, AWS ships you a Snowball. Once it arrives, you connect the Snowball to your network and transfer the data that you want imported into Amazon S3 onto the device, using the Snowball client or the Amazon S3 Adapter for Snowball. When you're done transferring data, you ship the Snowball back to AWS, and AWS imports your data into Amazon S3.

Exporting Jobs from Amazon S3

After you create a job in the AWS Snow Family Management Console, a listing operation starts in Amazon S3. It splits your job into parts that can each be up to about 80 TB in size, and each job part has exactly one Snowball associated with it. Soon after that, AWS starts exporting your data onto a Snowball. When the Snowball arrives at your data center, you connect it to your network and transfer the exported data to your servers by using the Snowball client or the Amazon S3 Adapter for Snowball.
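
Because each job part maps to exactly one Snowball, the part size gives a rough lower bound on how many devices an export will involve. The sketch below is a back-of-the-envelope estimate in Python; the function name is hypothetical, and the real split is decided by the Amazon S3 listing operation, so treat the result as an approximation only.

```python
import math

MAX_PART_TB = 80  # approximate upper bound per job part, from the text above


def estimate_job_parts(total_export_tb: float, max_part_tb: float = MAX_PART_TB) -> int:
    """Estimate the minimum number of job parts (and Snowballs) for an export."""
    if total_export_tb <= 0:
        raise ValueError("total_export_tb must be positive")
    # Each part holds at most max_part_tb, so round up to whole parts.
    return math.ceil(total_export_tb / max_part_tb)
```

Under these assumptions, exporting 200 TB would take at least three job parts, since 200 / 80 = 2.5 rounds up to 3.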

Local Compute and Storage only jobs

Local compute and storage jobs enable you to use Amazon S3 and AWS Lambda powered by AWS IoT Greengrass locally, without an internet connection. While local storage and compute functionality also exists for the import and export job types, this job type is for local use only. You can't export data from Amazon S3 onto the device or import data into Amazon S3 when the device is returned. You can read and write objects on an AWS Snowball Edge device using the Amazon S3 Adapter for Snowball or the file interface. The adapter comes built into the device.
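
Since the adapter accepts Amazon S3 REST API calls, a standard S3 client can usually be pointed at the device instead of the AWS cloud. The following Python sketch shows one way this might look with boto3. The helper names, the example IP address, and the ports (8443 for HTTPS, 8080 for HTTP) are assumptions that are not stated in this article, so check them against your device's documentation before use.

```python
def adapter_endpoint(device_ip: str, use_https: bool = True) -> str:
    """Build the adapter URL for a device reachable at device_ip."""
    # Ports 8443 (HTTPS) and 8080 (HTTP) are assumptions; confirm them for your device.
    scheme, port = ("https", 8443) if use_https else ("http", 8080)
    return f"{scheme}://{device_ip}:{port}"


def make_adapter_client(device_ip: str, access_key: str, secret_key: str, ca_bundle: str):
    """Return a boto3 S3 client that talks to the device rather than the AWS cloud."""
    import boto3  # third-party package, imported lazily so adapter_endpoint has no dependencies

    return boto3.client(
        "s3",
        endpoint_url=adapter_endpoint(device_ip),
        aws_access_key_id=access_key,      # credentials obtained from the unlocked device
        aws_secret_access_key=secret_key,
        verify=ca_bundle,                  # path to the device's certificate bundle
    )
```

With a client created this way, ordinary calls such as `client.upload_file("scan.tif", "my-bucket", "raw/scan.tif")` would write to the device; the file, bucket and key names here are placeholders.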

Snowball Edge Use Cases

1. Offline Data Collection:

Snowball Edge is useful for large volumes of data, especially when quick and cost-effective transport over a network is difficult. Customers may also want to consume the data locally for a short time after it is generated. For example, researchers on an airplane, ship or submarine might capture detailed scientific or environmental data for immediate analysis before sending it to the cloud for further analysis. In this case, they need local data collection and processing: the data must be stored securely, high-speed and reliable connectivity to the cloud is unavailable, and the job runs for a limited duration before the data is uploaded to the cloud for later use.

2. Local Tiering and Compute:

A Snowball Edge cluster can support workloads independently of the cloud when connectivity is intermittent or latency is vitally important, such as in a medical facility. A healthcare customer integrated Snowball Edge with their MRI systems. The code that interfaces with and manages the MRI image retrieval system also runs on the Snowball Edge cluster. If connectivity to the outside world is lost, the MRI machines continue to function and store data on the Snowball Edge cluster. A copy of the data can later be rotated out to the AWS cloud after several weeks. The use of Snowball Edge enables a hybrid cloud implementation with specific local processing and secure import of data to the cloud for retention and security.

3. Local data transformation:

In many use cases, raw data must be stored, analyzed and transformed to extract or produce new information. An example is large-scale physical document scanning and OCR. The raw scanned document images are sent to the Snowball Edge. After performing OCR, the device stores the extracted text alongside each image. The Snowball Edge has enough compute capacity to perform some or all of this work before the device arrives at AWS for import into the cloud. A similar example is thumbnail generation, tagging and resizing for image applications.

4. Internet of Things:

Snowball Edge provides a platform for collection and analysis that can capture a raw data stream and react to it quickly. Consider an industrial wind farm where sensor data from the windmills streams to the Snowball Edge. Lambda functions examine the data streams for anomalies, aggregate metrics, and send alarms or control signals. The raw data is staged on the Snowball Edge cluster and later sent to the AWS cloud, where it joins the larger overall dataset.
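
The wind farm scenario can be sketched as a Lambda-style handler that aggregates a batch of readings and flags outliers. This is an illustrative Python sketch only: the event shape, the field names `turbine_id` and `vibration`, and the threshold value are all hypothetical, and a real deployment would forward the alarms to whatever alerting or control system is in place.

```python
from statistics import mean
from typing import Any, Dict, List

VIBRATION_LIMIT = 7.5  # hypothetical alarm threshold, not taken from the article


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Lambda-style entry point: aggregate readings and flag anomalous turbines."""
    readings: List[Dict[str, Any]] = event["readings"]
    values = [r["vibration"] for r in readings]
    # A turbine raises an alarm if any of its readings exceeds the limit.
    alarms = sorted({r["turbine_id"] for r in readings if r["vibration"] > VIBRATION_LIMIT})
    return {
        "count": len(values),
        "mean_vibration": mean(values) if values else None,
        "max_vibration": max(values, default=None),
        "alarms": alarms,
    }
```

The handler returns only the summary and the list of turbines to alert on; the raw readings would stay on the device until they are shipped to the cloud.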

Client-Example

Philips Healthcare develops technology solutions for customers, patients, providers and caregivers across the health continuum, from supporting healthy living and prevention to diagnosis, treatment and home care.

Philips collects and processes patient data in over 1200 intensive care units. It analyzes this data to help predict patient outcomes and provide actionable information to healthcare providers. In some cases, the healthcare providers need to respond in seconds. The previous solution was to have multiple servers and local storage in each hospital. The challenge was building, maintaining and managing the infrastructure and systems for high availability and reliability.

The solution with AWS is to embed Snowball Edge devices in the hospital networks. The devices collect data locally, and Lambda functions perform analytics on it in real time. The Snowball Edge implementation provides a local dataset to run on in case of connectivity issues. It also enables secure upload of colder data to the AWS cloud for retrieval and compliance retention.

Written by 

Vidushi Bansal is a Software Consultant [Devops] at Knoldus Inc. She is passionate about learning and exploring new technologies.