Profile

Amazon S3 is an online file storage system that provides developers and IT teams with secure, durable, and highly scalable cloud storage. Amazon S3 can be used alone or together with other services, including Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Block Store (Amazon EBS), and Amazon Glacier, as well as third-party storage repositories and gateways. Amazon S3 provides cost-effective object storage for a wide variety of use cases, including cloud applications, content distribution, backup and archiving, disaster recovery, and big data analytics. The FileCatalyst and Amazon S3 integration provides a means of delivering files to and from Amazon S3 securely, reliably, and at accelerated speeds.

Challenges

As more organizations leverage the accessibility and scalability of cloud storage, they are still challenged with transferring large files to and from their cloud storage quickly, securely, and reliably.

Currently, a variety of options are available to users who transfer files to and from Amazon S3, including file system drivers, FTP-based clients, and web interfaces. However, each of these options relies on traditional TCP-based protocols such as HTTP and FTP to move files across the WAN.

The result is slow transfers that leave high-bandwidth connections underused, particularly where latency and packet loss are present. Many of these cloud solutions cache the files locally and copy them to S3 storage in the background. This may create the illusion of speed, but the files are not accessible in S3 until they have finished transferring.
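
To see why TCP-based transfers struggle under these conditions, a rough back-of-the-envelope model can help. The sketch below is illustrative only: it uses the standard window/RTT bound and the Mathis approximation for loss-limited TCP throughput, and the window size, segment size, round-trip times, and loss rate are assumed example values, not measurements from any FileCatalyst or Amazon S3 deployment.

```python
import math


def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput when at most one window is in flight per round trip."""
    return window_bytes * 8 / rtt_s


def loss_limited_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Mathis approximation: rate <= (MSS / RTT) * (C / sqrt(p)), with C = sqrt(3/2)."""
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Assumed example values: 64 KiB window, 1460-byte segments, 0.1% packet loss.
    for rtt_ms in (10, 50, 100, 200):
        rtt = rtt_ms / 1000.0
        by_window = window_limited_bps(64 * 1024, rtt) / 1e6
        by_loss = loss_limited_bps(1460, rtt, 0.001) / 1e6
        print(f"RTT {rtt_ms:>3} ms: window-limited {by_window:8.2f} Mbps, "
              f"loss-limited {by_loss:8.2f} Mbps")
```

Both bounds shrink in proportion to the round-trip time, regardless of how much bandwidth the link offers. Under these assumptions, a 64 KiB window at 100 ms of latency cannot exceed roughly 5.2 Mbps.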

Solution

To accelerate file transfers, the FileCatalyst and Amazon S3 integration wraps Amazon’s SDK in a file system driver built on Java NIO.2.

FileCatalyst is able to treat S3 storage as a file system, so functions such as resume support and MD5 verification are available. Files can be streamed directly to S3, and nothing is staged locally.

FileCatalyst’s patented UDP-based solution allows users to upload their data at accelerated speeds of up to 10 Gbps. Multi-part HTTP file uploads by FileCatalyst are performed within the Amazon S3 infrastructure.
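
As a rough illustration of what a multi-part upload involves, the sketch below splits an object into numbered parts. It relies on Amazon S3's documented multipart limits (every part except the last must be at least 5 MiB, and an upload may have at most 10,000 parts). The function name `plan_parts` and the default 8 MiB target part size are hypothetical and do not describe FileCatalyst's internal logic.

```python
MIN_PART_SIZE = 5 * 1024 * 1024   # S3 minimum part size (the last part may be smaller)
MAX_PARTS = 10_000                # S3 maximum number of parts per upload


def plan_parts(object_size: int, target_part_size: int = 8 * 1024 * 1024):
    """Return a list of (part_number, offset, length) tuples covering the object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    part_size = max(target_part_size, MIN_PART_SIZE)
    # Grow the part size if the target would need more than MAX_PARTS parts.
    min_size_for_limit = (object_size + MAX_PARTS - 1) // MAX_PARTS
    part_size = max(part_size, min_size_for_limit)

    parts = []
    offset = 0
    part_number = 1  # S3 part numbers start at 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts
```

Each tuple would correspond to one upload-part request, and because the parts are independent they could be sent in parallel and retried individually.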

FileCatalyst’s use of UDP across the higher-latency WAN segments, combined with the lower-latency links between Amazon S3 and Amazon EC2, adds further performance improvements to FileCatalyst’s already blazing-fast transfer speeds.

Use Case

The FileCatalyst and Amazon S3 integration combines FileCatalyst’s file transfer acceleration with Amazon S3 storage to address a number of use cases, including:

  1. Enterprises managing two sets of disparate data, such as table-oriented on-premises data and SAN (Storage Area Network) repositories.
  2. Media companies with large amounts of multimedia files and related metadata. Using Amazon S3 with FileCatalyst eliminates the need to buy and maintain server infrastructure, and users can gather content directly from their cloud-based storage.
  3. Organizations that have large data warehouses.

Results

  • Reliability and congestion control are added to UDP at the application layer
  • Fine-grained bandwidth management
  • Smart resume and re-try for large file transfers
  • MD5 checksums are performed before a file is resumed
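
The last two points can be pictured with a small sketch of a resume check. This is not FileCatalyst's implementation: the function names `md5_of_prefix` and `resume_offset` are hypothetical, and the approach shown (hashing the bytes already received and comparing them with the same byte range of the source before continuing) is only one plausible way to verify a partial file with MD5.

```python
import hashlib
import os


def md5_of_prefix(path: str, length: int, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of the first `length` bytes of the file at `path`."""
    digest = hashlib.md5()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break  # the file is shorter than `length`
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def resume_offset(source_path: str, partial_path: str) -> int:
    """Return the byte offset to resume from, or 0 if the transfer must restart."""
    if not os.path.exists(partial_path):
        return 0
    received = os.path.getsize(partial_path)
    if received == 0 or received > os.path.getsize(source_path):
        return 0
    # Resume only if the bytes already received match the source's prefix.
    if md5_of_prefix(partial_path, received) == md5_of_prefix(source_path, received):
        return received
    return 0
```

If the digests differ, the partial file is treated as corrupt and the transfer starts again from the beginning rather than appending to bad data.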