GAE: Google Cloud Storage

About Google Cloud Storage (GCS)

Google Cloud Storage is useful for storing and serving large files. It also offers access control lists (ACLs), the ability to resume interrupted upload operations, and many other features. (The GCS client library uses this resume capability automatically for your app, giving you a robust way to stream data into GCS.)

About the Google Cloud Storage (GCS) client library

The GCS client library lets your application read files from and write files to buckets in Google Cloud Storage (GCS). This library supports reading and writing large amounts of data to GCS, with internal error handling and retries, so you don’t have to write your own code to do this. Moreover, it provides read buffering with prefetch so your app can be more efficient.

The GCS client library includes the following functionality:

Where to download the GCS client library

For download instructions and distribution contents, see the downloads page.

What you need to do to use the GCS client library

In order to access GCS from App Engine, you must activate a Cloud project for GCS as described on the activation page.

Alternative methods for accessing Google Cloud Storage

The GCS client library provides a way to read from and write to Google Cloud Storage that is closely integrated with Google App Engine, enabling App Engine apps to create objects in GCS and serve them from GCS.

However, there are other ways to access GCS from App Engine besides using the GCS client library. You can use any of these methods as well:

Blobstore API

You can use the Blobstore API to upload objects to and serve objects from GCS. You’ll need to use the BlobstoreService.createGsBlobKey() method to create a blob key representing the GCS object. This approach is useful for uploading files from a web page. When the Blobstore API is used together with the Images API, you get a powerful way to serve images, because you can serve them directly from GCS, bypassing the App Engine app, which saves on instance-hour costs.

GCS REST API

You can use the Cloud Storage REST API directly to read data from and write data to GCS. The GCS client library itself uses the Cloud Storage REST API. However, the REST API lacks the App Engine optimizations already done for you by the GCS client library, so you may be doing unnecessary work if you use the REST API directly. If the GCS client library lacks a feature you need and the REST API supplies it, using the REST API may be a good option.

GCS storage manager

If you need to upload objects quickly and don’t mind a manual process, you can use the GCS Storage Manager.

Key concepts of Google Cloud Storage

For complete details on GCS, including a complete description of its concepts, refer to the GCS documentation. The following brief synopsis of some GCS features that affect the GCS client library is provided as a convenience.

Buckets, objects, and ACLs

The storage location you read files from and write files to is a GCS bucket. GCS client library calls always specify the bucket being used. Your project can access multiple buckets. How do these buckets get created? There are currently no client library calls for creating GCS buckets, so you need to create them up front using the GCS Storage Manager or the gsutil tool provided by GCS.

Access to the buckets and to the objects contained in them is controlled by an access control list (ACL). Your Google Cloud project and your App Engine app are added to the ACL permitting bucket access during activation. The ACL governing bucket access is distinct from the potentially many ACLs governing the objects in that bucket. Thus, your app has read and write privileges to the bucket(s) it is activated for, but it only has full rights to the objects it creates in the bucket. Your app’s access to objects created by other apps or persons is limited to the rights granted to your app by the objects’ creator.

If an object is created in the bucket without an ACL explicitly defined for it, it uses whatever default object ACL has been assigned to the bucket by the bucket owner. If the bucket owner has not specified a default object ACL, the object default is public-read, which means that anyone allowed bucket access can read the object.

ACLs and the GCS Client Library

An app using the GCS client library cannot change the bucket ACL, but it can specify an ACL that controls access to the objects it creates. The available ACL settings are described in the documentation for the GcsFileOptions object.

Modifying GCS objects

Once you have created an object in a bucket, it cannot be modified (no appending). To modify an object in a bucket, you need to overwrite the object with a new object of the same name that contains your desired changes.
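The overwrite-to-modify rule above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: OverwriteSketch, overwrite(), read(), and appendByOverwrite() are hypothetical names invented for this example and are not part of the GCS client library. The sketch shows that an "append" amounts to reading the old contents and replacing the whole object under the same name.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: an in-memory stand-in for a bucket, not the GCS client library. */
final class OverwriteSketch {
    // Object name -> contents. Stored contents are never edited in place.
    private final Map<String, byte[]> objects = new HashMap<>();

    /** Creates the object, or replaces it entirely if the name already exists. */
    void overwrite(String name, byte[] contents) {
        objects.put(name, contents.clone());
    }

    /** Returns a copy of the object's contents. */
    byte[] read(String name) {
        byte[] contents = objects.get(name);
        if (contents == null) {
            throw new IllegalArgumentException("No such object: " + name);
        }
        return contents.clone();
    }

    /** Emulates an append: read the old contents, then overwrite with old + extra. */
    void appendByOverwrite(String name, byte[] extra) {
        byte[] old = read(name);
        byte[] merged = new byte[old.length + extra.length];
        System.arraycopy(old, 0, merged, 0, old.length);
        System.arraycopy(extra, 0, merged, old.length, extra.length);
        overwrite(name, merged);
    }

    public static void main(String[] args) {
        OverwriteSketch bucket = new OverwriteSketch();
        bucket.overwrite("notes.txt", "first line\n".getBytes(StandardCharsets.UTF_8));
        bucket.appendByOverwrite("notes.txt", "second line\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(new String(bucket.read("notes.txt"), StandardCharsets.UTF_8));
    }
}
```

With a real bucket the same pattern means re-uploading the full object, so for large, frequently changing data it may be cheaper to write several smaller objects than to rewrite one large one.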

GCS and “subdirectories”

Google Cloud Storage documentation refers to “subdirectories”, and the GCS client library allows you to supply subdirectory delimiters when you create an object. However, GCS does not actually store the objects in any real subdirectory. Instead, the subdirectories are simply part of the object filename. For example, if you have a bucket my_bucket and store the file somewhere/over/the/rainbow.mp3, the file rainbow.mp3 is not really stored in the subdirectory somewhere/over/the/. It is actually a file named somewhere/over/the/rainbow.mp3.
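To make the flat namespace concrete, here is a minimal sketch that models a bucket as a sorted set of full object names and derives a directory-style listing purely from a name prefix and a delimiter. FlatBucketSketch, put(), and list() are hypothetical names made up for this illustration; they are not GCS client library calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Illustrative only: models a bucket as a flat, sorted set of object names. */
final class FlatBucketSketch {
    private final TreeSet<String> objectNames = new TreeSet<>();

    /** Stores an object under its full name; no directories are created. */
    void put(String objectName) {
        objectNames.add(objectName);
    }

    /**
     * Emulates a one-level "directory listing": returns each distinct name under
     * the prefix, cut off after the next delimiter if one is present.
     */
    List<String> list(String prefix, char delimiter) {
        TreeSet<String> entries = new TreeSet<>();
        for (String name : objectNames.tailSet(prefix, true)) {
            if (!name.startsWith(prefix)) {
                break; // sorted order: no later name can match the prefix
            }
            int cut = name.indexOf(delimiter, prefix.length());
            entries.add(cut < 0 ? name : name.substring(0, cut + 1));
        }
        return new ArrayList<>(entries);
    }

    public static void main(String[] args) {
        FlatBucketSketch bucket = new FlatBucketSketch();
        bucket.put("somewhere/over/the/rainbow.mp3");
        bucket.put("somewhere/over/there.txt");
        bucket.put("readme.txt");
        // prints [readme.txt, somewhere/]
        System.out.println(bucket.list("", '/'));
        // prints [somewhere/over/the/, somewhere/over/there.txt]
        System.out.println(bucket.list("somewhere/over/", '/'));
    }
}
```

The "subdirectory" entries in the output exist only because some object name happens to contain the delimiter; removing that object makes the apparent directory disappear as well.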

Retries and exponential backoff

The GCS client library provides a configurable mechanism for automatic request retries in the event of timeout failures when accessing GCS. The same mechanism also provides exponential backoff to determine an optimal processing rate. (For a description of exponential backoff in GCS, see the Google Cloud Storage documentation on backoff.)

To change the default values for retries and backoff, you use the RetryParams class.
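For readers unfamiliar with the technique, the following is a minimal, generic sketch of retrying with exponential backoff. It is not the library's implementation and does not use the RetryParams API; BackoffSketch, delayMillis(), callWithRetries(), and all parameter names are assumptions made for this example. The delay doubles after each failed attempt up to a cap, and the last failure is rethrown once the attempts are exhausted. Jitter, which production implementations often add, is left out for brevity.

```java
import java.util.concurrent.Callable;

/** Illustrative retry helper; not the GCS client library's RetryParams API. */
final class BackoffSketch {
    /** Delay before retry number attempt (0-based): initial * 2^attempt, capped at maxDelayMillis. */
    static long delayMillis(int attempt, long initialDelayMillis, long maxDelayMillis) {
        long delay = initialDelayMillis;
        for (int i = 0; i < attempt && delay < maxDelayMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMillis);
    }

    /** Calls the request until it succeeds or maxAttempts is reached, sleeping between tries. */
    static <T> T callWithRetries(Callable<T> request, int maxAttempts,
                                 long initialDelayMillis, long maxDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (Exception e) {
                last = e;
                if (attempt + 1 < maxAttempts) {
                    // Back off before the next try; no sleep after the final failure.
                    Thread.sleep(delayMillis(attempt, initialDelayMillis, maxDelayMillis));
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated request that times out twice and then succeeds.
        String result = callWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("simulated timeout");
            }
            return "ok";
        }, 5, 10, 1000);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

Capping the delay keeps a long outage from producing unbounded waits, while the doubling spreads retries out so a struggling service is not hit at a constant rate.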

Using GCS client library with the development app server

You can use the client library with the development server in SDK version 1.8.1 and later. The development server provides GCS emulation using the local disk.

Pricing, quotas, and limits

There are no charges associated with making GCS client library calls to Google Cloud Storage.

However, any data stored in GCS is charged the usual GCS data storage fees. Cloud Storage is a pay-to-use service; you will be charged according to the Cloud Storage price sheet.

What to do next

To create, deploy, and run your app:

  1. Download the client library.
  2. Create an App Engine project and activate it for GCS.
  3. Optionally, if you have an existing app that uses the older Google Cloud Storage API, migrate your app.
  4. Go through the brief Getting Started instructions for a quick orientation in using the library.
  5. Upload and deploy your app to production App Engine.
  6. Test the app for expected behavior with GCS.