When analyzing Cloud Storage use, we consider three needs:

  1. Performance
  2. Retention
  3. Access patterns

Retention

Setting a lifecycle policy lets you tag specific objects or buckets and create an automatic rule that deletes objects of that type or moves them to a different storage class.

The ability to transform objects into lower-cost storage classes is a powerful tool, but one that must be used with caution. …
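As a rough illustration of such a rule, the sketch below builds a lifecycle configuration in the JSON shape that Cloud Storage accepts: one rule that moves Standard objects to Nearline, and one that deletes them later. The function name and the day thresholds are hypothetical and chosen only for the example.

```python
import json


def build_lifecycle_config(nearline_after_days: int, delete_after_days: int) -> dict:
    """Return a Cloud Storage lifecycle configuration with two rules.

    The thresholds are illustrative assumptions, not recommendations.
    """
    if nearline_after_days >= delete_after_days:
        raise ValueError("objects must change storage class before they are deleted")
    return {
        "rule": [
            {
                # Move objects to a lower-cost class once they reach the given age.
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {
                    "age": nearline_after_days,
                    "matchesStorageClass": ["STANDARD"],
                },
            },
            {
                # Delete objects once they reach the retention limit.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_config(30, 365), indent=2))
```

Saved to a file, the output could be applied to a bucket with `gsutil lifecycle set <FILE> gs://<BUCKET>`.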


  1. Specify cluster image versions.
    Cloud Dataproc uses image versions to bundle operating system and big data components (including core and optional components) and GCP connectors into a single package that is deployed on a cluster.
    If you don’t specify an image version when creating a new cluster, Cloud Dataproc will default to the most recent stable image version. For production environments, we recommend that you always associate your cluster creation step with a specific minor Cloud Dataproc version, as shown in this example gcloud command:
    gcloud dataproc clusters create my-pinned-cluster --image-version 1.4-debian9
  2. Know when to use custom images.
    If you have…

  1. Control projection — Query only the columns that you need.
    Projection refers to the number of columns that are read by your query. Projecting excess columns incurs additional (wasted) I/O and materialization (writing results). Avoid SELECT *.
  2. Prune partitioned queries
    When querying a partitioned table, use the _PARTITIONTIME pseudo column to filter the partitions.
  3. Denormalizing data
    Denormalization is a common strategy for increasing read performance for relational datasets that were previously normalized.
    The recommended way to denormalize data in BigQuery is to use nested and repeated fields. It’s best to use this strategy when the relationships are hierarchical and frequently queried…
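The first two tips can be sketched together as a small query builder that names its columns explicitly and filters on _PARTITIONTIME. This is a minimal sketch: the function, table, and column names are hypothetical, and it assumes an ingestion-time partitioned table.

```python
from datetime import date
from typing import List


def build_pruned_query(table: str, columns: List[str], start: date, end: date) -> str:
    """Build a query that reads only the named columns and partitions.

    Values are interpolated directly for readability; with untrusted
    input, use query parameters instead.
    """
    if not columns:
        raise ValueError("list the columns explicitly instead of using SELECT *")
    select_list = ", ".join(columns)
    return (
        f"SELECT {select_list}\n"
        f"FROM `{table}`\n"
        # The half-open range [start, end) limits the scan to those partitions.
        f"WHERE _PARTITIONTIME >= TIMESTAMP('{start.isoformat()}')\n"
        f"  AND _PARTITIONTIME < TIMESTAMP('{end.isoformat()}')"
    )


if __name__ == "__main__":
    print(
        build_pruned_query(
            "my-project.my_dataset.events",  # hypothetical table
            ["user_id", "event_type"],       # hypothetical columns
            date(2020, 1, 1),
            date(2020, 1, 8),
        )
    )
```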

BigQuery operations that are free of charge in any location:

  • Batch loading data
  • Automatic re-clustering
  • Exporting data
  • Deleting tables, views, partitions, functions, and datasets
  • Metadata operations
  • Cached queries
  • Queries that result in an error
  • Storage for the first 10 GB of data per month
  • Query data processed for the first 1 TB of data per month

For any location, the BigQuery pricing is broken down like this:

1. Storage

  • Active storage
  • Long-term storage
  • Streaming inserts

2. Query processing

  • On-demand
  • Flat-rate
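To show how the free allowances listed above interact with on-demand billing, here is a deliberately simplified estimate. The rates are passed in as parameters and the values in the usage example are placeholders, not actual prices. The sketch also ignores the split between active and long-term storage, streaming inserts, and flat-rate pricing.

```python
FREE_STORAGE_GB = 10.0  # first 10 GB of storage per month is free
FREE_QUERY_TB = 1.0     # first 1 TB of query data processed per month is free


def estimate_monthly_cost(
    storage_gb: float,
    query_tb: float,
    storage_rate_per_gb: float,
    query_rate_per_tb: float,
) -> float:
    """Estimate a monthly bill after subtracting the free allowances.

    Both rates are caller-supplied assumptions; look up the current
    prices for your location before relying on the result.
    """
    billable_storage_gb = max(0.0, storage_gb - FREE_STORAGE_GB)
    billable_query_tb = max(0.0, query_tb - FREE_QUERY_TB)
    return (
        billable_storage_gb * storage_rate_per_gb
        + billable_query_tb * query_rate_per_tb
    )


if __name__ == "__main__":
    # Placeholder rates, for illustration only.
    print(estimate_monthly_cost(110, 3, storage_rate_per_gb=0.02, query_rate_per_tb=5.0))
```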

Storage

  1. Keep your data only as long as you need it.
    Configure default table expiration on your dataset. …

This post looks at calling BigQuery from devices through the REST API. Devices often have a small footprint and tight constraints, so it may not be possible to install client libraries on them. By using a signed JWT directly as a bearer token, rather than an OAuth 2.0 access token, you can avoid having to make a network request to Google’s authorization server before making an API call.

  1. Create a service account, assign it the necessary BigQuery IAM permissions, and download the JSON key file. The JSON key file is structured like this:
{
"type": "service_account",
"project_id": "<PROJECT-ID>",
"private_key_id": "<PRIVATE_KEY_ID>",
"private_key": "<PRIVATE_KEY>",
"client_email": "<SA_NAME>@<PROJECT-ID>.iam.gserviceaccount.com",
"client_id": "<CLIENT_ID>",
"auth_uri": "https://accounts.google.com/o/oauth2/auth"…
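As a sketch of how the key file fields feed into the token, the snippet below builds the unsigned part of a self-signed JWT (header and claims, base64url-encoded) using only the standard library. The claim layout follows Google's documented flow for service account authorization without OAuth as I understand it, and the audience URL is an assumption for BigQuery. Signing is intentionally left out: the returned string would still have to be signed with RS256 using the private_key field, through whatever crypto library the device supports.

```python
import base64
import json
import time
from typing import Optional

# Assumed audience for the BigQuery API; verify against the current docs.
BIGQUERY_AUDIENCE = "https://bigquery.googleapis.com/"


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(key: dict, now: Optional[int] = None, lifetime_s: int = 3600) -> str:
    """Return '<header>.<claims>' for a self-signed JWT (not yet signed).

    `key` is the parsed service account JSON key file.
    """
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT", "kid": key["private_key_id"]}
    claims = {
        "iss": key["client_email"],
        "sub": key["client_email"],
        "aud": BIGQUERY_AUDIENCE,
        "iat": issued_at,
        "exp": issued_at + lifetime_s,  # tokens are limited to one hour
    }

    def encode(obj: dict) -> str:
        return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))

    return f"{encode(header)}.{encode(claims)}"
```

Once signed, the complete token (signing input, a dot, then the base64url-encoded signature) would be sent in the `Authorization: Bearer` header of the REST request.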

BigQuery provides an excellent CLI command, “bq extract”, as well as an API and client libraries in many languages, for exporting table data. The documentation on exporting table data notes that you can use a service such as Dataflow to read data from BigQuery instead of manually exporting it. This post looks at the Dataflow way to extract data out of BigQuery. This is useful in situations where “bq extract” doesn’t meet your requirements and you need a programmatic way to extract the data and transform it into files.

As an example, I have used the Chicago Taxi Trips public dataset and the…

Suds Kumar

Google Cloud enthusiast
