BigQuery

Installing BigQuery clients

To interact with BigQuery from Python, install the google-cloud-bigquery library:

$ pip install google-cloud-bigquery

Access credentials

Access to Google services is provided via a credentials file, which can be downloaded from the IAM page of the Google Cloud Console, or given to you by your Google account administrator.

Typically, these credentials are either per-user or belong to a service account (a non-human account that has only minimal permissions on the project):

  • If you have your own credentials, we suggest storing these in your home directory under /home/faculty. For a description of how your home directory persists across servers, read the Your home directory section of the documentation.
  • If the credentials belong to a service account linked to the project, we suggest storing the credentials in the project workspace so that everyone in the project can access them.

Once you have settled on a location for the credentials, make sure the BigQuery client knows how to find them by setting the GOOGLE_APPLICATION_CREDENTIALS environment variable. Run the following commands, replacing /path/to/credentials.json with the absolute path to the credentials:

echo "export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json" > /etc/faculty_environment.d/gcloud-credentials.sh
sudo sv restart jupyter  # Restart Jupyter to make sure it has access to credentials.

To make setting up servers more reproducible, we recommend adding these commands to the scripts section of a custom environment. For further information on setting environment variables in Faculty, refer to the Environment variables section.
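Before creating a client, it can help to confirm that the variable is visible to your Python process and points at a readable file. The following is a minimal sketch using only the standard library; the function name describe_credentials is hypothetical, and it assumes the credentials file is JSON with the "type" and "project_id" fields found in service account key files (other credential types may omit them):

```python
import json
import os


def describe_credentials(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return the credential type and project recorded in the credentials file.

    Raises RuntimeError if the variable is unset or the file is missing.
    """
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"{env_var} points to a missing file: {path}")
    with open(path) as f:
        info = json.load(f)
    # Assumption: service account key files carry "type" and "project_id";
    # missing fields are returned as None rather than treated as errors.
    return {"type": info.get("type"), "project_id": info.get("project_id")}
```

If this raises in a notebook but works in a terminal, the Jupyter process was probably started before the variable was set and needs restarting.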

Accessing BigQuery from Python

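from google.cloud import bigquery

# With no arguments, the client looks up the credentials file named by
# GOOGLE_APPLICATION_CREDENTIALS.
client = bigquery.Client()
query = "SELECT * FROM `bigquery-public-data.london_bicycles.cycle_hire` LIMIT 10"
# Run the query, wait for it to finish and load the rows into a dataframe.
df = client.query(query).to_dataframe()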

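This returns the result of the query as a pandas DataFrame. Note that to_dataframe() needs pandas to be installed in the environment; installing the library as google-cloud-bigquery[pandas] pulls in the dataframe dependencies alongside the client.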
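If you run several queries, the query pattern above can be wrapped in a small helper. This is only a sketch: the name query_to_dataframe and the optional client argument are hypothetical conveniences, not part of the google-cloud-bigquery API. Passing the client in makes it possible to reuse one client across calls, or to substitute a stand-in object when testing without network access.

```python
def query_to_dataframe(sql, client=None):
    """Run `sql` and return whatever the client's to_dataframe() produces."""
    if client is None:
        # Imported lazily so the helper can be exercised with a stand-in
        # client on machines where google-cloud-bigquery is not installed.
        from google.cloud import bigquery

        # Relies on GOOGLE_APPLICATION_CREDENTIALS being set.
        client = bigquery.Client()
    return client.query(sql).to_dataframe()
```

For example, query_to_dataframe(query) with the query string defined earlier would return the same dataframe as the original snippet.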

Refer to the BigQuery documentation for other examples.