Skip to main content

Setting Up your first Data Pipeline

This guide assumes you already have Sheets installed on your site. It only covers the setup instructions to help you setup your first Sync Pipeline.

Broad steps to follow:

Setup Service Account & SpreadSheet Settings

Sheets uses a Service Account to access spreadsheets via Google Sheets API.

This section outlines the setup that you would have to do in your Google Developers Console. The final outcome would be a new Service Account that you would create and generate a JSON key that Sheets will use to identify itself.

Enable API Access for Google Project

  1. Head to Google Developers Console and create a new project (or select the one you already have).
  2. In the box labeled "Search for APIs and Services", search for "Google Drive API" and enable it.
  3. In the box labeled "Search for APIs and Services", search for "Google Sheets API" and enable it.

Setup Service Account

A service account is a special type of Google account intended to represent a non-human user that needs to authenticate and be authorized to access data in Google APIs [sic].

Since it's a separate account (Email), by default it does not have access to any spreadsheet until you share it with this account. Just like any other Google account.

Here's how to get one:

  1. Enable API Access for your project if you haven't done it yet.

  2. Go to "Credentials" and choose "Create credentials > Service account key".

  3. Fill out the form, click "Create" and "Done".

  4. Click on "Manage service accounts" for the Service Accounts section.

  5. Click on (Actions) for the respective Email (newly created service account) and select "Manage keys" and then click on "ADD KEY > Create new key".

  6. Select JSON key type and press "Create".

    You will automatically download a JSON file with credentials. It may look like this:

    {
    "type": "service_account",
    "project_id": "frappe-sheets-XXX",
    "private_key_id": "2xn … ba4",
    "private_key": "-----BEGIN PRIVATE KEY-----\nNrDyLw … jINQh/9\n-----END PRIVATE KEY-----\n",
    "client_email": "self-hosted-worker@frappe-sheets-XXX.iam.gserviceaccount.com",
    "client_id": "473 … hd.apps.googleusercontent.com",
    ...
    }
  7. As Administrator user, upload this file to the "Credentials" field under the "Service Account" section of the "SpreadSheet Settings" DocType.

Setup SpreadSheet Pipeline

Base Settings

  • Google Sheets ought to be shared with the Service Account in order for Sheets to have access.
  • Saving the record will populate the worksheets table. This is where we can map one worksheet to ingest data into a particular DocType.
  • Upon expanding a respective worksheet row, you can update settings for "Import Type", "Skip Failures" and more.

Setup Sync Settings

  • After successfully saving a new SpreadSheet record, you can manually trigger an import by pressing "Trigger Import".
  • Setting the "Import Frequency" field will setup an automated sync job. You may set a preset frequency or a custom cron pattern too.