Salesforce

6 Steps to Integrating Google BigQuery with Salesforce Einstein Analytics

Performing analytics on data warehouse data is a core use case for many of our clients who purchase Salesforce Einstein Analytics, so it’s no wonder that Salesforce decided to introduce a native connector to Google’s BigQuery as part of Einstein Analytics. Last time, when we went through surfacing NetSuite data in Salesforce using Einstein Analytics, I explained that even though the Salesforce docs do a great job of explaining the Einstein side of the connection, there is almost no information on configuring the NetSuite side. So, we brought in one of our resident NetSuite experts, Rob MacEwen, to explain it.

If you check out the Salesforce help page on using the Google BigQuery connector, it has even less info than the NetSuite page! Here’s what it says:

“The service account that you use to connect to Google Big Query must have the Bigquery.Createtable permission. Account creation generates a JSON file of account properties. Most of the necessary connection settings are in this file.”

That’s it! Lucky for you, I (who just happen to be a Google-certified data engineer) am here to help, starting with signing up for a free Google BigQuery trial account. So, let’s get started!

 

Step 1: Sign Up for a BigQuery Trial Account

Sign up for a BigQuery trial account here: https://cloud.google.com/bigquery (you’ll get $300 of free credits). Once you have your free trial, go to the Google Cloud Platform Console here: https://console.cloud.google.com/. Google will automatically create a project for you called “My First Project”. We’ll use that one for now (but feel free to create a new one). Google will automatically give your project a cool ID like “weighty-country-286414”.

 

Step 2: Load Data into BigQuery

Click on the menu icon and, under the Big Data section, you will find an option for BigQuery. Select it, and the BigQuery Console will open. The BigQuery Console is used mostly for querying data, but also for creating datasets and manually loading data. We can’t query anything if we don’t have data, so let’s create a dataset and load it. In the screenshot below, we can see our project. Select it, then click “Create Dataset”.

For our example, we are going to use the world-famous Iris dataset (download it here). So, let’s call our dataset Iris. Unlike in Einstein, a BigQuery dataset isn’t a “table”; it’s more like a database that contains tables. So now that we have a dataset, let’s create a table, also called Iris. Select the dataset, then click “Create Table”, as shown below:

Fill in the form so that it matches the image below, and click “Create Table”:

That’s it! Our table is created, and we can run queries against it!

Note that BigQuery also allows you to create views and supports SQL (it has its own variant; the documentation can be found here: https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators). I would strongly suggest building views in BigQuery that join your tables together and do whatever data transformations and filtering you need. I will also note that BigQuery has an array data type that’s not really compatible with the Einstein Analytics multi-value picklist data type, but you can use the ARRAY_TO_STRING function (in a view) to convert the BigQuery arrays into semicolon-delimited strings, making them compatible. For now, to keep things simple, we are just going to connect directly to our table.
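To make that transformation concrete, here is a minimal sketch of what ARRAY_TO_STRING produces, reproduced in Python. The view SQL in the comment and the species_tags column are hypothetical, purely for illustration:

```python
# In a BigQuery view you might flatten an array column like this
# (hypothetical project/column names):
#
#   SELECT sepal_length,
#          ARRAY_TO_STRING(species_tags, ';') AS species_tags
#   FROM `weighty-country-286414.Iris.Iris`
#
# The per-row equivalent of that function in Python:

def array_to_string(values, delimiter=";"):
    """Join an array of strings into one delimited string, which is the
    shape Einstein Analytics expects for a multi-value picklist."""
    return delimiter.join(values)

row = {"sepal_length": 5.1, "species_tags": ["setosa", "perennial"]}
row["species_tags"] = array_to_string(row["species_tags"])
print(row["species_tags"])  # setosa;perennial
```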

 

Step 3: Create a Service Account

To connect Einstein Analytics to BigQuery, we need to create a service account. From the Google Cloud Console menu, select “IAM & Admin”, then “Service Accounts”, and click “Create Service Account”. (Confirm that you are in the correct project when you do this.)

Follow the wizard: give your service account a name, select “Editor” for the role, and finally give yourself access to the service account by entering your email address. Click “Done”.

 

Step 4: Create a Key

Now that you have a service account, click on the ellipsis and then “Create Key”, like so:

Select “JSON” and then click “Create”. Save the JSON file; we will need it later.
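As a quick sanity check, it’s worth opening the key file to confirm it contains the values the connection form will ask for later (Private Key, Client Email, Project ID). Here’s a sketch; the field names are the standard Google service-account key fields, and the sample values are fake placeholders:

```python
import json

# A service-account key file is plain JSON. The sample below uses fake
# placeholder values; a real file also contains several other fields.
key_json = """
{
  "type": "service_account",
  "project_id": "weighty-country-286414",
  "private_key": "-----BEGIN PRIVATE KEY-----\\nFAKEKEYBODY\\n-----END PRIVATE KEY-----\\n",
  "client_email": "einstein-connector@weighty-country-286414.iam.gserviceaccount.com"
}
"""

key = json.loads(key_json)

# These three fields feed the Einstein Analytics connection form in Step 6.
for field in ("project_id", "private_key", "client_email"):
    assert field in key, f"key file is missing {field}"

print(key["client_email"])
```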

 

Step 5: Create a Bucket

When connecting Einstein Analytics to BigQuery, because of the potential for large data volumes, Einstein Analytics wants to first export the data from BigQuery to CSV, and then load the CSV into Einstein Analytics. That means that Einstein Analytics needs a place to store the CSV files, and that place is a Google Cloud Storage bucket. So, we need to create a bucket and give the service account access to it.

From the Google Cloud Console menu, select “Storage”, then click “Create Bucket”. This launches the new bucket wizard. Give it a name, select a region, select “Standard” for the storage class, and select “Uniform” for access control. Then click “Create”.

Because our Service account is an editor on the project, we should not need to do any additional configuration for bucket security.

That’s it! We can now log into the Einstein Analytics Data Studio and connect to our BigQuery table.

 

Step 6: Connect Our Data

The official Salesforce docs do a pretty good job of explaining how to configure the BigQuery connector in Einstein Analytics, but let’s review it anyhow. From the Data Manager, click “Connect to Data”.

This starts the connection wizard:

1) Click Add Connection

2) Select “Google BigQuery Standard Connection”

3) Fill in the connection details

– Connection Name

– Developer Name

– Description

– Schema: This is the name of the Google Dataset, so in our case “Iris”

– Private Key: This comes from the JSON file (see the image below)

NOTE: Don’t include “private_key”: “-----BEGIN PRIVATE KEY-----\n at the start of the key, or \n-----END PRIVATE KEY-----\n” at the end. Be especially careful with the first \n; it is so easy to think that little n is part of the key!

– Client Email: Again, take this from the JSON file. This is the service account email generated by Google, NOT your email.

– Storage Path: This is the name of the bucket we created.

– Project ID: This is that cool ID Google created for us; it’s also in the JSON file.

4) Click “Save & Test”
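Since it is so easy to copy too much of the key, here is a small sketch of exactly what the note above asks you to remove: the BEGIN/END markers and their adjacent \n escapes, leaving everything in between untouched. The helper function and the fake key material are purely illustrative:

```python
# The private_key value in the JSON file looks like (fake key body):
#   "-----BEGIN PRIVATE KEY-----\nFAKE\nKEYBODY\n-----END PRIVATE KEY-----\n"
# The connection form wants only what sits between the markers.

HEADER = "-----BEGIN PRIVATE KEY-----\\n"
FOOTER = "\\n-----END PRIVATE KEY-----\\n"

def strip_pem_markers(private_key: str) -> str:
    """Remove the PEM header/footer (and their adjacent \\n escapes),
    keeping the key body exactly as it appears in the file."""
    body = private_key
    if body.startswith(HEADER):
        body = body[len(HEADER):]
    if body.endswith(FOOTER):
        body = body[:-len(FOOTER)]
    return body

raw = "-----BEGIN PRIVATE KEY-----\\nFAKE\\nKEYBODY\\n-----END PRIVATE KEY-----\\n"
print(strip_pem_markers(raw))  # FAKE\nKEYBODY
```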

Great! We now have our connection to BigQuery configured, and we are ready to add all our objects as EdgeMarts. The wizard will automatically start the setup for a new object, but we can always add objects later (if we have more); just come back, click “Connect to Data” again, and select our saved connection.

The wizard will ask us to select the object and then the fields, as shown below:

And that’s basically it! We now have an EdgeMart for the Iris object! (Note: we do need to run the EdgeMart replication sync before we can use it. The sync can take a bit of time to run, so be patient.)

Next, we bring the EdgeMart into a dataflow, perform all our transformations, and, of course, join it to our Salesforce data. Then we create our dataset. For simplicity’s sake, I’m just going to push the data as-is to a dataset, like so:

Now that we have access to all our BigQuery data in Einstein Analytics, we can go crazy building all our dashboards! And yes, even embed them into our relevant pages. Boom!

Contact us today if you’re curious about how your organization can implement Google BigQuery and Salesforce to surface your pertinent business data!

Enjoyed this article? Visit Dave’s blog at: http://SalesforceDataBlog.com or subscribe by Email or RSS. Follow him on LinkedIn and/or Twitter.
