Datastore Connectivity for BigQuery (bgc)

Datastore Connectivity library for BigQuery in Go.

This library is compatible with Go 1.5+

Please refer to CHANGELOG.md if you encounter breaking changes.

This library uses standard SQL mode and the streaming API to insert data by default. To use legacy SQL, add the /* USE LEGACY SQL */ hint to your query; in that case you will not be able to fetch repeated and nested fields.

Configuration parameters

insertMethod

To control the insert method, provide config.parameters with the following value:

_table_name_.insertMethod = "load"

Note that when streaming is used, UPDATE and DELETE statements are currently not supported.
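In a config.yaml this could look as follows, assuming the parameters map holds plain key/value entries as in the config.yaml examples below (myTable is a placeholder table name):

```yaml
driverName: bigquery
credentials: bq
parameters:
  datasetId: myDataset
  myTable.insertMethod: "load"
```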

insertIdColumn

For streaming, you can specify which column to use as insertId with the following config.parameters:

_table_name_.insertMethod = "stream"
_table_name_.insertIdColumn = "sessionId"
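Put together in a config.yaml, this might look like the following sketch (travelers and sessionId are placeholder names):

```yaml
driverName: bigquery
credentials: bq
parameters:
  datasetId: myDataset
  travelers.insertMethod: "stream"
  travelers.insertIdColumn: "sessionId"
```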

streamBatchCount

streamBatchCount controls the number of rows per streaming batch (default: 9999).

insertWaitTimeoutInMs

After inserting data, this library waits up to 60 seconds to verify that the data has been added. To control this behaviour, set insertWaitTimeoutInMs (default: 60000, i.e. 60 sec).

To disable this mechanism set: insertWaitTimeoutInMs: -1

insertMaxRetires

Controls how many times an insert is retried when BigQuery responds with an HTTP 503 error.

datasetId

Default dataset

pageSize

The maximum number of rows of data to return per page of results (default: 500). In addition to this limit, responses are also limited to 10 MB.
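The tuning parameters above are ordinary config.parameters entries; a sketch combining them (values are illustrative only):

```yaml
driverName: bigquery
credentials: bq
parameters:
  datasetId: myDataset
  pageSize: 1000
  streamBatchCount: 5000
  insertWaitTimeoutInMs: 30000
```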

Credentials

  1. Google secrets for service account

a) set the GOOGLE_APPLICATION_CREDENTIALS environment variable

b) credentials can be the name (without the .json extension) of a JSON secret file placed in the ~/.secret/ folder

config.yaml

driverName: bigquery
credentials: bq # place your BigQuery secret JSON in ~/.secret/bq.json
parameters:
  datasetId: myDataset

c) full URL to secret file

config.yaml

driverName: bigquery
credentials: file://tmp/secret/mySecret.json
parameters:
  datasetId: myDataset

The secret file has to specify the following attributes:

type Config struct {
	//google cloud credential
	ClientEmail  string `json:"client_email,omitempty"`
	TokenURL     string `json:"token_uri,omitempty"`
	PrivateKey   string `json:"private_key,omitempty"`
	PrivateKeyID string `json:"private_key_id,omitempty"`
	ProjectID    string `json:"project_id,omitempty"`
}
  2. Private key (pem)

config.yaml

driverName: bigquery
credentials: bq # place your BigQuery secret JSON in ~/.secret/bq.json
parameters:
  serviceAccountId: "***@developer.gserviceaccount.com"
  datasetId: MyDataset
  projectId: spheric-arcadia-98015
  privateKeyPath: /tmp/secret/bq.pem

Usage:

The following is a very simple example of reading and inserting data:

package main

import (
    "fmt"
    "log"
    "time"

    "github.com/viant/bgc"
    "github.com/viant/dsc"
)

type MostLikedCity struct {
	City      string
	Visits    int
	Souvenirs []string
}

type Traveler struct {
	Id            int
	Name          string
	LastVisitTime time.Time
	Achievements  []string
	MostLikedCity MostLikedCity
	VisitedCities []struct {
		City   string
		Visits int
	}
}

func main() {

    config, err := dsc.NewConfigWithParameters("bigquery", "",
        "bq", // google cloud secret placed in ~/.secret/bq.json
        map[string]string{
            "datasetId": "MyDataset",
        })
    if err != nil {
        log.Fatal(err)
    }

    factory := dsc.NewManagerFactory()
    manager, err := factory.Create(config)
    if err != nil {
        log.Fatalf("failed to create manager: %v", err)
    }

    traveler := Traveler{}
    success, err := manager.ReadSingle(&traveler, "SELECT id, name, lastVisitTime, visitedCities, achievements, mostLikedCity FROM travelers WHERE id = ?", []interface{}{4}, nil)
    if err != nil {
        log.Fatal(err)
    }
    if !success {
        log.Print("traveler not found")
    }

    travelers := make([]Traveler, 0)
    err = manager.ReadAll(&travelers, "SELECT id, name, lastVisitTime, visitedCities, achievements, mostLikedCity FROM travelers", nil, nil)
    if err != nil {
        log.Fatal(err)
    }

    // ...

    inserted, updated, err := manager.PersistAll(&travelers, "travelers", nil)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("inserted: %v, updated: %v\n", inserted, updated)

    // Custom reading handler, passing a query result info value to get CacheHit, TotalRows, TotalBytesProcessed
    var resultInfo = &bgc.QueryResultInfo{}
    var perf = make(map[string]int)
    err = manager.ReadAllWithHandler(`SELECT DATE(date), COUNT(*) FROM performance_agg WHERE DATE(date) = ? GROUP BY 1`, []interface{}{
        "2018-05-03",
        resultInfo,
    }, func(scanner dsc.Scanner) (toContinue bool, err error) {
        var date string
        var count int
        err = scanner.Scan(&date, &count)
        if err != nil {
            return false, err
        }
        perf[date] = count
        return true, nil
    })
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("cache: %v, rows: %v, bytes: %v", resultInfo.CacheHit, resultInfo.TotalRows, resultInfo.TotalBytesProcessed)

    dialect := dsc.GetDatastoreDialect(config.DriverName)
    DDL, err := dialect.ShowCreateTable(manager, "performance_agg")
    fmt.Printf("%v %v\n", DDL, err)
}


License

The source code is made available under the terms of the Apache License, Version 2, as stated in the file LICENSE.

Individual files may be made available under their own specific license, all compatible with Apache License, Version 2. Please see individual files for details.

Credits and Acknowledgements

Library Author: Adrian Witas

Contributors: Mikhail Berlyant

