BigQuery
Quick start
$ localgcp up --services=bigquery
The first connection pulls ghcr.io/slokam-ai/localbq and starts the container (~3s when the image is cached).
Connect with bq CLI
One env var, no code changes. Works with your existing bq installation:
$ export CLOUDSDK_API_ENDPOINT_OVERRIDES_BIGQUERY=http://localhost:9060/
$ bq --project_id=my-project query --use_legacy_sql=false 'SELECT 1 + 1 AS result'
+--------+
| result |
+--------+
|      2 |
+--------+
Connect with Python
from google.cloud import bigquery
from google.auth.credentials import AnonymousCredentials

client = bigquery.Client(
    project="my-project",
    credentials=AnonymousCredentials(),
    client_options={"api_endpoint": "http://localhost:9060"},
)

rows = client.query("SELECT 1 + 1 AS result").result()
for row in rows:
    print(row.result)  # 2
Connect with Go
package main

import (
	"context"
	"fmt"
	"log"

	"cloud.google.com/go/bigquery"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

func main() {
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, "my-project",
		option.WithEndpoint("http://localhost:9060"),
		option.WithoutAuthentication(),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	q := client.Query("SELECT 1 + 1 AS result")
	it, err := q.Read(ctx)
	if err != nil {
		log.Fatal(err)
	}
	for {
		var row []bigquery.Value
		err := it.Next(&row)
		if err == iterator.Done {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(row) // [2]
	}
}
Load data
Load files directly with the CLI:
$ localbq load mydata.users users.parquet
Or via SQL:
LOAD DATA INTO mydata.users FROM '/path/to/users.parquet'
Supported formats: Parquet, CSV, TSV, JSON, NDJSON.
If the table doesn't exist, it is created from the file schema. If it already exists, data is appended.
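As a quick way to produce a loadable file, the sketch below writes a small NDJSON file using only Python's standard library (the `users.ndjson` filename and the field names are illustrative, not part of LocalBQ):

```python
import json

# Illustrative sample rows; any flat JSON objects work for NDJSON.
rows = [
    {"id": 1, "name": "ada"},
    {"id": 2, "name": "grace"},
]

# NDJSON is one JSON object per line.
with open("users.ndjson", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Then load it, creating mydata.users from the inferred schema:
#   $ localbq load mydata.users users.ndjson
```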
Data directory
- Standalone: ~/.localbq/localbq.duckdb
- Via localgcp: {data-dir}/localgcp-bigquery
- Override with --data-dir for standalone mode
- Each data directory is an independent environment
- Data persists across restarts
Switching to production
For bq CLI, unset the endpoint override:
$ unset CLOUDSDK_API_ENDPOINT_OVERRIDES_BIGQUERY
For Python, use an env var check to switch endpoints:
import os

from google.cloud import bigquery
from google.auth.credentials import AnonymousCredentials

endpoint = os.environ.get("LOCALBQ_ENDPOINT")
if endpoint:
    client = bigquery.Client(
        project="my-project",
        credentials=AnonymousCredentials(),
        client_options={"api_endpoint": endpoint},
    )
else:
    client = bigquery.Client(project="my-project")
One-line switch, no config files.
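The same check can be factored into a small helper. `client_kwargs` below is a hypothetical name, and it only builds the keyword arguments; locally you would also pass `credentials=AnonymousCredentials()`, omitted here so the sketch runs without google-auth installed:

```python
import os

def client_kwargs(project: str) -> dict:
    """Keyword arguments for bigquery.Client, honoring LOCALBQ_ENDPOINT.

    In local mode, also pass credentials=AnonymousCredentials()
    when constructing the client.
    """
    endpoint = os.environ.get("LOCALBQ_ENDPOINT")
    if endpoint:
        # Local: point the client at the LocalBQ endpoint.
        return {
            "project": project,
            "client_options": {"api_endpoint": endpoint},
        }
    # Production: default endpoint and application default credentials.
    return {"project": project}
```

Usage: `client = bigquery.Client(**client_kwargs("my-project"))`, with the anonymous credentials added in local mode.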
GoogleSQL compatibility
LocalBQ includes a text-based lowering pass that translates GoogleSQL to DuckDB SQL. Supported patterns:
- TIMESTAMP_SUB, TIMESTAMP_ADD, TIMESTAMP_TRUNC
- DATE, DATE_SUB, DATE_ADD, DATE_TRUNC, DATE_DIFF
- COUNTIF
- SAFE_CAST
- FLOAT64, BOOL type aliases
- Backtick-quoted identifiers (`project.dataset.table`)
- INFORMATION_SCHEMA regional prefix stripping
- LOAD DATA
Standard SQL -- joins, CTEs, window functions, aggregations -- works natively through DuckDB.
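To make the idea of a text-based lowering pass concrete, here is a toy sketch, not LocalBQ's actual implementation, that rewrites two of the listed patterns into their DuckDB equivalents: SAFE_CAST becomes DuckDB's TRY_CAST, and backtick-quoted dotted identifiers become double-quoted segments:

```python
import re

def lower_to_duckdb(sql: str) -> str:
    """Toy GoogleSQL -> DuckDB rewrites; illustrative only."""
    # SAFE_CAST(x AS T) returns NULL on failure, like DuckDB's TRY_CAST.
    sql = re.sub(r"\bSAFE_CAST\b", "TRY_CAST", sql, flags=re.IGNORECASE)
    # `project.dataset.table` -> "project"."dataset"."table"
    sql = re.sub(
        r"`([^`]+)`",
        lambda m: ".".join(f'"{part}"' for part in m.group(1).split(".")),
        sql,
    )
    return sql

print(lower_to_duckdb("SELECT SAFE_CAST(x AS INT64) FROM `p.d.t`"))
# SELECT TRY_CAST(x AS INT64) FROM "p"."d"."t"
```

A real pass has to handle more cases (string literals containing backticks, function-specific argument rewrites), but the shape is the same: pattern-directed text rewrites ahead of DuckDB parsing.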
Standalone mode
BigQuery is also available as a standalone binary without LocalGCP.
Docker
$ docker run -p 9060:9060 ghcr.io/slokam-ai/localbq:latest
From source
$ go install github.com/slokam-ai/localbq/cmd/localbq@latest
Usage
$ localbq up        # start in foreground
$ localbq up -d     # start in background
$ localbq stop      # stop background server
$ localbq status    # check if running
Source and issues: github.com/slokam-ai/localbq
Not yet supported
- MERGE with WHEN NOT MATCHED BY SOURCE
- QUALIFY clause
- PIVOT / UNPIVOT
- SELECT AS STRUCT, SELECT AS VALUE
- Complex UNNEST with explicit joins
- Scripting (DECLARE, stored procedures, JS UDFs)
- GEOGRAPHY type (WGS84 geodesic)
- BIGNUMERIC beyond 38 digits
- Storage Read/Write API (gRPC)
- BigQuery ML / BI Engine