使用使用者帳戶憑據從 BigQuery 讀取資料

In [1]: import pandas as pd

要在 BigQuery 中執行查詢,你需要擁有自己的 BigQuery 專案。我們可以請求一些公共樣本資料:

In [2]: data = pd.read_gbq('''SELECT title, id, num_characters
   ...:                       FROM [publicdata:samples.wikipedia]
   ...:                       LIMIT 5'''
   ...:                    , project_id='<your-project-id>')

這將列印出來:

Your browser has been opened to visit:

    https://accounts.google.com/o/oauth2/v2/auth...[looong url cutted]

If your browser is on a different machine then exit and re-run this
application with the command-line parameter

  --noauth_local_webserver

如果你使用的是本地計算機,則會彈出瀏覽器。授予許可權後,pandas 將繼續輸出:

Authentication successful.
Requesting query... ok.
Query running...
Query done.
Processed: 13.8 Gb

Retrieving results...
Got 5 rows.

Total time taken 1.5 s.
Finished at 2016-08-23 11:26:03.

結果:

In [3]: data
Out[3]: 
               title       id  num_characters
0       Fusidic acid   935328            1112
1     Clark Air Base   426241            8257
2  Watergate scandal    52382           25790
3               2005    35984           75813
4               .BLP  2664340            1659

作為副作用,pandas 將建立 json 檔案 bigquery_credentials.dat,這將允許你執行更多查詢,而無需再授予許可權:

In [9]: pd.read_gbq('SELECT count(1) cnt FROM [publicdata:samples.wikipedia]'
                   , project_id='<your-project-id>')
Requesting query... ok.
[rest of output cutted]

Out[9]: 
         cnt
0  313797035