Releases: googleapis/python-bigquery-pandas
Version 0.12.0
New features
- Add `max_results` argument to `pandas_gbq.read_gbq()`. Use this argument to limit the number of rows in the results DataFrame. Set `max_results` to 0 to ignore query outputs, such as for DML or DDL queries. (#102)
- Add `progress_bar_type` argument to `pandas_gbq.read_gbq()`. Use this argument to display a progress bar when downloading data. (#182)
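The two new arguments can be sketched as follows. This is a hedged usage example, assuming pandas-gbq 0.12.0+ is installed and Google Cloud credentials are configured; the queries and project id are placeholders, so the calls are wrapped in functions rather than run directly:

```python
def fetch_preview(sql, project):
    """Download at most 10 rows, with a tqdm progress bar (sketch)."""
    import pandas_gbq  # requires pandas-gbq >= 0.12.0
    return pandas_gbq.read_gbq(
        sql, project_id=project, max_results=10, progress_bar_type="tqdm"
    )


def run_ddl(sql, project):
    """Run a DML/DDL statement; max_results=0 skips fetching rows (sketch)."""
    import pandas_gbq
    return pandas_gbq.read_gbq(sql, project_id=project, max_results=0)
```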
Dependency updates
- Update the minimum version of `google-cloud-bigquery` to 1.11.1. (#296)
Documentation
- Add code samples to introduction and refactor how-to guides. (#239)
Bug fixes
- Fix resource leak with `use_bqstorage_api` by closing the BigQuery Storage API client after use. (#294)
Version 0.11.0
- Breaking Change: Python 2 support has been dropped. This is to align
with the pandas package which dropped Python 2 support at the end of 2019.
(#268)
Enhancements
- Ensure the `table_schema` argument is not modified in place. (#278)
Implementation changes
- Use object dtype for `STRING`, `ARRAY`, and `STRUCT` columns when there are zero rows. (#285)
Version 0.10.0
Documentation
- Document BigQuery data type to pandas dtype conversion for `read_gbq`. (#269)
Dependency updates
- Update the minimum version of `google-cloud-bigquery` to 1.9.0. (#247)
- Update the minimum version of `pandas` to 0.19.0. (#262)
Internal changes
- Update the authentication credentials. Note: You may need to set `reauth=True` in order to update your credentials to the most recent version. This is required to use new functionality such as the BigQuery Storage API. (#267)
- Use `to_dataframe()` from `google-cloud-bigquery` in the `read_gbq()` function. (#247)
Enhancements
- Fix a bug where pandas-gbq could not upload an empty DataFrame. (#237)
- Allow `table_schema` in `to_gbq` to contain only a subset of columns, with the rest being populated using the DataFrame dtypes. (#218) (contributed by @JohnPaton)
- Read `project_id` in `to_gbq` from provided credentials if available. (contributed by @daureg)
- `read_gbq` uses the timezone-aware `DatetimeTZDtype(unit='ns', tz='UTC')` dtype for BigQuery `TIMESTAMP` columns. (#269)
- Add `use_bqstorage_api` to `read_gbq`. The BigQuery Storage API can be used to download large query results (>125 MB) more quickly. If the BQ Storage API can't be used, the BigQuery API is used instead. (#133, #270)
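The `TIMESTAMP` dtype change can be illustrated with plain pandas, with no BigQuery access needed; this demonstrates only the dtype itself, not pandas-gbq:

```python
import pandas as pd

# Timezone-aware UTC values get the same dtype that read_gbq now uses for
# BigQuery TIMESTAMP columns: DatetimeTZDtype(unit='ns', tz='UTC').
ts = pd.Series(pd.to_datetime(["2019-01-01T00:00:00Z", "2019-01-02T12:30:00Z"]))
assert ts.dtype == pd.DatetimeTZDtype(unit="ns", tz="UTC")
print(ts.dtype)  # datetime64[ns, UTC]
```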
Version 0.9.0
Version 0.8.0
Breaking changes
- Deprecate the `private_key` parameter to `pandas_gbq.read_gbq` and `pandas_gbq.to_gbq` in favor of the new `credentials` argument. Instead, create a credentials object using `google.oauth2.service_account.Credentials.from_service_account_info` or `google.oauth2.service_account.Credentials.from_service_account_file`. See the authentication how-to guide for examples. (#161, #231)
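A minimal sketch of the migration. `"key.json"` is a hypothetical service-account key path, and the import lives inside the function because it needs the google-auth package installed:

```python
def make_credentials(key_path):
    """Build a Credentials object from a service-account key file (sketch)."""
    from google.oauth2 import service_account  # requires google-auth
    return service_account.Credentials.from_service_account_file(key_path)


# Old (deprecated): pandas_gbq.read_gbq(sql, project_id=..., private_key=key_json)
# New:              pandas_gbq.read_gbq(sql, project_id=...,
#                                       credentials=make_credentials("key.json"))
```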
Enhancements
- Allow newlines in data passed to to_gbq. (#180)
- Add `pandas_gbq.context.dialect` to allow overriding the default SQL syntax dialect. (#195, #235)
- Support Python 3.7. (#197, #232)
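Setting the dialect once on the shared context avoids passing `dialect=...` on every call. A hedged sketch, wrapped in a function because it requires pandas-gbq to be installed:

```python
def prefer_legacy_sql():
    """Make subsequent read_gbq/to_gbq calls use legacy SQL by default (sketch)."""
    import pandas_gbq  # requires pandas-gbq >= 0.8.0
    pandas_gbq.context.dialect = "legacy"
```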
Version 0.7.0
- `int` columns which contain `NULL` are now cast to `float`, rather than `object` type. (#174)
- `DATE`, `DATETIME`, and `TIMESTAMP` columns are now parsed as pandas' `timestamp` objects. (#224)
- Add `pandas_gbq.Context` to cache credentials in memory, across calls to `read_gbq` and `to_gbq`. (#198, #208)
- Fast queries now do not log above `DEBUG` level. (#204) With BigQuery's release of clustering, querying smaller samples of data is now faster and cheaper.
- Don't load credentials from disk if `reauth` is `True`. (#212) This fixes a bug where pandas-gbq could not refresh credentials if the cached credentials were invalid, revoked, or expired, even when `reauth=True`.
- Catch `RefreshError` when trying credentials. (#226)
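The `int`-to-`float` cast can be reproduced with plain pandas, since it falls out of `NaN` (the missing-value marker) being a float:

```python
import pandas as pd

# A column with no missing values keeps an integer dtype...
no_nulls = pd.Series([1, 2, 3])
# ...but NULLs become NaN, which is a float, so the whole column is cast.
with_nulls = pd.Series([1, 2, None])
print(no_nulls.dtype, with_nulls.dtype)  # int64 float64
```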
Version 0.6.1
- Improved `read_gbq` performance and memory consumption by delegating DataFrame construction to the pandas library, radically reducing the number of loops that execute in Python. (#128)
- Reduced verbosity of logging from `read_gbq`, particularly for short queries. (#201)
- Avoid `SELECT 1` query when running `to_gbq`. (#202)
Version 0.6.0
Version 0.5.0
- Project ID parameter is optional in `read_gbq` and `to_gbq` when it can be inferred from the environment. Note: you must still pass in a project ID when using user-based authentication. (#103)
- Progress bar added for `to_gbq`, via the optional `tqdm` dependency. (#162)
- Add `location` parameter to `read_gbq` and `to_gbq` so that pandas-gbq can work with datasets in the Tokyo region. (#177)
Version 0.4.1
- Only show verbose deprecation warning if the pandas version does not populate it. (#157)