Releases: aliyun/aliyun-odps-python-sdk
v0.11.6.1
v0.11.6
Features
- Add support for cluster info and views in tables and table DDL output.
- Add support for easier threaded writing and writing in multiple processes for TableWriter.
Enhancements
- Use monotonic time to calculate timeout.
- Add support for http+unix socket connection.
- Optimize RequestsIO by introducing buffering and simplify threaded sync.
- Revoke embedded requests and use buffered writer for table API by default.
- Add cython converter for legacy decimal.
- Store local configs inside context variables if possible.
- Add support for zoneinfo of Python standard library since Python 3.9.
- Show SQL statement when a ParseError is encountered.
- (experimental) Add support for new V4 signature. The signature is turned off by default.
- (experimental) Add support for accessing MaxCompute with AlibabaCloud credentials.
- (experimental) Upgrade six and setuptools requirements under Python 3.12.
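The "monotonic time" item above refers to a general technique. The following is a minimal sketch of a timeout loop built on `time.monotonic()`, not the SDK's actual code; the helper name `wait_until` and its parameters are assumptions made for illustration.

```python
import time


def wait_until(predicate, timeout, interval=0.01):
    """Poll predicate() until it returns True or `timeout` seconds elapse."""
    # time.monotonic() never goes backwards, so the deadline is unaffected
    # by NTP corrections or manual changes to the system clock.
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

A deadline computed from `time.time()` could fire early or late if the wall clock is adjusted while waiting, which is the usual reason for preferring the monotonic clock here.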
Bugfixes
- Fix TypeError when calling open_resource to create resources.
- Fix superset error when reuse_odps=true.
- Fix errors when decimal digits are not sufficient.
- Fix support for DataFrame UDFs under Python 3.11.
- Fix timezone of arrow tunnel to make it consistent with record tunnels.
- Fix comparison of date sequences in DataFrame.
Tests
- Add check for malicious tunnel requests.
Documentation
- Fix default timezone document.
- Add detailed usage for reading table data under multiprocessing and multiple threads.
- Add detailed info for low-level tunnel interfaces.
Compatibility issues
- Timestamp objects obtained with arrow tunnel now use the local timezone instead of UTC, to keep consistency with record tunnels. Please update your code if you have already done manual timezone conversion.
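If downstream code still needs UTC values after this change, one option is to convert explicitly. The sketch below uses only the standard library and assumes the values arrive as naive wall-clock datetimes in the client's local timezone; the helper name `local_naive_to_utc` and the UTC+8 offset are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def local_naive_to_utc(value, local_tz):
    """Interpret a naive datetime as wall-clock time in local_tz and return it in UTC."""
    if value.tzinfo is not None:
        raise ValueError("expected a naive datetime")
    # Attach the local timezone first, then shift to UTC.
    return value.replace(tzinfo=local_tz).astimezone(timezone.utc)


# Hypothetical example: a client whose local timezone is UTC+8.
utc_plus_8 = timezone(timedelta(hours=8))
converted = local_naive_to_utc(datetime(2024, 1, 1, 8, 0), utc_plus_8)
```

Code that previously shifted UTC values to local time by hand should drop that step; otherwise the offset would be applied twice.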
v0.11.5.post0
Bugfix
- Fix attribute errors for table preview and storage API.
v0.11.5
Features
- Add support for arrow table preview reader
- Enhance support for Apache Superset
- Add support for storage tier on tables and partitions
- (Experimental) Add support for tunnel upsert
- (Experimental) Add image argument for DataFrame
Bugfixes
- Fill partition value for tunnel records
- Use PERCENTILE_APPROX for doubles under ODPS 2.0
- Convert all requirement files to UNIX format for pyodps-pack
- Fix error when reloading volume tunnel session
- Fix logview setting not working in options
- Dump SQL statement when a ParseError is encountered
- Remove misplaced warnings when pickling user functions
- Fix errors of to_pandas for InSessionInstance readers
- Fix position of tablesample clause for sample
- Fix compatibility for SQLAlchemy 2.0
- Fix results of value_counts when values are None
- Remove empty equal mark for url actions
- Stop copying and caching for DataFrame(pd).persist if possible to reduce memory usage
- Fix missing quotaName in full lifecycle of tunnel requests
- Fix starting of Mars notebook and Mars import in some case
- Delete deflate Content-Encoding header for halo in storage api
Enhancements
- Support scanning dependencies for pkg_resources
- Add PEP517 args for pyodps-pack
- Persist pandas dataframes in batches
- Use date in response headers to replace fields in Schemas
- Add detailed logs for sign server on errors
- Make option context as thread locals
- Adapt to extended types for ODPS arrow format
- Support schema API along with SQL implementations
- Add support for MaxFieldSize passed by server end
- Add options to allow keeping resources for DataFrame
- Add support for timestamp_ntz type
- Refine error message for malfunctioning create instance response
- Allow adding custom log handlers to support displaying logs in notebook kernels
- Allow using run_sql to execute merge smallfiles or compact commands
- Allow specifying transactional table property
- Unify verbose_log into standard Python logging and dump progress when waiting for instances
- Return struct values as namedtuples by default and fix DataFrame customized functions on complex types
- Add retry for BufferedRecordWriter when writing blocks
- Reuse task utilities to simplify MCQA submission
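As an illustration of the "option context as thread locals" item above, the sketch below shows how per-thread option values behave with `threading.local`. The class `ThreadLocalOptions` and its `verbose` attribute are hypothetical and are not the SDK's option object.

```python
import threading


class ThreadLocalOptions(threading.local):
    """Hypothetical option holder: each thread sees its own attribute values."""

    def __init__(self):
        # threading.local runs __init__ once per thread on first access,
        # so every thread starts from the same defaults.
        self.verbose = False


options = ThreadLocalOptions()
seen_in_worker = []


def worker():
    options.verbose = True  # only changes this thread's copy
    seen_in_worker.append(options.verbose)


t = threading.Thread(target=worker)
t.start()
t.join()
# The main thread still sees the default: options.verbose is False here.
```

The practical consequence is that options changed inside one thread do not leak into other threads, so settings meant to apply everywhere need to be applied in each thread.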
Documentation
- Fix pyodps-pack doc on docker requirements
- Add doc for timezone setting
- Make bare tunnel docs more explicit
- Refine documents of instance tunnel limit
Compatibility issues
- PyODPS now returns struct values as namedtuples for tunnels to keep consistency with UDFs. For most of the cases your code might still work. If it doesn't, try configuring options.struct_as_dict = True.
- From v0.11.5, the nullable property of columns is added for transactional tables, and the default value for partition columns is False. If you use these column instances in some scenario, for instance, using them as common columns to create tables, non-nullable columns could be created and insertion of null values will result in errors. To ignore nullable flags in columns, try configuring sql.ignore_fields_not_null = True.
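For code that has to handle struct values in either shape, a small adapter can hide the difference. This is a minimal sketch that assumes the namedtuple-style values expose `_asdict()` the way `collections.namedtuple` instances do; the helper `struct_to_dict` and the `Person` type are hypothetical.

```python
from collections import namedtuple


def struct_to_dict(value):
    """Return a plain dict for a struct value delivered as a namedtuple or a dict."""
    if hasattr(value, "_asdict"):  # namedtuple-style struct
        return dict(value._asdict())
    return dict(value)  # dict-style struct


# Hypothetical struct<name:string, age:bigint> value in both shapes.
Person = namedtuple("Person", ["name", "age"])
as_tuple = Person(name="alice", age=30)
as_dict = {"name": "alice", "age": 30}
```

Key-based access such as `value["name"]` is the pattern most likely to break, because namedtuples support attribute and positional access but not string keys.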
v0.11.5b2
v0.11.5b1
Features
- Add support for arrow table preview reader
- Enhance support for Apache Superset
- Add support for storage tier on tables and partitions
- (Experimental) Add support for tunnel upsert
Bugfixes
- Fill partition value for tunnel records
- Use PERCENTILE_APPROX for doubles under ODPS 2.0
- Convert all requirement files to UNIX format for pyodps-pack
- Fix error when reloading volume tunnel session
- Fix logview setting not working in options
- Dump SQL statement when a ParseError is encountered
- Remove misplaced warnings when pickling user functions
- Fix errors of to_pandas for InSessionInstance readers
- Fix position of tablesample clause for sample
- Fix compatibility for SQLAlchemy 2.0
- Fix results of value_counts when values are None
- Remove empty equal mark for url actions
Enhancements
- Support scanning dependencies for pkg_resources
- Add PEP517 args for pyodps-pack
- Persist pandas dataframes in batches
- Use date in response headers to replace fields in Schemas
- Add detailed logs for sign server on errors
- Make option context as thread locals
- Adapt to extended types for ODPS arrow format
- Support schema API along with SQL implementations
- Add support for MaxFieldSize passed by server end
- Add options to allow keeping resources for DataFrame
- Add support for timestamp_ntz type
- Refine error message for malfunctioning create instance response
- Allow adding custom log handlers to support displaying logs in notebook kernels
- Allow using run_sql to execute merge smallfiles or compact commands
- Allow specifying transactional table property
- Unify verbose_log into standard Python logging and dump progress when waiting for instances
- Return struct values as namedtuples by default and fix DataFrame customized functions on complex types
Documentation
- Fix pyodps-pack doc on docker requirements
- Add doc for timezone setting
- Make bare tunnel docs more explicit
- Refine documents of instance tunnel limit
Compatibility issues
- PyODPS now returns struct values as namedtuples for tunnels to keep consistency with UDFs. For most of the cases your code might still work. If it doesn't, try configuring options.struct_as_dict = True.
v0.11.4.1
Enhancements
- Reuse UDFs when code is same and without closures
- Add function to show versions of dependencies
- Make stream tunnel write in blocks
- Add quota_name params for various tunnel sessions
- Refine MCQA execution API and fallback behavior
- Support JSON column type
- Use TABLESAMPLE clause to implement sampling with frac or rows
- Allow packing dynamic libraries with pyodps-pack
- Auto resolve source dependencies in no docker mode in pyodps-pack
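The "reuse UDFs when code is same and without closures" item above can be pictured with plain Python introspection. The sketch below is an illustration only and makes no claim about how PyODPS decides reuse; the helper `udf_reuse_key` is hypothetical.

```python
import hashlib


def udf_reuse_key(func):
    """Return a digest identifying func's code, or None if it captures closure variables."""
    if func.__closure__:
        # Captured variables may differ between functions that share the
        # same bytecode, so equal code is not enough to treat them as equal.
        return None
    code = func.__code__
    payload = repr((code.co_code, code.co_consts, code.co_names)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_one(x):
    return x + 1


def add_one_again(x):
    return x + 1


def make_adder(n):
    def adder(x):
        return x + n  # captures n, so adder has a closure
    return adder
```

Here `add_one` and `add_one_again` produce the same key, while the function returned by `make_adder` yields `None` because it captures `n`.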
Bug fixes
- Fix jump targets when jump instruction size changes
- Fix auto-flush for arrow writers
v0.11.4.post0
Deployment
- Restrict urllib3 version to 1.x.
v0.11.4
Features
- Add API-by-API implementation for storage API
- Add retry for table read API
- Add automatic submission for table write API
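The retry behaviour mentioned above follows a common pattern. This is a minimal sketch of retrying a read on transient errors, assuming a linear back-off; the names `call_with_retry` and `flaky_read` are hypothetical and this is not the SDK's implementation.

```python
import time


def call_with_retry(func, retries=3, delay=0.01, retry_on=(IOError,)):
    """Call func(), retrying up to `retries` extra times on the given exceptions."""
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            if attempt >= retries:
                raise  # out of retries: surface the last error
            attempt += 1
            time.sleep(delay * attempt)  # linear back-off between attempts


calls = []


def flaky_read():
    # Hypothetical reader that fails twice before succeeding.
    calls.append(1)
    if len(calls) < 3:
        raise IOError("transient network error")
    return "rows"


result = call_with_retry(flaky_read)
```

Retrying is only safe when the operation can be repeated without side effects, which is generally true of reads but needs more care for writes.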
Bugfixes
- Fix OSError caused by BPO-29097 under certain Python versions
- Show composite error message when failing to parse data type
Enhancements
- Drop support for Python 2.6
- Add more options of pip into pyodps-pack
- Show more information when command not found on pyodps-pack
- Refine creating ODPS instances from environment variables
- Use modified requests library to simplify file-like writers
- Optimize cython implementation of tunnel record IO by introducing more nogil marks
- Refine error parsing and add tag of endpoint
- Reduce calls of tenant APIs
- Add options to read antique datetime as None
- Add support for minikube for pyodps-pack
- Support yielding data while writing in arrow tunnels
- Support to_pandas on slices of readers
Deployment
- Fix dir missing on installing with source code with Jupyter
Tests
- Migrate all tests to pytest
Documentation
- Require jQuery for documentations
- Add notifications for checking XFlow instances after iter_xflow_subinstances.
Compatibility Issues
- Support for Python 2.6 is formally dropped since 0.11.4. Please use 0.11.3.1 or earlier versions.
- Using async_ arguments as positional arguments is deprecated. Please use it as a keyword argument.
- BufferredRecordWriter is now renamed as BufferedRecordWriter. References to the old class should be switched to the new one.
v0.11.3.1
Enhancements
- Add support for non-Docker mode for pyodps-pack. It now supports limited scenarios when Docker is not available.
- Reduce maximum memory cost of to_pandas() on tunnels by converting to pandas in batches
- Support complex types when calling to_pandas() on tunnels
- Use default schema when odps.namespace.schema is enabled on tenants, or options.always_enable_schema is set to True
- Make sure merging small files is available under schemas
- (Experimental) Support more functionality of external volumes
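The memory saving from converting in batches comes from never holding more than one chunk of intermediate records at a time. The sketch below shows only the batching step with the standard library; the helper `iter_batches` is hypothetical and says nothing about the batch size or mechanism PyODPS uses.

```python
from itertools import islice


def iter_batches(records, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(records)
    while True:
        # islice consumes up to batch_size items without reading further.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


batches = list(iter_batches(range(7), 3))
```

Each yielded batch could then be converted and appended to the result before the next one is read, so peak memory depends on the batch size rather than on the full record count.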
Bugfixes
- Fix tunnel writing when pd.NA is used
Documentation
- Multiple documentation fixes