Only the breaking and most significant changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted commit history.
- Local development has been migrated to using Hatch
- Rebased packaging on PEP 621
- Extracted experimental application/server from the codebase
- Implemented "Metadata.from_descriptor(allow_invalid=False)" (#1501)
- Various architectural and standards-compatibility improvements (minor breaking changes):
  - Added new Console commands:
    - list
    - explore
    - query
    - script
    - convert
    - publish
  - Rebased Console commands on Rich (nice output in the Console)
  - Fixed `extract` returning results whose shape depended on the source type (now it's always a dictionary indexed by the resource name)
  - Enforced type safety: many tabular commands will be marked as impossible for non-tabular resources if a type checker is used
  - Improved `frictionless.Resource(source)` guessing abilities; if you just want to open a table resource, use `frictionless.resources.TableResource(path=path)`
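
Since `extract` now always returns a dictionary indexed by resource name, calling code can treat every result the same way. The sketch below uses a hand-written dictionary in place of a real `extract` call; the sample data, the assumption that each value is a list of row dictionaries, and the helper name `count_rows` are illustrative only and are not taken from the library.

```python
# Hypothetical stand-in for the value returned by `extract`:
# a dictionary indexed by resource name (the row shape is an assumption).
result = {
    "cities": [{"id": 1, "name": "london"}, {"id": 2, "name": "paris"}],
    "countries": [{"id": 1, "name": "britain"}],
}


def count_rows(extracted):
    """Return the number of rows per resource name."""
    return {name: len(rows) for name, rows in extracted.items()}


counts = count_rows(result)
for name, total in counts.items():
    print(f"{name}: {total} rows")
```

The same loop works whether the source was a single table or a whole package, which is the point of the unified return type.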
- Implemented `catalog/dataset/package/resource.dereference` (#1451)
- Various architectural and standards-compatibility improvements (minor breaking changes):
  - Improved type detection mechanism (including remote descriptors)
  - Added `resources` module including `File/Text/Json/TableResource`
  - Deprecated `resource.type` argument; use the classes above
  - Changed `catalog.packages[]` to `catalog.datasets[].package`
  - Made `resource.schema` optional (`resource.has_schema` is removed)
  - Made `resource.normpath` optional (`resource.normdata` is removed)
  - Standards-compatibility improvements: profile, stats
  - Renamed `system/plugin.select_Check/etc` to `system/plugin.select_check_class/etc`
- Added support for `sqlalchemy@2` (#1427)
- Implemented `program/resource.index` preview (#1395)
- Support `dialect.skip_blank_rows` (#1387)
- Support `steps.resource_update` for resource transformations (#1381)
- Added support for `wkt` format in `fields.StringField` (#1363 by @jze)
- Support `descriptor` argument for `actions/program.extract` (#1372)
- Frictionless Framework (v5) is out of Beta and released on PyPI
- Implemented CKAN Integration (#1185)
- ForeignKeyError has been extended with additional information: `fieldNames`, `fieldCells`, `referenceName`, and `referenceFieldNames`
- Implemented GitHub Integration (#1185)
- First beta version of Frictionless Framework (v5)
- Added Dialect support to packages (#1137)
- Fixed processing of incompatible decimal char in table schema and data (#1089)
- Added support for Time Zone data (#1097)
- Improved validation messages by adding `summary` and partial validation details (#1106)
- Implemented new feature `summary` (#1127)
  - `schema.to_summary`
  - `report.to_summary`
  - Added CLI command `summary`
- Fixed file compression `package.to_zip` (#1104)
- Implemented feature to validate single resource (#1112)
- Improved error message to notify about invalid fields (#1117)
- Fixed type conversion of NaN values for data of type Int64 (#1115)
- Exposed valid/invalid flags in CLI `extract` command (#1130)
- Implemented feature `package.to_er_diagram` (#1135)
- Implemented `checks.ascii_value` (#1064)
- Implemented `checks.deviated_cell` (#1069)
- Implemented `detector.field_true/false_values` (#1074)
- Deprecated high-level legacy actions (use class-based alternatives):
  - `describe_*`
  - `extract_*`
  - `transform_*`
  - `validate_*`
- Implemented pipeline actions:
  - `pipeline.validate` (will replace `validate_pipeline` in v5)
  - `pipeline.transform` (will replace `transform_pipeline` in v5)
- Implemented inquiry actions:
  - `inquiry.validate` (will replace `validate_inquiry` in v5)
- Implemented schema actions:
  - `Schema.describe` (will replace `describe_schema` in v5)
  - `schema.validate` (will replace `validate_schema` in v5)
- Implemented new transform steps:
  - `steps.field_merge`
  - `steps.field_pack`
- Implemented package actions:
  - `Package.describe` (will replace `describe_package` in v5)
  - `package.extract` (will replace `extract_package` in v5)
  - `package.validate` (will replace `validate_package` in v5)
  - `package.transform` (will replace `transform_package` in v5)
- Implemented resource actions:
  - `Resource.describe` (will replace `describe_resource` in v5)
  - `resource.extract` (will replace `extract_resource` in v5)
  - `resource.validate` (will replace `validate_resource` in v5)
  - `resource.transform` (will replace `transform_resource` in v5)
- Added to_markdown() feature to metadata (#1052)
- Added a feature to export a table schema as Excel (#1040)
- Added nontabular note to validation results to indicate nontabular file (#1046)
- Excel stats now shows bytes and hash (#1045)
- Added pprint feature which displays metadata in a readable and pretty way (#1039)
- Improved error message if resource.data is not a string (#1036)
- Made Detector's private properties public and writable (#1025)
- Improved the order of the metadata in YAML representation
- Exposed Dialect options via CLI such as `sheet`, `table`, `keys`, and `keyed` (#886)
- Validate 'schema.fields[].example' (#998)
- Allows descriptors that subclass collections.abc.Mapping (#985)
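
As an illustration of what "a descriptor that subclasses `collections.abc.Mapping`" can look like, here is a minimal read-only mapping. The class name `FrozenDescriptor` and the sample keys are hypothetical; the sketch only shows the kind of object this change is about, not how the framework consumes it.

```python
from collections.abc import Mapping


class FrozenDescriptor(Mapping):
    """Hypothetical read-only descriptor; not a frictionless class."""

    def __init__(self, **items):
        self._items = dict(items)

    def __getitem__(self, key):
        return self._items[key]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


descriptor = FrozenDescriptor(name="table", path="table.csv")

# The object is a Mapping but not a dict, which is the case this change covers.
is_mapping = isinstance(descriptor, Mapping)
is_dict = isinstance(descriptor, dict)

# Any Mapping can be converted to a plain dict when one is needed.
as_dict = dict(descriptor)
```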
- Added support for `SqlDialect.basepath` (#982) (https://framework.frictionlessdata.io/docs/tutorials/formats/sql-tutorial)
- Added table dimensions check (#985)
- Added "extract --trusted" flag
- Added "--json/yaml" CLI options for transform
- Improved layout/schema detection algorithms (#945)
- Renamed `inlineDialect.keys` to `inlineDialect.data_keys` due to a conflict with `dict.keys` property
- Normalized metadata properties (increased type safety)
- Add fields, limit, sort and filter options to CkanDialect (#912)
- Implemented `system/plugin.create_candidates` (#893)
- Implemented `system.get/use_http_session` (#892)
- SQL Where Clause (#882)
- Implemented descriptor type detection for `extract/validate` (#881)
- Support external profiles for data package (#864)
- Added `json` argument to `resource.to_snap`
- Support resource/field renaming in transform (#843)
- Support `--path` CLI argument (#829)
- Added support for `Package(innerpath)` argument for unzipping a data package's descriptor
- Support control/dialect as JSON in CLI (#806)
- Implemented `describe_dialect` and `describe(path, type="dialect")`
- Support `--dialect` argument in CLI
- Implemented `Schema.from_jsonschema` (#797)
- Use `field.constraints.maxLength` for SQL's VARCHAR (#795)
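
The idea behind the `maxLength` change can be sketched as follows. This is not the library's implementation: the helper name `sql_column_type` and the `TEXT` fallback for unbounded strings are assumptions made for illustration. Only the `constraints.maxLength` property of a Table Schema field descriptor is taken from the entry above.

```python
def sql_column_type(field):
    """Pick a SQL column type for a string field descriptor (illustrative only)."""
    if field.get("type", "string") != "string":
        raise ValueError("this sketch only handles string fields")
    max_length = field.get("constraints", {}).get("maxLength")
    # Use a bounded VARCHAR when the schema declares a maximum length.
    if max_length is not None:
        return f"VARCHAR({int(max_length)})"
    # Assumed fallback when no length constraint is declared.
    return "TEXT"


bounded = sql_column_type(
    {"name": "code", "type": "string", "constraints": {"maxLength": 8}}
)
unbounded = sql_column_type({"name": "note", "type": "string"})
```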
- Implemented `resource.to_view()` (#781)
- Make `fields[].arrayItem` errors more granular (#767)
- Added support for `fields[].arrayItem` (#750)
- Released `frictionless@4` 🎉
- Updated loaders (#658) (BREAKING)
  - Renamed `filelike` loader to `stream` loader
  - Migrated from `text` loader to `buffer` loader
- Improve transform API (#657) (BREAKING)
  - Switched to the `transform_resource(resource)` signature
  - Switched to the `transform_package(package)` signature
- Improved resource/package import/export (#655) (BREAKING)
  - Reworked `parser.write_row_stream` API
  - Reworked `resource.from/to` API
  - Reworked `package.from/to` API
  - Reworked `Storage` API
  - Reworked `system.create_storage` API
  - Merged `PandasStorage` into `PandasParser`
  - Merged `SpssStorage` into `SpssParser`
- Improved transformation steps (#650) (BREAKING)
  - Split value/formula/function concepts
  - Renamed a few minor step arguments
- Improved layout and data streams concepts (#648) (BREAKING)
  - Renamed `data_stream` to `list_stream`
  - Renamed `readData` to `readLists`
  - Renamed `sample` to `fragment` (`sample` now is raw lists)
  - Implemented loader.buffer
  - Implemented parser.sample
  - Added support for function based checks
  - Added support for function based steps
- Reworked Error.tags (BREAKING)
- Reworked Check API and split labels/header (BREAKING)
- Rebased on `Detector` class (BREAKING)
  - Migrated all infer_*, sync/patch_schema and detect_encoding parameters to `Detector`
  - Made `resource.infer` omit empty objects
  - Added `resource.read_*(size)` argument
  - Added `resource.labels` property
- Improved checks/steps API (#621) (BREAKING)
  - Updated `validate(extra_checks=[...])` to `validate(checks=[{"code": 'code', ...}])`
- Updated describe/extract/transform/validate APIs (BREAKING)
  - Removed `validate_table` (use `validate_resource`)
  - Removed legacy `Table` and `File` classes
  - Removed `dataflows` plugin
  - Replaced `nopool` by `parallel` (not parallel by default)
  - Renamed `report.tables` to `report.tasks`
  - Rebased on `report.tasks[].resource` (instead of plain path/scheme/format/etc)
  - Flatten Pipeline steps signature
- Introduced Layout class (BREAKING)
  - Renamed `Query` class and arguments/properties to `Layout`
  - Moved `header` options from `Dialect` to `Layout`
- Updated transform API
  - Added `transform(type)` argument
- Updated describe API (BREAKING)
  - Renamed `describe(source_type)` argument to `type`
- Updated extract API (BREAKING)
  - Removed `extract_table` (use `extract_resource` with the same API)
  - Renamed `extract(source_type)` argument to `type`
- Initial API/codebase improvements for v4 (BREAKING)
  - Allow `Package/Resource(source)` notation (guess descriptor/path/etc)
  - Renamed `schema.infer` -> `Schema.from_sample`
  - Renamed `resource.inline` -> `resource.memory`
  - Renamed `compression_path` -> `innerpath`
  - Renamed `compression: no` -> `compression: ""`
  - Updated `Package/Resource.infer` not to infer stats (use `stats=True`)
  - Removed `Package/Resource.infer(only_sample)` argument
  - Removed `Resource.from/to_zip` (use `Package.from/to_zip`)
  - Removed `Resource.source` (use `Resource.data` or `Resource.fullpath`)
  - Removed `package/resource.infer(source)` argument (use constructors)
  - Added some new API (will be covered in the updated docs after the v4 release)
- Make Resource independent from Table/File (#607) (BREAKING)
  - Resource can be opened like Table (it's recommended to use Resource instead of Table)
  - Renamed `resource.read_sample()` to `resource.sample`
  - Renamed `resource.read_header()` to `resource.header`
  - Renamed `resource.read_stats()` to `resource.stats`
  - Removed `resource.to_table()`
  - Removed `resource.to_file()`
- Optimize Row/Header/Table and rename header errors (#601) (BREAKING)
  - Row object is now lazy; it casts data on-demand preserving the same API
  - Method `resource/table.read_data(_stream)` now includes a header row if present
  - Renamed `errors.ExtraHeaderError->ExtraLabelError` (`extra-label-error`)
  - Renamed `errors.MissingHeaderError->MissingLabelError` (`missing-label-error`)
  - Renamed `errors.BlankHeaderError->BlankLabelError` (`blank-label-error`)
  - Renamed `errors.DuplicateHeaderError->DuplicateLabelError` (`duplicate-label-error`)
  - Renamed `errors.NonMatchingHeaderError->IncorrectLabelError` (`incorrect-label-error`)
  - Renamed `schema.read/write_data->read/write_cells`
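
For code that still refers to the old header error classes, the renames listed above can be captured in a small lookup table. The table contents come from this changelog; the helper name `migrate_error_name` is hypothetical and the snippet is only a migration aid sketch, not part of the library.

```python
# Old class name -> (new class name, new error code), as listed above.
RENAMED_ERRORS = {
    "ExtraHeaderError": ("ExtraLabelError", "extra-label-error"),
    "MissingHeaderError": ("MissingLabelError", "missing-label-error"),
    "BlankHeaderError": ("BlankLabelError", "blank-label-error"),
    "DuplicateHeaderError": ("DuplicateLabelError", "duplicate-label-error"),
    "NonMatchingHeaderError": ("IncorrectLabelError", "incorrect-label-error"),
}


def migrate_error_name(name):
    """Return the post-rename class name, or the input unchanged (hypothetical helper)."""
    new_name, _code = RENAMED_ERRORS.get(name, (name, None))
    return new_name


for old_name in RENAMED_ERRORS:
    print(f"{old_name} -> {migrate_error_name(old_name)}")
```

Note that `NonMatchingHeaderError` is the only rename that changes more than the `Header`/`Label` part of the name, so a plain search-and-replace would miss it.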
- Renamed aws plugin to s3 (#594) (BREAKING)
$ pip install frictionless[aws] # before
$ pip install frictionless[s3] # after
- Drafted support for writing Multipart Data (#583)
- Added support for writing to Remote Data (#582)
- Added support for writing to Google Sheets (#581)
  - Renamed `gsheet` plugin/format to `gsheets` (BREAKING: minor)
- Added support for writing to S3 (#580)
- Update Loader/Parser API to write to different targets (#579) (BREAKING: minor)
- Implemented a standalone multipart loader (#573)
- Fixed Header not being an original one (#572)
- Fix bad format validation (#571)
- Added a default errors limit equal to 1000 (#570)
- Added support for field.float_number (#569)
- Improved ckan plugin (#560)
- Removed the non-working elastic plugin draft (#558)
- Support custom types (#557)
- Added "resolve" option to "resource/package.to_zip" (#556)
- Moved `frictionless.controls` to `frictionless.plugins.*` (BREAKING)
- Moved `frictionless.dialects` to `frictionless.plugins.*` (BREAKING)
- Moved `frictionless.exceptions.FrictionlessException` to `frictionless.FrictionlessException` (BREAKING)
- Moved `excel` dependencies to `frictionless[excel]` extras (BREAKING)
- Moved `json` dependencies to `frictionless[json]` extras (BREAKING)
- Consider `json` files to be metadata by default (BREAKING)
Code example:
# Before
# pip install frictionless
from frictionless import dialects, exceptions
excel_dialect = dialects.ExcelDialect()
json_dialect = dialects.JsonDialect()
exception = exceptions.FrictionlessException()
# After
# pip install frictionless[excel,json]
from frictionless import FrictionlessException
from frictionless.plugins.excel import ExcelDialect
from frictionless.plugins.json import JsonDialect
excel_dialect = ExcelDialect()
json_dialect = JsonDialect()
exception = FrictionlessException()
- Implemented resource.write (#537)
- Added url parameter to SQL import/export (#535)
- Made tables with header and no data rows valid (#534) (BREAKING: minor)
- Various CLI improvements (#532)
  - Added autocompletion
  - Added stdin support
  - Added "extract --csv"
  - Exposed more options
- Added experimental CKAN support (#528)
- Add a "nopool" argument to validate (#527)
- Stop sorting keyed sources as the order is now guaranteed by Python (#512) (BREAKING)
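
The ordering guarantee this entry relies on is that Python dictionaries preserve insertion order (a language guarantee since Python 3.7). The sketch below uses made-up keyed rows; that the earlier behavior was a plain alphabetical sort of the keys is an assumption used only for contrast.

```python
# Keyed rows as they might arrive from an inline source (sample data is made up).
rows = [
    {"name": "english", "id": 1},
    {"name": "chinese", "id": 2},
]

# Dictionaries preserve insertion order, so labels can be read
# straight from the first row's keys without sorting.
labels = list(rows[0].keys())

# For contrast: what an alphabetical sort of the same keys would give.
sorted_labels = sorted(rows[0].keys())

print(labels)
print(sorted_labels)
```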
- Added "nolookup" argument for validate_package (#515)
- Add transform functionality (#505)
- Methods `schema.get/remove_field` now raise if not found (#505) (BREAKING)
- Methods `package.get/remove_resource` now raise if not found (#505) (BREAKING)
- Lower case resource.scheme/format/hashing/encoding/compression (#499) (BREAKING)
- Support "header_case" option for dialects (#488)
- Added support for DB2 format (#485)
- Improved SPSS plugin (#483)
- Improved BigQuery plugin (#470)
- Added support for SQL Views (#466)
- Rebased AwsLoader on streaming (#460)
- Added `hashing` parameter to `describe/describe_package`
- Removed `table.onerror` property (BREAKING)
- Added timezone for datetime/time parsing (#457) (BREAKING)
- Fixed metadata.to_yaml (#455)
- Removed the `expand` argument from `metadata.to_dict` (BREAKING)
- Added native schema support to SqlParser (#452)
- Make Resource the main internal interface (#446) (BREAKING: for plugin authors)
- Move Resource's stats to `resource.stats` (BREAKING)
- Rename `on_error` to `onerror` (BREAKING)
- Added `resource.stats.fields`
- Add an `on_error` argument to Table/Resource/Package (#445)
- Added streaming to the extract functions (#442)
- Added experimental BigQuery support (#424)
- Added experimental SPSS support (#421)
- Rebased on a `goodtables` successor versioning
- Add support for SQL/Pandas import/export (#31)
- Add support for custom JSONEncoder classes (#24)
- Normalize header terminology
- Initial public version