py-mcsv - A MetaCSV parser for Python
Copyright (C) 2020-2021 J. Férard https://github.com/jferard
License: GPLv3
py-mcsv is a MetaCSV parser for Python. To quote the MetaCSV README:
MetaCSV is an open specification for a CSV file description. This description is written in a small auxiliary CSV file that may be stored along with the CSV file itself. This auxiliary file should provide the information necessary to read and type the content of the CSV file. The standard extension is ".mcsv".
py-mcsv is able to read and type the rows of a CSV file, provided that you have an appropriate MetaCSV file.
(The ColumnDet package can automatically generate a sensible MetaCSV file for a given CSV file.)
On Ubuntu Linux:
$ python3 setup.py install --user
Here's a basic example. The example.csv file reads (encoding: utf-8, newline: CRLF):
name,date,count
foo,2020-11-21,15
foo,2020-11-22,-8
The example.mcsv file reads (encoding: utf-8, newline: CRLF, see the MetaCSV format specification):
domain,key,value
data,col/1/type,date/YYYY-MM-dd
data,col/2/type,integer
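To reproduce the example locally, both files need explicit CRLF line endings whatever the platform. The following is a minimal sketch using only the standard library. The helper name write_example_files is hypothetical, and the trailing CRLF after the last row is an assumption about the files, not something the specification is quoted as requiring here.

```python
import tempfile
from pathlib import Path

# File contents copied from the example above, with CRLF line endings.
CSV_TEXT = "name,date,count\r\nfoo,2020-11-21,15\r\nfoo,2020-11-22,-8\r\n"
MCSV_TEXT = (
    "domain,key,value\r\n"
    "data,col/1/type,date/YYYY-MM-dd\r\n"
    "data,col/2/type,integer\r\n"
)


def write_example_files(directory):
    """Write example.csv and example.mcsv into `directory` (hypothetical helper)."""
    directory = Path(directory)
    csv_path = directory / "example.csv"
    mcsv_path = directory / "example.mcsv"
    # newline="" disables newline translation, so "\r\n" is written
    # verbatim on every platform.
    with open(csv_path, "w", encoding="utf-8", newline="") as f:
        f.write(CSV_TEXT)
    with open(mcsv_path, "w", encoding="utf-8", newline="") as f:
        f.write(MCSV_TEXT)
    return csv_path, mcsv_path


with tempfile.TemporaryDirectory() as tmp:
    csv_path, mcsv_path = write_example_files(tmp)
    # Show the raw bytes to confirm the line endings.
    print(csv_path.read_bytes())
    print(mcsv_path.read_bytes())
```

Opening the files with newline="" matters mostly on Windows, where text mode would otherwise turn each "\r\n" into "\r\r\n".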
The code is:
# Assumes open_dict_csv is exported by the mcsv package.
from mcsv import open_dict_csv

reader = open_dict_csv("example.csv")
for row in reader:
    print(row)
Output:
{'count': 15, 'date': datetime.date(2020, 11, 21), 'name': 'foo'}
{'count': -8, 'date': datetime.date(2020, 11, 22), 'name': 'foo'}
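To make the typing step concrete, here is a rough standard-library sketch that produces the same kind of typed rows by hand. It is not how py-mcsv is implemented: the CONVERTERS table is hard-coded from the example MetaCSV file rather than parsed from it, the strptime pattern "%Y-%m-%d" is our own translation of "YYYY-MM-dd", and the name typed_rows is hypothetical.

```python
import csv
import datetime
import io

CSV_TEXT = "name,date,count\r\nfoo,2020-11-21,15\r\nfoo,2020-11-22,-8\r\n"

# Hand-written converters mirroring the example MetaCSV file, where columns
# are numbered from 0: col/1 ("date") is a date, col/2 ("count") an integer.
CONVERTERS = {
    "date": lambda s: datetime.datetime.strptime(s, "%Y-%m-%d").date(),
    "count": int,
}


def typed_rows(text):
    """Yield one dict per data row, converting the fields listed in CONVERTERS."""
    # newline="" leaves the CRLF endings untouched for the csv module.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    for row in reader:
        # Columns without a converter (here: "name") stay as plain strings.
        yield {
            name: CONVERTERS.get(name, str)(value)
            for name, value in row.items()
        }


for row in typed_rows(CSV_TEXT):
    print(row)
```

The point of a MetaCSV file is that this converter table does not have to be written by hand for every CSV file: the description travels with the data.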
The basic usage is:
reader = open_dict_csv("my-csv-file.csv")
for row in reader:
    # row is a mapping: field name -> typed value
    ...
This assumes that the MetaCSV file has the same name as the CSV file, with the extension ".mcsv". Here, you need the "my-csv-file.mcsv".
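The naming convention can be sketched with pathlib. This only illustrates the rule stated above; default_mcsv_path is a hypothetical helper, and whether py-mcsv derives the path in exactly this way (for instance for file names with several dots) is an assumption.

```python
from pathlib import Path


def default_mcsv_path(csv_path):
    """Return the sibling path with the ".mcsv" extension (hypothetical helper)."""
    # with_suffix replaces only the last extension of the file name.
    return Path(csv_path).with_suffix(".mcsv")


print(default_mcsv_path("my-csv-file.csv"))  # prints my-csv-file.mcsv
```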
You can provide a path to the MetaCSV file if necessary:
reader = open_dict_csv("my-csv-file.csv", "my-meta-csv-file.mcsv")
for row in reader:
    # row is a mapping: field name -> typed value
    ...
If you need the MetaCSV types, just write:
reader = open_dict_csv("my-csv-file.csv", skip_types=False)
for row in reader:
    # the first row is a mapping: field name -> MetaCSV description of type
    # the remaining rows are mappings: field name -> typed value
    ...
You may wish to access rows as lists:
reader = open_csv("my-csv-file.csv")
for row in reader:
    # the first row is a header
    # the remaining rows are lists of typed values
    ...
Simple testing:
$ python3 -m pytest --doctest-modules
Full testing:
$ pip3 install pytest-cov
$ python3.8 -m pytest --cov-report term-missing --cov=mcsv && python3.8 -m pytest --cov-report term-missing --cov-append --doctest-modules mcsv --cov=mcsv && flake8 mcsv/* test/*