MetaCSV/py-mcsv

py-mcsv - A MetaCSV parser for Python

Copyright (C) 2020-2021 J. Férard https://github.com/jferard

License: GPLv3

Overview

py-mcsv is a MetaCSV parser for Python. To quote the MetaCSV README:

MetaCSV is an open specification for a CSV file description. This description is written in a small auxiliary CSV file that may be stored along with the CSV file itself. This auxiliary file should provide the information necessary to read and type the content of the CSV file. The standard extension is ".mcsv".

py-mcsv is able to read and type the rows of a CSV file, provided that you have an appropriate MetaCSV file.

(The ColumnDet package can automatically generate a sensible MetaCSV file for a given CSV file.)

Installation

On Ubuntu Linux:

$ python3 setup.py install --user

Example

Here's a basic example. The example.csv file reads (encoding: utf-8, newline: CRLF):

name,date,count
foo,2020-11-21,15
foo,2020-11-22,-8

The example.mcsv file reads (encoding: utf-8, newline: CRLF, see the MetaCSV format specification):

domain,key,value
data,col/1/type,date/YYYY-MM-dd
data,col/2/type,integer
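Because a MetaCSV file is itself a small CSV file, its contents can be inspected with Python's standard csv module. The sketch below is not how py-mcsv parses the file; it only illustrates the domain,key,value layout shown above. The helper name read_meta is hypothetical.

```python
import csv
import io

# The example.mcsv content from above, with CRLF line endings.
EXAMPLE_MCSV = (
    "domain,key,value\r\n"
    "data,col/1/type,date/YYYY-MM-dd\r\n"
    "data,col/2/type,integer\r\n"
)


def read_meta(source):
    """Hypothetical helper: map (domain, key) -> value from a MetaCSV stream."""
    reader = csv.reader(source)
    next(reader)  # skip the "domain,key,value" header line
    return {(domain, key): value for domain, key, value in reader}


# newline="" leaves the CRLF endings untouched so the csv module handles them.
meta = read_meta(io.StringIO(EXAMPLE_MCSV, newline=""))
print(meta[("data", "col/1/type")])
```

In this example the keys col/1 and col/2 line up with the date and count columns, which suggests that column indices are zero-based and that a column without a declared type (here, name) is left as text.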

The code is:

from mcsv import open_dict_csv  # assumed import path; the package is named mcsv

reader = open_dict_csv("example.csv")
for row in reader:
    print(row)

Output:

{'count': 15, 'date': datetime.date(2020, 11, 21), 'name': 'foo'}
{'count': -8, 'date': datetime.date(2020, 11, 22), 'name': 'foo'}
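To make clear what "typing" the rows means, here is a minimal standard-library sketch that produces the same kind of typed rows by hand. It does not use py-mcsv and is not its implementation: the CONVERTERS table is a hand-written stand-in for what the MetaCSV file describes, and typed_rows is a hypothetical name.

```python
import csv
import datetime
import io

# The example.csv content from above, with CRLF line endings.
EXAMPLE_CSV = (
    "name,date,count\r\n"
    "foo,2020-11-21,15\r\n"
    "foo,2020-11-22,-8\r\n"
)

# Hand-written converters standing in for the MetaCSV description.
# Columns that are not listed here are kept as text.
CONVERTERS = {
    "date": datetime.date.fromisoformat,
    "count": int,
}


def typed_rows(source):
    """Yield one dict per data row, with values converted per column."""
    for row in csv.DictReader(source):
        yield {
            name: CONVERTERS.get(name, str)(value)
            for name, value in row.items()
        }


rows = list(typed_rows(io.StringIO(EXAMPLE_CSV, newline="")))
for row in rows:
    print(row)
```

With py-mcsv, the converters do not have to be written in code: they come from the ".mcsv" file that sits next to the data.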

Usage

The basic usage is:

reader = open_dict_csv("my-csv-file.csv")
for row in reader:
    # row is a mapping: field name -> typed value
    ...

This assumes that the MetaCSV file has the same name as the CSV file, with the extension ".mcsv". Here, that file is "my-csv-file.mcsv".
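If you want to check that the expected MetaCSV file exists before opening the CSV file, the naming convention can be reproduced with pathlib. This is only a sketch of the convention described above: default_meta_path is a hypothetical helper, and how py-mcsv itself treats a file without a ".csv" extension is not covered here.

```python
from pathlib import Path


def default_meta_path(csv_path):
    """Hypothetical helper: replace the final extension with ".mcsv"."""
    return Path(csv_path).with_suffix(".mcsv")


meta_path = default_meta_path("my-csv-file.csv")
# Print the expected path and whether the file is actually present.
print(meta_path, meta_path.exists())
```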

You can provide a path to the MetaCSV file if necessary:

reader = open_dict_csv("my-csv-file.csv", "my-meta-csv-file.mcsv")
for row in reader:
    # row is a mapping: field name -> typed value
    ...

If you need the MetaCSV types, just write:

reader = open_dict_csv("my-csv-file.csv", skip_types=False)
for row in reader:
    # the first row is a mapping: field name -> MetaCSV description of type
    # the remaining rows are mappings: field name -> typed value
    ...
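When the types row is present, it is often convenient to pull it off before the loop instead of special-casing the first iteration. The sketch below works on any iterable of rows. split_types is a hypothetical helper, and the stand-in list only mimics the shape of what the reader yields; the type descriptions in it are placeholders, not the exact values py-mcsv returns.

```python
def split_types(reader):
    """Hypothetical helper: return (types row, iterator over data rows)."""
    iterator = iter(reader)
    types = next(iterator)  # the first row describes the types
    return types, iterator


# Stand-in for a reader opened with skip_types=False (placeholder values).
fake_reader = [
    {"name": "text", "count": "integer"},
    {"name": "foo", "count": 15},
    {"name": "foo", "count": -8},
]

types, data_rows = split_types(fake_reader)
data = list(data_rows)
print(types)
print(data)
```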

You may wish to access rows as lists:

reader = open_csv("my-csv-file.csv")
for row in reader:
    # the first row is a header
    # the remaining rows are lists of typed values
    ...
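The two row shapes carry the same information: a header followed by lists can be turned back into mappings with zip. The sketch below shows the relationship on stand-in data. rows_as_dicts is a hypothetical helper, not part of py-mcsv.

```python
import datetime


def rows_as_dicts(rows):
    """Hypothetical helper: pair each list row with the header row."""
    iterator = iter(rows)
    header = next(iterator)  # the first row holds the field names
    for row in iterator:
        yield dict(zip(header, row))


# Stand-in for the rows produced when reading example.csv as lists.
list_rows = [
    ["name", "date", "count"],
    ["foo", datetime.date(2020, 11, 21), 15],
    ["foo", datetime.date(2020, 11, 22), -8],
]

dict_rows = list(rows_as_dicts(list_rows))
print(dict_rows[0]["date"])
```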

Testing

Simple testing:

$ python3 -m pytest --doctest-modules

Full testing:

$ pip3 install pytest-cov
$ python3.8 -m pytest --cov-report term-missing --cov=mcsv && python3.8 -m pytest --cov-report term-missing --cov-append --doctest-modules mcsv --cov=mcsv && flake8 mcsv/* test/*
