[Feat] Export CSV To Influx 0.1.0
Bugazelle committed Jul 12, 2019
0 parents commit 17aa131
Showing 12 changed files with 1,046 additions and 0 deletions.
7 changes: 7 additions & 0 deletions .gitignore
@@ -0,0 +1,7 @@
.idea/
venv/
dist/
build/
*.egg-info
*.egg
*.pyc
29 changes: 29 additions & 0 deletions LICENSE
@@ -0,0 +1,29 @@
BSD 3-Clause License

Copyright (c) 2019, Bugazelle
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
154 changes: 154 additions & 0 deletions README.md
@@ -0,0 +1,154 @@
Export CSV To Influx
====================

> Version 0.1.0

**Export CSV To Influx**: Process CSV data and export it to InfluxDB.

## Install
Use pip to install the library. The **export_csv_to_influx** command is then available.

```
pip install ExportCsvToInflux
```

## Features
1. Run the exporter with the **export_csv_to_influx** command
2. Process all the csv files in a folder
3. Automatically convert csv data to int/float/string in Influx
4. Limit the string length in Influx
5. Detect whether a csv file has new data
6. Use the file's last modification time as the time column
7. Automatically create the database if it does not exist
8. Optionally drop the database before inserting data
9. Optionally drop measurements before inserting data
10. Match or filter the data by string or regex
11. Count the data and generate a count measurement
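
The automatic type conversion in the list above could work roughly as sketched below. This is a minimal illustration, not the library's actual implementation: `convert_value` is a hypothetical helper, and the `status` column is made up for the example (it is not part of demo.csv).

```python
def convert_value(raw):
    """Return raw as an int if possible, else a float, else the stripped string.

    Hypothetical helper for illustration; the library's own rules may differ.
    """
    text = raw.strip()
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


# 'status' is an assumed extra column, used only to show the int case
row = {'url': 'https://jmeter.apache.org/', 'response_time': '1.434', 'status': '200'}
converted = {key: convert_value(value) for key, value in row.items()}
print(converted)
```

Trying `int` before `float` keeps whole numbers as integers. InfluxDB rejects writes that change a field's type within a measurement, so a real exporter would probably need to keep the type of each column consistent across rows.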

## Command Arguments
Run `export_csv_to_influx -h` to see the help guide.

Here are the details:

```
-c, --csv, Input CSV file path, or the folder path. **Mandatory**
-d, --delimiter, CSV delimiter. Default: ','.
-lt, --lineterminator, CSV lineterminator. Default: '\n'.
-s, --server, InfluxDB Server address. Default: localhost:8086
-u, --user, InfluxDB User name. **Mandatory**
-p, --password, InfluxDB Password. **Mandatory**
-db, --dbname, InfluxDB Database name. **Mandatory**
-m, --measurement, Metric column name. **Mandatory**
-t, --time_column, Timestamp column name. Default: timestamp. If there is no timestamp column, the file's last modification time is used as the timestamp for all csv rows.
-tf, --time_format, Timestamp format. Default: '%Y-%m-%d %H:%M:%S' e.g.: 1970-01-01 00:00:00.
-tz, --time_zone, Timezone of supplied data. Default: UTC.
-fc, --field_columns, List of csv columns to use as fields, separated by comma. **Mandatory**
-tc, --tag_columns, List of csv columns to use as tags, separated by comma. **Mandatory**
-b, --batch_size, Batch size when inserting data to influx. Default: 500.
-lslc, --limit_string_length_columns, Limit string length column. Default: None.
-ls, --limit_length, Limit length. Default: 20.
-dd, --drop_database, Drop database before inserting data.
-dm, --drop_measurement, Drop measurement before inserting data.
-mc, --match_columns, Match the data you want to get for certain columns, separated by comma.
-mbs, --match_by_string, Match by string, separated by comma.
-mbr, --match_by_regex, Match by regex, separated by comma.
-fic, --filter_columns, Filter the data you want to filter for certain columns, separated by comma.
-fibs, --filter_by_string, Filter by string, separated by comma.
-fibr, --filter_by_regex, Filter by regex, separated by comma.
-ecm, --enable_count_measurement, Enable count measurement.
-fi, --force_insert_even_csv_no_update, Force inserting data into influx, even if the csv has not been updated.
```
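
To make the `--time_format` and `--time_zone` defaults concrete, here is a small Python 3 sketch of parsing a timestamp with the default format and treating it as UTC. `to_epoch_seconds` is a hypothetical helper, and the library's own conversion may differ.

```python
from datetime import datetime, timezone

# Default format documented for --time_format
TIME_FORMAT = '%Y-%m-%d %H:%M:%S'


def to_epoch_seconds(text, time_format=TIME_FORMAT):
    """Parse text with time_format, treat it as UTC, return epoch seconds."""
    parsed = datetime.strptime(text, time_format)
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(to_epoch_seconds('1970-01-01 00:00:00'))
```

If a row does not match the format, `datetime.strptime` raises `ValueError`, so a mismatched `--time_format` would show up on the first row parsed.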

> **Note 1:** You can also use the library programmatically.
```
from ExportCsvToInflux import ExporterObject
exporter = ExporterObject()
exporter.export_csv_to_influx(...)
```

> **Note 2:** CSV data is not inserted into influx again if the file has not been updated. Use `--force_insert_even_csv_no_update=True` to force the insert.

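
One plausible way to detect whether a csv file has changed is to compare a content hash with the one recorded on the previous run. The sketch below only illustrates that idea. The README does not say how the library detects updates, and `file_digest`, `has_new_data` and the in-memory `known_digests` dictionary are all hypothetical.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def has_new_data(path, known_digests):
    """Return True if the file content differs from the recorded digest.

    known_digests maps path -> last seen digest and is updated in place.
    """
    current = file_digest(path)
    if known_digests.get(path) == current:
        return False
    known_digests[path] = current
    return True
```

A real tool would have to persist the digests between runs (for example in a small state file), and a force option would simply skip this check.
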
## Sample
Here is the **demo.csv**.

```
timestamp,url,response_time
2019-07-11 02:04:05,https://jmeter.apache.org/,1.434
2019-07-11 02:04:06,https://jmeter.apache.org/,2.434
2019-07-11 02:04:07,https://jmeter.apache.org/,1.200
2019-07-11 02:04:08,https://jmeter.apache.org/,1.675
2019-07-11 02:04:09,https://jmeter.apache.org/,2.265
2019-07-11 02:04:10,https://sample-demo.org/,1.430
2019-07-12 08:54:13,https://sample-show.org/,1.300
2019-07-12 14:06:00,https://sample-7.org/,1.289
2019-07-12 18:45:34,https://sample-8.org/,2.876
```

1. Export all the data into influx:

```
export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user admin \
--password admin \
--server 127.0.0.1:8086
```
2. Export all the data into influx, **but drop the database first**:
```
export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user admin \
--password admin \
--server 127.0.0.1:8086 \
--drop_database=True
```
3. Export part of the data: **timestamp matches 2019-07-12 and url matches sample-\d+**
```
export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user admin \
--password admin \
--server 127.0.0.1:8086 \
--drop_database=True \
--match_columns=timestamp,url \
--match_by_regex='2019-07-12,sample-\d+'
```
4. Enable the count measurement. A new measurement named **demo_count** is generated:
```
export_csv_to_influx \
--csv demo.csv \
--dbname demo \
--measurement demo \
--tag_columns url \
--field_columns response_time \
--user admin \
--password admin \
--server 127.0.0.1:8086 \
--drop_database=True \
--match_columns=timestamp,url \
--match_by_regex='2019-07-12,sample-\d+' \
--force_insert_even_csv_no_update=True \
--enable_count_measurement=True
```
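
The regex matching used in samples 3 and 4 can be sketched in Python as follows. The sketch assumes each pattern is applied to the column at the same position and that a row is kept only when every column matches, which is how the sample description reads. `match_rows` is a hypothetical helper, and the library's real matching rules may differ.

```python
import csv
import io
import re

# A subset of demo.csv from the sample above
DEMO_CSV = """timestamp,url,response_time
2019-07-11 02:04:10,https://sample-demo.org/,1.430
2019-07-12 08:54:13,https://sample-show.org/,1.300
2019-07-12 14:06:00,https://sample-7.org/,1.289
2019-07-12 18:45:34,https://sample-8.org/,2.876
"""


def match_rows(rows, columns, patterns):
    """Keep rows where each column matches the pattern at the same position."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [row for row in rows
            if all(regex.search(row[column])
                   for column, regex in zip(columns, compiled))]


rows = list(csv.DictReader(io.StringIO(DEMO_CSV)))
matched = match_rows(rows, ['timestamp', 'url'], [r'2019-07-12', r'sample-\d+'])
for row in matched:
    print(row['timestamp'], row['url'])
```

Under these assumptions only the `sample-7` and `sample-8` rows are kept: `sample-show` has no digits after the hyphen, and `sample-demo` falls on a different date.
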
## Special Thanks
This library is inspired by: [https://github.com/fabio-miranda/csv-to-influxdb](https://github.com/fabio-miranda/csv-to-influxdb)
1 change: 1 addition & 0 deletions requirements.txt
@@ -0,0 +1 @@
influxdb>=5.2.2
2 changes: 2 additions & 0 deletions setup.cfg
@@ -0,0 +1,2 @@
[metadata]
description-file = README.md
43 changes: 43 additions & 0 deletions setup.py
@@ -0,0 +1,43 @@
from setuptools import setup, find_packages
import os
import re

CURDIR = os.path.dirname(os.path.abspath(__file__))

# Runtime dependencies are kept in requirements.txt
with open(os.path.join(CURDIR, 'requirements.txt')) as f:
    REQUIRES = f.read().splitlines()

# Read the version string without importing the package
with open(os.path.join(CURDIR, 'src', 'ExportCsvToInflux', '__version__.py')) as f:
    VERSION = re.search("__version__ = '(.*)'", f.read()).group(1)

setup(
    name='ExportCsvToInflux',
    package_dir={'': 'src'},
    packages=find_packages('src'),
    version=VERSION,
    zip_safe=False,
    include_package_data=True,
    description='ExportCsvToInflux: A Solution to export csv to influx db',
    author='Bugazelle',
    author_email='[email protected]',
    keywords=['python', 'csv', 'influx'],
    install_requires=REQUIRES,
    download_url='',
    url='https://github.com/Bugazelle/export-csv-to-influx',
    classifiers=(
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Developers',
        'Natural Language :: English',
        'Programming Language :: Python',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3.4',
        'Programming Language :: Python :: 3.5',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
    ),
    entry_points={
        'console_scripts': [
            'export_csv_to_influx = ExportCsvToInflux.exporter_object:export_csv_to_influx',
        ],
    },
)
17 changes: 17 additions & 0 deletions src/ExportCsvToInflux/__init__.py
@@ -0,0 +1,17 @@
from .influx_object import InfluxObject
from .csv_object import CSVObject
from .exporter_object import ExporterObject
from .base_object import BaseObject
from .__version__ import __version__

_version_ = __version__


class ExportCsvToInflux(InfluxObject,
                        CSVObject,
                        ExporterObject,
                        BaseObject,):
    """ExportCsvToInflux is a library to export csv data into influx db"""

    def __init__(self):
        super(ExportCsvToInflux, self).__init__()
1 change: 1 addition & 0 deletions src/ExportCsvToInflux/__version__.py
@@ -0,0 +1 @@
__version__ = '0.1.0'
33 changes: 33 additions & 0 deletions src/ExportCsvToInflux/base_object.py
@@ -0,0 +1,33 @@
class BaseObject(object):
    """BaseObject"""

    def __init__(self):
        # Characters stripped from both ends of every item
        self.strip_chars = ' \r\n\t/"\',\\'

    def str_to_list(self, string, delimiter=',', lower=False):
        """Function: str_to_list
        :param string: the string
        :param delimiter: the delimiter for list (default comma)
        :param lower: lower the string (default False)
        :return: the list of stripped items
        """

        # Python 2 has a separate unicode type; Python 3 does not define the name
        try:
            text_types = (str, unicode)
        except NameError:
            text_types = (str,)

        string_type = type(string)
        if string_type is list or string_type is tuple:
            if lower:
                li = [str(item).strip(self.strip_chars).lower() for item in string]
            else:
                li = [str(item).strip(self.strip_chars) for item in string]
        elif string_type in text_types:
            li = string.strip(self.strip_chars).split(delimiter)
            if lower:
                li = [item.strip(self.strip_chars).lower() for item in li]
            else:
                li = [item.strip(self.strip_chars) for item in li]
        elif bool(string) is False:
            li = list()
        else:
            raise Exception('Error: The string should be list or string, use comma to separate. '
                            'Current is: type-{0}, {1}'.format(string_type, string))
        return li