feat: lino analyse (#193)
* feat: analyse command

* feat: iter over datasource

* feat: analyse extraction

* feat: lino analyse

* feat: add extractor  for oracle, mysql and db2 database

* fix: remove ; at the end of SQL Query

* docs: update changelog and readme

---------

Co-authored-by: Adrien Aury <[email protected]>
Co-authored-by: Adrien Aury <[email protected]>
3 people authored Oct 17, 2023
1 parent dba5981 commit 8db698f
Showing 15 changed files with 747 additions and 22 deletions.
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -14,8 +14,12 @@ Types of changes
- `Fixed` for any bug fixes.
- `Security` in case of vulnerabilities.

## [2.5.0]

- `Added` command `analyse` to extract metrics from the database in YAML format.

## [2.4.0]

- `Added` go-ora driver for oracle in replacement of old driver (remove technical prerequisite to install Oracle Instant Client)

## [2.3.0]
116 changes: 106 additions & 10 deletions README.md
@@ -1,10 +1,10 @@
![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/CGI-FR/LINO/ci.yml?branch=main)
[![Go Report Card](https://goreportcard.com/badge/github.com/cgi-fr/pimo)](https://goreportcard.com/report/github.com/cgi-fr/lino)
![GitHub all releases](https://img.shields.io/github/downloads/CGI-FR/LINO/total)
![GitHub](https://img.shields.io/github/license/CGI-FR/LINO)
![GitHub Repo stars](https://img.shields.io/github/stars/CGI-FR/LINO)
![GitHub go.mod Go version](https://img.shields.io/github/go-mod/go-version/CGI-FR/LINO)
![GitHub release (latest by date)](https://img.shields.io/github/v/release/CGI-FR/LINO)
[![Go Report Card](https://goreportcard.com/badge/github.com/cgi-fr/pimo)](https://goreportcard.com/report/github.com/cgi-fr/lino)
![GitHub all releases](https://img.shields.io/github/downloads/CGI-FR/LINO/total)
![GitHub](https://img.shields.io/github/license/CGI-FR/LINO)
![GitHub Repo stars](https://img.shields.io/github/stars/CGI-FR/LINO)
![GitHub go.mod Go version](https://img.shields.io/github/go-mod/go-version/CGI-FR/LINO)
![GitHub release (latest by date)](https://img.shields.io/github/v/release/CGI-FR/LINO)

# LINO : Large Input, Narrow Output

@@ -287,11 +287,106 @@ lino push update source --table actor <<<'{"actor_id":998,"last_name":"CHASE","_

The `__usingpk__` field can also be used with an ingress descriptor at any level in the data. The name of this field can be changed to another value with the `--using-pk-field` flag.

### Interaction with other tools
## Analyse

Use the `lino analyse <data_connector_alias>` command to extract metrics from the database in YAML format.

Only the tables and columns explicitly listed in the `tables.yaml` file are analysed.

Example result :

```yaml
database: source
tables:
  - name: first_name
    columns:
      - name: actor
        type: string
        concept: ""
        constraint: []
        confidential: null
        mainMetric:
          count: 200
          empty: 0
          unique: 128
          sample:
            - WALTER
            - MAE
            - LAURENCE
            - GREG
            - ALEC
        stringMetric:
          mostFrequentLen:
            - length: 4
              freq: 0.235
              sample:
                - GARY
                - ALAN
                - ADAM
                - JEFF
                - GINA
            - length: 5
              freq: 0.215
              sample:
                - REESE
                - MILLA
                - SALMA
                - RALPH
                - SUSAN
            - length: 7
              freq: 0.16
              sample:
                - OLYMPIA
                - KIRSTEN
                - MATTHEW
                - RICHARD
                - KIRSTEN
            - length: 6
              freq: 0.14
              sample:
                - WHOOPI
                - WALTER
                - SANDRA
                - WHOOPI
                - JOHNNY
            - length: 3
              freq: 0.12
              sample:
                - BOB
                - BEN
                - KIM
                - BOB
                - TOM
          leastFrequentLen:
            - length: 11
              freq: 0.01
              sample:
                - CHRISTOPHER
            - length: 9
              freq: 0.02
              sample:
                - CHRISTIAN
                - SYLVESTER
            - length: 2
              freq: 0.02
              sample:
                - AL
                - ED
            - length: 8
              freq: 0.08
              sample:
                - JULIANNE
                - LAURENCE
                - JULIANNE
                - SCARLETT
                - LAURENCE
```
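
The `mostFrequentLen` and `leastFrequentLen` entries in the example above report, for each string length, the share of values having that length (the `freq` values sum to 1). As a rough illustration only, and not the code that LINO or rimo actually runs, such a distribution could be computed along these lines in Go. The `LenFreq` type and the `lengthFrequencies` function are hypothetical names introduced for this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"unicode/utf8"
)

// LenFreq is a hypothetical type pairing a string length with the
// share of values that have this length.
type LenFreq struct {
	Length int
	Freq   float64
}

// lengthFrequencies returns one entry per distinct length found in values,
// sorted by decreasing frequency (ties are broken by increasing length).
// It returns nil when values is empty.
func lengthFrequencies(values []string) []LenFreq {
	if len(values) == 0 {
		return nil
	}

	// Count how many values exist for each length, measured in runes.
	counts := make(map[int]int)
	for _, v := range values {
		counts[utf8.RuneCountInString(v)]++
	}

	result := make([]LenFreq, 0, len(counts))
	for length, count := range counts {
		freq := float64(count) / float64(len(values))
		result = append(result, LenFreq{Length: length, Freq: freq})
	}

	sort.Slice(result, func(i, j int) bool {
		if result[i].Freq != result[j].Freq {
			return result[i].Freq > result[j].Freq
		}
		return result[i].Length < result[j].Length
	})

	return result
}

func main() {
	names := []string{"GARY", "ALAN", "REESE", "BOB", "AL"}
	for _, lf := range lengthFrequencies(names) {
		fmt.Printf("length=%d freq=%.2f\n", lf.Length, lf.Freq)
	}
}
```

Taking the first entries of the sorted slice would give the most frequent lengths, and the last entries the least frequent ones.
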
## Interaction with other tools
**LINO** respects the UNIX philosophy and uses standard input and output to share data with other tools.
## MongoDB storage
### MongoDB storage
A data set can easily be stored in MongoDB with the `mongoimport` tool:

@@ -304,11 +399,12 @@ and reload later to a database :
```bash
$ mongoexport --db myproject --collection customer | lino push customer --jdbc jdbc:oracle:thin:scott/tiger@target:1721:xe
```
## Miller `mlr`

### Miller `mlr`

The `mlr` tool can be used to convert JSON lines into a tabular format (CSV, Markdown table, ...).

## jq
### jq

The **LINO** output can be piped into the `jq` tool to prettify it.

37 changes: 37 additions & 0 deletions cmd/lino/dep_analyse.go
@@ -0,0 +1,37 @@
// Copyright (C) 2021 CGI France
//
// This file is part of LINO.
//
// LINO is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// LINO is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with LINO. If not, see <http://www.gnu.org/licenses/>.

package main

import (
	infra "github.com/cgi-fr/lino/internal/infra/analyse"
	domain "github.com/cgi-fr/lino/pkg/analyse"
)

func analyseDataSourceFactory() map[string]domain.ExtractorFactory {
	return map[string]domain.ExtractorFactory{
		"postgres":   infra.NewSQLExtractorFactory(),
		"godror":     infra.NewSQLExtractorFactory(),
		"godror-raw": infra.NewSQLExtractorFactory(),
		"mysql":      infra.NewSQLExtractorFactory(),
		"db2":        infra.NewSQLExtractorFactory(),
	}
}

func analyserFactory() domain.AnalyserFactory {
	return infra.RimoAnalyserFactory{}
}
3 changes: 3 additions & 0 deletions cmd/lino/main.go
@@ -27,6 +27,7 @@ import (
	"text/template"

	over "github.com/adrienaury/zeromdc"
	"github.com/cgi-fr/lino/internal/app/analyse"
	"github.com/cgi-fr/lino/internal/app/dataconnector"
	"github.com/cgi-fr/lino/internal/app/http"
	"github.com/cgi-fr/lino/internal/app/id"
@@ -158,6 +159,7 @@ func init() {
	rootCmd.AddCommand(pull.NewCommand("lino", os.Stderr, os.Stdout, os.Stdin))
	rootCmd.AddCommand(push.NewCommand("lino", os.Stderr, os.Stdout, os.Stdin))
	rootCmd.AddCommand(http.NewCommand("lino", os.Stderr, os.Stdout, os.Stdin))
	rootCmd.AddCommand(analyse.NewCommand("lino", os.Stderr, os.Stdout, os.Stdin))
}

func initConfig() {
@@ -202,6 +204,7 @@ func initConfig() {
		zerolog.SetGlobalLevel(zerolog.Disabled)
	}

	analyse.Inject(tableStorage(), dataconnectorStorage(), analyseDataSourceFactory(), analyserFactory())
	dataconnector.Inject(dataconnectorStorage(), dataPingerFactory())
	relation.Inject(dataconnectorStorage(), relationStorage(), relationExtractorFactory())
	table.Inject(dataconnectorStorage(), tableStorage(), tableExtractorFactory())
10 changes: 8 additions & 2 deletions go.mod
@@ -6,6 +6,7 @@ require (
	github.com/adrienaury/zeromdc v0.0.0-20221116212822-6a366c26ee61
	github.com/awalterschulze/gographviz v2.0.3+incompatible
	github.com/cgi-fr/jsonline v0.5.0
	github.com/cgi-fr/rimo v0.2.0
	github.com/docker/docker-credential-helpers v0.8.0
	github.com/go-sql-driver/mysql v1.7.1
	github.com/gorilla/mux v1.8.0
@@ -27,17 +28,22 @@ require (

require (
	github.com/davecgh/go-spew v1.1.1 // indirect
	github.com/google/go-cmp v0.5.8 // indirect
	github.com/google/go-cmp v0.5.9 // indirect
	github.com/hashicorp/errwrap v1.1.0 // indirect
	github.com/hexops/valast v1.4.4 // indirect
	github.com/iancoleman/orderedmap v0.3.0 // indirect
	github.com/inconshreveable/mousetrap v1.1.0 // indirect
	github.com/invopop/jsonschema v0.7.0 // indirect
	github.com/klauspost/compress v1.16.0 // indirect
	github.com/kr/pretty v0.3.0 // indirect
	github.com/mattn/go-colorable v0.1.13 // indirect
	github.com/mattn/go-sqlite3 v2.0.3+incompatible // indirect
	github.com/pmezard/go-difflib v1.0.0 // indirect
	github.com/spf13/pflag v1.0.5 // indirect
	github.com/stretchr/objx v0.5.0 // indirect
	golang.org/x/mod v0.12.0 // indirect
	golang.org/x/sys v0.12.0 // indirect
	golang.org/x/tools v0.12.0 // indirect
	google.golang.org/protobuf v1.28.1 // indirect
	gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c // indirect
	mvdan.cc/gofumpt v0.5.0 // indirect
)