
Version 2.0, with Python 3 support.
Tomas Teijeiro committed Feb 19, 2019
2 parents 98381d7 + b5f232d commit 73daf4c
Showing 90 changed files with 4,049 additions and 2,870 deletions.
2 changes: 1 addition & 1 deletion Beat_Classification.md
@@ -2,7 +2,7 @@

This project includes an algorithm for automatic beat classification on ECG signals, described in the following paper:

- T. Teijeiro, P. Félix, J. Presedo and D. Castro: *Heartbeat classification using abstract features from the abductive interpretation of the ECG*
- T. Teijeiro, P. Félix, J. Presedo and D. Castro: *Heartbeat classification using abstract features from the abductive interpretation of the ECG*, IEEE Journal of Biomedical and Health Informatics, 2018, vol. 22, no 2, p. 409-420. [DOI: 10.1109/JBHI.2016.2631247](https://doi.org/10.1109/JBHI.2016.2631247).

The algorithm relies on the [abductive interpretation of an ECG record](README.md#interpreting-external-ecg-records) to obtain a set of qualitative morphological and rhythm features for each QRS observation in the interpretation result. Then, a clustering task provides a partition of the full set of QRS observations, and finally a label is assigned to each cluster, classifying all the beats in the record.
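
As a rough illustration of this cluster-then-label strategy, the following sketch uses scikit-learn (already a dependency of the project); the feature matrix, the clustering algorithm, the threshold and the labeling rule are hypothetical placeholders and do not reproduce the actual method from the paper or from `beat_classification.py`.

```python
# Illustrative sketch of a cluster-then-label strategy, NOT the algorithm
# implemented in beat_classification.py: the features, clustering method
# and labeling rule are placeholders.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def classify_beats(beat_features, label_cluster, threshold=0.5):
    """beat_features: (n_beats, n_features) array of per-QRS features.
    label_cluster: function assigning one label to a cluster of beats."""
    clustering = AgglomerativeClustering(n_clusters=None,
                                         distance_threshold=threshold)
    cluster_ids = clustering.fit_predict(beat_features)
    labels = np.empty(len(beat_features), dtype=object)
    for cid in np.unique(cluster_ids):
        members = np.flatnonzero(cluster_ids == cid)
        # Every beat inherits the label assigned to its cluster.
        labels[members] = label_cluster(beat_features[members])
    return labels

# Toy usage: two well-separated groups of beats, labeled by a simple rule.
feats = np.array([[0.10, 0.90], [0.12, 0.88], [0.90, 0.10], [0.88, 0.12]])
print(classify_beats(feats, lambda m: "N" if m[:, 0].mean() < 0.5 else "V"))
```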

79 changes: 53 additions & 26 deletions README.md
@@ -2,20 +2,32 @@

*Construe* is a knowledge-based abductive framework for time series interpretation. It provides a knowledge representation model and a set of algorithms for the interpretation of temporal information, implementing a hypothesize-and-test cycle guided by an attentional mechanism. The framework is fully described in the following paper:

[1]: T. Teijeiro and P. Félix: *On the adoption of abductive reasoning for time series interpretation*, 2016, [arXiv:1609.05632](http://arxiv.org/abs/1609.05632).
[1]: T. Teijeiro and P. Félix: *On the adoption of abductive reasoning for time series interpretation*, Artificial Intelligence, 2018, vol. 262, p. 163-188. [DOI:10.1016/j.artint.2018.06.005](https://doi.org/10.1016/j.artint.2018.06.005).

In this repository you will find the complete implementation of the data model and the algorithms, as well as a knowledge base for the interpretation of multi-lead electrocardiogram (ECG) signals, from the basic waveforms (P, QRS, T) to complex rhythm patterns (Atrial fibrillation, Bigeminy, Trigeminy, Ventricular flutter/fibrillation, etc.). In addition, we provide some utility scripts to reproduce the interpretation of all the ECG strips shown in paper [1], and to allow the interpretation of any ECG record in the [MIT-BIH format](https://www.physionet.org/faq.shtml#file_types) with a command-line interface very similar to that of the [WFDB applications](https://physionet.org/physiotools/wfdb.shtml).



Additionally, the repository includes an algorithm for [automatic heartbeat classification on ECG signals](Beat_Classification.md), described in the paper:

[2]: T. Teijeiro, P. Félix, J. Presedo and D. Castro: *Heartbeat classification using abstract features from the abductive interpretation of the ECG*
[2]: T. Teijeiro, P. Félix, J. Presedo and D. Castro: *Heartbeat classification using abstract features from the abductive interpretation of the ECG*, IEEE Journal of Biomedical and Health Informatics, 2018, vol. 22, no 2, p. 409-420. [DOI: 10.1109/JBHI.2016.2631247](https://doi.org/10.1109/JBHI.2016.2631247).



The *Construe* algorithm is also the basis for the arrhythmia classification method described in the following papers:

[3]: T. Teijeiro, C.A. García, D. Castro and P. Félix: *Arrhythmia Classification from the Abductive Interpretation of Short Single-Lead ECG Records*, Computing in Cardiology, 2017, vol. 44, p. 1-4. [DOI: 10.22489/CinC.2017.166-054](https://doi.org/10.22489/CinC.2017.166-054).

[4]: T. Teijeiro, C.A. García, D. Castro and P. Félix: *Abductive reasoning as the basis to reproduce expert criteria in ECG Atrial Fibrillation identification*, Physiological Measurement, 2018, vol. 39, no 8, 084006. [DOI: 10.1088/1361-6579/aad7e4](https://doi.org/10.1088/1361-6579/aad7e4).

This method won the First Prize in the [Physionet/Computing in Cardiology Challenge 2017](https://physionet.org/challenge/2017), providing the best results in Atrial Fibrillation detection among the 75 participating teams.


The *Construe* algorithm is also the basis for the method described in the paper *Arrhythmia Classification from the Abductive Interpretation of Short Single-Lead ECG Records*, by T. Teijeiro, C.A. García, D. Castro and P. Félix. This method won First Prize in the [Physionet/Computing in Cardiology Challenge 2017](https://physionet.org/challenge/2017), providing the best results in Atrial Fibrillation detection among the 75 participating teams.


## Installation

This project is implemented in pure python, so no installation is required. However, the core algorithms have strong dependencies with the following python packages:
This project is implemented in pure Python 3, so no installation is required. However, the core algorithms have strong dependencies on the following Python packages:

1. [sortedcontainers](https://pypi.python.org/pypi/sortedcontainers)
2. [numpy](https://pypi.python.org/pypi/numpy)
@@ -26,15 +26,15 @@ In addition, the knowledge base for ECG interpretation depends on the following
4. [scikit-learn](https://pypi.python.org/pypi/scikit-learn)
5. [PyWavelets](https://pypi.python.org/pypi/PyWavelets)

To support visualization of the interpretation results and the interpretations tree and run the usage examples, the following packages are also needed:
The following optional packages are also needed to support interactive visualization of the interpretation results and the interpretations tree, and to run the demo examples:

6. [matplotlib](https://pypi.python.org/pypi/matplotlib)
7. [networkx](https://pypi.python.org/pypi/networkx)
8. [pygraphviz](https://pypi.python.org/pypi/pygraphviz)
8. [pygraphviz](https://pypi.python.org/pypi/pygraphviz) and [graphviz](https://www.graphviz.org/)

Finally, reading ECG signal records requires a working installation of the [WFDB software package](http://www.physionet.org/physiotools/wfdb.shtml).

To make easier the installation of Python dependencies, we recommend the [Anaconda Python distribution](https://www.continuum.io/anaconda-overview). Alternatively, you can install them using pip with the following command:
To make the installation of the Python dependencies easier, we recommend the [Anaconda](https://www.continuum.io/anaconda-overview) or [Miniconda](https://conda.io/miniconda.html) Python distributions. Alternatively, you can install them with pip using the following command:

```
~$ pip install -r requirements.txt
@@ -46,21 +46,9 @@ Once all the dependencies are satisfied, it is enough to download the project so
### *Construe* as a tool for ECG analysis
Along with the general data model for knowledge description and the interpretation algorithms, a comprehensive knowledge base for ECG signal interpretation is provided with the framework, so the software can be directly used as a tool for ECG analysis at multiple abstraction levels.

#### Demo examples
All signal strips in [1] are included as interactive examples to make it easier to understand how the interpretation algorithms work. For this, use the `run_example.sh` script, selecting the figure for which you want to reproduce the interpretation process:

```
./run_example.sh fig4
```

![fig4 interpretation](https://cloud.githubusercontent.com/assets/4498106/20661551/a1824bee-b54f-11e6-870f-a2aa14c43e88.png)


Once the interpretation is finished, the resulting observations are printed to the terminal, and two interactive figures are shown. One plots the ECG signal with all the observations organized into abstraction levels (deflections, waves, and rhythms), and the other shows the interpretations tree explored to find the result. Each node in the tree can be selected to show the observations at a given time point during the interpretation, allowing to reproduce the *abduce*, *deduce*, *subsume* and *predict* reasoning steps [1].

#### Interpreting external ECG records: the `construe-ecg` tool

Any ECG record in [MIT-BIH format](https://www.physionet.org/physiotools/wag/header-5.htm) can be interpreted with the *Construe* algorithm. For this, we provide two convenient python modules that may be used as command-line tools. The first one (`fragment_processing.py`) is intended to visually show the result of the interpretation of a (small) ECG fragment, allowing to inspect and reproduce the interpretation process by navigating through the interpretations tree. But the main one is the (`construe_ecg.py`) script, which is intended to be used as a production tool that performs background interpretations of full ECG records (or sections). The result is a set of [annotations in the MIT format](https://www.physionet.org/physiotools/wag/annot-5.htm). Both tools try to follow the [WFDB Applications](https://www.physionet.org/physiotools/wag/wag.htm) command-line interface. The usage of the `construe-ecg` tool is as follows:
Any ECG record in [MIT-BIH format](https://www.physionet.org/physiotools/wag/header-5.htm) can be interpreted with the *Construe* algorithm. This is done via the `construe_ecg.py` script, which is intended to be used as a production command-line tool that performs background interpretations of full ECG records (or sections). The result is a set of [annotations in the MIT format](https://www.physionet.org/physiotools/wag/annot-5.htm). This tool tries to follow the [WFDB Applications](https://www.physionet.org/physiotools/wag/wag.htm) command-line interface. The usage of the `construe-ecg` application is as follows:

```
usage: construe_ecg.py [-h] -r record [-a ann] [-o oann]
@@ -131,18 +131,19 @@ optional arguments:
output the fragment being interpreted.
--no-merge Avoids the use of a branch-merging strategy for
interpretation exploration. If the selected
abstraction level is "conduction", this parameter is
abstraction level is "conduction", this parameter is
ignored.
```

#### Some common usage examples

Perform a full interpretation of record `100` from the [MIT-BIH Arrhythmia Database](https://www.physionet.org/physiobank/database/mitdb) (the output will be stored in the `100.iqrs` annotations file):
Perform a full interpretation of record `100` from the [MIT-BIH Arrhythmia Database](https://www.physionet.org/physiobank/database/mitdb) (the output will be stored in the `100.iqrs` annotation file):

```
$ python construe_ecg.py -r 100
```

Perform a delineation of the selected heartbeats in the `.man` annotations file for the record `sel30` from the [QT database](https://www.physionet.org/physiobank/database/qtdb), and storing the result in the `sel30.pqt` file.
Perform a delineation of the selected heartbeats in the `.man` annotation file for the record `sel30` from the [QT database](https://www.physionet.org/physiobank/database/qtdb), and store the result in the `sel30.pqt` file.

```
$ python construe_ecg.py -r sel30 -a man -o pqt --level conduction
@@ -154,12 +154,28 @@ The same as before, but avoiding P-Wave delineation (only includes QRS complex
$ python construe_ecg.py -r sel30 -a man -o pqt --level conduction --exclude-pwaves
```
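
If you prefer to launch these interpretations from Python instead of the shell (for example, to batch-process several records), a small wrapper along the following lines can be used. It is only a sketch: the record list is a made-up example, and only options shown in the examples above are passed to the script.

```python
# Hypothetical batch wrapper around construe_ecg.py; the record list is an
# illustrative example, and only documented options are used.
import subprocess

records = ["100", "101", "103"]   # e.g. records from the MIT-BIH Arrhythmia Database

for rec in records:
    # Full interpretation; as in the first example above, the output is
    # stored in the <record>.iqrs annotation file.
    cmd = ["python", "construe_ecg.py", "-r", rec, "-v"]
    print("Interpreting record", rec)
    subprocess.run(cmd, check=True)   # raise if the interpretation fails
```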

### Using *Construe* in another problems and domains
#### Interactive demo examples

All signal strips in [1] are included as interactive examples to make it easier to understand how the interpretation algorithms work. For this, after installing the optional dependencies described in the [installation](#installation) section, use the `run_example.sh` script, selecting the figure for which you want to reproduce the interpretation process:

```
./run_example.sh fig4
```

![fig4 interpretation](https://cloud.githubusercontent.com/assets/4498106/20661551/a1824bee-b54f-11e6-870f-a2aa14c43e88.png)

Once the interpretation is finished, the resulting observations are printed to the terminal, and two interactive figures are shown. One plots the ECG signal with all the observations organized into abstraction levels (deflections, waves, and rhythms), and the other shows the interpretations tree explored to find the result. Each node in the tree can be selected to show the observations at a given time point during the interpretation, making it possible to reproduce the *abduce*, *deduce*, *subsume* and *predict* reasoning steps [1].

In order to support this kind of interactive analysis in other arbitrary (short) ECG fragments, the `fragment_processing.py` script is provided. Please note that this tool is conceived just to give insights into the abductive interpretation algorithms and to illustrate the adopted reasoning paradigm, and not as a production tool.

### Using *Construe* in other problems and domains

We will be glad if you want to use *Construe* to solve problems different from ECG interpretation, and we will help you to do so. The first step is to understand what is under the hood, and the best reference is [1]. After this, you will have to define the **Abstraction Model** for your problem, based on the **Observable** and **Abstraction Pattern** formalisms. As an example, a high-level description of the ECG abstraction model is available in [2], and its implementation is in the [`knowledge`](construe/knowledge) subdirectory. A tutorial is also available in the project [wiki](https://github.com/citiususc/construe/wiki/How-to-define-abstraction-models).

Once the domain-specific knowledge base has been defined, the `fragment_processing.py` module should serve as a basis for the execution of the full hypothesize-and-test cycle with different time series and the new abstraction model.
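
As a purely conceptual illustration of the roles involved (none of the class or function names below belong to the Construe API; see the wiki tutorial for the real formalism), the following sketch shows how domain observables, abstraction patterns and the hypothesize-and-test loop fit together:

```python
# Hypothetical, domain-agnostic skeleton of a hypothesize-and-test cycle.
# These names are NOT the Construe API; they only illustrate the roles of
# observables, abstraction patterns and the attentional focus.
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass(frozen=True)
class Observation:
    name: str
    start: float
    end: float

@dataclass
class Pattern:
    """Hypothesizes a higher-level observation from the current evidence."""
    match: Callable[[Observation, List[Observation]], Optional[Observation]]

def interpret(evidence: List[Observation],
              patterns: List[Pattern]) -> List[Observation]:
    interpretation = list(evidence)
    focus = list(evidence)                    # findings awaiting explanation
    while focus:
        finding = focus.pop(0)
        for pattern in patterns:
            hypothesis = pattern.match(finding, interpretation)   # abduce
            if hypothesis is not None and hypothesis not in interpretation:
                interpretation.append(hypothesis)  # keep the hypothesis
                focus.append(hypothesis)           # it may support new abstractions
    return interpretation

# Toy usage: abstract a pair of "wave" observations into a single "beat".
def waves_to_beat(finding, interp):
    waves = [o for o in interp if o.name == "wave"]
    if finding.name == "wave" and len(waves) >= 2:
        return Observation("beat",
                           min(o.start for o in waves),
                           max(o.end for o in waves))
    return None

evidence = [Observation("wave", 0.0, 0.1), Observation("wave", 0.15, 0.4)]
print(interpret(evidence, [Pattern(waves_to_beat)]))
```

In Construe itself these roles are played by the classes in [`model`](construe/model) and by the ECG knowledge base under [`knowledge`](construe/knowledge).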



## Repository structure

The source code is structured in the following main modules:
@@ -170,10 +170,20 @@ The source code is structured in the following main modules:
- [`model`](construe/model): General data model of the framework, including the base class for all *observables* and classes to implement *abstraction grammars* as finite automata.
- [`utils`](construe/utils): Miscellaneous utility modules, including signal processing and plotting routines.



## Known issues

- On windows and OS-X systems, the *Dynamic Time Warping* utilities included in the `construe.utils.signal_processing.dtw` package probably won't work. These sources are from the discontinued [mlpy](http://mlpy.sourceforge.net) project, and should be compiled using [cython](http://cython.org). The fastest solution is probably to install the *mlpy* package and change the `dtw_std` import in the `construe/knowledge/abstraction_patterns/segmentation/QRS.py` module.
- Abductive interpretation of time-series is NP-Hard [1]. This implementation includes several optimizations to make computations feasible, but still the running times are probably longer than you expect if the selected abstraction level is `rhythm`. Parameter tuning also help to increase the interpretation speed (usually at the cost of worse-quality results). Also try the `-v` flag to get feedback and make the wait less painful ;-).
- On Windows and OS X systems, the *Dynamic Time Warping* utilities included in the `construe.utils.signal_processing.dtw` package may not work. These sources are from the discontinued [mlpy](http://mlpy.sourceforge.net) project, and should be compiled using [cython](http://cython.org) with the following commands:
```bash
$ cd construe/utils/signal_processing/dtw
$ python3 setup.py build_ext --inplace
```
Another possible workaround is to install the *mlpy* package and change the `dtw_std` import in the `construe/knowledge/abstraction_patterns/segmentation/QRS.py` module; a simple pure-NumPy fallback is also sketched after this list.

- Abductive interpretation of time-series is NP-Hard [1]. This implementation includes several optimizations to make computations feasible, but still the running times are probably longer than you expect if the selected abstraction level is `rhythm`. Parameter tuning also helps to increase the interpretation speed (usually at the cost of worse-quality results). Also try the `-v` flag to get feedback and make the wait less painful ;-).
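
Regarding the first issue, if neither compiling the extension nor installing *mlpy* is an option, a plain NumPy implementation of the standard DTW distance can act as a temporary stand-in. This is an illustrative, unoptimized sketch (not the mlpy `dtw_std` routine) and it will be noticeably slower:

```python
# Illustrative O(n*m) dynamic time warping distance in pure NumPy; a slow,
# hypothetical stand-in for the compiled routine, not the mlpy implementation.
import numpy as np

def dtw_distance(x, y):
    """Classical DTW distance between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]

print(dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0]))  # -> 0.0
```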



## License

12 changes: 6 additions & 6 deletions beat_classification.py
@@ -37,7 +37,7 @@
Cluster = collections.namedtuple('Cluster', ['beats', 'info'])

#Codes for the rhythm.
REGULAR, AFIB, ADVANCED, DELAYED = range(4)
REGULAR, AFIB, ADVANCED, DELAYED = list(range(4))
#Atrial fibrillation beats are tagged as NORMAL in the MIT-BIH Arrhythmia
#database, but during the classification, we marked them with a different code
#not used for other purposes, although semantically related to it
@@ -117,9 +117,9 @@ def get_similarity(sig1, sig2):
Obtains a measure of the similarity between two multi-lead signals, as the
mean of the cross-correlation maximum value for each lead.
"""
cleads = set(sig1.keys()).intersection(sig2.keys())
cleads = set(sig1.keys()).intersection(set(sig2.keys()))
corrs = []
for lead in set(sig1.keys()).union(sig2.keys()):
for lead in set(sig1.keys()).union(set(sig2.keys())):
if lead not in cleads:
corrs.append(0.0)
else:
@@ -277,7 +277,7 @@ def get_cluster_features(cluster, features):
(axis,))), axis=0)
#We select as representative the beat with minimum distance.
info = BeatInfo(cl[np.argmin(eucdist)])
info.pwave = np.mean(pwamps.values()) > 0.05
info.pwave = np.mean(list(pwamps.values())) > 0.05
#For the rhythm features, we use all beats
cl = {b for b in cluster if b in features}
info.rr = np.mean([features[b].rr for b in cl])
Expand Down Expand Up @@ -706,7 +706,7 @@ def find_normal_cluster(clusters):
-len(cl[1].beats))
#Cluster classification
classified = []
clist = sorted(clusters.iteritems(), key=keyf)
clist = sorted(clusters.items(), key=keyf)
#Single cluster classification
i = 0
while i < len(clist):
@@ -747,7 +747,7 @@ def find_normal_cluster(clusters):
#We also include the clustered artifacts.
for b in interp.get_observations(o.RDeflection, filt=lambda ba:
any([ba in cl.beats and any(isinstance(b, o.QRS)
for b in cl.beats) for cl in clusters.itervalues()])):
for b in cl.beats) for cl in clusters.values()])):
a = MIT.MITAnnotation.MITAnnotation()
a.code = b.tag
a.time = b.time.start
10 changes: 5 additions & 5 deletions construe/acquisition/obs_buffer.py
@@ -10,7 +10,7 @@
@author: T. Teijeiro
"""

from ..model import Observable, EventObservable, Interval as Iv
from ..model import Observable, EventObservable
from ..model.observable import overlap, end_cmp_key
import sortedcontainers
import numpy as np
@@ -73,12 +73,12 @@ def get_observations(clazz=Observable, start=0, end=np.inf,
if start == 0:
idx = 0
else:
dummy.time.value = Iv(start, start)
dummy.time.set(start, start)
idx = _OBS.bisect_left(dummy)
if end ==np.inf:
udx = len(_OBS)
else:
dummy.time.value = Iv(end, end)
dummy.time.set(end, end)
udx = _OBS.bisect_right(dummy)
return (obs for obs in _OBS.islice(idx, udx, reverse)
if obs.earlystart >= start and isinstance(obs, clazz) and filt(obs))
@@ -93,7 +93,7 @@ def nobs_before(time):
given time.
"""
dummy = EventObservable()
dummy.time.value = Iv(time, time)
dummy.time.set(time, time)
return _OBS.bisect_right(dummy)

def find_overlapping(observation, clazz=Observable):
@@ -105,7 +105,7 @@ def find_overlapping(observation, clazz=Observable):
obs1.start < obs2.start, then obs1.end < obs2.end.
"""
dummy = EventObservable()
dummy.time.value = Iv(observation.latestart, observation.latestart)
dummy.time.set(observation.latestart, observation.latestart)
idx = _OBS.bisect_right(dummy)
while idx < len(_OBS):
other = _OBS[idx]