
Version 2.0, with Python 3 support.
Tomas Teijeiro committed Feb 19, 2019
2 parents 98381d7 + b5f232d commit 73daf4c
Showing 90 changed files with 4,049 additions and 2,870 deletions.
2 changes: 1 addition & 1 deletion Beat_Classification.md
@@ -2,7 +2,7 @@

This project includes an algorithm for automatic beat classification on ECG signals, described in the following paper:

- T. Teijeiro, P. Félix, J. Presedo and D. Castro: *Heartbeat classification using abstract features from the abductive interpretation of the ECG*
- T. Teijeiro, P. Félix, J. Presedo and D. Castro: *Heartbeat classification using abstract features from the abductive interpretation of the ECG*, IEEE Journal of Biomedical and Health Informatics, 2018, vol. 22, no 2, p. 409-420. [DOI: 10.1109/JBHI.2016.2631247](https://doi.org/10.1109/JBHI.2016.2631247).

The algorithm relies on the [abductive interpretation of an ECG record](README.md#interpreting-external-ecg-records) to obtain a set of qualitative morphological and rhythm features for each QRS observation in the interpretation result. Then, a clustering task provides a partition of the full set of QRS observations, and finally a label is assigned to each cluster, classifying all the beats in the record.
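
As a rough illustration of this cluster-then-label strategy, the following sketch uses scikit-learn (already a dependency of the project); the feature matrix, the clustering algorithm, the threshold and the labeling rule are hypothetical placeholders and do not reproduce the actual method from the paper or from `beat_classification.py`.

```python
# Illustrative sketch of a cluster-then-label strategy, NOT the algorithm
# implemented in beat_classification.py: the features, clustering method
# and labeling rule are placeholders.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def classify_beats(beat_features, label_cluster, threshold=0.5):
    """beat_features: (n_beats, n_features) array of per-QRS features.
    label_cluster: function assigning one label to a cluster of beats."""
    clustering = AgglomerativeClustering(n_clusters=None,
                                         distance_threshold=threshold)
    cluster_ids = clustering.fit_predict(beat_features)
    labels = np.empty(len(beat_features), dtype=object)
    for cid in np.unique(cluster_ids):
        members = np.flatnonzero(cluster_ids == cid)
        # Every beat inherits the label assigned to its cluster.
        labels[members] = label_cluster(beat_features[members])
    return labels

# Toy usage: two well-separated groups of beats, labeled by a simple rule.
feats = np.array([[0.10, 0.90], [0.12, 0.88], [0.90, 0.10], [0.88, 0.12]])
print(classify_beats(feats, lambda m: "N" if m[:, 0].mean() < 0.5 else "V"))
```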

79 changes: 53 additions & 26 deletions README.md
@@ -2,20 +2,32 @@

*Construe* is a knowledge-based abductive framework for time series interpretation. It provides a knowledge representation model and a set of algorithms for the interpretation of temporal information, implementing a hypothesize-and-test cycle guided by an attentional mechanism. The framework is fully described in the following paper:

[1]: T. Teijeiro and P. Félix: *On the adoption of abductive reasoning for time series interpretation*, 2016, [arXiv:1609.05632](http://arxiv.org/abs/1609.05632).
[1]: T. Teijeiro and P. Félix: *On the adoption of abductive reasoning for time series interpretation*, Artificial Intelligence, 2018, vol. 262, p. 163-188. [DOI:10.1016/j.artint.2018.06.005](https://doi.org/10.1016/j.artint.2018.06.005).

In this repository you will find the complete implementation of the data model and the algorithms, as well as a knowledge base for the interpretation of multi-lead electrocardiogram (ECG) signals, from the basic waveforms (P, QRS, T) to complex rhythm patterns (Atrial fibrillation, Bigeminy, Trigeminy, Ventricular flutter/fibrillation, etc.). In addition, we provide some utility scripts to reproduce the interpretation of all the ECG strips shown in paper [1], and to allow the interpretation of any ECG record in the [MIT-BIH format](https://www.physionet.org/faq.shtml#file_types) with a command-line interface very similar to that of the [WFDB applications](https://physionet.org/physiotools/wfdb.shtml).



Additionally, the repository includes an algorithm for [automatic heartbeat classification on ECG signals](Beat_Classification.md), described in the paper:

[2]: T. Teijeiro, P. Félix, J. Presedo and D. Castro: *Heartbeat classification using abstract features from the abductive interpretation of the ECG*
[2]: T. Teijeiro, P. Félix, J. Presedo and D. Castro: *Heartbeat classification using abstract features from the abductive interpretation of the ECG*, IEEE Journal of Biomedical and Health Informatics, 2018, vol. 22, no 2, p. 409-420. [DOI: 10.1109/JBHI.2016.2631247](https://doi.org/10.1109/JBHI.2016.2631247).



The *Construe* algorithm is also the basis for the arrhythmia classification method described in the following papers:

[3]: T. Teijeiro, C.A. García, D. Castro and P. Félix: *Arrhythmia Classification from the Abductive Interpretation of Short Single-Lead ECG Records*, Computing in Cardiology, 2017, vol. 44, p. 1-4. [DOI: 10.22489/CinC.2017.166-054](https://doi.org/10.22489/CinC.2017.166-054).

[4]: T. Teijeiro, C.A. García, D. Castro and P. Félix: *Abductive reasoning as the basis to reproduce expert criteria in ECG Atrial Fibrillation identification*, Physiological Measurement, 2018, vol. 39, no 8, 084006. [DOI: 10.1088/1361-6579/aad7e4](https://doi.org/10.1088/1361-6579/aad7e4).

This method won the First Prize in the [Physionet/Computing in Cardiology Challenge 2017](https://physionet.org/challenge/2017), providing the best results in Atrial Fibrillation detection among the 75 participating teams.


The *Construe* algorithm is also the basis for the method described in the paper *Arrhythmia Classification from the Abductive Interpretation of Short Single-Lead ECG Records*, by T. Teijeiro, C.A. García, D. Castro and P. Félix. This method won First Prize in the [Physionet/Computing in Cardiology Challenge 2017](https://physionet.org/challenge/2017), providing the best results in Atrial Fibrillation detection among the 75 participating teams.


## Installation

This project is implemented in pure python, so no installation is required. However, the core algorithms have strong dependencies with the following python packages:
This project is implemented in pure Python 3, so no installation is required. However, the core algorithms have strong dependencies on the following Python packages:

1. [sortedcontainers](https://pypi.python.org/pypi/sortedcontainers)
2. [numpy](https://pypi.python.org/pypi/numpy)
@@ -26,15 +26,15 @@ In addition, the knowledge base for ECG interpretation depends on the following
4. [scikit-learn](https://pypi.python.org/pypi/scikit-learn)
5. [PyWavelets](https://pypi.python.org/pypi/PyWavelets)

To support visualization of the interpretation results and the interpretations tree and run the usage examples, the following packages are also needed:
The following optional packages are also needed to support interactive visualization of the interpretation results and the interpretations tree, and to run the demo examples:

6. [matplotlib](https://pypi.python.org/pypi/matplotlib)
7. [networkx](https://pypi.python.org/pypi/networkx)
8. [pygraphviz](https://pypi.python.org/pypi/pygraphviz)
8. [pygraphviz](https://pypi.python.org/pypi/pygraphviz) and [graphviz](https://www.graphviz.org/)

Finally, reading ECG signal records requires a working installation of the [WFDB software package](http://www.physionet.org/physiotools/wfdb.shtml).

To make easier the installation of Python dependencies, we recommend the [Anaconda Python distribution](https://www.continuum.io/anaconda-overview). Alternatively, you can install them using pip with the following command:
To make the installation of the Python dependencies easier, we recommend the [Anaconda](https://www.continuum.io/anaconda-overview) or [Miniconda](https://conda.io/miniconda.html) Python distributions. Alternatively, you can install them with pip using the following command:

```
~$ pip install -r requirements.txt
@@ -46,21 +46,9 @@ Once all the dependencies are satisfied, it is enough to download the project so
### *Construe* as a tool for ECG analysis
Along with the general data model for knowledge description and the interpretation algorithms, a comprehensive knowledge base for ECG signal interpretation is provided with the framework, so the software can be directly used as a tool for ECG analysis at multiple abstraction levels.

#### Demo examples
All signal strips in [1] are included as interactive examples to make it easier to understand how the interpretation algorithms work. For this, use the `run_example.sh` script, selecting the figure for which you want to reproduce the interpretation process:

```
./run_example.sh fig4
```

![fig4 interpretation](https://cloud.githubusercontent.com/assets/4498106/20661551/a1824bee-b54f-11e6-870f-a2aa14c43e88.png)


Once the interpretation is finished, the resulting observations are printed to the terminal, and two interactive figures are shown. One plots the ECG signal with all the observations organized into abstraction levels (deflections, waves, and rhythms), and the other shows the interpretations tree explored to find the result. Each node in the tree can be selected to show the observations at a given time point during the interpretation, allowing to reproduce the *abduce*, *deduce*, *subsume* and *predict* reasoning steps [1].

#### Interpreting external ECG records: the `construe-ecg` tool

Any ECG record in [MIT-BIH format](https://www.physionet.org/physiotools/wag/header-5.htm) can be interpreted with the *Construe* algorithm. For this, we provide two convenient python modules that may be used as command-line tools. The first one (`fragment_processing.py`) is intended to visually show the result of the interpretation of a (small) ECG fragment, allowing to inspect and reproduce the interpretation process by navigating through the interpretations tree. But the main one is the (`construe_ecg.py`) script, which is intended to be used as a production tool that performs background interpretations of full ECG records (or sections). The result is a set of [annotations in the MIT format](https://www.physionet.org/physiotools/wag/annot-5.htm). Both tools try to follow the [WFDB Applications](https://www.physionet.org/physiotools/wag/wag.htm) command-line interface. The usage of the `construe-ecg` tool is as follows:
Any ECG record in [MIT-BIH format](https://www.physionet.org/physiotools/wag/header-5.htm) can be interpreted with the *Construe* algorithm. This is done via the `construe_ecg.py` script, which is intended to be used as a production command-line tool that performs background interpretations of full ECG records (or sections). The result is a set of [annotations in the MIT format](https://www.physionet.org/physiotools/wag/annot-5.htm). This tool tries to follow the [WFDB Applications](https://www.physionet.org/physiotools/wag/wag.htm) command-line interface. The usage of the `construe-ecg` application is as follows:

```
usage: construe_ecg.py [-h] -r record [-a ann] [-o oann]
@@ -131,18 +131,19 @@ optional arguments:
output the fragment being interpreted.
--no-merge Avoids the use of a branch-merging strategy for
interpretation exploration. If the selected
abstraction level is "conduction", this parameter is
abstraction level is "conduction", this parameter is
ignored.
```

#### Some common usage examples

Perform a full interpretation of record `100` from the [MIT-BIH Arrhythmia Database](https://www.physionet.org/physiobank/database/mitdb) (the output will be stored in the `100.iqrs` annotations file):
Perform a full interpretation of record `100` from the [MIT-BIH Arrhythmia Database](https://www.physionet.org/physiobank/database/mitdb) (the output will be stored in the `100.iqrs` annotation file):

```
$ python construe_ecg.py -r 100
```

Perform a delineation of the selected heartbeats in the `.man` annotations file for the record `sel30` from the [QT database](https://www.physionet.org/physiobank/database/qtdb), and storing the result in the `sel30.pqt` file.
Perform a delineation of the selected heartbeats in the `.man` annotation file for the record `sel30` from the [QT database](https://www.physionet.org/physiobank/database/qtdb), and store the result in the `sel30.pqt` file.

```
$ python construe_ecg.py -r sel30 -a man -o pqt --level conduction
@@ -154,12 +154,28 @@ The same as before, but avoiding P-Wave delineation (only includes QRS complex
$ python construe_ecg.py -r sel30 -a man -o pqt --level conduction --exclude-pwaves
```
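
If you prefer to launch these interpretations from Python instead of the shell (for example, to batch-process several records), a small wrapper along the following lines can be used. It is only a sketch: the record list is a made-up example, and only options shown in the examples above are passed to the script.

```python
# Hypothetical batch wrapper around construe_ecg.py; the record list is an
# illustrative example, and only documented options are used.
import subprocess

records = ["100", "101", "103"]   # e.g. records from the MIT-BIH Arrhythmia Database

for rec in records:
    # Full interpretation; as in the first example above, the output is
    # stored in the <record>.iqrs annotation file.
    cmd = ["python", "construe_ecg.py", "-r", rec, "-v"]
    print("Interpreting record", rec)
    subprocess.run(cmd, check=True)   # raise if the interpretation fails
```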

### Using *Construe* in another problems and domains
#### Interactive demo examples

All signal strips in [1] are included as interactive examples to make it easier to understand how the interpretation algorithms work. For this, after installing the optional dependencies described in the [installation](#installation) section, use the `run_example.sh` script, selecting the figure for which you want to reproduce the interpretation process:

```
./run_example.sh fig4
```

![fig4 interpretation](https://cloud.githubusercontent.com/assets/4498106/20661551/a1824bee-b54f-11e6-870f-a2aa14c43e88.png)

Once the interpretation is finished, the resulting observations are printed to the terminal, and two interactive figures are shown. One plots the ECG signal with all the observations organized into abstraction levels (deflections, waves, and rhythms), and the other shows the interpretations tree explored to find the result. Each node in the tree can be selected to show the observations at a given time point during the interpretation, making it possible to reproduce the *abduce*, *deduce*, *subsume* and *predict* reasoning steps [1].

In order to support this kind of interactive analysis in other arbitrary (short) ECG fragments, the `fragment_processing.py` script is provided. Please note that this tool is conceived just to give insights into the abductive interpretation algorithms and to illustrate the adopted reasoning paradigm, and not as a production tool.

### Using *Construe* in other problems and domains

We will be glad if you want to use *Construe* to solve problems different from ECG interpretation, and we will help you to do so. The first step is to understand what is under the hood, and the best reference is [1]. After this, you will have to define the **Abstraction Model** for your problem, based on the **Observable** and **Abstraction Pattern** formalisms. As an example, a high-level description of the ECG abstraction model is available in [2], and its implementation is in the [`knowledge`](construe/knowledge) subdirectory. A tutorial is also available in the project [wiki](https://github.com/citiususc/construe/wiki/How-to-define-abstraction-models).

Once the domain-specific knowledge base has been defined, the `fragment_processing.py` module should serve as a basis for the execution of the full hypothesize-and-test cycle with different time series and the new abstraction model.
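
As a purely conceptual illustration of the roles involved (none of the class or function names below belong to the Construe API; see the wiki tutorial for the real formalism), the following sketch shows how domain observables, abstraction patterns and the hypothesize-and-test loop fit together:

```python
# Hypothetical, domain-agnostic skeleton of a hypothesize-and-test cycle.
# These names are NOT the Construe API; they only illustrate the roles of
# observables, abstraction patterns and the attentional focus.
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass(frozen=True)
class Observation:
    name: str
    start: float
    end: float

@dataclass
class Pattern:
    """Hypothesizes a higher-level observation from the current evidence."""
    match: Callable[[Observation, List[Observation]], Optional[Observation]]

def interpret(evidence: List[Observation],
              patterns: List[Pattern]) -> List[Observation]:
    interpretation = list(evidence)
    focus = list(evidence)                    # findings awaiting explanation
    while focus:
        finding = focus.pop(0)
        for pattern in patterns:
            hypothesis = pattern.match(finding, interpretation)   # abduce
            if hypothesis is not None and hypothesis not in interpretation:
                interpretation.append(hypothesis)  # keep the hypothesis
                focus.append(hypothesis)           # it may support new abstractions
    return interpretation

# Toy usage: abstract a pair of "wave" observations into a single "beat".
def waves_to_beat(finding, interp):
    waves = [o for o in interp if o.name == "wave"]
    if finding.name == "wave" and len(waves) >= 2:
        return Observation("beat",
                           min(o.start for o in waves),
                           max(o.end for o in waves))
    return None

evidence = [Observation("wave", 0.0, 0.1), Observation("wave", 0.15, 0.4)]
print(interpret(evidence, [Pattern(waves_to_beat)]))
```

In Construe itself these roles are played by the classes in [`model`](construe/model) and by the ECG knowledge base under [`knowledge`](construe/knowledge).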



## Repository structure

The source code is structured in the following main modules:
@@ -170,10 +170,20 @@ The source code is structured in the following main modules:
- [`model`](construe/model): General data model of the framework, including the base class for all *observables* and classes to implement *abstraction grammars* as finite automata.
- [`utils`](construe/utils): Miscellaneous utility modules, including signal processing and plotting routines.



## Known issues

- On windows and OS-X systems, the *Dynamic Time Warping* utilities included in the `construe.utils.signal_processing.dtw` package probably won't work. These sources are from the discontinued [mlpy](http://mlpy.sourceforge.net) project, and should be compiled using [cython](http://cython.org). The fastest solution is probably to install the *mlpy* package and change the `dtw_std` import in the `construe/knowledge/abstraction_patterns/segmentation/QRS.py` module.
- Abductive interpretation of time-series is NP-Hard [1]. This implementation includes several optimizations to make computations feasible, but still the running times are probably longer than you expect if the selected abstraction level is `rhythm`. Parameter tuning also help to increase the interpretation speed (usually at the cost of worse-quality results). Also try the `-v` flag to get feedback and make the wait less painful ;-).
- On Windows and OS X systems, the *Dynamic Time Warping* utilities included in the `construe.utils.signal_processing.dtw` package may not work. These sources are from the discontinued [mlpy](http://mlpy.sourceforge.net) project, and should be compiled using [cython](http://cython.org) with the following commands:
```bash
$ cd construe/utils/signal_processing/dtw
$ python3 setup.py build_ext --inplace
```
Another possible workaround is to install the *mlpy* package and change the `dtw_std` import in the `construe/knowledge/abstraction_patterns/segmentation/QRS.py` module; a simple pure-NumPy fallback is also sketched after this list.

- Abductive interpretation of time-series is NP-Hard [1]. This implementation includes several optimizations to make computations feasible, but still the running times are probably longer than you expect if the selected abstraction level is `rhythm`. Parameter tuning also helps to increase the interpretation speed (usually at the cost of worse-quality results). Also try the `-v` flag to get feedback and make the wait less painful ;-).
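
Regarding the first issue, if neither compiling the extension nor installing *mlpy* is an option, a plain NumPy implementation of the standard DTW distance can act as a temporary stand-in. This is an illustrative, unoptimized sketch (not the mlpy `dtw_std` routine) and it will be noticeably slower:

```python
# Illustrative O(n*m) dynamic time warping distance in pure NumPy; a slow,
# hypothetical stand-in for the compiled routine, not the mlpy implementation.
import numpy as np

def dtw_distance(x, y):
    """Classical DTW distance between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]

print(dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0]))  # -> 0.0
```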



## License

12 changes: 6 additions & 6 deletions beat_classification.py
@@ -37,7 +37,7 @@
Cluster = collections.namedtuple('Cluster', ['beats', 'info'])

#Codes for the rhythm.
REGULAR, AFIB, ADVANCED, DELAYED = range(4)
REGULAR, AFIB, ADVANCED, DELAYED = list(range(4))
#Atrial fibrillation beats are tagged as NORMAL in the MIT-BIH Arrhythmia
#database, but during the classification, we marked them with a different code
#not used for other purposes, although semantically related to it
@@ -117,9 +117,9 @@ def get_similarity(sig1, sig2):
Obtains a measure of the similarity between two multi-lead signals, as the
mean of the cross-correlation maximum value for each lead.
"""
cleads = set(sig1.keys()).intersection(sig2.keys())
cleads = set(sig1.keys()).intersection(set(sig2.keys()))
corrs = []
for lead in set(sig1.keys()).union(sig2.keys()):
for lead in set(sig1.keys()).union(set(sig2.keys())):
if lead not in cleads:
corrs.append(0.0)
else:
@@ -277,7 +277,7 @@ def get_cluster_features(cluster, features):
(axis,))), axis=0)
#We select as representative the beat with minimum distance.
info = BeatInfo(cl[np.argmin(eucdist)])
info.pwave = np.mean(pwamps.values()) > 0.05
info.pwave = np.mean(list(pwamps.values())) > 0.05
#For the rhythm features, we use all beats
cl = {b for b in cluster if b in features}
info.rr = np.mean([features[b].rr for b in cl])
Expand Down Expand Up @@ -706,7 +706,7 @@ def find_normal_cluster(clusters):
-len(cl[1].beats))
#Cluster classification
classified = []
clist = sorted(clusters.iteritems(), key=keyf)
clist = sorted(clusters.items(), key=keyf)
#Single cluster classification
i = 0
while i < len(clist):
@@ -747,7 +747,7 @@ def find_normal_cluster(clusters):
#We also include the clustered artifacts.
for b in interp.get_observations(o.RDeflection, filt=lambda ba:
any([ba in cl.beats and any(isinstance(b, o.QRS)
for b in cl.beats) for cl in clusters.itervalues()])):
for b in cl.beats) for cl in clusters.values()])):
a = MIT.MITAnnotation.MITAnnotation()
a.code = b.tag
a.time = b.time.start
10 changes: 5 additions & 5 deletions construe/acquisition/obs_buffer.py
@@ -10,7 +10,7 @@
@author: T. Teijeiro
"""

from ..model import Observable, EventObservable, Interval as Iv
from ..model import Observable, EventObservable
from ..model.observable import overlap, end_cmp_key
import sortedcontainers
import numpy as np
@@ -73,12 +73,12 @@ def get_observations(clazz=Observable, start=0, end=np.inf,
if start == 0:
idx = 0
else:
dummy.time.value = Iv(start, start)
dummy.time.set(start, start)
idx = _OBS.bisect_left(dummy)
if end ==np.inf:
udx = len(_OBS)
else:
dummy.time.value = Iv(end, end)
dummy.time.set(end, end)
udx = _OBS.bisect_right(dummy)
return (obs for obs in _OBS.islice(idx, udx, reverse)
if obs.earlystart >= start and isinstance(obs, clazz) and filt(obs))
@@ -93,7 +93,7 @@ def nobs_before(time):
given time.
"""
dummy = EventObservable()
dummy.time.value = Iv(time, time)
dummy.time.set(time, time)
return _OBS.bisect_right(dummy)

def find_overlapping(observation, clazz=Observable):
@@ -105,7 +105,7 @@ def find_overlapping(observation, clazz=Observable):
obs1.start < obs2.start, then obs1.end < obs2.end.
"""
dummy = EventObservable()
dummy.time.value = Iv(observation.latestart, observation.latestart)
dummy.time.set(observation.latestart, observation.latestart)
idx = _OBS.bisect_right(dummy)
while idx < len(_OBS):
other = _OBS[idx]