Cannot replicate extraction to MLCert #8

wenkokke · 2019-12-28T00:26:57Z

I'm currently trying to replicate the extraction from an EMNIST network to MLCert. This is as far as I've gotten trying to get the code in NNCert to work, patching things up as I go:

Install Miniconda
conda create --name NNCert python=3.5
conda activate NNCert
pip install tensorflow==1.6.1
Download EMNIST
Patch MLCert/NNCert/tf/extract_emnist.py:
- Line 6: Change emnist_load_data() to emnist_load_data(<path/to/emnist>)
- Note: The script will try to load from <path/to/emnist>/emnist
python extract_emnist.py (from MLCert/NNCert/tf)

Patch MLCert/NNCert/tf/dataset_params.py:

Restore definition make_dataset:

def make_dataset(images, labels):
  return DataSet(images, labels, reshape=False, dtype=tf.uint8)

Patch MLCert/NNCert/tf/pca.py:
- Line 9: Change _, _, _, _, _, load_data = .. to _, load_data = ..
python pca.py (from MLCert/NNCert/tf)
Rename generated files <path/to/emnist>/emnist/{train,test,validation}_reduced.pkl to <path/to/emnist>/emnist/{train,test,validation}_pca.pkl
Patch MLCert/NNCert/tf/Makefile:
- Line 7: Change train.py to main.py
- Line 10: Change eval.py to main.py
Patch MLCert/NNCert/tf/main.py:
- Line 41: Change hidden_sizes = map(..) to hidden_sizes = list(map(..))
make (from MLCert/NNCert/tf)

😭😭😭

FileNotFoundError: [Errno 2] No such file or directory: 'models/default/params.pkl.gz'
make: *** [train] Error 1

The text was updated successfully, but these errors were encountered:

gstew5 · 2019-12-28T06:53:40Z

Alex, do you think you could take a look?

@wenkokke, in the meantime, you might try the (currently undocumented) CLI @bagnalla built, by doing:

> cd NNCert
> python3 cli.py

The CLI's dependencies are listed in NNCert/cli_deps.txt (though not with version numbers -- perhaps @bagnalla can add).

wenkokke · 2019-12-31T14:40:44Z

Using cli.py works for me, using the following steps:

Install Miniconda
conda create --name NNCert python=3.5
conda activate NNCert
pip install tensorflow==1.6.1
pip install sklearn
pip instal gitpython
pip install inquirer
python cli.py (from MLCert/NNCert/)

It was initially rather unclear to me that the output of cli.py was a asking me to make a choice, rather than giving feedback on the progress.

Training a TF model (at least with float) works.
Compiling a TF model to Coq works.
Extraction fails on the first run, with the error below. I think this may be due to some problems with coq_makefile, at it seems like it doesn't process the files in the correct order w.r.t. dependencies? It seems to succeed on the second run. Will report back when it finishes.

Extracting model...
Makefile:17: Makefile.conf: No such file or directory
COQDEP VFILES
coq_makefile -R ../ MLCert -arg '-w none' net.v out.v bitnet.v kernel.v print.v config.v -o Makefile
Warning: ../ (used in -R or -Q) is not a subdirectory of the current directory

COQC net.v
COQC out.v
File "./out.v", line 8, characters 19-25:
Error: Unable to locate library bitnet.

make[1]: *** [out.vo] Error 1
make: *** [all] Error 2
File "./empiricalloss.v", line 14, characters 19-25:
Error: Unable to locate library bitnet.

wenkokke · 2020-01-03T14:34:36Z

Extracting a Coq model for EMNIST, with 10.000 nodes in the hidden layer, does not seem to be practical. Extraction ran for two days, and memory usage went up to 42GB, before I cancelled it. If you have had the same experience, @ejgallego has offered to help you optimise the generated Coq code, and has offered some suggestions:

Thanks for the details, after a quick peek to MLCert I think it should be possible to optimize memory consumption one order of magnitude. I'd be happy to help with it, of course if you folks think it'd be worth the effort.

Main things to try would be: use a native float and vector repesentation, profile Coq extraction, see where it is not being eager enough.

I'm guessing that extraction might not be your priority, however, as the networks imported to Coq can already be run using the TensorFlow architecture, and extracting the verified code to Haskell may not offer any additional benefits.

gstew5 · 2020-01-03T16:08:40Z

Thanks for your help, Wen! It would certainly be interesting to do the port to native vectors and floats (Coq 8.11), as suggested by @ejgallego. We're not actively working on MLCert at the moment, however, so I can't promise this will be done soon (or at all).

ejgallego · 2020-01-03T17:42:13Z

Hi folks, indeed I believe it should possible to improve the efficiency of extraction quite a bit; just ping us or open an issue at https://github.com/coq/coq/issues and we'll be happy to try to help.

gstew5 assigned bagnalla Dec 28, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cannot replicate extraction to MLCert #8

Cannot replicate extraction to MLCert #8

wenkokke commented Dec 28, 2019 •

edited

Loading

gstew5 commented Dec 28, 2019 •

edited

Loading

wenkokke commented Dec 31, 2019 •

edited

Loading

wenkokke commented Jan 3, 2020

gstew5 commented Jan 3, 2020

ejgallego commented Jan 3, 2020

Cannot replicate extraction to MLCert #8

Cannot replicate extraction to MLCert #8

Comments

wenkokke commented Dec 28, 2019 • edited Loading

gstew5 commented Dec 28, 2019 • edited Loading

wenkokke commented Dec 31, 2019 • edited Loading

wenkokke commented Jan 3, 2020

gstew5 commented Jan 3, 2020

ejgallego commented Jan 3, 2020

wenkokke commented Dec 28, 2019 •

edited

Loading

gstew5 commented Dec 28, 2019 •

edited

Loading

wenkokke commented Dec 31, 2019 •

edited

Loading