Sync with r2.11 (#1156)
* Use IPEX Pytorch whls instead of building IPEX from source (#674)

* Lpot2inc (#446)

* draft for lpot quantization and perf analysis jupyter notebook

Co-authored-by: ltsai1 <[email protected]>

* Sriniva2/ssd rn34 (#682)

* improve ssdrn34 perf.

* minor update.

* enabling synthetic data.

* Update base_benchmark_util.py

* Fix linting error

Signed-off-by: Abolfazl Shahbazi <[email protected]>

Co-authored-by: Abolfazl Shahbazi <[email protected]>

* Add doc updates for '--synthetic-data' option (#683)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Change checkpoint setting for Bert train phase 1 (#602)

* Change checkpoint setting for Bert train phase 1

* fix model and config saving

* fix error when running gpu path (#686)

* fix load pretrained model error when using torch_ccl (#688)

* update py version in base spec (#678) (#690)

* TF addons upgrade to 0.17.1 (#689) (#691)

* updated tf addons version

* remove comment

* Update Dockerfiles prior to IMZ 2.8 release (#693)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update Documents prior to IMZ 2.8 release (#694)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update README.md (#697)

* change numpy version requirement (#703)

* Remove MiniGo training from IMZ (#644)

* remove MiniGo training scripts and unit test

* [RNN-T] [Inference] optimize the batch decoder (#711)

* reduce fill_ OP in rnnt embedding kernel

* optimize add between int and log to reduce dtype conversion

* rnnt: support dump tracing file and print profile table (#712)

* add support for open SUSE leap operating system (#708)

* rnnt inference: pre convert data to bf16 (#713)

* remove squeeze/slice/transpose (#714)

* update resnet50 training code (#710)

* update resnet50 training code

* not using ipex optimize for resnet50 training

* use ipex.optimize() on the whole model (#718)

* resnet50 bf32: calling ipex.optimize to enable bf32 path (#719)

* updated readme: nit fix (#723)

Co-authored-by: Rahul Nair <[email protected]>

* compute throughput by test_mini_batch_size (#740)

* pytorch resnet50: fix bf32 training path error (#739)

* Fix a subtle 'E275' style issue that causes unknown behavior (#742)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* rearrange the paragraphs and fix Markdown headers (#744)

* Align Transformers version for BERT models (#738)

* align transformer version(4.18) for bert models

* change scripts to legacy

* redo calibration

* patch fix

* Update README.md (#746)

* Add support for stock PYT- object detection models (#732)

* stock PYT and windows support for object detection models

* Weizhuoz/reduce model zoo steps (#762)

* reduce steps for bert-base, roberta, fpn models

* modify max_iter for fpn models

* reduce all img classification models steps

* update new config for bert models (#763)

* Adding SciPy for TensorFlow serving SSD-MobileNet model (#764)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update TF ResNet50v1.5 inference for SPR (baremetal) (#749)

* Added matplotlib dependency to image_segmentation requirements (#768)

* Update readmes for the path to output directory (#769)

* update wide & deep readme for the path to pretrained model directory (#771)

* add a check for ubuntu 22.04 support (#721)

* Changes to add bfloat16 support for DIEN training (#679)

* Changes to add bfloat16 support for DIEN training
* Some fixes for reporting performance
* Fixes for dien training and unit tests

* updated tpp file with r2.8 approvals (#773)

* Add Windows stock PyTorch support for TransNet v2 (#779)

* update TransNet v2 to work with stock pytorch
* update Windows.md path in all relevant docs

* add P99 metric for LZ models (#780)

Co-authored-by: Weizhuo Zhang <[email protected]>
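
As a rough illustration of the P99 metric mentioned above, the sketch below computes a 99th-percentile latency with the nearest-rank method. The function name `p99_latency` and the sample values are hypothetical; the repository's own implementation may use a different percentile definition.

```python
def p99_latency(latencies_ms):
    """Return the 99th-percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    # Nearest rank = ceil(0.99 * n), computed with integers to avoid float rounding.
    rank = (99 * n + 99) // 100
    return ordered[rank - 1]


# Hypothetical latencies of 1.0 ms through 200.0 ms.
samples = [float(i) for i in range(1, 201)]
print(p99_latency(samples))
```

With 200 samples the nearest rank is 198, so the call returns the 198th smallest value.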

* Rn50 training multiple epochs output 1 KPI and add training_steps argument. (#775)

* enable --training_steps and 1 training KPI output with multiple epochs

* add prefix

* update print freq

* fix display bug

* enable PyTorch resnet50 fp16 path (#783)

* enable PyTorch resnet50 fp16 path

* fix conflict

* Extract p99 metric from log to summary (#784)

* enable fp16 bert train and inference (#782)

* Vruddarr/pt update windows readmes (#778)

* remove bfloat16 experimental support note (#786)

* Update IPEX installation path (#788)

* Clean up _pycache_ files, remove symlinks, and add license headers for dien training bf16 (#787)

* update readme for jemalloc and iomp path (#789)

* update readme for jemalloc and iomp path

* Updated IOMP path as path to the intel-openmp directory

* PyTorch: fix resnext101 running script (#795)

* Update 3dunet mlperf bash scripts and README (#797)

* update 3dunet mlperf doc to use quickstart scripts, rename quickstart scripts for multi-instance

* fix tests job (#803)

* Adding quick start scripts to ssd-mobilenet bfloat16 precision (#798)

* Update T5 model with windows quick start scripts (#790)

* Update T5 model with windows quick start scripts

* Updated Readme by specifying values to environment variables

* Update inference int8 readme and script of 4 CV models using INC (#698)

* update docs to add INC int8 models as an option
* add instructions for how to quantize a fp32 model using INC

* rnnt: fix stft due to PyTorch API change (#811)

* rnnt training: fix stft due to PyTorch API change (#813)

* Update BareMetalSetup.md (#817)

* Gerardod/build container (#807)

First phase of GHA WF to build the image of a Model Zoo workload container and push it to CAAS.

* Sharvils/tf workload (#808)

* TFv2.10 support added. Horovod version updated.

* Vruddarr/tf add language translation bert fp32 quick start scripts (#804)

* Adding quick start scripts to language translation BERT FP32 model

* Updated TL notebooks for SPR Launch (#810)

* Updates for TL PyTorch notebook

* Edits for two more TL notebooks

* Reverting previous change for virtualenv

* Removed --no-deps and some nonexistent links

* Added TFHub cache dir

* Updated TL notebook README for legal/branding

* Update typo in Readme (#821)

* PyTorch: using ipex.optimize for bf16 training (#824)

* Fix CVEs for Pillow and notebook packages (#831)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* add intel-alphafold2 optimized w/ IPEX from realm of AIDD (#737)

* add alphafold2 from AIDD realm

* Remove unused variable in mlperf 3DUnet performance run (#832)

* Update Model Zoo name, Python version and message for IPEX (#833)

* Update instruction for Miniconda, Jemalloc, PyTorch and IPEX and updt… (#830)

* Update instructions for Miniconda, Jemalloc, PyTorch and IPEX and update the readme by replacing conda with Miniconda.

* Adding comment to install torch in BareMetalSetup.md

* Adding IPEX version and removing *s

* Update models main tables (#836)

* update main readmes

* Adding jemalloc instructions and environment variables (#838)

* DLRM hybrid gradient product (#814)

* enable hybrid mergedembedding

* Hybrid Merge embedding

* update the TTT evaluation method by excluding dataloader & metric evaluation (#844)

Co-authored-by: Zhang, Liangang <[email protected]>

* PyTorch: resnet50 distributed training using lars optimizer (#826)

* modify dlrm's sklearn metric eval func to ipex's multi-thread version (#850)

* modify recall/precision/f1/ap 's eval as optional (#856)

* Port dataloader optimization for distributed training of dlrm (#847)

* update the TTT evaluation method by excluding dataloader & metric evaluation

* port dataloader optimization for distributed training of dlrm

* modify dlrm's sklearn metric eval func to ipex's multi-thread version (#850)

* modify recall/precision/f1/ap 's eval as optional (#856)

* port dataloader optimization for distributed training of dlrm

* delete local bs computation in evaluation stage

* modify the TTT output name

Co-authored-by: Zhang, Liangang <[email protected]>

* Update horovod version to fix run time failure due to Status call (#859)

* fix regression for dlrm single node training (#864)

Co-authored-by: Weizhuo Zhang <[email protected]>

* Update pytorch model zoo table of BF32 with landing zoo models (#865)

* Added SNYK scan (#855)

* Update SSD-ResNet34 code in start.sh (#862)

* Add Distilbert base model for inference (Tensorflow) to model zoo (#815)

* Add fp32 inference for distilbert base model

* Fix Bert spec file (#873)

* 1) Add torch.profiler (#871)

2) change the distributed_training.sh for dlrm to diamond cluster

* Update Wide & Deep docs (#875)

* The copy of #867(Porting evaluation iteration overlapping) (#876)

* port evaluation overlapping

* add resnet50 distributed training script (#879)

* add resnet50 distributed training script

* collect TTT

Co-authored-by: XiaobingSuper <[email protected]>

* reduce redundant bus traffic (#880)

* Port all_to_all index overlapping with interaction and top mlp. (#878)

* port all_to_all index overlapping with interaction and top mlp

* fix seg fault

* Add int8 support for distilbert (#823)

* Add fp32 inference for distilbert base model
Co-authored-by: syedshahbaaz <[email protected]>

* Update DIEN inference docs & quickstart scripts (#869)

* Update DIEN docs
* update for spr ww42
Co-authored-by: WafaaT <[email protected]>

* Update ResNet50v1.5 docs (#820)

* Update and Validate ResNet50v1.5 Inference and training model for TF SPR
* Update and validate docs for TF SPR

Co-authored-by: WafaaT <[email protected]>

* Update Wide & Deep using Large Dataset docs (#877)

* Vruddarr/tf bfloat32 precision check (#893)

* Update Wide and Deep Large Dataset Training Model docs (#881)

* Vruddarr/tf update image recognition models docs (#816)

* Update Inceptionv3, DenseNet 169, Inceptionv4, ResNet50, ResNet101, MobileNet V1 quickstart scripts and docs

* Update and validate MobileNet v1 for TF SPR

Co-authored-by: WafaaT <[email protected]>

* Fix BFloat32 precision check code for Resnet50v1.5 training model (#894)

* Update 3DUNet MLperf for SPR (#889)

* Updated Bert Large SPR READMEs (#887)

* Fix typos in MobilenetV1 scripts (#899)

* modify time function to solve int8 benchmark issue on windows (#898)

* modify time function to solve int8 benchmark issue on windows

* Replace time.time calls with time.perf_counter to improve timing resolution. Updated an additional 5 models

Co-authored-by: Ying <[email protected]>
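
The switch to `time.perf_counter` can be pictured with the minimal sketch below. `time.perf_counter` is documented as a clock with the highest available resolution for measuring short durations, whereas `time.time` can be coarse on some platforms. The helper name `timed` is hypothetical and is not taken from the benchmark scripts.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) measured with perf_counter."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; a real benchmark would time an inference step instead.
result, elapsed = timed(sum, range(1000))
print(f"sum={result}, elapsed={elapsed:.6f}s")
```

Only differences between two `perf_counter` readings are meaningful, since the clock's reference point is undefined.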

* Update DIEN Training docs (#882)

* Adding permissions to scripts in DIEN and correcting pb file paths in README_SPR_baremetal (#901)

* Adding SPR_baremetal_readme and fixing model paths in the tables (#904)

* fix acc test for single node (#903)

* fix acc test for single node

* Update dlrm_s_pytorch.py

Co-authored-by: Weizhuo Zhang <[email protected]>

* commit cherry-picks from r2.9 (#900)

* update tbb files (#843)

* fix vulnerability issues reported by snyk scans (#848)

* upgrade for ipex 1.13

* Update Pillow to '>=9.3.0' (#884)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* fix some bugs for p99 (#909)

* Update tensorflow benchmarks to use latest horovod commit (#908)

* Update start.sh

* Update start.sh

* Update to use shortened commit hash

* do not convert data to bf16 while using fp32 and bf32 (#911)

Co-authored-by: Weizhuo Zhang <[email protected]>

* Update SSD-Resnet34 training docs for SPR task (#914)

* Update SSD-Resnet34 training & docs for SPR

* Vruddarr/tf update ssd mobilenet docs (#846)

* Update quick start scripts and spec file to run for all precisions

* Update and validate SSD-Mobilenet docs for TF SPR

Co-authored-by: WafaaT <[email protected]>

* fix print issue (#915)

Co-authored-by: Weizhuo Zhang <[email protected]>

* Update rfcn docs to use same quick start scripts (#897)

* Update rfcn docs to use same quick start scripts

Co-authored-by: WafaaT <[email protected]>

* Sharvils/spr ssd training (#917)

* Dockerfile updated

* Update SSD-ResNet34 Inference docs (#866)

* Update ResNet34 Inference to use same scripts & docs for all precisions

* Update for SPR WW42

Co-authored-by: WafaaT <[email protected]>

* Update transformer_mlperf scripts and README for SPR WW42 (#891)


Co-authored-by: Wafaa Taie <[email protected]>

* Update TF models spec files for SPR WW42 (#919)

* update TF models spec files for spr ww42

* update docker partial for tf addons version

* workaround rdma config for spr (#925)

* remove supported OS checks (#926)

* Update Model paths in main readme (#928)

* Remove Linux/windows OS platform support checks (#927)

* update resnet50 distributed training script (#923)

* resnet50 distributed training: use logical core for ccl (#930)

* Update bert scripts to add same quick start scripts to all precisions (#910)

* Update MobilenetV1 SPR docs (#931)

* Update Resnet50v1_5_SPR_docs (#934)

* Update SSD-Mobilenet SPR docs (#935)

* Update ResNet50v1.5 inference SPR docs (#933)

* Fix DIEN inference.sh script and add pretrained model env var in mobilenetv1  SPR baremetal readme (#939)

* Update DIEN Inference and Training SPR docs (#937)

* Update SSD-Resnet34 training SPR docs (#936)

* Update SSD-Resnet34 Inference SPR docs (#938)

* Update README_SPR_baremetal.md
remove steps and warm_up steps env vars

Co-authored-by: Wafaa Taie <[email protected]>

* BERT training dockerfile fixed (#921)

* BERT repo version fixed for SPR container (#920)

* Update spr baremetal instructions for 3dunet, bert large and transformer mlperf (#932)

* Update Transformer MLPerf inference docs for pre-trained models (#940)

* Fix Language Translation BERT quickstart scripts (#941)

* fix scripts to detect the number of cores

* Update mlperf_gnmt docs (#945)

* Updating Transformer_LT_official scripts (#913)

* Update main README.md (#947)

* update main readme

* edit transformer_mlperf and bert SPR docs

* Fix CVEs based on Snyk scans in TL notebooks (#951)

* fix snyk critical issues in TL jupyter notebooks

* Remove INC dependency for Snyk issues (#953) (#954)

* removed neuralcompressor to avoid vulnerability in Snyk scans

* update spec files for pretrained models links (#957)

* Fixed num_intra_threads for bfloat16 (#959)

* Fixed num_intra_threads for bfloat16

* Modified open mpi instructions

* Added kmp_blocktime for bfloat16

* Fix syntax error and pythonpath in ssd-resnet34 training (#962)

* fix training bkms (#967)

* fix T5 inference script (#969)

* Fix for SSDRN34 training failure (#970)

* fix for ssdrn34 nightly failure

* Molly/rdma dist (#972)

* revert commit 925, enable RDMA CONFIG

* revert pr 925, enable rdma config

* Update Serving Docs Versions (#974)

* Update Versions

* Add TODO

* Removed precision folders and updated quickstart scripts (#922)

* Removed precision folders and updated quickstart scripts

* Updated README and changed script names

* generated README and advanced.md

* TF SPR DevCatalog READMEs (#983)

* add image recognition devcats

* add tf object detection devcats

* add TF language translation devcats

* add tf image segmentation devcats

* add tf language modeling devcats

* add recommendation tf devcats

* fix swapped containers and precision in run command

* add README_SPR to all getting started links and correct script names

* rename files and point getting started to itself

* fix last link

* Fix spec files (#989)

* fix spec file

* add docker.md doc snippet

* Fix ssd-mobilenet inference script (#990)

remove throughput aggregation

* Update Transformer_MLPerf Inference docs to use same quick start scripts (#963)

* Update benchmark readmes and fix inference.sh file


---------

Co-authored-by: WafaaT <[email protected]>

* Corrected typo in SPR quickstart scripts (#991)

* Update Transformer_MLPerf Training docs (#973)

* Update Transformer_MLPerf Training docs
* changes for code review comments
* add workload container readme

Co-authored-by: WafaaT <[email protected]>

* fix minor error (#994)

* Update TF SPR ww42 containers partials, spec-files and dockerfiles  (#998)

* Sharvils/tf devcats fixes (#995)

Minor fixes to SPR TF DevCatalogs
---------

Co-authored-by: sharvil.shah

* SPR PyTorch DevCatalogs (#993)

Added Devcatalog files targeting SPR container launch

* Delete SPR containers README_SPR.md (#999)

* delete README_SPR.md

* remove references in spec-files

* fix numpy 1.24 deprecated np.float issue for MaskRCNN pytorch (#1006)

* enable fp16 for distilbert (#1005)

* Add pytorch and tensorflow devcatalog tables (#1008)

* add table of devcatalogs

* add devcatalog tables

* make title changes

* move files to docs folder

* Fix ssd-resnet34 workloads, which are currently failing in TF-CPU nightly testing (#1013)

* ssd-resnet34 training: import register_tensor_conversion_function from tensorflow.python.framework.tensor_conversion_registry, which is the current proper library

* ssd-resnet34: remove horovod requirement which is preventing workload from running in TFDO nightly testing due to too-old horovod version

* ssd-resnet34 training: apply register_tensor_conversion_function to bfloat16

* Update ssd-resnet34 README files to suggest horovod>=0.27.0 for training and removing horovod for inference

* Liangan1/tpp bert (#1016)

* Add SQuAD script for inference/training with TPP optimization

* Add pretrain scripts for TPP optimization with (fast_bert API)

* Align dataset/model path  for fast_bert script

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Fix train ENVs

* Update fast_bert_pretrain.sh

* Update run_pretrain_mlperf.py

* Add pretrain scripts with 8 nodes

* Refine scripts

* modify rn50 distributed training script (#1017)

* Fix ssd-resnet34 inference failure due to register_tensor_conversion_function moving from ops to tensor_conversion_registry (#1014)

* Adjust CODEOWNERS

* fix AG Ramesh CODEOWNER

* Remove AG Ramesh. Unknown user?

* modify rn50 distributed training script (#1028)

* upgrade ipython to 8.10.0 to avoid vulnerability (#1024)

* Fix weightsharing scripts for resnet50 v1.5 and bert large (#1027)

* fix numa cores lists for weightsharing instances for both resnet50 and bert large

* add --localalloc

* correct links and reduce table columns (#1011)

* correct links and reduce table columns

* correct segmentation table

* correct more tables

* change dataset links and description

* remove local path reference

* remove relative links

* changed links to point correctly

* Dataset API (#1032)

* add a new api to download and do minimal preprocessing if supported

* add support for brca, tabformer and dureader datasets

* add preprocessing support for brca dataset

* update inference script to use a single socket (#1007)

* set bf32 flag as env var (#1009)

* Update README for fp16 ENV (#1036)

* Molly/fast bert (#1037)

* revert commit 925, enable RDMA CONFIG

* bugfix: Args type bug fix

* pytorch maskrcnn dev catalog (#977)

* pytorch mask rcnn dev catalog

* adding links

* Update README_DEV_CAT.md

Remove Asian font comma, add all precisions to export command

---------

Co-authored-by: Clayne Robison <[email protected]>

* add support for msmarco dataset download (#1034)

* pytorch: update resnet50 readme of fp16 path (#1038)

* increase mnasnet_0_5 iterations for latency mode to 5000 (#1041)

* update windows.md (#1042)

* Fix TF BERT large weight sharing QSS script (#1043)

* fix script name to match readme, and remove printed cores lists

* clean up old log files

* test_multiple_jobs

* delete file

* test_multiple_jobs

* added manual play

* updated file extension

* added test_multiple_jobs label

* I O optimization for evaluation (#896)

* Updated runs-on

* Dataset API: Update DuReader dataset name and raw dataset links (#1046)

* update the DuReader dataset name and raw dataset links

* update readme for the dataset download command

* Update main README.md language based on feedback from legal (#1051)

* update the language of the main readme based on legal feedback

* changes for code review comments

* Update README.md

---------

Co-authored-by: Clayne Robison <[email protected]>

* Revert "I O optimization for evaluation (#896)" (#1061)

This reverts commit 483c45020010cd8d947ff30c7f1b970d4972c003.

* Fix for Horovod issue #3861 (#1071)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* add warmup for roberta and bert_base (#1073)

Co-authored-by: diwei sun <[email protected]>

* Added feature to include terms and conditions (#1062)

* Added feature to include terms and conditions

* Modified terms and conditions to be accepted once only

* Modified terms and conditions text

* Added new line at EOL

* Modified dependency file name

* Modified scripts to run as per user acceptance on terms and conditions

* Fixed URL formatting in TnC file

* Added conditions based on Terms and condition

* Modified the names of the variables

* Update datasets/dataset_api/terms_and_conditions.txt

Co-authored-by: Mahathi Salopanthula <[email protected]>
Co-authored-by: Clayne Robison <[email protected]>

* Code cleanup

---------
Co-authored-by: Mahathi Salopanthula <[email protected]>
Co-authored-by: Clayne Robison <[email protected]>

* Adding inter/intra op threads config to W-n-D data layer (#1076)

* Enable Vision Transformer (#992)

* fp32 precision

* Enable TF BERT-large SQuAD FP16 inference (#1070)

* enable bert_large float16 inference through launch_benchmark.py

Co-Authored-By: Bhavani Subramanian <[email protected]>

* add support for both AMP and keras MP

* adding support for quickstart scripts and minor lint-related changes

* adding float16 support in quickstart README

* renaming float16 to fp16

* updating README about the new flag for enabling grappler AMP

* updating model README

---------

Co-authored-by: Bhavani Subramanian <[email protected]>

* Enable TF ResNet50v1.5 FP16 inference (#1065)

* enable rn50 float16 inference through launch_benchmark.py

Co-Authored-By: Bhavani Subramanian <[email protected]>

* reverting training-related changes

* minor change - copyright year in new files

* renaming float16 to fp16 and adding support for quickstart scripts

* minor correction of renaming Float16 to FP16

* renaming float16 to fp16 changes

* updating model README to indicate the use of FP16

---------

Co-authored-by: Bhavani Subramanian <[email protected]>

* Bfloat16 support for TF-ViT model (#1077)

* Add Bfloat16 support for vision transformer model

* Update AMP optimizer name

* Add bf16 tests

* Change to multi-instance scripts for CI

* Update readme and accuracy script

* Address review comments

* Fix test

* Update main README

* Enabling float16 training for ResNet50v1_5 (#1079)

* Updated quickstart README files

* Added model init files for ResNet50v1_5 fp16 training

* Updated start.sh with fp16 precision

* Enabling fp16 in main model scripts

* Added fp16 precision to BaseBenchmarkUtil class

* Final changes to enable model

* Updated License Headers

* Added a unit test for RN50 FP16 training

* Updated 2 main Readme files

* Enabling float16 training for BERT large / SQuADv1.1 (#1018)

* Adding files to enable fp16 Bert Large training

* Resolved float16 scope name conflict

* Adding changes to support Loss scale optimization for the custom AdamWeightDecay Optimizer

* Added a flag to switch between AMP and KMP

* Changing default precision to float16

* Added changes to start.sh to support BERT large float16 training

* Changing name from float16 to fp16

* Adding model_init and Readme for BERT large Float16 training

* Shortening line length for unit test

* Removing trailing whitespaces

* Adding option for fp16 in BaseBenchmarkUtil class

* Adding flag to switch between AMP and KMP weight updates

* Updating TF requirement and Copyright year in all files

* Corrected grammatical error in precision description and added new comments.

* Updating FP16 convergence results in the README

* Removing unnecessary files

* Updating license headers for all the modified files

* Added fp16 datatype to the quickstart bash script

* Updated License year

* Final quickstart script for Bert Large Squad training

* Updated README for the new quickstart script

* Removed verbose log flag

* Updated Intel License Header to Readme

* Updated 3 main Readme files

* Added --amp details to Readme

* Renamed squad.sh to training_squad.sh

* Updated README files to include Squad Training use-case

* FP16 support for TF-ViT model (#1081)

* Add FP16 support

* Fix test

* Enable FP16 support for Distilbert Inference (Tensorflow) (#1075)

* Enable FP16 for distilbert model

* Add config and model_init file for distilbert fp16

* Update README; Add Unit Test for fp16

* Add quick start scripts for distilbert accuracy, latency and throughput

* Add extra line at the end of scripts

* Remove a unit test

* Add accuracy and benchmark unit test for fp16

* Update quickstart scripts and their file permissions

* Update --num-intra-threads arg in quickstart scripts; Update README

* Update README.md

* Update command for quickstart script in README

* Gda/pr poc (#1069)

* Added tests and changed logic to have precisions as an array on json file.

* Fix OOM caused by incorrect thread setting. (#1084)

* Ejan/model zoo quickstart (#1082)

* Fix syntax for resnet50v1.5 inference

* Import GPU Max and Flex Series workloads from develop-gpu (#1080)

* Add GPU DLRM FP16 inference

* Change to install ATS drivers from local repo

* Add GPU PYT bert large FP16 Inference

* fix _FusedMatMul issue in GPU

* Updated PyTorch to use the common compiler partial and added ARG for the env var file since that changes per compiler

* Add package for ResNet 50 v1.5 int8 Inference pytorch gpu

* Update specs & build files for alpha2 rc1 whls

* Add ResNet50 v1.5 bf16 Training PYT GPU

* Add wrapper package for TF GPU tool container

* Update TF GPU training packages to use alpha2-rc1

* Update IPEX tools container and resnet50v1.5 models for alpha2 rc1

* Update PYT Bert LG and DLRM FP16 inference alpha2-rc1

* Update tf-gpu branch for ww15 dpcpp compiler

* Set ITEX_ENABLE_ONEDNN_LAYOUT_OPT=0 for bert training

* Add section to validate base container, fix dlrm printed statement

* Update the docs for alpha2-rc2 models

* fix ipex tool container readme

* Fix dlrm print using CPU statement to be XPU

* add 1t env vars

* Use add instead of addn

* Update bert large docs to be specific about which pretrained model to use

* Sync with develop

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update the main benchmarks README for gpu models

* Set ITEX_ENABLE_ONEDNN_LAYOUT_OPT=0 in ResNet50v1.5 bf16 training quickstart scripts

* Revert "tmp fix res50v1_5 int8"

This reverts commit 3c120e0bee3a576ee1548d9258b611a889897ee6

* Updates to match batch sizes in docs and updated pb links

* Updating compiler binary

* Update PYT GPU packages for IPEX alpha2 rc6

* rfcn-fp32-inference-k8s package

Signed-off-by: Kam D Kasravi <[email protected]>

* Update GPU specs to make the docs section a list and update TF training docs for DevCloud

* Doc updates for ResNet50v1.5 and BERT large training for GPU

* tf-gpu doc updates

* Fix the BKC and environment for resnet50v1.5 INT8, bert-large and resnet50v1.5 BF16 training

* Update GPU PYT packages to have 2 READMEs

* Remove duplicate license from package

* AI Kit Model Package README

* Clean up PYT model pkgs and update baremetal docs

* Fix GPU tests (#5)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Sync with 'develop' and resolve conflicts (#3)

* Update README.md for IPS 00513014 and 00514541

* Enable remapper pass in densenet169 execution

* Adds protoc and pycocotools dependencies

* K8s packages tests: Checks if username has underscore before creating a namespace
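
The underscore check matters because Kubernetes namespace names must be valid DNS-1123 labels: lowercase alphanumerics or `-`, alphanumeric at both ends, and at most 63 characters. A minimal sketch of such a check follows. The helper names are hypothetical, and replacing underscores with hyphens is an assumption made for illustration; the commit only states that the username is checked.

```python
import re

# DNS-1123 label: lowercase alphanumerics or '-', alphanumeric at both ends.
_DNS1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_valid_namespace(name):
    """Return True if name is usable as a Kubernetes namespace name."""
    return len(name) <= 63 and _DNS1123_LABEL.fullmatch(name) is not None


def namespace_for_user(username):
    """Hypothetical helper: swap underscores for hyphens, then validate."""
    candidate = username.lower().replace("_", "-")
    if not is_valid_namespace(candidate):
        raise ValueError(f"cannot derive a namespace from {username!r}")
    return candidate


print(is_valid_namespace("jane_doe"))
print(namespace_for_user("jane_doe"))
```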

* Fix and simplify serving k8s package path variables

* Upgrade to 'TensorFlow Serving 2.4.0'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* rfcn-fp32-inference-k8s package

Signed-off-by: Kam D Kasravi <[email protected]>

* Quickstart updates for using synthetic data or real data, except SSD-ResNet batch will always use synthetic

* Add Centos8 partials for SPR TF models

* Fix the URL for 'oneAPI-samples' repo

* snapshot

Signed-off-by: Kam D Kasravi <[email protected]>

* Add a copy of existing pytorch ipex icx centos specs to specs/centos

* Fix High vulnerability issues reported by SNYK

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Setting OMP_NUM_THREADS based on num_intra_threads
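
One way to picture this change is the sketch below, which copies a `num_intra_threads` value into the `OMP_NUM_THREADS` environment variable. The function name `set_omp_num_threads` and the rule of skipping unset or non-positive values are assumptions for illustration, not the repository's actual logic.

```python
import os


def set_omp_num_threads(num_intra_threads, env=os.environ):
    """Set OMP_NUM_THREADS from num_intra_threads and return the resulting value.

    Assumption: a missing or non-positive value leaves the environment untouched.
    """
    if num_intra_threads and num_intra_threads > 0:
        # Environment variables are strings, so the count is converted explicitly.
        env["OMP_NUM_THREADS"] = str(num_intra_threads)
    return env.get("OMP_NUM_THREADS")


# A plain dict stands in for os.environ so the demo has no side effects.
demo_env = {}
print(set_omp_num_threads(28, demo_env))
```

The variable has to be set before the OpenMP runtime initializes, so a launcher would typically do this before starting the model process.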

* Weekly SNYK fixes

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fixes broken links in the Launch Benchmarks documentation

* Fix '3d-unet' docker image links

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix Python and TensorFlow Pip package versions for TF v1.15.2

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Adding a minor fix to dynamically calculate the number of remaining images as the provided steps x batch size.

Currently the maximum number of steps the RN50 inference supports is 5000 / batch size. The 50k hard limit prevents long inference runs for platform analysis, hence this fix.

This will enable us to collect telemetry data (such as emon) over longer durations (such as 5 minutes).

Signed-off-by: Rajendrakumar Chinnaiyan <[email protected]>
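
The calculation described above amounts to a single multiplication, sketched here under the assumption that both values are positive integers. The function name `images_to_process` is hypothetical and does not come from the RN50 inference script.

```python
def images_to_process(steps, batch_size):
    """Number of images consumed by `steps` iterations at `batch_size`."""
    if steps <= 0 or batch_size <= 0:
        raise ValueError("steps and batch_size must be positive")
    return steps * batch_size


print(images_to_process(steps=1000, batch_size=128))  # 128000
```

Deriving the image count from the requested steps, instead of capping it at a fixed total, is what allows a run to last as long as the caller asks.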

* Remove unused 'num_cores' from 'rfcn'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Upgrade to 'Pillow>=8.1.2'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Compatibility fixes for automation

* Parameterized model name in resnet50v1.5 serving script
* Increase timeout and modify output
* Adjusts inceptionv3 client input and output

* fix mpi operator cluster scope issue

* Fixes SSD-MobileNet perf comparison by pre-installing numpy with --no-binary

* Enable more models for Perf Analysis notebooks and add auto testing for notebooks

* Update quickstart bare metal documentation to use ./quickstart/<script>.sh

* Fix lints tests for rfcn

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add support for SSD-ResNet34 BF16 inference

* Updating benchmarks table with 'SSD ResNet34 BFloat16'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* modifying requirements.txt in SSDRN34 to use tensorflow add-ons of any version greater than or equal to 0.11.0

* Moving quickstart files to their proper directories and bats test fix

* Update specs and assembler.py to make the documentation section a list

* Fix error in BF16 accuracy test for SSD-ResNet34 with input size of 1200

* Shwetaoj/horovod version

* Fix pip install commands for Python3 and 'numpy' version

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Updated README file for transformer_mlperf model, fixed section links, and added instructions to run the transformer model for both fp32 and bfloat16 inference

* Update BERT large docs to separate out "advanced" and allow for using quickstart scripts when cloning the repo

* Adding DIEN model to modelzoo for inference (fp32 and bfloat16)

* Fixing data format issue for SSD_RN34 and Resnet50 training models

* Replaced existing mlperf transformer LT  bfloat16 training model with a converged model, multi-node support is kept

* Fix for accuracy flag

* Fix some styles for recently merged 'DIEN' model

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Set 'OMP_NUM_THREADS' to 'num_intra_threads'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Updated the transformer_mlperf README file, and also reverted a change made by accident

* Fix styles and other cleanup

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update BERT Large docs to put AI kit first

* Added support for frozen graph with bfloat16 precision.

* Update README file and fix a few errors.

* Fixes for 3D-Unet Mlperf

* Fix link to 'g3doc' installation

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update docs for DenseNet 169 and Faster RCNN FP32 inference

* Fix `environment` spelling typo

* Fix for ssd-resnet34 inference

* Stock PyTorch vs Intel's optimization comparison notebook

* Doc updates for AI Kit

* Adding fix to ssd-resnet34 bfloat16 training

* Doc updates for recommendation models for AI Kit

* AI Kit doc updates for Faster RCNN

* Update SSD ResNet34 backbone model links

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* 3D U-Net AI Kit doc updates

* Mask RCNN AI Kit doc updates

* Doc updates for language modeling models for AI Kit

* UNet doc changes for AI Kit

* Fixed a bug in mlperf_transformer model real time performance measurement, which was caused by the batch size being fixed in the model. Also includes some code cleanup

* Doc updates for RFCN for AI Kit

* Update the docs/README.md to add an AI Kit doc link

* Removing $ from shell command snippets

* Doc updates for SSD-MobileNet for AI Kit

* Update DenseNet169 doc to use the tensorflow conda env for AI Kit

* IMZ CentOS Support for start.sh

* Doc updates for WaveNet for AI Kit

* Doc updates for InceptionV4 for AI Kit

* WORKAROUND - Update horovod version to a commit on master branch to fix build error in horovod

* rama/3d unet

* Enabled user specified warmup and benchmark steps.

* Merge branch 'dtran/platform_util_add' into 'develop'

Added functions to expose some of the properties like core, logical core, numa nodes

See merge request intelai/models!495

* update all TF images to latest

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update TF TPP link too

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add document for users who are new to docker

* Update InceptionV3 docs for AI Kit

* Update code to write checkpoint files to the --checkpoint dir, even when the backbone model isn't provided

* Fixing the link target to the README section that lists the model's prerequisites

* Update MobileNet V1 docs for AI Kit

* Update ResNet50 & ResNet101 docs for AI Kit

* Regenerate docs too for SSD ResNet34

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix SSD ResNet34 style and unittests

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Doc updates for language translation models for AI Kit

* Fix typo in "advanced" setup section

* Doc updates for ResNet50v1.5 for AI Kit

* In-graph arg should be omitted if None for BERT BF16 inference

* Changes to add num_iterations option for DIEN model

* DIEN script refactoring + static graph flag + bf16 online pass support

* Check for 'NOINSTALL' before running 'YUM' commands

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Initial commit for SSD-RN34 BF16 inference

* Prepare for Model Zoo v2.4.0 release

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update output based on new graph

* Sync with 'develop' and resolve conflicts

* Regen documentation and dockerfiles

* Update 'OWNERS' file (#4)

* Update 'OWNERS' file

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add more owners

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix one last failing test

* Update 'DIEN' readme (#6)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Prevent adding wheels or other archives to the repo (#7)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

Co-authored-by: ltsai1 <[email protected]>
Co-authored-by: Yimei Sun <[email protected]>
Co-authored-by: Melanie H Buehler <[email protected]>
Co-authored-by: Taie, Wafaa S <[email protected]>
Co-authored-by: Kasravi, Kam D <[email protected]>
Co-authored-by: Mahmoud Abuzaina <[email protected]>
Co-authored-by: Jones, Dina S <[email protected]>
Co-authored-by: Rajendrakumar Chinnaiyan <[email protected]>
Co-authored-by: Yerneni, Venkata P <[email protected]>
Co-authored-by: Thakkar, Om <[email protected]>
Co-authored-by: Ojha, Shweta <[email protected]>
Co-authored-by: Cui, Xiaoming <[email protected]>
Co-authored-by: Varghese, Jojimon <[email protected]>
Co-authored-by: xiaoming <xkdjfk>
Co-authored-by: Khanna, Kanvi <[email protected]>
Co-authored-by: mdfaijul <[email protected]>
Co-authored-by: Shiddibhavi, Sharada <[email protected]>
Co-authored-by: Shah, Sharvil <[email protected]>
Co-authored-by: Ketineni, Rama <[email protected]>

* GPU RN50v15 Inference (#11)

Generate package with support for all precision

Co-authored-by: Dina Suehiro Jones <[email protected]>

* Adds the PyTorch GPU BERT inference package  (#10)

* Add PyTorch GPU BERT inference container

* documentation updates

* Make scripts executable

* Removing these vars until we hear from mingxiao

* Make brackets consistent

* Formatting

* fix output dir

* Removing typo

* Add note that says the first run will download the pretrained model

* Fix which README goes in the package

* Updates based on the latest bkc

* Update quickstart file names in the spec

* Use tee

* Adds the PyTorch GPU BERT training package  (#13)

* Add documentation, quickstarts, and spec for PyTorch BERT training for GPU

* Fix which README goes in the package

* Add glue files

* Updates based on the latest BKCs

* Doc update and log to screen

* Doc update

* add support for bfloat16 (#17)

* Adds the PyTorch GPU ResNet50v1.5 training package (#15)

* Add ResNet50v1.5 PyTorch training model package

* Update files in package

* update file list for main.py

* Spec update

* Fixes after review

* Use tee

* Adds the PyTorch GPU ResNet50v1.5 inference package (#14)

* Add docs, quickstart scripts, and spec for PyTorch ResNet50v1.5 for GPU

* Fix file path

* Doc and BKC updates

* Update permissions

* Moving the PyTorch DLRM GPU model code  (#18)

* Moving the dlrm code out of the precision folder, since it's the same for all precisions

* Updates from the latest gpu-models 0.2.0gpu_rc1 branch

* Update the old DLRM spec, due to moving the code

* Moving DLRM inference/gpu code to be common gpu code used for both inference and training

* Fixing models paths from the old spec/quickstart

* Add GPU RN50v1.5 Training  (#19)

* added spec & generating package

* removed existing folder

* fixed documentation

* fix scripts

* review changes

* Adds the PyTorch GPU DLRM training package (#21)

* Add initial spec and docs for DLRM pytorch gpu training

* Updated docs

* Update permissions on quickstart

* update dataset doc

* Fix file path

* Adds the PyTorch GPU DLRM inference package (#22)

* Moving the dlrm code out of the precision folder, since it's the same for all precisions

* Updates from the latest gpu-models 0.2.0gpu_rc1 branch

* Update the old DLRM spec, due to moving the code

* Moving DLRM inference/gpu code to be common gpu code used for both inference and training

* Fixing models paths from the old spec/quickstart

* Add files for the spec and documentation for DLRM pytorch GPU inference

* Update quickstart file list

* Documentation updates

* removing old code

* Update dataset instructions

* Add log file analysis

* Update to add download of the model weights

* Doc updates

* Update the datasets instructions to note that the first time the model is run, the preprocessing happens

* Add init files in language modeling & tensorflow folders (#23) (#24)

* add init files in language modeling & tensorflow folders

* changed year

Co-authored-by: Jitendra Patil <[email protected]>

* Add GPU Bert Large inference (#25)

* added scripts

* update docs

* updated scripts

* removed unnecessary folders

* removed spec file

* fix import issues

* review changes

* review update 2

* Add GPU Bert Large training (#29)

* initial commit

* update spec file

* added missing init file

* update docs

* deleted unwanted files

* review changes

* gpu support for bfloat16 (#31)

* updates to docs & scripts (#34)

* Update pytorch bert for gpu to include transformers code (#32)

* Update pytorch bert for gpu to include transformers code

* BERT large inference doc updates and fixes

* update to use a clone of the AI Kit conda env

* Fix paths in the quickstart script

* add sacremoses to the requirements

* Updates to the DLRM model packages for PyTorch GPU (#35)

* DLRM fixes

* update training precisions

* Updates for the user to download the pretrained model separately

* PyTorch GPU ResNet50v1.5 updates and fixes (#38)

* Add resnet models file

* Fix to use tee

* Updates for training

* Fix log file name

* Whitespace

* whitespace

* Updated model files from gpu-models 0.2.0gpu (b761567)

* Quickstart updates

* PyTorch model source from the gpu-models 0.2.0gpu branch (b761567) & add env vars (#39)

* Updated models from the gpu-models 0.2.0gpu branch (b761567)

* Updated BKCs

* Add setting of env vars

* Change warn to echo

* added env parameter (#44)

* Fixes for PyTorch GPU AI Kit models dependency install  (#45)

* PyTorch GPU fixes from SH for running from a read only directory (#46)

* Updates from the PyTorch team

* Set tensorboard logdir

* Updates from Agnieszka's fixes_0.2.0gpu branch

* Grabbing unchanged files

* Reverting header year change from unchanged file

* Removing old pytorch gpu model spec/dockerfiles (#50)

* Revert "Removing old pytorch gpu model spec/dockerfiles (#50)" (#51)

This reverts commit d716b915cf20efc437467930e48a2f829a898f55.

* Removing old PyTorch GPU dockerfiles/specs that are for a specific precision (#52)

* Updates for the PyTorch IPEX GPU base container package (#53)

* Updates for base pytorch gpu container

* Update dockerfile name

* Update name of the agama sources file

* Update docker image names in the doc

* Switch back to intel-graphics-local.list

* Doc update

* Update env vars

* update to use l_dpcpp-cpp-compiler_p_2021.3.0.3168_offline.sh

* Update to use l_dpcpp-cpp-compiler_p_2021.3.0.3168_offline.sh

* README updates

* Doc update to make title match what users will see in IRC

* Adds inference and training container packages for PyTorch BERT large for GPU (#57)

* Add workload containers for PyTorch IPEX BERT large inference & training for GPU

* Update to clarify base build and update build script to check for the base

* Fixing package name

* update dockerfile to use latest mkl

* Make base image vars

* Update run.sh to use --group-add

* Regenerate dockerfiles

* Syntax fix

* Add Tensorflow base container (#60)

* update specs

* first working version

* updated build

* updated docs, build & spec

* doc update

* tabs -> spaces

* Adds inference and training container package for PyTorch DLRM for GPU (#58)

* rename specs

* add initial files

* Updated docs and add build.sh and run.sh

* Fix dockerfile name

* Regenerate dockerfile

* update pretrained model path

* Add new line at the end of build.sh files

* Adds inference and training container packages for PyTorch ResNet50v1.5 for GPU  (#61)

* renaming specs

* Generate dockerfiles

* Add documentation for the wrapper package

* Add build and run scripts and update spec for the wrapper package

* Use makedirs to create leaf folders

* syntax

* syntax

* Fixing broken links

* Removing --do_eval for bert large training (#67)

* Add Bert Large inference GPU package (#69)

* initial version

* updated docs

* rename spec & package name

* review changes

* fix broken link

* Add Bert Large training GPU container package (#72)

* initial version

* updated build & run scripts

* update docs

* Add ResNet50v1.5  GPU container packages (#75)

* initial commit

* added docs in spec

* wrapper package generation

* update docs

* add training

* updated docs

* parameterize docker args (#76)

* Update run.sh with docker args for PyTorch GPU container packages (#77)

* update docs (#78)

* RN50 training bug fix (#80)

* bug fix

* update batch size

* GPU Bert training container package fix (#84)

* initial working version

* added env parm

* dummy data generation integrated

* more update

* env fix

* updated docs

* Adding NDA TPP file (#87)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update PyTorch IPEX gpu wheel name (#88)

* Update PyTorch IPEX gpu wheel name

* Fix files in spec

* Add pre-trained models for gpu container packages (#123)

* added pretrained models to package

* update doc

* add pretrained models for rn50

* volume mount fix

* review changes

* Update pytorch for new wheel names (#128)

* Update PyTorch code from gpu-models b05e2161 (#129)

* Update pytorch for new wheel names

* Updated resnet50 images

* Fix formatting

* fix formatting in training file

* Update copyright year

* Updates for BERT from gpu-models

* Updated dlrm code file from gpu-models

* Add do eval for bert training

* Updated scripts and env var

* Remove UseVmBind and add EnableDirectSubmission=1 in setvars.sh

* Tensorflow - 2021.3.1 NDA release (#131)

* update itex binary name

* itex file name fix

* BKC changes

* debug changes

* increasing shm mem

* updates

* updated batch size

* logging more frequently

* rolling back some changes

* Add copyright to python and bash scripts files (#147) (#149)

* add copyright to files

* one more file

(cherry picked from commit 1635120e5c7b7e6d7fff44e666533d33a47a6445)

* compiler version change (#155)

* PyTorch PVC updates (#188)

* Add dockerfile with PVC env vars

* PVC dockerfile updates

* Removing pvc specific dockerfile

* Renaming ATS vars to PVC

* Updated gpu-models code (07854e5d09cc7f380355f8ca50ebe8bc9c09bf22)

* BERT large inference and training quickstart updates

* Update BERT train long analysis function parameters to add batch size

* DLRM updates

* Doc updates for the DLRM terabyte dataset

* README updates

* Update 'Ats' in message

* Fix ENV in partial

* Fix typo

* Fix typo

* Another typo :(

* Revert "PyTorch PVC updates (#188)" (#194)

This reverts commit 99f569ca09d6a7b333959d14fb5ce29df4e08077.

* PyTorch updates for PVC pre-alpha (0.2.2) (#195)

* Add dockerfile with PVC env vars

* PVC dockerfile updates

* Removing pvc specific dockerfile

* Renaming ATS vars to PVC

* Updated gpu-models code (07854e5d09cc7f380355f8ca50ebe8bc9c09bf22)

* BERT large inference and training quickstart updates

* Update BERT train long analysis function parameters to add batch size

* DLRM updates

* Doc updates for the DLRM terabyte dataset

* README updates

* Update 'Ats' in message

* Fix ENV in partial

* Fix typo

* Fix typo

* Another typo :(

* Update Torch CCL install

* Update list of quickstart scripts in the DLRM inference spec

* Updated weights file for DLRM inference

* Fix <package name> text replacement

* update compiler and oneMKL

* Update the base container README due to ipex import changes and remove --privileged

* Update base container README based on review feedback

* update DATASET_DIR for DLRM to remove 'day'

* The DLRM dataset /day paths were correct - putting them back in

* Updates for main_int8.py

* Add oneCCL

* Add build arg for CCL

* Add l_oneapi_ccl_p_2021.4.0.423_offline.sh to the package

* Switch to use base kit as an experiment

* Update dockerfile for basekit

* Make a separate spec for basekit for debug

* Torch CCL from source

* make torch-ccl directory relative

* Fixing path in spec

* Fix package path for torch_ccl

* Go back to using wheels for Torch CCL

* Fix dockerfile name for basekit build.sh

* fix typo

* Removing ENVs that were already defined

* Make sure ONEAPI_ROOT is getting set

* update image tag

* update to use new wheels

* pip updates to prevent dependency version warnings

* PVC alpha release - Tensorflow (#201)

* base container update

* updated env vars

* updated models

* added oneccl

* build script update

* fixed ccl installation

* update bkc

* training bkc update

* fixed bf16

* remove horovod whl install

* merge related fixes

* code review changes

* Add PyTorch PVC container package for SSD-ResNet34 Training  (#269)

* Add PyTorch PVC SSD-ResNet34 training spec, partial, docs, and code files

* Add git

* Moving partial to the ubuntu folder

* regenerate dockerfile with git install

* Reorder partials

* Add python3.6-dev

* Add python3.7-dev

* Removing precision as a requirement

* Fix path

* Make training.sh executable

* Add env var

* Update docs and add block/plain format

* fix filenames in spec

* Removing dockerfile that's not used

* Add info on the known issue for plain format

* Add note about the original repo

* TF PVC 3D-UNet and MASK R-CNN containers. (#271)

* initial working package & build

* update build & run scripts

* maskrcnn pkg generation

* scripts updates

* doc update

* more doc update

* doc updates

* docs update +

* 3d-unet working with basekit

* scripts updates

* basekit based models

* fix docs

* mixed precision script

* update scripts

* docs update

* review changes

* review changes 2

* Mask RCNN pre-alpha container (#276)

* changes based on feedback from model owner

* fix typo

* fix path

* fix docs links (#326)

* Add model package for PyTorch SSD-ResNet34 inference for ATS-P  (#339)

* Updates to add ssd-resnet34 inference

* update models path

* Update quickstart paths

* Doc and setup script updates

* Add install setuptools

* Doc update and model script updates

* Write dllogger to a different dir

* Update dllogger dir

* add models folder and update to use dllogger from pip

* Update doc

* Add new setvars for ATS-P

* No deps for torchvision install

* Doc updates

* Updated gpu-models code

* Removing container related files since those aren't tested yet

* Adding back note about original repo

* putting back PVC setvars.sh

* Removing JIRA links

* Update PyTorch ResNet50v1.5 inference and training for AI Kit 2022.1 GPU NDA (#344)

* PyTorch ResNet50v1.5 updates for AI Kit 2022.1

* Update versions in main spec

* Updates years in header

* Update PyTorch SSD-ResNet34 training for AI Kit 2022.1 GPU NDA GPU (#345)

* Updates for SSD-ResNet34 training for AI Kit 2022.1

* Removing JIRA links

* Update to note that the same conda env is used for both inference and training

* tf gpu 3d-unet (#347)

* Add 'Deep Learning Examples for Tensor Cores' to '3d-unet' model for TF

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix both single tile and multi tiles patches for 'UNet_3D_Medical'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Pre-apply the single tile patch to 'UNet_3D_Medical'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* update the docs and spec for 3d-unet GPU

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update doc per review

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Regenerate docs for 3d-unet

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Adding 'Intel' header to modified files

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Regen docs and remove checkpoints reference for 3d-unet

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* PyTorch GPU SSD-ResNet34 fixes (#350)

* Regenerate dockerfiles

* Don't have a dockerfile for this one yet

* PyTorch DLRM updates for AI Kit 2022.1 (#349)

* PyTorch DLRM updates for AI Kit 2022.1

* Update quickstarts

* Updates the PyTorch GPU BERT inference and training model packages for AI Kit 2022.1 (#348)

* BERT updates for PyTorch GPU

* Update doc to note sourcing setvars.sh

* add setup.sh to the specs

* Fix path

* Add inference models README

* Update to add README for bert training

* Adding rust

* Update setup script for training

* Updates from https://github.com/intel-innersource/frameworks.ai.pytorch.gpu-models/pull/129

* Update setup

* add data folder

* Fix path

* Removing transformers

* update dependencies

* Fix pip install

* Require the BERT_WEIGHT folder, since we can't write to the MODEL_DIR

* Add the PyTorch 3D UNet inference model package for AI Kit 2022.1 (#351)

* Add 3D-UNet for PyTorch GPU

* Removing wrapper package section for now

* doc updates and setup script fix

* Add matplotlib install

* Quickstart and doc updates

* Update to note that weights file will be downloaded by the setup script

* Doc and setup.sh script updates to set the BUILD_DIR

* Update for setting OUTPUT_DIR instead of PRETRAINED_MODEL dir

* Adding paths for the preprocess.py and the make mkdir_postprocessed_data BUILD_DIR

* More path updates for run.py

* Update pybind dir

* Update to loadgen dir

* Update to get loadgen 39 wheel

* update setup install

* README updates

* Update setvars.sh

* Update to pass build dir

* Removing loadgen and nnUnet, since those are now in artifactory

* Update setup.sh to move loadgen to a temp directory for install

* Removing dockerfile

* Updated TF GPU BKCs and docs for NDA release (#346)

* Updated NDA batch sizes and added pkg READMEs

* Generated docs

* Revert spec/doc changes for 3D U-Net and MaskRCNN

* Remove pretrained models from inference packages

* add rn50 bf16 inference (#352)

* PyTorch and TensorFlow fixes for the AI Kit 2022.1 NDA release (#362)

* BERT fixes for writing to the model dir

* Fix README references to ImageNet in the SSD-ResNet34 docs (should be COCO)

* Write BERT training data to OUTPUT_DIR

* Adds pip package dependency for TF 3D U-Net

Co-authored-by: Melanie H Buehler <[email protected]>

* PyTorch 2022.1 GPU NDA container package update (#370)

* Updates to the base container for 2022.1 pytorch

* Update PyTorch base dockerfiles with distutils (due to error with Python 3.9)

* Updates after testing

* Fix export to ENV

* Updating basekit filenames

* Add the PyTorch SSD-ResNet34 Inference container package for the 2022.1 GPU release (#373)

* Updates to the base container for 2022.1 pytorch

* Update PyTorch base dockerfiles with distutils (due to error with Python 3.9)

* Updates after testing

* Fix export to ENV

* Container updates for SSD-ResNet34 inference

* Fix to separate installs

* Update docs and run.sh with the PRETRAINED_MODEL env var

* Container updates for SSD-ResNet34 inference

* Fix to separate installs

* Update docs and run.sh with the PRETRAINED_MODEL env var

* Add the PyTorch 3D UNet container package for the 2022.1 GPU release (#376)

* Add dockerfile, docs, and dataset preprocessing script for the container package

* Dockerfile update

* Fixing missing env var

* Add clang install

* Fixes for preprocessing

* Doc updates and remove need for extra DATASET_DIR for inference since the preprocessed dataset is in the OUTPUT_DIR

* Add matplotlib to the dockerfile

* updates based on review comment

* Update the PyTorch DLRM inference container package to include pretrained weights (#381)

* Add back the pretrained model

* Fix link

* Update the TensorFlow and PyTorch base container documentation to include link to the driver (#378)

* Update READMEs to link the driver

* Note ATS-P

* Updating PyTorch BERT partials for the 2022.1 GPU release (#379)

* Fix bert path

* Update BERT inference partial

* Fixing line

* fix bert training installs

* tf 2022.1 nda gpu base container cleanup (#384)

* 2022.1 NDA base TF GPU container package update

* Added pre-trained models back to inference specs

* Update for new ITEX and TensorFlow wheels

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Download.md cleanup

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Regen Dockerfiles

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* add -p to the mkdir

* Fix incorrect 'BaseKit' version name

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* include wheels and basekit for 3dunet and fix build args

Co-authored-by: Melanie H Buehler <[email protected]>
Co-authored-by: Dina Suehiro Jones <[email protected]>

* take 1 (#392)

* take 1

* correct file tree

* remove third-party filenames

* Syncing up the doc fragment with the README update for DLRM inference (#414)

* CentOS, Debian, RedHat and SLES support for GPU (#418)

* Add support for CentOS 7 and Debian 10, 11 (#391)

* Add support for CentOS 7 and Debian 10, 11

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Replace 'dnf' with 'yum' for CentOS 7 compatibility

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* remove commented line

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add the Yum repo fix for 'CentOS 8'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Making Platform and OS check more portable (#393)

* Making Platform and OS check more portable

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix a minor syntax error

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Adding support for RedHat 7 and 8 (#394)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Finalize Red Hat and CentOS 7, 8 support (#398)

* Minor fix for Red Hat support

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Improve OS version checking

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Introduce devtoolset-7 for CentOS and Red Hat 7

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* minor regex fix

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* yum install consistency

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Adding support for SLES 15 (#399)

* Adding support for SLES 15.03

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Improve SLES version check regex

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix a minor typo in OS name

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Improve OS version checking (#401)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* PyTorch GPU updates to support both PVC and ATS (#416)

* Add ATS-P vs PVC args and conditionals

* Doc updates

* Updated PVC batch size for BERT large FP32 training

* Add env var

* Add pci utils to the pytorch base and try out new setvars with 3dunet

* update 3dunet spec setup.sh

* ResNet50v1.5 inf update

* Revert README changes

* Update quickstart scripts and specs

* Remove ATS and PVC specific setvars.sh

* Remove DEVICE env from run.sh

* Remove DEVICE

* Remove the 'downloads' for dlrm

* Update requirements to mention lspci and apt/yum

* Run accuracy testing first for 3d unet

* Doc updates

* TF AI Kit 2022.1.1 NDA updates for PVC (#421)

* PVC vs. ATS detection for TF model packages

* Small update to RN50 BF16 inference BKC

* Adds pciutils requirement to documentation

* Adds pciutils partial

* AI Kit 2022.1.1 NDA remove TF pretrained models (#424)

* Remove pretrained models and fix RN50 bs

* Fixed BERT Large bf16 training bs

* PyTorch container package updates for 2022.1.1 GPU NDA (#427)

* PyTorch container package updates for 2022.1.1 GPU NDA

* update to basekit 140

* TF container package updates for 2022.1.1 GPU NDA (#428)

* TF container package updates for 2022.1.1 GPU NDA

* Fix merge conflict

* GPU Containers - Mount basekit from host machine (#438)

* removed basekit installation

* updated tf basekit build script

* updated docker file

* doc update and minor fixes

* pytorch changes

* doc updates

* GPU workload containers - use basekit from host machine (#439)

* tf change to use basekit on host machine

* changes for pytorch workload container to use basekit from host machine

* Add /opt/intel/oneapi check and volume mount for the PyTorch 3D UNet dataset preprocessing run script

* fixed error

* update python path

* fix error

* removed ats specific envs

Co-authored-by: Dina Suehiro Jones <[email protected]>

* updated product and agama versions in tool container README; added main README for Container Packages (#447)

* GPU Mask RCNN training package (#454)

* Initial commit for MaskRCNN training model package

* Removed var and regenerate README

* Remove arg & update build.sh

* Remove build args for basekit and components

* Updated specs, partials, dockerfiles

* Fixed base tag args and pip install

* Corrected patch and model files

* Fixed dockerfile, quickstart script, and docs

* Added requirement and removed unnecessary args

* Remove unnecessary files

* Add Intel licence headers

* bug fix for aizoo-708 (#477)

* update README for missing links (#501)

* update README for missing links

* Update README.md

* Update README.md

* Update PyTorch GPU model links and removed unused files (#514)

* Remove old files

* Update list of PyTorch GPU models

* Adds a quickstart script for ResNet50 inference with synthetic data for PyTorch GPU (#522)

* Adds PyTorch ResNet50 inference GPU script that uses synthetic data

* Updated scripts from gpu-models master (75b09b19ed597b4e70fc065a6d68be94406221b3) to get support for dummy data

* Update to put import back to

* update tools docker file linux base to 20.04

* Add dataset dir for --dummy script

* Update PyTorch GPU ResNet50v1.5 synthetic data inference script to allow adjusting the number of iterations run (#545)

* add --num-iterations

* make num iterations a env var

* Update documentation to note number of iterations for synthetic data runs

* Updated wheels for the IPEX base container (#692)

Co-authored-by: msalopan <[email protected]>

* Add ITEX ATS-M whl updates (#696)

* made changes for ITEX ATS-M

* indentation changes

* Update Resnet50v1.5 (#684)

* Update Resnet50v1.5

* Adjust format and restore file

* ATS-M TF changes (#699)

* add benchmark mode for tensorflow ssd

* add resnet50 benchmark mode

* add rn50

* modify rn50 files

* fixing tengfei PR

* fixing incorrect folder changes

* added licences header

* fixed year

Co-authored-by: Tengfei, Han <[email protected]>

* merge TF base container based on new RC1 whl packages (#700)

* ssd-mobilenet tf gpu spec

* build…
Showing 834 changed files with 51,738 additions and 8,181 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -17,3 +17,4 @@ tools/docker/models*
.ipynb_checkpoints
nc_workspace
benchmarks/horovod
data_connector/credentials.json
8 changes: 5 additions & 3 deletions CODEOWNERS
@@ -3,12 +3,14 @@

# These owners will be the default owners for everything in the repo,
# but PR owner should be able to assign other contributors when appropriate
* @ashahba @claynerobison @dmsuehir
* [email protected] @ashahba @claynerobison
datasets @ashahba @claynerobison @dzungductran
docs @claynerobison @mhbuehler
k8s @ashahba @dzungductran @kkasravi
models @agramesh1 @ashraf-bhuiyan @riverliuintel @wei-v-wang
k8s @ashahba @dzungductran
models @ashraf-bhuiyan @riverliuintel
models @riverliuintel
models/**/pytorch/ @leslie-fang-intel @jiayisunx @zhuhaozhe
quickstart [email protected]
quickstart/**/pytorch/ @leslie-fang-intel @jiayisunx @zhuhaozhe

# Order is important. The last matching pattern has the most precedence.
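
The precedence rule in the comment above can be illustrated with a minimal Python sketch. It is not GitHub's matcher: `pattern_to_regex`, `owners_for`, `RULES`, and the owner handles are hypothetical names introduced here, and the matching is a simplification that anchors every pattern at the repository root and handles only the pattern shapes that appear in this file (`*`, a bare directory name, and `dir/**/name/`).

```python
import re
from typing import List, Tuple

Rule = Tuple[str, List[str]]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified CODEOWNERS-style pattern into a regex.

    Assumption: patterns are anchored at the repository root, which is
    stricter than the real gitignore-style matching rules.
    """
    if pattern == "*":
        return re.compile(r".*")
    parts = []
    for segment in pattern.strip("/").split("/"):
        if segment == "**":
            # '**' stands for zero or more intermediate directories
            parts.append(r"(?:[^/]+/)*")
        else:
            # '*' inside a segment matches within one path component
            parts.append(re.escape(segment).replace(r"\*", "[^/]*") + "/")
    regex = "".join(parts)
    if regex.endswith("/"):
        regex = regex[:-1]
    # A directory pattern also covers everything beneath that directory
    return re.compile(regex + r"(?:/.*)?")


def owners_for(path: str, rules: List[Rule]) -> List[str]:
    """Return the owners of the LAST rule whose pattern matches path."""
    owners: List[str] = []
    for pattern, rule_owners in rules:
        if pattern_to_regex(pattern).fullmatch(path):
            owners = rule_owners  # later matches overwrite earlier ones
    return owners


# Hypothetical owners; the rule order mirrors the layout of the file above.
RULES: List[Rule] = [
    ("*", ["@default-owner"]),
    ("docs", ["@docs-owner"]),
    ("models", ["@models-owner"]),
    ("models/**/pytorch/", ["@pytorch-owner"]),
]

if __name__ == "__main__":
    for sample in (
        "README.md",
        "docs/general/setup.md",
        "models/image_recognition/tensorflow/resnet50/model.py",
        "models/image_recognition/pytorch/resnet50/main.py",
    ):
        print(sample, "->", owners_for(sample, RULES))
```

Because the last match wins, the more specific `models/**/pytorch/` entry has to come after the general `models` entry; swapping the two would hand every PyTorch path to the general owner.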
58 changes: 42 additions & 16 deletions README.md

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions benchmarks/README.md
@@ -31,13 +31,13 @@ For information on running more advanced use cases using the workload containers
| Image Segmentation | [3D U-Net MLPerf*](https://arxiv.org/pdf/1606.06650.pdf) | Inference | | [FP32 BFloat16 Int8](image_segmentation/tensorflow/3d_unet_mlperf/inference/README.md) | [BRATS 2019](https://www.med.upenn.edu/cbica/brats2019/data.html) |
| Image Segmentation | [MaskRCNN](https://arxiv.org/abs/1703.06870) | Inference | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/mask-rcnn-fp32-inference-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/mask-rcnn-fp32-inference-tensorflow-model.html) | [FP32](image_segmentation/tensorflow/maskrcnn/inference/fp32/README.md) | [MS COCO 2014](https://github.com/IntelAI/models/tree/master/benchmarks/image_segmentation/tensorflow/maskrcnn/inference/fp32#datasets-and-pretrained-model) |
| Image Segmentation | [UNet](https://arxiv.org/pdf/1606.06650.pdf) | Inference | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/unet-fp32-inference-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/unet-fp32-inference-tensorflow-model.html) | [FP32](image_segmentation/tensorflow/unet/inference/fp32/README.md) |
| Language Modeling | [BERT](https://arxiv.org/pdf/1810.04805.pdf) | Inference | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-fp32-inference-tensorflow-container.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-bfloat16-inference-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-fp32-inference-tensorflow-model.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-bfloat16-inference-tensorflow-model.html) | [FP32](language_modeling/tensorflow/bert_large/inference/fp32/README.md) [BFloat16](language_modeling/tensorflow/bert_large/inference/bfloat16/README.md) | [SQuAD](https://github.com/IntelAI/models/tree/master/datasets/bert_data/README.md#inference) |
| Language Modeling | [BERT](https://arxiv.org/pdf/1810.04805.pdf) | Training | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-fp32-training-tensorflow-container.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-bfloat16-training-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-fp32-training-tensorflow-model.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-bfloat16-training-tensorflow-model.html) | [FP32](language_modeling/tensorflow/bert_large/training/fp32/README.md) [BFloat16](language_modeling/tensorflow/bert_large/training/bfloat16/README.md) | [SQuAD](https://github.com/IntelAI/models/tree/master/datasets/bert_data/README.md#fine-tuning-with-bert-using-squad-data) and [MRPC](https://github.com/IntelAI/models/tree/master/datasets/bert_data/README.md#classification-training-with-bert) |
| Language Modeling | [BERT](https://arxiv.org/pdf/1810.04805.pdf) | Inference | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-fp32-inference-tensorflow-container.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-bfloat16-inference-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-fp32-inference-tensorflow-model.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-bfloat16-inference-tensorflow-model.html) | [Int8](language_modeling/tensorflow/bert_large/inference/int8/README.md) [FP32](language_modeling/tensorflow/bert_large/inference/fp32/README.md) [BFloat16](language_modeling/tensorflow/bert_large/inference/bfloat16/README.md) | [SQuAD](https://github.com/IntelAI/models/tree/master/datasets/bert_data/README.md#inference) |
| Language Modeling | [BERT](https://arxiv.org/pdf/1810.04805.pdf) | Training | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-fp32-training-tensorflow-container.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-bfloat16-training-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-fp32-training-tensorflow-model.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/bert-large-bfloat16-training-tensorflow-model.html) | [FP32](language_modeling/tensorflow/bert_large/training/fp32/Advanced.md) [BFloat16](language_modeling/tensorflow/bert_large/training/bfloat16/Advanced.md) [FP16](language_modeling/tensorflow/bert_large/training/fp16/Advanced.md) | [SQuAD](https://github.com/IntelAI/models/tree/master/datasets/bert_data/README.md#fine-tuning-with-bert-using-squad-data) and [MRPC](https://github.com/IntelAI/models/tree/master/datasets/bert_data/README.md#classification-training-with-bert) |
| Language Modeling | [distilBERT](https://arxiv.org/abs/1910.01108) | Inference | Model Containers: | [FP32 BFloat16](https://github.com/IntelAI/models/benchmarks/language_modeling/tensorflow/distilbert_base/inference/README.md) | [SST-2](https://huggingface.co/datasets/sst2) |
| Language Translation | [BERT](https://arxiv.org/pdf/1810.04805.pdf) | Inference | | [FP32](language_translation/tensorflow/bert/inference/README.md) | [MRPC](https://github.com/IntelAI/models/tree/master/datasets/bert_data/README.md#classification-training-with-bert) |
| Language Translation | [GNMT*](https://arxiv.org/pdf/1609.08144.pdf) | Inference | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/gnmt-fp32-inference-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/gnmt-fp32-inference-tensorflow-model.html) | [FP32](language_translation/tensorflow/mlperf_gnmt/inference/README.md) | [MLPerf GNMT model benchmarking dataset](https://github.com/IntelAI/models/tree/master/benchmarks/language_translation/tensorflow/mlperf_gnmt/inference/fp32#datasets) |
| Language Translation | [Transformer_LT_mlperf*](https://arxiv.org/pdf/1706.03762.pdf) | Training | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-mlperf-fp32-training-tensorflow-container.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-mlperf-bfloat16-training-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-mlperf-fp32-training-tensorflow-model.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-mlperf-bfloat16-training-tensorflow-model.html) | [FP32](language_translation/tensorflow/transformer_mlperf/training/fp32/README.md) [BFloat16](language_translation/tensorflow/transformer_mlperf/training/bfloat16/README.md) | [WMT English-German dataset](https://github.com/IntelAI/models/tree/master/datasets/transformer_data#transformer-language-mlperf-dataset) |
| Language Translation | [Transformer_LT_mlperf*](https://arxiv.org/pdf/1706.03762.pdf) | Inference | | [FP32](language_translation/tensorflow/transformer_mlperf/inference/fp32/README.md) [BFloat16](language_translation/tensorflow/transformer_mlperf/inference/bfloat16/README.md) [Int8](language_translation/tensorflow/transformer_mlperf/inference/int8/README.md) | [WMT English-German data](https://github.com/IntelAI/models/tree/master/datasets/transformer_data#transformer-language-mlperf-dataset) |
| Language Translation | [Transformer_LT_mlperf*](https://arxiv.org/pdf/1706.03762.pdf) | Training | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-mlperf-fp32-training-tensorflow-container.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-mlperf-bfloat16-training-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-mlperf-fp32-training-tensorflow-model.html) [BFloat16](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-mlperf-bfloat16-training-tensorflow-model.html) | [FP32 BFloat16](language_translation/tensorflow/transformer_mlperf/training/README.md) | [WMT English-German dataset](https://github.com/IntelAI/models/tree/master/datasets/transformer_data#transformer-language-mlperf-dataset) |
| Language Translation | [Transformer_LT_mlperf*](https://arxiv.org/pdf/1706.03762.pdf) | Inference | | [FP32 BFloat16 Int8](language_translation/tensorflow/transformer_mlperf/inference/README.md) | [WMT English-German data](https://github.com/IntelAI/models/tree/master/datasets/transformer_data#transformer-language-mlperf-dataset) |
| Language Translation | [Transformer_LT_Official](https://arxiv.org/pdf/1706.03762.pdf) | Inference | Model Containers: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-official-fp32-inference-tensorflow-container.html) <br> Model Packages: [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/transformer-lt-official-fp32-inference-tensorflow-model.html) | [FP32](language_translation/tensorflow/transformer_lt_official/inference/README.md) | [WMT English-German dataset](https://github.com/IntelAI/models/tree/master/datasets/transformer_data#transformer-language-mlperf-dataset) |
| Object Detection | [Faster R-CNN](https://arxiv.org/pdf/1506.01497.pdf) | Inference | Model Containers: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/faster-rcnn-int8-inference-tensorflow-container.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/faster-rcnn-fp32-inference-tensorflow-container.html) <br> Model Packages: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/faster-rcnn-int8-inference-tensorflow-model.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/faster-rcnn-fp32-inference-tensorflow-model.html) | [Int8](object_detection/tensorflow/faster_rcnn/inference/int8/README.md) [FP32](object_detection/tensorflow/faster_rcnn/inference/fp32/README.md) | [COCO 2017 validation dataset](https://github.com/IntelAI/models/tree/master/datasets/coco#download-and-preprocess-the-coco-validation-images) |
| Object Detection | [R-FCN](https://arxiv.org/pdf/1605.06409.pdf) | Inference | Model Containers: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/rfcn-int8-inference-tensorflow-container.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/rfcn-fp32-inference-tensorflow-container.html) <br> Model Packages: [Int8](https://software.intel.com/content/www/us/en/develop/articles/containers/rfcn-int8-inference-tensorflow-model.html) [FP32](https://software.intel.com/content/www/us/en/develop/articles/containers/rfcn-fp32-inference-tensorflow-model.html) | [Int8 FP32](object_detection/tensorflow/rfcn/inference/README.md) | [COCO 2017 validation dataset](https://github.com/IntelAI/models/tree/master/datasets/coco#download-and-preprocess-the-coco-validation-images) |
13 changes: 10 additions & 3 deletions benchmarks/common/base_benchmark_util.py
@@ -1,7 +1,7 @@
#
# -*- coding: utf-8 -*-
#
# Copyright (c) 2018-2021 Intel Corporation
# Copyright (c) 2018-2023 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -68,8 +68,8 @@ def _define_args(self):

self._common_arg_parser.add_argument(
"-p", "--precision",
help="Specify the model precision to use: fp32, fp16, int8, or bfloat16",
required=required_arg, choices=["fp32", "fp16", "int8", "bfloat16"],
help="Specify the model precision to use: fp32, int8, bfloat16 or fp16",
required=required_arg, choices=["fp32", "int8", "bfloat16", "fp16"],
dest="precision")

self._common_arg_parser.add_argument(
@@ -228,6 +228,13 @@ def _define_args(self):
dest="experimental_gelu", choices=["True", "False"],
default=False)

self._common_arg_parser.add_argument(
"--amp",
help="use grappler auto-mixed precision as opposed to \
keras mixed precision",
dest="amp", choices=["True", "False"],
default=False)

# Note this can't be a normal boolean flag, because we need to know when the user
# does not explicitly set the arg value so that we can apply the appropriate
# default value, depending on the precision.
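
The reasoning in the note above (string choices instead of a boolean flag, so that "not set" can be told apart from an explicit value) can be sketched like this. It is an illustration under stated assumptions: `--example-flag`, `build_parser`, and `resolve_flag` are hypothetical names, and the precision-dependent default rule is invented for the example, not taken from `base_benchmark_util.py`.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--precision", required=True,
                        choices=["fp32", "int8", "bfloat16", "fp16"])
    # default=None means "the user did not pass the flag at all".
    parser.add_argument("--example-flag", dest="example_flag",
                        choices=["True", "False"], default=None)
    return parser


def resolve_flag(args):
    """Turn the tri-state string value into a real bool."""
    if args.example_flag is None:
        # Hypothetical rule: enable by default only for int8.
        return args.precision == "int8"
    # Compare explicitly: the string "False" is truthy in Python.
    return args.example_flag == "True"
```

With `store_true`, an omitted flag and an explicit "False" would be indistinguishable; keeping the raw string (or `None`) defers the decision until the precision is known.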
64 changes: 52 additions & 12 deletions benchmarks/common/tensorflow/start.sh
@@ -1,6 +1,6 @@
#!/usr/bin/env bash
#
# Copyright (c) 2018-2019 Intel Corporation
# Copyright (c) 2018-2023 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -160,7 +160,7 @@ if [[ ${NOINSTALL} != "True" ]]; then
export HOROVOD_WITHOUT_PYTORCH=1
export HOROVOD_WITHOUT_MXNET=1
export HOROVOD_WITH_TENSORFLOW=1
export HOROVOD_VERSION=35b27e9
export HOROVOD_VERSION=b1d0ce8

# Install GCC 7 from devtoolset-7
if [[ ${OS_VERSION} =~ "7".* ]]; then
@@ -239,7 +239,16 @@ if [[ ${NOINSTALL} != "True" ]]; then
# In case installing released versions of Horovod fails, and there is
# a working commit, replace the next set of commands with something like:
apt-get install -y --no-install-recommends --fix-missing cmake git
python3 -m pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION}
# TODO: Once this PR https://github.com/horovod/horovod/pull/3864 is merged, we can install horovod as before.
# python3 -m pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION}
git clone https://github.com/horovod/horovod.git
cd horovod
git reset --hard ${HOROVOD_VERSION}
git submodule update --init --recursive
git fetch origin pull/3864/head:ashahba/issue-3861-fix
git checkout ashahba/issue-3861-fix
python3 -m pip install --no-cache-dir -v -e .

horovodrun --check-build
fi
fi
@@ -559,6 +568,9 @@ function bert_options() {
if [[ -n "${OPTIMIZED_SOFTMAX}" && ${OPTIMIZED_SOFTMAX} != "" ]]; then
CMD=" ${CMD} --optimized-softmax=${OPTIMIZED_SOFTMAX}"
fi
if [[ -n "${AMP}" && ${AMP} != "" ]]; then
CMD=" ${CMD} --amp=${AMP}"
fi

if [[ -n "${MPI_WORKERS_SYNC_GRADIENTS}" && ${MPI_WORKERS_SYNC_GRADIENTS} != "" ]]; then
CMD=" ${CMD} --mpi_workers_sync_gradients=${MPI_WORKERS_SYNC_GRADIENTS}"
@@ -991,7 +1003,11 @@ function resnet101_inceptionv3() {
function resnet50() {
export PYTHONPATH=${PYTHONPATH}:$(pwd):${MOUNT_BENCHMARK}
is_model_gpu_supported="True"

if [ ${GPU} == "True" ]; then
PYTHONPATH=${PYTHONPATH}:${MOUNT_INTELAI_MODELS_SOURCE}/${MODE}/gpu
else
PYTHONPATH=${PYTHONPATH}:${MOUNT_INTELAI_MODELS_SOURCE}/${MODE}/cpu
fi
# For accuracy, dataset location is required.
if [ "${DATASET_LOCATION_VOL}" == "None" ] && [ ${ACCURACY_ONLY} == "True" ]; then
echo "No Data directory specified, accuracy will not be calculated."
@@ -1002,10 +1018,6 @@ function resnet50() {
CMD="${CMD} $(add_steps_args) $(add_calibration_arg)"
PYTHONPATH=${PYTHONPATH} CMD=${CMD} run_model
elif [ ${PRECISION} == "fp32" ] || [ ${PRECISION} == "bfloat16" ] || [ ${PRECISION} == "fp16" ]; then
if [ ${PRECISION} == "fp16" ] && [ ${GPU} == "False" ]; then
echo "PRECISION=${PRECISION} is not supported without --gpu."
exit 1
fi
CMD="${CMD} $(add_steps_args)"
PYTHONPATH=${PYTHONPATH} CMD=${CMD} run_model
else
@@ -1187,7 +1199,7 @@ function ssd-resnet34() {
$(add_arg "--num_warmup_batches" ${NUM_WARMUP_BATCHES})"
local old_pythonpath=${PYTHONPATH}
export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}
export PYTHONPATH=${PYTHONPATH}:${TF_MODELS_DIR}:${TF_MODELS_DIR}/research:"/tmp/benchmark_ssd_resnet34/scripts/tf_cnn_benchmarks"
export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}/research:"/tmp/benchmark_ssd_resnet34/scripts/tf_cnn_benchmarks"
CMD=${CMD} run_model
PYTHONPATH=${old_pythonpath}
else
@@ -1466,8 +1478,7 @@ function bert_large() {
bert_options
CMD=${CMD} run_model
else
# Change if to support fp32
if [ ${PRECISION} == "fp32" ] || [ ${PRECISION} == "int8" ] || [ ${PRECISION} == "bfloat16" ]; then
if [ ${PRECISION} == "fp32" ] || [ $PRECISION == "int8" ] || [ $PRECISION == "bfloat16" ] || [ $PRECISION == "fp16" ]; then
export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}
bert_options
CMD=${CMD} run_model
@@ -1507,6 +1518,35 @@ function distilbert_base() {
fi
}

# distilBERT base model
function distilbert_base() {
  if ([ ${PRECISION} == "fp32" ] || [ ${PRECISION} == "bfloat16" ] ||
      [ ${PRECISION} == "int8" ] || [ ${PRECISION} == "fp16" ]); then
    export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}
    CMD="${CMD} $(add_arg "--warmup-steps" ${WARMUP_STEPS})"
    CMD="${CMD} $(add_arg "--steps" ${STEPS})"

    if [ ${NUM_INTER_THREADS} != "None" ]; then
      CMD="${CMD} $(add_arg "--num-inter-threads" ${NUM_INTER_THREADS})"
    fi

    if [ ${NUM_INTRA_THREADS} != "None" ]; then
      CMD="${CMD} $(add_arg "--num-intra-threads" ${NUM_INTRA_THREADS})"
    fi

    if [ -z ${STEPS} ]; then
      CMD="${CMD} $(add_arg "--steps" ${STEPS})"
    fi

    if [ -z $MAX_SEQ_LENGTH ]; then
      CMD="${CMD} $(add_arg "--max-seq-length" ${MAX_SEQ_LENGTH})"
    fi
    CMD=${CMD} run_model
  else
    echo "PRECISION=${PRECISION} not supported for ${MODEL_NAME} in this repo."
    exit 1
  fi
}
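
The shape of `distilbert_base` above (gate on a supported precision, then append each optional argument only when a value is present) can be sketched in Python as follows. This is a rough illustration rather than the script's actual logic: `build_cmd`, `SUPPORTED_PRECISIONS`, and the base command are hypothetical. Note that in the shell version the `-z` tests succeed when the variable is empty, so the sketch assumes the intent is to add a flag only when its value is set.

```python
SUPPORTED_PRECISIONS = {"fp32", "bfloat16", "int8", "fp16"}


def build_cmd(base_cmd, precision, options):
    """Append only the options that carry a usable value."""
    if precision not in SUPPORTED_PRECISIONS:
        raise ValueError(f"PRECISION={precision} not supported")
    cmd = list(base_cmd)
    for flag, value in options.items():
        # Treat None, "" and the string "None" as "not provided".
        if value not in (None, "", "None"):
            cmd += [flag, str(value)]
    return cmd
```

Building the command as a list avoids the quoting problems that come with concatenating into a single `CMD` string.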

# Wide & Deep model
function wide_deep() {
@@ -1650,7 +1690,7 @@ elif [ ${MODEL_NAME} == "bert_large" ]; then
elif [ ${MODEL_NAME} == "dien" ]; then
dien
elif [ ${MODEL_NAME} == "distilbert_base" ]; then
distilbert_base
distilbert_base
else
echo "Unsupported model: ${MODEL_NAME}"
exit 1

0 comments on commit 5d29317
