Import GPU Max and Flex Series workloads from develop-gpu (#1080) · intel/ai-reference-models@98e46e6

Import GPU Max and Flex Series workloads from develop-gpu (#1080)

* Add GPU DLRM FP16 inference

* Change to install ATS drivers from local repo

* Add GPU PYT bert large FP16 Inference

* fix _FusedMatMul issue in GPU

* Updated PyTorch to use the common compiler partial and added ARG for the env var file since that changes per compiler

* Add package for ResNet 50 v1.5 int8 Inference pytorch gpu

* Update specs & build files for alpha2 rc1 whls

* Add ResNet50 v1.5 bf16 Training PYT GPU

* Add wrapper package for TF GPU tool container

* Update TF GPU training packages to use alpha2-rc1

* Update IPEX tools container and resnet50v1.5 models for alpha2 rc1

* Update PYT Bert LG and DLRM FP16 inference alpha2-rc1

* Update tf-gpu branch for ww15 dpcpp compiler

* Set ITEX_ENABLE_ONEDNN_LAYOUT_OPT=0 for bert training

* Add section to validate base container, fix dlrm printed statement

* Update the docs for alpha2-rc2 models

* fix ipex tool container readme

* Fix dlrm print using CPU statement to be XPU

* add 1t env vars

* Use add instead of addn

* Update bert large docs to be specific about which pretrained model to use

* Sync with develop

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update the main benchmarks README for gpu models

* Set ITEX_ENABLE_ONEDNN_LAYOUT_OPT=0 in ResNet50v1.5 bf16 training quickstart scripts

* Revert "tmp fix res50v1_5 int8"

This reverts commit 3c120e0bee3a576ee1548d9258b611a889897ee6

* Updates to match batch sizes in docs and updated pb links

* Updating compiler binary

* Update PYT GPU packages for IPEX alpha2 rc6

* rfcn-fp32-inference-k8s package

Signed-off-by: Kam D Kasravi <[email protected]>

* Update GPU specs to make the docs section a list and update TF training docs for DevCloud

* Doc updates for ResNet50v1.5 and BERT large training for GPU

* tf-gpu doc updates

* Fix the BKC and environment for resnet50v1.5 INT8, bert-large and resnet50v1.5 BF16 training

* Update GPU PYT packages to have 2 READMEs

* Remove duplicate license from package

* AI Kit Model Package README

* Clean up PYT model pkgs and update baremetal docs

* Fix GPU tests (#5)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Sync with 'develop' and resolve conflicts (#3)

* Update README.md for IPS 00513014 and 00514541

* Enable remapper pass in densenet169 execution

* Adds protoc and pycocotools dependencies

* K8s packages tests: Checks if username has underscore before creating a namespace
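A minimal sketch of the kind of check this item describes, assuming the namespace is derived directly from the username. Kubernetes namespace names must be valid DNS-1123 labels (lowercase alphanumerics and '-', at most 63 characters), so a name containing an underscore is rejected. The helper name `is_valid_namespace` is hypothetical and is not taken from the package tests.

```python
import re

# DNS-1123 label: lowercase alphanumerics or '-', starting and ending with
# an alphanumeric character, at most 63 characters long.
_DNS1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_valid_namespace(name: str) -> bool:
    """Return True if `name` could be used as a Kubernetes namespace name."""
    return len(name) <= 63 and _DNS1123_LABEL.fullmatch(name) is not None
```

A test could call this on the username first and skip namespace creation (or substitute another name) when it returns False.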

* Fix and simplify serving k8s package path variables

* Upgrade to 'TensorFlow Serving 2.4.0'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* rfcn-fp32-inference-k8s package

Signed-off-by: Kam D Kasravi <[email protected]>

* Quickstart updates for using synthetic data or real data, except SSD-ResNet batch will always use synthetic

* Add Centos8 partials for SPR TF models

* Fix the URL for 'oneAPI-samples' repo

* snapshot

Signed-off-by: Kam D Kasravi <[email protected]>

* Add a copy of existing pytorch ipex icx centos specs to specs/centos

* Fix High vulnerability issues reported by SNYK

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Setting OMP_NUM_THREADS based on num_intra_threads
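A rough sketch of what this item could look like, assuming `num_intra_threads` is an optional integer argument and that a value already exported by the user should be left alone. Both the function name `set_omp_num_threads` and the "do not override" behavior are assumptions for illustration, not the launch script's actual code.

```python
import os
from typing import MutableMapping, Optional


def set_omp_num_threads(
    num_intra_threads: Optional[int],
    env: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Export OMP_NUM_THREADS from num_intra_threads if it is not already set."""
    # Default to the real process environment; a plain dict can be passed
    # instead, which keeps the helper easy to test.
    env = os.environ if env is None else env
    if num_intra_threads and "OMP_NUM_THREADS" not in env:
        # Environment variable values must be strings.
        env["OMP_NUM_THREADS"] = str(num_intra_threads)
```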

* Weekly SNYK fixes

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fixes broken links in the Launch Benchmarks documentation

* Fix '3d-unet' docker image links

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix Python and TensorFlow Pip package versions for TF v1.15.2

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Adding a minor fix to dynamically calculate the number of remaining images as steps provided x batch size.

Currently the maximum number of steps that RN50 inference supports is 5000 / batch size. The 50k hard limit prevents us from performing long inference runs for platform analysis, hence this fix.

This will enable us to collect telemetry data (such as emon) for a longer duration (such as 5 minutes).

Signed-off-by: Rajendrakumar Chinnaiyan <[email protected]>
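The calculation described in this item can be sketched as follows. This is only an illustration of "steps provided x batch size"; the function name `images_to_process` is hypothetical, and how the real script supplies that many images (for example, by looping over the dataset) is not shown here.

```python
def images_to_process(steps: int, batch_size: int) -> int:
    """Number of images an inference run consumes: steps x batch size."""
    if steps <= 0 or batch_size <= 0:
        raise ValueError("steps and batch_size must be positive")
    # Derive the image count from the requested steps instead of capping it
    # at a hard-coded dataset size.
    return steps * batch_size
```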

* Remove unused 'num_cores' from 'rfcn'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Upgrade to 'Pillow>=8.1.2'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Compatibility fixes for automation

* Parameterized model name in resnet50v1.5 serving script
* Increase timeout and modify output
* Adjusts inceptionv3 client input and output

* fix mpi operator cluster scope issue

* Fixes SSD-MobileNet perf comparison by pre-installing numpy with --no-binary

* Enable more models for Perf Analysis notebooks and add auto testing for notebooks

* Update quickstart bare metal documentation to use ./quickstart/<script>.sh

* Fix lints tests for rfcn

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add support for SSD-ResNet34 BF16 inference

* Updating benchmarks table with 'SSD ResNet34 BFloat16'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* modifying requirements.txt in SSDRN34 to use tensorflow add-ons of any version greater than or equal to 0.11.0

* Moving quickstart files to their proper directories and bats test fix

* Update specs and assembler.py to make the documentation section a list

* Fix error in BF16 accuracy test for SSD-ResNet34 with input size of 1200

* Shwetaoj/horovod version

* Fix pip install commands for Python3 and 'numpy' version

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Updated README file for transformer_mlperf model, fixed the section links, and added the instructions to run transformer model for both fp32 and bfloat16 inference

* Update BERT large docs to separate out "advanced" and allow for using quickstart scripts when cloning the repo

* Adding DIEN model to modelzoo for inference (fp32 and bfloat16)

* Fixing data format issue for SSD_RN34 and Resnet50 training models

* Replaced existing mlperf transformer LT  bfloat16 training model with a converged model, multi-node support is kept

* Fix for accuracy flag

* Fix some styles for recently merged 'DIEN' model

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Set 'OMP_NUM_THREADS' to 'num_intra_threads'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Updated the transformer_mlperf README file, and also restore a change by accident

* Fix styles and other cleanup

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update BERT Large docs to put AI kit first

* Added support for frozen graph with bfloat16 precision.

* Update README file and fix few errors.

* Fixes for 3D-Unet Mlperf

* Fix link to 'g3doc' installation

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update docs for DenseNet 169 and Faster RCNN FP32 inference

* Fix `environment` spelling typo

* Fix for ssd-resnet34 inference

* Stock PyTorch vs Intel's optimization comparison notebook

* Doc updates for AI Kit

* Adding fix to ssd-resnet34 bfloat16 training

* Doc updates for recommendation models for AI Kit

* AI Kit doc updates for Faster RCNN

* Update SSD ResNet34 backbone model links

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* 3D U-Net AI Kit doc updates

* Mask RCNN AI Kit doc updates

* Doc updates for language modeling models for AI Kit

* UNet doc changes for AI Kit

* Fixed a bug in mlperf_transformer model real time performance measurement, which was caused by the batch size being fixed in the model. Also includes some code cleanup

* Doc updates for RFCN for AI Kit

* Update the docs/README.md to add a AI Kit doc link

* Removing $ from shell command snippets

* Doc updates for SSD-MobileNet for AI Kit

* Update DenseNet169 doc to use the tensorflow conda env for AI Kit

* IMZ CentOS Support for start.sh

* Doc updates for WaveNet for AI Kit

* Doc updates for InceptionV4 for AI Kit

* WORKAROUND - Update horovod version to a commit on master branch to fix build error in horovod

* rama/3d unet

* Enabled user specified warmup and benchmark steps.

* Merge branch 'dtran/platform_util_add' into 'develop'

Added functions to expose some of the properties like core, logical core, numa nodes

See merge request intelai/models!495

* update all TF images to latest

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update TF TPP link too

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add document for users who are new to docker

* Update InceptionV3 docs for AI Kit

* Update code to write checkpoint files to the --checkpoint dir, even when the backbone model isn't provided

* Fixing the link target to the README section that lists the model's prerequisites

* Update MobileNet V1 docs for AI Kit

* Update ResNet50 & ResNet101 docs for AI Kit

* Regenerate docs too for SSD ResNet34

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix SSD ResNet34 style and unittests

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Doc updates for language translation models for AI Kit

* Fix typo in "advanced" setup section

* Doc updates for ResNet50v1.5 for AI Kit

* In-graph arg should be omitted if None for BERT BF16 inference

* Changes to add num_iterations option for DIEN model

* DIEN script refactoring + static graph flag + bf16 online pass support

* Check for 'NOINSTALL' before running 'YUM' commands

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Initial commit for SSD-RN34 BF16 inference

* Prepare for Model Zoo v2.4.0 release

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update output based on new graph

* Sync with 'develop' and resolve conflicts

* Regen documentation and dockerfiles

* Update 'OWNERS' file (#4)

* Update 'OWNERS' file

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add more owners

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix one last failing test

* Update 'DIEN' readme (#6)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Prevent adding wheels or other archives to the repo (#7)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

Co-authored-by: ltsai1 <[email protected]>
Co-authored-by: Yimei Sun <[email protected]>
Co-authored-by: Melanie H Buehler <[email protected]>
Co-authored-by: Taie, Wafaa S <[email protected]>
Co-authored-by: Kasravi, Kam D <[email protected]>
Co-authored-by: Mahmoud Abuzaina <[email protected]>
Co-authored-by: Jones, Dina S <[email protected]>
Co-authored-by: Rajendrakumar Chinnaiyan <[email protected]>
Co-authored-by: Yerneni, Venkata P <[email protected]>
Co-authored-by: Thakkar, Om <[email protected]>
Co-authored-by: Ojha, Shweta <[email protected]>
Co-authored-by: Cui, Xiaoming <[email protected]>
Co-authored-by: Varghese, Jojimon <[email protected]>
Co-authored-by: xiaoming <xkdjfk>
Co-authored-by: Khanna, Kanvi <[email protected]>
Co-authored-by: mdfaijul <[email protected]>
Co-authored-by: Shiddibhavi, Sharada <[email protected]>
Co-authored-by: Shah, Sharvil <[email protected]>
Co-authored-by: Ketineni, Rama <[email protected]>

* GPU RN50v15 Inference (#11)

Generate package with support for all precision

Co-authored-by: Dina Suehiro Jones <[email protected]>

* Adds the PyTorch GPU BERT inference package  (#10)

* Add PyTorch GPU BERT inference container

* documentation updates

* Make scripts executable

* Removing these vars until we hear from mingxiao

* Make brackets consistent

* Formatting

* fix output dir

* Removing typo

* Add note that says the first run will download the pretrained model

* Fix which README goes in the package

* Updates based on the latest bkc

* Update quickstart file names in the spec

* Use tee

* Adds the PyTorch GPU BERT training package  (#13)

* Add documentation, quickstarts, and spec for PyTorch BERT training for GPU

* Fix which README goes in the package

* Add glue files

* Updates based on the latest BKCs

* Doc update and log to screen

* Doc update

* add support for bfloat16 (#17)

* Adds the PyTorch GPU ResNet50v1.5 training package (#15)

* Add ResNet50v1.5 PyTorch training model package

* Update files in package

* update file list for main.py

* Spec update

* Fixes after review

* Use tee

* Adds the PyTorch GPU ResNet50v1.5 inference package (#14)

* Add docs, quickstart scripts, and spec for PyTorch ResNet50v1.5 for GPU

* Fix file path

* Doc and BKC updates

* Update permissions

* Moving the PyTorch DLRM GPU model code  (#18)

* Moving the dlrm code out of the precision folder, since it's the same for all precisions

* Updates from the latest gpu-models 0.2.0gpu_rc1 branch

* Update the old DLRM spec, due to moving the code

* Moving DLRM inference/gpu code to be common gpu code used for both inference and training

* Fixing models paths from the old spec/quickstart

* Add GPU RN50v1.5 Training  (#19)

* added spec & generating package

* removed existing folder

* fixed documentation

* fix scripts

* review changes

* Adds the PyTorch GPU DLRM training package (#21)

* Add initial spec and docs for DLRM pytorch gpu training

* Updated docs

* Update permissions on quickstart

* update dataset doc

* Fix file path

* Adds the PyTorch GPU DLRM inference package (#22)

* Moving the dlrm code out of the precision folder, since it's the same for all precisions

* Updates from the latest gpu-models 0.2.0gpu_rc1 branch

* Update the old DLRM spec, due to moving the code

* Moving DLRM inference/gpu code to be common gpu code used for both inference and training

* Fixing models paths from the old spec/quickstart

* Add files for the spec and documentation for DLRM pytorch GPU inference

* Update quickstart file list

* Documentation updates

* removing old code

* Update dataset instructions

* Add log file analysis

* Update to add download of the model weights

* Doc updates

* Update the datasets instructions to note that the first time the model is run, the preprocessing happens

* Add init files in language modeling & tensorflow folders (#23) (#24)

* add init files in language modeling & tensorflow folders

* changed year

Co-authored-by: Jitendra Patil <[email protected]>

* Add GPU Bert Large inference (#25)

* added scripts

* update docs

* updated scripts

* removed unnecessary folders

* removed spec file

* fix import issues

* review changes

* review update 2

* Add GPU Bert Large training (#29)

* initial commit

* update spec file

* added missing init file

* update docs

* deleted unwanted files

* review changes

* gpu support for bfloat16 (#31)

* updates to docs & scripts (#34)

* Update pytorch bert for gpu to include transformers code (#32)

* Update pytorch bert for gpu to include transformers code

* BERT large inference doc updates and fixes

* update to use a clone of the AI Kit conda env

* Fix paths in the quickstart script

* add sacremoses to the requirements

* Updates to the DLRM model packages for PyTorch GPU (#35)

* DLRM fixes

* update training precisions

* Updates for the user to download the pretrained model separately

* PyTorch GPU ResNet50v1.5 updates and fixes (#38)

* Add resnet models file

* Fix to use tee

* Updates for training

* Fix log file name

* Whitespace

* whitespace

* Updated model files from gpu-models 0.2.0gpu (b761567)

* Quickstart updates

* PyTorch model source from the gpu-models 0.2.0gpu branch (b761567) & add env vars (#39)

* Updated models from the gpu-models 0.2.0gpu branch (b761567)

* Updated BKCs

* Add setting of env vars

* Change warn to echo

* added env parameter (#44)

* Fixes for PyTorch GPU AI Kit models dependency install  (#45)

* PyTorch GPU fixes from SH for running from a read only directory (#46)

* Updates from the PyTorch team

* Set tensorboard logdir

* Updates from Agnieszka's fixes_0.2.0gpu branch

* Grabbing unchanged files

* Reverting header year change from unchanged file

* Removing old pytorch gpu model spec/dockerfiles (#50)

* Revert "Removing old pytorch gpu model spec/dockerfiles (#50)" (#51)

This reverts commit d716b915cf20efc437467930e48a2f829a898f55.

* Removing old PyTorch GPU dockerfiles/specs that are for a specific precision (#52)

* Updates for the PyTorch IPEX GPU base container package (#53)

* Updates for base pytorch gpu container

* Update dockerfile name

* Update name of the agama sources file

* Update docker image names in the doc

* Switch back to intel-graphics-local.list

* Doc update

* Update env vars

* update to use l_dpcpp-cpp-compiler_p_2021.3.0.3168_offline.sh

* Update to use l_dpcpp-cpp-compiler_p_2021.3.0.3168_offline.sh

* README updates

* Doc update to make title match what users will see in IRC

* Adds inference and training container packages for PyTorch BERT large for GPU (#57)

* Add workload containers for PyTorch IPEX BERT large inference & training for GPU

* Update to clarify base build and update build script to check for the base

* Fixing package name

* update dockerfile to use latest mkl

* Make base image vars

* Update run.sh to use --group-add

* Regenerate dockerfiles

* Syntax fix

* Add Tensorflow base container (#60)

* update specs

* first working version

* updated build

* updated docs, build & spec

* doc update

* tabs -> spaces

* Adds inference and training container package for PyTorch DLRM for GPU (#58)

* rename specs

* add initial files

* Updated docs and add build.sh and run.sh

* Fix dockerfile name

* Regenerate dockerfile

* update pretrained model path

* Add new line at the end of build.sh files

* Adds inference and training container packages for PyTorch ResNet50v1.5 for GPU  (#61)

* renaming specs

* Generate dockerfiles

* Add documentation for the wrapper package

* Add build and run scripts and update spec for the wrapper package

* Use makedirs to create leaf folders

* syntax

* syntax

* Fixing broken links

* Removing --do_eval for bert large training (#67)

* Add Bert Large inference GPU package (#69)

* initial version

* updated docs

* rename spec & package name

* review changes

* fix broken link

* Add Bert Large training GPU container package (#72)

* initial version

* updated build & run scripts

* update docs

* Add ResNet50v1.5  GPU container packages (#75)

* initial commit

* added docs in spec

* wrapper package generation

* update docs

* add training

* updated docs

* parameterize docker args (#76)

* Update run.sh with docker args for PyTorch GPU container packages (#77)

* update docs (#78)

* RN50 training bug fix (#80)

* bug fix

* update batch size

* GPU Bert training container package fix (#84)

* initial working version

* added env parm

* dummy data generation integrated

* more update

* env fix

* updated docs

* Adding NDA TPP file (#87)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update PyTorch IPEX gpu wheel name (#88)

* Update PyTorch IPEX gpu wheel name

* Fix files in spec

* Add pre-trained models for gpu container packages (#123)

* added pretrained models to package

* update doc

* add pretrained models for rn50

* volume mount fix

* review changes

* Update pytorch for new wheel names (#128)

* Update PyTorch code from gpu-models b05e2161 (#129)

* Update pytorch for new wheel names

* Updated resnet50 images

* Fix formatting

* fix formatting in training file

* Update copyright year

* Updates for BERT from gpu-models

* Updated dlrm code file from gpu-models

* Add do eval for bert training

* Updated scripts and env var

* Remove UseVmBind and add EnableDirectSubmission=1 in setvars.sh

* Tensorflow - 2021.3.1 NDA release (#131)

* update itex binary name

* itex file name fix

* BKC changes

* debug changes

* increasing shm mem

* updates

* updated batch size

* logging more frequently

* rolling back some changes

* Add copyright to python and bash scripts files (#147) (#149)

* add copyright to files

* one more file

(cherry picked from commit 1635120e5c7b7e6d7fff44e666533d33a47a6445)

* compiler version change (#155)

* PyTorch PVC updates (#188)

* Add dockerfile with PVC env vars

* PVC dockerfile updates

* Removing pvc specific dockerfile

* Renaming ATS vars to PVC

* Updated gpu-models code (07854e5d09cc7f380355f8ca50ebe8bc9c09bf22)

* BERT large inference and training quickstart updates

* Update BERT train long analysis function parameters to add batch size

* DLRM updates

* Doc updates for the DLRM terabyte dataset

* README updates

* Update 'Ats' in message

* Fix ENV in partial

* Fix typo

* Fix typo

* Another typo :(

* Revert "PyTorch PVC updates (#188)" (#194)

This reverts commit 99f569ca09d6a7b333959d14fb5ce29df4e08077.

* PyTorch updates for PVC pre-alpha (0.2.2) (#195)

* Add dockerfile with PVC env vars

* PVC dockerfile updates

* Removing pvc specific dockerfile

* Renaming ATS vars to PVC

* Updated gpu-models code (07854e5d09cc7f380355f8ca50ebe8bc9c09bf22)

* BERT large inference and training quickstart updates

* Update BERT train long analysis function parameters to add batch size

* DLRM updates

* Doc updates for the DLRM terabyte dataset

* README updates

* Update 'Ats' in message

* Fix ENV in partial

* Fix typo

* Fix typo

* Another typo :(

* Update Torch CCL install

* Update list of quickstart scripts in the DLRM inference spec

* Updated weights file for DLRM inference

* Fix <package name> text replacement

* update compiler and oneMKL

* Update the base container README due to ipex import changes and remove --privileged

* Update base container README based on review feedback

* update DATASET_DIR for DLRM to remove 'day'

* The DLRM dataset /day paths were correct - putting them back in

* Updates for main_int8.py

* Add one CCL

* Add build arg for CCL

* Add l_oneapi_ccl_p_2021.4.0.423_offline.sh to the package

* Switch to use base kit as an experiment

* Update dockerfile for basekit

* Make a separate spec for basekit for debug

* Torch CCL from source

* make torch-ccl directory relative

* Fixing path in spec

* Fix package path for torch_ccl

* Go back to using wheels for Torch CCL

* Fix dockerfile name for basekit build.sh

* fix typo

* Removing ENVs that were already defined

* Make sure ONEAPI_ROOT is getting set

* update image tag

* update to use new wheels

* pip updates to prevent dependency version warnings

* PVC alpha release - Tensorflow (#201)

* base container update

* updated env vars

* updated models

* added oneccl

* build script update

* fixed ccl installation

* update bkc

* training bkc update

* fixed bf16

* remove horovod whl install

* merge related fixes

* code review changes

* Add PyTorch PVC container package for SSD-ResNet34 Training  (#269)

* Add PyTorch PVC SSD-ResNet34 training spec, partial, docs, and code files

* Add git

* Moving partial to the ubuntu folder

* regenerate dockerfile with git install

* Reorder partials

* Add python3.6-dev

* Add python3.7-dev

* Removing precision as a requirement

* Fix path

* Make training.sh executable

* Add env var

* Update docs and add block/plain format

* fix filenames in spec

* Removing dockerfile that's not used

* Add info on the known issue for plain format

* Add note about the original repo

* TF PVC 3D-UNet and MASK R-CNN containers. (#271)

* initial working package & build

* update build & run scripts

* maskrcnn pkg generation

* scripts updates

* doc update

* more doc update

* doc updates

* docs update +

* 3d-unet working with basekit

* scripts updates

* basekit based models

* fix docs

* mixed precision script

* update scripts

* docs update

* review changes

* review changes 2

* Mask RCNN pre-alpha container (#276)

* changes based on feedback from model owner

* fix typo

* fix path

* fix docs links (#326)

* Add model package for PyTorch SSD-ResNet34 inference for ATS-P  (#339)

* Updates to add ssd-resnet34 inference

* update models path

* Update quickstart paths

* Doc and setup script updates

* Add install setuptools

* Doc update and model script updates

* Write dllogger to a different dir

* Update dllogger dir

* add models folder and update to use dllogger from pip

* Update doc

* Add new setvars for ATS-P

* No deps for torchvision install

* Doc updates

* Updated gpu-models code

* Removing container related files since those aren't tested yet

* Adding back note about original repo

* putting back PVC setvars.sh

* Removing JIRA links

* Update PyTorch ResNet50v1.5 inference and training for AI Kit 2022.1 GPU NDA (#344)

* PyTorch ResNet50v1.5 updates for AI Kit 2022.1

* Update versions in main spec

* Updates years in header

* Update PyTorch SSD-ResNet34 training for AI Kit 2022.1 GPU NDA GPU (#345)

* Updates for SSD-ResNet34 training for AI Kit 2022.1

* Removing JIRA links

* Update to note that the same conda env is used for both inference and training

* tf gpu 3d-unet (#347)

* Add 'Deep Learning Examples for Tensor Cores' to '3d-unet' model for TF

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix both single tile and multi tiles patches for 'UNet_3D_Medical'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Pre-apply the single tile patch to 'UNet_3D_Medical'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* update the docs and spec for 3d-unet GPU

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update doc per review

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Regenerate docs for 3d-unet

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Adding 'Intel' header to modified files

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Regen docs and remove checkpoints reference for 3d-unet

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* PyTorch GPU SSD-ResNet34 fixes (#350)

* Regenerate dockerfiles

* Don't have a dockerfile for this one yet

* PyTorch DLRM updates for AI Kit 2022.1 (#349)

* PyTorch DLRM updates for AI Kit 2022.1

* Update quickstarts

* Updates the PyTorch GPU BERT inference and training model packages for AI Kit 2022.1 (#348)

* BERT updates for PyTorch GPU

* Update doc to note sourcing setvars.sh

* add setup.sh to the specs

* Fix path

* Add inference models README

* Update to add README for bert training

* Adding rust

* Update setup script for training

* Updates from https://github.com/intel-innersource/frameworks.ai.pytorch.gpu-models/pull/129

* Update setup

* add data folder

* Fix path

* Removing transformers

* update dependencies

* Fix pip install

* Require the BERT_WEIGHT folder, since we can't write to the MODEL_DIR

* Add the PyTorch 3D UNet inference model package for AI Kit 2022.1 (#351)

* Add 3D-UNet for PyTorch GPU

* Removing wrapper package section for now

* doc updates and setup script fix

* Add matplotlib install

* Quickstart and doc updates

* Update to note that weights file will be downloaded by the setup script

* Doc and setup.sh script updates to set the BUILD_DIR

* Update for setting OUTPUT_DIR instead of PRETRAINED_MODEL dir

* Adding paths for the preprocess.py and the make mkdir_postprocessed_data BUILD_DIR

* More path updates for run.py

* Update pybind dir

* Update to loadgen dir

* Update to get loadgen 39 wheel

* update setup install

* README updates

* Update setvars.sh

* Update to pass build dir

* Removing loadgen and nnUnet, since those are now in artifactory

* Update setup.sh to move loadgen to a temp directory for install

* Removing dockerfile

* Updated TF GPU BKCs and docs for NDA release (#346)

* Updated NDA batch sizes and added pkg READMEs

* Generated docs

* Revert spec/doc changes for 3D U-Net and MaskRCNN

* Remove pretrained models from inference packages

* add rn50 bf16 inference (#352)

* PyTorch and TensorFlow fixes for the AI Kit 2022.1 NDA release (#362)

* BERT fixes for writing to the model dir

* Fix README references to ImageNet in the SSD-ResNet34 docs (should be COCO)

* Write BERT training data to OUTPUT_DIR

* Adds pip package dependency for TF 3D U-Net

Co-authored-by: Melanie H Buehler <[email protected]>

* PyTorch 2022.1 GPU NDA container package update (#370)

* Updates to the base container for 2022.1 pytorch

* Update PyTorch base dockerfiles with distutils (due to error with Python 3.9)

* Updates after testing

* Fix export to ENV

* Updating basekit filenames

* Add the PyTorch SSD-ResNet34 Inference container package for the 2022.1 GPU release (#373)

* Updates to the base container for 2022.1 pytorch

* Update PyTorch base dockerfiles with distutils (due to error with Python 3.9)

* Updates after testing

* Fix export to ENV

* Container updates for SSD-ResNet34 inference

* Fix to separate installs

* Update docs and run.sh with the PRETRAINED_MODEL env var

* Container updates for SSD-ResNet34 inference

* Fix to separate installs

* Update docs and run.sh with the PRETRAINED_MODEL env var

* Add the PyTorch 3D UNet container package for the 2022.1 GPU release (#376)

* Add dockerfile, docs, and dataset preprocessing script for the container package

* Dockerfile update

* Fixing missing env var

* Add clang install

* Fixes for preprocessing

* Doc updates and remove need for extra DATASET_DIR for inference since the preprocessed dataset is in the OUTPUT_DIR

* Add matplotlib to the dockerfile

* updates based on review comment

* Update the PyTorch DLRM inference container package to include pretrained weights (#381)

* Add back the pretrained model

* Fix link

* Update the TensorFlow and PyTorch base container documentation to include link to the driver (#378)

* Update READMEs to link the driver

* Note ATS-P

* Updating PyTorch BERT partials for the 2022.1 GPU release (#379)

* Fix bert path

* Update BERT inference partial

* Fixing line

* fix bert training installs

* tf 2022.1 nda gpu base container cleanup (#384)

* 2022.1 NDA base TF GPU container package update

* Added pre-trained models back to inference specs

* Update for new ITEX and TensorFlow wheels

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Download.md cleanup

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Regen Dockerfiles

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* add -p to the mkdir

* Fix incorrect 'BaseKit' version name

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* include wheels and basekit for 3dunet and fix build args

Co-authored-by: Melanie H Buehler <[email protected]>
Co-authored-by: Dina Suehiro Jones <[email protected]>

* take 1 (#392)

* take 1

* correct file tree

* remove third-party filenames

* Syncing up the doc fragment with the README update for DLRM inference (#414)

* CentOS, Debian, RedHat and SLES support for GPU (#418)

* Add support for CentOS 7 and Debian 10, 11 (#391)

* Add support for CentOS 7 and Debian 10, 11

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Replace 'dnf' with 'yum' for CentOS 7 compatibility

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* remove commented line

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add the Yum repo fix for 'CentOS 8'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Making Platform and OS check more portable (#393)

* Making Platform and OS check more portable

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix a minor syntax error

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Adding support for RedHat 7 and 8 (#394)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Finalize Red Hat and CentOS 7, 8 support (#398)

* Minor fix for Red Hat support

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Improve OS version checking

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Introduce devtoolset-7 for CentOS and Red Hat 7

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* minor regex fix

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* yum install consistency

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Adding support for SLES 15 (#399)

* Adding support for SLES 15.03

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Improve SLES version check regex

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix a minor typo in OS name

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Improve OS version checking (#401)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* PyTorch GPU updates to support both PVC and ATS (#416)

* Add ATS-P vs PVC args and conditionals

* Doc updates

* Updated PVC batch size for BERT large FP32 training

* Add env var

* Add pci utils to the pytorch base and try out new setvars with 3dunet

* update 3dunet spec setup.sh

* ResNet50v1.5 inf update

* Revert README changes

* Update quickstart scripts and specs

* Remove ATS and PVC specific setvars.sh

* Remove DEVICE env from run.sh

* Remove DEVICE

* Remove the 'downloads' for dlrm

* Update requirements to mention lspci and apt/yum

* Run accuracy testing first for 3d unet

* Doc updates

* TF AI Kit 2022.1.1 NDA updates for PVC (#421)

* PVC vs. ATS detection for TF model packages

* Small update to RN50 BF16 inference BKC

* Adds pciutils requirement to documentation

* Adds pciutils partial

* AI Kit 2022.1.1 NDA remove TF pretrained models (#424)

* Remove pretrained models and fix RN50 bs

* Fixed BERT Large bf16 training bs

* PyTorch container package updates for 2022.1.1 GPU NDA (#427)

* PyTorch container package updates for 2022.1.1 GPU NDA

* update to basekit 140

* TF container package updates for 2022.1.1 GPU NDA (#428)

* TF container package updates for 2022.1.1 GPU NDA

* Fix merge conflict

* GPU Containers - Mount basekit from host machine (#438)

* removed basekit installation

* updated tf basekit build script

* updated docker file

* doc update and minor fixes

* pytorch changes

* doc updates

* GPU workload containers - use basekit from host machine (#439)

* tf change to use basekit on host machine

* changes for pytorch workload container to use basekit from host machine

* Add /opt/intel/oneapi check and volume mount for the PyTorch 3D UNet dataset preprocessing run script

* fixed error

* update python path

* fix error

* removed ats specific envs

Co-authored-by: Dina Suehiro Jones <[email protected]>

* updated product and agama versions in tool container README; added main README for Container Packages (#447)

* GPU Mask RCNN training package (#454)

* Initial commit for MaskRCNN training model package

* Removed var and regenerate README

* Remove arg & update build.sh

* Remove build args for basekit and components

* Updated specs, partials, dockerfiles

* Fixed base tag args and pip install

* Corrected patch and model files

* Fixed dockerfile, quickstart script, and docs

* Added requirement and removed unnecessary args

* Remove unnecessary files

* Add Intel licence headers

* bug fix for aizoo-708 (#477)

* update README for missing links (#501)

* update README for missing links

* Update README.md

* Update README.md

* Update PyTorch GPU model links and removed unused files (#514)

* Remove old files

* Update list of PyTorch GPU models

* Adds a quickstart script for ResNet50 inference with synthetic data for PyTorch GPU (#522)

* Adds PyTorch ResNet50 inference GPU script that uses synthetic data

* Updated scripts from gpu-models master (75b09b19ed597b4e70fc065a6d68be94406221b3) to get support for dummy data

* Update to put import back to

* update tools docker file linux base to 20.04

* Add dataset dir for --dummy script

* Update PyTorch GPU ResNet50v1.5 synthetic data inference script to allow adjusting the number of iterations run (#545)

* add --num-iterations

* make num iterations a env var

* Update documentation to note number of iterations for synthetic data runs

* Updated wheels for the IPEX base container (#692)

Co-authored-by: msalopan <[email protected]>

* Add ITEX ATS-M whl updates (#696)

* made changes for ITEX ATS-M

* indentation changes

* Update Resnet50v1.5 (#684)

* Update Resnet50v1.5

* Adjust format and restore file

* ATS-M TF changes (#699)

* add benchmark mode for tensorflow ssd

* add resnet50 benchmark mode

* add rn50

* modify rn50 files

* fixing tengfei PR

* fixing incorrect folder changes

* added licences header

* fixed year

Co-authored-by: Tengfei, Han <[email protected]>

* merge TF base container based on new RC1 whl packages (#700)

* ssd-mobilenet tf gpu spec

* build based on latest RC1 whl packages

* changed horovod version

* Add PyTorch  SSD-Mobilenet inference for GPU (#685)

* Add SSD-Mobilenet

* modified some files

* modify readme.sh and add link in reference.sh

* add dummy data mode

* modify some description

* modify description about environment

* Added rc1 update (#702)

* Added oneccl whl (#704)

* do not use oneccl from basekit (#705)

* Add YOLOv4 (#687)

* Add YOLOv4

* update README and inference.sh

* modify readme and inference.sh

* add dummy data mode

* test lowercase

* test again

* modify script and description about dummy, add dummy img

* add missing file

* updated readme (#706)

* Updated RN50 PyTorch Inference spec file (#707)

- Updated names in the spec file for RN50 based on scripts in quickstart folder
- Updated scripts names in run.sh

* correct ssd-mobilenet and yolov4 (#709)

* fix bug where only default images would be used

* correct scripts

* ssd-mobilenet support int8 only

* pretrained weight file link is not a direct link, so remove it from the script; the user needs to download it

* Modified some descriptions

Co-authored-by: Feng Yuan <[email protected]>

* Added Pytorch RC3 whls (#730)

Co-authored-by: Tengfei, Han <[email protected]>

* updating TPPs (#728)

* 2.8 tpps

* remove old files

* Added Resnet50_Pytorch for ATS-M (#729)

* Added Resnet50_Pytorch for ATS-M

* Added documentation and wrapper README

* Made changes as per reviews

Co-authored-by: Tengfei, Han <[email protected]>

* ATS-M support for SSD-Mobilenet and Resnet50V1-5 (#724)

* modifying scripts for gpu ssd-mobilenet

* changed docker image name

* modify changes to test functionality

* made changes for obj_det build

* changed np version

* made version changes

* revert changes

* add 3.9 dev version and remove 1.17.4 np version to latest

* change path of coco py files in models to int8 folder

* update new .pb model file

* export vars and change/remove DATASET_DIR

* made -f to -d change in checkir DIR path

* add batch inference for ssd-mobilenet

* use dummy data for online and batch inference

* add untracked file

* change to new models

* change warmup and steps

* change warmup and steps

* add docs for ATS-M ssd-mobilenet

* add docs section for ATS-M w/ links

* add docs section for ATS-M w/ correct links

* modify baremetal.md

* modify spec file to add model package

* generate model-builder doc

* make alignment changes

* update GPU name and TF version in README.md and add oneapi dir path var

* unify docs of ssd-mbnet and rn50

* make rn50 doc changes and add oneapi as path var

* generate model-builder readmes

* correct typo

* correct typo

* add INT8 check,remove other precisions

* add ONEAPI_DIR to array

* formatting lines

* delete baremetal for ATS-M

* remove typo and baremetal.md

* cleanup and modify readmes

* create oneapi_dir for base build

* remove hrvd for rc2 test

* remove hvd from base build and ITEX BKC env

* Delete -tf-gpu-ssd-mobilenet-inference.temp.Dockerfile

* initial review changes

* add privileged mode for cpu freq scaling

* remove aikit.md

* correct readme typos

* correct comments

* minor readme changes

* check dataset path only for accuracy

* check dataset_dir only for accuracy

* add aikit back

* add gpu name and refine base readme

* change docker.md on dummy data

* add aikit for both models

* add aikit for both resnet

* add privileged mode

Co-authored-by: Ramakrishna, Srikanth <[email protected]>
Co-authored-by: Mahathi Vatsal <[email protected]>

* Update readme (#733)

* updated readmes

* updated readmes again

* Added ssd-mobilenet pytorch for ATS-M (#734)

* Added ssd-mobilenet pytorch for ATS-M

* Made changes as per reviews

* Added YOLOv4 for ATS-M (#735)

* Added YOLOv4 for ATS-M

* Made changes as per reviews

* Made changes in model.py to run yolov4.

- Modified build.sh for ipex-tool-container.
- Modified run.sh in yolov4 to mount PRETRAINED_MODELS

* update docs

* Removed HVD and torch ccl whls (#741)

* Removed HVD and torch ccl whls

* Removed sythentic_data scripts from rn50 spec file

* Removed scripts from run.sh

* Update rn50, ssd-mobilenet and yolo (#748)

* Update rn50,yolo and ssd-mobile

* delete emulation

* update model

Co-authored-by: chaohan <[email protected]>

* Mahathi/ipex mkl update (#753)

* Added mkl/compiler packages

* Added tbb in spec file

* Removed oneapi path in build.sh

* Modified old files

Co-authored-by: Srikanth Ramakrishna <[email protected]>

* dpcpp,mkl,tbb inside container ATS-M (#756)

* test dpcpp,mkl in base

* make partial changes

* add tbb files to partial

* fix typo in ttb addition

* remove two export vars

* remove oneapi dir check and mount

* add end of line

* re-add end of line

Co-authored-by: Mahathi <[email protected]>

* Removed oneapi from run.sh in workloads (#758)

Co-authored-by: Srikanth Ramakrishna <[email protected]>

* doc-level changes for ATS-M TF base and WL containers (#754)

* test dpcpp,mkl in base

* make partial changes

* add tbb files to partial

* fix typo in ttb addition

* remove two export vars

* remove oneapi dir check and mount

* change name of gpu

* change gpu name

* add driver download link and remove custom paths

* provide driver download link

* refine typos in wl and base docs

* remove oneapi volume mount

* remove model req and path for ITEX

Co-authored-by: Mahathi <[email protected]>

* Modified all README's (#757)

* Modified all README's

* Modified README's

* update readmes

Co-authored-by: Srikanth Ramakrishna <[email protected]>

* Fixed typo in IPEX dockerfile (#760)

* Fix styler and unit tests for develop-gpu (#777)

* Fix styler and unit tests for develop-gpu

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix unittests too

Signed-off-by: Abolfazl Shahbazi <[email protected]>

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Sync with develop branch (#774)

* Update args.rank and args.world_size for maskrcnn (#338)

* Pytorch updates for SPR 2022 ww01 and resolve AIDEVOPS-703 (#330)

* Updates to resolve AIDEVOPS-703

* Removing empty .dockerignore

* Removing extra line

* Update TF inference language modeling (BERT Large) docs for  instructions to run on Windows (#342)

* update tf inference language modeling for windows instructions

* modify the BS of maskrcnn throughput (#356)

* Fix quick start scripts links in object detection docs (#358)

* Enable running models on certain num of cores (#343)

* Enable running on certain num of cores

* Removed hard-coded number

* Checking if HT is on/off

* Fixed tests and platform util for perf notebook (#361)

* Update dataset to 3 RNN-T training datasets (#357)

* Update dataset to 3 RNN-T training datasets

In this commit, train-clean-360 and train-other-500 are added in model.
These datasets need 500GB disk space to preprocess. It will take ~4 hours
to run the entire 3 datasets for one epoch in BF16. You can terminate
the training process by adding `num_steps` in
models/language_modeling/pytorch/rnnt/training/cpu/train.sh.

* Set NUM_STEPS outside of bash script

* Add note that FP32 runs 100 steps

* workaround to fix distributed training issue (#365)

* update the BS of maskrcnn throughput (#366)

* Fix maskrcnn output script for ipex distributed training (#360)

* Update 3D UNet MLPerf doc to run FP32 inference on windows (#367)

* update 3dunet mlperf doc to run fp32 inf on windows

* Fix doc links for the Windows supported models list (#368)

* update links

* Transformer ML-Perf SPR WW04 (#359)

* Changed the attention part so that it can utilize the existing fusion of batchmatmul+mul+addv2, and also use static variables to reduce redundant computation

* fixed a minor bug for a static variable

* Changed the model so that the reshape can be moved out of dense layer so that we can fuse the ops in the dense layers

* Changed the depth of attention to a static variable

* fix bert pre train distributed bug (#369)

* Weizhuoz/fix bert ddp (#374)

* tee Bert ddp to a specific log file

* Add tee on phase1

* Fix maskrcnn distributed training calculation

* Enable jemalloc for BERT throughput mode (#375)

* update bs and use ipex Lamb (#382)

* fix distribute training for DLRM and use launcher (#383)

* Add a separate doc for windows env setup (#371)

* add a separate doc for windows support on baremetal
* use msys bash to run start.sh for windows
* update supported models docs for model dependencies on Windows

* fix distribute training for DLRM and use launcher (#386)

* Update ImageNet Dataset preprocessing instructions (#385)

* update imagenet dataset preprocessing scripts and doc

* Ttitswor/snyk cli support (#340)

* tables version out of date

whl would not build properly on sf-client.

* Updating intel-tensorflow

version does not exist

* Updating tensorflow-addons

Version does not exist.

* Updating horovod

whl no longer builds successfully on Python 3.9+

* remove empty requirements.txt file

sf-client will fail, no need for empty req file.

* Updating Pandas

version out of date, whl no longer builds successfully on Python 3.9+

* Update pandas

Version no longer builds whl successfully on python 3.9+.

* Update numpy

Version whl fails to build successfully on python 3.9+

* Updating horovod

Version fails to build whl successfully on Python 3.9+.

* Update SimpleITK

Version does not install correctly on python version 3.9+.

* Updating numpy

numpy==1.16.3 does not build whl successfully on Python 3.9+.

* Updating scipy

scipy==1.2.0 fails to build whl successfully on Python 3.9+

* Updating h5py

h5py==2.10.0 fails to build whl successfully on Python 3.9+.

* Updating numpy

numpy>=1.16.3 fails to build whl successfully on Python 3.9+.

* Update h5py

h5py==2.10.0 fails to build whl successfully on Python 3.9+.

* Remove upload to GCS (#387)

* Remove upload to GCS

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* remove gcs option from the shell script

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add support for CentOS 7 and Debian 10, 11 (#391)

* Add support for CentOS 7 and Debian 10, 11

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Replace 'dnf' with 'yum' for CentOS 7 compatibility

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* remove commented line

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add the Yum repo fix for 'CentOS 8'

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Adding support for RedHat 7 and 8 (#394)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update COCO validation dataset instructions for bare metal and docker (#390)

* update coco dataset instructions for baremetal and docker
* update coco script and instructions to remove output dir env var

* Add numactl partial to wide and deep (#396)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Making Platform and OS check more portable (#393)

* Making Platform and OS check more portable

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix a minor syntax error

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Improve OS version checking (#401)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Minor syntax updates for py38 or newer (#400)

* Minor syntax updates for py38 or newer

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* More Python3.8 compliant literal comparison fixes

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update training.sh (#403)

change "socked_id" to "node_id" for ipex launcher

* Fix tcmalloc path to set LD_PRELOAD (#388)

* Fix tcmalloc.so path

* Formatting

* Removing debug messages

* Unit test update

* Test updates

* Add tcmalloc to the int8 dockerfiles

* Removing files we don't need

* Finalize Red Hat and CentOS 7, 8 support (#398)

* Minor fix for Red Hat support

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Improve OS version checking

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Introduce devtoolset-7 for CentOS and Red Hat 7

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* minor regex fix

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* yum install consistency

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Stock TensorFlow v2.5/v2.6/v2.7 support for performance analysis notebook -(sync with develop branch Jan 26) (#377)

* add back some missing patches

* add TF_ENABLE_ONEDNN_OPTS support for stock TF 2.5 and above

* transformer patch fix

* Update README.md

* online mode support

* Adding support for SLES 15 (#399)

* Adding support for SLES 15.03

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Improve SLES version check regex

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix a minor typo in OS name

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Fix BERT data instructions (#402)

* add bert data instructions in a separate doc
* update bert large dataset instructions

* Weizhuoz/fix ipex ww05 (#404)

* fix DLRM throughput output error

* Modify socket_id to node_id for ipex launcher

* fix data preprocessing script link for bert base and bert LT(#407)

* Add kmp_blocktime arg for ResNet101 int8 (#410)

* [RNN-T training] Update download_dataset.sh (#412)

Align with MLPerf: Remove --speed 0.9 1.1

* Add a snippet to download COCO2014 dataset files (#411)

* Fix failing unit tests (test_bare_metal and bert_fp32_inference) (#409)

* Fix unit tests

* benchmarks/

* Rename var so that it's not confused with the actual number of platform cores

* Add socket id 0 test

* Fix the link for the income census dataset download script (#413)

* BERT: Enable weight sharing and remove data layer for benchmarking (#406)

* Fix unit and style tests for BERT (#415)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add Jupyter notebooks for fine tuning BERT from TF Hub  (#408)

* Add WIP notebooks

* Add question and answering notebook

* Update classifier to clean up and document and add a second dataset

* updated notebook and model map with more BERT models

* Add README and update the ipynb name

* Remove unused notebook

* Update to remove the section that displays data with the predictions

* Add utils file

* utils comments and README update

* Updated files

* Clean up displaying predictions to use a pandas df

* Updates after notebook clean up and add export to the q&a notebook

* Retested and updates

* README updates and comments/formatting in utils scripts

* Add note about expecting that tensorflow has already been installed

* Add notebooks to the main TL README

* Add missing new lines

* Add pip install ipywidgets==7.6.5 after testing on bare metal

* Rename BERT Question Answering notebook

* Notebook updates based on review feedback

* Remove inadvertent changes

* Removing empty line

* PYT transfer learning notebook for object detection (#397)

* Initial commit of notebook and utils

* Added a README

* Removed non-functioning datasets & models

* Doc edit

* Fixed bugs, improved explanations, suppressed warnings

* Adds notebook for generic image classification (#364)

* Adds image classification notebook for user datasets

* Adds Image Classification transfer learning notebook

* Fixed links and text

* Minor doc updates

* Updated for review feedback

* Moved training-specific vars to TL section

* Newline and license header

* fix python seed (#417)

* Fix DIEN no requirements.txt file found (#422)

* bug fix in ssd-resnet34 (#423)

* update the BS of maskrcnn throughput (#425)

* Add a doc for transformer language mlperf dataset (#419)

* Add a doc for wide and deep large dataset instructions (#420)

* add a doc for inference dataset instructions, and updating the models docs

* Doc updates for the Transfer Learning notebooks (#430)

* Add the TF models dataset links to the main models table (#429)

* Fix dlrm without ipex-interaction (#434)

* Fix link for PyTorch RoBERTa base inference (#436)

* Enable inference for PyTorch TransNetV2 (#426)

* Enable inference for PyTorch TransNetV2

* enable bf16 inference for PyTorch TransNetV2

* update README

* use dummy data

* Add the option to use a custom dataset in the BERT binary text classification notebook using TF Hub (#435)

* Add the option to use a custom dataset in the BERT binary text classification notebook using TF Hub

* update bert_utils to add the download_and_extract_zip function

* Updates based on review feedback

* add WER for RNN-T (#440)

* Update recommendation inference docs for Windows instructions (#437)

* add windows instructions for dien and wide&deep inference

* fix accuracy issue in 4.10 transformers in patches (#441)

Co-authored-by: Jiayi Sun <[email protected]>

* update pytorch maskrcnn for PT change (#442)

* use multi-instances(one node for each instance) for throughput run (#443)

* A new Jupyter notebook for lpot quantization tutorial and related perf analysis (#115)

* draft for lpot quantization and perf analysis jupyter notebook

* Update Louie/lpot perf analysis by review comments (#298)

* update with formal name of model zoo, correct wrong words, add license in python file

* rm empty line

Co-authored-by: Neo Zhang Jianyu <[email protected]>
Co-authored-by: Abolfazl Shahbazi <[email protected]>

* use multi-instances for maskrcnn training (#445)

* Update language translation docs for windows support (#444)

* update bert and transformer lt official docs for windows support
* fix a wrong link for 3dunet readme

* Update run_bert_pretrain_phase2.sh (#449)

* Update run_bert_pretrain_phase1.sh (#450)

* enable resnet50 training for multi sockets (#448)

* Update DLRM training to train on 2S. (#451)

* Launcher command shell in Windows to achieve better AI workload performance for certain Intel client hardware (#395)

* Launcher command shell in Windows to achieve better AI workload performance for certain Intel client hardware

* update the list of supported models on windows (#455)

* Update BraTS2018 data preprocessing instructions for  3D-UNet (#452)

* Fix for keras experimental for bert. (#433)

* Add a PyTorch NLP fine tuning notebook using the IMDb dataset for sentiment analysis (#453)

* Add the pytorch IMDB fine tuning notebook

* Update markdown

* Add README

* Renaming notebook and main doc update

* Fix link

* fix path in readme

* Update requirements.txt

* Add datasets to requirements

* Add transformers to requirements

* add sklearn to requirements

* Updates based on review feedback - fixing 'extends pytorch

* Update the README to specify 3.9

* Use 'NeoZhangJianyu' ID from GitHub (#456)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Leslie/add runtime extension support (#457)

* add runtime extension for ssd-rn34 accuracy inference

* support iteration larger than dataloader

* change the weight sharing script name (#461)

* fine tune for dataset env var configuration (#463)

* Add rn50 inference runtime extension support for throughput/accuracy (#462)

* add rn50 throughput mode runtime extension support

* add rn50 accuracy mode runtime extension support

* Update the PyTorch Text Classification fine tuning notebook to allow using a custom dataset (#467)

* Update the PyTorch Text Classification fine tuning notebook to use a custom dataset

* update description at the top of the notebook to mention the custom dataset option

* Add citation for the SMS text collection dataset

* Update the PyTorch text classification README to note the custom dataset option

* Rename the notebook and update the main TL ReadMe

* Clearing notebook output

* Fix syntax

* Fix Transformer Language mlperf to add arg --kmp-blocktime (#469)

* fix transformer mlperf to parse --kmp-blocktime, in case set on the system

* Windows support for Transformer Language MLPerf inference (#471)

* fix python format and update docs for instructions

* Minor clean up (#459)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Updated the transformer_mlperf inference profiling option, and some minor changes in the README (#472)

* Modify the output tag for IPEX DDP (#475)

* remove manual conversion of models to datatype (#478)

* feed sample input while prepacking for training (#479)

Co-authored-by: Wang, Chuanqi <[email protected]>

* Minor flake8 fix (#481)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* update the Pytorch URL for develop branch (#485)

* Update versions and URLs for release v2.7 (#484)

* Update versions and URLs for release v2.7

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Regenerate docs and dockerfiles

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update the main IMZ README.md to list models per use case (#466)

* add usecases tables in the main model readme and benchmarks readme

* revert bf16 changes (#488)

* Add partials and spec yml for the end2end DLSA pipeline (#460)

* Add partials and specs for the end2end DLSA pipeline

* Add missing end line

* Update name to include ipex

* update specs to use the public image as a base for one and SPR for the other

* Dockerfile updates for the updated DLSA repo

* Update pip install list

* Rename to public

* Removing partials that aren't used anymore

* Fixes for 'kmp-blocktime' env var (#493)

* Fixes for 'kmp-blocktime' env var

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* update per review feedback

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Add 'kmp-blocktime' for mlperf-gnmt (#494)

* Add 'kmp-blocktime' for mlperf-gnmt

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Remove duplicate parameter definition

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* add sample_input for resnet50 training (#495)

* remove the case when fragment_size not equal args.batch_size (#500)

* Changed the transformer_mlperf fp32 model so that we can fuse the ops… (#389)

* Changed the transformer_mlperf fp32 model so that we can fuse the ops in the model, and also minor changes for python3

* Changed the transformer_mlperf int8 model so that we can fuse the ops in the model, and also minor changes for python3

* SPR updates for WW12, 2022 (#492)

* SPR updates for WW12, 2022

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update for PyTorch SPR WW2022-12

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Update pytorch base for SPR too

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Stick with specific 'keras-nightly' version

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* Updates per code review

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* update maskrcnn training_multinode.sh (#502)

* Fixed a bug in the transformer_mlperf model threads setting (#482)

* Fixed a bug in the transformer_mlperf model threads setting

* Fix failing tests

Signed-off-by: Abolfazl Shahbazi <[email protected]>

Co-authored-by: Abolfazl Shahbazi <[email protected]>

* Added the default threads setting for transformer_mlperf inference in… (#504)

* Added the default threads setting for transformer_mlperf inference in case there is no command line input

* Fix unit tests

Signed-off-by: Abolfazl Shahbazi <[email protected]>

Co-authored-by: Abolfazl Shahbazi <[email protected]>

* PyTorch Image Classification TL notebook (#490)

* Adds new TL notebook with documentation

* Added newline

* Added to main TL README

* Small fixes

* Updated for review feedback

* Added more models and a download limit arg

* Removed py3.9 requirement and changed default model

* Adds Kitti torchvision dataset to TL notebook (#512)

* Adds Kitti torchvision dataset to TL notebook

* Fixed citations formatting

* update maskrcnn model (#515)

* minor update. (#465)

* Create unit-test github action workflow (#518)

* Create unit-test github action workflow

Tested here: https://github.com/sriester/frameworks.ai.models.intel-models/runs/6089350443?check_suite_focus=true
Runs tox py.test on push.

* Containerize job

* Update unit-test.yml

Changed docker credentials to imzbot

* Update to Horovod commit 11c1389 to fix TF v2.9 + Horovod install failure (#519)

Signed-off-by: Abolfazl Shahbazi <[email protected]>

* update distilbert model to  4.18 transformers and enable int8 path (#521)

* rnnt: use launcher to set output file path and name (#524)

* Update BareMetalSetup.md (#526)

Always use the latest torchvision

* Reduce memory usage for dlrm acc test (#527)

* update distilbert with text_classification (#529)

* add patch for distilbert (#530)

* Update the model-builder dockerfile to use ubuntu 20.04 (#532)

* Add script for coco training dataset processing (#525)

* and update tensorflow ssd-resnet34 training dataset instructions

* update patch (#533)

Co-authored-by: Wang, Chuanqi <[email protected]>

* [RNN-T training] Enable FP32 gemm using oneDNN (#531)

* Update the Readme guide for distilbert (#534)

* Update the Readme guide for distilbert

* Fix accuracy grep bug, and grep accuracy for distilbert

Co-authored-by: Weizhuo Zhang <[email protected]>

* Update end2end public dockerfile to look for IPEX in the conda directory (#535)

* Notebook to script conversion example (#516)

* Add notebook script conversion example

* Fixed doc

* Replaces custom preprocessor with built-in one

* Changed tag to remove_for_custom_dataset

…

66 people authored Mar 27, 2023

1 parent ba240ab commit 98e46e6

CODEOWNERS

@@ -8,6 +8,7 @@ datasets @ashahba @claynerobison @dzungductran
     docs @claynerobison @mhbuehler
     k8s  @ashahba @dzungductran
     models @ashraf-bhuiyan @riverliuintel
+    models @riverliuintel
     models/**/pytorch/ @leslie-fang-intel @jiayisunx @zhuhaozhe
     quickstart [email protected]
     quickstart/**/pytorch/ @leslie-fang-intel @jiayisunx @zhuhaozhe
@@ Expand Down @@

README.md

@@ -1,6 +1,6 @@
     # Model Zoo for Intel® Architecture
-    This repository contains **links to pre-trained models, sample scripts, best practices, and step-by-step tutorials** for many popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors.
+    This repository contains **links to pre-trained models, sample scripts, best practices, and step-by-step tutorials** for many popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors and Intel® Data Center GPUs.
     Model packages and containers for running the Model Zoo's workloads can be found at the [Intel® Developer Catalog](https://software.intel.com/containers).
@@ Expand Down @@

benchmarks/common/base_benchmark_util.py

@@ -1,7 +1,7 @@
     #
     # -*- coding: utf-8 -*-
     #
-    # Copyright (c) 2023 Intel Corporation
+    # Copyright (c) 2018-2023 Intel Corporation
     #
     # Licensed under the Apache License, Version 2.0 (the "License");
     # you may not use this file except in compliance with the License.
@@ -281,6 +281,12 @@ def _define_args(self):
                 help="Additional command line arguments (prefix flag start with"
                      " '--').")
+            # Check if GPU is enabled.
+            self._common_arg_parser.add_argument(
+                "--gpu",
+                help="Run the benchmark script using GPU",
+                dest="gpu", action="store_true")
         def _validate_args(self):
             """validate the args and initializes platform_util"""
             # check if socket id is in socket number range
@@ -311,8 +317,9 @@ def _validate_args(self):
                                  format(system_num_cores))
             if args.output_results and ((args.model_name != "resnet50" and
-                                        args.model_name != "resnet50v1_5") or args.precision != "fp32"):
-                raise ValueError("--output-results is currently only supported for resnet50 FP32 inference.")
+                                        args.model_name != "resnet50v1_5") or
+                                        (args.precision != "fp32" and args.precision != "fp16")):
+                raise ValueError("--output-results is currently only supported for resnet50 FP32 or FP16 inference.")
             elif args.output_results and (args.mode != "inference" or not args.data_location):
                 raise ValueError("--output-results can only be used when running inference with a dataset.")
@@ -355,6 +362,14 @@ def _validate_args(self):
                           "This is less than the number of cores per socket on the system ({})".
                           format(args.socket_id, cpuset_len_for_socket, self._platform_util.num_cores_per_socket))
+            if args.gpu:
+                if args.socket_id != -1:
+                    raise ValueError("--socket-id cannot be used with --gpu parameter.")
+                if args.num_intra_threads is not None:
+                    raise ValueError("--num-intra-threads cannot be used with --gpu parameter.")
+                if args.num_inter_threads is not None:
+                    raise ValueError("--num-inter-threads cannot be used with --gpu parameter.")
         def initialize_model(self, args, unknown_args):
             """Create model initializer for the specified model"""
             model_initializer = None
@@ Expand Down @@
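
The `--gpu` checks added to `_validate_args` can be exercised in isolation. The following is a minimal sketch, not the repository's code: `build_parser` and `validate_gpu_args` are hypothetical names, and the defaults (`-1` for `--socket-id`, `None` for the two thread counts) are assumptions inferred from the comparisons in the hunk above.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the common argument parser."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu", dest="gpu", action="store_true",
                        help="Run the benchmark script using GPU")
    # Defaults below are assumptions inferred from the checks in the diff.
    parser.add_argument("--socket-id", dest="socket_id", type=int, default=-1)
    parser.add_argument("--num-intra-threads", dest="num_intra_threads",
                        type=int, default=None)
    parser.add_argument("--num-inter-threads", dest="num_inter_threads",
                        type=int, default=None)
    return parser


def validate_gpu_args(args):
    """Reject CPU-only tuning flags when --gpu is set, mirroring the diff."""
    if not args.gpu:
        return
    if args.socket_id != -1:
        raise ValueError("--socket-id cannot be used with --gpu parameter.")
    if args.num_intra_threads is not None:
        raise ValueError("--num-intra-threads cannot be used with --gpu parameter.")
    if args.num_inter_threads is not None:
        raise ValueError("--num-inter-threads cannot be used with --gpu parameter.")


if __name__ == "__main__":
    validate_gpu_args(build_parser().parse_args(["--gpu"]))
    print("--gpu alone is accepted")
```

Because the checks compare against the defaults rather than inspecting `sys.argv`, explicitly passing `--socket-id -1` together with `--gpu` would still be accepted in this sketch.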

benchmarks/common/tensorflow/start.sh

@@ -1,6 +1,6 @@
     #!/usr/bin/env bash
     #
-    # Copyright (c) 2023 Intel Corporation
+    # Copyright (c) 2018-2023 Intel Corporation
     #
     # Licensed under the Apache License, Version 2.0 (the "License");
     # you may not use this file except in compliance with the License.
@@ -54,7 +54,26 @@ echo " NUMA_CORES_PER_INSTANCE: ${NUMA_CORES_PER_INSTANCE}"
     echo "    PYTHON_EXE: ${PYTHON_EXE}"
     echo "    PYTHONPATH: ${PYTHONPATH}"
     echo "    DRY_RUN: ${DRY_RUN}"
+    echo "    GPU: ${GPU}"
+    #  Enable GPU Flag
+    gpu_arg=""
+    is_model_gpu_supported="False"
+    if [ ${GPU} == "True" ]; then
+      gpu_arg="--gpu"
+      # Environment variables for GPU
+      export RenderCompressedBuffersEnabled=0
+      export CreateMultipleSubDevices=1
+      export ForceLocalMemoryAccessMode=1
+      export SYCL_PI_LEVEL_ZERO_BATCH_SIZE=1
+    else
+      unset RenderCompressedBuffersEnabled
+      unset CreateMultipleSubDevices
+      unset ForceLocalMemoryAccessMode
+      unset ForceNonSystemMemoryPlacement
+      unset TF_ENABLE_LAYOUT_OPT
+      unset SYCL_PI_LEVEL_ZERO_BATCH_SIZE
+    fi
     #  inference & training is supported right now
     if [ ${MODE} != "inference" ] && [ ${MODE} != "training" ]; then
       echo "${MODE} mode for ${MODEL_NAME} is not supported"
@@ Expand Down Expand Up @@
     # Common execution command used by all models
     function run_model() {
+      if [ ${is_model_gpu_supported} == "False"  ] && [ ${GPU} == "True" ]; then
+        echo "Runing ${MODEL_NAME} ${MODE} with precision ${PRECISION} does not support --gpu."
+        exit 1
+      fi
       # Navigate to the main benchmark directory before executing the script,
       # since the scripts use the benchmark/common scripts as well.
       cd ${MOUNT_BENCHMARK}
@@ -390,7 +413,8 @@ ${benchmark_only_arg} \
     ${output_results_arg} \
     ${weight_sharing_arg} \
     ${synthetic_data_arg} \
-    ${verbose_arg}"
+    ${verbose_arg} \
+    ${gpu_arg}"
     if [ ${MOUNT_EXTERNAL_MODELS_SOURCE} != "None" ]; then
       CMD="${CMD} --model-source-dir=${MOUNT_EXTERNAL_MODELS_SOURCE}"
@@ -978,6 +1002,7 @@ function resnet101_inceptionv3() {
     # ResNet50  model
     function resnet50() {
         export PYTHONPATH=${PYTHONPATH}:$(pwd):${MOUNT_BENCHMARK}
+        is_model_gpu_supported="True"
         # For accuracy, dataset location is required.
         if [ "${DATASET_LOCATION_VOL}" == "None" ] && [ ${ACCURACY_ONLY} == "True" ]; then
@@ -1062,6 +1087,7 @@ function rfcn() {
     # SSD-MobileNet model
     function ssd_mobilenet() {
+      is_model_gpu_supported="True"
       if [ ${PRECISION} == "fp32" ] || [ ${PRECISION} == "bfloat16" ]; then
         if [ ${BATCH_SIZE} != "-1" ]; then
           echo "Warning: SSD-MobileNet FP32 inference script does not use the batch_size arg"
@@ -1404,7 +1430,21 @@ function wavenet() {
     # BERT base
     function bert_base() {
-      if [ ${PRECISION} == "fp32" ]  || [ $PRECISION == "bfloat16" ]; then
+      if [ ${GPU} == "True" ]; then
+        if [ ${MODE} == "inference" ]; then
+          echo "PRECISION=${PRECISION} on GPU not supported for ${MODEL_NAME} ${MODE} in this repo."
+          exit 1
+        elif [ ${MODE} == "training" ]; then
+          if [ ${PRECISION} != "fp32" ] && [ ${PRECISION} != "bfloat16" ]; then
+            echo "PRECISION=${PRECISION} on GPU not supported for ${MODEL_NAME} ${MODE} in this repo."
+            exit 1
+          fi
+        fi
+        is_model_gpu_supported="True"
+        export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}
+        bert_options
+        CMD=${CMD} run_model
+      elif [ ${PRECISION} == "fp32" ]  || [ $PRECISION == "bfloat16" ]; then
         export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}
         bert_options
         CMD=${CMD} run_model
@@ -1416,11 +1456,58 @@ function bert_base() {
     # BERT Large model
     function bert_large() {
-        # Change if to support fp32
-        if [ ${PRECISION} == "fp32" ]  || [ $PRECISION == "int8" ] || [ $PRECISION == "bfloat16" ] || [ $PRECISION == "fp16" ]; then
+        export PYTHONPATH=${PYTHONPATH}:${MOUNT_BENCHMARK}
+        if [ ${GPU} == "True" ]; then
+          if [ ${MODE} == "inference" ]; then
+            if [ ${PRECISION} != "fp32" ] && [ ${PRECISION} != "fp16" ] && [ ${PRECISION} != "bfloat16" ]; then
+              echo "PRECISION=${PRECISION} on GPU not supported for ${MODEL_NAME} ${MODE} in this repo."
+              exit 1
+            fi
+          elif [ ${MODE} == "training" ]; then
+            if [ ${PRECISION} != "fp32" ] && [ ${PRECISION} != "bfloat16" ]; then
+              echo "PRECISION=${PRECISION} on GPU not supported for ${MODEL_NAME} ${MODE} in this repo."
+              exit 1
+            fi
+          fi
+          is_model_gpu_supported="True"
           export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}
           bert_options
           CMD=${CMD} run_model
+        else
+          if [ ${PRECISION} == "fp32" ]  || [ $PRECISION == "int8" ] || [ $PRECISION == "bfloat16" ] || [ $PRECISION == "fp16" ]; then
+            export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}
+            bert_options
+            CMD=${CMD} run_model
+          else
+            echo "PRECISION=${PRECISION} not supported for ${MODEL_NAME} in this repo."
+            exit 1
+          fi
+        fi
+    }
+    # distilBERT base model
+    function distilbert_base() {
+        if [ ${PRECISION} == "fp32" ] || [ ${PRECISION} == "bfloat16" ]|| [ ${PRECISION} == "int8" ]; then
+          export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}
+          CMD="${CMD} $(add_arg "--warmup-steps" ${WARMUP_STEPS})"
+          CMD="${CMD} $(add_arg "--steps" ${STEPS})"
+          if [ ${NUM_INTER_THREADS} != "None" ]; then
+            CMD="${CMD} $(add_arg "--num-inter-threads" ${NUM_INTER_THREADS})"
+          fi
+          if [ ${NUM_INTRA_THREADS} != "None" ]; then
+            CMD="${CMD} $(add_arg "--num-intra-threads" ${NUM_INTRA_THREADS})"
+          fi
+          if [ -z ${STEPS} ]; then
+            CMD="${CMD} $(add_arg "--steps" ${STEPS})"
+          fi
+          if [ -z $MAX_SEQ_LENGTH ]; then
+            CMD="${CMD} $(add_arg "--max-seq-length" ${MAX_SEQ_LENGTH})"
+          fi
+          CMD=${CMD} run_model
         else
           echo "PRECISION=${PRECISION} not supported for ${MODEL_NAME} in this repo."
           exit 1
@@ Expand Down @@
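
The nested `if` branches that `bert_base` and `bert_large` gained for `GPU == "True"` amount to a small lookup table. The sketch below restates that table in Python purely for illustration; `GPU_SUPPORTED` and `gpu_precision_supported` are hypothetical names that do not exist in `start.sh`, and the table covers only the two BERT functions shown in this diff.

```python
# Precisions accepted on GPU per (model, mode), read off the shell branches
# in bert_base() and bert_large(). An empty set means every precision is
# rejected for that combination.
GPU_SUPPORTED = {
    ("bert_base", "inference"): set(),
    ("bert_base", "training"): {"fp32", "bfloat16"},
    ("bert_large", "inference"): {"fp32", "fp16", "bfloat16"},
    ("bert_large", "training"): {"fp32", "bfloat16"},
}


def gpu_precision_supported(model, mode, precision):
    """Return True if the shell script would proceed to run_model on GPU.

    Unknown (model, mode) pairs return False here; that is a choice made
    for this sketch, not behavior taken from start.sh.
    """
    return precision in GPU_SUPPORTED.get((model, mode), set())


if __name__ == "__main__":
    for (model, mode), precisions in sorted(GPU_SUPPORTED.items()):
        print(model, mode, sorted(precisions) or "not supported")
```

Note that on the CPU path `bert_large` still accepts `int8`, which the GPU path rejects for both modes.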

benchmarks/image_recognition/tensorflow/resnet50v1_5/inference/bfloat16/model_init.py

@@ Expand Up @@
             config_file_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "config.json")
             self.set_kmp_vars(config_file_path, kmp_blocktime=str(self.args.kmp_blocktime))
-            set_env_var("OMP_NUM_THREADS", self.args.num_intra_threads)
+            if not self.args.gpu:
+                set_env_var("OMP_NUM_THREADS", self.args.num_intra_threads)
             # If weight-sharing flag is ON, then use the weight-sharing script.
             if self.args.weight_sharing and not self.args.accuracy_only:
                 benchmark_script = os.path.join(
                     self.args.intelai_models, self.args.mode,
                     "eval_image_classifier_inference_weight_sharing.py")
             else:
-                benchmark_script = os.path.join(
-                    self.args.intelai_models, self.args.mode,
-                    "eval_image_classifier_inference.py")
+                if self.args.gpu:
+                    benchmark_script = os.path.join(
+                        self.args.intelai_models, self.args.mode, self.args.precision,
+                        "eval_image_classifier_inference.py")
+                else:
+                    benchmark_script = os.path.join(
+                        self.args.intelai_models, self.args.mode,
+                        "eval_image_classifier_inference.py")
             self.benchmark_command = self.get_command_prefix(args.socket_id) + \
                 self.python_exe + " " + benchmark_script
             num_cores = self.platform_util.num_cores_per_socket if self.args.num_cores == -1 \
                 else self.args.num_cores
-            self.benchmark_command = \
-                self.benchmark_command + \
-                " --input-graph=" + self.args.input_graph + \
-                " --num-inter-threads=" + str(self.args.num_inter_threads) + \
-                " --num-intra-threads=" + str(self.args.num_intra_threads) + \
-                " --num-cores=" + str(num_cores) + \
-                " --batch-size=" + str(self.args.batch_size) + \
-                " --warmup-steps=" + str(self.args.warmup_steps) + \
-                " --steps=" + str(self.args.steps)
+            if self.args.gpu:
+                self.benchmark_command = \
+                    self.benchmark_command + \
+                    " --input-graph=" + self.args.input_graph + \
+                    " --num-cores=" + str(num_cores) + \
+                    " --batch-size=" + str(self.args.batch_size) + \
+                    " --warmup-steps=" + str(self.args.warmup_steps) + \
+                    " --steps=" + str(self.args.steps)
+            else:
+                self.benchmark_command = \
+                    self.benchmark_command + \
+                    " --input-graph=" + self.args.input_graph + \
+                    " --num-inter-threads=" + str(self.args.num_inter_threads) + \
+                    " --num-intra-threads=" + str(self.args.num_intra_threads) + \
+                    " --num-cores=" + str(num_cores) + \
+                    " --batch-size=" + str(self.args.batch_size) + \
+                    " --warmup-steps=" + str(self.args.warmup_steps) + \
+                    " --steps=" + str(self.args.steps)
             if self.args.data_num_inter_threads:
                 self.benchmark_command += " --data-num-inter-threads=" + str(self.args.data_num_inter_threads)
@@ Expand Down @@
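
The script-selection logic in this hunk adds a precision subdirectory to the path only on the GPU, non-weight-sharing branch. A minimal sketch of that decision follows, assuming a free function in place of the method; `select_benchmark_script` and its parameter names are hypothetical, while the two script file names are taken from the diff.

```python
import os


def select_benchmark_script(intelai_models, mode, precision,
                            gpu=False, weight_sharing=False,
                            accuracy_only=False):
    """Pick the evaluation script path the way the bfloat16 hunk does."""
    if weight_sharing and not accuracy_only:
        # The weight-sharing script is chosen regardless of the gpu flag.
        return os.path.join(
            intelai_models, mode,
            "eval_image_classifier_inference_weight_sharing.py")
    if gpu:
        # GPU scripts live one level deeper, under the precision name.
        return os.path.join(
            intelai_models, mode, precision,
            "eval_image_classifier_inference.py")
    return os.path.join(
        intelai_models, mode, "eval_image_classifier_inference.py")


if __name__ == "__main__":
    # "models" is a placeholder directory used only for this example.
    print(select_benchmark_script("models", "inference", "bfloat16", gpu=True))
```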

benchmarks/image_recognition/tensorflow/resnet50v1_5/inference/fp16/__init__.py

@@ -15,3 +15,5 @@
     # See the License for the specific language governing permissions and
     # limitations under the License.
     #
+    #

benchmarks/image_recognition/tensorflow/resnet50v1_5/inference/fp16/model_init.py

@@ Expand Up @@
             config_file_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "config.json")
             self.set_kmp_vars(config_file_path, kmp_blocktime=str(self.args.kmp_blocktime))
-            set_env_var("OMP_NUM_THREADS", self.args.num_intra_threads)
             # If weight-sharing flag is ON, then use the weight-sharing script.
             if self.args.weight_sharing and not self.args.accuracy_only:
+                set_env_var("OMP_NUM_THREADS", self.args.num_intra_threads)
                 benchmark_script = os.path.join(
                     self.args.intelai_models, self.args.mode,
                     "eval_image_classifier_inference_weight_sharing.py")
             else:
-                benchmark_script = os.path.join(
-                    self.args.intelai_models, self.args.mode,
-                    "eval_image_classifier_inference.py")
+                if self.args.gpu:
+                    benchmark_script = os.path.join(
+                        self.args.intelai_models, self.args.mode, self.args.precision,
+                        "eval_image_classifier_inference.py")
+                else:
+                    set_env_var("OMP_NUM_THREADS", self.args.num_intra_threads)
+                    benchmark_script = os.path.join(
+                        self.args.intelai_models, self.args.mode,
+                        "eval_image_classifier_inference.py")
             self.benchmark_command = self.get_command_prefix(args.socket_id) + \
                 self.python_exe + " " + benchmark_script
             num_cores = self.platform_util.num_cores_per_socket if self.args.num_cores == -1 \
                 else self.args.num_cores
-            self.benchmark_command = \
-                self.benchmark_command + \
-                " --input-graph=" + self.args.input_graph + \
-                " --data-type=" + self.args.precision + \
-                " --num-inter-threads=" + str(self.args.num_inter_threads) + \
-                " --num-intra-threads=" + str(self.args.num_intra_threads) + \
-                " --num-cores=" + str(num_cores) + \
-                " --batch-size=" + str(self.args.batch_size) + \
-                " --warmup-steps=" + str(self.args.warmup_steps) + \
-                " --steps=" + str(self.args.steps)
+            if self.args.gpu:
+                self.benchmark_command = \
+                    self.benchmark_command + \
+                    " --input-graph=" + self.args.input_graph + \
+                    " --num-cores=" + str(num_cores) + \
+                    " --batch-size=" + str(self.args.batch_size) + \
+                    " --warmup-steps=" + str(self.args.warmup_steps) + \
+                    " --steps=" + str(self.args.steps)
+            else:
+                self.benchmark_command = \
+                    self.benchmark_command + \
+                    " --input-graph=" + self.args.input_graph + \
+                    " --data-type=" + self.args.precision + \
+                    " --num-inter-threads=" + str(self.args.num_inter_threads) + \
+                    " --num-intra-threads=" + str(self.args.num_intra_threads) + \
+                    " --num-cores=" + str(num_cores) + \
+                    " --batch-size=" + str(self.args.batch_size) + \
+                    " --warmup-steps=" + str(self.args.warmup_steps) + \
+                    " --steps=" + str(self.args.steps)
             if self.args.data_num_inter_threads:
                 self.benchmark_command += " --data-num-inter-threads=" + str(self.args.data_num_inter_threads)
@@ Expand Down @@
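
The two near-identical command strings in the fp16 hunk differ only in that the GPU branch drops `--data-type`, `--num-inter-threads`, and `--num-intra-threads`. One way to express that without duplicating the shared flags is sketched below; `build_benchmark_args` is a hypothetical helper, not part of the repository, and it takes a plain namespace in place of `self.args`.

```python
from types import SimpleNamespace


def build_benchmark_args(args, num_cores):
    """Build the flag string appended to the benchmark command (fp16 variant)."""
    parts = ["--input-graph=" + args.input_graph]
    if not args.gpu:
        # CPU-only flags; the GPU branch in the diff omits all three.
        parts.append("--data-type=" + args.precision)
        parts.append("--num-inter-threads=" + str(args.num_inter_threads))
        parts.append("--num-intra-threads=" + str(args.num_intra_threads))
    parts.append("--num-cores=" + str(num_cores))
    parts.append("--batch-size=" + str(args.batch_size))
    parts.append("--warmup-steps=" + str(args.warmup_steps))
    parts.append("--steps=" + str(args.steps))
    # Leading space matches the " --flag=value" concatenation style in the diff.
    return " " + " ".join(parts)


if __name__ == "__main__":
    # All values here are placeholders chosen for the example.
    example = SimpleNamespace(
        gpu=True, input_graph="model.pb", precision="fp16",
        num_inter_threads=None, num_intra_threads=None,
        batch_size=32, warmup_steps=10, steps=50)
    print(build_benchmark_args(example, num_cores=8))
```

The flag order matches the order in the diff for both branches, so the resulting string should be interchangeable with the concatenated version.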
