Dev deepfm for check #385

Open · wants to merge 23 commits into base: dev_deepfm

Conversation

davidxiaozhi

Please help take a look at the error messages from the new DeepFM.
@MARD1NO please review the new version of DeepFM. As a next step, we plan to split the dense network from all of the OneEmbedding inputs: for online deployment, embeddings will no longer go through an operator but will instead be assembled by the inference server. If you have a better approach, let's discuss it over WeChat.
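The proposed split can be sketched as follows. This is a minimal illustration of the idea only: the function name `dense_tower_input` and its signature are hypothetical, not the repo's API; the `log(dense + 1)` transform is taken from the commit log below.

```python
import math

def dense_tower_input(dense_feats, sparse_emb_vecs):
    """Hypothetical server-side assembly: the model receives embedding
    vectors already fetched by the inference server instead of looking
    them up through a OneEmbedding operator.

    dense_feats:     raw dense feature values
    sparse_emb_vecs: one embedding vector per sparse field, precomputed
    """
    # log(dense + 1), as in the commit log
    dense = [math.log(x + 1.0) for x in dense_feats]
    # Flatten the precomputed embeddings and concatenate with dense part
    flat = [v for vec in sparse_emb_vecs for v in vec]
    return dense + flat
```

The dense tower would then consume this concatenated vector, so only the dense network needs to run as a graph at serving time.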

BBuf and others added 23 commits April 16, 2022 20:26
* fix import bug

* refine

* code format

* fix comment
* eval graph return broadcast

* fix

* eval data pipeline

* pred to local

* dense&label to float

* use flow roc_auc_score

* add throughput

* prefetch eval batches

* datareader worker=1

* use sklearn roc auc score

* default value for table size array

* update readme

* rm dtype in to global

* use of roc_auc_score in dlrm

* to_global is_balanced=True

* make dataset by spark

* sync eval

* rm is_balanced=True

* fix start time

* rm sklearn in requirements

* fix sparse missing value

* update
* dev_dlrm_dockerfile

* rm dockerfile

* add tools

* update

* update

* fix
* update dlrm tool default path

* makeDlrmDataset function

* mod_idx->modIdx

* update

* fix

* step by step
* wdl -> dlrm

* update train.py

* update readme temporary

* update

* update

* update

* update

* update

* update

* update arguments

* rm sparse optimizer

* update

* update

* update

* dot

* eager 1 device, old embedding

* eager consistent ok

* OK for train only

* rm transpose

* still only train OK

* use register_buffer

* train and eval ok

* embedding type

* dense to int

* log(dense+1)

* eager OK

* rm model type

* ignore buffer

* update sh

* rm dropout

* update module

* one module

* update

* update

* update

* update

* labels dtype

* Dev dlrm parquet (Oneflow-Inc#282)

* update

* backup

* parquet train OK

* update

* update

* update

* dense to float

* update

* add lr scheduler (Oneflow-Inc#283)

* Dev dlrm eval partnum (Oneflow-Inc#284)

* eval data part number

* fix

* support slots (Oneflow-Inc#285)

* support slots

* self._origin in graph

* slots to consistent

* format

* fix speed (Oneflow-Inc#286)

Co-authored-by: guo ran <[email protected]>

* Update dlrm.py

bmm -> matmul

* Dev dlrm embedding split (Oneflow-Inc#290)

* support embedding model parallel

* to consistent for embedding

* update sbp derivation

* fix

* update

* dlrm one embedding add options (Oneflow-Inc#291)

* add options

* add fp16 and loss_scaler (Oneflow-Inc#292)

* fix (Oneflow-Inc#293)

* Dev dlrm offline auc (Oneflow-Inc#294)

* calculate auc offline

* fix one embedding module, rm optimizer conf (Oneflow-Inc#296)

* calculate auc offline

* update

* add auc calculator

* fix

* format print

* add fused_interaction

* fix

* rm optimizer conf

* fix

Co-authored-by: ShawnXuan <[email protected]>

* refine embedding options (Oneflow-Inc#299)

* refine options

* rename args

* fix arg

* Dev dlrm offline eval (Oneflow-Inc#300)

* update offline auc

* update

* merge master

* Dev dlrm consistent 2 global (Oneflow-Inc#303)

* consistent-

* update

* Dev dlrm petastorm (Oneflow-Inc#306)

petastorm dataset

* bce with logits (Oneflow-Inc#307)

* Dev dlrm make eval ds (Oneflow-Inc#308)

* fix

* new val dataloader each time

* rm useless

* rm useless

* rm useless

* Dev dlrm vocab size (Oneflow-Inc#309)

* fix

* new val dataloader each time

* rm useless

* rm useless

* rm useless

* vocab size

* fix fc(scores) init (Oneflow-Inc#310)

* update dense relu (Oneflow-Inc#311)

* update

* use naive logger

* rm logger.py

* update

* fix loss to local

* rm useless line

* remove to local

* rank 0

* fix

* add graph_train.py

* keep graph mode only in graph_train.py

* rm is_global

* update

* train one_embedding with graph

* update

* rm useless files

* rm more files

* update

* save -> save_model

* update eval arguments

* rm eval_save_dir

* mv import oneflow before sklearn.metrics; otherwise it does not work on onebrain

* rm useless lines

* print host and device mem after eval

* add auc calculation time

* update

* add fused_dlrm temporarily

* eager train

* shuffling_queue_capacity -> shuffle_row_groups

* update trainer for eager

* rm dataset type

* update

* update

* parquet dataloader

* rm fused_dlrm.py

* update

* update graph train

* update

* update

* update lr scheduler

* update

* update shell

* rm lr scheduler

* rm useless lines

* update

* update one embedding api

* fix

* change size_factor order

* fix eval loader

* rm debug lines

* rm train/eval subfolders

* files

* support test

* update oneembedding initializer

* update

* update

* update

* rm useless lines

* option -> options

* eval barrier

* update

* rm column_ids

* new api

* fix push pull job

* rm eager test

* rm graph test

* rm

* eager_train-

* rm

* merge graph train to train

* rm Embedding

* update

* rm vocab size

* rm test name

* rm split axis

* update

* train -> train_eval

* update

* replace class Trainer

* fix

* fix

* merge mlp and fused mlp

* pythonic

* interaction padding

* format

* left 3 store types

* left 3 store types

* use capacity_per_rank

* fix

* format

* update

* update

* update

* use 13 and 26

* update

* rm size factor

* update

* update

* update readme

* update

* update

* modify_read

* rm useless import

* add requirements.txt

* rm args.not_eval_after_training

* rm batch size per rank

* set default eval batches

* every_n_iter -> interval

* device_memory_budget_mb_per_rank -> cache_memory_budget_mb_per_rank

* dataloader-

* update

* update

* update

* update

* update

* update

* use_fp16-

* single py

* disable_fusedmlp

* 4 to 1

* new api

* add capacity

* Arguments description (Oneflow-Inc#325)

* Arguments description

* rectify README.md

* column-

* make_table

* MultiTableEmbedding

* update store type

* update

* update readme

* update README

* update

* iter->step

* update README

* add license

* update README

* install oneflow nightly

* Add tools directory info to  DLRM README.md (Oneflow-Inc#328)

* Add deepfm model(FM component missed)

* Add FM component

* Update README.md

* Fix loss bug; change weight initialization methods

* change lr scheduler to multistepLR

* Add dropout layer to dnn

* Add monitor for early stopping

* Simplify early stopping schema

* Normal initialization for oneembedding; Adam optimizer; h52parquet

* Add logloss in eval for early stop

* Fix dataloader slicing bug

* Change lr schedule to reduce lr on plateau

* Refine train/val/test

* Add validation and test evaluation

* Update readme and help message

* use flow.roc_auc_score, prefetch eval batches, fix train step start time

* Delete unused args;
Change file path;
Add Throughput measurement.

* Add deepfm with MultiColOneEmbedding

* remove fusedmlp; change interaction class to function; keep val graph predict in gpu

* Use flow._C.binary_cross_entropy_loss;
Remove sklearn from env requirement;

* Fix early stop bug;
Check if path valid before loading model

* Change auc time and logloss time to metrics time;
Remove last validation;

* replace view with keepdim;
replace nn.sigmoid with tensor.sigmoid

* change unsqueeze to keepdim;
use list in dataloader

* Use from numpy to reduce cast time

* Add early stop and save best to args

* Reformat deepfm_train_eval

* Use BCEWithLogitsLoss

* Update readme;
Change early_stop to disable_early_stop;
Update train script

* Update README.md

* Fix early stop bugs

* Refine save best model help message

* Add scala script and spark launching shell script

* Delete h5_to_parquet.py

* Update readme.md

* Use real values in table size array example;
delete criteo_parquet.py

* Add split_criteo_kaggle.py

* Update readme.md

* Rename training script;
Update readme.md

* Update Readme.md (fix bad links)

* Update README.md

* Format files

* Add out_features in DNN

Co-authored-by: ShawnXuan <[email protected]>
Co-authored-by: guo ran <[email protected]>
Co-authored-by: BakerMara <[email protected]>
Co-authored-by: BoWen Sun <[email protected]>
* add dcn files.

* add README.md

* update readme.md, requirements.txt, train.sh; pretrained models converted from PyTorch are in /models-torch2flow

* deleted files

* deleted files

* auto format by CI

* deleted .gitignore

* updated files

* modified nn.init.zeros_ and nn.init.xavier_normal_ in crossnet.

* fix change from /scripts/swin_dataloader_compare_speed_with_pytorch.py

* add processing frappe from csv to parquet format files: tools/frappe-parquet.py, tools/frappe-parquet.sh

* modified frappe download link in README.md

* delete tools dir

* add tools dir

* update dcn_graph_train_eval files

* update fuxi dcn graph train and eval files , new dataset make tool based on fuxi

* modified train.sh table_size_array

* fix some errors in fuxi_data_util when saving csv

* Criteo dcn related files

* modified README.md

* modified dcn_train_eval.py some arguments name

* create graph when lr_decay

* deleted fm_persistent

* update dcn_train_eval.py

* formatted file

* new tool dir , and modified dcn_train_eval.py/sh fake path

* add feature_map_json argument

* delete unnecessary and useless code

* add cast in make_criteo_parquet.py, modified dcn_train_eval.py

* delete useless

* add throughput

* add valid test samples arg

* fix batch_size and train_batch_size mismatched problem

* delete useless print code

* add a blank line in the bottom of dataset_config.yaml

* add requirements.txt, update README.md

* move loss=loss.numpy() to improve efficiency

* delete fuxi code in dcn_train_eval.py, add scala related files, update README

* update README

* remove RecommenderSystems/dcn/tools/make_criteo_parquet.py and RecommenderSystems/dcn/tools/dataset_config.yaml, update table_size_array

* simplified DNN module, modified test eval process and related README and train.sh contents

* add Crossnet fuxi quote, modified directory description in Readme and ddn to dcn

* name auc and logloss in eval process as val_auc and val_logloss, add pandas and sklearn to requirements.txt, modified README

* simplified train.sh and related  README contents

* simplified L2,3,4 in train.sh

* set size_factor default=3

* add dcn structure image

* update Crossnet implementation in README

* update Crossnet implementation in README

* update Crossnet implementation in README

* update Crossnet implementation in README

* update README

Co-authored-by: oneflow-ci-bot <[email protected]>
* only ipnn

* ipnn only to pr

* rm .gitignore

* modify README

* delete useless code

* delete useless .py

* modify README

* add split_criteo_kaggle.py

* modify np_to_global function
* modify README, delete useless code, rename files

* modify model name

* modify readme

* modify readme

* delete useless code and black
* Replace Dnn with fused mlp

* Add disable_fusedmlp to args;

* Remove duplicate args

* Format deepfm_train_eval.py
* MMoe parquet script;
Add a mmoe model draft;

* Add Mmoe dataloader;
Add MmoeModule;

* Add mmoe eval part;
Remove useless code;

* Update args

* Add sh script

* Fix bugs in parallel

* Replace table size array;

* Update readme;
Update args;

* Update README.md

* Change gate and tower to dnn

* fix typo in mmoe_parquet.py;
remove used import

* Update README.md (dataset);
Update mmoe_train_eval.py to deal with empty str args;

* Remove sklearn and pandas dependency in mmoe_parquet.py

* Fix bugs in mmoe_parquet.py

* Simplify mmoe_parquet

* Update readme

* format mmoe_train_eval.py

* Format mmoe_parquet.py

* Remove num_sparse_features and num_dense_features
* add oneembedding key_type

* pad dense input
* test new pr

* update CPT

* update transformer

* dev roberta

* update README file

* update README file

* fix bug

* fix roberta bug

* modify according to the review

* update readme file

* update train_MNLI.py

* update roberta

* fix CPT

* update readme

* update file

* Delete empty line

* update readme

* auto format by CI

* Remove redundant dependencies

Co-authored-by: oneflow-ci-bot <[email protected]>
* copy as a new pr

* update model.py

* train teacher

* add student_kd and student

* add args

* add infer files

* update README file

* add train script

* Remove redundant files

* add requirements and update Readme

* add infer.sh

* black all files

* refactoring code

* refactoring code directory

* update readme

* update comment

* auto format by CI

* Update KnowledgeDistillation/KnowledgeDistillation/README.md

Co-authored-by: oneflow-ci-bot <[email protected]>
Co-authored-by: Liang Depeng <[email protected]>
* copy as a new pr

* update requirements, add some bash scripts

* Generate data

* convert easynlp to oneflow version

* generate data

* process data

* train teacher

* student first

* Perfect code for review

* add readme

* Adjust directory and delete redundant files

* Delete redundant files

* Delete redundant files again

* delete files in easynlp

* add requirement

* delete build.sh

* auto format by CI

* delete files in easynlp

* auto format by CI

* add requirement in easynlp

Co-authored-by: oneflow-ci-bot <[email protected]>
@ShawnXuan
Contributor

Hi there,

We changed the data_dir and executed "deepfm_v1_train.py" in debug mode with no errors; please find the log below.

debug mode log
loaded library: /lib/x86_64-linux-gnu/libibverbs.so.1
(the same line is printed by each worker process; the interleaved duplicates are omitted)
path: ['/data/xiexuan/git-repos/oneflow/python/oneflow']
version: 0.8.1+cu116.git.1e2849098e
git_commit: 1e2849098e
cmake_build_type: Release
rdma: True
mlir: False
(the same version info is printed by the other three ranks)
------------------------ arguments ------------------------
  amp ............................................. False
  batch_size ...................................... 64
  cache_memory_budget_mb .......................... 1024
  data_dir ........................................ /data/criteo1t/criteo1t_dlrm_parquet_40M
  decay_batches ................................... 10
  decay_start ..................................... 10
  disable_early_stop .............................. False
  disable_fusedmlp ................................ False
  dnn ............................................. [1000, 1000, 1000, 1000, 1000]
  embedding_vec_size .............................. 128
  eval_batch_size ................................. 64
  eval_batches .................................... 10
  eval_interval ................................... 1
  learning_rate ................................... 0.001
  loss_print_interval ............................. 1
  loss_scale_policy ............................... static
  lr_factor ....................................... 0.1
  min_delta ....................................... 1e-06
  min_lr .......................................... 1e-06
  model_load_dir .................................. None
  model_save_dir .................................. ckpt
  net_dropout ..................................... 0.2
  num_dense_fields ................................ 13
  num_sparse_fields ............................... 26
  num_test_samples ................................ 6400
  num_train_samples ............................... 6400
  num_val_samples ................................. 6400
  patience ........................................ 2
  persistent_path ................................. ./persistent
  save_best_model ................................. True
  save_initial_model .............................. False
  save_model_after_each_eval ...................... False
  store_type ...................................... cached_host_mem
  table_size_array ................................ [39884407, 39043, 17289, 7420, 20263, 3, 7120, 1543, 63, 38532952, 2953546, 403346, 10, 2208, 11938, 155, 4, 976, 14, 39979772, 25641295, 39664985, 585935, 12972, 108, 36]
  train_batches ................................... 100
  warmup_batches .................................. 10
-------------------- end of arguments ---------------------
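For illustration, the `patience` and `min_delta` arguments above could drive an early-stopping monitor like the following. This is a minimal sketch of the scheme the commit log describes ("Add monitor for early stopping", "Simplify early stopping schema"), not the repo's actual implementation.

```python
class EarlyStopMonitor:
    """Stop training once the monitored metric (e.g. val AUC) has not
    improved by at least min_delta for more than `patience` evaluations."""

    def __init__(self, patience=2, min_delta=1e-6):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_evals = 0

    def update(self, metric):
        # Returns True when training should stop.
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals > self.patience
```

With the defaults above (patience=2), training stops after the third consecutive evaluation without improvement.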
DeepFMModule(
  (embedding_layer): OneEmbedding(
    (one_embedding): MultiTableEmbedding()
  )
  (dnn_layer): DNN(
    (linear_layers): FusedMLP(in_features=3456, hidden_features=[1000, 1000, 1000, 1000, 1000], out_features=1, skip_final_activation=True)
  )
  (bottom_mlp): MLP(
    (linear_layers): FusedMLP(in_features=16, hidden_features=[512, 256], out_features=128, skip_final_activation=False)
  )
  (lr_out_mlp): MLP(
    (linear_layers): FusedMLP(in_features=3328, hidden_features=[512, 64], out_features=1, skip_final_activation=False)
  )
)
(the same module summary is printed by the other three ranks)
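As a sanity check, the `FusedMLP` `in_features` values in the module summary are consistent with the argument dump: 26 sparse fields embedded at `embedding_vec_size=128`, plus one 128-dim vector produced by `bottom_mlp` from the 13 dense fields. The 26+1 grouping is my reading of the summary, not confirmed from the code.

```python
emb_vec_size = 128    # embedding_vec_size in the argument dump
num_sparse = 26       # num_sparse_fields
bottom_mlp_out = 128  # out_features of bottom_mlp's FusedMLP

# dnn_layer input: 26 sparse embeddings concatenated with the bottom-MLP output
dnn_in = num_sparse * emb_vec_size + bottom_mlp_out
# lr_out_mlp input: the 26 sparse embeddings alone
lr_in = num_sparse * emb_vec_size
```

These match the printed `in_features=3456` and `in_features=3328`.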
/data/xiexuan/miniconda3/envs/one/lib/python3.8/site-packages/petastorm/fs_utils.py:88: FutureWarning: pyarrow.localfs is deprecated as of 2.0.0, please use pyarrow.fs.LocalFileSystem instead.
  self._filesystem = pyarrow.localfs
Rank[0], Step 1, Loss 3.3199, Latency 4466.140 ms, Throughput 14.3, 2022-09-07 07:53:22
Rank[0], Step 1, AUC 0.62220, Eval_time 0.32 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29500 MiB, 2022-09-07 07:53:22
Rank[0], Step 2, Loss 4.0608, Latency 18.939 ms, Throughput 3379.2, 2022-09-07 07:53:22
Rank[0], Step 2, AUC 0.62243, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29631 MiB, 2022-09-07 07:53:22
Rank[0], Step 3, Loss 3.5874, Latency 18.318 ms, Throughput 3493.9, 2022-09-07 07:53:22
Rank[0], Step 3, AUC 0.62209, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29759 MiB, 2022-09-07 07:53:23
Rank[0], Step 4, Loss 3.8712, Latency 18.762 ms, Throughput 3411.2, 2022-09-07 07:53:23
Rank[0], Step 4, AUC 0.62141, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29852 MiB, 2022-09-07 07:53:23
Rank[0], Step 5, Loss 3.7678, Latency 18.564 ms, Throughput 3447.6, 2022-09-07 07:53:23
Rank[0], Step 5, AUC 0.62084, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29853 MiB, 2022-09-07 07:53:23
Rank[0], Step 6, Loss 3.5638, Latency 18.523 ms, Throughput 3455.1, 2022-09-07 07:53:23
Rank[0], Step 6, AUC 0.62026, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29854 MiB, 2022-09-07 07:53:24
Rank[0], Step 7, Loss 3.6404, Latency 18.428 ms, Throughput 3473.0, 2022-09-07 07:53:24
Rank[0], Step 7, AUC 0.61867, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29855 MiB, 2022-09-07 07:53:24
Rank[0], Step 8, Loss 3.6860, Latency 19.103 ms, Throughput 3350.3, 2022-09-07 07:53:24
Rank[0], Step 8, AUC 0.61730, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29854 MiB, 2022-09-07 07:53:24
Rank[0], Step 9, Loss 3.4174, Latency 18.378 ms, Throughput 3482.4, 2022-09-07 07:53:24
Rank[0], Step 9, AUC 0.61639, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29854 MiB, 2022-09-07 07:53:24
Rank[0], Step 10, Loss 3.3846, Latency 18.243 ms, Throughput 3508.2, 2022-09-07 07:53:24
Rank[0], Step 10, AUC 0.61353, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:25
Rank[0], Step 11, Loss 3.2856, Latency 18.110 ms, Throughput 3534.0, 2022-09-07 07:53:25
Rank[0], Step 11, AUC 0.61228, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:25
Rank[0], Step 12, Loss 3.4572, Latency 17.881 ms, Throughput 3579.2, 2022-09-07 07:53:25
Rank[0], Step 12, AUC 0.60908, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:25
Rank[0], Step 13, Loss 3.3568, Latency 17.867 ms, Throughput 3582.0, 2022-09-07 07:53:25
Rank[0], Step 13, AUC 0.60783, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:26
Rank[0], Step 14, Loss 3.4907, Latency 18.426 ms, Throughput 3473.4, 2022-09-07 07:53:26
Rank[0], Step 14, AUC 0.60703, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:26
Rank[0], Step 15, Loss 3.0834, Latency 18.784 ms, Throughput 3407.1, 2022-09-07 07:53:26
Rank[0], Step 15, AUC 0.60634, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:26
Rank[0], Step 16, Loss 2.7252, Latency 18.446 ms, Throughput 3469.5, 2022-09-07 07:53:26
Rank[0], Step 16, AUC 0.60589, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:27
Rank[0], Step 17, Loss 2.9814, Latency 18.018 ms, Throughput 3551.9, 2022-09-07 07:53:27
Rank[0], Step 17, AUC 0.60520, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:27
Rank[0], Step 18, Loss 2.9342, Latency 18.516 ms, Throughput 3456.5, 2022-09-07 07:53:27
Rank[0], Step 18, AUC 0.60509, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:27
Rank[0], Step 19, Loss 2.9746, Latency 18.502 ms, Throughput 3459.1, 2022-09-07 07:53:27
Rank[0], Step 19, AUC 0.60497, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:28
Rank[0], Step 20, Loss 3.0782, Latency 18.153 ms, Throughput 3525.6, 2022-09-07 07:53:28
Rank[0], Step 20, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:28
Rank[0], Step 21, Loss 2.9523, Latency 18.443 ms, Throughput 3470.1, 2022-09-07 07:53:28
Rank[0], Step 21, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:28
Rank[0], Step 22, Loss 3.2065, Latency 18.266 ms, Throughput 3503.8, 2022-09-07 07:53:28
Rank[0], Step 22, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:28
Rank[0], Step 23, Loss 2.9332, Latency 18.447 ms, Throughput 3469.5, 2022-09-07 07:53:29
Rank[0], Step 23, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:29
Rank[0], Step 24, Loss 2.9895, Latency 19.213 ms, Throughput 3331.0, 2022-09-07 07:53:29
Rank[0], Step 24, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:29
Rank[0], Step 25, Loss 3.3795, Latency 18.651 ms, Throughput 3431.4, 2022-09-07 07:53:29
Rank[0], Step 25, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:29
Rank[0], Step 26, Loss 3.3579, Latency 18.242 ms, Throughput 3508.3, 2022-09-07 07:53:29
Rank[0], Step 26, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:30
Rank[0], Step 27, Loss 3.2023, Latency 18.808 ms, Throughput 3402.8, 2022-09-07 07:53:30
Rank[0], Step 27, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:30
Rank[0], Step 28, Loss 3.4082, Latency 18.901 ms, Throughput 3386.0, 2022-09-07 07:53:30
Rank[0], Step 28, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:30
Rank[0], Step 29, Loss 3.3067, Latency 18.364 ms, Throughput 3485.1, 2022-09-07 07:53:30
Rank[0], Step 29, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:31
Rank[0], Step 30, Loss 3.4026, Latency 18.178 ms, Throughput 3520.6, 2022-09-07 07:53:31
Rank[0], Step 30, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:31
Rank[0], Step 31, Loss 3.0869, Latency 18.307 ms, Throughput 3496.0, 2022-09-07 07:53:31
Rank[0], Step 31, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:31
Rank[0], Step 32, Loss 3.2750, Latency 18.457 ms, Throughput 3467.6, 2022-09-07 07:53:31
Rank[0], Step 32, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:31
Rank[0], Step 33, Loss 3.1438, Latency 18.313 ms, Throughput 3494.8, 2022-09-07 07:53:31
Rank[0], Step 33, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:32
Rank[0], Step 34, Loss 3.1537, Latency 18.092 ms, Throughput 3537.5, 2022-09-07 07:53:32
Rank[0], Step 34, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:32
Rank[0], Step 35, Loss 3.1989, Latency 18.436 ms, Throughput 3471.4, 2022-09-07 07:53:32
Rank[0], Step 35, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:32
Rank[0], Step 36, Loss 3.2316, Latency 18.102 ms, Throughput 3535.6, 2022-09-07 07:53:32
Rank[0], Step 36, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:33
Rank[0], Step 37, Loss 3.0106, Latency 18.401 ms, Throughput 3478.0, 2022-09-07 07:53:33
Rank[0], Step 37, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:33
Rank[0], Step 38, Loss 3.4271, Latency 18.224 ms, Throughput 3511.9, 2022-09-07 07:53:33
Rank[0], Step 38, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:33
Rank[0], Step 39, Loss 3.2087, Latency 17.910 ms, Throughput 3573.4, 2022-09-07 07:53:33
Rank[0], Step 39, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:33
Rank[0], Step 40, Loss 3.5536, Latency 18.292 ms, Throughput 3498.9, 2022-09-07 07:53:33
Rank[0], Step 40, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:34
Rank[0], Step 41, Loss 3.2040, Latency 18.160 ms, Throughput 3524.3, 2022-09-07 07:53:34
Rank[0], Step 41, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:34
Rank[0], Step 42, Loss 2.4936, Latency 18.270 ms, Throughput 3503.1, 2022-09-07 07:53:34
Rank[0], Step 42, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:34
Rank[0], Step 43, Loss 2.9037, Latency 17.861 ms, Throughput 3583.2, 2022-09-07 07:53:34
Rank[0], Step 43, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:35
Rank[0], Step 44, Loss 2.7676, Latency 18.104 ms, Throughput 3535.1, 2022-09-07 07:53:35
Rank[0], Step 44, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:35
Rank[0], Step 45, Loss 2.9128, Latency 17.930 ms, Throughput 3569.5, 2022-09-07 07:53:35
Rank[0], Step 45, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:35
Rank[0], Step 46, Loss 3.1016, Latency 18.955 ms, Throughput 3376.5, 2022-09-07 07:53:35
Rank[0], Step 46, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:36
Rank[0], Step 47, Loss 3.3592, Latency 18.216 ms, Throughput 3513.4, 2022-09-07 07:53:36
Rank[0], Step 47, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:36
Rank[0], Step 48, Loss 2.9092, Latency 18.410 ms, Throughput 3476.4, 2022-09-07 07:53:36
Rank[0], Step 48, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:36
Rank[0], Step 49, Loss 2.9091, Latency 18.947 ms, Throughput 3377.9, 2022-09-07 07:53:36
Rank[0], Step 49, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:37
Rank[0], Step 50, Loss 3.3312, Latency 18.398 ms, Throughput 3478.7, 2022-09-07 07:53:37
Rank[0], Step 50, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:37
Rank[0], Step 51, Loss 2.9679, Latency 17.965 ms, Throughput 3562.5, 2022-09-07 07:53:37
Rank[0], Step 51, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:37
Rank[0], Step 52, Loss 2.8849, Latency 18.073 ms, Throughput 3541.1, 2022-09-07 07:53:37
Rank[0], Step 52, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:37
Rank[0], Step 53, Loss 3.3143, Latency 17.990 ms, Throughput 3557.5, 2022-09-07 07:53:37
Rank[0], Step 53, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:38
Rank[0], Step 54, Loss 2.9341, Latency 18.200 ms, Throughput 3516.5, 2022-09-07 07:53:38
Rank[0], Step 54, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:38
Rank[0], Step 55, Loss 3.2370, Latency 17.924 ms, Throughput 3570.6, 2022-09-07 07:53:38
Rank[0], Step 55, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:38
Rank[0], Step 56, Loss 3.0604, Latency 18.434 ms, Throughput 3471.8, 2022-09-07 07:53:38
Rank[0], Step 56, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:39
Rank[0], Step 57, Loss 3.2732, Latency 18.997 ms, Throughput 3368.9, 2022-09-07 07:53:39
Rank[0], Step 57, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:39
Rank[0], Step 58, Loss 3.3162, Latency 18.082 ms, Throughput 3539.4, 2022-09-07 07:53:39
Rank[0], Step 58, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:39
Rank[0], Step 59, Loss 3.0230, Latency 17.910 ms, Throughput 3573.5, 2022-09-07 07:53:39
Rank[0], Step 59, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:40
Rank[0], Step 60, Loss 3.2325, Latency 17.995 ms, Throughput 3556.5, 2022-09-07 07:53:40
Rank[0], Step 60, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:40
Rank[0], Step 61, Loss 2.9899, Latency 17.903 ms, Throughput 3574.8, 2022-09-07 07:53:40
Rank[0], Step 61, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:40
Rank[0], Step 62, Loss 2.7771, Latency 18.275 ms, Throughput 3502.0, 2022-09-07 07:53:40
Rank[0], Step 62, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:41
Rank[0], Step 63, Loss 2.8222, Latency 17.962 ms, Throughput 3563.2, 2022-09-07 07:53:41
Rank[0], Step 63, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:41
Rank[0], Step 64, Loss 2.9877, Latency 18.630 ms, Throughput 3435.3, 2022-09-07 07:53:41
Rank[0], Step 64, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:41
Rank[0], Step 65, Loss 3.4542, Latency 17.934 ms, Throughput 3568.6, 2022-09-07 07:53:41
Rank[0], Step 65, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:41
Rank[0], Step 66, Loss 3.0751, Latency 18.667 ms, Throughput 3428.4, 2022-09-07 07:53:41
Rank[0], Step 66, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:42
Rank[0], Step 67, Loss 3.3020, Latency 18.395 ms, Throughput 3479.2, 2022-09-07 07:53:42
Rank[0], Step 67, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:42
Rank[0], Step 68, Loss 3.1114, Latency 18.717 ms, Throughput 3419.4, 2022-09-07 07:53:42
Rank[0], Step 68, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:42
Rank[0], Step 69, Loss 3.4781, Latency 18.296 ms, Throughput 3498.1, 2022-09-07 07:53:42
Rank[0], Step 69, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:43
Rank[0], Step 70, Loss 3.1538, Latency 18.559 ms, Throughput 3448.4, 2022-09-07 07:53:43
Rank[0], Step 70, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:43
Rank[0], Step 71, Loss 3.0717, Latency 17.930 ms, Throughput 3569.4, 2022-09-07 07:53:43
Rank[0], Step 71, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:43
Rank[0], Step 72, Loss 2.9696, Latency 18.266 ms, Throughput 3503.8, 2022-09-07 07:53:43
Rank[0], Step 72, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:44
Rank[0], Step 73, Loss 2.9164, Latency 18.047 ms, Throughput 3546.3, 2022-09-07 07:53:44
Rank[0], Step 73, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:44
Rank[0], Step 74, Loss 3.2112, Latency 18.301 ms, Throughput 3497.2, 2022-09-07 07:53:44
Rank[0], Step 74, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:44
Rank[0], Step 75, Loss 2.9350, Latency 18.151 ms, Throughput 3525.9, 2022-09-07 07:53:44
Rank[0], Step 75, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:44
Rank[0], Step 76, Loss 3.0354, Latency 18.197 ms, Throughput 3517.1, 2022-09-07 07:53:44
Rank[0], Step 76, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:45
Rank[0], Step 77, Loss 3.1407, Latency 18.419 ms, Throughput 3474.6, 2022-09-07 07:53:45
Rank[0], Step 77, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:45
Rank[0], Step 78, Loss 3.1265, Latency 18.547 ms, Throughput 3450.8, 2022-09-07 07:53:45
Rank[0], Step 78, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:45
Rank[0], Step 79, Loss 3.1059, Latency 18.958 ms, Throughput 3375.9, 2022-09-07 07:53:45
Rank[0], Step 79, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:46
Rank[0], Step 80, Loss 3.4540, Latency 18.128 ms, Throughput 3530.4, 2022-09-07 07:53:46
Rank[0], Step 80, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:46
Rank[0], Step 81, Loss 2.6493, Latency 18.294 ms, Throughput 3498.4, 2022-09-07 07:53:46
Rank[0], Step 81, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:46
Rank[0], Step 82, Loss 2.8318, Latency 18.075 ms, Throughput 3540.9, 2022-09-07 07:53:46
Rank[0], Step 82, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:47
Rank[0], Step 83, Loss 2.9971, Latency 18.288 ms, Throughput 3499.5, 2022-09-07 07:53:47
Rank[0], Step 83, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:47
Rank[0], Step 84, Loss 3.2572, Latency 18.303 ms, Throughput 3496.7, 2022-09-07 07:53:47
Rank[0], Step 84, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:47
Rank[0], Step 85, Loss 2.8068, Latency 18.571 ms, Throughput 3446.2, 2022-09-07 07:53:47
Rank[0], Step 85, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:47
Rank[0], Step 86, Loss 3.1237, Latency 18.172 ms, Throughput 3521.9, 2022-09-07 07:53:47
Rank[0], Step 86, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:48
Rank[0], Step 87, Loss 2.9193, Latency 18.141 ms, Throughput 3528.0, 2022-09-07 07:53:48
Rank[0], Step 87, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:48
Rank[0], Step 88, Loss 3.3536, Latency 17.939 ms, Throughput 3567.7, 2022-09-07 07:53:48
Rank[0], Step 88, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:48
Rank[0], Step 89, Loss 2.9193, Latency 18.720 ms, Throughput 3418.8, 2022-09-07 07:53:48
Rank[0], Step 89, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:49
Rank[0], Step 90, Loss 3.3988, Latency 18.251 ms, Throughput 3506.6, 2022-09-07 07:53:49
Rank[0], Step 90, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:49
Rank[0], Step 91, Loss 3.2051, Latency 18.085 ms, Throughput 3538.9, 2022-09-07 07:53:49
Rank[0], Step 91, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:49
Rank[0], Step 92, Loss 3.1134, Latency 17.868 ms, Throughput 3581.8, 2022-09-07 07:53:49
Rank[0], Step 92, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:50
Rank[0], Step 93, Loss 2.5236, Latency 17.648 ms, Throughput 3626.5, 2022-09-07 07:53:50
Rank[0], Step 93, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:50
Rank[0], Step 94, Loss 3.0203, Latency 18.044 ms, Throughput 3546.9, 2022-09-07 07:53:50
Rank[0], Step 94, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:50
Rank[0], Step 95, Loss 3.2143, Latency 18.956 ms, Throughput 3376.2, 2022-09-07 07:53:50
Rank[0], Step 95, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:51
Rank[0], Step 96, Loss 3.2334, Latency 18.462 ms, Throughput 3466.5, 2022-09-07 07:53:51
Rank[0], Step 96, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:51
Rank[0], Step 97, Loss 3.0373, Latency 18.065 ms, Throughput 3542.7, 2022-09-07 07:53:51
Rank[0], Step 97, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:51
Rank[0], Step 98, Loss 2.9081, Latency 19.059 ms, Throughput 3358.0, 2022-09-07 07:53:51
Rank[0], Step 98, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:52
Rank[0], Step 99, Loss 2.5825, Latency 17.985 ms, Throughput 3558.5, 2022-09-07 07:53:52
Rank[0], Step 99, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:52
Rank[0], Step 100, Loss 2.8165, Latency 18.658 ms, Throughput 3430.1, 2022-09-07 07:53:52
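The Latency and Throughput columns in the step logs above are mutually consistent with a per-step train batch of 64 samples (an inference from the numbers, not stated in the log). A minimal stdlib sketch of that relationship:

```python
def throughput(batch_size: int, latency_ms: float) -> float:
    """Samples per second for one training step: batch_size / latency."""
    return batch_size / (latency_ms / 1000.0)

# e.g. Step 43 above: Latency 17.861 ms, logged Throughput 3583.2
print(round(throughput(64, 17.861), 1))  # → 3583.2, matching the log line
```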
name:embedding_layer.one_embedding.shadow param-size:oneflow.Size([1])
name:dnn_layer.linear_layers.weight_0 param-size:oneflow.Size([1000, 3456])
name:dnn_layer.linear_layers.bias_0 param-size:oneflow.Size([1000])
name:dnn_layer.linear_layers.weight_1 param-size:oneflow.Size([1000, 1000])
name:dnn_layer.linear_layers.bias_1 param-size:oneflow.Size([1000])
name:dnn_layer.linear_layers.weight_2 param-size:oneflow.Size([1000, 1000])
name:dnn_layer.linear_layers.bias_2 param-size:oneflow.Size([1000])
name:dnn_layer.linear_layers.weight_3 param-size:oneflow.Size([1000, 1000])
name:dnn_layer.linear_layers.bias_3 param-size:oneflow.Size([1000])
name:dnn_layer.linear_layers.weight_4 param-size:oneflow.Size([1000, 1000])
name:dnn_layer.linear_layers.bias_4 param-size:oneflow.Size([1000])
name:dnn_layer.linear_layers.weight_5 param-size:oneflow.Size([1, 1000])
name:dnn_layer.linear_layers.bias_5 param-size:oneflow.Size([1])
name:bottom_mlp.linear_layers.weight_0 param-size:oneflow.Size([512, 16])
name:bottom_mlp.linear_layers.bias_0 param-size:oneflow.Size([512])
name:bottom_mlp.linear_layers.weight_1 param-size:oneflow.Size([256, 512])
name:bottom_mlp.linear_layers.bias_1 param-size:oneflow.Size([256])
name:bottom_mlp.linear_layers.weight_2 param-size:oneflow.Size([128, 256])
name:bottom_mlp.linear_layers.bias_2 param-size:oneflow.Size([128])
name:lr_out_mlp.linear_layers.weight_0 param-size:oneflow.Size([512, 3328])
name:lr_out_mlp.linear_layers.bias_0 param-size:oneflow.Size([512])
name:lr_out_mlp.linear_layers.weight_1 param-size:oneflow.Size([64, 512])
name:lr_out_mlp.linear_layers.bias_1 param-size:oneflow.Size([64])
name:lr_out_mlp.linear_layers.weight_2 param-size:oneflow.Size([1, 64])
name:lr_out_mlp.linear_layers.bias_2 param-size:oneflow.Size([1])
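The dump above lists every dense parameter; the OneEmbedding tables themselves are external and only appear as the 1-element `shadow` buffer. A stdlib sketch (illustrative, not part of the PR) that parses these `param-size:oneflow.Size([...])` lines and sums the dense-parameter count:

```python
import re
from math import prod

def dense_param_count(dump: str) -> int:
    """Sum element counts from 'param-size:oneflow.Size([...])' log lines."""
    total = 0
    for dims in re.findall(r"param-size:oneflow\.Size\(\[([0-9, ]+)\]\)", dump):
        total += prod(int(d) for d in dims.split(","))
    return total

# Two lines copied from the dump above as a sample input.
sample = (
    "name:bottom_mlp.linear_layers.weight_0 param-size:oneflow.Size([512, 16])\n"
    "name:bottom_mlp.linear_layers.bias_0 param-size:oneflow.Size([512])\n"
)
print(dense_param_count(sample))  # → 8704 (512*16 + 512)
```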
Rank[0], Step 100, AUC 0.60486, Eval_time 0.02 s, AUC_time 0.00 s, Eval_samples 640, GPU_Memory 4238 MiB, Host_Memory 29851 MiB, 2022-09-07 07:53:52
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
dataset schema
scala> spark.read.parquet("/data/criteo1t/criteo1t_dlrm_parquet_40M/*/*.parquet").printSchema

warning: 1 deprecation (since 2.13.3); for details, enable `:setting -deprecation` or `:replay -deprecation`
root
 |-- label: float (nullable = true)
 |-- I1: float (nullable = true)
 |-- I2: float (nullable = true)
 |-- I3: float (nullable = true)
 |-- I4: float (nullable = true)
 |-- I5: float (nullable = true)
 |-- I6: float (nullable = true)
 |-- I7: float (nullable = true)
 |-- I8: float (nullable = true)
 |-- I9: float (nullable = true)
 |-- I10: float (nullable = true)
 |-- I11: float (nullable = true)
 |-- I12: float (nullable = true)
 |-- I13: float (nullable = true)
 |-- C1: long (nullable = true)
 |-- C2: long (nullable = true)
 |-- C3: long (nullable = true)
 |-- C4: long (nullable = true)
 |-- C5: long (nullable = true)
 |-- C6: long (nullable = true)
 |-- C7: long (nullable = true)
 |-- C8: long (nullable = true)
 |-- C9: long (nullable = true)
 |-- C10: long (nullable = true)
 |-- C11: long (nullable = true)
 |-- C12: long (nullable = true)
 |-- C13: long (nullable = true)
 |-- C14: long (nullable = true)
 |-- C15: long (nullable = true)
 |-- C16: long (nullable = true)
 |-- C17: long (nullable = true)
 |-- C18: long (nullable = true)
 |-- C19: long (nullable = true)
 |-- C20: long (nullable = true)
 |-- C21: long (nullable = true)
 |-- C22: long (nullable = true)
 |-- C23: long (nullable = true)
 |-- C24: long (nullable = true)
 |-- C25: long (nullable = true)
 |-- C26: long (nullable = true)

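The schema above (label + 13 float dense features I1..I13 + 26 long sparse ids C1..C26, all nullable) can be sanity-checked per record before feeding OneEmbedding. A hedged stdlib sketch; the column names mirror `printSchema`, and the null check is illustrative, since nullable long columns with missing ids are a common cause of embedding-lookup errors:

```python
# Expected Criteo DLRM parquet columns, per the printSchema output above.
EXPECTED_COLUMNS = (
    ["label"]
    + [f"I{i}" for i in range(1, 14)]   # 13 dense float features
    + [f"C{i}" for i in range(1, 27)]   # 26 sparse id features
)

def missing_or_null(row: dict) -> list:
    """Return column names that are absent or None in one record."""
    return [c for c in EXPECTED_COLUMNS if row.get(c) is None]

row = {c: 0.0 for c in EXPECTED_COLUMNS}
row["C5"] = None                 # simulate a missing sparse id
print(missing_or_null(row))      # → ['C5']
```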
Maybe you can check your dataset, or attach the error message in detail. Thanks!
