[Example] Add STAFNet Model for Air Quality Prediction #1070

Open
wants to merge 11 commits into
base: develop

Conversation

dylan-yin

PR types

PR changes

Describe


CLAassistant commented Feb 7, 2025

CLA assistant check
All committers have signed the CLA.

Collaborator

HydrogenSulfate left a comment

Please run pre-commit to format the entire project.

@@ -0,0 +1,136 @@
hydra:
Collaborator

Please add the following fields at the top of the config file:

defaults:
- ppsci_default
- TRAIN: train_default
- TRAIN/ema: ema_default
- TRAIN/swa: swa_default
- EVAL: eval_default
- INFER: infer_default
- hydra/job/config/override_dirname/exclude_keys: exclude_keys_default
- _self_

Comment on lines 8 to 16
config:
override_dirname:
exclude_keys:
- TRAIN.checkpoint_path
- TRAIN.pretrained_model_path
- EVAL.pretrained_model_path
- mode
- output_dir
- log_freq
Collaborator

This block can be deleted.

Comment on lines 31 to 52
STAFNet_DATA_PATH: "/data6/home/yinhang2021/workspace/SATFNet/data/2020-2023_new/train_data.pkl" #
DATASET:
label_keys: ["label"]
data_dir: "/data6/home/yinhang2021/workspace/SATFNet/data/2020-2023_new/train_data.pkl"
STAFNet_DATA_args: {
"data_dir": "/data6/home/yinhang2021/workspace/SATFNet/data/2020-2023_new/train_data.pkl",
"batch_size": 1,
"shuffle": True,
"num_workers": 0,
"training": True
}



# "data_dir": "data/2020-2023_new/train_data.pkl",
# "batch_size": 32,
# "shuffle": True,
# "num_workers": 0,
# "training": True
# model settings
# MODEL: #

Collaborator

I suggest switching to relative paths; starting them with ./data/... is enough.

Comment on lines 45 to 51
# "data_dir": "data/2020-2023_new/train_data.pkl",
# "batch_size": 32,
# "shuffle": True,
# "num_workers": 0,
# "training": True
# model settings
# MODEL: #
Collaborator

If this commented-out block is no longer needed, please delete it.

Comment on lines 82 to 113
# configs: {
# "task_name": "forecast",
# "output_attention": False,
# "seq_len": 72,
# "label_len": 24,
# "pred_len": 48,

# "aq_gat_node_features" : 7,
# "aq_gat_node_num": 35,

# "mete_gat_node_features" : 7,
# "mete_gat_node_num": 18,

# "gat_hidden_dim": 32,
# "gat_edge_dim": 3,
# "gat_embed_dim": 32,

# "e_layers": 1,
# "enc_in": 7,
# "dec_in": 7,
# "c_out": 7,
# "d_model": 16 ,
# "embed": "fixed",
# "freq": "t",
# "dropout": 0.05,
# "factor": 3,
# "n_heads": 4,

# "d_ff": 32 ,
# "num_kernels": 6,
# "top_k": 4
# }
Collaborator

Same as above: delete it if it is no longer needed.

Comment on lines 117 to 120




Collaborator

Avoid consecutive blank lines.

Comment on lines +122 to +127
# set random seed for reproducibility
ppsci.utils.misc.set_random_seed(42)
# set output directory
OUTPUT_DIR = "./output_example"
# initialize logger
logger.init_logger("ppsci", f"{OUTPUT_DIR}/train.log", "info")
Collaborator

This can be removed; output_dir is created automatically by ppsci.utils.callbacks.InitCallback:

logger.init_logger(
"ppsci",
osp.join(full_cfg.output_dir, f"{full_cfg.mode}.log")
if full_cfg.output_dir and full_cfg.mode not in ["export", "infer"]
else None,
full_cfg.log_level,
)
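The conditional in the snippet above decides where the log file goes. Here is a minimal sketch of that same rule as a standalone function, for illustration only: `resolve_log_path` is a hypothetical helper name, not part of ppsci, and the sample directory is made up.

```python
import os.path as osp
from typing import Optional


def resolve_log_path(output_dir: Optional[str], mode: str) -> Optional[str]:
    """Hypothetical helper mirroring the quoted conditional.

    A log file named after the run mode is placed inside output_dir,
    except when output_dir is empty or the mode is "export" or "infer",
    in which case no log file is used (None).
    """
    if output_dir and mode not in ["export", "infer"]:
        return osp.join(output_dir, f"{mode}.log")
    return None


# Sample values are assumptions, not taken from the PR.
print(resolve_log_path("./outputs_stafnet", "train"))
print(resolve_log_path("./outputs_stafnet", "infer"))
```

Because the framework already derives the log location from `output_dir` and `mode`, a hard-coded `OUTPUT_DIR` plus a manual `logger.init_logger` call in the example script would duplicate this logic.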

from typing import Tuple

class Inception_Block_V1(paddle.nn.Layer):

Collaborator

Please remove redundant blank lines, here and below.

ppsci/arch/stafnet.py (outdated, resolved)
ppsci/data/dataset/stafnet_dataset.py (outdated, resolved)
HydrogenSulfate changed the title from "Add STAFNet Model for Air Quality Prediction" to "[Example] Add STAFNet Model for Air Quality Prediction" on Feb 12, 2025
output_dir: ${hydra:run.dir}
log_freq: 20
# dataset setting
STAFNet_DATA_PATH: "/data6/home/yinhang2021/dataset/chongqing_1921/train_data.pkl" #
Collaborator

  1. Can this path be changed to a relative path, e.g. ./dataset/train_data.pkl? The same applies to the other path fields: please make them relative and remove the username.
  2. Should STAFNet_DATA_PATH be placed under the DATASET field?
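One way to catch machine-specific paths before a config is committed is to scan it for absolute path values. The sketch below assumes the config has already been loaded into a plain nested dict; `find_absolute_paths` is a hypothetical helper, not a ppsci or Hydra API, and the sample dict reuses paths quoted in this review.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, List


def find_absolute_paths(cfg: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return dotted keys whose string value is an absolute POSIX path.

    Hypothetical helper: walks nested dicts only and ignores lists.
    """
    found: List[str] = []
    for key, value in cfg.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections such as EVAL or DATASET.
            found.extend(find_absolute_paths(value, dotted))
        elif isinstance(value, str) and PurePosixPath(value).is_absolute():
            found.append(dotted)
    return found


cfg = {
    "STAFNet_DATA_PATH": "/data6/home/yinhang2021/dataset/chongqing_1921/train_data.pkl",
    "EVAL": {"eval_data_path": "./dataset/val_data.pkl"},
}
print(find_absolute_paths(cfg))
```

`PurePosixPath` is used so the check behaves the same on any operating system; a relative value such as `./dataset/val_data.pkl` is not flagged.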



MODEL:
input_keys: ["aq_train_data","mete_train_data",]
Collaborator

Suggested change
- input_keys: ["aq_train_data","mete_train_data",]
+ input_keys: [aq_train_data, mete_train_data]


MODEL:
input_keys: ["aq_train_data","mete_train_data",]
output_keys: ["label"]
Collaborator

Suggested change
- output_keys: ["label"]
+ output_keys: [label]

checkpoint_path: null

EVAL:
eval_data_path: "/data6/home/yinhang2021/dataset/chongqing_1921/val_data.pkl"
Collaborator

Suggested change
- eval_data_path: "/data6/home/yinhang2021/dataset/chongqing_1921/val_data.pkl"
+ eval_data_path: ./dataset/val_data.pkl

STAFNet_DATA_PATH: "/data6/home/yinhang2021/dataset/chongqing_1921/train_data.pkl" #
DATASET:
label_keys: ["label"]
data_dir: "/data6/home/yinhang2021/dataset/chongqing_1921/train_data.pkl"
Collaborator

  1. Why is data_dir the path of a specific file rather than a directory?
  2. Does this path duplicate STAFNet_DATA_PATH?

cfg.TRAIN.epochs,
ITERS_PER_EPOCH,
eval_during_train=cfg.TRAIN.eval_during_train,
seed=cfg.seed,
Collaborator

Suggested change
seed=cfg.seed,

Comment on lines +89 to +94
"""
Validate after training an epoch

:param epoch: Integer, current training epoch.
:return: A log that contains information about validation
"""
Collaborator

Suggested change
"""
Validate after training an epoch
:param epoch: Integer, current training epoch.
:return: A log that contains information about validation
"""

Comment on lines +106 to +110
"sampler": {
"name": "BatchSampler",
"drop_last": False,
"shuffle": True,
},
Collaborator

Suggested change
"sampler": {
"name": "BatchSampler",
"drop_last": False,
"shuffle": True,
},

Comment on lines +145 to +150
# set random seed for reproducibility
ppsci.utils.misc.set_random_seed(42)
# set output directory
OUTPUT_DIR = "./output_example"
# initialize logger
logger.init_logger("ppsci", f"{OUTPUT_DIR}/train.log", "info")
Collaborator

Suggested change
# set random seed for reproducibility
ppsci.utils.misc.set_random_seed(42)
# set output directory
OUTPUT_DIR = "./output_example"
# initialize logger
logger.init_logger("ppsci", f"{OUTPUT_DIR}/train.log", "info")

OUTPUT_DIR = "./output_example"
# initialize logger
logger.init_logger("ppsci", f"{OUTPUT_DIR}/train.log", "info")
multiprocessing.set_start_method("spawn")
Collaborator

What does this line do? Paddle's multi-GPU training shouldn't need it, right?
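For context on the line being questioned: in the Python standard library, `multiprocessing.set_start_method("spawn")` sets the process-wide default for how child processes are started, and "spawn" launches a fresh interpreter rather than forking the parent. Whether the STAFNet example actually needs it is not settled in this thread. If a spawn-based pool or worker is needed in one place only, `multiprocessing.get_context` is a narrower alternative, sketched here as an assumption about what the author might want rather than a recommendation from the reviewer:

```python
import multiprocessing as mp

# get_context returns a context object bound to the "spawn" start method
# without changing the process-wide default that set_start_method would set.
spawn_ctx = mp.get_context("spawn")

# The context exposes the same API as the multiprocessing module
# (Process, Pool, Queue, ...), all using the chosen start method.
print(spawn_ctx.get_start_method())
```

A context object keeps the choice local to the code that uses it, so it cannot conflict with a start method that a framework or another library selects elsewhere.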

3 participants