Hi Yvictor
Thanks for your TradingGym project, which is really interesting and helpful.
I'm a bit unclear about two things in trading_env.py.
(1) Will the line [next_index = self.step_st + self.obs_len + 1] in the self.step function skip a trading day?
Suppose obs_len = 10 and step_len = 5. The initial observation is self.obs_res = self.obs_features[0:10], which covers indices 0 through 9 because Python slicing excludes the end index. Then next_index = self.step_st + self.obs_len + 1 = 11, so where does the data at index 10 go? Is this a bug, or am I missing something?
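To make the question concrete, here is a minimal sketch of the index arithmetic, assuming step_st starts at 0. The `features` list is a hypothetical stand-in for self.obs_features, and whether index 10 is really dropped depends on how self.step uses next_index afterwards, which I have not reproduced here.

```python
# Hypothetical stand-in for self.obs_features: each value equals its row index.
features = list(range(30))

obs_len = 10
step_st = 0  # assumed starting offset

# Initial observation window: the slice end is exclusive, so it covers 0..9.
initial_window = features[step_st:step_st + obs_len]

# The expression from self.step that I am asking about.
next_index = step_st + obs_len + 1

# First row that the initial window does NOT cover.
first_uncovered = step_st + obs_len

print("last index in initial window:", initial_window[-1])
print("first uncovered index:", first_uncovered)
print("next_index:", next_index)
```

If next_index is used as the start of the next slice, the row at first_uncovered would never be observed. If it is used as an exclusive slice end instead, nothing is skipped.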
(2) Would it be better if the reward_ret value were a percentage return rather than an absolute value?
Something like:
self.reward_fluctuant = (self.price_current * self.position_share - self.transaction_details.iloc[-1]['price_mean'] * self.position_share - self.fee * abs_pos) / self.transaction_details.iloc[-1]['price_mean']
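As a standalone illustration of the idea, here is a rough sketch that compares an absolute reward with a percentage one. The function names are hypothetical and not part of trading_env.py, and I am assuming the percentage is taken relative to the entry cost (price_mean times the absolute position), which is one possible normalization and differs slightly from the one-line formula above.

```python
def absolute_reward(price_current, price_mean, position_share, fee):
    """Unrealized profit in currency units, net of a per-share fee."""
    abs_pos = abs(position_share)
    return (price_current - price_mean) * position_share - fee * abs_pos


def percent_reward(price_current, price_mean, position_share, fee):
    """Same profit, expressed as a fraction of the entry cost (assumed normalization)."""
    abs_pos = abs(position_share)
    if abs_pos == 0:
        return 0.0  # no open position, so no return to report
    pnl = absolute_reward(price_current, price_mean, position_share, fee)
    return pnl / (price_mean * abs_pos)


# Same price move with a small and a large position.
small_abs = absolute_reward(103.0, 100.0, 2, 0.5)
large_abs = absolute_reward(103.0, 100.0, 20, 0.5)
small_pct = percent_reward(103.0, 100.0, 2, 0.5)
large_pct = percent_reward(103.0, 100.0, 20, 0.5)
print(small_abs, large_abs)
print(small_pct, large_pct)
```

The absolute reward grows with position size, while the percentage reward stays the same for the same price move, which is why I think it may be easier for an agent to learn from.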
By the way, I noticed that every step returns a reward that is actually a stock value (the cumulative return) rather than a flow value (the interval return). I am not sure which is more reasonable. Would you mind explaining the choice? I would really appreciate it. Thanks for your code and time.
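Here is a small sketch of the distinction I mean, with made-up prices and hypothetical helper names (none of this is taken from trading_env.py). The stock-style reward reports the cumulative unrealized profit at every step, while the flow-style reward reports only the change since the previous step.

```python
def cumulative_rewards(prices, entry_price, shares, fee):
    """Stock-style: unrealized profit since entry, reported at every step."""
    cost = fee * abs(shares)
    return [(p - entry_price) * shares - cost for p in prices]


def interval_rewards(cumulative):
    """Flow-style: change in the cumulative value from one step to the next."""
    flows = [cumulative[0]]
    flows += [b - a for a, b in zip(cumulative, cumulative[1:])]
    return flows


prices = [101.0, 103.0, 102.0]  # made-up prices after entering at 100.0
stock = cumulative_rewards(prices, entry_price=100.0, shares=2, fee=0.5)
flow = interval_rewards(stock)

print("stock rewards:", stock, "sum:", sum(stock))
print("flow rewards:", flow, "sum:", sum(flow))
```

In this toy case the flow rewards add up to the final profit, while the sum of the stock rewards counts earlier gains more than once. That matters if the agent maximizes the sum of per-step rewards, which is why I am asking which one is intended.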
Have a good day.