I was running "Example Usage with Binary Treatment Synthetic Data" in the Causal Forest Notebook (https://github.com/py-why/EconML/blob/main/notebooks/Causal%20Forest%20and%20Orthogonal%20Random%20Forest%20Examples.ipynb). After running the following code, I found that the shape of Y is odd: it is a 1000*1000 matrix in which the 2nd through 999th columns are all the same. I believe we should reshape Y and use only the first column for modeling; however, that makes the ATE results totally different (0.97 vs 3.1). How should I understand the shape of Y? Should it be a 1000*1000 matrix or a 1000*1 matrix? Thank you!
# Imports used by this snippet
import numpy as np
from itertools import product
# DGP constants
np.random.seed(1234)
n = 1000
n_w = 30
support_size = 5
n_x = 1
# Outcome support
support_Y = np.random.choice(range(n_w), size=support_size, replace=False)
coefs_Y = np.random.uniform(0, 1, size=support_size)
epsilon_sample = lambda n: np.random.uniform(-1, 1, size=n)
# Treatment support
support_T = support_Y
coefs_T = np.random.uniform(0, 1, size=support_size)
eta_sample = lambda n: np.random.uniform(-1, 1, size=n)
# Generate controls, covariates, treatments and outcomes
W = np.random.normal(0, 1, size=(n, n_w))
X = np.random.uniform(0, 1, size=(n, n_x))
# Heterogeneous treatment effects
TE = np.array([exp_te(x_i) for x_i in X])
# Define treatment
log_odds = np.dot(W[:, support_T], coefs_T) + eta_sample(n)
T_sigmoid = 1/(1 + np.exp(-log_odds))
T = np.array([np.random.binomial(1, p) for p in T_sigmoid])
# Define the outcome
Y = TE * T + np.dot(W[:, support_Y], coefs_Y) + epsilon_sample(n)
# ORF parameters and test data
subsample_ratio = 0.4
X_test = np.array(list(product(np.arange(0, 1, 0.01), repeat=n_x)))
silulyu changed the title from "Shape of Y in Causal Forest notebook is weird" to "Why Shape of Y in Causal Forest notebook is 1000*1000" on Jun 5, 2024
I'm unable to reproduce this - I see (1000,) as the shape of Y. Is it possible that you've redefined exp_te to return something other than a scalar? What is the shape of TE?
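One way a non-scalar exp_te could produce a 1000*1000 Y is NumPy broadcasting: each x_i is a row of X with shape (1,), so a function that returns an array of that shape makes TE come out as (1000, 1), and multiplying it by T of shape (1000,) expands to (1000, 1000). The sketch below only illustrates that mechanism. exp_te_array is a hypothetical variant and T is a placeholder draw, neither is taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(0, 1, size=(n, 1))   # one covariate, shape (n, 1) as in the snippet
T = rng.binomial(1, 0.5, size=n)     # placeholder binary treatment, shape (n,)

# Hypothetical variant: x is a length-1 row, so the result is a length-1 array
def exp_te_array(x):
    return np.exp(2 * x)

TE = np.array([exp_te_array(x_i) for x_i in X])
product_shape = (TE * T).shape       # (n, 1) broadcast against (n,)

print("TE shape:", TE.shape)
print("TE * T shape:", product_shape)
```

Checking TE.shape right after it is built is usually the quickest way to tell whether this is what happened.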
Thanks so much for your reply. That is a great catch! I added the exp_te function to Section 2, identical to the one in Section 1, and the shape of Y is now (1000,).
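For reference, a minimal sketch of a scalar-valued exp_te with a shape check that would catch this early. The function body here is an assumption about what a scalar version looks like (it indexes x[0]) and may not match the notebook's Section 1 definition exactly. T is again a placeholder, and only the treatment term of Y is computed.

```python
import numpy as np

np.random.seed(1234)
n, n_x = 1000, 1
X = np.random.uniform(0, 1, size=(n, n_x))
T = np.random.binomial(1, 0.5, size=n)   # placeholder treatment, not the notebook's DGP

# Assumed definition: indexing x[0] returns a scalar rather than a length-1 array
def exp_te(x):
    return np.exp(2 * x[0])

TE = np.array([exp_te(x_i) for x_i in X])

# Guard: fail fast if TE is not one value per sample
assert TE.shape == (n,), f"expected ({n},), got {TE.shape}"

Y_treatment_term = TE * T                # elementwise product, stays 1-D
print(Y_treatment_term.shape)
```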