R: Notes
Some collected notes for using R.
Warning: I don't know much R, so this might not be the best way to do it.
The SUR example and the new helper functions are currently in my branch.
Skipper's original file is in tools/R2nparray.
The files can be sourced in R to make the helper functions available (for example, with Windows path separators):
    source("E:\\path_to_repo\\tools\\R2nparray\\R\\R2nparray.R")
    source("E:\\path_to_repo\\tools\\topy.R")
Assuming we already made a call to systemfit and assigned the results to SUR:
    > names(SUR)
     [1] "eq"           "call"         "coefficients" "coefCov"      "residCovEst"  "residCov"     "method"       "rank"
     [9] "df.residual"  "iter"         "control"      "panelLike"
    > attributes(SUR)
    $names
     [1] "eq"           "call"         "coefficients" "coefCov"      "residCovEst"  "residCov"     "method"       "rank"
     [9] "df.residual"  "iter"         "control"      "panelLike"
    $class(SUR)
    [1] "systemfit"
    > cc = SUR$coefCov
    > is.numeric(cc)
    [1] TRUE
    > class(cc)
    [1] "matrix"
    > is.matrix(cc)
    [1] TRUE
    > class(SUR$eq)
    [1] "list"
A for loop that prints out all numeric attributes as Python code that creates numpy arrays:
- SUR[ [name]] or get(SUR, name) accesses the named attribute (?) of the object SUR. (I'm adding an extra space between [ [ to keep the wiki from converting it to a link. It needs to be written without the space to be valid R code.)
- mkarray is one of our helper functions in tools; it prints the data as np.array.
- It's a one-liner, so it was easier to work with in the R shell.
    > for (name in names(SUR)) {if (is.numeric(SUR[ [name]])) {mkarray(SUR[ [name]], name)}}; cat("\n")
    coefficients = np.array([0.9979991848420328,0.06886083327936214,...,0.0429020916196108])
    coefCov = np.array([157.3943509170185,-0.2165142902938106,...,0.002035467551712387]).reshape(15,15, order='F')
    residCovEst = np.array([176.3202565715889,-25.14782439226425,...,104.3078782568039]).reshape(5,5, order='F')
    residCov = np.array([180.2786473970981,3.703259980763286,...,111.6549965340746]).reshape(5,5, order='F')
    rank = np.array([15])
    df.residual = np.array([85])
    iter = np.array([1])
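The generated code reshapes with order='F' because R stores matrix values column by column. The following is a minimal sketch of that index mapping in plain Python. The function name from_column_major and the 2x3 example are hypothetical and only for illustration; they are not part of the helper functions. For a numpy array the same result comes from np.array(flat).reshape(nrow, ncol, order='F').

```python
def from_column_major(flat, nrow, ncol):
    """Rebuild an nrow x ncol matrix (as a list of rows) from column-major values."""
    if len(flat) != nrow * ncol:
        raise ValueError("flat must contain nrow * ncol values")
    # Element (i, j) sits at position i + j * nrow in column-major storage.
    return [[flat[i + j * nrow] for j in range(ncol)] for i in range(nrow)]


# Hypothetical example: R's matrix(1:6, nrow=2) is
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
# and its values are written out column by column.
flat = [1, 2, 3, 4, 5, 6]
matrix = from_column_major(flat, 2, 3)
print(matrix)  # [[1, 3, 5], [2, 4, 6]]
```

Reshaping the same values in the default row-major order would instead give the rows 1, 2, 3 and 4, 5, 6, which is not the matrix R holds.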
    > aa = list(covparams=SUR$coefCov, rank=SUR$rank)
    > R2nparray(aa, fname="temp3.py")
The content of the temp3.py module is then:
    ------------ temp3.py ----------
    import numpy as np
    covparams = np.array([157.3943509170185,-0.2165142902938106,...,0.002035467551712387]).reshape(15,15, order='F')
    rank = np.array([15])
    --------------------------------
f is a data frame with fitted values from the SUR model:
    > class(f)
    [1] "data.frame"
    > f
                   Chrysler  General.Electric    General.Motors          US.Steel      Westinghouse
    X1935 32.98546930516650 34.82254735597956 208.2453286635445 247.5131792455174 12.27690563625844
    X1936 61.83516118316266 66.98918588257341 420.2793547553419 300.2827737683187 30.52156144761057
    ...
Calling the helper function R2nparray on the data frame writes each data series of the data frame into a Python module:
    > R2nparray(f, fname="temp4.py")
    ------------ temp4.py ----------
    import numpy as np
    Chrysler = np.array([32.9854693051665,...,177.371048256085])
    General_Electric = np.array([34.82254735597956,...,195.5150518056073])
    General_Motors = np.array([208.2453286635445,...,1364.599470457204])
    US_Steel = np.array([247.5131792455174,3...,566.277048536767])
    Westinghouse = np.array([12.27690563625844,...,77.5688631853628])
    --------------------------------
We can also combine the two, the named list aa and the data frame f, and save them at the same time:
R2nparray(c(aa, f), fname="temp5.py")
The resulting python module contains the merged content
    >>> import temp5
    >>> dir(temp5)
    ['Chrysler', 'General_Electric', 'General_Motors', 'US_Steel', 'Westinghouse', '__builtins__', '__doc__', '__file__', '__name__', '__package__', 'covparams', 'np', 'rank']
    >>> temp5.covparams.shape
    (15, 15)
A new version saves everything that is not blacklisted, although currently mainly the numerical types are useful (TODO: not committed to statsmodels/tools yet, and no name cleaning):
    > cat_items(SUR, prefix="sur.", blacklist=c("eq", "control"))
    sur.call = '''systemfit(formula = formula, method = "SUR", data = panel)'''
    sur.coefficients = np.array([0.9979991848420328,...,0.0429020916196108]).reshape(15,1, order='F')
    sur.coefCov = np.array([157.3943509170185,...,0.002035467551712387]).reshape(15,15, order='F')
    sur.residCovEst = np.array([176.3202565715889,...,104.3078782568039]).reshape(5,5, order='F')
    sur.residCov = np.array([180.2786473970981,...,111.6549965340746]).reshape(5,5, order='F')
    sur.method = SUR
    sur.rank = 15
    sur.df.residual = 85
    sur.iter = 1
    sur.panelLike = '''TRUE'''
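The missing name cleaning matters because R names may contain dots: Python reads a line such as sur.df.residual = 85 as an assignment to the attribute residual of sur.df, not to a single name. Below is a minimal sketch of one possible cleaning rule, written in Python purely to illustrate the idea. The function clean_name is hypothetical and is not part of the helper functions, which are written in R; the rule only mirrors the dot-to-underscore replacement visible in the temp4.py output (General.Electric becomes General_Electric).

```python
import keyword
import re


def clean_name(r_name):
    """Turn an R name such as 'df.residual' into a valid Python identifier."""
    # Replace every character that is not a letter, digit or underscore.
    name = re.sub(r"\W", "_", r_name)
    # Python identifiers cannot start with a digit.
    if name[:1].isdigit():
        name = "_" + name
    # Avoid reserved words such as 'lambda'.
    if keyword.iskeyword(name):
        name += "_"
    return name


for r_name in ["df.residual", "General.Electric", "rank"]:
    print(r_name, "->", clean_name(r_name))
```

The cleaning would apply to the attribute name only, so that a prefix such as "sur." can still be put in front of the cleaned name.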
Our helper functions use cat to write the output. cat prints the strings to the standard output. The output can be redirected to a file using sink, for example:
    fname = "tmp_sur.py"
    append = TRUE
    sink(file=fname, append=append)
    mkarray(SUR$coefficients, "params")
    mkarray(SUR$coefCov, "cov_params")
    mkarray(SUR$residCovEst, "resid_cov_est")
    mkarray(SUR$residCov, "resid_cov")
    mkarray(SUR$df.residual, "df_resid")
    sink()
Calling sink() without arguments ends the redirection of the output. If an exception occurs in the code before sink() is called, the redirection stays active and the interpreter shell no longer prints any output. Typing sink() once or several times brings the standard output back to the shell.