[datatable-help] Generating pseudodata as in Elements of Statistical Learning
machinelearner
supipia at gmx.de
Sat Dec 9 19:00:15 CET 2017
Hi dear statisticians,
I am trying to implement a simulation from the book "Elements of Statistical
Learning" by Hastie et al.
My problem is that I don't understand how to generate the pseudodata the way
they did.
The book says: /For each of N = 100 samples, we generated p standard Gaussian
features X with pairwise correlation 0.2. The outcome Y was generated
according to a linear model/ Y = \sum_{j=1}^p X_j*b_j + sigma*Epsilon
(Sorry, don't know if a math mode exists here?)
/where Epsilon was generated from a standard Gaussian distribution. For
each dataset, the set of coefficients b_j were also generated from a
standard Gaussian distribution. We investigated p = 20, 100 and 1000. The
standard deviation sigma was chosen in each case so that the
signal-to-noise ratio Var[E(Y|X)]/sigma² equaled 2./
So, what I managed to generate so far are the Xs, the Epsilons and the bs.
I don't see how I am meant to generate Y without knowing sigma, yet according
to the description of sigma I seem to need Y in order to compute it.
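
To make the setup concrete, here is a minimal sketch of one possible reading
of the description. It is not taken from the book and may not be what the
authors did. It is written in plain Python (standard library only) rather
than R, and the function name simulate and its arguments are made up for
illustration. Two assumptions are built in: (1) the pairwise correlation of
0.2 is obtained with a shared-factor construction, X_j = sqrt(rho)*Z_0 +
sqrt(1-rho)*Z_j; (2) Var[E(Y|X)] is taken for the drawn coefficients b, so
it equals b' Sigma b = (1-rho)*sum(b_j^2) + rho*(sum(b_j))^2, which depends
only on b and rho and not on Y.

```python
import math
import random


def simulate(n=100, p=20, rho=0.2, snr=2.0, seed=0):
    """Hypothetical helper: one dataset under the assumptions stated above."""
    rng = random.Random(seed)

    # Coefficients b_j ~ N(0, 1), drawn once per dataset.
    b = [rng.gauss(0.0, 1.0) for _ in range(p)]

    # Features with unit variance and pairwise correlation rho
    # (assumed shared-factor construction, one common Z_0 per sample).
    X = []
    for _ in range(n):
        z0 = rng.gauss(0.0, 1.0)
        row = [math.sqrt(rho) * z0 + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
               for _ in range(p)]
        X.append(row)

    # Assumed reading: Var[E(Y|X)] = Var(sum_j X_j b_j) = b' Sigma b,
    # where Sigma has 1 on the diagonal and rho everywhere else.
    signal_var = (1.0 - rho) * sum(bj * bj for bj in b) + rho * sum(b) ** 2

    # Choose sigma so that signal_var / sigma^2 equals the target ratio.
    sigma = math.sqrt(signal_var / snr)

    # Outcome: Y = sum_j X_j b_j + sigma * Epsilon, Epsilon ~ N(0, 1).
    y = [sum(xj * bj for xj, bj in zip(row, b)) + sigma * rng.gauss(0.0, 1.0)
         for row in X]
    return X, b, sigma, y


X, b, sigma, y = simulate(n=100, p=20)
```

Under this reading sigma is fixed before any Y is drawn, so there would be
no circularity, but I am not sure this is the intended interpretation.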
Can someone please help me? What am I not understanding here?
Thanks and best regards!