[datatable-help] Reading spss files

Steve Lianoglou lianoglou.steve at gene.com
Wed Jan 8 21:47:35 CET 2014


Hi,

2014/1/8 INEC Verónica Vaca <Veronica_Vaca at inec.gob.ec>:
> Thank you Steve
>
> I have tried with the foreign package but I have the problem with the
> memory, I can´t find the function setDT, I thought it would be very helpful,
>
> because the data frame takes all the  memory.
>
> I am very new at R, so I don´t know how to get that function if it is in the
> development version of the package.

setDT is only in the development version of data.table, which is
available via SVN (and perhaps as a compiled build) on r-forge:

https://r-forge.r-project.org/projects/datatable/

If the foreign package can't even read that data into memory for you,
though, then using the fancy setDT will be of no use to you, since it
requires the object be loaded as a data.frame already.
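
To make that concrete, here is a minimal sketch of what setDT does once
a data.frame is already in memory. The small `df` below is made up and
only stands in for whatever `foreign::read.spss(..., to.data.frame =
TRUE)` would return for your file; it assumes a data.table version that
includes setDT.

```r
library(data.table)

# Hypothetical stand-in for the data.frame that
# foreign::read.spss("yourfile.sav", to.data.frame = TRUE) would return.
df <- data.frame(id = 1:3, value = c(10, 20, 30))

# setDT converts the existing object to a data.table by reference,
# so no second copy of the data is made.
setDT(df)

is_dt <- is.data.table(df)
```

The point is that the read step has to succeed first; setDT only avoids
the extra copy that data.table(df) or as.data.table(df) would make.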

Is the SPSS object a table? (Are they all tables? I have no idea; I
have never used SPSS.)

Could you dump the data into a database from within SPSS then access
it that way from R? An SQLite database would be the first/easiest
choice.
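
As a rough sketch of the database route, assuming the DBI and RSQLite
packages are installed: the table name `survey` and its columns are
invented for illustration, and the in-memory database is filled from R
here only so that the example is self-contained. In your case the
SQLite file would be created from the SPSS side instead.

```r
library(DBI)

# In-memory SQLite database; with real data you would pass the path
# to the .sqlite file that was exported from SPSS.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical toy table standing in for the exported SPSS data.
dbWriteTable(con, "survey",
             data.frame(region = c("A", "A", "B"),
                        income = c(100, 200, 300)))

# The aggregation runs inside SQLite; only the small summary
# comes back into R's memory.
res <- dbGetQuery(con, "
  SELECT region, AVG(income) AS mean_income
  FROM survey
  GROUP BY region
  ORDER BY region")

dbDisconnect(con)
```

This way R only ever holds the query result, not the full table.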

* How big is the data (rows x columns)?
* How much RAM do you have?
* Are you on a 64-bit machine? Is R running in 64-bit mode? (The value
of `.Machine$sizeof.pointer` should be 8.)
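
The questions above can be checked quickly from within R. This is only
a back-of-the-envelope sketch: `estimate_gb` is a helper made up for
this example, and it assumes every cell is a numeric (8 bytes), which
will be off for factors and character columns.

```r
# TRUE when R itself is running as a 64-bit process.
is_64bit <- .Machine$sizeof.pointer == 8

# Hypothetical helper: rough size in GB of a rows x columns table,
# assuming every cell is stored as an 8-byte double.
estimate_gb <- function(n_rows, n_cols, bytes_per_cell = 8) {
  n_rows * n_cols * bytes_per_cell / 1024^3
}

# Example: 5 million rows by 200 columns is roughly 7.5 GB.
size_gb <- estimate_gb(5e6, 200)
```

Keep in mind that R often needs some multiple of that number while
reading and converting the data, so treat the estimate as a lower bound.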

Unfortunately, if the data can't fit into the amount of usable RAM you
have, then data.table will not be able to help you. Would getting
more RAM be an option for you?

If you can't fit the data into RAM but can dump it into a database,
and you still want to use R to do split/apply/combine computation
over the data, as described here:

http://www.jstatsoft.org/v40/i01/paper

then you might consider looking at the dplyr package that Hadley
Wickham is developing. I believe it supports (or will at some point
in the future) working with data stored in a database (among other
places) in order to perform split/apply/combine operations:

https://github.com/hadley/dplyr

HTH,
-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech
