[datatable-help] melt spread

Carl Sutton suttoncarl at ymail.com
Wed Jan 18 02:27:39 CET 2017


Hi
This question is for information, not a coding problem.

Basic information:
The data table I am attempting to melt has 363 columns and 85,074 rows (246.5MB).  The first 14 are id variables and pose no problem.  One of the measure vars has a sequence of 2:9; the other 34 have a sequence of 1:10.  Think paste0("var_", 1:10).

What works:
It is a simple matter to melt these measure var columns using columns 15:363.  Thanks to an answer to a prior question, I can use tstrsplit to split the sequence number off the column heading.  So far, so good.
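
A minimal sketch of this step on a toy table, to make the shape of the data concrete. The column names (id1, id2, var_*, amt_*) and the "_" separator are made up for illustration; the real table has 14 id columns and 35 measure families.

```r
library(data.table)

# Toy stand-in for the real table (hypothetical names and values).
dt <- data.table(
  id1 = 1:2, id2 = c("a", "b"),
  var_1 = 1:2, var_2 = 3:4, var_3 = 5:6,
  amt_1 = 7:8, amt_2 = 9:10, amt_3 = 11:12
)

# Melt every non-id column into (variable, value) pairs.
long <- melt(dt, id.vars = c("id1", "id2"), variable.factor = FALSE)

# Split "var_1" into the base name "var" and the sequence number 1.
# type.convert = TRUE turns the sequence part into an integer column.
long[, c("variable", "seq") := tstrsplit(variable, "_", fixed = TRUE,
                                         type.convert = TRUE)]
```

After this, "variable" holds the column names without their sequence numbers, and "seq" holds the sequence numbers.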

The difficulty:
The problem arises when I attempt to spread the variable column, which contains the prior column names sans sequence numbers.  I have searched but not found a data.table function to spread the contents of "variable" into separate columns.  The tidyr "spread" command maxes out my available memory of 12GB.  I have attempted to use patterns to melt into separate columns, but that results in column names of valuex, not the original column names.  In searching the arguments for melt I have not seen one for preserving the original column names.  Perhaps I missed something?
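
For concreteness, a small sketch of the patterns attempt on a toy table. The names (id, var_*, amt_*) and the regular expressions are assumptions for illustration only, not the real column names.

```r
library(data.table)

# Toy table with two measure families (hypothetical names).
dt <- data.table(id = 1:2,
                 var_1 = 1:2, var_2 = 3:4,
                 amt_1 = 5:6, amt_2 = 7:8)

# One pattern per family melts each family into its own value column.
long <- melt(dt, id.vars = "id",
             measure.vars = patterns("^var_", "^amt_"))

# The value columns come back with generic names rather than "var" and "amt".
names(long)
```

The value columns are named by position (value1, value2), which is the naming problem described above.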

The solution:
Am I stuck with either
    a)  splitting my data.table such that tidyr does not max out available memory, or
    b)  using setnames on 35 columns to get viable column names?
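
A rough sketch of what option b) might look like on a toy table, assuming the value columns come back in the same order as the patterns. The base names in "bases" are hypothetical; with the real data there would be 35 of them.

```r
library(data.table)

# Toy table with two measure families (hypothetical names).
dt <- data.table(id = 1:2,
                 var_1 = 1:2, var_2 = 3:4,
                 amt_1 = 5:6, amt_2 = 7:8)

# Base names, listed in the same order as the patterns below.
bases <- c("var", "amt")

long <- melt(dt, id.vars = "id",
             measure.vars = patterns("^var_", "^amt_"))

# Rename value1, value2, ... to the base names by reference (no copy).
setnames(long, paste0("value", seq_along(bases)), bases)
```

Because setnames works by reference, the rename itself should cost almost nothing in memory, even on the full table.
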
Any and all thoughts are appreciated.  I have 20 of these datasets to munge and am starting on one of the smaller ones.  The goal is to write the code once and use it on all datasets.
Carl Sutton