[datatable-help] Analysis of categorical variables with different levels
R_Exp64
deepaksharma64 at gmail.com
Sun Apr 8 07:15:46 CEST 2018
I have been given a data frame with 50+ fields. My job is to find which
fields depend on each other and which are independent. Most of the fields
are categorical variables with different numbers of levels. For example, the
table below has these level counts per field: Sex: 2, Color: 4, Geography: 3,
ID: 8. How can I run a regression or correlation analysis on them? We can see
that most Males have Color Red and that all of them have Geography North.
What analysis do I need to do to get this kind of insight? I tried a few
things but could not figure out the right way.
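
One common way to measure association between two categorical fields is
Cramér's V, which is derived from the chi-square statistic of their
contingency table. The sketch below is only an illustration in plain Python
(standard library only); the names ROWS, COLUMNS and cramers_v are made up
for this example, and it uses only the seven complete rows of the sample
table. With so few rows the values are indicative at best.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Assumption: only the seven complete rows of the sample table are used;
# rows with blank cells are left out. ID is omitted because it is unique per row.
COLUMNS = ("Sex", "Color", "Geography")
ROWS = [
    ("Male", "Red", "North"),
    ("Male", "Red", "North"),
    ("Male", "Red", "North"),
    ("Male", "Red", "North"),
    ("Male", "Blue", "North"),
    ("Female", "Green", "South"),
    ("Female", "Yellow", "North"),
]


def cramers_v(xs, ys):
    """Cramér's V for two categorical sequences (0 = none, 1 = perfect association)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed count for each (x, y) cell
    x_counts = Counter(xs)         # row totals
    y_counts = Counter(ys)         # column totals
    k = min(len(x_counts), len(y_counts)) - 1
    if k == 0:
        return 0.0                 # a constant field cannot be associated with anything
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Print the association for every pair of fields.
for i, j in combinations(range(len(COLUMNS)), 2):
    xs = [row[i] for row in ROWS]
    ys = [row[j] for row in ROWS]
    print(f"{COLUMNS[i]} vs {COLUMNS[j]}: V = {cramers_v(xs, ys):.3f}")
```

Cramér's V is symmetric, so it shows that two fields are related but not
which one could be used to fill in the other.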
The idea is to develop a system where a few of the fields are filled in
automatically based on the information entered in other fields. For example,
if we enter Sex as Male, Geography is automatically set to North. Thanks!
Sex     Color   Geography  ID
Male    Red     North      100
Male    Red     North      200
Male    Red     North      300
Male    Red     North      400
Male    Blue    North      500
Female  Green   South      600
Female  Green
Female          East       800
Female  Yellow  North      900
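
The auto-fill idea amounts to finding values of one field that always occur
with a single value of another field. A minimal sketch of that check in plain
Python follows; the names ROWS, FIELDS and value_rules are hypothetical, and
the rows with blank cells are left out as an assumption.

```python
from collections import defaultdict
from itertools import permutations

# Assumption: only the seven complete rows of the sample table are used.
FIELDS = ("Sex", "Color", "Geography")
ROWS = [
    {"Sex": "Male", "Color": "Red", "Geography": "North"},
    {"Sex": "Male", "Color": "Red", "Geography": "North"},
    {"Sex": "Male", "Color": "Red", "Geography": "North"},
    {"Sex": "Male", "Color": "Red", "Geography": "North"},
    {"Sex": "Male", "Color": "Blue", "Geography": "North"},
    {"Sex": "Female", "Color": "Green", "Geography": "South"},
    {"Sex": "Female", "Color": "Yellow", "Geography": "North"},
]


def value_rules(rows, src, dst):
    """Return {src value: dst value} for every src value seen with exactly one dst value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[src]].add(row[dst])
    return {
        value: next(iter(targets))
        for value, targets in seen.items()
        if len(targets) == 1
    }


# Check every ordered pair of fields, since a rule may hold in one direction only.
for src, dst in permutations(FIELDS, 2):
    rules = value_rules(ROWS, src, dst)
    if rules:
        print(f"{src} -> {dst}: {rules}")
```

On these rows the check finds Male -> North for Sex and Geography, but no
rule from Sex to Color, because the Male rows contain both Red and Blue. A
rule backed by a single row (such as South -> Female) is weak evidence, so a
real system would probably require a minimum number of supporting rows
before auto-filling.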
--
Sent from: http://r.789695.n4.nabble.com/datatable-help-f2315188.html