[tm-commits] help with findAssocs()

xinrong lei xleiuiuc at gmail.com
Tue Mar 31 03:14:11 CEST 2009


I need help with findAssocs() in the ‘tm’ package and with installing Rgraphviz.

Thank you!



Question 1: How to use findAssocs()

I have a database of 100 surveys. Each survey has two open-ended questions, and the length of an answer is over 1000.
I saved the answers to each open-ended question in its own text file, both under C:\textfile.

File1.txt holds one open-ended question and contains 100 records.

File2.txt holds the other open-ended question and also contains 100 records.

I know the term “research” occurs 49 times in File1.txt, so I wanted to find out which other words are correlated with it. I used findAssocs() and got a huge number of associations, all equal to 1:

            academ             access          accompani             accord                ace
                 1                  1                  1                  1                  1
            achiev             acquir           acquisit                act              activ
                 1                  1                  1                  1                  1



I tried other terms, and every association value is 1, which is obviously not right.

Could an expert tell me what I did wrong?



My R code is:

R> my.path <- 'C:\\textfile'
R> library(tm)
R> my.corpus <- Corpus(DirSource(my.path), readerControl = list(reader = readPlain))
R> tdmO <- TermDocMatrix(my.corpus)
R> tdmO

An object of class “TermDocMatrix”
Slot "Data":
2 x 1426 sparse Matrix of class "dgCMatrix"
   [[ suppressing 1426 column names ‘000’, ‘0092’, ‘0093’ ... ]]

1 3 1 12 1 1 1 8 1 1 2 1 9 . 2 2 1 518 1 1 1 2 1 1 2 6 1 1 3 3 2 1 1 4 1 4 3 3 1 11 5 1 7 2 5 4 3 1 1
2 . .  . . . . . . . . . . 3 . . .   6 . . . . . . . . . . . . . . . 3 . . . . .  1 . 1 . . . . . . .

1 1 2 1 4 1 5 4 4 2 4 6 2 2 . 3 1 2 1 3 1 2 1 4 1 1 3 1 1 1 12 2 1 1 2 1 1 4 1 1 . 3 1 2 1 3 3 1 1 2 2
2 . . . . . . . 3 . . 3 . . 1 . . . . . . . . . . . . . . .  . . . . . . . . . . 1 . . 1 . . 2 . . . .

 …

R> findAssocs(tdmO, "research", 0.95)

            academ             access          accompani             accord                ace
                 1                  1                  1                  1                  1
            achiev             acquir           acquisit                act              activ
                 1                  1                  1                  1                  1
            activi              adapt                add              addit              adequ
                 1                  1                  1                  1                  1



……
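
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the printed matrix is 2 x 1426, which suggests that each file was read as a single document. A Pearson correlation computed from only two observations is always +1 or -1 (or undefined when a term has the same count in both), so every association that clears the 0.95 cutoff shows up as 1. The base R example below uses made-up counts and hypothetical names (m, records, terms). It does not call tm, and it assumes findAssocs() reports correlations between term count vectors across documents.

```r
# Toy counts, NOT the survey data: 2 documents (rows) x 4 terms (columns),
# mimicking the 2 x 1426 shape of the matrix printed above.
m <- matrix(c(12, 3, 1, 5,
               6, 1, 3, 2),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("File1", "File2"),
                            c("research", "academ", "access", "act")))

# Correlation of every term with "research" across the two documents.
# With only two observations per term, each value can only be +1 or -1.
two_doc_cor <- cor(m)[, "research"]
print(two_doc_cor)

# Same idea with one document per record (four made-up records).
records <- c("research on access to data",
             "teaching and research",
             "access to funding",
             "teaching load")
terms <- c("research", "access", "teaching")

# Record-by-term indicator matrix: 1 if the term occurs in the record.
per_record <- sapply(terms, function(term) {
  as.integer(grepl(term, records, fixed = TRUE))
})

# With more than two documents, correlations are no longer forced to +/-1.
per_record_cor <- cor(per_record)[, "research"]
print(per_record_cor)
```

If this is indeed the cause, building the corpus with one document per survey record (for example by reading each file line by line, assuming one record per line) should give findAssocs() enough observations to return values other than 1.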







Question 2:

I can’t load Rgraphviz in R.

I am using Windows XP Professional and R 2.8.1.

I followed the instructions at these links:

http://groups.google.com/group/r-help-archive/browse_thread/thread/413605edc81b3422/b7917083646d9cd2?lnk=gst&q=Rgraphviz#b7917083646d9cd2

and

https://stat.ethz.ch/pipermail/bioconductor/2008-June/022838.html



What I did:

1. Close down any open R sessions.

2. Download and install the Microsoft Visual C++ 2005 SP1 Redistributable Package:
http://www.microsoft.com/downloads/details.aspx?familyid=200B2FD9-AE1A-4A14-984D-389C36F85647&displaylang=en

3. Download and install Graphviz 2.16.1 from the archives. I also tried 2.18.1 and 2.22.2.

4. Check PATH to see how Graphviz was added. Graphviz 2.18 and later versions will automatically add

C:\Program Files\Graphviz2.16\Bin

to PATH.

5. Open R, then download and install Rgraphviz using:

R> source("http://bioconductor.org/biocLite.R")
R> biocLite("Rgraphviz")

I got no errors up to this point. At the next step,

R> library(Rgraphviz)

I got this error message:

Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared library 'C:/PROGRA~1/R/R-28~1.1/library/Rgraphviz/libs/Rgraphviz.dll':
  LoadLibrary failure:  The specified module could not be found.
Error : .onLoad failed in 'loadNamespace' for 'Rgraphviz'
Error: package/namespace load failed for 'Rgraphviz'



What else shall I do?
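
The LoadLibrary message usually means that Windows could not find a DLL that Rgraphviz.dll depends on. One cause consistent with that is the Graphviz Bin directory being absent from the PATH that R actually sees. The base R sketch below checks for this. graphviz_dirs_on_path is a hypothetical helper name, and matching on the substring "graphviz" assumes the default installation directory naming.

```r
# Hypothetical helper: return the PATH entries that look like a Graphviz
# directory. Defaults to the PATH and separator of the running R session.
graphviz_dirs_on_path <- function(path = Sys.getenv("PATH"),
                                  sep = .Platform$path.sep) {
  entries <- strsplit(path, sep, fixed = TRUE)[[1]]
  entries[grepl("graphviz", entries, ignore.case = TRUE)]
}

# Entries found in the current session (zero-length if none).
print(graphviz_dirs_on_path())

# Location of the Graphviz 'dot' executable as R resolves it;
# an empty string means it was not found on PATH.
print(Sys.which("dot"))
```

An empty result from either call would point to a PATH problem rather than a faulty Rgraphviz installation. A change to PATH only takes effect in R sessions started after the change.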



Thank you in advance!