[Traminer-users] selecting the number of clusters
Rimantas Vosylis
rvosylis at live.com
Wed Jun 10 12:30:38 CEST 2015
Dear Traminer users,
I am trying to build a typology of sequences using cluster analysis with
optimal matching (OM) distances and the Ward algorithm.
I have trouble choosing the number of clusters. I use several empirical
indexes, but they don't help me much. I use the Calinski and Harabasz (CH)
index, but it peaks at the two-cluster solution and then goes down. I also
use the average silhouette width, but it gives me results similar to the CH index.
I also run a pseudo ANOVA to see which cluster solution explains the most
variance, but it tells me the opposite: the more clusters, the higher
the pseudo R2 gets. When I look at the various plots (e.g. seqdplot), I see
that the most meaningful solutions (I have several types of sequences) lie
somewhere between 4 and 6 clusters.
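To illustrate the pseudo R2 behaviour described above, here is a minimal
sketch in Python (standard library only). It is not TraMineR code: the
function names are hypothetical, the data are toy numbers rather than
sequences, and squared Euclidean distances stand in for OM dissimilarities.
It assumes the usual pairwise definition of the sum of squares (sum of
within-group dissimilarities divided by group size) and nested partitions,
as produced by cutting a hierarchical tree at increasing numbers of clusters.

```python
from itertools import combinations


def sum_of_squares(d, members):
    """Sum of squares of one group, computed from pairwise dissimilarities:
    sum over pairs i < j of d[i][j], divided by the group size."""
    n = len(members)
    if n == 0:
        return 0.0
    return sum(d[i][j] for i, j in combinations(members, 2)) / n


def pseudo_r2(d, labels):
    """Share of the total sum of squares explained by a partition."""
    total = sum_of_squares(d, list(range(len(labels))))
    groups = {}
    for i, g in enumerate(labels):
        groups.setdefault(g, []).append(i)
    within = sum(sum_of_squares(d, m) for m in groups.values())
    return 1.0 - within / total


# Toy data (assumption): eight points on a line, squared Euclidean distances.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0, 20.0, 21.0]
d = [[(a - b) ** 2 for b in points] for a in points]

# Hand-written nested partitions: each one splits a cluster of the previous.
partitions = {
    1: [0, 0, 0, 0, 0, 0, 0, 0],
    2: [0, 0, 0, 1, 1, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 1, 2, 2],
    4: [0, 0, 3, 1, 1, 1, 2, 2],
    8: [0, 1, 2, 3, 4, 5, 6, 7],
}

r2_by_k = {k: pseudo_r2(d, labels) for k, labels in partitions.items()}

for k in sorted(r2_by_k):
    print(k, round(r2_by_k[k], 4))
```

In this toy setting every split can only lower the within-cluster sum of
squares, so the pseudo R2 never decreases and reaches 1 when each case is
its own cluster. That suggests a rising pseudo R2 on its own cannot
identify the number of clusters; whether the same guarantee holds for a
given OM dissimilarity matrix is not shown here.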
Could you suggest which indexes worked best for you, matched your
expectations and theoretical knowledge, and could be used in my analysis?
Thank you in advance!
Sincerely,
Rimantas Vosylis
PhD student, lecturer
Institute of Psychology
Faculty of Social Technologies
Mykolas Romeris University
e-mail: rimantasv at mruni.eu
e-mail2: rvosylis at live.com