[Traminer-users] selecting the number of clusters

Rimantas Vosylis rvosylis at live.com
Wed Jun 10 12:30:38 CEST 2015


Dear Traminer users,

 

I am trying to build a typology of sequences using cluster analysis based on
optimal matching (OM) distances and Ward's algorithm.

 

I am having trouble choosing the number of clusters. I have tried several
empirical indexes, but they do not help much. The Calinski and Harabasz (CH)
index peaks at the two-cluster solution and then goes down. The average
silhouette width gives similar results to the CH index. I also ran a pseudo
ANOVA to see which cluster solution explains the most variance, but it points
the opposite way: the more clusters, the higher the pseudo R2. When I look at
the various plots (e.g. seqdplot), I see that the most meaningful solutions
(I have several types of sequences) lie somewhere between 4 and 6 clusters.
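
To make the two behaviours described above concrete, here is a minimal sketch
in plain Python. It is not TraMineR code and does not use my data: the six
points, the two partitions, and all function names are made up for
illustration. The pseudo R2 is assumed to be one minus the within-group share
of a "sum of squares" defined as the sum of pairwise dissimilarities in a
group divided by the group size; whether the dissimilarities should be squared
is left aside here.

```python
def group_members(labels):
    """Map each cluster label to the list of case indices it contains."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    return groups


def average_silhouette_width(dist, labels):
    """Mean silhouette over all cases, from a full dissimilarity matrix."""
    groups = group_members(labels)
    total = 0.0
    for i, label in enumerate(labels):
        own = groups[label]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean dissimilarity to the other members of the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean dissimilarity to any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in groups.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


def sum_of_squares(dist, members):
    """Assumed definition: pairwise dissimilarities summed, divided by size."""
    m = len(members)
    pair_sum = sum(
        dist[members[p]][members[q]] for p in range(m) for q in range(p + 1, m)
    )
    return pair_sum / m


def pseudo_r2(dist, labels):
    """Share of the total discrepancy that the partition accounts for."""
    total = sum_of_squares(dist, list(range(len(labels))))
    within = sum(
        sum_of_squares(dist, members)
        for members in group_members(labels).values()
    )
    return 1.0 - within / total


# Toy data (hypothetical): six cases on a line, forming two obvious groups.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in positions] for p in positions]

# Two nested partitions, as a hierarchical method would produce them.
partitions = {2: [0, 0, 0, 1, 1, 1], 3: [0, 0, 1, 2, 2, 2]}

results = {
    k: (average_silhouette_width(dist, labels), pseudo_r2(dist, labels))
    for k, labels in partitions.items()
}
for k, (asw, r2) in sorted(results.items()):
    print(f"k={k}: ASW={asw:.3f}, pseudo R2={r2:.3f}")
```

In this toy case the average silhouette width is higher for two clusters than
for three, while the pseudo R2 is higher for three. That is the same pattern I
see in my data, which suggests the pseudo R2 alone may not be suited to
picking the number of clusters.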

 

Could you perhaps suggest which indexes have worked best for you, in the sense
of matching your expectations and theoretical knowledge, so that I could use
them in my analysis?

 

Thank you in advance!

 

 

Sincerely,

Rimantas Vosylis
PhD student, lecturer
Institute of Psychology
Faculty of Social Technologies
Mykolas Romeris University

e-mail: rimantasv at mruni.eu
e-mail2: rvosylis at live.com

 

 
