I have a problem with the results of the clustering algorithm on my categorical data.
I have a big table of ones and zeros, and I am trying to cluster on 60 attributes.
I have tried many times to cluster into two categories, but I only ever get one cluster. What do you suggest? Which parameters of the algorithm can I change? Is there anything particular to consider for categorical data?
Thanks in advance for your help.
Manolis
First, open the Properties of your clustering model, then AlgorithmParameters -> Set algorithm parameter, and look at the CLUSTER_COUNT parameter. It defines the number of clusters in the model. The default is 10, but it may have been set to 1.
Second, explain what your data means and post a small sample so we can make a suggestion.
The clustering algorithm starts from a semi-random starting point (it takes the data distributions and randomly jitters them), using the number of clusters you choose (or the number of auto-detected clusters). If two clusters become coincident during the clustering operation, it merges them and creates a new randomized starting point. Of course, that new cluster can end up coincident with another cluster as well. In the end, any remaining coincident clusters are merged.
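To make the merging step concrete, here is a minimal Python sketch. It is only an illustration of the idea, not the product's actual implementation: the function name `merge_coincident`, the tolerance `tol`, and the representation of a cluster as a list of per-attribute probabilities of a 1 are all assumptions made for the example.

```python
def merge_coincident(centroids, tol=0.05):
    """Keep one centroid from each group of (nearly) identical centroids.

    Each centroid is a list of per-attribute probabilities of seeing a 1.
    Two centroids count as coincident when every attribute differs by
    less than tol (a hypothetical threshold chosen for this sketch).
    """
    kept = []
    for c in centroids:
        for m in kept:
            if max(abs(a - b) for a, b in zip(c, m)) < tol:
                break  # c coincides with a centroid we already kept
        else:
            kept.append(c)
    return kept


# Two requested clusters that converged to almost the same profile.
centroids = [
    [0.50, 0.48, 0.51],
    [0.51, 0.49, 0.50],
]
remaining = merge_coincident(centroids)
print(len(remaining))  # only one cluster is left after merging
```

If the two requested clusters keep drifting toward the same profile like this, the final model reports a single cluster even though two were asked for.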
I could imagine a scenario where the data is distributed such that a probabilistic model of two clusters always results in merging, yet a model of more clusters could produce an interesting model. So one thing to try would be to increase the number of clusters to see what you have.
Another option is to not use the default probabilistic clustering method and switch to K-means. You will not be able to access this parameter through the Table Analysis Tool "Detect Categories" button or the simple "Cluster" button on the DM ribbon. You need to use the advanced "Create Model Manually" option where you explicitly select an algorithm and there's a button for parameters. The appropriate parameter is "CLUSTERING_METHOD".
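For intuition about what a K-means style (hard assignment) clustering does with 0/1 data, here is a small self-contained Python sketch. It is not the algorithm the product runs: the function names, the toy rows, and the farthest-point seeding are assumptions picked so the example is deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance; for 0/1 rows this counts differing attributes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_binary(rows, k, iters=20):
    """Plain k-means on 0/1 rows with deterministic farthest-point seeding."""
    # Seed with the first row, then repeatedly add the row farthest
    # from all centroids chosen so far (an assumption of this sketch).
    centroids = [[float(v) for v in rows[0]]]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append([float(v) for v in far])

    labels = [0] * len(rows)
    for _ in range(iters):
        # Hard assignment: each row belongs to exactly one nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(r, centroids[j]))
            for r in rows
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data: six rows, six binary attributes, two visible groups.
rows = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
]
labels, centroids = kmeans_binary(rows, k=2)
print(labels)
```

Because every row is assigned to exactly one cluster, two clusters stay distinct as long as each keeps at least one member, which is why switching methods can give a different result from the probabilistic default.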
HTH
-Jamie