Text mining: what's next after clustering?


  • Thanks for taking the time to view my question.

    I'm exploring text mining to make sense of some (legal) documents. I created a dictionary of stemmed words, built term vectors from it, and used them as a nested table with my training data as the case table. I then created clusters using scalable and non-scalable EM (since I believe text attributes can only be discrete).
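
    For illustration, the dictionary-and-vector step described above might look roughly like this in Python. This is only a sketch under assumptions, not the actual setup: the function names (`crude_stem`, `build_dictionary`, `to_vector`) and the two sample documents are hypothetical, and the suffix stripping is a crude stand-in for a real stemmer.

```python
import re


def crude_stem(word):
    # Hypothetical stand-in for a real stemmer: strips a few common suffixes,
    # keeping at least three characters of the word.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    # Lowercase, keep alphabetic runs only, then stem each token.
    return [crude_stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def build_dictionary(docs):
    # Map every distinct stemmed term to a column index (sorted for stability).
    terms = sorted({term for doc in docs for term in tokenize(doc)})
    return {term: index for index, term in enumerate(terms)}


def to_vector(doc, dictionary):
    # Discrete (present/absent) term vector, one entry per dictionary term.
    vector = [0] * len(dictionary)
    for term in tokenize(doc):
        if term in dictionary:
            vector[dictionary[term]] = 1
    return vector


# Invented sample documents, used only to exercise the sketch.
docs = [
    "The tenant signed the lease.",
    "The court dismissed the appeal.",
]
dictionary = build_dictionary(docs)
vectors = [to_vector(doc, dictionary) for doc in docs]
```

    Each document becomes one case, and its non-zero terms correspond to the rows that would go into the nested table.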

    I can see the population distribution of the variables among the clusters and their probabilities, but I cannot tell exactly what each cluster represents.
    I'm stuck at the point where I want to find out what makes each cluster different from the others. Since I cannot run k-means on discrete attributes, and the attributes are sparsely distributed, I don't know what my next step in mining should be.

    I'd appreciate any advice or suggestions that can help me make sense of it.



    Wednesday, April 08, 2009 7:23 PM