Using Text Analytics - Help Needed!

  • Question

  • Hey guys, I'm totally new to Azure ML and I am looking for some help with setting up a text analytics experiment.  Most of the analytics I've seen on text has been sentiment analysis but I have something else I would like to do.

    I've been given a bunch of call transcripts that ended in a positive manner.  I would like to feed these transcripts through ML to see if there are any key phrases versus calls that ended in a negative manner.  I'm not sure if this is possible or the best way to go about it.

    Assuming there is an API, what input format would it expect?  Currently the transcripts are just text files with a format like the one below.  This could be changed if needed to make it work better, but essentially we'd like to see if anything in particular that was said during positive calls stands out vs. a negative call.

    Caller: Hello?

    Comm: Hi Mr So and So...We're calling about such and such a product that you inquired about..

    <More conversation like above>

    Caller: Thanks and goodbye!

    Tuesday, September 1, 2015 7:34 PM

Answers

  • "I would like to feed these transcripts through ML to see if there are any key phrases versus calls that ended in a negative manner".

    OK.  I'm not completely clear on what you are trying to accomplish so please forgive me if I'm misunderstanding, but here are two directions you might try:

    1. It sounds like you are still trying to solve a sentiment analysis problem, since you are trying to determine key phrases for calls that end negatively.  Examples of positive call endings would be "thank you so much", and negative ones might be swear words, etc.  But it sounds like you only have examples of transcripts that ended positively?  To best build a model, it would be helpful to have examples of negative transcripts as well.  Is there any way that you can get that data?

    2. If you only have call transcripts that end in a positive manner, and you want to find key phrases in them that help the call stay positive, I suggest trying the K-means clustering algorithm.  Clustering helps find groupings of similar things and could potentially pull out patterns of positive behavior in the text. 
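
    For direction 1, once both positive and negative transcripts are available, one simple way to surface distinguishing phrases is to compare how often each word pair occurs in the two groups.  The sketch below is only an illustration in plain Python, not an Azure ML module: it scores bigrams with a smoothed log-odds ratio.  The function names and the toy transcripts are made up for the example.

```python
import math
import re
from collections import Counter

def ngrams(text, n=2):
    """Lowercase the text and return its word n-grams as strings."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def distinctive_phrases(positive_docs, negative_docs, n=2, top=5):
    """Rank n-grams by smoothed log-odds of appearing in positive vs negative calls."""
    pos = Counter(g for doc in positive_docs for g in ngrams(doc, n))
    neg = Counter(g for doc in negative_docs for g in ngrams(doc, n))
    vocab = set(pos) | set(neg)
    # Add-one smoothing so phrases unseen in one group do not cause log(0).
    pos_total = sum(pos.values()) + len(vocab)
    neg_total = sum(neg.values()) + len(vocab)
    scores = {
        g: math.log((pos[g] + 1) / pos_total) - math.log((neg[g] + 1) / neg_total)
        for g in vocab
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Made-up toy transcripts, for illustration only.
positive = ["thank you so much for calling", "great thank you so much"]
negative = ["do not call me again", "stop calling me please do not call"]

ranked = distinctive_phrases(positive, negative)
for phrase, score in ranked:
    print(f"{score:+.2f}  {phrase}")
```

    Phrases with the highest scores lean toward positive calls; sorting the same scores in ascending order would list the phrases that lean toward negative calls.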

    Regardless, I suggest you look at the sentiment analysis sample.  It shows you how to do text preprocessing and other things that might be useful in your situation.  There is a sample in the Azure Machine Learning gallery on Twitter Sentiment Analysis:

    http://gallery.azureml.net/Experiment/Binary-Classification-Twitter-sentiment-analysis-4?share=1
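
    As a rough idea of what text preprocessing could look like for transcripts in the "Speaker: text" layout shown in the question, here is a minimal Python sketch.  It is not taken from the gallery sample; the line format, the function names and the choice to keep only the caller's words are all assumptions made for illustration.

```python
import re

# Assumed line format, based on the sample in the question: "Speaker: utterance".
LINE_RE = re.compile(r"^\s*([^:]+):\s*(.*)$")

def parse_transcript(text):
    """Return a list of (speaker, utterance) pairs, skipping lines without a speaker label."""
    turns = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            turns.append((match.group(1).strip(), match.group(2).strip()))
    return turns

def tokenize(utterance):
    """Lowercase the utterance and keep only runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", utterance.lower())

sample = """Caller: Hello?

Comm: Hi Mr So and So...We're calling about such and such a product that you inquired about..

Caller: Thanks and goodbye!"""

turns = parse_transcript(sample)
# Keep only what the caller said, as a flat list of lowercase tokens.
caller_tokens = [tok for speaker, utt in turns if speaker == "Caller" for tok in tokenize(utt)]
print(caller_tokens)
```

    Splitting by speaker this way would also let you analyze the caller's and the agent's wording separately.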

    Friday, September 4, 2015 3:07 PM

All replies

  • You can also use a "bag of words" model.  Building it needs no labels, so it works as an unsupervised technique, and it will hopefully do the job.
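
    A minimal sketch of a bag of words in plain Python, assuming simple lowercase word tokens; the function name and the two example sentences are made up for illustration:

```python
import re
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, count vectors); word order within a document is discarded."""
    counts = [Counter(re.findall(r"[a-z']+", doc.lower())) for doc in docs]
    vocabulary = sorted(set().union(*counts))
    # One row per document, one column per vocabulary word.
    vectors = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, vectors

docs = ["Thanks and goodbye!", "Thanks, thanks a lot"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)  # ['a', 'and', 'goodbye', 'lot', 'thanks']
print(vectors)     # [[0, 1, 1, 0, 1], [1, 0, 0, 1, 2]]
```

    The resulting count vectors can then be passed to a clustering algorithm such as K-means, or to a classifier if labeled negative calls become available.
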
    Saturday, January 30, 2016 10:28 PM