Replacing specific values in dataset with Azure ML RRS feed

  • Question

  • Lately I've been testing Azure Machine Learning, and I like it. However, when I try to transform my dataset, there's one step I can't perform easily: replacing a specific value in a column with another one.

    The Missing Values Scrubber module lets me deal with undefined values, but in my case I need to change a specific value, or remove the rows where that value appears. I don't see which module meets this requirement.

    Do you have any suggestions?

    EDIT: I need to replace a string value, not a numeric one.
    • Edited by Réda Mattar Monday, October 6, 2014 12:40 PM Precisions
    Monday, October 6, 2014 8:15 AM

Answers

  • Hey Reda,

    You can also try using the Convert To Dataset module (yes, it's in a strange place right now and we're working on organizing this workflow :) ).

    Regards,

    AK

    • Marked as answer by Réda Mattar Monday, October 6, 2014 11:09 PM
    Monday, October 6, 2014 5:10 PM

All replies

  • Hello,

    Here are two possible ideas:

    1) To delete an entire row based on a value in a column:

    Use the Apply Math Operation module with the Compare category and the Equal To function. Specify the value you are looking for and the columns to search, and set OutputTo to ResultOnly. This should create a new column containing 1 wherever the value is found.

    You can then use the Split module: select the regular expression option and change the expression to use the name of the column created above:

    \"column name" ^start

    2) To replace specific values with other values, you can use the R module and follow the solution in this post:

    http://stackoverflow.com/questions/5824173/replace-a-value-in-a-data-frame-based-on-a-conditional-if-statement-in-r

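    # Convert the column to character first; a factor rejects values that are not existing levels
    junk$nm <- as.character(junk$nm)
    # Replace every "B" in the column with "b"
    junk$nm[junk$nm == "B"] <- "b"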
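
For reference, here is a minimal, self-contained sketch of both operations asked about in the question (replacing a string value and removing the rows that contain it), using the same `junk` and `nm` names as the linked post. The sample data, the `val` column, and the `replaced` and `filtered` names are made up for illustration; inside Azure ML's R module the data frame would come from the module's input instead, which is not shown here.

```r
# Hypothetical sample data standing in for the real dataset.
# stringsAsFactors = TRUE mimics a string column stored as a factor.
junk <- data.frame(
  nm  = c("A", "B", "C", "B"),
  val = c(1, 2, 3, 4),
  stringsAsFactors = TRUE
)

# Option 1: replace a specific string value.
# Convert the factor to character first, then overwrite the matching entries.
replaced <- junk
replaced$nm <- as.character(replaced$nm)
replaced$nm[replaced$nm == "B"] <- "b"

# Option 2: remove every row where the value appears.
# drop = FALSE keeps the result a data frame even if one column remains.
filtered <- junk[as.character(junk$nm) != "B", , drop = FALSE]

print(replaced)
print(filtered)
```

Note that if the column can contain missing values, the comparison in option 2 returns NA for those rows, so they may need to be handled separately (for example with `is.na`).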

    Monday, October 6, 2014 11:51 AM
  • Thanks for your reply!

    I forgot to mention it, but I need to replace a string value, so the Apply Math Operation module won't help. I'll try the second solution with R; it's a good opportunity for me to learn the language!

    I'll let you know as soon as I've tried it.

    Monday, October 6, 2014 12:40 PM
  • Thanks, it works like a charm!

    It's indeed an unusual module for such a feature; I wouldn't have guessed it.

    Monday, October 6, 2014 11:09 PM