Problem with Python Counter

  • Question

  • I am trying to create a list of Counter objects based on a column in a dataframe and then add this list as a new column. This code works fine on my local machine. However, when I run it in Azure ML, I get an error message stating that Counter is an unsupported type. Are Counters not supported in Azure? The error message is below the code.

    from sklearn.feature_extraction import DictVectorizer
    import pandas as pd
    import numpy as np
    from collections import Counter

    def azureml_main(dataframe1 = None, dataframe2 = None):
        d = list(dataframe1['Address'].str.split().apply(Counter))
        frame = pd.DataFrame({'corpus':d})
        # Return value must be of a sequence of pandas.DataFrame
        return frame

    Caught exception while executing function: Traceback (most recent call last):
       File "C:\server\invokepy.py", line 176, in batch
         rutils.RUtils.DataFrameToRFile(outlist[i], outfiles[i])
       File "C:\server\RReader\rutils.py", line 28, in DataFrameToRFile
         rwriter.write_attribute_list(attributes)
       File "C:\server\RReader\rwriter.py", line 59, in write_attribute_list
         self.write_object(value);
       File "C:\server\RReader\rwriter.py", line 121, in write_object
         write_function(flags, value.values())
       File "C:\server\RReader\rwriter.py", line 104, in write_objects
         self.write_object(value)
       File "C:\server\RReader\rwriter.py", line 120, in write_object
         write_function = self.get_writer_method(value.getType())
       File "C:\server\RReader\rwriter.py", line 47, in get_writer_method
         raise Exception("Type unsupported " + str(rObjectType))
     Exception: Type unsupported <class 'collections.Counter'>
     

    Wednesday, July 29, 2015 8:33 PM

Answers

  • Hello Troy,

    Yes, we unfortunately do not support returning arbitrary Python objects inside data frames at the moment. A possible workaround for your case would be to break apart the counter into two columns: one containing the keys as strings and the other the counts as integers. You can then reconstruct it in a downstream Python script module if necessary.
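
One way the two-column workaround might look is sketched below. This is only an illustration under some assumptions: the column names `doc_id`, `token`, and `count`, and the helper functions `counters_to_long_frame` and `long_frame_to_counters`, are hypothetical and not part of Azure ML. The extra `doc_id` column is added so that the downstream script can tell which row of the input each key belongs to. The sketch also assumes the `Address` column has no missing values.

```python
from collections import Counter

import pandas as pd


def counters_to_long_frame(counters):
    """Flatten Counters into one row per (doc_id, token, count).

    Only plain integers and strings end up in the frame, so no
    Counter objects have to be written out by the module.
    """
    rows = [
        (doc_id, str(token), int(count))
        for doc_id, counter in enumerate(counters)
        for token, count in counter.items()
    ]
    return pd.DataFrame(rows, columns=["doc_id", "token", "count"])


def long_frame_to_counters(frame, n_docs):
    """Rebuild the list of Counters, e.g. in a downstream script module.

    n_docs is passed explicitly because an empty Counter produces
    no rows in the long frame.
    """
    counters = [Counter() for _ in range(n_docs)]
    for doc_id, token, count in frame.itertuples(index=False):
        counters[int(doc_id)][token] += int(count)
    return counters


def azureml_main(dataframe1=None, dataframe2=None):
    # Same tokenization as in the question: split each address on whitespace.
    counters = dataframe1["Address"].str.split().apply(Counter)
    # Returned as a one-element tuple, following the
    # "sequence of pandas.DataFrame" comment in the question.
    return (counters_to_long_frame(counters),)
```

In a downstream module, something like `long_frame_to_counters(dataframe1, n_docs)` would then give back the original list of Counters, provided the number of input rows is known there.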

    Would this work for you? In the future, we will definitely consider pickling objects we do not recognize and reconstituting them for you automatically.

    Thank you,

    Sudarshan (AzureML)

    • Marked as answer by TroyWalters Thursday, July 30, 2015 2:24 PM
    Thursday, July 30, 2015 2:04 PM

All replies

  • Yes, I will look into the option that you suggest. Thank you for your help!
    Thursday, July 30, 2015 2:24 PM