Is there an equivalent to Pandas Dataframe apply functions in U-SQL?


  • Hi!

    Just wondering whether there is any way to apply a function to each row of a rowset inside U-SQL, or whether I'm limited to using a Python assembly for this. In that case, I've seen that I can use pandas and numpy as libraries, but is there a list available where I could see all of the permitted libraries?

    Also, I haven't seen many examples of using Python inside U-SQL. I completely understand, since it's a very young product, but a nudge in the right direction would be greatly appreciated.



    Tuesday, February 6, 2018 4:15 PM

All replies

  • Dear Joan

    There are a few ways to apply custom code to each row.

    There is the concept of a processor UDO, that takes 1 row and then allows you to return zero or one row as a result. The UDO has to be implemented using the UDO framework.
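
    For readers coming from pandas, the row-in, zero-or-one-row-out contract described above can be modelled in a few lines of plain Python. This is only a conceptual sketch of the semantics, not the U-SQL UDO API: the names `process` and `add_total` and the sample columns `qty` and `price` are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional

Row = Dict[str, object]


def process(rows: Iterable[Row],
            fn: Callable[[Row], Optional[Row]]) -> Iterator[Row]:
    """Call fn once per input row; emit its result, or nothing if it returns None."""
    for row in rows:
        out = fn(row)
        if out is not None:  # None models "zero rows" for this input row
            yield out


def add_total(row: Row) -> Optional[Row]:
    """Hypothetical per-row function: drop empty orders, add a computed column."""
    if row["qty"] <= 0:
        return None
    return {**row, "total": row["qty"] * row["price"]}


rows = [
    {"qty": 2, "price": 3.0},
    {"qty": 0, "price": 9.0},  # filtered out by add_total
    {"qty": 1, "price": 4.5},
]
result = list(process(rows, add_total))
```

    The closest pandas analogue would be `DataFrame.apply(..., axis=1)` followed by dropping the rows for which the function produced nothing.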

    There is the CROSS/OUTER APPLY expression that allows you to apply either EXPLODE (on an IEnumerable C# expression) or an applier UDO. That allows you to return zero to N rows per input row. Documentation is here.
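
    The zero-to-N behaviour of CROSS APPLY, and the way OUTER APPLY is generally understood to keep an input row (padded with nulls) when nothing is produced for it, can likewise be sketched in plain Python. Again, this models the semantics only and is not U-SQL syntax or the applier UDO interface; `cross_apply`, `outer_apply`, `explode_tags` and the `id`/`tags` columns are hypothetical names used for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Expander = Callable[[Row], Iterable[Row]]


def cross_apply(rows: Iterable[Row], expand: Expander) -> Iterator[Row]:
    """Emit one output row per item produced for each input row (possibly none)."""
    for row in rows:
        for extra in expand(row):
            yield {**row, **extra}


def outer_apply(rows: Iterable[Row], expand: Expander,
                null_columns: List[str]) -> Iterator[Row]:
    """Like cross_apply, but keep input rows that produce nothing, padded with None."""
    for row in rows:
        produced = False
        for extra in expand(row):
            produced = True
            yield {**row, **extra}
        if not produced:
            yield {**row, **{col: None for col in null_columns}}


def explode_tags(row: Row) -> List[Row]:
    """Hypothetical expander: split a comma-separated string into one row per tag."""
    return [{"tag": t} for t in str(row["tags"]).split(",") if t]


rows = [{"id": 1, "tags": "a,b"}, {"id": 2, "tags": ""}]
crossed = list(cross_apply(rows, explode_tags))
outered = list(outer_apply(rows, explode_tags, ["tag"]))
```

    In this sketch the row with `id` 2 disappears from `crossed` but survives in `outered` with `tag` set to `None`.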

    We are working on adding a Python processor and applier framework to our Python support. Since we are working on bringing Python in even more natively, we have been holding back on publishing many samples, so that we can provide better coverage for the upcoming capabilities.

    Michael Rys

    Tuesday, February 6, 2018 8:08 PM