R Script library openNLP causing issues

  • Question

  • Hello,

    I am trying to use the openNLP library in an R script within AML. When I don't register it, I get my output as expected.

    However, as soon as I register it, I get a blank output, and I cannot even see the columns of my output. The logs do not mention any problem, and I do not get any exception while running the R component.

    So here is how I register it:

    library(utils)
    library(NLP)
    library(rJava)
    library(openNLPdata)
    library(openNLP)
    library(tm)
    library(textcat)
    library(stringr)
    

    I looked up the dependencies of openNLP on the CRAN website, which is why I register utils, NLP, rJava and openNLPdata first.

    Note that I also register "tm", "textcat" and "stringr". Could they be conflicting?

    In my code, I do not even use openNLP but I get a blank output. If I use it to tokenize words for example, I also get a blank output, without any error.

    Do you have any idea what can be wrong?

    [EDIT] Here is a very simple script that illustrates the behavior.

    a <- "A"
    b <- "B"
    
    data.set <- data.frame(
                    a,
                    b,
                    stringsAsFactors = FALSE
                    )
     
    # Select data.frame to be sent to the output Dataset port
    maml.mapOutputPort("data.set"); 

    This works fine; I get the expected output.

    Now I just register openNLP and its dependencies:

    library(utils)
    library(NLP)
    library(rJava)
    library(openNLPdata)
    library(openNLP)
    
    a <- "A"
    b <- "B"
    
    data.set <- data.frame(
                    a,
                    b,
                    stringsAsFactors = FALSE
                    )
     
    # Select data.frame to be sent to the output Dataset port
    maml.mapOutputPort("data.set"); 

    And the output is no longer shown.

    Thanks

    Matt


    • Edited by MattFFM0702 Monday, November 2, 2015 6:19 PM Added a simple example + screenshots
    Monday, November 2, 2015 6:09 PM

Answers

  • Hey everyone (Jessica - are you just merging support sources? not quite sure what you're asking)

    The problem is most likely the JVM, which doesn't run in our sandboxing technology currently. We are, however, investigating a potential solution that would get this capability to you more quickly.

    Unhelpfully, we hide sandbox-related errors from users because they aren't very informative unless you're working on this implementation (and truthfully not very helpful there either). The error kills the R interpreter in an unhealthy way and we unfortunately aren't able to detect this failure cleanly outside of R. I'll file a defect to clean this up.

    This is what leads to the green checkmark suggesting things are OK. Apologies for the inconvenience and inability to provide a more concrete timeline on Java support.

    Regards,

    AK

    • Marked as answer by MattFFM0702 Wednesday, November 4, 2015 11:30 AM
    Tuesday, November 3, 2015 8:02 PM

All replies

  • From Maral Dadvar, @MaralDadvar via Twitter:

    Any ideas why the openNLP package does not function on Execute Rscript, AzureML? no error but no output @R_Programming @Azure  @AzureSupport

    Thanks!

    @AzureSupport

    Tuesday, November 3, 2015 5:23 PM
  • Hi AK,

    Thanks for the detailed explanation; it explains why we didn't get any exception.

    On our side this is a showstopper, but we will try to find alternatives to openNLP in R. Maybe in Python? Do you know of any substitute?

    [EDIT]: I found NLTK, which sounds very good and is fully in Python.

    Matt


    • Edited by MattFFM0702 Wednesday, November 4, 2015 5:10 PM Found alternative
    Wednesday, November 4, 2015 10:22 AM
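
For readers who want to stay in R, one possible workaround is to skip rJava and openNLP entirely, so that no JVM is started inside the sandbox. The sketch below is not from the thread: it does a rough word tokenization with base R only, and the sample text, the regular expression and the column names are illustrative assumptions. It is far cruder than openNLP's tokenizer and is only meant to show a JVM-free shape for the Execute R Script module.

```r
# Sketch: word tokenization in base R, with no rJava/openNLP (and so no JVM).
# The sample text and the regular expression are illustrative assumptions.
text <- "The JVM does not run in the sandbox. Base R still works."

# Extract runs of letters, digits and apostrophes as tokens.
tokens <- regmatches(text, gregexpr("[[:alnum:]']+", text))[[1]]

data.set <- data.frame(
  position = seq_along(tokens),
  token = tokens,
  stringsAsFactors = FALSE
)

# maml.mapOutputPort only exists inside the AML Execute R Script module,
# so the call is guarded to keep the sketch runnable elsewhere.
if (exists("maml.mapOutputPort")) {
  maml.mapOutputPort("data.set")
} else {
  print(data.set)
}
```

Whether this is good enough depends on the task: it splits on punctuation and whitespace only and does no sentence detection or tagging, which is where a full NLP library such as NLTK would come in.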
  • How can I train an OpenNLP model in C# code?

    Wednesday, May 31, 2017 9:46 AM