Error 0063 - missing values are not allowed in subscripted assignments of data frames when replacing NA values

  • Question

  • I have code that pulls in data from an outside database listing the latitude and longitude coordinates for all US zip codes. The data set has missing values, so I add them manually for critical cities. The code works in RStudio, but when I run it in AML it fails with "missing values are not allowed in subscripted assignments of data frames".

    #Pull in latitude and longitude data for US cities
    US_LatLong <- maml.mapInputPort(1) # class: data.frame
    
    #load required packages
    library(dplyr)
    
    #Find the average or center point latitude and longitude for each city
    US_LatLongCenters <- US_LatLong %>%
        group_by(city, state) %>%
        summarise(Latitude = mean(latitude), Longitude = mean(longitude))
    
    US_LatLongCenters <- as.data.frame(US_LatLongCenters)
    
    #Manually add missing latitude and longitude values for key cities
    US_LatLongCenters[US_LatLongCenters$city == "Anaheim" &
                          US_LatLongCenters$state == "CA", 3:4] <- c(33.8341423, -117.9163969)

    What is it about my code that is problematic for AML?

    Thanks

    Thursday, January 19, 2017 12:22 AM

Answers

  • Hi AK,

    Thank you. Your advice helped me find my error. My column types were not the problem, but running str() showed that my data set had more rows than it should have.

    When I ran my code in RStudio, I called the data directly from a website. However, for AML I uploaded a copy as a data set. When I did that, blank rows were introduced.

    This part of my code works now.

    Thank you.

    Thursday, January 19, 2017 5:40 PM
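
For readers hitting the same error, here is a minimal sketch of why blank rows can trigger it and two possible ways around it. It assumes the blank rows arrive as NA in the city and state columns (they could also arrive as empty strings, which would not raise this error). The `centers` data frame and its values are a made-up stand-in for US_LatLongCenters, not the poster's data.

```r
# Hypothetical stand-in for the summarised data, including one blank (all-NA) row
centers <- data.frame(
  city      = c("Anaheim", "Austin", NA),
  state     = c("CA", "TX", NA),
  Latitude  = c(NA, 30.2672, NA),
  Longitude = c(NA, -97.7431, NA),
  stringsAsFactors = FALSE
)

# Comparing NA with a string yields NA, so the row index is TRUE, FALSE, NA
idx <- centers$city == "Anaheim" & centers$state == "CA"

# Assigning through a row index that contains NA raises the error;
# tryCatch captures the message instead of stopping the script
result <- tryCatch({
  centers[idx, 3:4] <- c(33.8341423, -117.9163969)
  "ok"
}, error = function(e) conditionMessage(e))
print(result)

# Option 1: drop rows whose key columns are missing, then assign as before
cleaned <- centers[!is.na(centers$city) & !is.na(centers$state), ]
cleaned[cleaned$city == "Anaheim" & cleaned$state == "CA", 3:4] <-
  c(33.8341423, -117.9163969)

# Option 2: keep all rows but use which(), which ignores NA in the index
patched <- centers
patched[which(idx), 3:4] <- c(33.8341423, -117.9163969)
```

Option 1 matches what resolved this thread (removing the blank rows). Option 2 leaves the blank row in place, so it may still affect later steps.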

All replies

  • Hey Logan,

    Generally this happens because the input data in AML isn't in the same format as what you've seen locally (assuming the dplyr version matches or is "close enough"?).

    A good way to check this is to call str(US_LatLong) in Azure ML right after the US_LatLong <- maml.mapInputPort(1) line. The output can then be seen in View Output Log after the module has run.

    If you do the same to your data locally, does each column type match exactly? Are you seeing missing values in AML?

    Regards,

    AK

    Thursday, January 19, 2017 1:02 AM
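
The check described above could look roughly like the sketch below. The small data frame is a hypothetical stand-in for the result of maml.mapInputPort(1), so the snippet runs outside Azure ML; the column names come from the question, the values are invented, and the NA counts are an extra check beyond str().

```r
# Hypothetical stand-in for: US_LatLong <- maml.mapInputPort(1)
US_LatLong <- data.frame(
  city      = c("Anaheim", "Anaheim", NA),
  state     = c("CA", "CA", NA),
  latitude  = c(33.84, 33.81, NA),
  longitude = c(-117.95, -117.92, NA),
  stringsAsFactors = FALSE
)

# Column types and row count, to compare against the local run
str(US_LatLong)

# Number of missing values in each column
na_per_column <- colSums(is.na(US_LatLong))
print(na_per_column)

# Rows where every column is missing, i.e. fully blank rows
blank_rows <- rowSums(is.na(US_LatLong)) == ncol(US_LatLong)
print(sum(blank_rows))
```

Running the same lines locally and in the module makes differences in row count or missing values easy to spot in the output log.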