Problem with Database Loader

  • Question

  • The Microsoft ML DatabaseLoader reads data from a database in vector format. For example, if a database column has type "int", the delivered value is "Vector<int>", and so on. It is impossible to train on this data because values of this type are not supported. Feature conversion via OneHotEncoding doesn't help either.
    What can I do? How do I get rid of these vectors? Any suggestions?

    Here's a code snippet illustrating the problem:

            DatabaseSource dbSource = new DatabaseSource(SqlClientFactory.Instance, connectionString, sqlCommand);
            DatabaseLoader loader = mlContext.Data.CreateDatabaseLoader<ModelInput>();
            IDataView dataView = loader.Load(dbSource);
            var pipeline = mlContext.Transforms.CopyColumns(outputColumnName: "Label", inputColumnName: "PROZENT")
                .Append(mlContext.Transforms.Categorical.OneHotEncoding(outputColumnName: "NAMEEncoded", inputColumnName: "NAME"))
                .Append(mlContext.Transforms.Categorical.OneHotEncoding(outputColumnName: "ZEITEncoded", inputColumnName: "ZEIT"))
                .Append(mlContext.Transforms.Concatenate("Features", "NAMEEncoded", "ZEITEncoded"));
            var model = pipeline.Fit(dataView);

    Running this code leads to an error:

    Message = Schema mismatch for label column '': expected Single, got Vector
    Source = Microsoft.ML.Data
    at Microsoft.ML.Trainers.TrainerEstimatorBase`2.CheckLabelCompatible(Column labelCol)
    at Microsoft.ML.Trainers.TrainerEstimatorBase`2.CheckInputSchema(SchemaShape inputSchema)
    at Microsoft.ML.Trainers.TrainerEstimatorBase`2.GetOutputSchema(SchemaShape inputSchema)
    at Microsoft.ML.Data.EstimatorChain`1.GetOutputSchema(SchemaShape inputSchema)
    at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
    at HyperStockTrainingModel.Program.Train(MLContext mlContext, String dataPath) in Q:\Software\Apps\HyperStockTrainingModel\HyperStockTrainingModel\Program.cs:line 42
    at HyperStockTrainingModel.Program.Main(String[] args) in Q:\Software\Apps\HyperStockTrainingModel\HyperStockTrainingModel\Program.cs:line 23

    Saturday, October 26, 2019 12:54 PM

All replies

  • Hello,

    DatabaseLoader is still in preview, so this could be an issue worth reporting to the ML.NET repository on GitHub.

    The documentation for this feature says that numerical data that is not of type Real must be converted to Real before it is loaded. The documentation includes an example of this. Could you please try the same approach and check whether it resolves the error?
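
    A minimal sketch of that advice applied to the columns from the question (this is not the documentation's own example, and it has not been run against the poster's database). The table name "MyTable", the shape of the ModelInput class, and the assumption that NAME and ZEIT are text columns are all hypothetical. The query casts the numeric label to REAL, and the loop prints the loaded schema so you can see whether each column arrives as a scalar or as a vector.

```csharp
using System;
using System.Data.SqlClient;
using Microsoft.ML;
using Microsoft.ML.Data;

// Assumed input class: one scalar property per selected column,
// with the numeric label typed as float (Single).
public class ModelInput
{
    public float PROZENT { get; set; }
    public string NAME { get; set; }   // assumed to be a text column
    public string ZEIT { get; set; }   // assumed to be a text column
}

public static class LoaderSketch
{
    public static IDataView LoadAsScalars(MLContext mlContext, string connectionString)
    {
        // "MyTable" is a hypothetical table name. CAST(... AS REAL) asks the
        // database to return the numeric column as a single-precision value.
        string sqlCommand =
            "SELECT CAST(PROZENT AS REAL) AS PROZENT, NAME, ZEIT FROM MyTable";

        DatabaseSource dbSource =
            new DatabaseSource(SqlClientFactory.Instance, connectionString, sqlCommand);
        DatabaseLoader loader = mlContext.Data.CreateDatabaseLoader<ModelInput>();
        IDataView dataView = loader.Load(dbSource);

        // Print each column's name and type to check whether it is a scalar or a vector.
        foreach (var column in dataView.Schema)
        {
            Console.WriteLine($"{column.Name}: {column.Type}");
        }

        return dataView;
    }
}
```

    If the printed schema still shows vector types after the cast, that would be useful detail to include in a GitHub issue.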


    Tuesday, October 29, 2019 11:38 AM