How to use String as a Predicted Column

Question
-
Goal:
Make a prediction on PaymentType. The target (predicted) column is a string value.
Problem:
I get the error message "ArgumentOutOfRangeException: Training label column 'Label' type isn't suitable for regression: Text. Type must be R4 or R8. Parameter name: data".
What part of the source code am I missing?
Thank you!
Data:
https://github.com/dotnet/machinelearning/blob/master/test/data/taxi-fare-train.csv
https://github.com/dotnet/machinelearning/blob/master/test/data/taxi-fare-test.csv
Info:
I'm new to ML.NET.
Code:
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Models;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;
using System;
using System.Threading.Tasks;

namespace TaxiFarePrediction
{
    class Program
    {
        const string _datapath = @".\Data\taxi-fare-train.csv";
        const string _testdatapath = @".\Data\taxi-fare-test.csv";
        const string _modelpath = @".\Data\Model.zip";

        static async Task Main(string[] args)
        {
            PredictionModel<TaxiTrip, TaxiTripFarePrediction> model = await Train();
            Evaluate(model);

            var prediction = model.Predict(TestTrips.Trip1);
            Console.WriteLine("Predicted fare: {0}", prediction.PaymentType);
            Console.ReadLine();
        }

        static async Task<PredictionModel<TaxiTrip, TaxiTripFarePrediction>> Train()
        {
            // Create learning pipeline
            var pipeline = new LearningPipeline
            {
                // Load and transform data
                new TextLoader(_datapath).CreateFrom<TaxiTrip>(separator: ','),
                // Labeling
                new ColumnCopier(("PaymentType", "Label")),
                // Feature engineering
                new CategoricalOneHotVectorizer("VendorId", "RateCode", "PaymentType"),
                // Combine features in a single vector
                new ColumnConcatenator("Features", "VendorId", "RateCode", "PassengerCount", "TripDistance", "FareAmount"),
                // Add learning algorithm
                new FastTreeRegressor()
            };

            // Train the model
            PredictionModel<TaxiTrip, TaxiTripFarePrediction> model = pipeline.Train<TaxiTrip, TaxiTripFarePrediction>();

            // Save the model to a zip file
            await model.WriteAsync(_modelpath);
            return model;
        }

        private static void Evaluate(PredictionModel<TaxiTrip, TaxiTripFarePrediction> model)
        {
            // Load test data
            var testData = new TextLoader(_datapath).CreateFrom<TaxiTrip>(useHeader: true, separator: ',');

            // Evaluate test data
            var evaluator = new RegressionEvaluator();
            RegressionMetrics metrics = evaluator.Evaluate(model, testData);

            // Display regression evaluation metrics
            Console.WriteLine("Rms=" + metrics.Rms);
            Console.WriteLine("RSquared = " + metrics.RSquared);
        }
    }
}

namespace TaxiFarePrediction
{
    static class TestTrips
    {
        internal static readonly TaxiTrip Trip1 = new TaxiTrip
        {
            VendorId = "VTS",
            RateCode = "1",
            PassengerCount = 1,
            TripDistance = 10.33f,
            PaymentType = "",
            FareAmount = 7,
            //FareAmount = 0 // predict it. actual = 29.5
        };
    }
}

using Microsoft.ML.Runtime.Api;

namespace TaxiFarePrediction
{
    public class TaxiTrip
    {
        [Column("0")]
        public string VendorId;

        [Column("1")]
        public string RateCode;

        [Column("2")]
        public float PassengerCount;

        [Column("3")]
        public float TripTime;

        [Column("4")]
        public float TripDistance;

        [Column("5")]
        public string PaymentType;

        [Column("6")]
        public float FareAmount;
    }
}

using Microsoft.ML.Runtime.Api;

namespace TaxiFarePrediction
{
    public class TaxiTripFarePrediction
    {
        /*
        [ColumnName("Score")]
        public float FareAmount;
        */

        [ColumnName("Score")]
        public string PaymentType;
    }
}
All replies
-
Hi Sakura,
Are you following the quickstart from the ML.NET documentation? I ran the solution as described in the documentation and it works as expected.
It looks like the issue comes from ColumnCopier. Could you try using CopyColumns as described in the documentation? The error you are seeing is explained in this issue. If you keep ColumnCopier, you can try vectorizing the string values first and then run the same pipeline.
-Rohit
- Proposed as answer by Yutong Tie - MSFT (Microsoft employee, Moderator), Monday, November 4, 2019 5:19 AM
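
For readers who hit the same exception: the message says the regression trainer needs a numeric label (R4 or R8, i.e. float or double), while PaymentType is text. One possible way around it, not confirmed by anyone in this thread, is to treat the task as multiclass classification. The sketch below assumes the legacy LearningPipeline API that the question's using directives suggest (ML.NET 0.x). The class name PaymentTypePrediction, the choice of StochasticDualCoordinateAscentClassifier, and the useHeader: true flag on the training loader are illustrative assumptions, and the code has not been run against the poster's setup.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;

namespace TaxiFarePrediction
{
    // Same input schema as in the question.
    public class TaxiTrip
    {
        [Column("0")] public string VendorId;
        [Column("1")] public string RateCode;
        [Column("2")] public float PassengerCount;
        [Column("3")] public float TripTime;
        [Column("4")] public float TripDistance;
        [Column("5")] public string PaymentType;
        [Column("6")] public float FareAmount;
    }

    // Hypothetical output class: a classifier writes its answer to
    // "PredictedLabel" rather than to the regression "Score" column.
    public class PaymentTypePrediction
    {
        [ColumnName("PredictedLabel")]
        public string PaymentType;
    }

    class Program
    {
        const string DataPath = @".\Data\taxi-fare-train.csv";

        static void Main()
        {
            var pipeline = new LearningPipeline
            {
                // Assumption: the CSV has a header row, so skip it.
                new TextLoader(DataPath).CreateFrom<TaxiTrip>(useHeader: true, separator: ','),
                // Copy the string target into the "Label" column.
                new ColumnCopier(("PaymentType", "Label")),
                // Map each distinct label string to a numeric key.
                new Dictionarizer("Label"),
                // One-hot encode the string features only; PaymentType is the target.
                new CategoricalOneHotVectorizer("VendorId", "RateCode"),
                new ColumnConcatenator("Features", "VendorId", "RateCode",
                    "PassengerCount", "TripDistance", "FareAmount"),
                // A multiclass trainer instead of FastTreeRegressor.
                new StochasticDualCoordinateAscentClassifier(),
                // Map the predicted key back to the original string.
                new PredictedLabelColumnOriginalValueConverter
                {
                    PredictedLabelColumn = "PredictedLabel"
                }
            };

            PredictionModel<TaxiTrip, PaymentTypePrediction> model =
                pipeline.Train<TaxiTrip, PaymentTypePrediction>();

            var prediction = model.Predict(new TaxiTrip
            {
                VendorId = "VTS",
                RateCode = "1",
                PassengerCount = 1,
                TripDistance = 10.33f,
                PaymentType = "", // unknown, this is what we want to predict
                FareAmount = 7
            });

            Console.WriteLine("Predicted payment type: {0}", prediction.PaymentType);
        }
    }
}
```

With this shape, RegressionEvaluator and RegressionMetrics from the question would no longer apply; a classification evaluator would be the matching choice. Note also that PaymentType is left out of the Features vector so the model does not see the value it is asked to predict.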