Machine Learning Modeling
Overview
Machine learning uses algorithms to learn patterns from data and make classification or regression predictions, for example stock price forecasting, spam email filtering, credit scoring, and medical diagnosis. We can use DataInterpreter to generate the code for such tasks, model the data, and complete the predictions.
Example: Wine Recognition
Task
We use the sklearn wine recognition dataset to illustrate how to use DataInterpreter for machine learning modeling. It is a classic multi-class dataset whose features, such as color intensity and chemical composition, are used to predict the wine category of each sample. We ask DataInterpreter to fetch the data, split it into training and validation sets, train a model, and make predictions on the validation set.
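To make the task concrete, the following is a minimal hand-written sketch of the four steps (fetch, split, train, predict) using scikit-learn. It is not DataInterpreter's actual output: the 20% validation ratio, the random forest model, and the fixed random seed are assumptions chosen for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Fetch the data as a feature matrix X and a vector of class labels y.
X, y = load_wine(return_X_y=True)

# Hold out 20% for validation (assumed ratio), stratified by wine class.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Train a model; a random forest is one reasonable choice among many.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict on the validation set and report accuracy.
val_pred = model.predict(X_val)
val_accuracy = accuracy_score(y_val, val_pred)
print(f"Validation accuracy: {val_accuracy:.3f}")
```

The code DataInterpreter generates for the same requirement may differ in model choice, preprocessing, and evaluation metric.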
Code
python examples/di/machine_learning.py
Execution Results
Example: Sales Forecast
Task
Let's use the Walmart Sales Forecast Dataset as an example to show how to use DataInterpreter for sales forecasting modeling. The dataset contains four tables: train.csv, test.csv, feature.csv, and store.csv. We ask DataInterpreter to fetch the data, merge the tables, split the training and validation sets, train the model, and make predictions on the test set.
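The merge, split, train, and predict steps can be sketched by hand as follows. Since the real CSV files are not bundled here, the sketch builds small synthetic stand-ins for the four tables. The column names (Store, Date, Temperature, Size, Weekly_Sales), the time-ordered split, and the random forest regressor are all assumptions for illustration, not DataInterpreter's actual output.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
dates = pd.date_range("2012-01-06", periods=20, freq="W-FRI")

# Synthetic stand-ins for the store and feature tables (assumed columns).
stores = pd.DataFrame({"Store": [1, 2], "Size": [150_000, 90_000]})
features = pd.DataFrame(
    [(s, d, rng.uniform(30, 90)) for s in (1, 2) for d in dates],
    columns=["Store", "Date", "Temperature"],
)

# Synthetic sales: the first 16 weeks form the train table, the last 4 the test table.
sales = features[["Store", "Date"]].copy()
scale = sales["Store"].map({1: 1.5, 2: 1.0})
sales["Weekly_Sales"] = rng.normal(20_000, 2_000, len(sales)) * scale
train = sales[sales["Date"] < dates[16]]
test = sales.loc[sales["Date"] >= dates[16], ["Store", "Date"]]


def join_tables(df: pd.DataFrame) -> pd.DataFrame:
    """Attach per-week features and per-store attributes to a sales table."""
    return df.merge(features, on=["Store", "Date"], how="left").merge(
        stores, on="Store", how="left"
    )


train_full = join_tables(train)
test_full = join_tables(test)

# Time-ordered split: the last 4 training weeks become the validation set.
fit_part = train_full[train_full["Date"] < dates[12]]
val_part = train_full[train_full["Date"] >= dates[12]]

feature_cols = ["Store", "Temperature", "Size"]
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(fit_part[feature_cols], fit_part["Weekly_Sales"])

val_mae = mean_absolute_error(
    val_part["Weekly_Sales"], model.predict(val_part[feature_cols])
)
test_pred = model.predict(test_full[feature_cols])
print(f"Validation MAE: {val_mae:.1f}, test predictions: {len(test_pred)}")
```

Splitting by date rather than at random keeps future weeks out of the training data, which is usually the safer choice for forecasting problems.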
Code
python examples/di/machine_learning.py --use_case sales_forecast
Execution Results
Mechanism Explained
DataInterpreter turns our requirement into a plan made up of several tasks and executes them in sequence to fulfill it. The complete code generated by DataInterpreter is stored under the data/output path.
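The plan-then-execute idea can be illustrated with a toy sketch. This is not MetaGPT's internal implementation: the Task class, the run_plan function, and the shared context dictionary are hypothetical names introduced here only to show tasks running in order, each building on the results of the previous ones.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    """One step of a plan (hypothetical structure, for illustration only)."""
    task_id: str
    instruction: str
    action: Callable[[dict], None]  # reads and updates a shared context
    depends_on: list[str] = field(default_factory=list)


def run_plan(tasks: list[Task]) -> dict:
    """Execute tasks in the listed order, checking that dependencies ran first."""
    context: dict = {}
    done: set[str] = set()
    for task in tasks:
        missing = [dep for dep in task.depends_on if dep not in done]
        if missing:
            raise RuntimeError(f"Task {task.task_id} is missing results of {missing}")
        task.action(context)
        done.add(task.task_id)
    return context


# A toy plan mirroring the fetch / split / train / predict requirement.
plan = [
    Task("1", "Load the data",
         lambda ctx: ctx.update(data=[1.0, 2.0, 3.0, 4.0, 5.0])),
    Task("2", "Split into training and validation sets",
         lambda ctx: ctx.update(train=ctx["data"][:4], val=ctx["data"][4:]), ["1"]),
    Task("3", "Train a mean predictor",
         lambda ctx: ctx.update(model=sum(ctx["train"]) / len(ctx["train"])), ["2"]),
    Task("4", "Predict on the validation set",
         lambda ctx: ctx.update(pred=[ctx["model"]] * len(ctx["val"])), ["3"]),
]

result = run_plan(plan)
print(result["pred"])  # [2.5]
```

In DataInterpreter the tasks are produced by a language model and each task's code is generated and executed in turn; the sketch only captures the ordering and dependency aspect.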
Extension
For targeted processing of more complex machine learning tasks, please refer to Machine Learning with Tools.