Machine Learning Modeling
Overview
Machine learning analyzes data through algorithms to perform classification and regression predictions, such as stock price forecasts, spam email classification, credit scoring, medical diagnosis, etc. We can use DataInterpreter
to produce such algorithmic code, model data, and complete prediction tasks.
Example: Wine Recognition
Task
We use the sklearn wine recognition dataset as an example to illustrate how to use DataInterpreter
for machine learning modeling. This is a classic multi-class dataset with several features such as color, chemical composition, etc., based on which the wine category of samples can be predicted. We require DataInterpreter
to fetch the data, split the training and validation sets, train the model, and make predictions on the validation set.
Code
python examples/di/machine_learning.py
python examples/di/machine_learning.py
Execution Results
Example: Sales Forecast
Task
Let's use the Walmart Sales Forecast Dataset as an example to show how to use the DataInterpreter
for sales forecasting modeling. There are four tables in the dataset: train.csv, test.csv, feature.csv, and store.csv, and we ask the DataInterpreter
to fetch the data, stitch the data, slice the training and validation sets, train the model, and make predictions on the test set.
Code
python examples/di/machine_learning.py --use_case sales_forecast
python examples/di/machine_learning.py --use_case sales_forecast
Execution Results
Mechanism Explained
DataInterpreter
plans according to our requirements, forms several tasks, and executes them in sequence to fulfill the needs. The complete code generated by DataInterpreter
will be stored in the data/output path.
Extension
For targeted processing of more complex machine learning tasks, please refer to Machine Learning with Tools.