Skip to content

Machine Learning Modeling

Overview

Machine learning analyzes data through algorithms to perform classification and regression predictions, such as stock price forecasts, spam email classification, credit scoring, medical diagnosis, etc. We can use DataInterpreter to produce such algorithmic code, model data, and complete prediction tasks.

Example: Wine Recognition

Task

We use the sklearn wine recognition dataset as an example to illustrate how to use DataInterpreter for machine learning modeling. This is a classic multi-class dataset with several features such as color, chemical composition, etc., based on which the wine category of samples can be predicted. We require DataInterpreter to fetch the data, split the training and validation sets, train the model, and make predictions on the validation set.

Code

bash
python examples/di/machine_learning.py
python examples/di/machine_learning.py

Execution Results


Example: Sales Forecast

Task

Let's use the Walmart Sales Forecast Dataset as an example to show how to use the DataInterpreter for sales forecasting modeling. There are four tables in the dataset: train.csv, test.csv, feature.csv, and store.csv, and we ask the DataInterpreter to fetch the data, stitch the data, slice the training and validation sets, train the model, and make predictions on the test set.

Code

bash
python examples/di/machine_learning.py --use_case sales_forecast
python examples/di/machine_learning.py --use_case sales_forecast

Execution Results

Mechanism Explained

DataInterpreter plans according to our requirements, forms several tasks, and executes them in sequence to fulfill the needs. The complete code generated by DataInterpreter will be stored in the data/output path.

Extension

For targeted processing of more complex machine learning tasks, please refer to Machine Learning with Tools.

Released under the MIT License.