Recognizing Handwritten Digits with HyperAI
If you are familiar with machine learning, especially the deep learning methods that have become popular in recent years, you have likely heard of the MNIST dataset. The dataset comes from the National Institute of Standards and Technology (NIST). The training set consists of handwritten digits from 250 different people, and the test set contains handwritten digits collected in the same proportions.
Here we will use this dataset to introduce image classification using PyTorch / TensorFlow in HyperAI. This will involve HyperAI's dataset binding, model training, and model usage.
PyTorch
Get the Code
Download the sample code from GitHub:
```bash
git clone https://github.com/signcl/openbayes-mnist-example.git
cd openbayes-mnist-example/pytorch
```

After switching to the downloaded code directory, you will see the following code structure:
```
.
├── openbayes.yaml
└── train.py
```

Where:
- `train.py` contains the code for training the machine learning model and performing inference with it
- `openbayes.yaml` contains the configuration for running a "Python Script Execution"; for more information, see the openbayes cli Configuration File
Create Your First Task
:::caution Note
The process introduced here uploads a code archive through the web interface, but this is no longer our recommended best practice for creating a "Python Script Execution". A better approach can be found in bayes Command Line Tool Getting Started.
:::
After preparing the dataset, click "New Compute Container" in the left navigation bar on the page to create a Python script execution task, using train.py to train a PyTorch handwritten digit recognition model.
On the "New Container" page, select Python Script Execution as the access method, and upload train.py from the directory. In the execution command field, enter the command:
```bash
python train.py
```

The parameter parsing in `train.py` uses Python's built-in `argparse` library. For more information, see argparse — Parser for command-line options, arguments and sub-commands.
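The actual argument definitions live in the repository's `train.py`; as a rough, hypothetical sketch of how a training script can expose options such as an epoch count and an output directory through `argparse` (the flag names here are illustrative, not taken from the repo):

```python
import argparse

def parse_args(argv=None):
    # Hypothetical flags for illustration; the real train.py defines its own.
    parser = argparse.ArgumentParser(description="Train an MNIST classifier")
    parser.add_argument("-e", "--epochs", type=int, default=10,
                        help="number of training epochs")
    parser.add_argument("-o", "--output", default="/openbayes/home",
                        help="directory where results are saved")
    return parser.parse_args(argv)

if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv for demonstration.
    args = parse_args(["-e", "5"])
    print(args.epochs, args.output)
```

Flags left off the command line fall back to their defaults, which is why `python train.py` alone runs with a sensible configuration.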
Select CPU for compute resources (if GPU types are available, GPU is recommended as it will be much faster), then select the "pytorch-1.8" image.
After submitting the task, wait about 15 seconds for the task to start executing. On the container page, you can see the execution status displayed in the logs.
TensorFlow
Get the Code
Download the sample code from GitHub:
```bash
git clone https://github.com/signcl/openbayes-mnist-example.git
cd openbayes-mnist-example/tensorflow
```

Switch to the downloaded code directory and you will see the following code structure:
```
.
├── inference.py
├── openbayes.yaml
└── train.py
```

Where:
- `train.py` and `inference.py` contain the code for training the machine learning model and performing inference with it
- `openbayes.yaml` contains the configuration for running a "Python Script Execution". For more information, see the openbayes cli configuration file
Create Your First Task
:::caution Note
This article introduces the process of uploading a code archive via the web interface. However, this is no longer our recommended best practice for creating a "Python Script Execution". For a better approach, please refer to Getting Started with the bayes Command Line Tool.
:::
After the dataset is prepared, click "New Compute Container" in the left navigation bar to create a Python script execution task, using train.py to train a TensorFlow handwriting recognition model.
On the "New Container" page, select "Python Script Execution" and enter the following command in the execution command field:
```bash
python train.py -o /openbayes/home -e 50 -m model.h5 -l ./tf_dir
```

Where:
- `train.py` is the Python file used for model training
- `-o /openbayes/home` specifies `/openbayes/home` as the output directory. In Python script execution mode, the system saves and uploads results from the `/openbayes/home` directory, so any results you want to keep should be written there. File changes in other directories, including even the default `.` directory, will not be saved
- `-e 50` specifies the number of training epochs
- `-m model.h5` specifies the filename of the saved model. Combined with `-o /openbayes/home`, the final trained model will be stored at `/openbayes/home/model.h5`
- `-l ./tf_dir` specifies `./tf_dir` as the log directory that HyperAI reads for TensorBoard
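The four flags above could be declared with `argparse` roughly as follows; this is an assumption about the script's internals that mirrors the command shown here rather than the repository code:

```python
import argparse

def build_parser():
    # Mirrors the flags in the command above; the real train.py may differ.
    parser = argparse.ArgumentParser(description="Train an MNIST model")
    parser.add_argument("-o", "--output-dir", default="/openbayes/home",
                        help="directory whose contents are saved after the run")
    parser.add_argument("-e", "--epochs", type=int, default=50,
                        help="number of training epochs")
    parser.add_argument("-m", "--model-name", default="model.h5",
                        help="filename for the saved model")
    parser.add_argument("-l", "--log-dir", default="./tf_dir",
                        help="TensorBoard log directory")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-o", "/openbayes/home", "-e", "50", "-m", "model.h5", "-l", "./tf_dir"])
    print(args.output_dir, args.epochs, args.model_name, args.log_dir)
```

Note that the model path the script would save to is the combination of two flags, e.g. `os.path.join(args.output_dir, args.model_name)`, which matches the `/openbayes/home/model.h5` location described above.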
Parameter parsing in `train.py` uses Python's built-in `argparse` library. For more information, see argparse — Parser for command-line options, arguments and sub-commands.
In the next step, select CPU for compute, choose "tensorflow-2.8" for the image, select "Python Script Execution" as the access method, and upload the train.py file from the directory.
After submission, wait about 15 seconds and the task will begin execution. The startup time usually depends on the size of the bound dataset: the larger the dataset, the longer the container takes to prepare. On the container page, you can see the execution status displayed in the logs.
Click "TensorBoard Visualization" to view the executing model information through TensorBoard.
View Execution Directory and Output Results
After execution is complete, click the "Working Directory" tab on the container page to see that the specified model has been created.
Use the Trained Model for Classification
After obtaining the trained model, we can use it to classify and predict data. On the completed execution page, click "Continue Execution" in the upper right corner to enter the new execution build page.
Here, bind the "Working Directory" from the previous execution to /openbayes/home. For more details about "Continue Execution", see Continue Execution for Compute Containers.
Modify the specified execution command:
```bash
python inference.py -m /openbayes/home/model.h5
```

Here `-m /openbayes/home/model.h5` specifies the path of the model to load. After submission, once execution completes you can see in the logs that 98% accuracy was achieved on the 10,000 test images.
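The accuracy reported in the logs is simply the fraction of test images whose predicted digit matches the ground-truth label; a minimal illustration of that calculation (with toy data, not the actual MNIST predictions):

```python
def accuracy(predictions, labels):
    # Fraction of predictions that match the ground-truth labels.
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)

# Toy example: 4 of 5 predicted digits are correct.
print(accuracy([7, 2, 1, 0, 4], [7, 2, 1, 0, 9]))  # → 0.8
```

On the real test set, the same computation runs over all 10,000 labels, giving the roughly 98% figure mentioned above.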