Model Training
Deep learning networks can learn data features from large amounts of data and can be applied to a variety of detection and segmentation tasks.
To use deep learning to solve specific problems, it is important to understand how to train a network model on known data. Therefore, we first explain how to train a deep-learning neural network. The content of this section is as follows:
Model training locally
Model training in the cloud
Making your own training dataset
This subsection describes how to perform model training on a personal PC. You will need the following hardware and software:
A laptop or desktop computer running Windows with an NVIDIA graphics card (if your computer does not have an NVIDIA graphics card, we recommend using the cloud for model training).
Please install Anaconda to manage your Python environment. Anaconda is available for Windows, macOS, and Linux; choose the version that corresponds to your system.
Install VS Code as your code editor.
Use Anaconda to create the environment we will use to train the model:
Open the Anaconda Prompt.
Enter the following commands to change to your working directory, activate the environment, and open VS Code (replace the working directory and environment name with your own).
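A minimal sketch of these commands; the environment name `yolov8`, the Python version, and the working directory are placeholders:

```bash
# Create the training environment (one-time step)
conda create -n yolov8 python=3.10

# Change to your working directory, activate the environment, and open VS Code
cd path/to/your/workdir
conda activate yolov8
code .
```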
Enter the following command in Terminal:
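A minimal sketch, assuming this step installs the Ultralytics package, which provides the `yolo` CLI used later:

```bash
# Install Ultralytics (YOLOv8) and its dependencies
pip install ultralytics
```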
When you're done, verify the installation.
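One way to check (the exact command is an assumption) is to ask PyTorch for its version and whether it can see your GPU; if it prints `False`, follow the Torch reinstall step below:

```bash
# Print the PyTorch version and whether CUDA (your GPU) is available
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```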
Uninstall the default build of Torch and install the GPU-enabled version.
Before installing, check the CUDA version supported by your graphics driver and install the PyTorch and torchvision builds that match it. For example, my CUDA version is 12.1 or higher, so I install the cu121 build.
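A sketch of the reinstall, assuming CUDA 12.1; adjust the index URL to match your driver's CUDA version:

```bash
# Check which CUDA version your driver supports
nvidia-smi

# Remove the CPU-only packages installed by default
pip uninstall -y torch torchvision

# Install the CUDA 12.1 builds
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```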
At this point, you have completed all the environment configuration required for model training, and you can happily train the model 😊. To train the model, execute the following command:
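For example (the epoch count here is illustrative):

```bash
yolo detect train model=yolov8n.pt data=coco128.yaml imgsz=192 epochs=100
```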
Here, `model` is the path to the model you want to train; the current Grove Vision V2 only supports deployment of `yolov8n.pt`, so please choose `yolov8n.pt`. `data` is the path to the dataset used for training; here we use a very small dataset as a test, and later in this section we explain how to create a training set of your own. In addition, we have implemented custom partial-label training for the COCO dataset, which lets you select specific labels for training according to your preferences and use cases, enabling more flexible inference. `imgsz` is the image size the model accepts as input, which must be 192. `epochs` is the number of passes the training makes over the dataset; in general, more epochs let the model fit the data better.
At the end of training, you can find your trained model at the save location shown in the command-line output (e.g., my model was saved at `runs\detect\train3`).
Unfortunately, if you don't have a computer with an NVIDIA graphics card, you can't train the model locally. However, you can train it in the cloud, and both training methods give you the same model results. We use Ultralytics for cloud model training. Below is the link to the Ultralytics HUB: https://hub.ultralytics.com. Open the link, and you will see an introduction and a tutorial on how to use it.
You need a GitHub, Google, or Apple account to use the Ultralytics HUB.
The Ultralytics HUB supports training on your own datasets. See the following section for how to make and upload a dataset.
Set the Advanced Model Configuration. If you use Google Colab for model training, it is recommended to set Epochs to 30: although Google Colab is free, each session has a time limit, and 30 epochs take about 2 hours, which is just within that limit. If you want to deploy the model on the Grove Vision V2 module, you also need to change the image size to 192, which matches our actual application scenario. In addition, a Google account is required to use Google Colab.
If you want to detect labels outside the open-source dataset, you need to make your own dataset. The dataset should include the images and the corresponding labels.
The following is the dataset format specified by YOLOv8.
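In YOLOv8's format, each image has a plain-text label file of the same name, with one line per object: a class index followed by the normalized box center, width, and height (all values between 0 and 1). A label file with two objects might look like this (values are illustrative; the full directory layout is shown further below):

```
0 0.481 0.512 0.240 0.310
1 0.702 0.334 0.120 0.180
```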
We use the Make Sense online data-labeling tool to produce the training set:
For information on how to use Make Sense, see the Make Sense website: https://www.makesense.ai/
After labeling the dataset, you need to merge the images and label data into a complete dataset. The dataset must follow this format:
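A typical layout looks like this (the dataset name `mydataset` is a placeholder):

```
mydataset/
├── mydataset.yaml
├── images/
│   ├── train/
│   │   └── img_001.jpg
│   └── val/
│       └── img_101.jpg
└── labels/
    ├── train/
    │   └── img_001.txt
    └── val/
        └── img_101.txt
```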
You need to copy the label data generated by Make Sense to the appropriate location. For example, if you labeled data for training, the training images should be located in `images/train` and the label data in `labels/train`. Additionally, you need to write a `dataset.yaml` file and place it in the dataset directory with the following contents. (Note: the `dataset.yaml` file name should match the dataset folder name; otherwise, you will encounter problems when uploading the dataset to Ultralytics.)
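A minimal sketch of the file; the class names here are placeholders for your own labels:

```yaml
# mydataset.yaml (paths are relative to the dataset root)
train: images/train
val: images/val
test: images/test  # optional

# class indices and names; these must match your Make Sense labels
names:
  0: cat
  1: dog
```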
The names need to correspond to the labels used in the Make Sense annotation. If you are training locally, it is recommended that you change the paths of `train`, `val`, and `test` to absolute paths; this is not required for training in the cloud. At this point, you are ready to train your model with the dataset you made: just change `data=coco128.yaml` to the path of your own `dataset.yaml`.
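For example (the path below is a placeholder):

```bash
yolo detect train model=yolov8n.pt data=path/to/mydataset.yaml imgsz=192 epochs=100
```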
If you are training in the cloud, you must upload your dataset to the Ultralytics HUB. The upload process is very simple:
With your dataset in the Ultralytics HUB, you can now use your own dataset for model training.