Vision AI Module by Seeed

Grove Vision AI V2 is an embedded vision module based on the Arm Cortex-M55 and the Ethos-U55. The Ethos-U55 delivers 64 to 512 GOP/s of compute, which meets the growing demand for deploying machine learning models to the edge for inference.

Use the online site to download your favorite model to Grove Vision AI

The easiest way to use the module is to download a model directly to Grove Vision AI V2 with the web-based tool (SenseCraft).

SenseCraft AI

Please make sure that the CH343 driver is installed before using the tool. If it is not, this page will help you install it automatically, but you need to reboot your computer after installation.

When you open this URL, you will see the above page. At this point you need to connect your device.

There are two options in the drop-down box. Please select Grove Vision AI V2.

Click on the appropriate COM port and connect to it. I chose COM3 in this picture.

On this website, SenseCraft shows a number of efficient and interesting models that can be deployed directly to Grove Vision AI V2.

Click on the model shown in the image (e.g. Face Detection in the figure) and then click Send. The web page shows the following message, which means that the model is being burned. Be patient; this takes a few minutes.

When the model has been burned successfully, the real-time inference results and images can be viewed in the preview on the right side. Two hyperparameters can be adjusted. The first is CONFIDENCE: only inference results with a score above this threshold are displayed. The second is IoU, the overlap threshold used when removing duplicate detections: if it is set high, multiple prediction boxes for the same real object will be displayed. It is recommended to leave both parameters at their default values.
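To make the effect of the two thresholds concrete, here is a minimal Python sketch of confidence filtering followed by greedy non-maximum suppression. It is a generic illustration, not the module's actual post-processing, which is not described on this page; the function names, the corner-based box format and the sample detections are all assumptions made for the example.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, confidence, iou_threshold):
    """dets is a list of (box, score) pairs; returns the pairs that survive."""
    # Step 1: drop every detection whose score is below the confidence threshold.
    candidates = sorted(
        (d for d in dets if d[1] >= confidence),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    # Step 2: greedy NMS. A box is kept only if it does not overlap an
    # already kept, higher-scoring box by more than iou_threshold.
    for box, score in candidates:
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


# Hypothetical detections: two overlapping boxes on one object, one weak box.
dets = [
    ((10, 10, 110, 110), 0.90),
    ((20, 20, 120, 120), 0.80),
    ((200, 200, 260, 260), 0.30),
]
strict = filter_detections(dets, confidence=0.5, iou_threshold=0.45)
loose = filter_detections(dets, confidence=0.5, iou_threshold=0.90)
print(len(strict), len(loose))
```

With the lower IoU threshold the second overlapping box is suppressed, while the higher threshold lets both boxes through, which matches the behaviour described above. The weak box is removed by the confidence threshold in both cases.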

Using Grove Vision AI to communicate with the Petoi robot dog

You can use the Arduino IDE to modify our open source program to use Grove Vision AI. Our program integrates target tracking with Grove Vision AI V2, and you can enable this feature by simply modifying the code. You can also develop richer functionality with the API of the SSCMA library.

Make sure the following line is not commented out in OpenCatEsp32.ino:

#define CAMERA

In camera.h, comment out the

#define MU_CAMERA

statement and activate the corresponding statement for Grove Vision AI V2, then recompile and upload the code to the board.

Note that you also need to install the required library as follows.

  1. Download the latest version of the Seeed_Arduino_SSCMA library from the GitHub repository.

  2. Add the library to your Arduino IDE by selecting Sketch > Include Library > Add .ZIP Library and choosing the downloaded file.

Or you can

You can write your own functional code modeled after our Grove Vision AI V2 code integrated in OpenCatEsp32.

The following is a brief description of how to use the common API of the SSCMA library.

bool begin(TwoWire *wire = &Wire, uint16_t address = I2C_ADDRESS, uint32_t wait_delay = 2, uint32_t clock = 400000) -- Initialize the Grove Vision AI V2.
Input parameters:
TwoWire *wire -- This pointer points to a TwoWire object, typically used to communicate with I2C devices.
uint16_t address -- This is the address of the I2C device and is used to identify the specific device connected to the I2C bus.
uint32_t wait_delay -- The delay (in milliseconds) to wait for a response before sending a command.
uint32_t clock -- This is the clock rate (in Hz) of the I2C bus.
Returns: true if initialization succeeded, false if it failed.

int invoke(int times = 1, bool filter = 0, bool show = 0) -- Sends an INVOKE command to Grove Vision AI V2 so that it starts running the model for inference and recognition.
Input parameters:
int times -- The number of times to invoke.
bool filter -- If set, an event reply is sent only when the latest result differs from the previous one (by geometry and score comparison).
bool show -- If set, the event reply also carries the image data along with the inference results.
Returns: CMD_OK or CMD_ETIMEDOUT. Returns CMD_OK if the model was successfully invoked, otherwise CMD_ETIMEDOUT.
Note: For more information about Grove Vision AI's protocol definitions, please read the Protocol Documentation.

int available() -- Checks how many bytes of data are available for reading from the device connected through the I2C interface.
Input parameters: None.
Returns: Number of bytes of data that can be read from the device.

int read(char *data, int length) -- Reads data from the Grove Vision AI via the I2C interface and stores it in the array pointed to by data.
Input parameters:
char *data -- The array used to store the data.
int length -- The length of the data to be read.
Returns: The length of the data read.

int write(const char *data, int length) -- Writes data to the specified device via the I2C interface.
Input parameters:
const char *data -- The content of the data to be written.
int length -- The length of the data to be written.
Returns: The length of the data to be written.

std::vector<boxes_t> &boxes() { return _boxes; }
typedef struct {
    uint16_t x;      // Horizontal coordinates of the centre of the box
    uint16_t y;      // Vertical coordinates of the centre of the box
    uint16_t w;      // Width of the identification box
    uint16_t h;      // Height of the identification box
    uint8_t score;   // Confidence in identifying as target
    uint8_t target;  // Target
} boxes_t;
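Because x and y in boxes_t describe the centre of the box rather than a corner, a small conversion is usually needed before drawing the box or steering towards it. The Python sketch below mirrors the struct fields to show that arithmetic. The Box class, the helper names and the 240 x 240 frame size are hypothetical and are not part of the SSCMA library.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Python mirror of the boxes_t fields, for illustration only."""
    x: int       # horizontal coordinate of the box centre
    y: int       # vertical coordinate of the box centre
    w: int       # box width
    h: int       # box height
    score: int   # confidence reported for the target
    target: int  # class index of the target


def corners(box):
    """Convert a centre-based box to (left, top, right, bottom)."""
    half_w, half_h = box.w // 2, box.h // 2
    return (box.x - half_w, box.y - half_h, box.x + half_w, box.y + half_h)


def offset_from_center(box, frame_w, frame_h):
    """Signed distance of the box centre from the frame centre.

    A tracking loop could use this as its error signal: a negative first
    value means the target is left of centre, a negative second value
    means it is above centre.
    """
    return (box.x - frame_w // 2, box.y - frame_h // 2)


b = Box(x=120, y=80, w=40, h=60, score=87, target=0)
print(corners(b))                       # (100, 50, 140, 110)
print(offset_from_center(b, 240, 240))  # (0, -40)
```

The same arithmetic carries over directly to Arduino code that iterates over the vector returned by boxes().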

Train and deploy your own models to Grove Vision AI V2

If you want to train models on your own dataset for visual detection tasks, we recommend using the YOLOv5 family of models.

First, we will show you how to train your own YOLOv5 model on a PC. Then, we will show you how to quantize and convert the trained model and deploy it to Grove Vision AI.

We recommend training the YOLOv5 model on the Windows operating system. You can use VS Code or PyCharm as your editor, and Anaconda or virtualenv for Python environment management. In any case, you need a Python interpreter.

An NVIDIA GPU is strongly recommended, as it greatly speeds up the training process.

If you have not used Git before, you need to install it first. On Windows, you can download it from the following link.

Git - Downloading Package

You need to clone the yolov5 repository.

git clone

You then need to open the project in an environment that has the required Python interpreter. For example, I use Anaconda to manage the Python environments on my Windows system, so I need to create the required environment in Anaconda.

We open the project in the environment we created.

Above is a screenshot of the project opened with the VS Code editor, and you can see that there is a requirements.txt file. This file lists the third-party libraries we need.

Use the following commands to install third-party libraries in the current environment:

pip install -r requirements.txt

At this point we have most of the third-party libraries installed, but we need to reinstall PyTorch, because the PyTorch and torchvision builds you install must be compatible with your CUDA version.

Uninstall PyTorch and torchvision first.

# Check the CUDA version

Then you will see:

As shown above, the CUDA version is 12.5. Download the PyTorch and torchvision builds for the corresponding CUDA version from PyTorch; just make sure that your CUDA version is equal to or higher than the CUDA version that the PyTorch build needs. Since my CUDA version is 12.5, I can install PyTorch built for CUDA 12.1.
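The version rule above can be expressed as a short Python check. This is only a sketch of the comparison; the list of available builds is a made-up example rather than an authoritative list of PyTorch releases, and the helper names are hypothetical.

```python
def parse_version(text):
    """Turn a version string such as '12.5' into a comparable tuple (12, 5)."""
    return tuple(int(part) for part in text.split("."))


def build_is_supported(system_cuda, build_cuda):
    """A build is usable when the system CUDA version is not lower than it."""
    return parse_version(system_cuda) >= parse_version(build_cuda)


def pick_build(system_cuda, available_builds):
    """Pick the newest build that the system CUDA version can run, or None."""
    usable = [b for b in available_builds if build_is_supported(system_cuda, b)]
    return max(usable, key=parse_version) if usable else None


# Hypothetical list of CUDA versions for which builds are offered.
builds = ["11.8", "12.1"]
print(pick_build("12.5", builds))  # 12.1
print(pick_build("11.0", builds))  # None
```

Comparing tuples of integers rather than raw strings avoids mistakes such as treating "12.10" as lower than "12.5".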

At this point, we have finished installing all the third-party libraries. However, please double-check that PyTorch is installed correctly.

If torch.cuda.is_available() outputs True, the installation is correct.
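A small script along the following lines can perform this check. It is a sketch that assumes nothing beyond the standard torch package; the function name describe_torch_install is made up for this example, and the script also reports the case where PyTorch is missing from the active environment.

```python
def describe_torch_install():
    """Return a one-line summary of the PyTorch installation."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version the installed build targets.
        return (
            f"PyTorch {torch.__version__} built for CUDA "
            f"{torch.version.cuda}: GPU available"
        )
    return f"PyTorch {torch.__version__}: CUDA not available, training would use the CPU"


print(describe_torch_install())
```

If the summary reports that CUDA is not available even though a GPU is present, a CPU-only build was most likely installed, and the reinstall step above is worth repeating.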

You can use the dataset provided by YOLOv5 for retraining, or you can make your own dataset for training.

If you use the dataset provided by YOLOv5 for model training, execute the following command in the directory where you installed YOLOv5:

python --weights --data ${dataset yaml file path} --imgsz 192

The argument to --data in this command can be any YAML file under yolov5\data, for example --data coco128.yaml. In addition, --imgsz 192 sets the input image size used by Grove Vision AI V2; if the model is going to be deployed to Grove Vision AI, imgsz must stay unchanged.

The output during training is shown below:

The trained model is saved under a path similar to yolov5\runs\train\exp10\weights.

After training, you need to convert the pt model to a saved_model. Run:

python --weights E:\Project\yolov5\runs\train\exp10\weights\ --imgsz 192 --include saved_model

When finished, the following folder appears:

Install Tensorflow:

pip install tensorflow

Create the following new python script to convert our saved_model model file to a tflite model file:

import os.path as osp

import tensorflow as tf

# Load the saved_model exported from YOLOv5
converter = tf.lite.TFLiteConverter.from_saved_model(r'Your saved_model folder path')


def representative_dataset():
    # Yield 100 calibration samples shaped like the model input (1, 192, 192, 3)
    for _ in range(100):
        yield [tf.random.uniform((1, 192, 192, 3))]


# Full-integer (int8) quantization of weights, activations, inputs and outputs
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
converter.representative_dataset = representative_dataset

tflite_quant_model = converter.convert()

# Write the quantized model to disk
with open(osp.join(r'The location path to be saved', 'yolov5n_int8.tflite'), 'wb') as f:
    f.write(tflite_quant_model)

Save this Python script under a name of your choice and execute it to get yolov5n_int8.tflite.
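For readers wondering what the int8 conversion does to the numbers, the following Python sketch shows the general affine quantization scheme, in which a real value is mapped to an 8-bit integer through a scale and a zero point. It is a simplified per-tensor illustration and not the TensorFlow Lite implementation; the function names and the [0, 1] value range are assumptions chosen for the example.

```python
def quant_params(min_val, max_val, qmin=-128, qmax=127):
    """Derive scale and zero point from an observed value range."""
    # The range is widened to include 0.0 so that zero is exactly representable.
    min_val, max_val = min(min_val, 0.0), max(max_val, 0.0)
    if max_val == min_val:
        return 1.0, 0  # degenerate range: every value is zero
    scale = (max_val - min_val) / (qmax - qmin)
    zero_point = int(round(qmin - min_val / scale))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to int8, clamping to the representable range."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to an approximate real value."""
    return (q - zero_point) * scale


# Assume calibration observed activations in the range [0, 1].
scale, zero_point = quant_params(0.0, 1.0)
q = quantize(0.5, scale, zero_point)
print(q, dequantize(q, scale, zero_point))
```

The value range comes from the calibration samples, which is the role of the representative dataset during conversion. Calibrating with real images from the training set generally gives more representative ranges than random noise.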

Next, the graph of the resulting tflite model is optimized with Vela. Installation:

pip3 install ethos-u-vela

You may encounter the following error while building vela:

You will need to download and install Microsoft C++ Build Tools to resolve this issue.

After you have installed vela, you need to download the vela configuration file, or copy the following content into a file, which can be named vela_config.ini.

; file: my_vela_cfg.ini
; -----------------------------------------------------------------------------
; Vela configuration file
; -----------------------------------------------------------------------------
; System Configuration

; My_Sys_Cfg
; -----------------------------------------------------------------------------
; Memory Mode
; My_Mem_Mode_Parent

Finally, use the following command to optimize the graph:

vela --accelerator-config ethos-u55-64 --config vela_config.ini --system-config My_Sys_Cfg --memory-mode My_Mem_Mode_Parent --output-dir ${Save path of the optimized model} ${The path of the tflite model that needs to be optimized}

You then get a model file with a name like yolov5n_int8_vela.tflite, which is the optimized model.

You can upload your own trained model to Grove Vision AI V2 via SenseCraft AI as mentioned above.

You can customize the name of the model and load the model file on the web page. Then enter the labels according to the labels the model was trained with; for example, if you used the coco128 dataset for training, there are 80 labels in total. Be careful to keep a one-to-one correspondence between the label IDs and the order of the label names.
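The reason the order matters is that the model only reports a numeric class index, and the name is looked up by position. A minimal Python sketch of that lookup, with a hypothetical helper label_for and only the first three COCO class names filled in:

```python
def label_for(target, labels):
    """Translate a class index reported by the model into a label name."""
    if 0 <= target < len(labels):
        return labels[target]
    return f"unknown({target})"


# First three COCO class names, in dataset order (the full list has 80).
labels = ["person", "bicycle", "car"]

print(label_for(0, labels))   # person
print(label_for(2, labels))   # car
print(label_for(79, labels))  # unknown(79), because the list is incomplete
```

If two names were swapped in the list, every detection of those classes would be shown under the wrong name even though the model itself is working correctly.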

When you finish downloading the model, you can see in the right window that Grove Vision AI performs inference using the user-defined model. Different models return different inference results. In our tests, the YOLOv5 model we trained ourselves correctly outputs the recognized labels, but it cannot output the box or its coordinates. So if you are also programming with Arduino, you only need to focus on score and target.
