Posted by Paul Ruiz – Senior Developer Relations Engineer
Earlier this year we launched Google AI Edge, a suite of tools providing easy access to ready-to-use ML tasks, frameworks that let you build ML pipelines, and the ability to run popular LLMs and custom models – all on-device. For AI on Android Spotlight Week, the Google team is highlighting various ways that Android developers can use machine learning to help improve their applications.
In this post, we'll dive into Google AI Edge Torch, which enables you to convert PyTorch models to run locally on Android and other platforms, using the Google AI Edge LiteRT (formerly TensorFlow Lite) and MediaPipe Tasks libraries. For insights on other powerful tools, be sure to explore the rest of the AI on Android Spotlight Week content.
To make getting started with Google AI Edge easier, we've provided samples available on GitHub as an executable codelab. They demonstrate how to convert the MobileViT model for image classification (compatible with MediaPipe Tasks) and the DIS model for segmentation (compatible with LiteRT).
This blog guides you through how to use the MobileViT model with MediaPipe Tasks. Keep in mind that the LiteRT runtime provides similar capabilities, enabling you to build custom pipelines and features.
Convert MobileViT model for image classification compatible with MediaPipe Tasks
Once you've installed the required dependencies and utilities for your app, the first step is to retrieve the PyTorch model you wish to convert, along with any other MobileViT components you might need (such as an image processor for testing).
from transformers import MobileViTImageProcessor, MobileViTForImageClassification

hf_model_path = "apple/mobilevit-small"
processor = MobileViTImageProcessor.from_pretrained(hf_model_path)
pt_model = MobileViTForImageClassification.from_pretrained(hf_model_path)
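If you want to sanity-check the retrieved model before going further, a quick sketch like the following (our addition, not part of the codelab; the random test image is just a stand-in) confirms that the processor and model produce the expected [1, 1000] ImageNet logits:

import numpy as np
import torch

# Classify a random 256x256 RGB image and confirm the logits shape.
dummy_image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
inputs = processor(images=dummy_image, return_tensors="pt")
with torch.no_grad():
    logits = pt_model(**inputs).logits
print(logits.shape)  # torch.Size([1, 1000])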
Since the end result of this tutorial should work with MediaPipe Tasks, take an extra step to match the expected input and output shapes for image classification to those used by the MediaPipe image classification Task.
import torch
from torch import nn

class HF2MP_ImageClassificationModelWrapper(nn.Module):

  def __init__(self, hf_image_classification_model, hf_processor):
    super().__init__()
    self.model = hf_image_classification_model
    if hf_processor.do_rescale:
      self.rescale_factor = hf_processor.rescale_factor
    else:
      self.rescale_factor = 1.0

  def forward(self, image: torch.Tensor):
    # BHWC -> BCHW.
    image = image.permute(0, 3, 1, 2)
    # RGB -> BGR.
    image = image.flip(dims=(1,))
    # Scale [0, 255] -> [0, 1].
    image = image * self.rescale_factor
    logits = self.model(pixel_values=image).logits  # [B, 1000] float32.
    # Softmax is required for MediaPipe classification model.
    logits = torch.nn.functional.softmax(logits, dim=-1)
    return logits
hf_model_path = "apple/mobilevit-small"
hf_mobile_vit_processor = MobileViTImageProcessor.from_pretrained(hf_model_path)
hf_mobile_vit_model = MobileViTForImageClassification.from_pretrained(hf_model_path)

wrapped_pt_model = HF2MP_ImageClassificationModelWrapper(
    hf_mobile_vit_model, hf_mobile_vit_processor).eval()
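As an optional check (again our addition, not a step from the codelab), you can confirm the wrapper accepts a batched HWC image tensor in the [0, 255] range and returns a probability distribution:

# The wrapper's softmax output should be a [1, 1000] tensor summing to ~1.0.
probe = torch.rand((1, 256, 256, 3)) * 255.0
with torch.no_grad():
    probs = wrapped_pt_model(probe)
print(probs.shape, float(probs.sum()))  # torch.Size([1, 1000]) ~1.0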
Whether you plan to use the converted MobileViT model with MediaPipe Tasks or LiteRT, the next step is to convert the model to the .tflite format.
First, match the input shape. In this example, the input shape is (1, 256, 256, 3) for a 256×256 pixel three-channel RGB image.
Then, call AI Edge Torch's convert function to complete the conversion process.
import ai_edge_torch

sample_args = (torch.rand((1, 256, 256, 3)),)
edge_model = ai_edge_torch.convert(wrapped_pt_model, sample_args)
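The converted model is directly callable in Python, which makes it easy to compare against the original before deployment. The sketch below is our own check, and the tolerance is an assumption; tighten or loosen it based on your accuracy requirements:

import numpy as np

# Compare the converted model's output with the wrapped PyTorch model
# on the same sample input to confirm the conversion preserved behavior.
pt_output = wrapped_pt_model(*sample_args).detach().numpy()
edge_output = edge_model(*sample_args)
assert np.allclose(pt_output, edge_output, atol=1e-4)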
After converting the model, you can further refine it by incorporating metadata for the image classification labels. MediaPipe Tasks will use this metadata to display or return pertinent information after classification.
from mediapipe.tasks.python.metadata.metadata_writers import image_classifier
from mediapipe.tasks.python.metadata.metadata_writers import metadata_writer
from mediapipe.tasks.python.vision.image_classifier import ImageClassifier
from pathlib import Path

flatbuffer_file = Path('hf_mobile_vit_mp_image_classification_raw.tflite')
edge_model.export(flatbuffer_file)
tflite_model_buffer = flatbuffer_file.read_bytes()

# Extract the image classification labels from the HF model for later
# integration into the TFLite model.
labels = list(hf_mobile_vit_model.config.id2label.values())

writer = image_classifier.MetadataWriter.create(
    tflite_model_buffer,
    input_norm_mean=[0.0],  # Normalization is not needed for this model.
    input_norm_std=[1.0],
    labels=metadata_writer.Labels().add(labels),
)
tflite_model_buffer, _ = writer.populate()
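At this point tflite_model_buffer holds the finished model. As one way to verify it end to end before moving to Android (a sketch using MediaPipe's Python API; the model and image filenames below are placeholders), you can write the buffer to disk and run it through the Python ImageClassifier:

import mediapipe as mp
from mediapipe.tasks import python as mp_tasks
from mediapipe.tasks.python import vision
from pathlib import Path

# Write the metadata-populated model to disk (placeholder filename).
output_file = Path('hf_mobile_vit_mp_image_classification.tflite')
output_file.write_bytes(tflite_model_buffer)

options = vision.ImageClassifierOptions(
    base_options=mp_tasks.BaseOptions(model_asset_path=str(output_file)),
    max_results=3)
with vision.ImageClassifier.create_from_options(options) as classifier:
  # 'test_image.jpg' is a placeholder; use any local image.
  result = classifier.classify(mp.Image.create_from_file('test_image.jpg'))
  for category in result.classifications[0].categories:
    print(category.category_name, category.score)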
With all of that completed, it's time to integrate your model into an Android app. If you're following the official Colab notebook, this involves saving the model locally. For an example of image classification with MediaPipe Tasks, explore the GitHub repository. You can find more information in the official Google AI Edge documentation.
Once you understand how to convert a simple image classification model, you can use the same techniques to adapt various PyTorch models for Google AI Edge LiteRT or MediaPipe Tasks tooling on Android.
For further model optimization, consider techniques like quantizing during conversion. Check out the GitHub example to learn more about how to convert a PyTorch image segmentation model to LiteRT and quantize it.
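To give a sense of what quantizing during conversion can look like, here is a sketch based on the PT2E quantization flow that ai-edge-torch documents; treat the exact names as assumptions and consult the GitHub example for the current API:

import torch
import ai_edge_torch
from ai_edge_torch.quantize import pt2e_quantizer, quant_config
from torch.ao.quantization import quantize_pt2e

# Configure a symmetric int8 quantizer, calibrate on representative data,
# then pass the quantizer along to the converter.
quantizer = pt2e_quantizer.PT2EQuantizer().set_global(
    pt2e_quantizer.get_symmetric_quantization_config())
captured = torch._export.capture_pre_autograd_graph(wrapped_pt_model, sample_args)
prepared = quantize_pt2e.prepare_pt2e(captured, quantizer)
prepared(*sample_args)  # Calibration pass with representative data.
quantized = quantize_pt2e.convert_pt2e(prepared, fold_quantize=False)

quantized_edge_model = ai_edge_torch.convert(
    quantized, sample_args,
    quant_config=quant_config.QuantConfig(pt2e_quantizer=quantizer))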
What's Next
To stay up to date on Google AI Edge developments, look for announcements on the Google for Developers YouTube channel and blog.
We look forward to hearing about how you're using these features in your projects. Use the #AndroidAI hashtag to share your feedback or what you've built on social media, and check out other content in AI on Android Spotlight Week!