Posted by Paul Ruiz – Senior Developer Relations Engineer
Earlier this year we launched Google AI Edge, a suite of tools providing easy access to ready-to-use ML tasks, frameworks that let you build ML pipelines, and the ability to run popular LLMs and custom models – all on-device. For AI on Android Spotlight Week, the Google team is highlighting various ways that Android developers can use machine learning to help improve their applications.
In this post, we'll dive into Google AI Edge Torch, which enables you to convert PyTorch models to run locally on Android and other platforms, using the Google AI Edge LiteRT (formerly TensorFlow Lite) and MediaPipe Tasks libraries. For insights on other powerful tools, be sure to explore the rest of the AI on Android Spotlight Week content.
To make getting started with Google AI Edge easier, we've provided samples available on GitHub as an executable codelab. They demonstrate how to convert the MobileViT model for image classification (compatible with MediaPipe Tasks) and the DIS model for segmentation (compatible with LiteRT).
This blog guides you through how to use the MobileViT model with MediaPipe Tasks. Keep in mind that the LiteRT runtime provides similar capabilities, enabling you to build custom pipelines and features.
Convert MobileViT model for image classification compatible with MediaPipe Tasks
Once you've installed the required dependencies and utilities for your app, the first step is to retrieve the PyTorch model you wish to convert, along with any other MobileViT components you might need (such as an image processor for testing).
from transformers import MobileViTImageProcessor, MobileViTForImageClassification

hf_model_path = "apple/mobilevit-small"
processor = MobileViTImageProcessor.from_pretrained(hf_model_path)
pt_model = MobileViTForImageClassification.from_pretrained(hf_model_path)
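If you want to sanity-check the retrieved model before going further, a quick sketch like the following (our addition, not part of the codelab; the random test image is just a stand-in) confirms that the processor and model produce the expected [1, 1000] ImageNet logits:

import numpy as np
import torch

# Classify a random 256x256 RGB image and confirm the logits shape.
dummy_image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
inputs = processor(images=dummy_image, return_tensors="pt")
with torch.no_grad():
    logits = pt_model(**inputs).logits
print(logits.shape)  # torch.Size([1, 1000])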
Since the end result of this tutorial should work with MediaPipe Tasks, take an extra step to match the expected input and output shapes for image classification to those used by the MediaPipe image classification Task.
import torch
from torch import nn

class HF2MP_ImageClassificationModelWrapper(nn.Module):

  def __init__(self, hf_image_classification_model, hf_processor):
    super().__init__()
    self.model = hf_image_classification_model
    if hf_processor.do_rescale:
      self.rescale_factor = hf_processor.rescale_factor
    else:
      self.rescale_factor = 1.0

  def forward(self, image: torch.Tensor):
    # BHWC -> BCHW.
    image = image.permute(0, 3, 1, 2)
    # RGB -> BGR.
    image = image.flip(dims=(1,))
    # Scale [0, 255] -> [0, 1].
    image = image * self.rescale_factor
    logits = self.model(pixel_values=image).logits  # [B, 1000] float32.
    # Softmax is required for MediaPipe classification model.
    logits = torch.nn.functional.softmax(logits, dim=-1)
    return logits
hf_model_path = "apple/mobilevit-small"
hf_mobile_vit_processor = MobileViTImageProcessor.from_pretrained(hf_model_path)
hf_mobile_vit_model = MobileViTForImageClassification.from_pretrained(hf_model_path)

wrapped_pt_model = HF2MP_ImageClassificationModelWrapper(
    hf_mobile_vit_model, hf_mobile_vit_processor).eval()
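As an optional check (again our addition, not a step from the codelab), you can confirm the wrapper accepts a batched HWC image tensor in the [0, 255] range and returns a probability distribution:

# The wrapper's softmax output should be a [1, 1000] tensor summing to ~1.0.
probe = torch.rand((1, 256, 256, 3)) * 255.0
with torch.no_grad():
    probs = wrapped_pt_model(probe)
print(probs.shape, float(probs.sum()))  # torch.Size([1, 1000]) ~1.0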
Whether you plan to use the converted MobileViT model with MediaPipe Tasks or LiteRT, the next step is to convert the model to the .tflite format.
First, match the input shape. In this example, the input shape is (1, 256, 256, 3) for a 256×256 pixel three-channel RGB image.
Then, call AI Edge Torch's convert function to complete the conversion process.
import ai_edge_torch

sample_args = (torch.rand((1, 256, 256, 3)),)
edge_model = ai_edge_torch.convert(wrapped_pt_model, sample_args)
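The converted model is directly callable in Python, which makes it easy to compare against the original before deployment. The sketch below is our own check, and the tolerance is an assumption; tighten or loosen it based on your accuracy requirements:

import numpy as np

# Compare the converted model's output with the wrapped PyTorch model
# on the same sample input to confirm the conversion preserved behavior.
pt_output = wrapped_pt_model(*sample_args).detach().numpy()
edge_output = edge_model(*sample_args)
assert np.allclose(pt_output, edge_output, atol=1e-4)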
After converting the model, you can further refine it by incorporating metadata for the image classification labels. MediaPipe Tasks will use this metadata to display or return pertinent information after classification.
from mediapipe.tasks.python.metadata.metadata_writers import image_classifier
from mediapipe.tasks.python.metadata.metadata_writers import metadata_writer
from mediapipe.tasks.python.vision.image_classifier import ImageClassifier
from pathlib import Path

flatbuffer_file = Path('hf_mobile_vit_mp_image_classification_raw.tflite')
edge_model.export(flatbuffer_file)
tflite_model_buffer = flatbuffer_file.read_bytes()

# Extract the image classification labels from the HF model for later
# integration into the TFLite model.
labels = list(hf_mobile_vit_model.config.id2label.values())

writer = image_classifier.MetadataWriter.create(
    tflite_model_buffer,
    input_norm_mean=[0.0],  # Normalization is not needed for this model.
    input_norm_std=[1.0],
    labels=metadata_writer.Labels().add(labels),
)
tflite_model_buffer, _ = writer.populate()
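At this point tflite_model_buffer holds the finished model. As one way to verify it end to end before moving to Android (a sketch using MediaPipe's Python API; the model and image filenames below are placeholders), you can write the buffer to disk and run it through the Python ImageClassifier:

import mediapipe as mp
from mediapipe.tasks import python as mp_tasks
from mediapipe.tasks.python import vision
from pathlib import Path

# Write the metadata-populated model to disk (placeholder filename).
output_file = Path('hf_mobile_vit_mp_image_classification.tflite')
output_file.write_bytes(tflite_model_buffer)

options = vision.ImageClassifierOptions(
    base_options=mp_tasks.BaseOptions(model_asset_path=str(output_file)),
    max_results=3)
with vision.ImageClassifier.create_from_options(options) as classifier:
  # 'test_image.jpg' is a placeholder; use any local image.
  result = classifier.classify(mp.Image.create_from_file('test_image.jpg'))
  for category in result.classifications[0].categories:
    print(category.category_name, category.score)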
With all of that completed, it's time to integrate your model into an Android app. If you're following the official Colab notebook, this involves saving the model locally. For an example of image classification with MediaPipe Tasks, explore the GitHub repository. You can find more information in the official Google AI Edge documentation.
Once you understand how to convert a simple image classification model, you can use the same techniques to adapt various PyTorch models for Google AI Edge LiteRT or MediaPipe Tasks tooling on Android.
For further model optimization, consider techniques like quantizing during conversion. Check out the GitHub example to learn more about how to convert a PyTorch image segmentation model to LiteRT and quantize it.
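To give a sense of what quantizing during conversion can look like, here is a sketch based on the PT2E quantization flow that ai-edge-torch documents; treat the exact names as assumptions and consult the GitHub example for the current API:

import torch
import ai_edge_torch
from ai_edge_torch.quantize import pt2e_quantizer, quant_config
from torch.ao.quantization import quantize_pt2e

# Configure a symmetric int8 quantizer, calibrate on representative data,
# then pass the quantizer along to the converter.
quantizer = pt2e_quantizer.PT2EQuantizer().set_global(
    pt2e_quantizer.get_symmetric_quantization_config())
captured = torch._export.capture_pre_autograd_graph(wrapped_pt_model, sample_args)
prepared = quantize_pt2e.prepare_pt2e(captured, quantizer)
prepared(*sample_args)  # Calibration pass with representative data.
quantized = quantize_pt2e.convert_pt2e(prepared, fold_quantize=False)

quantized_edge_model = ai_edge_torch.convert(
    quantized, sample_args,
    quant_config=quant_config.QuantConfig(pt2e_quantizer=quantizer))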
What's Next
To stay up to date on Google AI Edge developments, look for announcements on the Google for Developers YouTube channel and blog.
We look forward to hearing about how you're using these features in your projects. Use the #AndroidAI hashtag to share your feedback or what you've built on social media, and check out other content in AI on Android Spotlight Week!