Add ExecuTorch Alpha blog post and other page updates #1615


Merged: 4 commits, Apr 30, 2024
16 changes: 8 additions & 8 deletions _get_started/mobile.md
@@ -1,21 +1,21 @@
 ---
 layout: get_started
-title: Mobile
-permalink: /get-started/mobile/
+title: ExecuTorch
+permalink: /get-started/executorch/
 background-class: get-started-background
 body-class: get-started
 order: 5
 published: true
 ---

-## Get Started with PyTorch Mobile
+## Get Started with PyTorch ExecuTorch

-As of PyTorch 1.3, PyTorch supports an end-to-end workflow from Python to deployment on iOS and Android.
-This is an early, experimental release that we will be building on in several areas over the coming months.
+<p>
+<a href="https://pytorch.org/executorch/stable/index.html" class="btn btn-lg with-right-arrow">
+ExecuTorch Documentation
+</a>
+</p>

-Get started on [Android]({{ site.baseurl }}/mobile/android)
-
-Get started on [iOS]({{ site.baseurl }}/mobile/ios)

 <script page-id="mobile" src="{{ site.baseurl }}/assets/menu-tab-selection.js"></script>
 <script src="{{ site.baseurl }}/assets/get-started-sidebar.js"></script>
51 changes: 51 additions & 0 deletions _posts/2024-04-30-executorch-alpha.md
@@ -0,0 +1,51 @@
---
layout: blog_detail
title: "ExecuTorch Alpha: Taking LLMs and AI to the Edge with Our Community and Partners"
---

We are excited to announce the release of [ExecuTorch alpha](https://github.com/pytorch/executorch), focused on deploying large language models (LLMs) and large ML models to the edge, stabilizing the API surface, and improving our installation processes. It has been an exciting few months since [our 0.1 (preview) release](https://pytorch.org/blog/pytorch-edge/), working in collaboration with our partners at Arm, Apple, and Qualcomm Technologies, Inc.

In this post, we’ll discuss our full support for Meta’s Llama 2, early support for Meta’s Llama 3, and broad model support in ExecuTorch, and highlight the important work our partners have done to move us forward.

## Large Language Models on Mobile

Mobile devices are highly constrained in compute, memory, and power. To bring LLMs to these devices, we rely heavily on quantization and other techniques to pack these models appropriately.

ExecuTorch alpha supports 4-bit post-training quantization using GPTQ. We've provided broad device support on CPU by landing dynamic shape support and new dtypes in XNNPACK. We've also made significant improvements in export and lowering, reduced memory overhead, and improved runtime performance. This enables running Llama 2 7B efficiently on the iPhone 15 Pro, iPhone 15 Pro Max, Samsung Galaxy S22, S23, and S24 phones, and other edge devices. [Early support](https://github.com/pytorch/executorch/releases/tag/v0.2.0) for [Llama 3 8B](https://ai.meta.com/blog/meta-llama-3/) is also included. We are continually improving tokens-per-second throughput on various edge devices; visit GitHub for the [latest performance numbers](https://github.com/pytorch/executorch/blob/main/examples/models/llama2/README.md).
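To make the quantization step above concrete, here is a minimal sketch of group-wise symmetric 4-bit quantization, the basic scheme that post-training methods like GPTQ build on (GPTQ additionally applies per-layer error correction, which this sketch omits). Every name below is illustrative, not an ExecuTorch or GPTQ API:

```python
# Minimal sketch of group-wise symmetric 4-bit weight quantization.
# Illustrative only; not ExecuTorch's actual quantization code.

def quantize_group(weights, n_bits=4):
    """Map a small group of float weights onto ints in [-8, 7] plus one scale."""
    qmax = 2 ** (n_bits - 1) - 1                 # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    if scale == 0.0:                             # all-zero group: any scale works
        scale = 1.0
    return [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights], scale

def dequantize_group(q, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [v * scale for v in q]

group = [0.12, -0.53, 0.31, 0.02, -0.88, 0.46, 0.67, -0.21]
q, scale = quantize_group(group)
restored = dequantize_group(q, scale)
max_err = max(abs(a - b) for a, b in zip(group, restored))
assert max_err <= scale / 2 + 1e-9               # rounding error is bounded
print(q)
```

Storing one 4-bit code per weight plus one scale per small group shrinks a float32 weight matrix roughly 8x, which is what makes 7B-parameter models fit within mobile memory budgets.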

We're working closely with our partners at Apple, Arm, and Qualcomm Technologies to delegate to GPU and NPU for performance through the Core ML and MPS (Apple), TOSA (Arm), and Qualcomm AI Stack backends.
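Conceptually, delegation means partitioning the model's operator graph into subgraphs that a backend claims, with everything else falling back to portable CPU ops. The toy sketch below illustrates that partitioning decision; the supported-op set and function names are made up, and the real ExecuTorch partitioner operates on an exported graph rather than a flat op list:

```python
# Toy sketch of backend delegation: group maximal runs of operators that a
# backend supports into delegated segments; the rest stay on the portable
# CPU path. Illustrative only; not the real ExecuTorch partitioner API.

BACKEND_SUPPORTED = {"conv2d", "linear", "relu", "add"}  # hypothetical op set

def partition(ops, supported):
    """Split a flat op list into alternating (target, ops) segments."""
    segments = []
    for op in ops:
        target = "delegate" if op in supported else "portable"
        if segments and segments[-1][0] == target:
            segments[-1][1].append(op)               # extend the current run
        else:
            segments.append((target, [op]))          # start a new segment
    return segments

model_ops = ["conv2d", "relu", "custom_op", "linear", "add", "softmax"]
for target, seg in partition(model_ops, BACKEND_SUPPORTED):
    print(target, seg)
# prints:
#   delegate ['conv2d', 'relu']
#   portable ['custom_op']
#   delegate ['linear', 'add']
#   portable ['softmax']
```

Fewer, larger delegated segments mean fewer transitions between the backend and the CPU runtime, which is one reason expanding a backend's op coverage (e.g. dynamic shapes in XNNPACK) translates directly into performance.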

## Supported Models

We remain committed to supporting an ever-expanding list of models with ExecuTorch. Since the preview release, we have significantly expanded our tested models across NLP, vision, and speech, with full details [in our release notes](https://github.com/pytorch/executorch/releases/tag/v0.2.0). Although support for on-device LLMs is early, we anticipate that most traditional models will function seamlessly out of the box, with delegation to XNNPACK, Core ML, MPS, TOSA, and HTP for performance. If you encounter any problems, please open [a GitHub issue](https://github.com/pytorch/executorch/issues).

## Productivity

Deploying performant models tuned for specific platforms often requires deep visibility into on-device runtime data to determine the right changes to make in the original PyTorch model. With ExecuTorch alpha, we provide a powerful SDK with observability throughout the process, from model authoring to deployment, including delegate- and hardware-level information.

The ExecuTorch SDK was enhanced to include better debugging and profiling tools. Because ExecuTorch is built on PyTorch, the debugging capabilities include the ability to map from operator nodes back to original Python source code for more efficient anomaly resolution and performance tuning for both delegated and non-delegated model instances. You can learn more about the ExecuTorch SDK [here](https://github.com/pytorch/executorch/blob/main/examples/sdk/README.md).

## Partnerships

ExecuTorch has only been possible because of strong collaborations across Arm, Apple, and Qualcomm Technologies. The collaboration for the initial launch of ExecuTorch continues as we support LLMs and large AI models on the edge for PyTorch. As we’ve seen with this early work for ExecuTorch alpha, there are unique challenges with these larger models and we’re excited to develop in the open.

We also want to highlight our great partnership with Google on [XNNPACK](https://github.com/google/XNNPACK) for CPU performance. The teams continue to work together, upstreaming our changes and coordinating across the TensorFlow and PyTorch teams, to make sure we can all support generative AI models on the edge with SOTA performance.

Lastly, our hardware partner MediaTek has been working to enable the Llama collection of models with ExecuTorch on their SoCs. We'll have more to share in the future.

## Alpha and Production Usage

With our alpha release, we have production-tested ExecuTorch. Meta is using ExecuTorch for hand tracking on Meta Quest 3 and a variety of models on Ray-Ban Meta Smart Glasses. In addition, we have begun the rollout of ExecuTorch with Instagram and are integrating with other Meta products. We are excited to see how ExecuTorch can be used for other edge experiences.

## Community

We are excited to see various efforts in the community to adopt or contribute to ExecuTorch. For instance, Unity recently [shared their work](https://schedule.gdconf.com/session/unity-developer-summit-drive-better-gameplay-experiences-on-user-devices-with-ai-presented-by-unity/903634) at the Game Developers Conference ([GDC](https://gdconf.com/)) on leveraging ExecuTorch and Edge IR to run PyTorch models with their neural network inference library Sentis. Leveraging ExecuTorch's hackability and extensibility, Unity introduced their own custom backend that serializes ExecuTorch’s Edge Dialect IR into Sentis’ native serialized format, enabling developers to begin using PyTorch models easily in their games and apps.

We’ve been building and innovating with ExecuTorch in the open. Our north star is to empower the community to deploy any ML model on edge devices painlessly and efficiently. Whether you are a hobbyist or this is your day job, we’d love for you to [jump in to bring your ML models to the edge](https://pytorch.org/executorch/stable/getting-started-setup.html). We are looking for your help to:

1. Use ExecuTorch to [run your LLM models locally](https://github.com/pytorch/executorch/blob/main/docs/source/llm/getting-started.md) on various deployment targets and share your feedback
2. Expand our supported models, including by filing bug reports
3. Expand our quantization schemes
4. Help us build out delegates to GPU and NPU

To all individual contributors and early adopters of ExecuTorch, a big thank you as well. We can’t wait to have more of you [join us](https://github.com/pytorch/executorch)!
19 changes: 9 additions & 10 deletions edge.html
Expand Up @@ -23,11 +23,10 @@ <h1 class="small">PyTorch Edge</h1>
 <div class="row">
 <div class="col-md-10">
 <h2 class="mt-5 mb-2">PyTorch Edge</h2>
-<p>The AI landscape is quickly evolving, with AI models being deployed beyond server to edge devices such as mobile phones, wearables, AR/VR/MR and embedded devices. PyTorch Edge extends PyTorch's research-to-production stack to these edge devices and paves the way for building innovative, privacy-aware experiences with superior productivity, portability, and performance, optimized for these diverse hardware platforms. </p>
-<h2 class="mt-5 mb-2">PyTorch on Edge - From PyTorch Mobile to ExecuTorch</h2>
-<p>In 2019, we announced <a href="/mobile/home/">PyTorch Mobile</a> powered by <a href="https://pytorch.org/docs/stable/jit.html">TorchScript</a> to address the ever-growing need for edge devices to execute AI models. To advance our PyTorch Edge offerings even further, we developed <a href="/executorch-overview">ExecuTorch</a>. ExecuTorch facilitates PyTorch inference on edge devices while supporting portability across hardware platforms with lower runtime and framework tax. ExecuTorch was developed collaboratively between industry leaders including Meta, Arm, Apple, and Qualcomm. </p>
-<p>PyTorch Mobile allowed users to stay in the PyTorch ecosystem from training to model deployment. However, the lack of consistent PyTorch semantics used across these and the focus on TorchScript inhibited the developer experience and slowed down research to production. PyTorch Mobile also didn’t provide well-defined entry points for third-party integration and optimizations, which we’ve addressed with ExecuTorch. </p>
-<p>We’ve renewed our commitment to on-device AI with <a href="/executorch-overview">ExecuTorch</a>. This extends our ecosystem in a much more “in the spirit of PyTorch” way, with productivity, hackability, and extensibility as critical components. We look forward to supporting edge and embedded applications with low latency, strong privacy, and innovation on the edge. </p>
+<p>The AI landscape is quickly evolving, with AI models being deployed beyond server to edge devices such as mobile phones, wearables, AR/VR/MR and embedded devices. PyTorch Edge extends PyTorch's research-to-production stack to these edge devices and paves the way for building innovative, privacy-aware experiences with superior productivity, portability, and performance, optimized for these diverse hardware platforms.</p>
+<h2 class="mt-5 mb-2">Introducing ExecuTorch</h2>
+<p>To advance our PyTorch Edge offering, we developed <a href="https://pytorch.org/executorch-overview">ExecuTorch</a>, our new runtime for edge devices. ExecuTorch facilitates PyTorch inference on edge devices while supporting portability across hardware platforms with lower runtime and framework tax. ExecuTorch was developed collaboratively between industry leaders including Meta, Arm, Apple, and Qualcomm. </p>
+<p>With ExecuTorch, we’ve renewed our commitment to on-device AI. This extends our ecosystem in a much more “in the spirit of PyTorch” way, with productivity, hackability, and extensibility as critical components. We look forward to supporting edge and embedded applications with low latency, strong privacy, and innovation on the edge. </p>
 </div>
 </div>
 </div>
@@ -41,15 +40,15 @@ <h2>Learn more about PyTorch Edge</h2>
 </div>
 <div class="row content">
 <div class="col-md-4 text-center">
-<p class="lead">New on-device inference</p>
-<a href="/executorch-overview" class="btn btn-lg mb-4 with-right-arrow">
+<p class="lead">What’s New in ExecuTorch</p>
+<a href="https://github.com/pytorch/executorch" class="btn btn-lg mb-4 with-right-arrow">
 ExecuTorch
 </a>
 </div>
 <div class="col-md-4 text-center">
-<p class="lead">Legacy PyTorch Mobile runtime</p>
-<a href="/mobile/home" class="btn btn-lg with-right-arrow">
-PyTorch Mobile
+<p class="lead">Try ExecuTorch</p>
+<a href="https://pytorch.org/executorch/stable/index.html" class="btn btn-lg with-right-arrow">
+ExecuTorch Documentation
 </a>
 </div>
 </div>
12 changes: 5 additions & 7 deletions executorch.html
@@ -22,10 +22,8 @@ <h1 class="small">ExecuTorch</h1>
 <div class="container mb-5">
 <div class="row">
 <div class="col-md-10">
-<p class="mt-4"><strong>IMPORTANT NOTE: This is a preview version of Executorch and should be used for testing and evaluation purposes only. It is not recommended for use in production settings. We welcome any feedback, suggestions, and bug reports from the community to help us improve the technology.</strong></p>
-
 <h2 class="mt-5 mb-2" id="what-is-executorch">What is ExecuTorch?</h2>
-<p>ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices. Key value propositions of ExecuTorch are:</p>
+<p>ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of various PyTorch models (vision, speech, Generative AI, and more) to edge devices. Key value propositions of ExecuTorch are:</p>

 <div class="container">
 <div class="row mt-3">
@@ -56,10 +54,10 @@ <h2 class="mt-5 mb-2" id="what-is-executorch">What is ExecuTorch?</h2>

 <h2 class="mt-5 mb-2" id="explore-executorch">Explore ExecuTorch</h2>

-<p>We are excited to see how the community leverages our all new on-device AI stack. You can learn more about <a href="https://pytorch.org/executorch/stable/getting-started-architecture">key components</a> of ExecuTorch and its architecture, <a href="https://pytorch.org/executorch/stable/intro-how-it-works">how it works</a>, and explore <a href="/executorch">documentation page</a> and <a href="https://pytorch.org/executorch/stable/#tutorials-and-examples:~:text=Getting%20Started-,Tutorials%20and%20Examples,-Docs">detailed tutorials</a>.</p>
+<p>ExecuTorch is currently powering various experiences across AR, VR and Family of Apps (FOA) products and services at Meta. We are excited to see how the community leverages our all new on-device AI stack. You can learn more about <a href="https://pytorch.org/executorch/stable/getting-started-architecture">key components</a> of ExecuTorch and its architecture, <a href="https://pytorch.org/executorch/stable/intro-how-it-works">how it works</a>, and explore <a href="https://pytorch.org/executorch">documentation pages</a> and <a href="https://pytorch.org/executorch/stable/#tutorials-and-examples:~:text=Getting%20Started-,Tutorials%20and%20Examples,-Docs">detailed tutorials</a>.</p>

 <p>
-<a href="/executorch" class="btn btn-lg with-right-arrow">
+<a href="https://pytorch.org/executorch/stable/index.html" class="btn btn-lg with-right-arrow">
 ExecuTorch Documentation
 </a>
 </p>
@@ -68,9 +66,9 @@ <h2 class="mt-5 mb-2" id="why-executorch">Why ExecuTorch?</h2>

 <p>Supporting on-device AI presents unique challenges with diverse hardware, critical power requirements, low/no internet connectivity, and realtime processing needs. These constraints have historically prevented or slowed down the creation of scalable and performant on-device AI solutions. We designed ExecuTorch, backed by our industry leaders like Meta, Arm, Apple, and Qualcomm, to be highly portable and provide superior developer productivity without losing on performance.</p>

-<h2 class="mt-5 mb-2" id="how-is-executorch-different-from-pytorch-mobile-lite-interpreter">How is ExecuTorch Different from <a href="/mobile/home/">PyTorch Mobile (Lite Interpreter)</a>?</h2>
+<h2 class="mt-5 mb-2" id="executorch-alpha-release">ExecuTorch Alpha Release</h2>

-<p>PyTorch Mobile uses TorchScript to allow PyTorch models to run on devices with limited resources. ExecuTorch has a significantly smaller memory size and a dynamic memory footprint resulting in superior performance compared to PyTorch Mobile. Also ExecuTorch does not rely on TorchScript, and instead leverages PyTorch 2.0 compiler and export functionality for on-device execution of PyTorch models.</p>
+<p>ExecuTorch was initially introduced to the community at the 2023 <a href="https://pytorch.org/blog/pytorch-conference-2023/">PyTorch Conference</a>. With our most recent alpha release, we further expanded ExecuTorch’s capabilities across multiple dimensions. First, we enabled support for the deployment of large language models (LLMs) on various edge devices. Second, with ExecuTorch alpha, we have further stabilized the API surface. Lastly, we have significantly improved the developer experience by simplifying the installation flow as well as improving observability and developer productivity via the <a href="https://github.com/pytorch/executorch/blob/main/examples/sdk/README.md">ExecuTorch SDK</a>. ExecuTorch alpha release also provides early support for the recently announced Llama 3 8B along with demonstrations on how to run this model on an iPhone 15 Pro and a Samsung Galaxy S24 mobile phone.</p>

 </div>
 </div>