diff --git a/_get_started/mobile.md b/_get_started/mobile.md index cd55a7ecdb93..2a640293144c 100644 --- a/_get_started/mobile.md +++ b/_get_started/mobile.md @@ -1,21 +1,21 @@ --- layout: get_started -title: Mobile -permalink: /get-started/mobile/ +title: ExecuTorch +permalink: /get-started/executorch/ background-class: get-started-background body-class: get-started order: 5 published: true --- -## Get Started with PyTorch Mobile +## Get Started with PyTorch ExecuTorch -As of PyTorch 1.3, PyTorch supports an end-to-end workflow from Python to deployment on iOS and Android. -This is an early, experimental release that we will be building on in several areas over the coming months. +

+ + ExecuTorch Documentation + +

-Get started on [Android]({{ site.baseurl }}/mobile/android) - -Get started on [iOS]({{ site.baseurl }}/mobile/ios) diff --git a/_posts/2024-04-30-executorch-alpha.md b/_posts/2024-04-30-executorch-alpha.md new file mode 100644 index 000000000000..4c9a8649e7eb --- /dev/null +++ b/_posts/2024-04-30-executorch-alpha.md @@ -0,0 +1,51 @@ +--- +layout: blog_detail +title: "ExecuTorch Alpha: Taking LLMs and AI to the Edge with Our Community and Partners" +--- + +We are excited to announce the release of [ExecuTorch alpha](https://github.com/pytorch/executorch), focused on deploying large language models (LLMs) and large ML models to the edge, stabilizing the API surface, and improving our installation processes. It has been an exciting few months [since our 0.1 (preview) release](https://pytorch.org/blog/pytorch-edge/) in collaboration with our partners at Arm, Apple, and Qualcomm Technologies, Inc. + +In this post, we'll discuss our full support for Meta's Llama 2 and early support for Meta's Llama 3, cover broad model support in ExecuTorch, and highlight the important work our partners have done to move us forward. + +## Large Language Models on Mobile + +Mobile devices are highly constrained for compute, memory, and power. To bring LLMs to these devices, we heavily leverage quantization and other techniques to pack these models appropriately. + +ExecuTorch alpha supports 4-bit post-training quantization using GPTQ. We've provided broad device support on CPU by landing dynamic shape support and new dtypes in XNNPACK. We've also made significant improvements in export and lowering, reduced memory overhead, and improved runtime performance. This enables running Llama 2 7B efficiently on iPhone 15 Pro, iPhone 15 Pro Max, Samsung Galaxy S22, S23, and S24 phones, and other edge devices. [Early support](https://github.com/pytorch/executorch/releases/tag/v0.2.0) for [Llama 3 8B](https://ai.meta.com/blog/meta-llama-3/) is also included. 
We are always improving tokens/sec on various edge devices, and you can visit GitHub for the [latest performance numbers](https://github.com/pytorch/executorch/blob/main/examples/models/llama2/README.md). + +We're working closely with our partners at Apple, Arm, and Qualcomm Technologies to delegate to GPU and NPU for performance through the Core ML, MPS, TOSA, and Qualcomm AI Stack backends, respectively. + +## Supported Models + +We remain committed to supporting an ever-expanding list of models with ExecuTorch. Since the preview release, we have significantly expanded our tested models across NLP, vision, and speech, with full details [in our release notes](https://github.com/pytorch/executorch/releases/tag/v0.2.0). Although support for on-device LLMs is early, we anticipate that most traditional models will function seamlessly out of the box, with delegation to XNNPACK, Core ML, MPS, TOSA, and HTP for performance. If you encounter any problems, please open [a GitHub issue](https://github.com/pytorch/executorch/issues) with us. + +## Productivity + +Deploying performant models tuned for specific platforms often requires deep visibility into on-device runtime data to determine the right changes to make in the original PyTorch model. With ExecuTorch alpha, we provide a powerful SDK with observability throughout the process, from model authoring to deployment, including delegate and hardware-level information. + +The ExecuTorch SDK was enhanced to include better debugging and profiling tools. Because ExecuTorch is built on PyTorch, its debugging capabilities include the ability to map from operator nodes back to the original Python source code, for more efficient anomaly resolution and performance tuning of both delegated and non-delegated model instances. You can learn more about the ExecuTorch SDK [here](https://github.com/pytorch/executorch/blob/main/examples/sdk/README.md). 
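To make the 4-bit quantization discussed above more concrete, here is a toy sketch of symmetric 4-bit round-to-nearest quantization of a single weight row in plain Python. This is illustrative only: ExecuTorch's GPTQ-based flow chooses quantized values to minimize layer output error and typically uses per-group scales, rather than the naive per-row rounding shown here.

```python
# Toy sketch of symmetric 4-bit weight quantization (illustration only;
# GPTQ as used by ExecuTorch is error-minimizing, not naive rounding).

def quantize_4bit(weights):
    """Map floats to signed 4-bit integers in [-8, 7] with one scale per row."""
    scale = max(abs(w) for w in weights) / 7.0  # 7 = largest positive 4-bit value
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    """Recover approximate float weights from the 4-bit values."""
    return [v * scale for v in q]

row = [0.7, -0.35, 0.07, -0.7]
q, scale = quantize_4bit(row)        # q holds values in [-8, 7]
approx = dequantize_4bit(q, scale)   # close to row, within half a scale step
```

Each 4-bit value occupies half a byte, so two weights pack into one byte: roughly an 8x reduction versus float32, which is what shrinks a 7B-parameter model's weights from about 28 GB to about 3.5 GB and makes on-device Llama 2 7B feasible at all.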
+ +## Partnerships + +ExecuTorch has only been possible because of strong collaborations across Arm, Apple, and Qualcomm Technologies. The collaboration for the initial launch of ExecuTorch continues as we support LLMs and large AI models on the edge for PyTorch. As we’ve seen with this early work for ExecuTorch alpha, there are unique challenges with these larger models and we’re excited to develop in the open. + +We also want to highlight the great partnership with Google on [XNNPACK](https://github.com/google/XNNPACK) for CPU performance. The teams continue to work together upstreaming our changes and across the TensorFlow and PyTorch teams to make sure we can all support generative AI models on the edge with SOTA performance. + +Lastly, our hardware partner MediaTek has been doing work enabling the Llama collection of models with ExecuTorch on their SoCs. We'll have more to share in the future. + +## Alpha and Production Usage + +With our alpha release, we have production-tested ExecuTorch. Meta is using ExecuTorch for hand tracking on Meta Quest 3 and a variety of models on Ray-Ban Meta Smart Glasses. In addition, we have begun the rollout of ExecuTorch with Instagram and are integrating with other Meta products. We are excited to see how ExecuTorch can be used for other edge experiences. + +## Community + +We are excited to see various efforts in the community to adopt or contribute to ExecuTorch. For instance, Unity recently [shared their work](https://schedule.gdconf.com/session/unity-developer-summit-drive-better-gameplay-experiences-on-user-devices-with-ai-presented-by-unity/903634) at the Game Developers Conference ([GDC](https://gdconf.com/)) on leveraging ExecuTorch and Edge IR to run PyTorch models with their neural network inference library Sentis. 
Leveraging ExecuTorch's hackability and extensibility, Unity introduced their own custom backend that serializes ExecuTorch’s Edge Dialect IR into Sentis’ native serialized format, enabling developers to begin using PyTorch models easily in their games and apps. + +We’ve been building and innovating with ExecuTorch in the open. Our north star is to empower the community to deploy any ML model on edge devices painlessly and efficiently. Whether you are a hobbyist or this is your day job, we’d love for you to [jump in to bring your ML models to the edge](https://pytorch.org/executorch/stable/getting-started-setup.html). We are looking for your help to: + +1. Use ExecuTorch to [run LLMs locally](https://github.com/pytorch/executorch/blob/main/docs/source/llm/getting-started.md) on various deployment targets and share your feedback +2. Help expand our list of supported models, including by filing bug reports +3. Expand our quantization schemes +4. Help us build out delegates to GPU and NPU + +To all individual contributors and early adopters of ExecuTorch, a big thank you as well. We can’t wait to have more of you [join us](https://github.com/pytorch/executorch)! \ No newline at end of file diff --git a/edge.html b/edge.html index 215d3d52e497..c6fffa260b9f 100644 --- a/edge.html +++ b/edge.html @@ -23,11 +23,10 @@

PyTorch Edge

PyTorch Edge

-

The AI landscape is quickly evolving, with AI models being deployed beyond server to edge devices such as mobile phones, wearables, AR/VR/MR and embedded devices. PyTorch Edge extends PyTorch's research-to-production stack to these edge devices and paves the way for building innovative, privacy-aware experiences with superior productivity, portability, and performance, optimized for these diverse hardware platforms.

-

PyTorch on Edge - From PyTorch Mobile to ExecuTorch

-

In 2019, we announced PyTorch Mobile powered by TorchScript to address the ever-growing need for edge devices to execute AI models. To advance our PyTorch Edge offerings even further, we developed ExecuTorch. ExecuTorch facilitates PyTorch inference on edge devices while supporting portability across hardware platforms with lower runtime and framework tax. ExecuTorch was developed collaboratively between industry leaders including Meta, Arm, Apple, and Qualcomm.

-

PyTorch Mobile allowed users to stay in the PyTorch ecosystem from training to model deployment. However, the lack of consistent PyTorch semantics used across these and the focus on TorchScript inhibited the developer experience and slowed down research to production. PyTorch Mobile also didn’t provide well-defined entry points for third-party integration and optimizations, which we’ve addressed with ExecuTorch.

-

We’ve renewed our commitment to on-device AI with ExecuTorch. This extends our ecosystem in a much more “in the spirit of PyTorch” way, with productivity, hackability, and extensibility as critical components. We look forward to supporting edge and embedded applications with low latency, strong privacy, and innovation on the edge.

+

The AI landscape is quickly evolving, with AI models being deployed beyond the server to edge devices such as mobile phones, wearables, AR/VR/MR headsets, and embedded devices. PyTorch Edge extends PyTorch's research-to-production stack to these edge devices and paves the way for building innovative, privacy-aware experiences with superior productivity, portability, and performance, optimized for these diverse hardware platforms.

+

Introducing ExecuTorch

+

To advance our PyTorch Edge offering, we developed ExecuTorch, our new runtime for edge devices. ExecuTorch facilitates PyTorch inference on edge devices while supporting portability across hardware platforms with lower runtime and framework tax. ExecuTorch was developed collaboratively by industry leaders, including Meta, Arm, Apple, and Qualcomm.

+

With ExecuTorch, we’ve renewed our commitment to on-device AI. This extends our ecosystem in a much more “in the spirit of PyTorch” way, with productivity, hackability, and extensibility as critical components. We look forward to supporting edge and embedded applications with low latency, strong privacy, and innovation on the edge.

@@ -41,15 +40,15 @@

Learn more about PyTorch Edge

-

New on-device inference

- +

What’s New in ExecuTorch

+
ExecuTorch
-

Legacy PyTorch Mobile runtime

- - PyTorch Mobile +

Try ExecuTorch

+
+ ExecuTorch Documentation
diff --git a/executorch.html b/executorch.html index 6daf89413a7b..de8f9c8bf9ce 100644 --- a/executorch.html +++ b/executorch.html @@ -22,10 +22,8 @@

ExecuTorch

-

IMPORTANT NOTE: This is a preview version of Executorch and should be used for testing and evaluation purposes only. It is not recommended for use in production settings. We welcome any feedback, suggestions, and bug reports from the community to help us improve the technology.

-

What is ExecuTorch?

-

ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices. Key value propositions of ExecuTorch are:

+

ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices, including wearables, embedded devices, and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of various PyTorch models (vision, speech, Generative AI, and more) to edge devices. Key value propositions of ExecuTorch are:

@@ -56,10 +54,10 @@

What is ExecuTorch?

Explore ExecuTorch

-

We are excited to see how the community leverages our all new on-device AI stack. You can learn more about key components of ExecuTorch and its architecture, how it works, and explore documentation page and detailed tutorials.

+

ExecuTorch is currently powering various experiences across AR, VR, and Family of Apps (FOA) products and services at Meta. We are excited to see how the community leverages our all-new on-device AI stack. You can learn more about the key components of ExecuTorch and its architecture, how it works, and explore the documentation pages and detailed tutorials.

- + ExecuTorch Documentation

@@ -68,9 +66,9 @@

Why ExecuTorch?

Supporting on-device AI presents unique challenges with diverse hardware, critical power requirements, low/no internet connectivity, and real-time processing needs. These constraints have historically prevented or slowed down the creation of scalable and performant on-device AI solutions. We designed ExecuTorch, backed by industry leaders like Meta, Arm, Apple, and Qualcomm, to be highly portable and provide superior developer productivity without sacrificing performance.

-

How is ExecuTorch Different from PyTorch Mobile (Lite Interpreter)?

+

ExecuTorch Alpha Release

-

PyTorch Mobile uses TorchScript to allow PyTorch models to run on devices with limited resources. ExecuTorch has a significantly smaller memory size and a dynamic memory footprint resulting in superior performance compared to PyTorch Mobile. Also ExecuTorch does not rely on TorchScript, and instead leverages PyTorch 2.0 compiler and export functionality for on-device execution of PyTorch models.

+

ExecuTorch was initially introduced to the community at the 2023 PyTorch Conference. With our most recent alpha release, we further expanded ExecuTorch’s capabilities across multiple dimensions. First, we enabled support for the deployment of large language models (LLMs) on various edge devices. Second, with ExecuTorch alpha, we have further stabilized the API surface. Lastly, we have significantly improved the developer experience by simplifying the installation flow as well as improving observability and developer productivity via the ExecuTorch SDK. The ExecuTorch alpha release also provides early support for the recently announced Llama 3 8B, along with demonstrations of how to run this model on an iPhone 15 Pro and a Samsung Galaxy S24.