AI gets into your browser and it’s faster than ever

AI chatbots can now run a lot faster in the browser

3 min. read

Published on March 5, 2024

We were wondering when the full power of AI would reach our browsers and, apparently, the time has come.

We’re talking about ONNX Runtime Web, which can now use WebGPU to run AI models directly in the browser and make them faster. A lot faster!

What is the ONNX Runtime Web?

To explain that, you first need to know that WebGPU is a web graphics and compute API, like WebGL, but a lot more powerful and capable of handling larger computational workloads. It harnesses the GPU to perform the parallel computations that AI workloads need.

ONNX Runtime Web itself is a JavaScript library that lets web developers embed AI models such as LLMs right in the browser and benefit from GPU hardware acceleration.

Usually, LLMs are not easy to deploy in browsers because they require a lot of memory and computational power.

What is new in ONNX Runtime Web is the WebGPU backend, which Microsoft and Intel are currently developing.

How fast is the ONNX Runtime Web?

To prove their point, the ONNX Runtime team created a demo using the Segment Anything model, and the results were little short of amazing.

They used both the WASM EP and the WebGPU EP (execution providers) on a PC with an NVIDIA GeForce RTX 3060 and an Intel Core i9. Then they compared the model's encoder running on the CPU against the new WebGPU backend, and the latter proved to be a lot faster, as shown in the screenshot above.
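If you want to run a similar CPU vs. GPU comparison on your own machine, a tiny timing helper is enough to get a rough number. The sketch below is our own illustration, not the ONNX Runtime team's benchmark code; the names `timeAsync` and `speedup` are hypothetical, and the clock is passed in as a parameter only to keep the helper easy to test.

```typescript
// Hypothetical helper: times one async call and returns the elapsed milliseconds.
// `now` defaults to Date.now; a higher-resolution clock can be passed in instead.
async function timeAsync(
  fn: () => Promise<unknown>,
  now: () => number = Date.now
): Promise<number> {
  const start = now();
  await fn();
  return now() - start;
}

// Hypothetical helper: how many times faster the candidate is than the baseline.
// A value above 1 means the candidate (for example WebGPU) beat the baseline (for example WASM).
function speedup(baselineMs: number, candidateMs: number): number {
  if (candidateMs <= 0) {
    throw new RangeError("candidateMs must be greater than zero");
  }
  return baselineMs / candidateMs;
}
```

In practice you would wrap one inference call per backend in `timeAsync`, run each several times to warm up, and only then compare the numbers with `speedup`.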

The good news is that WebGPU has already shipped in Chrome 113 and Edge 113 for Windows, macOS, and ChromeOS, and in Chrome 121 for Android. That means you can already try ONNX Runtime Web in these browsers.
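Since not every browser has WebGPU yet, it is worth checking for it before relying on it. Here is a minimal sketch of such a check. It relies on the standard `navigator.gpu.requestAdapter()` call; the `GpuLike` and `NavigatorLike` types and both function names are our own stand-ins, used so the example compiles without extra type packages.

```typescript
// Stand-in types (assumptions) covering only the part of WebGPU used here.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// True if the browser exposes the WebGPU entry point at all.
function exposesWebGpu(nav: NavigatorLike | undefined): boolean {
  return nav !== undefined && nav.gpu !== undefined;
}

// True only if the browser exposes WebGPU AND returns a usable adapter.
// requestAdapter() resolves to null when no suitable GPU is available.
async function hasUsableWebGpu(nav: NavigatorLike | undefined): Promise<boolean> {
  if (nav === undefined || nav.gpu === undefined) return false;
  try {
    return (await nav.gpu.requestAdapter()) !== null;
  } catch {
    return false;
  }
}

// In a browser you would call it roughly like this:
// const ok = await hasUsableWebGpu(navigator as unknown as NavigatorLike);
```

If the check fails, falling back to the CPU-based WASM backend is the usual choice.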

The developers explained how to try ONNX Runtime Web on their project page:

The experience utilizing different backends in ONNX Runtime Web is straightforward. Simply import the relevant package and create an ONNX Runtime Web inference session with the required backend through the Execution Provider setting. We aim to simplify the process for developers, enabling them to harness different hardware accelerations with minimal effort.

The following code snippet shows how to call ONNX Runtime Web API to inference a model with WebGPU. Additional ONNX Runtime Web documentation and examples are accessible for delving deeper.
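As a rough illustration of the kind of call the developers describe, here is a minimal sketch of creating an inference session with the WebGPU execution provider. The `OrtLike` and `SessionLike` types and the `createWebGpuSession` function are our own stand-ins for the small slice of the library used here, and the model path is a placeholder; check the official documentation for the exact, current API.

```typescript
// Stand-in types (assumptions) for the part of ONNX Runtime Web used below.
interface SessionLike {
  run(feeds: Record<string, unknown>): Promise<Record<string, unknown>>;
}
interface OrtLike {
  InferenceSession: {
    create(
      modelPath: string,
      options: { executionProviders: string[] }
    ): Promise<SessionLike>;
  };
}

// Hypothetical helper: asks the runtime for a session backed by WebGPU.
async function createWebGpuSession(
  ort: OrtLike,
  modelPath: string
): Promise<SessionLike> {
  return ort.InferenceSession.create(modelPath, {
    executionProviders: ["webgpu"],
  });
}

// With the real library, usage would look roughly like this:
// import * as ort from "onnxruntime-web/webgpu";
// const session = await createWebGpuSession(ort, "./model.onnx"); // placeholder path
// const results = await session.run(feeds); // feeds: input name -> tensor
```

Passing the runtime in as a parameter is only a convenience for this sketch; in a real app you would most likely import the package and call `InferenceSession.create` directly.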

So, now that we have everything in place, let’s start deploying some powerful LLMs in the browser!

What do you think about the new ONNX Runtime Web? Let’s talk about this development in the comments section below.

Claudiu Andone

Windows Troubleshooting Expert

Oldtimer in the tech and science press, Claudiu is focused on whatever comes new from Microsoft.

His abrupt interest in computers started when he saw the first Home Computer as a kid. However, his passion for Windows and everything related became obvious when he became a sys admin in a computer science high school.

With 14 years of experience in writing about everything there is to know about science and technology, Claudiu also likes rock music, chilling in the garden, and Star Wars. May the force be with you, always!
