Useful Sensors launches AI in a Box, aiming to establish a different paradigm for edge computing and TinyML

Would you leave a Google Staff Research Engineer role just because you want your TV to automatically pause when you get up to get a cup of tea? Actually, how is that even relevant, you might ask. Let’s see what Pete Warden, former Google Staff Research Engineer and now CEO and Founder of Useful Sensors, has to say about that.

From Jetpac to Google and TinyML, from Google to AI in a Box

Pete Warden wrote the world’s only mustache detection image processing algorithm. He also was the founder and CTO of startup Jetpac. He raised a Series A from Khosla Ventures, built a technical team, and created a unique data product that analyzed the pixel data of over 140 million photos from Instagram and turned them into in-depth guides for more than 5,000 cities around the world.

Jetpac was acquired by Google in 2014, and Warden was a Google Staff Research Engineer from then until March 2022. That’s when he founded Useful Sensors, which he sees as the evolution of the work he did at Google. There, Warden was the Technical Lead of the TensorFlow Mobile team, responsible for deep learning on mobile and embedded devices.

Warden is sometimes credited with having kickstarted the TinyML subdomain of machine learning. Naturally, much of what he did built on things others were already working on: “A lot of my contribution has been helping publicize and document a bunch of these engineering practices that have emerged,” Warden said. Either way, TinyML is getting big, and Warden is a big part of it.

Tiny machine learning (TinyML) is broadly defined as a fast-growing field of machine learning technologies and applications, including hardware, algorithms and software, capable of performing on-device sensor data analytics at extremely low power, typically in the mW range and below. That enables a variety of always-on use cases and targets battery-operated devices.

Useful Sensors just launched a product called AI in a Box, which it dubs an “offline, private, open source LLM for conversations and more”. Even though it’s not the first product Useful Sensors has created, it’s the first one that’s officially launched. That was a good opportunity to catch up with Warden and talk about what Useful Sensors is working on.

Simplicity and the creepiness factor

While it is true that Warden cited wanting his TV to automatically pause when he gets up to get a cup of tea as part of the reason why he started Useful Sensors, some context is definitely needed. Part of Warden’s motivation for his work on TensorFlow for embedded devices was to see it used in everyday objects.

As Warden related, when he went to talk to companies that made light switches or TVs to tell them “all about this wonderful open source code that they could get for free, and all the conferences and documentation and examples and books”, they would hear him out.

But then in the end, they’d usually say something like: “That’s great. But we barely have a software engineering team, let alone a machine learning team. So can you just give us something that gives us a voice interface or tells us when somebody sits down in front of the TV?”

That’s quite telling, and producing self-contained AI-enhanced hardware is a valid reason to set out on a new venture. However, it is also something Google itself could achieve. Google Pixel, for example, already provides automatic captions running on-device for content that plays on the phone. But there’s something more: privacy and data sovereignty, aka “the creepiness factor”.

In a video hosted on the Useful Sensors home page, Warden mentions how during his tenure at Google he would often get questions about whether Google is spying on people. Those questions are valid ones, triggered by a widely observed phenomenon: when mentioning topic XYZ in the vicinity of your phone, you will often get bombarded with ads about XYZ for days on end.

Warden on his part swears that the code he was working on does not do that. But, he goes on to add, he has no way of proving that because the code is proprietary. Plus, we may add, there’s nothing anyone can say about other parts of Google’s codebase, or other apps for that matter. It’s hard to light-heartedly dismiss such a widely shared experience.

Useful Sensors

That brings us to the core of it all – what Useful Sensors does and how it’s different. The vision, as Warden put it, is to be able to run machine learning locally, and to do it in a private and checkable way. Everything should run locally with no internet connection, so that conversations and data never leave the device. No account, setup, or subscription is needed.

Warden shared that Useful Sensors has already launched the person sensor, a small board that provides an indication of whether there’s a person nearby, as well as a tiny QR code reader. Both run entirely locally and retail at $10 and $7 respectively, Warden said. But these products have something else in common too: they are aimed at makers, i.e. hobbyists with enough motivation and technical skills to tinker with them, but also at electronics vendors.

As Warden shared, Useful Sensors is currently in talks with a number of electronics vendors. Useful Sensors products are being evaluated, and Warden is hopeful that it won’t be too long before they end up being included in devices sold in the market. In fact, that is the audience that holds the greatest promise for Useful Sensors. Its backers apparently see the potential too, as the company has already received $5 million in seed funding.

Warden co-founded Useful Sensors with CTO Manjunath Kudlur, formerly of Cerebras. Kudlur was the compiler team lead at Cerebras as well as one of the founders of Google’s TensorFlow and Nvidia’s CUDA. Warden said Kudlur contributes greatly to things such as accelerating transformer models for Useful Sensors. The team lists a total of 8 people at this point, but if their plans come to fruition, another funding round and growth are well within sight as per Warden.

AI in a Box, the product that Useful Sensors just launched, seems like it was designed to do a number of things. First, it can grow awareness for Useful Sensors, as it’s aimed at makers. As Warden said, people can tinker with the code, but it already does some useful things out of the box. It can provide live captions, receive voice commands, and translate between multiple major languages on the fly.

AI in a Box can also help raise some cash for Useful Sensors. But perhaps more importantly, it positions Useful Sensors as an ecosystem provider. This seems like part of the vision for the company, and Warden shared that he’s hoping people will get creative with AI in a Box. In fact, he added, some examples of things people have built with Useful Sensors products already exist on Hackster.

Under the hood

AI in a Box features a RockChip 3588S SoC with an NPU. The NPU is a unit specifically designed to accelerate neural networks, and the team was able to leverage it to enable a Large Language Model to run locally. AI in a Box is built on a foundation of open source models like Whisper and Llama2. In the same vein, the company is releasing all the code to accelerate and control the system under an open source license.

Useful Sensors’ library for optimized transformer inference on the RockChip NPU is also available. The idea is that transparency should help with security and privacy auditing, with Warden noting he’d be happy to have regulators audit Useful Sensors products. Releasing open source code will also enable developers to use the system as a base to build their own real-time voice input applications in Python.

Warden said that once they were able to get the real-time speech to text working, there was lots of choice around which LLMs they could run locally. The team is also looking into doing some of its own fine-tuning, but they’ve been able to get a long way with just providing prompt contexts for interactions. As Warden noted, anybody who’s familiar with LLMs would probably easily recognize what Useful Sensors did.
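To make the idea of “providing prompt contexts” concrete, here is a minimal sketch in Python. It is not Useful Sensors’ code: the function name `build_prompt`, the speaker labels and the turn limit are all assumptions made for illustration. It only shows the general pattern of prepending a fixed context and a few recent turns to each new transcribed utterance before handing the text to a locally running LLM.

```python
def build_prompt(context, history, user_text, max_turns=4):
    """Assemble a plain-text prompt for a local LLM (hypothetical helper).

    context   -- fixed instructions that steer the model, used instead of fine-tuning
    history   -- list of (speaker, text) tuples from earlier in the conversation
    user_text -- the latest utterance, e.g. the output of speech-to-text
    max_turns -- how many recent turns to keep, so the prompt stays short
    """
    # Keep only the most recent turns; a non-positive limit keeps none.
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = [context.strip(), ""]
    for speaker, text in recent:
        lines.append(f"{speaker}: {text}")
    lines.append(f"User: {user_text}")
    # Ending on the assistant label invites the model to continue from there.
    lines.append("Assistant:")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_prompt(
        "You are an offline assistant. Answer briefly.",
        [("User", "Hello"), ("Assistant", "Hi, how can I help?")],
        "Translate 'good morning' into French.",
    )
    print(prompt)
```

Trimming the history is a design choice that matters on small hardware: a shorter prompt means less work for the model on every turn.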

Real-time speech to text enables AI in a Box to function as a keyboard too, among other things. Having an LLM in the mix opens up a range of possibilities. For example, LLMs are known to be able to interface with APIs. Warden mentioned Raspberry Pi as an example that could enable people to control a number of devices using voice commands.
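As a rough illustration of how an LLM could sit between a voice command and a device, consider the Python sketch below. Everything in it is hypothetical: the JSON reply format, the `HANDLERS` table and the handler functions are assumptions for the sake of the example, not part of AI in a Box or any Raspberry Pi API. The handlers only return strings, where a real setup would call whatever interface the device exposes.

```python
import json


def set_light(state):
    # Stand-in for a real device call (hypothetical).
    return f"light turned {state}"


def set_tv(state):
    # Stand-in for a real device call (hypothetical).
    return f"tv playback {state}"


# Map device names the LLM is allowed to use to the code that acts on them.
HANDLERS = {"light": set_light, "tv": set_tv}


def dispatch(llm_reply):
    """Turn an LLM reply into a device action, ignoring anything unexpected.

    Assumes the model was prompted to answer with JSON such as
    {"device": "light", "state": "on"}.
    """
    try:
        action = json.loads(llm_reply)
        handler = HANDLERS[action["device"]]
        return handler(str(action["state"]))
    except (json.JSONDecodeError, KeyError, TypeError):
        # Free-form text, unknown devices and malformed replies are all dropped.
        return "ignored: reply was not a known action"


if __name__ == "__main__":
    print(dispatch('{"device": "tv", "state": "paused"}'))
    print(dispatch("Sure, I will pause the TV for you."))
```

Restricting the model to a small table of known actions keeps its mistakes harmless: a reply that does not match is simply ignored rather than executed.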

AI, innovation, empowerment

An example of what people are already working on, Warden said, is an actor who is using the company’s person sensor to automate spotlight operation for solo performances. Rather than having to pay someone to operate a spotlight, the actor is hoping it should be possible to automate this. In a way, that’s a perfect metaphor for the double-edged sword that innovation and AI truly are. That may sound like a good idea for the actor, but what about the operator?

“That’s a really big question with anything around innovation. If we’re, quote unquote, making things more efficient, what are the societal impacts of that? A big part of what I’m trying to do is get these technologies into people’s hands so that it’s not just a bunch of engineers who are making these decisions about what we should do.

People can try these models for themselves, and see, for example, how useful, but also how flawed the current generation of LLMs are. I don’t want us technocrats to be the ones making these decisions. I want a well informed public who are actually able to say – hey, this is what we want”, Warden said.

That certainly sounds like a noble aspiration. How compatible it really is with VC backing, a cut-throat competitive landscape dominated by the Googles of the world, the public’s ability to elaborate its own take on things, and administrations’ willingness to take the public into account, remains to be seen.
