July 2, 2024

Decentralized AI Perception: How Collaborative Spatial Computing Brings AI to Retail

At the AWE event in Long Beach, California, in June 2024, Nils Pihl, a pioneer in decentralized AI perception, presented a forward-looking vision for transforming retail through AI. His talk combined new ideas with practical solutions that are already in use in stores.

Key Highlights from the Presentation

  • Decentralized Machine Perception Network: Nils introduced his work on creating a decentralized network that empowers AI to navigate and understand complex environments, surpassing the limitations of traditional GPS.

  • Urban Navigation Challenges: Using examples from cities like New York and Hong Kong, Nils explained the inadequacies of GPS in dense urban settings and the necessity for AI to develop a spatial semantic understanding of the world.

  • Retail Applications: The focus then shifted to retail, where Nils emphasized how decentralized AI can solve critical issues like inventory management, product placement, and navigation inside grocery stores—where traditional GPS falls short.

  • Collaborative Spatial Computing: Nils introduced the concept of decentralized positioning systems hosted by the retailers themselves. This approach addresses privacy concerns and allows for precise and efficient task management within the store.

  • Real-world Impact: Retail employees can save significant time, reduce errors, and even empower cognitively impaired staff by using a collaborative spatial computing tool. AI-generated tasks help streamline operations and improve overall efficiency.

  • Future Implications: Nils concluded with the potential future of robotics and AI, envisioning a hybrid era where human and AI perception work together. This will redefine how we interact with retail spaces and beyond.

Watch the Full Presentation

Nils Pihl's presentation is a must-watch for anyone interested in the future of AI, robotics, and retail. His insights into decentralized AI perception and its practical applications offer a glimpse into a transformative technology that could soon be ubiquitous in our daily lives. Watch the full presentation to explore the groundbreaking advancements and how they can reshape the retail industry.

Don't miss out on this talk. Watch the full presentation now!

Video Transcript: Nils Pihl AWE Presentation - “Decentralized AI Perception: How Collaborative Spatial Computing Brings AI to Retail”

Hello, everyone. Just give me a second to put on my second sneaky private microphone here.

There we go. Hey. And slides. We're all good. Hello, everyone.

I'm thankful for this very manageable crowd. This works well for my social anxiety. I'm Nils Pihl. That's like Phil, but wrong. I'm from Sweden.

I live in Hong Kong, and I work on decentralized AI perception. I'm building a decentralized machine perception network. And I'm going to be talking today about why that is helpful for retail today already, but also some of the things you'll be able to do in the very near future. And, to get us started, I want to start in the future. I want you to join me in the year 2030 and we're chilling on the couch with our humanoid robot.

But it's Saturday morning in the year 2030, and we're out of food, and we don't want to go to the grocery store. So, of course, we want to send our handsome companion to the grocery store on our behalf to buy a bunch of items. Among them, ketchup. Let me tell you the one about a robot that walks into a grocery store to buy ketchup. Let's walk through a little bit how this might work.

The first challenge that your humanoid robot is going to have is - he needs to make it to the grocery store. He leaves your apartment or your house, and he has to make it to the grocery store. How is he going to do that? Maybe he'll use the GPS, right? Use something like Google Maps.

And in a great big parking lot like Long Beach, that might work. But unfortunately, cities get bigger. A city like New York has over 900 buildings taller than 100 meters. I don't know what that is in feet, but it's very tall, over 900. And GPS is a line-of-sight-based technology.

GPS already in a city like New York is not precise enough to help something like a robot get to the grocery store. But it gets a lot worse because cities get a lot bigger. In my hometown of Hong Kong, we don't have 900 buildings taller than 100 meters. We have over 4,000 buildings taller than 100 meters, so using GPS is just out of the question.

But unfortunately, it gets even worse. Cities get even bigger than that. In fact, if you hop in an airplane or a helicopter, maybe not a helicopter, but an airplane, and you look at Hong Kong, this is Hong Kong, by the way, this little part here is Hong Kong. The rest of this is the Greater Bay Area. It's an area smaller than greater metropolitan Los Angeles just for scale, but it's home to 85 million people, and in this frame are more high-rise buildings than in all of North America and all of Europe combined.

So, positioning is about to get tricky. And robots and AI have to understand modern cities. And by the looks of it, the definition of what a modern city is, is going to be determined to some extent by China, not just because their cities are so big, but also because of robot adoption. In the year 2015, which is an alarmingly long time ago now, the Chinese robotic fleet overtook the U.S. robotic fleet in size.

And today, actually two years ago, more than every other robot installed was installed in China. In South Korea, which is a little bit ahead - every 10th worker in manufacturing is a robot already, which is pretty nuts. So robot adoption is happening very, very quickly in Asia, and it's happening very, very quickly in these big cities, and the GPS is just not going to cut it. But even once we've gotten to the grocery store, things get even more complicated because, in the grocery store, there is certainly no indoor GPS.

Google Maps does not know what the inside of your grocery store looks like, and the GPS cannot get a signal there. So robots will get lost looking for groceries just like everyone else. Which, of course, brings us to the big question.

What comes after the GPS? I believe that devices, machines, and artificial intelligence have to start building up their own spatio-semantic understanding of the world.

So spatio-semantic means, on the one hand, spatio, they have to understand the spaceness of things, but also the semantic thingness of things. You need to understand that this is a thing that is moving, and I should not look at those feature points too much in my SLAM algorithm. These are things that are stationary that I can trust. These are things that are going to run me over.
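As an illustration of the point above (not something shown in the talk), here is a minimal Python sketch of using semantic labels to decide which feature points a SLAM pipeline should trust. The `FeaturePoint` type, the label names, and the `DYNAMIC_CLASSES` set are all assumptions made for this example; a real system would get its labels from a segmentation or detection model.

```python
from dataclasses import dataclass

# Hypothetical labels for things that move and should not anchor localization.
DYNAMIC_CLASSES = {"person", "shopping_cart", "vehicle"}


@dataclass(frozen=True)
class FeaturePoint:
    x: float
    y: float
    label: str  # semantic class of the object the point lies on


def split_by_stability(points):
    """Separate points on stationary structure from points on moving things."""
    static = [p for p in points if p.label not in DYNAMIC_CLASSES]
    dynamic = [p for p in points if p.label in DYNAMIC_CLASSES]
    return static, dynamic


points = [
    FeaturePoint(1.0, 2.0, "shelf"),
    FeaturePoint(3.5, 0.4, "person"),
    FeaturePoint(0.2, 5.1, "wall"),
    FeaturePoint(4.0, 4.0, "shopping_cart"),
]

# Static points can feed pose estimation; dynamic ones are obstacles to avoid.
static_points, dynamic_points = split_by_stability(points)
```

The static points are the ones "that I can trust" for positioning, while the dynamic points still matter for not getting run over.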

You need a spatio-semantic reconstruction of the space. Robots and AI will need to perceive the world physically themselves. They need to actually perceive the world. And this gets tricky because this handsome guy and any other robot you'll find on the market is, of course, limited to, duh, the sensors that are onboard the robot itself. So, as our handsome robot friend comes to the grocery store, what's he going to do?

Well, he's going to shuffle around looking for the ketchup like the rest of us, bringing us to the big question. In a big, complicated environment like this, where is the ketchup? Does Google know? Well, for once, Google doesn't know. And the reason why is very interesting.

You see, it turns out that for this particular kind of use case, centralized positioning, centralized visual positioning like the kind provided by Google, by Microsoft, by our sponsor Niantic, is bad actually, both practically, but also for interesting privacy reasons. See, if we look at a store like this, the visual merchandising, meaning where the products are placed and how many and how high up, is one of the best kept secrets of retailers. And retailers are so paranoid about this that many American retailers straight up refuse to store any information related to their visual merchandising on any cloud provider, just will not store it in any cloud. This is super, super sensitive information.

So sensitive that one of our common experiences trying to do some guerrilla marketing is getting thrown out of grocery stores just because we have cameras on. So corporations are people, my friend, and they need privacy just like we do. They don't want Google to know where they keep their stash, and we don't want Google to know where we keep our stash. Centralized visual positioning is a problem. So what's going to have to happen?

How is our robot going to find the ketchup? Of course, it's unacceptable that it will have to walk around the store just looking. And it's not even clear if the store will let the robot come into the store if the robot is going to be recording everything for the next time. So, the store will have to be able to answer spatial questions from the AI. The store needs to be able to answer the robot's question: Where is the ketchup?

Without revealing a whole map of the store. So technically, computers must be able to collaborate to understand the world. The computer of our humanoid robot needs to collaborate with the computer of the store to establish a shared coordinate system that they can reason about space in together. So what if we built a decentralized positioning system? What if we built a positioning system that was self-hosted?

You could host the data yourself. You make it application-agnostic. You make it platform-agnostic. That would work, right? Plot twist.

We built that already. And I'm going to show you some of the things that you can do with that today. This beautiful thing here is a real grocery store. I can unfortunately not tell you which one. But what's cool about this particular picture is this was collaboratively reconstructed by several different... Just iPhones. Just iPhones collaboratively reconstructed this over time. They're just building it up, which is pretty cool because then you have the foundation not just for Sci-Fi things like a self-healing VPS, but you actually have data that spatial AI can look at. What can you do when you know things about the store?

To find out, we built this collaborative and external sense of space. We call it a domain. A domain is an external sense of space that devices connect to, to be able to ask questions about the environment. Where can I walk? Where can I not walk? Where is there occlusion? Where is there not occlusion? Etc.
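To make the idea of a domain and a shared coordinate system more concrete, here is a minimal Python sketch under stated assumptions. The `Domain` class, its `where_is` and `can_walk` methods, and the `Pose2D` type are hypothetical names invented for this example and are not the posemesh SDK. The sketch assumes a flat 2D store, a 1-metre walkability grid, and a device whose pose in the domain's frame is already known.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose2D:
    """Position and heading of a device, expressed in the domain's frame."""
    x: float
    y: float
    theta: float  # radians, counter-clockwise from the domain's x-axis


class Domain:
    """Hypothetical self-hosted domain: answers point queries, never exports its map."""

    def __init__(self, product_locations, blocked_cells):
        self._products = dict(product_locations)  # name -> (x, y) in metres
        self._blocked = set(blocked_cells)        # 1 m grid cells that are not walkable

    def where_is(self, product):
        """Return the product's location in the domain frame, or None if unknown."""
        return self._products.get(product)

    def can_walk(self, x, y):
        """Answer 'where can I walk?' for a single point."""
        return (math.floor(x), math.floor(y)) not in self._blocked


def domain_to_device(point, device_pose):
    """Express a domain-frame point in the device's own frame (x-axis = forward)."""
    dx = point[0] - device_pose.x
    dy = point[1] - device_pose.y
    c, s = math.cos(device_pose.theta), math.sin(device_pose.theta)
    # Rotate by -theta, the inverse of the device's heading.
    return (c * dx + s * dy, -s * dx + c * dy)


store = Domain({"ketchup": (2.0, 5.0)}, blocked_cells={(2, 3)})
robot_pose = Pose2D(x=2.0, y=1.0, theta=math.pi / 2)  # facing the domain's +y axis

target = store.where_is("ketchup")
# Roughly (4.0, 0.0): the ketchup is about 4 m straight ahead of the robot.
target_in_robot_frame = domain_to_device(target, robot_pose)
```

The design choice worth noting is that the store only ever answers individual questions. The visiting device learns where the ketchup is and whether a given spot is walkable, but it never receives the full product map.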

And using this system over the last two years, we've accomplished a couple of really cool things, and so have our customers. For example, we were able to do persistent, shared augmented reality. Very cool, can do a lot of nice, gimmicky marketing things with that.

But on top of that, we built a spatial task management system, which I'll be showing you, which turned out to be a huge deal. We did product search, where you could find any product in the store on your first try, even as a shopper or as a first day employee. We provided that with route optimization. So, if you're a first-day employee who has a whole shopping cart that you have to pick up for someone else, you can do that with route optimization. And we brought in AI generated tasks, and I'll be showing you a lot of these things today, starting with the spatial task management.
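The talk does not say how the route optimization works. As a rough illustration only, the Python sketch below orders a shopping list with a greedy nearest-neighbour heuristic. The product coordinates are made up, the `plan_route` function is a hypothetical name, and straight-line distance is a simplifying assumption, since a real store would need walking distance around shelves.

```python
import math


def plan_route(start, stops):
    """Greedy nearest-neighbour ordering of pick locations (not guaranteed optimal)."""
    remaining = dict(stops)
    position = start
    route = []
    while remaining:
        # Pick whichever remaining item is closest to where we are standing now.
        name = min(remaining, key=lambda n: math.dist(position, remaining[n]))
        position = remaining.pop(name)
        route.append(name)
    return route


# Hypothetical product positions in metres, in the store's coordinate frame.
shopping_list = {
    "ketchup": (12.0, 3.0),
    "toilet paper": (2.0, 8.0),
    "milk": (11.0, 9.0),
    "bread": (3.0, 2.0),
}

route = plan_route((0.0, 0.0), shopping_list)
# route == ["bread", "toilet paper", "milk", "ketchup"]
```

A greedy ordering like this is simple and fast, but it can produce longer routes than an exact solver on larger lists.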

If we could bring the music down just a little bit. So this tool allows them to create tasks for each other with pictures, with descriptions, and a precise location of where the issue is. So you can just scan in if you're on the next shift or you came later, and just navigate to where the issue is. This means the way you run the store is different because you don't have to find each other, you don't have to have a meeting about it. You save a lot on handovers.

In fact, after four months of using this daily, Coop, the store that we just saw, is telling us that they save at least 15 minutes per employee per day on the handovers alone. So that's huge. But also, they tell us that they complete more tasks every day because this way of doing task management is more fun, it's engaging, and it's a little bit less daunting. They are experiencing a lower error rate. There are fewer mistakes in the handover.

Tasks get done right the first time, and they're reporting lower stress levels. It's just easier to deal with this than to, when you start your shift, get a long paper list with the 20 things that you need to do and who's dealing with what in what order. But perhaps most exciting of all, and one of the reasons that the error rate is lower, is they tell us that it really empowers their special needs staff with cognitive impairment. Since they started using this system, their staff members with cognitive impairment are operating at the same level as their peers and are really thriving in the environment.

No more misunderstandings, no more repetition. And they're happy. We're really, really excited about how spatial computing can help empower people. But I also mentioned that we use this for AI-generated tasks. I'm going to show you a very quick video now, but I want to just preface what you're about to see.

You're about to see one webcam among many webcams in the store. Of course, one webcam streaming video over RTSP to an AI server that analyzes and looks for empty shelves. That, in turn, is collaborating with another server that is managing the spatial task management, which in turn is talking to another server, the domain server, to understand the physical environment, and in turn also talking to the device that I'm holding in my hand. And we're going to see something that I think is really remarkable. Have a look.

Stockouts are a leading cause of shopper frustration and lost revenue for retailers. When a shopper can't find what they're looking for on the shelf, they're likely to go buy it online instead. Let’s say that I just bought the last toilet paper on that shelf. There may be more in the back, but for now the shelf is empty. How long until the item is back on the shelf?

Luckily, this store is running Cactus (formerly Convergent), the spatial AI, and it's looking through the store's many cameras, and it has already created a task to fix the stockout, complete with AR navigation so that the staffers can address the issue faster than humanly possible. With spatial AI, even first-day staff members can find every issue and item instantly. Retailers lose close to a trillion dollars in revenue every year due to stockouts. And that's before we factor in the cost of staff training. Spatial AI gives every staffer the superhuman ability to know what needs to get done and where. Auki Labs, share your vision.

So, this is the point of collaborative spatial computing. My phone was aware of things that needed to get done in the store. That was figured out by an AI server over there, and a bunch of web cameras over there, and a domain server over there. All of these things being able to reason about space together so that you don't just get what other computer vision will tell you, which is this camera has detected an empty shelf here in screen space.

But literally, where in physical space exactly is the shelf that we're interested in? Of course, this becomes very, very important for things like robotics. Moving forward, we're very excited about AI-generated tasks. So excited that we decided to partner up with some fantastic guys named Akuret. We have Akuret here today, also.

Akuret makes AI for retailers that looks at your point of sale data and your stock level data to detect opportunities to increase your sales. It's basically AI recommendations for things that you need to go have a look at in the physical store to make sure that things are the way you actually believe they are. So, Akuret can predict things like you not having as much stock as you think, for example. And we're finding that the retailers that act on these recommendations see half a percent to 1.5% increased sales consistently, which is, of course, huge, really, really exciting. So what you get then is what I would like to call the hypothesis of the hybrid era of robotics.

This is something that we're going to have to go through before we have humanoid robots everywhere. The hybrid era of robotics is the idea that we want both human reasoning and AI reasoning. We want human perception, and we want AI perception, but we also just want to use human legs and human arms because those are fantastic. So when you have machine perception and human legs, there are a lot of great things you can do with AI-generated tasks that help you know what to do in the store faster and spatial computing tools that help even the cognitively impaired staff do better. When you merge spatial computing with AI, you bring out the best both in the staff and in the AI.

You get more bang for your buck from your computer vision when your computer vision can place things in actual space, and you get more out of your staff when the things are in actual space as well.
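The step from "an empty shelf here in screen space" to "where in physical space" can be sketched as a ray-plane intersection. The Python example below is an illustration under stated assumptions, not the method used by Cactus: it assumes a pinhole camera with known intrinsics, a known camera pose in the store's frame (the kind of information a domain could supply), and a known plane that the detection lies on. The `pixel_to_world` function and all numbers are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pixel_to_world(pixel, intrinsics, rotation, camera_position,
                   plane_point, plane_normal):
    """Cast a ray through a pixel and intersect it with a known plane.

    rotation is a 3x3 camera-to-world matrix given as three rows.
    Returns the 3D hit point in the store frame, or None if there is none.
    """
    fx, fy, cx, cy = intrinsics
    u, v = pixel
    # Ray direction in the camera frame (z is the optical axis).
    ray_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    ray_world = tuple(dot(row, ray_cam) for row in rotation)

    denom = dot(plane_normal, ray_world)
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the plane
    offset = tuple(p - c for p, c in zip(plane_point, camera_position))
    scale = dot(plane_normal, offset) / denom
    if scale <= 0:
        return None  # plane is behind the camera
    return tuple(c + scale * d for c, d in zip(camera_position, ray_world))


# Hypothetical ceiling camera 3 m up, looking straight down at the floor.
intrinsics = (800.0, 800.0, 320.0, 240.0)           # fx, fy, cx, cy in pixels
looking_down = ((1.0, 0.0, 0.0),
                (0.0, -1.0, 0.0),
                (0.0, 0.0, -1.0))
camera_position = (5.0, 5.0, 3.0)

# A detection 160 px right of the image centre lands about 0.6 m along +x.
hit = pixel_to_world((480.0, 240.0), intrinsics, looking_down, camera_position,
                     plane_point=(0.0, 0.0, 0.0), plane_normal=(0.0, 0.0, 1.0))
```

Once a detection has a position in the shared frame, it can be attached to a task that a staff member's phone, or eventually a robot, can navigate to.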

So, stores are very dynamic and challenging. We are not the first people to realize that there's a big opportunity to bring augmented reality to stores. But visual positioning has been a challenge for many reasons. One, as we already covered, because, well, the retailers don't want a digital twin of their store to exist on someone else's servers. But also they change so much that your digital twin has to be constantly updating.

If you create a digital twin of the store today, you'll have to make a new one in four days. So, visual positioning, as done today, is too brittle. And again, I think to the rescue is collaborative spatial computing and being able to self-host these systems and have self-healing systems where even visiting applications can contribute to maintaining the store. Simply put, AI must collaboratively perceive the world. That's what comes after the GPS.

After the GPS, hopefully, we will see a universal spatial computing protocol, something that's just a part of the Internet that allows different machines and AI to shake hands, exchange coordinate systems, exchange spatial data, and borrow spatial computing resources from each other. And I believe that what will make this possible is this very exciting new trend called decentralized physical infrastructure networks (DePIN). Because if we're going to be self-hosting this, if you're imagining a pair of future AR glasses that connect to the positioning service of whatever venue you connect to, not only could those glasses be a lot smaller, because you can move the spatial compute largely out of the glasses entirely, but also the infrastructure for these glasses is now spread out all over the world. DePIN, I believe, enables civilization-scale infrastructure that is funded, owned, and operated by the people themselves. I think that's very exciting, and I think that the most impactful DePIN is going to be a decentralized machine perception network that can replace the GPS and allow for collaborative spatial computing even in the complex environments of the large modern cities in Asia.

That's why we've been building this protocol, which we call the posemesh. You can build on it too. We have an SDK out. You don't have to use our applications; you can build whatever you like using the posemesh, and we'll never get any data about your spaces. We're building this as a DePIN for AI perception, and I want to just share this launch video with you guys.

Every space has a purpose, created with intent and maintained with care. They can bring out the best in us, and they can challenge us. Auki allows you to create a digital overlay of your space, your own domain. The domain is your canvas, your virtual real estate, making your environment accessible to digital things and AI. Help people navigate and find their way. Help them find each other, help them find the time, help them find a place. With a growing number of applications and an SDK to let your imagination run wild, getting a domain lets you stay ahead of the curve and on the right path. What will you do with your domain?

Thank you. Thank you.

So, coming soon to a store near you: Cactus, the Spatial AI. If you're interested in building your own applications on top of the posemesh, get in touch. We are Auki.ai. If you're a retailer interested in using spatial AI, go to Cactus. Thank you so much for your time.

About Augmented World Expo (AWE)

Our mission is to help the XR community advance Augmented and Virtual Reality technology to further human progress. With the industry successfully reaching the milestone of 1 billion active AR users in 2020, over the next decade, AWE aims to guide the industry’s attention towards our own core objectives of enabling people across the globe to learn, connect, and grow within the XR industry.

The AWE community is committed to pursuing the goal of building a $1 trillion XR industry by 2030 while ensuring that the outcome is a world worth living in.

AWE was first organized in 2010 by augmentedreality.org. Its primary mission was to accelerate the adoption of augmented reality by bringing together the industry: developers, creators, founders, product leads, C-level executives, enthusiasts, media, and analysts.

The first AWE event started with only 300 attendees and a handful of exhibitors. Today, AWE events have grown to thousands of attendees from around the world, hundreds of sponsors and exhibitors, and an extensive network of professionals in Augmented Reality, Virtual Reality, and other immersive technologies.

In 2015 the first AWE ASIA event was organized, in 2016 the first AWE Europe event was held in Germany, and AWE Tel Aviv followed in 2018. AWE also introduced its long-running ‘AWE Nite’ series, and chapters have since been created in 8 countries and 19 cities around the world.

AWE’s annual Auggie Awards continue to be the most recognized AR & VR industry awards in the world since they began in 2010, and today, they continue to showcase the best solutions and innovations in Augmented, Virtual, and Mixed Reality.

Over the past decade, AWE has become the most valuable AR/VR global community, always striving to help its members learn and connect whilst facilitating new business opportunities.


About Auki Labs

Auki is building the posemesh, a decentralized machine perception network for the next 100 billion people, devices and AI on Earth and beyond. The posemesh is an external and collaborative sense of space that machines and AI can use to understand the physical world.

Our mission is to improve civilization’s intercognitive capacity; our ability to think, experience and solve problems together with each other and AI. The greatest way to extend human reach is to collaborate with others. We are building consciousness-expanding technology to reduce the friction of communication and bridge minds.


About the posemesh

The Posemesh is an open-source protocol that powers a decentralized, blockchain-based spatial computing network.

The Posemesh is designed for a future where spatial computing is both collaborative and privacy-preserving. It limits any organization's surveillance capabilities and encourages sovereign ownership of private maps of personal and public spaces.

The decentralization also offers a competitive advantage, especially in shared spatial computing sessions, AR for example, where low latency is crucial. The posemesh is the next step in the decentralization movement, responding as an antidote to the growing power of big tech.

The Posemesh has tasked Auki Labs with developing the software infrastructure of the posemesh.

