INSIDE FACEBOOK REALITY LABS: WRIST-BASED INTERACTION FOR THE NEXT COMPUTING PLATFORM
TL;DR: Last week, we kicked off a three-part series on the future of human-computer interaction (HCI). In the first post, we shared our 10-year vision of a contextually-aware, AI-powered interface for augmented reality (AR) glasses that can use the information you choose to share, to infer what you want to do, when you want to do it. Today, we’re sharing some nearer-term research: wrist-based input combined with usable but limited contextualized AI, which dynamically adapts to you and your environment. Later this year, we’ll address some groundbreaking work in soft robotics to build comfortable, all-day wearable devices and give an update on our haptic glove research.
At Facebook Reality Labs (FRL) Research, we’re building an interface for AR that won’t force us to choose between interacting with our devices and the world around us. We’re developing natural, intuitive ways to interact with always-available AR glasses because we believe this will transform the way we connect with people near and far.
“Imagine being able to teleport anywhere in the world to have shared experiences with the people who matter most in your life — no matter where they happen to be,” says Andrew Bosworth, who leads FRL. “That’s the promise of AR glasses. It’s a fusion of the real world and the virtual world in a way that fundamentally enhances daily life for the better.”
Rather than dragging our attention to the periphery, to a screen in the palm of our hand as mobile phones do, AR glasses will see the world exactly as we see it, placing people at the center of the computing experience for the first time and bringing the digital world to us in three dimensions to help us communicate, navigate, learn, share, and take action in the world.
The future of HCI demands an exceptionally easy-to-use, reliable, and private interface that lets us remain completely present in the real world at all times. That interface will require many innovations in order to become the primary way we interact with the digital world. Two of the most critical elements are contextually-aware AI that understands your commands and actions as well as the context and environment around you, and technology to let you communicate with the system effortlessly — an approach we call ultra-low-friction input. The AI will make deep inferences about what information you might need or things you might want to do in various contexts, based on an understanding of you and your surroundings, and will present you with a tailored set of choices. The input will make selecting a choice effortless — using it will be as easy as clicking a virtual, always-available button through a slight movement of your finger.
But this system is many years off. So today, we’re taking a closer look at a version that may be possible much sooner: wrist-based input combined with usable but limited contextualized AI, which dynamically adapts to you and your environment.
We started imagining the ideal input device for AR glasses six years ago when FRL Research (then Oculus Research) was founded. Our north star was to develop ubiquitous input technology — something that anybody could use in all kinds of situations encountered throughout the course of the day. First and foremost, the system needed to be built responsibly with privacy, security, and safety in mind from the ground up, giving people meaningful ways to personalize and control their AR experience. The interface would also need to be intuitive, always available, unobtrusive, and easy to use. Ideally, it would also support rich, high-bandwidth control that works well for everything from manipulating a virtual object to editing an electronic document. On top of all of this, it would need a form factor comfortable enough to wear all day and energy-efficient enough to keep going just as long.
That’s a long list of requirements. As we examined the possibilities, two things became clear: The first was that nothing that existed at the time came close to meeting all those criteria. The other was that any solution that eventually emerged would have to be worn on the wrist.
Why the wrist
Why the wrist? There are many other input sources available, all of them useful. Voice is intuitive, but it is neither private enough for the public sphere nor reliable enough amid background noise. A separate device you could store in your pocket, like a phone or a game controller, adds a layer of friction between you and your environment. As we explored the possibilities, placing an input device at the wrist became the clear answer: The wrist is a traditional place to wear a watch, meaning it could reasonably fit into everyday life and social contexts. It’s a comfortable location for all-day wear. It’s located right next to the primary instruments you use to interact with the world: your hands. This proximity would allow us to bring the rich control capabilities of your hands into AR, enabling intuitive, powerful, and satisfying interaction.
A wrist-based wearable has the additional benefit of easily serving as a platform for compute, battery, and antennas while supporting a broad array of sensors. The missing piece was finding a clear path to rich input, and a potentially ideal solution was EMG.
EMG — electromyography — uses sensors to translate electrical motor nerve signals that travel through the wrist to the hand into digital commands that you can use to control the functions of a device. These signals let you communicate crisp one-bit commands to your device, a degree of control that’s highly personalizable and adaptable to many situations.
The signals through the wrist are so clear that EMG can detect finger motion of just a millimeter. That means input can be effortless. Ultimately, it may even be possible to sense just the intention to move a finger.
“What we’re trying to do with neural interfaces is to let you control the machine directly, using the output of the peripheral nervous system — specifically the nerves outside the brain that animate your hand and finger muscles,” says FRL Director of Neuromotor Interfaces Thomas Reardon, who joined the FRL team when Facebook acquired CTRL-labs in 2019.
This is not akin to mind reading. Think of it like this: You take many photos and choose to share only some of them. Similarly, you have many thoughts and you choose to act on only some of them. When that happens, your brain sends signals to your hands and fingers telling them to move in specific ways in order to perform actions like typing and swiping. This is about decoding those signals at the wrist — the actions you’ve already decided to perform — and translating them into digital commands for your device. It’s a much faster way to act on the instructions that you already send to your device when you tap to select a song on your phone, click a mouse, or type on a keyboard today.
Dynamic control at the wrist
Initially, EMG will provide just one or two bits of control we’ll call a “click,” the equivalent of tapping on a button. These are movement-based gestures like pinch and release of the thumb and forefinger that are easy to execute, regardless of where you are or what you’re doing, while walking, talking, or sitting with your hands at your sides, in front of you, or in your pockets. Clicking your fingers together will always just work, without the need for a wake word, making it the first ubiquitous, ultra-low-friction interaction for AR.
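To make the idea of a one-bit "click" concrete, here is a minimal sketch of how a pinch might be turned into a discrete event. It is an illustration under stated assumptions, not FRL's decoding method: the function name `detect_clicks`, the window size, and both thresholds are hypothetical, and the input is assumed to be a single normalized signal rather than real multi-channel EMG. The sketch rectifies the signal, smooths it into an envelope, and uses two thresholds (hysteresis) so that one pinch and release produces exactly one click.

```python
from typing import Iterable, List


def detect_clicks(
    samples: Iterable[float],
    window: int = 5,
    press_threshold: float = 0.5,
    release_threshold: float = 0.3,
) -> List[int]:
    """Return the sample indices at which a click (pinch) begins.

    All parameters are illustrative assumptions, not values from the post.
    """
    recent: List[float] = []   # last `window` rectified samples
    pressed = False            # True while a pinch is being held
    clicks: List[int] = []

    for i, x in enumerate(samples):
        recent.append(abs(x))              # rectify: keep magnitude only
        if len(recent) > window:
            recent.pop(0)
        envelope = sum(recent) / len(recent)   # moving-average envelope

        if not pressed and envelope >= press_threshold:
            pressed = True
            clicks.append(i)               # rising edge: emit one click
        elif pressed and envelope <= release_threshold:
            pressed = False                # released: ready for the next click

    return clicks
```

The gap between the press and release thresholds is the design choice that matters here: without it, a signal hovering near a single threshold would fire repeated clicks for one gesture.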
But that’s just the first step. EMG will eventually progress to richer controls. In AR, you’ll be able to actually touch and move virtual UIs and objects, as you can see in this demo video. You’ll also be able to control virtual objects at a distance. It’s a bit like having a superpower such as the Force.
And there’s more to come. It’s highly likely that ultimately you’ll be able to type at high speed with EMG on a table or your lap, perhaps even faster than is possible with a keyboard today. Initial research is promising. In fact, since joining FRL in 2019, the CTRL-labs team has made important progress on personalized models, reducing the time it takes to train custom keyboard models that adapt to an individual’s typing speed and technique.
“The goal of neural interfaces is to upset this long history of human-computer interaction and start to make it so that humans now have more control over machines than they have over us,” Reardon explains. “We want computing experiences where the human is the absolute center of the entire experience.”
Take the QWERTY keyboard as an example. It’s over 150 years old, and it can be radically improved. Imagine instead a virtual keyboard that learns and adapts to your unique typing style (typos and all) over time. The result is a keyboard that slowly morphs to you, rather than you and everyone else in the world learning the same physical keyboard. This will be faster than any mechanical typing interface, and it will be always available because you are the keyboard. And the beauty of virtual typing and controls like clicking is that people are already adept at using them.
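One way to picture a keyboard that "morphs to you" is a decoder whose notion of each key drifts toward where a particular person actually types. The sketch below is a toy under loud assumptions: it works on 2D tap positions rather than EMG signals, and the class `AdaptiveKeyboard`, its `rate` parameter, and the nearest-center rule are all hypothetical choices made for illustration. The post does not describe how the personalized models themselves work.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point = Tuple[float, float]


@dataclass
class AdaptiveKeyboard:
    """Toy per-user keyboard: each key is a point that drifts toward the user's taps."""

    centers: Dict[str, Point]   # current estimated location of each key
    rate: float = 0.2           # how quickly a key moves toward a confirmed tap

    def decode(self, tap: Point) -> str:
        """Return the key whose center is closest to the tap."""
        def squared_distance(key: str) -> float:
            cx, cy = self.centers[key]
            return (cx - tap[0]) ** 2 + (cy - tap[1]) ** 2

        return min(self.centers, key=squared_distance)

    def confirm(self, key: str, tap: Point) -> None:
        """Record that `tap` was meant as `key` and nudge that key toward it."""
        cx, cy = self.centers[key]
        self.centers[key] = (
            cx + self.rate * (tap[0] - cx),
            cy + self.rate * (tap[1] - cy),
        )
```

In this toy, a user who habitually lands between two keys is misread at first; once a correction is confirmed, the intended key moves toward that habit and the same tap decodes as intended.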
Adaptive interfaces and the path to intelligent click
So what’s possible in the nearer term — and how will we get there?
“We believe our wristband wearables may offer a path to ultra-low-friction, always-available input for AR glasses, but they’re not a complete solution on their own — just as the mouse is one piece of the graphical user interface,” says FRL Director of Research Science Hrvoje Benko. “They need to be assisted with intent prediction and user modeling that adapts to you and your particular context in real time.”
What if, rather than clicking through menus to do the thing you’d like to do, the system offered that thing to you and you could confirm it with just a simple “click” gesture? When you combine input microgestures with an adaptive interface, then you arrive at what we call “intelligent click.”
“The underlying AI has some understanding of what you might want to do in the future,” explains FRL Research Science Manager Tanya Jonker. “Perhaps you head outside for a jog and, based on your past behavior, the system thinks you’re most likely to want to listen to your running playlist. It then presents that option to you on the display: ‘Play running playlist?’ That’s the adaptive interface at work. Then you can simply confirm or change that suggestion using a microgesture. The intelligent click gives you the ability to take these highly contextual actions in a very low-friction manner because the interface surfaces something that’s relevant based on your personal history and choices, and it allows you to do that with minimal input gestures.”
This may only save you a few seconds per interaction, but all those seconds add up. And perhaps more importantly, these subtle gestures won’t derail you from your train of thought or flow of movement. Imagine, for example, how much time you’d save if you didn’t have to stop what you’re doing to select and open the right app before engaging with the digital world. For AR glasses to truly improve our lives and let us remain present in the moment, we need an adaptive interface that gently surfaces digital information only when it’s relevant, and then fades naturally into the background.
“Rather than constantly diverting your attention back to a device, the interface should simply come in and out of focus when you need it,” notes Jonker, “and it should be able to regulate its behavior based on your very, very lightweight feedback to the system about the utility of its suggestions to you so that the entire system improves over time.”
It’s a tall order, and a number of technical challenges remain. Building an interface that identifies and interprets context from the user and the world demands advances in machine learning, HCI, and user interface design.
“The system learns something about your location and key objects, like your running shoes, or activity recognition,” says Jonker. “And it learns that, in the past, you’ve often launched your music app when you leave your house with those shoes on. Then, it asks you if you’d like to play your music, and allows you to confirm it with just a click. These more simple and feasible examples are ones that we’re exploring in our current research.”
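The running-shoes example can be sketched as a small suggestion loop: count which action followed each context in the past, offer the most likely one only when it is confident enough, and let a click (or its absence) feed back into the counts. This is a deliberately simple stand-in, not the research system: the class `IntelligentClick`, the context labels, and the 0.6 confidence cutoff are hypothetical, and simple frequency counting stands in for whatever models the team actually uses.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

Context = Tuple[str, ...]   # e.g. ("leaving_home", "running_shoes"); labels are made up


class IntelligentClick:
    """Toy adaptive interface: suggest the action most often taken in a context."""

    def __init__(self, min_confidence: float = 0.6) -> None:
        self.min_confidence = min_confidence
        # counts[context][action] = how often the action followed the context
        self.counts: Dict[Context, Dict[str, float]] = defaultdict(
            lambda: defaultdict(float)
        )

    def observe(self, context: Context, action: str) -> None:
        """Record that the user took `action` in `context`."""
        self.counts[context][action] += 1.0

    def suggest(self, context: Context) -> Optional[str]:
        """Return an action to offer, or None to stay in the background."""
        actions = self.counts.get(context)
        if not actions:
            return None
        total = sum(actions.values())
        best = max(actions, key=lambda a: actions[a])
        if total <= 0 or actions[best] / total < self.min_confidence:
            return None
        return best

    def feedback(self, context: Context, action: str, accepted: bool) -> None:
        """Lightweight feedback: a click reinforces, a dismissal weakens."""
        if accepted:
            self.counts[context][action] += 1.0
        else:
            current = self.counts[context][action]
            self.counts[context][action] = max(0.0, current - 1.0)
```

Returning `None` when confidence is low is the part that matches the goal described above: the interface surfaces a suggestion only when it is likely to be relevant and otherwise stays out of the way.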
Haptics in focus
While ultra-low-friction input like a finger click or microgestures will enable us to interact with adaptive interfaces, we also need a way to close the feedback loop — letting the system communicate back to the user and making virtual objects feel tangible. That’s where haptics come into play.
“From your first grasp at birth all the way to dexterous manipulation of objects and typing on a keyboard, there’s this really rich feedback loop, where you see and do things with your hands and fingers and then you feel sensations coming back as you interact with the world,” says FRL Research Science Director Sean Keller. “We’ve evolved to leverage those haptic signals to learn about the world. It’s haptics that lets us use tools and fine control. From a surgeon using a scalpel to a concert pianist feeling the edges of the keys — it all depends on haptics. With a wristband, it’s the beginning. We can’t reproduce every sensation in the virtual world you might feel when interacting with a real object in the real world, but we’re starting to produce a lot of them.”
Take a virtual bow and arrow. With wrist-based haptics, we’re able to approximate the sensation of pulling back the string of a bow in order to give you confidence that you’re performing the action correctly.
You might feel a series of vibrations and pulses to alert you when you receive an email marked “urgent,” while a normal email might have a single pulse or no haptic feedback at all, depending on your preferences. When a phone call comes in, a custom piece of haptic feedback on the wrist could let you know who’s calling. This would then let you complete an action (in this case, an intelligent click to either pick up the call or send it to voicemail) with little or no visual feedback. These are all examples of haptic feedback helping HCI become a two-way conversation between you and your devices.
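The notification examples above amount to a lookup from events to haptic patterns, with personal preferences taking priority. A minimal sketch follows, assuming a pattern is just a list of (duration, intensity) pulses. The event names, the pulse values, and the `pattern_for` helper are all hypothetical and do not describe any real device API.

```python
from typing import Dict, List, Optional, Tuple

Pulse = Tuple[int, float]   # (duration in milliseconds, intensity from 0.0 to 1.0)

# Illustrative defaults only; none of these values come from the post.
DEFAULT_PATTERNS: Dict[str, List[Pulse]] = {
    "email_urgent": [(80, 1.0), (80, 1.0), (200, 0.8)],   # a series of pulses
    "email_normal": [(80, 0.5)],                          # a single pulse
}


def pattern_for(
    event: str,
    overrides: Optional[Dict[str, List[Pulse]]] = None,
) -> List[Pulse]:
    """Pick the haptic pattern for an event.

    User overrides win over defaults; an empty list means "stay silent".
    Unknown events produce no haptic feedback.
    """
    if overrides and event in overrides:
        return overrides[event]
    return DEFAULT_PATTERNS.get(event, [])
```

A per-caller pattern fits the same shape: an override keyed on something like `"call:alice"` (a made-up key format) would give that caller a recognizable signature on the wrist.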
“Haptics might also be able to convey different emotions — we call this haptic emojis,” adds FRL Research Science Manager Nicholas Colonnese. “If you’re in the right context, different types of haptic feedback could correspond to popular emojis. This could be a new playful way for better social communication.”
We’re currently building a series of research prototypes meant to help us learn about wristband haptics. One prototype is called “Bellowband,” a soft and lightweight wristband named for the eight pneumatic bellows placed around the wrist. The air within the bellows can be controlled to render pressure and vibration in complex patterns in space and time. This is an early research prototype helping us determine the types of haptic feedback worthy of further exploration.
Another prototype, Tasbi (Tactile and Squeeze Bracelet Interface), uses six vibrotactile actuators and a novel wrist squeeze mechanism. Using Bellowband and Tasbi, we have tested a number of virtual interactions, from seeing if people can detect differences in the stiffness of virtual buttons to feeling different textures to moving virtual objects. These prototypes are an important step toward possibly creating haptic feedback that feels indistinguishable from real-life objects and activities. Thanks to a biological phenomenon called sensory substitution, this is in fact possible: Our mind combines the visual, audio, and haptic stimuli to give these virtual experiences new dimensions.
It’s still early days, but the future is promising.
“The edge of haptics research leads us to believe that we can actually enable rich communication,” Keller notes. “People can learn language through touch and potentially through just a wristband. There’s a whole new space that’s just beginning to open up, and a lot of it starts with richer haptic systems on the wrist.”
Privacy, security, and safety as fundamental research questions
In order to build a human-centered interface for AR that can be used practically in everyday life, privacy, security, and safety must be considered fundamental research questions that underlie all of our explorations in wrist-based interaction. We must ask how we can help people make informed decisions about their AR interaction experience. In other words, how do we enable people to create meaningful boundaries between themselves and their devices?
“Understanding and solving the full extent of ethical issues requires society-level engagement,” says Keller. “We simply won’t get there by ourselves, so we aren’t attempting to do so. As we invent new technologies, we are committed to sharing our learnings with the community and engaging in open discussion to address concerns.”
That’s why we support and encourage our researchers to publish their work in peer-reviewed journals — and why we’re telling this story today. We believe that far before any of this technology ever becomes part of a consumer product, there are many discussions to have openly and transparently about what the future of HCI can and should look like.
“We think deeply about how our technologies can positively and negatively impact society, so we drive our research and development in a highly principled fashion,” says Keller, “with transparency and intellectual honesty at the very core of what we do and what we build.”
We’re taking concrete steps to discuss important neuroethical questions in tandem with technology development. Our neuroethics program at FRL Research includes Responsible Foresight workshops where we surface and mitigate potential harms that might arise from a product, as well as Responsible Innovation workshops, which help us identify and take action on potential issues that might arise during development. We collaborate with academic ethicists to help the industry as a whole address those issues, and our embedded ethicists within the team help guide us as we address considerations like data management.
As we continue to explore the possibilities of AR, we’ll also continue to engage our responsible innovation principles as the backbone of every research question we pursue, chief among them: always put people first.
A world of possibilities
With sensors on the wrist, you can interact with virtual objects or control the ambiance of your living room in a nearly frictionless way. And someone born without a hand can even learn to operate a virtual one.
“We limit our creativity, our agency, and our actions in the world based on what we think is possible,” says Reardon. “Being able to do more, faster, and therefore experiment more, create more, explore more — that’s at the heart of the next computing platform.”
We believe people don’t need to choose between the virtual world and the real world. With ultra-low-friction wrist-based input, adaptive interfaces powered by contextually-aware AI, and haptic feedback, we can communicate with our devices in a way that doesn’t pull us out of the moment, letting us connect more deeply with others and enhancing our lives.
“This is an incredible moment, setting the stage for innovation and discovery because it’s a change to the old world,” says Keller. “It’s a change to the rules that we’ve followed and relied upon to push computing forward. And it’s one of the richest opportunities that I can imagine being a part of right now.”