Building Blocks

Week 8 : June 18th - 24th


Text Input

This project will require a standardised method of text input across the platform. It's important to keep the contexts of use in mind, as text input will need to be comfortable and effective while standing, sitting or moving. Text input for the AR platform does not need to be as fast as on other devices such as a PC or laptop, but should aim to be around as fast as on mobile devices. The following will explore a variety of different types of text input.


Physical Keyboards

Physical keyboards are peripheral hardware for text input in devices. Using a physical device in AR would be disadvantageous as the user would need to bring this device with them, adding layers of effort and usability concerns to the platform. AR may not need the speed and efficiency of physical keyboards for text input if the level of text required is low.

QWERTY Keyboard
The traditional QWERTY physical keyboard has become the standard for the majority of text input tasks for computers. When it comes to an augmented reality application, these keyboards become more problematic. Carrying a dedicated peripheral device for AR text input would not be a desirable or viable solution for the majority of users. Physical keyboards require a large amount of space between keys for accurate text input, and while they can be miniaturised, they would still be cumbersome, especially when standing or walking.

Multitap Keyboard
Multitap keyboards were popularised in mobile devices before the implementation of touch screens. By reducing the number of keys and assigning multiple letters to each, the keyboard could be smaller and simpler for mobile use. Users would press each button multiple times to choose a letter. Multitap keypads can also rely on predictive text, which requires the user to press each button only once in sequence.
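The two behaviours described above can be sketched in a few lines. This is only an illustration: it assumes the standard 12-key phone letter assignment, and the function names and the tiny vocabulary in the usage example are hypothetical.

```python
# Standard 12-key phone letter assignment (keys 2-9).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def multitap_decode(presses):
    """Decode groups of repeated presses, e.g. ["44", "444"] -> "hi"."""
    out = []
    for group in presses:
        letters = KEYPAD[group[0]]
        # Pressing past the last letter wraps back to the first one.
        out.append(letters[(len(group) - 1) % len(letters)])
    return "".join(out)

def predictive_candidates(digits, vocabulary):
    """Return the words that match one key press per letter."""
    letter_to_key = {ch: key for key, letters in KEYPAD.items() for ch in letters}
    return [
        word for word in vocabulary
        if len(word) == len(digits)
        and all(letter_to_key.get(ch) == d for ch, d in zip(word, digits))
    ]
```

For example, `multitap_decode(["44", "444"])` gives `"hi"`, while `predictive_candidates("4663", ["good", "home", "gone", "hood", "dog"])` returns four words. That ambiguity is why predictive keypads also need a way to cycle through suggestions.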

Chord Keyboard
The chord keyboard is a small one-handed keyboard which aims to provide the functionality of a full-sized keyboard with far fewer keys. Multiple buttons can be pressed at once to produce characters and symbols. This type of keyboard is completely different to the standard QWERTY keyboard and would require a lot of training to become proficient with. Reduced success due to unfamiliarity with the concept is something to keep in mind when designing text input methods for use in AR.


Digital Keyboards

Digital software keyboards are much more promising for use in AR applications. The following examples are a mix of digital keyboards, bearing in mind the physical input device.

Mobile Software Keyboard
Mobile devices generally rely on a touchscreen with a small digital QWERTY keyboard. The reduced letter accuracy due to the size is accounted for through predictive text and suggestions based on writing habits. People are extremely familiar and comfortable with using this method of text input on a daily basis, happy to compose large emails and messages through the medium. As people already carry their phone with them, there is potential to use their phone as the primary text input for their AR device. Using the phone would ensure that users are already familiar with the concept and wouldn't require them to bring additional devices; however it could be argued that the project would fail to celebrate the real world, keeping people in their phones.

Swipe Typing
Swipe typing is another method for using the phone touchscreen. The user swipes between the letters they want to type; a combination of the swipe pattern and predictive text works to decipher the intended word. This method adds a gestural flow to typing. Mapping this to AR could feel less rigid and structured than a conventional keyboard, potentially giving users a purer real world experience.

Dasher
Dasher is an accessibility typing tool, allowing for text input without a keyboard. The software is directed towards the next letter in the word with predictive text aiding in the input by altering the scale of likely letters. Using Dasher would require some learning but could be applied to AR through hand or eye tracking without needing any other peripheral devices. As Dasher is already an accessibility tool, it would also increase the overall accessibility of the platform. I imagine that the level of focus required for Dasher would be higher than a mobile software keyboard.

Metropolis Keyboard
The Metropolis Keyboard was developed by Zhai et al. in 2000. They used a quantitative study to design the optimal keyboard layout for mobile devices, assessing letter frequency, layout and distance between keys. The Metropolis keyboard is claimed to be 40% faster than QWERTY for mobile applications. QWERTY, however, would likely allow for better performance from novices.
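To show how letter sequences and key distance can be combined into a single layout score, here is a rough sketch. It is not the method from the Metropolis study: it simply averages the Fitts' law index of difficulty, log2(D / W + 1), over consecutive letters in a sample text. It assumes keys sit on a unit grid with no row stagger, and the function names are hypothetical.

```python
import math

def key_positions(rows):
    """Map each letter to (x, y) grid coordinates for a row-based layout."""
    return {ch: (x, y) for y, row in enumerate(rows) for x, ch in enumerate(row)}

def mean_fitts_index(text, rows, key_width=1.0):
    """Average index of difficulty, log2(D / W + 1), between consecutive letters."""
    pos = key_positions(rows)
    letters = [ch for ch in text.lower() if ch in pos]
    total = 0.0
    for a, b in zip(letters, letters[1:]):
        distance = math.dist(pos[a], pos[b])
        total += math.log2(distance / key_width + 1)
    # Guard against texts with fewer than two usable letters.
    return total / max(len(letters) - 1, 1)

QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
score = mean_fitts_index("the quick brown fox", QWERTY_ROWS)
```

A lower score means less finger or pointer travel for that text. Comparing candidate layouts on a large, representative corpus would be needed before drawing any conclusions.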

Senseboard
The Senseboard allows for a full QWERTY keyboard to be virtually mapped to any surface. Finger movement is tracked along with pattern recognition to allow users to type with all ten fingers. The video shows a virtual keyboard being displayed to the user through AR glasses, which could reduce confusion and error rate among people less familiar with key positioning. This method requires a surface to type on, so it would be impossible to use while moving, or while standing without a nearby surface.

AR Mapping
AR could map visual digital keyboard displays to surfaces and allow for typing through finger position. The example below shows a one-handed QWERTY keyboard mapped to the user's arm, interacted with using their dominant hand. This design doesn't require additional hardware, but it doesn't seem to have been considered from a usability perspective. Single finger typing on a normal QWERTY keyboard would lead to slow and deliberate text input.

Gaze Input
A common form of text input is "Gaze & Dwell" or "Gaze & Select", which can be found in AR headsets such as the Microsoft HoloLens. A gaze point in the centre of the user's field of view must be placed over the keyboard letters by moving the head. A letter is then selected by dwelling on it for a moment, or by a gesture. The example below uses a HoloLens with gaze & select on a digital QWERTY keyboard. This method requires some head movement to position the gaze, and the user can only focus on one letter at a time, impacting overall performance.
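The dwell part of this interaction comes down to a small timer. The sketch below is a generic illustration rather than the HoloLens implementation: the class name, the 0.8 second default and the per-frame `update` call are all assumptions.

```python
class DwellSelector:
    """Selects a key once the gaze has rested on it for `dwell_seconds`."""

    def __init__(self, dwell_seconds=0.8):
        self.dwell_seconds = dwell_seconds
        self.current_key = None
        self.entered_at = None
        self.fired = False

    def update(self, key, timestamp):
        """Feed the gazed key (or None) each frame; returns a key when it is selected."""
        if key != self.current_key:
            # Gaze moved to a different key: restart the timer.
            self.current_key = key
            self.entered_at = timestamp
            self.fired = False
            return None
        if (key is not None and not self.fired
                and timestamp - self.entered_at >= self.dwell_seconds):
            self.fired = True  # fire once per dwell, not on every later frame
            return key
        return None
```

The dwell time is the main trade-off: a short dwell causes accidental selections, while a long dwell puts a hard limit on typing speed.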

Laser Projection
Similar to the Senseboard, laser projection offers a visual display of the keyboard. This method also requires a flat surface for use.

Dual Touch Keyboard
The Steam Controller has a unique form of text input which incorporates a touch pad for each thumb. A QWERTY keyboard is split into two halves, with each half being controlled by a touch pad and a trigger for selection. This input method does require some practice, however it has the potential to very quickly input text through a controller. Mapping a similar technique to hand rotation could be a viable gestural approach to text input in AR.

Daisy Wheel Keyboard
Another interesting text input method for controllers is the Daisy Wheel, which I tested with an Xbox 360 controller. The Daisy Wheel only requires two inputs for each letter: a direction on the joystick, and a button for letter selection. The Daisy Wheel provides a large amount of control over text input with very few required inputs.
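The two-input idea can be sketched as a lookup: the stick direction picks a group of characters and a face button picks one character from that group. The alphabetical petal contents below are a hypothetical layout, not the real Daisy Wheel arrangement, and the sketch assumes stick axes in the range -1 to 1 with positive y meaning "up".

```python
import math

# Hypothetical layout: 8 petals of 4 characters each.
PETALS = ["abcd", "efgh", "ijkl", "mnop", "qrst", "uvwx", "yz,.", ":/@-"]
BUTTONS = ["X", "Y", "B", "A"]  # one face button per character in a petal

def petal_index(stick_x, stick_y, petals=8):
    """Petal under the stick; petal 0 is centred on 'up', counting clockwise."""
    angle = math.degrees(math.atan2(stick_x, stick_y)) % 360  # 0 = up, clockwise
    sector = 360 / petals
    # Shift by half a sector so each petal is centred on its direction.
    return int(((angle + sector / 2) % 360) // sector)

def daisy_char(stick_x, stick_y, button):
    """Character selected by a stick direction plus a face button."""
    return PETALS[petal_index(stick_x, stick_y)][BUTTONS.index(button)]
```

With this layout, pushing the stick up and pressing X gives `"a"`, and pushing it right and pressing B gives `"k"`. Each petal only spans 45 degrees of stick travel, so small aiming errors land on a neighbouring petal.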

Japanese Radial
A GitHub member called Yotokun developed a Japanese text input radial menu for Oculus Touch. Using the controller with hand roll, pitch and position, Japanese characters can be selected. There is also an English version available. Typing based on hand position and orientation could be a viable strategy for AR text input and is something I want to test as soon as possible.


Speech Input

Speech commands could be combined with physical text input methods to add a multimodal layer of interaction over the system. In situations where a user's hands aren't free or they are in private locations, speech input could be a useful asset. When the information being entered is confidential or private, or the user is in a public space and not comfortable issuing voice commands, manual input could be preferred.



The Dual Touch & Daisy Wheel methods take advantage of the capabilities of the controllers they're designed for and really showcase how text input should be considered when designing for AR. New methods need to be explored, rather than simply applying what works for other devices. I want to explore how hands could be used for effective text input in AR.

I intend to create video mockups and rough interactive prototypes to test different forms of text input, aiming to find what works from a contextual usability point of view.




Video Prototype - Radial Hand Input

I put together a very quick and rough proof of concept video prototype to get across the basic idea of a hand controlled radial text input. Letter selection is based on hand rotation: the currently selected letter is highlighted and the user can pinch to type it. The concept in the video would require an optimised letter layout based on predictive text. Perhaps the letters should rearrange themselves based on the next most likely input, or more common letters could be placed towards the centre.
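As a rough illustration of the interaction, the sketch below maps a hand roll angle onto a letter and types that letter when a pinch starts. The roll range of -90 to 90 degrees, the alphabetical ordering and the function names are placeholders, not values from the video prototype.

```python
import string

LETTERS = string.ascii_lowercase  # placeholder ordering, not an optimised layout

def highlighted_letter(roll_degrees, min_roll=-90.0, max_roll=90.0, letters=LETTERS):
    """Map a hand roll angle onto one of the letters spread across the roll range."""
    clamped = max(min_roll, min(max_roll, roll_degrees))
    fraction = (clamped - min_roll) / (max_roll - min_roll)
    index = min(int(fraction * len(letters)), len(letters) - 1)
    return letters[index]

def type_frames(frames):
    """frames: iterable of (roll_degrees, pinching). A letter is typed when a pinch starts."""
    typed, was_pinching = [], False
    for roll, pinching in frames:
        if pinching and not was_pinching:
            typed.append(highlighted_letter(roll))
        was_pinching = pinching
    return "".join(typed)
```

With 26 letters over 180 degrees, each letter gets under 7 degrees of rotation, which is why a layered or frequency-based layout would probably be needed in practice.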

Assuming the hardware allowed for total hand tracking, this concept could be easily used while seated, standing or walking. It is a subtle movement when in public, and would not become tiring over time as the hand wouldn't need to be raised.


I see a lot of ways that a concept like this could be developed for AR text input:

  1. This concept could be adapted to work with one or two hands, giving further speed and control over text input.
  2. A fist could be made to type a space.
  3. Hand pitch could add layers of letters to reduce the rotational accuracy and range required (similar to the daisy wheel).
  4. Pinching each finger (index, middle, ring, pinky) could select a different letter from a group of 4 (also similar to the daisy wheel).
  5. Letter layout optimisation.




Usability Testing - Steam Controller & XBOX Typing

As part of testing a gestural form of text input, I conducted some usability tests with the Steam and daisy wheel keyboards. The testing was a limited study of 4 participants including myself.

The test was designed as follows:

  1. Participants were given a list of four sentences to type and the opportunity to familiarise themselves with the content and wording. The sentences were:
    • The quick brown fox jumps over the lazy dog.
    • Now is the time for all good men to come to the aid of their country.
    • Pack my box with five dozen liquor jugs.
    • A journey of a thousand miles begins with a single step.
  2. After becoming familiar with the sentences and nature of the test, they were asked to type the sentences on their mobile devices using the conventional mobile QWERTY keyboard. The time for each sentence was recorded and the number of errors noted.
  3. The participants repeated the test using the XBOX daisy wheel keyboard.
  4. The participants repeated the test using the Steam keyboard.
  5. After completing the sentences for each method, each participant was asked a series of debriefing questions to gauge their opinions on the method of text input.
  6. On completion of the tests, participants were asked to rate their preference for each method, along with the perceived efficiency and learning difficulty.
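The two measures recorded in the test, completion time and errors, can be turned into comparable numbers with a short script. This is a sketch under assumptions: it uses the common convention of five characters per "word" for typing speed, and it counts errors as the edit distance between the target sentence and what was typed, which may differ from how errors were noted by hand during the sessions.

```python
def words_per_minute(text, seconds):
    """Conventional typing-speed measure: one 'word' is five characters."""
    return (len(text) / 5) / (seconds / 60)

def edit_distance(target, typed):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(typed) + 1))
    for i, t_char in enumerate(target, start=1):
        current = [i]
        for j, y_char in enumerate(typed, start=1):
            cost = 0 if t_char == y_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from typed text
                current[j - 1] + 1,      # extra character in typed text
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]
```

For instance, "Pack my box with five dozen liquor jugs." is 40 characters, so typing it in 60 seconds works out to 8 words per minute.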

The graphs below show the completion time of each sentence and the rate of errors for each participant.

The differences in performance between the input methods were likely due to familiarity. All of the participants use mobile QWERTY keyboards on a daily basis and have become accustomed to the typing interface, while none of the participants had typed using the daisy wheel or Steam controller methods before. Completion times and error rates were considerably lower for the mobile keyboard input.

The daisy wheel technique performed the worst across all participants. The combination of a new keyboard layout and a new typing interface made for a very slow and high focus experience. The accuracy required by the analog stick on the controller led to a large number of misclicks and errors throughout the test, with users commenting on how precise they needed to be. All participants disliked the daisy wheel input method, feeling that it was inefficient and difficult to learn.

The Steam keyboard performed better than the daisy wheel across every sentence for every user, with increased speed and reduced error rate. While the Steam keyboard was considerably slower than the mobile keyboard, the participants felt that they could learn the method over time and eventually become proficient with it. Interestingly, even though it was slower, the Steam keyboard was preferred over mobile for one user, and equal for another. When asked why, they stated that they enjoyed how it felt to use and could feel a good flow with how they were typing.

While these tests were very quick and had few participants, I believe they gave good insight into how a text input method should feel:

  • A level of familiarity helps the user to transfer their current skill and knowledge (Steam uses a QWERTY layout while the daisy wheel is completely new)
  • The typing experience can be ranked highly even if it's less efficient (Steam keyboard was ranked above or comparable to the mobile keyboard)
  • Feedback is linked to experience & performance (The Steam controller provided haptic feedback and freedom of movement, while the daisy wheel was slightly unresponsive and made it difficult to tell when a key had been pressed)
  • Error rate reduces experience quality (The daisy wheel led to irritation towards the end of the test)




Gestural Typing

After testing the Steam and daisy wheel keyboards, I wanted to try creating a fully gestural keyboard that a user could control with their hands. Using my Leap Motion and Processing, I coded the first version of this gestural interface.

Prototype 1 - Gestural QWERTY

The first version tracks both hands. The right hand is used for character selection: rotating the hand moves the selection left and right, and tilting the hand up or down accesses the other rows of the keyboard. The left hand types the selected character by pinching, adds a space by tilting down and a backspace by tilting up.
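The right-hand mapping can be sketched as below. The prototype itself was written in Processing, so this Python version is only an illustration: the 20 degree tilt threshold, the 60 degree roll range and the function name are assumptions, and the roll and pitch values would come from whatever hand tracking is in use.

```python
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

def select_key(roll_degrees, pitch_degrees, pitch_threshold=20.0, roll_range=60.0):
    """Right hand: pitch picks the row, roll picks the column within that row."""
    if pitch_degrees > pitch_threshold:
        row = QWERTY_ROWS[0]      # tilted up: top row
    elif pitch_degrees < -pitch_threshold:
        row = QWERTY_ROWS[2]      # tilted down: bottom row
    else:
        row = QWERTY_ROWS[1]      # neutral: home row
    clamped = max(-roll_range, min(roll_range, roll_degrees))
    fraction = (clamped + roll_range) / (2 * roll_range)
    return row[min(int(fraction * len(row)), len(row) - 1)]
```

With these numbers, `select_key(0, 0)` lands on `"g"` but `select_key(0, 25)` lands on `"y"`: a few degrees of unintended tilt while rotating is enough to jump rows.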

I used a QWERTY keyboard layout for the prototype to increase familiarity of character location. After some testing, major problems became very apparent. When rotating your right hand, it is natural to also tilt the hand upwards or downwards, which causes a lot of difficulty in character selection. Hand tracking limitations with the hardware caused some difficulty too; when users moved one hand over the other, the tracking sometimes failed.

I wanted to create a prototype which felt more natural, making more use of actual gestures.

Prototype 2 - Gestural QWERTY 2

For this prototype I removed keyboard row selection through tilting, replacing it with row selection through finger extension. Extending 0 or 1 fingers selects the top row, 2 fingers selects the second row, and 3 or more fingers selects the bottom row. This method felt similar to playing a musical instrument or using sign language to type, and the concept was enjoyed by those who used it. However, some people did have trouble getting used to completely gestural input.
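The finger-count rule is simple to express in code. In the sketch below, `row_index` follows the rule described above, while the `SmoothedRow` majority vote over recent frames is an added assumption (not part of the prototype as described) showing one way to damp flicker when tracking briefly miscounts fingers.

```python
from collections import Counter, deque

QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

def row_index(extended_fingers):
    """0-1 fingers -> top row, 2 -> middle row, 3 or more -> bottom row."""
    if extended_fingers <= 1:
        return 0
    if extended_fingers == 2:
        return 1
    return 2

class SmoothedRow:
    """Majority vote over recent frames (an assumed way to damp tracking flicker)."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, extended_fingers):
        """Record this frame's finger count and return the most common recent row."""
        self.history.append(row_index(extended_fingers))
        return Counter(self.history).most_common(1)[0][0]
```

Calling `update` once per tracking frame and indexing `QWERTY_ROWS` with the result keeps the selected row steady through a single misread frame, at the cost of a short delay when the user really does change row.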

I tested myself on both prototypes to assess their efficiency against the other methods I had already tested. I found the efficiency to be comparable to the daisy wheel Xbox input (especially when cross referenced with other participant metrics) but with a far superior experience and flow to it. The error rate is far higher primarily due to limitations in tracking hardware and software accuracy.

Even though the gestural typing methods performed worse than the other methods in this test, with some improvements I believe it could be a viable form of text input for AR:

  • Improved hand tracking technology: A company called CTRL-Labs is working on a wearable (CTRL-Kit) which measures electrical pulses along the neurons in the arm. This could track hand movements without cameras, or even register intended movements without any physical motion.
  • Optimised Keyboard Layout: With a keyboard layout more attuned to gestural input, I can see the efficiency of the method rising significantly.
  • Single Hand Typing: I was unable to create a comparable single-handed prototype using the technology I had. However, the CTRL-Kit could bypass camera tracking and unlock the full potential of one-handed gestural typing.
  • Hand Position: My prototype required the user's hands to be raised to elbow level for accurate hand tracking. Once again, CTRL-Kit could allow the user's hand to be wherever they want.

Prototype 3 - Letter Frequency Keyboard

I created a keyboard based on the most frequently used letters in typing. This prototype still used the same extended finger and hand rotation controls. However, to increase the accuracy of character selection on each row, I increased the row count to 5. Below is the keyboard used for this prototype.
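One simple way to derive a five-row layout from letter frequency is sketched below. It is not the layout used in the prototype: the frequency ordering is an approximate, commonly quoted ordering for English that varies by corpus, and the function name is hypothetical.

```python
# Approximate English letter order, most to least frequent (varies by corpus).
FREQUENCY_ORDER = "etaoinshrdlcumwfgypbvkjxqz"

def build_rows(order=FREQUENCY_ORDER, row_count=5):
    """Deal letters into rows so the most frequent letters land on the first row."""
    base, extra = divmod(len(order), row_count)
    rows, start = [], 0
    for r in range(row_count):
        # Earlier rows take one extra letter when the split is uneven.
        size = base + (1 if r < extra else 0)
        rows.append(order[start:start + size])
        start += size
    return rows
```

With 26 letters and 5 rows, each row holds 5 or 6 letters instead of the 7 to 10 on a QWERTY row, so each letter gets a wider slice of the hand's rotation.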

Much higher speed and accuracy were achievable with this iteration of the prototype. I reduced my total completion time by about 27% (258 seconds vs 352 seconds) and my error count by almost 80% (4 errors vs 19 errors) in comparison to the previous gestural QWERTY prototype. This improvement is likely due to the reduced accuracy and movement required to select each letter, a result of the optimised keyboard layout and fewer letters per row, but increased familiarity with the gestural interface could also be a factor.
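Working through the raw figures quoted above (352 and 258 seconds, 19 and 4 errors) makes the size of the improvement explicit. The helper names are only for illustration.

```python
def percent_reduction(before, after):
    """How much smaller `after` is than `before`, as a percentage of `before`."""
    return (before - after) / before * 100

def percent_speed_increase(before_seconds, after_seconds):
    """Speed gain implied by finishing the same task in less time."""
    return (before_seconds / after_seconds - 1) * 100

time_saved = percent_reduction(352, 258)        # about 26.7% less time
speed_gain = percent_speed_increase(352, 258)   # about 36.4% faster
fewer_errors = percent_reduction(19, 4)         # about 78.9% fewer errors
```

The time saved and the speed gained are different percentages of the same change, so it is worth stating which one is being reported.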

Proposed Gestural Concept

My proposed gestural input method for text input in AR is a one handed approach combined with a suitable keyboard layout. The layout below takes frequently used letters and letter combinations into account and places them near the neutral hand position. Each row is controlled by individual fingers and characters are selected by moving the specified finger to the palm of the hand. Space and backspace are controlled by moving the thumb to the index finger. Letter columns are selected by rotating the hand.

Ergonomic testing showed that moving the pinky finger to the palm is not always a comfortable or easy action. The least commonly used letters have been placed on the pinky row, and the number of letters on that row has been reduced. The video below shows how this concept might look.

This concept would allow a user to type while sitting, standing or walking, with a single hand by their side. Predictive text would be combined with the concept to considerably increase typing speed and reduce required accuracy. Users would receive haptic feedback to character input through touching their fingers to their palm.
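A minimal sketch of the selection logic, under stated assumptions: the letters assigned to each finger are a hypothetical stand-in for the layout in the image, the 45 degree roll range is arbitrary, and space and backspace are left out. Each row is ordered so that its most frequent letter sits at the neutral hand position, with less frequent letters further out.

```python
# Hypothetical assignment: most frequent letters first, fewest letters on the pinky.
FINGER_LETTERS = {
    "index": "etaoins",
    "middle": "hrdlcum",
    "ring": "wfgypbv",
    "pinky": "kjxqz",
}

def centre_out(letters):
    """Order letters so the first (most frequent) sits in the middle of the row."""
    row = []
    for i, ch in enumerate(letters):
        if i % 2 == 0:
            row.append(ch)      # alternate: place on the right...
        else:
            row.insert(0, ch)   # ...then on the left
    return "".join(row)

def select(finger, roll_degrees, roll_range=45.0):
    """Finger touching the palm picks the row; hand roll picks the column."""
    row = centre_out(FINGER_LETTERS[finger])
    clamped = max(-roll_range, min(roll_range, roll_degrees))
    fraction = (clamped + roll_range) / (2 * roll_range)
    return row[min(int(fraction * len(row)), len(row) - 1)]
```

With the hand at rest (roll of 0), touching the index finger to the palm gives `"e"` and touching the pinky gives `"k"`, so the most common letter on each row needs no rotation at all.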





CTRL-Kit by CTRL-Labs is being pitched as a non-invasive neural interface wearable which deciphers information from neurons through machine learning and surface electromyography (EMG). This technology will allow the user to transfer hand movement into digital information, but it also claims to allow you to control digital content without physically moving. In the video below there is an example of someone playing Space Invaders with one hand, without moving their hand.

"Once you go to the nervous system directly, you break the bounds of sequential interaction."

Combining a technology like this with AR would give users a huge amount of control over AR content interactions. All controls could be mapped to a single hand which is constantly tracked without the need for cameras. This consistent tracking would ensure that lighting, occlusion or other limitations with camera tracking technology could be overcome.




Chosen Text Input Modalities

For this project, text input can be broken down into a primary and secondary method of interaction. Multiple input modalities will increase accessibility and versatility when in different contexts.

Primary Input - Speech: Speech recognition software is becoming more effective at interpreting long strings of spoken language. When it comes to fast and effective text input, speech would perform well and be suitable for the majority of contexts.
Secondary Input - Gestural: Gestural input suits contexts where speech wouldn't be ideal, such as loud environments, situations where silence is required or socially expected, and when inputting passwords or sensitive information. It also serves users who dislike speech input and prefer gesture.

The gestural input would be interpreted through a wrist worn wearable with technology such as CTRL-Kit. For this project, actual physical control gestures will be addressed rather than the "thought" control interactions as I have yet to see the capabilities of this technology.

