TL;DR: Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction, to enable real-time multi-touch interactions anywhere.
Abstract: KinectFusion enables a user holding and moving a standard Kinect camera to rapidly create detailed 3D reconstructions of an indoor scene. Only the depth data from Kinect is used to track the 3D pose of the sensor and reconstruct geometrically precise 3D models of the physical scene in real time. The capabilities of KinectFusion, as well as the novel GPU-based pipeline, are described in full. Uses of the core system for low-cost handheld scanning, geometry-aware augmented reality, and physics-based interactions are shown. Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction. These extensions are used to enable real-time multi-touch interactions anywhere, allowing any planar or non-planar reconstructed physical surface to be appropriated for touch.
TL;DR: OmniTouch is a wearable depth-sensing and projection system that enables interactive multitouch applications on everyday surfaces, making it conceivable that anything one can do on today's mobile devices could be done in the palm of one's hand.
Abstract: OmniTouch is a wearable depth-sensing and projection system that enables interactive multitouch applications on everyday surfaces. Beyond the shoulder-worn system, there is no instrumentation of the user or environment. Foremost, the system allows the wearer to use their hands, arms and legs as graphical, interactive surfaces. Users can also transiently appropriate surfaces from the environment to expand the interactive area (e.g., books, walls, tables). On such surfaces - without any calibration - OmniTouch provides capabilities similar to that of a mouse or touchscreen: X and Y location in 2D interfaces and whether fingers are "clicked" or hovering, enabling a wide variety of interactions. Reliable operation on the hands, for example, requires buttons to be 2.3cm in diameter. Thus, it is now conceivable that anything one can do on today's mobile devices, they could do in the palm of their hand.
TL;DR: This work presents a general purpose framework for micro-task markets that provides a scaffolding for more complex human computation tasks which require coordination among many individuals, such as writing an article.
Abstract: Micro-task markets such as Amazon's Mechanical Turk represent a new paradigm for accomplishing work, in which employers can tap into a large population of workers around the globe to accomplish tasks in a fraction of the time and money of more traditional methods. However, such markets have been primarily used for simple, independent tasks, such as labeling an image or judging the relevance of a search result. Here we present a general purpose framework for accomplishing complex and interdependent tasks using micro-task markets. We describe our framework, a web-based prototype, and case studies on article writing, decision making, and science journalism that demonstrate the benefits and limitations of the approach.
TL;DR: This paper develops techniques that recruit synchronous crowds in two seconds and use them to execute complex search tasks in ten seconds, and offers empirically derived guidelines for a low-cost retainer system that produces on-demand crowds.
Abstract: Interactive systems must respond to user input within seconds. Therefore, to create realtime crowd-powered interfaces, we need to dramatically lower crowd latency. In this paper, we introduce the use of synchronous crowds for on-demand, realtime crowdsourcing. With synchronous crowds, systems can dynamically adapt tasks by leveraging the fact that workers are present at the same time. We develop techniques that recruit synchronous crowds in two seconds and use them to execute complex search tasks in ten seconds. The first technique, the retainer model, pays workers a small wage to wait and respond quickly when asked. We offer empirically derived guidelines for a retainer system that is low-cost and produces on-demand crowds in two seconds. Our second technique, rapid refinement, observes early signs of agreement in synchronous crowds and dynamically narrows the search space to focus on promising directions. This approach produces results that, on average, are of more reliable quality and arrive faster than the fastest crowd member working alone. To explore benefits and limitations of these techniques for interaction, we present three applications: Adrenaline, a crowd-powered camera where workers quickly filter a short video down to the best single moment for a photo; and Puppeteer and A|B, which examine creative generation tasks, communication with workers, and low-latency voting.
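The rapid refinement idea described above (detect early agreement, then narrow the search space) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `refine`, the one-dimensional search range (for example, a position on a video timeline), and the `agree_frac` and `shrink` parameters are all assumptions made for illustration.

```python
def refine(lo, hi, positions, agree_frac=0.33, shrink=0.25):
    """Narrow the range [lo, hi] when enough workers cluster together.

    positions  -- current position of each worker inside [lo, hi]
    agree_frac -- hypothetical fraction of workers that must agree
    shrink     -- hypothetical size of the narrowed window, relative to the range
    """
    if not positions:
        return (lo, hi)
    width = (hi - lo) * shrink
    best_start, best_count = lo, 0
    for p in sorted(positions):
        # Candidate window starts at a worker's position, clamped to the range.
        start = min(max(p, lo), hi - width)
        inside = sum(1 for q in positions if start <= q <= start + width)
        if inside > best_count:
            best_start, best_count = start, inside
    if best_count >= agree_frac * len(positions):
        return (best_start, best_start + width)  # agreement: narrow the search
    return (lo, hi)                              # no agreement yet: keep the range


# Three of five workers are looking near 42-47, so the range narrows around them.
print(refine(0, 100, [10, 42, 44, 47, 90], agree_frac=0.5))
```

Calling `refine` repeatedly as worker positions update would shrink the range step by step; when workers are spread out, the range is left unchanged.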
TL;DR: ReVision is a system that automatically redesigns visualizations to improve graphical perception, and applies perceptually-based design principles to populate an interactive gallery of redesigned charts.
Abstract: Poorly designed charts are prevalent in reports, magazines, books and on the Web. Most of these charts are only available as bitmap images; without access to the underlying data it is prohibitively difficult for viewers to create more effective visual representations. In response we present ReVision, a system that automatically redesigns visualizations to improve graphical perception. Given a bitmap image of a chart as input, ReVision applies computer vision and machine learning techniques to identify the chart type (e.g., pie chart, bar chart, scatterplot, etc.). It then extracts the graphical marks and infers the underlying data. Using a corpus of images drawn from the web, ReVision achieves image classification accuracy of 96% across ten chart categories. It also accurately extracts marks from 79% of bar charts and 62% of pie charts, and from these charts it successfully extracts data from 71% of bar charts and 64% of pie charts. ReVision then applies perceptually-based design principles to populate an interactive gallery of redesigned charts. With this interface, users can view alternative chart designs and retarget content to different visual styles.
TL;DR: A bimanual hand tracking system that provides physically-motivated 6-DOF control for 3D assembly and solves the pose estimation problem with efficient queries of a precomputed database that relates hand silhouettes to their 3D configuration.
Abstract: Computer Aided Design (CAD) typically involves tasks such as adjusting the camera perspective and assembling pieces in free space that require specifying 6 degrees of freedom (DOF). The standard approach is to factor these DOFs into 2D subspaces that are mapped to the x and y axes of a mouse. This metaphor is inherently modal because one needs to switch between subspaces, and disconnects the input space from the modeling space. In this paper, we propose a bimanual hand tracking system that provides physically-motivated 6-DOF control for 3D assembly. First, we discuss a set of principles that guide the design of our precise, easy-to-use, and comfortable-to-use system. Based on these guidelines, we describe a 3D input metaphor that supports constraint specification classically used in CAD software, is based on only a few simple gestures, lets users rest their elbows on their desk, and works alongside the keyboard and mouse. Our approach uses two consumer-grade webcams to observe the user's hands. We solve the pose estimation problem with efficient queries of a precomputed database that relates hand silhouettes to their 3D configuration. We demonstrate efficient 3D mechanical assembly of several CAD models using our hand-tracking system.
TL;DR: Pause-and-Play is a system that helps users work along with existing video tutorials; it uses computer vision to detect events in the videos and application scripting APIs to obtain real-time usage traces.
Abstract: Video tutorials provide a convenient means for novices to learn new software applications. Unfortunately, staying in sync with a video while trying to use the target application at the same time requires users to repeatedly switch from the application to the video to pause or scrub backwards to replay missed steps. We present Pause-and-Play, a system that helps users work along with existing video tutorials. Pause-and-Play detects important events in the video and links them with corresponding events in the target application as the user tries to replicate the depicted procedure. This linking allows our system to automatically pause and play the video to stay in sync with the user. Pause-and-Play also supports convenient video navigation controls that are accessible from within the target application and allow the user to easily replay portions of the video without switching focus out of the application. Finally, since our system uses computer vision to detect events in existing videos and leverages application scripting APIs to obtain real time usage traces, our approach is largely independent of the specific target application and does not require access or modifications to application source code. We have implemented Pause-and-Play for two target applications, Google SketchUp and Adobe Photoshop, and we report on a user study that shows our system improves the user experience of working with video tutorials.
TL;DR: A magnetic control system that can levitate and actuate a permanent magnet in a pre-defined 3D volume is developed, combined with an optical tracking and display system that projects images on the levitating object.
Abstract: This paper presents ZeroN, a new tangible interface element that can be levitated and moved freely by computer in a three dimensional space. ZeroN serves as a tangible representation of a 3D coordinate of the virtual world through which users can see, feel, and control computation. To accomplish this, we developed a magnetic control system that can levitate and actuate a permanent magnet in a pre-defined 3D volume. This is combined with an optical tracking and display system that projects images on the levitating object. We present applications that explore this new interaction modality. Users are invited to place or move the ZeroN object just as they can place objects on surfaces. For example, users can place the sun above physical objects to cast digital shadows, or place a planet that will start revolving based on simulated physical conditions. We describe the technology and interaction scenarios, discuss initial observations, and outline future development.
TL;DR: Jabberwocky is a social computing stack with three components: Dormouse, a human and machine resource management system; ManReduce, a parallel programming framework for human and machine computation; and Dog, a high-level programming language on top of ManReduce.
Abstract: We present Jabberwocky, a social computing stack that consists of three components: a human and machine resource management system called Dormouse, a parallel programming framework for human and machine computation called ManReduce, and a high-level programming language on top of ManReduce called Dog. Dormouse is designed to enable cross-platform programming languages for social computation, so, for example, programs written for Mechanical Turk can also run on other crowdsourcing platforms. Dormouse also enables a programmer to easily combine crowdsourcing platforms or create new ones. Further, machines and people are both first-class citizens in Dormouse, allowing for natural parallelization and control flows for a broad range of data-intensive applications. And finally and importantly, Dormouse includes notions of real identity, heterogeneity, and social structure. We show that the unique properties of Dormouse enable elegant programming models for complex and useful problems, and we propose two such frameworks. ManReduce is a framework for combining human and machine computation into an intuitive parallel data flow that goes beyond existing frameworks in several important ways, such as enabling functions on arbitrary communication graphs between human and machine clusters. And Dog is a high-level procedural language written on top of ManReduce that focuses on expressivity and reuse. We explore two applications written in Dog: bootstrapping product recommendations without purchase data, and expert labeling of medical images.
TL;DR: This work develops and reports on a collection of interactions enabled by Medusa in support of multi-user collaborative design, specifically within the context of Proxi-Sketch, a multi-user UI prototyping tool.
Abstract: We present Medusa, a proximity-aware multi-touch tabletop. Medusa uses 138 inexpensive proximity sensors to: detect a user's presence and location, determine body and arm locations, distinguish between the right and left arms, and map touch points to specific users and specific hands. Our tracking algorithms and hardware designs are described. Exploring this unique design, we develop and report on a collection of interactions enabled by Medusa in support of multi-user collaborative design, specifically within the context of Proxi-Sketch, a multi-user UI prototyping tool. We discuss design issues, system implementation, limitations, and generalizable concepts throughout the paper.
TL;DR: EchoMouse, a device created to characterize the transfer functions of any system, and libpointing, a toolkit that was developed to replicate and compare the ones used by Windows, OS X and Xorg are presented.
Abstract: Transfer functions are the only pointing facilitation technique actually used in modern graphical interfaces involving the indirect control of an on-screen cursor. But despite their general use, very little is known about them. We present EchoMouse, a device we created to characterize the transfer functions of any system, and libpointing, a toolkit that we developed to replicate and compare the ones used by Windows, OS X and Xorg. We describe these functions and report on an experiment that compared the default one of the three systems. Our results show that these default functions improve performance up to 24% compared to a unitless constant CD gain. We also found significant differences between them, with the one from OS X improving performance for small target widths but reducing its performance up to 9% for larger ones compared to Windows and Xorg. These results notably suggest replacing the constant CD gain function commonly used by HCI researchers by the default function of the considered systems.
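To make the contrast between a constant CD gain and a speed-dependent transfer function concrete, here is a minimal sketch. It does not reproduce the functions used by Windows, OS X or Xorg, nor the libpointing API; the piecewise-linear gain curve and every parameter value (`g_min`, `g_max`, `v_low`, `v_high`) are assumptions chosen only for illustration.

```python
def constant_gain(dx_device, gain=4.0):
    """Constant CD gain: cursor displacement is a fixed multiple of device displacement."""
    return gain * dx_device


def speed_dependent_gain(speed, g_min=1.0, g_max=8.0, v_low=0.02, v_high=0.3):
    """Hypothetical gain curve: low gain for slow motion, high gain for fast motion.

    speed is the device speed in m/s; the gain ramps linearly between
    v_low and v_high and is clamped outside that interval.
    """
    if speed <= v_low:
        return g_min
    if speed >= v_high:
        return g_max
    t = (speed - v_low) / (v_high - v_low)
    return g_min + t * (g_max - g_min)


def transfer(dx_device, dt):
    """Map a device displacement (m) measured over dt (s) to a cursor displacement (m)."""
    speed = abs(dx_device) / dt
    return speed_dependent_gain(speed) * dx_device


# The same 1 mm of device motion moves the cursor further when performed faster.
slow = transfer(0.001, 0.1)    # 0.01 m/s
fast = transfer(0.001, 0.002)  # 0.5 m/s
print(slow, fast, constant_gain(0.001))
```

A curve of this general shape favors precision at low speeds and reach at high speeds, which is the kind of behavior a constant gain cannot provide.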
TL;DR: A model to proactively suggest data transforms which map input data to a relational format expected by analysis tools is presented, and a metric that scores tables according to type homogeneity, sparsity and the presence of delimiters is proposed.
Abstract: Analysts regularly wrangle data into a form suitable for computational tools through a tedious process that delays more substantive analysis. While interactive tools can assist data transformation, analysts must still conceptualize the desired output state, formulate a transformation strategy, and specify complex transforms. We present a model to proactively suggest data transforms which map input data to a relational format expected by analysis tools. To guide search through the space of transforms, we propose a metric that scores tables according to type homogeneity, sparsity and the presence of delimiters. When compared to "ideal" hand-crafted transformations, our model suggests over half of the needed steps; in these cases the top-ranked suggestion is preferred 77% of the time. User study results indicate that suggestions produced by our model can assist analysts' transformation tasks, but that users do not always value proactive assistance, instead preferring to maintain the initiative. We discuss some implications of these results for mixed-initiative interfaces.
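A table-scoring metric of the kind described above can be sketched as follows. The abstract names the three ingredients (type homogeneity, sparsity, delimiters) but not how they are computed or combined, so the cell typing, the delimiter set, and the product used in `score` are assumptions for illustration, not the authors' formula.

```python
from collections import Counter

DELIMITERS = ",;:|\t"  # hypothetical set of characters treated as delimiters


def cell_type(cell):
    """Classify a string cell as 'empty', 'number' or 'text'."""
    if cell.strip() == "":
        return "empty"
    try:
        float(cell)
        return "number"
    except ValueError:
        return "text"


def score(rows):
    """Score a rectangular table (list of equal-length rows of strings).

    Higher is better: homogeneous column types, few empty cells,
    and few cells that still contain delimiters.
    """
    cells = [c for row in rows for c in row]
    filled = [c for c in cells if cell_type(c) != "empty"]
    if not filled:
        return 0.0
    sparsity = 1 - len(filled) / len(cells)
    delimiter_rate = sum(any(d in c for d in DELIMITERS) for c in filled) / len(filled)
    per_column = []
    for column in zip(*rows):
        types = [cell_type(c) for c in column if cell_type(c) != "empty"]
        if types:
            # Share of the column's non-empty cells that have its dominant type.
            per_column.append(Counter(types).most_common(1)[0][1] / len(types))
    homogeneity = sum(per_column) / len(per_column)
    return homogeneity * (1 - sparsity) * (1 - delimiter_rate)


raw = [["Alice,30", ""], ["Bob,25", ""]]    # unsplit input
split = [["Alice", "30"], ["Bob", "25"]]    # after a split-on-comma transform
print(score(raw), score(split))
```

Under these assumptions, a search over candidate transforms would prefer the split table, since it removes both the delimiters and the empty column.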
TL;DR: The 1Line keyboard is presented, a soft QWERTY keyboard that is 140 pixels tall (in landscape mode), 40% of the height of the native iPad QWERTY keyboard; a keystroke-level model predicts a peak expert text entry rate of 66-68 WPM.
Abstract: Current soft QWERTY keyboards often consume a large portion of the screen space on portable touchscreens. This space consumption can diminish the overall user experience on these devices. In this paper, we present the 1Line keyboard, a soft QWERTY keyboard that is 140 pixels tall (in landscape mode) and 40% of the height of the native iPad QWERTY keyboard. Our keyboard condenses the three rows of keys in the normal QWERTY layout into a single line with eight keys. The sizing of the eight keys is based on users' mental layout of a QWERTY keyboard on an iPad. The system disambiguates the word the user types based on the sequence of keys pressed. The user can use flick gestures to perform backspace and enter, and tap on the bezel below the keyboard to input a space. Through an evaluation, we show that participants are able to quickly learn how to use the 1Line keyboard and type at a rate of over 30 WPM after just five 20-minute typing sessions. Using a keystroke-level model, we predict the peak expert text entry rate with the 1Line keyboard to be 66-68 WPM.
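Word disambiguation from a sequence of ambiguous key presses can be sketched with a small dictionary lookup. The abstract does not list which letters share a key, so the grouping in `KEY_GROUPS` (QWERTY columns collapsed into eight keys) is an assumption, as are the function names and the tiny word list.

```python
from collections import defaultdict

# Hypothetical grouping of the 26 letters onto eight keys, by QWERTY column.
KEY_GROUPS = ["qaz", "wsx", "edc", "rfvtgb", "yhnujm", "ik", "ol", "p"]
LETTER_TO_KEY = {ch: key for key, group in enumerate(KEY_GROUPS) for ch in group}


def key_sequence(word):
    """Return the tuple of key indices a user would press to type `word`."""
    return tuple(LETTER_TO_KEY[ch] for ch in word.lower())


def build_index(words):
    """Map each key sequence to the words that produce it.

    `words` is assumed to be ordered from most to least frequent, so the
    first candidate for a sequence is the best guess.
    """
    index = defaultdict(list)
    for word in words:
        index[key_sequence(word)].append(word)
    return index


def candidates(index, keys):
    """Look up all dictionary words matching the pressed keys."""
    return index.get(tuple(keys), [])


index = build_index(["the", "cat", "rye", "bud"])
# "the", "rye" and "bud" all map to the same three keys under this grouping.
print(candidates(index, key_sequence("the")))
```

A real system would rank candidates with a language model and offer a way to pick alternatives; the lookup above only shows why a key sequence alone can be ambiguous.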
TL;DR: This year's UIST program includes keynotes by Ge Wang (Stanford professor and co-founder of Smule) on computers and music and by Dan Jurafsky (another Stanford professor and MacArthur Fellow) on computational linguistics.
Abstract: Welcome to UIST 2011, the Twenty-Fourth Annual ACM Symposium on User Interface Software and Technology.
UIST is the premier forum for the presentation of research innovations in the software and technology of human-computer interfaces. Sponsored by ACM's special interest groups on computer-human interaction (SIGCHI) and computer graphics (SIGGRAPH), UIST brings together researchers and practitioners from many areas, including web and graphical interfaces, new input and output devices, information visualization, interactive displays, tangible computing, and computer supported cooperative work. The single-track schedule, intimate size, and location in Santa Barbara, "the American Riviera", make UIST 2011 an ideal place to exchange results and to forge future collaborations.
We received 262 paper submissions from more than 20 countries. After a thorough review process, the program committee accepted 67 papers (25%). Each anonymous submission was reviewed by a primary program committee member and two external reviewers. If any of the three reviewers deemed a submission to pass a rejection threshold we asked a secondary committee member to write a fourth review of the paper. Authors then received all the reviews and if their paper passed the rejection threshold they had the opportunity to write a short rebuttal. The program committee met in person in San Francisco on June 17-18, 2011, to examine each submission and select the top papers. Submissions were finally accepted only after the authors provided a final revision addressing the committee's comments.
In addition to the presentations of accepted papers, this year's program includes keynotes by Ge Wang (Stanford professor and co-founder of Smule) on computers and music and by Dan Jurafsky (another Stanford professor and MacArthur Fellow) on computational linguistics. Posters, demos, the eighth annual Doctoral Symposium, and the third annual Student Innovation Contest (this year focusing on Microsoft's new TouchMouse) complete the program.
TL;DR: This work argues why input modalities beyond direct touch are required and proposes the combination of freehand gestures and direct touch that provides additional degrees of freedom and resolves input ambiguities, while keeping the locus of interaction on the shape output.
Abstract: Actuated shape output provides novel opportunities for experiencing, creating and manipulating 3D content in the physical world. While various shape displays have been proposed, a common approach utilizes an array of linear actuators to form 2.5D surfaces. Through identifying a set of common interactions for viewing and manipulating content on shape displays, we argue why input modalities beyond direct touch are required. The combination of freehand gestures and direct touch provides additional degrees of freedom and resolves input ambiguities, while keeping the locus of interaction on the shape output. To demonstrate the proposed combination of input modalities and explore applications for 2.5D shape displays, two example scenarios are implemented on a prototype system.
TL;DR: The design and implementation of SideBySide, a system designed for ad-hoc multi-user interaction with handheld projectors, including a hybrid handheld projector to project visible and infrared light, and techniques for tracking projected fiducial markers that move and overlap are presented.
Abstract: We introduce SideBySide, a system designed for ad-hoc multi-user interaction with handheld projectors. SideBySide uses device-mounted cameras and hybrid visible/infrared light projectors to track multiple independent projected images in relation to one another. This is accomplished by projecting invisible fiducial markers in the near-infrared spectrum. Our system is completely self-contained and can be deployed as a handheld device without instrumentation of the environment. We present the design and implementation of our system including a hybrid handheld projector to project visible and infrared light, and techniques for tracking projected fiducial markers that move and overlap. We introduce a range of example applications that demonstrate the applicability of our system to real-world scenarios such as mobile content exchange, gaming, and education.
TL;DR: This paper explores how to extend time domain reflectometry in order to touch-enable thin, modular, and deformable surfaces and devices, and demonstrates how to use this approach to make smart clothing and to rapidly prototype touch-sensitive objects of arbitrary shape.
Abstract: Time domain reflectometry, a technique originally used in diagnosing cable faults, can also locate where a cable is being touched. In this paper, we explore how to extend time domain reflectometry in order to touch-enable thin, modular, and deformable surfaces and devices. We demonstrate how to use this approach to make smart clothing and to rapidly prototype touch-sensitive objects of arbitrary shape. To accomplish this, we extend time domain reflectometry in three ways: (1) Thin: We demonstrate how to run time domain reflectometry on a single wire. This allows us to touch-enable thin metal objects, such as guitar strings. (2) Modularity: We present a two-pin connector system that allows users to daisy chain touch-sensitive segments. We illustrate these enhancements with 13 prototypes and a series of performance measurements. (3) Deformability: We create deformable touch devices by mounting stretchable wire patterns onto elastic tape and meshes. We present selected performance measurements.
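The basic principle of time domain reflectometry, locating a disturbance from the delay of its reflection, can be written as a short worked example. The pulse travels to the touch point and back, so the distance is half the round-trip time multiplied by the propagation speed. The velocity factor of 0.66 is an assumed value typical of some cables, not a figure from the paper, and the function name is hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def touch_distance_m(round_trip_s, velocity_factor=0.66):
    """Distance from the pulse source to the touch point, in metres.

    round_trip_s    -- delay between sending the pulse and seeing its reflection
    velocity_factor -- assumed propagation speed in the wire, as a fraction of c
    """
    return velocity_factor * SPEED_OF_LIGHT * round_trip_s / 2


# A reflection arriving 10 ns after the pulse corresponds to roughly 0.99 m.
print(round(touch_distance_m(10e-9), 3))
```

The example also shows why timing resolution matters: at this propagation speed, one nanosecond of round-trip delay corresponds to about 10 cm of wire.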
TL;DR: NaviRadar: an interaction technique for mobile phones that uses a radar metaphor in order to communicate the user's correct direction for crossings along a desired route and provides distinct advantages over current systems by using only tactile feedback.
Abstract: We introduce NaviRadar: an interaction technique for mobile phones that uses a radar metaphor in order to communicate the user's correct direction for crossings along a desired route. A radar sweep rotates clockwise and tactile feedback is provided where each sweep distinctly conveys the user's current direction and the direction in which the user must travel. In a first study, we evaluated the overall concept and tested five different tactile patterns to communicate the two different directions via a single tactor. The results show that people are able to easily understand the NaviRadar concept and can identify the correct direction with a mean deviation of 37° out of the full 360° provided. A second study shows that NaviRadar achieves similar results in terms of perceived usability and navigation performance when compared with spoken instructions. By using only tactile feedback, NaviRadar provides distinct advantages over current systems. In particular, no visual attention is required to navigate; thus, it can be spent on providing greater awareness of one's surroundings. Moreover, the lack of audio attention enables it to be used in noisy environments or this attention can be better spent on talking with others during navigation.
TL;DR: Exploratory techniques for finding relevant and inspiring design examples are introduced, including searching by stylistic similarity to a known example design and searching by stylistic keyword.
Abstract: In design, people often seek examples for inspiration. However, current example-finding practices suffer many drawbacks: templates present designs without a usage context; search engines can only examine the text on a page. This paper introduces exploratory techniques for finding relevant and inspiring design examples. These novel techniques include searching by stylistic similarity to a known example design and searching by stylistic keyword. These interactions are manifest in d.tour, a style-based design exploration tool. d.tour presents a curated database of Web pages as an explorable design gallery. It extracts and analyzes design features of these pages, allowing it to process style-based queries and recommend designs to the user. d.tour's gallery interface decreases the gulfs of execution and evaluation for design example-finding.
TL;DR: The FuwaFuwa sensor module is a round, hand-size, wireless device for measuring the shape deformations of soft objects such as cushions and plush toys that can be embedded in typical soft objects in the household without complex installation procedures and without spoiling the softness of the object because it requires no physical connection.
Abstract: We present the FuwaFuwa sensor module, a round, hand-size, wireless device for measuring the shape deformations of soft objects such as cushions and plush toys. It can be embedded in typical soft objects in the household without complex installation procedures and without spoiling the softness of the object because it requires no physical connection. Six LEDs in the module emit IR light in six orthogonal directions, and six corresponding photosensors measure the reflected light energy. One can easily convert almost any soft object into a touch-input device that can detect both touch position and surface displacement by embedding multiple FuwaFuwa sensor modules in the object. A variety of example applications illustrate the utility of the FuwaFuwa sensor module. An evaluation of the proposed deformation measurement technique confirms its effectiveness.
TL;DR: A prototype touch screen device that can sense the normal and tangential forces of a touch gesture on the screen is implemented and two example applications, a web browser and an e-book reader, are designed that utilize the force gestures for their primary actions.
Abstract: Force gestures are touch screen gestures augmented by the normal and tangential forces on the screen. In order to study the feasibility of the force gestures on a mobile touch screen, we implemented a prototype touch screen device that can sense the normal and tangential forces of a touch gesture on the screen. We also designed two example applications, a web browser and an e-book reader, that utilize the force gestures for their primary actions. We conducted a user study with the prototype and the applications to study the characteristics of the force gestures and the effectiveness of their mapping to the primary actions. In the user study we could also discover interesting usability issues and collect useful user feedback about the force gestures and their mapping to GUI actions.
TL;DR: This work prototypes and empirically evaluates the effect of sidetone to help operators of mobile remote presence systems self-regulate their speaking loudness, and finds that engaging in more social tasks and more intellectually demanding tasks influences how loudly people speak.
Abstract: In our field deployments of mobile remote presence (MRP) systems in offices, we observed that remote operators of MRPs often unintentionally spoke too loudly. This disrupted their local co-workers, who happened to be within earshot of the MRP system. To address this issue, we prototyped and empirically evaluated the effect of sidetone to help operators self-regulate their speaking loudness. Sidetone is the intentional, attenuated feedback of speakers' voices to their ears while they are using a telecommunication device. In a 3-level (no sidetone vs. low sidetone vs. high sidetone) within-participants pair of experiments, people interacted with a confederate through an MRP system. The first experiment involved MRP operators using headsets with boom microphones (N=20). The second experiment involved MRP operators using loudspeakers and desktop microphones (N=14). While we detected the effects of the sidetone manipulation in our audio-visual context, the effect was attenuated in comparison to earlier audio-only studies. We hypothesize that the strong visual component of our MRP system interferes with the sidetone effect. We also found that engaging in more social tasks (e.g., a getting-to-know-you activity) and more intellectually demanding tasks (e.g., a creativity exercise) influenced how loudly people spoke. This suggests that testing such sidetone effects in the typical read-aloud setting is insufficient for generalizing to more interactive communication tasks. We conclude that MRP application support must reach beyond the time-honored audio-only technologies to solve the problem of excessive speaker loudness.
TL;DR: The design of the Portico system is described, a portable system for enabling tangible interaction on and around tablet computers that allows tablets to extend both their sensing capabilities and interaction space without sacrificing portability.
Abstract: We present Portico, a portable system for enabling tangible interaction on and around tablet computers. Two cameras on small foldable arms are positioned above the display to recognize a variety of physical objects placed on or around the tablet. These cameras have a larger field-of-view than the screen, allowing Portico to extend interaction significantly beyond the tablet itself. Our prototype, which uses a 12" tablet, delivers an interaction space six times the size of the tablet screen. Portico thus allows tablets to extend both their sensing capabilities and interaction space without sacrificing portability. We describe the design of our system and present a number of applications that demonstrate Portico's unique capability to track objects. We focus on a number of fun applications that demonstrate how such a device can be used as a low-cost way to create personal surface computing experiences. Finally, we discuss the challenges in supporting tangible interaction beyond the screen and describe possible mechanisms for overcoming them.
TL;DR: This paper presents a hybrid framework, PAX, which associates the visual representation of user interfaces (i.e., the pixels) with their internal hierarchical metadata (i.e., the content, role, and value of GUI widgets).
Abstract: Pixel-based methods are emerging as a new and promising way to develop new interaction techniques on top of existing user interfaces. However, in order to maintain platform independence, other available low-level information about GUI widgets, such as accessibility metadata, was neglected intentionally. In this paper, we present a hybrid framework, PAX, which associates the visual representation of user interfaces (i.e. the pixels) and their internal hierarchical metadata (i.e. the content, role, and value). We identify challenges to building such a framework. We also develop and evaluate two new algorithms for detecting text at arbitrary places on the screen, and for segmenting a text image into individual word blobs. Finally, we validate our framework in implementations of three applications. We enhance an existing pixel-based system, Sikuli Script, and preserve the readability of its script code at the same time. Further, we create two novel applications, Screen Search and Screen Copy, to demonstrate how PAX can be applied to development of desktop-level interactive systems.
TL;DR: This article describes Scotty, a prototype implementation for Mac OS X Cocoa that enables developers to modify existing applications at runtime, and demonstrates a collection of interaction and functional transformations on existing off-the-shelf applications.
Abstract: This article introduces runtime toolkit overloading, a novel approach to help third-party developers modify the interaction and behavior of existing software applications without access to their underlying source code. We describe the abstractions provided by this approach as well as the mechanisms for implementing them in existing environments. We describe Scotty, a prototype implementation for Mac OS X Cocoa that enables developers to modify existing applications at runtime, and we demonstrate a collection of interaction and functional transformations on existing off-the-shelf applications. We show how Scotty helps a developer make sense of unfamiliar software, even without access to its source code. We further discuss what features of future environments would facilitate this kind of runtime software development.
TL;DR: H4-Writer uses Huffman coding to assign minimized key sequences to letters, with full access to error correction, punctuation, digits, modes, etc., and is believed to be the most efficient and quickest four-key text entry method available.
Abstract: We present what we believe is the most efficient and quickest four-key text entry method available. H4-Writer uses Huffman coding to assign minimized key sequences to letters, with full access to error correction, punctuation, digits, modes, etc. The key sequences are learned quickly, and support eyes-free entry. With KSPC = 2.321, the effort to enter text is comparable to multitap on a mobile phone keypad; yet multitap requires nine keys. In a longitudinal study with six participants, an average text entry speed of 20.4 wpm was observed in the 10th session. Error rates were under 1%. To improve external validity, an extended session was included that required input of punctuation and other symbols. Entry speed dropped only by about 3 wpm, suggesting participants quickly leveraged their acquired skill with H4-Writer to access advanced features.
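The Huffman-based key assignment and the KSPC (keystrokes per character) figure mentioned in the H4-Writer abstract can be illustrated with a minimal sketch. It builds a generic base-4 Huffman code, where each digit 0-3 stands for one of four keys, and computes KSPC as the weighted mean sequence length. The function names, the toy alphabet, and the weights are assumptions made for illustration; this is not H4-Writer's actual key assignment, which must also provide access to punctuation, digits, and modes.

```python
import heapq
from itertools import count


def huffman_codes(weights, arity=4):
    """Return {symbol: key sequence} for a base-`arity` Huffman code.

    `weights` maps each symbol to a non-negative frequency (hypothetical
    values here, not measured letter statistics).
    """
    tiebreak = count()  # unique counter so heap entries never compare dicts
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in weights.items()]
    # Pad with zero-weight dummy nodes so every merge combines exactly
    # `arity` nodes; this keeps the resulting code optimal.
    while (len(heap) - 1) % (arity - 1) != 0:
        heap.append((0.0, next(tiebreak), {}))
    heapq.heapify(heap)
    while len(heap) > 1:
        merged, total = {}, 0.0
        for digit in range(arity):
            w, _, codes = heapq.heappop(heap)
            total += w
            # Prepend this branch's key to every symbol below it.
            for sym, suffix in codes.items():
                merged[sym] = str(digit) + suffix
        heapq.heappush(heap, (total, next(tiebreak), merged))
    return heap[0][2]


def kspc(weights, codes):
    """Weighted mean number of keystrokes per character."""
    total = sum(weights.values())
    return sum(weights[s] * len(codes[s]) for s in weights) / total


# Toy seven-letter alphabet with made-up weights.
toy_weights = {"e": 12, "t": 9, "a": 8, "o": 7, "i": 7, "n": 6, "s": 6}
toy_codes = huffman_codes(toy_weights)
print(toy_codes, round(kspc(toy_weights, toy_codes), 3))
```

Frequent symbols end up with shorter key sequences, which is what drives KSPC down; a real layout would be built from measured letter frequencies rather than the toy weights above.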
TL;DR: This paper presents a creation tool for contextual help that lets users apply common computer skills (taking screenshots and writing simple scripts) and uses pixel analysis on screenshots so that the tool applies to a wide range of applications and platforms without source code access.
Abstract: Contextual help is effective for learning how to use GUIs by showing instructions and highlights on the actual interface rather than in a separate viewer. However, end-users and third-party tech support typically cannot create contextual help to assist other users because it requires programming skill and source code access. We present a creation tool for contextual help that allows users to apply common computer skills: taking screenshots and writing simple scripts. We perform pixel analysis on screenshots to make this tool applicable to a wide range of applications and platforms without source code access. We evaluated the tool's usability with three groups of participants: developers, instructors, and tech support. We further validated the applicability of our tool with 60 real tasks supported by the tech support of a university campus.
TL;DR: This paper introduces a novel set of motion-sensing configurations based on laser speckle sensing that are particularly suitable for human-computer interaction; the underlying principles allow these configurations to be fast, precise, extremely compact, and low cost.
Abstract: Motion sensing is of fundamental importance for user interfaces and input devices. In applications where optical sensing is preferred, traditional camera-based approaches can be prohibitive due to limited resolution, low frame rates and the required computational power for image processing. We introduce a novel set of motion-sensing configurations based on laser speckle sensing that are particularly suitable for human-computer interaction. The underlying principles allow these configurations to be fast, precise, extremely compact and low cost. We provide an overview and design guidelines for laser speckle sensing for user interaction and introduce four general speckle projector/sensor configurations. We describe a set of prototypes and applications that demonstrate the versatility of our laser speckle sensing techniques.
TL;DR: This paper provides an analysis and taxonomy of various aspects of application context and how they may be used in retrieving software help artifacts with web browsers, and presents the design of a context-aware augmented web search system.
Abstract: Users of complex software applications frequently need to consult documentation, tutorials, and support resources to learn how to use the software and further their understanding of its capabilities. Existing online help systems provide limited context awareness through "what's this?" and similar techniques. We examine the possibility of making more use of the user's current context in a particular application to provide useful help resources. We provide an analysis and taxonomy of various aspects of application context and how they may be used in retrieving software help artifacts with web browsers, present the design of a context-aware augmented web search system, and describe a prototype implementation and initial user study of this system. We conclude with a discussion of open issues and an agenda for further research.
TL;DR: This paper introduces query-feature graphs, or QF-graphs, and shows that the associations produced by the approach exhibit levels of accuracy that make them eminently usable in a range of real-world applications.
Abstract: This paper introduces query-feature graphs, or QF-graphs. QF-graphs encode associations between high-level descriptions of user goals (articulated as natural language search queries) and the specific features of an interactive system relevant to achieving those goals. For example, a QF-graph for the GIMP graphics manipulation software links the query "GIMP black and white" to the commands "desaturate" and "grayscale." We demonstrate how QF-graphs can be constructed using search query logs, search engine results, web page content, and localization data from interactive systems. An analysis of QF-graphs shows that the associations produced by our approach exhibit levels of accuracy that make them eminently usable in a range of real-world applications. Finally, we present three hypothetical user interface mechanisms that illustrate the potential of QF-graphs: search-driven interaction, dynamic tooltips, and app-to-app analogy search.
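The query-to-feature associations described in the QF-graph abstract can be pictured as a weighted bipartite graph. The following is a minimal sketch of such a structure, assuming a simple additive edge weight; the class name, method names, and weights are hypothetical and do not reflect how the paper constructs or scores its graphs. Only the GIMP query and the two command names come from the abstract.

```python
from collections import defaultdict


class QFGraph:
    """Toy weighted bipartite graph linking search queries to system features."""

    def __init__(self):
        self._by_query = defaultdict(dict)    # query -> {feature: weight}
        self._by_feature = defaultdict(dict)  # feature -> {query: weight}

    def add(self, query, feature, weight=1.0):
        """Add evidence for an association; repeated evidence accumulates."""
        new_weight = self._by_query[query].get(feature, 0.0) + weight
        self._by_query[query][feature] = new_weight
        self._by_feature[feature][query] = new_weight

    def features_for(self, query):
        """Features linked to a query, strongest association first."""
        edges = self._by_query.get(query, {})
        return sorted(edges, key=lambda f: (-edges[f], f))

    def queries_for(self, feature):
        """Queries linked to a feature, strongest association first."""
        edges = self._by_feature.get(feature, {})
        return sorted(edges, key=lambda q: (-edges[q], q))


graph = QFGraph()
# Weights below are made up purely to show ranking.
graph.add("GIMP black and white", "desaturate", 2.0)
graph.add("GIMP black and white", "grayscale", 1.0)
print(graph.features_for("GIMP black and white"))
```

Keeping an index in both directions makes it cheap to go from a query to candidate commands (as in search-driven interaction) and from a command back to the queries that describe it (as in dynamic tooltips).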