Shumin Zhai (simplified Chinese: 翟树民; born 1961) is a Chinese-born American Canadian human–computer interaction (HCI) research scientist and inventor. He is known for his research on input devices and interaction methods, swipe-gesture-based touchscreen keyboards, eye-tracking interfaces, and models of human performance in human-computer interaction. His studies have contributed both to foundational models and understandings of HCI and to practical user interface designs and flagship products. He previously worked at IBM, where he invented the ShapeWriter text entry method for smartphones, a predecessor to the modern Swype keyboard.[1][2] His publications have won the ACM UIST Lasting Impact Award and the IEEE Computer Society Best Paper Award, among others. Dr. Zhai is currently a principal scientist at Google, where he leads and directs research, design, and development of human-device input methods and haptics systems.
From 2001 to 2007, Dr. Zhai was a visiting adjunct professor in the Department of Computer and Information Science (IDA) at Linköping University, where he also supervised graduate research.
From 1996 to 2011, he worked at the IBM Almaden Research Center. He originated and led the SHARK/ShapeWriter project at IBM Research and, from January 2007, at a start-up company; the project pioneered the touchscreen word-gesture keyboard paradigm, filing the first patents of the paradigm and publishing the first generation of scientific papers on it.[4] In 2010, ShapeWriter was acquired by Nuance Communications and taken off the market. During his tenure at IBM, Dr. Zhai also worked with a team of engineers from IBM and IBM vendors to bring the ScrollPoint mouse from research to market; the product won a CES award and reached millions of users.
From 2009 to 2015, Dr. Zhai was also the editor-in-chief of the ACM Transactions on Computer-Human Interaction. By that time he had been deeply involved in both the conference and journal sides of publishing HCI research, as an author, reviewer, editor, committee member, and papers chair.[5][6]
Since 2011, Dr. Zhai has worked at Google as a principal scientist, where he leads and directs research, design, and development of human-device input methods and haptics systems. Specifically, he has led the research and design of Google's keyboard products, Pixel phone haptics, and novel Google Assistant invocation methods. Notably, Dr. Zhai led the design of Active Edge, a headline feature of the Google Pixel 2 that lets the user reach Google Assistant faster and more intuitively with a gentle squeeze of the device rather than through the touchscreen.
Work
Dr. Zhai's research is primarily in human-computer interaction; he currently works on the research, design, and development of manual and text input methods and haptics systems. Besides text input and haptics, his research interests include system user interface design, human-performance modeling, multi-modal interaction, computer input devices and methods, and theories of human-computer interaction.[7] He has published over 200 research papers[8] and received 30 patents.[9]
Word-gesture keyboard
In 2003, Dr. Zhai and Per Ola Kristensson proposed SHARK (shorthand aided rapid keyboarding), a method of speed-writing for pen-based computing that augments stylus keyboarding with shorthand gesturing. SHARK defines a shorthand symbol for each word according to its movement pattern on an optimized stylus keyboard.[10] In 2004, they presented SHARK2, which increased recognition accuracy and relaxed precision requirements by using the shape and location of gestures in addition to context-based language models.[11] In doing so, Dr. Zhai and Kristensson delivered the paradigm of touchscreen gesture typing[12] as an efficient method of text entry that has continued to drive the development of mobile text entry across the industry.[4] One of the most important rationales of gesture keyboards is facilitating the transition from primarily visually guided letter-to-letter tracing to memory-recall-driven gesturing.[13] By releasing the first word-gesture keyboard in 2004 through IBM AlphaWorks and a top-ranked iPhone app called ShapeWriter WritingPad in 2008,[14] Dr. Zhai and his colleagues facilitated this transition and brought the invention from the laboratory to real-world users.[15]
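The recognition approach can be illustrated with a short sketch. The following Python fragment is a minimal illustration of the idea, not Zhai and Kristensson's published algorithm: each candidate word is scored by the distance between the user's trace and the word's ideal path through the key centers, weighted by a language-model prior. The keyboard grid, lexicon, distance measure, and channel fusion are all simplified assumptions.

```python
# Minimal sketch of word-gesture recognition in the spirit of SHARK2:
# compare the gesture trace against each word's ideal key-center path,
# then weight the match by a language-model prior.
import math

# Illustrative key centers on a staggered QWERTY-like grid (unit = key width).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEYS = {ch: (col + 0.5 * row, float(row))
        for row, line in enumerate(ROWS)
        for col, ch in enumerate(line)}

def word_path(word):
    """The ideal gesture template: a polyline through the word's key centers."""
    return [KEYS[c] for c in word]

def resample(pts, n=32):
    """Resample a polyline into n points spaced evenly along its length."""
    if len(pts) < 2:
        return pts * n
    seg = [math.dist(a, b) for a, b in zip(pts, pts[1:])]
    step = (sum(seg) or 1e-9) / (n - 1)
    out, d_done, i, d_next = [pts[0]], 0.0, 0, step
    while len(out) < n - 1:
        if d_done + seg[i] >= d_next:
            t = (d_next - d_done) / seg[i]
            (ax, ay), (bx, by) = pts[i], pts[i + 1]
            out.append((ax + t * (bx - ax), ay + t * (by - ay)))
            d_next += step
        else:
            d_done += seg[i]
            i += 1
    return out + [pts[-1]]

def score(trace, word, prior):
    """Location-channel distance fused (crudely) with a unigram prior."""
    a, b = resample(trace), resample(word_path(word))
    dist = sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)
    return prior * math.exp(-dist)

lexicon = {"the": 0.05, "they": 0.01, "then": 0.008}  # hypothetical priors
trace = word_path("the")       # pretend the user traced t -> h -> e
print(max(lexicon, key=lambda w: score(trace, w, lexicon[w])))  # "the"
```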
Laws and models of action
One of Dr. Zhai's main HCI research threads concerns Fitts'-law-style models of human performance. Since 1996, Dr. Zhai and his colleagues have pursued research on "Laws of Action" that carries the spirit of Fitts' law forward. In the HCI context, Fitts' law can be considered the "Law of Pointing", and they argue that other, equally robust regularities of human action exist. The two new classes of action relevant to user interface design and evaluation that they have explored are crossing and steering; the standard forms of the pointing and steering laws are sketched after the list below.[16]
“Law of Pointing”: Refining Fitts’ law models for bivariate pointing, 2003[17]
“Law of Steering”: Human Action Laws in Electronic Virtual Worlds - an empirical study of path steering performance in VR, 2004[18]
“Law of Crossing”: Foundations for designing and evaluating user interfaces based on the crossing paradigm, 2010[19]
Modeling human performance of pen stroke gestures, 2007[20]
FFitts law: modeling finger touch with Fitts' law, 2013[21]
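For reference, the pointing and steering regularities above are commonly written in the following standard forms; the notation follows general HCI usage rather than any single paper listed here:

```latex
% Fitts' law ("Law of Pointing"): time T to point at a target of width W
% at distance D, with empirically fitted constants a and b.
T = a + b \log_2\!\left(\frac{D}{W} + 1\right)

% Accot-Zhai steering law ("Law of Steering"): time to steer along a
% path C whose allowed width at arc-length position s is W(s).
T = a + b \int_{C} \frac{ds}{W(s)}
```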
Multiple degrees of freedom input
Dr. Zhai started working on multiple degrees of freedom (DOF) input during his graduate years at the University of Toronto. In his Ph.D. thesis, he systematically examined human performance as a function of design variations of a 6 DOF control device, such as control resistance (isometric, elastic, and isotonic), transfer function (position vs. rate control), muscle groups used, and display format. He investigated people's ability to coordinate multiple degrees of freedom using three quantification methods: simultaneous time-on-target, error correlation, and efficiency.
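As an illustration of the third measure, the sketch below computes a translation inefficiency score by comparing the length of the path actually traveled with the shortest possible path; this (actual − ideal)/ideal form is a common formulation assumed here for illustration, not necessarily the exact definition used in the thesis.

```python
# Hedged illustration of an "efficiency"-style coordination measure for
# multi-DOF input: a perfectly coordinated movement travels the shortest
# possible path, so extra path length signals poor coordination.
import math

def translation_inefficiency(trajectory, start, target):
    """trajectory: sampled (x, y, z) positions of the controlled object."""
    ideal = math.dist(start, target)                 # shortest possible path
    actual = sum(math.dist(a, b) for a, b in zip(trajectory, trajectory[1:]))
    return (actual - ideal) / ideal                  # 0.0 = perfectly efficient

trial = [(0, 0, 0), (1.0, 0.2, 0.0), (2.0, 0.1, 0.3), (3.0, 0.0, 0.0)]
print(round(translation_inefficiency(trial, (0, 0, 0), (3, 0, 0)), 3))  # 0.039
```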
Eye-tracking augmented user interfaces
Dr. Zhai has been involved in two applications of eye-tracking augmented user interfaces: MAGIC pointing and RealTourist.[23]
In 1999, he and his colleagues Carlos Morimoto and Steven Ihde at the IBM Almaden Research Center published the paper "Manual and gaze input cascaded (MAGIC) pointing". This work explored a new direction in utilizing eye gaze for computer input, showing that MAGIC pointing techniques might offer many advantages, including less physical effort and fatigue than traditional manual pointing, greater accuracy and naturalness than traditional gaze pointing, and possibly faster speed than manual pointing.[24]
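In outline, MAGIC pointing cascades the two input channels: gaze defines a coarse target area and the hand does the final fine positioning. The Python sketch below illustrates the warp-on-movement-onset idea; the class and method names and the single-warp policy are simplifying assumptions, not the paper's exact "liberal" or "conservative" techniques.

```python
# Simplified sketch of manual-and-gaze cascaded ("MAGIC") pointing: the
# cursor warps once, at the onset of manual movement, to the vicinity of
# the current gaze point; the hand then finishes the selection manually.
class MagicPointer:
    def __init__(self):
        self.cursor = (0.0, 0.0)
        self.gaze = (0.0, 0.0)
        self.warped = False          # at most one warp per manual movement

    def on_gaze_sample(self, x, y):
        # Eye trackers are noisy, so the gaze point only defines a
        # coarse target area, never the final click position.
        self.gaze = (x, y)

    def on_mouse_move(self, dx, dy):
        if not self.warped:
            self.cursor = self.gaze       # coarse jump driven by gaze
            self.warped = True
        cx, cy = self.cursor
        self.cursor = (cx + dx, cy + dy)  # fine adjustment driven by hand

    def on_mouse_idle(self):
        self.warped = False          # the next movement may warp again
```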
In 2005, he developed and studied an experimental system, RealTourist, with Pernilla Qvarfordt and David Beymer. RealTourist lets a user plan a conference trip with the help of a remote tourist consultant who can view the tourist's eye gaze superimposed onto a shared map. Data collected from the experiment were analyzed in conjunction with a literature review on speech and eye-gaze patterns. This exploratory research identified various functions of gaze overlay on shared spatial material, including accurate and direct display of the partner's eye gaze, implicit deictic referencing, interest detection, common focus and topic switching, increased redundancy and ambiguity reduction, and an increase of assurance, confidence, and understanding. The study identified patterns that can serve as a basis for designing multimodal human-computer dialogue systems with eye-gaze locus as a contributing channel, and investigated how computer-mediated communication can be supported by the display of the partner's eye gaze.[25]
FonePal
FonePal is a system developed to improve the experience of accessing call centers and help desks. Voice menu navigation, often described as "touchtone hell", has long been recognized as a frustrating user experience because of the slow, serial nature of voice presentation. In contrast, FonePal allows a user to scan and select from a visual menu at the user's own pace, typically much faster than waiting for the voice menus to be spoken. FonePal uses the Internet infrastructure, specifically instant messaging, to deliver a visual menu on a nearby computer screen simultaneously with the voice menu over the phone.[26]
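In outline, the system keeps the two channels synchronized and treats a visual selection like a touch-tone keypress, as sketched below; all class and method names are hypothetical, since the papers do not publish an API.

```python
# Hedged sketch of FonePal's coordination idea: each IVR voice menu is
# simultaneously pushed as text over an instant-messaging channel, and a
# visual selection is injected back as the equivalent touch-tone key.
class FonePalBridge:
    def __init__(self, ivr, im_channel):
        self.ivr = ivr          # plays voice menus on the phone line
        self.im = im_channel    # shows text menus on a nearby screen

    def present_menu(self, menu):
        """menu: ordered (key, label) pairs, e.g. [("1", "Billing"), ...]."""
        self.im.send("\n".join(f"{key}. {label}" for key, label in menu))
        self.ivr.speak_menu(menu)            # both channels stay in sync

    def on_visual_selection(self, key):
        self.ivr.inject_keypress(key)        # same effect as dialing the key
```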
In 2005 and 2006, Dr. Zhai and his colleague Min Yin at the IBM Almaden Research Center published two papers about this project. Their studies show that FonePal enables easier navigation of IVR phone trees, higher navigation speed, fewer routing errors, and greater satisfaction. FonePal can also seamlessly bridge the caller to a searchable web knowledge base, promoting relevant self-help and reducing call center operating costs.[27][28]
References
^ Zhai, Shumin; Kristensson, Per-Ola (2003). "Shorthand writing on stylus keyboard". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '03. New York, NY, USA: ACM. pp. 97–104. doi:10.1145/642611.642630. ISBN 1581136307. S2CID 1697605.
^ "CBB Seminar: Dr. Shumin Zhai, Google Inc". Engineering, University of Waterloo. 2018-04-09. Retrieved 2019-04-27.
^ Zhai, Shumin; Kristensson, Per-Ola (2003). "Shorthand writing on stylus keyboard". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '03. New York, NY, USA: ACM. pp. 97–104. doi:10.1145/642611.642630. ISBN 9781581136302. S2CID 1697605.
^ US 7251367, Zhai, Shumin, "System and method for recognizing word patterns based on a virtual keyboard layout", published 2007-07-31, assigned to IBM.
^ Zhai, Shumin; Kristensson, Per Ola (2012). "The word-gesture keyboard: reimagining keyboard interaction". Communications of the ACM. 55 (9): 91–101. doi:10.1145/2330667.2330689. S2CID 566903.
^ Accot, Johnny; Zhai, Shumin (2003). "Refining Fitts' law models for bivariate pointing". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '03. New York, NY, USA: ACM. pp. 193–200. doi:10.1145/642611.642646. ISBN 9781581136302. S2CID 5154061.
^ Zhai, Shumin; Accot, Johnny; Woltjer, Rogier (2004-04-01). "Human Action Laws in Electronic Virtual Worlds: An Empirical Study of Path Steering Performance in VR". Presence: Teleoperators and Virtual Environments. 13 (2): 113–127. doi:10.1162/1054746041382393. ISSN 1054-7460. S2CID 36408015.
^ Apitz, Georg; Guimbretière, François; Zhai, Shumin (May 2010). "Foundations for Designing and Evaluating User Interfaces Based on the Crossing Paradigm". ACM Trans. Comput.-Hum. Interact. 17 (2): 9:1–9:42. doi:10.1145/1746259.1746263. ISSN 1073-0516. S2CID 6224916.
^ Cao, Xiang; Zhai, Shumin (2007). "Modeling human performance of pen stroke gestures". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '07. New York, NY, USA: ACM. pp. 1495–1504. doi:10.1145/1240624.1240850. ISBN 9781595935939. S2CID 6745302.
^ Bi, Xiaojun; Li, Yang; Zhai, Shumin (2013). "FFitts law". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '13. New York, NY, USA: ACM. pp. 1363–1372. doi:10.1145/2470654.2466180. ISBN 9781450318990. S2CID 2675893.
^ Zhai, Shumin; Morimoto, Carlos; Ihde, Steven (1999). "Manual and gaze input cascaded (MAGIC) pointing". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '99). New York, NY, USA: ACM. pp. 246–253. doi:10.1145/302979.303053. ISBN 9780201485592. S2CID 207247711.
^ Qvarfordt, Pernilla; Beymer, David; Zhai, Shumin (2005). "RealTourist – A Study of Augmenting Human-Human and Human-Computer Dialogue with Eye-Gaze Overlay". In Costabile, Maria Francesca; Paternò, Fabio (eds.). Human-Computer Interaction – INTERACT 2005. Lecture Notes in Computer Science. Vol. 3585. Springer Berlin Heidelberg. pp. 767–780. doi:10.1007/11555261_61. ISBN 9783540317227.
^ Yin, Min; Zhai, Shumin (2005). "Dial and see". Proceedings of the 18th Annual ACM Symposium on User Interface Software and Technology. UIST '05. New York, NY, USA: ACM. pp. 187–190. doi:10.1145/1095034.1095066. ISBN 9781595932716. S2CID 8403712.
^ Yin, Min; Zhai, Shumin (2006). "The benefits of augmenting telephone voice menu navigation with visual browsing and search". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '06. New York, NY, USA: ACM. pp. 319–328. doi:10.1145/1124772.1124821. ISBN 9781595933720. S2CID 16484512.