
General Discussion

close-range gesture control vs. touch
6 replies
Amit Shoham
Offline
Last seen: 27 weeks 6 days ago
Level 4: Thaumaturgist
Joined: 2011-05-28
Points: 60

Gesture-based control of applications is slated to be one of the fastest-growing embedded vision application areas.  In particular, close-range gesture control seems to be a hot topic.  Personally, I'm having a difficult time understanding why...

Here are just a few reasons for my skepticism:

1. It seems to me that it's much more intuitive to manipulate objects on a screen by touching them. If I'm close enough to touch the screen, in most cases I suspect that this provides a better user experience.

2. Vision-based gestures may be problematic because they lack context.  I'd hate to have my laptop delete a file because I waved goodbye to my wife, or initiate an online purchase because I tried to get a waiter's attention at a cafe.  Touch naturally provides the context that's missing from vision-based gesture control.

3. Any time we put a camera in a smart, connected device, we create a privacy concern.

4. Waving your arms around to control an application often causes more fatigue than using a mouse or touch screen.

5. What about this guy?

https://www.youtube.com/watch?v=2MJ-NeXRcEk

He can use keyboards and touchscreens just fine, but can he use a vision-based gesture interface?  I can think of several people I've met whose hands are deformed.  Any application that is intended for use by the public--e.g. point-of-sale applications--will need to accommodate them, and the vast majority of vision-based gesture technologies will not work for them.  At the very least, the vision-based control will have to be supplemented by some buttons.

6. Much of the functionality of vision-based gesture control can be provided by much cheaper, more power-efficient, and more computationally-efficient solutions.  Here are a couple of examples:

https://www.youtube.com/watch?v=xSKKFHhgP-c

https://www.youtube.com/watch?v=tlU6Gz1Sc8M

Note that these solutions don't provide the sophistication of, e.g., Omek's "Grasp"--i.e. they don't track the positions of individual fingers... In fact, these solutions don't even distinguish a hand from a foot, and that actually makes them usable by handicapped people.

In my opinion, if vision-based gesture control is ultimately going to be more than a solution looking for a problem, it will require some very thoughtful user interface design that includes a touchscreen (or mouse, keyboard, etc.), and some additional visual context (e.g. gaze tracking).  Waving one's hands in the air to move a pointer and select objects on the screen is rarely the better choice when touching the screen is a viable option.  The vision-based functionality will have to provide some meaningful value beyond what's already provided by a touchscreen in order to ultimately be successful.


gershom (not verified)

Amit,

I largely agree with your main point. For close-range gesture control to get to market in any kind of meaningful way, there has to be a clear demonstration of an engaging user experience, one that is both natural and intuitive and that provides added value over existing interfaces (notably the keyboard, mouse, and touchscreen). Indeed, wasn't this the case with Kinect as well? Many (most?) of those launch titles were 1st- and 2nd-party titles, funded by Microsoft.

Nonetheless, there are many reasons to believe this technology will be able to deliver on its promise (and I, for one, am a believer, since my company, Omek Interactive, is currently developing a close-range gesture solution). First of all, I think there is a natural (and understandable) tendency to view close-range gesture through the prism of long-range (i.e., Kinect) gestures -- big, waving movements of the arms and hands, tiring motions, etc. But I don't expect that this will be the close-range experience at all. The proximity to the camera, combined with higher resolutions and higher frame rates, will support much finer, more nuanced tracking, quite different from the Kinect experience.

As you point out, gestures certainly won't replace keyboards, but they will greatly expand the types of interactions our devices can understand. Indeed, does anyone believe that when touchscreens start becoming part of the PC experience, we will stop using our mice and keyboards?

Ultimately, communicating with your devices by moving your hands and fingers is much closer to the way people communicate with each other than, say, a mouse. I suggest this is one reason why the touchscreen experience works -- it's more natural to touch the object you want to select. Or, alternatively, you could just point at it with your finger. A touchscreen can only detect that it is being touched, and where; a 3D gesture system can not only identify which finger is doing the pointing, but also determine the direction it's pointing in.
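
To make that last point concrete, here's a rough sketch of what "pointing at the screen" could reduce to once a hand tracker reports 3D finger joints. The tracker, its coordinate frame, and the two-joints-per-finger output are assumptions for illustration, not any particular vendor's API:

import numpy as np

# Assumed setup (hypothetical): a depth-based hand tracker reports a knuckle
# and a fingertip position per finger, in meters, in a coordinate frame where
# the display lies in the z = 0 plane.
def pointing_target(knuckle, fingertip):
    """Return the (x, y) point where the finger's ray meets the screen plane,
    or None if the finger is parallel to, or pointing away from, the screen."""
    knuckle = np.asarray(knuckle, dtype=float)
    fingertip = np.asarray(fingertip, dtype=float)
    direction = fingertip - knuckle              # the finger's pointing direction
    if abs(direction[2]) < 1e-9:                 # parallel to the screen plane
        return None
    t = -fingertip[2] / direction[2]             # extend the ray until z = 0
    if t < 0:                                    # pointing away from the screen
        return None
    hit = fingertip + t * direction
    return hit[0], hit[1]

# Example: an index finger ~30 cm from the display, angled slightly down and right.
print(pointing_target(knuckle=[0.00, 0.02, 0.35], fingertip=[0.01, 0.01, 0.30]))

A touchscreen only ever gives you the contact point; the extra information in a sketch like this (which finger, and where it's aimed) is what a 3D system could add on top.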

Amit Shoham
Offline
Last seen: 27 weeks 6 days ago
Level 4: Thaumaturgist
Joined: 2011-05-28
Points: 60

Hi Gershom,

I think we're vehemently in agreement :)   I too believe that there's potentially some great value to close-range gesture interfaces.  I wrote the original post out of frustration on several fronts. 

First and foremost, I was seeing a lot of demos of close-range gesture technology that frankly were so lame that I felt they were doing a disservice to the industry by making the technology look clunky and not very useful.  A demo isn't going to be effective if the person showing it is complaining about the strain on their arm, the poor response time, etc., when there's another user interface available that works perfectly well for the same task.  A good demo has to show that the technology makes life MUCH easier for the user.  But all of the demos I was seeing actually looked harder to use than a touchscreen, keyboard, etc.  If you're Microsoft, then you have the clout to work with lots of app developers to create production-ready titles that show the real value of the technology.  I understand that this is a lot harder for a small company such as Omek.  But I want to at least see a reasonable "mock up" that shows the real value of the technology.  I don't want to see a demo that shows the technology actually working, but in a way that makes it look bad because it's less intuitive/responsive/etc. than the much simpler and cheaper existing technology that we already have.

It's tempting to think that we need the "I can prove that it works" demo, even though that demo doesn't actually show the value of the technology.  IMO this is very short-sighted.  I think that having the mockup that shows the real value of the technology is vital, and if you already have that mockup, then you can easily turn it into a much more compelling demo as soon as the technology is sufficiently mature.

My second source of frustration was the fact that the vast majority of the close-range interface technology that I was seeing didn't seem very mature.  Combined with the fact that all of the demos I saw were ill-conceived, this gave me a very negative feeling about the technology.  I could understand intellectually that there's a lot of potential, but I just couldn't bring myself to care...  And I was talking to others in the industry who seemed to be having the same experience.  They understood the potential, but the excitement just wasn't there.

So here's my challenge to Omek and other companies working on close-range gesture: make me care! Show me the mockup/demo/application that proves the value of the technology.  Make me FEEL that close-range gesture interfaces will actually make my life better in some way.

flared0ne
Offline
Last seen: 6 years 24 weeks ago
Level 1: Prestidigitator
Joined: 2012-09-10
Points: 1

I can see we're going to have some fun!

So, how about this? An infrared sensor that doesn't do imaging (so no "camera in a smart connected device" concerns), that delivers exquisitely precise, reproducible position detection (0.01 millimeters, i.e. ten micrometers), that has latency almost three times shorter than the refresh interval of your computer monitor (running between 32 and 200 frames per second), and that is initially being pitched as a "gestural recognition" package where, if you can SEE an action and it can be reliably reproduced, it can be used as a control input. And you get to define the "upon recognition" action, if any.
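
Just to illustrate the "you define the upon-recognition action" part, here's a minimal sketch of how an application might wire recognized gestures to its own actions. The GestureSensor class, its poll() method, and the gesture names are hypothetical placeholders, not the real SDK for the device described above:

import time

class GestureSensor:
    """Hypothetical stand-in for the sensor's SDK: poll() returns the name of
    a gesture recognized in the latest frame, or None."""
    def poll(self):
        return None  # the real device would report e.g. "swipe_left" here

# The application, not the sensor, decides what each recognized gesture does.
actions = {
    "swipe_left":  lambda: print("previous page"),
    "swipe_right": lambda: print("next page"),
    "circle":      lambda: print("open radial menu"),
}

sensor = GestureSensor()
while True:
    gesture = sensor.poll()
    if gesture in actions:
        actions[gesture]()       # the user-defined "upon recognition" action
    time.sleep(1 / 120)          # the sensor itself runs between ~32 and 200 fps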

Amit Shoham
Offline
Last seen: 27 weeks 6 days ago
Level 4: Thaumaturgist
Joined: 2011-05-28
Points: 60

What you're describing sounds a lot like the 'Leap Motion' product.  Check it out: http://www.embedded-vision.com/news/2012/05/30/evolution-gesture-interfa...

I do think that there are lots of cool things you can do with this type of control interface.  My main concern with these UIs is that it seems to me that touch is extremely intuitive, accurate, and reliable.  I see a lot of people replacing touch-based interfaces with touchless gesture interfaces (based on some sort of vision sensor, whether an RGB camera, IR, or some other sort of image sensor) just because they can--and the end result is less reliable, less intuitive, consumes more power, etc.  I think that ultimately vision-based gesture interfaces should NOT replace touch-based interfaces, but complement them.  There are good uses for vision-based interfaces, but the applications must come along that make truly appropriate use of their capabilities, rather than just use them for the whiz-bang factor.  So far, it seems to me that those applications are few and far between.


RobbySun
Offline
Last seen: 1 year 4 weeks ago
Level 3: Conjurer
Joined: 2011-05-27
Points: 42

Some disabled people might actually benefit from close-range gesture control. For example, blind people can't use a touch screen. People in a wheelchair may find it difficult to reach a touch screen installed high up on a wall. In both cases, close-range gesture control would be a nice supplement to (but not a replacement for) the touch screen, if the technology is smart and robust enough to be practical.

Amit Shoham
Offline
Last seen: 27 weeks 6 days ago
Level 4: Thaumaturgist
Joined: 2011-05-28
Points: 60

Good point Robby.  Actually, blind people can use a touch screen with this software:

http://www.wired.com/gadgetlab/2011/10/touchscreen-braille-writer/

but I still think you make a good point.  For some disabled people, close-range gesture control provides some value (and long-range gesture control provides even more value).

To be clear, I'm not saying that close-range gesture control has no value.  In fact, I believe it has value far beyond helping a small minority of disabled people.  But it seems to me that the companies making the gesture recognition software haven't discovered the real value yet.  I see a lot of demos where gesture control is used to browse through some photos, or to start/stop a media player, and I can't help thinking "why in the world would I ever want to use my laptop/phone/tablet this way?"  The demos inevitably end up looking awkward, less responsive, and more error-prone compared to a touchscreen.

I was hoping to spur more active discussion and unearth some good ideas here... Let's try and take this a step further :)

I think it's hard to beat the touchscreen for intuitive and responsive control of anything that can easily be represented in two dimensions.  But a 3D interface may be a more intuitive way to manipulate 3D models.  Imagine this scenario:  I'm working on some 3D animation.  I hold up my right hand, and with my left hand I select an object using the touchscreen.  For each of the fingers on my right hand, the fingertip becomes a reference point in 3D space that is associated with the selected object.  If I had two fingers extended on my right hand at the time I selected the object, then I now have two reference points which I can use to rotate, move, and resize the object any way I want in 3D space.  If I had three or more fingers extended when I selected the object, then I now have three, four, or five reference points. The additional reference points allow me to twist and distort the object in all sorts of ways.  Once I've contorted the object to the desired pose, position, and orientation, I simply tap the touchscreen again with my left hand and the results are "frozen."  Used this way--if the gesture control software and hardware is sufficiently accurate and responsive--it could potentially provide much more intuitive control of 3D objects than a touchscreen can.
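
To show how far two reference points already get you, here's a minimal sketch (assuming nothing more than 3D fingertip positions from the tracker, in meters) of turning "where the two fingertips were when I selected the object" and "where they are now" into a move/rotate/resize transform. The function name and the numpy-based math are mine, purely for illustration:

import numpy as np

def two_point_transform(p0, p1, q0, q1):
    """Build a function mapping object-space points to new positions, given
    where two fingertip reference points were at selection time (p0, p1)
    and where they are now (q0, q1)."""
    p0, p1, q0, q1 = (np.asarray(x, dtype=float) for x in (p0, p1, q0, q1))
    a, b = p1 - p0, q1 - q0
    s = np.linalg.norm(b) / np.linalg.norm(a)            # resize factor
    a_hat, b_hat = a / np.linalg.norm(a), b / np.linalg.norm(b)
    axis = np.cross(a_hat, b_hat)
    sin, cos = np.linalg.norm(axis), float(np.dot(a_hat, b_hat))
    if sin < 1e-9:
        # Degenerate case: the fingertip directions are (anti)parallel, so
        # fall back to no rotation (twist about the finger axis is ambiguous
        # with only two points anyway).
        R = np.eye(3)
    else:
        k = axis / sin
        K = np.array([[0, -k[2], k[1]],
                      [k[2], 0, -k[0]],
                      [-k[1], k[0], 0]])
        R = np.eye(3) + sin * K + (1 - cos) * (K @ K)    # Rodrigues' rotation
    return lambda v: q0 + s * (R @ (np.asarray(v, dtype=float) - p0))

# Example: the fingertips started 10 cm apart along x; now they're 20 cm apart
# along y, so the object should translate, rotate 90 degrees, and double in size.
move = two_point_transform([0, 0, 0], [0.1, 0, 0], [0.02, 0, 0], [0.02, 0.2, 0])
print(move([0.05, 0, 0]))   # the midpoint of the original segment follows along

With three or more extended fingers you'd fit a more general (affine or per-point) deformation instead of a single rigid transform, which is where the twisting and distorting would come in.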

I think that the various companies now creating the hardware and software for close range gesture control are trusting that once the technology is available, someone will find some great use for it.  They're probably right--but I think they'll have a much easier time monetizing their investment if they can more clearly demonstrate the value of the technology.  I haven't seen any such demos yet...