Week 11 (July 27 - Aug 2)

2 minute read | Written by Nickil Maveli

We chose “from-to”, with a proximity window of 4 word tokens between “from” and “to”, as the initial lexical trigger template to map to the construal dimension Prominence. We also identified “first-second”, “firstly-secondly” and “here-then” as candidate triggers, but could not find many relevant hand gestures in the PATS dataset.
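To make the template concrete, here is a minimal sketch of how such a trigger could be matched on a transcript. It assumes whitespace tokenization and reads the "proximity window of 4" as "at most 4 tokens between the two words"; the function name `find_from_to` and the parameter `max_gap` are hypothetical, not part of the project code.

```python
def find_from_to(transcript, max_gap=4):
    """Return (from_index, to_index) token pairs where "to" follows "from"
    with at most `max_gap` word tokens in between (assumed reading of the window)."""
    tokens = transcript.lower().split()  # naive whitespace tokenization (assumption)
    matches = []
    for i, tok in enumerate(tokens):
        if tok != "from":
            continue
        # Scan ahead: positions i+1 .. i+max_gap+1 leave 0..max_gap tokens in between.
        for j in range(i + 1, min(i + max_gap + 2, len(tokens))):
            if tokens[j] == "to":
                matches.append((i, j))
                break  # keep only the nearest "to" for this "from"
    return matches


pairs = find_from_to("players can either travel from the u.s. to Mexico by plane")
```

Here `pairs` holds the token positions of "from" and "to", which can then be aligned with the video frames of the clip.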

Consider a sample video of a talk show host, Jimmy Fallon, taken from the PATS dataset with a pre-defined start and end time:

Transcript: “with Mexico that players can either travel from the u.s. to Mexico by plane or just walked past the wall that still won’t be built it’s up to you you can choose”

The video frames corresponding to the “from-to” lexical trigger for the anticipated hand gesture are shown below:

Frame 1

Handedness | Axis | Shape | Direction | Gesture
Both Hands | Horizontal | Straight | Diagonal right up | Yes
Lexical prompt: "travel from the"

Frame 2

Handedness | Axis | Shape | Direction | Gesture
Both Hands | Horizontal | Straight | Leftward | Yes
Lexical prompt: "u.s. to"

Frame 3

Handedness | Axis | Shape | Direction | Gesture
Both Hands | Horizontal | Straight | Diagonal left down | Yes
Lexical prompt: "Mexico by"

Frame 4

Handedness | Axis | Shape | Direction | Gesture
Both Hands | Horizontal | Straight | Rightward | Yes
Lexical prompt: "plane"

Now, consider a sample video of a talk show host, Seth Meyers, taken from the PATS dataset with a pre-defined start and end time:

Transcript: “$25,000 do you know how short a flight is from DC to Philadelphia if you tried to watch Thelma and Louise on that flight you wouldn’t meet Louie Susan Sarandon on the bar Tyler so tan prices Medicaid patients should lose their health care but has no problem spending tens of thousands of dollars on private jets and he’s not the only one treasury secretary Steve mnuchin also came”

The video frames corresponding to the “from-to” lexical trigger, where the anticipated hand gesture unexpectedly does not occur, are shown below:

Frame 1

Handedness | Axis | Shape | Direction | Gesture
- | - | - | - | No
Lexical prompt: "flight is from"

Frame 2

Handedness | Axis | Shape | Direction | Gesture
- | - | - | - | No
Lexical prompt: "from DC"

Frame 3

Handedness | Axis | Shape | Direction | Gesture
- | - | - | - | No
Lexical prompt: "to Philadel"

Frame 4

Handedness | Axis | Shape | Direction | Gesture
- | - | - | - | No
Lexical prompt: "elphia"

As these frames show, relying on the text of the “from-to” lexical trigger alone would not identify hand gestures reliably, because different speakers gesture differently in the same lexical context. Hence the need to build a frame-level hand gesture classification system assisted by the lexical trigger.
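One way the lexical trigger could assist such a system is by narrowing down which frames to classify. The sketch below is only an illustration under stated assumptions: it supposes each transcript word comes with start and end times in seconds, and that the clip's frame rate is known. The names `trigger_frames` and `word_times`, and the timing values in the usage line, are hypothetical.

```python
import math


def trigger_frames(word_times, start_idx, end_idx, fps):
    """Return the frame indices spanned by words start_idx..end_idx (inclusive).

    word_times: list of (start_sec, end_sec) per word token (assumed alignment format).
    fps: frames per second of the clip (must be supplied by the caller).
    """
    t0 = word_times[start_idx][0]   # start time of the first trigger word
    t1 = word_times[end_idx][1]     # end time of the last trigger word
    first = math.floor(t0 * fps)    # first frame touching the span
    last = math.ceil(t1 * fps)      # last frame touching the span
    return list(range(first, last + 1))


# Made-up timings for "from", "DC", "to", purely for illustration.
frames = trigger_frames([(0.0, 0.5), (0.5, 1.0), (1.0, 1.5)], 0, 2, fps=10)
```

Each frame in `frames` would then be passed to the gesture classifier, so that the trigger proposes candidate frames and the visual model decides whether a gesture is actually present.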
