Week 11 (July 27 - Aug 2)

2 minute read | Written by Nickil Maveli

We chose “from-to”, with a proximity window of 4 word tokens between “from” and “to”, as the initial lexical trigger template to map to the construal dimension Prominence. We also identified “first-second”, “firstly-secondly” and “here-then” as candidate triggers, but could not find many relevant hand gestures for them in the PATS dataset.
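
As a rough illustration of the “from-to” template, the sketch below scans a whitespace-tokenized transcript for “from” followed by “to”. It assumes the window means at most 4 intervening word tokens, which is one possible reading. The function name `find_from_to` and the tokenization are hypothetical and are not the project's actual implementation.

```python
def find_from_to(transcript: str, window: int = 4) -> list[tuple[int, int]]:
    """Return (from_index, to_index) token positions for each "from-to" trigger.

    Assumption: a trigger is valid when at most `window` word tokens
    lie between "from" and the first following "to".
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokens = transcript.lower().split()
    matches = []
    for i, token in enumerate(tokens):
        if token != "from":
            continue
        # Inspect the tokens after "from"; the last position checked
        # leaves exactly `window` tokens in between.
        for j in range(i + 1, min(i + window + 2, len(tokens))):
            if tokens[j] == "to":
                matches.append((i, j))
                break
    return matches


if __name__ == "__main__":
    sample = "players can either travel from the u.s. to Mexico by plane"
    print(find_from_to(sample))
```

A text-only match like this can only propose candidate spans in the transcript; it does not say whether a gesture is actually produced in the corresponding frames.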

Consider a sample video of a talk show host, Jimmy Fallon, taken from the PATS dataset with a pre-defined start and end time:

Transcript: “with Mexico that players can either travel from the u.s. to Mexico by plane or just walked past the wall that still won’t be built it’s up to you you can choose”

The video frames corresponding to the “from-to” lexical trigger for the anticipated hand gesture are shown below:

Frame 1

**Lexical prompt:** "travel from the"
Handedness	Axis	Shape	Direction	Gesture
Both Hands	Horizontal	Straight	Diagonal right up	Yes

Frame 2

**Lexical prompt:** "u.s. to"
Handedness	Axis	Shape	Direction	Gesture
Both Hands	Horizontal	Straight	Leftward	Yes

Frame 3

**Lexical prompt:** "Mexico by"
Handedness	Axis	Shape	Direction	Gesture
Both Hands	Horizontal	Straight	Diagonal left down	Yes

Frame 4

**Lexical prompt:** "plane"
Handedness	Axis	Shape	Direction	Gesture
Both Hands	Horizontal	Straight	Rightward	Yes

Now, consider a sample video of a talk show host, Seth Meyers, taken from the PATS dataset with a pre-defined start and end time:

Transcript: “$25,000 do you know how short a flight is from DC to Philadelphia if you tried to watch Thelma and Louise on that flight you wouldn’t meet Louie Susan Sarandon on the bar Tyler so tan prices Medicaid patients should lose their health care but has no problem spending tens of thousands of dollars on private jets and he’s not the only one treasury secretary Steve mnuchin also came”

The video frames corresponding to the “from-to” lexical trigger, where the anticipated hand gesture does not occur, are shown below:

Frame 1

**Lexical prompt:** "flight is from"
Handedness	Axis	Shape	Direction	Gesture
-	-	-	-	No

Frame 2

**Lexical prompt:** "from DC"
Handedness	Axis	Shape	Direction	Gesture
-	-	-	-	No

Frame 3

**Lexical prompt:** "to Philadel"
Handedness	Axis	Shape	Direction	Gesture
-	-	-	-	No

Frame 4

**Lexical prompt:** "elphia"
Handedness	Axis	Shape	Direction	Gesture
-	-	-	-	No

As these frames show, relying on the text of the “from-to” lexical trigger alone to identify hand gestures does not work, because different speakers gesture differently in the same lexical context. Hence the need for a frame-level hand gesture classification system assisted by the lexical trigger.
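
To make the comparison concrete, the following sketch stores the eight frame annotations listed above as records and computes, per speaker, the fraction of trigger frames that carry a gesture. The `FrameAnnotation` class and the `gesture_rate_by_speaker` helper are hypothetical names introduced for illustration. Only the annotation values come from the tables in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FrameAnnotation:
    """One annotated frame; fields mirror the table columns in this post."""
    speaker: str
    lexical_prompt: str
    handedness: Optional[str]
    axis: Optional[str]
    shape: Optional[str]
    direction: Optional[str]
    gesture: bool


def _fallon(prompt: str, direction: str) -> FrameAnnotation:
    # All Jimmy Fallon frames share handedness, axis and shape.
    return FrameAnnotation("Jimmy Fallon", prompt, "Both Hands",
                           "Horizontal", "Straight", direction, True)


def _meyers(prompt: str) -> FrameAnnotation:
    # No gesture was annotated in the Seth Meyers frames.
    return FrameAnnotation("Seth Meyers", prompt, None, None, None, None, False)


ANNOTATIONS = [
    _fallon("travel from the", "Diagonal right up"),
    _fallon("u.s. to", "Leftward"),
    _fallon("Mexico by", "Diagonal left down"),
    _fallon("plane", "Rightward"),
    _meyers("flight is from"),
    _meyers("from DC"),
    _meyers("to Philadel"),
    _meyers("elphia"),
]


def gesture_rate_by_speaker(annotations: list[FrameAnnotation]) -> dict[str, float]:
    """Fraction of annotated trigger frames that contain a gesture, per speaker."""
    counts: dict[str, tuple[int, int]] = {}
    for a in annotations:
        hits, total = counts.get(a.speaker, (0, 0))
        counts[a.speaker] = (hits + int(a.gesture), total + 1)
    return {speaker: hits / total for speaker, (hits, total) in counts.items()}


if __name__ == "__main__":
    print(gesture_rate_by_speaker(ANNOTATIONS))
```

With only two clips the rates are anecdotal, but the same lexical trigger gives opposite outcomes for the two speakers, which is why the frame-level visual signal is needed alongside the text.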