The Maptask corpus: Case of L2 Russian speakers

Olga Goubanova

We describe a corpus of task-oriented Map Task dialogues between native speakers of American English and L2 English speakers whose first language is Russian. A total of 24 subjects participated in the experiments: 12 American speakers (6 male, 6 female) and 12 Russian speakers (4 male, 8 female). A subset of three maps from quadruple 1 of the HCRC Map Task corpus was used. We wanted to preserve the original experimental design with respect to the number of dialogues in which each subject participates: 4 in total, two as an instruction giver with the same map and two as an instruction follower with two different maps. This resulted in a sextuple of three native-L2 speaker pairs, with 12 dialogues per condition. The corpus consists of 48 dialogues subdivided into 4 sets according to experimental condition; the conditions differed in whether some of the landmark labels were missing on the instruction giver's map, on the instruction follower's map, or on both maps. Approximately 30% of the landmarks lacked identifying labels. We present a preliminary analysis of the differences in the communicative strategies employed by native and L2 speakers. In particular, we consider the process of creating referring expressions for the landmarks. We also present some examples of how errors made by L2 speakers during the conversations affect the creation of referring expressions. Finally, we present some corpus statistics, e.g., the duration of each conversation, the deviation between the instruction giver's and the instruction follower's routes, and the number of speaker turns.