Human question data set

Data obtained in Experiment 1 of Rothe, Lake, & Gureckis (2016), "Asking and evaluating natural language questions." [PDF]

Questions

df.allBySubj.csv contains all 605 questions, plus per-question information such as the trial, the subject, and the question type.

Preview

| trial| subj|bin                |type       |paras | para_1| para_2| para_3|full                            |text                                  |
|-----:|----:|:------------------|:----------|:-----|------:|------:|------:|:-------------------------------|:-------------------------------------|
|     1|    7|shipsize(red)      |shipsize   |2     |      2|     NA|     NA|How many tiles is the red ship? |How many blocks is the red ship?      |
|     2|    7|horizontal(purple) |horizontal |3     |      3|     NA|     NA|Is the purple ship horizontal?  |Is the purple ship laying horizontal? |
|     3|    7|shipsize(red)      |shipsize   |2     |      2|     NA|     NA|How many tiles is the red ship? |How many blocks is the red ship?      |

Each row refers to one natural language question that a person asked in the context of a partly revealed game board.
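As a rough illustration of working with this table, the sketch below tallies questions by the `type` column. It is an assumption that df.allBySubj.csv is a plain comma-separated file whose header matches the preview; the inline sample simply re-types the three preview rows, and `count_types` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# The three preview rows, re-typed as CSV. The real file's delimiter and
# quoting are assumed here, not confirmed.
SAMPLE = """\
trial,subj,bin,type,paras,para_1,para_2,para_3,full,text
1,7,shipsize(red),shipsize,2,2,NA,NA,How many tiles is the red ship?,How many blocks is the red ship?
2,7,horizontal(purple),horizontal,3,3,NA,NA,Is the purple ship horizontal?,Is the purple ship laying horizontal?
3,7,shipsize(red),shipsize,2,2,NA,NA,How many tiles is the red ship?,How many blocks is the red ship?
"""


def count_types(handle):
    """Count how many rows fall under each value of the `type` column."""
    return Counter(row["type"] for row in csv.DictReader(handle))


type_counts = count_types(io.StringIO(SAMPLE))
```

If the assumption about the file layout holds, the same function could be applied to the full data set by passing a handle from `open("df.allBySubj.csv", newline="")` instead of the in-memory sample.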

Columns

In addition, → questions_raw/ contains every question that was generated, including those discarded as ambiguous or invalid.

Contexts

Context 1-18

These partly revealed game boards are in → contexts/, with the following coding:

Example: Trial 13

H,H,H,H,W,H
H,H,H,W,R,H
H,W,W,H,R,W
B,H,H,W,W,H
H,W,H,H,W,H
H,H,H,H,H,H
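
A board in this format can be read into a grid with a few lines of Python. The sketch below parses the Trial 13 example and counts how often each code occurs. It treats the letters as opaque codes and assumes nothing about what they stand for; `parse_board` is a hypothetical helper name, and how the files in contexts/ are named is not assumed either, so the board is given inline.

```python
from collections import Counter

# Trial 13 board, copied from the example above.
TRIAL_13 = """\
H,H,H,H,W,H
H,H,H,W,R,H
H,W,W,H,R,W
B,H,H,W,W,H
H,W,H,H,W,H
H,H,H,H,H,H
"""


def parse_board(text):
    """Split a comma-separated board into a list of rows of single-letter codes."""
    return [line.split(",") for line in text.strip().splitlines()]


board = parse_board(TRIAL_13)

# Tally every cell code across the whole grid.
counts = Counter(cell for row in board for cell in row)
```

For this board the grid is 6 by 6, and `counts` holds 24 `H`, 9 `W`, 2 `R`, and 1 `B` cells.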