
Human question data set

Data from Experiment 1 of Rothe, Lake, & Gureckis (2016), "Asking and evaluating natural language questions." [PDF]

Questions

df.allBySubj.csv contains all 605 questions, each annotated with its subject, context, and program representation.

Preview

| trial| subj|bin                |type       |paras | para_1| para_2| para_3|full                            |text                                  |
|-----:|----:|:------------------|:----------|:-----|------:|------:|------:|:-------------------------------|:-------------------------------------|
|     1|    7|shipsize(red)      |shipsize   |2     |      2|     NA|     NA|How many tiles is the red ship? |How many blocks is the red ship?      |
|     2|    7|horizontal(purple) |horizontal |3     |      3|     NA|     NA|Is the purple ship horizontal?  |Is the purple ship laying horizontal? |
|     3|    7|shipsize(red)      |shipsize   |2     |      2|     NA|     NA|How many tiles is the red ship? |How many blocks is the red ship?      |

Each row refers to one natural language question that a person asked in the context of a partly revealed game board.
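As a rough sketch of how the file could be read, the snippet below loads rows with Python's standard `csv` module and turns `NA` entries into `None`. The helper name `load_questions` is hypothetical, and the inline sample only mirrors the preview above; the real file is assumed to be comma-separated with a header row and may differ in details (for example, an extra index column).

```python
import csv
import io
from collections import Counter


def load_questions(handle):
    """Read question rows from an open CSV handle, mapping 'NA' to None."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({key: (None if value == "NA" else value)
                     for key, value in row.items()})
    return rows


# Inline sample copied from the preview table (assumed CSV layout).
sample = io.StringIO(
    "trial,subj,bin,type,paras,para_1,para_2,para_3,full,text\n"
    "1,7,shipsize(red),shipsize,2,2,NA,NA,"
    "How many tiles is the red ship?,How many blocks is the red ship?\n"
    "2,7,horizontal(purple),horizontal,3,3,NA,NA,"
    "Is the purple ship horizontal?,Is the purple ship laying horizontal?\n"
    "3,7,shipsize(red),shipsize,2,2,NA,NA,"
    "How many tiles is the red ship?,How many blocks is the red ship?\n"
)

questions = load_questions(sample)

# Tally how often each question type occurs in the sample.
type_counts = Counter(q["type"] for q in questions)

# With the real file (path assumed relative to the repository root):
# with open("df.allBySubj.csv", newline="") as f:
#     questions = load_questions(f)
```

All values are kept as strings here; convert `trial`, `subj`, and the parameter columns to integers if numeric comparisons are needed.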

Columns

trial = Context ID = Partly revealed game board (see below)
subj = Subject ID
bin = Program representation of the question
type = Program representation -- function
paras = Program representation -- parameters
para_1 = Program representation -- parameter 1
para_2 = Program representation -- parameter 2
para_3 = Program representation -- parameter 3
full = A standardized example question
text = Literal question as written by the subject; also available in → questions_clean/

In addition, → questions_raw/ contains all questions that were generated, that is, including those discarded as ambiguous or invalid.
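In the preview, the `bin` values look like a function name followed by its arguments in parentheses, e.g. `shipsize(red)`. The sketch below splits such a string into its two parts. It assumes that flat `name(arg1,arg2)` form holds throughout and does not handle nested expressions; `split_bin` is a hypothetical helper, not part of the data set.

```python
import re

# Matches "name(arguments)"; this shape is assumed from the preview rows.
BIN_PATTERN = re.compile(r"^\s*(\w+)\((.*)\)\s*$")


def split_bin(bin_value):
    """Split a program string into (function name, list of argument strings)."""
    match = BIN_PATTERN.match(bin_value)
    if match is None:
        # No parentheses found: treat the whole string as a bare function name.
        return bin_value.strip(), []
    name, arg_text = match.groups()
    args = [arg.strip() for arg in arg_text.split(",") if arg.strip()]
    return name, args


name, args = split_bin("shipsize(red)")
```

For the rows shown in the preview, the function name recovered this way matches the `type` column.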

Contexts

Contexts 1-18

These partly revealed game boards are in → contexts/, with the following coding:

H = Hidden
W = Water
B = Blue ship
R = Red ship
P = Purple ship

Example: Trial 13

H,H,H,H,W,H
H,H,H,W,R,H
H,W,W,H,R,W
B,H,H,W,W,H
H,W,H,H,W,H
H,H,H,H,H,H
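A board in this coding can be read into a grid and summarized as in the minimal sketch below. It assumes each context is stored as comma-separated rows like the Trial 13 example; the file layout inside contexts/ is not described here, so the board is given inline. `TILE_CODES` and `parse_board` are hypothetical names.

```python
from collections import Counter

# Letter codes as listed in the coding scheme above.
TILE_CODES = {
    "H": "hidden",
    "W": "water",
    "B": "blue ship",
    "R": "red ship",
    "P": "purple ship",
}


def parse_board(text):
    """Parse comma-separated rows into a list of lists of tile codes."""
    board = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split(",")]
        unknown = set(cells) - set(TILE_CODES)
        if unknown:
            raise ValueError(f"unknown tile codes: {sorted(unknown)}")
        board.append(cells)
    return board


trial_13 = """
H,H,H,H,W,H
H,H,H,W,R,H
H,W,W,H,R,W
B,H,H,W,W,H
H,W,H,H,W,H
H,H,H,H,H,H
"""

board = parse_board(trial_13)

# Count how many tiles of each kind are visible or hidden.
counts = Counter(cell for row in board for cell in row)
```

For Trial 13 this gives a 6 x 6 grid with 24 hidden tiles, 9 water tiles, 2 red ship tiles, and 1 blue ship tile.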