Human question data set
Data obtained in Experiment 1 of Rothe, Lake, & Gureckis (2016). Asking and evaluating natural language questions. [PDF]
Questions
df.allBySubj.csv contains all 605 questions, together with additional information about each one (see Columns below).
Preview
| trial| subj|bin |type |paras | para_1| para_2| para_3|full |text |
|-----:|----:|:------------------|:----------|:-----|------:|------:|------:|:-------------------------------|:-------------------------------------|
| 1| 7|shipsize(red) |shipsize |2 | 2| NA| NA|How many tiles is the red ship? |How many blocks is the red ship? |
| 2| 7|horizontal(purple) |horizontal |3 | 3| NA| NA|Is the purple ship horizontal? |Is the purple ship laying horizontal? |
| 3| 7|shipsize(red) |shipsize |2 | 2| NA| NA|How many tiles is the red ship? |How many blocks is the red ship? |
Each row refers to one natural language question that a person asked in the context of a partly revealed game board.
Columns
- trial = Context ID = Partly revealed game board (see below)
- subj = Subject ID
- bin = Program representation of the question
- type = Program representation -- function
- paras = Program representation -- parameters
- para_1 = Program representation -- parameter 1
- para_2 = Program representation -- parameter 2
- para_3 = Program representation -- parameter 3
- full = A standardized example question
- text = Literal question generated by the subject

The literal questions are also available in → questions_clean/.
In addition, → questions_raw/ contains all questions that were generated, that is, including those discarded as ambiguous or invalid.
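As a rough illustration of how the columns above might be used, here is a minimal Python sketch that reads rows in this layout and counts questions per program type. It assumes the file is comma-separated with a header row and uses the string NA for missing parameters; the actual file may differ (for example, it could carry an extra row-name column). The function name `load_questions` and the inline `SAMPLE` text are hypothetical. The sample rows are copied from the preview table, not read from the real file.

```python
import csv
import io
from collections import Counter


def load_questions(source):
    """Read question rows from a file-like object.

    Assumption: comma-separated values with a header row, where the
    string "NA" marks a missing value (mapped to None here).
    """
    reader = csv.DictReader(source)
    return [
        {key: (None if value == "NA" else value) for key, value in row.items()}
        for row in reader
    ]


# Hypothetical stand-in for df.allBySubj.csv, built from the preview rows.
SAMPLE = """trial,subj,bin,type,paras,para_1,para_2,para_3,full,text
1,7,shipsize(red),shipsize,2,2,NA,NA,How many tiles is the red ship?,How many blocks is the red ship?
2,7,horizontal(purple),horizontal,3,3,NA,NA,Is the purple ship horizontal?,Is the purple ship laying horizontal?
3,7,shipsize(red),shipsize,2,2,NA,NA,How many tiles is the red ship?,How many blocks is the red ship?
"""

rows = load_questions(io.StringIO(SAMPLE))

# Count how often each program type (the "type" column) occurs.
type_counts = Counter(row["type"] for row in rows)
print(type_counts)
```

To try this on the real data, pass an open file handle instead of the sample text, for instance `open("df.allBySubj.csv", newline="", encoding="utf-8")`, adjusting the path to wherever the file lives in your checkout.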
Contexts
These partly revealed game boards are in → contexts/, with the following coding:
- H = Hidden
- W = Water
- B = Blue ship
- R = Red ship
- P = Purple ship
Example: Trial 13
H,H,H,H,W,H
H,H,H,W,R,H
H,W,W,H,R,W
B,H,H,W,W,H
H,W,H,H,W,H
H,H,H,H,H,H
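
The coding above can be turned into a small parser. The following Python sketch assumes each context is stored as plain text with one comma-separated row per line, as in the Trial 13 example; the names `TILE_CODES` and `parse_context` are hypothetical and not part of the data set.

```python
from collections import Counter

# Tile codes as listed in the coding scheme.
TILE_CODES = {
    "H": "hidden",
    "W": "water",
    "B": "blue ship",
    "R": "red ship",
    "P": "purple ship",
}


def parse_context(text):
    """Parse a board into a list of rows, each a list of tile codes.

    Assumption: one comma-separated row per line. Unknown codes raise
    a ValueError so malformed input is not silently accepted.
    """
    grid = [line.strip().split(",") for line in text.strip().splitlines()]
    for row in grid:
        for code in row:
            if code not in TILE_CODES:
                raise ValueError(f"unknown tile code: {code!r}")
    return grid


# The Trial 13 board from the example.
TRIAL_13 = """H,H,H,H,W,H
H,H,H,W,R,H
H,W,W,H,R,W
B,H,H,W,W,H
H,W,H,H,W,H
H,H,H,H,H,H
"""

grid = parse_context(TRIAL_13)

# Tally tiles by code, e.g. to see how much of the board is still hidden.
counts = Counter(code for row in grid for code in row)
print(counts)
```

For Trial 13 this gives a 6x6 grid with 24 hidden tiles, 9 water tiles, 2 red ship tiles, and 1 blue ship tile.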