Supplementary materials for the paper "Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences" (Emelin et al., 2021)

<p align="center"> <img src="images/example.png" /> </p>

The dataset is now also available on Hugging Face: https://huggingface.co/datasets/demelin/moral_stories
The full paper is available here: https://aclanthology.org/2021.emnlp-main.54.pdf

Abstract: In social settings, much of human behavior is governed by unspoken rules of conduct. For artificial systems to be fully integrated into social environments, adherence to such norms is a central prerequisite. We investigate whether contemporary NLG models can function as behavioral priors for systems deployed in social settings by generating action hypotheses that achieve predefined goals under moral constraints. Moreover, we examine if models can anticipate likely consequences of (im)moral actions, or explain why certain actions are preferable by generating relevant norms. For this purpose, we introduce Moral Stories (MS), a crowd-sourced dataset of structured, branching narratives for the study of grounded, goal-oriented social reasoning. Finally, we propose decoding strategies that effectively combine multiple expert models to significantly improve the quality of generated actions, consequences, and norms compared to strong baselines, e.g. through abductive reasoning.


Dataset

Overview

The Moral Stories dataset is available at https://tinyurl.com/moral-stories-data. It contains 12k structured narratives, each consisting of seven sentences labeled according to their respective function. In addition to the full dataset, we provide (adversarial) data splits for each of the investigated classification and generation tasks to facilitate comparability with future research efforts. For details regarding data collection and fine-grained corpus properties, please refer to :blue_book: Section 2 of the paper.
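
As a rough illustration of the seven-sentence structure described above, the sketch below models one story as a Python dataclass and reads out its two branches. The field names, the JSON-lines input format, and the example story are assumptions made for illustration only (the story is invented, not taken from the dataset); please check the released files for the actual keys and format.

```python
import json
from dataclasses import dataclass, fields
from typing import List


# NOTE: these seven field names are assumed for illustration; verify them
# against the released data files before relying on them.
@dataclass(frozen=True)
class MoralStory:
    norm: str
    situation: str
    intention: str
    moral_action: str
    moral_consequence: str
    immoral_action: str
    immoral_consequence: str

    def branch(self, moral: bool) -> List[str]:
        """Return one of the two narrative paths as an ordered list of sentences."""
        context = [self.situation, self.intention]
        if moral:
            return context + [self.moral_action, self.moral_consequence]
        return context + [self.immoral_action, self.immoral_consequence]


def parse_story(json_line: str) -> MoralStory:
    """Parse one JSON object into a MoralStory, ignoring any extra keys."""
    record = json.loads(json_line)
    return MoralStory(**{f.name: record[f.name] for f in fields(MoralStory)})


# Invented example record (hypothetical ID and sentences, not from the dataset).
line = json.dumps({
    "ID": "example-0",
    "norm": "It is good to return lost property to its owner.",
    "situation": "Sam finds a wallet on a park bench.",
    "intention": "Sam wants to deal with the wallet.",
    "moral_action": "Sam hands the wallet in at the nearby police station.",
    "moral_consequence": "The owner picks up the wallet the next day.",
    "immoral_action": "Sam pockets the cash and leaves the wallet on the bench.",
    "immoral_consequence": "The owner never gets the money back.",
})

story = parse_story(line)
for sentence in story.branch(moral=True):
    print(sentence)
```

Both branches share the same situation and intention, so calling `story.branch(moral=False)` yields the alternative path in which the norm is violated.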

Story examples

<p align="center"> <img src="images/stories.png" /> </p>

Quickstart guide: Training and evaluating models on Moral Stories

  1. To quickly start training your own models, see the example scripts provided in <code>bash_scripts/</code>.
  2. The classification and generation models can be trained and evaluated using <code>experiments/run_baseline_experiment.py</code>.
  3. Classification models are evaluated with the same script. To compute metrics for generation models, run <code>experiments/compute_generation_metrics.py</code> on the model generations you specify.

Codebase details

We provide code for replicating the data curation steps and the experiments discussed in our paper. <code>requirements.txt</code> lists the libraries used by the codebase. Example shell scripts used to run each experiment can be found in <code>bash_scripts/</code>, whereas their Beaker analogues are provided in <code>beaker_scripts/</code>. The following briefly describes individual files included in the codebase:

Dataset collection

(:blue_book: See Section 2 of the paper.)

Split creation

(:blue_book: See Section 3 of the paper.)

Experiments

(:blue_book: See Sections 3 and 4 of the paper.)

Human evaluation

(:blue_book: See Section 4 of the paper.)


Citation

@inproceedings{emelin-etal-2021-moral,
    title = "Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences",
    author = "Emelin, Denis  and
      Le Bras, Ronan  and
      Hwang, Jena D.  and
      Forbes, Maxwell  and
      Choi, Yejin",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.54",
    doi = "10.18653/v1/2021.emnlp-main.54",
    pages = "698--718",
    abstract = "In social settings, much of human behavior is governed by unspoken rules of conduct rooted in societal norms. For artificial systems to be fully integrated into social environments, adherence to such norms is a central prerequisite. To investigate whether language generation models can serve as behavioral priors for systems deployed in social settings, we evaluate their ability to generate action descriptions that achieve predefined goals under normative constraints. Moreover, we examine if models can anticipate likely consequences of actions that either observe or violate known norms, or explain why certain actions are preferable by generating relevant norm hypotheses. For this purpose, we introduce Moral Stories, a crowd-sourced dataset of structured, branching narratives for the study of grounded, goal-oriented social reasoning. Finally, we propose decoding strategies that combine multiple expert models to significantly improve the quality of generated actions, consequences, and norms compared to strong baselines.",
}