SPDX-License-Identifier: CC0-1.0

Wisesight Sentiment Corpus

Social media messages in Thai, each with a sentiment label (positive, neutral, negative, or question). The corpus contains 26,737 messages and is dedicated to the public domain under CC0 1.0 Universal.
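
As a rough illustration of the data described above, the sketch below treats each sample as a (message, label) pair and tallies the samples per label. The helper `count_labels`, the label strings, and the placeholder messages are assumptions made for this example only; they are not taken from the corpus files, which may encode labels differently.

```python
from collections import Counter
from typing import Iterable, Tuple

# The four sentiment categories named in the corpus description.
# The exact strings are an assumption for this sketch.
LABELS = ("positive", "neutral", "negative", "question")


def count_labels(samples: Iterable[Tuple[str, str]]) -> Counter:
    """Count messages per label, rejecting labels outside the four categories."""
    counts = Counter({label: 0 for label in LABELS})
    for text, label in samples:
        if label not in LABELS:
            raise ValueError(f"unexpected label {label!r} for message {text!r}")
        counts[label] += 1
    return counts


# Made-up placeholder messages, not taken from the corpus.
example = [
    ("placeholder message 1", "positive"),
    ("placeholder message 2", "question"),
    ("placeholder message 3", "positive"),
]
counts = count_labels(example)
```

Run over the full corpus, a tally like this should sum to the 26,737 messages stated above; a mismatch would suggest a loading or parsing problem.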

Changelog

Data characteristics and preprocessing

This corpus does not claim to be a statistically representative sample of the Thai language register.

General information:

Data coverage:

Privacy:

Data alterations and modifications:

Further exploration:

Annotation methodology

Corpus file structure

Copyright and disclaimer

This dataset contains social media text extracted from publicly accessible sources on the internet. The selection, organization, curation, and transformation of this dataset are original works that were previously copyrighted. However, the copyright holder has waived all rights to this dataset and dedicated it to the public domain under the Creative Commons Zero v1.0 Universal Public Domain Dedication.

Any trademarks or trade names appearing in the messages belong to their respective owners.

Wisesight (Thailand) Co., Ltd. has assisted in the collection and sentiment labeling of this dataset, but does not necessarily endorse the labels assigned by human annotators. These annotations are for research purposes only and do not represent the professional work Wisesight performs for its clients.

Human annotators do not necessarily agree or disagree with the messages they label, and the labels they assign do not necessarily reflect their personal opinions on the content.

You are free to use this dataset for any purpose, without any restrictions.

Citation

Please cite the following if you make use of the dataset:

Suriyawongkul, Arthit, Ekapol Chuangsuwanich, Pattarawat Chormai, Nitchakarn Chantarapratin, Ponrawee Prasertsom, Jitkapat Sawatphol, Nozomi Yamada, Attapol Rutherford, Charin Polpanumas, and Can Udomcharoenchaikit. “PyThaiNLP/Wisesight Sentiment Corpus with Word Tokenization Label”. Zenodo, 7 November 2024. https://doi.org/10.5281/zenodo.3457446.

BibTeX:

@misc{Suriyawongkul_PyThaiNLP_Wisesight_Sentiment_Corpus_2020,
  author       = {Suriyawongkul, Arthit and
                  Chuangsuwanich, Ekapol and
                  Chormai, Pattarawat and
                  Chantarapratin, Nitchakarn and
                  Prasertsom, Ponrawee and
                  Sawatphol, Jitkapat and
                  Yamada, Nozomi and
                  Rutherford, Attapol and
                  Polpanumas, Charin and
                  Udomcharoenchaikit, Can},
  doi          = {10.5281/zenodo.3457446},
  license      = {CC0-1.0},
  month        = nov,
  publisher    = {Zenodo},
  title        = {{PyThaiNLP/Wisesight Sentiment Corpus with Word Tokenization Label}},
  url          = {https://doi.org/10.5281/zenodo.3457446},
  version      = {v1.1},
  year         = 2024
}

Acknowledgement

We would like to thank:

Additional resources