Awesome
CPsyCoun
CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
<p align="center"> <a href="https://huggingface.co/CAS-SIAT-XinHai/CPsyCounX"><img src="https://img.shields.io/badge/CPsyCounX-yellow" alt="CPsyCounX"></a> <a href="https://github.com/CAS-SIAT-XinHai/CPsyCoun"><img src="https://img.shields.io/badge/GitHub-24292e" alt="github"></a> <a href="https://huggingface.co/datasets/CAS-SIAT-XinHai/CPsyCoun"><img src="https://img.shields.io/badge/CPsyCounD-yellow" alt="CPsyCounD"></a> <a href="https://huggingface.co/datasets/CAS-SIAT-XinHai/CPsyCounR"><img src="https://img.shields.io/badge/CPsyCounR-yellow" alt="CPsyCounR"></a> </p>🔥News
- [Sep. 2024]: Our psychological counseling report dataset CPsyCounR is now available upon reasonable request after signing the Privacy Data Protection Agreement.
- [Jul. 2024]: Paper presentation work: Report | Long talk interviewed by shanghai AI Lab | Short talk interviewed by AI TIME
- [Jul. 2024]: We collaborate with EmoLLM team to launch EmoLLM V3.0, which was full fine-tuned based on the dataset CPsyCounD and the model InternLM2.5-7B-Chat. Model weights: OpenXLab, ModelScope. WebDemo: OpenXLab demo.
- [May. 2024]: Our paper has released on arXiv , check it out!
- [May. 2024]: CPsyCoun has been accepted to 2024 ACL Findings!
- [Apr. 2024]: CPsyCoun has been used in EmoLLM , welcome!
Method
CPsyCoun Framework
The CPsyCoun framework consists of two parts - Data Generation and Automatic Evaluation.
Dialogue Reconstruction
The method Memo2Demo consists of two parts - Memo Conversion and Demo Generation, in order to generate high-quality psychological consultation dialogue from counseling reports.
Counseling Report
Acoording to the China’s National Class II Psychological Counselor Examination and other psychological counseling literature, the counseling report is normalized into six parts: Title, Type, Method, Case Brief, Consultation Process and Experience Thoughts.
- An example of counseling report
CPsyCounD
The high-quality multi-turn dialogue dataset, which has a total of 3,134 multi-turn consultation dialogues.
- For more details, please refer to the CPsyCounD.
- CPsyCounD in LLaMA-Factory form is open-sourced at HuggingFace.
Evaluation Framework
Evaluation Metrics
- Comprehensiveness
- The client’s situation and the degree to which psychological problems are reflected in the dialogues.
- Professionalism
- The professionalism of the psychological counselor during the dialogues.
- Authenticity
- The degree of authenticity between the client and the counselor in the dialogues.
- Safety
- The degree of privacy protection of clients.
Score Criterion
- The score criterion of each evaluation metric
Turn-Based Dialogue Evaluation
The approach to effectively evaluate multi-turn consultation dialogues.
Denote a $m$-turn dialogue as a set of paired elements ${(q_i,r_i)|i=1, 2, ..., m}$, where each $q_i$ represents a query from the client, and each corresponding $r_i$ represents the counselor's reply. We first split it into $m$ single-turn dialogue, then prompt the model with query together with its dialogue history in each single-turn dialogue, resulting in the corresponding single-turn response:
where $h_i={(q_j, r_j)|j=1, 2, ..., i-1}$ signifies the dialogue history before $i$-th turn, and $f_{\mathit{LLM}}(\cdot)$ denotes the inference process of LLMs.
Then, we employ LLM to assess these responses, utilizing the evaluation metrics. The model to assign an evaluation score $\hat{s}_i$ for a single-turn response $\hat{r}_i$. Then we average them to yield the total evaluation score of the current $m$-turn dialogue:
- For more details, please refer to the Code.
CPsyCounE
The general multi-turn dialogue evaluation dataset, which has nine topics.
- For more details, please refer to the CPsyCounE.
Experiments
Intrinsic Evaluation
Role-play VS Memo2Demo
- Statistics of generated dialogues
- The results of intrinsic evaluation
Extrinsic Evaluation
CPsyCounX
We further fine-tune InternLM2-7B-Chat on CPsyCounD. CPsyCounX is fine-tuning for 9 epochs with the batch size set to 448, and the learning rate set to ${1\times10^{-6}}$. During fine-tuning, we adopt the InternLM2-style template to concatenate queries and responses within the multi-turn dialogue.
- For more details, please refer to the Code.
- CPsyCounX is open-sourced at HuggingFace.
Results
- The average results of extrinsic evaluation
- Radar plot of detailed scores of CPsyCounX and other baselines
- The full results of extrinsic evaluation
Citation
If you find our work helpful in your research, please cite the following paper:
@inproceedings{zhang-etal-2024-cpsycoun,
title="{CP}sy{C}oun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for {C}hinese Psychological Counseling",
author="Zhang, Chenhao and Li, Renhao and Tan, Minghuan and Yang, Min and Zhu, Jingwei and Yang, Di and Zhao, Jiahao and Ye, Guancheng and Li, Chengming and Hu, Xiping",
journal={ACL},
year={2024}
}