Persuading across Diverse Domains:
A Dataset and Persuasion Large Language Model


We propose the persuasive dialogue dataset DailyPersuasion, covering 13,000 scenarios across 35 domains, and the persuasion model PersuGPT, which outperforms GPT-4 in both automatic and human evaluation on unseen persuasion scenarios.

Anonymous institutions

A Quick Glance at the Proposed DailyPersuasion Dataset.

Abstract

Persuasive dialogue requires multi-turn following and planning abilities to achieve the goal of persuading users, which remains challenging even for state-of-the-art large language models (LLMs). Previous works focus on retrieval-based or generative models in a specific domain due to the lack of data across multiple domains. In this paper, we leverage GPT-4 to create DailyPersuasion, the first multi-domain persuasive dialogue dataset. We then propose a general method, PersuGPT, to learn an LLM-based persuasion model through intent-to-strategy reasoning, which summarizes the intent of the user's utterance and reasons about the next strategy before responding. Moreover, we design a simulation-based preference optimization that uses a learned user model together with our model to simulate subsequent turns and estimate their rewards more accurately. Experimental results on two datasets show that our proposed method outperforms all baselines in terms of the automatic evaluation metric Win-Rate and human evaluation.
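To make the simulation-based preference optimization described above more concrete, the sketch below illustrates one plausible rollout-and-rank loop in Python. It is not the released implementation: the model callables (`persuader`, `user_model`), the reward estimator, the rollout horizon, and all function names are placeholders standing in for the learned PersuGPT policy, the learned user simulator, and the reward estimation used in the paper.

```python
# Minimal sketch (not the authors' code) of simulation-based preference
# estimation: each candidate response is rolled out for a few simulated
# future turns with a learned user model, trajectory rewards are estimated,
# and the highest- and lowest-scoring candidates form a preference pair
# for subsequent preference optimization. All names are illustrative.

from dataclasses import dataclass
from typing import Callable, List, Tuple

# A "model" here is any callable mapping a dialogue history to the next reply.
Dialogue = List[str]
Responder = Callable[[Dialogue], str]
RewardFn = Callable[[Dialogue], float]


@dataclass
class PreferencePair:
    context: Dialogue
    chosen: str    # candidate whose simulated future scored highest
    rejected: str  # candidate whose simulated future scored lowest


def simulate_rollout(context: Dialogue,
                     candidate: str,
                     persuader: Responder,
                     user_model: Responder,
                     horizon: int = 2) -> Dialogue:
    """Append the candidate reply, then alternate simulated user/persuader turns."""
    trajectory = context + [candidate]
    for _ in range(horizon):
        trajectory.append(user_model(trajectory))  # simulated user turn
        trajectory.append(persuader(trajectory))   # simulated persuader turn
    return trajectory


def build_preference_pair(context: Dialogue,
                          candidates: List[str],
                          persuader: Responder,
                          user_model: Responder,
                          reward_fn: RewardFn) -> PreferencePair:
    """Score each candidate by the estimated reward of its simulated future."""
    scored: List[Tuple[float, str]] = []
    for cand in candidates:
        trajectory = simulate_rollout(context, cand, persuader, user_model)
        scored.append((reward_fn(trajectory), cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return PreferencePair(context=context,
                          chosen=scored[0][1],
                          rejected=scored[-1][1])


if __name__ == "__main__":
    # Toy stand-ins for the learned models and the reward estimator.
    persuader = lambda d: "persuader reply"
    user_model = lambda d: "simulated user reply"
    reward_fn = lambda d: float(len(" ".join(d)))  # placeholder reward

    pair = build_preference_pair(
        context=["User: I don't think I need to exercise."],
        candidates=["Appeal to health benefits.", "Cite social proof."],
        persuader=persuader,
        user_model=user_model,
        reward_fn=reward_fn,
    )
    print("chosen:", pair.chosen, "| rejected:", pair.rejected)
```

The resulting preference pairs could then be fed to a standard preference-optimization objective; the key idea the sketch conveys is that rewards are estimated from simulated future turns rather than from the immediate response alone.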

Examples

We present several examples from the collected DailyPersuasion dataset, along with responses generated by different methods.

(A) Examples of DailyPersuasion



(B) Examples of Generated Responses in Unseen Scenarios

Data Statistics

Figure 1. The distribution of the number of scenarios across domains. Note that a persuasion scenario may belong to multiple domains.


Figure 2. Word clouds of persuasion strategies in selected domains from the DailyPersuasion dataset. In addition to high-frequency general strategies, each domain exhibits some unique persuasion strategies.

Resources

We have open-sourced the DailyPersuasion dataset in an anonymous GitHub repository, and we will release the corresponding prompts, code, and models as soon as possible.

Ethics and Disclosure

Persuasive dialogue systems are a double-edged sword. On one hand, they can be widely applied in psychological therapy and philanthropic efforts, fostering positive developments within human society. On the other hand, their misuse in potentially harmful scenarios must be strictly regulated. In our study, we filter the keywords used to construct persuasion scenarios, ensuring all generated scenarios are safe and free from bias. We collect data with GPT-4, which is aligned with ethical values, and we expect this to reduce the risk of privacy breaches and harmful content in the gathered data. We will have humans review all scenarios, dialogues, and strategies before releasing DailyPersuasion and will further filter inappropriate or risky data. We will also require all individuals or organizations that download the dataset to sign a strict license governing its use. It is worth noting that, while our system can be employed across various persuasive domains, it should not be used to directly replace human interaction. All applications of our system should operate under human supervision and regulation, maintaining a balance between leveraging technology for good and ensuring ethical use.