The legal counsel of a new social media platform asked the data science team to ensure that the system strikes the right balance between the need to remove inciting content and freedom of speech. In a status meeting, the team happily reported that their algorithm managed to remove 90% of the inciting content, and that only 20% of the removed content was non-inciting. Yet, when examining a few dozen samples, the legal counsel was surprised to find that content which was clearly non-inciting had been removed. "The algorithm is not working!" she thought. "Anyone could see that the removed content has zero likelihood of being inciting! What kind of balance did they strike?" Trying to sort things out, the team leader asked whether the counsel wanted to decrease the percentage of removed content that is non-inciting, to which the counsel replied affirmatively. Choosing another classification threshold, the team proudly reported that only 5% rather than 20% of the removed content was non-inciting, at the expense of reducing the success rate of removing inciting content to 70%. Still confused, the legal counsel wondered what went wrong: the system was now not only removing clearly non-inciting content but also failing to remove evidently inciting material. Following several frustrating rounds, new insights emerged: the legal counsel learned about the inherent precision-recall trade-off. In addition, the team leader realized that the definition of inciting content used in labeling the training data was too simplistic; the legal counsel could have helped clarify the complexities of this concept in alignment with the law. The team leader and the counsel regretted not working together on the project from day one. As it turns out, both were using the same words, but apparently, much of what they meant had been lost in translation.
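To make the trade-off concrete, the following is a minimal sketch in Python, not the team's actual system: the scores, labels, and thresholds are synthetic assumptions, so the printed figures only approximate the anecdote's numbers. It shows the one mechanism the story turns on: moving a single decision threshold trades precision (how much of the removed content is truly inciting) against recall (how much of the inciting content gets removed).

```python
from dataclasses import dataclass

@dataclass
class Metrics:
    precision: float  # share of removed content that is truly inciting
    recall: float     # share of all inciting content that gets removed

def evaluate(scores, labels, threshold):
    """Remove every item scoring at or above `threshold`, then measure
    the removal policy against the ground-truth labels (1 = inciting)."""
    removed = [label for score, label in zip(scores, labels) if score >= threshold]
    true_positives = sum(removed)
    precision = true_positives / len(removed) if removed else 1.0
    recall = true_positives / sum(labels)
    return Metrics(precision, recall)

# Synthetic model confidences: the first ten items are truly inciting,
# the last five are not. None of this is the platform's real data.
labels = [1] * 10 + [0] * 5
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.40,  # inciting
          0.65, 0.50, 0.45, 0.20, 0.10]                                 # non-inciting

# A lenient threshold catches more inciting content but also removes
# more legitimate speech; a strict one does the opposite.
for threshold in (0.45, 0.62):
    m = evaluate(scores, labels, threshold)
    print(f"threshold={threshold:.2f}: precision={m.precision:.2f}, recall={m.recall:.2f}")
```

On this toy data, the lenient threshold yields roughly 90% recall with a quarter of removals being legitimate speech, while the strict one pushes precision up at the cost of dropping recall to 70%, which is exactly the dilemma the counsel and the team leader kept talking past each other about.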
While both data scientists and lawyers have been involved in the design of computer systems in the past, current AI systems warrant closer collaboration and a better understanding of each other's fields.2 The growing prevalence of AI systems and their growing impact on every aspect of our daily lives create a great need to ensure that AI systems are "responsible" and incorporate important social values such as fairness, accountability and privacy. It is our belief that to increase the likelihood that AI systems are "responsible," an effective multidisciplinary dialogue between data scientists and lawyers is needed. Firstly, it will assist in clearly determining what it means for an AI system to be responsible. Moreover, it will help both disciplines to spot relevant technical, ethical and legal issues and to jointly reach better outcomes early in the design stage of the system.