Racial and gender biases are highly prevalent in the North American context, and they affect individuals within our communities in ways both big and small. Empirical research suggests that more than 50 percent of youth belonging to a minority racial group perceive themselves to have been victims of racial or ethnic discrimination. Among adolescents, about one in five Caucasians, one in three African Americans, and two in five of those with multi-racial or other racial identities experience discrimination by virtue of their racial background. Approximately 70 percent of adolescents report having witnessed discrimination against same-race and cross-race peers. Gender bias and discrimination are similarly systemic and ubiquitous within our society. Men and women are often treated differently from a young age into adulthood through their socialization at school, at work, and in various other social settings (Schmitt, Ellemers, and Branscombe 2003; Cogburn, Chavous, and Griffin 2011). Studies show that discrimination is strongly associated with depression and anxiety, which in turn raise severe social and public health concerns (Tynes et al. 2008; Niwa, Way, and Hughes 2014).
Although the implications of discrimination are serious, the machine learning community has made insufficient efforts to address these issues. There is nonetheless significant potential for a machine learning solution, specifically for identifying racial and gender discrimination in both human-generated and computer-generated text. Such tools can be used by a wide range of individuals and organizations to identify and minimize instances of unintentional gender or racial bias. Our project is designed to achieve just that: to engage systematically with the problem of bias in order to address the social ills that this form of discrimination can cause.
This project will make two key contributions to the machine learning field. Namely, we will develop: i) a large-scale gender and racial bias dataset, collected from various sources and released as open source; and ii) models that generalize out of distribution, that is, models that can be applied to datasets whose distributions differ from the one they were developed on.