Are you interested in succeeding on GitHub? Having a white-sounding name may improve your odds. A new study that used artificial intelligence (AI) to analyze more than 2 million contributions by roughly 365,000 developers on GitHub finds that users with white-sounding names may have more success on the platform.
Researchers from the University of Waterloo have found that software developers whose names are perceived as white are more likely to find success on GitHub than developers whose names are perceived as Black, Hispanic, or Asian-Pacific Islander.
The researchers published their findings in IEEE Transactions on Software Engineering earlier this year. The study raises important questions about the consequences of a lack of diversity on GitHub in particular and in the open-source software community as a whole.
The Waterloo researchers examined more than 2 million contributions, also called “pull requests,” made on GitHub by 365,607 developers. They used an artificial intelligence tool called NamePrism, which analyzes people’s names to estimate their perceived race and ethnicity, and found that developers who are perceived as white on GitHub are more likely to have their ideas accepted. Developers perceived as Hispanic or Asian-Pacific Islander had 6% to 10% lower odds of having a pull request accepted than developers perceived as white.
In open-source software, you don’t see a person, said Mei Nagappan, an assistant professor of computer science at the University of Waterloo who co-authored the study. You don’t know who they are or what they think; you know only their name. That, he said, may make it the one place where a meritocracy is possible.
Nagappan said it is concerning that racial bias may exist even in this environment, given the enormous influence that open-source communities such as GitHub have on product development. If we do not listen to diverse voices, he pointed out, software will be built by and for a very homogeneous group of people.
Furthermore, GitHub has become a sort of portfolio for software developers, which means this bias could harm developers’ careers. For a newcomer to the industry, Nagappan said, “I believe that you can have a successful career at a company if you have contributions accepted even to one of the important projects.”
GitHub has not responded to Protocol’s requests for comment. Nagappan said the research isn’t aimed at GitHub in particular, but is meant to address concerns about the open-source community in general. He also pointed out that previous research has found lower acceptance rates on GitHub for developers who are perceived as women, and that acceptance rates have been found to vary with developers’ country of origin.
Nagappan acknowledged that the NamePrism tool his team used is imperfect at predicting people’s race and ethnicity. The researchers assigned a race or ethnicity only to developers for whom the tool reported a high level of confidence. For all other developers, they classified perceived race as “unknown.”
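The thresholding step described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function name, the 0.8 cutoff, and the score dictionaries are all hypothetical, and the article does not say what cutoff the team used or what format NamePrism returns.

```python
from typing import Dict

# Hypothetical cutoff: the article does not state the value the researchers used.
CONFIDENCE_THRESHOLD = 0.8


def assign_perceived_group(scores: Dict[str, float],
                           threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return the top-scoring group if its score meets the threshold, else "unknown"."""
    if not scores:
        return "unknown"
    # Pick the group with the highest score.
    group, score = max(scores.items(), key=lambda item: item[1])
    return group if score >= threshold else "unknown"


# Made-up scores for illustration only; they are not real NamePrism output.
confident = {"White": 0.91, "Black": 0.03, "Hispanic": 0.03, "API": 0.03}
ambiguous = {"White": 0.45, "Black": 0.20, "Hispanic": 0.20, "API": 0.15}

print(assign_perceived_group(confident))
print(assign_perceived_group(ambiguous))
```

A higher cutoff moves more developers into the “unknown” bucket, trading coverage for fewer mislabeled names.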
The Waterloo researchers steered clear of attributing this apparent racial bias on GitHub to any particular cause. They did find, however, that most of the developers who submitted contributions on GitHub, as well as most of those who responded to them, have names perceived as white. They also found that developers perceived as Black, Hispanic, or Asian-Pacific Islander are more likely to have their pull requests accepted when the person responding is perceived to be of the same race or ethnicity.
To reduce this potential bias, the researchers propose that GitHub adopt a single-blind or double-blind review structure similar to the way research is evaluated in the academic world. A second suggestion is to have more than one person assess a given contribution, so that one individual’s biases do not decide the outcome.
GitHub is not the only platform grappling with how perceptions of race affect people’s interactions online. Last year, Airbnb launched a research project called Project Lighthouse to learn more about how racial discrimination manifests on its platform, including how people’s names can shape other users’ perceptions.