Rocket Inclusive Language Scanner: Rocketeers Develop Open Source Software to Detect and Replace Problematic Language
Rocket Software is constantly on a mission to create legendary products, especially ones that incorporate our core values of empathy, humanity, trust, and love. Recently, we’ve been on a more specific mission: in light of the racial issues across the U.S. and the world, Rocket is focusing on creating a more inclusive, diverse, and equal future. Our journey started by creating an employee-led initiative called RIDE, which stands for Rocket Inclusion, Diversity, and Equity. It has also filtered down to the product level, including the creation of our recent open-source tool, Rocket Inclusive Language Scanner.
As members of the User Experience team and the RIDE group, our goal is to ensure that all customers have optimal experiences when using our products. Language has a profound impact on how we experience the world and ourselves. It can impact our mindset, frame our perceptions, and change our thoughts. That’s why it’s important for language, including product language, to be inclusive. To create a better experience for all customers, the language we use shouldn’t be racist, ableist, problematic, or offensive.
Since language has a profound impact on the user experience, our team focuses on the reading materials and documentation associated with each product. Because some of these materials have been around for decades, we need to update them to reflect current standards and ensure we’re not perpetuating the use of terms with problematic origins, like “master” or “blacklist”.
Our team began working on a solution to improve these materials and scanned the equivalent of hundreds of thousands of pages of content. As we updated these documents based on current best practices, some materials came back with over 700 instances of questionable terms such as “master”. A term can appear in content for many reasons. For example, it might refer to an interface label defined in another product, and changing it could confuse the user further. We therefore wanted to review the context of each instance rather than replace them all at once. We found, however, that we had only two options: search for each word and then review and replace every instance manually in its own file, which could take hours, or use the search-and-replace functions in existing tools to replace all problematic instances without understanding their context.
We searched for a tool that would allow us to find terms in our doc repos, review the findings, and replace the chosen instances. Since we couldn’t find a tool that served our specific need, we developed a simple tool to find and replace terms in multiple files.
Rocket Inclusive Language Scanner Overview
Rocket Inclusive Language Scanner uses a terminology sheet to collect user inputs, such as each search term and its suggested replacement. The tool uses the terminology sheet to parse the source files and generate a detailed report of its findings. The report captures the current context and the suggested context of each search term. You can use the information in this report to make an informed decision on whether or not to replace each offensive occurrence of a term. You can update the report to indicate whether you wish to make the suggested replacement in the source file. You can also edit the replacement phrase in the report so that the replacement made in the source file fits the context in which the term occurs.
Tip: When the report is generated, the Rocket Inclusive Language Scanner only substitutes the suggested replacement term for the search term in the report. If you notice any inaccuracies in the replacement phrase (for example, the use of articles around the replaced term), you can update the report to correct it.
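To make the scan step concrete, here is a minimal Python sketch of how a report could be built from a terminology list. It is an illustration only, not the scanner's actual source: the `Finding` fields, the `scan_text` function, the whole-word case-insensitive matching, and the sample terms are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    """One report row; the field names are illustrative, not the tool's own."""
    file_name: str
    line_no: int
    search_term: str
    original_phrase: str
    suggested_phrase: str
    change: str = ""  # a reviewer would enter "yes" here to approve the row


def scan_text(file_name, text, terminology):
    """Return a Finding for every line of text that contains a search term."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for term, replacement in terminology.items():
            # Whole-word, case-insensitive match (an assumption of this sketch).
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            if pattern.search(line):
                # Plain substitution: the words around the term are left untouched.
                suggested = pattern.sub(lambda _match: replacement, line)
                findings.append(
                    Finding(file_name, line_no, term, line.strip(), suggested.strip())
                )
    return findings


# Hypothetical terminology and sample content.
terminology = {"blacklist": "blocklist", "whitelist": "allowlist"}
sample = (
    "Add the host to the blacklist.\n"
    "Add the client to a whitelist.\n"
    "Nothing to flag here."
)
findings = scan_text("guide.txt", sample, terminology)
for finding in findings:
    print(f"{finding.file_name}:{finding.line_no}: "
          f"{finding.original_phrase} -> {finding.suggested_phrase}")
```

In this sketch the second row's suggested phrase reads "a allowlist", because only the term itself is substituted. That is the kind of article mismatch a reviewer would correct in the report before approving the change.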
Once you have reviewed the report, corrected it where required, and indicated which occurrences you wish to replace by entering “yes” in the specified column of the report, you can run the language scanner again to implement the replacements in the source files. It generates a detailed log of all the replacements made in the source files. This log comes in handy if you later wish to undo or modify the replacements.
Tip: To undo a replacement, you only need to switch the contents of the Original Phrase and Suggested Phrase columns in the report and run the language scanner again using the updated report.
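The review-then-replace cycle, including the undo trick, can be sketched in a few lines of Python. This is a hedged illustration rather than the tool's implementation: the `apply_replacements` and `swap_columns` functions are hypothetical, report rows are modeled as plain dictionaries keyed by the column names mentioned above, and the source file is modeled as a string.

```python
def apply_replacements(text, report_rows):
    """Apply the rows marked "yes" and return the new text plus a change log."""
    log = []
    for row in report_rows:
        if row["Change?"].strip().lower() != "yes":
            continue  # not approved by the reviewer
        original, suggested = row["Original Phrase"], row["Suggested Phrase"]
        if original in text:
            text = text.replace(original, suggested, 1)  # first occurrence only
            log.append(f"replaced {original!r} with {suggested!r}")
        else:
            log.append(f"skipped {original!r}: phrase not found")
    return text, log


def swap_columns(report_rows):
    """Swap Original Phrase and Suggested Phrase in every row, to undo a run."""
    return [
        {**row,
         "Original Phrase": row["Suggested Phrase"],
         "Suggested Phrase": row["Original Phrase"]}
        for row in report_rows
    ]


# Hypothetical source content and reviewed report rows.
source = "Add the client to a whitelist.\nSelect a master node.\n"
rows = [
    {"Original Phrase": "Add the client to a whitelist.",
     "Suggested Phrase": "Add the client to an allowlist.",
     "Change?": "yes"},
    {"Original Phrase": "Select a master node.",
     "Suggested Phrase": "Select a primary node.",
     "Change?": ""},
]

updated, log = apply_replacements(source, rows)
restored, undo_log = apply_replacements(updated, swap_columns(rows))
```

Only the row marked "yes" is applied, and running the same function on the swapped rows restores the original text. The log here is a simple list of messages, and the real tool's log format may differ.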
Using the Rocket Inclusive Language Scanner
The input terminology sheet must have two sheets, Sheet1 and Sheet2. If you rename the sheets, you will need to rename them in the script as well. Sheet1 is where you need to give your inputs as shown in the following figure:
1. Double-click the easygui.py file.
2. Enter the documentation folder path and the input terminology sheet path. The terminology sheet path should include the full file name of the sheet, including the file extension:
4. After looking at the context of the findings reported in Sheet2, type yes in column F (Change?) if you wish to implement the suggested change.
5. Save the input terminology sheet.
6. To replace the instances against which you entered “yes” (in the Change? column), click the Replace button.
Rocket Inclusive Language Scanner replaces all the chosen occurrences and generates a log indicating the changes it made.
The Importance of the Rocket Inclusive Language Scanner
Language has a profound impact on how we experience the world and ourselves. It can impact our mindset, frame our perceptions, and change our thoughts. That’s why it’s important for language, including product language, to be inclusive.
If this tool can elevate and improve our users’ experiences, the effort is worth it as a step toward a more equal and inclusive future. If you or your company would like to join us on our mission, you can download or contribute to the tool here.