June 25, 2022

Rocket Inclusive Language Scanner: Rocketeers Develop Open Source Software to Detect and Replace Problematic Language

Rocket Software is constantly on a mission to create legendary products, especially ones that incorporate our core values of empathy, humanity, trust, and love. Recently, we’ve been on a more specific mission: in light of the racial issues across the U.S. and the world, Rocket is focusing on creating a more inclusive, diverse, and equal future. Our journey started by creating an employee-led initiative called RIDE, which stands for Rocket Inclusion, Diversity, and Equity. It has also filtered down to the product level, including the creation of our recent open-source tool, Rocket Inclusive Language Scanner.

As members of the User Experience team and the RIDE group, our goal is to ensure that all customers have optimal experiences when using our products. Language has a profound impact on how we experience the world and ourselves. It can impact our mindset, frame our perceptions, and change our thoughts. That’s why it’s important for language, including product language, to be inclusive. To create a better experience for all customers, the language we use shouldn’t be racist, ableist, problematic, or offensive. 

Since language has a profound impact on the user experience, our team focuses on the reading materials and documentation associated with each product. Because some of these materials have been around for decades, we need to update them for the current environment and ensure we’re not perpetuating the usage of terms with problematic origins like “master” or “blacklist”.

Our team began working on a solution to improve these materials and scanned the equivalent of hundreds of thousands of pages of content. While updating these documents based on current best practices, we found that some materials came back with over 700 instances of questionable terms such as “master”. A term might exist in content for many reasons. For example, it might refer to an interface label defined in another product, and changing it could confuse the user further. We therefore wanted to review the context of each instance rather than replace them all at once. We found, however, that we had only two options: search for each word and manually review and replace every instance in its own file, which could take hours, or use the search and replace functions in existing tools to replace all problematic instances without understanding their context.

We searched for a tool that would let us find terms in our doc repos, review the findings, and replace the chosen instances. Since we couldn’t find a tool to serve our specific need, we developed a simple tool to find and replace terms in multiple files.

Rocket Inclusive Language Scanner Overview

Rocket Inclusive Language Scanner uses a terminology sheet to collate the user inputs, such as each search term and its suggested replacement. The tool uses the terminology sheet to parse the source files and generate a detailed report of its findings. The report captures the current context and the suggested context of each search term. You can use the information in this report to make an informed decision on whether to replace each offensive occurrence of a term. You can update the report to indicate whether you wish to make the suggested replacement in the source file. You can also edit the replacement phrase in the report so that the text written to the source file fits the context in which the term occurs.

Tip: When the report is generated, the Rocket Inclusive Language Scanner only replaces the search term with the suggested replacement term in the report. If you notice any inaccuracies in the replacement phrase (for example, in the use of articles around the replaced term), you can update the report to correct them.
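The find step described above can be illustrated with a minimal Python sketch. This is not the scanner's actual code: the function name find_terms, the in-memory terminology dictionary, and the File, Line, and Search Term column names are assumptions made for illustration. Only Original Phrase, Suggested Phrase, and Change? are column names mentioned in this post.

```python
import re


def find_terms(file_name, text, terminology):
    """Return one report row per line that contains a search term.

    `terminology` maps each search term to its suggested replacement
    (a hypothetical stand-in for the terminology sheet).
    """
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for term, replacement in terminology.items():
            # Match whole words only, ignoring case.
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            if pattern.search(line):
                rows.append({
                    "File": file_name,
                    "Line": line_no,
                    "Search Term": term,
                    "Original Phrase": line,
                    # Plain substitution: the surrounding words are not adjusted.
                    "Suggested Phrase": pattern.sub(lambda m: replacement, line),
                    "Change?": "",  # the reviewer types "yes" here to approve
                })
    return rows


if __name__ == "__main__":
    sample = "Add the host to the blacklist.\nRestart the master node."
    for row in find_terms("guide.txt", sample, {"blacklist": "blocklist", "master": "primary"}):
        print(row["Line"], row["Original Phrase"], "->", row["Suggested Phrase"])
```

Because the substitution is purely textual, a phrase such as "Use a whitelist." with the replacement "allowlist" would come out as "Use a allowlist.", which is the kind of article mismatch the tip suggests correcting by hand in the report.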

Once you are done reviewing the report, correcting it where required, and indicating whether you wish to replace each offensive occurrence (by putting a simple ‘yes’ in the specified column of the report), you can use the language scanner again to implement the replacements in the source files. It generates a detailed log of all the replacements made in the source files. This log comes in handy if, in the future, you wish to undo or modify the replacements.

Tip: To undo a replacement, you only need to switch the contents of the Original Phrase and Suggested Phrase columns in the report and run the language scanner again using the updated report.
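As a rough illustration of the undo tip, the column swap could look like the following Python sketch. It assumes the report rows have been loaded into dictionaries keyed by column name, and swap_for_undo is a hypothetical helper, not part of the tool.

```python
def swap_for_undo(rows):
    """Return copies of the report rows with the two phrase columns swapped.

    Running the replace step again on the swapped rows would turn each
    suggested phrase back into the original one.
    """
    return [
        {
            **row,
            "Original Phrase": row["Suggested Phrase"],
            "Suggested Phrase": row["Original Phrase"],
        }
        for row in rows
    ]


if __name__ == "__main__":
    report = [{"Original Phrase": "the blacklist",
               "Suggested Phrase": "the blocklist",
               "Change?": "yes"}]
    print(swap_for_undo(report))
```

The sketch returns new rows rather than editing them in place, so the original report stays available as a record of what was changed.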

Using the Rocket Inclusive Language Scanner

The input terminology sheet must have two sheets, Sheet1 and Sheet2. If you rename the sheets, you will need to rename them in the script as well. Sheet1 is where you need to give your inputs as shown in the following figure: 

To find and replace the deprecated terms in the source files:

1. Double-click the easygui.py file.

2. Enter the documentation folder path and the input terminology sheet path. The terminology sheet path should include the full file name of the sheet, including the file extension.

3. Click Find.
The Rocket Inclusive Language Scanner saves its findings in Sheet2 of the input terminology sheet.

4. After looking at the context of the findings reported in Sheet2, type yes in column F (Change?) if you wish to implement the suggested change.

Alternatively, you can update the Suggested Phrase column and then enter yes in the Change? column.

5. Save the input terminology sheet.

6. To replace the instances against which you entered “yes” (in the Change? column), click the Replace button.
Rocket Inclusive Language Scanner replaces all the chosen occurrences and generates a log indicating the changes it made.
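The replace step can be sketched in Python along the same lines. This is an illustrative approximation under stated assumptions, not the scanner's implementation: apply_replacements is a hypothetical function, the report rows are assumed to be dictionaries keyed by the column names, and the log format is made up for the example.

```python
def apply_replacements(text, rows):
    """Apply only the rows marked "yes" and return the new text plus a log."""
    log = []
    for row in rows:
        # Skip anything the reviewer did not explicitly approve.
        if row["Change?"].strip().lower() != "yes":
            continue
        original = row["Original Phrase"]
        suggested = row["Suggested Phrase"]
        if original in text:
            # Replace a single occurrence per approved row.
            text = text.replace(original, suggested, 1)
            log.append(f"replaced {original!r} with {suggested!r}")
    return text, log


if __name__ == "__main__":
    source = "Add the host to the blacklist. Restart the master node."
    report = [
        {"Original Phrase": "Add the host to the blacklist.",
         "Suggested Phrase": "Add the host to the blocklist.",
         "Change?": "yes"},
        {"Original Phrase": "Restart the master node.",
         "Suggested Phrase": "Restart the primary node.",
         "Change?": ""},
    ]
    updated, changes = apply_replacements(source, report)
    print(updated)
    print(changes)
```

Only the first row is applied here, because the second row has no yes in its Change? column. That mirrors the review-then-replace workflow in the steps above.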

The Importance of the Rocket Inclusive Language Scanner

Language has a profound impact on how we experience the world and ourselves. It can impact our mindset, frame our perceptions, and change our thoughts. That’s why it’s important for language, including product language, to be inclusive. 

If we can elevate and improve our users’ experiences through the creation of this tool, it’s worth it to create a more equal and inclusive future. If you or your company would like to join us on our mission, you can download or contribute to the tool here.

Shahul Hameed

Shahul Hameed is an information developer at Rocket Software. Shahul has a degree in electronics and communication. He loves researching natural language understanding and watching football.

