Auto-healing tests: do you really want them?

Many teams are leaning towards having auto-healing capability in their automated tests. Have they done a fair evaluation? Do they really need it? What are the problems? Let's understand.

Jun 10, 2024

Lately, with the rise of AI and ML, their impact on software testing has become apparent. Many startups are selling record and replay tools that not only facilitate easy automation for testers but also sell an auto-healing capability.

What does "auto heal" mean?

Generally, a record and replay tool claims that if a test breaks because of a locator issue, its AI-powered system will fix the locator automatically. The promise is that the tests will no longer be flaky, because the locators will always work.

Test flow with and without auto-healing capability

The million-dollar question: do we need auto-healing tests?

The answer is not straightforward. But let's look at what these tools offer so that we can make an informed decision.

Locator Strategy in Record and Replay Tools

Many record and replay tools maintain a pool of locators for each element, probably by capturing the surrounding HTML up to a certain depth.

In effect, these tools capture almost the entire HTML DOM and store several locator variations per element, each built from different attributes. Each variation can locate the same element.

Their AI engine assigns a score to each locator. If one attribute changes, the element can still be found through the other locators stored in the pool.

This strategy works well on almost every type of page, but it fails on tabular UIs.
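The locator pool idea described above can be sketched roughly as follows. This is only an illustration of the general approach, not how any particular tool works: the toy DOM (a list of attribute dicts), the `Locator` class, the scores, and `find_with_healing` are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Locator:
    attribute: str  # e.g. "id", "name", "text"
    value: str
    score: float    # hypothetical confidence assigned by the tool


def find(dom: list, locator: Locator) -> Optional[dict]:
    """Return the first element whose attribute matches the locator."""
    for element in dom:
        if element.get(locator.attribute) == locator.value:
            return element
    return None


def find_with_healing(dom: list, pool: list):
    """Try locators from highest to lowest score; return (element, locator used)."""
    for locator in sorted(pool, key=lambda loc: loc.score, reverse=True):
        element = find(dom, locator)
        if element is not None:
            return element, locator
    return None, None


# Locators recorded for the same button (assumed values).
pool = [
    Locator("id", "submit-btn", 0.9),
    Locator("name", "submit", 0.7),
    Locator("text", "Submit", 0.5),
]

# The id was renamed, but name and text are unchanged.
dom_after_change = [{"id": "btn-primary", "name": "submit", "text": "Submit"}]

element, used = find_with_healing(dom_after_change, pool)
print(used.attribute)  # name
```

The highest-scored locator (the id) no longer matches, so the lookup quietly falls back to the next one. The test keeps running, and nothing tells you that the id changed.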

The problem

The locator strategy above does not work reliably for tables.

Since every element of a table shares the same attributes, these tools maintain an index to locate the elements. However, when you delete a value from the table, the remaining values typically shift to fill the deleted index. Consequently, the tool may not be able to determine whether it is still locating the deleted element or the one that has shifted into its position. This can cause the tool to pick the wrong element and produce a false positive.
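A small sketch of the index-shift problem, under the assumption that the recorded locator is nothing more than a row position. The table is modelled as a plain list, and `row_at` is a hypothetical stand-in for an index-based locator.

```python
# Rows share the same markup, so position is the only thing that
# distinguishes them in this toy model.
rows = ["alice", "bob", "carol"]


def row_at(table, index):
    """Hypothetical index-based locator: return the row at a position, if any."""
    return table[index] if index < len(table) else None


recorded_index = 1               # recorded locator: "second row", which held "bob"
del rows[recorded_index]         # the test deletes that row

# The index-based locator still resolves, but to a different row.
located = row_at(rows, recorded_index)
print(located)                   # carol

# A value-based check gives the answer the test actually needs.
print("bob" in rows)             # False
```

The index still points at something, so a step such as "the second row is displayed" passes even though it is now looking at a different record.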

Fails to report broken CSS/JS

Generally, CSS and JavaScript are tied to attributes such as class name and ID. If an ID changes, the tool's auto-healing capability can still find the element. However, the same change may break the CSS or JavaScript that relied on the old ID.

Since your automated tests are passing (green), you might skip manually checking the scenario and miss the cosmetic defect introduced by the broken CSS.

A conventional automation tool would fail the test in this case if it uses the broken ID or class in its locator, giving you an opportunity to manually debug the test and notice the broken UI.
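To make the contrast concrete, here is a minimal sketch under stated assumptions: the page is a list of attribute dicts, `stylesheet_ids` stands in for the ids the CSS targets, and `strict_find` and `healing_find` are hypothetical helpers, not the API of any real tool.

```python
class ElementNotFound(Exception):
    pass


stylesheet_ids = {"submit-btn"}                    # ids the (assumed) CSS rules target
page = [{"id": "btn-primary", "text": "Submit"}]   # id was renamed from "submit-btn"


def strict_find(page, element_id):
    """Conventional lookup: fail loudly if the id is gone."""
    for el in page:
        if el.get("id") == element_id:
            return el
    raise ElementNotFound(element_id)


def healing_find(page, element_id, fallback_text):
    """Healing lookup: fall back to the visible text if the id is gone."""
    try:
        return strict_find(page, element_id)
    except ElementNotFound:
        for el in page:
            if el.get("text") == fallback_text:
                return el
        raise


def is_styled(el):
    """True if the stylesheet still has a rule for this element's id."""
    return el.get("id") in stylesheet_ids


try:
    strict_find(page, "submit-btn")
    strict_result = "pass"
except ElementNotFound:
    strict_result = "fail"

healed = healing_find(page, "submit-btn", "Submit")
print(strict_result, is_styled(healed))  # fail False
```

The strict lookup fails and prompts someone to look at the page. The healing lookup succeeds, while the element it found is no longer covered by the stylesheet.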

Load time assessment by Conventional Tools

Another benefit of conventional tools is that they indirectly tell you about the application's load time. We typically use explicit waits to ensure that elements load within a specified timeout. If an element fails to load within that timeout, the test fails, indicating that the application has exceeded the expected load time.

This failure helps you identify and understand slowness in the application. Since a record and replay tool tends to make the test pass regardless of how long the page takes to load, you lose the opportunity to assess and flag slowness issues.
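The timeout behaviour described above can be sketched with a simple polling wait. This is an illustration only: `wait_until` and `element_visible_after` are hypothetical helpers, and the delays are made-up numbers standing in for page load times.

```python
import time


def wait_until(condition, timeout_s, poll_interval_s=0.05):
    """Poll condition() until it is truthy, or raise TimeoutError after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_s}s")
        time.sleep(poll_interval_s)


def element_visible_after(delay_s):
    """Return a condition that becomes true delay_s seconds from now (simulated load)."""
    ready_at = time.monotonic() + delay_s
    return lambda: time.monotonic() >= ready_at


# Page loads in 0.1s, budget is 0.5s: the wait succeeds.
fast_ok = wait_until(element_visible_after(0.1), timeout_s=0.5)

# Page takes 1.0s, budget is 0.3s: the wait raises and the test would fail.
try:
    wait_until(element_visible_after(1.0), timeout_s=0.3)
    slow_ok = True
except TimeoutError:
    slow_ok = False

print(fast_ok, slow_ok)  # True False
```

The timeout acts as a rough performance budget: a failure here is a signal about load time, not just about the locator.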

Flakiness: is the locator the only reason for it?

Flakiness is not always due to locator issues. There can be various other reasons for your test to be flaky, such as unstable infrastructure, test data problems, test data cleanup issues, network latency, etc. Locator issues are just one factor.

The unique selling point of these tools is that they claim to remove flakiness. However, that might not be the case for every team; their tests might still fail with or without auto-healing due to other flakiness factors.

Conclusion

In summary, auto-healing record and replay tools offer many benefits in test automation, but they also have certain drawbacks.

My findings listed above do not cover all the limitations of record and replay tools but mainly focus on the problems with the auto-healing capability.

I prefer to stay away from auto-healing features, as they not only introduce false positives but also take away my trust in the test results.

I would always be in doubt about whether my tests passed because the application is working as expected or if some AI (Artificial Intelligence) intervention magically marked my tests as green.
