BDD-Based Integration Testing Framework for NebulaGraph: Part Ⅰ

Evolution of Testing Framework

So far, during the development of NebulaGraph, its testing framework has undergone three major changes, as shown in the following figure. In this article, I would like to introduce some experience and lessons that our team has learned from these changes.

how the testing framework in NebulaGraph evolves over time

For a database product, the importance of testing cannot be overemphasized. Therefore, no matter how the testing framework evolves, it always aims to easily and quickly accumulate test cases to assure stability of NebulaGraph functionalities, not only for developers, but also for its users, including maintenance engineers, documentation engineers, and even non-technical users. To achieve this goal, our testing framework must make codeless programming or no-programming possible for users.

Most database products have their own testing framework, and the tests submitted by developers must comply with the frameworks. We tried developing such a testing framework. To develop it, our team needed to implement a lot of custom functionalities from scratch, and if it was done, our users needed to learn a new set of rules. Therefore, we did not adopt that framework. When we started supporting MATCH, an openCypher-compatible syntax, the TCK repository came into view. Although it is a test suite to verify conformance to openCypher, it refreshed our mind about implementing integration testing for NebulaGraph. Why the framework we originally tried was abandoned? One reason is that the result set returned by NebulaGraph is a complex data structure that may be composed of vertices, edges, and paths. It can be described in JSON, but not in an elegant and concise way. Once the amount of result sets gets large, the "content" will be overcome by the "format", which means that the description of the structure, verbose and annoying, is far more than the data that users care about. However, the set of rules for describing vertices, edges, and paths defined in TCK are simple and intuitive, and fit the pattern statement in MATCH, which can be accepted and understood by openCypher users. Of course, to apply the TCK rules to NebulaGraph, a database with strong schema, we need to extend them, but it doesn't matter. Because of these advantages, we determined to develop this BDD-based testing framework.

By comparing the complexity of the following test cases, you can see the obvious progress in each change. The following figures show test cases for:

GTest-based testing
pytest-based testing
BDD-based testing

[GTest-based testing] [pytest-based testing] [BDD-based testing]

It can be seen from the preceding comparisons that we are getting closer to the essence of "testing": Only the input and output, but not coding to assemble test data, are what we care about. With the help of some small automation tools, our users, even those who have no technical background, can create test cases.

Expectations and Implementations

Before extending the TCK-based testing framework, we set the following goals for this upgrade:

It must be convenient to add a test case and the expected data.
It must support importing other test data sets.
It can reuse the flexibility of the pytest-based framework, especially mechanisms such as plugins and fixture.
It must be compatible with the MATCH test cases in TCK.

To build a TCK-based testing framework for NebulaGraph, we must choose an "appropriate" testing framework. It must meet these requirements:

Completely complying with BDD-based testing.
Convenient, flexible, and extendable.
Compatibility with the existing pytest test cases would be a plus.

BDD-based testing frameworks can be implemented in various languages and in different ways, even Python has many implementations, for example, pytest-bdd and behave. To meet the third requirement, we decided to build a testing process based on pytest-bdd for NebulaGraph. pytest-bdd is a plug-in of pytest, so it can well support the features of BDD and the functionalities of pytest, which is more in line with our expectations.

After the testing framework was chosen, we began to design the modules of the entire testing process. They are ConnectionManager, DataLoader, Parser, Comparator, and Reporter.

ConnectionManager

Manages connections to NebulaGraph, including retries after errors and error filtering.

DataLoader

Reads the CSV files, parses the data types in the configurations, splices data for an INSERT statement, and so on.

Parser

Parses the strings of vertices, edges, and paths described in TCK and converts them into the Value format defined in NebulaGraph, which makes comparisons easier.

Comparator

Responsible for comparing the values of different Value formats, including basic data types and composite data types. Composite data types include List, Map, Set, Vertex, Edge, and Path.

Reporter

Supports customizing features such as outputting the position and the line number in a feature file where an nGQL statement with an execution error is located.

All these modules, which are both independent and interrelated, and different scope of pytest fixture make it possible to implement the isolation and testing of different scenarios.

What is BDD?

So far, BDD has been mentioned many times in this article. You may be familiar with TDD (Test-Driven Development) and DDD (Domain-Driven Design), but not with BDD. What is BDD? BDD, the abbreviation for Behavior-Driven Development, is a sort of evolution of TDD. With this methodology, developers and non-developers can write test cases in natural language to do tests, which is friendlier to non-developers and makes developers and users better understand and communicate with each other. In practice, we found that BDD is not only effective in software quality management, but also a good supplement to complicated requirement management. Besides users' complex scenarios, the feature files contain users' expected statements, steps, and conclusion of a query, which makes developers understand the feature more easily. After the feature development is done, these files turn into test cases. Well, kill two birds with one stone.

To understand BDD, we should introduce the Gherkin language firstly. It defines a set of basic syntax rules to organize the structure of plain text so that BDD-based testing tools can parse the text. The text in Gherkin is saved in a file with the FEATURE extension. A .feature file can be composed of several scenarios and each Scenario has its own steps (Given, When, Then, and so on). These syntax rules are simple and easy to understand, and only a few keywords need to be memorized. Therefore, reading a test case written in Gherkin is like reading an "Ask and Answer" dialogue. Easy to learn.

Our goal is to implement the BDD-based testing framework for NebulaGraph as a pure black box testing process, where developers and non-developers care about only two things: The input nGQL statements and the expected returns. Such a framework can reduce the mental burden of users to add test cases, and make it convenient for them to contribute to NebulaGraph. After we upgraded the framework, about 2,500 test cases from our team have been added within half a year. These test cases provide the strong quality assurance for NebulaGraph 2.0. You can find them in the tests/tck/features directory of the nebula-graph repository. You can use them as a guide to nGQL. If you have a tough scenario and don't know how to describe it in nGQL, you can refer to these test cases.

Conclusion

In this article, I briefly described the evolution of the testing framework for NebulaGraph. In the upcoming article, I will introduce the available functionalities of the new testing framework and how to use it to test changes to the NebulaGraph source code. Stay tuned!

Interested in learning more about NebulaGraph? Join the Slack community!