Commit ac22acd

CodeGenerationAndRendering doc added (#1417)
1 parent 34ad60c commit ac22acd

1 file changed

+339
-0
lines changed

docs/CodeGenerationAndRendering.md

Lines changed: 339 additions & 0 deletions
@@ -0,0 +1,339 @@
# Code generation and rendering

Code generation and rendering are part of the test generation process in UnitTestBot (find the overall picture in the
[UnitTestBot architecture
overview](https://github.com/UnitTestBot/UTBotJava/blob/main/docs/OverallArchitecture.md)).
UnitTestBot gets the synthetic representation of generated test cases from the fuzzer or the symbolic engine. This representation, or model, is
implemented in the `UtExecution` class.

The `codegen` module generates the real test code based on this `UtExecution` model and renders it in a
human-readable form in accordance with the requested configuration (considering programming language, testing
framework, mocking and parameterization options).

The `codegen` module
- converts `UtExecution` test information into an Abstract Syntax Tree (AST) representation using `CodeGenerator`,
- renders this AST according to the requested programming language and other configurations using `renderer`.

## Example

Consider the following method under test:

```java
package pack;

public class Example {

    public int maxIfNotEquals(int a, int b) throws IllegalArgumentException {
        if (a == b) throw new IllegalArgumentException("a == b");
        if (a > b) return a; else return b;
    }
}
```

The standard UnitTestBot-generated tests for this method (without test summaries and clustering into regions)
look like this:

```java
package pack;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.DisplayName;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

public final class ExampleStandardTest {

    @Test
    @DisplayName("maxIfNotEquals: a == b : False -> a > b")
    public void testMaxIfNotEquals_AGreaterThanB() {
        Example example = new Example();

        int actual = example.maxIfNotEquals(1, 0);

        assertEquals(1, actual);
    }

    @Test
    @DisplayName("maxIfNotEquals: a == b -> ThrowIllegalArgumentException")
    public void testMaxIfNotEquals_AEqualsB() {
        Example example = new Example();

        assertThrows(IllegalArgumentException.class, () -> example.maxIfNotEquals(-255, -255));
    }

    @Test
    @DisplayName("maxIfNotEquals: a < 0, b > 0 -> return 1")
    public void testMaxIfNotEqualsReturnsOne() {
        Example example = new Example();

        int actual = example.maxIfNotEquals(-1, 1);

        assertEquals(1, actual);
    }
}
```

Here is an example of the parameterized tests for this method. We also implement the data provider method, which
serves as the argument source.

```java
package pack;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.MethodSource;

import java.util.ArrayList;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.junit.jupiter.params.provider.Arguments.arguments;

public final class ExampleParameterizedTest {

    @ParameterizedTest
    @MethodSource("pack.ExampleParameterizedTest#provideDataForMaxIfNotEquals")
    public void parameterizedTestsForMaxIfNotEquals(Example example, int a, int b, Integer expectedResult, Class expectedError) {
        try {
            int actual = example.maxIfNotEquals(a, b);

            assertEquals(expectedResult, actual);
        } catch (Throwable throwable) {
            assertTrue(expectedError.isInstance(throwable));
        }
    }

    public static ArrayList provideDataForMaxIfNotEquals() {
        ArrayList argList = new ArrayList();

        {
            Example example = new Example();

            Object[] testCaseObjects = new Object[5];
            testCaseObjects[0] = example;
            testCaseObjects[1] = 1;
            testCaseObjects[2] = 0;
            testCaseObjects[3] = 1;
            testCaseObjects[4] = null;
            argList.add(arguments(testCaseObjects));
        }
        {
            Example example = new Example();

            Object[] testCaseObjects = new Object[5];
            testCaseObjects[0] = example;
            testCaseObjects[1] = -255;
            testCaseObjects[2] = -128;
            testCaseObjects[3] = -128;
            testCaseObjects[4] = null;
            argList.add(arguments(testCaseObjects));
        }
        {
            Example example = new Example();

            Object[] testCaseObjects = new Object[5];
            testCaseObjects[0] = example;
            testCaseObjects[1] = -255;
            testCaseObjects[2] = -255;
            testCaseObjects[3] = null;
            testCaseObjects[4] = IllegalArgumentException.class;
            argList.add(arguments(testCaseObjects));
        }

        return argList;
    }
}
```

## Configurations

UnitTestBot renders code in accordance with the chosen programming language, testing framework,
mocking and parameterization options.

Supported languages for code generation are:
- Java
- Kotlin (experimental): support for nullability and generics still has significant problems
- Python and JavaScript (in active development)

Supported testing frameworks are:
- JUnit 4
- JUnit 5
- TestNG (only for the projects with JDK 11 or later)

Supported mocking options are:
- No mocking
- Mocking with the Mockito framework
- Mocking static methods with Mockito

Parameterized tests can be generated in Java only. Parameterization is not supported when mocking is enabled or
when JUnit 4 is chosen as the testing framework.
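
The restrictions on parameterization can be read as a small validation rule. The sketch below is only an illustration of that rule: the class, the enums and the method name are hypothetical and are not part of the UnitTestBot API.

```java
import java.util.Optional;

/** Hypothetical model of the options listed above; none of these names come from UnitTestBot. */
final class CodegenConfigSketch {
    enum Language { JAVA, KOTLIN }
    enum TestFramework { JUNIT_4, JUNIT_5, TESTNG }
    enum Mocking { NONE, MOCKITO, MOCKITO_WITH_STATICS }

    /** Returns the reason why parameterized tests are unavailable, or empty if they are allowed. */
    static Optional<String> parameterizationProblem(Language language, TestFramework framework, Mocking mocking) {
        if (language != Language.JAVA) {
            return Optional.of("parameterized tests can be generated in Java only");
        }
        if (mocking != Mocking.NONE) {
            return Optional.of("parameterization is not supported when mocking is enabled");
        }
        if (framework == TestFramework.JUNIT_4) {
            return Optional.of("parameterization is not supported with JUnit 4");
        }
        return Optional.empty();
    }
}
```

With these assumptions, Java with JUnit 5 and no mocking passes the check, while any Kotlin, Mockito or JUnit 4 combination yields an explanatory message.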
170+
171+
## Entry points
172+
173+
The `codegen` module gets calls from various UnitTestBot components. The most common scenario is to call `codegen`
174+
from integration tests as well as from the `utbot-intellij` project and its `CodeGenerationController` class. The
175+
`utbot-online` and `utbot-cli` projects call `codegen` as well.
176+
177+
The `codegen` entry points are:
178+
- `CodeGenerator.generateAsString()`
179+
- `CodeGenerator.generateAsStringWithTestReport()`
180+
181+
The latter gets `UtExecution` information received from the symbolic engine or the fuzzer and converts it into the
182+
`codegen`-related data units, each called `CgMethodTestSet`. As a result of further processing, the test code is
183+
generated as a string with a test generation report (see [Reports](#Reports) for details).
184+
185+
Previously, `CgMethodTestSet` has been considerably different from `UtMethodTestSet` as it has been using
186+
`ExecutableId` instead of the legacy `UtMethod` (has been removed recently).
187+
For now, `CgMethodTestSet` contains utility functions required for code generation, mostly for parameterized tests.

## Abstract Syntax Tree (AST)

The `codegen` module converts `UtExecution` information to the AST representation.
We create one AST per source class (and one resulting test class). We use our own AST implementation.

We generate a `UtUtils` class containing a set of utility functions when they are necessary for a given test class.
If the `UtUtils` class has not been created previously, its AST representation is generated as well. To learn more
about the `UtUtils` class and how it is generated, refer to the
[design doc](https://github.com/UnitTestBot/UTBotJava/blob/main/docs/UtUtilsClass.md).

All the AST elements are `CgElement` inheritors.
`CgClassFile` is the top-level element: it contains `CgClass` with the required imports.

The class has a body (`CgClassBody`) as well as minor properties declared: documentation comments, annotations,
superclasses, interfaces, etc.

The class body is a set of `CgRegion` elements, each having a `header` and the corresponding `content`, which is mostly
a set of `CgMethod` elements.

The further AST levels are created similarly. The AST leaves are `CgLiteral`, `CgVariable`,
`CgLogicalOr`, `CgEmptyLine`, etc.
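
The nesting described above (file, class, regions, methods) can be pictured with a heavily simplified model. The classes below are hypothetical stand-ins written for this illustration; they are not the real `CgElement` hierarchy, and the region headers are invented.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified, hypothetical stand-in for the file -> class -> region -> method nesting. */
final class AstSketch {
    interface Element {}

    static final class MethodNode implements Element {
        final String name;
        MethodNode(String name) { this.name = name; }
    }

    static final class RegionNode implements Element {
        final String header;
        final List<MethodNode> content;
        RegionNode(String header, List<MethodNode> content) { this.header = header; this.content = content; }
    }

    static final class ClassNode implements Element {
        final String name;
        final List<RegionNode> body;
        ClassNode(String name, List<RegionNode> body) { this.name = name; this.body = body; }
    }

    static final class ClassFileNode implements Element {
        final List<String> imports;
        final ClassNode declaredClass;
        ClassFileNode(List<String> imports, ClassNode declaredClass) {
            this.imports = imports;
            this.declaredClass = declaredClass;
        }
    }

    /** Walks the tree from the top-level element down to the methods. */
    static int countMethods(ClassFileNode file) {
        int count = 0;
        for (RegionNode region : file.declaredClass.body) {
            count += region.content.size();
        }
        return count;
    }

    /** Builds a tree shaped like the ExampleStandardTest class shown earlier. */
    static ClassFileNode example() {
        RegionNode successful = new RegionNode("successful executions", Arrays.asList(
                new MethodNode("testMaxIfNotEquals_AGreaterThanB"),
                new MethodNode("testMaxIfNotEqualsReturnsOne")));
        RegionNode failing = new RegionNode("executions with exceptions", Arrays.asList(
                new MethodNode("testMaxIfNotEquals_AEqualsB")));
        ClassNode testClass = new ClassNode("ExampleStandardTest", Arrays.asList(successful, failing));
        return new ClassFileNode(Arrays.asList("org.junit.jupiter.api.Test"), testClass);
    }
}
```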

## Test method

The functionality described below is implemented in `CgMethodConstructor`.

To create a test method:
* store the initial values of the static fields and perform the seven steps for creating the test method body listed below,
* if the static field values undergo changes, perform these seven steps in the `try` block and recover these values in the `finally` block accordingly.

To create the test method body:
1. substitute static fields with local variables
2. set up instrumentation (get mocking information from `UtExecution`)
3. create a variable for the current instance
4. create variables for method-under-test arguments
5. record an actual result by calling method under test
6. generate result assertions
7. for successful tests, generate field state assertions
228+
_Note:_ generating assertions has pitfalls. In primitive cases like comparing two integers, we can use the standard
229+
assertions of a selected test framework. To compare two objects of an arbitrary type, we need a
230+
custom implementation of equality assertion, e.g. using `deepEquals()`. The `deepEquals()` method compares object
231+
structures field by field. The method is recursive: if the current field is not of the primitive type, we call
232+
`deepEquals()` for this field. The maximum recursion depth is limited.
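
The following is a minimal sketch of a depth-limited, field-by-field comparison of this kind. It is not the `deepEquals()` implementation from `UtUtils`: the depth limit of 5, the decision to report "not equal" once the limit is reached, and the `Node` class are all assumptions made for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class DeepEqualsSketch {
    static final int MAX_DEPTH = 5; // assumed limit, chosen for illustration only

    /** Hypothetical data class used to try the comparison out. */
    static final class Node {
        final int value;
        final Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    static boolean deepEquals(Object a, Object b) {
        return deepEquals(a, b, 0);
    }

    private static boolean deepEquals(Object a, Object b, int depth) {
        if (a == b) return true;
        if (a == null || b == null) return false;
        if (a.getClass() != b.getClass()) return false;
        // Boxed primitives and strings are compared by value, without recursion.
        if (a instanceof Number || a instanceof Boolean || a instanceof Character || a instanceof String) {
            return a.equals(b);
        }
        // Assumption: when the depth limit is reached, the objects are treated as not equal.
        if (depth >= MAX_DEPTH) return false;
        try {
            for (Class<?> type = a.getClass(); type != null && type != Object.class; type = type.getSuperclass()) {
                for (Field field : type.getDeclaredFields()) {
                    if (Modifier.isStatic(field.getModifiers())) continue;
                    // May fail for classes in modules that forbid reflective access (e.g. JDK internals).
                    field.setAccessible(true);
                    if (!deepEquals(field.get(a), field.get(b), depth + 1)) return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The depth limit also keeps the recursion finite for cyclic object graphs, at the price of reporting very deep but equal structures as different.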

For parameterized tests:
- we do not support mocking, so we do not set up the initial environment,
- we do not generate field state assertions.

`UtExecution` usually represents a single test scenario, and one `UtExecution` instance is used to create a single
test method. Parameterized tests are supposed to cover several test scenarios, so several `UtExecution` instances
are used to generate each of them.

## Generic execution

Parameterization often helps to reveal similarities between test scenarios. The combined outcome is sometimes more
expressive. To represent these similarities, we construct generic executions.

A generic execution is a synthetic execution formed from a group of real executions that have
- the same type of result,
- the same modified static fields.

Otherwise, we create several generic executions and several parameterized tests. The logic of splitting executions
into the test sets is implemented in the `CgMethodTestSet` class.

From the group of `UtExecution` elements, we take the first successful execution with a non-null result. See
`CgMethodConstructor.chooseGenericExecution()` for more details.
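
The grouping and selection logic can be sketched as follows. This is a simplified model, not the code of `CgMethodTestSet` or `chooseGenericExecution()`: the `Execution` class, the string-based grouping key and the sample data are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

final class GenericExecutionSketch {
    /** Hypothetical, much reduced stand-in for UtExecution. */
    static final class Execution {
        final String resultType;
        final Set<String> modifiedStatics;
        final Object result;       // null if the execution failed or returned null
        final boolean successful;

        Execution(String resultType, Set<String> modifiedStatics, Object result, boolean successful) {
            this.resultType = resultType;
            this.modifiedStatics = modifiedStatics;
            this.result = result;
            this.successful = successful;
        }
    }

    /** Groups executions that share the result type and the set of modified static fields. */
    static Map<String, List<Execution>> splitIntoGroups(List<Execution> executions) {
        Map<String, List<Execution>> groups = new LinkedHashMap<>();
        for (Execution execution : executions) {
            String key = execution.resultType + "|" + new TreeSet<>(execution.modifiedStatics);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(execution);
        }
        return groups;
    }

    /** Picks the first successful execution with a non-null result, if the group has one. */
    static Optional<Execution> chooseRepresentative(List<Execution> group) {
        for (Execution execution : group) {
            if (execution.successful && execution.result != null) {
                return Optional.of(execution);
            }
        }
        return Optional.empty();
    }

    /** Sample data: three executions without static changes and one that modifies a static field. */
    static List<Execution> sample() {
        Set<String> none = Collections.emptySet();
        return Arrays.asList(
                new Execution("int", none, null, false),
                new Execution("int", none, 1, true),
                new Execution("int", none, -128, true),
                new Execution("int", Collections.singleton("Example.counter"), 7, true));
    }
}
```

For the sample data, the first three executions form one group and would share a parameterized test, while the execution that modifies a static field ends up in a group of its own.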

## Renderer

We have a general approach for rendering the code of test classes. The `UtUtils` class is rendered differently: we
hardcode the required method implementations for the specific code generation language.

All the renderers implement the `CgVisitor` interface. It has a separate `visit()` method for each supported
`CgElement` item.

There are three renderers:
- `CgAbstractRenderer` for elements that are similar in Kotlin and Java
- `CgJavaRenderer` for Java-specific elements
- `CgKotlinRenderer` for Kotlin-specific elements

Each renderer method visits the current `CgElement`, which means that it prints all the instructions required for
that element. If an element contains another element, the parent delegates rendering to its _child_.

`CgVisitor` follows the _Visitor_ design pattern. Delegating means asking the _child_ element to
accept the renderer and manage it. Then we go through the list of `CgElement` types to find the first
matching `visit()` method.

_Note:_ the order of `CgElement` items listed in `CgElement.accept()` is important.

Rendering may be terminated if the generated code is too long (e.g. due to test generation bugs).
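
The dispatch described in this section, including the reason why the order of checks in `accept()` matters, can be shown with a small stand-alone sketch. The element and visitor types below are hypothetical and far smaller than the real `CgElement` and `CgVisitor` families.

```java
final class VisitorSketch {
    interface Visitor<R> {
        R visit(Literal element);
        R visit(StringLiteral element);
        R visit(Sum element);
    }

    abstract static class Element {
        /** Goes through the known element types and calls the first matching visit() method. */
        <R> R accept(Visitor<R> visitor) {
            // Order matters: StringLiteral extends Literal, so the subtype must be checked first.
            if (this instanceof StringLiteral) return visitor.visit((StringLiteral) this);
            if (this instanceof Literal) return visitor.visit((Literal) this);
            if (this instanceof Sum) return visitor.visit((Sum) this);
            throw new IllegalArgumentException("Unsupported element: " + getClass().getName());
        }
    }

    static class Literal extends Element {
        final Object value;
        Literal(Object value) { this.value = value; }
    }

    static final class StringLiteral extends Literal {
        StringLiteral(String value) { super(value); }
    }

    static final class Sum extends Element {
        final Element left;
        final Element right;
        Sum(Element left, Element right) { this.left = left; this.right = right; }
    }

    /** A renderer is a visitor that turns every element into text. */
    static final class Renderer implements Visitor<String> {
        @Override public String visit(Literal element) { return String.valueOf(element.value); }
        @Override public String visit(StringLiteral element) { return "\"" + element.value + "\""; }
        // A parent element delegates rendering to its children by asking them to accept the renderer.
        @Override public String visit(Sum element) {
            return element.left.accept(this) + " + " + element.right.accept(this);
        }
    }

    static String render(Element element) {
        return element.accept(new Renderer());
    }
}
```

If the `Literal` check were placed before the `StringLiteral` check, string literals would be rendered without quotes, which is the kind of problem the note on ordering warns about.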

## Reports

While constructing the test class, we create a test generation report. It contains basic statistical information: the
number of generated tests, the number of successful tests, etc. It may also contain information on potential problems
like trying to use mocks when the mocking framework is not installed.

The report is represented as an HTML string, which makes it possible to include clickable links.

_Note:_ no test generation reports are created for parameterized tests.

## Services

Services help the `codegen` module to produce human-readable test code.

### Name generator

With this service, we create names for variables and methods. It avoids duplicate names,
resolves conflicts with keywords, etc. It also adds suffixes if we process a mock or a static item.

Name generator is called directly from `CgStatementConstructor`.

_Note:_ if you need a new variable, prefer this service (e.g. the `newVar()` method) over calling
the `CgVariable` constructor manually.
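
As an illustration of what such a service has to do, here is a minimal sketch that avoids duplicates and keyword clashes. It is not the UnitTestBot name generator: the class, the `nextName()` method, the `Var` suffix and the shortened keyword list are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical name generator; the real service has a richer API and more rules. */
final class NameGeneratorSketch {
    // Deliberately incomplete keyword list, enough for the demonstration.
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("class", "int", "new", "return"));

    private final Set<String> usedNames = new HashSet<>();

    /** Returns a name based on baseName that is neither a keyword nor already taken. */
    String nextName(String baseName) {
        String candidate = KEYWORDS.contains(baseName) ? baseName + "Var" : baseName;
        String name = candidate;
        int index = 1;
        // Set.add returns false for a name that is already in use, so keep trying suffixes.
        while (!usedNames.add(name)) {
            name = candidate + index;
            index++;
        }
        return name;
    }
}
```

Because the generator remembers every name it hands out, creating variables through it (rather than constructing them by hand) is what keeps the names unique.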
304+
305+
### Framework and language services
306+
307+
Framework services help the `codegen` module to generate constructions (e.g. assertions) according to a given
308+
testing or mocking framework.
309+
Language services provide the `codegen` module with language-specific information on test class extensions,
310+
prohibited keywords, etc.
311+
312+
See the `Domain` file for more framework- and language-specific implementations.
313+
314+
### CgFieldStateManager
315+
316+
`CgFieldStateManager` stores the initial and the final environment states for the given method under test.
317+
These states are used for generating assertions. Usually, the environment state consists of three parts:
318+
* current instance state,
319+
* argument state,
320+
* static field state.
321+
322+
All the state-related variables are marked as `INITIAL` or `FINAL`.

### CgCallableAccessManager

This service helps to validate access. For example, if the current argument
list is valid for the method under test, `CgCallableAccessManager` checks whether one can call this method with these
arguments without using _Reflection_.

`CgCallableAccessManager` analyzes callables as well as fields for accessibility.
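
A rough idea of such an accessibility check is given below. It is a simplified sketch rather than the logic of `CgCallableAccessManager`: it looks only at modifiers and package names, ignores argument types, subclass access and nested-class rules, and all names in it are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class AccessSketch {
    /** Checks whether code in testPackage could call the method directly, judging by modifiers only. */
    static boolean isCallableWithoutReflection(Method method, String testPackage) {
        Class<?> owner = method.getDeclaringClass();
        boolean samePackage = packageOf(owner).equals(testPackage);
        boolean ownerVisible = Modifier.isPublic(owner.getModifiers()) || samePackage;

        int modifiers = method.getModifiers();
        if (Modifier.isPrivate(modifiers)) return false;
        // Protected and package-private methods are reachable from the same package.
        boolean methodVisible = Modifier.isPublic(modifiers) || samePackage;
        return ownerVisible && methodVisible;
    }

    private static String packageOf(Class<?> type) {
        String name = type.getName();
        int lastDot = name.lastIndexOf('.');
        return lastDot < 0 ? "" : name.substring(0, lastDot);
    }

    /** Helper for the demonstration: finds a declared method by name. */
    static Method findDeclared(Class<?> owner, String name) {
        for (Method method : owner.getDeclaredMethods()) {
            if (method.getName().equals(name)) return method;
        }
        throw new IllegalArgumentException("No method named " + name + " in " + owner.getName());
    }
}
```

When a check like this fails, the generated test has to fall back to reflective calls, which is why the access manager is consulted before a direct call is emitted.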

## CgContext

`CgContext` contains context information for code generation. The `codegen` module uses one
context per test class. `CgContext` also stores information about the scope for the inner context elements: e.g. when
they should be instantiated and cleared. For example, the context of a nested class is part of the owner class context.

`CgContext` is the so-called "God object" and should be split into independent storages and
helpers. This task seems to be difficult and is postponed for now.
