diff --git a/docs/Fuzzing Platform.md b/docs/Fuzzing Platform.md index 686ff78ec8..37b2314faa 100644 --- a/docs/Fuzzing Platform.md +++ b/docs/Fuzzing Platform.md @@ -2,21 +2,22 @@ **Problem:** fuzzing is a versatile technique for generating values to be used as method arguments. Normally, to generate values, one needs information on a method signature, or rather on the parameter types (if a fuzzer is -able to "understand" them). _White-box_ approach also requires AST, and _grey-box_ approach needs coverage +able to "understand" them). +The _white-box_ approach also requires AST, and the _grey-box_ approach needs coverage information. To generate values that may serve as method arguments, the fuzzer uses generators, mutators, and predefined values. * _Generators_ yield concrete objects created by descriptions. The basic description for creating objects is _type_. - Constants, regular expressions, and other structured object specifications (e.g. in HTML) may be also used as + Constants, regular expressions, and other structured object specifications (e.g. in HTML) may also be used as descriptions. * _Mutators_ modify the object in accordance with some logic that usually means random changes. To get better results, mutators obtain feedback (information on coverage and the inner state of the program) during method call. -* _Predefined values_ work well for known problems, e.g. incorrect symbol sequences. To discover potential problems one can analyze parameter names as well as the specific constructs or method calls inside the method body. +* _Predefined values_ work well for known problems, e.g. incorrect symbol sequences. To discover potential problems, one can analyze parameter names as well as the specific constructs or method calls inside the method body. -General API for using fuzzer looks like this: +The general API for using the fuzzer looks like this: ``` fuzz( @@ -29,9 +30,14 @@ fuzz( } ``` -Fuzzer accepts list of types which can be provided in different formats: string, object or Class<*> in Java. Then seed -generator accepts these types and produces seeds which are used as base objects for value generation and mutations. -Fuzzing logic about how to choose, combine and mutate values from seed set is only fuzzing responsibility. API should not provide such abilities except general fuzzing configuring. +The fuzzer gets the list of types, +which can be provided in different formats: as a string, an object, or a Class<*> in Java. +The seed generator accepts these types and produces seeds. +The seeds are base objects for value generation and mutations. + +It is the fuzzer, which is responsible for choosing, combining and mutating values from the seed set. +The fuzzer API should not provide access to the inner fuzzing logic. +Only general configuration is available. ## Parameters @@ -42,26 +48,34 @@ The general fuzzing process gets the list of parameter descriptions as input and ``` In this particular case, the fuzzing process can generate the set of all the pairs having integer as the first value -and `true` or `false` as the second one. If values `-3, 0, 10` are generated to be the `Int` values, the set of all the possible combinations has six items: `(-3, false), (0, false), (10, false), (-3, true), (0, true), (10, true)`. Depending on the programming language, one may use interface descriptions or annotations (type hints) instead of defining the specific type. Fuzzing platform (FP) is not able to create the concrete objects as it does not deal with the specific languages. It still can convert the descriptions to the known constructs it can work with. - -Say, in most of the programming languages, any integer may be represented as a bit array, and fuzzer can construct and -modify bit arrays. So, in general case, the boundary values for the integer are these bit arrays: - -* [0, 0, 0, ..., 0] - null -* [1, 0, 0, ..., 0] - minimum value -* [0, 1, 1, ..., 1] - maximum value -* [0, 0, ..., 0, 1] - plus 1 -* [1, 1, 1, ..., 1] - minus 1 +and `true` or `false` as the second one. +If values `-3, 0, 10` are generated to be the `Int` values, the set of all the possible combinations has six items: +`(-3, false), (0, false), (10, false), (-3, true), (0, true), (10, true)`. +Depending on the programming language, +one may use interface descriptions or annotations (type hints) instead of defining the specific type. +Fuzzing platform (FP) is not able to create the concrete objects as it does not deal with the specific languages. +It can still convert the descriptions to the known constructs it can work with. + +Say, in most of the programming languages, any integer may be represented as a bit array, and the fuzzer can construct and +modify bit arrays. So, in the general case, the boundary values for the integer are these bit arrays: + +* [0, 0, 0, ..., 0] — zero +* [1, 0, 0, ..., 0] — minimum value +* [0, 1, 1, ..., 1] — maximum value +* [0, 0, ..., 0, 1] — plus 1 +* [1, 1, 1, ..., 1] — minus 1 One can correctly use this representation for unsigned integers as well: -* [0, 0, 0, ..., 0] - null (minimum value) -* [1, 0, 0, ..., 0] - maximum value / 2 -* [0, 1, 1, ..., 1] - maximum value / 2 + 1 -* [0, 0, ..., 0, 1] - plus 1 -* [1, 1, 1, ..., 1] - maximum value +* [0, 0, 0, ..., 0] — zero (minimum value) +* [1, 0, 0, ..., 0] — maximum value / 2 +* [0, 1, 1, ..., 1] — maximum value / 2 + 1 +* [0, 0, ..., 0, 1] — plus 1 +* [1, 1, 1, ..., 1] — maximum value -Thus, FP interprets the _Byte_ and _Unsigned Byte_ descriptions in different ways: in the former case, the maximum value is [0, 1, 1, 1, 1, 1, 1, 1], while in the latter case it is [1, 1, 1, 1, 1, 1, 1, 1]. FP types are described in details further. +Thus, FP interprets the _Byte_ and _Unsigned Byte_ descriptions in different ways: in the former case, +the maximum value is [0, 1, 1, 1, 1, 1, 1, 1], while in the latter case it is [1, 1, 1, 1, 1, 1, 1, 1]. +FP types are described in detail further. ## Refined parameter description @@ -79,19 +93,21 @@ public boolean isNaN(Number n) { In the above example, let the parameter be `Integer`. Considering the feedback, the fuzzer suggests that nothing but `Double` might increase coverage, so the type may be downcasted to `Double`. This allows for filtering out a priori unfitting values. ## Statically and dynamically generated values -Predefined, or _statically_ generated, values help to define the initial range of values, which could be used as method arguments. These values allow us to: +Predefined, or _statically_ generated, values help to define the initial range of values, which could be used as method arguments. -* check if it is possible to call the given method with at least some set of values as arguments, -* gather statistics on executing the program, +These values allow us to: +* check if it is possible to call the given method with at least some set of values as arguments; +* gather statistics on executing the program; * refine the parameter description. _Dynamic_ values are generated in two ways: - -* internally — via mutating the existing values, successfully performed as method arguments (i.e. seeds); -* externally — via obtaining feedback that can return not only the statistics on the execution (the paths explored, +* internally, via mutating the existing values, successfully performed as method arguments (i.e. seeds); +* externally, via obtaining feedback that can return not only the statistics on the execution (the paths explored, the time spent, etc.) but also the set of new values to be blended with the values already in use. -Dynamic values should have the higher priority for a sample, that's why they should be chosen either first or at least more likely than the statically generated ones. In general, the algorithm that guides the fuzzing process looks like this: +Dynamic values should have a higher priority for a sample; +that is why they should be chosen either first or at least more likely than the statically generated ones. +In general, the algorithm that guides the fuzzing process looks like this: ``` # dynamic values are stored with respect to their return priority @@ -135,7 +151,6 @@ Sometimes it is reasonable to modify the source code so that it makes applying f ## Generators There are two types of generators: - * yielding values of primitive data types: integers, strings, booleans * yielding values of recursive data types: objects, lists @@ -146,8 +161,7 @@ three modifications for it using `put(key, value)`. For this purpose, you may request for applying the fuzzer to six parameters `(key, value, key, value, key, value)` and get the necessary modified values. -Primitive type generators allow for yielding - +Primitive type generators allow for yielding: 1. Signed integers of a given size (8, 16, 32, and 64 bits, usually) 2. Unsigned integers of a given size 3. Floating-point numbers with a given size of significand and exponent according to IEEE 754 @@ -155,30 +169,25 @@ Primitive type generators allow for yielding 5. Characters (in UTF-16 format) 6. Strings (consisting of UTF-16 characters) -Fuzzer should be able to provide out-of-the-box support for these types — be able to create, modify, and process -them. To work with multiple languages it is enough to specify the possible type size and to describe and create the +The fuzzer should be able to provide out-of-the-box support for these types — be able to create, modify, and process +them. +To work with multiple languages, it is enough to specify the possible type size and to describe and create concrete objects based on the FP-generated values. The recursive types include two categories: - * Collections (arrays and lists) * Objects Collections may be nested and have _n_ dimensions (one, two, three, or more). Collections may be: - * of a fixed size (e.g., arrays) * of a variable size (e.g., lists and dictionaries) Objects may have: - 1. Constructors with parameters - 2. Modifiable inner fields - 3. Modifiable global values (the static ones) - 4. Calls for modifying methods FP should be able to create and describe such objects in the form of a tree. The semantics of actual modifications is under the responsibility of a programming language. diff --git a/docs/OverallArchitecture.md b/docs/OverallArchitecture.md index a5ef963223..ba9a48562a 100644 --- a/docs/OverallArchitecture.md +++ b/docs/OverallArchitecture.md @@ -112,7 +112,7 @@ sequenceDiagram The plugin provides * a UI for the IntelliJ-based IDEs to use UnitTestBot directly from source code, * the linkage between IntelliJ Platform API and UnitTestBot API, -* support for the most popular programming languages and frameworks for end users (the plugin and its optional dependencies are described in [plugin.xml](https://github.com/UnitTestBot/UTBotJava/blob/main/utbot-intellij/src/main/resources/META-INF/plugin.xml) and nearby, in the [`META-INF`](https://github.com/UnitTestBot/UTBotJava/tree/main/utbot-intellij/src/main/resources/META-INF) folder. +* support for the most popular programming languages and frameworks for end users (the plugin and its optional dependencies are described in [plugin.xml](https://github.com/UnitTestBot/UTBotJava/blob/main/utbot-intellij/src/main/resources/META-INF/plugin.xml) and nearby, in the [`META-INF`](https://github.com/UnitTestBot/UTBotJava/tree/main/utbot-intellij/src/main/resources/META-INF) folder). The main plugin module is [utbot-intellij](https://github.com/UnitTestBot/UTBotJava/tree/main/utbot-intellij), providing support for Java and Kotlin. Also, there is an auxiliary [utbot-ui-commons](https://github.com/UnitTestBot/UTBotJava/tree/main/utbot-ui-commons) module to support providers for other languages. @@ -124,7 +124,7 @@ As for the UI, there are two entry points: The main plugin-specific features are: * A common action for generating tests right from the editor or a project tree — with a generation scope from a single method up to the whole source root. See [GenerateTestAction](https://github.com/UnitTestBot/UTBotJava/blob/main/utbot-intellij/src/main/kotlin/org/utbot/intellij/plugin/ui/actions/GenerateTestsAction.kt) — the same for all supported languages. * Auto-installation of the user-chosen testing framework as a project library dependency (JUnit 4, JUnit 5, and TestNG are supported). See [UtIdeaProjectModelModifier](https://github.com/UnitTestBot/UTBotJava/blob/main/utbot-intellij/src/main/kotlin/org/utbot/intellij/plugin/util/UtIdeaProjectModelModifier.kt) and the Maven-specific version: [UtMavenProjectModelModifier](https://github.com/UnitTestBot/UTBotJava/blob/main/utbot-intellij/src/main/kotlin/org/utbot/intellij/plugin/util/UtMavenProjectModelModifier.kt). -* Suggesting the location for a test source root and auto-generating the `utbot_tests` folder there, providing users with a sandbox in their codespace. +* Suggesting the location for a test source root and auto-generating the `utbot_tests` folder there, providing users with a sandbox in their code space. * Optimizing generated code with IDE-provided intentions (experimental). See [IntentionHelper](https://github.com/UnitTestBot/UTBotJava/blob/main/utbot-intellij/src/main/kotlin/org/utbot/intellij/plugin/generator/IntentionHelper.kt) for details. * An option for distributing generation time between symbolic execution and fuzzing explicitly. * Running generated tests while showing coverage with the IDE-provided measurement tools. See [RunConfigurationHelper](https://github.com/UnitTestBot/UTBotJava/blob/main/utbot-intellij/src/main/kotlin/org/utbot/intellij/plugin/util/RunConfigurationHelper.kt) for implementation. @@ -241,7 +241,10 @@ The main instrumentation of UnitTestBot is [UtExecutionInstrumentation](https:// ### Code generator Code generation and rendering are a part of the test generation process in UnitTestBot. -UnitTestBot gets the synthetic representation of generated test cases from the fuzzer or the symbolic engine. This representation, or model, is implemented in the `UtExecution` class. The `codegen` module generates the real test code based on this `UtExecution` model and renders it in a human-readable form. +UnitTestBot gets the synthetic representation of generated test cases from the fuzzer or the symbolic engine. +This representation (or model) is implemented in the `UtExecution` class. +The `codegen` module generates the real test code based on this `UtExecution` model +and renders it in a human-readable form. The `codegen` module - converts `UtExecution` test information into an Abstract Syntax Tree (AST) representation using `CodeGenerator`, @@ -287,7 +290,7 @@ To minimize the number of executions in a group, we use a simple greedy algorith 2. Add this execution to the final suite and mark new lines as covered. 3. Repeat the first step and continue till there are executions containing uncovered lines. -The whole minimization procedure is located in the [org.utbopt.framework.minimization](utbot-framework/src/main/kotlin/org/utbot/framework/minimization) package inside the [utbot-framework](../utbot-framework) module. +The whole minimization procedure is located in the [org.utbot.framework.minimization](../utbot-framework/src/main/kotlin/org/utbot/framework/minimization) package inside the [utbot-framework](../utbot-framework) module. ### Summarization module @@ -309,7 +312,7 @@ For detailed information, please refer to the Summarization architecture design ### SARIF report generator -SARIF (Static Analysis Results Interchange Format) is a JSON–based format for displaying static analysis results. +SARIF (Static Analysis Results Interchange Format) is a JSON-based format for displaying static analysis results. All the necessary information about the format and its usage can be found in the [official documentation](https://github.com/microsoft/sarif-tutorials/blob/main/README.md) @@ -346,7 +349,8 @@ UnitTestBot consists of three processes (according to the execution order): These processes are built on top of the [Reactive distributed communication framework (Rd)](https://github.com/JetBrains/rd) developed by JetBrains. -One of the main Rd concepts is _Lifetime_ — it helps to release shared resources upon the object's termination. You can find the Rd basic ideas and UnitTestBot implementation details in the [Multiprocess architecture](https://github.com/UnitTestBot/UTBotJava/blob/main/docs/RD%20for%20UnitTestBot.md) design doc. +One of the main Rd concepts is a _Lifetime_ — it helps to release shared resources upon the object's termination. +You can find the Rd basic ideas and UnitTestBot implementation details in the [Multiprocess architecture](https://github.com/UnitTestBot/UTBotJava/blob/main/docs/RD%20for%20UnitTestBot.md) design doc. ### Settings @@ -362,4 +366,15 @@ The end user has three places to change UnitTestBot behavior: 3. Controls in the **Generate Tests with UnitTestBot window** dialog — for per-generation settings. ### Logging -TODO \ No newline at end of file + +The UnitTestBot Java logging system is implemented across the IDE process, the Engine process, and the Instrumented process. + +UnitTestBot Java logging relies on `log4j2` library. +The custom Rd logging system is recommended as the default one for the Instrumented process. + +In the [Logging](../docs/contributing/InterProcessLogging.md) document, +you can find how to configure the logging system when UnitTestBot Java is used +* as an IntelliJ IDEA plugin, +* as Contest estimator or the Gradle/Maven plugins, via CLI or during the CI test runs. + +Implementation details, log level and performance questions are also addressed [here](../docs/contributing/InterProcessLogging.md). \ No newline at end of file diff --git a/docs/RD for UnitTestBot.md b/docs/RD for UnitTestBot.md index 3becab04da..49ffba4e4d 100644 --- a/docs/RD for UnitTestBot.md +++ b/docs/RD for UnitTestBot.md @@ -77,7 +77,7 @@ executing all the callbacks because some other thread executes them. Rd is a lightweight reactive one-to-one RPC protocol, which is cross-language as well as cross-platform. It can work on the same or different machines via the Internet. -These are some of Rd entities: +These are some Rd entities: - `Protocol` encapsulates the logic of all Rd communications. All the entities should be bound to `Protocol` before being used. `Protocol` contains `IScheduler`, which executes a _runnable_ instance on a different thread. - `RdSignal` is an entity allowing one to **fire and forget**. You can add a callback for every received message @@ -228,7 +228,7 @@ Sometimes the _Instrumented process_ may unexpectedly die due to concrete execut - **Important**: do not add [`Rdgen`](https://mvnrepository.com/artifact/com.jetbrains.rd/rd-gen) as an implementation dependency — it breaks some JAR files as it contains `kotlin-compiler-embeddable`. 5. Logging & debugging: - - [Interprocess logging](./InterProcessLogging.md) + - [Interprocess logging](contributing/InterProcessLogging.md) - [Interprocess debugging](./contributing/InterProcessDebugging.md) 6. Custom protocol marshaling types: do not spend time on it until `UtModels` get simpler, e.g. compatible with `kotlinx.serialization`. diff --git a/docs/contributing/InterProcessDebugging.md b/docs/contributing/InterProcessDebugging.md index 864a14e865..d488d7f56f 100644 --- a/docs/contributing/InterProcessDebugging.md +++ b/docs/contributing/InterProcessDebugging.md @@ -62,7 +62,7 @@ To debug the _Engine process_ and the _Instrumented process_, you need to enable "-agentlib:jdwp=transport=dt_socket,server=n,suspend=n,quiet=y,address=12345" ``` See `org.utbot.intellij.plugin.process.EngineProcess.Companion.debugArgument` for switch implementation. -4. For information about logs, refer to the [Interprocess logging](../InterProcessLogging.md) guide. +4. For information about logs, refer to the [Interprocess logging](InterProcessLogging.md) guide. ### Run configurations for debugging the Engine process diff --git a/docs/InterProcessLogging.md b/docs/contributing/InterProcessLogging.md similarity index 92% rename from docs/InterProcessLogging.md rename to docs/contributing/InterProcessLogging.md index e54c27c864..a023dba59e 100644 --- a/docs/InterProcessLogging.md +++ b/docs/contributing/InterProcessLogging.md @@ -1,248 +1,248 @@ -# Interprocess logging - -This document describes -how logging is implemented across the UnitTestBot Java [processes](https://github.com/UnitTestBot/UTBotJava/blob/main/docs/RD%20for%20UnitTestBot.md): -the IDE process, the Engine process, and the Instrumented process. - -## Architecture - -The UnitTestBot Java logging system relies on `log4j2` library. - -For UnitTestBot Java used as an IntelliJ IDEA plugin, the configuration file for logging is [`utbot-intellij/log4j2.xml`](../utbot-intellij/src/main/resources/log4j2.xml). - -When used as Contest estimator or the Gradle/Maven plugins, via CLI or during the CI test runs, -UnitTestBot Java engine searches classpath for the first `log4j2.xml` in the `resources` directory. - -### IDE process - -The IDE process writes logging information to standard `idea.log` files and puts them into the default log directory. - -To configure logs for the IDE process, use the log configuration file in a straightforward way. - -To change the log configuration for the prebuilt plugin, -go to **Help** > **Diagnostic Tools** > **Debug Log Settings...** and configure `log4j2.xml`. - -To store log data for the Engine process started from the IDE process, the UnitTestBot Java plugin creates a directory: -`org.utbot.intellij.plugin.process.EngineProcessKt.engineProcessLogDirectory`. - -### Engine process - -The Engine process can be started either from IntelliJ IDEA or separately — as a standalone engine. - -#### Engine process started from IntelliJ IDEA - -As the plugin does not support multiple generation processes, -the logs from the Engine process are written to the same file. - -The default log file directory is `%user_temp%/UtBot/rdEngineProcessLogs`. - -The [`utbot-intellij/log4j2.xml`](../utbot-intellij/src/main/resources/log4j2.xml) file is copied to the -UnitTestBot Java temporary directory: -`org.utbot.intellij.plugin.process.EngineProcessKt.engineProcessLogConfigurationsDirectory`. -Then this file is provided to the Engine process via the following CLI switch: - ``` - -Dlog4j2.configurationFile=%configuration_file% - ``` - -Here, `%configuration_file%` can take one of two values: -1. A modified copy of [`utbot-intellij/log4j2.xml`](../utbot-intellij/src/main/resources/log4j2.xml) file, which is stored in UnitTestBot Java temporary directory. - - More precisely, there are 2 appenders in the configuration file: - ```xml - - - - - ``` - By default, `IdeaAppender` is used everywhere in a file for the IDE plugin. - - For the Engine process, a temporary `log4j2.xml` is created, - where the `ref="IdeaAppender"` substring is replaced with `ref="EngineProcessAppender"`: - this replacement changes all the appenders and the log pattern - but keeps categories and log levels for the loggers the same. - - As soon as the file reaches 20 MB size, `RollingFileAppender` writes the logs to the `utbot-engine-current.log` file. - The created log files are named `utbot-engine-%i.log`. - A log file with the largest index is the latest one: `utbot-engine-1.log` has been created earlier than `utbot-engine-10.log`. - - Each time the Engine process starts, the following lines are printed into the IntelliJ IDEA log: - ``` - | UtBot - EngineProcess | Engine process started with PID = 4172 - | UtBot - EngineProcess | Engine process log directory - C:\Users\user_name\AppData\Local\Temp\UTBot\rdEngineProcessLogs - | UtBot - EngineProcess | Engine process log file - C:\Users\user_name\AppData\Local\Temp\UTBot\rdEngineProcessLogs\utbot-engine-current.log - ``` - -2. A path from `UtSettings.engineProcessLogConfigFile`. - - The option provides the external `log4j2` configuration file with the path instead of [`utbot-intellij/log4j2.xml`](../utbot-intellij/src/main/resources/log4j2.xml). - In the `~/.utbot/settings.properties` file, one can set this path to a custom configuration file applicable to the Engine process, for example: - ``` - engineProcessLogConfigFile=C:\wrk\UTBotJava\engineProcessLog4j2.xml - ``` - This allows you to configure logs for the Engine process even for the prebuilt plugin (you need to restart an IDE). - -#### Engine process started separately - -When used as Contest estimator or the Gradle/Maven plugins, via CLI or during the CI test runs, -UnitTestBot Java engine searches classpath for the first `log4j2.xml` in the `resources` directory -to get configuration information. - -### Instrumented process - -The Instrumented process sends the logs to its parent — to the Engine process. -Logs are sent via the corresponding Rd model: `org.utbot.rd.models.LoggerModel`. - -See also `org.utbot.instrumentation.rd.InstrumentedProcess.Companion.invoke` and -`org.utbot.instrumentation.process.InstrumentedProcessMainKt.main`. - -## Rd logging system - -Rd has the custom logging system based on `com.jetbrains.rd.util.Logger` interface. -It is convenient to set the Rd logging system as default for the Instrumented process: -during concrete execution, -the `log4j2` classes in UnitTestBot Java could be confused with the `log4j2` classes from the project under test. -Duplicated `log4j2` libraries can break instrumentation and coverage statistics. - -You should always override the default Rd logging strategy, which writes log data to `stdout/stderr`. -Use `com.jetbrains.rd.util.Logger.Companion.set` method to provide custom -`com.jetbrains.rd.util.ILoggerFactory`. -The created loggers will be automatically re-instantiated to obtain a new logger from the provided factory. -You can obtain a logger via the `com.jetbrains.rd.util.getLogger` function. -Check `EngineProcessMain` for Rd logging example. - -For available Rd factories, see the `org.utbot.rd.loggers` package: it contains the implemented factories. -The format of the log messages is the same as described in `utbot-intellij/src/main/resources/log4j2.xml`. - -## Implementation details - -### Additivity - -An entry may appear in a log many times due to _additivity_. The resulting log may look like this: -``` -13:55:41.204 | INFO | AnalyticsConfigureUtil | PathSelectorType: INHERITORS_SELECTOR -13:55:41.204 | INFO | AnalyticsConfigureUtil | PathSelectorType: INHERITORS_SELECTOR -``` - -The logger's full name constitutes a tree structure so that the logged events from a child are visible to a parent. - -For example, the following `log4j2.xml` configuration in IntelliJ IDEA will produce such a problem: -```xml -... - - - - - - - - -... -``` - -This happens because the `org.utbot` logger is a parent to `org.utbot.intellij`, and all the events from -`org.utbot.intellij` are also transferred to `org.utbot`. - -To modify this behavior, add the `additivity="false"` tag to all loggers manually: -```xml -... - - - - - - - - -... -``` - -Consider this problem when you manually configure log level and appender for a logger. - -For more information, -refer to the [`log4j2` additivity](https://logging.apache.org/log4j/2.x/manual/configuration.html#Additivity) document. - -### Logging: auxiliary methods - -Find more logging-related methods at `UtRdLogUtil.kt` and `Logging.kt`. - -To trace the execution duration, -use the `measureTime` method (see `Logging.kt`) with the corresponding log level scope. - -In the Engine process, the entries from the Instrumented process are logged by `org.utbot.instrumentation.rd.InstrumentedProcessKt.rdLogger`. - -## Log levels and performance - -For development, the `Debug` level is preferred in most cases. - -The `Info` log level is sufficient for release. - -In Rd, if you choose the `Trace` level for all loggers or set it as default for the root logger, -this enables logging for all technical _send/receive_ events from protocol. -It may cause ~50 MB of additional entries per generation to appear and _heavily_ pollutes the log. This might be useful -for troubleshooting interprocess communication but in all other cases prefer the `Debug` level or -specify the `Trace` level per logger explicitly. - -For the `Debug` level, if a log message requires heavy string interpolation, wrap it in lambda, for example: -```kotlin -val someVeryBigDataStructure = VeryBigDataStructure() - -logger.debug("data structure representation - $someVeryBigDataStructure") // <---- interpolation -``` -Here, even for a message with the `Debug` level, interpolation will always occur because -the message is passed as a parameter, which is evaluated at call site. -If the `Info` level (or higher) is set for a logger, -the message is built, but not logged, -resulting in unnecessary work, possibly causing performance issues. - -Consider using lambdas: -```kotlin -// message will be created only if debug log level is available -logger.debug { "data structure representation - $someVeryBigDataStructure"} -``` - -Here, although the logs are sent from one process to another, no performance penalties have been noticed. - -To reach higher performance, try to use `bufferedIO` and `immediateFlush` properties in `log4j2.xml`. -For example, you can make the following changes to the `log4j2.xml` file in `utbot-intellij`: -```xml - -``` - -This will reduce a number of I/O operations and help to use `log4j2` buffer more efficiently. -This may also have a flip side: -when the process terminates, `log4j2` terminates the logging service before the buffer is flushed, and -you will lose the last portion of logs. -This behavior is undesirable for testing and debugging, -but probably acceptable for release. - -## Docker and Gradle - -To see the logs in Gradle from console, Docker and CI, add the following `build.gradle.kts` file: -```kotlin -allprojects { - tasks { - withType { - testLogging.showStandardStreams = true - testLogging.showStackTraces = true - } - } -} -``` - -## Useful links - -UnitTestBot Java documentation: -1. [Multiprocess architecture](RD%20for%20UnitTestBot.md) -2. [Interprocess debugging](./contributing/InterProcessDebugging.md) -3. [How to use loggers](../HowToUseLoggers.md) - -`log4j2` documentation: -1. [Architecture](https://logging.apache.org/log4j/2.x/manual/architecture.html) — an overall `log4j2` description. -2. [Layouts](https://logging.apache.org/log4j/2.x/manual/layouts.html) — how to format log messages. - (UnitTestBot Java uses `Pattern layout` everywhere.) -3. [Appenders](https://logging.apache.org/log4j/2.x/manual/appenders.html) — - a description of various ways to store log entries (and how to configure the storages). - UnitTestBot Java uses the `Console`, `File` and `RollingFile` appenders. -4. [Configuration](https://logging.apache.org/log4j/2.x/manual/configuration.html) — - how to use a configuration file, how to check the file, and other useful information. +# Interprocess logging + +This document describes +how logging is implemented across the UnitTestBot Java [processes](https://github.com/UnitTestBot/UTBotJava/blob/main/docs/RD%20for%20UnitTestBot.md): +the IDE process, the Engine process, and the Instrumented process. + +## Architecture + +The UnitTestBot Java logging system relies on `log4j2` library. + +For UnitTestBot Java used as an IntelliJ IDEA plugin, the configuration file for logging is [`utbot-intellij/log4j2.xml`](../../utbot-intellij/src/main/resources/log4j2.xml). + +When used as Contest estimator or the Gradle/Maven plugins, via CLI or during the CI test runs, +UnitTestBot Java engine searches classpath for the first `log4j2.xml` in the `resources` directory. + +### IDE process + +The IDE process writes logging information to standard `idea.log` files and puts them into the default log directory. + +To configure logs for the IDE process, use the log configuration file in a straightforward way. + +To change the log configuration for the prebuilt plugin, +go to **Help** > **Diagnostic Tools** > **Debug Log Settings...** and configure `log4j2.xml`. + +To store log data for the Engine process started from the IDE process, the UnitTestBot Java plugin creates a directory: +`org.utbot.intellij.plugin.process.EngineProcessKt.engineProcessLogDirectory`. + +### Engine process + +The Engine process can be started either from IntelliJ IDEA or separately — as a standalone engine. + +#### Engine process started from IntelliJ IDEA + +As the plugin does not support multiple generation processes, +the logs from the Engine process are written to the same file. + +The default log file directory is `%user_temp%/UtBot/rdEngineProcessLogs`. + +The [`utbot-intellij/log4j2.xml`](../../utbot-intellij/src/main/resources/log4j2.xml) file is copied to the +UnitTestBot Java temporary directory: +`org.utbot.intellij.plugin.process.EngineProcessKt.engineProcessLogConfigurationsDirectory`. +Then this file is provided to the Engine process via the following CLI switch: + ``` + -Dlog4j2.configurationFile=%configuration_file% + ``` + +Here, `%configuration_file%` can take one of two values: +1. A modified copy of [`utbot-intellij/log4j2.xml`](../../utbot-intellij/src/main/resources/log4j2.xml) file, which is stored in UnitTestBot Java temporary directory. + + More precisely, there are 2 appenders in the configuration file: + ```xml + + + + + ``` + By default, `IdeaAppender` is used everywhere in a file for the IDE plugin. + + For the Engine process, a temporary `log4j2.xml` is created, + where the `ref="IdeaAppender"` substring is replaced with `ref="EngineProcessAppender"`: + this replacement changes all the appenders and the log pattern + but keeps categories and log levels for the loggers the same. + + As soon as the file reaches 20 MB size, `RollingFileAppender` writes the logs to the `utbot-engine-current.log` file. + The created log files are named `utbot-engine-%i.log`. + A log file with the largest index is the latest one: `utbot-engine-1.log` has been created earlier than `utbot-engine-10.log`. + + Each time the Engine process starts, the following lines are printed into the IntelliJ IDEA log: + ``` + | UtBot - EngineProcess | Engine process started with PID = 4172 + | UtBot - EngineProcess | Engine process log directory - C:\Users\user_name\AppData\Local\Temp\UTBot\rdEngineProcessLogs + | UtBot - EngineProcess | Engine process log file - C:\Users\user_name\AppData\Local\Temp\UTBot\rdEngineProcessLogs\utbot-engine-current.log + ``` + +2. A path from `UtSettings.engineProcessLogConfigFile`. + + The option provides the external `log4j2` configuration file with the path instead of [`utbot-intellij/log4j2.xml`](../../utbot-intellij/src/main/resources/log4j2.xml). + In the `~/.utbot/settings.properties` file, one can set this path to a custom configuration file applicable to the Engine process, for example: + ``` + engineProcessLogConfigFile=C:\wrk\UTBotJava\engineProcessLog4j2.xml + ``` + This allows you to configure logs for the Engine process even for the prebuilt plugin (you need to restart an IDE). + +#### Engine process started separately + +When used as Contest estimator or the Gradle/Maven plugins, via CLI or during the CI test runs, +UnitTestBot Java engine searches classpath for the first `log4j2.xml` in the `resources` directory +to get configuration information. + +### Instrumented process + +The Instrumented process sends the logs to its parent — to the Engine process. +Logs are sent via the corresponding Rd model: `org.utbot.rd.models.LoggerModel`. + +See also `org.utbot.instrumentation.rd.InstrumentedProcess.Companion.invoke` and +`org.utbot.instrumentation.process.InstrumentedProcessMainKt.main`. + +## Rd logging system + +Rd has the custom logging system based on `com.jetbrains.rd.util.Logger` interface. +It is convenient to set the Rd logging system as default for the Instrumented process: +during concrete execution, +the `log4j2` classes in UnitTestBot Java could be confused with the `log4j2` classes from the project under test. +Duplicated `log4j2` libraries can break instrumentation and coverage statistics. + +You should always override the default Rd logging strategy, which writes log data to `stdout/stderr`. +Use `com.jetbrains.rd.util.Logger.Companion.set` method to provide custom +`com.jetbrains.rd.util.ILoggerFactory`. +The created loggers will be automatically re-instantiated to obtain a new logger from the provided factory. +You can obtain a logger via the `com.jetbrains.rd.util.getLogger` function. +Check `EngineProcessMain` for Rd logging example. + +For available Rd factories, see the `org.utbot.rd.loggers` package: it contains the implemented factories. +The format of the log messages is the same as described in `utbot-intellij/src/main/resources/log4j2.xml`. + +## Implementation details + +### Additivity + +An entry may appear in a log many times due to _additivity_. The resulting log may look like this: +``` +13:55:41.204 | INFO | AnalyticsConfigureUtil | PathSelectorType: INHERITORS_SELECTOR +13:55:41.204 | INFO | AnalyticsConfigureUtil | PathSelectorType: INHERITORS_SELECTOR +``` + +The logger's full name constitutes a tree structure so that the logged events from a child are visible to a parent. + +For example, the following `log4j2.xml` configuration in IntelliJ IDEA will produce such a problem: +```xml +... + + + + + + + + +... +``` + +This happens because the `org.utbot` logger is a parent to `org.utbot.intellij`, and all the events from +`org.utbot.intellij` are also transferred to `org.utbot`. + +To modify this behavior, add the `additivity="false"` tag to all loggers manually: +```xml +... + + + + + + + + +... +``` + +Consider this problem when you manually configure log level and appender for a logger. + +For more information, +refer to the [`log4j2` additivity](https://logging.apache.org/log4j/2.x/manual/configuration.html#Additivity) document. + +### Logging: auxiliary methods + +Find more logging-related methods at `UtRdLogUtil.kt` and `Logging.kt`. + +To trace the execution duration, +use the `measureTime` method (see `Logging.kt`) with the corresponding log level scope. + +In the Engine process, the entries from the Instrumented process are logged by `org.utbot.instrumentation.rd.InstrumentedProcessKt.rdLogger`. + +## Log levels and performance + +For development, the `Debug` level is preferred in most cases. + +The `Info` log level is sufficient for release. + +In Rd, if you choose the `Trace` level for all loggers or set it as default for the root logger, +this enables logging for all technical _send/receive_ events from protocol. +It may cause ~50 MB of additional entries per generation to appear and _heavily_ pollutes the log. This might be useful +for troubleshooting interprocess communication but in all other cases prefer the `Debug` level or +specify the `Trace` level per logger explicitly. + +For the `Debug` level, if a log message requires heavy string interpolation, wrap it in lambda, for example: +```kotlin +val someVeryBigDataStructure = VeryBigDataStructure() + +logger.debug("data structure representation - $someVeryBigDataStructure") // <---- interpolation +``` +Here, even for a message with the `Debug` level, interpolation will always occur because +the message is passed as a parameter, which is evaluated at call site. +If the `Info` level (or higher) is set for a logger, +the message is built, but not logged, +resulting in unnecessary work, possibly causing performance issues. + +Consider using lambdas: +```kotlin +// message will be created only if debug log level is available +logger.debug { "data structure representation - $someVeryBigDataStructure"} +``` + +Here, although the logs are sent from one process to another, no performance penalties have been noticed. + +To reach higher performance, try to use `bufferedIO` and `immediateFlush` properties in `log4j2.xml`. +For example, you can make the following changes to the `log4j2.xml` file in `utbot-intellij`: +```xml + +``` + +This will reduce a number of I/O operations and help to use `log4j2` buffer more efficiently. +This may also have a flip side: +when the process terminates, `log4j2` terminates the logging service before the buffer is flushed, and +you will lose the last portion of logs. +This behavior is undesirable for testing and debugging, +but probably acceptable for release. + +## Docker and Gradle + +To see the logs in Gradle from console, Docker and CI, add the following `build.gradle.kts` file: +```kotlin +allprojects { + tasks { + withType { + testLogging.showStandardStreams = true + testLogging.showStackTraces = true + } + } +} +``` + +## Useful links + +UnitTestBot Java documentation: +1. [Multiprocess architecture](../RD%20for%20UnitTestBot.md) +2. [Interprocess debugging](InterProcessDebugging.md) +3. [How to use loggers](../../HowToUseLoggers.md) + +`log4j2` documentation: +1. [Architecture](https://logging.apache.org/log4j/2.x/manual/architecture.html) — an overall `log4j2` description. +2. [Layouts](https://logging.apache.org/log4j/2.x/manual/layouts.html) — how to format log messages. + (UnitTestBot Java uses `Pattern layout` everywhere.) +3. [Appenders](https://logging.apache.org/log4j/2.x/manual/appenders.html) — + a description of various ways to store log entries (and how to configure the storages). + UnitTestBot Java uses the `Console`, `File` and `RollingFile` appenders. +4. [Configuration](https://logging.apache.org/log4j/2.x/manual/configuration.html) — + how to use a configuration file, how to check the file, and other useful information. It is **highly advised** to read the `Additivity` part. \ No newline at end of file