META-ISSUE: Performance task force #1734

Discussing performance in an environment such as libModSecurity, which is deployed in different ways and meant to be used in different scenarios, is always interesting because a change that benefits some users may hurt others. That said, this ticket will be open forever :).

> The definition of performance optimization is:
> The process of making something, especially a computer system, work as effectively as possible.

Because some optimizations may benefit only a few users, different angles and points of view are expected to be discussed. Luckily for us, numbers are the same for everybody. :)

The main point is: **what is good or bad performance?**

Usually performance is a trade-off between CPU, memory, and I/O, in a triangle shape: decrease the usage of one and the usage of another increases. libModSecurity was designed to support low latency, even if it costs throughput. That is desirable in a cloud-like deployment.

### How to measure the performance

Usually ModSecurity is deployed together with different components:
   - libModSecurity
   - Connector: ModSecurity-nginx, ModSecurity-apache [etc].
   - The webserver:  nginx, apache [etc].

Don't be confused: the _numbers here are about the ModSecurity library itself_. The library already contains the benchmark utility and the SystemTap (stap) scripts necessary to plot the data.

- https://github.com/SpiderLabs/ModSecurity/tree/v3/master/test/benchmark
- https://github.com/SpiderLabs/ModSecurity-nginx/blob/master/ngx-modsec.stp

Other testing components, tools and methods may be suitable for their own tickets and discussion.


### What the numbers mean

Do "bad" benchmark numbers mean that my web application will perform that badly? Not necessarily, although the two can be related. In a real-world deployment, other things such as _latency_ and our _back-end application_ add substantial processing time, making the overhead of the WAF negligible, at least in a theoretical scenario where well-performing rules meet a performant core.
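To make the "negligible overhead" point concrete, here is a small arithmetic sketch. Every timing below is a hypothetical value picked for illustration, not a measurement of libModSecurity or of any real deployment.

```python
def waf_share(waf_ms: float, network_ms: float, backend_ms: float) -> float:
    """Return the fraction of the total request time spent in the WAF."""
    total = waf_ms + network_ms + backend_ms
    return waf_ms / total


# Hypothetical timings in milliseconds (assumptions, not measurements).
# In an isolated benchmark the WAF is the only cost being measured.
isolated = waf_share(waf_ms=0.5, network_ms=0.0, backend_ms=0.0)
# In a deployment, network latency and the back end dominate the total.
deployed = waf_share(waf_ms=0.5, network_ms=40.0, backend_ms=59.5)

print(f"benchmark only:  {isolated:.1%} of the time is WAF")
print(f"full deployment: {deployed:.1%} of the time is WAF")
```

The same WAF cost that looks like the whole picture in a library benchmark can be a small slice of the end-to-end time, which is why the benchmark numbers are best read as a guide to where the library can improve rather than as a prediction of application response time.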

Notice that the request also plays an important role in this equation, and it is hard to say what a typical one looks like these days.

Knowing all those variables, the numbers give us a strong indication of where to look in order to improve performance.


### How can I send a suggestion here?

The idea is to take everybody's input into consideration. Decisions are based on what is best for the community overall, so, in order for everybody to understand your performance problem, please explain your suggestion by pointing to facts.


### I want to participate but I'm not following the discussions

There are a few tasks that you may want to help with:

##### Variable computation up to utilization

ModSecurity v3 computes all variables regardless of whether they are used or not; the variables are gradually filled every time a new piece of information is delivered to ModSecurity. It does not need to be that way. As the rules are pre-compiled, a variable only needs to be computed (and memory allocated for it) if it is really used. The architecture already fits this implementation; it just needs to be done. As part of this work, patches that remove configuration directives that will be deprecated are more than welcome, for example SecResponseBodyAccess.
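The idea of computing a variable only when a rule uses it can be sketched as follows. This is a conceptual illustration in Python, not libModSecurity code: the `Transaction` class, its `get` method, and the `used_variables` set are hypothetical names, and the set stands in for whatever the pre-compiled rules would report.

```python
from typing import Callable, Dict, List, Set


class Transaction:
    """Keeps the raw request data and builds variables only on first use."""

    def __init__(self, raw_request_line: str, used_variables: Set[str]):
        self.raw_request_line = raw_request_line
        # Assumption: names gathered once from the pre-compiled rules.
        self.used_variables = used_variables
        self._cache: Dict[str, str] = {}
        self.computed: List[str] = []  # records which variables were built

        # One builder per variable; nothing runs until get() is called.
        self._builders: Dict[str, Callable[[], str]] = {
            "REQUEST_LINE": lambda: self.raw_request_line,
            "REQUEST_METHOD": lambda: self.raw_request_line.split(" ")[0],
            "REQUEST_URI": lambda: self.raw_request_line.split(" ")[1],
        }

    def get(self, name: str) -> str:
        if name not in self.used_variables:
            raise KeyError(f"{name} is not referenced by any rule")
        if name not in self._cache:
            # Computation and allocation happen here, on first access only.
            self._cache[name] = self._builders[name]()
            self.computed.append(name)
        return self._cache[name]


tx = Transaction("GET /index.html?id=1 HTTP/1.1", used_variables={"REQUEST_URI"})
uri = tx.get("REQUEST_URI")
tx.get("REQUEST_URI")  # second access is served from the cache
print(uri, tx.computed)
```

In this sketch `REQUEST_LINE` and `REQUEST_METHOD` are never built because no rule asked for them, which is the saving the task is after.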

##### Memory pools to avoid memory fragmentation

Memory fragmentation in our case is a consequence of the very small pieces of information that are allocated in each transaction. On busy servers that will most certainly be an issue. Further info: [Memory fragmentation](https://en.wikipedia.org/wiki/Fragmentation_(computing))
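A per-transaction pool (arena) could look roughly like the sketch below. It is written in Python only to show the shape of the idea; a real implementation would live in the library's own language, and the `TransactionArena` class and its methods are hypothetical names, not an existing API.

```python
class TransactionArena:
    """Bump allocator over one preallocated buffer, reset per transaction."""

    def __init__(self, capacity: int):
        self._buf = bytearray(capacity)  # the single large allocation
        self._used = 0

    def store(self, data: bytes) -> memoryview:
        """Copy data into the arena and return a view onto the stored bytes."""
        end = self._used + len(data)
        if end > len(self._buf):
            raise MemoryError("arena exhausted")
        view = memoryview(self._buf)[self._used:end]
        view[:] = data
        self._used = end
        return view

    def reset(self) -> None:
        """Release everything at once; older views may later be overwritten."""
        self._used = 0

    @property
    def used(self) -> int:
        return self._used


arena = TransactionArena(capacity=64)
name = arena.store(b"User-Agent")
value = arena.store(b"curl/8.0")
used_before_reset = arena.used
print(bytes(name), bytes(value), used_before_reset)
arena.reset()  # end of transaction: one cheap reset instead of many frees
```

Many small pieces end up in one contiguous block that is released in a single step, so the general-purpose allocator never sees the small, short-lived allocations that cause fragmentation.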

##### Technical investigation on the feasibility to reduce the unused variables or variables with same or similar content.

There are some variables that may hold the same content, for example:

https://github.com/SpiderLabs/ModSecurity/wiki/Reference-Manual-(v2.x)#REQUEST_LINE
https://github.com/SpiderLabs/ModSecurity/wiki/Reference-Manual-(v2.x)#REQUEST_URI
https://github.com/SpiderLabs/ModSecurity/wiki/Reference-Manual-(v2.x)#REQUEST_URI_RAW

Do they need to co-exist? If so, can we use offsets so that they share the same memory space?
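One way to picture the offset idea is to store each variable as an (offset, length) pair into the request line instead of as its own copy. The sketch below is illustrative only: the `Span` type and `index_request_line` helper are hypothetical, and it assumes the variable's content appears verbatim inside the request line, which would not hold for a variable that is stored in a normalized form.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Span:
    """A variable expressed as (offset, length) into a shared buffer."""
    offset: int
    length: int

    def read(self, buf: bytes) -> bytes:
        # Slicing bytes copies in Python; a memoryview would avoid the copy.
        return buf[self.offset:self.offset + self.length]


def index_request_line(line: bytes) -> Dict[str, Span]:
    """Locate the method, target and protocol inside the request line."""
    first = line.index(b" ")   # space after the method
    last = line.rindex(b" ")   # space before the protocol
    return {
        "REQUEST_LINE": Span(0, len(line)),
        "REQUEST_METHOD": Span(0, first),
        "REQUEST_URI": Span(first + 1, last - first - 1),
        "REQUEST_PROTOCOL": Span(last + 1, len(line) - last - 1),
    }


line = b"GET /index.html?id=1 HTTP/1.1"
spans = index_request_line(line)
print(spans["REQUEST_URI"].read(line))
```

Only the request line itself holds bytes; the other variables cost two integers each.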

##### Support for cloud-like collection storage.

Collections are now saved into a key-value storage, which has the benefit of simplicity but is not optimal in terms of performance. Handing that off to an external process will lead to a good reduction in CPU usage.

##### Performant implementation for popular transformations.

Some of the transformations are executed too many times. Some already do in-memory operations, but better logic in the implementation will help a lot. Assembly is welcome here.


[To be updated with new stuff as needed.]

