Out-of-bounds Read in BehaviorTree.CPP Script Validation

When parsing invalid scripts that generate large error messages, the `ValidateScript` function may attempt to create a string from a non-NULL-terminated buffer, causing `strlen` to read beyond the buffer's bounds. This occurs because a fixed-size stack buffer is used to store error messages without ensuring proper NULL termination.

https://github.com/BehaviorTree/BehaviorTree.CPP/blob/b4bbef6800dca639dcf0316f0a5312c6f10e27ed/src/script_parser.cpp#L70

Found in commit: https://github.com/BehaviorTree/BehaviorTree.CPP/commit/48f6c5bd1d797c73a562f1c92f404d0bb944e018


## Bug Class
Out-of-bounds Read / Buffer Over-read ([CWE-125](https://cwe.mitre.org/data/definitions/125.html))

## Root Cause
In `src/script_parser.cpp`, the `ValidateScript` function allocates a fixed-size buffer of 2048 bytes for error messages:


```cpp
Result ValidateScript(const std::string& script)
{
  char error_msgs_buffer[2048];

  auto input = lexy::string_input<lexy::utf8_encoding>(script);
  auto result =
      lexy::parse<BT::Grammar::stmt>(input, ErrorReport().to(error_msgs_buffer));
  if(result.has_value() && result.error_count() == 0)
  {
    try
    {
      std::vector<BT::Ast::ExprBase::Ptr> exprs = LEXY_MOV(result).value();
      if(exprs.empty())
      {
        return nonstd::make_unexpected("Empty Script");
      }
      // valid script
      return {};
    }
    catch(std::runtime_error& err)
    {
      return nonstd::make_unexpected(err.what());
    }
  }
  return nonstd::make_unexpected(error_msgs_buffer);
}

}  // namespace BT
```

This buffer is passed to lexy's error reporter, which fills it with error messages. When parsing fails, the buffer content is used to construct a string via `nonstd::make_unexpected`.
The string construction internally calls `strlen` to determine the buffer length. However, there's no guarantee that:

- The buffer is NULL-terminated
- The error messages fit within the buffer
- The last byte is reserved for a NULL terminator

This leads to `strlen` reading past the buffer bounds looking for a NULL terminator that isn't there.

## Stack trace

```
==2543337==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffff5a09820 at pc 0x5555555ae3af bp 0x7fffffffcd10 sp 0x7fffffffc4d8
READ of size 2049 at 0x7ffff5a09820 thread T0
    #0 0x5555555ae3ae in strlen (/home/user/BehaviorTree.CPP/build3/script_fuzzer+0x5a3ae) (BuildId: 390cd92534c04116bddfa049abd4832d7b583113)
    #1 0x555555692d1c in std::char_traits<char>::length(char const*) /usr/lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/char_traits.h:399:9
    #2 0x555555679d3c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::basic_string<std::allocator<char>>(char const*, std::allocator<char> const&) /usr/lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/basic_string.h:648:30
    #3 0x5555558ff01b in nonstd::expected_lite::expected<std::monostate, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>::expected<char*, 0>(nonstd::expected_lite::unexpected_type<char*>&&) /home/user/BehaviorTree.CPP/include/behaviortree_cpp/contrib/expected.hpp:1979:36
    #4 0x5555558fe109 in BT::ValidateScript(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /home/user/BehaviorTree.CPP/src/script_parser.cpp:94:10
[...]
```

## Reproducer

```cpp
#include "behaviortree_cpp/scripting/script_parser.hpp"
#include <iostream>

int main()
{
  std::string script = "+6e66>6666.6+66\r6>6;6e62=6+6e66>66666'; en';o';o'; en'; "
                       "\177n\177r;6.6+66.H>6+6\2006\036;@e66";

  auto result = BT::ValidateScript(script);

  if(!result)
  {
    std::cout << "Script validation failed as expected" << std::endl;
  }

  return 0;
}
```

__NOTE__: This requires that `libbheaviortree.so` is compiled with `-fsanitize=address`.

## Proposed fix

I'm by no means an expert for `lexy`, but given that [lexy's parse function accepts a template callback for error handling](https://github.com/foonathan/lexy/blob/34d2adf74a2b25b6bdd760a3bbb931f3fd5e60cd/include/lexy/action/parse.hpp#L169-L174):

```cpp
/// Parses the production into a value, invoking the callback on error.
template <typename Production, typename Input, typename ErrorCallback>
constexpr auto parse(const Input& input, const ErrorCallback& callback)
{
    return parse_action<void, Input, ErrorCallback>(callback)(Production{}, input);
}

```


A recommended fix seems to implement a safe error callback that manages its own memory instead of a fix-sized stack-based buffer, something similar to:

```cpp
class SafeErrorReport {
    std::string error_buffer;
    std::size_t count = 0;

    struct sink {
        std::string& buffer;
        std::size_t& count;

        template <typename Input, typename Reader, typename Tag>
        void operator()(const lexy::error_context<Input>& context,
                       const lexy::error<Reader, Tag>& error) {
            buffer += "error: while parsing ";
            buffer += context.production();
            buffer += "\n";
            count++;
        }

        std::size_t finish() && {
            return count;
        }
    };

public:
    auto sink() { return sink{error_buffer, count}; }
    const std::string& get_errors() const { return error_buffer; }
};
```

This is based on my understanding of [lexy's error_report.hpp](https://github.com/foonathan/lexy/blob/34d2adf74a2b25b6bdd760a3bbb931f3fd5e60cd/include/lexy_ext/report_error.hpp#L318). That would turn `ValidateScript` into something like this:

```cpp
Result ValidateScript(const std::string& script)
{
    auto input = lexy::string_input<lexy::utf8_encoding>(script);
    SafeErrorReport error_report;  // Replace char buffer with our safe handler
    
    auto result = lexy::parse<BT::Grammar::stmt>(input, error_report);
    
    if(!result.has_value() || result.error_count() != 0)
    {
        return nonstd::make_unexpected(error_report.get_errors());
    }

    try
    {
        std::vector<BT::Ast::ExprBase::Ptr> exprs = LEXY_MOV(result).value();
        if(exprs.empty())
        {
            return nonstd::make_unexpected("Empty Script");
        }
        // valid script
        return {};
    }
    catch(std::runtime_error& err)
    {
        return nonstd::make_unexpected(err.what());
    }
}
```

NOTE: Not 100% sure on this fix, just loosely based on my quick understand of lexy's code..

## Impact

The vulnerability could allow an attacker to cause out-of-bounds memory reads by providing specially crafted invalid scripts. While this doesn't lead to direct code execution, it could potentially:

- Lead to information disclosure if sensitive data is stored adjacent to the buffer
- Cause program crashes (DoS)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Out-of-bounds Read in BehaviorTree.CPP Script Validation #923

Bug Class

Root Cause

Stack trace

Reproducer

Proposed fix

Impact

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Out-of-bounds Read in BehaviorTree.CPP Script Validation #923

Description

Bug Class

Root Cause

Stack trace

Reproducer

Proposed fix

Impact

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions