Commit 7a8d15e

[lldb] Support custom LLVM formatting for variables (#81196)
Adds support for applying LLVM formatting to variables.

The reason for this is to support cases such as the following.

Let's say you have two separate bytes that you want to print as a combined hex value. Consider the following summary string:

```
${var.byte1%x}${var.byte2%x}
```

The output of this will be: `0x120x34`. That is, a `0x` prefix is unconditionally applied to each byte. This is unlike printf formatting, where you must include the `0x` yourself.

Currently, there's no way to do this with summary strings; instead, you need a summary provider in Python or C++.

This change introduces formatting support using LLVM's formatter system. This allows users to achieve the desired custom formatting using:

```
${var.byte1:x-}${var.byte2:x-}
```

Here, each variable is suffixed with `:x-`. This is passed to the LLVM formatter as `{0:x-}`. For integer values, `x` declares the output as hex, and `-` declares that no `0x` prefix is to be used. Further, one could write:

```
${var.byte1:x-2}${var.byte2:x-2}
```

Here, the added `2` results in these bytes being written with a minimum of 2 digits.

An alternative considered was to add a new format specifier that would print hex values without the `0x` prefix. That approach was not taken because, in addition to forcing a `0x` prefix, lldb's hex format also forces leading zeros. The LLVM formatter approach gives the user full control over formatting.
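To make the difference concrete, here is a rough Python analogue of the three summary strings above. It is only a sketch: Python's formatting stands in for both lldb's `%x` and LLVM's `:x-`, the helper names `with_prefix` and `bare_hex` are hypothetical, and only non-negative single-byte values are considered.

```python
# Python analogue only: these helpers imitate the described output,
# they do not call lldb or LLVM.
byte1, byte2 = 0x12, 0x34


def with_prefix(value: int) -> str:
    """Imitate `%x` on a one-byte value: always a 0x prefix, zero padded."""
    return "0x" + format(value, "x").zfill(2)


def bare_hex(value: int, min_digits: int = 0) -> str:
    """Imitate `:x-` (no prefix) and `:x-N` (at least N digits)."""
    return format(value, "x").zfill(min_digits)


print(with_prefix(byte1) + with_prefix(byte2))  # like ${var.byte1%x}${var.byte2%x}
print(bare_hex(byte1) + bare_hex(byte2))        # like ${var.byte1:x-}${var.byte2:x-}
print(bare_hex(0x5, 2) + bare_hex(byte2, 2))    # like the :x-2 form
```

The third call shows why the minimum digit count matters when composing bytes: without it, a value such as `0x5` would contribute a single digit and shift the combined result.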
1 parent 40083cf commit 7a8d15e

5 files changed: +104 -10 lines changed

lldb/docs/use/variable.rst

Lines changed: 9 additions & 0 deletions
@@ -460,6 +460,15 @@ summary strings, regardless of the format they have applied to their types. To
 do that, you can use %format inside an expression path, as in ${var.x->x%u},
 which would display the value of x as an unsigned integer.

+Additionally, custom output can be achieved by using an LLVM format string,
+commencing with the ``:`` marker. To illustrate, compare ``${var.byte%x}`` and
+``${var.byte:x-}``. The former uses lldb's builtin hex formatting (``x``),
+which unconditionally inserts a ``0x`` prefix, and also zero pads the value to
+match the size of the type. The latter uses ``llvm::formatv`` formatting
+(``:x-``), and will print only the hex value, with no ``0x`` prefix, and no
+padding. This raw control is useful when composing multiple pieces into a
+larger whole.
+
 You can also use some other special format markers, not available for formats
 themselves, but which carry a special meaning when used in this context:

lldb/source/Core/FormatEntity.cpp

Lines changed: 60 additions & 10 deletions
@@ -57,6 +57,7 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/Regex.h"
 #include "llvm/TargetParser/Triple.h"

 #include <cctype>
@@ -658,6 +659,38 @@ static char ConvertValueObjectStyleToChar(
   return '\0';
 }

+static llvm::Regex LLVMFormatPattern{"x[-+]?\\d*|n|d", llvm::Regex::IgnoreCase};
+
+static bool DumpValueWithLLVMFormat(Stream &s, llvm::StringRef options,
+                                    ValueObject &valobj) {
+  std::string formatted;
+  std::string llvm_format = ("{0:" + options + "}").str();
+
+  // Options supported by format_provider<T> for integral arithmetic types.
+  // See table in FormatProviders.h.
+
+  auto type_info = valobj.GetTypeInfo();
+  if (type_info & eTypeIsInteger && LLVMFormatPattern.match(options)) {
+    if (type_info & eTypeIsSigned) {
+      bool success = false;
+      int64_t integer = valobj.GetValueAsSigned(0, &success);
+      if (success)
+        formatted = llvm::formatv(llvm_format.data(), integer);
+    } else {
+      bool success = false;
+      uint64_t integer = valobj.GetValueAsUnsigned(0, &success);
+      if (success)
+        formatted = llvm::formatv(llvm_format.data(), integer);
+    }
+  }
+
+  if (formatted.empty())
+    return false;
+
+  s.Write(formatted.data(), formatted.size());
+  return true;
+}
+
 static bool DumpValue(Stream &s, const SymbolContext *sc,
                       const ExecutionContext *exe_ctx,
                       const FormatEntity::Entry &entry, ValueObject *valobj) {
@@ -728,9 +761,12 @@ static bool DumpValue(Stream &s, const SymbolContext *sc,
     return RunScriptFormatKeyword(s, sc, exe_ctx, valobj, entry.string.c_str());
   }

-  llvm::StringRef subpath(entry.string);
+  auto split = llvm::StringRef(entry.string).split(':');
+  auto subpath = split.first;
+  auto llvm_format = split.second;
+
   // simplest case ${var}, just print valobj's value
-  if (entry.string.empty()) {
+  if (subpath.empty()) {
     if (entry.printf_format.empty() && entry.fmt == eFormatDefault &&
         entry.number == ValueObject::eValueObjectRepresentationStyleValue)
       was_plain_var = true;
@@ -739,22 +775,19 @@ static bool DumpValue(Stream &s, const SymbolContext *sc,
     target = valobj;
   } else // this is ${var.something} or multiple .something nested
   {
-    if (entry.string[0] == '[')
+    if (subpath[0] == '[')
       was_var_indexed = true;
     ScanBracketedRange(subpath, close_bracket_index,
                        var_name_final_if_array_range, index_lower,
                        index_higher);

     Status error;

-    const std::string &expr_path = entry.string;
-
-    LLDB_LOGF(log, "[Debugger::FormatPrompt] symbol to expand: %s",
-              expr_path.c_str());
+    LLDB_LOG(log, "[Debugger::FormatPrompt] symbol to expand: {0}", subpath);

     target =
         valobj
-            ->GetValueForExpressionPath(expr_path.c_str(), &reason_to_stop,
+            ->GetValueForExpressionPath(subpath, &reason_to_stop,
                                         &final_value_type, options, &what_next)
             .get();

@@ -883,8 +916,18 @@ static bool DumpValue(Stream &s, const SymbolContext *sc,
   }

   if (!is_array_range) {
-    LLDB_LOGF(log,
-              "[Debugger::FormatPrompt] dumping ordinary printable output");
+    if (!llvm_format.empty()) {
+      if (DumpValueWithLLVMFormat(s, llvm_format, *target)) {
+        LLDB_LOGF(log, "dumping using llvm format");
+        return true;
+      } else {
+        LLDB_LOG(
+            log,
+            "empty output using llvm format '{0}' - with type info flags {1}",
+            entry.printf_format, target->GetTypeInfo());
+      }
+    }
+    LLDB_LOGF(log, "dumping ordinary printable output");
     return target->DumpPrintableRepresentation(s, val_obj_display,
                                                custom_format);
   } else {
@@ -2227,6 +2270,13 @@ static Status ParseInternal(llvm::StringRef &format, Entry &parent_entry,
       if (error.Fail())
         return error;

+      auto [_, llvm_format] = llvm::StringRef(entry.string).split(':');
+      if (!LLVMFormatPattern.match(llvm_format)) {
+        error.SetErrorStringWithFormat("invalid llvm format: '%s'",
+                                       llvm_format.data());
+        return error;
+      }
+
       if (verify_is_thread_id) {
         if (entry.type != Entry::Type::ThreadID &&
             entry.type != Entry::Type::ThreadProtocolID) {
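
Both `DumpValue` and `ParseInternal` separate the expression path from the LLVM format by splitting `entry.string` at the first `:`. A minimal Python sketch of that step follows; `str.partition` behaves like `llvm::StringRef::split` here in that it cuts at the first separator and yields an empty second half when none is present. The function name `split_entry` and the sample strings are hypothetical, chosen only for illustration.

```python
def split_entry(entry_string: str):
    """Return (subpath, llvm_format), cutting at the first ':' only."""
    subpath, _separator, llvm_format = entry_string.partition(":")
    return subpath, llvm_format


print(split_entry(".byte1:x-2"))  # subpath and format both present
print(split_entry(".byte1"))      # no ':' means an empty format
print(split_entry(":x-"))         # a bare variable with a format
```

An empty second half is what `DumpValue` relies on to fall through to the ordinary printable representation.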
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+C_SOURCES := main.c
+include Makefile.rules
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+import lldb
+from lldbsuite.test.lldbtest import *
+import lldbsuite.test.lldbutil as lldbutil
+
+
+class TestCase(TestBase):
+    def test_raw_bytes(self):
+        self.build()
+        lldbutil.run_to_source_breakpoint(self, "break here", lldb.SBFileSpec("main.c"))
+        self.runCmd("type summary add -s '${var.ubyte:x-2}${var.sbyte:x-2}!' Bytes")
+        self.expect("v bytes", substrs=[" = 3001!"])
+
+    def test_bad_format(self):
+        self.build()
+        lldbutil.run_to_source_breakpoint(self, "break here", lldb.SBFileSpec("main.c"))
+        self.expect(
+            "type summary add -s '${var.ubyte:y}!' Bytes",
+            error=True,
+            substrs=["invalid llvm format"],
+        )
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+#include <stdint.h>
+#include <stdio.h>
+
+struct Bytes {
+  uint8_t ubyte;
+  int8_t sbyte;
+};
+
+int main() {
+  struct Bytes bytes = {0x30, 0x01};
+  (void)bytes;
+  printf("break here\n");
+}
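
For readers tracing why `test_raw_bytes` expects `3001!`, the arithmetic can be checked with a short Python sketch. This is an analogue under assumptions, not the lldb code path: `expected_summary` is a hypothetical helper, and Python's `02x` spec stands in for the `:x-2` option (hex, no prefix, at least two digits).

```python
def expected_summary(ubyte: int, sbyte: int) -> str:
    """Imitate '${var.ubyte:x-2}${var.sbyte:x-2}!' for non-negative bytes."""
    return f"{ubyte:02x}{sbyte:02x}!"


# Field values taken from `struct Bytes bytes = {0x30, 0x01};` in main.c.
print(expected_summary(0x30, 0x01))
```

The second field is where the width matters: `0x01` renders as `01` rather than `1`, which is what makes the combined string four digits long.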
