Limit normalize edge nodes #4687

Closed
wants to merge 2 commits into from
145 changes: 72 additions & 73 deletions packages/utils/src/object.ts
@@ -4,7 +4,7 @@ import { ExtendedError, WrappedFunction } from '@sentry/types';

import { htmlTreeAsString } from './browser';
import { isElement, isError, isEvent, isInstanceOf, isPlainObject, isPrimitive, isSyntheticEvent } from './is';
import { memoBuilder, MemoFunc } from './memo';
import { memoBuilder } from './memo';
import { getFunctionName } from './stacktrace';
import { truncate } from './string';

@@ -99,7 +99,7 @@ export function urlEncode(object: { [key: string]: any }): string {
*
* @param value Initial source that we have to transform in order for it to be usable by the serializer
*/
function getWalkSource(value: any): {
function getWalkSource(value: unknown): {
[key: string]: any;
} {
if (isError(value)) {
@@ -244,7 +244,7 @@ function serializeValue(value: any): any {
*
* Handles globals, functions, `undefined`, `NaN`, and other non-serializable values.
*/
function makeSerializable<T>(value: T, key?: any): T | string {
function makeSerializable<T>(value: T, key?: unknown): T | string {
if (key === 'domain' && value && typeof value === 'object' && (value as unknown as { _events: any })._events) {
return '[Domain]';
}
@@ -253,20 +253,20 @@ function makeSerializable<T>(value: T, key?: any): T | string {
return '[DomainEmitter]';
}

if (typeof (global as any) !== 'undefined' && (value as unknown) === global) {
if (typeof global !== 'undefined' && (value as unknown) === global) {
return '[Global]';
}

// It's safe to use `window` and `document` here in this manner, as we are asserting using `typeof` first
// which won't throw if they are not present.

// eslint-disable-next-line no-restricted-globals
if (typeof (window as any) !== 'undefined' && (value as unknown) === window) {
if (typeof window !== 'undefined' && (value as unknown) === window) {
return '[Window]';
}

// eslint-disable-next-line no-restricted-globals
if (typeof (document as any) !== 'undefined' && (value as unknown) === document) {
if (typeof document !== 'undefined' && (value as unknown) === document) {
return '[Document]';
}

@@ -301,84 +301,83 @@ function makeSerializable<T>(value: T, key?: any): T | string {
}

/**
* Walks an object to perform a normalization on it
* normalize()
*
* @param key of object that's walked in current iteration
* @param value object to be walked
* @param depth Optional number indicating how deep should walking be performed
* @param memo Optional Memo class handling decycling
* - Creates a copy to prevent original input mutation
* - Skip non-enumerables
* - Calls `toJSON` if implemented
* - Removes circular references
* - Translates non-serializable values (undefined/NaN/Functions) to serializable format
* - Translates known global objects/Classes to string representations
* - Takes care of Error objects serialization
* - Optionally limit depth of final output
*/
// eslint-disable-next-line @typescript-eslint/explicit-module-boundary-types
export function walk(key: string, value: any, depth: number = +Infinity, memo: MemoFunc = memoBuilder()): any {
const [memoize, unmemoize] = memo;
export function normalize(input: unknown, maxDepth: number = +Infinity, maxEdges: number = 10_000): any {
Member

wait 10_000 works :0

Does it get transpiled properly?

Member

We'll probably have to make maxEdges configurable like maxDepth

Collaborator Author

Yep, all gets transpiled by tsc.

Make maxEdges configurable, or remove the default value?

Member

@AbhiPrasad, Mar 7, 2022

we should keep the default, but add an option similar to

normalizeDepth?: number;

Also I wonder if this change will be disruptive to existing customers in any way. Any reason to choose 10_000?

Collaborator Author

10k is just a rough guesstimate for what will likely not be noticed by customers but will still limit crazy large objects.

I think the alternative of limiting number of properties/elements per object/array might be less disruptive and output more useful serialised output. The issue I see with this edges approach is that it's hard to pick a maxEdges value and when that number is reached, nothing more is outputted. This includes skipping properties further up the object tree.

Member

"The issue I see with this edges approach is that it's hard to pick a maxEdges value and when that number is reached, nothing more is outputted"

Yeah good observation. I think limiting number of properties/elements per object/array is also much more intuitive to understand - and adjusting it will be much more obvious.
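
For illustration, the per-object/array limit discussed above could look roughly like the sketch below. It is not code from this PR: `normalizeWithPropertyLimit`, `maxProperties`, and the `[MaxProperties ~]` marker are hypothetical names, and the sketch leaves out error objects, `toJSON`, and circular-reference handling to keep the idea visible.

```typescript
// Hypothetical sketch: cap the number of entries copied per container,
// instead of keeping one global edge counter for the whole walk.
function normalizeWithPropertyLimit(
  input: unknown,
  maxDepth: number = Infinity,
  maxProperties: number = 1000, // assumed default, chosen only for illustration
): unknown {
  function walk(value: unknown, depth: number): unknown {
    // Primitives (and functions, in this simplified sketch) pass through.
    if (value === null || typeof value !== 'object') {
      return value;
    }
    // At the depth limit, replace containers with a short placeholder.
    if (depth === 0) {
      return Array.isArray(value) ? '[Array]' : '[Object]';
    }

    const source = value as { [key: string]: unknown };
    const acc: { [key: string]: any } = Array.isArray(value) ? [] : {};
    let count = 0; // reset for every object/array

    for (const key of Object.keys(source)) {
      if (count >= maxProperties) {
        acc[key] = '[MaxProperties ~]';
        break;
      }
      acc[key] = walk(source[key], depth - 1);
      count += 1;
    }
    return acc;
  }

  return walk(input, maxDepth);
}

// A large array is truncated, but the sibling key after it is still emitted.
const limited = normalizeWithPropertyLimit({ big: [1, 2, 3, 4, 5], important: 'x' }, 10, 3);
console.log(JSON.stringify(limited));
// {"big":[1,2,3,"[MaxProperties ~]"],"important":"x"}
```

Because the counter is local to each container, one oversized array cannot use up the budget for properties higher in the tree, which is the drawback raised for the global `maxEdges` counter.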

const [memoize, unmemoize] = memoBuilder();
let edges = 0;

function walk(key: string, value: unknown & { toJSON?: () => string }, depth: number): any {
// If we reach the maximum depth, serialize whatever is left
if (depth === 0) {
edges += 1;
return serializeValue(value);
}

// If we reach the maximum depth, serialize whatever is left
if (depth === 0) {
return serializeValue(value);
}
// If value implements `toJSON` method, call it and return early
if (typeof value?.toJSON === 'function') {
Member

unfortunately optional chaining is less bundle size efficient.

Suggested change
if (typeof value?.toJSON === 'function') {
if (value != null && typeof value.toJSON === 'function') {

Member

Could we just early return if value is undefined/null?
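
As a quick sanity check on the suggestion above, the two forms of the `toJSON` test should agree for every input; the difference is only in the emitted code when optional chaining is downleveled for older targets. The helper and type names below are made up for this sketch.

```typescript
// Hypothetical helpers comparing the two checks discussed in this thread.
type MaybeJsonable = { toJSON?: () => unknown } | null | undefined;

// Form used in the PR: optional chaining.
function hasToJsonOptionalChaining(value: MaybeJsonable): boolean {
  return typeof value?.toJSON === 'function';
}

// Form suggested in review: explicit null/undefined guard.
function hasToJsonExplicit(value: MaybeJsonable): boolean {
  return value != null && typeof value.toJSON === 'function';
}

const samples: MaybeJsonable[] = [null, undefined, {}, { toJSON: () => 'x' }, new Date(0)];
const agree = samples.every(v => hasToJsonOptionalChaining(v) === hasToJsonExplicit(v));
console.log(agree); // true
```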

edges += 1;
return value.toJSON();
}

/* eslint-disable @typescript-eslint/no-unsafe-member-access */
// If value implements `toJSON` method, call it and return early
if (value !== null && value !== undefined && typeof value.toJSON === 'function') {
return value.toJSON();
}
/* eslint-enable @typescript-eslint/no-unsafe-member-access */
// `makeSerializable` provides a string representation of certain non-serializable values. For all others, it's a
// pass-through. If what comes back is a primitive (either because it's been stringified or because it was primitive
// all along), we're done.
const serializable = makeSerializable(value, key);
if (isPrimitive(serializable)) {
edges += 1;
return serializable;
}

// `makeSerializable` provides a string representation of certain non-serializable values. For all others, it's a
// pass-through. If what comes back is a primitive (either because it's been stringified or because it was primitive
// all along), we're done.
const serializable = makeSerializable(value, key);
if (isPrimitive(serializable)) {
return serializable;
}
// Create source that we will use for the next iteration. It will either be an objectified error object (`Error` type
// with extracted key:value pairs) or the input itself.
const source = getWalkSource(value);

// Create source that we will use for the next iteration. It will either be an objectified error object (`Error` type
// with extracted key:value pairs) or the input itself.
const source = getWalkSource(value);
// Create an accumulator that will act as a parent for all future itterations of that branch
Member

Suggested change
// Create an accumulator that will act as a parent for all future itterations of that branch
// Create an accumulator that will act as a parent for all future iterations of that branch

const acc: { [key: string]: any } = Array.isArray(value) ? [] : {};

// Create an accumulator that will act as a parent for all future itterations of that branch
const acc: { [key: string]: any } = Array.isArray(value) ? [] : {};
// If we already walked that branch, bail out, as it's circular reference
if (memoize(value)) {
edges += 1;
return '[Circular ~]';
}

// If we already walked that branch, bail out, as it's circular reference
if (memoize(value)) {
return '[Circular ~]';
}
// Walk all keys of the source
for (const innerKey in source) {
// Avoid iterating over fields in the prototype if they've somehow been exposed to enumeration.
if (!Object.prototype.hasOwnProperty.call(source, innerKey)) {
continue;
}

// Walk all keys of the source
for (const innerKey in source) {
// Avoid iterating over fields in the prototype if they've somehow been exposed to enumeration.
if (!Object.prototype.hasOwnProperty.call(source, innerKey)) {
continue;
if (edges >= maxEdges) {
acc[innerKey] = '[Max Edges Reached...]';
Contributor

Maybe let's call this [MaxEdges ~] to make it consistent with [Circular ~]?

break;
}

// Recursively walk through all the child nodes
const innerValue: any = source[innerKey];
acc[innerKey] = walk(innerKey, innerValue, depth - 1);
}
// Recursively walk through all the child nodes
const innerValue: any = source[innerKey];
acc[innerKey] = walk(innerKey, innerValue, depth - 1, memo);
}

// Once walked through all the branches, remove the parent from memo storage
unmemoize(value);
// Once walked through all the branches, remove the parent from memo storage
unmemoize(value);

// Return accumulated values
return acc;
}
// Return accumulated values
return acc;
}

/**
* normalize()
*
* - Creates a copy to prevent original input mutation
* - Skip non-enumerables
* - Calls `toJSON` if implemented
* - Removes circular references
* - Translates non-serializable values (undefined/NaN/Functions) to serializable format
* - Translates known global objects/Classes to string representations
* - Takes care of Error objects serialization
* - Optionally limit depth of final output
*/
// eslint-disable-next-line @typescript-eslint/explicit-module-boundary-types
export function normalize(input: any, depth?: number): any {
try {
// since we're at the outermost level, there is no key
return walk('', input, depth);
return walk('', input as unknown & { toJSON?: () => string }, maxDepth);
} catch (_oO) {
return '**non-serializable**';
}
@@ -390,7 +389,7 @@ export function normalize(input: any, depth?: number): any {
* eg. `Non-error exception captured with keys: foo, bar, baz`
*/
// eslint-disable-next-line @typescript-eslint/explicit-module-boundary-types
export function extractExceptionKeysForMessage(exception: any, maxLength: number = 40): string {
export function extractExceptionKeysForMessage(exception: unknown, maxLength: number = 40): string {
Member

Unrelated change, let's move into another PR?

const keys = Object.keys(getWalkSource(exception));
keys.sort();

@@ -422,14 +421,14 @@ export function extractExceptionKeysForMessage(exception: any, maxLength: number
*/
export function dropUndefinedKeys<T>(val: T): T {
if (isPlainObject(val)) {
const obj = val as { [key: string]: any };
const rv: { [key: string]: any } = {};
const obj = val as { [key: string]: unknown };
const rv: { [key: string]: unknown } = {};
for (const key of Object.keys(obj)) {
if (typeof obj[key] !== 'undefined') {
rv[key] = dropUndefinedKeys(obj[key]);
}
}
return rv as T;
return rv as unknown as T;
}

if (Array.isArray(val)) {
66 changes: 66 additions & 0 deletions packages/utils/test/object.test.ts
@@ -533,6 +533,72 @@ describe('normalize()', () => {
});
});

describe('can limit number of edge nodes', () => {
test('array', () => {
const obj = {
foo: new Array(100).fill(1, 0, 100),
};

expect(normalize(obj, 10, 10)).toEqual({
foo: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, '[Max Edges Reached...]'],
});
});

test('object', () => {
const obj = {
foo1: 1,
foo2: 1,
foo3: 1,
foo4: 1,
foo5: 1,
foo6: 1,
foo7: 1,
foo8: 1,
foo9: 1,
foo10: 1,
foo11: 1,
foo12: 1,
};

expect(normalize(obj, 10, 10)).toEqual({
foo1: 1,
foo2: 1,
foo3: 1,
foo4: 1,
foo5: 1,
foo6: 1,
foo7: 1,
foo8: 1,
foo9: 1,
foo10: 1,
foo11: '[Max Edges Reached...]',
});
});

test('objects and arrays', () => {
const obj = {
foo1: 1,
foo2: 1,
foo3: 1,
foo4: 1,
foo5: 1,
foo6: [1, 1, 1, 1, 1, 1],
foo7: [1, 1, 1, 1, 1, 1],
foo8: [1, 1, 1, 1, 1, 1],
};

expect(normalize(obj, 10, 10)).toEqual({
foo1: 1,
foo2: 1,
foo3: 1,
foo4: 1,
foo5: 1,
foo6: [1, 1, 1, 1, 1, '[Max Edges Reached...]'],
foo7: '[Max Edges Reached...]',
});
});
});
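
To make the expected values in these tests easier to follow, here is a minimal standalone sketch of the edge-counting walk. It is a simplified stand-in, not the PR's `normalize`: `normalizeEdgeLimited` is a hypothetical name, and the sketch skips `toJSON`, error objects, circular references, and the `makeSerializable` step. Only the counting logic mirrors the diff: every emitted leaf increments a shared counter, and the check happens before each child is walked.

```typescript
// Hypothetical, simplified model of the edge-limited walk from this PR.
const MAX_EDGES_MARKER = '[Max Edges Reached...]';

function normalizeEdgeLimited(input: unknown, maxDepth: number = Infinity, maxEdges: number = 10_000): unknown {
  let edges = 0; // shared across the whole walk, not per container

  function walk(value: unknown, depth: number): unknown {
    // Leaves count as one edge node each (functions are treated as leaves here).
    if (value === null || typeof value !== 'object') {
      edges += 1;
      return value;
    }
    // Containers at the depth limit are collapsed and also count as one edge node.
    if (depth === 0) {
      edges += 1;
      return Array.isArray(value) ? '[Array]' : '[Object]';
    }

    const source = value as { [key: string]: unknown };
    const acc: { [key: string]: any } = Array.isArray(value) ? [] : {};

    for (const key of Object.keys(source)) {
      // Once the budget is spent, mark the next key and stop walking this container.
      if (edges >= maxEdges) {
        acc[key] = MAX_EDGES_MARKER;
        break;
      }
      acc[key] = walk(source[key], depth - 1);
    }
    return acc;
  }

  return walk(input, maxDepth);
}

// Same input as the 'objects and arrays' test: foo1..foo5 use 5 edges,
// foo6 uses the remaining 5, so foo7 is replaced and foo8 is never reached.
const mixed = normalizeEdgeLimited(
  {
    foo1: 1, foo2: 1, foo3: 1, foo4: 1, foo5: 1,
    foo6: [1, 1, 1, 1, 1, 1],
    foo7: [1, 1, 1, 1, 1, 1],
    foo8: [1, 1, 1, 1, 1, 1],
  },
  10,
  10,
);
console.log(JSON.stringify(mixed));
// {"foo1":1,"foo2":1,"foo3":1,"foo4":1,"foo5":1,"foo6":[1,1,1,1,1,"[Max Edges Reached...]"],"foo7":"[Max Edges Reached...]"}
```

The same trace shows the concern raised in review: once the shared counter hits `maxEdges` inside `foo6`, every parent loop also stops on its next key, so later top-level properties are dropped regardless of how small they are.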

test('normalizes value on every iteration of decycle and takes care of things like Reacts SyntheticEvents', () => {
const obj = {
foo: {