Skip to content

Refactor benchmark example to reduce stack usage #116

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 19, 2018

Conversation

RonEld
Copy link
Contributor

@RonEld RonEld commented Sep 19, 2017

  1. Seperate the tests to different hepler functions, according to
    their module
  2. Enable ECDSA tests

Tested on K64F and NRF52840_DK

Copy link
Contributor

@k-stachowiak k-stachowiak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked what I could at my knowledge level:

  • missed options that have no implementation
  • resources not being released
    I found no issue of that sort.
    I have one usability related suggestion.

else
#if defined(MBEDTLS_RSA_C) && \
defined(MBEDTLS_PEM_PARSE_C) && defined(MBEDTLS_PK_PARSE_C)
if( todo->rsa )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This program may benefit from printing warnings if the user requests a given test via CLI but it is compiled out. Currently it skips it silently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@k-stachowiak Perhaps, but this PR is only to refactor the code into different test functions, to reduce stack size

@AndrzejKurek
Copy link
Contributor

Changes look fine to me, but just as a note for future pull requests - it would be nice to split it into two commits - first, moving the code around without changes, and second - introducing these changes.
Doing it as it is done here makes it very hard to review the changes, as github reports only random chunks removed and added at various lines, while they are only moved.

@Patater
Copy link
Contributor

Patater commented May 9, 2018

This PR needs rebasing, as benchmark has undergone some changes.

Rewrite after a rebase with many conflicts
@RonEld RonEld force-pushed the benchmark_refactor branch from 76b3bc7 to a198d9b Compare May 13, 2018 14:55
@RonEld
Copy link
Contributor Author

RonEld commented May 13, 2018

There were too many conflicts, so I have rewritten the PR.
Note that it is only moving the tests to different functions, so no other changes \ reordering

Ron Eldor added 2 commits May 13, 2018 18:12
1. update copyright year.
2. Enable ecdsa.
3. Change `hlen` parameter in the ecdsa test to byte size, as this is
what the API expects.
4. Remove unused variables.
Add the platform context to the test functions, as these are the functions
handling the cryptographic modules.
@RonEld
Copy link
Contributor Author

RonEld commented May 13, 2018

@AndrzejKurek I have rewritten the PR, as there were too many conflicts in rebase. Please re-review.
I have also added support for the platform context.

@k-stachowiak could you please specify where there are resources not being freed? If you found some memory leaks, I think it would be a good idea to open a separate issue, as I didn't really modify code, except the ecdsa code that didn't work on my HW accelerator.

Have the test functions return error code, to avoid resource leak
of the platform context, and for cleaner exit.
@@ -145,7 +145,7 @@
/*
* Uncomment this line to enable ECDSA benchmark.
*/
//#define ENABLE_ECDSA
#define ENABLE_ECDSA
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be enabled? It is more than just reducing stack size, its a functional change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason it was disabled was because ECDSA caused stack overflow.
Now, it can be returned.
I think we can remove the whole ENABLE_ECDSA part, once this will be tested on numerous targets

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change breaks the example for the ARM Compiler on K64F! We should have tested that before merging. It's especially obvious to test because it's what the CI tests!

cc: @mazimkhan

Copy link
Contributor Author

@RonEld RonEld Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested it on K64F and on NRF52840.
If CI is still broken, we can return this change.

@@ -819,7 +777,7 @@ static int benchmark( int argc, char *argv[], mbedtls_platform_context* ctx )
mbedtls_snprintf( title, sizeof( title ), "ECDSA-%s",
curve_info->name );
TIME_PUBLIC( title, "sign",
ret = mbedtls_ecdsa_write_signature( &ecdsa, MBEDTLS_MD_SHA256, buf, curve_info->bit_size,
ret = mbedtls_ecdsa_write_signature( &ecdsa, MBEDTLS_MD_SHA256, buf, ( curve_info->bit_size + 7 ) / 8,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the previous buffer size simply inaccurate, or we don't have so much space? If this amount only plays a role in checking whether the buffer is big enough, why don't we pass the real buffer size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter is hlen not buffer length. So, sizeof(buf) is inaccurate.
The previous buffer size was sent as bits, not bytes, but the API expects length in bytes.
SInce this is a benchmark application, it didn't verify with expected result, and the SW implementation succeeded. I noticed this issue, when a HW accelerator expects only specific standard lengths, so it returned an error

Copy link
Contributor

@AndrzejKurek AndrzejKurek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the malloc_buf allocation that's not freed, and probably should be raised as a bug, this PR is ok.

@RonEld
Copy link
Contributor Author

RonEld commented May 14, 2018

@AndrzejKurek Thanks for the review. I did consider adding a free to the malloc_buf allocation, but since it's not defined on Mbed OS(at least no the targets I checked), and it's a functional change, I didn't add it

Copy link

@andresag01 andresag01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is mostly a straight forward split of the previous long benchmark function to reduce stack usage. And I am happy with these changes. However, I pointed out several other (mostly preexisting issues) that would be good to fix, but I do not think they are a blocker because they do were already there.

Also pointed out a simple style issue, but this is not a blocker either...

int i;
todo_list todo;
#if defined(MBEDTLS_MEMORY_BUFFER_ALLOC_C)
unsigned char malloc_buf[HEAP_SIZE] = { 0 };

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this is going to work. This line will allocate a 64KB array on the stack that will then be managed by Mbed TLS. However, the default stack size in Mbed OS is 4KB and it almost certainly cause problems. I suggest removing the use of MBEDTLS_MEMORY_BUFFER_ALLOC_C from this example. The embedded platforms that this example is targeting cannot cope with these huge allocations. If you still want to make it work, you can probably look up the appropriate symbols in the linker files and pass it real pointers to the heap. Even then, this might cause trouble if other parts of the system use the real malloc/calloc...

for( i = 1; i < argc; i++ )
{
if( strcmp( argv[i], "md4" ) == 0 )
todo.md4 = 1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this todo struct is a trace of the origin of this example, that is, the benchmark application in mbedtls/programs/benchmark.c. That application was meant for command line usage, so it made sense to provide an interface to enable/disable parts of the benchmark. However, this app is meant for an embedded platform and (as far as I remember) these values have never been changed (proof of that is that we used ENABLE_ECDSA instead of flicking the appropriate todo bit). I think this feature is entirely redundant and causes clutter. I suggest removing the todo struct and all its associated code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's redundant and not needed. Opened #170 to track it,
Side note: ENABLE_ECDSA was used to reduce stack size. Flipping the todo bit would keep the code compiled and keep the stack usage.

#endif
memset( buf, 0xAA, sizeof( buf ) );

if( test_md( &todo, ctx ) != 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Please add before ).

return ( 0 );
}

static int test_rng( const todo_list * todo, mbedtls_platform_context* ctx )

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of this function returning an error code. However, as far as I can tell, this is currently not properly set in all cases (same situation with all the other test_*()). It would be great to properly set the value, specially if there are more platforms with hardware acceleration support.

Copy link
Contributor Author

@RonEld RonEld May 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean. Do you mean to set a ret variable and set its value according to the crypto functions? I agree it should be done, but I think it should be part of a different PR, as this PR is only refactoring the function calls to separate functions, to reduce stack size

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, thats what I mean.
Agreed, preexisting and different issue

memset( tmp, 0xBB, sizeof( tmp ) );

#if defined(MBEDTLS_MD4_C)
if( todo.md4 )
if( todo->md4 )
TIME_AND_TSC( "MD4", mbedtls_md4( buf, BUFSIZE, tmp ) );

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think the return code in any of these is being checked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, but this is pre-existing behavior. Opened #172 to track this


if( mbedtls_ctr_drbg_seed( &ctr_drbg, myrand, NULL, NULL, 0 ) != 0 )
return(1);
mbedtls_ctr_drbg_set_prediction_resistance( &ctr_drbg, MBEDTLS_CTR_DRBG_PR_ON );
TIME_AND_TSC( "CTR_DRBG (PR)",
if( mbedtls_ctr_drbg_random( &ctr_drbg, buf, BUFSIZE ) != 0 )
return(1) );
return(1) );

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the return value in all these is lost...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? this return is part of the CODE being set in the TIME_AND_TSC macro

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is returning 1 in all cases, so if it fails, the user would not know the cause. That is what I mean by "lost"

@RonEld
Copy link
Contributor Author

RonEld commented May 23, 2018

@andresag01 Thank you for your comments.
Note this PR is only refactoring(and enabling ECDSA). No other preexisting issue was addressed and I opened issues to track those

@andresag01
Copy link

@RonEld: I agree. As I mentioned in my review comments, I do not think these are blockers because they are preexisting issues.

@k-stachowiak
Copy link
Contributor

k-stachowiak commented May 24, 2018

could you please specify where there are resources not being freed? If you found some memory leaks, I think it would be a good idea to open a separate issue, as I didn't really modify code, except the ecdsa code that didn't work on my HW accelerator.

My comment only stated, what I was checking for, including unreleased resources. Let me confirm, that I have found no issues of that kind.

@RonEld
Copy link
Contributor Author

RonEld commented May 24, 2018

@k-stachowiak Thanks for confirming.
@andresag01 I believe I addressed all you issues

Copy link

@andresag01 andresag01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RonEld: As discussed, most of my comments are related to preexisting issues and are best addressed in another PR. There is just one or two style issues, but that is not a blocker, so I am approving this PR.

@RonEld RonEld changed the title Refactor benchmark example to reduce stack size Refactor benchmark example to reduce stack usage May 31, 2018
@k-stachowiak k-stachowiak added runCI and removed runCI labels Jul 19, 2018
@k-stachowiak k-stachowiak merged commit d576d43 into ARMmbed:master Jul 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants