Skip to content

Talk about merging in the possibility of secure allocator on strings #415

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 46 commits into from
Closed

Talk about merging in the possibility of secure allocator on strings #415

wants to merge 46 commits into from

Conversation

dawesc
Copy link
Contributor

@dawesc dawesc commented Feb 6, 2016

Hi Chris, i know you've closed down the ticket but i wonder if you'd consider merging the attached? I hope that it now reads more cleanly as you asked and uses the suggested Value::String syntax as well. The thing we've not done is the numerics, i just have no idea how to begin making that backward compatible because asUInt64 for example would we return a pointer to a secure allocated UInt64, I think this is one step too far because it would cause too many problems, but what do you think? I'm keen to hear your opinion on the error reporting whether or not to change that to _Value::String instead of std::string as well?

Many thanks for your time on this

Christopher

Christopher Dawes and others added 30 commits February 5, 2016 10:55
All cpp files have been copied to inl files in the include folder
The inl file for value type has been moved into the include folder, later Exception clases must be moved back into value
The cpp files contain a lot of code that has to be moved into the inl
files that will be used for the templates
For the classes that aren't templated the code must remain in the cpp
file, this has been removed so that if you put the inl and cpp files
back together you'd end up with the original
Onto all the classes and methods in the header files the template
definition has been made to define them all as templated
Using typedef's for classes that have been templated makes it easier for
backward compatibility. To do this and retain the naming the classes
that have been made templates the original classes that have been
templated have been moved to the namespace "detail"
Now with the templates in place we need to have the code so that it
compiles, this is going through and adding typename in appropriate
places and the links to the template names throughout the class names to
get it to compile.
Now starting to change over to the new memory allocation, care has been
taken to leave exceptions in the original allocator
Remove malloc/free and instead use unique_ptr with vector and char to
represent the old character array.
 - add missing include for std::unique_ptr
 - fix global operator<< template to be specific to Json::Value
 - add missing include guards to header implementation files
 - include the template implementations in tests
Merge jsoncpp latest changes to forked master
Manually re-patched the conflicts

Conflicts:
	include/json/writer.h
	src/lib_json/json_reader.cpp
	src/lib_json/json_value.cpp
	src/lib_json/json_writer.cpp
Add a unit test for secure allocation
Change the streams to basic streams with the correct allocator
Strange change to have to instantiate string then move it due to
compiler issue
@cdunn2001
Copy link
Contributor

I don't have time right now, but I plan to rebase the entire template branch onto master, sans the merge commits, and maybe with some squashing. A clean history is important for retaining the ability to cherry-pick changes from one branch to another.

@cdunn2001
Copy link
Contributor

Hold on. Maybe we're looking at this wrong. While I still want templates for many reasons, maybe you don't need them at all for the secure allocator. Maybe all you really need is a hidden implementation for Json::Value, something like a pimpl-idiom. That would be a major-version bump because of binary incompatibility, but the API would look no different. The secure allocator could be selected as a feature.

It might make more sense to separate these concerns: templates for the API versus wiping allocator in the implementation.

@dawesc
Copy link
Contributor Author

dawesc commented Feb 9, 2016

Hi Chris, interesting idea hiding it all behind that. Certainly something that would improve the compile times and prevent the issues you've suggested above. I'd still advocate the use of templates underneath to prevent the duplication of code between secure and non-secure due to the number of places std::string and std::iostream(s) are used and therefore i'd be advocating for a template implementation behind (which people could potentially access if they wanted to do something truly crazy) and then sit in front of that classes that proxy across to the template version (pimpl).

The below is a justification for having some templated functions even on a proxy class; we opted to use templates to take a parameter of basic_string in any form for some of the functions:

The slight issue we've been suffering from in our codebase is this; the main reason for secure memory allocation is the passing of either payment card data, passwords or crypto keys across a JSON interface. Our web server passes us a blob of data in a securely allocated block that we now have no idea whether or not it does contain sensitive information in until actually we've unpacked it using a reader; once unpacked we then have enough information (it's jsonrpc so the method call tells us) to know whether or not it continues to need to be treated securely so for we then want to unpack the JSON into C++ objects that contain std::string. Therefore from a SecureValue we want asString() to return std::SecureString but also std::string; we added asTemplateString to the API to save copying the data twice. So this is one good reason for being able to at least have some template functions for returning of data even in a proxy class. So this is why it is nice for Value to return non-secure strings (even when it's secure) and for Writer to write to a non-secure string even though value was secure. The hardest methods were get() and operator[] to need std::string support on because throughout our code we had member names that certainly aren't secure that were then not compatible. Finally constructor of Value, it was so important that we could construct secure values from non-secure strings so that they could form part of a secure tree but come from an insecure source.

With reader you might be reading from an insecure source because again you've no choice over this but you'll do work yourself to overwrite the string before continuing and may want to have a non-secure string come in. So again maybe on reader we want to consider this but then equally you can always pass the .c_str() in, not the end of the earth I'm tempted to say so leave Reader alone to only support it securely when using SecureReader.

Thanks again

Christopher

@cdunn2001
Copy link
Contributor

Thanks for the description of your use-case. Sounds sensible. I'm curious about how you communicate with the decryptor, but I won't press you for details on that.

I'd still advocate the use of templates underneath to prevent the duplication of code between secure and non-secure due to the number of places std::string and std::iostream(s) are used

I'd be fine with using SecureAlloc everywhere always in the hidden implementation. It's fast enough that the overhead would not bother anyone. (It's worth measuring though.)

Note that the Reader/Writer implementations are already hidden. (Not counting the deprecated ones.)

Someday, I would like a read-only Value tree implementation which retains the input char* of the entire JSON structure (or a copy of it), stores char* pointers into it (plus lengths), and wipes the entire thing at the end. That would require a JsonCpp "manager" object, which could be awkwardly confusing or inconvenient for some users, but it would not need std::string at all! And it would be super-fast.

Therefore from a SecureValue we want asString() to return std::SecureString but also std::string; we added asTemplateString to the API to save copying the data twice.

On the API, you should be able to use a char* getter to avoid any memory allocation at all.

@dawesc
Copy link
Contributor Author

dawesc commented Feb 12, 2016

Thanks for the description of your use-case. Sounds sensible. I'm curious about how you communicate with the decryptor, but I won't press you for details on that

No problem, we are using a combination of HSMs with openssl and libgcrypt depending on the use case, so typically we use the Hardware Security Module (HSM) to kick off the cryptography by decrypting keys held on the local system under a KEK on the HSM and then using libgcrypt or openssl as appropraite to then perform the crypto operations (again painfully wiping all memory after use). For the really secure stuff we perform all the cryptography on the HSM so we don't actually see things like PIN codes as purely the translation happens in the HSM (for example).

On the API, you should be able to use a char* getter to avoid any memory allocation at all.

This is true!

@cdunn2001
Copy link
Contributor

Good use of HSM, etc. Interesting.

Therefore from a SecureValue we want asString() to return std::SecureString but also std::string.

I don't think you need that at all. Just use char* in the API. No memory allocations; no worries.

Internally, I disagree that there is any reason to allow more than one kind of std::string/stringstream. We can use your SecureAlloc everywhere. The overhead seems minimal. Secure-wiping can be a feature of the library. And like I said, someday I want to store only char* into managed pages.

As for changing everything to use SecureAlloc internally, you can do that on a branch for 2.0.0, unreleased. When I finish hiding the implementation, you can try to apply your updates there. I don't think the number of changes will be very large -- just switching some types to new typedefs. Your SecureAlloc will be exposed, but used privately.

Do you see what I'm saying? Maybe I should apologize. The other ticket implied that templates were enough, but my thinking was murky. The real problem is the exposed implementation, but that will be addressed eventually.

@dawesc
Copy link
Contributor Author

dawesc commented Feb 13, 2016

Wow ok if you're happy with that, that would be amazing! Shall i leave it to your capable hands then or would you like us to carry on making this particular branch pass the unit tests (sorry we've just not had time in the last few days to get a windows VM up and running with MS VS to work out why the tests don't pass with their compiler. So odd as they do with MinGW!)

@cdunn2001
Copy link
Contributor

Yes, I'd like it to pass the unit-tests. But you'll need to undo most of the changes here. No templates; just typedefs for std::string and stringstream internally, wherever you want to switch to your SecureAlloc. I think we should modify or remove the std::string portions of the API, since we don't want to expose the SecureAlloc versions, but returning standard std::string by value instead of by reference is fine.

Then we'll add a feature-setting to enable the SecureAlloc, so that we can compare runtimes easily. Because old-STL uses global allocators instead of instance-allocators, that setting will have to alter a global (though file-static) variable, but that's fine.

Then squash your commits. You can start a new pull-request if you'd rather.

It might be months before this is officially "released", since we want to incorporate other API changes too, but you can use the new branch until then. If you need a tarball, we can release an "alpha" version.

@dawesc
Copy link
Contributor Author

dawesc commented Feb 14, 2016

Hi Chris, so just to confirm then that the tasks to be done are:

  • Add alloc.h in /include containing SecureAlloc
  • In config.h add feature setting "SECURE_MEMORY" as an option.
  • In value.h add a #ifdef for SECURE_MEMORY which will then typedef for std::string and all streams such that you get Value::String or Value::Stream to use as types (there will be couple more but in principle).
  • when freeing memory allocated through malloc in json_value.cpp #if SECURE_MEMORY then memset to 0 up to first 0.

Definitely will create a new branch for this and just copy across any pertinent bits and then create a new pull request. Many thanks for all your time on this,

Christopher

@cdunn2001
Copy link
Contributor

Almost. I think it should be either a feature or a compile-time setting. I prefer the latter, but I'm fine with the former.

If the former, change internal std:: classes to typedefs, and use a compile-time macro to choose the implementation.

If the former, than you would typedef internal std:: types to use SecureAlloc always. SecureAlloc would check a global, file-static variable to decide whether actually to wipe anything. That would be a non-inline call, so maybe a non-static global would be ok instead, if that is considered secure enough.

@cdunn2001
Copy link
Contributor

when freeing memory allocated through malloc in json_value.cpp #if SECURE_MEMORY then memset to 0 up to first 0.

Hmmm. Technically, UTF can have a zero-byte for codepoint "zero", and JSON can have any UTF character except control characters.

I guess we could just call that an unsupported codepoint for this library.

I think we actually handle that right now, but maybe we could just let that be a caveat of the library.

@dawesc
Copy link
Contributor Author

dawesc commented Feb 15, 2016

Hi Chris, sorry with the formers and latters i've got a bit confused; to be clear you'd prefer it to be a feature. In the Features class we'll create a private static bool accessed via public static methods IsSecureMemory and SetSecureMemory (this will not be thread safe).

Globally we replace all std::string / std::stream with std::basic_string / std::basic_stream<..., SecureAlloc> and SecureAlloc checks IsSecureMemory before perfoming the memset. The same principle applies on the memset before free.

Apologies for this clarification but i was struggling a bit with the above. Thanks again, Christopher

@cdunn2001
Copy link
Contributor

Well, I'd definitely prefer it to be a compile-time option (i.e. an ifdef around a typedef).

And either way, we need to replace most of the std::* with typedefs. (The ones in the public interface will not change, right?)

@dawesc
Copy link
Contributor Author

dawesc commented Feb 15, 2016

Hi Chris, great thanks, i'll give it a bash.

@dawesc dawesc closed this Feb 15, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants