Skip to content

Implement IFS in C++ #701

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 22, 2020
Merged

Conversation

VincentZalzal
Copy link
Contributor

My C++ version is slightly different than the Julia version. I decided to exclude saving to a file from the chaosGame() function; I find it cleaner, and cause less distraction when reading the page.

@leios
Copy link
Member

leios commented May 21, 2020

I think the idea of returning the points from the chaos_game function is good. I will modify the Julia code and ask #700 to do so as well.

As for the C++ code, I typically use a different style, so I refrain from reviewing this for now. If anyone else wants to review it, go ahead!

@leios leios mentioned this pull request May 21, 2020
@Frank-Buss
Copy link

I like the Point struct and the operators for it, because unlike in Julia, it is not easy to do the same without an extra struct. But I think using the random templates is not necessary, the Julia code uses the standard rand function only, too, and using other random algorithm doesn't help to see the result. This helps to simplify the code and to make it shorter. I gave it a try, see the full code below for IFS.cpp. I also added a typedef to make the code easier to read.

#include <cmath>
#include <fstream>
#include <cstdlib>
#include <vector>

struct Point {
  double x, y;
};

typedef std::vector<Point> PointVector;

Point operator+(Point lhs, Point rhs) { return {lhs.x + rhs.x, lhs.y + rhs.y}; }
Point operator*(double k, Point pt) { return {k * pt.x, k * pt.y}; }
Point operator*(Point pt, double k) { return k * pt; }

// Return a random number x, with 0 <= x < 1
double drand() {
    return double(rand()) / double(RAND_MAX);
}

PointVector chaosGame(int numOutputPoints, const PointVector& inputPoints) {
  // Choose first point randomly
  Point curPoint = { drand(), drand() };

  // For each output point, compute midpoint to random input point
  PointVector outputPoints(numOutputPoints);
  for (auto& outPoint : outputPoints) {
    outPoint = curPoint;
    curPoint = 0.5 * (curPoint + inputPoints[rand() % inputPoints.size()]);
  }

  return outputPoints;
}

int main() {
  // This will generate a Sierpinski triangle with a chaos game of n points for
  // an initial triangle with three points on the vertices of an equilateral
  // triangle.
  PointVector inputPoints = {
      {0.0, 0.0}, {0.5, std::sqrt(0.75)}, {1.0, 0.0}};
  auto outputPoints = chaosGame(10000, inputPoints);

  // It will output the file sierpinski.dat, which can be plotted after
  std::ofstream ofs("sierpinski.dat");
  for (auto pt : outputPoints)
    ofs << pt.x << '\t' << pt.y << '\n';
}

@VincentZalzal
Copy link
Contributor Author

I was torn myself about using a template parameter, because it hinders readability, I agree. The problem is, it is not recommended to use rand() in new code. But using the new header requires more machinery. Also, doing the modulo trick to get an integer in a range, while idiomatic, is biased, unless the integer is a power of two. For a few number of points, the bias is completely negligible, but still, ideally, I would show the "correct" way if possible.

I could solve it by having a custom rand function like yours, but with a static RNG inside, so that all stuff is hidden. Would this be better? I'll start working on it in the meantime.

@Frank-Buss
Copy link

Do you have a reference where it says that the old rand function shouldn't be used anymore? I can't see that it is deprecated, at least not at http://www.cplusplus.com/reference/cstdlib/rand/ . This page also mentions the modulo trick, but I don't understand why it is not uniformly distributed. Assuming rand is purely random, there should be no bias.

But right, might be useful to show the new way how C++ does it anyway, but just with another simple function to hide the complexity of the random number generator from the chaosGame function and just two easy to use functions, which should be part of the standard library anyway, like many other languages have it, like randrange in Python.

@VincentZalzal
Copy link
Contributor Author

rand() is not deprecated, simply it is not recommended anymore. This SO post explains it better than I would: https://stackoverflow.com/questions/52869166/why-is-the-use-of-rand-considered-bad

I think this new version has a very clean and simple chaosGame() function now, no template and rand cruft!

@jiegillet
Copy link
Member

jiegillet commented May 22, 2020

@Frank-Buss I don't know C++, but from reading the SO post, I understand why there is a bias. Imagine your RNG has RAND_MAX=5, so it gives you any number between 0 and 5 uniformly. However you need a number between 0 and 3, so you use the modulo trick. That remaps the original numbers to

0 -> 0
1 -> 1
2 -> 2
3 -> 3
4 -> 0
5 -> 1

You end up getting 0 or 1 twice as much as 2 or 3. Sometimes that doesn't happen if RAND_MAX is a multiple of the max number you want, but that's not reliable.

Sorry if you already understood it :)

@Frank-Buss
Copy link

Looks good, I like the new "using" syntax instead of typedef. But maybe would be better to just define a function which returns a random int in a range, instead of chooseRandPoint, because then this function could be potentially outsourced later to a general library for using it in other sample code, or it might help people like me to copy-and-paste it for my projects :-)

After thinking about modulo operator with rand(), I do understand now why it is biased, because the whole range from 0 to RAND_MAX is used for rand, and each number happens equally often, so if it is mapped e.g. to 0..2, and RAND_MAX+1 is a power of 2, then the "last" mapping for 2 is missing, which causes an error in the equal distribution of 1/2^32 (if RAND_MAX+1==2^32). Not that much for most practical cases, but of course gets worse the closer the modulo operator gets to RAND_MAX.

@Frank-Buss
Copy link

@jiegillet Thanks, your posting overlapped with my posting, I understood it meanwhile :-)

@VincentZalzal
Copy link
Contributor Author

Funny, I actually refrained from generalizing the function this time! My first version was:

template <typename Container>
auto choose(const Container& container) {
  auto index = std::uniform_int_distribution<std::size_t>(
      0, container.size() - 1)(rng());
  return container[index];
}

(sorry for the weird line breaks, but this is what using clang-format with the provided file yields...)

This works with any container supporting size() and operator[], and deduce the return type. I find using it cleaner, because you can't mistakenly access out of the array:

choose(points);
// instead of
points[randrange(points.size())] // you may pass the wrong array size here

But we could also have both: a function to get an int, and use it for a container. This is maybe the cleanest:

std::size_t randrange(std::size_t numElems) {
  return std::uniform_int_distribution<std::size_t>(0, numElems - 1)(rng());
}

template <typename Container>
auto choose(const Container& container) {
  return container[randrange(container.size())];
}

A preference?

@jiegillet
Copy link
Member

Totally unrelated but it made me remember an anecdote. I was TA'ing a computational physics course once and part of an exam question was to code a dice() function that returns a number between 0 and 5, and then do some stats with that. One of the student sent in his code, it was working but sometimes it was really slow. I take a look and his function was something like:

x = rand()
while(x>5) x = rand()

It worked but... lol
Maybe you can use that? Possibly there is no bias :D

@leios
Copy link
Member

leios commented May 22, 2020

I just wanted to pop in and say the code is looking better and please do what you feel like is best!

That said, please do not do what @jiegillet said.

@Frank-Buss
Copy link

I like the solution with the 2 functions. I'm not sure about the template function. It is not needed in this case with one file, but it makes sense if the function is later moved to a library or common header files. Then it can be used with other container types as well. But I think would be better to remove it for now and use the concrete type, because currently there is no such library and it makes the code easier to read with the concrete type.

@leios
Copy link
Member

leios commented May 22, 2020

@Frank-Buss We have considered possibly providing language utilities (compilation help, maybe some header files for C++). Do you think something like this would help here?

I'm always afraid of using a resource-specific library because then the code becomes "textbook specific," and harder to understand unless you have first read the intro chapters for that language in the book.

@VincentZalzal
Copy link
Contributor Author

@leios This is my first contribution here, so take my opinion with a grain of salt :) I think it is probably easier to have self-contained examples, and not develop a "dialect" that would be AAA-specific, unless something is really missing in the language.

Here is a new version, with the two functions, but no template.

@Frank-Buss
Copy link

I think the code looks good now. Maybe last change I would do is write one short line of comment before each function, as the Julia code does, like // This is a function to simulate a "chaos game". In a real application I would even use Doxygen-style comments, but I don't think this is necessary here. So for the choose function it would be like this:

// Return a random point from the non-empty PointVector
Point choose(const PointVector& points) {
  return points[randrange(points.size())];
}

I guess this would be something @leios has to take care of, to make the code look the same for all examples, e.g. if the comments should start with a capital letter (I prefer a lower case letter, but I've seen the other style as well), or if he wants full Doxygen-style comments for each function and class.

BTW, for Python, the function name is choice, but I think choose is fine, too.

And I think @VincentZalzal is right, self-contained examples might be better in this case. If I would develop an application, I would avoid code duplication and use common header files etc., but for the Arcane Algorithm Archive, it might be more important to provide easy to understand examples. A reader might read just one chapter and one source code example, and can understand a fully self-contained example faster, if he doesn't have to lookup all the non-standard functions. This means only one file for each example as well.

If there would be a coding styleguide, unless there is already one and I missed it, it would help code contributions. This would include using tab vs spaces (maybe per language, e.g. in Python it is recommended to use spaces), identifier naming conventions, comment conventions etc. as well. But maybe not important, if every code example is self-contained anyway, and @leios could fix minor differences per contribution to make it look the same for all examples.

@VincentZalzal
Copy link
Contributor Author

Done.

I completely agree, since C++ is so free-for-all, having a style guide would be beneficial for uniformity. For example, I refrained from using the style I'm used to at work, because I know most people hate it :) At least, clang-format is already managing spaces and line breaks. Having at least some variable and function naming conventions (e.g. functions start with a verb and are camel case) would help. Right now, there is a war about whether const is on the left or the right of the type, etc.

Copy link

@Frank-Buss Frank-Buss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect

Copy link
Member

@leios leios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It builds locally and I am happy with @Frank-Buss 's review.

Good work and thanks to both of you!

@leios leios merged commit c134159 into algorithm-archivists:master May 22, 2020
@VincentZalzal VincentZalzal deleted the ifs_cpp branch May 22, 2020 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants