Skip to content

Update to the convolution chapter(s) #796

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 4, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -515,4 +515,5 @@ vscode/
# Coconut compilation files
**/coconut/*.py

# aspell
*.bak
5 changes: 5 additions & 0 deletions SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,11 @@
* [Affine Transformations](contents/affine_transformations/affine_transformations.md)
* [Bit Logic](contents/bitlogic/bitlogic.md)
* [Taylor Series](contents/taylor_series_expansion/taylor_series_expansion.md)
* [Convolutions](contents/convolutions/convolutions.md)
* [Convolutions in 1D](contents/convolutions/1d/1d.md)
* [Multiplication as a Convolution](contents/convolutions/multiplication/multiplication.md)
* [Convolutions of Images (2D)](contents/convolutions/2d/2d.md)
* [Convolutional Theorem](contents/convolutions/convolutional_theorem/convolutional_theorem.md)
* [Tree Traversal](contents/tree_traversal/tree_traversal.md)
* [Euclidean Algorithm](contents/euclidean_algorithm/euclidean_algorithm.md)
* [Monte Carlo](contents/monte_carlo_integration/monte_carlo_integration.md)
Expand Down
2 changes: 1 addition & 1 deletion contents/IFS/IFS.md
Original file line number Diff line number Diff line change
Expand Up @@ -225,7 +225,7 @@ The code examples are licensed under the MIT license (found in [LICENSE.md](http

##### Text

The text of this chapter was written by [James Schloss](https://github.com/leio) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).
The text of this chapter was written by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).

[<p><img class="center" src="../cc/CC-BY-SA_icon.svg" /></p>](https://creativecommons.org/licenses/by-sa/4.0/)

Expand Down
2 changes: 1 addition & 1 deletion contents/computus/computus.md
Original file line number Diff line number Diff line change
Expand Up @@ -318,7 +318,7 @@ The code examples are licensed under the MIT license (found in [LICENSE.md](http

##### Text

The text of this chapter was written by [James Schloss](https://github.com/leio) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).
The text of this chapter was written by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).

[<p><img class="center" src="../cc/CC-BY-SA_icon.svg" /></p>](https://creativecommons.org/licenses/by-sa/4.0/)

Expand Down
307 changes: 307 additions & 0 deletions contents/convolutions/1d/1d.md

Large diffs are not rendered by default.

183 changes: 183 additions & 0 deletions contents/convolutions/2d/2d.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,183 @@
# Convolutions on Images

For this section, we will no longer be focusing on signals, but instead images (arrays filled with elements of red, green, and blue values).
That said, for the code examples, greyscale images may be used such that each array element is composed of some floating-point value instead of color.
In addition, we will not be discussing boundary conditions too much in this chapter and will instead be using the simple boundaries introduced in the section on [one-dimensional convolutions](../1d/1d.md).

The extension of one-dimensional convolutions to two dimensions requires a little thought about indexing and the like, but is ultimately the same operation.
Here is an animation of a convolution for a two-dimensional image:

<div style="text-align:center">
<video style="width:90%" controls loop>
<source src="../res/2d.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</div>

In this case, we convolved the image with a 3x3 square filter, all filled with values of $$\frac{1}{9}$$.
This created a simple blurring effect, which is somewhat expected from the discussion in the previous section.
In code, a two-dimensional convolution might look like this:

{% method %}
{% sample lang="jl" %}
[import:4-28, lang:"julia"](../code/julia/2d_convolution.jl)
{% endmethod %}

This is very similar to what we have shown in previous sections; however, it essentially requires four iterable dimensions because we need to iterate through each axis of the output domain *and* the filter.

At this stage, it is worth highlighting common filters used for convolutions of images.
In particular, we will further discuss the Gaussian filter introduced in the [previous section](../1d/1d.md), and then introduce another set of kernels known as Sobel operators, which are used for naïve edge detection or image derivatives.

## The Gaussian kernel

The Gaussian kernel serves as an effective *blurring* operation for images.
As a reminder, the formula for any Gaussian distribution is

$$
g(x,y) = \frac{1}{2\pi\sigma^2}e^{-\frac{x^2+y^2}{2\sigma^2}},
$$

where $$\sigma$$ is the standard deviation and is a measure of the width of the Gaussian.
A larger $$\sigma$$ means a larger Gaussian; however, remember that the Gaussian must fit onto the filter, otherwise it will be cut off!
For example, if you are using a $$3\times 3$$ filter, you should not be using $$\sigma = 10$$.
Some definitions of $$\sigma$$ allow users to have a separate deviation in $$x$$ and $$y$$ to create an ellipsoid Gaussian, but for the purposes of this chapter, we will assume $$\sigma_x = \sigma_y$$.
As a general rule of thumb, the larger the filter and standard deviation, the more "smeared" the final convolution will be.

At this stage, it is important to write some code, so we will generate a simple function that returns a Gaussian kernel with a specified standard deviation and filter size.

{% method %}
{% sample lang="jl" %}
[import:30-47, lang:"julia"](../code/julia/2d_convolution.jl)
{% endmethod %}

Though it is entirely possible to create a Gaussian kernel whose standard deviation is independent on the kernel size, we have decided to enforce a relation between the two in this chapter.
As always, we encourage you to play with the code and create your own Gaussian kernels any way you want!
As a note, all the kernels will be scaled (normalized) at the end by the sum of all internal elements.
This ensures that the output of the convolution will not have an obnoxious scale factor associated with it.

Below are a few images generated by applying a kernel generated with the code above to a black and white image of a circle.

<p>
<img class="center" src="../res/circle_blur.png" style="width:100%" />
</p>


In (a), we show the original image, which is just a white circle at the center of a $$50\times 50$$ grid.
In (b), we show the image after convolution with a $$3\times 3$$ kernel.
In (c), we show the image after convolution with a $$20\times 20$$ kernel.
Here, we see that (c) is significantly fuzzier than (b), which is a direct consequence of the kernel size.

There is a lot more that we could talk about, but now is a good time to move on to a slightly more complicated convolutional method: the Sobel operator.

## The Sobel operator

The Sobel operator effectively performs a gradient operation on an image by highlighting areas where a large change has been made.
In essence, this means that this operation can be thought of as a naïve edge detector.
Essentially, the $$n$$-dimensional Sobel operator is composed of $$n$$ separate gradient convolutions (one for each dimension) that are then combined together into a final output array.
Again, for the purposes of this chapter, we will stick to two dimensions, which will be composed of two separate gradients along the $$x$$ and $$y$$ directions.
Each gradient will be created by convolving our image with their corresponding Sobel operator:

$$
\begin{align}
S_x &= \left(\begin{bmatrix}
1 \\
2 \\
1 \\
\end{bmatrix} \otimes [1~0~-1]
\right) = \begin{bmatrix}
1 & 0 & -1 \\
2 & 0 & -2 \\
1 & 0 & -1 \\
\end{bmatrix}\\

S_y &= \left(
\begin{bmatrix}
1 \\
0 \\
-1 \\
\end{bmatrix} \otimes [1~2~1]
\right) = \begin{bmatrix}
1 & 2 & 1 \\
0 & 0 & 0 \\
-1 & -2 & -1 \\
\end{bmatrix}.
\end{align}
$$

The gradients can then be found with a convolution, such that:

$$
\begin{align}
G_x &= S_x*A \\
G_y &= S_y*A.
\end{align}
$$

Here, $$A$$ is the input array or image.
Finally, these gradients can be summed in quadrature to find the total Sobel operator or image gradient:

$$
G_{\text{total}} = \sqrt{G_x^2 + G_y^2}
$$

So let us now show what it does in practice:

<p>
<img class="center" src="../res/sobel_filters.png" style="width:100%" />
</p>

In this diagram, we start with the circle image on the right, and then convolve it with the $$S_x$$ and $$S_y$$ operators to find the gradients along $$x$$ and $$y$$ before summing them in quadrature to get the final image gradient.
Here, we see that the edges of our input image have been highlighted, showing outline of our circle.
This is why the Sobel operator is also known as naïve edge detection and is an integral component to many more sophisticated edge detection methods like one proposed by Canny {{ "canny1986computational" | cite }}.

In code, the Sobel operator involves first finding the operators in $$x$$ and $$y$$ and then applying them with a traditional convolution:

{% method %}
{% sample lang="jl" %}
[import:49-63, lang:"julia"](../code/julia/2d_convolution.jl)
{% endmethod %}

With that, I believe we are at a good place to stop discussions on two-dimensional convolutions.
We will definitely return to this topic in the future as new algorithms require more information.

## Example Code

For the code in this section, we have modified the visualizations from the [one-dimensional convolution chapter](../1d/1d.md) to add a two-dimensional variant for blurring an image of random white noise.
We have also added code to create the Gaussian kernel and Sobel operator and apply it to the circle, as shown in the text.

{% method %}
{% sample lang="jl" %}
[import, lang:"julia"](../code/julia/2d_convolution.jl)
{% endmethod %}

<script>
MathJax.Hub.Queue(["Typeset",MathJax.Hub]);
</script>

### Bibliography

{% references %} {% endreferences %}

## License

##### Code Examples

The code examples are licensed under the MIT license (found in [LICENSE.md](https://github.com/algorithm-archivists/algorithm-archive/blob/master/LICENSE.md)).

##### Images/Graphics
- The image "[8bit Heart](../res/heart_8bit.png)" was created by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).
- The image "[Circle Blur](../res/circle_blur.png)" was created by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).
- The image "[Sobel Filters](../res/sobel_filters.png)" was created by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).
- The video "[2D Convolution](../res/2d.mp4)" was created by [James Schloss](https://github.com/leios) and [Grant Sanderson](https://github.com/3b1b) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).

##### Text

The text of this chapter was written by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).

[<p><img class="center" src="../../cc/CC-BY-SA_icon.svg" /></p>](https://creativecommons.org/licenses/by-sa/4.0/)

##### Pull Requests

After initial licensing ([#560](https://github.com/algorithm-archivists/algorithm-archive/pull/560)), the following pull requests have modified the text or graphics of this chapter:
- none

74 changes: 74 additions & 0 deletions contents/convolutions/code/julia/1d_convolution.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
using DelimitedFiles
using LinearAlgebra

function convolve_cyclic(signal::Array{T, 1},
filter::Array{T, 1}) where {T <: Number}

# output size will be the size of sign
output_size = max(length(signal), length(filter))

# convolutional output
out = Array{Float64,1}(undef,output_size)
sum = 0

for i = 1:output_size
for j = 1:output_size
if (mod1(i-j, output_size) <= length(filter))
sum += signal[mod1(j, output_size)] * filter[mod1(i-j, output_size)]
end
end

out[i] = sum
sum = 0

end

return out
end

function convolve_linear(signal::Array{T, 1}, filter::Array{T, 1},
output_size) where {T <: Number}

# convolutional output
out = Array{Float64,1}(undef, output_size)
sum = 0

for i = 1:output_size
for j = max(1, i-length(filter)):i
if j <= length(signal) && i-j+1 <= length(filter)
sum += signal[j] * filter[i-j+1]
end
end

out[i] = sum
sum = 0
end

return out
end

function main()

# sawtooth functions for x and y
x = [float(i)/200 for i = 1:200]
y = [float(i)/200 for i = 1:200]

# Normalization is not strictly necessary, but good practice
normalize!(x)
normalize!(y)

# full convolution, output will be the size of x + y
full_linear_output = convolve_linear(x, y, length(x) + length(y))

# simple boundaries
simple_linear_output = convolve_linear(x, y, length(x))

# cyclic convolution
cyclic_output = convolve_cyclic(x, y)

# outputting convolutions to different files for plotting in external code
writedlm("full_linear.dat", full_linear_output)
writedlm("simple_linear.dat", simple_linear_output)
writedlm("cyclic.dat", cyclic_output)

end
Loading