Note that in this case, $$x$$ is not necessarily a spatial element.
7
+
Often, it is time or something else entirely!
8
+
The easiest way to think about this is that the function $$g(x)$$ is being shifted across all of space by the variable $$\xi$$.
9
+
At every point $$x$$, we multiply $$f(\xi)$$ by the shifted $$g(x-\xi)$$ and integrate the product over all $$\xi$$ to find the convolutional output for that spatial step, $$(f*g)(x)$$.
10
+
Note that in code, this is often discretized to look like:
This means we basically just need to keep one array steady, flip the second array around, and move it through the first array one step at a time, performing a simple element-wise multiplication each step.
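As a minimal sketch of the flip-and-slide procedure, assuming the standard discrete definition $$(f*g)[n] = \sum_m f[m]g[n-m]$$, the loop below computes a full linear convolution in plain Python. The function name `convolve_linear` is a hypothetical choice for illustration and is not taken from the archive's code.

```python
def convolve_linear(f, g):
    """Full linear convolution of two sequences (hypothetical helper name)."""
    n_out = len(f) + len(g) - 1
    out = [0.0] * n_out
    for n in range(n_out):
        # Sum f[m] * g[n - m] over every m for which both indices are valid.
        # Indexing g with (n - m) is what "flips" the second array.
        for m in range(len(f)):
            k = n - m
            if 0 <= k < len(g):
                out[n] += f[m] * g[k]
    return out


print(convolve_linear([1, 2, 3], [0, 1, 0.5]))  # [0.0, 1.0, 2.5, 4.0, 1.5]
```

The bounds check plays the role of treating everything outside either array as zero, which is one common convention rather than the only one.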
16
+
17
+
This can be seen in the following animation:
18
+
19
+
ADD ANIMATION
20
+
21
+
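Note that in this case, the output array will be roughly the size of `f[n]` and `g[n]` put together: for inputs of lengths $$N$$ and $$M$$, there are $$N+M-1$$ positions where the two arrays overlap.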
22
+
Sometimes, though, we have a large size for `f[n]` and a small size for `g[n]`.
23
+
In this case, `g[n]` is often called a *filter*, and when we are using a filter on an array (that might represent an image or some form of data), we often want the output array to be the same size as the input.
24
+
In this case, rather than outputting a larger array, we often do something special at the borders of the array.
25
+
Depending on the situation, this may be necessary.
26
+
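Note that there are different methods to deal with the edges in this case, such as treating out-of-bounds elements as zero or wrapping around to the other side of the array, so it's best to choose whichever suits the data when the situation arises.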
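One possible way to keep the output the same size as the input is sketched below. It assumes an odd-length filter centered on each element and treats out-of-bounds elements as zero; both choices, along with the name `filter_same_size`, are assumptions made for illustration.

```python
def filter_same_size(signal, kernel):
    """Convolve `signal` with `kernel`, returning len(signal) elements.

    Out-of-range samples are treated as zero, which is only one of
    several possible border conventions.
    """
    half = len(kernel) // 2
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j in range(len(kernel)):
            # Flip the kernel (true convolution) and center it on index i.
            k = i + half - j
            if 0 <= k < len(signal):
                out[i] += signal[k] * kernel[j]
    return out


print(filter_same_size([1, 2, 3, 4], [1, 0, -1]))  # [2.0, 2.0, 2.0, -3.0]
```

This is equivalent to computing the full convolution and keeping only its central `len(signal)` elements.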
27
+
28
+
At this stage, the math and code might still be a little opaque, and it might help to think of the second signal or filter as an array of weights, signalling how much of the convolutional output relies on any particular element within signal one.
29
+
For example, let's say that signal one is a square wave, and signal two is a filter composed of a three element triangle wave (`[1 2 1]`), as shown here:
30
+
31
+
ADD IMAGES
32
+
33
+
The simplest interpretation for this is that at every point $$x$$ along signal one, the convolutional output will be composed of one part of the element at $$x-1$$, two parts of the element at $$x$$, and one part of the element at $$x+1$$.
34
+
If we perform a convolution with signal one and this filter, we will find that the square wave is smeared a little at the edges, like so:
35
+
36
+
ADD ANIMATION
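Until the animation is available, the smearing can be reproduced numerically. The sketch below applies the `[1 2 1]` weights to a small square wave; dividing by the sum of the weights is an added assumption so that flat regions keep their original height, and `smooth_triangle` is a hypothetical name.

```python
def smooth_triangle(signal):
    """Blend each element with its neighbours using the weights [1, 2, 1]."""
    weights = [1, 2, 1]
    total = sum(weights)  # normalization is an assumption, not from the text
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for offset, w in zip((-1, 0, 1), weights):
            k = i + offset
            # Elements outside the signal are treated as zero.
            if 0 <= k < len(signal):
                acc += w * signal[k]
        out.append(acc / total)
    return out


square = [0, 0, 0, 1, 1, 1, 0, 0, 0]
print(smooth_triangle(square))
# [0.0, 0.0, 0.25, 0.75, 1.0, 0.75, 0.25, 0.0, 0.0]
```

Because the filter is symmetric, flipping it makes no difference here, so the convolution and the plain weighted average coincide.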
37
+
38
+
This specific filter is similar to a Gaussian, which is a common kernel used for blurring images in two dimensions.
39
+
For this reason, we will discuss common kernels found in the wild in the next section on [convolutions of images](../2d/2d.md).
40
+
41
+
In code, the one-dimensional convolution might look something like this:
The code examples are licensed under the MIT license (found in [LICENSE.md](https://github.com/algorithm-archivists/algorithm-archive/blob/master/LICENSE.md)).
65
+
66
+
##### Text
67
+
68
+
The text of this chapter was written by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).
After initial licensing ([#560](https://github.com/algorithm-archivists/algorithm-archive/pull/560)), the following pull requests have modified the text or graphics of this chapter:
Unlike the section on [one-dimensional convolutions](../1d/1d.md), for this section, we will no longer be focusing on signals, but instead images.
4
+
For the purposes of this chapter, an image will be an array of elements, each with some red, green, and blue value associated with it; however, for the code examples, greyscale images may be used, where each array element is simply a single floating-point value.
5
+
6
+
In this case, extending the one-dimensional convolution to two dimensions is a relatively straightforward task, but indexing still requires some thought.
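To make the indexing concrete, here is a rough sketch of a two-dimensional convolution over a greyscale image stored as a list of rows. It assumes an odd-sized kernel, an output the same size as the image, and zero-valued pixels outside the borders; `convolve_2d` is a hypothetical name used only for this illustration.

```python
def convolve_2d(image, kernel):
    """Same-size 2D convolution of a greyscale image (list of rows)."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    half_r, half_c = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for a in range(k_rows):
                for b in range(k_cols):
                    # Subtracting a and b flips the kernel in both directions.
                    r = i + half_r - a
                    c = j + half_c - b
                    # Pixels outside the image are treated as zero.
                    if 0 <= r < rows and 0 <= c < cols:
                        out[i][j] += image[r][c] * kernel[a][b]
    return out


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(convolve_2d([[1, 2], [3, 4]], identity))  # [[1.0, 2.0], [3.0, 4.0]]
```

The four nested loops mirror the one-dimensional case: two loops walk the image, and two loops walk the kernel.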
7
+
8
+
ADD CODE AND VIDEO
9
+
10
+
# Common filters
11
+
12
+
For image processing, there are quite a few relatively common filters to use for various tasks.
13
+
For this section, we will cover two of them: Gaussian and Sobel.
14
+
15
+
## The Gaussian kernel
16
+
17
+
The Gaussian kernel serves as an effective *blurring* operation for images.
18
+
As we showed with the triangle filter in the section on [one-dimensional convolutions](../1d/1d.md), the Gaussian kernel effectively adds a small amount of the neighboring elements to each pixel in an image.
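As an illustration, a Gaussian kernel could be built along the following lines. The sketch assumes an odd `size`, a standard deviation `sigma` measured in pixels, and normalization so that the weights sum to one; the name `gaussian_kernel` and both parameters are hypothetical.

```python
import math


def gaussian_kernel(size, sigma):
    """Build a normalized size x size Gaussian kernel (size should be odd)."""
    half = size // 2
    kernel = [
        [
            math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    # Normalize so that blurring does not change the overall brightness.
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


for row in gaussian_kernel(3, 1.0):
    print([round(value, 4) for value in row])
```

The largest weight sits at the center and the weights fall off with distance, which is why each output pixel is dominated by the original pixel plus a smaller share of its neighbors.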
19
+
20
+
21
+
## The Sobel operator
22
+
23
+
The Sobel operator effectively performs a gradient operation on an image by highlighting areas where the pixel values change sharply, and it can be considered a naive edge detector.
24
+
It is also our first non-trivial example of a convolution.
25
+
That is to say that the $$n$$-dimensional Sobel operator is composed of $$n$$ separate gradient convolutions (one per dimension) that are then combined into one output array.
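For the two-dimensional case, this could be sketched as follows, using the commonly quoted 3x3 Sobel kernels and combining the two gradients into a magnitude. The helper names (`apply_3x3`, `sobel_magnitude`) and the zero-valued border are assumptions for illustration, and the kernel is applied without flipping, which only changes the sign of each gradient and not the magnitude.

```python
import math

# Commonly used 3x3 Sobel kernels for the x and y gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_3x3(image, kernel, i, j):
    """Weighted sum of the 3x3 neighbourhood around (i, j).

    Pixels outside the image count as zero, so border values are unreliable.
    """
    total = 0.0
    for a in range(3):
        for b in range(3):
            r, c = i + a - 1, j + b - 1
            if 0 <= r < len(image) and 0 <= c < len(image[0]):
                total += image[r][c] * kernel[a][b]
    return total


def sobel_magnitude(image):
    """Combine the two gradient images into one edge-strength image."""
    out = []
    for i in range(len(image)):
        row = []
        for j in range(len(image[0])):
            gx = apply_3x3(image, SOBEL_X, i, j)
            gy = apply_3x3(image, SOBEL_Y, i, j)
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


# A 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edges = sobel_magnitude(image)
print(edges[2][1], edges[2][3])  # 4.0 0.0
```

The response is large next to the edge and zero in the flat interior, which is the behaviour expected of a simple edge detector.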
26
+
27
+
28
+
<script>
29
+
MathJax.Hub.Queue(["Typeset",MathJax.Hub]);
30
+
</script>
31
+
32
+
## License
33
+
34
+
##### Code Examples
35
+
36
+
The code examples are licensed under the MIT license (found in [LICENSE.md](https://github.com/algorithm-archivists/algorithm-archive/blob/master/LICENSE.md)).
37
+
38
+
##### Text
39
+
40
+
The text of this chapter was written by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).
After initial licensing ([#560](https://github.com/algorithm-archivists/algorithm-archive/pull/560)), the following pull requests have modified the text or graphics of this chapter:
Important note: this particular section will be expanded upon after the Fourier transform and fast Fourier transform chapters have been revised.
4
+
5
+
Now, let me tell you about a bit of computational magic:
6
+
7
+
**Convolutions can be performed with Fourier Transforms!**
8
+
9
+
That is crazy, but it's also incredibly hard to explain, so let me do my best.
10
+
As described in the chapter on [Fourier Transforms](../cooley_tukey/cooley_tukey.md), Fourier Transforms allow programmers to move from real space to frequency space.
11
+
When we transform a wave into frequency space, we see a single peak in frequency space related to the frequency of that wave.
12
+
No matter what function we send into a Fourier Transform, the frequency-space image can be interpreted as a series of different waves, each with a specified frequency.
13
+
14
+
So here's the idea: if we take two functions $$f(x)$$ and $$g(x)$$ and move them to frequency space to be $$\hat f(\xi)$$ and $$\hat g(\xi)$$, we can then multiply those two functions and transform them back into a third function to blend the signals together.
15
+
In this way, we will have a third function that relates the frequency-space images of the two input functions.
16
+
This is precisely a convolution.
17
+
18
+
This is because of something known as the *convolution theorem* which looks something like this:
Where $$\mathcal{F}$$ denotes the Fourier Transform.
23
+
Now, by using a Fast Fourier Transform (FFT) in code, this can take a standard convolution on two arrays of length $$n$$, which is an $$\mathcal{O}(n^2)$$ process, to $$\mathcal{O}(n\log(n))$$.
24
+
This means that the convolution theorem is fundamental to creating fast convolutional methods for large inputs, assuming that both of the input signals are of similar size.
25
+
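That said, it is debatable whether the convolution theorem will be faster when the filter is small, because a direct convolution of a length-$$n$$ signal with a length-$$m$$ filter only takes $$\mathcal{O}(nm)$$ operations.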
26
+
Also: depending on the language used, we might need to read in a separate library for FFTs.
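For languages without an FFT library at hand, the idea can still be sketched with a small hand-written transform. The Python below assumes the usual statement of the convolution theorem, $$\mathcal{F}(f*g) = \mathcal{F}(f)\cdot\mathcal{F}(g)$$, and uses a recursive radix-2 FFT that only works for power-of-two lengths; the function names are hypothetical, and a production FFT library would be preferable in practice.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(x):
    """Inverse FFT via conjugation: conj(fft(conj(x))) / n."""
    n = len(x)
    transformed = fft([value.conjugate() for value in x])
    return [value.conjugate() / n for value in transformed]


def convolve_fft(f, g):
    """Cyclic convolution of two equal-length, power-of-two sequences."""
    product = [a * b for a, b in zip(fft(f), fft(g))]
    # The inputs are real, so the imaginary parts are only rounding error.
    return [value.real for value in ifft(product)]


result = convolve_fft([1, 2, 3, 4], [0, 1, 0, 0])
print([round(value, 6) for value in result])  # [4.0, 1.0, 2.0, 3.0]
```

Convolving with a unit impulse at index 1 shifts the signal by one element, and the last element reappears at the front because the result is cyclic.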
27
+
28
+
{% method %}
29
+
{% sample lang="jl" %}
30
+
That said, Julia provides an FFT routine (built in before Julia 0.7 and available through the FFTW.jl package since then), so the code for this method could not be simpler:
This method also has the added advantage that it will *always output an array of the size of your signal*; however, if your signals are not of equal size, we need to pad the smaller signal with zeros.
45
+
Also note that the Fourier Transform is a periodic or cyclical operation, so there are no real edges in this method; instead, the arrays "wrap around" to the other side.
46
+
For this reason, this convolution is often called a *cyclic convolution* instead of a *linear convolution* like above.
47
+
Note that cyclic convolutions can definitely still be done without Fourier Transforms and we can do linear convolutions with Fourier Transforms, but it makes the code slightly more complicated than described above.
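Both halves of that statement can be illustrated without any Fourier Transform at all. The sketch below computes a cyclic convolution directly with modular indexing, and then shows that zero-padding both inputs to the full output length makes the cyclic result match the linear one; the same padding is what an FFT-based method would need. The function names are hypothetical.

```python
def convolve_cyclic(f, g):
    """Direct cyclic convolution of two equal-length sequences."""
    n = len(f)
    if len(g) != n:
        raise ValueError("cyclic convolution needs equal-length inputs")
    # The modulo makes indices wrap around to the other side of the array.
    return [sum(f[m] * g[(k - m) % n] for m in range(n)) for k in range(n)]


def convolve_linear_via_cyclic(f, g):
    """Linear convolution obtained by zero-padding before a cyclic one."""
    n = len(f) + len(g) - 1
    f_padded = list(f) + [0] * (n - len(f))
    g_padded = list(g) + [0] * (n - len(g))
    return convolve_cyclic(f_padded, g_padded)


print(convolve_cyclic([1, 2, 3, 4], [0, 1, 0, 0]))          # [4, 1, 2, 3]
print(convolve_linear_via_cyclic([1, 2, 3], [0, 1, 0.5]))   # [0, 1, 2.5, 4.0, 1.5]
```

With enough padding, the wrapped-around terms only ever multiply zeros, so nothing leaks from one end of the array to the other.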
48
+
49
+
<script>
50
+
MathJax.Hub.Queue(["Typeset",MathJax.Hub]);
51
+
</script>
52
+
53
+
## License
54
+
55
+
##### Code Examples
56
+
57
+
The code examples are licensed under the MIT license (found in [LICENSE.md](https://github.com/algorithm-archivists/algorithm-archive/blob/master/LICENSE.md)).
58
+
59
+
##### Text
60
+
61
+
The text of this chapter was written by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).
After initial licensing ([#560](https://github.com/algorithm-archivists/algorithm-archive/pull/560)), the following pull requests have modified the text or graphics of this chapter:
contents/convolutions/convolutions.md
9 additions & 102 deletions
@@ -6,8 +6,8 @@
6
6
4. Correlation of same filters on images. Correlation. Take noisy signal, and correlate it with lean function to see if it's similar
7
7
8
8
# Convolutions
9
-
Alright, I am going to come right out and say it: convolutions can be confusing.
10
-
Not only are they hard to really describe, but if you do not see them in practice, it's hard to understand why you would ever want to use them.
9
+
To put it bluntly, convolutions can be confusing.
10
+
Not only are they hard to describe, but if you do not see them in practice, it's hard to understand why you would ever want to use them.
11
11
I'm going to do what I can to describe them in an intuitive way; however, I may need to come back to this in the future.
12
12
Let me know if there is anything here that is unclear, and I'll do what I can to clear it up.
13
13
@@ -19,115 +19,22 @@ A convolution is another function-related operation, and is often notated with a
19
19
$$f(x)*g(x)=c(x)$$
20
20
21
21
provides a third function $$c(x)$$ that blends $$f(x)$$ and $$g(x)$$.
22
+
This concept is known as a *convolution*. As a rather important side-note, there is an incredibly similar operator known as a *correlation*, which will be discussed in the near future.
23
+
Now we are left with a rather vague question: How do we *blend* functions?
22
24
23
-
As a rather important side-note: there is an incredibly similar operator known as a *correlation* which will be discussed in the near future.
24
-
For now, let's focus on convolutions, which are defined as:
25
+
To answer this question, we will need to show off a few simple graphics or animations in the [Convolutions in 1D](1d/1d.md) section before discussing the mathematical definition.
26
+
We will then move on to the application of convolutions to images in the [Convolutions of images](2d/2d.md) section.
27
+
As a note: convolutions can be extended to $$n$$ dimensions, but after seeing how they are extended to two dimensions, it should be possible for the reader to extend them to three dimensions and beyond if that is needed.
28
+
In addition, we will be touching on a rather difficult but powerful topic in the [Convolutional Theorem](convolutional_theorem/convolutional_theorem.md) section, where convolutions can be computed by using [Fourier transforms](../cooley_tukey/cooley_tukey.md).
Note that in this case, $$x$$ is not necessarily a spatial element.
29
-
Often times, it is time or something else entirely!
30
-
The easiest way to think about this is that the function $$g(x)$$ is being shifted across all of space by the variable $$\xi$$.
31
-
At every point $$x$$, we multiply $$f(x)$$ and $$g(x)$$ and integrate the multiplied output to find the convolution for that spatial step, $$(f*g)(x)$$.
32
-
Note that in code, this is often discretized to look like:
This means we basically just need to keep one array steady, flip the second array around, and move it through the first array one step at a time, performing a simple element-wise multiplication each step.
38
-
39
-
<!---This can be seen in the following animation:--->
Note that in this case, the output array will be the size of `f[n]` and `g[n]` put together.
59
-
Sometimes, though, we have an large size for `f[n]` and a small size for `g[n]`.
60
-
In this case `g[n]` is often called a *filter*, and often times when we are using a filter on an array (that might represent an image or some form of data), we want the output array to be the same size as the input.
61
-
In this case, rather than outputting a larger array, we often do something special at the borders of the array.
62
-
Depending on the situation, this may be necessary.
63
-
Note that there are different methods to deal with the edges in this case, so it's best to do whatever seems right when the situation arises.
64
-
65
-
### Convolutional Theorem
66
-
67
-
Now, let me tell you about a bit of black computational magic:
68
-
69
-
**Convolutions can be performed with Fourier Transforms!**
70
-
71
-
That is crazy!
72
-
It's also incredibly hard to explain, so let me do my best.
73
-
As described in the chapter on [Fourier Transforms](../cooley_tukey/cooley_tukey.md), Fourier Transforms allow programmers to move from real space to frequency space.
74
-
When we transform a wave into frequency space, we see a single peak in frequency space related to the frequency of that wave.
75
-
No matter what function we send into a Fourier Transform, the frequency-space image can be interpreted as a series of different waves with a specified frequency.
76
-
77
-
So here's the idea: if we take two functions $$f(x)$$ and $$g(x)$$ and move them to frequency space to be $$\hat f(\xi)$$ and $$\hat g(\xi)$$, we can then multiply those two functions and transform them back into a third function to blend the signals together.
78
-
In this way, we will have a third function that relates the frequency-space images of the two input functions.
79
-
*This is precisely a convolution!*
80
-
81
-
Don't believe me?
82
-
Well, this is because of something known as the *convolution theorem* which looks something like this:
Where $$\mathcal{F}$$ denotes the Fourier Transform.
87
-
Now, by using a Fast Fourier Transform (fft) in code, this can take a standard convolution on two arrays of length $$n$$, which is an $$\mathcal{O}(n^2)$$ process, to $$\mathcal{O}(n\log(n))$$.
88
-
This means that the convolution theorem is fundamental to creating fast convolutional methods for large inputs, assuming that both of the input signals are similar sizes.
89
-
That said, it is debatable whether the convolution theorem will be faster when the filter size is small.
90
-
Also: depending on the language used, we might need to read in a separate library for FFT's.
91
-
92
-
{% method %}
93
-
{% sample lang="jl" %}
94
-
That said, Julia has an in-built fft routine, so the code for this method could not be simpler:
95
-
[import:19-22, lang:"julia"](code/julia/conv.jl)
96
-
Where the `.*` operator is an element-wise multiplication.
This method also has the added advantage that it will *always output an array of the size of your signal*; however, if your signals are not of equal size, we need to pad the smaller signal with zeros.
109
-
Also note that the Fourier Transform is a periodic or cyclical operation, so there are no real edges in this method, instead the arrays "wrap around" to the other side.
110
-
For this reason, this convolution is often called a *cyclic convolution* instead of a *linear convolution* like above.
111
-
Note that cyclic convolutions can definitely still be done without Fourier Transforms and we can do linear convolutions with Fourier Transforms, but it makes the code slightly more complicated than described above.
112
-
113
-
<!---
114
-
If you are still having trouble wrapping your head around what the convolution theorem actually means, maybe this graphic will help:
115
-
116
-
ADD IMAGE
117
-
118
-
Remember that each element of the frequency-space array is a different waveform in real-space, so when you multiply two frequency-space arrays, you are selectively amplifying similar waveforms.
119
-
--->
30
+
Finally, all of the sections related to convolutions will be updated after the Fourier transform and fast Fourier transform chapters have been updated in the near future.
120
31
121
32
<script>
122
33
MathJax.Hub.Queue(["Typeset",MathJax.Hub]);
123
34
</script>
124
35
125
36
## License
126
37
127
-
##### Code Examples
128
-
129
-
The code examples are licensed under the MIT license (found in [LICENSE.md](https://github.com/algorithm-archivists/algorithm-archive/blob/master/LICENSE.md)).
130
-
131
38
##### Text
132
39
133
40
The text of this chapter was written by [James Schloss](https://github.com/leios) and is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/legalcode).