Commit 489b0eb

(MODULES-8760) Add iterative feature to merge() function
This adds an iterative mode to the `merge()` function: when the function is given an `Iterable` as its only argument, it builds a hash from the hashes returned by a lambda. The feature is implemented as a 4.x version of `merge` that is backwards compatible, except that it generates different error messages for argument errors. Since a 4.x function wins over a 3.x function, the new version takes precedence over the old one, except for users who call functions through the old API (as expected; this is fine).
1 parent b0dd4c1 commit 489b0eb
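
As a rough illustration of the behavior the commit message describes, the following plain-Ruby sketch mimics the two-parameter iterative form outside of Puppet. The names `iterative_merge` and `count_short_strings` are hypothetical and are not part of stdlib; the sketch only assumes what the message states, namely that hashes returned by the block are merged into the result and anything else is skipped.

```ruby
# Hypothetical stand-in for the iterative merge(); this is not the stdlib code.
# Calls the block with the hash built so far and each value, and merges the
# block's result into the accumulator only when that result is a Hash.
def iterative_merge(iterable)
  accumulator = {}
  iterable.each do |value|
    result = yield(accumulator, value)
    accumulator.merge!(result) if result.is_a?(Hash)
  end
  accumulator
end

# Counts occurrences of one-character strings. For longer strings the block
# returns nil, so those entries are skipped rather than merged.
def count_short_strings(strings)
  iterative_merge(strings) do |hsh, v|
    { v => (hsh[v] || 0) + 1 } if v.length == 1
  end
end

p count_short_strings(%w[a b c c d b blah blah])
```

With this input, `a` and `d` are counted once, `b` and `c` twice, and `blah` does not appear in the result.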

3 files changed: +161 -7 lines changed

README.md

Lines changed: 27 additions & 0 deletions
@@ -1986,6 +1986,33 @@ Since Puppet 4.0.0, you can use the + operator to achieve the same merge.
 
 $merged_hash = $hash1 + $hash2
 
+If merge is given a single `Iterable` (`Array`, `Hash`, etc.) it will call a given block with
+up to three parameters, and merge each resulting Hash into the accumulated result. All other types
+of values returned from the block (typically `undef`) are skipped (not merged).
+
+The codeblock can take two or three parameters:
+* with two, it gets the current hash (as built to this point), and each value (for a hash, the value is a [key, value] tuple)
+* with three, it gets the current hash (as built to this point), the key/index of each value, and then the value
+
+If the iterable is empty, or no hash was returned from the given block, an empty hash is returned. In the given block, a call to `next()`
+will skip that entry, and a call to `break()` will end the iteration.
+
+*Example: counting occurrences of strings in an array*
+```puppet
+['a', 'b', 'c', 'c', 'd', 'b'].merge | $hsh, $v | { { $v => $hsh[$v].lest || { 0 } + 1 } }
+# would result in { a => 1, b => 2, c => 2, d => 1 }
+```
+
+*Example: skipping values for entries that are longer than 1 char*
+
+```puppet
+['a', 'b', 'c', 'c', 'd', 'b', 'blah', 'blah'].merge | $hsh, $v | { if $v =~ String[1,1] { { $v => $hsh[$v].lest || { 0 } + 1 } } }
+# would result in { a => 1, b => 2, c => 2, d => 1 } since 'blah' is longer than 1 char
+```
+
+The iterative `merge()` has an advantage over doing the same with a general `reduce()` in that the constructed hash
+does not have to be copied in each iteration and thus will perform much better with large inputs.
+
 *Type*: rvalue.
 
 #### `min`
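
The README's closing remark about `reduce()` can be illustrated with a small plain-Ruby comparison. This is only an analogy under stated assumptions: the method names are hypothetical, and Ruby's non-destructive `Hash#merge` stands in for a `reduce()` lambda that returns a new hash each round (for example `$memo + { ... }` in Puppet), while `Hash#merge!` stands in for the in-place accumulation used by the new function.

```ruby
# Reduce-style: every step calls Hash#merge, which returns a new hash
# containing a copy of everything accumulated so far.
def count_with_reduce(values)
  values.reduce({}) { |memo, v| memo.merge({ v => (memo[v] || 0) + 1 }) }
end

# Accumulator-style: a single hash is updated in place with Hash#merge!,
# so no intermediate copies are created.
def count_in_place(values)
  accumulator = {}
  values.each { |v| accumulator.merge!({ v => (accumulator[v] || 0) + 1 }) }
  accumulator
end

letters = %w[a b b c b]
p count_with_reduce(letters) == count_in_place(letters)
```

Both methods return the same counts; the difference is only in how many intermediate hashes are allocated, which grows with the input size in the reduce-style version.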

lib/puppet/functions/merge.rb

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
+# Merges two or more hashes together or hashes resulting from iteration, and returns the resulting hash.
+#
+# @example Using merge()
+#
+#     $hash1 = {'one' => 1, 'two' => 2}
+#     $hash2 = {'two' => 'dos', 'three' => 'tres'}
+#     $merged_hash = merge($hash1, $hash2)
+#     # The resulting hash is equivalent to:
+#     # $merged_hash = {'one' => 1, 'two' => 'dos', 'three' => 'tres'}
+#
+# When there is a duplicate key, the key in the rightmost hash will "win."
+#
+# Note that since Puppet 4.0.0 the same merge can be achieved with the + operator.
+#
+#     $merged_hash = $hash1 + $hash2
+#
+# If merge is given a single Iterable (Array, Hash, etc.) it will call a given block with
+# up to three parameters, and merge each resulting Hash into the accumulated result. All other types
+# of values returned from the block (typically undef) are skipped (not merged).
+#
+# The codeblock can take two or three parameters:
+# * with two, it gets the current hash (as built to this point), and each value (for a hash, the value is a [key, value] tuple)
+# * with three, it gets the current hash (as built to this point), the key/index of each value, and then the value
+#
+# If the iterable is empty, or no hash was returned from the given block, an empty hash is returned. In the given block, a call to `next()`
+# will skip that entry, and a call to `break()` will end the iteration.
+#
+# @example counting occurrences of strings in an array
+#     ['a', 'b', 'c', 'c', 'd', 'b'].merge | $hsh, $v | { { $v => $hsh[$v].lest || { 0 } + 1 } }
+#     # would result in { a => 1, b => 2, c => 2, d => 1 }
+#
+# @example skipping values for entries that are longer than 1 char
+#     ['a', 'b', 'c', 'c', 'd', 'b', 'blah', 'blah'].merge | $hsh, $v | { if $v =~ String[1,1] { { $v => $hsh[$v].lest || { 0 } + 1 } } }
+#     # would result in { a => 1, b => 2, c => 2, d => 1 } since 'blah' is longer than 1 char
+#
+# The iterative `merge()` has an advantage over doing the same with a general `reduce()` in that the constructed hash
+# does not have to be copied in each iteration and thus will perform much better with large inputs.
+#
+Puppet::Functions.create_function(:'merge') do
+
+  dispatch :merge2hashes do
+    repeated_param 'Variant[Hash, Undef, String[0,0]]', :args # this strange type is backwards compatible
+    return_type 'Hash'
+  end
+
+  dispatch :merge_iterable3 do
+    repeated_param 'Iterable', :args
+    block_param 'Callable[3,3]', :block
+    return_type 'Hash'
+  end
+
+  dispatch :merge_iterable2 do
+    repeated_param 'Iterable', :args
+    block_param 'Callable[2,2]', :block
+    return_type 'Hash'
+  end
+
+
+  def merge2hashes(*hashes)
+    accumulator = {}
+    hashes.each {|h| accumulator.merge!(h) if h.is_a?(Hash)}
+    accumulator
+  end
+
+  def merge_iterable2(iterable, &block)
+    accumulator = {}
+    enum = Puppet::Pops::Types::Iterable.asserted_iterable(self, iterable)
+    enum.each do |v|
+      r = yield(accumulator, v)
+      accumulator.merge!(r) if r.is_a?(Hash)
+    end
+    accumulator
+  end
+
+  def merge_iterable3(iterable, &block)
+    accumulator = {}
+    enum = Puppet::Pops::Types::Iterable.asserted_iterable(self, iterable)
+    if enum.hash_style?
+      enum.each do |entry|
+        r = yield(accumulator, *entry)
+        accumulator.merge!(r) if r.is_a?(Hash)
+      end
+    else
+      begin
+        index = 0
+        loop do
+          r = yield(accumulator, index, enum.next)
+          accumulator.merge!(r) if r.is_a?(Hash)
+          index += 1
+        end
+      rescue StopIteration
+      end
+    end
+    accumulator
+  end
+end
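
The branching in `merge_iterable3` can be approximated in plain Ruby, without Puppet's `Iterable` wrapper. In this sketch, `iterative_merge3` is a hypothetical name, `is_a?(Hash)` stands in for `hash_style?`, and `iterable.each` without a block stands in for the enumerator that `asserted_iterable` returns; none of these substitutions are taken from the commit.

```ruby
# Hypothetical plain-Ruby approximation of the three-parameter form.
def iterative_merge3(iterable)
  accumulator = {}
  if iterable.is_a?(Hash)
    # Hash-style input: the block receives the key and the value separately.
    iterable.each do |key, value|
      result = yield(accumulator, key, value)
      accumulator.merge!(result) if result.is_a?(Hash)
    end
  else
    # Other input: pull values from an external enumerator and track the index.
    enum = iterable.each
    index = 0
    loop do
      # enum.next raises StopIteration once the input is exhausted;
      # Kernel#loop rescues that exception and ends the loop.
      result = yield(accumulator, index, enum.next)
      accumulator.merge!(result) if result.is_a?(Hash)
      index += 1
    end
  end
  accumulator
end

p iterative_merge3(%w[a b c]) { |_hsh, idx, val| { idx => val } }
p iterative_merge3({ 'a' => 'A', 'b' => 'B' }) { |_hsh, key, val| { key => "#{key}#{val}" } }
```

Because `Kernel#loop` rescues `StopIteration` itself, this sketch needs no explicit `rescue` clause around the loop.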

spec/functions/merge_spec.rb

Lines changed: 38 additions & 7 deletions
@@ -2,17 +2,15 @@
 
 describe 'merge' do
   it { is_expected.not_to eq(nil) }
-  it { is_expected.to run.with_params.and_raise_error(Puppet::ParseError, %r{wrong number of arguments}i) }
-  it { is_expected.to run.with_params({}, 'two').and_raise_error(Puppet::ParseError, %r{unexpected argument type String}) }
-  it { is_expected.to run.with_params({}, 1).and_raise_error(Puppet::ParseError, %r{unexpected argument type (Fixnum|Integer)}) }
+  it { is_expected.to run.with_params({}, 'two').and_raise_error(ArgumentError, Regexp.new(Regexp.escape("rejected: parameter 'args' expects a value of type Undef, Hash, or String[0, 0], got String"))) }
+  it { is_expected.to run.with_params({}, 1).and_raise_error(ArgumentError, %r{parameter 'args' expects a value of type Undef, Hash, or String, got Integer}) }
   it { is_expected.to run.with_params({ 'one' => 1, 'three' => { 'four' => 4 } }, 'two' => 'dos', 'three' => { 'five' => 5 }).and_return('one' => 1, 'three' => { 'five' => 5 }, 'two' => 'dos') }
 
-  it {
-    pending 'should not special case this'
-    is_expected.to run.with_params({}).and_return({})
-  }
+  it { is_expected.to run.with_params.and_return({}) }
+  it { is_expected.to run.with_params({}).and_return({}) }
   it { is_expected.to run.with_params({}, {}).and_return({}) }
   it { is_expected.to run.with_params({}, {}, {}).and_return({}) }
+
   describe 'should accept empty strings as puppet undef' do
     it { is_expected.to run.with_params({}, '').and_return({}) }
   end
@@ -24,4 +22,37 @@
       .with_params({ 'key1' => 'value1' }, { 'key2' => 'value2' }, 'key3' => 'value3') \
       .and_return('key1' => 'value1', 'key2' => 'value2', 'key3' => 'value3')
   }
+  describe 'should accept iterable and merge produced hashes' do
+
+    it { is_expected.to run \
+      .with_params([1,2,3]) \
+      .with_lambda {|hsh, val| { val => val } } \
+      .and_return({ 1 => 1, 2 => 2, 3 => 3 }) }
+
+    it { is_expected.to run \
+      .with_params([1,2,3]) \
+      .with_lambda {|hsh, val| { val => val } unless val == 2} \
+      .and_return({ 1 => 1, 3 => 3 }) }
+
+    it { is_expected.to run \
+      .with_params([1,2,3]) \
+      .with_lambda {|hsh, val| raise StopIteration.new if val == 3; { val => val } } \
+      .and_return({ 1 => 1, 2 => 2 }) }
+
+    it { is_expected.to run \
+      .with_params(['a', 'b', 'b', 'c', 'b']) \
+      .with_lambda {|hsh, val| { val => (hsh[val] || 0) + 1 } } \
+      .and_return({ 'a' => 1, 'b' => 3, 'c' => 1 }) }
+
+    it { is_expected.to run \
+      .with_params(['a', 'b', 'c']) \
+      .with_lambda {|hsh, idx, val| { idx => val } } \
+      .and_return({ 0 => 'a', 1 => 'b', 2 => 'c'}) }
+
+    it { is_expected.to run \
+      .with_params({'a' => 'A', 'b' => 'B', 'c' => 'C'}) \
+      .with_lambda {|hsh, key, val| { key => "#{key}#{val}" } } \
+      .and_return({ 'a' => 'aA', 'b' => 'bB', 'c' => 'cC'}) }
+
+  end
 end
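
The non-iterative expectations in this spec (no arguments, empty strings accepted as undef, other types rejected, rightmost key wins) can be mirrored in a plain-Ruby sketch. In the commit, the type check is done by Puppet's dispatcher against `Variant[Hash, Undef, String[0,0]]`; the hypothetical `merge_hashes` below performs an equivalent check by hand, and its error message is the sketch's own wording, not Puppet's.

```ruby
# Hypothetical plain-Ruby mirror of the backwards-compatible merge.
# Accepts hashes, nil and empty strings; anything else is rejected.
def merge_hashes(*args)
  args.each do |arg|
    next if arg.is_a?(Hash) || arg.nil? || arg == ''
    raise ArgumentError, "expects a Hash, nil, or empty string, got #{arg.class}"
  end
  # Only hashes contribute to the result; nil and '' are silently ignored.
  args.each_with_object({}) { |arg, acc| acc.merge!(arg) if arg.is_a?(Hash) }
end

p merge_hashes
p merge_hashes({}, '')
p merge_hashes({ 'one' => 1, 'three' => { 'four' => 4 } },
               { 'two' => 'dos', 'three' => { 'five' => 5 } })
```

The merge is shallow: in the last call the nested hash under `'three'` is replaced as a whole rather than merged key by key, which matches the `and_return` value in the spec.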
