bytes, strings: add iterator forms of existing functions #61901

We propose to add the following functions to package bytes and package strings, so that callers can iterate over the results without allocating the entire result slice. This text shows only the strings package form.

This is one of a collection of proposals updating the standard library for the new 'range over function' feature (#61405). It would only be accepted if that proposal is accepted. See #61897 for a list of related proposals.

- - -

Iterating over lines is an incredibly common operation that we’ve resisted adding only because we didn’t want to encourage allocation of a potentially large slice. Iterators provide a way to finally add it.

```
// Lines returns an iterator over the newline-terminated lines in the string s.
// The lines yielded by the iterator include their terminating newlines.
// If s is empty, the iterator yields no lines at all.
// If s does not end in a newline, the final yielded line will not end in a newline.
func Lines(s string) iter.Seq[string] {
	return func(yield func(string) bool) {
		for s != "" {
			var line string
			if i := strings.Index(s, "\n"); i >= 0 {
				line, s = s[:i+1], s[i+1:]
			} else {
				line, s = s, ""
			}
			if !yield(line) {
				return
			}
		}
	}
}
```

Iterating over bytes in a string is common and too difficult, since range ranges over runes. This function will inline to the obvious for loop (because we will make sure it does):

```
// Bytes returns an iterator over bytes in s, yielding both the index and the byte.
func Bytes(s string) iter.Seq2[int, byte] {
	return func(yield func(int, byte) bool) {
		for i := range len(s) {
			if !yield(i, s[i]) {
				return
			}
		}
	}
}
```

Iterating over runes is served by a regular range loop, but like slices.All and maps.All, it could be useful as an input to other iterator adapters. The name is Runes, not Seq or All, so that it's clear at call sites what is being iterated over (runes, not bytes).

```
// Runes returns an iterator over runes in s, yielding both the start index and the rune.
func Runes(s string) iter.Seq2[int, rune] {
	return func(yield func(int, rune) bool) {
		for i, c := range s {
			if !yield(i, c) {
				return
			}
		}
	}
}
```

Similar to Lines, there should be iterator forms of Split, Fields, and Runes, to avoid requiring the allocation of a slice when the caller only wants to iterate over the individual results. If we were writing the library from scratch, we might use the names Split, Fields, and Runes for the iterator-returning versions, and code that wanted the full slice could use slices.Collect. But that's not an option here, so we add a distinguishing Seq suffix. We do not expect that new functions will use the Seq suffix. For example, the function above is Lines, not LinesSeq.

```
// SplitSeq returns an iterator over all substrings of s separated by sep.
// The iterator yields the same strings that would be returned by Split(s, sep),
// but without constructing the slice.
func SplitSeq(s, sep string) iter.Seq[string] {
	if sep == "" {
		return runeSplitSeq(s)
	}
	return func(yield func(string) bool) {
		for {
			i := strings.Index(s, sep)
			if i < 0 {
				break
			}
			frag := s[:i]
			if !yield(frag) {
				return
			}
			s = s[i+len(sep):]
		}
		yield(s)
	}
}

func runeSplitSeq(s string) iter.Seq[string] {
	return func(yield func(string) bool) {
		for s != "" {
			_, size := utf8.DecodeRuneInString(s)
			if !yield(s[:size]) {
				return
			}
			s = s[size:]
		}
	}
}
```

```
// SplitAfterSeq returns an iterator over substrings of s split after each instance of sep.
func SplitAfterSeq(s, sep string) iter.Seq[string]
```
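Only the signature is given above. One possible body, modeled on the SplitSeq code earlier in this text, is sketched here under the assumption that the iterator should yield the same strings as SplitAfter(s, sep). The lowercase names splitAfterSeq and collectAfter are local to this example.

```go
package main

import (
	"fmt"
	"iter"
	"slices"
	"strings"
	"unicode/utf8"
)

// splitAfterSeq is a sketch of an iterator form of strings.SplitAfter.
func splitAfterSeq(s, sep string) iter.Seq[string] {
	return func(yield func(string) bool) {
		rest := s // local copy, so the iterator can be ranged over more than once
		if sep == "" {
			// An empty separator splits after each UTF-8 sequence.
			for rest != "" {
				_, size := utf8.DecodeRuneInString(rest)
				if !yield(rest[:size]) {
					return
				}
				rest = rest[size:]
			}
			return
		}
		for {
			i := strings.Index(rest, sep)
			if i < 0 {
				break
			}
			end := i + len(sep) // keep the separator with the piece before it
			if !yield(rest[:end]) {
				return
			}
			rest = rest[end:]
		}
		yield(rest) // the remainder after the last separator, possibly empty
	}
}

// collectAfter gathers the yielded pieces into a slice.
func collectAfter(s, sep string) []string {
	return slices.Collect(splitAfterSeq(s, sep))
}

func main() {
	fmt.Printf("%q\n", collectAfter("a,b,c", ",")) // prints ["a," "b," "c"]
}
```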

```
// FieldsSeq returns an iterator over substrings of s split around runs of
// whitespace characters, as defined by unicode.IsSpace. ...
func FieldsSeq(s string) iter.Seq[string]
```

```
// FieldsFuncSeq returns an iterator over substrings of s split around runs of
// Unicode code points satisfying f(c). ...
func FieldsFuncSeq(s string, f func(rune) bool) iter.Seq[string]
```
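As with SplitAfterSeq, only signatures are shown for the Fields variants. A minimal sketch of one way to write the function-driven form follows, assuming it should yield the same strings as FieldsFunc(s, f). The names fieldsFuncSeq and collectFields are local to this example.

```go
package main

import (
	"fmt"
	"iter"
	"slices"
	"unicode"
)

// fieldsFuncSeq is a sketch of an iterator form of strings.FieldsFunc.
func fieldsFuncSeq(s string, f func(rune) bool) iter.Seq[string] {
	return func(yield func(string) bool) {
		start := -1 // byte offset where the current field began, or -1 between fields
		for i, r := range s {
			if f(r) {
				if start >= 0 {
					// A separator rune ends the current field.
					if !yield(s[start:i]) {
						return
					}
					start = -1
				}
			} else if start < 0 {
				start = i
			}
		}
		if start >= 0 {
			yield(s[start:]) // final field with no trailing separator
		}
	}
}

// collectFields gathers whitespace-separated fields, which is what a
// FieldsSeq built on fieldsFuncSeq with unicode.IsSpace would yield.
func collectFields(s string) []string {
	return slices.Collect(fieldsFuncSeq(s, unicode.IsSpace))
}

func main() {
	fmt.Printf("%q\n", collectFields("  a bb\tc \n")) // prints ["a" "bb" "c"]
}
```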

