I’m glad this is being added for the ergonomic benefits. But the number of times the article points out cases where conventions will need to be formed by the community makes me fear this will add to the already longer-than-I’d-like list of anti-patterns and footguns that Go’s design and type system make easy to fall into.
FTA: “Maybe you just want to use the range keyword to iterate over every element of your collection. Easy enough.
    func (s Slice) All() func(yield func(i int) bool) {
        return func(yield func(i int) bool) {
            for i := range s {
                if !yield(s[i]) {
                    return
                }
            }
        }
    }
”
So, to write this “easy enough” example correctly, you have to write func five times in order to, if I understand this correctly, write a function returning a function that takes a function as an argument?
    IEnumerable<int> ProduceEvenNumbers(int upto)
    {
        for (int i = 0; i <= upto; i += 2)
        {
            yield return i;
        }
    }
Yes, that introduces “magic” where the runtime figures out that ProduceEvenNumbers won’t continue, but why give such functions the flexibility not to listen to such requests (in the golang version, is forgetting the if and just yielding instead ever useful?)
I don't know why most examples of Go iterators are so unnecessarily convoluted. The authors probably don't know the language as well as they think they do.
Go is far from perfect and certainly has a lot of flaws, but IMO the iterators design fits well into the language and does not deserve the criticism.
The range only needs a function, so what is the point of calling a function that returns a function?
This is enough to make a range iterator, you don't have to call `All` manually:
    func (slice Slice) All(yield func(string) bool) {
        for _, s := range slice {
            if !yield(s) {
                break
            }
        }
    }
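At the call site it looks like this (a minimal sketch, assuming something like `type Slice []string` and Go 1.23+). Note that you range over the method value itself, without calling it:

    s := Slice{"a", "b", "c"}
    for v := range s.All {
        fmt.Println(v)
    }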
In Go, the expression `object.method` already returns the method as a function, with `object` already bound in the scope of the function. The semantics are such that if you have this definition:
    type T string

    func (t T) f(a int) int {
        return a
    }
Then:
- `T.f` returns `func(t T, a int) int`
- `T("whatever").f` returns `func(a int) int` where `t` is already set to `"whatever"`
- `T("whatever").f(42)` calls the function and returns `42`
This is true, you can just return the method without invoking it.
I don't tend to use that convention though, because most of the time when I use this pattern, I'm returning a function that uses a parameter I pass in. You can see this in the predicate example. I prefer keeping my call sites consistent and always using functions that return a function when invoked, rather than sometimes invoking them and sometimes just passing the func reference. YMMV.
That looks a lot better, indeed. I still don’t understand why every implementation has to do that `if !yield(s)` check, though.
Is it ever useful to be able to do something there before returning? If so, wouldn’t it be cleaner to implement this as an interface with two methods, one that simply has to yield for every item produced and an optional one that gets called when the runtime knows it won’t ever run the first one anymore?
Because yield is implemented as a callback, there is no compiler magic to automatically stop the iteration when the caller side uses break or continue, so this condition cannot really be avoided.
Also there are many cases where you might need to have some special logic after the last iteration, so simply killing the underlying function is not an option.
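For example, here's a sketch of an iterator that has to run cleanup after the last iteration regardless of whether the caller breaks early (error handling elided; assumes the usual bufio/os imports):

    func Lines(path string) func(yield func(string) bool) {
        return func(yield func(string) bool) {
            f, err := os.Open(path)
            if err != nil {
                return
            }
            defer f.Close() // runs after the final yield, even if the caller breaks
            sc := bufio.NewScanner(f)
            for sc.Scan() {
                if !yield(sc.Text()) {
                    return // caller broke out of the loop; the defer still fires
                }
            }
        }
    }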
I am all for having them in the language. However, the way they have been designed, or how the new magic fields for structure alignment (in Go 1.23) are being designed, shows how ironic it is to attack other languages as PhD-level complexity and then come out with such special-case designs.
I give it 10 more years of such special-cased improvements for Go to end up no better than the languages the community regularly complains about, while Go remains "perfect".
All this added complexity seems to be just syntactic sugar, too. The examples here clearly show that we could write clean code achieving the same functionality without this feature.
We could also just use an Iterator pattern for such use cases too.
Go used to be such a simple language. I wonder what's driving them to keep adding complexity to the language itself. It makes me respect Harelang's goal of "freeze the language once 1.0 is out" a lot more.
The proposal [0] explains the motivation behind this "syntax sugar":
There is no standard way to iterate over a sequence of values in Go. For lack of any convention, we have ended up with a wide variety of approaches. Each implementation has done what made the most sense in that context, but decisions made in isolation have resulted in confusion for users.
In the standard library alone, we have archive/tar.Reader.Next, bufio.Reader.ReadByte, bufio.Scanner.Scan, container/ring.Ring.Do, database/sql.Rows, expvar.Do, flag.Visit, go/token.FileSet.Iterate, path/filepath.Walk, runtime.Frames.Next, and sync.Map.Range, hardly any of which agree on the exact details of iteration. Even the functions that agree on the signature don’t always agree about the semantics. For example, most iteration functions that return (T, bool) follow the usual Go convention of having the bool indicate whether the T is valid. In contrast, the bool returned from runtime.Frames.Next indicates whether the next call will return something valid.
When you want to iterate over something, you first have to learn how the specific code you are calling handles iteration. This lack of uniformity hinders Go’s goal of making it easy to move around in a large code base. People often mention as a strength that all Go code looks about the same. That’s simply not true for code with custom iteration.
Wouldn't that always require you to have a dedicated type?
I don't know D but I'm imagining what python has.
The nice thing about this Go approach is that you can have just functions, as shown in the examples. One practical use is functions building range functions on top of existing interfaces. Or, as the filter examples show, you can create filter generators on regular slices/maps.
Creating a small wrapper type over a slice/map is trivial in Go (`type X[T any] []T`), and then you can define the range functions as methods on that slice type. If they allowed generic instance methods it would be even simpler.
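For instance (a sketch, with made-up names):

    type List[T any] []T

    // All has the right shape to be used directly as a range function.
    func (l List[T]) All(yield func(T) bool) {
        for _, v := range l {
            if !yield(v) {
                return
            }
        }
    }

    // for v := range List[int]{1, 2, 3}.All { ... }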
That's not the only point of methods, even if it's the only one the designers of Go envisioned. Another very relevant purpose is method chaining syntax. That is, with instance methods you can write a.b().c(), with functions you have to write c(b(a)). This turns out to be extremely relevant for longer chains.
Of course, other than generic methods, this could also be supported by just supporting universal function call syntax. That is, the compiler could simply take f(x, a) and x.f(a) to be perfectly equivalent, regardless of whether f is a method of x's type or a free-floating function. There is some minor complication because of backwards compatibility, but that's easily fixed (the syntax you use can prefer the function f vs the method f if there is any ambiguity).
On the other hand, generic methods can be extremely useful in their own right, for other reasons as well. Having generic methods in an interface, such that a type has to have a generic method to implement that interface, is perfectly reasonable as a feature request - it wouldn't contradict anything in the spirit of Go. Of course, the implementation can have problems and trade-offs, I'm not claiming this is an easy feature to implement. But I don't think it's excluded.
> Creating a small wrapper type over a slice/map is trivial in Go
And yet it's specifically one thing rsc did not want. Further issues described in the rangefunc proposals:
- it would require the desugaring to run off of method-set analysis of userland types, something which does not currently exist
- it severely complicates resource management around the iterator, as you need a 3-step iteration for resource-bearing iterators (acquire iterator, defer cleanup, perform iteration)
And one not actually listed explicitly: for the limited amount of optimisations the Go compiler does, internal iteration is a lot easier to optimise as it pretty much inlines down to a `for` loop, the termination of which is much easier to analyse than bouncing through a bunch of pull calls.
Not only that, but `for range` works off of underlying type, so this is already valid Go:
import "fmt"
type Foo []int
func main() {
f := Foo([]int{1, 2, 3})
for _, v := range f {
fmt.Println(v)
}
}
One could approach it the other way: once more projects adopt all kinds of wrapper functions and types, the deficiency of Go will become more widespread knowledge, as the compiler gets progressively less able to cope with the added abstractions.
Hopefully it will put the common fallacy of "Go or Rust" to rest, as the weight classes and capabilities are on opposite ends of the spectrum. A much closer comparison is "Rust or C# or Swift or Kotlin", if one is looking for a Rust alternative that reduces decision fatigue by not forcing many small decisions, while conceding to a reasonable extent certain areas where Rust excels.
In any case, for its touted simplicity Go sure doesn't look like a simple and straightforward to follow language anymore.
The 3-step approach to iteration is also well solved[0] in C#, and works even in rather complex cases like `File.ReadLines(...)` where the line iterator internally handles IO, file handle acquisition and disposal. Just `foreach (var msg in File.ReadLines("messages.jsonl"))` and you won't be able to make a footgun out of it.
This also applies to the usage chained with filter/map/etc.
    var messages = File
        .ReadLines("messages.jsonl")
        .Select(line => JsonSerializer.Deserialize<Message>(line))
        .ToArray();
Any new feature requires something that currently doesn't exist in the compiler. Just because the change would be larger doesn't make it worse, or at least this can't be the only argument.
And if you do implement interface-based iteration with an Iterator interface, it's not hard to also add a ClosableIterator interface and have the loop handle the auto-close as well.
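Something like this hypothetical pair; to be clear, this is the rejected interface-based design, not anything in the language:

    // Hypothetical: an external-iteration interface Go did not adopt.
    type Iterator[T any] interface {
        Next() (T, bool)
    }

    // Hypothetical: the loop would call Close automatically on break/return.
    type ClosableIterator[T any] interface {
        Iterator[T]
        Close() error
    }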
"Community" is kinda the key word there, because when someone is "along for the ride" over a period of many years then a ratcheting of complexity is easier to cope with than it is to arrive cold in a big/complex language.
I agree with the idea, as I imagine someone landing today on C++23, C23, C# 12, or Java 22 will have a much harder time than those of us who have known them pretty much since their first baby steps as programming languages.
As per my own experience since the mid 1980's.
However, exactly because Go had such a history of language evolution to learn from (everything since FORTRAN came to be in 1958), maybe some of the early decisions could have been made better, instead of the Apple style of "we are not doing X", followed years later by "X is actually something we want".
This is why we need to keep reinventing languages to grow another generation of juniors into seniors, because they are not intelligent enough to correctly use abstraction.
`yield` being a function that is passed into the iterator seems like suboptimal design to me. Questions like "What happens if I store `yield` somewhere and call it long after the loop ended?" and "What happens if I call `yield` from another thread during the loop?" naturally arise. None of this can happen with a `yield` keyword like in JavaScript or C#. So why did the Go-lang people go for this design?
> Questions like "What happens if I store `yield` somewhere and call it long after the loop ended?" and "What happens if I call `yield` from another thread during the loop?" naturally arise.
Nothing special. `yield` is just a normal function. Once you realize this, it actually is very easy to reason about. I just think the naming is confusing. I think about it as `body`.
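Concretely, ranging over a function is just calling it with the loop body as the callback. Reusing the `All` method from upthread, these two are equivalent:

    for s := range slice.All {
        fmt.Println(s)
    }

    slice.All(func(s string) bool {
        fmt.Println(s)
        return true // returning false would be a break
    })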
> Questions like "What happens if I store `yield` somewhere and call it long after the loop ended?" and "What happens if I call `yield` from another thread during the loop?" naturally arise.
The fact that you can store `yield` somewhere allows for more flexible design of iterator functions, e.g. (written in a hurry as a proof of concept so will panic at the end): https://go.dev/play/p/QpVYmmC6g5b?v=gotip
Those hairy details may be hard to remember (or even decide), but they won't matter for most users: most users will just use `yield` in the simplest way, without storing it or calling it from another goroutine.
It is an explicit goal of the Go team to minimize the number of keywords in the language. Simple languages have fewer keywords, so Go must have few keywords. https://go.dev/ref/spec#Keywords
Look how simple that is.
This is why things like ‘close(channel)’ are magic builtin functions, not keywords (more complicated) or a method like ‘channel.Close’ (works with interfaces and consistent with files and such, so not simple).
Languages where ‘yield’ is a keyword use a fundamentally different design (external vs internal iteration). I don’t think it’s plausible that the Go team rejected this design because it would require another keyword. They presumably rejected it because of the additional complexity (you either need some form of coroutines or the compiler needs to convert the iterator code to a state machine).
> It is an explicit goal of the go team to minimize the number of keywords in the language.
It's understandable - because unfortunately people judge languages by very shallow metrics. Several times I've seen people use "number of keywords" as a proxy for language complexity.
However, that's completely misguided. `static` in C++ (and, IMO, `for` in Go) demonstrate that overloading a keyword to mean multiple things is harder to understand than having a larger number of more meaningful keywords.
From my vantage point, pointers without arithmetic are typically called references, as opposed to "true" pointers. I did not mean it in a derogatory way; both have their place, and Go is a great language even without (or maybe despite lacking) what I would call "true" pointers.
The fact that, in Go, you can have pointers to pointers, and reassign pointer variables like any other, would imply, IMO, that pointers are first-class values, and so they are true pointers, even without being able to do pointer arithmetic on them.
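For illustration:

    x, y := 1, 2
    p := &x         // a pointer is an ordinary value
    pp := &p        // ...so you can take a pointer to it
    *pp = &y        // and reassign p through pp
    fmt.Println(*p) // prints 2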
That's cheating for dev-rel marketing. And it's contradictory, because many more keywords could be (magical or normal) functions, like in some other languages.
    // Only care about the iteration count
    for range aContainer { ... }

    // Just the values
    for v := range myChannel { ... }

    // Indexes and values (or keys and values for a map)
    for i, v := range mySlice { ... }
What's the rationale behind Go choosing internal iteration (the iterator calls a function for each value, like Ruby) over external (the iterator returns each value, like Python and C#)?
My understanding is that internal iteration makes it easier to write iterators (producers) but harder to write the consuming code. That's why Go needs to re-write the body of each `for` loop as a function body, including special handling for `break`, `return` etc.
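Roughly, a sketch of that rewrite (simplified; the real desugaring also handles goto, defer, and panics, and `seq` and `use` here are just placeholders):

    // What you write:
    for v := range seq {
        if v > 3 {
            break
        }
        use(v)
    }

    // Approximately what the compiler emits:
    seq(func(v int) bool {
        if v > 3 {
            return false // break terminates the iteration
        }
        use(v)
        return true // falling off the end means continue
    })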
External iteration OTOH makes it harder to write producers but easier to write consumers. Python and C# therefore allow external iterators to be written via coroutines/generators.
Wouldn't Go's goroutines make the coroutine approach to external iterators straightforward? Whereas the re-writing necessary for internal iterators seems convoluted?
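As it happens, Go 1.23's iter package ships a pull adapter built on exactly that idea: iter.Pull turns a push iterator into a pair of next/stop functions (implemented, as I understand it, with a lightweight runtime coroutine rather than a full goroutine, to keep the switch cheap). A sketch, with `seq` assumed to be an iter.Seq[int] and `use` a placeholder:

    next, stop := iter.Pull(seq)
    defer stop() // releases the underlying coroutine
    for {
        v, ok := next()
        if !ok {
            break
        }
        use(v)
    }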
Isn’t the only real difference that the yield function is being passed into the iterator instead of being a reserved word? I don’t think it’s clunky, although it took a few minutes for me to get it.
No. Before C# got generators, some of the machinery had to be implemented manually; with yield, the compiler generates the necessary IEnumerable implementation.
I was hoping the blog author would have revealed some plans for supporting a new iteration API in dolt. The range over func API is particularly useful if you need to iterate and compute over something that doesn’t all fit in memory (as is necessary for a slice and map).
The syntax just doesn’t sit right with me due to some reason. It gives me the same heebie jeebies as Python’s decorators. Not a fan of either. Maybe I need to get used to them.
Defining a Go iterator is similar to defining a Python decorator that takes an argument. Both involve defining a function that returns another function, with the inner function having another function as its parameter.
This isn’t a particularly difficult concept, but manipulating functions in this way does feel unusual in such heavily-procedural languages. This has two downsides:
* Programmers who only use those languages are unfamiliar with the concept, and
* The languages’ syntaxes aren’t designed to make it particularly clear.
This feels like it's undermining channels, the feature go really wants you to use in other places too (but people tend to still use mutexes). Channels aren't quite as lazy (if you use an unbuffered one you supply one element in advance), but they're close.
Channels require a different goroutine to send values while also receiving them (which is how you’d have two loops communicating, essentially what you get from range funcs).
There’s nothing stopping you from doing this but it does mean you are introducing the requirement of thread safety in your code, in the case where the iterator is stateful.
I would argue anything that needs a range func beyond the simple functional things like filters is probably a stateful iterator (or generator if you’d like), and as such having range funcs is a great way to write code that doesn’t go wrong due to parallelism.
Now you could add two way communication to your channel iterator (or any other locking mechanism) for safety but honestly I think range funcs perfectly solve this use case, and have already used them to keep my code more readable and correct.
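A tiny example of the stateful case, where all iteration state lives in locals rather than in an object that would need locking (a minimal sketch):

    func Fib() func(yield func(int) bool) {
        return func(yield func(int) bool) {
            a, b := 0, 1
            for yield(a) { // keep producing until the caller breaks
                a, b = b, a+b
            }
        }
    }

    // for n := range Fib() { if n > 100 { break }; fmt.Println(n) }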
All this said, while I’m still a fan of Go and have used it regularly since 0.9 as well as contributed to the language, I will agree with the other comments that sometimes the language design bends over backward to be purist at the cost of having to add more footguns in user land.
For this use case, not only do channels add an insane amount of overhead, they're also broken in all sorts of ways, e.g. there is no way to properly clean up resources around the iteration, and they add more opportunities for race conditions since the object under iteration has to be shared with the channel's producing goroutine.
Here's a rough implementation of iterators using channels that I was experimenting with. It needs some syntactic sugar to do proper cleanup of the goroutine if the loop exits early; probably the iterator function needs to return a func that closes the channel, to be called after the loop, plus some error handling/closed-channel checking.
The yield functions only exist as syntactic sugar to make it look like iterators in other languages and to make it clear where the value emission point is (I had mentioned this in tweets and skeets when I was originally working on this, if it didn't make it into the gist).
An unbuffered channel is really a scheduler abstraction. Consuming from an unbuffered channel blocks; the thread can then enter and immediately begin executing the goroutine that was blocked on producing. The goroutine is acting like a closure around the channel state.
I had some further experiments interleaving these iterators, but didn't clean it up at the time before I had sufficiently convinced myself it was possible and I got distracted with other things.
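The shape is roughly this (a paraphrased sketch with assumed names, needing "sync" imported): the iterator hands back the channel plus a stop func the caller must run if it exits the loop early, otherwise the producing goroutine leaks.

    func Iterate(s []int) (<-chan int, func()) {
        ch := make(chan int)
        done := make(chan struct{})
        go func() {
            defer close(ch)
            for _, v := range s {
                select {
                case ch <- v: // hand the next value to the consumer
                case <-done: // consumer gave up early
                    return
                }
            }
        }()
        var once sync.Once
        stop := func() { once.Do(func() { close(done) }) }
        return ch, stop
    }

    // ch, stop := Iterate(xs)
    // defer stop()
    // for v := range ch { ... }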
If you have shipped some task to a channel, or are waiting for some work to complete on a channel, there is no native way to propagate the error that your task may have failed with. Also, if an error did happen while processing the task you put on the channel, the stack trace suddenly is not the whole story anymore. Channels also have no way to make sure the context.Context is reasonably propagated.