Turns out other languages have evolved in the meantime and come up with much better solutions. Among dynamic languages in particular, ES6 modules combined with npm's approach to package dependency management are the current benchmark. ES6 exposes bindings instead of values, which solves a ton of issues that Python will never be able to fix with its import system.
Can you elaborate on the issues that exposing bindings instead of values fixes or point me to an article that explains this? I'm not too familiar with the ES6 import system and I am curious, as I have used Python's extensively and encountered many of the issues in your article.
It gives a read-only peek at the scope of the imported module.
import {foo} from 'bar.js';
// you can't re-bind `foo` from here, but if
// a function defined in 'bar.js' mutates it,
// the change is reflected here too.
It enables circular dependencies for free, and makes it easy for tooling to discard unused code (tree shaking).
1) Notice how you import something whose trailing path is "go-dockerclient" but the actual package name is "docker"?
In npm, if you type x = require("foo"), you always refer to it as 'x.something'. You don't have to guess or read docs.
2) build flags. What the fuck?
// +build fuckme
3) Canonical import paths. Double what the fuck?
For those of you who don't know: if your package lives at "github.com/foo/bar" but you write 'package bar // import "github.com/google/bar"', then a user of your package will not be able to compile it unless they move it to the import path you declared.
Yeah, no joke... I'm in awe of this stupidity.
4) Poor project-based importing (something actually worse than virtualenv; nowhere near as good as npm).
5) init side effects. E.g. look at how pprof works: you literally do an underscore import and it mutates the behaviour of the http package. Thanks.
1) Yes, that is messed up, but that's just because that package isn't following standard idiom.
2) Build flags are very useful, verging on required, for some situations: for instance, code that needs separate implementations for each OS (see the first sketch after this list).
3) This is very useful for packages that happen to be hosted on GitHub but are served through an alternate path. If users import the GitHub path directly, you are stuck with GitHub: if you later move to self-hosted git or something, their existing codebases won't know how to update and will require a rewrite to use the new location (see the second sketch after this list).
4) I'll be the first to admit it isn't the greatest, but it's rendered mostly moot by the fact that you build single binaries. Deal with a slightly annoying import system in exchange for effortless deployments and not having to worry about libraries and such on the servers? Heck yeah!
5) I agree on this one; it should be something more like doing pprof.Register(*ServeMux) yourself in your program's init (see the last sketch below).
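To make 2) concrete, here's a minimal sketch of per-OS build constraints (the package, file names, and return values are invented for illustration):

// win.go
// +build windows

package tempdir

// Path returns the conventional temp directory on Windows.
func Path() string { return `C:\Temp` }

// other.go
// +build !windows

package tempdir

// Path returns the conventional temp directory everywhere else.
func Path() string { return "/tmp" }

The toolchain compiles exactly one of the two files per target, so callers just write tempdir.Path() with no runtime switching.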
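And 3) in practice: the canonical import comment pins consumers to a stable vanity path even while the code is hosted on GitHub (example.com/bar is a placeholder):

// In the package source, hosted at github.com/foo/bar:
package bar // import "example.com/bar"

// Consumers must now import it by the canonical path:
import "example.com/bar"

// `go get` resolves example.com/bar to the real repository via an
// HTML meta tag served at that URL:
// <meta name="go-import" content="example.com/bar git https://github.com/foo/bar">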
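On 5), the standard library does already export the pprof handlers individually, so you can opt into explicit registration today instead of the blank import. A minimal sketch (the mux layout and address are arbitrary):

package main

import (
	"log"
	"net/http"
	"net/http/pprof" // a normal import, not an underscore import
)

func main() {
	mux := http.NewServeMux()
	// Register the profiling endpoints on our own mux explicitly.
	// (The package's init() still registers on http.DefaultServeMux,
	// but since we never serve DefaultServeMux, that side effect is inert.)
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	log.Fatal(http.ListenAndServe("localhost:6060", mux))
}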
> I'd say Go is the benchmark in case of import system.
I prefer systems that (a) allow namespace nesting within a single package and (b) don't run arbitrary code on import (in Go's case, init()). The latter is particularly pernicious because systems that have arbitrary code execution on import have to define some kind of global ordering on imports (throughout the entire program!), since the order that packages get loaded in very much affects the program semantics. This is called the "static initialization order fiasco" in C++ and Go inherits it too.
"As much as possible" would be doing node_modules/$PACKAGE-$VERSION for all packages (like e.g. Rubygems does). But instead they're doing node_modules/$PACKAGE and still nesting conflicting dependencies, so it's entirely possible to still run into the same issue.
Sometimes you really do legitimately have a lot of static, global state. For instance, consider a program that needs to reference local, national, and/or global geography and its metadata, on a wide scale, randomly. All the countries have subdivisions, and subdivisions of subdivisions, and so on all the way down, which are all inter-referential. You can easily hit 100 MB of state that is essentially constant, and needs to be indexed 50 different ways for millions of function calls per user action that would access it.
Why not manage access to such things in a singleton class?
Singletons are fine, but it's almost always better to lazily initialize them rather than eagerly, to save on startup time. As a bonus, if you have no eager global initialization in your language, you can make import completely side-effect-free, which is a really nice simplification that I wish more languages adopted.
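Since Go's init() came up above, here is a minimal sketch of that lazy-singleton pattern in Go (geoData and loadGeoData are invented stand-ins for the geography example):

package geo

import "sync"

type geoData struct {
	// stand-in for ~100 MB of countries/subdivisions, indexed many ways
	placesByName map[string][]int
}

var (
	once sync.Once
	data *geoData
)

// loadGeoData is a stub for the real, expensive load (parse files,
// build the 50 indexes, etc.).
func loadGeoData() *geoData {
	return &geoData{placesByName: make(map[string][]int)}
}

// Data initializes the singleton on first use rather than at import
// time, so importing this package has no side effects and costs nothing
// at startup.
func Data() *geoData {
	once.Do(func() { data = loadGeoData() })
	return data
}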
The slow startup from imports is my biggest annoyance with Python.
We had a decent-sized library at a previous company that pulled in modules that defined huge register maps, wrapped C++ libraries, etc.
I wrapped all imports in a lazy importer that was triggered by the first attribute access. It brought our script startup times from 3 seconds down to a fraction of that.
Blows me away that this isn't default behavior for ALL modules.
That behaviour feels to me like it may result in faster startup, but also in less predictable performance for code bases with somewhat random access patterns, such as web applications.
You could, I suppose, do some cache warming to make sure the first user request isn't slowed down, but it's one more thing to think about.
>"I wrapped all imports in a lazy importer that was triggered by the first attribute access."
Well, I would argue that putting code at the root of your file is generally the cause of such problems. Granted, I don't know how necessary that is when it comes to "register maps" and "wrapped C++ libraries". But I'd imagine you should be encapsulating them away anyway, and that would include fixing large startup times by design.
And make 300,000 queries over TCP, like getting the list of county names in a state or the list of place names in a county? Because my actual use case involves fuzzy-matching an arbitrary subset, determined by user input, of 18,000,000+ unsanitized data records against geographical place names so they can be assigned geometries.
I'd like the program to finish in 15 seconds or less, please.
If you're making 300K queries over TCP to a database in order to do a calculation, then I'd say you need a much better data structure and/or algorithm. Either that, or do the bulk of the calculations on the database in P/T-SQL, or pre-calculate beforehand so that your online queries are just lookups instead of actual calculations.
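One way to read that suggestion: do a single bulk query up front and build an index, so the 300K round trips become in-memory lookups. A sketch with invented types:

package main

import "strings"

type Place struct {
	Name, County, State string
}

// buildIndex runs once, after a single bulk query for all places.
// Every subsequent lookup is an in-memory map access instead of a
// TCP round trip to the database.
func buildIndex(places []Place) map[string][]Place {
	idx := make(map[string][]Place, len(places))
	for _, p := range places {
		key := strings.ToLower(p.State + "/" + p.County)
		idx[key] = append(idx[key], p)
	}
	return idx
}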
The arc of the discussion, if you go and read the OP's link and inner links, is like this:
- Singletons are bad
- Why are singletons bad?
- They're not "real" OO, they're global state, they obfuscate dependency, etc, etc, etc
- But what if I just legitimately have a ton of global state?
- Use a database! Use a filesystem!
The last point in the chain admits that the first point is mistaken. "Use a database" is just saying "use someone else's code to solve your problem". What if the database is implemented using singletons? What if it uses code that isn't OO at all? All you've accomplished is to say "OO can't solve your problem, use something external". In fact, my problem is solved just fine by using a singleton.