Dependencies, and the danger of import *
jducoeur
Just a quick thought for the programming crowd. I've spent most of this week focused on refactoring Querki -- it's needed it for a while, and the growing need to write a proper test harness is the straw that's breaking the camel's back. (Unit testing is always a good check of your factoring: poorly factored code is usually hard to test.)

Along the way, I'm trying to clean up my dependencies, and I'm starting to realize how dangerous mass imports can really be. I have lots of places where I had quickly typed
import models._
or something like that. (Underscore is Scala's wildcard operator, pretty consistently, so think of that as "import models.*".)

It isn't so much that this automatically makes your code bad. (Although I am beginning to suspect that it is slowing down my compiles, by muddying the dependency tree.) But it promotes lazy thinking: these sorts of mass imports make it simply too easy to use a whole lot of different classes and traits, without thinking about the factoring implications. The result is classes that are often less cohesive than they should be, in no small part because it was slightly too easy to be that way.

So I'm gradually moving towards a coding standard of being more explicit about imports in most cases. Some packages and objects are specifically intended for wildcard import, and that's okay, but I'm coming to the conclusion that the rule of thumb should be to avoid wildcards unless there's a reason to use them.
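
To make the contrast concrete, here is a minimal Scala sketch. The `models` object and its `Thing`, `Space`, and `User` classes are hypothetical stand-ins, not Querki's actual code; the point is only how the two import styles read.

```scala
// Hypothetical stand-in for a `models` package; none of these names come from Querki.
object models {
  case class Thing(name: String)
  case class Space(things: List[Thing])
  case class User(handle: String)
}

object WildcardStyle {
  // Everything in `models` is in scope, so nothing here says which parts are used.
  import models._

  def describe(space: Space): String =
    space.things.map(_.name).mkString(", ")
}

object ExplicitStyle {
  // The import line doubles as a list of this object's real dependencies.
  import models.{Space, Thing}

  def singleton(name: String): Space = Space(List(Thing(name)))

  def describe(space: Space): String =
    space.things.map(_.name).mkString(", ")
}
```

If `ExplicitStyle` later starts needing `User`, the change shows up in the import line, which is a small nudge to ask whether that dependency belongs there.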

I'm curious: what are your habits? Do you prefer wildcard imports or explicit ones?

I definitely find that wildcard imports in python make refactoring and such much harder and also make it hard to see where things are coming from, so I now always either import specific members or import the module and use the module namespace explicitly. (I guess the exception is modules that contain only a well-defined list of constants that are unlikely to dramatically change.)

I generally come from the World Of C-Likes, so it has always been explicit ones for me.

When doing development, I write as if nothing were imported, and only import things as I find I need them. Periodically I go through and trim out units I no longer need. I see no reason to change; while it might be a little faster to code by importing more and leaving in unnecessary ones, I find that decreases the quality of my code. My current way makes it easier for me to do loosely coupled interfaces and highly modular implementations, and makes me think about what I'm importing (well, including) and why.

I don't use wildcard imports; I prefer to know what I'm bringing into my namespace.

In Java, my IDE brought things in for me, but Java has what I consider an unnecessary level of importing required--in my mind things like file access should be in the core.

It meant I never had to say "import java.net.*", however, because I could just name a class and IntelliJ would find it for me and ask to import it (or do that automatically).

In MATLAB there is a more serious implication in that some language features actually are separate products and must be bought separately. Knowing what you're using becomes crucial there, especially if you plan on sharing your code with others.

JS had no decent import mechanism, just including a whole file. Node.js and "require" start to change this and now I have to think about it again...and find myself going with "foo = require ('foo')", basically a wildcard...

Note: Scala's importing is more or less identical to Java's, by design -- one of Scala's great strengths *and* weaknesses is that it is nearly 100% Java-compatible, so that it can use Java libraries and be used to write them. It has a somewhat more flexible and powerful import mechanism (eg, you can import locally to any block), but in general you basically wind up importing mostly the same stuff as Java.
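
As a small illustration of the block-local imports mentioned above, here is a sketch; `LocalImportDemo` and `totalLength` are made-up names for the example.

```scala
object LocalImportDemo {
  def totalLength(words: List[String]): Int = {
    // This import is visible only inside this method body.
    import scala.collection.mutable.ListBuffer

    val lengths = ListBuffer.empty[Int]
    words.foreach(w => lengths += w.length)
    lengths.sum
  }

  // Outside the method, `ListBuffer` is not in scope without its full name.
}
```

Scoping an import this tightly keeps the dependency next to the only code that uses it.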

It meant I never had to say "import java.net.*", however, because I could just name a class and IntelliJ would find it for me and ask to import it (or do that automatically).

Yeah, there's a part of me that misses that feature of Visual Studio (I was very used to doing that in C#). But I now find myself wondering if the discipline of writing my own imports may be healthy.

In MATLAB there is a more serious implication in that some language features actually are separate products and must be bought separately. Knowing what you're using becomes crucial there, especially if you plan on sharing your code with others.

That's increasingly true of Scala nowadays as well. The "language" per se is global -- but it's always been immensely flexible, allowing you to write your own operators and such, and internal DSLs for special purposes ranging from parsing to XML literals are increasingly common.

And now that they're introducing high-powered macros (essentially compile-time Scala code for doing AST transforms), that's going to be more and more true -- you can modify the language in nearly-arbitrary ways with your imports. Damned useful, but a tad dangerous.

JS had no decent import mechanism, just including a whole file.

Yeah, this has always been my biggest complaint about JS, and was a *huge* fight at Memento (where I was leading the client team, and engaged in a year-long argument about Flash vs. JS with one of the other architects). The lack of good modularization tools has always made it hard for me to take JS seriously for really complex projects...

Hmm. Thinking on it more, I guess I'm an "import.*" guy from a philosophical perspective. It's an abstraction leak.

...at work there's a whole group (who I support) that cope with the layers of external dependencies our code has, some for shipping, some for internal development, etc., etc. There are some thorny problems in there but they all basically boil down to compile/link time, disk space, overlapping solutions, or DLL Hell (where two libraries depend on different versions of the same library).

For smaller projects, if I can avoid the last two, and the first two aren't that bad, then I just want to import everything I need so I don't have to think about what I'm importing. The hit to compile times and shipping sizes is not as important to me as things like:

* reduced programming effort
* solid code -- I presume, right or wrong, that libraries are more debugged than my own off-the-cuff solution will be
* the lessons implicit in well-thought-out libraries -- where they've already caught the three stupid things you can do and put up warning signs
* in some cases, connection to a community of like-minded coders

Masking dependencies, however, is a bit dangerous. NPM makes it dead easy (as with other package managers) to include something that pulls in a tree of a hundred other libraries. It'd be nice to have *some* sort of insight into that before selecting a dependency, potentially a "library store" where you can go and shop for solutions to a problem based on what they pull in...

Like, instead of import scalaz._; import Scalaz._, you want us to list all the functions we want to import? Weird...

Fair point, and it's looking like I'm going to wind up in a somewhat nuanced position. There are a fair number of libraries where wildcard import is appropriate, but for querki-internal code I'm leaning against it in most cases.

And there are many libraries where it clearly wants to be forbidden -- for example, while mutable data structures occasionally have their uses, I would never, ever want to find "import scala.collection.mutable._" anywhere in the code.
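
One common way to keep mutable collections visible at the use site is to import the package itself and use it as a prefix. This is only a sketch, and `MutablePrefixDemo` and `countWords` are names invented for the example.

```scala
// Import the package, not its contents, so every mutable type is marked where it is used.
import scala.collection.mutable

object MutablePrefixDemo {
  def countWords(words: Seq[String]): Map[String, Int] = {
    // `mutable.Map` is plainly mutable; after a wildcard import of the package,
    // a bare `Map` would refer to the mutable one instead of the default immutable one.
    val counts = mutable.Map.empty[String, Int]
    for (w <- words) counts(w) = counts.getOrElse(w, 0) + 1
    counts.toMap // hand back an immutable copy
  }
}
```

Here the return type `Map[String, Int]` is still the default immutable `Map`, because nothing has shadowed it.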

Truth is, I don't know scalaz at all well, and am not using it inside Querki yet. It's on my to-do list, but has been hampered by how few decent introductions I've found for it, and *man* it is hard to get started on, especially if you don't have a strong category-theory background to begin with: they toss around an enormous amount of jargon much too casually. So far, other things have been higher priority for my self-education (eg, getting deeper in Akka). Scalaz is approaching the top of my list, although I will admit that I've been putting it off in hopes that they would finish the frikking Functional Programming in Scala book first, but I suspect I'm going to have to start elsewhere...

So, we probably can come to a compromise:
- import libraries the way they are recommended to be imported;
- import other people's code the way you think is better;
- import your own components individually by name.

(If so, that's what I actually practice, to my amazement.)

I live by the Zen of Python, which includes:

"""
Explicit is better than implicit.
Namespaces are one honking great idea -- let's do more of those!
"""

(Though appropriately, it also includes "Although practicality beats purity.")

-- `python -mthis`
