Never Invent Here: the even-worse sibling of “Not Invented Here”

“Not Invented Here”, or “NIH syndrome”, refers to the tendency of organizations to undervalue external or third-party technical assets, even if they are free and easily available, when it is taken to an illogical extreme. The NIH archetype is the enterprise architect who throws person-decade after person-decade into reinventing solutions that exist elsewhere, maintaining this divergent “walled garden” of technology that has no future except by executive force. No doubt, that’s bad. I’m sure that it exists in rich, insular organizations, but I almost never see it in organizations with under a thousand employees. Too often in software, however, I see the opposite extreme: a mentality that I call “Never Invent Here” (NeIH). With that mentality, external assets are overvalued and often implicitly trusted, leaving engineers to spend more time adapting to the quirks of off-the-shelf assets, and less time building assets of their own.

Often, the never-invent-here mentality is couched in other terms, such as business-driven engineering or “Agile” software production. Let’s be honest about this faddish “Agile” nonsense: if engineers are micromanaged to the point of having to justify weeks or even days of their own working time, not a damn thing is ever going to be invented, because no engineer can afford to take the risk; they’re mired in user stories and backlog grooming. The core attitude underlying “Agile” and NeIH is that anything that takes more than some insultingly small amount of time (say, 2 weeks) to build should not be trusted to in-house employees. Rather than building technical assets, programmers spend most of their time in the purgatory of evaluating assets with throwaway benchmarking code and in writing “glue code” to make those third-party assets work together. The rewarding part of the programmer’s job is written off as “too hard”, while programmers are held responsible for the less rewarding part of the job: gluing the pieces together in order to meet parochial business requirements. Under such a regime, there is little room for progress or development of skills, since engineers are often left to deal with the quirks of unproven “bleeding edge” technologies rather than either (a) studying the work of the masters, or (b) building their own larger works and having a chance to learn from their own mistakes.

Never-invent-here engineering can be either positive or negative for an engineer’s career, depending on where she wants to go, but I tend to view its effects as negative for more senior talent. To the good, it assists in buzzword bingo. She can add Spring and Hibernate and Maven and Lucene to her CV, and other employers will recognize those technologies by name, and that might help her get in the door. To the bad, it makes it hard for engineers to progress beyond the feature-level stage, because meatier projects just aren’t done in most organizations when it’s seen as tenable for non-coding architects and managers to pull down off-the-shelf solutions and expect the engineers to “make the thingy work with the other thingy”.

Software engineers don’t mind writing some glue code, because even the best jobs involve grunt work, but no one wants to be stuck doing only that. While professional managers often ignore the fact, engineers can be just as ambitious as managers are; the difference is that their ambition is focused on project scope and impact rather than organizational ascent or number of people managed. Entry-level engineers are satisfied to fix bugs and add small features, for a year or two. Around 2 years in, they want to be working on (and suggesting) major features and moving to the project level. At 5 years, they’re ready for bigger projects, initiatives, and infrastructure, and to lead multi-engineer projects. And so on. Non-technical managers may ignore this, preferring to institute the permanent juniority of “Agile”, but they do so at their peril.

One place where this is especially heinous is in corporate “data science”. It seems like 90 percent (possibly more) of professional “data scientists” aren’t really being asked to develop or implement new algorithms, but are stuck in a role that has them answering short-term business needs, banging together off-the-shelf software, and getting mired in operations rather than fundamental research. Of course, if that’s all that a company really needs, then it probably doesn’t make sense for it to invest in the more interesting stuff, and in that case… it probably doesn’t need a true data scientist. I don’t intend to say that data cleaning and glue code are “bad” because they’re a necessary part of every job. They don’t require a machine learning expert, is all.

People ask me why I dislike the Java culture, and I’ve written much about that, but I think that one of Java’s worst features is that it enables the never-invent-here attitude of the exact type of risk-averse businessman who makes the typical corporate programmer’s job so goddamn depressing. In Java, there’s arguably a solution out there that sorta-kinda matches any business problem. Not all the libraries are good, but there are a lot of them. Some of those Java solutions work very well, others do not, and it’s hard to know the difference (except through experience) because the language is so verbose and the code quality so low (in general; again, this is cultural rather than intrinsic to the language) that reading the code is a non-starter. Even in the case where an engineer wanted to read the code and figure out what was actually going on, the business would never budget the time. Still, off-the-shelf solutions are trusted implicitly until they fail (either breaking, or being ill-suited to the needs of the business). Usually, that doesn’t happen for quite a while, because most off-the-shelf, open-source solutions are of decent quality when it comes to common problems, and far better than what would be written under the timelines demanded by most businesses, even in “technology” companies. The problem is that, a year or two down the road, those off-the-shelf products often aren’t enough to meet every need. What happens then?

I wrote an essay last year entitled, “If you stop promoting from within, soon you can’t.” Companies tend to have a default mode of promotion. Some promote from within, and others tend to hire externally for the top jobs, and people tend to figure out which mode is in play within a year or so. In technology, the latter is more common for three reasons. The first is the cultural prominence of venture capital. VCs often inject their buddies, regardless of merit, at high levels in companies they fund, regardless of whether the founders want them there. The second is the rapid scramble for headcount accumulation that exists in, and around, the VC-funded world. This requires companies to sell themselves very hard to new hires, which means that the best jobs and projects are often used to entice new people into joining rather than handed down to those already on board. The third is the tendency of software to be extremely political, because for all of our beliefs about “meritocracy”, the truth is that an individual’s performance is extremely context-dependent and we, as programmers, tend to spend a lot of time arguing for technologies and practices that’ll put us, individually, high in the rankings. Even if they are the same in terms of skill and natural ability, a team of programmers will usually have one “blazer” and N-1 who keep up with the blazer’s changes, and no self-respecting programmer is going to let himself be in the “keep-up-with” category for longer than a month. At any rate, once a company develops the internal reputation of not promoting internally, it starts to lose its best people. Soon, it reaches a point where it has to hire externally for the best jobs, because everyone who would have been qualified is already gone, pushed out by the lack of advancement. While many programmers don’t seek promotion in terms of ascent in a management hierarchy, they do want to work on bigger and more interesting projects with time. In a never-invent-here culture that just expects programmers to work on “user stories”, the programmers who are capable of more are often the first ones to leave.

Thus, if most of what a company has been doing has been glue code and engineers are not trusted to run whole projects, then by the time the company’s needs have out-scaled the off-the-shelf product, the talent level will have fallen to the point that it cannot resolve the situation in-house. It will either have to find “scaling experts” at a rate of $400 per hour to solve future problems, or live with declining software quality and functionality.

Of course, I am not saying, “don’t use off-the-shelf software”. In fact, I’d say that while programmers ought to be able to spend the majority of their time writing assets instead of adapting to pre-existing ones, it is still very often best to use an existing solution if one will suffice. Unless you’re going to be a database company, you shouldn’t be rolling your own alternative to Postgres; you should use what is already there. I’d make a similar argument with programming languages: there are enough good ones already in existence that expecting employees to contend with an in-house programming language, which probably won’t be very good, is a bad idea. In general, something that is necessary but outside the core competency of the engineers should be found externally, if possible. If you’re a one-product company that needs minimal search, there are great off-the-shelf products that will deliver that. On the other hand, if you’re calling your statistically-literate engineers “data scientists” and they want to write some machine learning algorithms instead of trying to make Mahout work for their problem, you should let them.

With core infrastructure (e.g. Unix, C, Haskell) I’d agree that it’s best to use existing, high-quality solutions. I also support going off-the-shelf with the relatively small problems: e.g. a CSV parser. If there’s a bug-free CSV parser out there, there’s no good reason to write one in-house. The mid-range is where off-the-shelf solutions are often inferior, frequently in subtle ways (such as tying a large piece of software architecture to the JVM, or requiring expensive computation to deal with a wonky binary protocol), to competently-written in-house solutions. Why is this? For the deep, core infrastructure there is a wealth of standards that already exists, and there are high-quality implementations to meet them. Competing against existing assets is probably a wasted effort. On the other hand, for the small problems like CSV parsing, there isn’t much meaningful variability in what a user can want. Typically, the programmer just wants the problem to be solved so she can forget about it. The mid-range of problem size is tricky, though, because there’s enough complexity that off-the-shelf solutions aren’t likely to deliver everything one wants, but not quite enough demand for nearly-unassailable standard implementations to exist in the open-source world. Let’s take linear regression. This might seem like a simple problem, but there are a lot of variables and complexities, such as: handling of large categorical variables, handling of missing data, regularization, highly-correlated inputs, optimization methods, whether to use early stopping, basis expansions, and choice of loss function. For a linear regression problem in 10,000 dimensions with 1 million data points, standards don’t exist yet. This problem isn’t a core infrastructural problem like building an operating system, but it’s hard enough that off-the-shelf solutions can’t be blindly relied upon to work.
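To make the linear regression point concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not a recommended implementation: the names `impute_mean` and `fit_ridge` are hypothetical, and every default (mean imputation for missing values, an L2 penalty, squared loss, fixed-step batch gradient descent) is just one of several defensible choices. Even at toy scale, each of those decisions changes the answer, which is the sort of variability an off-the-shelf solver has to guess at.

```python
from statistics import fmean


def impute_mean(rows):
    """Replace None entries with the column mean (one of many possible policies).

    Assumes every column has at least one observed value.
    """
    columns = list(zip(*rows))
    means = [fmean(v for v in col if v is not None) for col in columns]
    return [[m if v is None else v for v, m in zip(row, means)] for row in rows]


def fit_ridge(rows, targets, l2=0.0, lr=0.05, steps=5000):
    """Fit y ~ w.x + b with mean squared loss plus an L2 penalty on w.

    Uses plain batch gradient descent with a fixed step size; the intercept b
    is not penalized. Returns (w, b).
    """
    X = impute_mean(rows)
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        # Residuals of the current model on every data point.
        errs = [sum(wj * xj for wj, xj in zip(w, x)) + b - y
                for x, y in zip(X, targets)]
        # Gradient of mean squared error, plus the penalty term for w.
        grad_w = [2 * sum(e * x[j] for e, x in zip(errs, X)) / n + 2 * l2 * w[j]
                  for j in range(d)]
        grad_b = 2 * sum(errs) / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    xs = [[0.0], [1.0], [2.0], [3.0]]
    ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
    print(fit_ridge(xs, ys))            # unregularized fit
    print(fit_ridge(xs, ys, l2=1.0))    # same data, slope shrunk by the penalty
```

Changing only `l2` on the same four points moves the fitted slope noticeably, and swapping the imputation policy or the loss would move it again. At 10,000 dimensions and a million rows, the optimizer choice stops being a detail as well.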

This “mid-range” of problem is where programmers are expected to establish themselves, and it’s often where there’s a lot of pressure to use third-party products, regardless of whether they’re appropriate to the job. At this level, there’s enough variability in expectations and problem type that beating an off-the-shelf solution into conforming to the business need is just as hard as writing it from scratch, but the field isn’t so established that standards exist and the problem is considered “fully solved” (or close to it) already. Of course, off-the-shelf software should be used on mid-range problems if (a) it’s likely to be good enough, (b) those problems are unrelated to the work that the software engineers are trying to do and would be perceived as a distraction, and (c) the software can be used without architectural compromise (e.g. rewriting code in Java).

The failure, I would say, isn’t that technology companies use off-the-shelf solutions for most problems, because that is quite often the right decision. It’s that, in many technology companies, that’s all that they use, because core infrastructure and R&D don’t fit into the two-week “sprints” that the moronic “Agile” fad demands that engineers accommodate, and therefore can’t be done in-house at most companies. The culture of trust in engineers is not there, and that (not the question of whether one technology is used over another) is the crime. Moreover, this often means that programmers spend more time overcoming the mismatch between existing assets and the problems that they need to solve than they spend in building new assets from scratch (which is what we’re trained, and built, to do). In the long term, this causes the engineer’s skills to atrophy, lowers her level of satisfaction with her job, and can damage her career (unless she can move into management). For a company, this spells attrition and permanent loss of capability.

The never-invent-here attitude is stylish because it seems to oppose the wastefulness and lethargy of the old “not-invented-here” corporate regime, while simultaneously reaffirming the fast-and-sloppy values of the new one, doped with venture capital and private equity. It benefits “product people” and non-technical makers of unrealistic promises (to upper management, clients, or investors) while accruing technical debt and turning programmers into a class of underutilized API Jockeys. It is, to some extent, a reaction against the “not invented here” world of yesteryear, in which engineers (at least, by stereotype) toiled on unnecessary custom assets without a care about the company’s more immediate needs. I would also say that it’s worse.

Why is the “never invent here” (NeIH) mentality worse than “not invented here” (NIH)? Both are undesirable, clearly. NIH, taken to the extreme, can become a waste of resources. That said, it is at least a “waste” that keeps the programmers’ skills sharp. On the other hand, NeIH can be just as wasteful of resources, as programmers contend with the quirks and bugs of software assets that they must find externally, because their businesses (being short-sighted and talent-hostile) do not trust them to build such things. It also has long-term negative effects on morale, talent level, and the general integrity of the programming job. My guess is that the “never invent here” mentality will be proven, by history, to have been a very destructive one that will lose us half a generation of programmers.

If you’re a non-technical businessperson, or a CTO who’s been out of the code game for five years, what should you take away from this post? If your sense is that your engineers want to use existing, off-the-shelf software, then you should generally let them. I am certainly not saying that it is bad to do so. If the engineers believe that an existing asset will do a job better than they could do if they started from scratch, and they’re industrious and talented, they’re probably right. On the other hand, senior engineers will develop a desire to build and run their own projects, and they will agitate in order to get that opportunity. The short-termist, never-invent-here attitude that I’ve seen in far too many companies is likely to get in the way of that; you should remove it before it does. Of course, the matter of what to invent in-house is far more important than the ill-specified and vague question of “how much”; on both questions, senior engineering talent can generally be trusted to figure that out.

In that light, we get to the fundamental reason why “never invent here” is so much more toxic than its opposite. A “not invented here” culture is one in which engineers misuse freedom, or in which managers misuse authority, and do a bit of unnecessary work. That’s not good. But the “never invent here” culture is one in which engineers are out of power, and therefore aren’t trusted to decide when to use third-party assets and when to build from scratch. It’s business-driven engineering, which means that the passengers are flying the plane, and that’s never a good thing.