If you consider the number of packages on pypi, npm, and rubygems, and the release dates of python (1991), ruby (1995) and node.js (2009), it looks a lot like those communities are getting new packages at the following rates, per year:
python: 29,720 packages / 22 years = 1351 packages per year
ruby: 54,385 packages / 18 years = 3022 packages per year
node.js 26,966 packages / 4 years = 6742 packages per year
It’s important to note that I’m only showing the other languages as a measuring stick. There are probably a lot of reasons for the disparity here (including my imprecise math), and I’m not trying to say anything like “node.js > ruby > python” but obviously new node.js packages are being published to npm at a staggering rate. I just checked npm (on a Sunday night) and 20 new versions of packages have been published to npm in the last hour alone.
This is a pretty amazing feat for node.js. How is this possible?
If you’ve needed an HTTP client in Python, you’ve probably asked yourself “should I use urllib, urllib2, or httplib for this?” like so many people before you. Well the answer is probably that you should use requests. It’s just a really simple, intuitive HTTP client library that wraps up a lot of weirdness and bugs from the standard libraries, but it’s not a standard library like the others.
The “Batteries included” philosophy of Python was definitely the right approach during the mid 90’s and one of the reasons that I loved Python so much; this was a time before modern package management, and before it was easy to find and install community-created libraries. Nowadays though I think it’s counter-productive. Developers in the community rarely want to bother trying to compete with the standard library, so people are less likely to try to write libraries that improve upon it.
Developers are less likely to use non-standard libraries in their projects too. I’ve heard many instance of people suffering through using an inferior standard library just to have “no dependencies”. But in this day and age of cheap massive storage and modern package management, there are very few reasons for a policy of “no dependencies”.
Conversely, the node.js core developers seem to actually want to minimize the standard library as much as possible. They have repeatedly removed features from the standard library to free them up to be implemented by the community instead. This allows for the most variety and lets the best implementation win (but only until someone makes a better one, of course!).
Imagine how liberating this is for standard library maintainers too. The node.js standard library is comparatively tiny, freeing the core developers to just deal with the core necessities.
Just like the 140 character limit on twitter makes people “blog” more, the node.js community has a culture of publishing tiny modules that makes people more comfortable with publishing smaller packages, and so it happens vastly more often.
While I don’t really consider myself “a node.js developer”, I’ve found myself publishing over a dozen packages in the last year myself, just because I’m working on a node.js project at work, and I publish any reusable package from that work that I create. My project ends up factored better, and I end up with a bunch of building blocks I can re-use elsewhere (and then I do!). Sometimes random people from the interwebs even fix my bugs before I notice them!
This approach is not unusual at all either, at least in the node.js community. The community is lead by developers that have each published literally hundreds of packages. I don’t even know how this is possible.
Of course there’s absolutely nothing to stop other languages from following suit, so it seems to me like community culture is the deciding factor.
Massive monolithic frameworks like Ruby on Rails tend to not take hold in the node.js community mindshare, I think due in part to this “tiny module aesthetic”, and it’s all the better for it. It seems to me that monolithic frameworks like Rails don’t have a clear notion of what does not belong in Rails, and so they tend to expand indefinitely to include everything an application could need. This leads to people trying to assert that Rails is the best solution for all problems, and having a cooling effect on innovation within the web application space in the Ruby community.
The tiny module aesthetic also leads to an ecosystem of extreme re-use. When a package doesn’t do more than is absolutely necessary, it’s very easy to understand and to integrate into other applications. Conversely, while massive monolithic frameworks tend to appear in a lot of end-user applications, they rarely never end up being dependencies in other packages themselves. It seems unfortunate to me that so much great software should be so difficult to re-use.
To be honest, few things are more boring to me in software development than version control, and I find the complexity of git to be a bit silly and mostly unnecessary for me, but I’ve never heard of a node.js developer that doesn’t use it, and there’s huge value in that kind of monoculture. It means that they all speak with a common language and there’s no impedance when you want to contribute to someone else’s projects.
Github.com has a similar effect in lowering the barriers for cross-pollination between developers. I’ve only very rarely seen a node.js project that wasn’t on github. This means I immediately know exactly where to find source code if I want to contribute.
The pluses and minuses of a github monoculture go far beyond this, but there’s probably enough fodder there for an entirely different post.
Node.js packages tend to use extremely permissive licensing like MIT and BSD licensing. In fact, the default license for a package created by npm init
is BSD. I think this is another sign of the times.
Very few people care anymore if someone else forks and doesn’t contribute back. There’s very little to gain in not contributing back anyway because of the cost of maintaining your own fork.
This is important because no one wants to have to deal with dependencies that make them think about legal repercussions, or give them extra responsibilities. Licenses like the GPL end up having a cooling effect on software-reuse for this exact reason.
EDIT Jun 3rd, 2014: Comments below are basically copy/pasted from my old blog format for posterity.
Nice writeup! I do think that the term ‘node community’ is a little bit too general since some of the most popular node modules are in fact bigger than they should be.
For example, the 4 most popular modules from https://npmjs.org/: underscore, async, request and express could all be split up into smaller pieces:
Underscore isn’t terribly large but it has things tucked in it like a templating function that don’t belong.
Async is thematically consistent, offering many variations on a theme, but from the empirical usage patterns that I’ve seen most people use a small percentage of the offered functionality.
Request actually went through the modularization process over the last month or two and went from having 1 or 2 dependencies to having 11.
Express is perhaps the greediest module on NPM (IMHO) in that it invented its own naive APIs early on which means that now you can only use modules designed for express with express since the extensibility is stuck in a relatively pre-cambrian era. Classic framework lock-in, albeit to a lesser extent than other, larger web frameworks. I’ve thought a lot about why this happened and believe that if you try to make a land-grab early on and abstract too much too early then you get stuck with your abstractions due to backwards compatibility. Good software grows slowly.
Yeah I totally agree, and you’ve probably explained it even better than I did.
Express is actually a bit of a conundrum for me for a few reasons, especially since in trying to create a mini API framework myself ( https://percolatorjs.com ), I’ve felt forced into duplicating some of the things I dislike about the modularity (or lack of modularity) in express, to the point that I eventually gave up and added ‘connect’ middleware support. There are a few competing pressures that make web frameworks like express (and rails for that matter) popular:
(1) Some people just want to “get shit done” ™ and don’t want to have to search for multiple pieces of a puzzle to hobble together a solution. I think there’s probably a gap where finding the best modules for a given problem is still a too hard. Filling that gap (somehow) will promote collaboration even more. The current npm search (on its own) doesn’t solve the problem IMO.
(2) HTTP is actually really complicated and very hard to abstract effectively. For example, how many frameworks 404 when you try an invalid method on an existing resource instead of the proper 405? This is mostly because of the way routing is abstracted; These frameworks use the http method as the first ‘key’ in the routing table, when really the url makes more sense as a key and the method should be secondary. There are a number of cross-cutting aspects like this including logging, error-handling, and authentication where some sort of centralized pre-processing makes a lot of sense. Not all “aspects” should be handled with serialized pre-processing middleware though, and unfortunately that’s what we’re seeing a proliferation of. I’m disappointed that I can’t build a mini-framework without integrating connect, or rewriting/repackaging all those otherwise great connect middleware modules out there, but I don’t know a better way. To be honest, I don’t even know at this point what the better abstraction would be. Doing everything in pre-processing middleware and monkey-patching the request/response objects doesn’t seem like the answer though.
Good stuff. I haven’t felt this excited about a platform community in a long time.
Good read: spot on.
I would add a couple additional factors to that trend:
- Proper package management: `npm` was and still is what got me hooked up. Package management is often an afterthought that is rarely properly addressed.
- Flat community: which is partly illustrated with the extreme culture of re-use you mentioned. I feel there is much lower friction in creating and contributing to modules in the node.js community, with no gate keeper in sight.
For a similar system: Perl’s CPAN (the package repository) has been online since 1995 and presently has 120,447 modules for a yearly rate of 6691 modules for 18 years now.
It’s been providing the same functionality you mention: an encouraged license (“the same terms as Perl”), a monoculture of software (CPAN for package distribution/documentation), and an emphasis on tiny modules. Little packages are encouraged and really the way to go when you’re building Perl modules.
That said, it’s great to see those practices being adopted successfully by Node and seeing the success of the groups using it. We live in very exciting times for open source. :)
The “The Tiny Module Aesthetic” reminds me that, the emerging of the “Single-purpose app” in recent years, and the design philology of the Unix utilities. And I think those are all about “simplicity”, which I think and care about the more and more in recent years, but I always have to remind myself about when writing code for LIVEditor – my code editor with “Firebug” embedded.
There are some frameworks in Node.JS which are just copied, not unique, like for example https://github.com/biggora/caminte – is just a copy of Juggling DB https://github.com/1602/jugglingdb.
I would strongly recommend to use genuine frameworks which have been tested by the community and better supported.
Why don’t you consider Perl? 120,460 Perl modules in 15 years, that makes it 8030 modules per year and it’s increasing exponentially now.
“Batteries not included” approach puts a significant cost on application developers.
If I need to do a simple thing X, I’d rather have a single good library for X rather than twenty libraries with varying focus and quality. Even if one of those twenty libraries would be far better for my needs, the task of properly comparing and choosing the lib is often higher than doing X with a random library.
It also reduces maintainability – if one project uses lib1 for X, and another project uses lib2 with a different API for the exact same thing, then it makes stuff harder to maintain.
@ Chankey Pathak, Mark Smith: For the record, I didn’t include perl because I haven’t got any perl experience to draw from, while I’ve spent years writing reams of code in ruby, python and node.js. I can honestly say though that CPAN is nothing short of legendary even outside the perl community, and npmjs.org would do well to follow its lead. It would be interesting to know too how CPAN maintains that rate of growth, and if it’s due to the same things I’ve mentioned.
I should also add that node.js isn’t even the language/platform that’s nearest and dearest to my heart. I’d really just love my other favorite languages’ ecosystems to pick up some of these qualities.
From a CPAN’s lover’s perspective NPM already is _very_ well done and pretty smooth to use.
From a Perl lover’s perspective, JavaScript doesn’t have a choice anyways, because you really NEED a lot of modules to work around well let’s politely say “language quirks”. ;)
You simply need things like underscore or node, because it gives you things which aren’t builtin in the language. (*cough* npm install printf *cough* ;)
Similarily, Perl has gotten its Lisp-style MOP OO module Moose, because there’s no such thing as a “class” in Perl.
Our equivalent of underscore-alike modules are for example List::Utils, Lists::MoreUtils etc. etc. – and you can’t possibly shove ALL those modules into what we call “core” – set aside that half of the people want Foo and the other half Bar and then there’s our “problem” we’re really proud of: Backwards compatibility is _amazing_ in Perl and there’s always someone who runs a 12 year old Perl and still needs his Really::Ancient::Module::BlahBlah. ;)
So far I encountered exactly ONE npm module which I couldn’t install blindly with npm install -g – and that was due to an architecture check in its requirements (which isn’t necessary, module very high level) and my Linux flavor returns a different architecture string – so one lesson might be to encourage a best practise “don’t overdo requirements for your module, it just complicates things”
In Perl, it’s common that you simply ask “the community” (IRC for example) “what do I want for XY?” and get a variety of 2-5 modules and a short explanation why you want A and not B. Imagine for example you have more than one XML/HTML parser module (which we do) – do you want XPath support or do you prefer DOM-style methods? Do you need a strict parser or do you want to work on dirty HTML?
Most module authors mention these kinds of use cases and “flavors” to do things and refer graciously to the alternatives. Plus, there’s the entire commenting and rating system for modules and yes, you of course look if Perl gurus Tom, Dick and Harry Superperl ranks module Foo highly or not – it’s not that we don’t have our own little rockstars and divas. ;)
Due to variety and so many choices, there’s also bad choices and we put A LOT of work into discouraging certain modules these days which are simply not really appropriate to use in 2013 or some well known badly written ones. But you can’t just throw them out, too – because Perl is installed everywhere – what is the browser for JavaScript is pretty much any Linux/Unix for Perl. And all the shitty JavaScript quirks you’ve gotten from browser vendors, we get from distributors which come with a Perl from 1999 in 2013.
Sooner rather than later (there already are a couple) there will be a class of modules on NPM which are dual use – browser and Node-based server-side and you will have to keep that straight and compatible and that will be a lot of work. Also, what if people start writing JavaScript modules which are able to run in browser, on Node AND on e.g. spidermonkey? (what about Rhino/Nashorn?) I’d encourage that a lot, not everything needs Node and NPM is the de facto thing to put a module up on by now.
Another advantage of CPAN is: It’s _extremely_ centralized. You look for a module on CPAN. Period. There’s no “download from here” or “fetch from there” – it’s on CPAN or it doesn’t exist. As CPAN is tested and tested and retested by volunteers, using modules from CPAN pretty much ensures been tested on a dozen platforms. If you use the plain “cpan Foo::Bar” installer, you can always watch all tests live while running the installation of a module.
And Perl modules have tests LIKE THERE’S NO TOMORROW – as Perl’s testing flavor TAP is extremely easy and simply and cheaply to do, modules actually _do_ have a LOT of tests.
Last but not least: On average, the commonly used Perl modules are _extremely_ well documented and the docs can be read on commandline and via web. The documentation style is probably rather old-school from a Node lover’s perspective – it’s more on the dry, informative side and less about being cute and funny – but some docs literally go into the amount of a book written.
Also, no matter how big a project – I simply expect it to be installable via CPAN. There’s a couple of Perl things which are more applications but “modules” and are not installable via CPAN, but that’s pretty rare. Entire frameworks are installed via CPAN – think like the difference of how you install Derby.js and Meteor.js.
And: I run deliberately large dependency-chains for the fun of it – CPAN has a flag for “resolve all dependencies yourself and say auto-yes to everything you’re asked during installation” and I manage a couple of hundred modules in ONE session WITHOUT any error while running ALL tests during installation – possibly on half a dozen Perl versions (Made with perlbrew which is our nvm/rvm/virtualenv).
As the Perl mantra “there’s more than one way to do it” is done by the JavaScript community on the same level of excess (a dozen web frameworks anyone? ;) you’ve made your choice already anyways to go down the road of extreme flexibility and variety. It already happened, no going back here.
The Perl community for example is very forceful to really push people into using modules from CPAN exactly BECAUSE they’re tested and have solved hundreds of problems for you – you don’t split a CSV file by hand, you use Text::CSV which is proven to work for a decade. You won’t do better with your meager little split. :)
You don’t parse HTML with regex, you use a module. You don’t validate email addresses by hand, you use a module. We have modules like Regex::Common for example which contain a range of common usecases people usally throw (bad) regex on or modules which pretty much read any stupid format known to mankind – it’s all put into modules and more modules.
Luckily, Perl also has most of the necessary modules I usally call “boring shit” modules – Excel reading and writing anyone? Those are the true heroes in my opinion – people who sit down and really write this kind of ungrateful, boring stuff you suddenly need there’s *gasp* lurking an “enterprise” around the corner – not to mention that nobody in their right mind is going to implement SSL bindings three times. (One lesson for JavaScript/Node might be to maybe not add another web framework but deliberately go for the boring stuff sooner rather than later and get it over with..)
For my part, I can’t rave enough about CPAN and some of the people poured their hearts and time and knowledge into some modules. Billion euro/dollar businesses run on it – some for over a decade straight.
So, hack, upload and enjoy. :)
Yours truely – a Perl lover. :)
CPAN’s main website: http://metacpan.org
Maven is also something that’s begun to pick up pace in the Java community. Stats seem to be here: http://search.maven.org/#stats
But not sure about exact growth rate. A pretty picture is available here: http://mvnrepository.com/
While having a small standard library is great for essentially encouraging reinventing the wheel to make it better, it can at times be a detriment for “just getting things done”. It’s called a “standard” for a reason, which is that you can rely on it to be present and available for use, and more importantly, for understanding by the developers that come in and out of a project.
That being said, popular extensions will solidify and essentially become a 3rd party standard for use and becomes indispensable and tied into the core project, whether it’s just the common-knowledge standard du jour, or it actually gets merged in.
It certainly is a more interesting way to get a diversity of development of what people consider to be core though. For that, I find that model interesting to watch and see how it grows and if anything I favor “wins” :).
Su-Shee: thanks for the education! I agree, CPAN sounds great. Since we’re getting along so well, I won’t say anything about the language itself. :)
A few minor clarifications on javascript in 2013: Don’t `npm install printf`, use util.format: https://nodejs.org/api/util.html#util_util_format_format . Underscore is much less necessary nowadays (especially in node.js) as a result of ecmascript 1.6 improvements (See “iteration methods” at https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array ). Also using jslint will keep you 99% quirk-free (I have it integrated into my builds and vim).
There’s no need to slander the GPL in this post, it’s just as permissive and would allow the same type of collaboration. The trouble is that most developers also want to use libraries as part of proprietary code bases. I don’t understand why they don’t just dual-license; GPL for non-profits and devs, MIT/BSD for commercial/proprietary software.
There really should be a bigger push to license internal code under the GPL. 99% of the time it doesn’t offer a competitive edge. It’s the data and marketing and network effects that have more to do with it than anything.
Great post. I’d also add to that list the benefits of local by default packages that are in your project folder. The approach of tools like easy_install, pip, bundler, etc is to hide all your depemdencies in some dot-prefixed folder outside your project directory. This promotes an out of sight, out of mind mentality. The npm approach on the other hand puts them in your project and makes it much more likely that you are going to explore the source of your dependencies, more likely that you bug fix those dependencies instead of work around bugs and most importantly you’ll treat that folder like part of your own project. This last point means that you are likely to develop one of your own generic modules to the point where it’s good enough to submit back to npm as a public module.
Oh, go ahead and say something about the language and I will probably tell you the exact same thing back about JavaScript – or something quite similar. If anything, both have A LOT of history and baggage – but they both also have A LOT of flexibility, otherwise the range of modules existing as of today wouldn’t even be possible and people wouldn’t deliver at that speed. And let’s face it: We can happily unite (Come along, you too Java, don’t be shy…) in Python’s and Ruby’s dislike for {} and ; – and we use a lot of them. ;)
Sadly, format.util supported too few options when I looked last. (In the end, I did it in DB anyways..) – but it’ll evolve, I’m sure.
Underscore will probably move to other list-supporting functions over the years I would expect, the map-grep-filter-foreach range is just the basics anyways. (At least, Perl didn’t need map-grep-foreach-shift-unshift-push-pop-splice as a module, that really is core-core…)
And before someone is tempted to bitch about Java-and-Maven: _THAT_ community delivered literally EVERYTHING I can think up of “boring shit” and “enterprisey stuff” you can imagine. And if you ever looked for “good PDF manipulation” for example: Java did deliver. Kudos for that.
If anything the lesson simply is: Yes, you want TONS of modules, yes, the environment IS important, no, it doesn’t matter if you’re a less-than-perfect language, as long as you support a certain flexibility, people will work around it.
And style evolves a lot – nobody writes JavaScript like in 2000 anymore and you get discouraged a lot to do so – same goes for Perl. But if you already had a web business in 2000, you probably just can’t throw out your entire carefully crafted JavaScript or your backend of the day (in 2000) and that is something EVERYBODY faces. Perl did and we paid dearly, Java does, Ruby will, Python will do too.
JavaScript showed – thanks to Douglas Crockford and jQuery among others and yes, thanks to Node.js, too – how to evolve nicely into a new style – but also Node.js setups will at some point be really huge and not that easily updated anymore and then you face the very same thing everybody else does.
And god do I wish this petty bickering among languages would stop – we owe each other so many things – all of us – and all of us owe Lisp and Smalltalk and C.
Let me all remind you that there was a time when “real programmers” laughed about using a “scripting language” on this cheap little Linux hack with this meh little HTML thingie to put up a business on and now look where we all are today…
Su-shee: Totally agree. Well said.
Rudolph: “Slander” is a legal term and a pretty ridiculous one for what I said. More legalese isn’t going to help the GPL’s case. You already agree that the GPL is more restrictive for people mixing software with other proprietary software, so how can it *possibly* be slanderous for me to say it’s not as permissive?
Well, counting the age of the Python and Ruby repos by the age of the language (and not of the package management projects, such as pypi and rubygems) while considering node.js by the age of the project and not the language (JavaScript) skews the numbers quite a bit!
Let’s compare by language age:
python: 29,720 packages / 22 years (1991) = 1351 packages per year
ruby: 54,385 packages / 18 years (1995) = 3022 packages per year
node.js 26,966 packages / 18 years (1995) = 1498 packages per year
And by package manager age:
python: 29,720 packages / 11 years (2002?) = 2701 packages per year
ruby: 54,385 packages / 10 years (2003) = 5438 packages per year
node.js 26,966 packages / 4 years (2009) = 6742 packages per year
Except that, of course, these numbers are meaningless, because each language/tools has its own submission evaluation process (including “no evaluation”), and also different granularities in how to count “packages” (by module? by a set of modules that compose a project?).
Also, language-specific package managers are more or less a new thing — counting for example the number of packages submitted last year would be a more meaningful metric (apart from the problems I raised above.)
Still, it’s nice to see that language-specifc package managers are thriving, and I wish all the best to all of them. :)
The reason Express is the size it is, is because all these things are what we would be duplicating each time anyway. Yes a lot of it could/should/will live in npm, but an aggregate is not harmful. Look at npm-www for example, this is what happens when you re-create things every single time you write an app, it becomes are large unwieldy mess of stitching together random things and re-inventing within the app itself instead of an aggregate “framework”. Most middleware should absolutely be abstracted to a lower level and then “wrapped” as middleware, there’s no question about that, but should they cease to exist? absolute not.
TJ: Yeah, I actually agree that the core http module isn’t a useful enough abstraction to be used for many (most?) real world web applications. It also doesn’t offer a reasonable way to add functionality other than wrapping it or monkey-patching the request/response objects. Cross-cutting concerns are the obvious problem (security, logging, routing, etc), but also: what if you want to just add functionality that’s more easily accessible than constantly requiring the same 20 modules that are all very HTTP relevant? And I agree that larger aggregates are useful and valid (and I bet you’d also agree conversely that putting an orm in a web server abstraction is a superfluous aggregation, and so there are limits to what aggregates make sense).
I don’t actually have a better solution than what express does. I hope that was obvious from my comments. I certainly tried, but ended up coming back to connect. At the same time, I find it unfortunate that connect middleware have non-obvious interdependencies and are difficult to keep clash-free (basically all the problems of monkey-patching). But like I say, I haven’t got a better proposal.
I personally don’t think node.js has the HTTP abstraction problem solved very well (yet).
> Most middleware should absolutely be abstracted to a lower level and then “wrapped” as middleware, there’s no question about that
Not a bad start!
@Su-Shee: If there would be an upvote button on comments here I would’ve hit it so hard that it would break my mouse! Well written, +1 to you.
Some times I get confused when people try to convince other’s programmers that easily re-usuable modules are awesome and you should try them. Then I remember that not every was into perl 10 years a go… Even if you’re not a fan of the perl syntax (beauty), CPAN is the shit and so is NPM.
Pingback: This Week's Links: _why Edition - NYTimes.com
I’m sure this was already said.. but you can’t compare node.js to php/ruby. I am a node developer.. but node is a new FRAMEWORK. It would be more accurate to compare JAVASCRIPT to php/ruby.
Just FYI
Adam: The comparisons I make are comparisons in the customs of the communities surrounding those technologies and not the technologies themselves. I think at that level, it’s apples-to-apples.
@PeterisP:
> “Batteries not included” approach puts a significant cost on application developers.
I agree, it sure wastes a hell lot of time.
@TJ: Is express code too succinct for what it does? For example, its very difficult to figure out what is happening inside a router https://github.com/visionmedia/express/blob/master/lib/router/index.js#L46
Pingback: To node or not to node – overview of Node.js | Developing Scalable Applications
@Adam I have to disagree here. JavaScript is only a language (it is syntax), PHP/Ruby are more than syntax, it is syntax + all about the runtime and functions hooks to the environment. Node.js is JavaScript + runtime/hooks stuff. That’s why Node.js is comparable to PHP and Ruby, JavaScript isn’t.
Pingback: ~ The Truth about JavaScript. « Clint Nash