Thoughts by Ted

Musings about Open Source, Linux, and Life by Theodore Tso

Ext4 is now the primary filesystem on my laptop

Over the weekend, I converted my laptop to use the ext4 filesystem.  So far so good!  So far I’ve found one bug as a result of my using ext4 in production (if delayed allocation is enabled, i_blocks doesn’t get updated until the block allocation takes place, so files can appear to have 0k blocksize right after they are created, which is confusing/unfortunate), but nothing super serious yet.  I will be doing backups a bit more frequently until I’m absolutely sure things are rock solid, though!

I am using the latest ext4 patches and the tip of the e2fsprogs git repository.  Hopefully when we get the bulk of the patches merged into the mainline kernel after the 2.6.26 ships and the 2.6.27 merge window opens, and after I ship out e2fsprogs 1.41 (I have one work-in-progress pre-release, with another coming soon), it’ll be ready for much more wide-spread testing.

In addition to the excellent crew of ext4 developers, I’d like to call out for special thanks Gary Howco and Holger Kiehl, two early users/benchmarkers of ext4 who tried our latest code, and reported bugs that had previously escaped attention by developers (who had been mostly testing the code via the same old test suites); their additional workloads and benchmarks flushed out a few additional bugs.   Thanks, guys!!

Hopefully after a few weeks of my using ext4 for real-live work, I’ll find a few last few bugs to be fixed, and/or feel much more confident it’s ready for me to recommend to others for their production data.

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit

Learning how to communicate

Pity poor Dr. Ari Jaaksi from Nokia. He gave a talk at the Handsets World conference in Berlin on Tuesday, where according to ZDnet, he lectured Open Source Developers that they needed to learn why DRM and other closed technologies were necessary, because of business issues such as subsidized (device) business models. I suspect he wasn’t prepared for the reaction, which took the form of a major fuss on Slashdot, as well as some declarations from a few people on the maemo-users mailing lists that they would never buy another Nokia device. Dr Jaaski then posted today on his blog an entry entitled, “Some learning to do?”, where he stated that while Nokia needs to learn how the open source world works (not just licenses and legal issues, but also the spirit), that the open source world also needed to learn as well — about WHY things are the way they are. He ended the post with “I’m not a teacher, I’m a learner” — which is fair enough.

I actually think that Dr. Jaaksi is right; we all should be interested in learning, and having talked with a number of Nokia employees, I can testify that their hearts are very much in the right place, and there are some strong reasons regarding the demands of carriers, the subsidized business model of handsets in the US, the attitudes of suppliers for components in mobile devices, that very much tie their hands — as well the hands of every other handset company out there. Progress in this space will come slowly, and but I believe there is hope that in the long run things will get better.

However, at the same time, I would suggest to Dr. Jaaksi that it’s also important how to communicate with the open source community — or with any community that holds views sometimes with great passion. Many people do not react well to being told that they need to learn something — even if it is true that they do. Admitting that they need to learn is for some people tantamount to admitting ignorance, or even worse admitting stupidity, and so they are loath to do this. Telling them this just makes them defensive and angry.

My suggestion for how get people to learn without making them defensive is to use a modified Socratic method. It works especially well with the Open Source crowd, because many of them are engineers, and engineers are by nature problem solvers. So if you give them a problem to solve, they will go to work trying to find potential solutions, and thus start thinking and learning about the problem from a different point of view. Hence, a better approach is to give them a series of business constraints (time to market, the demands of carriers, the fact that most end users prefer to buy subsidized phones with 1-2 year lock-in contracts, the intransigence of suppliers, why a single company can’t afford to create their own chipsets from scratch and have to rely on companies like Broadcom, etc.) and then tell them — we’d love to delight our customers in any way we can, and clearly you are very passionate about DRM, open device drivers, and other OSS issues. We however have to create successful products and sell them at a price such that we don’t go out of business — can you help us work through all of these constraints? Let’s work together so we can make an open source mobile platform that meets the spirit as well as the legal requirements of the open source world!

Dr. Jaaksi and his team are struggling with a hard problem, and I am very sympathetic to their issues. And, they have to be commended for investing as much time and effort on the Maemo platform, as well as other Linux and Open Source platforms as they have to date. The strength of the Open Source community is that we work together to solve our mutual problems, while celebrating and giving permission to people to “scratch their own itches”. In order to do that, we need to learn how to better communicate with each other, and that is something that goes in both directions.

When I was at MIT as an undergraduate, far too many years ago than I care to admit, I took economics classes and Sloan School of Management classes (even though they were not necessary for a Computer Science degree) for that very reason. I encourage open source advocates to learn how to walk a mile or two in business people’s shoes, since if you want to influence them them to change in the ways that you desire, you need to know where they are coming from — just shouting and throwing rotten tomatoes generally won’t get you very far. On the flip side, people who are on the corporate side of the divide also need to understand that certain ways of engaging the community can be more productive than others. At the end of the day, hopefully we can all agree with Dr. Jaaksi that everyone will benefit if we all commit to learning from each other.

Update: I’ve fixed Dr. Jaaksi’s name so it is spelled correctly; my apologies for the error.

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit

Donald Knuth: “I trust my family jewels only to Linux”

Andrew Binstock interviewed Donald Knuth recently, and one of the more amusing tidbits was this:

I currently use Ubuntu Linux, on a standalone laptop—it has no Internet connection. I occasionally carry flash memory drives between this machine and the Macs that I use for network surfing and graphics; but I trust my family jewels only to Linux.

More seriously, I found his comments about about multi-core computers to be very interesting:

I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they’re trying to pass the blame for the future demise of Moore’s Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won’t be surprised at all if the whole multithreading idea turns out to be a flop, worse than the “Itanium” approach that was supposed to be so terrific—until it turned out that the wished-for compilers were basically impossible to write.

Let me put it this way: During the past 50 years, I’ve written well over a thousand programs, many of which have substantial size. I can’t think of even five of those programs that would have been enhanced noticeably by parallelism or multithreading. Surely, for example, multiple processors are no help to TeX….

I know that important applications for parallelism exist—rendering graphics, breaking codes, scanning images, simulating physical and biological processes, etc. But all these applications require dedicated code and special-purpose techniques, which will need to be changed substantially every few years.

This is a very interesting issue, because it raises the question of what next-generation CPU’s need to do in order to be successful. Given that it is no longer possible to just double the clock frequency every 18 months, should CPU architects just start doubling the number of cores every 18 months instead? Or should they try to concentrate a lot more computing power into an individual core, and optimize for a fast and dense interconnect between the CPU’s? The latter is much more difficult, and the advantage of doing the first is that it’s really easy for marketing types to use some cheesy benchmark such as SPECint to help sell the chip, but then people find out that it’s not very useful in real life.

Why? Because programmers have proven that they have a huge amount of trouble writing programs that take advantage of these very large multicore computers. Ultimately, I suspect that we will need a radically different way of programming in order to take advantage of these systems, and perhaps a totally new programming language before we will be able to use them.

Professor Knuth is highly dubious that the later approach will work, and while I hope he’s wrong (since I suspect the hardware designers are starting to run out of ideas, so it’s time software engineers started doing some innovating), he’s a pretty smart guy, and he may well be right. Of course, another question is whether what would we do with all of that computing power? Whatever happened to the predictions that computers would be able to support voice or visual recognition? And of course, what about the power and cooling issues for these super-high-powered chips? All I can say is, the next couple of years is going to be interesting, as we try to sort out all of these issues.

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit

Organic vs. Non-Organic Open Source, Revisited

There’s been some controversy generated over my use of the terminology of “Organic” and “Non-Organic” Open Source. Asa Dotzler noted that it wasn’t Mozilla’s original intent to “make a distinction between how Mozilla does open source and how others do open source”. Nessance complained that he didn’t like the term “Non-Organic”, because it was “raw and vague - is it alien, poison, silicon-based?” and suggested instead the term “Synthetic Open Source”, referencing a paper by Siobhán O’Mahony, ” What makes a project open source? Migrating from organic to synthetic communities”. Nessance referenced a series of questions and answers by Stephen O’ Grady from Red Monk, where he claimed the distinction between the two doesn’t matter. (Although given that Sun is a paying customer of Red Monk, Stephen admits that this might have influenced his thinking and so he might be “brainwashed” :-).

So let’s take some of these issues in reverse order. Does the distinction matter? After all, if the distinction doesn’t matter, then there’s no reason to create or define specialized terminology to describe the difference. Certainly, Brian Aker, a senior technologist from MySQL, thinks it does, as do folks like me and Amanda McPherson and Mike Dolan; but does it really? Are we just saying that because we want to take a cheap shot at Sun?

Well, to answer that, let’s go back and ask the question, “Why is Open Source a good thing in the first place?” It’s gotten to the point where people just assume that it’s a good thing, because everybody says it is. But if we go back to first principals maybe it will become much clearer why this dinction is so important.

Consider the Apache web server; it was able to completely dominate the web server market, easily besting all of its proprietary competitors, including the super-deep-pocketed Microsoft. Why? It won because a large number of volunteers were able to collaborate together to create a very fully featured product, using a “stone soup” model where each developer “scratched their own itch”. Many, if not most, of these volunteers were compensated by their employers for their work. Since their employers were not in the web server business, but instead needed a web server as means (a critical means, to be sure) to pursue their business, there was no economic reason not to let their engineers contribute their improvements back to the Apache project. Indeed, it was cheaper to let their engineers work on Apache collaboratively than it was to purchase a product that would be less suited for their needs. In other words, it was a collective “build vs. buy” decision, with the twist that because a large number of companies were involved in the collaboration, it was far, far cheaper than the traditional “build” option. This is a powerful model, and the fact that Sun originally asked Roy Felding from the Apache Foundation to assist in forming the Solaris community indicates that at least some people in Sun appreciated why this was so important.

There are other benefits of having code released under the Open Source license, such as the ability for others to see the implementation details of your operating system — but in truth, Sun had already made the Source Code for Solaris available for a nominal fee years before. And, of course, there are plenty of arguments over the exact licensing terms that should be used, such as GPLv2, GPLv3, CDDL, the CPL, MPL, etc., but sometimes those arguments can be a distraction from the central issue. While the legal issues that arise from the choice of license are important, at the end of the day, the most crucial issue is the development community. It is the strength and the diversity of the development community which is the best indicator for the health and the well-being of an Open Source project.

But what about end-users, I hear people cry? End users are important, to the extent that they provide ego-strokes to the developers, and to the extent that they provide testing and bug reports to the developers, and to the extent that they provide an economic justification to companies who employ open source developers to continue to do so. But ultimately, the effects of end-users on an open source project is only in a very indirect way.

Moreover, if you ask commercial end users what they value about Open Source, a survey by Computer Economics indicated that the number one reason why customers valued open source was “reduced dependence on software vendors”, which end users valued 2 to 1 over “lower total cost of ownership”. (Which is why Sun Salescritters who were sending around TCO analysis comparing 24×7 phone support form Red Hat with Support-by-email from Sun totally missed the point.) What’s important to commercial end users is that they be able to avoid the effects of vendor lock-in, which implies that if all of the developers are employed by one vendor, it doesn’t provide the value the end users were looking for.

This is why whether a project’s developers are dominated by employees from a single company is so important. The license under which the code is released is merely just the outward trappings of an open source project. What’s really critical is the extent to which the development costs are shared across a vast global community of developers who have many different means of support. This saves costs to the companies who are using a product being developed in such a fashion; it gives choice to customers about whether they can get their support from company A or company B; programmers who don’t like the way things are going at one company have an easier time changing jobs while still working on the same project; it’s a win-win-win scenario.

In contrast, if a project decides to release its code under an open source license, but nearly all the developers remain employed by a single company, it doesn’t really change the dynamic compared to when the project was previously under a closed-source license. It is a necessary but not sufficient step towards attracting outside contributors, and eventually migrating towards having a true open source development community. But if those further steps are not taken, the hopes that users will think that some project is “cool” because it is under an open-source license will ultimately be in vain. The “Generation Y”/Millennial Generation in particular are very sensitive indeed to Astroturfing-style marketing tactics.

Ok, so this is why the distinction matters. Given that it does, what terms shall we use? I still like “Organic” vs “Non-organic”. While it may not have been intended by the Mozilla Foundation, the description in their web page, “only a small percentage of whom are actual employees [of the Mozilla Foundation]“, is very much what I and others have been trying to describe. And while I originally used the description “Projects which have an Open Source Development Community” vs “Projects with an Open Source License but which are dominated by employees from a single company”, I think we can all agree these are very awkward. We need a better shorthand.

When Brian Aker from MySQL suggested “Organic” vs “Non-Organic” Open Source, and I think those terms work well. If some folks think that “Non-Organic” is somehow pejorative (hey, at least we didn’t say “genetically modified Open Source” :-), I suppose we could use Synthetic Open Source. I’m not really convinced that is any much more appetizing, myself, however.

So what would be better terms to use? Please give me some suggestions, and maybe we can come up with a better set of words that everyone is happy with.

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit

Links — 2008-04-25

The Open Source Commands
Really good ideas that companies should take to heart.
Open Source Commandments II: Passover Penguins
More really good ideas, especially for companies like Sun…
Did Canonical Just Get Punked by Red Hat and Novell?
Interesting thoughts about Linux desktop strategies
rPath to OEM SUSE Linux Enterprise Server from Novell for Appliances
I know a bunch of the folks at rPath, and I very much respect their technology; I think this is a very good thing for them.
Does Microsoft CEO Steve Ballmer need an intervention?
Does anyone think a Microsoft/Yahoo merger makes sense besides Mr. Ballmer?
Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit

Organic vs. Non-organic Open Source

Brian Aker dropped by and replied to my previous essay by making the following comment:

I believe you are hitting the nail on the “organic” vs “nonorganic” open source. I do not believe we have a model for going from one to the other. Linux and Apache both have very different models for contribution… but I don’t believe either are really optimized at this point.

Optimization to me would lead to a system of “less priests” and more inclusion.

I made an initial reply as comment, and then decided it was so long that I should promote it to a top-level post.

I assume that when Brian talks about “organic open source” what he means is what I was calling an “open source development community”. Some googling turned up the following definition from Mozilla Firefox’s organic software page: “Our most well-known product, Firefox, is created by an international movement of thousands, only a small percentage of whom are actual employees.”

This puts it in contrast with “non-organic” software, where all or nearly all of the developers are employed by one company. (And anyone who proves talented at adding features to that source base soon gets a job offer by that one company. :-) By that definition we can certainly see projects like Wine, Mysql, Ghostscript (at one time), and others as fitting into that model, and being quite successful. There’s nothing really wrong with the non-organic software model, although many of them have struggled to make enough money when competing with pure proprietary softare competitors, with MySQL perhaps being the exception which proves the rule.

In most of these cases, though, the project started more as an organic open source, and then transitioned into the non-organic model when there was a desire to monetize the project — and/or when the open source programmers decided that it would be nice if they could turn their avocation into a vocation, and let their hobby put food on the family table.

Solaris, of course, is doing something else quite different, though. They are trying to make the transition from a proprietary customer/supplier relationship to trying to develop an Open Source community — and what John’s candidate statement pointed out is that they weren’t really interested in creating an organic open source developer community at all, but they wanted the fruits of an open source community — with plenty of application developers, end-users, etc., all participating in that community.

We don’t have a lot of precedent for projects who try to go in this direction, but I suspect they are skipping a step when they try to go to the end step without bothering to try to make themselves open to outside developers. And by continuing to act like a corporation, they end up shooting themselves in the foot. For example, the OpenSolaris license still prohibits people from publishing benchmarks or comparisons with other operating systems. Very common in closed-source operating systems and databases, but it discourages people from even trying to make things better, both within and outside of the Open Solaris core team. Instead, they respond to posts like David Miller’s with “Have you ever kissed a girl?”. (Thanks, Simon, for that quote; I had seen it before, but not for a while, and it pretty well sums up the sheer arrogance of the Open Solaris development team.)

So while Linux may not be completely optimized in terms of “less priests” and more inclusion, at least over 1200 developers contributed to 2.6.25 during its development cycle. Compared to that, Open Solaris is positively dominated by “high priests” and with a “you may not touch the holy-of-holies” attitude; heck, they won’t even allow you to compare them to other religions without branding you a heretic and suing you for licensing violations!

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit

What Sun was trying to do with Open Solaris

I was recently checking to see what, if any follow-up there had been from Sun’s ham-handed handling of the Open Solaris Trademark, and I ran across this very interesting comment from John Plocher’s Candidate Statement for the Open Solaris Governing Board:

“I also think there was a misunderstanding about what Sun desired when it launched the community (in part) to encourage developers to adopt and use Solaris. My take is that, while there *is* value in getting more kernel, driver and utility developers contributing to and porting the (open) Solaris operating system, there is significantly *more* value in having a whole undivided ecosystem based on a compatible set of distributions, where application developers, university students, custom distro builders and users are all able to take advantage of each other’s work.

Put these two things together, and you can see Sun’s predicament. Sun *wanted* a community that empowered application developers, but *got* a community aimed squarely at kernel hackers. Whether you see this as the “kernel.org -vs- Ubuntu” fight, or the “fully open -vs- MySQL model” argument, in my opinion, it all is simply a reflection of the above mismatched expectations.”

So that explains why it’s take three long years to try to get basic open source development tools (such as putting Open Solaris source code in a distributed SCM located outside of the Sun firewall) for Open Solaris. It was never was Sun’s intention to try to promote a kernel engineering community, or at least, it was certainly not a high priority for them to do so. This can be shown by the fact that as of this writing they still are using the incredibly clunky requester/sponsor system for getting patches into Solaris; setting up a git or mercurial server is not rocket science. This lack explains why Linus gets more contributions while brushing his teeth than Open Solaris gets in a week.

So if you run into a Sun salescritter or a Sun CEO claiming that OpenSolaris is just like Linux, it’s not. Fundamentally, Open Solaris has been released under a Open Source license, but it is not an Open Source development community. Maybe it will be someday, as some Sun executives have claimed, but it’s definitely not a priority by Sun; if it was, it would have been done before now. And why not? After all, they are getting all of the marketing benefit of claiming that Solaris is “just like Linux”, without having to deal with any of the messy costs of working with an outside community. As a tactical measure, astroturfing is certainly a valid marketing trick. But after three years, the excuse of “just you wait a little longer, we’re just trying to figure this open source community stuff out”, is starting to wear a little thin.

Furthermore, if (as John Ploncher claims) this was about “empowering application programmers”, why was it that Sun’s first act was to trumpet how wonderful it was to release the Solaris source code under a Open Source license? This only seems to make sense if the Open Solaris initiative was really a cynical marketing tactic to try to save Solaris from being viewed as irrelevant. If that was Sun’s intention, I think it is fair to say that from a marketing point of view, the tactic has been at least partially successful — although as John has admitted, the goal of creating a full community with application developers, university students, and so on, hasn’t materialized for Open Solaris. Sun has the dream; the Linux community is living it.

However, from business standpoint, I wonder if Sun will really be able to sustain their Solaris engineering team if they will really be doing all of the work themselves, and outside contributions continue at the rate of 0.6 patches per day. After all, the margins when you are selling low-cost AMD servers are much lower than when you are selling über-expensive SPARC servers. With Linux, we have a major advantage in that kernel improvements are coming from multiple companies in the ecosystem, instead of being paid for by a single company. And given that 70-80% of Sun’s AMD servers are running Linux, not Solaris, it’s not clear how Sun justifies their Solaris engineering costs to their shareholders. Furthermore, if Solaris on x86_64 were to actually take off, there’s nothing to stop competitors from selling Solaris support — except the competitors won’t have to pay the engineering costs to maintain and improve Solaris, so they would be able to provide the support much more cheaply than Sun could. So while Sun’s marketing tactics have kept Solaris alive in some verticals, I have to question how successful Sun will be in the long-term.

Update: I’ve posted a reply to Brian Aker’s comments as a follow up that would probably be interesting to those folks who are interested in the ideas found in this essay.

Update^2: John Plocher’s name is spelled with an ‘h’, which I had omitted. My deep apologies to him for getting his name wrong. I’ve fixed it in the post.

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit

AT&T: Customer support horrors

This morning, I just wasted two hours of my life trying to deal with a bill with AT&T. I am a work-at-home employee, and my company has a contract with AT&T so that when I deal 1-700-xxx-xxxx, I can reach the internal corporate phone network. In addition, long distance calls on my home office line are billed to the company at the pre-negotiated corporate rates. I also had a (long-dormant) AT&T long distance account, dating from before I started working at this company. Starting at the beginning of the year, that account grew a $18 monthly fee. When I tried to make it go away, the AT&T consumer side of the house said that according to Verizon, that was because I had my long distance service through AT&T. Which was true, in a sense — AT&T was providing my service, but through a corporate account. But the consumer side of AT&T didn’t understand that, and were in my opinion, willfully ignorant.

Several phone calls later, I got the 1-800 number for the AT&T corporate side of the house, and those folks said they couldn’t look at consumer/personal AT&T accounts, and implied it was my fault that I had both accounts on the line. (This took over an hour while the support person tried multiple things, all of which was not helpful at all.)

I finally googled for AT&T CEO’s office, and found a number on the consumerist.com web site that claimed to be the AT&T CEO’s office. It had long since been directed to a help center, but after I told my tale, I was quickly transferred to an executive customer support person, who was able to fix the problem in ten minutes.

The only questions remaining are:

1) Why can’t the different parts of AT&T talk to one another?
2) Will this problem really be solved, or will I see another bill in a month or two and have to spend more time dealing with this mess all over again?
3) When will VOIP services put AT&T long distance out of its misery? (Given the absolute frustration of this morning, this can’t happen soon enough.)
4) Will AT&T compensate me for two hours of frustration and of my life that I will never get back. (Not bloody likely.)

One thing is for certain, it will be a long, long, LONG time before I will ever voluntarily choose AT&T to provide service for just about anything. Even an iPhone wouldn’t be enough inducement….

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit

Why I purchased the Sony PRS-505 Reader

Although I lot of people have been lauding the Kindle, I recently decided to go with the Sony PRS-505 instead. Yes, the Kindle has built-in EVDO access, and the ability to buy books without a computer, or even browse the web; and yes, the Sony has once again demonstrated it can’t create a compelling 21st century computer application to save its life. However, it had a few things that at least for me, made a better choice for me than the Kindle:

  1. The Sony is thinner — I want to be able to slip it into my laptop case and have it take the absolute minimum amount of space.
  2. The Sony simply looks much more elegant than the Kindle; steel with a leather cover simply looks a lot better than white, cheasy plastic.
  3. I’m not interested in buying a lot of DRM’ed ebooks; ergo, I won’t be buying may books from either Sony or Amazon’s web sites. It is highly likely that within 2 years I will be buying a more advanced eBook reader, possibly one with color, and I don’t want to be locked into a single format where I have to go and repurchase all of my books just because some the latest and greatest eBook reader uses an incompatible DRM technology from whatever Sony or Amazon has used.
  4. The Sony is $100 cheaper. Given that something better will be available within 2-3 years at the very most, and possibly sooner, I’m just not interested in spending $400 on a first generation prototype.
  5. Perhaps the most important, the Sony has the really, really good open source support. Kovid Goyal’s libprs500 project supports the Sony PRS-500 and PRS-505, and has very good version tools, allowing people to convert eBooks previously stored in HTML, PDF, TXT, Microsoft Reader (.lit), IDPF/Open eBook (.epub) into Sony’s format. And with a little bit of work, it does a very, very good job with the conversion. Better yet, its ability to convert multiple HTML pages into a single eBook, with credible table of contents, means that libprs500 can pull down the New York Times, the Economist, etc., automatically format it into a single eBook which you can save onto your Sony Reader, and then read it while you are on the airplane. No muss, no fuss. I can also take various books that are available on the web as HTML and also convert them into an eBook which can be used by the Sony Reader very easily.

This last point is I believe one of the best reasons why the Sony Reader will be able to compete very successfully with the Kindle. The libprs500 software is written as a python application, and it will work on Windows, Linux, and MacOS — and its GUI user interface is far better than the truly pathetic Sony Connect software. Score one for Open Source! In my opinion, Sony should send a very nice gift certificate to Kovid as a thank you; his open source project has added an immeasurable amount of value to their product.

The only thing that you can’t do using the libprs500 software is buy DRM’ed books which are locked to the Sony Reader — but that isn’t something that many people will be particularly interested in, I suspect. OK, I did buy Pillars of the Earth, which was available on the Sony site for $6 dollars — hmm, cheaper than Amazon’s $9.99 — but that was an investment I was willing to flush down the toilet when the PRS-505 becomes obsolete, mainly so I could test what buying a DRM’ed book would be like from the Sony web site. But I probably won’t be buying many books with DRM that way. On the other hand, I am quite willing to spend quite a bit more money on non-DRM’ed books from publishes such as Baen Books.

Here’s to the hope to the publishing industry figures things out faster than RIAA’s member companies. In the meantime, I will be mostly pretending that the both the Sony and Amazon eBook stores with their proprietary DRM’ed books don’t exist…

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit

Does perfect code exist? (Abstractions, Part 1)

Bryan Cantrill recently wrote a blog entry, where among other things, he philosophized on the concept of “perfect code”. He compares software to math, arguing that Euclid’s greatest common denominator algorithm shows no sign of wearing out, and that when code achieves perfection (or gets close to perfection), “it sediments into the information infrastructure” and the abstractions defined by that code becomes “the bedrock that future generations may build upon”. Later, in the comments of his blogs, when pressed to give some examples of such perfection, he cites a clever algorithm coded by his mentor to divide a high resolution timestamp by a billion extremely efficiently, and Solaris’s “cyclic subsystem”, a timer dispatch function.

Watching his talk at Google where his introduction and sidebar book review on Scott Rosenberg’s “On Dreaming in Code”, it’s clear that he very passionately believes that it is possible to write perfect code, and that one should strive for that at all times. Perhaps that’s because he mostly writes code for operating systems, where the requirements change slowly, and for one OS in particular, Solaris, which tries far harder than most software projects to keep published interfaces stable for as long as possible. In contrast, the OS I’ve spent a lot of time hacking around, Linux, quite proudly states that at least inside the kernel, interfaces can not and should not be stable. Greg Kroah-Hart’s “Stable API Nonsense” is perhaps one of the strongly and most passionate expositions of that philosophy.

I can see both sides of the argument, and in their place, both have something to offer. To Bryan’s first point, it is absolutely true that interfaces can become “bedrock” upon which entire ecosystems are built. Perhaps one of the most enduring and impactful example would be the Unix programming interface, which has since become enshrined by POSIX.2 and successor standards. I would argue, though, that it is the interface that is important, and not the code which initially implemented it. If the interface is powerful enough, and if it appears at the right time, and the initial implementation is good enough (not perfect!), then it can establish itself by virtue of the software which uses it becoming large enough that it assumes an important all out of scale with its original intention.

Of course, sometimes such an interface is not perfect. There is an apocryphal story that when S. Feldman at AT&T labs first wrote the ‘make’ utility, that he did so rather quickly, and then put it available for his fellow lab members to use, and then went home to sleep. In some versions of the story he had stayed up late and/or pulled an all-nighter to write it, so he slept a long time. When he came back to work, he had come up with a number of ways to improve the syntax of the Makefile. Unfortunately, (so goes the story) that too many teams were already using the ‘make’ utility, so he didn’t feel he could change the Makefile syntax. I have no evidence that this ever took place, and I suspect it is an urban myth that was invented to explain why Makefiles have a rather ugly and unfortunate syntax that many would call defects, including the use of syntactically significant tab characters which are indistinguishable from other forms of leading whitespace.

Another example which is the bane of filesystem designers everywhere are the Unix readdir(2), telldir(2), and seekdir(2) interfaces. These interfaces fundamentally assume that directories are stored in linear linked lists, and filesystems that wish to use more sophisticated data structures, such as b-trees, have to go to extraordinary lengths in order support these interfaces. Very few programs use telldir(2) and seekdir(2), but some filesystems such as JFS maintain two b-trees instead of one just to cater to telldir/seekdir.

And yet, it is absolutely true that interfaces can be the bedrock for an entire industry. Certainly no matter what its warts, the Unix/Posix interface has proven the test of time, and it has been responsible for the success of many a company and many billions of dollars of market capitalization. But is this the same as perfect code? No, but if billions of dollars of user applications are going to be depending on that code, it’s best if code which implement such an interface be high quality, and should attempt to achieve perfection.

But what does it mean for code to be perfect? For a particular environment, if the requirements can be articulated clearly, I can accept that code can reach perfection, in that it becomes as fast as possible (for the given computer platform), and it handles all exception cases, etc., etc. Unfortunately, in the real world, the environment and the requirements inevitably change over time. For example, Bryan’s cyclic subsystem, which he proudly touts as being if not perfect, almost so, and which executes at least 100 times a second on every Solaris system in the world. I haven’t looked at the cyclic system in any detail, since I don’t want to get myself contaminated (the CDDL and GPLv2 licenses are intentionally incompatible, and given that companies — including Sun — have sued over IPR issues, well, one can’t be too careful), but waking up the CPU from its low-power state 100 times a second isn’t a good thing at all if you are worried about energy conservation in data centers — or in laptops.

For example, on my laptop, it is possible to keep wakeups down to no more than 30-35 times a second and it would be possible to do more but for an abstraction limitation. Suppose for example an process wants to be sleep and then receive a wakeup 1 milliseconds later, and so requests this via usleep(). At the same time, another application wants to sleep until some file descriptor activity takes place, or after 1.2 milliseconds takes place. 0.3 milliseconds later, a third process requests a sleep, this time for 0.8 milliseconds. Now, it could be that in all of the above cases, the applications don’t actually need exact timing; if they all get their wakeups plus or minus some fraction of a millisecond, they would be quite cool with that. Unfortunately, the standard timer interfaces have no way of expressing this, and so the OS can’t combine the three wakeups at T+1.0, T+1.1, and T+1.2 milliseconds into one wakeup at T+1.1ms.

So this is where Greg K-H’s “Stable API Nonsense” comes into play. We may not be able to solve this problem at the userspace level, but we darn well can solve this problem inside the kernel. Inside the kernel, we can change the timer abstraction to allow device drivers and kernel routines to provide a timer delta plus a notion of how much accuracy is required for a particular timer request. Doing so might change a timer structure that previously external device drivers had depended upon — but too bad, that’s why stable ABI/API’s are not supported for internal kernel interfaces. Is this that the interface could have been extended? Well, perhaps, and perhaps not; if an interface is well designed, it is possible it can be extended in an API and/or ABI compatible way. There usually is a performance cost to doing so, and sometimes it may make sense to pay that that cost, and sometimes it may not. I’ll talk more about that in a future essay.

Yet note what happened to the timer implementation. We have a pretty sophisticated timer implementation inside Linux, that uses heap data structures and buckets of timers for efficiency. So while I might not call it perfect, it is pretty good. But, oops! Thanks to this new requirement of energy efficiency, it will likely need to get changed to support variable levels of accuracy and the ability to fire multiple timers that are (more or less) coming due in a single CPU wakeup cycle. Does that make it no longer perfect, or no longer pretty good? Well, it just had a new requirement impact the code, and if criteria for perfection is for the abstraction defined by the code to be “bedrock” and never-changing, and no need to make any changes in said code over multiple years, I would argue that little to no code, even OS code, can ever achieve perfection by that definition — unless that code is no longer being used.

(This is the first of a multi-part series of essays on abstractions that I have planned. The next essay which I plan to write will be entitled “Layer surfing, or jumping between layers”, and will be coming soon to a blog near you….)

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • Digg
  • del.icio.us
  • Technorati
  • Reddit