Empathy Box: living la vida obscura
http://empathybox.com

Snide commentary on the narrowness of masculine gender roles or vapid 80’s pop tune?
http://empathybox.com/archives/17
Mon, 10 Nov 2008 00:30:07 +0000, by jay

From the chorus to the Berlin song “Sex”:

Man: I’m a man
Girl: I’m a goddess
Man: I’m a man
Girl: Well I’m a virgin
Man: I’m a man
Girl: I’m a blue movie
Man: I’m a man
Girl: I’m a bitch
Man: I’m a man
Girl: I’m a geisha
Man: I’m a man
Girl: I’m a little girl
Man: I’m a man
Girl: I’m a boy
Man: I’m a man
Girl: Well I’m your mother
Man: I’m a man
Girl: I’m a one night stand
Man: I’m a man
Girl: Am I bi
Man: I’m a man
Girl: I’m a slave
Man: I’m a man
Girl: I’m a little girl
Man + Girl: And we make love together…

Somehow the girl gets to be all this kinky shit, but the guy is just like “I’m a man”.

Books FTW!
http://empathybox.com/archives/14
Sun, 02 Sep 2007 06:34:15 +0000, by jay

After several failed attempts to limit our book-buying habit, I think my wife and I have agreed to just look for a bigger apartment.

New ones:

But wait, perhaps we can optimize it further!
http://empathybox.com/archives/13
Fri, 27 Oct 2006 07:05:30 +0000, by jay

At work today I saw this little beauty at the end of a method with void return type:

if(true)
    return;

I thought it was quite wise of him to check that. Incidentally, the author of this little koan now works at Google.

Obviously his parents hated him…
http://empathybox.com/archives/12
Wed, 25 Oct 2006 08:08:06 +0000, by jay

Quick question: if your last name was Mountjoy would you really go by “Dick” instead of “Richard”? You would if you were California State Senate candidate and self-proclaimed “Immigration Control Consultant” Dick Mountjoy. His website asks that you add him to your prayer list. I am not the church-going type, but I am thinking that would be an awkward moment: “Dear lord we ask that you bless Dick Mountjoy…”

The Ideas in the Spring Framework
http://empathybox.com/archives/11
Tue, 17 Oct 2006 07:53:49 +0000, by jay

The proponents of new technologies always seem to see themselves as involved in some kind of populist rebellion. For example, judging from rhetoric, I am pretty sure that Ruby programmers see themselves something like this:



I think the underlying reason these little wars go on is that so few people are in a position to understand the trade-offs involved in a technology choice, because understanding the trade-off requires understanding both alternatives. It takes a great deal of experience, and the more fundamental the change, the longer it takes. It is easy for me to like, for example, Haskell: since I have never had to build a large, reliable, non-trivial piece of software in it, I only have to see the benefits.

In any case Java had its own little rebellion not so very long ago, producing a slew of “light-weight” technologies (Java’s conception of what is light-weight is very different from anyone else’s perspective so we had better keep that in the quotes). The whole thing of periodically turning against the tools we use is a bit counter-productive and kind of juvenile. But I am willing to accept a bi-yearly cycle of ecstasy and loathing to work in an industry with a little passion. Probably foremost in this mini-rebellion was the Spring Framework, which bills itself as a sort of add-on to J2EE. The core idea in Spring is “dependency injection,” which is billed as a better way of building applications out of loosely-coupled components.

It is very difficult to evaluate the effect of an idea like this, which claims to improve software architecture, because small programs have very little need for anything like architecture. As a consequence it is very difficult to explain the advantage of the new idea, because the advantage only becomes significant when you have a program of a certain size. So evaluating this kind of idea requires implementing a large system multiple times in different ways to see which is better. It should be no surprise that this takes years (the best example is the microkernel debate, where, many complete production quality OSes later, we still have both).

In any case, Spring remains quite popular and just released version 2.0. I have been using Spring for a while and I think I am ready to start reflecting on how good an idea the whole dependency-injection-as-core-architecture choice really is. For the record I think Spring is about as good as you can do in Java-land right now. But there is a whole world of difference between “better than others” and “good”.

For those who haven’t stumbled across it already the idea of dependency injection is to make component dependencies be parameters of your objects rather than hard coding them. Thus if your program does logging (for example) using some home grown system you might do the following:


  class Example{
    // The implementation is hard-coded at the point of initialization
    private Logger logger = new FileSystemLogger();

    public void doSomething(){
       logger.info("Doing something...");
       ...
    }
  }

The good news is that your class mostly only depends on the Logger interface (say, info, error, warn, etc.); the bad news is that you have hard-coded an implementation of Logger, FileSystemLogger, when you initialize the logger. There are a lot of ways to abstract away this dependency, but the absolute simplest is the following:


  class Example{
    private Logger logger;
    
    public Example(Logger logger){
        this.logger = logger;
    }

    public void doSomething(){
       logger.info("Doing something...");
       ...
    }
  }

In this way whoever instantiates the Example class will have to provide the Logger implementation, and Example itself no longer depends on any particular implementation.
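One immediate payoff is that a caller can hand in a different Logger without touching Example. The following is a minimal sketch of that, assuming Logger is an interface with a single info method (the post only names its operations) and using a hypothetical RecordingLogger that keeps messages in memory:

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the Logger interface; the post only names its methods.
interface Logger {
    void info(String message);
}

// The constructor-injected class from the example above.
class Example {
    private final Logger logger;

    Example(Logger logger) {
        this.logger = logger;
    }

    void doSomething() {
        logger.info("Doing something...");
    }
}

// Hypothetical stand-in implementation: records messages instead of writing a file.
class RecordingLogger implements Logger {
    final List<String> messages = new ArrayList<>();

    @Override
    public void info(String message) {
        messages.add(message);
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        RecordingLogger logger = new RecordingLogger();
        new Example(logger).doSomething();
        System.out.println(logger.messages);
    }
}
```

Nothing in Example changed to make this possible; the choice of implementation moved entirely to the code that constructs it.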

This example hardly does much justice to the idea; you have to imagine you have a large collection of services bound together in this way, and then you can begin to imagine the benefits.

But in Spring the dependency injection idea is co-mingled with a few other ideas:

  1. XML configuration
  2. Aspect-oriented programming
  3. Fixing J2EE
  4. Programming to Interfaces

Fixing J2EE is where Spring is most clearly successful. If you are using straight-up JDBC you can switch to Spring’s JdbcTemplate and write 1/5th the JDBC code immediately and get free transaction handling too (which now comes with a nifty little @Transactional annotation).

The XML configuration bit is more debatable. In Spring one would implement the above example as follows [1]:

  <bean id="logger" class="FileSystemLogger"/>
  <bean id="myExample" class="Example">
            <constructor-arg index="0"><ref local="logger"/></constructor-arg>
  </bean>
  

Actually the XML itself is a terrible way of wiring things together when compared with Java. XML won’t enforce type-correctness, and it requires the programmer to learn a whole new set of Spring-specific semantics. Rather than having an applicationContext.xml file with Spring configuration XML, it is vastly simpler to have the following:

  class ApplicationContext{
    ApplicationContext(){
      // Plain Java wiring: the compiler checks every type and argument
      Logger logger = new FileSystemLogger();
      Example myExample = new Example(logger);
      ...
    }
  }

I thought I had invented this idea when I first thought of it, but a little research shows that it turns up at the end of Martin Fowler’s essay on Inversion of Control, and a very intelligent discussion of the details of using Java-based dependency injection is given in this article. The primary advantage of using code is that the flow of control in the application remains fully documented and understandable. It is well known that the Spring configuration for a large project can become a problem in and of itself. Spring has mechanisms within it to help avoid configuration file madness, but frankly they cannot compete with the mechanisms in Java itself (e.g. type-checking, inheritance, etc.).

The claimed advantage of a configuration file over source code is that one can change the configuration without having to recompile. It is true that many organizations treat configuration changes less seriously than code changes, but that is because these organizations only support primitive configuration (e.g. string or number values). Spring allows you to completely rearrange the flow of control via configuration (e.g. disable all transactionality, say, or in general string together completely untested software combinations). It is true that simple properties are often maintained via dependency injection, but these are typically externalized to a properties file just as they would be without dependency injection. I doubt most organizations that use Spring seriously allow application context XML changes without at least cursory QA verification, in which case a new build is not really much of an issue.

So I can’t say much for the XML part of Spring. I still think programs are better off as code.

The next idea in the Spring medley is the idea of programming to interfaces. The phrase programming to interfaces is much older than Java itself, but when Java programmers use it, it is not entirely clear what they mean. That is, there is programming to interfaces, and then there is programming to Java interfaces. Programming to interfaces means creating a well-defined set of publicly accessible operations for your object and using only those operations. It is an excellent idea and you can practice it in virtually any programming language (for example the Linux kernel does a lot of programming to interfaces–e.g. the filesystem interface). But the Spring people seem quite convinced that you should program to Java interfaces, that is, you should create a Java interface for, well, everything. The only benefit of this practice is that it will make your project look very complicated, which may impress your coworkers. If you are being paid on a per-line-of-code basis you will incur a nice little 10% bonus. The downside is that, oh yes, there will be more code. And every time you add a parameter to some method in class A you will also need to add it to interface A. In other words this advice boils down to “repeat yourself unnecessarily”. This kind of thing is particularly irksome in the early stages of development when interfaces are evolving. I will go ahead and claim that there is not one single advantage in application development of programming to Java interfaces over the following:

  1. You need, oh I don’t know, say a logging class, and you suspect that in the future there may be many logging implementations.
  2. You create a class Logger.java which contains the implementation you initially plan to use. You put thought into the operations possible on a Logger, even considering the future implementations you may need. You do not create any interfaces.
  3. You check-in your code and release this version of your project.
  4. 95% of the time it ends here, but 5% of the time it becomes the case that logging becomes much more important, and you need to support more ways of logging.
  5. You use the refactoring functionality in your IDE to separate Logger.java into an interface Logger.java and an implementation FileSystemLogger.java. This takes all of 3 seconds. Actually this is a nice little advantage of the Java convention of naming classes with implementation-specific details (e.g. ArrayList), and naming interfaces with a simple generic name (e.g. List).

Notice how 95% of the time creating the extra interface up front is just a waste of time, and the 5% of the time it isn’t, you lose those 3 seconds required to fire up the IDE.
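For concreteness, here is roughly where step 5 lands. This is only a sketch: the single info method, the file-appending body, and the InMemoryLogger and Worker classes are assumptions made up for illustration, not anything from a real project. The point to notice is that Worker, which only ever mentioned the name Logger, reads the same before and after the split.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// After the refactoring, Logger.java holds only the operations callers already used.
interface Logger {
    void info(String message);
}

// The original implementation, moved under an implementation-specific name.
class FileSystemLogger implements Logger {
    private final Path file;

    FileSystemLogger(Path file) {
        this.file = file;
    }

    @Override
    public void info(String message) {
        String line = "INFO " + message + System.lineSeparator();
        try {
            // Append to the log file, creating it on first use.
            Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// The hypothetical second implementation that motivated the split.
class InMemoryLogger implements Logger {
    final List<String> lines = new ArrayList<>();

    @Override
    public void info(String message) {
        lines.add("INFO " + message);
    }
}

// Caller code: identical source text before and after the refactoring.
class Worker {
    private final Logger logger;

    Worker(Logger logger) {
        this.logger = logger;
    }

    void run() {
        logger.info("working");
    }
}
```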

Now this doesn’t always hold true. It may be that you need to support two implementations right from the get-go, in which case a Java interface is exactly what the doctor ordered. It also might be that you are writing a library, not an application–which is totally different. A library is full of code that is meant to be used without changing the code itself, and so it has to support any kind of reasonable extension. In this case up front interfaces are a requirement. But just because you are often annoyed that, say, the Java standard library didn’t do this, doesn’t mean you should go littering libraries throughout your own application. I think one of the reasons for the complexity of many Java projects is that the programmers have taken too much advice from library developers (who are usually very good programmers and worth listening to) and they have started to act as if they did not have access to their own code. This is why your project at work has 4 trillion lines of configuration, any serious change to which will end in breakages that you will only be able to detect at runtime.

Part of this comes from the fixation many programmers have with object-oriented programming. To put this simply let me rank three things from best to worst:

  1. Simple requirements
  2. Complex requirements satisfied with two separate polymorphic implementations (e.g. an interface and two implementations)
  3. Complex requirements satisfied with a morass of if statements and case-wise logic.

When you have multiple states of a system, you have a potentially combinatorial explosion of system states. That is, if you have 10 interfaces with 3 implementations each then you have 3^10 = 59,049 combinations to test. It doesn’t matter if you use the complex set of case-wise logic or the clean programming-to-interfaces style–the complexity is still potentially exponential.

Fortunately Spring only recommends that you make all these interfaces, you don’t actually have to. I think the reason must be that it was an early technical limitation in Spring before cglib. As usual when defending a technical limitation people pretend it is not a limitation at all…

Finally we come to Aspect-Oriented Programming. First of all, I think the Spring people are right that the fine-grained aspect orientation (e.g. in AspectJ) is not worth the complication; all that is really needed is the ability to wrap methods with reusable services. But how hard is it, really, to proxy a method? Before we go any further, let’s look at how easy this is to do if one has access to first-class functions. Here is an implementation in pseudo-javascript:

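  function makeBeforeAdvisedFunction(fun, advice){
    // Run the advice with the caller's arguments, then call the wrapped function with them
    return function(){ advice.apply(this, arguments); return fun.apply(this, arguments); }
  }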

See how it works? You give a function, and you get back a function which works exactly the same, but with advice applied.
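For comparison, the same wrapper can be sketched in Java using the lambdas and java.util.function types that arrived with Java 8, well after this was written. The names Advice and before are hypothetical, and the sketch handles only one-argument functions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

class Advice {
    // Returns a function that first runs `advice` on the argument, then delegates to `fun`.
    static <T, R> Function<T, R> before(Function<T, R> fun, Consumer<T> advice) {
        return arg -> {
            advice.accept(arg);
            return fun.apply(arg);
        };
    }
}

class AdviceDemo {
    public static void main(String[] args) {
        List<String> auditTrail = new ArrayList<>();
        Function<String, Integer> length = String::length;

        // Same behaviour as `length`, plus an audit record for every call.
        Function<String, Integer> audited =
                Advice.before(length, s -> auditTrail.add("called with " + s));

        System.out.println(audited.apply("hello"));
        System.out.println(auditTrail);
    }
}
```

Applying the advice is one explicit call at the point where the function is built, which is the trade-off the next paragraph weighs against pattern-based AOP.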

Of course this isn’t quite as powerful as Spring AOP, which allows you to apply advice to many methods with a single regular expression. But is this really necessary? I mean if applying the advice is all of 10 extra characters then is it such a big deal to add it explicitly? Furthermore, isn’t it beneficial to have everything right there, explicitly spelled out, rather than having code invisibly inserted at runtime?

I don’t really know the answer to this. I suspect that the phrase Aspect-Oriented Programming is a complete misnomer, since your project is unlikely to become Aspect-Oriented. It may feature Aspects for say four or five essential services, but the idea that it would become completely oriented towards aspects is a complete oversell[2]. The whole point is that there are a few key things repeated all over. The ones that almost all programs share are transaction management, caching, audit trails (recording all the changes to an object), and security checks, plus a few that are probably domain specific.

A truly aspect-oriented program sounds like a disaster. How would one debug such a beast if it were composed of bits and pieces of code glued together from all over without any clear control-flow? But for the specific examples above it is a compellingly non-invasive way to add reusable services on top of an object.

[1] Actually Spring advocates using setter methods instead of constructor arguments for setting properties. They argue that this is more flexible. I agree that it is more flexible, but I think it is more dangerous. First it forces you to implement setter methods for properties, even though it may be either meaningless or disastrous if anyone calls those setter methods after initialization (for example what happens if someone calls setDataSource on your data-access object in the middle of a transaction…I don’t know, but I can’t think of what a correct behavior is in this case). Secondly it makes it very troublesome to ensure the validity of your object because one cannot enforce any requirement on the complete set of parameters. This deficiency is partially overcome by the @Required annotation available in Spring 2.0. But of course this means every property must be either required or not; it can’t be the case that one either provides A or provides B.
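A constructor, by contrast, sees all the parameters at once and can enforce a rule like “A or B, not both”. A minimal sketch, assuming a hypothetical ConnectionSettings class that accepts either a JNDI name or a JDBC URL (both names are invented for illustration):

```java
// Hypothetical settings object: callers give a JNDI name or a JDBC URL, never both.
final class ConnectionSettings {
    private final String jndiName;
    private final String jdbcUrl;

    ConnectionSettings(String jndiName, String jdbcUrl) {
        boolean hasJndi = jndiName != null;
        boolean hasUrl = jdbcUrl != null;
        // Reject "neither" and "both" in one check, before the object can be used.
        if (hasJndi == hasUrl) {
            throw new IllegalArgumentException("Provide exactly one of jndiName or jdbcUrl");
        }
        this.jndiName = jndiName;
        this.jdbcUrl = jdbcUrl;
    }

    boolean usesJndi() {
        return jndiName != null;
    }

    String describe() {
        return usesJndi() ? "jndi:" + jndiName : "url:" + jdbcUrl;
    }
}
```

Because the fields are final and there are no setters, the object cannot drift into an invalid combination after construction.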

[2] In addition to a tendency to oversell itself the whole Aspect-Oriented thing is completely terminology heavy, and the terminology is particularly wretched (point-cuts, join-points, etc.). It’s a pretty off-putting combination of commercial salesmanship and the pseudo-academic phraseology they developed. But you know, we shouldn’t let that kind of thing sink good ideas if they are there.

Special Topics In Calamity Physics
http://empathybox.com/archives/9
Tue, 10 Oct 2006 08:40:43 +0000, by jay

Just finished reading this awesome Prep School/Coming Of Age/Thriller thing. Don’t start it unless you have some time, though. I really did nothing but read for about three days.

Next up: The Emperor’s Children, which should round out my trendy bestselling lit indulgence.

5 Principles For Programming
http://empathybox.com/archives/8
Mon, 09 Oct 2006 07:51:15 +0000, by jay

Here are a few things I have learned about programming computers, in no particular order. I didn’t invent any of them, and I don’t always follow them. But since nobody seems to know very much about making good software, it makes sense to try to distill a little wisdom when possible.

Fail Fast

Check for programming errors early and often, and report them in a suitably dramatic way. Errors get more expensive to fix as the development process progresses–an error that the programmer catches in her own testing is far cheaper than one the QA tester finds, which is in turn far cheaper than the one your largest customer calls to complain about. The reason this matters is that the cost of software comes almost entirely from the errors. To understand why this is, consider writing code in the following manner: you are assigned some feature, you type up a complete implementation all in one go, then you hit compile for the first time. (I TA’d a beginning programming class in grad school and this is not very different from how beginning programmers insist on working.) The point is that if you have any experience writing software you know that if getting to the first compile required n man-hours, then the time required to have shippable code is probably between 2n and 100n man-hours, depending on the domain. That time will be divided between the programmer’s own bootstrap testing, QA time (and the associated bug fixing), and perhaps some kind of beta.

The classic examples of this principle are type-checking, unit testing, and the assert statement. When I first learned about the assert statement I couldn’t accept that it was useful. After all, the worst thing that can happen is that your code can crash, right? And that is exactly what the assert statement causes. For all the hoopla about unit testing, you would think that it was something deeper than just a convention for where to put your assert statements. But software development is in such an infantile stage that we shouldn’t poke fun: unit testing, for all the child-like glee of its proponents, may well be the software engineering innovation of the decade.

You see the violation of the fail fast principle all the time in Java programming where beginning programmers will catch (and perhaps log) exceptions and return some totally made up value like -1 or null to indicate failure. They think they are preventing failures (after all whenever an exception shows up in the logs it is a problem, right?) when really they are removing simple failures and inserting subtle time-sucking bugs.
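To make the contrast concrete, here is a small sketch of both styles. The PortParser class and its methods are hypothetical names chosen for illustration; only Integer.parseInt and the exception types are standard Java.

```java
class PortParser {
    // The anti-pattern: hide the failure behind a made-up value.
    static int parsePortSwallowing(String text) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            return -1; // every caller must now remember that -1 means "broken"
        }
    }

    // Fail fast: bad input stops the program where the problem is detected.
    static int parsePort(String text) {
        int port = Integer.parseInt(text.trim()); // throws NumberFormatException on bad input
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return port;
    }
}
```

With the first version a typo in a config file surfaces later as a mysterious connection to port -1; with the second, the stack trace points at the line that read the bad value.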

Unfortunately the idea of failing fast is counter-intuitive. People hate the immediate cause of pain not the underlying cause. Maybe this is why you hear so many people say they hate dentists, and so few say they hate, I don’t know, plaque. This is a lot of what irritates people about statically typed languages–when the compiler complains we hate the compiler, when the program does what we say and crashes we hate ourselves for screwing up–even when it is an error that could have been discovered by a more vigilant compiler.

This is why I can’t work myself into the same first-kiss level of ecstasy others manage over languages like Ruby[1]. Dynamic code feels great to program in. After the first day you have half the system built. I did a huge portion of my thesis work in Python and it was a life saver. Thesis work doesn’t need to be bug free, it is the quintessential proof-of-concept (and yet so many CS students, when faced with a problem, break out the C++). But I have also worked on a large, multi-programmer, multi-year project, and this was not so pleasant. A large dynamically typed code base exhibits all the problems you would expect: interfaces are poorly documented and ever changing, uncommon code paths produce errors that would be caught by type checking, and IDE support is weak. The saving grace is that one person can do so much more in Python or Ruby that maybe you can turn your 10 programmer program into three one programmer programs and win out big, but this isn’t possible in a lot of domains. It is odd that evangelists for dynamic languages (many of whom have never worked on a large, dynamically-typed project) seem to want to deny that static type-checking finds errors, rather than just saying that type-checking isn’t worth the trouble when you are writing code trapped between a dynamically typed database interface and a string-only web interface.

Syntax highlighting (and auto-compilation) in IDEs is another example of this principle, but on a much shorter timescale. Once you have become accustomed to having your errors revealed instantaneously it is painful to switch back to having to wait for a compiler to print them out in bulk.

Write Less Code (and Don’t Repeat Yourself)

This is perhaps the most important and deep principle in software engineering, and many lesser principles can be derived from it. Somehow simple statements/programs/explanations/models are more likely to be correct. No one knows why this is; perhaps it is some deep fact about the universe, but it seems to be true.

In software this comes into play as bugs: longer programs have a lot more bugs so longer programs cost more.

Worse, difficulty seems to scale super-linearly as a function of lines of code. In the transition from Windows XP to Vista the codebase went from 40 million to 50 million lines of code. To do this took 2,000 of the world’s best software engineers 5 years of work.

The reason for this is that the only way to get real decreases in program size (decreases of more than a few characters or lines) is to exploit symmetry in the problem you are solving. Inheritance is a way to exploit symmetry by creating type hierarchies, design patterns are an attempt to exploit symmetry of solution type. Functional languages are still the king of symmetry (all of lisp is built out of a few primitive functions). But rather than categorize these by the mechanism of the solution, it is better to think of them as what they are: ways to write less code.

The best way of all to avoid writing code is to use high quality libraries. The next time you find yourself writing a web application reflect on how little of the code executing is really yours, and how much belongs to the Linux kernel, Internet Explorer, Windows XP, Oracle, Java, and the vast arrays of libraries you rely on (“ahh the old hibernate/spring/JSF/MySQL solution, so lightweight…”).

Some of the difficulties of large programs are technical but many are sociopolitical. Microsoft is the size of a medium sized country by income and the size of at least a small city by head-count. Yet it is run in much the same way as any 200 person company, namely some guy tells some other guy what to do, and he tells you. Unfortunately they have found that none of these things work at that scale, and I don’t think anyone has a really good idea of how to fix them.

Your problem doesn’t require a small city to produce, but the principle is the same. If your solution requires double the man-power then all the organizational overhead will have to be developed to handle this. Furthermore the organization will be composed of computer programmers who are often at roughly the same level of interpersonal sophistication as that sling blade guy.

What is remarkable, though, is that to make the solution small means also making it clear. I think that this has mostly to do with human brains. We can only think one or maybe two sentences worth of thought at a time, so finding the concepts that make your solution one sentence is essential. The famous Haskell quicksort is a perfect example of this. I can’t help but feel jealous of the computer science students ten years from now who will see algorithms presented in that way. (If you don’t read Haskell the program just says: “An empty list is quicksorted. The quicksort of a non-empty list is the concatenation of (1) the quicksort of list elements less than the first element, (2) the first element itself, and (3) the quicksort of list elements greater than the first element.” Though, of course, the Haskell version is much briefer.)

Computer Programs Are For People

“We want to establish the idea that a computer language is not just a way of getting a computer to perform operations but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute.”
The Structure and Interpretation of Computer Programs

The wonderful thing about the above quote is that it gets less brave and more obvious every year. We know that c/java/lisp/haskell have not one bit of power that isn’t in simple assembly language–they only allow us to express the ideas more clearly and prevent certain kinds of stupid mistakes. There is no program that can be written in one that can’t be written in another, and all end up as machine instructions sooner or later (some at compile time, some at run time, but no matter). Given this fact it should be obvious that the only reason to have a programming language is to communicate to a person. Don Knuth wrote about this idea, calling it Literate Programming, and created a system called WEB which was a great idea mired in a terrible implementation[2]. The idea was to embed the program in an essay about the program that explained how it worked. The Sun programmers simplified this in some ways with Javadoc, but still something was lost, since it is very hard to get any of the big ideas out of Javadoc, or even to know where to start reading. Projects always have two links: one for Javadoc and one for a higher level documentation which is written in some other system. WEB created a linear narrative to describe a program that might not be quite so straight-forward; Javadoc creates a flat list of documentation with no beginning, end, or summary. Neither is quite what I want.

It is a small tragedy that programmers who spend so much time trying to understand obscure code, and so much time creating new languages to write more obscure code in, spend so little time coming up with the WYSIWYG version of WEB that makes program source input look like the beautiful WEB/TEX output.

I think the best ideas in object-oriented programming also fall under this category. The methodology of solving a problem by creating a domain model to express your problem in is the best example. The point of such a model is to create that level of abstraction which exists wholly for people to think in. Not all problems are easily broken by such a method but many are. The concept of encapsulation (you know, the reason you type private in front of all those java variables) is another example. Both of these are to make things simpler, more usable, to put it simply, more human.

A sort of corollary of writing computer programs for people, writing less code, and solving the general problem is the following: write short functions. If I could transmit only a single sentence to the programmers of tomorrow which summed up everything I knew it would be that: write short functions[3]. When you write short functions you are forced to break the code into logical divisions and you create a natural vocabulary built out of function names. This makes the code an easily readable, easily testable, set of operations. When you have done this it becomes possible to see the duplication in your code and you start solving the more general problems. It is sad that the best piece of software engineering advice that I know of is to write short functions, but, well, there it is. Don’t spend it all in one place.

Do The Right Thing

This principle sounds funny when stated directly, after all who is advocating doing the wrong thing? And what is the right thing to do, anyway?

The point is that in the process of developing software I am always facing the following situation: I can cut corners now, hack my way around a bug, add another special case, etc. OR I can try to do the right thing. Often I don’t know what the right thing is, and then I don’t have a choice but to guess. But more often, I know the best solution but it requires changing things. In my experience every single factor in the software development process will argue for doing the wrong thing: schedules, managers, coworkers, and even, when they get involved, customers. All of these groups want things working as soon as possible, and they don’t care what is done to accomplish that. But no one can see the trade-off being made except for the programmers working on the code. And each of these hacks seems to come back like the ghost of Christmas past in the form of P0 bugs, and I end up doing the right thing then, under great pressure and at higher cost than I would have done it before.

A lot of times doing the right thing means solving the more general problem. And it is an odd experience in computer programming that often solving the more general problem is no harder than solving the special cases once you can see the general problem.

There is a lot of advice that argues the opposite. This line of thought says just throw something out there, then see how it is conceptually broken, then fix it. This argument is perfectly summarized in the “worse-is-better” discussion, and I don’t have much to add to it. Except to say this: I think that worse-is-better is a payment scheme. With worse-is-better you get 85% of a solution for dirt cheap, and the remaining 15% you will pay in full every month for the rest of your software system’s life. If you are writing software that you know will be of no use tomorrow, then worse-is-better is a steal (but you might want to consider quitting your job). If you are writing software that will last a while you should do the right thing.

Reduce State

You may have heard that Amdahl’s law is the new Moore’s law and that by the time Microsoft finishes the next version of Windows computers will have like holy shit 80 fucking cores. This means that in five years when your single threaded program is going full tilt boogie it will be using all of 1/80th of the processor on the machine. As a semi-recent computer science grad student my opinion of concurrency is “neato.” But I notice the old timers at work have more of a “we are so totally fucked” look about them when they talk about it. I think the reason is this: if x is a mutable object then the following doesn’t necessarily hold in a multithreaded program:

x.equals(x)

Like everyone else, I am guessing that the end game for all this will be a massive reduction in mutable state. Those functional language people are definitely on to something. But those of us who still have to program in dysfunctional languages during the day need a more gradual path. The question is, if mutable state is bad, is less mutable state less bad? or do I have to get rid of it all?

I don’t know the answer to this. But for a while now, I have been trying the following: whenever possible, avoid class variables that aren’t declared final. See, functional programming gets rid of side-effects altogether, but I don’t know if this is necessary. Inside a function the occasional i++ really isn’t that confusing and I am not sure I want to give it up just yet. The reason is that method-local variables have no publicly accessible state, so as long as I am writing short functions this temporary state shouldn’t be a problem. By making x immutable you ensure that x.equals(x). This also makes it very easy to prevent invalid states: just ensure that either the user provides valid inputs to the constructor or the constructor throws an exception. If you do this, and don’t have mutable state, then you are guaranteed no bad states.
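Here is a rough sketch of what I mean, not code from any real project. EmailAddress is a hypothetical class made up for illustration, and the "contains an @" check is a stand-in assumption for whatever validation you actually need. Every field is final and the constructor either gets valid input or throws, so an invalid instance can never exist.

```java
// Hypothetical value class, invented for this illustration.
final class EmailAddress {
    // final: assigned exactly once, in the constructor, and never changed.
    private final String value;

    EmailAddress(String value) {
        // Reject bad input up front so no invalid instance can ever exist.
        // The "@" test is a placeholder assumption, not real email validation.
        if (value == null || !value.contains("@")) {
            throw new IllegalArgumentException("not an email address: " + value);
        }
        this.value = value;
    }

    String value() {
        return value;
    }

    @Override
    public boolean equals(Object other) {
        // Compares only final state, so no other thread can change the
        // answer in the middle of the comparison.
        return other instanceof EmailAddress
                && value.equals(((EmailAddress) other).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}
```

Since nothing in the object can change after construction, x.equals(x) has no moving parts left to race on.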

I haven’t figured out how to make all my members final just yet (or I would probably be using Haskell). It seems to me that if I want to change a User’s email address then I need to be able to call user.setEmail(). That is because the state of the email address is real state out there in the world that I have to model in my program. So the domain model retains its state. But as ye of the Java world know, the domain model is not all the code, oh no. We still have business objects, and persistence objects, and gee-golly all kinds of other objects. And guess what: 99% of the state in these objects can go. And when it does everything gets better.
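A minimal sketch of that split, under my own assumptions: User and setEmail() come from the paragraph above, while EmailDomainPolicy is a hypothetical business object invented for the example. The domain object keeps its setter; the business object has only final fields, so its answers depend on nothing but its arguments and its fixed configuration.

```java
// Domain object: models real state out in the world, so it keeps a setter.
class User {
    private String email;

    User(String email) {
        this.email = email;
    }

    String getEmail() {
        return email;
    }

    void setEmail(String email) {
        this.email = email;
    }
}

// Hypothetical business object: fixed configuration only, no mutable fields.
final class EmailDomainPolicy {
    private final String allowedDomain;

    EmailDomainPolicy(String allowedDomain) {
        this.allowedDomain = allowedDomain;
    }

    // The result depends only on the argument and the final field,
    // so one instance can be shared freely between threads.
    boolean allows(User user) {
        return user.getEmail().endsWith("@" + allowedDomain);
    }
}
```

The mutable state hasn't disappeared, but it now lives in one place (the User) instead of being smeared across every layer.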

But I am only starting with this concurrency thing. I am reading this book, which is awesome. In it you can learn about all kinds of disturbing things like how the JVM has secretly been scrambling the order of execution of your code in subtle ways and things like that.

Know Your Shit

Just as the workable solution is always the last thing you try, the impossible-to-diagnose bug is always in the software layer you don’t understand. You have to understand all layers that directly surround your code; for most programmers this begins with the OS. If you do low level programming you better know about computer architecture too. But this idea is bigger than just catching obscure bugs; it has to do with finding the solution to hard problems. Someone who is familiar with the internals of an OS has enough of the big ideas under their belt to attack most large software problems.

Web programmers know that all performance problems come from the database. Naturally when you first see a performance problem in a database backed application, you want to do some profiling and see where in the code the time is going. After all, isn’t this what everyone says to do? You can do this, but you might as well save yourself the time and just log the queries that are issued on that code path, then get the database execution plan for each; you’ll find the performance problem pretty quickly. The reason is simple: data lives on disks and disks are (say) 100,000 times slower than memory. So to cause a problem in Java you have to do about 100,000 times more stupid shit than to cause a problem in SQL. But notice how the gorgeous relational database abstraction layer has broken down, and in order to solve the problem one has to think about how much data is being pulled off the disk to satisfy the query. The point is that you can’t stop at understanding the relational part, you also need to understand the database part.
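The "log the queries, then ask for the plans" routine can be sketched roughly like this. QueryLog is a hypothetical helper I made up, not part of JDBC or any library, and the sketch assumes a database that accepts a bare EXPLAIN prefix (MySQL and PostgreSQL do; other databases spell it differently). It talks to no database itself; it only collects the SQL and builds the statements you would paste into a console.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not a real library class.
final class QueryLog {
    private final List<String> queries = new ArrayList<>();

    // Call this wherever the suspect code path hands SQL to the database.
    void record(String sql) {
        queries.add(sql);
    }

    // Read-only view of everything recorded so far, in order.
    List<String> queries() {
        return Collections.unmodifiableList(queries);
    }

    // Assumption: the database understands "EXPLAIN <query>".
    List<String> explainStatements() {
        List<String> statements = new ArrayList<>();
        for (String sql : queries) {
            statements.add("EXPLAIN " + sql);
        }
        return statements;
    }
}
```

Run each generated statement by hand and look for the plan that reads far more rows than the query returns.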

The larger problem is what we should be learning to get better at this. I know that the following things will help because they have helped me:

  1. Learn a functional programming language
  2. Learn how operating systems work
  3. Learn how databases work
  4. Learn how to read a computer science paper
  5. Learn as much math as you can (but which math…)

Unfortunately it is virtually impossible to say what will not help you solve a problem. Will knowledge of good old-fashioned AI help you write enterprise software? It certainly might when you implement their 400,000 lines of Java business rules templates as 2,000 lines in a Prolog-like business rules system. Likewise, it isn’t every day that I need to integrate at work, but when I have, the payoff has usually been big. If you asked me if studying something very useless and different from computers, literature, say, would help you to solve problems, I couldn’t tell you that it wouldn’t. It might be less likely than studying operating systems or databases to pay off, so there might be some opportunity cost, but I couldn’t tell you that the next big advance wouldn’t come from someone who had divided their time between programming and literature. I think that that is pretty much the state of our art: we don’t even know the framework in which the new ideas will come, let alone what they might be. I’m not sure if that is exciting or pathetic.

[1] “Ruby is a butterfly”. Wow dude, go outside. Look at any of the beautiful creatures on the earth. Now go back in and look at your computer. See much resemblance? Me either.

[2] Knuth says: “A user of WEB needs to be good enough at computer science that he or she is comfortable dealing with several languages simultaneously. Since WEB combines TEX and PASCAL with a few rules of its own, WEB programs can contain WEB syntax errors, TEX syntax errors, PASCAL syntax errors, and algorithmic errors; in practice all four types of errors occur, and a bit of sophistication is needed to sort out which is which. Computer scientists tend to be better at such things than other people.”

Just because we are better at it doesn’t mean we are good at it, and even if we are good at it that doesn’t make it a good idea. Anyone who has looked at a JSP that contained equal parts HTML, CSS, SQL, Java, and Javascript has a fair idea what the source for WEB programs looks like. But the produced output, like all TEX output, is absolutely stunning.

[3] Wearing sunscreen is good advice too, but computer programmers are probably the least at-risk sub-population for skin cancer since most of them aren’t white and none of them seem to get outside enough. If in doubt write short functions while wearing sunscreen, but if you have to give up one, lose the sunscreen.

]]>
http://empathybox.com/archives/8/feed
Dynamic languages are for neat freaks not slobs http://empathybox.com/archives/7 http://empathybox.com/archives/7#comments Mon, 09 Oct 2006 00:12:53 +0000 jay http://empathybox.com/archives/7 I don’t know how something that fails to make the proper Foleyesque slobbering noises about dynamic typing and Ruby on Rails managed to get voted up on reddit, but I am glad this did. It is a rather excellent article that manages to nail the issue without really taking sides. My only complaint is this: it gets the recommendation exactly wrong.

If you use a slobby language like Python you will have to become extremely neat. If you work on a project with more than one programmer you will have to start documenting the types of every function in comments or you will find yourself having discussions like “fred, does this method expect a list of hashes that map ints to strings, or a list of hashes that map ints to user objects?”. You will also have to be very serious about your testing strategy to make sure that you maintain the ability to refactor when you have a lot of code, otherwise it is impossible to know when you have broken something. If you have only 60% unit test coverage in some area, you will need to do a lot of manual testing to ensure you haven’t broken something with your refactoring.

Likewise if you work in a language like Java (or better yet, Haskell!) you can be a lot sloppier. You can make interface changes and depend on the compiler (or a refactoring tool) to find all the various places you have just broken.
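To make that concrete, here is a small sketch; UserAccount and Roster are hypothetical names standing in for the "user objects" in the question to fred above. With static types the method signature answers the question by itself, and a caller holding the wrong kind of map gets a compile error instead of a conversation.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the "user objects" in the example question.
final class UserAccount {
    private final String name;

    UserAccount(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }
}

final class Roster {
    // The parameter type documents itself: a list of maps from ints to
    // user objects. Passing List<Map<Integer, String>> will not compile.
    static int countUsers(List<Map<Integer, UserAccount>> batches) {
        int total = 0;
        for (Map<Integer, UserAccount> batch : batches) {
            total += batch.size();
        }
        return total;
    }
}
```

If you later change the parameter to, say, a list of lists, the compiler points at every call site you just broke.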

One should pick tools that correct one’s own tendencies, not tools that exacerbate problems.

So my recommendation would be this: if you are very precise about types, so much so that you can maintain them all in your head, then dynamic languages are the way to go–they will offer you flexibility and less writing out of types. If you find yourself slipping a bit in your precision, then you want a compiler which checks static types for you. I know that I am a slob, so I think that type inference is in my future.

]]>
http://empathybox.com/archives/7/feed