java sucks.
© 1997-2000 Jamie Zawinski
<jwz@jwz.org>
I think Java is the best language going today, which is to say,
it's the marginally acceptable one among the set of complete bagbiting
loser languages that we have to work with out here in the real world. Java
is far, far more pleasant to work with than C or C++ or Perl or Tcl/Tk
or even Emacs-Lisp. When I first started using Java, it felt like an
old friend: like finally I was back using a real object system,
before the blights of C (the PDP-11 assembler that thinks it's a
language) and C++ (the PDP-11 assembler that thinks it's an object
system) took over the world.
However, as I settled in, I found a lot of things about Java that
irritate me. As this happened, I wrote them down. The following
document was mostly written while I was learning the language, during
the design and development of Grendel back in 1997. Therefore, some of
the following complaints might have
been addressed in later versions of the language, or they might have
been misunderstandings on my part.
It's too bad Sun has been working as hard as they can, in their
typical Sun way, to destroy Java by holding on to it so closely that
nobody else can actually improve it.
I've also merged in a few complaints from Dan Bornstein
and Richard Mlynarik. Thanks, guys!
About the Java name, and associated politics:
The fact is that there are four completely different things
that go by the name ``Java'':
- A language;
- An enormous class library;
- A virtual machine;
- A security model.
Sun would like you to believe that these are all the same thing,
and that the name ``Java'' implies all of them, but this is marketing
fiction. Worse than that, the fact that Sun has tried so hard to push
this idea has done grievous damage to the acceptance of Java.
- Java-the-language is, overall, a very good thing, and works well.
- Java-the-class-library is mostly passable.
- Java-the-virtual-machine is an interesting research project,
a nice proof of concept, and is basically usable for a certain
class of problems (those problems where speed isn't all that
important: basically, those tasks where you could get away with
using Perl instead of C.)
- Java-the-security-model is another interesting research
project, but it only barely works right now. In a few years, maybe
they'll have it figured out or replaced.
If Sun hadn't tried so hard to conflate these four completely
different things, if they had first shipped native-code Java
compilers, then the VM, then the security model, then
Java probably would have completely displaced C++ by now.
The whole ``write once run anywhere'' idea (which is to say, the
virtual machine) is a wonderful idea, and I wish it the best of luck.
But it's not true yet. It might be someday: in the meantime, I'd like
to write programs in Java today, the way I can write programs in
C today. So I have to recompile for every architecture on
which I want to run. Ok, I wish I didn't, but that's what I have to do
today anyway, but I have to do it in C instead of Java.
Virtual machines are cool. Security models that allow
network-distributed code are cool. Serialization and agent-like
behavior are also cool.
But these are not what I'm most interested in. There are a lot of
people who are most interested in those things, but me, I just
want to write a program that will run on some suitable number of
architectures. I'm happy distributing binaries for each
architecture to do that. Sure, having one binary that ran on
everything would be nice, but you know, it's just not a hard
requirement.
Today, I program in C.
I think C is a pretty crummy language. I would like to write the
same kinds of programs in a better language.
First the good stuff:
- Java doesn't have free().
I have to admit right off that, after that, all else is gravy. That
one point makes me able to forgive just about anything else, no matter
how egregious. Given this one point, everything else in this document
fades nearly to insignificance.
But...
About the Java language itself:
(I'm separating my complaints about Java the language, and Java the
class library, despite Sun's repeated attempts to blur this important
and fundamental distinction.)
- It's hard to live with none of: lexically scoped local functions;
a macro system; and inlined functions.
- I really hate the lack of downward-funargs; anonymous classes are
a lame substitute. (I can live without long-lived closures, but
I find lack of function pointers a huge pain.)
- The fact that static methods aren't really class methods (they're
actually global functions: you can't override them in a subclass)
is pretty dumb.
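A minimal sketch of what that means in practice. The class names (Base, Derived, StaticHidingDemo) are made up for illustration. A static method with the same signature in a subclass hides the parent's method rather than overriding it, so the call is bound by the declared type of the reference:

```java
class StaticHidingDemo {
    // Static call through a Base-typed reference: resolved at compile time.
    static String staticCall() {
        Base b = new Derived();
        return b.name();
    }

    // Instance call through the same kind of reference: dispatched at run time.
    static String instanceCall() {
        Base b = new Derived();
        return b.describe();
    }

    public static void main(String[] args) {
        System.out.println(staticCall());   // Base
        System.out.println(instanceCall()); // Derived instance
    }
}

class Base {
    static String name() { return "Base"; }
    String describe() { return "Base instance"; }
}

class Derived extends Base {
    // Hides Base.name(); it does not override it.
    static String name() { return "Derived"; }

    @Override
    String describe() { return "Derived instance"; }
}
```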
- It's far from obvious how one hints that a method should be inlined, or
otherwise go real fast. Does `final' do it? Does `private
final' do it? Given that there is no preprocessor to let you do
per-function shorthand, and no equivalent of Common Lisp's flet
(or even macrolet), one ends up either duplicating code, or
allowing the code to be inefficient. Those are both bad choices.
- Two identical byte[] arrays aren't equal and don't hash
the same. Maybe this is just a bug, but:
- You can't fix this by subclassing Hashtable.
- You can't fix this by subclassing Array because it's
not really an object. What you can do is wrap an Object around
an Array and let that implement hashCode and equals
by digging around in its contained array, but that adds
not-insignificant memory overhead (16 bytes per object, today.)
- Gee, I know, I'll write my own hash table. I've only done that
a thousand times.
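Here is a rough sketch of the wrapper workaround described above. ByteKey and ByteKeyDemo are hypothetical names, and the sketch leans on java.util.Arrays.equals and Arrays.hashCode, which come from later versions of the class library than the one these notes were written against:

```java
import java.util.Arrays;
import java.util.Hashtable;

class ByteKeyDemo {
    // Arrays inherit Object.equals, so this compares identity, not contents.
    static boolean rawArraysMatch() {
        byte[] a = { 1, 2, 3 };
        byte[] b = { 1, 2, 3 };
        return a.equals(b);
    }

    // With the wrapper, a second key holding the same bytes finds the entry.
    static String lookupWithWrapper() {
        Hashtable<ByteKey, String> table = new Hashtable<ByteKey, String>();
        table.put(new ByteKey(new byte[] { 1, 2, 3 }), "found");
        return table.get(new ByteKey(new byte[] { 1, 2, 3 }));
    }

    public static void main(String[] args) {
        System.out.println(rawArraysMatch());    // false
        System.out.println(lookupWithWrapper()); // found
    }
}

// Hypothetical wrapper: one extra object per key, which is the memory
// overhead complained about above.
final class ByteKey {
    private final byte[] bytes;

    ByteKey(byte[] bytes) { this.bytes = bytes.clone(); }

    @Override
    public boolean equals(Object o) {
        return o instanceof ByteKey && Arrays.equals(bytes, ((ByteKey) o).bytes);
    }

    @Override
    public int hashCode() { return Arrays.hashCode(bytes); }
}
```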
- I can't seem to manage to iterate the characters in a String
without implicitly involving half a dozen method calls per character.
- The other alternative is to convert the String to a byte[]
first, and iterate the bytes, at the cost of creating lots of random
garbage.
- Generally, I'm dissatisfied with the overhead added by Unicode support in
those cases where I'm sure that there are no non-ASCII characters.
There ought to be two subclasses of an abstract String class, one
that holds Unicode, and one that holds 8-bit quantities. They should offer
identical APIs and be indistinguishable, except for the fact that
if a string has only 8-bit characters, it takes up half as much memory!
- Of course, String being final eliminates even the option of
implementing that.
- Interfaces seem a huge, cheesy copout for avoiding multiple inheritance;
they really seem like they were grafted on as an afterthought. Maybe
there's a good reason for them being the way they are, but I don't see it;
it looks like they were just looking for a way to multiply-inherit methods
without allowing call-next-method and without allowing instance variables?
- There's something kind of screwy going on with type promotion that I don't
totally understand yet but that I'm pretty sure I don't like. This gets a
compiler error about type conflicts:
    abstract class List {
        abstract List next();
    }
    class foo extends List {
        foo n;
        foo next() { return n; }
    }
I think that's wrong, because every foo is-a List.
The compiler seems to be using type-of rather than typep.
- This ``integers aren't objects'' nonsense really pisses me off. Why did
they do that? Is the answer as lame as, ``we wanted the `int' type to be
32 bits instead of 31''? (You only really need one bit of type on the
pointer if you don't need small conses, after all.)
The way this bit me is, I've got code that currently takes an array of
objects, and operates on them in various opaque ways (all it cares about
is equality, they're just cookies.) I was thinking of changing these
objects to be shorts instead of objects, for compactness of their
containing objects: they'd be indexes into a shared table, instead of
pointers to shared objects.
To do this, I would have to rewrite that other code to know that
they're shorts instead of objects, because one can't assign a
short to a variable or argument that expects an Object,
and consequently, one can't invoke the equals method on a
short.
Wrapping them up in Short objects would kind of defeat the
purpose: then they'd be bigger than the pointer to the original
object rather than smaller.
- And in related news, it's a total pain that one can't iterate over the
contents of an array without knowing intimate details about its contents:
you have to know whether it's byte[], or int[], or
Object[]. I mean, it is not rocket science to have a language
that can transparently access both boxed and unboxed storage. It's not as
if Java isn't doing all the requisite runtime type checks already! It's
as if they went out of their way to make this not work...
Is there some philosophical point I'm missing? Is the notion of
separating your algorithms from your data structures suddenly no longer a
part of the so-called ``object oriented'' pantheon?
- After all this time, people still think that integer overflow is better
than degrading to bignums, or raising an exception?
Of course, they have Bignums now (ha!) All you have to do (ha!)
is rewrite your code to look like this:
result = x.add(y.multiply(BigInteger.valueOf(7))).pow(3).abs().setBit(27);
Note that some parameters must be BigIntegers, and some must be ints,
and some must be longs, with largely no rhyme or reason. (This complaint
is in the ``language'' section and not the ``library'' section because
this shit should be part of the language, i.e., at the syntax level.)
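To make the complaint concrete, here is a small sketch. OverflowDemo and its method names are made up for illustration. It shows int addition wrapping silently, the same sum done with BigInteger, and the expression above with its mix of argument types spelled out:

```java
import java.math.BigInteger;

class OverflowDemo {
    // Plain int arithmetic wraps around on overflow; no exception, no bignum.
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // The same sum with BigInteger keeps the mathematically correct value.
    static BigInteger bigAdd(int a, int b) {
        return BigInteger.valueOf(a).add(BigInteger.valueOf(b));
    }

    // valueOf takes a long, multiply and add take BigIntegers,
    // pow and setBit take ints.
    static BigInteger expression(BigInteger x, BigInteger y) {
        return x.add(y.multiply(BigInteger.valueOf(7L))).pow(3).abs().setBit(27);
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(Integer.MAX_VALUE, 1)); // -2147483648
        System.out.println(bigAdd(Integer.MAX_VALUE, 1));      // 2147483648
        System.out.println(expression(BigInteger.ONE, BigInteger.ONE));
    }
}
```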
- I miss typedef. If I have integers that represent something, I
can't make type assertions about them except that they are ints.
Unless I'm willing to swaddle them in blankets by wrapping
Integer objects around them.
- Similarly, I think the available idioms for simulating enum and
:keywords are fairly lame. (There's no way for the compiler to
issue that life-saving warning, ``enumeration value `x' not handled
in switch'', for example.)
They go to the trouble of building a single two-element
enumerated type into the language (Boolean) but won't give us a way
to define our own?
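For reference, one such idiom looks roughly like the sketch below: a final class with a private constructor and a fixed set of static instances. Signal and EnumIdiomDemo are hypothetical names. Note that nothing warns about the value the if-chain forgets to handle:

```java
class EnumIdiomDemo {
    // Object constants can't be used as switch labels in this idiom, so this
    // is an if-chain, and the compiler has no way to notice that AMBER is
    // never handled.
    static String describe(Signal s) {
        if (s == Signal.RED) return "stop";
        if (s == Signal.GREEN) return "go";
        return "unhandled: " + s;
    }

    public static void main(String[] args) {
        System.out.println(describe(Signal.RED));
        System.out.println(describe(Signal.AMBER));
    }
}

// Hand-rolled enumeration: the private constructor means the three
// instances below are the only ones that can ever exist.
final class Signal {
    static final Signal RED = new Signal("RED");
    static final Signal AMBER = new Signal("AMBER");
    static final Signal GREEN = new Signal("GREEN");

    private final String name;

    private Signal(String name) { this.name = name; }

    @Override
    public String toString() { return name; }
}
```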
- As far as I can see, there's no efficient way to implement `assert'
or `#ifdef DEBUG'. Java gets half a point for this by
promising that if you have a static final boolean, then
conditionals that use it will get optimized away if appropriate.
This means you can do things like
if (randomGlobalObject.DEBUG) { assert(whatever, "whatever!"); }
but that's so gratuitously verbose that it makes my teeth hurt.
(See also, lack of any kind of macro system.)
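Spelled out, the pattern looks something like the sketch below. DebugFlagDemo, check, and halve are made-up names. The helper is called check rather than assert because assert later became a reserved word. Because DEBUG is a compile-time constant, a compiler is free to drop the guarded block when it is false:

```java
class DebugFlagDemo {
    // Compile-time constant; flip to false to let the compiler discard
    // the guarded checks below.
    static final boolean DEBUG = true;

    // Hypothetical hand-rolled assertion helper.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new IllegalStateException(message);
        }
    }

    static int halve(int n) {
        if (DEBUG) { check(n % 2 == 0, "n must be even"); }
        return n / 2;
    }

    public static void main(String[] args) {
        System.out.println(halve(4)); // 2
        try {
            halve(3);
        } catch (IllegalStateException e) {
            System.out.println("check failed: " + e.getMessage());
        }
    }
}
```

Every call site still has to carry its own `if (DEBUG)` wrapper, which is the verbosity being complained about.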
- By having `new' be the only possible interface to allocation, and
by having no back door through which you can escape from the type safety
prison, there is a whole class of ancient, well-known optimizations that
one just cannot perform. If something isn't done about this, the language
is never going to be fast enough for some tasks, no matter how good the
JITs get. And ``write once run everywhere'' will continue to be the
marketing fantasy that it is today.
- I sure miss multi-dispatch. (The CLOS notion of doing method lookup based
on the types of all of the arguments, rather than just on the type of the
implicit zeroth argument, this).
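A small sketch of the difference, with made-up classes (Shape, Circle, DispatchDemo). Java picks among overloads using the declared types of the arguments at compile time, so the more specific method is not chosen when the variables are typed as Shape, even though both objects are Circles at run time:

```java
class DispatchDemo {
    static String collide(Shape a, Shape b) { return "shape/shape"; }
    static String collide(Circle a, Circle b) { return "circle/circle"; }

    // Both objects are Circles, but the variables are declared as Shape,
    // so the (Shape, Shape) overload is selected.
    static String viaStaticTypes() {
        Shape a = new Circle();
        Shape b = new Circle();
        return collide(a, b);
    }

    // Here the argument expressions have static type Circle.
    static String viaExactTypes() {
        return collide(new Circle(), new Circle());
    }

    public static void main(String[] args) {
        System.out.println(viaStaticTypes()); // shape/shape
        System.out.println(viaExactTypes());  // circle/circle
    }
}

class Shape {}

class Circle extends Shape {}
```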
- The finalization system is lame. Worse than merely being lame, they
brag about how lame it is! To paraphrase the docs: ``Your object
will only be finalized once, even if it's resurrected in finalization!
Isn't that grand?!'' Post-mortem finalization was figured out years ago
and works well. Too bad Sun doesn't know that.
- Relatedly, there are no ``weak pointers.'' Without weak pointers
and a working finalization system, you can't implement a decent caching
mechanism for, e.g., a communication framework that maintains proxies to
objects on other machines, and likewise keeps track of other machines'
references to your objects.
- You can't close over anything but final variables in an
inner class! Their rationale is that it might be ``confusing.''
Of course you can get the effect you want by manually
wrapping your variables inside of one-element arrays.
The very first time I tried using inner classes, I got bitten by
this -- that is, I naively attempted to modify a
closed-over variable and the compiler complained at me, so I in
fact did the one-element array thing. The only other time I've
used inner classes, again, I needed the same functionality; I
started writing it the obvious way and let out a huge sigh of
frustration when, half way through, I realized what I had done
and manually walked back through the code turning my
    Object foo = <whatever>;
into
    final Object[] foo = { <whatever> };
and all the occurrences of foo into foo[0].
Arrrgh!
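For anyone who hasn't had the pleasure, the one-element array trick looks roughly like this. ClosureDemo and countTo are made-up names. The array reference is final, which keeps the compiler happy, while the slot inside it stays mutable:

```java
class ClosureDemo {
    static int countTo(int n) {
        // The reference is final; the element it holds is not.
        final int[] counter = { 0 };

        Runnable increment = new Runnable() {
            public void run() {
                counter[0]++; // modifies the closed-over "variable"
            }
        };

        for (int i = 0; i < n; i++) {
            increment.run();
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countTo(3)); // 3
    }
}
```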
- The access model with respect to the mutability (or
read-only-ness) of objects blows. Here's an example:
System.in, out and err (the stdio
streams) are all final variables. They didn't use to be, but some
clever applet-writer realized that you could change them and start
intercepting all output and do all sorts of nasty stuff. So, the
whip-smart folks at Sun went and made them final. But hey!
Sometimes it's okay to change them! So, they also added
System.setIn, setOut, and setErr methods
to change them!
``Change a final variable?!'' I hear you cry. Yep. They sneak
in through native code and change finals now. You might think
it'd give 'em pause to think and realize that other people
might also want to have public read-only yet privately writable
variables, but no.
Oh, but it gets even better: it turns out they didn't really
have to sneak in through native code anyway, at least as far as
the JVM is concerned, since the JVM treats final variables as
always writable to the class they're defined in! There's no
special case for constructors: they're just always writable.
The javac compiler, on the other hand, pretends
that they're only assignable once, either in static init code
for static finals or once per constructor for instance
variables. It also will optimize access to finals, despite the
fact that it's actually unsafe to do so.
- Something else related to this absurd lack of control over
who can modify an object and who cannot is that there is no notion
of constant space: constantry is all per-class, not per-object.
If I've got a loop that does
String foo = "x";
it does what you'd expect, because the loader happens to have
special-case magic that interns strings, but if I do:
String foo[] = { "x", "y" };
then guess what, it conses up a new array each time through
the loop! Um, thanks, but don't most people expect literal
constants to be immutable? If I wanted to copy it, I would copy it.
The language also should impose the contract that literal constants
are immutable.
Even without the language having immutable objects, a
non-losing compiler could eliminate the consing in some limited
situations through static analysis, but I'm not holding my breath.
Using final on variables doesn't do anything useful
in this case; as far as I can tell, the only reason that
final works on variables at all is to force you to
specify it on variables that are closed over in inner classes.
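A quick sketch of the difference, with made-up names (LiteralDemo and friends). Identical string literals refer to the same interned object, while an array initializer allocates a fresh array every time it is evaluated:

```java
class LiteralDemo {
    static String stringLiteral() {
        String foo = "x";
        return foo;
    }

    static String[] arrayLiteral() {
        String foo[] = { "x", "y" };
        return foo;
    }

    // Two evaluations of the string literal yield the same object.
    static boolean sameString() {
        return stringLiteral() == stringLiteral();
    }

    // Two evaluations of the array initializer yield two distinct arrays.
    static boolean sameArray() {
        return arrayLiteral() == arrayLiteral();
    }

    public static void main(String[] args) {
        System.out.println(sameString()); // true
        System.out.println(sameArray());  // false
    }
}
```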
- The locking model is broken.
- First, they impose a full word of overhead on each and
every object, just in case someone somewhere sometime wants to
grab a lock on that object. What, you say that you know that
nobody outside of your code will ever get a pointer to this
object, and that you do your locking elsewhere, and you have
a zillion of these objects so you'd like them to take up as
little memory as possible? Sorry. You're screwed.
- Any piece of code can assert a lock on an object and then
never un-lock it, causing deadlocks. This is a gaping security
hole for denial-of-service attacks.
In any half-way-rational design, the lock associated with an
object would be treated just like any other slot, and only
methods statically ``belonging'' to that class could frob it.
But then you get into the bug of Java not doing closures
properly. See, you want to write a method:
    public synchronized void with_this_locked (thunk f)
    {
        f.funcall ();
    }
but then actually writing any code becomes a disaster
because of the mind-blowing worthlessness of inner classes.
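The closest thing one can actually write is sketched below, with Runnable standing in for the thunk. LockDemo, withThisLocked, and the other names are hypothetical. Note how much ceremony the anonymous inner class adds around two lines of real work, and that the one-element array trick shows up again:

```java
class LockDemo {
    private int balance = 0;

    // Hypothetical analogue of with_this_locked: runs f while holding
    // this object's lock.
    synchronized void withThisLocked(Runnable f) {
        f.run();
    }

    int depositTwice(final int amount) {
        withThisLocked(new Runnable() {
            public void run() {
                balance += amount;
                balance += amount;
            }
        });
        return balance;
    }

    // Confirms that the thunk really runs with the lock held.
    static boolean lockHeldInsideThunk() {
        final LockDemo target = new LockDemo();
        final boolean[] held = { false };
        target.withThisLocked(new Runnable() {
            public void run() {
                held[0] = Thread.holdsLock(target);
            }
        });
        return held[0];
    }

    public static void main(String[] args) {
        System.out.println(new LockDemo().depositTwice(5)); // 10
        System.out.println(lockHeldInsideThunk());          // true
    }
}
```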
- There is no way to signal without throwing: that is, there is no way
to signal an exceptional condition, and have some condition handler
tell you ``go ahead and proceed anyway.'' By the time the condition
handler is run, the excepting scope has already been exited.
- The distinction between slots and methods is stupid.
Doing foo.x should be defined to be equivalent to
foo.x(), with lexical magic for
``foo.x = ...''
assignment. Compilers should be trivially able to inline
zero-argument accessor methods to be inline object+offset loads.
That way programmers wouldn't break every single one of their
callers when they happen to change the internal implementation
of something from something which happened to be a ``slot'' to
something with slightly more complicated behavior.
- The notion of methods ``belonging'' to classes is lame.
Anybody anytime should be allowed to define new, non-conflicting
methods on any class (without overriding existing methods.)
This causes no abstraction-breakage, since code which cares couldn't,
by definition, be calling the new, ``externally-defined'' methods.
This is just another way of saying that the pseudo-Smalltalk object
model loses and that generic functions (suitably constrained by the
no-external-overrides rule) win.
Library:
- It comes with hash tables, but not qsort? Thanks!
- String has length+24 bytes of overhead over byte[]:
    class String implements java.io.Serializable {
        private char value[];  // 4 bytes + 12 bytes of array header
        private int offset;    // 4 bytes
        private int count;     // 4 bytes
    }
- The only reason for this overhead is so that String.substring() can
return strings which share the same value array. Doing this at the cost
of adding 8 bytes to each and every String object is not a
net savings...
- If you have a huge string, pull out a substring() of it, hold on
to the substring and allow the longer string to become garbage (in other
words, the substring has a longer lifetime), the underlying bytes of
the huge string never go away.
- The file manipulation primitives are inadequate; for example, there's no
way to ask questions like ``is the file system case-insensitive?'' or,
``what is the maximum file name length?'', or ``is it required that file
extensions be exactly three characters long?'' Which could be worked
around, but for:
- The architecture-interrogation primitives are inadequate; there is no
robust way to ask ``am I running on Windows'' or ``am I running on Unix.''
- There is no way to access link() on Unix, which is the only
reliable way to implement file locking.
- There is no way to do ftruncate(), except by copying and
renaming the whole file.
- Is "%10s %03d" really too much to ask? Yeah, I know there are
packages out on the net trying to reproduce every arcane nuance of
printf(), but controlling field width and padding seems pretty
darned basic to me.
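For what it's worth, later versions of the class library did grow a printf-style formatter. A minimal sketch, assuming a Java version that has String.format (FormatDemo and row are made-up names):

```java
import java.util.Locale;

class FormatDemo {
    // Pads the name to a width of 10 and zero-pads the number to 3 digits.
    // Locale.ROOT keeps the digits independent of the default locale.
    static String row(String name, int n) {
        return String.format(Locale.ROOT, "%10s %03d", name, n);
    }

    public static void main(String[] args) {
        System.out.println("[" + row("abc", 7) + "]");
    }
}
```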
- A RandomAccessFile cannot be used as a FileInputStream.
More specifically, there is no class or interface which those two classes
have in common. So, despite the fact that both implement read()
and a slew of other like-functioning methods, there is no way to write a
method which works on streams of either type.
Identical lossage exists for the pairing of RandomAccessFile
and FileOutputStream. WHAT WERE THEY THINKING?
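The usual workaround is a hand-written adapter, sketched below under the assumption that only reading is needed. RandomAccessFileInputStream and RafStreamDemo are hypothetical names, not library classes. Code written against InputStream can then be handed a RandomAccessFile, at the cost of yet another wrapper class:

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class RafStreamDemo {
    // Works on any InputStream, including the adapter below.
    static int countBytes(InputStream in) throws IOException {
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    // Writes three bytes to a temp file, rewinds, and reads them back
    // through the adapter.
    static int demo() {
        try {
            File f = File.createTempFile("raf-demo", ".bin");
            f.deleteOnExit();
            RandomAccessFile raf = new RandomAccessFile(f, "rw");
            try {
                raf.write(new byte[] { 10, 20, 30 });
                raf.seek(0);
                return countBytes(new RandomAccessFileInputStream(raf));
            } finally {
                raf.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 3
    }
}

// Hypothetical adapter: forwards reads to the wrapped RandomAccessFile,
// starting from its current file pointer.
class RandomAccessFileInputStream extends InputStream {
    private final RandomAccessFile file;

    RandomAccessFileInputStream(RandomAccessFile file) { this.file = file; }

    @Override
    public int read() throws IOException { return file.read(); }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return file.read(b, off, len);
    }
}
```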
- markSupported is stupid.
- What in the world is the difference between System
and Runtime? The division seems completely random and
arbitrary to me.
- What in the world is application-level crap like
checkPrintJobAccess() doing in the base language class library?
There's all kinds of special-case abstraction-breaking garbage like this.
Stay tuned, I'm sure I'll have found something new to hate by tomorrow.
(Well, that's how this document originally ended.
But it's not true, because I'm back to hacking in C, since it's still the
only way to ship portable programs.)
gipoco.com
is neither affiliated with the authors of this page nor responsible
for its contents. This is a safe-cache copy of the original web site.