Mark Reinhold
2011/12/20
Copyright © 2011 Oracle and/or its affiliates · All rights reserved
This document is an overview of the current state of Project Jigsaw, an exploratory effort to design and implement a module system for the Java SE Platform and to apply that system to the Platform itself and to the JDK.
This document is not yet complete. Additional sections covering compilation, packaging, libraries, repositories, the module-system API, and the modularization of the JDK are in preparation.
Every feature mentioned here has been implemented in the main Jigsaw repository unless otherwise noted.
Comments to: jigsaw dash dev at openjdk dot java dot net
The Jigsaw module system is designed to be both approachable and scalable: Approachable by all developers, yet sufficiently scalable to support the modularization of large legacy software systems in general and the JDK in particular. It aims to implement a set of general requirements; its detailed design has been further guided by the following principles:
Modularity is a language construct — The best way to support modular programming in a standard way in the Java platform is to extend the language itself to support modules. Developers already think about standard kinds of program components such as classes and interfaces in terms of the language; modules should be just another kind of program component.
Module boundaries should be strongly enforced — A class that is private to a module should be private in exactly the same way that a private field is private to a class. In other words, module boundaries should determine not just the visibility of classes and interfaces but also their accessibility. Without this guarantee it is impossible to construct modular systems capable of running untrusted code securely.
Static, single-version module resolution is usually sufficient — Most applications do not need to add or remove modules dynamically at run time, nor do they need to use multiple versions of the same module simultaneously. The module system should be optimized for common scenarios but also support narrowly-scoped forms of dynamic multi-version resolution motivated by actual use cases such as, e.g., application servers, IDEs, and test harnesses.
A module is a collection of Java types (i.e., classes and interfaces) with a name, an optional version number, and a formal description of its relationships to other modules. In addition to Java types a module can include resource files, configuration files, native libraries, and native commands. A module can be cryptographically signed so that its authenticity can be validated.
The most important type of inter-module relationship is that of dependence, in which one module declares that it depends upon some other module by specifying that module’s name and possibly also a constraint upon the range of allowable versions.
A module dependence is not necessarily precise: Multiple modules with the same name but different version numbers might be available to satisfy it. Before a module can be used each of its dependences must be resolved to a specific module. Given an initial set of modules, resolution is the process of locating additional modules, as required, and constructing a superset of that set in which every dependence is optimally satisfied.
TODO: This compile-time preference for older versions is not yet implemented.
There are three principal phases in the lifetime of a module:
Compile time – A module’s dependences are resolved, its types are compiled from Java source files, and its other content is compiled or constructed as appropriate. The results are then packaged up for publication and distribution.
Install time – A module is inserted into a module library, i.e., a collection of previously-installed modules. If the module is invokable, i.e., it has an entry point, then it is made ready for use by resolving its dependences and storing the result of that computation in a persistent configuration.
Run time – An invokable module is loaded into a running Java virtual machine and linked up to the other modules upon which it depends as recorded in its configuration during installation.
The phase in which resolution is performed determines how dependences are satisfied. At compile time the oldest available version of a module satisfying a dependence is preferred, while in later phases the newest version is preferred.
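The phase-dependent preference described above can be illustrated with a minimal sketch. It is not the Jigsaw resolver: the PhasePreference class, the Phase enum, and the use of plain integers as version numbers are assumptions made only to keep the example small.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; none of these names come from the Jigsaw API.
final class PhasePreference {
    enum Phase { COMPILE_TIME, INSTALL_TIME, RUN_TIME }

    // Picks one version from the candidates that already satisfy a dependence:
    // the oldest at compile time, the newest in the later phases.
    // Versions are modeled as plain integers (an assumption of this sketch).
    static Optional<Integer> choose(Phase phase, List<Integer> candidates) {
        Comparator<Integer> order = Comparator.naturalOrder();
        return phase == Phase.COMPILE_TIME
                ? candidates.stream().min(order)
                : candidates.stream().max(order);
    }
}
```

An empty candidate list yields an empty Optional, which corresponds to a dependence that cannot be resolved.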
When compiling a module, the Java compiler writes class files into a module-structured classes directory. In this module-path layout there is one top-level directory for each module; the content of each module directory is structured as a normal classes directory, i.e., a tree of decomposed Java package names. In order to support interactive development, the Java launcher can run a modular application directly from a module-path directory. When doing so it performs resolution before invoking the application’s entry point, although the resulting configuration is not stored for future use.
The module system does not support general dynamic run-time resolution; i.e., it is not possible to add or remove dependences or modules after an application has started running. Sophisticated container-type programs such as application servers, IDEs, and test harnesses can achieve the effect of run-time resolution in a limited way by using the module-system API to install modules into a temporary module library and then run them from that library.
TODO: Finish implementing run-time module-path support.
TODO: Design and implement container support.
The Java programming language is extended to include module declarations for the purpose of defining modules, their content, and their relationships to other modules. A compilation unit containing a module declaration is, by convention, stored in a file named module-info.java and compiled into a file named module-info.class.
The simplest possible module declaration merely expresses the existence of a module with a specific name:
module foo { }
If a module has a version number then that is placed after the module name, preceded by an @ character:
module foo @ 1.0 { }
A version number starts with a digit and thereafter consists of Java identifier-part characters, periods ('.'), and dashes ('-').
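The version-number syntax rule can be checked with a few lines of ordinary Java. This is a sketch under two assumptions: "digit" is read as an ASCII digit, and "Java identifier-part characters" is read as whatever Character.isJavaIdentifierPart accepts. The VersionSyntax class is hypothetical and not part of the module system.

```java
// Hypothetical helper; not part of the Jigsaw API.
final class VersionSyntax {
    // Returns true if s starts with an ASCII digit and every later character is
    // a Java identifier-part character, a period, or a dash.
    static boolean isValid(String s) {
        if (s.isEmpty()) {
            return false;
        }
        char first = s.charAt(0);
        if (first < '0' || first > '9') {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '.' && c != '-' && !Character.isJavaIdentifierPart(c)) {
                return false;
            }
        }
        return true;
    }
}
```

Under this reading, strings such as 1.0 and 5.1a are accepted, while a string that begins with a letter is rejected.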
Module names are qualified Java identifiers, just like Java package names:
module foo.bar { }
Module declarations cannot be annotated.
Source and class files for ordinary Java types do not specify the modules of which they are members.
An exports clause in a module declaration makes the public types in the package it names available to other modules:
module foo {
    exports foo; // Export all public types in the foo package
}
Here the foo module exports all of the public types in its foo package, though not in any subpackages of foo. No other public types declared in the foo module are exported. There is no requirement that a module declare and export a package of the same name, though that is conventional for simple modules. It is not possible to export non-public types.
Multiple exports clauses are, of course, allowed:
module foo {
    exports foo;
    exports foo.spi;
    exports foo.util;
}
A module’s exports declarations govern the accessibility of the public types declared in the named packages. They are thus enforced at both compile time, by the Java compiler, and at run time, by the virtual machine.
TODO: Finish initial implementation.
ISSUE: Use package names that differ from the module names in these examples, to improve readability?
The requires clause expresses the dependence of one module upon another:
module foo {
    exports foo;
}
module bar {
    requires foo;
}
Here the bar module depends upon the foo module, so at both compile time and run time the exported types declared in foo are both visible to and accessible by types declared in bar. If no foo module is available then bar cannot be compiled, and if bar is invokable then neither can it be installed or invoked.
At run time foo and bar will have distinct module class loaders, and bar’s loader will use foo’s loader to load the types exported by foo.
An exports clause in a module’s declaration only affects the availability of types declared in that module; it cannot be used to re-export types imported from other modules.
ISSUE: Do we need disjunctive dependences? Negative dependences?
When bar simply requires foo then the exported types in foo are available to bar but not to other modules that depend upon bar and not upon foo. Imported types can be re-exported via the public modifier of the requires clause:
module foo {
    exports foo;
}
module bar {
    requires public foo; // Re-exports foo's exported types
}
module baz {
    requires bar; // Can use foo's exported types
}
The public modifier makes the types imported into bar from foo available to any other module that depends directly upon bar.
Types can be re-exported through a chain of modules:
module foo {
    exports foo;
}
module bar {
    requires public foo;
}
module baz {
    requires public bar;
}
module buz {
    requires baz; // Can also use foo's exported types
}
In this case any other module that depends upon either bar or baz will be able to use public types exported by foo without depending upon foo itself.
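The re-export rule amounts to a graph walk: start from the modules a module requires directly, then keep following only those edges that carry the public modifier. The sketch below models that walk in plain Java. The ReexportSketch and Requires classes and the map-based representation of module declarations are assumptions made for illustration, not the module system's own data structures.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of requires clauses; not part of the Jigsaw API.
final class ReexportSketch {
    // One requires clause: the target module and whether it has the public modifier.
    static final class Requires {
        final String target;
        final boolean isPublic;
        Requires(String target, boolean isPublic) {
            this.target = target;
            this.isPublic = isPublic;
        }
    }

    // Names of the modules whose exported types the given module can use:
    // every module it requires directly, plus whatever those modules re-export
    // through their own "requires public" clauses, transitively.
    static Set<String> usableModules(String module, Map<String, List<Requires>> decls) {
        Set<String> usable = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        for (Requires r : decls.getOrDefault(module, List.<Requires>of())) {
            pending.add(r.target);
        }
        while (!pending.isEmpty()) {
            String m = pending.remove();
            if (!usable.add(m)) {
                continue; // already visited
            }
            for (Requires r : decls.getOrDefault(m, List.<Requires>of())) {
                if (r.isPublic) {
                    pending.add(r.target);
                }
            }
        }
        return usable;
    }

    // The four-module chain from the text: buz requires baz, which re-exports
    // bar, which re-exports foo.
    static Map<String, List<Requires>> chain() {
        return Map.of(
            "foo", List.<Requires>of(),
            "bar", List.of(new Requires("foo", true)),
            "baz", List.of(new Requires("bar", true)),
            "buz", List.of(new Requires("baz", false)));
    }
}
```

For the chain in the text, usableModules("buz", chain()) contains baz, bar, and foo. If bar's dependence upon foo lacked the public modifier, the walk would stop at bar.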
A requires clause can include a version constraint:
module bar {
    requires foo @ 1.0;
}
This dependence of bar upon foo can be satisfied only by a foo module whose version is exactly 1.0. More-flexible constraints are useful in practice, so a constraint can be specified in terms of an exclusive or inclusive lower or upper bound:
module bar {
    requires foo @ >= 1.0;
    requires baz @ < 5.1a;
}
These dependences can be satisfied by any foo module with version 1.0 or later and any baz module with version less than 5.1a.
No specific semantics are imposed upon version numbers. Version numbers are compared using an algorithm similar to that of the Debian packaging system.
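To make the constraint forms concrete, the sketch below evaluates a version against an operator and a bound. The comparison is a deliberately simplified stand-in: it is neither the Jigsaw algorithm nor the full Debian one. It splits each version into runs of digits and non-digits, compares digit runs numerically and other runs lexicographically, and treats the version with fewer runs as the older one on a tie. The VersionConstraint class and the use of "=" for an exact match are assumptions of this example.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified version comparison; not the algorithm Jigsaw uses.
final class VersionConstraint {
    // Splits "5.1a" into alternating runs of digits and non-digits: 5 . 1 a
    static List<String> runs(String v) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < v.length()) {
            boolean digit = Character.isDigit(v.charAt(i));
            int j = i;
            while (j < v.length() && Character.isDigit(v.charAt(j)) == digit) {
                j++;
            }
            out.add(v.substring(i, j));
            i = j;
        }
        return out;
    }

    // Returns a negative, zero, or positive value as a is older than, equal to,
    // or newer than b under the simplified rule described above.
    static int compare(String a, String b) {
        List<String> x = runs(a);
        List<String> y = runs(b);
        int n = Math.min(x.size(), y.size());
        for (int i = 0; i < n; i++) {
            String p = x.get(i);
            String q = y.get(i);
            boolean numeric = Character.isDigit(p.charAt(0)) && Character.isDigit(q.charAt(0));
            int c = numeric
                    ? new BigInteger(p).compareTo(new BigInteger(q))
                    : p.compareTo(q);
            if (c != 0) {
                return Integer.signum(c);
            }
        }
        return Integer.compare(x.size(), y.size());
    }

    // op is one of "=", "<", "<=", ">", ">=" ("=" models the exact-match form).
    static boolean satisfies(String version, String op, String bound) {
        int c = compare(version, bound);
        switch (op) {
            case "=":  return c == 0;
            case "<":  return c < 0;
            case "<=": return c <= 0;
            case ">":  return c > 0;
            case ">=": return c >= 0;
            default:   throw new IllegalArgumentException("unknown operator: " + op);
        }
    }
}
```

Under this rule, version 5.1 satisfies the constraint < 5.1a while 5.1a itself does not, which matches the exclusive upper bound in the example above.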
TODO: Support both lower and upper bounds in version constraints.
TODO: Re-examine the version-comparison algorithm.
In large software systems it is often useful to restrict the set of modules that can depend upon some other module. The permits clause expresses such a constraint:
module foo {
    exports foo;
    permits bar;
    permits baz;
}
Here the module foo can be required only by modules named bar or baz. A dependence from a module of some other name upon foo will not be resolvable at compile time, install time, or run time. If no permits clauses are present then there are no such constraints.
The bar and baz modules can re-export foo’s exported types via requires public clauses. Care must be taken, therefore, when writing permits clauses.
TODO: Controlling permits by module name alone is not sufficient, since an adversary can install a module of any given name. At the same time, for debugging the JDK itself it’s desirable to be able to install an experimental version of a JDK module into a local module library which delegates to the built-in module library of a pre-installed JDK, so simply limiting permitted modules to just those in the same module library won’t work in general. We need to explore more alternatives here.
ISSUE: Should it be possible to restrict a permitted module from re-exporting a permitting module’s exported types?
To support the refactoring of large modular systems, and also to allow the separation of module names corresponding to well-defined standards (e.g., java.base) from the names of modules implementing those standards (e.g., jdk.base), the provides clause declares an alternate name for a module:
module foo {
    provides bar;
}
Given this declaration, any dependence upon bar can be satisfied by foo. More than one provides clause can be present.
TODO: Implement aliases.
ISSUE: Should aliases have version numbers? The syntax currently allows them. They appear to be necessary to support refactoring by aggregation. In popular native packaging systems, however, the natural mapping of a module alias is to a virtual package, and virtual packages don’t have version numbers.
If a module declares a class with a traditional public static void main entry point then it can be made into an application module via the class clause:
module foo {
    class foo.Main; // Contains the main method
}
The java launcher can then be used to invoke the module:
$ java -m foo
in which case the foo.Main.main method is found and invoked in the usual fashion. Any remaining command-line arguments are passed to the main method as usual.
A module declaration can contain at most one class clause.
ISSUE: Should there be a way to suggest, if not specify, an external name for the entry point for use by external agents such as command shells?
ISSUE: Should entry points be expressed instead as services?
A dependence from one module to another can be declared optional:
module bar {
    requires optional foo;
}
If no foo module is available then bar can still be installed and invoked. Code in bar that uses types from foo must be written defensively so that it operates properly when foo is not available.
A foo module must still be available when compiling bar since code in bar can depend upon types declared in foo.
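One familiar way to write such defensive code in ordinary Java is to probe for a class reflectively before touching anything that depends on it. The sketch below assumes that approach. The class name foo.HypotheticalHelper is invented for the example, and nothing here is specific to the Jigsaw API.

```java
// Hypothetical illustration of defensive coding around an optional dependence.
final class OptionalDependence {
    // Returns true if the named class can be found, without initializing it.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false, OptionalDependence.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Chooses a code path depending on whether the optional module's type exists.
    // "foo.HypotheticalHelper" stands in for some type exported by foo.
    static String describe() {
        return isAvailable("foo.HypotheticalHelper")
                ? "foo is present; using its helper"
                : "foo is absent; using the built-in fallback";
    }
}
```

Keeping all direct references to foo's types behind such a check, ideally in a separate class that is loaded only when the check succeeds, avoids linkage errors when foo is missing.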
A dependence from one module to another can be declared local:
module bar {
    requires local foo;
}
To resolve this dependence, foo must explicitly permit bar.
A local dependence allows two modules to define types in the same Java package:
module foo {
    permits bar;
    exports p;
}
module bar {
    requires local foo;
    exports p;
}
Such multi-module packages, also called split packages, are sometimes required when modularizing large legacy systems.
With a local dependence, types declared in the same package in each module can make use of public, protected, and even package-private types and members declared in the same package in the other module. The public types exported by each module are implicitly re-exported by the other. At run time this is all achieved by using the same module class loader for both modules.
More than two modules can be related by local dependences:
module foo {
    permits bar;
    exports p;
}
module bar {
    requires local foo;
    permits baz;
    exports p;
}
module baz {
    requires local bar;
    exports p;
}
In this case all three modules would, at run time, be loaded by the same module class loader.
ISSUE: Should requires local public be illegal?
ISSUE: Should each module in a set of modules related by local dependence be required explicitly to permit all the other modules? That is not the case today, but it is arguably safer.
The bindings of a module are the types defined within it together with those imported from other modules via requires clauses. The view of a module is a subset of its bindings, namely the set of types that it exports, via exports and requires public clauses, and the set of modules to which those types are available, as constrained by any permits clauses.
module bar {
    requires foo;
    exports bar;
}
This bar module binds types defined locally, e.g., on the module path under the bar module directory, as well as all public types exported from the module foo. It defines a single view which exports all public types in the bar package to any other module.
In large software systems it is often useful to define multiple views of the same module. One view can, e.g., be declared for general use by any other module, while another provides access to internal interfaces intended only for use by a select set of closely-related modules.
A series of exports, requires public, and permits clauses at the top syntactic level of a module declaration defines the module’s default view. Further views of a module’s bindings can be defined using the view construct, which specifies a view name together with a bracketed list of exports and permits declarations:
module bar {
    requires foo;
    exports bar;
    view bar.internal {
        permits baz;
        exports bar.private;
    }
}
The bar module now defines two views. The default view, available by referencing the module name bar, is the same as before; it’s as if the declaration also said view bar { exports bar; }. The new view, named bar.internal, is available only to the baz module. It exports all public types in the bar.private package. It also exports all public types in the bar package because the non-default views of a module inherit the exports clauses of that module’s default view.
A non-default view never has requires clauses.
A non-default view cannot declare its version; it inherits the version, if any, of its containing module.
A non-default view does not inherit the permits clauses, if any, of its containing module.
In addition to declaring exports and entry points, a non-default view can also declare aliases and services.
A non-default view can, finally, also declare an entry point different from that of its containing module’s default view, so a single module can define multiple related entry points. For example, the declaration
module commands {
    view cat {
        class org.foo.commands.Cat;
    }
    view find {
        class org.foo.commands.Find;
    }
    view ls {
        class org.foo.commands.List;
    }
}
defines three entry points: cat, find, and ls.
HISTORICAL NOTE: Module views are not a new idea. The concept proposed here is very similar to that of structures in the module systems of Scheme 48 and Standard ML.
TODO: Finish initial implementation.
ISSUE: Should a non-default view instead not inherit the types exported by the default view of its containing module declaration? If so, should there be a way to declare explicitly that a view inherits the exported types of the default view, or perhaps some other view?
ISSUE: How do views map to native packaging systems such as RPM or Debian? Treating a module view as a virtual package would probably work but might not scale well. Another possibility is to structure the names of non-default views so that they always include the names of their containing modules, but that turns views into second-class entities.
The module system assumes the existence of a foundational module named java.base, which is the one module that must be present in every Java SE implementation. It is the module upon which all others depend, either implicitly or explicitly, somewhat akin to the implicit reference to the java.lang package by every compilation unit.
If a module does not declare an explicit dependence upon a java.base module, is not itself named java.base, and does not define an alias or view named java.base, then at compile time a synthesized dependence upon java.base is inserted into the compiled module declaration. The version constraint in this dependence is of the form >= N, where N is the version number given to the -target option of the Java compiler, if any, or else the version number of the Java SE Platform Specification implemented by the system of which the compiler is a part.
A module can declare that it provides a service:
module foo {
    provides service mammals.Wombat with foo.WombatImpl;
}
Here the foo module declares that it implements the mammals.Wombat service using the class foo.WombatImpl.
To make use of a service, a module must first declare a dependence upon it:
module bar {
    requires service mammals.Wombat;
}
Code in the bar module can use an enhanced version of the ServiceLoader API to access instances of the Wombat service.
The order in which instances are returned is not specified.
A module can declare a service dependence to be optional, in which case it is possible to use the module even when no provider of the service is available. As with optional module dependences, code in such modules must be written defensively so that it operates properly when no providers are present.
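The client side of a service lookup, including the defensive handling of the no-provider case, can be sketched with the standard java.util.ServiceLoader class. This is an assumption-laden stand-in: the enhanced ServiceLoader API mentioned above is not shown here, the nested Wombat interface merely imitates mammals.Wombat, and the WombatClient class is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical client; uses the standard ServiceLoader, not the enhanced Jigsaw one.
final class WombatClient {
    // Stand-in for the mammals.Wombat service type from the text.
    interface Wombat {
        String name();
    }

    // Collects the names of all available providers. The order of the result is
    // not meaningful, since the order in which providers are returned is unspecified.
    static List<String> wombatNames() {
        List<String> names = new ArrayList<>();
        for (Wombat w : ServiceLoader.load(Wombat.class)) {
            names.add(w.name());
        }
        return names;
    }

    // Defensive use: behaves sensibly when no provider is available.
    static String greet() {
        List<String> names = wombatNames();
        return names.isEmpty()
                ? "no Wombat providers available"
                : "found " + names.size() + " Wombat provider(s)";
    }
}
```

When no provider is registered, the loop body never runs and greet() falls back to its no-provider message instead of failing.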
Services are not themselves versioned. A service is defined by a specific interface or abstract class, hence it is implicitly versioned by the version of the module that declares that type.
If a module defining a service also exports some types then those types are available only to modules that have regular module dependences upon it, either directly or indirectly. Classes that implement services are not exported implicitly, nor do they need to be exported explicitly. A class that implements a service can therefore remain both invisible and inaccessible to the clients of that service.
TODO: Finish working out the design and implementation.
ISSUE: Should permits clauses affect service lookup?