This document defines Deterministic Parallel Java (DPJ) as an
extension to the Java Programming Language, v1.6 (Java 6). For the
specification of Java 6, see The Java Programming Language
Specification [3], hereafter referred to as “JLS.”
Because we are extending an already-specified language (Java), we omit
many of the topics one would expect to find in a specification
(lexical structure, form of statements and expressions, etc.). We use
JLS for all that. Instead, we describe only what DPJ adds to Java.
The structure of the specification is as follows. In
§ 2, we describe the DPJ extensions to Java's
class and interface definitions. In § 3, we
describe region path lists, or RPLs, which provide a way to
name sets of memory locations on the heap in a DPJ program. In
§ 4, we describe the DPJ extensions to Java's class
and array types. In § 5 we describe
effects, which specify accesses to memory in terms of
operations on RPLs. In § 6, we describe DPJ's
constructs for expressing parallelism.
Readers of this document may wish to consult the following documents:
The Deterministic Parallel Java Tutorial provides a guided introduction to the DPJ language.
The Deterministic Parallel Java Installation Manual explains how to install DPJ on your system, and
how to compile and run DPJ programs.
The Deterministic Parallel Java Language Reference Manual explains the major features of the DPJ language in
detail. It contains many of the same concepts as this
specification, but with more discussion, motivation, and examples.
Please note that the purpose of this specification is to give a
compact and precise definition of DPJ, for someone who is
already familiar with the concepts discussed herein. Therefore, it is
probably best to have at least a working knowledge of the tutorial and
reference manual (which are more concerned with explaining the
language) before consulting this specification.
A region name declaration as defined immediately below may appear as a
class member (JLS § 8.1.6).
A region name declaration has the form regionregion-decls, where region-decls is a comma-separated
list of identifiers. The declared names are available within the
enclosing class scope and via inheritance and/or class selection (as
for an ordinary Java static member, JLS § 8.2), subject to visibility
restrictions. The declared names may be used as name RPL elements
(§ 3.2.1) wherever they are available.
Every region name declaration appearing in a particular class must
have a unique name. Otherwise, if two region declarations with the
same name appear in the same scope, the innermost declaration hides
all the others.
Any field declaration may include an RPL specifier with the
syntax inrpl, where rpl is a valid RPL
(§ 3.1). If the RPL specifier appears in a field
declaration, it must appear after the field name and before the
initializer expression, if any. The RPL specifier is meaningful only
for non-final fields; if an RPL specifier appears in a
final field, it is allowed but silently ignored. RPL specifiers
are not allowed for local variable declarations or formal method
parameter definitions, because those variables reside in stack regions
(see § 5.3).
Every method may optionally declare its effects via an effect
summary (§ 5.2). If an effect summary
is present, it must appear after the method arguments, before the
method body, and before the throws clause, if any. The compiler
internally converts each method effect summary to an effect set
(§ 5.1), called the declared effect set
of the method. If there is no explicit effect summary, then the
compiler inserts a default effect set containing writes
Root:*. In particular, an ordinary Java 1.6 method (with no DPJ
annotations) is a valid DPJ method, whose effect summary is
writes Root:*.
The compiler infers the effects of the method body
(§ 5.3) and checks the following, after
deleting all local effects (§ 5.1):
The inferred effects are a subeffect
(§ 5.5) of the declared effects.
The declared effects are a subeffect
(§ 5.5) of the declared effects of all
methods overridden by this one.
If either of these requirements is not met, the compiler generates an
error.
Constructors: The following rules apply to the method effect
summaries of constructors:
The effect of writing into a field of the object being
constructed does not have to be reported in the constructor's effect
summary, even if the field is not declared final. This is
because in DPJ, the effect of initializing an object in a task is
not seen until after all concurrent tasks have completed.
All instance initializer effects for a class must be included in
the method summaries of all constructors of the class.
A region declaration may appear as a statement in a method body (JLS
§ 8.4.7). The declaration has the same form as for class region
name declarations (§ 2.1). The
declared names are available as name RPL elements in the innermost
enclosing statement block.
The keyword commutative may optionally appear as a method
qualifier, before the return type and before the method parameters, if
any. The commutative annotation is a programmer-specified
guarantee that the method commutes with itself, i.e., any pair of
invokes effects (§ 5.1) using this
method are treated as noninterfering. If commutative appears in
the definition of a method M, then it must also appear in the
definitions of all methods overriding M.
When using commutative, the programmer is responsible for
ensuring two things:
The method is properly synchronized so that concurrent
invocations of the method behave as though they have occurred in
sequence (i.e., the invocations have serializable
semantics [4]).
The semantics of the method is such that either order of a pair
of invocations produces the same result.
Note that “same result” in condition (2) generally has an
application-specific meaning. For instance, two insert operations on
a set may commute with each other, because they produce the “same”
set (i.e., the two sets are indistinguishable as far as future set
operations like find are concerned), even though the sets may have
different internal representations. However, two append operations on
an ordered list generally do not commute.
As part of a class or interface definition, a list of generic type
and/or RPL parameter declarations may appear directly after the class
or interface name, which follows the keyword class or
interface. A parameter declaration extends the generic
parameter declaration of Java 1.6 and has the form
<params> or
<params|constraints>, where
params is a comma-separated nonempty list of identifiers
stating the parameters being declared. The generic type parameters,
if any, must come first; followed by the RPL parameters, if any.
Any of the type parameters may be preceded by the keyword
type. At least the first RPL parameter must be preceded by
the keyword region, and the other RPL parameters may be
preceded by the keyword region.
constraints is a comma-separated list of RPL
constraints. An RPL constraint has the form rpl-1#rpl-2, where rpl-1 and rpl-2 are legal RPLs
(§ 3.1) in the scope of the class definition.
The type parameters function exactly as in Java 1.6 (notice that if
only type parameters appear, and the optional type keyword is
not used, the syntax corresponds exactly to Java 1.6). The RPL
parameters may be used as RPL elements (§ 3.2) in
the scope of the class definition, including in any type constraints
appearing in any generic parameter declarations. As with type
parameters in Java 1.6, the type and RPL parameters are available only
in an instance context (i.e., in any scope where this is in
scope).
The RPL parameter constraints, if present, impose disjointness
constraints (§ 3.3.4) that are
enforced when RPL arguments are bound to the parameters in a type that
instantiates the class (§ 4.1.1). Each
constraint rpl-1 # rpl-2 is also recorded in the
environment, so that the compiler can prove that rpl-3 and
rpl-4 are disjoint, if rpl-3 is included in
(§ 3.3.3) rpl-1 and
rpl-4 is included in rpl-2. It is an error to write a
constraint rpl-1 # rpl-2 such that rpl-1
is included in rpl-2 or vice versa.
Declaring method parameters: As part of a method definition
(including constructors), a list of type and RPL parameter
declarations may appear inside angle brackets <...> in the same
position where Java 1.6 allows type parameters, i.e., after the method
qualifiers (public, static, etc.), if any, and before the
method return type (for non-constructor methods) or class name (for
constructors). A method parameter declaration has the same form as a
class parameter declaration (§ 2.4). The
declared RPL parameters may be used as RPL elements in the scope of
the method definition, including the formal parameters and return
type. The constraints, if present, impose disjointness constraints
that are enforced when the parameters become bound in a method
invocation. The constraints are also recorded in the method
environment, as described in § 2.4 for
class parameter constraints.
Invoking parametric methods: When a method defined with type
and/or RPL parameters is invoked, the invocation (at the programmer's
option, and depending upon the support for region inference in the DPJ
compiler) may use either explicit type and RPL arguments or
inferred type and RPL arguments. This approach corresponds to
the way that Java 1.6 handles generic method parameters.
Explicit Type and RPL Arguments: As in Java 1.6, if type
and/or RPL arguments are present, then there must be an explicit
selector (a static class selector or an expression of class or
interface type) followed by a dot. The arguments must be enclosed in
angle brackets <...>, after the dot and before the method name.
For constructors, the arguments must follow the keyword new and
precede the class name. In either case, the RPL arguments have the
same form as for class and interface RPL arguments
(§ 4.1.1), except that the keyword
region must precede at least the first RPL argument. (This
requirement tells the compiler which arguments are types and which
ones are RPLs. Because of Java's method overloading, this information
is not available from the method name, as it is for classes from the
class name.) The number of type and RPL arguments must exactly match
the number of type and RPL parameters; otherwise, the compiler reports
an error.
Inferred Type and RPL Arguments: If a method is defined with
type and/or RPL parameters, then the method invocation may be written
without explicit type or RPL arguments. The compiler may
attempt to infer the RPL arguments from the types of the value
arguments, using an algorithm similar to the one that Java 1.6 uses
for inferring generic parameters. This specification does not require
any particular inference algorithm; in particular no inference is a
correct implementation of the language. If the inference is not
possible for a given RPL argument, either because inference is not
supported or it is not possible (for instance, because the parameter
does not appear in any of the actual argument types), then the
compiler must use Root:* as the RPL for that argument.
RPL Constraints: The compiler checks the RPL arguments
against the constraints, as discussed in
§ 4.1.1 for class parameter constraints. If
any constraint is violated, the compiler issues a warning.
In DPJ, the programmer uses region path lists, or RPLs, to name sets
of heap regions (§ 3.4.1). The region sets in
turn are used in specifying and checking effects
(§ 5). § 3.1 and
§ 3.2 describe the construction of RPLs.
§ 3.3 describe the rules for comparing RPLs,
which are essential in noninterference checking.
§ 3.4 describes heap regions and
stack regions, which do not appear in DPJ programs, but are
used by the implementation to do noninterference checking.
A region path list (RPL) is a nonempty sequence of RPL elements
(§ 3.2) separated by colons (:). An RPL
is well-formed if it obeys the following rules:
The first element must be Root, an RPL parameter element
(§ 3.2.3), or a variable RPL element
(§ 3.2.4).
Root, an RPL parameter element, or a variable element may
appear only as the first element.
It is also acceptable to write an RPL starting with a class or local
region name element (§ 3.2.1), an array
index element (§ 3.2.3), or the star element
(§ 3.2.5). In these cases, the RPL has
Root as an implicit first element.
An RPL element is one of the following: a name RPL element, a
parameter RPL element, an array index RPL element, a variable RPL
element, or the star RPL element.
Root, which is a reserved name available at global scope.
A class region name declared in the enclosing class scope
(§ 2.1).
A local region name declared in the enclosing statement scope
(§ 2.3.2).
selector.region, where selector is a
field access selector (i.e., everything to the left of
.Identifier in a field access expression, JLS §
15.11), and region is a class region declared in the class
named by selector (§ 2.1).
A parameter RPL element names a class RPL parameter
(§ 2.4), method RPL parameter
(§ 2.4.2), or capture parameter
(§ 4.1.7) that is in scope. Note that
capture parameters are introduced by the compiler in capturing types
(§ 4.1.7), and cannot be written by the
programmer.
RPLs have several pairwise relations that are used to determine when
two RPLs name disjoint or overlapping sets of regions: equivalence
(§ 3.3.1), nesting
(§ 3.3.2), disjointness
(§ 3.3.4), and inclusion
(§ 3.3.3).
There is an equivalence relation on RPLs, which we will denote here by
“rpl-1 is equivalent to rpl-2.” Its meaning is that
the set of regions represented by rpl-1 is equal to the set of
regions represented by rpl-2.
Equivalent RPLs: Two RPLs are equivalent if both RPLs have
the same number of elements, and the corresponding pairs of elements
are equivalent.
Equivalent RPL Elements: Two RPL elements are equivalent if:
They refer to the declared same region name, RPL parameter, or
variable name; or
They are [expr-1] and
[expr-2], where expr-1 and expr-2
are always-equal expressions; or
They are both [?] or *.
Always-Equal Expressions: Always-equal expressions are
defined as follows:
Two constants are always-equal expressions if they are the same
value.
Two variables are always-equal expressions if they are the same
variable (i.e., they have the same variable symbol) and the variable
is declared final.
Two binary operation expressions are always-equal expressions if
they represent the same operation, and their corresponding component
expressions are always-equal.
There is a nesting relation on RPLs, which we will denote here by
“rpl-1 is nested under rpl-2.” Its meaning is that
for every region r-1 represented by rpl-1, there exists
a region r-2 represented by rpl-2 such that r-1
is nested under r-2. The meaning of “r-1 is nested
under r-2” Is that r-1 is a descendant of r-2 in
the region tree.
rpl-1 is nested under rpl-2 if one of the following holds:
rpl-2 is Root.
rpl-1 is a sequence of two or more RPL elements, and
rpl-1 without its last element is nested under rpl-2.
There is an inclusion relation on RPLs, which we will denote here by
“rpl-1 is included in rpl-2.” Its meaning is that the
set of regions represented by rpl-1 is contained (in the sense
of ordinary set inclusion) in the set of regions represented by
rpl-2.
Inclusion of RPLs:rpl-1 is included in rpl-2
if one of the following holds:
rpl-2 ends with *, and rpl-1 is nested under
rpl-2 (§ 3.3.2) without its last element.
The last element of rpl-1 is included in the last element
of rpl-2, and inclusion holds for the RPLs after stripping
the last elements of both.
rpl-1 has one element, which is a capture parameter
(§ 4.1.7) included in rpl-2.
Inclusion of RPL Elements: RPL element elt-1 is
included in RPL element elt-2 if (1) the elements are
equivalent (§ 3.3.1); or (2) elt-1
is any array index RPL element and elt-2 is [?].
There is a disjointness relation on RPLs, which we will denote here by
“rpl-1 is disjoint from rpl-2.” Its meaning is that
the set of regions represented by rpl-1 has empty intersection
with the set of regions represented by rpl-2.
Disjoint RPLs:rpl-1 and rpl-2 are disjoint if
one of the following holds:
rpl-1 is included in
(§ 3.3.3) some RPL rpl-3;
rpl-2 is included in some RPL rpl-4; and rpl-3
and rpl-4 are constrained to be disjoint
(§ 2.4).
For some n ≥ 1, (a) the RPLs formed by taking the first n
elements of rpl-1 and rpl-2 are equivalent
(§ 3.3.1); and (b) rpl-1 and
rpl-2 have disjoint elements in position n+1 (this is
called a “distinction from the left” in [2]).
The last elements of rpl-1 and rpl-2 are disjoint
(this is called a “distinction from the right”
in [2]).
Disjoint RPL Elements: Two RPL elements elt-1 and
elt-2 are disjoint if (1) one is a name RPL element
(§ 3.2.1) and the other is an array index
RPL element (§ 3.2.3); or (2) they are both
name RPL elements corresponding to distinct region name symbols
(§§ 2.1
and 2.3.2); or (3) they are
the array RPL elements [expr-1] and
[expr-2], where expr-1 and expr-2 are
always-unequal expressions.
Always-Unequal Expressions: Always-unequal expressions are
defined as follows:
Two constants are always-unequal expressions if they are
different values.
expr-1 and expr-2 are always-unequal expressions
if expr-1 is the negation expression
(§ 6.2) of expr-2, or vice versa.
DPJ partitions memory into heap regions (collections of memory
locations on the heap) and stack regions (collections of memory
locations on the stack). Heap regions are explicitly named using RPLs
(§ 3). Stack regions are internally named and
automatically managed by the compiler.
Because all DPJ type and effect annotations are erased at compile
time, regions are not actually represented at runtime. They are a
theoretical construct, used in the soundness proofs of
DPJ [1]. For example, regions are used to show that
if two RPLs are disjoint (§ 3.3.4),
then the two sets of memory locations associated with the RPLs in the
dynamic semantics are nonintersecting.
The programmer should not have to reason directly about regions; the
proofs in [1] essentially say that it is sufficient
to reason about RPLs, using the rules given in § 3.
For more information about regions and the region tree, the interested
reader is referred to [1].
A heap region names a set of memory locations on the heap at runtime.
Syntactically, a heap region is like an RPL (§ 3),
with the following exceptions:
No * or [?] appears in the sequence of elements.
There are no parameter RPL elements.
Object reference values appear instead of object reference variables.
Integer values [n] appear instead of integer expressions
[expr].
An RPL with no * or [?] corresponds to a heap region at
runtime, by replacing parameters with actual regions, object reference
variables with the references they store, and integer expressions with
the values to which they evaluate. For more information about how
this substitution works, see [1]. An RPL containing
* or [?] represents the set of valid RPLs obtained by
substituting (possibly empty) sequences of RPL elements for *
and nonnegative array index values for ?. This set of RPLs in
turn corresponds to a set of regions, via the substitutions mentioned
above.
Every heap region is part of a tree of regions rooted at the global
name Root (§ 3.2.1). The tree
structure is given in two ways:
By the syntax of RPLs: e.g., A:B is a child of A.
By the types of object references: if o is an object
reference, and the owner region of the type of o
(§ 4.1.6) is R, then a region whose
RPL starts with o is a descendant of region R.
For more information about the region tree, see [1].
The compiler internally generates stack regions for method
value parameters and local variables. For example, the declaration
int x inside a method generates the stack region x that is
in scope where the variable x is in scope. These stack regions
are generated and checked automatically, and the programmer cannot
specify them directly.
Stack regions are compared by symbol: two occurrences of the same
symbol are treated as identical, and different symbols are treated as
different. This simple comparison is possible for stack variables
because Java does not allow references to or aliasing of these
variables.
When a class or interface defined with type and/or RPL parameters
(§ 2.4) is used to name a type, the type
may optionally specify type and/or RPL arguments.
Syntax: The arguments have the form
<args>, where args is a comma-separated list
of valid class or interface types (this section) and/or RPLs
(§ 3). This syntax, if it appears, must immediately
follow the class or interface name. The types, if any, must appear
first; followed by the RPLs, if any. Any of the types may be preceded
by the keyword type, and any of the RPLs may be preceded by the
keyword region; but the keywords are optional, as the compiler
can infer which are the types and which are the RPLs from the class or
interface definition.
Binding RPL arguments: It is an error to specify more RPL
arguments than there are RPL parameters. If fewer RPL arguments than
parameters are specified, the remaining arguments (from left to right)
become bound to Root. For each RPL parameter constraint
rpl-1 # rpl-2 specified in the parameter
declaration (§ 2.4), the compiler checks
that the RPLs rpl-1 and rpl-2 are disjoint
(§ 3.3.4), after substituting the RPL
arguments for the corresponding parameters in the RPLs. If any
disjointness constraint is violated, the compiler issues a warning.
The DPJ erasure of a class type is the type obtained by
ignoring all RPL parameters and arguments. For example, suppose class
C has a type parameter T and a region parameter R.
Then the DPJ erasure of C<Object,Root> is C<Object>.
Class or interface types T1 and T2 with no type or RPL arguments
are equivalent if they are equivalent in ordinary Java. Otherwise,
T1 and T2 are equivalent if (1) their DPJ erasures
(§ 4.1.2) are equivalent according to
this section; and (2) for every RPL parameter P of C, the RPL
arguments bound to P in T1 and T2 are equivalent
(§ 3.3.1).
Types that instantiate the same class or interface :T1 is
a subtype of T2 in DPJ if (1) the two types instantiate the same
class or interface; (2) the two types have no type or RPL arguments;
and (3) T1 is a subtype of T2 in ordinary Java.
T1 is also a subtype of of T2 in DPJ if (1) the two types
instantiate the same class or interface; (2) the DPJ erasure
(§ 4.1.2) of T1 is a subtype of the
DPJ erasure of T2 according to this section; and (3) the RPL
arguments of T1 are included
(§ 3.3.3) in the corresponding
arguments of T2.
Types that instantiate different classes:T1 is a subtype
of T2 if T2 is a direct supertype or indirect
supertype of T1.
Direct supertypes: T2 is a direct supertype of T1 if a
type with T2's class or interface appears in an extends or
implements clause of T1's class or interface, and that type
is a subtype of T2 after substituting arguments for parameters
according to T1.
Indirect supertypes: T2 is an indirect supertype of T1
if there is a chain of direct supertypes connecting T1 to T2.
A cast from one class or interface type to another is legal if the
cast would be legal for the DPJ erasures
(§ 4.1.2) of the types. Therefore
casts that violate type preservation are allowed. Any unsafe cast
will generate the usual “unchecked or unsafe operations” warning
that Java 1.6 gives for unsound generic casts.
If class-type is a class type, then the owner RPL of
class-type is (1) the RPL bound to the first parameter of
class-type, if class-type has parameters; otherwise (2)
Root.
The compiler extends Java 1.6 type capture (JLS § 5.1.10) to capture
RPL parameters as well as generic type parameters. If type is
a class type with rpls as its RPL arguments, then the capture
of type is defined as follows:
Take the Java 1.6 capture of type, possibly
substituting for the generic type arguments but keeping the same RPL
arguments.
In the result of step 1, for each RPL rpl in rpls
that is partially specified (i.e., contains * or [?]),
replace that RPL with a fresh RPL parameter constrained to be
included in rpl.
In DPJ, the array type is given by base-type followed by one or
more instances of brackets, where base-type is a
primitive, class, or interface type; and each instance of
brackets is a pair of brackets ([]) optionally followed
by an RPL argument <rpl>, optionally followed by
#index-var, where index-var is an identifier that
declares an index variable. An array type is valid if (1)
base-type is valid, treating all index variables appearing in
the type as integer variables in scope; and (2) the RPL of every
brackets is valid, treating all index variables appearing in
that brackets and all brackets to the left of that one
as integer variables in scope.
Default index variable declaration: If no index variable
declaration #index-var appears in a particular instance
of brackets, then that instance of brackets is treated
as if it had the declaration #_ (pound underscore). For
example, int[]<[_]> is a valid array type; it is equivalent to
int[]<[_]>#_.
A new array expression is new followed by base-type and
one or more instances of new-brackets, where base-type
is a primitive, class, or interface type; and new-brackets is
the same as brackets as described in
§ 4.2.1, except that an integer length expression
expr may optionally appear between the brackets. A new array
expression is valid if (1) the array type generated by deleting the
length expressions is valid (§ 4.2.1); and (2) the
first n instances of new-brackets contain a length
expression, for some n ≥ 1, while the rest do not.
If T1 and T2 are array types, then the compiler
checks whether T1 is a subtype of T2 as follows:
If the RPL specified in the leftmost brackets of
T1 is not included in
(§ 3.3.3) the RPL specified in the
leftmost brackets of T2, then the answer is no.
Otherwise, check whether the element type of T1 is a subtype
of the element type of T2 (§ 4.1.4
or this section), but require equivalence of corresponding RPL
arguments (§ 3.3.1), instead of just
inclusion, in doing this check.
We require equivalence of RPL arguments for all but the leftmost RPL
because it is not sound in general to allow subtype assignments into
array cells.
The compiler computes the type of a field access expression
selector-exp.field-name as follows: (1) compute
the type selector-type of selector-exp; (2) capture the
type (§ 4.1.7) to generate type
captured-selector-type; (3) look up the type field-type
of field based on the class C named in
captured-selector-type; (4) make the following substitutions in
field-type: (a) the type and RPL arguments of
captured-selector-type for the type RPL parameters of C; and
(b) selector-rpl for this. If selector-exp is a
final local variable var, then selector-rpl is
var; otherwise, selector-rpl is a fresh capture
parameter (§ 4.1.7) constrained to be
included in owner-rpl:*, where owner-rpl is the
owner RPL (§ 4.1.6) of
captured-selector-type.
If expression e has array type T (§ 4.2.1), then we
construct the type and region of array access expression
e[e'] as follows:
To construct the type, do the following: (a) concatenate
base-type of T with all instances of brackets in T
but the leftmost to form elt-type; and (b) if an index
variable i is declared in the leftmost instance of
brackets, then substitute e' for all instances of i
appearing in elt-type.
To construct the RPL, do the following: (a) let rpl be
the RPL appearing in the leftmost instance of brackets in
T, or Root if no RPL appears there; and (b) if an index
variable i appears in the leftmost instance of brackets,
then substitute e' for all instances of i appearing in
rpl.
as follows: (1) compute the type selector-type of
selector-exp and the types arg-types of args; (2)
capture selector-type (§ 4.1.7) to
generate type captured-selector-type; (3) look up the method
symbol based on the method name, selector-type, and
arg-types, and find the return type return-type of the
method; (4) make the following substitutions in return-type:
(a) the type and RPL arguments of captured-selector-type for
the type and RPL parameters of return-type; (b)
type-args and rpl-args for the type and RPL parameters
of the method; (c) for every expression int-exp in args
of a type assignable to int, int-exp for the
corresponding formal parameter of the method; and (d)
selector-rpl for this, where if selector-exp is a
final local variable var, then selector-rpl is
var, and otherwise selector-rpl is a fresh capture
parameter (§ 4.1.7) constrained to be
included in owner-rpl:*, where owner-rpl is the
owner RPL (§ 4.1.6) of
selector-type.
Inferred or Missing Type Arguments, RPL Arguments, and/or
selector-exp: If there are no explicit type or RPL
arguments, the compiler infers them (if necessary) as stated in
§ 2.4.2 and proceeds as stated above
using the inferred arguments in step (4)(b). If there is no
selector-exp, then as in ordinary Java, the implied selector is
this.
An effect is a set of actions that affect memory. Every statement and
expression in the program is assigned an effect. If the effects of
two statements do not interfere, then the statements may be safely run
in parallel.
Internally, the compiler represents effects as sets of basic
effects. A basic effect is one of the following:
A read effect, indicating a read operation on an RPL
(§ 3) or stack region
(§ 3.4.2). This effect represents any read at
runtime to any heap region named by the RPL, or to the stack region.
A write effect, indicating a write operation on an RPL or
stack region. This effect represents any read or write to any heap
region named by the RPL, or to the stack region.
An invoke effect, indicating an invocation of method m
of class C with effect E, where E is a set of basic effects.
A basic effect is local if it is a read or write effect on a
stack region or an RPL containing a local region name
(§ 3.2.1), or it is an invocation effect
whose underlying effects are all local. We refer to a set of basic
effects as an effect set.
The programmer may specify an effect summary as part of a method
definition (§ 2.3.1). An
effect summary consists either of the keyword pure, representing
an empty effect set, or one or both of the following, in this order:
Read effects: The keyword reads, followed by a
comma-separated list of valid RPLs (§ 3),
representing one read effect on each RPL.
Write effects: The keyword writes, followed by a
comma-separated list of valid RPLs (§ 3),
representing one write effect on each RPL.
The effect set of the whole summary is the union of zero, one, or two
effect sets generated as described above.
The compiler infers the effects of a statement, expression, or
statement block by using the following method-local analysis:
For expressions selector-exp.field-name that
directly access a non-final field (effects on final
fields are ignored), the compiler computes the RPL access-rpl
accessed by the expression. It uses the same procedure as for
typing field access expressions
(§ 4.3.1), except that it uses the
the RPL specifier (§ 2.2) instead of the
type associated with field-name in the class C, and there
are no substitutions for type parameters. The compiler records a
read or write effect (§ 5.1) to
access-rpl, depending on whether the expression is used in an
assignment or access.
For statements and expressions that invoke a method, the
compiler computes the effect set E generated by the invocation.
It uses the same procedure as for typing method invocation
expressions (§ 4.3.3), except that it
uses the effect summary of the method
(§ 2.3.1) instead of the
return type, and there are no substitutions for type parameters.
The compiler records an invocation effect
(§ 5.1) with the method symbol and E.
For statements and expressions that access a non-final
stack variable (i.e., method formal parameter or local variable),
record a read or write effect on a stack region identified by the
variable's symbol. Effects on final variables are ignored.
For any other statement or expression, the compiler accumulates
the effects of its components, coarsening component effects as
necessary (§ 5.4) to generate effects
that are valid at the outer scope.
Coarsening of RPLs: RPLs appearing in accumulated effects may
include any of the following elements that are no longer in scope at
the point where the effects are being reported: (1) integer variables;
(2) final local variables of class type; and (3) locally
declared region names. The compiler therefore performs the following
conversion (called effect coarsening) to generate a valid set
of accumulated effects in the surrounding environment:
Delete all effects on RPLs containing a local region name that
is out of scope.
Replace each RPL rpl that starts with var, where
var is a final local variable not in scope, with
owner-rpl:*, where owner-rpl is the owner RPL
(§ 4.1.6) of the type of var.
Perform this operation recursively on the resulting RPL.
Replace all elements [e], where e refers to
variables not in scope, with [?].
Coarsening of Stack Regions: If a stack region appearing in
an effect goes out of scope, any effect on that stack region is
deleted in translating the effect to the outer scope.
The following rules determine whether effect set E1 is a subeffect of
effect set E2:
If E1 and E2 each consist of a single basic effect, then
the following rules apply: (a) if R1 is included in R2
(§ 3.3.3), then a read of R1 is
a subeffect of a read or write of R2, and a write of R1 is a
subeffect of a write of R2; and (b) if E3 is included in
E4, then an invocation of method M with effect E3 is a
subeffect of an invocation of the same method M with effect E4.
An invocation of method M with effect E is a subeffect of E.
If E1 is a subset of E2 (i.e., all the basic effect of
E1 also appear in E2), then E1 is a subeffect of E2.
If E1 = E3 ∪ E4, where E3 and E4 are subeffects of
E2, then E1 is a subeffect of E2.
The subeffect relation is reflexive and transitive.
The following rules determine whether effect sets E1 and E2 are
noninterfering:
If E1 and E2 each consist of a single basic effect, then
the following rules apply: (a) two read effects are noninterfering;
(b) two effects on R and R', each a read or write, are
noninterfering if R and R' are disjoint
(§ 3.3.4); and (c) two invocations
of the same method are noninterfering if the method is declared
commutative
(§ 2.3.3).
If E3 and E4 are noninterfering, then an invocation of
method M with effect E3 is noninterfering with E4.
If E1 = E3 ∪ E4, where E3 and E4 are both
noninterfering with E2, then E1 is noninterfering with E2.
DPJ extends Java's control flow constructs by adding two parallel
statements: cobegin and foreach. cobegin followed
by a statement block executes each statement of the block as a
parallel task. foreach defines a loop that executes sets of
iterations as parallel tasks.
The cobegin statement consists of the keyword cobegin,
followed by a statement S to execute. If S is any statement but a
block enclosed in curly braces { … }, or if S
is a block consisting of a single statement, then the cobegin
just executes the statement S. Otherwise, S is a block containing
multiple statements, and the component statements of S are run as
parallel tasks. There is an implicit barrier (join) at the end of the
cobegin statement, so that all the component tasks must finish
execution before the parent task executes the statement after the
cobegin.
The compiler computes the inferred effect set
(§ 5.3) of each statement in the block and checks
the effect sets for pairwise noninterference
(§ 5.6). The compiler emits a warning if
interference is discovered. Both parallel and sequential code
generation of cobegin blocks are supported.
The foreach statement has the syntax
foreach(index-var in start,
length, stride)S, where
index-var is an integer variable declaration.
start, length, and stride are integer
expressions. The stride expression is optional, and the
default is 1.
S is a statement representing the loop body. It may be either
a simple statement or a block.
The foreach statement causes one instance of S to execute for
each value of index-var in the iteration space
{start + stride ⋅ i | i ∈ [0,
length−1]}. index-var is treated as a
final variable in S, so that any assignment to the index
variable inside the loop causes an error.
The compiler performs the following noninterference check for indexed
foreach loops:
Infer the effect set of the body of the foreach
(§ 5.3).
Create a copy of the effect set generated in (1), but replace
every occurrence of index-var with the negation expression
(§ 3.3.4) of index-var.
If interference is detected, the compiler generates a warning.
Both parallel and sequential code generation are supported. For
sequential code generation, the foreach iterations are executed
in sequence, from lowest to highest in the iteration space. For
parallel code generation, the compiler repeatedly halves the iteration
space, recursively forking off pairs of parallel tasks, until a
programmer-specified cutoff. The precise mechanism for specifying the
cutoff is implementation dependent.
Robert L. Bocchino and Vikram S. Adve.
Formal definition and proof of soundness for Core DPJ.
Technical Report UIUCDCS-R-2008-2980, University of Illinois at
Urbana-Champaign, 2008.
Robert L. Bocchino Jr., Vikram S. Adve, Danny Dig, Sarita V. Adve, Stephen
Heumann, Rakesh Komuravelli, Jeffrey Overbey, Patrick Simmons, Hyojin Sung,
and Mohsen Vakilian.
A type and effect system for Deterministic Parallel Java.
In OOPSLA '09: Proceedings of the 24th ACM SIGPLAN Conference on
Object-Oriented Programming Systems, Languages, and Applications, pages
97–116, New York, NY, USA, 2009.