INTRODUCTION OF HIGH LEVEL CONCURRENCY SEMANTICS IN OBJECT ORIENTED LANGUAGES
G Stewart von Itzstein
Bachelor of Computer and Information Science (Honours)
A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy, Information Technology
31st December 2004
Table of Contents
List of Figures
List of Tables
author wishes to thank
But like all systems, it has a weakness. The system is based on the rules of a building. One system built on another. If one fails, so must the other. (Wachowski 2003)
Concurrency expression in most popular programming languages is still quite low level, based on constructs such as semaphores and monitors that have not changed in twenty years. Emerging libraries (such as the upcoming Java concurrency library JSR-166) show that programmers are demanding that higher-level concurrency semantics be available in mainstream languages, and major new research results have emerged more recently in concurrency. Communicating Sequential Processes (CSP), the Calculus of Communicating Systems (CCS) and the Pi calculus have higher-level synchronization behaviours defined implicitly through the composition of events at the interfaces of concurrent processes. The Join calculus, on the other hand, has explicit synchronization based on a localized conjunction of events defined as reduction rules. The Join semantics appear to be more appropriate for mainstream programmers, who want explicit expressions of synchronization that do not breach the Object Oriented idea of modularisation. Join readily expresses the dynamic creation and destruction of processes and channels, which is sympathetic to dynamic languages like Java.
The research described here investigates whether the Object Oriented programming language Java can be modified so that all expressions of concurrency and synchronization can use higher-level syntax and semantics inspired by the Join calculus. Of particular interest is to determine whether a true integration of Join into Java can be made. This work seeks to develop a true language extension, not just a class library. This research also investigates the impact of the Join constructs on the programming of well-known concurrency software patterns, including the size and complexity of the resulting programs. Finally, the impact on the performance of programs written in the language extension is also studied.
The major contribution of the thesis is the design of a new superset of Java called Join Java, and the construction and evaluation of the first prototype compiler for the language. The Join Java language can express virtually all published concurrency patterns without explicit recourse to low-level monitor calls. In general, Join Java programs are more concise than their Java equivalents. The overhead introduced in Join Java by the higher-level expressions derived from the Join calculus is manageable. The synchronization expressions associated with monitors (wait and notify), which are normally located in the body of methods, can be replaced by Join Java expressions (the Join patterns) which form part of the method signature. This provides opportunities in the future to further enhance the inheritance possibilities of Join Java, possibly minimizing the impact of inheritance anomalies.
'I declare that this thesis does not incorporate without acknowledgment any material previously submitted for a degree or diploma in any university and that to the best of my knowledge it does not contain any materials previously published unless noted below, or written by another person except where due reference is made in the text.' Some of the material in this thesis has been published in the following papers.
A brief introduction of the ideas covered in this thesis appeared in:
Itzstein, G. S. and
Parts of the concurrency applications chapter appeared in:
Itzstein, G. S. and
Some of the implementation chapter appeared in:
Itzstein, G. S. and Jasiunas, M. (2003). On Implementing High Level Concurrency in Java. Advances in Computer Systems Architecture 2003, Aizu Japan, Springer Verlag.
Parts of the design patterns chapter appeared in:
Itzstein, G. S. and
G Stewart Itzstein
Table of Contents
1.1 Introduction 1-2
1.2 Motivation 1-3
1.3 Structure of Thesis 1-7
1.4 Research Contribution 1-8
This chapter introduces the content of the research reported in this thesis. The chapter is divided into three sections. The first section gives the motivation for the work: an overview of the research is given, indicating possible deficiencies in the expression of concurrency. The second section describes the structure of the thesis. Finally, the last section describes the research contributions of the thesis.
Given the long history of object oriented and concurrent programming (Dahl and Dijkstra 1972; Hoare 1985; Milner 1989) it might be expected that all the important issues have been investigated. Concurrency has become increasingly critical to modern software and operating systems. This is reflected at all levels of computer systems architecture, from hardware multi-threading (Intel 2004) to multithreading support at the programming level, where concurrency is more closely integrated into programming languages. With this increased reliance on concurrency, it would be expected that modern languages such as C++ and Java would have incorporated a high-level set of language elements to better express concurrency. However, this is not the case. For example, C++ uses operating-system-level thread and semaphore libraries that are not integrated into the language. In Java, even though concurrency appears to be more closely integrated, the expression of synchronization is still quite low-level, being loosely based on monitors (Hoare 1974; Buhr, Fortier et al. 1995). Hansen states that the monitor implementation of Java (its claimed “high-level” expression of concurrency) is not even a true reflection of his original vision of monitors (Hansen 1999). He argued that Java synchronization should be used by default rather than left up to the programmer. Another author, Holub, also made the following observation regarding Java:
“The Java programming language's threading model is possibly the weakest part of the language. It's entirely inadequate for programs of realistic complexity and isn't in the least bit Object Oriented.”
Process calculi, on the other hand, are designed to model concurrency rather than implement it (Milner 1989). These process calculi have a strong research heritage in the expression and formal meaning of concurrency. The deficiencies identified by Holub and Hansen within mainstream Object Oriented languages might be overcome by using ideas from process calculi research.
Concurrency has been implemented in many languages. The first purpose-built concurrent programming language was Simula (Dahl and Nygaard 1966). Simula also became the first language to introduce some of the ideas of Object Oriented programming. Simula used a crude concurrency mechanism of co-routines that helped simulate concurrency. Later languages such as Ada (Mapping and Team 1994) improved upon simple co-routine concurrency. Strangely, C++ supplied less support for concurrency, instead leaving implementations to the outside environment.
The authors of Simula incorporated a primitive expression of concurrency via co-routines because they recognized that the world to be simulated was itself concurrent. Since then there has been continuing interest in supporting parallelism in a number of different programming paradigms, including the Object Oriented paradigm. Yet “modern languages” like Java support concurrency and synchronization using technology that is nearly 30 years old. These low-level primitives such as semaphores (Dijkstra 1968) and monitors (Hoare 1974), even though very flexible, are not easy to scale up to large applications, nor are they sympathetic to the Object Oriented paradigm. This has been recognized in the research community, with a number of higher-level concurrency implementations being proposed, such as JCSP (Welch 1999), JavaTrevini (Colby, Jagadeesan et al. 1998) and JSR-166 (Lea 2002). Yet libraries do not entirely solve the problem, as they do not closely integrate with the existing language semantics, encouraging the poor programming practice that the semantics were supposed to prevent.
Whilst the concurrency requirements of imperative, non-Object-Oriented languages could be satisfied using low-level concurrency structures, Object Oriented languages extensively use the structure of the program to represent information about the modelled domain. Programmers that use Object Oriented languages make use of objects within their programs to model the problem they are solving. Consequently, if the language does not support concurrency within the Object Oriented framework it will present a serious impediment to the design of quality programs. This is more critical now, as Object Oriented programming has become one of the most popular programming paradigms for system development (Andrews 1998). Languages such as C++ (Stroustrup 1983) and later Java (Gosling and McGilton 1996) currently have the attention of a large proportion of mainstream programmers (JobNet 2004). By 1998, just two years after Java was released, the language was being used in 40% of IT companies (Andrews 1998) and as a teaching language in 44% of universities (Hardgrave and Douglas 1998). Java has also found wide-scale acceptance in a number of real-world application domains. For example, Java's ability to act as an embedded language for small devices has been leveraged by Nokia for an API for mobile phones (Nokia 2003).
To see how easy it is to get into trouble using the low-level concurrency constructs in an Object Oriented language like Java, a simple example is now presented relating to the use of a synchronized block. For example, consider the code in Figure 1 below.
Figure 1. Erroneous Use of the Synchronized Keyword
The monitor lock reference in Java can be changed half way through a synchronized section of code. This problem is fundamentally caused by the fact that all non-base types in Java are represented by references rather than the object identity itself. The lock is used via a variable reference that can be arbitrarily changed. This means that the user may change what the reference is pointing to and thus replace the monitor at will. Consequently, code in Java can look safe and yet, when executing, be completely unprotected from multiple threads accessing it. Any successful language extension should solve this problem by hiding the lock from the user. In this way, the user cannot accidentally replace the lock at runtime. This problem is only one of the more direct examples; in other situations, problems can be extremely hard to locate and may manifest at runtime only infrequently.
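A minimal sketch of the kind of code Figure 1 describes (the class and field names here are illustrative assumptions, not the thesis's original figure):

```java
public class UnsafeCounter {
    private Object lock = new Object();
    private int count = 0;

    public void increment() {
        synchronized (lock) {
            // The monitor held here is the object that 'lock' referred to
            // on entry. Reassigning the field replaces the monitor, so a
            // second thread entering increment() synchronizes on the new
            // object and both threads can run this block concurrently.
            lock = new Object();
            count++;
        }
    }

    public int getCount() {
        return count;
    }
}
```

The code compiles and looks protected, yet once the reference is reassigned the mutual exclusion it appears to provide is gone.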
Object Oriented languages like Java (described in more detail from page 2-22) contain methods that change or report the state of an object or class of objects. These methods act as the message passing mechanism between the objects in the program space. This design lends itself to encapsulation of data and restricts data access between objects. According to the paradigm, no data should be shared between objects other than through the message passing methods. In Java's concurrency capabilities, however, there exists no explicit message passing mechanism for thread communication. Threads communicate via shared variables, with the synchronized keyword providing protection of data.
In this thesis, message passing in Java between threads via a novel mechanism of compound method signatures is introduced. That is, when all methods of a compound method signature have been called, the associated body of the compound method is executed, and parameters can be passed between the method fragments. The mechanism is introduced at the language level rather than as a library. Intra-process communication is achieved via multiple compound method signatures sharing parts of their signatures; the actual communication is achieved using parameters and return types. If multiple compound methods share some of the same method fragments, dynamic message passing channels are created. By implementing synchronization in this way, a thread message passing mechanism with very low cognitive overhead is achieved, working in almost exactly the same way as message passing between objects. This innovation also demonstrates how a formal method can co-exist with the Object Oriented paradigm and be adapted to a mainstream Object Oriented language.
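As a sketch of the idea (the syntax shown is indicative of the Join Java extension introduced in Chapter 3; the class and method names are illustrative):

```
final class Buffer {
    // The body runs only when both fragments, get() and put(), have
    // been called; the argument of put becomes the return value of
    // get, forming a message passing channel between calling threads.
    int get() & put(int x) {
        return x;
    }
}
```

A thread calling get() blocks until some other thread calls put(), at which point the value is transferred between them without any explicit lock appearing in the program text.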
The rest of the thesis is organized as follows. In Chapter 2, the literature is reviewed. In Chapter 3, existing semantics of Java are reviewed and then the syntax and semantics of Join Java is introduced. Problems in the current semantics of Java are also covered. In Chapter 4, the implementation of the syntax and semantics of the language is described. The focus of this chapter is the technical challenges that have been overcome in the implementation. In particular, Chapter 4 describes the pattern matcher that is at the heart of the runtime component of Join Java. Chapter 5 examines how design patterns in concurrency and synchronization can be expressed in Join Java. Chapter 6 applies Join Java to classical concurrency applications. It is shown how Join Java can express most of these problems in a simple compact form. Chapter 7 benchmarks the performance of Join Java and shows that further optimizations are possible to overcome the remaining bottlenecks in the performance. The impact of boxing and unboxing in Java is examined for its effect on the performance of the language extension. Chapter 8 provides conclusions and suggests further research.
The Join Java language introduces several features that do not have clear analogues in the original Join calculus formulation and other Join calculus-based systems. Imperative Object Oriented versions of the Join calculus such as Join Java have much in common with the “joint action” approach. The research contributions of this thesis are as follows.
1. Provision of high-level synchronization constructs into a mainstream language. These high-level constructs allow programmers to better model parallel behaviour in their programs without low-level synchronization mechanisms.
2. Provision of a safer intra-thread message passing mechanism via a guard-style method signature implementation. This high-level mechanism allows the dynamic creation of channels at runtime.
3. Improvement of thread integration in Java through a new return type that effectively adds asynchronous methods to Java. A side benefit of this is to allow parameters to be passed to threads at the time of their creation.
This dissertation explores how an expression of concurrency borrowed from process algebra can be incorporated into a mainstream Object Oriented programming language. The extension has the aim of raising the level of abstraction at which concurrent programs can be written.
Table of Contents
2.1 Introduction 2-10
2.2 Abstractions of Concurrency 2-11
2.3 Process Calculi and the Join Calculus 2-17
2.4 The Object Oriented Paradigm 2-22
2.5 Concurrent Object Oriented Languages 2-28
2.6 Related Work 2-44
2.7 Conclusion 2-47
In this chapter, the literature relevant to the expression of concurrency is reviewed. It is intended to mention all the key researchers, define the key concepts, and identify the open questions in the literature that the work of this thesis aims to address.
The survey is organized into five sections. In section one, concurrency as a general concept is covered: how the expression of concurrency has progressively become more abstract is reviewed, and some of the more popular abstractions associated with concurrency are presented. Process calculi are examined in section two for inspiration in deriving newer, higher-level concurrency constructs for programming languages; there, a detailed introduction to the Join calculus is given. In section three, an overview of the Object Oriented paradigm with its motivations and history is presented. In section four, the options for implementing concurrency in Object Oriented languages are examined: a number of categorizing attributes of concurrent object oriented languages are identified, and a number of real Object Oriented languages that claim to support concurrency are then categorized using those attributes. Finally, in section five, work closely related to this thesis is reviewed.
In this section, the fundamentals of the representation of the concepts of concurrency are examined. In the first part, what is meant by abstraction is described; a definition from (Kafura 1998) is adopted and applied to the domain of concurrency. In the second part, the basic principles that form the foundation for any concurrent system are examined. The third part looks at common concurrency language expressions, such as monitors, semaphores and shared variables, that are built upon these foundations. It presents a common hierarchy that has emerged from the literature that illustrates the levels of abstraction. It is also noted how results from the design patterns literature present a higher level of concurrency abstraction.
This thesis has adopted Kafura's definition of abstraction:
Abstraction is a design technique that focuses on the essential aspects of an entity and ignores or conceals less important or non-essential aspects. (Kafura 1998)
The key concept in abstraction is information hiding. Information hiding implies a layer at which some specific information is not visible and a layer below which that information is visible. This implies a strong relationship between abstraction and layering.
Kafura also sets out four general properties that make a good abstraction. Kafura firstly says that abstraction is concerned with attributes and behaviour. The attributes are the characteristics of the abstracted entity. Behaviours are the operations that the abstracted entity normally performs. The four properties of a good abstraction are:
1. The first property of a good abstraction is that the abstraction is well named. That is, if the abstraction has a good name, the users of the abstraction will have an implicit understanding of it. An example of this is a variable, where the name, for example counter, implies its function as an abstraction.
2. The second property is that an abstraction be coherent. An abstraction is coherent when the attributes and behaviours that comprise the abstraction are related and expected given the domain in which it operates. For example, the attribute of a counter is the integer that stores the count, and the behaviours of a counter are to increment and decrement that integer.
3. The third property of an abstraction is that it be minimal. The abstraction should provide the fewest behaviours and attributes needed. For example, the behaviour increment by two is not minimal for a counter. A counter which has a colour attribute is an example of an abstraction that has more attributes than it needs.
4. The fourth property is that it is complete. This means that the abstraction should have enough behaviours and attributes to model the entity that is being abstracted. The counter behaviours presented previously are not complete because the counter must also be able to be reinitialized to some reference value, typically zero.
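The counter used in the examples above can be sketched as a coherent, minimal and complete abstraction in Java (the class is illustrative, not taken from the thesis):

```java
// A counter abstraction: one attribute (the stored count) and only
// the behaviours the domain expects, including reinitialization,
// which makes the abstraction complete.
public class Counter {
    private int count = 0;                  // attribute: the stored integer

    public void increment() { count++; }    // behaviour: count up by one
    public void decrement() { count--; }    // behaviour: count down by one
    public void reset()     { count = 0; }  // completeness: reinitialize to zero
    public int  value()     { return count; }
}
```

Adding an increment-by-two behaviour or a colour attribute would violate minimality without making the abstraction any more complete.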
Kafura's general properties can also be used to evaluate the various abstractions of concurrency. A good abstraction of concurrency requires that the concurrency mechanism be well named. For example, mutex could be considered a poorly named abstraction, whereas the full expansion, mutual exclusion, would be more appropriate and considered a better abstraction by the first property. Co-routines are well named, as the name implies a similarity to subroutines but parallel in nature. Coherence implies that the attributes and behaviours of, say, semaphores (Dijkstra 1968), which have the two behaviours up and down, are related and expected based upon the name. The third property, minimality, can be seen in monitors, where the abstraction only describes an entity that has the behaviours wait and notify. The final property, completeness, implies that monitors are complete enough to represent synchronization.
It has been noted that concurrency relies on three primitive abstractions (Schneider and Andrews 1986): atomicity, basic communication, and synchronization. These primitive abstractions are the lowest level of abstraction of concurrency. For a concurrent system to be usable, it must implement these primitive abstractions in some form. Consequently, these abstractions are considered the foundation abstractions for concurrency.
All concurrent systems are composed of multiple actions that are active at the same time, communicating and synchronizing with each other. The most fundamental abstraction of concurrency that is of interest to computer scientists is the atomic action (Lomet 1977). Every concurrent system needs some form of atomicity: a point at which an action is indivisible. Historically, semaphores (Dijkstra 1968) were proposed before atomic actions; however, it is well established that semaphores are built on the idea of atomic actions. These abstractions of concurrency fit neatly into the Kafura properties presented earlier; that is, they are well named, coherent, minimal and complete. The abstraction of atomicity is well named as it implies indivisible action. The abstraction is coherent as its behaviour is what would be expected. The abstraction is certainly minimal, and it is complete.
For concurrency to be useful, there needs to be some form of communication between the various parallel actions in the system. If communication were not supported in a concurrent system, the only result from concurrent processes would be side effects such as printing information to the screen. The most basic form of communication is a variable in a shared address space (Schneider and Andrews 1983). Communication using such a shared variable can only occur correctly if the read and write operations on the variable are shown to be atomic. The abstraction is well named. It shows coherence, with its behaviours being related to reading and writing a shared variable. The abstraction is minimal in that it is concerned only with reading and writing a shared variable, and its behaviours and attributes could not be reduced any further. The abstraction is complete, as all the behaviours necessary for the abstraction are present.
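The reliance of shared-variable communication on atomicity can be illustrated in Java (a sketch; AtomicInteger stands in for any atomic read-modify-write operation, and the class name is an assumption):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Two threads communicate through a single shared variable. Because
// incrementAndGet() is an atomic action, no update is lost and the
// final value is deterministic.
public class SharedVariableDemo {
    public static int run() throws InterruptedException {
        AtomicInteger shared = new AtomicInteger(0);
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                shared.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread a = new Thread(writer);
        Thread b = new Thread(writer);
        a.start();
        b.start();
        a.join();
        b.join();
        return shared.get();
    }
}
```

Replacing incrementAndGet() with a separate, non-atomic read and write would allow the two threads' updates to interleave and lose increments, which is exactly the failure the atomicity abstraction rules out.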
Synchronization is about the ordering of atomic events. Synchronization can thus also be defined as a mechanism for controlling access to a shared resource (Bloom 1979). For basic communication to be reliable, the abstract concept of atomicity is also necessary. Synchronization is how mutual exclusion of access to the shared variable in the previous example is guaranteed. Mutual exclusion means that only one parallel process can access a shared variable at one time, thus eliminating the problem of processes reading data structures that are currently being altered. Synchronization requires atomicity to work and assists communication to work reliably. Again, this abstraction obeys Kafura's basic properties: it is well named; it is coherent in that the behaviours of synchronization ensure that, given a specific order of calls, the result will always be correct; it is minimal, as the abstraction restricts itself to representing the ordering of operations (the behaviour); and it is complete, as it deals entirely with the ordering of atomic actions.
This layer of abstraction is illustrated in Figure 2, where it is represented as the foundation of a hierarchy of abstraction. This hierarchy will be built upon, showing how each level of abstraction requires the lower levels to function but gives users mechanisms for achieving concurrency and synchronization that are easier to understand and utilize, whilst hiding the lower-level details.
Figure 2. Foundation of Concurrency Abstraction
A number of concurrency abstractions have become popular in programming languages. These abstractions are somewhat higher in abstraction than atomicity, synchronization, and basic communications. However, if examined closely, it can be seen that they make use of these abstractions to function. For example, semaphores make use of atomic check and set operations; without atomicity of these calls, semaphores would not work correctly. Similarly, monitors express a higher-level abstraction of concurrency. Monitors make use of synchronization, atomicity and communications to function correctly. Consequently, this is considered a higher level of abstraction than that of atomicity, synchronization, and basic communications. One could look at monitors as an abstraction that hides the locks and queues associated with entry and exit to protected sections of code. The locks act as synchronization mechanisms and the queues control the ordering of threads accessing the protected code.
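In Java, the monitor abstraction hides exactly those locks and queues: synchronized conceals the lock, while wait and notifyAll conceal the thread queues. An illustrative single-slot buffer (the class name and structure are assumptions, not drawn from the thesis):

```java
// A single-slot buffer built on Java's monitor abstraction. The
// monitor's lock and thread queues never appear in the source text.
public class Slot {
    private Integer value = null;   // null means the slot is empty

    public synchronized void put(int v) throws InterruptedException {
        while (value != null) {
            wait();                 // queue until the slot is emptied
        }
        value = v;
        notifyAll();                // wake threads waiting in get()
    }

    public synchronized int get() throws InterruptedException {
        while (value == null) {
            wait();                 // queue until a value is put
        }
        int v = value;
        value = null;
        notifyAll();                // wake threads waiting in put()
        return v;
    }
}
```

The while-loop guards are the standard idiom for monitors: a woken thread must re-check its condition, since another thread may have consumed the value first.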
It is clear that these two abstractions of concurrency can be expressed in terms of each other's implementations; consequently, they are placed side-by-side in our hierarchy. Figure 3 shows another layer being added, called intermediate-level abstractions, that reflects this increase in expressiveness via the hiding of lower-level abstractions.
Figure 3. Extended Concurrency Abstraction Hierarchy
From the applications development point of view, concurrency problems can also be partitioned into design patterns. That is, most concurrency problems can be implemented using one or more generic design patterns proposed by a number of authors (Lea 1998; Schmidt 2000). Programmers make use of these patterns to solve their specific domain problems. However, these design patterns tend to be expressed in a high-level way, which leads to difficulties if the language being used only provides low-level abstractions, as is the case with most concurrency mechanisms. Consequently, one of the most difficult aspects of programming with patterns is the translation from the general implementation supplied by the pattern designer to the specifics of the language and the problem. This difficulty could be reduced if there were a higher-level abstraction that better matched the generalized way design patterns are expressed. This missing layer between higher-level abstractions and design patterns can be seen in Figure 4. In this thesis, it is suggested that an extension to the Java language based on the Join calculus is one possible intermediate layer that more closely matches a number of design patterns.
Figure 4. Complete Concurrency Abstraction Hierarchy
2.3 Process Calculi and the Join Calculus
In the first part of this section, popular process algebras are briefly introduced, giving some background on their design motivations. The Join calculus (Fournet, Gonthier et al. 1996) is then introduced in more detail, together with its major terminology. Finally, Join is contrasted with the popular process calculi, showing that it has features that are more suitable for implementation within the Object Oriented paradigm.
The term “process calculi” denotes a group of formal description languages that are designed to describe concurrent communicating systems (Fidge 1994). These description languages (also known as process algebras) have been defined as:
An algebraic approach to the study of concurrent processes. Its tools are algebraical languages for the specification of processes and the formulation of statements about them, together with calculi for the verification of these statements. (Glabbeek 1986)
In this thesis three process calculi are mentioned: the Calculus of Communicating Systems, CCS (Milner 1980); Communicating Sequential Processes, CSP (Brookes, Hoare et al. 1984; Hoare 1985); and the Algebra of Communicating Processes, ACP (Bergstra and Klop 1985). The CSP process algebra was initially inspired by Dijkstra's guarded command language (Dijkstra and Scholten 1990). CCS is designed in a similar handshake communication style to that of CSP. Whilst CSP is directed more at a theoretical basis and tends to be a more specification-based language, CCS is more operationally oriented (Baeten and Verhoef 1995). ACP, a more recent work, differs from CSP and CCS in that it is more algebraic in its approach to modelling concurrency. These process algebras embody the intermediate to higher-level abstractions of our definition from the previous section. They are frameworks for modelling the behaviour of interacting processes rather than programming them. For a full comparative introduction to the differences between CSP and CCS, consult (Fidge 1994). Virtually all process algebras have the aim of modelling the interaction of concurrent systems, with syntax and semantics varying between them. This research is more interested in the models of concurrency that lie below the level of the process calculi, that is, the syntax and semantics of the concurrency and synchronization being implemented, rather than the modelling of behaviour. Consequently, an in-depth discussion of process algebras is beyond the scope of this work. However, the point that should be made is that CCS, CSP and ACP have implied synchronization rather than explicit synchronization. This contrasts with the Join calculus presented in the next section.
In this section, a detailed outline of the Join calculus is given. This section starts by giving a brief overview of the calculus and some of the terms that will be used for the remainder of the discussion.
The Join calculus can be thought of as both a name-passing polyadic calculus and a core language for concurrent and distributed programming (Maranget, Fessant et al. 1998). The calculus's operational semantics can be specified as a reflexive chemical abstract machine (CHAM) (Berry and Boudol 1992; Maranget, Fessant et al. 1998). Using these semantics, the state of the system is represented as a "chemical soup" that consists of active definitions and running processes (Maranget, Fessant et al. 1998). Potential reactions are defined by a set of reduction rules; when a reaction occurs, the reactants are removed from the soup and replaced with the resultant definitions. The syntax and semantics of the Join calculus are defined in the next section.
The Join calculus (Fournet and Gonthier 1996) contains both processes and expressions called Join patterns. Processes are asynchronous constructs and they produce no result. Expressions are synchronous and produce a result. Processes communicate by sending messages through communication channels. These communications channels can be described using Join patterns. A Join pattern definition has two or more left hand terms and a right hand side. The expression will not reduce until there are calls to all left hand terms. A body itself may contain another expression that may include a parallel composition of two or more other expressions.
The syntax of the Funnel (Odersky 2000) language is used in preference to the syntax originally defined by (Fournet and Gonthier 1996). This syntax is closer to that of the target domain Object Oriented languages, in our case Java, as its syntactic components are consistent with other parts of the Java language specification. The Funnel variant (Odersky 2000) syntax of the Join calculus is presented in Figure 5.
Figure 5. Funnel Variant of Join Calculus Syntax
For example, given the Join calculus expression presented in Figure 6, it can be seen that there are two expressions on the left hand side, X1(Y1) and X2(Y2), and a right hand side with a parallel composition of xa(y1) and xb(y1). This means that when the expressions X1(Y1) and X2(Y2) are available, the body of the definition is evaluated, in this case the parallel composition of the xa(y1) and xb(y1) expressions.
Figure 6. Example Join Calculus Expression
In the previous section, it was shown that the body of a definition is only evaluated when all expressions on its left hand side are available. This behaviour can be modelled using the CHAM (chemical abstract machine) semantics proposed by (Berry and Boudol 1992). The chemical abstract machine semantics are modelled on the idea of a soup of waiting chemicals that can react to generate new chemicals while removing the old chemicals from the soup. As a brief example of these semantics, consider a machine with one possible reduction def A & B ▷ C & D, and assume that the soup already contains the left hand side expression A. Consequently, the state of the CHAM is as shown in Figure 7. If the expression B is then added to the soup, the soup will contain all the terms on the left hand side of the reduction rule. The terms will then react, generating all the terms on the right hand side of the reduction rule and removing the terms on its left hand side from the soup. This is shown in Figure 8, where the addition of term B creates the new terms C and D in the soup (shown with stars in the figure) and the terms A and B are removed.
Figure 7. Addition of B to CHAM
Join’s dynamic nature emerges when there is the possibility for multiple reductions to occur, depending on the combination of Join expressions that appear. If the definition above is augmented with an additional definition def A & Z ▷ X, then on a call to A the choice of which definition to evaluate depends on whether Z or B was previously called. If Z is waiting (called previously) in the CHAM, the second definition would be evaluated: A and Z would be removed from the CHAM and replaced by X. Alternatively, if B was waiting in the CHAM, the first definition would be evaluated, removing A and B from the soup and replacing them with C and D. A further situation can occur: what happens if Z and B are both waiting in the CHAM and A is then called? In this situation a non-deterministic reduction is possible, in that either the first definition or the second definition could equally be reduced, but not both. The calculus does not specify what occurs in this case.
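The reduction behaviour just described can be made concrete with a small simulation. The following sketch is our own illustration, not thesis code: the class and method names are invented, the soup is a simple list of term names, and the unspecified non-deterministic choice of the calculus is resolved here by definition order.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative CHAM sketch: terms are names, and a rule consumes its
// left hand side terms and emits its right hand side terms.
public class Cham {
    static class Rule {
        final List<String> lhs, rhs;
        Rule(List<String> lhs, List<String> rhs) { this.lhs = lhs; this.rhs = rhs; }
    }

    final List<String> soup = new ArrayList<>();
    final List<Rule> rules = new ArrayList<>();

    void add(String term) {
        soup.add(term);
        reduce();
    }

    // Fire the first rule whose left hand side terms are all waiting.
    // The calculus leaves the choice among matching rules unspecified;
    // this sketch simply takes them in definition order.
    void reduce() {
        for (Rule r : rules) {
            if (soup.containsAll(r.lhs)) {      // simplified: assumes distinct term names
                for (String t : r.lhs) soup.remove(t);
                soup.addAll(r.rhs);
                return;
            }
        }
    }

    public static List<String> demo() {
        Cham c = new Cham();
        c.rules.add(new Rule(List.of("A", "B"), List.of("C", "D")));
        c.rules.add(new Rule(List.of("A", "Z"), List.of("X")));
        c.add("Z");
        c.add("A");     // only A and Z are waiting, so the second rule fires
        return c.soup;  // [X]
    }
}
```

With the two rules def A & B ▷ C & D and def A & Z ▷ X installed, adding Z and then A leaves only X in the soup, because only the second rule's left hand side is fully present.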
Figure 8. Operational Semantics of the Join Calculus
These semantics, combined with parameter passing, allow parameters to be passed to the body of the reduction. If return values were augmented with blocking semantics, parameters could consequently carry data into a reduction and back out of it. With these two additions to the syntax and semantics, message passing is supported. This is a core component of the Object Oriented paradigm.
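The combination of parameter passing with blocking return values can be illustrated in plain Java. The sketch below is our own illustration, not thesis code (the class and method names are invented): two rendezvous queues give a synchronous channel in which an argument flows in, the caller blocks, and a result flows back, which is exactly the call-and-return shape of Object Oriented message passing.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.function.IntUnaryOperator;

// Illustrative blocking "synchronous channel": arguments flow in,
// a result flows back, and the caller blocks in between.
public class SyncChannel {
    private final SynchronousQueue<Integer> args = new SynchronousQueue<>();
    private final SynchronousQueue<Integer> results = new SynchronousQueue<>();

    // Caller side: blocks until a server thread has consumed the
    // argument and produced a reply.
    public int call(int x) throws InterruptedException {
        args.put(x);
        return results.take();
    }

    // Server side: blocks for one argument, then replies with f(x).
    public void serveOnce(IntUnaryOperator f) throws InterruptedException {
        results.put(f.applyAsInt(args.take()));
    }

    public static int demo() {
        try {
            SyncChannel ch = new SyncChannel();
            Thread server = new Thread(() -> {
                try { ch.serveOnce(x -> x * 2); } catch (InterruptedException ignored) { }
            });
            server.start();
            int r = ch.call(21);   // blocks here until the server replies
            server.join();
            return r;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```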
The Join calculus gives an ideal alternative to the currently popular concurrency semantics in programming languages via its locality and reflection. Locality allows reductions to be limited to only one site in the calculus rather than all sites. Reflection allows possible reductions to be defined implicitly as combinations of existing reduction components. Fessant also points out an advantage of the Join calculus over other calculi:
Processes communicate by sending messages on channels or port names. Messages carried by channels are made of zero or more values and channels are values themselves. By contrast, with other process calculi (such as the pi-calculus and its derived programming language Pict), channels, and the processes that listen on them are defined by a single language construct. This feature allows considering (and implementing) channels as functions, when channels are used as functions. (Fessant and Conchon 1998)
Odersky (Odersky, Zenger et al. 1999) argues that the Join calculus better expresses the idea that most modern applications are reactive in their interfaces and concurrent in their implementation. The pi-calculus (Milner, Parrow et al. 1992) and the fusion calculus (Parrow and Victor 1998), whilst expressing interaction between entities well, do not represent sequential expressions sufficiently. His paper further points out that complicated type systems are needed to address these problems.
In this section, the Join calculus has been identified as a promising candidate for the expression of concurrency that would be suitable for incorporation into a mainstream Object Oriented language. It has been shown that Join's high-level abstraction is based on the conjunction of atomic actions and expresses synchronization explicitly, in a way that is not likely to breach the modularity required by the Object Oriented paradigm. This strongly motivates the choice of Join as the basis for a Java language concurrency extension.
2.4 The Object Oriented Paradigm
In this section, what defines the Object Oriented paradigm is examined. The review particularly emphasizes which Object Oriented aspects might interact with concurrency. Some of the basic concepts of the paradigm are described, and it is shown how the Object Oriented paradigm has information hiding as one of its central concepts. The requirements for a programming language to be Object Oriented are examined, along with the advantages Object Oriented languages offer. It is shown how Object Oriented languages provide a framework in which the only information an object shares with another object is the information passed via a message passing mechanism such as method calls. The idea of message passing in the language, and how encapsulation requires access to member variables in classes to be limited, is scrutinized. Finally, the family tree of Object Oriented languages is examined. For a comprehensive survey of Object Oriented languages, consult (Sutherland 1999).
Object Oriented languages were first proposed by (Dahl and Nygaard 1966) as a way of representing real world concepts. The primary Object Oriented language structure is the class, an augmented record that also contains the operations that work on the data contained within that record. Instances of classes, known as objects, can be created. Each object's code operates on the data within that instance. In this way, the code is associated with specific data, rather than, as is the case with non Object Oriented languages, the code being generic and data being passed to it.
Before Object Oriented languages, most software analysis and design focused on process. Process designs were either data centric or control centric. This approach tended to become problematic and was limited to small and medium sized programming projects. These limitations and scalability issues became known as the software crisis (Dijkstra 1972; Booch 1987). At this point software engineering was more an art, with the difference between good and bad projects being mainly due to the skill of the team. In an attempt to improve this situation, a design methodology was pursued that would avoid the scalability issues of the process model techniques. Developers needed to bring reuse and a closer representation of the real world to software engineering. This closely matched the idea of the Object Oriented paradigm proposed by the Simula language (Dahl and Nygaard 1966). The Object Oriented paradigm offered abstraction via “class hierarchies” that allowed control of cohesion between components in the system. This was a necessary requirement for the increased complexity of the projects being developed.
An important concept of software engineering is the idea of high cohesion and low coupling (Stevens, Myers et al. 1982). A highly cohesive module requires outside modules as little as possible. A loosely coupled module communicates with other modules in the system as little as possible. The two principles of high cohesion and low coupling work together to decrease the complexity of software, hence reducing potentially hard-to-localise errors. In the Object Oriented paradigm, encapsulation is the method by which these two principles are achieved. Encapsulation hides attributes from other objects (modules) via modifiers, which allow the programmer to set the visibility of methods and attributes. In Object Oriented languages, it is customary to restrict the visibility of internal methods and attributes so that they cannot be seen outside the object in question. In this way, sources of high coupling, such as arbitrary objects accessing any attribute or method they wish, are avoided; if this were allowed, the result would be high coupling and hence unnecessary complexity. This reduction in accessibility leads to the requirement of a more formal mechanism of object communication, called message passing. This starkly contrasts with the majority of concurrency abstraction implementations, which are highly coupled and have low cohesion. Concurrency abstractions tend to be non-localized, side-effect style code where concurrency calls can be made anywhere in the code to anywhere else in the code. An example of this is semaphores, where up and down calls can be made anywhere. Monitors correct some of these deficiencies of semaphores. However, monitors have been shown to be a problem (Ramnath and Dathan 2003) in that they increase the coupling of more complicated patterns. Within an Object Oriented program, no object should be able to directly manipulate another object's attributes, and hence its state.
The way that one object seeks to change the state of another object is via mutator methods. These methods' explicit purpose is to allow another object to request a change of an object's internal state. If an object needs the contents of another object, this is done via an accessor method. In this way, an object can protect itself against being placed in states that it should not occupy. This reduction in coupling reduces the complexity of the interaction between different modules of the system, hence increasing the scalability of software development. Any concurrency mechanism implemented in an Object Oriented language should support this implicitly.
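A minimal Java sketch of this accessor/mutator discipline follows; the class and its invariant are illustrative examples, not drawn from the thesis.

```java
// Illustrative example of encapsulation: the state is private, so other
// objects can change it only through the mutator, which can refuse
// states the object should never be placed in.
public class Account {
    private int balance;            // hidden state: no direct outside access

    public int getBalance() {       // accessor
        return balance;
    }

    public void deposit(int amount) {   // mutator guards the invariant balance >= 0
        if (amount < 0) {
            throw new IllegalArgumentException("negative deposit");
        }
        balance += amount;
    }
}
```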
2.4.1 Advantages of the Object Oriented Paradigms
The advantages of the Object Oriented paradigm can thus be summarized as:
1. Data Abstraction . This concept allows complicated representations to be modelled in the language whilst still maintaining readability. The Object Oriented paradigm achieves this via inheritance of classes: generic concepts are represented high in the inheritance structure, whilst more specific concepts are added progressively as one descends through the inheritance tree.
2. Encapsulation . Related data and methods are bundled into neat packages that provide a form of modularity. Encapsulation also supports information hiding via interfaces (the definitions of the methods within a class). In this way, the user of a class does not need to know its implementation details. In fact, the implementation can be arbitrarily changed at any time without the user of the class even knowing.
3. Polymorphism . Quite often it is useful to group similar objects of different classes so they can be treated in the same way. For example, in a drawing program there may be a representative class for each type of drawing widget, such as circles and spheres. It would be useful to refer to all these widgets simply as shapes and write methods that use shapes. In this way, new types of widgets can be added without having to write new methods for each. This supports code reuse, leading to more scalable engineering of software systems.
4. Reusability . The encapsulation facilities of Object Oriented languages provide a degree of reusability. Encapsulating data structures and attributes with methods behind well-defined interfaces hides the implementation details from the user. This, coupled with high cohesion and low coupling, leads to code that can be reused in different projects. Polymorphism provides further code reuse: rather than writing a different method for each type of class, a carefully designed inheritance structure provides a generic class with generic methods. The methods are written against the generic class signatures, and any subclassed object can then be passed as the generic type.
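Points 3 and 4 above can be illustrated with a short Java sketch of the drawing-program example; the names are illustrative, not taken from the thesis.

```java
import java.util.List;

// Illustrative polymorphism/reuse example: client code written against
// the generic Shape type works unchanged for any subclass, so new
// widgets can be added without writing new client methods.
public class Shapes {
    interface Shape { double area(); }

    static class Circle implements Shape {
        final double r;
        Circle(double r) { this.r = r; }
        public double area() { return Math.PI * r * r; }
    }

    static class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    // Generic method: written once, reused for every present or future Shape.
    static double totalArea(List<? extends Shape> shapes) {
        double sum = 0;
        for (Shape s : shapes) sum += s.area();
        return sum;
    }

    public static double demo() {
        return totalArea(List.of(new Circle(1), new Square(2)));  // PI + 4
    }
}
```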
2.4.2 History of Object Oriented Languages
With so many Object Oriented languages available, it is useful to see how the major languages relate to each other. Object Oriented languages have developed a long way since the introduction of the Simula language. A number of external sources have influenced the development of languages, culminating in recent languages such as Java (Gosling and McGilton 1996), C++ (Stroustrup 1983) and Sather (Lim and Stolcke 1991). Figure 9 shows a tree illustrating the development of Object Oriented languages. The tree shows the main languages in relation to the non-Object Oriented languages they extended and how they relate to each other. This tree is by no means complete, but shows some of the highlights. The Simula languages are the founding languages for Object Oriented concepts. Simula could also be regarded as the first language to introduce true concurrent behaviour. One might regard Simula as the watershed that spawned most of the modern languages that have gained widespread acceptance today. It is interesting to reflect that most Object Oriented languages since the original Simula language have had some form of concurrency. However, the most popular Object Oriented language (C++) omitted concurrency support and supplied only the most low-level of primitives for synchronization. Java, on the other hand, supplies a concurrency class, Thread. However, even this class is an addition to the language, not integrated into the language syntax.
Figure 9. The Family Tree of Object Oriented Languages
It can be seen from the preceding sections that Object Oriented languages leverage the ideas of information hiding, encapsulation, inheritance, and message passing to achieve the aim of building a scalable paradigm that will support the continually increasing size of programming projects. The language design is elegant in data structure composition, with control structures being subjugated to the data rather than data being supplied to the control structures. Communication is also more elegant, with the concept that communication should only be achieved through accessor/mutator methods. This section has shown that in Object Oriented languages it is a design aim to restrict inter-object communication as much as possible, preventing objects from accessing other objects' or ancestors' attributes. For consistency, communication should only be allowed via a message-passing construct. This ideal can be intuitively extended to threading: inter-thread communication should be restricted as much as possible, with threads communicating only through safe message passing constructs rather than arbitrarily accessing shared variables. Unfortunately, concurrency is generally not well implemented in Object Oriented languages; it tends to be minimally integrated, with heavy dependence on external libraries.
In the next section, the concepts of the fusion of Object Oriented languages and concurrency will be introduced. An overview of some Object Oriented languages is presented along with how concurrent Object Oriented languages can be categorized.
In this section, existing approaches to the expression of concurrency in a selection of Object Oriented languages are surveyed. The selection includes some research languages in addition to three mainstream Object Oriented languages: C++, Java, and Ada. This section starts by suggesting a number of categorizing attributes by which concurrent Object Oriented languages can be differentiated. The categorization of actor based languages versus non-actor based languages is examined first. Following this, a categorization of how closely concurrency is integrated into the language is shown. The final categorization is that of mainstream versus non-mainstream languages. A number of concurrent Object Oriented languages that have already been proposed are reviewed, along with the definition of what constitutes a mainstream Object Oriented programming language and how these mainstream languages have handled the issues relating to the expression of concurrency. Finally, a selection of concurrent Object Oriented languages is reviewed with emphasis on the classification of concurrency. The outcome of this survey is the realization that in most mainstream languages there has been little attempt to encapsulate threads or integrate concurrency and, as a result, many of the Object Oriented specific problems that researchers have identified are amplified.
Object Oriented designs do not necessarily make concurrent programming easier. A poorly designed concurrent Object Oriented program can easily obscure the behaviour of threads running in parallel. Unlike processes in operating systems, which are protected from one another by memory management (other than those explicitly given full privileges), Java, for example, uses a type system to protect users from executing unsafe operations. However, the Java type system does not protect the user from concurrent access to shared variables. For example, programming a thread pool using only monitors can be a non-trivial task. The programmer needs to pass references to jobs (usually objects implementing the Runnable interface) to a job dispatcher. The job dispatcher, via some central registry of workers, finds which threads are idle and signals a thread that a job is waiting. The worker thread then collects the job, runs it, and returns the outcome via some shared variable. Implementations of this pattern can be quite difficult, with shared data being vulnerable to corruption through double updates if the author is not careful to fully protect it.
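The monitor-only job dispatcher described above can be sketched in Java to show where the difficulty lies. This is a deliberately minimal illustration, not thesis code: every interaction with the shared job queue must be manually guarded with synchronized, wait and notifyAll, and nothing prevents a job itself from touching unprotected shared state.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative monitor-only thread pool: the shared queue is the one
// place workers and submitters meet, and every access must be guarded.
public class MonitorPool {
    private final Deque<Runnable> jobs = new ArrayDeque<>();
    private boolean stopping = false;

    public synchronized void submit(Runnable job) {
        jobs.addLast(job);
        notifyAll();                  // signal an idle worker that a job is waiting
    }

    public synchronized void shutdown() {
        stopping = true;
        notifyAll();
    }

    private synchronized Runnable take() throws InterruptedException {
        while (jobs.isEmpty() && !stopping) {
            wait();                   // idle worker parks on the monitor
        }
        return jobs.isEmpty() ? null : jobs.removeFirst();
    }

    public Thread startWorker() {
        Thread t = new Thread(() -> {
            try {
                Runnable job;
                while ((job = take()) != null) {
                    job.run();        // any shared state the job touches still
                }                     // needs its own, separate protection
            } catch (InterruptedException ignored) { }
        });
        t.start();
        return t;
    }

    public static int demo() {
        MonitorPool p = new MonitorPool();
        AtomicInteger done = new AtomicInteger();
        Thread w = p.startWorker();
        for (int i = 0; i < 5; i++) p.submit(done::incrementAndGet);
        p.shutdown();                 // drain remaining jobs, then stop
        try { w.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return done.get();
    }
}
```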
There are a great number of Object Oriented languages available today. Languages such as Java (Steele, Gosling et al. 1995), Ada (Wellings, Johnson et al. 2000) and C++ with PThreads (Stroustrup 1983) are popular languages. These languages have varying levels of support for concurrency/interaction, ranging from difficult low-level external library support, such as C++, to low-level but usable concurrency, as is the case with Java and Ada.
A major feature that defines the Object Oriented paradigm is the idea of class inheritance: a class takes the attributes and methods of another class in addition to those it defines itself. This allows a generalized class to be written, with more specific classes then written to express more specific domains. With respect to concurrency, inheritance causes problems of expression. These so-called “inheritance anomalies” (Matsuoka, Wakita et al. 1990) have been studied quite carefully, and it has been shown that there is no completely satisfactory solution (Crnogorac, Rao et al. 1998). This thesis highlights how low-level synchronization expressions that are poorly integrated into an Object Oriented language require the synchronization to be placed in the body of methods. This approach of placing constraints within the body is an enabling mechanism for the creation of inheritance anomalies (Pons 2002).
Thus, it is concluded that there is a definite argument for investigating how high-level concurrency can be added to mainstream languages. Such language extensions will assist with encapsulation and integrate more closely with the language. With this alternative approach, it may also be possible to reduce the number of inheritance anomalies by moving constraint information to the inheritable part of the class rather than the method body. The inheritance anomaly occurs where redefinitions of methods are required to maintain the integrity of concurrent objects (Matsuoka, Wakita et al. 1990). When method names and signatures are known and method bodies are abstract, the signature of a parent class does not usually convey any information regarding its synchronization behaviour; this behaviour is usually defined within the body (Fournet, Laneve et al. 2000). With this restriction, in most Object Oriented languages an inheritance anomaly can exist if a subclass method cannot be added without some form of modification of the superclass methods. This is of course the reverse of the design methodology of Object Oriented languages, where subclasses modify the behaviour of the superclass by overriding the superclass methods. If the superclass has to be rewritten for the subclass, the design paradigm breaks down. Much of the research into concurrent Object Oriented languages has been carried out in an attempt to reduce the impact of the inheritance anomaly. (Crnogorac, Rao et al. 1998) shows that none of these languages actually fully solves the problem, as there appears to be a fundamental incompatibility between the concurrency and Object Oriented paradigms. For this reason, most research is aimed at minimizing the problem rather than eliminating it. This thesis does not aim to solve the inheritance anomaly, but tries to contribute to reducing its likelihood of occurrence.
In the following sections, three categorizations of concurrent Object Oriented languages are introduced: actor categorization (Section 2.5.1), implementation categorization (Section 2.5.2) and mainstream categorization (Section 2.5.3). A number of languages are then classified using these categorizations in Section 2.5.4. Section 2.5.5 shows that a gap exists in the field of concurrent Object Oriented languages that Join Java intends to fill.
2.5.1 Actor versus Non-Actor Based Concurrent Languages
One categorization of concurrent languages is how parallelism and synchronization are integrated within the language.
1. Actor based languages (Agha 1986). In these languages, every object is also a thread, and objects and threads are tightly coupled. The first actor based language was Plasma (Hewitt 1976), originally named Planner 73 (a non-Object Oriented language based on Lisp). Active objects imply that all objects are also threads; a true active object language makes the thread active the moment the object is created. Following Plasma there have been a number of implementations of active object languages. Plasma II (Salles 1984) evolved later with support for parallel interpreters. Still later, Alog (Salles 1984) added extensions for logic. Alog evolved into Smart (Salles 1989), which added support for distributed interpreters. Figure 10 gives a brief description of the evolution of active object languages from Plasma up to more recent languages.
2. Non-actor based languages. In these languages, threads are separate from objects. In this way, several threads may be active within a single object at the same time. This mechanism allows objects that do not need to be active to have low overhead. In many implementations, the language has specialized libraries that allow code to run in parallel. An example of this category is Java (Gosling and McGilton 1996). In Java, users who wish to produce their own thread class achieve this via calls to the library or by overriding the run method of the Thread class. When the first Object Oriented language was initially proposed, it supported concurrency via co-routines, conceptually quite similar to sub-routines in older style imperative languages. This idea of separating the objects from the threads is a passive object approach to concurrent Object Oriented languages. The disadvantage of this approach is that the programmer has to explicitly create threads when they deem it necessary. There is a cognitive separation of threads and objects in the language. This leads to cognitive overheads as the programmer tries to define where threading is and is not required, and then designs a mechanism for communication between the threads.
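The two library routes to a thread in Java mentioned above, subclassing Thread and overriding run(), or handing a Runnable to a Thread, can be sketched as follows (illustrative code, not from the thesis):

```java
// Illustrative sketch of the two standard library routes to a thread in Java.
public class ThreadDemo {
    // StringBuffer rather than StringBuilder: its appends are synchronized,
    // so the two threads can safely share it.
    static final StringBuffer log = new StringBuffer();

    static class Worker extends Thread {                   // route 1: subclass Thread
        @Override public void run() { log.append("worker;"); }
    }

    public static String demo() {
        Thread t1 = new Worker();
        Thread t2 = new Thread(() -> log.append("runnable;"));  // route 2: Runnable
        t1.start();
        t2.start();
        try {
            t1.join();                                     // wait for both to finish
            t2.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return log.toString();   // both entries present; order is scheduler-dependent
    }
}
```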
Actor based languages, whilst not necessarily being Object Oriented, share a great number of similarities with the Object Oriented paradigm. The term active object language will be used when referring to actor based languages that are within the Object Oriented paradigm. In the following sections, this thesis is most interested in the active object variety of actor based languages. However, this section will cover the most important languages in the entire set of actor based languages for completeness.
Figure 10. Evolution of Actor Based Languages
2.5.2 Integrating Concurrency into Object Oriented Languages
In concurrent Object Oriented languages, there are several ways of integrating concurrency into the language. These approaches have a direct effect on the ease of use and safety of concurrency within the language.
1. Library approaches. Concurrency and synchronization features are provided via a predefined set of libraries. This approach is taken by JCSP (Welch 1999) and CSP for Java (Hilderink, Broenink et al. 1997), both of which retrofit CSP semantics to Java via library calls. These types of extensions are normally quicker to develop, but lack some of the advantages of language level integration. For example, errors in the use of libraries cannot be detected by the compiler, leading to long debug cycles for application development.
2. Superset . An existing language is extended with syntax and semantics to support concurrency. For example, Timber (Black, Carlsson et al. 2001) adds concurrency and imperative Object Oriented semantics to Haskell (Jones, Hughes et al. 1998; Jones 2003). These implementations tend to gain wider acceptance, as they leverage existing user knowledge to allow for quicker understanding of concepts. However, they are usually slower than the base languages, as the added features may break the optimizations in the base language's compiler.
3. Design a completely new language. A new language is created with semantics of interaction and concurrency embedded into the language. An example is Funnel (Odersky 2000), which was specially designed to illustrate a small set of features including concurrency. This is the ideal implementation solution, as no compromises need to be made in implementation. Such implementations have the disadvantage that very few programmers will understand or be willing to adopt the language, making it harder for the language to be accepted as mainstream.
2.5.3 Mainstream versus Non-Mainstream Languages
For the purposes of this thesis, mainstream languages are defined as any language used by a reasonable proportion of non-academic users; that is, those languages used by a significant proportion of industry programmers. For example, C++, Ada and Java would be regarded as mainstream languages due to their widespread use.
2.5.4 Implementing Concurrent Object Oriented Languages and Extensions
Concurrency has long been recognized as necessary in any modern programming language. As far back as Simula-67 and Algol-68, threads and simplistic synchronization primitives have been made available in languages. Later languages like OCCAM (Inmos 1984) implemented improvements such as monitors (Hoare 1974) and CSP style constructs (Hoare 1980). Java has made concurrent programming using threads widely available to mainstream programmers. However, this situation has only re-emphasized what many experienced concurrent programmers already knew: concurrent programming is inherently difficult to get right. In the following paragraphs, some of the more well known approaches to implementing concurrency in Object Oriented languages are covered.
MuC++ (Buhr 1992) was designed to give programmers access to the underlying system data structures. It implemented low-level operations such as real-time monitors, timeouts, dynamic-priority scheduling and basic priority inheritance. It uses a translator and runtime kernel that supports concurrency using a shared memory model. The language modifies the C++ syntax but does not modify the compiler: a pre-processor converts the program into a normal C++ language file. When a compilation error occurs, the compiler gives error messages in terms of the translated code, not the language the programmer is writing in.
ACT++ (Kafura, Mukherji et al. 1993) is one of a large number of languages based upon the idea of actors (Agha 1986). It is implemented using a library rather than a language extension. Again, this language is based on the C++ language; however, it does not modify the syntax.
Synchronization rings (Holmes 1995) are proposed as a language independent model of synchronization. More specifically, synchronization rings are a library of behaviours that support the composition of concurrent behaviours. This language independent model has been implemented in two languages, C++ and Java.
The Tao (Mitchell 1995) model (and the languages later implementing it) aims to overcome some of the problems involved in the inheritance anomaly. The prototype language is implemented as a language extension: a pre-processor passes the program through a translator that generates C++ code. The language achieves its aim via a novel approach to inheritance anomaly avoidance.
A number of implementations of CSP (Reppy 1992) have been created for Object Oriented languages (Demaine 1997; Hilderink, Broenink et al. 1997; Welch 1999; Bakkers, Hilderink et al. 1999). Most of these CSP style extensions are in the form of libraries added to the language. (Demaine 1997)'s work on Java communicating sequential processes is a good example. In his extension, events are regarded as first class objects that are treated as normal objects in Java. The extension (a Java package) uses a number of primitives for sending and receiving these events. Demaine also thought it important to supply an alt, or non-deterministic choice, primitive operator as well. The language extension is based upon the work of (Reppy 1992), which in turn was based upon (Hoare 1985). Similarly, (Hilderink, Broenink et al. 1997)'s implementation uses a library to introduce CSP style semantics into Java. Whilst these approaches are usable, the CSP primitives should arguably be implemented at the language level.
The first actor based language was proposed by (Hewitt 1976). The language (originally named Planner 73) was based on Lisp syntax, but added even more parentheses and brackets to Lisp's existing multitude, making code somewhat hard to read. From Figure 10 above it can be seen that Plasma spawned a large number of languages in several branches, including the Act series of languages: Act 1 (Lieberman 1981) and Act 2 (Theriault 1983) adopted features from the Omega language (Attardi and Simi 1981), and the Plasma designer returned with Act 3 (Hewitt 1985), a domain specific variant of the Act series. Plasma 2 introduced a parallel interpreter (Salles 1984). Salles then further extended Plasma 2 with support for logic in Alog (Salles 1984). This was further extended into Smart, which contained support for distributed computing on virtual machines (Salles 1989).
A further set of descendants of the Plasma language was the ABCL series of languages: ABCL/1 (Yonezawa 1990); ABCL/R (Watanabe et al. 1988), the ABCL language paradigm with reflection; and finally ABCL/R2 (Yonezawa 1992), which merged features from Hybrid into Common Lisp semantics. One problem with these languages is that they mostly used Lisp as the host language, and they proved to be unpopular outside academia. To remedy this, a variant of ABCL and later languages was created (ABCL/c+ (Doi et al. 1988)) which used C as the host for the actor language. In a number of ways this extension to C was analogous to the C++ extension of C.
A number of independent languages were created to explore facets of actor-based languages in relation to object orientation. A short selection of these is presented here with their more memorable characteristics. Concurrent Aggregates (Chien 1990) aimed at being a more restrictive and pure implementation of the Object Oriented paradigm with actors (an active object language); it also added features such as continuations and first-class behaviours to the language. At about the same time, some implementers favoured Smalltalk (Goldberg and Robson 1983) based solutions, for instance Actalk (Briot 1989). A number of other independent active object languages explored various facets of the paradigm, such as Lucy (Kahn et al. 1990) and Studio (Hadjadji 1994). These languages, whilst excellent academic explorations of the facilities of actor-based languages, tended to be of little use in mainstream programming, as they were frequently hosted either in purpose-built narrow-domain languages or, more commonly, in Lisp.
A small number of actor-based languages attempted to merge with the more mainstream languages C and C++. These languages merged the ideas of the original Plasma language with the syntax and semantics of the host language. For example, Hybrid (Nierstrasz 1987) abandoned the Lisp-style syntax in favour of C++-style syntax with static typing. Another C++-based language, Act++ (Kafura 1989), was designed to be a full concurrent extension of C++ based on the active object paradigm. Synchronous C++ (Petitpierre 1998) was proposed more recently; this extension contained a real-time kernel that allowed each object to contain a thread of control that could be interrupted at will by other active objects. The ideas of Synchronous C++ were ported to Java as Synchronous Java (Petitpierre 2000) two years later. Both languages were implemented via concurrency libraries.
Triveni is a process-algebra-based design methodology that combines threads and events in the context of Object Oriented programming (Colby, Jagadeesan et al. 1998). The Triveni system is not really a language but a framework that can be added to any language via a library; one such extension is the JavaTriveni (Colby, Jagadeesan et al. 1998) library. The framework supplies two classes of combinators: standard combinators from process algebras, such as parallel composition, and pre-emptive combinators (Berry 1993). In Triveni, communication is done via events, which are in fact named communication channels. The framework also has a specification-based testing facility that allows testing for safety properties. However, Triveni's major contribution is its pre-emptive combinators, which allow one process to abort or suspend other processes at runtime, combined with asynchronous communication; earlier frameworks possessed neither. Triveni is an interesting approach; however, it does have some disadvantages. Firstly, the framework is entirely implemented as libraries and consequently does not support the extension at the language level. This means that users have to call library functions to make use of expressions that are more naturally expressed as language primitives. For example, the parallel operation that executes a number of jobs concurrently is expressed as an array of new jobs passed to the constructor of the parallel object. This is non-intuitive, as one would expect such a low-level operation to be expressed as a syntactic operation rather than a library operation. Of course, this could be done with Triveni; however, that would lose some of the language independence the designers were seeking when designing the framework. The second disadvantage is the mismatch between the framework and the platform on which it sits (Java): to be ultimately usable, an extension should fit as closely as possible with the paradigm of its host platform.
Again, this was a conscious design choice, as the developers of Triveni were aiming for a platform-independent framework.
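To make the criticism above concrete, the following Java sketch shows what library-style parallel composition looks like when expressed as an object constructed from an array of jobs. This is a hypothetical illustration only: the class name Parallel and its interface are invented for this example and are not Triveni's actual API.

```java
// Hypothetical sketch (NOT Triveni's actual API): parallel composition
// expressed as a library object taking an array of jobs, in the style
// the text describes. All names here are invented for illustration.
class Parallel {
    private final Runnable[] jobs;

    Parallel(Runnable[] jobs) { this.jobs = jobs; }

    // Runs every job in its own thread and waits for all of them,
    // so the composition terminates when all component jobs terminate.
    public void run() throws InterruptedException {
        Thread[] threads = new Thread[jobs.length];
        for (int i = 0; i < jobs.length; i++) {
            threads[i] = new Thread(jobs[i]);  // one thread per composed job
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();                          // wait for each job to finish
        }
    }
}
```

Contrast this with a language-level operator: the programmer must build an array and invoke a constructor and a method, where a syntactic form could express the same composition directly.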
In Ada (Guerby 1996), tasks are integrated into the language via class-style syntax. Synchronization is via a rendezvous mechanism, in which one thread sends a message to another thread via two primitives, accept and entry.
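The rendezvous concept can be approximated in Java, the host language used throughout this thesis. The sketch below is illustrative only: the class and method names are invented, the entry/accept pairing merely mimics Ada's semantics using Java monitors, and it assumes a single caller and a single acceptor at a time.

```java
// Illustrative approximation of Ada-style rendezvous using Java monitors.
// Assumes one caller and one acceptor; names are invented for this example.
class RendezvousPoint {
    private String message;          // pending message, null when slot is empty
    private boolean replied = false;

    // Caller side: analogous to calling a task entry; blocks until the
    // acceptor has taken the message (the rendezvous completes).
    public synchronized void entry(String msg) throws InterruptedException {
        while (message != null) wait();  // wait until the slot is free
        message = msg;
        replied = false;
        notifyAll();                     // wake the accepting thread
        while (!replied) wait();         // block until accept() completes
    }

    // Acceptor side: analogous to an accept statement; blocks for a caller.
    public synchronized String accept() throws InterruptedException {
        while (message == null) wait();  // wait for a caller to arrive
        String msg = message;
        message = null;
        replied = true;
        notifyAll();                     // release the blocked caller
        return msg;
    }
}
```

Note how much monitor machinery is needed to encode what Ada expresses with two keywords; this is the kind of gap between library encoding and language support that recurs throughout this chapter.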
The JoCaml language was proposed by (Maranget, Fessant et al. 1998; Fournet, Laneve et al. 2000) as an extension of the Objective Caml language with Join calculus semantics. The authors claim that the language includes high-level communication and synchronization channels, failure detection and automatic memory management. JoCaml was the second language implementation by the Join calculus authors, the first being the Join language (Fournet and Maranget 1998). JoCaml is implemented via a concurrency library added to the Caml language; it also modifies the bytecode set of the Caml language in order to make the extension work more easily. This language extension is more interesting, but it is implemented in a language unfamiliar to mainstream programmers. It would thus be necessary to translate the ideas to a production system in order to have them more readily accepted.
Bertrand Meyer extended the Eiffel (Meyer 1988) language to support concurrency and synchronization. His extension, concurrent Eiffel (Meyer 1997), was a straightforward syntactic addition to the language. He achieved concurrency by adding a keyword, separate, that indicates that an object has its own thread of execution. These threads of execution communicate via method invocations on each other's objects; if a method is not ready to reply, it blocks the caller until the reply is ready. Despite the fact that the base language Eiffel is popular in academia and in some restricted commercial domains, concurrent Eiffel cannot be regarded as a mainstream language.
Concurrent ML (Reppy 1992) is an extension based upon SML/NJ (George, MacQueen et al. 2000). The language supports dynamic thread creation and typed message-passing channels. Threads are implicitly first class in the language, which means that synchronisation abstractions can be dynamically created. The language also provides a number of other features, such as stream I/O and low-level I/O integrated into the language. Whilst providing some neat features, such as the first-class behaviour for thread manipulation, Concurrent ML is not an accepted mainstream language.
C++ in reality provides no support for threading or synchronization. Threading is achieved via operating-system-specific libraries that do not integrate closely with other language features; synchronization is achieved through either data-structure sharing or pipes. The danger of this approach is that the libraries are not necessarily platform independent, which means that programs have to be modified for every target platform. Secondly, other libraries and language components that threaded programs use may not themselves be thread-safe. For instance, a member of a class may be declared static, which means only one instance exists across all instances of the class. In normal cases this may be acceptable; however, if the data is accessed and modified in a multi-threaded context, it is possible to corrupt the contents of the field with unpredictable results. This is a problem, as generally the application programmer has no access to the source code, which means they have to either risk possible data corruption or rewrite their own versions of the libraries. Generally, the solution to this problem is to create a wrapper class that protects the unsafe class using some form of locking mechanism (such as the OS semaphores available to Unix-based C and C++ programs). This of course increases the complexity of the application being programmed, hence reducing maintainability.
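The wrapper-class solution described above can be sketched in Java for concreteness (Java's built-in monitors play the role that OS semaphores play in C and C++). The class names below are invented for this illustration; UnsafeCounter stands in for a library class whose source the programmer cannot change.

```java
// Illustrative sketch of the locking-wrapper pattern described above.
// Names are invented; UnsafeCounter stands in for an unmodifiable
// library class whose state is not protected against concurrent access.
class UnsafeCounter {
    private int value = 0;               // shared state, unprotected

    public void increment() { value = value + 1; }  // not atomic
    public int get() { return value; }
}

// The wrapper serialises every call through one lock, at the cost of
// an extra layer of code the application programmer must maintain.
class SafeCounter {
    private final UnsafeCounter inner = new UnsafeCounter();

    public synchronized void increment() { inner.increment(); }
    public synchronized int get() { return inner.get(); }
}
```

Every access now pays a locking cost and the program gains an extra class, which is exactly the maintainability penalty the text identifies.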
Java supports multithreading via the use of a special Thread class that in essence implements an asynchronous method. To make a threaded program, the user subclasses the Thread class and then overrides the run method. Synchronization is achieved via a monitor embedded in every object in the program; programmers make use of these monitors to protect data structures whilst other threads are modifying them. A deeper description of the semantics of Java concurrency can be found in Chapter 3. It is easy, firstly, to circumvent the monitor lock by accident or, secondly, to lose track of the locality of locks, which leads to the possibility of deadlock. It has also been noted that non-method-level locking leads to greater chances of inheritance anomalies (Pons 2002).
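A minimal example of the two Java mechanisms just described, a Thread subclass and a monitor-protected shared structure, may be useful here. The class names SharedLog and Worker are invented for this illustration; only Thread, run and synchronized are Java's own constructs.

```java
import java.util.ArrayList;
import java.util.List;

// A shared structure protected by the monitor embedded in every object:
// the synchronized keyword acquires this object's monitor on entry,
// serialising concurrent calls from multiple threads.
class SharedLog {
    private final List<String> entries = new ArrayList<>();

    public synchronized void add(String entry) { entries.add(entry); }
    public synchronized int size() { return entries.size(); }
}

// Threading via subclassing Thread: the overridden run method is the
// body executed asynchronously after start() is called.
class Worker extends Thread {
    private final SharedLog log;

    Worker(SharedLog log) { this.log = log; }

    @Override
    public void run() {
        for (int i = 0; i < 1000; i++) {
            log.add(getName() + ":" + i);
        }
    }
}
```

Note that the protection is only as good as the programmer's discipline: a second, unsynchronized method on SharedLog that touched entries directly would silently circumvent the monitor, which is precisely the accidental-circumvention hazard noted above.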
In this section, a selection of the most relevant object oriented languages and extensions was covered. It can be seen that actor-based, and more specifically active object, languages are an interesting solution to the issues involved in combining concurrency and Object Oriented languages. It could also be presumed that the active object paradigm would be more computationally expensive than that of non-active objects, as the number of active threads is implicitly larger. It can also be seen how academic languages, whilst having superior concurrency mechanisms, are not widely used; counter to this, mainstream languages are more conservative in their approach to concurrency yet are widely accepted. Consequently, through these languages it can be seen that there is an apparent gap between the promise of the academic languages and the reality of the mainstream languages.
In this section, we summarize the languages we examined using the categorizations that were identified earlier. Table 1 shows the method by which concurrency is added to each language. Table 2 shows whether the language is considered mainstream or non-mainstream. Finally, Table 3 shows whether the language is an actor-based language or a non-actor language.
Table 1. Summary of Concurrency Integration of Languages
In Table 1 it can be seen that the majority of languages are retrofitted languages. A number of the languages are library based, in that their concurrency is supplied by a library. A number of assumptions were made in this table regarding whether particular language features are library based, new language or retrofitted. For example, Java, whilst being a new language with some concurrency features integrated into the language (the synchronized keyword), in fact implements its threading code as a library. We only consider a language to be non-library based if it can create and synchronize threads without resorting to system libraries.
Table 2. Summary of Mainstream vs. Non-Mainstream Languages
Table 2 shows that the majority of languages are non-mainstream. This makes sense, as industry users are generally unwilling to move from the languages they are familiar with. If we examine the mainstream languages mentioned here, we can see that they tend to be iterations of existing languages. For example, C evolved into C++, and Java adopted a large portion of its syntax from C++.