A number of online machine learning techniques based on tree model have been studied in order to cope with today's requirements of quickly processing large scale data-sets. We present a design pattern for incremental tree data processing as gradually constructing on-demand tree-model on memory. Our approach adopts the actor model as making use of multi-cores and distributed computers without largely rewriting code for algorithms. The pattern basically defines a node in the tree as an actor which is the unit of asynchronous processes and each data instance flows between actor nodes as a message. We study concrete two machine learning algorithms, VFDT for decision tree's top-down growth and BIRCH for hierarchical clustering's bottom up growth. For supporting VFDT, we propose an extension mechanism of replicating root nodes so that it can address bottleneck as starting of inputs. For supporting BIRCH, we split processes of recursive construction into asynchronous steps with correcting target node by traversing extra horizontal links between sibling nodes. We carried out machine learning tasks with our implementation on top of Akka Java, and we confirmed reasonable performance for the tasks with large scale data-sets.
The actor model is a well-established way to approach to modularly designing and implementing concurrent and/or distributed systems, seeing increasing adoption in industry. But deductive verification tailored to actor programs remains underexplored; general concurrent logics could be used, but the logics are complex and full of features to reason about behaviors the actor model strives to avoid.
We explore a relatively lightweight approach of extending a system for proving sequential program correctness with means to prove safety properties of actor programs (currently, assuming no faults). We borrow ideas from hybrid logic, a modal logic for stating assertions are true at a particular point in a model (in this case, a particular actor’s local state). To make such assertions useful, we stabilize them using rely-guarantee-style reasoning over local actor states, and only permit sending stable versions of these assertions to other actors. By carefully restricting the formation of assertions that a proposition is true at a certain actor, we avoid the need for actors to handle each others’ rely-guarantee relations explicitly. Finally, we argue that the approach requires only modest adjustments beyond applying traditional sequential techniques to actors with immutable messages, by implementing most of the logic as a Dafny library.
In high-throughput, distributed systems, such as large-scale banking infrastructure, synchronization between actors becomes a bottle-neck in high-contention scenarios. This results in delays for users, and reduces opportunities for scaling such systems. This paper proposes Static Local Coordination Avoidance, which analyzes application invariants at compile time to detect whether messages are independent, so that synchronization at run time is avoided, and parallelism is increased. Analysis shows that in industry scenarios up to 60% of operations are independent. Initial performance evaluation shows that, in comparison to a standard 2-phase commit baseline, throughput is increased, and latency is reduced. As a result, scalability bottlenecks in high-contention scenarios in distributed actor systems are reduced for independent messages.
We propose the localised ρ-calculus, a process calculus that uses an identification of processes and names in higher-order communication along with a notion of locations. These are named, bounded areas for computation, organised in a hierarchy, and processes can move from one to another by crossing their boundaries. Process migration becomes higher-order communication. To control the usage of channels we develop a type system with binary session types for the localised ρ-calculus, based on work by Giunti and Vasconcelos, and prove that it is sound.
The actor paradigm supports the natural expression of concurrency. It has inspired the development of several actor-based languages, whose adoption depends, to a large extent, on the runtime characteristics ( the performance and scaling behaviour) of programs written in these languages.
This paper investigates the relative runtime characteristics of Akka, CAF and Pony, based on the Savina benchmarks. We observe that the scaling of many of the Savina benchmarks does not reflect their categorization (into essentially sequential, concurrent and parallel), that many programs have similar runtime characteristics, and that their runtime behaviour may drastically change nature ( go from essentially sequential to parallel) by tweaking some parameters.
These observations lead to our proposal of a single benchmark program which we designed so that through tweaking of some knobs (we hope) we can simulate most of the programs of the Savina suite.