Wednesday, September 7

CnC Workshop held in the Cherokee Ballroom of the Lory Student Center

7PM - 9PM: Reception at the Hilton for LCPC and CnC attendees

Thursday, September 8

LCPC Workshop held in the Cherokee Ballroom of the Lory Student Center

8:15AM - 8:45AM: Continental Breakfast

8:45AM - 9:00AM: Opening Remarks, Workshop Chairs

9:00AM - 10:00AM: Keynote Speaker, Vikram Adve from the University of Illinois at Urbana-Champaign
Session chair: Michelle Strout

Parallel Programming Should Be -- And Can Be -- Deterministic-by-default
How can we make parallel programming accessible to average developers in the vast commodity applications arena? I will argue that "deterministic-by-default" parallel programming models are part of the answer. With such a model, the programmer is freed from the burden of reasoning about thread interleavings, data races, deadlock, or the intricacies of memory models. Unfortunately, this goal is challenging to achieve for modern imperative, object-oriented languages. I will describe Deterministic Parallel Java (DPJ), a deterministic-by-default parallel programming language that guarantees sequential semantics (except where requested otherwise) for explicitly parallel programs. The deterministic guarantees are achieved using a region-based type and effect system that is a simple extension to base Java. Programmers can combine nondeterministic algorithms with deterministic ones in a data-race-free, isolated manner, with the strongest safety guarantees of any parallel language or system. DPJ also supports the safe use of expressive parallel frameworks. An interactive tool called DPJizer, under development, enables programmers to port existing Java code to DPJ semi-automatically. We are now exploring how these techniques can be extended to C++, combined with existing parallel languages such as Cilk++ or OpenMP, and applied to large, industrial code bases.

10:00AM - 10:30AM: Break

10:30AM - 12:00PM: Compiling for parallelism and/or power
Session chair: Wim Bohm

Automatic Scaling of OpenMP Beyond Shared Memory
Okwan Kwon (Purdue), Fahed Jubair (Purdue), Seung-Jai Min (LBNL), Rudolf Eigenmann (Purdue), and Samuel Midkiff (Purdue)

A Methodology for Fine-Grained Parallelization of JavaScript Applications
Jeffrey Fifield and Dirk Grunwald (University of Colorado, Boulder)

Evaluation of Power Consumption at Execution of Multiple Automatically Parallelized and Power Controlled Media Applications on the RP2 Low-power Multicore
Hiroki Mikami, Shumpei Kitaki, Masayoshi Mase, Akihiro Hayashi, Mamoru Shimaoka, Keiji Kimura, Masato Edahiro, and Hironori Kasahara (Waseda University, Japan)

12:00PM - 1:30PM: Lunch in North Ballroom

1:30PM - 2:30PM: Run-time analysis and parallelization
Session chair: Xiaoming Li

Double inspection for run-time loop parallelization
Michael Philippsen (University of Erlangen-Nuremberg, Germany), Nikolai Tillmann (Microsoft Research), and Daniel Brinkers (University of Erlangen-Nuremberg, Germany)

A Hybrid Approach to Proving Memory Reference Monotonicity
Cosmin E. Oancea and Lawrence Rauchwerger (Texas A&M)

2:30PM - 3:00PM: Break

3:00PM - 4:30PM: Parallel programming models
Session chair: Lawrence Rauchwerger

OpenCL as a Programming Model for GPU Clusters
Jungwon Kim, Sangmin Seo, Jun Lee, Jeongho Nah, Gangwon Jo, and Jaejin Lee (Seoul National University, Korea)

CellCilk: Extending Cilk for heterogeneous multicore platforms
Tobias Werth (University of Erlangen-Nuremberg, Germany), Silvia Schreier (University of Hagen, Germany), and Michael Philippsen (University of Erlangen-Nuremberg, Germany)

OPELL and PM: A Case Study on Porting Shared Memory Programming Models to Accelerators Architectures
Joseph B. Manzano, Ge Gan, Juergen Ributzka, Sunil Shrestha, and Guang R. Gao (University of Delaware)

4:30PM: Group photo

5:30PM: Bus pickup at Hilton

6:30PM - 9:30PM: Banquet at Sylvan Dale Ranch

Friday, September 9

LCPC Workshop held in the Cherokee Ballroom of the Lory Student Center

8:30AM - 9:00AM: Continental Breakfast

9:00AM - 10:00AM: Keynote Speaker, Vijay Saraswat from the Advanced Programming Languages Department at IBM TJ Watson Research Center
Session chair: Sanjay Rajopadhye

Constrained types: What are they and what can they do for you
The talk is based on joint work with Nate Nystrom, Igor Peshansky, Olivier Tardieu, Dave Grove, Dave Cunningham, Yoav Zibin and Avi Shinnar.

One of the major innovations of the X10 programming language is constrained types [OOPSLA 2008]. A constrained type is of the form N{c} where N is the name of a class/struct/interface and c is a constraint on the properties (marked immutable instance fields) of N. Constraints may reference any accessible immutable variable and may be drawn from any vocabulary of predicates, as long as an underlying constraint solver is available.

For instance, an object o is of type Matrix{self.I==self.J} if it is of type Matrix and o.I==o.J. The signature of matrix multiply can be given by def mult(a:Matrix{self.I==this.J}):Matrix{self.I==this.I, self.J==a.J}. If the class Tree is augmented with a property root:Tree, then the type Tree{self.root==o} is precisely the type of all nodes in the tree with root o (cf. membership in ownership domains).

Constrained types may occur wherever types can occur -- including in declarations of variables, fields and parameters, return types of methods, dynamic casts, annotations etc. The X10 compiler statically checks symbolic entailment of constraints when performing subtype checks (e.g. to determine if an expression can be assigned to a variable, or returned from a method).

Constrained types are a particularly powerful form of dependent types well-suited to object-oriented programming languages. They draw power from the natural impredicativity of object-oriented languages -- the type of any field of a class T can be T or any class which has fields of type T. Indeed, checking entailment of constrained types even with just == and != constraints is undecidable (because of impredicativity).

Constrained types are very valuable in code generation and array bounds checking. For instance, the type Array[T] in X10 can be defined over arbitrarily shaped index regions. The type Rail[T] is defined as Array[T]{self.rank==1,self.zeroBased,self.rect}, i.e. the set of all arrays o of type T such that o's region is 0..N for some N. If the compiler infers that a variable is of type Rail[T] it generates much more efficient code for accessing its elements. Similarly, if x:Array{self.region==r}, and p:Point(r.inner(1)) (where r.inner(k) represents all those points p such that p+k lies in r), then x(p) and x(p+1) can be statically established to be legal accesses (without knowing the value of r) since p and p+1 lie in r. (*)

Constrained types can also be used for safety analysis of concurrent programs, in particular for establishing that two pieces of parallel code affect disjoint partitions of the heap (= zones), and hence can commute with each other. In particular we show how the "region effects" of Deterministic Parallel Java (DPJ) can be naturally represented in X10 using constrained types. For instance the type of all arrays x of type T whose members x(p) lie in the zone Zone(,p) obtained from using the index p can be described by x#Array[(p:x.region)=> T{,p)}]. Using this the compiler can establish that if p!= q and p,q lie in x.region then x(p) != x(q). (*)

(*) These constraints cannot be processed by the X10 2.2 compiler. Work is in progress to add support for such constraints.

10:00AM - 10:30AM: Break

10:30AM - 12:00PM: Synchronization
Session chair: Keiji Kimura

Optimizing the Concurrent Execution of Locks and Transactions
Justin Gottschlich and Jaewoong Chung (Intel)

A Study of the usefulness of Producer/Consumer Synchronization
Hao Lin, Samuel P. Midkiff and Rudolf Eigenmann (Purdue)

Lock-Free Resizeable Concurrent Tries
Aleksandar Prokopec, Phil Bagwell, Martin Odersky (École Polytechnique Fédérale de Lausanne, Switzerland)

12:00PM - 1:30PM: Lunch in North Ballroom

1:30PM - 3:00PM: Accelerators
Session chair: Rudi Eigenmann

Fine-Grained Treatment to Synchronizations in GPU-to-CPU Translation
Ziyu Guo and Xipeng Shen (College of William and Mary)

A Mutable Hardware Abstraction to Replace Threads
Sean Halle (TU Berlin and INRIA and UC Santa Cruz) and Albert Cohen (INRIA)

Dynamic Task Parallelism with a GPU Work-Stealing Runtime System
Sanjay Chatterjee, Max Grossman, Alina Sbirlea, and Vivek Sarkar (Rice University)

3:00PM - 5:00PM: Poster session with snacks

Dinner on own

Saturday, September 10

LCPC Workshop held in the Cherokee Ballroom of the Lory Student Center

8:30AM - 9:00AM: Continental Breakfast

9:00AM - 10:00AM: Compiling for Parallelism
Session chair: Hironori Kasahara

A Code Merging Optimization Technique for GPGPU
Ryan Taylor and Xiaoming Li (University of Delaware)

Static compilation analysis for host-accelerator communication optimization
Mehdi Amini (HPC Project, Meudon, France and MINES ParisTech/CRI, Fontainebleau, France), Fabien Coelho (MINES ParisTech/CRI, Fontainebleau, France), François Irigoin (MINES ParisTech/CRI, Fontainebleau, France), and Ronan Keryell (HPC Project, Meudon, France)

10:00AM - 10:30AM: Break

10:30AM - 12:00PM: Run-time systems

Scheduling Support for Communicating Parallel Tasks
Jörg Dümmler (Chemnitz University of Technology, Germany), Thomas Rauber (Bayreuth University, Germany), and Gudula Rünger (Chemnitz University of Technology, Germany)

Polytasks: A Compressed Task Representation for HPC Runtimes
Daniel Orozco (University of Delaware and ET International), Elkin Garcia (University of Delaware), Robert Pavel (University of Delaware), Rishi Khan (ET International) and Guang Gao (University of Delaware)

Detecting False Sharing in OpenMP Applications Using the DARWIN Framework
Besar Wicaksono, Munara Tolubaeva and Barbara Chapman (University of Houston)

12:00PM - 12:15PM: Closing remarks, Workshop chairs

