PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
SESSION: Keynote
It's Time for a New Old Language
Guy L. Steele, Jr.
SESSION: Session 1: GPU I
EffiSha: A Software Framework for Enabling Efficient Preemptive Scheduling of GPU
Guoyang Chen
Yue Zhao
Xipeng Shen
Huiyang Zhou
Layout Lock: A Scalable Locking Paradigm for Concurrent Data Layout Modifications
Nachshon Cohen
Arie Tal
Erez Petrank
Understanding the GPU Microarchitecture to Achieve Bare-Metal Performance Tuning
Xiuxia Zhang
Guangming Tan
Shuangbai Xue
Jiajia Li
Keren Zhou
Mingyu Chen
SESSION: Session 2: Concurrency
Checking Concurrent Data Structures Under the C/C++11 Memory Model
Peizhao Ou
Brian Demsky
An Efficient Abortable-locking Protocol for Multi-level NUMA Systems
Milind Chabbi
Abdelhalim Amer
Shasha Wen
Xu Liu
Contention in Structured Concurrency: Provably Efficient Dynamic Non-Zero Indicators for Nested Parallelism
Umut A. Acar
Naama Ben-David
Mike Rainey
Noise Injection Techniques to Expose Subtle and Unintended Message Races
Kento Sato
Dong H. Ahn
Ignacio Laguna
Gregory L. Lee
Martin Schulz
Christopher M. Chambreau
SESSION: Session 3: Tools
Thread Data Sharing in Cache: Theory and Measurement
Hao Luo
Pengcheng Li
Chen Ding
Exploiting Vector and Multicore Parallelism for Recursive, Data- and Task-Parallel Programs
Bin Ren
Sriram Krishnamoorthy
Kunal Agrawal
Milind Kulkarni
Isoefficiency in Practice: Configuring and Understanding the Performance of Task-based Applications
Sergei Shudler
Alexandru Calotoiu
Torsten Hoefler
Felix Wolf
Processor-Oblivious Record and Replay
Robert Utterback
Kunal Agrawal
I-Ting Angelina Lee
Milind Kulkarni
SESSION: Session 4: GPU II
Simple, Accurate, Analytical Time Modeling and Optimal Tile Size Selection for GPGPU Stencils
Nirmal Prajapati
Waruna Ranasinghe
Sanjay Rajopadhye
Rumen Andonov
Hristo Djidjev
Tobias Grosser
Combining SIMD and Many/Multi-core Parallelism for Finite State Machines with Enumerative Speculation
Peng Jiang
Gagan Agrawal
S-Caffe: Co-designing MPI Runtimes and Caffe for Scalable Deep Learning on Modern GPU Clusters
Ammar Ahmad Awan
Khaled Hamidouche
Jahanzeb Maqbool Hashmi
Dhabaleswar K. Panda
Model-based Iterative CT Image Reconstruction on GPUs
Amit Sabne
Xiao Wang
Sherman J. Kisner
Charles A. Bouman
Anand Raghunathan
Samuel P. Midkiff
SESSION: Session 5: Best Papers
Pagoda: Fine-Grained GPU Resource Virtualization for Narrow Tasks
Tsung Tai Yeh
Amit Sabne
Putt Sakdhnagool
Rudolf Eigenmann
Timothy G. Rogers
Groute: An Asynchronous Multi-GPU Programming Model for Irregular Computations
Tal Ben-Nun
Michael Sutton
Sreepathi Pai
Keshav Pingali
Tapir: Embedding Fork-Join Parallelism into LLVM's Intermediate Representation
Tao B. Schardl
William S. Moses
Charles E. Leiserson
A Multicore Path to Connectomics-on-Demand
Alexander Matveev
Yaron Meirovitch
Hayk Saribekyan
Wiktor Jakubiuk
Tim Kaler
Gergely Odor
David Budden
Aleksandar Zlateski
Nir Shavit
SESSION: Session 6: Languages & Compilers
SC-Haskell: Sequential Consistency in Languages That Minimize Mutable Shared Heap
Michael Vollmer
Ryan G. Scott
Madanlal Musuvathi
Ryan R. Newton
Synchronized-by-Default Concurrency for Shared-Memory Systems
Martin Bättig
Thomas R. Gross
Function Call Re-Vectorization
Rubens E.A. Moreira
Sylvain Collange
Fernando Magno Quintão Pereira
Optimizing the Four-Index Integral Transform Using Data Movement Lower Bounds Analysis
Samyam Rajbhandari
Fabrice Rastello
Karol Kowalski
Sriram Krishnamoorthy
P. Sadayappan
SESSION: Session 7: Data Analytics
Using Butterfly-Patterned Partial Sums to Draw from Discrete Distributions
Guy L. Steele, Jr.
Jean-Baptiste Tristan
KiWi: A Key-Value Map for Scalable Real-Time Analytics
Dmitry Basin
Edward Bortnikov
Anastasia Braginsky
Guy Golan-Gueta
Eshcar Hillel
Idit Keidar
Moshe Sulamy
Grammar-aware Parallelization for Scalable XPath Querying
Lin Jiang
Zhijia Zhao
Eunomia: Scaling Concurrent Search Trees under Contention Using HTM
Xin Wang
Weihua Zhang
Zhaoguo Wang
Ziyun Wei
Haibo Chen
Wenyun Zhao
SESSION: Session 8: Fault Tolerance
Self-Checkpoint: An In-Memory Checkpoint Method Using Less Space and Its Practice on Fault-Tolerant HPL
Xiongchao Tang
Jidong Zhai
Bowen Yu
Wenguang Chen
Weimin Zheng
Silent Data Corruption Resilient Two-sided Matrix Factorizations
Panruo Wu
Nathan DeBardeleben
Qiang Guan
Sean Blanchard
Jieyang Chen
Dingwen Tao
Xin Liang
Kaiming Ouyang
Zizhong Chen
POSTER SESSION: Session 9: Posters
POSTER: Reuse, don't Recycle: Transforming Algorithms that Throw Away Descriptors
Maya Arbel-Raviv
Trevor Brown
POSTER: An Architecture and Programming Model for Accelerating Parallel Commutative Computations via Privatization
Vignesh Balaji
Dhruva Tirumala
Brandon Lucia
POSTER: HythTM: Extending the Applicability of Intel TSX Hardware Transactional Support
Arnamoy Bhattacharyya
Mike Dai Wang
Mihai Burcea
Yi Ding
Allen Deng
Sai Varikooty
Shafaaf Hossain
Cristiana Amza
POSTER: Provably Efficient Scheduling of Cache-Oblivious Wavefront Algorithms
Rezaul Chowdhury
Pramod Ganapathi
Yuan Tang
Jesmin Jahan Tithi
POSTER: State Teleportation via Hardware Transactional Memory
Nachshon Cohen
Maurice Herlihy
Erez Petrank
Elias Wald
POSTER: IOGP: An Incremental Online Graph Partitioning for Large-Scale Distributed Graph Databases
Dong Dai
Wei Zhang
Yong Chen
POSTER: Distributed Control: The Benefits of Eliminating Global Synchronization via Effective Scheduling
Jesun Shariar Firoz
Thejaka Amila Kanewala
Marcin Zalewski
Martina Barnas
Andrew Lumsdaine
POSTER: MAPA: An Automatic Memory Access Pattern Analyzer for GPU Applications
Gangwon Jo
Jaehoon Jung
Jiyoung Park
Jaejin Lee
POSTER: Cache-Oblivious MPI All-to-All Communications on Many-Core Architectures
Shigang Li
Yunquan Zhang
Torsten Hoefler
POSTER: Automated Load Balancer Selection Based on Application Characteristics
Harshitha Menon
Kavitha Chandrasekar
Laxmikant V. Kale
POSTER: A GPU-Friendly Skiplist Algorithm
Nurit Moscovici
Nachshon Cohen
Erez Petrank
POSTER: Poor Man's URCU
Pedro Ramalhete
Andreia Correia
POSTER: A Wait-Free Queue with Wait-Free Memory Reclamation
Pedro Ramalhete
Andreia Correia
POSTER: STAR (Space-Time Adaptive and Reductive) Algorithms for Real-World Space-Time Optimality
Yuan Tang
Ronghui You
POSTER: Recovering Performance for Vector-based Machine Learning on Managed Runtime
Mingyu Wu
Haibing Guan
Binyu Zang
Haibo Chen
POSTER: On the Problem of Consistency Exceptions in the Context of Strong Memory Models
Minjia Zhang
Swarnendu Biswas
Michael D. Bond
POSTER: An Infrastructure for HPC Knowledge Sharing and Reuse
Yue Zhao
Chunhua Liao
Xipeng Shen