The 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Ottawa, Canada

Keynote Speakers

1. Discovering Knowledge from Massive Social Networks and Science Data – Next Frontier for HPC

Prof. Alok N. Choudhary

John G. Searle Professor

Electrical Engineering and Computer Science

Northwestern University, USA

http://users.eecs.northwestern.edu/~choudhar/

Abstract:
Knowledge discovery in science and engineering has been driven by theory, experiments and, more recently, by large-scale simulations using high-performance computers. Modern experiments and simulations involving satellites, telescopes, high-throughput instruments, imaging devices, sensor networks, accelerators, and supercomputers yield massive amounts of data. At the same time, the world, including social communities, is creating massive amounts of data at an astonishing pace. Just consider Facebook, Google, articles, papers, images, videos and others. But even more complex is the network that connects the creators of data. There is knowledge to be discovered in both. This represents a significant and interesting challenge for HPC and opens opportunities for accelerating knowledge discovery.

In this talk, following an introduction to high-end data mining and the basic knowledge discovery paradigm, we present the process, challenges and potential of this approach. We will present many case examples, results and future directions, including (1) mining sentiments from massive datasets on the web; (2) real-time stream mining of text from millions of tweets to identify influencers and the sentiments of people; (3) discovering knowledge from massive social networks containing millions of nodes and hundreds of billions of edges, built from real-world Facebook, Twitter and other social network data (e.g., can anyone follow Presidential campaigns in real time?); and (4) discovering knowledge from massive datasets from science applications, including climate, medicine, biology and sensors. The talk will be illustrative and example-driven and may include one or two live demonstrations.

Biography:
Alok Choudhary is the John G. Searle Professor of Electrical Engineering and Computer Science at Northwestern University. He is the founding director of the Center for Ultra-scale Computing and Information Security (CUCIS). In 2000, Prof. Choudhary co-founded Accelchip Inc. and served as its VP of Technology; the company was eventually acquired by Xilinx. He received the National Science Foundation's Young Investigator Award in 1993. He has also received an IEEE Engineering Foundation award, an IBM Faculty Development award, and an Intel Research Council award.

He is a fellow of the IEEE, ACM and AAAS. His research interests are in high-performance computing, data-intensive computing, scalable data mining, computer architecture, high-performance I/O systems and software, and their applications. Alok Choudhary has published more than 350 papers in various journals and conferences and has graduated 30 PhD students. Techniques developed by his group can be found on every modern processor, and scalable software developed by his group can be found on most supercomputers. Alok Choudhary's work has appeared in many traditional media outlets, including the New York Times, the Chicago Tribune, The Telegraph, and television channels such as ABC and PBS, as well as many international media outlets around the world.

2. Assertion Based Parallel Debugging: A new way of thinking?

Prof. David Abramson

ARC Professorial Fellow

Professor of Computer Science

The Faculty of Information Technology

Monash University, Australia

http://www.csse.monash.edu.au/~davida/personal.html

Abstract:
Programming languages have advanced tremendously over the years, but program debuggers have hardly changed. Sequential debuggers do little more than allow a user to control the flow of a program and examine its state. Parallel ones support the same operations on multiple processes, and are adequate with a small number of cores, but become unwieldy and ineffective on very large machines. Typical scientific codes have enormous multi-dimensional data structures and it is impractical to expect a user to view the data using traditional display techniques.

In this talk I will discuss the use of debug-time assertions (both within and across programs), and show that these can be used to debug parallel programs. The techniques reduce the debugging complexity because they reason about the state of large arrays without requiring the user to know the expected value of every element. When used across programs, the technique can help find errors that occur when a program is ported to a new platform. Whilst assertions can be expensive to evaluate, their performance can be improved by running them in parallel. We have implemented these ideas in a new debugger called Guard, and will illustrate its performance on tens of thousands of cores on a Cray XE6.
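To make the idea concrete, here is a minimal, hypothetical sketch of what a debug-time assertion over a large array might look like: it compares the state of an array from a suspect run (e.g., a ported or parallelized version) against the same logical array from a reference run, reporting where they diverge rather than requiring the user to know the expected value of every element. The function and file names below are illustrative assumptions only and do not show Guard's actual interface or syntax.

```python
# Hypothetical illustration of a relative, debug-time assertion over array state.
# Not Guard's API: assert_arrays_match and the .npy file names are invented here.
import numpy as np

def assert_arrays_match(reference, suspect, rel_tol=1e-6):
    """Check that two large arrays agree element-wise within a relative tolerance,
    reporting the number of differing elements and the first point of divergence."""
    diff = np.abs(reference - suspect)
    scale = np.maximum(np.abs(reference), 1e-30)
    bad = np.argwhere(diff / scale > rel_tol)
    if bad.size:
        first = tuple(bad[0])
        raise AssertionError(
            f"{bad.shape[0]} elements differ; first divergence at index {first}: "
            f"reference={reference[first]}, suspect={suspect[first]}")

# Usage sketch: at a chosen breakpoint, capture the same logical array from both
# runs (gathered across all processes in the parallel case) and compare them.
ref = np.load("temperature_reference.npy")   # state saved from the reference run (assumed)
new = np.load("temperature_ported.npy")      # state from the ported/parallel run (assumed)
assert_arrays_match(ref, new)
```

In a production debugger the comparison itself can be distributed across the processes that own the array sections, which is the parallel-evaluation point made above.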

Biography:
Professor David Abramson has been involved in computer architecture and high-performance computing research since 1979. Prior to joining Monash University in 1997, he held appointments at Griffith University, CSIRO, and RMIT. At CSIRO he was the program leader of the Division of Information Technology's High Performance Computing Program, and he was also an adjunct Associate Professor at RMIT in Melbourne. He served as a program manager and chief investigator in the Co-operative Research Centre for Intelligent Decision Systems and the Co-operative Research Centre for Enterprise Distributed Systems.

Abramson is currently an ARC Professorial Fellow, Professor of Computer Science in the Faculty of Information Technology at Monash University, Australia, and science director of the Monash e-Research Centre. He is a fellow of the Association for Computing Machinery (ACM) and the Australian Academy of Technological Sciences and Engineering (ATSE), and a member of the IEEE. Abramson has served on committees for many conferences and workshops, and has published over 200 papers and technical documents. He has given seminars and received awards around Australia and internationally, and has received over $8 million in research funding. He also has a keen interest in R&D commercialization and consults for Axceleon Inc., which produces an industrial-strength version of Nimrod, and Guardsoft, a company focused on commercialising the Guard relative debugger. Abramson's current interests are in high-performance computer systems design, software engineering tools for programming parallel and distributed supercomputers, and stained glass windows.

3. Twenty Years of Grid Scheduling Research and Beyond

Prof. Dick H.J. Epema

Delft University of Technology

Eindhoven University of Technology

The Netherlands

http://www.pds.ewi.tudelft.nl/epema

Abstract:
Exactly twenty years ago, we started our grid scheduling research in Delft with the design and implementation of the Condor flocking mechanism for load sharing across Condor pools. Over the last ten years we have designed and deployed the KOALA grid scheduler, which has served as our research vehicle for a wide variety of research in grid scheduling. These are only two examples showing that, in general, grids have been a godsend for researchers in resource management and scheduling in large-scale distributed systems. In this talk I will first look back: what have we learned from our past research in resource management and scheduling, and what is the value of this research in the first place?

In these twenty years, the computing world has changed drastically: data centers dominate the Internet, clouds are driving out grids, data arrive in virtual tsunamis, multicore has revived parallelism, and energy efficiency has become a concern. In this talk I will also try to look ahead. What are the essential differences between (research in) grids and clouds? What is the value of our experimental work when research systems are increasingly dwarfed by the systems deployed in industry? As one of the application drivers for research in resource management, I will present the case of massively multiplayer online games, which constitute a huge market and have a rich structure in terms of system requirements.

Biography:
Dick H.J. Epema holds an MSc and a PhD in mathematics (algebraic geometry) from Leiden University in the Netherlands, and an MSc in computer science from Delft University of Technology. Currently, he is an associate professor of Computer Science at Delft University of Technology and a full professor in Decentralized Distributed Systems at Eindhoven University of Technology. His research interests are in performance analysis and distributed systems in general, and in grids, clouds, peer-to-peer systems and online social networks in particular. In the area of grids and clouds, his focus is on resource management and scheduling, and his research centers on the KOALA grid scheduler, which has been deployed on the Dutch DAS system. In the area of peer-to-peer systems and online social networks, his research covers all aspects of video distribution in swarm-based P2P systems as well as reputation mechanisms, and is part of the research and development of the Tribler P2P system.

Dick Epema participates in the Infrastructure Virtualization for e-Science project of the Dutch national COMMIT program and in the European P2P-Next project. He has authored over 90 scientific papers and has served on numerous program committees in grids, clouds, and P2P computing. He is an associate editor of the IEEE Transactions on Parallel and Distributed Systems. He was general co-chair of the Euro-Par 2009 and IEEE P2P 2010 conferences, and he is the general chair of the 21st International ACM Symposium on High-Performance Parallel and Distributed Computing in 2012.

Sponsors
IEEE
ACM
TCSC
Carleton University, Canada
Indiana University, USA
The University of Melbourne, Australia