Champéry Winter School 2020

Time & Place

Time: The winter school will take place in the city of Champéry from Monday, February 3 (noon) to Friday, February 7 (noon).
Place: Hôtel Suisse, Rue du Village 55, 1874 Champéry (Google Maps). The hotel is a short walk from the train station, in the center of the village. All rooms have wifi. You can rent ski and snow equipment right next to the hotel.

Theme – Exploring New Computing Platforms

Computing systems have historically relied upon general-purpose CPUs exclusively. Recently, with the appearance of tools such as CUDA or OpenCL, GPGPUs became another source of computing power. The list of alternative computing platforms keeps growing: TPUs and TensorFlow, programmable switches and P4, FPGAs and VHDL, “smart” SSDs and in-storage computations, to name a few. These “accelerators,” as these faster sources of specialized compute power are called, would have been useful but not too influential, if it were not for another paradigm shift.

CPUs are not getting much faster. People in the computer architecture field refer to the end of Moore’s Law and of Dennard scaling. Simply put, performance improvements are no longer going to come from writing better software for increasingly parallel hardware, as we have done since the mid-aughts. Performance will come from a tighter collaboration between increasingly specialized hardware and the right abstractions and optimizations to unlock its power. That is why we have seen a growing number of new computing platforms.

In this gathering, we will call upon the hardware experts and ask them to guide system software students through the new potential areas of research. Our goal is to inform a new generation of students about the ongoing confluence of hardware and software as the next source for scalable systems.

Instructors

Prof. Paolo Ienne (EPFL) – High-Level Synthesis
(Remotely) Dr. Thomas Heinis (Imperial College, UK) – DNA Storage
Prof. Onur Mutlu (ETH) – In-Memory Computing
(Remotely) Dr. Jian Ouyang (Chief Hardware Architect – Baidu, China) – ML Acceleration
Dr. Ricard Delgado (CSEM) – Wearable Computing: From Signal Processing to AI
Dr. Alberto Lerner (UNIFR) – In-Network Computing

Registration

Closed

Transportation

Champéry is easily accessible by train or car. Here is how to get there.

Tentative Schedule

Morning sessions: 8:30 to 11:45 (coffee break: 10:00).
Afternoons free from Tuesday on. (Did we mention the ski equipment rental store right by the hotel?)
Tea time (included but not obligatory): 16:00 to 16:30
Evening sessions: 16:30 to 19:00 (break: TBD).
Dinner: 19:30 (included)

Monday, February 3

Check-in will be open 3:30pm-7pm, but you can of course drop your luggage at the hotel before that. The session will take place downstairs at the hotel (the larger of the two seminar rooms). Lunch is on your own.

13:00-13:15	Opening: A New Era of Heterogeneous Hardware (Organizers)
13:15-16:00	Memory Systems and Memory-Centric Computing Systems: Challenges and Opportunities, part I (Instructor: Prof. Onur Mutlu) Slides: Part I Part II
16:00-16:30	Tea Time
16:30-19:00	Memory Systems and Memory-Centric Computing Systems: Challenges and Opportunities, part II (Instructor: Prof. Onur Mutlu) Slides: Part III Part IV
19:30-	Dinner at At’Home (included)

Abstract:

The memory system is a fundamental performance and energy bottleneck in almost all computing systems. Recent system design, application, and technology trends that require more capacity, bandwidth, efficiency, and predictability out of the memory system make it an even more important system bottleneck. At the same time, DRAM and flash technologies are experiencing difficult technology scaling challenges that make the maintenance and enhancement of their capacity, energy efficiency, performance, and reliability significantly more costly with conventional techniques. In fact, recent reliability issues with DRAM, such as the RowHammer problem, are already threatening system security and predictability. We are at the challenging intersection where issues in memory reliability and performance are tightly coupled with not only system cost and energy efficiency but also system security.

In this tutorial, we first discuss major challenges facing modern memory systems (and the computing platforms we currently design around the memory system) in the presence of greatly increasing demand for data and its fast analysis. We then examine some promising research and design directions to overcome these challenges. We discuss at least three key topics in some detail, focusing on both open problems and potential solution directions:

fundamental issues in memory reliability and security and how to enable fundamentally secure, reliable, safe architectures
enabling data-centric and hence fundamentally energy-efficient architectures that are capable of performing computation near data
reducing both latency and energy consumption by tackling the fixed-latency/energy mindset

If time permits, we will also discuss research challenges and opportunities in enabling emerging NVM (non-volatile memory) technologies and scaling NAND flash memory and SSDs (solid state drives) into the future.

Bio:

Onur Mutlu is a Professor of Computer Science at ETH Zurich. He is also a faculty member at Carnegie Mellon University, where he previously held the Strecker Early Career Professorship. His current broader research interests are in computer architecture, systems, hardware security, and bioinformatics. A variety of techniques he, along with his group and collaborators, has invented over the years have influenced industry and have been employed in commercial microprocessors and memory/storage systems. He obtained his PhD and MS in ECE from the University of Texas at Austin and BS degrees in Computer Engineering and Psychology from the University of Michigan, Ann Arbor. He started the Computer Architecture Group at Microsoft Research (2006-2009), and held various product and research positions at Intel Corporation, Advanced Micro Devices, VMware, and Google. He received the ACM SIGARCH Maurice Wilkes Award, the inaugural IEEE Computer Society Young Computer Architect Award, the inaugural Intel Early Career Faculty Award, US National Science Foundation CAREER Award, Carnegie Mellon University Ladd Research Award, faculty partnership awards from various companies, and a healthy number of best paper or "Top Pick" paper recognitions at various computer systems, architecture, and hardware security venues. He is an ACM Fellow "for contributions to computer architecture research, especially in memory systems", IEEE Fellow for "contributions to computer architecture research and practice", and an elected member of the Academy of Europe (Academia Europaea). His computer architecture and digital logic design course lectures and materials are freely available on YouTube, and his research group makes a wide variety of software and hardware artifacts freely available online. For more information, please see his [webpage](https://people.inf.ethz.ch/omutlu/).

Tuesday, February 4

07:00-08:30	Breakfast buffet (included; open 7am-10am)
08:30-11:45	High-Level Synthesis for Software Programmers, Tutorial (Instructor: Prof. Paolo Ienne) Slides
16:00-16:30	Tea Time (included but not obligatory)
16:30-19:00	High-Level Synthesis for Software Programmers, Practice (Instructors: Lana Josipović and Andrea Guerrieri)
19:30-	Dinner at At’Home (included)

Abstract:

High-level synthesis (HLS) tools create dedicated circuits from languages such as C and thus appear to promise to make hardware design accessible to, among others, software programmers targeting configurable platforms. Unfortunately, this is generally not quite the case: commercial and academic HLS tools almost universally need expert guidance; users must understand the nature of the circuits generated in order to restructure the input C code or add pragmas and thus force synthesis tools to generate the desired output. In part, this significant shortcoming stems from the very nature of the traditional HLS process: tools almost universally generate statically scheduled datapaths, which implies that circuits out of HLS tools have a hard time exploiting parallelism to the limits hardware designers expect. While in many cases code refactoring and pragmas solve the problem, classic HLS tools just fail to produce competitive hardware when addressing code with potential memory dependencies, with control-dependent dependencies in inner loops, or where performance is limited by long-latency control decisions. Recently, there has been some interest in generating dynamically scheduled circuits instead (somewhat analogous to out-of-order superscalar processors in the software-programmable world); such new HLS techniques may become critical in coming years if broader application classes need the benefits of reconfigurable hardware acceleration. In this lecture, we start by reviewing the basic notions of classic HLS and develop a basic understanding of how tools generate circuits and, therefore, of their fundamental limitations. We discuss through examples some of the code restructuring such tools need.
Then, we describe our experiences with dynamically scheduled dataflow circuits: we present a generic methodology to generate them from C programs, contrast their operation to traditional HLS results, discuss some of their specific challenges, and outline the opportunities they create for results that are more friendly to software code. (Joint work with Lana Josipović, Andrea Guerrieri, et al.)
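As a rough illustration of the kinds of loops the abstract refers to, here is a minimal C sketch. It is not taken from the lecture material: the function names `weighted_histogram` and `conditional_sum` and their signatures are hypothetical, chosen only to show a potential memory dependency and a control-dependent dependency in an inner loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (illustration only).
   Each iteration reads and writes hist[idx[i]]. Whether iteration i
   depends on iteration i-1 is known only at run time: it does exactly
   when idx[i] == idx[i-1]. */
void weighted_histogram(const int *idx, const int *w, int *hist,
                        size_t n, size_t bins) {
    for (size_t i = 0; i < n; i++) {
        /* Guard against out-of-range bin indices. */
        assert(idx[i] >= 0 && (size_t)idx[i] < bins);
        hist[idx[i]] += w[i];
    }
}

/* Hypothetical kernel (illustration only).
   The update of s is guarded by a condition that itself reads s, so the
   loop-carried dependency on s is only sometimes exercised. */
int conditional_sum(const int *x, size_t n) {
    int s = 0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] * s >= 0) {
            s += x[i];
        }
    }
    return s;
}
```

In both loops, a schedule fixed at compile time would presumably have to plan for the dependency occurring on every iteration, whereas a circuit that resolves it at run time could, in principle, overlap the iterations in which it does not occur.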

Bio:

Paolo Ienne has been a Professor at the EPFL since 2000 and heads the Processor Architecture Laboratory (LAP). Prior to that, he worked for the Semiconductors Group of Siemens AG, Munich, Germany (which later became Infineon Technologies AG) where he was at the head of the Embedded Memories unit in the Design Libraries division. His research interests include various aspects of computer and processor architecture, FPGAs and reconfigurable computing, electronic design automation, and computer arithmetic. Ienne was a recipient of Best Paper Awards at the 20th and at the 24th ACM/SIGDA International Symposia on Field-Programmable Gate Arrays (FPGA), in 2012 and 2016, at the 19th International Conference on Field-Programmable Logic and Applications (FPL), in 2009, at the International Conference on Compilers, Architectures, and Synthesis for Embedded Systems (CASES), in 2007, and at the 40th Design Automation Conference (DAC), in 2003; many other papers have been candidates for Best Paper Awards in prestigious venues. He has served as general, programme, and topic chair of renowned international conferences, including organizing in Lausanne the 26th International Conference on Field-Programmable Logic and Applications in 2016. He serves on the steering committee of the IEEE Symposium on Computer Arithmetic (ARITH) and of the International Conference on Field-Programmable Logic and Applications (FPL). Ienne has guest edited a number of special issues and special sections on various topics for IEEE and ACM journals. He is regularly a member of program committees of international workshops and conferences in the areas of design automation, computer architecture, embedded systems, compilers, FPGAs, and asynchronous design. He has been an associate editor of ACM Transactions on Architecture and Code Optimization (TACO), since 2015, of ACM Computing Surveys (CSUR), since 2014, and of ACM Transactions on Design Automation of Electronic Systems (TODAES) from 2011 to 2016.

Wednesday, February 5

07:00-08:30	Breakfast buffet (included; open 7am-10am)
08:30-10:00	The Challenges and Industry Perspectives of Building an AI Chip, part I (Instructor: Dr. Jian Ouyang) Slides
10:30-11:25	Panel Session: Privacy, Trust and Security (Group 1)
16:00-16:30	Tea Time (included but not obligatory)
17:15-17:55	Panel Session: Mobile/Wearable Devices and Personal Data (Group 2)
17:55-18:30	Panel Session: Prediction and Data Analysis (Group 3)
19:30-	“Soirée typique” at Cantine sur Coux (included)

Abstract:

Computing capability is the key factor for AI, and the AI chip is regarded as the engine of that capability. An AI chip, especially one for the data center, is a full-stack system that spans product definition, algorithms, software, architecture, and VLSI implementation. In this talk, I will discuss the challenges of building an AI chip and my perspectives on them.

Bio:

Jian Ouyang, Baidu Distinguished Architect, is the head of Baidu's AI silicon department. He works on AI chip products, architecture, and systems. He has published several papers at ASPLOS 2014, Hot Chips 2014/2016/2017, and other venues.

Thursday, February 6

07:00-08:00	Breakfast buffet (included; open 7am-10am)
08:30-11:45	In-Network Computing – Opportunities and Challenges (Instructor: Dr. Alberto Lerner) Slides
16:00-16:30	Tea Time (included but not obligatory)
17:35-18:20	Panel Session: New Networking (Group 4)
18:20-19:00	Panel Session: Storage and Data Structures (Group 5)
19:30-	Dinner at Le Nord (included)

Abstract:

TBD

Bio:

TBD

Friday, February 7

07:00-08:30	Breakfast buffet (included; open 7am-10am)
07:00-08:45	Check out
08:45-10:15	Wearable Computing: From Signal Processing to AI (Instructor: Dr. Ricard Delgado, CSEM) Slides
10:15-10:45	Break
10:45-11:45	DNA Storage (Instructor: Dr. Thomas Heinis, Imperial College) Slides
11:45-11:50	Closing

Abstract:

TBD

Bio:

TBD

Looking forward to seeing you all in Champéry!

Alberto, Valerio, Pascal, and Philippe.