Coarse-grained reconfigurable arrays (CGRAs) are programmable hardware platforms with large coarse-grained blocks and datapath-style (bus-based) interconnect. While research on CGRAs stretches back to the 1990s, and several companies have recently introduced commercial CGRAs, a modelling and exploration framework for CGRA architectures has been lacking. CGRA-ME is a software framework under active development at the University of Toronto that allows a human architect to specify a hypothetical CGRA architecture using an XML-based language or a C++ API. A set of benchmark applications can be mapped onto the modelled CGRA to assess its area and performance. Verilog RTL for the CGRA can be automatically generated for verification by simulation, or for synthesis with an FPGA or standard-cell ASIC flow. CGRA-ME provides an open-source platform enabling research on CGRA architectures and CAD algorithms.
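As a purely illustrative sketch of what such an XML-based architecture description could look like (the element and attribute names below are hypothetical and do not reflect CGRA-ME's actual schema), an architect might describe a small fabric as:

```xml
<!-- Hypothetical sketch only: element/attribute names are invented
     for illustration and are NOT the real CGRA-ME XML schema. -->
<cgra rows="4" cols="4">
  <!-- Each processing element supports a small set of ALU ops
       and a shallow local register file. -->
  <pe ops="add,sub,mul" regfile-depth="4"/>
  <!-- Datapath-style, bus-based interconnect between PEs. -->
  <interconnect topology="torus" bus-width="32"/>
  <!-- I/O blocks on the fabric boundary for streaming data. -->
  <io north="in" south="out"/>
</cgra>
```

A description along these lines would parameterize the array dimensions, per-PE functionality, and interconnect, which is the kind of design space the framework's mapping and RTL-generation flow can then evaluate.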
EDRA is a Horizon 2020 FET Launchpad project focusing on the commercialization of the Decoupled Access Execution Reconfigurable (DAER) framework – developed within the FET-HPC EXTRA project – on Amazon’s Elastic Compute Cloud (EC2) FPGA infrastructure. The project targets the high-performance computing (HPC) market by offering a ‘one-stop’ solution to accelerate compute-bound applications – that is, applications where the time taken to complete the task is determined mainly by processor speed. During the industrial & research project event, EDRA will present and demonstrate the architecture of its virtual machine, deployed on Amazon’s cloud FPGA-support infrastructure, which accelerates RAxML, a tool for phylogenetic analysis and post-analysis of large phylogenies.
New achievements in machine learning are reported almost daily by the big technology firms. While those achievements are mainly based on high-end fast processing and massive-data techniques, the potential of embedded machine learning is still not well understood by the majority of industrial players and SMEs. Nevertheless, the potential of machine learning embedded directly in a device or system, trained in either an online or an offline fashion, is perceived as very high. This has led to broad demand from industry and SMEs for a practical, application-oriented feasibility study that helps them understand the potential benefits, but also the limitations, of embedded artificial intelligence. Currently, the question of where specific algorithms should be realized – on the embedded device or in the cloud – is under discussion in several application fields. For example, Xilinx supports so-called edge computing, where algorithms are realized on FPGA-based SoCs; on the other hand, it also supports high-performance FPGA-based cloud systems with Amazon. Both alternatives have their pros and cons and need to be analyzed with respect to the respective application domain. Furthermore, realization on embedded systems in general is a crucial challenge for SMEs. This project aims at developing and demonstrating ‘best practices for embedded AI’ by means of four relevant industrial case studies. In those case studies we will tackle several elements related to technology, safety and certifiability.
Speaker: Michael Huebner, Brandenburg University of Technology
Evolve brings together the Big Data, HPC and Cloud domains in a single testbed and delivers an acceleration infrastructure that combines heterogeneous technologies to create a powerful platform. Evolve’s acceleration infrastructure is exposed through the Evolve cloud service as well as through a web-based graphical frontend. Applications from the deep learning domain have been used to evaluate the functionality and performance of the acceleration substrate within Evolve’s integrated HW/SW testbed platform. Evolve’s users can develop their own accelerators and import them as libraries. Furthermore, they can create their applications based on major machine learning toolflows, such as TensorFlow. A tensorflow-gpu build targeting cuBLAS has already been deployed on the Evolve infrastructure. Similarly, Evolve aspires to provide generic pre-compiled machine learning FPGA bitstreams and even to expose FPGA-acceleration capabilities to TensorFlow.
JOINTER is joint work from the Universities of L'Aquila, Sassari and Cagliari. It is a framework for developing complex heterogeneous architectures composed of programmable processors and dedicated reconfigurable accelerators on FPGA, together with customizable monitoring systems, while keeping the introduced overhead under control. This work is an outcome of the FitOptiVis ECSEL project.
Recently, system integrators have dramatically increased their efforts in heterogeneous computing by integrating heterogeneous cores on die (ARM), utilizing general-purpose GPUs (NVIDIA), combining CPUs and GPUs on the same die (Intel, AMD), leveraging FPGAs (Altera, Xilinx), integrating CPUs with FPGAs (Xilinx), and coupling FPGAs and CPUs in the same package (IBM-Altera, Intel-Altera). Heterogeneity aims to solve the problems associated with the end of Moore's Law by incorporating more specialized compute units in the system hardware and by utilizing the most efficient compute unit for each task. However, while software-stack support for heterogeneity is relatively well developed for performance, software-stack support for power- and energy-efficient computing is severely lacking. The primary ambition of the LEGaTO project is to address this challenge by starting with a made-in-Europe mature software stack, and by optimizing this stack to support energy-efficient computing on a commercial, cutting-edge, European-developed CPU-GPU-FPGA heterogeneous hardware substrate. In this talk I will present examples of how FPGAs are being utilized in the LEGaTO project for energy savings, with specific cases looking at programming-environment support, FPGA undervolting, flexible communication for cloud-to-edge HPC computing, and FPGA checkpointing.