Prototypes

You can find the majority of our distributed or cloud-native tools and frameworks on our GitHub page.

These include:

  • Parallax: A persistent key-value store that is embeddable and optimized for fast storage. Available at GitHub.
  • Tebis: An efficient distributed key-value store for fast storage devices and RDMA networks. Available at GitHub.
  • FastMap: An optimized memory-mapped I/O path inside the Linux kernel. It is implemented as a kernel module and supported by user-space libraries for management purposes. Available at GitHub.
  • Arax: A runtime framework for decoupling applications from heterogeneous accelerators. Available at GitHub.
  • TeraHeap: A JVM extension that reduces memory pressure for large-memory applications such as big data frameworks. Available at GitHub. More details: ASPLOS'23 paper, presentation.
  • HPK: HPK allows running Kubernetes applications in HPC environments by translating deployments to Slurm and Singularity/Apptainer. Available at GitHub.
  • Knot: Knot is a complete environment for doing actual work on Kubernetes. It includes a complete set of web-based tools to help you unleash your productivity, without ever needing to use the command line. At its core, the Knot dashboard (previously known as Karvdash) supplies the landing page for users, allowing them to launch notebooks and other services, design workflows, and specify parameters related to execution through a user-friendly interface. Available at GitHub.
  • Frisbee: Frisbee is a cloud-native testbed for exploring, testing, and benchmarking distributed applications. Available at GitHub.


Software prototypes for HPC platforms, available upon request:

  • XHC (XPMEM-based Hierarchical Collectives) for Open MPI: XHC is an implementation of intra-node MPI collectives that complements the Open MPI codebase. It addresses the performance and scaling challenges inherent to modern multi/many-core CPUs by introducing several efficiency enhancements, such as topology awareness and single-copy data transfers. XHC constructs a multi-level hierarchy that conforms to the processor's topological features and dictates the algorithms' data movement patterns. Data is exchanged over memory, without any redundant copies, via XPMEM-created user-level shared address space mappings. Synchronization is also realized through memory, using highly efficient lock-free techniques.
  • Open MPI for ARM + RISC-V hybrid: We have created a port of Open MPI capable of using ARM and RISC-V cores together (i.e. operating in a hybrid-ISA environment, as contemplated for upcoming advanced SoCs for HPC). Considerations addressed include: (a) data type representation, conversions, and padding, (b) efficient communication primitives and messaging protocols for hybrid environments, (c) process launch and management, (d) platform-aware implementation of collective primitives.
  • RDMA-based implementation of MPI (MPICH): We have created a port of MPICH that utilizes RDMA and mailbox mechanisms exported by a Slim RDMA Hardware Transport IP. Both read-based and write-based messaging protocols are supported, as well as a hardware-assisted implementation of the MPI AllReduce collective primitive.
  • Yarvt RISC-V bootable image loader: Yarvt is a bootable image loader utility for hybrid-ISA HPC SoCs combining ARM and RISC-V cores. It operates on ARM (host-side) cores to initialize (accelerator-side) RISC-V cores over shared memory. This utility installs the Linux kernel image, combined with the device tree describing peripherals and the OpenSBI firmware (incl. boot-loader functionality), in the selected memory region, and then resets the RISC-V core.
  • Yat Automation Tools for creating and deploying bootable images: Yat is a suite of tools that support the creation and deployment of new bootable images for ARM and RISC-V cores in HPC systems. They package together updated device-tree, firmware, bootloader, and Linux kernel files, which they draw from specified repositories. The build process invokes the required compiler and packaging toolchains in the proper sequence. These tools can also handle FPGA-based prototypes, with the additional steps needed for creating and installing appropriate FPGA bitstreams on the test platform.
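
The multi-level hierarchy described for XHC above can be illustrated with a small, purely conceptual sketch. This is not XHC code and does not use MPI or XPMEM; the function name `hierarchical_reduce` and the `group_sizes` parameter are hypothetical, chosen only to show how a reduction can proceed level by level (for example, cores within a NUMA node first, then NUMA nodes within a socket), so that only one partial result per group moves up to the next level.

```python
from typing import Callable, List, Sequence


def hierarchical_reduce(
    values: Sequence[float],
    group_sizes: Sequence[int],
    op: Callable[[float, float], float] = lambda a, b: a + b,
) -> float:
    """Reduce one value per rank through a multi-level hierarchy.

    group_sizes[i] is the assumed number of members per group at level i,
    e.g. [4, 2] could mean 4 cores per NUMA node and 2 NUMA nodes per socket.
    These sizes are illustrative inputs, not values taken from XHC.
    """
    if not values:
        raise ValueError("need at least one rank")
    if any(size < 1 for size in group_sizes):
        raise ValueError("group sizes must be positive")

    current: List[float] = list(values)
    for size in group_sizes:
        next_level: List[float] = []
        # Each group combines its members; the result stands in for the
        # group's leader, which alone takes part in the next level.
        for start in range(0, len(current), size):
            group = current[start:start + size]
            acc = group[0]
            for v in group[1:]:
                acc = op(acc, v)
            next_level.append(acc)
        current = next_level

    # Top level: combine whatever leaders remain.
    acc = current[0]
    for v in current[1:]:
        acc = op(acc, v)
    return acc


if __name__ == "__main__":
    # 16 ranks, 4 per first-level group, 2 first-level groups per second-level group.
    print(hierarchical_reduce(list(range(1, 17)), [4, 2]))
```

The point of the sketch is only the data movement pattern: most combining happens inside small, topologically close groups, and traffic across larger domains is limited to one value per group.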


Machine Learning algorithms (please contact us for more information):

  • Sentitour classifier of hotel reviews: Sentitour is a machine-learning-based software suite that classifies natural-language hotel reviews. One category of reviews that is hard to classify, such as those on booking.com, consists of two paragraphs that are known a priori to contain a list of positive aspects and a list of negative aspects, respectively. Sentitour uses a sequence of LSTM and GRU layers to combine the two paragraphs and predict the overall review score, with an accuracy 7 percentage points higher than that of state-of-the-art NLP neural networks such as BERT, which was developed and trained by Google.


Hardware IPs (please contact us for more information):

  • Slim RDMA Hardware Transport: This Network Interface hardware IP was developed within the ExaNeSt and EuroEXA projects. It formed the core hardware IP that allowed distributed MPI applications to run on the testbeds of these two projects, which consisted of low-power ARM processors interconnected in a torus topology with a 10-20 Gb/s interconnect. The Network Interface (NI) consists of a number of smaller IPs that have been developed and proliferated over the years. The CARV NI is currently being upgraded within the RED-SEA project, where, among other things, we examine its adaptation for the BXI network of ATOS and its tight coupling with RISC-V processors, as well as its upgrade to 100 and 200 Gb/s speeds.
  • RDMA Accurate Congestion Control: An increasing number of datacenters and HPC installations adopt RDMA-based networks, with user-level initiated transfers that bypass the kernel network stack. Without kernel involvement, handling congestion at the network hardware level is both necessary and a good opportunity to replace the suboptimal TCP used in traditional IP networks. CARV has developed both a mechanism for congestion management (see Patent Application No. 14/864,355) and hardware IPs that implement this mechanism. These hardware IPs are (i) a Contention Point (CP) that measures the fair share of flows at links without keeping any per-flow state, (ii) distributed clocks, which maintain and distribute a common clock across distributed nodes (FPGAs in our testbed), and (iii) a rate limiter based on a hardware priority queue. We have coupled this congestion control with the RDMA engine and verified its benefits in hardware, and we are currently examining how it can protect MPI applications collocated with batch (e.g. storage) traffic.
  • Shared L2/LLC Cache: The Shared L2 / Last-Level Cache (LLC) IP was developed within the European Processor Initiative – Phase 1 (EPI-SGA1) project and is also exploited in EPI Phase 2 (EPI-SGA2) and relevant sister projects such as “The European Pilot” and eProcessor. The Shared L2 Cache IP is designed to support the high-throughput requirements of the EPAC RISC-V Vector Accelerator (one of the designs of the EPI project) and supports very long vector operations. This IP is FPGA-proven, operates together with the RISC-V cores, and enables booting SMP Linux. It is part of the EPAC 1.0 Test Chip, which has been taped out in GlobalFoundries 22nm FDX technology, and is now silicon-proven.
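
To give a feel for how a rate limiter can be built around a priority queue, as mentioned for the RDMA Accurate Congestion Control IP above, here is a minimal software analogy. It is not the CARV hardware design, whose internals are not described here; the function `pace_flows`, its parameters, and the fixed packet size are assumptions made only for illustration. Each flow sits in a min-heap keyed by its next eligible send time; the earliest flow is released and rescheduled one packet interval later.

```python
import heapq
from typing import Dict, List, Tuple


def pace_flows(
    rates_bps: Dict[str, float],
    packet_bits: int,
    horizon_s: float,
) -> List[Tuple[float, str]]:
    """Return (send_time, flow) pairs before horizon_s, in time order.

    rates_bps maps a hypothetical flow id to its allowed rate in bits/s.
    All packets are assumed to have the same size, packet_bits.
    """
    if any(rate <= 0 for rate in rates_bps.values()):
        raise ValueError("rates must be positive")

    # Heap entries are (next eligible send time, flow id).
    heap: List[Tuple[float, str]] = [(0.0, flow) for flow in sorted(rates_bps)]
    heapq.heapify(heap)

    schedule: List[Tuple[float, str]] = []
    while heap:
        send_time, flow = heapq.heappop(heap)
        if send_time >= horizon_s:
            break  # the earliest entry is past the horizon, so all others are too
        schedule.append((send_time, flow))
        # A flow may send again only after one packet time at its own rate.
        interval = packet_bits / rates_bps[flow]
        heapq.heappush(heap, (send_time + interval, flow))
    return schedule


if __name__ == "__main__":
    for when, flow in pace_flows({"a": 2000.0, "b": 1000.0}, 1000, 2.0):
        print(f"{when:.1f}s {flow}")
```

In this sketch a flow with twice the rate is released twice as often, and the queue never needs to scan all flows: only the head entry is examined at each step.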


ExaNeSt: European Exascale System Interconnect and Storage.


EuroServer: Scale-out architecture for energy efficient servers & microservers.