Duration: 2018 – 2020
ABOUT THE PROJECT
E2Data aims to answer two key questions:
How can we improve execution times while using fewer hardware resources?
To address the scalability concerns raised in recent years, end users and cloud infrastructure vendors (such as Google, Microsoft, Amazon, and Alibaba) are investing in heterogeneous hardware resources, making available a diverse selection of architectures such as CPUs, GPUs, FPGAs, and MICs. The aim is to further increase performance while minimizing operational costs. Beyond these investments, large companies such as Google have developed in-house ASICs, with the Tensor Processing Unit (TPU), built to accelerate TensorFlow workloads, being the prime example.
E2Data will provide a new Big Data software paradigm for achieving maximum resource utilization in heterogeneous cloud deployments without requiring developers to change their code. The proposed solution takes a cross-layer approach, allowing vertical communication between the four key layers of Big Data deployments: application, Big Data software, scheduler/cloud provider, and execution runtime.
How can the user establish, for each particular business scenario, which hardware configuration is both the highest performing and the cheapest?
The E2Data consortium brings together two groups of EU Big Data practitioners to achieve its ambitious goals.
The following four industry partners bring well-defined performance requirements and infrastructure constraints to the project:
The second group consists of subject matter experts and researchers from the following institutions:
These experts are responsible for implementing the E2Data solution. They do this by extending existing European open-source projects and building on the results of their own research at the leading edge of the field.