Quantifying the worth of DS projects & discussing the Modern Data Stack



Apr 27, 2022 6:00 PM — 8:00 PM
CEU Vienna

Hi all,

We are very excited to come together in person after two years!

Come join us and check out CEU’s Vienna Campus with two exceptional speakers from the Data Science and Data Engineering world!

We are going to have two talks followed by an open-ended discussion. Hosting and catering will be provided by CEU Vienna.

This will be a hybrid event. We will provide the Zoom link later.

1. From AUC to EUR - what is the worth of a data science project?

Gábor Békés, assistant professor at CEU, research affiliate at CEPR London

We will look at a case study of corporate default prediction and discuss how to quantify the analytical added value.

In this project, we worked as data scientists for a consultancy that had to decide which firms to contract with. The task was to build a predictive model that classifies firms into two categories: risky and reliable partners. After the pilot project, we had to quantify how much value a better model created for the company. This real case study used financial and management data on 20K companies and made it into my textbook.
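To make the step from model quality to money concrete, here is a minimal sketch in Python. It is not the method from the talk or the textbook: the cost figures, the toy labels, and the predicted default probabilities are all hypothetical, and the function names are invented for illustration. It picks the classification threshold that minimizes total misclassification cost for each model and reports the difference as the value added by the better model.

```python
from typing import Sequence

# Hypothetical costs in EUR, chosen only for illustration.
COST_FALSE_NEGATIVE = 10_000.0  # contracting a firm that later defaults
COST_FALSE_POSITIVE = 1_000.0   # turning away a firm that would have been reliable


def total_cost(y_true: Sequence[int], p_default: Sequence[float], threshold: float) -> float:
    """Total cost in EUR when firms with p_default >= threshold are flagged as risky."""
    cost = 0.0
    for defaulted, p in zip(y_true, p_default):
        flagged = p >= threshold
        if defaulted and not flagged:
            cost += COST_FALSE_NEGATIVE
        elif flagged and not defaulted:
            cost += COST_FALSE_POSITIVE
    return cost


def min_cost(y_true: Sequence[int], p_default: Sequence[float], grid: Sequence[float]) -> float:
    """Lowest total cost achievable over a grid of candidate thresholds."""
    return min(total_cost(y_true, p_default, t) for t in grid)


# Toy data: 1 = firm defaulted, 0 = firm stayed reliable (hypothetical).
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
baseline = [0.4, 0.5, 0.2, 0.6, 0.3, 0.7, 0.1, 0.3]   # weaker model
improved = [0.8, 0.3, 0.1, 0.7, 0.2, 0.65, 0.1, 0.6]  # better-ranking model
grid = [i / 10 for i in range(1, 10)]

baseline_cost = min_cost(y_true, baseline, grid)
improved_cost = min_cost(y_true, improved, grid)
value_added = baseline_cost - improved_cost
print(f"baseline: {baseline_cost:.0f} EUR, improved: {improved_cost:.0f} EUR, "
      f"value added: {value_added:.0f} EUR")
```

The point of the sketch is that the comparison is made in EUR at each model's own best threshold, so the result depends on the assumed cost ratio and not only on how well the model ranks firms.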

Gábor Békés is an assistant professor at the CEU Department of Economics and Business and a research affiliate at CEPR London. His research is about firm performance, organizational behavior, and globalization. Gábor is co-author of Data Analysis for Business, Economics and Policy, a textbook recently published by Cambridge University Press and adopted in over 100 courses worldwide. He is a former program director of CEU’s MS in Business Analytics.

2. Orchestrating data in the mesh of the fragmented modern data stack

Georg Heiler, senior data scientist & complexity researcher

The fragmented modern data stack has emerged as the unbundling of Airflow: various tools operate in silos. Dagster, a next-generation data orchestrator, lets you clearly see the data dependencies of the individual pipelines on your data factory floor. Following my blog post series about Dagster, I will cover:

  1. Getting started with Dagster and building simple data pipelines
  2. How software-defined assets turn data pipelines around and raise quality: data quality tests are integrated straight into the pipelines, and business logic is separated from infrastructure, which improves testability.
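The idea behind the second point can be illustrated with a small, library-free Python toy. This is not Dagster's API: `toy_asset`, `toy_materialize`, and the example assets are hypothetical names made up for this sketch, and Dagster's real decorator and execution model are much richer. The sketch only shows the core pattern: each asset is a plain function, its parameter names declare which upstream assets it needs, and a data quality check lives next to the business logic.

```python
import inspect
from typing import Any, Callable, Dict, List, Optional

# Registry of asset name -> function (hypothetical; stands in for an orchestrator).
ASSETS: Dict[str, Callable[..., Any]] = {}


def toy_asset(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register fn as an asset; its parameter names are its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


@toy_asset
def raw_orders() -> List[dict]:
    # In a real pipeline this would be loaded from storage; here it is hard-coded.
    return [{"id": 1, "amount": 20.0}, {"id": 2, "amount": 35.5}]


@toy_asset
def order_total(raw_orders: List[dict]) -> float:
    # Data quality test integrated into the pipeline step.
    if any(order["amount"] < 0 for order in raw_orders):
        raise ValueError("negative order amount")
    # Pure business logic: no storage or scheduling code in here.
    return sum(order["amount"] for order in raw_orders)


def toy_materialize(name: str, cache: Optional[Dict[str, Any]] = None) -> Any:
    """Compute an asset after its upstream assets, each at most once per run."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = ASSETS[name]
        upstream = [toy_materialize(dep, cache) for dep in inspect.signature(fn).parameters]
        cache[name] = fn(*upstream)
    return cache[name]


print(toy_materialize("order_total"))
```

Because `order_total` is an ordinary function, it can be unit-tested by calling it directly with a small list of dicts, without running any orchestration code. That is the testability benefit of keeping business logic apart from infrastructure.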

Georg’s interests are in working with large-scale spatio-temporal graph data. He takes an end-to-end view of data pipelines and a holistic view of data architecture. As an experienced data scientist in industry, he has delivered use cases in fraud detection, mobility analytics, and predictive maintenance in cable networks. As part of his doctoral studies, he researched the inference of supply networks and used mobility data to analyze the societal impact of government interventions during the COVID-19 pandemic.

Please check CEU’s current COVID regulations before the meetup.

We hope to see you soon!

Cheers, The VDSG Meetup organizers



community building

We are a nonprofit association promoting knowledge about data science. We connect data scientists in Europe and all around the world. Our members are passionate data scientists from various areas of research and industry.