Data-Centric AI Practices


using the Hub data format to simplify computer vision AI development

Dec 10, 2021 5:00 PM — 6:00 PM

about the speaker

Davit Buniatyan was a Ph.D. student at Princeton University, advised by Sebastian Seung. He is interested in artificial intelligence, including machine learning and computer vision, and a little bit of neuroscience. The idea of creating Hub came to Davit while he was pursuing his Ph.D. at the Princeton Neuroscience Lab, where his research involved reconstructing the topology of mouse brains. As part of that research, he dealt with large-scale unstructured data that could cost millions of dollars a year to store. That inspired Davit to create Hub and, eventually, Activeloop, the "database for AI" company. More about Davit on his page.

about the talk

In computer vision projects, managing data is often an afterthought. That is why at least 31% of projects fail. Davit Buniatyan, CEO of Activeloop, will talk about how to build a solid data foundation for training more accurate and cost-effective computer vision models. In this session, you will learn about Hub, an open-source dataset format for AI. Hub works with computer vision datasets of any size and enables easy creation, storage, version control, and streaming to ML frameworks during training. You will also learn how to apply the data-centric framework to resolve common data bottlenecks when using tools like Amazon SageMaker. The speaker will also demonstrate how to visualize and explore datasets, from MNIST to ImageNet. By the end of the session, you will be able to build computer vision data pipelines easily and make full use of your compute resources.
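
To give a feel for what "streaming to ML frameworks" means before the talk, here is a minimal sketch in plain Python. It is not Hub's API or implementation. The class `ChunkedDataset`, the helper `build_demo_dataset`, and the parameters `chunk_size` and `batch_size` are hypothetical names made up for illustration. The sketch only shows the general idea of storing samples in fixed-size chunks and yielding training batches one chunk at a time, so the whole dataset never has to be materialized at once.

```python
from typing import Iterator, List, Sequence, Tuple

# A sample is (flattened pixel values, integer label). Illustrative only.
Sample = Tuple[List[int], int]


class ChunkedDataset:
    """Toy stand-in for a chunked dataset store (hypothetical, not Hub's API)."""

    def __init__(self, chunk_size: int) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size
        self._chunks: List[List[Sample]] = []

    def append(self, image: Sequence[int], label: int) -> None:
        # Open a new chunk when there is none yet or the last one is full.
        if not self._chunks or len(self._chunks[-1]) == self.chunk_size:
            self._chunks.append([])
        self._chunks[-1].append((list(image), label))

    def __len__(self) -> int:
        return sum(len(chunk) for chunk in self._chunks)

    def stream(self, batch_size: int) -> Iterator[List[Sample]]:
        """Lazily yield batches, visiting one chunk at a time."""
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        batch: List[Sample] = []
        # In a real remote store, each chunk would be fetched at this point.
        for chunk in self._chunks:
            for sample in chunk:
                batch.append(sample)
                if len(batch) == batch_size:
                    yield batch
                    batch = []
        if batch:  # final, possibly smaller, batch
            yield batch


def build_demo_dataset(n: int = 10, chunk_size: int = 4) -> ChunkedDataset:
    """Create n fake 4-pixel 'images' with alternating 0/1 labels."""
    ds = ChunkedDataset(chunk_size)
    for i in range(n):
        ds.append([i] * 4, i % 2)
    return ds


if __name__ == "__main__":
    dataset = build_demo_dataset()
    for batch in dataset.stream(batch_size=3):
        print(len(batch), [label for _, label in batch])
```

A training loop would consume `stream()` the same way it consumes any batch iterator. The talk covers how Hub itself handles creation, storage, version control, and streaming.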

community building

We are a nonprofit association that promotes knowledge about data science. We connect data scientists in Europe and around the world. Our members are passionate data scientists from various areas of research and industry.