
Distributed Data Science (At Scale): Machine Learning Models on IoT and Cloud

Marina Malaguti

cloud
data
machine learning

2:00 - 2:50 - Clarity

The Digital Universe, which comprises data coming not only from people but also from “things”, will reach 44 zettabytes, or 44 trillion gigabytes, by 2020. Trying to make sense of all this data will be practically impossible. Moreover, only ~20% of the information in the digital universe would be a candidate for analysis. So how can your organization effectively identify only the data needed for analysis? Should you move it all to the cloud and hope that technology will always get better, faster, and cheaper? What about the processing, operating, and storage costs of this data? What would your organization do with the rest of the data that goes unused? Distributed Data Science is a hybrid solution that involves executing Machine Learning algorithms both on IoT devices (edge computing) and in the Cloud.

In this talk, we’ll explore how Distributed Data Science can help solve the ‘Big Data’ problems in the Digital Universe and help your organization extract value from the data.