• Model Selection for Deep Learning on Greenplum Database - Yuhao Zhang

  • Title:
    Model Selection for Deep Learning on Greenplum Database
    Deep neural networks (deep nets) are revolutionizing many machine learning (ML) applications. But there is a major bottleneck to wider adoption: the pain of model selection. This empirical process involves exploring the deep net architecture and hyper-parameters, often requiring hundreds of trials. Alas, most ML systems focus on training one model at a time, reducing throughput and raising costs; some also sacrifice reproducibility. We present Cerebro, a system to raise deep net model selection throughput at scale without raising resource costs and without sacrificing reproducibility or accuracy. We then integrate Cerebro into Greenplum Database by extending the Apache MADlib library to run model selection and deep net training on Greenplum-resident data.
    Ph.D. student advised by Prof. Arun Kumar. Research interests focus on machine learning systems, with the goal of making data science easier and faster.