Scalable Collaboration for Data Science Done in Open Source Tools and Frameworks

Monday, October 1, 2018 - 1:25pm - 2:25pm
Lind 305
Tyler Whitehouse (Gigantum)
Effective collaboration between individuals and institutions is a promising path for accelerating research and discovery. However, as research becomes increasingly data driven and software dependent, giving researchers and practitioners with different skills and in different locations robust means to work together goes far beyond passing academic papers back and forth. Increasingly, collaborating means being able to share highly customized code and data that don’t fit within previously unified frameworks like Matlab. The move towards open source tools provides broad access but at the same time creates significant barriers in terms of skill and labor.

This talk will discuss some recent approaches to enabling collaboration on research and work developed in open source tools and frameworks, as well as ways that academic and industrial data scientists can work together.

Tyler Whitehouse earned an undergraduate degree in math at UC Santa Cruz and a Ph.D. at the University of Minnesota. He graduated in 2009 after working with Professor Gilad Lerman on problems dealing with the rectifiability of sets and measures in Hilbert spaces, and then spent 3 years as a postdoc at Vanderbilt University. From 2012 to 2017 he worked as a data scientist and consultant in the Washington DC area. Currently, he is the president of a data science software startup in the DC area.