Data Science for All

There is so much data in the world that’s now available to all of us. Considering only what the United States government shares through its open data initiative, there is an almost inconceivable amount of information waiting to be explored, processed, and researched. The challenge facing modern data science is less about access and more about human resources and the ingenuity to ask great questions.

I’ve played with the R and Julia programming languages in the past to broaden my own capabilities in data science. Our tool sets are powerful now, even for casual programmers and researchers. Given a problem, I can hunt down data sets to dig through, but I suffer from a horrible sense of writer’s block when facing the blank page. When practicing my programming skills, which project should I take on? Do I do something that can help the world? Do I dig through crime statistics or weather figures, or use infrared data on fields to cross-reference rainfall and identify vegetation at risk of drought?

I wish there were a place online where data science hobbyists could grab an open question and submit their analyses back via literate programming. I imagine a place like GitHub where collaborators can “Pull Request” their work back up to the list and have it appear as one of several attempts at a solution. We could compare the different solutions, and the comment system would allow deeper commentary and questions, perhaps even resulting in a new line of inquiry.

I’m writing this here in hopes that by putting it out into the “ether” the idea will spread and take root in reality. I certainly don’t have the time or energy to dedicate to building such an ecosystem myself. You, reading this now, should go do it for me. I’ll totally use it.