There are two key words in this title: data and science. It should be obvious, but excelling in the hot and exciting field of Data Science requires a solid understanding of both terms and of the thinking and background behind each. Let’s start with science. It is the less complex but more controversial of the two concepts.
Science is a process and a methodology. It is a breath mint and a candy mint. Science is used to gather knowledge. Its principles, known as the scientific method, were developed during the Islamic Golden Age, about a thousand years ago (around 1020), by Ibn al-Haytham. About two and a half centuries later, in the 13th century, Roger Bacon expanded upon al-Haytham’s process and memorialized its four key parts: observation, hypothesis development, experimentation and independent validation; or, put plainly, watching, guessing, testing and getting a second opinion. This little bit of history is sort of like the song All Along the Watchtower: Dylan wrote the tune, but Jimi Hendrix owns it. Same for al-Haytham and Bacon: al-Haytham was the "thinkeruper" but Bacon got the glory.
Without understanding, accepting and using science and the scientific method, participating productively in the contemporary field of data science seems impossible. Today, however, there is an odd circumstance. Too many members of our society, including the emerging data community, lack some of the scientific basics. Without the scientific basics, how can one even begin to think about data? We may have unwittingly created part of our own problem.
Fifty-eight percent of this group believes college is having a negative effect on our country.
Current thinking in America is casting doubt on the validity of science and on the institutions of higher learning that teach it. We see and hear this in the field: something is missing, and suppliers and buyers both know it; they’re just not sure what it is. There is a sense they’ve missed a chapter or two. Questions are being asked by people who know they should already know the answers…
There is definitely a rigor missing, and in this post we focus on three elements of it: Data Hygiene, Data Visualization and Distribution Trimming. The jury may still be out on exactly what accounts for the shallow thought pools associated with scientific inquiry, but increasingly dismissive views of science are likely part of the reason. Recent findings on the perceived value of a college education may help explain this.