Data Science is a field that covers the collection, cleaning and analysis of data to uncover trends or knowledge.
Definition
Data Science is an interdisciplinary field at the intersection of science and computing, used to generate insights from data. It blends mathematics with scientific methods and processes. The skills it requires are therefore varied: from mathematics (statistics and probability) and data engineering to computer science and software programming (usually in R or Python). It supports many kinds of projects, from object detection to machine learning.
The field is still young and has grown fast in recent years. The reason? The volume of data stored by companies is booming, and public datasets can now be processed more easily with new programming languages, making it possible to extract the real value hidden in them.
Data Science Goals and Challenges
The objective of the data scientist is to explore, sort and analyze big data from various sources in order to draw conclusions that optimize business processes or support decision-making. Examples include machine maintenance (predictive maintenance) in industry, or sales forecasting based on the weather in marketing and sales. The use cases are almost endless.
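To make the weather-based sales forecasting example more concrete, here is a minimal sketch in Python. It assumes a simple linear relationship between temperature and sales, which real data rarely follows so neatly. The figures and the names `fit_line` and `forecast` are hypothetical, invented purely for illustration.

```python
# Hypothetical toy data, invented for illustration only:
# daily temperature (in °C) and the number of units sold that day.
temperatures = [12.0, 15.0, 18.0, 21.0, 24.0, 27.0]
units_sold = [110.0, 125.0, 140.0, 155.0, 170.0, 185.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(temperature, slope, intercept):
    """Predict sales for a given temperature using the fitted line."""
    return slope * temperature + intercept


slope, intercept = fit_line(temperatures, units_sold)
predicted = forecast(30.0, slope, intercept)
print(f"Forecast for a 30 °C day: {predicted:.0f} units")
```

In practice, a data scientist would work with far more data, add other variables (season, promotions, day of the week) and validate the model on data it has not seen before trusting its forecasts.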
The pillars on which the data scientist relies most often are data mining (data exploration), statistics, machine learning with its algorithms (random forest, decision tree, regression, neural network…), and data visualization (Dataviz) with tools such as Matlo or Qlik.
Data science is revolutionizing the processing of corporate and public data that until now was difficult to exploit with traditional technologies designed for structured data. The lightning growth of databases, combined with the emergence of new technologies around machine learning, artificial intelligence and Big Data, now makes it possible to analyze semi-structured data.
Data science often comes up in discussions of Big Data, but it is not limited to massive data sets. At Saagie, for example, we prefer to talk about Smart Data: data can be put to use regardless of its size.
There is a strong appetite for data science in areas such as:
Industry
- Predictive maintenance
Banking and Insurance
- Process automation
- Customer insight
- Churn rate reduction
Healthcare
- Epidemiology
- Toxicology
- Research
Retail
- Sales trends, forecasts
- Customer 360
- Predictive marketing
Environment
- Weather simulation
- Projected impact
Transportation and Urbanism
- Transportation optimization
- Smart Cities