Big Data has changed the face of the world!
With 2.3 trillion gigabytes of data created each day, companies have access to a broad range of information on their users, their markets and much more.
This Data allows them to constantly improve their product/service.
Companies have understood the opportunity that Big Data represents. The surge in Data Engineer and Data Scientist jobs shows it.
In 2012, Harvard Business Review named Data Scientist the sexiest job of the 21st century, underlining the success of the profession!
However, as this field is not fully mature yet, Data jobs are still subject to misunderstandings. To many, they appear as a blurry technical ‘thing’ which could potentially improve their product or service.
This misunderstanding can lead to a poor use of resources. Let’s get back to the fundamentals of these professions and decode the value of each.
Fig.1 - The Data Science Hierarchy of Needs, created by Monica Rogati
When a company makes a product/service, it needs valuable information to start on a solid basis.
This information was difficult to find two decades ago, before the internet was widely used. Today, with the evolution of technologies, data is the best way to understand the ecosystem in which we evolve.
So here is the thing: Data Engineers and Data Scientists are two different professions which are part of a bigger plan. The “Data Science Hierarchy of Needs” pyramid clearly illustrates the process needed to use Data in a company.
At the base, Software Developers work on the collection of all relevant Data for the Data Engineers.
Then, Data Engineers move and transform this Data into “pipelines” for the Data Scientists. They usually use programming languages such as Java, Scala, C++ or Python to do that work.
Finally, Data Scientists analyze, test, aggregate and optimize the data and present it to the company.
Sometimes, Research Scientists, Core Data Scientists and Machine Learning Engineers can be hired to optimize the final stage of the process.
See “The Data Science Hierarchy of Needs” (fig.1).
Looking at fig.1, it becomes quite understandable that all these tasks have to be divided and given to specific Data professionals.
Data Engineers specialize in 3 main data actions: designing, building and arranging Data “pipelines”.
They are, in a sense, the Data Architects.
Data Engineers often have a computer engineering or computer science background and system-building skills.
“Data pipelines are sequences of processing and analysis steps applied to data for a specific purpose. They’re useful in production projects, and they can also be useful if one expects to encounter the same type of business question in the future, so as to save on design time and coding. For instance, one could remove outliers, apply dimensionality reduction techniques, and then run the result through a random forest classifier to provide automatic classification on a particular dataset that is pulled every week.”
Colleen Farrelly, Data Scientist/Poet/Social Scientist/Topologist (2009-present)
Fig.2 - Pipeline created from raw data to end results data.
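To make the idea of a pipeline more concrete, here is a minimal sketch in plain Python of a chain of steps applied to a weekly batch of data. It is only an illustration under stated assumptions: the function names, the sample numbers and the thresholds are all made up for this example. To keep it dependency-free, a simple averaging step stands in for a real dimensionality reduction technique and a threshold rule stands in for the random forest classifier mentioned in the quote.

```python
from statistics import median


def remove_outliers(rows, factor=5.0):
    """Drop rows whose total is far from the median total.

    "Far" means more than `factor` times the median absolute
    deviation (MAD). Both the rule and the factor are illustrative.
    """
    totals = [sum(row) for row in rows]
    center = median(totals)
    mad = median(abs(t - center) for t in totals)
    if mad == 0:
        return list(rows)  # no spread at all, nothing to filter
    return [row for row, t in zip(rows, totals)
            if abs(t - center) <= factor * mad]


def reduce_dimensions(rows):
    """Collapse each row to a single number (its mean).

    A crude stand-in for a real dimensionality reduction technique.
    """
    return [sum(row) / len(row) for row in rows]


def classify(values, threshold=5.0):
    """Label each value 'high' or 'low'.

    A threshold rule standing in for a trained classifier.
    """
    return ["high" if v >= threshold else "low" for v in values]


def run_pipeline(data, steps):
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data


# The same list of steps can be reused every week on a fresh batch.
WEEKLY_STEPS = [remove_outliers, reduce_dimensions, classify]

# Hypothetical batch: the last row is an obvious outlier.
weekly_batch = [[1.0, 2.0], [2.0, 3.0], [8.0, 9.0], [9.0, 7.0], [500.0, 900.0]]
labels = run_pipeline(weekly_batch, WEEKLY_STEPS)
print(labels)  # ['low', 'low', 'high', 'high']
```

The point of the design is that the steps are defined once and stored as a list, so answering the same business question next week only means calling run_pipeline again on the new data.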
What are a Data Engineer's tasks in a company?
What skills are expected of a Data Engineer?
Data Scientists normally have 4 main tasks in a company: they analyze the data, test it, build models and present their results to the team.
Data Scientists have a math and statistical background. They are also comfortable with creating machine learning and artificial intelligence models.
What are a Data Scientist's tasks in a company?
What skills are expected of a Data Scientist?
More than 80% of data scientists have a master's degree (in a scientific or mathematical field) and 46% have a Ph.D. The most common fields of study are Mathematics and Statistics. Yet Computer Science, Social Sciences, Physical Sciences and Engineering are also great fields for a Data Scientist. Certifications and online courses can be a useful add-on to your educational background. However, a Ph.D. is greatly appreciated if you want to strengthen your profile.
As for technical skills, data scientists are expected to be proficient in programming languages.
As for the choice of programming languages, here is a list of the most in-demand ones:
As for professional skills, a skilled data scientist should have:
As you can see, Engineers and Scientists require different skillsets and their profiles are different.
Also, Data Scientists must have very good communication skills to present Data and propose decisions based on their work.
Beyond all these aspects, knowledge of Machine Learning and Artificial Intelligence can be a game changer, depending on the company you are applying to.
According to Glassdoor, the average base salaries in the US (updated Sep 26, 2018) are:
Data Engineer: $151 / year on average
“The number of job openings for data engineers is almost five times higher than the number of job openings for data scientists. This makes sense as most organizations need more data engineers than data scientists on their team” according to Glassdoor.
Data Scientist is a dream job on paper.
However, when they work in small organizations, data scientists can end up as multitasking employees.
When Data Scientists have to deal with the whole Data Hierarchy, the work can become difficult, as they are not Data Engineers or Software Engineers.
This can result either in a devaluation of the profession or in a waste of resources for the company.
Sometimes, being a Data Scientist in a company can look like this:
As a result, studies show that in 2017, 24.0% of Data Scientists changed jobs.
Admittedly, the Data Science job market is a flourishing environment, which lets employees move to the projects they like the most.
However, it also shows that a large number of Data Scientists are trying to find a better place on the market.
Data Engineers have become a rare commodity.
Glassdoor lists more than 107K Data Engineer job openings.
Data Engineers are in high demand:
“Even the hottest Silicon Valley companies are unable to achieve a one-to-two ratio. […] You don’t have enough engineering talent out there. It’s very expensive,” says Tomer Shiran, the CEO and co-founder of Dremio, a developer of big data middleware.
Why do recruiters have difficulty finding data engineers today?
For example, in the Netherlands, recruiters are searching for native-speaking Engineers skilled in very specific programming tools.
However, most of them face long waits before they find their talent.
How to find a data engineer?
You should first know that it requires a good process.
Having a good shortlist of Data Engineer candidates is essential to select the one that you need.
However, the process of finding Data Engineers takes time and energy that companies don’t necessarily have.
The goal of recruitment agencies is to bridge this gap between supply and demand.
By searching every day for the best recruits in a specific field, they are able to address this major issue.