What are the missions of a Data Engineer?


The Data Engineer extracts raw data and transforms it into usable data for the data science team.

In general, he develops, builds, tests, and maintains data architectures. He may also deploy sophisticated analytics programs, machine learning models, and statistical methods.



He is responsible for the development, construction, and maintenance of the analytics infrastructure that enables almost every other function in the data world. This includes building and testing architectures such as databases and large-scale processing systems. Data Engineers also work on leveraging and improving data analytics systems. Here, his mission is to build a bridge between the data sources and the data science team, so that the data is usable for them.

However, there is more than one type of Data Engineer. In practice, a Data Engineer's tasks depend on the size of the company and the way it works with data.


The generalist Data Engineer

He usually works on small teams. A generalist works end to end, taking a data project from A to Z: from ingesting the raw data to delivering insights from it, he has full ownership of the project. This requires a broader set of skills than most data engineering roles. However, less depth is needed on the engineering side, because a generalist Data Engineer has a smaller amount of data to work with. Companies hiring generalist Data Engineers usually want to put a data infrastructure in place and have less data to handle.


The pipeline-centric Data Engineer

We usually find the pipeline-centric Data Engineer in mid-sized companies with complex data science needs.
This Data Engineer works with a team of Data Scientists. He transforms data into a format useful for analysis. The pipeline-centric Data Engineer has specific knowledge of distributed systems and computer science.
He also has to be able to create tools for predictive algorithms.
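
As a rough illustration of "transforming data into a format useful for analysis", here is a minimal Python sketch. The event fields (`user`, `amount`, `ts`) and the function names are hypothetical, chosen only for this example. A real pipeline would typically run on a distributed framework rather than on in-memory lists.

```python
from datetime import datetime

# Hypothetical raw events, as they might arrive from an application log.
RAW_EVENTS = [
    {"user": " Alice ", "amount": "12.50", "ts": "2023-01-05T10:00:00"},
    {"user": "bob", "amount": "n/a", "ts": "2023-01-05T11:30:00"},
    {"user": "ALICE", "amount": "7.50", "ts": "2023-01-06T09:15:00"},
]


def clean_event(event):
    """Normalize one raw event, or return None if the amount is not a number."""
    try:
        amount = float(event["amount"])
    except ValueError:
        return None
    return {
        "user": event["user"].strip().lower(),
        "amount": amount,
        "day": datetime.fromisoformat(event["ts"]).date().isoformat(),
    }


def transform(events):
    """Clean every event and drop the ones that could not be parsed."""
    cleaned = (clean_event(e) for e in events)
    return [e for e in cleaned if e is not None]


def total_per_user(rows):
    """Aggregate the cleaned rows into a total amount per user."""
    totals = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0.0) + row["amount"]
    return totals
```

Calling `total_per_user(transform(RAW_EVENTS))` gives the data scientists one clean total per user, with the unparseable row left out instead of breaking the analysis.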


The database-centric Data Engineer

The database-centric Data Engineer usually works in larger companies. For his part, he is more focused on setting up and populating analytics databases.
His main missions are to enable fast analysis and to create table schemas (ETL, data warehouse).
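
To make "creating table schemas" and ETL a little more concrete, here is a small sketch using Python's built-in `sqlite3` module as a stand-in for an analytics database. The `orders` table, its columns, and the sample rows are assumptions made for this example. A real data warehouse would use a dedicated engine and a much richer schema.

```python
import sqlite3

# Hypothetical source rows: (order_id, customer, amount_in_cents).
SOURCE_ROWS = [
    (1, "alice", 1250),
    (2, "bob", 400),
    (3, "alice", 750),
]


def build_warehouse(rows):
    """Create the schema, then transform and load the source rows."""
    conn = sqlite3.connect(":memory:")
    # Schema: one row per order, with the amount in currency units.
    conn.execute(
        """
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer TEXT NOT NULL,
            amount   REAL NOT NULL
        )
        """
    )
    # Transform step: convert cents to currency units before loading.
    transformed = [(oid, cust, cents / 100) for oid, cust, cents in rows]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", transformed)
    conn.commit()
    return conn


def revenue_per_customer(conn):
    """Example analytics query that the schema is meant to answer quickly."""
    query = (
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()
```

The schema is designed around the question analysts will ask (revenue per customer), which is the kind of decision a database-centric Data Engineer makes up front.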


To sum up, the Data Engineer is the main person responsible for:

  • Architecting distributed systems
  • Creating reliable pipelines
  • Combining data sources
  • Architecting data stores
  • Collaborating with data science teams and building the right solutions for them
  • Developing and translating computer algorithms into prototype code, and identifying trends in large data sets


