1. The Past Expectation
To adapt to the industry's current expectations, it helps to first understand what was expected in the past.
- The AI space was still relatively new (though not in academia), and many companies and startups were still evaluating its applications and valid use cases.
- Research was the primary focus. The caveat was that this research was often not directly in line with the core business of the organization. As a result, it was initially not held to a high standard of credibility.
- Companies generally blended the role of a Data Scientist with that of a Data Analyst or Data Engineer, again because enterprise applications of AI were still vague.
- Individuals faced a similar dilemma. Much of their research or work was not directly aligned with the business and was not practically viable to serve as a product.
2. The Current Outlook
The democratization of AI has led to remarkable developments from businesses and startups. Let us look at what has changed:
- The industry now distinguishes between the roles of Data Scientist, Machine Learning Engineer, Data Analyst, Data Engineer, and even MLOps Engineer.
- Businesses no longer allow open-ended research, because they know exactly which use case they are targeting. A similarly clear mindset and focused approach is expected of the individual.
- Every research effort or POC must lead to a tangible, servable product.
3. A Thorough Dissection of the Roles
If we have to pick one area where businesses have excelled in the AI space, it is undoubtedly in setting clear expectations for each of the roles. In a nutshell:
Data Scientist: A person (generally from a statistics or mathematics background) who uses a variety of means, including AI, to extract valuable information from data.
- A fundamental difference between a Data Analyst and a Data Scientist is that the former generally relies on domain knowledge and manual, old-school methods to make sense of data on a small to medium scale, whereas the latter is responsible for collecting, analyzing, and interpreting data on a larger scale using a wider range of tools, such as AI, SQL, and the same manual methods.
- Domain knowledge is not a must, but it is helpful.
- The primary job is to extract and maintain insights from data that contribute to the business, not to develop software or a product.
- A Statistician or a Mathematician can become a good Data Scientist.
Machine Learning Engineer: A niche software engineer who develops a product or service based on AI.
- An ML Engineer needs all the expertise of traditional software engineering along with knowledge of AI, because they will eventually build software with AI at its core.
- The primary job is not to extract insights from data directly but to develop an AI tool that can do that job.
- A developer with good knowledge of machine learning/deep learning as well as software engineering can become a good Machine Learning Engineer.
Machine Learning Operations (MLOps) Engineer: A niche software engineer who maintains and automates the pipeline used by the ML system.
- This is a relatively new field inspired by DevOps, though it differs from traditional DevOps roles.
- Unlike traditional software engineering, development of an AI-based product, software, or service doesn't stop once the software is built. The system has to be updated regularly with new data in response to data drift.
- The primary job includes all traditional DevOps work as well as maintaining and automating the pipeline and handling data drift.
- A developer with good knowledge of machine learning/deep learning, software engineering, and cloud technologies can become a good MLOps Engineer.
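To make the data-drift point more concrete, here is a minimal sketch of one way a pipeline might decide that a model needs updating. It compares the values of a single numeric feature seen at training time with recent production values, using the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distributions). The function names `ks_statistic` and `needs_retraining` and the threshold of 0.2 are assumptions chosen for illustration; a real pipeline would pick its own drift measure and threshold.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical distributions, 1.0 means no overlap.
    """
    ref = sorted(reference)
    cur = sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # The gap between two step functions can only change at observed values,
    # so it is enough to evaluate both CDFs at every data point.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def needs_retraining(reference, current, threshold=0.2):
    """Flag drift when the KS statistic exceeds an illustrative threshold."""
    return ks_statistic(reference, current) > threshold


if __name__ == "__main__":
    training_values = [1.0, 2.0, 3.0, 4.0, 5.0]
    production_values = [10.0, 11.0, 12.0, 13.0, 14.0]
    print(needs_retraining(training_values, production_values))
```

In practice a check like this would run on a schedule for each monitored feature, and a positive result would trigger the retraining step of the pipeline rather than just printing a flag.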
Data Engineer: A niche software engineer who develops a pipeline to serve all data needs using a variety of tools (generally cloud-based).
- The Data Engineer needs expertise in the major cloud-based data platforms, batch or stream processing, big data platforms (depending on business requirements), and databases (though not at the level of a database administrator).
- The Data Engineer has to work in line with data governance policies.
- The primary job is to design, implement, and maintain the data pipeline.
- A developer with good knowledge of cloud technologies, data platforms, and data processing needs can become a good Data Engineer.
For a new job seeker or someone aiming to advance in their career, all these roles and expectations must be well understood. Given that companies now clearly distinguish between these roles, individuals are expected to do the same. A vague mindset will not help.