Big Data Engineer Job Description
Big Data Engineer Duties & Responsibilities
To write an effective big data engineer job description, begin by listing detailed duties, responsibilities and expectations. We have included big data engineer job description templates that you can modify and use.
Sample responsibilities for this position are included in the examples below.
Big Data Engineer Qualifications
Qualifications for a job description may include education, certification, and experience.
Licensing or Certifications for Big Data Engineer
List any licenses or certifications required by the position: AWS, III, IAT, CCP, GCP, ISTQB, ISEB, OS, CompTIA, PKI
Education for Big Data Engineer
Typically, a job would require a certain level of education.
Employers hiring for the big data engineer job most commonly prefer their future employee to have a relevant degree, such as a bachelor's or master's degree in Computer Science, Engineering, Technical, Statistics, Information Systems, Software Engineering, Mathematics, Business, Management Information Systems, or Computer Engineering.
Skills for Big Data Engineer
Desired skills for a big data engineer include:
Desired experience for a big data engineer includes:
Big Data Engineer Examples
Big Data Engineer Job Description
- Engage with business stakeholders and data scientists to understand and analyze business problems
- Build data solutions using available big data frameworks and technologies
- Acquire data from various data sources (cloud, databases, Hadoop, sensors, weblogs, and social media)
- Analyze huge sets of data in Hadoop (both structured and unstructured)
- Perform data discovery and integration, and identify and fix data quality issues
- Define automated processes for tracking data quality and consistency (see the data quality sketch after this example)
- Perform impact assessment and deep dive analysis to ensure data integrity, usability and completeness
- Design, develop, implement, and maintain code, information architecture, and conceptual models to support data processing and flows through the data lake
- Support the Marine Corps Intelligence Activity (MCIA) Weapons and Technology Division (WTD) in Science & Technology Intelligence (S&TI) analysis of foreign basic and applied research and engineering, prototyping, technology transfer, developmental and operational testing, military fielding, and integration of technologies
- Conduct S&TI research and analysis and produce finished S&TI assessments on information systems and emerging and disruptive technologies (E&DT) that are likely to affect US Marine Corps operations
- Effective communication and presentation skills at senior levels of the organization
- Bachelor’s degree or equivalent, plus 5+ years of experience in building data solutions in large scale distributed systems
- Strong data analysis skills, with the ability to identify, analyze, and integrate various large, complex data sources (internal and external marketplace data providers) into a readily consumable data product
- Strong experience in using open source frameworks
- Familiarity with data science model development and production deployment
- Highly curious, with a keen eye for data and a result-oriented attitude
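For illustration, here is a minimal sketch of the kind of automated data quality tracking mentioned above, assuming a Spark batch job over the data lake; the paths and column names (`/datalake/raw/events`, `customer_id`, `event_id`) are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DataQualityCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-data-quality")
      .getOrCreate()

    // Hypothetical source table; substitute the real data lake path.
    val events = spark.read.parquet("/datalake/raw/events")

    // Track basic quality metrics: row count, null rate, distinct keys.
    val metrics = events.agg(
      count(lit(1)).as("row_count"),
      sum(when(col("customer_id").isNull, 1).otherwise(0)).as("null_customer_ids"),
      countDistinct(col("event_id")).as("distinct_event_ids")
    )

    // Append the metrics with a run date so trends can be monitored over time.
    metrics.withColumn("run_date", current_date())
      .write.mode("append").parquet("/datalake/quality/events_metrics")

    spark.stop()
  }
}
```

Running such a job on a schedule and comparing successive metric rows is one simple way to track consistency over time.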
Big Data Engineer Job Description
- Participate in development of data infrastructure capable of ingesting and storing petabytes of data and serving thousands of queries a day within seconds on that data
- Build fault tolerant, self-healing, adaptive and highly accurate data and event computational pipelines
- Prepare and capture data for machine learning and automation
- Focus on people and improvement by giving the team a platform to improve, not only during retrospectives but at all times
- Provide training to the team on agile methodologies
- Implement the most suitable strategy for the conditions on the ground
- Apply agile processes in each sprint at the user-story level, as per the Definition of Done (DoD)
- Successfully run agile projects of varying size and complexity
- Identify project risks and raise them proactively
- Follow the agile process throughout project execution
- A proven history of building big data solutions handling terabytes or petabytes of data
- A passion for working with huge data sets
- Solid experience with big data frameworks
- Knowledge of OOP design and patterns
- Strong knowledge of source code repositories such as SVN/Git and build tools such as Maven
- Proven use of open source frameworks
- Experience working with public clouds like Azure, AWS
Big Data Engineer Job Description
- Design and develop distributed, high-volume, high-velocity, multi-threaded event-processing systems using the Core Java technology stack (see the streaming sketch after this example)
- Develop efficient software code leveraging Core Java and big data technologies for the various use cases built on the platform
- Implement scalable solutions to meet ever-increasing data volumes, using big data and cloud technologies such as Apache Spark and Hadoop on any cloud platform
- Develop programs in Python as part of data cleaning and processing
- Implement scalable solutions to meet ever-increasing data volumes, using big data and cloud technologies such as Apache Spark on any cloud platform
- Scope and deliver solutions, with the ability to design solutions independently from a high-level architecture
- Maintain production systems such as Kafka, Hadoop, Cassandra, and Elasticsearch
- Design and establish secure, performant data architectures, enhancements, updates, and programming changes for portions and subsystems of data pipelines, repositories, or models for structured and unstructured data
- Write and execute complete test plans, protocols, and documentation for the assigned portion of a data system or component
- Lead a project team of other data engineers to develop reliable, cost-effective, high-quality solutions for the assigned data system, model, or component
- Bachelor of Science (or higher) in Computer Science or a related field
- 5+ years of systems/application analysis and design experience
- 3+ years of data modeling and database administration experience
- Experience with cluster management technologies such as YARN, Mesos, Kubernetes
- Excellent knowledge of relational databases (PostgreSQL), SQL and ORM technologies
- Preferred experience with ATDD and associated technologies (FitNesse, DbSlim, JUnit)
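As a hedged illustration of the event-processing duties above, the sketch below uses Spark Structured Streaming over Kafka; the broker address, topic name, and checkpoint path are hypothetical, and a real system might instead use plain Core Java with the Kafka consumer API:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object EventPipeline {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("event-processing-pipeline")
      .getOrCreate()

    // Read a high-velocity event stream from Kafka (hypothetical broker and topic).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp")

    // Count events per key in 1-minute windows, tolerating 10 minutes of lateness.
    val counts = events
      .withWatermark("timestamp", "10 minutes")
      .groupBy(window(col("timestamp"), "1 minute"), col("key"))
      .count()

    // The checkpoint lets a restarted job resume from the last committed offsets.
    counts.writeStream
      .outputMode("update")
      .format("console")
      .option("checkpointLocation", "/datalake/checkpoints/events")
      .start()
      .awaitTermination()
  }
}
```

The checkpoint location is what provides the fault tolerance asked for here: on failure, the query restarts from its recorded offsets rather than reprocessing or losing the stream.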
Big Data Engineer Job Description
- Implement scalable solutions to meet ever-increasing data volumes, using big data and cloud technologies such as Spark, Scala, and Hive on any cloud platform
- Develop programs in Scala and Python as part of data cleaning and processing (see the cleaning sketch after this example)
- Develop efficient software code leveraging Python and big data technologies for the various use cases built on the platform
- Implement scalable solutions to meet ever-increasing data volumes, using big data and cloud technologies such as Apache Spark and Kafka on any cloud platform
- Develop efficient software code leveraging Spark and big data technologies for the various use cases built on the platform
- Collaborate and communicate with the project team regarding project progress and issue resolution
- Represent the data engineering team in all phases of larger, more complex development projects
- Help define and improve standards as an experienced data engineer
- Work with the squad on solution description and implementation of projects and enhancements within the big data hub environment
- Analyze and develop complex solutions
- Preferred experience with delivering code using Continuous Integration and Continuous Delivery (CI/CD) best practices and DevOps (Jenkins Pipelines, Docker, Groovy, Ansible)
- STEM degree related to the labor category subject matter
- 3 years of experience performing S&TI analysis, or a waiver with demonstrated proficiency in analyzing, summarizing, and writing
- Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field required
- Minimum of 3 years of on-the-job experience required
- 3 years of experience with Big Data technologies (Hadoop, Kafka, Cassandra or Spark) required
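To illustrate the data cleaning and processing duties above, here is a minimal Scala/Spark sketch; the input path, output path, and column names (`order_id`, `amount`, `order_ts`, `country`) are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object CleanOrders {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clean-orders")
      .getOrCreate()

    // Hypothetical raw input; substitute the real source.
    val raw = spark.read
      .option("header", "true")
      .csv("/datalake/raw/orders.csv")

    val cleaned = raw
      .dropDuplicates("order_id")                             // remove duplicate records
      .na.fill(Map("country" -> "UNKNOWN"))                   // fill missing categorical values
      .withColumn("amount", col("amount").cast("double"))     // enforce numeric type
      .filter(col("amount") > 0)                              // drop invalid amounts
      .withColumn("order_ts", to_timestamp(col("order_ts"))) // normalize timestamps

    // Write the curated output as Parquet for downstream use cases.
    cleaned.write.mode("overwrite").parquet("/datalake/curated/orders")
    spark.stop()
  }
}
```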
Big Data Engineer Job Description
- Around 5-6 years of overall experience, including 2-3 years of Python/Spark/Scala experience
- Develop using Java/Scala
- Apply data partitioning techniques with Spark (PARTITION BY, DISTRIBUTE BY, CLUSTER BY); see the partitioning sketch after this example
- Size and run Spark jobs (number of executors, cores, and memory)
- Monitor and quality-assure deliverables and suggest improvements
- Prepare and follow up on installations in the QA and PROD environments
- Follow up on and analyze incidents, identifying the root cause
- Support analytic experiments in the Analytics Hub
- Share knowledge and uplift the capabilities of data engineers and analysts in the business
- Provide thought leadership in the approach to data engineering at the bank
- 3 years of experience with factors affecting the performance of ETL processes and SQL queries, and with performance tuning, required
- 3 years of experience with implementing applications required
- 3 years of experience with Impala, Hive, Oozie, Hue, ZooKeeper, HDFS, or YARN required
- Mentor or coach for scrum teams
- Expert in Agile Scrum principles, task meetings, and retrospectives
- Proven skills in relative estimation and story-based development
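As a hedged illustration of the partitioning and job-sizing duties above, the sketch below shows the DataFrame equivalents of DISTRIBUTE BY and CLUSTER BY, plus a write-time PARTITION BY; the table paths, column names, and partition count are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object PartitionedWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partitioned-write")
      .getOrCreate()

    // Hypothetical curated table.
    val sales = spark.read.parquet("/datalake/curated/sales")

    // DISTRIBUTE BY equivalent: repartition by key so related rows share a partition.
    // CLUSTER BY equivalent: the same repartition plus a sort within each partition.
    val clustered = sales
      .repartition(200, col("region"))
      .sortWithinPartitions(col("region"), col("sale_date"))

    // PARTITION BY at write time produces one output directory per region.
    clustered.write
      .partitionBy("region")
      .mode("overwrite")
      .parquet("/datalake/marts/sales_by_region")

    spark.stop()
  }
}
```

Executor sizing is typically set at submit time rather than in code, for example: spark-submit --num-executors 10 --executor-cores 4 --executor-memory 8g --class PartitionedWrite app.jar (the values here are placeholders to tune per workload).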