Position Details: Big Data Engineer - 971893O
- Evaluate and extract/transform data for analytic purposes within a big data environment.
- Responsible for using Hive and Spark in data warehouse applications to maintain large datasets in AWS S3, and for deciding on engineering tools based on recommendations.
- Contribute to ETL and aggregate design.
- Design and develop Spark scripts to gather data insights per business requirements, and collaborate with other teams on integration needs and design.
- Facilitate or perform application support, problem solving, and issue resolution with internal and external resources. Contribute to and review recommendations for technical solutions.
- Troubleshoot big data issues and determine options for resolution and risk mitigation.
- Use components such as Sqoop, Hive, and Spark for development, along with tools such as Airflow, Genie, Amazon EMR, and the Cloudera Hadoop distribution for assigned tasks.
- Review and approve performance test results, recommendations, and tuning results. Oversee and take responsibility for the creation of test plans, test execution, and validation of test results.
- Responsible for EMR cluster creation, administration, sizing, and configuration.
- Perform development and unit testing in the Hadoop and AWS ecosystems.
- Automate and monitor ETL processes and applications.
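As a rough illustration of the extract/transform and aggregate-design duties above, the pure-Python sketch below mimics the kind of transform-and-aggregate step a Spark job would perform. The field names and sample records are hypothetical; in practice this logic would run as a Spark job on EMR against datasets in S3.

```python
from collections import defaultdict

# Hypothetical raw event rows, standing in for records read from S3.
raw_events = [
    {"region": "us-east-1", "status": "ok", "bytes": 1200},
    {"region": "us-east-1", "status": "error", "bytes": 300},
    {"region": "us-west-2", "status": "ok", "bytes": 800},
]

def transform(event):
    # Transform step: keep the fields we need and derive an error flag.
    return {
        "region": event["region"],
        "is_error": event["status"] != "ok",
        "bytes": event["bytes"],
    }

def aggregate(events):
    # Aggregate step: total bytes and error count per region.
    totals = defaultdict(lambda: {"bytes": 0, "errors": 0})
    for e in events:
        totals[e["region"]]["bytes"] += e["bytes"]
        totals[e["region"]]["errors"] += int(e["is_error"])
    return dict(totals)

summary = aggregate(transform(e) for e in raw_events)
```

The same shape carries over to Spark: `transform` becomes a `map`/`select`, and `aggregate` becomes a `groupBy` with sum/count aggregations.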
Requirements:
- 5+ years of work experience in a Data Warehouse/BI environment.
- 2+ years of experience working with Hadoop, Hive, and Spark.
- Solid understanding of general BI and ETL concepts.
- Expert-level experience in SQL.
- Working knowledge of Unix/Linux environments and shell scripting.
- Experience with Python.
- Familiarity with AWS services such as S3 and EMR.
- Familiarity with Airflow.
- Familiarity with visualization tools such as Tableau.
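To make the SQL and BI requirements concrete, here is a minimal sketch of a typical warehouse-style aggregation using Python's built-in sqlite3 module. The table and column names are hypothetical; production queries of this kind would run against Hive or Spark SQL over data in S3.

```python
import sqlite3

# In-memory database standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)

# Typical BI-style aggregate: total sales per region, largest first.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
```

A result set shaped like this (one row per group, one measure column) is exactly what visualization tools such as Tableau consume.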