Data engineers support Client’s Global Product and Merchandising Analytics team (GPMA) by collecting and synthesizing product and marketplace data across multiple areas of Client’s business, enriching and preparing the data to meet the business’s core reporting needs, and efficiently operationalizing sophisticated analytical models to help drive Client’s understanding of how to best position our products in the market. GPMA provides the core insights at the intersection of consumer segmentation & demand and product design & merchandising.
Most development will be in the enterprise warehouse or Spark (generally through SparkSQL), so strong SQL skills are required. File integration of Excel docs, Avro or Parquet files, etc. will also be in scope, so general familiarity with Python is required as well. Academic understanding of ETL best practices as applied across batch and real-time integration and ingestion of data is important. The best candidate will have deep knowledge of best practices related to the operationalization of applied statistics / machine learning models in a cloud-native, distributed environment so that they can run efficiently and at scale. Strong communication is (as always) critical, as the data engineer’s primary responsibility will be to take input from Analysts and Product Owners to quickly understand and implement requirements.
The GPMA team is highly cross-functional, and business and technology are closely integrated. Engineers are co-located with the business, and we try to live and breathe the Agile process, so experience with the stand-up/sprint cadence and a culture of iterative development is desirable. Another core benefit the engineering team offers the business is a stable SDLC (so business analysts aren’t directly changing production objects), so familiarity with best practices around code promotion, integrated testing, and CI/CD is useful.
The best candidate will be comfortable responding and pivoting quickly to meet shifting business needs, working closely with the business to keep the turnaround time on requests as short as possible, and receiving and incorporating feedback. Our goal is to serve our business fast, course correct, and iterate.
Bachelor’s Degree in a software-related field
3-5 years of experience developing in a large enterprise environment
2+ years SQL development
2+ years Python development
2+ years ETL development (Informatica, Matillion, SSIS, etc. are acceptable, but non-graphical/code-centric experience is preferred)
CI/CD and integrated testing experience is highly desirable
Apache Airflow experience is a bonus
Agile methodology experience is a bonus