Roles and Responsibilities:
Test, monitor, manage, and validate data warehouse activity, including data extraction, transformation, movement, loading, cleansing, and updating processes
Build and maintain optimized, highly available data pipelines that facilitate deeper analysis and reporting
Collaborate with stakeholders to understand, define, and document business applications, data sources/relationships, and needs (metrics, dimensions, charts, and dashboards)
Analyze and identify new data sources as needed per stakeholder requirements
Work closely with the analytics team and other business stakeholders to create data processing frameworks that help the business make sense of growing data
Manage and monitor existing data pipelines and work with other team members to ensure best practices for ETL and data warehousing
Optimize data storage and processing infrastructure to reduce costs
Facilitate integration of different BI tools as needed by business stakeholders
Assist in the design and documentation of data infrastructure and technologies
Skills Required:
Proficiency in Python
Minimum of 2 years' experience as a Data Engineer
Experience with the AWS stack: Glue, Athena, QuickSight, RDS, Redshift, Kafka, and PySpark
Experience setting up data pipelines, archiving data, and building data lakes
Bachelor's or Master's degree in Computer Science, Mathematics, Statistics, or a related field
Expertise in designing complex data models and data engineering solutions
Ability to write complex SQL for processing raw data, data transformations, and data validations
Experience with infrastructure as code and shell scripting
Hands-on experience working with big data in AWS environments, including cleaning, transforming, cataloging, and mapping data
Experience setting up a data warehouse with Amazon Redshift, creating Redshift clusters, and performing data analysis queries
Experience with ETL and data modeling using AWS ecosystem components: AWS Glue, Redshift, and DynamoDB
Familiarity with AWS data migration tools such as AWS DMS, Amazon EMR, and AWS Data Pipeline
AWS certifications are a plus.