- Design and develop Big Data architecture to support the processing of large volumes of data;
- Write complex, efficient SQL or Python code to optimize data pipelines that ingest, cleanse, transform and integrate data from multiple disparate sources;
- Work extensively with AWS services to deploy and run our database code via CI/CD pipelines;
- Develop data pipelines to support data lake, data science and ML initiatives;
- Adhere to best practice development standards for database design and architecture.
- Hands-on experience with data management technologies (Microsoft SQL Server, Postgres, advanced SQL coding, relational database design, data warehousing);
- Experience developing with Big Data technologies (Spark, AWS EMR) and distributed data processing to handle large volumes of data;
- Experience writing Python or PySpark;
- Experience working in Linux and Windows environments;
- Enjoy implementing new technologies and coming up with innovative solutions.
Nice to have:
- 1+ years of experience working with Parquet file formats and AWS S3 storage;
- 1+ years of experience in writing infrastructure as code for AWS cloud services (Terraform, CloudFormation);
- 1+ years of experience creating CI/CD deployment pipelines in Azure DevOps;
- 1+ years of experience working with containers (Docker, Kubernetes).
How to apply
E-mail your CV to firstname.lastname@example.org, or call us.