Job Posted:
16 March 2023
Job Description
• Understanding our data sets and how to bring them together.
• Working with our engineering team to support custom solutions offered to product development.
• Bridging the gap between development, engineering, and data ops.
• Creating, maintaining, and documenting scripts to support ongoing custom solutions.
• Excellent organizational skills, including attention to detail.
• Strong multitasking skills and the ability to work in a fast-paced environment.
• 5+ years of experience developing scripts with Python.
• Know your way around RESTful APIs (able to integrate; publishing not necessary).
• Familiarity with pulling files from and pushing files to SFTP and AWS S3.
• Experience with any cloud platform, including GCP, AWS, OCI, or Azure.
• Familiarity with SQL to query and transform data in relational databases.
• Familiarity with Linux (and a Linux work environment).
• Excellent written and verbal communication skills.
• Extracting, transforming, and loading data into internal databases and Hadoop.
• Optimizing our new and existing data pipelines for speed and reliability.
• Deploying product builds and product improvements.
• Documenting and managing multiple repositories of code.
• Experience with SQL and NoSQL databases (Cassandra, MySQL).
• Hands-on experience in data pipelining and ETL (any of these frameworks/tools: Hadoop, BigQuery, Redshift, Athena).
• Hands-on experience with Airflow.
• Understanding of best practices and common coding patterns around storing, partitioning, warehousing, and indexing of data.
• Experience reading data from Kafka topics (both live stream and offline).
• Experience with PySpark and DataFrames.
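The Python scripting and ETL work described above can be illustrated with a minimal sketch using only the standard library; the table name, column names, and sample data below are hypothetical, not taken from the posting:

```python
import csv
import io
import sqlite3

# Hypothetical raw extract: CSV rows as they might arrive from SFTP or S3.
RAW_CSV = """user_id,signup_date,plan
1,2023-03-01,pro
2,2023-03-02,free
3,2023-03-05,pro
"""

def etl(raw_csv: str, conn: sqlite3.Connection) -> int:
    """Extract CSV rows, transform (keep paid plans), load into SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS paid_users (user_id INTEGER, signup_date TEXT)"
    )
    rows = csv.DictReader(io.StringIO(raw_csv))
    paid = [(int(r["user_id"]), r["signup_date"]) for r in rows if r["plan"] == "pro"]
    conn.executemany("INSERT INTO paid_users VALUES (?, ?)", paid)
    conn.commit()
    return len(paid)

conn = sqlite3.connect(":memory:")
loaded = etl(RAW_CSV, conn)
```

In a production pipeline the same extract/transform/load shape would typically be pointed at an internal database or Hadoop rather than in-memory SQLite.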
Desired Candidate Profile
• Collaborating across an agile team to continuously design, iterate, and develop big data systems.
• Extracting, transforming, and loading data into internal databases.
• Optimizing our new and existing data pipelines for speed and reliability.
• Deploying new products and product improvements.
• Documenting and managing multiple repositories of code.
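The pipeline responsibilities above are often expressed as an Airflow DAG; below is a minimal sketch assuming Airflow 2.x with the TaskFlow API, where the DAG name, schedule, and task bodies are hypothetical placeholders:

```python
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2023, 3, 16), catchup=False)
def custom_solution_pipeline():
    @task
    def extract():
        # e.g. pull a file from SFTP or S3 and return its local path
        return "/tmp/raw.csv"

    @task
    def transform(path: str):
        # e.g. clean and join the data set
        return path

    @task
    def load(path: str):
        # e.g. insert into an internal database or Hadoop
        pass

    load(transform(extract()))

custom_solution_pipeline()
```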
Skillset:
Python, PySpark, Kafka, Airflow, SQL, NoSQL, API integration, data pipelines, big data, AWS / GCP / OCI / Azure
Notes: Preferably from a product background.