Experienced Data Engineer and Analyst skilled in designing robust data architectures, optimizing data warehousing, and building efficient ETL pipelines to support business intelligence and advanced analytics. Proficient in data modelling, workflow automation, and data quality management, delivering real-time, actionable insights through dynamic dashboards. Expert at aligning data solutions with business objectives, driving operational efficiency, enhancing decision-making, and fostering business growth.
Overview
6 years of professional experience
Work History
Data Engineer
Tata Consultancy Services
Hyderabad
07.2020 - 12.2022
Migrated on-prem SQL databases to AWS, utilizing Databricks for ETL processing with RDS, S3, and EC2, enhancing scalability and ensuring high availability with automated failover
Implemented Databricks Delta Lake for schema evolution and optimized storage, reducing query latency by 35% and ensuring ACID compliance during cloud migration
Built scalable ETL pipelines in Databricks, integrating Apache Kafka to automate data validation workflows, reducing inconsistencies by 15% and cutting testing time by 60%
Optimized AWS Redshift performance by offloading transformations to Databricks, leveraging Photon Engine to improve query efficiency by 40% and reduce Redshift costs
Performed EDA on 5M+ records using Python and SQL, uncovering 30+ data trends and resolving anomalies that improved workflow efficiency by 20%
Automated inventory management with Python, Azure, and SQL Server, integrating data governance principles to ensure compliance and reduce manual tracking by 50%
Created real-time dashboards in Tableau and Power BI, leveraging DAX and Python scripting, cutting reporting time by 40%
Enhanced SQL-based ETL workflows with Databricks, improving data retrieval by 40% through caching and materialized views, cutting compute costs by 25%
Data Engineer Intern
Mu Sigma Inc.
Bangalore
06.2019 - 06.2020
Engineered scalable multi-petabyte ELT pipelines with Databricks and Azure Data Factory, employing Delta Lake and Z-ordering to cut transformation times by 30% and optimize SFTP transfer efficiency
Created 100+ Hive metastore tables with Databricks and PySpark SQL, boosting analytical efficiency by 35%
Migrated on-prem SQL databases to Azure SQL, improving scalability and reducing query times by 40%
Data Analyst
Pennsylvania State University
Harrisburg
01.2023 - 12.2024
Designed and developed business intelligence dashboards in Power BI and Tableau, providing real-time insights into resource allocation, staff efficiency, and operational trends
Led data-driven decision-making by analyzing workforce performance metrics, ensuring optimized resource allocation during high-demand periods
Built and implemented a Python-based employee time-tracking system, reducing manual tracking effort by 40% and increasing workflow efficiency by 20%
Structured a star schema MySQL database to streamline data access for analytical workloads, enabling faster insights and improving efficiency by 30%.
Orchestrated and implemented automated ELT workflows with Apache Airflow, streamlining data extraction and transformation processes, resulting in a 25% boost in processing speed and reduced manual intervention.
Led the development of Tableau and QlikView dashboards with real-time data integration, delivering actionable insights and improving decision-making by 30%.
Data Analysis and Forecasting Project for JD.com
Automated data ingestion for 2.5M customers and 30K SKUs using Databricks and PySpark, reducing manual effort by 80%.
Crafted reusable transformation workflows with dbt and SQL Server, enhancing data model efficiency by 40% and ensuring consistency across research projects.
Developed ARIMA models for sales forecasting, improving inventory accuracy by 20% and operational efficiency by 15%.
Medical Appointments DBMS for UPMC
Led migration of IMS Mainframe databases to PostgreSQL and Oracle for UPMC, utilizing IBM InfoSphere Data Replication (IIDR) tools to implement Change Data Capture (CDC) for real-time synchronization.
Conducted risk analysis to identify data vulnerabilities, implementing access controls and validation checks that reduced security incidents by 30% and ensured data integrity.
Documented configurations and best practices for high availability and consolidation, streamlining SQL Server updates with cross-functional teams using tools like SQL Profiler and DMVs, reducing downtime by 25%.
Assistant Delivery Manager at Tata Consultancy Services, Global Shared Services