Harsha Vardhan Reddy Palleti

Middletown

Summary

Experienced Data Engineer and Analyst skilled in designing robust data architectures, optimizing data warehousing, and building efficient ETL pipelines to support business intelligence and advanced analytics. Proficient in data modelling, workflow automation, and data quality management, delivering real-time, actionable insights through dynamic dashboards. Expert at aligning data solutions with business objectives, driving operational efficiency, enhancing decision-making, and fostering business growth.

Overview

6 years of professional experience

Work History

Data Engineer

Tata Consultancy Services
Hyderabad
07.2020 - 12.2022
  • Migrated on-prem SQL databases to AWS, utilizing Databricks for ETL processing with RDS, S3, and EC2, enhancing scalability and ensuring high availability with automated failover
  • Implemented Databricks Delta Lake for schema evolution and optimized storage, reducing query latency by 35% and ensuring ACID compliance during cloud migration
  • Built scalable ETL pipelines in Databricks, integrating Apache Kafka to automate data validation workflows, reducing inconsistencies by 15% and cutting testing time by 60%
  • Optimized AWS Redshift performance by offloading transformations to Databricks, leveraging Photon Engine to improve query efficiency by 40% and reduce Redshift costs
  • Performed EDA on 5M+ records using Python and SQL, uncovering 30+ data trends and resolving anomalies that improved workflow efficiency by 20%
  • Automated inventory management with Python, Azure, and SQL Server, integrating data governance principles to ensure compliance and reduce manual tracking by 50%
  • Created real-time dashboards in Tableau and Power BI, leveraging DAX and Python scripting, cutting reporting time by 40%
  • Enhanced SQL-based ETL workflows with Databricks, improving data retrieval by 40% through caching and materialized views, cutting compute costs by 25%

Data Engineer Intern

Mu Sigma Inc.
Bangalore
06.2019 - 06.2020
  • Engineered scalable multi-petabyte ELT pipelines with Databricks and Azure Data Factory, employing Delta Lake and Z-ordering to cut transformation times by 30% and optimize SFTP transfer efficiency
  • Created 100+ Hive metastore tables with Databricks and PySpark SQL, boosting analytical efficiency by 35%
  • Migrated on-prem SQL databases to Azure SQL, improving scalability and reducing query times by 40%

Data Analyst

Pennsylvania State University
Harrisburg
01.2023 - 12.2024
  • Designed and developed business intelligence dashboards in Power BI and Tableau, providing real-time insights into resource allocation, staff efficiency, and operational trends
  • Led data-driven decision-making by analyzing workforce performance metrics, ensuring optimized resource allocation during high-demand periods
  • Built and implemented a Python-based employee time-tracking system, reducing manual tracking effort by 40% and increasing workflow efficiency by 20%

Education

MS - Information Systems

Pennsylvania State University
12.2024

BS - Computer Science

Jawaharlal Nehru Technological University
06.2020

Skills

  • Python
  • SQL
  • T-SQL
  • PySpark
  • Java
  • Bash
  • R
  • SQL Server
  • PostgreSQL
  • Oracle
  • MySQL
  • DynamoDB
  • MongoDB
  • AWS
  • Glue
  • Lambda
  • S3
  • EC2
  • Redshift
  • Kinesis
  • AWS Step Functions
  • REST APIs
  • Spring Boot
  • Microservices
  • Snowflake
  • Databricks
  • Boomi
  • Hive
  • Hadoop
  • Docker

Projects

Turo Database and Visualization Project

  • Structured a star schema MySQL database to streamline data access for analytical workloads, enabling faster insights and improving efficiency by 30%.
  • Orchestrated and implemented automated ELT workflows with Apache Airflow, streamlining data extraction and transformation processes, resulting in a 25% boost in processing speed and reduced manual intervention.
  • Led the development of Tableau and QlikView dashboards with real-time data integration, delivering actionable insights and improving decision-making by 30%.

Data Analysis and Forecasting Project for JD.com

  • Automated data ingestion for 2.5M customers and 30K SKUs using Databricks and PySpark, reducing manual effort by 80%.
  • Crafted reusable transformation workflows with dbt and SQL Server, enhancing data model efficiency by 40% and ensuring consistency across research projects.
  • Developed ARIMA models for sales forecasting, improving inventory accuracy by 20% and operational efficiency by 15%.

Medical Appointments DBMS for UPMC

  • Led migration of IMS Mainframe databases to PostgreSQL and Oracle for UPMC, utilizing IBM InfoSphere Data Replication (IIDR) tools to implement Change Data Capture (CDC) for real-time synchronization.
  • Conducted risk analysis to identify data vulnerabilities, implementing access controls and validation checks that reduced security incidents by 30% and ensured data integrity.
  • Documented configurations and best practices for high availability and consolidation, and streamlined SQL Server updates with cross-functional teams using SQL Profiler and DMVs, reducing downtime by 25%.

Timeline

Data Analyst

Pennsylvania State University
01.2023 - 12.2024

Data Engineer

Tata Consultancy Services
07.2020 - 12.2022

Data Engineer Intern

Mu Sigma Inc.
06.2019 - 06.2020

MS - Information Systems

Pennsylvania State University

BS - Computer Science

Jawaharlal Nehru Technological University