SHERINE GEORGE

Data Scientist
Pittsburgh

Summary

Data Scientist with 3 years of experience in the banking domain, holding a Master’s in Business Analytics from Carnegie Mellon University. Passionate about leveraging AI and machine learning to deliver impactful solutions.

Overview

6 years of professional experience

Work History

Senior Associate Data Scientist

BNY
07.2024 - Current
  • Developed and deployed a Large Language Model (LLM)-based taxonomy classification solution, automating data attribute mapping and enhancing accuracy for content onboarding.
  • Designed and implemented two-step and one-step LLM-driven classification pipelines, leveraging generated descriptions and advanced noise chunking strategies to improve model precision.
  • Fine-tuned LLMs and experimented with embedding models to enhance retrieval accuracy and align with business needs.
  • Integrated NVIDIA's reranker to optimize classification outputs and improve semantic search performance in large-scale datasets.
  • Engineered scalable AI workflows by implementing asynchronous processing, batch techniques, and cloud-based deployment for high-performance pipelines.
  • Built a hybrid search pipeline combining LLM-powered semantic and keyword search, enhancing document retrieval capabilities.
  • Conducted advanced analytics on LLM misclassifications to refine ground truth datasets and improve model robustness.
  • Authored technical documentation and led efforts to scale the cloud infrastructure of Eliza (BNY's in-house AI solution) for broader AI use cases.

Software Engineer - Data Science

Société Générale
11.2020 - 07.2023
  • Automated Transaction Discrepancy Reduction: Cut 3 months of manual work by using a Decision Tree Classification model to automate the identification of breaks between accounting and inventory data
  • US Branch Regulatory Reporting Automation: Saved 1 month of labor and improved accuracy by automating regulatory instruction translation with a Python script
  • Market Referential Alert Identification: Boosted automation by 50% and improved data quality using statistical concepts to automate manual market referential alert identification
  • Company-Wide Datathon Success: Secured 2nd place in a company-wide Datathon with 80% accuracy in user segmentation and target region identification using Python and Power BI
  • ESG Reporting Automation: Reduced manual effort by 70% with an innovative solution to extract and preprocess data from PDF files, earning a special mention award
  • ML Workflow Integration: Integrated a data transformation tool with an ML modeling platform, saving an estimated $1 million through workflow automation
  • Awards: Received the PAN Indian Herrising Award (Women Starters Category) for exceptional talent and the Young Turk Award 2021 for outstanding performance and contributions in data science

IT Informatics Intern

LabCorp
01.2020 - 07.2020
  • Developed a Java-based test automation framework using Maven, TestNG, Rest Assured, Extent Report, and Cucumber
  • Achieved a 40% reduction in testing time, expanded test coverage by 30%, and streamlined software updates and quality control
  • Published dissertation on this automated API testing framework in IRJET

Summer Intern

Visa Inc
05.2019 - 07.2019
  • Worked on a tool called ORMB (Oracle Revenue Management and Billing) used by Visa for billing and pricing
  • Contributed to developing a test automation solution for this tool, saving 2 months of manual effort annually

Education

Master of Science - Business Analytics

CARNEGIE MELLON UNIVERSITY - TEPPER SCHOOL OF BUSINESS
Pittsburgh, PA

Post Graduate Diploma - Data Science

INTERNATIONAL INSTITUTE OF INFORMATION TECHNOLOGY (IIIT)

Bachelors - Computer Science Engineering

RV COLLEGE OF ENGINEERING

Skills

  • Large Language Models
  • RAG
  • Python for Data Science
  • EDA
  • Data Visualization
  • Inferential Statistics
  • Hypothesis Testing
  • Machine Learning - Linear Regression
  • Logistic Regression
  • Decision tree
  • Unsupervised Learning
  • Bagging
  • Boosting
  • Model Selection
  • Principal Component Analysis
  • Advanced Regression
  • Deep Learning
  • Time Series Analysis/Forecasting
  • Neural Networks & ANN
  • Convolutional Neural Networks
  • Recurrent Neural Networks
  • R
  • Python (Pandas, NumPy, scikit-learn, Keras, Statsmodels, OpenCV, PyTorch, NLTK, TensorFlow, LangChain, OpenAI, FAISS)
  • Relational SQL and NoSQL Databases
  • .NET Full Stack
  • JavaScript
  • Java
  • Jira
  • Data visualization tools (Tableau, Power BI, Matplotlib, Seaborn, Plotly, Streamlit)
  • SQL
  • Version control systems (Git)
  • Data mining software (RapidMiner)
  • Data warehousing
  • Data extraction tools
  • ETL (SSRS, SSIS, Informatica)
  • Excel
  • Oral and Written Communication
  • Leadership
  • Listening
  • Teamwork
  • Conflict Resolution
