Location: Houston, Texas, USA
Relocation: All states
Visa: GC
R Yashwanth
Senior Data Engineer
Phone: (430) 231-1142
Email: [email protected]
Professional Summary
10+ years of experience delivering enterprise-grade ETL, ELT, and data warehousing solutions across
healthcare, finance, insurance, and public sector domains.
Expert in modern ETL frameworks, designing and optimizing scalable data pipelines using Azure Data Factory
(ADF), Databricks (PySpark/Scala), Informatica PowerCenter, and SSIS.
Cloud migration specialist with proven success in modernizing SQL Server, Oracle, and flat-file systems into
Snowflake, Azure Synapse, and AWS/GCP-native warehouses for improved scalability and performance.
Strong knowledge of real-time streaming architectures, building low-latency ingestion pipelines using
Apache Kafka, Spark Structured Streaming, and event-driven frameworks.
Implemented enterprise-wide data governance and security frameworks (HIPAA, GDPR, SOX, 21 CFR Part
11), leveraging Azure Purview for lineage, metadata, classification, and PII masking.
Built reusable ETL templates and parameterized frameworks, accelerating onboarding of new data sources
by up to 50% and enforcing consistency across projects.
Skilled in data modeling: Star and Snowflake schemas, fact-dimension modeling, and Slowly Changing Dimensions (SCD Types 1 & 2) to preserve historical data and support advanced analytics.
Proficient in DevOps and CI/CD automation for data projects, using Azure DevOps, Jenkins, and GitHub
Actions to streamline testing, deployments, and rollback processes.
Experienced in performance tuning of Spark jobs (caching, partitioning, broadcast joins) and Snowflake workloads (clustering, warehouse tuning), achieving 60% faster queries and a 25% reduction in cloud costs.
Designed and delivered Lakehouse architectures, integrating Azure Data Lake, AWS S3, and GCS with curated
Snowflake/BigQuery semantic layers for cross-domain analytics.
Created real-time pipelines for mission-critical use cases, including fraud detection, IoT telemetry, predictive
maintenance, and actuarial forecasting.
Integrated BI tools (Power BI, Tableau, SSRS, Looker) with Snowflake and SQL Server to deliver executive
dashboards, KPI scorecards, and ad hoc exploration layers.
Developed automated testing and data quality frameworks (schema validation, anomaly detection,
reconciliation checks) to ensure data accuracy, consistency, and trustworthiness.
Implemented secure, multi-tenant Snowflake environments with RBAC, column-level masking, and secure
data sharing for departmental and partner use cases.
Collaborated with cross-functional stakeholders (clinicians, actuaries, finance controllers, and compliance teams) to translate complex requirements into scalable, production-grade data solutions.
Recognized for mentorship and leadership, onboarding junior engineers, promoting best practices in ETL
design, and leading Agile delivery (Scrum, Jira, sprint planning, retrospectives).
Trusted to deliver mission-critical data platforms with real-time availability, strong compliance posture, and
measurable cost savings across global enterprises.
Technical Skills
ETL & Data Processing: Azure Data Factory (ADF), Informatica PowerCenter, SSIS, PySpark, Sqoop, Oozie
Cloud Platforms & Storage: Microsoft Azure (Databricks, Synapse, Data Lake), AWS (S3)
Programming & Scripting: Python, SQL, Scala, Shell Scripting
Data Warehousing: Snowflake, Azure Synapse Analytics, SQL Server, Oracle
Streaming & Real-Time: Apache Spark, Spark Streaming, Kafka, Delta Lake
BI & Reporting: Power BI, SSRS, Tableau
DevOps & CI/CD: Azure DevOps, Jenkins, GitHub Actions, Git
Data Modeling: Star Schema, Snowflake Schema, SCD Type 1 & 2, ERwin
Security & Governance: Azure Purview, Data Lineage, Data Masking, HIPAA & GDPR Compliance
Methodologies: Agile/Scrum, Jira, Confluence, Sprint Planning
Professional Experience
Senior Data Engineer – Azure & Snowflake
Bayer Healthcare, Whippany, NJ (May 2022 – Present)
Designed and deployed enterprise-scale ETL pipelines using Azure Data Factory to ingest large volumes of clinical
and research data into Snowflake, improving data accessibility across teams.
Optimized Spark-based data transformations in Azure Databricks, reducing batch processing runtime by ~35% and
accelerating analytics for patient outcomes.
Built a healthcare-optimized Snowflake data warehouse with clustering keys and multi-cluster warehouses to
enable scalable, high-concurrency access for analysts and clinicians.
Developed real-time streaming data pipelines with Apache Kafka and Spark Structured Streaming to ingest IoT medical device telemetry, enabling instant alerts on critical patient health events (illustrative sketch at the end of this role).
Created reusable, parameter-driven ETL frameworks and templates for onboarding new electronic medical record
(EMR) and diagnostic data sources, reducing development time by 40%.
Implemented robust data governance with Azure Purview (data cataloging, lineage) and data masking to ensure
HIPAA compliance and protect PHI across the pipeline.
Set up end-to-end CI/CD pipelines using Azure DevOps and Jenkins for ADF, Databricks notebooks, and Snowflake,
enabling automated deployments and consistent releases across environments.
Utilized Snowflake Streams and Tasks to implement incremental loading of data, reducing dashboard refresh times
and achieving near real-time data availability for stakeholders.
Performed extensive performance tuning and cost optimization on Snowflake (clustering, query profiling, caching)
and Spark jobs, cutting cloud compute costs by ~25% while maintaining SLA targets.
Key Achievements: Enabled faster insights and stronger compliance by modernizing the data platform; recognized by Bayer leadership for delivering a secure, scalable analytics architecture that lowered costs and reduced data latency from days to minutes.
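An illustrative sketch of the streaming-ingestion pattern described in this role: a PySpark Structured Streaming job that reads device telemetry from a Kafka topic, parses the JSON payload, and writes threshold-based alerts to a Delta table. The broker address, topic, schema, paths, and alert thresholds are placeholders, not details from the actual pipeline.

```python
# Hedged sketch: Kafka -> Spark Structured Streaming -> Delta alert table.
# Topic, schema, broker, thresholds, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("device-telemetry-stream").getOrCreate()

telemetry_schema = StructType([
    StructField("device_id", StringType()),
    StructField("patient_id", StringType()),
    StructField("heart_rate", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the raw telemetry stream from Kafka.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
       .option("subscribe", "device_telemetry")            # hypothetical topic
       .load())

# Parse the JSON payload and keep only out-of-range readings.
alerts = (raw.select(from_json(col("value").cast("string"), telemetry_schema).alias("t"))
          .select("t.*")
          .filter((col("heart_rate") > 120) | (col("heart_rate") < 40)))  # example thresholds

# Append alerts to a Delta table for downstream notification jobs.
query = (alerts.writeStream
         .format("delta")
         .option("checkpointLocation", "/mnt/checkpoints/telemetry_alerts")
         .outputMode("append")
         .start("/mnt/curated/telemetry_alerts"))
```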
Data Engineer – Budget Analytics & ETL Modernization
Illinois Department of Innovation & Technology (DoIT), Springfield, IL (May 2021 – Apr 2022)
Consolidated financial data from 25+ state agencies into a centralized Snowflake data warehouse using Azure Data
Factory, enabling unified statewide budget analysis and oversight.
Developed dimensional data models and implemented SCD Type 2 logic to support year-over-year trend and
variance analysis, providing dynamic historical reporting capabilities.
Authored PySpark transformation scripts to clean, normalize, and standardize multi-agency datasets, resolving schema inconsistencies and improving cross-department data accuracy by ~40% (illustrative sketch at the end of this role).
Configured fine-grained role-based access controls in Snowflake to allow secure inter-agency data sharing and
collaboration while maintaining compliance with state governance policies.
Built interactive Power BI dashboards with drill-through filters and DAX calculations on top of the Snowflake
warehouse, empowering stakeholders with on-demand insights into fund allocations, expenditures, and budget
performance.
Automated end-to-end ETL orchestration using Azure Logic Apps and ADF event triggers to enable real-time data
refresh cycles, eliminating manual intervention and delays in reporting.
Created parameterized ADF pipeline templates for onboarding new agencies, reducing setup time from 7 days to
under 48 hours and ensuring consistency across implementations.
Tuned Snowflake virtual warehouses (right-sizing, result caching) and optimized query logic to improve dashboard
responsiveness and reduce compute costs by ~30%.
Implemented incremental data loading and delta processing using Snowflake Streams and Tasks, cutting report
generation time from days to minutes and keeping data current for decision-makers.
Key Achievements: Eliminated over 5,000 hours of annual manual work by modernizing legacy Excel-based
processes. Recognized by state leadership for enabling real-time fiscal transparency and data-driven budgeting
through an automated, resilient analytics platform.
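A minimal PySpark sketch of the multi-agency standardization step referenced in this role: agency-specific columns are renamed to a shared model, values are cleaned and typed, and duplicates are dropped before the warehouse load. All paths, column names, and agency mappings are hypothetical.

```python
# Hedged sketch of multi-agency normalization prior to loading Snowflake.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date, trim, upper

spark = SparkSession.builder.appName("agency-budget-normalize").getOrCreate()

def normalize(df, column_map):
    """Rename source-specific columns to the shared model and clean the values."""
    for src, dst in column_map.items():
        df = df.withColumnRenamed(src, dst)
    return (df
            .withColumn("agency_code", upper(trim(col("agency_code"))))
            .withColumn("fiscal_date", to_date(col("fiscal_date"), "yyyy-MM-dd"))
            .withColumn("amount", col("amount").cast("decimal(18,2)"))
            .dropDuplicates(["agency_code", "fund_id", "fiscal_date"]))

# Two agencies with different source schemas mapped onto one shared model.
dot = normalize(spark.read.parquet("/raw/dot/budget"),
                {"AGY_CD": "agency_code", "TXN_DT": "fiscal_date",
                 "AMT": "amount", "FUND": "fund_id"})
dhs = normalize(spark.read.parquet("/raw/dhs/budget"),
                {"agency": "agency_code", "posting_date": "fiscal_date",
                 "amount_usd": "amount", "fund_number": "fund_id"})
unified = dot.unionByName(dhs)
```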
Senior Data Engineer – Insurance Analytics (ETL & Compliance)
AXA Insurance, New York, NY (May 2020 – Apr 2021)
Reduced insurance claims processing time by ~40% by optimizing Spark ETL workflows in Azure Databricks,
accelerating adjudication and settlements.
Built scalable Snowflake data models for actuarial forecasting and underwriting analytics, leveraging clustering
keys and multi-cluster warehouses.
Implemented real-time data ingestion with Kafka + Spark Streaming to detect fraud in near real time, improving
risk response times.
Partnered with compliance/legal teams to deliver GDPR/IFRS 17-compliant datasets, ensuring regulatory audit
readiness.
Senior Data Engineer – Azure Data Platform
Delta Airlines, Atlanta, GA (May 2019 – Apr 2020)
Developed real-time IoT ingestion pipelines using Azure Data Factory + Spark Streaming to capture aircraft
telemetry for predictive maintenance.
Modeled Snowflake data warehouses for flight, crew, and loyalty data, enabling high-concurrency queries for
global operations.
Tuned Spark ETL jobs (broadcast joins, caching) to improve execution times by 50% across large aviation datasets (see the sketch at the end of this role).
Delivered Power BI dashboards with route performance, fleet utilization, and delay KPIs, reducing decision-making
latency.
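A short sketch of the two Spark tuning techniques noted in this role: broadcasting a small dimension table into the join and caching a DataFrame that several aggregations reuse. Paths, table names, and columns are placeholders, not the actual Delta Airlines datasets.

```python
# Hedged sketch of broadcast-join and caching tuning for a Spark ETL job.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("flight-etl-tuning").getOrCreate()

flights = spark.read.parquet("/data/flight_events")   # large fact data (placeholder path)
airports = spark.read.parquet("/data/airport_dim")    # small dimension table

# Broadcast the small dimension so the join avoids shuffling the large fact side.
enriched = flights.join(broadcast(airports), on="airport_code", how="left")

# Cache the enriched DataFrame because several downstream aggregations reuse it.
enriched.cache()

delays_by_route = enriched.groupBy("route_id").avg("arrival_delay_minutes")
utilization = enriched.groupBy("aircraft_id").count()

delays_by_route.write.mode("overwrite").parquet("/curated/delays_by_route")
utilization.write.mode("overwrite").parquet("/curated/fleet_utilization")
```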
Data Engineer – Hadoop ETL & Migration
Impetus (Consulting), Dallas, TX (May 2018 – Apr 2019)
Migrated legacy MapReduce jobs to Spark-based frameworks, cutting batch runtimes by 40% and improving
reliability.
Designed HiveQL and PySpark ETL flows to process structured/semi-structured web and log data into AWS S3 (illustrative sketch at the end of this role).
Orchestrated scalable workflows using Oozie with error handling and retries, ensuring SLA compliance.
Documented metadata lineage and built data dictionaries to support downstream BI and audit teams.
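A hedged sketch of the log-processing flow described in this role: read semi-structured JSON web logs, flatten a few fields, and write partitioned Parquet back to S3. Bucket names, paths, and fields are illustrative assumptions, not the actual Impetus sources.

```python
# Hedged sketch: semi-structured web logs -> cleaned, partitioned Parquet on S3.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

spark = SparkSession.builder.appName("weblog-etl").getOrCreate()

# Placeholder bucket and layout for the raw JSON logs.
logs = spark.read.json("s3a://raw-weblogs/2019/*/*.json")

# Flatten nested fields and drop records with no response status.
cleaned = (logs
           .select(
               col("request.url").alias("url"),
               col("request.method").alias("method"),
               col("response.status").cast("int").alias("status"),
               col("client.ip").alias("client_ip"),
               to_date(col("timestamp")).alias("event_date"))
           .filter(col("status").isNotNull()))

# Write curated output partitioned by day for downstream BI and audit use.
(cleaned.write
        .mode("append")
        .partitionBy("event_date")
        .parquet("s3a://curated-weblogs/clean/"))
```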
ETL Developer – Informatica & SQL Server BI
Harmonia Holdings Group, Maryland (Apr 2016 – Apr 2018)
Built ETL pipelines with Informatica PowerCenter and SSIS to integrate ERP, flat-file, and MySQL data into a SQL Server data warehouse.
Implemented SCD Type 2 logic in dimensional models to preserve historical accuracy for finance/HR analytics.
Automated SSRS dashboard refreshes and ETL schedules with SQL Server Agent, reducing manual work by 50%.
Tuned SQL queries, Informatica mappings, and cache settings to improve pipeline throughput and reporting speed.
Junior Data Analyst / BI Developer
Client: Ashburn, VA (Jun 2014 – Mar 2016)
Developed SSIS and T-SQL ETL workflows for daily infrastructure health and asset utilization reporting.
Built and maintained Power BI/SSRS dashboards for uptime, performance, and incident tracking.
Authored stored procedures and dimensional models to support ad hoc and SLA-bound reporting.
Created automated SQL/SSIS error alerts, ensuring 99% SLA compliance for operational reports.
Education
Master of Science in Management Information Systems – Lamar University, Beaumont, TX (May 2014)