Nikitha Bheemi Reddy - Data Engineer
Location: Boston, Massachusetts, USA | Relocation: Yes | Visa: GC
Phone: +1 781-813-4320
Email: [email protected]
https://www.linkedin.com/in/nikitha-b-296964353/

PROFESSIONAL SUMMARY
Senior Data Engineer with 12+ years of experience administering enterprise BI platforms, including Power BI (Microsoft Fabric) and Tableau Cloud. Expert in BI governance, security (RLS/OLS), CI/CD, platform operations, and Tableau Server-to-Cloud migrations, enabling secure, scalable analytics at enterprise scale.
Strong expertise in T-SQL development, complex stored procedures, query performance tuning, and SQL Server Agent scheduling.
Supported Adobe Campaign by integrating campaign, delivery, recipient, and engagement data into Snowflake for analytics and reporting.
Proven experience designing ETL pipelines, data marts, and dimensional models, and migrating legacy systems to modern platforms with full data validation and reconciliation.
Hands-on experience supporting full SDLC, collaborating with business users, BAs, and system integrators in regulated environments.
Built and maintained SQL- and Python-based pipelines to process Adobe Campaign data with high data quality and SLA compliance.
Strong expertise in Snowflake, SQL, Python, Airflow, Informatica PowerCenter, and Azure-based data engineering.
Collaborated closely with Marketing Operations teams to translate campaign requirements into analytics-ready datasets.
Proven experience migrating Oracle-based data warehouses to Snowflake and enabling cloud-native analytics.
Hands-on experience supporting Marketing Operations by integrating campaign and customer engagement data into Snowflake for reporting and analytics.
Supported Adobe Campaign upgrade initiatives by validating schemas, reconciling historical data, and ensuring backward compatibility.
Experienced in CI/CD, Agile delivery, and cross-functional collaboration with Marketing and Data Services teams.
Experienced Data Engineer with strong Azure, Synapse, and ADLS expertise, specializing in building analytics-ready Gold Layer data models to support enterprise BI and Power BI consumption.
Skilled in designing Star/Snowflake data models, optimizing curated datasets, and enabling secure, governed data access through RBAC and integration with BI teams.
Delivered Snowflake tables, views, and transformations to enable email marketing performance and campaign effectiveness reporting.
Hands-on experience with Large Language Models (LLMs), GPT-based models, RAG pipelines, and Agentic AI workflows.
Hands-on experience supporting BI and reporting teams with scalable, validated, and performance-optimized datasets aligned with enterprise data governance standards.
Provided production support for Adobe Campaign data feeds, troubleshooting data discrepancies and pipeline failures.
Experienced SDET with strong expertise in automation frameworks, API testing, and CI/CD-driven test pipelines.
Skilled in designing scalable automation frameworks using Java/Python, Selenium, and REST Assured.
Hands-on experience in integrating automated tests into AWS DevOps/Jenkins pipelines for continuous testing.
Hands-on familiarity with Qlik Sense dashboards and Qlik Replicate for real-time data replication from SQL and cloud data sources into Snowflake.
Proven expertise in designing Oracle-based data architectures, automating ETL workflows, and implementing dimensional models supporting large-scale analytics and modernization initiatives.
Adept with Azure services and their components, including Azure Data Factory (ADF), Azure Databricks, Azure Synapse Analytics, Azure Data Lake Storage Gen2 (ADLS Gen2), Azure Blob Storage, Key Vault, Azure Logic Apps, Azure Function Apps, and Azure DevOps services.
Strong DevOps background with hands-on experience in automated code promotion, release management, and environment governance across cloud data platforms.
Designed enterprise lakehouse architectures using Delta Lake, Apache Iceberg, and Apache Hudi with ACID transactions, schema evolution, and time travel.
Skilled in deploying Snowflake objects (roles, warehouses, schemas, RBAC policies) using Terraform IaC modules and GitOps workflows.
Expert in building and optimizing batch and streaming data pipelines on Databricks using Python, PySpark, Delta Lake, and Structured Streaming, enabling feature engineering for ML and LLM workflows.
Skilled in operational pipeline management, debugging workflows, and resolving production data issues to ensure SLA compliance and high data reliability.
Experienced in building and automating CI/CD pipelines for Snowflake, Airflow, and Informatica IICS using GitHub Actions/Azure DevOps.
Proficient in Grafana and Datadog to implement dashboards, alerts, and distributed monitoring across cloud data pipelines and compute workloads.
Multi-cloud expert with hands-on experience across Azure, AWS, and GCP, integrating cross-cloud ingestion, storage, compute, and governance layers.
Familiar with core mainframe data structures, including fixed-width datasets, COBOL copybooks, and hierarchical IMS/DB-style formats, with hands-on experience integrating these sources into modern cloud ETL pipelines.
Worked on legacy modernization projects migrating on-prem and traditional systems to cloud platforms, providing transferable understanding of mainframe-to-cloud integration patterns.
Strong experience administering Databricks Unity Catalog, implementing catalog/schema/table permissions, lineage, ACLs, and fine-grained governance.
Experienced in designing and developing high-performance data pipelines using Databricks (PySpark, Delta Lake), Azure Data Factory (ADF), and Azure SQL Server for scalable data integration and analytics.
Hands-on experience with Apache Iceberg table creation, schema evolution, partitioning, and integration with Snowflake.
Senior Data Engineer / Architect with expertise in designing enterprise-scale Data Fabric architectures on Azure and AWS for multi-source integration, analytics, and governance.
Performed deep data analysis to identify data quality gaps, detect anomalies, and improve business data accuracy across financial datasets.
Implemented API-driven ingestion frameworks in Python and PySpark for batch and near real-time workloads.
Strong experience in data analysis, data tracing, and remediation using SQL, AWS Athena, and Redshift to ensure data integrity and accuracy across large datasets.
Partnered with architecture and DevOps teams to deploy automated, version-controlled CI/CD pipelines (GitHub Actions, Azure DevOps) for repeatable and reliable data engineering workflows.
Designed and maintained PySpark-based ETL pipelines in Azure Databricks and AWS EMR, enabling scalable data ingestion, transformation, and validation for analytical reporting.
Hands-on experience with Infrastructure as Code (IaC) using Terraform and Azure DevOps pipelines for automated provisioning and configuration of Azure infrastructure.
Expert in designing and deploying Snowflake-based ETL pipelines on AWS cloud, leveraging SnowSQL, Snowpark, and Python for efficient data processing.
Developed and managed Azure PaaS databases to enable seamless integration with web applications, enhancing performance and reliability.
Designed and implemented Azure Function Apps to deploy serverless, event-driven applications, utilizing triggers and integrating with Azure Key Vault for secure management of cryptographic keys and secrets.
Deployed Azure Functions, Azure Storage, and Service Bus queues, optimizing enterprise ERP integration systems for streamlined data processing and communication in complex environments.
Monitored Snowpipe performance and utilization metrics, optimizing configurations to improve data ingestion speed and reliability.
Collaborated with cross-functional teams to deploy data solutions in Snowflake, ensuring alignment with business objectives and data governance standards.
Demonstrated mastery of SDLC management, skillfully applying Agile methodology to steer iterative development and continuous software project improvement.
Led proof-of-concept (POC) initiatives utilizing Snowflake, Airflow, and dbt, exploring data warehousing, workflow orchestration, and transformation capabilities within a modern data engineering ecosystem; a minimal sketch of this pattern follows below.
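A minimal sketch of the POC pattern described above, assuming Airflow 2.4+ with dbt installed on the worker; the DAG name, paths, and schedule are hypothetical:

# Hypothetical Airflow DAG: orchestrate a dbt transformation run against
# Snowflake. Project/profile paths and the schedule are assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="snowflake_dbt_poc",        # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # Airflow 2.4+ style argument
    catchup=False,
) as dag:
    # Run dbt models, then tests; dbt reads Snowflake credentials from
    # its own profiles.yml (paths assumed).
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/poc --profiles-dir /opt/dbt",
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/poc --profiles-dir /opt/dbt",
    )
    dbt_run >> dbt_test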
EDUCATION
Master's in Computer Science, University of Central Missouri, Dec 2013
Bachelor's in Computer Science and Engineering, A.S.N. Women's Engineering College, May 2012
CERTIFICATIONS
DP-900: Microsoft Certified Azure Data Fundamentals
DP-203: Microsoft Certified Azure Data Engineer Associate
AZ-305: Designing Microsoft Azure Infrastructure Solutions
SnowPro Core Certification
SnowPro Advanced
TECHNICAL SKILLS
Cloud Services (Azure): Data Factory (ADF), Databricks, Synapse, Data Lake, Event Hubs, Key Vault, Logic Apps, Functions, Azure DevOps.
Cloud Services (AWS): EMR, Glue, Lambda, S3, Redshift, DynamoDB, Aurora, OpenSearch, Amazon ECS, Amazon EKS, Kubernetes, AWS DataSync, HashiCorp Vault, Amazon VPC (public/private subnets, route tables, Internet Gateway, NAT Gateway, Security Groups, NACLs), AWS Organizations, IAM (cross-account roles), AWS KMS, Secrets Manager, Parameter Store, CloudWatch, CloudTrail, SNS, SQS, EventBridge.
Managed a multi-account AWS environment using AWS Organizations, implementing Service Control Policies (SCPs) and cross-account IAM roles.
Implemented secure secrets management using AWS KMS, Secrets Manager, and Parameter Store with encryption at rest and in transit.
Designed event-driven architectures using EventBridge, SNS, SQS, and Lambda for scalable and loosely coupled integrations.
Designed and configured Amazon VPC environments, including public/private subnets, route tables, NAT Gateways, Security Groups, and NACLs, to enable secure workload isolation.
Data Quality Tools: Informatica Data Quality (IDQ) rule creation, data profiling, monitoring.
Financial Data Platforms: FactSet, Security Master, Holdings datasets.
Monitoring & Automation: SLA monitoring, alerting systems, AWS CloudWatch, Splunk, OpenSearch.
Marketing Platforms: Adobe Campaign (data integration & analytics support)
Oracle (11g/12c/19c): PL/SQL, Partitioning, Index Optimization, Materialized Views, Performance Tuning, Oracle Analytics Cloud (conceptual)
Mainframe-Adjacent Skills: COBOL copybook integration, fixed-width parsing, mainframe extract processing, VSAM-style dataset handling, JCL-inspired batch workflow orchestration (Control-M, Autosys)
Data Virtualization: Denodo (Data Federation, Caching, Query Optimization, Lineage Tracking); Palantir Foundry (conceptual experience: ontology, pipeline design, governance workflows)
Big Data Technologies: Hadoop (1.x and 2.x), Hortonworks HDP (2.4/2.6), HDFS, YARN, MapReduce, Pig, HBase, Hive, Sqoop, Flume, Spark, Oozie, Airflow, Ambari, and Apache Kafka.
Containers & Orchestration: Docker, Kubernetes (EKS).
Programming Languages: Java, Python, C#, PySpark, SparkSQL, SQL, PL/SQL, MapReduce, Pig, Shell scripting (Linux/Unix).
Databricks: Delta Live Tables (DLT), Unity Catalog, MLflow, Databricks Workflows, Delta Optimization (Z-ORDER, file compaction)
ETL Tools: IBM Information Server 11.5/9.1/8.7/8.5, IBM InfoSphere DataStage 8.1.0, Ascential DataStage 7.5.x, QualityStage, Talend 6.4, SSIS, SSRS, Informatica.
Business Intelligence: Power BI, SAP Business Objects 11.5, Qlik Sense, Tableau.
Scheduling: Control-M, Autosys, Oozie, Apache Airflow.
Version Control & CI/CD: Git, Jenkins, CI/CD pipelines.
Databases (NoSQL): HBase, Cassandra.
Databases (Row-oriented): Oracle 11g/10g, MS SQL Server, MySQL, Teradata V2R5/V2R6, DB2.
Databases (Columnar): HP Vertica.
BI PLATFORM ADMINISTRATION
Power BI Service & Microsoft Fabric administration: workspace lifecycle, capacities, deployment pipelines, gateways, refresh schedules (see the refresh-automation sketch after this list)
Tableau Cloud & Tableau Server administration: sites/projects, permissions, extracts, schedules, Tableau Bridge
Security & governance: RLS/OLS, sensitivity labels, auditing, Entra ID (Azure AD) SSO
CI/CD for BI: GitHub-based SDLC for Power BI (.pbip), branching/PRs, automated deployments
Platform monitoring: capacity usage, adoption metrics, health/admin dashboards, incident troubleshooting
Migration leadership: Tableau Server to Tableau Cloud content inventory, cleanup, permission mapping, cutover
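To illustrate the refresh administration above, a minimal sketch that triggers a Power BI dataset refresh and checks its status through the Power BI REST API; the workspace and dataset IDs are placeholders, and acquiring the Entra ID access token (e.g., via MSAL) is assumed to happen elsewhere:

# Sketch: trigger a Power BI dataset refresh and read back the latest
# refresh status. WORKSPACE_ID, DATASET_ID, and the token are placeholders.
import requests

WORKSPACE_ID = "<workspace-guid>"   # hypothetical
DATASET_ID = "<dataset-guid>"       # hypothetical
ACCESS_TOKEN = "<entra-id-token>"   # obtained via MSAL or similar (assumed)

BASE = f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}/datasets/{DATASET_ID}"
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

# Kick off an asynchronous refresh (the service returns 202 Accepted).
requests.post(f"{BASE}/refreshes", headers=HEADERS).raise_for_status()

# Poll the most recent refresh entry for its status.
history = requests.get(f"{BASE}/refreshes?$top=1", headers=HEADERS)
history.raise_for_status()
print(history.json()["value"][0]["status"])  # "Unknown" while running, then "Completed"/"Failed"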
WORK EXPERIENCE
Client: NTT Data Jan 2023-Present
Role: Lead Data Engineer, Canton, MI
Responsibilities:
Administered Power BI Service environments including workspace lifecycle management, dataset refresh scheduling, app publishing, and access control for enterprise users.
Led Tableau Server content inventory and cleanup activities in preparation for migration to Tableau Cloud.
Developed Tableau dashboards with drill-down, cross-tab, parameterized views, and summary reports to support operational and executive users.
Worked in highly regulated environments with strict data privacy, access controls, auditability, and compliance requirements, aligning closely with HIPAA-style governance models.
Designed Tableau reports using dynamic grouping, calculated fields, filters, and sorting based on business requirements.
Integrated various data sources with Tableau to build real-time visualizations for decision-makers in marketing, sales, and finance.
Managed Power BI Gateways (on-prem and cloud), troubleshooting refresh failures, latency issues, and data source connectivity problems.
Supported Adobe Campaign data integrations by building and maintaining Snowflake pipelines for campaign and customer engagement data.
Implemented RBAC, encryption, lineage, and access monitoring suitable for PHI-sensitive datasets.
Mapped users, groups, and permissions from on-prem Tableau Server to Tableau Cloud using enterprise identity standards.
Published and maintained Tableau dashboards on Tableau Server, managing refresh schedules and user access.
Collaborated with Marketing teams to deliver analytics-ready datasets supporting email and digital marketing initiatives.
Supported Marketing Operations by building and maintaining Snowflake data pipelines for campaign, customer, and engagement data.
Configured Tableau Bridge to support secure connectivity to on-prem data sources during cloud transition.
Implemented and maintained Row-Level Security (RLS) and Object-Level Security (OLS) aligned with business roles and Entra ID (Azure AD) groups.
Collaborated with business users and analysts to validate reporting requirements and improve data usability.
Validated dashboards and data sources post-migration to ensure data accuracy, performance, and refresh reliability.
Developed and orchestrated Snowflake ELT pipelines using SQL, Python, Airflow, and dbt-style transformation patterns.
Supported Power BI deployment pipelines across Dev/Test/Prod environments to enable controlled releases and minimize production issues.
Designed and implemented SSIS packages for data ingestion, cleansing, transformation, validation, and data distribution.
Designed and maintained SQL-based data models to support email marketing performance, audience segmentation, and campaign reporting.
Built ETL workflows to integrate external data sources with in-house SQL Server databases.
Worked directly with Marketing Operations teams to support Adobe Campaign reporting and analytics requirements.
Implemented error handling, logging, and reconciliation in SSIS packages to ensure data accuracy and reliability.
Collaborated with Marketing Operations teams to translate campaign reporting requirements into scalable data pipelines.
Configured and maintained Apache Atlas workflows for metadata cataloging, lineage tracking, and policy enforcement across fund administration data pipelines.
Supported Adobe Campaign upgrade activities by validating data pipelines, schemas, and historical campaign data.
Monitored pipelines to ensure healthy execution and performed periodic unit testing on data received from those pipelines.
Designed and implemented large-scale data architectures using Azure Synapse, ADLS Gen2, and Databricks to support batch, streaming, and real-time analytics.
Used Azure Logic Apps to trigger notifications and updates based on transaction status.
Designed and delivered Star/Snowflake data models to support Power BI and enterprise reporting needs.
Designed automated ETL pipelines integrating Oracle on-prem data sources with cloud data platforms (Snowflake, Azure SQL), applying reusable transformation logic in Python and SQL.
Designed and automated CI/CD pipelines using AWS CodePipeline and CodeBuild to continuously execute UI and API test suites for every code commit, improving deployment reliability and reducing manual validation effort by 70%.
Integrated automated tests into the build-and-release workflow, enabling early defect detection and faster feedback loops for developers.
Built curated Gold/Silver/Bronze layers in ADLS/Synapse for analytics and BI consumption.
Designed and implemented scalable ETL/ELT pipelines using Qlik Replicate and Qlik Compose (conceptually aligned with Databricks and ADF frameworks) for real-time data ingestion from relational and cloud data sources into Snowflake.
Led data domain activation across multiple business units by defining sub-domains, identifying critical data elements (CDEs), and driving stewardship assignments.
Built end-to-end CI/CD pipelines for Snowflake, automating SQL deployments, RBAC provisioning, Streams/Tasks, and object creation using GitHub Actions/Azure DevOps.
Built and maintained data pipelines supporting investment, fund administration, and portfolio reporting datasets, including holdings, transactions, valuations, and reference data.
Implemented Terraform-based IaC for Snowflake roles, warehouses, resource monitors, pipes, and integrations, ensuring repeatable and compliant deployments.
Architected Data Lake zones (Bronze/Silver/Gold) with Delta Lake/Iceberg/Hudi, enabling ACID transactions, schema evolution, and time travel.
Developed high-quality, performance-optimized datasets used by BI teams for dashboards and reports.
Built and maintained Qlik Sense dashboards to visualize operational metrics and data quality KPIs, integrating with Snowflake and AWS data sources for interactive analysis.
Collaborated on Oracle data migration and modernization initiatives, ensuring seamless data transfer and performance optimization in hybrid environments.
Designed and implemented enterprise Data Governance frameworks including data cataloging, metadata lineage, and stewardship processes across cloud platforms (Azure, AWS).
Collaborated with investment operations, actuarial, and finance teams to translate business rules into scalable ETL/ELT workflows.
Automated Informatica IICS deployments (mappings, tasks, connections) using API-based scripts and CI/CD workflows for Dev/QA/Prod promotion.
Configured Denodo data virtualization layer to integrate data from Oracle, Snowflake, and AWS S3, enabling a unified analytics view without physical data movement.
Built Python and Java-based microservices exposing AI capabilities (summarization, classification, and data retrieval) via REST endpoints.
Managed, monitored, and debugged complex ETL/ELT pipelines across Snowflake, Databricks, and Azure Data Factory, ensuring operational stability and high performance.
Designed Snowflake and Azure-based data models to support investment performance, risk, and exposure analysis.
Developed SQL-based validation and reconciliation frameworks to ensure consistency and accuracy across multiple data layers and systems.
Scheduled and monitored ETL pipelines using Control-M and Autosys for batch and streaming workflows, ensuring timely data availability.
Created and executed SQL-based test scripts to verify transformation logic, joins, and aggregations within Snowflake and Azure pipelines.
Created and maintained data documentation artifacts, including data dictionaries, lineage diagrams, and mapping documents for ETL workflows.
Built an indexing and retrieval framework to support enterprise data onboarding, integrating Azure AI Search, Databricks, and FastAPI services.
Collaborated in Agile sprints to deliver incremental data engineering solutions, ensuring timely releases and continuous integration.
Implemented Snowflake Streams and Tasks to enable incremental ETL processing and real-time data ingestion (see the sketch at the end of this section).
Led architectural decisions for enterprise-scale data pipelines ensuring scalability, performance, and operational efficiency.
Communicated technical findings, pipeline issues, and remediation steps to non-technical users and senior stakeholders, facilitating informed decision-making.
Collaborated with DevOps and CloudOps teams to support deployment management, hotfix releases, and version control through Azure DevOps and GitHub Actions.
Used Azure Data Lake Storage Gen2 as a central repository in the Azure cloud, enabling scalable storage and efficient management of diverse data types for streamlined processing and analysis.
Collaborated with business stakeholders to define KPIs and ROI for AI initiatives.
Partnered with business stakeholders to analyze recurring failures and implement long-term preventive measures to enhance data reliability.
Developed complex SQL queries in Dremio leveraging window functions, CTEs, and analytical aggregations for performance-intensive analytics workloads.
Conducted query profiling and performance optimization in Dremio and PostgreSQL, reducing latency and improving interactive query response times.
Developed SLA monitoring and alerting frameworks to detect pipeline failures and ensure compliance with business requirements proactively.
Built ETL/ELT pipelines for structured and semi-structured datasets using Dremio, ensuring seamless integration across data sources and consumers.
Built and optimized Airflow DAGs (Directed Acyclic Graphs) to orchestrate complex data workflows, ensuring the reliable execution of data transformations and monitoring for failures or delays.
Utilized Airflow to integrate DBT with cloud-based data platforms such as Snowflake or BigQuery, streamlining the ETL process and automating data pipeline management for improved operational efficiency.
Worked with Azure Cosmos DB to manage and optimize the storage and retrieval of large-scale, globally distributed data, ensuring high availability and low-latency access to mission-critical data for real-time applications.
Implemented and optimized stored procedures in JavaScript within Azure Cosmos DB for efficient data processing, enabling seamless integration between the database and application layer for real-time data querying and reporting.
Designed and implemented Cosmos DB collections and partitioning strategies to ensure efficient data access, improve query performance, and scale as data volumes grew, meeting high-performance and scalability requirements.
Developed and optimized stored procedures in JavaScript to perform complex data manipulations and transformations within Cosmos DB, reducing the need for external data processing and ensuring data consistency.
Applied scaling strategies to Cosmos DB by implementing partitioning and indexing techniques, enabling efficient querying and faster response times even as the data volume increased.
Addressed data issues by conducting thorough root cause analysis to identify and resolve discrepancies in data integrity, ensuring data consistency and accuracy across systems.
Developed data validation and reconciliation frameworks using Python and SQL to ensure data integrity across bronze, silver, and gold layers.
Performed root cause analysis on production data failures and performance bottlenecks, implementing corrective actions to improve data quality and reduce downtime in production environments.
Provided production support for data pipelines, troubleshooting issues in DBT models, Airflow workflows, and Cosmos DB integrations, ensuring that data processes remained operational and efficient.
Collaborated with cross-functional teams to address and resolve data issues by designing and implementing solutions that improved data quality and streamlined data flows across systems.
Improved data governance and quality processes by addressing root cause issues in production environments, identifying recurring problems, and implementing proactive measures to prevent future data inconsistencies.
Optimized data workflows in DBT, Airflow, and Cosmos DB to ensure that ETL processes were efficient, scalable, and capable of handling large volumes of data while minimizing errors and performance degradation.
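A minimal sketch of the Streams-and-Tasks incremental pattern referenced in this section, using the snowflake-connector-python package; account credentials and all object names are hypothetical:

# Sketch: create a change-capture stream on a raw table and a task that
# periodically merges captured rows into a curated table. Names assumed.
import snowflake.connector

conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",  # assumed auth
    warehouse="ETL_WH", database="ANALYTICS", schema="RAW",
)
cur = conn.cursor()

# CDC stream over the raw orders table.
cur.execute("CREATE OR REPLACE STREAM RAW.ORDERS_STREAM ON TABLE RAW.ORDERS")

# Task that wakes every 5 minutes, but only runs when the stream has data.
cur.execute("""
    CREATE OR REPLACE TASK CURATED.MERGE_ORDERS
      WAREHOUSE = ETL_WH
      SCHEDULE = '5 MINUTE'
      WHEN SYSTEM$STREAM_HAS_DATA('RAW.ORDERS_STREAM')
    AS
      MERGE INTO CURATED.ORDERS t
      USING RAW.ORDERS_STREAM s ON t.ORDER_ID = s.ORDER_ID
      WHEN MATCHED THEN UPDATE SET t.STATUS = s.STATUS
      WHEN NOT MATCHED THEN INSERT (ORDER_ID, STATUS) VALUES (s.ORDER_ID, s.STATUS)
""")

# Tasks are created suspended; resume to start the schedule.
cur.execute("ALTER TASK CURATED.MERGE_ORDERS RESUME")
conn.close()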

Client: Deloitte Nov 2019-Jan 2023
Role: Azure Data Engineer, Suwanee, GA
Responsibilities:
Optimized SSIS package execution using parallelism, incremental loads, and performance tuning techniques.
Supported Microsoft Fabric workspaces and capacities, with working knowledge of OneLake concepts and Dataflows Gen2 for enterprise BI enablement.
Enabled analytics-ready datasets for downstream BI tools and dashboards consumed by Marketing stakeholders.
Worked with BI teams to validate data models for RLS/OLS and governed data access requirements.
Performed ETL operations using Azure Databricks and successfully migrated on-premises Oracle ETL processes to Azure Synapse Analytics.
Supported campaign platform upgrades by validating schemas, reconciling historical data, and ensuring continuity across releases.
Integrated IICS deployment steps into existing DevOps pipelines, enabling zero-downtime releases and consistent parameter management.
Implemented Synapse Dedicated SQL Pool pipelines and stored procedures for enterprise-grade transformations.
Developed Qlik data models (Snowflake schema) following associative modeling principles to enable self-service analytics and faster query responses.
Facilitated data ownership and stewardship governance operating models, ensuring accountability across business, analytics, and IT teams.
Partnered closely with Power BI developers to deliver datasets optimized for dashboard performance and refresh cycles.
Implemented Azure RBAC and data-access controls to ensure secure and compliant data consumption.
Established business glossaries, data dictionaries, and standardized naming conventions to enhance data discoverability and consistency.
Implemented monitoring and alerting for IICS job failures and latency issues using CloudWatch / Grafana.
Implemented metadata management processes including glossary alignment, lineage documentation, and attribute definitions.
Built and optimized data ingestion pipelines for fund admin data using Python, PySpark, and Atlas APIs, ensuring accurate metadata capture and schema consistency.
Monitored and optimized Qlik ETL performance, applying caching, incremental reloads, and optimized query structures to reduce latency and improve system throughput.
Developed governance frameworks covering data quality, metadata, lifecycle management, access controls, and compliance.
Led Oracle OTC to Snowflake migration, designing data models, transformation logic, and validation frameworks using AWS Glue and SnowSQL.
Partnered with business and IT stakeholders to define data domain activation playbooks covering definitions, lineage, data quality, and lifecycle management.
Developed and maintained ETL/ELT data pipelines in Databricks and Azure Data Factory to process structured and semi-structured financial datasets.
Collaborated with DevOps to containerize LLM services using Docker and deploy to EKS/AKS clusters for production inference.
Collaborated with security teams to implement RBAC for Kubernetes clusters and Azure role assignments for resource-level governance.
Developed automated data validation scripts in Python and SQL to reconcile migrated data volumes and ensure referential integrity across systems (a sketch follows at the end of this section).
Integrated Azure SQL Server with Databricks for data loading and transformations, optimizing read/write performance through partitioning and indexing strategies.
Designed incremental load processes using Snowflake Streams and Tasks, ensuring low-latency and high-performance data pipelines.
Implemented Azure Application Gateway/WAF for ingress traffic management, SSL termination, and centralized security enforcement.
Automated backup and disaster recovery workflows for critical services using Azure-native tooling and IaC modules.
Utilized Python automation to generate operational health reports and manage error-handling mechanisms in production pipelines.
Built and deployed custom LLM endpoints using Azure OpenAI Service for internal knowledge automation.
Proficient in Informatica PowerCenter for data integration, transformation, and ETL processes, ensuring seamless data flow and accuracy within complex business environments; experienced in designing and implementing scalable solutions for data warehousing and analytics.
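A minimal sketch of the migration reconciliation approach referenced in this section, comparing row counts, a column checksum, and referential integrity in PySpark; table and column names are hypothetical:

# Sketch: reconcile a migrated table against its source. Names assumed.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("migration-reconciliation").getOrCreate()

source = spark.table("oracle_stage.transactions")     # assumed staged copy
target = spark.table("snowflake_stage.transactions")  # assumed staged copy

# 1) Row-count parity.
src_count, tgt_count = source.count(), target.count()
assert src_count == tgt_count, f"Row count mismatch: {src_count} vs {tgt_count}"

# 2) Checksum parity on a numeric measure.
src_sum = source.agg(F.sum("amount").alias("s")).first()["s"]
tgt_sum = target.agg(F.sum("amount").alias("s")).first()["s"]
assert src_sum == tgt_sum, f"Checksum mismatch: {src_sum} vs {tgt_sum}"

# 3) Referential integrity: every target key must exist in the source.
orphans = target.join(source, on="transaction_id", how="left_anti").count()
assert orphans == 0, f"{orphans} target rows missing from source"
print("Reconciliation passed")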

Client: GM Financial Dec 2017-Nov 2019
Role: Data Engineer, Atlanta, GA
Responsibilities:
Utilized Sqoop for periodic ingestion of data from MySQL into HDFS, ensuring seamless integration and efficient data transfer within big data environments, facilitating robust data processing and analysis workflows.
Performed aggregations on large volumes of data using Apache Spark and Scala and stored the results in the Hive warehouse for further analysis (see the sketch at the end of this section).
Implemented governed data access and lineage for investment data using metadata cataloging, RBAC, and data governance frameworks.
Supported end-to-end investment data lifecycle, including ingestion, transformation, governance, and analytics enablement.
Utilized Unity Catalog for secure data governance and lineage tracking across Databricks environments.
Contributed to Lakehouse optimization by leveraging Apache Iceberg and Delta Lake for versioned, ACID-compliant, and performant analytical tables.
Implemented AWS DataSync to migrate on-premises data to S3 securely and efficiently, optimizing data transfer performance and monitoring via CloudWatch.
Automated Airflow environment configuration (connections, variables, secrets) using Terraform and Python-based admin scripts.
Designed audit dashboards and reconciliation reports to monitor fund data ingestion status, validation errors, and Atlas lineage coverage.
Developed and maintained ETL workflows using IBM DataStage 11.5, integrating data from relational and big data systems for downstream analytics.
Configured VPC networking and IAM policies to ensure secure access between AWS services and on-prem systems.
Automated Databricks deployments and pipeline versioning using Azure DevOps, Databricks CLI, and Git integration.
Engaged with Data Lakes and prominent big data ecosystems such as Hadoop, Spark, Hortonworks, and Cloudera, orchestrating data processing and analytics tasks within scalable and distributed computing environments.
Integrated AWS CloudWatch alarms and SNS notifications to monitor job performance and trigger remediation workflows.
Ingested and transformed extensive volumes of Structured, Semi-structured, and Unstructured data, leveraging big data technologies to handle diverse data formats efficiently within scalable distributed computing environments.
Strong expertise in AWS cloud architecture, including VPC networking, multi-account governance using AWS Organizations, container orchestration (ECS/EKS), and event-driven serverless systems.
Implemented Apache Ambari for centralized management and monitoring of Big Data infrastructure, streamlining administration tasks and ensuring optimal performance across Hadoop clusters.
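The aggregate-and-store pattern referenced in this section was built in Spark/Scala; a PySpark equivalent sketch, with hypothetical table and column names, looks like this:

# Sketch (PySpark equivalent of the Scala job): aggregate raw payment data
# and persist the result as a Hive table. All names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("daily-loan-aggregates")
    .enableHiveSupport()   # required to write managed Hive tables
    .getOrCreate()
)

raw = spark.table("raw.loan_payments")  # ingested earlier (e.g., via Sqoop)

daily = (
    raw.groupBy("loan_id", F.to_date("payment_ts").alias("payment_date"))
       .agg(
           F.sum("amount").alias("total_paid"),
           F.count("*").alias("payment_count"),
       )
)

# Persist to the Hive warehouse for downstream analysis.
daily.write.mode("overwrite").saveAsTable("analytics.daily_loan_payments")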

Client: Morgan Stanley Feb 2014-Dec 2017
Role: Data Warehouse Developer, Jersey City, NJ
Responsibilities:
Experience as a SQL Server Analyst/Developer/DBA specializing in SQL Server 2012, 2014, and 2016 within data warehousing environments.
Experience developing complex stored procedures, efficient triggers, and required functions, and creating indexes and indexed views for performance.
Participated in all phases of the SDLC, including requirements gathering, design, development, testing, deployment, and support.
Developed and deployed Airflow DAGs through CI/CD pipelines, using Git triggers and containerized test environments.
Designed and implemented enterprise-class data warehouses on Oracle 11g/12c, leveraging dimensional (star and snowflake) data models for analytical reporting and performance optimization.
Developed advanced PL/SQL procedures, packages, and functions to automate ETL workflows, ensuring high data quality and efficient batch processing.
Created and tuned complex SQL and PL/SQL scripts for data extraction, transformation, and loading from flat files and relational sources.
Implemented Airflow observability dashboards, SLA monitoring, retry rules, and automated failure alerts integrated with Slack/Email/Splunk (see the sketch at the end of this section).
Optimized query performance using indexes, partitioning, and materialized views to reduce data processing time across large datasets.
Developed jobs, configured SQL Mail Agent, set up alerts, and scheduled DTS/SSIS packages within the data warehousing environment.
Developed end-to-end ETL solutions by integrating SSIS for data extraction, transformation, and loading, and SSRS for creating detailed, interactive reports based on processed data.
Integrated SSIS and SSRS solutions to automate the generation of reports, ensuring accurate and timely delivery to stakeholders.
Managed and updated Erwin models (logical/physical data modeling) for the Consolidated Data Store (CDS), Actuarial Data Mart (ADM), and Reference DB according to user requirements.
Proficient in designing and implementing dimensional modeling techniques, including star and snowflake schemas, to optimize data storage, streamline querying processes, and enhance reporting efficiency in data warehousing environments.
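A minimal sketch of the retry/SLA/alerting configuration referenced in this section; the DAG name, timings, and alert destination are hypothetical (a real callback would post to Slack/email/Splunk):

# Sketch: Airflow defaults wiring retries, an SLA, and a failure callback.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator


def notify_failure(context):
    # Placeholder alert hook; in practice, post to Slack/email/Splunk.
    print(f"Task failed: {context['task_instance'].task_id}")


default_args = {
    "retries": 3,                           # retry transient failures
    "retry_delay": timedelta(minutes=5),
    "sla": timedelta(hours=1),              # flag runs exceeding 1 hour
    "on_failure_callback": notify_failure,  # alerting hook
}

with DAG(
    dag_id="warehouse_nightly_load",  # hypothetical
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    load = BashOperator(task_id="load_fact_tables", bash_command="echo load")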