
Shruthi - Data Engineer / AI Engineer
[email protected]
Location: San Francisco, California, USA
Relocation: No
Visa: H4 EAD
Shruthi G
Bay Area / Remote Only
Email: [email protected]
Phone: +1 972-924-5835 (Employer)
C2C Only, H4 EAD (passport number also shared)
SUMMARY OF QUALIFICATIONS

9+ years of experience in Data Engineering, Software Development and Project Management across Retail, Manufacturing, and Supply Chain services.
Proficient in leveraging Azure Databricks, Matillion, Snowflake & AWS Glue to process large datasets, ensuring optimized data integration and transformation.
Strong knowledge of advanced analytics, data model design, and development techniques using SQL, Python, and ETL tools.
Expert in implementing automated data validation using Python & Snowflake, ensuring 99.8% data accuracy and reducing reconciliation time by 40%.
Experience in optimizing Snowflake query performance using partitioning, clustering, indexing, and SnowSQL, reducing query execution time by 30%.
Comprehensive knowledge of AWS Cloud technologies, including EC2, S3, Lambda, API Gateway, Athena, Glue, SQS, and SNS.
Experienced in leading projects and collaborating with cross-functional teams to deliver data-driven solutions.
Strong understanding of complete Software Development Life Cycle (SDLC), involving both technical implementation and project coordination.
Innate ability to deliver meaningful insights using BI tools (Tableau, Power BI) and communicate clearly with cross-functional teams and stakeholders.
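The automated data validation mentioned above comes down to reconciling a source extract against a target table. A minimal, dependency-free sketch of that reconciliation logic follows; in practice both extracts would come from Snowflake queries, and the table shapes here are purely illustrative:

```python
import hashlib

def row_fingerprint(row: dict) -> str:
    """Order-insensitive fingerprint of a single record."""
    canonical = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(canonical.encode()).hexdigest()

def reconcile(source_rows, target_rows):
    """Compare two extracts and report count and content mismatches."""
    src = {row_fingerprint(r) for r in source_rows}
    tgt = {row_fingerprint(r) for r in target_rows}
    return {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        "count_match": len(source_rows) == len(target_rows),
        "missing_in_target": len(src - tgt),
        "unexpected_in_target": len(tgt - src),
        "content_match": src == tgt,
    }

# Example: one row failed to land in the target table
source = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 20.0}]
target = [{"id": 1, "amount": 10.0}]
report = reconcile(source, target)
```

Automating a check like this per table is what makes headline accuracy figures measurable rather than anecdotal.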
SKILLS

Programming Languages: Python, SQL, PySpark
Databases: Snowflake, BigQuery, MySQL, PostgreSQL, Oracle
Cloud Technologies: AWS EC2, Lambda, S3, SQS, SNS, API Gateway, Azure Databricks
ETL Tools: AWS Glue, Matillion, Athena, Talend, dbt, SnapLogic
BI Tools: Tableau, Power BI, AWS QuickSight, Google Data Studio
Project Management: JIRA, Confluence

WORK EXPERIENCE

TalktoData AI, Santa Clara, CA - Senior AI Engineer, Dec 2024 - Present
Architected and built a resilient AI data analysis agent using GPT-4 APIs, enabling natural-language querying across Databricks SQL warehouses, Delta Lake tables, and file-based data sources on S3.
Built a modular agentic architecture with MCP servers, specialized sub-agents, and tool functions for dynamic query planning, schema introspection, and multi-step reasoning over Databricks metadata.
Integrated Databricks Unity Catalog for schema discovery, access control, and lineage-aware retrieval in GenAI workflows.
Developed a comprehensive evaluation framework using LangSmith, running iterative experiments on accuracy, latency, and hallucination rates to continuously improve assistant quality.
Engineered AWS-native, cloud-scale GenAI APIs and microservices (API Gateway, Lambda/ECS, S3, IAM) serving 110,000+ users with high availability and secure data access.
Built a GenAI operations and reporting assistant using LangChain + OpenAI that enabled stakeholders to ask natural-language questions over LMS KPIs (enrollment, completion, engagement, SLAs, backlog).
Environment:
AWS, Databricks, Delta Lake, Unity Catalog, OpenAI (GPT-4), LangChain, RAG, Vector Databases
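The tool-function dispatch behind an agent like the one above can be pictured as a small loop: the model proposes a tool call, the runtime routes it to a function, and the result feeds the next step. This is a hedged sketch only; the tool names, the fake metastore, and the stubbed SQL executor are hypothetical stand-ins for real GPT-4 function calling over Databricks metadata:

```python
def list_tables(catalog: str) -> list[str]:
    """Stand-in for schema introspection against a metastore."""
    fake_metastore = {"sales": ["orders", "customers"]}
    return fake_metastore.get(catalog, [])

def run_sql(query: str) -> list[tuple]:
    """Stand-in for executing SQL against a warehouse."""
    return [("2024-01", 42)] if "orders" in query else []

TOOLS = {"list_tables": list_tables, "run_sql": run_sql}

def dispatch(tool_call: dict):
    """Route a model-proposed tool call to the matching function."""
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["arguments"])

# A two-step plan: discover tables, then query one of them.
tables = dispatch({"name": "list_tables", "arguments": {"catalog": "sales"}})
rows = dispatch({"name": "run_sql",
                 "arguments": {"query": f"SELECT month, n FROM {tables[0]}"}})
```

Keeping the dispatcher this thin is what lets sub-agents and MCP servers register new tools without touching the planning loop.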

McKinsey, CA, US - Senior Data Engineer, Jan 2023 - Nov 2024
Led Data Engineering initiatives for a Learning Management System serving 50,000+ employees, resulting in 40% improved data accessibility and faster decision-making for stakeholders.
Spearheaded the migration from SnapLogic to AWS Glue, enhancing real-time reporting and analytical capabilities.
Built efficient data transformation frameworks with PySpark, Python, and SQL, connecting Oracle/MySQL sources to Snowflake/S3 destinations.
Architected end-to-end AWS Cloud ETL pipelines (Glue, Athena, Lambda, API Gateway, Snowflake), reducing reporting latency from hours to minutes.
Utilized AWS Glue Crawlers/data catalogs and Glue ETL for categorizing, cleaning, and enriching data, ensuring reliable movement between various data stores.
Developed PySpark, Lambda functions for serverless processing between S3 and SFTP, handling daily data transfers.
Developed Infrastructure-as-Code (IaC) solutions using Terraform, reducing manual provisioning by 60%.
Collaborated with cross-functional teams to streamline communication & ensure project timelines were met.
Environment:
Snowflake, AWS Glue, Athena, API Gateway, S3, SnapLogic, PySpark, Oracle, MySQL
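The Lambda-driven S3-to-SFTP transfers above follow the standard S3 event-notification pattern. A skeleton handler might look like the following; the actual upload (boto3 download plus an SFTP put) is stubbed out so the sketch stays dependency-free, and the bucket/key names are illustrative:

```python
def extract_objects(event: dict) -> list[tuple[str, str]]:
    """Pull (bucket, key) pairs out of an S3 event-notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        pairs.append((s3["bucket"]["name"], s3["object"]["key"]))
    return pairs

def handler(event, context=None):
    """Lambda entry point: push each newly landed object to SFTP.

    The real job would call s3.download_file(...) and sftp.put(...);
    here the transfer is a no-op that just records what it would move.
    """
    transferred = []
    for bucket, key in extract_objects(event):
        transferred.append(f"s3://{bucket}/{key}")
    return {"transferred": transferred}

sample_event = {"Records": [
    {"s3": {"bucket": {"name": "daily-exports"},
            "object": {"key": "2024/01/orders.csv"}}},
]}
```

Wiring the bucket's event notifications to this handler is what makes the daily transfer serverless: no polling job, just one invocation per landed file.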
DataFlix, CA, US - Data Engineer, Oct 2021 - Jan 2023
Led design and development of ETL pipelines for multi-channel marketing data consolidation for DataFlix using Matillion.
Refactored Data Warehouse ETL jobs in Matillion and migrated them to the cloud platform, reducing operational costs by 35%.
Created reusable Matillion frameworks for cross-platform marketing data migrations from social media, CRM, and web analytics sources.
Utilized Matillion's orchestration capabilities to automate complex transformation workflows for campaign performance metrics.
Leveraged Matillion and PySpark for large-scale marketing analytics, optimizing performance for executive dashboards in Sisense.
Engineered Matillion ETL jobs with parameterized variables and reusable components for faster marketing data integration.
Designed and managed various orchestration job types in Matillion, including data ingestion, workflow sequencing, error handling, data quality validation, and environment-based deployments, ensuring reliable and scalable ETL pipeline execution across development and production systems.
Developed and optimized transformation jobs in Matillion, including data cleansing, joining, aggregation, enrichment, schema restructuring, and change data capture (CDC), enabling efficient data processing and seamless loading into Snowflake for analytics and reporting.
Collaborated with marketing teams to analyze requirements and develop Sisense dashboards that improved campaign ROI visibility.
Environment:
Matillion, Snowflake, Sisense, Marketing Platforms.
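The change-data-capture transformations above reduce, at their core, to diffing a new extract against the previous load and classifying each row. A simplified, dependency-free version of that logic (keyed on a hypothetical `id` column; the real jobs ran inside Matillion against Snowflake):

```python
def cdc_diff(previous: list[dict], current: list[dict], key: str = "id"):
    """Classify rows as inserts, updates, or deletes between two snapshots."""
    prev = {r[key]: r for r in previous}
    curr = {r[key]: r for r in current}
    inserts = [r for k, r in curr.items() if k not in prev]
    deletes = [r for k, r in prev.items() if k not in curr]
    updates = [r for k, r in curr.items() if k in prev and r != prev[k]]
    return {"insert": inserts, "update": updates, "delete": deletes}

# Illustrative snapshots of a campaign-spend table
prev_load = [{"id": 1, "spend": 100}, {"id": 2, "spend": 50}]
new_load  = [{"id": 1, "spend": 120}, {"id": 3, "spend": 75}]
changes = cdc_diff(prev_load, new_load)
```

Only the classified deltas are then merged into the warehouse, which is what keeps incremental loads cheap relative to full reloads.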

Factory Market Retail GmbH, Berlin, Germany - Data Engineer, Feb 2020 - Sep 2021
Worked in the Data & Marketing Analytics team to build a brand-new marketing data model and clickstream DataMart, giving management real-time reporting and analytical capabilities.
Interacted with marketing stakeholders to deliver live dashboards and reports and to recommend remediation strategies ensuring pristine quality of high-priority usage data elements.
Built various analytical visualizations using Power BI, Google Data Studio, Excel, and Python plotting libraries.
Used AWS Glue Crawlers and the Glue Data Catalog to categorize, clean, and enrich data and move it reliably between various data stores.
Developed Python scripts in Lambda functions for data cleaning and transformations on streaming data.
Performed extensive data quality and accuracy checks on data warehouse tables and regular loading jobs.
Collaborated frequently with end users on requirements analysis, bug fixing, and enhancing REST endpoints in the application systems.
Collaborated closely with Product and Operations teams, participating in daily scrum meetings and task estimation.
Environment:
AWS Glue, Lambda, PowerBI, Google Data Studio, Python, Advanced Excel
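Data quality checks like those above typically reduce to per-column rules applied to each loaded row. A minimal, illustrative rule engine in plain Python; the column names and rules are hypothetical examples for a clickstream table, not taken from the actual DataMart:

```python
def check_row(row: dict, rules: dict) -> list[str]:
    """Apply per-column validation rules; return a list of violations."""
    violations = []
    for column, rule in rules.items():
        value = row.get(column)
        if value is None:
            violations.append(f"{column}: missing")
        elif not rule(value):
            violations.append(f"{column}: failed check ({value!r})")
    return violations

# Hypothetical rules for a clickstream fact table
RULES = {
    "session_id": lambda v: isinstance(v, str) and len(v) > 0,
    "duration_s": lambda v: isinstance(v, (int, float)) and v >= 0,
}

good = {"session_id": "abc123", "duration_s": 4.2}
bad  = {"session_id": "", "duration_s": -1}
```

Running such rules on every load, and routing violations to a quarantine table instead of the warehouse, is the usual way these checks are operationalized.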

Sphinx Pvt Ltd, Bengaluru, India - Business Data Analyst, Nov 2017 - Dec 2019
Worked in a team of 7 members, contributing to data analysis, business analysis, and reporting on various application data sources to help drive decision-making in the product and management divisions.
Wrote SQL queries on the Snowflake data warehouse to create data blends, and designed Power BI and AWS QuickSight reports and dashboards to publish audit-violation reports and Key Performance Indicators (KPIs) to stakeholders and executives.
Responsible for database table schema definitions, development, and performance tuning of applications and SQL queries.
Coordinated with Quality Assurance teams to understand issues identified during other test cycles, such as integration, performance, and regression tests.
Tracked sprint health using JIRA, a bug-tracking system, maintaining the history of bugs/issues on a daily basis.
Worked in an agile methodology, attending daily stand-up meetings to meet project development timelines.
Knowledge of database migrations across all versions of SQL Server.
Environment:
Snowflake, Advanced Excel, MS SQL Server, MySQL, Power Bi, AWS Quick Sight, JIRA, Trello


Dell International (NTT Data), India - Technical Client Associate (Analytics), Oct 2016 - Nov 2017
Worked with Honeywell and other supply-chain clients in the Data Analytics division, supporting management with service-reporting activities.
Designed and developed advanced analytics dashboards for the Global Business Units to drive business decision-making.
Supported the product development team in resolving general IT and security issues with the application databases.
Responsible for specification requirements, ensuring application development stayed within the scope of the specifications.
Environment:
PostgreSQL, Tableau, Talend, Advanced Excel, Google Cloud
CERTIFICATIONS, TRAINING & EXTRAS
AWS Certified Cloud Practitioner (B377C56LBFQ1Q7GP)
AWS Certified Data Analytics Specialty (UC4E5D4BDE1B17)
Matillion Data Productivity Cloud - fac115d8-514c-4eed-a983-7fbba895768b
Matillion ETL Foundation - b1fea584-ef8b-4252-ae5b-fe2a303b31b5


EDUCATION

Sri Venkateshwara College of Engineering (VTU), India, Bachelor's in Computer Science, 2016
