| Surendra Varma Sagi - AI/ML Engineer, GenAI Engineer, GenAI Developer |
| [email protected] |
| Location: USA |
| Relocation: Any |
| Visa: H1B |
| Resume file: Surendra_Sagi_AI_Engineer_1772029229786.docx |
PROFESSIONAL SUMMARY
- Results-driven AI/ML Engineer and Python Developer with over 11 years of experience designing, developing, and deploying enterprise-grade AI/ML, data engineering, and cloud-based applications across the manufacturing, finance, healthcare, and retail domains.
- Strong experience designing, building, and deploying production-grade machine learning and Generative AI solutions, including Retrieval-Augmented Generation (RAG) and agentic AI workflows.
- Hands-on experience implementing agent-driven AI patterns for multi-step task orchestration, tool invocation, retrieval coordination, and response validation to improve accuracy and reduce hallucinations.
- Applied prompt engineering, dynamic chaining, and agent-based architectures using LangChain to enhance response accuracy and explainability.
- Strong background in the end-to-end ML lifecycle, including data ingestion, feature engineering, model training, evaluation, deployment, monitoring, and optimization in cloud environments.
- Proficient in integrating LLMs (OpenAI, Hugging Face) into production using FastAPI and Flask, with hands-on experience in MLOps, CI/CD, and model monitoring.
- Skilled in building, training, and deploying machine learning and deep learning models using Python, PyTorch, TensorFlow, and AWS.
- Experienced in designing and deploying NLP pipelines for sentiment analysis, document summarization, named entity recognition (NER), and translation using Hugging Face Transformers, spaCy, and GPT-based architectures.
- Hands-on experience in computer vision, building and deploying object detection, OCR, and image segmentation solutions using OpenCV, TensorFlow, and Keras for real-time visual inference.
- Highly skilled in Python development for data manipulation, transformation, and analytics using Pandas, NumPy, Scikit-learn, PySpark, and Matplotlib, with proven success building scalable ETL pipelines on Azure Databricks and AWS Glue to process multi-terabyte datasets via batch and streaming ingestion.
- Experienced in developing and deploying GenAI and machine learning solutions, including LLMs, RAG, LangChain orchestration, and Hugging Face Transformer fine-tuning, to deliver predictive intelligence and automate business workflows.
- Proficient in implementing CI/CD pipelines using Azure DevOps, Jenkins, Terraform, Docker, and Kubernetes, and in establishing data validation frameworks with PySpark and Great Expectations to ensure data accuracy and integrity.
- Extensive experience leveraging Azure (ADF, Synapse, Event Hubs, Purview, Delta Live Tables) and AWS (S3, EMR, Redshift, Kinesis, Lambda, CloudFormation, CloudWatch) to build secure, compliant, and cost-optimized data ecosystems.
- Skilled in using AI-assisted development tools such as GitHub Copilot for intelligent code generation, testing, and documentation, enhancing development efficiency and quality assurance.
- Experienced in performing advanced analytics and visualization with Power BI, Tableau, and Excel (DAX, SQL) to deliver actionable business insights and KPI dashboards for stakeholders.
- Recognized for strong collaboration with cross-functional ML, DevOps, and product teams in Agile/Scrum environments, leading cloud migration, data governance, and compliance initiatives (GDPR, PII masking) to build reliable, scalable, and insight-driven enterprise data solutions.

EDUCATION & CERTIFICATIONS
University of South Florida | Master of Science in Business Analytics and Information Systems
Jawaharlal Nehru Technological University | Bachelor's in Electronics and Communication Engineering
Certified Python Developer (PCAP)
Microsoft Certified: Python for Data Science
Azure AI Fundamentals
AWS Certified AI Practitioner
Databricks Certified Data Engineer

PROFESSIONAL EXPERIENCE

HUBBELL INCORPORATED, CLEVELAND, OH | Domain: Manufacturing
AI/ML Engineer | Aug 2023 - Present
- Designed and deployed Retrieval-Augmented Generation (RAG) pipelines using OpenAI GPT-4, LangChain, and FAISS, delivering real-time insights from structured and unstructured datasets (a representative retrieval sketch follows this section).
- Designed and implemented lightweight agentic AI workflows using LLMs to orchestrate multi-step tasks such as document retrieval, validation, summarization, and response generation across enterprise data sources.
- Built agent-driven pipelines with tool calling and conditional logic to decide when to retrieve data, re-rank results, or trigger follow-up actions, improving answer accuracy and reducing manual intervention.
- Designed, developed, and deployed end-to-end machine learning and deep learning models using Python, TensorFlow, PyTorch, and XGBoost for classification and clustering problems.
- Integrated MCP-style modules for structured tool orchestration and modular agent capabilities.
- Fine-tuned LLMs using Hugging Face Transformers and PyTorch, improving model precision and reducing manual response time by 35%.
- Designed and deployed NLP pipelines for text classification, sentiment analysis, and summarization using Hugging Face Transformers, improving text understanding accuracy by 30%.
- Designed REST APIs (FastAPI/Flask) to operationalize ML models, enabling seamless integration with upstream applications.
- Built API-driven applications using FastAPI with JWT-based authentication and role-based access control.
- Applied LLM optimization techniques such as LoRA, QLoRA, PEFT, and supervised fine-tuning (SFT) to improve model accuracy, latency, and cost efficiency in production use cases.
- Designed and trained Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) using Keras and TensorFlow, achieving >90% accuracy on image and sequence classification tasks.
- Built LLM-powered data retrieval pipelines using LangChain, enabling contextual insights from enterprise data stored in S3 and Redshift via Bedrock API orchestration.
- Built and deployed object detection and image segmentation models using OpenCV, Keras, and TensorFlow, enabling automated defect detection in manufacturing images.
- Built scalable and modular ML pipelines leveraging SageMaker Processing, Training, and Model Registry for automated retraining and deployment.
- Built metadata-driven ETL frameworks in Python using configuration tables in Databricks SQL, reducing manual maintenance and enabling reuse across 100+ data sources.
- Implemented early-stage agentic AI patterns using LLMs for task decomposition, retrieval orchestration, and controlled reasoning over structured and unstructured data.
- Experimented with agent-based workflows integrating RAG pipelines, metadata filters, and validation steps to improve reliability and reduce hallucinations in enterprise AI systems.
- Automated model lifecycle workflows using MLOps practices, CI/CD pipelines, Docker-based packaging, and Kubernetes for orchestration.
- Built streamlined MLOps pipelines connecting LangChain-based RAG systems with AWS Bedrock, Glue, and Step Functions for production-ready AI integration.
- Built interactive analytics models and dashboards in Tableau, Power BI, and Databricks SQL, enabling self-service BI and operational insights.
- Implemented CI/CD pipelines using Jenkins for automated testing, static code analysis (SonarQube), and artifact promotion.
- Optimized model performance through hyperparameter tuning, feature engineering, and GPU acceleration, reducing training time and improving accuracy.
Environment: Python, PySpark, Pandas, NumPy, Delta Lake, Azure Databricks, NLP, LLMs, RAG, Delta Live Tables, Unity Catalog, XGBoost, Power BI, SQL, GraphQL, Hugging Face Transformers, PyTorch, LangChain, GPT-4, FAISS, Docker, Kubernetes, Terraform, Jenkins, Azure DevOps, Great Expectations, Boto3, AWS CloudFormation, AWS SageMaker, Keras
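Below is a minimal, illustrative sketch of a RAG retrieval flow of the kind described in this role, assuming LangChain's community FAISS vector store and OpenAI chat/embedding integrations; the chunking parameters, prompt, and document sources are hypothetical placeholders rather than the production pipeline.

# Illustrative RAG sketch: index documents in FAISS, retrieve the top-k chunks,
# and ground a GPT-4 answer in them. Assumes the langchain-openai and
# langchain-community packages; all names and parameters are placeholders.
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

def build_index(raw_docs: list[str]) -> FAISS:
    # Split raw text into overlapping chunks and embed them into a FAISS index.
    splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
    chunks = splitter.create_documents(raw_docs)
    return FAISS.from_documents(chunks, OpenAIEmbeddings())

def answer(question: str, index: FAISS) -> str:
    # Retrieve the most relevant chunks and restrict the model to that context.
    context = "\n\n".join(d.page_content for d in index.similarity_search(question, k=4))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return ChatOpenAI(model="gpt-4").invoke(prompt).content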
THOMSON REUTERS, EAGAN, MN | Domain: Finance
Python Full-stack Developer | Aug 2021 - Jul 2023
- Designed the front end and back end of the application using Python on the Django web framework and AngularJS; developed consumer-facing features and applications using Python and Django with test-driven development and pair programming.
- Designed large-scale Python backend systems with FastAPI, JWT, MongoDB, Redis, and containerized deployments (see the API sketch after this section).
- Designed, trained, and deployed end-to-end machine learning models in Python using AWS SageMaker, optimizing model accuracy and inference latency for production workloads.
- Developed dynamic web pages using HTML5, CSS3, Bootstrap, SASS, and JavaScript.
- Worked with Python for data manipulation, wrangling, and analysis using libraries such as Pandas, NumPy, Scikit-learn, and Matplotlib.
- Designed and maintained large-scale Python-based backend systems using strong OOP principles, modular architecture, dependency injection, and reusable design patterns.
- Developed web applications and RESTful web services and APIs using Python Flask, Django, Pyramid, and PHP.
- Led a cross-functional team in migrating legacy systems to the cloud, focusing on Azure Databricks for its scalability and Python-friendly environment.
- Worked with Python integrated development environments such as PyCharm and IDLE.
- Automated existing scripts for performance calculations using NumPy and SQLAlchemy.
- Developed scripts using Perl, Python, Unix shell, and SQL.
- Performed unit testing and integration testing of the code using PyTest.
- Wrote a migration script from PostgreSQL to MongoDB in Python, using Gevent, the psycopg2 library, Postgres cursors, and MongoDB bulk inserts.
- Implemented AI-driven ETL scripts in Python to automate data extraction, transformation, and loading, increasing efficiency and accuracy.
- Actively pursued and applied the latest advancements in Azure Databricks and Python libraries, fostering a culture of continuous improvement within the team.
- Used AWS to deploy new server instances through automation with Kubernetes and Jenkins.
- Used Git and Jira for code tracking and the review process.
- Used Python and Django for graphics generation, XML processing, data exchange, and business logic implementation with Spiff workflow development.
- Part of the team implementing a REST API in Python using the Flask micro-framework with SQLAlchemy on the backend for management of data center resources.
- Developed ETL scripts in Python to extract data from one database table and insert or update the resultant data in another database table.
Environment: Python, Django, Flask, AngularJS, HTML5, CSS3, JavaScript, Pandas, NumPy, Scikit-learn, AWS (SageMaker, EMR, S3, Redshift), Azure Databricks, PostgreSQL, MongoDB, SQLAlchemy, Kubernetes, Jenkins, Git, Jira, PyTest, Unix/Linux
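Below is a minimal, illustrative sketch of the JWT-protected FastAPI pattern referenced in this role, assuming the PyJWT package; the secret key, role names, and endpoint are hypothetical placeholders, not actual production services.

# Illustrative FastAPI + JWT sketch: protect an endpoint with a bearer token and
# enforce role-based access. All identifiers here are placeholders.
import jwt
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import OAuth2PasswordBearer

SECRET_KEY = "change-me"  # placeholder; load from a secret store in practice
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")
app = FastAPI()

def current_user(token: str = Depends(oauth2_scheme)) -> dict:
    try:
        # Decode and validate the signed token; the payload carries user id and role.
        return jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
    except jwt.PyJWTError:
        raise HTTPException(status_code=401, detail="Invalid or expired token")

@app.get("/models/{model_id}/predict")
def predict(model_id: str, user: dict = Depends(current_user)):
    # Role-based access control: only users with the required role may call the endpoint.
    if user.get("role") != "analyst":
        raise HTTPException(status_code=403, detail="Insufficient role")
    return {"model": model_id, "status": "ok"}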
CYBAGE SOFTWARE PVT LIMITED, HYDERABAD, INDIA | Domain: Healthcare
Data Engineer | Jan 2016 - Jul 2019
- Designed and maintained large-scale ETL pipelines on AWS using S3, EMR (Hadoop/Spark), and Redshift to process healthcare claims, EHR/EMR data, eligibility files, encounter data, lab results, and clinical documentation (a representative pipeline sketch follows this section).
- Built PySpark and Scala Spark jobs on AWS EMR to transform high-volume healthcare datasets for claims adjudication, quality-of-care reporting, HEDIS measure calculations, and risk-adjustment workflows.
- Developed optimized Hive tables (ORC/Parquet, partitioned, bucketed) on EMR/Hive Metastore to support cost/utilization analysis, provider performance analytics, and population health insights.
- Ingested and normalized complex healthcare data formats including HL7, X12 837/835 claims, CCD/CDA documents, and FHIR bundles using Spark, custom parsers, and AWS storage layers.
- Designed and optimized scalable ML data pipelines capable of processing high-volume, high-dimensional datasets using Python, Pandas, NumPy, Spark, and cloud-native workflows.
- Integrated MongoDB and HBase datasets into EMR pipelines to process unstructured clinical notes, physician narratives, radiology reports, and discharge summaries for downstream analytics.
- Developed PL/SQL and SQL scripts to generate regulatory reports, patient risk profiles, chronic condition registries, ICD/SNOMED mappings, and utilization trend dashboards.
- Enhanced Redshift schemas and fact/dimension models to support healthcare star-schema reporting, enabling fast insights for claims trends, care management, readmission analysis, and cost forecasting.
- Supported production ETL workflows using Ab Initio and EMR-based Spark jobs, troubleshooting failures, optimizing cluster usage, and ensuring SLA adherence for mission-critical healthcare reporting cycles.
- Collaborated with data architects and clinical SMEs to implement HIPAA-compliant data pipelines, including PHI masking, secure S3 bucket policies, IAM controls, and audit logging across AWS components.
Environment: AWS (S3, EMR, Redshift, IAM), Hadoop, Spark (PySpark, Scala), Hive, ORC, Parquet, MongoDB, HBase, Python, Pandas, NumPy, SQL, PL/SQL, Ab Initio, HL7, X12 (837/835), CCD/CDA, FHIR, ICD, SNOMED, Unix/Linux
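Below is a minimal, illustrative sketch of a claims ETL step of the kind described in this role, assuming PySpark on EMR; the S3 bucket paths and column names are hypothetical placeholders rather than real project locations.

# Illustrative PySpark ETL: read raw claims from S3, normalize a few fields,
# and write partitioned Parquet for downstream Hive/Redshift reporting.
# Bucket names and columns are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("claims-etl").getOrCreate()

claims = (
    spark.read.option("header", True).csv("s3://example-raw-bucket/claims/")
    .withColumn("service_date", F.to_date("service_date", "yyyy-MM-dd"))
    .withColumn("paid_amount", F.col("paid_amount").cast("double"))
    .dropDuplicates(["claim_id"])
    .withColumn("service_year", F.year("service_date"))
)

(
    claims.write.mode("overwrite")
    .partitionBy("service_year")           # partitioned layout speeds up year-scoped reporting
    .parquet("s3://example-curated-bucket/claims/")
)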
INDIUM SOFTWARE, CHENNAI, INDIA | Domain: Retail
Data Analyst | Jun 2013 - Dec 2015
- Designed and developed 50+ Tableau and Power BI dashboards to track KPIs across sales, marketing, and operations, improving executive decision-making and performance visibility.
- Performed data mining and analysis using SQL, Excel, and Python (Pandas, NumPy), driving actionable insights for customer segmentation and targeted marketing campaigns.
- Automated data extraction and cleansing processes using Python scripts and shell automation, reducing manual reporting time by over 60%.
- Created and optimized dimensional data models (Star and Snowflake schemas) to support scalable reporting and analytics.
- Built Excel VBA macros and advanced formulas to automate recurring reporting tasks and data reconciliations across departments.
- Developed interactive Tableau dashboards with parameters, trend analysis, and forecasting models to identify seasonal sales patterns and improve revenue predictions.
- Integrated data from multiple SQL sources, CRM systems, and Google Analytics for consolidated customer behavior and campaign performance analysis.
- Defined and implemented custom KPIs and DAX measures in Power BI, enabling teams to monitor conversion rates, churn, and product performance in real time.
- Collaborated with business users to document reporting requirements, validate data accuracy, and enhance visualization usability through intuitive designs and tooltips.
- Conducted A/B testing and statistical analysis in R and Python, providing insights into the effectiveness of marketing and promotional strategies (a minimal test sketch follows this section).
Environment: SQL Server, Tableau, Power BI, Excel (VBA, Macros), Python 3.x, R, Shell Scripting, Google Analytics, CRM Systems, Star & Snowflake Schema Modeling, SSRS, Power Query, DAX
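Below is a minimal, illustrative sketch of an A/B conversion-rate comparison of the kind described in this role, assuming the statsmodels library; the visitor and conversion counts are invented example data.

# Illustrative A/B test: two-proportion z-test comparing conversion rates of
# two campaign variants. Counts are made-up example data.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

conversions = np.array([420, 510])      # converted users in variants A and B
visitors = np.array([10_000, 10_200])   # total users exposed to each variant

z_stat, p_value = proportions_ztest(conversions, visitors)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference in conversion rates is statistically significant at the 5% level.")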