| Sravya - Sr. Generative AI Engineer/Principal AI Engineer/Machine Learning Engineer |
| [email protected] |
| Location: St. Louis, Missouri, USA |
| Relocation: Yes |
| Visa: EAD |
| Resume file: Sravya Gen AI Engineer Resume_1771516594418.docx |
Sravya
Generative AI Engineer
+1 314 529 0792 | [email protected]
https://www.linkedin.com/in/sravya-chikkala-6b1142289/

PROFESSIONAL SUMMARY:
- Built and deployed enterprise Generative AI solutions using LLMs, RAG, LangChain, and Prompt Engineering to automate workflows across support, operations, and customer-facing systems.
- Developed scalable backend services using Python, FastAPI, REST APIs, and JSON, enabling secure model orchestration and tool-based execution.
- Designed modern UI experiences using React.js, TypeScript, HTML5, and CSS, delivering chat-based copilots and internal analytics portals.
- Implemented cloud-native deployments using AWS, Docker, and Kubernetes, supporting autoscaling, high availability, and production-grade releases.
- Built data ingestion pipelines using AWS S3, AWS Glue, ETL, and PySpark, enabling large-scale document and structured-data processing for AI applications.
- Developed retrieval layers using Vector Databases (Pinecone, Weaviate) and optimized embeddings to improve search relevance and reduce hallucinations.
- Integrated enterprise systems using SOAP APIs, REST APIs, OAuth 2.0, and JWT, ensuring secure interoperability with legacy and modern platforms.
- Implemented event-driven architectures using Kafka, microservices, and asynchronous processing to support real-time streaming workflows.
- Built CI/CD automation using Jenkins, Git, and GitHub Actions, ensuring repeatable deployments, rollback safety, and controlled releases.
- Created model monitoring and evaluation pipelines using MLflow, LLM evaluation, and regression testing, improving reliability and measurable performance over time.
- Developed analytics and reporting solutions using SQL, Snowflake, and Power BI, enabling KPI dashboards and executive-ready insights.
- Supported legacy and enterprise applications using SVN, Windows/Linux, and production incident-handling workflows, with strong troubleshooting and root-cause analysis.
- Implemented security and compliance controls using RBAC, IAM, PII redaction, and audit logging to meet enterprise governance requirements.
- Delivered high-quality engineering through code reviews, unit testing, PyTest, and structured API contract validation to reduce production defects.
- Led cross-functional delivery using Agile/Scrum, JIRA, and stakeholder collaboration, aligning GenAI delivery with business KPIs and operational SLAs.

TECHNICAL SKILLS:
Programming Languages: Python, R, SQL, JavaScript, TypeScript
Machine Learning: Scikit-learn, XGBoost, LightGBM, Prophet, MLflow, SHAP, Optuna, Model Monitoring, Drift Detection
Deep Learning & GenAI: LLMs, RAG, LangChain, Prompt Engineering, Embeddings, Tool Calling, JSON Schema / Structured Outputs, LLM Evaluation
MLOps & Pipelines: MLflow, Azure ML, Azure Databricks, GCP Composer, Terraform, Docker, Kubernetes
Cloud & Data Platforms: Azure, GCP (Vertex AI, Dataproc), AWS (S3, SageMaker), Azure Synapse, BigQuery
Data Engineering: Apache Spark, PySpark, ETL, Apache Airflow, Kafka, Hadoop, Hive, Parquet, Data Quality
Visualization & BI: Power BI, Tableau, Google Data Studio, Excel, Matplotlib
NLP & CV: spaCy, NLTK, BERT, YOLO, Azure Cognitive Services, Hugging Face Transformers
DevOps & CI/CD: Docker, Kubernetes, Jenkins, Git, GitHub Actions
Security & Governance: OAuth 2.0, JWT, RBAC, PII Redaction, DLP, IAM, Audit Logging

PROFESSIONAL EXPERIENCE:

Client: BJC HealthCare, St. Louis, MO | March 2025 to present
Role: Senior Generative AI Engineer (Enterprise AI Transformation Lead)
Project: Call Center Agent Assist Copilot (GenAI + RAG + Omni-Channel)
Roles & Responsibilities:
- Led end-to-end delivery of a real-time Agent Assist Copilot for BJC contact center teams by designing a secure RAG workflow that generated grounded answers during live patient calls.
- Architected the full solution using Python, FastAPI, and enterprise-grade REST APIs, enabling low-latency orchestration of retrieval, summarization, and response generation.
- Built ingestion pipelines for knowledge sources, including call scripts, policies, and FAQs, using AWS Lambda, AWS S3, and automated parsing workflows to keep content continuously updated.
- Implemented intelligent retrieval using vector databases (e.g., Pinecone, Weaviate) with tuned embeddings and chunking logic, improving answer precision and reducing irrelevant citations.
- Developed prompt governance using prompt engineering, system prompts, and structured response constraints (JSON Schema), ensuring safe and consistent responses for agents.
- Implemented real-time agent UI experiences using React.js, TypeScript, and reusable UI components, enabling agents to see suggested answers, next-best actions, and citations instantly.
- Enabled live response streaming using WebSockets and token streaming, reducing perceived latency and improving agent usability during active calls.
- Integrated the assistant with contact center workflows using Salesforce Service Cloud, Omni-Channel, and CTI concepts, allowing agent context (case, customer, intent) to drive AI suggestions.
- Built case summarization and after-call work automation using LLM summarization, reducing manual note-taking while maintaining consistent structured outputs.
- Designed and implemented secure authentication using OAuth 2.0, JWT, and RBAC, ensuring agent-level access control and compliance-safe handling of sensitive data.
- Implemented audit-ready logging using CloudWatch, structured logs, and trace IDs for every request, enabling compliance reporting and production troubleshooting.
- Delivered full observability and tracing using OpenTelemetry and centralized dashboards, tracking latency across ingestion, retrieval, and generation steps.
- Built evaluation pipelines using LLM evaluation, curated datasets, and accuracy scoring, enabling measurable reductions in hallucinations and improvements in response relevance.
- Optimized performance and cost through retrieval filtering, Redis caching, and token-budgeting strategies, improving response times while reducing LLM usage spend.
- Developed analytics dashboards using SQL, AWS Redshift, and KPI reporting to track adoption, deflection rate, agent satisfaction, and resolution-time improvements.
- Established CI/CD automation using GitHub Actions, Docker, and Kubernetes, enabling repeatable deployments across dev/test/prod with rollback readiness.
- Implemented automated testing for reliability using PyTest, API contract testing, and mock LLM response harnesses to prevent regressions during frequent releases.
- Led cross-functional delivery across engineering, security, and operations using Agile/Scrum, JIRA, and stakeholder workshops to align AI outcomes with call center business KPIs.

Environment: Python | FastAPI | React | TypeScript | Salesforce Service Cloud | Omni-Channel | CTI | AWS (Lambda, S3, Redshift, CloudWatch) | LangChain | RAG | Vector DB (Pinecone/Weaviate) | Redis | Docker | Kubernetes | GitHub Actions | OAuth2/JWT | PyTest | SQL

Client: JPMorgan Chase Bank - New York, NY | Nov 2023 to Feb 2025
Role: Principal AI Engineer
Project: Fraud Operations LLM Decision Support + Case Summarization Platform
Roles & Responsibilities:
- Led enterprise delivery of a fraud-ops LLM Decision Support platform by designing secure RAG workflows that generated grounded case insights from historical fraud patterns, internal playbooks, and investigation notes.
- Architected the backend orchestration layer using Python, FastAPI, and scalable REST APIs, enabling consistent fraud summarization, next-best-action recommendations, and investigator workflow automation.
- Built ingestion pipelines for structured and unstructured fraud data using Apache Spark, PySpark, and AWS S3, supporting high-volume daily refresh of case evidence and transaction metadata.
- Implemented semantic and hybrid retrieval using vector databases (e.g., Pinecone, Weaviate) and optimized embeddings, improving investigator answer relevance and reducing retrieval noise.
- Developed tool-driven reasoning workflows using LangChain and controlled tool calling, enabling the LLM to safely query internal services (risk scoring, case history, policy rules) instead of guessing.
- Enforced deterministic outputs using JSON Schema, strict parsing, and response validation, ensuring fraud summaries were structured, auditable, and usable by downstream systems.
- Designed real-time fraud case summarization pipelines using LLM summarization, generating investigator-ready narratives (timeline, signals, risk factors, recommended actions) in seconds.
- Built secure access controls using OAuth 2.0, JWT, and fine-grained RBAC, ensuring investigators only accessed authorized cases and business-unit-specific fraud content.
- Implemented sensitive-data controls using PII redaction, DLP patterns, and safe prompt filtering to prevent leakage of regulated customer information.
- Developed investigator-facing UI workflows using React.js, TypeScript, and component-driven architecture, enabling case chat, evidence citations, and one-click export of fraud summaries.
- Enabled real-time response streaming using WebSockets and asynchronous execution, improving usability for long, multi-source fraud investigations.
- Integrated the platform with fraud case management systems using Kafka, event-driven ingestion, and downstream API integrations, ensuring updates were reflected in near real time.
- Implemented enterprise observability using OpenTelemetry, Splunk, and distributed tracing, tracking retrieval latency, tool execution time, and hallucination-related failures.
- Built model performance monitoring and evaluation using LLM evaluation, golden datasets, and regression test suites to continuously measure citation accuracy and hallucination rate.
- Optimized system cost and latency using Redis caching, retrieval filtering, and token budgeting, reducing LLM spend while maintaining fraud response quality.
- Delivered cloud-native scalability using Docker, Kubernetes, and autoscaling policies, supporting peak fraud volume without investigator downtime.
- Established secure CI/CD pipelines using Jenkins, Git, and automated deployment controls aligned with regulated release processes and rollback readiness.
- Partnered with fraud, security, and governance teams using Agile/Scrum, architecture reviews, and stakeholder workshops to align the LLM platform with fraud KPIs and audit requirements.

Environment: Python | FastAPI | LangChain | RAG | Vector DB (Pinecone/Weaviate) | Spark/PySpark | AWS S3 | Kafka | React | TypeScript | OAuth2/JWT | RBAC | DLP/PII Redaction | Redis | Docker | Kubernetes | Jenkins | OpenTelemetry | Splunk | SQL

Client: State Farm - Bloomington, IL | Jul 2021 to Oct 2023
Role: Lead Machine Learning Engineer
Project: Claims Fraud Detection + Adjuster Assist ML Platform (Real-Time + Batch)
Roles & Responsibilities:
- Led the end-to-end delivery of a production-grade Fraud Detection and Claims Risk Scoring platform by building scalable machine learning pipelines integrated directly into the claims processing workflow.
- Designed the core ML architecture using Python, Scikit-learn, and XGBoost, generating risk scores for claims and reducing false positives through iterative feature and threshold tuning.
- Built enterprise data pipelines using PySpark, Apache Spark, and SQL, enabling large-scale processing of structured claim records, transaction history, and policyholder metadata.
- Implemented real-time scoring services using FastAPI, REST APIs, and low-latency inference endpoints to support fraud alerts during claim intake.
- Developed feature engineering pipelines using feature-store patterns, reusable ETL, and automated validation checks to ensure consistent training-versus-inference feature parity.
- Deployed ML models using Docker, Kubernetes, and autoscaling inference pods, supporting peak claim volume without performance degradation.
- Implemented CI/CD workflows using Jenkins, Git, and automated build/test pipelines to enable safe and repeatable deployments of ML services.
- Built model monitoring and drift detection using MLflow, model metrics, and scheduled evaluation jobs, preventing silent performance degradation in production.
- Developed explainability and transparency tooling using SHAP, model interpretability, and reason-code generation to support compliance and adjuster trust.
- Built fraud investigation dashboards using React.js, TypeScript, and interactive tables, enabling investigators to review risk factors, evidence, and model explanations.
- Integrated streaming claim events using Kafka, enabling near real-time fraud scoring triggers when new claim updates or suspicious activities were detected.
- Implemented secure authentication and access control using OAuth 2.0, JWT, and RBAC, ensuring only authorized fraud and claims teams could access sensitive outputs.
- Developed automated training pipelines using Airflow, scheduled retraining, and dataset versioning to keep models current with evolving fraud behavior.
- Built data validation and quality enforcement using Great Expectations, schema checks, and anomaly detection on input distributions prior to training and inference.
- Optimized model performance through hyperparameter tuning with Optuna, cross-validation, and threshold calibration to balance recall versus precision for fraud detection.
- Delivered scalable storage and analytics using AWS S3, Snowflake, and reporting datasets for fraud trend analysis, executive KPIs, and audit reporting.
- Implemented end-to-end observability using Prometheus, Grafana, and structured logging to track latency, throughput, errors, and model prediction distribution shifts.
- Led cross-functional delivery using Agile/Scrum, stakeholder workshops, and production readiness reviews to align ML outputs with claims KPIs and compliance expectations.

Environment: Python | Scikit-learn | XGBoost | LightGBM | PySpark | Apache Spark | SQL | FastAPI | REST APIs | Kafka | Airflow | MLflow | SHAP | Great Expectations | Optuna | AWS S3 | Snowflake | Docker | Kubernetes | Jenkins | Git | OAuth 2.0 | JWT | RBAC | Prometheus | Grafana | CloudWatch

Client: Costco Wholesale - Issaquah, WA | Jan 2019 to Jun 2021
Role: Senior Data Engineer / Machine Learning Specialist
Project: Enterprise Demand Forecasting + Inventory Optimization Platform (Batch + Streaming)
Roles & Responsibilities:
- Led end-to-end delivery of an enterprise Demand Forecasting and Inventory Optimization platform by building scalable data engineering and ML pipelines supporting replenishment decisions across warehouses.
- Architected ingestion workflows using AWS S3, AWS Glue, and automated ETL, consolidating sales, inventory, promotions, and seasonal-event datasets into a unified lake.
- Built distributed transformations using Apache Spark, PySpark, and optimized SQL, enabling multi-terabyte processing of SKU/store-level history.
- Developed reusable feature engineering frameworks using Python, time-series features, and training dataset standardization for consistent model inputs.
- Implemented forecasting models using Prophet, XGBoost, and LightGBM, improving accuracy during promotions, holidays, and demand spikes.
- Built experiment tracking and model lifecycle workflows using MLflow, model versioning, and reproducible training pipelines for audit-ready deployments.
- Deployed forecasting inference services using FastAPI, REST APIs, and containerized endpoints for downstream replenishment systems.
- Enabled near real-time inventory updates using Kafka, supporting streaming-based stock movement events and rapid stockout detection.
- Orchestrated batch and ML workflows using Apache Airflow, automating daily feature refresh, retraining, and pipeline dependency management.
- Enforced reliability through data validation using Great Expectations, schema checks, and anomaly detection before training and inference execution.
- Built a governed analytics layer using Snowflake, dimensional modeling, and performance-tuned datasets for forecast reporting and business dashboards.
- Implemented end-to-end monitoring using CloudWatch, Prometheus, and alerting for pipeline failures, latency, and forecast error metrics.
- Optimized performance and storage using Parquet, partitioning, caching, and scalable compute tuning to reduce Spark runtime and cost.
- Delivered CI/CD automation using Git, Jenkins, and automated unit/integration tests to support safe releases of data and ML services.
- Containerized and deployed workloads using Docker and Kubernetes, enabling consistent runtime environments and autoscaling across environments.

Environment: AWS S3 | Glue | Spark | PySpark | SQL | Airflow | Kafka | Python | Prophet | XGBoost | LightGBM | MLflow | Snowflake | Great Expectations | Parquet/Delta | FastAPI | Docker | Kubernetes | Jenkins | Prometheus/CloudWatch

Client: Optimum Info System Pvt Ltd, Chennai, India | Jan 2016 to Nov 2018
Role: Data Engineer
Roles & Responsibilities:
- Performed customer churn analysis using logistic regression on PostgreSQL datasets and delivered results through Excel dashboards.
- Built interactive business reporting for sales forecasting and inventory trends using Power BI across quarterly units.
- Analyzed large-scale structured and semi-structured data from Hadoop using Hive queries and automated Python ETL routines.
- Developed early pricing-elasticity models in R to measure regional sensitivity and simulate discount impact.
- Engineered robust data preprocessing workflows using Python to clean, normalize, and merge CRM and marketing logs.
- Created exploratory insights and trend reporting using Matplotlib, histograms, and density profiling for customer behavior analysis.
- Built customer segmentation logic using tenure, purchase frequency, and support tickets, supported by complex SQL joins and aggregations.
- Managed reproducible analysis and reporting workflows using SVN version control, enabling team collaboration and consistent monthly executive reporting.

Environment: PostgreSQL | SQL | Hadoop | Hive | Python | R | Power BI | Excel (Pivot Tables, Dashboards) | Matplotlib | ETL/Data Preprocessing | Logistic Regression | Time-Series Forecasting | SVN