Abhinash Reddy Peddyreddy
Gen AI Engineer
[email protected] | (270) 226-0303
Dallas, Texas, USA | Open to relocation | Visa: Green Card (GC)

Professional Summary
Gen AI Engineer with 11+ years of experience architecting and delivering enterprise-grade AI, ML, and Generative AI solutions across Healthcare, Telecom, and FinTech domains.
Expertise in designing Agentic AI Workflows and autonomous multi-agent systems using frameworks like LangGraph, CrewAI, and AutoGen for complex, multi-step business processes.
Production-scale deployment of RAG and GraphRAG architectures, utilizing Vector Databases (Pinecone, Milvus, Weaviate) and Knowledge Graphs (Neo4j) to minimize hallucinations and maximize retrieval accuracy.
Full-stack proficiency across major AI Clouds, including AWS Bedrock, Azure OpenAI (Prompt Flow), and Google Vertex AI, ensuring vendor-agnostic deployment and high availability.
Advanced LLM Fine-tuning and Optimization, specializing in PEFT, LoRA, and QLoRA techniques to adapt open-source models (Llama 3.1, Mistral, Gemma) for domain-specific tasks (a minimal adapter-setup sketch follows this summary).
Strategic Prompt Engineering leadership, implementing Chain-of-Thought (CoT), ReAct, and Tree-of-Thought prompting to drive reasoning capabilities in customer-facing applications.
Hands-on implementation of LLMOps and Observability, utilizing LangSmith, Arize Phoenix, and TruLens for model evaluation, cost tracking, and real-time drift detection.
Architected scalable inference pipelines using vLLM, NVIDIA NIM, and Triton Inference Server to optimize latency, throughput, and GPU utilization.
Expertise in the Apache Spark ecosystem (PySpark, MLlib) and Databricks for processing petabyte-scale datasets and building robust feature stores for AI model training.
Proven success in Computer Vision and Multimodal AI, including handwritten OCR, object detection (YOLO), and image segmentation pipelines integrated with LLM reasoning.
Strong background in AI Governance and Security, implementing guardrails (NeMo, Llama Guard) for PII detection, content filtering, and protection against prompt injection attacks.
Developed sophisticated predictive and decision science models using XGBoost, LightGBM, and Ensemble methods for financial risk assessment and high-frequency forecasting.
Skilled in software internationalization and globalization, ensuring multilingual LLM readiness and translation validation for global software deployments.
Experienced in CI/CD for the AI/ML lifecycle, managing end-to-end productionization through Docker, Kubernetes (K8s), and Infrastructure as Code (Terraform/CDK).
Quantifiable track record of business impact, including reducing operational costs by up to 40% through automation and improving model accuracy by 25%+ in mission-critical environments.
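Illustrative sketch (referenced above): a minimal LoRA adapter setup with Hugging Face peft. The base checkpoint, target modules, and hyperparameters are assumptions for illustration, not values from any engagement below.

```python
# Minimal LoRA adapter setup with Hugging Face peft; the base checkpoint and
# hyperparameters are illustrative assumptions, not production values.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")  # gated checkpoint; any causal LM works

# Train only low-rank adapters on the attention projections; the base model
# stays frozen, which is what makes LoRA/QLoRA runs memory-cheap.
lora_config = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # adapter scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # typically well under 1% of base parameters
```

The same recipe extends to QLoRA by loading the frozen base model in 4-bit precision before attaching the adapters.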
Technical Skills:
Programming Languages & Frameworks: Python (Expert), SQL, Java, C, C++, R, TypeScript, JavaScript, Next.js, ReactJS, Scala, Impala, Hive, Shell Scripting (Bash)
Statistical Methods: Statistical Inference, Hypothesis Testing, Experimental Design (A/B Testing), Multivariate Analysis, Time Series Analysis, Auto-correlation, Bayesian Statistics (Bayes' Theorem), Statistical Modelling, ANOVA, Chi-Square Tests, Correlation and Covariance Analysis, Probability Distributions, Sampling Techniques, Residual Analysis, Cross-Validation, Descriptive Statistics.
Machine Learning: XGBoost, LightGBM, CatBoost, Ensemble Methods, Random Forest, Support Vector Machines (SVM), Recommendation Systems, Dimensionality Reduction (PCA, t-SNE), Feature Engineering, Hyperparameter Optimization (Optuna, Ray Tune), Time Series Forecasting (Prophet, ARIMA, SARIMA, LSTM), Model Interpretability (SHAP, LIME), Statistical Inference, Hypothesis Testing, A/B Testing, Scikit-learn, Azure Machine Learning Services, Model Evaluation Metrics (ROC-AUC, F1-Score, Precision-Recall), Error Analysis, Bias-Variance Tradeoff, Scalable ML Pipeline Design.
Deep Learning: PyTorch, TensorFlow, Keras, Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Transformer Architectures, Attention Mechanisms, Transfer Learning (ResNet, Inception, MobileNet), Autoencoders, Object Detection (YOLO, Faster R-CNN), Image Segmentation (U-Net, Mask R-CNN), OCR, Optimization Algorithms (Adam, SGD, RMSprop), Regularization Techniques (Dropout, Batch Normalization), Distributed Model Training, Model Quantization, Model Deployment.
Natural Language Processing (NLP): Transformer Models (BERT, GPT, T5), Hugging Face, SpaCy, NLTK, Large Language Modeling, Sequence-to-Sequence Models, Attention Mechanisms, Encoder-Decoder Architectures, Word Embeddings (Word2Vec, FastText), Named Entity Recognition (NER), Sentiment Analysis, Text Summarization, Machine Translation, Question Answering Systems, Topic Modeling (LDA), Dependency Parsing, Sequence Labeling, Semantic Search, Word and Sentence Similarity, Language Model Pre-training and Fine-tuning.
Generative AI: LLMs (GPT-4, Llama 3.1, Claude 3.5, Mistral, Gemma), Multi-Agent Systems (LangGraph, CrewAI, AutoGen), Retrieval-Augmented Generation (RAG), GraphRAG, Knowledge Graphs (Neo4j), Vector Databases (Pinecone, Milvus, Weaviate, Chroma, Faiss), LLM Orchestration (LangChain, LlamaIndex, Haystack), LangSmith, Fine-tuning (LoRA, QLoRA, PEFT), Prompt Engineering (CoT, ReAct), AI Guardrails (NeMo, Llama Guard), Local LLM Deployment (Ollama, vLLM, TGI), Inference APIs (Groq, NVIDIA NIM), Document Chunking & Ingestion Pipelines, Diffusion Models, Multimodal AI (CLIP, DALL-E), Transformer Architectures (T5, BERT), GANs (StyleGAN, CycleGAN), Variational Autoencoders (VAEs), Reinforcement Learning (RLHF), Zero-Shot & Self-Supervised Learning, AWS Bedrock.
MLOps & LLMOps: CI/CD for ML/LLM Lifecycle, Model Deployment (REST API, gRPC, Batch), Scalable LLM Serving (vLLM, TGI, NVIDIA NIM), LLM Observability & Evaluation (LangSmith, TruLens, Ragas, PromptLayer), Experiment Tracking (MLflow, Weights & Biases), Model Registry & Versioning, Data Versioning (DVC), Feature Stores (Feast), Orchestration (Kubernetes, Airflow, Kubeflow, Argo Workflows), Infrastructure as Code (Terraform, CloudFormation, CDK), Containerization (Docker), Monitoring & Alerting (Prometheus, Grafana, Drift Detection), Token Usage & Cost Tracking, Cloud AI Platforms (AWS SageMaker, AWS Bedrock, Google Vertex AI, Azure AI Services), Azure DevOps, GitHub Actions.
Big Data: Apache Spark, Databricks, Scala, Hadoop, Hive, HBase, Amazon Kinesis, Azure Synapse Analytics, Azure Data Lake Storage (ADLS), Azure Blob Storage, AWS S3, Delta Lake, Real-time Data Ingestion, Distributed Computing, Large-scale Data Processing.
Amazon Web Services: AWS Bedrock, SageMaker, EMR, MSK, Kinesis, Glue, Athena, S3, DynamoDB, Lambda, EC2, Step Functions, API Gateway, AWS CDK, CloudFormation, IAM, CloudWatch, Glacier, Lex, Rekognition, Transcribe, QuickSight, CodeCommit.
Database Servers: PostgreSQL, MySQL, Microsoft SQL Server, Amazon Redshift, Amazon RDS, MongoDB, Teradata, SQLite, NoSQL, Relational Database Design, Query Optimization.
Other Tools & Technologies: Git, GitHub, GitLab, Docker Compose, CUDA Toolkit, Linux/Unix Command Line, Bash Scripting, Jenkins, Terraform, Nginx, Redis, REST APIs, Swagger, Postman, Conda, Virtualenv, Makefile, YAML, JSON, VS Code, Jupyter Notebook, Google Colab, Power BI, Tableau.

Professional Experience:

Client: Oracle, Austin, Texas Dec 2023 - Present
Role: AI Engineer
Project: Oracle AI Cloud Services - Intelligent Automation and Generative AI Platform
Responsibilities:
Led the architectural design and deployment of Generative AI solutions within Oracle Cloud Infrastructure (OCI), integrating LLMs to drive intelligent automation and system diagnostics across global enterprise applications.
Architected production-grade Retrieval-Augmented Generation (RAG) frameworks leveraging Oracle AI Vector Search, LangChain, and FAISS to substantially improve enterprise search capabilities within Oracle Fusion Cloud (a retrieval sketch follows this list).
Engineered sophisticated multi-agent autonomous systems using LangChain and Vertex AI to automate high-value workflows, including service ticket classification and proactive IT support curation.
Orchestrated the fine-tuning and optimization of state-of-the-art LLMs (GPT-4, Claude, Llama-3) on OCI GPU clusters, achieving significant gains in latency reduction and domain-specific response fidelity.
Developed high-performance NLP pipelines using BERT and Hugging Face Transformers for intelligent document processing and knowledge extraction from complex multi-source support datasets.
Scaled AI services through microservice architectures using FastAPI and Flask, delivering RESTful APIs for real-time anomaly detection and predictive forecasting integrated with Oracle Digital Assistant.
Established comprehensive LLMOps and MLOps ecosystems using MLflow and Airflow, streamlining the end-to-end lifecycle from training to continuous governance across multi-cloud environments.
Deployed high-impact predictive models (XGBoost, PyTorch) for capacity planning and system health, resulting in a documented 25% reduction in mean time to resolution (MTTR).
Built petabyte-scale data ingestion and monitoring frameworks utilizing Kafka, Spark Streaming, and Delta Lake to process billions of telemetry records and log events daily.
Strategized the implementation of enterprise feature stores using PySpark and Snowflake, enabling scalable, consistent data access for model training across distributed teams.
Pioneered LLM evaluation and observability standards using TruLens and PromptLayer to ensure model explainability, bias mitigation, and alignment with corporate Responsible AI frameworks.
Secured AI infrastructure by integrating with OCI Identity and Access Management (IAM) and Vault services, ensuring robust data protection and secure key management for sensitive cloud operations.
Automated global AI infrastructure deployment using Infrastructure as Code (Terraform, Ansible), ensuring consistency across development, staging, and production environments.
Championed the GenAI Center of Excellence, defining enterprise governance standards, deployment templates, and mentoring cross-functional teams on modern AI integration patterns.
Delivered end-to-end AI/ML solutions across Azure and AWS ecosystems, leveraging Azure ML Studio, Bedrock, and Synapse to build scalable knowledge management and RAG-based search systems.
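Illustrative sketch (referenced in the RAG bullet above): the chunk-embed-retrieve core of such a pipeline with LangChain and FAISS. Package paths follow the post-0.1 LangChain split, and the corpus file, query, and embedding backend are hypothetical; this is a sketch, not Oracle's implementation.

```python
# Minimal RAG retrieval core with LangChain + FAISS (a sketch; corpus file
# and query are hypothetical, and package layout varies by LangChain version).
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Chunk the source corpus so each embedding covers a coherent span.
splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
chunks = splitter.split_text(open("support_kb.txt").read())  # hypothetical corpus

# 2. Embed chunks and build an in-memory FAISS similarity index
#    (OpenAIEmbeddings reads OPENAI_API_KEY from the environment).
index = FAISS.from_texts(chunks, OpenAIEmbeddings())

# 3. At query time, fetch the top-k chunks and splice them into the prompt
#    so the LLM answers from retrieved context rather than parametric memory.
docs = index.similarity_search("How do I reset a Fusion Cloud password?", k=4)
context = "\n\n".join(d.page_content for d in docs)
```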


Environment: Python, PySpark, TensorFlow, PyTorch, LangChain, Hugging Face, Oracle Cloud (OCI), Vertex AI, Azure ML, AWS Bedrock, FastAPI, MLflow, Airflow, Kubernetes (OKE/EKS), Terraform, Snowflake, Kafka, Delta Lake.


Client: PayPal, CA (Remote) Apr 2021 - Dec 2023
Role: AI/ML Engineer
Project: Generative AI-Driven Fraud Intelligence and Customer Experience Automation Platform
Responsibilities:
Architected and led the deployment of Generative AI systems for high-stakes fraud analytics, transaction monitoring, and conversational support using LangChain, Vertex AI, and AWS Bedrock.
Engineered production-grade Retrieval-Augmented Generation (RAG) pipelines, integrating large-scale enterprise data lakes with vector databases (Pinecone, FAISS) to enable semantic search and automated fraud case summarization.
Developed autonomous multi-agent systems using Google Agent Builder and LangChain to automate complex compliance workflows, including KYC/AML validations and regulatory documentation review.
Improved fraud detection accuracy by 26% through the development of real-time ML pipelines using XGBoost, LightGBM, and PyTorch, achieving sub-second inference for millions of daily payment events.
Orchestrated LLM benchmarking and fine-tuning initiatives (GPT-4, Claude, Gemini, Llama) to optimize precision, latency, and cost-effectiveness across unstructured financial datasets.
Scaled NLP architectures for intent detection and sentiment analysis using spaCy and Hugging Face Transformers, reducing manual triage time for customer disputes by 38%.
Designed high-throughput streaming frameworks using Kafka, Spark Streaming, and Delta Lake to process real-time telemetry and payment signals at petabyte scale.
Standardized LLMOps and MLOps lifecycles using MLflow, Airflow, and SageMaker, implementing automated versioning, CI/CD, and retraining triggers based on drift detection.
Pioneered Graph-based risk intelligence by building link-analysis models with Neo4j and GraphSAGE to identify and neutralize fraudulent transaction networks in real time.
Established AI Governance and Reliability frameworks using TruLens and RLHF to monitor for bias, ensure prompt fidelity, and maintain strict PCI-DSS and GDPR compliance.
Built scalable AI microservices via FastAPI and Flask, exposing high-availability RESTful APIs for real-time transaction scoring and generative assistant responses (a minimal scoring-endpoint sketch follows this list).
Reduced operational overhead and compute costs by 30% by optimizing data ingestion and feature engineering pipelines within Snowflake, Databricks, and PySpark.
Automated multi-cloud AI infrastructure using Terraform and AWS CloudFormation, ensuring secure, compliant, and elastic deployment across AWS and Azure environments.
Integrated enterprise-scale Knowledge Bases and semantic search workflows within AWS (S3, Lambda, Glue, Athena) to provide context-aware intelligence for risk investigation units.
Mentored cross-functional teams and junior engineers on LLM fine-tuning, agent orchestration, and modern MLOps best practices to foster a culture of AI excellence.
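Illustrative sketch (referenced in the microservices bullet above): a minimal FastAPI scoring endpoint around a trained XGBoost booster. The feature set, model artifact name, and decision threshold are assumptions for illustration.

```python
# Minimal real-time scoring endpoint: FastAPI wrapping a trained XGBoost
# booster. Feature names, artifact path, and threshold are illustrative.
import numpy as np
import xgboost as xgb
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
booster = xgb.Booster()
booster.load_model("fraud_model.json")  # hypothetical trained artifact

FEATURES = ["amount", "merchant_risk", "account_age_days"]

class Transaction(BaseModel):
    amount: float
    merchant_risk: float
    account_age_days: int

@app.post("/score")
def score(txn: Transaction):
    # Feature order must match the training matrix exactly.
    row = np.array([[txn.amount, txn.merchant_risk, txn.account_age_days]])
    prob = float(booster.predict(xgb.DMatrix(row, feature_names=FEATURES))[0])
    return {"fraud_probability": prob, "flagged": prob > 0.9}
```

Served with, e.g., `uvicorn app:app`, a client POSTs a JSON transaction to /score and gets a probability plus a flag.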
Environment: Python, PySpark, TensorFlow, PyTorch, LangChain, Hugging Face, Vertex AI, AWS Bedrock/SageMaker, Azure ML, FastAPI, MLflow, Airflow, Docker, Kubernetes (EKS), Terraform, Snowflake, Kafka, Delta Lake, Neo4j, Pinecone.


Client: Verizon Communications, Irving, Texas Nov 2019 - Mar 2021
Role: Sr Data Engineer
Project: AI-Driven Network Optimization & Customer Retention Platform
Responsibilities:
Spearheaded the design and deployment of large-scale AI models for network fault prediction, churn analysis, and service outage detection, enabling a shift toward proactive maintenance.
Architected end-to-end ML pipelines in AWS SageMaker and GCP Vertex AI, orchestrating training, validation, and automated deployment across distributed network monitoring systems.
Developed high-precision predictive models using XGBoost, LSTM, and Prophet to forecast network traffic and subscriber churn with a documented 92% accuracy rate (a forecasting sketch follows this list).
Implemented real-time anomaly detection systems leveraging Kafka Streams, Spark Structured Streaming, and PyTorch, successfully reducing false-positive alarms by 28%.
Integrated complex telemetry and IoT sensor data from global tower and router networks into centralized data lakes for advanced time-series forecasting and fault diagnostics.
Designed NLP-based analytics pipelines using BERT and spaCy to extract sentiment and intent from customer feedback, call transcripts, and internal NOC complaint logs.
Engineered high-performance RESTful inference APIs using FastAPI, embedding real-time predictions directly into mission-critical network operations dashboards.
Containerized and orchestrated ML workloads using Docker and Kubernetes (Amazon EKS), implementing horizontal pod autoscaling to handle fluctuating global traffic volumes.
Standardized MLOps workflows with MLflow and Airflow, automating experiment tracking, model versioning, and CI/CD deployments to ensure rapid iteration cycles.
Led the implementation of enterprise feature stores and distributed transformations using PySpark and AWS Glue to process and model petabyte-scale datasets.
Reduced deep learning inference latency by 40% through model quantization and ONNX Runtime, partnering with cross-functional teams to meet strict SLA requirements.
Deployed Graph Neural Network (GNN) prototypes on Neo4j and Amazon Neptune for sophisticated root-cause analysis of interdependent network node failures.
Operationalized model drift detection and automated retraining triggers using statistical monitoring and CloudWatch, maintaining long-term model precision.
Improved Mean Time to Detect (MTTD) network faults by 19% through A/B testing and simulation experiments of model-driven maintenance strategies.
Integrated AI-driven alerting into enterprise tools like ServiceNow and Grafana, providing field engineers with real-time, actionable insights into network health.
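Illustrative sketch (referenced in the forecasting bullet above): the core Prophet fit/predict loop for a daily traffic series. The CSV path and source column names are hypothetical stand-ins for a telemetry export.

```python
# Minimal Prophet fit/forecast loop for a daily series. Prophet requires
# columns named `ds` (timestamp) and `y` (value); the file and original
# column names here are hypothetical.
import pandas as pd
from prophet import Prophet

history = pd.read_csv("cell_traffic_daily.csv")       # hypothetical export
history = history.rename(columns={"date": "ds", "gbps": "y"})

model = Prophet(weekly_seasonality=True, yearly_seasonality=True)
model.fit(history)

# Forecast 30 days ahead; yhat_lower/yhat_upper bound the uncertainty.
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```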

Environment: Python, PySpark, TensorFlow, PyTorch, XGBoost, MLflow, FastAPI, Kubernetes (EKS), Kafka, Spark Streaming, Airflow, AWS SageMaker/Lambda/Glue, GCP Vertex AI, ONNX, Neo4j, Grafana, Power BI, BERT, spaCy.


Client: UnitedHealth Group (UHG), North Carolina, USA Mar 2016 - Oct 2019
Role: Data Analyst
Project: Predictive Health Risk Stratification & Cost Optimization Platform
Responsibilities:
Developed predictive ML models (XGBoost, Random Forest, Logistic Regression) for forecasting readmissions, high-cost claimants, and chronic disease progression (a minimal classifier sketch follows this list).
Enhanced model accuracy through feature selection, correlation analysis, and cross-validation, achieving a ~25% improvement in prediction precision.
Automated data preparation and ML workflows on AWS (S3, EC2, SageMaker, Lambda) for scalable training and deployment of healthcare models.
Built end-to-end ML pipelines integrating data ingestion, transformation, model training, and inference orchestration through CI/CD frameworks.
Implemented NLP-based analytics (spaCy, NLTK) to extract ICD/CPT codes, conditions, and treatment patterns from physician notes and unstructured EHR narratives.
Prototyped deep learning models (TensorFlow 1.x, Keras) for sequential patient journey modeling and risk progression prediction.
Partnered with data engineering teams to migrate legacy analytics workloads to cloud-native architectures on AWS and Databricks.
Developed data visualization dashboards (Tableau, Power BI) to deliver patient risk trends, cost metrics, and care program effectiveness insights to stakeholders.
Collaborated with clinical and actuarial teams to translate predictive insights into actionable care strategies and targeted intervention campaigns.
Implemented model monitoring and drift detection pipelines to maintain accuracy and fairness over time using SHAP and internal validation frameworks.
Integrated model governance and documentation processes aligning with HIPAA, PHI, and FDA audit compliance.
Conducted A/B testing comparing ML-driven outreach vs. rule-based triggers, yielding a 17% higher early-risk detection rate.
Mentored junior analysts and data engineers in feature engineering, ML evaluation, and cloud model deployment best practices.
Presented insights and model outcomes to clinical leadership and operations teams, driving enterprise adoption of AI-based preventive care programs.
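Illustrative sketch (referenced in the first bullet above): an XGBoost classifier with SHAP attributions of the kind used for readmission-risk scoring. The data is synthetic and the feature semantics are assumed, purely for illustration.

```python
# Minimal risk classifier with SHAP attributions (synthetic data; the four
# columns stand in for engineered clinical features).
import numpy as np
import shap
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(size=1000) > 1).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
model.fit(X_train, y_train)

# Per-prediction feature attributions: the interpretability layer that
# clinical and audit reviewers typically require alongside the raw score.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
print("mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0))
```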
Environment: Python, SQL, Pandas, NumPy, Scikit-learn, TensorFlow (v1.x), Keras, XGBoost, spaCy, NLTK, AWS (S3, EC2, SageMaker, Lambda), Tableau, Power BI, Git, Jupyter Notebook, CI/CD, Data Governance (HIPAA).


Client: JP Morgan Chase, Jersey City, NJ Dec 2013 - Mar 2016
Role: Python Developer/Data Engineer
Project: Enterprise Risk & Customer Analytics Platform
Responsibilities:
Collaborated with the Data Engineering and Risk Analytics teams to design scalable data ingestion pipelines consolidating credit card, loan, and customer transaction data across multiple internal systems.
Developed Python automation scripts for large-scale data extraction, cleansing, and transformation using requests, pandas, and numpy, improving processing efficiency by 30%.
Engineered ETL workflows on Google Cloud Platform (GCP) using Cloud Dataflow, Pub/Sub, and BigQuery for real-time data streaming and aggregation from transactional sources.
Built and optimized relational and dimensional data models (Star and Snowflake schema) in BigQuery and Teradata for marketing, credit, and compliance analytics.
Created data validation frameworks in Python ensuring consistency across regulatory and operational data (Basel III, CCAR datasets).
Implemented Apache Airflow DAGs for automated orchestration, dependency handling, and monitoring of daily data refresh pipelines (a minimal DAG sketch follows this list).
Partnered with the credit risk team to build machine-learning models (Logistic Regression, Decision Trees) predicting credit default and delinquency probabilities.
Applied unsupervised learning (K-Means clustering) to segment customers by spending and repayment behavior for targeted offers.
Designed and implemented fraud detection prototypes leveraging PySpark MLlib and transaction anomaly rules on streaming datasets.
Developed and tuned complex SQL procedures, views, and materialized queries in Teradata to support downstream risk reporting.
Built Tableau dashboards visualizing key banking KPIs: delinquency trends, utilization ratios, churn indicators, and fraud risk scores.
Containerized data pipelines using Docker for environment consistency across development, testing, and production.
Collaborated with infrastructure teams to deploy ML models into GCP Cloud Functions and schedule retraining with Airflow.
Supported data quality, lineage, and audit documentation to align with JP Morgan Chase's data governance and compliance standards.
Conducted A/B testing to evaluate model effectiveness and fine-tuned hyperparameters for improved recall and precision in fraud detection.
Integrated logging and monitoring via Stackdriver to ensure pipeline health and timely error alerting.
Contributed to migration of on-prem ETL jobs to GCP, reducing data load time by 40% and operational cost by 25%.
Presented analytics insights to business users and compliance teams to drive data-driven decisions on risk controls and marketing campaigns.
Mentored junior developers on Python best practices, query optimization, and GCP data tooling.
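Illustrative sketch (referenced in the Airflow bullet above): a minimal daily-refresh DAG with a linear extract-transform-load chain. Task bodies and the dag_id are placeholders; `schedule=` is the Airflow 2.4+ keyword (older releases use `schedule_interval=`).

```python
# Minimal daily-refresh DAG (Airflow 2.x); task bodies are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull daily credit/transaction extracts")        # placeholder step

def transform():
    print("cleanse and conform to the dimensional model")  # placeholder step

def load():
    print("load curated tables for risk reporting")        # placeholder step

with DAG(
    dag_id="daily_risk_refresh",        # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                  # `schedule_interval` on Airflow <2.4
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain: extract -> transform -> load.
    t_extract >> t_transform >> t_load
```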
Environment: Python, PySpark, GCP (BigQuery, Dataflow, Pub/Sub, Cloud Functions, Stackdriver), SQL, Teradata, Apache Airflow, Docker, Tableau, Pandas, NumPy, Scikit-learn, BeautifulSoup, Matplotlib, Seaborn, AWS S3, Linux.