Ashok - Generative AI Engineer
[email protected]
Location: Frisco, Texas, USA
Relocation: yes
Visa: GC EAD
PROFESSIONAL SUMMARY:

AI/ML + GenAI Engineer (8–10+ years) building end-to-end solutions across banking, healthcare, and enterprise platforms.
Designed and deployed LLM-powered applications (chatbots, copilots, search/Q&A) using OpenAI, Hugging Face Transformers, LangChain, and modern prompt/agent workflows.
Built RAG pipelines with vector databases (Pinecone/FAISS/Chroma/Weaviate) and optimized retrieval using chunking, re-ranking, metadata filtering, and evaluation.
Developed NLP systems (classification, entity extraction, summarization, semantic search) using BERT/Transformer architectures and fine-tuning strategies (LoRA/PEFT where applicable).
Delivered production ML pipelines for prediction, risk scoring, anomaly detection, and forecasting using scikit-learn, TensorFlow, PyTorch, and strong feature engineering.
Implemented MLOps: MLflow/model registry, CI/CD (GitHub Actions/Jenkins/Azure DevOps), automated training, deployment, monitoring, and drift/performance alerting.
Deployed scalable services using Docker, Kubernetes, and serverless/managed platforms (e.g., AWS Lambda/SageMaker, Azure ML, GCP Vertex AI).
Built data engineering/ETL pipelines using Spark/Databricks, SQL, and orchestration tools; enabled batch + streaming workflows (Kafka when needed).
Designed cloud-native architectures on AWS/Azure/GCP with secure networking, IAM/RBAC, private endpoints, logging, and observability (CloudWatch/Azure Monitor/Grafana/Prometheus).
Worked with diverse data sources: structured + unstructured, APIs, documents, logs, and enterprise systems; ensured data quality and reproducibility.
Created analytics and executive reporting using Power BI/Tableau and Python-based analysis to translate model outcomes into measurable business impact.
Strong focus on security, privacy, and compliance (HIPAA/GDPR/SOC2 principles), including data anonymization and governed ML workflows.
Proven collaborator in Agile environments, partnering with product, engineering, and business stakeholders to deliver reliable, high-impact AI systems.
Tech stack highlights: Python, SQL, Spark, TensorFlow, PyTorch, scikit-learn, OpenAI APIs, LangChain, Hugging Face, MLflow, Docker, Kubernetes, AWS/Azure/GCP.
Built and tuned prompt templates, tool-calling flows, and safety/guardrails (input validation, policy filters, hallucination reduction patterns) for reliable GenAI apps.
Implemented model evaluation with offline + online metrics (precision/recall, BLEU/ROUGE where relevant, retrieval hit-rate, latency, cost per query) and A/B testing.
Optimized inference performance (batching, caching, quantization where applicable, async processing) to meet low-latency and high-throughput SLAs.
Designed feature stores / reusable feature pipelines and standardized training datasets for consistent experimentation and faster iteration.
Hands-on with APIs & microservices (FastAPI/Flask style patterns), authentication, rate-limiting, and production-grade error handling.
Developed document ingestion pipelines (PDF/HTML/text), enrichment (metadata, embeddings), and indexing strategies for enterprise knowledge search.
Experience with time-series and operational data (monitoring signals, device/telemetry logs) for forecasting, anomaly detection, and alerting.
Led/mentored teams on best practices in coding standards, peer reviews, model governance, and repeatable delivery across environments.
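The RAG retrieval pattern referenced above (chunking, retrieval, re-ranking) can be sketched minimally. This is a toy illustration only: a term-overlap score stands in for embedding similarity, and an in-memory list stands in for a vector store such as Pinecone or FAISS; the sample text and function names are hypothetical.

```python
# Toy sketch of chunking + top-k retrieval; term overlap stands in for
# embedding similarity, a Python list stands in for a vector database.

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def score(query: str, passage: str) -> float:
    """Fraction of query terms present in the passage (illustrative only)."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the top-k chunks by score (a simplified re-ranking step)."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

docs = chunk("The policy covers water damage. Fire damage requires a rider. "
             "Claims must be filed within 30 days of the incident.")
top = retrieve("when must claims be filed", docs, k=1)
```

In a production pipeline, `score` would be replaced by cosine similarity over dense embeddings, and metadata filtering would narrow the candidate set before re-ranking.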
TECHNICAL SKILLS:

Programming & Scripting: Python, SQL, JavaScript/TypeScript, YAML; Bash for automation and tooling
Machine Learning Core: feature engineering, model selection, cross-validation, hyperparameter tuning, error analysis
Deep Learning: architecture design and optimization using PyTorch/TensorFlow/Keras; transfer learning and fine-tuning
GenAI / LLM Engineering: OpenAI & Azure OpenAI; prompt design, function/tool calling, structured outputs, guardrails
RAG & Vector Search: embedding generation, chunking strategies, retrieval tuning, reranking patterns; FAISS/Pinecone integration
NLP: Transformers (Hugging Face), spaCy/NLTK; NER, text classification, semantic search, summarization workflows
Computer Vision & Document AI: OpenCV pipelines; OCR with Tesseract; image/document preprocessing and extraction
Data Engineering: ETL/ELT pipeline development; Apache Spark/SparkSQL; workflow orchestration with Airflow
MLOps (Lifecycle): MLflow/W&B for experiment tracking, model registry, versioning; reproducible training pipelines
Model Deployment: FastAPI/REST services, batch vs. real-time inference, performance optimization and scaling basics
Cloud Platforms: AWS (S3, EC2, Lambda, SageMaker), Azure (ML Studio, Functions), GCP (Vertex AI, BigQuery)
Datastores: PostgreSQL/MySQL/SQL Server; MongoDB; Redis caching; schema design and query optimization fundamentals
Observability & Reliability: logging/metrics, drift monitoring, alerting, model performance tracking in production
Experimentation & Statistics: A/B testing, hypothesis testing, KPI-driven evaluation, uplift measurement
Security & Compliance: HIPAA-aware data handling, data masking, access control fundamentals, responsible AI practices
CI/CD & DevOps: Git-based workflows, automated testing, build/release pipelines (GitHub Actions/Jenkins)
Data Quality & Governance: schema validation, data lineage basics, DQ checks, documentation standards
API Integrations: third-party API ingestion, authentication (OAuth/API keys), rate-limiting and retry strategies
Performance Engineering: profiling, inference latency optimization, batching, caching strategies for scalable serving
Collaboration & Delivery: Agile/Scrum, Jira/Confluence, code reviews, stakeholder communication and technical documentation
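The retry/rate-limiting strategy listed under "API Integrations" follows a standard exponential-backoff pattern. A minimal stdlib-only sketch, assuming a hypothetical `TransientError` standing in for 429/5xx-style retryable failures; real code would wrap an HTTP client call.

```python
# Exponential-backoff retry sketch; TransientError is a placeholder for
# a retryable HTTP failure (e.g., 429 rate limit or 5xx server error).
import time

class TransientError(Exception):
    """Stand-in for a retryable API failure."""

def call_with_retry(fn, max_attempts: int = 4, base_delay: float = 0.01):
    """Retry fn() with exponential backoff; re-raise after max_attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.01, 0.02, 0.04, ...

attempts = {"n": 0}
def flaky():
    """Simulated endpoint that fails twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError()
    return "ok"

result = call_with_retry(flaky)
```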

PROFESSIONAL EXPERIENCE:

USAA (Plano, Texas)
Generative AI Engineer (Contract)
Oct 2023 – Present

Responsibilities:

Built and deployed GenAI solutions to auto-generate compliance reports from regulatory filings, reducing manual effort and cycle time for audit-ready documentation.
Developed GPT/Azure OpenAI assistants for internal audit, financial reconciliation, and policy Q&A, improving analyst productivity and response consistency.
Designed and implemented RAG architectures (LangChain/LangGraph) with embeddings + vector search (FAISS/Pinecone/Azure Cognitive Search) for secure contract search and clause extraction.
Fine-tuned domain LLMs (incl. LoRA/QLoRA) on financial/healthcare text to support fraud detection, anomaly detection, and risk analytics workflows.
Built document AI pipelines using LayoutLM/Transformers + OCR (Tesseract) for classification, extraction, and summarization, boosting NER/document-extraction accuracy and automation.
Delivered an enterprise healthcare chatbot using Azure OpenAI + LangChain, reducing support calls by 30% and improving answer accuracy by 40%.
Implemented conversation context management with Azure Functions, Cosmos DB, Redis, cutting resolution time by 25% via multi-turn orchestration and fast retrieval.
Created MLOps pipelines with Azure ML, MLflow, Docker, Azure DevOps, reducing deployment cycles by 35% and improving reproducibility across environments.
Improved LLM inference performance via quantization, batching, caching, and endpoint tuning, reducing latency and increasing throughput for production traffic.
Built cloud-native microservices (FastAPI/REST) with event-driven design (Event Grid/Service Bus), enabling reliable, scalable GenAI integrations.
Established monitoring & observability (Azure Monitor/Grafana/Log Analytics/ELK) for model health, drift signals, and proactive alerting in production.
Implemented data quality + regression testing (Great Expectations/pytest) to prevent pre-release issues and stabilize model and pipeline changes.
Developed enterprise analytics dashboards (Power BI/Tableau) and KPI reporting to communicate model impact, usage, and operational performance.
Integrated AI systems with AWS/Azure services (S3/EC2/Lambda/API Gateway/DynamoDB/Bedrock; Azure ML/Cognitive Services) for secure end-to-end delivery.
Achieved measurable business outcomes including 86% reduction in manual processing, 60% faster analysis cycles, 22% lower contact-center volume, and $500K annual operational savings through GenAI automation.
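The response-caching step mentioned above can be illustrated with a small TTL cache. This is a hedged, in-memory sketch: production systems would typically use Redis (as listed earlier), and the `answer` function is a placeholder standing in for a real LLM call.

```python
# Minimal TTL-cache sketch for LLM response caching; an in-memory dict
# stands in for Redis, and the model call is a placeholder.
import time

class TTLCache:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        self._store.pop(key, None)  # drop expired or missing entries
        return None

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=60)
calls = {"model": 0}  # counts how often the "model" is actually invoked

def answer(prompt: str) -> str:
    cached = cache.get(prompt)
    if cached is not None:
        return cached              # cache hit: skip the model call
    calls["model"] += 1            # placeholder for the real LLM request
    result = f"response:{prompt}"
    cache.put(prompt, result)
    return result

first = answer("summarize policy")
second = answer("summarize policy")  # served from cache
```

Keying the cache on the normalized prompt (and any retrieval context) is what makes repeated queries cheap; the TTL bounds staleness when the underlying documents change.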

Environments:
Python (2.7/3.x), SQL, .NET/C#, Pandas/NumPy, scikit-learn (GridSearchCV), OpenCV; Deep Learning with TensorFlow/Keras, PyTorch, GANs.
GenAI/LLM: RAG, LangChain, OpenAI Studio, Gemini, chatbots/conversational AI, prompt engineering, LoRA/QLoRA.
Data + MLOps/Cloud: Databricks, Spark/PySpark, Hadoop, Kafka, Airflow/Zeppelin; MLflow, TensorBoard, ELK; AWS (SageMaker, Bedrock), Docker, Kubernetes, CI/CD, Tableau; Agile.

Holland America (Seattle, WA)
AI/ML Engineer
Jan 2020 – Sep 2023

Responsibilities:

Built predictive risk and financial forecasting models using Bayesian methods (HMM), SVM, Random Forest, and Gradient Boosting, improving accuracy and decision reliability for finance use cases.
Designed an AI-driven risk assessment pipeline on AWS SageMaker / GCP Vertex AI, covering training, evaluation, and scalable deployment for production inference.
Developed a reusable Vector/Embedding Development framework to support semantic search, similarity matching, and context retrieval across enterprise workflows.
Implemented real-time streaming data pipelines with Kafka + Spark/SparkSQL, enabling low-latency model scoring and near-real-time analytics.
Delivered LLM-powered automation using Azure OpenAI / OpenAI APIs, including summarization, intelligent search, and internal assistant capabilities for business teams.
Built secure RAG systems (LangChain/LangGraph) with FAISS/vector stores to answer questions from policies/contracts and reduce manual document review.
Created OCR + GPT pipelines (e.g., Tesseract + LLM) to digitize and interpret scanned/handwritten forms, producing structured data for downstream processing.
Developed claims and underwriting copilots (insurance domain) to summarize adjuster notes, accelerate policy issuance, and guide agents through recommendations and FAQs.
Applied document clustering and topic modeling to organize large policy bundles, improve discoverability, and streamline policy management workflows.
Generated synthetic datasets using LLMs and diffusion models to expand training coverage, support testing, and improve robustness for edge cases.
Implemented human-in-the-loop feedback loops and validation checks to reduce hallucinations, improve response quality, and align outputs with underwriting rules.
Automated CI/CD for ML using Cloud Build/Vertex AI pipelines (and/or Azure DevOps), including model versioning, scheduled retraining, and controlled rollouts.
Established monitoring and drift detection with production observability (logs/metrics/alerts), ensuring stable model performance over time.
Strengthened security & compliance (GDPR/SOC2-style controls) via IAM/RBAC, encryption, and audit-ready logging for regulated environments.
Built analytics and stakeholder reporting using Tableau/Power BI and Python dashboards, translating model outputs into actionable business insights.
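The drift-detection signal described above reduces to comparing a live window of a model metric against its training-time baseline. A simplified stdlib sketch, using a z-score on the window mean as a stand-in for heavier tests (PSI, KS); the sample metric values are illustrative.

```python
# Toy drift-alert sketch: flag when the mean of a recent metric window
# drifts more than z_threshold baseline standard deviations away.
import statistics

def drift_alert(baseline: list[float], window: list[float],
                z_threshold: float = 3.0) -> bool:
    """True when the window mean shifts beyond z_threshold baseline stdevs."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    z = abs(statistics.mean(window) - mu) / sigma
    return z > z_threshold

baseline = [0.90, 0.91, 0.89, 0.92, 0.90, 0.91]  # e.g., daily accuracy
stable   = [0.90, 0.89, 0.91]                     # within normal variation
drifted  = [0.70, 0.68, 0.72]                     # clear degradation
```

In production this check would run on a schedule over logged predictions, with the alert wired into the observability stack (logs/metrics/alerts) noted above.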

Environments:
Python, SQL; ML/DL with scikit-learn, TensorFlow/PyTorch; GenAI with Azure OpenAI/OpenAI APIs, LangChain/LangGraph, RAG + FAISS, and OCR via Tesseract.
Cloud & data platforms: AWS (SageMaker, Bedrock) + GCP (Vertex AI); streaming/real-time analytics with Kafka and Spark/SparkSQL; orchestration with Airflow.
MLOps/Prod: MLflow, Docker, Kubernetes, CI/CD, monitoring/observability (logs/metrics/alerts), and security/compliance (IAM/RBAC, encryption, audit logging) with Tableau/Power BI dashboards.

Dell (Boston, MA)
Data Scientist / ML Engineer
Jun 2017 – Dec 2019

Responsibilities:
Designed and implemented enterprise data warehousing using Star/Snowflake schema modeling to improve analytics performance, governance, and reporting reliability.
Built and automated ETL/ELT pipelines across multiple sources using Airflow, enabling consistent data ingestion and downstream BI consumption.
Developed scalable data processing pipelines leveraging Spark/SparkSQL, Hadoop/HDFS, and batch frameworks to process high-volume datasets efficiently.
Implemented real-time streaming architectures with Kafka + Spark Streaming/Flume/Storm, delivering low-latency insights and operational dashboards.
Engineered cloud-based data solutions using Azure Data Factory, Azure Synapse, AWS Glue, and Amazon Redshift, optimizing storage, compute, and secure access.
Optimized complex SQL queries, stored procedures, and performance tuning (Oracle/MySQL/PostgreSQL/SQL Server), reducing query latency and improving SLA adherence.
Built monitoring and observability for pipelines and clusters using ELK/Grafana/Cloudera Manager, enabling proactive alerting and faster incident resolution.
Implemented data quality validation using Great Expectations/Apache Griffin, ensuring accuracy, completeness, and consistency for regulated reporting.
Established CI/CD for data and ML workflows with Git, Jenkins/GitHub Actions, and Azure DevOps, improving release velocity and reducing deployment risk.
Developed and deployed ML/DL models (Python, scikit-learn, TensorFlow/Keras) for healthcare diagnostics and predictive analytics, including risk stratification use cases.
Built computer vision pipelines (CNNs) for medical imaging (CT/MRI/X-ray) classification and anomaly detection to support clinical decision workflows.
Applied NLP techniques to extract medical concepts and signals from unstructured clinical notes, improving documentation search and decision support.
Integrated ML solutions into hospital systems and operational workflows, enabling AI-driven triage and patient prioritization while maintaining low-latency inference.
Ensured data security, compliance, and anonymization aligned with healthcare standards; implemented access controls and audit-ready data handling practices.
Created stakeholder-ready Power BI/Tableau dashboards to visualize patient outcomes and operational KPIs, reducing reporting time and improving decision-making.
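The data-quality validation described above (Great Expectations style) boils down to declarative checks that report failing rows before bad data propagates. A hedged, stdlib-only sketch; the expectation names, column names, and sample rows are illustrative, not the library's actual API.

```python
# Expectation-style data-quality checks over plain dict rows; a toy
# stand-in for Great Expectations, with illustrative column names.

def expect_not_null(rows, column):
    """Fail for rows where the column is missing or None."""
    failures = [i for i, r in enumerate(rows) if r.get(column) is None]
    return {"check": f"{column} not null",
            "passed": not failures, "failing_rows": failures}

def expect_between(rows, column, low, high):
    """Fail for non-null values outside the [low, high] range."""
    failures = [i for i, r in enumerate(rows)
                if r.get(column) is not None and not (low <= r[column] <= high)]
    return {"check": f"{column} in [{low}, {high}]",
            "passed": not failures, "failing_rows": failures}

rows = [
    {"patient_id": 1, "age": 34},
    {"patient_id": 2, "age": None},   # null value
    {"patient_id": 3, "age": 212},    # out-of-range value
]
report = [
    expect_not_null(rows, "age"),
    expect_between(rows, "age", 0, 120),
]
```

Running such a report as a pipeline gate is what turns data quality from an audit activity into an automated pre-release check.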

Environments:
Python, SQL; Data warehousing & ETL/ELT with Star/Snowflake modeling, Airflow, Azure Data Factory, Azure Synapse, AWS Glue/Redshift.
Big data & streaming: Spark/SparkSQL, Hadoop/HDFS, Kafka + Spark Streaming (Flume/Storm), with monitoring via ELK, Grafana, CloudWatch.
ML/AI & delivery: scikit-learn, TensorFlow/Keras (NLP + CNNs), data quality with Great Expectations/Apache Griffin, CI/CD with Git, Jenkins, GitHub Actions/Azure DevOps, and Tableau/Power BI.



Matrix (Hyderabad, India)
Data Engineer
Jun 2014 – Jul 2015

Responsibilities:
Designed and built scalable ETL/ELT pipelines using SQL and Python, automating recurring data workflows and improving data accuracy, freshness, and reliability for analytics and reporting.
Developed and optimized Informatica/SSIS mappings and workflows to integrate data from multiple sources, reducing load failures and improving throughput through efficient transformation design.
Modeled and maintained data warehouse schemas (star/snowflake) and implemented incremental loads in platforms like AWS Redshift/Synapse, improving query performance and reporting SLAs.
Implemented data quality checks (validation rules, reconciliation, anomaly checks) and automated alerts to prevent bad data propagation into downstream dashboards and ML pipelines.
Built and tuned complex SQL queries, views, and stored procedures, reducing latency and enabling high-performing BI datasets for Power BI/Tableau.
Orchestrated pipelines with schedulers (e.g., Airflow/Control-M) and implemented restartability and dependency management.
Enforced data governance practices including cataloging, lineage documentation, RBAC, and audit logging to meet privacy, security, and compliance requirements.
Partnered with cross-functional teams to translate business requirements into data products, documenting pipelines, metrics definitions, and operational runbooks for supportability.
Monitored production workloads, performed performance tuning, and delivered capacity planning to ensure scalable data processing under growing volumes and tight SLAs.
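The incremental-load pattern mentioned above works by tracking a high-watermark so each run picks up only rows changed since the last load. A minimal sketch, assuming integer timestamps and illustrative column names; real implementations persist the watermark in a control table.

```python
# High-watermark incremental load sketch; updated_at values and row
# shapes are illustrative stand-ins for a real source table.

def incremental_load(source_rows, watermark):
    """Return rows newer than the watermark, plus the advanced watermark."""
    new_rows = [r for r in source_rows if r["updated_at"] > watermark]
    new_mark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_mark

source = [
    {"id": 1, "updated_at": 100},
    {"id": 2, "updated_at": 205},
    {"id": 3, "updated_at": 310},
]
batch, mark = incremental_load(source, watermark=150)  # picks up ids 2 and 3
```

Storing `mark` atomically with the load commit is what makes the pipeline restartable: a failed run never advances the watermark past data it actually wrote.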

Environments:

Python, SQL; ETL/ELT development with Informatica/SSIS mappings, scheduling/orchestration using Airflow/Control-M, and performance tuning with stored procedures/queries.
Data platforms & BI: Star/Snowflake schema modeling on AWS Redshift/Azure Synapse; reporting via Power BI/Tableau.
Ops & governance: Data quality/monitoring & alerting, Git-based documentation/runbooks, security/compliance with RBAC, audit logging, and data governance (cataloging/lineage).
