Srinikethan - Lead SRE/Observability Engineer
[email protected]
Location: Dallas, Texas, USA
Relocation: Yes
Visa: H1B
Resume file: Sriniketan Lead SRE Engineer_1774902701679.docx
SRINIKETAN K
Texas, USA
Email: [email protected]
Phone: +1 972-924-5835 (Employer)
C2C Only, H1B (Passport Number also shared)

DEV-OPS | SRE | PRODUCT-OPS | BUSINESS-OPS | OBSERVABILITY

Astute professional with 9+ years of experience in streamlining development workflows, fostering a culture of reliability and continuous improvement, and contributing to operational excellence at scale. Leads world-class efforts to build and optimize scalable automation pipelines, improve system reliability, and enhance end-to-end monitoring and incident response. Thrives working cross-functionally with engineering, product, and business teams to ensure that cutting-edge technical solutions align with overall business goals. Skills and expertise include:

Version Control: Git | Bitbucket | GitHub Actions
Alerting Tools: Opsgenie | xMatters | Slack
Cloud Services: AWS | Azure | PCF | GCP
Databases: Oracle SQL | MySQL | Postgres | Redshift
Configuration Management: Ansible | Chef
Operating Systems: Linux | Windows | macOS
Data Processing: Apache Kafka | Offset Explorer
Package Management: Nexus | Artifactory | NuGet
Containerization & Orchestration: Docker | Kubernetes
Infrastructure Provisioning: CloudFormation Templates (CFT) | Terraform
Programming & Scripting Languages: Shell | Python | Groovy | Bash | YAML
Tools: Jira | IntelliJ | Eclipse | Visual Studio | Remedy | Rally
Monitoring: Splunk | Dynatrace | Grafana | ThousandEyes | ELK | CloudWatch | Datadog
CI/CD: Jenkins | XLR | TFS Build | Maven | SonarQube | Control-M | Coverity | Black Duck

PROFESSIONAL EXPERIENCE

Mastercard Inc., Texas, USA May 2022 - Present
Lead SRE/Observability Engineer
Develop and deploy intelligent automation for comprehensive monitoring, streamlined deployments, and swift incident resolution. Automated alert resolution for Pivotal Cloud Foundry (PCF) with predefined XLR (Digital.ai Release) templates that orchestrate a series of automated actions in response to specific alerts, aiming to reduce Mean Time to Detect (MTTD) and Mean Time to Mitigate (MTTM) by eliminating manual intervention.
Drive innovation by contributing to the transition from traditional middleware to cloud-native architectures. Partner with global engineering teams to ensure the resilience and stability of systems.
Automate infrastructure management using AWS CloudFormation and Terraform, resulting in a 42% reduction in provisioning time and ensuring consistent, repeatable deployments.
Developed and maintained CI/CD pipelines for continuous integration and delivery of applications to Kubernetes on AWS, which improved application scalability by 30%.
Streamlined the containerization process, reducing application deployment times by 25%.
Strategized migration of static servers to a Kubernetes environment, improving application reliability by 40%.
Responsible for the maintenance and operational support of BMC Remedy modules such as Incident Management, Problem Management, Change Management, Service Request Management, and Asset Management.
Maintain and implement enterprise monitoring and alerting using Splunk and Dynatrace, following organizationally defined best practices. Onboard log data from different sources and implement application performance dashboards.
Configure, deploy, and manage synthetic monitoring, transaction tests, and network path visualization.

DMI Inc., Virginia, USA May 2021 - April 2022
Senior DevOps Engineer
Built and deployed .NET applications using MS/TFS Build, Jenkins, and Octopus. Deployed new releases, updates, and patches for ASP.NET applications. Troubleshot application deployment and configuration issues.
Built and maintained a Jenkins CI/CD pipeline to support the team's non-prod and production systems; worked extensively on Jenkins, installing, configuring, and maintaining pipelines for end-to-end automation of all builds and deployments.
Worked with Windows Azure IaaS: Virtual Networks, Virtual Machines, Cloud Services, Resource Groups, ExpressRoute, VPN, Load Balancing, Application Gateways, Auto-Scaling, and Traffic Manager.
Focused on configuration management, SCM, build/release management, Infrastructure as Code (IaC), and Azure DevOps operations across production and cross-platform environments.
Created Azure services using ARM templates (JSON) and ensured that incremental deployments made no changes to the existing infrastructure.
Wrote custom PowerShell scripts and JSON templates to remediate Azure services, and Groovy scripts to automate the deployment process using Jenkins and Octopus; administered IIS app pools and virtual directories.
Use JIRA and Kanban Dashboard to track the defects of the applications and the SonarQube tool to analyze the code vulnerabilities.
Maintained technical system documentation for audits and compliance, and contributed to customer briefings.
Identified areas for improvement in the DevOps process and implemented changes to enhance efficiency and effectiveness.
Implemented solutions that keep middleware services running with Always On high availability.


Comcast Inc., Philadelphia, USA August 2019 - April 2021
DevOps/SRE Engineer
Collaborated with project stakeholders to define requirements and worked closely with development teams within agile methodologies and cross-functional activities. Designed, built, and maintained robust CI/CD pipelines, automating manual development and deployment processes to accelerate software delivery.
Applied knowledge of the RDK (Reference Design Kit) framework to develop and integrate new features, as well as to debug and optimize existing code.
Diagnose and resolve issues with GitHub (RDKCentral) workflows and integrations, ensuring a smooth and efficient development process.
Used and managed the Yocto build system to create custom Linux distributions and software images for various devices.
Provided developers with a secure continuous delivery system to promote adoption through to production, and supported them with Kubernetes deployments.
Troubleshot infrastructure issues involving Linux and network latency between systems, working with other DevOps teams to drive issues to closure.
Designed, developed, and maintained robust and scalable CI/CD pipelines using Jenkins Pipeline (Declarative and Scripted) to extend Jenkins functionality and automate tasks using Git, Artifactory, and Ansible.
Integrated Jenkins with notification and collaboration tools (e.g., Slack, Microsoft Teams, Opsgenie, Icinga).
Created and maintained Dockerfiles for various applications and services, ensuring efficiency, security, and reproducibility; implemented multi-stage builds to reduce final image size and improve security.
Emphasized integrating Docker into CI/CD pipelines and managing containerized deployments.
Monitor the health and resource utilization of Docker containers. Configure and manage security plugins and settings in Jenkins.


Apple Inc., Cupertino, USA August 2018 - July 2019
System Software Engineer
Designed build and deployment automation and release management, and implemented service reliability best practices (production maintenance and monitoring, alerting, L1/L2/L3 troubleshooting). Executed scripts to maintain system health. Maintained liaisons with other stakeholders to provide support and expertise as necessary.

Configured and implemented application monitoring, and set up dashboards, alerts, and logging strategies.
Successfully operationalized solutions in production environments, ensuring high availability across both production and non-production systems.
Develop and maintain Splunk search queries, dashboards, reports, and alerts based on user requirements.
Create and manage Splunk knowledge objects (e.g., lookups, field extractions, event types, tags).
Develop and maintain documentation for Splunk content and configurations.
Monitor security events and alerts in Splunk to identify potential security incidents.
Contribute to incident response efforts by providing Splunk data and analysis.
Worked with application and operations teams to resolve runtime failures and troubleshoot build and deployment issues with minimal downtime.

Texas A&M University, Kingsville, Texas, USA August 2016 - July 2018
Graduate Research Assistant
Supervise the project electrical and software commissioning team.
Participate in preparing the control/software project budget, including Engineering & Design efforts.
Prepare timely progress reports, including revised schedule, engineering charges, budget, etc., and distribute them to other members of the project team as needed.
Maintain liaisons with other departments, disciplines, and facilities to provide support and expertise as needed.


Orbit-Technologies, Chennai, India August 2013 - July 2015
Trainee Engineer
Created the company's DevOps strategy for Linux and Windows server environments, and created and implemented a cloud strategy.
Reported performance against SLOs to squad members; implemented and configured Chaos Monkey in data centers.
Continually improved performance against service level objectives (performance testing using JMeter).
Managed and supported Jenkins security-related issues and user access policies.
Worked with Ansible playbooks for virtual and physical instance provisioning, configuration management, patching, and software deployment across environments.
Applied zero-downtime release strategies such as blue-green deployments.

EDUCATION

Master of Science (MS), Electrical and Computer Science Department August 2016 - August 2018
Texas A&M University, US
