Vignesh Basker Basker

Work Experience

Total years of experience: 0 years, 8 months

SRE & DevOps Engineer at Ascendion

Kuwait
My current job since September 2023

Company: Collabera/Ascendion, Location: Richmond, VA, Sep 2023 - Present
Responsibilities:
• Design, implement, and maintain highly available and scalable systems, services, and architectures to support our applications and infrastructure in Amazon Web Services (AWS)
• Lead efforts to improve system reliability, monitoring, and performance, utilizing automation and best practices for continuous integration and deployment
• Collaborate with cross-functional teams to identify and resolve performance bottlenecks, scalability issues, and architectural challenges
• Design, deploy, and manage AWS cloud infrastructure, including AWS ECS clusters, AWS EC2 instances, AWS S3 storage, AWS Fargate, and other AWS services
• Provide technical leadership and guidance around DevOps best practices
• Proficient in scripting languages (e.g., Python, Bash) and configuration management tools (e.g., Ansible)
• Contribute to the architecture and design of our AWS-based infrastructure, ensuring security and scalability best practices
• Experience with Infrastructure as Code technologies such as Terraform and CloudFormation
• Experience in building and managing automated CI/CD pipelines and related tools such as GitHub and Jenkins
• Identify opportunities to reduce manual deployment and validation efforts
• Experience tracking vulnerability management metrics against enterprise standards
• Assist teams with vulnerability resolution, including researching solutions, addressing false positives to reduce system workloads, and building reports that give teams the data they need
• Implement enterprise standards and apply the definition of done to all existing and new applications.
• Help solve problems related to mission-critical services and contribute to solutions to prevent problem recurrence.
• Communicate with business and technical partners about incidents as they occur when they critically impact system performance or availability
• Manage end-to-end, large-scale resiliency and data recovery events for the organization and contribute to resolving gaps and issues
• Implement and enhance observability via platforms such as Splunk, New Relic, Datadog, and Amazon CloudWatch
• Participate in on-call rotations to ensure our SLAs are met 24x7
• Assist with organization-wide security considerations.
• Continue to automate common operations tasks using Infrastructure as Code and develop new runbooks
• Support optimizing variable cloud spending by applying rightsizing, scheduling, and serverless options.
• Develop and implement incident response procedures, conduct post-incident analysis, and drive root cause analysis (RCA) to prevent future incidents.
• Respond to incidents, troubleshoot system problems, and provide timely resolution.