Senior Site Reliability Engineer (Tanzu) – Opportunity for Working Remotely

The Elevator Pitch: Why will you enjoy this new opportunity? 

The Reliability Engineering team is growing with a laser focus on bringing reliability engineering practices to our Tanzu customers.

We are a modern SRE team where the original focus on a “S”ite has been augmented by a focus on the Customer – after all, not all workloads are websites.

We are looking for someone who views Reliability Engineering as a way of life and is interested in approaching everything they do with that mindset.

What is the primary need, technical challenge, and/or problem you will be responsible for?

Our customers need a partner to assist with establishing and meeting their reliability goals while building, running, and managing production workloads on VMware Tanzu, in particular Tanzu Kubernetes Grid (TKG).

Conversely, our sales, solutions, support, and product development teams have a partner in us who understands reliability engineering, our customers’ reliability goals, and the associated challenges of building on or using our products and services.

As an engineer on this team, you will: 

  • Exercise your advanced Linux systems and network engineering experience while coaching customers on how to optimally build, run, and manage production workloads using Kubernetes-based systems.
  • Teach how to adopt reliability engineering practices such as observability, error budgets, blameless retrospectives, and chaos engineering.
  • Participate in on-call and interrupt rotations in the escalation path for TKG support and field teams.
  • Take a balanced approach to reduce operational load, where it’s cost-effective, through software and systems engineering.
  • Collaborate directly with customers, both via video and in writing.
  • Share your learnings and experiences with your team and others.
  • Participate within a fully remote and distributed team.

Success in the Role: What are the performance goals over the first 6-12 months you will work toward completing?

  • Propose, define, and drive assigned projects to completion, being clear when tradeoffs are needed and when deadlines need to be adjusted to accommodate higher-priority work.
  • Influence continuous improvements in our products by providing opinionated input in feature workstreams.
  • Demonstrate a commitment to the team SLOs.
  • Participate in training sessions, e.g., enabling support engineers across the globe to work with VMware’s customers on the front line.
  • Endeavor to complete the Certified Kubernetes Administrator (CKA) exam or similar.

What type of work will you be doing? What assignments, requirements, or skills will you be performing on a regular basis?

  • Contribute to reliability-related improvements and reliability features as part of VMware Tanzu services and the upstream Kubernetes ecosystem.
  • Partner with Program Managers to ensure appropriate technical details are understood and tracked, so each time we work with partner teams, we are bringing forward depth and context.
  • Use our ticketing system to triage, prioritize, and engage with partner teams when they need our assistance.

What is leadership like for this role? What is the structure and culture of the team like?

Our culture is rooted in inclusion, collaboration, and growth. We are a tight-knit group of adaptive, self-starting, and mission-driven individuals with a passion for doing our best work and bringing our best selves.

We make sure to invest time to consider each other’s perspectives and to appreciate our efforts and accomplishments. For real, we have an appreciation section during our team meeting! We also share what we have learned with each other, and we care for each other’s well-being.

The hiring manager for this role is Gustavo Franco, Senior Engineering Manager. Gustavo recently joined VMware after managing CRE, GCP, Incident Response, Disaster Recovery, and other SRE teams at Google for six years. Prior to this, Gustavo was an individual contributor SRE at Google for six years on Google Compute Engine, Google Cloud Storage, Google+, and other products.

 

Category: Engineering and Technology
Subcategory: Site Reliability
Experience: Manager and Professional
Full Time/Part Time: Full Time
Posted Date: 2021-04-19

VMware Company Overview: At VMware, we believe that software has the power to unlock new opportunities for people and our planet. We look beyond the barriers of compromise to engineer new ways to make technologies work together seamlessly. Our cloud, mobility, and security software form a flexible, consistent digital foundation for securely delivering the apps, services and experiences that are transforming business innovation around the globe. At the core of what we do are our people who deeply value execution, passion, integrity, customers, and community. Shape what’s possible today at http://careers.vmware.com.

Equal Employment Opportunity Statement: VMware is an Equal Opportunity Employer and Prohibits Discrimination and Harassment of Any Kind: VMware is committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment. All employment decisions at VMware are based on business needs, job requirements and individual qualifications, without regard to race, color, religion or belief, national, social or ethnic origin, sex (including pregnancy), age, physical, mental or sensory disability, HIV Status, sexual orientation, gender identity and/or expression, marital, civil union or domestic partnership status, past or present military service, family medical history or genetic information, family or parental status, or any other status protected by the laws or regulations in the locations where we operate. VMware will not tolerate discrimination or harassment based on any of these characteristics. VMware encourages applicants of all ages. VMware will provide reasonable accommodation to employees who have protected disabilities consistent with local law.

Job ID: R2107064
