Cloud Supportability Engineer
Company: NetApp
Location: Boulder
Posted on: August 5, 2022
Job Description:
Job Summary
At NetApp, we have an amazing opportunity to
transform industries with cutting-edge data services. We are
developing a broad portfolio of solutions that harness the power of
data. NetApp public/private cloud offerings, high-performance &
highly scalable storage products are stretching what's possible for
our customers & partners. These solutions change how data is
stored, consumed & interpreted, unleashing innovation. Join our team
& push the boundaries of what's possible.
As a Cloud Supportability
Engineer with a focus on driving supportability into new cloud
features, you will be empowered to take ownership and be a
self-starter. You will be working closely with product management,
engineering teams and support teams to deliver support readiness for
new features. Your ability to influence feature designs, support
tooling and automation, perform detailed support analysis and
improve case deflection will be key in meeting our support KPI
goals. Your attention to detail and feedback will be a key success
measure on the projects to which you contribute. You'll work in
diverse groups on cross-team projects which have defined outcomes
and deadlines.
Responsibilities
Provide architectural guidance to
optimize the supportability across NetApp's cloud services.
Be hands-on in the implementation of our supportability tooling and
automation. You've driven the deployment of these tools at scale
and have experience working with a rapidly growing infrastructure.
Develop and improve tooling to assist in root cause analysis and
mitigation for supportability of our cloud offering.
Proactively monitor systems, networks, and applications to provide
input on improving the stability, security, efficiency, and
scalability of systems.
Drive more advanced automated solutions, workflows, readiness,
and/or process improvements (e.g., bots, machine learning,
diagnostic tools) for support teams and/or customers.
Identify new and innovative ways to analyze data and trends to
improve supportability KPIs.
Participate in rotating on-call incident response on weekdays and
weekends.
Participate in incident reviews to create improved supportability
documentation, diagnostics, tooling, error messages and automation.
Partner with Engineering and SRE teams to improve processes,
automate, and enhance incident response focused on maintaining and
improving SLAs.
Mentor members of the staff on large-scale cloud deployments. You're
an expert in deploying in the cloud and can bring a teaching mindset
to help others benefit from your experience.
Experience & Key Characteristics
8+ years of experience
running high-availability systems and supporting distributed
infrastructure.
Experience with microservice architectures running on Kubernetes and
containers.
Understanding of using tracing systems and finding ways to improve
visibility holistically across a distributed service.
Experience using industry tools to search databases, create custom
queries, and generate reports.
Familiarity with operating systems such as Linux (e.g., CentOS,
Ubuntu) and Windows.
A basic understanding of at least one scripting language such as
Bash, Python, Perl, or Ruby.
A basic understanding of public cloud vendors such as Azure, AWS,
Google Cloud or others.
Experience administering Clustered Data ONTAP or other storage
platforms is a big plus.
Outstanding communication and presentation skills, written and
verbal. Excellent listening skills and a high degree of empathy.
You are great at solving problems, sorting meaningful information
from noise, and taking action.
Ability to work in large, collaborative teams to achieve
organizational goals. Passionate about building an innovative and
inclusive culture.
Education
Typically requires a minimum of 8 years of related
experience with a bachelor's degree; or 5 years and a Master's
degree; or a PhD without experience; or equivalent work
experience.
The Team You'll Be Working With
The SRE team within
NetApp Public Cloud Service group is responsible for the scaling
and support of our multi-region, multi-cloud application. This team
is made up of a group of software engineers, site reliability
engineers, and security experts that own the deployment
architecture and strive to improve our infrastructure through deep
partnership with other teams across the organization. We are a
customer-focused team continuously improving our services to meet
the needs of our customers and reduce toil on our team. We meet
regularly to mentor and challenge each other as we collaborate
across projects. We're only able to accomplish our mission through
diverse teams innovating, together.
If you ask a NetApp employee why
they work here, the answer is inevitably the same: the people. At
NetApp, our culture is at the heart of what we do. We place
importance on trust, integrity, teamwork, and caring above all
else. NetApp is a place where people are empowered to make a
difference. Empowered to innovate. Empowered to collaborate.
Empowered to help ourselves and others be data-driven and change
the world. We take care of each other, our customers, our partners,
and our communities simply because it's the right thing to do.
We work hard but also recognize the importance of work-life balance
for our employees because what's important to them is important to
us! Recently we implemented Family First, which encourages
employees to take paid time off to bond with a new child (through
birth or adoption) or to care for a family member with a serious
health condition. Our volunteer time off program is best in class,
offering employees 40 hours of paid time off per year to donate
their time to their favorite organizations. We provide
comprehensive medical, dental, wellness and vision plans for you
and your family. We offer educational assistance, legal services,
and access to discounts and fitness centers. We also offer
financial savings programs to help you plan for your future.
Keywords: NetApp, Boulder, Cloud Supportability Engineer, Engineering, Boulder, Colorado