Work in Washington Veterans Jobs

Job Information

Oracle Director, Site Reliability Engineering (Join OCI) in Seattle, Washington

Job Description

Manage a team that designs, develops, troubleshoots, and debugs software programs for databases, applications, tools, networks, etc.

As a director of the software engineering division, you will apply your extensive knowledge of software architecture to manage software development tasks associated with developing, debugging or designing software applications, operating systems and databases according to provided design specifications. Build enhancements within an existing software architecture and envision future improvements to the architecture.

Assists in the development of short-, medium-, and long-term plans to achieve strategic objectives. Regularly interacts across functional areas with senior management or executives to ensure unit objectives are met. Ability to influence thinking or gain acceptance of others in sensitive situations. Demonstrated leadership and people management skills. Strong communication skills, analytical skills, and a thorough understanding of product development. BS or MS degree or equivalent experience relevant to functional area. 7 years of software engineering or related experience.

If you are a Colorado resident, please contact us or email us at oracle-salary-inquiries_us@oracle.com to receive compensation and benefits information for this role. Please include this Job ID: 144041 in the subject line of the email.

Responsibilities

The Oracle Cloud Infrastructure (OCI) team can provide you the opportunity to build and operate a suite of massive scale, integrated cloud services in a broadly distributed, multi-tenant cloud environment. OCI is committed to providing the best in cloud products that meet the needs of our customers who are tackling some of the world’s biggest challenges.

Oracle’s Cloud Infrastructure (OCI) team is a new ground-up effort to build Infrastructure-as-a-Service that operates at a high scale in a broadly distributed multi-tenant cloud environment. Our customers run their businesses on our cloud, and our mission is to provide them with best-in-class compute, storage, networking, database, security, and an ever-expanding set of foundational cloud-based services. These are exciting times in our space - we are growing fast, still at a relatively early stage, and working on ambitious new initiatives.

We are building a new Software Assurance Gateway team at OCI. Our mission is to build and operate a set of gateway services to perform comprehensive software assurance of the applications running within a tenancy (https://docs.oracle.com/en-us/iaas/Content/GSG/Concepts/concepts.htm). Software assurance includes measures to prevent the deployment of malware or vulnerable, malicious, or unauthorized code into the application’s tenancy. It also includes monitoring the flow of data in and out of the application’s tenancy to prevent unauthorized exfiltration of data.

We’re looking for a Site Reliability Director for the Gateways team with expertise and passion in building teams, coaching individuals, and solving difficult problems in distributed systems, virtualized infrastructure, and highly available services. These are exciting times in our space - we are growing fast, still at an early stage, and working on ambitious new initiatives within the software assurance and security areas.

As a Site Reliability Engineering Director, you and your team (direct or dotted line) will solve exciting technical challenges by analyzing, troubleshooting, and designing vital Oracle Cloud services, platforms, and infrastructure while always thinking about reliability, scalability, resilience, security, and performance. You will focus on engineering improvements to the systems that will eliminate whole classes of issues.

What You'll Do

  • Service Accountability – You will lead an SRE team whose mission is the shared full-stack reliability of a collection of software security assurance services and technology areas, with our customers.

  • Ownership Scope – As an SRE, you will understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of the production services you collaborate with. In partnership with your Development colleagues, you will have the responsibility to ensure that services are designed and delivered to be mission critical with a focus on security, resiliency, scale, and performance.

  • Operations Engineering – You will understand and be able to communicate the scale, capacity, security, performance attributes, and requirements of the services you own. We are subject matter experts, able to understand and communicate every characteristic of our service stack, such as:

      • degradation and behavior under load of the services and their dependencies

      • end-to-end tuning needs, optimizing resource utilization as load patterns fluctuate

      • instrumentation and metrics that clearly describe the service behaviors

      • scaling requirements and patterns

      • resiliency and recoverability, ensuring that backup/restore and disaster recovery capabilities are implemented, tested, and maintained

  • Automation – You will have a clear understanding of automation and orchestration principles, and will be eager to help automate, wherever and whenever the possibility arises, while simultaneously eliminating technical debt. Automation must be part of your DNA.

  • Incident Management – You will own and respond to customer incidents within the agreed-upon SLAs. You will manage the on-call rotation for your services.

  • Technical Experts - You will have a deep understanding of service topology and the dependencies required to troubleshoot issues and define mitigations. You will bring this expertise to bear in driving reliability improvements in the services you engage with.

  • Broad Interests - SREs are a rare mix of sysadmins and Development Engineers, and as such, have the ability to understand and explain the effect of product architecture decisions on the ability to run as distributed systems. They are driven by professional curiosity, and a desire to develop a deep understanding of their services and their dependencies.

  • Cross-team Collaboration – You will engage with and present to a wide variety of audiences, ranging from individual contributors and teams to executive leadership.

Basic Qualifications:

  • BS degree in Computer Science or related technical field

  • 4+ years of experience managing Site Reliability teams and managers.

  • 10+ years of Operations/Support Engineer/Site Reliability Engineer experience

  • Experience with container technologies (Docker, Kubernetes)

  • Knowledge of Internet protocols and standards, including SMTP, REST, SSL/TLS, DNS, and HTTP.

  • Experience deploying code within change management procedures

  • Experience troubleshooting complex software and/or networking issues

  • Strong understanding of cloud concepts and platforms

  • Experience in cloud technical support, operations, NOC, or similar is preferred

  • Experience with orchestration/automation tools (Ansible, Terraform, Chef, etc.)

  • Understanding of service KPI metrics, alarms, logging, and system health dashboards such as Grafana.

  • Defining and documenting technical architecture of complex and highly scalable products.

  • Experience working in an operational environment with mission-critical tier-one services with associated pager duty

Preferred Qualifications:

  • Experience with resiliency design and operation

  • Experience optimizing loads of large volumes of data into a database

  • Experience managing large fleets

  • Experience with performance tuning and optimization

  • Expertise in automation methodologies

  • Experience and understanding of security and compliance.

  • Experience working with large enterprise customers

  • Experience using Kubernetes in CI/CD environments

About Us

Innovation starts with inclusion at Oracle. We are committed to creating a workplace where all kinds of people can be themselves and do their best work. It’s when everyone’s voice is heard and valued, that we are inspired to go beyond what’s been done before. That’s why we need people with diverse backgrounds, beliefs, and abilities to help us create the future, and are proud to be an affirmative-action equal opportunity employer.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans status, age, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
