Lead Site Reliability Engineer – Tuxedo/WebLogic
Location: St. Louis, Missouri
Type: Direct Hire
Job #6823

You will be responsible for implementing enhancement changes, monitoring product health, planning long- and short-term capacity, resolving production issues, and designing initiatives that continue to evolve our services in line with our new API-first strategy. It is a very exciting time to be a part of the team as we continue to evolve our support and development of new solutions and enable our business to grow through the implementation of new capabilities.

The Lead Engineer is responsible for the availability, performance, and efficiency of our systems, along with change management, monitoring, emergency response, and capacity planning. These systems are a hybrid collection of Cloud, Web Services, and Legacy solutions. The ideal candidate will be experienced with Site Reliability Engineering, incorporating aspects of software engineering and applying them to infrastructure and operations support. This position creates a bridge between development teams and operations by applying a software engineering mindset to system administration topics, and vice versa.

As a Lead Engineer, you will be expected to use your technical knowledge to routinely monitor solution health and ensure that all rental solutions are meeting established SLAs. You will be called upon to assist with the resolution of production issues, which can include performance degradation, capacity concerns, and outage events. In addition, you will be responsible for monitoring and ensuring system health during maintenance events such as patching and upgrades. Over time, you will be expected to develop a full understanding of the rental solution ecosystem so that you can participate fully in problem-solving and troubleshooting efforts. We are looking for a talented individual who can serve as a subject matter expert in their area of focus and represent their department on complex assignments.
You will be responsible for evaluating the effectiveness of technology elements, conducting research and investigation, and recommending improvements that increase solution consistency and reliability.

- Contributes to the development of strategic capacity planning for the department
- Applies advanced knowledge of professional concepts and company objectives to resolve a wide range of complex system issues in creative and effective ways
- Focuses on production infrastructure support and strategic operational activities
- Monitors key performance metrics; escalates and addresses problems
- Serves as a subject matter expert in more than one area of responsibility; represents the team within and outside own department
- Assists with project planning; provides technical expertise to project teams and/or serves as technical lead on project teams within the area/department
- Works on large, complex assignments
- Able to support after-hours implementations as required and to participate in on-call support rotations
- Experienced with Cloud technologies
- Experienced in a production support role, applying critical thinking and problem-solving skills
- Some experience with command-line operation and system administration of multiple UNIX/Linux distributions
- Some Database Administration experience, SQL-based distributions preferred
- Has significant autonomy over completion of day-to-day work and receives general instructions on new projects or assignments
- Defines, develops, communicates, and implements standards, processes, and procedures for the team or department
- Establishes, maintains, and fosters relationships both within and outside the team and department
- Collaborates with Architects and recommends adjustments to the architecture to improve the overall capacity, performance, and quality of a solution
- Provides, develops, and maintains documentation for elements of technology
- Provides instruction and guidance to less senior team members on new tasks and assignments; ensures deadlines are met

As part of a collaborative team, you'll work closely with peers across disciplines to design, build, and maintain robust solutions. Your expertise in site reliability technologies will be key in driving proactive system improvements and ensuring seamless operations. We're looking for someone who thrives on teamwork, innovation, and solving complex challenges.

Required:
- Must be presently authorized to work in the U.S. without a requirement for work authorization sponsorship by our company for this position now or in the future
- Must reside in the United States (does not include Alaska or Hawaii)
- Must be able to work a schedule within U.S. Central Standard Time core business hours
- Bachelor's degree in Computer Science, Computer Information Systems, Management Information Systems, or related field preferred
- 5+ years of experience working in a software development engineering environment supporting languages and technologies such as C/C++, Java, PL/SQL, and shell scripting
- 5+ years of experience using and configuring middleware, including Tuxedo, WebLogic, and Tomcat
- 5+ years of experience with AIX system administration and associated tools
- 5+ years of experience with Splunk, Dynatrace, or equivalent tools
- Must be committed to incorporating security into all decisions and daily job responsibilities

Preferred:
- General network engineering background
- Knowledge of system/infrastructure testing, problem solving, and troubleshooting concepts
- Knowledge of capacity planning methodologies
- Knowledge of hardware concepts as they apply across infrastructure arenas, including networking devices, topologies, and resiliency patterns