Coupang - [Coupang Pay] Staff, Back-end Engineer (Fintech SRE)
Coupang · Songpa-gu, Seoul · 8+ years of experience

[Coupang Pay] Staff, Back-end Engineer (Fintech SRE)

Position Details

About CoupangPay: Coupang Pay focuses on delivering innovative payment and financial services solutions to everyone who uses the Coupang app, from customers buying products on Coupang.com to marketplace vendors and restaurants that offer their services via Coupang Eats. We build these solutions on our latest tech innovations to serve the growing needs of Coupang's customers in Korea and Taiwan. This includes Coupay, an online wallet with a proprietary one-touch payment capability.


About the role: As a Staff Site Reliability Engineer (SRE) in CoupangPay, you will play a pivotal role in ensuring the reliability, scalability, and performance of our critical systems and services. You will be a technical leader, driving the design, implementation, and optimization of complex systems that meet the demands of a high-availability environment. This role requires deep expertise in the Observability Engineering (OE) stack (including Mimir, Loki, Tempo, and Grafana) and in Terraform-based automation. It also calls for experience in setting up, tuning, and scaling observability platforms that support business-critical services with high reliability and performance. As a Staff SRE, you will collaborate with cross-functional teams to architect solutions, identify and resolve system bottlenecks, and establish best practices in operational excellence. With a focus on automation, observability, and incident management, you will also mentor junior engineers, foster a culture of reliability, and contribute to the strategic direction of our product engineering initiatives. This is a unique opportunity to make a significant impact on the stability and scalability of our technology ecosystem.

Key Responsibilities

System Reliability and Performance
• Ensure the reliability, availability, and performance of critical systems and services.
• Proactively identify and address system bottlenecks, failures, and performance issues.

Technical Leadership
• Lead the design, implementation, and optimization of scalable and fault-tolerant architectures.
• Provide guidance and mentorship to junior engineers, fostering technical growth.

Automation and Tooling
• Develop and enhance automation tools to streamline operational processes and improve efficiency.
• Champion automation-first principles to reduce manual toil and operational overhead.

Observability and Incident Management
• Build and operate the OE stack. Take part in performance tuning, cost optimization, and observability initiatives that best serve the interests of the business.
• Drive incident response, root cause analysis, and post-incident reviews to improve systems.

Collaboration and Best Practices
• Partner with cross-functional teams (e.g., development, product, and infrastructure) to build robust systems.
• Define and implement best practices for reliability engineering, including CI/CD pipelines and infrastructure as code.

Strategic Contributions
• Influence the strategic direction of infrastructure and platform engineering initiatives.
• Evaluate and implement new technologies to enhance system resilience and operational capabilities.

Operational Excellence
• Drive continuous improvement in operational processes, reducing time to resolution for incidents.
• Promote a culture of accountability, innovation, and reliability throughout the engineering organization.

Qualifications

• Strong proficiency in programming languages such as Python, Go, or similar.
• In-depth knowledge of Linux/Unix systems, networking, and distributed systems.
• Experience with cloud platforms (AWS, GCP, or Azure) and container orchestration tools (e.g., Kubernetes, Docker).
• Strong understanding of observability tools (e.g., Prometheus, Grafana, or Datadog).
• Proficiency in Infrastructure as Code (IaC) using Terraform.
• Expertise in scaling and tuning Mimir and Loki for high-throughput workloads.
• Familiarity with distributed tracing using Tempo.
• Knowledge of performance optimization techniques for high-availability systems.
• Strong collaboration skills and the ability to work effectively with cross-functional teams.
• Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
• 8+ years of experience in Site Reliability Engineering or related roles in high-availability environments.

Application Deadline

Open until filled (rolling recruitment)

Work Location

570 Songpa-daero, Songpa-gu, Seoul
Copyright Wanted Lab Inc. Unauthorized reproduction and redistribution prohibited.