10x Software Engineer
Lucidya
- Riyadh
- Permanent
- Full-time
- Extreme reliability (SLO-driven engineering)
- High-scale distributed processing (billions of data points)
- AI-native architecture (LLM + real-time intelligence)
- Design and operate high-throughput, event-driven pipelines across a 100+ microservice ecosystem handling billions of data points
- Build and scale distributed messaging systems on RabbitMQ, including backpressure management, consumer scaling, and queue health monitoring
- Develop and maintain API gateway layers with advanced routing (multi-upstream, traffic splitting, environment isolation)
- Architect SSO and identity federation for enterprise clients, supporting multi-IdP routing with zero coupling to core services
- Define clean service boundaries across ingestion, processing, and delivery pipelines spanning Ruby and Python
- Diagnose and resolve complex production issues (e.g., deadlocks, queue exhaustion, connection pool saturation) and eliminate their root causes
- Optimize PostgreSQL for heavy write workloads, covering contention management, schema design, triggers, and connection scaling
- Design and tune Elasticsearch for search, indexing, and real-time Arabic relevance at scale
- Make informed trade-offs between multi-process and async architectures based on workload characteristics
- Build and maintain observability across a large-scale system using Grafana, Loki, distributed tracing, and SLOs
- Own production incidents end-to-end, tracing failures across queues, search systems, and external integrations
- Lead root cause analysis and implement preventative measures across multi-service pipelines
- Build internal tooling that improves engineering velocity, automation, deployment gating, and review enforcement
- Turn architectural principles into enforceable standards and guardrails, not just documentation
- Drive platform decoupling and service isolation across the system
- Contribute to Kubernetes migration and infrastructure modernization
- Standardize and improve CI/CD pipelines across services
- Strong foundation in distributed systems. You understand failure modes before you write the first line of code
- Hands-on experience with event-driven architecture and message queues in production
- Deep comfort with concurrency, backpressure, and fault tolerance
- Track record of debugging complex production issues: not just fixing them, but preventing them
- Experience with Rails or Python backends at meaningful scale
- You improve systems you weren’t asked to touch
- Read and understand an existing codebase by week one
- See a broken system and fix it before anyone asks you to
- Have strong opinions about architecture and can back them up with data
- Think in systems: latency, throughput, failure modes, and cost at scale
- Treat documentation, tests, and observability as non-negotiable defaults, not afterthoughts
- Ship fast without breaking things. Speed and quality are not a trade-off for you
- Consistently exceed expectations; meeting the bar is a floor, not a target
- Are hungry for hard challenges and actively seek problems at the edge of your limits
- Feel a sense of urgency that doesn’t require external pressure
- Have rebuilt or stabilised something significant and can talk about it concretely
- Real scale: billions of events, not toy systems
- Direct impact at CTO and executive level
- Pre-IPO company with a clear trajectory, where your work has real impact on clients