
Re-Architecting a Multi-Tenant Cloud Vulnerability Scanning Platform for Scale, Isolation, and Observability

Figure: Cloud Vulnerability Scanning Flow (for illustrative purposes only).

When a client's primary vulnerability scanning service, a critical backend component of their Cloud Security Posture Module (CSPM), began failing under operational load, Falistro was asked to step in. The legacy system, built on a traditional client-server, pull-based SaltStack architecture, was inefficient, unstable, and unable to handle the required scale.

 

Falistro's diagnosis confirmed that the system's core deficiencies were significant:
  • Poor Multi-Tenancy: It could not ensure fair resource sharing between tenants, a critical issue when some customer accounts generated 100x more data than others.

  • Lack of Isolation: Failures in scanning one cloud account could cascade, causing resource starvation or logical conflicts that impacted other accounts.

  • Low Observability: The client's primary concern was a lack of insight into the system's operational status, which made troubleshooting challenging.

  • Complex Constraints: The system needed to perform scanning on dedicated nodes, support workload prioritization, and function across multiple, disparate environments: a public SaaS, customer on-premise (both connected and air-gapped), and federated scenarios where on-premise clients could utilize the SaaS infrastructure.


Architectural Solution & Implementation

Falistro proposed a fundamental architectural shift from the legacy pull model to a push-based system. The new design centered on Kubernetes priority-based workloads to manage job execution, resource allocation, and fault tolerance.
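As a rough illustration of what a priority-based Kubernetes workload can look like, the sketch below builds a `batch/v1` Job manifest for a single account scan. It is not the client's actual configuration: the function `build_scan_job`, the priority class name, the `workload: scanner` node label, the container image, and the resource figures are all hypothetical. Only the manifest fields themselves (`priorityClassName`, `nodeSelector`, `backoffLimit`, `resources`) are standard Kubernetes fields.

```python
import json


def build_scan_job(tenant, account, priority_class, cpu="500m", memory="512Mi"):
    """Return a Kubernetes batch/v1 Job manifest (as a dict) for one account scan.

    All names and values are illustrative assumptions. `tenant` and `account`
    are assumed to already be valid DNS-1123 label fragments.
    """
    labels = {"tenant": tenant, "account": account}
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"scan-{tenant}-{account}", "labels": labels},
        "spec": {
            # Retry a failed scan a bounded number of times.
            "backoffLimit": 2,
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # The scheduler uses the referenced PriorityClass to order
                    # pending pods and to decide preemption.
                    "priorityClassName": priority_class,
                    # Pin scans to dedicated nodes (hypothetical label).
                    "nodeSelector": {"workload": "scanner"},
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "scanner",
                            # Placeholder image, not a real registry.
                            "image": "registry.example.invalid/scanner:latest",
                            "args": ["--account", account],
                            # Requests and limits bound what one scan can consume.
                            "resources": {
                                "requests": {"cpu": cpu, "memory": memory},
                                "limits": {"cpu": cpu, "memory": memory},
                            },
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_scan_job("acme", "prod-123", "scan-high"), indent=2))
```

In a push-based design, a controller would submit manifests like this to the cluster rather than having scan agents poll for work; the referenced PriorityClass objects would need to exist in the cluster beforehand.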


This approach, which the client termed "revolutionary," was modeled in detail and presented with supporting data to demonstrate its viability. After the client approved the design, the project proceeded as a complete rewrite of this critical component.


Falistro engineered the new architecture to address all of the initial requirements:
  • Scalability & Fairness: A push-based model with priority queues enabled the system to manage workloads dynamically, ensuring fair resource distribution and horizontal scalability.

  • Resilience & Isolation: Workloads were containerized and isolated, preventing issues in one tenant's scan from affecting the stability of the entire system.

  • High Observability: The new system was built with deep instrumentation, providing the clear observability the client's team needed to manage operations effectively.
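The fairness idea in the list above can be sketched in a few lines. This is a minimal illustration, not the production scheduler: the `ScanJob` and `FairScanQueue` names, the "larger number means more urgent" priority convention, and the least-served tie-break are all assumptions made for the example. Each tenant gets its own FIFO queue, and the dispatcher picks the highest-priority job at the head of any queue, preferring the tenant that has been served least so far.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanJob:
    tenant: str
    account: str
    priority: int = 0  # larger value = more urgent (hypothetical convention)


class FairScanQueue:
    """Per-tenant FIFO queues, drained by priority, then by least-served tenant."""

    def __init__(self):
        self._pending = defaultdict(deque)  # tenant -> queued jobs
        self._served = defaultdict(int)     # tenant -> jobs dispatched so far

    def submit(self, job):
        self._pending[job.tenant].append(job)

    def next_job(self):
        # Only the head of each tenant's queue competes for the next slot.
        heads = [(tenant, q[0]) for tenant, q in self._pending.items() if q]
        if not heads:
            return None
        # Highest priority first; among equals, the tenant served least so far;
        # tenant name breaks any remaining tie deterministically.
        tenant, _ = min(
            heads,
            key=lambda h: (-h[1].priority, self._served[h[0]], h[0]),
        )
        self._served[tenant] += 1
        return self._pending[tenant].popleft()


if __name__ == "__main__":
    queue = FairScanQueue()
    for i in range(100):
        queue.submit(ScanJob("big", f"acct-{i}"))
    for i in range(2):
        queue.submit(ScanJob("small", f"acct-{i}"))
    print([queue.next_job().tenant for _ in range(4)])
```

With this policy, a tenant that submits 100 jobs cannot starve one that submits 2: the two tenants alternate until the smaller queue is empty. A real system would likely add weights, aging, and per-tenant concurrency limits.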


Outcome

The resulting architecture resolved the existing inefficiencies and bugs and supported all three deployment scenarios (SaaS, on-premise, and federated). The re-engineered service exceeded client expectations, delivering a scalable, observable, and efficient solution for a vital component of their cloud security offering.

Design. Develop. Scale

Registered Address

Basement, S-145 Panchsheel Park, New Delhi, 110017, India
