Introduction
Kumo prioritizes the security of its services and is committed to ensuring the highest standards of protection. Technical security measures are important, but so are the processes and people that keep the platform secure and your data safe. Our security philosophy centers on layered security controls designed to protect Kumo’s AI SaaS cloud infrastructure. We apply multiple logical and physical control layers, including access management, least-privilege access, strong authentication, logging and monitoring, and vulnerability management with external penetration testing.
An integral component of our information security strategy is the proactive monitoring and management of systems to identify critical security issues. When issues are identified, they are thoroughly evaluated and promptly resolved. We rely on industry-standard information security best practices and compliance frameworks, such as NIST SP 800-53 and the ISO/IEC 27000 series, to support our security initiatives.
Our goal is to make users feel confident using our service for their most sensitive workloads. We believe that transparency about our controls, environment, standards, and processes is of paramount importance. This document describes the security controls available in the Kumo AI SaaS cloud infrastructure.
Kumo AI SaaS Cloud Infrastructure
The Kumo AI SaaS cloud infrastructure supports our end-to-end machine learning (ML) platform. Kumo enables enterprises to apply state-of-the-art predictive analytics: data scientists can tackle many prediction problems right away by first registering data sources and then issuing SQL-like predictive queries that specify their ML tasks. Kumo then executes each predictive query and automates the entire process of feature preparation, label engineering, training dataset creation, model optimization, and MLOps, making it easy for users to build multiple ML models. Kumo is a fully managed software-as-a-service (SaaS) platform. More specifically:
- There is no hardware (virtual or physical) to select, install, configure, or manage.
- There is no software to install, configure, or manage.
- Kumo handles ongoing maintenance, management, upgrades, and tuning.
High Level Architecture
The following architecture diagram describes the Kumo AI SaaS cloud infrastructure, which consists of these components:
- Control Plane
- Predictive Query Engine
- Metadata DB
- Cache
Control Plane
The control plane is a collection of services that coordinate activities across Kumo. The control plane runs on compute instances provisioned by Kumo from the cloud provider.
Predictive Query Engine
Predictive query processing is performed in the processing layer. Kumo processes queries using massively parallel processing (MPP) clusters of deep learning accelerators. After predictive query processing is complete, all data on the predictive query processing engine is purged.
Metadata DB
The Kumo metadata database holds data schema information, machine learning models, aggregate data statistics, and other metadata.
Cache
The data cache is neither directly visible nor directly accessible to users. Data cache objects are accessible only through predictive query operations run using Kumo. Kumo caches transformed and derived data for faster predictive query processing and execution, and manages all aspects of how this data is stored, including its organization, file size, structure, compression, metadata, and statistics. The cache retention policy is defined by the user.
Architecture and Security
Architecture
The following diagram provides an overview of how Kumo integrates with your Databricks environment as a native application:
- The control plane is deployed in Kumo’s VPC.
- The Kumo data engine runs on Databricks compute clusters in your Databricks workspace.
- The Kumo AI engine is deployed in Kumo’s VPC and operates only on ephemeral data read from the data store; this data is erased immediately after training or batch prediction completes.
- A Unity Catalog (UC) volume in your Databricks workspace is used as the data store.
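As an illustration of the ephemeral-data pattern described for the AI engine, the following Python sketch shows one way a job could stage data in a scratch directory that is removed as soon as the job ends, whether it succeeds or fails. It is not Kumo’s implementation: the names ephemeral_workspace and run_job, the directory prefix, and the use of local disk are all assumptions made for the example.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_workspace(prefix="job-scratch-"):
    """Yield a scratch directory that is deleted when the block exits.

    Hypothetical helper for illustration only.
    """
    workdir = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield workdir
    finally:
        # Runs on success and on failure, so staged data never outlives the job.
        shutil.rmtree(workdir, ignore_errors=True)


def run_job(rows):
    """Hypothetical job: stage rows on disk, compute a result, then clean up."""
    with ephemeral_workspace() as workdir:
        staged = workdir / "batch.csv"
        staged.write_text("\n".join(rows))
        # Stand-in for training or batch prediction: count the staged rows.
        result = len(staged.read_text().splitlines())
    # The scratch directory no longer exists at this point.
    return result, workdir
```

Because the cleanup sits in a finally clause, the scratch directory is removed even when the job raises an exception, which is the property that matters for data that must not persist after processing.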
Security
The Kumo native app for Databricks supports Kumo’s end-to-end machine learning (ML) platform. Kumo enables enterprises to apply state-of-the-art predictive analytics: data scientists can tackle many prediction problems right away by first registering data sources and then issuing SQL-like predictive queries that specify their ML tasks. Kumo then executes each predictive query and automates the entire process of feature preparation, label engineering, training dataset creation, model optimization, and MLOps, making it easy for users to build multiple ML models. With Kumo’s native app for Databricks:
- There is no hardware (virtual or physical) to select, install, configure, or manage.
- There is no software to install, configure, or manage.
- Kumo handles ongoing maintenance, management, upgrades, and tuning.