Scribble Data

Commercial Information

Vendor Name

Scribble Data

History

Founded by Venkata Pingali and Indrayudh Ghoshal

Stand-alone vs. Platform

Stand-alone feature store

Delivery Model

Self-Managed Commercial or Fully Managed Cloud Service

Clouds Supported

AWS, GCP, On-Prem

Pricing Model

Per Node Pricing, Other

Service Level Guarantees

Uptime

Support

24x7 support and response-time guarantees

Feature Store Capabilities

Feature Definitions

Declarative framework for defining features (incl. transformations and materialization)

Feature definitions are managed in central repo
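
To illustrate what a declarative feature definition held in a central repository can look like, here is a minimal sketch in plain Python. It is not Scribble Data's API: the FeatureDefinition and FeatureRepo classes, their fields, and the "total_spend" feature are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


@dataclass(frozen=True)
class FeatureDefinition:
    """Hypothetical declarative spec; not the vendor's actual schema."""
    name: str
    entity: str                      # key column, e.g. "customer_id"
    source: str                      # logical name of the input dataset
    transform: Callable[[List[Row]], object]  # applied once per entity
    materialize: str = "offline"     # "offline", "online", or "both"


class FeatureRepo:
    """In-memory stand-in for a central definition repository."""

    def __init__(self) -> None:
        self._defs: Dict[str, FeatureDefinition] = {}

    def register(self, definition: FeatureDefinition) -> None:
        if definition.name in self._defs:
            raise ValueError(f"duplicate feature: {definition.name}")
        self._defs[definition.name] = definition

    def compute(self, name: str, rows: Iterable[Row]) -> Dict[object, object]:
        """Group rows by entity key and apply the declared transform."""
        d = self._defs[name]
        groups: Dict[object, List[Row]] = {}
        for row in rows:
            groups.setdefault(row[d.entity], []).append(row)
        return {key: d.transform(group) for key, group in groups.items()}


repo = FeatureRepo()
repo.register(FeatureDefinition(
    name="total_spend",
    entity="customer_id",
    source="orders",
    transform=lambda rows: sum(r["amount"] for r in rows),
    materialize="both",
))

# Hypothetical input rows.
orders = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 5.0},
    {"customer_id": 2, "amount": 7.0},
]
total_spend = repo.compute("total_spend", orders)
```

Keeping the transformation and the materialization target in one registered object is what lets a platform orchestrate pipelines from the definition alone; a real product would typically persist these definitions in a versioned repository rather than in memory.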

Automated Transforms

Automated pipeline orchestration

Managed Batch, Streaming and Real-Time Transformations

Automated backfill of historical data

Pipeline visualization
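
Automated backfill generally means computing a feature for historical partitions that have not been materialized yet. The sketch below shows that idea for daily partitions in plain Python. It is an assumption-laden illustration, not the vendor's implementation: the backfill and daily_revenue functions and the raw_orders data are hypothetical.

```python
from datetime import date, timedelta
from typing import Callable, Dict, Iterator, Set


def daterange(start: date, end: date) -> Iterator[date]:
    """Yield each day from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def backfill(
    start: date,
    end: date,
    already_materialized: Set[date],
    compute_partition: Callable[[date], float],
) -> Dict[date, float]:
    """Compute only the daily partitions missing from [start, end]."""
    return {
        day: compute_partition(day)
        for day in daterange(start, end)
        if day not in already_materialized
    }


# Hypothetical raw data: order amounts keyed by day.
raw_orders = {
    date(2024, 1, 1): [10.0, 5.0],
    date(2024, 1, 2): [7.0],
    date(2024, 1, 3): [],
    date(2024, 1, 4): [2.5, 2.5],
}


def daily_revenue(day: date) -> float:
    """Hypothetical feature: total order amount for one day."""
    return sum(raw_orders.get(day, []))


filled = backfill(
    start=date(2024, 1, 1),
    end=date(2024, 1, 4),
    already_materialized={date(2024, 1, 2)},  # skipped, already computed
    compute_partition=daily_revenue,
)
```

Skipping partitions that already exist keeps a backfill safe to re-run after a partial failure.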

Feature Ingestion

Spark/Pandas batch feature ingestion into offline & online store

Spark Streaming feature ingestion into offline & online store

Storage and Feature Processing Infrastructure

Online storage: Redis (experimental)

Offline storage: Client-specific (S3, Hive, databases)

Feature Processing: Spark and Flink

Feature Sharing and Discovery

Web UI

Searchable feature catalog with metadata

Feature discovery including transformations, data lineage, and values

Feature versioning and dependency management

Training Dataset Generation

Dataset generated from offline storage using Python SDK
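
Generating a training dataset from offline storage usually involves a point-in-time join, so that each training example sees only feature values available at the time of its label. The sketch below shows one common way to do this with pandas.merge_asof. It does not use the Scribble Data Python SDK; the column names and sample data are hypothetical.

```python
import pandas as pd

# Label events: one row per training example, with the observation time.
labels = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "event_time": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-01-10"]),
    "churned": [0, 1, 0],
})

# Offline feature values, with the time each value became available.
features = pd.DataFrame({
    "customer_id": [1, 1, 2, 2],
    "feature_time": pd.to_datetime(
        ["2024-01-01", "2024-01-15", "2024-01-02", "2024-01-12"]
    ),
    "total_spend": [15.0, 40.0, 7.0, 9.0],
})

# merge_asof requires both frames to be sorted on their time keys.
training = pd.merge_asof(
    labels.sort_values("event_time"),
    features.sort_values("feature_time"),
    left_on="event_time",
    right_on="feature_time",
    by="customer_id",
    direction="backward",  # latest feature value at or before the event
)
```

The backward direction is what prevents leakage: customer 2's value recorded on 2024-01-12 is not joined to the label observed on 2024-01-10.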

Online Serving

Serving layer designed by end-user

Monitoring and Alerting

Data drift detection

Data quality monitoring

Monitoring of serving latencies and uptime
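
As an illustration of data drift detection, the sketch below computes the Population Stability Index (PSI) between a baseline sample of a feature and a current sample. PSI is one widely used drift metric; the source does not state which method the product uses, so this choice, the psi function, and the sample data are assumptions.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline and a current sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def histogram(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical samples: a uniform baseline and a copy shifted by 0.5.
baseline = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in baseline]

no_drift = psi(baseline, baseline)
drift = psi(baseline, shifted)
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, though a suitable alert threshold depends on the feature and the number of bins.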

Security and Data Governance

Data remains in end-user's cloud account

RBAC

SSO

Data encryption at rest and in flight

Integrations

Batch data: Any standard datasource (DB, object store)

Streaming data: Any standard source (Kafka)