Hopsworks

Commercial Information

Vendor Name

Hopsworks

History

First developed at KTH Royal Institute of Technology, now managed by the startup Logical Clocks

Stand-alone vs. Platform

Part of the Hopsworks MLOps platform

Delivery Model

Open source, self-managed commercial, and fully-managed cloud service

Clouds Supported

AWS and Azure (managed service), GCP and on-prem (self-managed)

Pricing Model

Cloud service: consumption pricing

Self-managed: per node pricing

Open source: free

Service Level Guarantees

Uptime and serving latencies

Support

24x7 support and response-time guarantees

Feature Store Capabilities

Feature Definitions

Feature ingestion jobs managed in notebooks

Automated Transforms

Orchestration of ingestion jobs via Apache Airflow DAGs
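
Orchestrating ingestion jobs as a DAG means each job runs only after the jobs it depends on have finished. The sketch below illustrates that ordering with Python's standard-library `graphlib`. It is not Airflow code, and the job names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical ingestion jobs, each mapped to the set of jobs it depends on.
dependencies = {
    "ingest_raw_events": set(),
    "compute_user_features": {"ingest_raw_events"},
    "compute_item_features": {"ingest_raw_events"},
    "join_features": {"compute_user_features", "compute_item_features"},
}


def execution_order(deps):
    """Return one valid run order in which every job follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
```

In Airflow itself the same dependencies would be declared between tasks inside a DAG definition, and the scheduler would work out the run order.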

Feature Ingestion

Spark/Pandas batch feature ingestion into offline & online store

Spark Streaming feature ingestion into online store
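
The two ingestion paths above differ in where data lands: batch ingestion writes to both the offline and online stores, while streaming ingestion writes to the online store. A minimal in-memory sketch of that split follows. The class and method names are hypothetical and are not the Hopsworks SDK. It assumes the offline store keeps full history and the online store keeps only the latest row per key.

```python
class ToyFeatureStore:
    """In-memory stand-in for an offline (history) and online (latest) store."""

    def __init__(self):
        self.offline = []  # append-only history of (key, event_time, features)
        self.online = {}   # key -> (event_time, features), latest row only

    def _upsert_online(self, key, event_time, features):
        # Keep a row only if it is at least as recent as the stored one.
        current = self.online.get(key)
        if current is None or event_time >= current[0]:
            self.online[key] = (event_time, features)

    def ingest_batch(self, rows):
        """Batch path: write every row to the offline and online stores."""
        for key, event_time, features in rows:
            self.offline.append((key, event_time, features))
            self._upsert_online(key, event_time, features)

    def ingest_stream(self, key, event_time, features):
        """Streaming path: update the online store only."""
        self._upsert_online(key, event_time, features)


store = ToyFeatureStore()
store.ingest_batch([("u1", 1, {"clicks": 3}), ("u1", 2, {"clicks": 5})])
store.ingest_stream("u1", 3, {"clicks": 6})
```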

Storage and Feature Processing Infrastructure

Online storage: RonDB

Offline storage: HopsFS on AWS S3, on Azure Blob Storage, or on direct-attached storage

Feature Processing: Spark and Python

Feature Sharing and Discovery

Web UI

Searchable feature catalog with metadata

Feature discovery including feature values

Feature versioning and dependency management

Training Dataset Generation

Dataset generated from HopsFS using Python SDK

Time travel to a single point in time in the past

Row-level time travel (on the roadmap)
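
Time travel to a single point in time can be pictured as reading the whole feature table as it stood after the last commit at or before a given timestamp. The sketch below assumes a simple snapshot-per-commit model with hypothetical names. It is not how HopsFS or the Hopsworks SDK implements it, and it does not cover row-level time travel.

```python
from bisect import bisect_right


class VersionedFeatureGroup:
    """Toy table that stores a full snapshot after every commit."""

    def __init__(self):
        self._commit_times = []  # strictly increasing commit timestamps
        self._snapshots = []     # table state (key -> value) after each commit

    def commit(self, commit_time, upserts):
        if self._commit_times and commit_time <= self._commit_times[-1]:
            raise ValueError("commit times must be strictly increasing")
        state = dict(self._snapshots[-1]) if self._snapshots else {}
        state.update(upserts)
        self._commit_times.append(commit_time)
        self._snapshots.append(state)

    def as_of(self, timestamp):
        """Return the table as of the last commit at or before `timestamp`."""
        idx = bisect_right(self._commit_times, timestamp)
        return dict(self._snapshots[idx - 1]) if idx else {}


fg = VersionedFeatureGroup()
fg.commit(10, {"u1": 1.0})
fg.commit(20, {"u1": 2.0, "u2": 5.0})
training_view = fg.as_of(15)  # state after the commit at time 10
```

A training dataset built from `as_of(15)` would see only the values committed up to that time, which keeps later updates from leaking into it.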

Online Serving

Python SDK for online data retrieval (or direct retrieval from RonDB)
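
Online retrieval typically means looking up the latest feature values for one entity key and returning them in the order the model expects. The sketch below shows that lookup against a plain dictionary standing in for the online store. The function name and data layout are assumptions for illustration, not the Hopsworks SDK or the RonDB interface.

```python
def get_feature_vector(online_store, feature_order, entity_key):
    """Assemble an ordered feature vector for one entity.

    online_store:  {group_name: {entity_key: {feature_name: value}}}
    feature_order: list of (group_name, feature_name) in model input order
    """
    vector = []
    for group, feature in feature_order:
        row = online_store[group].get(entity_key)
        if row is None or feature not in row:
            raise KeyError(f"{group}.{feature} missing for key {entity_key!r}")
        vector.append(row[feature])
    return vector


# Hypothetical online data for two feature groups.
online_store = {
    "user_stats": {"u1": {"clicks_7d": 12, "purchases_7d": 1}},
    "user_profile": {"u1": {"age": 34}},
}
feature_order = [("user_profile", "age"), ("user_stats", "clicks_7d")]
vector = get_feature_vector(online_store, feature_order, "u1")
```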

Monitoring and Alerting

Data quality monitoring
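
Data quality monitoring usually comes down to checking ingested feature values against expectations such as allowed ranges and null rates. The sketch below shows one such check. The function, thresholds, and report fields are hypothetical and do not reflect the rules Hopsworks ships with.

```python
def validate_feature(values, min_value, max_value, max_null_fraction):
    """Check one feature column against a range and a null-rate limit."""
    nulls = sum(v is None for v in values)
    present = [v for v in values if v is not None]
    out_of_range = [v for v in present if not (min_value <= v <= max_value)]
    null_fraction = nulls / len(values) if values else 0.0
    return {
        "null_fraction": null_fraction,
        "out_of_range": out_of_range,
        "passed": null_fraction <= max_null_fraction and not out_of_range,
    }


# One null out of four values is tolerated here, but 50 breaks the range rule.
report = validate_feature([1, 2, None, 50], min_value=0, max_value=10,
                          max_null_fraction=0.5)
```

A failed report like this one could be used to raise an alert or block the ingestion job before the values reach the store.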

Security and Data Governance

Data remains in end-user's cloud account

ACL and RBAC

SSO

Data encryption at rest and in flight

Integrations

Batch data: Any data source that can be read by Python or Spark

Streaming data: Any Spark Streaming data source