Feature stores have become a critical component of the modern machine learning (ML) stack. They automate and centrally manage the data processes that power operational ML models in production, and they allow data practitioners to build and deploy features quickly and reliably. Read more about what a feature store is and check out the additional resources below.
The MLOps Community has worked with vendors and community members to profile the major solutions available in the market today, based on our feature store evaluation framework.
History | Stand-alone vs. Platform | Delivery Model | Clouds Supported | Service Level Guarantees | Support
--- | --- | --- | --- | --- | ---
Co-created by GO-JEK and Google Cloud, now governed by the Linux Foundation with Tecton as main contributor | Stand-alone feature store, integrates with 3rd party MLOps platforms | Open source | AWS, GCP, Azure, On-Prem | None | N/A (open source only)
Founded by the creators of Uber's Michelangelo platform | Stand-alone feature store, integrates with 3rd party MLOps platforms | Fully-managed cloud service | AWS (now), GCP and Azure (roadmap) | Uptime, Serving latencies | 24 x 7 support & response time guarantees
Developed internally by AWS | Part of the Amazon SageMaker platform | Fully-managed cloud service | AWS | None | 24 x 7 support & response time guarantees
Created by Databricks | Part of a broader MLOps platform | Fully-managed cloud service | AWS, GCP, Azure | Uptime | 24 x 7 support & response time guarantees
First developed at KTH University, now managed by startup Logical Clocks | Part of the Hopsworks MLOps platform | Open source, self-managed commercial, and fully-managed cloud service | AWS and Azure (managed service), GCP and on-prem (self-managed) | Uptime, Serving latencies | 24 x 7 support & response time guarantees
The Qwak feature store was designed & built following the founder's experiences while leading the ML & data groups at AWS, Wix, and Payoneer. | Part of a broader platform | Fully-managed cloud service | AWS | Uptime, Serving latencies | Yes
Originally created by Venkata Pingali and Indrayudh Ghoshal (founders) | Stand-alone feature store | Self-managed commercial or fully-managed cloud service | AWS, GCP, On-Prem | Uptime | 24 x 7 support & response time guarantees
Founded by Patrick Dougherty and Jared Parker | Stand-alone feature store | Open source software, fully-managed cloud service | AWS, GCP, Azure, On-Prem | Uptime | 24 x 7 support & response time guarantees
Feature store created by Iguazio. Includes open source components created and maintained by Iguazio | Part of the Iguazio Data Science Platform | Open source components, self-managed commercial, and fully-managed cloud service | AWS, GCP, Azure, On-Prem | Uptime | 24 x 7 support & response time guarantees
Vendor | Demo link | History | Stand-alone vs. Platform | Delivery Model | Clouds Supported | Pricing Model | Service Level Guarantees | Support | Feature Definitions | Automated Transforms | Feature Ingestion | Storage and Feature Processing Infrastructure | Feature Sharing and Discovery | Training Dataset Generation | Online Serving | Monitoring and Alerting | Security and Data Governance | Integrations
Are you looking to add a feature store to your ML stack? MLOps Community, with the help of feature store vendors, has created an evaluation framework to help you choose the right product for your needs.
Criteria 1
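First, you need to assess whether the product’s commercial characteristics meet your needs. We recommend evaluating the following commercial criteria: history, stand-alone vs. platform, delivery model, clouds supported, pricing model, service level guarantees, and support.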
Criteria 2
You will want to make sure that the feature store fulfills all the capabilities you need across the operational data workflow. We’ve broken down the capabilities as follows:
Feature Definitions:
Does the feature store provide a framework for creating feature definitions (including the transformation logic and materialization), and can data scientists collaborate on the definitions?
Automated Transforms:
Does the feature store automatically execute the pipelines required to process the feature values, including historical backfill and fresh feature values? Do the transformations support batch, streaming and real-time data sources?
Feature Ingestion:
How are features ingested into the online and offline store?
Storage and Feature Processing Infrastructure:
What infrastructure does the feature store use to store and process feature values?
Feature Sharing & Discovery:
Is there an easy way to manage, share and discover features across the organization?
Online Serving:
How are features served online at inference time?
Training Datasets:
How do data scientists generate point-in-time accurate training datasets from the offline store?
Monitoring and Alerting:
What monitoring and alerting capabilities does the feature store provide?
Security and Data Governance:
What measures are in place to protect data and control access?
Integrations:
Which 3rd party data and ML tools does the feature store integrate with?
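The training dataset criterion above depends on point-in-time accuracy: each training row should only see feature values that were already known at the time of the labeled event. The following is a minimal sketch of that idea in plain Python, not any vendor's API. The tuple layouts, the function name `point_in_time_join`, and the sample data are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def point_in_time_join(labels, feature_rows):
    """Attach to each label the latest feature value known at or before its event time.

    labels:       iterable of (entity_id, event_time, label)      -- hypothetical layout
    feature_rows: iterable of (entity_id, timestamp, value)       -- hypothetical layout
    """
    # Group feature rows by entity and sort each group by timestamp.
    by_entity = defaultdict(list)
    for entity_id, ts, value in feature_rows:
        by_entity[entity_id].append((ts, value))
    for rows in by_entity.values():
        rows.sort()

    training_rows = []
    for entity_id, event_time, label in labels:
        rows = by_entity.get(entity_id, [])
        timestamps = [ts for ts, _ in rows]
        # Number of feature rows with timestamp <= event_time.
        idx = bisect_right(timestamps, event_time)
        # No feature value existed yet: use None rather than leaking a future value.
        value = rows[idx - 1][1] if idx > 0 else None
        training_rows.append((entity_id, event_time, value, label))
    return training_rows


# Hypothetical offline feature history and labeled events.
features = [("u1", 10, 0.2), ("u1", 20, 0.5), ("u2", 15, 0.9)]
labels = [("u1", 12, 1), ("u1", 25, 0), ("u2", 5, 1)]
training = point_in_time_join(labels, features)
```

In this sketch the event for `u1` at time 12 picks up the value written at time 10, not the later one from time 20, and the event for `u2` at time 5 gets no value at all because its only feature row was written afterwards. A naive join on entity ID alone would leak those later values into training.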
A feature store is a tool (or set of tools) that handles the movement of data needed for machine learning. Most of the time, feature stores help get your feature data into the online storage layer needed for real-time serving.
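As a rough illustration of moving feature data into an online storage layer, the sketch below copies the most recent value per entity from a batch of offline rows into a key-value mapping and then reads it back at serving time. It is a simplification under stated assumptions: a Python dict stands in for a low-latency key-value store, and the names `materialize_latest` and `get_online_features` are hypothetical, not part of any product listed here.

```python
def materialize_latest(feature_rows):
    """Build an online view mapping entity_id to its most recent feature value.

    feature_rows: iterable of (entity_id, timestamp, value)  -- hypothetical layout
    """
    online_store = {}  # stand-in for a low-latency key-value store
    latest_ts = {}     # newest timestamp seen so far per entity
    for entity_id, ts, value in feature_rows:
        if entity_id not in latest_ts or ts >= latest_ts[entity_id]:
            latest_ts[entity_id] = ts
            online_store[entity_id] = value
    return online_store


def get_online_features(online_store, entity_id, default=None):
    """Look up the served value for one entity at inference time."""
    return online_store.get(entity_id, default)


# Hypothetical offline rows, deliberately not sorted by time.
offline_rows = [("u1", 10, 0.2), ("u2", 15, 0.9), ("u1", 20, 0.5)]
online = materialize_latest(offline_rows)
served = get_online_features(online, "u1")
```

The online view keeps only the latest value per entity, which is what keeps serving lookups cheap, while the full history stays in the offline store for training.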
Feature stores are also widely associated with feature registries — tools that enable developers to share features and collaborate on the critical data assets that make your machine learning models great!
There are a lot of unique ways that features for machine learning are built and consumed that require specialized tooling above-and-beyond what a data warehouse provides. Some examples:
There are typically two things that teams use to decide they need a feature store:
Feature stores can be a benefit to data scientists, data engineers, and ML engineers.
Some of the most common ML use cases that really rely on feature stores are:
Building a feature store is a complex engineering effort. Check out some open-source offerings, see what you can adopt from those technologies, and then find out what additional requirements your use cases will have. For most companies, getting out a reasonable MVP takes a full engineering team at least a full year of effort.
Feature stores tend to excel when dealing with structured data. They are most commonly seen in use cases with low-latency requirements, such as recommender systems, fraud detection, and loan scoring.
You’ll want to use a feature store if: