Feast
Important Capabilities
| Capability | Status | Notes | 
|---|---|---|
| Descriptions | ✅ | Enabled by default | 
| Detect Deleted Entities | ✅ | Optionally enabled via stateful_ingestion.remove_stale_metadata | 
| Schema Metadata | ✅ | Enabled by default | 
| Table-Level Lineage | ✅ | Enabled by default | 
This plugin extracts:
- Entities as 
MLPrimaryKey - Fields as 
MLFeature - Feature views and on-demand feature views as 
MLFeatureTable - Batch and stream source details as 
Dataset - Column types associated with each entity and feature
 
CLI based Ingestion
Starter Recipe
Check out the following recipe to get started with ingestion! See below for full configuration options.
For general pointers on writing and running a recipe, see our main recipe guide.
source:
  type: feast
  config:
    # Coordinates
    path: "/path/to/repository/"
    # Options
    environment: "PROD"
sink:
  # sink configs
Config Details
- Options
 - Schema
 
Note that a . is used to denote nested fields in the YAML recipe.
| Field | Description | 
|---|---|
path ✅  string  | Path to Feast repository | 
enable_owner_extraction  boolean  | If this is disabled, then we NEVER try to map owners. If this is enabled, then owner_mappings is REQUIRED to extract ownership.  Default: False  | 
enable_tag_extraction  boolean  | If this is disabled, then we NEVER try to extract tags.  Default: False  | 
environment  string  | Environment to use when constructing URNs  Default: PROD  | 
fs_yaml_file  string  | Path to the feature_store.yaml file used to configure the feature store | 
owner_mappings  array  | Mapping of owner names to owner types | 
owner_mappings.map  map(str,string)  | |
stateful_ingestion  StatefulIngestionConfig  | Stateful Ingestion Config | 
stateful_ingestion.enabled  boolean  | Whether or not to enable stateful ingest. Default: True if a pipeline_name is set and either a datahub-rest sink or datahub_api is specified, otherwise False Default: False  | 
The JSONSchema for this configuration is inlined below.
{
  "title": "FeastRepositorySourceConfig",
  "description": "Base configuration class for stateful ingestion for source configs to inherit from.",
  "type": "object",
  "properties": {
    "stateful_ingestion": {
      "title": "Stateful Ingestion",
      "description": "Stateful Ingestion Config",
      "allOf": [
        {
          "$ref": "#/definitions/StatefulIngestionConfig"
        }
      ]
    },
    "path": {
      "title": "Path",
      "description": "Path to Feast repository",
      "type": "string"
    },
    "fs_yaml_file": {
      "title": "Fs Yaml File",
      "description": "Path to the `feature_store.yaml` file used to configure the feature store",
      "type": "string"
    },
    "environment": {
      "title": "Environment",
      "description": "Environment to use when constructing URNs",
      "default": "PROD",
      "type": "string"
    },
    "owner_mappings": {
      "title": "Owner Mappings",
      "description": "Mapping of owner names to owner types",
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": {
          "type": "string"
        }
      }
    },
    "enable_owner_extraction": {
      "title": "Enable Owner Extraction",
      "description": "If this is disabled, then we NEVER try to map owners. If this is enabled, then owner_mappings is REQUIRED to extract ownership.",
      "default": false,
      "type": "boolean"
    },
    "enable_tag_extraction": {
      "title": "Enable Tag Extraction",
      "description": "If this is disabled, then we NEVER try to extract tags.",
      "default": false,
      "type": "boolean"
    }
  },
  "required": [
    "path"
  ],
  "definitions": {
    "DynamicTypedStateProviderConfig": {
      "title": "DynamicTypedStateProviderConfig",
      "type": "object",
      "properties": {
        "type": {
          "title": "Type",
          "description": "The type of the state provider to use. For DataHub use `datahub`",
          "type": "string"
        },
        "config": {
          "title": "Config",
          "description": "The configuration required for initializing the state provider. Default: The datahub_api config if set at pipeline level. Otherwise, the default DatahubClientConfig. See the defaults (https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/graph/client.py#L19).",
          "default": {},
          "type": "object"
        }
      },
      "required": [
        "type"
      ],
      "additionalProperties": false
    },
    "StatefulIngestionConfig": {
      "title": "StatefulIngestionConfig",
      "description": "Basic Stateful Ingestion Specific Configuration for any source.",
      "type": "object",
      "properties": {
        "enabled": {
          "title": "Enabled",
          "description": "Whether or not to enable stateful ingest. Default: True if a pipeline_name is set and either a datahub-rest sink or `datahub_api` is specified, otherwise False",
          "default": false,
          "type": "boolean"
        }
      },
      "additionalProperties": false
    }
  }
}
Code Coordinates
- Class Name: 
datahub.ingestion.source.feast.FeastRepositorySource - Browse on GitHub
 
Questions
If you've got any questions on configuring ingestion for Feast, feel free to ping us on our Slack.
Is this page helpful?
