Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Neuml Txtai Hub Cloud

From Leeroopedia


Knowledge Sources
Domains Cloud Storage, Model Hub, Hugging Face
Last Updated 2026-02-10 01:00 GMT

Overview

Concrete tool for syncing embeddings indexes to and from Hugging Face Hub provided by txtai.

Description

The HuggingFaceHub class is a Cloud provider implementation that enables saving and loading txtai embeddings indexes from Hugging Face Hub repositories. It supports both full repository snapshots and individual archive file downloads. On save, it creates a repository (private by default), updates .gitattributes to enable Git LFS tracking for large embeddings and document files, and uploads all index files. On load, it downloads either a single archive file or the entire repository snapshot to a local cache directory.

Usage

Use HuggingFaceHub when you want to persist and share txtai embeddings indexes via Hugging Face Hub. This is configured in the txtai cloud configuration by specifying the provider as "huggingface-hub" with a container (repo_id), optional revision (branch/tag), cache directory, and authentication token. It enables collaborative and versioned storage of embeddings indexes.

Code Reference

Source Location

  • Repository: Neuml_Txtai
  • File: src/python/txtai/cloud/hub.py

Signature

class HuggingFaceHub(Cloud):
    def metadata(self, path=None)
    def load(self, path=None)
    def save(self, path)
    def lfstrack(self)

Import

from txtai.cloud.hub import HuggingFaceHub

I/O Contract

Inputs

Name Type Required Description
config dict Yes Cloud configuration dictionary with keys: "container" (repo_id), optional "revision", "cache", "token", "private"
path str No Local file path; if an archive file path, operations target that specific file; otherwise operations target the entire repository

Outputs

Name Type Description
metadata result object or None HF file metadata (for archive paths) or model_info (for repos); None if repository not found
load result str Local file system path to the downloaded file or repository snapshot

Usage Examples

from txtai.cloud.hub import HuggingFaceHub

# Configure Hugging Face Hub cloud provider
config = {
    "container": "username/my-embeddings-index",
    "revision": "main",
    "token": "hf_your_token_here",
    "private": True,
    "cache": "/tmp/hf_cache"
}

hub = HuggingFaceHub(config)

# Check if repository exists
if hub.exists():
    # Load entire index repository to local cache
    local_path = hub.load()

# Save local embeddings index to Hugging Face Hub
hub.save("/path/to/local/index")

# Load a specific archive file
archive_path = hub.load("/path/to/embeddings.tar.gz")

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment