README

HuggingFace AI Integration with Advanced Model Discovery and Vector Search for Drupal 11

Overview

AI Agent HuggingFace is a Drupal 11 module that integrates HuggingFace's catalogue of 200,000+ AI models with the Drupal AI module ecosystem. It adds model discovery through Solr dense vector search, Milvus vector database integration, and pipeline management, and it supports text generation, embeddings, classification, and named entity recognition.

What It Does

Core Function: Lets a Drupal site discover, deploy, and run HuggingFace models, with semantic model discovery over vector search and pipeline orchestration.

Integrations

Drupal Core and Contributed Modules

Drupal 11: Modern Drupal APIs
Drupal AI (^1.0): Core AI framework
AI Agents (^1.1): Agent orchestration layer
Search API (^1.0): Model discovery infrastructure
LLM Module (^0.1): LLM platform integration

HuggingFace Services

HuggingFace AI Provider: Official provider integration
Inference API: Serverless model execution
Model Hub: Access to 200,000+ models
Transformers Pipeline: Text processing pipelines
Datasets: Training and evaluation datasets

Vector Search & Storage

Solr Dense Vector (^1.0): Dense vector search in Solr
Milvus VDB (^1.1): Milvus vector database integration
AI Search (^1.1): Semantic search infrastructure
Search API Solr (^4.3): Apache Solr backend

LLM Platform Services

agent-protocol: MCP server integration
agent-brain: Vector memory and embeddings
agent-router: Multi-provider LLM routing

Features

1. Model Discovery

Browse HuggingFace Model Hub (200,000+ models)
Semantic search via dense vectors
Filter by task, language, license, size
Sort by downloads, likes, trending
Model card visualization
Performance benchmarks

2. Model Execution

Text generation (GPT-2, BLOOM, Llama)
Text embeddings (BERT, RoBERTa, Sentence-BERT)
Classification (sentiment, NER, zero-shot)
Question answering
Summarization
Translation

3. Vector Search

Dense vector embeddings for model descriptions
Semantic similarity search
Cross-lingual model discovery
Task-based clustering
Milvus vector storage

4. Pipeline Management

Pre-configured HuggingFace pipelines
Custom pipeline creation
Pipeline chaining
Batch processing
Streaming support

5. Model Caching

Local model caching
Version management
Automatic updates
Cache pruning

6. Performance Optimization

Model quantization (8-bit, 4-bit)
GPU acceleration (CUDA)
Batch inference
Model distillation

Installation

# Install via Composer
composer require drupal/ai_agent_huggingface

# Install dependencies
composer require drupal/ai_provider_huggingface:^1.0
composer require drupal/ai_search:^1.1
composer require drupal/search_api_solr_dense_vector:^1.0
composer require drupal/ai_vdb_provider_milvus:^1.1

# Enable module
drush en ai_agent_huggingface -y

# Run updates
drush updb -y

# Clear cache
drush cr

Configuration

HuggingFace API Setup

Navigate to: /admin/config/ai/providers/huggingface

Configure:

API Token: Your HuggingFace API token
- Get token: https://huggingface.co/settings/tokens
- Permissions: read is enough to run inference and to download public models, as well as private models your account can access; write is only needed to push to the Hub
Inference API: Enable serverless inference
- Free tier: 30,000 tokens/month
- Pro tier: Unlimited
Model Caching: Configure local model storage
- Path: /sites/default/files/huggingface
- Max size: 10GB (default)

Vector Search Setup

Navigate to: /admin/config/search/search-api

Create Milvus Server:

# Using Docker
docker run -d --name milvus \
  -p 19530:19530 \
  -p 9091:9091 \
  milvusdb/milvus:latest

Configure Search API Index:
- Index name: huggingface_models
- Server: Solr + Dense Vector
- Fields:
  - Model name (fulltext)
  - Description (fulltext)
  - Embedding vector (dense_vector, 384 dimensions)
  - Task (facet)
  - Language (facet)
  - Downloads (sortable)
Add Vector Field:
- Field type: search_api_solr_dense_vector
- Dimensions: 384 (the output size of sentence-transformers/all-MiniLM-L6-v2; this must match the embedding model in use)
- Distance metric: Cosine similarity
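
To make the distance metric above concrete, here is a minimal sketch of cosine similarity in plain PHP. The function name is hypothetical and is not part of the module; Solr and Milvus compute this internally, so the code only illustrates what the score means.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity between two equal-length numeric vectors.
 *
 * Illustrative helper only; not part of the module's API.
 */
function cosine_similarity(array $a, array $b): float
{
    // Re-index so both vectors are addressed by position.
    $a = array_values($a);
    $b = array_values($b);

    if (count($a) !== count($b) || count($a) === 0) {
        throw new InvalidArgumentException('Vectors must be non-empty and the same length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA == 0.0 || $normB == 0.0) {
        // Treat a zero vector as dissimilar to everything.
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Toy 3-component vectors; real embeddings here would have 384 components.
var_dump(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]));
var_dump(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]));
```

Vectors pointing the same way score close to 1 and orthogonal vectors score 0, which is why the index dimensions have to match the embedding model exactly.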

Usage

Discover Models

use Drupal\ai_agent_huggingface\HuggingFaceModelDiscovery;

$discovery = \Drupal::service('ai_agent_huggingface.model_discovery');

// Search by keyword
$models = $discovery->search('text generation', [
  'task' => 'text-generation',
  'language' => 'en',
  'min_downloads' => 1000,
]);

// Semantic search using embeddings
$similar_models = $discovery->semanticSearch('summarize long documents', [
  'top_k' => 10,
]);
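
For orientation, the keyword search above corresponds roughly to a model listing request against the public HuggingFace Hub API. The sketch below only builds such a URL; the helper name is hypothetical, and how the module's discovery service actually issues its requests is an assumption, not something this README states.

```php
<?php

declare(strict_types=1);

/**
 * Builds a HuggingFace Hub model-listing URL.
 *
 * Hypothetical helper for illustration; the module's discovery service
 * may construct its requests differently.
 */
function hub_models_url(string $search, string $task, int $limit = 10): string
{
    $query = http_build_query(
        [
            'search' => $search,
            'filter' => $task,      // Task tag, e.g. "text-generation".
            'sort' => 'downloads',
            'direction' => -1,      // Descending order.
            'limit' => $limit,
        ],
        '',
        '&'                         // Explicit separator, independent of php.ini.
    );

    return 'https://huggingface.co/api/models?' . $query;
}

echo hub_models_url('summarize', 'summarization', 5), "\n";
```

The URL can be passed to any HTTP client; public listings do not need a token.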

Execute Model

use Drupal\ai_agent_huggingface\HuggingFaceInference;

$inference = \Drupal::service('ai_agent_huggingface.inference');

// Text generation
$result = $inference->generate([
  'model' => 'gpt2',
  'inputs' => 'Once upon a time',
  'parameters' => [
    'max_length' => 100,
    'temperature' => 0.7,
  ],
]);

// Text embeddings
$embeddings = $inference->embed([
  'model' => 'sentence-transformers/all-MiniLM-L6-v2',
  'inputs' => 'This is a test sentence',
]);

// Classification
$classification = $inference->classify([
  'model' => 'distilbert-base-uncased-finetuned-sst-2-english',
  'inputs' => 'This movie was amazing!',
]);
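
As a rough picture of what a call like generate() has to send, the sketch below assembles a request in the shape of the classic serverless Inference API: a POST to a per-model URL with a bearer token and a JSON body holding inputs and parameters. The helper name is hypothetical, the host shown is the classic endpoint form (HuggingFace has revised its inference routing over time, so check the current documentation), and whether the module builds requests this way is an assumption.

```php
<?php

declare(strict_types=1);

/**
 * Describes an HTTP request for a serverless inference call.
 *
 * Hypothetical helper: it only assembles the request so that it can be
 * handed to any HTTP client. Nothing is sent here.
 */
function build_inference_request(string $model, string $inputs, array $parameters, string $token): array
{
    $payload = ['inputs' => $inputs];
    if ($parameters !== []) {
        $payload['parameters'] = $parameters;
    }

    return [
        'method' => 'POST',
        // Classic endpoint form; treat the host as an assumption.
        'url' => 'https://api-inference.huggingface.co/models/' . $model,
        'headers' => [
            'Authorization' => 'Bearer ' . $token,
            'Content-Type' => 'application/json',
        ],
        'body' => json_encode($payload, JSON_THROW_ON_ERROR),
    ];
}

$request = build_inference_request(
    'gpt2',
    'Once upon a time',
    ['max_length' => 100, 'temperature' => 0.7],
    getenv('HUGGINGFACE_API_KEY') ?: 'hf_placeholder' // Placeholder token.
);

echo $request['url'], "\n", $request['body'], "\n";
```

Keeping request assembly separate from sending makes the payload easy to inspect when a model rejects a parameter.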

Create Pipeline

use Drupal\ai_agent_huggingface\HuggingFacePipeline;

$pipeline = \Drupal::service('ai_agent_huggingface.pipeline');

// Summarization pipeline
$summarizer = $pipeline->create('summarization', [
  'model' => 'facebook/bart-large-cnn',
  'max_length' => 130,
  'min_length' => 30,
]);

$summary = $summarizer->run('Long article text...');

// Question answering pipeline
$qa = $pipeline->create('question-answering', [
  'model' => 'deepset/roberta-base-squad2',
]);

$answer = $qa->run([
  'question' => 'What is the capital of France?',
  'context' => 'Paris is the capital and largest city of France...',
]);
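
Pipeline chaining, listed under Features, amounts to feeding each step's output into the next step. The sketch below shows that idea with plain callables; chain_steps() and the two stand-in steps are hypothetical and do not reflect the module's actual chaining API, which this README does not show.

```php
<?php

declare(strict_types=1);

/**
 * Chains steps so that each one receives the previous step's output.
 *
 * @param callable[] $steps Steps applied in array order.
 */
function chain_steps(array $steps): callable
{
    return static function ($input) use ($steps) {
        return array_reduce(
            $steps,
            static fn ($carry, callable $step) => $step($carry),
            $input
        );
    };
}

// Stand-in steps; in practice each could wrap a pipeline's run() call.
$normalize = static fn (string $text): string => trim((string) preg_replace('/\s+/', ' ', $text));
$truncate = static fn (string $text): string => substr($text, 0, 20);

$process = chain_steps([$normalize, $truncate]);
echo $process("  Paris   is the capital\nof France.  "), "\n";
```

An empty step list returns the input unchanged, so a chain can be built up conditionally without special cases.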

API Endpoints

REST API

Endpoints:

GET /api/v1/huggingface/models - List models
GET /api/v1/huggingface/models/{id} - Model details
POST /api/v1/huggingface/inference - Execute model
POST /api/v1/huggingface/embeddings - Generate embeddings
GET /api/v1/huggingface/tasks - List supported tasks
POST /api/v1/huggingface/pipeline - Create pipeline

GraphQL (optional)

query SearchModels($query: String!, $task: String!) {
  huggingfaceModels(query: $query, task: $task) {
    id
    name
    description
    task
    downloads
    likes
  }
}

mutation GenerateText($model: String!, $inputs: String!) {
  huggingfaceGenerate(model: $model, inputs: $inputs) {
    text
    tokenCount
    latency
  }
}

Vector Search

Semantic Model Discovery

Use Milvus + Solr for semantic model search:

$search = \Drupal::service('ai_agent_huggingface.vector_search');

// Find models semantically similar to query
$results = $search->similarModels(
  'I need a model for sentiment analysis in French',
  [
    'top_k' => 10,
    'threshold' => 0.7, // Minimum similarity score
  ]
);

foreach ($results as $model) {
  echo "{$model->name} (score: {$model->similarity})\n";
}
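
The top_k and threshold options above can be read as a filter followed by a sort and a cut. The sketch below applies that reading to precomputed scores; rank_candidates() is a hypothetical name, the scores are made up, and the module may apply these options differently (for example inside the vector database query).

```php
<?php

declare(strict_types=1);

/**
 * Keeps candidates scoring at or above $threshold, best first, capped at $topK.
 *
 * @param array<string, float> $scores Model ID => similarity score.
 * @return array<string, float>
 */
function rank_candidates(array $scores, int $topK, float $threshold): array
{
    $kept = array_filter($scores, static fn (float $score): bool => $score >= $threshold);
    arsort($kept); // Highest score first; keys are preserved.

    return array_slice($kept, 0, $topK, true);
}

// Placeholder IDs and made-up scores, for illustration only.
$scores = [
    'model-a' => 0.91,
    'model-b' => 0.64,
    'model-c' => 0.78,
    'model-d' => 0.72,
];

print_r(rank_candidates($scores, 2, 0.7));
```

Filtering before cutting means a high threshold can return fewer than top_k results, which callers should be prepared for.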

Embedding Models

Supported embedding models:

sentence-transformers/all-MiniLM-L6-v2 (384 dim, fast)
sentence-transformers/all-mpnet-base-v2 (768 dim, high quality)
intfloat/e5-large-v2 (1024 dim, largest and slowest of the three)
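
The choice of embedding model also sets the storage footprint of the index. As a back-of-the-envelope sketch, assuming vectors are stored as 32-bit floats and ignoring index structures and metadata, the raw size is count × dimensions × 4 bytes. The function name below is hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Raw size in bytes of $count float32 vectors with $dimensions components.
 * Index structures and metadata add overhead on top of this figure.
 */
function raw_vector_bytes(int $count, int $dimensions): int
{
    return $count * $dimensions * 4; // 4 bytes per float32 component.
}

$models = 200000; // Catalogue size quoted in this README.
foreach ([384, 768, 1024] as $dimensions) {
    $mib = raw_vector_bytes($models, $dimensions) / (1024 * 1024);
    printf("%4d dims: %.0f MiB\n", $dimensions, $mib);
}
```

For 200,000 model descriptions this works out to roughly 293, 586, and 781 MiB of raw vector data for the three models listed above.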

Supported Tasks

Text Generation: GPT-2, BLOOM, Llama, Mistral
Fill-Mask: BERT, RoBERTa, DistilBERT
Question Answering: SQuAD-tuned models
Summarization: BART, T5, Pegasus
Translation: MarianMT, M2M100, NLLB
Sentiment Analysis: DistilBERT, RoBERTa
Named Entity Recognition: BERT-NER, spaCy
Zero-Shot Classification: BART-MNLI, DeBERTa
Text Embeddings: Sentence-BERT, E5, BGE

Performance

Model Discovery: <100ms with Milvus
Inference (Serverless): 200-500ms
Inference (Local): 50-200ms
Embeddings: <50ms per sentence
Batch Processing: 1000 items/minute

Caching

Enable model caching for faster inference:

$config = \Drupal::configFactory()->getEditable('ai_agent_huggingface.settings');
$config->set('cache_enabled', TRUE);
$config->set('cache_ttl', 3600); // 1 hour
$config->set('cache_max_size', 10737418240); // 10GB
$config->save();
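
To illustrate what cache_ttl governs, here is a minimal in-memory sketch of a result cache keyed by model and input. The class name, the key scheme, and the lazy expiry are all assumptions made for illustration; the module's real cache backend is not described in this README.

```php
<?php

declare(strict_types=1);

/**
 * Tiny in-memory cache keyed by model and input, with a time-to-live.
 * Illustrative only; not the module's cache backend.
 */
final class InferenceCacheSketch
{
    /** @var array<string, array{expires: int, value: mixed}> */
    private array $items = [];

    public function __construct(private int $ttl)
    {
    }

    /** Derives a fixed-length key from the model ID and the input text. */
    public static function key(string $model, string $inputs): string
    {
        return hash('sha256', $model . "\0" . $inputs);
    }

    public function set(string $key, mixed $value, int $now): void
    {
        $this->items[$key] = ['expires' => $now + $this->ttl, 'value' => $value];
    }

    public function get(string $key, int $now): mixed
    {
        $item = $this->items[$key] ?? null;
        if ($item === null || $item['expires'] <= $now) {
            unset($this->items[$key]); // Drop expired entries lazily.
            return null;
        }

        return $item['value'];
    }
}

$cache = new InferenceCacheSketch(3600); // Same TTL as the setting above.
$key = InferenceCacheSketch::key('gpt2', 'Once upon a time');
$cache->set($key, 'generated text', time());
var_dump($cache->get($key, time()));
```

Passing the current time in explicitly keeps the expiry logic easy to test. Note that caching only helps for repeated, deterministic requests; sampled text generation with a non-zero temperature would return the same cached text each time.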

Testing

# Run PHPUnit tests
vendor/bin/phpunit modules/custom/ai_agent_huggingface/tests

# Run PHPCS
vendor/bin/phpcs --standard=Drupal modules/custom/ai_agent_huggingface

# Run PHPStan
vendor/bin/phpstan analyse modules/custom/ai_agent_huggingface/src --level=5

# Integration tests (requires HuggingFace API key)
export HUGGINGFACE_API_KEY=your-key
vendor/bin/phpunit modules/custom/ai_agent_huggingface/tests/src/Functional

Permissions

access huggingface models - Browse model catalog
execute huggingface inference - Run model inference
manage huggingface pipelines - Create/delete pipelines
administer huggingface settings - Configure module

Troubleshooting

Model Not Found

# Clear cache
drush cr

# Reindex models
drush search-api:index huggingface_models

# Verify API token
drush config-get ai_agent_huggingface.settings api_token

Slow Inference

# Enable caching
drush config-set ai_agent_huggingface.settings cache_enabled TRUE

# Use smaller model
# e.g., distilgpt2 instead of gpt2

# Enable GPU acceleration (if available)
drush config-set ai_agent_huggingface.settings use_gpu TRUE

Vector Search Not Working

# Check Milvus health (served on the metrics port, 9091)
curl http://localhost:9091/healthz

# Rebuild index
drush search-api:clear huggingface_models
drush search-api:index huggingface_models

Architecture

┌──────────────────────────────────────────────────┐
│       AI Agent HuggingFace                       │
│  ┌────────────┬──────────┬──────────────────┐   │
│  │ Model      │Inference │  Vector Search   │   │
│  │ Discovery  │ Engine   │  (Milvus/Solr)   │   │
│  └──────┬─────┴────┬─────┴──────┬───────────┘   │
│         │          │            │                │
│  ┌──────▼──────────▼────────────▼───────────┐   │
│  │      HuggingFace Inference API            │   │
│  │  ┌──────────┬──────────┬──────────────┐  │   │
│  │  │ Text Gen │Embeddings│Classification│  │   │
│  │  └──────────┴──────────┴──────────────┘  │   │
│  └───────────────────┬───────────────────────┘   │
└────────────────────┼──────────────────────────────┘
                     │
    ┌────────────────▼──────────────────┐
    │      External Services            │
    │  ┌─────────────┬───────────────┐  │
    │  │ HuggingFace │  Milvus       │  │
    │  │ Hub         │  Vector DB    │  │
    │  └─────────────┴───────────────┘  │
    └───────────────────────────────────┘

Examples

Sentiment Analysis

$inference = \Drupal::service('ai_agent_huggingface.inference');

$sentiment = $inference->classify([
  'model' => 'distilbert-base-uncased-finetuned-sst-2-english',
  'inputs' => 'I love using HuggingFace with Drupal!',
]);

// Result: ['label' => 'POSITIVE', 'score' => 0.9998]

Text Summarization

$pipeline = \Drupal::service('ai_agent_huggingface.pipeline');

$summary = $pipeline->create('summarization')->run(
  'Very long article content here...',
  ['max_length' => 130, 'min_length' => 30]
);

Named Entity Recognition

$entities = $inference->tokenClassification([
  'model' => 'dslim/bert-base-NER',
  'inputs' => 'My name is John and I live in New York.',
]);

// Result: [
//   ['entity' => 'B-PER', 'word' => 'John', 'start' => 11, 'end' => 15],
//   ['entity' => 'B-LOC', 'word' => 'New', 'start' => 30, 'end' => 33],
//   ['entity' => 'I-LOC', 'word' => 'York', 'start' => 34, 'end' => 38],
// ]
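
The result above tags "New" and "York" separately (B-LOC, then I-LOC). A common follow-up is to merge such tokens into whole entities. The sketch below does this with the character offsets from the result; merge_entities() is a hypothetical helper, not part of the module, and it assumes tokens arrive in text order with B-/I- prefixes.

```php
<?php

declare(strict_types=1);

/**
 * Merges B-/I- tagged tokens into whole entities using character offsets.
 *
 * @param array<int, array{entity: string, word: string, start: int, end: int}> $tokens
 * @return array<int, array{type: string, text: string, start: int, end: int}>
 */
function merge_entities(string $text, array $tokens): array
{
    $spans = [];
    foreach ($tokens as $token) {
        // Split "B-LOC" into prefix "B" and type "LOC".
        [$prefix, $type] = array_pad(explode('-', $token['entity'], 2), 2, '');
        $last = count($spans) - 1;

        // An I- tag of the same type extends the most recent span.
        if ($prefix === 'I' && $last >= 0 && $spans[$last]['type'] === $type) {
            $spans[$last]['end'] = $token['end'];
            continue;
        }

        $spans[] = ['type' => $type, 'start' => $token['start'], 'end' => $token['end']];
    }

    // Slice the original text so inner whitespace is kept ("New York").
    return array_map(
        static fn (array $span): array => [
            'type' => $span['type'],
            'text' => mb_substr($text, $span['start'], $span['end'] - $span['start']),
            'start' => $span['start'],
            'end' => $span['end'],
        ],
        $spans
    );
}

$text = 'My name is John and I live in New York.';
$tokens = [
    ['entity' => 'B-PER', 'word' => 'John', 'start' => 11, 'end' => 15],
    ['entity' => 'B-LOC', 'word' => 'New', 'start' => 30, 'end' => 33],
    ['entity' => 'I-LOC', 'word' => 'York', 'start' => 34, 'end' => 38],
];

print_r(merge_entities($text, $tokens));
```

For the sample sentence this yields two entities: a PER span "John" (11 to 15) and a LOC span "New York" (30 to 38).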

Related Modules

ai: Core AI framework for Drupal
ai_agents: Agent orchestration
ai_provider_huggingface: HuggingFace provider
ai_search: Semantic search infrastructure
llm: LLM platform core

Documentation

HuggingFace Docs: https://huggingface.co/docs
Drupal AI Module: https://www.drupal.org/project/ai
GitLab Wiki: https://gitlab.bluefly.io/llm/all_drupal_custom/modules/ai_agent_huggingface/-/wikis

Support

Issues: https://www.drupal.org/project/issues/ai_agent_huggingface
GitLab: https://gitlab.bluefly.io/llm/all_drupal_custom/modules/ai_agent_huggingface

License

GPL-2.0-or-later - See LICENSE

Maintainers

Drupal AI Community
LLM Platform Team
