bluefly/ai_agent_huggingface

HuggingFace AI agent with advanced model discovery and pipeline management

Type: drupal-module

pkg:composer/bluefly/ai_agent_huggingface

v1.0.0 2025-08-13 21:10 UTC

README

HuggingFace AI Integration with Advanced Model Discovery and Vector Search for Drupal 11

Overview

AI Agent HuggingFace is a production-grade Drupal module that integrates HuggingFace's 200,000+ AI models with Drupal. It provides model discovery via Solr dense vector search, Milvus vector database integration, and pipeline management. It extends the Drupal AI module ecosystem with HuggingFace-specific capabilities: text generation, embeddings, classification, and named entity recognition.

What It Does

Core Function: Integrates HuggingFace models with Drupal so that a site can discover, deploy, and execute them, with semantic (vector) search over the model catalog and pipeline orchestration.

Integrations

Drupal Core Modules

  • Drupal 11: Modern Drupal APIs
  • Drupal AI (^1.0): Core AI framework
  • AI Agents (^1.1): Agent orchestration layer
  • Search API (^1.0): Model discovery infrastructure
  • LLM Module (^0.1): LLM platform integration

HuggingFace Services

  • HuggingFace AI Provider: Official provider integration
  • Inference API: Serverless model execution
  • Model Hub: Access to 200,000+ models
  • Transformers Pipeline: Text processing pipelines
  • Datasets: Training and evaluation datasets

Vector Search & Storage

  • Solr Dense Vector (^1.0): Dense vector search in Solr
  • Milvus VDB (^1.1): Milvus vector database integration
  • AI Search (^1.1): Semantic search infrastructure
  • Search API Solr (^4.3): Apache Solr backend

LLM Platform Services

  • agent-protocol: MCP server integration
  • agent-brain: Vector memory and embeddings
  • agent-router: Multi-provider LLM routing

Features

1. Model Discovery

  • Browse HuggingFace Model Hub (200,000+ models)
  • Semantic search via dense vectors
  • Filter by task, language, license, size
  • Sort by downloads, likes, trending
  • Model card visualization
  • Performance benchmarks

2. Model Execution

  • Text generation (GPT-2, BLOOM, Llama)
  • Text embeddings (BERT, RoBERTa, Sentence-BERT)
  • Classification (sentiment, NER, zero-shot)
  • Question answering
  • Summarization
  • Translation

3. Vector Search

  • Dense vector embeddings for model descriptions
  • Semantic similarity search
  • Cross-lingual model discovery
  • Task-based clustering
  • Milvus vector storage

4. Pipeline Management

  • Pre-configured HuggingFace pipelines
  • Custom pipeline creation
  • Pipeline chaining
  • Batch processing
  • Streaming support

5. Model Caching

  • Local model caching
  • Version management
  • Automatic updates
  • Cache pruning

6. Performance Optimization

  • Model quantization (8-bit, 4-bit)
  • GPU acceleration (CUDA)
  • Batch inference
  • Model distillation

Installation

# Install via Composer (package name as published on this page)
composer require bluefly/ai_agent_huggingface

# Install dependencies
composer require drupal/ai_provider_huggingface:^1.0
composer require drupal/ai_search:^1.1
composer require drupal/search_api_solr_dense_vector:^1.0
composer require drupal/ai_vdb_provider_milvus:^1.1

# Enable module
drush en ai_agent_huggingface -y

# Run updates
drush updb -y

# Clear cache
drush cr

Configuration

HuggingFace API Setup

Navigate to: /admin/config/ai/providers/huggingface

Configure:

  1. API Token: Your HuggingFace API token

  2. Inference API: Enable serverless inference

    • Free tier: 30,000 tokens/month
    • Pro tier: Unlimited
  3. Model Caching: Configure local model storage

    • Path: /sites/default/files/huggingface
    • Max size: 10GB (default)

Vector Search Setup

Navigate to: /admin/config/search/search-api

  1. Create Milvus Server:

    # Using Docker
    docker run -d --name milvus \
      -p 19530:19530 \
      -p 9091:9091 \
      milvusdb/milvus:latest
    
  2. Configure Search API Index:

    • Index name: huggingface_models
    • Server: Solr + Dense Vector
    • Fields:
      • Model name (fulltext)
      • Description (fulltext)
      • Embedding vector (dense_vector, 384 dimensions)
      • Task (facet)
      • Language (facet)
      • Downloads (sortable)
  3. Add Vector Field:

    • Field type: search_api_solr_dense_vector
    • Dimensions: 384 (Sentence-BERT default)
    • Distance metric: Cosine similarity
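
The cosine similarity metric selected above can be illustrated in plain PHP. This is a minimal sketch for intuition only, not the code Solr or Milvus runs internally; the function name `cosine_similarity` is hypothetical and not part of this module.

```php
<?php

/**
 * Cosine similarity between two equal-length numeric vectors.
 *
 * Returns a value in [-1, 1]; 1 means the vectors point the same way.
 * Hypothetical helper for illustration, not part of the module.
 */
function cosine_similarity(array $a, array $b): float {
  $a = array_values($a);
  $b = array_values($b);
  $n = count($a);
  if ($n === 0 || $n !== count($b)) {
    throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
  }

  $dot = 0.0;
  $norm_a = 0.0;
  $norm_b = 0.0;
  for ($i = 0; $i < $n; $i++) {
    $dot += $a[$i] * $b[$i];
    $norm_a += $a[$i] * $a[$i];
    $norm_b += $b[$i] * $b[$i];
  }

  if ($norm_a === 0.0 || $norm_b === 0.0) {
    throw new InvalidArgumentException('A zero vector has no direction.');
  }

  return $dot / (sqrt($norm_a) * sqrt($norm_b));
}
```

With 384-dimensional embeddings the same loop simply runs over 384 components; both vectors must come from the same embedding model for the score to be meaningful.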

Usage

Discover Models

use Drupal\ai_agent_huggingface\HuggingFaceModelDiscovery;

/** @var HuggingFaceModelDiscovery $discovery */
$discovery = \Drupal::service('ai_agent_huggingface.model_discovery');

// Search by keyword
$models = $discovery->search('text generation', [
  'task' => 'text-generation',
  'language' => 'en',
  'min_downloads' => 1000,
]);

// Semantic search using embeddings
$similar_models = $discovery->semanticSearch('summarize long documents', [
  'top_k' => 10,
]);

Execute Model

use Drupal\ai_agent_huggingface\HuggingFaceInference;

/** @var HuggingFaceInference $inference */
$inference = \Drupal::service('ai_agent_huggingface.inference');

// Text generation
$result = $inference->generate([
  'model' => 'gpt2',
  'inputs' => 'Once upon a time',
  'parameters' => [
    'max_length' => 100,
    'temperature' => 0.7,
  ],
]);

// Text embeddings
$embeddings = $inference->embed([
  'model' => 'sentence-transformers/all-MiniLM-L6-v2',
  'inputs' => 'This is a test sentence',
]);

// Classification
$classification = $inference->classify([
  'model' => 'distilbert-base-uncased-finetuned-sst-2-english',
  'inputs' => 'This movie was amazing!',
]);

Create Pipeline

use Drupal\ai_agent_huggingface\HuggingFacePipeline;

/** @var HuggingFacePipeline $pipeline */
$pipeline = \Drupal::service('ai_agent_huggingface.pipeline');

// Summarization pipeline
$summarizer = $pipeline->create('summarization', [
  'model' => 'facebook/bart-large-cnn',
  'max_length' => 130,
  'min_length' => 30,
]);

$summary = $summarizer->run('Long article text...');

// Question answering pipeline
$qa = $pipeline->create('question-answering', [
  'model' => 'deepset/roberta-base-squad2',
]);

$answer = $qa->run([
  'question' => 'What is the capital of France?',
  'context' => 'Paris is the capital and largest city of France...',
]);

API Endpoints

REST API

Endpoints:

  • GET /api/v1/huggingface/models - List models
  • GET /api/v1/huggingface/models/{id} - Model details
  • POST /api/v1/huggingface/inference - Execute model
  • POST /api/v1/huggingface/embeddings - Generate embeddings
  • GET /api/v1/huggingface/tasks - List supported tasks
  • POST /api/v1/huggingface/pipeline - Create pipeline
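
As a rough sketch of calling the inference endpoint from PHP, the helper below builds a JSON POST request. The request body shape is an assumption: it mirrors the array passed to `generate()` in the Usage section, and the README does not document the REST payload or its authentication. The function name `build_inference_request` is hypothetical.

```php
<?php

/**
 * Builds the URL and stream-context options for an inference request.
 *
 * Assumption: the endpoint accepts a JSON body with "model", "inputs" and
 * optional "parameters" keys. Verify against the module before relying on it.
 */
function build_inference_request(string $base_url, string $model, string $inputs, array $parameters = []): array {
  $body = ['model' => $model, 'inputs' => $inputs];
  if ($parameters !== []) {
    $body['parameters'] = $parameters;
  }

  return [
    'url' => rtrim($base_url, '/') . '/api/v1/huggingface/inference',
    'context' => [
      'http' => [
        'method' => 'POST',
        'header' => "Content-Type: application/json\r\nAccept: application/json\r\n",
        'content' => json_encode($body, JSON_THROW_ON_ERROR),
        'timeout' => 30,
      ],
    ],
  ];
}

// Usage (sends the request; needs a reachable site and suitable credentials):
// $request = build_inference_request('https://example.com', 'gpt2', 'Once upon a time');
// $response = file_get_contents($request['url'], FALSE, stream_context_create($request['context']));
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.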

GraphQL (optional)

query SearchModels($query: String!, $task: String!) {
  huggingfaceModels(query: $query, task: $task) {
    id
    name
    description
    task
    downloads
    likes
  }
}

mutation GenerateText($model: String!, $inputs: String!) {
  huggingfaceGenerate(model: $model, inputs: $inputs) {
    text
    tokenCount
    latency
  }
}

Vector Search

Semantic Model Discovery

Use Milvus + Solr for semantic model search:

$search = \Drupal::service('ai_agent_huggingface.vector_search');

// Find models semantically similar to query
$results = $search->similarModels(
  'I need a model for sentiment analysis in French',
  [
    'top_k' => 10,
    'threshold' => 0.7, // Minimum similarity score
  ]
);

foreach ($results as $model) {
  echo "{$model->name} (score: {$model->similarity})\n";
}
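
To make the `top_k` and `threshold` options concrete, here is a small sketch of how such filtering is commonly done: drop results below the minimum score, sort by descending similarity, and keep the first `top_k`. Whether the module applies them in exactly this order is an assumption. The function name and the array-based result shape are hypothetical (the service above returns objects).

```php
<?php

/**
 * Keeps the $top_k highest-scoring rows whose similarity meets $threshold.
 *
 * Each row is assumed to be an array with a numeric 'similarity' key.
 */
function filter_by_similarity(array $scored, float $threshold, int $top_k): array {
  // Discard rows below the minimum similarity score.
  $kept = array_filter(
    $scored,
    fn(array $row): bool => $row['similarity'] >= $threshold
  );

  // Highest similarity first; usort() also reindexes the array.
  usort($kept, fn(array $x, array $y): int => $y['similarity'] <=> $x['similarity']);

  return array_slice($kept, 0, $top_k);
}

// Hypothetical scores, for illustration only.
$scored = [
  ['name' => 'model-a', 'similarity' => 0.65],
  ['name' => 'model-b', 'similarity' => 0.91],
  ['name' => 'model-c', 'similarity' => 0.72],
];
$top = filter_by_similarity($scored, 0.7, 10);
```

A higher threshold trades recall for precision: fewer models are returned, but each is closer to the query.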

Embedding Models

Supported embedding models:

  • sentence-transformers/all-MiniLM-L6-v2 (384 dim, fast)
  • sentence-transformers/all-mpnet-base-v2 (768 dim, high quality)
  • intfloat/e5-large-v2 (1024 dim, state-of-the-art)
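
The dimension of the Solr dense vector field has to match the output size of the chosen embedding model. A minimal sketch of a guard that checks this before indexing, using the dimensions listed above; the constant and function names are hypothetical and not part of the module.

```php
<?php

// Output dimensions of the embedding models listed above.
const EMBEDDING_DIMENSIONS = [
  'sentence-transformers/all-MiniLM-L6-v2' => 384,
  'sentence-transformers/all-mpnet-base-v2' => 768,
  'intfloat/e5-large-v2' => 1024,
];

/**
 * Returns the expected vector length for a model, or NULL if unknown.
 */
function expected_dimension(string $model): ?int {
  return EMBEDDING_DIMENSIONS[$model] ?? NULL;
}

/**
 * Checks that a vector has the length the given model should produce.
 */
function vector_matches_model(string $model, array $vector): bool {
  $expected = expected_dimension($model);
  if ($expected === NULL) {
    throw new InvalidArgumentException("Unknown embedding model: $model");
  }
  return count($vector) === $expected;
}
```

Switching from a 384-dimension model to a 768- or 1024-dimension one therefore means changing the field definition and reindexing, since stored vectors of different lengths cannot be compared.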

Supported Tasks

  • Text Generation: GPT-2, BLOOM, Llama, Mistral
  • Fill-Mask: BERT, RoBERTa, DistilBERT
  • Question Answering: SQuAD-tuned models
  • Summarization: BART, T5, Pegasus
  • Translation: MarianMT, M2M100, NLLB
  • Sentiment Analysis: DistilBERT, RoBERTa
  • Named Entity Recognition: BERT-NER, spaCy
  • Zero-Shot Classification: BART-MNLI, DeBERTa
  • Text Embeddings: Sentence-BERT, E5, BGE

Performance

  • Model Discovery: <100ms with Milvus
  • Inference (Serverless): 200-500ms
  • Inference (Local): 50-200ms
  • Embeddings: <50ms per sentence
  • Batch Processing: 1000 items/minute

Caching

Enable model caching for faster inference:

$config = \Drupal::configFactory()->getEditable('ai_agent_huggingface.settings');
$config->set('cache_enabled', TRUE);
$config->set('cache_ttl', 3600); // 1 hour
$config->set('cache_max_size', 10737418240); // 10GB
$config->save();
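
The `cache_max_size` value above is 10 GiB expressed in bytes (10 × 1024³). A small helper makes that arithmetic explicit when choosing a different limit; the function name is hypothetical, and it assumes a 64-bit PHP build so the result stays an integer.

```php
<?php

/**
 * Converts a size in GiB to bytes (1 GiB = 1024^3 bytes).
 */
function gib_to_bytes(int $gib): int {
  return $gib * 1024 ** 3;
}

// 10 GiB, the default used in the configuration above.
$default_max = gib_to_bytes(10);
```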

Testing

# Run PHPUnit tests
vendor/bin/phpunit modules/custom/ai_agent_huggingface/tests

# Run PHPCS
vendor/bin/phpcs --standard=Drupal modules/custom/ai_agent_huggingface

# Run PHPStan
vendor/bin/phpstan analyse modules/custom/ai_agent_huggingface/src --level=5

# Integration tests (requires HuggingFace API key)
export HUGGINGFACE_API_KEY=your-key
vendor/bin/phpunit modules/custom/ai_agent_huggingface/tests/src/Functional

Permissions

  • access huggingface models - Browse model catalog
  • execute huggingface inference - Run model inference
  • manage huggingface pipelines - Create/delete pipelines
  • administer huggingface settings - Configure module

Troubleshooting

Model Not Found

# Clear cache
drush cr

# Reindex models
drush search-api:index huggingface_models

# Verify API token
drush config-get ai_agent_huggingface.settings api_token

Slow Inference

# Enable caching
drush config-set ai_agent_huggingface.settings cache_enabled TRUE

# Use smaller model
# e.g., distilgpt2 instead of gpt2

# Enable GPU acceleration (if available)
drush config-set ai_agent_huggingface.settings use_gpu TRUE

Vector Search Not Working

# Check Milvus health (served on the metrics port, 9091, not the 19530 gRPC port)
curl http://localhost:9091/healthz

# Rebuild index
drush search-api:clear huggingface_models
drush search-api:index huggingface_models

Architecture

┌──────────────────────────────────────────────────┐
│       AI Agent HuggingFace                       │
│  ┌────────────┬──────────┬──────────────────┐   │
│  │ Model      │Inference │  Vector Search   │   │
│  │ Discovery  │ Engine   │  (Milvus/Solr)   │   │
│  └──────┬─────┴────┬─────┴──────┬───────────┘   │
│         │          │            │                │
│  ┌──────▼──────────▼────────────▼───────────┐   │
│  │      HuggingFace Inference API            │   │
│  │  ┌──────────┬──────────┬──────────────┐  │   │
│  │  │ Text Gen │Embeddings│Classification│  │   │
│  │  └──────────┴──────────┴──────────────┘  │   │
│  └───────────────────┬───────────────────────┘   │
└────────────────────┼──────────────────────────────┘
                     │
    ┌────────────────▼──────────────────┐
    │      External Services            │
    │  ┌───────────┬─────────────────┐  │
    │  │HuggingFace│  Milvus         │  │
    │  │Hub        │  Vector DB      │  │
    │  └───────────┴─────────────────┘  │
    └───────────────────────────────────┘

Examples

Sentiment Analysis

// $inference is the ai_agent_huggingface.inference service (see "Execute Model").
$sentiment = $inference->classify([
  'model' => 'distilbert-base-uncased-finetuned-sst-2-english',
  'inputs' => 'I love using HuggingFace with Drupal!',
]);

// Result: ['label' => 'POSITIVE', 'score' => 0.9998]

Text Summarization

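// $pipeline is the ai_agent_huggingface.pipeline service (see "Create Pipeline").
// Options are passed to create(), matching the usage shown in that section.
$summary = $pipeline->create('summarization', [
  'max_length' => 130,
  'min_length' => 30,
])->run('Very long article content here...');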

Named Entity Recognition

// $inference is the ai_agent_huggingface.inference service (see "Execute Model").
$entities = $inference->tokenClassification([
  'model' => 'dslim/bert-base-NER',
  'inputs' => 'My name is John and I live in New York.',
]);

// Result: [
//   ['entity' => 'B-PER', 'word' => 'John', 'start' => 11, 'end' => 15],
//   ['entity' => 'B-LOC', 'word' => 'New', 'start' => 30, 'end' => 33],
//   ['entity' => 'I-LOC', 'word' => 'York', 'start' => 34, 'end' => 38],
// ]
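
The result above uses BIO tagging: `B-` marks the first token of an entity and `I-` a continuation, so "New" and "York" together form one location. A minimal sketch of merging such tokens into whole entities, assuming the result shape shown above; the function name `merge_bio_entities` is hypothetical and not part of the module.

```php
<?php

/**
 * Merges BIO-tagged tokens into whole entities.
 *
 * Offsets are treated as byte offsets, which is only safe for ASCII text;
 * use mb_substr() for multibyte input.
 */
function merge_bio_entities(array $tokens, string $text): array {
  $entities = [];
  $current = NULL;

  foreach ($tokens as $token) {
    $parts = explode('-', $token['entity'], 2);
    $prefix = $parts[0];
    $type = $parts[1] ?? '';

    // An I- tag of the same type extends the entity currently being built.
    if ($prefix === 'I' && $current !== NULL && $current['type'] === $type) {
      $current['end'] = $token['end'];
      continue;
    }

    // Anything else closes the open entity, if there is one.
    if ($current !== NULL) {
      $entities[] = $current;
      $current = NULL;
    }

    // A B- tag (or a stray I- tag) starts a new entity; other tags are skipped.
    if ($prefix === 'B' || $prefix === 'I') {
      $current = ['type' => $type, 'start' => $token['start'], 'end' => $token['end']];
    }
  }

  if ($current !== NULL) {
    $entities[] = $current;
  }

  // Recover each entity's surface text from the original input.
  return array_map(
    fn(array $e): array => $e + ['text' => substr($text, $e['start'], $e['end'] - $e['start'])],
    $entities
  );
}
```

Applied to the three tokens in the example result, this yields two entities: a `PER` spanning offsets 11 to 15 ("John") and a `LOC` spanning offsets 30 to 38 ("New York").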

Related Modules

  • ai: Core AI framework for Drupal
  • ai_agents: Agent orchestration
  • ai_provider_huggingface: HuggingFace provider
  • ai_search: Semantic search infrastructure
  • llm: LLM platform core

License

GPL-2.0-or-later - See LICENSE

Maintainers

  • Drupal AI Community
  • LLM Platform Team