What is IDAH?
IDAH (Ingedata Annotation Hub) is an open-source platform for collaborative data annotation, designed to streamline the creation of high-quality training datasets for machine learning models.
Overview
IDAH manages annotation projects across data modalities such as images, video, audio, and text. It is built as a set of microservices behind a web frontend, which lets each part scale independently, and it is extended through a plugin system.
Explore Key Features to learn what IDAH can do, or check out Use Cases to see how IDAH can be applied to your projects.
Architecture
IDAH follows a microservices architecture, with each service responsible for a specific domain:
idah/
├─ app/                  # Microservices
│  ├─ frontend/          # Web UI (SvelteKit)
│  ├─ iam/               # Identity & Access Management
│  ├─ dataset/           # Dataset management service
│  ├─ media/             # Media processing service
│  ├─ sync/              # Data export & sync service
│  ├─ notification/      # Notification service
│  ├─ setting/           # Settings management
│  └─ audit/             # Audit logging service
├─ common/               # Shared code & utilities
├─ plugins/              # Production plugins
├─ plugins_dev/          # Development plugins & CLI
├─ dev/                  # Development configuration
└─ doc/                  # Documentation site
Microservices
Frontend
Web application built with SvelteKit. Provides the user interface for dataset management, annotation, and collaboration.
Technologies: SvelteKit, TypeScript, Tailwind CSS
IAM (Identity & Access Management)
Handles user authentication, authorization, and access control. Manages user accounts, roles, permissions, and session management.
Technologies: Ruby, Verse Framework, PostgreSQL, Redis
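To make the relationship between roles and permissions concrete, here is a minimal sketch in plain Ruby. It does not use IDAH's or Verse's actual API: the role names, the permission strings, and the `allowed?` helper are all hypothetical and exist only for illustration.

```ruby
require "set"

# Hypothetical role-to-permission table; IDAH's real roles may differ.
ROLE_PERMISSIONS = {
  "annotator" => Set["entry.read", "annotation.write"],
  "reviewer"  => Set["entry.read", "annotation.write", "annotation.review"],
  "admin"     => Set["entry.read", "annotation.write", "annotation.review", "dataset.manage"]
}.freeze

# Returns true when at least one of the user's roles grants the permission.
# Unknown roles grant nothing.
def allowed?(roles, permission)
  roles.any? { |role| ROLE_PERMISSIONS.fetch(role, Set.new).include?(permission) }
end
```

A call such as `allowed?(["annotator"], "dataset.manage")` would be rejected under this table, while the same check for a user who also holds `"admin"` would pass.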
Dataset
Core service for managing datasets, entries, annotations, and workflow states. Handles annotation storage, version control, and query operations.
Technologies: Ruby, Verse Framework, PostgreSQL, Redis
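The idea of workflow states can be sketched as a small state machine. The state names and allowed transitions below are assumptions chosen for illustration, not IDAH's actual workflow, and `Entry` here is a hypothetical stand-in rather than the service's real model.

```ruby
# Hypothetical workflow: state names and transitions are assumptions.
WORKFLOW_TRANSITIONS = {
  pending:     [:in_progress],
  in_progress: [:in_review],
  in_review:   [:in_progress, :accepted], # a reviewer may send work back
  accepted:    []
}.freeze

class Entry
  attr_reader :state, :history

  def initialize
    @state = :pending
    @history = [:pending] # every state the entry has passed through
  end

  # Moves to next_state, or raises if the transition is not allowed.
  def transition_to(next_state)
    unless WORKFLOW_TRANSITIONS.fetch(@state).include?(next_state)
      raise ArgumentError, "cannot go from #{@state} to #{next_state}"
    end
    @state = next_state
    @history << next_state
    self
  end
end
```

Keeping the transition table as data rather than as scattered conditionals makes it easy to see at a glance which moves are legal, and to change them without touching the entry logic.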
Media
Processes and stores media files (images, videos, audio). Handles thumbnails, format conversions, and media optimization through plugin-based processors.
Technologies: Ruby, Verse Framework, PostgreSQL, Redis
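One way to picture plugin-based processors is a registry that routes each file to a processor by MIME type. This is only a sketch under that assumption: `ProcessorRegistry`, its methods, and the task names are hypothetical and do not reflect IDAH's real plugin interface.

```ruby
# Hypothetical registry: maps a MIME type prefix to a processor.
class ProcessorRegistry
  def initialize
    @processors = {}
  end

  # A processor is any object responding to #call(file_name).
  def register(mime_prefix, processor)
    @processors[mime_prefix] = processor
  end

  # Picks the first registered processor whose prefix matches the MIME type.
  def process(file_name, mime_type)
    _, processor = @processors.find { |prefix, _| mime_type.start_with?(prefix) }
    raise ArgumentError, "no processor for #{mime_type}" unless processor
    processor.call(file_name)
  end
end

registry = ProcessorRegistry.new
# These lambdas only describe the work; a real processor would perform it.
registry.register("image/", ->(name) { { source: name, tasks: [:thumbnail] } })
registry.register("video/", ->(name) { { source: name, tasks: [:thumbnail, :transcode] } })
```

With this shape, supporting a new media type means registering one more processor; the routing code itself does not change.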
Sync
Handles data export and synchronization with external systems. Supports multiple export formats and custom exporters through plugins.
Technologies: Ruby, Verse Framework, PostgreSQL, Redis
Setting
Manages application settings, configuration, and preferences at system, organization, and user levels.
Technologies: Ruby, Verse Framework, PostgreSQL, Redis
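Settings defined at several levels need a rule for which level wins. The sketch below assumes the most specific level takes precedence (user, then organization, then system); that order, the `resolve_setting` helper, and the setting keys are assumptions for illustration rather than documented IDAH behavior.

```ruby
# Hypothetical resolution order: user overrides organization,
# which overrides system. Returns nil when no level defines the key.
def resolve_setting(key, system:, organization: {}, user: {})
  [user, organization, system].each do |level|
    # key? is used so that explicit false values are still honored.
    return level[key] if level.key?(key)
  end
  nil
end

system_settings = { "locale" => "en", "autosave_seconds" => 30 }
org_settings    = { "autosave_seconds" => 10 }
user_settings   = { "locale" => "fr" }

locale = resolve_setting("locale",
                         system: system_settings,
                         organization: org_settings,
                         user: user_settings)
```

Here `locale` comes from the user level, while a lookup for `"autosave_seconds"` would fall through to the organization value because the user has not set one.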
Audit
Tracks and logs user actions, system events, and data changes for auditing, compliance, and troubleshooting.
Technologies: Ruby, Verse Framework, PostgreSQL, Redis
Notification
Sends notifications and manages email communications for user alerts, task assignments, and system updates.
Technologies: Ruby, Verse Framework, PostgreSQL, Redis
Technology Stack
Frontend
- SvelteKit: Modern frontend framework with server-side rendering
- TypeScript: Type-safe JavaScript for robust development
- Tailwind CSS: Utility-first CSS framework for rapid UI development
Backend
- Ruby: Backend programming language
- Verse Framework: Lightweight micro-framework for Ruby microservices
- PostgreSQL: Relational database for persistent data
- Redis: In-memory store for caching and background jobs
Verse Framework Ecosystem
IDAH's backend services are built on Verse Framework, a modular Ruby micro-framework designed for building lightweight, high-performance microservices. Verse provides a suite of specialized gems that IDAH leverages:
- Core micro-framework foundation
- HTTP server with hooks and middleware
- Request validation and coercion
- JSON:API compliant response renderer
- PostgreSQL database integration
- Redis caching and pub/sub
- Cron jobs and scheduled tasks
- File upload and storage handling
- JSON-RPC for service communication
This modular approach allows each IDAH microservice to use only the Verse gems it needs, keeping services lightweight and focused on their specific responsibilities.
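To illustrate what service-to-service calls over JSON-RPC look like on the wire, the sketch below builds and answers a request using only Ruby's standard `json` library. It assumes JSON-RPC 2.0 and does not use Verse's RPC gem, whose interface may differ; the helper names and the `dataset.count_entries` method are hypothetical, and the transport between services is left out.

```ruby
require "json"

# Builds a JSON-RPC 2.0 request string.
def jsonrpc_request(method, params, id)
  JSON.generate({ "jsonrpc" => "2.0", "method" => method, "params" => params, "id" => id })
end

# Dispatches a raw request against a table of handlers
# and returns the response as a JSON string.
def jsonrpc_handle(raw, handlers)
  request = JSON.parse(raw)
  handler = handlers[request["method"]]
  response =
    if handler
      { "jsonrpc" => "2.0", "result" => handler.call(request["params"]), "id" => request["id"] }
    else
      # -32601 is the "Method not found" code defined by the JSON-RPC 2.0 spec.
      { "jsonrpc" => "2.0",
        "error" => { "code" => -32601, "message" => "Method not found" },
        "id" => request["id"] }
    end
  JSON.generate(response)
end

# Hypothetical handler table for a dataset-like service.
handlers = { "dataset.count_entries" => ->(params) { params["entry_ids"].length } }

raw_request  = jsonrpc_request("dataset.count_entries", { "entry_ids" => [1, 2, 3] }, 1)
raw_response = jsonrpc_handle(raw_request, handlers)
```

The `id` field is echoed back unchanged so the caller can match each response to the request that produced it.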
Verse Framework was created by Yacine Petitprez.
Infrastructure
- Docker: Containerization for consistent environments
- Docker Compose: Multi-container orchestration for development
- Nginx: Reverse proxy and load balancer
- S3-compatible storage: Object storage for media files
Open Source
IDAH is open-source software, freely available for use, modification, and distribution. We welcome contributions from the community!
- License: Mozilla Public License 2.0
- Repository: github.com/idah-ai/idah
- Contributing: See our Contributing Guide
🚀 Ready to get started? Head over to the Installation Guide to set up IDAH on your machine.