Export Datasets
Learn how to add data export and synchronization capabilities to your IDAH plugin using the Sync Service backend.
Overview
The Sync Service backend handles data export and synchronization with external systems. It lets you export datasets, entries, annotations, and media to various formats (JSON, CSV, XML) or push them to external APIs.
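In practice an exporter boils down to one class exposing a single export(context) method that walks datasets and writes output files. A minimal sketch of that shape, using hypothetical Struct-based stand-ins for the real IDAH context objects (StubContext, Dataset, and Record below are illustration only, not IDAH classes):

```ruby
require "json"
require "tempfile"

# Stand-in objects mimicking the shape of the IDAH export context.
Record     = Struct.new(:id, :name)
Dataset    = Struct.new(:record)
FileHandle = Struct.new(:path)

class StubContext
  attr_reader :datasets

  def initialize(datasets, out_path)
    @datasets = datasets
    @out_path = out_path
  end

  # In this stub, the context doubles as its own IO context.
  def io
    self
  end

  def file(format:)
    FileHandle.new(@out_path)
  end
end

# The exporter shape: one class, one export(context) method.
class Export
  def export(context)
    file = context.io.file(format: "json")
    data = context.datasets.map { |ds| { id: ds.record.id, name: ds.record.name } }
    File.write(file.path, JSON.pretty_generate(data))
  end
end

out = Tempfile.new(["export", ".json"])
datasets = [Dataset.new(Record.new("ds1", "Demo dataset"))]
Export.new.export(StubContext.new(datasets, out.path))
puts File.read(out.path)
```

The sections below walk through the real registration and export APIs that replace these stubs.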
Add Sync Backend to Your Plugin
Option 1: Create New Plugin with Sync Backend
When creating a new plugin, select "Sync Service" during the setup:
```bash
npx idah-plugin create my-plugin ./plugins
```

When prompted, select Sync Service from the backend services options.
Option 2: Add Sync Backend to Existing Plugin
If you have an existing plugin without a sync backend, you can add it:
```bash
npx idah-plugin backend add my-plugin ./plugins
```

When prompted, select Sync Service to add export capabilities.
Generated File Structure
The sync backend generator creates the following files:
```
<plugin_name>/
└─ backends/
   └─ sync/
      └─ <plugin_name_underscore>/
         ├─ sync.rb         # Sync service module (registers exporter)
         ├─ sync_spec.rb    # Sync service tests
         ├─ export.rb       # Export/sync logic
         └─ export_spec.rb  # Export tests
```
Implementation Steps
Step 1: Register the Exporter
The sync module registers your exporter with IDAH:
```ruby
module YourPlugin
  class Sync
    def self.init(context)
      context.register_exports(
        "your-plugin",       # Export identifier
        YourPlugin::Export   # Export class
      )
    end
  end
end
```

Step 2: Implement the Export Logic
Implement the core export logic in your export class:
```ruby
require "json"

module YourPlugin
  class Export
    def export(context)
      # Get dataset IDs being exported
      dataset_ids = context.dataset_ids

      # Get export options
      options = context.options

      # Create output file
      file = context.io.file(format: "json")

      # Process datasets
      all_data = []
      context.datasets.each do |dataset|
        all_data << export_dataset(dataset, options)
      end

      # Write to file
      File.write(file.path, all_data.to_json)

      # Progress is auto-updated when using the datasets iterator
    end

    private

    def export_dataset(dataset, options)
      # Your export logic here
      {
        id: dataset.record.id,
        name: dataset.record.name,
        entries: export_entries(dataset)
      }
    end

    def export_entries(dataset)
      dataset.entries.map do |entry|
        {
          id: entry.record.id,
          annotations: entry.annotations.map(&:annotation)
        }
      end
    end
  end
end
```

Export Context API
Context Attributes
```ruby
context.dataset_ids  # Array of dataset IDs
context.options      # Export options hash
context.io           # IO context for file operations
```

Iterate Through Datasets
```ruby
context.datasets.each do |dataset|
  # Access dataset data
  dataset_data = dataset.record
  dataset_id   = dataset.record.id
  dataset_name = dataset.record.name

  # Get entries
  entries = dataset.entries

  # Get filtered entries
  completed_entries = dataset.entries({ status: "completed" })
end
```

Note: Progress is automatically updated as you iterate through datasets.
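The same iteration pattern works for formats other than JSON; for instance, the CSV format mentioned in the overview can be produced with Ruby's stdlib csv library. A sketch assuming the record/entries shape shown above (export_csv and the Struct stand-ins are illustrative, not part of the IDAH API):

```ruby
require "csv"

# Build one CSV row per entry across all datasets.
# `datasets` stands in for context.datasets.
def export_csv(datasets, path)
  CSV.open(path, "w") do |csv|
    csv << %w[dataset_id dataset_name entry_id status]
    datasets.each do |dataset|
      dataset.entries.each do |entry|
        csv << [dataset.record.id, dataset.record.name,
                entry.record.id, entry.record.status]
      end
    end
  end
end

# Minimal stand-in objects to demonstrate the function.
Rec   = Struct.new(:id, :name, :status)
Entry = Struct.new(:record)
DS    = Struct.new(:record, :entries)

ds = DS.new(Rec.new("ds1", "Demo", nil), [Entry.new(Rec.new("e1", nil, "completed"))])
export_csv([ds], "export.csv")
puts File.read("export.csv")
```

In a real exporter you would write to context.io.file(format: "csv").path instead of a local filename.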
Access Entries and Annotations
```ruby
dataset.entries.each do |entry|
  # Access entry data
  entry_id = entry.record.id
  resource = entry.record.resource
  status   = entry.record.status

  # Get annotations
  annotations = entry.annotations

  # Get filtered annotations
  boxes = entry.annotations({ type: "bounding_box" })

  # Get media files
  medias = entry.medias
  original = entry.medias({ key: "" }).first
end
```

Download Media Files
```ruby
entry.medias.each do |media|
  filename  = media.media.filename
  mime_type = media.media.mime_type

  # Download media file
  binary_data = media.download

  # Save to export directory (binwrite avoids newline/encoding conversion)
  File.binwrite(File.join(dir, filename), binary_data)
end
```

IO Operations
```ruby
# Single output file in the requested format
file = context.io.file(format: "json")
File.write(file.path, data.to_json)

# Working directory for multi-file exports
dir = context.io.directory
File.write(File.join(dir, "data.json"), data.to_json)

# Zip the working directory into a single archive
zip_path = context.io.zip_directory
```

Implementation Workflow
1. Add Sync Backend (if not present)
```bash
npx idah-plugin backend add data-exporter ./plugins
```

Select "Sync Service" when prompted. This generates the sync backend structure.
2. Install Dependencies
```bash
cd plugins/data-exporter
bundle install
```

3. Implement Export Logic
Edit backends/sync/<plugin_name>/export.rb:
```ruby
require "json"

module YourPlugin
  class Export
    def export(context)
      Verse.logger.info "Starting export for #{context.dataset_ids.size} datasets"

      # Create output file
      file = context.io.file(format: "json")

      # Export all datasets
      all_data = []
      context.datasets.each do |dataset|
        all_data << export_dataset(dataset)
      end

      # Write to file
      File.write(file.path, JSON.pretty_generate(all_data))
      Verse.logger.info "Export complete"
    rescue StandardError => e
      Verse.logger.error "Export failed: #{e.message}"
      context.error!(e.message)
      raise
    end

    private

    def export_dataset(dataset)
      {
        id: dataset.record.id,
        name: dataset.record.name,
        modality: dataset.record.modality,
        entries: export_entries(dataset)
      }
    end

    def export_entries(dataset)
      dataset.entries.map do |entry|
        {
          id: entry.record.id,
          resource: entry.record.resource,
          status: entry.record.status,
          annotations: entry.annotations.map { |a| export_annotation(a) }
        }
      end
    end

    def export_annotation(annotation)
      {
        id: annotation.record.id,
        dimensions: annotation.record.dimensions,
        shape: annotation.record.shape
      }
    end
  end
end
```

4. Write Tests
```ruby
require "spec_helper"
require_relative "export"

RSpec.describe YourPlugin::Export do
  let(:export) { described_class.new }

  describe "#export" do
    it "exports datasets successfully" do
      context = double("context")
      io = double("io")
      file = double("file", path: "/tmp/export.json")

      allow(context).to receive(:dataset_ids).and_return(["ds1"])
      allow(context).to receive(:options).and_return({})
      allow(context).to receive(:io).and_return(io)
      allow(context).to receive(:datasets).and_return([])
      allow(io).to receive(:file).and_return(file)

      expect { export.export(context) }.not_to raise_error
    end
  end
end
```

5. Run Tests
```bash
bundle exec rspec backends/sync/
```

Data Context Objects
DatasetContext
```ruby
context.datasets.each do |dataset|
  # Access dataset record
  dataset.record.id
  dataset.record.name
  dataset.record.modality

  # Get all entries
  entries = dataset.entries

  # Get filtered entries
  entries = dataset.entries({ status: "completed" })
end
```

EntryContext
```ruby
dataset.entries.each do |entry|
  # Access entry record
  entry.record.id
  entry.record.resource
  entry.record.status

  # Get annotations
  annotations = entry.annotations
  annotations = entry.annotations({ type: "bounding_box" })

  # Get media files
  medias = entry.medias
  original = entry.medias({ key: "" }).first
end
```

AnnotationContext
```ruby
entry.annotations.each do |annotation|
  annotation.record.id
  annotation.record.dimensions
  annotation.record.annotation
  annotation.record.metadata
end
```

MediaContext
```ruby
entry.medias.each do |media|
  media.media.resource
  media.media.key
  media.media.filename
  media.media.mime_type

  # Download the file and save it byte-for-byte
  binary_data = media.download
  File.binwrite(media.media.filename, binary_data)
end
```

Common Export Patterns
Export to Single JSON File
```ruby
def export(context)
  file = context.io.file(format: "json")

  all_data = []
  context.datasets.each do |dataset|
    all_data << {
      dataset: dataset.record,
      entries: dataset.entries.map { |e| format_entry(e) }
    }
  end

  File.write(file.path, JSON.pretty_generate(all_data))
end
```

Export to Multiple Files
```ruby
def export(context)
  dir = context.io.directory

  context.datasets.each_with_index do |dataset, idx|
    filename = "dataset_#{idx + 1}.json"
    data = export_dataset(dataset)
    File.write(File.join(dir, filename), data.to_json)
  end

  # Create summary
  summary = { total: context.dataset_ids.size }
  File.write(File.join(dir, "summary.json"), summary.to_json)

  # Zip everything
  zip_path = context.io.zip_directory
  Verse.logger.info "Export zipped to #{zip_path}"
end
```

Export to External API
```ruby
require "net/http"
require "json"

def export(context)
  api_key = context.options[:api_key]
  api_url = context.options[:api_url]

  context.datasets.each do |dataset|
    data = prepare_dataset_data(dataset)
    send_to_api(data, api_url, api_key)
  end
end

def send_to_api(data, url, api_key)
  uri = URI(url)
  req = Net::HTTP::Post.new(uri)
  req["Authorization"] = "Bearer #{api_key}"
  req["Content-Type"] = "application/json"
  req.body = data.to_json

  Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    response = http.request(req)
    raise "API error: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  end
end
```

Export with Media Files
```ruby
def export(context)
  dir = context.io.directory

  # Create media directory
  media_dir = File.join(dir, "media")
  Dir.mkdir(media_dir)

  context.datasets.each do |dataset|
    # Export data
    data = export_dataset_data(dataset)
    File.write(File.join(dir, "#{dataset.record.id}.json"), data.to_json)

    # Export media files (binwrite preserves binary content)
    dataset.entries.each do |entry|
      entry.medias.each do |media|
        media_path = File.join(media_dir, media.media.filename)
        File.binwrite(media_path, media.download)
      end
    end
  end

  # Zip everything
  context.io.zip_directory
end
```

Best Practices
1. Handle Large Datasets
```ruby
def export(context)
  file = context.io.file(format: "json")

  # Stream data to avoid memory issues
  File.open(file.path, "w") do |f|
    f.write("[")
    context.datasets.each_with_index do |dataset, idx|
      f.write(",") if idx > 0
      f.write(export_dataset(dataset).to_json)
    end
    f.write("]")
  end
end
```

2. Handle Errors Gracefully
```ruby
def export(context)
  context.datasets.each do |dataset|
    begin
      export_dataset(dataset)
    rescue StandardError => e
      Verse.logger.warn "Failed to export #{dataset.record.id}: #{e.message}"
      # Continue or fail based on your requirements
    end
  end
end
```

3. Validate Options
```ruby
def export(context)
  raise ArgumentError, "API key required" unless context.options[:api_key]
  raise ArgumentError, "No datasets to export" if context.dataset_ids.empty?

  # Proceed with export
end
```

4. Clean Up Resources
```ruby
def export(context)
  temp_files = []
  begin
    # Create and process temp files
    temp_files << create_temp_file
    # ... processing
  ensure
    # Clean up
    temp_files.each { |f| File.unlink(f) if File.exist?(f) }
    context.io.cleanup
  end
end
```

Testing Your Sync Backend
Run Tests
```bash
bundle exec rspec backends/sync/
bundle exec rspec backends/sync/<plugin_name>/export_spec.rb
```

Test in IDAH Platform
- Build your plugin frontend:

  ```bash
  cd frontend && pnpm build
  ```

- Restart IDAH platform to load the plugin
- Create and annotate a dataset
- Trigger an export using your sync service
- Verify the exported data is correct
- Check logs for any errors or warnings
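The "verify the exported data" step can also be automated with a small plain-Ruby check that parses the output file and asserts the structure you expect. A sketch (verify_export and the required key list are illustrative; match them to your own export format):

```ruby
require "json"

# Parse an exported JSON file and check that every dataset object
# carries the keys a downstream consumer expects.
def verify_export(path, required_keys: %w[id name entries])
  data = JSON.parse(File.read(path))
  raise "expected a JSON array" unless data.is_a?(Array)

  data.each_with_index do |dataset, idx|
    missing = required_keys - dataset.keys
    raise "dataset #{idx} missing keys: #{missing.join(', ')}" unless missing.empty?
  end
  true
end

# Example: write a well-formed export and verify it
File.write("export.json", JSON.generate([{ id: "ds1", name: "Demo", entries: [] }]))
puts verify_export("export.json")  # prints true
```

A check like this fits naturally at the end of an export_spec.rb, run against the file your exporter actually produced.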
Real-World Example
See the UPD Exporter for a complete implementation.
📤 Ready to export data! Start by adding a sync backend to your plugin and implement your custom export logic.