🔗 Open Index Protocol

Blockchain-based data storage and retrieval system

Introduction

The Open Index Protocol (OIP) is a blockchain-based data storage and retrieval system. It combines the permanence and decentralization of blockchain storage with AI-assisted retrieval, private user data, cross-node synchronization, and flexible deployment options.

What is OIP?

OIP operates as a multi-layered platform that enables developers to build sophisticated, interconnected applications while maintaining user privacy and data ownership. The system provides:

  • 🔗 Blockchain Storage: Permanent, immutable records on the Arweave blockchain
  • 🔒 Private Storage: Encrypted user-owned records with HD wallet authentication
  • 🤖 AI Integration: Natural language processing with RAG (Retrieval-Augmented Generation)
  • 📁 Media Distribution: Multi-network file sharing with BitTorrent, IPFS, and HTTP
  • 🌐 Cross-Node Sync: Distributed deployment with automatic synchronization
  • 📋 Template System: Flexible schema definitions for structured data

Key Benefits

  • True User Ownership: HD wallet-based authentication with cross-device access
  • AI-Powered Interface: Natural language interaction with intelligent content retrieval
  • Multi-Network Storage: Public Arweave storage and private GUN network
  • Media Distribution: Decentralized file sharing with BitTorrent and IPFS
  • Cross-Node Sync: Distributed deployment with automatic synchronization
  • Memory Optimization: Comprehensive leak prevention and performance monitoring
  • Custom Frontend Development: Multiple development patterns with hot reloading
  • Multi-Stack Deployment: Isolated concurrent deployments with complete resource isolation

Historical Context

OIP traces its origins to Alexandria (2014), evolving through several key milestones:

  • 2014: Alexandria Genesis - Initial blockchain data storage concept
  • 2016: Decentralized Web Summit - Community formation
  • 2017: OIP Working Group - Formal protocol development
  • 2018: Caltech ETDB - Real-world validation
  • 2019: Teton County - Government implementation
  • 2020: Al Bawaba MVP - Media platform validation
  • 2021: Web Monetization - Payment integration
  • 2022: Arweave rewrite - Modern blockchain integration
  • 2023-2025: GUN/BitTorrent/IPFS integration - Multi-network architecture

Core Architecture

The Template-Record Paradigm

OIP operates on a fundamental two-tier architecture:

  1. Templates: Schema definitions that specify structure, field types, and compression indices
  2. Records: Data instances that conform to templates and get compressed using field-to-index mappings

Publishing a record follows four steps:

  1. Template Definition: Define schema structure and field types
  2. Record Creation: Create data instances conforming to templates
  3. Data Compression: Compress field names to numeric indices
  4. Blockchain Publishing: Store compressed data on Arweave

Dual Storage Architecture

Public Records (Arweave)

  • Permanent Storage: Immutable records on Arweave blockchain
  • Public Access: Available to all users without authentication
  • Server-Signed: Records signed by server's Arweave wallet
  • Use Cases: Blog posts, recipes, exercises, news articles

Private Records (GUN Network)

  • Encrypted Storage: User-owned records with HD wallet authentication
  • Cross-Node Sync: Records synchronized across multiple OIP nodes
  • Organization Support: Team-level access control with domain-based membership
  • Use Cases: Private conversations, personal media, organization content

Template Structure

Templates are JSON objects that define:

  • Field Names: Human-readable identifiers
  • Field Types: Data type specifications (string, enum, dref, etc.)
  • Index Mappings: Numeric indices for compression (`index_fieldName`)
  • Enum Values: Predefined value sets for enum fields
  • Validation Rules: Type constraints and requirements

Example Template Structure

{
  "basic": {
    "name": "string",
    "index_name": 0,
    "description": "string", 
    "index_description": 1,
    "date": "long",
    "index_date": 2,
    "language": "enum",
    "languageValues": [
      { "code": "en", "name": "English" },
      { "code": "es", "name": "Spanish" }
    ],
    "index_language": 3,
    "avatar": "dref",
    "index_avatar": 4,
    "tagItems": "repeated string",
    "index_tagItems": 5
  }
}
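
As a rough illustration of how the `index_fieldName` mappings could drive compression, the sketch below swaps field names for their numeric indices and back again. The helper names `compressSection` and `expandSection` are hypothetical, and the actual encoding OIP writes to Arweave (including how enum and dref values are represented) is not specified here, so treat this as a sketch under those assumptions.

```javascript
// Hypothetical helper: replace each field name with the numeric index that
// the template section declares for it via "index_<fieldName>".
function compressSection(templateSection, recordSection) {
  const compressed = {};
  for (const [field, value] of Object.entries(recordSection)) {
    const index = templateSection[`index_${field}`];
    if (index === undefined) {
      throw new Error(`Field "${field}" is not defined in the template`);
    }
    compressed[index] = value;
  }
  return compressed;
}

// Hypothetical inverse: restore field names from numeric indices.
function expandSection(templateSection, compressedSection) {
  const expanded = {};
  for (const [key, index] of Object.entries(templateSection)) {
    if (!key.startsWith("index_")) continue;
    if (index in compressedSection) {
      expanded[key.slice("index_".length)] = compressedSection[index];
    }
  }
  return expanded;
}

// A trimmed copy of the "basic" template section shown above.
const basicTemplate = {
  name: "string", index_name: 0,
  description: "string", index_description: 1,
  date: "long", index_date: 2,
};

const compressed = compressSection(basicTemplate, {
  name: "My Article",
  date: 1713783811,
});
console.log(JSON.stringify(compressed)); // {"0":"My Article","2":1713783811}
console.log(expandSection(basicTemplate, compressed));
```

Because both sides share the template, the field names never need to be stored with the record; only the indices and values do.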

Field Types

Type | Description | Example
string | Text data | "Hello World"
long | Integer/timestamp | 1713783811
enum | Predefined values | Language codes, categories
dref | Reference to another record | "did:arweave:abc123..."
repeated string | Array of strings | ["tag1", "tag2"]
repeated dref | Array of record references | Multiple citations
bool | Boolean value | true/false
uint64 | Unsigned integer | Large numbers
float | Floating point | Decimal values
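
To make the type table concrete, here is a minimal sketch of a value check for each field type. The function `matchesFieldType` and its rules are assumptions for illustration (for example, treating any string that starts with "did:" as a dref, and checking enums against a list of codes); OIP's own validation may be stricter or different.

```javascript
// Hypothetical validator for the field types in the table above.
// enumCodes is only consulted for "enum" fields.
function matchesFieldType(type, value, enumCodes = []) {
  if (type.startsWith("repeated ")) {
    const itemType = type.slice("repeated ".length);
    return Array.isArray(value) &&
      value.every((item) => matchesFieldType(itemType, item, enumCodes));
  }
  switch (type) {
    case "string":
      return typeof value === "string";
    case "long":
      return Number.isSafeInteger(value);
    case "uint64":
      // BigInt covers values beyond Number.MAX_SAFE_INTEGER.
      return (typeof value === "bigint" || Number.isSafeInteger(value)) && value >= 0;
    case "float":
      return typeof value === "number" && Number.isFinite(value);
    case "bool":
      return typeof value === "boolean";
    case "enum":
      return enumCodes.includes(value);
    case "dref":
      // Assumption: references look like "did:arweave:...".
      return typeof value === "string" && value.startsWith("did:");
    default:
      return false;
  }
}

console.log(matchesFieldType("long", 1713783811)); // true
console.log(matchesFieldType("repeated string", ["tag1", "tag2"])); // true
console.log(matchesFieldType("enum", "fr", ["en", "es"])); // false
```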

Key Components

🔌 API Layer

Comprehensive REST APIs for all operations including records, publishing, media, and authentication.

  • GET /api/records - Advanced querying
  • POST /api/records/newRecord - Publish records
  • POST /api/media/upload - Media uploads
  • POST /api/user/register - User registration

🤖 AI Integration (ALFRED)

Sophisticated AI assistant with voice processing, RAG, and multiple LLM backends.

  • Natural Language Processing
  • RAG System for context-aware responses
  • Voice Processing (STT/TTS)
  • Conversation Memory with encryption

📁 Media Distribution

Multi-network storage with BitTorrent, IPFS, HTTP, and Arweave integration.

  • BitTorrent/WebTorrent P2P distribution
  • HTTP Streaming with range support
  • IPFS decentralized storage
  • Arweave permanent blockchain storage

🖥️ Reference Client

Comprehensive web interface for browsing, publishing, and AI interaction.

  • Advanced search and filtering
  • Multi-record type publishing
  • AI Drawer for natural language
  • HD wallet authentication

🎤 Mac Client

Local voice interface for ALFRED with privacy-focused conversation management.

  • Local Whisper MLX processing
  • Voice Activity Detection
  • Smart Turn Management
  • Encrypted conversation storage

🔐 Authentication & Privacy

HD wallet-based authentication with BIP-39/BIP-32 standards.

  • 12-word mnemonic seed phrases
  • Cross-device access
  • Per-user encryption
  • Organization access control
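
The 12-word figure follows directly from the BIP-39 arithmetic: 128 bits of entropy plus a 4-bit checksum gives 132 bits, which split into twelve 11-bit word indices. The small sketch below shows that calculation; the function name `mnemonicWordCount` is illustrative only and is not part of OIP.

```javascript
// BIP-39: a checksum of ENT/32 bits is appended to ENT bits of entropy,
// and the result is split into 11-bit groups, one per mnemonic word.
function mnemonicWordCount(entropyBits) {
  if (entropyBits % 32 !== 0 || entropyBits < 128 || entropyBits > 256) {
    throw new RangeError("BIP-39 entropy must be 128-256 bits in multiples of 32");
  }
  const checksumBits = entropyBits / 32;
  return (entropyBits + checksumBits) / 11;
}

console.log(mnemonicWordCount(128)); // 12
console.log(mnemonicWordCount(256)); // 24
```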

API Reference

Records Endpoint

GET /api/records

Advanced querying with filtering, search, and resolution capabilities.

Query Parameters

Parameter | Type | Description | Example
search | string | Full-text search query | "blockchain technology"
recordType | string | Filter by record type | "post", "recipe"
tags | string | Filter by tags (comma-separated) | "cooking,italian"
limit | number | Number of results | 20
sortBy | string | Sort field and direction | "date:desc"

Example Request

curl "https://api.oip.onl/api/records?search=blockchain&recordType=post&limit=10&sortBy=date:desc"
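
When building the same query from JavaScript, it is safer to let `URL` and `URLSearchParams` handle the encoding than to concatenate strings by hand. The following is a minimal sketch; `buildRecordsUrl` is a hypothetical helper, and joining array values with commas is an assumption based on the comma-separated `tags` parameter above.

```javascript
// Hypothetical helper: build a GET /api/records URL from a parameter object.
function buildRecordsUrl(baseUrl, params) {
  const url = new URL("/api/records", baseUrl);
  for (const [key, value] of Object.entries(params)) {
    if (value === undefined || value === null) continue; // skip unset filters
    const text = Array.isArray(value) ? value.join(",") : String(value);
    url.searchParams.set(key, text);
  }
  return url.toString();
}

const recordsUrl = buildRecordsUrl("https://api.oip.onl", {
  search: "blockchain technology",
  recordType: "post",
  tags: ["cooking", "italian"],
  limit: 10,
  sortBy: "date:desc",
});
console.log(recordsUrl);
```

The resulting string can be passed straight to `fetch`; spaces, commas, and colons in the values are percent-encoded automatically.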

Publishing Endpoint

POST /api/records/newRecord

Publish records to Arweave or GUN storage.

Request Body

{
  "basic": {
    "name": "My Article",
    "description": "An example article",
    "date": 1713783811,
    "language": "en",
    "tagItems": ["example", "article"]
  },
  "post": {
    "bylineWriter": "John Doe",
    "articleText": "This is the main content of my article...",
    "webUrl": "https://example.com/my-article"
  },
  "blockchain": "arweave"
}

Media Upload Endpoint

POST /api/media/upload

Upload media files with BitTorrent distribution.

Request (multipart/form-data)

Content-Type: multipart/form-data
Authorization: Bearer <jwt-token>

Form Fields:
- file: <binary_file>           # Required: Media file
- name: "Media Title"           # Optional: Human-readable name
- access_level: "private"       # Optional: "private", "organization", "public"
- description: "Description"    # Optional: Description text
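
A client-side sketch of this request, assuming a runtime with `FormData`, `Blob`, and `fetch` (browsers or Node.js 18+). The helper names `buildUploadForm` and `uploadMedia` are hypothetical, and the sketch assumes the endpoint answers with a JSON body, which is not documented above.

```javascript
const ACCESS_LEVELS = ["private", "organization", "public"];

// Hypothetical helper: assemble the form fields listed above.
function buildUploadForm(file, { name, accessLevel, description } = {}) {
  if (accessLevel !== undefined && !ACCESS_LEVELS.includes(accessLevel)) {
    throw new Error(`access_level must be one of: ${ACCESS_LEVELS.join(", ")}`);
  }
  const form = new FormData();
  form.append("file", file); // required
  if (name !== undefined) form.append("name", name);
  if (accessLevel !== undefined) form.append("access_level", accessLevel);
  if (description !== undefined) form.append("description", description);
  return form;
}

// Hypothetical helper: send the form with the JWT as a Bearer token.
// Content-Type is left unset so fetch can add the multipart boundary itself.
async function uploadMedia(baseUrl, jwtToken, form, fetchImpl = fetch) {
  const response = await fetchImpl(`${baseUrl}/api/media/upload`, {
    method: "POST",
    headers: { Authorization: `Bearer ${jwtToken}` },
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response.json(); // assumption: the server replies with JSON
}

// Usage (not executed here):
// const form = buildUploadForm(fileInput.files[0], { name: "Media Title", accessLevel: "private" });
// const result = await uploadMedia("http://localhost:3005", token, form);
```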

Advanced Features

Multi-Stack Deployment

OIP supports running multiple completely separate stacks on the same machine without conflicts.

Conflict Resolution

The OIP stack avoids Docker naming and port conflicts between stacks by scoping every resource to the project:

  • 🌐 Network Names: ${COMPOSE_PROJECT_NAME}_oip-network
  • 📦 Volume Names: ${COMPOSE_PROJECT_NAME}_volumename
  • 🔌 Port Conflicts: All service ports configurable via environment variables
  • 📁 Container Names: Automatically prefixed with project name
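
The naming scheme above can be sketched as a small function that derives the per-stack resource names from `COMPOSE_PROJECT_NAME`. The function `composeResourceNames` and the volume name `esdata` are hypothetical and used only to show the pattern.

```javascript
// Derive per-project Docker resource names following the
// ${COMPOSE_PROJECT_NAME}_<resource> pattern described above.
function composeResourceNames(projectName, volumes = []) {
  return {
    network: `${projectName}_oip-network`,
    volumes: volumes.map((volume) => `${projectName}_${volume}`),
  };
}

// "esdata" is a made-up volume name for illustration.
console.log(composeResourceNames("rockhoppers-game", ["esdata"]));
console.log(composeResourceNames("space-adventure", ["esdata"]));
```

Two stacks with different project names therefore never share a network or volume name, even when they are started from identical compose files.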

Custom Frontend Development

OIP supports multiple frontend development patterns:

Pattern 1: Frontend-First Development

Best for: Rapid frontend development, UI/UX iteration, hot reloading

# Project structure
RockHoppersGame/
├── oip-arweave-indexer/    # OIP backend on :3005
└── public/                 # Frontend on :3000 via npx serve
    ├── index.html
    ├── app.js
    └── config.js

Pattern 2: Integrated Development

Best for: Production-like testing, single-origin behavior

# Configure OIP to serve custom frontend
echo "CUSTOM_PUBLIC_PATH=true" >> .env
make standard  # Everything runs on :3005

Organization Access Control

Organizations provide multi-level access control for team collaboration:

Index | Code | Display Name | Description
0 | invite-only | "Invite Only" | Members must be explicitly invited
1 | app-user-auto | "Auto-Enroll App Users" | Users from organization's domain auto-join
2 | token-gated | "Token-Gated Membership" | Requires specific tokens/NFTs
3 | open-join | "Open Join" | Anyone can join freely
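
One way to read the table is as a lookup from policy index to a join rule. The sketch below is an interpretation, not OIP's implementation: the `canJoin` function and the shape of its `user` argument (`invited`, `domain`, `holdsToken`) are assumptions introduced for illustration.

```javascript
// Policy codes in index order, as listed in the table above.
const MEMBERSHIP_POLICIES = ["invite-only", "app-user-auto", "token-gated", "open-join"];

// Hypothetical join check. The user fields are assumed, not part of the OIP API.
function canJoin(policyIndex, user, organizationDomain) {
  const code = MEMBERSHIP_POLICIES[policyIndex];
  switch (code) {
    case "invite-only":
      return user.invited === true;
    case "app-user-auto":
      return user.domain === organizationDomain;
    case "token-gated":
      return user.holdsToken === true;
    case "open-join":
      return true;
    default:
      throw new RangeError(`Unknown membership policy index: ${policyIndex}`);
  }
}

console.log(canJoin(1, { domain: "example.com" }, "example.com")); // true
console.log(canJoin(0, { domain: "example.com" }, "example.com")); // false
```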

Deployment Guide

Quick Start

1. Prerequisites

  • Docker and Docker Compose
  • Node.js 18+ (for development)
  • Git

2. Basic Setup

# Clone the repository
git clone https://github.com/your-org/oip-arweave-indexer.git
cd oip-arweave-indexer

# Start the system
make standard

# Access the reference client
open http://localhost:3005/reference-client.html

Environment Configuration

Core Environment Variables

# Project Configuration
COMPOSE_PROJECT_NAME=my-project
CUSTOM_PUBLIC_PATH=true
PORT=3005

# Service Ports
ELASTICSEARCH_PORT=9200
KIBANA_PORT=5601
OLLAMA_PORT=11434
GUN_RELAY_PORT=8765

# LLM APIs
OPENAI_API_KEY=sk-...
XAI_API_KEY=xai-...

# Voice Services
STT_SERVICE_URL=http://localhost:8003
TTS_SERVICE_URL=http://localhost:5002
ELEVENLABS_API_KEY=...

# GUN Database
GUN_RELAY_URL=http://gun-relay:8765
GUN_ENCRYPTION_KEY=gun-encryption-key

# Performance
ALFRED_CACHE_MAX_SIZE=1000
ALFRED_CACHE_MAX_AGE=1800000
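
For code that consumes these variables, it can help to parse and validate them once at startup. The following sketch shows one way to do that; `loadConfig` is a hypothetical helper, and the fallback values are simply the example values from the listing above, not documented defaults.

```javascript
// Hypothetical config loader: reads port settings from an env-like object,
// falling back to the example values shown above.
function loadConfig(env) {
  const readPort = (name, fallback) => {
    const port = Number(env[name] ?? fallback);
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error(`${name} must be a TCP port between 1 and 65535`);
    }
    return port;
  };

  return {
    port: readPort("PORT", 3005),
    elasticsearchPort: readPort("ELASTICSEARCH_PORT", 9200),
    kibanaPort: readPort("KIBANA_PORT", 5601),
    ollamaPort: readPort("OLLAMA_PORT", 11434),
    gunRelayPort: readPort("GUN_RELAY_PORT", 8765),
    customPublicPath: env.CUSTOM_PUBLIC_PATH === "true",
  };
}

// In Node.js this would typically be called as loadConfig(process.env).
console.log(loadConfig({ PORT: "3105", ELASTICSEARCH_PORT: "9300" }));
```

Failing fast on a malformed port is usually easier to debug than a container that starts and then cannot bind.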

Docker Services

The OIP stack includes the following services:

oip - Main API server
elasticsearch - Search and indexing engine
kibana - Elasticsearch management interface
ollama - Local LLM models
gun-relay - GUN network relay
tts-service-gpu - Text-to-speech service
stt-service-gpu - Speech-to-text service
smart-turn-service - Voice endpoint detection

Use Cases & Examples

1. Content Publishing Platform

Create a decentralized content platform with AI-powered features:

// Publish a blog post
const postData = {
  basic: {
    name: "Understanding Blockchain Technology",
    description: "A comprehensive guide to blockchain...",
    date: Math.floor(Date.now() / 1000),
    language: "en",
    tagItems: ["blockchain", "technology", "guide"]
  },
  post: {
    bylineWriter: "John Doe",
    articleText: "Blockchain technology represents a paradigm shift...",
    webUrl: "https://example.com/article"
  },
  blockchain: "arweave"
};

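const response = await fetch('/api/records/newRecord', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(postData)
});

// Surface HTTP errors instead of silently ignoring a failed publish
if (!response.ok) {
  throw new Error(`Publish failed with status ${response.status}`);
}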

2. Recipe Sharing Platform

Build a cooking platform with AI-generated images:

// Recipe record payload; publish it with POST /api/records/newRecord as in example 1
const recipeData = {
  basic: {
    name: "Authentic Italian Pasta",
    description: "Traditional Italian pasta recipe...",
    date: Math.floor(Date.now() / 1000),
    language: "en",
    tagItems: ["italian", "pasta", "traditional"]
  },
  recipe: {
    prepTimeMinutes: 15,
    cookTimeMinutes: 25,
    servings: 4,
    difficulty: "Easy",
    cuisine: "Italian",
    ingredients: ["2 cups flour", "3 eggs", "1 tsp salt"],
    instructions: ["Mix ingredients", "Knead dough", "Roll and cut"]
  },
  blockchain: "arweave"
};

3. AI-Powered Voice Assistant

Integrate ALFRED for natural language interactions:

// Voice conversation with ALFRED
const voiceData = new FormData();
voiceData.append('audio', audioBlob);
voiceData.append('processing_mode', 'rag');
voiceData.append('voiceConfig', JSON.stringify({
  engine: "elevenlabs",
  elevenlabs: {
    selectedVoice: "pNInz6obpgDQGcFmaJgB",
    stability: 0.5,
    similarity_boost: 0.75
  }
}));

const response = await fetch('/api/voice/converse', {
  method: 'POST',
  body: voiceData
});

const { dialogueId } = await response.json();

// Connect to real-time stream
const eventSource = new EventSource(`/api/voice/open-stream?dialogueId=${dialogueId}`);

4. Multi-Stack Game Development

Deploy multiple game instances on the same machine:

# Game 1: Rock Hoppers
cd ~/projects/RockHoppersGame/oip-arweave-indexer
echo "COMPOSE_PROJECT_NAME=rockhoppers-game" >> .env
echo "PORT=3005" >> .env
make standard

# Game 2: Space Adventure  
cd ~/projects/SpaceAdventure/oip-arweave-indexer
echo "COMPOSE_PROJECT_NAME=space-adventure" >> .env
echo "PORT=3105" >> .env
echo "ELASTICSEARCH_PORT=9300" >> .env
make standard
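
The second stack above shifts PORT and ELASTICSEARCH_PORT by 100. If both stacks run every service, the remaining service ports would need distinct values as well. The sketch below generates a full set of .env lines by applying one offset to each configurable port; `stackEnv` is a hypothetical helper, and the base ports are the example values from the Environment Configuration section.

```javascript
// Example base ports taken from the environment listing in this document.
const BASE_PORTS = {
  PORT: 3005,
  ELASTICSEARCH_PORT: 9200,
  KIBANA_PORT: 5601,
  OLLAMA_PORT: 11434,
  GUN_RELAY_PORT: 8765,
};

// Hypothetical helper: produce .env content for one stack, shifting every
// service port by the same offset so stacks do not collide.
function stackEnv(projectName, offset) {
  const lines = [`COMPOSE_PROJECT_NAME=${projectName}`];
  for (const [name, basePort] of Object.entries(BASE_PORTS)) {
    lines.push(`${name}=${basePort + offset}`);
  }
  return lines.join("\n");
}

console.log(stackEnv("rockhoppers-game", 0));
console.log(stackEnv("space-adventure", 100));
```

Writing the output to a fresh .env file also avoids the duplicate keys that repeated `echo ... >> .env` commands can leave behind.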

Troubleshooting

1. Services Won't Start

Problem: Docker containers fail to start

# Check Docker status
docker ps -a

# Check logs
docker logs oip-arweave-indexer-oip-1

# Restart services
make down
make standard

2. Elasticsearch Connection Issues

Problem: Cannot connect to Elasticsearch

# Check Elasticsearch health
curl http://localhost:9200/_cluster/health

# Restart Elasticsearch
docker restart oip-arweave-indexer-elasticsearch-1

# Check mapping issues
curl http://localhost:9200/_mapping

3. AI Services Not Responding

Problem: ALFRED AI not working

# Check Ollama models
docker exec -it oip-arweave-indexer-ollama-1 ollama list

# Check API keys
echo $OPENAI_API_KEY
echo $XAI_API_KEY

# Test AI endpoint
curl -X POST http://localhost:3005/api/voice/chat \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello", "processing_mode": "llm"}'

4. Port Conflicts

Problem: Port already in use

# Check what's using the port
lsof -i :3005

# Find available ports
for port in {3005..3010}; do lsof -i :"$port" >/dev/null 2>&1 || echo "Port $port is free"; done

# Update environment variables
echo "PORT=3006" >> .env

Health Checks

# Check all services
curl http://localhost:3005/api/health

# Check specific services
curl http://localhost:9200/_cluster/health  # Elasticsearch
curl http://localhost:11434/api/tags        # Ollama models (use http://ollama:11434 from inside the Docker network)
curl http://localhost:5002/health           # TTS service
curl http://localhost:8003/health           # STT service
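
The same checks can be run from a script, which is handy for monitoring or CI. This is a minimal sketch assuming Node.js 18+ (for the global `fetch`); `checkServices` is a hypothetical helper, and it only inspects HTTP status codes because the response bodies of these endpoints are not documented here.

```javascript
// Hypothetical helper: probe each URL and report "up", an HTTP status,
// or "unreachable". fetchImpl is injectable so the function can be tested.
async function checkServices(services, fetchImpl = fetch) {
  const results = {};
  for (const [name, url] of Object.entries(services)) {
    try {
      const response = await fetchImpl(url);
      results[name] = response.ok ? "up" : `http ${response.status}`;
    } catch (error) {
      results[name] = "unreachable"; // connection refused, DNS failure, etc.
    }
  }
  return results;
}

// URLs mirror the curl commands above, using host-side ports.
const SERVICES = {
  api: "http://localhost:3005/api/health",
  elasticsearch: "http://localhost:9200/_cluster/health",
  tts: "http://localhost:5002/health",
  stt: "http://localhost:8003/health",
};

// Usage (not executed here):
// checkServices(SERVICES).then((results) => console.table(results));
```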

Contributing

Development Setup

  1. Fork the repository
  2. Clone your fork
  3. Set up development environment
  4. Make your changes
  5. Test thoroughly
  6. Submit a pull request

Code Style

  • Use ES6+ JavaScript features
  • Follow existing code patterns
  • Add comprehensive comments
  • Include error handling
  • Write tests for new features

Testing

# Run tests
npm test

# Run specific test suites
npm test -- --grep "media upload"
npm test -- --grep "authentication"