A modern analytics pipeline for tracking and analyzing GitHub contributions. The system processes contributor data, generates AI-powered summaries, and maintains a leaderboard of developer activity.
- Bun (recommended) or Node.js 18+
- GitHub Personal Access Token with repo scope
- OpenRouter API Key (optional, for AI summaries)
- uv (optional, for syncing from production DB)
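To quickly confirm the prerequisites are available on your machine, a check like the following should work (a minimal sketch; adjust for your setup):

```bash
# Verify the required tooling is installed
bun --version     # or: node --version (needs 18+)
uv --version      # only needed if you plan to sync from the production DB
```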
- Tracks pull requests, issues, reviews, and comments
- Calculates contributor scores based on activity and impact
- Generates AI-powered summaries of contributions
- Exports daily summaries to JSON files
- Maintains contributor expertise levels and focus areas
- Interactive contributor profile pages
- Activity visualizations and metrics
- Daily, weekly, and monthly reports
- Smart contributor scoring system
- Install dependencies:
bun install
- Set up environment variables in `.env`, using `.env.example` for reference:
# Required for GitHub ingest
GITHUB_TOKEN=your_github_personal_access_token_here
# Required for AI summaries
OPENROUTER_API_KEY=your_api_key_here
# Configure the local environment to use cheaper models
LARGE_MODEL=openai/gpt-4o-mini
# Optional site info
SITE_URL=https://elizaos.github.io
SITE_NAME="ElizaOS Leaderboard"
Then load the environment variables:
source .envrc
# Or if using direnv: direnv allow
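If you don't already have an `.envrc`, a minimal one can simply load `.env` (a sketch; the repository may ship its own version):

```bash
# .envrc — minimal example that loads variables from .env
# With direnv installed, its stdlib helper does this directly:
dotenv .env
# Without direnv, export everything from .env manually:
# set -a; source .env; set +a
```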
- Configure repositories in `config/pipeline.config.ts`:
export default {
  // Repositories to track
  repositories: [
    {
      owner: "elizaos",
      name: "eliza",
    },
  ],
  // Bot users to ignore
  botUsers: ["dependabot", "renovate-bot"],
  // Scoring and tag configuration...
  // AI Summary configuration
  aiSummary: {
    enabled: true,
    apiKey: process.env.OPENROUTER_API_KEY,
    // ...
  },
};
- Initialize Database
You can either initialize an empty database or sync the latest data from production:
Option A - Initialize Empty Database:
# Apply migrations
bun run db:migrate
Option B - Sync Production Data:
If you want to download all historical data from the production data branch instead of re-ingesting and regenerating it yourself, you can use the `data:sync` command, which depends on `uv`.
# Install uv first if you don't have it (required for database restoration)
pipx install uv # Recommended method
# OR
brew install uv # macOS with Homebrew
# More installation options: https://docs.astral.sh/uv/getting-started/installation/
# Download the latest data from production
bun run data:sync
# This will:
# - Fetch the latest data from the _data branch
# - Copy all data files (stats, summaries, etc.)
# - Restore the SQLite database from the diffable dump
# If you made local changes to the schema that don't exist in prod DB:
bun run db:generate
bun run db:migrate
The data sync utility supports several options:
# View all options
bun run data:sync --help
# Skip confirmation prompts (useful in scripts)
bun run data:sync -y
# Sync from a different remote (if you've added one)
bun run data:sync --remote upstream
# Skip database restoration (only sync generated JSON/MD files)
bun run data:sync --skip-db
# Delete all local data and force sync
bun run data:sync --force
After syncing or initializing the database, you can explore it using Drizzle Studio:
# Launch the database explorer
bun run db:studio
If you encounter issues with Drizzle Studio due to Node.js version mismatches, you can use a different SQLite browser such as DB Browser for SQLite.
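For a quick look without any GUI, the `sqlite3` command-line shell also works against the local database (a sketch; assumes the default database path shown in the project structure below):

```bash
# Inspect the local database directly with the sqlite3 CLI
sqlite3 data/db.sqlite ".tables"
sqlite3 data/db.sqlite "SELECT COUNT(*) FROM repositories;"
```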
You can see the main pipelines and their usage with the commands below:
bun run pipeline ingest -h
bun run pipeline process -h
bun run pipeline export -h
bun run pipeline summarize -h
# Ingest latest GitHub data (defaults to the period since the last fetch, or the past 7 days)
bun run pipeline ingest
# Ingest from beginning
bun run pipeline ingest --after 2024-10-15
# Ingest with specific date range
bun run pipeline ingest --after 2025-01-01 --before 2025-02-20
# Ingest data for a specific number of days
bun run pipeline ingest --days 30 --before 2024-03-31
# Ingest with verbose logging
bun run pipeline ingest -v
# Ingest with custom config file
bun run pipeline ingest --config custom-config.ts
# Process and analyze all repositories
bun run pipeline process
# Force recalculation of scores even if they already exist
bun run pipeline process --force
# Process specific repository
bun run pipeline process --repository owner/repo
# Process with verbose logging
bun run pipeline process -v
# Process with custom config
bun run pipeline process --config custom-config.ts
# Export repository stats (defaults to 30 days)
bun run pipeline export
# Export with specific date range
bun run pipeline export --after 2025-01-01 --before 2025-02-20
# Export for a specific number of days
bun run pipeline export --days 60
# Export all data since contributionStartDate
bun run pipeline export --all
# Export for specific repository
bun run pipeline export -r owner/repo
# Export to custom directory
bun run pipeline export --output-dir ./custom-dir/
# Export with verbose logging
bun run pipeline export -v
# Regenerate and overwrite existing files
bun run pipeline export --force
Generated project summaries are stored in `data/<owner_repo>/<interval>/summaries/summary_<date>.json`.
# Generate project summaries
bun run pipeline summarize -t project
# Generate contributor summaries
bun run pipeline summarize -t contributors
# Generate summaries with specific date range
bun run pipeline summarize -t project --after 2025-01-01 --before 2025-02-20
# Force overwrite existing summaries
bun run pipeline summarize -t project --force
# Generate and overwrite summaries for a specific number of days (default 7 days)
bun run pipeline summarize -t project --days 90 --force
# Generate project summaries for all data since contributionStartDate
bun run pipeline summarize -t project --all
# Generate summaries for specific repository
bun run pipeline summarize -t project --repository owner/repo
# Generate only weekly contributor summaries
bun run pipeline summarize -t contributors --weekly
# Generate summaries with verbose logging
bun run pipeline summarize -t project -v
By default, the summarize command won't regenerate summaries that already exist for a given day. To regenerate them, pass the `-f`/`--force` flag.
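To spot-check the generated output on disk, you can browse the summary files directly (a sketch; the repository directory and date shown here are illustrative):

```bash
# List generated summaries for a tracked repo
# (path pattern: data/<owner_repo>/<interval>/summaries/)
ls data/elizaos_eliza/*/summaries/
# Inspect one summary file (hypothetical date)
cat data/elizaos_eliza/*/summaries/summary_2025-02-20.json
```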
# Generate database migration files
bun run db:generate
# Apply database migrations
bun run db:migrate
# Launch interactive database explorer
bun run db:studio
# Build and generate contributor profile pages
bun run build
# View the site
bunx serve@latest out
The project uses GitHub Actions for automated data processing, summary generation, and deployment. The system maintains separate branches for code and data to optimize Git history management.
- **Run Pipelines** (`run-pipelines.yml`): Runs daily at 23:00 UTC to fetch GitHub data, process it, and generate summaries
  - Runs the full `ingest → process → export → summarize` pipeline chain
  - Maintains data in a dedicated `_data` branch
  - Can be manually triggered from the GitHub Actions tab (or via the CLI; see the sketch after this list) with custom date ranges or forced regeneration
  - Runs project summaries daily, but only runs contributor summaries on Sundays
- **Deploy to GitHub Pages** (`deploy.yml`): Builds and deploys the site
  - Triggered on push to main, manually, or after a successful pipeline run
  - Restores data from the `_data` branch before building
  - Generates directory listings for the data folder
  - Deploys to GitHub Pages
- **PR Checks** (`pr-checks.yml`): Quality checks for pull requests
  - Runs linting, typechecking, and build verification
  - Tests the pipeline on a small sample of data
  - Verifies migrations are up to date when schema changes
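As referenced above, the pipeline workflow can also be kicked off from the command line with the GitHub CLI (a sketch; the dispatch input names below are hypothetical, so check `run-pipelines.yml` for the actual inputs):

```bash
# Manually dispatch the pipeline workflow (requires the GitHub CLI, `gh`)
gh workflow run run-pipelines.yml
# Inputs are passed with -f key=value; these names are illustrative only:
# gh workflow run run-pipelines.yml -f after=2025-01-01 -f force=true
```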
The project uses a specialized data branch strategy to optimize both code and data storage:
- **Separate Data Branch**: All pipeline data is stored in a separate branch (default: `_data`)
  - Keeps the main branch clean and focused on code
  - Prevents data changes from cluttering code commits
  - Enables efficient data restoration in CI/CD and deployment
- **Database Serialization**: Uses the `sqlite-diffable` utility to store database content as version-controlled files
  - Converts the SQLite database to diffable text files in `data/dump/`
  - Enables Git to track database changes efficiently
  - Provides an audit trail
  - Allows for database "time travel" via Git history
- **Custom GitHub Actions**: Two custom actions are used in the workflows:
  - `restore-db`: Restores data from the data branch using sparse checkout
  - `pipeline-data`: Manages worktrees to retrieve and update data in the `_data` branch
This architecture ensures:
- Efficient Git history management (code changes separate from data changes)
- Reliable CI/CD workflows with consistent data access
- Simplified deployment with automatic data restoration
- Effective collaboration without data conflict issues
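To poke around the data branch locally without switching your working branch, a Git worktree works well (a sketch; the local path is arbitrary and the layout inside `_data` may differ slightly):

```bash
# Check out the _data branch into a separate directory (hypothetical path ../leaderboard-data)
git fetch origin _data
git worktree add ../leaderboard-data origin/_data
ls ../leaderboard-data/data/dump/   # diffable SQLite dump, per the layout described above
# Remove the worktree when you're done
git worktree remove ../leaderboard-data
```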
The project uses a TypeScript-based pipeline for data processing. See Pipeline Documentation for detailed information about:
- Basic usage and commands
- Pipeline architecture and components
- Configuration options
- Creating custom pipelines
- Available customization points
If you need to modify the database schema (in `src/lib/data/schema.ts`), follow these steps:
- Make your changes to the schema file
- Generate migration files: `bun run db:generate`
- Apply the migrations: `bun run db:migrate`
This process will:
- Create new migration files in the `drizzle` directory
- Apply the changes to your SQLite database
- Ensure data consistency with the updated schema
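A typical end-to-end flow for a schema change looks roughly like this (a sketch consolidating the steps above):

```bash
# Edit src/lib/data/schema.ts, then:
bun run db:generate     # writes a new SQL migration into drizzle/
git status drizzle/     # review the generated migration before applying it
bun run db:migrate      # applies the migration to the local SQLite database
```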
To interactively explore the database and its contents:
bun run db:studio
This launches Drizzle Studio, which provides a visual interface to browse tables, relationships, run queries, and export data.
Additional setup is required if you use Safari or Brave: https://orm.drizzle.team/docs/drizzle-kit-studio#safari-and-brave-support
- **"GITHUB_TOKEN environment variable is required"**
  - Ensure your GitHub token is set in `.env` and the environment is loaded
  - You can also run commands with the token directly: `GITHUB_TOKEN=your_token bun run pipeline ingest -d 10`
  - GitHub Personal Access Token permissions:
    - Contents: Read and write
    - Metadata: (auto-enabled)
    - Actions: Read and write
    - Pages: Read and write
- **"No such table: repositories"**
  - Run `bun run db:generate` and `bun run db:migrate` to initialize the database
  - Ensure the `data` directory exists: `mkdir -p data`
- **"Error fetching data from GitHub"**
  - Check that your GitHub token has the proper permissions
  - Verify repository names are correct in the config
  - Ensure your token has not expired
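To confirm the token itself is valid (and not expired), a direct call to the GitHub API is a quick sanity check (a sketch; assumes the token is exported in your shell):

```bash
# Should return rate-limit info; a 401 response means the token is invalid or expired
curl -s -H "Authorization: Bearer $GITHUB_TOKEN" https://api.github.com/rate_limit
```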
For more detailed logs, add the `-v` or `--verbose` flag to any command:
bun run pipeline ingest -d 10 -v
├── data/ # Generated data and reports
│ └── db.sqlite # SQLite database
├── cli/ # CLI program for pipeline
│ └── analyze-pipeline.ts # Run typescript pipeline
├── config/ # Configuration files
│ └── pipeline.config.ts # TypeScript pipeline configuration
├── drizzle/ # Database migration files
├── src/
│ ├── app/ # Next.js app router pages
│ ├── components/ # React components
│ │ └── ui/ # shadcn/ui components
│ │
│ └── lib/
│ ├── pipelines/ # Modular pipeline system
│ │ ├── contributors/ # Contributor-specific pipeline components
│ │ ├── export/ # Pipelines to export JSON data
│ │ ├── ingest/ # Data ingestion pipeline components
│ │ ├── summarize/ # Pipelines to generate AI summaries
│ ├── data/ # Data sources and storage
│ │ ├── db.ts # Database connection and configuration
│ │ ├── github.ts # GitHub API integration
│ │ ├── ingestion.ts # Data ingestion from GitHub API
│ │ ├── schema.ts # Database schema definitions
│ │ └── types.ts # Core data type definitions
│ ├── logger.ts # Logging system
│ └── typeHelpers.ts # TypeScript helper utilities
├── profiles/ # Generated static profiles
└── .github/workflows # Automation workflows
This project is licensed under the MIT License - see the LICENSE file for details.