9 changes: 6 additions & 3 deletions .github/copilot-instructions.md
@@ -17,7 +17,7 @@ Semaphore UI is a modern web interface for managing popular DevOps tools like An
### Run the application:
- ALWAYS run the bootstrapping steps first
- Setup database and admin user: `./bin/semaphore setup` (interactive, use BoltDB option 2 for development)
- Start server: `./bin/semaphore server --config ./config.json`
- Start server: `./bin/semaphore server --config ./config.json` (or `config.yaml`; see `docs/configuration.md`)
- Web UI: http://localhost:3000 (login: admin / changeme)
- API: http://localhost:3000/api/ (test with: `curl http://localhost:3000/api/ping`)

@@ -78,8 +78,10 @@ curl -I http://localhost:3000/ # Should return HTTP 200
├── db/ - Database models and interfaces
├── services/ - Business logic services
├── util/ - Utility functions and configuration
├── docs/ - Developer guides (configuration, runners, HA)
├── config.schema.yaml - JSON Schema for config.json / config.yaml
├── bin/ - Built binaries (after build)
└── config.json - Runtime configuration (after setup)
└── config.json - Runtime configuration (after setup; YAML also supported)
```

### Key Commands Reference
@@ -150,7 +152,8 @@ During setup, choose option 2 (BoltDB) for simplest development setup:
- **NEVER CANCEL** long-running builds or dependency installations
- Set appropriate timeouts: deps (5+ min), build (3+ min), tests (2+ min)
- The application serves the frontend from the Go backend - no separate frontend server needed
- Configuration is stored in `config.json` after running setup
- Configuration is stored in `config.json` after running setup (or use `config.yaml`; validate with `config.schema.yaml`)
- Developer docs: `docs/README.md`
- Default admin credentials after setup: admin / changeme
- Linting has known issues - focus on not introducing new ones
- Always test changes by running the full application, not just unit tests
11 changes: 11 additions & 0 deletions CONTRIBUTING.md
@@ -55,11 +55,22 @@ When creating a pull-request you should:
go run cli/main.go service --config ./config.json
```

Setup writes `config.json` by default. You can also use `config.yaml`; see [docs/configuration.md](docs/configuration.md) for discovery paths, environment overrides, and [`config.schema.yaml`](config.schema.yaml).

Open [localhost:3000](http://localhost:3000)

Note: on Windows, you may need [Cygwin](https://www.cygwin.com/) to run certain commands because the [reflex](https://github.com/cespare/reflex) package probably doesn't work on Windows.
You may encounter issues when running `task watch`, but `task build` and similar commands will still work.

## Developer documentation

Repository guides for contributors and operators:

- [docs/README.md](docs/README.md) — index
- [docs/configuration.md](docs/configuration.md) — config file, schema, env vars
- [docs/runners-and-tags.md](docs/runners-and-tags.md) — remote runners and tag routing
- [docs/cluster-dashboard.md](docs/cluster-dashboard.md) — HA cluster admin API

## Integration tests

Dredd is used for API integration tests. If you alter the API in any way, you must make sure that the information in the api docs
1 change: 1 addition & 0 deletions README.md
@@ -83,6 +83,7 @@ For more installation options, visit our [Installation page](https://semaphoreui
* [User Guide](https://docs.semaphoreui.com)
* [API Reference](https://semaphoreui.com/api-docs)
* [Postman Collection](https://www.postman.com/semaphoreui)
* [Developer docs](docs/README.md) — configuration, runners, HA cluster dashboard (in-repo)

## Awesome Semaphore

11 changes: 11 additions & 0 deletions docs/README.md
@@ -0,0 +1,11 @@
# Developer documentation

Internal guides for contributors and operators. User-facing product docs live at [docs.semaphoreui.com](https://docs.semaphoreui.com).

| Guide | Audience | Covers |
|-------|----------|--------|
| [Configuration](configuration.md) | Developers, operators | `config.json` / `config.yaml`, env vars, JSON Schema |
| [Runners and tags](runners-and-tags.md) | Developers, operators | Remote runners, tag routing, webhooks |
| [Cluster dashboard](cluster-dashboard.md) | Operators (HA) | Admin cluster API, task state inspection, recovery |

Implementation plans for upcoming work are under `docs/plans/`.
109 changes: 109 additions & 0 deletions docs/cluster-dashboard.md
@@ -0,0 +1,109 @@
# Cluster dashboard (HA)

The cluster dashboard is an **admin-only** UI and API for inspecting high-availability (HA) deployments and the shared task state backend. It requires the enterprise HA feature (`features.high_availability`).

## When it applies

| `ha.enabled` | Dashboard |
|--------------|-----------|
| `false` | UI shows HA disabled; `GET /api/cluster` returns `{"ha_enabled": false}` only |
| `true` | Full node list, Redis stats, task snapshot, maintenance clear |

Configure HA in the server config:

```yaml
ha:
  enabled: true
  node_id: semaphore-1 # optional; auto-generated if empty
  redis:
    addr: redis.example.com:6379
    pass: "<secret>"
```

`util.HAEnabled()` returns true when the `ha` block is set and `ha.enabled` is true.

## Admin API

All routes require an authenticated **admin** session (same as other `/api/...` admin routes).

### `GET /api/cluster`

Returns cluster status:

- `ha_enabled` (boolean) — always present
- `node_id` (string) — this instance, when HA config exists
- `nodes` (array) — peer nodes, heartbeats, versions (when HA overlay is active)
- `redis` (object) — connection, memory, key groups (when inspector available)

When HA is enabled but the cluster inspector is unavailable, the handler responds with **503** and a short error message. When HA is disabled, the response is **200** with only `ha_enabled: false` (no error).
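A client can separate the three documented outcomes (HA disabled, HA active, inspector unavailable) by looking at the status code before decoding. The following is a minimal sketch, not the server's own types: `ClusterStatus`, `parseClusterResponse`, and `errInspectorUnavailable` are hypothetical names, and the nested `nodes` and `redis` shapes are kept as raw JSON because this guide does not specify them.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// ClusterStatus mirrors the documented top-level fields of GET /api/cluster.
// Nested shapes are not specified in this guide, so they stay raw JSON.
type ClusterStatus struct {
	HAEnabled bool              `json:"ha_enabled"`
	NodeID    string            `json:"node_id,omitempty"`
	Nodes     []json.RawMessage `json:"nodes,omitempty"`
	Redis     json.RawMessage   `json:"redis,omitempty"`
}

// errInspectorUnavailable is a hypothetical sentinel for the 503 case.
var errInspectorUnavailable = errors.New("cluster inspector unavailable")

// parseClusterResponse interprets a status code and body: 200 is decoded,
// 503 maps to the sentinel, anything else is reported as unexpected.
func parseClusterResponse(code int, body []byte) (ClusterStatus, error) {
	var st ClusterStatus
	switch code {
	case http.StatusOK:
		err := json.Unmarshal(body, &st)
		return st, err
	case http.StatusServiceUnavailable:
		return st, errInspectorUnavailable
	default:
		return st, fmt.Errorf("unexpected status %d", code)
	}
}

func main() {
	st, err := parseClusterResponse(200, []byte(`{"ha_enabled": false}`))
	fmt.Println(st.HAEnabled, err)
}
```

Checking `HAEnabled` after a successful decode tells the caller whether the optional fields are worth reading at all.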

### `GET /api/cluster/tasks`

Returns a **task state snapshot** from the task pool store:

| Field | Meaning |
|-------|---------|
| `queue` | Tasks waiting to start |
| `running` | Tasks currently executing |
| `active_by_project` | Per-project active task records |
| `aliases` | Alias string → task ID |
| `claims` | Task IDs claimed for distributed coordination |

This endpoint also works in non-HA mode (in-memory store); fields may be empty arrays or objects if the store does not implement introspection.

### `DELETE /api/cluster/tasks`

Maintenance endpoint that clears selected record groups from the backend (Redis in HA mode). Request body:

```json
{
  "scope": {
    "queue": true,
    "running": false,
    "active": false,
    "aliases": false,
    "claims": false,
    "runtime_fields": false
  }
}
```

At least one scope flag must be `true`. Use only when recovering from a stuck cluster state (orphaned queue entries, stale claims). Clearing **running** or **active** while real tasks execute can cause inconsistent behavior.
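Because an all-false scope is rejected, a client may want to check the flags before sending the request. This sketch builds the request body and enforces the "at least one flag" rule locally; `ClearScope`, `ClearRequest`, and `buildClearBody` are hypothetical client-side names, not the types in `services/tasks/task_state_store.go`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ClearScope mirrors the documented scope flags of DELETE /api/cluster/tasks.
type ClearScope struct {
	Queue         bool `json:"queue"`
	Running       bool `json:"running"`
	Active        bool `json:"active"`
	Aliases       bool `json:"aliases"`
	Claims        bool `json:"claims"`
	RuntimeFields bool `json:"runtime_fields"`
}

// ClearRequest is the JSON envelope around the scope.
type ClearRequest struct {
	Scope ClearScope `json:"scope"`
}

// anySet reports whether at least one scope flag is true.
func (s ClearScope) anySet() bool {
	return s.Queue || s.Running || s.Active || s.Aliases || s.Claims || s.RuntimeFields
}

// buildClearBody refuses an empty scope and otherwise returns the JSON body.
func buildClearBody(s ClearScope) ([]byte, error) {
	if !s.anySet() {
		return nil, errors.New("at least one scope flag must be true")
	}
	return json.Marshal(ClearRequest{Scope: s})
}

func main() {
	body, err := buildClearBody(ClearScope{Queue: true})
	fmt.Println(string(body), err)
}
```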

The UI exposes the same scope checkboxes under **Clear tasks from Redis** (enabled only when `ha_enabled` is true).

## UI entry

**Admin → Cluster dashboard** (`web/src/views/Cluster.vue`):

- Node table and Redis memory chart when HA is active
- Live task tables from `/api/cluster/tasks`
- Upgrade prompt when `features.high_availability` is false

## Architecture sketch

```mermaid
flowchart LR
  subgraph nodes [Semaphore nodes]
    N1[Node A]
    N2[Node B]
  end
  Redis[(Redis task state)]
  N1 --> Redis
  N2 --> Redis
  Admin[Admin UI] --> API["/api/cluster*"]
  API --> N1
```

`TaskStateStore` implementations may expose `TaskStateInspector` for snapshots and `ClearTasks`. See `services/tasks/task_state_store.go`.

## OpenAPI

Cluster endpoints are documented in `api-docs.yml` under the `cluster` tag (they may stay commented out until Dredd hooks cover them). Regenerate the public Swagger bundle when enabling them in CI.

## Related code

- `api/cluster.go` — handlers
- `api/router.go` — route registration
- `pro_interfaces` — `ClusterInspector` for nodes/Redis
- `services/tasks/task_state_store.go` — snapshot and clear types
109 changes: 109 additions & 0 deletions docs/configuration.md
@@ -0,0 +1,109 @@
# Configuration

Semaphore reads settings from a config file, then applies environment-variable overrides and built-in defaults. The canonical field list is maintained in [`config.schema.yaml`](../config.schema.yaml) (JSON Schema draft 2020-12), generated from `util.ConfigType` in Go.

## File format and discovery

Supported formats: **JSON** (`.json`) and **YAML** (`.yaml`, `.yml`). Keys use `snake_case` and match the `json` struct tags in `util/config.go`.

### Search order

When `--config` is not passed and `SEMAPHORE_CONFIG_PATH` is unset, the server looks for the first existing file among:

1. `./config.json`, `./config.yaml`, `./config.yml` (current working directory)
2. `/usr/local/etc/semaphore/config.{json,yaml,yml}`
3. `/etc/semaphore/config.{json,yaml,yml}`
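The search can be pictured as a walk over an ordered candidate list that stops at the first file found. This is an illustrative sketch only: `candidatePaths` and `firstExisting` are hypothetical helpers, and the `json`, `yaml`, `yml` order within each directory is an assumption read off the list above, not confirmed from `util/config.go`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidatePaths lists the documented search locations in order.
// Assumption: within a directory, json is tried before yaml and yml.
func candidatePaths() []string {
	dirs := []string{".", "/usr/local/etc/semaphore", "/etc/semaphore"}
	exts := []string{"json", "yaml", "yml"}
	var paths []string
	for _, d := range dirs {
		for _, e := range exts {
			paths = append(paths, filepath.Join(d, "config."+e))
		}
	}
	return paths
}

// firstExisting returns the first path that exists and is not a directory.
func firstExisting(paths []string) (string, bool) {
	for _, p := range paths {
		if info, err := os.Stat(p); err == nil && !info.IsDir() {
			return p, true
		}
	}
	return "", false
}

func main() {
	if p, ok := firstExisting(candidatePaths()); ok {
		fmt.Println("would load", p)
	} else {
		fmt.Println("no config file found")
	}
}
```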

Explicit path:

```bash
./bin/semaphore server --config /etc/semaphore/config.yaml
# or
export SEMAPHORE_CONFIG_PATH=/etc/semaphore/config.yaml
```

Interactive setup (`semaphore setup`) still writes `config.json` by default; YAML is fully supported for hand-written or GitOps-managed installs.

### Load order

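`util.ConfigInit` applies settings in this order. Environment variables override values from the file; defaults only fill fields that are still unset:

1. Config file (if present and not disabled with `--no-config`)
2. Environment variables (`SEMAPHORE_*`, see `env:` tags on struct fields)
3. Defaults from struct `default:` tags, for fields left unset by steps 1 and 2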

Sensitive values can be loaded from companion files (for example `runner.token_file`, `subscription.key_file`) after the main file is parsed.
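The precedence can be illustrated with plain maps. This sketch assumes that environment values override file values and that defaults only fill keys neither source set; `resolve` is a hypothetical helper, while the real loader works on struct fields and tags.

```go
package main

import "fmt"

// resolve merges settings: file first, environment on top, and defaults
// only for keys that are still missing afterwards.
func resolve(file, env, defaults map[string]string) map[string]string {
	out := make(map[string]string)
	for k, v := range file {
		out[k] = v
	}
	for k, v := range env {
		out[k] = v // environment overrides the file
	}
	for k, v := range defaults {
		if _, set := out[k]; !set {
			out[k] = v // defaults never replace an explicit value
		}
	}
	return out
}

func main() {
	cfg := resolve(
		map[string]string{"port": ":3000", "dialect": "sqlite"},
		map[string]string{"port": ":8080"},
		map[string]string{"port": ":3000", "tmp_path": "/tmp/semaphore"},
	)
	fmt.Println(cfg["port"], cfg["dialect"], cfg["tmp_path"])
}
```

Here `port` ends up as `:8080` from the environment, `dialect` comes from the file, and `tmp_path` falls back to the default.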

## Schema validation

Use `config.schema.yaml` in your editor (YAML language server with JSON Schema) or in CI to validate configs before deploy. The schema `$id` is `https://semaphoreui.com/schemas/config.schema.json`.

To regenerate the schema after changing `util.ConfigType`, follow [`.claude/skills/semaphore-config-schema/SKILL.md`](../.claude/skills/semaphore-config-schema/SKILL.md).

## Common options (quick reference)

| Area | Keys | Notes |
|------|------|-------|
| Database | `dialect`, `mysql` / `postgres` / `sqlite` / `bolt` | `bolt` is deprecated; prefer `sqlite` for embedded DB |
| HTTP | `port`, `interface`, `web_host` | `web_host` is the public URL used in links and emails |
| TLS | `tls.enabled`, `tls.cert_file`, `tls.key_file` | Optional HTTP→HTTPS redirect via `tls.http_redirect_addr` **or** `tls.http_redirect_port` (mutually exclusive) |
| Auth | `mfa.totp`, `mfa.email` | Former top-level `auth` was renamed to `mfa` |
| Runners | `use_remote_runner`, `runner_registration_token`, `runner` | Per-runner CLI config block when running `semaphore runner` |
| HA | `ha.enabled`, `ha.node_id`, `ha.redis` | Requires enterprise overlay; see [Cluster dashboard](cluster-dashboard.md) |
| Concurrency | `max_parallel_tasks` | Server-wide cap; per-runner limit is `runner.max_parallel_tasks` |

Environment variable names mirror keys: `port` → `SEMAPHORE_PORT`, nested fields use underscores (`SEMAPHORE_TLS_ENABLED`, `SEMAPHORE_HA_REDIS_ADDR`). Fields tagged `sensitive` are cleared from the process environment after load so secrets do not leak to child processes.
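The naming pattern can be expressed as a small transformation. This is a sketch of the convention only: `envName` is a hypothetical helper, and the authoritative names are the `env:` tags on the struct fields, which may deviate from the pattern for individual keys.

```go
package main

import (
	"fmt"
	"strings"
)

// envName derives the conventional variable name from a dotted config key:
// dots become underscores, letters are upper-cased, SEMAPHORE_ is prefixed.
func envName(key string) string {
	return "SEMAPHORE_" + strings.ToUpper(strings.ReplaceAll(key, ".", "_"))
}

func main() {
	for _, k := range []string{"port", "tls.enabled", "ha.redis.addr"} {
		fmt.Printf("%s -> %s\n", k, envName(k))
	}
}
```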

## Examples

### Minimal development (SQLite)

```yaml
dialect: sqlite
sqlite:
  host: /tmp/semaphore.db
port: ":3000"
tmp_path: /tmp/semaphore
cookie_hash: <base64-32-bytes>
cookie_encryption: <base64-32-bytes>
access_key_encryption: <base64-32-bytes>
```

Generate secrets with `semaphore setup` or `openssl rand -base64 32`.
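For scripted installs, the same kind of value can be produced programmatically. The sketch below is the rough equivalent of `openssl rand -base64 32`; `newSecret` is a hypothetical helper and not part of the Semaphore CLI.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newSecret returns 32 cryptographically random bytes, base64-encoded.
func newSecret() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}

func main() {
	for _, name := range []string{"cookie_hash", "cookie_encryption", "access_key_encryption"} {
		s, err := newSecret()
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %s\n", name, s)
	}
}
```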

### TLS with HTTP redirect

```yaml
tls:
  enabled: true
  cert_file: /etc/semaphore/tls.crt
  key_file: /etc/semaphore/tls.key
  http_redirect_port: 8080
```

A second listener on port `8080` redirects clients to HTTPS. Use `http_redirect_addr` instead when you need to specify the full bind address (for example `:8080` or `127.0.0.1:8080`).
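Since `tls.http_redirect_addr` and `tls.http_redirect_port` are mutually exclusive, a pre-deploy check can catch configs that set both. This is a sketch under assumptions: `TLSConfig` and `checkRedirect` are hypothetical stand-ins, and the real struct in `util/config.go` may use different field names or types.

```go
package main

import (
	"errors"
	"fmt"
)

// TLSConfig is a hypothetical stand-in for the redirect-related part of
// the `tls` block.
type TLSConfig struct {
	Enabled          bool
	HTTPRedirectAddr string // tls.http_redirect_addr
	HTTPRedirectPort int    // tls.http_redirect_port
}

// checkRedirect rejects configs that set both redirect options at once.
func checkRedirect(c TLSConfig) error {
	if c.HTTPRedirectAddr != "" && c.HTTPRedirectPort != 0 {
		return errors.New("tls.http_redirect_addr and tls.http_redirect_port are mutually exclusive")
	}
	return nil
}

func main() {
	fmt.Println(checkRedirect(TLSConfig{Enabled: true, HTTPRedirectPort: 8080}))
	fmt.Println(checkRedirect(TLSConfig{Enabled: true, HTTPRedirectAddr: ":8080", HTTPRedirectPort: 8080}))
}
```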

### Remote runner (server side)

```yaml
use_remote_runner: true
runner_registration_token: "<admin-generated-token>"
```

Runners register with that token; task routing uses project/global runners and optional tags (see [Runners and tags](runners-and-tags.md)).

## Troubleshooting

| Symptom | Check |
|---------|--------|
| Server exits on start | Run with explicit `--config`; validate against `config.schema.yaml` |
| Wrong database | `dialect` and the matching `mysql`/`postgres`/`sqlite` block |
| Broken login cookies after config change | `cookie_hash` / `cookie_encryption` must stay stable or all sessions invalidate |
| Runner never picks up jobs | `use_remote_runner`, runner `active`, tag match on template/inventory |
| HA features missing in UI | `ha.enabled` and enterprise subscription; cluster API returns `ha_enabled: false` when disabled |

## Related code

- `util/config.go`, `util/config_auth.go` — struct definitions and loading
- `util/config_test.go` — YAML/JSON load tests
- `cli/cmd/root.go` — `--config`, `--no-config` flags
78 changes: 78 additions & 0 deletions docs/runners-and-tags.md
@@ -0,0 +1,78 @@
# Runners and tags

Semaphore can execute tasks on the server process or on **remote runners** (separate `semaphore runner` processes). Tags restrict which runner may execute a task.

## Modes

| Mode | Config | Behavior |
|------|--------|----------|
| Local | `use_remote_runner: false` (default) | Task pool runs jobs on the Semaphore server |
| Remote | `use_remote_runner: true` | Tasks are assigned to registered runners via `RemoteJob` |

Runners are **project-scoped** (bound to one project) or **global** (any project). Registration uses `runner_registration_token` on the server and `semaphore runner register` on the runner host.

## Tags

### Data model

- Each runner has zero or more string **tags** (`db.Runner.Tags`).
- Templates and inventories may set an optional `runner_tag`. When a task runs, the inventory's `runner_tag` overrides the template's if the inventory defines one.

### Routing rules

When `use_remote_runner` is true and a task needs a runner (`TaskPool` / `RemoteJob`):

1. If `runner_tag` is set → select **active** runners whose tags include that value (`RunnerFilterTagCompleteMatch`).
2. If `runner_tag` is empty → select runners marked **default** (`RunnerFilterIsDefault`).
3. Project runners are tried before global runners; order within each group is shuffled (`crypto/rand`) for load spreading.
4. A runner is preferred if it sent a heartbeat within **30 minutes** or has a **webhook** configured (webhook-only runners are treated as always reachable).
5. Among eligible runners, the first with `running_tasks < max_parallel_tasks` wins.

If no runner matches, the task stays in **waiting** state with error `no runners available`.
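The routing rules can be sketched as a filter followed by a first-fit pick. This is an illustration, not the code in `services/tasks/RemoteJob.go`: the `Runner` struct and all function names are hypothetical, the shuffle within each group is left out to keep the result deterministic, the heartbeat or webhook condition is treated as a hard requirement, and `Active` is assumed to apply to default runners as well.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Runner is a simplified, hypothetical view of a runner record.
type Runner struct {
	ID               int
	ProjectScoped    bool // false means global
	Active           bool
	Default          bool
	Tags             []string
	Webhook          string
	SinceHeartbeat   time.Duration // time elapsed since the last heartbeat
	RunningTasks     int
	MaxParallelTasks int
}

const heartbeatWindow = 30 * time.Minute

// effectiveTag lets the inventory tag override the template tag.
func effectiveTag(templateTag, inventoryTag string) string {
	if inventoryTag != "" {
		return inventoryTag
	}
	return templateTag
}

func hasTag(r Runner, tag string) bool {
	for _, t := range r.Tags {
		if t == tag {
			return true
		}
	}
	return false
}

// eligible applies the tag or default match, reachability, and capacity.
func eligible(r Runner, tag string) bool {
	if !r.Active {
		return false
	}
	if tag != "" && !hasTag(r, tag) {
		return false
	}
	if tag == "" && !r.Default {
		return false
	}
	reachable := r.Webhook != "" || r.SinceHeartbeat <= heartbeatWindow
	return reachable && r.RunningTasks < r.MaxParallelTasks
}

// selectRunner tries project-scoped runners first, then global ones.
func selectRunner(runners []Runner, tag string) (Runner, error) {
	for _, wantProject := range []bool{true, false} {
		for _, r := range runners {
			if r.ProjectScoped == wantProject && eligible(r, tag) {
				return r, nil
			}
		}
	}
	return Runner{}, errors.New("no runners available")
}

func main() {
	runners := []Runner{
		{ID: 1, Active: true, Default: true, MaxParallelTasks: 1},
		{ID: 2, ProjectScoped: true, Active: true, Tags: []string{"linux", "amd64"}, MaxParallelTasks: 2},
	}
	r, err := selectRunner(runners, effectiveTag("windows", "linux"))
	fmt.Println(r.ID, err)
}
```

In `main`, the inventory tag `linux` overrides the template tag `windows`, so the project-scoped runner with ID 2 is chosen.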

### UI and API

- **Admin → Runners**: edit tags on global runners.
- **Project → Runners**: project-scoped runners and tags (requires `project_runners` feature).
- Template form: **Runner tag** dropdown populated from `GET /api/project/{id}/runner_tags`.
- Inventory form: optional **Runner tag** (overrides template).
- Tag catalog: `GET /api/runner_tags` (global), `GET /api/project/{id}/runner_tags` (project).

CLI registration:

```bash
semaphore runner register --tags linux,amd64
```

## Webhooks

Runners may define a `webhook` URL. Semaphore POSTs JSON when a task is assigned:

```json
{
  "action": "start",
  "project_id": 1,
  "task_id": 42,
  "template_id": 3,
  "runner_id": 7
}
```

Use webhooks to spawn **one-off** runners (`runner.one_off` in config) in autoscaling environments.
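A webhook receiver needs little more than a struct matching the payload. The sketch below decodes the documented fields; `Assignment` and `parseAssignment` are hypothetical names, and rejecting every action other than `start` is a choice made for this example, since `start` is the only action shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Assignment mirrors the documented webhook payload.
type Assignment struct {
	Action     string `json:"action"`
	ProjectID  int    `json:"project_id"`
	TaskID     int    `json:"task_id"`
	TemplateID int    `json:"template_id"`
	RunnerID   int    `json:"runner_id"`
}

// parseAssignment decodes the body and accepts only the "start" action.
func parseAssignment(body []byte) (Assignment, error) {
	var a Assignment
	if err := json.Unmarshal(body, &a); err != nil {
		return a, err
	}
	if a.Action != "start" {
		return a, fmt.Errorf("unsupported action %q", a.Action)
	}
	return a, nil
}

func main() {
	body := []byte(`{"action":"start","project_id":1,"task_id":42,"template_id":3,"runner_id":7}`)
	a, err := parseAssignment(body)
	fmt.Println(a.TaskID, a.RunnerID, err)
}
```

An autoscaler would typically call a function like this from its HTTP handler and start a one-off runner for the returned `RunnerID`.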

## Operational checklist

1. Enable `use_remote_runner` and set `runner_registration_token`.
2. Register runners; confirm **Active** and recent **Last seen**.
3. Set template or inventory `runner_tag` when you need dedicated capacity.
4. Mark exactly one default runner per scope if you rely on untagged templates.
5. For stuck waiting tasks, verify tag spelling and that at least one active runner carries the tag.

Manual test case: [test/test-cases/TC-028-runner-tags.md](../test/test-cases/TC-028-runner-tags.md).

## Related code

- `services/tasks/RemoteJob.go` — runner selection
- `services/tasks/TaskPool.go` — when remote jobs are created
- `db/Runner.go` — tag filter modes
- `api/runners.go`, `pro/api/projects/runners.go` — HTTP handlers