Enterprise Auth & Team Management (Multi-Tenant RBAC) — Implementation Guide

Author: Principal Lead Engineer
Scope: Align FastAPI backend + Next.js frontend with multi-tenant RBAC, team management, and secure auth flows. This plan is constrained to officially documented patterns only.

Authoritative References (latest stable)

  • FastAPI security (JWT/OAuth2): https://fastapi.tiangolo.com/tutorial/security/oauth2-jwt/
  • FastAPI dependencies/middleware: https://fastapi.tiangolo.com/tutorial/dependencies/
  • StreamingResponse (for SSE alignment): https://fastapi.tiangolo.com/advanced/custom-response/#streamingresponse
  • Pydantic v2 models/settings: https://docs.pydantic.dev/latest/
  • SQLAlchemy 2.x asyncio ORM: https://docs.sqlalchemy.org/en/20/orm/extensions/asyncio.html
  • bcrypt hashing (official Python package): https://pypi.org/project/bcrypt/
  • python-jose JWT: https://python-jose.readthedocs.io/en/latest/
  • Redis (rate limiting / blacklist) client docs: https://redis-py.readthedocs.io/en/stable/
  • pyotp (TOTP RFC 6238): https://pyauth.github.io/pyotp/
  • Cryptography Fernet (secret encryption/rotation): https://cryptography.io/en/latest/fernet/
  • RabbitMQ monitoring & DLQ guidance: https://www.rabbitmq.com/docs/monitoring
  • Next.js 16 App Router & Server Actions: https://nextjs.org/docs
  • EventSource (WHATWG SSE browser API, consumed from React 19 components): https://html.spec.whatwg.org/multipage/server-sent-events.html#the-eventsource-interface

Version Alignment (match compatibility matrix)

  • Python 3.11, FastAPI 0.124.2, Uvicorn 0.38.0, Pydantic v2, SQLAlchemy 2.x (async), bcrypt, aio-pika 9.4.x (for SSE), Node 22, Next.js 16.0.8, React 19.2.1, TypeScript 5.9.3.

Architecture Overview

  • Tenancy model: User (global identity) ↔ Team (owns resources) via TeamMember with Role.
  • RBAC: Permission (static, seeded) ↔ Role (per-team, editable except Owner) ↔ TeamMember.
  • Auth tokens: JWT (access short-lived, refresh longer), issued via FastAPI; stored as HttpOnly cookies.
  • Enforcement: FastAPI dependencies + middleware to resolve user_id, team_id, role, and check permissions before handlers.
  • SSE: Reuse existing events.topic exchange; auth will later scope SSE subscriptions by team/job once auth context is available.
  • Team creation rules: Only the account that upgrades from mightybox_trial to mightybox_billing may create a team; that user becomes Owner. Ownership transfer must be supported (policy-guarded) to move Owner to another account.
  • Virtuozzo session constraint: Only the Owner may retrieve/refresh a Virtuozzo session key for a team.

Data Model (to implement/migrate)

  • users: id, email (unique), hashed_password (bcrypt), is_active, is_verified, is_superuser, timestamps.
  • teams: id, owner_user_id, team_name, slug, status (TRIALING|ACTIVE|EXPIRED|ARCHIVED), timestamps.
  • team_members: id, user_id, team_id, role_id, timestamps.
  • permissions: slug (pk), description (seeded).
  • roles: id, team_id, name, is_editable, description.
  • role_permissions: role_id ↔ permission_slug (m2m).
  • invite_tokens: token, team_id, role_id, created_by, expires_at, used_at, used_by.
  • trusted_devices: per-user/device_fingerprint trust records with geo/IP/user_agent metadata and trusted/revoked timestamps.
  • active_sessions: per-user refresh session state (refresh_token_jti, device_fingerprint, ip, geo, last_seen_at, expires_at, revoked_reason).
  • login_activity: immutable audit log of login/refresh events with ip/ua/device_fingerprint/geo.
  • email_verification_tokens, password_reset_tokens (per security flows).

Core Flows (backend)

1) Registration
  • Door A (owner): create user + team + default roles; user becomes Owner only if upgraded to mightybox_billing; mark inactive until email verify; encrypt external creds if any.
  • Door B (invite): validate invite token, create user, attach to team with invited role, mark active/verified.
2) Login (Step 1)
  • Verify credentials (bcrypt hashing); return pre-auth token (scope team_select) + teams list.
3) Session Exchange (Step 2)
  • Input: pre-auth token + team_id; verify membership via indexed team_members (unique user_id+team_id) and ensure team status is allowed (TRIALING|ACTIVE).
  • Load team-scoped VZ session key by team_id: Redis cache (short TTL) → Postgres. If expired/near-expiry, take per-team lock and refresh using Owner’s encrypted VZ creds; others wait/reuse cached key. Enforce Owner-only refresh.
  • Issue access/refresh JWT (claims: sub, team_id, role) as HttpOnly cookies after successful membership/status validation.
4) Refresh / Logout
  • Refresh: validate refresh token, rotate, blacklist old (Redis); reissue cookies.
  • Logout: blacklist access token, clear cookies.
5) Authorization
  • Middleware: decode JWT, set request.state.user_id, team_id, role.
  • Dependency require_permission("perm"): join team_members → roles → role_permissions; raise 403 if missing.
6) Invitations
  • Create invite (team.invite permission), single-use token with expiry; Door B consumes and invalidates.
7) Password & Email
  • Email verification tokens (24h), password reset tokens (1h), both single-use.
8) Ownership & Virtuozzo
  • Team creation allowed only for upgraded (mightybox_billing) accounts; they become Owner.
  • Ownership transfer flow (policy-guarded) to reassign Owner.
  • Only Owner can request/refresh Virtuozzo session keys for the team.
  • VZ email creation (modernized): on email verification or billing upgrade, generate a dedicated VZ email for the Owner (deterministic, collision-resistant, e.g., localpart_userid_timestamp@mightybox.app). Call the env-driven VZ registration endpoint to create the account; store VZ email + VZ UID + encrypted VZ password + the initial team-scoped session key together on teams (not on users). Do not recreate on each login; allow Owner-initiated repair if missing/invalid.
9) Device approval & suspicious login
  • Device approval: for new device fingerprints or untrusted combinations of device/IP/UA, require email-based OTP before issuing long-lived refresh tokens; persist trusted device records and allow later revoke.
  • Suspicious login: use recent login activity (IP/geo/device) plus a geo-IP lookup provider to flag high-risk logins, send email alerts with approve/deny actions, and gate refresh/session usage until the user makes a decision.
10) SSE auth & events
  • SSE /api/v1/events is gated by team membership and an events:read permission; cross-team subscriptions are rejected and auth events are published to a RabbitMQ topic exchange for streaming to clients.
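
The authorization step (flow 5) can be sketched without a database to show the shape of the check. The dictionaries below are hypothetical in-memory stand-ins for the team_members and role_permissions tables (the roles table is skipped because role_id keys directly into the permission sets), and PermissionDenied stands in for the HTTP 403 that the FastAPI dependency would raise.

```python
class PermissionDenied(Exception):
    """Stand-in for an HTTP 403 response."""


# Hypothetical rows: (user_id, team_id) -> role_id, as in team_members.
TEAM_MEMBERS = {
    ("u1", "t1"): "role-owner",
    ("u2", "t1"): "role-dev",
}

# Hypothetical rows: role_id -> permission slugs, as in role_permissions.
ROLE_PERMISSIONS = {
    "role-owner": {"team.invite", "events:read"},
    "role-dev": {"events:read"},
}


def has_permission(user_id: str, team_id: str, perm: str) -> bool:
    # Membership is always looked up by (user_id, team_id), never user_id alone.
    role_id = TEAM_MEMBERS.get((user_id, team_id))
    if role_id is None:
        return False
    return perm in ROLE_PERMISSIONS.get(role_id, set())


def require_permission(perm: str):
    """Build a checker for one permission slug, mirroring a dependency factory."""
    def checker(user_id: str, team_id: str) -> None:
        if not has_permission(user_id, team_id, perm):
            raise PermissionDenied(perm)
    return checker


can_invite = require_permission("team.invite")
can_invite("u1", "t1")  # passes silently for the Owner role
```

In the backend, the checker would read user_id and team_id from request.state and be attached to routes through Depends.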

Client Integration (Next.js 16 + React 19)

  • Use Server Actions / Route Handlers for auth POSTs; avoid pages/api.
  • Store auth cookies HttpOnly; client reads auth state via server-side checks.
  • SSE consumption: new EventSource("/api/v1/events?team_id=..."); when auth is wired, pass team filter only if user is member.

Security Controls

  • Hashing: bcrypt (official doc above); no plaintext storage.
  • JWT: HS256 via python-jose; short access (~5–15m), refresh (~7d); HttpOnly + Secure + SameSite.
  • Rate limiting: Redis counters for login/register, enforced in FastAPI middleware; respond with 429 once a limit is exceeded (401 remains the response for bad credentials).
  • CSRF: double-submit token for state-changing requests if cookies used cross-site.
  • Audit logging: log login, logout, invite, role edits with user_id, team_id, ip, ua.
  • Secrets & networking: all creds from env; service DNS (postgres, redis, rabbitmq) not localhost in containers.
  • External auth URL (Virtuozzo sign-in legacy): EXTERNAL_AUTH_URL=https://app.mymightybox.io/1.0/users/authentication/rest/signin should be injected via env and consumed only by the Virtuozzo client. This repo also supports VZ_VIRTUOZZO_SIGNIN_URL as the preferred name; EXTERNAL_AUTH_URL is treated as an alias for compatibility.
  • Session inactivity auto-expiry: enforce idle timeout (30–60 minutes, configurable) by requiring every refresh call to update active_sessions.last_seen_at; deny refresh if now - last_seen_at exceeds the idle threshold, return a session_inactive error payload, blacklist the stale tokens, surface the event via SSE, and emit an audit log entry.
  • Device approval OTP policy: email OTPs carry a 5‑minute TTL, max 5 verification attempts, and are stored hashed (HMAC-SHA256 using ENCRYPTION_KEY) inside Redis; per-user and per-IP rate limits (Redis counters) throttle issuance and verification; each approval/denial is captured in login_activity with device/IP metadata.
  • Geolocation/IP/device logging: capture IP, UA, and geo (via IP lookup) on login/refresh; store in audit log; surface recent-login metadata to users for security review (similar to Google’s device activity).
  • New device authorization: detect new device/fingerprint (IP + UA + optional device hash). If new, require the OTP policy above before issuing refresh/access tokens, persist the trusted device row (fingerprint, ip, ua, geo, approved_by), and expose revoke/deny operations.
  • SSE authentication & CSRF: /api/v1/events validates the same HttpOnly access token used by other endpoints via a dependency that runs on every connection (StreamingResponse per https://fastapi.tiangolo.com/advanced/custom-response/#streamingresponse). Connections require SameSite cookies; on 401/expiry the client replays the session-exchange flow before reconnecting.
  • Single-session enforcement / double-login block: maintain an active-session record per user (device fingerprint, IP, geo, last_seen). On new login:
      • If an active session exists, either block issuance of tokens or invalidate the previous session (blacklist prior access/refresh tokens) and notify both sessions.
      • Notify current active session via frontend channel (e.g., SSE/WS) that another login was attempted with device/IP/geo details; prompt to approve/deny.
      • If denied, reject new login and keep the existing session; if approved, revoke old session and allow new one.
      • All session revocations must blacklist existing tokens and clear server-side session state.
  • Secrets encryption & rotation: all Virtuozzo creds, TOTP secrets (once enabled), refresh session metadata, and backup codes are encrypted using Fernet (ENCRYPTION_KEY, 32-byte urlsafe). Rotate the key at least quarterly: mint new key, re-encrypt stored secrets in a single transaction per team/user, update Redis caches, and alert if any row cannot be re-encrypted (per https://cryptography.io/en/latest/fernet/). Never log plaintext secrets or keys.
  • MFA/TOTP scope: MFA is deferred until Phase D completes; when enabled, use pyotp (https://pyauth.github.io/pyotp/) with otpauth URI/QR generation, encrypted secret storage, single-use backup codes, and mandatory step-up during new-device approval.
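
The device-approval OTP policy above (5-minute TTL, at most 5 attempts, HMAC-SHA256 digest at rest) could look roughly like the following. This is a sketch using only the standard library: the record dictionary is a hypothetical stand-in for the Redis entry, the 6-digit code length is an assumption, and key is expected to be derived from ENCRYPTION_KEY.

```python
import hashlib
import hmac
import secrets

OTP_TTL_SECONDS = 5 * 60
MAX_ATTEMPTS = 5


def generate_otp(digits: int = 6) -> str:
    """Return a zero-padded numeric code from a CSPRNG."""
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


def hash_otp(otp: str, key: bytes) -> str:
    """HMAC-SHA256 digest; only this value is stored, never the OTP itself."""
    return hmac.new(key, otp.encode(), hashlib.sha256).hexdigest()


def verify_otp(record: dict, candidate: str, key: bytes, now: float) -> bool:
    """record holds 'digest', 'expires_at' (epoch seconds), and 'attempts'."""
    if now >= record["expires_at"] or record["attempts"] >= MAX_ATTEMPTS:
        return False
    record["attempts"] += 1  # count every try, successful or not
    # Constant-time comparison avoids leaking digest prefixes via timing.
    return hmac.compare_digest(record["digest"], hash_otp(candidate, key))
```

A caller would create the record with expires_at set to the issue time plus OTP_TTL_SECONDS and delete it after the first successful verification so the code stays single-use.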

Operational Safeguards (Redis / RabbitMQ)

  • Redis availability: the /health/redis endpoint issues a PING and attempts a short-lived lock acquisition. Alert if two consecutive checks fail. When Redis is unavailable, reject rate-limited actions with 503 redis_unavailable instead of dropping locks silently.
  • Redis retries/backoff: lock acquisition and counter increments use capped exponential backoff (max 5 tries, 200 ms cap) to avoid stampedes; failures emit structured logs so SRE can trace contention.
  • RabbitMQ publishing: use publisher confirms with a 5s timeout (per https://www.rabbitmq.com/docs/monitoring#publishers) and retry at most 3 times with jittered delays to avoid thundering herds. On repeated failure, persist the event to Postgres for later replay.
  • RabbitMQ DLQs: each topic exchange used for SSE fan-out includes a dead-letter queue; consumers monitor DLQ depth via Prometheus and raise alerts if >100 messages or if redelivered flag spikes, forcing investigation before replay.
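
The capped exponential backoff described for Redis operations (max 5 tries, 200 ms cap) might be wrapped in a small helper like this. It is a sketch, not the project's implementation: the 25 ms base delay, the full-jitter strategy, and the set of retried exception types are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    *,
    max_tries: int = 5,
    base_delay: float = 0.025,  # assumption: 25 ms starting delay
    cap: float = 0.2,           # 200 ms cap from the guide
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    if max_tries < 1:
        raise ValueError("max_tries must be >= 1")
    for attempt in range(max_tries):
        try:
            return fn()
        except retry_on:
            if attempt == max_tries - 1:
                raise  # out of tries: let the caller log and surface the failure
            # Full jitter: random delay up to the capped exponential bound.
            sleep(random.uniform(0, min(cap, base_delay * 2 ** attempt)))
    raise AssertionError("unreachable")
```

Injecting sleep keeps the helper testable without real delays; the final re-raise is where the structured log entry mentioned above would be emitted.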

Sequenced Work Plan (phases)

  • Phase A (Foundation): async DB session, settings for JWT/bcrypt/Redis; models + Alembic for users/teams/roles/permissions/invites.
  • Phase B (Auth): password hashing, JWT utils, login + session exchange + refresh + logout endpoints; middleware + require_permission.
  • Phase C (Team/RBAC): seed permissions; seed Owner/Manager/Developer roles on team create; invite create/consume; role/permission edit endpoints.
  • Phase D (User lifecycle + MFA prep): email verification, password reset, and gating work needed to launch TOTP (schema placeholders, secret storage, backup code UX) while keeping MFA disabled until security sign-off.
  • Phase E (SSE alignment): apply auth context to /api/v1/events (filter by team_id, reject unauthorized), keep DB as source of truth.
  • Phase F (Tests/CI): unit (security/jwt), integration (auth flows), E2E (invite/registration), load targets; enforce the constraints-file install in CI.

Env & Config (examples)

  • JWT_SECRET_KEY, ACCESS_TOKEN_EXPIRE_MINUTES, REFRESH_TOKEN_EXPIRE_DAYS, JWT_ALGORITHM=HS256
  • BCRYPT_ROUNDS (bcrypt cost factor)
  • REDIS_URL=redis://redis:6379/0 (session cache, locks, rate limit)
  • ENCRYPTION_KEY (Fernet, base64 urlsafe 32-byte, for VZ secrets/TOTP)
  • DATABASE_URL=postgresql+asyncpg://user:pass@postgres:5432/db
  • EMAIL_SENDER_CONFIG (SMTP) per provider docs; keep out of code.
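
As an illustration of how these variables could be read and validated at startup, here is a standard-library sketch. The project itself targets Pydantic v2 settings, so treat this as a stand-in; the AuthSettings name and the defaults for ACCESS_TOKEN_EXPIRE_MINUTES, REFRESH_TOKEN_EXPIRE_DAYS, and BCRYPT_ROUNDS are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AuthSettings:
    jwt_secret_key: str
    encryption_key: str
    database_url: str
    redis_url: str
    jwt_algorithm: str
    access_token_expire_minutes: int
    refresh_token_expire_days: int
    bcrypt_rounds: int


def load_settings(env: Mapping[str, str] = os.environ) -> AuthSettings:
    def required(name: str) -> str:
        value = env.get(name)
        if not value:
            raise RuntimeError(f"missing required environment variable: {name}")
        return value

    return AuthSettings(
        jwt_secret_key=required("JWT_SECRET_KEY"),
        encryption_key=required("ENCRYPTION_KEY"),
        database_url=required("DATABASE_URL"),
        redis_url=env.get("REDIS_URL", "redis://redis:6379/0"),
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),
        # Assumed defaults, chosen within the ranges given under Security Controls.
        access_token_expire_minutes=int(env.get("ACCESS_TOKEN_EXPIRE_MINUTES", "15")),
        refresh_token_expire_days=int(env.get("REFRESH_TOKEN_EXPIRE_DAYS", "7")),
        bcrypt_rounds=int(env.get("BCRYPT_ROUNDS", "12")),
    )
```

Failing fast on missing secrets keeps a misconfigured container from starting with an empty signing key.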

Deliverables (documentation + code alignment)

  • Models & migrations for auth/RBAC/teams/invites.
  • Auth router + service + security utils; middleware + permission dependency.
  • Role/permission seed + team default roles on create.
  • Invite endpoints; email verification + password reset endpoints.
  • Tests: unit (hash/JWT), integration (login/session/refresh), E2E (Door A/B).
  • Ops: rate limit + blacklist (Redis), audit log events, env docs with official links (above).
  • Ownership transfer flow + policy checks; Owner-only Virtuozzo session retrieval enforcement.
  • Session inactivity enforcement and recent-login geo/IP/device logging surfaced to users/admins.
  • New-device authorization flow with trust/revoke tracking.
  • Single-session enforcement with notify/approve/deny flow for concurrent login attempts.

Non-Goals (defer)

  • MFA/OAuth providers remain disabled in production until Phase D completes and security signs off on TOTP rollout.
  • Multi-region HA specifics and SOC2 audit runbooks (handled separately).

Virtuozzo API placement

  • Create a dedicated client module under backend/app/infrastructure/external/virtuozzo/ (e.g., client.py, auth.py, accounts.py) to organize multiple endpoints. Keep backend/app/infrastructure/external/virtuozzo.py as a thin facade if desired.
  • All Virtuozzo endpoints (including https://app.mymightybox.io/1.0/users/authentication/rest/signin) must be env-driven; never hardcode.
  • Enforce Owner-only access when requesting/refreshing Virtuozzo session keys.
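
A small resolver illustrates the env-driven rule together with the alias behaviour described under Security Controls, where VZ_VIRTUOZZO_SIGNIN_URL is preferred and EXTERNAL_AUTH_URL is accepted for compatibility. The function name and error message are assumptions.

```python
import os
from typing import Mapping


def resolve_signin_url(env: Mapping[str, str] = os.environ) -> str:
    """Return the Virtuozzo sign-in URL from the environment, never from code."""
    url = env.get("VZ_VIRTUOZZO_SIGNIN_URL") or env.get("EXTERNAL_AUTH_URL")
    if not url:
        raise RuntimeError(
            "set VZ_VIRTUOZZO_SIGNIN_URL (or the legacy EXTERNAL_AUTH_URL)"
        )
    return url
```

Only the Virtuozzo client module would call this, so the rest of the backend never sees the endpoint.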

Module placement (per 001-hybrid-modular-ddd)

  • Interface/application for auth lives under backend/app/modules/auth/ (router, schemas, service, dependencies).
  • Persistence models remain under backend/app/infrastructure/database/models/.
  • API wiring uses api/v1 to include modules/auth/router.

Virtuozzo session key management (team-scoped, stable)

  • Store one encrypted team-scoped Virtuozzo session key with metadata (expires_at/last_refreshed) in Postgres; optionally cache in Redis with shorter TTL.
  • Refresh lazily only when the key is expired or near-expiry (e.g., within 5–10 minutes), never on every login. Use the Owner’s encrypted VZ creds to refresh; enforce per-team lock to avoid stampede.
  • Serve all team members with the same valid session key server-side; never expose the key to clients. Members’ requests use the stored key; only Owner can trigger refresh if needed.
  • Failure handling: on VZ 401/timeout, try one refresh; if still failing, surface a controlled error and prompt Owner to reauthenticate/update VZ creds.
  • Bug/edge mitigations:
      • Over-rotation: do not refresh per login; only on expiry/near-expiry with a single team-scoped key.
      • Stampede: per-team mutex/lock around refresh; others reuse cached key.
      • Stale key after failure: if refresh fails, invalidate cached key and require Owner reauth/update VZ creds; do not proceed with stale key.
      • Ownership check: guard refresh path with Owner-only policy; members cannot trigger or view keys.
      • Credentials drift: if stored encrypted VZ creds are rejected, bubble a clear error and force Owner reauth to replace creds.
      • TTL mismatch: use configurable TTL/near-expiry window; refresh if expires_at < now + safety_window.
      • Ownership transfer race: re-validate current Owner before refresh; lock per team to avoid old Owner overwriting after transfer.
      • Partial write: treat refresh as an atomic update (write new key + metadata together); on failure, keep last-known-good key until replacement confirmed.
      • Logging leakage: never log keys or VZ creds; redact sensitive fields in error logs.
      • Cache/DB mismatch: Redis TTL shorter than VZ TTL; include last_refreshed/version to detect stale cache vs DB.
      • Repeated 401s: after one failed refresh, drop the key and require Owner action instead of looping.
      • Multi-team users: always scope lookup/lock by team_id to prevent cross-team key bleed.
      • Time skew: include safety margin (e.g., 5–10 minutes) when checking expiry to avoid clock drift issues.
  • VZ email lifecycle: Owner-only (re)creation; write VZ email + VZ UID + encrypted VZ password atomically with initial session key. Never log sensitive values; redact where necessary for support.
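
The lazy-refresh and stampede rules above can be sketched with an in-process asyncio lock per team. This is a simplification under stated assumptions: a multi-worker deployment would need a distributed lock (for example in Redis), the store dictionary stands in for the Redis/Postgres lookup, refresh is a hypothetical coroutine that calls Virtuozzo with the Owner's credentials, and the Owner-only policy check is omitted for brevity.

```python
import asyncio
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Awaitable, Callable

SAFETY_WINDOW = timedelta(minutes=5)  # assumption: low end of the 5-10 minute range

# One lock per team_id so refreshes for different teams never block each other.
_team_locks: defaultdict[str, asyncio.Lock] = defaultdict(asyncio.Lock)


def needs_refresh(
    expires_at: datetime, now: datetime, safety_window: timedelta = SAFETY_WINDOW
) -> bool:
    return expires_at < now + safety_window


async def get_session_key(
    team_id: str,
    store: dict,
    refresh: Callable[[str], Awaitable[dict]],
    now_fn: Callable[[], datetime],
) -> str:
    record = store.get(team_id)
    if record and not needs_refresh(record["expires_at"], now_fn()):
        return record["key"]
    async with _team_locks[team_id]:
        # Re-check under the lock: another waiter may already have refreshed.
        record = store.get(team_id)
        if record and not needs_refresh(record["expires_at"], now_fn()):
            return record["key"]
        new_record = await refresh(team_id)
        store[team_id] = new_record  # key and metadata replaced together
        return new_record["key"]
```

The double check around the lock is what lets concurrent callers reuse the freshly stored key instead of each triggering a Virtuozzo sign-in.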

Keep all implementation choices traceable to the official references listed above; do not deviate from documented behaviors or rely on unstated versions.

Alignment matrix vs team_management.md (special rules)

  • Team creation: Only upgraded (mightybox_billing) account can create a team → becomes Owner; enforced at team creation API and Door A registration.
  • Ownership transfer: Must exist and be policy-guarded; re-validate Owner before VZ refresh; block old Owner after transfer.
  • Team status gating: Allow session exchange/refresh only if team status in {TRIALING, ACTIVE}; deny otherwise.
  • Forked registration: Door A (owner) creates team; Door B (invite) joins existing team with invited role.
  • Virtuozzo access: Only Owner can retrieve/refresh session keys; members use stored team key server-side.
  • Billing dependency: Team creation and active usage require billing upgrade; maintain status and enforce on auth/refresh/VZ calls.
  • SSE gating: After auth integration, /events must filter by membership/team_id; deny expired teams.
  • Invites: Single-use, expiring tokens; audit create/accept; respect role assignment on accept.
  • New device/double-login: Require step-up for new device; single-session enforcement with approve/deny and token revocation.
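
Two of the rules in this matrix, team status gating and billing-gated team creation, are simple enough to express as pure predicates. The sketch below uses the status values from the data model; the function names and the plan string comparison are assumptions about how the policy might be coded.

```python
from enum import Enum


class TeamStatus(str, Enum):
    TRIALING = "TRIALING"
    ACTIVE = "ACTIVE"
    EXPIRED = "EXPIRED"
    ARCHIVED = "ARCHIVED"


ALLOWED_STATUSES = frozenset({TeamStatus.TRIALING, TeamStatus.ACTIVE})


def can_exchange_session(is_member: bool, status: TeamStatus) -> bool:
    """Session exchange/refresh needs membership and an allowed team status."""
    return is_member and status in ALLOWED_STATUSES


def can_create_team(plan: str) -> bool:
    """Only accounts upgraded to mightybox_billing may create a team."""
    return plan == "mightybox_billing"
```

Keeping these as pure functions lets the same checks back the auth, refresh, and Virtuozzo call paths.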

Coverage / CI gates (minimum)

  • Unit tests: security/JWT/crypto/helpers ≥95% coverage for core; fail CI under threshold.
  • Integration tests: login → team select → refresh → logout; Door A/B flows; invite acceptance; SSE auth gate smoke; target ≥85% api/module coverage.
  • E2E: Door A (with VZ session refresh), Door B invite accept, ownership transfer, new-device approval, single-session enforcement.
  • Performance smoke: VZ refresh under contention (lock correctness) and auth rate-limit behavior.
  • CI install: backend pip install --no-deps -c backend/constraints.txt -e .; frontend npm ci; run lint/type/tests; enforce coverage threshold.