
Organizations

Understand how organizations work as the top-level container for all platform resources.

An organization is the top-level entity in Pipelines. Everything you work with to test and evaluate your AI agents — projects, team members, models, tools, and evaluation criteria — belongs to an organization.

Creating an organization

Organizations are provisioned by the Pipelines team. Contact us to set up a new organization. Once created, the designated owner becomes the first Org Admin and can begin inviting team members and creating projects for agent test runs.

Organization settings

Org Admins can manage organization-wide resources from the admin sidebar. These resources are available across all projects in the organization:

| Sidebar item | What it manages |
| --- | --- |
| Models | Custom LLM models with bring-your-own-key credentials, used by agents and evaluators. See Model Registry. |
| MCP Servers | MCP server connections and tool endpoints available to agents under test. See Tools. |
| Evaluations | Organization-scoped evaluation criteria for scoring agent runs. See Evaluation Criteria. |
| API Keys | API keys for external API access. |
| People | Organization-wide team management. See Team Management. |
| Credentials | Organization-level credentials for external services. Credentials are stored encrypted and displayed masked (e.g., ****abcd). Each credential type can only be set once per organization. |
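
The masked display and the one-credential-per-type rule described for Credentials can be illustrated with a short sketch. This is not the Pipelines implementation: the `mask_credential` function, the `CredentialStore` class, and the choice to show the last four characters are assumptions made for illustration, and encryption at rest is omitted.

```python
def mask_credential(secret: str, visible: int = 4) -> str:
    """Return a masked form such as ****abcd (hypothetical helper)."""
    if len(secret) <= visible:
        # Too short to reveal anything safely: mask every character.
        return "*" * len(secret)
    return "****" + secret[-visible:]


class CredentialStore:
    """Hypothetical in-memory store allowing one credential per type."""

    def __init__(self) -> None:
        self._by_type: dict[str, str] = {}

    def set(self, credential_type: str, secret: str) -> None:
        # Assumption: a second attempt to set the same type is rejected.
        if credential_type in self._by_type:
            raise ValueError(f"{credential_type!r} is already set for this organization")
        self._by_type[credential_type] = secret

    def display(self, credential_type: str) -> str:
        # Only the masked form is ever returned for display.
        return mask_credential(self._by_type[credential_type])


store = CredentialStore()
store.set("example_service", "sk-test-1234abcd")
print(store.display("example_service"))  # ****abcd
```

In this sketch, calling `store.set("example_service", ...)` a second time raises a `ValueError`, mirroring the rule that each credential type can only be set once.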

Organization dashboard

When an Org Admin navigates to the Dashboard from the sidebar, they see an organization-level overview with the following widgets:

| Widget | What it shows |
| --- | --- |
| Projects | Count of projects in the organization, with a sparkline trend. Click to see a detailed list of all projects with their contributor counts and statuses. |
| Activity | A composite view of evaluations submitted, completed runs, and daily active evaluators, with trend lines and a timeseries chart. |
| Run Completion | Percentage of agent runs that have reached a final outcome, shown as a progress ring. |
| Engagement | Average evaluations per active contributor and the maximum by a single contributor, with trend lines and a chart. |
| First Review Pass Rate | How often human evaluations pass review on the first attempt. |
| Avg Reviews per Review Node | Average number of review rounds per run across all projects. |
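
To make the widget definitions concrete, the sketch below computes three of them from plain lists. It only illustrates the arithmetic implied by the descriptions above. The function names, the input shapes, and the choice to return zero for empty input are assumptions, not the platform's actual calculation.

```python
from collections import Counter


def run_completion_pct(run_is_final: list[bool]) -> float:
    """Percentage of runs that have reached a final outcome."""
    if not run_is_final:
        return 0.0
    return 100.0 * sum(run_is_final) / len(run_is_final)


def engagement(evaluation_authors: list[str]) -> tuple[float, int]:
    """Average and maximum evaluations per active contributor.

    `evaluation_authors` holds one contributor id per submitted evaluation,
    so a contributor is treated as active if they appear at least once.
    """
    per_contributor = Counter(evaluation_authors)
    if not per_contributor:
        return 0.0, 0
    counts = per_contributor.values()
    return sum(counts) / len(counts), max(counts)


def first_review_pass_rate(passed_first_review: list[bool]) -> float:
    """Fraction of human evaluations that passed review on the first attempt."""
    if not passed_first_review:
        return 0.0
    return sum(passed_first_review) / len(passed_first_review)


print(run_completion_pct([True, True, False, True]))            # 75.0
print(engagement(["ana", "ana", "ben", "ana", "cy", "ben"]))    # (2.0, 3)
print(first_review_pass_rate([True, False, True, True]))        # 0.75
```

In the engagement example, three contributors submitted six evaluations in total, so the average is 2.0 and the busiest contributor accounts for 3.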

The dashboard includes a timeframe selector to adjust the reporting period. For deeper, custom metrics over your agent runs, build dashboards in Studio.

Project Admins see a project-level dashboard when they navigate to a specific project. The organization-level dashboard is visible only to Org Admins.