Indexing API Overview

The Glean Indexing API enables organizations to make their custom content searchable and accessible through Glean's search and AI capabilities. This API is designed for pushing content from internal tools, on-premises systems, and applications that Glean doesn't natively support into Glean's search index, making it discoverable alongside your organization's other content.

Key Capabilities

Document Indexing

Index documents with full-text content, metadata, and permissions to make them searchable through Glean

Custom Datasources

Create and manage custom datasources to organize and categorize your indexed content

Structured Entities

Index structured key-value content for applications requiring schema-based data representation

Permission Management

Enforce fine-grained access controls ensuring users only see content they're authorized to access

Common Use Cases

  • Internal Tool Integration: Make content from proprietary or on-premises tools searchable in Glean
  • Legacy System Modernization: Bring content from older systems into modern search and AI workflows
  • Custom Application Data: Index structured data from internal applications and databases
  • Document Repositories: Make file servers, wikis, and document management systems searchable
  • Organizational Data: Push people information, team structures, and organizational knowledge into Glean

How It Works

The Indexing API follows a straightforward workflow:

1. Create Datasource

Set up a custom datasource to hold your content. This defines how your content appears in search results and provides organizational structure.

2. Index Content

Push documents, structured data, and metadata to your datasource using the indexing endpoints. Content can be indexed individually or in bulk.

3. Manage Permissions

Configure access controls to ensure users only see content they're authorized to access, respecting your organization's security policies.

4. Enable Discovery

Activate your datasource in Glean's admin console to make the indexed content discoverable through search and AI features.
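
Step 2 notes that content can be indexed in bulk. The sketch below shows one way a large batch might be split into pages that share a single upload ID. It only builds request bodies and sends nothing. The field names (`uploadId`, `isFirstPage`, `isLastPage`, `datasource`, `documents`) and the helper `bulk_upload_pages` are assumptions for illustration; check the bulk indexing reference for the exact request shape.

```python
import uuid
from typing import Iterator


def bulk_upload_pages(
    datasource: str, documents: list[dict], page_size: int = 100
) -> Iterator[dict]:
    """Yield one request body per page of a paged bulk upload.

    Field names are assumed, not taken from this overview.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")

    # Every page of the same upload carries the same identifier.
    upload_id = str(uuid.uuid4())

    # Split the documents into fixed-size pages; an empty upload is one empty page.
    pages = [
        documents[start:start + page_size]
        for start in range(0, len(documents), page_size)
    ] or [[]]

    for index, page in enumerate(pages):
        yield {
            "uploadId": upload_id,
            "isFirstPage": index == 0,
            "isLastPage": index == len(pages) - 1,
            "datasource": datasource,
            "documents": page,
        }
```

Each yielded dictionary would be serialized to JSON and posted as one request; the first and last flags let the server know where the upload starts and ends.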

Content Types

The Indexing API supports various types of content:

  • Documents: Full-text content with titles, bodies, metadata, and view URLs
  • Custom Entities: Structured key-value data for applications requiring schema-based representation
  • User Information: Employee profiles, organizational structures, and team data
  • Groups & Teams: Organizational units and membership information
  • External Shortcuts: Quick access links to frequently used external resources

Quick Example

The following request creates a custom datasource, the first step before you can index any documents:

curl -X POST https://customer-be.glean.com/api/index/v1/adddatasource \
  -H 'Authorization: Bearer <your_indexing_token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "internal-docs",
    "displayName": "Internal Documentation",
    "datasourceCategory": "PUBLISHED_CONTENT",
    "urlRegex": "^https://internal.company.com/docs.*",
    "isUserReferencedByEmail": true
  }'
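
Once the datasource exists, the next step is to index a document into it. The sketch below builds, but does not send, such a request using only the Python standard library. The `/indexdocument` path, the `{"document": ...}` envelope, the document field names, and the sample values (`getting-started`, `alice@company.com`) are assumptions for illustration; confirm them against the Indexing API reference before use.

```python
import json
import urllib.request

# Same placeholder host as the curl example; replace with your own instance.
BASE_URL = "https://customer-be.glean.com/api/index/v1"


def build_index_document_request(token: str, document: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that indexes one document.

    The endpoint path and request envelope are assumed, not confirmed here.
    """
    body = json.dumps({"document": document}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/indexdocument",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical document for the "internal-docs" datasource.
# The view URL is chosen to match the datasource's urlRegex.
first_document = {
    "datasource": "internal-docs",
    "id": "getting-started",
    "title": "Getting Started",
    "viewURL": "https://internal.company.com/docs/getting-started",
    "body": {"mimeType": "text/plain", "textContent": "Welcome to the internal docs."},
    "permissions": {"allowedUsers": [{"email": "alice@company.com"}]},
}

request = build_index_document_request("<your_indexing_token>", first_document)
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect or log before anything reaches the index.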

Authentication & Permissions

Supported APIs: Indexing API

The Indexing API uses Glean-issued tokens for authentication. Only users with appropriate permissions can create and manage indexing tokens:

  • Super Admins: Can create all types of tokens
  • Admins: Can create user-scoped tokens
  • API Token Creators: Can create user-scoped tokens

Learn more in our Authentication Guide.

Important Considerations

  • Permissions: Always configure appropriate permissions to maintain security
  • Rate Limits: Review our rate limiting policies for bulk operations
  • Data Freshness: Content typically appears in search within a few minutes of indexing
  • Backwards Compatibility: Glean provides advance notice and deprecation periods for any breaking changes
  • Testing: Use test groups to validate content before making it visible to all users
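
For the rate limit consideration above, a common client-side pattern is to retry throttled requests with exponential backoff and jitter. The sketch below is generic and not specific to Glean: it assumes throttling is signalled with HTTP 429, and the helper names, retry count, and delay values are illustrative choices rather than documented limits.

```python
import random
import time
import urllib.error
import urllib.request


def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Return the maximum wait (in seconds) before each retry, doubling up to a cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def send_with_backoff(request, max_retries=5, opener=urllib.request.urlopen, sleep=time.sleep):
    """Send a request, retrying only when the server answers HTTP 429 (assumed)."""
    for delay in backoff_delays(max_retries):
        try:
            return opener(request)
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise  # Other errors are not retried here.
            # Full jitter: wait a random time up to the current bound.
            sleep(random.uniform(0, delay))
    # Final attempt; any error now propagates to the caller.
    return opener(request)
```

The `opener` and `sleep` parameters exist so the retry logic can be exercised in tests without real network calls or real waiting.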