

This connector captures data from GitHub repositories and organizations into Flow collections via GitHub's REST API.

It is available for use in the Flow web application. For local development or open-source workflows, the latest version of the connector is provided as a Docker image. You can also follow the image link in your browser to see past image versions.

This connector is based on an open-source connector from a third party, with modifications for performance in the Flow system. You can find their documentation here, but keep in mind that the two versions may be significantly different.

Supported data resources

When you configure the connector, you specify a list of GitHub organizations and/or repositories from which to capture data.

From your selection, the following data resources are captured:

| Full refresh (batch) resources | Incremental (real-time supported) resources |
|---|---|
| Branches | Commit comment reactions |
| Collaborators | Commit comments |
| Issue labels | Commits |
| Pull request commits | Deployments |
| Team members | Issue comment reactions |
| Team memberships | Issue events |
| Teams | Issue milestones |
| Users | Issue reactions |
| | Project cards |
| | Project columns |
| | Pull request comment reactions |
| | Pull request stats |
| | Pull requests |
| | Review comments |
| | Workflow runs |

Each resource is mapped to a Flow collection through a separate binding.


The /start_date field is not applicable to the following resources:

  • Assignees
  • Branches
  • Collaborators
  • Issue labels
  • Organizations
  • Pull request commits
  • Pull request stats
  • Repositories
  • Tags
  • Teams
  • Users
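
When planning a capture, it can help to check up front whether a start date will have any effect on a given resource. The following is a minimal sketch that encodes the list above as a lookup. The helper name `start_date_applies` is hypothetical, and the resource display names are used as keys on the assumption that you map them to the connector's own stream identifiers yourself.

```python
# Resources for which /start_date is not applicable, copied from the list above.
# Display names are used as keys; the connector's stream identifiers may differ.
START_DATE_NOT_APPLICABLE = frozenset(
    name.casefold()
    for name in (
        "Assignees", "Branches", "Collaborators", "Issue labels",
        "Organizations", "Pull request commits", "Pull request stats",
        "Repositories", "Tags", "Teams", "Users",
    )
)


def start_date_applies(resource: str) -> bool:
    """Hypothetical helper: True unless the resource is on the exclusion list.

    Note that any name not on the list returns True, including misspelled ones.
    """
    return resource.strip().casefold() not in START_DATE_NOT_APPLICABLE
```

For example, `start_date_applies("Branches")` is false, so a binding for branches captures the full set regardless of the configured start date.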


There are two ways to authenticate with GitHub when capturing data into Flow: with OAuth2, or manually, with a personal access token that you generate. Their prerequisites differ.

OAuth is recommended for simplicity in the Flow web app; the access token method is the only method supported when using the command line.

Using OAuth2 to authenticate with GitHub in the Flow web app

  • A GitHub user account that has access to the repositories of interest and is a member of the organizations of interest.

Configuring the connector specification manually

  • A GitHub user account that has access to the repositories of interest and is a member of the organizations of interest.

  • A GitHub personal access token. You may use multiple tokens to balance the load on your API quota.
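
If you do use several tokens, they go into a single comma-separated value, as the property reference in this document describes for `/credentials/personal_access_token`. A minimal sketch of assembling that object follows. The function name `build_pat_credentials` is hypothetical, and the blank-token filtering is an assumption added for convenience rather than connector behavior.

```python
def build_pat_credentials(tokens: list[str]) -> dict[str, str]:
    """Hypothetical helper: build the /credentials object for manual authentication."""
    # Drop surrounding whitespace and empty entries before joining.
    cleaned = [t.strip() for t in tokens if t.strip()]
    if not cleaned:
        raise ValueError("at least one personal access token is required")
    return {
        "option_title": "PAT Credentials",
        # Multiple tokens are passed as one comma-separated string.
        "personal_access_token": ",".join(cleaned),
    }
```

The returned dictionary can be serialized to YAML or JSON and placed under `credentials` in the endpoint configuration.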


You configure connectors either in the Flow web app, or by directly editing the catalog specification file. See connectors to learn more about using connectors. The values and specification sample below provide configuration details specific to the GitHub source connector.



The properties in the table below reflect the manual authentication method. If you're working in the Flow web app, you'll use OAuth2, so some of these properties aren't required.

| Property | Title | Description | Type | Required/Default |
|---|---|---|---|---|
| /branch | Branch (Optional) | Space-delimited list of GitHub repository branches to pull commits for, e.g. `estuary/flow/your-branch`. If no branches are specified for a repository, the default branch will be pulled. | string | |
| /credentials | Authentication | Choose how to authenticate to GitHub | object | Required |
| /credentials/option_title | Authentication method | Set to PAT Credentials for manual authentication | string | |
| /credentials/personal_access_token | Access token | Personal access token, used for manual authentication. You may include multiple access tokens as a comma-separated list. | | |
| /page_size_for_large_streams | Page size for large streams (Optional) | The GitHub connector captures from several resources with a large amount of data. The page size of such resources depends on the size of your repository. We recommend that you specify values between 10 and 30. | integer | 10 |
| /repository | GitHub Repositories | Space-delimited list of GitHub organizations/repositories, e.g. `estuary/flow` for a single repository, `estuary/*` to get all repositories from an organization and `estuary/flow estuary/another-repo` for multiple repositories. | string | Required |
| /start_date | Start date | The date from which you'd like to replicate data from GitHub in the format YYYY-MM-DDT00:00:00Z. For the resources that support this configuration, only data generated on or after the start date will be replicated. This field doesn't apply to all resources. | string | Required |
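
Before committing an endpoint configuration, the constraints described in these properties can be checked locally. The sketch below is illustrative only: `check_endpoint_config` is a hypothetical helper, the character set accepted for owner and repository names is an assumption, and the 10 to 30 page-size range is treated as a recommendation rather than a hard limit.

```python
import re
from datetime import datetime

# One "owner/repo" or "owner/*" entry. The allowed characters are an assumption.
REPO_ENTRY = re.compile(r"^[\w.-]+/(\*|[\w.-]+)$")


def check_endpoint_config(config: dict) -> list[str]:
    """Hypothetical helper: return a list of notes; an empty list means no findings."""
    notes = []

    # /repository is a space-delimited list of owner/repo or owner/* entries.
    repos = config.get("repository", "").split()
    if not repos:
        notes.append("repository is required")
    for entry in repos:
        if not REPO_ENTRY.match(entry):
            notes.append(f"unexpected repository entry: {entry!r}")

    # /start_date uses the form YYYY-MM-DDT00:00:00Z; any valid time is accepted here.
    try:
        datetime.strptime(config.get("start_date", ""), "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        notes.append("start_date must look like YYYY-MM-DDT00:00:00Z")

    # /page_size_for_large_streams defaults to 10; 10 to 30 is the recommended range.
    page_size = config.get("page_size_for_large_streams", 10)
    if not 10 <= page_size <= 30:
        notes.append("page_size_for_large_streams is outside the recommended 10-30 range")

    if "credentials" not in config:
        notes.append("credentials is required")
    return notes
```

A configuration with `repository: estuary/flow estuary/*` and `start_date: 2022-01-01T00:00:00Z` produces no notes, while a bare `flow` entry or a date without a time component is reported.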


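| Property | Title | Description | Type | Required/Default |
|---|---|---|---|---|
| /stream | Stream | GitHub resource from which collection is captured. | string | Required |
| /syncMode | Sync mode | Connection method. | string | Required |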


This sample specification reflects the manual authentication method.

# Endpoint configuration
credentials:
  option_title: PAT Credentials
  personal_access_token: {secret}
page_size_for_large_streams: 10
repository: estuary/flow
start_date: 2022-01-01T00:00:00Z

# Bindings
- resource:
    stream: assignees
    syncMode: full_refresh
  target: ${PREFIX}/assignees
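
Because each resource needs its own binding, specifications with many resources become repetitive. The sketch below generates binding entries in the shape of the sample above. The helper name `make_bindings` is hypothetical, and the `target` naming of prefix plus stream name simply follows the sample. Only `assignees` with `full_refresh` appears in this document, so any other stream name or sync mode value you pass in is an assumption to verify against the connector.

```python
def make_bindings(prefix: str, streams: dict[str, str]) -> list[dict]:
    """Hypothetical helper: one binding per stream, targeting <prefix>/<stream>.

    `streams` maps a stream name to its syncMode value.
    """
    return [
        {
            "resource": {"stream": stream, "syncMode": sync_mode},
            "target": f"{prefix}/{stream}",
        }
        for stream, sync_mode in streams.items()
    ]


# Reproduces the single binding from the sample, with "acme" standing in for ${PREFIX}.
bindings = make_bindings("acme", {"assignees": "full_refresh"})
```

The resulting list can be dumped to YAML with whichever serializer you already use and pasted into the bindings section of the specification.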