Implementation:Mage ai Mage ai GitHub Client
| Knowledge Sources | |
|---|---|
| Domains | Data_Integration, GitHub, API |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete HTTP client for making authenticated REST requests to the GitHub API, used by the Mage GitHub source connector.
Description
The GithubClient class manages authenticated HTTP communication with the GitHub REST API. It sets token-based authentication on the session headers, throttles requests by sleeping when GitHub signals a primary or secondary rate limit (via the X-RateLimit-Remaining and Retry-After headers), retries transient errors (timeouts, connection errors, 5xx responses, 429 Too Many Requests) with exponential backoff, and follows Link headers to traverse multi-page API responses. The module also defines a hierarchy of exception classes mapped to GitHub HTTP status codes (301, 304, 400, 401, 403, 404, 409, 422, 429, 500), plus helper functions for rate throttling and for raising the mapped errors.
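The retry behavior described above can be illustrated with a small standard-library sketch. It is not the module's code: the helper name `call_with_backoff`, the `max_tries` and `factor` defaults, and the injectable `sleep` parameter are all assumptions made for illustration.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retriable: Tuple[Type[BaseException], ...],
    max_tries: int = 5,    # assumed value, not taken from the source
    factor: float = 2.0,   # assumed value, not taken from the source
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exception types with growing waits."""
    for attempt in range(1, max_tries + 1):
        try:
            return func()
        except retriable:
            if attempt == max_tries:
                raise  # out of attempts: surface the last error
            # Wait factor, 2*factor, 4*factor, ... seconds between tries.
            sleep(factor * 2 ** (attempt - 1))
    raise ValueError("max_tries must be at least 1")
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller would normally leave the default in place.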
Usage
Used internally by the GitHub source connector streams and sync orchestrator to make API requests. The client is instantiated with a config dictionary containing access_token and repository keys, and optionally a base_url for GitHub Enterprise deployments.
Code Reference
Source Location
- Repository: mage-ai
- File: mage_integrations/mage_integrations/sources/github/tap_github/client.py
- Lines: 1-430
Signature
class GithubClient:
    def __init__(self, config: dict):
        ...
    def set_auth_in_session(self) -> None:
        ...
    def get_request_timeout(self) -> float:
        ...
    def authed_get(self, source, url, headers={}, stream="", should_skip_404=True) -> requests.Response:
        ...
    def authed_get_all_pages(self, source, url, headers={}, stream="", should_skip_404=True) -> Generator:
        ...
    def verify_repo_access(self, url_for_repo, repo) -> None:
        ...
    def verify_access_for_repo(self) -> None:
        ...
    def extract_orgs_from_config(self) -> list:
        ...
    def extract_repos_from_config(self) -> tuple:
        ...
Import
from mage_integrations.sources.github.tap_github.client import GithubClient
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| config | dict | Yes | Configuration dictionary containing access_token, repository, and optionally base_url and request_timeout |
Outputs
| Name | Type | Description |
|---|---|---|
| response | requests.Response | HTTP response object from the GitHub API; call .json() on it to parse the JSON body |
Key Behaviors
Authentication
Sets the authorization header to token <access_token> on the session for all subsequent requests.
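As a rough illustration, the header value can be built from the config as follows. The function name `build_auth_header` is hypothetical; the actual client applies the header to its session rather than returning a dictionary.

```python
def build_auth_header(config: dict) -> dict:
    """Return the authorization header in the 'token <access_token>' form."""
    token = config["access_token"]  # KeyError if the key is missing
    return {"authorization": f"token {token}"}


# A plain dict stands in for the session's headers in this sketch.
session_headers = {}
session_headers.update(build_auth_header({"access_token": "ghp_xxxx"}))
```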
Rate Limiting
- Checks Retry-After header for secondary rate limits and sleeps the specified number of seconds.
- Checks X-RateLimit-Remaining header; when it reaches 0, calculates sleep duration from X-RateLimit-Reset epoch timestamp plus a 2-second buffer.
- Raises GithubException if neither rate limit header is present (indicates an invalid base URL).
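The three rules above can be sketched as a pure function over the response headers. This is a simplified illustration, not the module's helper: the names `seconds_to_wait` and `throttle` are hypothetical, `GithubException` is redefined here as a stand-in, and the clock and sleep function are injected so the logic can be exercised without waiting.

```python
import time

RESET_BUFFER_SECONDS = 2  # buffer described in the list above


class GithubException(Exception):
    """Stand-in for the module's exception of the same name."""


def seconds_to_wait(headers: dict, now: float) -> int:
    """Return how long to sleep before the next request, per the rules above."""
    if "Retry-After" in headers:
        # Secondary rate limit: GitHub states the wait directly.
        return int(headers["Retry-After"])
    if "X-RateLimit-Remaining" in headers:
        if int(headers["X-RateLimit-Remaining"]) == 0:
            # Primary rate limit: wait until the reset epoch plus the buffer.
            reset_at = int(headers["X-RateLimit-Reset"]) + RESET_BUFFER_SECONDS
            return max(0, int(reset_at - now))
        return 0
    raise GithubException("No rate limit headers found; check the base URL.")


def throttle(headers: dict, sleep=time.sleep, now=time.time) -> int:
    """Sleep for the computed duration and return it."""
    wait = seconds_to_wait(headers, now())
    if wait > 0:
        sleep(wait)
    return wait
```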
Error Handling
Custom exception classes are mapped to HTTP status codes: MovedPermanentlyError (301), NotModifiedError (304), BadRequestException (400), BadCredentialsException (401), AuthException (403), NotFoundException (404), ConflictError (409), UnprocessableError (422), TooManyRequests (429), InternalServerError (500). Any status code above 500 raises Server5xxError.
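A lookup of this kind might be sketched as below. The class names and status codes come from the list above; the inheritance shown, the function name `exception_for_status`, and the fallback to `GithubException` for unlisted codes are assumptions, so the real module may differ.

```python
class GithubException(Exception):
    """Assumed base class for every error in this sketch."""

class MovedPermanentlyError(GithubException): pass
class NotModifiedError(GithubException): pass
class BadRequestException(GithubException): pass
class BadCredentialsException(GithubException): pass
class AuthException(GithubException): pass
class NotFoundException(GithubException): pass
class ConflictError(GithubException): pass
class UnprocessableError(GithubException): pass
class TooManyRequests(GithubException): pass
class Server5xxError(GithubException): pass
class InternalServerError(Server5xxError): pass  # assumed to be a 5xx subtype

STATUS_TO_EXCEPTION = {
    301: MovedPermanentlyError,
    304: NotModifiedError,
    400: BadRequestException,
    401: BadCredentialsException,
    403: AuthException,
    404: NotFoundException,
    409: ConflictError,
    422: UnprocessableError,
    429: TooManyRequests,
    500: InternalServerError,
}


def exception_for_status(status_code: int) -> type:
    """Pick the exception class for a non-200 status code."""
    if status_code > 500:
        return Server5xxError
    # Assumption: unlisted codes fall back to the base exception.
    return STATUS_TO_EXCEPTION.get(status_code, GithubException)
```

Callers can then catch either a specific class such as `NotFoundException` or the shared base class.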
Pagination
The authed_get_all_pages method yields responses in a loop, following the next link from the response links header until no more pages remain.
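The loop can be sketched independently of the network. Here `FakeResponse` and `iter_pages` are hypothetical names, and `fetch` stands in for the single-request method. The `links` attribute mirrors the shape of `requests.Response.links`, a dictionary keyed by relation name whose entries carry a `url`.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass
class FakeResponse:
    """Minimal stand-in for a response with a parsed Link header."""
    body: list
    links: dict = field(default_factory=dict)


def iter_pages(fetch: Callable[[str], FakeResponse], url: str) -> Iterator[FakeResponse]:
    """Yield each page, following the 'next' link until none remains."""
    while True:
        resp = fetch(url)
        yield resp
        next_link = resp.links.get("next")
        if not next_link:
            break
        url = next_link["url"]


# Two in-memory pages keyed by URL replace real HTTP calls.
pages = {
    "page-1": FakeResponse([1, 2], {"next": {"url": "page-2"}}),
    "page-2": FakeResponse([3]),
}
bodies = [resp.body for resp in iter_pages(pages.__getitem__, "page-1")]
```

Because the function is a generator, each request is made only when the caller asks for the next page.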
Usage Examples
from mage_integrations.sources.github.tap_github.client import GithubClient

config = {
    "access_token": "ghp_xxxx",
    "repository": "mage-ai/mage-ai",
}
client = GithubClient(config)

# Single request
resp = client.authed_get("events", "https://api.github.com/repos/mage-ai/mage-ai/events")
events = resp.json()

# Paginated request
for page_resp in client.authed_get_all_pages("commits", "https://api.github.com/repos/mage-ai/mage-ai/commits"):
    commits = page_resp.json()