gmuse.git
Git utilities for extracting repository information.
This module provides functions to interact with git repositories using subprocess to extract staged diffs, commit history, and repository metadata.
- Public API:
is_git_repository: Check if directory is a git repo
get_repo_root: Get repository root path
get_staged_diff: Extract staged changes
get_commit_history: Fetch recent commits
get_current_branch: Get current branch information
truncate_diff: Truncate large diffs
load_repository_instructions: Load .gmuse file
- Data Classes:
StagedDiff: Staged changes information
CommitRecord: Single commit data
CommitHistory: Collection of commits
RepositoryInstructions: Content from .gmuse file
BranchInfo: Current branch information
Module Contents
Classes
StagedDiff | Represents the git diff of staged changes.
CommitRecord | Single commit from repository history.
CommitHistory | Collection of recent commits for style context.
RepositoryInstructions | Project-level guidance from .gmuse file.
BranchInfo | Information about the current git branch.
Functions
_run_git | Execute a git command and return the result.
_count_diff_lines | Count added and removed lines in a diff.
_parse_commit_line | Parse a single commit log line into a CommitRecord.
_sanitize_branch_name | Sanitize branch name for prompt context.
_parse_branch_info | Parse branch name into type and summary.
is_git_repository | Check if the current/specified directory is a git repository.
get_repo_root | Get the root directory of the git repository.
get_staged_diff | Extract staged changes from git repository.
get_commit_history | Fetch recent commit messages for style context.
truncate_diff | Truncate diff to fit within token limits.
load_repository_instructions | Load project-level instructions from .gmuse file.
get_current_branch | Get information about the current git branch.
Data
_GIT_TIMEOUT_SHORT | Timeout for quick git operations like rev-parse (seconds).
_GIT_TIMEOUT_LONG | Timeout for potentially slow git operations like diff/log (seconds).
API
- gmuse.git.logger = 'get_logger(...)'
- gmuse.git._GIT_TIMEOUT_SHORT: Final[int] = 5
Timeout for quick git operations like rev-parse (seconds).
- gmuse.git._GIT_TIMEOUT_LONG: Final[int] = 30
Timeout for potentially slow git operations like diff/log (seconds).
- class gmuse.git.StagedDiff
Represents the git diff of staged changes.
This dataclass encapsulates all information about staged changes in a git repository, including the raw diff content and computed metadata.
- Attributes:
raw_diff: Full output of git diff --cached
files_changed: List of file paths that were modified
lines_added: Total lines added across all files
lines_removed: Total lines removed across all files
hash: SHA256 hash of raw_diff (useful for caching/deduplication)
size_bytes: Size of raw_diff in bytes
truncated: Whether diff was truncated to fit token limits
- raw_diff: str = None
- files_changed: list[str] = None
- lines_added: int = None
- lines_removed: int = None
- hash: str = None
- size_bytes: int = None
- truncated: bool = False
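The `hash` and `size_bytes` fields are described as values derived from `raw_diff`. As a minimal sketch of how such metadata could be computed: `diff_metadata` is a hypothetical helper, not part of gmuse, and UTF-8 encoding is an assumption, since the encoding the module uses is not stated here.

```python
import hashlib


def diff_metadata(raw_diff: str) -> tuple[str, int]:
    """Return (SHA-256 hex digest, size in bytes) for a diff string.

    Hypothetical helper for illustration; assumes UTF-8 encoding.
    """
    encoded = raw_diff.encode("utf-8")
    return hashlib.sha256(encoded).hexdigest(), len(encoded)


digest, size = diff_metadata("+hello\n")
print(len(digest), size)  # 64 7
```

Because the digest depends only on the diff text, two identical staged diffs yield the same key, which is what makes the field usable for caching or deduplication.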
- class gmuse.git.CommitRecord
Single commit from repository history.
- Attributes:
hash: Full git commit SHA (40 characters)
message: Commit message subject line
author: Commit author name
timestamp: Commit timestamp as datetime object
- hash: str = None
- message: str = None
- author: str = None
- timestamp: datetime.datetime = None
- class gmuse.git.CommitHistory
Collection of recent commits for style context.
Used to provide the LLM with examples of the repository's commit style to help generate consistent messages.
- Attributes:
commits: Ordered list of recent commits (newest first)
depth: Number of commits requested (may differ from len(commits))
repository_path: Absolute path to git repository root
- commits: list[gmuse.git.CommitRecord] = None
- depth: int = None
- repository_path: str = None
- class gmuse.git.RepositoryInstructions
Project-level guidance from .gmuse file.
The .gmuse file allows repository maintainers to provide custom guidance for commit message generation.
- Attributes:
content: Raw text content from .gmuse file (empty if not found)
file_path: Absolute path to .gmuse file
exists: Whether .gmuse file was found in the repository
- content: str = None
- file_path: str = None
- exists: bool = None
- class gmuse.git.BranchInfo
Information about the current git branch.
Used to provide context about the branch when generating commit messages. Branch names are sanitized to protect privacy and improve LLM understanding.
- Attributes:
raw_name: Original branch name from git
branch_type: Extracted branch type (e.g., 'feature', 'fix', 'hotfix')
branch_summary: Sanitized summary of branch purpose (truncated, tickets masked)
is_default: Whether this is a default branch (main, master, develop)
- raw_name: str = None
- branch_type: Optional[str] = None
- branch_summary: Optional[str] = None
- is_default: bool = False
- gmuse.git._run_git(*args: str, cwd: Optional[str] = None, timeout: int = _GIT_TIMEOUT_SHORT, check: bool = True) -> subprocess.CompletedProcess[str]
Execute a git command and return the result.
This is the core helper used by all git operations in this module.
- Args:
*args: Git command arguments (without 'git' prefix)
cwd: Working directory for the command (None = current directory)
timeout: Command timeout in seconds
check: If True, raise CalledProcessError on non-zero exit code
- Returns:
CompletedProcess with captured stdout/stderr
- Raises:
subprocess.CalledProcessError: If check=True and command fails
subprocess.TimeoutExpired: If command exceeds timeout
FileNotFoundError: If git executable is not found
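The documented arguments and exceptions map closely onto `subprocess.run`. The following is a rough sketch of a wrapper with the same shape, not the module's actual implementation: `run_git_sketch` is a hypothetical name, and the hard-coded default of 5 seconds mirrors `_GIT_TIMEOUT_SHORT`.

```python
import subprocess
from typing import Optional


def run_git_sketch(
    *args: str,
    cwd: Optional[str] = None,
    timeout: int = 5,
    check: bool = True,
) -> "subprocess.CompletedProcess[str]":
    """Run `git <args>` and capture its output as text (illustrative only)."""
    return subprocess.run(
        ["git", *args],
        cwd=cwd,
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode output to str
        timeout=timeout,      # raises subprocess.TimeoutExpired when exceeded
        check=check,          # raises subprocess.CalledProcessError on non-zero exit
    )
```

Passing the command as a list avoids shell interpretation of the arguments, and `FileNotFoundError` surfaces naturally when no `git` executable is on the PATH.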
- gmuse.git._count_diff_lines(raw_diff: str) -> tuple[int, int]
Count added and removed lines in a diff.
- Args:
raw_diff: Raw git diff output
- Returns:
Tuple of (lines_added, lines_removed)
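One plausible way to produce such counts is to scan the unified diff line by line and skip the `+++`/`---` file headers. This is a sketch under that assumption; `count_diff_lines_sketch` is a hypothetical stand-in and may differ from what `_count_diff_lines` does.

```python
def count_diff_lines_sketch(raw_diff: str) -> tuple[int, int]:
    """Count added/removed lines in a unified diff, ignoring file headers."""
    added = removed = 0
    for line in raw_diff.splitlines():
        # Simplification: any line starting with '+++' or '---' is treated as a
        # file header, so a changed line whose content begins with '++' or '--'
        # would be missed.
        if line.startswith("+++") or line.startswith("---"):
            continue
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed


sample = """diff --git a/app.py b/app.py
--- a/app.py
+++ b/app.py
@@ -1,2 +1,3 @@
 import os
-print("old")
+print("new")
+print("extra")
"""
print(count_diff_lines_sketch(sample))  # (2, 1)
```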
- gmuse.git._parse_commit_line(line: str) -> Optional[gmuse.git.CommitRecord]
Parse a single commit log line into a CommitRecord.
Expected format: hash|author|timestamp|message
- Args:
line: Raw log line from git
- Returns:
CommitRecord if parsing succeeds, None otherwise
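To illustrate the `hash|author|timestamp|message` format, here is a hedged sketch of a parser. It assumes an ISO 8601 timestamp, which the text above does not specify, and uses a hypothetical `CommitRecordSketch` dataclass in place of the real `CommitRecord`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CommitRecordSketch:
    """Stand-in for CommitRecord, for illustration only."""
    hash: str
    message: str
    author: str
    timestamp: datetime


def parse_commit_line_sketch(line: str) -> Optional[CommitRecordSketch]:
    """Parse 'hash|author|timestamp|message'; return None on malformed input."""
    # Split at most three times so a '|' inside the message is preserved.
    parts = line.strip().split("|", 3)
    if len(parts) != 4:
        return None
    commit_hash, author, raw_timestamp, message = parts
    try:
        # Assumption: the timestamp is ISO 8601, e.g. 2024-01-15T10:30:00+00:00.
        timestamp = datetime.fromisoformat(raw_timestamp)
    except ValueError:
        return None
    return CommitRecordSketch(commit_hash, message, author, timestamp)


record = parse_commit_line_sketch("abc123|Ada|2024-01-15T10:30:00+00:00|fix: handle a|b case")
print(record.message if record else None)  # fix: handle a|b case
```

Returning `None` rather than raising lets a caller skip malformed lines and keep the rest of the history. An author name containing `|` would still be misparsed by this sketch.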
- gmuse.git._sanitize_branch_name(branch_name: str, max_length: int = 60) -> str
Sanitize branch name for prompt context.
Normalizes separators, converts to lowercase, removes usernames and long hashes. Masks potential ticket IDs (e.g., JIRA-123 -> TICKET-XXX).
- Args:
branch_name: Raw branch name from git
max_length: Maximum length for sanitized name (default: 60)
- Returns:
Sanitized branch name suitable for LLM context
- Example:
>>> _sanitize_branch_name("feature/USER-123-add-auth")
'feature/ticket-xxx/add-auth'
>>> _sanitize_branch_name("fix/PROJ-456/update-api")
'fix/ticket-xxx/update-api'
- gmuse.git._parse_branch_info(branch_name: str, max_length: int = 60) -> tuple[Optional[str], Optional[str]]
Parse branch name into type and summary.
Extracts branch type (feature, fix, hotfix, etc.) and summary from common branch naming patterns like 'type/description' or 'type-description'.
- Args:
branch_name: Raw branch name from git
max_length: Maximum length for branch summary (default: 60)
- Returns:
Tuple of (branch_type, branch_summary), both can be None
- Example:
>>> _parse_branch_info("feature/add-authentication")
('feature', 'add-authentication')
>>> _parse_branch_info("fix/PROJ-123-bug-in-api")
('fix', 'ticket-xxx-bug-in-api')
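The 'type/description' and 'type-description' patterns can be split with a single regular expression. The sketch below covers only that splitting step and does not sanitize the name or mask ticket IDs, so it will not reproduce the second example above. `split_branch_sketch` and the `KNOWN_TYPES` tuple are hypothetical; the set of prefixes the module recognizes is not documented here.

```python
import re
from typing import Optional

# Hypothetical list of recognized prefixes, chosen for illustration.
KNOWN_TYPES = ("feature", "fix", "hotfix", "bugfix", "chore", "docs")


def split_branch_sketch(branch_name: str) -> tuple[Optional[str], Optional[str]]:
    """Split 'type/description' or 'type-description' into its two parts."""
    match = re.match(r"^([A-Za-z]+)[/-](.+)$", branch_name)
    if match and match.group(1).lower() in KNOWN_TYPES:
        return match.group(1).lower(), match.group(2)
    # No recognized prefix: both values are None, as the return type allows.
    return None, None


print(split_branch_sketch("feature/add-authentication"))  # ('feature', 'add-authentication')
print(split_branch_sketch("main"))                        # (None, None)
```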
- gmuse.git.is_git_repository(path: Optional[pathlib.Path] = None) -> bool
Check if the current/specified directory is a git repository.
- Args:
path: Directory to check, defaults to current directory
- Returns:
True if directory is a git repository, False otherwise
- Example:
>>> is_git_repository()
True
>>> is_git_repository(Path("/tmp"))
False
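A check like this is commonly built on `git rev-parse`. The sketch below assumes the `--is-inside-work-tree` flag and treats every failure (missing git, missing directory, timeout) as "not a repository". Whether gmuse uses this flag or handles errors this way is an assumption, and `is_git_repository_sketch` is a hypothetical name.

```python
import subprocess
from pathlib import Path
from typing import Optional


def is_git_repository_sketch(path: Optional[Path] = None) -> bool:
    """Return True if `path` (or the current directory) is inside a git work tree."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--is-inside-work-tree"],
            cwd=str(path) if path is not None else None,
            capture_output=True,
            text=True,
            timeout=5,  # short timeout, matching the "quick operation" category
        )
    except (OSError, subprocess.TimeoutExpired):
        # git not installed, directory missing, or the command hung.
        return False
    return result.returncode == 0 and result.stdout.strip() == "true"
```

The command is run without `check=True`, so a non-zero exit code becomes a plain `False` instead of an exception.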
- gmuse.git.get_repo_root(path: Optional[pathlib.Path] = None) -> pathlib.Path
Get the root directory of the git repository.
- Args:
path: Directory to start from, defaults to current directory
- Returns:
Path to repository root
- Raises:
NotAGitRepositoryError: If not in a git repository
- Example:
>>> root = get_repo_root()
>>> print(root)
/home/user/my-project
- gmuse.git.get_staged_diff() -> gmuse.git.StagedDiff
Extract staged changes from git repository.
- Returns:
StagedDiff object with diff content and metadata
- Raises:
NotAGitRepositoryError: If not in a git repository
NoStagedChangesError: If no files are staged
- Example:
>>> diff = get_staged_diff()
>>> print(diff.files_changed)
['src/main.py', 'tests/test_main.py']
- gmuse.git.get_commit_history(depth: int = 5) -> gmuse.git.CommitHistory
Fetch recent commit messages for style context.
- Args:
depth: Number of commits to fetch (default: 5)
- Returns:
CommitHistory object with recent commits
- Raises:
NotAGitRepositoryError: If not in a git repository
- Example:
>>> history = get_commit_history(depth=10)
>>> for commit in history.commits:
...     print(commit.message)
- gmuse.git.truncate_diff(diff: gmuse.git.StagedDiff, max_bytes: int = 20000) -> gmuse.git.StagedDiff
Truncate diff to fit within token limits.
Strategy:
- Keep file headers (diff --git, ---, +++)
- Keep as many lines as possible within byte limit
- Add truncation marker when limit is reached
- Preserve structure for LLM understanding
- Args:
diff: StagedDiff to truncate
max_bytes: Maximum size in bytes (default: 20000 ≈ 5000 tokens)
- Returns:
New StagedDiff with truncated content and truncated=True
- Example:
>>> large_diff = get_staged_diff()
>>> truncated = truncate_diff(large_diff, max_bytes=10000)
>>> print(truncated.truncated)
True
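The truncation strategy can be sketched on a raw diff string. This is one interpretation of the listed steps, not the module's code: the marker text, the `truncate_raw_diff_sketch` name, and the choice to keep the headers of later files after the limit is hit are all assumptions.

```python
TRUNCATION_MARKER = "... [diff truncated] ..."  # hypothetical marker text
HEADER_PREFIXES = ("diff --git", "--- ", "+++ ")


def truncate_raw_diff_sketch(raw_diff: str, max_bytes: int = 20000) -> tuple[str, bool]:
    """Return (possibly shortened diff, whether any body lines were dropped)."""
    if len(raw_diff.encode("utf-8")) <= max_bytes:
        return raw_diff, False

    kept: list[str] = []
    used = 0
    limit_hit = False
    for line in raw_diff.splitlines():
        cost = len(line.encode("utf-8")) + 1  # +1 for the newline
        if line.startswith(HEADER_PREFIXES):
            # File headers are always kept so the file list stays visible;
            # the result can therefore exceed max_bytes slightly.
            kept.append(line)
            used += cost
        elif not limit_hit and used + cost <= max_bytes:
            kept.append(line)
            used += cost
        elif not limit_hit:
            # First body line that does not fit: emit the marker once.
            limit_hit = True
            kept.append(TRUNCATION_MARKER)
        # After the limit is hit, remaining body lines are dropped.
    return "\n".join(kept) + "\n", limit_hit
```

Keeping the headers of every file, even past the limit, means the reader of the truncated diff can still see which files changed, at the cost of a small overshoot beyond `max_bytes`.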
- gmuse.git.load_repository_instructions() -> gmuse.git.RepositoryInstructions
Load project-level instructions from .gmuse file.
The .gmuse file allows repository maintainers to provide custom guidance for commit message generation (e.g., preferred formats, conventions).
- Returns:
RepositoryInstructions object with content from .gmuse file
- Raises:
NotAGitRepositoryError: If not in a git repository
- Example:
>>> instructions = load_repository_instructions()
>>> if instructions.exists:
...     print(instructions.content)
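The lookup itself amounts to reading an optional file at the repository root. In this minimal sketch the root is passed in explicitly, whereas the documented function takes no arguments and presumably locates it itself. `InstructionsSketch` and `load_instructions_sketch` are hypothetical names, and UTF-8 is an assumed encoding.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class InstructionsSketch:
    """Stand-in for RepositoryInstructions, for illustration only."""
    content: str
    file_path: str
    exists: bool


def load_instructions_sketch(repo_root: Path) -> InstructionsSketch:
    """Read <repo_root>/.gmuse if present; otherwise return empty content."""
    path = repo_root / ".gmuse"
    if path.is_file():
        return InstructionsSketch(path.read_text(encoding="utf-8"), str(path.resolve()), True)
    # A missing file is not an error: content is empty and exists is False.
    return InstructionsSketch("", str(path.resolve()), False)
```

Reporting a missing file through `exists=False` rather than an exception matches the attribute description above, where `content` is empty if the file is not found.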
- gmuse.git.get_current_branch(max_length: int = 60) -> Optional[gmuse.git.BranchInfo]
Get information about the current git branch.
Extracts the current branch name and parses it into structured information for use as context in commit message generation. Returns None if the repository is in a detached HEAD state or if branch detection fails.
- Args:
max_length: Maximum length for branch summary (default: 60)
- Returns:
BranchInfo object with parsed branch information, or None if unavailable
- Raises:
NotAGitRepositoryError: If not in a git repository
- Example:
>>> branch = get_current_branch()
>>> if branch and not branch.is_default:
...     print(f"Type: {branch.branch_type}, Summary: {branch.branch_summary}")