gmuse.prompts
Prompt builder for LLM commit message generation.
This module assembles prompts from various context sources including staged diffs, commit history, repository instructions, and learning examples.
- Public API:
build_prompt: Build complete system and user prompts
build_context: Build context section from various sources
validate_message: Validate generated commit message
estimate_tokens: Estimate token count for text
- Format-specific task prompts:
get_freeform_task: Natural language commit messages
get_conventional_task: Conventional Commits format
get_gitmoji_task: Gitmoji-style commit messages
Module Contents
Functions
get_freeform_task: Get task prompt for freeform commit messages.
get_conventional_task: Get task prompt for conventional commit messages.
get_gitmoji_task: Get task prompt for gitmoji commit messages.
build_context: Build context section of prompt from various sources.
build_prompt: Build complete prompt for LLM generation.
validate_message: Validate generated commit message.
estimate_tokens: Estimate token count for text.
Data
PROMPT_VERSION: Version identifier for prompt format (useful for tracking/debugging).
SYSTEM_PROMPT: Base system prompt used for all commit message generations.
MAX_MESSAGE_LENGTH: Default maximum allowed length for generated commit messages.
_CHARS_PER_TOKEN: Default characters per token heuristic for GPT models.
API
- gmuse.prompts.logger = 'get_logger(...)'
- gmuse.prompts.PROMPT_VERSION: Final[str] = '1.0.0'
Version identifier for prompt format (useful for tracking/debugging).
- gmuse.prompts.SYSTEM_PROMPT: Final[str] = <Multiline-String>
Base system prompt used for all commit message generations.
- gmuse.prompts.MAX_MESSAGE_LENGTH: Final[int] = 1000
Default maximum allowed length for generated commit messages.
This serves as the default value used by validate_message() when no max_length parameter is provided. Can be overridden per-call or via configuration.
- gmuse.prompts.get_freeform_task() -> str
Get task prompt for freeform commit messages.
- Returns:
Task prompt string
- Example:
>>> task = get_freeform_task()
>>> print(task)
Generate a commit message in natural language...
- gmuse.prompts.get_conventional_task(max_chars: int | None = None) -> str
Get task prompt for conventional commit messages.
- Args:
max_chars: Optional override for maximum characters. When provided, the default fixed-length guidance is omitted to avoid conflicting instructions.
- Returns:
Task prompt string
- Example:
>>> task = get_conventional_task()
>>> print(task)
Generate a commit message following Conventional Commits specification...
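The max_chars behaviour described above (replace the default length guidance instead of adding a second, conflicting limit) could be sketched as follows. This is a minimal illustration, not gmuse's code: the name conventional_task_sketch, the guidance wording, and the 72-character default are all assumptions.

```python
from typing import Optional

# Hypothetical default guidance; gmuse's real wording and limit are not shown
# in this reference, so both are assumptions made for illustration.
_DEFAULT_LENGTH_HINT = "Keep the subject line under 72 characters."


def conventional_task_sketch(max_chars: Optional[int] = None) -> str:
    """Illustrative stand-in for a task-prompt builder (not gmuse's code)."""
    lines = [
        "Generate a commit message following Conventional Commits specification."
    ]
    if max_chars is None:
        # No override: fall back to the fixed default guidance.
        lines.append(_DEFAULT_LENGTH_HINT)
    else:
        # Override given: emit only the caller's limit, so the prompt never
        # carries two different length instructions at once.
        lines.append(f"Keep the entire message under {max_chars} characters.")
    return "\n".join(lines)
```

With no argument the sketch includes the default hint; with max_chars=200 it mentions only the 200-character limit.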
- gmuse.prompts.get_gitmoji_task() -> str
Get task prompt for gitmoji commit messages.
- Returns:
Task prompt string
- Example:
>>> task = get_gitmoji_task()
>>> print(task)
Generate a commit message with a relevant emoji prefix...
- gmuse.prompts.build_context(diff: gmuse.git.StagedDiff, commit_history: Optional[gmuse.git.CommitHistory] = None, repo_instructions: Optional[gmuse.git.RepositoryInstructions] = None, branch_info: Optional[gmuse.git.BranchInfo] = None, user_hint: Optional[str] = None, learning_examples: Optional[List[Tuple[str, str]]] = None) -> str
Build context section of prompt from various sources.
- Args:
diff: Staged diff information
commit_history: Recent commit history for style context
repo_instructions: Repository-level instructions from .gmuse file
branch_info: Current branch information for context
user_hint: User-provided hint via --hint flag
learning_examples: List of (generated, final) message pairs from learning
- Returns:
Formatted context string
- Example:
>>> from gmuse.git import StagedDiff
>>> diff = StagedDiff(
...     raw_diff="diff --git a/file.py...",
...     files_changed=["file.py"],
...     lines_added=10,
...     lines_removed=2,
...     hash="abc123",
...     size_bytes=500,
... )
>>> context = build_context(diff)
>>> print(context)
Staged changes summary:
- Files changed: 1
...
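One plausible way to assemble a context string from a required diff plus optional sources is to collect only the sections that are present and join them. The sketch below is a guess at the shape, not gmuse's implementation: DiffSketch is a hypothetical stand-in for gmuse.git.StagedDiff with a subset of its fields, and every section label except the "Staged changes summary" header is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DiffSketch:
    """Hypothetical stand-in for gmuse.git.StagedDiff (subset of fields)."""

    raw_diff: str
    files_changed: List[str]
    lines_added: int
    lines_removed: int


def build_context_sketch(
    diff: DiffSketch,
    user_hint: Optional[str] = None,
    learning_examples: Optional[List[Tuple[str, str]]] = None,
) -> str:
    """Join the diff summary with whichever optional sections were supplied."""
    sections = [
        "Staged changes summary:\n"
        f"- Files changed: {len(diff.files_changed)}\n"
        f"- Lines added: {diff.lines_added}\n"
        f"- Lines removed: {diff.lines_removed}"
    ]
    if user_hint:
        # Section label is an assumption made for this sketch.
        sections.append(f"User hint: {user_hint}")
    if learning_examples:
        pairs = "\n".join(
            f"- generated: {generated!r} -> final: {final!r}"
            for generated, final in learning_examples
        )
        sections.append("Past edits:\n" + pairs)
    # The raw diff goes last so the summary and hints are read first.
    sections.append("Diff:\n" + diff.raw_diff)
    return "\n\n".join(sections)
```

Sections for missing sources are skipped entirely rather than emitted empty, which keeps the prompt short when only a diff is available.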
- gmuse.prompts.build_prompt(diff: gmuse.git.StagedDiff, format: str = 'freeform', commit_history: Optional[gmuse.git.CommitHistory] = None, repo_instructions: Optional[gmuse.git.RepositoryInstructions] = None, branch_info: Optional[gmuse.git.BranchInfo] = None, user_hint: Optional[str] = None, learning_examples: Optional[List[Tuple[str, str]]] = None, max_chars: Optional[int] = None) -> Tuple[str, str]
Build complete prompt for LLM generation.
- Args:
diff: Staged diff information
format: Message format style ("freeform", "conventional", "gitmoji")
commit_history: Recent commit history for style context
repo_instructions: Repository-level instructions from .gmuse file
branch_info: Current branch information for context
user_hint: User-provided hint via --hint flag
learning_examples: List of (generated, final) message pairs from learning
- Returns:
Tuple of (system_prompt, user_prompt)
- Raises:
ValueError: If format is not recognized
- Example:
>>> from gmuse.git import StagedDiff
>>> diff = StagedDiff(
...     raw_diff="diff --git a/file.py...",
...     files_changed=["file.py"],
...     lines_added=10,
...     lines_removed=2,
...     hash="abc123",
...     size_bytes=500,
... )
>>> system, user = build_prompt(diff, format="conventional")
>>> print(system)
You are an expert commit message generator...
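The documented contract (three recognized formats, a (system_prompt, user_prompt) tuple, and ValueError for anything else) can be illustrated with a small dispatch table. This is a sketch under assumptions: build_prompt_sketch and the lambda stand-ins are hypothetical, the prompt strings are the truncated ones from the examples in this reference, and placing the context before the task in the user prompt is a guess.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins for SYSTEM_PROMPT and the three task getters.
_SYSTEM = "You are an expert commit message generator..."
_TASKS: Dict[str, Callable[[], str]] = {
    "freeform": lambda: "Generate a commit message in natural language...",
    "conventional": lambda: (
        "Generate a commit message following Conventional Commits specification..."
    ),
    "gitmoji": lambda: "Generate a commit message with a relevant emoji prefix...",
}


def build_prompt_sketch(context: str, format: str = "freeform") -> Tuple[str, str]:
    """Return (system_prompt, user_prompt); reject unrecognized formats."""
    try:
        task = _TASKS[format]()
    except KeyError:
        # Mirrors the documented behaviour: unknown formats raise ValueError.
        known = ", ".join(sorted(_TASKS))
        raise ValueError(
            f"Unknown format {format!r}; expected one of: {known}"
        ) from None
    # Ordering (context first, task last) is an assumption of this sketch.
    return _SYSTEM, f"{context}\n\n{task}"
```

A dictionary lookup keeps the set of valid formats in one place, so the error message and the dispatch cannot drift apart.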
- gmuse.prompts.validate_message(message: str, format: str = 'freeform', max_length: int = MAX_MESSAGE_LENGTH) -> None
Validate generated commit message.
- Args:
message: Generated commit message to validate
format: Expected message format
max_length: Maximum allowed message length (default: 1000)
- Raises:
InvalidMessageError: If message fails validation
- Example:
>>> validate_message("feat(auth): add JWT", format="conventional")  # No error
>>> validate_message("", format="freeform")
Traceback (most recent call last):
    ...
InvalidMessageError: Generated message is empty
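A validator with this contract (return None on success, raise on failure) might check emptiness, length, and format in that order. The sketch below is not gmuse's implementation: InvalidMessageErrorSketch is a hypothetical stand-in for InvalidMessageError, measuring length in characters is an assumption, and the regular expression is one simplified reading of the Conventional Commits header, not necessarily the check gmuse performs.

```python
import re


class InvalidMessageErrorSketch(ValueError):
    """Hypothetical stand-in for gmuse's InvalidMessageError."""


# Assumed header pattern: type, optional (scope), optional "!", then ": text".
_CONVENTIONAL_RE = re.compile(r"^[a-z]+(\([^)]+\))?!?: \S")


def validate_message_sketch(
    message: str, format: str = "freeform", max_length: int = 1000
) -> None:
    """Raise InvalidMessageErrorSketch if the message fails a check."""
    if not message.strip():
        raise InvalidMessageErrorSketch("Generated message is empty")
    if len(message) > max_length:
        raise InvalidMessageErrorSketch(
            f"Message is {len(message)} characters; limit is {max_length}"
        )
    if format == "conventional" and not _CONVENTIONAL_RE.match(message):
        raise InvalidMessageErrorSketch(
            "Message does not look like a Conventional Commit"
        )
```

Running the cheapest checks first means an empty message is reported as empty rather than as a format violation.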
- gmuse.prompts._CHARS_PER_TOKEN: Final[int] = 4
Default characters per token heuristic for GPT models.
This serves as the default value used by estimate_tokens() when no chars_per_token parameter is provided. Can be overridden per-call or via configuration.
- gmuse.prompts.estimate_tokens(text: str, chars_per_token: int = _CHARS_PER_TOKEN) -> int
Estimate token count for text.
Uses a simple heuristic (default: ~4 characters per token, approximate for GPT models). This is a rough estimate and actual token counts vary by model and tokenizer.
- Args:
text: Text to estimate tokens for
chars_per_token: Characters per token heuristic (default: 4)
- Returns:
Estimated token count
- Example:
>>> estimate_tokens("Hello world")
3
>>> estimate_tokens("A" * 400)
100
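The characters-per-token heuristic amounts to a single division. The sketch below assumes the result is rounded up; that choice reproduces both documented examples (11 characters give 3, 400 characters give 100), but rounding to nearest would too, so the actual rounding rule in gmuse is not confirmed here. The name estimate_tokens_sketch is hypothetical.

```python
import math


def estimate_tokens_sketch(text: str, chars_per_token: int = 4) -> int:
    """Estimate tokens as len(text) / chars_per_token, rounded up (assumed)."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    # Ceiling division: any partial token counts as a whole one.
    return math.ceil(len(text) / chars_per_token)
```

Rounding up errs on the side of overestimating, which is the safer direction when the estimate is used to stay within a model's context limit.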