Architecture
Internal structure of CodeSight.
Module Overview
codesight/
├── __init__.py # version, metadata
├── __main__.py # python -m codesight entry point
├── cli.py # argparse CLI, output formatting
├── config.py # AppConfig, ProviderConfig, load/save
├── analyzer.py # core analysis engine, system prompts
├── compression.py # code map builder, token reduction
├── streaming.py # streaming output for all providers
├── templates.py # custom prompt template management
├── cost.py # token cost estimation per model
├── pipeline.py # multi-model triage -> verify pipeline
├── benchmark.py # LLM benchmark on vulnerable samples
├── sarif.py # SARIF output generation
└── providers/
    ├── base.py # BaseLLMProvider, Message, LLMResponse
    ├── factory.py # provider registry and instantiation
    ├── openai_provider.py
    ├── anthropic_provider.py
    ├── google_provider.py
    ├── ollama_provider.py
    └── custom_provider.py # OpenAI-compatible adapter (OpenRouter, Groq, Azure, etc.)
Data Flow
Every analysis follows the same path:
- CLI parses the command and loads config
- Analyzer reads the source file and validates size
- Compression builds a code map if the file exceeds 300 lines
- System prompt is selected based on the task (review/bugs/security/etc.)
- Provider sends the prompt + code to the LLM API
- Response is parsed, formatted (markdown/json/sarif), and printed
- Cost is estimated from token usage
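The steps above can be condensed into a sketch. The function name, prompt table, and compression stand-in here are illustrative, not CodeSight's actual API:

```python
COMPRESSION_THRESHOLD = 300  # lines; matches the threshold described above

SYSTEM_PROMPTS = {  # hypothetical task -> prompt table
    "review": "You are a meticulous code reviewer.",
    "security": "You are a security auditor.",
}

def prepare_request(source: str, task: str) -> dict:
    """Select a system prompt and compress large files before the LLM call."""
    lines = source.splitlines()
    if len(lines) > COMPRESSION_THRESHOLD:
        # Stand-in for the code-map builder: keep imports and signatures only
        lines = [l for l in lines if l.lstrip().startswith(
            ("import ", "from ", "def ", "class "))]
    return {"system": SYSTEM_PROMPTS[task], "code": "\n".join(lines)}
```

The resulting dict is what the provider layer would turn into an API request; parsing, formatting, and cost estimation happen after the response comes back.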
Provider Abstraction
All providers implement BaseLLMProvider:
class BaseLLMProvider(ABC):
    @abstractmethod
    async def complete(self, messages, max_tokens, temperature) -> LLMResponse: ...
    @abstractmethod
    async def health_check(self) -> bool: ...
    @property
    @abstractmethod
    def name(self) -> str: ...
Adding a new provider means implementing these three methods and registering the class in factory.py. The custom_provider.py module handles any OpenAI-compatible endpoint: factory.create_provider falls back to CustomProvider when a ProviderConfig has a base_url but an unknown provider name. This is how OpenRouter, Groq, Together, xAI, Azure AI Foundry, and others run without dedicated modules.
Compression
For files over 300 lines, the compression module extracts:
- Import statements
- Function/class/method signatures
- Structural outline with line numbers
This cuts token usage by 60-80% while keeping enough structural context for accurate analysis.
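A rough sketch of such an extractor for Python sources, using the standard ast module (the real builder may work differently and likely supports more languages):

```python
import ast

def build_code_map(source: str) -> str:
    """Extract imports and function/class signatures, tagged with line numbers."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            entries.append((node.lineno, ast.unparse(node)))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # The first line of the unparsed node is its signature
            entries.append((node.lineno, ast.unparse(node).splitlines()[0]))
    return "\n".join(f"L{lineno}: {text}" for lineno, text in sorted(entries))
```

Function bodies are dropped entirely; the line numbers let the model refer back to exact locations in the original file.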
Multi-Model Pipeline
The pipeline feature chains two models for security analysis:
- Triage - a fast/cheap model (e.g., Llama 3 via Ollama) scans the code and flags suspicious areas
- Verify - a strong model (e.g., GPT-5.4) does deep analysis only on flagged sections
This cuts cost by 70-85% compared to running the full file through the expensive model, with minimal accuracy loss.
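The two stages can be sketched with stubbed models. The real implementation in pipeline.py goes through the provider abstraction; the names and the "line ranges" triage format here are illustrative:

```python
import asyncio

class StubModel:
    """Stand-in for a provider; the real call goes through BaseLLMProvider.complete."""
    def __init__(self, reply: str):
        self.reply = reply
    async def complete(self, prompt: str) -> str:
        return self.reply

async def triage_then_verify(code: str, cheap: StubModel, strong: StubModel) -> list[str]:
    # Stage 1: the cheap model flags suspicious line ranges, e.g. "2-3,7-9"
    flagged = await cheap.complete(f"Flag suspicious line ranges:\n{code}")
    lines = code.splitlines()
    findings = []
    for span in flagged.split(","):
        lo, hi = (int(x) for x in span.split("-"))
        snippet = "\n".join(lines[lo - 1:hi])
        # Stage 2: the strong model analyzes only the flagged snippet
        findings.append(await strong.complete(f"Deep security analysis:\n{snippet}"))
    return findings
```

Because the strong model only ever sees flagged snippets, its token spend scales with the number of findings rather than the size of the file.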
Configuration Hierarchy
- CLI flags (--provider, --output)
- Environment variables (OPENAI_API_KEY, CODESIGHT_MODEL)
- Config file (~/.codesight/config.json)
- Defaults (OpenAI, GPT-5.4, markdown output)