Architecture
Internal structure of CodeSight.
Module Overview
codesight/
├── __init__.py # version, metadata
├── __main__.py # python -m codesight entry point
├── cli.py # argparse CLI, output formatting
├── config.py # AppConfig, ProviderConfig, load/save
├── analyzer.py # core analysis engine, system prompts
├── compression.py # code map builder, token reduction
├── streaming.py # streaming output for all providers
├── templates.py # custom prompt template management
├── cost.py # token cost estimation per model
├── pipeline.py # multi-model triage -> verify pipeline
├── benchmark.py # LLM benchmark on vulnerable samples
├── sarif.py # SARIF output generation
└── providers/
    ├── base.py # BaseLLMProvider, Message, LLMResponse
    ├── factory.py # provider registry and instantiation
    ├── openai_provider.py
    ├── anthropic_provider.py
    ├── google_provider.py
    ├── ollama_provider.py
    └── custom_provider.py # OpenAI-compatible adapter (OpenRouter, Groq, Azure, etc.)
Data Flow
Every analysis follows the same path:
- CLI parses the command and loads config
- Analyzer reads the source file and validates size
- Compression builds a code map if the file exceeds 300 lines
- System prompt is selected based on the task (review/bugs/security/etc.)
- Provider sends the prompt + code to the LLM API
- Response is parsed, formatted (markdown/json/sarif), and printed
- Cost is estimated from token usage
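The steps above can be condensed into a sketch. The function name, prompt table, and compression stand-in here are illustrative, not CodeSight's actual API:

```python
COMPRESSION_THRESHOLD = 300  # lines; matches the threshold described above

SYSTEM_PROMPTS = {  # hypothetical task -> prompt table
    "review": "You are a meticulous code reviewer.",
    "security": "You are a security auditor.",
}

def prepare_request(source: str, task: str) -> dict:
    """Select a system prompt and compress large files before the LLM call."""
    lines = source.splitlines()
    if len(lines) > COMPRESSION_THRESHOLD:
        # Stand-in for the code-map builder: keep imports and signatures only
        lines = [l for l in lines if l.lstrip().startswith(
            ("import ", "from ", "def ", "class "))]
    return {"system": SYSTEM_PROMPTS[task], "code": "\n".join(lines)}
```

The resulting dict is what the provider layer would turn into an API request; parsing, formatting, and cost estimation happen after the response comes back.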
Provider Abstraction
All providers implement BaseLLMProvider:
class BaseLLMProvider(ABC):
    @abstractmethod
    async def complete(self, messages, max_tokens, temperature) -> LLMResponse: ...
    @abstractmethod
    async def health_check(self) -> bool: ...
    @property
    @abstractmethod
    def name(self) -> str: ...
Adding a new provider means implementing these three methods and registering the class in factory.py. The custom_provider.py module handles any OpenAI-compatible endpoint: factory.create_provider falls back to CustomProvider when a ProviderConfig has a base_url but an unknown provider name. This is how OpenRouter, Groq, Together, xAI, Azure AI Foundry, and others run without dedicated modules.
Compression
For files over 300 lines, the compression module extracts:
- Import statements
- Function/class/method signatures
- Structural outline with line numbers
This cuts token usage by 60-80% while keeping enough structural context for accurate analysis.
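A rough sketch of such an extractor for Python sources, using the standard ast module (the real builder may work differently and likely supports more languages):

```python
import ast

def build_code_map(source: str) -> str:
    """Extract imports and function/class signatures, tagged with line numbers."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            entries.append((node.lineno, ast.unparse(node)))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # The first line of the unparsed node is its signature
            entries.append((node.lineno, ast.unparse(node).splitlines()[0]))
    return "\n".join(f"L{lineno}: {text}" for lineno, text in sorted(entries))
```

Function bodies are dropped entirely; the line numbers let the model refer back to exact locations in the original file.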
Multi-Model Pipeline
The pipeline feature chains two models for security analysis:
- Triage - a fast/cheap model (e.g., Llama 3 via Ollama) scans the code and flags suspicious areas
- Verify - a strong model (e.g., GPT-5.4) does deep analysis only on flagged sections
This cuts cost by 70-85% compared to running the full file through the expensive model, with minimal accuracy loss.
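The two stages can be sketched with stubbed models. The real implementation in pipeline.py goes through the provider abstraction; the names and the "line ranges" triage format here are illustrative:

```python
import asyncio

class StubModel:
    """Stand-in for a provider; the real call goes through BaseLLMProvider.complete."""
    def __init__(self, reply: str):
        self.reply = reply
    async def complete(self, prompt: str) -> str:
        return self.reply

async def triage_then_verify(code: str, cheap: StubModel, strong: StubModel) -> list[str]:
    # Stage 1: the cheap model flags suspicious line ranges, e.g. "2-3,7-9"
    flagged = await cheap.complete(f"Flag suspicious line ranges:\n{code}")
    lines = code.splitlines()
    findings = []
    for span in flagged.split(","):
        lo, hi = (int(x) for x in span.split("-"))
        snippet = "\n".join(lines[lo - 1:hi])
        # Stage 2: the strong model analyzes only the flagged snippet
        findings.append(await strong.complete(f"Deep security analysis:\n{snippet}"))
    return findings
```

Because the strong model only ever sees flagged snippets, its token spend scales with the number of findings rather than the size of the file.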
Configuration Hierarchy
- CLI flags (--provider, --output)
- Environment variables (OPENAI_API_KEY, CODESIGHT_MODEL)
- Config file (~/.codesight/config.json)
- Defaults (OpenAI, GPT-5.4, markdown output)