BugViper exposes two complementary search modes for indexed repositories: full-text search for finding exact identifiers and keywords, and semantic search for finding code by intent. Both modes query your Neo4j knowledge graph directly — no code needs to leave your environment, and results are anchored to precise line numbers in the original source.
Full-text search
Full-text search is backed by Apache Lucene indexes inside Neo4j — the same engine that powers Elasticsearch. When you submit a query, BugViper runs a three-tier strategy to find the best results:
Tier 1 — Lucene fulltext index on symbols
The first tier queries a code_search Lucene index built over Function, Class, and Variable nodes, searching across three fields: name, docstring, and source_code.
| Index | Node types | Fields searched |
|---|---|---|
| code_search | Function, Class, Variable | name, docstring, source_code |
| file_content_search | File | source_code |
For clean identifiers (like parse_unified_diff), BugViper uses phrase search to find the exact symbol instantly. For queries containing special characters (like "Authorization: Bearer"), the query is tokenised and joined with AND so Lucene can match across fields.
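The query-shaping rule described above can be sketched in a few lines of Python. This is an illustration only: `build_lucene_query` and the identifier pattern are hypothetical, and BugViper's actual tokenisation rules may differ.

```python
import re

# Assumption: a "clean identifier" is one word of letters, digits, and underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_lucene_query(raw: str) -> str:
    """Shape a user query for a Lucene fulltext index (hypothetical helper)."""
    text = raw.strip()
    if _IDENTIFIER.fullmatch(text):
        # Clean identifier: quote it so Lucene treats it as a phrase.
        return f'"{text}"'
    # Otherwise drop the special characters, keep the word-like tokens,
    # and require all of them by joining with AND.
    tokens = re.findall(r"[A-Za-z0-9_]+", text)
    return " AND ".join(tokens)


print(build_lucene_query("parse_unified_diff"))     # "parse_unified_diff"
print(build_lucene_query("Authorization: Bearer"))  # Authorization AND Bearer
```

A string shaped this way could then be passed as the query argument to Neo4j's `db.index.fulltext.queryNodes` procedure, for example `CALL db.index.fulltext.queryNodes("code_search", $q) YIELD node, score`.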
Tier 2 — Name CONTAINS fallback
If Tier 1 returns no results — for example, when a symbol hasn’t been indexed with a docstring and the name doesn’t phrase-match — BugViper falls back to a name CONTAINS query against the primary identifier extracted from your search string. This catches partial matches that Lucene misses.
Tier 3 — File content line search
If both symbol-level tiers return empty results, BugViper searches raw file content line-by-line using the file_content_search Lucene index (with a source_code CONTAINS fallback). This tier returns individual matching lines with their path and line number rather than full symbol bodies, keeping the response lean.
Your query
│
├─► Tier 1 — Lucene fulltext (name, docstring, source_code)
│     Clean identifiers → phrase search
│     Special characters → AND-keyword tokenisation
│
├─► Tier 2 — Name CONTAINS fallback (if Tier 1 empty)
│
└─► Tier 3 — File content line-by-line (if Tier 2 empty)
      Returns: path + line_number + matching line
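The fallback order in the diagram amounts to "try each tier in turn and stop at the first one that returns anything". A minimal sketch of that control flow, assuming each tier is a plain function that takes the query and returns a list of result dictionaries (the tier functions and their data below are stand-ins, not BugViper internals):

```python
from typing import Callable, Dict, List, Sequence

Result = Dict[str, object]
Tier = Callable[[str], List[Result]]


def tiered_search(query: str, tiers: Sequence[Tier]) -> List[Result]:
    """Run each tier in order and return the first non-empty result list."""
    for tier in tiers:
        results = tier(query)
        if results:
            return results
    return []


# Stand-in tiers with made-up data; real tiers would query Neo4j.
def lucene_symbols(query: str) -> List[Result]:
    return []  # pretend the fulltext index found nothing


def name_contains(query: str) -> List[Result]:
    return []  # pretend no symbol name contains the query


def file_lines(query: str) -> List[Result]:
    return [{"type": "line", "path": "src/embedder.py", "line_number": 42}]


hits = tiered_search("Bearer", [lucene_symbols, name_contains, file_lines])
print(hits[0]["type"])  # line
```

Because later tiers run only when earlier ones come back empty, a cheap exact match never pays for the slower line-by-line scan.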
Search result fields
Every result from the /api/v1/query/search endpoint includes:
| Field | Description |
|---|---|
| type | function, class, variable, or line |
| name | Symbol name (empty for line-level results) |
| path | Repo-relative file path |
| line_number | Line in the source file |
| score | Lucene relevance score (symbol results score higher than line results) |
Symbol results (functions, classes, variables) are returned before line-level matches because they carry higher Lucene scores.
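On the client side, these fields map naturally onto a small typed record. The sketch below assumes the endpoint's results arrive as a list of JSON objects carrying exactly the fields in the table; the real response envelope may differ, and the sample payload is made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class SearchResult:
    type: str
    name: str
    path: str
    line_number: int
    score: float


def parse_results(payload: List[Dict[str, Any]]) -> List[SearchResult]:
    """Convert raw result objects into typed records, highest score first."""
    results = [
        SearchResult(
            type=item["type"],
            name=item.get("name", ""),  # empty for line-level results
            path=item["path"],
            line_number=int(item["line_number"]),
            score=float(item["score"]),
        )
        for item in payload
    ]
    return sorted(results, key=lambda r: r.score, reverse=True)


# Made-up sample payload, not real BugViper output.
sample = [
    {"type": "line", "name": "", "path": "src/api.py", "line_number": 7, "score": 0.4},
    {"type": "function", "name": "parse_unified_diff", "path": "src/diff.py",
     "line_number": 12, "score": 3.1},
]
ordered = parse_results(sample)
print([r.type for r in ordered])  # ['function', 'line']
```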
The peek viewer
Line-level search results return only the matching line — not a full file dump. To read context around any result, use the /api/v1/query/code-finder/peek endpoint, which returns a configurable window of lines above and below the anchor line. The anchor line is flagged with is_anchor: true so your UI can highlight it.
GET /api/v1/query/code-finder/peek
?path=src/embedder.py
&line=42
&above=10
&below=10
You can expand or collapse the window (up to 200 lines in either direction) without re-fetching the whole file. This keeps responses fast regardless of file size.
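A small helper can build the peek request and keep the window inside the documented 200-line limit. This is a sketch under assumptions: the host `http://localhost:8000` is a placeholder, clamping on the client is a design choice rather than documented server behaviour, and `find_anchor` assumes the response contains a list of line objects that each may carry `is_anchor`.

```python
from urllib.parse import urlencode

PEEK_PATH = "/api/v1/query/code-finder/peek"
MAX_WINDOW = 200  # documented upper bound in either direction


def build_peek_url(base_url: str, path: str, line: int,
                   above: int = 10, below: int = 10) -> str:
    """Build a peek request URL, clamping the window to 0..200 lines."""
    params = {
        "path": path,
        "line": line,
        "above": max(0, min(above, MAX_WINDOW)),
        "below": max(0, min(below, MAX_WINDOW)),
    }
    return f"{base_url.rstrip('/')}{PEEK_PATH}?{urlencode(params)}"


def find_anchor(lines):
    """Return the entry flagged is_anchor, or None if no entry is flagged."""
    return next((ln for ln in lines if ln.get("is_anchor")), None)


# Placeholder host; an oversized `above` value is clamped to 200.
print(build_peek_url("http://localhost:8000", "src/embedder.py", 42, above=500))
```

Widening the window is then just another call with larger `above` and `below` values for the same `path` and `line`.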
Semantic search
Semantic search lets you find code by intent rather than exact identifier. Instead of matching keywords, BugViper embeds your query using text-embedding-3-small (via OpenRouter) and queries the Neo4j vector indexes using cosine similarity.
Semantic search requires that embeddings were enabled when the repository was indexed. If semantic search returns no results, the repository may not have been ingested with the embedding option enabled.
For example, querying "embedding model configuration" returns nodes like EmbeddingModelName, RoastResponse, and other conceptually related code at similarity scores of 73%, 69%, and 68% — even though none of those symbol names contain the words “embedding model configuration” verbatim.
Results are ranked by similarity score and capped at 10. Each result includes:
| Field | Description |
|---|---|
| name | Symbol name |
| type | function, class, variable, or unknown |
| path | File path |
| line_number | Line in the source file |
| source_code | Full body of the matching symbol |
| docstring | Docstring if present |
| score | Cosine similarity score (0.0–1.0) |
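To make the ranking concrete, here is a toy version of cosine-similarity search with a cap of 10 results. It is a sketch of the general technique, not BugViper's implementation: in BugViper the comparison happens inside the Neo4j vector index, and the symbol names and 3-dimensional vectors below are invented for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query_vec: Sequence[float],
                candidates: Dict[str, Sequence[float]],
                limit: int = 10) -> List[Tuple[str, float]]:
    """Score every candidate against the query and keep the best `limit`."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors; real embeddings have far more dimensions.
candidates = {
    "config_loader": [0.9, 0.1, 0.0],
    "http_client": [0.0, 0.2, 0.9],
    "model_name": [0.8, 0.3, 0.1],
}
ranked = top_matches([1.0, 0.2, 0.0], candidates)
print([name for name, _ in ranked])  # ['config_loader', 'model_name', 'http_client']
```

The ranking depends only on the direction of the vectors, which is why a query can match a symbol whose name shares no words with it.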
Choosing between full-text and semantic search
Use full-text when
- You know the name of a function, class, or variable
- You’re searching for a specific string literal or pattern (e.g., an API endpoint path or a configuration key)
- You want exact matches ranked by relevance score
- You’re looking for where a specific error message or log string appears
Use semantic search when
- You don’t know the exact symbol name but know what the code should do
- You’re exploring unfamiliar codebases for functionality related to a concept
- You want to find code that implements a pattern, not just code that mentions a keyword
- You’re asking a question like “where does this codebase handle authentication retries?”
Both modes are available from the Query tab in the BugViper dashboard or directly via the REST API.