Semantic code search — BugViper API

Semantic search lets you describe what you are looking for in plain English instead of supplying an exact identifier. BugViper embeds your question with the same embedding model used during ingestion, then queries the Neo4j vector indexes for the code nodes whose embeddings are closest to the query vector. No large language model is involved: results are ranked purely by cosine similarity. The ranking therefore reflects how close each node's embedding is to your question, not an AI-generated interpretation of it.
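As a rough illustration of the ranking step described above, the sketch below scores a few toy vectors against a query vector by cosine similarity and keeps the best matches. The vectors, node names, and the helpers `cosine_similarity` and `rank_nodes` are hypothetical. In BugViper the vectors come from the embedding model and the lookup is done by the Neo4j vector index, not by a brute-force loop like this one.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_nodes(query_vector, nodes, top_k=10):
    """Return (name, similarity) pairs, most similar first, capped at top_k."""
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in nodes.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional "embeddings" (real embeddings have far more dimensions).
nodes = {
    "embed_texts": [0.9, 0.1, 0.0],
    "parse_config": [0.1, 0.9, 0.0],
    "render_page": [0.0, 0.1, 0.9],
}
ranking = rank_nodes([1.0, 0.0, 0.0], nodes, top_k=2)
for name, similarity in ranking:
    print(f"{similarity:.2f}  {name}")
```

Raw cosine similarity can be negative for vectors that point in opposite directions; the `score` field returned by the API is documented as lying between 0.0 and 1.0.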

POST /api/v1/query/semantic

Embeds a natural language question and returns up to 10 of the most similar code nodes from the knowledge graph.

question

string

required

Natural language description of what you are looking for. For example: "function that calculates cyclomatic complexity" or "class responsible for embedding model configuration".

repoOwner

string

Filter results to a specific repository owner. Must be combined with repoName.

repoName

string

Filter results to a specific repository name. Must be combined with repoOwner.
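The parameter rules above (a required `question`, and `repoOwner` and `repoName` only as a pair) can be checked on the client before sending a request. The following is a minimal sketch; the helper name `build_semantic_payload` is hypothetical, and how the server itself responds to a lone `repoOwner` or `repoName` is not stated in this page.

```python
def build_semantic_payload(question, repo_owner=None, repo_name=None):
    """Build the JSON body for POST /api/v1/query/semantic.

    Hypothetical client-side helper; it mirrors the documented rules only.
    """
    if not question or not question.strip():
        raise ValueError("question is required")
    # The two repository filters are documented as valid only in combination.
    if (repo_owner is None) != (repo_name is None):
        raise ValueError("repoOwner and repoName must be provided together")

    payload = {"question": question}
    if repo_owner is not None:
        payload["repoOwner"] = repo_owner
        payload["repoName"] = repo_name
    return payload


print(build_semantic_payload("embedding model configuration", "acme-corp", "my-api"))
```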

Response

results

object[]

Up to 10 code nodes ranked by cosine similarity score (highest first).

name

string

Symbol name (function, class, or variable name), or null if not available.

type

string

Node type, e.g. "function", "class", "variable", or "unknown".

path

string

Repo-relative file path where the symbol is defined.

line_number

number

Line number where the symbol starts (1-indexed), or null if not stored.

source_code

string

Full source code of the matched symbol.

docstring

string

Docstring extracted during ingestion, or null if none was found.

score

number

Cosine similarity score between 0.0 and 1.0. Higher values indicate greater semantic similarity.

total

number

Number of results returned. Maximum is 10.

Example

curl -X POST https://your-bugviper-instance/api/v1/query/semantic \
  -H "Authorization: Bearer YOUR_FIREBASE_ID_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "embedding model configuration",
    "repoOwner": "acme-corp",
    "repoName": "my-api"
  }'

{
  "results": [
    {
      "name": "embed_texts",
      "type": "function",
      "path": "common/embedder.py",
      "line_number": 12,
      "source_code": "def embed_texts(texts: list[str]) -> list[list[float]]:\n    ...",
      "docstring": "Embed a list of texts using the configured OpenRouter model.",
      "score": 0.73
    },
    {
      "name": "embed_nodes_in_neo4j",
      "type": "function",
      "path": "common/embedder.py",
      "line_number": 48,
      "source_code": "def embed_nodes_in_neo4j(client: Neo4jClient) -> dict[str, int]:\n    ...",
      "docstring": null,
      "score": 0.69
    },
    {
      "name": "SemanticInput",
      "type": "class",
      "path": "api/models/semantic.py",
      "line_number": 6,
      "source_code": "class SemanticInput(BaseModel):\n    question: str\n    repoName: Optional[str] = None\n    repoOwner: Optional[str] = None",
      "docstring": null,
      "score": 0.68
    }
  ],
  "total": 3
}
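A response like the one above can be consumed with a few lines of client code. The sketch below assumes only the documented fields; the helper `summarize_results`, the `min_score` cutoff, and the sample data are illustrative and not part of the API. It shows one way to handle the fields that may be null (`name`, `line_number`).

```python
import json


def summarize_results(response_body, min_score=0.0):
    """Return one display line per result whose score is at least min_score."""
    data = json.loads(response_body)
    lines = []
    for result in data.get("results", []):
        if result["score"] < min_score:
            continue
        name = result.get("name") or "<unnamed>"  # name may be null
        line_number = result.get("line_number")   # line_number may be null
        if line_number is None:
            location = result["path"]
        else:
            location = f"{result['path']}:{line_number}"
        lines.append(f"{result['score']:.2f}  {result['type']}  {name}  ({location})")
    return lines


# Illustrative response body; the second entry exercises the null handling.
sample = json.dumps({
    "results": [
        {"name": "embed_texts", "type": "function", "path": "common/embedder.py",
         "line_number": 12, "source_code": "...", "docstring": "...", "score": 0.73},
        {"name": None, "type": "unknown", "path": "api/models/semantic.py",
         "line_number": None, "source_code": "...", "docstring": None, "score": 0.41},
    ],
    "total": 2,
})

for line in summarize_results(sample, min_score=0.5):
    print(line)
```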

Semantic search requires embeddings to be generated for the repository. If you get zero results, run POST /api/v1/ingest/{owner}/{repo_name}/embed first to generate embeddings. See the ingest endpoints for details.
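The zero-results fallback described in this note could be sketched as follows. The function `search_with_embed_fallback` and the injected `post(path, payload)` callable are hypothetical: `post` stands in for whatever authenticated HTTP client you use and is expected to return the decoded JSON response. The sketch also assumes that the embed call has finished by the time it returns, which this page does not confirm; if embedding runs in the background, the retry would need to wait.

```python
SEMANTIC_PATH = "/api/v1/query/semantic"


def search_with_embed_fallback(post, owner, repo, question):
    """Query once; on zero results, trigger embedding and query again.

    `post(path, payload)` is a caller-supplied function (an assumption of
    this sketch) that sends an authenticated POST and returns parsed JSON.
    """
    payload = {"question": question, "repoOwner": owner, "repoName": repo}
    response = post(SEMANTIC_PATH, payload)
    if response.get("total", 0) > 0:
        return response

    # No matches: generate embeddings for the repository, then retry once.
    post(f"/api/v1/ingest/{owner}/{repo}/embed", None)
    return post(SEMANTIC_PATH, payload)
```

Passing the HTTP call in as a parameter keeps the retry logic independent of any particular client library and makes it easy to exercise with a stub.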
