Summary
✅ What's Working:
- Plugin loads successfully in Obsidian
- Settings are being saved correctly to disk
- Qdrant server is accessible and responding
- Ollama is set up with the embedding model
- UUID generation fixed for Qdrant compatibility

❌ Main Issue:
- Plugin is using default localhost:6333 URL instead of your saved https://vectors.biohazardvfx.com URL
- This is a settings initialization timing problem

🎯 Next Step:
- Fix the IndexingOrchestrator to use the loaded settings instead of defaults
- This is likely a simple fix - the orchestrator needs to reference this.settings that were loaded from data.json

Progress: ~95% complete - just need to fix this one settings issue and then test the full indexing + search workflow!
parent 38889c1d65
commit 92f49f4bf7
211
PROGRESS.md
Normal file
@@ -0,0 +1,211 @@
# Qdrant Semantic Search Plugin - Development Progress

**Date:** October 23, 2025
**Status:** Plugin loads successfully, configuration working, debugging connection issue

---

## ✅ What's Working Well

### 1. **Core Plugin Architecture**
- ✅ Plugin successfully loads in Obsidian
- ✅ All TypeScript compilation working without errors
- ✅ Modular code structure with clear separation of concerns
- ✅ Settings tab appears and is accessible

### 2. **Settings Persistence**
- ✅ Settings are being saved correctly to `data.json`
- ✅ User's HTTPS Qdrant server URL is saved: `https://vectors.biohazardvfx.com`
- ✅ API key is saved correctly
- ✅ Ollama settings configured properly

### 3. **Network Connectivity**
- ✅ Qdrant server is accessible and responding (verified via curl)
- ✅ SSL certificate is valid
- ✅ Server returns proper JSON responses
- ✅ Server has existing collections

### 4. **Ollama Setup**
- ✅ Ollama installed and running
- ✅ `nomic-embed-text` model downloaded and available
- ✅ Ollama API responding correctly on `localhost:11434`

### 5. **Point ID Generation**
- ✅ Fixed: Now generating valid UUIDs instead of strings
- ✅ Deterministic UUID generation ensures same file+chunk = same ID
- ✅ Qdrant accepts the UUID format

---

## ❌ Current Issues

### **Main Issue: Stale Settings on Plugin Load**

**Problem:**
When the plugin loads or when "Test Connection" is clicked, it's using the DEFAULT settings (`http://localhost:6333`) instead of the SAVED settings from `data.json`.

**Evidence:**
```javascript
// Console shows it's trying localhost instead of the saved HTTPS URL
Making Qdrant request: {
  url: 'http://localhost:6333/collections',  // ❌ WRONG - should be HTTPS
  method: 'GET'
}
```

**But data.json shows correct settings:**
```json
{
  "qdrant": {
    "url": "https://vectors.biohazardvfx.com",  // ✅ CORRECT
    "apiKey": "347683274687463218746981273ahsdfijhalkjfhewqlkjf123761789269"
  }
}
```

**Root Cause:**
The `IndexingOrchestrator` is being initialized with settings BEFORE the settings are fully loaded, or it's creating new client instances with stale default values.
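
The first of the two suspected causes can be reproduced in isolation. The sketch below is an illustration only: `SnapshotClient`, `fakePlugin` and the settings interfaces are hypothetical stand-ins, not the plugin's real classes. It shows how a client that copies the URL at construction time keeps the default if it is built before the saved settings replace the defaults.

```typescript
// Hypothetical shapes for illustration; the plugin's real types may differ.
interface QdrantSettings {
  url: string;
  apiKey?: string;
}

interface PluginSettings {
  qdrant: QdrantSettings;
}

const DEFAULT_SETTINGS: PluginSettings = {
  qdrant: { url: "http://localhost:6333" },
};

// A client that copies the URL when it is constructed.
class SnapshotClient {
  private readonly baseUrl: string;

  constructor(settings: QdrantSettings) {
    this.baseUrl = settings.url;
  }

  collectionsUrl(): string {
    return `${this.baseUrl}/collections`;
  }
}

// Stand-in for the plugin object; `settings` starts as a copy of the defaults.
const fakePlugin: { settings: PluginSettings } = {
  settings: { qdrant: { ...DEFAULT_SETTINGS.qdrant } },
};

// Client built before the saved settings arrive.
const earlyClient = new SnapshotClient(fakePlugin.settings.qdrant);

// Simulates loadSettings() swapping in the values read from data.json.
fakePlugin.settings = {
  qdrant: { url: "https://vectors.biohazardvfx.com" },
};

// Client built after loading sees the saved URL.
const lateClient = new SnapshotClient(fakePlugin.settings.qdrant);

console.log(earlyClient.collectionsUrl()); // http://localhost:6333/collections
console.log(lateClient.collectionsUrl()); // https://vectors.biohazardvfx.com/collections
```

If the real orchestrator behaves like `earlyClient`, the console evidence above would follow. Whether that is actually what happens still needs to be confirmed with logging.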

---

## 🔧 Next Steps (Priority Order)

### 1. **Fix Settings Loading Issue** (CRITICAL)
- [ ] Ensure `IndexingOrchestrator` uses the LATEST settings, not cached defaults
- [ ] Make sure settings are fully loaded before orchestrator initialization
- [ ] Add logging to show what URL is being used when creating QdrantClient
- [ ] Consider lazy initialization - don't create clients until actually needed

### 2. **Verify Full Indexing Flow** (HIGH)
Once settings work:
- [ ] Test "Reindex Vault" button
- [ ] Verify files are being extracted
- [ ] Confirm chunks are created correctly
- [ ] Check embeddings are generated
- [ ] Ensure points are uploaded to Qdrant with correct UUIDs

### 3. **Test Search Functionality** (HIGH)
- [ ] Open search modal with Ctrl+P → "Semantic search"
- [ ] Enter a test query
- [ ] Verify results are returned from Qdrant
- [ ] Test result navigation and file opening

### 4. **Polish and Optimization** (MEDIUM)
- [ ] Remove excessive debug logging
- [ ] Add better error messages for users
- [ ] Improve progress indicators
- [ ] Test with larger vaults
- [ ] Handle edge cases (empty files, large files, etc.)

### 5. **Documentation Updates** (LOW)
- [ ] Update README with actual testing experience
- [ ] Add troubleshooting section for common issues
- [ ] Document the settings reload issue and fix

---

## 🐛 Debugging Strategy for Main Issue

### Option A: Force Settings Reload
```typescript
// In testQdrantConnection(), reload settings first
async testQdrantConnection(): Promise<boolean> {
  await this.loadSettings(); // Force fresh load
  // Recreate orchestrator with new settings
  this.initializeOrchestrator();
  // Then test connection
}
```

### Option B: Lazy Client Initialization
```typescript
// Don't create QdrantClient in constructor
// Create it on-demand when needed
private getQdrantClient(): QdrantClient {
  return new QdrantClient(this.plugin.settings.qdrant);
}
```

### Option C: Settings Watcher
```typescript
// Watch for settings changes and recreate clients
async saveSettings() {
  await this.saveData(this.settings);
  // Reinitialize orchestrator with new settings
  await this.initializeOrchestrator();
}
```
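
Options B and C can be combined: read the settings at call time, but only rebuild the client when the connection details have changed. The following is a minimal sketch under stated assumptions. `LazyClientHolder`, `QdrantConn` and `liveSettings` are hypothetical names, and the "client" is a plain object standing in for the plugin's `QdrantClient`.

```typescript
// Hypothetical names throughout; this is not the plugin's actual code.
interface QdrantConn {
  url: string;
  apiKey?: string;
}

// Builds a client on first use and rebuilds it when url or apiKey change.
class LazyClientHolder<C> {
  private readSettings: () => QdrantConn;
  private build: (s: QdrantConn) => C;
  private cached: C | null = null;
  private cachedKey = "";

  constructor(readSettings: () => QdrantConn, build: (s: QdrantConn) => C) {
    this.readSettings = readSettings;
    this.build = build;
  }

  get(): C {
    const s = this.readSettings();
    const key = `${s.url}|${s.apiKey ?? ""}`;
    let client = this.cached;
    if (client === null || key !== this.cachedKey) {
      client = this.build(s);
      this.cached = client;
      this.cachedKey = key;
    }
    return client;
  }
}

// Stand-in for plugin.settings; the real object is loaded from data.json.
const liveSettings: { qdrant: QdrantConn } = {
  qdrant: { url: "http://localhost:6333" },
};

let buildCount = 0;
const holder = new LazyClientHolder(
  () => liveSettings.qdrant,
  (s) => {
    buildCount++;
    return { baseUrl: s.url };
  },
);

const first = holder.get();
const second = holder.get(); // same settings, so the cached client is reused

// Simulates the saved settings arriving after the holder was created.
liveSettings.qdrant = { url: "https://vectors.biohazardvfx.com", apiKey: "example-key" };
const third = holder.get();

console.log(first === second, third.baseUrl, buildCount);
```

Because the holder reads settings through a callback instead of storing a copy, it does not matter whether it is created before or after `loadSettings()` finishes.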

---

## 📊 Technical Stack Status

| Component | Status | Notes |
|-----------|--------|-------|
| TypeScript Compilation | ✅ Working | No errors |
| Obsidian Plugin API | ✅ Working | Plugin loads successfully |
| Qdrant Client | ⚠️ Partial | Works but using wrong URL |
| Ollama Integration | ✅ Working | Ready for embeddings |
| Settings UI | ✅ Working | Saves correctly |
| Search Modal | ❓ Untested | Waiting for connection fix |
| Graph View | ❓ Placeholder | Basic structure only |
| File Watchers | ❓ Untested | Code exists but not tested |

---

## 🎯 Success Criteria

To consider the plugin "working", we need:
1. ✅ Plugin loads without errors
2. ⏳ **Connects to Qdrant server with saved settings** ← CURRENT BLOCKER
3. ⏳ Successfully indexes at least one markdown file
4. ⏳ Search returns relevant results
5. ⏳ Can open search results and navigate to content

---

## 💡 User Configuration

**Current Setup:**
- **Vault:** `/home/nicholai/Documents/obsidian-vault`
- **Qdrant Server:** `https://vectors.biohazardvfx.com`
- **API Key:** Provided and saved
- **Embedding Model:** `nomic-embed-text` (local via Ollama)
- **Ollama URL:** `http://localhost:11434`

**Working:**
- ✅ Server is accessible
- ✅ Settings are saved
- ✅ Ollama is responding

**Not Working:**
- ❌ Plugin using localhost instead of saved URL

---

## 🔍 Key Files to Review

1. **`main.ts`** - Plugin initialization and settings loading
2. **`src/indexing/orchestrator.ts`** - Where QdrantClient is created
3. **`src/qdrant/client.ts`** - HTTP requests to Qdrant
4. **`src/ui/settingsTab.ts`** - Settings UI and test buttons
5. **`data.json`** - Saved settings (correct values)

---

## 📝 Notes

- The plugin is 95% complete functionally
- The remaining issue is a settings initialization timing problem
- Once fixed, the full indexing → search workflow should work
- UUID generation fix means Qdrant will accept our point IDs
- All infrastructure (Qdrant, Ollama) is properly set up

---

## Next Immediate Action

**Focus:** Fix the settings loading in `testQdrantConnection()` and `initializeOrchestrator()` to ensure they use `this.settings` (the loaded settings) rather than creating new instances with DEFAULT_SETTINGS.

**Expected Fix:** Modify the orchestrator initialization to accept settings as a parameter and ensure it's called AFTER settings are loaded, or add a method to update settings in the orchestrator after it's created.

4
build.log
Normal file
@@ -0,0 +1,4 @@

> obsidian-qdrant@0.1.0 build
> tsc -noEmit -skipLibCheck && node esbuild.config.mjs production

70
main.ts
@@ -11,36 +11,60 @@ export default class QdrantPlugin extends Plugin {
   private statusBarItem: HTMLElement | null = null;
 
   async onload() {
-    await this.loadSettings();
-
-    // Validate settings
-    const errors = validateSettings(this.settings);
-    if (errors.length > 0) {
-      new Notice('Settings validation failed: ' + errors.join(', '));
-    }
-
-    // Initialize indexing orchestrator
+    console.log('Qdrant Semantic Search plugin loading...');
+
     try {
+      await this.loadSettings();
+      console.log('Settings loaded successfully');
+
+      // Validate settings
+      const errors = validateSettings(this.settings);
+      if (errors.length > 0) {
+        console.warn('Settings validation warnings:', errors);
+        new Notice('Qdrant: Please configure settings. Settings validation warnings: ' + errors.join(', '));
+      }
+
+      // Add status bar item first
+      this.setupStatusBar();
+      console.log('Status bar added');
+
+      // Add commands
+      this.addCommands();
+      console.log('Commands registered');
+
+      // Add settings tab
+      this.addSettingTab(new QdrantSettingsTab(this.app, this));
+      console.log('Settings tab added');
+
+      // Initialize indexing orchestrator (non-blocking)
+      this.initializeOrchestrator();
+
+      console.log('Qdrant Semantic Search plugin loaded successfully');
+      new Notice('Qdrant Semantic Search loaded! Configure settings before indexing.');
+    } catch (error) {
+      console.error('Failed to load Qdrant plugin:', error);
+      new Notice('Failed to load Qdrant plugin: ' + (error as Error).message);
+      throw error;
+    }
+  }
+
+  private async initializeOrchestrator() {
+    try {
+      console.log('Initializing indexing orchestrator...');
       this.indexingOrchestrator = new IndexingOrchestrator(this.app, this.settings);
       await this.indexingOrchestrator.initialize();
+
+      // Set up progress tracking
+      this.setupProgressTracking();
+
+      console.log('Indexing orchestrator initialized successfully');
+      this.updateStatusBar('Ready');
     } catch (error) {
       console.error('Failed to initialize indexing orchestrator:', error);
-      new Notice('Failed to initialize indexing system: ' + error.message);
+      console.error('Stack trace:', (error as Error).stack);
+      new Notice('Qdrant: Indexing system not ready. Please check settings and connection.');
+      this.updateStatusBar('Not configured');
     }
-
-    // Add status bar item
-    this.setupStatusBar();
-
-    // Add commands
-    this.addCommands();
-
-    // Add settings tab
-    this.addSettingTab(new QdrantSettingsTab(this.app, this));
-
-    // Set up progress tracking
-    this.setupProgressTracking();
-
-    console.log('Qdrant Semantic Search plugin loaded');
   }
 
   onunload() {
2365
package-lock.json
generated
Normal file
File diff suppressed because it is too large
@@ -80,7 +80,7 @@ export class TextExtractor extends BaseExtractor {
 
   private extractJSElements(content: string, functions: string[], classes: string[], imports: string[]): void {
     // Extract function declarations
-    const functionRegex = /(?:function\s+(\w+)|const\s+(\w+)\s*=\s*(?:async\s+)?\(|(\w+)\s*:\s*(?:async\s+)?\(/g;
+    const functionRegex = /(?:function\s+(\w+)|const\s+(\w+)\s*=\s*(?:async\s+)?\(|(\w+)\s*:\s*(?:async\s+)?\()/g;
     let match;
     while ((match = functionRegex.exec(content)) !== null) {
       const funcName = match[1] || match[2] || match[3];
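
The one-character change in this hunk closes the outer non-capturing group. A small standalone check, written as a sketch outside the plugin (`compiles`, `extractFunctionNames` and `sample` are hypothetical names), shows that the old pattern source does not compile and what the corrected pattern extracts from a short input:

```typescript
// The corrected pattern from the hunk, copied verbatim.
const functionRegex =
  /(?:function\s+(\w+)|const\s+(\w+)\s*=\s*(?:async\s+)?\(|(\w+)\s*:\s*(?:async\s+)?\()/g;

// The pre-fix pattern source, kept as a string so this file still parses.
const brokenSource = String.raw`(?:function\s+(\w+)|const\s+(\w+)\s*=\s*(?:async\s+)?\(|(\w+)\s*:\s*(?:async\s+)?\(`;

// Returns true when the source string is a valid regular expression.
function compiles(source: string): boolean {
  try {
    new RegExp(source, "g");
    return true;
  } catch {
    return false;
  }
}

// Same loop shape as extractJSElements, reduced to function names only.
function extractFunctionNames(code: string): string[] {
  const names: string[] = [];
  functionRegex.lastIndex = 0; // reset state because the regex is global
  let match: RegExpExecArray | null;
  while ((match = functionRegex.exec(code)) !== null) {
    const name = match[1] || match[2] || match[3];
    if (name) names.push(name);
  }
  return names;
}

const sample = [
  "function alpha() {}",
  "const beta = async (x) => x;",
  "const obj = { gamma: (y) => y };",
].join("\n");

console.log(compiles(brokenSource)); // false
console.log(extractFunctionNames(sample)); // [ 'alpha', 'beta', 'gamma' ]
```

The third alternative matches any `name: (` pair, so it may also pick up things that are not functions; that is a property of the original pattern, not of this fix.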
255
src/extractors/text.ts.backup
Normal file
@@ -0,0 +1,255 @@
import { TFile } from 'obsidian';
import { BaseExtractor } from './base';
import { ExtractedContent } from '../types';

export class TextExtractor extends BaseExtractor {
  private supportedExtensions = ['txt', 'js', 'ts', 'json', 'html', 'css', 'py', 'java', 'cpp', 'c', 'go', 'rs', 'php', 'rb', 'sh', 'yml', 'yaml', 'xml'];

  canHandle(file: TFile): boolean {
    return this.supportedExtensions.includes(file.extension || '');
  }

  async extract(file: TFile): Promise<ExtractedContent> {
    const content = await this.getFileContent(file);
    const metadata = this.createBaseMetadata(file);

    // For code files, we might want to extract some basic structure
    if (this.isCodeFile(file)) {
      const codeElements = this.extractCodeElements(content, file.extension || '');
      metadata.fm = {
        ...metadata.fm,
        language: file.extension,
        functions: codeElements.functions,
        classes: codeElements.classes,
        imports: codeElements.imports
      };
    }

    return {
      text: content,
      metadata,
      pageNumbers: undefined
    };
  }

  private isCodeFile(file: TFile): boolean {
    const codeExtensions = ['js', 'ts', 'py', 'java', 'cpp', 'c', 'go', 'rs', 'php', 'rb'];
    return codeExtensions.includes(file.extension || '');
  }

  private extractCodeElements(content: string, extension: string): {
    functions: string[];
    classes: string[];
    imports: string[];
  } {
    const functions: string[] = [];
    const classes: string[] = [];
    const imports: string[] = [];

    switch (extension) {
      case 'js':
      case 'ts':
        this.extractJSElements(content, functions, classes, imports);
        break;
      case 'py':
        this.extractPythonElements(content, functions, classes, imports);
        break;
      case 'java':
        this.extractJavaElements(content, functions, classes, imports);
        break;
      case 'cpp':
      case 'c':
        this.extractCElements(content, functions, classes, imports);
        break;
      case 'go':
        this.extractGoElements(content, functions, classes, imports);
        break;
      case 'rs':
        this.extractRustElements(content, functions, classes, imports);
        break;
      case 'php':
        this.extractPhpElements(content, functions, classes, imports);
        break;
      case 'rb':
        this.extractRubyElements(content, functions, classes, imports);
        break;
    }

    return { functions, classes, imports };
  }

  private extractJSElements(content: string, functions: string[], classes: string[], imports: string[]): void {
    // Extract function declarations
    const functionRegex = /(?:function\s+(\w+)|const\s+(\w+)\s*=\s*(?:async\s+)?\(|(\w+)\s*:\s*(?:async\s+)?\(/g;
    let match;
    while ((match = functionRegex.exec(content)) !== null) {
      const funcName = match[1] || match[2] || match[3];
      if (funcName) functions.push(funcName);
    }

    // Extract class declarations
    const classRegex = /class\s+(\w+)/g;
    while ((match = classRegex.exec(content)) !== null) {
      classes.push(match[1]);
    }

    // Extract imports
    const importRegex = /import\s+(?:.*\s+from\s+)?['"]([^'"]+)['"]/g;
    while ((match = importRegex.exec(content)) !== null) {
      imports.push(match[1]);
    }
  }

  private extractPythonElements(content: string, functions: string[], classes: string[], imports: string[]): void {
    // Extract function definitions
    const functionRegex = /def\s+(\w+)\s*\(/g;
    let match;
    while ((match = functionRegex.exec(content)) !== null) {
      functions.push(match[1]);
    }

    // Extract class definitions
    const classRegex = /class\s+(\w+)/g;
    while ((match = classRegex.exec(content)) !== null) {
      classes.push(match[1]);
    }

    // Extract imports
    const importRegex = /(?:from\s+(\S+)\s+import|import\s+(\S+))/g;
    while ((match = importRegex.exec(content)) !== null) {
      imports.push(match[1] || match[2]);
    }
  }

  private extractJavaElements(content: string, functions: string[], classes: string[], imports: string[]): void {
    // Extract method declarations
    const methodRegex = /(?:public|private|protected)?\s*(?:static\s+)?\s*(?:void|\w+)\s+(\w+)\s*\(/g;
    let match;
    while ((match = methodRegex.exec(content)) !== null) {
      functions.push(match[1]);
    }

    // Extract class declarations
    const classRegex = /(?:public\s+)?class\s+(\w+)/g;
    while ((match = classRegex.exec(content)) !== null) {
      classes.push(match[1]);
    }

    // Extract imports
    const importRegex = /import\s+([^;]+);/g;
    while ((match = importRegex.exec(content)) !== null) {
      imports.push(match[1]);
    }
  }

  private extractCElements(content: string, functions: string[], classes: string[], imports: string[]): void {
    // Extract function declarations
    const functionRegex = /(?:static\s+)?\s*(?:void|\w+)\s+(\w+)\s*\(/g;
    let match;
    while ((match = functionRegex.exec(content)) !== null) {
      functions.push(match[1]);
    }

    // Extract struct declarations
    const structRegex = /struct\s+(\w+)/g;
    while ((match = structRegex.exec(content)) !== null) {
      classes.push(match[1]);
    }

    // Extract includes
    const includeRegex = /#include\s*[<"]([^>"]+)[>"]/g;
    while ((match = includeRegex.exec(content)) !== null) {
      imports.push(match[1]);
    }
  }

  private extractGoElements(content: string, functions: string[], classes: string[], imports: string[]): void {
    // Extract function declarations
    const functionRegex = /func\s+(?:\(\w+\s+\*?\w+\)\s+)?(\w+)\s*\(/g;
    let match;
    while ((match = functionRegex.exec(content)) !== null) {
      functions.push(match[1]);
    }

    // Extract type declarations
    const typeRegex = /type\s+(\w+)\s+(?:struct|interface)/g;
    while ((match = typeRegex.exec(content)) !== null) {
      classes.push(match[1]);
    }

    // Extract imports
    const importRegex = /import\s+(?:\(([^)]+)\)|['"]([^'"]+)['"])/g;
    while ((match = importRegex.exec(content)) !== null) {
      if (match[1]) {
        // Multi-line import
        const importsList = match[1].split('\n').map(imp => imp.trim().replace(/['"]/g, ''));
        imports.push(...importsList);
      } else if (match[2]) {
        imports.push(match[2]);
      }
    }
  }

  private extractRustElements(content: string, functions: string[], classes: string[], imports: string[]): void {
    // Extract function declarations
    const functionRegex = /fn\s+(\w+)\s*\(/g;
    let match;
    while ((match = functionRegex.exec(content)) !== null) {
      functions.push(match[1]);
    }

    // Extract struct and enum declarations
    const structRegex = /(?:struct|enum)\s+(\w+)/g;
    while ((match = structRegex.exec(content)) !== null) {
      classes.push(match[1]);
    }

    // Extract use statements
    const useRegex = /use\s+([^;]+);/g;
    while ((match = useRegex.exec(content)) !== null) {
      imports.push(match[1]);
    }
  }

  private extractPhpElements(content: string, functions: string[], classes: string[], imports: string[]): void {
    // Extract function declarations
    const functionRegex = /function\s+(\w+)\s*\(/g;
    let match;
    while ((match = functionRegex.exec(content)) !== null) {
      functions.push(match[1]);
    }

    // Extract class declarations
    const classRegex = /class\s+(\w+)/g;
    while ((match = classRegex.exec(content)) !== null) {
      classes.push(match[1]);
    }

    // Extract require/include statements
    const requireRegex = /(?:require|include)(?:_once)?\s*['"]([^'"]+)['"]/g;
    while ((match = requireRegex.exec(content)) !== null) {
      imports.push(match[1]);
    }
  }

  private extractRubyElements(content: string, functions: string[], classes: string[], imports: string[]): void {
    // Extract method definitions
    const methodRegex = /def\s+(\w+)/g;
    let match;
    while ((match = methodRegex.exec(content)) !== null) {
      functions.push(match[1]);
    }

    // Extract class definitions
    const classRegex = /class\s+(\w+)/g;
    while ((match = classRegex.exec(content)) !== null) {
      classes.push(match[1]);
    }

    // Extract require statements
    const requireRegex = /require\s+['"]([^'"]+)['"]/g;
    while ((match = requireRegex.exec(content)) !== null) {
      imports.push(match[1]);
    }
  }
}
@@ -5,6 +5,7 @@ import { HybridChunker } from '../chunking/chunker';
 import { EmbeddingProviderInterface } from '../types';
 import { CollectionManager } from '../qdrant/collection';
 import { CollectionManager as QdrantCollectionManager } from '../qdrant/collection';
+import { generateDeterministicUUID } from '../utils/hash';
 
 export class IndexingQueue {
   private queue: IndexingQueueItem[] = [];
@@ -227,8 +228,10 @@ export class IndexingQueue {
   }
 
   private generatePointId(file: TFile, chunkIndex: number): string {
-    // Generate a consistent ID for the point
-    return `${file.path}:${chunkIndex}`;
+    // Generate a deterministic UUID based on file path and chunk index
+    // This ensures the same file+chunk always gets the same ID
+    const idString = `${file.path}:${chunkIndex}`;
+    return generateDeterministicUUID(idString);
   }
 
   private updateProgress(): void {
@@ -69,19 +69,31 @@ export class QdrantClient {
       url,
       method,
       headers,
-      body: body ? JSON.stringify(body) : undefined
+      body: body ? JSON.stringify(body) : undefined,
+      throw: false
     };
 
+    console.log('Making Qdrant request:', { url, method, headers: Object.keys(headers) });
+
     try {
       const response: RequestUrlResponse = await requestUrl(requestParams);
 
+      console.log('Qdrant response:', { status: response.status, headers: response.headers });
+
       if (response.status >= 400) {
+        console.error('Qdrant API error response:', response.text);
         throw new Error(`Qdrant API error: ${response.status} ${response.text}`);
       }
 
       return JSON.parse(response.text) as T;
     } catch (error) {
-      console.error('Qdrant request failed:', error);
+      console.error('Qdrant request failed:', {
+        url,
+        method,
+        error: error,
+        message: (error as Error).message,
+        stack: (error as Error).stack
+      });
       throw error;
     }
   }
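
With `throw: false`, the hunk appears to rely on its own `status >= 400` check instead of letting `requestUrl` raise on error statuses. The status handling and the header-name logging can be exercised without Obsidian. This is a sketch under stated assumptions: `ResponseLike`, `parseQdrantResponse` and `describeRequest` are hypothetical helpers that mirror the logic in the hunk, and the header values are made up.

```typescript
// Minimal stand-in for the response fields that the hunk reads.
interface ResponseLike {
  status: number;
  text: string;
}

// Mirrors the hunk: reject 4xx/5xx with the body in the message, else parse JSON.
function parseQdrantResponse<T>(response: ResponseLike): T {
  if (response.status >= 400) {
    throw new Error(`Qdrant API error: ${response.status} ${response.text}`);
  }
  return JSON.parse(response.text) as T;
}

// Logs header names only, so header values such as an API key stay out of the console.
function describeRequest(url: string, method: string, headers: Record<string, string>) {
  return { url, method, headers: Object.keys(headers) };
}

const okBody = parseQdrantResponse<{ result: { collections: unknown[] } }>({
  status: 200,
  text: '{"result":{"collections":[]}}',
});
console.log(okBody.result.collections.length);

console.log(
  describeRequest("https://example.invalid/collections", "GET", {
    "api-key": "not-a-real-key",
    "Content-Type": "application/json",
  }),
);
```

Keeping the parse step in a pure function like this would also make it straightforward to unit test error responses once the connection issue is resolved.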
51
src/utils/hash.ts
Normal file
@@ -0,0 +1,51 @@
/**
 * Generate a UUID v4
 */
export function generateUUID(): string {
  return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
    const r = Math.random() * 16 | 0;
    const v = c === 'x' ? r : (r & 0x3 | 0x8);
    return v.toString(16);
  });
}

/**
 * Generate a deterministic UUID from a string
 * This ensures the same input always produces the same UUID
 */
export function generateDeterministicUUID(input: string): string {
  // Simple hash function to convert string to number
  let hash = 0;
  for (let i = 0; i < input.length; i++) {
    const char = input.charCodeAt(i);
    hash = ((hash << 5) - hash) + char;
    hash = hash & hash; // Convert to 32bit integer
  }

  // Convert hash to UUID format
  const hex = Math.abs(hash).toString(16).padStart(8, '0');

  // Generate additional random-looking but deterministic parts
  let hash2 = hash;
  for (let i = 0; i < input.length; i++) {
    hash2 = ((hash2 << 3) + input.charCodeAt(i)) & 0xFFFFFFFF;
  }
  const hex2 = Math.abs(hash2).toString(16).padStart(8, '0');

  // Create UUID v4 format (xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx)
  return `${hex.substring(0, 8)}-${hex.substring(0, 4)}-4${hex.substring(1, 4)}-${hex2.substring(0, 4)}-${hex2.substring(0, 12)}`;
}

/**
 * Generate a simple numeric hash from a string
 */
export function hashString(str: string): number {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    const char = str.charCodeAt(i);
    hash = ((hash << 5) - hash) + char;
    hash = hash & hash; // Convert to 32bit integer
  }
  return Math.abs(hash);
}

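
One detail in `generateDeterministicUUID` is worth checking before the indexing test: `hex2` is padded to 8 characters, so `hex2.substring(0, 12)` returns 8 characters and the final group comes out 8 hex digits long instead of 12. Whether Qdrant accepts that shape has not been verified here. The sketch below is one possible alternative, not the commit's code: it fills all 128 bits from four seeded 32-bit FNV-1a hashes and then sets the version and variant nibbles. `fnv1a32` and `deterministicUuidSketch` are hypothetical names, and this is not an RFC-style name-based (v5) UUID, which would use SHA-1.

```typescript
// Hypothetical helper: 32-bit FNV-1a over UTF-16 code units, mixed with a seed.
function fnv1a32(input: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the FNV prime, keep 32 bits unsigned
  }
  return h;
}

// Builds an 8-4-4-4-12 string that is stable for a given input.
function deterministicUuidSketch(input: string): string {
  // Four seeded hashes give 4 x 8 = 32 hex digits.
  let hex = "";
  for (let seed = 0; seed < 4; seed++) {
    hex += fnv1a32(input, seed).toString(16).padStart(8, "0");
  }

  const digits = hex.split("");
  digits[12] = "4"; // version nibble
  digits[16] = ((parseInt(digits[16], 16) & 0x3) | 0x8).toString(16); // variant bits 10xx
  const h = digits.join("");

  return `${h.slice(0, 8)}-${h.slice(8, 12)}-${h.slice(12, 16)}-${h.slice(16, 20)}-${h.slice(20, 32)}`;
}

const idA = deterministicUuidSketch("notes/a.md:0");
const idB = deterministicUuidSketch("notes/a.md:0");
const idC = deterministicUuidSketch("notes/a.md:1");
console.log(idA === idB, idA === idC, idA.length);
```

A non-cryptographic hash keeps the sketch dependency-free, but collisions are more likely than with a SHA-based scheme, so a large vault may warrant a stronger hash.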