# ODMDB Natural Language Query PoC

This is a **Proof of Concept (PoC)** that demonstrates the conversion of natural language queries into ODMDB search queries using OpenAI's structured output API.

## Current Status

⚠️ **Partial Implementation**: Currently only the **seekers** object mapping is implemented. This PoC focuses on demonstrating the natural language to DSL query conversion for seeker-related searches.

## Features

- **Natural Language Processing**: Converts human questions into structured ODMDB queries
- **Real ODMDB Integration**: Works with actual ODMDB data from `../smatchitObjectOdmdb/`
- **Schema-Based Mapping**: Uses actual seekers.json schema for accurate field mapping (62 properties)
- **Local Data Execution**: Processes queries against local seeker files in `objects/seekers/itm/`
- **OpenAI Structured Output**: Ensures reliable JSON query generation
- **Query Validation**: Validates generated queries against real ODMDB schema rules
- **jq Integration**: Powerful result processing, filtering, and CSV export capabilities

## Prerequisites

- Node.js (v16 or higher)
- OpenAI API key

## Installation

1. Make sure you have the ODMDB data structure available:

   ```
   ../smatchitObjectOdmdb/
   ├── schema/
   │   └── seekers.json          # Seeker schema (62 properties)
   └── objects/
       └── seekers/
           └── itm/              # Individual seeker JSON files
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Set your OpenAI API key:

   ```bash
   export OPENAI_API_KEY=sk-your-api-key-here
   ```

## Usage

### Running the PoC

**Query Generation Only (Default):**

```bash
npm start
```

**Query Generation + Execution:**

```bash
EXECUTE_QUERY=true npm start
```

This will process the hardcoded natural language query and output the generated ODMDB query in JSON format. When `EXECUTE_QUERY=true`, it will also execute the query against the ODMDB server.
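The two run modes could be told apart with a check along the following lines. This is a minimal sketch, not the code in `poc.js`: the `loadConfig` helper is hypothetical, the defaults are the ones listed under Environment Variables, and treating only the exact string `"true"` as enabling execution is an assumption.

```javascript
// Hypothetical helper: not taken from poc.js.
// Reads the documented environment variables and applies the documented defaults.
function loadConfig(env = process.env) {
  if (!env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required");
  }
  return {
    apiKey: env.OPENAI_API_KEY,
    // Assumption: only the exact string "true" enables execution.
    executeQuery: env.EXECUTE_QUERY === "true",
    baseUrl: env.ODMDB_BASE_URL || "http://localhost:3000",
    tribe: env.ODMDB_TRIBE || "smatchit",
    model: env.OPENAI_MODEL || "gpt-5",
  };
}

// Usage: pass an explicit object instead of process.env to try it out.
const config = loadConfig({ OPENAI_API_KEY: "sk-example", EXECUTE_QUERY: "true" });
console.log(config.executeQuery ? "generate + execute" : "generate only");
// prints "generate + execute"
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real shell environment.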
### Changing the Query

To test different natural language queries, edit the `NL_QUERY` constant in `poc.js`:

```javascript
// Line 16 in poc.js
const NL_QUERY = "your natural language query here";
```

### Example Queries

**Status-based queries:**

- `"show me seekers with status startasap and their email and experience"`
- `"find seekers looking for jobs urgently with their skills"`

**Date-based queries:**

- `"give me new seekers since last week with email and experience"`
- `"show me seekers from yesterday with their skills"`

**Field-specific queries:**

- `"find seekers with job titles and salary expectations"`
- `"show me seeker locations and availability"`

**Supported filter types:**

- **Status filtering**: `seekstatus` (startasap, norush, notlooking)
- **Date filtering**: `dt_create` with date ranges
- **Index optimization**: Uses ODMDB indexes for efficient queries

The jq integration covers operations including:

- Basic data formatting and field selection
- CSV conversion from JSON
- Advanced filtering and transformations
- Statistical summaries and aggregations

## Environment Variables

- `OPENAI_API_KEY` - Your OpenAI API key (required)
- `EXECUTE_QUERY` - Set to "true" to execute queries against ODMDB (default: false)
- `ODMDB_BASE_URL` - ODMDB server URL (default: http://localhost:3000)
- `ODMDB_TRIBE` - ODMDB tribe name (default: smatchit)
- `OPENAI_MODEL` - OpenAI model to use (default: gpt-5)

## Output Format

**Query Generation:**

The PoC generates ODMDB queries in this format:

```json
{
  "object": "seekers",
  "condition": ["prop.dt_create(>=:2025-10-06)"],
  "fields": ["alias", "email", "seekworkingyear"]
}
```

## ODMDB DSL Support

The PoC understands and generates these ODMDB DSL patterns:

- **Property queries**: `prop.<field>(operator:value)`
- **Index queries**: `idx.<indexName>(value)`
- **Join queries**: `join(remoteObject:localKey:remoteProp:operator:value)`

## Field Mappings

The PoC currently maps these natural language terms to fields of the seekers object:

- `email` → `email`
- `experience` → `seekworkingyear`
- `job titles` → `seekjobtitleexperience`
- `status` → `seekstatus`

## Schema Context

The PoC can optionally load schema files for context:

- `main.json` - Combined schema definitions
- `lg.json` - Localization/language mappings

## Limitations

- **Seekers only**: Other ODMDB objects (jobads, recruiters, etc.) are not yet implemented
- **No execution by default**: Queries are only generated unless `EXECUTE_QUERY=true` is set
- **Hardcoded query**: Single query per run (no interactive mode)
- **Basic validation**: Limited DSL syntax validation

## Next Steps

- [ ] Add support for other ODMDB objects (jobads, recruiters, etc.)
- [ ] Interactive CLI for multiple queries
- [ ] Integration with actual ODMDB backend
- [ ] Enhanced field mapping and validation
- [ ] Multi-turn conversation support

## Files

- `poc.js` - Main PoC implementation
- `package.json` - Dependencies and scripts
- `main.json` - Optional schema context (if available)
- `lg.json` - Optional localization context (if available)
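
## Example: Checking a Generated Query

To tie together the output format, DSL patterns, and field mappings described in this README, here is a minimal sketch of a shape check on a generated query. It is not the validation code in `poc.js`: `FIELD_MAP`, `mapTerm`, `CONDITION_PATTERNS`, and `validateQuery` are hypothetical names, the regular expressions are assumed readings of the three DSL patterns, and the list of known fields is passed in by the caller rather than read from `seekers.json`, whose layout is not described here.

```javascript
// Hypothetical sketch: not the validation code in poc.js.
// Term-to-field mappings as listed under Field Mappings.
const FIELD_MAP = new Map([
  ["email", "email"],
  ["experience", "seekworkingyear"],
  ["job titles", "seekjobtitleexperience"],
  ["status", "seekstatus"],
]);

// Map a natural language term to a seekers field, or null if unmapped.
function mapTerm(term) {
  return FIELD_MAP.get(term.toLowerCase()) ?? null;
}

// Assumed shapes for the three documented DSL patterns.
const CONDITION_PATTERNS = [
  /^prop\.\w+\([^:()]+:[^()]+\)$/,   // prop.<field>(operator:value)
  /^idx\.\w+\([^()]+\)$/,            // idx.<indexName>(value)
  /^join\([^:()]+(:[^:()]+){4}\)$/,  // join(remoteObject:localKey:remoteProp:operator:value)
];

// Return a list of problems found in a generated query (empty if none).
// knownFields is supplied by the caller, e.g. derived from the seekers schema.
function validateQuery(query, knownFields) {
  const errors = [];
  if (query.object !== "seekers") {
    errors.push(`unsupported object: ${query.object}`);
  }
  for (const cond of query.condition ?? []) {
    if (!CONDITION_PATTERNS.some((re) => re.test(cond))) {
      errors.push(`unrecognized condition: ${cond}`);
    }
  }
  for (const field of query.fields ?? []) {
    if (!knownFields.includes(field)) {
      errors.push(`unknown field: ${field}`);
    }
  }
  return errors;
}

// Usage with the query shown under Output Format.
const sampleQuery = {
  object: "seekers",
  condition: ["prop.dt_create(>=:2025-10-06)"],
  fields: ["alias", "email", "seekworkingyear"],
};
console.log(validateQuery(sampleQuery, ["alias", "email", "seekworkingyear"]));
// prints []
```

Returning a list of messages rather than throwing on the first problem makes it easier to report every issue in a generated query at once.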