ODMDB Natural Language Query PoC
This is a Proof of Concept (PoC) that demonstrates the conversion of natural language queries into ODMDB search queries using OpenAI's structured output API.
Current Status
⚠️ Partial Implementation: Currently only the seekers object mapping is implemented. This PoC focuses on demonstrating the natural language to DSL query conversion for seeker-related searches.
Features
- Natural Language Processing: Converts human questions into structured ODMDB queries
- Real ODMDB Integration: Works with actual ODMDB data from `../smatchitObjectOdmdb/`
- Schema-Based Mapping: Uses actual seekers.json schema for accurate field mapping (62 properties)
- Local Data Execution: Processes queries against local seeker files in `objects/seekers/itm/`
- OpenAI Structured Output: Ensures reliable JSON query generation
- Query Validation: Validates generated queries against real ODMDB schema rules
- jq Integration: Powerful result processing, filtering, and CSV export capabilities
Prerequisites
- Node.js (v16 or higher)
- OpenAI API key
Installation
- Make sure you have the ODMDB data structure available:

      ../smatchitObjectOdmdb/
      ├── schema/
      │   └── seekers.json      # Seeker schema (62 properties)
      └── objects/
          └── seekers/
              └── itm/          # Individual seeker JSON files

- Install dependencies:

      npm install

- Set your OpenAI API key:

      export OPENAI_API_KEY=sk-your-api-key-here
Usage
Running the PoC
Query Generation Only (Default):
npm start
Query Generation + Execution:
EXECUTE_QUERY=true npm start
This will process the hardcoded natural language query and output the generated ODMDB query in JSON format. When `EXECUTE_QUERY=true`, it will also execute the query against the ODMDB server.
### Changing the Query
To test different natural language queries, edit the `NL_QUERY` constant in `poc.js`:
```javascript
// Line 16 in poc.js
const NL_QUERY = "your natural language query here";
```
Example Queries
Status-based queries:
"show me seekers with status startasap and their email and experience"
"find seekers looking for jobs urgently with their skills"
Date-based queries:
"give me new seekers since last week with email and experience"
"show me seekers from yesterday with their skills"
Field-specific queries:
"find seekers with job titles and salary expectations"
"show me seeker locations and availability"
Supported filter types:
- Status filtering: `seekstatus` (startasap, norush, notlooking)
- Date filtering: `dt_create` with date ranges
- Index optimization: Uses ODMDB indexes for efficient queries
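A relative date such as "since last week" has to become a concrete `dt_create` condition. The sketch below shows one way to do that, assuming the `prop.dt_create(>=:YYYY-MM-DD)` form used in the example output of this README. `daysAgo` and `dateCondition` are hypothetical helpers for illustration, not functions from `poc.js`, and dates are computed in UTC.

```javascript
// Hypothetical helper: UTC date (YYYY-MM-DD) for `days` days before `now`.
function daysAgo(days, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return new Date(now.getTime() - days * msPerDay).toISOString().slice(0, 10);
}

// Hypothetical helper: builds a "created on or after" condition on dt_create.
// Assumption: the condition syntax follows prop.<field>(operator:value).
function dateCondition(days, now = new Date()) {
  return `prop.dt_create(>=:${daysAgo(days, now)})`;
}

// "new seekers since last week", evaluated at the moment the script runs
console.log(dateCondition(7));
```

Passing `now` explicitly keeps the helpers deterministic, which makes them easier to test.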
The jq integration supports operations including:
- Basic data formatting and field selection
- CSV conversion from JSON
- Advanced filtering and transformations
- Statistical summaries and aggregations
Environment Variables
- `OPENAI_API_KEY` - Your OpenAI API key (required)
- `EXECUTE_QUERY` - Set to "true" to execute queries against ODMDB (default: false)
- `ODMDB_BASE_URL` - ODMDB server URL (default: http://localhost:3000)
- `ODMDB_TRIBE` - ODMDB tribe name (default: smatchit)
- `OPENAI_MODEL` - OpenAI model to use (default: gpt-5)
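The variables and defaults above could be read in one place, as in the following sketch. `loadConfig` and the returned property names are hypothetical and not taken from `poc.js`; only the variable names and default values come from this README.

```javascript
// Hypothetical helper: collects the documented settings with their defaults.
function loadConfig(env = process.env) {
  if (!env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required");
  }
  return {
    openaiApiKey: env.OPENAI_API_KEY,
    // Only the exact string "true" enables execution
    executeQuery: env.EXECUTE_QUERY === "true",
    odmdbBaseUrl: env.ODMDB_BASE_URL || "http://localhost:3000",
    odmdbTribe: env.ODMDB_TRIBE || "smatchit",
    openaiModel: env.OPENAI_MODEL || "gpt-5",
  };
}

// Example with an explicit environment object instead of process.env
console.log(loadConfig({ OPENAI_API_KEY: "sk-your-api-key-here" }));
```

Accepting the environment as a parameter makes it possible to exercise the defaults without touching the real process environment.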
Output Format
Query Generation: The PoC generates ODMDB queries in this format:
    {
      "object": "seekers",
      "condition": ["prop.dt_create(>=:2025-10-06)"],
      "fields": ["alias", "email", "seekworkingyear"]
    }
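A generated query can be sanity-checked before it is used. The following is a minimal sketch that only checks the three keys shown above (`object`, `condition`, `fields`). `checkQueryShape` is a hypothetical helper, and the validation actually performed by `poc.js` may differ or go further.

```javascript
// Hypothetical helper: returns a list of problems, empty when the shape is fine.
function checkQueryShape(query) {
  if (query === null || typeof query !== "object" || Array.isArray(query)) {
    return ["query must be an object"];
  }
  const errors = [];
  const isStringArray = (value) =>
    Array.isArray(value) && value.every((item) => typeof item === "string");

  if (typeof query.object !== "string" || query.object === "") {
    errors.push('"object" must be a non-empty string');
  }
  if (!isStringArray(query.condition)) {
    errors.push('"condition" must be an array of strings');
  }
  if (!isStringArray(query.fields)) {
    errors.push('"fields" must be an array of strings');
  }
  return errors;
}

const generated = {
  object: "seekers",
  condition: ["prop.dt_create(>=:2025-10-06)"],
  fields: ["alias", "email", "seekworkingyear"],
};
console.log(checkQueryShape(generated)); // prints []
```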
ODMDB DSL Support
The PoC understands and generates these ODMDB DSL patterns:
- Property queries: `prop.<field>(operator:value)`
- Index queries: `idx.<indexName>(value)`
- Join queries: `join(remoteObject:localKey:remoteProp:operator:value)`
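The three patterns can be recognized with regular expressions, as sketched below. This is an illustration under assumptions, not the ODMDB grammar: it assumes field and index names consist of word characters and that operators contain no `:` or parentheses. `parseCondition` and `DSL_PATTERNS` are hypothetical names.

```javascript
// Assumption: names are word characters; operators contain no ":" or parentheses.
const DSL_PATTERNS = [
  {
    kind: "prop",
    regex: /^prop\.(\w+)\(([^:()]+):(.*)\)$/,
    keys: ["field", "operator", "value"],
  },
  {
    kind: "idx",
    regex: /^idx\.(\w+)\((.*)\)$/,
    keys: ["indexName", "value"],
  },
  {
    kind: "join",
    regex: /^join\(([^:()]+):([^:()]+):([^:()]+):([^:()]+):(.*)\)$/,
    keys: ["remoteObject", "localKey", "remoteProp", "operator", "value"],
  },
];

// Hypothetical helper: returns the parsed parts of a condition string,
// or null when it matches none of the three patterns.
function parseCondition(condition) {
  for (const { kind, regex, keys } of DSL_PATTERNS) {
    const match = regex.exec(condition);
    if (match) {
      const parts = { kind };
      keys.forEach((key, i) => {
        parts[key] = match[i + 1];
      });
      return parts;
    }
  }
  return null;
}

console.log(parseCondition("prop.dt_create(>=:2025-10-06)"));
```

A `null` result is a cheap signal that a generated condition does not follow any of the documented patterns.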
Field Mappings
Currently supports mapping for seekers object:
- email → `email`
- experience → `seekworkingyear`
- job titles → `seekjobtitleexperience`
- status → `seekstatus`
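These mappings amount to a lookup table. The sketch below applies it to a list of natural-language terms and reports which ones have no known mapping. `SEEKER_FIELD_MAP` and `mapFields` are hypothetical names; in the PoC itself the mapping is driven by the `seekers.json` schema and the model, so this only illustrates the idea.

```javascript
// The four mappings listed in this README (natural-language term -> schema field).
const SEEKER_FIELD_MAP = {
  "email": "email",
  "experience": "seekworkingyear",
  "job titles": "seekjobtitleexperience",
  "status": "seekstatus",
};

// Hypothetical helper: splits terms into mapped schema fields and unknown terms.
function mapFields(terms) {
  const mapped = [];
  const unknown = [];
  for (const term of terms) {
    const key = term.trim().toLowerCase();
    if (Object.prototype.hasOwnProperty.call(SEEKER_FIELD_MAP, key)) {
      mapped.push(SEEKER_FIELD_MAP[key]);
    } else {
      unknown.push(term);
    }
  }
  return { mapped, unknown };
}

console.log(mapFields(["Email", "experience", "salary"]));
// prints { mapped: [ 'email', 'seekworkingyear' ], unknown: [ 'salary' ] }
```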
Schema Context
The PoC can optionally load schema files for context:
- `main.json` - Combined schema definitions
- `lg.json` - Localization/language mappings
Limitations
- Seekers only: Other ODMDB objects (jobads, recruiters, etc.) are not yet implemented
- No execution by default: Queries are only generated unless `EXECUTE_QUERY=true` is set
- Hardcoded query: Single query per run (no interactive mode)
- Basic validation: Limited DSL syntax validation
Next Steps
- Add support for other ODMDB objects (jobads, recruiters, etc.)
- Interactive CLI for multiple queries
- Integration with actual ODMDB backend
- Enhanced field mapping and validation
- Multi-turn conversation support
Files
- `poc.js` - Main PoC implementation
- `package.json` - Dependencies and scripts
- `main.json` - Optional schema context (if available)
- `lg.json` - Optional localization context (if available)