# ODMDB Natural Language Query PoC

This **Proof of Concept (PoC)** converts natural language queries into ODMDB search queries using OpenAI's structured output API.

## Current Status

⚠️ **Partial Implementation**: Only the **seekers** object mapping is currently implemented. This PoC focuses on demonstrating natural-language-to-DSL query conversion for seeker-related searches.

## Features

- **Natural Language Processing**: Converts human questions into structured ODMDB queries
- **Real ODMDB Integration**: Works with actual ODMDB data from `../smatchitObjectOdmdb/`
- **Schema-Based Mapping**: Uses actual seekers.json schema for accurate field mapping (62 properties)
- **Local Data Execution**: Processes queries against local seeker files in `objects/seekers/itm/`
- **OpenAI Structured Output**: Ensures reliable JSON query generation
- **Query Validation**: Validates generated queries against real ODMDB schema rules
- **jq Integration**: Powerful result processing, filtering, and CSV export capabilities

## Prerequisites

- Node.js (v16 or higher)
- OpenAI API key

## Installation

1. Make sure you have the ODMDB data structure available:

   ```
   ../smatchitObjectOdmdb/
   ├── schema/
   │   └── seekers.json    # Seeker schema (62 properties)
   └── objects/
       └── seekers/
           └── itm/        # Individual seeker JSON files
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Set your OpenAI API key:

   ```bash
   export OPENAI_API_KEY=sk-your-api-key-here
   ```

## Usage

### Running the PoC

**Query Generation Only (Default):**

```bash
npm start
```

**Query Generation + Execution:**

```bash
EXECUTE_QUERY=true npm start
```

This processes the hardcoded natural language query and prints the generated ODMDB query as JSON. When `EXECUTE_QUERY=true`, it also executes the generated query (in this PoC, against the local seeker files; see Limitations).

### Changing the Query

To test different natural language queries, edit the `NL_QUERY` constant in `poc.js`:

```javascript
// Line 16 in poc.js
const NL_QUERY = "your natural language query here";
```

### Example Queries

**Status-based queries:**

- `"show me seekers with status startasap and their email and experience"`
- `"find seekers looking for jobs urgently with their skills and salary expectations"`
- `"get seekers who are not looking with their employment status"`

**Date-based queries:**

- `"give me new seekers since last week with email and experience"`
- `"show me seekers from yesterday with their location and availability"`
- `"find recently updated seekers with their job preferences"`

**Comprehensive field queries:**

- `"show me seeker contact info and work experience"`
- `"find seekers with personality types and language skills"`
- `"get seeker salary expectations and preferred working hours"`
- `"show me seeker education and training preferences"`
- `"find seekers with their job applications and saved jobs"`

**Location & preferences:**

- `"show me seekers in Paris with remote work preferences"`
- `"find seekers available to work in multiple countries"`
- `"get seekers with specific location and salary requirements"`

**Skills & competencies:**

- `"find seekers with technical skills and years of experience"`
- `"show me seekers with language abilities and personality profiles"`
- `"get seekers with specific know-how and job radar interests"`

**Job search activity:**

- `"show me seekers who applied to jobs recently"`
- `"find seekers with saved jobs and their preferences"`
- `"get seekers who were invited to apply with their status"`

**Notifications & communication:**

- `"show me seekers with email preferences and notification settings"`
- `"find seekers who receive weekly reports and interview tips"`

**Supported filter types:**

- **Status filtering**: `seekstatus` (startasap, norush, notlooking)
- **Date filtering**: `dt_create`, `dt_update`, `matchinglastdate` with date ranges
- **Index optimization**: Uses ODMDB indexes (`lst_alias`, `seekstatus_alias`) for efficient queries

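As a rough illustration of the date filtering above, a relative phrase such as "since last week" has to become an absolute date before it can appear in a condition. The helper below is a hypothetical sketch: the name `dateCondition` and the day-offset approach are assumptions, not code taken from `poc.js`. It produces a condition in the same shape as the `prop.dt_create(>=:2025-10-06)` example shown under Output Format.

```javascript
// Hypothetical helper (not from poc.js): build a ">= N days ago" condition
// for one of the date fields listed under "Supported filter types".
function dateCondition(field, daysAgo, now = new Date()) {
  const ALLOWED = ["dt_create", "dt_update", "matchinglastdate"];
  if (!ALLOWED.includes(field)) {
    throw new Error(`Unsupported date field: ${field}`);
  }
  // Work in UTC so the result does not depend on the local time zone.
  // Date.UTC rolls over correctly when the day index drops below 1.
  const since = new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() - daysAgo
  ));
  const isoDay = since.toISOString().slice(0, 10); // YYYY-MM-DD
  return `prop.${field}(>=:${isoDay})`;
}

// "new seekers since last week", evaluated on 2025-10-13 (UTC)
console.log(dateCondition("dt_create", 7, new Date("2025-10-13T12:00:00Z")));
// prints prop.dt_create(>=:2025-10-06)
```

Passing `now` explicitly keeps the helper deterministic, which makes it easier to test than one that always reads the system clock.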
### Demo & Testing Tools

**Interactive Demo:**

```bash
node demo.js
```

A **live PoC demonstration** that runs the actual query generation code to show:

- Real query generation from natural language using OpenAI
- ODMDB schema loading and field mapping
- Current ODMDB data status and sample data

**Demo with Query Execution:**

```bash
EXECUTE_DEMO=true node demo.js
```

Runs the demo with actual query execution against real seeker data files.

**jq Processing Test:**

```bash
node test-jq.js
```

Demonstrates various jq operations, including:

- Basic data formatting and field selection
- CSV conversion from JSON
- Advanced filtering and transformations
- Statistical summaries and aggregations

**jq Playground (Optional):**

```bash
node experiment-jq-playground.js
```

A playground for experimenting with jq commands. It is not vital to the PoC but is useful for learning jq syntax.

## Environment Variables

- `OPENAI_API_KEY` - Your OpenAI API key (required)
- `EXECUTE_QUERY` - Set to "true" to execute queries against ODMDB (default: false)
- `EXECUTE_DEMO` - Set to "true" to execute queries when running `demo.js`
- `ODMDB_BASE_URL` - ODMDB server URL (default: http://localhost:3000)
- `ODMDB_TRIBE` - ODMDB tribe name (default: smatchit)
- `OPENAI_MODEL` - OpenAI model to use (default: gpt-5)

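The defaults above can be pictured as a small configuration loader. This is a minimal sketch, not the code in `poc.js`: the function name `loadConfig`, the returned property names, and the rule that only the exact string `"true"` enables execution are all assumptions made for illustration.

```javascript
// Sketch (assumed, not from poc.js): resolve settings from an environment
// object, falling back to the defaults documented in this README.
function loadConfig(env = process.env) {
  if (!env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required");
  }
  return {
    apiKey: env.OPENAI_API_KEY,
    // Assumption: only the exact string "true" turns execution on.
    executeQuery: env.EXECUTE_QUERY === "true",
    baseUrl: env.ODMDB_BASE_URL || "http://localhost:3000",
    tribe: env.ODMDB_TRIBE || "smatchit",
    model: env.OPENAI_MODEL || "gpt-5",
  };
}

const config = loadConfig({ OPENAI_API_KEY: "sk-example", EXECUTE_QUERY: "true" });
console.log(config.executeQuery, config.tribe); // prints: true smatchit
```

Taking the environment as a parameter, rather than reading `process.env` directly inside the function, makes the defaults easy to check without changing the real shell environment.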
## Output Format

**Query Generation:**

The PoC generates ODMDB queries in this format:

```json
{
  "object": "seekers",
  "condition": ["prop.dt_create(>=:2025-10-06)"],
  "fields": ["alias", "email", "seekworkingyear"]
}
```

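A consumer of this output may want a quick structural check before using it. The function below is a hypothetical sketch of such a check, based only on the three keys visible in the example above; it is not the PoC's own validation logic, and the name `checkQueryShape` is made up for illustration.

```javascript
// Sketch (assumed): structural check for a generated query object.
// Returns a list of problems; an empty list means the shape looks right.
function checkQueryShape(query) {
  if (query === null || typeof query !== "object") {
    return ["query must be an object"];
  }
  const problems = [];
  if (query.object !== "seekers") {
    problems.push('object must be "seekers" (the only object implemented)');
  }
  for (const key of ["condition", "fields"]) {
    const value = query[key];
    const ok = Array.isArray(value) && value.every((v) => typeof v === "string");
    if (!ok) problems.push(`${key} must be an array of strings`);
  }
  return problems;
}

console.log(checkQueryShape({
  object: "seekers",
  condition: ["prop.dt_create(>=:2025-10-06)"],
  fields: ["alias", "email", "seekworkingyear"],
})); // prints []
```

This only checks the shape. Whether each field name exists in `seekers.json` is a separate question that a shape check like this does not answer.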
## ODMDB DSL Support

The PoC understands and generates these ODMDB DSL patterns:

- **Property queries**: `prop.<field>(operator:value)`
- **Index queries**: `idx.<indexName>(value)`
- **Join queries**: `join(remoteObject:localKey:remoteProp:operator:value)`

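To make the three pattern shapes concrete, here is a minimal sketch that splits a condition string into named parts. It is an illustration built from the pattern templates above, not ODMDB's real parser: the function name `parseCondition`, the returned object layout, and the sample `idx` and `join` strings are assumptions.

```javascript
// Sketch (assumed): split one condition string into its parts.
// Covers only the three pattern shapes listed above; anything else returns null.
function parseCondition(text) {
  // prop.<field>(operator:value)
  let m = /^prop\.([A-Za-z0-9_]+)\(([^:()]+):(.*)\)$/.exec(text);
  if (m) return { kind: "prop", field: m[1], operator: m[2], value: m[3] };

  // idx.<indexName>(value)
  m = /^idx\.([A-Za-z0-9_]+)\((.*)\)$/.exec(text);
  if (m) return { kind: "idx", index: m[1], value: m[2] };

  // join(remoteObject:localKey:remoteProp:operator:value)
  m = /^join\(([^:()]+):([^:()]+):([^:()]+):([^:()]+):(.*)\)$/.exec(text);
  if (m) {
    const [, remoteObject, localKey, remoteProp, operator, value] = m;
    return { kind: "join", remoteObject, localKey, remoteProp, operator, value };
  }
  return null;
}

// The prop example comes from this README; the other two are made-up inputs.
console.log(parseCondition("prop.dt_create(>=:2025-10-06)"));
console.log(parseCondition("idx.seekstatus_alias(startasap)"));
console.log(parseCondition("join(jobads:jobadapply:dt_create:>=:2025-10-06)"));
```

The value is always the last capture and may itself contain colons, which is why it uses `(.*)` while the earlier parts exclude `:`.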
## Comprehensive Field Mappings

Supports extensive natural language mapping for **all 62 seeker properties**:

**Contact & Identity:**

- `email`, `contact`, `mail` → `email`
- `id`, `username`, `alias` → `alias`
- `bio`, `description`, `summary` → `shortdescription`

**Work Experience & Status:**

- `experience`, `years of experience`, `career length` → `seekworkingyear`
- `job titles`, `positions`, `roles`, `work history` → `seekjobtitleexperience`
- `status`, `availability`, `urgency` → `seekstatus`
- `employment`, `work status`, `job status` → `employmentstatus`

**Location & Geography:**

- `location`, `where`, `work location` → `seeklocation`
- `countries`, `work countries` → `countryavailabletowork`
- `current location`, `last location` → `lastlocation`

**Salary & Compensation:**

- `salary`, `pay`, `compensation`, `wage` → `salaryexpectation`
- `currency`, `salary currency` → `salarydevise`
- `salary unit`, `pay period` → `salaryunit`

**Skills & Competencies:**

- `skills`, `competencies`, `abilities` → `skills`
- `languages`, `language skills` → `languageskills`
- `knowledge`, `expertise`, `know-how` → `knowhow`

**Personality & Preferences:**

- `personality`, `MBTI`, `type` → `mbti`
- `likes`, `interests`, `preferences` → `thingsilike`
- `dislikes`, `avoid`, `not interested` → `thingsidislike`

**Job Search Activity:**

- `applied jobs`, `applications` → `jobadapply`
- `saved jobs`, `bookmarked jobs` → `jobadsaved`
- `viewed jobs`, `job views` → `jobadview`
- `invitations`, `invited to apply` → `jobadinvitedtoapply`

**Availability & Schedule:**

- `working hours`, `preferred hours`, `schedule` → `preferedworkinghours`
- `unavailable`, `blocked times` → `notavailabletowork`

**Dates & Activity:**

- `created`, `new`, `recent`, `since` → `dt_create`
- `updated`, `modified`, `last update` → `dt_update`
- `last matching`, `matching date` → `matchinglastdate`

_Plus comprehensive mappings for education, notifications, training, and system fields._

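One way to picture these mappings is as a lookup table from phrases to schema fields. The sketch below copies a few synonym groups from the lists above; how `poc.js` actually applies the mappings is not shown in this README, so the table layout, the `mapFields` name, and the longest-phrase-first matching rule are assumptions for illustration only.

```javascript
// Sketch (assumed): a few synonym groups from the lists above, keyed by schema field.
const FIELD_SYNONYMS = {
  email: ["email", "contact", "mail"],
  seekworkingyear: ["experience", "years of experience", "career length"],
  salaryexpectation: ["salary", "pay", "compensation", "wage"],
  seeklocation: ["location", "where", "work location"],
  dt_create: ["created", "new", "recent", "since"],
};

// Invert the table into phrase -> field for lookup.
const PHRASE_TO_FIELD = new Map();
for (const [field, phrases] of Object.entries(FIELD_SYNONYMS)) {
  for (const phrase of phrases) PHRASE_TO_FIELD.set(phrase, field);
}

// Return the schema fields whose synonyms appear in the text.
// Longer phrases are tried first so "years of experience" is matched
// as a whole before the shorter "experience".
function mapFields(text) {
  let rest = text.toLowerCase();
  const found = [];
  const phrases = [...PHRASE_TO_FIELD.keys()].sort((a, b) => b.length - a.length);
  for (const phrase of phrases) {
    // The phrases here contain only letters and spaces, so no regex escaping is needed.
    const pattern = new RegExp(`\\b${phrase}\\b`);
    if (pattern.test(rest)) {
      const field = PHRASE_TO_FIELD.get(phrase);
      if (!found.includes(field)) found.push(field);
      rest = rest.replace(pattern, " "); // consume the matched phrase
    }
  }
  return found;
}

console.log(mapFields("seekers with email and years of experience"));
```

A fixed table like this cannot handle paraphrases it has never seen, which is presumably part of why the PoC relies on a language model for the actual conversion.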
## Schema Context

The PoC can optionally load schema files for context:

- `main.json` - Combined schema definitions
- `lg.json` - Localization/language mappings

## Limitations

- **Seekers only**: Other ODMDB objects (jobads, recruiters, etc.) are not yet implemented
- **Local execution only**: Works with file-based data, not live ODMDB server API
- **Hardcoded query**: Single query per run (no interactive mode)
- **Performance limit**: Processes first 50 seeker files for PoC performance
- **Simplified DSL**: Basic condition parsing (date ranges, status filtering)

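To show what local execution with a simplified DSL and a 50-item cap can look like, here is a minimal in-memory sketch. It is not the PoC's executor: the name `runLocalQuery`, the restriction to `>=` property conditions, the assumption that dates are stored as ISO strings, and the sample records are all made up for illustration.

```javascript
// Sketch (assumed): run a query object against an in-memory list of seeker records.
// Handles only conditions shaped like "prop.<field>(>=:<value>)"; others throw.
const MAX_ITEMS = 50; // mirrors the "first 50 seeker files" limit above

function runLocalQuery(query, seekers) {
  const tests = query.condition.map((text) => {
    const m = /^prop\.([A-Za-z0-9_]+)\(>=:(.*)\)$/.exec(text);
    if (!m) throw new Error(`Unsupported condition: ${text}`);
    const [, field, value] = m;
    // Assumption: dates are ISO strings, which compare correctly as text.
    return (item) => typeof item[field] === "string" && item[field] >= value;
  });

  return seekers
    .slice(0, MAX_ITEMS) // cap the work before filtering
    .filter((item) => tests.every((test) => test(item)))
    .map((item) => Object.fromEntries(
      query.fields.filter((f) => f in item).map((f) => [f, item[f]])
    ));
}

// Made-up sample records, standing in for files under objects/seekers/itm/.
const sample = [
  { alias: "anna", email: "anna@example.com", dt_create: "2025-10-07T08:00:00Z" },
  { alias: "ben", email: "ben@example.com", dt_create: "2025-09-30T10:30:00Z" },
];
const result = runLocalQuery(
  { object: "seekers", condition: ["prop.dt_create(>=:2025-10-06)"], fields: ["alias", "email"] },
  sample
);
console.log(result); // only the "anna" record, reduced to alias and email
```

Because the cap is applied before filtering, matches beyond the first 50 records are never seen, which is the trade-off the performance limit implies.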
## Next Steps

- [ ] Add support for other ODMDB objects (jobads, recruiters, etc.)
- [ ] Interactive CLI for multiple queries
- [ ] Integration with actual ODMDB backend
- [ ] Enhanced field mapping and validation
- [ ] Multi-turn conversation support

## Files

**Core Implementation:**

- `poc.js` - Main PoC implementation with full ODMDB integration
- `package.json` - Dependencies and scripts

**Demo & Testing:**

- `demo.js` - **Live PoC demo** that actually generates and executes queries using real ODMDB data
- `test-jq.js` - jq processing capabilities demonstration
- `experiment-jq-playground.js` - jq learning playground (optional, not vital to PoC)

**Data & Schema:**

- `main.json` - Optional consolidated schema context (if available)
- `../smatchitObjectOdmdb/schema/seekers.json` - Real seekers schema (62 properties)
- `../smatchitObjectOdmdb/objects/seekers/itm/` - Individual seeker data files