ODMDB Natural Language Query PoC

This is a Proof of Concept (PoC) that demonstrates the conversion of natural language queries into ODMDB search queries using OpenAI's structured output API.

Current Status

⚠️ Partial Implementation: Currently only the seekers object mapping is implemented. This PoC focuses on demonstrating the natural language to DSL query conversion for seeker-related searches.

Features

  • Natural Language Processing: Converts human questions into structured ODMDB queries
  • Real ODMDB Integration: Works with actual ODMDB data from ../smatchitObjectOdmdb/
  • Schema-Based Mapping: Uses actual seekers.json schema for accurate field mapping (62 properties)
  • Local Data Execution: Processes queries against local seeker files in objects/seekers/itm/
  • OpenAI Structured Output: Ensures reliable JSON query generation
  • Query Validation: Validates generated queries against real ODMDB schema rules
  • jq Integration: Powerful result processing, filtering, and CSV export capabilities
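
The query-validation feature can be pictured roughly as follows. This is a minimal sketch, not the code in poc.js: it assumes the schema exposes a JSON-Schema-style `properties` map, and `validateQuery` and `sampleSchema` are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: checks a generated query against a schema object.
// Assumption: the schema has a JSON-Schema-style `properties` map.
function validateQuery(query, schema) {
  const errors = [];
  const known = new Set(Object.keys(schema.properties || {}));

  if (query.object !== "seekers") {
    errors.push(`unsupported object: ${query.object}`);
  }
  if (!Array.isArray(query.condition)) {
    errors.push("condition must be an array");
  }
  if (!Array.isArray(query.fields) || query.fields.length === 0) {
    errors.push("fields must be a non-empty array");
  } else {
    for (const field of query.fields) {
      if (!known.has(field)) errors.push(`unknown field: ${field}`);
    }
  }
  return { valid: errors.length === 0, errors };
}

// Tiny stand-in for seekers.json (the real schema has 62 properties).
const sampleSchema = {
  properties: { alias: {}, email: {}, seekworkingyear: {}, dt_create: {} },
};

// "emial" is a deliberate typo to show a rejected field.
const result = validateQuery(
  { object: "seekers", condition: [], fields: ["alias", "emial"] },
  sampleSchema
);
console.log(result);
```

Returning a list of errors instead of throwing on the first one makes it easier to report every problem in a generated query at once.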

Prerequisites

  • Node.js (v16 or higher)
  • OpenAI API key

Installation

  1. Make sure you have the ODMDB data structure available:

    ../smatchitObjectOdmdb/
    ├── schema/
    │   └── seekers.json       # Seeker schema (62 properties)
    └── objects/
        └── seekers/
            └── itm/           # Individual seeker JSON files
    
  2. Install dependencies:

    npm install
    
  3. Set your OpenAI API key:

    export OPENAI_API_KEY=sk-your-api-key-here
    

Usage

Running the PoC

Query Generation Only (Default):

npm start

Query Generation + Execution:

EXECUTE_QUERY=true npm start

This processes the hardcoded natural language query and prints the generated ODMDB query as JSON. When `EXECUTE_QUERY=true`, it also executes the query against the local seeker data files in objects/seekers/itm/ rather than a live ODMDB server.

Changing the Query

To test different natural language queries, edit the `NL_QUERY` constant in `poc.js`:

```javascript
// Line 16 in poc.js
const NL_QUERY = "your natural language query here";
```

Example Queries

Status-based queries:

  • "show me seekers with status startasap and their email and experience"
  • "find seekers looking for jobs urgently with their skills and salary expectations"
  • "get seekers who are not looking with their employment status"

Date-based queries:

  • "give me new seekers since last week with email and experience"
  • "show me seekers from yesterday with their location and availability"
  • "find recently updated seekers with their job preferences"

Comprehensive field queries:

  • "show me seeker contact info and work experience"
  • "find seekers with personality types and language skills"
  • "get seeker salary expectations and preferred working hours"
  • "show me seeker education and training preferences"
  • "find seekers with their job applications and saved jobs"

Location & preferences:

  • "show me seekers in Paris with remote work preferences"
  • "find seekers available to work in multiple countries"
  • "get seekers with specific location and salary requirements"

Skills & competencies:

  • "find seekers with technical skills and years of experience"
  • "show me seekers with language abilities and personality profiles"
  • "get seekers with specific know-how and job radar interests"

Job search activity:

  • "show me seekers who applied to jobs recently"
  • "find seekers with saved jobs and their preferences"
  • "get seekers who were invited to apply with their status"

Notifications & communication:

  • "show me seekers with email preferences and notification settings"
  • "find seekers who receive weekly reports and interview tips"

Supported filter types:

  • Status filtering: seekstatus (startasap, norush, notlooking)
  • Date filtering: dt_create, dt_update, matchinglastdate with date ranges
  • Index optimization: Uses ODMDB indexes (lst_alias, seekstatus_alias) for efficient queries
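
As an illustration of the date and status filters above, a phrase like "since last week" has to end up as a concrete cutoff date in a condition string. The sketch below is not taken from poc.js: `sinceDaysAgo` and `statusIndexCondition` are hypothetical helpers, and passing the status as the value of `idx.seekstatus_alias(...)` is an assumption based on the `idx.<indexName>(value)` pattern.

```javascript
// Statuses listed in this README for the seekstatus property.
const SEEK_STATUSES = ["startasap", "norush", "notlooking"];

// Hypothetical helper: builds a "created/updated since N days ago" condition.
function sinceDaysAgo(field, days, now = new Date()) {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  const cutoff = new Date(now.getTime() - days * MS_PER_DAY);
  const isoDate = cutoff.toISOString().slice(0, 10); // YYYY-MM-DD, in UTC
  return `prop.${field}(>=:${isoDate})`;
}

// Hypothetical helper: builds an index lookup for a known status.
// Assumption: the seekstatus_alias index is queried by status value.
function statusIndexCondition(status) {
  if (!SEEK_STATUSES.includes(status)) {
    throw new Error(`unknown seekstatus: ${status}`);
  }
  return `idx.seekstatus_alias(${status})`;
}

// A fixed "now" keeps the example reproducible.
const lastWeek = sinceDaysAgo("dt_create", 7, new Date("2025-10-13T12:00:00Z"));
console.log(lastWeek);
console.log(statusIndexCondition("startasap"));
```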

Demo & Testing Tools

Interactive Demo:

node demo.js

The demo runs the PoC's query generation end to end and shows:

  • Real query generation from natural language using OpenAI
  • ODMDB schema loading and field mapping
  • Current ODMDB data status and sample data

Demo with Query Execution:

EXECUTE_DEMO=true node demo.js

Runs the demo with actual query execution against real seeker data files.

jq Processing Test:

node test-jq.js

Demonstrates various jq operations including:

  • Basic data formatting and field selection
  • CSV conversion from JSON
  • Advanced filtering and transformations
  • Statistical summaries and aggregations
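
For readers more comfortable with JavaScript than jq, the field-selection and CSV-conversion steps amount to something like the sketch below. It is not what test-jq.js runs; `csvCell` and `toCsv` are hypothetical helpers, the sample records are made up, and the quoting rules (strings double-quoted, numbers bare) are a common CSV convention rather than a guaranteed match for jq's output.

```javascript
// Format one CSV cell: strings are double-quoted, embedded quotes are doubled.
function csvCell(value) {
  if (value === null || value === undefined) return "";
  if (typeof value === "number" || typeof value === "boolean") {
    return String(value);
  }
  return `"${String(value).replace(/"/g, '""')}"`;
}

// Select `fields` from each record and emit a header row plus one row per record.
function toCsv(records, fields) {
  const header = fields.map(csvCell).join(",");
  const rows = records.map((rec) =>
    fields.map((field) => csvCell(rec[field])).join(",")
  );
  return [header, ...rows].join("\n");
}

// Made-up sample data, for illustration only.
const seekers = [
  { alias: "anna", email: "anna@example.com", seekworkingyear: 5 },
  { alias: 'bob "the dev"', email: "bob@example.com", seekworkingyear: 2 },
];

const csv = toCsv(seekers, ["alias", "seekworkingyear"]);
console.log(csv);
```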

jq Playground (Optional):

node experiment-jq-playground.js

A playground to experiment with jq commands - not vital to the PoC but useful for learning jq syntax.

Environment Variables

  • OPENAI_API_KEY - Your OpenAI API key (required)
  • EXECUTE_QUERY - Set to "true" to execute queries against ODMDB (default: false)
  • ODMDB_BASE_URL - ODMDB server URL (default: http://localhost:3000)
  • ODMDB_TRIBE - ODMDB tribe name (default: smatchit)
  • OPENAI_MODEL - OpenAI model to use (default: gpt-5)
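
One way to read these variables with the defaults listed above is sketched here. `loadConfig` and the property names on the returned object are hypothetical and may not match how poc.js actually handles configuration.

```javascript
// Hypothetical helper: reads the documented environment variables
// and applies the documented defaults.
function loadConfig(env = process.env) {
  if (!env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required");
  }
  return {
    openaiApiKey: env.OPENAI_API_KEY,
    // Only the exact string "true" enables execution.
    executeQuery: env.EXECUTE_QUERY === "true",
    odmdbBaseUrl: env.ODMDB_BASE_URL || "http://localhost:3000",
    odmdbTribe: env.ODMDB_TRIBE || "smatchit",
    openaiModel: env.OPENAI_MODEL || "gpt-5",
  };
}

// Passing a plain object instead of process.env keeps the example self-contained.
const config = loadConfig({ OPENAI_API_KEY: "sk-test", EXECUTE_QUERY: "true" });
console.log(config);
```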

Output Format

Query Generation: The PoC generates ODMDB queries in this format:

{
  "object": "seekers",
  "condition": ["prop.dt_create(>=:2025-10-06)"],
  "fields": ["alias", "email", "seekworkingyear"]
}
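
The format above can be pinned down with a JSON Schema and passed to OpenAI as a structured-output `response_format`. The sketch below only assembles a request body and sends nothing. The schema name `odmdb_query`, the system prompt, and the `buildRequest` helper are assumptions, and poc.js may use a different schema or API surface.

```javascript
// JSON Schema for the query shape shown above. Strict structured outputs
// require every property to be listed in `required` and
// `additionalProperties: false`.
const odmdbQuerySchema = {
  type: "object",
  properties: {
    object: { type: "string", enum: ["seekers"] },
    condition: { type: "array", items: { type: "string" } },
    fields: { type: "array", items: { type: "string" } },
  },
  required: ["object", "condition", "fields"],
  additionalProperties: false,
};

// Hypothetical helper: assembles a Chat Completions request body
// without sending it.
function buildRequest(nlQuery, model = "gpt-5") {
  return {
    model,
    messages: [
      {
        role: "system",
        content: "Convert the user's question into an ODMDB query.",
      },
      { role: "user", content: nlQuery },
    ],
    response_format: {
      type: "json_schema",
      json_schema: { name: "odmdb_query", strict: true, schema: odmdbQuerySchema },
    },
  };
}

const request = buildRequest(
  "give me new seekers since last week with email and experience"
);
console.log(JSON.stringify(request.response_format, null, 2));
```

With the official `openai` package, a body like this would typically be passed to `client.chat.completions.create(request)`, and the JSON text in the reply parsed with `JSON.parse`.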

ODMDB DSL Support

The PoC understands and generates these ODMDB DSL patterns:

  • Property queries: prop.<field>(operator:value)
  • Index queries: idx.<indexName>(value)
  • Join queries: join(remoteObject:localKey:remoteProp:operator:value)
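
The three patterns can be recognized with a small parser such as the sketch below. `parseCondition` is a hypothetical helper, not the parser in poc.js, and the grammar is inferred only from the pattern strings listed above (for example, it assumes the operator itself never contains a colon).

```javascript
// Hypothetical parser for the three condition patterns listed above.
function parseCondition(text) {
  // prop.<field>(operator:value)
  let m = /^prop\.([A-Za-z_]\w*)\(([^:()]+):(.*)\)$/.exec(text);
  if (m) {
    return { kind: "prop", field: m[1], operator: m[2], value: m[3] };
  }

  // idx.<indexName>(value)
  m = /^idx\.([A-Za-z_]\w*)\((.*)\)$/.exec(text);
  if (m) {
    return { kind: "idx", index: m[1], value: m[2] };
  }

  // join(remoteObject:localKey:remoteProp:operator:value)
  m = /^join\(([^:()]+):([^:()]+):([^:()]+):([^:()]+):(.*)\)$/.exec(text);
  if (m) {
    return {
      kind: "join",
      remoteObject: m[1],
      localKey: m[2],
      remoteProp: m[3],
      operator: m[4],
      value: m[5],
    };
  }

  throw new Error(`unrecognized condition: ${text}`);
}

const parsedProp = parseCondition("prop.dt_create(>=:2025-10-06)");
const parsedIdx = parseCondition("idx.seekstatus_alias(startasap)");
console.log(parsedProp, parsedIdx);
```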

Comprehensive Field Mappings

Supports extensive natural language mapping for all 62 seeker properties:

Contact & Identity:

  • email, contact, mail → email
  • id, username, alias → alias
  • bio, description, summary → shortdescription

Work Experience & Status:

  • experience, years of experience, career length → seekworkingyear
  • job titles, positions, roles, work history → seekjobtitleexperience
  • status, availability, urgency → seekstatus
  • employment, work status, job status → employmentstatus

Location & Geography:

  • location, where, work location → seeklocation
  • countries, work countries → countryavailabletowork
  • current location, last location → lastlocation

Salary & Compensation:

  • salary, pay, compensation, wage → salaryexpectation
  • currency, salary currency → salarydevise
  • salary unit, pay period → salaryunit

Skills & Competencies:

  • skills, competencies, abilities → skills
  • languages, language skills → languageskills
  • knowledge, expertise, know-how → knowhow

Personality & Preferences:

  • personality, MBTI, type → mbti
  • likes, interests, preferences → thingsilike
  • dislikes, avoid, not interested → thingsidislike

Job Search Activity:

  • applied jobs, applications → jobadapply
  • saved jobs, bookmarked jobs → jobadsaved
  • viewed jobs, job views → jobadview
  • invitations, invited to apply → jobadinvitedtoapply

Availability & Schedule:

  • working hours, preferred hours, schedule → preferedworkinghours
  • unavailable, blocked times → notavailabletowork

Dates & Activity:

  • created, new, recent, since → dt_create
  • updated, modified, last update → dt_update
  • last matching, matching date → matchinglastdate

Plus comprehensive mappings for education, notifications, training, and system fields.
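
How poc.js feeds these mappings to the model is not shown in this README. As a plain data structure, a handful of them could be sketched as a phrase-to-field lookup like the one below. `FIELD_SYNONYMS` and `mapQuestionToFields` are hypothetical names, and the longest-phrase-first matching is an illustrative choice, not the PoC's documented behavior.

```javascript
// A few of the documented mappings, keyed by natural-language phrase.
const FIELD_SYNONYMS = {
  "email": "email",
  "contact": "email",
  "years of experience": "seekworkingyear",
  "experience": "seekworkingyear",
  "salary": "salaryexpectation",
  "skills": "skills",
  "language skills": "languageskills",
  "languages": "languageskills",
  "saved jobs": "jobadsaved",
  "last update": "dt_update",
};

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Match longer phrases first so "language skills" wins over plain "skills".
function mapQuestionToFields(question) {
  let remaining = question.toLowerCase();
  const fields = new Set();
  const phrases = Object.keys(FIELD_SYNONYMS).sort(
    (a, b) => b.length - a.length
  );
  for (const phrase of phrases) {
    const pattern = new RegExp(`\\b${escapeRegExp(phrase)}\\b`, "g");
    const stripped = remaining.replace(pattern, " ");
    if (stripped !== remaining) {
      fields.add(FIELD_SYNONYMS[phrase]);
      remaining = stripped; // consume the phrase so shorter ones cannot re-match it
    }
  }
  return [...fields].sort();
}

const mapped = mapQuestionToFields(
  "Show me seekers with email, language skills and years of experience"
);
console.log(mapped);
```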

Schema Context

The PoC can optionally load schema files for context:

  • main.json - Combined schema definitions
  • lg.json - Localization/language mappings

Limitations

  • Seekers only: Other ODMDB objects (jobads, recruiters, etc.) are not yet implemented
  • Local execution only: Works with file-based data, not live ODMDB server API
  • Hardcoded query: Single query per run (no interactive mode)
  • Performance limit: Processes first 50 seeker files for PoC performance
  • Simplified DSL: Basic condition parsing (date ranges, status filtering)
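
To make the local-execution limits concrete, the sketch below caps the scan at 50 records, applies simple property conditions, and projects the requested fields. It is an illustration under stated assumptions, not the code in poc.js: `loadSeekers` and `runQuery` are hypothetical, each seeker file is assumed to be a single `.json` object, conditions are assumed to be already parsed into `{ field, operator, value }`, and only `>=` appears in this README (the `<=` operator is added for date ranges as an assumption).

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical loader: reads at most `limit` JSON files from a directory.
// Assumption: one seeker object per .json file.
function loadSeekers(dir, limit = 50) {
  return fs
    .readdirSync(dir)
    .filter((name) => name.endsWith(".json"))
    .slice(0, limit)
    .map((name) => JSON.parse(fs.readFileSync(path.join(dir, name), "utf8")));
}

// ISO dates (YYYY-MM-DD) sort correctly as strings, so string comparison suffices.
const OPERATORS = {
  ">=": (a, b) => a >= b,
  "<=": (a, b) => a <= b,
};

// Hypothetical executor: filter by every condition, then keep only `fields`.
function runQuery(records, conditions, fields, limit = 50) {
  return records
    .slice(0, limit) // only the first `limit` records are scanned
    .filter((rec) =>
      conditions.every(({ field, operator, value }) => {
        const compare = OPERATORS[operator];
        if (!compare) throw new Error(`unsupported operator: ${operator}`);
        return rec[field] !== undefined && compare(String(rec[field]), value);
      })
    )
    .map((rec) => Object.fromEntries(fields.map((f) => [f, rec[f]])));
}

// Made-up in-memory records stand in for loadSeekers("../smatchitObjectOdmdb/objects/seekers/itm").
const sampleRecords = [
  { alias: "anna", email: "anna@example.com", dt_create: "2025-10-08" },
  { alias: "bob", email: "bob@example.com", dt_create: "2025-09-30" },
];

const rows = runQuery(
  sampleRecords,
  [{ field: "dt_create", operator: ">=", value: "2025-10-06" }],
  ["alias", "email"]
);
console.log(rows);
```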

Next Steps

  • Add support for other ODMDB objects (jobads, recruiters, etc.)
  • Interactive CLI for multiple queries
  • Integration with actual ODMDB backend
  • Enhanced field mapping and validation
  • Multi-turn conversation support

Files

Core Implementation:

  • poc.js - Main PoC implementation (query generation, validation, and local execution against ODMDB data files)
  • package.json - Dependencies and scripts

Demo & Testing:

  • demo.js - Live PoC demo that actually generates and executes queries using real ODMDB data
  • test-jq.js - jq processing capabilities demonstration
  • experiment-jq-playground.js - jq learning playground (optional, not vital to PoC)

Data & Schema:

  • main.json - Optional consolidated schema context (if available)
  • ../smatchitObjectOdmdb/schema/seekers.json - Real seekers schema (62 properties)
  • ../smatchitObjectOdmdb/objects/seekers/itm/ - Individual seeker data files