ODMDB Natural Language Query PoC
This is a Proof of Concept (PoC) that demonstrates the conversion of natural language queries into ODMDB search queries using OpenAI's structured output API.
Current Status
✅ Complete Multi-Schema Implementation: Supports all ODMDB object types including seekers, jobads, recruiters, persons, and sirets. The system intelligently detects the target object from natural language queries and generates appropriate ODMDB DSL queries.
Features
- Multi-Object Natural Language Processing: Intelligently detects target object (seekers, jobads, recruiters, persons, sirets) from natural language queries
- Real ODMDB Schema Integration: Dynamically loads actual schema files for all object types with verified accuracy
- Comprehensive Field Mapping: Uses real schema definitions with proper access rights for recruiter-readable fields
- Index-Aware Query Generation: Leverages actual ODMDB indexes for optimal query performance
- Schema Mapping Manager: Centralized system reading real schema files and generating comprehensive field synonyms
- Multi-Object Query Support: Handles queries across all ODMDB object types with object-specific optimizations
- OpenAI Structured Output: Dynamic JSON schema generation for any target object type
- Real Data Validation: Verified against actual ODMDB schema properties and index registers
- Prepared Query Demos: Ready-to-use example queries for all supported object types
Prerequisites
- Node.js (v16 or higher)
- OpenAI API key
Installation
- Make sure you have the complete ODMDB data structure available:
../smatchitObjectOdmdb/
├── schema/
│   ├── seekers.json     # Seeker schema (62 properties, 27 readable fields)
│   ├── jobads.json      # Job advertisement schema
│   ├── recruiters.json  # Recruiter schema
│   ├── persons.json     # Person schema
│   ├── sirets.json      # Company/Siret schema
│   └── *.json           # Additional schema files
└── objects/
    ├── seekers/
    │   ├── idx/         # Index files (lst_alias, seekstatus_alias, etc.)
    │   └── itm/         # Individual seeker JSON files
    ├── jobads/
    │   ├── idx/         # Job ad indexes
    │   └── itm/         # Job ad data files
    ├── recruiters/
    │   ├── idx/         # Recruiter indexes
    │   └── itm/         # Recruiter data files
    ├── persons/
    │   └── itm/         # Person data files
    └── sirets/
        └── itm/         # Company data files
- Install dependencies:
npm install
- Set your OpenAI API key:
export OPENAI_API_KEY=sk-your-api-key-here
Usage
Running the PoC
Interactive Demo (Recommended):
node demo.js
This runs the comprehensive demo with prepared queries for all object types and shows real-time query generation.
Main PoC (Query Generation Only):
npm start
Main PoC with Query Execution:
EXECUTE_QUERY=true npm start
Both commands process the hardcoded natural language query and print the generated ODMDB query as JSON. With EXECUTE_QUERY=true, the query is also executed against the ODMDB server.
Changing the Query
To test different natural language queries, edit the NL_QUERY constant in poc.js:
// Line 16 in poc.js
const NL_QUERY = "your natural language query here";
The system will automatically detect which object type you're asking about and generate the appropriate query.
Example Queries by Object Type
Seekers (Job Seekers)
Status-based queries:
"show me seekers with status startasap and their email and experience"
"find seekers looking for jobs urgently with their skills and salary expectations"
"get seekers who are not looking with their employment status"
Skills & experience:
"find seekers with technical skills and years of experience"
"show me seekers with language abilities and personality profiles"
"get seekers with specific know-how and job radar interests"
Location & preferences:
"show me seekers in Paris with remote work preferences"
"find seekers available to work in multiple countries"
"get seekers with specific location and salary requirements"
Job Ads
Job search queries:
"show me recent job postings in technology"
"find job ads with high salary ranges"
"get job advertisements posted this week"
Company & location:
"show me jobs at specific companies"
"find remote job opportunities"
"get job ads in Paris or Lyon"
Recruiters
Recruiter information:
"show me active recruiters and their specializations"
"find recruiters from specific companies"
"get recruiter contact information and experience"
Persons
General person queries:
"show me person profiles with their roles"
"find persons by their experience or background"
Companies (Sirets)
Company information:
"show me companies in the technology sector"
"find companies by size or location"
"get company details and contact information"
Supported Query Types
Multi-Object Intelligence: The system automatically detects which object you're asking about:
- Mentions of "seekers", "candidates", "job seekers" → seekers object
- Mentions of "jobs", "positions", "job ads" → jobads object
- Mentions of "recruiters", "hiring managers" → recruiters object
- Mentions of "persons", "people", "profiles" → persons object
- Mentions of "companies", "employers", "organizations" → sirets object
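The keyword routing listed above can be approximated with a small lookup. This is only a sketch: `OBJECT_KEYWORDS` and `detectObject` are hypothetical names, and the PoC itself may rely on the OpenAI model rather than on keyword matching to pick the object.

```javascript
// Hypothetical keyword table built from the mentions listed above.
const OBJECT_KEYWORDS = {
  seekers: ["job seekers", "seekers", "candidates"],
  jobads: ["job ads", "jobs", "positions"],
  recruiters: ["recruiters", "hiring managers"],
  persons: ["persons", "people", "profiles"],
  sirets: ["companies", "employers", "organizations"],
};

// Return the object whose keyword appears earliest in the query,
// or null when no keyword matches.
function detectObject(nlQuery) {
  const text = nlQuery.toLowerCase();
  let best = null;
  let bestPos = Infinity;
  for (const [object, keywords] of Object.entries(OBJECT_KEYWORDS)) {
    for (const keyword of keywords) {
      const pos = text.indexOf(keyword);
      if (pos !== -1 && pos < bestPos) {
        best = object;
        bestPos = pos;
      }
    }
  }
  return best;
}

console.log(detectObject("find seekers looking for jobs urgently")); // seekers
```

Picking the earliest match is a design choice for this sketch: it keeps a query such as "seekers looking for jobs" routed to seekers even though "jobs" also appears.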
Filter Types:
- Status filtering: Object-specific status fields
- Date filtering: Creation dates, update dates with date ranges
- Index optimization: Uses real ODMDB indexes for efficient queries
- Field-specific: Searches within specific properties
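As an illustration of date filtering, a relative phrase such as "posted this week" has to become an absolute date before it fits the `prop.dt_create(>=:YYYY-MM-DD)` form shown in the output format. The helper below is a hypothetical sketch; treating "this week" as the last 7 days, and working in UTC, are assumptions rather than documented PoC behavior.

```javascript
// Build a dt_create condition covering the last `days` days (UTC).
function createdSinceCondition(now, days) {
  const since = new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
  const isoDay = since.toISOString().slice(0, 10); // YYYY-MM-DD
  return `prop.dt_create(>=:${isoDay})`;
}

console.log(createdSinceCondition(new Date("2025-10-13T12:00:00Z"), 7));
// prop.dt_create(>=:2025-10-06)
```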
Schema Mapping System
The PoC uses a schema mapping system located in schema-mappings/:
Architecture
- ODMDBMappingManager: Central manager that loads and caches schema mappings
- Base Mapping: Core field synonym generation and mapping logic
- Object-Specific Mappings: Individual mapping files for each object type
- Real Schema Integration: Direct reading from actual ODMDB schema files
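The load-and-cache behavior described for the manager can be sketched as follows. `MappingCache` is a hypothetical stand-in, not the actual ODMDBMappingManager API; the loader is injected so the example runs without any schema files on disk.

```javascript
// Minimal load-and-cache sketch. A real loader would read and parse the
// schema file for the requested object; here it is passed in as a function.
class MappingCache {
  constructor(loadMapping) {
    this.loadMapping = loadMapping;
    this.cache = new Map();
  }

  // Load the mapping on first access, then serve it from the cache.
  get(objectName) {
    if (!this.cache.has(objectName)) {
      this.cache.set(objectName, this.loadMapping(objectName));
    }
    return this.cache.get(objectName);
  }
}

let loads = 0;
const mappingCache = new MappingCache((name) => {
  loads += 1;
  return { object: name, synonyms: {} };
});
mappingCache.get("seekers");
mappingCache.get("seekers");
console.log(loads); // 1
```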
Verified Schema Coverage
Seekers Object:
- 62 total schema properties mapped
- 27 recruiter-readable fields identified
- 3 indexes available (lst_alias, seekstatus_alias, alias)
- 206+ field synonyms generated from real schema definitions
All Objects:
- Dynamic schema loading for any ODMDB object type
- Access rights properly extracted from apxaccessrights structure
- Index definitions read from actual idx directories
- Field synonyms generated from real property definitions
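A rough sketch of the access-rights extraction: the snippet assumes the schema exposes `apxaccessrights.recruiters.R` as an array of field names and keeps only the names that also appear under `properties`. The exact schema shape is an assumption here, so the real files may need different handling; `recruiterReadableFields` is a hypothetical helper.

```javascript
// Return the recruiter-readable fields that are also defined as properties.
// Assumes apxaccessrights.recruiters.R is an array of field names.
function recruiterReadableFields(schema) {
  const readable = schema?.apxaccessrights?.recruiters?.R;
  if (!Array.isArray(readable)) return [];
  const known = schema.properties || {};
  return readable.filter((field) => field in known);
}

// Toy schema for illustration only, not the real seekers.json.
const toySchema = {
  properties: { alias: {}, email: {}, seekstatus: {}, internalnote: {} },
  apxaccessrights: { recruiters: { R: ["alias", "email", "seekstatus"] } },
};
console.log(recruiterReadableFields(toySchema));
```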
Field Mapping Examples
The system provides comprehensive natural language to field mappings:
Contact & Identity:
- email, contact, mail → email
- id, username, alias → alias
- bio, description, summary → shortdescription
Work Experience & Status:
- experience, years of experience, career length → seekworkingyear
- job titles, positions, roles, work history → seekjobtitleexperience
- status, availability, urgency → seekstatus
Location & Geography:
- location, where, work location → seeklocation
- countries, work countries → countryavailabletowork
Skills & Competencies:
- skills, competencies, abilities → skills
- languages, language skills → languageskills
- knowledge, expertise, know-how → knowhow
(Plus hundreds more mappings for all object types)
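Resolving a phrase to a field can be sketched as an inverted index over the synonym lists. The table below copies a few of the seeker mappings above; `buildSynonymIndex` and `resolveField` are hypothetical helper names, not functions from the PoC.

```javascript
// A few seeker mappings from the list above, keyed by target field.
const SEEKER_SYNONYMS = {
  email: ["email", "contact", "mail"],
  alias: ["id", "username", "alias"],
  shortdescription: ["bio", "description", "summary"],
  seekworkingyear: ["experience", "years of experience", "career length"],
  seekstatus: ["status", "availability", "urgency"],
  seeklocation: ["location", "where", "work location"],
};

// Invert the table so each phrase points at its field.
function buildSynonymIndex(synonymsByField) {
  const index = new Map();
  for (const [field, phrases] of Object.entries(synonymsByField)) {
    for (const phrase of phrases) {
      index.set(phrase.toLowerCase(), field);
    }
  }
  return index;
}

// Look up a phrase case-insensitively; null when it is not a known synonym.
function resolveField(index, phrase) {
  return index.get(phrase.trim().toLowerCase()) ?? null;
}

const seekerIndex = buildSynonymIndex(SEEKER_SYNONYMS);
console.log(resolveField(seekerIndex, "Years of Experience")); // seekworkingyear
```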
Output Format
The PoC generates ODMDB queries in this format:
{
  "object": "seekers",
  "condition": ["prop.dt_create(>=:2025-10-06)"],
  "fields": ["alias", "email", "seekworkingyear"]
}
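A JSON Schema that constrains model output to this shape might look like the sketch below. `buildResponseSchema` is a hypothetical helper and the schema the PoC actually sends may differ; how the schema is attached to the OpenAI request is deliberately left out.

```javascript
// Build a JSON Schema for one target object, restricting "fields" to the
// readable field names passed in.
function buildResponseSchema(objectName, readableFields) {
  return {
    type: "object",
    properties: {
      object: { type: "string", enum: [objectName] },
      condition: { type: "array", items: { type: "string" } },
      fields: {
        type: "array",
        items: { type: "string", enum: readableFields },
      },
    },
    required: ["object", "condition", "fields"],
    additionalProperties: false,
  };
}

const seekersResponseSchema = buildResponseSchema("seekers", [
  "alias",
  "email",
  "seekworkingyear",
]);
console.log(JSON.stringify(seekersResponseSchema, null, 2));
```

Listing the allowed field names as an `enum` is one way to keep the model from inventing properties that do not exist in the target schema.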
ODMDB DSL Support
The PoC understands and generates these ODMDB DSL patterns:
- Property queries: prop.<field>(operator:value)
- Index queries: idx.<indexName>(value)
- Join queries: join(remoteObject:localKey:remoteProp:operator:value)
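The three patterns are plain strings, so small builder functions are enough to assemble them. The helper names and the regular expression below are illustrative assumptions, and the set of operators ODMDB accepts is not checked here.

```javascript
// String builders for the three DSL patterns listed above.
const propCondition = (field, operator, value) =>
  `prop.${field}(${operator}:${value})`;
const idxCondition = (indexName, value) => `idx.${indexName}(${value})`;
const joinCondition = (remoteObject, localKey, remoteProp, operator, value) =>
  `join(${remoteObject}:${localKey}:${remoteProp}:${operator}:${value})`;

// Parse a property condition back into its parts, or return null.
const PROP_PATTERN = /^prop\.(\w+)\(([^:()]+):(.*)\)$/;
function parsePropCondition(condition) {
  const match = PROP_PATTERN.exec(condition);
  if (!match) return null;
  const [, field, operator, value] = match;
  return { field, operator, value };
}

console.log(propCondition("dt_create", ">=", "2025-10-06"));
// prop.dt_create(>=:2025-10-06)
console.log(idxCondition("seekstatus_alias", "startasap"));
// idx.seekstatus_alias(startasap)
```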
Demo & Testing Tools
Interactive Demo:
node demo.js
Live PoC demonstration featuring:
- Real query generation from natural language using OpenAI
- Multi-object detection and schema loading
- Prepared queries for all supported object types
- Real-time field mapping and validation
- Current ODMDB data status display
Demo Features:
- Prepared Queries: 4 example queries per object type (20 total)
- Schema Validation: Shows actual field counts and mappings
- Real-time Generation: Demonstrates actual OpenAI API integration
- Multi-Object Support: Covers seekers, jobads, recruiters, persons, sirets
Environment Variables
- OPENAI_API_KEY - Your OpenAI API key (required)
- EXECUTE_QUERY - Set to "true" to execute queries against ODMDB (default: false)
- EXECUTE_DEMO - Set to "true" to execute demo queries with real generation
- ODMDB_BASE_URL - ODMDB server URL (default: http://localhost:3000)
- ODMDB_TRIBE - ODMDB tribe name (default: smatchit)
- OPENAI_MODEL - OpenAI model to use (default: gpt-4o)
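Reading these variables with their documented defaults could look like this. `loadConfig` is a hypothetical helper, not a function from poc.js; it takes the environment as an argument so it can be tried without touching the real `process.env`.

```javascript
// Read the documented variables from an env-like object, applying defaults.
function loadConfig(env) {
  if (!env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required");
  }
  return {
    openaiApiKey: env.OPENAI_API_KEY,
    executeQuery: env.EXECUTE_QUERY === "true",
    executeDemo: env.EXECUTE_DEMO === "true",
    odmdbBaseUrl: env.ODMDB_BASE_URL || "http://localhost:3000",
    odmdbTribe: env.ODMDB_TRIBE || "smatchit",
    openaiModel: env.OPENAI_MODEL || "gpt-4o",
  };
}

// In a script this would typically be called as loadConfig(process.env).
console.log(loadConfig({ OPENAI_API_KEY: "sk-your-api-key-here" }).odmdbTribe);
// smatchit
```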
System Validation
The mappings have been thoroughly validated to ensure they:
✅ Read actual ODMDB schema files - Not hardcoded mappings
✅ Access real index registers - Uses actual idx directory files
✅ Extract proper access rights - Reads apxaccessrights.recruiters.R structure
✅ Generate comprehensive synonyms - 200+ field mappings per object
✅ Support all object types - Dynamic loading for any ODMDB schema
Technical Architecture
Core Components
- poc.js: Main PoC engine with multi-object support
- demo.js: Comprehensive demonstration with prepared queries
- schema-mappings/: Real schema integration system
- package.json: Dependencies and execution scripts
Schema Integration Flow
- Schema Loading: ODMDBMappingManager reads actual schema files
- Field Extraction: Extracts properties and access rights from real schemas
- Index Integration: Reads index definitions from idx directories
- Synonym Generation: Creates comprehensive field mappings
- Query Generation: Uses OpenAI with dynamic schema for target object
- Validation: Ensures generated queries match schema constraints
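The final validation step can be sketched as a check of a generated query against the loaded schema information. The `schemaInfo` shape (`object`, `readableFields`, `indexes`) and the `validateQuery` name are assumptions made for illustration; the PoC's own checks may be stricter or organized differently.

```javascript
// Return a list of problems found in a generated query; empty means valid.
function validateQuery(query, schemaInfo) {
  const errors = [];
  if (query.object !== schemaInfo.object) {
    errors.push(`unexpected object: ${query.object}`);
  }
  for (const field of query.fields) {
    if (!schemaInfo.readableFields.includes(field)) {
      errors.push(`unknown or unreadable field: ${field}`);
    }
  }
  for (const condition of query.condition) {
    // Only idx.<indexName>(...) conditions are checked against known indexes.
    const idxMatch = /^idx\.(\w+)\(/.exec(condition);
    if (idxMatch && !schemaInfo.indexes.includes(idxMatch[1])) {
      errors.push(`unknown index: ${idxMatch[1]}`);
    }
  }
  return errors;
}

const seekersInfo = {
  object: "seekers",
  readableFields: ["alias", "email", "seekworkingyear"],
  indexes: ["lst_alias", "seekstatus_alias", "alias"],
};
console.log(
  validateQuery(
    { object: "seekers", condition: ["idx.seekstatus_alias(startasap)"], fields: ["alias", "email"] },
    seekersInfo
  )
); // []
```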
Data Flow
Natural Language Query
↓
Object Detection (seekers/jobads/recruiters/persons/sirets)
↓
Schema Loading (real ODMDB schema files)
↓
Field Mapping (comprehensive synonym matching)
↓
OpenAI Structured Output (dynamic JSON schema)
↓
ODMDB DSL Query (validated against real schema)
Limitations
- Local schema files required: Needs access to actual ODMDB schema structure
- OpenAI API dependency: Requires valid API key and credits
- Performance considerations: Schema loading and mapping generation takes time
- Single query per run: No interactive conversation mode (yet)
Next Steps
- Interactive CLI for multiple queries in conversation
- Enhanced query execution with real ODMDB server integration
- Query result processing and formatting improvements
- Advanced multi-object join queries
- Performance optimizations for schema loading
- User interface for non-technical users
Files
Core Implementation:
- poc.js - Main PoC engine supporting all ODMDB object types
- demo.js - Comprehensive demo with real query generation
- package.json - Dependencies and scripts
Schema System:
- schema-mappings/ - Complete schema mapping system
  - odmdb-mapping-manager.js - Central mapping coordinator
  - base-mapping.js - Core mapping logic and synonym generation
  - seekers-mapping.js, jobads-mapping.js, etc. - Object-specific mappings
Data Integration:
- ../smatchitObjectOdmdb/schema/*.json - Real ODMDB schema files
- ../smatchitObjectOdmdb/objects/*/idx/ - Index definition files
- ../smatchitObjectOdmdb/objects/*/itm/ - Data files for all object types
Verification
The system has been validated against real ODMDB data:
- Schema Properties: All properties correctly read from actual schema files
- Index Access: Confirmed access to real index files (lst_alias, seekstatus_alias, etc.)
- Access Rights: Proper extraction of recruiter-readable fields
- Field Mappings: Comprehensive synonym generation from actual definitions
- Multi-Object Support: Verified functionality across all object types
This ensures the PoC works with actual ODMDB schema properties and accesses real index registers as required for production readiness.