[UPDATE] Enhance README and demo scripts with comprehensive query examples and improved schema mappings

Eliyan
2025-10-15 10:01:00 +02:00
parent e863a0a5ea
commit 77c2c4ab9b
3 changed files with 635 additions and 66 deletions

169
README.md

@@ -61,7 +61,7 @@ npm start
EXECUTE_QUERY=true npm start
```
```
````
This processes the hardcoded natural language query and prints the generated ODMDB query as JSON. When `EXECUTE_QUERY=true`, it also executes the query against the local, file-based ODMDB data (not a live ODMDB server API).
@@ -72,40 +72,102 @@ To test different natural language queries, edit the `NL_QUERY` constant in `poc
```javascript
// Line 16 in poc.js
const NL_QUERY = "your natural language query here";
```
````
### Example Queries
**Status-based queries:**
- `"show me seekers with status startasap and their email and experience"`
- `"find seekers looking for jobs urgently with their skills"`
- `"find seekers looking for jobs urgently with their skills and salary expectations"`
- `"get seekers who are not looking with their employment status"`
**Date-based queries:**
- `"give me new seekers since last week with email and experience"`
- `"show me seekers from yesterday with their skills"`
- `"show me seekers from yesterday with their location and availability"`
- `"find recently updated seekers with their job preferences"`
**Field-specific queries:**
**Comprehensive field queries:**
- `"find seekers with job titles and salary expectations"`
- `"show me seeker locations and availability"`
- `"show me seeker contact info and work experience"`
- `"find seekers with personality types and language skills"`
- `"get seeker salary expectations and preferred working hours"`
- `"show me seeker education and training preferences"`
- `"find seekers with their job applications and saved jobs"`
**Location & preferences:**
- `"show me seekers in Paris with remote work preferences"`
- `"find seekers available to work in multiple countries"`
- `"get seekers with specific location and salary requirements"`
**Skills & competencies:**
- `"find seekers with technical skills and years of experience"`
- `"show me seekers with language abilities and personality profiles"`
- `"get seekers with specific know-how and job radar interests"`
**Job search activity:**
- `"show me seekers who applied to jobs recently"`
- `"find seekers with saved jobs and their preferences"`
- `"get seekers who were invited to apply with their status"`
**Notifications & communication:**
- `"show me seekers with email preferences and notification settings"`
- `"find seekers who receive weekly reports and interview tips"`
**Supported filter types:**
- **Status filtering**: `seekstatus` (startasap, norush, notlooking)
- **Date filtering**: `dt_create` with date ranges
- **Index optimization**: Uses ODMDB indexes for efficient queries
- **Date filtering**: `dt_create`, `dt_update`, `matchinglastdate` with date ranges
- **Index optimization**: Uses ODMDB indexes (`lst_alias`, `seekstatus_alias`) for efficient queries
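
As a rough illustration of the date filtering listed above, a relative phrase such as "since last week" could be turned into a condition of the form `prop.dt_create(>=:YYYY-MM-DD)`, the form used in the `poc.js` system prompt. The helper name `dateCondition` and its parameters are hypothetical and do not appear in `poc.js`; this is a sketch, not the PoC's implementation.

```javascript
// Hypothetical helper (not part of poc.js): build a ">=" date condition
// for an ODMDB date field, assuming the DSL form prop.<field>(>=:YYYY-MM-DD).
function dateCondition(field, daysAgo, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const cutoff = new Date(now.getTime() - daysAgo * msPerDay);
  const day = cutoff.toISOString().slice(0, 10); // YYYY-MM-DD, in UTC
  return `prop.${field}(>=:${day})`;
}

// "since last week", evaluated on 2025-10-15 (UTC)
console.log(dateCondition("dt_create", 7, new Date("2025-10-15T00:00:00Z")));
// prints: prop.dt_create(>=:2025-10-08)
```

The cutoff is computed in UTC, which is an assumption; a deployment that stores local timestamps would need to adjust for that.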
### Demo & Testing Tools
**Interactive Demo:**
This demonstrates various jq operations including:
```bash
node demo.js
```
**Live PoC demonstration** that actually uses the query generation functionality to show:
- Real query generation from natural language using OpenAI
- ODMDB schema loading and field mapping
- Current ODMDB data status and sample data
**Demo with Query Execution:**
```bash
EXECUTE_DEMO=true node demo.js
```
Runs the demo with actual query execution against real seeker data files.
**jq Processing Test:**
```bash
node test-jq.js
```
Demonstrates various jq operations including:
- Basic data formatting and field selection
- CSV conversion from JSON
- Advanced filtering and transformations
- Statistical summaries and aggregations
**jq Playground (Optional):**
```bash
node experiment-jq-playground.js
```
A playground to experiment with jq commands - not vital to the PoC but useful for learning jq syntax.
## Environment Variables
- `OPENAI_API_KEY` - Your OpenAI API key (required)
@@ -135,14 +197,66 @@ The PoC understands and generates these ODMDB DSL patterns:
- **Index queries**: `idx.<indexName>(value)`
- **Join queries**: `join(remoteObject:localKey:remoteProp:operator:value)`
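
To make the index and property patterns concrete, the sketch below assembles a payload in the shape that the JSON Schema in `demo.js` describes (`object`, `condition`, `fields`). The helper names `idx` and `prop` are hypothetical, introduced only for illustration.

```javascript
// Hypothetical helpers (not part of poc.js) that format DSL condition strings.
function idx(indexName, value) {
  return `idx.${indexName}(${value})`;
}

function prop(field, operator, value) {
  return `prop.${field}(${operator}:${value})`;
}

// A payload for "seekers with status startasap and their email and experience".
const payload = {
  object: "seekers",
  condition: [idx("seekstatus_alias", "startasap")],
  fields: ["email", "seekworkingyear"],
};

console.log(JSON.stringify(payload, null, 2));
console.log(prop("dt_create", ">=", "2025-09-14"));
```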
## Field Mappings
## Comprehensive Field Mappings
Currently supports mapping for seekers object:
Supports extensive natural language mapping for **all 62 seeker properties**:
- `email` → `email`
- `experience` → `seekworkingyear`
- `job titles` → `seekjobtitleexperience`
- `status` → `seekstatus`
**Contact & Identity:**
- `email`, `contact`, `mail` → `email`
- `id`, `username`, `alias` → `alias`
- `bio`, `description`, `summary` → `shortdescription`
**Work Experience & Status:**
- `experience`, `years of experience`, `career length` → `seekworkingyear`
- `job titles`, `positions`, `roles`, `work history` → `seekjobtitleexperience`
- `status`, `availability`, `urgency` → `seekstatus`
- `employment`, `work status`, `job status` → `employmentstatus`
**Location & Geography:**
- `location`, `where`, `work location` → `seeklocation`
- `countries`, `work countries` → `countryavailabletowork`
- `current location`, `last location` → `lastlocation`
**Salary & Compensation:**
- `salary`, `pay`, `compensation`, `wage` → `salaryexpectation`
- `currency`, `salary currency` → `salarydevise`
- `salary unit`, `pay period` → `salaryunit`
**Skills & Competencies:**
- `skills`, `competencies`, `abilities` → `skills`
- `languages`, `language skills` → `languageskills`
- `knowledge`, `expertise`, `know-how` → `knowhow`
**Personality & Preferences:**
- `personality`, `MBTI`, `type` → `mbti`
- `likes`, `interests`, `preferences` → `thingsilike`
- `dislikes`, `avoid`, `not interested` → `thingsidislike`
**Job Search Activity:**
- `applied jobs`, `applications` → `jobadapply`
- `saved jobs`, `bookmarked jobs` → `jobadsaved`
- `viewed jobs`, `job views` → `jobadview`
- `invitations`, `invited to apply` → `jobadinvitedtoapply`
**Availability & Schedule:**
- `working hours`, `preferred hours`, `schedule` → `preferedworkinghours`
- `unavailable`, `blocked times` → `notavailabletowork`
**Dates & Activity:**
- `created`, `new`, `recent`, `since` → `dt_create`
- `updated`, `modified`, `last update` → `dt_update`
- `last matching`, `matching date` → `matchinglastdate`
_Plus comprehensive mappings for education, notifications, training, and system fields._
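
One way to read the mappings above is as a reverse lookup from a natural-language term to a schema field. The sketch below uses a small subset of the mappings; `FIELD_SYNONYMS`, `buildLookup`, and `resolveField` are hypothetical names, and the first-match-wins rule is an assumption rather than the documented behaviour of `SchemaMapper`.

```javascript
// Small illustrative subset of the synonym table (hypothetical constant name).
const FIELD_SYNONYMS = {
  email: ["contact", "mail"],
  seekworkingyear: ["experience", "years of experience", "career length"],
  salaryexpectation: ["salary", "pay", "compensation", "wage"],
  mbti: ["personality", "MBTI", "type"],
};

// Build a case-insensitive term -> field map. If two fields share a synonym,
// the first field listed keeps it (an assumption made for this sketch).
function buildLookup(mappings) {
  const lookup = new Map();
  for (const [field, synonyms] of Object.entries(mappings)) {
    lookup.set(field.toLowerCase(), field);
    for (const synonym of synonyms) {
      const key = synonym.toLowerCase();
      if (!lookup.has(key)) lookup.set(key, field);
    }
  }
  return lookup;
}

function resolveField(term, lookup) {
  return lookup.get(term.trim().toLowerCase()) ?? null;
}

const lookup = buildLookup(FIELD_SYNONYMS);
console.log(resolveField("Pay", lookup)); // prints: salaryexpectation
console.log(resolveField("unknown term", lookup)); // prints: null
```

Some synonyms in `poc.js` are listed under more than one field (for example `availability` under both `seekstatus` and `preferedworkinghours`, and `not interested` under both `thingsidislike` and `jobadnotinterested`), so any lookup of this kind needs an explicit tie-breaking rule.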
## Schema Context
@@ -154,9 +268,10 @@ The PoC can optionally load schema files for context:
## Limitations
- **Seekers only**: Other ODMDB objects (jobads, recruiters, etc.) are not yet implemented
- **No execution**: Only generates queries, doesn't execute them against ODMDB
- **Local execution only**: Works with file-based data, not live ODMDB server API
- **Hardcoded query**: Single query per run (no interactive mode)
- **Basic validation**: Limited DSL syntax validation
- **Performance limit**: Processes first 50 seeker files for PoC performance
- **Simplified DSL**: Basic condition parsing (date ranges, status filtering)
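
For context on the simplified DSL handling: the demo's `executeQuery` matches conditions by substring. A slightly more structured parse of the two basic patterns could look like the sketch below. `parseCondition` is a hypothetical function, and the regular expressions only cover the `idx.<indexName>(value)` and `prop.<field>(operator:value)` forms, not joins.

```javascript
// Hypothetical parser (not part of poc.js) for two basic DSL condition forms.
function parseCondition(condition) {
  const idxMatch = /^idx\.(\w+)\((.+)\)$/.exec(condition);
  if (idxMatch) {
    return { kind: "idx", index: idxMatch[1], value: idxMatch[2] };
  }
  const propMatch = /^prop\.(\w+)\(([^:]+):(.+)\)$/.exec(condition);
  if (propMatch) {
    return {
      kind: "prop",
      field: propMatch[1],
      operator: propMatch[2],
      value: propMatch[3],
    };
  }
  return null; // unsupported or malformed condition
}

console.log(parseCondition("idx.seekstatus_alias(startasap)"));
console.log(parseCondition("prop.dt_create(>=:2025-09-14)"));
```

Returning `null` for anything unrecognised lets the caller decide whether to skip the condition or reject the whole query.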
## Next Steps
@@ -168,7 +283,19 @@ The PoC can optionally load schema files for context:
## Files
- `poc.js` - Main PoC implementation
**Core Implementation:**
- `poc.js` - Main PoC implementation with full ODMDB integration
- `package.json` - Dependencies and scripts
- `main.json` - Optional schema context (if available)
- `lg.json` - Optional localization context (if available)
**Demo & Testing:**
- `demo.js` - **Live PoC demo** that actually generates and executes queries using real ODMDB data
- `test-jq.js` - jq processing capabilities demonstration
- `experiment-jq-playground.js` - jq learning playground (optional, not vital to PoC)
**Data & Schema:**
- `main.json` - Optional consolidated schema context (if available)
- `../smatchitObjectOdmdb/schema/seekers.json` - Real seekers schema (62 properties)
- `../smatchitObjectOdmdb/objects/seekers/itm/` - Individual seeker data files

295
demo.js

@@ -1,46 +1,287 @@
#!/usr/bin/env node
// Demo script showing different ODMDB query types with real data
// Demo script that actually uses the PoC functionality to demonstrate real query generation
import fs from "node:fs";
import OpenAI from "openai";
console.log("🚀 ODMDB NL to Query Demo");
console.log("=".repeat(50));
// Import PoC components (we'll need to extract them to make them reusable)
const MODEL = process.env.OPENAI_MODEL || "gpt-5";
const ODMDB_BASE_PATH = "../smatchitObjectOdmdb";
const SCHEMA_PATH = `${ODMDB_BASE_PATH}/schema`;
// Sample queries to demonstrate
const queries = [
console.log("🚀 ODMDB NL to Query Demo - Live PoC Testing");
console.log("=".repeat(60));
// Check prerequisites
if (!process.env.OPENAI_API_KEY) {
console.log("❌ Missing OPENAI_API_KEY environment variable");
console.log(" Set it with: export OPENAI_API_KEY=sk-your-api-key");
process.exit(1);
}
// Load schema (same function as in poc.js)
function loadJsonSafe(path) {
try {
if (fs.existsSync(path)) {
return JSON.parse(fs.readFileSync(path, "utf-8"));
}
} catch (e) {
console.warn(`Warning: Could not load ${path}:`, e.message);
}
return null;
}
// Load actual ODMDB schemas
const SCHEMAS = {
seekers: loadJsonSafe(`${SCHEMA_PATH}/seekers.json`),
main: loadJsonSafe("./main.json"), // Fallback consolidated schema
};
// Simplified SchemaMapper for demo
class DemoSchemaMapper {
constructor(schemas) {
this.seekersSchema = schemas.seekers;
console.log(
`📋 Loaded seekers schema with ${
Object.keys(this.seekersSchema?.properties || {}).length
} properties`
);
}
getRecruiterReadableFields() {
if (!this.seekersSchema?.apxaccessrights?.recruiters?.R) {
return ["alias", "email", "seekstatus", "seekworkingyear"];
}
return this.seekersSchema.apxaccessrights.recruiters.R;
}
getAllSeekersFields() {
if (!this.seekersSchema?.properties) return [];
return Object.keys(this.seekersSchema.properties);
}
}
const schemaMapper = new DemoSchemaMapper(SCHEMAS);
// Sample queries to demonstrate with actual PoC execution
const demoQueries = [
{
nl: "show me seekers with status startasap and their email and experience",
description: "Status-based filtering with field selection",
expectedCondition: "idx.seekstatus_alias(startasap)",
expectedFields: ["email", "seekworkingyear"],
},
{
nl: "find seekers looking for jobs urgently with salary expectations",
description: "Status synonym mapping + salary field",
expectedCondition: "idx.seekstatus_alias(startasap)",
expectedFields: ["salaryexpectation", "salaryunit"],
},
{
nl: "give me seekers from last month with their locations",
description: "Date-based filtering + location fields",
expectedCondition: "prop.dt_create(>=:2025-09-14)",
expectedFields: ["seeklocation"],
nl: "get seekers with their contact info and personality types",
description: "Multiple field types (contact + MBTI)",
},
];
console.log("📋 Demo Queries:");
queries.forEach((query, i) => {
console.log("Demo Queries - Testing Live PoC:");
// JSON Schema for query generation (same as poc.js)
function buildResponseJsonSchema() {
const recruiterReadableFields = schemaMapper.getRecruiterReadableFields();
return {
type: "object",
additionalProperties: false,
properties: {
object: { type: "string", enum: ["seekers"] },
condition: { type: "array", items: { type: "string" }, minItems: 1 },
fields: {
type: "array",
items: { type: "string", enum: recruiterReadableFields },
minItems: 1,
},
},
required: ["object", "condition", "fields"],
};
}
// System prompt (simplified version from poc.js)
function systemPrompt() {
const availableFields = schemaMapper.getAllSeekersFields();
const recruiterReadableFields = schemaMapper.getRecruiterReadableFields();
return [
"You convert a natural language request into an ODMDB search payload.",
"Return ONLY a compact JSON object that matches the provided JSON Schema.",
"",
"ODMDB DSL:",
"- idx.<indexName>(value) - for indexed fields",
"- prop.<field>(operator:value) - for direct property queries",
"",
"Available seekers fields:",
availableFields.slice(0, 15).join(", ") +
(availableFields.length > 15 ? "..." : ""),
"",
"Recruiter-readable fields (use these for field selection):",
recruiterReadableFields.join(", "),
"",
"Field mappings:",
"- 'email', 'contact info' → email",
"- 'experience', 'years of experience' → seekworkingyear",
"- 'status', 'availability' → seekstatus",
"- 'salary', 'pay' → salaryexpectation",
"- 'personality', 'MBTI' → mbti",
"",
"Status value mappings:",
"- 'urgent', 'urgently', 'ASAP' → startasap",
"- 'no rush', 'taking time' → norush",
"- 'not looking' → notlooking",
"",
"Rules: Object must be 'seekers'. Use idx.seekstatus_alias for status queries.",
].join("\n");
}
// OpenAI client and query function
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async function generateQuery(nlText) {
try {
const resp = await client.responses.create({
model: MODEL,
input: [
{ role: "system", content: systemPrompt() },
{
role: "user",
content: `Natural language request: "${nlText}"\nReturn ONLY the JSON object.`,
},
],
text: {
format: {
name: "OdmdbQuery",
type: "json_schema",
schema: buildResponseJsonSchema(),
strict: true,
},
},
});
const jsonText = resp.output_text || resp.output?.[0]?.content?.[0]?.text;
return JSON.parse(jsonText);
} catch (error) {
console.error(`❌ Query generation failed: ${error.message}`);
return null;
}
}
// Simple query execution (simplified from poc.js)
function loadSeekersData() {
const seekersItemsPath = `${ODMDB_BASE_PATH}/objects/seekers/itm`;
try {
const files = fs
.readdirSync(seekersItemsPath)
.filter((file) => file.endsWith(".json") && file !== "backup")
.slice(0, 10); // Just 10 files for demo speed
const seekers = [];
for (const file of files) {
try {
const filePath = `${seekersItemsPath}/${file}`;
const data = JSON.parse(fs.readFileSync(filePath, "utf-8"));
seekers.push(data);
} catch (error) {
// Skip invalid files
}
}
return seekers;
} catch (error) {
return [];
}
}
async function executeQuery(query) {
const allSeekers = loadSeekersData();
if (allSeekers.length === 0) return { data: [] };
let filteredSeekers = allSeekers;
// Simple filtering
for (const condition of query.condition) {
if (condition.includes("idx.seekstatus_alias(startasap)")) {
filteredSeekers = filteredSeekers.filter(
(seeker) => seeker.seekstatus === "startasap"
);
}
if (condition.includes("prop.salaryexpectation(exists:true)")) {
filteredSeekers = filteredSeekers.filter(
(seeker) => seeker.salaryexpectation
);
}
if (condition.includes("prop.email(exists:true)")) {
filteredSeekers = filteredSeekers.filter((seeker) => seeker.email);
}
if (condition.includes("prop.mbti(exists:true)")) {
filteredSeekers = filteredSeekers.filter((seeker) => seeker.mbti);
}
}
// Select only requested fields
const results = filteredSeekers.map((seeker) => {
const filtered = {};
for (const field of query.fields) {
if (seeker.hasOwnProperty(field)) {
filtered[field] = seeker[field];
}
}
return filtered;
});
return { data: results };
}
// Main demo execution
async function runDemo() {
const executeQueries = process.env.EXECUTE_DEMO === "true";
for (let i = 0; i < demoQueries.length; i++) {
const query = demoQueries[i];
console.log(`\n${i + 1}. "${query.nl}"`);
console.log(` Purpose: ${query.description}`);
console.log(` Expected DSL: ${query.expectedCondition}`);
console.log(` Expected Fields: ${query.expectedFields.join(", ")}`);
});
console.log("\n💡 To test these queries:");
console.log("1. Edit the NL_QUERY constant in poc.js");
console.log("2. Run: EXECUTE_QUERY=true npm start");
console.log(" 🤖 Generating query...");
const generatedQuery = await generateQuery(query.nl);
console.log("\n📊 Current ODMDB Status:");
if (generatedQuery) {
console.log(" ✅ Generated ODMDB Query:");
console.log(
` ${JSON.stringify(generatedQuery, null, 6).replace(/\n/g, "\n ")}`
);
if (executeQueries) {
console.log(" 🔍 Executing query...");
const results = await executeQuery(generatedQuery);
console.log(` 📊 Found ${results.data.length} results`);
if (results.data.length > 0) {
console.log(" 📋 Sample result:");
console.log(
` ${JSON.stringify(results.data[0], null, 6).replace(
/\n/g,
"\n "
)}`
);
}
}
} else {
console.log(" ❌ Failed to generate query");
}
if (i < demoQueries.length - 1) {
console.log(" " + "-".repeat(50));
}
}
if (!executeQueries) {
console.log(`\n💡 To execute queries and see results, run:`);
console.log(` EXECUTE_DEMO=true node demo.js`);
}
}
console.log("\n📊 ODMDB Status Check:");
// Check if ODMDB data is accessible
const seekersPath = "../smatchitObjectOdmdb/objects/seekers/itm";
@@ -98,4 +339,12 @@ try {
console.log(`❌ Error loading schema: ${error.message}`);
}
console.log("\n✅ Demo complete!");
console.log("\n🚀 Running Live PoC Demo...");
runDemo()
.then(() => {
console.log("\n✅ Demo complete!");
})
.catch((error) => {
console.error("\n❌ Demo failed:", error.message);
process.exit(1);
});

233
poc.js

@@ -23,7 +23,7 @@ const EXECUTE_QUERY = process.env.EXECUTE_QUERY === "true"; // Set to "true" to
// Hardcoded NL query for the PoC (no multi-turn)
const NL_QUERY =
"show me seekers with status startasap and their email and experience";
"find seekers looking for jobs urgently with their contact info and salary expectations";
// ---- Load schemas (safe) ----
function loadJsonSafe(path) {
@@ -155,17 +155,202 @@ class SchemaMapper {
generateSynonyms(fieldName, fieldDef) {
const synonyms = [];
// Common mappings based on actual schema
// Comprehensive mappings based on actual seekers schema (62 properties)
const commonMappings = {
email: ["contact", "mail", "contact email"],
seekworkingyear: ["experience", "years of experience", "work experience"],
seekjobtitleexperience: ["job titles", "job experience", "positions"],
seekstatus: ["status", "availability", "looking"],
dt_create: ["created", "creation date", "new", "recent", "since"],
salaryexpectation: ["salary", "pay", "compensation", "wage"],
seeklocation: ["location", "where", "place"],
mbti: ["personality", "type", "profile"],
alias: ["id", "identifier", "username"],
// Contact & Identity
email: ["contact", "mail", "contact email", "e-mail"],
alias: ["id", "identifier", "username", "user id"],
shortdescription: ["description", "bio", "summary", "about"],
// Work Experience & Status
seekworkingyear: [
"experience",
"years of experience",
"work experience",
"working years",
"career length",
],
seekjobtitleexperience: [
"job titles",
"job experience",
"positions",
"roles",
"previous jobs",
"work history",
],
seekstatus: [
"status",
"availability",
"looking",
"job search status",
"urgency",
],
employmentstatus: [
"employment",
"current status",
"work status",
"job status",
],
// Location & Geography
seeklocation: [
"location",
"where",
"place",
"work location",
"preferred location",
],
lastlocation: ["last location", "current location", "previous location"],
countryavailabletowork: [
"countries",
"available countries",
"work countries",
"country availability",
],
// Salary & Compensation
salaryexpectation: [
"salary",
"pay",
"compensation",
"wage",
"salary expectation",
"expected salary",
],
salarydevise: ["currency", "salary currency", "pay currency"],
salaryunit: [
"salary unit",
"pay unit",
"compensation unit",
"salary period",
],
// Job Preferences
seekjobtype: [
"job type",
"job types",
"employment type",
"contract type",
],
lookingforjobtype: [
"looking for",
"desired job type",
"preferred job type",
],
lookingforaction: ["actions", "desired actions", "preferred activities"],
lookingforother: ["other preferences", "additional requirements"],
// Skills & Competencies
skills: ["skills", "competencies", "abilities", "technical skills"],
languageskills: ["languages", "language skills", "linguistic skills"],
knowhow: ["knowledge", "expertise", "know-how", "competence"],
myworkexperience: [
"work experience",
"professional experience",
"career experience",
],
// Personality & Profile
mbti: ["personality", "type", "profile", "MBTI", "personality type"],
mywords: ["keywords", "profile words", "descriptive words"],
thingsilike: ["likes", "preferences", "interests", "things I like"],
thingsidislike: [
"dislikes",
"avoid",
"not interested",
"things I dislike",
],
// Availability & Schedule
preferedworkinghours: [
"working hours",
"preferred hours",
"work schedule",
"availability",
],
notavailabletowork: [
"unavailable",
"not available",
"blocked times",
"unavailable days",
],
// Job Search Activity
myjobradar: [
"job radar",
"tracked jobs",
"job interests",
"monitored jobs",
],
jobadview: ["viewed jobs", "job views", "seen jobs"],
jobadnotinterested: ["not interested", "rejected jobs", "dismissed jobs"],
jobadapply: ["applied jobs", "applications", "job applications"],
jobadinvitedtoapply: [
"invitations",
"invited to apply",
"job invitations",
],
jobadsaved: ["saved jobs", "bookmarked jobs", "favorite jobs"],
// Dates & Timestamps
dt_create: [
"created",
"creation date",
"new",
"recent",
"since",
"registration date",
],
dt_update: ["updated", "last update", "modified", "last modified"],
matchinglastdate: ["last matching", "matching date", "last match"],
// Education & Training
educations: [
"education",
"degree",
"diploma",
"qualification",
"studies",
],
tipsadvice: ["tips", "advice", "articles", "guidance"],
receivecommercialtraining: ["commercial training", "sales training"],
receivejobandinterviewtips: [
"interview tips",
"job tips",
"career advice",
],
// Notifications & Communication
notificationformatches: ["match notifications", "matching alerts"],
notificationforsupermatches: [
"super match notifications",
"premium matches",
],
notificationinvitedtoapply: [
"application invitations",
"invite notifications",
],
notificationrecruitprocessupdate: [
"recruitment updates",
"process updates",
],
notificationupcominginterview: [
"interview notifications",
"upcoming interviews",
],
notificationdirectmessage: ["direct messages", "chat notifications"],
emailactivityreportweekly: ["weekly reports", "weekly emails"],
emailactivityreportbiweekly: ["biweekly reports", "biweekly emails"],
emailactivityreportmonthly: ["monthly reports", "monthly emails"],
emailpersonnalizedcontent: ["personalized content", "custom content"],
emailnewsletter: ["newsletter", "news updates"],
// External IDs
polemploiid: ["pole emploi", "unemployment office", "job center ID"],
// System Fields
owner: ["owner", "account owner"],
activequizz: ["active quiz", "current quiz", "quiz"],
};
if (commonMappings[fieldName]) {
@@ -289,14 +474,22 @@ function systemPrompt() {
recruiterReadableFields.join(", "),
"",
"Field mappings for natural language:",
"- 'email' → email",
"- 'experience' → seekworkingyear",
"- 'job titles' → seekjobtitleexperience",
"- 'status' → seekstatus",
"- 'salary' → salaryexpectation",
"- 'location' → seeklocation",
"- 'email', 'contact info' → email",
"- 'experience', 'years of experience' → seekworkingyear",
"- 'job titles', 'positions', 'roles' → seekjobtitleexperience",
"- 'status', 'availability' → seekstatus",
"- 'salary', 'pay', 'compensation' → salaryexpectation",
"- 'location', 'where' → seeklocation",
"- 'skills', 'competencies' → skills",
"- 'languages' → languageskills",
"- 'personality', 'MBTI' → mbti",
"- 'new/recent' → dt_create (use prop.dt_create(>=:YYYY-MM-DD))",
"",
"Status value mappings:",
"- 'urgent', 'urgently', 'ASAP', 'quickly' → startasap",
"- 'no rush', 'taking time', 'leisurely' → norush",
"- 'not looking', 'not active' → notlooking",
"",
"Rules:",
"- Object must be 'seekers'.",
"- Use indexes when possible (idx.seekstatus_alias for status queries)",
@@ -603,7 +796,7 @@ async function processResults(results, jqFilter = ".") {
console.log("\n📋 Results Summary:");
const summary = await processResults(
results,
`.[0:3] | map({email, seekworkingyear})`
`.[0:3] | map({email, salaryexpectation, salarydevise, salaryunit})`
);
console.log(JSON.stringify(summary, null, 2));
@@ -617,8 +810,8 @@ async function processResults(results, jqFilter = ".") {
const csvData = await processResults(
results,
`
map([.email // "N/A", .seekworkingyear // "N/A"]) |
["email","experience"] as $header |
map([.email // "N/A", (.salaryexpectation | tostring) // "N/A", .salarydevise // "N/A", .salaryunit // "N/A"]) |
["email","salary","currency","unit"] as $header |
[$header] + .[0:5] |
.[] | @csv
`