Syncing Elasticsearch on connection - Node.js

Question!

Aim: keep Elasticsearch in sync with a Postgres database
Why: sometimes the network or the cluster/server breaks, so updates made in the meantime should be recorded and replayed

This article https://qafoo.com/blog/086_how_to_synchronize_a_database_with_elastic_search.html suggests that I should create a separate updates table that tracks the last id synced to Elasticsearch, making it possible to select the data added to the database since the last record known to Elasticsearch. So I thought: what if I recorded Elasticsearch's failed and successful connections? If the client pongs back successfully (the returned promise resolves), I could launch a function to sync the records with my database.
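The core of that approach can be sketched with plain in-memory data (the `updates` rows, column names, and the `pendingUpdates` helper below are illustrative assumptions; in practice the rows would come from a Postgres `updates` table and be applied through Elasticsearch's bulk API):

```javascript
// Hypothetical `updates` table: every change to the source table
// appends a row with a monotonically increasing id.
const updates = [
  { id: 1, recordId: 10, op: 'insert' },
  { id: 2, recordId: 11, op: 'insert' },
  { id: 3, recordId: 10, op: 'delete' },
];

// Elasticsearch keeps the id of the last update it applied; after a
// reconnect, only the rows strictly after that marker need replaying.
function pendingUpdates(rows, lastSyncedId) {
  return rows.filter(row => row.id > lastSyncedId);
}

console.log(pendingUpdates(updates, 1)); // rows 2 and 3 still need syncing
```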

Here's my elasticConnect.js

import elasticsearch from 'elasticsearch'
import syncProcess from './sync'

const client = new elasticsearch.Client({
  host: 'localhost:9200',
  log: 'trace'
});

client.ping({
  requestTimeout: Infinity,
  hello: "elasticsearch!"
})
.then(() => syncProcess()) // successful connection
.catch(err => console.error(err))

export default client

This way, I don't even need to worry about running a cron job (if question 1 is correct), since I know the cluster is running.

Questions

  1. Will syncProcess run before export default client? I don't want any requests coming in while syncing...

  2. syncProcess should run only once (since it's cached/not exported), no matter how many times I import elasticConnect.js. Correct?

  3. Are there any advantages to using the method with the updates table, instead of just selecting data from the parent/source table?

  4. The article's comments say "don't use timestamps to compare new data!". Ehhh... why? It should be OK since the database is blocking, right?

By : Antartica


Answers

For 1: As it is, you have no guarantee that syncProcess will have run by the time the client is exported. Instead you should do something like in this answer and export a promise instead.
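A minimal sketch of that pattern, with the elasticsearch client stubbed by a `fakeClient` object so the snippet is self-contained (in the real module you would build the chain on `client.ping(...)` and export the resulting promise):

```javascript
// Stand-in for `new elasticsearch.Client(...)`; ping resolves the way a
// successful real ping would.
const fakeClient = {
  ping: () => Promise.resolve('pong'),
};

// Stand-in for the real database -> Elasticsearch sync step.
function syncProcess() {
  return Promise.resolve('synced');
}

// Export this promise instead of the bare client: it resolves to the
// client only after the ping succeeded AND the sync has finished.
const clientReady = fakeClient
  .ping()
  .then(() => syncProcess()) // note the call: `() => syncProcess` alone is a no-op
  .then(() => fakeClient);

// Consumers then do: clientReady.then(client => client.search(...))
```

With `export default clientReady` in place of `export default client`, every importer awaits the same single promise, which also covers question 2.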

For 2: With the solution I linked to in the above question, this would be taken care of.

For 3: An updates table would also catch record deletions, while simply selecting from the DB would not, since you don't know which records have disappeared.
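To illustrate (the id sets are made up): with only the source table, detecting a deletion means diffing the full set of ids in Postgres against the full set of ids in the index, which gets expensive as the tables grow, whereas an updates table records the deletion as an ordinary row.

```javascript
// Ids currently present in the Postgres source table vs. in the index.
const sourceIds  = new Set([10, 12]);
const indexedIds = new Set([10, 11, 12]);

// Without an updates table, finding deletions requires a full diff:
const deleted = [...indexedIds].filter(id => !sourceIds.has(id));
console.log(deleted); // record 11 must be removed from the index
```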

For 4: The second comment after the article you linked to provides the answer (hint: timestamps are not strictly monotonic).
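A contrived illustration of the problem (timestamps here are plain numbers): two rows can commit with the same timestamp, so a `WHERE updated_at > :lastSeen` style query can skip one of them, while an auto-incrementing update id cannot collide.

```javascript
// Row 1 commits, the sync runs and records lastSeen = 1000,
// then row 2 commits within the same clock tick.
const rows = [
  { id: 1, updatedAt: 1000 },
  { id: 2, updatedAt: 1000 }, // same millisecond as row 1
];
const lastSeen = 1000;

// The next sync asks only for strictly newer timestamps...
const missed = rows.filter(r => r.updatedAt > lastSeen);
console.log(missed.length); // 0 -- row 2 is silently never indexed
```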

By : Val


I just started mine last week following this method: https://youtu.be/Jh0er2pRcq8?t=1h12m26s Then you can simply look at Mongoose schemas; they allow you to create relations in Mongo :)

Example of a schema:

var companySchema = new mongoose.Schema({
    name: String,
    npa: String,
    city: String,
    country: String,
    tags: [{
        type: mongoose.Schema.Types.ObjectId,
        ref: 'Tag'  // references tagSchema
    }
    ]
})


var tagSchema = new mongoose.Schema({
    label: String,
    use: Number
})


This video can help you solve your question :)
By : admin