Ticket #77 (new task)

Opened 3 years ago

Last modified 17 months ago

Have a strategy for committing the Solr index

Reported by: bruno
Owned by: bruno
Priority: major
Milestone: 1.2
Component: Indexer
Version:
Keywords:
Cc: lily-developers@…

Description

Not sure we need one, though; maybe this can be configured on the Solr side too.

There is a 'commitWithin' feature that might be used.
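
As a rough illustration of what using commitWithin could look like: Solr's XML update format accepts a commitWithin attribute (in milliseconds) on the add element. The sketch below only builds such a message as a string; the class name, method names, and the "id"/"title" fields are made up for the example and are not taken from Lily.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CommitWithinMessage {

    /** Escapes the characters that are special in XML text and attribute values. */
    public static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /**
     * Builds a Solr XML add message asking Solr to commit the document
     * within the given number of milliseconds.
     */
    public static String buildAdd(Map<String, String> fields, long commitWithinMs) {
        StringBuilder xml = new StringBuilder();
        xml.append("<add commitWithin=\"").append(commitWithinMs).append("\"><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
               .append(escapeXml(e.getValue()))
               .append("</field>");
        }
        xml.append("</doc></add>");
        return xml.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names, for illustration only.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "record-1");
        fields.put("title", "Hello & welcome");
        System.out.println(buildAdd(fields, 30000));
    }
}
```

With this approach the indexer never issues an explicit commit; it only states how stale each update is allowed to become, and Solr decides when to commit.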

Change History

comment:1 Changed 2 years ago by stevenn

  • Milestone set to 1.1

Based on a customer comment, we need to come up with sensible defaults for the autocommit config of Solr. New users expect autocommit to be enabled, perhaps with a development-cycle-friendly interval (30s, 1 minute?). If users run into performance issues with autocommit, it is quite likely they have grown into a larger-scale production setup and understand the potential penalty of autocommits better.

One can also assume that a batch/MR based (re)indexing operation should always end with an index commit.
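
A minimal sketch of that last assumption, with a hypothetical Index interface standing in for whatever client actually talks to Solr (none of these names come from Lily or SolrJ): the batch job adds every record and then issues exactly one commit at the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchReindex {

    /** Hypothetical stand-in for the component that sends updates to Solr. */
    public interface Index {
        void add(String recordId);
        void commit();
    }

    /**
     * Adds all records, then commits once. If an add throws, the exception
     * propagates and no commit is issued for the partial batch.
     */
    public static int reindex(List<String> recordIds, Index index) {
        int added = 0;
        for (String id : recordIds) {
            index.add(id);
            added++;
        }
        index.commit();
        return added;
    }

    public static void main(String[] args) {
        // Record the calls instead of talking to a real index.
        List<String> calls = new ArrayList<>();
        Index recording = new Index() {
            @Override public void add(String recordId) { calls.add("add " + recordId); }
            @Override public void commit() { calls.add("commit"); }
        };
        reindex(Arrays.asList("r1", "r2"), recording);
        System.out.println(calls);
    }
}
```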

comment:2 follow-up: ↓ 3 Changed 20 months ago by bruno

Would be interesting to be able to specify the commitWithin value in the indexerconf, based on rules, e.g. based on record type, or value of some field, ...
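
To make the idea concrete, here is a sketch of rule-based resolution in plain Java. It is not the indexerconf syntax (which does not exist for this yet); the class, the "Alert" record type, and the "priority" field are all hypothetical. Rules are checked in order, the first match wins, and a default applies otherwise.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

public class CommitWithinRules {

    /** One rule: a condition on (record type, field values) and the commitWithin it yields. */
    private static final class Rule {
        final BiPredicate<String, Map<String, String>> condition;
        final long commitWithinMs;

        Rule(BiPredicate<String, Map<String, String>> condition, long commitWithinMs) {
            this.condition = condition;
            this.commitWithinMs = commitWithinMs;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final long defaultMs;

    public CommitWithinRules(long defaultMs) {
        this.defaultMs = defaultMs;
    }

    /** Appends a rule; earlier rules take precedence over later ones. */
    public CommitWithinRules when(BiPredicate<String, Map<String, String>> condition,
                                  long commitWithinMs) {
        rules.add(new Rule(condition, commitWithinMs));
        return this;
    }

    /** Returns the commitWithin of the first matching rule, or the default. */
    public long resolve(String recordType, Map<String, String> fields) {
        for (Rule rule : rules) {
            if (rule.condition.test(recordType, fields)) {
                return rule.commitWithinMs;
            }
        }
        return defaultMs;
    }

    public static void main(String[] args) {
        CommitWithinRules rules = new CommitWithinRules(60000)
                .when((type, fields) -> "Alert".equals(type), 1000)
                .when((type, fields) -> "high".equals(fields.get("priority")), 5000);

        System.out.println(rules.resolve("Alert", Collections.emptyMap()));
        System.out.println(rules.resolve("Document", Collections.singletonMap("priority", "high")));
        System.out.println(rules.resolve("Document", Collections.emptyMap()));
    }
}
```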

comment:3 in reply to: ↑ 2 ; follow-up: ↓ 4 Changed 20 months ago by slim tebourbi

Replying to bruno:

Would be interesting to be able to specify the commitWithin value in the indexerconf, based on rules, e.g. based on record type, or value of some field, ...

From a Lily user's point of view, all 'write' operations are done by Lily, so all the related configuration and settings would be better done via Lily, without losing the ability to do it in Solr, of course. For example, Solr's commitWithin functionality could be exposed via the Lily Repository record write/update/delete API. Take a look at the CQRS (Command Query Responsibility Segregation) pattern and probably the Event Sourcing one. I think these kinds of approaches could have many benefits for Lily API consistency.

comment:4 in reply to: ↑ 3 Changed 20 months ago by bruno

Replying to slim tebourbi:

Replying to bruno:

Would be interesting to be able to specify the commitWithin value in the indexerconf, based on rules, e.g. based on record type, or value of some field, ...

From a Lily user's point of view, all 'write' operations are done by Lily, so all the related configuration and settings would be better done via Lily, without losing the ability to do it in Solr, of course. For example, Solr's commitWithin functionality could be exposed via the Lily Repository record write/update/delete API.

Interesting, so this would basically allow the user/client doing the update to dictate the commitWithin. While this does add another degree of flexibility, I don't see it as better than having the commitWithin defined by server-side rules (= defining it in the indexerconf, or whatever), but just as another option. Server-side rules allow for central control, thus avoiding abuse by individual clients. Different indexes might also require different commit rules, and there could be different rules for the updating of denormalized data. Expressing all that as an extra argument to create/update/delete calls would become complex.

For most flexibility, we could have a mix of both solutions.
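
One way such a mix could work, sketched under my own assumptions (nothing here is an existing Lily API): the server-side rule supplies the value when the client says nothing, and a client-supplied value is honoured but never allowed below a server-defined floor, which keeps the central control against abuse.

```java
import java.util.OptionalLong;

public class CommitWithinPolicy {

    /**
     * Combines a server-side rule with an optional client request.
     *
     * @param serverRuleMs      value from the server-side rules, used when the client gives none
     * @param serverFloorMs     smallest commitWithin any client may obtain
     * @param clientRequestedMs value passed by the client on the write call, if any
     */
    public static long effective(long serverRuleMs, long serverFloorMs,
                                 OptionalLong clientRequestedMs) {
        if (!clientRequestedMs.isPresent()) {
            return serverRuleMs;
        }
        // A client may ask for a slower or faster commit, but not faster than the floor.
        return Math.max(clientRequestedMs.getAsLong(), serverFloorMs);
    }

    public static void main(String[] args) {
        System.out.println(effective(30000, 1000, OptionalLong.empty()));
        System.out.println(effective(30000, 1000, OptionalLong.of(5000)));
        System.out.println(effective(30000, 1000, OptionalLong.of(10)));
    }
}
```

Different indexes could each carry their own rule value and floor, so the extra argument on create/update/delete stays a single optional number.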

Take a look at the CQRS (Command Query Responsibility Segregation) pattern and probably the Event Sourcing one. I think these kinds of approaches could have many benefits for Lily API consistency.

I will have to read up on those, but I don't immediately see the relationship to the commitWithin topic?

comment:5 Changed 17 months ago by bruno

  • Milestone changed from 1.1 to 1.2