Ticket #89 (new task)
Opened 3 years ago
Blob extracted content cache
Reported by: bruno | Owned by: bruno
When a record with blob fields needs to be reindexed, the content of those blobs has to be re-extracted (via Tika) each time, even when the blobs themselves are unchanged.
To avoid re-extracting the content from blobs when only non-blob fields have changed, we could keep a 'blob extracted content cache' (e.g. in the form of an HBase table).
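A minimal sketch of the idea, not a proposed implementation. It assumes the cache is keyed by a hash of the blob bytes and uses an in-memory dict as a stand-in for the HBase table; the class `BlobExtractedContentCache`, the `key_for` helper, and `fake_extract` (a placeholder for the Tika extraction step) are all hypothetical names.

```python
import hashlib
from typing import Callable, Dict


class BlobExtractedContentCache:
    """In-memory stand-in for the proposed cache (the ticket suggests an HBase table)."""

    def __init__(self, extract: Callable[[bytes], str]) -> None:
        self._extract = extract            # the expensive extraction step (Tika in the ticket)
        self._store: Dict[str, str] = {}   # cache key -> extracted text
        self.extract_calls = 0             # counts how often extraction actually ran

    @staticmethod
    def key_for(blob: bytes) -> str:
        # Assumption: key on a content hash, so identical blobs share one entry.
        return hashlib.sha256(blob).hexdigest()

    def get_text(self, blob: bytes) -> str:
        key = self.key_for(blob)
        if key not in self._store:
            # Cache miss: run the extraction once and remember the result.
            self.extract_calls += 1
            self._store[key] = self._extract(blob)
        return self._store[key]


def fake_extract(blob: bytes) -> str:
    # Placeholder for real content extraction; it only decodes and trims the bytes.
    return blob.decode("utf-8", errors="replace").strip()


cache = BlobExtractedContentCache(fake_extract)
blob = b"  hello world  "
first = cache.get_text(blob)    # extraction runs
second = cache.get_text(blob)   # served from the cache, no second extraction
```

Hashing requires reading the blob bytes, so an actual implementation might instead key on whatever identifier the blob store already assigns to each blob; that choice is left open here.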