Koha 21.05 – Error 500 with Elasticsearch 6.8.20

The importance of QueryRegexEscapeOptions

Today I upgraded Koha to version 21.05, but not without a curious indexing error. At the same time I upgraded Koha, I also upgraded Elasticsearch to the latest 6.x version, 6.8.20.

The Problem

After running apt-get update && apt-get upgrade on Ubuntu 18.04, which upgraded Elasticsearch and Koha together, I noticed that searching for particular records in both the OPAC and the staff interface returned an Error 500 instead of the bibliographic record details you would expect from the opac-details page. This only affected a subset of my bibliographic records, of which there are 327.

Example

The finer points of sausage dogs by Alexander McCall Smith
OPAC: After upgrading to Koha 21.05 + ES 6.8.20
Staff Client: After upgrading to Koha 21.05 + ES 6.8.20

The problem seems to be related to the AACR2r punctuation in field 245 subfield $b, or more precisely to the trailing “/” not being escaped when it ends up in a search query.
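
Some background on why a lone slash matters: in the query string syntax that Elasticsearch inherits from Lucene, a forward slash opens a regular expression, so an unescaped trailing “/” leaves the parser waiting for a closing slash. Here is a minimal sketch of what escaping means. The helper name escape_slashes is made up for illustration, and this is not Koha's actual code, which would also need to handle quoted phrases and slashes that are already escaped.

```shell
#!/bin/sh
# Hypothetical helper: put a backslash in front of every "/" so the
# query_string parser reads it as a literal character rather than as the
# start of a regular expression.
# Simplified: it does not skip slashes that are already escaped or quoted.
escape_slashes() {
    printf '%s\n' "$1" | sed 's,/,\\/,g'
}

escape_slashes 'The finer points of sausage dogs /'
# prints: The finer points of sausage dogs \/
```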

I checked my log files by running

tail -f /var/log/koha/$INSTANCE/*.log

and I noticed the following:

==> /var/log/koha/library2/plack-opac-error.log <==
[2021/10/31 13:09:28] [WARN] [Request] ** [http://localhost:9200]-[400] [query_shard_exception] Failed to parse query [(host-item:(The* finer* points* of* sausage* dogs* /))], with: {"index":"koha_library2_biblios","index_uuid":"eSiLbWAnRqO0CR7sIjRfDw"}, called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__ at /usr/share/koha/lib/Koha/SearchEngine/Elasticsearch/Search.pm line 96. With vars: {'body' => {'error' => {'type' => 'search_phase_execution_exception','grouped' => bless( do{\(my $o = 1)}, 'JSON::PP::Boolean' ),'failed_shards' => [{'index' => 'koha_library2_biblios','node' => 'ZCj2zlZpSueOWf0foNWidQ','reason' => {'reason' => 'Failed to parse query [(host-item:(The* finer* points* of* sausage* dogs* /))]','index' => 'koha_library2_biblios','type' => 'query_shard_exception','caused_by' => {'caused_by' => {'reason' => 'Lexical error at line 1, column 55.  Encountered: <EOF> after : "/))"','type' => 'token_mgr_error'},'type' => 'parse_exception','reason' => 'Cannot parse \'(host-item:(The* finer* points* of* sausage* dogs* /))\': Lexical error at line 1, column 55.  Encountered: <EOF> after : "/))"'},'index_uuid' => 'eSiLbWAnRqO0CR7sIjRfDw'},'shard' => 0}],'root_cause' => [{'reason' => 'Failed to parse query [(host-item:(The* finer* points* of* sausage* dogs* /))]','type' => 'query_shard_exception','index_uuid' => 'eSiLbWAnRqO0CR7sIjRfDw','index' => 'koha_library2_biblios'}],'reason' => 'all shards failed','phase' => 'query'},'status' => 400},'status_code' => 400,'request' => {'mime_type' => 'application/json','path' => '/koha_library2_biblios/_search','qs' => {},'serialize' => 'std','ignore' => [],'method' => 'GET','body' => {'size' => 0,'aggregations' => {'author' => {'terms' => {'size' => '20','field' => 'author__facet'}},'location' => {'terms' => {'size' => '20','field' => 'location__facet'}},'holdingbranch' => {'terms' => {'field' => 'holdingbranch__facet','size' => '20'}},'ccode' => {'terms' => {'size' => '20','field' => 'ccode__facet'}},'title-series' => {'terms' => {'field' => 'title-series__facet','size' => '20'}},'itype' => {'terms' => {'size' => '20','field' => 'itype__facet'}},'subject' => {'terms' => {'field' => 'subject__facet','size' => '20'}},'su-geo' => {'terms' => {'size' => '20','field' => 'su-geo__facet'}},'ln' => {'terms' => {'size' => '20','field' => 'ln__facet'}}},'from' => 0,'query' => {'query_string' => {'default_operator' => 'AND','fields' => ['author','subject','title-later','ln-audio','title-expanded','number-natl-biblio','date-time-last-modified','notforloan','damaged','udc-classification','datelastseen','curriculum','control-number','identifier-publisher-for-music','related-periodical','lexile-number','holdingbranch','material-type','host-item','subject-name-personal','ff7-01-02','ccode','language-original','note','thematic-number','totalissues','title','music-key','nlm-call-number','editor','date-of-publication','cn-prefix','code-geographic','host-item-number','title-former','rtype','issues','date-of-acquisition','materials-specified','cn-suffix','copydate','cn-bib-source','personal-name','ln-subtitle','llength','not-onloan-count','ff8-23','location','ff7-00','koha-auth-number','title-key','stack','microform-generation','lf','geographic-class','number-govt-pub','publisher','number-legal-deposit','bgf-number','author-title','local-classification','stock-number','cn-class','name-geographic','conference-name','record-source','lc-card-number','author-name-corporate','dewey-classification','local-number','replacementprice','number-db','interest-age-level','renewals','itemnumber','record-control-number','interest-grade-level','identifier-other','ctype','cross-reference','replacementpricedate','dissertation-information','title-abbreviated','cn-sort','title-series','other-control-number','name','issn','price','code-institution','biblioitemnumber','abstract','classification-source','arl','su-geo','index-term-genre','provider','withdrawn','bio','extent','coded-location-qualifier','isbn','reserves','ta','index-term-uncontrolled','author-in-order','title-cover','date-entered-on-file','title-collective','corporate-name','uri','name-and-title','nal-call-number','report-number','identifier-standard','reading-grade-level','itemtype','author-name-personal','ff7-01','homebranch','pl','coden','ff8-29','indexed-by','lc-call-number','copynumber','bib-level','acqsource','ff7-02','map-scale','bnb-card-number','lost','barcode','restricted','datelastborrowed','number-local-acquisition','ln','cn-item','author-personal-bibliography','title-uniform','cn-bib-sort','arp','title-other-variant','itype'],'fuzziness' => 'auto','lenient' => bless( do{\(my $o = 1)}, 'JSON::PP::Boolean' ),'type' => 'cross_fields','analyze_wildcard' => $VAR1->{'request'}{'body'}{'query'}{'query_string'}{'lenient'},'query' => '(host-item:(The* finer* points* of* sausage* dogs* /))'}}}}}
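
A log entry this long is hard to read by eye. As a rough sketch, the rejected query can be pulled out with grep. The function name failed_queries is made up for illustration, and the pattern assumes the "Failed to parse query [...]" wording seen in the entry above.

```shell
#!/bin/sh
# Hypothetical helper: list each distinct query that Elasticsearch refused to
# parse, reading log text on stdin. Relies on grep -o (print only the match),
# which GNU grep on Ubuntu supports.
failed_queries() {
    grep -o 'Failed to parse query \[[^]]*]' | sort -u
}

# Usage, assuming the log path from the excerpt above:
# failed_queries < /var/log/koha/library2/plack-opac-error.log
```

Because the same message is repeated several times inside one entry, sort -u collapses the repeats so each failing query shows up once.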

I then searched Google and came across an October 2021 thread: https://lists.katipo.co.nz/public/koha/2021-October/056849.html. It mentioned that Elasticsearch sometimes throws an Error 500 when certain characters are present in bib records, because they are not being escaped.
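
One way to check whether escaping really is the issue is to send the query to Elasticsearch directly, once as logged and once with the slash escaped. The following is a hedged sketch, not something I ran as part of the fix. It assumes Elasticsearch listens on localhost:9200 and that the index is koha_library2_biblios (both taken from the log above), and the function names build_body and run_query are illustrative.

```shell
#!/bin/sh
# Build a minimal query_string request body for the given query text.
# Backslashes and double quotes are escaped so the result is valid JSON.
build_body() {
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '{"size":0,"query":{"query_string":{"query":"%s"}}}\n' "$escaped"
}

# Send a query to the biblio index and print the raw JSON response.
# Host and index name are assumptions copied from the log excerpt.
run_query() {
    build_body "$1" | curl -s -H 'Content-Type: application/json' \
        -X GET 'http://localhost:9200/koha_library2_biblios/_search' -d @-
}

# Unescaped slash, as in the log: expected to be rejected with a parse error.
# run_query 'host-item:(The* finer* points* of* sausage* dogs* /)'
# Escaped slash: expected to parse.
# run_query 'host-item:(The* finer* points* of* sausage* dogs* \/)'
```

The two calls are left commented out so that nothing is sent until you have adjusted the host and index name for your own instance.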

The Fix

I tried reindexing my bibs and authorities in Elasticsearch. No effect.

I tried restarting memcached, plack, koha, apache. No effect.

I tried switching to Zebra. This removed the Error 500s and confirmed that the problem was with Elasticsearch, but it did not fix the underlying issue.

What did work was the following:

There is a system preference in Koha called QueryRegexEscapeOptions, which looks like this:

QueryRegexEscapeOptions system preference in Koha 21.05

I initially had my value set to “Unescape escaped”. I changed this to “Escape” and the error 500s disappeared!

Craig Butosi
Library professional of ten years, with six years of library management and administration experience. Graduate of critical media studies and music. Autodidact and lover of all things that inch us closer to the good life.