Scorched API

API

scorched.connection.grouper(iterable, n)[source]

grouper('ABCDEFG', 3) --> [['ABC'], ['DEF'], ['G']]
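The grouping shown above can be illustrated with a small stand-alone generator. This is only a sketch of fixed-size chunking, not scorched's own source: the name `chunked` is hypothetical, and the exact shape of each group returned by `grouper` may differ from what this helper yields.

```python
import itertools


def chunked(iterable, n):
    """Yield successive lists of at most n items from iterable."""
    iterator = iter(iterable)
    while True:
        # islice consumes up to n items and returns fewer at the end.
        piece = list(itertools.islice(iterator, n))
        if not piece:
            return
        yield piece


# The last group is shorter when the input does not divide evenly.
groups = list(chunked("ABCDEFG", 3))
```

A generator keeps memory use flat for large inputs, because only one group is materialised at a time.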

class scorched.connection.SolrConnection(url, http_connection, mode, retry_timeout, max_length_get_url, search_timeout=())[source]
__init__(url, http_connection, mode, retry_timeout, max_length_get_url, search_timeout=())[source]
Parameters:
  • url (str) – url to Solr
  • http_connection (requests connection) – existing requests.Session object, or None to create a new one.
  • mode (str) – mode of the Solr connection (readable, writable)
  • retry_timeout (int) – timeout until retry
  • max_length_get_url (int) – maximum URL length before the request switches from GET to POST
  • search_timeout (float or tuple) – (optional) How long to wait for the server to send data before giving up, as a float, or a (connect timeout, read timeout) tuple.
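The role of max_length_get_url can be sketched as follows. This is an illustration under assumptions, not the library's implementation: `choose_method` is a hypothetical helper, and whether the real comparison is inclusive of the limit is not stated in this reference.

```python
from urllib.parse import urlencode


def choose_method(base_url, params, max_length_get_url=2048):
    """Return ("GET", full_url) for short requests, else ("POST", base_url)."""
    full_url = base_url + "?" + urlencode(params)
    # Assumption: URLs up to and including the limit are still sent as GET.
    if len(full_url) <= max_length_get_url:
        return "GET", full_url
    # For POST the parameters would travel in the request body instead.
    return "POST", base_url


short = choose_method("http://localhost:8983/solr/select", {"q": "title:solr"})
long_ = choose_method("http://localhost:8983/solr/select", {"q": "x" * 5000})
```

Falling back to POST avoids server or proxy limits on URL length when a query carries many terms.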
get(ids, fl=None)[source]

Perform a RealTime Get

mlt(params, content=None)[source]
Parameters:params (dict) – LuceneQuery converted to a dictionary with search queries
Returns:json – json string

Perform a MoreLikeThis query using the specified content. The content may be omitted if stream.url is specified in the params.

request(*args, **kwargs)[source]
Parameters:
  • args (tuple) – arguments
  • kwargs (dict) – key word arguments
select(params)[source]
Parameters:params (dict) – LuceneQuery converted to a dictionary with search queries
Returns:json – json string

Perform a search on the select handler of Solr.

update(update_doc, **kwargs)[source]
Parameters:update_doc (json data) – data sent to Solr
Returns:json – json string

Send JSON to Solr.

url_for_update(commit=None, commitWithin=None, softCommit=None, optimize=None, waitSearcher=None, expungeDeletes=None, maxSegments=None)[source]
Parameters:
  • commit (bool) – optional – commit actions
  • commitWithin (int) – optional – document will be added within that time
  • softCommit (bool) – optional – performant commit without “on-disk” guarantee
  • optimize – optional – optimize forces all of the index segments to be merged into a single segment first.
  • waitSearcher (bool) – optional – block until a new searcher is opened and registered as the main query searcher
  • expungeDeletes (bool) – optional – merge segments with deletes away
  • maxSegments (int) – optional – optimizes down to at most this number of segments
Returns:

str – url with all extra parameters set

This function sets all extra parameters for the optimize and commit functions.
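How optional parameters end up in an update URL can be sketched like this. The helper `build_update_url` and the base URL are hypothetical, and the code is a stand-in for illustration rather than what `url_for_update` does internally; the only assumption carried over is that parameters left at None are omitted.

```python
from urllib.parse import urlencode


def build_update_url(base_url, **options):
    """Append every option that is not None as a query parameter."""
    params = {}
    for name, value in options.items():
        if value is None:
            continue
        # Solr expects lowercase "true"/"false" for boolean flags.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    if not params:
        return base_url
    return base_url + "?" + urlencode(params)


url = build_update_url(
    "http://localhost:8983/solr/update",
    optimize=True,
    commit=None,  # skipped because it is None
    waitSearcher=False,
    maxSegments=2,
)
```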

class scorched.connection.SolrInterface(url, http_connection=None, mode=u'', retry_timeout=-1, max_length_get_url=2048, search_timeout=())[source]
__init__(url, http_connection=None, mode=u'', retry_timeout=-1, max_length_get_url=2048, search_timeout=())[source]
Parameters:
  • url (str) – url to Solr
  • http_connection (requests connection) – optional – already existing connection
  • mode (str) – optional – mode of the Solr connection (readable, writable)
  • retry_timeout (int) – optional – timeout until retry
  • max_length_get_url (int) – optional – maximum URL length before the request switches from GET to POST
  • search_timeout (float or tuple) – (optional) How long to wait for the server to send data before giving up, as a float, or a (connect timeout, read timeout) tuple.
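The two accepted forms of search_timeout follow the timeout convention of the requests library: a single float applies to both the connect and the read phase, while a tuple sets them separately. A minimal sketch of passing the value through, assuming that the empty-tuple default means "send no timeout at all" (`timeout_kwargs` is a hypothetical helper, not part of scorched):

```python
def timeout_kwargs(search_timeout=()):
    """Translate search_timeout into keyword arguments for a requests call."""
    # Assumption: the default () means no timeout is passed on.
    if search_timeout == ():
        return {}
    return {"timeout": search_timeout}


no_timeout = timeout_kwargs()
single = timeout_kwargs(2.5)           # same limit for connect and read
split = timeout_kwargs((3.05, 27))     # (connect timeout, read timeout)

# With scorched installed, the same values would be given to the constructor:
# SolrInterface("http://localhost:8983/solr/", search_timeout=(3.05, 27))
```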
add(docs, chunk=100, **kwargs)[source]
Parameters:
  • docs (dict) – documents to be added
  • chunk (int) – optional – size of the chunks into which the add command should be split
  • kwargs (dict) – optional – additional arguments
Returns:list of SolrUpdateResponse – A Solr response object.

Add a document or a list of documents to Solr.
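A typical indexing step pairs add with commit. The sketch below runs without a Solr server by using `RecordingInterface`, a hypothetical stand-in that only mimics the documented `add` and `commit` signatures; that one response is returned per chunk is an assumption based on the documented list return value.

```python
def index_documents(si, docs, chunk=100):
    """Add docs in chunks, then commit so they become searchable."""
    responses = si.add(docs, chunk=chunk)
    si.commit()
    return responses


class RecordingInterface:
    """Hypothetical stand-in for SolrInterface that only records calls."""

    def __init__(self):
        self.calls = []

    def add(self, docs, chunk=100, **kwargs):
        self.calls.append(("add", len(docs), chunk))
        # Assumption: one placeholder response per chunk of documents.
        return ["ok"] * ((len(docs) + chunk - 1) // chunk)

    def commit(self, **kwargs):
        self.calls.append(("commit",))


fake = RecordingInterface()
docs = [{"id": str(i), "title": "doc %d" % i} for i in range(250)]
responses = index_documents(fake, docs, chunk=100)
```

With a real server, `fake` would be replaced by an instance of `scorched.connection.SolrInterface`.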

commit(waitSearcher=None, expungeDeletes=None, softCommit=None)[source]
Parameters:
  • waitSearcher (bool) – optional – block until a new searcher is opened and registered as the main query searcher, making the changes visible
  • expungeDeletes (bool) – optional – merge segments with deletes away
  • softCommit (bool) – optional – perform a soft commit - this will refresh the ‘view’ of the index in a more performant manner, but without “on-disk” guarantees.
Returns:

SolrUpdateResponse – A Solr response object.

A commit operation makes index changes visible to new search requests.

delete_all()[source]
Returns:SolrUpdateResponse – A Solr response object.

Delete everything

delete_by_ids(ids, **kwargs)[source]
Parameters:ids (list) – ids of entries that should be deleted
Returns:SolrUpdateResponse – A Solr response object.

Delete entries by the given ids

delete_by_query(query, **kwargs)[source]
Parameters:query (LuceneQuery) – criteria by which entries should be deleted
Returns:SolrUpdateResponse – A Solr response object.

Delete entries by a given query

extract(fh, extractOnly=True, extractFormat=u'text')[source]
Parameters:fh (open file handle) – binary file (PDF, MSWord, ODF, ...)
Returns:SolrExtract

Extract text and metadata from a binary file.

The ExtractingRequestHandler is expected to be registered at the ‘/update/extract’ endpoint in the solrconfig.xml file of the server.

get(ids, fields=None)[source]

RealTime Get document(s) by id(s)

Parameters:
  • ids (list, string or int) – id(s) of the document(s)
  • fields – optional – list of fields to return
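Because ids may be a list, a string or an int, the value has to be normalised before it is sent. A minimal sketch of that step, assuming the request uses Solr's RealTime Get parameters `ids` and `fl`; the helper `realtime_get_params` is hypothetical and not part of scorched.

```python
def realtime_get_params(ids, fields=None):
    """Build the query parameters for a RealTime Get request."""
    # A single id (string or int) is treated as a one-element list.
    if isinstance(ids, (str, int)):
        ids = [ids]
    params = {"ids": ",".join(str(i) for i in ids)}
    if fields:
        # fl restricts the fields returned for each document.
        params["fl"] = ",".join(fields)
    return params


single = realtime_get_params(7)
several = realtime_get_params(["a", "b"], fields=["id", "title"])
```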
mlt_query(fields, content=None, content_charset=None, url=None, query_fields=None, **kwargs)[source]
Parameters:
  • fields (list) – field names to compute similarity upon
  • content (str) – optional – string for which to find similar documents
  • content_charset (str) – optional – charset e.g. (iso-8859-1)
  • url (str) – optional – like content, but retrieved directly from the url
  • query_fields (dict e.g. ({"a": 0.25, "b": 0.75})) – optional – adjust boosting values for fields
Returns:

MltSolrSearch

Perform a similarity query on MoreLikeThisHandler

The MoreLikeThisHandler is expected to be registered at the ‘/mlt’ endpoint in the solrconfig.xml file of the server.

Other MoreLikeThis specific parameters can be passed as kwargs without the ‘mlt.’ prefix.
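The mapping from keyword arguments to MoreLikeThis parameters can be sketched as below. `mlt_params` is a hypothetical helper, not scorched's implementation; it assumes that fields map to `mlt.fl`, that query_fields map to `mlt.qf` in Solr's `field^boost` notation, and that every other keyword simply gains the `mlt.` prefix.

```python
def mlt_params(fields, query_fields=None, **kwargs):
    """Build MoreLikeThis request parameters from Python arguments."""
    params = {"mlt.fl": ",".join(fields)}
    if query_fields:
        # Boosts are written as "field^boost", separated by spaces.
        params["mlt.qf"] = " ".join(
            "%s^%s" % (name, boost) for name, boost in query_fields.items()
        )
    for name, value in kwargs.items():
        # Callers pass mindf=1; Solr receives mlt.mindf=1.
        params["mlt." + name] = value
    return params


params = mlt_params(
    ["title", "body"],
    query_fields={"title": 0.75, "body": 0.25},
    mindf=1,
    mintf=2,
)
```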

Returns:SolrResponse – A Solr response object.

Run a MoreLikeThis search against Solr

optimize(waitSearcher=None, maxSegments=None)[source]
Parameters:
  • waitSearcher (bool) – optional – block until a new searcher is opened and registered as the main query searcher, making the changes visible
  • maxSegments (int) – optional – optimizes down to at most this number of segments
Returns:

SolrUpdateResponse – A Solr response object.

An optimize is like a hard commit except that it forces all of the index segments to be merged into a single segment first.

query(*args, **kwargs)[source]
Returns:SolrSearch – A SolrSearch object.

Build a Solr query

rollback()[source]
Returns:SolrUpdateResponse – A Solr response object.

The rollback command rolls back all adds and deletes made to the index since the last commit
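One way to use rollback is to discard uncommitted changes when an indexing step fails. The sketch below is a usage pattern under assumptions, not a recommendation from the library: `FailingInterface` is a hypothetical stand-in that mimics the documented `add`, `commit` and `rollback` methods so the example runs without a server.

```python
def add_or_rollback(si, docs):
    """Add and commit docs; roll back pending changes if anything fails."""
    try:
        si.add(docs)
        si.commit()
        return True
    except Exception:
        # Broad catch for the sketch; real code would catch narrower errors.
        si.rollback()
        return False


class FailingInterface:
    """Hypothetical stand-in whose add() always fails."""

    def __init__(self):
        self.calls = []

    def add(self, docs, **kwargs):
        self.calls.append("add")
        raise RuntimeError("simulated failure")

    def commit(self, **kwargs):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")


fake = FailingInterface()
ok = add_or_rollback(fake, [{"id": "1"}])
```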

search(**kwargs)[source]
Returns:SolrResponse – A Solr response object.

Search Solr