Feed API 2.0

From NUBWiki

Introduction

The Feed API v2.0 is a non-backwards-compatible replacement for the existing v1.0 API. Perhaps the biggest change between the two versions is the underlying data model. In v1.0 there were three distinct constructs: productions, activities and locations. In v2.0, these have been merged into a single event construct. The advantage of this approach is that it is now possible to support a large set of search criteria in a single API.

The other large change between versions is the introduction of #Search Facets, which exploit the categorisation of Events to provide a powerful way for users to refine their searches.

This page introduces these concepts in more depth and describes the API in full.

Events

The Feed API v2.0 is built around the concept of events. These are designed to closely match what we intuitively understand an event to be: something happening somewhere at some time. While in v1.0 these three concepts were separated into productions, locations and activities respectively, v2.0 brings them together into a single object, the event. Everything that is known about an event, including when it is occurring and the location it is occurring at, is part of the event object. For those who are used to normalized data or relational databases, this can seem a little counter-intuitive. If multiple events are occurring at the same location, then the location's details will be listed multiple times. Although this means the data is duplicated, the power of this de-normalization is that it is now possible to find events based on criteria related to what is happening, when it is happening, and where it is happening.

Nub XML

All responses from the Feed API v2.0 use an XML format known as Nub XML. The XSD can be found here. The following is a summary of the format:

 <nubxml>
   <suggestion>...</suggestion>
   <events start=... rows=... numfound=...>
     <event eventcidn=... productioncidn=... datecreated=... datechanged=...>
       ...
     </event>
   </events>
   <facets>
      <facet name=... param=...>
         ...
      </facet>
   </facets>
 </nubxml>

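Contained in the XML are:

  • <suggestion>: a spelling suggestion for the text parameter (see #Spelling Suggestions)
  • <events>: the events in the current page of search results, carrying the start, rows and numfound attributes
  • <facets>: the search facets for the search results (see #Search Facets)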

Searching for Events

Searching for events is done through HTTP GET requests to a REST interface. The following is a description of the interface.

Requests

Base URL: http://pscluster.uitburo.nl:8080/agenda/search.do

Parameters:

Parameter Name | Explanation | Example
key | All requests must include the access key for your account. This key is used to retrieve your account and apply any account-specific filtering during searches. | key=39a8629319cc746839f929a1444b2598
locale | The Feed API v2.0 supports searching in multiple locales. This parameter defines both the locale for searching and the locale for the results. Note, the search is applied against the index for the locale, so only those events which have a translation for the specified locale will be considered. See #Locales for more information. | locale=fy_NL
start | Defines the starting index for the current page of search results. A search can potentially produce thousands of results. Returning them all in one request would be very inefficient, therefore the Feed API requires users to 'page' through results. The start parameter, in combination with the rows parameter defined below, allows control over the page size and which page to retrieve. Note, the start value is an offset from 0, rather than a page number. | Assuming the default rows of 10, to retrieve the second page of results: start=10
rows | Defines the number of results to return in the current request. This can be considered the page size, and for most users it makes sense to use the same number for each request. Values in excess of 1000 will take some time to respond. | To define a page size of 12: rows=12
text | Text to query events for. This is perhaps the most important parameter, as it is used to find events and will influence their relevancy score. The value is used to query a number of different fields in events. Case is ignored. | To find Shakespearean events: text=shakespeare
locationText | Text used to find events that are occurring in a certain city. Although this is a free text search like the text parameter, it does not influence the relevancy score of results. Case is ignored. | To find events in Enschede: locationText=enschede
location | Exact name of a location that events must be occurring at. Here, location refers to the name of the location, such as Centrale Bibliotheek. While it is possible to include the location in the text parameter, the text parameter is a free text search. This parameter is a filter, meaning it must match the location title of the event exactly and has no influence on relevancy scoring. The source of these values is most likely the location facet described below. Case is important. | To find events occurring at the Centrale Bibliotheek: location=Centrale Bibliotheek
headGenre | Filters the events to only those that have the head genre. Note this is a case-sensitive exact match. Can be specified multiple times to select events which have head_genre_1 OR head_genre_2 OR ... head_genre_n. | To find events with the head genres Klassieke Muziek or Film: headGenre=Klassieke Muziek&headGenre=Film
subGenre | Filters the events to only those that have the sub genre. Note this is a case-sensitive exact match. Can be specified multiple times to select events which have sub_genre_1 OR sub_genre_2 OR ... sub_genre_n. | To find events with the sub genres Kamermuziek or SpeelFilm: subGenre=Kamermuziek&subGenre=SpeelFilm
period | Named period of time during which events must start. See #Periods for a list of the named periods and what they resolve to. | To find all events today: period=today
periodStart | Defines the start of the period during which events must start. See #Dates & Times for the datetime format. | To find all events starting after April 15 2010: periodStart=2010-04-15T00:00:00Z
periodEnd | Defines the end of the period during which events must start. See #Dates & Times for the datetime format. | To find all events starting before September 15 2010: periodEnd=2010-09-15T00:00:00Z
gratis | Filters events on whether they are marked as gratis (free) or not. Possible values are 1 or 0. When omitted, no filtering is applied. | To find gratis events: gratis=1
regionId | Defines which region events must be occurring in (or be the responsibility of). See #Regions for the list of region ids. When omitted, no region filtering is applied. | To find events in Amsterdam: regionId=1
lmts | Filters events on whether they have Last Minute Ticket Shop (LMTS) prices. Possible values are 1 or 0. When omitted, no filtering is applied. | To find LMTS events: lmts=1
changedSince | Defines that events must have changed after the given date. See #Dates & Times for the date format. | To find all events that have changed after April 15 2010: changedSince=2010-04-15T00:00:00Z
hasMedia | Filters events on whether they have media associated with them. Note, this is only media related to the event itself, not the location where the event is occurring. Possible values are 1 or 0. When omitted, no filtering is applied. | To find events with media: hasMedia=1
sort | Defines the kind of sorting that should be applied to the search results. By default, search results are sorted by their relevancy score descending, meaning the most relevant results are first. Possible values are: startdatetime (sorts by the start datetime of events); title (sorts by the title of the events, lexicographically); any other value, shown here as XYZ (sorts the events randomly; use the same value on every request to keep the same random order as you page through the results). | To sort events by their title: sort=title
direction | Defines the direction of the sort specified by the sort parameter. Two possible values, "asc" for ascending and "desc" for descending. When omitted, descending sorting is applied. | To sort events in a descending order: direction=desc
eventcidn | Chooses an event with a specific event cidn. Can be specified multiple times to select multiple events. The maximum number of values that can be defined is 1024. | To choose the event with CIDN 2009-A-001-123456: eventcidn=2009-A-001-123456
productioncidn | Chooses events with a specific production cidn. Can be specified multiple times to select events with different production cidns. The maximum number of values that can be defined is 1024. | To choose events with the production CIDN 2009-P-001-123456: productioncidn=2009-P-001-123456
locationcidn | Chooses events with a specific location cidn. Can be specified multiple times to select events with different location cidns. The maximum number of values that can be defined is 1024. | To choose events with the location CIDN 2009-L-001-123456: locationcidn=2009-L-001-123456

Note, events which have completed in the past cannot be requested and will not be included in the search results. Such requests will not result in an error, just an empty result list.
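
As a rough illustration of how a request might be assembled from the parameters above, the following Python sketch builds a search URL with the standard library. The helper name build_search_url is hypothetical and not part of the Feed API; the sketch also assumes that values should be percent-encoded (spaces as %20), as in the examples on this page. Parameters that may be specified multiple times, such as headGenre, are passed as a list.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://pscluster.uitburo.nl:8080/agenda/search.do"


def build_search_url(key, **params):
    """Build a search URL; list or tuple values become repeated parameters.

    Hypothetical helper for illustration only, not part of the Feed API.
    """
    query = [("key", key)]
    for name, value in params.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        query.extend((name, item) for item in values)
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return BASE_URL + "?" + urlencode(query, quote_via=quote)


url = build_search_url(
    "39a8629319cc746839f929a1444b2598",
    headGenre=["Klassieke Muziek", "Film"],
    rows=12,
)
print(url)
```

The access key is always placed first, and the remaining parameters keep the order in which they were given.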

Examples

The following are examples of common searches:

  • Find all events:

http://pscluster.uitburo.nl:8080/agenda/search.do?key=39a8629319cc746839f929a1444b2598

  • Find all events involving Hamlet:

http://pscluster.uitburo.nl:8080/agenda/search.do?key=39a8629319cc746839f929a1444b2598&text=hamlet

  • Find all Dans events:

http://pscluster.uitburo.nl:8080/agenda/search.do?key=39a8629319cc746839f929a1444b2598&headGenre=Dans

  • Find all Dans events happening at Het Muziektheater Amsterdam:

http://pscluster.uitburo.nl:8080/agenda/search.do?key=39a8629319cc746839f929a1444b2598&headGenre=Dans&location=Het%20Muziektheater%20Amsterdam

  • Find all events sorted by startdatetime:

http://pscluster.uitburo.nl:8080/agenda/search.do?key=39a8629319cc746839f929a1444b2598&sort=startdatetime&direction=asc

  • Find all events in Amsterdam Uitburo region:

http://pscluster.uitburo.nl:8080/agenda/search.do?key=39a8629319cc746839f929a1444b2598&regionId=1
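
Paging with the start and rows parameters can be sketched as follows. This is a minimal Python example which assumes that the numfound attribute of the <events> element holds the total number of matching events; the function name page_offsets is hypothetical.

```python
def page_offsets(numfound, rows=10):
    """Return the start offset of every page needed to cover numfound results.

    Assumes numfound is the total number of matches reported by the API.
    """
    if rows <= 0:
        raise ValueError("rows must be positive")
    return list(range(0, numfound, rows))


# 35 results with a page size of 12 need three requests.
offsets = page_offsets(35, rows=12)
print(offsets)  # [0, 12, 24]
```

Each offset is then sent as the start value of one request, with the same rows value every time.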

Locales

One major difference between the Feed API 2.0 and 1.0 is the support for multiple locales. Through the locale parameter it is possible to search for events in a specific locale. For simplicity, searches are only done on events which have a translation in the specified locale. Therefore although an event might be found when searching in nl_NL, it might not be found when searching in fy_NL. The following table defines the locales and their parameter values:

Locale | Locale Parameter Value
Nederlands | nl_NL
Fries | fy_NL
English (coming soon) | en_US
French (coming soon) | fr_FR
German (coming soon) | de_DE

Regions

Each event in the Feed API is assigned a specific region id, which connects an event to its owning Uitburo. The following table defines the regions with their region ids:

Region Name | Region ID
Netherlands (whole country) | 0
Amsterdam | 1
Rotterdam | 2
Den Haag | 3
Groningen | 4
Enschede | 5
Leiden | 6
Maastricht | 7
Limburg | 8
Utrecht | 9
Nijmegen | 10
Noord-Holland | 11
Apeldoorn | 12
Friesland | 13
Arnhem | 14
Drenthe | 15
Flevoland | 16
Gelderland | 17
Noord-Brabant | 18
Overijssel | 19
Zuid-Holland | 21
Zwolle | 22

Dates & Times

The Feed API v2.0 uses UTC for all its datetimes, in the following format: yyyy-MM-dd'T'HH:mm:ss'Z'. See here for a description of the format.

The only exception to this is the <starttime/> element, which is stored as text and is in CET.

Some events are entered into the system without a time in their startdatetime. Such events are given the start time 06:03 CET (05:03 UTC). Equally, some events do not have enddatetimes. These events are given the startdatetime as their enddatetime.
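
The datetime format above can be produced and parsed with Python's standard library, as in the following sketch. The function names are hypothetical, and the fixed one-hour CET offset used in the example is an assumption that ignores summer time.

```python
from datetime import datetime, timedelta, timezone

# strftime/strptime equivalent of yyyy-MM-dd'T'HH:mm:ss'Z'
FEED_DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_feed_datetime(moment):
    """Format a timezone-aware datetime as a UTC datetime string."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime(FEED_DATETIME_FORMAT)


def from_feed_datetime(text):
    """Parse a datetime string in the format above into an aware UTC datetime."""
    parsed = datetime.strptime(text, FEED_DATETIME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


stamp = to_feed_datetime(datetime(2010, 9, 15, tzinfo=timezone.utc))
print(stamp)

# The placeholder start time 06:03 CET corresponds to 05:03 UTC.
cet = timezone(timedelta(hours=1))  # assumption: fixed offset, no summer time
placeholder = to_feed_datetime(datetime(2010, 1, 15, 6, 3, tzinfo=cet))
print(placeholder)
```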

Periods

Periods are simple shorthand values for common date periods. The following table defines the current periods with their full date ranges:

Period | Date Range
today | 5am Today - 5am Tomorrow
tomorrow | 5am Tomorrow - 5am the following day
thisWeek | 5am Today - 5am 7 days from Today
thisMonth | 5am Today - 5am 31 days from Today
todayEvening | 4pm Today - 5am Tomorrow
tomorrowEvening | 4pm Tomorrow - 5am the following day

Periods can be used in search requests through the period parameter.
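
To make the ranges concrete, the following Python sketch computes the start and end of a named period for a given calendar day, reading the offsets directly from the table above. It is an illustration only: the function name is hypothetical, the datetimes are naive because the table does not state a time zone, and the server may resolve periods differently (for example for requests made before 5am).

```python
from datetime import date, datetime, time, timedelta

# (start day offset, start hour, end day offset, end hour), taken from the table.
PERIODS = {
    "today": (0, 5, 1, 5),
    "tomorrow": (1, 5, 2, 5),
    "thisWeek": (0, 5, 7, 5),
    "thisMonth": (0, 5, 31, 5),
    "todayEvening": (0, 16, 1, 5),
    "tomorrowEvening": (1, 16, 2, 5),
}


def period_range(name, today):
    """Return the (start, end) datetimes of a named period relative to today."""
    start_days, start_hour, end_days, end_hour = PERIODS[name]
    start = datetime.combine(today + timedelta(days=start_days), time(start_hour))
    end = datetime.combine(today + timedelta(days=end_days), time(end_hour))
    return start, end


start, end = period_range("todayEvening", date(2010, 4, 15))
print(start, end)
```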

Relevancy

At the heart of the Feed API 2.0 is the concept of search result relevancy. In a traditional relational database system, the order in which results are returned to the user is less important. Free text search systems such as the Feed API 2.0, by contrast, are driven by the concept of a relevancy score. When a search is executed, a relevancy score is computed for each search result. The score is computed using Apache Lucene's scoring algorithm, which is described here.

Part of the scoring algorithm is field specific boosting. This allows matches on specific fields in events to be considered more important than matches on other fields. The order of importance defined in the Feed API 2.0 is as follows:

  1. title
  2. location name
  3. gezelschappen
  4. medewerker name
  5. head genre
  6. sub genre
  7. summary, short description, description

Matches on several of these fields are combined when computing the final score. Therefore a match on both title and head genre will score higher than a match on title alone.
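
The effect of field boosting can be illustrated with a deliberately simplified toy model in Python. This is not Lucene's scoring formula: the score here is just the sum of the boosts of the matched fields, and the boost values are hypothetical. Only their ordering is taken from the list above.

```python
# Hypothetical boost values; only their relative order comes from the list above.
FIELD_BOOSTS = {
    "title": 7.0,
    "location name": 6.0,
    "gezelschappen": 5.0,
    "medewerker name": 4.0,
    "head genre": 3.0,
    "sub genre": 2.0,
    "summary": 1.0,
    "short description": 1.0,
    "description": 1.0,
}


def toy_score(matched_fields):
    """Sum the boosts of the distinct fields that matched (toy model only)."""
    return sum(FIELD_BOOSTS[field] for field in set(matched_fields))


title_only = toy_score(["title"])
title_and_genre = toy_score(["title", "head genre"])
print(title_only, title_and_genre)
```

Even in this toy model, an event that matches on both title and head genre ranks above one that matches on title alone.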

Search Facets

In addition to events, the Feed API 2.0 includes search facets. Conceptually, search facets use the categorisation of the events in a search result list to provide ways of filtering the list further.

Imagine a situation where a user searches for 'Anouk' using the text parameter defined above. This will result in a wide array of events occurring at different locations. Through the location search facet, it would be possible to see how many of these events are occurring at each location. Using the location values, it would then be possible to filter the search results to see only those events happening at Amsterdam Arena, for example.

The facet information included in the Feed API 2.0 results has the following format:

 <facet name="FACET_NAME" param="FACET_PARAM">
   <facetEntry count="COUNT">VALUE</facetEntry>
   ...
 </facet>

Here:

  • FACET_NAME is a logical name for the facet
  • FACET_PARAM defines the parameter to use in combination with VALUE to filter the search results
  • COUNT defines the number of events in the total search results (not just the current page) that have the value VALUE
  • VALUE is the value that the events have for the facet

Consider the following example:

 <facet name="HeadGenre" param="headGenre">
   <facetEntry count="240">Cabaret</facetEntry>
   <facetEntry count="384">Dans</facetEntry>
   <facetEntry count="38">Film</facetEntry>
 </facet>

For the facet 'HeadGenre' there exist 240 events with the value 'Cabaret', 384 with 'Dans' and 38 with 'Film'. To then filter the results to those with 'Film', you would append headGenre=Film to your request.
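
Turning a facet into filter parameters can be sketched in Python with the standard library XML parser. The function name facet_filters is hypothetical, and the sketch works on the example facet above rather than on a full response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

FACET_XML = """
<facet name="HeadGenre" param="headGenre">
  <facetEntry count="240">Cabaret</facetEntry>
  <facetEntry count="384">Dans</facetEntry>
  <facetEntry count="38">Film</facetEntry>
</facet>
"""


def facet_filters(facet_xml):
    """Return (count, query fragment) pairs for every entry of one facet."""
    facet = ET.fromstring(facet_xml.strip())
    param = facet.get("param")
    return [
        (int(entry.get("count")), urlencode({param: entry.text}, quote_via=quote))
        for entry in facet.findall("facetEntry")
    ]


filters = facet_filters(FACET_XML)
for count, fragment in filters:
    print(count, fragment)
```

Each query fragment can be appended to the original request to narrow the results to that facet value.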

The Feed API 2.0 defines 4 facets:

  1. HeadGenre (based on the event head genres)
  2. SubGenre (based on the event sub genres)
  3. Location (based on the event location titles)
  4. Period

Spelling Suggestions

Another component of the response from the Feed API 2.0 is spelling suggestions. Included at the top of the Nub XML under the <suggestion/> tag, spelling suggestions are suggestions for new text parameter values, based on those provided in a request. Internally, the text parameter value is compared against indexed terms and the suggestion that will result in the highest number of results is generated.

Consider a search with text=hamlek. Such a search will most likely result in 0 results. In this situation, the Feed will suggest the correctly spelled 'hamlet' as an alternative text value. The following is the XML illustrating this example:

 <nubxml>
   <suggestion>hamlet</suggestion>
   <events start="0" rows="10" numfound="0">
   </events>
   ...
 </nubxml>

Note, suggestions are segregated on a per region basis. This means that when searching with the regionId=1, only those events with this region id will be used as a source of suggestions. Also, when the generated suggestion is the same as the provided text parameter value, the suggestion will not be included in the search response.
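
A client might use the suggestion to retry a search that found nothing. The following Python sketch shows one possible way of doing so, based on the example response above; the function name retry_text and the decision to retry only when numfound is 0 are assumptions, not behaviour defined by the Feed API.

```python
import xml.etree.ElementTree as ET

RESPONSE_XML = """<nubxml>
  <suggestion>hamlet</suggestion>
  <events start="0" rows="10" numfound="0">
  </events>
</nubxml>"""


def retry_text(response_xml):
    """Return the suggested text if the search found no events, else None."""
    root = ET.fromstring(response_xml)
    events = root.find("events")
    suggestion = root.findtext("suggestion")
    found_nothing = events is not None and int(events.get("numfound", "0")) == 0
    if found_nothing and suggestion and suggestion.strip():
        return suggestion.strip()
    return None


print(retry_text(RESPONSE_XML))
```

When a value is returned, it can be sent as the new text parameter in a follow-up request.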
