REST PubSub

The REST PubSub implementation is included in bullet-core and can be launched along with the Web Service. If it is enabled, the Web Service will expose two additional REST endpoints, one for reading/writing Bullet queries, and one for reading/writing results.

How does it work?

When the Web Service receives a query from a user, it will create a PubSubMessage and write the message to the "query" RESTPubSub endpoint. This PubSubMessage will contain not only the query, but also some metadata, including the appropriate host/port to which the response should be sent (this is done to allow for multiple Web Services running simultaneously). The query is then stored in memory until the backend does a GET from this endpoint, at which time the query will be served to the backend, and dropped from the queue in memory.

Once the backend has generated the results of the query, it will wrap those results in a PubSubMessage. The backend extracts the URL to send the results to from the metadata and writes the results PubSubMessage to the "results" REST endpoint with a POST. This result will then be stored in memory until the Web Service does a GET to that endpoint, at which time the Web Service will have the results of the query to send back to the user.
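
The exchange described above can be sketched as two in-memory queues, one per endpoint. This is only an illustration of the flow, not the RESTPubSub implementation: the PubSubMessage class here is a hypothetical stand-in, the "result_url" metadata key is an invented name, and returning None from an empty endpoint is an assumption.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PubSubMessage:
    """Hypothetical stand-in for a Bullet PubSubMessage: id, payload and metadata."""
    id: str
    content: str
    metadata: dict = field(default_factory=dict)


class InMemoryEndpoint:
    """Holds messages written with POST until a reader fetches them with GET."""

    def __init__(self):
        self._queue = deque()

    def post(self, message):
        self._queue.append(message)

    def get(self):
        # Serve the oldest message and drop it from memory.
        # Returning None when nothing is waiting is an assumption of this sketch.
        return self._queue.popleft() if self._queue else None


query_endpoint = InMemoryEndpoint()
result_endpoint = InMemoryEndpoint()

# Web Service: wrap the query and record where the result should be sent.
# "result_url" is a made-up metadata key used only for illustration.
metadata = {"result_url": "http://localhost:9901/api/bullet/pubsub/result"}
query_endpoint.post(PubSubMessage("q1", "example query", metadata))

# Backend: read the query, compute a result, and write it back.
query = query_endpoint.get()
result = PubSubMessage(query.id, "result for " + query.content, query.metadata)
# A real backend would POST to the URL found in the metadata instead.
result_endpoint.post(result)

# Web Service: read the result to return it to the user.
answer = result_endpoint.get()
```

After the backend's GET, the query endpoint is empty again, which mirrors the "served and dropped" behavior described above.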

Setup

To enable the RESTPubSub and expose the two additional necessary REST endpoints, you must enable the setting:

bullet.pubsub.builtin.rest.enabled: true

...in the Web Service application.yaml configuration file. This can also be done from the command line when launching the Web Service jar file by adding the command-line option:

--bullet.pubsub.builtin.rest.enabled=true

This will enable the two necessary REST endpoints, the paths for which can be configured in the application.yaml file with the settings:

bullet.pubsub.builtin.rest.query.path: /pubsub/query
bullet.pubsub.builtin.rest.result.path: /pubsub/result
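
The full URL of each endpoint is the address of the Web Service followed by the configured path. As a rough illustration, the Python sketch below composes such URLs. The endpoint_url function is hypothetical, the http scheme is assumed, and the port 9901 and the /api/bullet base path are taken from the example URLs in this document; your deployment may use different values.

```python
def endpoint_url(host, port, base_path, endpoint_path):
    """Join a Web Service address with an endpoint path, avoiding doubled slashes."""
    return "http://{}:{}/{}/{}".format(
        host, port, base_path.strip("/"), endpoint_path.strip("/")
    )


# Values assumed from the example URLs in this document.
query_url = endpoint_url("localhost", 9901, "/api/bullet", "/pubsub/query")
result_url = endpoint_url("localhost", 9901, "/api/bullet", "/pubsub/result")
```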

Plug into the Backend

Configure the backend to use the REST PubSub:

bullet.pubsub.context.name: "QUERY_PROCESSING"
bullet.pubsub.class.name: "com.yahoo.bullet.pubsub.rest.RESTPubSub"
# Path to the SerDe for PubSubMessages. You MUST use the IdentityPubSubMessageSerDe
bullet.pubsub.message.serde.class.name: 'com.yahoo.bullet.pubsub.IdentityPubSubMessageSerDe'
bullet.pubsub.rest.connect.timeout.ms: 5000
bullet.pubsub.rest.subscriber.max.uncommitted.messages: 100
bullet.pubsub.rest.query.subscriber.min.wait.ms: 10
bullet.pubsub.rest.query.urls:
    - "http://<API_HOST_A>:9901/api/bullet/pubsub/query"
    - "http://<API_HOST_B>:9901/api/bullet/pubsub/query"

| Setting Name | Default Value | Meaning |
| --- | --- | --- |
| bullet.pubsub.context.name | QUERY_PROCESSING | Tells the PubSub that it is running in the backend |
| bullet.pubsub.class.name | com.yahoo.bullet.pubsub.rest.RESTPubSub | Tells Bullet to use this class for its PubSub |
| bullet.pubsub.message.serde.class.name | com.yahoo.bullet.pubsub.IdentityPubSubMessageSerDe | Tells Bullet to use this SerDe for reading and writing PubSubMessage payloads |
| bullet.pubsub.rest.connect.timeout.ms | 5000 | Sets the HTTP connect timeout to 5 s |
| bullet.pubsub.rest.subscriber.max.uncommitted.messages | 100 | The maximum number of uncommitted messages the subscriber is allowed to read before blocking |
| bullet.pubsub.rest.query.subscriber.min.wait.ms | 10 | Avoids making HTTP requests too rapidly and overloading the HTTP endpoint. It forces the backend to poll the query endpoint at most once every 10 ms |
| bullet.pubsub.rest.query.urls | | A list of all the query REST endpoint URLs. If you are only running one Web Service, this will contain only one URL (the URL of your Web Service followed by the full path of the query endpoint) |
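
To illustrate what a minimum wait between polls means, here is a minimal Python sketch of a poller that skips requests made too soon after the previous one. It is not how the RESTPubSub subscriber is implemented; the ThrottledPoller class, its fetch callback, and the injectable clock are all hypothetical names chosen for this example.

```python
import time


class ThrottledPoller:
    """Calls fetch at most once every min_wait_ms milliseconds."""

    def __init__(self, fetch, min_wait_ms, clock=time.monotonic):
        self._fetch = fetch
        self._min_wait_s = min_wait_ms / 1000.0
        self._clock = clock
        self._last_poll = None

    def poll(self):
        now = self._clock()
        too_soon = (
            self._last_poll is not None
            and now - self._last_poll < self._min_wait_s
        )
        if too_soon:
            return None  # skip the HTTP request entirely
        self._last_poll = now
        return self._fetch()


# Usage with a fake clock so the behavior is easy to follow.
fake_time = [0.0]
calls = []


def fetch():
    calls.append(fake_time[0])  # stands in for an HTTP GET to the query endpoint
    return "query"


poller = ThrottledPoller(fetch, min_wait_ms=10, clock=lambda: fake_time[0])
first = poller.poll()    # fetches
second = poller.poll()   # skipped, no time has passed
fake_time[0] = 0.02
third = poller.poll()    # fetches again, 20 ms later
```

Passing the clock in as a parameter is a design choice for this sketch only; it keeps the example deterministic without sleeping.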

Plug into the Web Service

Configure the Web Service to use the REST PubSub by passing in a YAML file with the following settings through the bullet.pubsub.config setting in application.yaml:

bullet.pubsub.context.name: "QUERY_SUBMISSION"
bullet.pubsub.class.name: "com.yahoo.bullet.pubsub.rest.RESTPubSub"
# Path to the SerDe for PubSubMessages. You MUST use the IdentityPubSubMessageSerDe
bullet.pubsub.message.serde.class.name: 'com.yahoo.bullet.pubsub.IdentityPubSubMessageSerDe'
bullet.pubsub.rest.connect.timeout.ms: 5000
bullet.pubsub.rest.subscriber.max.uncommitted.messages: 100
bullet.pubsub.rest.result.subscriber.min.wait.ms: 10
bullet.pubsub.rest.result.url: "http://localhost:9901/api/bullet/pubsub/result"
bullet.pubsub.rest.query.urls:
    - "http://localhost:9901/api/bullet/pubsub/query"

| Setting Name | Default Value | Meaning |
| --- | --- | --- |
| bullet.pubsub.context.name | QUERY_SUBMISSION | Tells the PubSub that it is running in the Web Service |
| bullet.pubsub.class.name | com.yahoo.bullet.pubsub.rest.RESTPubSub | Tells Bullet to use this class for its PubSub |
| bullet.pubsub.message.serde.class.name | com.yahoo.bullet.pubsub.IdentityPubSubMessageSerDe | Tells Bullet to use this SerDe for reading and writing PubSubMessage payloads |
| bullet.pubsub.rest.connect.timeout.ms | 5000 | Sets the HTTP connect timeout to 5 s |
| bullet.pubsub.rest.subscriber.max.uncommitted.messages | 100 | The maximum number of uncommitted messages the subscriber is allowed to read before blocking |
| bullet.pubsub.rest.result.subscriber.min.wait.ms | 10 | Avoids making HTTP requests too rapidly and overloading the HTTP endpoint. It forces the Web Service to poll the result endpoint at most once every 10 ms |
| bullet.pubsub.rest.result.url | http://localhost:9901/api/bullet/pubsub/result | The endpoint from which the Web Service should read results. The host is the machine the Web Service is running on (or localhost) |
| bullet.pubsub.rest.query.urls | http://localhost:9901/api/bullet/pubsub/query | In the Web Service, this should contain exactly one URL (the URL to which queries should be written). The host is the machine the Web Service is running on (or localhost) |
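
The constraints described in the two settings tables can be checked mechanically before launching. The sketch below is a hypothetical helper, not part of Bullet: check_rest_pubsub_settings is an invented name, it assumes the settings have already been loaded into a plain dict keyed by setting name, and it treats a missing query URL list as an error, which is an assumption.

```python
IDENTITY_SERDE = "com.yahoo.bullet.pubsub.IdentityPubSubMessageSerDe"


def check_rest_pubsub_settings(settings):
    """Return a list of problems found in a dict of REST PubSub settings."""
    problems = []
    context = settings.get("bullet.pubsub.context.name")
    urls = settings.get("bullet.pubsub.rest.query.urls") or []

    if context not in ("QUERY_SUBMISSION", "QUERY_PROCESSING"):
        problems.append("unknown context: {}".format(context))
    if settings.get("bullet.pubsub.message.serde.class.name") != IDENTITY_SERDE:
        problems.append("the SerDe must be IdentityPubSubMessageSerDe")
    if not urls:
        # Assumption of this sketch: an empty URL list is never valid.
        problems.append("at least one query URL is required")

    if context == "QUERY_SUBMISSION":
        # The Web Service writes queries to exactly one URL and reads results from one.
        if len(urls) > 1:
            problems.append("the Web Service needs exactly one query URL")
        if not settings.get("bullet.pubsub.rest.result.url"):
            problems.append("the Web Service needs a result URL")
    return problems


web_service = {
    "bullet.pubsub.context.name": "QUERY_SUBMISSION",
    "bullet.pubsub.message.serde.class.name": IDENTITY_SERDE,
    "bullet.pubsub.rest.result.url": "http://localhost:9901/api/bullet/pubsub/result",
    "bullet.pubsub.rest.query.urls": ["http://localhost:9901/api/bullet/pubsub/query"],
}
web_service_problems = check_rest_pubsub_settings(web_service)
```

A backend configuration (QUERY_PROCESSING) may list several query URLs, one per Web Service, so the single-URL check applies only to the Web Service context.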