Data Ingestion

Bullet operates on a generic data container that it understands. To get Bullet to operate on your data, you need to convert your data records into this format. This conversion is usually done when you plug your data source into Bullet. Bullet makes no assumptions about where this data comes from: it could be Kafka, RabbitMQ, or something else.

If you are trying to set up Bullet...

The rest of this page gives more information about the Record container and how to depend on it in code directly. If you are setting up Bullet, the Record is already included by default with the Bullet artifact. You can head on over to setting up the Storm topology to build the piece that gets your data into the Record container.

Bullet Record

The Bullet backend processes data that must be stored in a Bullet Record, an abstract Java class that can be implemented in ways optimized for different backends or use cases.

There are currently two concrete implementations of BulletRecord:

  1. SimpleBulletRecord which is based on a simple Java HashMap
  2. AvroBulletRecord which uses Avro for serialization

Types

Data placed into a Bullet Record is strongly typed. We support these types currently:

Primitives

  1. Boolean
  2. Integer
  3. Long
  4. Float
  5. Double
  6. String

Complex

  1. Map of Strings to any of the Primitives
  2. Map of Strings to any Map in 1
  3. List of any of the Primitives
  4. List of any Map in 1
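To make the type list concrete, here is a minimal sketch in plain Java that checks whether a value has one of the shapes listed above. It does not use the Bullet Record API. The class and method names (`SupportedTypes`, `isSupported`, and so on) are hypothetical, and the requirement that all values in a container share one primitive type is an assumption drawn from the list above, not a confirmed rule of the library.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Bullet: classifies plain Java values
// against the primitive and complex types listed above.
final class SupportedTypes {
    // One of the six primitives: Boolean, Integer, Long, Float, Double, String.
    static boolean isPrimitive(Object o) {
        return o instanceof Boolean || o instanceof Integer || o instanceof Long
            || o instanceof Float || o instanceof Double || o instanceof String;
    }

    // True if every value is a primitive and all values share one class
    // (assumption: supported containers are homogeneous).
    static boolean allSamePrimitive(Collection<?> values) {
        Class<?> first = null;
        for (Object v : values) {
            if (!isPrimitive(v)) return false;
            if (first == null) first = v.getClass();
            else if (first != v.getClass()) return false;
        }
        return true;
    }

    static boolean hasStringKeys(Map<?, ?> m) {
        for (Object k : m.keySet()) {
            if (!(k instanceof String)) return false;
        }
        return true;
    }

    // Complex type 1: Map of Strings to a primitive.
    static boolean isPrimitiveMap(Object o) {
        if (!(o instanceof Map)) return false;
        Map<?, ?> m = (Map<?, ?>) o;
        return hasStringKeys(m) && allSamePrimitive(m.values());
    }

    static boolean allPrimitiveMaps(Collection<?> values) {
        for (Object v : values) {
            if (!isPrimitiveMap(v)) return false;
        }
        return true;
    }

    // Primitives plus the four complex shapes.
    static boolean isSupported(Object o) {
        if (isPrimitive(o)) return true;
        if (o instanceof Map) {
            Map<?, ?> m = (Map<?, ?>) o;
            return hasStringKeys(m)
                && (allSamePrimitive(m.values()) || allPrimitiveMaps(m.values()));
        }
        if (o instanceof List) {
            List<?> l = (List<?>) o;
            return allSamePrimitive(l) || allPrimitiveMaps(l);
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Object> counts = new HashMap<>();
        counts.put("clicks", 3L);
        counts.put("views", 10L);

        System.out.println(isSupported("hello"));                 // true: String primitive
        System.out.println(isSupported(counts));                  // true: Map of String to Long
        System.out.println(isSupported(Arrays.asList(counts)));   // true: List of such Maps
        System.out.println(isSupported(Arrays.asList(1, "one"))); // false: mixed element types
    }
}
```

An empty Map or List passes this check trivially, since there is no element to contradict any of the shapes.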

With these types, it is unlikely you would have data that cannot be represented as a Bullet Record, but if you do, please let us know and we are more than willing to accommodate. It is also possible to place UNKNOWN container types such as Maps and Lists into the record. This can be useful for more deeply nested data structures or heterogeneous container types. However, operations that extract fields from such a container only work if the type of the extracted object is one of the supported types above.
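The extraction rule for UNKNOWN containers can be illustrated with a small sketch in plain Java. It does not use the Bullet Record API or reproduce how Bullet performs extraction. `NestedExtraction`, `extract`, and the field names are hypothetical, chosen only to show that a deeply nested, mixed structure can still yield a usable field when the extracted leaf is a supported type.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Bullet code: walk into a nested container
// and check whether the extracted object is a supported primitive.
final class NestedExtraction {
    static boolean isPrimitive(Object o) {
        return o instanceof Boolean || o instanceof Integer || o instanceof Long
            || o instanceof Float || o instanceof Double || o instanceof String;
    }

    // Follows String steps through Maps and Integer steps through Lists.
    // Returns null when a step does not match the structure.
    static Object extract(Object root, Object... path) {
        Object current = root;
        for (Object step : path) {
            if (current instanceof Map && step instanceof String) {
                current = ((Map<?, ?>) current).get(step);
            } else if (current instanceof List && step instanceof Integer) {
                List<?> list = (List<?>) current;
                int i = (Integer) step;
                if (i < 0 || i >= list.size()) return null;
                current = list.get(i);
            } else {
                return null;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // A Map holding a List of Maps whose values are mixed Lists:
        // deeper and less uniform than any of the listed complex types.
        Map<String, Object> inner = new HashMap<>();
        inner.put("ids", Arrays.asList(7L, "seven"));
        Map<String, Object> outer = new HashMap<>();
        outer.put("level1", Arrays.asList(inner));

        Object leaf = extract(outer, "level1", 0, "ids", 0); // the Long 7
        Object container = extract(outer, "level1", 0);      // the inner Map

        System.out.println(isPrimitive(leaf));      // true: a Long can be operated on
        System.out.println(isPrimitive(container)); // false: still a mixed container
    }
}
```

In this sketch only the leaf is a supported type. The intermediate Map maps a String to a mixed List, so it matches none of the listed shapes.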

Installing the Record directly

Generally, you depend on the Bullet Core artifact for your Stream Processor when you plug in the piece that gets your data into the Stream Processor. The Bullet Core artifact already brings in the Bullet Record containers. See the Storm usage for an example.

However, if you need them, the artifacts are available through Maven Central, so you can depend on them in code directly. Below is a Maven example:

<dependency>
  <groupId>com.yahoo.bullet</groupId>
  <artifactId>bullet-record</artifactId>
  <version>${bullet.version}</version>
</dependency>

If you just need the jar artifact, you can download it directly from Maven Central.

You can also add <classifier>sources</classifier> or <classifier>javadoc</classifier> to the dependency if you want the sources or the javadoc.