# Source Streaming

Apache Pekko HTTP supports completing a request with an Apache Pekko @apidoc[Source[T, \_]], which makes it possible to easily build
and consume end-to-end streaming APIs that apply back pressure throughout the entire stack.

It is possible to complete requests with a raw @apidoc[Source[ByteString, \_]], but it is often more convenient to
stream on an element-by-element basis and let Apache Pekko HTTP handle the rendering internally, for example as a JSON array
or a CSV stream (where each element is followed by a newline).

In the following sections we investigate how to make use of the JSON Streaming infrastructure,
but the general hints apply to any kind of element-by-element streaming.

# JSON Streaming

[JSON Streaming](https://en.wikipedia.org/wiki/JSON_Streaming) is a term referring to streaming a (possibly infinite) stream of elements as independent JSON
objects in a continuous HTTP request or response. The elements are most often separated by newlines,
but they do not have to be. Concatenating elements side by side or emitting a "very long" JSON array are
other common variants.
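
To make these framings concrete, the following plain-Java sketch renders the same pre-serialized JSON objects in the three ways mentioned above: as a single JSON array, newline-separated, and concatenated side by side. It deliberately uses no Apache Pekko APIs. The `JsonFramings` class and its method names are hypothetical and exist only for this illustration.

```java
import java.util.List;

// Hypothetical helper, for illustration only; not part of Apache Pekko HTTP.
final class JsonFramings {

    // All elements wrapped in one JSON array: [e1,e2,...]
    static String asJsonArray(List<String> elements) {
        return "[" + String.join(",", elements) + "]";
    }

    // One JSON object per line, each followed by a newline character.
    static String newlineSeparated(List<String> elements) {
        StringBuilder sb = new StringBuilder();
        for (String element : elements) {
            sb.append(element).append('\n');
        }
        return sb.toString();
    }

    // Objects written back to back, with no separator at all.
    static String concatenated(List<String> elements) {
        return String.join("", elements);
    }
}
```

For the two elements `{"n":1}` and `{"n":2}`, `asJsonArray` yields `[{"n":1},{"n":2}]`, `newlineSeparated` yields each object on its own line, and `concatenated` yields `{"n":1}{"n":2}`. Only the first form is a single valid JSON document.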

In the examples below, we refer to the `Tweet` class as our model, which is defined as:

Scala
:   @@snip [JsonStreamingExamplesSpec.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingExamplesSpec.scala) { #tweet-model }

Java
:   @@snip [JsonStreamingExamplesTest.java](/docs/src/test/java/docs/http/javadsl/server/JsonStreamingExamplesTest.java) { #tweet-model }

@@@ div { .group-scala }

And as always with `spray-json`, we provide our marshaller and unmarshaller instances as implicit values using the `jsonFormat##`
method to generate them statically:

Scala
:   @@snip [JsonStreamingExamplesSpec.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingExamplesSpec.scala) { #tweet-format }

@@@

## Responding with JSON Streams

In this example we implement an API representing an infinite stream of tweets, very much like Twitter's [Streaming API](https://developer.twitter.com/en/docs).

@@@ div { .group-scala }

First, we need to set up some additional marshalling infrastructure that is able to marshal to and from an
Apache Pekko Streams @apidoc[Source[T, \_]]. One trait that contains the needed marshallers is `SprayJsonSupport`, which uses
`spray-json` (a high-performance JSON parser library) and is shipped as part of Apache Pekko HTTP in the
`pekko-http-spray-json` module.

Once the general infrastructure is prepared, we import our model's marshallers, generated by `spray-json` (Step 1),
and enable JSON Streaming by making an implicit @apidoc[EntityStreamingSupport] instance available (Step 2).
Apache Pekko HTTP pre-packages JSON and CSV entity streaming support, but it is simple to add your own in case you'd
like to stream a different content type (for example plists or protobuf).

@@@

@@@ div { .group-java }

First, we need to set up some additional marshalling infrastructure that is able to marshal to and from an
Apache Pekko Streams @apidoc[Source[T, ?]]. Here we use the `Jackson` helper class from `pekko-http-jackson` (a separate library
that you should add as a dependency if you want to use Jackson with Apache Pekko HTTP).

We then enable JSON Streaming by creating an @apidoc[EntityStreamingSupport] instance (Step 1).

The default mode of rendering a @apidoc[Source] is to represent it as a JSON array. If you want to change this representation,
for example to use Twitter-style newline-separated JSON objects, you can do so by configuring the support object accordingly.

In Step 1.1 we demonstrate how to configure the rendering to be newline-separated, and also how parallel marshalling
can be applied. We configure the support object to render the JSON as a series of newline-separated JSON objects
simply by appending a ByteString consisting of a single newline character to each ByteString in the stream.
Although this format is *not* valid JSON, it is popular because parsing it is relatively
simple: clients only need to find the newlines and apply JSON unmarshalling to each complete line.
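
The client-side parsing described above can be sketched in plain Java. This is a minimal illustration, not a Pekko API: the `LineSplitter` class is hypothetical, and a real client would hand each returned line to a JSON library. Because network chunks need not align with line boundaries, the sketch buffers the bytes of the unfinished line until its newline arrives.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Apache Pekko HTTP.
final class LineSplitter {
    // Bytes of the current line, which has not been terminated by a newline yet.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    // Feeds one received chunk and returns every line that this chunk completed.
    List<String> feed(byte[] chunk) {
        List<String> lines = new ArrayList<>();
        for (byte b : chunk) {
            if (b == '\n') {
                // Skip empty lines; a non-empty line is one complete JSON object,
                // ready to be passed to a JSON unmarshaller.
                if (pending.size() > 0) {
                    lines.add(new String(pending.toByteArray(), StandardCharsets.UTF_8));
                }
                pending.reset();
            } else {
                pending.write(b);
            }
        }
        return lines;
    }
}
```

Buffering bytes instead of characters keeps multi-byte UTF-8 sequences intact even when a chunk boundary falls in the middle of a character.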


@@@

The final step is simply to complete the request with a `Source` of tweets:

Scala
:   @@snip [JsonStreamingExamplesSpec.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingExamplesSpec.scala) { #spray-json-response-streaming }

Java
:   @@snip [JsonStreamingExamplesTest.java](/docs/src/test/java/docs/http/javadsl/server/JsonStreamingExamplesTest.java) { #response-streaming }

The @apidoc[EntityStreamingSupport] has to be enabled explicitly because you might want to configure how the
stream should be rendered. We discuss this in more depth in the next section.

@@@ div { .group-scala }

### Customising response rendering mode

Since it is not always possible to directly and confidently answer the question of how a stream of `T` should look on
the wire, the @apidoc[EntityStreamingSupport] traits come into play and allow fine-tuning the stream's rendered representation.

For example, in the case of JSON Streaming, there is no single standard for rendering the response. Some APIs prefer
to render multiple JSON objects line by line (Twitter's streaming APIs, for example), while others simply return
very large arrays, which can be streamed as well.

Apache Pekko HTTP defaults to the second option (streaming a JSON array), as it is valid JSON and clients not expecting
a streaming API are still able to consume it in a naive way if they want to.

The line-by-line approach, however, is also popular even though it is not valid JSON. Its simplicity for
client-side parsing is a strong argument for picking this format for your streaming APIs.
Below we demonstrate how to reconfigure the support trait to render the JSON line by line.

Scala
:   @@snip [JsonStreamingExamplesSpec.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingExamplesSpec.scala) { #line-by-line-json-response-streaming }

Another interesting feature is parallel marshalling. Since marshalling can potentially take a long time,
it is possible to marshal multiple elements of the stream in parallel. This is simply a configuration
option on @apidoc[EntityStreamingSupport] and can be set like this:

Scala
:   @@snip [JsonStreamingExamplesSpec.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingExamplesSpec.scala) { #async-rendering }

The above shown mode preserves ordering of the Source's elements, which may sometimes be a required property,
for example when streaming a strictly ordered dataset. Sometimes the concept of strict order does not apply to the
data being streamed, though, which allows us to exploit this property and use an `unordered` rendering.

This `unordered` rendering can be enabled via a configuration option as shown below. Effectively, this allows
Apache Pekko HTTP's marshalling infrastructure to concurrently marshal up to as many elements as defined in `parallelism`
and to emit whichever element finishes marshalling first into the @apidoc[HttpResponse]:

Scala
:   @@snip [JsonStreamingExamplesSpec.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingExamplesSpec.scala) { #async-unordered-rendering }

This allows us to _potentially_ render elements faster into the `HttpResponse`, since it avoids "head-of-line blocking"
when one element at the front of the stream takes a long time to marshal while the elements after it are very quick to marshal.

@@@

## Consuming JSON Streaming uploads

Sometimes a client sends a streaming request. For example, an embedded device initiates a connection with
the server and feeds it measurement data, one line at a time.

In this example, we want to consume this data in a streaming fashion from the request entity and also apply
back pressure to the underlying TCP connection should the server be unable to cope with the rate of incoming data. Back pressure
is automatically applied thanks to @extref[Apache Pekko Streams](pekko-docs:stream/index.html).

Scala
:   @@snip [JsonStreamingExamplesSpec.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingExamplesSpec.scala) { #measurement-model #measurement-format }

Java
:   @@snip [JsonStreamingExamplesTest.java](/docs/src/test/java/docs/http/javadsl/server/JsonStreamingExamplesTest.java) { #measurement-model #measurement-format }


Scala
:   @@snip [JsonStreamingExamplesSpec.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingExamplesSpec.scala) { #spray-json-request-streaming }

Java
:   @@snip [JsonStreamingExamplesTest.java](/docs/src/test/java/docs/http/javadsl/server/JsonStreamingExamplesTest.java) { #incoming-request-streaming }


## Simple CSV streaming example

Apache Pekko HTTP provides another @apidoc[EntityStreamingSupport] out of the box, namely `csv` (comma-separated values).
For completeness, we demonstrate its usage in the snippet below. As you'll notice, switching between streaming
modes is fairly simple: you only have to make sure that an implicit @apidoc[Marshaller] of the requested type is available
and that the streaming support operates on the same `Content-Type` as the rendered values. Otherwise, you'll see
a runtime error stating that the marshaller did not expose the expected content type, and the streaming response
cannot be rendered.

Scala
:   @@snip [JsonStreamingExamplesSpec.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingExamplesSpec.scala) { #csv-example }

Java
:   @@snip [JsonStreamingExamplesTest.java](/docs/src/test/java/docs/http/javadsl/server/JsonStreamingExamplesTest.java) { #csv-example }

## Implementing custom EntityStreamingSupport traits

The @apidoc[EntityStreamingSupport] infrastructure is open for extension and not bound to any single format, content type,
or marshalling library. The provided JSON support does not rely on `spray-json` directly, but uses @apidoc[Marshaller[T, ByteString]]
instances, which can be provided using any JSON marshalling library (such as Circe, Jawn or Play JSON).

When implementing a custom support trait, one should simply extend the @apidoc[EntityStreamingSupport] abstract class
and implement all of its methods. It's best to use the existing implementations as a guideline.

## Supporting custom content types

In order to marshal into custom content types, both a @apidoc[Marshaller] that can handle that content type
**and an @apidoc[EntityStreamingSupport] of matching content type** are required.

Refer to the complete example below, showcasing how to configure a custom marshaller and change
the entity streaming support's content type to be compatible. This is an area that would benefit from additional type safety,
which we hope to add in a future release.

Scala
:   @@snip [JsonStreamingFullExamples.scala](/docs/src/test/scala/docs/http/scaladsl/server/directives/JsonStreamingFullExamples.scala) { #custom-content-type }

Java
:   @@snip [JsonStreamingFullExample.java](/docs/src/test/java/docs/http/javadsl/server/directives/JsonStreamingFullExample.java) { #custom-content-type }

## Consuming streaming JSON on client-side

To consume such streaming APIs, for example with JSON responses, refer to the @ref[Consuming JSON Streaming style APIs](../common/json-support.md#consuming-json-streaming-style-apis)
documentation in the JSON support section.
