How to Optimize PDFs for Streaming in Java
Learn how to linearize and optimize a PDF document so it's easier to stream and download on the web.
It’s generally hard to find fault with the way PDF documents are structured. After all, PDFs are secure, web-accessible files capable of displaying high-quality bitmap images, text, and many kinds of multimedia. There’s a lot to love in there.
PDF Download & Streaming Limitations
If we did have to pick a bone, however, it would be with the way PDFs load by default, and how that affects viewing or downloading them on the web. A traditional (non-linearized) PDF keeps its cross-reference table, the index a viewer uses to locate each page's objects, at the end of the file, so the full document generally has to download (or stream) before any part of it can be viewed.
That isn’t a big deal most of the time, until we encounter a large PDF document with lots of pages laden with multimedia. Embedding PDF document viewers in a web page is common practice these days, but if page visitors get stuck waiting for a big document to load completely, they might lose patience and shift their attention elsewhere.
Thankfully, there’s a way we can get around traditional PDF file structure. By optimizing PDFs through linearization, we can reorganize the file internally so viewers can load the first few pages of the document before the rest of the pages download. This contributes to a much more positive user experience, increasing the likelihood we’ll retain web page visitors.
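To make the idea a little more concrete: a linearized file announces itself with a linearization parameter dictionary, containing a /Linearized key, which the PDF specification places within the first 1024 bytes of the file. The sketch below uses that as a rough heuristic for telling whether a document already looks linearized. It is not a validator, the class and method names are hypothetical, and it assumes Java 11 or later (for readNBytes).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any SDK: a heuristic check only.
final class LinearizationCheck {

    // The linearization parameter dictionary must sit in the first 1024 bytes.
    private static final int HEADER_WINDOW = 1024;

    // Returns true if the leading bytes contain a PDF header and a /Linearized key.
    static boolean looksLinearized(byte[] head) {
        int len = Math.min(head.length, HEADER_WINDOW);
        // ISO-8859-1 maps every byte to one char, so binary content cannot break decoding.
        String text = new String(head, 0, len, StandardCharsets.ISO_8859_1);
        return text.contains("%PDF-") && text.contains("/Linearized");
    }

    // Convenience overload: reads only the first 1024 bytes of the file.
    static boolean looksLinearized(Path pdf) throws IOException {
        try (InputStream in = Files.newInputStream(pdf)) {
            return looksLinearized(in.readNBytes(HEADER_WINDOW));
        }
    }
}
```

Running this on a file before and after the optimization step in this tutorial is one quick way to sanity-check that linearization was applied.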
Tutorial: Optimizing PDFs for Download in Java
In this tutorial, we’re going to learn how to call an API that optimizes PDFs for web streaming via linearization. We’ll structure our API call with ready-to-run Java code examples, and we’ll wrap up our whole process in minutes.
Step 1: Installing the Client with Maven
In our first step, we’re going to install the Cloudmersive Java client with Maven. To do that, we can start by adding the below reference to the repository in our pom.xml:
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
And right after that, we can add another reference to the dependency in pom.xml:
<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v4.25</version>
    </dependency>
</dependencies>
Step 2: Adding the Imports to our Controller
In this step, we’re going to copy the following snippet containing the imports and add that to the top of our controller:
// Import classes:
import java.io.File; // needed for the input file in Step 4
import com.cloudmersive.client.invoker.ApiClient;
import com.cloudmersive.client.invoker.ApiException;
import com.cloudmersive.client.invoker.Configuration;
import com.cloudmersive.client.invoker.auth.*;
import com.cloudmersive.client.EditPdfApi;
Step 3: Configuring our API Key
In this step, we’ll turn our attention to authorizing our API calls. We won’t be able to make our API calls without a Cloudmersive API key; thankfully, we can get one of those for free by heading to the Cloudmersive website and creating a free account.
With our API key ready to go, we can now copy the configuration snippet into our file and supply our API key string where indicated in the code comments:
ApiClient defaultClient = Configuration.getDefaultApiClient();
// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");
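Hardcoding the key works for a quick test, but it's easy to leak that way. As a minimal sketch, assuming we store the key in an environment variable, we could read it at startup instead. The variable name CLOUDMERSIVE_API_KEY and the ApiKeySource class are hypothetical choices for illustration, not something the Cloudmersive client defines.

```java
// Hypothetical helper for illustration; not part of the Cloudmersive client.
final class ApiKeySource {

    // Assumed variable name; pick whatever fits your deployment.
    static final String ENV_VAR = "CLOUDMERSIVE_API_KEY";

    // Rejects missing or blank keys early, so failures are obvious at startup.
    static String requireKey(String candidate) {
        if (candidate == null || candidate.trim().isEmpty()) {
            throw new IllegalStateException(
                "No API key found; set the " + ENV_VAR + " environment variable");
        }
        return candidate.trim();
    }

    // Reads the key from the process environment.
    static String fromEnvironment() {
        return requireKey(System.getenv(ENV_VAR));
    }
}
```

With this in place, the configuration line above would become Apikey.setApiKey(ApiKeySource.fromEnvironment());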
Step 4: Calling the PDF Optimization Function
We’ve reached the final step in our process: calling the PDF linearization & optimization function. We can use the below snippet to create an instance of the API and call the function (our file goes in the File inputFile variable):
EditPdfApi apiInstance = new EditPdfApi();
File inputFile = new File("/path/to/inputfile"); // File | Input file to perform the operation on.
try {
    byte[] result = apiInstance.editPdfLinearize(inputFile);
    // Printing a byte[] directly only shows its object reference, so report the size instead
    System.out.println("Received " + result.length + " bytes of optimized PDF data");
} catch (ApiException e) {
    System.err.println("Exception when calling EditPdfApi#editPdfLinearize");
    e.printStackTrace();
}
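The call above hands back the optimized document as raw bytes, so we'll usually want to persist them somewhere a web server can reach. Here's a minimal sketch of one way to do that with the standard library; the PdfOutput class and the target path are hypothetical, and real code may want different overwrite behavior.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; writes the API result to disk.
final class PdfOutput {

    // Writes the bytes to target, creating parent folders and replacing any existing file.
    static Path save(byte[] pdfBytes, Path target) throws IOException {
        if (pdfBytes == null || pdfBytes.length == 0) {
            throw new IllegalArgumentException("No PDF data to write");
        }
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // Files.write creates the file, or truncates it if it already exists.
        return Files.write(target, pdfBytes);
    }
}
```

Inside the try block from Step 4, this could be called as PdfOutput.save(result, Path.of("optimized.pdf")); with an extra catch clause for IOException (Path.of assumes Java 11 or later).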
And that’s all there is to it - no more code required! Our API call will return the optimized version of our document, and we can display that version on our web pages.
If you have any questions about optimizing PDF processes with Cloudmersive APIs, please feel free to contact a member of our team! Don’t forget to subscribe to our technical blog for many more tutorials like this one.