Dileep K V - 1 year ago
Java Question

PDFBox creating Sound object with link/reference to external mp3 or wav file

I am writing a utility application using the open-source, Java-based PDFBox library to convert a PDF file that contains a hyperlink to an mp3 file, replacing that hyperlink with a sound object.

I chose the PDFBox API because it appears mature enough to work with sound objects. I can read the PDF file and find the hyperlink that references the mp3, but I am not able to replace it with a sound object. I created the sound object and associated it with an action, but it does not work. I think I am missing an important part of how to create a sound object using PDActionSound. Is it possible to refer to an external wav file using the PDFBox API?

for (PDPage pdPage : pages) {
    List<PDAnnotation> annotations = pdPage.getAnnotations();
    for (PDAnnotation pdAnnotation : annotations) {
        if (pdAnnotation instanceof PDAnnotationLink) {
            PDAnnotationLink link = (PDAnnotationLink) pdAnnotation;
            PDAction action = link.getAction();
            if (action instanceof PDActionLaunch) {
                PDActionLaunch launch = (PDActionLaunch) action;
                String fileInfo = launch.getFile().getFile();
                if (fileInfo.contains(".mp3")) {
                    /* create Sound object referring to external mp3 */
                    // something like
                    PDActionSound actionSound = new PDActionSound(
                    // set the ActionSound to the link.

How do I create a sound object (PDActionSound) and add it to the link successfully?

Answer Source

Speaking of mature: that part has never been used, and now that I have had a closer look at the code, I think some work remains to be done. Please try this; I wrote it with PDFBox 2.0 after reading the PDF specification:

PDSimpleFileSpecification fileSpec = new PDSimpleFileSpecification(new COSString("/C/dir1/dir2/blah.mp3")); // see "File Specification Strings" in PDF spec
COSStream soundStream = new COSStream();
soundStream.setItem(COSName.F, fileSpec);
soundStream.setInt(COSName.R, 44100); // put actual sample rate here
PDActionSound actionSound = new PDActionSound();
actionSound.getCOSObject().setItem(COSName.getPDFName("Sound"), soundStream);
link.setAction(actionSound); // reassign the new action to the link annotation

Edit: as the above didn't work, here's an alternative solution as requested in the comments. The file is embedded. It works only with .WAV files, and you have to know their details (channels, sampling rate, bits per sample). About half a second is lost at the beginning. The sound you should hear is "I am Al Bundy". I tried with MP3 and didn't succeed. While googling, I found some texts saying that only "old" formats (wav, aif etc.) are supported. I did find another way to play sounds ("Renditions") that even worked with embedded MP3 in another product, but the generated structure in the PDF is even more complex.

COSStream soundStream = new COSStream();
OutputStream os = soundStream.createOutputStream(COSName.FLATE_DECODE);
URL url = new URL("http://cd.textfiles.com/hackchronii/WAV/ALBUNDY1.WAV");
InputStream is = url.openStream();
// FileInputStream is = new FileInputStream(".....WAV");
IOUtils.copy(is, os);
is.close();
os.close(); // close both streams so the copied data is flushed into the COSStream
// See p. 506 in PDF spec, Table 294
soundStream.setInt(COSName.C, 1); // channels
soundStream.setInt(COSName.R, 22050); // sampling rate
//soundStream.setString(COSName.E, "Signed"); // The encoding format for the sample data
soundStream.setInt(COSName.B, 8); // The number of bits per sample value per channel. Default value: 8
// soundStream.setName(COSName.CO, "MP3"); // doesn't work
PDActionSound actionSound = new PDActionSound();
actionSound.getCOSObject().setItem(COSName.getPDFName("Sound"), soundStream);
link.setAction(actionSound); // reassign the new action to the link annotation
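
The channel count, sampling rate and bits per sample set above have to match the WAV file. Rather than hard-coding them, they could probably be read from the file header with the JDK's javax.sound.sampled API. The following is only a sketch: the class WavParamsSketch and its helpers describe and tinyWav are hypothetical names, not part of PDFBox, and the in-memory WAV merely stands in for a real file.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

public class WavParamsSketch {

    /** Hypothetical helper: returns {channels, sample rate in Hz, bits per sample}. */
    static int[] describe(InputStream wav) throws Exception {
        // AudioSystem needs mark/reset support to inspect the header, hence the buffer
        try (AudioInputStream ais = AudioSystem.getAudioInputStream(new BufferedInputStream(wav))) {
            AudioFormat f = ais.getFormat();
            return new int[] {f.getChannels(), Math.round(f.getSampleRate()), f.getSampleSizeInBits()};
        }
    }

    /** Builds a tiny 8-bit mono WAV in memory so the demo needs no file on disk. */
    static byte[] tinyWav() throws Exception {
        AudioFormat format = new AudioFormat(22050f, 8, 1, false, false); // unsigned 8-bit mono
        byte[] samples = new byte[100];
        AudioInputStream pcm =
                new AudioInputStream(new ByteArrayInputStream(samples), format, samples.length);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        AudioSystem.write(pcm, AudioFileFormat.Type.WAVE, out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        int[] p = describe(new ByteArrayInputStream(tinyWav()));
        // These three values correspond to the C, R and B entries of the sound stream
        System.out.println("C=" + p[0] + " R=" + p[1] + " B=" + p[2]);
    }
}
```

With a real file, describe(new FileInputStream(...)) would supply the values to pass to soundStream.setInt for COSName.C, COSName.R and COSName.B.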

Update 9.7.2016:

We discussed this on the PDFBox mailing list, and thanks to Gilad Denneboom we know two more things: 1) Adobe Acrobat only lets you select WAV or AIF files; 2) MP3 can be converted to raw audio with MP3SPI, using this code by Gilad Denneboom:

private static InputStream getAudioStream(String filename) throws Exception {
    File file = new File(filename);
    AudioInputStream in = AudioSystem.getAudioInputStream(file);
    AudioFormat baseFormat = in.getFormat();
    AudioFormat decodedFormat = new AudioFormat(
    return AudioSystem.getAudioInputStream(decodedFormat, in);
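
The snippet above breaks off inside the AudioFormat constructor, so the exact target format Gilad used is not shown here. As a rough sketch of the same idea, the following re-encodes an audio stream as raw PCM with javax.sound.sampled. The target format (signed, 16-bit, big-endian) is an assumption: as far as I can tell the PDF specification stores samples wider than 8 bits high-order byte first, and the sound stream would then presumably need E set to "Signed" and B set to 16. The class PcmDecodeSketch and its helpers are hypothetical names. Decoding an actual MP3 this way requires MP3SPI on the classpath; the demo uses plain PCM input so that it runs with the JDK alone.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class PcmDecodeSketch {

    /** Hypothetical helper: re-encodes a stream as signed 16-bit big-endian PCM (assumed target). */
    static AudioInputStream toSignedBigEndianPcm(AudioInputStream in) {
        AudioFormat base = in.getFormat();
        AudioFormat target = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED,
                base.getSampleRate(),
                16,                       // bits per sample
                base.getChannels(),
                base.getChannels() * 2,   // frame size in bytes
                base.getSampleRate(),     // frame rate equals sample rate for PCM
                true);                    // big-endian
        return AudioSystem.getAudioInputStream(target, in);
    }

    /** Reads a stream fully into a byte array. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Two little-endian 16-bit mono samples standing in for decoded audio
        byte[] littleEndian = {0x01, 0x02, 0x03, 0x04};
        AudioFormat source = new AudioFormat(22050f, 16, 1, true, false);
        AudioInputStream in =
                new AudioInputStream(new ByteArrayInputStream(littleEndian), source, 2);
        byte[] raw = readAll(toSignedBigEndianPcm(in));
        System.out.println(Arrays.toString(raw));
    }
}
```

The resulting bytes could then be copied into the COSStream in place of the WAV file contents, with C, R, B and E set to match the target format.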