Andrea - 1 month ago
Java Question

Removing an XWPFParagraph keeps the paragraph symbol (¶) for it

I am trying to remove a set of contiguous paragraphs from a Microsoft Word document using Apache POI.

From what I understand, a paragraph can be deleted by removing all of its runs, like this:

/*
 * Deletes the given paragraph.
 */
public static void deleteParagraph(XWPFParagraph p) {
    if (p != null) {
        List<XWPFRun> runs = p.getRuns();
        // Delete all the runs, last to first
        for (int i = runs.size() - 1; i >= 0; i--) {
            p.removeRun(i);
        }
        p.setPageBreak(false); // Remove any page break
    }
}


This does work, but with a strange side effect: the block of removed paragraphs does not disappear from the document. Instead it is converted into a set of empty lines, as if every paragraph had been turned into a line break.

Printing the paragraphs' content from code shows a space for each removed paragraph. Looking at the document itself with formatting marks displayed, I can see this:

[Screenshot of the document with formatting marks displayed]

The vertical column of ¶ corresponds to the block of deleted elements.

Do you have any idea why this happens? I'd like my paragraphs to be removed completely.

I also tried replacing the text (with setText()) and removing any spacing that might be added automatically, like this:

p.setSpacingAfter(0);
p.setSpacingAfterLines(0);
p.setSpacingBefore(0);
p.setSpacingBeforeLines(0);
p.setIndentFromLeft(0);
p.setIndentFromRight(0);
p.setIndentationFirstLine(0);
p.setIndentationLeft(0);
p.setIndentationRight(0);


But with no luck.

Answer

I would delete paragraphs by deleting the paragraphs themselves, not just the runs inside them. Deleting paragraphs is not part of the Apache POI high-level API, but XWPFDocument.getDocument().getBody() returns the low-level CTBody, which has a removeP(int i) method.
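To see why removing only the runs leaves a ¶ behind, it may help to look at the XML underneath. In a .docx, each paragraph is a <w:p> element and each run is a <w:r> child of it, and Word shows a paragraph mark for every <w:p>, even an empty one. The sketch below does not use POI at all: it uses the JDK's DOM classes on a hypothetical, heavily simplified body fragment (BODY_XML and all method names are made up for illustration) to compare the two kinds of removal.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class EmptyParagraphDemo {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical, heavily simplified stand-in for the body of word/document.xml
    static final String BODY_XML =
        "<w:body xmlns:w=\"" + W_NS + "\">"
        + "<w:p><w:r><w:t>keep</w:t></w:r></w:p>"
        + "<w:p><w:r><w:t>delete me</w:t></w:r></w:p>"
        + "</w:body>";

    // Parses the fragment and returns its <w:body> root element
    static Element parseBody() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            byte[] xml = BODY_XML.getBytes(StandardCharsets.UTF_8);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static int countParagraphs(Element body) {
        return body.getElementsByTagNameNS(W_NS, "p").getLength();
    }

    static Element paragraphAt(Element body, int index) {
        return (Element) body.getElementsByTagNameNS(W_NS, "p").item(index);
    }

    // Removes every <w:r> child but leaves the <w:p> element itself in place
    static void removeRuns(Element paragraph) {
        NodeList runs = paragraph.getElementsByTagNameNS(W_NS, "r");
        for (int i = runs.getLength() - 1; i >= 0; i--) {
            paragraph.removeChild(runs.item(i));
        }
    }

    // Approach from the question: strip the runs of the second paragraph
    static int paragraphsAfterRemovingRuns() {
        Element body = parseBody();
        removeRuns(paragraphAt(body, 1));
        return countParagraphs(body);
    }

    // Approach from this answer: remove the <w:p> element from the body
    static int paragraphsAfterRemovingParagraph() {
        Element body = parseBody();
        body.removeChild(paragraphAt(body, 1));
        return countParagraphs(body);
    }

    public static void main(String[] args) {
        System.out.println("After removing runs: " + paragraphsAfterRemovingRuns());
        System.out.println("After removing the paragraph: " + paragraphsAfterRemovingParagraph());
    }
}
```

Removing the runs still leaves two <w:p> elements in the body, one of them empty, while removing the element itself leaves one. That empty <w:p> is what shows up as a lone ¶ in Word.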

Example:

import java.io.*;
import org.apache.poi.xwpf.usermodel.*;

import java.awt.Desktop;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

public class WordRemoveParagraph {

 /*
  * Deletes the given paragraph.
  */

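 public static void deleteParagraph(XWPFParagraph p) {
  XWPFDocument doc = p.getDocument();
  // Position among all body elements (paragraphs and tables)
  int bodyPos = doc.getPosOfParagraph(p);
  // removeP expects an index among the paragraphs only, so convert first
  int pPos = doc.getParagraphPos(bodyPos);
  doc.getDocument().getBody().removeP(pPos);
 }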

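 public static void main(String[] args) throws IOException, InvalidFormatException {

  // try-with-resources closes the input stream and the document even if an exception is thrown
  try (FileInputStream in = new FileInputStream("source.docx");
       XWPFDocument doc = new XWPFDocument(in)) {

   // Walk from the last paragraph to the first so that removing one
   // does not shift the positions of the paragraphs still to be checked
   int pNumber = doc.getParagraphs().size() - 1;
   while (pNumber >= 0) {
    XWPFParagraph p = doc.getParagraphs().get(pNumber);
    if (p.getParagraphText().contains("delete")) {
     deleteParagraph(p);
    }
    pNumber--;
   }

   try (FileOutputStream out = new FileOutputStream("result.docx")) {
    doc.write(out);
   }
  }

  System.out.println("Done");
  Desktop.getDesktop().open(new File("result.docx"));

 }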

}

This deletes all paragraphs from the document source.docx where the text contains "delete" and saves the result in result.docx.
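The loop in main walks the paragraphs from last to first. A likely reason is that removing an entry by index shifts the position of everything after it, so a forward walk can skip or hit the wrong entry. The sketch below is only a model of that effect: it uses a plain ArrayList of strings instead of POI objects, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BackwardRemovalDemo {

    // Walks from the last index to the first; removals only shift entries already visited
    static List<String> removeBackward(List<String> paragraphs, String marker) {
        List<String> result = new ArrayList<>(paragraphs);
        for (int i = result.size() - 1; i >= 0; i--) {
            if (result.get(i).contains(marker)) {
                result.remove(i);
            }
        }
        return result;
    }

    // Naive forward walk: after remove(i) the next entry slides into slot i and is skipped
    static List<String> removeForward(List<String> paragraphs, String marker) {
        List<String> result = new ArrayList<>(paragraphs);
        for (int i = 0; i < result.size(); i++) {
            if (result.get(i).contains(marker)) {
                result.remove(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList("intro", "delete A", "delete B", "outro");
        System.out.println(removeBackward(texts, "delete")); // [intro, outro]
        System.out.println(removeForward(texts, "delete"));  // [intro, delete B, outro]
    }
}
```

With two adjacent matches, the forward walk leaves "delete B" behind because it moved into the slot that had just been checked. The backward walk removes both.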
