
How To Get Contents Between Two Tags In Jsoup/javascript

Chapter One

A piece of computer code

FirstnameLast

Solution 1:

Is this format going to be consistent? If so, you can simply take the next two siblings of the strong element's parent (the p), for example by calling nextElementSibling() twice in jsoup.
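For the fixed-layout case, here is a minimal sketch of the "two siblings" idea. It uses the JDK's built-in org.w3c.dom parser rather than jsoup so that it runs without extra dependencies. That swap, the class name FixedLayoutSiblings, and the sample markup are all assumptions made for illustration; the markup is only a guess at the structure being discussed. DOM's getNextSibling() can return text nodes, so a small helper skips ahead to the next element, which is what jsoup's nextElementSibling() does for you.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class FixedLayoutSiblings {
    // Hypothetical sample markup (an assumption), kept well-formed so the XML parser accepts it.
    static final String SAMPLE =
        "<div><p><strong>Chapter One</strong></p>"
      + "<p>A piece of computer code</p>"
      + "<table><tr><td>Firstname</td><td>Last</td></tr></table></div>";

    // getNextSibling() may return text nodes, so keep walking until an element is found.
    static Node nextElement(Node node) {
        Node next = node.getNextSibling();
        while (next != null && next.getNodeType() != Node.ELEMENT_NODE) {
            next = next.getNextSibling();
        }
        return next;
    }

    // Returns the text of the two elements that follow the first <strong>'s parent <p>.
    // Assumes the layout is fixed: there is no null handling for missing siblings.
    static String[] chapterParts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node strong = doc.getElementsByTagName("strong").item(0);
            Node heading = strong.getParentNode(); // the enclosing <p>
            Node first = nextElement(heading);     // first sibling after the heading
            Node second = nextElement(first);      // second sibling after the heading
            return new String[] { first.getTextContent(), second.getTextContent() };
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample markup", e);
        }
    }

    public static void main(String[] args) {
        for (String part : chapterParts(SAMPLE)) {
            System.out.println(part);
        }
    }
}
```

With the sample above this prints the paragraph text followed by the table's combined cell text. If the real page ever has more or fewer elements after a heading, this approach silently picks up the wrong ones.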

If it's going to vary, you might need to manually check when to stop iterating through the siblings, such as verifying if the sibling contains a strong element.

It all depends on the full context.

Here's an example with basic loops. You may want to add more checks or better queries given a different situation.

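import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

Document doc = Jsoup.connect(url).get();
List<Elements> data = new ArrayList<>(); // one entry per chapter
Elements chapters = doc.select("p > strong");
for (Element chapter : chapters) {
    if (!chapter.ownText().toLowerCase().contains("chapter"))
        continue; // this strong element isn't actually a chapter heading
    List<Element> siblings = new ArrayList<>();
    // The chapter content follows the heading's parent <p>, not the <strong> itself
    Element next = chapter.parent().nextElementSibling();
    while (next != null) {
        // The next heading's text lives in a child <strong>, so check that rather than the sibling's own text
        Element strong = next.selectFirst("> strong");
        if (strong != null && strong.ownText().toLowerCase().contains("chapter"))
            break; // we've reached the next chapter heading
        siblings.add(next);
        next = next.nextElementSibling();
    }
    data.add(new Elements(siblings));
}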
