How to get text without nested children's texts #10

Open
Micka33 opened this issue Mar 9, 2020 · 1 comment

Micka33 commented Mar 9, 2020

How can I get just "This is some text" and not "This is some textFirst span textSecond span text"?

<li id="listItem">
    This is some text
    <span id="firstSpan">First span text</span>
    <span id="secondSpan">Second span text</span>
</li>

Example:

let cheerio = require('cheerio');
let $ = cheerio.load(`
<li id="listItem">
    This is some text
    <span id="firstSpan">First span text</span>
    <span id="secondSpan">Second span text</span>
</li>`)

let jsonframe = require('jsonframe-cheerio')
jsonframe($)

let frame = {"text": "li#listItem"}
console.log( $('body').scrape(frame, { string: true } ))
// {
//   "text": "This is some text First span text Second span text"
// }

Micka33 changed the title from "How to get text without nested span's texts" to "How to get text without nested children's texts" on Mar 9, 2020

Micka33 commented Mar 9, 2020

This does the trick.

let frame = {"text": "li#listItem < html || ([\\w\\s\\.\\d]+)<span"}

But it only works because my text is followed either by nothing or by <span. Is there a better built-in solution?
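
One way to loosen the "<span" restriction is to capture everything up to the first "<" of any tag, instead of naming the tag. The sketch below shows the idea in plain JavaScript on the element's inner HTML. leadingText is a hypothetical helper, not part of cheerio or jsonframe-cheerio, and whether the same pattern behaves identically after "||" in a jsonframe frame is an untested assumption.

```javascript
// Hypothetical helper: returns the text that precedes the first child tag.
// It works on an inner-HTML string, so it does not depend on which tag follows.
function leadingText(innerHtml) {
  // ^([^<]*) captures every character from the start up to the first "<"
  const match = /^([^<]*)/.exec(innerHtml);
  return match ? match[1].trim() : '';
}

// Inner HTML of li#listItem from the example above
const innerHtml = `
    This is some text
    <span id="firstSpan">First span text</span>
    <span id="secondSpan">Second span text</span>
`;

console.log(leadingText(innerHtml));                 // This is some text
console.log(leadingText('Only text, no children'));  // Only text, no children
console.log(leadingText('Text before <b>bold</b>')); // Text before
```

This only returns text that comes before the first child element; text between or after children is ignored. With plain cheerio, filtering $('#listItem').contents() down to nodes whose type is 'text' is a common alternative that also picks up text between child elements.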
