where is isSameSentence() ? #17

Open
CZHIC opened this issue Jul 8, 2021 · 3 comments

@CZHIC

CZHIC commented Jul 8, 2021

No description provided.

@patrickxchong

I have the same question too!

@patrickxchong

I'm not sure what the intended behaviour actually is, but this worked for me to some extent (although I ended up manually parsing the text output of GetTextByRow instead):

func isSameSentence(text pdf.Text, lastTextStyle pdf.Text) bool {
	return (text.Font == lastTextStyle.Font) && (text.FontSize == lastTextStyle.FontSize) && (text.X == lastTextStyle.X)
}

@white0ut

white0ut commented Jan 15, 2024

For future visitors: the isSameSentence above isn't quite on the mark. Because it also compares X, which generally differs from one character to the next, it almost never returns true, so the font, font size, and x and y coordinates end up being printed for each character of text in the PDF.

It might be more useful to treat text as part of the same sentence when it has the same font and font size. In that case, the function definition you'd want is:

func isSameSentence(text pdf.Text, lastTextStyle pdf.Text) bool {
	return (text.Font == lastTextStyle.Font) && (text.FontSize == lastTextStyle.FontSize)
}
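
To illustrate the effect of the font-and-size check, here is a minimal, self-contained sketch. It does not import the PDF library: the `Text` struct is a local stand-in that only mirrors the pdf.Text fields used in this thread, and `groupRuns` and the sample data are hypothetical names made up for the example.

```go
package main

import "fmt"

// Text is a local stand-in for pdf.Text (an assumption for this sketch),
// limited to the fields discussed in this thread.
type Text struct {
	Font     string
	FontSize float64
	X, Y     float64
	S        string
}

// isSameSentence treats two items as belonging together when they
// share the same font and font size.
func isSameSentence(text, lastTextStyle Text) bool {
	return text.Font == lastTextStyle.Font && text.FontSize == lastTextStyle.FontSize
}

// groupRuns (hypothetical helper) merges consecutive items for which
// isSameSentence holds, keeping the style and position of the first item.
func groupRuns(texts []Text) []Text {
	var runs []Text
	for _, t := range texts {
		if n := len(runs); n > 0 && isSameSentence(t, runs[n-1]) {
			runs[n-1].S += t.S
			continue
		}
		runs = append(runs, t)
	}
	return runs
}

func main() {
	// Made-up sample data: one item per character, as in the thread.
	texts := []Text{
		{Font: "Helvetica-Bold", FontSize: 14, X: 10, Y: 700, S: "H"},
		{Font: "Helvetica-Bold", FontSize: 14, X: 18, Y: 700, S: "i"},
		{Font: "Helvetica", FontSize: 10, X: 10, Y: 680, S: "o"},
		{Font: "Helvetica", FontSize: 10, X: 15, Y: 680, S: "k"},
	}
	for _, r := range groupRuns(texts) {
		fmt.Printf("Font: %s, Font-size: %.0f, content: %s\n", r.Font, r.FontSize, r.S)
	}
}
```

With this check, a run only ends when the font or size changes, so a whole paragraph set in one style comes out as a single item.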

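Matching on font and font size alone really isn't true to the meaning of "same sentence", though, so you may also want to check whether a period is already present in lastTextStyle.S before returning true, which effectively appends the character to the text that gets printed alongside its text style. Once a period has been seen, the sentence has ended and the function should return false:

// Requires "strings" in the file's import block.
func isSameSentence(text pdf.Text, lastTextStyle pdf.Text) bool {
	// lastTextStyle.S holds the text accumulated so far; a period in it means the sentence has ended.
	return (text.Font == lastTextStyle.Font) && (text.FontSize == lastTextStyle.FontSize) && !strings.Contains(lastTextStyle.S, ".")
}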
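
As a rough, self-contained sketch of the period-based idea, the example below accumulates characters until the text gathered so far contains a period. The `Text` struct is again a local stand-in for pdf.Text, and `splitSentences` and the sample input are hypothetical. The period check is deliberately crude: abbreviations and decimal numbers would also end a "sentence".

```go
package main

import (
	"fmt"
	"strings"
)

// Text is a local stand-in for pdf.Text (an assumption for this sketch).
type Text struct {
	Font     string
	FontSize float64
	S        string
}

// isSameSentence reports whether text continues the sentence accumulated
// in lastTextStyle: same font, same size, and no period seen yet.
func isSameSentence(text, lastTextStyle Text) bool {
	return text.Font == lastTextStyle.Font &&
		text.FontSize == lastTextStyle.FontSize &&
		!strings.Contains(lastTextStyle.S, ".")
}

// splitSentences (hypothetical helper) walks the items in order and
// returns the accumulated strings, including the final one.
func splitSentences(texts []Text) []string {
	var out []string
	var last Text
	for i, t := range texts {
		if i > 0 && isSameSentence(t, last) {
			last.S += t.S
			continue
		}
		if i > 0 {
			out = append(out, last.S)
		}
		last = t
	}
	if len(texts) > 0 {
		out = append(out, last.S) // flush the final accumulated sentence
	}
	return out
}

func main() {
	// Made-up input: one item per character, all in the same style.
	var texts []Text
	for _, r := range "Ab.Cd." {
		texts = append(texts, Text{Font: "Helvetica", FontSize: 10, S: string(r)})
	}
	for _, s := range splitSentences(texts) {
		fmt.Println(s)
	}
}
```

Note the explicit flush after the loop: a loop that prints only when the style changes would otherwise drop the last sentence.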
