Misclassified tables and/or figures may be tossed incorrectly #1206

Open
lfoppiano opened this issue Dec 3, 2024 · 4 comments
Labels: bug, implemented

Comments

lfoppiano (Collaborator) commented Dec 3, 2024

A few cases of text disappearing from the full text have been reported to me.

I've identified two issues related to figures and tables.

In the first case, the fulltext model misclassifies paragraphs as tables:

sample	sample	s	sa	sam	samp	e	le	ple	mple	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<paragraph>
lysates	lysates	l	ly	lys	lysa	s	es	tes	ates	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<paragraph>
(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	2	0	NUMBER	0	0	I-<citation_marker>
1	1	1	1	1	1	1	1	1	1	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	2	0	NUMBER	1	0	<table>
/	/	/	/	/	/	/	/	/	/	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	NOPUNCT	9	2	0	NUMBER	0	0	<table>
5000	5000	5	50	500	5000	0	00	000	5000	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<table>
of	of	o	of	of	of	f	of	of	of	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<table>
stock	stock	s	st	sto	stoc	k	ck	ock	tock	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<table>
solution	solution	s	so	sol	solu	n	on	ion	tion	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<table>
mix	mix	m	mi	mix	mix	x	ix	mix	mix	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<table>
A	a	A	A	A	A	A	A	A	A	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	NOPUNCT	9	2	0	NUMBER	0	0	<table>
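
To make the label switch in the dump above easier to spot, here is a minimal sketch that groups consecutive tokens by their predicted label. It is not part of GROBID: the function name `label_runs` is hypothetical, and it assumes each line is tab-separated with the token in the first column and the label in the last, and that an `I-` prefix only marks the start of a segment. The sample rows are shortened copies of the lines above, with most feature columns dropped.

```python
from itertools import groupby


def label_runs(lines):
    """Group consecutive tokens that share the same predicted label.

    Assumption: tab-separated columns, token first, label last.
    Assumption: a leading "I-" only marks the start of a segment,
    so it is stripped before comparing labels.
    """
    rows = []
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 2:
            continue  # skip blank or malformed lines
        token, label = cols[0], cols[-1]
        if label.startswith("I-"):
            label = label[2:]
        rows.append((token, label))
    # groupby merges adjacent rows with equal labels into one run
    return [
        (label, [tok for tok, _ in group])
        for label, group in groupby(rows, key=lambda r: r[1])
    ]


# Shortened rows taken from the dump above (feature columns mostly omitted).
sample = [
    "sample\tsample\tBLOCKSTART\t<paragraph>",
    "lysates\tlysates\tBLOCKIN\t<paragraph>",
    "(\t(\tBLOCKIN\tI-<citation_marker>",
    "1\t1\tBLOCKIN\t<table>",
    "/\t/\tBLOCKIN\t<table>",
    "5000\t5000\tBLOCKIN\t<table>",
]

runs = label_runs(sample)
for label, tokens in runs:
    print(label, " ".join(tokens))
```

On the sample, the `<table>` run starts in the middle of a block (every row is `BLOCKIN`), right after the opening bracket tagged as a citation marker, which is the pattern described in this first case.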

Subsequently, the table model classifies all of the text as <content>:

1	1	1	1	1	1	1	1	1	1	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	2	0	NUMBER	1	0	<content>
/	/	/	/	/	/	/	/	/	/	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	NOPUNCT	9	2	0	NUMBER	0	0	<content>
5000	5000	5	50	500	5000	0	00	000	5000	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
of	of	o	of	of	of	f	of	of	of	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
stock	stock	s	st	sto	stoc	k	ck	ock	tock	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
solution	solution	s	so	sol	solu	n	on	ion	tion	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
mix	mix	m	mi	mix	mix	x	ix	mix	mix	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
A	a	A	A	A	A	A	A	A	A	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	NOPUNCT	9	2	0	NUMBER	0	0	<content>
(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	2	0	NUMBER	0	0	<content>
Garvan	garvan	G	Ga	Gar	Garv	n	an	van	rvan	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
Institute	institute	I	In	Ins	Inst	e	te	ute	tute	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
of	of	o	of	of	of	f	of	of	of	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
Medical	medical	M	Me	Med	Medi	l	al	cal	ical	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
Research	research	R	Re	Res	Rese	h	ch	rch	arch	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
)	)	)	)	)	)	)	)	)	)	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	ENDBRACKET	9	2	0	NUMBER	0	0	<content>
and	and	a	an	and	and	d	nd	and	and	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	2	0	NUMBER	0	0	<content>
2	2	2	2	2	2	2	2	2	2	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	2	0	NUMBER	1	0	<content>
µL	µl	µ	µL	µL	µL	L	µL	µL	µL	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
of	of	o	of	of	of	f	of	of	of	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
External	external	E	Ex	Ext	Exte	l	al	nal	rnal	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
RNA	rna	R	RN	RNA	RNA	A	NA	RNA	RNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
Control	control	C	Co	Con	Cont	l	ol	rol	trol	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
Consortium	consortium	C	Co	Con	Cons	m	um	ium	tium	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	1	0	<content>
(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	3	0	NUMBER	0	0	<content>
ERCC	ercc	E	ER	ERC	ERCC	C	CC	RCC	ERCC	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
)	)	)	)	)	)	)	)	)	)	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	ENDBRACKET	9	3	0	NUMBER	0	0	<content>
spike	spike	s	sp	spi	spik	e	ke	ike	pike	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
-	-	-	-	-	-	-	-	-	-	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	HYPHEN	9	3	0	NUMBER	0	0	<content>
in	in	i	in	in	in	n	in	in	in	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
controls	controls	c	co	con	cont	s	ls	ols	rols	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	3	0	NUMBER	0	0	<content>
ThermoFisher	thermofisher	T	Th	The	Ther	r	er	her	sher	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
Scientific	scientific	S	Sc	Sci	Scie	c	ic	fic	ific	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	3	0	NUMBER	0	0	<content>
4456740	4456740	4	44	445	4456	0	40	740	6740	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
)	)	)	)	)	)	)	)	)	)	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	ENDBRACKET	9	3	0	NUMBER	0	0	<content>
were	were	w	we	wer	were	e	re	ere	were	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
added	added	a	ad	add	adde	d	ed	ded	dded	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
to	to	t	to	to	to	o	to	to	to	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
12	12	1	12	12	12	2	12	12	12	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	3	0	NUMBER	1	0	<content>
µl	µl	µ	µl	µl	µl	l	µl	µl	µl	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
RNA	rna	R	RN	RNA	RNA	A	NA	RNA	RNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
eluate	eluate	e	el	elu	elua	e	te	ate	uate	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	3	0	NUMBER	0	0	<content>
Genomic	genomic	G	Ge	Gen	Geno	c	ic	mic	omic	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
DNA	dna	D	DN	DNA	DNA	A	NA	DNA	DNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
was	was	w	wa	was	was	s	as	was	was	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
removed	removed	r	re	rem	remo	d	ed	ved	oved	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
by	by	b	by	by	by	y	by	by	by	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
adding	adding	a	ad	add	addi	g	ng	ing	ding	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
1	1	1	1	1	1	1	1	1	1	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	3	0	NUMBER	1	0	<content>
μL	μl	μ	μL	μL	μL	L	μL	μL	μL	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
HL	hl	H	HL	HL	HL	L	HL	HL	HL	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
-	-	-	-	-	-	-	-	-	-	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	HYPHEN	9	3	0	NUMBER	0	0	<content>
dsDNase	dsdnase	d	ds	dsD	dsDN	e	se	ase	Nase	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	3	0	NUMBER	0	0	<content>
ArcticZymes	arcticzymes	A	Ar	Arc	Arct	s	es	mes	ymes	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	3	0	NUMBER	0	0	<content>
70800	70800	7	70	708	7080	0	00	800	0800	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
-	-	-	-	-	-	-	-	-	-	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	HYPHEN	9	3	0	NUMBER	0	0	<content>
202	202	2	20	202	202	2	02	202	202	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
)	)	)	)	)	)	)	)	)	)	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	ENDBRACKET	9	3	0	NUMBER	0	0	<content>
and	and	a	an	and	and	d	nd	and	and	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
1	1	1	1	1	1	1	1	1	1	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	3	0	NUMBER	1	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	3	0	NUMBER	0	0	<content>
6	6	6	6	6	6	6	6	6	6	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	3	0	NUMBER	1	0	<content>
µL	µl	µ	µL	µL	µL	L	µL	µL	µL	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	3	0	NUMBER	0	0	<content>
reaction	reaction	r	re	rea	reac	n	on	ion	tion	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
buffer	buffer	b	bu	buf	buff	r	er	fer	ffer	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	4	0	NUMBER	0	0	<content>
ArcticZymes	arcticzymes	A	Ar	Arc	Arct	s	es	mes	ymes	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	4	0	NUMBER	0	0	<content>
66001	66001	6	66	660	6600	1	01	001	6001	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
)	)	)	)	)	)	)	)	)	)	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	ENDBRACKET	9	4	0	NUMBER	0	0	<content>
to	to	t	to	to	to	o	to	to	to	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
12	12	1	12	12	12	2	12	12	12	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	4	0	NUMBER	1	0	<content>
µL	µl	µ	µL	µL	µL	L	µL	µL	µL	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
RNA	rna	R	RN	RNA	RNA	A	NA	RNA	RNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
eluate	eluate	e	el	elu	elua	e	te	ate	uate	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	4	0	NUMBER	0	0	<content>
10	10	1	10	10	10	0	10	10	10	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	4	0	NUMBER	1	0	<content>
min	min	m	mi	min	min	n	in	min	min	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
incubation	incubation	i	in	inc	incu	n	on	ion	tion	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
at	at	a	at	at	at	t	at	at	at	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
37	37	3	37	37	37	7	37	37	37	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	4	0	NUMBER	1	0	<content>
°C	°c	°	°C	°C	°C	C	°C	°C	°C	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	4	0	NUMBER	0	0	<content>
followed	followed	f	fo	fol	foll	d	ed	wed	owed	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
by	by	b	by	by	by	y	by	by	by	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
5	5	5	5	5	5	5	5	5	5	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	4	0	NUMBER	1	0	<content>
min	min	m	mi	min	min	n	in	min	min	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
incubation	incubation	i	in	inc	incu	n	on	ion	tion	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
at	at	a	at	at	at	t	at	at	at	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
55	55	5	55	55	55	5	55	55	55	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
°C	°c	°	°C	°C	°C	C	°C	°C	°C	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	4	0	NUMBER	0	0	<content>
RNA	rna	R	RN	RNA	RNA	A	NA	RNA	RNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
was	was	w	wa	was	was	s	as	was	was	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
stored	stored	s	st	sto	stor	d	ed	red	ored	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
at	at	a	at	at	at	t	at	at	at	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
-	-	-	-	-	-	-	-	-	-	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	HYPHEN	9	4	0	NUMBER	0	0	<content>
80	80	8	80	80	80	0	80	80	80	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
°C	°c	°	°C	°C	°C	C	°C	°C	°C	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
and	and	a	an	and	and	d	nd	and	and	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
thawed	thawed	t	th	tha	thaw	d	ed	wed	awed	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
on	on	o	on	on	on	n	on	on	on	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
ice	ice	i	ic	ice	ice	e	ce	ice	ice	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
immediately	immediately	i	im	imm	imme	y	ly	ely	tely	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
before	before	b	be	bef	befo	e	re	ore	fore	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
library	library	l	li	lib	libr	y	ry	ary	rary	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
preparation	preparation	p	pr	pre	prep	n	on	ion	tion	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	4	0	NUMBER	0	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	4	0	NUMBER	0	0	<content>
Messenger	messenger	M	Me	Mes	Mess	r	er	ger	nger	BLOCKSTART	LINESTART	LINEINDENT	NEWFONT	SAMEFONTSIZE	1	0	INITCAP	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
RNA	rna	R	RN	RNA	RNA	A	NA	RNA	RNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	1	0	ALLCAP	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
capture	capture	c	ca	cap	capt	e	re	ure	ture	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	1	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
library	library	l	li	lib	libr	y	ry	ary	rary	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	1	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
preparation	preparation	p	pr	pre	prep	n	on	ion	tion	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	1	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	1	0	ALLCAP	NODIGIT	1	COMMA	9	5	0	NUMBER	0	0	<content>
sequencing	sequencing	s	se	seq	sequ	g	ng	ing	cing	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	1	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	1	0	ALLCAP	NODIGIT	1	COMMA	9	5	0	NUMBER	0	0	<content>
and	and	a	an	and	and	d	nd	and	and	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	1	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
quantification	quantification	q	qu	qua	quan	n	on	ion	tion	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	1	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
MRNA	mrna	M	MR	MRN	MRNA	A	NA	RNA	MRNA	BLOCKSTART	LINESTART	LINEINDENT	NEWFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
capture	capture	c	ca	cap	capt	e	re	ure	ture	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
library	library	l	li	lib	libr	y	ry	ary	rary	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
preparation	preparation	p	pr	pre	prep	n	on	ion	tion	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
in	in	i	in	in	in	n	in	in	in	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
pan	pan	p	pa	pan	pan	n	an	pan	pan	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
-	-	-	-	-	-	-	-	-	-	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	HYPHEN	9	5	0	NUMBER	0	0	<content>
cancer	cancer	c	ca	can	canc	r	er	cer	ncer	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
and	and	a	an	and	and	d	nd	and	and	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
three	three	t	th	thr	thre	e	ee	ree	hree	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
-	-	-	-	-	-	-	-	-	-	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	HYPHEN	9	5	0	NUMBER	0	0	<content>
cancer	cancer	c	ca	can	canc	r	er	cer	ncer	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
cohorts	cohorts	c	co	coh	coho	s	ts	rts	orts	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
started	started	s	st	sta	star	d	ed	ted	rted	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
from	from	f	fr	fro	from	m	om	rom	from	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
8	8	8	8	8	8	8	8	8	8	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	5	0	NUMBER	1	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	5	0	NUMBER	0	0	<content>
5	5	5	5	5	5	5	5	5	5	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	5	0	NUMBER	1	0	<content>
µL	µl	µ	µL	µL	µL	L	µL	µL	µL	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
DNase	dnase	D	DN	DNa	DNas	e	se	ase	Nase	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
treated	treated	t	tr	tre	trea	d	ed	ted	ated	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
RNA	rna	R	RN	RNA	RNA	A	NA	RNA	RNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
eluate	eluate	e	el	elu	elua	e	te	ate	uate	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	5	0	NUMBER	0	0	<content>
cDNA	cdna	c	cD	cDN	cDNA	A	NA	DNA	cDNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
synthesis	synthesis	s	sy	syn	synt	s	is	sis	esis	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
was	was	w	wa	was	was	s	as	was	was	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
performed	performed	p	pe	per	perf	d	ed	med	rmed	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
using	using	u	us	usi	usin	g	ng	ing	sing	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
TruSeq	truseq	T	Tr	Tru	TruS	q	eq	Seq	uSeq	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
RNA	rna	R	RN	RNA	RNA	A	NA	RNA	RNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
Library	library	L	Li	Lib	Libr	y	ry	ary	rary	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
Prep	prep	P	Pr	Pre	Prep	p	ep	rep	Prep	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	5	0	NUMBER	0	0	<content>
for	for	f	fo	for	for	r	or	for	for	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
Enrichment	enrichment	E	En	Enr	Enri	t	nt	ent	ment	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	6	0	NUMBER	0	0	<content>
Illumina	illumina	I	Il	Ill	Illu	a	na	ina	mina	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	6	0	NUMBER	0	0	<content>
20020189	20020189	2	20	200	2002	9	89	189	0189	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
)	)	)	)	)	)	)	)	)	)	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	ENDBRACKET	9	6	0	NUMBER	0	0	<content>
as	as	a	as	as	as	s	as	as	as	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
previously	previously	p	pr	pre	prev	y	ly	sly	usly	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
described	described	d	de	des	desc	d	ed	bed	ibed	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
22	22	2	22	22	22	2	22	22	22	BLOCKIN	LINEIN	LINEINDENT	NEWFONT	LOWERFONT	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	6	0	NUMBER	1	1	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKIN	LINEIN	LINEINDENT	NEWFONT	HIGHERFONT	0	0	ALLCAP	NODIGIT	1	DOT	9	6	0	NUMBER	0	0	<content>
Briefly	briefly	B	Br	Bri	Brie	y	ly	fly	efly	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	6	0	NUMBER	0	0	<content>
RNA	rna	R	RN	RNA	RNA	A	NA	RNA	RNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
was	was	w	wa	was	was	s	as	was	was	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
fragmented	fragmented	f	fr	fra	frag	d	ed	ted	nted	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	6	0	NUMBER	0	0	<content>
and	and	a	an	and	and	d	nd	and	and	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
first	first	f	fi	fir	firs	t	st	rst	irst	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
strand	strand	s	st	str	stra	d	nd	and	rand	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
cDNA	cdna	c	cD	cDN	cDNA	A	NA	DNA	cDNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
was	was	w	wa	was	was	s	as	was	was	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
generated	generated	g	ge	gen	gene	d	ed	ted	ated	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
using	using	u	us	usi	usin	g	ng	ing	sing	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
random	random	r	ra	ran	rand	m	om	dom	ndom	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
priming	priming	p	pr	pri	prim	g	ng	ing	ming	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	6	0	NUMBER	0	0	<content>
RNA	rna	R	RN	RNA	RNA	A	NA	RNA	RNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
templates	templates	t	te	tem	temp	s	es	tes	ates	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
were	were	w	we	wer	were	e	re	ere	were	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
subsequently	subsequently	s	su	sub	subs	y	ly	tly	ntly	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
removed	removed	r	re	rem	remo	d	ed	ved	oved	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
and	and	a	an	and	and	d	nd	and	and	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
replaced	replaced	r	re	rep	repl	d	ed	ced	aced	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
by	by	b	by	by	by	y	by	by	by	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
a	a	a	a	a	a	a	a	a	a	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	1	NOPUNCT	9	6	0	NUMBER	0	0	<content>
newly	newly	n	ne	new	newl	y	ly	wly	ewly	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
synthesized	synthesized	s	sy	syn	synt	d	ed	zed	ized	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
second	second	s	se	sec	seco	d	nd	ond	cond	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
strand	strand	s	st	str	stra	d	nd	and	rand	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
of	of	o	of	of	of	f	of	of	of	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
cDNA	cdna	c	cD	cDN	cDNA	A	NA	DNA	cDNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	6	0	NUMBER	0	0	<content>
AMPure	ampure	A	AM	AMP	AMPu	e	re	ure	Pure	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	6	0	NUMBER	0	0	<content>
XP	xp	X	XP	XP	XP	P	XP	XP	XP	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
beads	beads	b	be	bea	bead	s	ds	ads	eads	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	7	0	NUMBER	0	0	<content>
Beckman	beckman	B	Be	Bec	Beck	n	an	man	kman	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
Coulter	coulter	C	Co	Cou	Coul	r	er	ter	lter	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
Life	life	L	Li	Lif	Life	e	fe	ife	Life	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
Sciences	sciences	S	Sc	Sci	Scie	s	es	ces	nces	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	7	0	NUMBER	0	0	<content>
A63881	a63881	A	A6	A63	A638	1	81	881	3881	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	CONTAINSDIGITS	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
)	)	)	)	)	)	)	)	)	)	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	ENDBRACKET	9	7	0	NUMBER	0	0	<content>
were	were	w	we	wer	were	e	re	ere	were	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
used	used	u	us	use	used	d	ed	sed	used	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
for	for	f	fo	for	for	r	or	for	for	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
purifying	purifying	p	pu	pur	puri	g	ng	ing	ying	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
the	the	t	th	the	the	e	he	the	the	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
blunt	blunt	b	bl	blu	blun	t	nt	unt	lunt	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
-	-	-	-	-	-	-	-	-	-	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	HYPHEN	9	7	0	NUMBER	0	0	<content>
ended	ended	e	en	end	ende	d	ed	ded	nded	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
double	double	d	do	dou	doub	e	le	ble	uble	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
stranded	stranded	s	st	str	stra	d	ed	ded	nded	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
cDNA	cdna	c	cD	cDN	cDNA	A	NA	DNA	cDNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	7	0	NUMBER	0	0	<content>
30	30	3	30	30	30	0	30	30	30	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	7	0	NUMBER	1	0	<content>
μL	μl	μ	μL	μL	μL	L	μL	μL	μL	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
cDNA	cdna	c	cD	cDN	cDNA	A	NA	DNA	cDNA	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
of	of	o	of	of	of	f	of	of	of	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
each	each	e	ea	eac	each	h	ch	ach	each	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
sample	sample	s	sa	sam	samp	e	le	ple	mple	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
was	was	w	wa	was	was	s	as	was	was	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
then	then	t	th	the	then	n	en	hen	then	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
used	used	u	us	use	used	d	ed	sed	used	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
as	as	a	as	as	as	s	as	as	as	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
input	input	i	in	inp	inpu	t	ut	put	nput	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
for	for	f	fo	for	for	r	or	for	for	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
Illumina	illumina	I	Il	Ill	Illu	a	na	ina	mina	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
DNA	dna	D	DN	DNA	DNA	A	NA	DNA	DNA	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
Prep	prep	P	Pr	Pre	Prep	p	ep	rep	Prep	BLOCKSTART	LINESTART	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
with	with	w	wi	wit	with	h	th	ith	with	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
Enrichment	enrichment	E	En	Enr	Enri	t	nt	ent	ment	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	7	0	NUMBER	0	0	<content>
previously	previously	p	pr	pre	prev	y	ly	sly	usly	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
Nextera	nextera	N	Ne	Nex	Next	a	ra	era	tera	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
Flex	flex	F	Fl	Fle	Flex	x	ex	lex	Flex	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
for	for	f	fo	for	for	r	or	for	for	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
Enrichment	enrichment	E	En	Enr	Enri	t	nt	ent	ment	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
;	;	;	;	;	;	;	;	;	;	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	PUNCT	9	7	0	NUMBER	0	0	<content>
Illumina	illumina	I	Il	Ill	Illu	a	na	ina	mina	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	INITCAP	NODIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
,	,	,	,	,	,	,	,	,	,	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	COMMA	9	7	0	NUMBER	0	0	<content>
20025524	20025524	2	20	200	2002	4	24	524	5524	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	0	NOPUNCT	9	7	0	NUMBER	0	0	<content>
)	)	)	)	)	)	)	)	)	)	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	ENDBRACKET	9	7	0	NUMBER	0	0	<content>
.	.	.	.	.	.	.	.	.	.	BLOCKEND	LINEEND	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	DOT	9	7	0	NUMBER	0	0	<content>

and the incomplete table is then tossed away.

I wonder whether false-positive tables could be detected from the related classes and converted to <paragraph>.

PDF: pub.1158465915.pdf

@lfoppiano lfoppiano added the bug From Hemiptera and especially its suborder Heteroptera label Dec 3, 2024
@lfoppiano lfoppiano self-assigned this Dec 3, 2024
@lfoppiano lfoppiano changed the title Misclassified tables maybe tossed incorrectly Misclassified tables and/or figures maybe tossed incorrectly Dec 4, 2024
@lfoppiano
Collaborator Author

lfoppiano commented Dec 4, 2024

I'm delving into this issue, looking at

goodTable = goodTable && validateTable();

It seems that tables are validated before being post-processed; however, tables that fail validation are not marked or handled in any way. I wonder whether they should simply be marked as paragraphs and returned in the fulltext. 🤔

My plan is to take those tables and reset their classification to fulltext paragraphs. It is probably better to have a mangled table inside a <paragraph> than to have text missing from the output 🤔
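A minimal sketch of the fallback described above, assuming the labelled sequence is available as two parallel lists (tokens and labels). The class, the method names and the validity predicate are hypothetical and are not GROBID's actual API; the predicate merely stands in for whatever `validateTable()` checks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of GROBID: relabels table runs that fail validation.
class TableFallbackSketch {
    static final String TABLE = "<table>";
    static final String PARAGRAPH = "<paragraph>";

    // Strips the optional "I-" begin prefix from a label.
    static String plainLabel(String label) {
        return label.startsWith("I-") ? label.substring(2) : label;
    }

    // Returns a copy of labels in which every run of table tokens rejected by
    // isValidTable (a stand-in for the real validation) is relabelled <paragraph>.
    static List<String> relabelInvalidTables(List<String> tokens,
                                             List<String> labels,
                                             Predicate<List<String>> isValidTable) {
        List<String> out = new ArrayList<>(labels);
        int i = 0;
        while (i < labels.size()) {
            if (!TABLE.equals(plainLabel(labels.get(i)))) {
                i++;
                continue;
            }
            int start = i;
            i++;
            // A run continues while the label is a plain <table>; an I-<table> starts a new run.
            while (i < labels.size() && TABLE.equals(labels.get(i))) {
                i++;
            }
            if (!isValidTable.test(tokens.subList(start, i))) {
                for (int j = start; j < i; j++) {
                    out.set(j, PARAGRAPH);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("lysates", "(", "1", "/", "5000", "of", "stock");
        List<String> labels = List.of("<paragraph>", "I-<citation_marker>",
                "<table>", "<table>", "<table>", "<table>", "<table>");
        // Assumed toy rule: a genuine table span starts with the word "Table".
        System.out.println(relabelInvalidTables(tokens, labels, s -> s.get(0).equals("Table")));
    }
}
```

The relabelled tokens get a plain `<paragraph>` rather than `I-<paragraph>`, so they join the surrounding paragraph instead of opening a new one; that is a design choice of this sketch, not something the issue prescribes.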

@lfoppiano
Collaborator Author

Regarding this issue, another worrying thing is that the first <table> token does not carry the initial I- prefix:

(	(	(	(	(	(	(	(	(	(	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	ALLCAP	NODIGIT	1	OPENBRACKET	9	2	0	NUMBER	0	0	I-<citation_marker>
1	1	1	1	1	1	1	1	1	1	BLOCKIN	LINEIN	LINEINDENT	SAMEFONT	SAMEFONTSIZE	0	0	NOCAPS	ALLDIGIT	1	NOPUNCT	9	2	0	NUMBER	1	0	<table>
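One way to spot this pattern automatically is to scan the label column for runs of a given class that begin without the `I-` prefix. The sketch below assumes the labels have already been extracted into a list; the class and method names are hypothetical and do not exist in GROBID.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic, not part of GROBID.
class MissingBeginPrefixSketch {
    // Returns the indices at which a run of `target` starts without the "I-" prefix.
    static List<Integer> runsMissingBeginPrefix(List<String> labels, String target) {
        List<Integer> hits = new ArrayList<>();
        String previousPlain = null;
        for (int i = 0; i < labels.size(); i++) {
            String label = labels.get(i);
            boolean hasPrefix = label.startsWith("I-");
            String plain = hasPrefix ? label.substring(2) : label;
            // A run starts when the class changes; flag it if the prefix is absent.
            if (plain.equals(target) && !hasPrefix && !plain.equals(previousPlain)) {
                hits.add(i);
            }
            previousPlain = plain;
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> labels = List.of("<paragraph>", "I-<citation_marker>", "<table>", "<table>");
        System.out.println(runsMissingBeginPrefix(labels, "<table>"));
    }
}
```

Some classes may legitimately resume without the prefix (for example a paragraph continuing after a citation marker), so the check is restricted to one target class rather than applied to every label.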

@lfoppiano
Collaborator Author

lfoppiano commented Dec 6, 2024

Identifying misclassified figures is difficult unless we enforce the presence of the header Figure X. This would also fix the processing of the paper first provided in #1160 (comment)
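As an illustration of the header check mentioned above, the following sketch tests whether the text of a candidate figure span contains a "Figure X" style header. The class name, the method name and the regular expression are assumptions; in particular, accepting the abbreviated "Fig." form goes beyond what the comment states.

```java
import java.util.regex.Pattern;

// Hypothetical check, not part of GROBID.
class FigureHeaderSketch {
    // Matches "Figure 3", "figure 12", "Fig. 2" or "Fig 2" (case-insensitive).
    private static final Pattern FIGURE_HEADER =
            Pattern.compile("(?i)\\b(?:figure|fig\\.?)\\s*\\d+");

    // True when the span text contains a figure header followed by a number.
    static boolean hasFigureHeader(String spanText) {
        return FIGURE_HEADER.matcher(spanText).find();
    }

    public static void main(String[] args) {
        System.out.println(hasFigureHeader("Figure 2. Expression levels per sample"));
        System.out.println(hasFigureHeader("cDNA of each sample was then used as input"));
    }
}
```

A span labelled as a figure for which this check fails could then be treated as a likely false positive and handed back to the fulltext, in the same way as the invalid tables.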

@lfoppiano
Collaborator Author

This should now be implemented and merged via #1207.
