
Add text classification to inference client #1606

Merged

Conversation

martinbrose (Contributor)

Add text-classification to the Hugging Face 🤗 Hub

References #1539

This is part of an ongoing list of model tasks to implement in the Hugging Face Hub inference client. Each task is planned as its own PR; this one covers text-classification.

Key Features

  • Modifies _client.py to call a text-classification model (a usage sketch follows this list).
  • Updates tests and documentation to reflect the changes.
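
A minimal usage sketch of the method this PR adds (model resolution and the example output values follow the docstring examples discussed below; treat the exact numbers as illustrative):

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

# Classify a single string; when no `model` is passed, the client falls
# back to a recommended model for the "text-classification" task.
output = client.text_classification("I like you")

# `output` is a list of {"label", "score"} dicts, e.g.:
# [{'label': 'POSITIVE', 'score': 0.99987...}, {'label': 'NEGATIVE', 'score': 0.00013...}]
for item in output:
    print(item["label"], item["score"])
```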

HuggingFaceDocBuilderDev commented Aug 20, 2023

The documentation is not available anymore as the PR was closed or merged.

Wauplin (Contributor) left a comment

Hey @martinbrose, I'm starting to review all your inference-related PRs. Thanks a ton for the massive work; it's good quality from what I've seen! 👏 🙏 I've left some comments on this PR that are also relevant to the other PRs (especially simplifying the method signatures as much as possible, plus the merge-conflict issue). In the meantime, I'll take the time to thoroughly review the other PRs.

FYI, I'm off this Thursday/Friday and will be fully back on the project starting next week :)

Resolved review threads (outdated):

  • src/huggingface_hub/inference/_client.py (two threads)
  • docs/source/guides/inference.md
  • tests/test_inference_client.py
martinbrose (Contributor, Author)

Thanks for the review!
Apologies for this long list of separate PRs. I started with one... and then couldn't stop using them as bite-sized challenges at night.


codecov bot commented Sep 4, 2023

Codecov Report

Patch coverage: 62.50% and project coverage change: -0.53% ⚠️

Comparison is base (b94f891) 82.30% compared to head (2cca3fd) 81.78%.

❗ Current head 2cca3fd differs from pull request most recent head 206c06b. Consider uploading reports for the commit 206c06b to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1606      +/-   ##
==========================================
- Coverage   82.30%   81.78%   -0.53%     
==========================================
  Files          62       60       -2     
  Lines        6964     6785     -179     
==========================================
- Hits         5732     5549     -183     
- Misses       1232     1236       +4     
| Files Changed | Coverage Δ |
| --- | --- |
| ...gingface_hub/inference/_generated/_async_client.py | 58.37% <25.00%> (-0.62%) ⬇️ |
| src/huggingface_hub/inference/_client.py | 79.51% <100.00%> (+0.40%) ⬆️ |

... and 7 files with indirect coverage changes


Wauplin (Contributor) left a comment

Thanks for making the changes @martinbrose :) I left some comments but most of them are due to the change between a multi-text input and a single-text input. Once those are addressed, I think we'll be good to merge 🚀
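
For context, a sketch of the input-type change that drives most of the comments below. The signatures are inferred from the suggestions in this thread, not copied from the diff:

```python
from typing import List, Optional, TypedDict

# Assumed output shape; the real type lives in huggingface_hub.inference.
class ClassificationOutput(TypedDict):
    label: str
    score: float

# Earlier revision (multi-text input): one result list per input string.
# def text_classification(self, text: List[str], *, model: Optional[str] = None) -> List[List[ClassificationOutput]]: ...

# This revision (single-text input): a single flat result list.
# def text_classification(self, text: str, *, model: Optional[str] = None) -> List[ClassificationOutput]: ...
```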

Comment on lines 794 to 795
>>> output
{'label': 'POSITIVE', 'score': 0.9998695850372314}

Revert back to list output

Suggested change:
  >>> output
- {'label': 'POSITIVE', 'score': 0.9998695850372314}
+ [[{'label': 'POSITIVE', 'score': 0.9998695850372314}, {'label': 'NEGATIVE', 'score': 0.0001304351753788069}]]

Comment on lines 798 to 803
payload: Dict[str, Any] = {"inputs": text}
response = self.post(
    json=payload,
    model=model,
    task="text-classification",
)

(nit) Having a separate payload variable is fine as well, but when it's tiny like this one (no parameter other than inputs), I prefer to pass it directly to the .post method. No big deal anyway.

Suggested change:
- payload: Dict[str, Any] = {"inputs": text}
- response = self.post(
-     json=payload,
-     model=model,
-     task="text-classification",
- )
+ response = self.post(json={"inputs": text}, model=model, task="text-classification")

Comment on lines 802 to 803
>>> output
{'label': 'POSITIVE', 'score': 0.9998695850372314}

Suggested change:
  >>> output
- {'label': 'POSITIVE', 'score': 0.9998695850372314}
+ [{'label': 'POSITIVE', 'score': 0.9998695850372314}, {'label': 'NEGATIVE', 'score': 0.0001304351753788069}]

(same as sync version)

Comment on lines 806 to 811
payload: Dict[str, Any] = {"inputs": text}
response = await self.post(
    json=payload,
    model=model,
    task="text-classification",
)

Suggested change:
- payload: Dict[str, Any] = {"inputs": text}
- response = await self.post(
-     json=payload,
-     model=model,
-     task="text-classification",
- )
+ response = await self.post(json={"inputs": text}, model=model, task="text-classification")

(same as sync version)

@@ -0,0 +1,48 @@
interactions:
- request:
    body: '{"inputs": ["I like you", "I love you."]}'

Suggested change:
- body: '{"inputs": ["I like you", "I love you."]}'
+ body: '{"inputs": ["I like you"]}'

should be only 1 sample now

    uri: https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english
  response:
    body:
      string: '[[{"label":"POSITIVE","score":0.9998695850372314},{"label":"NEGATIVE","score":0.0001304351753788069}],[{"label":"POSITIVE","score":0.9998705387115479},{"label":"NEGATIVE","score":0.00012938841246068478}]]'

Suggested change:
- string: '[[{"label":"POSITIVE","score":0.9998695850372314},{"label":"NEGATIVE","score":0.0001304351753788069}],[{"label":"POSITIVE","score":0.9998705387115479},{"label":"NEGATIVE","score":0.00012938841246068478}]]'
+ string: '[[{"label":"POSITIVE","score":0.9998695850372314},{"label":"NEGATIVE","score":0.0001304351753788069}]]'

... and therefore only 1 response

Comment on lines 208 to 209
self.assertIsInstance(item[0]["score"], float)
self.assertIsInstance(item[0]["label"], str)

Suggested change:
- self.assertIsInstance(item[0]["score"], float)
- self.assertIsInstance(item[0]["label"], str)
+ self.assertIsInstance(item["score"], float)
+ self.assertIsInstance(item["label"], str)

1 level less
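
For context, a sketch of how the assertion block plausibly reads once this suggestion is applied (the test name and surrounding setup are assumed, not taken from the diff):

```python
# Hypothetical shape of the updated test in tests/test_inference_client.py.
def test_text_classification(self) -> None:
    output = self.client.text_classification("I like you")
    self.assertIsInstance(output, list)
    for item in output:  # one dict per label, no extra nesting
        self.assertIsInstance(item["score"], float)
        self.assertIsInstance(item["label"], str)
```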

    model=model,
    task="text-classification",
)
return _bytes_to_list(response)

Suggested change:
- return _bytes_to_list(response)
+ return _bytes_to_list(response)[0]

Since we now take only a str as input (not a List[str]), we need to return the first item, because the server returns a list of lists of items.
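
A standalone sketch of that unwrapping, using json.loads in place of the client's internal _bytes_to_list helper (the raw bytes mirror the cassette above):

```python
import json

# Server response for a single-string input: a list of lists, with one
# inner list per input string (here, exactly one).
raw = b'[[{"label": "POSITIVE", "score": 0.9998695850372314}, {"label": "NEGATIVE", "score": 0.0001304351753788069}]]'

parsed = json.loads(raw)  # [[{'label': 'POSITIVE', ...}, {'label': 'NEGATIVE', ...}]]
result = parsed[0]        # [{'label': 'POSITIVE', ...}, {'label': 'NEGATIVE', ...}]

assert result[0]["label"] == "POSITIVE"
```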

    model=model,
    task="text-classification",
)
return _bytes_to_list(response)

Suggested change:
- return _bytes_to_list(response)
+ return _bytes_to_list(response)[0]

(same as sync version)


Wauplin commented Sep 6, 2023

I've merged the suggested changes (see above) and tried it locally. It works great! :)
Will merge the PR once CI is green.

@Wauplin Wauplin merged commit 0cd7405 into huggingface:main Sep 6, 2023
14 checks passed
@martinbrose martinbrose deleted the 1539-InferenceClient-text-classification branch September 6, 2023 16:33