tianchiguaixia/README.md
  • 👋 Hi, I’m @tianchiguaixia
  • 👀 I’m interested in Python and NLP
  • 🌱 I’m currently learning NLP
  • 💞️ I’m looking to collaborate on NLP
  • 📫 How to reach me: [email protected]

Pinned

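  1. layoutlmv3-chinese Public

    This project uses LayoutLMv3 for training and inference on Chinese images. It mainly addresses three problems: (1) standardizing data into a usable training dataset format; (2) modifying the tokenization for layoutlmv3-base-chinese; (3) splitting text longer than 512 and applying a sliding window.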

    Python 33 6

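  2. text_classification Public

    This project uses a news dataset to demonstrate the full text classification workflow: data cleaning, model training, model deployment, and front-end display. Models and tools used: PyTorch, BERT, Streamlit.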

    Python 18

  3. medical_ocr_streamlit Public

    This project recognizes tabular data in images, extracts it, and exports it as a CSV file. The whole project is deployed and presented with Streamlit. Technologies used: PaddleOCR, PPStructure, Streamlit.

    Python 33 4

  4. medical_records_extract Public

    This project extracts key information from medical record files and displays the extracted content in a Streamlit front end. Currently supported file types: images, PDF files, and Word files.

    Python 22 6

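  5. qwen1.5-ner Public

    Fine-tunes the Qwen1.5-0.5B-Chat model on a general information extraction task. The goals are to: check how a generative approach performs compared with extractive NER; give beginners a simple model fine-tuning workflow with as little code as possible; and show how to process data formats for large-model training.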

    Python 9
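
The layoutlmv3-chinese description above mentions splitting text longer than 512 and applying a sliding window. A minimal sketch of that general idea follows; it is not taken from the repository. The function name `sliding_windows`, the `stride` parameter, and its default of 128 are assumptions made for illustration, and the sketch handles tokens only (a real LayoutLMv3 pipeline would also need to slice the bounding boxes and labels in parallel).

```python
from typing import List


def sliding_windows(tokens: List[str], max_len: int = 512, stride: int = 128) -> List[List[str]]:
    """Split `tokens` into overlapping windows of at most `max_len` items.

    Hypothetical helper: consecutive windows share `stride` tokens, so an
    entity cut at one window boundary appears whole in the next window.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")

    # Short inputs need no splitting.
    if len(tokens) <= max_len:
        return [list(tokens)]

    step = max_len - stride  # how far the window start advances each time
    windows: List[List[str]] = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the input
        start += step
    return windows


if __name__ == "__main__":
    demo = [f"tok{i}" for i in range(1000)]
    for i, window in enumerate(sliding_windows(demo)):
        print(i, len(window), window[0], window[-1])
```

The overlap trades extra computation for context: a larger `stride` repeats more tokens across windows but makes it less likely that an entity is split at a boundary. Predictions for the overlapping regions still have to be merged afterwards, which this sketch does not cover.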